This page begins with a broad overview of how the Redact framework secures user data while allowing websites to use that data in their UIs. The sections that follow go into more detail on each component that makes the secure flow of Redact data possible.
On a typical modern website, data is stored in a database which is owned and managed by the creator of the website. This data can be public, such as a profile photo and status update for a social media website, or it can be private, such as medical records in a health portal. In either case, the owner of the website has the ability to view, modify, or delete data, and a web user must trust that the owner does not mishandle it. To remove this requirement of trust entirely, Redact gives users the tools they need to set up a personal encrypted database, as well as a framework for website creators to build websites using this data.
Redact is made up of three major components which allow users to maintain control of their data while still allowing websites to use that data to build useful applications. These three components are:
- Redact-enabled website: A website which can use private data without being able to see it.
- Redact client: A program installed locally on a user’s device (e.g. laptop, phone) which communicates with Redact-enabled websites to populate information that is only accessible to the user. This application is the only one that can decrypt data.
- Redact storage: A user or third-party owned database which interacts with the Redact client to provide arbitrary encrypted or unencrypted data. It does not own any decryption keys.
To understand the problem that Redact solves, and how it solves it, it is important to understand the drawbacks of traditional data storage. Imagine a person logs in to an online healthcare portal to view medical documents and information. The information they see is sensitive, so they might assume it is reasonably secured, and they are often correct. If the portal provider took reasonable precautions, the data is stored in an encrypted form within a database. When it is requested by the person’s browser or device, the data is decrypted, passed to the web portal’s servers, then re-encrypted for transit between the web server and the browser. The data is owned by the healthcare portal provider, which has the following problematic administrative rights:
- Read: They can view the data in its unencrypted form.
- Delete: They can delete, or choose not to delete, the data.
Redact makes it possible to store data without these downsides. In its simplest form, Redact integrates with websites to store and display data which is referenced by the website, but stored by the user. Not only does this allow users to guarantee that their data is secured using encryption built into the Redact client, it also relieves portal providers and websites of the need to manage encrypted data and the associated cryptographic overhead.
A Redact-enabled website is a website which fully or partially depends on data stored within a user’s Redact database. Because only the user has access to this data, the website requires that the user is running the Redact client on their machine - otherwise the data cannot be retrieved and displayed. In addition to viewing data, a Redact-enabled website may give the user the ability to securely create, edit, or delete data.
To display a piece of data stored in Redact, a website needs to have a reference to this data in the form of a path. Assuming the data exists on the user’s Redact storage, it is retrieved and placed on the page for the user to see. Redact data can be positioned and styled on a web page similarly to any other page element. Instead of resolving data (such as a person’s phone number) by making a request to the website’s backing server, the element is resolved by making a request to the user’s Redact client.
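As a rough illustration of resolving references by path, the sketch below scans a page for placeholders and fills each one from a local lookup instead of the website's server. The `{{redact:/path}}` syntax, the function names, and the in-memory dictionary standing in for the Redact client are all assumptions made for this example, not Redact's actual markup or API.

```python
import re

# Hypothetical placeholder syntax, invented for this sketch: {{redact:/some/path}}
REDACT_REF = re.compile(r"\{\{redact:(/[^}]+)\}\}")

def find_references(page):
    """Return every Redact path referenced by the page, in document order."""
    return REDACT_REF.findall(page)

def resolve_page(page, client_lookup):
    """Replace each reference with whatever the local client returns for its path."""
    def substitute(match):
        value = client_lookup(match.group(1))
        return value if value is not None else "[unavailable]"
    return REDACT_REF.sub(substitute, page)

# What the website's server sends: layout plus a reference, no private value.
page_from_server = "<p>Phone: {{redact:/contact/phone}}</p>"

# Stand-in for the Redact client running on the user's device (an assumption).
local_client_data = {"/contact/phone": "555-0100"}

rendered = resolve_page(page_from_server, local_client_data.get)
print(rendered)
```

The point of the sketch is the separation of concerns: the server's response carries only the path, and the value is filled in on the user's device.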
Creating and Editing Data
Because data in Redact is entirely controlled by the user themselves, the user must be the one to create the data in the first place. For data to remain completely secure, it must never leave the Redact environment. In other words, the data must never be sent to an arbitrary server (such as the website provider’s server) in an unencrypted form. To accomplish this, websites request editable fields from the Redact client, which is responsible for receiving new or edited data, optionally encrypting it, and storing it in the user’s personal database. It is not possible for the website provider to view the raw or encrypted data unless given explicit access by the user.
Visually Differentiating Between “Redacted” and Non-“Redacted” Data
In order for a user to use a Redact-enabled website, they must be able to differentiate between “redacted” data and non-“redacted” data. Imagine a website presents a form to a user where they will input private information into a text box. The user needs a way to verify that the text they submit will truly be stored using Redact, in order to avoid being tricked into thinking the data is secure. The user can identify that this data will be “redacted” (in other words, the data will be stored in their Redact database instead of the website’s designated database) using a secret code or icon which is known only to the user, and displayed alongside an editable form field. This is similar to the browser’s lock icon, which indicates whether or not the webpage is secure.
The client application is software that runs locally on a user’s device and manages fetching, decrypting, and displaying secure data in a Redact-enabled website. The client must be running on the same device that a website is being viewed on in order to respond to the browser’s requests for “redacted” data.
When someone navigates to a Redact-enabled website, the website uses placeholders for “redacted” data. These placeholders prompt the web browser to request the corresponding data from the client. The client fetches the encrypted data from the storage, decrypts it with the appropriate key, and responds to the browser’s request with the raw data. Following the same pattern, the client also lets users edit data by serving a form input instead of plain text.
Below is a diagram of how data flows throughout the Redact system and ensures only the user can access it. Blue arrows represent requests and green arrows are replies.
- A user visits a Redact-enabled website on their device (laptop, phone, etc.) via the browser.
- The web server responds with a Redact-enabled website. Private data is represented as a Redact reference, rather than as the data itself.
- The browser recognizes the Redact references on the page and sends a request to the Redact client for each piece of private data.
- The client receives the private data requests and contacts the Redact storage provider to get the encrypted form of the private data.
- The Redact storage, which is owned by the user, returns the encrypted data.
- The client decrypts the data and serves it to the browser. The website is unable to access the raw data.
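The flow above can be sketched as a small simulation. Everything here is an assumption made for illustration: the `Storage` and `Client` classes, their method names, and the paths are hypothetical, and `toy_cipher` is a deliberately simple hash-based stream cipher that stands in for whatever vetted encryption the real client uses.

```python
import hashlib
import secrets

def toy_cipher(key, nonce, data):
    """XOR data with a SHA-256-derived keystream; applying it twice restores the input.
    Illustrative only: NOT a vetted cipher and it provides no integrity protection."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

class Storage:
    """Holds opaque blobs by path. It never holds a decryption key."""
    def __init__(self):
        self.blobs = {}

    def put(self, path, blob):
        self.blobs[path] = blob

    def get(self, path):
        return self.blobs[path]

class Client:
    """Runs on the user's device; the only component that holds the key."""
    def __init__(self, storage):
        self._key = secrets.token_bytes(32)
        self._storage = storage

    def save(self, path, value):
        # Encrypt locally before anything is handed to the storage.
        nonce = secrets.token_bytes(16)
        self._storage.put(path, nonce + toy_cipher(self._key, nonce, value.encode()))

    def serve(self, path):
        # Fetch the ciphertext, decrypt it locally, and return plaintext to the browser.
        blob = self._storage.get(path)
        return toy_cipher(self._key, blob[:16], blob[16:]).decode()

storage = Storage()
client = Client(storage)
client.save("/records/blood-type", "O negative")

# The web server's response carries a reference, not the private value.
server_response = {"label": "Blood type", "redact_ref": "/records/blood-type"}

# The browser asks the local client, not the web server, for the value.
shown_to_user = client.serve(server_response["redact_ref"])
print(shown_to_user)
```

In this sketch neither `server_response` nor the contents of `storage` ever contain the plaintext; only the `Client` object, which models software on the user's device, can produce it.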
In theory, Redact data never needs to leave the device it is viewed on: it can be stored and accessed via a Redact-enabled website entirely locally. In practice, storing data on a single machine makes it impossible to use across multiple devices or to share with others who have been granted access. To address this limitation, the storage can be (and typically is) configured to use a remotely hosted database. Data is encrypted by the Redact client before it is transferred to the storage, and the storage holds only encrypted data and no decryption keys, so the security of the data does not depend on the security of the storage and the storage provider does not need to be trusted.
In order to support its various encryption and identity sharing schemes, Redact is backed by a cryptographic system that allows identities and data to move flexibly between user devices. At the base level, every entity that makes or answers requests in the Redact pipeline is represented by its own asymmetric key pair. These entities are:
- Client: must identify itself when it makes requests to storage or Redact-enabled websites, so that those services know who is making the requests
- Storage: must identify itself to requesting clients via TLS
- Redact-enabled websites: must identify themselves to requesting clients via TLS
- User: the human using the system has their own key pair, which authorizes the client’s key pair
As noted in point four, the client key pair is useless unless it is backed by a user’s key pair, which identifies who the client is making requests for. This is achieved by signing the client’s public key with the user’s private key, which gives the client a signature proving it is authorized to act on behalf of the user. It also allows a user to keep the same identity and data access across multiple devices, simply by signing the public key of each new device. Although these asymmetric key pairs handle the issue of user identity, they are not used for data encryption.
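The delegation described here can be sketched as a two-link signature chain: the user signs the device's public key, and the device signs each request. The sketch below uses unpadded textbook RSA over hard-coded Mersenne primes purely so that it runs with the standard library alone. That choice is insecure and is an assumption of this example, as are all function names; the source does not say which signature scheme Redact uses.

```python
import hashlib

E = 65537  # conventional RSA public exponent

def make_keypair(p, q):
    """Build a textbook RSA key pair from two primes (toy-sized, unpadded, insecure)."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+
    return {"public": n, "private": d}

def _digest(message, n):
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(keypair, message):
    return pow(_digest(message, keypair["public"]), keypair["private"], keypair["public"])

def verify(public_n, message, signature):
    return pow(signature, E, public_n) == _digest(message, public_n)

def authorize_device(user, device_public):
    """The user signs the device's public key; the signature acts as a certificate."""
    return sign(user, str(device_public).encode())

def verify_request(user_public, device_public, certificate, request, request_sig):
    """Accept a request only if the user vouched for the device that signed it."""
    vouched = verify(user_public, str(device_public).encode(), certificate)
    return vouched and verify(device_public, request, request_sig)

# Hard-coded Mersenne primes keep the sketch deterministic; never do this in practice.
user = make_keypair(2**127 - 1, 2**107 - 1)
laptop = make_keypair(2**89 - 1, 2**61 - 1)
rogue = make_keypair(2**31 - 1, 2**19 - 1)

certificate = authorize_device(user, laptop["public"])
request = b"GET /records/blood-type"

ok = verify_request(user["public"], laptop["public"], certificate,
                    request, sign(laptop, request))
# A device the user never signed cannot reuse the laptop's certificate.
rogue_ok = verify_request(user["public"], rogue["public"], certificate,
                          request, sign(rogue, request))
print(ok, rogue_ok)
```

Adding another device under this model only requires one more call to `authorize_device` with the new device's public key; the user's identity stays the same.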
To encrypt data, much more efficient symmetric keys are used. When a piece of encrypted data is created, it is encrypted once for each group of users that will have access to it, producing a separate ciphertext per group. Each group of users is backed by a single symmetric key which is shared ahead of time amongst that group. As other users access websites which reference this data, their clients will request its encrypted form from the appropriate storage, and then decrypt and display it with this previously shared key.
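A minimal sketch of the per-group scheme follows, assuming each group's key has already been distributed to its members. The group names and `encrypt_for_groups` are hypothetical, and the toy hash-based cipher with an HMAC tag only stands in for a real authenticated cipher; it is not meant to be used as one.

```python
import hashlib
import hmac
import secrets

def _keystream(key, nonce, length):
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def toy_encrypt(key, plaintext):
    """Illustrative stand-in for an authenticated cipher. NOT for real use."""
    nonce = secrets.token_bytes(16)
    body = bytes(a ^ b for a, b in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag

def toy_decrypt(key, blob):
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or modified ciphertext")
    return bytes(a ^ b for a, b in zip(body, _keystream(key, nonce, len(body))))

def encrypt_for_groups(plaintext, group_keys):
    """Produce one ciphertext per group, each under that group's shared key."""
    return {name: toy_encrypt(key, plaintext) for name, key in group_keys.items()}

# Assumed to have been shared with each group's members ahead of time.
group_keys = {"family": secrets.token_bytes(32), "doctors": secrets.token_bytes(32)}
stored = encrypt_for_groups(b"Allergy: penicillin", group_keys)

# A member of "doctors" decrypts the copy addressed to that group.
doctor_view = toy_decrypt(group_keys["doctors"], stored["doctors"])

# Someone holding a different key fails the integrity check.
try:
    toy_decrypt(secrets.token_bytes(32), stored["doctors"])
    outsider_succeeded = True
except ValueError:
    outsider_succeeded = False

print(doctor_view, outsider_succeeded)
```

One consequence visible in the sketch is that storage grows with the number of groups, since each group gets its own ciphertext of the same plaintext.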