Data without context is meaningless; data without trust is useless.
2017-12-18 is nothing but a string—until context tells us it’s a date, and trust tells us it’s the correct one.
Imagine spotting a colorful shop on the high street. Right as you walk through the door, the shop owner jumps in front of you, blocking your way with a giant tray of home-baked goods. You just wanted to have a quick look around—but no: cookies improve your experience, and therefore you must decide now which ones you want.
It’s super weird and awkward because you don’t even know this person. You cannot not choose—the tray stays in your way until you do.
It’s hardly an improvement to anyone’s experience when websites open with a big banner they can’t ignore. Before fully seeing the main page, we’re asked to enter into a legal contract. With just one click, we sometimes accept hundreds of legal policies, most of which are written such that the average person cannot understand them. It’s almost as if they want to make their own experience more pleasant, rather than ours…
Why are we confronted with so much more resistance online than in the real world? Shops wouldn’t have any visitors if they treated people like the average website does. Clearly, trust on the Web is at an all-time low and it’s high time we change that.
Fixing this requires paying attention to both sides of the equation. It’s not just people who lose today: companies also fight a losing battle and the end is not in sight. Trust—or rather the lack of it—is costing both sides dearly.
Our overall approach to make personal data work better for everyone leverages personal data vaults, or data pods as we call them. The core idea is that every piece of data someone produces about you can be stored in your personal data vault. That’s our starting point for trust: your own private data location, with a provider of your choice.
Pods are a great idea because they unlock innovation with data. Essentially, they’re putting data closer to people such that people can decide how their data will work for them. Personal data vaults are fundamentally about empowerment.
And if we do this well, there will be more data available for innovation instead of less, because people can choose to reuse data from other places to their own advantage. If a company offers them a tangible benefit, a customer can decide to share some piece of data obtained elsewhere. The company can then temporarily employ that data during the specific interaction to provide the customer with an enhanced quality of service. All of this happens without the company ever taking control of the customer’s data, and thus without either party having to deal with the additional complexities of doing so.
That’s a forward-looking example of a win–win strategy: people regain agency, and more data becomes available for innovation by companies (still fully under people’s control). We need to apply a similar win–win strategy when it comes to trust.
Achieving such a level of trust becomes possible when a pod guards my data for me. But how will we implement these varying levels of control? And how can we avoid those predatory screens asking for our consent, with a personal data pod that might look like an irresistible honeypot to data hoarders?
First, we cannot look at the future with the eyes of today. Companies today have an incentive to pry away from us as much data as they possibly can. In a future where data flows more responsibly, hoarding data is likely to become more expensive than relying on people sharing data from their pods at the point of need. Hoarding will simply no longer be worth it, because of its substantial costs and massive liability under GDPR and other legislation. Even for companies who go through all the effort, in the best case, they’ll have data that’s almost as good as what will already be available to their competitors through a data pod. Not exactly a future-proof strategy.
Second and most importantly, the consent screens of your pod will be working for you. In contrast, today’s cookie screens work on behalf of the party requesting access, which has every reason to make those screens as confusing and unintelligible as they can possibly be. It’s not sufficient to be a lawyer to interpret them—you’d need to be their lawyer. Your pod’s screens instead follow your preferences. Depending on the kind of data and who is asking, you can configure your pod to:
- automatically grant or deny access
- involve a trusted person to help you decide
- explicitly give or withhold permission yourself
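As a rough sketch of how such configurable handling could work, here is some illustrative Python; all names, data items, and rules are hypothetical assumptions, not part of any Solid specification:

```python
# Illustrative sketch: how a pod might route an incoming data request to
# one of the three handling modes. All names and rules are hypothetical.

AUTO_GRANT, AUTO_DENY, ASK_DELEGATE, ASK_OWNER = "grant", "deny", "delegate", "ask"

# Per-owner preferences: (data item, requester category) -> handling mode.
# "*" matches any requester category.
preferences = {
    ("shoeSize", "shoe_store"): AUTO_GRANT,   # share freely with shoe stores
    ("email", "marketing"): AUTO_DENY,        # never share with marketers
    ("medicalRecord", "*"): ASK_DELEGATE,     # let a trusted person decide
}

def resolve(data_item: str, requester_category: str) -> str:
    """Return how the pod should handle a request for data_item."""
    for (item, category), mode in preferences.items():
        if item == data_item and category in (requester_category, "*"):
            return mode
    return ASK_OWNER  # default: explicitly ask the owner

print(resolve("shoeSize", "shoe_store"))  # grant
print(resolve("height", "shoe_store"))    # ask
```

The point is that the default is always to ask you, and anything more automated only happens because you configured it that way.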
So if you’re as legally inexperienced as I am, you might prefer a dialog from your pod that explains in layperson terms what data will be used for what purpose. You’ll see the option to say yes to some things and no to others. For cases where you don’t care one way or the other, your pod can simply decide for you.
Your pod doesn’t have any incentive to share anything you don’t want to share.
Your pod doesn’t have any incentive to show you pop-ups you don’t want to see or ask you questions you don’t want to answer or even think about.
All screens will be designed and personalized for you, such that you have everything you need to make an informed choice. And what you need will be different from what others need. Your pod protects you when you’re about to share sensitive information, and leaves you alone when you’re not. You get to decide what you consider sensitive, and this decision can be different for everyone.
Let’s take a deliberately simple case: I’m happy to share my shoe size with any shoe or clothing store. And it just so happens I’m planning to buy new sneakers, which I want the store to send to my home address. So they need data to deliver this service to me, for the duration of our transaction. What we want is data sharing to our mutual benefit. It’s that simple—or is it?
Try any website, I challenge you. Adidas, Reebok, Nike, Puma. All of them want me to decide on cookies first. I’m a customer here! All I need is for them to show me a nice pair of shoes that fit, and I will buy them. Why does that straightforward transaction require a complicated upfront legal contract? My data experience is terrible:
- ❌ I need to make decisions about cookies.
- ❌ I need to manually copy over my data.
- ❌ I have no idea what happens to my data.
So despite living in the age of ever-advancing AI, we’re forced by today’s Web to manually copy over the same data again and again, every time agreeing to a new contract. Just think about how many times you have to type out your name or email address on a weekly basis. It’s 2023 and we still manually carry our data around.
That’s all because of a mutual lack of trust. To buy shoes, I have to enter my address and payment details. Companies have to trust that those are correct; I have to trust that they will handle my data correctly. Us carrying our data across their doorstep is an attempt to compensate for being unable to make data and trust flow together. Fortunately, our pods can correctly share correct data on our behalf.
So let’s solve these problems once and for all. Not because I care so much about buying shoes—but because the same friction recurs in nearly every online interaction.
I arrive on Didasbok, the shoe store website of the future. I can see a bunch of shoes: Nice ones, some that fit, some that don’t. There are no cookie pop-ups or consent dialogs to click or accept before I can see their products. Just shoes, and I like it.
At first, Didasbok doesn’t know anything about me. That’s great—I don’t know anything about them yet either, so there’s no reason to share anything.
Now this is where the magic starts to happen. I’m not clicking any button on their website: they’re not storing my data—I am, through my pod, which is securely connected to my browser. Instead of logging in on the website, I’m clicking a button on the frame of my browser window to indicate I’m ready to share data via my pod.
The reason we can do this is that Solid adds a layer to the existing Web with standard authentication and authorization. Just like how HTML and CSS standards form a contract between websites and browsers to decide what webpages should look like, the Solid specification will enable contracts for standardized identity and data sharing. Depending on the website, I might want to present myself as:
- entirely anonymous
- a student
- a private citizen
- an employee of a specific company
- a pseudonymous commenter
That’s why I first tell my browser what persona to pick for each website. It’s basically the equivalent of signing up for a website with my personal email address, work email address, or a burner email account. When I share any data, it will be from that context. So now my browser knows how I want to share data with the website. But of course, I’ll only share what’s convenient to me.
Next, the website informs my pod that it can use my shoe size to filter results, and my country to calculate shipping fees. My pod then generates a dialog that allows me to choose whether I want to share data from my selected identity:
I’m not interested in promotions right now, but I agree to share my shoe size. Lo and behold, my screen fills up with only shoes that fit me. At this point, Didasbok still does not know who I am. All it knows is that this visitor is a Belgian with shoe size 48. And I didn’t have to type this data into the website: my pod selected the right data and sent it over, after making sure that’s what I want. Take a moment to reflect on how much better this future experience is compared to what we have today:
- 🚫 I received no questions about cookies.
- 👍 I did not have to type anything.
- ✅ I was consulted in detail before my data was shared.
The previous dialog still contained some legalese, which is not what I want to see when I’m buying shoes (your mileage may vary). Therefore I changed my pod’s settings to show me more simplified dialogs.
I click the giant “Buy now” button. In order to finalize the transaction, Didasbok needs my complete address for shipping, and my credit card details for the payment. Today, we would enter all of this data manually in text fields on a form. But this is the future! Not only can my pod automatically fill out such forms; the forms don’t need to be there in the first place, because my pod can just share the needed data, machine to machine.
- 🚫 I received no questions about cookies.
- 👍 I did not have to type anything.
- ✅ I could share my data via a simple dialog.
In user experience design, the ultimate way to improve a dialog is to make it disappear. Just like pod-based data sharing can make forms obsolete, it can also make consent dialogs a thing of the past.
I’m fine with shoe stores knowing my shoe size. They don’t need to ask me. In fact, by asking me, you’re bothering me more than by just using that data already. So what if we didn’t need a dialog at all?
Don’t get me wrong—consent matters, and some choices we definitely want to make ourselves.
But data pods allow us to make that choice beforehand; or make it once and then tick a box that we don’t want to make that choice again. That’s because, unlike a website you visit for the first time, your pod knows you and your preferences across any online and offline case where you might want to use your data. And since your pod works for you, you can trust it to consistently apply those preferences.
In essence, I am giving prior consent to any future requests coming in, such that I don’t have to be confronted with pop-ups or make decisions in the moment:
- 🚫 I received no questions about anything whatsoever.
- 👍 I still did not have to type anything.
- ✅ I pre-approved automated sharing of specific data.
Behind the scenes, upon receiving such a request, my pod will:
- Verify the requester is indeed a shoe store or clothing store, for instance through their NACE or ISIC codes.
- Instantiate my internal policy for that single store, packaging my data together with this highly specific policy.
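A minimal sketch of those two steps, assuming hypothetical NACE codes and data structures (this is not how any particular pod server implements it):

```python
# Hedged sketch of the two steps above: verify the requester's sector and
# instantiate a single-store policy. NACE codes and field names are assumed.

SHOE_AND_CLOTHING_NACE = {"47.71", "47.72"}  # clothing / footwear retail (assumed)

internal_rule = {
    "data": "shoeSize",
    "allowed_sectors": SHOE_AND_CLOTHING_NACE,
    "purpose": "result filtering",
    "duration": "PT1H",  # one hour, as an ISO 8601 duration
}

def instantiate_policy(rule, requester):
    """Package the general rule as a policy for this one store, or refuse."""
    if requester["nace"] not in rule["allowed_sectors"]:
        return None  # not a shoe or clothing store
    return {
        "data": rule["data"],
        "assignee": requester["webid"],  # permission for this store only
        "purpose": rule["purpose"],
        "duration": rule["duration"],
    }

policy = instantiate_policy(
    internal_rule,
    {"webid": "https://didasbok.example/profile", "nace": "47.72"},
)
```

Note that the generated policy names one specific assignee: the store only ever sees a permission scoped to itself, never my general rule.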
That’s quite a bit more legalese than I would have written, but that’s why I’m letting my pod automatically generate it! The recipient doesn’t know that I have an internal rule set that gives the same permission to all shoe stores; all they know is that they can now legally use the data I’m sending them (only) for the purpose I agreed to.
The reason these previous examples work is that they make trust flow along with the data. We’ve known for a long time that we need to send the semantics along with the data, such that people and machines from different organizations can interpret the data in the same way. If we don’t, confusion ensues—the same value can mean entirely different things to different parties.
With Solid pods, we use RDF to integrate semantics with the data. My proposal is to encapsulate this RDF data into a trust envelope, which describes additional context, also in RDF. This envelope establishes mutual trust between the sender and the recipient by explicitly detailing the history and destiny of the data.
Today, without a trust envelope, I have control over what I share, but my pod is just sending raw data. So once the gate opens—the data is out there, with nothing attached to say how it may or may not be used.
When my pod instead generates a trust envelope detailing the intended destiny of my data, the recipient can track for what purposes they can use it. For example: I can share my date of birth for the purpose of verifying whether I am of legal age to buy wine at Online Store Inc., valid for the duration of 1 hour. The whole envelope and its contents will be digitally signed with my private key, such that I can prove at any point in time which requests I have and haven’t authorized. Therefore, the envelope establishes trust that your data will be used correctly.
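As a loose illustration of such an envelope, here is a stdlib-only Python sketch; an HMAC stands in for a real public-key signature, and all field names are assumptions:

```python
# Loose sketch of a trust envelope: data plus its intended destiny, sealed
# with a signature. HMAC stands in for a real public-key signature so the
# sketch stays stdlib-only; all field names are assumptions.
import hashlib
import hmac
import json

OWNER_KEY = b"placeholder-for-a-real-private-key"

def seal(data, destiny):
    """Wrap data and destiny in an envelope and sign the whole package."""
    body = {"data": data, "destiny": destiny}
    payload = json.dumps(body, sort_keys=True).encode()
    signature = hmac.new(OWNER_KEY, payload, hashlib.sha256).hexdigest()
    return {**body, "signature": signature}

def verify(envelope):
    """Check that data and destiny still match the owner's signature."""
    body = {"data": envelope["data"], "destiny": envelope["destiny"]}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(OWNER_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["signature"])

envelope = seal(
    {"dateOfBirth": "1990-01-01"},
    {"purpose": "age verification",
     "recipient": "Online Store Inc.",
     "validFor": "PT1H"},
)
assert verify(envelope)

# Resealing the same data in a different envelope breaks the signature:
tampered = {**envelope, "destiny": {**envelope["destiny"], "purpose": "marketing"}}
assert not verify(tampered)
```

Because the signature covers data and destiny together, swapping in a different destiny is immediately detectable.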
Now of course, just because we’re sending the semantics and trust along with the data, doesn’t prevent the recipient from separating them. They can discard the semantics and make a wrong interpretation. They can throw the trust envelope in the bin and just work with the raw data as if there are no limitations. This is why I refer to this concept as a “trust envelope” rather than more established terms such as sticky policies, because policies are never really sticky. But while the envelope metaphor emphasizes how someone could separate it from its data, a digital signature on the sealed envelope means we would notice whenever they tried to reseal the data in a different envelope.
Today, without a trust envelope, companies have to hope that we send them the correct data. Is this really the customer’s address, or did they make a typo? Am I really 18+, or did I just make up a date of birth so I could buy a bottle of wine for my friend? Or worse, does the data require a custom verification process, which is expensive to install and maintain, and actually mostly bothers people who already provided correct data?
If the trust envelope explains the history of the data, digitally signed, companies can automatically verify whether they consider data trustworthy. Such provenance can originally have been supplied by other parties. For instance, my date of birth in my pod might be accompanied by digitally signed provenance from a government’s citizens’ registry. That way, the hard part of the verification performed by another party can be effortlessly reused by both my pod and the recipient of the message. On top of the trust envelope, my pod simply provides the existing provenance trail. Thereby, the envelope establishes trust that my data is correct.
Trust is a bilateral construct. It is to my benefit that the recipient has trust in the correctness of my data, as well as to their benefit that the data is correct. Similarly, it is to my benefit that the recipient knows what I consider correct usage of my data, and it is to the recipient’s benefit that they only use my data in correct ways.
And that’s the reason why recipients will want to keep the trust envelope instead of throwing it in the bin. Because it’s their ticket to proving to an auditor that, indeed, they have only used our data for purposes to which we consented. That’s where our part of the trust comes in: when our pod sends data in a trust envelope, we’re not going to be naive and merely assume that these companies will now do the right thing. No, it’s because, even though they technically can do anything with that data, they will only ever be able to prove legally whatever the policy envelope allows them to do.
The way laws are written assumes that the majority will do the right thing. Today, being GDPR-compliant is hard even for companies who want to do the right thing. How can they prove that they were indeed allowed to use my date of birth for an age check? All they have is their own claim that I ticked a certain checkbox on their website. That’s not exactly the strong proof that they and I want.
The policy envelope supports those companies who aim to do the right thing, and ensures that companies who do the wrong thing will not have proof that would hold up in an audit. They can still do the wrong thing—but now provably without our permission.
Raw data is like raw meat. An ingredient of delicious dishes to some, but not exactly a thing you’d touch without thoroughly washing your hands. And just like raw chicken can infect an entire kitchen or restaurant, raw data can be infectious to an entire organization. Organizations must be able to trust that data is correct, and that they are using data correctly. We deserve the same guarantees from them.
Many seem to think that our data is flowing too easily today: we’ve lost control and our data is everywhere. I’m actually arguing the opposite point: our data does not flow well enough. Because our data is currently shared without trust, we need to enter that data ourselves on every website we visit, time and time again. Data isn’t flowing easily at all: we have to manually sustain data flows, serving as a human replacement for the broken trust link in the data chain.
Companies deal with this lack of trust today by rolling out their own storage and trust systems, but that doesn’t scale. Not for them, and not for us. First, many companies aren’t doing it right, because it’s not what they specialize in, and legal data compliance across the globe is inherently difficult. Second, such compliance is expensive and needs constant maintenance. Third, the burden of compliance is pushed upon us, with legalese-ridden pop-ups that even most lawyers can no longer understand.
GDPR has exposed the lack of trusted data flows, but companies have doubled down by merely acknowledging this lack through cookie pop-ups, rather than restoring trust. The cure has proven worse than the disease: their legal compliance has become our responsibility, rather than a healthy basis for a mutually beneficial relationship.
Data pods can make data flow responsibly, and provide the right environment to make trust flow along with the data. Not only can pods, with our consent, share useful data automatically; they can also ensure that we only need to give explicit consent when we want to, and opt for prior consent for cases where we want to be left alone. Pods can provide control as a right, not as a duty, by helping us exercise this control rather than making it more complicated.
The shoe store example shows that, rather than claiming to give people a better experience with cookies, companies can actually provide us with a better experience when they allow for pod-driven data sharing. And as much as sharing my shoe size was a toy example, even this simple case isn’t possible today—yet. We’re working on creating the standards and technology to make this happen, and when ready, this will open the door to many more complex mutually beneficial scenarios.
Think about all cases where data sharing is a burden today. Think about job searching and application, where you have to send structured and unstructured data around, and then companies have to spend time verifying your input. Trusted data flows could make such scenarios so much more efficient, saving us and companies valuable time and resources. In the medical domain, trusted data flows can literally save lives. The fact that my Fitbit tracker data cannot flow to my doctor is a missed opportunity of significant proportions. Trust envelopes are a game changer for preventive healthcare.
To make data and trust flow together, the design of our data interfaces will need to change fundamentally. In today’s Solid ecosystem, data, history, and destiny are kept separate, so we need to recombine them. Rather than implicitly assuming that certain data comes with a certain provenance and intended usage, we need to make trust explicit. Pod servers should no longer send generic raw documents, but encapsulate the data from each response in a trust envelope instantiated for the specific interaction.
Once we master sending original documents, we want trust to flow with derived data such that, instead of my full date of birth, I can share a verifiable claim that I am older than 18. To make this happen, I send a trust envelope with my birthdate to an intermediary we both approve. This intermediary replies using a new trust envelope with an intact provenance trail, containing my claim of adulthood. I can then share that second envelope with the store, which they will trust because of its provenance trail, without them having the burden of handling personally identifiable information.
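A simplified sketch of that intermediary step, with hypothetical envelope and provenance structures:

```python
# Simplified sketch: an intermediary derives an "over 18" claim from a
# birthdate envelope and extends the provenance trail, so the store never
# sees the raw date of birth. Structures are illustrative assumptions.
from datetime import date

def derive_adulthood(envelope, today):
    """Turn a birthdate envelope into a minimal adulthood claim."""
    birth = date.fromisoformat(envelope["data"]["dateOfBirth"])
    age = today.year - birth.year - ((today.month, today.day) < (birth.month, birth.day))
    return {
        "data": {"overEighteen": age >= 18},
        "provenance": [
            "derived by https://intermediary.example from a signed birthdate",
            *envelope.get("provenance", []),  # keep the original trail intact
        ],
    }

birth_envelope = {
    "data": {"dateOfBirth": "1990-01-01"},
    "provenance": ["signed by a citizens' registry (assumed)"],
}
claim = derive_adulthood(birth_envelope, date(2023, 6, 1))
```

The derived envelope carries strictly less personal data, yet its extended provenance trail lets the store verify how the claim came about.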
In addition to solving these technological challenges, we also need progress elsewhere. From a usability perspective, if our goal is for people to not be bothered as much as they are today, the proposed automated consent instantiation can help. However, such automated instantiation paradoxically requires us to answer beforehand questions we actually didn’t want to answer ourselves. Presenting these to people as literal questions, assuming this is legally possible, could defeat our purpose of preventing consent fatigue. An interesting direction could be reusable consent profiles, where for instance a government releases a list of profiles that people can combine into a data sharing strategy that suits them. For example, one profile might be “relaxed shopper”, which I can mix into my pod’s usage policy rules to give only shoe or clothing stores prior permission to use my body measurements for result filtering. And if I’m not okay with this, that’s fine: I can instead select the default “strict shopper” experience where no such data is shared without asking. To facilitate such consent, we need to explore the usability of different kinds of personalized consent dialogs for different people.
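A toy sketch of how such reusable consent profiles might be combined; the profile names and rules are invented for illustration:

```python
# Toy sketch of reusable consent profiles: predefined rule sets that can
# be mixed into a pod's usage policy. Profile names and rules are invented.

PROFILES = {
    "relaxed shopper": [
        {"data": "bodyMeasurements",
         "sectors": {"shoe_store", "clothing_store"},
         "purpose": "result filtering",
         "decision": "grant"},
    ],
    "strict shopper": [],  # no prior permissions: the pod always asks
}

def build_policy(selected_profiles):
    """Combine the chosen profiles into one list of prior-consent rules."""
    rules = []
    for name in selected_profiles:
        rules.extend(PROFILES[name])
    return rules

relaxed = build_policy(["relaxed shopper"])
strict = build_policy(["strict shopper"])
```

Publishing such profiles centrally, for instance by a government, would let people adopt a vetted data sharing strategy without answering every question themselves.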
In all of this, it’s abundantly clear that technology cannot—and should not—solve this on its own; legislation and societal change remain essential. But technology can make it much easier to do the right thing than to do the wrong thing.
And that’s what trust envelopes can do for us. They ensure that the majority of those who want to do the right thing, can do so without encountering the absurd hurdles we face today. Those with less sincere intentions will not be able to leverage the benefits of trusted data flows. The key of this all is entangling data with trust, ensuring that no data leaves our pods raw, by always sharing it inside of a trust envelope.
Thanks to Sabrina Kirrane for first introducing me to sticky policies many years ago, and for openly sharing her insights ever since. My appreciation goes out to Harsh Pandit and Beatriz Esteves for their interdisciplinary approach on consent and GDPR.