Originally designed as a decentralized network, the Web has undergone a significant centralization in recent years. In order to regain freedom and control over the digital aspects of our lives, we should understand how we arrived at this point and how we can get back on track. This chapter explains the history of decentralization in a Web context, and details Tim Berners-Lee’s role in the continued battle for a free and open Web. The challenges and solutions are not purely technical in nature, but rather fit into a larger socio-economic puzzle, to which all of us are invited to contribute. Let us take back the Web for good, and leverage its full potential as envisioned by its creator.
As an inventor, you might envision a purpose and destiny for your creation—
The concept of centralization does not pose a problem in and of itself: there are good reasons for bringing people and things together. The situation becomes problematic when we are robbed of our choice, deceived into thinking there is only one access gate to a space that, in reality, we collectively own. Some time ago, it seemed unimaginable that a fundamentally open platform like the Web would become the foundation for closed spaces, where we pay with our personal data for a fraction of the freedoms that are actually already ours. Yet a majority of Web users today find themselves confined to the boundaries of a handful of influential social networks for their daily interactions. Such networks gather opinions from all over the world, only to condense that richness into one space, where they simultaneously act as the director and judge of the resulting stream they present to us.
Because this change happened so suddenly, perhaps we need a reminder that the Web landscape looked quite different not even that long ago. In 2008, Iranian blogger Hossein Derakhshan was arrested, and he was later sentenced to almost 20 years in jail, primarily because of blog posts he had written. He and many others were able to state their critical opinions because they had the Web as an open platform, so they did not depend on anyone’s permission to publish their words. Crucially, the Web’s hyperlinking mechanism lets blogs point to each other, again without requiring any form of permission. This allows for a decentralized value network between equals, where readers remain in active and conscious control of their next move. When Derakhshan was eventually released in 2014, he came back to an entirely different Web: critical readers had transformed into passive viewers, as if watching television. While Web technology had of course evolved, its core foundations had not—
Of course, social media are not our enemies here: they should be credited with lowering the barrier for the online publication of short texts and photos by anyone. Yet they operate under a winner-takes-all strategy, each striving to become the dominant portal instead of mutually interoperating like the rest of the Web. In contrast to blogs, we typically cannot interact with posts in one network from within another: we would need to either move the people or the data. This famous “walled gardens” problem of social media has significantly worsened since 2008, because some gardens have grown huge while their walls remain in place. A major problem is that access to the dominant networks invariably means giving up control over our personal data: we can enter through the door in the wall if we pay with our digital belongings. That personal data can then be leveraged to influence us without our awareness, through excessively personalized advertising for brands, products, and even political agendas. Furthermore, once there, people tend to form small conversational circles within each garden—
- taking back control of our personal data;
- preventing the spread of misinformation;
- realizing transparency for political advertising.
Clearly, it is undesirable to tackle these challenges through centralized solutions, for instance by appointing an authority for personal data, news, and advertising. This would create yet another single point of failure, which—
From the above, it is clear that our primary obstacles are not technological; hence Tim Berners-Lee’s call to “assemble the brightest minds from business, technology, government, civil society, the arts, and academia to tackle the threats to the Web’s future”. Yet at the same time, computer scientists and engineers bear the technological burden of proof: they need to show that decentralized personal data networks can scale globally and that they can provide people with an experience similar to that of centralized platforms.
In this chapter, we will therefore start with a technological perspective on decentralization, highlighting Tim Berners-Lee’s role in the continuing fight to keep the Web open and decentralized. After a historical overview of power struggles on the Web, we will zoom in on the changes that decentralization requires, and examine what a healthier ecosystem would look like. As a concrete implementation of these principles, we will study the Solid project. We will end with a discussion of open challenges and an outlook on the future.
The arrows of the decentralization movement have not always been aimed at social media—
Decentralized systems, which do not require a central mediator to function, were already around at the time the Web was invented. Most notably, the Internet was rapidly gaining popularity as a large-scale decentralized network. Email was even more decentralized than the traditional postal mail service it mimicked, since different mail servers would directly exchange messages with each other. Long-forgotten protocols such as the Network News Transfer Protocol (NNTP) allowed for the decentralized exchange of news articles. In short, decentralization was not some crazy new idea, but rather the spirit of the time.
Therefore, when Tim Berners-Lee set out to design a new hypertext system in 1989, it was presumed to be decentralized, in contrast to documentation systems of the time, but in line with many other systems of that era. The main selling point of the Web was its universality, its independence of hardware and software, among other things; decentralization was simply the unspoken assumption. This is reflected in the original article introducing the Web, which emphasizes universal readability across operating systems, but does not mention the term “decentralization” at all.
The only component with centralized roots in the Web’s architectural design is the Domain Name System (DNS), which resolves the domain name part of a Web address (such as example.org) to the network address of a machine on the Internet. This was not as much of an issue back in the days when the number of domains was relatively small and domain ownership was stable. Nowadays, millions of domain names frequently change hands, thereby breaking existing links in possibly malicious ways. By manipulating DNS, governments can block or alter access to existing websites. Tim Berners-Lee has indicated that, in hindsight, a more decentralized naming system might have been preferred. Apart from that, the Web contained all ingredients to thrive in a decentralized way.
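To make the dependency on DNS concrete, the following minimal Python sketch separates the domain name part of a Web address and hands it to the system resolver. The function names `domain_of` and `resolve` are illustrative assumptions rather than part of any Web standard, and `resolve` needs network access, so it is defined here but not called.

```python
import socket
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Return the domain name part of a Web address."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no domain name found in {url!r}")
    return host


def resolve(domain: str) -> list:
    """Ask the system's DNS resolver for the addresses behind a domain.

    Requires network access; whoever controls the answer given here
    controls where every link to this domain ends up.
    """
    infos = socket.getaddrinfo(domain, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


# Every link to this page depends on the name below being resolved.
domain = domain_of("https://example.org/some/page")
```

Calling `resolve(domain)` on a connected machine returns whatever addresses the resolver hands back, which is exactly the step that a change of domain ownership or a manipulated resolver can redirect.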
A first wave of centralization resulted as collateral damage from the browser war of the nineties, in which companies competed to become the sole vendor of the software through which we access the Web. The Web’s design principle of universality demanded readability on any platform, so the emergence of multiple browsers was a blessing—
While competition through innovation is fine, these features came at the cost of incompatibility across browsers and therefore directly endangered the Web’s universality. Websites would carry badges such as “best viewed in Internet Explorer”, since a consistent experience across platforms could not be guaranteed. Those who did not want to use a particular browser—
The World Wide Web Consortium (W3C) was founded by Tim Berners-Lee with a mission of compatibility, enabling cross-browser consistency through recommendations that specify the correct workings of Web technologies. While W3C standardization is administratively centralized, it incorporates feedback from a decentralized network of members through a consensus-driven process. A problem by the early 2000s was that Internet Explorer deviated from W3C recommendations at crucial points, forcing developers to follow either the actual standards or their incorrect implementation in the most popular browser.
Fortunately, pressure from Firefox and Safari during a second browser war eventually forced Microsoft onto a more standards-oriented course. Since 2010, no single browser has held more than two thirds of the global market share, meaning that standards compatibility is now in the interest of browser vendors and Web developers alike. The balkanization of the Web through centralized browser development has thereby largely been averted.
Microsoft’s short-lived victory after the first browser war quickly turned out to be insignificant, since the centralization battle had gradually shifted to other fields. While each browser was quarreling to become the default application, search engines were racing to become the main entry point. Soon, it did not matter anymore what software you used for browsing; what mattered was who gave you the directions of where to browse next. After all, no immediate income could be generated from free browser development, whereas companies would gladly pay for a prime spot in one of the major search engines’ rankings.
The early search engine landscape featured several competitors, such as AltaVista and Lycos, but it took Google only a couple of years to become by far the most popular. The centralization of search meant that one company gained an overly strong influence on what content people would access, based on the ordering of search results for given terms. Even assuming the best of intentions and ignoring paid advertising, the fact that one algorithm makes decisions for a large number of people leads to an information bias, as there clearly exists no single objective way to rank the “best” webpages on any topic. External attempts to manipulate these algorithms started to occur, first through relatively simple interventions such as misleading keywords, later through advanced Search Engine Optimization (SEO) techniques that aimed to improve website rankings in various (and sometimes dubious) ways.
The advent of search engines also brought the first online monetization of user-generated data. Our search terms contribute to a detailed profile of what we need in our private and professional lives. Search engines might know more about some aspects of our lives than our close friends. This profile determines the ads we receive and the personalization of our search results, encouraging us to visit websites and buy things we otherwise might not have. While personalization has helpful effects for many people, the problem is that we are left without choice or control. We are directed to the large search engines, which, due to their large accumulation of data, provide us with the best search experience. Yet these search engines do not provide us with options for how we want to pay for their services, as most of them only accept our personal data. Furthermore, we are not informed about—
While Google’s reign continues, social media have found an even more powerful way of collecting and marketing our personal data. The social Web revolution of the 2000s encouraged people to be present online, which drove many of us to various platforms to share blog posts, bookmarks, photos, videos, and more. Some years later, social media companies created centralized platforms to take over many of these features, which until then were spread out across multiple providers. These platforms store our personal data and request far-reaching usage rights in exchange for their services, all of which operate within their own walled garden.
Like search engines, the main service of social networks consists of a linear list of content, ranked by factors and algorithms we can only minimally influence. In contrast to search, a social feed is generated without any input terms from our side, like a television that no longer requires a remote. The ensuing show is meticulously personalized based on data we consciously left on social network platforms, combined with traces from our browsing history picked up (without our conscious consent) by social trackers on third-party websites. In his 2018 Dertouzos distinguished lecture, Tim Berners-Lee mentioned that political advertising has been banned from television in the UK because of concerns about the impact of such a direct medium. Yet by that logic, he continued, we should be much more concerned about the heavily personalized political advertising that current social media platforms enable and allow. Even if we refrain from explicitly sharing certain sensitive traits, seemingly insignificant pieces of other data can be combined into reliable predictors of highly personal information such as sexual orientation, ethnicity, and religious or political views, which are subsequently used to target us.
As in the previous two centralization races, a subtle force is exerted upon us: we feel pressured to be part of the large networks, because not joining means missing out on the volatile virtual traces of our friends’ and family members’ lives. Often the easiest way for grandparents to see their grandchildren’s latest pictures is to create a Facebook or Instagram account. This is how the digital memory of a large part of today’s generation ends up in one space, often beyond the control of those who are part of the memories. The centralization of our online activities has become so extreme that some Facebook users are unaware that they are using the Internet at all. This paradox has sadly become a reality in many countries, where Facebook’s Internet.org initiative provides a severely constrained version of the Web that further reduces people’s options, in blatant violation of Net Neutrality.
Meanwhile, another race is happening in the background, namely the battle to become our identity provider. An increasing number of websites are gradually replacing their own login systems with authentication tied to large platforms such as Google or Facebook. For people with an existing account, the “Log in with Facebook” buttons are a convenience. For those without, they form additional pressure to join. And in both cases, such buttons are yet another way of tracking our online activities. This centralization of identity takes away our freedom to assume the persona we want—
A recurring theme in the above centralization races is the lack of choice: a choice of browser and operating system, of entry point to the Web, of storage for our personal data. Decentralization is fundamentally about enabling choice, by breaking up artificially coupled decisions into individual options that can be combined at will. Just as we are free to choose any combination of device, operating system, and browser to access the Web, we should be able to interact with websites and other people without commitment to a single social or other platform.
Taking back control of our personal data, as envisioned by Tim Berners-Lee, is realized by decoupling data storage from services. This means people can store their data wherever they choose, while still enjoying the services they want. We can pick any provider to store our texts, photos, and videos—
This mindset gives rise to the concept of a personal data pod, in which we can store every single piece of information we produce. As shown in the figure below, this statement can be taken quite literally: even a seemingly trivial piece of data, such as a simple “like” we gave a certain webpage or thing, can be stored in our own pod. While such a degree of decentralization might seem extreme, recall that even supposedly trivial likes can reveal much deeper personal information, so it makes sense to give people control over them. Furthermore, since we do not depend on anyone’s permission to publish data in our own pod, we can place likes, annotations, and comments on anything we want, without fear of them being censored or deleted.
This total data ownership enables highly granular access control: people can selectively give permission to friends or applications to read or write specific parts of their data pod. For instance, they can decide whether or not they make their profile picture and full name public, who can see which of their likes and comments, and what applications can edit their pictures or posts on their behalf. These permissions can be changed or revoked at any time. People can have multiple data pods for different purposes, for instance, a pod for personal and family pictures at home, a pod governed by retention policies for professional data at the workplace, and a university pod with study materials and grades. Upon creation, they can decide which data is stored in which pod.
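The permission model described above can be sketched in a few lines of Python. This is a hypothetical in-memory model under simplifying assumptions: the `Pod` class, the `PUBLIC` pseudo-agent, and the two modes `READ` and `WRITE` are invented for illustration and do not reproduce Solid's actual access control mechanism.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory model for illustration; not Solid's actual API.
PUBLIC = "public"              # pseudo-agent standing for "everyone"
READ, WRITE = "read", "write"


@dataclass
class Pod:
    owner: str
    resources: dict = field(default_factory=dict)  # path -> content
    grants: dict = field(default_factory=dict)     # (path, agent) -> set of modes

    def grant(self, path, agent, *modes):
        """Give an agent (a person or an application) modes on one resource."""
        self.grants.setdefault((path, agent), set()).update(modes)

    def revoke(self, path, agent):
        """Withdraw everything previously granted to an agent on one resource."""
        self.grants.pop((path, agent), None)

    def allowed(self, path, agent, mode):
        if agent == self.owner:
            return True  # the owner always keeps full control
        granted = (self.grants.get((path, agent), set())
                   | self.grants.get((path, PUBLIC), set()))
        return mode in granted

    def read(self, path, agent):
        if not self.allowed(path, agent, READ):
            raise PermissionError(f"{agent} may not read {path}")
        return self.resources[path]

    def write(self, path, agent, content):
        if not self.allowed(path, agent, WRITE):
            raise PermissionError(f"{agent} may not write {path}")
        self.resources[path] = content


pod = Pod(owner="alice")
pod.write("/profile/name", "alice", "Alice")
pod.write("/photos/holiday", "alice", "original photo")
pod.grant("/profile/name", PUBLIC, READ)                   # the name is public
pod.grant("/photos/holiday", "bob", READ)                  # a friend may look
pod.grant("/photos/holiday", "photo-editor", READ, WRITE)  # an app may edit
pod.write("/photos/holiday", "photo-editor", "cropped photo")
```

Permissions are attached to individual resources rather than to the pod as a whole, which is what makes it possible to publish a name while keeping a photo between friends, and to call `revoke` for one agent without affecting anyone else.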
By choosing the storage location of our own data, we prevent unauthorized access and exploitation. We are no longer obliged to pay with our data in order to access a certain service. Moreover, we can protect the most sensitive parts of our data by keeping them to ourselves, and limit sharing to people and services that really require it—
When people store their own data, privacy-unfriendly business models centered around data ownership will not be viable anymore. Such an economic change can be accelerated through legislation, like the EU’s General Data Protection Regulation (GDPR), as well as growing awareness among the general population about the dangers of centralization, given recent data scandals at companies such as Equifax and Facebook. Consequently, new business models for applications become necessary.
Decentralization requires the nature of applications to evolve from silos to shared views. As shown in the figure below, current Web apps combine data and service. Because of this coupling, our LinkedIn contacts cannot comment on our Facebook pictures, and an RSVP on a Facebook event will not be reflected in our Doodle calendar’s availability. Decentralized applications, on the other hand, act as views on top of our data pod and those of others. When granted specific access rights, photos uploaded into our data pod by a photo gallery application can be accessed by a social feed app. Events in my personal calendar that have public visibility can show up in the same feed. Our friends can view the parts of our data to which we grant them access through whatever application they wish to use.
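As a rough illustration of applications acting as views, the Python sketch below lets a photo gallery and a calendar write into a person's pod, while a separate feed application merely reads whatever a given viewer is allowed to see. All class names, the `visible_to` field, and the `"public"` marker are assumptions made for this example; they are not taken from Solid.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical personal data store holding typed items with a visibility."""
    owner: str
    items: list = field(default_factory=list)

    def add(self, kind, content, visible_to):
        self.items.append(
            {"kind": kind, "content": content, "visible_to": set(visible_to)}
        )

    def query(self, viewer, kinds):
        """Return the items of the given kinds that this viewer may see."""
        return [
            item for item in self.items
            if item["kind"] in kinds
            and (viewer == self.owner
                 or viewer in item["visible_to"]
                 or "public" in item["visible_to"])
        ]


class PhotoGalleryApp:
    """Stores photos in the user's pod instead of in the application."""
    def upload(self, pod, title, visible_to):
        pod.add("photo", title, visible_to)


class CalendarApp:
    def add_event(self, pod, title, visible_to):
        pod.add("event", title, visible_to)


class SocialFeedApp:
    """Owns no data: it is only a view over the pods it is pointed at."""
    def feed(self, viewer, pods):
        return [
            f"{pod.owner}: {item['kind']} {item['content']}"
            for pod in pods
            for item in pod.query(viewer, {"photo", "event"})
        ]


alice = Pod("alice")
PhotoGalleryApp().upload(alice, "Sunset", visible_to={"bob"})
CalendarApp().add_event(alice, "Web conference", visible_to={"public"})
CalendarApp().add_event(alice, "Dentist", visible_to=set())

bob_feed = SocialFeedApp().feed("bob", [alice])
carol_feed = SocialFeedApp().feed("carol", [alice])
```

Here Bob's feed contains the photo and the public event, Carol's only the public event, and the private appointment appears in neither, even though the feed application never stored any of this data itself.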
Because the choice of data and service provider becomes decoupled, separate markets for data and services emerge. The figure below shows that centralized applications compete in a single market based on data ownership, because usage of a service is coupled with usage of its storage. As such, people cannot easily switch to a better application experience, as migrating their data—
This independence means we can freely switch data and service providers, without requiring our friends to choose the same ones. This brings down the walls in between the gardens, because we gain the ability to reuse and move our data, and can interact with anyone in the entire landscape. Data and service providers can evolve without dependency on each other, which enables a faster and more creative innovation cycle. Anyone can enter either market and attract customers by providing a better experience than others, without asking for control of our data.
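The independence of the two markets can be sketched as follows, assuming a deliberately tiny storage interface. The `Storage` protocol, the provider and application classes, and the `migrate` helper are all hypothetical names chosen for this example.

```python
from typing import Protocol


class Storage(Protocol):
    """What any data provider must offer; the method names are assumptions."""
    def save(self, key: str, value: str) -> None: ...
    def load(self, key: str) -> str: ...
    def keys(self) -> list: ...


class InMemoryProvider:
    """Stand-in for a storage provider, such as a home server or a hosted pod."""
    def __init__(self, name):
        self.name = name
        self._data = {}

    def save(self, key, value):
        self._data[key] = value

    def load(self, key):
        return self._data[key]

    def keys(self):
        return sorted(self._data)


class NotesApp:
    """A service that keeps no data of its own; it only talks to a Storage."""
    def __init__(self, storage: Storage):
        self.storage = storage

    def write(self, title, text):
        self.storage.save(f"notes/{title}", text)

    def titles(self):
        prefix = "notes/"
        return [k[len(prefix):] for k in self.storage.keys() if k.startswith(prefix)]


class ReaderApp:
    """A competing service that works on the very same notes."""
    def __init__(self, storage: Storage):
        self.storage = storage

    def show(self, title):
        return self.storage.load(f"notes/{title}").upper()


def migrate(source: Storage, target: Storage) -> None:
    """Copy everything to another provider; no application is involved."""
    for key in source.keys():
        target.save(key, source.load(key))


provider_a = InMemoryProvider("provider A")
NotesApp(provider_a).write("todo", "buy milk")

provider_b = InMemoryProvider("provider B")
migrate(provider_a, provider_b)              # switch data provider
titles = NotesApp(provider_b).titles()       # same service, new provider
shown = ReaderApp(provider_b).show("todo")   # new service, same data
```

Because the applications depend only on the interface, changing the data provider and changing the service are two separate decisions, which is the property the two-market picture relies on.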
In order to realize this vision of data ownership and data/
One of the crucial aspects of Solid is that it provides a Read–Write platform, as was Tim Berners-Lee’s original intention for the Web. While writing has always been possible, in the sense that anyone could start their own website, the Web 2.0 and social media revolutions should be credited with making writing considerably easier. This explains part of the success of these platforms, as anyone can now be a content producer at any time, especially through their mobile devices.
Solid should make authoring content similarly easy, the difference being of course that we would always write to our own data pods instead of to the application through which we create. In doing so, we guarantee that everyone can express themselves without risking censorship. To maximize interoperability, our Linked Data should be stored using Semantic Web technologies, which interweave a piece of data with its meaning. That way, applications can make sense of (parts of) each other’s data, without having to agree upfront exactly what our data should look like. When storing data in our own pods, we need a mechanism to inform others when things have been created or modified—
By transforming data ownership and the role of applications in a decentralized ecosystem, Solid is able to disrupt many interactions that happen on the Web. Many processes that currently depend on centralization can be revolutionized in a decentralized way, by cutting out the middlemen that control these processes. This can stimulate innovation in areas that are embracing the current status quo and resisting change.
A first obvious target is social interaction between people. Sharing multimedia with friends, colleagues, and family members without privacy concerns becomes possible through Solid. Other examples include collaborating on various kinds of documents under transparent access control, and organizing meetings and events—
Moreover, Solid has the technological potential to disrupt entire industries, such as scholarly publishing. The current scholarly publication process assumes that an author uploads a scientific manuscript to a centralized platform, where a closed group of reviewers evaluates it. After acceptance, the manuscript is published as an article and then becomes accessible to the public, possibly for a fee. This process is rather slow, as the wider scientific community can only read the article at the end—
Re-decentralizing the Web along the lines of the Solid vision can help us tackle Tim Berners-Lee’s three challenges. We can take back control of our personal data by storing data in our own data pods. The spread of misinformation can be halted, because a free choice of applications allows us to influence our news feed—
Freedom of course always comes at a cost: what constitutes a victory for personal rights and freedom of speech also facilitates the spread of illegal messages, since decentralized networks make it harder to control what information is exchanged. Legality is a tricky matter, as some countries enact laws that prevent their citizens from voicing opinions that would be legal elsewhere. An intriguing case is the increased popularity of the decentralized social network Mastodon in Japan: as Twitter started removing images that were deemed questionable under US norms, Japanese users began publishing them on platforms with less censorship. We will have to accept this trade-off between freedom and control—
This brings us to another aspect of decentralization, which is the tension between freedom and universality. The Paradox of Freedom states that we can only be free if we subject ourselves to certain rules. Simply said, we can take our bike and ride anywhere—
Importantly, the arrows of decentralization and Solid are not aimed at specific companies such as Google, Facebook, or Twitter. Instead, they point at centralization in general, since many of the problems and challenges faced by these companies are inherent to centralization and the business model of data ownership. We have come to the point where companies possess so much data that they themselves are unable to predict the long-term effects that such a centralization might have. Therefore, it is unreasonable to use “informed consent” as an excuse, since no individual can reasonably understand what giving up control over small or large pieces of their data will eventually lead to. Storing our data in a trusted place of our choice, combined with a granular permission model, is therefore the only safe bet.
Note that none of us are dreaming of a Web without large players. Quite the contrary: Tim Berners-Lee insists that the Web should always remain scale-free, with room for the very large and the very small. The problem is that the very large are currently trying to make the rest obsolete, which endangers the online freedoms we have enjoyed for so many years. As argued above, decentralization is foremost about choice, so people should be free to join large or small communities. And while there are several technical issues ahead of us for decentralized applications, notably guaranteeing a user experience similar to that of centralized platforms in terms of usability and speed, the first technological proof has been delivered with Solid. Now, it is up to all of us to anchor this technological progress in today’s and tomorrow’s socio-economic reality in order to re-decentralize the Web for good. Only when we succeed in taking back control and choice over our most precious digital assets will we be able to truly say: this is for everyone.
- Derakhshan, H. (2015), “The Web We Have to Save”, 14 July, available at: https://medium.com/
- “Break down these walls”. (2008), The Economist, available at: https://www.economist.com/
- Pariser, E. (2011), The Filter Bubble, Penguin Books.
- Berners-Lee, T. (2017), “Three challenges for the Web, according to its inventor”, Web Foundation, 12 March, available at: https://webfoundation.org/2017/03/web-turns-28-letter/
- Rosenthal, D. (2018), “It Isn’t About The Technology”, 11 January, available at: https://blog.dshr.org/2018/01/it-isnt-about-technology.html
- Berners-Lee, T. (2018), “The Web is under threat. Join us and fight for it”, Web Foundation, 12 March, available at: https://webfoundation.org/2018/03/web-birthday-29/
- Berners-Lee, T. (2005), “Universality of the Web”, 23 March, available at: https://www.w3.org/2005/Talks/0323-yorkshire-tbl/slide5-2.html
- Berners-Lee, T., Cailliau, R., Groff, J.F. and Pollermann, B. (1992), “World-wide web: the information universe”, Electronic Networking, Vol. 2 No. 1.
- Gustafson, A. (2008), “Beyond DOCTYPE: Web Standards, Forward Compatibility, and IE8”, 21 January, available at: https://alistapart.com/
- Berjon, R. (2018), “Advertising’s War on Consent”, 19 March, available at: https://berjon.com/
- Berners-Lee, T. (2018), “From Utopia to Dystopia in 29 Short Years”, 18 May, available at: https://www.csail.mit.edu/
- Kosinski, M., Stillwell, D. and Graepel, T. (2013), “Private traits and attributes are predictable from digital records of human behavior”, Proceedings of the National Academy of Sciences, National Academy of Sciences, Vol. 110 No. 15, pp. 5802–5805.
- Samarajiva, R. (2014), “More Facebook users than Internet users in South East Asia?”, 30 August, available at: http://lirneasia.net/2014/08/more-facebook-users-than-internet-users-in-south-east-asia/
- “Solid”. (n.d.), available at: https://solid.mit.edu/
- Berners-Lee, T. (2006), “Linked Data”, 27 July, available at: https://www.w3.org/
- Berners-Lee, T. and O’Hara, K. (2013), “The read–write Linked Data Web”, Philosophical Transactions of the Royal Society A, Vol. 371 No. 1987.
- Berners-Lee, T., Hendler, J. and Lassila, O. (2001), “The Semantic Web”, Scientific American, Vol. 284 No. 5, pp. 34–43.
- Capadisli, S. and Guy, A. (Eds.). (2017), Linked Data Notifications, Recommendation, World Wide Web Consortium, available at: https://www.w3.org/
- Capadisli, S., Guy, A., Verborgh, R., Lange, C., Auer, S. and Berners-Lee, T. (2017), “Decentralised Authoring, Annotations and Notifications for a Read–Write Web with dokieli”, in Proceedings of the 17th International Conference on Web Engineering, pp. 469–481, available at: http://csarven.ca/
- Zuckerman, E. (2017), “Mastodon is big in Japan. The reason why is… uncomfortable”, 18 August, available at: http://www.ethanzuckerman.com/blog/2017/08/18/mastodon-is-big-in-japan-the-reason-why-is-uncomfortable/
- Barabási, A.-L. and Albert, R. (1999), “Emergence of Scaling in Random Networks”, Science, Vol. 286, pp. 509–512.