That's a good question which deserves an honest answer, but this comment box is really too small for that and essay-sized comments are frowned upon.
For starters: navigating the web in the beginning consisted of clicking links which caused you to go from one website to another. This all worked well when (a) the web was small and (b) there were hardly any trash pages.
Search engines changed that, and once they got 'good enough' the link graph became a mere starting point for crawling the web rather than the way we navigated from site to site. For a little while the link graph was used as a popularity measurement but this too changed (because of the huge number of low value links).
Then we got silos. A 'silo' is a bunch of data locked up under a trade between users and large web properties. The trade is 'you give us your content and a bunch of information about yourself and we'll use that content to attract others and to sell ads'.
Examples of such silos are Google, Yahoo and Facebook.
Finally, if the web (and the internet itself) was originally strung together by a peer-to-peer approach, it turned more and more into a division between producers and consumers, with the producers on the 'server' side and the consumers on the 'client' side.
Mobile devices accessing the net further accelerated this trend. Right now the only internet (not web) applications that are still peer-to-peer are torrent applications. For the most part the division on the web is complete, and hosting a web server on your very powerful cable modem or DSL line would be grounds for termination of your access.
Servers are hosted centrally and are operated by companies whereas clients are simply terminals that access the content stored on those servers.
I hope that answers your question in enough detail, you could easily write a book about this.
The internet is, by its nature, peer to peer and decentralized. Cut a cable, or take out a large network, and the internet will route around it, either quickly (routes converging on a new peer) or slowly (a poorly connected network finding a new upstream to purchase connectivity through). That companies then build on top of this and implement services where they are in the middle of both connections does not change this fundamentally, it just adds an optional layer. To assume our connections have upstream bandwidth that is never or rarely used is false. I would argue that we generate more content per-person than ever in history. The sheer amount of pictures, videos, webcams, posts and comments is much higher than ever before. Are they hosting it directly from their connections? Usually not, but that's as much a case of being efficient and reaching an audience as it is of companies wanting control over the data. Even then, there are services which are decentralized from that, such as email. It's not efficient to host content yourself. Even the large networks use dedicated CDNs. For the end user, Facebook is a CDN.
That said, I agree there is a clear move towards our data and services being handled by fewer, and larger entities, such as Google, Yahoo, Microsoft, Apple, Amazon. But they aren't a single entity, and I don't consider that centralized. Any one of those providers could implode today, and very little of their services could not be picked up by some competitor easily. I don't consider that centralized.
And there are many of them, some owned by companies that use them exclusively, some conglomerations of many different providers but owned by yet another party. How is this centralization? I still think you're just arguing that we've compartmentalized certain services to sets of companies, for the most part, but even that isn't centralization, because there are multiple distinct companies using multiple distinct networks and in many cases they are presenting multiple distinct capabilities. Not having something handled at the end point does not mean it's centralized, there's a very large middle ground here, and that's where we are currently at. I'm not sure I see any evidence that we are moving away from that towards actual centralization.
> When I received mail in '95 or so the machine receiving it was the workstation I wrote the reply on.
And many people that used POP3 continued to do so well into the 2000's. It's silly to run a mail server on your workstation. I know, I did it for years myself. You run into all sorts of stupid problems related to your workstation not being always on, badly configured backup MX servers, and other issues. We don't do it anymore not because we were forced out of it (you can still do it now), but because there are solutions that are better for most use cases, and we opt for those.
We don't all wash our own cars, or do our own plumbing, or even clean our own houses. Some people do, some people pay others to do that work. The fact they pay others doesn't mean we've moved towards centralizing those services. There isn't some national bureau of plumbing that is our only recourse when the toilet is clogged and we don't want to fix it ourselves.
Ok. So you say we're not trending towards a more centralized internet because you discard all proof that that is exactly what is happening. That's fine with me but it really doesn't help to move the discussion forward.
The reasons why we are moving to a more centralized internet are what is interesting, such as - you rightly identified those - that stuff isn't always powered up and that keeping a mailserver up and running is work and so on.
But none of that changes that centralization is happening.
Multiple distinct companies != peer-to-peer internet. That's what a decentralized network infrastructure used to mean, where the 'peers' were equals.
Nowadays it means clients in one camp and servers in another, and large scale consolidation of those servers in the datawarehouses of a relatively low number of companies serving up the bulk of the data. If that trend continues it's not a bad or a good thing per-se but it would be good to stop and think about how desirable that is.
So from that point of view a lot of centralization has already happened.
Everybody running their own mailserver: could be a good thing, presuming they can be made easy to set up and easy to maintain (I don't see any technical reason why not). Ditto webhosting, why should facebook host all your content (or google, or Yahoo).
In the end, convenience won over 'peer-to-peer', there are many reasons besides convenience (firewalls, for one) but the results are here and we'll have to live with it (except for a couple of die-hard hold-outs).
What I've tried to make clear, and either failed in or you disagree with this as well, is that I don't think saying we are "centralizing" or moving towards a "centralized" internet is correct, largely because that implies we are approaching, or even still moving towards, the end-point of that spectrum, which is centralization, and that implies a single authority.
I think it is correct to say we are, or at least were, decentralizing, to a degree. I think it's correct to say that we are not fully decentralized, which we were close to initially, but I don't think it's entirely constructive to say we are moving in a direction that leads to a centralized internet, and what that implies (a single authority, even if for a single service). I think we are moving towards, or have arrived at, what we see in many markets. Large dominant players that the majority use, but with a large market of smaller players that provide for the niche needs. Take the automotive industry, for example.
I think we are largely arguing over semantics, which is something I don't want to do, but at the same time it's hard to be sure I'm not just reducing your arguments to the point there's no difference and ignoring important points at the same time.
> But none of that changes that centralization is happening.
I think it's cyclical, and there will be periods where we move along the spectrum back and forth, but I doubt we'll get as close to the decentralized end as we started at, for many reasons. I don't think we'll get all that close to the centralized end either though.
My argument has not been "we are decentralized", it's been "we are not centralized". To that effect, peer-to-peer is irrelevant to my argument, and I've tried to make that clear.
> Everybody running their own mailserver: could be a good thing, presuming they can be made easy to set up and easy to maintain (I don't see any technical reason why not). Ditto webhosting, why should facebook host all your content (or google, or Yahoo).
Because it's very, very inefficient. There are upsides to centralization (e.g. discoverability), just as there are downsides (e.g. homogeneity). I think the sweet spot that maximizes the upsides and minimizes the downsides is somewhere between decentralization and centralization.
I think the accurate statement of your opinion is not "the web is centralized" but rather, "Zipf's law sucks."
In decentralized networks there end up being accumulation points, and Zipf's law (which shows up in piles of different contexts, originally noticed in rank of words used in languages) gives a pretty good idea of how that accumulation plays out in basically an L-shaped curve. Point being that it might have a lot more to do with the structure of human networks and attention than with choice of wire protocols...
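To make the L-shaped curve concrete, here is a minimal sketch assuming an idealized Zipf distribution with exponent 1. The function name `zipf_shares` and the figure of 1000 sites are made up for illustration; this is not measured traffic data.

```python
def zipf_shares(n, s=1.0):
    """Share of total attention for ranks 1..n under an idealized
    Zipf distribution (share proportional to 1 / rank**s)."""
    weights = [1.0 / (rank ** s) for rank in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical population of 1000 sites, ranked by popularity.
shares = zipf_shares(1000)
top10 = sum(shares[:10])          # combined share of the ten largest sites
bottom_half = sum(shares[500:])   # combined share of the 500 smallest sites

print(f"top 10 sites:    {top10:.1%}")
print(f"bottom 500 sites: {bottom_half:.1%}")
```

Under these assumptions a handful of sites at the head take a large slice of the total while the long tail splits the rest, without anyone having designed it that way.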
The nature of http and websites makes the web centralized: there's always a server, users don't really serve data, it's always stored somewhere.
It's true that it's decentralized, that it's easy to create websites, but in nature, if you shut down dns servers, you shut down 99% of the internet, which includes HTML websites.
And I think that a decentralized web might be more easy to index (proof of work system, etc).
That's not centralized, it's just less decentralized. Centralized and decentralized are on opposite ends of the spectrum. It's possible to be less decentralized and still be very far from centralized. There are many, many different entities providing all sorts of services, so I'm not sure how that portion can be seen as centralized at all. DNS, as you note, is probably the most centralized single point that everything relies on, but they simply have authority because we give them authority. If DNS server administrators decided to use different root servers, there's not a lot they can do about that. But I'll concede that authoritative DNS is fairly centralized, given it requires checking with a single authority, but even then, many entities (TLDs) have a say in what that authority says (but not the ultimate say).
Well you're right, in nature and architecture the internet is decentralized, but the use most users make of it is centralized.
If you look at what internet.org attempted to do, that's actually how the internet is used most of the time. For consumers and most small businesses, internet is centralized. Technically, most of the internet is just http requests, meaning that there will always be this duality of servers and clients. Without web servers and their admins, there is nothing, and that's a form of control in my opinion: you can easily shut down a website.
I still don't see that. A centralized internet, or even a centralized "web" as has been distinctly defined elsewhere here, implies a single authority. That doesn't exist, and I don't see it existing in the future. Which email provider do you want to use? Pick from hundreds. Which social network do you want to use? Pick from the tens of candidates. Which blog platform do you want to use? Pick from hundreds again.
> Without web servers and their admins, there is nothing, and that's a form of control in my opinion: you can easily shut down a website.
There are webservers, and admins. That hasn't changed. There's been a shift to larger sites, but there's still plenty of small ones. You still have the option to put your site at many different locations, or use a platform such as Facebook, Blogger or Wordpress.
Look, here google is trying to solve the problem of government surveillance and security. Web servers are a very weak point because you can shut them down if you have the law on your side, and recently the law has been abusive. And even if you can change your DNS, the root servers are still an important part of the internet, and they're subject to control and legal issues. Control and authority make those aspects of the internet centralized. This applies to your hundreds of mail and web providers, which are not free by the way (datacenters). Decentralized technologies are entirely free.
What I'm talking about is protocols that make services impossible to shut down, like bittorrent or bitcoin. That's what I mean by a decentralized internet. Those technologies are different and were made especially with the goal of avoiding control, and they are exactly the solutions to breaches of privacy. Here every computer is equal, and that's a true decentralized internet, in terms of hardware AND software. What I was talking about is generalizing bitcoin and bittorrent to messaging or even hosting databases.
Such software would run on many domestic computers that want to use it and host chunks of data in a redundant manner. The issue is authenticity and signing of data. But other than that, that's where the future is.
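As a rough illustration of hosting chunks redundantly and checking what comes back, here is a minimal sketch. Everything in it is hypothetical (the `Peer` class, `store_redundantly`, `fetch`, and the choice of SHA-256 as the chunk identifier); it is not how bittorrent or bitcoin are implemented, just the general idea of addressing data by its hash.

```python
import hashlib

def chunk_id(data: bytes) -> str:
    """Content address: the SHA-256 digest of the chunk itself."""
    return hashlib.sha256(data).hexdigest()

class Peer:
    """Hypothetical home computer that stores chunks keyed by their hash."""
    def __init__(self):
        self.chunks = {}

    def put(self, data: bytes) -> str:
        cid = chunk_id(data)
        self.chunks[cid] = data
        return cid

    def get(self, cid: str):
        return self.chunks.get(cid)

def store_redundantly(peers, data, copies=2):
    """Place the chunk on the first `copies` peers (assumed placement policy)."""
    for peer in peers[:copies]:
        peer.put(data)
    return chunk_id(data)

def fetch(peers, cid):
    """Ask each peer in turn; accept only data whose hash matches the id."""
    for peer in peers:
        data = peer.get(cid)
        if data is not None and chunk_id(data) == cid:
            return data
    return None

peers = [Peer(), Peer(), Peer()]
cid = store_redundantly(peers, b"hello, decentralized web")
peers[0].chunks[cid] = b"tampered"   # simulate a misbehaving peer
result = fetch(peers, cid)           # skips the bad copy, uses the good one
```

Note that the hash only shows the chunk was not altered. It says nothing about who published it, which is where the signing problem mentioned above comes in.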
I'm sorry but I can't trust the html/http web one bit. HTML and javascript are awful technologies which are slow to parse, building web browsers has been a race that resulted in no interesting progress, and web 2.0 has been a joke. All those techs have been the base google has been making its money on, which also makes it easy to mine, so to me centralization is a privacy issue.
No, my browser doesn't. Google's public DNS has little bearing on how I reach sites, unless I've specifically configured it that way. Either you really don't understand how DNS works, or you are simplifying to the point of just plain being wrong.
You could argue that the root servers are too centralized, and that their control constitutes centralized DNS control, but since the only reason they have control is that all the different DNS servers use them as authorities, an argument could also be made that their control is more by convention than anything else, and all it would take is a competitor to ICANN that added some value, and eventually we could have multiple authorities. Whether that would be beneficial or detrimental is another discussion.
As much as people like to bandy that term about, I don't think of (less than) 68% of all searches as a monopoly. Two out of three people is a lot, but it's not nearly enough to force some sort of information control (whether that information is results, or other people exclaiming how much better their search engine is working).
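A toy sketch of the "authority by convention" point: the resolver below walks a delegation tree starting from whatever root it was configured with. The nested-dict zone layout, the `resolve` function and the `demo` TLD are all invented for illustration, and the addresses come from the reserved documentation ranges. Real DNS resolution involves much more (caching, referrals, DNSSEC).

```python
def resolve(name, root):
    """Walk a toy delegation tree from `root` down to an address.
    `root` is simply whatever the resolver operator chose to trust."""
    zone = root
    for label in reversed(name.rstrip(".").split(".")):
        if not isinstance(zone, dict) or label not in zone:
            return None          # nobody in this tree is delegated that name
        zone = zone[label]
    return zone if isinstance(zone, str) else None

# The root most resolvers point at by convention (hypothetical contents).
conventional_root = {"com": {"example": "192.0.2.10"}}

# A hypothetical competing root that mirrors the first and adds a TLD.
alternative_root = {
    "com": {"example": "192.0.2.10"},
    "demo": {"example": "198.51.100.7"},
}

same_everywhere = resolve("example.com", conventional_root)
only_in_alt = resolve("example.demo", alternative_root)
unknown = resolve("example.demo", conventional_root)
```

The code path is identical in both cases. The only thing that makes one tree "the" authority is which root the operator passes in.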
Through a complex interrelationship of distinctly controlled networks that advertise routes and addresses and allow traffic based on complex business relationships (peering). The only case where that's not happening is where we both have the same ISP, and ycombinator happens to be hosted there as well. Running a traceroute from myself to news.ycombinator.com, I count two distinct networks not including my local one, and not including Cloudflare. If those networks stopped talking to each other, my packets to hacker news would find another route, assuming my first hop had access to other networks (given time for the networks to determine a new route and my first hop had access to other peers).
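The "find another route" behaviour can be sketched on a toy topology. This is only an assumption-laden illustration: the network names are invented, and a plain breadth-first search stands in for real inter-network routing, which is driven by BGP and business policy rather than shortest hop count.

```python
from collections import deque

def find_route(links, src, dst):
    """Breadth-first search for a shortest hop path; None if unreachable."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in sorted(links.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def drop_link(links, a, b):
    """Copy of the topology with the a<->b link removed."""
    return {n: {m for m in nbrs if {n, m} != {a, b}}
            for n, nbrs in links.items()}

# Hypothetical topology: a home connection, its ISP, three transit
# networks and the network hosting the destination.
links = {
    "home": {"isp"},
    "isp": {"home", "net_a", "net_b"},
    "net_a": {"isp", "host"},
    "net_b": {"isp", "net_c"},
    "net_c": {"net_b", "host"},
    "host": {"net_a", "net_c"},
}

before = find_route(links, "home", "host")
after = find_route(drop_link(links, "net_a", "host"), "home", "host")
cut_off = find_route(drop_link(links, "home", "isp"), "home", "host")
```

When the `net_a` to `host` link goes away, traffic still arrives via `net_b` and `net_c`. The last case shows the caveat from the comment: with a single first hop and no other peers, there is nothing to route around with.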
We're talking about the web as an application layer protocol. By your definition everything that happens on the internet is decentralised. That's not untrue if you look at it from the point of view of TCP/IP, but that's tangential to the conversation we're having.
You seem to be conflating the web with the internet.
But even by that definition, the web isn't a single application, it's many applications, some of them compartmentalized (search, social), some of them not (email), and some in between (websites/blogs). If an application were centralized, I would expect a single provider you had to use, but instead, where it is at least compartmentalized, you have a group of providers. Can you name a single service/application that you expect more than 5% of people use that has only a single provider? For search, you have Google, Yahoo, Bing, and other smaller players. Google is dominant here, but still has less than 68% of the market. For social, Facebook is the dominant player, but you yourself used a different social network to communicate on this subject, and there are many other providers with popularity that ebbs and flows. It's the same with anything I can think of. I'm not sure how this is considered centralized under any definition.
Yep, the web is a distributed system. Yep, the web offers many services, and many providers offer the same class of service.
However, each and every one of those services are centralised in a technical sense on account of HTTP. Why might an alternative be useful? Consider the solution the Google service we're addressing is putting forward cf. Content Addressable Networking systems[0]. I can't spend any more time explaining, sorry. This might help- note the levels of centralisation in each generation of P2P systems:
So, I think I'm starting to understand your argument, which is that the web is composed of many services, each of which is implemented relying on an underlying centralized authority, and you want that to change? If that's the case, then I understand the need, and agree with that point of view. But I think to say "the web" or "the internet" is centralized is a very big stretch. I wouldn't call a bunch of decentralized services with little shared infrastructure and ownership "centralized".
I'm definitely not saying the internet is centralised! Perish the thought. I never mentioned it- the discussion was to do with the web specifically.
Forget the web as a whole and consider a single service such as HN. That graph has |clients| >> |servers|. More than the cardinality, the client and server nodes are different in kind.
I consider a decentralised architecture to be one where the nodes can in principle participate equally.
You are arguing that the web is decentralised because there are many services to choose from. I don't disagree, but that's above the application layer protocol- which is what I thought we were discussing. In that case decentralisation happens above the application layer. So in humans? By that definition BBS's were decentralised because I could call a different one.
In other words, yes the web is decentralised because I can choose from many Forex APIs. But at the logical application layer of HTTP, OANDA is a centralised service. HTTP addresses point to specific nodes which may or may not be individual servers at the network layer, but from the point of view of HTTP that's what you address. In a decentralised application layer protocol I would expect that not to be the case.
That Google is proposing this service is proof that individual web services are centralised. There's a single point of failure.
We're talking at different layers. It's just semantics from here on in.
No, I'm not arguing the Web is decentralized, at least not as you are using the term. I'm arguing it's not centralized. That's an important distinction, which I tried to cover in a response in a different thread[1]. We wouldn't be having this conversation if you had said the web needs to be more decentralized, but you stated the web is centralized. not(decentralized) != centralized. This problem was then compounded by our discussion about services, where you are referring to services as individual protocol definitions, and I'm referring to them as implemented in the wild. While a protocol definition may call for it to be implemented in a centralized manner (an n-to-1 client-server relationship across direct communication), I'm referring to the ecosystem which provides many, many instances of this, which adds a layer of redundancy and decentralization to the service as it exists in reality. That's not as good as a well defined decentralized protocol definition, but it is a manner of decentralization. So again I think we were arguing points that are, for the most part, correct, but using confounding terms.
I think you would have communicated your intent better if you said the web is not decentralized enough. I've been arguing the web is not centralized, you've been arguing the web is not decentralized (but by saying the web is centralized), and the problem is that both are true. The current situation is in-between those two extremes. Arguing that the web is centralized, when it isn't unless you define your scope to be so narrow as to not really encompass what most people think of when you say "web" is counter productive, when your point is a good one, and whether the web is "centralized" is irrelevant. What matters is whether there are benefits to being less/more centralized (or more/less decentralized) from the current state.
Edit: As a suggestion for how to refine your original statements so they are more accessible and understandable to those reading them, I suggest changing "the web is centralized" to "the protocols the web relies on require single centralized authority". It's more verbose, but it doesn't require cognitive leaps in just one of multiple possible directions to get what you are trying to express.
To take Google as an example: 92% market share in Europe in 2014 [1], 81% of the global market for smartphones (Android) [2] - 96% if you also add the single relevant competitor iOS. None of this is technically centralisation. (And won't ever be, as you could always "decentralize" the web by running your own personal search engine on your home box. As long as someone is using it, google doesn't have 100% market share.) However, it doesn't make much of a difference when you want to develop an app that doesn't get accepted into the iOS or Android app store.
But all of this is obviously beside the point that the OP made. Even if you don't want to develop a search engine or a phone app, you still have to tie your users to a central "cloud" service and web site so you can get discovered by google. That's a huge disincentive for p2p services.
That's a great argument for how dominant Google is in the smartphone OS category, but that doesn't really say anything for whether the web is centralized. Even with 100% market penetration, there are people that opt to not use Google's included apps (such as Facebook and their messenger app).