That's really unfortunate. And it's not just performance, it really messes around with OS-level URL handling protocols like Android intents (and possibly FB's app links and iOS's new Extensibility).
I recently found this happening with Twitter's Android app. The user sees a link to player.fm and thinks it will open the native Player FM app if they have it installed, since it's registered to handle that URL pattern. But instead, the OS offers web browsers and Twitter as ways to open the link, because it's not really a player.fm link as presented to the user, but a t.co link. If the user then chooses a browser, the browser immediately redirects to the correct URL, which then pulls up the intents menu again.
7 redirects could potentially be 7 popup menus for the user to navigate through.
The OS could pre-emptively follow redirects, but that would of course introduce considerable latency since normally the menu is presented without any call being made at all. Maybe the best solution for OSs is to present the menu immediately but still make the call in the background, so the menu could be updated if a redirect happens.
"I don't see any work happening in HTTP 2.0 to change it."
Probably the best HTML standard for dealing with it is the "ping" attribute, which lets servers be notified of a click without an actual redirect. However, that's HTML and not HTTP, and these days apps are more common HTTP clients than browsers, and apps rarely bother to implement things like that.
So there are probably things that could be done with the standard. Perhaps using some distributed lookup table to ensure at most 1 redirect (by caching the redirect sequence and returning it with the first request). That does ignore any personalisation that goes on, but generally these should be permanent redirects without personalisation anyway.
> I recently found this happening with Twitter's Android app. The user sees a link to player.fm and thinks it will open the native Player FM app if they have it installed, since it's registered to handle that URL pattern. But instead, the OS offers web browsers and Twitter as ways to open the link, because it's not really a player.fm link as presented to the user, but a t.co link. If the user then chooses a browser, the browser immediately redirects to the correct URL, which then pulls up the intents menu again.
You don't even have to go that far. Just click on a youtube link. First it'll ask if you want the www.youtube.com url to play in a browser or the app (which sucks), then it'll redirect to m.youtube.com and ask you again.
The only reason I haven't set it as my permanent choice is that I still hold out some shred of hope that the youtube app will some day be able to play an entire video without stopping for 2 seconds every 3.
If user convenience requires trashing server analytics, then server analytics should be trashed.
Possibly even deliberately - enabling that by default, in an automated way, in commonly used products, so that this tracking becomes ineffective and useless and there's no more motivation to insert these artificial layers of redirection.
That will break just about every affiliate program that I'm aware of. Of course this is your intention but there will be a very large number of websites that will see their turnover plummet if that should happen.
I personally would not mind but I'm pretty sure that a lot of monied interests would not like to see this happen.
This would result in marketers returning 200 responses to set the cookie and render a page with javascript that sets `window.location` instead, which would be even slower.
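To make that concrete, here's a made-up sketch of that pattern (the /log-click endpoint and destination URL are invented for illustration). The browser has to download and execute the page before it even learns where it's going, which is exactly why it's slower than a plain 301:

    // Hypothetical tracking page served with a 200 instead of a 301:
    // log the click first, then navigate with script.
    fetch("/log-click?id=abc123", { method: "POST" })
      .finally(() => {
        window.location.href = "https://example.com/real-destination";
      });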
People want analytics information, and right now adding things like this is the only way to get it. (Not strictly true, but ease of deployment, etc.)
Unless you shut down the only way to do something, the people who want this service will work around whatever restrictions have been put in place, and the solution we get will probably be even uglier.
The real way to get rid of this is to provide a mechanism that addresses everyone's desires.
the problem is that these analytics are only beneficial to the owners or advertisers, which means if there's a way for users to turn off the supposed capability, then the advertisers would find a different way to do it, which results in much the same thing as today.
it's a social problem, not a technological one imho.
How would that affect the single sign-on case? AFAIK it's common practice to issue redirects to people who are not authenticated to send them to the identity provider's (IDP's) login page. This would make it hard for an IDP to determine if the user already has an active session with them.
Click-count statistics, time clicked, and geo information can all be collected without any cookies. Some sites use URL shorteners just to see clickthrough statistics, which never require cookies in the first place.
I think the most practical solution to this, requiring only a change in practice and not in the standard, would be for link shorteners to start doing HEAD requests on the URLs they shorten and, if a URL results in a permanent redirect, unwrap it so the shortened link is canonically correct.
Yeah, there are things that might have some problems with this, but they're things that are probably somewhat abusive to the 301 status code to begin with.
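A rough sketch of that unwrapping step, assuming a runtime with built-in fetch (Node 18+); the function name is just illustrative:

    // Follow only permanent redirects (301/308) via HEAD requests, capped
    // at a few hops, and store whatever URL we end up with.
    async function resolvePermanent(url: string, maxHops = 5): Promise<string> {
      for (let i = 0; i < maxHops; i++) {
        const res = await fetch(url, { method: "HEAD", redirect: "manual" });
        const location = res.headers.get("location");
        if ((res.status !== 301 && res.status !== 308) || !location) break;
        url = new URL(location, url).toString(); // Location may be relative
      }
      return url;
    }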
> Redirects are being abused and I don't see any work happening in HTTP 2.0 to change it.
I agree that this is an unfortunate pattern, but what exactly could the HTTP spec do to change it? The only thing I can think of is limiting the number of chained redirects, although I don't see browsers implementing that if longer chains are even remotely common.
Why do we need a technical solution? My understanding is that the author is arguing for a change in how URL shorteners are used, not for a technical change making this impossible. The problem is that once a technology exists, it will be abused. Sometimes this abuse is just a clever and useful hack, and sometimes it is annoying and anti-usable.
If I remember correctly, there was an old (very old) project with reversible links called Project Xanadu.
if my sketchy memory serves me, it was based around using a currency and updatable links. Along with that, the idea was that you could also share segments of movies and music with the hyperlink system.
I'm pretty sure it died a pitiful death due to it being completely secret until after HTTP got ingrained.
I looked at it before, and it was, indeed, neat. But, I'm not sure that a project that spent 30 years in development before an initial release can really be said to have "died".
I think the other issue is that these aren't being used as URL shorteners any more (in the sense they were when they were used for Twitter's 140 character limit). They are tracking URLs, gathering data about you at each hop.
If the HTTP spec added two new verbs (SHORT, LONG) as a way of shortening and expanding URLs, then many things could be done (a rough sketch of such an exchange follows the list below).
1.) The browser could proactively lengthen the URL, and the same way the server can respond with a 301/302 now, the browser could cache this.
2.) The server could hand back the final long URL without needing to redirect through multiple hops.
3.) We could create services, integrated into the server software, that bring in third parties.
4.) Each domain could create its own shortened-URL domain and mask it in a better way.
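Purely to illustrate the idea - neither the verb nor the header below exists in any HTTP spec, and the domain is made up:

    // Hypothetical only: ask the shortener itself to expand the link in one
    // round trip. "LONG" and "X-Long-URL" are invented names, not real HTTP.
    const res = await fetch("https://sho.rt/abc123", { method: "LONG" });
    const longUrl = res.headers.get("X-Long-URL"); // final destination, if supported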
1) The browser doesn't know the long URL so how can it proactively lengthen it?
2) The server might not know the long URL, since all t.co knows about is the slate.me URL, which only knows about the slate.tribal URL, which only knows the goog.le URL, and so on. So this would not be possible unless it was only one hop.
3) I am assuming the services you want to integrate into the server software would resolve the shortened URL into a long one (or vice versa), but if there are multiple redirects, those services would still face the same redirect latency.
> The server might not know the long URL since all that t.co knows about is the slate.me URL
The server at t.co could send a request (HEAD works) to slate.me and follow any redirects it gets to resolve the final URL. (This could be done either by following until there are no more redirects, or by only sending requests to known URL shorteners -- there are advantages and disadvantages to both) -- and you don't need any new HTTP verbs to do it.
That assumes that every user gets the same "long" URL for a particular "short" URL (and that every 30x corresponds to a short-to-long redirect). It falls down where a URL depends on geolocation or time sensitivity.
The URL should be under the full control of the domain.
1.) The browser could offer the ability to (right-click and) shorten or lengthen a URL. An HTTP standard would provide this mechanism.
3.) This would not require multiple redirects, because everyone should ask the domain. If the URL is already shortened then there is no need to shorten it again.
- Services like bit.ly and goo.gl can still provide services to: 1.) actually shorten, statistics...
My guess would be for analytics, so it knows how its own service is being used and who is accessing websites through it. It comes with a convenient feature that bad URLs can be taken down on its site.
This is classic "tragedy of the commons" behavior, where each individual group with a link shortener benefits from encouraging and enforcing its usage (ability to kill malicious links easily, user tracking, etc.).
I'm not sure if this can be resolved until users are educated sufficiently on the long-term adverse effects of link shortening services (link rot, privacy concerns, slow/broken redirects, etc).
For change to happen, the demand for direct links (generated explicitly by things like this blog post, or implicitly by higher bounce rates due to long loading times) will need to be enough to outweigh the benefits to the organizations that are building them.
Edit:
Even if there is evidence that shows this, why should _I_ be the one to give up my link shortener service when doing so would make no significant difference to the overall problem, which involves tens or hundreds of these services?
This is propagated by people not really understanding URLs and blindly reposting links that have already been wrapped in a URL shortener through services that wrap them in another one. Whenever I repost links, I repost only the URL of the final page, stripping off anything unnecessary. Sadly, the trend of browsers hiding URLs or pieces of them is not helping the situation either.
I don't think this can be solved technologically - HTTP redirects are not difficult to detect, but a lot of these shorteners (and increasingly more of them) use JavaScript and/or meta tags to accomplish redirection. The solution is better-educated users who don't create chains of shortened URLs.
Could a URL wrapper service follow a URL through its redirects and only wrap the final address?
I'm not a networking expert, but it seems viable enough to me. Shoot out a GET request, wrap the final address with your shortener. Cut out the middlemen.
It's an idea. It might fail at scale. And might not be feasible.
Downside is that if I use a URL shortening service that allows me to change where you are headed after the fact (for example, after 1000 hits, go to a new page instead), then you've just broken that functionality.
That's no longer a URL shortening service, that's a campaign redirection service. Just serve the appropriate content, rather than relying on redirects.
A lot of shorteners don't just use regular HTTP redirects; they use JavaScript or meta tags. To find the actual destination, the server would at the very least have to detect and parse HTML, and maybe even execute JavaScript.
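For example, something like this would catch the meta-tag case, but only that case (the short URL and regex are illustrative, and attribute order varies in the wild):

    // Rough sketch: fetch the page body and look for a meta-refresh target.
    // JS-driven redirects would still need a headless browser to resolve.
    const shortUrl = "https://sho.rt/abc123"; // hypothetical shortened link
    const html = await (await fetch(shortUrl)).text();
    const match = html.match(
      /<meta[^>]+http-equiv=["']?refresh["']?[^>]*content=["'][^"']*url=([^"'>\s]+)/i
    );
    const target = match ? match[1] : null; // null if it isn't a meta-tag redirect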
I always figured trib.al and bit.ly and their ilk offered different analytics or whatever and that that's why some URLs would bounce through both. I see this especially in major journalism outlets.
The user experience on mobile with multiple URL-shortener redirects is beyond annoying. Every new HTTP connection opened over a marginal cell or wifi connection can stall or fail, even when the actual destination site is up and reachable.
I'm no SEO guru, but isn't the recommended behavior to create a URL that matches the title of the blog post? I've seen these "post title" URLs with increasing frequency over the past few years.
Yeah, at least at one time, and likely still the case.
I never liked this SEO "feature", however. My thinking is why should search engines really care about the URL WRT content? Seems like a shortcoming of the engines, as well as a potential technique for gaming the search engines. In fact, it seems that if search engines could determine that URLs were being used to game them, then they wouldn't need sites to bother with this behavior in the first place. OTOH, if search engines cannot tell they are being gamed with URLs, then it's also completely useless.
Yeah, but you're supposed to put underscores or hyphens to separate words. The bots aren't smart enough to un-concatenate words as far as I understand.
> Every redirect is a one more point of failure, one more domain that can rot, one more server that can go down, one more layer between me and the content.
These are all good reasons, but are there any real users who are actually being affected by these issues? If it is just a theoretical concern, then I don't think it is reasonable to call the situation "officially out of control".
Seven redirects to different domains means seven new TCP connections being established, very likely over a crappy mobile connection (see twitter usage numbers from mobile). The user experience is definitely being harmed here.
I've lived in the Philippines for a while, and the big telecom here, PLDT, has terrible DNS. t.co links are the most obvious point of contention: they just won't resolve 90% of the time. It's incredibly obnoxious, especially on a mobile device where DNS settings aren't (easily) exposed.
A little off topic, but I seem to recall seeing, probably some years ago, a post on HN about someone's reversible URL shortening algorithm that could convert from the shortened URL back to the original. Can't find it now - anyone recall this, or did I dream it?
I get the link-rot concern (and 7 redirects, as showcased in the FA, is absurd), but these services are mostly used on twitter and social media, where the life-span of a post sharing a link is hours to a day or so, at most.
I couldn't find anything that would output something similar to the redirects image shown in this post, so I wrote a small script in node to do that. It looks like this: http://cl.ly/image/3T3e462G1C3d
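Not that script, but the same idea fits in a few lines; here's a rough sketch assuming Node 18+ with built-in fetch and a made-up short link:

    // Print each hop of a redirect chain: status code, then the URL.
    async function printChain(url: string, maxHops = 10): Promise<void> {
      for (let hop = 0; hop < maxHops; hop++) {
        const res = await fetch(url, { method: "HEAD", redirect: "manual" });
        console.log(`${res.status}  ${url}`);
        const location = res.headers.get("location");
        if (res.status < 300 || res.status >= 400 || !location) return;
        url = new URL(location, url).toString();
      }
    }

    printChain("https://sho.rt/abc123"); // hypothetical short link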
> Because you are being served content from other people's servers. If one of those servers fails, the whole site you are trying to load fails.
Say what?
Many of those things are just tracking pixels. There is no way to serve them from the site host, that's the whole point. And if they fail to load, nbd.
And sure, if some substantial piece of JS or CSS doesn't load, that could cause problems on the page, but in most cases it wouldn't cause total failure of the site loading.
Public CDNs for shared assets make a lot of sense.
These things are nothing like hotlinking somebody's image.
CDNs are supposed to decrease load times but in my experience it is the opposite... sometimes I wonder if CDNs exist only for data collection purposes.
They are all of the Photobucket web site's third-party javascript dependencies, along with a huge list of external js loaded by just one of those dependencies.
I think URL un-shortening should be done in the browser, on URLs that were shortened according to a standard hashing method, so your browser can tell you where the URL will go.
>I think URL un-shortening should be done in the browser, on URLs that were shortened according to a standard hashing method
Hashing, by definition, is non-reversible. So if you use a hashing method as your shortening system, you'll have a fingerprint from which the actual destination is unrecoverable.
> Shortening sevices are ridiculous and dangerous.
Right, and there is no reason to use them in any situation where the target is a browser, anyway. It makes sense to use a shortening service if you are going to be sending a URL in an SMS message, but if it's going to a browser, there's no reason not to use the full URL. So, really, a server-based system that sends messages both as SMS and to regular browser users should support a shortener and use it only when sending SMS.
You're asking an 8-10 character string to decompress to an arbitrary-length string, and for the client to be able to do the decompression.
I think if you think about this for a few seconds, you'd find that if you were able to create such a solution, you'd have broken some laws of information theory.
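Put differently, a shortener isn't compressing anything: the short code is just a key into the service's database, so there's nothing for the client to reverse. A toy sketch (domain and names made up):

    // The short code is just a lookup key; the long URL only exists
    // server-side, so no client-side math can recover it from the code alone.
    const table = new Map<string, string>();

    function shorten(longUrl: string): string {
      const code = Math.random().toString(36).slice(2, 9); // e.g. "k3f9x2q"
      table.set(code, longUrl);
      return `https://sho.rt/${code}`; // hypothetical shortener domain
    }

    function expand(code: string): string | undefined {
      return table.get(code); // impossible without access to the table
    }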