Chinese Number Websites: The Secret Meaning of URLs (2014) (newrepublic.com)
181 points by lelf on April 25, 2019 | 60 comments


Another reason, I think, is that search engines in China have been terrible since the beginning. For example, even if your search query is as specific as "ACME Inc. of AAA city in BBB province", it would be your lucky day if the first two result pages included that company's website. What you mostly get are paid ads and shitty SEO stuff. It's been like this to this day.

So, not being able to rely on search, people have to memorize companies' URLs, type them into the address bar, and pass the actual URL string along to others. Internet companies are forced to display them prominently in ads. So a domain name that is cute or easy (often costing a ridiculous amount) is all the more important in China than elsewhere.

Google please come back, it will be a net gain for humanity.


> Google please come back, it will be a net gain for humanity.

I'd rather they wait until they are no longer required to censor their results and track their users on behalf of the state security apparatus.


Most people would rather want this because you don’t have to bear the consequences. To simulate the experience, I suggest you start using a censored Bing exclusively while dialing down its search relevancy 30-40% for a start. Make sure you also realize the political reality that your rulers would rather go the way of North Korea than give up censorship.


Bing is OK. It's no Google, but it's much better than, say, Baidu.


Yeah, my default search engine is Bing. But sometimes you get the feeling it probably has less stuff in its database than what Baidu has crawled and collected. Out of habit I rarely use 360 or Sogou, so I don't know how they fare compared to Baidu or Bing.


Yeah, I've found Bing to be the best option when I'm in the mainland, only because all the local search engines are even more terrible at English.

When I need to search in Chinese, Baidu is the least terrible (but still really bad). Sogou isn't too far behind, but its index is not as fresh, in my experience.


That’s interesting. In Japan they’ve gone the complete opposite route:

Instead of telling you the web address they tell you which keyword to type into search, sometimes even with an animation of it being typed into the search bar.


> The phone companies China Telecom and China Unicom simply reappropriated their well-known customer service numbers as domain names, 10086.cn and 10010.cn, respectively.

https://9292.nl is an example of this outside China, taking its name from 0900-9292 (a phone number you can call to get public transport travel directions, now largely replaced by the website/app).


Or https://www.1177.se for Sweden, the health care number/website. The NHS in the UK has https://111.nhs.uk as the URL for their new online emergency service.


Nitpick: 111 is the urgent but non-emergency service. The first page asks you to "Check it’s not an emergency" (signs of a heart attack/signs of a stroke/severe difficulty breathing/heavy bleeding/severe injuries/seizure), and to call 999 if it is.


Here's some more info about 1177, which recently had a security breach: https://news.ycombinator.com/item?id=19190346

It is a great service.


In the UK, 118118(.com) is a directory service provider that picked up a lot of popularity before mobile phones became the primary way to search for local businesses. They are still around though, and operate a website that is like a simplistic Yellow Pages / Yelp. They immediately came to mind because of their really catchy advertising back in the day.


In Australia, http://131500.com.au is also a very similar example (in that it also borrows the PT directions phone number), although it redirects to Transport for NSW's newer domain name these days.


In Italy, telephone companies' customer service lines had (and still have) short numbers beginning with 1, so 187.it and www.190.it redirect respectively to Telecom Italia and Vodafone. Strangely enough, 190.it doesn't work...


They forgot the most notorious, 12306.cn.

Basically Healthcare.gov for train tickets.


I believe this is in the article.


1588.lt in Lithuania ;)


>Chinese airline paid $280,000 for the phone number 88888888.

It certainly added wealth to the phone number holder/assigner.


I always wonder how much lawyers in the US pay for their phone numbers... 444-4444 or 888-8888 and so on. I would guess it's even more?


Sometimes the numbers are forwarded to different numbers based on user area code or other tactics to infer caller location.

I.e., they’re licensed to different parties in different locales.


I see bus ads for an accident lawyer in California with the phone number (800) 800-0000. I wonder if that phone number is more valuable than 888-8888.


Probably pretty similar in California advertising to non-Chinese. To Chinese, 8 is a lucky number, so more 8s would be better.


Or 1800-flowers/glasses etc.


How do you input a number like that?


On many (American) phone keypads, 1 is blank, 2 is ABC, 3 is DEF, 4 is GHI, 5 is JKL, 6 is MNO, 7 is PQRS, 8 is TUV, and 9 is WXYZ.

So for FLOWERS you'd see that the F is on 3, L is on 5, O is on 6, etc. to get 356-9377. Note that both R and S are on 7, they map to the same number, so different words are likely to all map to the same number, like "Roses" and "Ropes".

Companies sometimes use longer or shorter words: you might see a number padded with a leading digit, like "1-800-1FLOWER", for which you'd dial 1-800-135-6937. You might also see longer words, like 1-800-ARRANGEMENTS. You could type in 1-800-277-2643 63687, but the extra digits would just go into the recipient's PBX (like dialing an extension) and get discarded.
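
For anyone who wants to play with this, here is a minimal Python sketch of the letter-to-digit mapping described above. The names `KEYPAD`, `LETTER_TO_DIGIT`, and `vanity_to_digits` are made up for illustration, and it assumes the US-style layout from this comment (nothing on 1 or 0).

```python
# Letter groups per key, as described above (assumed US-style layout).
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}

# Invert the table so each letter points at the digit it sits on.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def vanity_to_digits(text: str) -> str:
    """Translate letters to keypad digits; keep ASCII digits, drop the rest."""
    out = []
    for ch in text.upper():
        if ch in "0123456789":
            out.append(ch)
        elif ch in LETTER_TO_DIGIT:
            out.append(LETTER_TO_DIGIT[ch])
        # Hyphens, spaces, and other punctuation are simply skipped.
    return "".join(out)


if __name__ == "__main__":
    for word in ("FLOWERS", "1FLOWER", "ROSES", "ROPES"):
        print(word, "->", vanity_to_digits(word))
```

Since R and S (and P and Q) share the 7 key, the mapping is many-to-one: you can go from a word to a number, but not back without a dictionary.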


I see. TIL and thank you!


A lot of phones come with the English alphabet on the 12 keys, three letters to a key. Similar to older phones that had a physical dialpad. You could then dial the "word" you want by looking at the number on which the correct letter is. That's one way these systems could work.


That's close; in the US it's 2-9, with 7 and 9 getting 4 letters each, for PQRS and WXYZ respectively.


Thank you!


Using the old timey keyboard layout[1] (a relic of pre-touchscreen phones)

So 1800-flowers would be 1800-3569377

[1]https://en.wikipedia.org/wiki/E.161


So that's what the letters were for on phone keyboards.


The use for vanity numbers is a later development. Originally the three digit exchange "number" in North America was identified by a three letter code. On a phone with no dial you'd tell the operator the exchange codeword and the four digit number within the exchange if you didn't want to manually pulse the switchhook. For automatic switch dialed calls, the dial needed to also be marked for these letter codes so you didn't need to use a separate reference for the corresponding digit.


The most valuable number is still:

555-SHOE


I remember going to 4399 for games when I was an elementary student.

Part of the reason why numbers are so pervasive, in the 51job.com sense, is that if you were to type out the pinyin for "I want a job" in Chinese, the URL would be insanely long and unwieldy. 51job.com is easy to type and remember (also, they have a billion ads everywhere). This means that if you don't have numbers, you will try to use the pinyin initials of your name: Jing Dong becomes jd.com.

4008-517-517.com for McDonald's delivery comes from their delivery phone number, which they had years before the URL. You would hear 4008 517 517 with a jingle on the radio all the time. The last time I heard it was probably 10 years ago, but it's funny how the tune of these oft-heard ad jingles never fully leaves your consciousness.


I wonder if this will ever evolve into emoji in URLs. I watched a short video about a guy who is squatting on most of the emoji domains.


I now have the image of a man squatting over the pile of poo emoji. I accept I will never grow up.


I guess this site is relevant here: https://xn--ls8h.la/ (<pile of poop emoji>.la)

But apparently most TLDs don't support emoji URLs yet.


The biggest barrier, in my opinion, is the ugly punycode links. Most browsers still do not render them properly, and if they did, there would be false positives (i.e., there is no way to tell if a string is intended to be punycode or not).


It's not as much a matter of "still not rendering them properly", but more that they are doing it on purpose to avoid homoglyph attacks. Earlier browsers more OFTEN used to render them "properly", I believe.


Of course, this only helps against foreign letters; you still have people squatting on, or phishing from, common typos of pure ASCII URLs, with basically the same result.

Meanwhile, almost all of humanity still can't get decent urls in their native language :(


> Meanwhile, almost all of humanity still can't get decent urls in their native language :(

Why not?

For example, http://xn--h1alffa9f.xn--p1ai/ renders the URL in Russian for me in the URL bar in all of Chrome, Firefox, and Safari (though Chrome converts to punycode if I copy the URL from the URL bar, unfortunately). [Edit: Also, it looks like HN's linkifier converts to punycode; what I wrote there is "россия.рф" and that's what HN has stored if I edit this comment.]

In more detail, for Firefox (where I can find this sort of thing quickly in the code), there are the following things affecting the display:

1) The "network.IDN_show_punycode" preference. This defaults to false, so punycode is not forced across the board.

2) There are a bunch of preferences for which top-level domains are "safe" for use with non-ASCII chars by default no matter what. These currently default to "false" as far as I can tell.

3) URLs that fit in the Highly Restrictive profile defined at https://www.unicode.org/reports/tr39/#Restriction_Level_Dete... are shown as non-punycode as far as I can tell.

4) There's some heuristic detection for URLs using multiple scripts at once and blocking that.

There are also preferences to force-allow or force-deny use of IDN with certain characters; those sets are empty by default.

In any case, the default behavior looks to me like a single-script URL in any language would be shown in IDN. Do you have a counter-example?

https://searchfox.org/mozilla-central/rev/75294521381b331f82... has the relevant preferences with their default values as of today.

Disclaimer: I work on Firefox, and have been involved peripherally in some of the IDN work.
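
To make the multi-script idea in point 4 concrete, here is a rough Python sketch. To be clear, this is not Firefox's code and not the TR39 algorithm: it approximates a character's "script" by the first word of its Unicode character name, and the function names `rough_scripts` and `show_as_unicode` are invented for this example.

```python
import unicodedata


def rough_scripts(label: str) -> set:
    """Approximate the set of scripts used by the letters in a label.

    Assumption: the first word of the Unicode character name ("LATIN",
    "CYRILLIC", "GREEK", ...) stands in for the script. Real
    implementations use the Unicode Script property instead.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def show_as_unicode(label: str) -> bool:
    """Toy rule: display the label in Unicode only if it uses one script."""
    return len(rough_scripts(label)) <= 1


if __name__ == "__main__":
    # The second label hides a Cyrillic "а" (U+0430) among Latin letters.
    for label in ("россия", "p\u0430ypal", "example"):
        print(repr(label), sorted(rough_scripts(label)), show_as_unicode(label))
```

A rule this strict would wrongly flag ordinary Japanese labels, which mix kana and kanji; the TR39 restriction levels linked in point 3 explicitly allow a few combinations like that, which this toy version does not attempt.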


If it starts xn-- then it's punycode, and the prefix was apparently chosen because it didn't occur in any actual DNS labels.
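
A small Python sketch of that prefix check and the conversion in both directions, using the standard library's `punycode` codec. The helper names (`is_a_label`, `to_a_label`, `to_unicode_label`, `convert_host`) are made up here, and the sketch deliberately skips the normalization and validity checks that real IDNA processing performs, so treat it as an illustration only.

```python
ACE_PREFIX = "xn--"


def is_a_label(label: str) -> bool:
    """True if a DNS label carries the punycode (ACE) prefix."""
    return label.lower().startswith(ACE_PREFIX)


def to_a_label(label: str) -> str:
    """Encode a non-ASCII label as xn--...; leave ASCII labels alone."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_unicode_label(label: str) -> str:
    """Decode an xn--... label back to Unicode; leave other labels alone."""
    if not is_a_label(label):
        return label
    return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


def convert_host(host: str, convert) -> str:
    """Apply a per-label conversion to every dot-separated part of a host."""
    return ".".join(convert(part) for part in host.split("."))


if __name__ == "__main__":
    # Hostnames taken from elsewhere in this thread.
    for host in ("xn--h1alffa9f.xn--p1ai", "xn--ls8h.la", "xn--n3h.com"):
        print(host, "->", convert_host(host, to_unicode_label))
```

The prefix test is the whole "detection": a label either starts with `xn--` or it doesn't, and everything after the prefix is handed to the decoder.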


This actually predates emoji (and was mentioned in the article). In the early 00's we got the ability to use Unicode characters in a domain name. The example I always liked was Unicode Snowman ;-)

http://xn--n3h.com/ (Funnily enough, HN is stripping the Unicode Snowman glyph, so I'm using the punycode version.)

It was a great way to break websites. JavaScript used for client-side validation usually allowed it, but backend code often didn't understand Unicode in URLs, and would usually choke somewhere and (fingers crossed) give you a lovely J2EE or ASP.NET error page with some helpful stack traces and version numbers.


This is also an interesting way to go about dealing with the scarcity of domain names


>though the U.S. recently agreed to hand it over to a “global multi-stakeholder community” in 2015

Hmm, it seems like the title should have a year after it.


The last sentence is odd:

> You can’t blame other countries for wanting to tell the American 250s to 0748.

250 = idiot

0748 = go die

Which translates:

> You can’t blame other countries for wanting to tell the American idiots to go die.

I'm not sure why the author felt the need to include that.

(Edit: Am I wrong for questioning the article?)


It doesn't seem odd given the context of the full paragraph, which starts:

> Still, the numbers/letters divide is emblematic of the Internet’s built-in bias: Even more than two decades after its birth, it’s still a fundamentally American system.

It describes how URLs are frustrating and difficult to use for non-English speakers. Given that, it makes a joke about that frustration using the URL-hacks that are often used in China, which is what the whole prior article covers.


I get that the author wanted to express frustration and be clever, and he had limited examples of “number-based slang” to do so with, but I agree with the parent comment that calling Americans “idiots” and telling them to “go die” seems excessively hostile.

Though I also can’t imagine “go die” has the same force in Chinese that it would have in English if it’s a common slang.


I think you are correct that '0748' is not a very serious/forceful phrase. Probably only a lighthearted slang used by young netizens in China.


That's my thought too. Apparently HN does not agree, which I think is sad. It's interesting information in the article, but being hostile just because the Internet was pioneered in the U.S., and thus a lot of U.S. customs are baked in, seems excessive. I like anime, but I'm not hostile to the Japanese because the shows I like are spoken in Japanese with only English subtitles.


Anime isn’t deeply embedded in society the way the internet is now. Imagine if you had to learn some Japanese just to access your bank account or buy plane tickets.


I have (Japanese, Spanish, and Korean). I didn't hate the host cultures because of it.


I don’t mean imagine if you had to do that in order to live in those countries. I mean, imagine if you had to do that in the US, to access your American bank down the street.


It's the New Republic, an intellectual fashion magazine for the American Left. As it says at the bottom of the article, the author is one of their staff writers. Comments about American hegemony and how other countries can't be blamed for wanting American idiots to go die are the kinds of intellectual fashion statements that the New Republic expects its staff writers to make.


I am leftist myself, but these people are insane.

"Yeah, let's just go ahead and hand over control of internet standards gift-wrapped to some of the worst human rights abusers in history. What could possibly go wrong?!"

If Russia and China were sane, reasonable states that cared for any semblance of human rights, yeah, sure, maybe it'd make sense to make this not American-centric. But as it stands, the author just comes off as naive.


> If Russia and China were sane, reasonable states that cared for any semblance of human rights

America holding internet standards hostage can't be the answer to that; arrogant selfishness comes across as the motivation.

Telephony interoperability standards are managed just fine by an international committee; this can serve as a role model for how to govern control of internet standards. http://ITU.int/ITU-T


420.com certainly works!


Mods, this is from 2014, maybe add that to the title?


This article was discussed 5 years ago (at its release): https://news.ycombinator.com/item?id=7694076


Please put the year [2014] after this title. It's as ancient as the Chinese culture it's about, heh.

https://news.ycombinator.com/item?id=7694076



