The Lazy Man's URL Parsing (joezimjs.com)
114 points by joezimjs on May 7, 2012 | 23 comments



This is a great trick, but there's an irritating IE bug: IE omits the leading slash from the pathname attribute (whereas all other browsers include it).

It can be corrected by doing:

    parser.pathname = parser.pathname.replace(/(^\/?)/,"/");
Further, because you're creating a DOM element, it's less performant than using a RegExp solution. That performance degradation will typically only surface if you're parsing a large number of URLs (for example, in a loop), rather than just 1 or 2.
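For comparison, here is a rough sketch of what a RegExp solution might look like. It uses the generic URI-splitting expression from RFC 3986, Appendix B; the name parseUrl and the shape of the returned object are my own assumptions, not something from the article. Note it won't match the DOM trick exactly: the pathname comes back empty for a URL with no path, and "host" here is the whole authority, so it would include any username:password part.

```javascript
// Hypothetical helper (not from the article): split a URL with the generic
// regular expression given in RFC 3986, Appendix B.
var URL_PARTS = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

function parseUrl(url) {
  // Every group is optional, so exec() always returns a match for a string.
  var m = URL_PARTS.exec(url);
  return {
    protocol: m[1] || "", // e.g. "http:" (colon included, like the <a> property)
    host: m[4] || "",     // whole authority: userinfo and port included if present
    pathname: m[5] || "", // empty string when the URL has no path
    search: m[6] || "",   // e.g. "?x=1"
    hash: m[8] || ""      // e.g. "#top"
  };
}

var parts = parseUrl("http://example.com:8080/a/b?x=1#top");
// parts.protocol -> "http:"
// parts.host     -> "example.com:8080"
// parts.pathname -> "/a/b"
```

No DOM involved, so it also works outside a browser, at the cost of owning the regex.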


For the DOM element, it seems like you could just create the element once and then keep it around for reuse. Should be safe since JavaScript is single-threaded.
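Something like this is what I have in mind, as a minimal sketch. makeUrlParser is a made-up name, and taking the document as a parameter is just my assumption so it can be poked at outside a browser; in a page you'd pass in document.

```javascript
// Hypothetical helper: creates one <a> element and reuses it for every URL.
// `doc` is expected to be the page's `document` (or anything with createElement).
function makeUrlParser(doc) {
  var link = doc.createElement("a"); // created once, kept in the closure

  return function (url) {
    link.href = url;
    // Copy the values out, because the element is overwritten on the next call.
    return {
      protocol: link.protocol,
      host: link.host,
      // Leading-slash fix for IE, as mentioned earlier in the thread.
      pathname: link.pathname.replace(/^\/?/, "/"),
      search: link.search,
      hash: link.hash
    };
  };
}

// In a browser:
//   var parse = makeUrlParser(document);
//   parse("http://example.com/some/path?x=1").pathname
```

Copying the properties into a plain object matters here: if you hand back the element itself, the next call changes it out from under the caller.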


My understanding is that modifying and querying DOM elements is what's slow, not just creating them.

http://jsperf.com/lazy-url-parsing


Added a test case for not caching the DOM element: http://jsperf.com/lazy-url-parsing/2


I'm actually surprised it's only ~40% slower (in my Chrome anyway). Considering how much less code it is to maintain, that's totally worth it IMO. I'm sure there are a thousand other functions you'd want to optimize in a real app before this.


Clever! Didn't realize you could reuse the element.


The interface is officially called "URL decomposition IDL attributes." IIRC it's only implemented by <a/>, <area/> and the Location object.

Here are the canonical docs: http://dev.w3.org/html5/spec-LC/urls.html#interfaces-for-url...

And here's a prettier (but less detailed) version: http://developers.whatwg.org/urls.html#interfaces-for-url-ma...


Just came across this clever solution yesterday, actually. Unfortunately, it doesn't work for URLs containing a username and password (e.g. http://username:password@example.com). Glad to have found URI.js, though--it's exactly what I was looking for.


I was noticing that too.


var link = $('<a/>').attr('href', 'http://example.com')[0] in jQuery, just to show how simple it can be.

A stylistic point: I'd just call it something like "link". Assigning a link element to "parser" is cute, but that's not what the object actually is, even if it's being used to parse.


HN Link: http://www.joezimjs.com/javascript/the-lazy-mans-url-parsing...

Takes you here: This webpage has a redirect loop The webpage at http://www.joezimjs.com/500.shtml/ has resulted in too many redirects. Clearing your cookies for this site or allowing third-party cookies may fix the problem. If not, it is possibly a server configuration issue and not a problem with your computer. Here are some suggestions: Reload this webpage later. Learn more about this problem. Error 310 (net::ERR_TOO_MANY_REDIRECTS): There were too many redirects.

Instead of here: http://www.joezimjs.com/javascript/the-lazy-mans-url-parsing...


The original seems to be down now. Note that there are many ways to name the bits you parse a URL into: http://tantek.com/2011/238/b1/many-ways-slice-url-name-piece...


Even worse, the original seems to be down due to an infinite redirect loop.


Sorry about "the original". It wasn't actually supposed to have that query string on the end. That was for Google Analytics tracking elsewhere and I forgot to remove it when I posted it here.

I'm not sure why it produced a redirect loop, because it worked many times for a lot of people. It may have had something to do with my server getting overloaded. I'm on a Hostgator reseller account, so I have limited resources, and when HG saw the massive CPU usage from all of you people checking this out, they shut down my site for a bit.


Incidentally, the Node.js `require('url').parse(str)` method is designed to present the same API, except that it also includes the auth section as an 'auth' member.
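A quick example of what that looks like, using the username/password URL mentioned elsewhere in the thread (the path, query and fragment are made up for illustration). Worth noting that current Node documentation marks url.parse as a legacy API, so treat this as a sketch of the behavior rather than a recommendation.

```javascript
var url = require("url");

// Example URL is made up; the auth part is the case the <a> trick misses.
var parts = url.parse("http://username:password@example.com/some/path?x=1#top");

// parts.protocol -> "http:"
// parts.auth     -> "username:password"
// parts.host     -> "example.com"
// parts.pathname -> "/some/path"
// parts.search   -> "?x=1"
// parts.hash     -> "#top"
```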


If you are coding in Python instead of JavaScript, check out my complete URI module (miniuri.py): https://bitbucket.org/russellballestrini/miniuri/src/tip/min...



Laziness is the mother of all inventions. Neat trick.


I wonder if it works in Internet Explorer 6 or 7.


IE omits the leading slash from the pathname, but it's easily fixable. http://news.ycombinator.com/item?id=3939454


Do people building anything real world care about IE6 any more? Hell, do they really care about 7?


If you care about enterprise you probably care about IE 7.


In the hospital where I work, most of the computers got upgraded to IE7 about 6 months ago. There's talk of upgrading to IE8, but that's probably a year off if I had to guess.

The way I see it, about half of my salary is for caring about IE7. IE7 hacks and stupidities occupy no more than 5-10% of my time, but 80% of my work is damn fun and I would do it for free. I am glad that there is the remaining 20% of IE7 hacking, boss-assigned-task-doing, and so on, for which I (consider myself to) get paid 5x my actual salary.



