NetSurf, a multi-platform web browser (netsurf-browser.org)
202 points by butz on Aug 11, 2020 | 100 comments


I find NetSurf's source code beautiful, straightforward, and quite easy to read and follow. I am impressed. It's like the pure version of "makes sense". It's also split into independent libraries that can be used outside of NetSurf, which makes them useful on their own and NetSurf very modular. If I had to implement something that requires a feature implemented in a browser, I'd seriously consider one of them, especially their CSS parser and engine.

I really recommend having a look at the code.


To be honest, I wouldn't exactly call this "beautiful"...

https://source.netsurf-browser.org/libdom.git/tree/src/core/...

https://source.netsurf-browser.org/libdom.git/tree/src/core/...

https://source.netsurf-browser.org/libcss.git/tree/src/parse...

Looks like the authors have a severe case of goto-phobia that causes a quadratic explosion of copy-pasted code in the error-return paths. Some of the files also feel like they have been translated into C from C++ or some other OO language by some automated tool, resulting in some very long "namespaced"-looking identifiers.
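
For reference, the goto-based cleanup idiom that avoids that duplication looks roughly like the sketch below. This is a generic example, not actual libdom code; the names are made up.

    /* Generic sketch of centralized error cleanup via goto; not actual
       libdom code, the names are made up. Each failure jumps to a label
       that frees only what has been allocated so far. */
    #include <stdlib.h>

    int make_pair(char **out_a, char **out_b)
    {
        char *a = malloc(16);
        if (a == NULL)
            goto fail;

        char *b = malloc(16);
        if (b == NULL)
            goto fail_a;

        *out_a = a;
        *out_b = b;
        return 0;

    fail_a:
        free(a);
    fail:
        return -1;
    }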

Then again, I don't think Firefox or WebKit code is that much better either, so my impression of this codebase is neither great nor horrible. As "beautiful" is subjective, to give a reference for what I'd consider beautiful, look at BSD or early UNIX.


I agree with the above sentiment regarding the 'goto-phobia'. Still, at a glance it looks like a codebase where a lot of care and diligence has gone into documentation and code presentation.


I also find the code to be clear and easy to read. Looking at the samples you provided, I feel like I could step right in and work on this project with minimal cognitive load. I actually like the style.


There are other "C++ programmer trying to write C" anti-patterns, such as typedefed structs and one "class" per file, leading to terrible performance without LTO. Stuff like https://source.netsurf-browser.org/libdom.git/tree/src/core/... has more signs of excessive C++ love: "vtable", "protected", etc. I don't know that these things are inherently bad, but they certainly don't exist in C.
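
For anyone unfamiliar with the pattern: a "vtable" is not a C language feature, but it can be hand-rolled with a struct of function pointers, roughly as in the sketch below. This is a generic illustration, not libdom's actual definitions; all names are made up.

    /* Generic illustration of a hand-rolled "vtable" in C: a struct of
       function pointers used for virtual-style dispatch. Not libdom's
       actual structures; all names here are made up. */
    #include <stdio.h>

    struct node;

    struct node_vtable {
        const char *(*get_name)(struct node *self);
        void (*destroy)(struct node *self);
    };

    struct node {
        const struct node_vtable *vtable; /* per-"class" dispatch table */
        const char *name;
    };

    static const char *node_get_name(struct node *self) { return self->name; }
    static void node_destroy(struct node *self) { (void)self; }

    static const struct node_vtable node_vtable = {
        node_get_name,
        node_destroy,
    };

    int main(void)
    {
        struct node n = { &node_vtable, "example" };
        /* Call through the vtable, like a virtual method call in C++. */
        printf("%s\n", n.vtable->get_name(&n));
        n.vtable->destroy(&n);
        return 0;
    }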


...and despite that, it still manages to feel faster than the mainstream browsers, which says just how much extra overhead those have, and that there is still "more room at the bottom" to trim this one down some more too.


That first one is madness! That's not even goto-phobia, that's just not knowing what arrays and loops are.

The third one is true goto-phobia of course. Crazy how people can't see the mess they make by avoiding goto for the sake of avoiding goto.


Taking a look. It's been a while since I've heard someone speak that highly of any source code; I admit that alone makes me very intrigued.


I don't see many people speaking about code they find good, and I'd appreciate more of it. Pointing out bad code is common, but praising good code?

My comment may seem exaggerated, but it conveys very well how I felt when I discovered this code. Often, I'm overwhelmed by the code tree of medium to big-sized projects, with many abstractions and complicated folder structures. This is a browser, and yet it was easy to figure out where to find the parts I was interested in and to understand what the code did.

Now, one can always find problems and discuss the lack of gotos or arrays, but if, as a complete outsider, I can navigate and understand the code, and even feel that I could hack on it quite easily with no documentation, something must be right with it.

I'm not in any way connected to the project, I don't even use this browser (sadly, it is too impractical).


I was able to download and build the GTK3 version of this in under 5 minutes. For compactness alone, it gets my seal of approval. I'm posting this comment from NetSurf; HN works fine.


I'll try this tomorrow. The old Opera Presto also built in under five minutes and was in general very lean, but it crashes on most modern JavaScript-heavy sites. I still sometimes wonder where it would be today if they hadn't abandoned it, or at least had properly made it open source.


Does it block ads?

Asking because: DNS-level adblocking is stupid.


DNS level adblocking is fantastic, especially since it can be done network wide so easily and used as an extra layer of ad filtering.


> DNS level adblocking is fantastic, especially since it can be done network wide so easily and used as an extra layer of ad filtering.

It's great, but DNS over HTTPS will end the party soon enough (if I were a smart TV manufacturer, I would be prioritizing adding DNS over HTTPS to the device firmware to subvert network blocks).


I do not understand. I run my own DNS, but I also sometimes gather DNS data in bulk over DNS over HTTPS or DNS over TLS. I retrieve the data outside of the browser and put it into my own zone files. Are you saying that applications and devices will make it practically impossible for the user to change DNS settings to point to localhost or RFC1918-bound DNS servers? How would they be able to do that, assuming the user controls the first upstream router? Even if they could do this, it seems a bit too heavy-handed.

It would be much easier, I would think, for application developers to just make an ad-blocking extension, e.g., uMatrix, stop working. For example, they could say this is because the application now has its own built-in ad blocker. Never mind that the developers are paid from the sale of web advertising services.


DNS-adblocking in a router can be complemented by the router's firewall blocking outbound to all DoH provider IPs.

(It'll need to be a constantly-updating blocklist, but the DNS-adblock lists are also that already.)


I can't vouch for these since I haven't tried them yet, but it can apparently also be complemented by configuring your local DNS server to return NXDOMAIN for use-application-dns.net [1] and using a DoH proxy to protect upstream requests from snooping [2].

[1] https://support.mozilla.org/en-US/kb/canary-domain-use-appli...

[2] https://github.com/aarond10/https_dns_proxy


That gives the consumer one more reason not to hook it up to the network, ever.


I have to agree. I think DNS adblocking off a Raspberry Pi is the best. Unfortunately a power loss screwed up my Pi's SD card.


Can you elaborate?


Cosmetic filtering and blocking specific files can't be done via dns filtering.


Could you give a working example, i.e., a website where you are doing cosmetic filtering or blocking specific files?

I also use a local proxy in addition to DNS which allows me to serve alternative resources or block/redirect certain URLs based on prefix/suffix/regex.


Not OP, but where I do cosmetic filtering is on Stack Overflow. They display "hot network questions" on every page, with extremely interesting stuff from non-work Stack Overflow clones. "How many cats did Cmdr. Data have in Star Trek", that sort of stuff.

It has made me lose my focus on work repeatedly, and Stack Overflow really is a site related to work for me. So I block that column.


Thanks for that. Now I can definitely see the usefulness of this for interactive website use.

I am more of a non-interactive user and do not use a graphical, javascript-enabled browser much.

Here is a snippet I used to remove the annoying "hot network questions" from the page:

   sed '/./{/div id=\"hot-network-questions/,/<\/ul>/d;}' page.html
Out of curiosity I wanted to see if I could access all these networking questions non-interactively. That is, download all the questions, then download all the answers.

Some years ago, 10 years or more, I was making some incremental page requests on SO, e.g., something like /q/1, /q/2, ... and I got blocked by their firewall. What amazed me at the time was that the block lasted many months; it may even have been a year. This is one of the harshest responses to crawling I have ever encountered, one of the very few times I have ever been blocked by any site, and the only time I ever got blocked for more than a few hours.

Things have definitely changed since then. To get all the networking questions, I pipelined 277 HTTP requests in a single TCP connection. No problems.

Here is how I got the number of pages of networking questions:

   y=https://networkengineering.stackexchange.com/questions
   x=$(curl $y|sed -n 's/.*page=//;s/\".*//;N;/rel=\"next\"/P;N;')
   echo no. of pages: $x
To generate the URLs:

   n=1;while true;do test $n -le $x||exit;
   echo $y?page=$n;n=$((n+1));done
I have simple C programs I wrote for HTTP/1.1 pipelining that generate HTTP, filter URLs from HTML and process chunked encoding in the responses.
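
The gist of HTTP/1.1 pipelining is just writing several requests down one connection before reading the responses back. A minimal plain-HTTP sketch in C is below; this is not the commenter's actual tooling, the host and paths are placeholders, and real use against the HTTPS sites mentioned here would need a TLS layer (or a local TLS-terminating proxy) in front of the socket.

    /* Minimal HTTP/1.1 pipelining sketch: send several GET requests on one
       TCP connection before reading anything back. Plain HTTP only; host
       and paths are placeholders, not the commenter's actual setup. */
    #include <netdb.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        const char *host = "example.com";          /* placeholder host */
        const char *paths[] = { "/", "/a", "/b" }; /* placeholder paths */
        struct addrinfo hints = {0}, *res = NULL;
        hints.ai_socktype = SOCK_STREAM;
        if (getaddrinfo(host, "80", &hints, &res) != 0)
            return 1;
        int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0)
            return 1;

        /* Pipeline: write all requests back-to-back, then read. */
        for (size_t i = 0; i < sizeof paths / sizeof *paths; i++) {
            char req[512];
            int n = snprintf(req, sizeof req,
                "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: keep-alive\r\n\r\n",
                paths[i], host);
            write(fd, req, (size_t)n);
        }

        /* The responses come back concatenated; a real tool would parse
           Content-Length / chunked encoding to split them (cf. the csplit
           on ^HTTP further down). */
        char buf[4096];
        ssize_t r;
        while ((r = read(fd, buf, sizeof buf)) > 0)
            fwrite(buf, 1, (size_t)r, stdout);
        close(fd);
        freeaddrinfo(res);
        return 0;
    }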

Fastly is very pipelining friendly. No max-requests=100. Seems to be no limits at all.

There were 13,834 networking questions in total.

Wondering just how many requests Fastly would allow in one shot, I tried pipelining all 13,834 in a single TCP connection. It just kept going, no problems. Eventually I lost the connection but I think the issue was on my end, not theirs. At that point I had received 6,837 first pages of answers. 211MB of gzipped HTML.

So, it is quite easy these days to get SO content non-interactively.

It was also easy to split the incoming HTML into separate files, e.g., into a directory that I could then browse with a web browser.

   x=$(zgrep -c ^HTTP answers.gz)
   mkdir newdir; cd newdir;
   zcat ../answers.gz|csplit -k - '/^HTTP/' '{'$x'}'


Heh, you sound like Richard Stallman; it's rumored he also doesn't use a browser.

As an aside, I've seen mirrors of Stack Overflow pop up when I use DuckDuckGo to search. Google seems to filter these out.


From what I have read of his philosophies I can imagine the reasons Stallman might not use a browser, but I think that is an oft-repeated, old, unsubstantiated rumour. I would bet he uses one. The reasons I prefer the command line to an over-sized, slow, graphical program are different: I was introduced to computers in the VAX era, not the JavaScript era.

I'm curious: if those SO mirrors did not show cruft like "Hot Network Questions" on every page, would you use them instead of relying on an ad blocker?


At the moment, I completely ignore the SO mirrors, as my instinct is to go to the original source. I'll start paying attention, though; they might actually be the better website.


uBlock Origin with all but the regional-language filter lists gives:

133,215 network filters + 155,733 cosmetic filters

in the stats. Network filters are URL-based, not just domain-based. The lists are easy to view from the uBlock settings page if you want an endless supply of examples; both kinds are used in pretty much every style of list: ad, privacy, annoyances, cookie banners, tracking.


AFAIK, ads on Google search results can't be blocked by DNS alone.


Can you provide a working example?

I certainly block plenty of Google-controlled domains. I normally do not use Google search, and even when I do, I never seem to trigger any ads. Maybe I am just not searching for things people want to sell. In the rare event I do trigger an ad, because I am not using a "modern" browser to do searches, the ads are not distracting and I can easily edit them out of the text stream if I want to.


>Can you provide a working example?

Literally any search that uses the "expensive" keywords[1]. "car insurance quotes" would do nicely, for instance.

[1] https://www.wordstream.com/articles/most-expensive-keywords


It looks like even with all the more recent nonsense Google inserts into the results it is still easy to just extract the result URLs and leave behind the rest of the crud. If you want to retain the description text it is a little more work.

Interestingly, the /aclk? ad URLs do not use HTTPS.

Seeing that these ad URLs are still unobtrusive, I am wondering why anyone would want to remove them from the search results page. For cosmetic reasons?

I prefer searching from the command line. To remove the /aclk? ad URLs I used sed and tr.

   #!/bin/sh
   # usage: $0 query > 1.htm
   # 
   x=$(echo y|tr y '\004');
   z=$(echo https://www.google.com/search?q=$@\&num=100|sed 's/ /%20/g');
   curl --resolve www.google.com:443:172.217.17.100 -Huser-agent: "$z"|sed "s/<a href=\"\/url?q=/"$x"&/g"|tr '\004' '\012'|sed -n '/url?q=/{s/.url?q=//;s/&amp;sa=.*\"><h3/\"><h3/;s/&amp;sa=.*\"><span/\"><span/;s/$/<br>/;/aclk?/d;p;}'


If anyone here is running Linux and is now researching a "backup plan" in case Firefox pivots, I would highly recommend Epiphany (now called GNOME Web): https://wiki.gnome.org/Apps/Web


Gnome Web (for Gnome), Falkon (for KDE), and Safari (for macOS) are really underrated browsers. It's true that they all use the WebKit engine, so it doesn't exactly promote a variety of engines, but at least none of them have any "monetization" features built in.


There are also surf https://surf.suckless.org/ and qutebrowser https://qutebrowser.org/, both based on webkit.


I use my own fork of surf daily, but I'm having to fall back to mainstream browsers quite often these days as sites stop working on it or become unusably slow. Slack no longer works, GitHub breaks occasionally, etc.

Is there a well-maintained fork that fixes these issues? The main repo[1] hasn't seen updates in over a year now. I read that development slowed down after the main maintainer left suckless, but I'm hoping the community will pick it up. It's an excellent minimal browser.

[1]: https://git.suckless.org/surf/


FWIW Slack seems to work fine in surf for me. Are you sure you have an up-to-date WebKitGTK on your system? Those kinds of issues are typically fixed there, and not in surf itself. As long as the WebKit API stays backwards-compatible, there's probably not much need for surf to change (other than for new features).

There's zsurf based on QtWebEngine/Chromium too, FWIW: https://github.com/SteveDeFacto/zsurf


There is also badwolf https://hacktivis.me/projects/badwolf (again based on webkit)


qutebrowser uses QtWebEngine by default, which is based on Chromium.

You can use it with QtWebKit instead, but given that's based on a 2016 WebKit with no process isolation or sandboxing, I wouldn't recommend it.


Or perhaps Konqueror? Seems apt seeing that KDE wrote what became WebKit.


"Falkon" is the current KDE web browser. It used to be called QupZilla.


There is also a cross-platform Otter Browser (https://otter-browser.org) based on WebKit.


I’m surprised that no one has mentioned Opera yet.


Opera is now Chinese owned.

But the former Opera devs started Vivaldi some years ago. And it's really great.


My minor gripe with it is that you can't drag and drop URLs or any other text. Other than that it's a joy to use.


That's optional.

In Settings, search for "Allow Text Selection in Links" and deactivate it. Now you can drag+drop links.


IceCat is better. Epiphany is pretty slow and huge, and it depends on a lot of libraries.


> Firefox pivots

Pivots to what?


There were massive layoffs today, apparently, along with discussion of finding better ways to make money. Some are taking it as a sign of Firefox potentially deciding to monetize data (and, thus, disincentivize the privacy focus internally). That might be what OP is talking about.


I can only guess they mean if they pivot to be less privacy focused.


Or pivot to maintaining a half-ass Chromium fork instead of a whole-ass browser, since that's apparently the fiscally responsible thing to do.


I remember looking at this and Dillo a while ago, and while it definitely looked more polished than Dillo, I recall downloading the Windows binary and it crashing the first time I ran it - not a good start. On the other hand, Dillo ran immediately and was at least usable. Maybe it's time to try it again, especially now that there are even fewer independent browsers...


Dillo is faster whilst NetSurf has better CSS support. For text- and table-heavy sites Dillo can be nicer to use (since everything has a uniform look and feel); for sites which rely on CSS for usability then NetSurf is nicer to use (e.g. headers and sidebars packed with nested mouse-over menus, which Dillo and text-mode browsers tend to render as a gigantic column of cruft above the actual content).


The Windows binary doesn't crash anymore, I have it installed and use it occasionally. I also use it on my very low end Linux laptop and it works wonderfully for anything that doesn't require JavaScript.

Edit: just a warning, the Windows version has problems with the <select> element, it won't display the options when you click it.


You have discovered websites that don't demand JS! Seriously though, I have JS disabled (with a Firefox extension to turn it back on per site) and the number of broken or blank pages is saddening.


What's even sadder is that even with all that breakage, disabling JS makes the web more usable/bearable.


My internet browsing is mostly limited to Hacker News, Reddit (the old UI) and StackOverflow and they all work relatively well with JS disabled.


It might also be of interest that this can be run in a framebuffer. Sometimes I work in a virtual console to avoid distractions, but when Emacs' built-in web browser doesn't serve a purpose, I'll open NetSurf in another console, and it's fine for almost all "search for errors or docs" workflows.



OK... I'll bite... how does this compare to Chrome in terms of rendering and compatibility? I don't see this thing listed on caniuse.com.


It's good at old-school "websites". Zero support for the modern web, aka "webapps".


It does not even pass the ACID2 test. It is however the best independently developed browser engine out there. If you want an alternative to Webkit and whatever Firefox uses, Netsurf is the only viable option at the moment.


Basically, it's the only browser out there even trying to handle modern sites that's not mostly funded by Google or Apple (Firefox revenue is mostly coming from Google, and there's Safari), and which runs on rarer platforms.

It's still relatively big, because web standards are incredibly complicated, but it's almost tractable.

If you care about browser diversity, it's important.


There are a few others but all with their own caveats -

https://www.ekioh.com/flow-browser/ is fully independent and can run Gmail, but it's commercial / non-FOSS. It's funded by contracts in the set-top-box industry.

https://sciter.com/ is non-FOSS with a free edition but you shouldn't run untrusted javascript on it

And there are some lightly-maintained forks of the leaked Presto 12.x engine from before Opera became a Chromium derivative, but of course it's copyright infringement.


I would love it if Opera open-sourced Presto. It's unfortunate that the lack of resources that pushed them to Chromium probably translates to a lack of the resources needed to open-source Presto properly, so I doubt we'll ever see it. : \


I loved Presto so much; not open-sourcing it is a huge loss. And now the future of Servo is unclear as well. :(


> It does not even pass the ACID2 test. It is however the best independently developed browser engine out there

Flow browser passes ACID3 [1]; it's closed source but independent (AFAIK; I don't really know much about it).

Previous HN discussion: https://news.ycombinator.com/item?id=23508979

[1] https://www.ekioh.com/devblog/acid/


Not available for Linux?


Have a look here (although the page apparently hasn't been updated in years, so its compatibility may be better in reality):

http://www.netsurf-browser.org/documentation/progress.html

The general impression I get is that it is not bad when it comes to HTML and CSS, but not even close for JavaScript (disabled by default because of minimal support, apparently), so comparing this with Chrome is like comparing apples to oranges, given the way the modern Web is going.


In other words, HTML4 and CSS2.

As a counterpoint, when you come across a site which should but doesn't work in one of these minimal browsers, maybe the blame should be put on the site for using needless complexity instead of the browser for not implementing it.


You really cannot even begin to blame websites for using HTML5 and CSS3 just because an obscure browser doesn't support them.


Yeah, blame Adobe Photoshop 2020 CC for not being optimised for Windows 95.


So obviously caters to what the web once was and should have stayed, not to the metastasizing idiocy it has become.

I'm ancient enough to remember when mainstream browsers came in packages of a few megabytes and got us through the day just fine with a featureset less than (or at the very most equal to) the current state of NetSurf. In a somewhat more rational world there would be massive popular pressure for this to remain the case.

Yes, sites break in NetSurf. This is squarely their own fault for not providing civilized degradation. Although it certainly wouldn't go amiss if the NetSurf engine incorporated the worthwhile elements of CSS 3, basically meaning Grid and possibly Flexbox.


Not sure when CSS variables were added, but I'd consider them more friendly to user customization than preprocessors that inline the same font stack 20 times in autogenerated class names (looking at medium.com), requiring you to override the font stack 20 times.


Granted. Lack of variables was always among the more serious defects of the confusion that is CSS.


NetSurf is a great browser. I test with it regularly, and it's a pleasure to both use and develop for.


It has many good ideas, but it needs better customization, and perhaps better keyboard commands, Xaw widgets, bitmap fonts, etc. It should support specifying pipes and such for files, like Heirloom-mailx does. What we need is a browser that gives you enough rope to hang yourself and a few more just in case, and NetSurf may be the one to base it on. Make it for the user first; assume any file received is potentially hostile (whether or not the connection is secure; a secure connection merely prevents spies from tampering with it, it does not prevent the server operator from tampering with it!). Support scripts that the user can replace with their own implementation; this would ensure that the user can alter any document received, and it would even be more efficient, since the user's implementation need not be written in an interpreted programming language and need not be sandboxed as much either.


I've played around with NetSurf a little for fun recently. It performed surprisingly well on a few of the sites I use: old.reddit and Hacker News both rendered OK with some issues, and Miniflux, which I use for RSS reading, worked well too.

I think it's a very cool and admirable project.


I've been using NetSurf for a long time now on my Thinkpad when I'm traveling. If I only need to check webmail and read articles NetSurf suffices perfectly, without draining as much battery power as a regular browser.


Is there any browser that implements container tabs à la Firefox?


I noticed that in the screenshots section[1], the Linux screenshots for the Netsurf site and Wikipedia are from this year, but the BBC one is from 2011. How does the current BBC site fare?

[1] -- https://www.netsurf-browser.org/about/screenshots/


Front page: https://l.sr.ht/RfrR.png

Reading an article: https://l.sr.ht/ZSwt.png


For all that this browser lacks, I think it's simple enough to build that it's been the "first" browser on a few alternative/hobby OS's, no?

e.g, Redox, maybe Haiku pre-WebKit port? Haiku I could easily be misremembering as there was probably a BeOS browser that ran, albeit outdated...

tl;dr I appreciate the simplicity of it in contrast to modern browsers.


Is the NetSurf rendering engine multi-core/multi-thread? It's not clear from the homepage, which makes me think that it's single-core only.


Do some pages on the website need a little love for the copyright date?

"Copyright 2003 - 2009 The NetSurf Developers". (eg; Downloads)


NetSurf was a bit unstable and prone to locking up compared to Dillo or Links+.

Also, the Duktape JS capabilities are behind even edbrowse.


I wonder if it would build for MS-DOS.


Not a silly question. However, (at a guess) I reckon a lot of the underlying net code is probably POSIX based, which MS-DOS is not.

I notice there's a framebuffer version, so building a nano-linux distro that boots instantly into that shouldn't be that hard.


So there's the framebuffer code that's needed, and then there's the net code.

For the framebuffer code, maybe an MS-DOS UNIVBE wrapper could be used.

For the net code, I have no idea, but a wrapper for POSIX calls must surely exist there also.

Anyway, I guess the userbase for this would be pretty much nil, as alternatives do exist on DOS :)


For networking on MS-DOS I recommend mTCP, although it's C++.

The biggest problem, however, will be compiling the code. The only viable compiler for MS-DOS is Open Watcom, so the question is whether the code uses any fancy extensions or GNUisms. Also, the only version of Open Watcom that didn't produce broken binaries was the Windows version (or at least that was the case about two years ago).

Happy porting :-)


what in the 1993 is this


...and that's a good thing.

The utter waste of resources that is the mindless trendchasing of the "modern web" is in desperate need of some strong opposition.


I agree. And there are more problems than waste of resources, although that is one of them.


No kidding; it looks like even the web server is old too. It seems to be Cherokee, a project whose last stable release was 6 years ago.



Sorry, I should have said stable release.


> Cherokee, a project whose last stable release was 6 years ago

Its last CVE was 10 years ago[1].

It's not actually necessary for software to change if it already does what you want it to.

1: https://www.cvedetails.com/vulnerability-list/vendor_id-1005...


There are quite a few more recent CVEs for Cherokee.

https://nvd.nist.gov/vuln/search/results?form_type=Basic&res...


Especially when no one uses it.


Wasn't that the name of some shitty product from the 90s?



