I find NetSurf's source code beautiful, straightforward, and quite easy to read and follow. I am impressed. It's like the pure version of "makes sense". It's also split into independent libraries that can be used outside of NetSurf, which makes them useful on their own and makes NetSurf very modular. If I had to implement something that requires a feature found in a browser, I'd seriously consider one of them, especially their CSS parser and engine.
Looks like the authors have a severe case of goto-phobia that causes a quadratic explosion of copy-pasted code in the error-return paths. Some of the files also feel like they have been translated into C from C++ or some other OO language by some automated tool, resulting in some very long "namespaced"-looking identifiers.
Then again, I don't think Firefox or WebKit code is that much better either, so my impression of this codebase is neither great nor horrible. As "beautiful" is subjective, to give a reference for what I'd consider beautiful, look at BSD or early UNIX.
I agree with the above sentiment regarding the 'gotophobia'. Still, at a glance it looks like a codebase where lots of care and diligence has been taken in documentation and code presentation.
I also find the code to be clear and easy to read. Looking at the samples you provided, I feel like I could step right in and work on this project with minimal cognitive load. I actually like the style.
There are other "C++ programmer trying to write C" anti-patterns, such as typedef'd structs and one "class" per file, leading to terrible performance without LTO. Stuff like https://source.netsurf-browser.org/libdom.git/tree/src/core/... has more signs of excessive C++ love: "vtable", "protected", etc. I don't know that these things are inherently bad, but they certainly don't exist in C.
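For anyone who hasn't opened libdom, the style in question looks roughly like the sketch below. This is a made-up miniature for illustration, not code from libdom, but it shows the typedef'd struct, the vtable of function pointers, and the indirect dispatch that the compiler can't inline across files without LTO.

/* A made-up miniature of the pattern being described, for illustration only
 * (the names are invented, not copied from libdom). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct dom_node dom_node;                /* typedef'd struct */

typedef struct dom_node_vtable {
    const char *(*get_name)(dom_node *node);     /* "virtual" method slots */
    void (*destroy)(dom_node *node);
} dom_node_vtable;

struct dom_node {
    const dom_node_vtable *vtable;               /* every object carries a vtable pointer */
    char *name;                                  /* "protected" only by convention */
};

static const char *element_get_name(dom_node *node) { return node->name; }
static void element_destroy(dom_node *node) { free(node->name); free(node); }

static const dom_node_vtable element_vtable = { element_get_name, element_destroy };

int main(void)
{
    dom_node *n = malloc(sizeof *n);
    n->vtable = &element_vtable;
    n->name = strdup("div");
    /* Each "method" call is an indirect jump through the vtable; when each
     * "class" lives in its own translation unit, none of this can be inlined
     * without LTO. */
    printf("%s\n", n->vtable->get_name(n));
    n->vtable->destroy(n);
    return 0;
}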
...and despite that, it still manages to feel faster than the mainstream browsers, which says just how much extra overhead those have, and that there is still "more room at the bottom" to trim this one down some more too.
I don't see many people speaking about code they find good, and I'd appreciate more of it. Pointing out bad code is common, but good code?
My comment may seem exaggerated, but it accurately conveys how I felt when I discovered this code. Often, I'm overwhelmed by the code tree of medium to big-sized projects, with many abstractions and complicated folder structures. This is a browser and yet it was easy to figure out where to find the parts I was interested in and to understand what the code did.
Now, one can always find problems and discuss the lack of gotos or arrays, but if as a complete outsider I can navigate and understand the code, and even feel that I could hack it quite easily with no documentation, something must be right with it.
I'm not in any way connected to the project, I don't even use this browser (sadly, it is too impractical).
I was able to download and build the Gtk3 version of this in under 5 minutes. For compactness alone, gets my seal of approval. I'm posting this comment from NetSurf, HN works fine.
I'll try this tomorrow. The old Opera Presto also built in under five minutes and was in general very lean, but crashes on most modern JavaScript-heavy sites. I still sometimes wonder where it would be today if they hadn't abandoned it, or at least properly open-sourced it.
> DNS level adblocking is fantastic, especially since it can be done network wide so easily and used as an extra layer of ad filtering.
It's great, but DNS over HTTPS will end the party soon enough (if I were a smart TV manufacturer, I would be prioritizing adding DNS over HTTPS to the device firmware to subvert network blocks).
I do not understand. I run my own DNS, but I also sometimes gather DNS data in bulk over DNS over HTTPS or DNS over TLS. I retrieve the data outside of the browser and put it into my own zone files. Are you saying that applications and devices will make it practically impossible for the user to change DNS settings to point to localhost or RFC1918-bound DNS servers? How would they be able to do that, assuming the user controls the first upstream router? Even if they could do this, it seems a bit too heavy-handed.
Much easier, I would think, for application developers to just make an ad-blocking extension, e.g., uMatrix, stop working. For example, they could say this is because the application now has its own built-in ad blocker. Never mind that the developers are paid from the sale of web advertising services.
I can't vouch for these since I haven't tried them yet, but it can apparently also be complemented by configuring your local DNS server to return NXDOMAIN for use-application-dns.net [1] and using a DoH proxy to protect upstream requests from snooping [2].
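For the canary-domain part, a minimal sketch assuming Unbound as the local resolver (again, I haven't tested this; the config path and service name vary by distro):

# Assumption: Unbound is the local resolver; adjust the config path for your distro.
cat > /etc/unbound/unbound.conf.d/doh-canary.conf <<'EOF'
server:
    # Firefox checks this canary domain; an NXDOMAIN answer tells it to keep
    # using the system resolver instead of enabling its default DoH provider.
    local-zone: "use-application-dns.net." always_nxdomain
EOF
unbound-checkconf && systemctl restart unbound

Note this only affects Firefox's automatic DoH rollout; an application that hard-codes its own DoH endpoint would have to be blocked at the IP or SNI level instead.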
Could you give a working example, i.e., a website where you are doing cosmetic filtering or blocking specific files?
I also use a local proxy in addition to DNS which allows me to serve alternative resources or block/redirect certain URLs based on prefix/suffix/regex.
Not OP, but where I do cosmetic filtering is on Stack Overflow. They display "hot network questions" on every page, with extremely interesting stuff from the non-work Stack Overflow clones. "How many cats did Cmdr. Data have in Star Trek", that sort of stuff.
It has made me lose my focus on work repeatedly, and Stack Overflow really is a site related to work for me. So I block that column.
Thanks for that. Now I can definitely see the usefulness of this for interactive website use.
I am more of a non-interactive user and do not use a graphical, javascript-enabled browser much.
Here is a snippet I used to remove the annoying "hot network questions" from the page:
sed '/./{/div id=\"hot-network-questions/,/<\/ul>/d;}' page.html
Out of curiosity I wanted to see if I could access all these networking questions non-interactively. That is, download all the questions, then download all the answers.
Some years ago, like 10 years or more, I was making some incremental page requests on SO, e.g., something like /q/1, /q/2, ... and I got blocked by their firewall. What amazed me at the time was the block was for many months, it may even have been a year. This is one of the harshest responses to crawling I ever encountered. One of the very few times I have ever been blocked by any site and the only time I ever got blocked for more than a few hours.
Things have definitely changed since then. To get all the networking questions, I pipelined 277 HTTP requests in a single TCP connection. No problems.
Here is how I got the number of pages of networking questions:
y=https://networkengineering.stackexchange.com/questions
x=$(curl $y|sed -n 's/.*page=//;s/\".*//;N;/rel=\"next\"/P;N;')
echo no. of pages: $x
To generate the URLs:
n=1
while [ "$n" -le "$x" ]; do echo "$y?page=$n"; n=$((n+1)); done
I have simple C programs I wrote for HTTP/1.1 pipelining that generate HTTP, filter URLs from HTML and process chunked encoding in the responses.
Fastly is very pipelining friendly. No max-requests=100. Seems to be no limits at all.
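If you want to play with pipelining without writing the C tools, a rough equivalent with openssl s_client looks like the sketch below; the request loop and the sleep are only illustrative (my programs parse the responses and chunked encoding properly instead of sleeping).

h=networkengineering.stackexchange.com
{
for n in 1 2 3; do
printf 'GET /questions?page=%s HTTP/1.1\r\nHost: %s\r\nConnection: keep-alive\r\n\r\n' "$n" "$h"
done
sleep 5   # crude: give the pipelined responses time to arrive before closing
} | openssl s_client -quiet -no_ign_eof -connect "$h":443 -servername "$h" > replies.http

-quiet suppresses the handshake chatter, and -no_ign_eof makes s_client close the connection once stdin is exhausted (after the sleep), so the command terminates on its own.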
There were 13,834 networking questions in total.
Wondering just how many requests Fastly would allow in one shot, I tried pipelining all 13,834 in a single TCP connection. It just kept going, no problems. Eventually I lost the connection but I think the issue was on my end, not theirs. At that point I had received 6,837 first pages of answers. 211MB of gzipped HTML.
So, it is quite easy these days to get SO content non-interactively.
It was also easy to split the incoming HTML into separate files, e.g., into a directory that I could then browse with a web browser.
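One rough way to do the split, if the bodies are already de-chunked and concatenated into one file, is to cut on each DOCTYPE with csplit. It is not exactly what I did, but it gives the idea:

# cut the concatenated HTML at each DOCTYPE into q0000.html, q0001.html, ...
csplit -s -z -f q -b '%04d.html' all-pages.html '/<!DOCTYPE/' '{*}'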
From what I have read of his philosophies I can imagine the reasons Stallman might not use a browser -- though I think that is an oft-repeated, old, unsubstantiated rumour, and I would bet he uses one. The reasons I prefer the command line to an over-sized, slow, graphical program are different: I was introduced to computers in the VAX era, not the Javascript era.
Curious: if those SO mirrors did not show cruft like "Hot Network Questions" on every page, would you use them instead of relying on an ad blocker?
At the moment, I completely ignore the SO mirrors as my instinct is to go to the original source. I'll start paying attention, they might actually be the better website.
In the stats: network filters are URL-based, not just domain-based. The lists are easy to view from the uBlock settings page if you want an endless supply of examples. They are used in pretty much every style of list: ad, privacy, annoyances, cookie banners, tracking.
I certainly block plenty of Google-controlled domains. I normally do not use Google search, and even when I do I never seem to trigger any ads. Maybe I am just not searching for things people want to sell. In the rare event I do trigger an ad, because I am not using a "modern" browser to do searches, the ads are not distracting and I can easily edit them out of the text stream if I want to.
It looks like even with all the more recent nonsense Google inserts into the results it is still easy to just extract the result URLs and leave behind the rest of the crud. If you want to retain the description text it is a little more work.
Interestingly, the /aclk? Ad URLs do not use HTTPS.
Seeing that these Ad URLs are still unobtrusive, I am wondering why anyone would want to remove them from the search results page. For cosmetic reasons?
I prefer searching from the command line. To remove the /aclk? ad URLs I used sed and tr.
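Roughly something like this, though not my exact commands, and the exact patterns depend on the markup Google serves (it changes often, and depending on region you may hit a consent page first): against the basic non-JavaScript results page the organic links show up as /url?q= hrefs and the ads as /aclk? hrefs, so

curl -s -A 'Mozilla/5.0' 'https://www.google.com/search?q=netsurf+browser' \
| tr '"' '\n' \
| sed -n 's|^/url?q=||p' \
| sed 's|&.*||' \
| grep -v '/aclk?'

pulls out the bare result URLs and drops the ad-click ones.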
If anyone here is running Linux and is now researching a "backup plan" in case Firefox pivots, I would highly recommend Epiphany (now called GNOME Web):
https://wiki.gnome.org/Apps/Web
Gnome Web (for Gnome), Falkon (for KDE), and Safari (for macOS) are really underrated browsers. It's true that they all use the WebKit engine, so it doesn't exactly promote a variety of engines, but at least none of them have any "monetization" features built in.
I use my own fork of surf daily, but am having to fallback to mainstream browsers quite often these days as sites stop working on it or it's unusably slow. Slack no longer works, GitHub breaks occasionally, etc.
Is there a well-maintained fork that fixes these issues? The main repo[1] hasn't seen updates in over a year now. I read that development slowed down after the main maintainer left suckless, but I'm hoping the community will pick it up. It's an excellent minimal browser.
FWIW Slack seems to work fine in surf for me. Are you sure you have an up-to-date WebKitGTK on your system? Those kinds of issues are typically fixed there, and not in surf itself. As long as the WebKit API stays backwards-compatible, there's probably not much need for surf to change (other than for new features).
There were massive layoffs today, apparently with discussion of finding better ways to make money. Some are taking it as a sign that Firefox may decide to monetize data (and thus disincentivize the privacy focus internally). That might be what OP is talking about.
I remember looking at this and Dillo a while ago, and while it definitely looked more polished than Dillo, I recall downloading the Windows binary and it crashing the first time I ran it - not a good start. On the other hand, Dillo ran immediately and was at least usable. Maybe it's time to try it again, especially now that there are even fewer independent browsers...
Dillo is faster whilst NetSurf has better CSS support. For text- and table-heavy sites Dillo can be nicer to use (since everything has a uniform look and feel); for sites which rely on CSS for usability then NetSurf is nicer to use (e.g. headers and sidebars packed with nested mouse-over menus, which Dillo and text-mode browsers tend to render as a gigantic column of cruft above the actual content).
The Windows binary doesn't crash anymore, I have it installed and use it occasionally. I also use it on my very low end Linux laptop and it works wonderfully for anything that doesn't require JavaScript.
Edit: just a warning, the Windows version has problems with the <select> element, it won't display the options when you click it.
You have discovered websites that don't demand JS! Seriously though, I have JS disabled (a Firefox extension to turn it back on per site) and the amount of broken or blank pages is saddening.
Also might be of interest that this can be run in a framebuffer. Sometimes I work in a virtual console to avoid distractions but when Emacs’ built in web browser doesn’t serve a purpose, I’ll open NetSurf in another console, and it’s fine for almost all “search for errors or docs” workflows.
It does not even pass the ACID2 test. It is however the best independently developed browser engine out there. If you want an alternative to Webkit and whatever Firefox uses, Netsurf is the only viable option at the moment.
Basically, it's the only browser out there even trying to handle modern sites that's not mostly funded by Google or Apple (Firefox revenue is mostly coming from Google, and there's Safari), and which runs on rarer platforms.
It's still relatively big, because web standards are incredibly complicated, but it's almost tractable.
If you care about browser diversity, it's important.
There are a few others but all with their own caveats -
https://www.ekioh.com/flow-browser/ is fully independent and can run Gmail, but it's commercial / non-FOSS. It's funded by contracts in the set-top-box industry.
https://sciter.com/ is non-FOSS with a free edition but you shouldn't run untrusted javascript on it
And there are some lightly-maintained forks of the leaked Presto 12.x engine from before Opera became a Chromium derivative, but of course it's copyright infringement.
I would love it if Opera open sourced Presto. It's unfortunate that the lack of resources that pushed them to Chromium probably translates to a lack of the resources needed to open source Presto properly, so I doubt we'll ever see it. : \
The general impression I get is that it is not bad when it comes to HTML and CSS but not even close for Javascript (disabled by default because of minimal support apparently), so comparing this with Chrome is like apples to oranges, given the way the modern Web is going.
As a counterpoint, when you come across a site which should but doesn't work in one of these minimal browsers, maybe the blame should be put on the site for using needless complexity instead of the browser for not implementing it.
So obviously caters to what the web once was and should have stayed, not to the metastasizing idiocy it has become.
I'm ancient enough to remember when mainstream browsers came in packages of a few megabytes and got us through the day just fine with a featureset less than (or at the very most equal to) the current state of NetSurf. In a somewhat more rational world there would be massive popular pressure for this to remain the case.
Yes, sites break in NetSurf. This is squarely their own fault for not providing civilized degradation. Although it certainly wouldn't go amiss if the NetSurf engine incorporated the worthwhile elements of CSS 3, basically meaning Grid and possibly Flexbox.
Not sure when CSS variables were added, but I'd consider them more friendly to user customization than preprocessors that inline the same font stack 20 times in autogenerated class names (looking at medium.com), requiring you to override the font stack 20 times.
It is a very good idea, but it would need better customization, and perhaps better keyboard commands, Xaw widgets, bitmap fonts, etc. It should also support specifying pipes and such for files, the way Heirloom-mailx does. What we need is a browser that gives you enough rope to hang yourself, and a few more just in case, and NetSurf may be the one to base it on. Make it for the user first; assume any file received is potentially hostile, whether or not the connection is secure (a secure connection merely prevents spies from tampering with it; it does not prevent the server operator from tampering with it!). Support scripts that the user can replace with their own implementation; this would ensure that the user can alter any document received, and it would even be more efficient, since the user's implementation need not be written in an interpreted programming language and need not be sandboxed as heavily either.
I've played around with NetSurf a little for fun recently. It performed surprisingly well on a few of the sites I use: old.reddit and Hacker News both rendered OK with some issues, and Miniflux, which I use for RSS reading, worked well too.
I've been using NetSurf for a long time now on my Thinkpad when I'm traveling. If I only need to check webmail and read articles NetSurf suffices perfectly, without draining as much battery power as a regular browser.
I noticed that in the screenshots section[1], the Linux screenshots for the Netsurf site and Wikipedia are from this year, but the BBC one is from 2011. How does the current BBC site fare?
For networking on MS-DOS I recommend mTCP, although it's C++.
The biggest problem, however, will be compiling the code. The only viable compiler for MS-DOS is OpenWatcom, so the question is whether the code uses any fancy extensions or GNUisms. Also, the only version of OpenWatcom that didn't produce broken binaries is the Windows version (or at least that was the case about two years ago).
I really recommend having a look at the code.