I highly recommend using uMatrix[1][2] if you're very privacy-conscious. It's the full-blown everything-at-your-fingertips console.
By default, it blocks third-party scripts/cookies/XHRs/frames (with an additional explicit blacklist). You then manually whitelist on a matrix which types of requests from which domains you want to allow. Your preferences are saved.
It is a bit annoying the first time you visit any new domain, because you need to go through a bootstrapping whitelist process to make it work. After a while I find I do it almost automatically though.
I use it in conjunction with uBlock Origin and Disconnect, and it still catches the vast majority of things. As a nice side-effect, I find I keep pretty up-to-date with new SaaS companies coming out!
Any browser plugin is inferior to using a hosts file. Hosts files blackhole any network request before a connection is even attempted. These browser plugins only help if you're using the specific browser — they aren't going to help that Electron/desktop app that's phoning home. They won't help block inline media links (Messages on a Mac pre-rendering links) that show up in your chat programs and attempt to resolve to Facebook. They also won't block any software dependency library that you install without properly checking if it's got some social media tracking engine built in.
I don't even waste time or CPU cycles on browser-based blocking applications. Steven Black's[1] maintained hosts files are the best for blocking adware, malware, fake news, gambling, porn and social media outlets.
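For illustration, the entries in lists like that look roughly as follows (0.0.0.0 as the blackhole address; the domains here are just examples):

    # /etc/hosts (C:\Windows\System32\drivers\etc\hosts on Windows)
    0.0.0.0 facebook.com
    0.0.0.0 www.facebook.com
    0.0.0.0 graph.facebook.com
    0.0.0.0 doubleclick.net

Anything that resolves names through the system resolver gets 0.0.0.0 back and never even opens a connection.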
That doesn't stop anyone using IP addresses directly, and I find that a small minority does (a minority that includes, e.g., Microsoft for some Win10 updates).
Depending on your threat model, you might need to go the proxy/firewall route.
A hosts file is a weird middle ground: it has to be installed and maintained on every device, and many (e.g. iPhone/iPad) won't let you do that. It's better to set up a local DNS, which will serve every local machine; and as I mentioned, doing this at the firewall level is better yet.
I started using the Brave browser, which I noticed blocks a lot of network requests. Loading the NYT took about 25 seconds to finish all requests in Chrome; with Brave it took 4 seconds. Brave also gives you the option to pay the sites you visit with BAT tokens, if you want. Brave in conjunction with Pi-hole looks even more secure, and may speed up page loads further.
That's mostly what I use at home too. It works very well.
It doesn't work quite as well on the road, though. For those cases, I have a Docker container running diginc/pi-hole (with some additional hosts file blocking going on), then I point my laptop's DNS towards that and am good to go.
Running your own DNS server, and intercepting DNS traffic on your router, is better for two reasons:
1) processes and machines that bypass hosts are also caught
2) a large hosts file takes time to parse, line by line, slowing every DNS lookup. A DNS server can cache the whole thing and do much better.
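As a rough sketch of that approach with dnsmasq (the blocked domains are placeholders), one address line covers every name under a domain, and answers get cached:

    # /etc/dnsmasq.conf
    address=/facebook.com/0.0.0.0      # the whole domain tree resolves to 0.0.0.0
    address=/doubleclick.net/0.0.0.0
    cache-size=10000                   # repeat lookups come from cache, no re-parsing

One address= line replaces what would be dozens of individual hosts entries.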
I have scripts and compiled utilities that transform HOSTS files to tinydns format. I use NSD as well, which uses the BIND zone format. I prefer the tinydns format.
Zone files are more flexible than HOSTS files, but I still use HOSTS as well. I have never had a concern about the speed of using HOSTS. It is certainly faster than DNS.
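For a rough idea of such a transform (a sketch, not my actual scripts), hosts lines map onto tinydns +fqdn:ip:ttl records with a one-liner like:

    # "0.0.0.0 ads.example.com" -> "+ads.example.com:0.0.0.0:86400"
    awk '!/^#/ && NF==2 { printf "+%s:%s:86400\n", $2, $1 }' /etc/hosts > data

tinydns then compiles the data file into its constant-database format, so lookups avoid the linear scan a big HOSTS file implies.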
There is a comment in this thread where someone asserts that HOSTS was never designed to handle "millions of entries".
I would be interested in reading about a user who visits "millions" of websites or otherwise needs to do lookups on "millions" of domains.
I maintain a database of every IP address I am likely to use on a repeated basis, in several formats, with kdb+ as the "master" format. I believe most users will never come close to intentionally accessing "millions" of IP addresses in their entire lifetime (FN1). I could be mistaken and it would be interesting to learn of a user who can dispel this belief.
FN1. If you think about this, it may cause you to question the necessity of DNS for such users. Or not. Times have changed since the advent of HOSTS. They have also changed since the advent of DNS. For example, using "consumer" hardware, I can fit all the major ICANN TLD zones on external storage and the entire com zone (IMO, by far the most important) in memory. This is many, many more domains than I will ever need to look up. Assuming at best I will not live much longer than 100 years, I could not and will not explore them all or even a significant fraction.
If you know which DNS names you will need to know, then yes, there's no need for more than a hosts file.
Until DNS changes.
If I move a server from one IP to another, I change DNS, and in $TTL time everyone's pointing at the new server. Apart from you with a hosts file. How does that work if everyone has a hosts file?
If I say "check out this interesting story on blahblah.com", you don't have it in your hosts file, how do you get it?
I maintain a list of every phone number I am likely to use on a repeated basis, but sometimes I need to look up a phone number I don't know. (In the old days this was a phone book locally, and directory inquiries further afield. Now it's DDG and assuming they have a website. Which isn't in my hosts file or DNS cache, since I've never visited it before.)
I maintain DNS entries for my home network of a dozen devices -- I host it on my Mikrotik, but it's handy to have when I type "ssh laptop" rather than remembering whether it's on .71 or .73. It's one step better than a plain text file, as there's a standards-based way to remotely query it. At work I maintain a DNS server with 2000 entries on my network, which is actually hosts-file powered, but again I use dnsmasq for the DNS server rather than rsyncing that hosts file to 2000 machines.
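The dnsmasq side of that is just a couple of lines (a sketch; the paths and the .lan suffix are hypothetical):

    # /etc/dnsmasq.conf on the DNS host
    addn-hosts=/etc/dnsmasq/network-hosts   # serve a hosts-format file over DNS
    local=/lan/                             # answer .lan names locally, never forward

Every machine then points its resolver at this one box instead of carrying its own copy of the file.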
"How does that work if everyone has a hosts files?"
In your particular case, I don't know. You have to do what best suits your needs, whatever they are.
Here is how someone else solves the problem of changing IP addresses. For my needs, I actually like this method.
The entire ICANN DNS used to be bootstrapped to a small text file called "root.hints", db.cache, named.root, named.cache, or something else. As far as I know, it still is.
How does one know the IP address from which to retrieve this text file?
Maybe they have it memorized, or written down somewhere, or perhaps it is written into some DNS software default configuration. In all cases, they have this address stored locally.
No remote lookup.
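For reference, the file is just name-to-address glue; the real thing begins roughly like this (that address for a.root-servers.net is the genuine one):

    ; named.root / root.hints excerpt
    .                       3600000   NS   A.ROOT-SERVERS.NET.
    A.ROOT-SERVERS.NET.     3600000   A    198.41.0.4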
What happens when the administrator of the server that publishes the text file wants to change IP addresses?
This does not happen very often, but it does happen. What do they do? Considering that the entire ICANN DNS was bootstrapped to this one file, and assuming this is truly meant to be a dynamic system, this is arguably the most important IP address on the internet.
They notify users in advance that the IP address is going to change.
That's it.
As a www user, of course I would have to do a DNS lookup for blahblah.com. However I do not do lookups for the server with db.cache, for the .com nameservers, and in most cases not for the nameserver for blahblah.com either, and I do not do lookups using recursive caches. If blahblah.com changes its IP address I do not have to wait for changes to propagate through the system via TTLs. I am querying the authoritative nameserver, RD bit unset. If an IP address changes from the one I have stored, I know immediately when I try to access it. (I like being aware of these changes.) If I were relying on a recursive cache I would probably not notice that the IP address had changed.
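As an illustration with dig (the nameserver and domain are hypothetical), +norecurse leaves the RD bit unset, so the server answers from its own zone or not at all:

    # query blahblah.com's authoritative nameserver directly, recursion disabled
    dig +norecurse @ns1.blahblah.com blahblah.com A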
IME, IP address changes happen less frequently than people writing about DNS on the web would have one believe. Hence this system works well for me. Most domain names I encounter keep the same IP address for long periods.
Ideally, if blahblah.com is not changing IP addresses frequently or unexpectedly but needs to make a change, she could publish a notice somewhere on her web server informing users she will be moving to a new IP address, just like the server that serves db.cache.
That's also a viable solution, although slightly more complicated than adding a line to a file. A question I have is whether you're talking about running your own DNS on your machine or on a server you control?
One benefit of the hosts file is that it travels with you everywhere you go. I have my DNS configured at home, but I rely on my hosts file when I'm at a coffee shop, on a plane, on a work trip, or on vacation.
I work around this with an IPsec VPN to home. DNS is set up on my router with Unbound; I just point to that. When at a coffee shop or on any untrusted network, I VPN into home.
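The blocking half of that, as a minimal Unbound sketch (the domain is a placeholder):

    # unbound.conf
    server:
      local-zone: "facebook.com." always_nxdomain   # NXDOMAIN for the whole subtree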
He refers to a setup where egress DNS traffic is routed to a local DNS server. Thus, regardless of machine configuration, all machines use the local DNS server.
A convenient GUI app for managing the hosts file on macOS machines is Gas Mask, btw. You can have a local hosts file to block your pet peeves, then subscribe to a remote hosts file (such as the one linked above), and activate the combined hosts file.
Windows 8-10 users who use Windows Defender will notice that some hosts file entries are ignored (for popular domains such as facebook.com). You will also need to add an exception for the hosts file in Windows Defender.
I wonder what their rationale for this is. I know in the past malware has modified the hosts file to block malware-removal-tool domains, but why ignore entries for Facebook?
I heard that blackholing requests to Microsoft telemetry URLs also has no effect. Any way of finding the unblockable list, I wonder.
More likely they just bypass looking at the local hosts file for such names, so the request always goes out to your DNS servers.
Therefore blocking these names by redirecting to 127.0.0.1 will work if done at your DNS server (for instance if you run an instance of https://pi-hole.net/ for that).
Unless of course they make the lookup use specific name servers that they run for those names, instead of the local resolvers that your machine is configured to look at, but that is less likely.
In that last case, you can often redirect those queries on your router if they are standard DNS requests. That's how my local network is configured -- all DNS requests are sent to my Pi-hole instance, except those coming from the Pi-hole itself. Even a device with a hard-coded resolver like 8.8.8.8 ends up talking to the Pi-hole.
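On a Linux-based router the redirect itself is a couple of NAT rules, roughly like this (the Pi-hole address and LAN interface are placeholders):

    # rewrite all outbound DNS to the Pi-hole, except the Pi-hole's own queries
    iptables -t nat -A PREROUTING -i br-lan -p udp --dport 53 \
      ! -s 192.168.1.2 -j DNAT --to-destination 192.168.1.2:53
    iptables -t nat -A PREROUTING -i br-lan -p tcp --dport 53 \
      ! -s 192.168.1.2 -j DNAT --to-destination 192.168.1.2:53

Excluding the Pi-hole's own source address is what keeps its upstream queries from looping back into itself.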
See my reply to TAForObvReasons - if you invest in a Safari Content Blocker that allows custom rules, you can effectively do the same thing as a hosts file. Safari Content Blockers prevent network traffic similarly to how hosts files work. The only thing you won't be able to block is an app that has a Facebook analytics or Facebook login dependency inside, e.g. Spotify.
This works great for keeping spam off your devices -- off your local network at any rate. It's not possible to modify the iOS hosts file without jailbreaking.
On an iPhone without jailbreak you can use 1blocker[1]. Since every browser on the iPhone is basically a UIWebView/SFSafariViewController controlled by iOS, Safari Content Blockers[2] apply globally preventing web visits. Safari Content Blockers also prevent Messages from rendering Facebook content inline.
My 1blocker rule called "Bye Facebook" is:
https?://([a-z\.]*)?facebook\.com.*
I should probably update it to factor in a lot of these other TLD URLs now that I think about it.
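Something like the following, for instance (fb.com and fbcdn.net are real Facebook-owned domains, but this is a sketch; check your own logs for the full set):

    https?://([a-z\.]*)?(facebook\.com|fb\.com|fbcdn\.net).*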
> Since every browser on the iPhone is basically a UIWebView/SFSafariViewController controlled by iOS
Most browsers rely on WKWebView, which uses the Safari Nitro JavaScript engine but allows customization of the user interface as well. UIWebView is pretty much legacy at this point, and SFSafariViewController does not allow any customization beyond basic theming.
> Safari Content Blockers apply globally preventing web visits
Unfortunately, this is not true. They only work in Safari and Safari view controllers.
> Safari Content Blockers also prevent Messages from rendering Facebook content inline
Are you sure about this? As far as I know, Messages uses a web view.
Actually I just double-checked and I was incorrect. I just remembered that I have another app on my phone called AdBlock[1] which is responsible for blocking requests at the network layer. They run their own DNS server and create a custom list to black-hole network requests that match certain formats. If you add Facebook as a custom rule to AdBlock, that will prevent Messages from pre-rendering content and also block any requests to Facebook from any service on your phone, as long as you're connected to their VPN.
Sorry about the confusion... I'm really doing a lot to keep myself off of Facebook's radar.
I do not believe this is true. In Messages, previews need approval the first time a domain loads a preview, and then this setting is stored. AFAIK there is no way to revoke the permission, though.
> they aren't going to help that electron/desktop app that's phoning home.
What's your threat model? Mine is third-party tracking cookies, and desktop apps don't share my browser's cookie jar. So while technically I can be tracked by IP from a desktop app, Facebook can't tell if it's me or someone else at the same coffee shop.
In particular, one nice thing about Chrome extensions is that they don't apply to incognito windows. I regularly use HTTPS Everywhere in block-all-HTTP-requests mode + an incognito window on wifi connections I don't trust, because the incognito window will permit plaintext requests, but it doesn't read my cookies or write to my cache, so it's sandboxed from my actual web browsing. I can safely read some random website that doesn't support HTTPS with my only concern being my own eyes reading a compromised page; none of my logged-in sessions are at risk.
> any software dependency library that you install without properly checking if it's got some social media tracking engine built in.
... is this a thing? (I totally believe that it's becoming a thing, I just haven't seen it yet and am morbidly curious.)
That works only with JavaScript active, which uMatrix blocks for third parties. The sites one visits are mostly not known for first-party fingerprinting (that's mainly done by the ad networks). The extra-paranoid (like me) can also block JS for certain first-party sites.
I use uMatrix only experimentally (I rely on NoScript), but it offers fascinating flexibility of control if one is in the mood. As well, NoScript is near useless when doing stuff with AWS, where uMatrix offers the right flexibility (allow from site Y, but only when fetched from site X).
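For the curious, a scoped rule in uMatrix's syntax looks like this (hostnames are illustrative); the first column is the site you are on, so the rule applies nowhere else:

    console.aws.amazon.com amazonaws.com script allow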
While I acknowledge that your use case may be confined to browsing the internet, I still don't see what prevents a desktop app from reading your cookie jar.
Edit: your browser history (which may contain your profile URI) might be pretty out in the open, too.
Oh, yes, none of it is sandboxed from an actively malicious app—but an actively malicious app can just ignore your hosts file, too.
My threat model is a developer who includes a standard tracking snippet from a third party but is not going out of their way to reliably violate my privacy at all costs (because they have other features to ship, and the tracking snippet works on most computers). If your threat model includes actively malicious developers, stop running native apps from them at all.
Just browsing through the "fake news" section, that hosts file is ridiculous. There's a tremendous number of completely legitimate news sources that are blocked, and many that, while lurid, are not in any sense "fake news." The list includes both liberal and conservative legitimate news sites.
uBlock has much the same effect. Requests fail in the console with "ERR_BLOCKED_BY_CLIENT" instead of a 404, so technically it is possible to detect this with a first-party JavaScript file.
Block all accesses to non-local (192.168.x.x, 10.x.x.x, etc.) IP-based URLs. It will break some legit stuff, but not that much for most users, and they can just whitelist things as they come up.
IP whitelists could also be aggregated and shared on github similar to the current DNS blacklists.
If you're using uBlock Origin or uMatrix to block third-party scripts, it won't matter. That's what makes the dynamic filtering in uBlock Origin so powerful. Easily the best bang for your buck in ad blocking.
HOSTS files are static. They were never designed for blocking ads or tracking. And for all we know, every connection does a linear search through the HOSTS file, so the larger it gets, the more time is wasted, because it was never designed to have millions of entries.
To add to this, I've seen that stupid hosts-file blacklist from SpyBot cause some Windows network service to lock up for 40 seconds every time the laptop resumed from suspend or booted up in Windows 7. Parsing the hosts file took a relatively extreme amount of time for exactly this reason; massive hosts files are a kludge at best.
I swear by uBlock/uMatrix, but it's amazing how much of the web it breaks and how little of the content of some sites is hosted by the site itself. The web has become very reliant on CDNs.
My public broadcaster (http://www.abc.net.au/news/) for instance is completely reliant on third parties for its "live story" functionality. It loses half its functionality at work, where Twitter is blocked, and uBlock kills the other half. It also kills the live stream when it can't load one of the half-dozen trackers on the page.
I'd love the NoScript functionality to be merged in too, so that I could turn off JavaScript by default.
You can turn on Disconnect's blocklists in uBlock Origin rather than run both. uBlock Origin comes with quite a few lists, but most of them are turned off by default.
I default deny all 3rd party scripts and frames, in addition to the blocklists, and I only sparingly noop relevant domains, the bare minimum to make pages work, on a page-by-page basis.
On top of that, I have Privacy Badger, Cookie AutoDelete, and Decentraleyes, and I've turned on first-party isolation.
It's mostly unobtrusive once my most important websites have been properly noop'ed, and it's relatively simple to add temporary exceptions if needed.
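In uBlock Origin's dynamic filtering syntax that setup boils down to a few rules (the noop'ed hostnames are placeholders):

    * * 3p-script block
    * * 3p-frame block
    example.com cdn.example.com * noop

The first two lines are the global default-deny; the noop lines are the sparing per-site exceptions.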
I'd prefer a solution that does not just work for a specific browser, but instead blocks all traffic regardless of browser, application, virtual machine, ...
This will only protect you while you're on your own network. A lot of the juiciest data is about your public location; for that you need something device- or browser-specific.
There's nothing (except possibly your ISP) stopping you from opening your firewall and using it remotely. I personally run dnsmasq (manually configured, but otherwise similar to Pi-hole) on a VPS.
> There's nothing (except possibly your ISP) stopping you from opening your firewall and using it remotely.
My ISP won't, but there are ways around that. The biggest problem I've faced is on the modem side of things: finding something I'd trust to be open to the internet, ideally something I can install OpenWrt or similar on, and something I know will work in my market. It's an options minefield.
I've got a Raspberry Pi Zero (WiFi via USB... ugh). Would that be too slow for DNS, or would having my DNS server be local vs. remote negate that slow interface?
I use Little Snitch[1] (and its sibling Micro Snitch[2]) for filtering connections at the system level. I don't interact with it too often though, because I rarely install new apps.
Not to say /etc/hosts doesn't work; these days I just find I prefer things with better UX.
To clarify, I whitelist my browser entirely in Little Snitch and delegate to uMatrix and other extensions.
I also don't pre-emptively load rules into Little Snitch - I have it running in active/interrupt mode, so it prompts me whenever an app tries to make a new connection I haven't signed off on before. Unsurprisingly, not very many apps try to connect to Facebook.
Because it is completely impractical. I used LS, but it's a waste of time to check and block ad servers or malicious domains manually, which is why most garbage should be blocked from hosts or dnsmasq.
The maintenance aspect of LS is definitely on the high side, and only really dedicated folks will stick with it; if it came with auto-updated, maintained lists, it would most likely be used more.
Little Snitch is for macOS.
As a Linux user I desperately looked for an equivalent and found none.
Douane was suggested. It's no good.
What a sorry state of affairs. We need a simple app-level filtering solution.
Same story. I have always been dreaming of a Linux equivalent of Little Snitch. More than a decade has passed since I switched to Linux, and still nothing...
Even better would be doing it on a device. It's a reason to have an intelligent router on your network where you run a custom dnsmasq or whatever, then you cover your phones and all the hootenanny that comes with a digital life. Like your fridge.
Same here. I have zero desire to do any sort of manual configuration every time I visit a new website. Blocking third-party cookies eliminates something like 90% of tracking, and uBlock Origin and Privacy Badger handle a significant percentage of the rest.
Using uMatrix together with uBlock Origin is a little bit redundant, as uBlock also offers matrix-style functionality (enable advanced options). As a matter of fact, IIRC uBlock is developed upon uMatrix's codebase by the same author.
I tried uMatrix several times and quit. It's crazy heavy. It takes forever to whitelist parts of a page until it's functional: scripts, then XHRs, then more cascaded scripts, then frames...
A hidden beauty of uMatrix is that, with a little training, it doubles as a reminder against low-value search results -- content scrapers, marketers you've already decided never to buy from again. Just red-out the first-party web site. Search engines that can't keep up with the SEO optimizers (e.g. DuckDuckGo) become more usable.
I second this. uMatrix is my go-to add-on to prevent the vast majority of connections I do not want. However, I pretty much always use my browser in privacy mode no matter what (for several reasons), so the bootstrapping process never ends for me, but that's a price I am more than willing to pay.
I don't know enough about how different profiles operate to say with any certainty, unfortunately. So it's possible there is no benefit. However, for mobile there does appear to be at least one benefit. When I recently checked to see what data Google had on me using their free tool, I saw that the only web history they had associated with my device was from URLs visited while not in privacy mode. These were all links that I clicked from text messages, which always default to open in regular, non-private tabs, even if you only have incognito tabs open. That right there is enough for me to always and forever use nothing but privacy mode. For the PC this is a non-issue, as I never log into my browser, although I do use privacy mode there nearly 100% of the time as well.
I'm continuing to have issues with frames not displaying unless I completely disable the addon. It's extremely frustrating. I'm in advanced mode and all options are as they should be.
One minor feature I miss from NoScript is that (unless I've missed a setting) uMatrix can't block a site's scripts but allow bookmarklets. Though with the new extension API, I don't know if that's possible at all.
---
[1] https://chrome.google.com/webstore/detail/umatrix/ogfcmafjal...
[2] https://addons.mozilla.org/en-US/firefox/addon/umatrix/