You're within your rights to create and offer whatever kind of service you want. As an end-user, however, any data whatsoever sent to a 3rd party without my knowledge or consent is too much. There is no such thing as "the right amount."
I'm OK with websites using self-hosted tools such as Matomo as long as the data never leaves their servers. Analytics is important to any business. But I choose to do business with said business, not with Shopify, not with Google, not with Facebook or Twitter (I'm looking at those "sign in with" widgets that run social media code in my browser) or whatever 3rd party "SaaS" service the website is outsourcing my data to for ease of development or convenience. I don't consent to my data being shared with people I don't know about and did not consent to give a single shred of my information to.
This seems very impractical given the way the internet currently works. Most startups use dozens of SaaS products, let alone more basic/foundational things like global CDNs. You're being logged at every step of the process if only to prevent spam/DDoS/etc.
What you're asking for would require a fundamental restructuring of the internet, and of software business models, and a lot of other stuff. I can't see that happening any time soon.
In the meantime you can try using Tor, but good luck not getting blocked on half the websites you want to visit - and you can't blame the website for that (they need DDoS/spam defence).
Not only the internet, this is impractical given how any business works. Even a brick and mortar store is sharing aggregate customer buying habits with its supplier based upon products it buys from them.
When I visit a website of some business, I provide them with an IP address for use during the session (because of the way TCP/IP works). I'm okay with said site using some kind of load-balancer, DDoS protection or what not, as long as the business takes full responsibility to keep my personal information private unless I specifically indicate otherwise (opt-in[1]), for example using a form on the landing page. I believe that this is the true intent of the GDPR in this matter.
No, it's not. Using Matomo on my own servers has nothing to do with the way GA etc. operates - it's an equivalent of going through my own Nginx logs and parsing them to generate diagrams and so on. Of course if I share personally identifiable data with a third party, it's a completely different thing - in this case it does not matter if it comes from Matomo or web server logs.
But I agree with your conclusion: what matters is how it's being used. In this case - whether you share/sell it to others or not.*
[*] But not only: it also matters if you take adequate care in protecting personally identifiable information or not.
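To make the Nginx comparison concrete, here is a minimal sketch of what "going through my own logs" could look like. It assumes the default Nginx "combined" log format; the `page_views` helper and the sample lines are made up for illustration and are not how Matomo itself works.

```python
import re
from collections import Counter

# Nginx "combined" format: IP, identity, user, [time], "request", status, bytes, ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def page_views(lines):
    """Count successful GET requests per path; IP addresses are never stored."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("method") == "GET" and m.group("status").startswith("2"):
            # Drop the query string so /a?x=1 and /a are counted together.
            counts[m.group("path").split("?", 1)[0]] += 1
    return counts

# Hypothetical sample lines (documentation-range IPs), for illustration only.
SAMPLE = [
    '203.0.113.7 - - [10/Jan/2022:12:00:01 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Jan/2022:12:00:05 +0000] "GET /pricing?ref=hn HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '198.51.100.2 - - [10/Jan/2022:12:01:00 +0000] "GET /missing HTTP/1.1" 404 0 "-" "curl/7.79"',
]

if __name__ == "__main__":
    for path, hits in page_views(SAMPLE).most_common():
        print(path, hits)
```

Everything stays on the machine that already had the logs. Whether the output counts as personal data then depends on what you keep, which is the point about usage made above.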
In general, under the GDPR it doesn't matter much whether you process data yourself on your own server or contract that same task out to a third-party. Either that processing is legal or it isn't - ownership of the server doesn't play a role.
The problem with Google Analytics here is not that it's a third-party but that it's under US control.
> we want to load JS from a CDN like literally everyone does
Well, carry on and load it, it's your server.
Oh, wait, you mean you want ME to load it, into MY browser? That's a problem - my browser only loads JS from the origin server, and only if I give it explicit permission.
As a developer, I deplore the use of CDNs to serve JavaScript libraries; you don't know what the CDN is going to serve to your users, and it could change without warning and break your site.
You’re just illustrating why this isn’t an issue requiring legislation - anyone can block requests to whatever origin they like. No need for heavy handed gov’t getting involved in technical matters.
So maybe the legislation should be that you have to pass an "internet operator" test to get a license that ensures you have the awareness and the skill. Because even if the current law protects you from GA, there are tons of other companies doing the same things that have no intention of stopping.
Better to protect the people from all the bad companies, not just the ones who do business in the EU, right?
Sounds like protecting the people by leaving it to them, and (somehow) restricting their internet access if they haven't passed a course in internet jiu-jitsu.
And no: the GDPR isn't just about GA, and it isn't just about the internet; it's about any personal information.
Ad-blockers and JS-blockers are essentially technical solutions; but you have to know to install them. If they were integrated into browsers (and defaulted to "on"), that would make privacy less of a technical matter.
Maybe because the two crimes here are (1) breaking and entering (you have to actually break something) and (2) theft. If the window isn't locked, then you don't have to break in; you can just open the window.
It's not against the law to just walk in; or rather, it's the civil offence of trespass - you can sue the trespasser for damages, e.g. causing wear on your expensive carpet (but you'd have to produce evidence of monetary damages). And you can physically remove them, perhaps with the help of a bailiff. But the police won't help with common trespass - it's not a crime.
[Edit] At least, that's how I understand the law here. IANAL.
> The internet is just not designed for privacy at a technical level.
The Internet is A-Ok.
The issue lies with various slimy companies that exploit web developers' ignorance, laziness and negligence with free and easy shortcuts in exchange for the private data of said developers' clients.
No one's forcing you to use CDNs in place of properly set-up caching. No one's stuffing Google Fonts down your designer's throat; they are just too lazy to add local resources. An analytics service is not required, and there are simple self-hosted options. And so on and so forth.
And the most infuriating part is that these companies, Google being the offender, know perfectly well that they are exploiting the ignorance and they are willingly facilitating and encouraging the spread of practices that would've been viewed as wildly unethical not 10-15 years ago.
Just look at the general erosion of privacy and the nearly universal lack of concern for it in the general population. If you reflect on it for a moment, it is plain fucking scary.
> I'm looking at those "sign in with" widgets that run social media code in my browser
Arguably, they provide code that can be run in your browser, but your browser chooses to run it. And since your browser is a user agent, you choose to run the code by way of installing and configuring a browser that makes that choice by default.
> I'm OK with websites using self-hosted tools such as Matomo as long as the data never leaves their servers.
You might never know that they backfeed data into external analytics services. Under this assumption, wouldn't you need to stop using _any_ website, at all?
It's not an "also" analytics service. It _is_ an analytics service.
If a website popped up a question saying "Do you consent to your visit data being passed to Simple Analytics for processing?", how many people would say Yes? Close to zero. Just look at the stats on 3rd party cookie refusals - when refusing is made easy, the refusal rates are in the high 90s. People may be lazy, but they sure as heck know they don't want to be tracked IF it's actually mentioned.
So what you offer is a GA alternative that makes website operators feel better about themselves for not using GA. The situation for the visitors remains exactly the same - they're still getting shafted with something that none of them wants.
The only way to do analytics in a way that's respectful to the visitors' privacy is with an installable on-host software. That's it.
> The only way to do analytics in a way that's respectful to the visitors' privacy is with an installable on-host software. That's it.
This is an argument taken to a naive extreme. You can't expect every business to also be in the business of analytics, it's not realistic. There's a reason companies have business partners who specialize in certain services.
It's why you have accountants, lawyers, marketers, etc. Not every company can afford to have all these specialists on payroll, so you work with a service provider that lets you afford the services in a fractional way. You give them access to your data, including customer data sometimes, and in return they provide you with insights and information from that data.
Analytics is just another service provider like that.
You should of course work with a reliable and trusted partner that treats your customer data appropriately and has strong privacy guarantees.
The problem with GA is not "third party", it's "third party that uses my data for its own purposes" because that's the actual cost of using a free service.
Saying "no third parties at all" is not how businesses have operated since forever.
Privacy-respecting analytics should be self-hosted. No one's arguing against an average business using an analytics service, but that shouldn't be bundled with any "privacy" monikers.
If Simple Analytics were pitched as "not a Google Analytics", this would've been perfectly fine. But they insist on the privacy angle and it just demonstrates they don't grok what tracking concerns are about.
Oh no I get the context just fine. What you're missing is that "should be self-hosted" is outside the realm of the average business, and it's not realistic to put this as some arbitrary requirement to check the "privacy" box.
You're clearly a tech person so maybe it feels self-evident or easy for you to do that, just like taxes and law seem self-evident to accountants and lawyers, but the average business owner doesn't have time or money - or the skills - to figure all that out on their own, so they hire a service provider.
Do you think accountants and lawyers come to the business and work on their computers exclusively? No, they receive copies of the confidential business data and work on it within their own business environment.
And do you think accountants and lawyers don't include "privacy" in their pitch?
How is that different from analytics saying "we will keep any data you share with us private, and for your use only".
Based on your argument, as a business owner I should purchase and co-locate my own server, because even if I self-hosted my analytics, I'm storing that data on a third party server owned by my hosting provider!
Do accountants and lawyers routinely use or sell their customers' aggregated data for commercial purposes?
Does US law require accountants and lawyers to give the NSA access to their customers' data upon request, with an automatic gag order attached? If it did, would it still be OK for non-American companies to use a US-based accountant or lawyer?
> OP was claiming that any third-party analytics are unacceptable
Don't put words in my mouth. I was not claiming that.
Third-party analytics _that bill themselves "privacy-first"_ are still not what any user would consent to voluntarily, so the "privacy" angle is largely irrelevant. What they should be billing themselves as is "not Google Analytics", which will be factually correct and somewhat relevant.
>> OP was claiming that any third-party analytics are unacceptable
> Don't put words in my mouth. I was not claiming that.
You stated that only self-hosted analytics were acceptable. Your exact words were:
> The only way to do analytics in a way that's respectful to the visitors' privacy is with an installable on-host software. That's it.
This implies - to me - that in your view all third-party analytics are unacceptable from a privacy perspective.
I'm not sure how else I was supposed to parse that statement?
Either way, I disagreed with that, and said it's certainly possible to work with third-party service providers, of many kinds including analytics, while still respecting your customers' privacy.
I think the big difference here is that this platform sells a product to website owners who want to see how their visitors generally behave on their site, e.g. which pages are most popular. That is a legitimate need.
The difference with GA is that GA offers to fill this need of website owners for free while it actually processes and sells the visitors' data for immoral ends. The whole "the customer is the product" deal.
I don't understand why simply sending data from one server to another is seen as such a big deal. The problem with Google and Facebook and the rest is how they build extremely detailed personal profiles that they use to cause social harm. Surely that is very different from tracking which pages get the most views or how much time - on average - people spend on your website?
Did you read their docs? They aren’t setting cookies or collecting IP addresses. There’s no question to me that EU authorities would approve this method.
Visitors' IP addresses are provided to Simple Analytics in the course of loading their script and reporting back the results. That's all it took to get web sites using public Google Fonts resources in trouble—note that this didn't involve any actual analytics scripts or overt data collection, just some embedded CSS and font resources.
The only real advantage Simple Analytics has here is that they aren't Google, so they aren't as much of a political target and don't have deep pockets to attract legal predators on the lookout for an oversize payout—which is a pretty thin justification for treating them any differently.
The regional Google Fonts ruling was an odd one. It had to do with Google processing the IP address, not with whether the website was loading anything from an external domain at all. It did appear to rest on the court's misunderstanding that an IP address contacting a server is itself data processing. Perhaps we're going in that direction and won't be able to use even an extremely privacy-focused CDN without a formal data processing agreement, but that is not currently the intent of the GDPR.
The advantage of a service like Simple Analytics remains: it does not store or process any user data.
> The only way to do analytics in a way that's respectful to the visitors' privacy is with an installable on-host software. That's it.
How is that more respectful? I can fingerprint you pretty much the same way with server logs (IP, user-agent, ...), can't I? I can even use cookies without any JS.
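As a rough illustration of that point, here is a sketch of how little it takes to follow individual visitors using nothing but fields a web server already sees. The `visitor_key` and `sessions_by_visitor` names and the sample requests are hypothetical; real fingerprinting typically combines more signals than IP and user-agent.

```python
import hashlib

def visitor_key(ip: str, user_agent: str) -> str:
    """Derive a stable pseudonymous ID from the IP address and user-agent."""
    digest = hashlib.sha256(f"{ip}|{user_agent}".encode("utf-8")).hexdigest()
    return digest[:16]

def sessions_by_visitor(requests):
    """Group requested paths per visitor; requests are (ip, user_agent, path) tuples."""
    trails = {}
    for ip, user_agent, path in requests:
        trails.setdefault(visitor_key(ip, user_agent), []).append(path)
    return trails

# Hypothetical requests, for illustration only.
REQUESTS = [
    ("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)", "/"),
    ("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)", "/pricing"),
    ("198.51.100.2", "curl/7.79", "/"),
]

if __name__ == "__main__":
    for key, trail in sessions_by_visitor(REQUESTS).items():
        print(key, trail)
```

No JavaScript and no third party is involved, yet each key maps to one visitor's click trail. Self-hosting by itself doesn't prevent this; it comes down to what the operator chooses to keep.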
There is a big difference between "a person's surfing data" and "surfing data of all visitors combined". That's what we promise with Simple Analytics.
[1] https://blog.simpleanalytics.com/why-simple-analytics-is-a-g...
[2] https://docs.simpleanalytics.com/what-we-collect