
Gadgetbridge recently added support for Garmin watches [1]. All data is stored on your Android phone with no internet connectivity required, and you can even export the SQLite DB, so you own your sensor data. The UI isn't as nice as Garmin's, but it does its job.

[1] https://gadgetbridge.org/basics/topics/garmin/


Always cool seeing stuff in this space.

Regarding "zeitgeist", about a year ago I built something similar called https://zeitgaist.ai which also incorporates other sources like Mastodon, Bluesky, some subreddits etc.


Being logged in while making search queries in search engines poses significant privacy risks. The searches can paint a comprehensive profile of the user, and these data often remain stored for extended periods. There's a chance this information might be shared with third parties. Coupled with other user data, these logged-in searches can pave the way for targeted advertising, sophisticated predictive analysis, and potential exploitation by governments or malicious entities. In the event of data breaches, the user's logged-in search histories can be exposed. Furthermore, users typically don't have clear insight into how their data is utilized when logged in.

I hope Kagi introduces an anonymous access feature. For instance, it could incorporate zero-knowledge proofs (ZKPs). These are cryptographic techniques where one party (the prover) can confirm to another (the verifier) that a claim is accurate without disclosing any additional information. This is especially beneficial for authentication scenarios where it's essential to avoid sharing extra details.

To implement zero-knowledge authentication for quota API access:

1. Token Creation:

- Each month, users receive a token tied to their identity and quota.

- The token can be split for use on multiple devices using cryptographic methods.

2. API Access:

- Clients present a zero-knowledge proof (ZKP) to confirm they have a valid token and haven't used up their quota. The server verifies this without seeing the exact details.

3. Client Synchronization:

- Each client tracks its quota usage.

- Synchronization can be peer-to-peer or through a centralized, encrypted server to prevent double spending of the quota.

4. Quota Renewal:

- Monthly, old tokens expire, and new tokens are issued.

Challenges:

- ZKPs can be resource-intensive.

- Token security is crucial; there should be a way to handle lost or compromised tokens.

- The system should prevent quota "double-spending" across devices.

- If a centralized server is used for synchronization, it should operate with encrypted data.

This way, Kagi would only know who their customers are, but not what kinds of searches they make.
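The unlinkable-token part of step 1 can be approximated with blind signatures, which are simpler than full ZKPs but give the same property at issuance: the server signs a token without ever seeing it. A toy sketch in Python (all names are illustrative assumptions, and the RSA key here is far too small to be secure; a real deployment would use a vetted library and >= 2048-bit keys):

```python
from math import gcd
import hashlib
import secrets

# Toy RSA key (insecure, illustration only).
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))

def digest(msg: bytes) -> int:
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

# User side: blind the token before sending it to the server.
token = b"quota-token-2023-07"
m = digest(token)
while True:
    r = secrets.randbelow(n - 2) + 2   # blinding factor, coprime to n
    if gcd(r, n) == 1:
        break
blinded = (m * pow(r, e, n)) % n

# Server side: signs the blinded value; it never sees `token` itself.
blind_sig = pow(blinded, d, n)

# User side: unblind. (token, sig) now verifies against the server's
# public key but cannot be linked back to the issuance request.
sig = (blind_sig * pow(r, -1, n)) % n
assert pow(sig, e, n) == m  # anyone can check validity anonymously
```

The server later only needs the public key (n, e) plus a list of already-spent tokens to enforce the quota.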


Kagi already provides a way to search anonymously via a random email address (we do not really verify it or need it for anything) and Bitcoin/Lightning payment [1].

Since you are interested in cryptography, there is a discussion on the Kagi feedback site along the same lines as your idea, about possible ways to achieve this without the need for cryptocurrency. [2]

[1] https://blog.kagi.com/accepting-paypal-bitcoin

[2] https://kagifeedback.org/d/653-completely-anonymous-searches...


Thanks for the links. Using a disposable email with crypto payments and occasionally generating a new account to unlink from previous searches could be a viable intermediate solution.

Also, I found this link [1] in the thread you mentioned. They seem to have implemented something like that.

[1] https://metager.de/keys/help/anonymous-token


Just to make it clear, Kagi does not link searches to an account to begin with. Refer to our privacy policy [1]. We simply do not need that data for anything, and it would just be a liability for us. Our philosophy is that users should personalize the search feed themselves, which is why we built features like the ability to block or promote domains, create search lenses, and many more.

However, there is no technical way of proving it. So cryptocurrency and cryptography are ways to achieve anonymity from the user's perspective, regardless of what we are doing.

[1] https://kagi.com/privacy


Any system that can check balance, can link searches to a user. There's no way around it. In your case, Kagi would need to trust the client with the balance, which would be insecure.

There's only one solution, and that is that you need to put a bit of trust in Kagi. Compared to the major one, Google, you can choose between one that promises not to store data and one that openly admits it does (and stores a lot).

It's always a bit sad that here on HN, when companies try to do better than the bigger players, there are always people who think it isn't enough. It has to be absolutely, impossibly perfect.


> Any system that can check balance, can link searches to a user.

I don't think that's true. I can immediately see at least two ways it can be done without identifying the user.

1. Each user gets X tokens at the beginning of the month. When searching, user supplies a token, which is immediately burned. The token does not contain the user identity, just signature validating it's a valid token.

2. Variation of the above: each user gets a token good for X searches at the beginning of the month. When searching, the system will return a token good for N-1 search each time token good for N searches is presented. Again, no need to contain user identity anywhere in the system.

Of course, both solutions have their downsides (sync between multiple devices, stealing tokens, losing tokens, etc.) but it is definitely possible. And I am sure that if somebody spent a little time thinking about it, these ideas could be seriously improved to eliminate the downsides without introducing the need to identify the user.
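Scheme 1 fits in a few lines: the token is random bytes plus a MAC, so it carries no user identity, and the server burns it on first use. Names and sizes below are illustrative assumptions, and the unlinkability of course still depends on the issuance step not recording who received which token:

```python
import hashlib
import hmac
import secrets

SERVER_KEY = secrets.token_bytes(32)  # known only to the server
spent = set()                         # tokens burned so far

def issue_token() -> bytes:
    t = secrets.token_bytes(16)       # random, no user ID inside
    tag = hmac.new(SERVER_KEY, t, hashlib.sha256).digest()
    return t + tag

def redeem(token: bytes) -> bool:
    t, tag = token[:16], token[16:]
    expected = hmac.new(SERVER_KEY, t, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False                  # forged token
    if t in spent:
        return False                  # already burned
    spent.add(t)
    return True

tok = issue_token()
assert redeem(tok) is True            # first use succeeds
assert redeem(tok) is False           # second use is rejected
```

Scheme 2 would replace the spent-set with the server handing back a fresh token for N-1 searches on each redemption.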


In both these cases the search engine provider could easily store your identity together with your token while issuing it, and recover the identity once the token is used, without any way to prove this from the outside. They could even issue tokens in the form AES_ENC("SOME KEY ONLY THEY HAVE", USER_ID | counter) and you would not notice. You would have to trust them that they won't do this, which is no improvement over what Kagi currently does (saying they won't collect any data while admitting they can't prove it; you just have to trust them).
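The objection itself fits in a few lines: issuance can covertly record the mapping, and the token still looks perfectly random to the client. A hypothetical sketch (the user ID is made up):

```python
import secrets

# Server-side state the client never sees.
issued = {}  # token -> user_id

def issue(user_id: str) -> str:
    t = secrets.token_hex(16)  # looks identity-free from the outside
    issued[t] = user_id        # covert linkage at issuance time
    return t

def redeem(t: str):
    # Identity recovered the moment the token is spent.
    return issued.pop(t, None)

t = issue("user-123")
assert redeem(t) == "user-123"  # searches linked after all
```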


I think there's a fundamental difference between "X can not be implemented" and "can we trust this provider to implement X correctly"? In this case, it can be implemented without violating privacy. But of course you need to trust them to actually implement what they say and not instead put 9000 trackers in each page and track your every movement like certain other big companies do. But these are different things - the comment upstream claimed that the subscription system can not be implemented with privacy. This is not true - it can be. Whether or not a particular provider would implement it, and whether we can trust them that they did - that's a different question, which is also important but does not change the answer to the original one.


I'm not a cryptography expert, but from my research, shouldn't it be possible to verify quota on ZKPs server-side? Essentially, the server doesn't need to know the specifics of the user's identity, just that they possess a valid token and haven't exceeded their quota.

You can use search engines like Google without being logged in. When combined with tools like uBlock Origin and Cookie AutoDelete, it becomes more challenging for them to build a singular profile about a user, especially one tied to payment methods such as credit cards.

I genuinely appreciate what Kagi is doing, and I'd absolutely be willing to pay for their service, because if you're not paying for a service, you're the product. I trust companies to uphold their privacy promises, but "Trust is good, but proof is better." ;)


The issue is implementing it client side. ZKP means that you cannot simply embed a token in the URL, but instead need to participate in an active protocol. You could implement this in JavaScript, but then you need to trust the JS being served from the server.

Even once you do that, you have all the other tracking mechanisms that the server could use if it wanted to.


The key word is server side. You have no way to verify that they are not tracking sessions as a user.


> Any system that can check balance, can link searches to a user.

For what it's worth, you can buy a physical Mullvad gift card and use that to create a very anonymous account for VPN use.

Even if you buy your gift card from a major online retailer, it comes from a stack of gift cards; nothing tracks which one was sent to whom. You can also exchange gift cards among friends.


I'm not searching for anything terrifyingly illegal, and for the rest Google and MS already scrape and compile every byte of data I've ever generated. Why would it suddenly be a problem when a more reliable and less vicious company is doing a fraction of that?

You have to understand that most of us aren't fighting some battle for "perfect privacy," I just want a search engine that works for me, rather than advertisers, at the level of the search results themselves.


I get your perspective. A lot of us just want a search engine that serves the user first, not advertisers, especially at the results level. It's about function over strict privacy for many--everyone has their own privacy threshold.

But it's also about digital data autonomy. It's not just about avoiding surveillance over sensitive searches, but having control over our data's destiny. Even mundane data, in aggregate, can sometimes be used in ways we can't predict.


Personally privacy is a strong concern for me; I have many aspects of my digital life set up less conveniently in exchange for privacy.

In this case, though, we have on one hand a product that definitely does aggregate data about searches and doesn't do what I need very well; and on the other a product that could, but does not currently, aggregate data and does an excellent job serving my needs.

And importantly there is no option of a product, available now, that is verifiably prevented from aggregation. Even a VPN unless I disconnect and get a new random IP between every individual search does not provide that protection. (And then browser fingerprints even.)


What counts as "terrifyingly illegal" changes without a moment's notice on the whims of your rulers. So even if you're not googling how to bomb the government, there are hundreds of other subjects and opinions that could, in the future, make the majority of your neighbours, family and workmates think you deserve to be shunned, fired, imprisoned, or worse. That is why people want to protect their privacy.


Ok, but again Google and a hundred data brokers already scrape every detail of my life no matter what I do. Even if I become a hermit in the woods the existence of my friends and family who don't take those precautions would make my efforts worthless. Meanwhile we're talking about Google/Bing vs. Kagi... not "Super Secret Perfect Privacy Magic" vs Kagi.

So while I understand your overall concern, that ship has sailed for search engines and the internet. We're living in a world full of networked cameras that people voluntarily and happily install, of people broadcasting their lives 24/7. The idea of perfect privacy is getting downright mystical/religious.


Sure, and for what it's worth I trust Kagi. But I can understand those who are more strict with their privacy.

In the end I think we have to accept in some way that everything we say, write and do is subject to surveillance, and that the government might want to kill you for any reason that you'd have no chance of preventing.


> Being logged in while making search queries in search engines poses significant privacy risks. The searches can paint a comprehensive profile of the user, and these data often remain stored for extended periods. There's a chance this information might be shared with third parties. Coupled with other user data, these logged-in searches can pave the way for targeted advertising, sophisticated predictive analysis, and potential exploitation by governments or malicious entities. In the event of data breaches, the user's logged-in search histories can be exposed. Furthermore, users typically don't have clear insight into how their data is utilized when logged in.

This reads and smells like ChatGPT / AI.


Was thinking the same thing. Not even GPT-4.


I’ve gotten tired of these boogeyman arguments.

There are sooooo many other ways to fingerprint than an account.

Oh look, this MacBook with X by Y resolution from this IP address has had 100 searches for the past 2 hours. Oh no! He switched to incognito.


Is my new project [0] also part of the problem? I'm still unsure myself, because LLMs also allow us to process data in unprecedented ways. In my specific case, I auto-generate stories to highlight different viewpoints based on what people are saying about hot, controversial topics on social media.

What's your opinion?

[0] https://zeitgaist.social


> Is my new project also part of the problem?

Yes. In a few ways it’s considerably worse. Your website is referencing major world events, including war and freedom of press, by leveraging uninformed comments from the web (many of them themselves written by AI). That would be a problem even if your content weren’t auto-generated (random comments don’t make good or accurate journalism) but it’s worse when it’s churned at a high rate and introduces its own false interpretations.


Thanks for your input. This website is meant neither to compete with nor to replace regular journalism. What I try to achieve here is to break free from social media silos, where people are usually in a kind of bubble. No one can read this many comments, and people tend to read only the comments and conversations that align with their beliefs. Hence, I try to highlight different viewpoints along with contrasting opinions (across several different social media platforms) to get an overview--not necessarily a fact-based one. These stories aren't supposed to push any agenda down anyone's throat.

Since this project just went live I'm still figuring out how to communicate that.


> This website should neither compete with nor replace regular journalism.

That’s good in theory, but people will use it like that anyway. We’re living in an era of rampant misinformation. Provable truths we’ve known for thousands of years are being called into question (the shape of the Earth). Even people who should know to do their own research (see the recent case of lawyers using ChatGPT) are taking AI output as unquestionable truth. Your website won’t eliminate bias; it will only give people more sources to cherry-pick.

> Hence, I try to highlight different view points along with contrasting opinions

Which exacerbates the issue. If we can prove the Earth is a sphere and 99% of people understand that, making it seem like the “contrasting opinion” is equally valid and deserves the same weighted consideration makes the problem worse, not better. John Oliver illustrated the problem in 2014, using climate change as the example.

https://www.youtube.com/watch?v=cjuGCJJUGsg


Why do all the images look deep fried?


Because I intentionally process them with a very rudimentary "cartoonizer" in order to distinguish them from regular news articles and to emphasize that these stories are not written by humans. I don't know yet whether this helps.


It looks like it could be interesting but 'Public Doubts Over Musk's Combat Readiness' is exactly what I wish I never had to read ever again.

There needs to be a dial between 100% stories about Prigozhin/Wagner and 100% stories about Elon Musk's narcissistic press-headline grab of the week.


I regularly get these endless captcha loops on Firefox as well. A couple of days ago I noticed that switching to a different Firefox container lets me pass the captchas.


When I started working on Zeitgaist [1], I immediately recognized that biased information is the main problem. I'm currently thinking about attaching additional information to the sources I present, like Ground News [2] does with political spectrums. I really like your idea for the news algorithm. If you or someone else wants to build something like that on top of Zeitgaist with me, don't hesitate to contact me.

[1] https://zeitgaist.ai [2] https://ground.news/


I think he's just emphasizing that OpenAI is in fact not open, hence it's crossed out.


AI is a very broad term. Most search engines use some kind of ML for building their search index and also for ranking because it works very well.

> Do you want to avoid LLMs answering your search? I have not seen that widely adopted at all.

It's starting to become more common, though. The brief answers that sometimes show up right beneath the search term will most likely be improved by leveraging LLMs.


I agree that showing the prompts would break the usability flow. I'm currently thinking about a way that lets users see the reasoning behind the AI agent - maybe in the form of prompts, if they explicitly enable it - for my current project [1].

Unlike Bing chat etc., I at least show the detailed sources with contents from web searches and social media comments that have been used to generate the answers.

[1] https://zeitgaist.ai


LangChain is a great workaround for that. [1]

> how to work with a memory module that remembers things about specific entities. It extracts information on entities (using LLMs) and builds up its knowledge about that entity over time (also using LLMs).

[1] https://python.langchain.com/en/latest/modules/memory/types/...
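The idea behind such an entity memory can be sketched independently of LangChain's actual API; in this toy version a trivial capitalized-word heuristic stands in for the LLM-based entity extraction (an assumption purely for illustration):

```python
import re
from collections import defaultdict

class EntityMemory:
    """Accumulate notes per entity across conversation turns."""

    def __init__(self):
        self.notes = defaultdict(list)

    def extract_entities(self, text):
        # Stand-in for an LLM extraction call: grab capitalized words.
        return set(re.findall(r"\b[A-Z][a-z]+\b", text))

    def save(self, text):
        # Attach the turn to every entity it mentions.
        for ent in self.extract_entities(text):
            self.notes[ent].append(text)

    def context_for(self, text):
        # Return accumulated knowledge about entities in the new input.
        ents = self.extract_entities(text)
        return {e: self.notes[e] for e in ents if e in self.notes}

mem = EntityMemory()
mem.save("Alice works on search ranking.")
mem.save("Alice also maintains the crawler.")
```

On a follow-up question mentioning Alice, `context_for` would surface both earlier notes, which is what lets the model's knowledge about an entity build up over time.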

