MemoryCache: Augmenting local AI with browser data (future.mozilla.org)
468 points by NdMAND on Dec 12, 2023 | 101 comments



Was just talking about this on reddit like two days ago

Instead of data going to models, we need models to come to our data, which is stored locally and stays local.

While there are many OSS tools for loading personal data, they don't do images or videos. In the future everyone may get their own model, but for now the tech is there while the product/OSS is missing for everyone to get their own QLoRA, RAG pipeline, or summarizer.

Not just messages/docs: what we read or write, and our thoughts, are part of what makes an individual unique. Our browsing history tells a lot about what we read, but no one seems to make use of it other than Google for ads. Almost everyone has a habit of reading x news site, x social network, x YouTube channel, etc. OK, here is the summary for you from these 3 today.

Was just watching this yesterday https://www.youtube.com/watch?v=zHLCKpmBeKA and wondered why, after almost 30 years, we still don't have a computer secretary like her, one who is a step ahead of us.


"While there are many OSS for Loading personal data, they dont do images or videos"

Local models for images are getting pretty good.

LLaVA is an LLM with multi-modal image capabilities that runs pretty well on my laptop: https://simonwillison.net/2023/Nov/29/llamafile/

Models like Salesforce BLIP can be used to generate captions for images too - I built a little CLI tool for that here: https://github.com/simonw/blip-caption
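
For reference, the core of BLIP captioning via the Hugging Face transformers library is only a few lines; a rough sketch (the checkpoint name is the public Salesforce one, the rest is illustrative):

    from PIL import Image
    from transformers import BlipProcessor, BlipForConditionalGeneration

    # Load the public BLIP base captioning checkpoint (downloads on first run)
    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

    image = Image.open("photo.jpg").convert("RGB")  # any local image
    inputs = processor(image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    print(processor.decode(out[0], skip_special_tokens=True))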


CogVLM blows LLaVA out of the water, although it needs a beefier machine (the quantized low-res version barely fits into 12GB of VRAM, though I'm not sure about the accuracy of that figure).


I have no actual knowledge in this area, so I'm not sure if it's entirely relevant, but an update from the 7th of December on the CogVLM repo says it now works with 11GB of VRAM.


> Instead of data going to models, we need models to come to our data, which is stored locally and stays local.

We are building this over at https://software.inc! We collect data about you (from your computer and the internet) into a local database and then teach models how to use it. The models can either be local or cloud-based, and we can route requests based on the sensitivity of the data or the capabilities needed.

We're also hiring if that sounds interesting!


Wow, nice domain. I'd work there for the name alone haha.


Am I cynical thinking the opposite? I can’t imagine they got that domain for a song. Spending a pile of cash on vanity such as that is a real turn off for me; it signals more flash than bang. Am I wrong to think this?


You are not wrong to think this – spending a pile of cash on a name is a big decision that you want to approach with rigor.

We didn't do that, though. Our domain was available for like $4,000. The .inc TLD is intentionally expensive to discourage domain squatting :-)


I’d be worried about the ability to be in relevant searches with a name so generic.


I've never ever run a query for "software inc" before. They should be okay.

Plus, search engines usually catch up based on click-throughs, bounces, financial kickbacks (cough), too.

Searching for Go programming language stuff was a pain a few years back, but now engines have adapted to Go or Golang.

I don't use Google, so ymmv.


site's pretty funny, but would likely be more useful with more information and less clicking-around-nostalgia 8-)


That's because the company is more or less in stealth/investigatory mode. It's the same team that built Workflow which was acquired by Apple and then turned into Shortcuts.


Here is the website with the same information and the same clicking around but less nostalgia: https://software.inc/html

I don’t think it is more useful, but it is certainly more functional (supports screen reading, text selection, maybe dark mode, etc)


your site is not loading at all for me on firefox (emulator error) and is totally non-functional on chrome (TCPCreate failed)

might be worth having some sort of automatic fallback to a static site after a certain amount of failed loading or an error

just saw the link to your html version in another comment and it took literally five minutes to load on firefox


As far as I can see it's just a macOS image; nothing is happening.


Just having an archiver that gives you traditional search over every webpage you've loaded -- forget the AI stuff -- would be a major advance.

I don't know about everyone else, but a majority of my searches are for stuff I've seen before, and they're often frustrated by things that have gone offline, are downranked by search engines (e.g. old documentation on HTTP-only sites), or are buried by SEO.


you will be shocked when you try Rewind then...


I believe that's exactly what GitHub Copilot does. It first scans and indexes your entire codebase, including dependencies (I think). So when it auto-completes, it heavily uses the context of your code, which is what actually makes Copilot so useful.

You're absolutely right about models coming to our data! If we could have Copilot-like intelligence, completely on-device, scanning all sorts of personal breadcrumbs like messages, browsing history, even webpage content, it would be a game-changer!


> Our browsing history tells a lot about what we read, but no one seems to make use of it other than Google for ads. Almost everyone has a habit of reading x news site, x social network, x YouTube channel, etc. OK, here is the summary for you from these 3 today.

I was imagining something a little more ambitious, like a model that uses our search history and behavior to derive how best to compose a search query. Bing Chat's search queries look like what my uncle would type right after I explained to him what a search engine is. Throw in some advanced operators like site: or filetype:, or at least parentheses along with AND/OR. Surely we can fine-tune it to emulate the search processes of the most impressive researchers, paralegals, and teenagers on the spectrum who immediately fact-check your grandpop's Ellis Island story, with evidence that he both arrived first and was naturalized in Chicago.


Google already tried this 15 years ago

https://en.m.wikipedia.org/wiki/Google_Search_Appliance


> Instead of data going to models, we need models to come to our data, which is stored locally and stays local.

That's the most important idea I've read since ChatGPT / last year.

I'll wait for this. Then build my own private AI. And share it / pair it for learning with other private AIs, like a blogroll.

As always, there will be two 'different' AIs: a) the mainstream, centralized, ad/revenue-driven, capitalist, political, controlling/exploiting kind; b) the personal, trustworthy kind, polished on peer networks, fun, profitable for one person or a small community.

If, by chance, commercial models end up better than open-source models due to better access to computing power/data, please let me know. We can go back to SETI and share our idle computing power / existing knowledge.


I assume that training LLMs locally requires high-end hardware. Even running a model requires a decent CPU or, even better, a high-end GPU, though that is not as expensive as training a model. And usually you have to use hardware that is available in the cloud, so not much privacy there.


You don't need to train the model on your data: you can use retrieval augmented generation to add the relevant documents to your prompt at query time.
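
A minimal sketch of that flow; embed() and complete() are hypothetical stand-ins for whatever local embedding model and LLM you run, and numpy does the similarity math:

    import numpy as np

    def embed(text: str) -> np.ndarray:
        """Hypothetical: return a unit-length embedding vector for the text."""
        raise NotImplementedError

    def complete(prompt: str) -> str:
        """Hypothetical: run the local LLM on the prompt."""
        raise NotImplementedError

    def answer(question: str, documents: list[str], k: int = 3) -> str:
        doc_vecs = np.stack([embed(d) for d in documents])
        scores = doc_vecs @ embed(question)   # cosine similarity (unit vectors)
        top = [documents[i] for i in np.argsort(scores)[::-1][:k]]
        context = "\n---\n".join(top)
        return complete(f"Answer using only this context:\n{context}\n\nQuestion: {question}")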


This works if the document plus prompt fits in the context window. I suspect the most popular task for this workflow is summarization, which presumably means large documents. That's when you begin scaling out to a vector store and implementing those more advanced workflows. It does work even when sending a large document to certain local models, but even with the highest-tier MacBook Pro a large document can quickly choke any LLM and bring inference speed to a crawl. Meaning, a powerful client is still required no matter what. Even if you generate embeddings in "real time" and dump them to a vector store, that process would be slow on most consumer hardware.

If you're passing in smaller documents, it works pretty well for real-time feedback.
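
For the large-document case above, the chunking step is short enough to sketch (sizes are arbitrary assumptions; each chunk gets embedded and stored once, and only the top-k chunks are stuffed into the prompt at query time):

    def chunk(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
        """Split text into overlapping character windows so each piece stays
        well under the model's context budget; overlap preserves continuity."""
        step = size - overlap
        return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]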


Thank you for the explanation. I see there is still a lot I have to learn about LLMs.


As someone else said, you don't need to train any models. Also, small LLMs (~7B) can run really well even on a base M1 MacBook Air from 3 years ago.


Yes, we should have local models in addition to remote models. Remote ones are always going to be more capable, and we shouldn't throw that away. Augmentation is orthogonal - you can augment either of these with your own data.


Local compute is so 80s, when people moved away from dumb terminals and mainframes, to PCs.


Remote computing is so late '90s, when people moved away from PCs to servers (the dot in dot-com).

Turns out this sort of stuff is cyclical.


Yes, but this time we call it “distributed computing” or “edge computing” instead.


Good idea. Mozilla gets a lot of rightful hate for their mishandling of FF and their political preaching, but I believe they are still capable of developing tech that is both privacy-preserving and user-friendly at the same time.

I use the offline translator built into FF regularly and it's magic. I would've never thought something like that could run locally without a server park's worth of hardware thrown at it.

Here's hoping this experiment turns out the same way.


Well said; I agree wholeheartedly.


My browsing usage isn't relevant for this. I don't want to "chat" with my browsing history. I would simply love for my browser to index my bookmarks on my OS so I could search the actual content of those bookmarks.

The feedback loop gained from ChatGPT will, I assume, always be way better than my local GPT equivalent.

But often I bookmark pages where I know the information there is important enough for me to come back to more than once.

So I have started crafting a solution for this. It crawls the bookmarks in your local browser storage, downloads those pages, and adds them to your OS's search index.
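
A rough sketch of that pipeline for Firefox, whose bookmarks live in the places.sqlite profile database (the moz_bookmarks/moz_places schema is standard, but treat the profile path and the hand-off to your OS indexer as assumptions to adapt):

    import os
    import sqlite3
    import urllib.request

    # Assumed path: substitute your actual Firefox profile directory.
    PLACES_DB = os.path.expanduser("~/.mozilla/firefox/XXXXXXXX.default-release/places.sqlite")

    def bookmark_urls(db_path: str) -> list[str]:
        """Read bookmarked URLs; type = 1 marks bookmarks (not folders/separators)."""
        con = sqlite3.connect(db_path)
        rows = con.execute(
            "SELECT p.url FROM moz_bookmarks b "
            "JOIN moz_places p ON b.fk = p.id WHERE b.type = 1"
        ).fetchall()
        con.close()
        return [url for (url,) in rows if url.startswith("http")]

    def fetch(url: str) -> str:
        """Download the page body, ready to hand to the OS search indexer."""
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")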

That's been an itch for me for years.


Small data sets suffer from bad recall in full-text search, so a bit of smart fuzziness added to the search by AI could improve the experience of locally indexed bookmarks quite a bit.


Didn't Chrome do this at the very beginning, when it was initially released? I faintly remember that being a feature.

Personally, I would already be content if my browsers didn't forget their history all the time; both Firefox and Safari history are way too short-lived.


You were probably thinking of Google Desktop which could search almost anything on your machine.

https://en.wikipedia.org/wiki/Google_Desktop


I went looking and it was indeed Chrome that could do it. See screenshot from 2009 here: https://superuser.com/a/42499

Google removed the feature intentionally in 2013: https://bugs.chromium.org/p/chromium/issues/detail?id=297648

Apparently Opera supported it too at the time, and from the comments Safari as well.

Performance reasons seem to have killed it. I'd think that now, 10 years later, any modern computer would be able to handle it.


That's an interesting find.


Isn't this just Safari? *using a modern chip


I would sure love a way to "chat" with my browsing history and page content. Is there any way to automatically save off pages that I've visited for later processing? I looked a decade or more ago and didn't really find a good solution.


I think WorldBrain (https://github.com/WorldBrain/Memex) promises this. While I'm also excited by the idea, I think there was some reason I ended up not using it.



Zotero might work, but only as a highly imperfect solution, since it is more focused on research.


Rewind.ai is pretty much this - I just installed it and am very happy so far.


Isn't it Apple devices only?


"Coming soon to Windows"

https://www.rewind.ai/windows


Just need a Linux version or an open source alternative now


Classic bookmarks have failed because mnemonic organization doesn't scale. This kind of interface does, and can replace it entirely if done right.

Thinking of it, something like this can be used for all your local files as well, acting as a better version of the old filesystem-as-a-database idea. Or for a specific knowledge base (think LLM-powered Zotero).


Sounds like you just invented the modern version of Windows Longhorn


Wasn't this the idea behind "Networked Environment for Personal, Ontology-based Management of Unified Knowledge" (Nepomuk) Semantic Desktop?

Assuming you can coerce the LLM to fill in the RDF correctly, and given that we now have much more memory and faster storage, it might work.


Something like Orbit would be perfect

https://withorbit.com/


They might be onto something here.

Instead of doing lots of back-n-forth with the giants, enriching them with each prompt, you get a smaller local model that's much more respectful of your privacy.

That's an operating model I am willing to do some OSS contributions to, or even bankroll.

Gotta love the underdogs, even if admittedly, I am not a big Mozilla org fan.


In the future their AIs are going to talk to our AIs. Because we need protection.


It’s what Apple’s been doing for a few years, though it remains unclear how much of that is “AI”. So it makes sense that someone else would enter that niche.


PrivateGPT repository in case anyone's interested: https://github.com/imartinez/privateGPT . It doesn't seem to be linked from their official website.


It's linked from the MemoryCache repo listed at the bottom of the article: https://github.com/Mozilla-Ocho/Memory-Cache


Does anyone know what PrivateGPT is using for its local model, and where it came from?

Update:

Answering my own question: it looks like it uses llama.cpp in local mode? https://github.com/imartinez/privateGPT/blob/main/private_gp...


Maybe it is just me, since I lived through the Firefox OS era as an intern: this feels like a possible re-entrance of offering a Mozilla-built OS in the future. They said the Internet was born to connect people - but building everything into a browser is not the most optimal way of adding all this fancy stuff. Firefox OS was basically a small Linux kernel plus Gecko plus HTML5 for rendering. So, much like iOS and iPadOS, Mozilla could offer a similar OS for devices/platforms. I mean, for the past 5 years they have been investing in AR and VR. So I won't be surprised if they eventually bet on another Firefox OS…



Could barely get a sense of what any of this meant from the shared link.

Went back a bit further, to the official site:

> MemoryCache is an experimental development project to turn a local desktop environment into an on-device AI agent.

Okayy...

And this from November

Introducing Memory Cache

https://memorycache.ai/developer-blog/2023/11/06/introducing...


I'm confused by the example they gave.

> What is the meaning of a life well-lived?

Is the response to this based on browser data? Based on the description I was expecting queries more like:

> What was the name of that pdf I downloaded yesterday?

> What are my top 3 most visited sites?

> What type of content do I generally interact with?


One thing you'll see in a lot of these LLM examples and demos is intentionally subjective queries, so they can't be judged on pass/fail criteria.

For example, you'll see things like "where should I visit in Japan?" or "how should I plan a bachelor party?", because there is a huge variety of answers that are all "correct", regardless of how much you disagree with them. There is also a huge number of examples for them to draw from, especially compared to something as specific as your browsing history.


That information is already available. You want a better search interface.


Yes, exactly, I want a search interface that's an LLM instead of a bunch of menus.


They could provide a local URL called "about:wrapped" that gives a summary of your usage, like Spotify Wrapped: the top 100 sites, and you can click on a site for more info like what pages you visited, when, how often, etc.
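
The raw data for that is already sitting in Firefox's Places database; a sketch of the "top 100 sites" query (assuming the standard moz_places schema):

    import os
    import sqlite3

    def top_sites(profile_dir: str, n: int = 100) -> list[tuple[str, int]]:
        """Return the n most-visited URLs with their visit counts."""
        db = os.path.join(os.path.expanduser(profile_dir), "places.sqlite")
        con = sqlite3.connect(db)
        rows = con.execute(
            "SELECT url, visit_count FROM moz_places "
            "ORDER BY visit_count DESC LIMIT ?", (n,)
        ).fetchall()
        con.close()
        return rows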


Does it ingest the ads contained in the web pages as well? This would be a major concern: ads would pollute the model, at least with unwanted information. At worst, they'd be a security concern when used for indiscriminate or even targeted manipulation of the model. Advertisers do that to our brains; it's all they do. So why shouldn't they try it with LLMs scraping/being fed from the web?


I wish they would fix basic features such as downloading pictures in Firefox for Android. Often, long-pressing an image opens a context menu that does not allow downloading, only following the link associated with the image.


What I would love to see is this model being able to learn to automate some tasks that I usually do, e.g. signing up for events, buying tickets, etc. If it had access to your login details and could log in, it could be a great assistant!


Teach it to press the skip ad button


Or it could click "hide" on cookie banners for me!


They actually already added this, but it's still in a limited trial phase.

https://support.mozilla.org/en-US/kb/cookie-banner-reduction


Or shoe bot


I've been doing this for a long time.

https://news.ycombinator.com/item?id=38421121

The FF solution is just more automated.


So this is what growth hacking looks like: building a landing page for an imaginary product to test an idea's market fit?


this is a pretty cool idea, i'd like to be able to choose which pages i want to cache


I hope this encourages Mozilla to focus more on page-archiving support on the web. I feel as though they missed a huge opportunity by not making it easy to archive pages with DOM snapshots, or easy to snag videos or images. (Go to Instagram and try to right-click -> download an image; you can't.) It would have been a very good way to differentiate from Chrome, as Google wouldn't want that available for YouTube. And "our browser can download videos and images from anywhere" is a pretty easy sell to potential users.


I'm baffled that support for single-file, offline HTML is still so bad today:

https://www.russellbeattie.com/notes/posts/the-decades-long-...

(I suspect this is because it goes against the wants of some of the biggest players, who have an incentive to make us leave as many online footprints as possible?)

Even here, Mozilla recommends converting to PDF for easier (?!?) human readability. Except PDF is a very bad format for digital documents, with no support for reflow and very bad multimedia support. (PDF is perhaps good for archiving offline documents, even despite its other issues.)


"Save Page WE" will capture a DOM snapshot to a single HTML file. The only problem is that Data URLs encoded using Base64 are highly bloated.


Isn't that basically Pocket, the service that people complain about endlessly as "bloat"?


Agreed, it seems like it's insanely hard to back up a modern JS-enabled web page in a usable way that results in a single file which can be easily shared.


Have you tried SingleFile? It sounds like what you’re looking for:

https://github.com/gildas-lormeau/SingleFile


Also check out https://archiveweb.page which is open source, local, and lets you export archived data as WARC (ISO 28500). You can embed archives in web pages using their Web Component https://replayweb.page.


Will check it out, thanks.


Regarding PrivateGPT, if I have a 12GB Nvidia 4070 and an 11GB 2080ti, which LLM should I run?

Edited to add: https://www.choosellm.com/ by the PrivateGPT folks seems to have what I needed.


There's a big community discussing exactly that over at https://www.reddit.com/r/LocalLLaMA/.


+1 for r/LocalLLaMA. 23GB should allow you to run ~30B models, but honestly some of the new smaller models such as Mistral & friends (Zephyr etc.) are really interesting. You could also give Mixtral a try if you get a low-quant format such as this Q3: https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-G...


Do you know if there is a website or spreadsheet where I could enter my RAM, GPU, etc. and see what models I could run locally?


Yes, https://www.reddit.com/r/LocalLLaMA/. Just ask there and a bunch of Non-Artificial Intelligent agents will give you that answer ;).


While I agree a website/spreadsheet would be convenient, it's not that complicated. As long as the GPU is handling 50-75% of the LLM's layers, you should get a decent tok/sec speed (unless you're running really, really large models).
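
With llama.cpp-based runners, that offload split is a single knob; a sketch using the llama-cpp-python bindings (the model path is a placeholder, and the right layer count depends on your VRAM):

    from llama_cpp import Llama

    # Offload most layers to the GPU; the remainder runs on the CPU. Tune
    # n_gpu_layers down if you hit out-of-memory errors, up if VRAM allows.
    llm = Llama(
        model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",  # placeholder
        n_gpu_layers=28,   # e.g. ~80% of a 7B model's 32 layers
        n_ctx=4096,
    )
    print(llm("Q: What is retrieval-augmented generation? A:", max_tokens=128)["choices"][0]["text"])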


Could you explain to me (in steps) how I would go about calculating how much VRAM I would need to run, say, Mixtral 8x7B?
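
A rough back-of-envelope, assuming ~47B total parameters for Mixtral 8x7B (all eight experts must be resident in memory even though only two are active per token):

    # Weights: params x bytes-per-param for the chosen quantization
    params = 46.7e9           # total parameters, all experts included
    bytes_per_param = 0.5     # 4-bit quantization = half a byte per weight
    weights_gb = params * bytes_per_param / 1e9   # ~23.4 GB
    # Add the KV cache (grows with context length) plus 1-2 GB of runtime
    # overhead: roughly 25-26 GB total at 4-bit, so 24 GB of VRAM is tight
    # and some layers will usually spill to the CPU.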


you're better than this mozilla. hopping on the ai trend is disgusting given your alleged morals


This seems like complete overkill.

I don't even like having to clear my history and whatnot regularly. I use incognito mode most of the time.

Now I have to monitor what my local AI collects?

"through the lens of privacy" my ass, man.

Why would I ask my browser what the meaning of a life well lived is?


What is happening at Firefox is quite strange. It's like they are walking backwards.


This seems like a sensible step in the right direction, IMO; (optional) features such as local, privacy-respecting LLMs will help augment people's online research, bookmarking, contextual search, etc.

It's important that we have Firefox working on such experiments; otherwise, as Google adds more of their privacy-invading features to Chrome/Chromium, it will likely have a negative impact on people's desire to find alternative browsers.


Yeah, but maybe, if you are constantly losing market share, you should work on things that appeal to a wider audience. Unless you have a trump card and intend to use it as a deus ex machina to suddenly show people you are THE browser, the way forward.


You don't gain market share by doing the same stuff the other FREE alternative does.

You gain market share by doing what they refuse to do, no matter how much it's in the user's interest, because their business is stealing the user's data and yours isn't.


In a just world, that's a way to gain market share. In our world, people concede their data for marginal improvements in the quality of a feature because they can't conceive of how giving up control of their data could come back to harm them. It doesn't feel like there is a downside.


What free alternatives? All the browsers look & feel the same. Zero innovation.


Very misleading name. The word "Memory" has a distinct meaning in relation to computing, but this is more about human memories.


I was going to ignore this as a troll comment, because computer memory has its antecedents in human memory, but the commenter is right: the combination of "memory" and "cache" to talk about human memory seems misleading.


I kind of like the association since it speaks to how text collected while browsing the web can be used to generate new text, which is similar, at least metaphorically, to how human memory is reconstructive and transformative, not perfect recall. https://en.wikipedia.org/wiki/Reconstructive_memory



