MemoryCache: Augmenting local AI with browser data (future.mozilla.org)
468 points by NdMAND on Dec 12, 2023 | 101 comments



Was just talking about this on reddit like two days ago

Instead of data going to models, we need models to come to our data, which is stored locally and stays local.

While there are many OSS tools for loading personal data, they don't do images or videos. In the future everyone may get their own model, but for now the tech is there while the product/OSS is missing for everyone to get their own QLoRA, RAG pipeline, or summarizer.

Not just messages/docs: what we read or write, and our thoughts, are part of what makes an individual unique. Our browsing history tells a lot about what we read, but no one seems to make use of it other than Google for ads. Almost everyone has a habit of reading x news site, x social network, x YouTube channel, etc. OK, here is the summary for you from these 3 today.

Was just watching this yesterday https://www.youtube.com/watch?v=zHLCKpmBeKA and wondered why, after almost 30 years, we still don't have a computer secretary like her, one who is a step ahead of us.


"While there are many OSS for Loading personal data, they dont do images or videos"

Local models for images are getting pretty good.

LLaVA is an LLM with multi-modal image capabilities that runs pretty well on my laptop: https://simonwillison.net/2023/Nov/29/llamafile/

Models like Salesforce BLIP can be used to generate captions for images too - I built a little CLI tool for that here: https://github.com/simonw/blip-caption
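
For reference, the core of BLIP captioning via the Hugging Face transformers library is only a few lines; a rough sketch (the checkpoint name is the public Salesforce one, the rest is illustrative):

    from PIL import Image
    from transformers import BlipProcessor, BlipForConditionalGeneration

    # Load the public BLIP base captioning checkpoint (downloads on first run)
    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

    image = Image.open("photo.jpg").convert("RGB")  # any local image
    inputs = processor(image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    print(processor.decode(out[0], skip_special_tokens=True))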


CogVLM blows LLaVA out of the water, although it needs a beefier machine (the quantized low-res version barely fits into 12GB of VRAM, though I'm not sure about the accuracy of that figure).


I have no actual knowledge in this area, so I'm not sure if it's entirely relevant, but an update from the 7th of December on the CogVLM repo says it now works with 11GB of VRAM.


> Instead of data going to models, we need models to come to our data, which is stored locally and stays local.

We are building this over at https://software.inc! We collect data about you (from your computer and the internet) into a local database and then teach models how to use it. The models can either be local or cloud-based, and we can route requests based on the sensitivity of the data or the capabilities needed.

We're also hiring if that sounds interesting!


Wow, nice domain. I'd work there for the name alone haha.


Am I cynical thinking the opposite? I can’t imagine they got that domain for a song. Spending a pile of cash on vanity such as that is a real turn off for me; it signals more flash than bang. Am I wrong to think this?


You are not wrong to think this – spending a pile of cash on a name is a big decision that you want to approach with rigor.

We didn't do that, though. Our domain was available for like $4,000. The .inc TLD is intentionally expensive to discourage domain squatting :-)


I’d be worried about the ability to be in relevant searches with a name so generic.


I've never ever run a query for "software inc" before. They should be okay.

Plus, search engines usually catch up based on click-throughs, bounces, financial kickbacks (cough), too.

Searching for Go programming language stuff was a pain a few years back, but now engines have adapted to Go or Golang.

I don't use Google, so ymmv.


site's pretty funny, but would likely be more useful with more information and less clicking-around-nostalgia 8-)


That's because the company is more or less in stealth/investigatory mode. It's the same team that built Workflow which was acquired by Apple and then turned into Shortcuts.


Here is the website with the same information and the same clicking around but less nostalgia: https://software.inc/html

I don’t think it is more useful, but it is certainly more functional (supports screen reading, text selection, maybe dark mode, etc)


your site is not loading at all for me on firefox (emulator error) and is totally non-functional on chrome (TCPCreate failed)

might be worth having some sort of automatic fallback to a static site after a certain amount of failed loading or an error

just saw the link to your html version in another comment and it took literally five minutes to load on firefox


As far as I can see it's just a macOS image; nothing is happening.


Just having an archiver that gives you traditional search over every webpage you've loaded -- forget the AI stuff -- would be a major advance.

I don't know about everyone else, but a majority of my searches are for stuff I've seen before, and they're often frustrated by things that have gone offline, are downranked by search engines (e.g. old documentation on HTTP-only sites), or are buried by SEO.


you will be shocked when you try Rewind then...


I believe that's exactly what GitHub Copilot does. It first scans and indexes your entire codebase, including dependencies (I think). So when it auto-completes, it heavily uses the context of your code, which is what actually makes Copilot so useful.

You're absolutely right about models coming to our data! If we could have Copilot-like intelligence, completely on-device, scanning all sorts of personal breadcrumbs like messages, browsing history, even webpage content, it would be a game-changer!


> Our browsing history tells a lot about what we read, but no one seems to make use of it other than Google for ads. Almost everyone has a habit of reading x news site, x social network, x YouTube channel, etc. OK, here is the summary for you from these 3 today.

I was imagining something a little more ambitious, like a model that uses our search history and behavior to derive how best to compose a search query. Bing Chat's search queries look like what my uncle would type right after I explained to him what a search engine is. Throw in some advanced operators like site: or filetype:, or at least parentheses along with AND/OR. Surely we can fine-tune it to emulate the search processes of the most impressive researchers, paralegals, and teenagers on the spectrum who immediately fact-check your grandpop's Ellis Island story, with evidence that he both arrived first and was naturalized in Chicago.


Google already tried this 15 years ago

https://en.m.wikipedia.org/wiki/Google_Search_Appliance


> Instead of data going to models, we need models to come to our data, which is stored locally and stays local.

That's the most important idea I've read since ChatGPT / last year.

I'll wait for this. Then build my own private AI. And share it / pair it for learning with other private AIs, like a blogroll.

As always, there will be two 'different' AIs: a) the mainstream, centralized, ad/revenue-driven, capitalist, political, controlling/exploiting kind; b) the personal, trustworthy kind, polished on peer networks, fun, profitable for one person or a small community.

If, by chance, commercial models end up better than open-source models due to better access to computing power/data, please let me know. We can go back to SETI and share our idle computing power / existing knowledge.


I assume that training LLMs locally requires high-end hardware. Even running a model requires a decent CPU or, even better, a high-end GPU, though that is not as expensive as training a model. And usually you have to use hardware that is available in the cloud, so not much privacy there.


You don't need to train the model on your data: you can use retrieval augmented generation to add the relevant documents to your prompt at query time.
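
A minimal sketch of that flow; embed() and complete() are hypothetical stand-ins for whatever local embedding model and LLM you run, and numpy does the similarity math:

    import numpy as np

    def embed(text: str) -> np.ndarray:
        """Hypothetical: return a unit-length embedding vector for the text."""
        raise NotImplementedError

    def complete(prompt: str) -> str:
        """Hypothetical: run the local LLM on the prompt."""
        raise NotImplementedError

    def answer(question: str, documents: list[str], k: int = 3) -> str:
        doc_vecs = np.stack([embed(d) for d in documents])
        scores = doc_vecs @ embed(question)   # cosine similarity (unit vectors)
        top = [documents[i] for i in np.argsort(scores)[::-1][:k]]
        context = "\n---\n".join(top)
        return complete(f"Answer using only this context:\n{context}\n\nQuestion: {question}")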


This works if the document plus prompt fits in the context window. I suspect the most popular task for this workflow is summarization, which presumably means large documents. That's when you begin scaling out to a vector store and implementing those more advanced workflows. It does work even when sending a large document to certain local models, but even with the highest-tier MacBook Pro a large document can quickly choke any LLM and bring inference speed to a crawl. Meaning, a powerful client is still required no matter what. Even if you generate embeddings in "real time" and dump them to a vector store, that process would be slow on most consumer hardware.

If you're passing in smaller documents, it works pretty well for real-time feedback.
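
For the large-document case above, the chunking step is short enough to sketch (sizes are arbitrary assumptions; each chunk gets embedded and stored once, and only the top-k chunks are stuffed into the prompt at query time):

    def chunk(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
        """Split text into overlapping character windows so each piece stays
        well under the model's context budget; overlap preserves continuity."""
        step = size - overlap
        return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]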


Thank you for the explanation. I see there is still a lot I have to learn about LLMs.


As someone else said, you don't need to train any models. Also, small LLMs (~7B) can run really well even on a base M1 MacBook Air from 3 years ago.


Yes, we should have local models in addition to remote models. Remote ones are always going to be more capable, and we shouldn't throw that away. Augmentation is orthogonal - you can augment either of these with your own data.


Local compute is so 80s, when people moved away from dumb terminals and mainframes, to PCs.


Remote computing is so late '90s, when people moved away from PCs to servers (the dot in dot-com).

Turns out this sort of stuff is cyclical.


Yes, but this time we call it “distributed computing” or “edge computing” instead.


Good idea. Mozilla gets a lot of rightful hate for their mishandling of FF and their political preaching, but I believe they are still capable of developing tech that is both privacy-preserving and user-friendly at the same time.

I use the offline translator built into FF regularly and it's magic. I would've never thought something like that could run locally without a server park's worth of hardware thrown at it.

Here's hoping this experiment turns out the same way.


Well said; I agree wholeheartedly.


My browsing usage isn't relevant for this. I don't want to "chat" with my browsing history. I would simply love for my browser to index my bookmarks on my OS so I could search the actual content of those bookmarks.

The feedback loop gained from ChatGPT will, I assume, always be way better than my local GPT equivalent.

But often I bookmark pages where I know the information there is important enough for me to come back to more than once.

So I have started crafting a solution for this. It crawls the bookmarks in your local browser storage, downloads those pages, and adds them to your OS's search index.
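
A rough sketch of that pipeline for Firefox, whose bookmarks live in the places.sqlite profile database (the moz_bookmarks/moz_places schema is standard, but treat the profile path and the hand-off to your OS indexer as assumptions to adapt):

    import os
    import sqlite3
    import urllib.request

    # Assumed path: substitute your actual Firefox profile directory.
    PLACES_DB = os.path.expanduser("~/.mozilla/firefox/XXXXXXXX.default-release/places.sqlite")

    def bookmark_urls(db_path: str) -> list[str]:
        """Read bookmarked URLs; type = 1 marks bookmarks (not folders/separators)."""
        con = sqlite3.connect(db_path)
        rows = con.execute(
            "SELECT p.url FROM moz_bookmarks b "
            "JOIN moz_places p ON b.fk = p.id WHERE b.type = 1"
        ).fetchall()
        con.close()
        return [url for (url,) in rows if url.startswith("http")]

    def fetch(url: str) -> str:
        """Download the page body, ready to hand to the OS search indexer."""
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")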

That's been an itch for me for years.


Small data sets suffer from bad recall in full-text search, so a bit of smart fuzziness added to the search by AI could improve the experience of locally indexed bookmarks quite a bit.


Didn't Chrome do this at the very beginning, when it was initially released? I faintly remember that being a feature.

Personally, I would already be content if my browsers didn't forget their history all the time; both Firefox and Safari history are way too short-lived.


You were probably thinking of Google Desktop which could search almost anything on your machine.

https://en.wikipedia.org/wiki/Google_Desktop


I went looking and it was indeed Chrome that could do it. See screenshot from 2009 here: https://superuser.com/a/42499

Google removed the feature intentionally in 2013: https://bugs.chromium.org/p/chromium/issues/detail?id=297648

Apparently Opera supported it too at the time, and from the comments Safari as well.

Performance reasons seem to have killed it. I'd think that now, 10 years later, any modern computer would be able to handle it.


That's an interesting find.


Isn't this just Safari? *using a modern chip


I would sure love a way to "chat" with my browsing history and page content. Is there any way to automatically save off pages that I've visited for later processing? I looked a decade or more ago and didn't really find a good solution.


I think WorldBrain (https://github.com/WorldBrain/Memex) promises this. While I'm also excited by the idea, I think there was some reason I ended up not using it.



Zotero might work, but only as a highly imperfect solution, since it is more focused on research.


Rewind.ai is pretty much this - I just installed it and am very happy so far.


Isn't it Apple devices only?


"Coming soon to Windows"

https://www.rewind.ai/windows


Just need a Linux version or an open source alternative now


Classic bookmarks have failed because mnemonic organization doesn't scale. This kind of interface does, and can replace it entirely if done right.

Thinking of it, something like this can be used for all your local files as well, acting as a better version of the old filesystem-as-a-database idea. Or for a specific knowledge base (think LLM-powered Zotero).


Sounds like you just invented the modern version of Windows Longhorn


Wasn't this the idea behind "Networked Environment for Personal, Ontology-based Management of Unified Knowledge" (Nepomuk) Semantic Desktop?

Assuming you can coerce the LLM to fill in the RDF correctly, and given that we now have much more memory and faster storage, it might work.


Something like Orbit would be perfect

https://withorbit.com/


They might be onto something here.

Instead of doing lots of back-n-forth with the giants, enriching them with each prompt, you get a smaller local model that's much more respectful of your privacy.

That's an operating model I am willing to do some OSS contributions to, or even bankroll.

Gotta love the underdogs, even if admittedly, I am not a big Mozilla org fan.


In the future their AIs are going to talk to our AIs. Because we need protection.


It’s what Apple’s been doing for a few years, though it remains unclear how much of that is “AI”. So it makes sense that someone else would enter that niche.


PrivateGPT repository in case anyone's interested: https://github.com/imartinez/privateGPT . It doesn't seem to be linked from their official website.


It's linked from the MemoryCache repo listed at the bottom of the article: https://github.com/Mozilla-Ocho/Memory-Cache


Does anyone know what PrivateGPT is using for its local model, and where it came from?

Update:

Answering my own question: it looks like it uses llama.cpp in local mode? https://github.com/imartinez/privateGPT/blob/main/private_gp...


Maybe it is just me, since I lived through the Firefox OS era as an intern: this feels like a possible re-entrance of offering a Mozilla-built OS in the future. They said the Internet was born to connect people - but building everything into a browser is not the most optimal way of adding all this fancy stuff. Firefox OS was basically a small Linux kernel plus Gecko plus HTML5 for rendering. So, much like iOS and iPadOS, Mozilla could offer a similar OS for devices/platforms. I mean, for the past 5 years they have been investing in AR and VR. So I won't be surprised if they eventually bet on another Firefox OS…



Could barely get a sense of what any of this meant from the shared link.

Went back a bit further, to the official site:

> MemoryCache is an experimental development project to turn a local desktop environment into an on-device AI agent.

Okayy...

And this from November

Introducing Memory Cache

https://memorycache.ai/developer-blog/2023/11/06/introducing...


I'm confused by the example they gave.

> What is the meaning of a life well-lived?

Is the response to this based on browser data? Based on the description I was expecting queries more like:

> What was the name of that pdf I downloaded yesterday?

> What are my top 3 most visited sites?

> What type of content do I generally interact with?


One thing you'll see in a lot of these LLM examples and demos is intentionally subjective queries, so they can't be judged on pass/fail criteria.

For example, you'll see things like "where should I visit in Japan?" or "how should I plan a bachelor party?", because there is a huge variety of answers that are all "correct", regardless of how much you disagree with them. There is also a huge number of examples for them to draw from, especially compared to something as specific as your browsing history.


That information is already available. You want a better search interface.


Yes, exactly, I want a search interface that's an LLM instead of a bunch of menus.


They could provide a local URL called "about:wrapped" that gives a summary of your usage, like Spotify Wrapped: the top 100 sites, and you can click on a site for more info like what pages you visited, when, how often, etc.
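
The raw data for that is already sitting in Firefox's Places database; a sketch of the "top 100 sites" query (assuming the standard moz_places schema):

    import os
    import sqlite3

    def top_sites(profile_dir: str, n: int = 100) -> list[tuple[str, int]]:
        """Return the n most-visited URLs with their visit counts."""
        db = os.path.join(os.path.expanduser(profile_dir), "places.sqlite")
        con = sqlite3.connect(db)
        rows = con.execute(
            "SELECT url, visit_count FROM moz_places "
            "ORDER BY visit_count DESC LIMIT ?", (n,)
        ).fetchall()
        con.close()
        return rows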


Does it ingest the ads contained in the web pages as well? This would be a major concern: ads would pollute the model, at least with unwanted information. At worst, they'd be a security concern when used for indiscriminate or even targeted manipulation of the model. Advertisers do that to our brains; it's all they do. So why shouldn't they try it with LLMs scraping/being fed from the web?


I wish they would fix basic features such as downloading pictures in Firefox for Android. Often, long-pressing an image opens a context menu that does not allow downloading, only following the link associated with the image.


What I would love to see is this model being able to learn to automate some tasks that I usually do, e.g. signing up for events, buying tickets, etc. If it had access to your login details and could log in, it could be a great assistant!


Teach it to press the skip ad button


Or it could click "hide" on cookie banners for me!


They actually already added this, but it's still in a limited trial phase.

https://support.mozilla.org/en-US/kb/cookie-banner-reduction


Or shoe bot


I've been doing this for a long time.

https://news.ycombinator.com/item?id=38421121

The FF solution is just more automated.


So this is what growth hacking looks like: building a landing page for an imaginary product to test an idea's market fit?


this is a pretty cool idea, i'd like to be able to choose which pages i want to cache


I hope this encourages Mozilla to focus more on page-archiving support on the web. I feel as though they missed a huge opportunity by not making it easy to archive pages with DOM snapshots, or easy to snag videos or images. (Go to Instagram and try to right-click -> download an image; you can't.) It would have been a very good way to differentiate from Chrome, as Google wouldn't want that available for YouTube. And "our browser can download videos and images from anywhere" is a pretty easy sell to potential users.


I'm baffled that support for single-file, offline HTML is still so bad today:

https://www.russellbeattie.com/notes/posts/the-decades-long-...

(I suspect this is because it goes against the wants of some of the biggest players, who have an incentive to make us leave as many online footprints as possible?)

Even here, Mozilla recommends converting to PDF for easier (?!?) human readability. Except PDF is a very bad format for digital documents, with no support for reflow and very bad multimedia support. (PDF is perhaps good for archiving offline documents, even despite its other issues.)


"Save Page WE" will capture a DOM snapshot to a single HTML file. The only problem is that Data URLs encoded using Base64 are highly bloated.


Isn't that basically Pocket, the service that people complain about endlessly as "bloat"?


Agreed, it seems like it's insanely hard to back up a modern JS-enabled web page in a usable way that results in a single file which can be easily shared.


Have you tried SingleFile? It sounds like what you’re looking for:

https://github.com/gildas-lormeau/SingleFile


Also check out https://archiveweb.page which is open source, local, and lets you export archived data as WARC (ISO 28500). You can embed archives in web pages using their Web Component https://replayweb.page.


Will check it out, thanks.


Regarding PrivateGPT, if I have a 12GB Nvidia 4070 and an 11GB 2080ti, which LLM should I run?

Edited to add: https://www.choosellm.com/ by the PrivateGPT folks seems to have what I needed.


There's a big community discussing exactly that over at https://www.reddit.com/r/LocalLLaMA/.


+1 for r/LocalLLaMA. 23GB should allow you to run ~30B models, but honestly some of the new smaller models such as Mistral & friends (Zephyr etc.) are really interesting. You could also give Mixtral a try if you get a low-quant format such as this Q3: https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-G...


Do you know if there is a website or spreadsheet where I could enter my RAM, GPU, etc. and see what models I could run locally?


Yes, https://www.reddit.com/r/LocalLLaMA/. Just ask there and a bunch of Non-Artificial Intelligent agents will give you that answer ;).


While I agree a website/spreadsheet would be convenient, it's not that complicated. As long as the GPU is handling 50-75% of the LLM's layers, you should get a decent tok/sec speed (unless you're running really, really large models).
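
With llama.cpp-based runners, that offload split is a single knob; a sketch using the llama-cpp-python bindings (the model path is a placeholder, and the right layer count depends on your VRAM):

    from llama_cpp import Llama

    # Offload most layers to the GPU; the remainder runs on the CPU. Tune
    # n_gpu_layers down if you hit out-of-memory errors, up if VRAM allows.
    llm = Llama(
        model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",  # placeholder
        n_gpu_layers=28,   # e.g. ~80% of a 7B model's 32 layers
        n_ctx=4096,
    )
    print(llm("Q: What is retrieval-augmented generation? A:", max_tokens=128)["choices"][0]["text"])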


Could you explain to me (in steps) how I would go about calculating how much VRAM I would need to run, say, Mixtral 8x7B?
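
A rough back-of-envelope, assuming ~47B total parameters for Mixtral 8x7B (all eight experts must be resident in memory even though only two are active per token):

    # Weights: params x bytes-per-param for the chosen quantization
    params = 46.7e9           # total parameters, all experts included
    bytes_per_param = 0.5     # 4-bit quantization = half a byte per weight
    weights_gb = params * bytes_per_param / 1e9   # ~23.4 GB
    # Add the KV cache (grows with context length) plus 1-2 GB of runtime
    # overhead: roughly 25-26 GB total at 4-bit, so 24 GB of VRAM is tight
    # and some layers will usually spill to the CPU.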


you're better than this mozilla. hopping on the ai trend is disgusting given your alleged morals


This seems like complete overkill.

I don't even like having to clear my history and whatnot regularly. I use incognito mode most of the time.

Now I have to monitor what my local AI collects?

"through the lens of privacy" my ass, man.

Why would I ask my browser what the meaning of a life well lived is?


What is happening at Firefox is quite strange. It's like they are walking backwards.


This seems like a sensible step in the right direction, IMO; (optional) features such as local, privacy-respecting LLMs will help augment people's online research, bookmarking, contextual search, etc.

It's important that we have Firefox working on such experiments; otherwise, as Google adds more of their privacy-invading features to Chrome/Chromium, it will likely have a negative impact on people's desire to find alternative browsers.


Yeah, but maybe, if you are constantly losing market share, you should work on things that appeal to a wider audience. Unless you have a trump card and intend to use it as a deus ex machina to suddenly show people you are THE browser, the way forward.


You don't gain market share by doing the same stuff the other FREE alternative does.

You gain market share by doing what they refuse to do, no matter how much it's in the user's interest, because their business is stealing the user's data and yours isn't.


In a just world, that's a way to gain market share. In our world, people concede their data for marginal improvements in the quality of a feature because they can't conceive of how giving up control of their data could come back to harm them. It doesn't feel like there is a downside.


What free alternatives? All the browsers look & feel the same. Zero innovation.


Very misleading name. The word "Memory" has a distinct meaning in relation to computing, but this is more about human memories.


I was going to ignore this as a troll comment, because computer memory has its antecedents in human memory, but the commenter is right: the combination of "memory" and "cache" to talk about human memory seems misleading.


I kind of like the association since it speaks to how text collected while browsing the web can be used to generate new text, which is similar, at least metaphorically, to how human memory is reconstructive and transformative, not perfect recall. https://en.wikipedia.org/wiki/Reconstructive_memory



