Was just talking about this on reddit like two days ago
Instead of our data going to the models, we need the models to come to our data, which is stored locally and stays local.
While there are many OSS tools for loading personal data, they don't do images or videos. In the future everyone may get their own model, but for now the tech is there while the product/OSS is missing for everyone to get their own QLoRA or RAG or summarizer.
Not just messages/docs: what we read or write, and our thoughts, are part of what makes an individual unique. Our browsing history tells a lot about what we read, but no one seems to make use of it other than Google for ads. Almost everyone has a habit of reading a particular news site, a particular social network, particular YouTube videos, etc. "OK, here is today's summary for you from those three."
Was just watching this yesterday https://www.youtube.com/watch?v=zHLCKpmBeKA and thought: why, after almost 30 years, do we still not have a computer secretary like her, one who is a step ahead of us?
Models like Salesforce BLIP can be used to generate captions for images too - I built a little CLI tool for that here: https://github.com/simonw/blip-caption
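For anyone curious what the underlying call looks like, here's a minimal Python sketch using the Hugging Face transformers BLIP classes (this isn't the linked CLI itself, and the image path is a placeholder):

    # Caption a local image with Salesforce BLIP via Hugging Face transformers.
    # Requires torch, transformers, and Pillow.
    from PIL import Image
    from transformers import BlipProcessor, BlipForConditionalGeneration

    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

    image = Image.open("photo.jpg").convert("RGB")   # placeholder image path
    inputs = processor(images=image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    print(processor.decode(out[0], skip_special_tokens=True))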
CogVLM blows LLaVA out of the water, although it needs a beefier machine (quantized low-res version barely fits into 12GB VRAM, not sure about the accuracy of that).
I have no actual knowledge in this area, so I'm not sure if it's entirely relevant, but an update from the 7th of December on the CogVLM repo says it now works with 11GB of VRAM.
> Instead of our data going to the models, we need the models to come to our data, which is stored locally and stays local.
We are building this over at https://software.inc! We collect data about you (from your computer and the internet) into a local database and then teach models how to use it. The models can either be local or cloud-based, and we can route requests based on the sensitivity of the data or the capabilities needed.
Am I cynical thinking the opposite? I can’t imagine they got that domain for a song. Spending a pile of cash on vanity such as that is a real turn off for me; it signals more flash than bang. Am I wrong to think this?
That's because the company is more or less in stealth/investigatory mode. It's the same team that built Workflow which was acquired by Apple and then turned into Shortcuts.
Just having an archiver that gives you a traditional search over every webpage you've loaded (forget the AI stuff) would be a major advance.
I don't know about everyone, but a majority of my searches are for stuff I've seen before, and they're often frustrated by things that have gone offline, are downranked by search engines (e.g. old documentation on HTTP-only sites), or are buried by SEO.
I believe that's exactly what GitHub Copilot does. It first scans and indexes your entire codebase, including dependencies (I think). So when it auto-completes, it heavily uses the context of your code, which is what actually makes Copilot so useful.
You're absolutely right about models coming to our data! If we could have Copilot-like intelligence, completely on-device, scanning all sorts of personal breadcrumbs like messages, browsing history, even webpage content, it would be a game-changer!
> Our browsing history tells a lot about what we read, but no one seems to make use of it other than Google for ads. Almost everyone has a habit of reading a particular news site, a particular social network, particular YouTube videos, etc. "OK, here is today's summary for you from those three."
I was imagining something a little more ambitious: a model that uses our search history and behavior to derive how best to compose a search query. Bing Chat's search queries look like what my uncle would type right after I explained to him what a search engine is. Throw in some advanced operators like site: or filetype:, or at least parentheses along with AND/OR. Surely we can fine-tune it to emulate the search processes of the most impressive researchers, paralegals, and teenagers on the spectrum who immediately fact-check your grandpop's Ellis Island story, with evidence he both arrived first and was naturalized in Chicago.
> Instead of our data going to the models, we need the models to come to our data, which is stored locally and stays local.
That's the most important idea I've read since ChatGPT / last year.
I'll wait for this. Then build my own private AI. And share it / pair it for learning with other private AIs, like a blogroll.
As always, there will be two 'different' AIs: a.) the mainstream, centralized, ad/revenue-driven, capitalist, political, controlling / exploiting etc. b.) personal, trustworthy, polished on peer networks, fun, profitable for one / a small community.
If, by chance, commercial models end up better than open-source models due to better access to computing power / data, please let me know. We can go back to the SETI@home model and share our idle computing power / existing knowledge.
I assume that training LLMs locally requires high-end hardware. Even running a model requires a decent CPU or, better, a high-end GPU, though that is not as expensive as training one. And usually you have to use hardware that is available in the cloud, so not much privacy there.
This works if the document plus prompt fit in the context window. I suspect the most popular task for this workflow is summarization, which presumably means large documents. That's when you begin scaling out to a vector store and implementing those more advanced workflows. Sending a large document to certain local models does work, but even with the highest-tier MacBook Pro a large document can quickly choke any LLM and bring inference speed to a crawl. Meaning a powerful client is still required no matter what. Even if you generate embeddings in "real time" and dump them to a vector store, that process would be slow on most consumer hardware.
If you're passing in smaller documents, it works pretty well for real-time feedback.
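To make the "scale out to a vector store" step concrete, here's a rough sketch of that local pipeline: chunk a large document, embed the chunks with sentence-transformers, and feed only the best-matching chunks into the prompt. The model name, chunk sizes, and brute-force search are illustrative choices, not any particular product's implementation:

    # Chunk, embed, and retrieve locally so only relevant text hits the LLM's context window.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")  # small embedding model, runs fine on CPU

    def chunk(text, size=1000, overlap=200):
        step = size - overlap
        return [text[i:i + size] for i in range(0, len(text), step)]

    document = open("big_report.txt").read()                      # placeholder path
    chunks = chunk(document)
    embeddings = model.encode(chunks, normalize_embeddings=True)  # shape: (n_chunks, dim)

    query = "What were the key findings?"
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = embeddings @ q                                       # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:3]
    context = "\n\n".join(chunks[i] for i in top)                 # only this goes into the local LLM's prompt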
Yes, we should have local models in addition to remote models. Remote ones are always going to be more capable, and we shouldn't throw that away. Augmentation is orthogonal - you can augment either of these with your own data.
Good idea. Mozilla gets a lot of rightful hate for their mishandling of FF and their political preaching, but I believe they are still capable of developing tech that is both privacy-preserving and user-friendly at the same time.
I use the offline translator built into FF regularly, and it's magic. I would never have thought something like that could run locally without a server park's worth of hardware thrown at it.
Here's hoping this experiment turns out the same way.
My browsing usage isn't relevant for this. I don't want to "chat" with my browsing history. I would simply love for my browser to index my bookmarks on my OS so I could search the actual content of those bookmarks.
The feedback loop gained from ChatGPT will, I assume, always be way better than that of my local GPT equivalent.
But often I bookmark pages where I know the information there is important enough for me to come back to more than once.
So I have started crafting a solution for this. It crawls the bookmarks in your local browser storage, downloads those pages, and adds them to the search index on your OS.
Small data sets suffer from bad recall in full-text search, so a bit of smart fuzziness added to the search by AI could improve the experience of locally indexed bookmarks quite a bit.
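A hedged sketch of what the crawl-and-index step could look like (the places.sqlite schema details, file paths, and crude tag stripping are assumptions for illustration, not a finished tool): read bookmark URLs out of Firefox's places.sqlite, fetch each page, strip the markup, and drop the text into a local SQLite FTS5 index.

    # Index bookmarked pages into a local SQLite full-text index.
    import re, sqlite3, urllib.request

    PLACES = "places.sqlite"           # a copy of your Firefox profile's places.sqlite
    INDEX = "bookmarks_index.sqlite"   # the local full-text index

    src = sqlite3.connect(PLACES)
    urls = [row[0] for row in src.execute(
        "SELECT p.url FROM moz_bookmarks b JOIN moz_places p ON b.fk = p.id WHERE b.type = 1"
    )]

    idx = sqlite3.connect(INDEX)
    idx.execute("CREATE VIRTUAL TABLE IF NOT EXISTS pages USING fts5(url, body)")

    for url in urls:
        try:
            html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "ignore")
        except Exception:
            continue
        text = re.sub(r"<[^>]+>", " ", html)   # crude tag stripping, good enough for a sketch
        idx.execute("INSERT INTO pages VALUES (?, ?)", (url, text))
    idx.commit()

    # Full-text query over everything you've bookmarked:
    for (url,) in idx.execute("SELECT url FROM pages WHERE pages MATCH ?", ("vector search",)):
        print(url)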
I would sure love a way to "chat" with my browsing history and page content. Is there any way to automatically save off pages that I've visited for later processing? I looked a decade or more ago and didn't really find a good solution.
I think WorldBrain (https://github.com/WorldBrain/Memex) promises this. While I'm also excited by the idea, I think there was some reason I ended up not using it.
Classic bookmarks have failed because mnemonic organization doesn't scale. This kind of interface does, and can replace it entirely if done right.
Thinking of it, something like this can be used for all your local files as well, acting as a better version of the old filesystem-as-a-database idea. Or for a specific knowledge base (think LLM-powered Zotero).
Instead of doing lots of back-n-forth with the giants, enriching them with each prompt, you get a smaller local model that's much more respectful of your privacy.
That's an operating model I am willing to do some OSS contributions to, or even bankroll.
Gotta love the underdogs, even if admittedly, I am not a big Mozilla org fan.
It’s what Apple’s been doing for a few years, though it remains unclear how much of that is “AI”. So it makes sense that someone else would enter that niche.
Maybe it's just me, since I lived through the Firefox OS era as an intern: this feels like a possible path back to offering a Mozilla-built OS in the future. They said the Internet was born to connect people, but building everything into a browser is not the optimal way to add all this fancy stuff. Firefox OS was basically a small Linux kernel plus Gecko, with HTML5 for rendering. So, much like iOS and iPadOS, Mozilla could offer a similar OS for devices/platforms. I mean, for the past 5 years they have invested in AR and VR. So I won't be surprised if they eventually bet on another Firefox OS…
One thing you'll see in a lot of these LLM examples and demos is intentionally subjective queries, so they can't be judged on pass/fail criteria.
For example, you'll see things like "where should I visit in Japan?" or "how should I plan a bachelor party?", because there is a huge variety of answers that are all "correct", regardless of how much you disagree with them. There is also a huge number of examples for the model to draw from, especially compared to something as specific as your browsing history.
They could provide a local URL called "about:wrapped" that gives a summary of your usage, like Spotify Wrapped: the top 100 sites, and you could click on a site for more info like what pages you visited, when, how often, etc.
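Much of the raw material for that already exists: Firefox keeps visit counts in places.sqlite, so a top-sites report is roughly one query away. A hedged sketch (the table and column names reflect my understanding of the profile database, not anything Mozilla has announced for such a feature):

    # Print a Spotify-Wrapped-style "top 100 sites" report from Firefox history.
    import sqlite3
    from collections import Counter
    from urllib.parse import urlparse

    con = sqlite3.connect("places.sqlite")   # a copy from your Firefox profile directory
    counts = Counter()
    for url, visits in con.execute("SELECT url, visit_count FROM moz_places WHERE visit_count > 0"):
        counts[urlparse(url).netloc] += visits

    for site, visits in counts.most_common(100):
        print(f"{visits:6d}  {site}")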
Does it ingest the ads contained in the web pages as well? This would be a major concern: at the least, ads will pollute the model with unwanted information. At worst they'd be a security concern, used for indiscriminate or even targeted manipulation of the model. Advertisers do that to our brains; it's all they do. So why shouldn't they try it with LLMs scraping/being fed from the web?
I wish they would fix basic features such as downloading pictures in Firefox for Android. Often, long-pressing an image on your screen opens a context menu that does not allow downloading, only following the link associated with the image.
What I would love to see is this model being able to learn to automate some tasks that I usually do, e.g. sign up for events, buy tickets, etc. If it had access to your login details and could log in, it could be a great assistant!
I hope this encourages Mozilla to focus more on page-archiving support on the web. I feel as though they missed a huge opportunity by not making it easy to archive pages with DOM snapshots, or to snag videos and images. (Go to Instagram and try to right-click -> download an image; you can't.) It would have been a very good way to differentiate from Chrome, as Google wouldn't want that available for YouTube. And "our browser can download videos and images from anywhere" is pretty easy to sell to potential users.
(I suspect it's because this goes against the wants of some of the biggest players, who have an incentive to make us leave as many online footprints as possible?)
Even here, Mozilla recommends converting to PDF for easier (?!?) human readability. Except PDF is a very bad format for digital documents, with no support for reflow and very bad support for multimedia. (PDF is perhaps good for archival of offline documents, despite its other issues.)
Agree, it seems like it’s insanely hard to back up a modern JS-enabled web page in a usable way that results in a single file which can be easily shared.
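One partial, hedged workaround (assuming Playwright; it only captures the rendered DOM, so images and fonts stay as remote references, unlike a proper WARC archive): render the page headlessly so the JS has run, then save the resulting HTML as a single file. The URL and output path are placeholders:

    # Save a post-JavaScript DOM snapshot of a page to a single HTML file with Playwright.
    from playwright.sync_api import sync_playwright

    url = "https://example.com/some-article"   # placeholder

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        html = page.content()                  # serialized DOM after scripts have run
        browser.close()

    with open("snapshot.html", "w", encoding="utf-8") as f:
        f.write(html)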
Also check out https://archiveweb.page which is open source, local, and lets you export archived data as WARC (ISO 28500). You can embed archives in web pages using their Web Component https://replayweb.page.
+1 for r/LocalLLaMA. 23GB should allow you to run ~30B models, but honestly some of the new smaller models such as Mistral & friends (Zephyr, etc.) are really interesting. You could also give Mixtral a try if you get a low-quant format such as this Q3 - https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-G...
While I agree a website/spreadsheet would be convenient, it's not that complicated.
As long as the GPU is handling 50-75% of the LLM's layers, you should get a decent tok/sec speed (unless you're running really, really large models).
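If it helps, here's a minimal sketch of that layer-offloading setup using llama-cpp-python with a quantized GGUF file; the model path and layer count are just examples, and you'd raise n_gpu_layers until you run out of VRAM:

    # Run a quantized model locally, offloading part of the layers to the GPU.
    from llama_cpp import Llama

    llm = Llama(
        model_path="mixtral-8x7b-instruct-v0.1.Q3_K_M.gguf",  # e.g. a low-quant build like the one linked above
        n_gpu_layers=20,   # number of layers kept on the GPU; -1 offloads everything
        n_ctx=4096,
    )

    out = llm("Summarize today's browsing history in three bullet points.", max_tokens=256)
    print(out["choices"][0]["text"])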
This seems like a sensible step in the right direction, IMO. (Optional) features such as local, privacy-respecting LLMs will help augment people's online research, bookmarking, contextual search, etc.
It's important that we have Firefox working on such experiments; otherwise, as Google adds more of their privacy-invading features to Chrome/Chromium, it will likely negatively impact people's desire to find alternative browsers.
Yeah, but maybe, if you are constantly losing market share, you should work on things that appeal to a wider audience. Unless you have a trump card and intend to use it as a deus ex machina to suddenly show people you are THE browser, the way forward.
You don't gain market share by doing the same stuff the other FREE alternative does.
You gain market share by doing what they refused to do, no matter how much it's in the user's interest, because their business is stealing the user's data and yours isn't.
In a just world, that's a way to gain market share. In our world, people concede their data for marginal improvements in the quality of a feature because they can't conceive of how giving up control of their data could come back to harm them. It doesn't feel like there is a downside.
I was going to ignore this as a troll comment, because computer memory has its antecedents in human memory, but the commenter is right: using the combination of "memory" and "cache" to talk about human memory seems misleading.
I kind of like the association since it speaks to how text collected while browsing the web can be used to generate new text, which is similar, at least metaphorically, to how human memory is reconstructive and transformative, not perfect recall. https://en.wikipedia.org/wiki/Reconstructive_memory