ilovefood's comments (Hacker News)

Looks pretty cool, congrats so far! Do you allow downloading the fine-tuned model for local inference?


Thank you, and yes that is possible. Which model are you looking to fine-tune?


If that's the case then I'll try the platform out :) I want to fine-tune Codestral or Qwen2.5-coder on a custom codebase. Thank you for the response! Are there docs or info about the compatibility of the downloaded models, i.e. will they work right away with llama.cpp?


We don't support Codestral or Qwen2.5-coder right out of the box for now, but depending on your use-case we certainly could add it.

We utilize LoRA for smaller models, and qLoRA (quantized) for 70b+ models to improve training speeds, so when downloading model weights, what you get is the weights & adapter_config.json. Should work with llama.cpp!


Looks great :) It's exactly what I built a year ago: LLM-in-the-middle / client-side content filtering using LLMs.

- https://karimjedda.com/llms-in-the-middle-content-aware-clie...


Loved the blog. I also built a version that used a local LLM, but wasn't sure how to package that for a wider audience, so I went with OpenAI instead :)


I was interested in giving it a shot, but I didn't see any setup instructions or anything I can download from that page. Is your extension available for use?


It isn't; the blog was just a high-level recap of what I built, as I unfortunately don't have enough time to turn it into a usable product.

I'm very happy to see that someone is building it though, I really think that personal AIs/LLMs are the future.


You should still slap the code on GitHub with a disclaimer that it's not stable/fully featured.


For solo projects, my code is the project planning. I use a bunch of #TODO or //TODO comments in different files throughout each project's codebase, which a small Python script picks up and compiles into a single HTML file.
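A hypothetical sketch of such a script (the regex, file handling, and HTML layout here are my own assumptions, not the commenter's actual code):

```python
# Walk a project tree, grab every "#TODO"/"//TODO" line, and emit one HTML list.
import os
import re

TODO_RE = re.compile(r"(?:#|//)\s*TODO[:\s]*(.*)")

def collect_todos(root):
    """Return (path, line_number, text) tuples for every TODO found under root."""
    todos = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as fh:
                    for lineno, line in enumerate(fh, 1):
                        m = TODO_RE.search(line)
                        if m:
                            todos.append((path, lineno, m.group(1).strip()))
            except (UnicodeDecodeError, OSError):
                continue  # skip binaries and unreadable files
    return todos

def render_html(todos):
    """Compile the collected TODOs into a single HTML page."""
    items = "\n".join(
        f"<li><code>{path}:{lineno}</code> {text}</li>"
        for path, lineno, text in todos
    )
    return f"<html><body><ul>\n{items}\n</ul></body></html>"
```

Running it on a project root and writing the result to a file gives a one-page overview of everything left to do.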

As I'm working on a lot of projects, the important thing for me is keeping it easy to start/stop working on a project, and making sure that when I do have time to make a pass on one I don't also have to deal with the headache of keeping the codebase and my todos in sync.

Not sure if this is really helpful though. Good luck!


Hi all, after a lot of feedback I've pushed the new version of my Data Engineering with Rust online book.

It's free for now and I've ported the first chapters from the previous version. Would love to hear what you think!

Karim


Beautiful and well-written article. Unfortunately I made a small mistake in the second quiz :P


I love Pandoc!

I recently learned you can use Lua to write custom plugins and change some of the conversion behavior. I'm using it, for example, to create slides similar to the "sent" program.

It helps me bootstrap new presentations and talks very quickly: https://github.com/KarimJedda/justslides


The article is good, and I get the point. However, from experience, this:

> POST on /queries/enlisted-students-on-joining-date/version/1 { "date": "2023-09-22" } to retrieve all students that joined on a given date.

always ends up in a complete and absolute mess, where every possible query gets its own random name and different parameters, ending up with duplicates all over the place. Also, while it is possible to cache the responses to these POST requests, it's additional work and more friction compared to REST. I'm not convinced the tradeoff is worth it in this particular case.

All in all, I don't know if the current proposal can be an alternative either, but the idea of a "REST-lite" goes in the right direction and it's a great start.

However, this is a complete no for me:

> When a requested student is nonexistent, your API can return 200 OK HTTP status code with a { "user": null, "message": "No user exists with the specified ID" } response body.

If nothing is found, I want a 404! :D


I like the convention:

- 404 if resource requested by id

- 200 with empty list of results if it was a 'search' type request with params (not referring directly to an id)



I still prefer 200 with an empty list for the 'no results' case. The same client code works, rather than having to code for 204.

- If I query for an identifier which doesn't exist, the server replies 404; this should fall into my error handling, as there's a data inconsistency

- If I query, say, ?category=automotive&price=1 and I get a 200 containing a JSON body with an empty list of matches, then my client code can handle 0 matches just like 1 or 10, with no special handling.
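A minimal Python sketch of that client-side convention (the handler names and response shapes are hypothetical, not from any API discussed here):

```python
# Client-side handling under the convention above:
# id lookup -> 404 is a real error; search -> 200 with an empty list is normal.
def handle_by_id(status, body):
    """Lookup by identifier: a 404 signals a data inconsistency, so raise."""
    if status == 404:
        raise LookupError("no resource with that id")
    if status != 200:
        raise RuntimeError(f"unexpected status {status}")
    return body

def handle_search(status, body):
    """Search with params: 0 matches needs no special-casing, it's just []."""
    if status != 200:
        raise RuntimeError(f"unexpected status {status}")
    return body.get("results", [])
```

The point is that the same search code path handles the empty, single, and many-result cases identically, while a missing id still surfaces as an error.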


This is the standard we have company-wide and it works pretty well, unless you make a mistake in the URI (like picking the wrong API version). However, all APIs will return some info at their root, so it's easy to distinguish and troubleshoot.


404 means that there is no resource at the requested URI. But in the case you're discussing, there is a resource at the requested URI; it's just a resource that says there's no user there, instead of a resource that gives data for a user.

In other words, using 404 the way you describe conflates two different things: an invalid user URI (maybe the range of possible user IDs is restricted, or you typed in the URI wrong) and a valid user URI that just doesn't have a user there at the time you made your request. But you probably don't want those two things to be conflated; you want them to be distinguished. That's the article's point.


> There is promise in constraining output to be valid JSON. One new trick that the open-source llama.cpp project has popularized is generative grammars

This has been working for months now and is the best method for this type of task, a thing for moat-lovers. Too bad it wasn't explored here; the text-based methods turned out to be mostly an unreliable waste of time.
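For a sense of what this looks like, here is a simplified sketch in llama.cpp's GBNF grammar notation (the rule names and the forced output shape are illustrative; the json.gbnf grammar shipped with llama.cpp is the full version):

```
# Minimal GBNF sketch: constrain the model to emit a {"answer": "..."} object
root   ::= "{" ws "\"answer\"" ws ":" ws string ws "}"
string ::= "\"" [^"\\]* "\""
ws     ::= [ \t\n]*
```

A grammar file like this can be passed to llama.cpp via its --grammar-file option, after which the sampler simply cannot produce tokens that fall outside the grammar.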


I'm using it to filter out the content that's displayed in my browser screen as I browse: https://karimjedda.com/llms-in-the-middle-content-aware-clie...

Essentially, I wrote a small browser extension that takes the content of LinkedIn, Twitter, and YouTube posts/titles and filters them out based on whether they are clickbait, low effort, etc.

It's liberating :D


The thing that made the initial ChatGPT refreshing was the lack of ads; it wasn't trying to sell you anything. This obviously will not continue; commercial pressures will direct AI efforts towards being a better ad pusher.

So the AI of the social media sites will end up trying to get the crap past your local AI filters, in a big AI arms race :)


I would say, bring it on! Nothing will make it past my phi-2 or mistral-7B-v0.1 ^^, at least for now.

I think what this could lead to is a homogenization of the content-serving layer, since all you'd really need is to get content to the user, who can move their filters from one site to another, the display layer becoming less relevant (and less differentiating). But let's see, exciting times.


That's awesome, I want to do something similar: categorize the content in social media, so I can choose what to see when I want. Sometimes I want to avoid politics, sometimes I'm ok with it, for example. Sometimes I want to see only content about game development.

What's your plan with your project: will you turn it into a product for others, open source it, or neither? I would love it if it were one of the first two!


Thank you very much for the supporting words; I've been getting lots of positive feedback on this. The end-of-year workload means I need to be mindful of my time, though. I think one of the first two options will be the way to go. I'll post an update here, as I always do with my small projects.


A video of before and after would do wonders.

Also, if this could show stats and graphs on the topics the user has been exposed to and what has been blocked out it would be amazing.


Amazing, I was eagerly waiting for this one. Loading extensions in previous DuckDB-WASM releases didn't work seamlessly. Looks like now it does :D

ref: https://github.com/duckdb/duckdb-wasm/issues/1542#issuecomme...

Thanks!!

