
youssefabdelm · 2025-09-14T17:14:01 1757870041

One thing I love about SVGs for websites is their stability across many contexts, browsers, etc. Everything is pinpoint, coordinate based, nothing moves around unexpectedly. You define it once, it's done.

nine_k · 2025-09-15T01:04:40 1757898280

If this is the goal, a PDF website could be even better.

anon1395 · 2025-09-15T10:25:46 1757931946

That is technically possible.

youssefabdelm · 2025-04-21T19:00:23 1745262023

Anyone know if it's possible to fine-tune this to clone my voice?

toebee · 2025-04-22T00:53:23 1745283203

We're adding guides for Zero-shot voice cloning. You can try it using the second example on Gradio: https://huggingface.co/spaces/nari-labs/Dia-1.6B

youssefabdelm · 2025-04-22T13:46:28 1745329588

Will give it a shot, but I feel like fine-tuning will be more reliable. Any way to do that?

youssefabdelm · 2025-04-15T11:04:27 1744715067

Disagree. It's really not complicated at all to me. Not sure why people make a big fuss over this. I don't want an AI automating which AI it chooses for me. I already know through lots of testing intuitively which one I want.

If they abstract all this away into one interface I won't know which model I'm getting. I prefer reliability.

youssefabdelm · 2025-03-28T17:24:28 1743182668

I hope someone can create an open source replica of this work. I see so much potential for features you can come up with.

For example the rhyming example brings to mind a feature where you give the model starting input and ending input and ask it to fill in.

I can imagine it being useful not only in that sense, but also as a way of retroactively arriving at some answer or solution, like tracing the causal chain that leads to a specific answer.

Another idea is to show all possible word variations, and then the middle is rewritten based on the chosen word.

youssefabdelm · 2025-03-23T01:10:50 1742692250

Eh... still kinda fucked up in my opinion. Should be treated equally.

apexalpha · 2025-03-23T06:43:08 1742712188

You cannot ask startups to hold themselves accountable to a higher standard than their entrenched competitors.

javierluraschi · 2025-03-23T03:50:22 1742701822

Why? They are competing against other companies that are outsourcing the same job and probably paying less.

youssefabdelm · 2025-03-23T11:44:09 1742730249

Don't you think this 'logic' is kind of a numb way to look at reality? Not caring about the actual human doing the job... when you've got millions in your pocket? Standard of living in third world countries tends to be worse. Look at every global human wellbeing index in existence and you'll see they're all pretty much the same map rhymed over and over. People suffer on one side so that people can live well on the other side.

It's not like "Oh their quality of life is exactly the same as in the US it's just cheaper! So it's totally fine for us to pay them peanuts! Right? Peanuts buy you a house there... I mean the house has no plumbing but it's a house!" No, their quality of life is absolute crap compared to first world countries (as someone from a third world country). And paying them like shit will only prolong their suffering.

The least the company could do is pay them fairly, so they can travel or live a better-quality life.

youssefabdelm · 2025-03-21T01:17:32 1742519852

Off-topic but DALL-E has turned the web into slop-city. What a mess. Everything looks the same, cheap and ugly and stupid.

youssefabdelm · 2025-03-20T20:08:41 1742501321

Jeff you know what would be magical? Not just vanilla diarization "Speaker 1" and "2" but if the model can know from the conversation this speaker was referred to as "Jeff Harris" or "Jeff" so it uses that instead.

youssefabdelm · 2025-03-21T01:19:01 1742519941

Or if we could even provide samples of what an example speaker sounds like in general so that it would always classify them the way we want.

youssefabdelm · 2025-03-20T20:03:44 1742501024

Def prefer the pricing but so far on 4o, no timestamps or diarization sadly

youssefabdelm · 2025-03-02T10:53:19 1740912799

Yeah the eagerness to please thing feels like it carried over from the LLMs or something cause they're like that too.

youssefabdelm · 2025-02-28T09:11:29 1740733889

Yep. I've often said RLHF'd LLMs seem to be better at recognition memory than recall memory.

GPT-4o will never offhand, unprompted and 'unprimed', suggest a rare but relevant book like Shinichi Nakazawa's "A Holistic Lemma of Science" but a base model Mixtral 8x22B or Llama 405B will. (That's how I found it).

Most of the RLHF'd models seem biased towards popularity over relevance when it comes to recall. They know about rare people like Tyler Volk... but they will never suggest them unless you prime them really heavily for them.

I couldn't agree more with your point on recommendations from humans. Humans are the OG and undefeated recommendation system in my opinion.