Do you think this would be appropriate for a command line tool that hits various APIs for its function calls? E.g. "what's the weather in SF tomorrow?" or "daily price change of Apple and Tesla stock for the past week"? (Let's assume I have documented the APIs thoroughly somewhere the model has access to, or fine-tuned it on this data.)
Hi, also on the FunctionGemma team! Something like this would be a good use case for the model. Depending on how complicated the API is, you might need to finetune it (we released a Colab that guides you through the experience + how to export/run it locally). Generally, better tool descriptions help, although if it is something very complicated, finetuning would be better.
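To make "tool descriptions" concrete, the weather example could look something like the sketch below. The function name, parameter schema, and JSON-schema-style format are all illustrative assumptions, not FunctionGemma's actual wire format:

```python
# Hypothetical tool description for the weather example. All names and
# the schema layout here are illustrative, not FunctionGemma's real format.
get_weather_tool = {
    "name": "get_weather",
    "description": "Get the weather forecast for a city on a given date.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'San Francisco'"},
            "date": {"type": "string", "description": "ISO date, e.g. '2025-01-02'"},
        },
        "required": ["city", "date"],
    },
}

# The model's job is then to map "what's the weather in SF tomorrow?" to a
# structured call like:
#   {"name": "get_weather", "arguments": {"city": "San Francisco", "date": "..."}}
# which the CLI dispatches to the real API.
```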
Does anyone know what the state-of-the-art industry solvers do for these problems? I had dabbled a bit in ML approaches to combinatorial optimization with great interest a few years back, but I don't think any of these RL-based methods ended up being used in production.
The state-of-the-art solvers are the proprietary ones like Gurobi, FICO, CPLEX, Mosek, etc. A major part of the proprietary "sauce" is in the heuristics they use. For example, all solvers have a "presolve" phase which attempts to eliminate redundant constraints/variables. There may be some ML they are using behind the scenes to derive these heuristics, I'm not sure, although I know it is a major research area.
Otherwise, the basic underlying algorithms are all the same, as in the textbook: branch-and-bound and so on.
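To make the textbook part concrete, here's a minimal branch-and-bound sketch for 0/1 knapsack, using a fractional (LP-relaxation-style) bound to prune. The commercial solvers layer presolve, cutting planes, and tuned heuristics on top of this same skeleton:

```python
# Minimal branch-and-bound for 0/1 knapsack: the textbook skeleton under
# solvers like Gurobi/CPLEX, minus presolve, cuts, and heuristics.

def fractional_bound(items, i, cap, value):
    """Upper bound: greedily fill remaining capacity, allowing fractions."""
    for v, w in items[i:]:
        if w <= cap:
            cap -= w
            value += v
        else:
            return value + v * cap / w  # fractional piece (relaxation)
    return value

def knapsack(items, capacity):
    # Sort by value density so the greedy bound is valid and tight.
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    best = 0

    def branch(i, cap, value):
        nonlocal best
        best = max(best, value)
        if i == len(items):
            return
        # Prune: if even the relaxed bound can't beat the incumbent, stop.
        if fractional_bound(items, i, cap, value) <= best:
            return
        v, w = items[i]
        if w <= cap:
            branch(i + 1, cap - w, value + v)  # take item i
        branch(i + 1, cap, value)              # skip item i

    branch(0, capacity, 0)
    return best

print(knapsack([(60, 10), (100, 20), (120, 30)], 50))  # 220
```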
For the sake of comparison, you can train a 124M model on a 3090 (see nanoGPT). In that case, each batch ends up being about 500,000 tokens and takes roughly 10 seconds to run forward and backward. At that rate, the 6 trillion tokens this model was trained on would take about 4 years. Or just "too long", for a shorter answer.
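Spelling the arithmetic out (all numbers are the rough assumptions above):

```python
# Back-of-envelope estimate; every input here is a rough assumption.
tokens_per_batch = 500_000
seconds_per_batch = 10            # forward + backward, 124M params on a 3090
tokens_per_second = tokens_per_batch / seconds_per_batch  # 50,000 tok/s

total_tokens = 6e12               # 6 trillion training tokens
seconds = total_tokens / tokens_per_second
years = seconds / (60 * 60 * 24 * 365)
print(f"{years:.1f} years")       # ~3.8 years
```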
The word "reasonable" is vague, but assuming you mean something that could be run in a residential unit, it would take a very long time if training from pure scratch.
This is part of the rationale for releasing this model. Now you don't have to start from scratch, and finetuning is feasible on a wide variety of hardware, including modest GPU setups (and smaller).
True in most countries. The president, or more generally the chief of the executive, often has legal immunity. It makes some sense, because they are the law, at least in part.
In democracies there is usually some protection against abuse of that power (e.g. impeachment).
You're conflating a president (highest executive) with a monarch. Perhaps on purpose, given current goings-on, but a key distinction between monarchies and democracies is precisely that all people in the country are subject to the same laws and there is no sovereign immunity.
The Monarch also needs permission from the Mayor of the City of London to enter the city, so we do need to make a distinction between de jure and de facto law here.
> It is sometimes asserted that the Lord Mayor may exclude the monarch from the City of London. This legend is based on the misinterpretation of the ceremony observed each time the sovereign enters the City at Temple Bar, when the Lord Mayor presents the City's Pearl Sword to the sovereign as a symbol of the latter's overlordship. The monarch does not, as is often purported, wait for the Lord Mayor's permission to enter the City. When the sovereign enters the City, a short ceremony usually takes place where the Lord Mayor presents a sword to the monarch, symbolically surrendering their authority. If the sovereign is attending a service at St Paul's Cathedral this ceremony would take place there rather than at the boundary of the City, simply for convenience.
From what I've heard, the llama3 models are fairly easy to finetune (please correct me if I'm wrong or if there are more amenable models). How easy is it to finetune smollm3? I know a lot of the MoE LLMs have been quite fickle in this regard.
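For reference, the path I'd expect is a LoRA-style finetune, something like this sketch with transformers + peft. The model id, target modules, and hyperparameters are guesses on my part; check the model card for the real values:

```python
# Minimal LoRA finetuning sketch. Assumptions: the HF repo id, the
# target_modules list, and all hyperparameters; consult the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "HuggingFaceTB/SmolLM3-3B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

lora = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # assumption; depends on the architecture
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only a small fraction of weights train

# ...then train with transformers.Trainer or trl's SFTTrainer as usual.
```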
I think this is still an incredible outcome, given how many dice rolls you can take in parallel with multiple claude/o3/gemini attempts at a problem with slightly different prompts. Granted, each rollout does not come for free given the babysitting you need to do, but the cost is much lower than going down the path yourself or having junior colleagues make the attempt.
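Mechanically, the parallel dice rolls can be as simple as the sketch below. run_attempt and score are hypothetical placeholders for the actual LLM call and whatever verification you do:

```python
import random
from concurrent.futures import ThreadPoolExecutor

def run_attempt(prompt: str) -> str:
    # Hypothetical placeholder: one claude/o3/gemini attempt at the problem.
    # Swap in a real API call here.
    return f"candidate solution for: {prompt!r}"

def score(result: str) -> float:
    # Hypothetical placeholder for the babysitting step: run the tests,
    # eyeball the diff, etc. Random here, just to keep the sketch runnable.
    return random.random()

base = "Fix the flaky test in test_upload.py."  # example task
variants = [base, base + " Focus on race conditions.", base + " Keep the diff minimal."]

# Roll the dice in parallel and keep the best-scoring attempt.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(run_attempt, variants))
print(max(results, key=score))
```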
I guess some prefer to stick with the stdlib instead of third party libs.
Also, dataclasses feel more straightforward and less "magic" to me, in the sense that they are more or less "just" a way to avoid boilerplate in class definitions, while pydantic does way more "magic" stuff like de-/serialization and validation, and adds numerous methods and attributes to the classes.
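A small illustration of the difference (pydantic v2 shown; the type coercion and built-in serialization are the "magic" I mean):

```python
from dataclasses import dataclass
from pydantic import BaseModel

@dataclass
class UserDC:
    id: int

class UserPD(BaseModel):
    id: int

UserDC(id="1")               # accepted silently: dataclasses don't validate types
UserPD(id="1")               # pydantic coerces "1" -> 1 (and rejects "abc")
UserPD(id="1").model_dump()  # {'id': 1} -- serialization comes built in
```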
When I started to implement typedload, when type hints had just been introduced, I supported NamedTuple, and then, as more things were added, also attrs, dataclasses, TypedDict…
What would be the point of requiring users to migrate their whole codebase to something different just to use your library?
On the other hand, if you wrote your code from scratch to use BaseModel, you're pretty much stuck with pydantic.
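That's the point of working against the types you already have. A minimal sketch of typedload with a plain stdlib dataclass:

```python
from dataclasses import dataclass
import typedload

@dataclass
class Point:
    x: int
    y: int

# No base class required: load/dump work on stdlib types directly.
p = typedload.load({"x": 1, "y": 2}, Point)  # Point(x=1, y=2)
d = typedload.dump(p)                        # {'x': 1, 'y': 2}
```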