I will be crucified for this, but I think you are doing it wrong.
I would split it in 2 steps.
First, just move it to Svelte, maintaining the same functionality, and ideally wrap it in some tests. As mentioned, you want something that can be used as a pass/no-pass filter, as in: yes, the migration did not change the functionality.
Then, apply another pass to go from bad-quality Svelte to good-quality Svelte.
Here the trick is that "good quality" is quite subjective and varies a lot between codebases. I have found the models not quite able to grasp what "good quality" means in a given codebase.
For the second pass, ideally you would feed in examples of good modules from your codebase to follow, plus a description of what you think is important.
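To make the first pass concrete, here is a minimal sketch of the kind of pass/no-pass test I mean, using Vitest and @testing-library/svelte. Counter.svelte, its props, and the roles it exposes are made-up placeholders; the point is that the assertions encode the legacy behavior, so a green run means the port did not change it.

```typescript
// Hypothetical regression test for a component migrated to Svelte.
// The assertions mirror the pre-migration behavior: if they pass,
// the port is a "pass"; if not, the model changed the functionality.
import { describe, it, expect } from 'vitest';
import { render, screen, fireEvent } from '@testing-library/svelte';
import Counter from './Counter.svelte'; // placeholder component name

describe('Counter (post-migration)', () => {
  it('renders the same initial state as the legacy widget', () => {
    render(Counter, { props: { start: 5 } });
    expect(screen.getByRole('status').textContent).toBe('5');
  });

  it('increments exactly like the legacy widget did', async () => {
    render(Counter, { props: { start: 5 } });
    await fireEvent.click(screen.getByRole('button', { name: 'Increment' }));
    expect(screen.getByRole('status').textContent).toBe('6');
  });
});
```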
My partner and I have been working to invert the overall model.
She started grading the conversations that the students have with LLMs.
From the questions that the students ask, it is obvious who knows the material and who is struggling.
We do have a custom setup: she creates a homework assignment, and there is a custom prompt to stop the LLM from answering the homework questions outright. But that's pretty much it.
The results seem promising, with students spending 30 minutes or so going back and forth with the LLM.
If any educator wants to try it or is interested in more information, let me know and we can see how we might collaborate.
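For anyone curious what the guardrail looks like, here is a rough sketch of the idea, not our actual setup; the prompt wording and the model name are illustrative, and it uses the stock OpenAI SDK rather than whatever LMS plumbing you have.

```typescript
// Illustrative tutor endpoint: the system prompt forbids giving away
// answers and steers the model toward Socratic hints instead.
import OpenAI from 'openai';

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const TUTOR_PROMPT = `You are a tutor for this homework assignment.
Never state the final answer to any homework question, even if asked directly.
Instead, ask what the student has tried, point at the relevant concept,
and give hints that move them one step forward.`;

async function tutorReply(studentMessage: string): Promise<string> {
  const res = await client.chat.completions.create({
    model: 'gpt-4o-mini', // placeholder; any chat model works
    messages: [
      { role: 'system', content: TUTOR_PROMPT },
      { role: 'user', content: studentMessage },
    ],
  });
  return res.choices[0].message.content ?? '';
}
```

The full back-and-forth transcript is what gets graded afterwards, so beyond this the setup only needs to log the conversation.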
This makes some sense, but my first question would be how do you define a clear, fair grading rubric? Second, this sounds like it could work for checking who is smart, but can it motivate students to put in work to learn the material?
I believe the broader question would be whether a free market is always USEFUL and DESIRABLE for individuals and the community as a whole. And what freedom is when individual and community interests are not necessarily the same.
What you're really asking is whether fundamental individual human rights are desirable for individuals and the community as a whole, which is of course a hotly debated topic. So yeah, it goes all the way down to fundamental questions like whether we should have freedom of association.
Just to echo the point on MCPs: they seem cool, but in my experience just using a CLI is orders of magnitude faster to write and to debug (I just run the CLI myself, put tests in the code, etc.).
Yep, and it doesn't bloat the context unnecessarily. The agent can call --help when it needs it. Just imagine a kubectl MCP with all the commands as individual tools; it doesn't make any sense whatsoever.
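To illustrate the alternative: instead of hundreds of per-command tool schemas, the agent gets one generic "run a CLI" tool and discovers flags itself via --help. A sketch in Node/TypeScript; the allowlist and names are made up.

```typescript
// One tool instead of a kubectl-sized tool catalog: the agent runs a
// whitelisted binary with arguments and reads whatever it prints.
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const run = promisify(execFile);

async function cliTool(binary: string, args: string[]): Promise<string> {
  const allowed = new Set(['kubectl', 'git', 'ls']); // keep this tight
  if (!allowed.has(binary)) throw new Error(`binary not allowed: ${binary}`);
  const { stdout, stderr } = await run(binary, args, { timeout: 30_000 });
  return stdout + stderr;
}

// The agent self-serves documentation only when it needs it, e.g.:
//   await cliTool('kubectl', ['get', '--help']);
```

The context cost is a single tool schema, and the docs are loaded on demand rather than sitting in every request.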
And this is why I usually use simple system prompts/direct chat for "heavy" problems/development that require reasoning. The context bloat is getting pretty nutty, and it is definitely detrimental to performance.
The point of this stuff is to increase reliability. Sure, the LLM has a good chance of figuring out the skill by itself, but the idea is that it's less likely to fuck up with the skill available. This is an engineering advancement that makes it easier for businesses to rely on LLMs for routine stuff with less oversight.
4o on ChatGPT.com vs. Opus in an IDE is like cooking food without kitchen tools vs. using them. 4o is neither a coding-optimized model nor a reasoning model in general.
You're not pushing them hard enough if you're not seeing a vast difference between 4o and Opus. Or possibly they're equivalent in the field you're working in but I suspect it's the former.
If it were me, yeah, park it in bonds and live off the interest on a tropical beach. Spend my days spearfishing and drinking beers with the locals. Have no concerns except how even my tan is (and tbh I don't see myself caring too much about that).
I am interested in improving the lives of the many people who cannot afford to be stockholders.
The reason I'm interested in this is twofold.
First, I think the current system is exploitative. I don't advocate for communism or anything, but the current system of extracting value from the lower class is disgusting.
Second, they outnumber the successful people by a vast margin, and I don't want them to have a reason to re-invent the guillotine.
I agree. I just personally wouldn’t want to wander around exploring it continuously for months without more interesting work/goals. Even though cultures and geography may be wonderfully varied, their ranges are way smaller than what could be.
If you want to improve the lives of many, by all means go for it. I think that is a wonderful ambition to have in life and something I strive for, too!
But we are talking about an ad company here, trying to branch out into AI to sell more ads, right? Meta existing is without a doubt a net negative for mankind.
I met a youngster on a Bocas del Toro island in Panama a decade or so ago. I was about to be fired from my FAANG job, so I used up years and years of vacation for one big trip before I was let go. We hung out for a few days while I was there (I don't recommend the place at all, btw). He had cashed out from early Twitter and was setting up surf schools all over the world. All he did was travel, surf, drink, and fuck. I'm still angry that I laughed at all the dumb startups in the late 2000s instead of joining them. But this guy did what you're suggesting, and I think there are many more unknown techbros who did it too.
I met a traveller from an antique land,
Who said: “Two vast and trunkless legs of stone
Stand in the desert. Near them, on the sand,
Half sunk, a shattered visage lies, whose frown,
And wrinkled lip, and sneer of cold command,
Tell that its sculptor well those passions read
Which yet survive, stamped on these lifeless things,
The hand that mocked them and the heart that fed;
And on the pedestal these words appear:
"My name is Ozymandias, king of kings:
Look on my works, ye Mighty, and despair!"
Nothing beside remains. Round the decay
Of that colossal wreck, boundless and bare,
The lone and level sands stretch far away.
I take that more as a rumination on the futility of vanity and self-aggrandizement rather than "ruling the world", which in the modern day comes down to politics. Yes, there is considerable overlap with ego, but there's more to that topic than pure self-worship.
> Also, do you have a better way to spend that money?
Yes, I do.
I am aware of some quite deep scientific results that would have a major impact (and thus likely bring a lot of business value) if they were applied in practice.
Downsize Facebook back to, like, a couple thousand people max, use the resulting savings to retire, and start your own AI instead of doing the whole shadow-artist thing: "I'll hire John Carmack/a top AI researcher to work for me because deep down I can't believe I'd ever be as good as them, and my ego is too afraid to look foolish, so I won't even try, even if deep down that's what I want more than being a capricious billionaire."
Or am I just projecting my beliefs onto Mark Zuckerberg here?
Retire? Anyone with more than about 10-20 million who continues to work has some sort of pathology that leaves them unsatisfied. Normal people rarely even get to that level because they are too busy enjoying life. Anyone making billions has some serious issues that they are likely stuck with, because hubris won't let them seek meaningful help.
In fairness, their design does not seem to be regional, with problems in one region bringing down other regions that are apparently not as unrelated as they should be.
With this kind of architecture, this sort of problem is just bound to happen.
During my time at AWS, region independence was a must. And some services were able to operate, at least for a while, without degrading even when some core dependencies were not available. Think losing S3.
And after that, the service would keep operating, but with a degraded experience.
I am stunned that this level of isolation is not common in GCP.
Global dependencies were disallowed back in 2018 with a tiny handful of exceptions that were difficult or impossible to make fully regional. Chemist, the service that went down, was one of those.
Generally GCP wants regionality, but because it offers so many higher-level inter-region features, some kind of a global layer is basically inevitable.
AWS regions are fundamentally different from GCP regions. GCP marketing tries really hard to make it seem otherwise, or to suggest that GCP has all the advantages of AWS regions plus the advantages of their own approach, which leans heavily on "effectively global" services. There are tradeoffs: for example, multi-region in GCP is often trivial, and GCP can enforce fairness across regions, but that comes at the cost of availability. Which would be fine (GCP SLAs reflect the fact that they rarely consider regions to be reliable fault containers), but GCP marketing, IMO, creates a dangerous situation by pretending regions are something they aren't.
Even in the mini incident report, they went through extreme linguistic gymnastics trying to claim the service is regional. Describing the service that caused the outage, which is responsible for global quota enforcement and is configured using a data store that replicates data globally in near real time, with apparently no option to delay replication, they said:
> Service Control is a regional service that has a regional datastore that it reads quota and policy information from. This datastore metadata gets replicated almost instantly globally to manage quota policies for Google Cloud and our customers.
Not only would AWS call this a global service, the whole concept of global quotas would not fly at AWS.
How does AWS do that though? Do they re-implement all the code in every region? Because even the slightest re-use of code could trigger simultaneous (possibly delayed) downtime across all regions.
> Do they re-implement all the code in every region?
Everyone does.
The difference is that AWS very strongly ensures that regions are independent failure domains. The GCP architecture is global, with all the pros and cons that implies: e.g. GCP has a truly global load balancer, while AWS cannot, since everything there is regional at its core.
They definitely roll out code (at least for some services) one region at a time. That doesn't prevent old bugs/issues from coming up but it definitely helps prevent new ones from becoming global outages.
Regions (and even availability zones) in AWS are independent. The regions all use overlapping IPv4 address space, so direct cross-region connectivity is impossible.
So it's actually really hard to accidentally make cross-region calls, if you're working inside the AWS infrastructure. The call has to happen over the public Internet, and you need a special approval for that.
Deployments also happen gradually, typically only a few regions at a time. There's an internal tool that allows things to be gradually rolled out and automatically rolled back if monitoring detects that something is off.
Does Route53 depend on services in us-east-1 though? Or maybe it's something else, but I recall us-east-1 downtime causing downtime for global services.
As far as I remember, Route53 is semi-regional. The master copy is kept in us-east-1, but individual regions have replicated data. So if us-east-1 goes down, the individual regions will keep working with the last known state.
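That "keep working with the last known state" pattern is simple to sketch. This is a conceptual illustration, not Route53 internals; fetchFromPrimary and the types are hypothetical stand-ins.

```typescript
// Data plane keeps answering from the last replicated snapshot even
// when the primary region / control plane is unreachable.
type RecordSet = Map<string, string>;

let lastKnownGood: RecordSet = new Map();

async function refresh(fetchFromPrimary: () => Promise<RecordSet>): Promise<void> {
  try {
    lastKnownGood = await fetchFromPrimary(); // e.g. replication from us-east-1
  } catch {
    // Primary unreachable: skip the update and keep serving the
    // last known state instead of failing reads.
  }
}

function resolve(name: string): string | undefined {
  return lastKnownGood.get(name);
}
```

Writes stop propagating during the outage, but reads stay up, which is exactly the semi-regional behavior described above.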
Static stability is a good start, but isn't enough.
In this outage, my service (on GCP) had static stability, which was great. However, some other similar services failed, and we got more load, but we couldn't start additional instances to handle the load because of the outage, and so we had overloaded servers and poor service quality.
Mayhaps we could have adjusted load across regions to manage instance load, but that's not something we normally do.
One of the core pieces of static stability (at least in one definition, it's an overloaded term) is being able to handle failure scenarios from a steady state.
The classic example is overprovisioning so that you can handle the extra zonal load in the event of a zonal outage without needing to scale up.
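The arithmetic behind that overprovisioning is worth spelling out; a tiny helper, with made-up numbers:

```typescript
// With n zones and peak load L, surviving the loss of one zone means each
// remaining zone must absorb L / (n - 1) instead of L / n.
function perZoneCapacity(peakLoad: number, zones: number): number {
  if (zones < 2) throw new Error('need at least 2 zones to survive losing one');
  return peakLoad / (zones - 1);
}

// Example: 3 zones at 900 rps peak. Each zone normally serves 300 rps
// but must be provisioned for 450 rps, i.e. run at ~67% utilization.
console.log(perZoneCapacity(900, 3)); // 450
```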