This is a fantastic use of the D-Wave technology and really highlights the ability to use annealing for a wide range of use cases. I'm definitely going to follow this development closely!
It will be interesting to see how this works on the Advantage2, especially when it moves past the prototype stage. The greater connectivity between qubits may allow for an even better demonstration.
I've been seeing Mac folklore posted to HN recently - this is a fantastically well produced podcast that goes through folklore.org and other articles from the time period, with asides, tidbits, interviews etc. that help paint a picture of this fantastic era in computing history. Well worth a listen.
Well, if your system elastically uses GPU compute and needs to be able to spin up, run compute on a GPU, and spin down in a predictable amount of time to provide reasonable UX, launch time would definitely be a factor in terms of customer-perceived reliability.
All the clouds are pretty upfront about availability being non-guaranteed if you don't reserve it. I wouldn't call it a reliability issue if your non-guaranteed capacity takes some tens of seconds to provision. I mean, it might be your reliability issue, because you chose not to reserve capacity, but it's not really unreliability of the cloud — they're providing exactly what they advertise.
"Guaranteed" has different tiers of meaning - both theoretical and practical.
In many cases, "guaranteed" just means "we'll give you a refund if we fuck up". SLAs are very much like this.
IN PRACTICE, unless you're launching tens of thousands of instances of an obscure image type, reasonable customers can get capacity from the cloud, and get it promptly.
That's the entire cloud value proposition.
So no, you can't just hand-wave past these GCP results and say "Well, they never said these were guaranteed".
Ignoring the fact that the results are probably partially flawed due to methodology (see top-level comment from someone who works on GCE) and are not reproducible due to missing information, pointing out the lack of a guarantee is not hand-waving. The OP uses the word "reliability" to catch attention, which certainly worked, but this has nothing to do with reliability.
This isn't actually true, even for tiny customers. In a personal project, I used a single host of a single instance type several times per day and had to code up a fallback.
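For the curious, the fallback was roughly this shape (a minimal sketch in boto3 terms; the instance types and the error handling here are illustrative placeholders, not the exact code from my project):

    import boto3
    from botocore.exceptions import ClientError

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Ordered by preference; fall through when the preferred type has no capacity.
    # (Placeholder instance types, not the ones from my project.)
    FALLBACK_TYPES = ["g5.xlarge", "g4dn.xlarge"]

    def launch_with_fallback(ami_id):
        for instance_type in FALLBACK_TYPES:
            try:
                resp = ec2.run_instances(
                    ImageId=ami_id,
                    InstanceType=instance_type,
                    MinCount=1,
                    MaxCount=1,
                )
                return resp["Instances"][0]["InstanceId"]
            except ClientError as err:
                # Only fall back on capacity errors; re-raise everything else.
                if err.response["Error"]["Code"] != "InsufficientInstanceCapacity":
                    raise
        raise RuntimeError("no capacity for any configured instance type")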
Try spinning up 32+ core instances with local SSDs attached, or anything not in the n1 family, and you will find that in many regions you can only have like single digits of them.
I'd still consider it a "performance issue", not a "reliability issue". There is no service unavailability here. It just takes your system a minute longer until the target GPU capacity is available. Until then it runs on fewer GPU resources, which makes it slower. Hence performance.
The errors might be considered a reliability issue, but then again, errors are a very common thing in large distributed systems, and any orchestrator/autoscaler would just re-try the instance creation and succeed. Again, a performance impact (since it takes longer until your target capacity is reached) but reliability? not really
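To make that concrete: the retry any orchestrator/autoscaler already does is roughly this shape (a sketch only; create_instance and the error type are placeholders for whatever your cloud SDK actually gives you):

    import time

    class TransientCapacityError(Exception):
        """Stand-in for whatever capacity/quota error your cloud SDK raises."""

    def ensure_capacity(create_instance, target, backoff_s=5.0):
        # Retry until the target instance count is reached. Transient creation
        # errors show up as extra time-to-full-capacity (performance), not as
        # an outage visible to callers (reliability).
        instances = []
        while len(instances) < target:
            try:
                instances.append(create_instance())
            except TransientCapacityError:
                time.sleep(backoff_s)  # back off, then try again
        return instances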
I’d like to see a breakdown of the cost differences. If the costs are nearly equal, why would I not choose the one that has a faster startup time and fewer errors?
With GCP you can right-size the CPU and memory of the VM the GPU is attached to, unlike the fixed GPU AWS instances, so there is the potential for cost savings there.
It is not reliably running the machine but reliably getting the machine.
Like the article said, the promise of the cloud is that you can easily get machines when you need them. A cloud that sometimes does not get you that machine (or does not get it to you in time) is a less reliable cloud than the one that does.
It’s still performance. If this were “AWS failed to deliver the new machines and GCP delivered”, sure, reliability. But this isn’t that.
The race car that finishes first is not “more reliable” than the one in 10th. They are equally as reliable, having both finished the race. The first place car is simply faster at the task.
You cannot infer that based on the results of the race...that's literally the entire point I am making. The 1st place car might blow up in the next race, the 10th place car might finish 10th place for the next 100 races.
If the article were measuring HTTP response times and found that AWS's average response time was 50ms and GCP's was 200ms, and both returned 200s for every single request in the test, would you say AWS is more reliable than GCP based on that? Of course not, it's asinine.
If you want that promise you can reserve capacity in various ways. Google has reservations. Folks use this for DR, your org can get a pool of shared ones going if you are going to have various teams leaning on GPU etc.
The promise of the cloud is that you can flexibly spin up machines if available, and easily spin down, no long term contracts or CapEx etc. They are all pretty clear that there are capacity limits under the hood (and your account likely has various limits on it as a result).
unfortunately cloud computing and marketing have conflated reliability, availability, and fault tolerance, so it's hard to give you a definition everyone would agree to. but in general I'd say reliability refers to your ability to use the system without errors or significant decreases in throughput that would make it unusable for the stated purpose.
in other words, reliability is that it does what you expect it to. GCP does not have any particular guarantees around being able to spin up VMs fast, so its inability to do so wouldn't make it unreliable. it would be like me saying that you're unreliable for not doing something when you never said you were going to.
if this were comparing Lambda vs Cloud Functions, who both have stated SLAs around cold start times, and there were significant discrepancies, sure.
true, the grammar and semantics work out, but since reliability needs a target, it's usually a serious design flaw to rely on something that has never demonstrably worked the way your reliability target assumes.
so that's why in engineering it's not really used as such. (as far as I understand at least.)
Why would you scale to zero in high perf compute? Wouldn't it be wise to have a buffer of instances ready to pick up workloads instantly? I get that it shouldn't be necessary with a reliable and performant backend, and that the cost of having some instances waiting for a job can be substantial depending on how you do it, but I wonder if the cost difference between AWS and GCP would make up for that and you can get an equivalent amount of performance for an equivalent price? I'm not sure. I'd like to know though.
> Why would you scale to zero in high perf compute?
Midnight - 6am is six hours. The on demand price for a G5 is $1/hr. That's over $2K/yr, or "an extra week of skiing paid for by your B2B side project that almost never has customers from ~9pm west coast to ~6am east coast". And I'm not even counting weekends.
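Back-of-the-envelope, using the $1/hr figure above and ignoring weekends:

    hours_per_night = 6        # midnight to 6am
    rate_usd_per_hour = 1.00   # on-demand G5, figure quoted above
    days_per_year = 365

    print(hours_per_night * rate_usd_per_hour * days_per_year)  # 2190.0, i.e. "over $2K/yr"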
But that's sort of a silly edge case (albeit probably a real one for lots of folks commenting here). The real savings are in predictable startup times for bursty work loads. Fast and low variance startup times unlock a huge amount of savings. Without both speed and predictability, you have to plan to fail and over-allocate. Which can get really expensive fast.
Another way to think about this is that zero isn't special. It's just a special case of the more general scenario where customer demand exceeds current allocation. The larger your customer base, and the burstier your demand, the more instances you need sitting on ice to meet customers' UX requirements. This is particularly true when you're growing fast and most of your customers are new; you really want a good customer experience every single time.
Scaling to zero means zero cost when there is zero work. If you have a buffer pool, how long do you keep it populated when you have no work?
Maintaining a buffer pool is hard. You need to maintain state, have a prediction function, track usage through time, etc. Just spinning up new nodes for new work is substantially easier.
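Even a toy warm-pool keeper ends up needing most of those pieces. A rough sketch (all names and the "prediction" rule here are made up for illustration, with create_instance/terminate_instance standing in for your cloud SDK calls):

    class WarmPool:
        # Toy warm-pool keeper: holds state, applies a (crude) demand
        # prediction, and reconciles the idle buffer toward it.

        def __init__(self, create_instance, terminate_instance, min_idle=1):
            self.create = create_instance        # your cloud SDK call goes here
            self.terminate = terminate_instance
            self.min_idle = min_idle             # the "prediction function", crudely
            self.idle = []                       # state you now have to track
            self.busy = []

        def checkout(self):
            # Hand out a warm instance if one exists, otherwise pay the cold start.
            inst = self.idle.pop() if self.idle else self.create()
            self.busy.append(inst)
            return inst

        def checkin(self, inst):
            self.busy.remove(inst)
            self.idle.append(inst)

        def reconcile(self):
            # Run periodically: grow/shrink the idle buffer toward min_idle.
            while len(self.idle) < self.min_idle:
                self.idle.append(self.create())
            while len(self.idle) > self.min_idle:
                self.terminate(self.idle.pop())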
And the author said he could spin up new nodes in 15 seconds, which is pretty quick.
> Over the past five years, there has been undeniable hype around quantum computing—hype around approaches, timelines, applications, and more. As far back as 2017, vendors were claiming the commercialization of the technology was just a couple of years away—like the announcement of a 5,000-qubit system by 2020 (which didn’t happen).
Inaccurate. D-Wave did in fact launch a 5000+ qubit system named Advantage in 2020.
> I don't mind that idea - if it turned out installing heated seats failed in say... 28% of installations, fine - charge me more for a model with functioning heated seats.
I know you just chose this as an example, but clearly it wouldn't be acceptable to anyone to have a failed installation of something like this. On one extreme, it could cause a fire; but otherwise, it's just a button that doesn't work, and damages the quality reputation of the car.
Processor binning only works because it is a black box, indistinguishable from the outside. You can purposely disable components if your yields get high enough, etc. and nobody will ever notice the difference.
For vehicles, automakers will learn from customers that this kind of nickel-and-diming is not appreciated, when people turn to other manufacturers that still "get it", e.g. Mazda.
Same here. It's barebones, but it's also cheap and plentiful on the used market, has enough functionality to be useful without being overwhelming, and fits easily in a backpack. Did some noodling around with it connected to my iPhone on a plane recently, no drivers needed if you get that Lightning-to-USB adapter, phone powered it fine, and I was able to drive Korg Gadget and make some shitty techno.
So go to the website and use it. Many of us simply see it as a way for more propaganda to be fed to us unasked-for when we open up a new tab (at least, until we remember to turn it off).
as with many things though it should be consent based and opt in; they can have a big button "add pocket to your firefox" and those who want it can opt in and everyone else won't have to go through the rigamarole of finding all the settings (strewn about in various locations including about:config) to disable it
Some of us would love to work on a team like this. It would be nice to have the option. Your definition of "acceptable" might not actually result in teams that can take on the big challenges we face as a species as men who did find this kind of thing acceptable retire out of the workforce.
If "We've always done it this way and it's a risk to do it differently" was the argument that carried the day, few of us would have to worry about these questions at all because we'd never have gotten out from under feudalism.