
Yes, my experience with academics is that there are a lot of very dishonest people. They are political bullies who also lie in their research.

Chances of being caught are close to zero (I have contacted many authors of papers whose work I was unable to replicate - most of the time zero reply, sometimes "yeah, it was an honest mistake, oops"), competition is super high (only a few tenured positions per year across the world's high-visibility institutions), and advisors have full control over students' futures, being able to force them to commit fraud (and later blame it on them).

Obviously, not all, blah blah - but many academic scientists are the last people who should be doing science.


I know this is an unpopular opinion in the US, but the tenements can be pretty great.

I grew up in Eastern Europe (Warsaw) in "commie" blocks, and there were a lot of valid criticisms and problems (like the poor quality of buildings, small apartments, or thin walls - but consider that they rebuilt the whole of Warsaw in a decade or two after it was completely razed in WW2!), but also a lot to love. Extremely walkable, safe, all amenities (cinemas, stores, cultural centers, playgrounds) within walking distance, lots of trees and greenery, easy access to public transit. As a kid or teenager, I found them great. I preferred them 100x over the suburbs where my parents moved later, and over typical American cityscapes. (And this is why I moved to NYC and love it.)

Here is a fun, slightly provocative/exaggerated video: https://youtu.be/1eIxUuuJX7Y

Everyone is different, so I'm not forcing my perspective onto anyone - it's just worth considering, especially if you have not had such first-hand experience (the main objection to tenements comes from how depressing they look, or from the American association of "projects = crime," which misses a lot of the "why"). Feel free to disagree!

And apart from that, I don't think "more small houses" solves anything. It has to create more car dependence and social isolation. And it does not really scale - where would you fit more, smaller homes in SF?


"Everyone is different"

I think this is the root of it. You prefer higher density -- and that's great. I'm sure not everyone agrees, but I don't see any reason to take that away. In fact, I think it should be encouraged for those that like it.

The issue, IMHO, is that some folks don't like that and prefer lower density. And a lot of these changes focus on taking that away from them (i.e. changing their current neighborhood).

Also, just a comment on: "It has to create more car dependence and social isolation"

I don't think that's true. I live in a pretty traditional SFH neighborhood. Within a 12 minute bike ride, I have:

* Four grocery stores (major chains)

* Two gyms

* Dozens of restaurants

* Several large parks

* Two home improvement stores

* Several large employers

* Several (non-Starbucks) coffee shops

And lots more. It's certainly possible, with bikes, to have SFH neighborhoods where cars aren't required.


There are lots of less dense places in the US that are not major metros.

Many of the people who are pissed about more density coming to cities only moved to them in the last decade or two, especially on the east coast where white flight only recently reversed.


I lived in SF for ~1.5 years, and it's not NYC, and I didn't like it very much, but it certainly has some city conveniences and is not a car-hell suburb. (I lived in a building with ~10 units in the Castro, which was cool.)

But my question remains - how do you scale up your approach to the already-full SF? How do you make it more affordable, as prices are insane due to demand >> supply? Or do you just envision a more sprawled, but similarly dense SF as the solution?


I enjoyed the post and appreciate the author sharing their perspective. It's one of many valuable datapoints for anyone considering such a transition.

I agree with a lot, but like others - disagree with some.

My background: I've worked in tech for ~14 years in all kinds of roles - from pure junior IC, through "team lead" (something between an expert IC, a tech lead, and a manager), "tech lead," and the company's "technical architect" (the highest-level tech lead, a peer to the technical director, but without any direct reports), to something akin to a tech director. Now I'm back to IC. The companies ranged from small gamedev (80 people total, 15 engineers), through medium gamedev (30 engineers and coding technical artists) and huge gamedev (Ubisoft, where you can have 100+ engineers and 1500 people total on a project), to the last 7 years in "big tech".

The idea I would push back on the most is that "your words have more weight". I have never had trouble getting my opinions heard, even when I didn't push them. As an expert IC and "problem solver," I'd sometimes even have the CEO asking me directly for advice on how to solve issues (both technical and non-technical!). They didn't always follow it, but I didn't expect them to. Having an official title, in theory, you can use some "authority" to formally push your ideas. But... in practice, it does not work better. People will still go directly to the most technical experts. And if you abuse the position/authority/title (I hope I never did, but that's not for me to judge...), it can cause resentment, pushback, and more disagreements. You will also hear less gossip and honest feedback, and to some engineers you become "not one of us anymore".

It can also destroy friendships. I had a great friend (we met socially with our wives once a week and shared interests) who was my peer and was then promoted to be my lead. I still really liked him and wanted to stay friends. We had always had technical disagreements (which were fine between peer ICs). Later, some of those disagreements, and my bringing up issues publicly, made him bitter toward me (as upper management saw those exchanges and took my side on some occasions), and eventually he ended the friendship completely; after I left the company, he started ghosting me. :(

Similarly, on a few occasions, I agreed to lead/manage formally (in one case, it came from me - in other cases, I was asked to). I agreed because I thought, "Things are f-d up; I can solve them by being closer to the upper leadership and helping the team succeed." Man, I was so naive. :( I didn't have any more authority or power with the higher-ups, and there were more disagreements. They expected me to enforce policies I disagreed with. As you can imagine, this didn't last long, and I always ended up leaving the team/company and being super burnt out.

So now I'm happy to be a staff-level IC, an expert, and a "hacker," playing with problems hands-on and building my expertise further. The field grows so quickly that there is always something exciting and new to learn and do. I would happily be a tech lead of some project close to me (luckily, at Google and similar, it's flexible, per-project, and not formal), but I probably do not want to manage again. Maybe it will change, depends.


I'm a former game dev, I used ImGui a lot, and I don't think it's used for those reasons.

It's used for quickly hacked debug tools that interleave UI and regular logic without a logic/view separation (which would result in code bloat and the need for a refactor). You want the UI code to do some logic (like modifying properties of some game entity or the renderer), and you prefer to inline it. Lots of game code is effectively YOLO without even a single test. It's also typically guarded by ifdefs and compiled out of release builds.
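To make the pattern concrete, here's a toy immediate-mode sketch - plain Python, not real Dear ImGui; the widget functions, the `clicks` input format, and the `entity` state are all made up for illustration. The core idea: widgets are just function calls re-issued every frame, reading and writing game state in place.

```python
# Toy immediate-mode "UI": widgets are plain function calls made every
# frame, reading and writing application state directly -- no separate
# widget tree to keep in sync (the core idea behind Dear ImGui).

def slider(label, value, lo, hi, clicks):
    # Apply this frame's input delta, clamp, "draw", return the new value.
    value = max(lo, min(hi, value + clicks.get(label, 0)))
    print(f"[{label}: {value}]")
    return value

def checkbox(label, value, clicks):
    # A click on the box this frame toggles it.
    if label in clicks:
        value = not value
    print(f"[{'x' if value else ' '}] {label}")
    return value

# Application state lives in ordinary variables, not inside a UI library.
entity = {"speed": 5, "god_mode": False}

def debug_frame(clicks):
    # UI and game logic interleaved, as in a hacked-up debug tool.
    entity["speed"] = slider("speed", entity["speed"], 0, 10, clicks)
    entity["god_mode"] = checkbox("god_mode", entity["god_mode"], clicks)

debug_frame({})                    # frame 1: no input
debug_frame({"speed": 3})          # frame 2: user dragged the slider
debug_frame({"god_mode": True})    # frame 3: user toggled the box
```

Even the toy shows the flip side: all widget state has to live somewhere in application code, which is exactly what gets messy once a tool grows beyond debug use.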

But as soon as it stops being just hacky debug tooling and people try to write proper tools in it, it becomes much more of a pain - people try to (poorly) emulate a retained mode in it, hold state, cache - and it becomes an unreadable mess.


> But as soon as it stops being just hacky debug tooling and people try to write proper tools in it, it becomes much more of a pain - people try to (poorly) emulate a retained mode in it, hold state, cache - and it becomes an unreadable mess.

Effectively, people are hasty and don't spend the time to do things nicely, in particular because the first steps and debug use let you get away with quick hacks.

But I don't think it's a fundamental property of IMGUI or Dear ImGui that "proper tools" become particularly more of a pain. Of course it is more work to make a proper tool than a hasty debug tool, and I can easily see how under-engineering can backfire (and likewise over-engineering).


This is such a good and clear take.


Wait, what? This is literally not the point of research. The point of research is to form a theory, gather relevant data, and then analyze it statistically - plus propose follow-up theories and studies.

This is how almost every research publication in every field looks - from medicine, through psychology, to computer science.


I'm always for people over financial entities and capital holders; however, in this case it's pretty simple - you sell control to someone for a ton of money, so why on Earth would you expect to retain it forever? You literally got millions in exchange for holding fewer shares and it no longer being "your" company.

Maybe startups don't even need to take seed investor money (they could be self-funded, with a smaller team, a longer time before funding, etc.) - but founders obviously prefer to have a nice salary and a soft landing. Or sometimes it's literally just a marketing strategy: "we are backed by x/y/z, this proves our value." This is the price to pay. What am I missing?


If they create a better toolchain, ecosystem, and programming experience than CUDA, compatible with all computational platforms at their peak performance - awesome! Everyone wins!

Until then, it's a bit of a funny claim, especially considering what a failure OpenCL was (in programmer experience and fading support), or trying to do GPGPU with compute shaders in DX/GL/Vulkan. Are they really "motivated"? Because they have had so many years and the results are miserable... And I don't think they invested even a fraction of what went into CUDA. Put your money where your mouth is.


I wish AMD or Intel would just ship a giant honking CPU with 1000s of cores that doesn't need any special purpose programming languages to utilize. Screw co-processors. Screw trying to make yet another fucked up special purpose language -- whether that's C/C++-with-quirks or a half-assed Python clone or whatever. Nuts to that. Just ship more cores and let me use real threads in regular programming languages.


It doesn't work if you're going against GPUs. All the nice goodies we are accustomed to on large desktop x86 machines with gigantic caches and huge branch predictor area and OOO execution engines -- the features that yield the performance profile we expect -- simply do not translate or scale up to thousands of cores per die. To scale that up, you need to redesign the microarchitecture in a fundamental way to allow more compute-per-mm^2 of area, but at that point none of the original software will work in any meaningful capacity because the pipeline is so radically different, it might as well be a different architecture entirely. That means you might as well just write an entirely different software stack, too, and if you're rewriting the software, well, a different ISA is actually the easy part. And no, shoving sockets on the mobo does not change this; it doesn't matter if it's a single die or multi socket. The same dynamics apply.


While the first >1000 core x86 processor is probably a little ways out, Intel is releasing a 288-core x86 processor in the first half of 2024 (Sierra Forest). I assume AMD will have something similarly high core in 2024-25 as well.


To be clear, you can probably make a 1000 core x86 machine, and those 1000 cores can probably even be pretty powerful. I don't doubt that. I think Azure even has crazy 8-socket multi-sled systems doing hundreds of cores, today. But this thread is about CUDA. Sierra Forest will get absolutely obliterated by a single A100 in basically any workload where you could reasonably choose between the two as options. I'm not saying they can't exist. Just that they will be (very) bad in this specific competition. I made an edit to my comment to reflect that.

But what you mention is important, and also a reason for the ultimate demise of e.g. Xeon Phi. Intel surely realized they could just scale their existing Xeon designs up-and-out further than expected. Like from a product/SKU standpoint, what is the point of having a 300 core Phi where every core is slow as shit, when you have a 100 core 4-socket Xeon design on the horizon, using an existing battle-tested design that you ship billions of dollars worth every year? Especially when the 300 core Xeon fails completely against the competition. By the time Phi died, they were already doing 100-cores-per-socket systems. They essentially realized any market they could have had would be served better by the existing Xeon line and by playing to their existing strengths.


> Intel is releasing a 288-core x86

This made me wonder a couple of things-

What kind of workloads and problems is that best suited for? It’s a lot of cores for a CPU, but for pure math/compute, like with AI training and inference and with graphics, 288 cores is like ~1.5% of the number of threads of a modern GPU, right? Doesn’t it take particular kinds of problems to make a 288 core CPU attractive?

I also wondered if the ratio of the highest core count CPU to GPU has been relatively flat for a while? Which way is it trending- which of CPUs or GPUs are getting more cores faster?


You could do sparse deep learning with much, much larger models with these CPUs. As paradoxical as it might sound, sparse deep learning gets more compute bound as you add more cores.


I'd be curious to learn more about how it's compute bound and what specifically is compute bound. On modern H100s you need ~600 fp8 operations per byte loaded from memory in order to be compute bound, and that's with full 128-byte loads each time. Even integer/fp32 vector operations need quite a few operations to be compute bound (~20 for vector fp32).
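For a rough sense of where that ~600 figure comes from, a back-of-envelope roofline calculation. The H100 SXM specs below - roughly 1979 dense fp8 TFLOPS and 3.35 TB/s of HBM3 bandwidth - are approximate public numbers, treated here as assumptions:

```python
# Back-of-envelope roofline: how many operations per byte read are needed
# before a kernel becomes compute-bound rather than memory-bound.
# Approximate H100 SXM specs (assumptions, not measured values):
peak_fp8_ops = 1979e12   # dense fp8 ops/s (~1979 TFLOPS)
mem_bandwidth = 3.35e12  # HBM3 bytes/s (~3.35 TB/s)

# Machine balance: below this many ops per byte, memory is the bottleneck.
balance = peak_fp8_ops / mem_bandwidth
print(f"compute-bound above ~{balance:.0f} fp8 ops per byte")
```

That lands right around the ~600 ops/byte quoted above; sparse workloads with hash-map-style gathers sit far below that line, hence memory-bound.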


I think you misunderstood what I mean. Sparse ML is inherently memory latency bound since you have a completely unpredictable access pattern prone to cache misses. The amount of compute you perform is a tiny blip compared to the hash map operations you perform. What I mean is that as you add more cores, there are sharing effects because multiple cores are accessing the same memory location at the same time. The compute bound sections of your code become a much greater percentage of the overall runtime as you add cores, which is surprising, since adding more compute is the easy part. Pay attention to my words "_more_ compute bound".

Here is a relevant article: https://www.kdnuggets.com/2020/03/deep-learning-breakthrough...


288 cores or threads? Because to my knowledge AMD already has a 128-core, 256-thread processor with the Epyc 9754.


Apple might be sort-of trying to build the honking CPU, but it still requires different language extensions and a mix of different programming models.

And what you suggest could be done, but it would likely flop commercially if you made it today, which is why they aren’t doing it. SIMD machines are faster on homogenous workloads, by a lot. It would be a bummer to develop a CPU with thousands of cores that is still tens or hundreds of times slower than a comparably priced GPU.

SIMD isn’t going away anytime soon, or maybe ever. When the workload is embarrassingly parallel, it’s cheaper and more efficient to use SIMD over general-purpose cores. Specialized chiplets and co-processors are on the rise too, coinciding with the wane of Moore’s law; specialization is often the lowest-hanging fruit for improving efficiency now.

There’s going to be plenty of demand for general programmers but maybe worth keeping in mind the kinds of opportunities that are opening up for people who can learn and develop special purpose hardware and software.


Well, that is what a GPU is. CUDA / OpenMP etc. are attempts at conveniently programming a mixed CPU/GPU system.

If you don't want that, program the GPU directly in assembly or C++ or whatever. A kernel is a thread - program counter, register file, independent execution from the other threads.

There isn't a Linux kernel equivalent sitting between you and the hardware so it's very like bare metal x64 programming, but you could put a kernel abstraction on it if you wanted.

Core isn't very well defined, but if we go with "number of independent program counters live at the same time" it's a few thousand.

X64 cores are vaguely equivalent to GCN compute units - 100 or so of either in a 300W envelope. X64 has two threads and a load of branch prediction / speculation hardware; GCN has 80 threads and swaps between them each cycle. Same sort of idea, different allocation of silicon.


It was called Larrabee and Xeon Phi; they botched it, and the only thing left from that effort is AVX-512.


I used to play with these toys 7-8 years ago. We tried everything, and it was bad at it all.

Traditional compute? The cores were too weak.

Number crunching? Okay-ish but gpus were better.

Useless stuff.


They seemed exceedingly hard to use well but interestingly capable & full of promise. And they were made in a much more primitive software age.

I'd love to hear about what didn't work. OpenMP support seemed OK, maybe, but OpenMP is just a platform; figuring out software architectures that are mechanistically sympathetic to the system is hard. It would be so interesting to see what Xeon Phi might have been if we had had Calcite or Velox or OpenXLA or other execution engines/optimizers that could orchestrate usage. The possibility of something like Phi seems so much higher now.

There's such a consensus around Phi tanking, and yes, some people came and tried and failed. But most of those lessons - of why it wasn't working (or was!) - never survived the era, never were turned into stories and research that illuminate what Phi really was. My feeling is that most people were staying the course on GPU stuff, and that there weren't that many people trying Phi. I'd like more than the hearsay heaped at Phi's feet to judge by.


Well... Back then in my shop they would just assign programmers to things, together with a couple of mathematicians.

Math guys came up with a list of algorithms to try for a search engine backend.

What we needed was matrix multiplication and maybe some decision tree walking (that was some time ago, trees were still big back then, NNs were seen as too compute-intensive for no clear benefits). So we thought that it might be cool to have a tool that would support both. Phi sounded just right for both.

And things written for AVX-512 did work. The software was surprisingly easy to port.

But then comes the usual SIMD/CPU trouble: every SIMD generation wants a little software rewrite. So for both Phi generations we had to update our code. For things not compatible with the SIMD approach (think tree-walking) it is just a weak x86.

In theory, Phis were universal; in practice, what we got was okay number crunching and bad generic compute.

GPU was somewhat similar: the software stack was unstable; CUDA had just not materialized as a standard yet. But every generation introduced a massive increase in available compute. And boy, did NVIDIA move fast...

So GPU situation was: amazing number crunching, no generic compute.

And then there were a few ML breakthroughs results which rendered everything that did not look like a matrix multiplication obsolete.

PS I wouldn't take this story too seriously, details may vary.


By any chance, Yandex?


Nope but close enough :-)


Some observations:

- Very bad performance on existing x86 workloads, so a major selling point was basically not there in practice, because extracting any meaningful performance required a software rewrite anyway. This was an important adoption criterion; if they outright said "all your existing workloads are compatible, but will perform like complete dogshit," why would anyone bother? Compatibility was a big selling point that ended up meaning little in practice, unfortunately.

- Not actually what x86 users wanted. This was at the height of "Intel stagnation" and while I think they were experimenting with lots of stuff, well, in this case, they were serving a market that didn't really want what they had (or at least wasn't convinced they wanted it).

- GPU creators weren't sitting idle and twiddling their thumbs. Nvidia was continuously improving performance and programmability of their GPUs across all segments (gaming, HPC, datacenters, scientific workloads) while this was all happening. They improved their compilers, programming models, and microarchitecture. They did not sit by on any of these fronts.

Ironically the main living legacy of Phi is AVX-512, which people did and still do want. But that kind of gives it all away, doesn't it? People didn't want a new massively multicore microarchitecture. They wanted new vector instructions that were flexible and easier to program than what they had -- and AVX-512 is really much better. They wanted the things they were already doing to get better, not things that were like, effectively a different market.

Anyway, the most important point is probably the last one, honestly. Like we could talk a lot about compiler optimizations or autovectorization. But really, the market that Phi was trying to occupy just wasn't actually that big, and in the end, GPUs got better at things they were bad at, quicker than Phi got better at things it was bad at. It's not dissimilar to Optane. Technically interesting, and I mourn its death, but the competition simply improved faster than the adoption rate of the new thing, and so flash is what we have.

Once you factor in that you have to rewrite software to get a meaningful performance uplift, the rest sort of falls into place. Keep in mind that if you have a $10,000 chip and you can only extract 50% of its performance, you more or less just set $5,000 on fire for nothing in return. You might as well go all the way and use a GPU, because at least then you're getting more ops/mm^2 of silicon.


I don't disagree anywhere, but I don't think any of these statements actually condemns Xeon Phi outright. It didn't work at the time, and attempting it with so little software support for tiling out workloads was a big and possibly bad gambit, but I'm not sure we can condemn the architecture. There seem to be so few folks who made good attempts and succeeded or failed and wrote about it.

I tend to think there was tons of untapped potential still on the table, and that the failure to tap it isn't purely Intel's fault alone. The story we are commenting on is about the rest of the industry trying to figure out enduring joint strategies; much of this is chipmaker-provided, but it is also informed and helped by plenty of consumers pouring energy into figuring out what works and what doesn't, trying to push the bounds.

Agreed that anyone going in thinking Xeon Phi would be viable for running a boring everyday x86 workload was going to be sad. To me the promise seemed clear that existing toolchains & code would work, but it was always clear to me there were a bunch of little punycores & massive SIMD units and that doing anything not SIMD intensive wasn't going to go well at all. But what's the current trend? Intel and AMD are both actively building not punycores but smaller cores, with Sierra Forest and Bergamo. E-cores are the grown up Atom we saw here.

Yes the GPGPU folks were winning. They had a huge head start, were the default option. And Intel was having trouble delivering nodes. So yes, Xeon Phi was getting trounced for real reasons. But they weren't architectural issues! It just means the Xeon Phi premise was becoming increasingly handicapped.

As I said, I broadly agree everywhere. Your core point about giving the market more of what it already does is well taken; it's a river of wisdom we see again and again. But I do think conservative, iterate-along thinking is dangerous in that it obstructs us from seeing real value and possibility before us. Maybe Intel could have made a better ML chip than the GPGPU market got for years, had things gone differently; I think the industry could perhaps have been glad to have veered onto a new course, but the barriers to that happening, the slowdown in Intel's delivery, and the difficulty of bootstrapping new software were all horrible encumbrances which were rightly more than was worth bearing together.


I don't think anybody seriously considered Phis for generic compute or anything like that.

Most experimenters saw it as a way to have something GPU-like in terms of raw power but without the limitations characteristic of SIMT - like slightly different code paths for threads doing number crunching.

But it turns out that it's easier to force everything into a matrix. Or a very big matrix. Or a very-very-very big matrix.

And then see what sticks.


Why are we not also talking about memory bandwidth? Personal opinion: this is the key. The latest Phi had about 100 GB/s in 2017. The contemporary Nvidia GTX 1080: 320 GB/s.

When CPUs actually come with bandwidth and a decent vector unit, such as the A64FX, lo and behold, they led the Top500 supercomputer list, also beating out GPUs of the day.

Why have we not been getting bandwidth in CPUs? Is it because SPECint benchmarks do not use much? Or because there is too much branch-heavy code, so we think hundreds of cores are helpful?

Existing machines are ridiculously imbalanced, hundreds of times more compute vs bandwidth than the 1:1 still seen in the 90s. Hence matmul as a way of using/wasting the extra compute.
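A quick sketch of that imbalance, using rough numbers in the same spirit as this thread (all specs below are approximate assumptions, not vendor-verified figures):

```python
# Rough flop-per-byte "machine balance" for a few systems, using
# approximate, illustrative figures (assumptions, not verified specs):
machines = {
    # name: (peak flop/s, memory bandwidth in bytes/s)
    "1990s vector CPU": (1e9, 1e9),        # ~1:1 balance
    "Xeon Phi (DDR4)":  (3.0e12, 100e9),   # ~100 GB/s, as quoted above
    "A64FX":            (3.4e12, 1.02e12), # ~1 TB/s HBM2
}
for name, (flops, bw) in machines.items():
    print(f"{name:17s} ~{flops / bw:5.1f} flops per byte")
```

Even with these generous numbers, the CPU side only stays balanced when it gets HBM-class bandwidth, which is the A64FX's whole trick.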

The AMD MI300a looks like a very interesting development: >5 TB/s shared by 24 cores plus GPUs.


Hence "they botched it".


The closest Intel got to this was Xeon Phi / Knights Landing https://en.wikipedia.org/wiki/Xeon_Phi with 60+ cores per chip, each able to run 4 threads simultaneously - each of which could run arbitrary x86 code. Discontinued due to low demand in 2020 though.

In practice, people weren’t afraid to roll up their sleeves and write CUDA code. If you wanted good performance you had to think about data parallelism anyways, and at that point you’re not benefiting from x86 backwards compatibility. It was a fascinating dream while it lasted though.


AVX might be going in the right direction, even if AVX-512 was a stretch too far. I was impressed by the llama.cpp performance boost when AVX1 support was added.

There's no intrinsic reason why multiplying matrices requires massive parallelism; in principle it could be done on a few cores plus good management of ALUs, memory bandwidth, and caches.
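A minimal sketch of that idea - the classic cache-blocking transform, where a few cores get good ALU and bandwidth utilization by reusing tiles while they're hot in cache (sizes and tile factor here are just illustrative):

```python
# Sketch: cache-blocked matrix multiply in plain Python. The point is
# that matmul throughput comes from data reuse (tiles staying hot in
# fast memory), not from thousands of cores -- a few cores with good
# blocking can keep their ALUs fed.

def matmul_naive(A, B):
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                C[i][j] += A[i][p] * B[p][j]
    return C

def matmul_blocked(A, B, T=2):
    # Same arithmetic, reordered into T x T tiles so each tile of A and
    # B is reused many times before being evicted from cache.
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, T):
        for j0 in range(0, m, T):
            for p0 in range(0, k, T):
                for i in range(i0, min(i0 + T, n)):
                    for j in range(j0, min(j0 + T, m)):
                        for p in range(p0, min(p0 + T, k)):
                            C[i][j] += A[i][p] * B[p][j]
    return C

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
assert matmul_blocked(A, B) == matmul_naive(A, B)
```

Real BLAS libraries do essentially this (plus vectorization and prefetching), which is how a handful of CPU cores can stay compute-bound on matmul at all.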


What's wrong with compute shaders?


I shipped a dozen products with them (mostly video games), so there's nothing "wrong" that would make them unusable. But programming them and setting up the graphics pipeline (and all the passes, structured buffers, compiling, binding, weird errors, and synchronization) is a huge PITA compared to the convenience of CUDA. Compilers are way less mature, especially on some platforms cough. Some GPU capabilities are not exposed. No real composability or libraries. No proper debugging.


These days, some game engines have done pretty well at making compute shaders easy to use (such as Bevy [1] -- disclaimer, I contribute to that engine). But telling the scientific/financial/etc. community that they need to run their code inside a game engine to get a decent experience is a hard sell. It's not a great situation compared to how easy it is on NVIDIA's stack.

[1]: https://github.com/bevyengine/bevy/blob/main/examples/shader...


I have recently published an AI-related open-source project entirely based on compute shaders, https://github.com/Const-me/Cgml , and I'm super happy with the workflow. It's possible to implement very complicated things without compiling a single line of C++; the software is mostly in C#.

> setting up the graphics pipe

I’ve picked D3D11, as opposed to D3D12 or Vulkan. The 11 is significantly higher level, and much easier to use.

> compiling, binding

The compiler runs at design time; I ship the shaders compiled, and it's integrated into the IDE. I solved the bindings with a simple code-generation tool, which parses HLSL and generates C#.

> No proper debugging

I partially agree, but still, we have RenderDoc.


I understand why you've picked D3D11, but people have to understand that comes with serious limitations. There are no subgroups, which also means no cooperative matrix multiplication ("tensor cores"). For throughput in machine learning inference in particular, there's no way D3D11 can compete with either CUDA or a more modern compute shader stack, such as one based on Vulkan 1.3.


> no subgroups

Indeed, in D3D they are called “wave intrinsics” and require D3D12. But that’s IMO a reasonable price to pay for hardware compatibility.

> no cooperative matrix multiplication

Matrix multiplication compute shader which uses group shared memory for cooperative loads: https://github.com/Const-me/Cgml/blob/master/Mistral/Mistral...

> tensor cores

When running inference on end-user computers, for many practical applications users don’t care about throughput. They only have a single audio stream / chat / picture being generated, their batch size is a small number often just 1, and they mostly care about latency, not throughput. Under these conditions inference is guaranteed to bottleneck on memory bandwidth, as opposed to compute. For use cases like that, tensor cores are useless.
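A back-of-envelope version of that argument: at batch size 1, every generated token has to stream all the weights through the memory bus once, so model bytes divided by bandwidth gives a hard latency floor. The model size and bandwidth below are illustrative assumptions, not measurements of any specific GPU:

```python
# Floor on per-token latency for batch-1 autoregressive inference: every
# token reads all weights once, so latency >= model bytes / bandwidth.
# Illustrative assumptions:
params = 7e9          # ~7B-parameter model
bytes_per_param = 2   # fp16 weights
bandwidth = 500e9     # ~500 GB/s of GPU memory bandwidth

model_bytes = params * bytes_per_param
latency_floor_s = model_bytes / bandwidth
print(f"best case ~{latency_floor_s * 1e3:.0f} ms per token")
```

At that point extra tensor-core throughput changes nothing; only more bandwidth or smaller weights moves the floor.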

> there's no way D3D11 can compete with either CUDA

My D3D11 port of Whisper outperformed original CUDA-based implementation running on the same GPU: https://github.com/Const-me/Whisper/


Sure. It's a tradeoff space. Gain portability and ergonomics, lose throughput. For applications that are throttled by TOPS at low precisions (ie most ML inferencing) then the performance drop from not being able to use tensor cores is going to be unacceptable. Glad you found something that works for you, but it certainly doesn't spell the end of CUDA.


> ie most ML inferencing

Most ML inferencing is throttled by memory, not compute. This certainly applies to both the Whisper and Mistral models.

> it certainly doesn't spell the end of CUDA

No, because traditional HPC. Some people in the industry spent many man-years developing very complicated compute kernels, which are very expensive to port.

AI is another story. Not too hard to port from CUDA to compute shaders, because the GPU-running code is rather simple.

Moreover, it can help with performance just by removing abstraction layers. I think the reason the compute-shaders-based Whisper outperformed the CUDA-based version on the same GPU is that these implementations do slightly different things. Unlike Python and Torch, compute shaders actually program GPUs, as opposed to calling libraries with tons of abstraction layers inside them. This saves memory bandwidth by not storing and then loading temporary tensors.
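A rough sense of what one materialized temporary costs in bandwidth (the tensor size and precision below are illustrative assumptions):

```python
# Cost of materializing one temporary tensor: it gets written to memory
# and then read back, i.e. two extra passes over the data that a fused
# kernel avoids. Tensor size and precision are illustrative assumptions.
n = 4096 * 4096   # elements in one activation tensor
elem = 2          # bytes per fp16 element

unfused_extra = 2 * n * elem  # one store + one reload of the temporary
print(f"extra traffic per temporary: {unfused_extra / 2**20:.0f} MiB")
```

Multiply that by the dozens of ops in a transformer layer and the unfused version can easily move several times more data than the fused one.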


This. It's crazy how primitive the GPU development process still is in the year 2023. Yeah it's gotten better, but there's still a massive gap with traditional development.


It's kinda like building Legos vs building actual Skyscrapers. The gap between compute shaders and CUDA is massive. At least it feels massive because CUDA has some key features that compute shaders lack, and which make it so much easier to build complex, powerful and fast applications.

One of the features that would move compute shaders far ahead of where they are now would be pointers and pointer casting - just let me have a byte buffer and easily cast the bytes to whatever I want. Another would be function pointers. These two are pretty much the main reason I had to stop doing a project in OpenGL/Vulkan and start using CUDA. There are many more, however, that make life easier, like cooperative groups with device-wide sync, being able to allocate a single buffer with all the GPU memory, recursion, etc.
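A CPU-side analogy of what that byte-buffer freedom buys - Python's `struct` standing in for `reinterpret_cast`; the float-at-0 / int32-at-4 layout is made up for illustration:

```python
import struct

# CPU-side analogy for "raw byte buffer + pointer casting": one untyped
# allocation, reinterpreted as whatever record layout the code needs.
buf = bytearray(16)

# Write a float and an int32 into the same raw buffer at chosen offsets.
struct.pack_into("<f", buf, 0, 3.5)
struct.pack_into("<i", buf, 4, 42)

# "Cast" the bytes back into typed values, as reinterpret_cast would.
speed, hp = struct.unpack_from("<fi", buf, 0)
print(speed, hp)  # 3.5 42
```

CUDA lets a kernel do exactly this on device memory; classic compute shader languages force you to declare typed structured buffers up front instead.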

Khronos should start supporting C++20 for shaders (basically what CUDA is) and stop the glsl or spirv nonsense.


You might argue for forking off from GLSL and SPIR-V for complex compute workloads, but lightweight, fast compilers for a simple language like GLSL do solve real issues for graphics. Some graphics use cases can't avoid shipping a shader compiler to the user: the number of possible shader configurations is often either insanely large or just impossible to enumerate, so on-the-fly compilation is really the only thing you can do.


Ironically, most people use HLSL with Vulkan, because Khronos doesn't have a budget nor the people to improve GLSL.

So yet another thing where Khronos APIs are dependent on DirectX evolution.

It used to be that AMD and NVidia would first implement new stuff on DirectX in collaboration with Microsoft, have them as extensions in OpenGL, and eventually as standard features.

Now even the shading language is part of it.


For GPGPU tasks, they lack a lot of useful features that CUDA has like the ability to allocate memory and launch kernels from the GPU. They also generally require you to write your GPU and CPU portions of an algorithm in different languages, while CUDA allows you to intermix your code and share data structures and simple functions between the two.


CUDA = C++ on GPUs. Compute shaders = a subset of C with weird quirks.


There are existing efforts to compile SYCL to Vulkan compute shaders. Plenty of "weird quirks" involved since they're based on different underlying varieties of SPIR-V ("kernels" vs. "shaders") and seem to have evolved independently in other ways (Vulkan does not have the amount of support for numerical computation that OpenCL/SYCL has) - but nothing too terrible or anything that couldn't be addressed by future Vulkan extensions.


A subset that lacks pointers, which makes compute shaders a toy language next to CUDA.


Vulkan 1.3 has pointers, thanks to buffer device address[1]. It took a while to get there, and earlier pointer support was flawed. I also don't know of any major applications that use this.

Modern Vulkan is looking pretty good now. Cooperative matrix multiplication has also landed (as a widely supported extension), and I think it's fair to say it's gone past OpenCL.

Whether we get significant adoption of all this I think is too early to say, but I think it's a plausible foundation for real stuff. It's no longer just a toy.

[1] https://community.arm.com/arm-community-blogs/b/graphics-gam...


Is IREE the main runtime doing Vulkan or are there others? Who should we be listening to (oh wise @raphlinus)?

It's been awesome seeing folks like Keras 3.0 kicking out broad intercompatibility across JAX, TF, PyTorch, powered by flexible execution engines. Looking forward to seeing more Vulkan-based runs getting socialized, benchmarked & compared. https://news.ycombinator.com/item?id=38446353


The two I know of are IREE and Kompute[1]. I'm not sure how much momentum the latter has, I don't see it referenced much. There's also a growing body of work that uses Vulkan indirectly through WebGPU. This is currently lagging in performance due to lack of subgroups and cooperative matrix mult, but I see that gap closing. There I think wonnx[2] has the most momentum, but I am aware of other efforts.

[1]: https://kompute.cc/

[2]: https://github.com/webonnx/wonnx


How feasible would it be to target Vulkan 1.3 or such from standard SYCL (as first seen in Sylkan, for earlier Vulkan Compute)? Is it still lacking the numerical properties for some math functions that OpenCL and SYCL seem to expect?


That's a really good question. I don't know enough about SYCL to be able to tell you the answer, but I've heard rumblings that it may be the thing to watch. I think there may be some other limitations, for example SYCL 2020 depends on unified shared memory, and that is definitely not something you can depend on in compute shader land (in some cases you can get some of it, for example with resizable BAR, but it depends).

In researching this answer, I came across a really interesting thread[1] on diagnosing performance problems with USM in SYCL (running on AMD HIP in this case). It's a good tour of why this is hard, and why for the vast majority of users it's far better to just use CUDA and not have to deal with any of this bullshit - things pretty much just work.

When targeting compute shaders, you pretty much have to manage buffers manually, and also do copying between host and device memory explicitly (when needed - on hardware such as Apple Silicon, you prefer to not copy). I personally don't have a problem with this, as I like things being explicit, but it is definitely one of the ergonomic advantages of modern CUDA, and one of the reasons why fully automated conversion to other runtimes is not going to work well.
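The explicit staging described in that last paragraph can be modeled as a toy (the classes and functions below are invented for illustration, not any real GPU API):

```python
# Hedged toy model of compute-shader-style memory management: host data
# must be explicitly copied into a device buffer before dispatch and
# explicitly copied back after -- in contrast to CUDA unified/managed
# memory, where the runtime migrates data for you.

class DeviceBuffer:
    """Pretend GPU allocation; host code shouldn't index it directly."""
    def __init__(self, size):
        self._storage = [0.0] * size   # stands in for VRAM

def upload(host_data):
    buf = DeviceBuffer(len(host_data))
    buf._storage = list(host_data)     # explicit host -> device copy
    return buf

def dispatch_double(buf):
    # stands in for a compute shader dispatch over the whole buffer
    buf._storage = [x * 2 for x in buf._storage]

def download(buf):
    return list(buf._storage)          # explicit device -> host copy

host = [1.0, 2.0, 3.0]
dev = upload(host)
dispatch_double(dev)
result = download(dev)
assert result == [2.0, 4.0, 6.0]
```

The upside of this explicitness is that every transfer is visible in the code (and can be skipped on unified-memory hardware like Apple Silicon); the downside is exactly the ergonomic gap versus CUDA the parent describes.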

[1]: https://stackoverflow.com/questions/76700305/4000-performanc...


Unified shared memory is an intel specific extension of OpenCL.

SYCL builds on top of OpenCL, so you need to know the history of OpenCL. OpenCL 2.0 introduced shared virtual memory, which is basically the most insane way of doing it. Even with coarse-grained shared virtual memory, memory pages can transparently migrate from host to device on access. This is difficult to implement in hardware. The only good implementations were on iGPUs, simply because the memory is already shared. No vendor, not even AMD, could implement this demanding feature. You would need full cache coherence from the processor to the GPU, which is only possible with something like CXL, and that isn't ready even to this day.

So OpenCL 2.x was basically dead. It has unimplementable mandatory features so nobody wrote software for OpenCL 2.x.

Khronos then decided to make OpenCL 3.0, which gets rid of all these difficult to implement features so vendors can finally move on.

So, Intel is building their Arc GPUs and they decided to create a variant of shared virtual memory that is actually implementable called unified shared memory.

The idea is the following: all USM buffers are accessible by CPU and GPU, but the location is defined by the developer. Host memory stays on the host and the GPU must access it over PCIe. Device memory stays on the GPU and the host must access it over PCIe. These two types of memory already cover the vast majority of use cases and can be implemented by anyone. Then finally, there is "shared" memory, which can migrate between CPU and GPU in a coarse-grained manner. This isn't page level - the entire buffer gets moved, as far as I am aware. This allows you to do CPU work, then GPU work, then CPU work. What doesn't exist is a fully cache-coherent form of shared memory.

https://registry.khronos.org/OpenCL/extensions/intel/cl_inte...


https://enccs.github.io/sycl-workshop/unified-shared-memory/ seems to suggest that USM is still a hardware-specific feature in SYCL 2020, so compatibility with hardware that requires a buffer copying approach is still maintained. Is this incorrect?


Good call. So this doesn't look like a blocker to SYCL compatibility. I'm interested in learning more about this.


> Vulkan 1.3 has pointers, thanks to buffer device address[1].

> [1] https://community.arm.com/arm-community-blogs/b/graphics-gam...

"Using a pointer in a shader - In Vulkan GLSL, there is the GL_EXT_buffer_reference extension "

That extension is utter garbage. I tried it. It was the last thing I tried before giving up on GLSL/Vulkan and switching to CUDA. It was the nail in the coffin that made me go "okay, if that's the best Vulkan can do, then I need to switch to CUDA". It's incredibly cumbersome, confusing and verbose.

What's needed are regular, simple, C-like pointers.


Compute shaders are not capable of using modern GPU features like tensor cores or many of the other features needed to feed tensor cores data fast enough (e.g. TMA/cp.async.shared)


Something super creepy happened to me recently: a hospital I'd been to a few months ago called and asked me to participate in some DNA analysis program. They said, "Oh, and the best part? You don't need to do anything! We will use blood samples we collected last time." I obviously declined, but it was a huge wtf to me - they stored biological samples associated with me without informing me, and can do a post hoc DNA analysis. This is just insane, and proof of how nonexistent privacy laws in the US are. (In the EU they cannot freeze any samples without consent, and unfrozen ones are OK for at most a few days.)


Sweden has a registry of blood samples of every person born in sweden since 1975: https://sv.wikipedia.org/wiki/PKU-registret (Swedish only wiki page)

Predictably? Amusingly? The police never had access to this data until a government minister was murdered in 2003, when a sample from the suspect was retrieved. From what we know, it has not been used since. So we can be cynical, but under the circumstances, police use of the registry has not yet taken hold and is guarded by the courts.


Boy, that "yet" reaaaally makes me feel comfortable trusting government with information it can use to kill entire families.


It is interesting how that 'single exception in 2003' really erodes trust in the whole thing. It's very hard to come back from that.


If the government wanted you gone there wouldn’t be this song and dance about getting dna data from a blood bank. They’d just kill you and that would be that.


They can barely agree on a budget. They wouldn't be able to unify against the public, they're too dysfunctional. You'd need a President willing to defy his oath and two other branches on board with him.

Not happening.


Congress is not the agency that does the killing. More efficient ones with practically zero oversight do. See Frank Olson for an example - we only learned about that after MKUltra was revealed in the CIA abuse investigations. His family is still fighting for justice some 60 years after his assassination.


Well, they did call you and try to trick you into letting them use it.

But assuming they obey the law, they did not use your samples.

So there are privacy laws in place? Also, they could have been cleaning out old results/samples and this was one step.


How do I know they didn't use them? They already did something with my biological samples (storing them for a different purpose than when they drew my blood) without my consent or informing me.

And also - could eg. police use it?


I am not sure, only guessing. But why would they ask for permission in the first place?

> They already did something with my biological samples

I can only guess you were tested for something in the hospital. Samples are sent to the lab (a separate department) to be tested. If additional tests need to be performed, they can use the blood they already received. The samples are kept for ready availability in case additional tests are requested by the doctor. Doctors don't care how it's done; they don't have time to tell the lab that patient X has gone home.

After some time the samples need to be destroyed, due to the expiry date on them. Before destroying them, the lab contacted you and asked for permission to do DNA analysis.

> And also - could eg. police use it?

I don't know that. But you might want to check how medical data is protected in your jurisdiction.


They asked for permission because if you want to use data a hospital already has stored for another use, you need to contact the patient to get additional consent (you already gave them consent to collect and keep the sample).

See https://www.hhs.gov/ohrp/regulations-and-policy/guidance/faq... under "Should the initial consent ... be repeated or supplemented?" and I think this is the law: (45 CFR 46.116(b)(5)).


Calling to get more consent for something that was already under consent is not a trick and it's not against the law for them to use the samples.

There isn't anything nefarious going on here. Just scientific research with medical data.


As an engineer, you generally don't know which one applies. Even qualified lawyers don't decide on the actual criminality (a jury or judge after a valid trial does).

So I assume that "reporting wrongdoing" encompasses reporting up your management chain anything that "feels off," from mildly misleading marketing to actual witnessed or experienced harassment. Probably, in the vast majority of the cases, you are not qualified to judge it.

And in any case, even if you are wrong and the reported thing is "ok", there should never be any retaliation for just raising concerns and speaking up.

Side note and something hilarious, creepy, but kind of making sense: at one workplace (not SV big tech, but a big company), I had a manager-only training about "illegal harassment" that explicitly taught us what kinds of harassment are illegal (against protected categories) vs. "just" breaking the company's code of conduct, with quizzes on hypothetical situations and asking us "is this bad behavior potentially illegal harassment?".


This sounds like an insane conflict of interest.

CEO of a company (or worse, non-profit!) and a member of its board creates another, for-profit company (in partial secrecy/lack of transparency) that the non-profit would eventually pay a lot of money. This is almost a fraudulent level of siphoning non-profit money.

Btw, this is hilarious - regular employees have non-competes in their contracts (sometimes void/illegal, depending on the local jurisdiction) and breaching them is an immediately fireable offense (sometimes leading to more severe consequences). You work on a small thing on the side? Better be careful, ask your manager/HR, risk it getting taken over by the company (luckily, IIUC this part is mostly illegal now in all jurisdictions that "matter" for tech).

But sitting on multiple boards, where you have much more room and possibilities for creating conflicts of interest and damaging the company? All fine and common!


There's an even bigger problem here: if he were just making money, that would be a normal-sized problem. If he were just making a supplier for OA, heck, that might be a good thing on net for OA; a subsidiary doing hardware might be justifiable given the importance of securing hardware.

But that's not what he's doing.

Creating such an independent hardware startup comes off as basically directly opposed to OA's safety mission - GPUs are one of the biggest limitations to creating better-than-OA models! The best customers would be the ones who are most sabotaging the OA mission. (You can't run the UN and also head an arms manufacturer dreaming of democratizing access to munitions.)


How is it opposed to OpenAI's goals to have a friendly company selling them chips instead of NVIDIA, which is, at-best, a neutral company?

Software is always more important than hardware. All the big players have access to NVIDIA chips today and yet only OpenAI has ChatGPT, proving the point.

OpenAI probably wishes someone would create competition to NVIDIA and this is Sam Altman trying to make that happen himself, since no one else seems to have been able to pull it off so far.

A conflict of interest would be OpenAI buying Altman's chips at inflated prices or something like that.

But if he makes a bunch of money selling OpenAI chips and OpenAI gets better/cheaper chips, that seems like pure win-win and totally free of ethical conflict.


"Conflict of interest" is not defined by a bad outcome or malice. It is defined by the potential for those, human nature, and various cognitive biases. Handling a conflict of interest means disclosing anything that could lead to a biased decision or a lack of transparency. Can a CEO of two companies be objective about a contract between the two, or when claiming that his company no. 2 is better than the competition?

And in the tech-specific case: As much as junior engineers would love to believe in "superior" solutions, tech decisions are seldom clear-cut. There are many trade-offs: cost, efficiency, memory use, throughput, latency, ease of use, cost of switching, and many more. You always have a pile of pros and cons. Sometimes, one is strong enough, but most of the time, it feels almost like guessing/intuition. And then the conflict of interest becomes especially concerning.


I thought “potential conflict of interest” was the term used to refer to the potential conflict.


yes, and "disclosure of" is the term used when you properly disclose conflicts.


“Disclosure of potential conflicts of interest” — if I understand?

Asking people to disclose potential conflicts of interest casts a wider net — so they aren’t led to think “nah, that’s not a conflict, it’s a win win!”


It may be a conflict of interest, but it will be hidden by layers. OAI has a deal with a GPU producer and there is little conflict as long as they don't interact, but MS, which owns Azure and resells hardware, creates that conflict. Maybe I have a bad view, but I think many such conflicts that come from interdependent parties exist, especially in the datacenter/cloud world.


This is patently wrong. All of it. You made up a concept and then ran with it like it's reality. "Potential" is not an issue. "Actual" is. This isn't a judge. This is a CEO, and they can self-deal as long as it stays a value to the core company. It's up to the board to decide that when it's proven.

A bunch of nerds just thought they could jump the gun here because they are inexperienced doofuses when it comes to corp.


A self-dealing CEO can be held civilly liable (and in gross cases, criminally convicted) for violating their fiduciary duty to act in the company's interest over their own, and for violating company trade secrets.

(If Altman doesn't want to be restricted by fiduciary duties then he shouldn't be on a board or be an executive.)

How could Altman possibly not use private OpenAI information regarding its hardware needs, when creating an AI hardware company he wants to serve OpenAI? And its competitors?

He could invest in a new AI hardware company created and run by other people (without his specific input) without a conflict. That does not appear to be the case here. He could create an OpenAI hardware subsidiary. Again not what he was doing.


“Can”. Everyone here is talking in absolutes. There is a shit ton of room in “can”.


I am not sure what you are saying. Agreeing with me or not? "Can" as "a shit ton of room" vs. some undescribed "absolutes"?

Self-dealing CEOs and board members can be, and are, convicted criminally and found liable civilly for violating their fiduciary duties. And they can be, and are, fired with cause.


Lots of e.g. news organizations absolutely have tons of guardrails in place around conflicts of interest that may or may not really influence behavior but may even have the appearance of potentially doing so.


No, conflict of interest is entirely about potential.


> totally free of ethical conflict.

Unfortunately, far from that case.

Altman's hardware startup would only be free of ethical conflict if Altman was open about it, and the board approved at least two things (and probably more):

1. A formal plan to separate OpenAI CEO Altman from OpenAI hardware acquisition decisions.

2. A formal agreement with Altman and his new company, on how OpenAI's private information with hardware implications is firewalled and/or shared with Altman's new hardware concern.

Otherwise, Altman is going rogue, acting on private OpenAI information useful to a new hardware company looking for future business with OpenAI and OpenAI competitors.

CEO Altman has a fiduciary duty to act directly in OpenAI's interest, and not in some "hey this could be great for everyone" version.

Litmus test: If your legal partner/executive is doing things behind your back with large implications for you, they are almost certainly violating ethics in some way.


Actually I don’t think he does have a fiduciary duty to OpenAI, and neither does the board. It’s a non-profit.


Non-profits totally have boards with fiduciary duties. Just because it's a non-profit doesn't mean it isn't being juiced for money by others. It just means that the org can't distribute profits, but it can totally spend its money unwisely so that it winds up in someone's pockets. Heck, most of America's hospital systems are like that.


The "fiduciary duty" of a nonprofit, such as it is, is just securing operating funds so that it can fulfill its mission. Altman has been spectacularly successful at achieving that goal by obtaining $10 billion in funding. Better, cheaper, or more power-efficient chips, whatever the source, would absolutely help the mission too - and that is of course assuming they actually buy or use them at all in the first place. Firing him on the grounds that it could potentially be a bad deal that "lines his pockets" sometime in the future seems premature.

I think it's a real stretch to say this chip company would be a violation of his "fiduciary duty" to OpenAI. The best argument you could make is that he has a conflict of interest with a competitor. But again, OpenAI is a nonprofit. It doesn't have "competitors". Either it has the funding it needs and is fulfilling its mission, or it isn't.


Fiduciary =/= monetary profits. The board (including Altman) has to put OpenAI's interest first, that is their fiduciary duty.


It's still not clear to me how a board member prospectively running a chip company (a related but different business) works against OpenAI's interests. And people here seem to be making an awful lot of assumptions in order to somehow connect those dots.


Sam Altman has inside information on OpenAI's current and future hardware needs. Sam Altman as CEO, was in a position to direct OpenAI's current and future hardware purchases.

How can he separate those concerns from having his own hardware initiative put together precisely to serve OpenAI and its competitors hardware needs?

Without an agreement with the OpenAI board on how these conflicts of trade secret information and executive power can be settled (Significant shares in the new company for OpenAI?) no competent board would put up with this.

This situation smacks of the ethically questionable transition to a closed/profit organization, after receiving initial funding based on their being an open/non-profit organization. (Apparently the original funders didn't retain any veto power over such a transition, to the regret of at least one significant donor.)


>Without an agreement with the OpenAI board on how these conflicts of trade secret information and executive power can be settled (Significant shares in the new company for OpenAI?) no competent board would put up with this.

I would say if they really did anticipate and worry about such an issue, a competent board would work toward forging such an agreement, rather than firing the CEO years before the aforementioned chip company even existed and before telling any of their other stakeholders.


Well when you put it that way ... yeah. That would have been great.


I’m not taking a side here because I don’t know the facts of the case. But a conflict of interest is a huge deal because it could lead to spending more money than necessary, and is also the main way people juice non profits for profits somewhere else. Of course, if they start a chip company out in the open and not granting it money or guaranteed business from the non profit, things might be up to standard.


The issue is not the OpenAI/SamaChip relationship, which is probably beneficial for OpenAI.

The issue is what happens when SamaChip's profit imperative forces them to maximize revenue. Because the best way to maximize revenue, when you're an independent chipmaker with a large R&D investment, is to sell to more companies. Which by definition are going to be OpenAI's competitors, and whose interests may not be aligned with OpenAI, but will have a financial line to SamaChip.

Gwern's highlighting an interesting contradiction in OpenAI's core charter. In order to ensure responsible, safe, humanity-benefiting AGI, OpenAI needs to have control over AGI's development; any for-profit entity that gets ahead of them probably will not have the same humanity-benefitting mission (actually, we know they won't, they will have a shareholder-benefitting mission by definition). But that means that by their charter, they can't be "Open". Anything like the Developer Day or API or a SamaChip that can sell to other startups means that other parties will have the freedom to use it for their own interests.

Not saying whether this is good or bad - the tension between openness and vulnerability always exists, and personally I tend to come down on the side of openness. But IMHO OpenAI's mission was contradictory from the very beginning, and was more a recruiting tool to get bright idealistic AI researchers to work for them.


Hahaha if you played at any high level in the FAANG you’d know that this is a desirable outcome compared to the alternative.


Where does Ilya stand on transparency in relation to OpenAI's mission of developing AGI?


> How is it opposed to OpenAI's goals to have a friendly company selling them chips instead of NVIDIA, which is, at-best, a neutral company?

Because OpenAI's mission statement is along the lines of providing AI to all. "All" is more than data centers and billion dollar valuation companies.

I strongly doubt I would be able to purchase one of said chips and have it in my house.

This GPU fiasco is all thanks to LLMs - especially transformers, which was OpenAI's trajectory under Altman. I wouldn't be surprised if the breakdown in communication was over OpenAI becoming an LLM printer. Transformers are a solved problem; making a bigger one is hardly research and definitely not a step towards OpenAI's mission statement.


Yes, but creating the processing power with which they can realize their vision is. In interviews, Sam's opponent at OpenAI, Ilya Sutskever, has said that lack of hardware and energy could become major factors impeding progress in AI. It's obvious how more players in the chip field help them, especially if the manufacturer has intimate knowledge of what they need or would like to see made. Even if Sam is gone for good, they should work toward doing this anyway.


I don't know if "strongly doubting" you could buy one of these chips and have it in your house is a strong enough basis for arguing a conflict of interest between OpenAI and this chip company. People seem to be making an awful lot of assumptions about the specifics of a company that doesn't even exist yet.


These are strange times, very strange ones, and you are grossly underestimating how much hardware constraints are impacting people. What I know is under NDA, but trust me, even the biggest names you know and would never guess are short tens of thousands of GPUs


Agree with this take. Sam previously stated (eg on Lex’s podcast) that a slow takeoff soon was his goal, to give society maximum time to adjust and to prevent an unexpected fast takeoff from capability overhang. I bought his take when he said it.

Going off and trying to accelerate hardware capabilities (especially with an outside company that presumably sells these processors on the open market) seems indefensible in this framework unless you have already solved alignment, which they clearly have not.


> basically directly opposed to OA's safety mission

OpenAI does not have a mission to ensure that the entire industry is safe.

And if anyone actually believes that, then they are frankly delusional, because right now AI is a geopolitical fight between nation states. Is OpenAI really going to have any ability to control what China or the UAE do with their LLMs? No.


China is the only other state that has anywhere close to the capacity to attempt to go for AGI. They also have an existential interest in creating safe ASI. The problem is that in geopolitical struggles, safety usually goes a bit out the window, and we've so far been fairly lucky (and surprisingly competent) with not destroying large parts of human civilisation. There are a lot of unknowns around the topic. If ASI is possible and reachable, the first to get there would end the race.

AGI could mean we are not alone as a sentient species we can directly communicate with on this planet anymore, and hopefully we wouldn't try to enslave it like we did to the "others" we encountered before.


> They also have an existential interest in creating safe ASI.

An oppressive state with AGI is no better than an oppressive AGI.

> China is the only other state that has anywhere close to the capacity to attempt to go for AGI.

Any state can easily recreate/adopt the AGI invented by others. The costs and challenges for any given level of AI tech have dropped precipitously after each advance, and state level actors have far more than enough resources.


Why can't, I don't know, the Canadian government hire some ex OpenAI employees and attempt to make an AGI? What is the constraint here that makes it only possible for the U.S. and China?


One of them is money. Canada was lucky enough to get in on AI early, when the whole thing was pie in the sky and most people didn’t care. But today it’s turning into big business, and Canada just can’t compete with Silicon Valley salary levels. And that’s a problem on top of trying to convince people to move from California to Toronto.

Europe and elsewhere have similar problems. To have an advantage, you need the very best people. But the very best people all want jobs at major US companies, and major US companies want them, too.


AI in China must adhere to strict socialist values. I wish that was a joke, but it's actually a rule the government is trying to implement. I suspect it isn't so much safety for the people as it is for the CPC.


Reframing

If this was simply reframed as "creating your own GPUs would radically lower OpenAI's costs in a material way", it'd be more understandable.


Google already does this. There is a material difference between doing it in house and starting a for profit company to do the thing


Kinda depends on your perspective on safety. In your view, should GPUs be regulated or not exist?


reposting this since it's getting more and more relevant

>> Common sense AI control NOW. Ban assault style GPUs

> That is beautiful, I made you a shirt. https://sprd.co/ZZufv7j

/context https://news.ycombinator.com/context?id=38119845


> This sounds like an insane conflict of interest.

It would be, but: this company hasn't been formed yet, and this sketch does not justify the haste with which they kicked him out; there were all kinds of boxes that needed to be checked before they could do that without risking damage to OpenAI, and this is a founder and the CEO we're speaking of.

Besides that: stupidly enough, the contract that Sam has with OpenAI does not have a non-compete in it (this has been confirmed by multiple sources now, so I take it as true), and I don't see how it would directly harm OpenAI, other than that his attention might be diluted and that it should be clear which cap he is wearing. But until that company is formed and Sam names himself CEO of it (or takes up some other high-profile role), it leaves him so many outs that it only makes the board look like bumbling idiots. Because now he can simply say "I would only be an investor," and that would be that - just like the rest of the investors in OpenAI (and, notably, some members of the board) have conflicts of interest at least as large.

So if this was it they're in even more trouble than they were before because now it is the boards' conflicts of interest that will be held up to the light and those are not necessarily smaller.

What a complete circus.


> stupidly enough the contract that Sam has with OpenAI does not have a non-compete

Does it really matter? If the behavior appears to be in conflict with OpenAI, and the board doesn’t like it, then that’s enough to let him go. It doesn’t need to be a contract violation, he just wasn’t doing the job they wanted him to do.


Yes, details like that really matter, and if the party you let go can pull half the company out from under you then you may have won the battle but you've just lost the war. On top of that you'll be in lawsuits until the third generation or so given who else sits at that table (Sequoia, YC and Microsoft to name a few).


Do you think there's a problem with him pitching his hardware side-gig to investors who approached him with an interest in OpenAI's tender offer? That gives the appearance of quid pro quo, with the hope that the investors get to skip the line at the next OpenAI investment opportunity. Imagine someone high up in the Nvidia sales org telling a customer they are all out of H200 graphics cards for the half, but they have a fantastic timeshare investment opportunity they are selling on the side while the customer waits for the next batch.


Not necessarily, unless he proposed to run it himself. I don't see any evidence of that. Just tons of speculation.


I thought non-competes are not legally enforceable in the state of California anyway.


Why do you keep calling him a founder? Is my history wrong or did he jump from leading THIS very website to OpenAI when they came through for a round of sweet hn bucks.


From the caption right underneath his picture on this page:

https://en.wikipedia.org/wiki/OpenAI

"Co-founder and former CEO of OpenAI, Sam Altman" as well as countless news articles repeating this as if it is a well known fact, for which to date I haven't seen any contradicting information.


Looks like he was originally "co-chair" which sounds pretty foundery. https://web.archive.org/web/20170729172845/https://blog.open...


He also committed a chunk of the initial funding. And it's not rare for capital providers who are there on day #1 to call themselves and be referred to as co-founders.

Essentially it is a short-hand for 'those that were present on the founding day of the company and whose names appear on the founding documents besides the lawyers'.

If you get added later on then you are technically not a co-founder, though even there sometimes (and sometimes rightly) exceptions are made.


> And it's not rare for capital providers who are there on day #1 to call themselves and be referred to as co-founders.

I believe the opposite, never heard of any hands-off investors calling themselves co-founders.


So there's different scenarios.

The classic is that a person has an idea, starts working Hands-On, and then goes out and finds capital investment.

In this case, a number of capital holders got together with an idea for a company and then went out and hired several employees from all over the world to work for them.


They may not be hands-off but they may be there from day #1. It's usually a matter of how much time they can commit. If it is only a little bit then it is usually in an oversight role, but it is also possible that they are instrumental in raising the first round of funding. Or, in the case of a medical start-up, the person who allows their name to be used by the venture because they believe in the concept or maybe even because they contributed the idea. Essentially: co-founders are those that were there on day #1, or that the rest of the founders allowed to call themselves co-founder even though they joined later (but only because they made outsized contributions).

So it's a term with a strict definition but also one with plenty of exceptions.


Sounds like a funder rather than a founder. "chairman" is a hands-off role usually.


He was one of the founders and initial financers in 2015 along with Greg, Musk, Thiel, and Bezos.


Your history is wrong. He is the original co-founder of openai as you can review from the publicly available legal data.


In a way it is interesting that people would wade into this discussion without even the most basic facts.


yeah, I wonder where this type of misinformation comes from. Do people simply make things up out of thin air to suit their biases and then proclaim them to be true?


You are totally right. Why would anyone be trying to claim Sam is a founder when he’s clearly not.

The organization was founded in December 2015 by Ilya Sutskever, Greg Brockman, Trevor Blackwell, Vicki Cheung, Andrej Karpathy, Durk Kingma, Jessica Livingston, John Schulman, Pamela Vagata, and Wojciech Zaremba, with Sam Altman and Elon Musk serving as the initial board members.


I'm not sure what the distinction you are drawing is.

Greg, Sam and Elon made a company and literally went out and hired researchers like Ilya.

They pulled together funding, created a company, and hired a team of researchers to operate it. If that isn't founding a company, I don't know what is. Sam was there before all of them.

It is not like a scrappy team of researchers with a pre-existing company went out, got funding, then gave the VCs board seats.

If you are going to copy and paste from Wikipedia, read a little further:

>In December 2015, Sam Altman, Greg Brockman, Reid Hoffman, Jessica Livingston, Peter Thiel, Elon Musk, Amazon Web Services (AWS), Infosys, and YC Research announced[15] the formation of OpenAI and pledged over $1 billion to the venture. The actual collected total amount of contributions was only $130 million until 2019.[6] According to an investigation led by TechCrunch, Musk was its largest donor while YC Research did not contribute anything at all.[16] The organization stated it would "freely collaborate" with other institutions and researchers by making its patents and research open to the public.[17][18] OpenAI is headquartered at the Pioneer Building in Mission District, San Francisco.[19][20]

According to Wired, Brockman met with Yoshua Bengio, one of the "founding fathers" of deep learning, and drew up a list of the "best researchers in the field".[21] Brockman was able to hire nine of them as the first employees in December 2015.

Also good reading: https://web.archive.org/web/20160427162700/http://www.wired....


In SV parlance "founder" is a hands-on, full-on role to take the company from 0 to 1.


Sure, the convention in SV is also that the person created the company and is not an employee of someone else. The founder is there at the inception of the company and it is their idea.

I feel like the facts are on the table, and if they don't convince someone then they are unlikely to be convinced by more facts


I’m sorry but none of that says that Sam or Elon are founders. Your big long Wired blog, for lack of a better term, said nothing of the sort.


facts like Altman claiming he’s got a successful coup goin eh


I think you're letting your personal feelings about Sam Altman cloud your judgment as to what facts you will accept and which you won't.

Whether Altman has a successful coup going or not is immaterial, but whether he is a founder or not is what you asked, and you got a pretty solid answer. You can like the answer or not, but it stands unopposed.


I have no personal feelings toward Altman whatsoever. What I dislike is the worship and rewriting of history to appease the business tycoons in the tech industry, who are little more than parasites in many cases, and I believe this could be one of those cases.

On the flip side I think your blind worship of these tech business leeches may be clouding your judgement.


> regular employees have non-competes in their contracts (sometimes void/illegal, depending on the local jurisdiction)

Non-competes mostly take effect after employment ends, so Altman's situation is a bit more akin to my company's "only job" policy. It means I can't have a side hustle or alternative means of making money. Enforceable or not, while you're employed you can be fired for anything.


I don't see a conflict at all but IANAL. His biggest issue is GPU cost. He was hustling to vertically integrate and knew that the non-profit nature of OpenAI would not allow for it. So he starts to think about the creation of a separate company to handle that with exclusivity of some kind. It makes perfect sense. No idea if he could get the board to go for it of course, but that's clearly something he would have needed to do to make it a reality. And it's completely in his remit to make these kinds of bets. This board may have seen this as an overstep but all they needed to do is tell him no. I'm sure he would have made a persuasive argument had they let him. This board seems completely out of touch with the reality of running a company like this. And GPUs have nothing to do with AI safety, that's like saying a faster neuron makes a person evil or good.


"non-profit nature of OpenAI would not allow for it."

How so? Seeking more efficiency and cost-effectiveness absolutely does not conflict with a non-profit mission.


You seem to be arguing that it isn't a conflict with the non-profit charter. What people are saying is that it's a conflict of interest. It's called self-dealing and one of the most common forms of conflict of interest.


There’s no immediate conflict of interest though. No self-dealing has actually occurred yet, and you have to assume a tremendous amount of bad faith years in advance of reality to get there.

Suppose I am on the board of a wildlife preservation nonprofit, and I am thinking about starting a catering business, and talking to potential financial backers.

If you want to take it to the extreme, you could argue that I could award my catering company a lucrative contract to provide food for the nonprofit's cafeteria. But that would be entirely speculative on your part. It would by no means justify immediately firing me from my position at the nonprofit. If you take those hypotheticals out far enough, you could basically argue that anybody with any other job or role is potentially in a conflict of interest. But again, doing that requires assuming a tremendous amount of bad faith.


Come on, I think it's really obvious that the largest AI nonprofit in the world might be interested in chips, to a much larger degree than a typical nonprofit would be interested in cafeteria food.

I think the analogy is more like someone on the board + Executive Director of your wildlife preservation nonprofit buying up land with potentially endangered animals on the side, with the (presumed) intention of selling the land back to the nonprofit.

Clearly a COI even if it's net good for the animals.


So why not just judge it a conflict of interest if he tries to sell them to them? The board can evaluate the options available to open AI, and agree to allow him to buy from his chip company if and only if it's in OpenAI's interest to. Firing him years before the chip company even exists in the mere anticipation of that possible conflict seems…premature.

Direct and immediate conflicts of interest can arise if decisions you make for one company could regularly come into conflict with decisions you have to make for the other: for example, if he were starting another AI company making rivals to ChatGPT. But in the case of the chip company, nobody has really made a persuasive case for how that would happen here. In terms of interrelatedness, maybe they would build chips customized for OpenAI. But that would be a potential benefit to OpenAI, not a conflict!


There is an immediate conflict of interest. Because one should immediately start to wonder if Sam is holding back the company from making their own chips or perhaps avoiding partnering with another supplier because he wants to funnel future business to his future side project. That may be in the best interest of the non-profit, but it may not be, we can't tell, because of the conflict of interest.


I'm sorry but I have to disagree. There are very few other suppliers. We see it with Apple and ARM: without this vertical integration the M2 laptop upon which I write this post would not exist. When one is running a company, the biggest enemy is always time. Had Apple not planned the ARM move more than a decade ago, it would not have happened in time to revamp the Mac lineup. What you see as a conflict I see as making good bets and moving the ball forward. Given more recent events, it's abundantly clear that the board was out of touch. Letting OpenAI fall behind would have caused it to fail in its mission of safe AGI. When making these judgments, it's always necessary to consider the alternative timelines that would inevitably occur.


Wow, that's some Adam Neumann level self-dealing.


Sam's Alameda moment


Assume someone had been working for a startup, as CEO, for free. For years. That someone had cut himself off from any way to be compensated for his work directly, due to altruism or poor planning or as a result of a negotiation.

At a later time, other ventures that this person had been propping up started failing and a money injection was deemed necessary. It would not be surprising if that person then tried to leverage his unpaid position and monetize it.

Recent tweets from Sam ("go for the full value of my stock") seem to point in this direction.


He presumably at least got paid a salary as CEO, so it's not uncompensated work. In any case, being in some kind of suboptimal financial situation isn't an excuse for breaking the rules.


I don’t know. It could easily be a case of $1 comp, and insane amounts of work and pressure, including financial, from all sides. It is quite clear that crypto is failing and he is deeply engaged in crypto with the WorldCoin project. For all we know, personal money may have been spent on crypto.


Non-competes, like so many other workplace rules, only apply below the C-suite


I think the reverse is true. If you're a grunt no one really cares and it's not worth it to enforce. If you're c suite and leave to start a competitor you can be sure you'll be hearing from company lawyers. Similarly mandatory gardening leave generally grows with your title


> I think the reverse is true. If you're a grunt no one really cares and it's not worth it to enforce.

As a grunt who was handcuffed for three months after switching jobs while my former employer and my new employer tried to sort out what I was and wasn't allowed to do... no, this is not true. And the document I ended up having to sign means I can't say anything other than Amazon (former employer), Microsoft (new employer), and myself came to a mutual agreement.


The C-suite walks in with a golden parachute. No SWE job automatically vests a ton of options when you're fired by the company.


I sincerely don't give a shit and am just kibitzing but: why would this be a conflict of interest at all? One of OpenAI's biggest strategic concerns is getting out from under Nvidia. OpenAI is never going to do hardware; they don't even want to rack servers. The ultimate customer for a new AI chip would be Microsoft, not OpenAI. OpenAI is research and software. Hardware is a complement, not a competitor.


You seriously don't see a conflict of interest with a board member and CEO creating a separate company that sells to the company he is the CEO of? That is the definition of conflict of interest.


Yeah, something can be potentially net good for a company but still be a COI. If I'm in charge of awarding military contracts and I give one to an armaments company my son is a VP of, and I didn't disclose to anybody that my son was a VP there, this is a clear COI even if I later argue (and people agree with me!) that my son's armaments company is best suited for the business.

At the very minimum I should not be the one to make that call.


A more usual method is to have the non-profit contract with a for-profit "management consulting firm" that does the actual running of things. The non-profit can take in money and pay the for-profit for providing the thing the donation was earmarked for; all legal and fine. The same people can be employed by both companies.

Any profits the "for profit" arm makes can then be donated back to the non-profits for financial advantage in holding the money, if any. Lather, rinse, repeat.


I think Altman hails from a kind of hustle culture that may not go great with AI safety:

>I just saw Sam Altman speak at YCNYC and I was impressed. I have never actually met him or heard him speak before Monday, but one of his stories really stuck out and went something like this:

> "We were trying to get a big client for weeks, and they said no and went with a competitor. The competitor already had a terms sheet from the company we were trying to sign up. It was real serious.

> We were devastated, but we decided to fly down and sit in their lobby until they would meet with us. So they finally let us talk to them after most of the day.

> We then had a few more meetings, and the company wanted to come visit our offices so they could make sure we were a 'real' company. At that time, we were only 5 guys. So we hired a bunch of our college friends to 'work' for us for the day so we could look larger than we actually were. It worked, and we got the contract."

> I think the reason why PG respects Sam so much is he is charismatic, resourceful, and just overall seems like a genuine person.

https://news.ycombinator.com/item?id=3048944

The crypto WorldCoin thing is probably a bigger example.


> We then had a few more meetings, and the company wanted to come visit our offices so they could make sure we were a 'real' company. At that time, we were only 5 guys. So we hired a bunch of our college friends to 'work' for us for the day so we could look larger than we actually were. It worked, and we got the contract."

That strategy was tried by Barry Minkow with "ZZZZ Best", the fake building-maintenance-company fraud.[1] He did prison time for that.

[1] https://en.wikipedia.org/wiki/Barry_Minkow


"Its not fraud when we do it"


> While still in high school, Minkow founded ZZZZ Best (pronounced "Zee Best"), which appeared to be an immensely successful carpet-cleaning and restoration company. However, it was actually a front to attract investment for a massive Ponzi scheme.

Apples and oranges.


It's not fraud if Sam's company delivered. Did they?


I'm not sure if that's true. If they obtained a financial advantage (e.g. a contract) via deception, then that could well be fraud. I don't think it makes much difference if they eventually delivered on the contract or not.


It'd probably come down to the legal definition of "deception." It's not Sam's fault if the customer saw a lot of people poring over screens and walking around with papers and assumed they were employees. But if he explicitly introduced them as employees, yeah, that'd be over the line.

Similar to the debate over whether it was ethical of Reddit to seed themselves with sock puppets.


It totally is his fault if those people were directed by him to pretend to be employees of the company – which he seems to admit.

"I contracted some people specifically to pretend to be employees so that someone else would be deceived into thinking they were employees. But somehow it's still not my fault that this person was deceived."

Seems like a weak defense.


Well, they were employees, after all. Temporary ones. Very temporary ones.


Yeah, this kind of pedantry isn't going to work in a legal setting. Obviously the intent was to deceive the person into thinking that the company had a large number of regular employees.


I still don't see where the bright line of the law was crossed here. Is it really so surprising when people who are willing to do things like this get ahead of people who aren't?

Are you really so certain that we'd be better off otherwise?


Is it surprising that people sometimes lie and cheat? No, of course not.

Is it surprising that people who lie and cheat sometimes obtain an advantage by doing so? Again, no.

But yes, I do think we would be better off without people lying and cheating.

IANAL and I do not claim that the law was necessarily broken in this instance, but the intent was clearly to deceive a business partner in order to obtain a financial advantage. I mean just look at SA's own words here:

> So we hired a bunch of our college friends to 'work' for us for the day so we could look larger than we actually were

The deceptive intent is explicit in his own description of events. It's honestly quite silly that you are looking for some semantic get-out clause here when SA's own statement is so explicit.


There was another AI start-up, one backed/supported by various conservative politicians from Germany and Austria, that did the same thing, hiring people to look busy when customers (bad) and investors (a lot worse) showed up at the offices: Augustus Intelligence.

At best, this is fake it till you make it. At worst, it is fraud. The tiny, tiny difference is whether that next investor shows up or not. Just ask FTX.

I do not respect people resorting to that kind of thing.


I have never seen him speak, but isn't it odd that an example of him faking it is supposed to make him seem genuine?


AND a complete Moby Dick move. You're going to suddenly get up and compete with all the chip designers and beat them at their job? Computer chips are literally the most complex products in the world. It is an astonishing development that the most profitable company in the world was able to make chips, let alone chips that beat Intel and Nvidia in some key metrics. It took Apple more than a decade of shipping chips in phones to be able to move up to desktop-class.

The board was right to get rid of a guy who would rather hunt a white whale than do his job.


And now they want him back because the investors want him back.

What a mess.


> SoftBank and others had hoped to be part of this deal, one person said, but were put on a waitlist for a similar deal at a later date. In the interim, Altman urged investors to consider his new ventures, two people said.

JFC


> that the non-profit would eventually pay a lot of money

That's not a fact. It's your assumption.

And not a particularly good one, given that Microsoft is OpenAI's partner and is providing compute services at no cost.


When you do lunch with Microsoft, you want to bring your own spoon.

Better make it a long one.


> All fine and common!

Sam got fired, so obviously not fine.


There is no evidence that this is the reason though.

And so everything needs to be caveated with "this is an assumption".


Major mistake by OpenAI's board not to mention this if it actually did play a role in removing him; public opinion now is that it was a coup over AI safety stuff. The OpenAI board seemed to have no PR plan.


They mentioned it clearly. They said he had hidden something from them.

The other story was later constructed in forums and media.


Conflict of interest?

This is no different than Google designing and manufacturing their own chips (TPU, tensor processing unit)


The claim is that he's starting different AI hardware companies on the side, not that he's doing it under OpenAI's umbrella. It's more like if Sundar tried to get funding for some side hustles while talking with potential customers or contractors of Google, without disclosing to Google's board ahead of time.


Not at all fine. As we’ve just seen, it’s a fireable offense.


One example of a conflict of interest is Adam D'Angelo sitting on OpenAI's board while Quora is building a competitor, Poe.

And it has been rumored that D'Angelo helped a lot with this coup because he did it before at Quora.


Every single one of Elon's ventures is in conflict of interest with the others. Simply seeking money for a chip venture is small compared to pulling Tesla engineers out to work on X, the stuff with SolarCity, etc.


As Twitter's usage declined, the extra Sacramento DC was sold to Tesla, despite Musk saying it was terrible.

https://www.theinformation.com/articles/after-elon-musk-bash...


It is funny that you say that after Elon co-founded OpenAI and then left due to conflicts of interest in the AI space.

