Hacker News | biorach's comments

> Their valuation is in the same order of magnitude as OpenAI, but they have orders of magnitude fewer users.

it's an open question how many of OpenAI's users are monetizable.

There's an argument to be made that your brand being what the general public identifies with AI is a medium term liability in light of the vast capital and operating costs involved.

It may well be that Anthropic focusing on a market that is orders of magnitude smaller, but immediately monetizable, will play out better.


I wouldn't count on it being immediately monetizable. At least not to the point where training foundation models becomes fundamentally profitable. And from what we're seeing right now, you have to do that or you will get left behind fast. But with a billion active users, you are approaching Facebook levels of market penetration and thereby advertising-potential. So in the mid to long term, this is certainly more valuable.


There are certain styles of programming and data structure implementations that end up requiring you to fight Rust at almost every step. Things like intrusive data structures, pointer manipulation and so on. Famously there is an entire book online on how to write a performant linked list in idiomatic Rust - something that is considered straightforward in C.

For these cases you could always use Zig instead of C


Given Zig's approach to safety, you can get the same in C with the static and runtime analysis tools that many devs keep ignoring.

Already setting the proper defaults in a Makefile would get many people halfway there, without switching to a language that has yet to reach 1.0 and has no use-after-free story.


> many devs keep ignoring

And that's why Zig doesn't offer much. Devs will just ignore it.


it is not straightforward in rust because the linked list is inherently tricky to implement correctly. Rust makes that very apparent (and, yeah, a bit too apparent).

I know, a linked list is not exactly super complex and rust makes that a bit tough. But the realisation one must have is this: building a linked list will break some key assumptions about memory safety, so trying to force that into rust is just not gonna make it.

The problem, I guess, is that several of us have forgotten about memory safety and it's a bit painful to be reminded of it by a compiler :-)


Can you elaborate on what key assumptions about memory safety linked lists break? Sure, doubly linked lists may have non-trivial ownership, but that doesn't compromise safety.


Rust wants all memory to be modeled as an ownership tree: the same bit of memory can't be owned by more than one data structure. A doubly linked list breaks that requirement so it can't be modeled in safe Rust directly. The options are using unsafe, or using one of the pointer wrapper types that have runtime checks that ensure correct behavior and own the underlying memory as far as Rust is concerned.
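
As a rough illustration of the second option (runtime-checked pointer wrappers), here is a minimal sketch of a doubly linked list in safe Rust. The `Node` type and the `build`, `collect_forward` and `collect_backward` helpers are made up for this example and come from no library. The forward link owns the next node through `Rc`, while the backward link is a non-owning `Weak`, so ownership stays tree-shaped.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical node type for illustration only.
struct Node {
    value: i32,
    next: Option<Rc<RefCell<Node>>>,   // owning forward link
    prev: Option<Weak<RefCell<Node>>>, // non-owning backward link
}

/// Builds a doubly linked list and returns (head, tail); None for empty input.
fn build(values: &[i32]) -> Option<(Rc<RefCell<Node>>, Rc<RefCell<Node>>)> {
    let mut iter = values.iter();
    let head = Rc::new(RefCell::new(Node { value: *iter.next()?, next: None, prev: None }));
    let mut tail = Rc::clone(&head);
    for &v in iter {
        let node = Rc::new(RefCell::new(Node {
            value: v,
            next: None,
            prev: Some(Rc::downgrade(&tail)),
        }));
        tail.borrow_mut().next = Some(Rc::clone(&node));
        tail = node;
    }
    Some((head, tail))
}

/// Walks head to tail through the owning links.
fn collect_forward(head: &Rc<RefCell<Node>>) -> Vec<i32> {
    let mut out = Vec::new();
    let mut cur = Some(Rc::clone(head));
    while let Some(node) = cur {
        out.push(node.borrow().value);
        cur = node.borrow().next.clone();
    }
    out
}

/// Walks tail to head through the weak links.
fn collect_backward(tail: &Rc<RefCell<Node>>) -> Vec<i32> {
    let mut out = Vec::new();
    let mut cur = Some(Rc::clone(tail));
    while let Some(node) = cur {
        out.push(node.borrow().value);
        // upgrade() is a runtime check: it yields None if the node was freed.
        cur = node.borrow().prev.as_ref().and_then(|w| w.upgrade());
    }
    out
}
```

The price is a reference count and a borrow flag per node, plus a check on every access, which is the overhead the unsafe route avoids.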


Right. So it is not that doubly linked lists are inherently unsafe, it is (just) that Rust's ownership model cannot represent them (or any other cyclic structures).


It's not that it cannot, it just doesn't want to :-) (but you're right). I guess that in this very case of DLL, it's a bit hard to swallow. To be honest, it's because the rest of rust really helps me in other areas of my projects that I have accepted that. Learning the ownership model of rust is really painful, it really forces you to code in its way and it was not pleasant to me.


I've been trying to convert an in-memory database to Rust and failed. It is strictly single-threaded and full of intrusive lists. I tried hard to please the borrow checker, but one has little choice when internal structures are full of cycles. The result was an ugly mess of Rc all over the place. I guess it is just an example of a problem that doesn't fit Rust well.

This makes me wonder: what performance cost does Rust code pay due to its inability to represent cyclic structures efficiently? It seems people tend to design their data in a way that pleases Rust rather than in a way that would otherwise be more efficient.


Using Rc doesn't sound like an intrusive list to me. Personally I find tons of Rcs to be messy, so I'd agree with you.

> what performance cost does Rust code pay due to its inability to represent cyclic structures efficiently?

You can still write the code you'd write in C with unsafe. There's no inherent loss left on the table.

Furthermore, a lot of C folks reach for intrusive lists because it's easy in C, but that doesn't mean that it's always the most performant. See https://bcantrill.dtrace.org/2018/09/28/the-relative-perform... as an example of this phenomenon.


You can do it by combining ghostcell/qcell along with some bespoke "static/compile-time reference counting" for the double links part. But ghostcell/qcell is quite difficult to use with Rust's current feature set (it has to use lifetime hacks to place a safe "brand" on type instantiations, a kind of quasi-capability construct), so it hasn't become a part of standard rust so far.


Sure. Now import a useful performant B+tree in C from a reusable library, while enforcing type safety for your keys and values.

Lots of things C chose to use intrusive pointers and custom data structures for, you would program very differently in a different language.

I'm an old C neckbeard and I find Rust a great experience. Some of the arguments against it sound like people are complaining about how hard it is to run pushing a bicycle.


Or just build a tested unsafe implementation as a library. For example the Linked List in the standard library.

https://doc.rust-lang.org/src/alloc/collections/linked_list....


Yeah, if you need a linked list (you probably don't) use that. If however you are one of the very small number of people who need fine-grained control over a tailored data-structure with internal cross-references or whatnot then you may find yourself in a world where Rust really does not believe that you know what you are doing and fights you every step of the way. If you actually do know what you are doing, then Zig is probably the best modern choice. The TigerBeetle people chose Zig for these reasons, various resources on the net explain their motivations.


The point with the linked list is that it is perfectly valid to use unsafe to design said "tailored data structure with internal cross-references or whatnot" library and then expose a safe interface.

If you’re having trouble designing a safe interface for your collection then that should be a signal that maybe what you are doing will result in UB when looked at the wrong way.

That is how all standard library collections in Rust work. They've just gone to the length of formally verifying parts of the code to ensure performance and safety.
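
To make the "unsafe inside, safe outside" idea concrete, here is a minimal sketch of a doubly linked deque that uses raw pointers internally and exposes only safe methods. `Deque` and `Node` are hypothetical names for this example; this is not the standard library's `LinkedList` implementation, only the same general shape.

```rust
use std::ptr;

struct Node<T> {
    value: T,
    next: *mut Node<T>,
    prev: *mut Node<T>,
}

/// Hypothetical deque: raw pointers inside, safe API outside.
pub struct Deque<T> {
    head: *mut Node<T>,
    tail: *mut Node<T>,
}

impl<T> Deque<T> {
    pub fn new() -> Self {
        Deque { head: ptr::null_mut(), tail: ptr::null_mut() }
    }

    pub fn push_back(&mut self, value: T) {
        let node = Box::into_raw(Box::new(Node { value, next: ptr::null_mut(), prev: self.tail }));
        if self.tail.is_null() {
            self.head = node;
        } else {
            // SAFETY: tail is non-null and was created by Box::into_raw above.
            unsafe { (*self.tail).next = node };
        }
        self.tail = node;
    }

    pub fn pop_front(&mut self) -> Option<T> {
        if self.head.is_null() {
            return None;
        }
        // SAFETY: head is a live node from Box::into_raw; it is unlinked
        // below, so it can never be freed twice.
        let node = unsafe { Box::from_raw(self.head) };
        self.head = node.next;
        if self.head.is_null() {
            self.tail = ptr::null_mut();
        } else {
            // SAFETY: the new head is a live node owned by this deque.
            unsafe { (*self.head).prev = ptr::null_mut() };
        }
        Some(node.value)
    }

    pub fn pop_back(&mut self) -> Option<T> {
        if self.tail.is_null() {
            return None;
        }
        // SAFETY: same reasoning as pop_front, mirrored.
        let node = unsafe { Box::from_raw(self.tail) };
        self.tail = node.prev;
        if self.tail.is_null() {
            self.head = ptr::null_mut();
        } else {
            // SAFETY: the new tail is a live node owned by this deque.
            unsafe { (*self.tail).next = ptr::null_mut() };
        }
        Some(node.value)
    }
}

impl<T> Drop for Deque<T> {
    // Free any nodes that are still linked.
    fn drop(&mut self) {
        while self.pop_front().is_some() {}
    }
}
```

Callers never see a raw pointer, so the invariants (every non-null pointer is a live, uniquely owned node) only have to be upheld inside this one module.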


> If you’re having trouble designing a safe interface for your collection then that should be a signal that maybe what you are doing will result in UB when looked at the wrong way.

Rust is great, but there are some things that are safe (and you could prove them safe in the abstract), but that you can't easily express in Rust's type system.

More specifically, there are some things, and usage patterns of those things, that are safe when taken together. But the library can't force the safe usage pattern on the client with the tools that Rust provides.


If you can't create a safe interface and must have the function then create an unsafe function and clearly document the invariants and then rely on the user to uphold them?

Take a look at the unsafe functions for the standard library Vec type to see examples of this:

https://doc.rust-lang.org/std/vec/struct.Vec.html#method.fro...
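
A minimal sketch of that pattern, assuming hypothetical helper names (`into_parts`, `from_parts`, `round_trip`): the unsafe function documents its contract in a `# Safety` section, and each call site states why the contract holds. `Vec::from_raw_parts` is the real standard library function; everything around it is made up for illustration.

```rust
use std::mem::ManuallyDrop;

/// Splits a Vec into (pointer, length, capacity) without freeing it.
fn into_parts(v: Vec<u32>) -> (*mut u32, usize, usize) {
    let mut v = ManuallyDrop::new(v); // keep the buffer alive
    (v.as_mut_ptr(), v.len(), v.capacity())
}

/// Rebuilds a Vec from raw parts.
///
/// # Safety
/// `ptr`, `len` and `cap` must come from a single `into_parts` call and
/// must be passed here at most once, otherwise the buffer is freed twice.
unsafe fn from_parts(ptr: *mut u32, len: usize, cap: usize) -> Vec<u32> {
    // SAFETY: the caller upholds the contract documented above.
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}

/// Safe wrapper: upholds the invariants on behalf of its callers.
fn round_trip(v: Vec<u32>) -> Vec<u32> {
    let (ptr, len, cap) = into_parts(v);
    // SAFETY: the parts come straight from into_parts and are used once.
    unsafe { from_parts(ptr, len, cap) }
}
```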


> If you can't create a safe interface and must have the function then create an unsafe function and clearly document the invariants and then rely on the user to uphold them?

Yes, that's what you do in practice. But it's no different--in principle--from the approach C programmers have to use.


>>That is how all standard library collections in Rust works

Yeah, and that's what's not going to work for high performance data structures, because you need to embed hooks into the objects - not just put objects into a bigger collection object. Once you think in terms of a collection that contains things you have already lost that specific battle.

Another thing that doesn't work very well in Rust (from my understanding, I tried it very briefly) is using multiple memory allocators which is also needed in high performance code. Zig takes care to make it easy and explicit.


I think that misses the point though. C trusts you to design your own linked list.

It also trusts your neighbor, your kid, your LLM, you, your dog, another linked list...


what is an intrusive data structure?


A container class that needs cooperation from the contained items, usually with special data fields. For example, a doubly linked list where the forward and back pointers are regular member variables of the contained items. Intrusive containers can avoid memory allocations (which can be a correctness issue in a kernel) and go well with C's lack of built-in container classes. They are somewhat common in C and very rare in C++ and Rust.


At least for a double linked list you can probably get pretty far in terms of performance in the non-intrusive case, if your compiler unboxes the contained item into your nodes? Or are there benefits left in intrusive data structures that this doesn't capture?


Storing the data in nodes doesn't work if the given structure may need to be in multiple linked lists, which iirc was a concern for the kernel?

And generally I'd imagine it's quite a weird form for data structures for which being in a linked list isn't a core aspect (no clue what specifically the kernel uses, but I could imagine situations where objects aren't in any linked list for 99% of time, but must be able to be chained in even if there are 0 bytes of free RAM ("Error: cannot free memory because memory is full" is probably not a thing you'd ever want to see)).


> Storing the data in nodes doesn't work if the given structure may need to be in multiple linked lists, which iirc was a concern for the kernel?

That's a great point! A language almost like C plus a smart enough compiler could do the unboxing, but this technique doesn't work for multiple structures.

> And generally I'd imagine it's quite a weird form for data structures for which being in a linked list isn't a core aspect (no clue what specifically the kernel uses, but I could imagine situations where objects aren't in any linked list for 99% of time, but must be able to be chained in even if there are 0 bytes of free RAM ("Error: cannot free memory because memory is full" is probably not a thing you'd ever want to see)).

I think you are right in practice, though in principle you could pre-allocate the necessary memory when you create the items? When you have the intrusive links, you pay for their allocation already anyway, too. In terms of total storage space, you wouldn't pay for more.

(And you don't have to formally alloc them via a call to kmalloc or so. In the sense that you don't need to find a space for them: you just need to make sure that the system keeps a big enough buffer of contiguous-enough space somewhere. Similar to how many filesystems allow you to reserve space for root, but that doesn't mean any particular block is reserved for root up-front.)

But as I thought, that's about in-principle memory usage. A language like C makes the alternative of intrusive data structures much simpler.


> you just need to make sure that the system keeps a big enough buffer of contiguous-enough space somewhere. Similar to how many filesystems allow you to reserve space for root

Said file system reserving can fill up (I've experienced that :) ). Could perhaps engineer some super-equation for a global reserved memory size that can actually guarantee being sufficient; but, unless you mark allocations as needing reserved space vs not and compute based on that (which is back to overhead), that'll necessarily waste a good amount of memory.

And such a global equation of course can result in global failure if just one subsystem miscalculates its required reserved amount, or the calculations improperly account for worst-case non-contiguousness or interleaved different-size-linked-list-alloc-requests or whatever. (never mind that you now need to alloc when adding elements to linked lists, an action that would otherwise be 1-4 inlined stores)

Probably (?) possible to handle, but way way way more easy to break or get wrong (with very-difficult-to-debug consequences) than just some pointer fields.

Oh and also of course, if you store list data in a separate allocation from the main object, you lose the ability to do O(1) (and extremely-cheap at that) "remove self from linked list just given my own pointer", which is probably quite common (and you can't even O(1) with an arraylist instead of a linkedlist).


> Storing the data in nodes doesn't work if the given structure may need to be in multiple linked lists

That is why the kernel mostly (always?) uses intrusive linked lists. They have no such problem.


The main thing is that the object can be a member of various structures. It can be in a big general queue and in a priority queue, for example. Once you find it and deal with it you can remove it from both without needing to search for it.

Same story for games where it can be in the list of all objects and in the list of objects that might be attacked. Once it's killed you can remove it from all lists without searching for it in every single one.
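
A rough sketch of the game example, with the caveat that it swaps in indices into a `Vec` for the raw pointers a C implementation would use. `World`, `Item`, `Link` and the two list ids are hypothetical names. Each item carries one embedded link pair per list, so linking never allocates and removal needs only the item's own id.

```rust
const NIL: usize = usize::MAX; // sentinel for "no neighbour"
const ALL: usize = 0;          // list of all objects
const TARGETS: usize = 1;      // list of objects that might be attacked

#[derive(Clone, Copy)]
struct Link { prev: usize, next: usize, linked: bool }

struct Item {
    name: &'static str,
    links: [Link; 2], // one embedded link pair per list
}

struct World {
    items: Vec<Item>,
    heads: [usize; 2],
}

impl World {
    fn new() -> Self {
        World { items: Vec::new(), heads: [NIL; 2] }
    }

    /// Creates an object that is not yet on any list and returns its id.
    fn spawn(&mut self, name: &'static str) -> usize {
        let unlinked = Link { prev: NIL, next: NIL, linked: false };
        self.items.push(Item { name, links: [unlinked; 2] });
        self.items.len() - 1
    }

    /// Pushes `id` onto the front of `list`. O(1), no allocation.
    fn link(&mut self, list: usize, id: usize) {
        if self.items[id].links[list].linked { return; }
        let old_head = self.heads[list];
        self.items[id].links[list] = Link { prev: NIL, next: old_head, linked: true };
        if old_head != NIL {
            self.items[old_head].links[list].prev = id;
        }
        self.heads[list] = id;
    }

    /// Unlinks `id` from `list` given only its id. O(1), no search.
    fn unlink(&mut self, list: usize, id: usize) {
        let Link { prev, next, linked } = self.items[id].links[list];
        if !linked { return; }
        if prev != NIL { self.items[prev].links[list].next = next; } else { self.heads[list] = next; }
        if next != NIL { self.items[next].links[list].prev = prev; }
        self.items[id].links[list] = Link { prev: NIL, next: NIL, linked: false };
    }

    /// Removes a killed object from every list it is on.
    fn kill(&mut self, id: usize) {
        self.unlink(ALL, id);
        self.unlink(TARGETS, id);
    }

    /// Names on `list`, front to back.
    fn names(&self, list: usize) -> Vec<&'static str> {
        let mut out = Vec::new();
        let mut cur = self.heads[list];
        while cur != NIL {
            out.push(self.items[cur].name);
            cur = self.items[cur].links[list].next;
        }
        out
    }
}
```

Index-based links stay in safe Rust at the cost of bounds checks, and a stale id is a logic bug rather than a memory-safety bug.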


Thanks! That makes a lot of sense.


> if your compiler unboxes the contained item into your nodes

Are there known compilers that can do that?


Haskell's GHC partially does it. LLVM can do it in principle, if your frontend gives enough information. Some JVMs can partially do some of it.

The above is about the optimiser figuring out whether to box or unbox by itself.

If you are willing to give the compiler a hand: Rust can do it just fine and it's the default when you define data structures. If you need boxing, you need to explicitly ask for it, eg via https://doc.rust-lang.org/std/boxed/struct.Box.html
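
A small sketch of the Rust case, using made-up types (`Payload`, `InlineNode`, `BoxedNode`): a field is stored inline unless it is wrapped in `Box`, and `size_of` makes the difference visible. Exact sizes depend on the target, so none are quoted here.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Payload { a: u64, b: u64 }

// Default: the payload lives inside the node itself (unboxed).
#[allow(dead_code)]
struct InlineNode { item: Payload, next: Option<Box<InlineNode>> }

// Explicit boxing: the node holds only a pointer to a separate allocation.
#[allow(dead_code)]
struct BoxedNode { item: Box<Payload>, next: Option<Box<BoxedNode>> }

/// Size of a node that embeds its payload.
fn inline_node_size() -> usize { size_of::<InlineNode>() }

/// Size of a node that points at its payload.
fn boxed_node_size() -> usize { size_of::<BoxedNode>() }
```

The `next` link still has to be boxed in both cases, because a struct cannot contain itself by value.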


A data structure that requires you to change the data to use it.

Like a linked list that forces you to add a next pointer to the record you want to store in it.


> the most state-of-the-art perpetual futures market in the world is an Ethereum Layer 2 named Lighter

Is this not just a state of the art innovation in the Ponzi scheme and online casino space?


It's true that perp platforms are zero-sum games mostly catering to extremely high risk traders that overwhelmingly rely on luck more than skill. I don't use perp platforms myself.

It's also true that perp platforms can provide very accessible and efficient hedging. For example, if you own NVDA and don't want exposure to their quarterly results volatility, you can take a much smaller amount of collateral than your underlying NVDA shares and use that to open a 10x leveraged short on NVDA in the same size as your main NVDA position. This makes you "delta neutral" so the USD value of your position won't change even if NVDA craters on quarterly results. All without selling your underlying shares. Then you can close the short after the quarterly results are absorbed by the market.
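
The arithmetic can be sketched as follows. The share count and prices in the note after the code are hypothetical, and the model deliberately ignores funding payments, fees and liquidation of the short, all of which matter in practice.

```rust
/// Collateral needed to open a position of `notional` at `leverage`x.
fn required_collateral(notional: f64, leverage: f64) -> f64 {
    notional / leverage
}

/// Returns (spot P&L, short P&L, net P&L) for a long spot position hedged
/// by an equal-sized short, when the price moves from `entry` to `exit`.
fn hedged_pnl(shares: f64, entry: f64, exit: f64) -> (f64, f64, f64) {
    let spot = shares * (exit - entry);  // the long loses when the price falls
    let short = shares * (entry - exit); // the short gains by the same amount
    (spot, short, spot + short)
}
```

With 100 shares at a price of 180, the notional is 18,000 and a 10x short needs 1,800 of collateral. If the price drops to 150, the shares lose 3,000, the short gains 3,000, and the net change is zero.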

Separately, here is a list of transformative benefits of public decentralized chains https://news.ycombinator.com/item?id=46175312


Tip:

A list compiler should be relatively straightforward, as these things go. If you get the AI to write it you should actually read it, all of it, and understand it, to the point where you can add features and fix bugs yourself. There are many many resources on the subject. Only after this should you consider contributing to open source projects. And even then you need to be able to read and understand your contributions


Are you speaking from experience?

Have you actually tried writing a "list" compiler?


The wider point is that copyright headers are a very important detail and that a) the AI got it wrong b) you did not notice c) you have not taken on board the fact that it is important despite being told several times and have dismissed the issue as unimportant

Which raises the question: how many other important incorrect details are buried in the 13k lines of code that you are unaware of and unable to recognise the significance of? And how much maintainer time would you waste being dismissive of the issues?

People have taken the copyright header as indicative of wider problems in the code.


Yes, please then find those for now imaginative issues and drill through them? Sorry, but I haven't seen anyone in that MR calling out for technical deficiencies so this is just crying out loud in a public for no concrete reasons.

It's the same as if your colleague sitting next to you would not allow the MR to be merged for various political and not technical reasons - this is exactly what is happening here.


> Yes, please then find those for now imaginative issues and drill through them?

No, that is a massive amount of work which will only establish what we already know with a high degree of certainty due to the red flags already mentioned - that this code is too flawed to begin with.

This is not political, this is looking out for warning signs in order to avoid wasting time. At this stage the burden of proof is on the submitter, not the reviewers.


Too flawed? Did you miss that tiny detail that MR fixes a long time issue for ocaml? This is exactly political because there's no legal or technical issue. Only fluff by scared developers. I have no stakes in this but I'm sincerely surprised by the amount of unreasonable and unsubstantiated claims and explanations given in this thread and MR


I don't get why you do not understand why nobody wants to waste time on an MR where the author themselves didn't even have any interest in looking over it once. https://github.com/ocaml/ocaml/pull/14369/files#diff-bc37d03... also all the unused functions...

did it fix a long time issue? maybe, but 9 tests for 13k lines doesn't give much confidence in that

and even if it worked perfectly, who will maintain this?


An incorrect copyright header is a major red flag for non-technical reasons. If you think it is an irrelevant minor matter then you do not understand several very important social and legal aspects of the issue.


Social, maybe, yes, but what legal aspects? Everybody keeps repeating that but there is no copyright infringement. Maybe you can point me to one?

I understand that people are uncomfortable with this, I am likely too, but objectively looking there's technically nothing wrong or different to what humans already do.


The point is that it ended up in the PR in the first place. The submitter seemed unaware of its presence and only looked into it after it was pointed out. This is sloppy and is a major red flag.


So there's no point? Sloppy maybe yes but technically incorrect or legally questionable no. Struggle is real


If the submitter is sloppy with things that are not complicated, how can one be sure of things that ARE complicated?


The funny thing is that it works, have a look at the MR. It says:

  All existing tests pass. Additional DWARF tests verify:

  DWARF structure (DW_TAG_compile_unit, DW_TAG_subprogram).
  Breakpoints by function and line in both GDB and LLDB.
  Type information and variable visibility.
  Correct multi-object linking.
  Platform-specific relocation handling.

So the burden of proof is obviously no longer on the MR submitter's side but on the other.


Might be $TERM needs to be set or you need to add ghostty to terminfo


> Might be $TERM needs to be set or you need to add ghostty to terminfo

Yeah, except that the specific terminfo needed for ghostty isn't installed anywhere on the boxes you ssh into ... you need to manually install it on every single one of them.

That in and of itself makes it truly painful to switch to ghostty.

And there are still a lot of other issues, like e.g. building the tip is a freaking nightmare of dependencies and weird issues (hard reliance on specific versions of the zig compiler and of something called "blueprint compiler", etc...)

Not ready for prime time by a mile IMO.


> the specific terminfo needed for ghostty isn't installed anywhere on the boxes you ssh into ... you need to manually install it on every single one of them.

Yeah this is going to be an issue with any of the newer terminal emulators. No big deal. Updating terminfo is easy. If you can't then just set TERM=xterm

> Not ready for prime time by a mile IMO.

Nah, the issue is your lack of experience and understanding of the basics of terminals.


> Updating terminfo is easy.

lol.

sure, very easy to do this on order of magnitude 1000 remote machines whose various OS's are entirely managed by automation.


no need for the bitchy response.


> no need for the bitchy response.

Says the man who accuses others of lack of expertise without a shred of evidence.


Well you seem to think automating update of terminfo on a fleet of machines is too difficult so...


I don't think you've used Ghostty in a while. It has auto installing terminfo when ssh-ing.

Also, every program ever depends on a certain version of a compiler, so I don't understand this complaint. Ghostty requires Zig 0.14. That's it, not a specific compiler hash. blueprint-compiler is packaged for pretty much every distribution these days.


I'm not sure how the fact that it's complicated to compile makes it unready for prime time.

As for $TERM, you can simply default it to `xterm-256color` which is more than enough


Or just "export TERM=xterm-256color". Lose support for Ghostty-specific features, but get the subset that works everywhere.


This. See the ghostty documentation: https://ghostty.org/docs/help/terminfo#ssh


It doesn't sound like they are talking about invalid states, more like they are talking about the kind of thing that in Rust would be represented by `Option<Box<dyn SomeTrait>>` or suchlike. Maybe your point is that in Rust much less ceremony is necessary to avoid hitting a null pointer when doing this. But still, in either language it's easy to end up with hard to follow logic when doing this.


not super familiar with Rust but isn't Option<T> just a union type of null and T? I get the language has special semantics for this compared to a union type but it is conceptually just a union.

For example this is something you can do with typescript.

  function(args: Arguments) { ... }

  type Arguments = { a: number, b: number } | { a: number, b: string, c: number }

The Arguments value { a: 1, b: 1, c: 1 } is not representable.


> not super familiar with Rust but isn't Option<T> just a union type of null and T?

Only when the niche optimization applies, which happens if T can never be null; otherwise it's a tagged union.

That's not what the comment you're replying to is about.
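
For the niche optimization mentioned above, a quick way to see the difference is to compare sizes. This is only a sketch with a made-up `sizes` helper: `Box<u64>` can never be null, so `Option<Box<u64>>` reuses the null value for `None`, while `u64` has no spare bit pattern and `Option<u64>` needs a separate tag.

```rust
use std::mem::size_of;

/// Returns the sizes of [Box<u64>, Option<Box<u64>>, u64, Option<u64>].
fn sizes() -> [usize; 4] {
    [
        size_of::<Box<u64>>(),
        size_of::<Option<Box<u64>>>(), // niche: None is the null pointer
        size_of::<u64>(),
        size_of::<Option<u64>>(),      // tagged: needs room for a discriminant
    ]
}
```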


I'm pretty sure they're talking about reference counting that depends on the arguments, not about optional arguments or invalid argument combinations.


Yes, it is mentioned there, as being deprecated:

> easy_install, now deprecated, was released in 2004 as part of Setuptools.


> It took decades for them to even consider starting the process of removing the GIL,

This is not true. Discussions about removing the GIL have been ongoing for decades. There were some abortive attempts over the years but it is regarded as quite the technical challenge, and has implications for compatibility with code written in C.

> which is a great example of "it's clearly broken but lets not fix it".

That is not the attitude of the Python community.

