Hacker News | audidude's comments

So the state of 2025 then tests a VTE that is from 2023? 4 major releases behind? And through a GTK 3 app, not even a GTK 4 one which will use the GPU?


Likewise I noticed that Konsole was version 23.08. I've just submitted a PR (https://github.com/jquast/ucs-detect/pull/14) to update it to 25.08.


Which one is that about specifically? Maybe the author could fix it.

I compared the results (https://ucs-detect.readthedocs.io/results.html#general-tabul...) with what I use day-to-day (Alacritty), and it seems the results were created with the same version I have installed locally from the Arch/CachyOS repos, namely 0.16.1 (42f49eeb).


They accept PRs on the ucs-detect project for updated test results.


Red Hat announced RISC-V yesterday with RHEL 10. So this seems rather expected.

https://www.redhat.com/en/blog/red-hat-partners-with-sifive-...


Debian Trixie, now in hard freeze, also has official support for RISC-V64 [1].

[1] What's new in Debian 13:

https://www.debian.org/releases/trixie/release-notes/whats-n...


As someone who went down this path many years ago, I think the GTK numbers in the article are a bit misleading. You wouldn't create 1000 buttons to do a flamegraph properly in GTK.

Sysprof uses a single widget for the flamegraph, which means that in less than 150 MB resident I can browse recordings in the GB size range. It really comes down to how much data gets symbolized at load time, as the captures themselves are mmap'able. In nominal cases, Sysprof even calculates the symbols and appends them after the capture phase stops so they can be mmap'd too.

That just leaves the augmented n-ary tree, keyed by the instruction pointer converted to a string key, which naturally deduplicates/compresses.
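To make the deduplication point concrete, here is a rough sketch of that kind of tree. This is not Sysprof's actual code; `Node`, `add_stack`, `build_tree`, and `count_nodes` are made-up names, and the stacks are assumed to be already symbolized into strings. Identical stack prefixes collapse into the same nodes, so the tree grows with the number of distinct call paths rather than the number of samples.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One frame in the merged call tree, augmented with sample counts."""
    name: str
    self_count: int = 0    # samples whose innermost frame is this node
    total_count: int = 0   # samples that pass through this node
    children: dict = field(default_factory=dict)  # symbol string -> Node


def add_stack(root, stack):
    """Merge one symbolized stack (outermost frame first) into the tree."""
    root.total_count += 1
    node = root
    for name in stack:
        child = node.children.get(name)
        if child is None:
            # First time this call path is seen: allocate exactly one node.
            child = node.children[name] = Node(name)
        child.total_count += 1
        node = child
    node.self_count += 1


def build_tree(stacks):
    """Build a merged call tree from an iterable of symbolized stacks."""
    root = Node("[all]")
    for stack in stacks:
        add_stack(root, stack)
    return root


def count_nodes(node):
    """Number of nodes in the tree, including the root."""
    return 1 + sum(count_nodes(c) for c in node.children.values())
```

A flamegraph widget would then draw one rectangle per node, with width proportional to `total_count`, instead of one widget per sample.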

The biggest chunk of memory consumed is GPU shaders.


This is a bit of a mischaracterization of the Python side of things.

They only opted out for 3.11 which did not yet have the perf-integration fixes anyway. 3.12 uses frame-pointers just fine.


Any link to the fix or documentation about it? I could find added perf support but did not see anything about improved performance related to frame pointer use.


https://pagure.io/fesco/issue/2817#comment-826636 will probably get you started into the relevant paths. Python 3.12 was going to include frame-pointers anyway for perf to boot. So they needed to fix this regardless.


I think your viewpoint is valid.

My experience is in performance tuning on the other side you mention: cross-application, cross-library, whole-system, daemons, etc. Basically, "the whole OS as it's shipped to users".

For my case, I need the whole system set up correctly before it even starts to be useful. For your case, you only need the specific library or application compiled correctly. The rest of the system is negligible and probably not even used. Who would optimize SIMD routines next to function calls anyway?


It's a disaster no doubt.

But, at least from the GNOME side of things, we've been complaining about it for roughly 15 years and kept getting push-back in the form of "we'll make something better".

Now that we have frame-pointers enabled in Fedora, Ubuntu, Arch, etc., we're starting to see movement on realistic alternatives. So in many ways, I think the moral hazard was waiting until 2023 to enable them.


I regularly have users run Sysprof and upload the capture to issues. It's immensely powerful to be able to see what is going on on systems that are having issues. I'd argue it's one of the major reasons GNOME performance has gotten so much better in the recent past.

You can't do that when step one is reinstall another distro and reproduce your problem.

Additionally, the performance-sensitive code that could fall into the 1% overhead range (hint, it's not much) is rarely using the system libraries in a way that would cause this anyway. They can compile that app with frame-pointers disabled. And for stuff where they do use system libraries (qsort, bsearch, strlen, etc.) the frame-pointer overhead is negligible compared to the work being performed. Your margin of error is way larger than the theoretical overhead.


1% is a ton. 1% is crazy. Visa owns the world off just a 3% tax on everything else. Brokers make billions off of just 1% or even far less.

1% of all activity is only rational if you get more than 1% of all activity back out from those times and places where it was used.

1%, when it's of everything, is an absolutely stupendous, colossal number that is absolutely crazy to try to treat as trivial.


Better analogy: you're paying 30% to Apple, and over 50% in bad payday loans, and you're worried about the 3% Visa/Stripe overhead ... that's kinda crazy. But that's where we are in computer performance: there are 10x, 100x, and even greater inefficiencies everywhere, so 1% for better backtraces is nothing.


Absolutely. We've gotten numerous double digit performance improvements across applications, libraries, and system daemons because of frame-pointers in Fedora (and that's just from me).


> Shadow stacks are cool but aren't they limited to a fixed number of entries?

On currently available hardware, yes. But I think some of the future Intel stuff was going to allow for much larger depth.

> Is the memory overhead of lookup tables for very large programs acceptable?

I don't think SFrame is as "dense" as DWARF as a format so you trade a bit of memory size for a much faster unwind experience. But you are definitely right that this adds memory pressure that could otherwise be ignored.

Especially if the anomalies are what they sound like, just account for them statistically. You get a PID for cost accounting in the perf_event frame anyway.
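As a rough illustration of accounting for them statistically: the sketch below assumes each sample is a hypothetical `(pid, stack)` pair where `stack` may be empty when the unwind failed. `cost_by_pid` is a made-up helper, not anything from perf or Sysprof. The per-PID share of samples stays correct even when some unwinds are missing.

```python
from collections import Counter


def cost_by_pid(samples):
    """Summarize samples per PID.

    samples: iterable of (pid, stack) pairs; stack is a list of frames,
    empty when the unwinder could not produce a backtrace.
    Returns {pid: {"share": fraction of all samples,
                   "unwound": fraction of that PID's samples with a stack}}.
    """
    totals = Counter()
    unwound = Counter()
    for pid, stack in samples:
        totals[pid] += 1      # the PID is always known from the sample itself
        if stack:
            unwound[pid] += 1
    n = sum(totals.values())
    return {
        pid: {"share": count / n, "unwound": unwound[pid] / count}
        for pid, count in totals.items()
    }
```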


It does cause more memory pressure because the kernel will have to look at the user-space memory for decoding registers.

So yes, it will be faster than the other alternatives to frame-pointers, but it still won't be as fast as frame-pointers.
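For a sense of why frame-pointers are hard to beat, here is a toy model of the walk. It assumes the usual x86-64 layout, where the saved caller frame pointer sits at `[fp]` and the return address at `[fp + 8]`. The `memory` dict and the function name are purely illustrative, and this is not the kernel's code. Each frame costs two reads and no table lookup.

```python
def unwind_frame_pointers(memory, fp, max_depth=64):
    """Walk a frame-pointer chain and return return addresses, innermost first.

    memory: dict mapping address -> 8-byte value (stand-in for the stack).
    fp: current frame pointer value (rbp on x86-64).
    """
    addrs = []
    while fp != 0 and len(addrs) < max_depth:
        saved_fp = memory.get(fp)       # caller's frame pointer
        ret_addr = memory.get(fp + 8)   # return address into the caller
        if saved_fp is None or ret_addr is None:
            break                       # unreadable stack: stop with what we have
        addrs.append(ret_addr)
        # The stack grows down, so a caller's frame must live at a higher
        # address. Anything else is a corrupt chain or a loop.
        if saved_fp != 0 and saved_fp <= fp:
            break
        fp = saved_fp
    return addrs
```

A table-based unwinder has to find and decode an entry for every program counter before it can take the same step.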


I added support to Sysprof this weekend for unwinding using libdwfl and DWARF/CFI/eh_frame/etc techniques that Serhei did in eu-stacktrace.

The overhead is about 10% of samples. But at least you can unwind on systems without frame-pointers. Personally I'll take the statistical anomalies of frame-pointers, which still let you know which PID/TID is your cost center even if you don't get perfect unwinds. Everyone seems motivated towards SFrame going forward, which is good.

https://blogs.gnome.org/chergert/2024/11/03/profiling-w-o-fr...

