Hacker News | s-macke's comments

I've been trying my hand at several reverse engineering projects for old games, such as [0]. The goal is to find a pattern by which an AI can reverse engineer at least 99% of the disassembly.

[0] https://github.com/s-macke/weltendaemmerung


WebAssembly itself is not that inefficient in storage. It is mostly the usual bloat that comes with binaries. For example, Go binaries have to provide a full runtime, including garbage collection.

If size is your top priority, you can produce very small binaries, for example with C. Project [0] emulates an x86 architecture, including hardware, BIOS, and DOS compatibility, and ends up with a WebAssembly size of 78 kB uncompressed and a 24 kB transfer size.

[0] https://github.com/s-macke/FSHistory


>you can produce very small binaries, for example with C

Not many people are going to want to be rolling their own libc like that author. Most people just compile their app and ship megabytes of webassembly at the expense of their users. To me webassembly is just a shortcut to ship faster because you don't have to port existing code.


> Not many people are going to want to be rolling their own libc like that author.

Emscripten provides a libc implementation based on musl, and so does wasi-libc (https://github.com/WebAssembly/wasi-libc).

If you explicitly list which functions you want to export from your WebAssembly module, the linker will remove all the unused code, in the same way that "tree-shaking" works for JS bundlers.

In my experience, a WebAssembly module (even with all symbols exported) is smaller than the equivalent native library. The bytecode is denser.

WebAssembly modules tend to be larger than JavaScript because AOT-compiled languages don't care as much about code size; they assume you only download the program/library once. In particular, LLVM (which I believe is the only mainstream WebAssembly-emitting backend) loves inlining everything.

Judicious use of `-Oz`, stripping debug info, and other standard code size techniques really help here. The app developer does have to care about code size, of course.
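As a minimal sketch of both points (the file and function names are made up, and this assumes a recent clang with the built-in wasm32 target):

```shell
# Freestanding module: only `add` is exported, so the linker can
# garbage-collect everything the exported set doesn't reach.
cat > add.c <<'EOF'
__attribute__((export_name("add")))
int add(int a, int b) { return a + b; }
EOF

clang --target=wasm32 -nostdlib -Oz \
      -Wl,--no-entry -Wl,--export=add -Wl,--strip-all \
      -o add.wasm add.c
```

The resulting module is a few hundred bytes; the same dead-code elimination applies when you link against a real libc and export only your public entry points.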


Opus 4.5 has become really capable.

Not in terms of knowledge. That was already phenomenal. But in its ability to act independently: to make decisions, collaborate with me to solve problems, ask follow-up questions, write plans and actually execute them.

You have to experience it yourself on your own real problems and over the course of days or weeks.

Every coding problem I was able to define clearly enough within the limits of the context window, the chatbot could solve, and these weren't easy. It wasn't just about writing and testing code. It also involved reverse engineering and cracking encoding-related problems. The most impressive part was how actively it worked on problems in a tight feedback loop.

In the traditional sense, I haven’t really coded privately at all in recent weeks. Instead, I’ve been guiding and directing, having it write specifications, and then refining and improving them.

Curious how this will perform in complex, large production environments.


Just some examples I've already made public. More complex ones are in the pipeline. With [0], I'm trying to benchmark different coding agents. With [1], I successfully reverse-engineered an old C64 game using Opus 4.5 only.

Yes, feel free to blame me for the fact that these aren’t very business-realistic.

[0] https://github.com/s-macke/coding-agent-benchmark

[1] https://github.com/s-macke/weltendaemmerung


> You have to experience it yourself on your own real problems and over the course of days or weeks.

How do you stop it from over-engineering everything?


This has always been my problem, whether it's Gemini, OpenAI, or Claude. Unless you hand-hold it to an extreme degree, it is going to build a mountain next to a molehill.

It may end up working, but the thing is going to convolute APIs and abstractions and mix patterns basically everywhere.


Not in my experience - you need to build the fact that you don’t want it to do that into your design and specification.


Sure, I can tell it not to do that, but it doesn't know what that is. It's a je ne sais quoi.

I can't teach it taste.


In an existing codebase, recent Claude will mostly just look at your code and copy what you've been doing, without being asked. In a new codebase, you can just ask it to "be concise, keep it simple" or something.


The trick isn’t to tell it what not to do, it’s to tell it what to do. And give it examples and requirements.

It's very good at following instructions. You can build dedicated agents for different tasks (backend, API design, database design) and make it follow design and coding patterns.

It's verbose by default, but after a few hours of custom instructions you can make it code just like anyone.



Sure why not

Difficult, and it really depends on the complexity. I definitely work in a spec-driven way, with a step-by-step implementation phase. If it goes the wrong way, I prefer to rewrite the spec and throw away the code.


I have it propose several approaches, pick and choose from each, and remove what I don't want done. "Use the general structure of A, but use the validation structure of D. Using a view translation layer is too much, just rely on FastAPI/SQLModel's implicit view conversion."


The Plan mode already does this, it makes multiple plans and then synthesises them


“Everything Should Be Made as Simple as Possible, But Not Simpler” should be the ending of every prompt :)


I personally try to narrow scope as much as possible to prevent this. If a human hands me a PR that is not digestible size-wise and content-wise (to me), I am not reviewing and merging it. Same thing with what claude generates with my guidance.


Instructions, in the system prompt for not doing that

Once more people realize how easy it is to customize and personalize your agent, I hope they will move beyond the cookie-cutter defaults that Big AI like Anthropic and Google give you.

I suspect most won't, though, because (1) it means you have to write human language, communication, and this weird form of persuasion, and (2) AI is going to make a bunch of them lazy, and Big AI sold them on magic solutions that require no effort on your part (not true: there is a lot of customizing, and it pays huge dividends).


I find my sweet spot is using the Claude web app as a rubber duck as well as feeding it snippets of code and letting it help me refine the specific thing I'm doing.

When I use Claude Code, I find that it *can* add a tremendous amount of ability thanks to seeing my entire codebase at once. The issue is that if I'm doing something where seeing my entire codebase would help, it blasts through my quota too fast. And if I'm tightly scoping it, it's just as easy and faster for me to use the website.

Because of this I've shifted back to the website. I find that I get more done faster that way.


I've had similar experiences, but I've been able to start using Claude Code for larger projects by doing some refactoring with the goal of making the codebase understandable just by looking at the interfaces. This, along with instructions to prefer looking at a module's interface unless working directly on its implementation, seems to allow further progress within session limits.

By "the website" do you mean you're copy-pasting, or are you using the code system where Anthropic clones your code from GitHub and interacts with it in a VM/container for you?


Just pasting code snippets, and occasionally an entire file or two into the main claude.com site. I usually already know what I want and need, but just want to speed up the process on how to get there, and perhaps I missed something in the process.


Aider is a pretty good way to automate that. You can use it with Claude models. It lets you be completely precise, down to a single file, and sit in a chat/code/review loop, while it does a lot of the chores, like generating commit messages, saving you the copy-paste effort.


> In the traditional sense, I haven’t really coded privately at all in recent weeks. Instead, I’ve been guiding and directing, having it write specifications, and then refining and improving them.

This is basically all my side projects.


This has also been my experience.


The link didn’t get enough votes a few days ago.


I know - I posted it :)


While the methods are similar in that they both ray-march through the scene to compute per-pixel fluence, the algorithm presented in the blog post scales linearly with the number of light sources, whereas Radiance Cascades can handle an arbitrary distribution of light sources in constant time by exploiting geometric properties of lighting. Radiance Cascades are also not limited to SDFs for smooth shadows.


Yeah, and I believe Radiance Cascades accurately calculate the size of the penumbra from the size and distance of the area light, which also means that point light sources, as in reality, always produce hard shadows.

The technique here seems to rely more on eyeballing a plausible penumbra without explicitly considering the size of the light source, though I don't quite understand the core intuition.


I stopped reading the article after this sentence. Now I am scanning the comments.


You can write a network device driver that exports the network packets to JavaScript. The author already wrote a console device, so it's not a big deal.

https://github.com/joelseverin/linux-wasm/blob/master/patche...


Doable for HTTP and HTTPS, but if you're running it in a browser environment, you'll eventually run into issues with CORS and other protocols. To get around this, you need a proxy server running elsewhere that exposes the lower layers of the network stack.
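A minimal sketch of such a proxy, assuming the websockify tool and a telnet server; the ports and target host here are made up:

```shell
# Bridge WebSocket connections from the browser (port 8080)
# to a raw TCP service running elsewhere (telnet on port 2323).
# The in-browser guest then speaks TCP through the WebSocket tunnel.
websockify 8080 localhost:2323
```

Projects like this typically pair such a bridge with an in-guest network driver that frames Ethernet packets over the WebSocket.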


This is exactly what [0] does. Try it out. If you know the IP you can even log in to another open browser window via telnet.

[0] https://github.com/s-macke/jor1k


Aha! Now I see I'm talking to the expert on the topic ;) Thanks for the link. I'll check this out.


Not quite right. Try the following.

  echo *
  cd /proc
  echo *
  while read line; do echo $line; done < /proc/cpuinfo
The last line should work and print the entire file, but it seems there's a bug.


Well, it should not surprise you that the virtual file systems of the kernel remain.


That’s fast. Buggy, but fast. I’m totally impressed! Especially because I researched the necessary steps to do the same thing 10 years ago, based on [0]. The patches required for this hack touch LLVM, libc, the Linux kernel, BusyBox, ... and total approximately 15,000 lines of code.

I ran a small performance test with 'bc -lq' and compared with [0]:

  scale=1000
  4*a(1)
This WASM architecture compilation completely blows away my old emulation setup, which only managed around 200 MIPS. Maybe this approach can be generalized. Running a full Linux distribution at near-native speed right in the browser would be awesome.

[0] https://github.com/s-macke/jor1k


Your project was also really nice to play around with. I think it was one of the few that actually had an interesting idea, along with blink and copy.sh.

To be really honest, I generally preferred copy.sh. I have actually used it sometimes as a poor man's qemu. If I may ask, what are your thoughts on copy.sh? I found its performance on BusyBox (or Tiny Core Linux with a GUI) brilliant; the only downside was that the internet speed was abysmally slow, like really, really slow for me.


copy.sh has the advantage of being x86-compatible and can run many different Linux distributions. However, this CPU choice also makes it quite complex and relatively slow (not sure if this is still true).

My own OpenRISC CPU emulation fits into just 1,500 lines of code, and I optimized every single line. To make it work, I had to compile my own Linux distribution completely from scratch. I stopped working on it about eight years ago, but I’ve completed a dozen other successful projects since then.

I’m still very proud that nearly every browser-based Linux emulator, including JSLinux and copy.sh, uses my 9p-virtio filesystem approach. It makes running complex Linux distributions in the browser much simpler.

Overall, my thoughts about copy.sh’s work are entirely positive.


Nice benchmark. Compared to Fabrice Bellard's JSLinux (https://bellard.org/jslinux/), it's roughly 20x faster (ARM on ARM) and 64x faster (x86 on ARM).


What results did your benchmark get?


By a factor of about 170. But this is more of a micro benchmark that gives you a rough idea. It's not a definitive figure.


I will give a lecture about Haskell next week and might use this website for demonstration.

