I think that’s because it’s a Chinese project. Same thing with Ant Design components, which are really awesome but not as well known as they should be.
I think its being Chinese is part of the reason, as some of the examples in the early days were Chinese-only, which could deter some people. It is certainly more complex (for good reason, in my opinion), and I can see why it is not better known, since the vast majority of people just want to create simple charts. With ECharts, though, it really can meet enterprise needs.
Yeah, that might be why. A couple of years ago I was trying to find "this cool charting library" I had come across, and I could not get it to surface in Google.
It seems good, but their docs websites are absolute trash (though they've seemingly gotten somewhat better recently - they were previously completely unusable).
You’re right that it’s analogous in concept, but strategy distillation happens at a higher level: it encodes and transfers successful latent reasoning patterns as reusable “strategies,” without necessarily requiring direct gradient updates to the original model weights.
I can see where you’re coming from, but not really. Unlike an RNN, the main transformer still processes sequences non-recurrently. The “sidecar” model just encodes internal activations into compressed latent states, allowing introspection and rollback without changing the underlying transformer architecture.
I can assure you it’s not a joke. Compute power is increasing at a ridiculous pace, and highly capable models are getting smaller and smaller, now at 30B parameters and under. So even if it wouldn’t be practical now, it could become highly relevant in 4 or 5 years if trend lines continue at anything like the recent pace.
Interesting, it certainly wouldn’t take up much additional space, but I wonder if it would have any real impact, since it seems somewhat orthogonal to finding a faithful low-dimensional encoding of the activations.
I recently started thinking about what a shame it is that LLMs have no way of directly accessing their own internal states, and how potentially useful that would be if they could. One thing led to the next, and I ended up developing those ideas a lot further.
Transformers today discard internal states after each token, losing valuable information. There's no rollback, introspection, or replaying of their reasoning. Saving every activation isn't practical; it would require far too much space (hundreds of megabytes per sequence, at least).
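To make the storage claim concrete, here's a quick back-of-envelope calculation. The model shape (32 layers, d_model of 4096, fp16) is an assumed 7B-class configuration, not anything specific from the post:

```python
# Back-of-envelope: space needed to save a transformer's internal states.
# Assumes a hypothetical 7B-class model: 32 layers, d_model = 4096, fp16.
layers, d_model, bytes_per_val = 32, 4096, 2

hidden_per_token = layers * d_model * bytes_per_val   # residual stream only
kv_per_token = 2 * layers * d_model * bytes_per_val   # K and V caches

seq_len = 2048
print(f"hidden states: {hidden_per_token * seq_len / 2**20:.0f} MiB / {seq_len} tokens")
print(f"KV cache:      {kv_per_token * seq_len / 2**20:.0f} MiB / {seq_len} tokens")
# ~512 MiB and ~1024 MiB respectively, so "hundreds of megabytes" holds up.
```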
The insight here is that transformer activations aren't randomly scattered in high-dimensional space. Instead, they form structured, lower-dimensional manifolds shaped by architecture, language structure, and learned tasks. It's all sitting on a paper-thin membrane in N-space!
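A toy way to see what "paper-thin membrane" means geometrically: points that lie near a low-dimensional subspace need only a handful of principal components to explain nearly all of their variance. This uses synthetic data as a stand-in for real activations, purely to make the geometry concrete:

```python
import numpy as np

rng = np.random.default_rng(0)
n, ambient_dim, intrinsic_dim = 2000, 1024, 32

# "Activations": random points on a 32-dim subspace, plus a little noise.
basis = rng.standard_normal((intrinsic_dim, ambient_dim))
acts = rng.standard_normal((n, intrinsic_dim)) @ basis
acts += 0.01 * rng.standard_normal((n, ambient_dim))

# PCA via SVD: how many directions capture 99% of the variance?
acts -= acts.mean(axis=0)
s = np.linalg.svd(acts, compute_uv=False)
var_ratio = s**2 / (s**2).sum()
k = int(np.searchsorted(np.cumsum(var_ratio), 0.99)) + 1
print(f"{k} of {ambient_dim} directions explain 99% of the variance")  # ~32
```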
This suggested a neat analogy: just like video games save compact states (player location, inventory, progress flags) instead of full frames, transformers could efficiently save "thought states," reconstructable at any time. Reload your saved game, but for LLMs!
Here's the approach: attach a small sidecar model alongside a transformer to compress its internal states into compact latent codes. These codes can later be decoded to reconstruct the hidden states and attention caches. The trick is to compress stuff a LOT, but not be TOO lossy.
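Here's a minimal sketch of what such a sidecar codec could look like, assuming a simple autoencoder over per-token hidden states. All sizes are illustrative assumptions; a real version would also need to handle the attention (KV) caches and be trained hard enough to stay "not TOO lossy":

```python
import torch
import torch.nn as nn

class SidecarCodec(nn.Module):
    """Compress a layer's hidden states into compact latent codes and back."""
    def __init__(self, d_model: int = 4096, d_latent: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(d_model, 1024), nn.GELU(), nn.Linear(1024, d_latent))
        self.decoder = nn.Sequential(
            nn.Linear(d_latent, 1024), nn.GELU(), nn.Linear(1024, d_model))

    def forward(self, hidden):                 # hidden: (batch, seq, d_model)
        z = self.encoder(hidden)               # compact "thought state"
        return self.decoder(z), z

codec = SidecarCodec()
hidden = torch.randn(1, 128, 4096)            # stand-in for real activations
recon, z = codec(hidden)
loss = nn.functional.mse_loss(recon, hidden)  # train to minimize this
print(z.shape, f"compression: {4096 // 256}x per vector")
```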
What new capabilities would this enable? Transformers could rewind their thoughts, debug errors at the latent level, or explore alternative decision paths. RL agents could optimize entire thought trajectories instead of just final outputs. A joystick for the brain, if you will.
This leads naturally to the concept of a rewindable reasoning graph, where each compressed state is a node. Models could precisely backtrack, branch into alternate reasoning paths, and debug the causes of errors internally. Like a thoughtful person can (hopefully!).
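Structurally, the rewindable reasoning graph could be as simple as a tree of compressed snapshots with a cursor. This is a hypothetical sketch, with rewind and branch as the two core operations:

```python
from dataclasses import dataclass, field

@dataclass
class ThoughtNode:
    latent: bytes                      # compressed state snapshot
    parent: "ThoughtNode | None" = None
    children: list = field(default_factory=list)

class ReasoningGraph:
    def __init__(self, root_latent: bytes):
        self.root = ThoughtNode(root_latent)
        self.cursor = self.root        # current position in the reasoning

    def step(self, latent: bytes) -> ThoughtNode:
        node = ThoughtNode(latent, parent=self.cursor)
        self.cursor.children.append(node)
        self.cursor = node
        return node

    def rewind(self, steps: int = 1) -> ThoughtNode:
        for _ in range(steps):
            if self.cursor.parent is not None:
                self.cursor = self.cursor.parent
        return self.cursor             # branching = calling step() from here

g = ReasoningGraph(b"state0")
g.step(b"state1"); g.step(b"state2")
g.rewind(1)                            # back up one step...
g.step(b"state2-alt")                  # ...and explore an alternate path
```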
Longer-term, it suggests something bigger: a metacognitive operating system for transformers, enabling AI to practice difficult reasoning tasks repeatedly, refine cognitive strategies, and transfer learned skills across domains. Learning from learning, if you will.
Ultimately, the core shift is moving transformers from stateless text generators into cognitive systems capable of reflective self-improvement. It's a fundamentally new way for AI to become better at thinking.
For fun, I wrote it up and formatted it as a fancy academic-looking paper, which you can read here:
This is great. I was debating whether I should build my latest project on the new OpenAI Responses API (optimized for agent workflows) or on MCP, but now it seems even more obvious that MCP is the way to go.
I was able to make a pretty complex MCP server in 2 days for LLM task delegation:
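For anyone curious what that takes, here's roughly the boilerplate floor for an MCP server using the official Python SDK's FastMCP helper. The tool below is a made-up stand-in, not the actual task-delegation server described above:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("task-delegation")  # hypothetical server name

@mcp.tool()
def delegate_task(description: str, model: str = "gpt-4o-mini") -> str:
    """Hand a subtask off to another model and return its result."""
    # A real implementation would call the delegate model here.
    return f"[{model}] would handle: {description}"

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```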
I wrote this article detailing my recent experience helping my dad navigate some complex health issues with AI models. In the process, I came up with a good set of prompts and procedures for getting really high-quality diagnostic feedback and analysis from the best frontier models. Hopefully others will find it useful.