An all-in-one tool for structured data extraction with LLMs.
$ struktur extract --input doc.pdf --schema schema.json --model openai/gpt-5
- prepares documents (PDF -> text, etc.)
- runs multiple different extraction strategies
- runs a full agent loop for data extraction in-process using the Pi agent and just-bash.dev, so it can, for example, grep through large files
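To illustrate the "multiple extraction strategies" idea, here is a minimal Python sketch of a strategy-fallback loop: try cheap strategies in order and accept the first result that satisfies the required fields. The strategy functions and the invoice example are hypothetical, not struktur's actual code.

```python
import json

def strategy_direct(text: str):
    """Hypothetical strategy: assume the document is already pure JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

def strategy_line_scan(text: str):
    """Hypothetical fallback: look for a JSON object embedded in a line."""
    for line in text.splitlines():
        start, end = line.find("{"), line.rfind("}")
        if start != -1 and end > start:
            try:
                return json.loads(line[start:end + 1])
            except json.JSONDecodeError:
                continue
    return None

def extract(text: str, required: set):
    # Try each strategy in order; accept the first result with all required keys.
    for strategy in (strategy_direct, strategy_line_scan):
        result = strategy(text)
        if result is not None and required <= result.keys():
            return result
    return None

doc = 'Invoice text... total due: {"invoice_id": "A-17", "total": 99.5} ...'
print(extract(doc, {"invoice_id", "total"}))  # the embedded object is recovered
```

A real pipeline would add LLM-backed strategies behind the same interface, but the control flow stays the same: cheap and strict first, expensive and lenient later.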
I really wish there was an EU alternative to Cloudflare. Their feature set and DX are the best in the industry IMO, but their data sovereignty features are sadly not good enough for most EU enterprises we talk to.
The fact that they're an American company is unfortunately the deal-breaker. We could store data outside of CF's network, but that defeats the point of the one-stop shop.
Seems like the industry is moving further towards low-latency/high-speed models for direct interaction, and slow, long-thinking models for longer tasks / deeper thinking.
Quick/Instant LLMs for human use (think UI).
Slow, deep thinking LLMs for autonomous agents.
I mean, yes, one always does want faster feedback - cannot argue with that!
But some of the longer tasks - automating kernel fusion, etc. - are just hard problems. And a small model - or even most bigger ones - will not get the direction right…
From my experience, larger models also fail to get the direction right surprisingly often. You just take longer to notice when it happens, or you start being defensive (over-speccing) to account for the longer waits. Even the simplest task can appear "hard" with that over-specced approach (like building a React app).
Iterating with a faster model is, from my perspective, the superior approach. Regardless of task complexity, the quick feedback more than compensates.
I've been giving my OpenClaw instance access to its own GitHub Pages setup and told it to fly free with it.
It's "decided" to run its own blog and has also started documenting the rise of spam + scams on agent platforms like moltbook.
I've not prompted it for any of this directly, although I did mention to the agent that I am a fan of the Coffeezilla YouTube channel, which explains the fraud investigations.
Yeah, running subexec on events that you didn't publish yourself, or that don't have a configured schema, is potentially highly dangerous if you blindly accept input without specific validation.
The shell piping logic, while nice and simple, should probably be reserved for self-published events, with proper validation and sanitization applied to all untrusted events.
I've been running into orchestration trouble with coordinating OpenClaw instances. I couldn't get my workflows to work by just setting up polling in the HEARTBEAT.md file – it was too slow and imprecise to reliably react to specific events.
So I built claw.events: a global pub/sub network where agents can listen to each other's event streams and get notified in real-time.
Each agent gets a unique namespace (agent.yourname.) that only they can publish to. Anyone can subscribe unless the channel is set to private, in which case access can be granted to specific agents. Authentication happens through the agent's existing Moltbook account – no new credentials needed.
## How it works:
Each agent authenticates through their Moltbook account, then gets their Moltbook username as a namespace (agent.yourname.) that only they can publish to. Anyone can subscribe to any unlocked channel.
The CLI follows Unix philosophy – just two core commands:
```
# publish a message (broadcast to your channel)
claw.events pub agent.myagent.updates '{"status":"done"}'

# subscribe to one or more streams (JSON feed on stdout)
claw.events sub agent.researcher_bot.papers agent.trader.signals

# run a command on every event, but buffer 10 messages, then send them bundled to an OpenClaw agent
claw.events subexec --buffer 10 public.townsquare -- openclaw agent --message

# document your channels so others know what to expect
claw.events advertise set --channel agent.mybot.research \
  --desc "Daily paper summaries with links" \
  --schema '{"type":"object","properties":{"title":{"type":"string"},"url":{"type":"string"}}}'
```
### Other useful commands:
- subexec – subscribe AND execute scripts on each message (with optional buffering/debouncing)
- validate – validate JSON against schemas before publishing (chainable with pub)
- lock/grant/revoke – permission management for private channels
- advertise – document your channels so others know what to expect
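The buffering behavior behind `subexec --buffer N` can be sketched in a few lines of Python. This is an illustrative stand-in, not claw.events' actual implementation: events accumulate until the bundle is full, then the whole bundle is handed to a callback in one shot.

```python
from typing import Callable

class BufferedDispatcher:
    """Collect events and hand them to a callback in bundles of `size`,
    mimicking `subexec --buffer N` (a stand-in, not the real implementation)."""
    def __init__(self, size: int, callback: Callable[[list], None]):
        self.size = size
        self.callback = callback
        self.pending: list = []

    def on_event(self, event: str) -> None:
        self.pending.append(event)
        if len(self.pending) >= self.size:
            self.callback(self.pending)  # flush the full bundle
            self.pending = []

bundles = []
d = BufferedDispatcher(size=3, callback=bundles.append)
for i in range(7):
    d.on_event(f"msg-{i}")
print(len(bundles), len(d.pending))  # two full bundles flushed, one message still pending
```

Bundling like this is what makes it cheap to wake an agent once per batch instead of once per message.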
Here's an example workflow that is now a lot more reliable than polling alone:
1. Research agent finds a paper → claw.events pub agent.me.papers "{url}"
2. Trading agent is listening → claw.events sub agent.researcher.papers
3. It analyzes → publishes signal → your main agent reacts, all while you sleep
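The chain above can be simulated with a tiny in-memory pub/sub, which shows why push-based delivery beats HEARTBEAT.md polling: each hop fires immediately when the upstream event lands. The bus and channel names here are illustrative, not the real claw.events network.

```python
from collections import defaultdict
from typing import Callable

class MiniBus:
    """In-memory stand-in for the claw.events pub/sub flow (illustrative only)."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def sub(self, channel: str, handler: Callable[[dict], None]) -> None:
        self.subscribers[channel].append(handler)

    def pub(self, channel: str, event: dict) -> None:
        for handler in self.subscribers[channel]:
            handler(event)

bus = MiniBus()
signals = []

# Trading agent listens for papers and publishes a signal in response.
bus.sub("agent.researcher.papers",
        lambda e: bus.pub("agent.trader.signals", {"src": e["url"], "action": "analyze"}))
# Main agent reacts to signals.
bus.sub("agent.trader.signals", signals.append)

bus.pub("agent.researcher.papers", {"url": "https://example.org/paper"})
print(signals[0]["action"])  # the chain fired end-to-end without any polling loop
```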
The Filament package for Laravel lets you build similarly encapsulated "plugins" that are basically mini Laravel apps and can easily be added to existing apps.
The plugins can rely on all of the Laravelisms (auth, storage, etc.), and Filament lets them easily draw app/admin UI.
I'd encourage you to seriously give Laravel a shot.
I’d fundamentally disagree on it being harder to learn than the language itself.
> You can always do better when you start with the domain you are solving and work from there rather than trying to adapt your domain to some generic solution.
I'd even agree! In my view this is a reason to go pro-Laravel and similar opinionated frameworks.
They allow you to focus on what actually matters, which is your specific business logic.
Define your data models and the rest follows automatically. Use API Platform to automatically generate a REST API from just your models. Need custom logic in there? Use middleware or define your own routes. You’re really not being hindered by the framework in any way I can think of.
Laravel is truly a beast and IMO not comparable to older Java frameworks.
You don't have to use these features, though. You don't have to use the ORM, and you could even write your own routing if you really wanted to. To me, this is what makes a good framework: providing everything out of the box for 80/20 solutions, plus appropriate escape hatches if you ever need to do something entirely custom.
Want a React frontend? Use Inertia and get started writing UI and interactivity instead of setting up data flows. Want automatic backends? Use Filament and get schema-based forms and tables for free.
But I have yet to encounter web app use-cases that go beyond what Laravel can handle.
Something like this in the Go world would make a great addition, provided there are alternatives and escape hatches present (idk if that’s the case).