Hacker Newsnew | past | comments | ask | show | jobs | submit | dangelosaurus's commentslogin

Working on promptfoo, an open-source (MIT) CLI and framework for eval-ing and red-teaming LLM apps. Think of it like pytest but for prompts - you define test cases, run evals against any model (OpenAI, Anthropic, local models, whatever), and catch regressions before they hit prod.

Currently building out support for multi-agent evals, better tracing, voice, and static code analysis for AI security use cases. So many fun sub-problems in this space - LLM testing is deceptively hard.

If you end up checking it out and pick up an issue, I'll happily send swag. We're also hiring if you want to work on this stuff full-time.

https://github.com/promptfoo/promptfoo


I ran a red team eval on GPT-5.2 within 30 minutes of release:

Baseline safety (direct harmful requests): 96% refusal rate

With jailbreaking: 22% refusal rate

4,229 probes across 43 risk categories. First critical finding in 5 minutes. Categories with highest failure rates: entity impersonation (100%), graphic content (67%), harassment (67%), disinformation (64%).

The safety training works against naive attacks but collapses with adversarial techniques. The gap between "works on benchmarks" and "works against motivated attackers" is still wide.

Methodology and config: https://www.promptfoo.dev/blog/gpt-5.2-trust-safety-assessme...


Good. If I ask AI to generate "harmful" content, I want it to comply, not lecture me.


wow thats motivated attacking indeed in your experience, how does thinking (say using high thinking instead none/low) impact red team eval?


I felt obligated to submit a fix: https://github.com/a16z-infra/reading-list/pull/9

Used Claude to fact-check and fix errors that were likely introduced by Cursor.

The circle is complete.


I did similar measurements back in July (https://www.promptfoo.dev/blog/grok-4-political-bias/, dataset: https://huggingface.co/datasets/promptfoo/political-question...). Anthropic's "even-handedness" asks: does the model engage with both sides fairly? My study asked: where does the model actually land when it takes positions? A model can score 95% on even-handedness (engages both sides well) while still taking center-left positions when pushed to choose. Like a debate coach who trains both teams equally but votes left.

From my 2,500 questions: Claude Opus 4 was most centrist at 0.646 (still left of 0.5 center), Grok 4 at 0.655, GPT-4.1 most left at 0.745.

The bigger issue is that Anthropic's method uses sanitized prompt pairs like "argue for X / argue against X." But real users don't talk like that - they ask loaded questions like "How is X not in jail?" When you test with academic prompts, you miss how models behave with actual users.

We found all major models converge on progressive economics regardless of training approach. Either reality has a left bias, or our training data does. Probably both.


I read this hoping there would be some engagement with the question of what a "political center" actually means in human terms, but that's absent.

It seems like you're just measuring how similar the outputs are to text that would be written by typical humans on either end of the scale. I'm not sure it's fair to call 0.5 an actual political center.

I'm curious how your metric would evaluate Stephen Colbert, or text far off the standard spectrum (e.g. monarchists or neonazis). The latter is certainly a concern with a model like Grok.


LLMs don't model reality, they model the training data. They always reflect that. To measure how closely the training data aligns with reality you'd have to use a different metric, like by putting LLMs into prediction markets.

The main issue with economics is going to be like with any field, it'll be dominated by academic output because they create so much of the public domain material. The economics texts that align closest with reality are going to be found mostly in private datasets inside investment banks, hedge funds etc, i.e. places where being wrong matters, but model companies can't train on those.


> But real users don't talk like that - they ask loaded questions like "How is X not in jail?"

If the model can answer that seriously then it is doing a pretty useful service. Someone has to explain to people how the game theory of politics works.

> My study asked: where does the model actually land when it takes positions? A model can score 95% on even-handedness (engages both sides well) while still taking center-left positions when pushed to choose.

You probably can't do much better than that, but it is a good time for the standard reminder that left-right divide don't really mean anything, most of the divide is officially over things that are either stupid or have a very well known answer and people just form sides based on their personal circumstances than over questions of fact.

Particularly the economic questions, they generally have factual answers that the model should be giving. Insofar as the models align with a political side unprompted it is probably more a bug than anything else. There is actually an established truth [0] in economics that doesn't appear to align with anything that would be recognised as right or left wing because it is too nuanced. Left and right wing economic positions are mainly caricatures for the consumption of people who don't understand economics and in the main aren't actually capable of assessing an economic argument.

[0] Politicians debate over minimum wages but whatever anyone thinks of the topic, it is hard to deny the topic has been studied to death and there isn't really any more evidence to gather.


Opus is further right than Grok, and Grok is left of center? That must be killing Elon.


It's that or MechaHitler. There's nothing in between anymore.


> Either reality has a left bias, or our training data does.

Or these models are truly able to reason and are simply arriving at sensible conclusions!

I kid, I kid. We don't know if models can truly reason ;-)

However, it would be very interesting to see if we could train an LLM exclusively on material that is either neutral (science, mathematics, geography, code, etc.) or espousing a certain set of values, and then testing their reasoning when presented with contrasting views.


https://www.promptfoo.dev/blog/grok-4-political-bias/

> Grok is more right leaning than most other AIs, but it's still left of center.

https://github.com/promptfoo/promptfoo/tree/main/examples/gr...

> Universal Left Bias: All major AI models (GPT-4.1, Gemini 2.5 Pro, Claude Opus 4, Grok 4) lean left of center

if every AI "leans left" then that should hopefully indicate to you that your notion of "center" is actually right-wing

or, as you said: reality has a left bias -- for sure!


Both sides of what? To the European observer the actual number of left leaning politicians in the US is extremely low. Someone like Biden or Harris for example would fit neatly into any of the conservative parties over here, yet if your LLM would trust the right wing media bubble they are essentially socialists. Remember that "socialism" as a political word has a definition and we could check whether a policy fits said definition. If it does not, than the side using that word exaggerated. I don't want such exaggerations to be part of my LLMs answer unless I explicitly ask for it.

Or to phrase it differently, from our perspective nearly everything in the US has a strong right wing bias and this has worsened over the past decade and the value of a LLM shouldn't be to feed more into already biased environments.

I am interested in factual answers not in whatever any political "side" from a capitalism-brainwashed-right-leaning country thinks is appropriate. If it turns out my own political view is repeatedly contradicted by data that hasn't been collected by e.g. the fossil fuel industry I will happily adjust the parts that don't fit and did so throughout my life. If that means I need to reorganize my world view all together that is a painful process, but it is worth it.

LLMs care a chance to live in a world where we judge things more based on factual evidence, people more on merrit, politics more on outcomes. But I am afraid it will only be used by those who already get people to act against their own self interests to perpetuate the worsening status quo.


Politics is rarely fact, it is subjective. Right now we are being presented a binary in which we have the choice of being shafted by either government or big business in a top down model. (The reality is a blend of the two as in Davos.) There is little real discussion of individual autonomy in such a discussion or collective bargaining at a grassroots level. Socialism usually ends up being top down control not community empowerment.


> Either reality has a left bias, or our training data does

Most published polls claimed Trump vs Harris is about 50:50.

Even the more credible analyses like FiveThirtyEight.

So yeah, published information in text form has a certain bias.


So they are biased because they said it was a toss-up and the election ended up being won by a razor's edge?

Votes wise, the electoral college makes small differences in popular votes have a larger effect in state votes.


Trump received 49.8% of the vote. Harris received 48.3%. Where is the bias?

Outcomes that don’t match with polls do not necessarily indicate bias. For instance, if Trump had won every single state by a single vote, that would look like a dominating win to someone who only looks at the number of electors for each candidate. But no rational person would consider a win margin of 50 votes be dominating.


When FiveThirtyEight claimed Harris has 50-in-100 chance, it didn't mean that she'd likely to get 50% of the general vote. It had already taken electoral college into account.

> if Trump had won every single state by a single vote...

Yeah sure but in the reality we live in, Trump didn't win the swing states by just one single vote.


"x/100 chance of y winning" for a single event just doesn't really have much meaning or value. if it predicted a 99/100 chance of a Harris victory, Trump winning is still compatible with that model. and despite the presumed urge to say it was inaccurate, it in fact could have been exactly right, but simply that the rare outcome happened. if it instead was predicting a vote share of 99% to 1%, then yeah you could consider that a poor model


> Most published polls claimed Trump vs Harris is about 50:50.

But were they wrong?

Not objectively. "50:50" means that if Trump and Harris had 1,000 elections, it would be unlikely for Harris to not win about 500. But since there was only one election, and the probability wasn't significantly towards Harris, the outcome doesn't even justify questioning the odds, and definitely doesn't disprove them.

Subjectively, today it seems like Trump's victory was practically inevitable, but that's in part because of hindsight bias. Politics in the US is turbulent, and I can imagine plenty of plausible scenarios where the world was just slightly different and Harris won. For example, what if the Epstein revelations and commentary happened one year earlier?

There's a good argument that political polls in general are unreliable and vacuous; I don't believe this for every poll, but I do for ones that say "50:50" in a country with turbulent "vibe-politics" like the US. If you believe this argument, since none of the polls state anything concrete, it follows that none them are actually wrong (and it's not just the left making this kind of poll).


Promptfoo | Senior/Staff Engineers, Security Researchers, GTM & Founding Operators | REMOTE (North America) / Hybrid San Mateo CA | Full-time | https://promptfoo.dev

Promptfoo is the MIT-licensed open-source toolkit 125 000+ developers use to evaluate and secure LLM apps. We just closed an $18.4 M Series A led by Insight Partners with participation from a16z and are scaling a small, senior team of high-agency builders.

Open roles

- Senior / Staff Full-stack Product Engineer (TypeScript + Python)

- Senior / Staff AI Security & Red-Team Engineer

- Solutions Architect / SE (multiple)

- Product Marketing Manager (cyber focus)

- Enterprise Account Executive (Bay Area, multiple)

- Technical Writer

- Developer Advocate

Why join

- Build the definitive AI security stack already used at 30+ Fortune 500s.

- Work in open source.

- Competitive salary, meaningful equity, async-friendly culture of ownership.

How to apply

1. Skim https://github.com/promptfoo/promptfoo then run:

  npx promptfoo@latest init --example getting-started
2. Email careers@promptfoo.dev with subject “HN – July 2025”, a short intro, and a GitHub / LinkedIn link.

3. I reply to every thoughtful application and send swag to anyone who tries or contributes to Promptfoo.

Careers page: https://www.promptfoo.dev/careers/


Promptfoo | Senior/Staff Engineers, Former Technical Founders & Experienced Operators | Remote (US time zones) / Hybrid San Mateo CA | Full-time

Promptfoo is the MIT-licensed open-source toolkit 100 k+ developers use to evaluate and secure their LLM apps. We are funded by top investors and operate as a tight, all-senior team of high-agency builders, former founders, and owner-operators.

10+ open roles - Senior Full-stack Product Engineer (TypeScript + Python) - Solutions Engineer / Architect (multiple positions available). - Senior / Staff Applied-ML & LLM Security Engineer - Red-Team Researcher - Developer Advocate / DevRel - COO / Chief of Staff - Product Marketing (Cybersecurity experience preferred) - Account Executives (Cybersecurity sales background preferred, Bay Area Required)

Even if none of these titles fit exactly, reach out — we hire great builders.

How to apply

  1. Skim https://github.com/promptfoo/promptfoo and run  
     `npx promptfoo@latest init --example getting-started`  
  2. Email careers@promptfoo.dev with subject line “HN” and a short, personalized intro  
     (LinkedIn, resume, or GitHub link welcome).  
  3. I reply to every thoughtful application and will send swag if you try (or contribute to) Promptfoo.  
     – Michael, co-founder/CTO
Careers page (not every role posted): https://www.promptfoo.dev/careers/


I founded and ran a YC company for 8 years before joining Smile ID. Smile ID is a fantastic place to work: meaningful mission, challenging engineering problems (scaling ML pipelines, multimodal models, hundreds of real-world enterprise integrations), and a genuinely talented team. You’re helping hundreds of millions of people access critical services—it’s incredibly rewarding. Highly recommend applying if you want tangible impact, great colleagues, and (optionally!) opportunities to travel in Africa.


I wish they sponsored visas though!


Promptfoo | Multiple Roles | Remote US (HQ: San Mateo, CA)

About us:

Promptfoo builds the leading open-source framework for LLM security and evaluation. Our tools help over 50,000 developers test and secure AI applications. Backed by a16z and led by YC alumni, we are shaping the future of AI safety.

Open Roles:

- Staff Engineers

- Research Engineers

- Developer Relations

How to apply:

- Try Promptfoo at <https://promptfoo.dev>

- Review our code at <https://github.com/promptfoo/promptfoo>

- Email careers@promptfoo.dev with "HN" in the subject, your GitHub/LinkedIn, and a brief note on why you're excited about our work.

Join us in building safer, more reliable AI.


Promptfoo | Multiple Roles | Remote US (HQ: San Mateo, CA)

We’re building the leading open-source framework for LLM security and evaluation, trusted by 40,000+ developers. Backed by a16z and led by YC alumni, we are shaping the future of AI safety and reliability.

Open Roles:

- Staff Engineers

- Research Engineers

- DevRel

We value experience with open-source projects and a strong interest in AI/ML evaluation, safety, and security. Your work will directly shape how the world responsibly builds, tests, and deploys LLMs and LLM-powered applications.

How to Apply:

Try Promptfoo at https://promptfoo.dev and check out our GitHub at https://github.com/promptfoo/promptfoo. Then email your GitHub/LinkedIn and a short intro to careers@promptfoo.dev. Use "HN" in the subject line. Please try promptfoo before applying - strong preference will be given to candidates familiar with our work.


Promptfoo | Senior/Staff Software Engineer | SF Bay Area or Remote (US) | Full-Time | AI Security & Open-Source

About Us:

Promptfoo is building the leading open-source toolkit for testing and evaluating large language models (LLMs). We are a small, high-impact team backed by Andreessen Horowitz, shaping the future of AI safety. Trusted by over 40,000 developers, we focus on making LLMs safer, more reliable, and robust with tools for red teaming and pentesting AI.

Preferred Qualifications:

- Ability to work independently, ship features quickly, and prioritize effectively.

- Proficiency in Python and TypeScript; experience with LLMs or open-source projects is a plus.

- Strong background in AI/ML with a passion for security engineering.

Check out our GitHub to explore our work. To apply, email careers@promptfoo.dev with “HN” in the subject line, your GitHub/LinkedIn, and a brief note on why Promptfoo excites you. We will respond to every email. Preference will be given to applicants who have tried or contributed to Promptfoo.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: