To those wondering about their rationale for this.
It would be great if the HN title could be changed to something more like, 'OpenAI requiring ID verification for access to 5.3-codex'?
> Thank you all for reporting this issue. Here's what's going on.
> This rerouting is related to our efforts to protect against cyber abuse. The gpt-5.3-codex model is our most cyber-capable reasoning model to date. It can be used as an effective tool for cyber defense applications, but it can also be exploited for malicious purposes, and we take safety seriously. When our systems detect potential cyber activity, they reroute to a different, less-capable reasoning model. We're continuing to tune these detection mechanisms. It is important for us to get this right, especially as we prepare to make gpt-5.3-codex available to API users.
> Refer to this article for additional information. You can go to chatgpt.com/cyber to verify and regain gpt-5.3-codex access. We plan to add notifications in all of our Codex surfaces (TUI, extension, app, etc.) to make users aware that they are being rerouted due to these checks and provide a link to our “Trusted Access for Cyber” flow.
> We also plan to add a dedicated button in our /feedback flow for reporting false positive classifications. In the meantime, please use the "Bug" option to report issues of this type. Filing bugs in the Github issue tracker is not necessary for these issues.
If that's the case, then their API should return an error. Billing the user while serving a response from the wrong model is a horrible outcome. I'd go as far as to say that it's borderline fraudulent.
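At minimum you can defend against it yourself by checking which model actually served the request; the API echoes it back in the response. A minimal sketch with the OpenAI Python client (the model name is illustrative, since per the quoted post 5.3-codex isn't on the API yet):

    from openai import OpenAI

    client = OpenAI()
    REQUESTED = "gpt-5.3-codex"  # illustrative; per the post, not yet available via the API

    resp = client.chat.completions.create(
        model=REQUESTED,
        messages=[{"role": "user", "content": "Explain this stack trace ..."}],
    )

    # The completion object reports the model that actually served the request,
    # so a silent downgrade shows up as a mismatch instead of a surprise on the bill.
    if not resp.model.startswith(REQUESTED):
        raise RuntimeError(f"Rerouted: requested {REQUESTED}, got {resp.model}")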
Sounds like something they've said a dozen times so far about how their models are too scary. And a bad implementation of controls on top of that.
But right now I want to focus on what one of the more recent comments pointed out. "cyber-capable"? "cyber activity"? What the hell is that? Use real words.
I was wondering the same thing! Looked into it a bit; apparently it traces back to 'cyber capability', which is defined by lawmakers in 10 USC § 398a:
> The term “cyber capability” means a device or computer program, including any combination of software, firmware, or hardware, designed to create an effect in or through cyberspace.
So apparently, OpenAI's response is written by and for an audience of lawyers / government wonks, which differs greatly from the actual user base, who tend to be technical experts rather than policy nerds. Echoes of SOC 2 being written by accountants but advertised as if it's an audit of computer security.
No, this is incredibly naive. It's all about more biometrics and PII for sama [0]. Zero chance that Google's (of all places) or Anthropic's lawyers would somehow take a wildly different stance than OpenAI's, or that OpenAI as a company is that much chummier with the US gov than they are.
It has been more than a year since ClosedAI started gating API use behind Persona identity checks. At the time I was told by numerous HNers that "soon all of them will". We're now many model releases later and not a single other LLM provider has implemented it. There's only one conclusion to draw, and it's not that they care more about what their lawyers are supposedly saying. That would be absurd anyway, given that they know full well how the current US Gov operates. Grok made a CP generator publicly available on a platform with hundreds of millions of users, and the US Gov doesn't care. Understandable, given recent revelations that they were almost surely actively using it themselves.
What a convenient argument, you can make it fit anything
"This rerouting is related to our efforts to protect our profit margins. The $current_top_model is our most expensive model to date. It can be used as an effective tool to get semi-useful results, but it can also be exploited for using a lot of tokens which costs us money, and we take profitability seriously. When our systems detect potential excessive token generation, they reroute to a different, less-capable reasoning model. We’re continuing to tune these detection mechanisms.
In the meantime, please buy a second $200/mo subscription."
What does it mean to detect "potential cyber activity"? Apparently nearly 9% of the users of GPT-5.3-Codex were detected engaging in "cyber activities". I have no idea what "cyber activities" are, and I've been using the internet for 30 years.
I think the issue is that that isn't what cyber- means. Cyberspace, cybernetics, cybersex, ‘cybering with a girl I met in WoW’…
Military policy wonks did a poor job of inventing a new word and now it’s taking over the tech industry. It’s a strong signal of ChatGPT’s ‘Department of War’ alignment.
This article is pretty frustrating to read because it conflates many different kinds of so-called AI.
The AI plan involving LLMs and generative AI is conflated with a bunch of medical devices that are using stuff that has absolutely nothing to do with the new AI wave.
My first question was whether I could use this for sensitive tasks, given that it's not running on our machines. And after poking around for a while, I didn't find a single mention of security anywhere (as far as I could tell!)
The only thing that I did find was zero data retention, which is mentioned as being 'on request' and only on the Enterprise plan.
I totally understand that you guys need to train and advance your model, but with suggested features like scraping behind login walls, it's a little hard to take this seriously when neither of those two things is addressed anywhere on the site, so anything you could do to allay those concerns would be amazing.
Again, you seem to have done some really cool stuff, so I'd love for it to be possible to use!
Update: The homepage says this in a feature box, which is... almost worse than saying nothing, because it doesn't mean anything? -> "Enterprise-grade security; End-to-end encryption, enterprise-grade standards, and zero-trust access controls keep your data protected in transit and at rest."
We love these tools but they were designed for testing, not for automation. They are too low-level to be used as they are by AI.
For example, the Playwright MCP is very unreliable and inefficient to use. To mention a few issues: it does not correctly pierce through the different frames, and it does not handle the variety of edge cases that exist on the web, which means it can't click on the button it needs to click on. Also, because it lacks control over the context design, it cannot optimize for contextual operations, and your LLM trace gets polluted with an incredible amount of useless tokens. This increases cost, task complexity for the LLM, and latency.
On top of that, these tools rely on the accessibility tree, which is just not a viable approach for a huge number of websites.
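For anyone who hasn't hit the frame problem: plain Playwright can reach into iframes, but only if you address the frame explicitly, which is exactly the kind of detail a generic wrapper tends to fumble. A rough Python sketch (the page and selectors are made up):

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://example.com/checkout")  # hypothetical page

        # A page-level locator can't see elements that live inside an iframe;
        # you have to pierce the frame explicitly before locating the button.
        payment_frame = page.frame_locator("iframe#payment")  # made-up selector
        payment_frame.get_by_role("button", name="Pay now").click()

        browser.close()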
Again (see other comment), you are not listening to users and asking questions; you are telling them they are wrong.
You describe problems I don't have. I'm happy with Playwright and other scraping tools. Certainly not frustrated enough to pay to send my data to a 3rd party
Have you tried any other AI browser automation tools? We would be curious to hear about your use cases, because the ones we have been working on with our customers involve scenarios where traditional Playwright automations are not viable, e.g. they operate on net-new websites and net-new tasks for each execution.
I'm unwilling to send my data to a 3rd party that is so new on the scene
Consider me a late adopter because I care about the security of my data. (And no, whatever you say about security will not change my mind; a track record and broader industry penetration might.)
Make it self-hostable and the conversation can change.
We take security very seriously and one of the main advantages of using Smooth over running things on your personal device is that your agent gets a browser in a sandboxed machine with no credentials or permissions by default. This means that the agent will be able to see only what you allow it to see. We also have some degree of guard-railing which we will continue to mature over time. For example, you can control which URLs the agent is allowed to view and which are off limits.
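To make that last point concrete, the mental model is simply an allow/deny policy the sandbox enforces before the agent's browser navigates anywhere. This is purely illustrative, not our actual configuration or API:

    from urllib.parse import urlparse

    # Illustrative only -- hypothetical hosts, not a real policy format.
    ALLOWED_HOSTS = {"app.example-crm.com", "docs.example.com"}
    BLOCKED_HOSTS = {"mail.google.com"}

    def navigation_allowed(url: str) -> bool:
        """Checked before the sandboxed browser is permitted to navigate."""
        host = urlparse(url).hostname or ""
        if host in BLOCKED_HOSTS:
            return False
        return host in ALLOWED_HOSTS

    assert navigation_allowed("https://docs.example.com/guide")
    assert not navigation_allowed("https://mail.google.com/inbox")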
Until we're able to run everything locally on device, there has to be a level of trust in the organizations that control the technology stack, from the LLM all the way to the infrastructure providers. And this applies to any personal information you disclose at any touchpoint with any AI company.
I believe this trust is something that we, and every other company in the space, will fundamentally need to keep growing and maturing with our community and our users.
If Jagex would release a server of RS (ideally 3, but 2 is fine) where you could control your character with an agent like this... I would probably play it all day, every day, in parallel with whatever else I was doing.
Having built with and tried every voice model over the last three years, real time and non-real time... this is off the charts compared to anything I've seen before.
Yes I've tried Parakeet v3 too. For its own purpose - running locally - it's amazing.
The thing that's particularly amazing about this Voxtral model is how incredibly rock solid the accuracy is.
For the longest time, previous models have been 'mostly correct' or, as people have commented elsewhere in this HN thread, have dropped sentences or lost or added utterances.
I have no affiliation with these folks, but I tried and struggled to get this model to break, even speaking as adversarially as I could.
WER is slightly misleading, but Whisper Large v3 WER is classically around 10%, I think, and 12% with Turbo.
The thing that makes it particularly misleading is that models that do transcription to lowercase and then use inverse text normalization to restore structure and grammar end up making a very different class of mistakes than Whisper, which goes directly to final form text including punctuation and quotes and tone.
But nonetheless, they're claiming such a lower error rate than Whisper that it's almost not in the same bucket.
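For anyone weighing those numbers, it helps to remember what WER actually is: word-level edit distance (substitutions + deletions + insertions) divided by reference length, so punctuation, casing, and normalization choices get folded into the score depending on how you preprocess. A quick sketch of the standard calculation in Python:

    def wer(reference: str, hypothesis: str) -> float:
        """Word error rate: (substitutions + deletions + insertions) / reference length."""
        ref, hyp = reference.split(), hypothesis.split()
        # Standard Levenshtein edit distance, computed over words instead of characters.
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i
        for j in range(len(hyp) + 1):
            d[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,          # deletion
                              d[i][j - 1] + 1,          # insertion
                              d[i - 1][j - 1] + cost)   # substitution or match
        return d[len(ref)][len(hyp)] / max(len(ref), 1)

    # Scoring choices dominate: the same hypothesis is 100% wrong or perfect
    # depending on whether punctuation is stripped before scoring.
    print(wer("hello, world.", "hello world"))   # 1.0
    print(wer("hello world", "hello world"))     # 0.0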
On the topic of things being misleading, the GPT-4o transcriber is a very _different_ transcriber from Whisper. I would say not better or worse, despite characterizations as such. So it is a little difficult to compare on just the numbers.
There's a reason that quite a lot of good transcribers still use V2, not V3.
> If the company is <30 people, reach out to the CEO directly.
When the people you're interviewing with are 'already senior' (e.g. direct reports to the CEO), you can sometimes make your case worse rather than better, because it feels like you're going over their head.
So rather than size...
- If the interviewer(s) in question feel like you're trying to circumvent them, you're probably making your case worse.
- The kind of CEO that tends to meddle in things below their level might drag down your case even if they like you, because folks can develop a distaste for their meddling.
- Doing this for senior roles, or for roles at small companies, can actually be worse, because the person in question is more likely to be close in the reporting chain to the CEO, who is then more likely to meddle directly in your hiring process. Zero or one level removed can be the worst.
> When the people you're interviewing with are 'already senior' (e.g. direct reports to the CEO), you can sometimes make your case worse rather than better, because it feels like you're going over their head.
If that happens then it's a very good thing - you do not want to work at a company where people are precious about how they succeed. If a great candidate (e.g. you) drops into the inbox of the CEO, who forwards it to someone else, and their first reaction is 'Well, they violated my personal kingdom by going over my head!', then that is a manager you do not need in your life.
I interpreted this post as being about how you get an interview in the first place, so the hope would be that the CEO forwards your mail to this senior person you're worried about.
Even still - a lot of senior folks, sadly, don't take it super well when candidates are forwarded their way by people above them when they're running a process.
Remember that you may not know who the hiring manager is, and there may not even be a relevant posted position. I've gotten lucky just by reaching out to very senior people at a couple of different companies (of very different sizes) over time.
I understood the OP to be saying "reach out to the CEO to express your interest in working for the company in order to get to the interview stage", not "email the CEO to make a case for being hired when you're already in the interview pipeline"
Very cool! I've definitely dreaded trying to make sense of the diverse infra every time we've needed to do this in the past. Several of these are quite simple, but every extra tooling combo in CI can be a real PITA.