To those wondering about their rationale for this.
It would be great if the HN title could be changed to something more like, 'OpenAI requiring ID verification for access to 5.3-codex'?
> Thank you all for reporting this issue. Here's what's going on.
> This rerouting is related to our efforts to protect against cyber abuse. The gpt-5.3-codex model is our most cyber-capable reasoning model to date. It can be used as an effective tool for cyber defense applications, but it can also be exploited for malicious purposes, and we take safety seriously. When our systems detect potential cyber activity, they reroute to a different, less-capable reasoning model. We're continuing to tune these detection mechanisms. It is important for us to get this right, especially as we prepare to make gpt-5.3-codex available to API users.
> Refer to this article for additional information. You can go to chatgpt.com/cyber to verify and regain gpt-5.3-codex access. We plan to add notifications in all of our Codex surfaces (TUI, extension, app, etc.) to make users aware that they are being rerouted due to these checks and provide a link to our “Trusted Access for Cyber” flow.
> We also plan to add a dedicated button in our /feedback flow for reporting false positive classifications. In the meantime, please use the "Bug" option to report issues of this type. Filing bugs in the Github issue tracker is not necessary for these issues.
If that's the case, then their API should return an error. Billing the user while serving a response from the wrong model is a horrible outcome. I'd go as far as to say that it's borderline fraudulent.
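At minimum you can defend against it yourself by checking which model actually served the request; the API echoes it back in the response. A minimal sketch with the OpenAI Python client (the model name is illustrative, since per the quoted post 5.3-codex isn't on the API yet):

    from openai import OpenAI

    client = OpenAI()
    REQUESTED = "gpt-5.3-codex"  # illustrative; per the post, not yet available via the API

    resp = client.chat.completions.create(
        model=REQUESTED,
        messages=[{"role": "user", "content": "Explain this stack trace ..."}],
    )

    # The completion object reports the model that actually served the request,
    # so a silent downgrade shows up as a mismatch instead of a surprise on the bill.
    if not resp.model.startswith(REQUESTED):
        raise RuntimeError(f"Rerouted: requested {REQUESTED}, got {resp.model}")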
Sounds like something they've said a dozen times so far about how their models are too scary. And a bad implementation of controls on top of that.
But right now I want to focus on what one of the more recent comments pointed out. "cyber-capable"? "cyber activity"? What the hell is that? Use real words.
I was wondering the same thing! Looked into it a bit; apparently it traces back to 'cyber capability', which is defined by lawmakers in 10 USC § 398a:
> The term “cyber capability” means a device or computer program, including any combination of software, firmware, or hardware, designed to create an effect in or through cyberspace.
So apparently, OpenAI's response is written by and for an audience of lawyers / government wonks, which differs greatly from the actual user base, who tend to be technical experts rather than policy nerds. Echoes of SOC 2 being written by accountants but advertised as if it's an audit of computer security.
No, this is incredibly naive. It's all about more biometrics and PII for sama [0]. Zero chance that Google's (of all places) or Anthropic's lawyers would somehow take a wildly different stance than OpenAI's, or that OpenAI as a company is that much chummier with the US gov than they are.
It has been more than a year since ClosedAI started gating API use behind Persona identity checks. At the time I was told by numerous HNers that "soon all of them will". We're now many model releases later and not a single other LLM provider has implemented it. There's only one conclusion to draw, and it's not that they care more about what their lawyers are supposedly saying. That would be absurd anyway, given that they know full well how the current US Gov operates. Grok made a CP generator publicly available on a platform with hundreds of millions of users, and the US Gov doesn't care. Understandable, given recent revelations that they were almost surely actively using it themselves.
What a convenient argument, you can make it fit anything
"This rerouting is related to our efforts to protect our profit margins. The $current_top_model is our most expensive model to date. It can be used as an effective tool to get semi-useful results, but it can also be exploited for using a lot of tokens which costs us money, and we take profitability seriously. When our systems detect potential excessive token generation, they reroute to a different, less-capable reasoning model. We’re continuing to tune these detection mechanisms.
In the meantime, please buy a second $200/mo subscription."
What does it mean to detect "potential cyber activity"? Apparently nearly 9% of the users of GPT-5.3-Codex were detected engaging in "cyber activities". I have no idea what "cyber activities" are, and I've been using the internet for 30 years.
I think the issue is that that isn't what cyber- means. Cyberspace, cybernetics, cybersex, ‘cybering with a girl I met in WoW’…
Military policy wonks did a poor job of inventing a new word and now it’s taking over the tech industry. It’s a strong signal of ChatGPT’s ‘Department of War’ alignment.
This article is pretty frustrating to read because it conflates many different kinds of so-called AI.
The AI plan involving LLMs and generative AI is conflated with a bunch of medical devices that are using stuff that has absolutely nothing to do with the new AI wave.
My first question was whether I could use this for sensitive tasks, given that it's not running on our machines. And after poking around for a while, I didn't find a single mention of security anywhere (as far as I could tell!)
The only thing that I did find was zero data retention, which is mentioned as being 'on request' and only on the Enterprise plan.
I totally understand that you guys need to train and advance your model, but with suggested features like scraping behind login walls, it's a little hard to take this seriously when neither of those two things is addressed anywhere on the site, so anything you could do to allay those concerns would be amazing.
Again, you seem to have done some really cool stuff, so I'd love for it to be possible to use!
Update: The homepage says this in a feature box, which is... almost worse than saying nothing, because it doesn't mean anything? -> "Enterprise-grade security; End-to-end encryption, enterprise-grade standards, and zero-trust access controls keep your data protected in transit and at rest."
We love these tools but they were designed for testing, not for automation. They are too low-level to be used as they are by AI.
For example, the Playwright MCP is very unreliable and inefficient to use. To mention a few issues: it does not correctly pierce through the different frames, and it does not handle the variety of edge cases that exist on the web, which means it can't click on the button it needs to click on. Also, because it lacks control over the context design, it cannot optimize for contextual operations, and your LLM trace gets polluted with an incredible amount of useless tokens. This increases cost, task complexity for the LLM, and latency.
On top of that, these tools rely on the accessibility tree, which is just not a viable approach for a huge number of websites.
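For anyone who hasn't hit the frame problem: plain Playwright can reach into iframes, but only if you address the frame explicitly, which is exactly the kind of detail a generic wrapper tends to fumble. A rough Python sketch (the page and selectors are made up):

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://example.com/checkout")  # hypothetical page

        # A page-level locator can't see elements that live inside an iframe;
        # you have to pierce the frame explicitly before locating the button.
        payment_frame = page.frame_locator("iframe#payment")  # made-up selector
        payment_frame.get_by_role("button", name="Pay now").click()

        browser.close()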
Again (see other comment), you are not listening to users and asking questions; you are telling them they are wrong.
You describe problems I don't have. I'm happy with Playwright and other scraping tools. Certainly not frustrated enough to pay to send my data to a 3rd party
Have you tried any other AI browser automation tools? We would be curious to hear about your use cases, because the ones we have been working on with our customers involve scenarios where traditional Playwright automations are not viable, e.g. they operate on net-new websites and net-new tasks for each execution.
I'm unwilling to send my data to a 3rd party that is so new on the scene
Consider me a late adopter because I care about the security of my data. (And no, whatever you say about security will not change my mind; a track record and broader industry penetration might.)
Make it self-hostable and the conversation can change.
We take security very seriously and one of the main advantages of using Smooth over running things on your personal device is that your agent gets a browser in a sandboxed machine with no credentials or permissions by default. This means that the agent will be able to see only what you allow it to see. We also have some degree of guard-railing which we will continue to mature over time. For example, you can control which URLs the agent is allowed to view and which are off limits.
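To make that last point concrete, the mental model is simply an allow/deny policy the sandbox enforces before the agent's browser navigates anywhere. This is purely illustrative, not our actual configuration or API:

    from urllib.parse import urlparse

    # Illustrative only -- hypothetical hosts, not a real policy format.
    ALLOWED_HOSTS = {"app.example-crm.com", "docs.example.com"}
    BLOCKED_HOSTS = {"mail.google.com"}

    def navigation_allowed(url: str) -> bool:
        """Checked before the sandboxed browser is permitted to navigate."""
        host = urlparse(url).hostname or ""
        if host in BLOCKED_HOSTS:
            return False
        return host in ALLOWED_HOSTS

    assert navigation_allowed("https://docs.example.com/guide")
    assert not navigation_allowed("https://mail.google.com/inbox")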
Until we're able to run everything locally on device, there has to be a level of trust in the organizations that control the technology stack, from the LLM all the way to the infrastructure providers. And this applies to any personal information you disclose at any touchpoint with any AI company.
I believe this trust is something that we, and every other company in the space, will fundamentally need to keep growing and maturing with our community and our users.
If Jagex would release a server of RS (ideally 3, but 2 is fine) where you could control your character with an agent like this... I would probably play it all day, every day, in parallel with whatever else I was doing.
Having built with and tried every voice model over the last three years, real time and non-real time... this is off the charts compared to anything I've seen before.
Yes I've tried Parakeet v3 too. For its own purpose - running locally - it's amazing.
The thing that's particularly amazing about this Voxtral model is how incredibly rock solid the accuracy is.
For the longest time, previous models have been 'mostly correct' or, as people have commented elsewhere in this HN thread, have dropped sentences or lost or added utterances.
I have no affiliation with these folks, but I tried and struggled to get this model to break, even speaking as adversarially as I could.
WER is slightly misleading, but Whisper Large v3 WER is classically around 10%, I think, and 12% with Turbo.
The thing that makes it particularly misleading is that models that do transcription to lowercase and then use inverse text normalization to restore structure and grammar end up making a very different class of mistakes than Whisper, which goes directly to final form text including punctuation and quotes and tone.
But nonetheless, they're claiming such a lower error rate than Whisper that it's almost not in the same bucket.
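For anyone weighing those numbers, it helps to remember what WER actually is: word-level edit distance (substitutions + deletions + insertions) divided by reference length, so punctuation, casing, and normalization choices get folded into the score depending on how you preprocess. A quick sketch of the standard calculation in Python:

    def wer(reference: str, hypothesis: str) -> float:
        """Word error rate: (substitutions + deletions + insertions) / reference length."""
        ref, hyp = reference.split(), hypothesis.split()
        # Standard Levenshtein edit distance, computed over words instead of characters.
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i
        for j in range(len(hyp) + 1):
            d[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,          # deletion
                              d[i][j - 1] + 1,          # insertion
                              d[i - 1][j - 1] + cost)   # substitution or match
        return d[len(ref)][len(hyp)] / max(len(ref), 1)

    # Scoring choices dominate: the same hypothesis is 100% wrong or perfect
    # depending on whether punctuation is stripped before scoring.
    print(wer("hello, world.", "hello world"))   # 1.0
    print(wer("hello world", "hello world"))     # 0.0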
On the topic of things being misleading, the GPT-4o transcriber is a very _different_ transcriber from Whisper. I would say not better or worse, despite characterizations as such. So it is a little difficult to compare on just the numbers.
There's a reason that quite a lot of good transcribers still use V2, not V3.
> If the company is <30 people, reach out to the CEO directly.
When the people you're interviewing with are 'already senior' (e.g. direct reports to the CEO), you can sometimes make your case worse rather than better, because it feels like you're going over their head.
So rather than size...
- If the interviewer(s) in question feel like you're trying to circumvent them, you're probably making your case worse.
- The kind of CEO that tends to meddle in things below their level might drag down your case even if they like you, because folks can develop a distaste for their meddling.
- Doing this for senior roles, or for roles at small companies, can actually be worse, because the person in question is more likely to be close in the reporting chain to the CEO, who is then more likely to meddle directly in your hiring process. Zero or one level removed can be the worst.
> When the people you're interviewing with are 'already senior' (e.g. direct reports to the CEO), you can sometimes make your case worse rather than better, because it feels like you're going over their head.
If that happens then it's a very good thing - you do not want to work at a company where people are precious about how they succeed. If a great candidate (e.g. you) drops into the inbox of the CEO, who forwards it to someone else, and their first reaction is 'Well, they violated my personal kingdom by going over my head!', then that is a manager you do not need in your life.
I interpreted this post as being about how you get an interview in the first place, so the hope would be that the CEO forwards your mail to this senior person you're worried about.
Even still - a lot of senior folks, sadly, don't take it super well when candidates are forwarded their way by people above them when they're running a process.
Remember that you may not know who the hiring manager is, and there may not even be a relevant posted position. I've gotten lucky just by reaching out to very senior people at a couple of different companies (of very different sizes) over time.
I understood the OP to be saying "reach out to the CEO to express your interest in working for the company in order to get to the interview stage", not "email the CEO to make a case for being hired when you're already in the interview pipeline"
Very cool! I've definitely dreaded trying to make sense of the diverse infra every time we've needed to do this in the past. Several of these are quite simple, but every extra tooling combo in CI can be a real PITA.