You keep ignoring both how to audit NNs and how to address behavior that isn’t in the code. Have you read the paper yet? Explain how audits work in light of the paper.
Until you answer those questions, you’re simply wasting time and effort by claiming audits can find things they cannot.
Given your inability to grasp these points, I doubt you work in security. Please demonstrate you’re not lying. Your posting history shows a tendency toward conspiracy beliefs, and there’s zero evidence you do anything professionally in security, unlike the people I know who do work in the field.