xmcqdpt2 29 days ago | on: Heretic: Automatic censorship removal for language...

The paper is great. It really shows that alignment is entirely surface-level and not deeply ingrained in the models. Really interesting work.