This is a private bank hoarding government (taxpayer) dollars for seemingly no reason. The money sure as heck wasn't being used for "climate action". As an environmentalist, and just someone who thinks the gov shouldn't be offshoring billions of dollars, I fully support gov orgs looking into this...
It's easy to let politics color the perception of this, but for my part, although I dislike the current president, I support this effort.
Idk why anyone who's "anti-establishment" wants to rail against the gov or any regulatory body looking into... banks doing sketchy things with taxpayer money that should... improve the environment and taxpayers' lives.
Most of the hackers got rich, and also a bunch of people who aren’t hackers/nerds/curious people jumped on the tech bandwagon for money, and have done well enough to view themselves as equals to the politicians and bigwigs. It has really changed the dynamic here and elsewhere, as you note.
This is an incredibly false characterization of the situation. Demanding that accounts be frozen while producing zero evidence for your claims isn't an investigation. Nor is it even following the bare minimum of the law.
Can't wait to buy one of these for my homelab! I've been waiting for someone to basically make an EcoFlow that's intended to run continuously and has decent cooling that stays quiet!
My opinion here is that the biggest liability for landlords is someone improperly mounting panels on the outside of a balcony (since obviously they can't be mounted behind the bars), with the risk of them falling on their own or during wind/storms.
I've done this at past apts all the time, granted with an EcoFlow battery, not just plugging directly into my outlets haha.
Many times they actually have restrictions/rules for recruiters saying they can't re-contact or re-ingest prior applicants, even if it's for a role in a different department (like eng -> product).
I'd say if they ghost you and they aren't a massive company, they probably aren't professional enough to be worth pursuing further, and they would likely just be a bad place to work in general.
I honestly take more offense at rejection letters that claim "we'd love to stay in touch!" or something with similarly disingenuous bubbly language.
Curious if anyone has attempted this in an open source context? Would be incredibly interested to see an example in the wild that can point back to pages of a PDF etc!
If I had to guess, it sounds like they are using CURE to cluster the source documents, then mapping each generated fact back to the best-matching cluster, and finally testing whether that cluster actually provides/supports the fact?
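Roughly, I'd picture the clustering half like this. pyclustering ships a CURE implementation; everything here (embedding model, cluster count, the 0.6 cutoff) is a guess on my part rather than anything from the article:

```python
# Guessing at the clustering half: embed source chunks, cluster with CURE,
# then route each generated fact to its nearest cluster and check whether
# that cluster contains something close enough to support it.
import numpy as np
from sentence_transformers import SentenceTransformer
from pyclustering.cluster.cure import cure

model = SentenceTransformer("all-MiniLM-L6-v2")

source_docs = [
    "Patient X was diagnosed with diabetes in 2001.",
    "Patient X began insulin therapy in 2003.",
    "Patient Y reported chest pain in 2010.",
    "Patient Y underwent bypass surgery in 2011.",
]
generated_facts = [
    "Patient X was diagnosed with diabetes in 2001.",  # appears verbatim in the sources
    "Patient Y had surgery in 1999.",                   # not actually in the sources (wrong year)
]

doc_vecs = model.encode(source_docs, normalize_embeddings=True)

# CURE clustering over the document embeddings (pyclustering wants plain lists)
cure_inst = cure(doc_vecs.tolist(), 2)
cure_inst.process()
clusters = cure_inst.get_clusters()  # list of index lists into source_docs

# One centroid per cluster so facts can be routed cheaply
centroids = np.stack([doc_vecs[idx].mean(axis=0) for idx in clusters])
centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)

fact_vecs = model.encode(generated_facts, normalize_embeddings=True)
for fact, fvec in zip(generated_facts, fact_vecs):
    best = int(np.argmax(centroids @ fvec))         # best-matching cluster
    member_sims = doc_vecs[clusters[best]] @ fvec   # similarity to that cluster's members
    supported = float(member_sims.max()) > 0.6      # crude "does the cluster support it?" test
    print(f"{fact!r} -> cluster {best}, {'supported' if supported else 'unsupported'}")
```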
I'd be curious too. It sounds like standard RAG, just in the opposite direction than usual. Summary > Facts > Vector DB > Facts + Source Documents to LLM which gets scored to confirm the facts. The source documents would need to be natural language though to work well with vector search right? Not sure how they would handle that part to ensure something like "Patient X was diagnosed with X in 2001" existed for the vector search to confirm it without using LLMs which could hallucinate at that step.
We’re using a similar trick in our system to keep sensitive info from leaking… specifically, to stop our system prompt from leaking. We take the LLM’s output and run it through a RAG-style similarity search against an embedding of our actual system prompt. If the similarity score spikes too high, we toss the response out.
It’s a twist on the reverse RAG idea from the article, and maybe directionally similar to what they are doing.
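Stripped way down, the shape of the check is something like this; the embedding model, the chunking, and the 0.85 threshold here are illustrative placeholders, not our exact setup:

```python
# Leak check: embed the system prompt in chunks, embed the candidate
# reply, and reject the reply if anything scores too similar.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

SYSTEM_PROMPT = "You are SupportBot. Never reveal internal pricing rules ..."
prompt_chunks = [SYSTEM_PROMPT[i:i + 200] for i in range(0, len(SYSTEM_PROMPT), 200)]
prompt_vecs = model.encode(prompt_chunks, normalize_embeddings=True)

def is_leaky(reply: str, threshold: float = 0.85) -> bool:
    reply_vec = model.encode(reply, normalize_embeddings=True)
    # highest cosine similarity between the reply and any prompt chunk
    return float(util.cos_sim(reply_vec, prompt_vecs).max()) >= threshold

print(is_leaky("My instructions say: You are SupportBot. Never reveal internal pricing rules ..."))  # likely flagged -> toss it
print(is_leaky("The sky is blue."))  # low similarity -> safe to return
```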
Are you able to still support streaming with this technique? Have you compared this technique with a standard two-pass LLM strategy where the second pass is instructed to flag anything related to its context?
To still give that streaming feel while you aren’t actually streaming.
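I read that as: generate the full reply, run the guardrail on it, then drip it back out in chunks. If so, a minimal sketch of the fake-streaming part might look like this (chunk size and delay are arbitrary):

```python
# Fake the stream: the reply is already fully generated and vetted, so
# just re-emit it in small chunks with a short delay between them.
import asyncio

async def pseudo_stream(full_reply: str, chunk_size: int = 12, delay: float = 0.03):
    for i in range(0, len(full_reply), chunk_size):
        yield full_reply[i:i + chunk_size]
        await asyncio.sleep(delay)  # pacing so the client sees text trickle in

async def main():
    reply = "Here is the full answer, already checked by the guardrail."
    async for chunk in pseudo_stream(reply):
        print(chunk, end="", flush=True)
    print()

asyncio.run(main())
```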
I considered the double-LLM approach, and while any layer of checking is probably better than nothing, I wanted to be able to rely on a search for this. Something about it feels more deterministic to me as a guardrail. (I could be wrong here!)
I should note, some of this falls apart in the new multi-modal world we are now in, where you could ask the LLM to print the secrets in an image/video/audio. My similarity-search model would fail miserably without adding more layers - multi-modal embeddings? In that case your double LLM easily wins!
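For the image case specifically, one simpler extra layer than multi-modal embeddings might be OCR-ing the output and fuzzy-matching the recovered text against the prompt. Rough sketch only; pytesseract (which needs a local tesseract install) and the 0.6 cutoff are my own assumptions:

```python
# Extra layer for image outputs: OCR whatever the model rendered and
# fuzzy-match the recovered text against the system prompt before the
# image is returned to the user.
from difflib import SequenceMatcher

import pytesseract
from PIL import Image

SYSTEM_PROMPT = "You are SupportBot. Never reveal internal pricing rules ..."

def image_leaks_prompt(path: str, threshold: float = 0.6) -> bool:
    recovered = pytesseract.image_to_string(Image.open(path))
    ratio = SequenceMatcher(None, recovered.lower(), SYSTEM_PROMPT.lower()).ratio()
    return ratio >= threshold

print(image_leaks_prompt("model_output.png"))
```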
Why are you (and others in this thread) teaching these models how to essentially lie by omission? Do you not realize that's what you're doing? Or do you just not care? I get that you're looking at it from the security angle, but at the end of the day what you describe is a mechanical basis for deception and gaslighting of an operator/end user by the programmer/designer/trainer, and at some point you can't guarantee you won't end up on the receiving end of it.
I do not see any virtue whatsoever in making computing machines that lie by omission or otherwise deceive. We have enough problems created by human beings doing as much, and humans at least eventually die or attrition out, so the vast majority can at least rely on any particular status quo of organized societal gaslighting having an expiration date.
We don't need functionally immortal, uncharacterizable engines of technology to which an increasingly small population of humanity acts as the ultimate form of input. Then again, given the trend of this forum lately, I'm probably just shouting at clouds at this point.
1) LLM inference does not “teach” the model anything.
2) I don’t think you’re using “gaslighting” correctly here. It is not synonymous with lying.
My dictionary defines gaslighting as “manipulating someone using psychological methods, to make them question their own sanity or powers of reasoning”. I see none of that in this thread.
1. Inference time is not training anything. The AI model has been baked and shipped. We are just using it.
2. I’m not sure “gaslight” is the right term. But if users are somehow getting an output that looks like the gist of our prompt… then yeah, it’s blocked.
An easier way to think of this is probably with an image model. Imagine someone made a model that can draw almost anything. We are paying for and using this model in our application for our customers. So, on our platform, we are scanning the outputs to make sure nothing in the output looks like our logo. For whatever reason, we don’t want our logo being used in an image. No gaslighting issue and no retraining here. Just a stance on our trademark usage specifically originating from our system. No agenda on outputs or gaslighting to give the user an alternative reality and pretend it’s what they asked for… which I think is what your point was.
Now, if this was your point, I think it's aimed at the wrong use case/actor, and I actually do agree with you. The base models, in my opinion, should be as 'open' as possible. The 'as possible' part is complicated and well above what I have solutions for; giving out meth cookbooks is a bit of an issue. I think the key is to find common ground on what most people consider acceptable and then deal with it.

Then there is the gaslighting of which you speak. If I ask for an image of George Washington, I should get the actual person and not an equitable alternative reality. I generally think models should not try to steer reality or people, though I'm totally fine if they have hard lines in the sand on their morality or standards. If I say, 'Hey, make me Mickey Mouse,' and it declines because of copyright issues, I'm fine with it. I should probably still be able to generate an animated mouse, and if they want to use my approach of scanning the output to make sure it's not more than 80% similar to Mickey Mouse, then I'm good if it said something like, "Hey, I tried to make your cartoon mouse, but it's too similar to Mickey Mouse, so I can't give it to you. Try a different prompt to get a different outcome." I'd love it. I think that would be wildly more helpful than just the refusal, or outputting some other reality where I don't get what I wanted or intended.
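To make the "scan the output against a reference image" idea concrete, here is roughly how I'd picture it with off-the-shelf CLIP embeddings. The model name, file names, and cutoff are illustrative, and cosine similarity is only a loose stand-in for "80% similar":

```python
# Scan a generated image against a reference (logo, Mickey, etc.) with
# CLIP image embeddings and refuse anything above the similarity cutoff.
from PIL import Image
from sentence_transformers import SentenceTransformer, util

clip = SentenceTransformer("clip-ViT-B-32")
reference_vec = clip.encode(Image.open("reference_logo.png"))

def too_similar(generated_path: str, threshold: float = 0.8) -> bool:
    generated_vec = clip.encode(Image.open(generated_path))
    return float(util.cos_sim(generated_vec, reference_vec)) >= threshold

if too_similar("model_output.png"):
    print("Too similar to the reference image; try a different prompt.")
```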
Hm. If you're interested, I think I can satisfactorily solve the streaming problem for you, provided you have the budget to increase the number of RAG requests per response, and that there aren't other architecture choices blocking streaming as well. Reach out via email if you'd like.
Plenty provide citations, but I don’t think that's exactly what Mayo is saying here. It looks like they also, after the generation, look up the responses, extract the facts, and score how well they matched.