Dismantling a Crappy Malware Operation (mrbruh.com)
102 points by MrBruh on March 25, 2023 | 32 comments



You mentioned they were using Dropbox to distribute the malware—did you follow up with them? What about the university?


I am surprised and also not surprised that they had approximately 0 OPSEC related to their hustle.


Yeah, for sure. Something else interesting I found: one of the guys is a teacher and the other is a student at the same university in Vietnam.


Keep in mind that there is a nonzero chance that they are victims of account compromise themselves, and the adversaries just leveraged those to give authorities an easy "culprit" to point to.

In fact, apparently terrible opsec could even be a deliberate effort to frame individuals whose accounts had been compromised.

It's not likely, but it could be an effective way to frame someone.


Rhetorically, there's a push and pull here:

It's entirely possible the people in the semi-redacted pictures were acting like idiots because they're idiots.

It's also possible the real criminals are using those accounts as patsies.

We, sitting here, can't know the real facts.

However, philosophically, we can say that the first option is the least hypothesis, while still remaining open to a more complex explanation.

Believe it or not, you don't have to come to a firm conclusion based on incomplete evidence.


Not coming to a firm conclusion based on incomplete evidence is the point I am making: let's not (rhetorically) convict these people without wider consideration of the possibilities.

To be clear, I am not calling for (rhetorically) dropping the charges against them, I'm just calling for "maybe we shouldn't halfway-dox people who _might_ be innocent and uninvolved parties". You know, an "innocent until proven guilty" sort of thing.

I'd have forwarded this to law enforcement, sure. A blog post about how to dismantle poorly coded operations, fine. But posting poorly redacted names and photos? There isn't even a guarantee that the individuals in question are the same people who created the GitHub accounts.

As someone who was a victim of pretty serious bullying as a kid (including having other people make fake social media accounts with my name and picture to post horrific things that I wholeheartedly reject, to harm my reputation and attempt to get me in legal trouble), I just want to remind everyone that these kinds of situations aren't always what they initially look like.

Cybercriminals do stuff like this to security researchers all the time - you need look no further than Brian Krebs being repeatedly swatted and routinely harassed/trolled by cybercriminals around the world, with major database breaches often being publicly attributed to him by the cybercriminals. There was even a carding website / marketplace set up with his name and likeness - all to harm the reputation of someone working to ethically stamp out their crimes.


Did you report them to the University?


Nice, but I have to wonder why GitHub acted on this so fast... I reported one account spreading Python-based malware 2 months ago and the account was still there up until last week.


Great work! Though the redaction of names / university is very leaky if that is a concern (particularly if you have some knowledge of common Vietnamese naming patterns).


Not sure why you'd think that would be a problem.

I wouldn't have redacted any names.


I am neutral on whether it is or not. But rather than posting unredacted images, an attempt was apparently made to redact them, and I could still read the names.


Use something suffering from acropalypse and claim plausible deniability.


Really fun analysis. I wasn't aware that Python scripts could be packaged into an executable until now; learned something new. Thanks for sharing!


PyInstaller is only one of several ways to do this. It bundles the Python interpreter, script, and dependencies together; at runtime it unpacks them into a temp directory and starts the script with that bundled interpreter.
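As an illustration (a minimal sketch, not from the article): PyInstaller's onefile bootloader sets sys.frozen and exposes that temp directory as sys._MEIPASS, so a script can tell how it's being run:

    import sys

    # PyInstaller's onefile bootloader sets sys.frozen and exposes the
    # temp directory it unpacked the bundle into as sys._MEIPASS.
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        print("Frozen by PyInstaller, unpacked to:", sys._MEIPASS)
    else:
        print("Running as an ordinary Python script")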

There are also source-to-source translation tools like Nuitka that translate Python to C, which can then be compiled to a PE. Nuitka is less reliable than PyInstaller, but harder to reverse engineer for predictable reasons.


>Nuitka is less reliable than PyInstaller, but harder to reverse engineer for predictable reasons.

That will not matter for long. GPT-4 can turn assembly back into C and generate appropriate comments.


GPT-4 cannot extrapolate real information from less. Reverse engineering large obfuscated binaries is like unpixelating an image: it's just guessing.

Still immensely useful, but it does have its limits.


TBH, an LLM might be decent at code identification -- looking at some assembly and saying "that looks like a CRC32 hash", for example. That's a task that dovetails fairly well with its strong pattern-matching abilities. Making larger statements about the structure and function of an entire application is probably beyond it, though.

Moreover, it's likely to fail in any sort of adversarial scenario. If you show it a function with some loops that XORs an input against 0xEDB88320, for example, it would probably confidently identify that function as CRC32, even if it's actually something else which happens to use the same constant.
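That constant really is distinctive: a bit-at-a-time CRC-32 is just a loop shifting and XORing against the reflected polynomial 0xEDB88320, which is why reverse engineers treat it as a fingerprint. A minimal sketch, checked against zlib:

    import zlib

    # Bit-at-a-time CRC-32 using the reflected polynomial 0xEDB88320,
    # the constant that typically gives the algorithm away in disassembly.
    def crc32_bitwise(data: bytes) -> int:
        crc = 0xFFFFFFFF
        for byte in data:
            crc ^= byte
            for _ in range(8):
                crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
        return crc ^ 0xFFFFFFFF

    assert crc32_bitwise(b"hello") == zlib.crc32(b"hello")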


All the real information is already in the binary; no guessing is necessary. It takes data, processes it through a set of defined steps, and outputs it. The C code, the assembly code, and the obfuscated assembly code all express the same fundamental conceptual object.

If you have a good enough model with a large enough token window to grasp the entire binary, it will see all of those relations easily. GPT-4 already demonstrates ability in reverse engineering, and GPT-5 is underway; if it is as powerful a generational jump as 3 to 4 was, it will advance these abilities tremendously.


I am skeptical that reverse engineering will be taken over by LLMs. At the very least, most LLMs aren't trained to work in an adversarial environment, which is what reverse engineering is.


And just like SEO we will have people tailoring their code to fool the AI.


Does this actually work for nontrivial functions, e.g. a hashtable lookup function?


GPT 3.5 can write most of the infrastructure and "scaffolding" for a full ransomware campaign, but it has absolutely no idea how to perform the most basic cryptographic operations, even when explicitly instructed on which library and method to use; it will just confidently spit out absolute bullshit that only vaguely resembles what you're looking for. It's like asking a nine-year-old. It also struggles with writing any kind of obfuscation method beyond base64, string splitting, and XORing: I have asked it dozens of times, and it has never managed to get close to a trivial implementation not using those, even when directly told to do exactly that.
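For scale, here is a minimal sketch of the kind of "trivial" XOR obfuscation meant above (hypothetical key and payload):

    # Single-byte XOR is symmetric: the same function encodes and decodes.
    def xor_obfuscate(data: bytes, key: int = 0x5A) -> bytes:
        return bytes(b ^ key for b in data)

    blob = xor_obfuscate(b"example payload")
    assert xor_obfuscate(blob) == b"example payload"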

Haven't played with GPT-4 yet. I need to try that, as well as larger LLaMA models, on a rented cloud GPU box sometime. I have a full battery of tests covering writing malware, identifying vulnerabilities a la static analysis, fixing those vulnerabilities, and exploiting those vulnerabilities, in a variety of languages, as well as a few generic / assorted technical tasks.

Some other things GPT 3.5 sucks at, in addition to implementing cryptography and obfuscating code:

- Writing ASCII art

- Writing HTML and CSS from any kind of graphical instruction, even very simple ones like "draw a car using HTML5 and CSS" or "draw the Facebook logo in HTML and CSS"

- Incomplete solutions. Example: when asked to find all of the vulnerabilities in a block of code that contains three or four, it'll confidently list one and say that's the only vulnerability. If you argue and insist that there are more, it'll find another, apologize for missing it the first time, and then insist it has found all of them. Ask again and it'll say "nope, there are no more vulnerabilities in this code".

- False negatives until told explicitly. Example: you can show it a code block containing a more low-level or exotic vulnerability (e.g. TOCTOU) than your ordinary SQL injection or XSS, ask it if there are any vulnerabilities, and it'll confidently say none, over and over. Then you ask if it's vulnerable to a TOCTOU attack, and it finally realizes: oh yeah, variable X retrieves this value for a comparison on line Y but retrieves it again when passing it to this other function on line Z, so if the value changes in between, it could pass the bounds check on line Y but be invalid when checked again on line Z. Which is great that it gets it, right up until you realize you basically have to ask it over and over again for every specific type of vulnerability, and even then it'll still miss some altogether (see the sketch after this list).
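A minimal sketch of that TOCTOU shape (hypothetical path; the race is the window between the check and the use):

    import os

    path = "/tmp/upload.bin"  # hypothetical path

    # Time-of-check: the file passes validation here...
    if os.path.getsize(path) < 1_000_000:
        # ...but another process can replace or grow the file in this window.
        # Time-of-use: the data is effectively fetched again on open/read,
        # so it may no longer satisfy the check above.
        with open(path, "rb") as f:
            data = f.read()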

At the level of work expected in big tech companies, I can see GPT 3.5 augmenting or supplementing some outsourced junior consultants, but it's not even adequate to replace them outright, to say nothing of seniors, principals, and true domain experts, at least in security.


Not when I tried.


It's in the Microsoft paper. We might require the 32k-token model to really handle it.


I didn't see it in https://cdn.openai.com/papers/gpt-4.pdf -- which paper are you referring to? (Or if I missed it, what page number?)


Incredible detective work!

Why would Discord let anyone delete a webhook?

I'd think anyone could post to the webhook, but you'd need to be authorized to modify it.


>Why would Discord let anyone delete a webhook?

Why wouldn't they? Blocking it just leads to more abuse of Discord and its users.


I think they mean: why would Discord let anyone who merely has the webhook URL delete it, as opposed to requiring a bot or user token with the correct permissions in the server the webhook belongs to?


I answered that. If someone has your webhook URL, either people will spam it, in which case it's best to have it deleted, or private information is being sent over it, in which case it's also best to delete it.

It's like if AWS had a public endpoint to invalidate access keys: if access keys are public they will be abused, so it benefits everyone if anyone can report those keys to AWS to have them deactivated.
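The mechanics are simple (a minimal sketch; the URL below is a placeholder): a Discord webhook URL embeds its own token, so a bare unauthenticated DELETE request against the URL removes the webhook.

    import requests

    # A Discord webhook URL embeds its own token, so an unauthenticated
    # DELETE against the URL deletes the webhook itself.
    webhook_url = "https://discord.com/api/webhooks/<id>/<token>"  # placeholder

    resp = requests.delete(webhook_url, timeout=10)
    print(resp.status_code)  # 204 No Content on success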


Did they have "malware development and distribution" on their resume?


As mentioned in the article, anyone can delete a malicious webhook.

https://webhooks.scam.gay/ is a site that makes it easy for people who want a tool to do it for them.


Great work!



