Interesting. About 6 months ago I was monitoring all transactions on a handful of altcoin blockchains to see which ones were going in and out of exchanges. Because the number of transactions per block was several orders of magnitude lower than Bitcoin's, it was far easier to keep an accurate index of which addresses belonged to a given exchange, without relying on a bug in BitGo.
I kept a running tally of coins in/out per hour, 6 hours, and day. I could with fairly high certainty predict most large dumps 5-30 minutes before they occurred (delay before deposit is confirmed) and then immediately sell myself while placing a buy order for 5-10% lower depending on the size of the deposit.
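A rough sketch of what such a running tally could look like, for illustration only. It assumes each transaction has already been reduced to a single sender, a single receiver, and an integer amount, and that the set of exchange addresses is known; the class and method names are made up for this example.

```python
from collections import deque


class FlowTally:
    """Rolling net inflow to a set of known exchange addresses (hypothetical helper)."""

    def __init__(self, exchange_addresses, windows=(3600, 6 * 3600, 24 * 3600)):
        self.exchange_addresses = set(exchange_addresses)
        self.windows = windows  # window lengths in seconds: hour, 6 hours, day
        self.events = deque()   # (timestamp, signed amount), oldest first

    def record(self, timestamp, sender, receiver, amount):
        """Store a deposit as +amount and a withdrawal as -amount."""
        sender_is_exchange = sender in self.exchange_addresses
        receiver_is_exchange = receiver in self.exchange_addresses
        if receiver_is_exchange and not sender_is_exchange:
            self.events.append((timestamp, amount))
        elif sender_is_exchange and not receiver_is_exchange:
            self.events.append((timestamp, -amount))
        # Transfers that stay inside or outside the exchange are ignored.

    def net_flows(self, now):
        """Return {window_seconds: net coins deposited within that window}."""
        horizon = max(self.windows)
        # Drop events older than the longest window.
        while self.events and self.events[0][0] <= now - horizon:
            self.events.popleft()
        return {
            w: sum(amount for t, amount in self.events if t > now - w)
            for w in self.windows
        }


tally = FlowTally({"EXCHANGE_HOT_WALLET"})
tally.record(20000, "EXCHANGE_HOT_WALLET", "user_b", 30)  # withdrawal
tally.record(86000, "user_c", "EXCHANGE_HOT_WALLET", 50)  # deposit
flows = tally.net_flows(now=86400)
```

A sharp rise in the one-hour figure relative to the daily one would be the kind of signal described above.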
I also started tracking the exact deposit addresses being used - and more importantly, which ones get reused (a crazy high percent) - which let me correlate market behavior to an individual's actions. For example, address X just had a very large deposit. 6 blocks later, there is a market sell order placed for the exact amount deposited and this same behavior happens every time coins are deposited to the given address.
I can then basically add a trigger to automatically pre-empt the likely incoming dump with one of my own and then buy back for 5% less.
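To make the trigger idea concrete, here is a minimal sketch. It assumes you have already matched each past deposit to whatever sell order followed it; the class name, thresholds, and the "sold amount equals deposited amount" test are all illustrative assumptions, not a description of the setup actually used.

```python
from collections import defaultdict


class DepositPatternTracker:
    """Remembers, per deposit address, how often a deposit was followed by a matching sell."""

    def __init__(self, min_observations=3, min_dump_ratio=0.8):
        self.min_observations = min_observations  # deposits needed before trusting the pattern
        self.min_dump_ratio = min_dump_ratio      # fraction of deposits that must have been dumped
        self.history = defaultdict(list)          # address -> list of bool (dumped or not)

    def record_outcome(self, address, deposited, sold_after):
        """Log whether the amount sold after a deposit matched the deposit exactly."""
        self.history[address].append(sold_after == deposited)

    def should_preempt(self, address):
        """True if this address has a consistent enough deposit-then-dump record."""
        outcomes = self.history.get(address, [])
        if len(outcomes) < self.min_observations:
            return False
        return sum(outcomes) / len(outcomes) >= self.min_dump_ratio


tracker = DepositPatternTracker()
for amount in (5000, 12000, 8000):
    tracker.record_outcome("address_X", deposited=amount, sold_after=amount)

if tracker.should_preempt("address_X"):
    pass  # place the sell and the lower buy order here
```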
The biggest problem was that the market cap and transaction volume were just too small to make much money relative to the risk. However, if one could do this same analysis with bitcoin, it could be extremely profitable.
CTO of BitGo here. I wouldn't call this a bug, per se, but it's a known issue that we plan to fix. The BitGo API is agnostic about where the change output(s) are placed -- this is just an issue with the client-side SDK.
The primary reason we haven't changed it sooner is that BitGoD (which Bitstamp uses) currently relies on the change output being last to determine which output of a transaction is change when listing transactions. This was needed due to missing functionality in our back-end transaction indexer, which has been remedied in the last few weeks.
The other reason we don't consider it a huge deal is that, until there is much more adoption of multi-sig, it's still going to be relatively easy to determine which output is change (it's likely the only one that starts with a "3").
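As a sketch of that heuristic: mainnet P2SH (multi-sig) addresses start with "3", so if exactly one output of a transaction does, it can be guessed to be the change. The function name and the placeholder address strings are invented for the example.

```python
def guess_change_output(output_addresses):
    """Return the index of the only output address starting with "3", else None."""
    p2sh_indexes = [
        i for i, address in enumerate(output_addresses) if address.startswith("3")
    ]
    # The guess only works when exactly one output looks like multi-sig.
    return p2sh_indexes[0] if len(p2sh_indexes) == 1 else None


# Placeholder strings, not real addresses; only the first character matters here.
outputs = ["1PayeeAddressPlaceholder", "3ChangeAddressPlaceholder"]
change_index = guess_change_output(outputs)
```

Once more than one output is multi-sig, the function returns None, which is the point about wider adoption.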
> it's still going to be relatively easy to determine which output is change
I'm not sure that you understand what the impact is. It's revealing that a group of transactions all use the same software. Finding out which one is a change address isn't really important.
It appears that the bug is that the change address is always the last one, which defeats the purpose of having a change address. You can see that this is what their commit changed:
A change address is designed to enhance anonymity on the blockchain (to anyone who says this doesn't exist, prove it and tell me which bitcoin transactions I've done). With a change address, whenever Alice makes a payment from address X, during a single transaction part of the money goes to Bob's bitcoin address Y and the rest of the money goes to another address Z under Alice's control. The idea is that people should not be able to tell whether Alice controls Y or Z, so it should make it more difficult to know exactly how much money Alice and Bob ended up with.
Of course, if you always know that Alice's change address Z is last, this whole exercise is futile.
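One way to avoid a fixed position, sketched here as an assumption rather than as what the BitGo SDK or the linked commit actually does, is to insert the change output at a random index. The function name and the (address, amount) tuples are made up for illustration.

```python
import secrets


def place_change_randomly(payment_outputs, change_output):
    """Return a new output list with the change inserted at a random position."""
    outputs = list(payment_outputs)  # copy so the caller's list is not modified
    position = secrets.randbelow(len(outputs) + 1)  # 0..len inclusive, so last is still possible
    outputs.insert(position, change_output)
    return outputs


# Alice pays Bob at Y; the remainder goes back to her own address Z.
payment = [("Y", 70000)]
change = ("Z", 29000)
tx_outputs = place_change_randomly(payment, change)
```

With one payment and one change output, an observer looking only at ordering has a 50/50 guess instead of a certainty.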
>A change address is designed to enhance anonymity on the blockchain
Don't think that's actually true. A change address may help with anonymity, but the design of bitcoin (specifically, "outputs") means that change addresses must be used. I doubt the idea was specifically to help anonymity.
Yes, but that's if you want it all to go to one place. I don't see any specific design choice involving change addresses that seems specifically for anonymity.
Change addresses aren't part of the design at all. Multiple outputs are. By convention, an output to an address you control is called a change address, but it's just a convention on top of the protocol, not something the protocol is aware of.
And yes, their purpose is very explicitly extra anonymity. Gavin named a branch without change addresses a "noprivacy" branch:
>By convention, an output to an address you control is called a change address, but it's just a convention on top of the protocol, not something the protocol is aware of.
I'm aware of that, it's part of what I was trying to say.
>Gavin named a branch without change addresses a "noprivacy" branch:
That branch apparently just sends it to the sending address.
So, let me understand this. Blocktrail noticed a similar problem in their application "several months ago." But they didn't post anything about it then. Instead, they opted to fix it quietly for themselves and not share that knowledge with the community.
Then, an opportunity pops up to take a shot at a competitor and they jump on it. Sure, Blocktrail submitted a pull request 48 hours ago.
While I am OK with their not publishing this data, I would much prefer something along the lines of, "we will publish this data in 30 days once the waters have cooled down."
The data is in the blockchain. You can identify and fix the bug but this is still giving an advantage to the people who are going to compile this data for themselves.