[dupe] Not a protocol fault: MtGox and transaction malleability (oleganza.com)
146 points by oleganza on Feb 10, 2014 | 69 comments


"It's a protocol bug!" "It's not a protocol bug!" is sort of a moot point.

Bitcoin is a protocol in the sense that the IE6 rendering engine is a protocol: yes, rigorously examining what the client monoculture does and does not emit does allow you to describe some set-of-rules. Trying to create another Bitcoin client which attempts to agree with that set of rules is, ahem, fraught. You have to be bug-for-bug compatible with the Satoshi client implementation, not with the Satoshi client's declared behavior, design intent, or whatever constellation of PDFs and Wiki posts that the Bitcoin community anoints as "the protocol."


Satoshi never had any "whitepaper spec" which proclaims "intended" behaviour. The protocol is whatever 100% of users agree with. It started with one implementation; today we have many of them. And we have already had the protocol changed in a hard way ("hardfork") at least 4 times:

1. In 2009 when they fixed the OP_RETURN and OP_CODESEPARATOR bug that allowed spending anyone's coins.

2. In 2010 when an integer overflow created billions of BTC in one tx.

3. In 2012 when "OP_HASH160 <hash> OP_EQUAL" script was redefined to be interpreted as a hash of a script (see BIP16, Pay-to-Script-Hash aka "P2SH").

4. In 2013 when a v0.8/v0.7 difference in database engines caused half of the network to mine a parallel blockchain due to an obscure limit on the number of file handles. The fix, which was gradually introduced after reverting the "incompatible" (although longer) chain, changed the protocol in a hard way.

The takeaway from this: there's no parallel universe where a clearly defined spec in English is how all software operates to produce exactly the same blockchain. The most precise description would be the computer code itself running on the same hardware everyone uses. Anyone coming up with a "spec" of Bitcoin will only have a "description", not a "prescription".
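To make item 2 in the list above concrete, here is a minimal sketch of how a signed 64-bit sum can wrap around so that an absurd set of outputs passes a naive "outputs must not exceed inputs" check. This is not the actual bitcoind code; the amounts, the helper names (`wrap_int64`, `naive_total`, `checked_total`) and the exact form of the check are assumptions made for illustration only.

```python
# Toy model of C++-style int64_t arithmetic on two's-complement hardware.
# All amounts are in satoshis. Names and values here are hypothetical.
MAX_MONEY = 21_000_000 * 100_000_000  # 21 million BTC expressed in satoshis


def wrap_int64(value: int) -> int:
    """Reduce an arbitrary Python int to the signed 64-bit range."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


def naive_total(outputs):
    """Sum outputs with wraparound and no per-output range check."""
    total = 0
    for amount in outputs:
        total = wrap_int64(total + amount)
    return total


def checked_total(outputs):
    """Sum outputs, rejecting any amount or running total out of range."""
    total = 0
    for amount in outputs:
        if not 0 <= amount <= MAX_MONEY:
            raise ValueError("output amount out of range")
        total += amount
        if total > MAX_MONEY:
            raise ValueError("total out of range")
    return total


# Two enormous outputs (illustrative values) funded by a single 50 BTC input.
input_value = 50 * 100_000_000
huge_outputs = [2**63 - 500_000, 2**63 - 500_000]

wrapped = naive_total(huge_outputs)            # the sum wraps to a negative number
naive_check_passes = wrapped <= input_value    # so the naive comparison succeeds
```

The range-checked version refuses the same outputs, which is the general shape of the fix: validate each amount and the running total rather than trusting the wrapped sum.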


In 2013 when a v0.8/v0.7 difference in database engines caused half of the network to mine a parallel blockchain due to an obscure limit on the number of file handles.

Can I just highlight, for generic web developers who might not parse this sentence, that this is like saying "One of the ways we make sure all clients agree on HTTP is by making sure that all clients and servers stay on the same point release of Oracle DB because when they didn't for a few minutes the entire Internet died in fire. We put the fire out by reverting the Oracle DB version on a few big sites, telling everyone else to downgrade as well, and posting on an out of band channel that if you were doing anything important with the Internet you might want to stop for a few hours because there were going to be two separate and very mutually incompatible Internets and one of them was going to suddenly vanish a couple hours later and it sure would suck if your mail/web browsing/Dropbox/etc went into that one."

Seriously. That actually happened.


That's because in HTTP, only the client and server involved in the transaction need to agree on what's going on. But in bitcoin, every client in the network needs to be able to verify your transaction, so they all need to agree on what transactions are valid.


That's not really what happened. The majority of Bitcoin clients need to agree whether or not to accept a block. In 0.7, a limitation of the database library was causing a very large block to be rejected, whereas in 0.8 the block was correctly accepted. This caused a blockchain split, with the 0.7 clients going one way, and the 0.8 clients going the other.
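A toy sketch of the split described above, under heavy simplification: two node versions apply the same rules except that one has a storage-layer limit the other lacks. The `Block` and `Node` classes and the `resource_cost` field are hypothetical stand-ins, not anything from the real client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Block:
    height: int
    resource_cost: int  # hypothetical stand-in for whatever the storage layer limits


class Node:
    def __init__(self, name: str, resource_limit: Optional[int]):
        self.name = name
        self.resource_limit = resource_limit  # None means "no limit"
        self.chain = []

    def accepts(self, block: Block) -> bool:
        """A block is valid for this node if it fits under the node's limit."""
        return self.resource_limit is None or block.resource_cost <= self.resource_limit

    def receive(self, block: Block) -> bool:
        """Extend the chain only with a valid block that builds on the current tip."""
        if block.height == len(self.chain) and self.accepts(block):
            self.chain.append(block)
            return True
        return False


old_node = Node("0.7-like", resource_limit=10_000)
new_node = Node("0.8-like", resource_limit=None)

# The second block is "very large"; the old node rejects it and everything after it.
for block in [Block(0, 100), Block(1, 50_000), Block(2, 100)]:
    old_node.receive(block)
    new_node.receive(block)
```

After this loop the two nodes disagree about the chain tip even though neither received different data, which is the essence of the incident: an implementation detail silently became a consensus rule.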

This sort of problem doesn't really occur on the web, so trying to draw an analogy to HTTP isn't going to help people understand the issue.


The point of the post is to make web developers say "But wait, Patrick, if HTTP actually worked like that that would be batshit insane. I mean, separation of concerns. Having to coordinate patch schedules worldwide across billions of clients. Being to-death-do-us-part committed to every library choice made by the first HTTP client. Wait, there could never be a second HTTP client, because you'd never be sure it worked warts-and-all like the first one. You'd probably have to do something like extract the One True HTTP Client into a wrapper program and, I don't know, call out to it from your web browser, so that changes in web browser chrome weren't tightly coupled to the One True HTTP client. Developers on the One True HTTP Client would probably refuse to help people who made competing HTTP clients, because they'd be building potential existential threats to the Internet, and they'd probably gloat when those other developers suffered misfortune.

Please tell me, Patrick, that you're exaggerating and that Bitcoin doesn't work like this."

Guess what?


HTTP isn't trying to maintain a consistent, decentralised database. Bitcoin is. HTTP doesn't have a problem maintaining data consistency because that's outside the scope of the protocol.

You're suggesting the Bitcoin protocol is badly designed because it's had problems that HTTP hasn't had. But this is like saying aeroplanes are poorly designed compared to cars because the latter don't fall out of the sky when their engines die. Planes have to deal with problems cars don't, not because of poor design, but because of the nature of what they're made to do.

Bitcoin concerns itself with providing data integrity across a distributed system. The only way of doing this, without a centralised server, is to ensure that all the clients are performing the same calculations. If it turns out a significant proportion of clients are performing the wrong calculation, then you have a problem. This isn't a problem inherent to Bitcoin; it's a problem with any distributed system trying to maintain a consistent database.


Patrick is a very smart guy, which is part of the reason I (like you) am so confused as to why he thinks HTTP makes for a good analogy.


I'm pretty sure that by HTTP client he meant rendering engines. And in this analogy the data integrity is "whatever crap I have to send to the client to get the expected output on all client machines". So it's somewhat fair to say that if you have wildly divergent interpretations of HTML rendering then you have a data integrity problem, since you can't guarantee that the same data will be interpreted the same way by all clients. None of that has to do with HTTP the protocol though... so I'm confused as to why HTTP the protocol was mentioned instead of HTML and rendering engines.


I can't tell if you meant for this statement to be as insulting to Patrick as it seems at first glance. If you did, I will point out that an alternative outlook is to realize that as Patrick is a very smart guy you may have simply misunderstood, misinterpreted, or misapplied his analogy.


I wonder if anyone with some experience with bitcoind's source code could explain why the network layer and the data storage layer are not (or cannot be) sufficiently abstracted to prevent some low-level data storage implementation detail like this from affecting the correctness of the propagation algorithm's behavior. This seems like an architectural flaw, but maybe there's something inherent to the bitcoin protocol that makes this type of incident inevitable?


Early days of exciting new tech are early. And, yes, the blockchain compatibility requirement is a new phenomenon, unlike HTTP or even TCP/IP, so your comparison is not very relevant (although useful to highlight possible difficulties). We have to get used to it and learn how to deal with the risk of unintentional forks, or stay away from the game.


[deleted]


With the exception of the integer overflow, none of these problems seem directly related to the language being used (and integer overflows are a problem in many languages besides C).


"v1.0" is a meaningless label. Bitcoin already works and cannot be changed willy-nilly in a glorious self-proclaimed version 1.0 of a single particular implementation.


bitcoind is C++.


> The protocol is whatever 100% of users agree with.

This is not correct, it is what half-plus-one of the users agree to. 100% is not a requirement.


The protocol is whatever 51% of users agree with.


It's substantially more complicated than that.

One: I think you understand "user" as mapping to hashpower-over-some-time-interval, but vanishingly few users of Bitcoin understand user to mean that. Two: if you control less than 50% of the hashpower over some time interval, you can with a certain probability and cost alter some rules of the protocol, and the consensus will back you. For example, there exists no rule amenable to 51% of Bitcoin hashpower currently that says "Transactions to addresses known to be controlled by Patio11 have a special property: they are not allowed on Tuesdays" but you can, in fact, impose that rule on the network without requiring 50% of the hashpower available on Tuesday. (It would be a pretty expensive thought experiment, depending on how certain you would want to be that the rule get adopted.)
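One standard way to put numbers on "with a certain probability" is the gambler's-ruin estimate from the Bitcoin whitepaper: a party with hashpower share q that is z blocks behind eventually catches up with probability (q/p)^z, where p = 1 - q. This is a sketch of that formula only; it is not necessarily the mechanism the comment above has in mind, and the function name is made up for illustration.

```python
def catch_up_probability(q: float, z: int) -> float:
    """Chance that a minority with hashpower share q ever erases a z-block deficit.

    Uses the simple random-walk model: certain if q >= 1 - q,
    otherwise (q / (1 - q)) ** z.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be a fraction between 0 and 1")
    if z < 0:
        raise ValueError("z must be non-negative")
    p = 1.0 - q
    if q >= p:
        return 1.0
    return (q / p) ** z


# How the odds fall off with depth for a few hashpower shares.
table = {q: [catch_up_probability(q, z) for z in (1, 2, 6)] for q in (0.10, 0.25, 0.40)}
```

The model says nothing about cost, which the comment rightly stresses: every failed attempt burns real mining revenue.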


No. If 51% of users fork off the blockchain for themselves, their forked coins will lose at least 49% of liquidity and value.


There is no concrete relationship between the number of users and the value implied by some portion of those users.

(Ridiculous thought experiment: One user controls 75% of bitcoin and will buy and sell them only for $5. Given enough dollars desiring bitcoin he would quickly be reduced to 0% control, but in the meantime everyone else would have some difficulty selling for much more than $5.)


You are right. What I mean is, if half of the coins go with a forked blockchain, the other half will not be able to participate on it. So for a guy who is about to receive a payment in a forked coin, the market would be roughly 2x less liquid. It's the same with altcoins: if I pay you in Litecoin, you'll get something 11x less liquid than BTC. Maybe it has the right price at the moment and is fine by you, but as it's less liquid, it's less certain to hold its value over a longer period of time.


Yeah, my objection was to the precision that your wording implied.

(realistically, I think the outcome of an unpatched fork is the collapse of bitcoin and people becoming very wary of anything claiming bitcoin in its genealogy (or even inviting close comparison to bitcoin))


No, an accidental fork will be reverted and people will go on with the previously working software. Bitcoin can collapse only when it's non-recoverable (e.g. ECDSA or SHA256 is completely broken). If it is recoverable (like in all cases mentioned above), then there will be just a short period of turbulence.


I think there is some resiliency against glitch forks, but I don't think it is obvious that the only outcome is 'just a short period of turbulence'.

(mostly because there are lots of humans involved and they will act like humans)


It's not only a moot point, it looks like intentional FUD to distract from the main issue: Mt. Gox isn't filling withdrawal requests.

Mt. Gox could fix the issue themselves without a change to the protocol by tracking the entire transaction, not just the hash.

If they don't do this, then we must ask why not? Could it be that they've lost so much Bitcoin to double-withdraws that they can't fill new withdrawal requests?

Would Mt. Gox benefit from a steep decline in the price of Bitcoin, so they can fill the gap at a lower cost?

What's a logical explanation of their behavior?


Utter nonsense:

> If you need a quick answer: there’s no bug in the Bitcoin itself.

> Is it a design issue in Bitcoin to allow slight changes in the transactions? Yes, probably is.


It's only an "issue" to the extent that someone who is not familiar with the protocol might be misled by it. After 4 years running an exchange, one would hope that Gox would have someone familiar enough with the protocol to avoid this, but I guess not.

This page has been there for at least 1 year: https://en.bitcoin.it/wiki/Transaction_Malleability


This is a known design "issue", but it's a part of the protocol and thus not a bug which must be fixed. Bitcoin has many odd facets and all of them must be understood by anyone making bitcoin software.


A design issue is not a bug, is it? This design issue seems to have led to the bug in the Mt Gox implementation though.


That's like saying "it's only illegal if you get caught".


It does sound like a bug in Bitcoin, and it's one that people have to specifically work around. It does not sound deadly, but nor does it sound ideal. And it does seem likely that someone figured out that Gox was doing what they were doing and exploited it, the question being to what extent.

But it is interesting evaluating the commentary given that people who have a vested interest in Bitcoin will naturally have a very strong reason to defend the protocol and fundamentals (the linked submission even encourages readers to go buy into the hype at a "huge discount"). This is something you see on equity trading boards, the guy deep underwater on the penny stock loudly defending all fundamentals of his choice.


Interesting comment from the lead dev of Ethereum (a new cryptocurrency):

> aside from cryptocurrencies, there really is no other situation where the fact that you can take a valid signature and turn it into another valid signature with a different hash is a significant problem, and yet here it’s fatal.

http://blog.ethereum.org/2014/02/09/why-not-just-use-x-an-in...


"Or, MtGox themselves see that they’ve been watching for transaction for too long and could automatically re-send another transaction"

There is no amount of time to wait until a bitcoin transaction is considered dead. Once it's been communicated over the network, one has to assume it can be redeemed (incorporated into the blockchain) anywhere from never to a thousand years from now. Checking for txid even without malleable transactions is not helpful.

If a bitcoin payment has been issued, and a customer wishes it to be re-issued, the (only?) safe approach is to move the entire address contents to a new address, wait for that to hit the blockchain, and reissue the payment from the new address.
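The reasoning behind that approach can be sketched with a toy unspent-output set: once the inputs of the original payment have been spent by a confirmed "sweep" transaction, the original can never confirm, so reissuing is safe. `ToyLedger` and the `"txid:index"` outpoint strings are hypothetical simplifications, not a real wallet API.

```python
class ToyLedger:
    """Tracks which outpoints are unspent; a transaction confirms only if all its inputs are."""

    def __init__(self, unspent):
        self.unspent = set(unspent)
        self.confirmed = []

    def confirm(self, name, inputs, new_outpoints):
        inputs = set(inputs)
        if not inputs <= self.unspent:
            return False  # some input is already spent: this would be a double spend
        self.unspent -= inputs
        self.unspent |= set(new_outpoints)
        self.confirmed.append(name)
        return True


ledger = ToyLedger({"a:0", "b:0"})

# The original payment spends a:0 but is stuck unconfirmed, so nothing is applied yet.
original_inputs = {"a:0"}

# Step 1: sweep everything at the old address to a fresh one and wait for it to confirm.
swept = ledger.confirm("sweep", {"a:0", "b:0"}, {"sweep:0"})

# Step 2: the stale original can no longer confirm, however long it lingers on the network.
original_ok = ledger.confirm("original", original_inputs, {"original:0"})

# Step 3: reissue the payment from the swept funds.
reissued = ledger.confirm("reissue", {"sweep:0"}, {"reissue:0", "reissue:1"})
```

Note that this relies only on which outputs are spent, not on any transaction hash, so it is unaffected by malleability.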

When different customers' balances are mixed in the same address, things get more difficult, and seeing as how it's customary to pull balances from multiple addresses to satisfy the inputs of a single transaction, I think mtgox has a claim to complexity beyond their ability to fix without help from the protocol itself.

As far as being affected by malleable transactions, an exchange should have a value on their 'books' for each customer and a value in the 'bank'/wallet for each customer. By keeping track of each set of addresses for each customer, like bitcoind does in its wallet, the two can be audited with each new block on the chain.


How on earth can this 'fix' scale?

Let's imagine a future world where all monetary transactions go through Bitcoin. So the only way for me to find out if my transaction took place is to search the list of every worldly transaction to spot mine?


If you're running a full node - basically, Bitcoin-Qt or bitcoind - your client goes over every single transaction and verifies it as it is. In Bitcoin-Qt/bitcoind, the client does not depend on transaction hashes to verify that a transaction took place, instead depending on the input/output lists. Any competently written "light" client does the same.

Obelisk, written by the libbitcoin people, provides an address/transaction tracking system over the network that could be used if bitcoind does not meet requirements.

Basically, this fix has been in place for a long time everywhere aside from MtGox.


Don't you already have to do that? You have to download the entire blockchain to see if a transaction happened. The fix is just to look for a different attribute on the data you're already looking at.


Precisely. If you're offering any type of merchant services (and Gox does), then you already have to monitor all incoming transactions. All this means is that you have to keep a small list of the addresses of people with pending withdrawals, along with the corresponding Gox outputs, and then remove those addresses from the list as each block rolls in (at however many confirmations they determine they need).
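A minimal sketch of that bookkeeping, assuming a simplified block format: each pending withdrawal records the destination address, the amount, and the exchange outputs that funded it, and a withdrawal is cleared when a block contains a transaction matching all three, whatever its txid. The `Withdrawal` class, the `settle` function and the dictionary layout of a transaction are assumptions for illustration, and confirmation depth is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Withdrawal:
    address: str
    amount: int          # satoshis
    funding: frozenset   # outpoints the exchange spent, e.g. {"txid:0"}


def settle(pending, block_txs):
    """Return the withdrawals that are still pending after scanning one block."""
    remaining = []
    for w in pending:
        paid = any(
            w.funding == frozenset(tx["inputs"])
            and (w.address, w.amount) in tx["outputs"]
            for tx in block_txs
        )
        if not paid:
            remaining.append(w)
    return remaining


pending = [Withdrawal("addrX", 5000, frozenset({"y:0", "z:1"}))]

# The mined transaction has a different txid than the one the exchange broadcast,
# but it spends the same outputs and pays the same address the same amount.
block = [{"txid": "mutated", "inputs": ["z:1", "y:0"],
          "outputs": [("addrX", 5000), ("change", 100)]}]

still_pending = settle(pending, block)
```

Because the match never looks at `txid`, a mutated copy of the withdrawal clears the pending entry just as the original would.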


A service could watch addresses for you.

(blockchain.info essentially does this for every address appearing in the blockchain)

But yeah, bitcoin as implemented is way too complicated to want to deal with all the time, every day.


"MtGox remembers a hash of that transaction (unique fingerprint of its contents)"

Technically, there might be collisions with hashes, they're in no way unique... Though the probability is rather low.

Anyways, this is a nice explanation of the real issue.


For all practical purposes it's unique. Otherwise, Bitcoin is broken and useless.
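To give "rather low" a rough magnitude, here is a sketch of the usual birthday-style upper bound for accidental collisions, assuming the hash behaves like a uniformly random 256-bit value. It says nothing about deliberate attacks on the hash function; the function name is made up for illustration.

```python
def collision_upper_bound(n: int, bits: int = 256) -> float:
    """Union bound on the chance that any two of n random `bits`-bit hashes collide."""
    if n < 0:
        raise ValueError("n must be non-negative")
    pairs = n * (n - 1) // 2          # number of distinct pairs of hashes
    return min(1.0, pairs / 2**bits)  # each pair collides with probability 2**-bits


# A trillion transactions is far more than the network had produced by 2014.
bound_for_trillion = collision_upper_bound(10**12)
```

Under that assumption, even a trillion transactions leave the bound dozens of orders of magnitude below anything worth worrying about, which is what "unique for all practical purposes" amounts to.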


> they instead should watch if the address X (specified by user) got amount N (specified by user) from outputs Y, Z and W (used by MtGox).

This doesn't seem ideal.


Why? In order to be running an exchange like Gox, you have to be monitoring arbitrary addresses anyway.

It's a simple solution to a simple problem, and it's pretty telling that Gox had this problem to begin with (but doesn't surprise me at all).


Because if this mode of transfer ever becomes more than a toy, then there will be massive overloading of those values.

Can you come up with circumstances in which someone might choose to make successive transactions of the same value? Especially if that might confuse the issue?

You want to be able to tell if _this_ transaction completed, not if, say "A transaction that really looks like this one" completed.


No. "successive transactions of the same value [between the same people]" will not have the same transaction inputs. A 'transaction input' in bitcoin isn't the originator's pubkey, it's the output of a previous transaction. A second transaction which literally had the same inputs as a first would be a doublespend attempt.


Two transactions with the same inputs and the same outputs can safely be considered the same transaction. You don't just compare the transaction value.


I've been through the other thread of another article declaring mt.gox is wrong, but I still don't see the solution.

What is the solution recommended by bitcoin implementers to verify that a transaction succeeded, given that transaction malleability exists?


I am not a bitcoin implementer, but I believe you need to maintain the list of pending withdrawal addresses, and monitor that the transactions with mtgox outputs sent to those addresses received at least 6 confirmations.


Where would those 6 confirmations come from?


From the network. Or, at least as I understand it, Transaction Malleability basically means that the "final" transaction can have a different txid, so you shouldn't rely on it. But you can check if the transaction went through the same way bitcoin-qt does it, which is what BTC developer Gregory Maxwell recommends here:

http://sourceforge.net/mailarchive/message.php?msg_id=319565...
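For readers unsure what "6 confirmations" means in practice: a transaction has one confirmation when it is included in a block on the best chain, and gains one more for each block built on top. A minimal sketch of that count, with hypothetical function names and heights supplied by the caller:

```python
from typing import Optional


def confirmations(tx_height: Optional[int], tip_height: int) -> int:
    """Number of confirmations for a transaction mined at tx_height.

    tx_height is None while the transaction is still unconfirmed.
    """
    if tx_height is None:
        return 0
    if tx_height > tip_height:
        raise ValueError("transaction cannot be above the chain tip")
    return tip_height - tx_height + 1


def is_settled(tx_height: Optional[int], tip_height: int, required: int = 6) -> bool:
    """True once the transaction is buried under at least `required` confirmations."""
    return confirmations(tx_height, tip_height) >= required
```

For example, a withdrawal mined at height 100 counts as settled under the 6-confirmation rule once the tip reaches height 105.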


On a side note, if you believe in BTC, I cannot remember the last time you could buy it for less than $600 like today.


You mean like 2 months ago?


Had to look at the exchange charts - I guess there was a two day dip around December 18th but it quickly corrected.


"corrected"

I love this teleological market-speak.


Ha, yeah I watch NBR before the PBS Newshour some days so unfortunately I've picked up on some of the terminology. I jokingly refer to NBR as PBS's dark little corner.


The BTC believers have very short memories.


BTC believers know there was never a year without at least a 200% price increase.


Hard to tell whether that was someone fucking up their sell order or someone dumping a whole lot of (ill-gotten?) bitcoins onto BTC-e.


If you had money in Gox and in other exchanges you might dump your other money now in fear of having lost Gox's funds forever.


Apparently someone sold 3000 BTC @ $102 which seems very wrong.

http://i.imgur.com/ScHaDyl.png

How many coins was the US government holding? It would have to be someone not paying attention or not their money.


That explanation is so simple it just can't be right. :) Usually if two objects are equal their hashes should be identical too. That's the whole point of hashing since otherwise O(1) lookups aren't possible. Apparently the bitcoin protocol breaks that expectation since two semantically equal transactions can have different hashes.


It doesn't break that expectation; it's just a misunderstanding of what data is included in the hash.


The main and most important use of transaction hashes in Bitcoin is to ensure that, when blocks are mined, the transactions contained within those blocks are locked in and any modification of those transactions can be detected and rejected. Transaction malleability is meant to change the hash of the transaction, because that ensures no-one can mutate transactions after they're in a block. If you really need to track transactions in the way MtGox is doing, you should construct your own transaction ID that hashes everything except the scriptSigs (which are malleable).
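A sketch of the "construct your own transaction ID" idea, using a deliberately simplified serialization that is NOT the real Bitcoin wire format. The dictionary layout and the names `full_id` and `normalized_id` are assumptions for illustration; only the principle (hash everything except the scriptSigs) comes from the comment above.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _serialize(tx, include_script_sig: bool) -> bytes:
    """Toy serialization: counts, then inputs, then outputs, with length-prefixed scripts."""
    parts = [struct.pack("<I", len(tx["inputs"]))]
    for prev_txid, index, script_sig in tx["inputs"]:
        parts.append(bytes.fromhex(prev_txid))
        parts.append(struct.pack("<I", index))
        if include_script_sig:
            parts.append(struct.pack("<I", len(script_sig)) + script_sig)
    parts.append(struct.pack("<I", len(tx["outputs"])))
    for amount, script_pubkey in tx["outputs"]:
        parts.append(struct.pack("<q", amount))
        parts.append(struct.pack("<I", len(script_pubkey)) + script_pubkey)
    return b"".join(parts)


def full_id(tx) -> str:
    """Hash over everything, signatures included (changes if a scriptSig is re-encoded)."""
    return double_sha256(_serialize(tx, include_script_sig=True)).hex()


def normalized_id(tx) -> str:
    """Hash over inputs' outpoints and all outputs only (stable under scriptSig changes)."""
    return double_sha256(_serialize(tx, include_script_sig=False)).hex()


# Two copies of the same payment whose signature bytes differ (placeholder bytes).
tx_a = {"inputs": [("aa" * 32, 0, b"\x01sig-encoding-1")],
        "outputs": [(5000, b"pay-to-customer")]}
tx_b = {"inputs": [("aa" * 32, 0, b"\x02sig-encoding-2")],
        "outputs": [(5000, b"pay-to-customer")]}
```

Here `full_id(tx_a)` and `full_id(tx_b)` differ while `normalized_id` is identical for both, so an exchange keyed on the normalized value would recognise the mutated copy as the same withdrawal.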


> Transaction malleability is meant to change the hash of the transaction, because that ensures no-one can mutate transactions after they're in a block.

Huh?

> If you really need to track transactions in the way MtGox is doing, you should construct your own transaction ID that hashes everything except the scriptSigs (which are malleable).

Yeah. Except the Bitcoin network also apparently produces its own hashes of transactions and those hashes include mutable data leading to undesirable situations like hash(txn1) != hash(txn2) even though txn1 == txn2. It's like hashing people by their name and age instead of name and date of birth. Next year you won't find them in your table because their age is different.
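The name-and-age analogy maps directly onto the equality/hash contract in most languages: if two objects compare equal, their hashes must match, or hash-table lookups silently fail. A small Python sketch of the analogy follows; the `Person` classes are purely illustrative and have nothing to do with Bitcoin data structures.

```python
class Person:
    """Two records denote the same person if name and birth year match."""

    def __init__(self, name: str, birth_year: int, age: int):
        self.name = name
        self.birth_year = birth_year
        self.age = age  # mutable detail, like a scriptSig encoding

    def __eq__(self, other):
        return (isinstance(other, Person)
                and (self.name, self.birth_year) == (other.name, other.birth_year))


class PersonHashedByAge(Person):
    # Breaks the contract: the hash depends on a field that equality ignores.
    def __hash__(self):
        return hash((self.name, self.age))


class PersonHashedByBirthYear(Person):
    # Keeps the contract: the hash uses exactly the fields equality uses.
    def __hash__(self):
        return hash((self.name, self.birth_year))


this_year = PersonHashedByBirthYear("Ann", 1980, 33)
next_year = PersonHashedByBirthYear("Ann", 1980, 34)
table = {this_year: "account row"}
found_next_year = next_year in table  # lookup still works a year later

bad_this_year = PersonHashedByAge("Ann", 1980, 33)
bad_next_year = PersonHashedByAge("Ann", 1980, 34)
same_person = bad_this_year == bad_next_year  # equal, yet hashed on different data
```

With the age-based hash the two records are equal but will almost certainly land in different buckets, which is the situation the comment describes for txids that include mutable signature data.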


Can someone link to a better explanation? I'm not really satisfied by this blog post.




What Mount?


Mount Gox is a malapropism of the acronym MTGOX (which is expanded in other comments on this thread)


Magic: The Gathering Online Exchange



