You might well want to criticise the paper, but I think the video is a fair representation of the contents of the paper.
Regarding the number of holders, the paper says:
"While the original database has 896 million addresses, after we remove addresses in peeling chains we end up with 640 million addresses. Theses addresses belong to 189 million clusters, of which 116 million clusters are single-address clusters."
Not sure where the number of 68 million wallets comes from. Taking 189m as the number of holders, we'd have 0.005% of BTC holders holding 27% of all BTC. But even if we take that 68m "wallet" number and assume that each holder controls multiple wallets, say 10 on average, we'd have 6.8m holders, and then 0.15% of holders controlling about 27% of BTC. Still enormous concentration.
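The percentages above can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes the top group is roughly 10,000 holders, which is what the quoted percentages imply, and the names `TOP_HOLDERS` and `top_share_pct` are made up for illustration.

```python
# Assumption: the top group is about 10,000 holders (implied by the
# percentages quoted above, not stated directly in this comment).
TOP_HOLDERS = 10_000


def top_share_pct(total_holders: float) -> float:
    """Return the top group as a percentage of all holders."""
    return 100 * TOP_HOLDERS / total_holders


scenarios = {
    "189m clusters treated as holders": 189e6,
    "68m wallets, 10 wallets per holder": 68e6 / 10,
}

for label, holders in scenarios.items():
    print(f"{label}: {top_share_pct(holders):.4f}% of holders")
```

Under either assumption, the top group stays far below 1% of holders.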
More from the paper: "It is also important to note that this measurement of concentration most likely is an understatement since we cannot rule out that some of the largest addresses are controlled by the same entity. In particular, in the above calculations, we do not assign the ownership of early bitcoins, which are held in about 20,000 addresses, to one person (Satoshi Nakamoto) but consider them as belonging to 20,000 different individuals."
Another interesting part of the paper: "To the best of our knowledge, we have the most complete information about crypto entities that have been used in academic research up to this point. Our data cover 1,043 different entities. These include 393 exchanges, 86 gambling sites, 39 on-line wallets, 33 payment processors, 63 mining pools, 35 scammers, 227 ransomware attackers, 151 dark net market places and illegal services."
> You might well want to criticise the paper, but I think the video is a fair representation of the contents of the paper.
I do have criticisms of the paper, but those are milder. They are inherent limitations of the methodology, which the authors seem to recognize. The headline in this HN post is not remotely a fair representation of anything whatsoever in the paper.
> Not sure where the number of 68 million wallets comes from. Taking 189m as the number of holders, we'd have 0.005% of BTC holders holding 27% of all BTC. But even if we take that 68m "wallet" number and assume that each holder controls multiple wallets, say 10 on average, we'd have 6.8m holders, and then 0.15% of holders controlling about 27% of BTC. Still enormous concentration.
If you do the divisions I think it's pretty clear that the video divided 10k/68m and 5m/18m, which is completely illegitimate, considering the paper is discussing "clusters" of wallets.
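To make those two divisions concrete, here is a quick Python check. The variable names are mine, and reading the figures as "10k top holders over 68m wallets" and "5m BTC over 18m BTC" is my interpretation of what the video did, not something the paper states.

```python
# Figures as quoted in this thread; the labels are an interpretation.
top_holders = 10_000
wallets = 68_000_000
coins_held = 5_000_000
coins_total = 18_000_000

holder_pct = 100 * top_holders / wallets   # share of "wallets"
coin_pct = 100 * coins_held / coins_total  # share of coins

print(f"holders: {holder_pct:.4f}%  coins: {coin_pct:.1f}%")
```

The results round to the headline's 0.01% and 27%, which is why I think this is where the numbers came from.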
As for your analysis, I'm not really sure what manipulation you're doing there. There is no coherent way to adjust for the number of wallets owned by an individual, because you don't know the composition of that ownership. Unless you can identify who owns which groups of wallets, you can't back out wealth inequality here under the assumption of multiple wallet ownership. That's exactly why the paper attempts its clustering analysis: its purpose is to group wallets presumed to be owned by one person.
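A toy example of why the composition matters, with entirely made-up balances and owners: both groupings below have the same average of two wallets per owner, yet the largest owner's share differs. `top_owner_share` is a hypothetical helper, not anything from the paper.

```python
def top_owner_share(balances: dict, ownership: dict) -> float:
    """Fraction of all coins held by the single largest owner."""
    totals = {}
    for wallet, owner in ownership.items():
        totals[owner] = totals.get(owner, 0) + balances[wallet]
    return max(totals.values()) / sum(balances.values())


# Made-up wallet balances, for illustration only.
balances = {"w1": 50, "w2": 30, "w3": 10, "w4": 10}

# Two hypothetical ownership maps, each with two wallets per owner.
grouping_a = {"w1": "alice", "w2": "alice", "w3": "bob", "w4": "bob"}
grouping_b = {"w1": "alice", "w3": "alice", "w2": "bob", "w4": "bob"}

print(top_owner_share(balances, grouping_a))  # 0.8
print(top_owner_share(balances, grouping_b))  # 0.6
```

Knowing only "two wallets per owner on average" does not tell you which of these worlds you are in.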
> More from the paper: "It is also important to note that this measurement of concentration most likely is an understatement since we cannot rule out that some of the largest addresses are controlled by the same entity. In particular, in the above calculations, we do not assign the ownership of early bitcoins, which are held in about 20,000 addresses, to one person (Satoshi Nakamoto) but consider them as belonging to 20,000 different individuals."
This cuts both ways, though. It may be that a huge number of the low-balance wallets are duplicates controlled by the same individuals as well: people who programmatically created wallets, e.g. for the purpose of anonymization, and left tiny bits of dust in them. They aren't filtering by low balance in any way I can observe.
> Another interesting part of the paper: "To the best of our knowledge, we have the most complete information about crypto entities that have been used in academic research up to this point. Our data cover 1,043 different entities. These include 393 exchanges, 86 gambling sites, 39 on-line wallets, 33 payment processors, 63 mining pools, 35 scammers, 227 ransomware attackers, 151 dark net market places and illegal services."
I'm sure this is all true, and I'm sure it is the most complete academic dataset used to date. That doesn't really mean it is complete, though. Note that they enumerate an impressive list of entities, but we have no idea how complete their coverage of those entities' wallets is. And as someone who has done this kind of analysis, I do not trust academic econometricians just trying to publish their next paper to do a good job of it. Chainalysis is hard and inherently quite fuzzy. It's also an adversarial environment: intermediaries often intentionally try to mask their activity, as do individuals.
Let's look at the actual content of their methodology, though, in the paragraph directly above:
> To link address clusters to real entities we scrape cryptocurrency blogs and websites, such as Reddit, Blockchain.info, bitinfocharts.com, bitcointalk.org, walletexplorer.com, and Matbea.com for all publicly available addresses of prominent Bitcoin entities such as exchanges, payment processors, gambling sites, and others. We supplement this information with the state-of-the-art database of crypto entities from Bitfury Crystal Blockchain. Bitfury Crystal Blockchain is one of the leading providers of anti-moneylaundering tools and analytic solutions in the crypto space
Does this sound comprehensive to you? The only wildcard here is this "Bitfury" thing which I've certainly never heard of, but we can quickly look at their customers page to get a gauge of how serious they are:
I've never heard of anyone on that list either. There is a company that dominates this space, and it's https://www.chainalysis.com/. Compare Chainalysis's customer list to Bitfury's. Every name on there is a serious entity that you've probably heard of.
Why aren't they using data from Chainalysis? Or data from someone else reputable? We have no insight into what Bitfury's methodology is, so I can't explicitly critique it, but there are really good reasons to be very suspicious of its quality.
Finally, I note that they make no mention of mixers or contracts like WBTC, which can hold enormous quantities of coins and can also create tons of fragmentation in the address space. That is very likely to significantly frustrate any clustering-based methodology or naive chainalysis attempt.
> Does this sound comprehensive to you? The only wildcard here is this "Bitfury" thing which I've certainly never heard of
If you operate in this industry and haven’t heard of Bitfury, you may have less of a grasp of the landscape than you think. It was created by some of the main insiders and has employed many former government officials. I would have a rethink of my authoritative stance if I were you.
From everything I know about crypto, the linked paper checks out. A lot of people discuss these issues — I think Vitalik recently lamented that cryptocurrencies seem to devolve into feudal dollars.
> If you operate in this industry and haven’t heard of Bitfury, you may have less of a grasp of the landscape than you think. It was created by some of the main insiders and has employed many former government officials. I would have a rethink of my authoritative stance if I were you.
Employing government officials is not an index of seriousness. I should have been clearer: I've heard of Bitfury in the context of mining and ASICs, but I was unaware they had started a chainalysis product, and I don't take their efforts there seriously. I think the advertised customer list for this product speaks for itself.
> From everything I know about crypto, the linked paper checks out. A lot of people discuss these issues — I think Vitalik recently lamented that cryptocurrencies seem to devolve into feudal dollars.
I'm not denying that the basic idea is correct: that ownership inequality is high. It's just not as high as this headline indicates.
Remember, the paper does not assert this 0.01% vs 27% stat anywhere. It is created by combining the paper's data on wallet clusters with crypto.com's estimate of the number of bitcoin owners. If you read the crypto.com whitepaper on how they computed this, they note that a limitation of their analysis is that they cannot count people who bought Bitcoin at one time and have since sold it. That means that if you accept everything else about their analysis (which I don't), the denominator they're using counts literally everyone who has ever owned any non-zero fraction of a Bitcoin, ever.