Mapping Bitcoin Addresses to IPs: Hard, but possible
We are excited to announce that our research paper has been accepted for publication at the Financial Crypto 2014 conference. It is available here: An Analysis of Anonymity in Bitcoin Using P2P Network Traffic
We wanted to determine if real-time transaction traffic received from directly connected peers can alone be used to create Bitcoin address-to-IP mappings. Although previous work has analyzed the degree of anonymity Bitcoin offers using clustering and flow analysis, none have demonstrated the ability to map Bitcoin addresses directly to IP data. The ability to create such mappings is important since there have been cases where individuals participating in P2P networks have been identified by law enforcement after their ISPs had been subpoenaed.
Although numerous Bitcoin clients exist, none of them were specialized for data collection. Available clients often need to balance receiving and spending bitcoins, vetting and rejecting invalid transactions, maintaining a user’s wallet, mining bitcoins, and, perhaps most detrimental to our study, disconnecting from “poorly-behaving” peers; these were precisely the peers we were interested in. Because existing software had integrated functionality that interfered with our goals, we decided to build our own minimal Bitcoin client called CoinSeer, which was a lean tool designed exclusively for data collection.
Bitcoin propagates transactions through a gossip protocol. To increase the likelihood of receiving transactions directly from their creators, CoinSeer created an outbound connection to every listening peer whose IP address was advertised on the Bitcoin network. Collection ran for about five months, from July 24, 2012 to January 2, 2013. We collected all data relayed on the network, along with the IP address of the peer that relayed it, and stored it for offline processing. This approach was inspired by a technique proposed by Dan Kaminsky at the 2011 Black Hat conference.
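To make the idea concrete, here is a minimal sketch of the kind of offline pass one could run over such a log. It is not CoinSeer's code or the analysis from the paper. The `RelayEvent` record, its field names, and both helper functions are hypothetical, and the "earliest relayer" rule is only the naive intuition behind connecting to every peer.

```python
from collections import defaultdict
from typing import NamedTuple


class RelayEvent(NamedTuple):
    """One logged observation (hypothetical record layout)."""
    tx_id: str          # identifier of the relayed transaction
    peer_ip: str        # IP of the directly connected peer that relayed it
    received_at: float  # collector-side receive time, in seconds


def first_relayers(events):
    """Return {tx_id: peer_ip} for the earliest observed relayer of each transaction."""
    earliest = {}
    for ev in events:
        best = earliest.get(ev.tx_id)
        if best is None or ev.received_at < best.received_at:
            earliest[ev.tx_id] = ev
    return {tx: ev.peer_ip for tx, ev in earliest.items()}


def relayer_counts(events):
    """Return {tx_id: number of distinct peers that relayed the transaction}."""
    peers = defaultdict(set)
    for ev in events:
        peers[ev.tx_id].add(ev.peer_ip)
    return {tx: len(ips) for tx, ips in peers.items()}
```

The earliest relayer is only a candidate for the creator, since gossip can deliver a transaction through an intermediary first. A count of distinct relayers per transaction is one simple way to start looking for transactions that did not spread the way ordinary ones do.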
In the paper, we discuss the circumstances that allowed us to map nearly 1,000 Bitcoin addresses to their likely owner IPs by leveraging P2P relaying behavior. Although normal relaying behavior proved very difficult to deanonymize, we discuss how certain anomalous relay patterns became highly useful in our study.
If you take the proper precautions (e.g., using Tor, eWallets, or mixing services), you remain well protected against our approach. Even without any special precautions, 91.4% of all traffic was not amenable to analysis.