  • philipkoshy 9:43 pm on January 20, 2014 Permalink | Reply  

    Mapping Bitcoin Addresses to IPs: Hard, but possible 

    We are excited to announce that our research paper has been accepted for publication at the Financial Cryptography 2014 conference. It is available here: An Analysis of Anonymity in Bitcoin Using P2P Network Traffic


    We wanted to determine whether real-time transaction traffic received from directly connected peers can, on its own, be used to create Bitcoin address-to-IP mappings. Although previous work has analyzed the degree of anonymity Bitcoin offers using clustering and flow analysis, none has demonstrated the ability to map Bitcoin addresses directly to IP data. The ability to create such mappings is important since there have been cases where individuals participating in P2P networks have been identified by law enforcement after their ISPs had been subpoenaed.

    Our Approach

    Although numerous Bitcoin clients exist, none of them were specialized for data collection. Available clients often need to balance receiving and spending bitcoins, vetting and rejecting invalid transactions, maintaining a user’s wallet, mining bitcoins, and, perhaps most detrimental to our study, disconnecting from “poorly-behaving” peers; these were precisely the peers we were interested in. Because existing software had integrated functionality that interfered with our goals, we decided to build our own minimal Bitcoin client called CoinSeer, which was a lean tool designed exclusively for data collection.

    To increase the likelihood of receiving transactions directly from their creators in a gossip protocol, CoinSeer created an outbound connection to every listening peer whose IP address was advertised on the Bitcoin network for a period of 5 months between July 24, 2012 and January 2, 2013. We actively collected all data, along with its IP information, being relayed on the network and stored it for offline processing. This approach was inspired by a technique proposed by Dan Kaminsky during the 2011 Black Hat conference. 


    In the paper, we discuss the circumstances that allowed us to map nearly 1,000 Bitcoin addresses to their likely owner IPs by leveraging P2P relaying behavior.  Although normal relaying behavior proved very difficult to deanonymize, we discuss how certain anomalous relay patterns became highly useful in our study.
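
    As a rough illustration of the offline processing step, here is a minimal sketch that groups relay records by transaction. This is not CoinSeer's code: the record layout (transaction hash, peer IP, receive time), the sample values, and the function name are all hypothetical, and the sketch only shows the kind of per-transaction bookkeeping that makes relay patterns visible.

```python
from collections import defaultdict

# Hypothetical record layout: (tx_hash, peer_ip, unix_time_received).
observations = [
    ("tx_a", "10.0.0.1", 100.0),
    ("tx_a", "10.0.0.2", 100.4),
    ("tx_a", "10.0.0.3", 100.9),
    ("tx_b", "10.0.0.2", 205.0),
]

def summarize_relays(records):
    """Group relay records by transaction; report first relayer and relayer count."""
    by_tx = defaultdict(list)
    for tx_hash, peer_ip, received_at in records:
        by_tx[tx_hash].append((received_at, peer_ip))
    summary = {}
    for tx_hash, relays in by_tx.items():
        relays.sort()  # earliest observation first
        summary[tx_hash] = {
            "first_relayer": relays[0][1],
            "relayer_count": len({ip for _, ip in relays}),
        }
    return summary

summary = summarize_relays(observations)
for tx_hash, info in summary.items():
    print(tx_hash, info)
```

    A transaction seen from many peers says little about its origin, which is why the first relayer alone is a weak signal; the paper describes which patterns were actually useful.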


    If you take the proper precautions (e.g., using Tor, eWallets, or mixing services), you are still very safe from our approach. Even if you take no special precautions at all, 91.4% of all traffic was not amenable to analysis.

  • dianakoshy 5:06 pm on August 6, 2012 Permalink | Reply  

    Sample Scripts 

    I found a few examples of the BIP-16 Pay-To-Script-Hash scripts in the wild. So far, the ones I’ve found contain multi-sig scripts.

    Below is an example of a matching output/input pair where the output contains the hash of a MULTISIG script and the input contains the script itself along with the signature required to claim the money. The MULTISIG script lists 2 compressed keys, only one of which needs to sign for the money to become available.


    Output (send):

    TX Hash:


    Output Index:


    Raw Script:

    A9 14 74 82 84 39 0F 9E  26 3A 4B 76 6A 75 D0 63
    3C 50 42 6E B8 75 87

    Parsed Script:

    A9                             - OP_HASH160
    14                             - length of script hash (= 20)
    74 82 84 39 0F 9E 26 3A        - script hash
    4B 76 6A 75 D0 63 3C 50   
    42 6E B8 75
    87                             - OP_EQUAL
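
    The byte-by-byte breakdown above can be reproduced with a few lines of Python. This is a minimal sketch that only recognizes the exact 23-byte P2SH pattern; the function name is hypothetical.

```python
OP_HASH160 = 0xA9
OP_EQUAL = 0x87

# Raw output script from the example above.
raw_output = bytes.fromhex(
    "A9 14 74 82 84 39 0F 9E 26 3A 4B 76 6A 75 D0 63"
    "3C 50 42 6E B8 75 87".replace(" ", "")
)

def parse_p2sh_output(script: bytes) -> bytes:
    """Return the 20-byte script hash from OP_HASH160 <20 bytes> OP_EQUAL."""
    if len(script) != 23:
        raise ValueError("expected a 23-byte script")
    if script[0] != OP_HASH160 or script[1] != 0x14 or script[22] != OP_EQUAL:
        raise ValueError("not a pay-to-script-hash output script")
    return script[2:22]

script_hash = parse_p2sh_output(raw_output)
print(script_hash.hex())
```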

    Input (claim):

    TX Hash:


    Input Index:


    Raw Script:

    00 48 30 45 02 20 4A 71  D4 24 8E F5 F5 36 53 50
    4A 4C 0C FD 36 80 B2 D5  C3 8B FD CE 6A F2 D8 5C
    FA CC 71 A4 7F 5F 02 21  00 C2 38 8E E8 6B C1 54
    3B 47 50 59 5B C4 76 1B  23 A1 E2 F5 B6 24 03 68 
    D7 A9 79 5B A3 0C C3 28  95 01 47 51 21 02 2A FC
    20 BF 37 9B C9 6A 2F 4E  9E 63 FF CE B8 65 2B 2B
    6A 09 7F 63 FB EE 6E CE  C2 A4 9A 48 01 0E 21 03
    A7 67 C7 22 1E 9F 15 F8  70 F1 AD 93 11 F5 AB 93
    7D 79 FC AE EE 15 BB 2C  72 2B CA 51 55 81 B4 C0
    52 AE 

    Parsed Script:

    Signature as required by TX_OUT_MULTISIG:

    00                             - OP_0 (required by MULTISIG before sig)
    48                             - length of full signature (sigLength)
    30                             - DER sequence marker
    45                             - length of signature pieces (rsLength)
    02                             - DER integer marker
    20                             - length of first piece (rLength)
    4A 71 D4 24 8E F5 F5 36        - sig_r
    53 50 4A 4C 0C FD 36 80
    B2 D5 C3 8B FD CE 6A F2
    D8 5C FA CC 71 A4 7F 5F
    02                             - DER integer marker
    21                             - length of second piece (sLength)
    00 C2 38 8E E8 6B C1 54        - sig_s
    3B 47 50 59 5B C4 76 1B
    23 A1 E2 F5 B6 24 03 68
    D7 A9 79 5B A3 0C C3 28
    95
    01                             - hash type (SIGHASH_ALL)

    Script of type TX_OUT_MULTISIG:

    47                             - length of script
    51                             - m = OP_1
    21                             - length of pubKey #1
    02 2A FC 20 BF 37 9B C9        - pubKey #1 (compressed)
    6A 2F 4E 9E 63 FF CE B8
    65 2B 2B 6A 09 7F 63 FB
    EE 6E CE C2 A4 9A 48 01
    0E
    21                             - length of pubKey #2
    03 A7 67 C7 22 1E 9F 15        - pubKey #2 (compressed)
    F8 70 F1 AD 93 11 F5 AB
    93 7D 79 FC AE EE 15 BB
    2C 72 2B CA 51 55 81 B4
    C0
    52                             - n = OP_2
    AE                             - OP_CHECKMULTISIG
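
    To check a breakdown like the one above against the raw bytes, the input script can be split mechanically. The following is a minimal sketch, assuming the script contains only OP_0 and direct pushes (a length byte between 0x01 and 0x4B followed by that many bytes); the function names are hypothetical.

```python
RAW_INPUT_HEX = """
00 48 30 45 02 20 4A 71 D4 24 8E F5 F5 36 53 50
4A 4C 0C FD 36 80 B2 D5 C3 8B FD CE 6A F2 D8 5C
FA CC 71 A4 7F 5F 02 21 00 C2 38 8E E8 6B C1 54
3B 47 50 59 5B C4 76 1B 23 A1 E2 F5 B6 24 03 68
D7 A9 79 5B A3 0C C3 28 95 01 47 51 21 02 2A FC
20 BF 37 9B C9 6A 2F 4E 9E 63 FF CE B8 65 2B 2B
6A 09 7F 63 FB EE 6E CE C2 A4 9A 48 01 0E 21 03
A7 67 C7 22 1E 9F 15 F8 70 F1 AD 93 11 F5 AB 93
7D 79 FC AE EE 15 BB 2C 72 2B CA 51 55 81 B4 C0
52 AE
"""
raw_input = bytes.fromhex("".join(RAW_INPUT_HEX.split()))

def split_pushes(script: bytes) -> list:
    """Split a script made only of OP_0 and direct pushes into its data items."""
    items, i = [], 0
    while i < len(script):
        op = script[i]
        i += 1
        if op == 0x00:                 # OP_0 pushes an empty byte string
            items.append(b"")
        elif 0x01 <= op <= 0x4B:       # the opcode itself is the data length
            if i + op > len(script):
                raise ValueError("push runs past the end of the script")
            items.append(script[i:i + op])
            i += op
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return items

def parse_multisig(script: bytes):
    """Decode OP_m <pubKey>... OP_n OP_CHECKMULTISIG into (m, n, pubkeys)."""
    if script[-1] != 0xAE:
        raise ValueError("missing OP_CHECKMULTISIG")
    m = script[0] - 0x50               # OP_1..OP_16 are encoded as 0x51..0x60
    n = script[-2] - 0x50
    pubkeys = split_pushes(script[1:-2])
    if not (1 <= m <= n == len(pubkeys)):
        raise ValueError("inconsistent m-of-n values")
    return m, n, pubkeys

dummy, signature, redeem_script = split_pushes(raw_input)
m, n, pubkeys = parse_multisig(redeem_script)
print(f"{m}-of-{n} multisig, signature is {len(signature)} bytes")
for key in pubkeys:
    print(key.hex())
```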

    Bitcoin addresses derived from the keys:


  • philipkoshy 6:36 am on July 22, 2012 Permalink | Reply  

    What is Bitcoin? 

    The following is a short technical description I wish I had when I started working with the Bitcoin protocol almost a year ago. Although the original paper was useful, it didn’t provide the instant gratification I required as a wildly impatient developer.

    A Technical Description

    Bitcoin is a decentralized P2P currency that uses a gossip protocol to transmit messages among peers in an overlay network.

    (1) Transactions and (2) Blocks are the two main data structures in the protocol. Coins are transferred among users within transactions, which are then grouped into blocks that must be accepted by the network. A coin owner transfers coins by digitally signing (via ECDSA) a hash digest of the previous transaction and the public key of the next owner. This signature is then appended to the end of the coin. Here is the diagram from the original paper:

    Transactions are placed in blocks, which are linked by SHA256 hashes. Although the accepted chain can be considered a list, the block chain is best represented with a tree.

    Coin generation is tied to block creation. Creating a block is computationally expensive since it requires solving a cryptographic proof-of-work puzzle (see hashcash). Whenever a node generates a block that goes on to be accepted by the network, it is currently awarded 50 Bitcoins, although this reward will decrease over time. Not all blocks will be accepted network-wide (i.e., not all generated blocks earn a reward).
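
    To get a feel for the proof-of-work puzzle, here is a toy hashcash-style search. It is a sketch under simplifying assumptions, not the real block format: the header is an arbitrary byte string, the nonce is appended as 8 bytes, and difficulty is expressed as a number of leading zero bits. Only the double SHA-256 hash matches what Bitcoin uses.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def meets_target(header: bytes, nonce: int, zero_bits: int) -> bool:
    """True if the digest, read as a big-endian integer, has zero_bits leading zero bits."""
    digest = double_sha256(header + nonce.to_bytes(8, "little"))
    return int.from_bytes(digest, "big") < (1 << (256 - zero_bits))

def mine(header: bytes, zero_bits: int) -> int:
    """Try nonces 0, 1, 2, ... until one satisfies the target."""
    nonce = 0
    while not meets_target(header, nonce, zero_bits):
        nonce += 1
    return nonce

header = b"toy block header"   # placeholder, not a real 80-byte header
nonce = mine(header, zero_bits=12)
print("found nonce", nonce)
```

    Each extra zero bit doubles the expected number of attempts, while checking a claimed nonce always takes a single hash.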

    New blocks are linked to older blocks, forming a block chain that is constantly being extended. Because of Bitcoin’s decentralized and distributed nature, multiple participants may generate blocks at the same time.  For example, in the diagram below, blocks 3, 7 and 8 are all created at the same time. This leads to the distributed consensus problem. We can represent the block chain as a tree structure, with the longest path representing the accepted chain. A participant choosing to extend an existing path in the block chain indicates a vote towards consensus on that path. The longer the path, the more computation was expended building it. We note that in our data, we find that the tree has a branching factor close to one at any given moment – in other words, there is very little contention about which chain is longest. In this way, Bitcoin offers a unique solution to the consensus problem in distributed systems since voting power is directly proportional to computing power.
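
    The tree view can be sketched in a few lines. This is a simplified illustration with made-up block names: it picks the tip with the most ancestors, whereas a real client compares the total work behind each path rather than just counting blocks.

```python
def longest_chain(parents: dict) -> list:
    """Return the block ids on the longest path from the genesis block to a tip.

    `parents` maps each block id to its parent id (None for the genesis block).
    """
    def height(block):
        steps = 0
        while parents[block] is not None:
            block = parents[block]
            steps += 1
        return steps

    tip = max(parents, key=height)     # ties go to the first block encountered
    chain = []
    while tip is not None:
        chain.append(tip)
        tip = parents[tip]
    return chain[::-1]

# Hypothetical tree: "b" and "b2" compete, and "c" extends "b".
tree = {"genesis": None, "a": "genesis", "b": "a", "b2": "a", "c": "b"}
accepted = longest_chain(tree)
print(accepted)
```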

    Further Reading

    If you’ve read this far, you are likely interested in the protocol details.  For this, I recommend reading the protocol specification and the source code.

  • philipkoshy 5:12 am on July 22, 2012 Permalink | Reply  

    Getting Bitcoin-Qt running on Ubuntu 12.04 

    The following is a quick start guide I’ve written for myself to get the latest version of the Bitcoin-Qt client running in Ubuntu Desktop.  Although the following instructions will work on an Ubuntu 12.04 installation, please consult the official documentation as well. I’ve chosen a setup that suits my own development needs and you may run into issues (e.g., you may not be able to open old wallet backups with the version of libdb I’ve chosen here).

    Preinstallation Steps:

    Install Ubuntu Desktop Edition. I use VirtualBox to do most of my Linux development.

    The following packages should be installed using the apt package manager:

    sudo apt-get install g++ libboost-all-dev qt4-qmake libqt4-dev build-essential libssl-dev libdb5.1++-dev

    Compile the Bitcoin client:

    •  Download the latest development release. On the top right of the page, click on Linux (tgz, 32/64-bit) tarball.
    • After downloading the source, extract the tgz file and type:

    qmake USE_UPNP=- USE_QR_CODE=0
    make


    To start the client:

    ./bitcoin-qt

    • Jordan Arseno 12:22 am on August 6, 2012 Permalink | Reply

      For folks that don’t need the latest release:

      sudo add-apt-repository ppa:bitcoin/bitcoin
      sudo apt-get update
      sudo apt-get install bitcoin-qt

      Or, use Ubuntu Software Centre.

  • dianakoshy 4:10 am on July 22, 2012 Permalink | Reply  

    Bitcoin Scripts 

    When first learning about the scripts the Bitcoin protocol uses, I couldn’t find in-depth information in one place. I compiled a document for my own reference based on sites and posts I’ve encountered. Hopefully this will make someone’s life a little easier!

    General Information

    Every transaction contains one script per input and one script per output. Output scripts specify who the money is going to and what must be done to claim it. Input scripts specify who the money is from and generally claim previous outputs (unless the input is a coinbase transaction), thus using coins received from previous transactions.

    The scripts contain data fields and operations. When an output is claimed by an input, the input script is prepended to the output script.

    i.e. [input script claiming money][output script being claimed]

    Data fields, each starting with a length, are pushed onto a stack. Operations indicate what must be done to the field(s) at the top of the stack. Usually, this is simply signature verification.
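
    Here is a toy stack machine that shows the mechanics. It is a sketch that handles only direct data pushes plus OP_DUP (0x76), OP_DROP (0x75) and OP_EQUAL (0x87); signature checking and hashing are left out, and the example scripts are invented for illustration.

```python
OP_DUP, OP_DROP, OP_EQUAL = 0x76, 0x75, 0x87

def run(script: bytes) -> list:
    """Execute a tiny subset of script opcodes and return the final stack."""
    stack, i = [], 0
    while i < len(script):
        op = script[i]
        i += 1
        if 0x01 <= op <= 0x4B:         # length-prefixed data field
            stack.append(script[i:i + op])
            i += op
        elif op == OP_DUP:
            stack.append(stack[-1])
        elif op == OP_DROP:
            stack.pop()
        elif op == OP_EQUAL:
            b, a = stack.pop(), stack.pop()
            stack.append(b"\x01" if a == b else b"")
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return stack

# Invented scripts: the output demands the value "abc", the input supplies it.
input_script = bytes([3]) + b"abc"
output_script = bytes([3]) + b"abc" + bytes([OP_EQUAL])

# The input script runs first, then the output script sees its stack.
result = run(input_script + output_script)
print(result)
```

    The claim succeeds when the stack ends with a true (non-empty, non-zero) value on top.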

    See for more general information.

    Script Types

    There are a number of different script output/input pairs:

    a) Standard script sending money to a Bitcoin address and claiming money sent in this way.

    output (send): OP_DUP OP_HASH160 [pubKeyHash] OP_EQUALVERIFY OP_CHECKSIG

    input (claim): [sig][pubKey]

    b) Standard script assigning newly generated coins to a Bitcoin address and claiming these coins. This is also used for transactions to an IP address.

    output (send): [pubKey] OP_CHECKSIG

    input (claim): [sig]

    c) Standard input script for newly generated coins (COINBASE).

    The script is random data (prev_hash is all 0's, and prev_index is all f's). Some mining pools encode their information in these scripts. The “extranonce” field, along with a timestamp, may also be encoded here when a miner has exhausted the nonce space while trying to solve the proof-of-work puzzle.

    d) Standard script sending money to a script instead of a Bitcoin address (P2SH = Pay-To-Script-Hash (BIP 16)). The script must be one of the other standard output scripts.

    output (send): OP_HASH160 [hashOfScript] OP_EQUAL

    input (claim): [signatures as required by script][script]

    e) Standard script requiring multiple signatures to claim coins (BIP 11).

    output (send): OP_SMALLINT1 [pubKey][pubKey][pubKey] OP_SMALLINT2 OP_CHECKMULTISIG

    input (claim): OP_0 [sig][sig][sig]

    f) Sample non-standard transaction including a message.

    output (send): [message] OP_DROP [pubKey] OP_CHECKSIG

    input (claim): [sig]

    On-The-Wire: Parsing The Scripts

    Parsing the scripts requires understanding what they look like at the byte level. The following diagram has helped me out quite a bit:

    Common Operations:

    OP_DUP = 0x76

    OP_DROP = 0x75

    OP_HASH160 = 0xa9

    OP_CHECKSIG = 0xac

    OP_EQUAL = 0x87


    OP_SMALLINT1 = m

    OP_SMALLINT2 = n

    where 1 ≤ n ≤ 3 and 1 ≤ m ≤ n are integers representing that m out of n signatures need to appear to claim the transaction.
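
    With these opcode values, the output script types a), b), d) and e) can be told apart by their byte patterns. The sketch below is an illustration rather than a full standardness check: OP_EQUALVERIFY = 0x88 and OP_CHECKMULTISIG = 0xae are not in the list above, the multisig test looks only at the first and last bytes, and the function name and labels are hypothetical.

```python
OP_DUP, OP_HASH160, OP_EQUAL, OP_CHECKSIG = 0x76, 0xA9, 0x87, 0xAC
OP_EQUALVERIFY, OP_CHECKMULTISIG = 0x88, 0xAE   # not listed above

def classify_output(script: bytes) -> str:
    """Guess the type of an output script from its byte pattern."""
    if (len(script) == 25 and script[0] == OP_DUP and script[1] == OP_HASH160
            and script[2] == 0x14 and script[23] == OP_EQUALVERIFY
            and script[24] == OP_CHECKSIG):
        return "pay-to-address"
    if (len(script) == 23 and script[0] == OP_HASH160
            and script[1] == 0x14 and script[22] == OP_EQUAL):
        return "pay-to-script-hash"
    if (len(script) in (35, 67) and script[0] == len(script) - 2
            and script[-1] == OP_CHECKSIG):
        return "pay-to-pubkey"
    if (len(script) >= 3 and script[-1] == OP_CHECKMULTISIG
            and 0x51 <= script[0] <= script[-2] <= 0x53):
        return "multisig"
    return "non-standard"

# Placeholder scripts built from zero bytes, just to exercise each branch.
p2pkh = bytes([OP_DUP, OP_HASH160, 0x14]) + bytes(20) + bytes([OP_EQUALVERIFY, OP_CHECKSIG])
p2sh = bytes([OP_HASH160, 0x14]) + bytes(20) + bytes([OP_EQUAL])
p2pk = bytes([0x21, 0x02]) + bytes(32) + bytes([OP_CHECKSIG])
for script in (p2pkh, p2sh, p2pk):
    print(classify_output(script))
```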

    Common Data Fields:

    [sig] = [sigLength][0x30][rsLength][0x02][rLength][sig_r][0x02][sLength][sig_s][0x01]


    sigLength gives the number of bytes taken up by the rest of the signature ([0x30]…[0x01])

    rsLength gives the number of bytes in [0x02][rLength][sig_r][0x02][sLength][sig_s]

    rLength gives the number of bytes in [sig_r] (approx 32 bytes)

    sLength gives the number of bytes in [sig_s] (approx 32 bytes)
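
    The layout above translates directly into a parser. This is a minimal sketch, assuming single-byte lengths as in the layout; the function name is hypothetical, and the sample bytes are the signature from the Sample Scripts post. The trailing byte is returned as the hash type rather than being required to equal 0x01.

```python
SAMPLE_SIG = bytes.fromhex("".join("""
48 30 45 02 20 4A 71 D4 24 8E F5 F5 36 53 50 4A
4C 0C FD 36 80 B2 D5 C3 8B FD CE 6A F2 D8 5C FA
CC 71 A4 7F 5F 02 21 00 C2 38 8E E8 6B C1 54 3B
47 50 59 5B C4 76 1B 23 A1 E2 F5 B6 24 03 68 D7
A9 79 5B A3 0C C3 28 95 01
""".split()))

def parse_sig(field: bytes):
    """Split a [sig] data field into (sig_r, sig_s, hash_type)."""
    sig_length, body = field[0], field[1:]
    if len(body) != sig_length:
        raise ValueError("sigLength does not match the data")
    if body[0] != 0x30:
        raise ValueError("missing 0x30 marker")
    rs_length = body[1]
    if rs_length != sig_length - 3:    # minus 0x30, rsLength and the final byte
        raise ValueError("rsLength does not match sigLength")
    if body[2] != 0x02:
        raise ValueError("missing 0x02 marker before sig_r")
    r_length = body[3]
    sig_r = body[4:4 + r_length]
    pos = 4 + r_length
    if body[pos] != 0x02:
        raise ValueError("missing 0x02 marker before sig_s")
    s_length = body[pos + 1]
    sig_s = body[pos + 2:pos + 2 + s_length]
    if pos + 2 + s_length != len(body) - 1:
        raise ValueError("sLength does not match the data")
    return sig_r, sig_s, body[-1]

sig_r, sig_s, hash_type = parse_sig(SAMPLE_SIG)
print(len(sig_r), len(sig_s), hash_type)
```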

    [pubKeyHash] = [pubKeyHashLength][RIPEMD160(SHA256(public key))]

    where pubKeyHashLength is always 0x14 (= 20) since the RIPEMD160 digest is 20 bytes.

    [pubKey] (uncompressed) = [publicKeyLength][0x04][keyX][keyY]

    where publicKeyLength is always 0x41 (= 65) since keyX and keyY are 32 bytes and 0x04 is 1 byte

    [pubKey] (compressed) = [publicKeyLength][0x02 or 0x03][keyX]

    where publicKeyLength is always 0x21 (= 33) since keyX is 32 bytes and 0x02/0x03 is 1 byte.

    Note: [pubKey] can be either compressed or uncompressed. [keyX] and [keyY] are always 32 bytes each. Thus, publicKeyLength will always be 0x41 (= 65) for uncompressed keys and 0x21 (= 33) for compressed keys. The 0x02/0x03/0x04 is part of the key and describes the encoding. 0x04 is for uncompressed while 0x02/0x03 is for compressed key encodings.
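
    A quick sanity check for a [pubKey] field follows from these rules. This is a small sketch with a hypothetical function name; it validates only the length and the leading encoding byte, not whether the point lies on the curve. The sample is pubKey #1 from the Sample Scripts post.

```python
def describe_pubkey(field: bytes) -> str:
    """Return 'compressed' or 'uncompressed' for a [publicKeyLength][key] field."""
    length, key = field[0], field[1:]
    if len(key) != length:
        raise ValueError("publicKeyLength does not match the data")
    if length == 0x41 and key[0] == 0x04:
        return "uncompressed"          # 0x04 + keyX + keyY
    if length == 0x21 and key[0] in (0x02, 0x03):
        return "compressed"            # 0x02/0x03 + keyX
    raise ValueError("unrecognized public key encoding")

sample_key = bytes.fromhex(
    "21022afc20bf379bc96a2f4e9e63ffceb8652b2b6a097f63fbee6ecec2a49a48010e"
)
kind = describe_pubkey(sample_key)
print(kind)
```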

    Bitcoin Addresses: Base 58 Encoded Public Keys

    Bitcoin addresses can be built from the public key or the public key hash as follows:

    (Taken from

    Key hash = Version concatenated with RIPEMD-160(SHA-256(public key))

    Checksum = 1st 4 bytes of SHA-256(SHA-256(Key hash))

    Bitcoin Address = Base58Encode(Key hash concatenated with Checksum)

    Note: Version = 1 byte of 0 (zero); on the test network, this is 1 byte of 111

    Note: The Base58 encoding used is custom to Bitcoin and differs from a plain base conversion. In particular, leading zero bytes are preserved: each one is encoded as a single '1' character, which is the Base58 digit for zero.

    A more detailed step by step algorithm can be found at:

    Note: In step 9, it should not be using Base58Check, which appends a checksum to its input and then converts to base 58. Since the checksum has already been appended in step 8, just convert the string from step 8 to base 58 using the custom alphabet.
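
    The three formulas above can be sketched in Python as follows. The sketch starts from the 20-byte public key hash rather than the public key, because RIPEMD-160 is not available in every hashlib build; the function names are hypothetical. Note that the checksum is appended once and the result is then converted with plain Base58, as described in the note about step 9.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58_encode(data: bytes) -> str:
    """Plain Base58 conversion that keeps each leading zero byte as a '1'."""
    number = int.from_bytes(data, "big")
    digits = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        digits = ALPHABET[remainder] + digits
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + digits

def address_from_hash(pubkey_hash: bytes, version: int = 0) -> str:
    """Build an address from RIPEMD-160(SHA-256(public key)); version 111 is testnet."""
    if len(pubkey_hash) != 20:
        raise ValueError("public key hash must be 20 bytes")
    key_hash = bytes([version]) + pubkey_hash
    checksum = hashlib.sha256(hashlib.sha256(key_hash).digest()).digest()[:4]
    return base58_encode(key_hash + checksum)

# An all-zero hash is used here only as a placeholder input.
print(address_from_hash(bytes(20)))
```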

    There is a site that provides a form which performs these steps for a given key:

    A full public key (including the 0x02/0x03/0x04 leading byte) would be pasted into #1, while a public key hash would be pasted into #3.
