Why the Blockchain and the Bitcoin Wallet Balances Differ
If you look at a website like blockchain.info or blockexplorer.com, you will notice that you can look up the details of a particular bitcoin address, such as its latest transactions and, of course, its balance.
If you try this on a Bitcoin address that belongs to you, and then fire up the Bitcoin Qt client (aka Bitcoin Core), you may notice a discrepancy. The balance displayed on the website is very likely to be less than the one displayed by the software wallet.
The discrepancy is caused by the nature of bitcoin. Instead of storing actual coins, the bitcoin protocol should be seen as a distributed public database of transactions, which together form the blockchain. You “receive” bitcoins when another party uses their private key to sign a transaction that sends some amount of bitcoins to your public address. Bitcoins only exist in the sense that you can trace the chain of valid transactions back until you reach a special coinbase transaction, i.e. some mined bitcoins. You can almost think of all the transactions as forming a singly linked list that stops at one end with mined bitcoins and at the other end with unspent bitcoins… except for the fact that each transaction can have multiple inputs and outputs. (Please keep in mind that this is deliberately simplified. If you wish to know more, check the protocol documentation.)
One of the quirks of the protocol is that the total amount of the inputs and the total amount of the outputs in a transaction must match (in reality, the outputs can add up to less than the inputs, and the difference then constitutes the optional transaction fee). Each input consumes a previous output in full. That rule greatly simplifies the validation of transactions, since there is no need to walk through the entire history of transactions to figure out how much of a previous output is still unspent: it’s either all or nothing.
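The rule above can be sketched in a few lines of Python. This is only an illustration, not how Bitcoin Core implements validation: the function name implied_fee is made up for this example, and amounts are expressed in satoshis (1 BTC = 100,000,000 satoshis) to avoid floating-point rounding.

```python
SATOSHI_PER_BTC = 100_000_000  # 1 BTC expressed in satoshis


def implied_fee(input_values, output_values):
    """Return the fee implied by a transaction, in satoshis.

    Hypothetical helper for illustration. Both arguments are lists of
    integer amounts in satoshis. The fee is whatever the inputs carry
    that the outputs do not claim.
    """
    total_in = sum(input_values)
    total_out = sum(output_values)
    if total_out > total_in:
        # A transaction cannot create more than it consumes.
        raise ValueError("outputs exceed inputs")
    return total_in - total_out
```

For instance, spending a single 1.5 BTC input into outputs of 1.2 BTC and 0.2999 BTC leaves 0.0001 BTC (10,000 satoshis) unclaimed, which is the fee.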
The drawback of this solution arises when you need to spend only a fraction of the amount received in a previous transaction. In that case, the wallet software automatically creates two outputs in the transaction: one output sends money to the intended recipient, and the other sends the remainder (the “change”) back to the sender.
At this stage, it’s probably simpler to reason with an example. Let’s imagine Alice wants to send 1.2 BTC to Bob. Alice previously received 1 BTC from Chip and 0.5 BTC from Dale. The new transaction she makes has to reference both previous unspent transactions as inputs, since neither of them taken individually has enough funds. One of the outputs of the transaction must be the 1.2 BTC that is sent to Bob. But Alice also needs to add a 0.3 BTC output that is sent back to herself (ignoring any transaction fee). In the future, she can spend the 0.3 BTC that remains in her wallet by referencing this 0.3 BTC output as an input for a new transaction.
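Alice's transaction can be written out as a small Python sketch. Everything here is hypothetical and simplified: build_transaction, the labels "from_chip", "from_dale", "bob" and "alice_change" are invented for the example, real wallets use more elaborate coin selection, and amounts are in satoshis.

```python
def build_transaction(unspent, recipient, amount, change_address, fee=0):
    """Pick unspent outputs until they cover amount + fee, then add change.

    unspent is a list of (reference, value_in_satoshis) pairs.
    Returns a dict with the chosen input references and the outputs.
    """
    selected, total = [], 0
    for reference, value in unspent:
        if total >= amount + fee:
            break  # already enough funds, no need for more inputs
        selected.append(reference)
        total += value
    if total < amount + fee:
        raise ValueError("insufficient funds")

    outputs = [(recipient, amount)]
    change = total - amount - fee
    if change > 0:
        # Inputs are consumed in full, so the remainder must be
        # sent back to the sender explicitly.
        outputs.append((change_address, change))
    return {"inputs": selected, "outputs": outputs}


# Alice holds 1 BTC from Chip and 0.5 BTC from Dale, and sends 1.2 BTC to Bob.
alice_unspent = [("from_chip", 100_000_000), ("from_dale", 50_000_000)]
tx = build_transaction(alice_unspent, "bob", 120_000_000, "alice_change")
print(tx["inputs"])
print(tx["outputs"])
```

Both previous outputs end up as inputs, and the outputs are 1.2 BTC for Bob plus 0.3 BTC of change for Alice.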
It would be possible to use the same public address to send the money back to the sender, but the Bitcoin Qt software sends it to a new address instead, for privacy reasons. A bitcoin wallet contains at least a hundred such addresses, which constitute the key pool. The key pool is pre-allocated (therefore many addresses will have a balance of zero) so that restoring a slightly out-of-date backup of the wallet file results in no loss of bitcoins. Every time you make a transaction that requires return funds, these returned funds seem to “disappear” from the balance of the wallet’s public address, because they now sit on another address of the same wallet. The website only reports the balance of the one address you look up, whereas the wallet software adds up the balances of all its addresses. It’s possible to reach a balance of zero on your public address in that way.
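To make the difference between the two balances concrete, here is a minimal sketch that continues the Alice example. The address names "alice_main" and "alice_change" and both helper functions are assumptions made up for illustration. The list of unspent outputs stands in for what the blockchain records after Alice has paid Bob.

```python
def address_balance(unspent_outputs, address):
    """Balance of one address, as a block explorer would report it."""
    return sum(value for owner, value in unspent_outputs if owner == address)


def wallet_balance(unspent_outputs, wallet_addresses):
    """Balance over every address the wallet owns, change addresses included."""
    return sum(value for owner, value in unspent_outputs
               if owner in wallet_addresses)


# Unspent outputs after Alice sent 1.2 BTC to Bob (amounts in satoshis).
# The 1 BTC and 0.5 BTC she received on "alice_main" are now spent.
unspent_outputs = [("bob", 120_000_000), ("alice_change", 30_000_000)]
alice_wallet = {"alice_main", "alice_change"}

print(address_balance(unspent_outputs, "alice_main"))
print(wallet_balance(unspent_outputs, alice_wallet))
```

Looking up only "alice_main" gives zero, while summing over the whole wallet gives the 0.3 BTC of change, which is the discrepancy described above.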