Aymeric on Software

Because we needed another of these blogs...

Why the Blockchain and the Bitcoin Wallet Balances Differ

If you look at a website like blockchain.info or blockexplorer.com, you may notice it is possible to find out the details about a particular bitcoin address, such as the last transactions and of course the balance.

If you try this on a Bitcoin address that belongs to you, and fire up the Bitcoin Qt client (aka Bitcoin Core), you may notice a discrepancy. It’s very likely for the balance displayed on the website to be less than the one displayed by the software wallet.

The discrepancy is caused by the nature of bitcoin. Instead of storing actual coins, the bitcoin protocol should be seen as a distributed public database of transactions which together form the blockchain. You “receive” bitcoins when another party decides to use their private key to sign a transaction and send some amount of bitcoins to your public address. Bitcoins only exist in the sense that you can trace the chain of valid transactions until you reach a special coinbase transaction, i.e. some mined bitcoins. You can almost think of all the transactions as forming a singly linked list that stops at one end with mined bitcoins, and at the other end with unspent bitcoins… except for the fact that each transaction can have multiple inputs or outputs. (Please keep in mind this is deliberately simplified; if you wish to know more, check the protocol documentation.)

One of the quirks of the protocol is that the amounts of the inputs and outputs in a transaction must match (in reality, the outputs can total less than the inputs, and the remainder then constitutes the optional transaction fee). That rule greatly simplifies the validation of transactions, since there is no need to extract the entire history of transactions to figure out how much of the funds is spent or unspent for a given transaction: it’s either all or nothing.

The drawback of this solution arises when you need to spend only a fraction of the amount received in a previous transaction. In that case, the wallet software automatically creates two outputs to the transaction: one output is used to send money to the intended recipient, one output is used to send the remainder to the sender.

At this stage, it’s probably simpler to reason with an example. Let’s imagine Alice wants to send 1.2 BTC to Bob. Alice previously received 1 BTC from Chip and 0.5 BTC from Dale. The new transaction she makes has to reference both previous unspent transactions as inputs, since neither of these transactions taken individually has enough funds. One of the outputs of the transaction must be the 1.2 BTC that is sent to Bob. But Alice also needs to add a 0.3 BTC output that is sent back to herself. In the future, she could use the 0.3 BTC that remains in her wallet by referencing this 0.3 BTC output as an input for a new transaction.
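Alice’s transaction can be written out as a small Python sketch. The names (`Output`, `build_transaction`), the owner labels and the use of integer satoshi amounts are assumptions made for illustration only; this is not how Bitcoin Qt represents transactions internally, and the fee defaults to zero as in the example above.

```python
from dataclasses import dataclass

SATOSHI = 100_000_000  # 1 BTC expressed in satoshis, to avoid float rounding


@dataclass(frozen=True)
class Output:
    owner: str   # hypothetical label standing in for a bitcoin address
    amount: int  # amount in satoshis


def build_transaction(inputs, recipient, amount, change_owner, fee=0):
    """Spend whole previous outputs and return the list of new outputs."""
    total_in = sum(o.amount for o in inputs)
    change = total_in - amount - fee
    if change < 0:
        raise ValueError("inputs do not cover the amount plus the fee")
    outputs = [Output(recipient, amount)]
    if change > 0:
        # The remainder goes back to the sender as a second output.
        outputs.append(Output(change_owner, change))
    return outputs


# Alice holds two unspent outputs: 1 BTC from Chip and 0.5 BTC from Dale.
alice_unspent = [Output("alice", 1 * SATOSHI), Output("alice", SATOSHI // 2)]

new_outputs = build_transaction(
    alice_unspent, recipient="bob", amount=120_000_000, change_owner="alice"
)
for out in new_outputs:
    print(out.owner, out.amount / SATOSHI)
```

Both inputs are consumed entirely; the only way for Alice to keep her 0.3 BTC is the second output addressed to herself.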

It would be possible to use the same public address to send the money back to the sender, but the Bitcoin Qt software sends it to a new address instead, for privacy reasons. A bitcoin wallet contains at least a hundred such addresses, which constitute the key pool. The key pool is pre-allocated (therefore many addresses will have a balance of zero) so that slightly out of date backups of wallet files result in no loss of bitcoins. Every time a transaction that requires returning funds is made, these returned funds seem to “disappear” from the balance of the wallet’s public address. It’s possible to reach a balance of zero on your public address in that way.
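The difference between the two balances can be sketched in a few lines of Python. The address labels and amounts below are made up for illustration, and the two functions are only a rough model of what a block explorer and a wallet each add up, not their actual code.

```python
from collections import defaultdict

# Hypothetical unspent outputs after the wallet sent 1.2 BTC out of 1.5 BTC.
# Each entry is (address, amount in satoshis). The 0.3 BTC of change landed
# on a fresh key pool address instead of the address that was published.
unspent_outputs = [
    ("change-address-1", 30_000_000),
]

# Every address the wallet holds a private key for (made-up labels).
wallet_addresses = {"published-address", "change-address-1", "change-address-2"}


def balance_per_address(utxos):
    """What a block explorer shows: unspent amounts grouped by address."""
    balances = defaultdict(int)
    for address, amount in utxos:
        balances[address] += amount
    return balances


def wallet_balance(utxos, addresses):
    """What the wallet shows: unspent amounts summed over all its addresses."""
    return sum(amount for address, amount in utxos if address in addresses)


per_address = balance_per_address(unspent_outputs)
print("explorer, published address:", per_address["published-address"])
print("wallet, all addresses:", wallet_balance(unspent_outputs, wallet_addresses))
```

Looking up the published address alone gives zero, while the wallet still counts the 0.3 BTC sitting on the change address.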

How I Store My Bytes

Over a year ago, I read an article on Mockyblog about storing your personal data on a HP ProLiant MicroServer. After juggling with no fewer than 4 external hard drives to palliate the lack of space on my iMac, I ended up buying this machine and turning it into a Linux powered NAS. After listening to a recent episode of Accidental Tech Podcast, where the host chose a more expensive and convenient approach, I decided to share my experience.

  • The HP microserver is cheaper than a real NAS. HP regularly operates a cashback offer on this hardware (which I used). You could acquire an HP ProLiant G7 N54L 2.2GHz for roughly £150 with a cashback offer last month. The cashback and the price vary, but the offer regularly comes back. The hardware is comparable to a NAS: it is small, fitted with 4 HD bays, it has a CD-ROM bay that can be used for an additional disk, and it has E-SATA ports for adding external disks. In comparison, Synology hardware is usually north of £400 and DROBO is north of £300. The cheapest 4 bay NAS I found is the Synology DS413j, which currently retails at £265. But its hardware pales in comparison to my HP microserver: single core CPU, no E-SATA and only 512MB of RAM.

  • Booting from the internal USB connector. There is a connector fitted on the motherboard, and this is where I plug my USB thumbdrive (a SanDisk Cruzer Blade). Since the microserver is not connected to any screen or keyboard, having it boot from a thumbdrive is an advantage. In case of trouble, I can take the drive, plug it into the iMac and boot using VMWare. I also regularly clone it to another thumbdrive for backup purposes. Using the USB drive as a boot disk also means that all my hard drives are allowed to spin down to save power and prolong their lifetime.

  • It’s relatively quiet and low power. I measured my power consumption over the course of 1 month and it comes to about 40 Watts on average, for a monthly cost of £3.50. It is fitted with 4 hard drives, which are probably powered down about 90% of the time. I use hdparm to power down the drives when not in use. The CPU of the machine is constantly kept busy though, by the various services I installed, and this probably prevents the machine from dropping to the 30-35 Watts that can be observed by just booting it and doing nothing. The large fan is relatively quiet, but it got noisier after 1 year of absorbing dust. Currently the noise is about 40dB from 1 meter away, which is comparable to my late 2008 iMac.

  • Use encryption on all disks. As I explained in my previous post, the goal is not to stop the NSA or GCHQ from stealing my data. There is no 4th amendment in the UK, and the government can force you to reveal your key or put you in jail for 2 or 5 years if you refuse. However, encryption is a good way to prevent identity theft if someone breaks into your house and steals your toys. I use full disk encryption with LUKS on all hard drives, but not the thumb drive. The encrypted disks appear as regular block devices, and you can use any filesystem or utility. Since I have no keyboard to enter a password at boot time, I have to SSH into the box and enter it manually after each reboot. It’s slightly inconvenient, but I only power down the machine to dust it off, about once a month.

  • Use snapraid for redundancy. I fitted the server with the hard disks I initially used as external drives for the Mac. Consequently all four disks are of different sizes. My goal is to achieve 1 disk redundancy, i.e. be immune to the failure of a single disk. Traditional RAID would have a hard time coping with this setup, and it would be hard to expand it dynamically. Mockyblog mentioned using ZFS with RAID-Z. This fulfills the redundancy goal, but unfortunately it’s not possible to add a disk to an existing zpool. There are complicated techniques to work around it though… BTRFS is a good alternative to ZFS but it is not yet ready. So in the end I adopted snapraid. Despite its name, it has nothing to do with RAID. It’s a file utility that runs on Linux, MacOS X and Windows and works with any file system. It can be easily configured with 3 data disks and 1 parity disk. The parity is stored in a series of checksum files and it’s somewhat similar to PAR2. It reads chunks of files from each data disk and stores the checksum on the parity disk. You need to run snapraid regularly to update the checksums, or use a cron job. It’s very well suited for NAS storage, where data seldom changes and is mostly added (rather than modified/deleted). Compared to RAID, you also do not need to wake up all 4 drives when you choose to read or write data.

  • Plex Media Server. This is probably the service I use the most. This is a huge improvement over manually sorting files and firing up EyeTV or VLC, because Plex automatically retrieves thumbnails and metadata, and the UI is very good. I can also access my collection from all of my small collection of iOS and Android devices, or via a web page. Streaming works even outside of the home network, and even though my upload link is not good enough for 1080p films, it does stream music very well. As a consequence, I have deleted almost all of my songs from the 32GB iPhone and for the first time ever, I have a lot of space to spare. It’s like having your own Spotify, iCloud, and other radio things, for much cheaper and without ads and privacy concerns.

  • Bittorrent Sync. As an alternative to Dropbox, not the Pirate Bay style bittorrent. Again it’s free, and I can share a lot more than with the Dropbox free tier. You could add a dedicated server with 100 Mbit/s bandwidth in the mix (which you can get for about £3.50 a month for 500GB these days, but I use this instead). My phone, my office computers and my laptop all use it. I recently synchronized 20GB of photos and videos of my brother’s wedding.

  • Various download services. You can queue large downloads (or uploads) from HTTP, SSH, RSync, BitTorrent, NZB, or anything really… The advantage is that, since this server is low power, you can schedule data transfers to occur in the middle of the night, so that bandwidth is not affected during the day. There is lots of software you can run on the server and use with a Web interface: Bittorrent Sync, Sabnzbd, Sickbeard, Couchpotato, Transmission to name a few.

  • Private Minecraft server. Or any game really. This is really the kind of stuff that would be hard to achieve on a NAS. I actually had to put extra RAM in the server. In all honesty the CPU is a bit weak for that kind of thing, but it can be done. I doubt a NAS would be able to do this…
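To make the snapraid point above a little more concrete, here is a minimal Python sketch of how a single parity disk can protect 3 data disks of different sizes. It uses a plain XOR parity purely as an illustration of the general idea; the function names and the chunk contents are hypothetical, and this is not snapraid’s actual file format or implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def pad(block, size):
    """Pad a short block with zero bytes so disks of different sizes line up."""
    return block.ljust(size, b"\x00")


# Hypothetical contents of one chunk on each of the 3 data disks.
data_chunks = [b"films", b"music-library", b"photos"]
size = max(len(chunk) for chunk in data_chunks)
padded = [pad(chunk, size) for chunk in data_chunks]

# The parity disk stores the XOR of the three chunks.
parity = xor_blocks(padded)

# Simulate losing the first disk: XOR the parity with the surviving chunks.
survivors = [padded[1], padded[2]]
# Stripping the zero padding is a shortcut for this demo; a real tool would
# record the original length instead.
recovered = xor_blocks(survivors + [parity]).rstrip(b"\x00")
print(recovered)
```

Because the parity has to cover the longest chunk, the parity disk needs to be at least as large as the largest data disk, which is why mixing disk sizes is not a problem as long as the biggest one holds the parity.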

In the end, using commodity PC hardware and Linux you can build pretty much any solution for less money than a dedicated NAS.

There is no denying that researching and implementing features that come out of the box on a commercial NAS is really time consuming. But once it’s done, maintenance does not cost any time. I hardly touch the server anymore, unless I need to dust it off or install updates.