art with code

2013-07-21

10 billion user challenge

Storage

A person using a PC with a keyboard and a mouse can generate something like 100 input events per second. Assuming we can compress those to two bytes per event, the maximum a person can output on a computer is around 200 bytes per second.

In 24 hours, that'd amount to around 17 megs. A year would fit in 6 GB. If you wanted to have the system used by everyone in the world, it'd generate 173 petabytes per day and 63 exabytes per year. With double redundancy, you'd have to store 126 exabytes of data.

If you were to store the data on 4 TB hard disks, you'd need 31 million of them. At current prices, that'd come to 5.6 billion dollars per year, or 56 cents per person. (If you bought solar PV to power the drives and assumed a 20-year panel lifetime, you could power each drive for 35 cents per year, for a total electricity cost of around $10 million a year.)
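
A quick sanity check of the numbers above, as a sketch. The ~$180 per 4 TB drive is an assumption (it's roughly what the $5.6 billion total implies):

```python
# Back-of-envelope check of the storage figures above.
BYTES_PER_SEC = 200               # 100 events/s at 2 bytes each
PEOPLE = 10e9
SECONDS_PER_DAY = 86400

per_day = BYTES_PER_SEC * SECONDS_PER_DAY    # per person per day, ~17 MB
per_year = per_day * 365                     # per person per year, ~6.3 GB
world_day = per_day * PEOPLE                 # everyone, per day, ~173 PB
world_year = per_year * PEOPLE * 2           # double redundancy, ~126 EB

DISK_PRICE = 180                             # assumed price of a 4 TB drive
disks = world_year / 4e12                    # ~31.5 million drives
cost = disks * DISK_PRICE                    # ~$5.7 billion per year

print(per_day / 1e6, world_year / 1e18, disks / 1e6, cost / 1e9)
```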

Let's optimize that a little. The input stream is likely to average less than 1 byte per second, since not everyone swings their mouse around every second of the day, and most people sleep a third of it. The event stream is also likely to be highly compressible: good compressors squeeze text to a fifth of its original size. Putting all that together, the compressed realistic input stream per person might be around 0.1 bytes per second.

Now the per-person storage shrinks to 6 megabytes per year with double redundancy. The disk cost falls to $2.3 million per year, and total electricity to $5,000 per year.
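
Rerunning the estimate with the optimized assumptions (keeping the assumed ~$180 per 4 TB drive from before, which lands at ~$2.8M, the same ballpark as the figure above):

```python
# Storage estimate at 0.1 bytes per second per person, double redundancy.
SECONDS_PER_YEAR = 365 * 86400

per_person = 0.1 * SECONDS_PER_YEAR * 2   # ~6.3 MB/year stored per person
total = per_person * 10e9                 # ~63 PB for the whole population
drives = total / 4e12                     # ~16,000 4 TB drives
cost = drives * 180                       # assumed $180 per drive, ~$2.8M

print(per_person / 1e6, total / 1e15, cost / 1e6)
```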

Access

It's not enough to just store all the things made by everyone, everywhere, forever. You want to have people following interesting events and broadcasting what they're doing. Suppose the latest rubber-stamped planetary dear leader wants to address every single person out there on the subject of taxes, general mood, weather, etc. The resulting 3 hour monologue composed mostly of hastily-written text messages needs to be sent to all the ten billion humble subjects in real-time. At 160 bytes per minute, the combined bandwidth consumed by the stream would be 27 GBps.
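
The broadcast figure is just the per-listener trickle multiplied by the population:

```python
# One 160 B/min stream fanned out to ten billion real-time listeners.
stream_bps = 160 / 60          # bytes per second per listener
total = stream_bps * 10e9      # aggregate fan-out bandwidth, ~27 GB/s

print(total / 1e9)
```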

Bear in mind that all the listeners also want to tell their 150 closest friends what they think of dear leader's hair-do and dance moves. Each person is now on the receiving end of 150 streams and sending out a stream to 150 listeners. At 160 bytes per minute per stream, the total bandwidth used by each person in each direction is 400 bytes per second. Each person would also generate something like three random reads and three random writes per second. The combined bandwidth used by the population would sum up to 8 TBps and require 60 billion IOPS.
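
The fan-out math, spelled out (three reads plus three writes per person per second gives the 60 billion IOPS total):

```python
# Per-person and population-wide load for the 150-friend fan-out.
FRIENDS = 150
stream = 160 / 60                    # bytes/s per stream
per_person = FRIENDS * stream        # 400 B/s in each direction
both_dirs = per_person * 10e9 * 2    # send + receive, whole population
iops = (3 + 3) * 10e9                # 3 reads + 3 writes per person

print(per_person, both_dirs / 1e12, iops / 1e9)   # ~400 B/s, 8 TB/s, 60 billion
```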

To achieve 60 billion IOPS, you'd need around 6,000 DRAM chips. Or 600,000 super-fast SSDs. Let's say you can buffer for a bit in RAM and turn the random writes into streaming log writes. If you needed 10 TB of DRAM for that, the cost would be around $70,000 (the DDR3 spot price for a 512 MB chip is $3.5). For the SSD solution, you'd have to spend $60 million. On the other hand, you'd also get an extra 77 PB of fast storage.
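
Comparing the two options with the prices quoted above. The ~100k IOPS, ~$100 price, and 128 GB capacity per SSD are inferred from the totals rather than stated, so treat them as assumptions:

```python
# DRAM vs. SSD for the 60 billion IOPS target.
dram_chips = 10e12 / 512e6      # 512 MB chips needed for 10 TB of DRAM
dram_cost = dram_chips * 3.5    # ~$70k at $3.5 per chip (spot price)
ssds = 60e9 / 100e3             # assuming ~100k IOPS per SSD -> 600,000 SSDs
ssd_cost = ssds * 100           # assuming ~$100 per SSD -> ~$60M
bonus_pb = ssds * 128e9 / 1e15  # assuming 128 GB per SSD -> ~77 PB extra

print(round(dram_cost), int(ssds), ssd_cost / 1e6, bonus_pb)
```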

Software

What kind of math lets you piece all this stuff together? ( ._. )

2013-07-10

The $1000 cluster

$50 gets you a quad-core ARM A9 with 2 GB of DDR3 and 8 GB of flash. Buy twenty. Plug them into a $80 switch with 24 ports.

Hello, kinda slow 80-core cluster with 40 GB RAM and 160 GB flash, plus 80 ARM Mali-400 GPU cores. Total bill of materials $1080.
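
Tallying the bill of materials and aggregate specs from the per-board figures above:

```python
# Aggregate specs of the 20-board cluster.
NODES = 20
cores = NODES * 4       # 80 CPU cores (plus 80 Mali-400 GPU cores)
ram_gb = NODES * 2      # 40 GB of DDR3
flash_gb = NODES * 8    # 160 GB of flash
bom = NODES * 50 + 80   # $50 per board plus the $80 switch

print(cores, ram_gb, flash_gb, bom)   # 80 40 160 1080
```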

You get something like peak 192 DP GFLOPS (512 SP) from the CPUs, 480 GFLOPS from the GPUs, 60 GBps aggregate memory bandwidth, 10-20 megs of L2, 5 megs of L1. And all the fun of programming a cluster with a 100 Mbps interconnect.

That said, if you have the right workload, you might have something interesting here. I kinda want to build one just for the fun of it.

Or, wait for the A15 with Mali-T600 series to go mainstream. You'll get double the CPU performance, OpenCL, and triple the GPU perf.
