The SciLens project team has been granted a budget from the Netherlands Organisation for Scientific Research (NWO) to develop and install a large database experimentation platform in 2011, significantly improving the potential for large-scale demonstrators.
The infrastructure is specifically tuned for data-intensive work: the CPU power of each node is balanced against its main memory, its network topology, and its I/O bandwidth, so that all nodes approach an Amdahl factor of 1.0. The system is configured as a 4-tier cluster: a computational top tier (1 node), a high-end tier (16 nodes), a Cloud-oriented tier (64 nodes), and an energy-conservative tier (256 nodes). Each tier consists of a homogeneous platform. All tiers, except the top, come with the same global hardware resources of >1 TB RAM and >128 TB disk space. The tiers are connected by both an InfiniBand network and Gigabit Ethernet to enable topology reconfiguration and slicing over all tiers.
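The Amdahl balance can be made concrete. By Amdahl's classic rules of thumb, a balanced system performs about one bit of I/O per second for every instruction per second, and holds about one byte of memory per instruction per second. The sketch below computes these ratios; the node figures are purely illustrative, not the actual SciLens specifications:

```python
# Amdahl's balance rules of thumb: a balanced system does ~1 bit of I/O
# per second per instruction per second, and holds ~1 byte of memory per
# instruction per second. The node figures below are illustrative only,
# not the actual SciLens specs.

def amdahl_io_factor(io_bytes_per_sec, instructions_per_sec):
    """Bits of I/O per second per instruction per second (target ~1.0)."""
    return io_bytes_per_sec * 8 / instructions_per_sec

def amdahl_memory_factor(ram_bytes, instructions_per_sec):
    """Bytes of RAM per instruction per second (target ~1.0)."""
    return ram_bytes / instructions_per_sec

# Hypothetical low-end node: 2 GIPS CPU, 2 GB RAM, 250 MB/s disk I/O.
gips = 2e9
print(amdahl_io_factor(250e6, gips))    # -> 1.0
print(amdahl_memory_factor(2e9, gips))  # -> 1.0
```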
From dream to reality
During the spring of 2011 we assembled a dozen different boxes for the bottom two tiers. They were built from COTS components, including the Intel Atom D425 @ 1.8 GHz, Atom 330 @ 1.6 GHz, i3 540 @ 3.07 GHz, E5700 @ 3 GHz, i3 M350 @ 2.27 GHz, i7 860 @ 2.8 GHz, i7 970 @ 2.8 GHz, and i7 2600K, as well as the AMD Athlon II X2 250 and the AMD Bobcat. Together they covered a complete range of affordable processors with 1 to 6 cores. For hard drives we experimented with RAID-0 configurations of 1 to 4 SATA disks of 160 GB, 500 GB, and 2 TB, along with 128 GB and 250 GB SSDs. All systems were mounted on state-of-the-art motherboards.
Given the large number of parameters, a heuristic search was initiated to find the 'best' solution for the bottom layers. The search components were based on measuring raw disk I/O bandwidth, network capabilities, compilation of the MonetDB system, and two data-warehouse benchmarks, i.e., TPC-H and the air-traffic benchmark. The latter are considered representative of the work we intend to perform on the machine. In addition, we measured the electricity consumption of the complete system in idle, boot, and high-stress situations.
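As an illustration of the first search component, raw sequential-read bandwidth can be probed along the following lines (the actual tooling used in the selection process is not specified here; file name and sizes are illustrative):

```python
# A crude sequential-read bandwidth probe, similar in spirit to the raw
# disk I/O component of the selection benchmark. Note that without
# dropping the OS page cache, repeated runs report cache-warm figures.
import os
import time

def read_bandwidth(path, block_size=1 << 20):
    """Sequentially read `path` in `block_size` chunks; return MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while chunk := f.read(block_size):
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / elapsed / 1e6

if __name__ == "__main__":
    scratch = "scratch.bin"            # hypothetical test file
    with open(scratch, "wb") as f:     # 64 MB of dummy data
        f.write(os.urandom(1 << 26))
    print(f"{read_bandwidth(scratch):.0f} MB/s")
    os.remove(scratch)
```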
The selection process soon uncovered the weakness of the presumably energy-conservative Atoms. Using these systems for database processing turned out to be both poor in performance and far more energy-draining. For example, the complete workload took around 5.2 hours on an Atom, compared to 1.2 hours on the i7 970. Although the peak power drain was 40 against 132 Watt, the total energy consumed by the i7 was lower, as the machine could be turned off after an hour. This phenomenon has been observed in most cloud settings as well: migrating work and turning off machines is by far the best energy saver.
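The energy argument is simple arithmetic: total energy is power draw times run time, assuming each machine is switched off as soon as its workload completes and idle draw is ignored:

```python
# Back-of-the-envelope check of the Atom vs. i7 970 comparison, using
# the peak-power and run-time figures quoted above.

def total_energy_wh(power_watt, hours):
    """Total energy in watt-hours for a run at constant power draw."""
    return power_watt * hours

atom_wh = total_energy_wh(40, 5.2)   # Atom: ~208 Wh
i7_wh = total_energy_wh(132, 1.2)    # i7 970: ~158 Wh
# The "slow" Atom ends up consuming roughly 30% more energy overall.
print(f"Atom: {atom_wh:.0f} Wh, i7: {i7_wh:.0f} Wh")
```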
The disk-specific benchmark confirmed the well-known dependency on the read/write mix and the block sizes. However, in assembling the systems, the motherboard chipsets had a significant impact on the measured results. The SATA disks topped out at 380 MB/s in 4-way RAID-0, and the peak for the SSDs was 2 GB/s using an (expensive) JBOD card. The benefit of faster SSDs was not fully reflected in the database workload performance, where it led to about a factor of two improvement compared with a similar machine configured with SATA HDDs.
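Ideal RAID-0 read bandwidth scales linearly with the number of striped disks, so the 380 MB/s measured over four SATA disks implies roughly 95 MB/s per drive (the per-disk figure is our inference, not stated above):

```python
# Ideal RAID-0 aggregate read bandwidth: striping across n disks scales
# the per-disk figure linearly (ignoring controller overhead).

def raid0_aggregate_mb_s(per_disk_mb_s, n_disks):
    return per_disk_mb_s * n_disks

print(raid0_aggregate_mb_s(95, 4))  # -> 380
```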
The SciLens Machine
The heuristic exploration led to choosing the 'best' of all worlds, taking into account the price per box. We settled on two kinds of boxes. The bottom layer of the machine, the pebbles, consists of 144 shoeboxes with an AMD Bobcat, 8 GB RAM, Gigabit Ethernet, and five 2 TB HDDs, four configured in RAID-0 and one as a system disk. The next layer, the rocks, consists of 144 Shuttle boxes with an Intel i7 2600K, 16 GB RAM, 40 Gb/s InfiniBand, a 2 TB HDD, and (to be decided) a 0.5-1 TB SSD.
The aggregate resources amount to 1332 cores (288 + 1044), 3 TB RAM (1 TB + 2 TB), and 1.72 PB of HDD storage (1.44 PB + 0.28 PB).
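As a quick sanity check, the RAM and disk aggregates follow from the per-node specs quoted above, with 144 nodes per layer (the quoted per-tier RAM totals appear rounded down):

```python
# Recomputing the aggregate RAM and HDD figures from the per-node specs
# above (decimal units: 1 TB = 1000 GB, 1 PB = 1000 TB).
pebbles = {"nodes": 144, "ram_gb": 8,  "hdd_tb": 5 * 2}  # five 2 TB HDDs
rocks   = {"nodes": 144, "ram_gb": 16, "hdd_tb": 2}      # one 2 TB HDD

ram_tb = sum(t["nodes"] * t["ram_gb"] for t in (pebbles, rocks)) / 1000
hdd_pb = sum(t["nodes"] * t["hdd_tb"] for t in (pebbles, rocks)) / 1000

# ~3.5 TB RAM (quoted as 3 TB after per-tier rounding) and ~1.73 PB of
# HDD storage (quoted as 1.72 PB).
print(f"RAM: {ram_tb:.2f} TB, HDD: {hdd_pb:.2f} PB")
```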