======================================================================
PlasmaKV performance
======================================================================

SOFTWARE: plasma rev 490 (corresponding to plasma-0.5)

HARDWARE:
 - Two nodes (office3, office4) with an Opteron 1354 CPU (4-core,
   2.2 GHz) and 8G of RAM
 - The nodes store the volume data on a RAID-1 array. The disks are
   not particularly fast, but "normal" 7500 rpm disks, connected via
   SATA-300.
 - The PostgreSQL database resides on an SSD with 90G capacity
   (OCZ Agility 3).
 - The two nodes are connected with a gigabit network.
 - A third node (office1) is used for submitting test requests.

CLUSTER CONFIGURATION:
 - There is only one namenode. The security mode for namenode accesses
   is "privacy" (i.e. full). The security mode for datanode accesses
   is "auth" (i.e. only authentication, but no encryption).
 - blocksize=64K

----------------------------------------------------------------------
TEST: plasma_kv_httpd_demo
----------------------------------------------------------------------

We create a KV database of around 3.5G in size. The keys are small,
and the values have an average size of around 30K. The aim is to find
out at what speed we can do lookups in the database, where the lookups
are passed in via HTTP.

The program plasma_kv_httpd_demo (in the Plasma distribution) is used
for this purpose. It accepts HTTP connections and interprets GET
requests as database lookups. The program uses only a single thread
and a single process; concurrency is achieved by event-driven
programming.

The HTTP requests are generated with httperf (a standard utility):

office1:~# httperf --port=8765 --uri=/p --wset=116936,1 --num-conns=100 \
    --num-calls=1000 --rate=1000

This invocation creates 100 connections to the plasma_kv_httpd_demo
server, and each connection issues 1000 HTTP GETs in sequence.
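As a quick sanity check (an illustration, not part of the Plasma
distribution), the httperf parameters are consistent with the numbers
above: 116936 distinct keys at roughly 30K per value amount to a
database of about 3.5G, and 100 connections times 1000 calls yield
100000 lookups in total:

```python
# Sanity check of the httperf test parameters (illustrative only;
# the 30K average value size is the approximate figure from the text).
wset_size = 116936        # --wset=116936,1: number of distinct URIs/keys
avg_value_bytes = 30_000  # average value size of "around 30K"
num_conns = 100           # --num-conns=100
calls_per_conn = 1000     # --num-calls=1000

db_bytes = wset_size * avg_value_bytes
total_requests = num_conns * calls_per_conn

print(f"approx. database size: {db_bytes / 1e9:.2f} GB")  # about 3.51 GB
print(f"total HTTP GETs issued: {total_requests}")        # 100000
```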
RESULTS:

**********************************************************************
httperf --client=0/1 --server=localhost --port=8765 --uri=/p
        --rate=1000 --send-buffer=4096 --recv-buffer=16384
        --num-conns=100 --num-calls=1000 --wset=116936,1.000
Maximum connect burst length: 1

Total: connections 100 requests 100000 replies 100000 test-duration 147.274 s

Connection rate: 0.7 conn/s (1472.7 ms/conn, <=100 concurrent connections)
Connection time [ms]: min 131951.6 avg 141807.4 max 147207.1 median 0.0 stddev 3919.1
Connection time [ms]: connect 1080.1
Connection length [replies/conn]: 1000.000

Request rate: 679.0 req/s (1.5 ms/req)
Request size [B]: 80.0

Reply rate [replies/s]: min 4.0 avg 675.1 max 1111.7 stddev 333.8 (29 samples)
Reply time [ms]: response 87.0 transfer 53.7
Reply size [B]: header 143.0 content 30980.0 footer 0.0 (total 31123.0)
Reply status: 1xx=0 2xx=100000 3xx=0 4xx=0 5xx=0

CPU time [s]: user 6.16 system 141.12 (user 4.2% system 95.8% total 100.0%)
Net I/O: 20691.3 KB/s (169.5*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
**********************************************************************

This is a very acceptable result: around 675 responses/second on
average, with a peak of around 1111 responses/second.

While running this test, the httperf utility completely consumed one
of the four CPU cores of the machine. Another core was almost
completely utilized by plasma_kv_httpd_demo. The servers running the
PlasmaFS system were almost idle (around 3% CPU consumption). So we
were only running against the limitations of our test setup, not
against the capacity limit of PlasmaFS. For a more meaningful test we
would have to run something like this on several cores in parallel.
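The headline figures are internally consistent; a small
back-of-the-envelope check (illustrative only, not part of the test
harness) reproduces the request rate and net I/O that httperf derives
from its raw totals:

```python
# Recompute httperf's derived figures from the reported totals
# (illustration; all input numbers are taken from the output above).
requests = 100_000       # Total: requests 100000
duration_s = 147.274     # test-duration 147.274 s
request_bytes = 80.0     # Request size [B]
reply_bytes = 31_123.0   # Reply size [B], total

request_rate = requests / duration_s
net_io_kib_s = request_rate * (request_bytes + reply_bytes) / 1024

print(f"request rate: {request_rate:.1f} req/s")  # ~679.0, as reported
print(f"net I/O: {net_io_kib_s:.1f} KB/s")        # ~20690, matching 20691.3
```

Note the gap between the sustained request rate (679 req/s) and the
peak reply rate (1111 replies/s): the load generator itself was
CPU-bound, so the sustained figure understates what the server could
deliver.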