Archive for the ‘Databases (RDBS)’ Category

MySQL 5.6.25 sql-bench results on Linux 4.2.3 kernel and SSD RAID 1

Friday, October 23rd, 2015

This post provides fresh sql-bench results for the test that was performed a while back.

HP DL380 G5 drive write cache (BBWC)

Friday, June 19th, 2009

In one of my previous articles I wrote about the tools I use for benchmarking database performance, and especially for discovering system bottlenecks. In this article I will show how large an impact a drive write cache can have on filesystem performance.

HP DL360 G4 server

HP's line of DL servers is well respected among IT engineers. In my experience it is a reliable class of servers: well built, easy to maintain, and shipped with excellent server management software called Integrated Lights-Out (iLO). The SCSI and SAS storage subsystem in DL servers is usually driven by controllers called Smart Array, which may or may not be integrated onto the motherboard. A new HP customer gets all the nifty RAID levels to play with and is usually satisfied. But what a new customer usually DOES NOT KNOW is that Smart Array write performance really sucks if the write cache is not enabled. And to enable it, the user needs to install a special module with an attached battery, called BBWC. So, HP, if you happen to be reading this, please do notify your customers about such things.

HP DL380 G5 server

This is the other server on which I ran the benchmark. Unlike the DL360 G4, it came with BBWC already installed. To simulate the absence of a write cache I disabled it with the command line tool hpacucli (HP Array Configuration Utility Command Line Interface).

Benchmark methodology

For this benchmark I used two tools from the sql-bench suite that heavily stress the filesystem with lots of file creations and deletions: test-alter-table and test-create. The former is faster and gives only a rough figure. The latter creates and deletes around 50,000 MyISAM tables, which results in 150,000 files created and deleted. I executed both benchmarks first without and then with the write cache enabled. One of the machines (the DL380) was already in production, but it was benchmarked during the night, when usage is negligible.
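To get a feel for this kind of workload without MySQL, the file churn can be roughly imitated in Python. This is only a sketch under assumptions, not what sql-bench does: the function name `create_and_delete` is made up for illustration, the three suffixes mimic the files behind a MyISAM table, and the `fsync` after each write is my assumption to force the data to the storage layer.

```python
import os
import tempfile
import time

# The three files that back a single MyISAM table.
MYISAM_SUFFIXES = (".frm", ".MYD", ".MYI")


def create_and_delete(directory, tables, sync=True):
    """Create three small files per "table", then delete them all.

    Returns the elapsed wall-clock time in seconds. This only imitates
    the file churn; it does not talk to MySQL at all.
    """
    start = time.perf_counter()
    paths = []
    for i in range(tables):
        for suffix in MYISAM_SUFFIXES:
            path = os.path.join(directory, f"t{i}{suffix}")
            with open(path, "wb") as fh:
                fh.write(b"x")
                if sync:
                    # Assumption: force the write down to the storage layer.
                    fh.flush()
                    os.fsync(fh.fileno())
            paths.append(path)
    for path in paths:
        os.remove(path)
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        elapsed = create_and_delete(tmp, tables=200)
        print(f"200 tables (600 files): {elapsed:.2f} s")
```

Running it on a directory that lives on the array under test, once with and once without the write cache, should give a rough idea of whether the cache matters, although the absolute numbers are not comparable to the sql-bench figures.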

Test systems

HP DL360 G4

  • Controller: Smart Array 6i
  • Filesystem: ufs
  • OS: FreeBSD 6.0
  • MySQL: mysql-5.0.41-freebsd6.0-i386

HP DL380 G5

  • Controller: Smart Array P400i
  • Filesystem: ext3
  • OS: Slackware 12.2
  • MySQL: mysql-5.0.77 compiled from source

Results on HP DL380 G5

Drive write cache DISABLED

Testing of ALTER TABLE
Time for insert (1000) 0 wallclock secs ( 0.02 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.02 CPU)
Time for alter_table_add (100): 17 wallclock secs ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
Time for create_index (8): 2 wallclock secs ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
Time for drop_index (8): 2 wallclock secs ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
Time for alter_table_drop (91): 16 wallclock secs ( 0.01 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.01 CPU)
Total time: 37 wallclock secs ( 0.03 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.03 CPU)

Testing the speed of creating and dropping tables
Time for create_MANY_tables (10000): 253 wallclock secs ( 0.26 usr 0.06 sys + 0.00 cusr 0.00 csys = 0.32 CPU)
Time to select_group_when_MANY_tables (10000): 1 wallclock secs ( 0.09 usr 0.07 sys + 0.00 cusr 0.00 csys = 0.16 CPU)
Time for drop_table_when_MANY_tables (10000): 1 wallclock secs ( 0.09 usr 0.03 sys + 0.00 cusr 0.00 csys = 0.12 CPU)
Time for create+drop (10000): 259 wallclock secs ( 0.24 usr 0.18 sys + 0.00 cusr 0.00 csys = 0.42 CPU)
Time for create_key+drop (10000): 255 wallclock secs ( 0.41 usr 0.11 sys + 0.00 cusr 0.00 csys = 0.52 CPU)
Total time: 769 wallclock secs ( 1.09 usr 0.45 sys + 0.00 cusr 0.00 csys = 1.54 CPU)

Drive write cache ENABLED

Testing of ALTER TABLE
Time for insert (1000) 0 wallclock secs ( 0.02 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.02 CPU)
Time for alter_table_add (100): 3 wallclock secs ( 0.01 usr 0.01 sys + 0.00 cusr 0.00 csys = 0.02 CPU)
Time for create_index (8): 1 wallclock secs ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
Time for drop_index (8): 0 wallclock secs ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
Time for alter_table_drop (91): 4 wallclock secs ( 0.01 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.01 CPU)
Total time: 8 wallclock secs ( 0.04 usr 0.01 sys + 0.00 cusr 0.00 csys = 0.05 CPU)

Testing the speed of creating and dropping tables
Time for create_MANY_tables (10000): 9 wallclock secs ( 0.34 usr 0.06 sys + 0.00 cusr 0.00 csys = 0.40 CPU)
Time to select_group_when_MANY_tables (10000): 1 wallclock secs ( 0.06 usr 0.03 sys + 0.00 cusr 0.00 csys = 0.09 CPU)
Time for drop_table_when_MANY_tables (10000): 1 wallclock secs ( 0.10 usr 0.04 sys + 0.00 cusr 0.00 csys = 0.14 CPU)
Time for create+drop (10000): 9 wallclock secs ( 0.44 usr 0.10 sys + 0.00 cusr 0.00 csys = 0.54 CPU)
Time for create_key+drop (10000): 10 wallclock secs ( 0.35 usr 0.10 sys + 0.00 cusr 0.00 csys = 0.45 CPU)
Total time: 30 wallclock secs ( 1.29 usr 0.33 sys + 0.00 cusr 0.00 csys = 1.62 CPU)

Drive write cache              DISABLED   ENABLED   Relative difference
sql-bench: test-alter-table    37 s       8 s       462%
sql-bench: test-create         769 s      30 s      2563%
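The "Relative difference" column appears to be the run time without cache expressed as a percentage of the run time with cache, cut down to a whole number. A small sketch of that arithmetic follows; the function name `slowdown_percent` is mine, and the truncation is an assumption inferred from the figures in the table.

```python
def slowdown_percent(disabled_secs, enabled_secs):
    """Run time without cache as a whole-number percentage of run time with cache.

    Assumption: the fraction is truncated, not rounded, which is what
    reproduces the figures in the table.
    """
    return disabled_secs * 100 // enabled_secs


# Total wall-clock seconds on the DL380 G5: (cache disabled, cache enabled).
results = {
    "test-alter-table": (37, 8),
    "test-create": (769, 30),
}

for name, (disabled, enabled) in results.items():
    print(f"{name}: {slowdown_percent(disabled, enabled)}%")
# test-alter-table: 462%
# test-create: 2563%
```

Read this way, 2563% means test-create ran roughly 25 times longer with the write cache disabled.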

Results on HP DL360 G4

Drive write cache DISABLED

Testing of ALTER TABLE
Time for insert (1000) 0 wallclock secs ( 0.02 usr 0.02 sys + 0.00 cusr 0.00 csys = 0.04 CPU)
Time for alter_table_add (100): 33 wallclock secs ( 0.01 usr 0.01 sys + 0.00 cusr 0.00 csys = 0.02 CPU)
Time for create_index (8): 4 wallclock secs ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
Time for drop_index (8): 3 wallclock secs ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
Time for alter_table_drop (91): 33 wallclock secs ( 0.01 usr 0.01 sys + 0.00 cusr 0.00 csys = 0.02 CPU)
Total time: 75 wallclock secs ( 0.04 usr 0.03 sys + 0.00 cusr 0.00 csys = 0.07 CPU)

Testing the speed of creating and dropping tables
Testing with 10000 tables and 10000 loop count
Time for create_MANY_tables (10000): 1035 wallclock secs ( 1.27 usr 0.25 sys + 0.00 cusr 0.00 csys = 1.52 CPU)
Time to select_group_when_MANY_tables (10000): 83 wallclock secs ( 0.63 usr 0.16 sys + 0.00 cusr 0.00 csys = 0.79 CPU)
Time for drop_table_when_MANY_tables (10000): 493 wallclock secs ( 0.50 usr 0.19 sys + 0.00 cusr 0.00 csys = 0.69 CPU)
Time for create+drop (10000): 958 wallclock secs ( 1.59 usr 0.38 sys + 0.00 cusr 0.00 csys = 1.97 CPU)
(NOTICE: Could not wait for the last test to finish, because the machine needed to get back in production,
thus I assume it to be around 900 seconds just to be on the safe side; it would probably be more.)
Total time calculated: around 3400 seconds
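The "around 3400 seconds" figure can be retraced by adding the four measured phases to the 900 seconds assumed for the phase that was cut short. In the sketch below the variable names are mine, and attributing the 900 seconds to create_key+drop is an inference from that phase being the only one missing from the output.

```python
# Measured wall-clock seconds from the DL360 G4 run without write cache.
measured_secs = {
    "create_MANY_tables": 1035,
    "select_group_when_MANY_tables": 83,
    "drop_table_when_MANY_tables": 493,
    "create+drop": 958,
}

# Assumed, not measured: the run was interrupted before the final phase
# (presumably create_key+drop) could finish.
ASSUMED_LAST_PHASE_SECS = 900

measured_total = sum(measured_secs.values())
estimated_total = measured_total + ASSUMED_LAST_PHASE_SECS

print(measured_total)   # 2569
print(estimated_total)  # 3469
```

The sum comes out slightly above the quoted 3400 seconds, so the comparison table built on that figure is, if anything, a little kind to the run without cache.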

Drive write cache ENABLED

Testing of ALTER TABLE
Time for insert (1000) 0 wallclock secs ( 0.02 usr 0.01 sys + 0.00 cusr 0.00 csys = 0.02 CPU)
Time for alter_table_add (100): 8 wallclock secs ( 0.02 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.02 CPU)
Time for create_index (8): 1 wallclock secs ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
Time for drop_index (8): 1 wallclock secs ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
Time for alter_table_drop (91): 8 wallclock secs ( 0.01 usr 0.01 sys + 0.00 cusr 0.00 csys = 0.02 CPU)
Total time: 18 wallclock secs ( 0.05 usr 0.02 sys + 0.00 cusr 0.00 csys = 0.06 CPU)

Testing the speed of creating and dropping tables
Time for create_MANY_tables (10000): 104 wallclock secs ( 1.07 usr 0.24 sys + 0.00 cusr 0.00 csys = 1.31 CPU)
Time to select_group_when_MANY_tables (10000): 27 wallclock secs ( 0.48 usr 0.18 sys + 0.00 cusr 0.00 csys = 0.66 CPU)
Time for drop_table_when_MANY_tables (10000): 53 wallclock secs ( 0.32 usr 0.09 sys + 0.00 cusr 0.00 csys = 0.41 CPU)
Time for create+drop (10000): 143 wallclock secs ( 1.31 usr 0.30 sys + 0.00 cusr 0.00 csys = 1.62 CPU)
Time for create_key+drop (10000): 164 wallclock secs ( 1.56 usr 0.36 sys + 0.00 cusr 0.00 csys = 1.92 CPU)
Total time: 491 wallclock secs ( 4.74 usr 1.18 sys + 0.00 cusr 0.00 csys = 5.92 CPU)

Drive write cache              ABSENT     ENABLED   Relative difference
sql-bench: test-alter-table    75 s       18 s      416%
sql-bench: test-create         3400 s     491 s     692%

Analysis

The test-alter-table results seem fine: slightly more than a fourfold improvement in performance on both machines. But what bothers me is the test-create difference. I expected the HP DL360 G4 to improve more and finish this test below the 100-second barrier; heck, I actually expected it below 50 seconds. It is true that this machine uses a different operating system and filesystem. But 500 seconds still seems too much to me, especially when the HP DL380 G5 excels at 30 seconds. If someone knows the answer, please drop it in the comments.
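One way to look at the gap is to divide the create_MANY_tables times from the results above by the 10,000 tables created, which gives a rough cost per table. The sketch below does just that; the helper name `ms_per_table` is mine, and because sql-bench reports whole seconds the small values are coarse.

```python
TABLES = 10000

# create_MANY_tables wall-clock seconds, taken from the results above.
create_many_secs = {
    ("DL380 G5", "disabled"): 253,
    ("DL380 G5", "enabled"): 9,
    ("DL360 G4", "disabled"): 1035,
    ("DL360 G4", "enabled"): 104,
}


def ms_per_table(total_secs, tables=TABLES):
    """Average wall-clock milliseconds spent per created table."""
    return total_secs * 1000 / tables


for (server, cache), secs in create_many_secs.items():
    print(f"{server}, cache {cache}: {ms_per_table(secs):.1f} ms per table")
```

Even with the cache enabled, the DL360 G4 spends about 10 ms per table, against roughly 1 ms on the DL380 G5.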

Conclusion

I believe this article has clearly shown why one must run even such synthetic tests before deploying systems to a production environment, and indeed before any “real-world benchmarks” are run. By “real-world benchmark” I mean a comparative benchmark of a particular application on the existing production systems and on the ones that are still in the testing phase. Hardware often goes without an upgrade for quite some time, which means that the new hardware is a few generations newer than the existing one. The new hardware is far more powerful, and one easily misses a not-so-innocent bottleneck if the “real-world benchmark” shows some improvement. Thus, I believe, newer systems MUST perform better than older ones in every synthetic benchmark (if the systems are comparable, of course), and only then can we start conducting “real-world benchmarks”.

How did I discover this “issue”?

Back in 2004 I deployed such an HP server to a colocation facility and only later discovered that it was performing worse than an old test machine lying under my desk. After a couple of hours of googling I assumed that the lack of BBWC was our problem. I had to order it and then remove the server from colocation, because I also wanted to upgrade all the firmware, just in case. On top of that, I still had to figure out how to install ‘hpacucli’ on a non-Red Hat Linux. After a long weekend the machine was back in production and never caused a single problem again.

MySQL sql-bench results

Tuesday, June 16th, 2009

UPDATE: A newer test was performed, with MySQL 5.6.25 on 64-bit Linux 4.2.3, on almost the same hardware. You can see it here.

This is a follow-up article to the MySQL Super Smack benchmark results. Results from the sql-bench benchmark suite can easily pinpoint some of the potential system bottlenecks. I find it especially useful for discovering filesystem performance or, better, slowness.

Results

Total execution time is: 562 seconds

# run-all-tests
alter-table: Total time: 8 wallclock secs ( 0.02 usr 0.01 sys + 0.00 cusr 0.00 csys = 0.03 CPU)
ATIS: Total time: 2 wallclock secs ( 1.20 usr 0.09 sys + 0.00 cusr 0.00 csys = 1.29 CPU)
big-tables: Total time: 5 wallclock secs ( 2.45 usr 0.08 sys + 0.00 cusr 0.00 csys = 2.53 CPU)
connect: Total time: 50 wallclock secs (12.74 usr 4.50 sys + 0.00 cusr 0.00 csys = 17.24 CPU)
create: Total time: 31 wallclock secs ( 1.20 usr 0.44 sys + 0.00 cusr 0.00 csys = 1.64 CPU)
insert: Total time: 397 wallclock secs (97.95 usr 13.61 sys + 0.00 cusr 0.00 csys = 111.56 CPU)
select: Total time: 44 wallclock secs ( 8.71 usr 0.88 sys + 0.00 cusr 0.00 csys = 9.59 CPU)
transactions: Test skipped because the database doesn’t support transactions
wisconsin: Total time: 3 wallclock secs ( 0.91 usr 0.23 sys + 0.00 cusr 0.00 csys = 1.14 CPU)
TOTALS 562.00 123.77 19.82 143.59 3425950
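To see at a glance which test dominates a summary like the one above, the per-test totals can be pulled out with a regular expression. This is a minimal sketch: the function name `parse_totals` and the pattern are mine, and they assume the "name: Total time: N wallclock secs" shape shown in this post, not a documented sql-bench format.

```python
import re

# Summary as printed in this post, shortened to what the parser needs.
SUMMARY = """
alter-table: Total time: 8 wallclock secs
ATIS: Total time: 2 wallclock secs
big-tables: Total time: 5 wallclock secs
connect: Total time: 50 wallclock secs
create: Total time: 31 wallclock secs
insert: Total time: 397 wallclock secs
select: Total time: 44 wallclock secs
transactions: Test skipped because the database doesn't support transactions
wisconsin: Total time: 3 wallclock secs
"""

# Assumed line shape: "<test name>: Total time: <seconds> wallclock secs".
TOTAL_RE = re.compile(r"([\w-]+): Total time:\s*(\d+) wallclock secs")


def parse_totals(text):
    """Return {test name: wall-clock seconds} for every test that reported a total."""
    return {name: int(secs) for name, secs in TOTAL_RE.findall(text)}


totals = parse_totals(SUMMARY)
slowest = max(totals, key=totals.get)
print(f"slowest test: {slowest} ({totals[slowest]} s)")
```

Skipped tests, such as transactions here, simply do not match the pattern and are left out of the result.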

System specification can be found here.

ReiserFS vs others

In the age of Linux kernel 2.4.x we used ReiserFS v3 as the filesystem of choice. With the available options being ReiserFS (journal, performance), ext2 (stable but slow) and ext3 (probably stable, but not as speedy as ReiserFS), the choice was obvious. I then skipped a few years, and this year tried ReiserFS again with Linux 2.6.29.1, but it turned out to be even slower than ext2 was in the old days. Googling around for an answer gave some hints that ReiserFS has an issue with something called BIG_KERNEL_LOCK on 2.6 kernels. I didn’t really investigate further, but went down the ext3 way.

Comments on test-create

If test-create takes much more time than, say, 30-60 seconds, then you definitely have a problem with filesystem write performance. On the HP DL360 and DL380 class of servers this correlates with the presence and activation of BBWC (Battery-Backed Write Cache enabler kit). Without BBWC, and hence without the write cache enabled, this test took more than 10 minutes to complete. Thus, if you are purchasing new HP servers, be sure that you also order BBWCs.

Question about test-insert

Looking at the test times, the test-insert result really stands out. Again, I do not have any other data to compare it to, but somewhere deep down in my memory I seem to remember that the total time for all the tests was around 300 seconds. This obviously means that the test-insert result is the bad guy here. Can someone comment on this result, or paste their own in the comments? Thanks.

Feedback

If you have any questions, recommendations or benchmark results to compare, do not hesitate to leave a comment.

UPDATE1: 2014-09-09 I forgot to mention explicitly that this system is running the 32-bit version of Slackware.

UPDATE2: 2014-09-09 Fortunately this system is still up and running. During these five years only the storage has been expanded, with 300GB 10K SAS drives in a RAID 1 configuration. Software was upgraded regularly and is currently on MySQL 5.5.39, with an upgrade to 5.6.20 pending. I retested test-create today and the result was 85 wall-clock seconds. This is almost 3x worse than initially. The server is currently lightly loaded.