I fell in love with the MD5 hash algorithm because it exposes a very interesting characteristic of any system I want to benchmark. Almost every operation performed while computing an MD5 sum lies on the critical path: each step needs the result of the previous one. That makes an MD5 computation almost impossible to parallelize. And I'm not talking about execution in multiple threads, but about instruction-level parallelism (superscalar and vector computing). This property takes most of the modern tricks used in CPU cores (like out-of-order execution and specialized instruction sets) out of the equation and makes MD5 a perfect single-thread benchmark.
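To make the dependency chain concrete, here is a minimal Python sketch of MD5 written from the RFC 1321 pseudocode. It is only an illustration, not how md5sum is implemented, and the name `md5_sketch` is made up for this post. The thing to look at is the inner loop: every step reads the `a, b, c, d` values written by the step before it, and every 64-byte block starts from the state left by the previous block.

```python
import hashlib
import math
import struct

# Per-step rotation amounts and sine-derived constants, as defined in RFC 1321.
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]


def rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def md5_sketch(data: bytes) -> str:
    """Illustrative (slow) MD5; hypothetical helper, not a library API."""
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476

    # Padding: a 0x80 byte, zeros up to 56 mod 64, then the bit length.
    msg = data + b"\x80"
    msg += b"\x00" * ((56 - len(msg) % 64) % 64)
    msg += struct.pack("<Q", (len(data) * 8) & 0xFFFFFFFFFFFFFFFF)

    for off in range(0, len(msg), 64):
        m = struct.unpack("<16I", msg[off:off + 64])
        # This block starts from the state produced by the previous block.
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:
                f, g = (b & c) | (~b & d), i
            elif i < 32:
                f, g = (d & b) | (~d & c), (5 * i + 1) % 16
            elif i < 48:
                f, g = b ^ c ^ d, (3 * i + 5) % 16
            else:
                f, g = c ^ (b | ~d), (7 * i) % 16
            f = (f + a + K[i] + m[g]) & 0xFFFFFFFF
            # The new b needs the old a, b, c and d, so step i + 1
            # cannot begin before step i has finished.
            a, d, c = d, c, b
            b = (b + rotl(f, S[i])) & 0xFFFFFFFF
        a0 = (a0 + a) & 0xFFFFFFFF
        b0 = (b0 + b) & 0xFFFFFFFF
        c0 = (c0 + c) & 0xFFFFFFFF
        d0 = (d0 + d) & 0xFFFFFFFF

    return struct.pack("<4I", a0, b0, c0, d0).hex()


# Sanity check against the standard library.
print(md5_sketch(b"abc") == hashlib.md5(b"abc").hexdigest())
```

Within one step the additions, the rotation and the boolean function are themselves chained, so a wide core has very little independent work to pick up.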
Let’s see some numbers:
Calculate md5(10 GiB of zeroes) on an i5-760 (turbo frequency: 3.33 GHz, launch date Q3'10) running Ubuntu 14.04:
$ (dd if=/dev/zero bs=1M count=10k | md5sum >/dev/null) 2>&1 | tail -n1
10737418240 bytes (11 GB) copied, 22.8959 s, 469 MB/s
And then do the same on an i7-6700 (turbo frequency: 4.0 GHz, launch date Q3'15) running Ubuntu 15.10:
$ (dd if=/dev/zero bs=1M count=10k | md5sum >/dev/null) 2>&1 | tail -n1
10737418240 bytes (11 GB) copied, 17.325 s, 620 MB/s
So we have roughly 141 and 155 MB/s per GHz respectively (469 / 3.33 and 620 / 4.0). That is about a 10% performance boost per clock after 5 years of CPU evolution. And it looks so frustrating.
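For anyone who wants to redo the arithmetic, here is a small Python sketch using only the numbers from the two runs above. It assumes each CPU sat at its listed turbo frequency for the whole run, which was not measured here, and the names `runs` and `per_ghz` are just for illustration.

```python
# (throughput in MB/s, turbo frequency in GHz), copied from the dd output
# and the CPU specs quoted above.
runs = {
    "i5-760": (469, 3.33),
    "i7-6700": (620, 4.0),
}


def per_ghz(mb_per_s, ghz):
    """Throughput normalized by clock, assuming the core ran at turbo."""
    return mb_per_s / ghz


old = per_ghz(*runs["i5-760"])
new = per_ghz(*runs["i7-6700"])
gain = (new / old - 1) * 100  # relative per-clock improvement in percent

print(f"i5-760:  {old:.1f} MB/s per GHz")
print(f"i7-6700: {new:.1f} MB/s per GHz")
print(f"per-clock gain: {gain:.1f}%")
```

Normalizing by clock is crude, but it separates what the newer core does per cycle from the part of the speedup that is simply a higher frequency.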
P.S. Yep, I know that CPUs are now much smarter than 5 years ago and have a rich set of specialized instruction sets (like AES-NI, which is responsible for a +2200% GHASH calculation speedup). But any software developer should be ready for the fact that unparallelizable algorithms will not become even a bit faster in the near future.