I asked an interviewee how long it takes for a disk seek, and he replied that he thought it was between 4 and 8 milliseconds. Speaking with colleagues, the conventional wisdom was that it was around 10 ms. I was unsatisfied so I thought I would try it for myself.
time sudo perl -we '$disk="/dev/sda"; $n=1500; $blocks=`blockdev --getsz $disk`; if (!$blocks) {print "Enter capacity in manufacturer GB\n: "; $blocks=1953125*(<>)}; use Time::HiRes "time"; open(DISK, "<", $disk) or die "Cannot open $disk: $!\n"; $start=time; for (1..$n) {sysseek(DISK, int(rand($blocks))*512, 0) or die "Seek failed: $!\n"; sysread(DISK, my $buf, 512) or die "Read failed: $!\n"; $now=time; $times[int(($now-$start)*1000)]++; $x=$now-$start; $s2+=$x**2; $s+=$x; $start=$now}; for (0..$#times) {if ($t=$times[$_]) { $tot+=$t; $median||=$_ if $tot>=$n/2; printf "%3d %s\n", $_, "x" x ($t/2) . ($t%2?":":"")}}; printf "\nTook %3.4gs for %d seeks of %s (%d GB)\n", $s, $n, $disk, $blocks/2097152; printf "Mean: %2.03gms; Median: %d-%dms; Std dev: %2.03gms\n", 1000*$s/$n, $median-1, $median, 1000*sqrt($s2/$n - ($s/$n)**2);'
It should be obvious that before you run this, you should check for yourself that it doesn't do anything dangerous. Or at least check that $disk is set appropriately for your hardware and operating system. If you don't have blockdev, estimate the number of 512-byte blocks and set $blocks to that value. (There are 1953125 blocks in a hard disk manufacturer’s "Gigabyte".)
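The two constants in the script both follow from the 512-byte block size. Here is a quick sanity check of that arithmetic, written in Python rather than Perl purely for readability; the function names are my own, made up for illustration.

```python
BLOCK_SIZE = 512  # bytes per block; `blockdev --getsz` reports size in 512-byte sectors


def blocks_per_manufacturer_gb():
    """Blocks in a decimal gigabyte (10**9 bytes), as disk vendors count."""
    return 10**9 // BLOCK_SIZE


def blocks_per_binary_gb():
    """Blocks in a binary gigabyte (2**30 bytes), the divisor in the summary line."""
    return 2**30 // BLOCK_SIZE


def manufacturer_gb_to_blocks(gb):
    """Estimate the block count of a disk sold as `gb` gigabytes."""
    return int(gb * blocks_per_manufacturer_gb())


def reported_gb(blocks):
    """Capacity the script prints, in whole binary gigabytes."""
    return blocks // blocks_per_binary_gb()
```

For a disk sold as 160 "GB", `manufacturer_gb_to_blocks(160)` gives 312500000 blocks and `reported_gb(312500000)` gives 149, which is the figure the script prints for my drive.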
The graph it produces prints an 'x' for every two seeks that took a given number of milliseconds, plus a trailing ':' if there was one left over. For me, on my one-year-old Linux 2.6.18 workstation with a 160 "GB" Western Digital (WDC WD1600JS, quoted seek time 8.9 ms), the typical output is:
  0 :
  3 :
  4 :
  5 x:
  6 xxxx:
  7 xxxxxxx
  8 xxxxxxxxxxx:
  9 xxxxxxxxxxxxxxxxx
 10 xxxxxxxxxxxxxxxxxxxxxxx
 11 xxxxxxxxxxxxxxxxxxxxxxxxxxxx:
 12 xxxxxxxxxxxxxxxxxxxxxxxxx
 13 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
 14 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 15 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
 16 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 17 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
 18 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
 19 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 20 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 21 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 22 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
 23 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
 24 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
 25 xxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
 26 xxxxxxxxxxxxxxxxxx
 27 xxxxxxxxxxxxxxx
 28 xxxxxx:
 29 xxxxx:
 30 xxxx
 31 :
 32 :
 34 x
 35 :
 57 :

Took 27.69s for 1500 seeks of /dev/sda (149 GB)
Mean: 18.5ms; Median: 17-18ms; Std dev: 5.24ms

real 0m27.859s
user 0m0.116s
sys 0m0.044s
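In case the drawing rule is unclear, here is a small sketch of how each row of the graph is built. It is in Python rather than Perl for readability, and `render_histogram` is a name I made up for illustration; it mirrors the one-liner's bucketing but is not a copy of it.

```python
from collections import Counter


def render_histogram(latencies_s):
    """Bucket latencies (in seconds) by whole milliseconds, one row per non-empty bucket."""
    buckets = Counter(int(t * 1000) for t in latencies_s)
    rows = []
    for ms in sorted(buckets):
        count = buckets[ms]
        # One 'x' per pair of seeks, plus ':' when an odd seek is left over.
        bar = "x" * (count // 2) + (":" if count % 2 else "")
        rows.append(f"{ms:3d} {bar}")
    return rows
```

Three seeks of between 5 and 6 ms and one of between 6 and 7 ms therefore come out as a `5 x:` row followed by a `6 :` row.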
Why might I care? 18 milliseconds is almost a lifetime compared with anything else a modern PC does. It’s slower than my monitors' refresh period! I can read around 1 MB from disk, about 2 MB over a GigE link (which tops out at 125 MB/s) or 20 MB from RAM in this time. I can ping from Ireland to England, crossing 40 routers there and back, in the time it takes for my disk head to seek.
Every time you read something from a previously unread file, it costs on average EIGHTEEN MILLISECONDS before any data arrives. That’s just 55 seeks per second. This has obvious implications for the performance of I/O-bound programs, which includes most servers.
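To put numbers on that, here is the arithmetic as a Python sketch (Python rather than Perl for readability). The function names are hypothetical, and the assumption of exactly one seek per cold file is a simplification of mine; a real filesystem may need more than one.

```python
def seeks_per_second(mean_seek_ms):
    """How many back-to-back random seeks fit into one second."""
    return 1000.0 / mean_seek_ms


def cold_open_cost_s(n_files, mean_seek_ms):
    """Seconds spent on seeking alone, assuming one seek per previously unread file."""
    return n_files * mean_seek_ms / 1000.0
```

With my 18 ms mean, `seeks_per_second(18)` is a little over 55, and `cold_open_cost_s(10000, 18)` says that touching 10,000 cold files costs about 180 seconds before a single useful byte has been read.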
If anyone has a concern about my method, I'd be interested to hear it.
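To make the method easier to pick apart, here is a rough restatement of the measuring loop in Python (used instead of Perl only because it is easier to read spread over several lines). Treat it as a sketch under assumptions, not as the script itself: `time_random_reads` and its parameters are names I invented, it relies on `os.pread` (available on Unix-like systems), and it times only the read call, whereas the one-liner measures from the end of one read to the end of the next.

```python
import os
import random
import time

BLOCK_SIZE = 512  # same block size the one-liner assumes


def time_random_reads(path, n=1500, size_bytes=None):
    """Return wall-clock durations (seconds) of n one-block reads at random offsets."""
    fd = os.open(path, os.O_RDONLY)
    try:
        if size_bytes is None:
            # Seeking to the end reports the size of a regular file;
            # on Linux it also works for block devices.
            size_bytes = os.lseek(fd, 0, os.SEEK_END)
        blocks = size_bytes // BLOCK_SIZE
        if blocks < 1:
            raise ValueError("target is smaller than one block")
        latencies = []
        for _ in range(n):
            offset = random.randrange(blocks) * BLOCK_SIZE
            start = time.perf_counter()
            data = os.pread(fd, BLOCK_SIZE, offset)
            latencies.append(time.perf_counter() - start)
            if len(data) != BLOCK_SIZE:
                raise IOError("short read at offset %d" % offset)
        return latencies
    finally:
        os.close(fd)
```

Like the one-liner, this reads through the operating system's page cache, so pointing it at a small file mostly measures memory rather than the disk. Something like `time_random_reads("/dev/sda")` run as root is the comparable case.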
I'd also like to see the timings for different disks. Either post the whole histogram or just the stats, plus the make and model of the disk. On Linux you can get this from dmesg or hdparm -I device.
Update: I've made the script more robust under non-existence of blockdev.
I forgot: my drives are both Hitachi 7200 rpm.
The laptop’s is a 100 GB drive, the server’s a 120 GB one…
Weird… I can't post a message with the results…
OK, there appears to be some problem with posting comments. Jermy or JC, can you post or mail to me (ads@wompom.org) a description of what you did and the site’s response?
Thanks
This is a reason to turn off the disk cache.
anonymous: The graph’s curve would start much higher up if the readahead cache were to blame. Just to test it, I tried again and the results were within bounds:
Mean: 18.5ms; Median: 17-18ms; Std dev: 11ms
Jermy has given me the results for one of his 500GB disks. He notes that the first time he ran the script, the median was much lower than the mean and he had a 7000-ms seek as an outlier. When he ran it again (after it had spun up) this vanished.
Disk is a Seagate 500GB, Model ST3500641AS
1500 seeks of /dev/sda (465 GB)
Mean: 13.7ms; Median: 12-13ms; Std dev: 3.9ms
Took 25.26s for 1500 seeks of /dev/sda (74 GB)
Mean: 16.8ms; Median: 14-15ms; Std dev: 8.79ms
The drive is a Seagate ST3808110AS.
JC gave me the following info:
The results on my laptop (running OpenSuse 10.2) :
Took 0.7388s for 1500 seeks of /dev/sda (0 GB)
Mean: 0.493ms; Median: 0-1ms; Std dev: 3.29ms
real 0m7.091s
user 0m0.130s
sys 0m0.040s
The results on my server (running Debian 4.0) :
Took 0.2398s for 1500 seeks of /dev/sda (0 GB)
Mean: 0.16ms; Median: 0-1ms; Std dev: 0.954ms
real 0m0.354s
user 0m0.076s
sys 0m0.028s
I am rather surprised that you have such a slow access time with your disk!
Took 19.87s for 1500 seeks of /dev/sda (149 GB)
Mean: 13.2ms; Median: 12-13ms; Std dev: 4.82ms
Seagate ST3160827AS
On my server with SCSI Ultra320 drives, Seagate Model ST3146707LC
Took 13.11s for 1500 seeks of /dev/sda (136 GB)
Mean: 8.74ms; Median: 7-8ms; Std dev: 4.47ms
Took 20.21s for 1500 seeks of /dev/sda (232 GB)
Mean: 13.5ms; Median: 12-13ms; Std dev: 4.76ms
Seagate SATA ST3250620AS
Just one pointer, the results seem to vary quite a lot. After I ran the above I ran it twice again in succession. The first run looked like this:
Took 22.89s for 1500 seeks of /dev/sda (232 GB)
Mean: 15.3ms; Median: 12-13ms; Std dev: 6.75ms
And then the second run:
Took 19.98s for 1500 seeks of /dev/sda (232 GB)
Mean: 13.3ms; Median: 12-13ms; Std dev: 3.7ms
As you can see, performance seems to vary quite a lot so the results should be taken with a grain of salt (like most performance/benchmarking tests). The median value seems to be the one worth taking note of.
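The point about the median can be shown with a small Python sketch (Python instead of the script's Perl, for readability). The numbers below are hypothetical, chosen only to resemble the runs in this thread plus one spin-up style outlier; `summarize` is an illustrative name, and it uses the population standard deviation because that is what the script's formula computes.

```python
import statistics


def summarize(latencies_ms):
    """Mean, median and population standard deviation of a list of latencies."""
    return {
        "mean": statistics.mean(latencies_ms),
        "median": statistics.median(latencies_ms),
        "stdev": statistics.pstdev(latencies_ms),
    }


# Hypothetical samples in milliseconds, not measured data.
steady = [12.0, 13.0, 12.5, 13.5, 12.0, 13.0, 12.5, 13.5, 12.0]
with_outlier = steady + [7000.0]  # one very slow seek, e.g. while the disk spins up
```

Comparing `summarize(steady)` with `summarize(with_outlier)`, the median only moves from 12.5 to 12.75 ms, while the mean jumps from under 13 ms to over 700 ms. A single slow seek is enough to wreck the mean and the standard deviation.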
[…] There is a nice script to measure the access time of your disk. […]
You should send this to the hdparm maintainers. Perhaps it could be included as an example.
Just ran the script again on one partition at a time. As you might expect, the access time goes down, but I couldn't get any better than this:
Took 10.95s for 1500 seeks of /dev/md1 (1 GB)
Mean: 7.3ms; Median: 6-7ms; Std dev: 3.39ms
The underlying hard disk partitions showed the same results, so mirroring isn't beneficial in this case. (It’s a linear, blocking script so that’s hardly surprising.)