Tuesday, June 8, 2010

Slower ZFS scrubs/resilver on the way!

A nice change has made its way into ON. It fixes a few bugs affecting general user performance while a scrub or resilver is running in a pool. The scan thread is now delayed if there was non-scrub I/O within the last 50 ticks. Resilvers are considered more important and are delayed only 2 ticks, whereas scrubs are delayed 4 ticks. This should allow scrubs and resilvers to run at full speed when possible while limiting their impact on other I/O. As with many other ZFS features, this will probably make its debut in an update to the S7000 series firmware. Before this fix, a failed drive or an ordinary scrub could have quite a negative impact on other I/O in the pool.
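To make the throttling rule concrete, here is a minimal sketch in C. It is not the code from the changeset: the function and constant names are hypothetical, only the tick values (50, 4 and 2) come from the description above, and treating exactly 50 ticks as still "recent" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Tick values as described in the post; the names are hypothetical. */
enum {
    SCAN_IDLE_TICKS = 50,      /* window in which other I/O counts as recent */
    SCRUB_DELAY_TICKS = 4,     /* pause applied to a scrub */
    RESILVER_DELAY_TICKS = 2   /* shorter pause applied to a resilver */
};

/*
 * Return how many ticks the scan thread should pause before issuing its
 * next I/O. `now` is the current tick count and `last_io` is the tick at
 * which the most recent time-sensitive (non-scrub) I/O was seen.
 */
int64_t
scan_delay_ticks(int64_t now, int64_t last_io, bool is_resilver)
{
    assert(now >= last_io);

    /* No competing I/O inside the window: run at full speed. */
    if (now - last_io > SCAN_IDLE_TICKS)
        return 0;

    /* Competing I/O was seen recently: back off, less so for a resilver. */
    return is_resilver ? RESILVER_DELAY_TICKS : SCRUB_DELAY_TICKS;
}
```

With these values, a scrub that saw user I/O 10 ticks ago pauses for 4 ticks, a resilver in the same situation pauses for 2, and either one runs undelayed once the pool has been quiet for more than 50 ticks.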

"* We keep track of time-sensitive I/Os so that the scan thread
* can quickly react to certain workloads. In particular, we care
* about non-scrubbing, top-level reads and writes with the following
* characteristics:
* - synchronous writes of user data to non-slog devices
* - any reads of user data
* When these conditions are met, adjust the timestamp of spa_last_io
* which allows the scan thread to adjust its workload accordingly."
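The conditions in that comment can be read as a simple predicate. The sketch below is only an illustration of that reading: the `io_desc_t` structure and both function names are hypothetical, and the real code works on ZFS's own I/O structures rather than on a flat set of flags like this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified description of one top-level I/O. */
typedef enum { IO_READ, IO_WRITE } io_kind_t;

typedef struct {
    io_kind_t kind;
    bool is_scan;       /* issued by a scrub or resilver */
    bool is_user_data;  /* carries user data rather than metadata */
    bool is_sync;       /* synchronous write */
    bool to_slog;       /* targets a separate log (slog) device */
} io_desc_t;

/* True for the I/Os the comment singles out as time-sensitive. */
bool
io_is_time_sensitive(const io_desc_t *io)
{
    if (io->is_scan || !io->is_user_data)
        return false;
    if (io->kind == IO_READ)
        return true;                     /* any read of user data */
    return io->is_sync && !io->to_slog;  /* sync write to a non-slog device */
}

/* Record the time of the last time-sensitive I/O (the role of spa_last_io). */
void
note_io(int64_t *last_io, int64_t now, const io_desc_t *io)
{
    if (io_is_time_sensitive(io))
        *last_io = now;
}
```

Under this reading, asynchronous writes and writes that land on a slog device leave the timestamp untouched, so they do not cause the scan thread to back off.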


hg.genunix.org/onnv-gate.hg/rev/b118bbd65be9
6743992 scrub/resilver causes systemic slowdown
