May 7th, 2008 . by David Soergel
The problem
While searching for a way to back up one of my Linux boxes to Amazon S3, I was surprised that none of the many backup methods and scripts I found online did what I wanted, so I wrote yet another one.
Posted in Sysadmin
March 14th, 2006 . by David Soergel
The Symptoms
I have two 3ware 8506-4 controllers, each with three disks attached. I’ve been getting high iowait on my box whenever there’s an even slightly disk-intensive task going on. The weird thing is that even processes that don’t touch the heavily used disk get stalled in iowait. As a result, load skyrockets and the system becomes unresponsive.
Inducing disk load with bonnie++ and watching things with “iostat -x” revealed that even disks with extremely low bandwidth usage and very few I/O requests end up at 100% utilization. However, only the disks on the same controller as the high-usage disk are affected. The observed total throughput is around 50 MB/sec, i.e., the limit of the high-usage disk. It’s nowhere near the bandwidth limit of the controller, the SATA channel, or the PCI bus. So something is wrong with the controller that effectively limits it to driving one disk at a time with reasonable performance.
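To make that symptom easier to spot, here is a rough Python sketch that scans “iostat -x” style output for disks that report near-100% utilization while handling very few requests. It assumes the report has a header row beginning with “Device” and columns named r/s, w/s and %util; the function name, the thresholds, and the sample text are made up for illustration and are not real measurements from my box.

```python
def find_stalled_disks(iostat_text, util_threshold=90.0, max_iops=20.0):
    """Return devices that look saturated (%util) despite few requests.

    Hypothetical helper: thresholds are arbitrary illustration values.
    """
    stalled = []
    columns = None
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("Device"):
            columns = fields  # the header row names the columns
            continue
        if columns is None or len(fields) != len(columns):
            continue  # skip CPU summaries and other non-device lines
        row = dict(zip(columns[1:], fields[1:]))
        try:
            iops = float(row["r/s"]) + float(row["w/s"])
            util = float(row["%util"])
        except (KeyError, ValueError):
            continue  # not a device row we know how to read
        if util >= util_threshold and iops <= max_iops:
            stalled.append(fields[0])
    return stalled


# Simplified, invented sample: sda is the busy disk, sdb sits on the same
# controller and is nearly idle, sdd is on the other controller.
SAMPLE = """\
Device:  r/s    w/s   %util
sda      1.0  310.0  100.0
sdb      0.5    2.0  100.0
sdd      0.5    2.0    1.2
"""

if __name__ == "__main__":
    print(find_stalled_disks(SAMPLE))
```

The busy disk itself is not flagged, since its high utilization is explained by its request rate; only the nearly idle disk that still shows as saturated is reported.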
The kernel is the current CentOS 3.6 one, 2.4.21-37.0.1.ELsmp.
The write cache is enabled on the controllers.
This seems to be related to at least some of the posts on the infamous RHEL vs. 3ware performance bug.
The Diagnosis
Apparently, what is happening is that the four SATA channels on the controller share a single request queue, with a depth of 256 requests. When requests to one disk arrive at a high rate, they fill the queue, thereby starving the other disks.
Controllers from other manufacturers typically limit the number of requests to a single disk to some number smaller than the total queue depth for the controller. Thus, the queue for a single disk can get full without blocking access to the other disks. Why 3ware made the default per-disk queue depth the same as the depth for the whole controller is beyond me. Fortunately, this is correctable.
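The starvation effect can be illustrated with a toy simulation in Python. This is only a sketch of the queueing idea, not a model of the actual 3ware firmware or driver: the 256-slot shared queue comes from the description above, while the arrival rates, the one-request-per-tick service rate, the cap of 64, and the assumption that the busy disk's requests are always submitted first are all invented for the example.

```python
def simulate(per_disk_cap, total_depth=256, ticks=1000, heavy_rate=8):
    """Count requests that find no free queue slot, per disk.

    Toy model: two disks share one controller queue of total_depth slots.
    "heavy" submits heavy_rate requests per tick, "light" submits one.
    Each disk completes at most one request per tick.
    """
    queued = {"heavy": 0, "light": 0}
    rejected = {"heavy": 0, "light": 0}
    # Assumption: the busy disk's requests arrive first in every tick.
    arrivals = (("heavy", heavy_rate), ("light", 1))
    for _tick in range(ticks):
        for disk, count in arrivals:
            for _req in range(count):
                total = queued["heavy"] + queued["light"]
                if total < total_depth and queued[disk] < per_disk_cap:
                    queued[disk] += 1
                else:
                    rejected[disk] += 1  # the submitter would block here
        for disk in queued:
            if queued[disk] > 0:
                queued[disk] -= 1  # one completion per disk per tick
    return rejected


if __name__ == "__main__":
    # Per-disk depth equal to the controller depth (the default described above)
    print("cap 256:", simulate(per_disk_cap=256))
    # Per-disk depth smaller than the controller depth
    print("cap 64: ", simulate(per_disk_cap=64))
```

With the per-disk cap equal to the controller depth, the busy disk eventually occupies every slot and the lightly used disk's requests start being turned away even though that disk is idle. With a smaller per-disk cap, the busy disk can only ever hold its own share, so the other disk always finds a free slot.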
Posted in Sysadmin