SCSI vs IDE (was:Re: [vox-tech] Xen + LVM usage)

Bill Broadley bill at cse.ucdavis.edu
Thu Sep 14 18:43:50 PDT 2006


On Tue, Aug 08, 2006 at 10:15:52AM -0700, Luke Crawford wrote:
> SCSI, painful?  these massive fileservers you use, what is the 
> file-access concurrency on them?

Not sure what you want as an answer.  Yes, my fileservers serve many
clients, sometimes hundreds, and yes, they are often reading/writing
different files.

> I've got around 20 virtual servers (some under moderate load) on one 
> server w/ 10 18G 10K fibre drives. I've got another 40 on 6 10K u160 SCSI 
> drives.  Before I had customers, I had put 10 virtual servers on a SATA 
> disk system.  Even though the SATA system was running trivial 
> loads (http, dns, spam filtering and email for my personal stuff) the SATA 
> system would have periods of unresponsiveness, when you'd have to wait 10 
> seconds or more to open a 5K file with PINE.   On the SCSI system, I can 

Sounds like that system had issues.  It's not my experience that this
is normal behavior for SATA.  Were you using Linux?  A 2.6 kernel?

> count on opening 1GB mail folders in the same 10 seconds.  That's what I 

Again, Linux?  Same mail client?  Same mailbox format (mbox? maildir?)?
Caches cold in both cases?  Your machine was parsing and loading a
mailbox at 100MB/sec (1GB in 10 seconds)?  That's a bit faster than I'd
expect even with a rather healthy filesystem.

I just tried it with mutt.  This involves reading in a 472MB mbox
holding 23,609 messages (I didn't have 1GB of mail handy) and writing
said mailbox back out, caches cold, on a 2-disk RAID-1:
# ls -alh mailbox-test
-rw-r--r--  1 root root 472M Sep 14 18:06 mailbox-test
# time mutt -f mailbox-test
23609 kept, 0 deleted.
real    0m8.260s
user    0m2.697s
sys     0m3.460s
#

Caches warm (this might have only read the file and exited):
# time mutt -f mailbox-test
Mailbox is unchanged.
real    0m1.549s
user    0m1.060s
sys     0m0.127s
#
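
If you want to reproduce the cold-cache case without rebooting, 2.6.16
and later kernels let you throw away the page cache by hand; something
like this before the timed run (mailbox-test is just my test file from
above):

# sync
# echo 3 > /proc/sys/vm/drop_caches
# time mutt -f mailbox-test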

> need... predictable.   It doesn't have to be blazingly fast, but it does 
> need to degrade gracefully when overloaded.  IDE does not seem to do this.

That has not been my experience.

> Modern metadata ordering/caching filesystems (logging or softupdates or 
> whatever) all but eliminate the IDE penalty on write, but on read, you 
> still hit concurrency issues if you have too many users hitting the same 
> IDE disk.

Er, in both cases you are limited by head movement and rotational
delays: you can only read one block at a time, and seeks take a long
time.  In both cases you can have multiple outstanding requests (TCQ on
SCSI, NCQ on SATA), and in both cases the requests can be scheduled
elevator-style to get maximum throughput.
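
The scheduling half of that is the kernel's job either way; on a 2.6
box you can see (and switch) the elevator per device, e.g. (sda is just
an example, and your list of available schedulers may differ):

# cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq]
# echo deadline > /sys/block/sda/queue/scheduler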

> Now, if you only have a small number of concurrent accesses to the 
> filesystem, I agree, IDE is the better choice, simply because it is so 

IDE != SATA.

> cheap and so big.  Right now, I'm looking at used SATA -> fibre channel 
> cases on E-bay, with the intent of renting customers IDE disks one at a 

Sounds strange.  Why bother putting the SATA disk on the wrong end of a
fibre channel interface?  Current SATA RAID cards can read/write
800-900MB/sec (yes, megabytes, not Mb = megabits).  Why accept roughly
1/10th the bandwidth and higher latency with fibre channel?
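
If you want to sanity-check that kind of number, a crude streaming read
with dd straight off the array device is enough (md0 is just a
placeholder for whatever your array shows up as; drop caches first, as
above, so you measure the disks rather than RAM):

# echo 3 > /proc/sys/vm/drop_caches
# time dd if=/dev/md0 of=/dev/null bs=1M count=4096

4096MB divided by the elapsed seconds gives the sustained MB/sec.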

> time.  One business model I am considering is to use customer-owned disks; 
> perhaps requiring that they buy the disk from me + pay an up-front deposit 
> that is enough to cover return shipping and handling, then just charge 
> $15/month or so for an IDE slot in my SAN.  I  can then connect the IDE 
> disk to the virtual server the customer requests.  If they buy the disk 
> from me, I could even include a "I'll replace it within X hours of when 
> it dies, then handle the RMA myself at no charge" service, as that would 
> be easy for me to do. I just keep a couple extra disks around and swap 
> them as needed; then RMA all the disks once a month.  When they quit, I 
> mail the disk back to them, or buy it back for some pre-arranged 
> (time-based) fee.  This lowers my initial capital costs, and makes the 
> monthly cost of managed remote disk much more competitive with the 
> costs of throwing your own server full of IDE up somewhere else.

Sounds reasonable, although somewhat labor-intensive: tracking who owns
which disk, where it is physically, troubleshooting, sending/receiving
packages, etc.  I tend to just buy a 16-bay SATA chassis and put
whatever motherboard I want in it; anything particularly disk-intensive
I run on that machine, and anything less disk-intensive I run elsewhere
and access the storage over the network.

As a potential customer I'd rather pay $x per GB and have it just work,
like, say, tonight.

I lean towards putting two $100-ish SATA disks in a Dom-0 and then
depending on centralized storage for, er, centralized use.

For bandwidth, SCSI or SATA each manage 50MB/sec or so; for random
accesses, 100-200 per second.  I've not seen much difference.  Do you
have an actual reproducible workload that could generate this radical
difference?  (Above you quoted the same ~10 seconds for a 5K file on
SATA as for a 1GB folder on SCSI, a factor of 200,000 in data moved.)
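
Per disk, hdparm gives a quick read on the sequential half of that
(sda is just whichever drive you care about; -T times cached reads, -t
times actual reads off the platter):

# hdparm -tT /dev/sda

The -t figure should land around the 50MB/sec-or-so mentioned above for
a healthy drive of either flavor.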

I'd suggest postmark; it allows pretty much any kind of test you want:
random, sequential, whatever mix of read/write, and I think it supports
testing concurrent access as well.  Name your workload and I'll run it
on some hardware I have around.
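
For reference, postmark is driven by a handful of "set" directives,
either typed interactively or read from a small config file.  The
sketch below is from memory and the numbers are arbitrary, so
double-check the directive names against postmark's own help before
trusting it:

set location /mnt/test
set number 20000
set transactions 50000
run
quit

Something like "postmark myconfig" should then churn through a pile of
small-file creates, reads, appends and deletes and print the rates.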

> Of course, I'm low on both capital and time, so who knows when or if I 
> will implement it... but my point is that I'm not a total IDE bigot.

In my experience, for bandwidth SATA scales better, hitting > 800MB/sec
on RAID-5/6.  For small random accesses, performance scales with the
number of heads.  While SCSI has a slight edge with 15k RPM disks, it's
usually cheaper/faster to just buy more SATA heads: 16 SATA spindles
crush 8 15k RPM SCSIs, since twice as many heads more than makes up for
the slower rotation.

I manage a few tens of TBs of SATA and have been very happy with it
across varied types of use.  For the really intensive work there is
nothing like having the disks inside the box.

> Then we have the next big thing Serial attached SCSI-  from what I 
> understand, it looks an awful lot like those Raptor 10K drives.  The 
> interesting part here is that there are/will be SAN-style disk aggregation 
> technologies, and SAS is plug-compatible with SATA, so if I get a SAS SAN 
> up, I will be able to swap in SATA disks using the same attachment 
> technology.

Right.  SCSI finally gets the advantages of SATA: nice cheap cables,
cheap enclosures, and no more silliness with SCSI IDs, a zillion types
of cables, complex termination issues, etc.

I have a SAS array... hard to tell the difference compared to my SATA
arrays.

> So far the only SAS/SATA aggregation tech I've seen in the field is the 
> 3ware multi-lane cable that aggregates 4 SATA cables into one 
> semi-proprietary cable (Adaptec has something very similar, based on the 

Lame; I thought they used IB (InfiniBand-style) cables.  I've not tried
multi-lane yet; all my 16-bay boxes have 16 individual SATA cables.

> same standard, but the cables are subtly incompatible.  One of my customers 
> got the wrong one; I tried.)  and really, it's nothing to get 
> excited about.  Fibre channel is a mature, stable and inexpensive 

Agreed.  16-disk SATA fileservers aren't very new/exciting; replacing
SATA with SAS doesn't change much.

> technology in comparison.  (that is, if you buy used 1G fibre.)

1Gb fibre is cheap... for SCSI, but not what I'd call cheap in absolute
terms, especially when you consider power usage, rack space, complexity,
etc.  Monkeys can put a few SATA disks together ;-).

-- 
Bill Broadley
Computational Science and Engineering
UC Davis

