Re: ECC and DMA to/from disk controllers

To: Alan Cox <alan@...>
Cc: Bruce Allen <ballen@...>, Linux Kernel Mailing List <linux-kernel@...>, Bruce Allen <bruce.allen@...>
Date: Friday, September 14, 2007 - 5:32 am

* Alan Cox (alan@lxorguk.ukuu.org.uk) [20070910 14:54]:

Alan,

Thanks for your interest (and Bruce, for posting).


Do you have any contacts?  We're in contact directly with the
system integrators only, not the drive manufacturers.


All our data is based on system-local probes (i.e. no network
involved).


Thanks, that's new information.  I was planning to extend fsprobe
with locality information inside the buffers so that we can catch
this as it is happening.
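
To illustrate what "locality information inside the buffers" could
mean, here is a minimal sketch.  It is not fsprobe's actual format:
the block size, the header layout, the MAGIC marker and the function
names are all assumptions made for the example.  Each block carries
the pass number and the byte offset it belongs to, so a bad block can
be classified as stale, misplaced, or plainly corrupted.

```python
import struct

BLOCK = 512                      # assumed stamping granularity (one sector)
HEADER = struct.Struct("<4sIQ")  # magic, pass number, byte offset of block
MAGIC = b"FSPB"                  # hypothetical marker, not fsprobe's format


def make_buffer(nblocks, pass_no, start_offset=0, fill=0x55):
    """Build a buffer whose every block records where it belongs."""
    out = bytearray()
    for i in range(nblocks):
        offset = start_offset + i * BLOCK
        head = HEADER.pack(MAGIC, pass_no, offset)
        out += head + bytes([fill]) * (BLOCK - HEADER.size)
    return bytes(out)


def check_buffer(data, pass_no, start_offset=0, fill=0x55):
    """Return a list of (expected_offset, reason) for every bad block."""
    bad = []
    payload = bytes([fill]) * (BLOCK - HEADER.size)
    for i in range(len(data) // BLOCK):
        expected = start_offset + i * BLOCK
        block = data[i * BLOCK:(i + 1) * BLOCK]
        magic, seen_pass, seen_off = HEADER.unpack_from(block)
        if magic != MAGIC:
            bad.append((expected, "header destroyed"))
        elif seen_pass != pass_no:
            bad.append((expected, f"stale data from pass {seen_pass}"))
        elif seen_off != expected:
            bad.append((expected, f"misplaced block from offset {seen_off}"))
        elif block[HEADER.size:] != payload:
            bad.append((expected, "payload corrupted"))
    return bad
```

With stamps like these, a block that reads back with the right magic
but the wrong offset or pass number points at misdirected or stale
data rather than random bit flips.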


We tried to “force” these corruptions out of their hiding
places on targeted systems, but we failed miserably.  Currently we
can't reproduce the issue at will, even on the affected systems.


That's interesting, I'll think about how to expose this.
Currently a single pass writes the data only once, so I don't think
any chunk can live for hours in the drives' caches.
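
For illustration only, a single write-once pass could look roughly
like the sketch below.  This is an assumption about the general
shape of such a probe, not fsprobe's code: the function name, the
4096-byte compare granularity and the plain buffered read-back are
all made up for the example.  Note that a buffered read like this
may well be served from the page cache rather than from the disk.

```python
import os


def probe_pass(path, pattern, size, chunk=4096):
    """Write `size` bytes of a repeating pattern once, sync, read back.

    Returns the byte offsets of all chunks that differ on read-back.
    """
    data = (pattern * (size // len(pattern) + 1))[:size]

    # The data is written exactly once per pass, then pushed to the device.
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    with open(path, "rb") as f:
        back = f.read()

    # A short read also shows up here, because the slices then differ.
    return [off for off in range(0, size, chunk)
            if back[off:off + chunk] != data[off:off + chunk]]
```

Since each chunk is written only once per pass, the time it can sit
in any cache is bounded by the duration of the pass.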


They seem to be popping up more frequently on ARECA-based boxes.  The
“software” is a moving target as we gradually upgrade the
computer center.


Most of our workhorses use 3ware controllers; the CPU nodes
usually have Intel SATA chips.

The fsprobe utility we run in the background on practically all
our boxes is available at http://cern.ch/Peter.Kelemen/fsprobe/ .
We have it deployed on several thousand machines to gather data.
I know that some other HEP institutes looked at it, but I have no
information on who's running it on how many boxes, let alone what
it found.  I would be very much interested in whatever findings
people have.

Peter

-- 
    .+'''+.         .+'''+.         .+'''+.         .+'''+.         .+''
 Kelemen Péter     /       \       /       \     Peter.Kelemen@cern.ch
.+'         `+...+'         `+...+'         `+...+'         `+...+'

Messages in current thread:
ECC and DMA to/from disk controllers, Bruce Allen, (Mon Sep 10, 8:19 am)
Re: ECC and DMA to/from disk controllers, linux-os (Dick Johnson), (Mon Sep 10, 2:05 pm)
Re: ECC and DMA to/from disk controllers, Alan Cox, (Mon Sep 10, 9:54 am)
Re: ECC and DMA to/from disk controllers, KELEMEN Peter, (Fri Sep 14, 5:32 am)