Hi all,
The subject may be a bit misleading since I haven't investigated the whole
issue under other OSes (nor do I plan to), but this is how the story goes..

Those fancy new WD GreenPower drives seem to be suffering heavily from the
rapidly increasing head load/unload problem. And the bad thing is they don't
respond to 'hdparm -B', which would mean (I think) their power management
behaviour is solely up to their firmware.

I got one of them (WD5000AACS) recently and to my horror, after less than three
days of being powered on, this is what I saw:

  9 Power_On_Hours   0x0032 100 100 000 Old_age Always - 66
193 Load_Cycle_Count 0x0032 197 197 000 Old_age Always - 10233

At this rate the disk would reach its design limit for load/unload cycles in
around 80 days. Not good - so I implemented a lame workaround of keeping the
disk busy every couple of seconds - hopefully that won't kill it sooner than
the unloads would..

I am also currently talking with the first line of WD's tech support, trying to
get some data on how exactly those drives manage head unloading, but that may
not lead anywhere useful.

So in parallel I decided to ask here to see if someone knows something about
this?

If it matters, I am running vanilla 2.6.24 on that box and sata_sil is driving
that disk. Otherwise it is a pretty basic Ubuntu 7.10, a mix of ext3 and jfs
filesystems, all mounted with noatime. More detailed information available on
request.

Tvrtko
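
For reference, the "around 80 days" estimate can be reproduced with a little
arithmetic on the two SMART raw values. This is only a sketch: the function
name is made up for illustration, and the 300,000-cycle rating is an
assumption taken from the datasheet figure for desktop drives quoted later in
this thread.

```python
def days_until_limit(power_on_hours, load_cycles, rated_cycles):
    """Estimate the days of 24/7 operation left before the rated
    load/unload cycle count is reached, assuming the average rate
    observed so far stays constant."""
    cycles_per_hour = load_cycles / power_on_hours
    remaining_cycles = rated_cycles - load_cycles
    return remaining_cycles / cycles_per_hour / 24


if __name__ == "__main__":
    # Values from the SMART output above; 300_000 is the assumed rating.
    days = days_until_limit(power_on_hours=66, load_cycles=10233,
                            rated_cycles=300_000)
    print(f"about {days:.0f} days left at the current rate")
```

With the numbers from this post the result lands a little under 80 days,
which matches the estimate in the text.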
--
Yuck, how stupid!
But the solution is simple. Make sure to get a warranty of much more
than 80 days. Use RAID-1 (or backup often).
Just let those disks destroy themselves (they _are_ faulty) and
get new ones all the time. As long as they make them this stupid, you won't
have to buy new disks again. Free warranty replacements forever.

To be a bit more constructive, tell them about this strategy. Perhaps
they get busy fixing the firmware?

Helge Hafting
--
I have suggested exactly what you say - unfortunately the conversation has
gone cold since. Maybe they are busy already. :)

Tvrtko
--
Design *minimum* for load/unload cycles is reportedly 600K cycles,
that would give me over 7.5 years of 24/7 operation, which I can live
with in cheap high capacity drives that have only 5 years of warranty
anyway.

Support wouldn't say if the behavior is considered normal, though ...
Will set the timeout to 25sec using the DOS kludge and see what
happens ... doubt it will help much, though. I'd much rather like to
know what causes the drives to load heads again so soon after the
unload. With the usage pattern of these drives they should stay
unloaded most of the time here.

Being able to set the timeout online via hdparm would be nice, too :)
C.
--
That is what they said to me as well. But their datasheet says it is 300k for
desktop and 600k cycles for RAID edition drives. It also doesn't mention
a minimum, only "Controlled unloads at ambient condition". But that's probably
fine.. you are just lucky that you don't use them as system drives. As I
said before, with the timer set to 25.5s they are fine for me now - I am seeing
just two unload cycles per hour on average, which is less than before by a
factor of about 100.

Tvrtko
--
..
That can be arranged. I just need a copy of the DOS utility that does it now.
Cheers
--
I heard back from WD's tech support and received a DOS utility which can control
this drive feature, apparently using vendor-specific commands. With it, the head
unload timer can be disabled or set to a period between 100ms and 25.5s.

Of course I asked for more than a DOS utility, but the question really is how
this feature was intended to work with Windows, for example. Is there
something there which would prevent such rapid load/unload cycle growth, and
why isn't it documented somewhere?

I am kind of hoping that someone from WD is reading this list and will notice
this in case my effort with tech support fails.

Tvrtko
--
time, perhaps this is why they never had this trouble when testing..
(which is kind of weird, as one would think lots of people would be
--
I have the same disks, only with more warranty and supposedly
"enterprise" firmware (WD1000FYPS). The bug is the same, though, on
all four of them (~ 9 cycles /hour):

  9 Power_On_Hours   0x0032 099 099 000 Old_age Always - 917
193 Load_Cycle_Count 0x0032 198 198 000 Old_age Always - 8025

  9 Power_On_Hours   0x0032 099 099 000 Old_age Always - 906
193 Load_Cycle_Count 0x0032 198 198 000 Old_age Always - 7908

  9 Power_On_Hours   0x0032 099 099 000 Old_age Always - 907
193 Load_Cycle_Count 0x0032 198 198 000 Old_age Always - 8051

  9 Power_On_Hours   0x0032 099 099 000 Old_age Always - 918
193 Load_Cycle_Count 0x0032 198 198 000 Old_age

Have they confirmed it's a bug? I have no idea how often the disks
Why? Using a DOS boot disk once would not be that bad as a workaround,
would it? Or is the setting not persistent? Either way I'd prefer to
be able to leave the feature on (in usable form), eventually.

The bigger question is - why the rapid loads / unloads? In my case the
disks are idle hours at a time. No system files on there, just bulk
data. Does / could the controller (sata_sil24 in my case) have any
influence on this kind of power management?

Thanks,
C.
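
The cycles-per-hour figure quoted in this message can be derived directly from
the attribute listing. The sketch below assumes the text is in the usual
'smartctl -A' ten-column layout (ID, name, flag, value, worst, thresh, type,
updated, when_failed, raw value); the function names are made up for
illustration, and the sample is the first drive's listing (917 hours, 8025
cycles).

```python
def parse_smart_raw(text):
    """Map attribute name -> raw value for rows in the ten-column
    'smartctl -A' attribute layout. Other lines are ignored."""
    raw = {}
    for line in text.splitlines():
        fields = line.split()
        # A data row starts with a numeric ID and has the raw value
        # in the tenth column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            raw[fields[1]] = int(fields[9])
    return raw


def load_cycles_per_hour(text):
    """Average head load/unload cycles per power-on hour."""
    raw = parse_smart_raw(text)
    return raw["Load_Cycle_Count"] / raw["Power_On_Hours"]


SAMPLE = """\
  9 Power_On_Hours   0x0032 099 099 000 Old_age Always - 917
193 Load_Cycle_Count 0x0032 198 198 000 Old_age Always - 8025
"""

if __name__ == "__main__":
    print(f"{load_cycles_per_hour(SAMPLE):.2f} cycles/hour")
```

Rows whose raw value is missing or not a plain integer are skipped, so a
truncated listing raises a KeyError rather than producing a wrong rate.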
--
This DOS nonsense does not work anywhere outside x86.
I see people put *lots* of afterthought in their utilities, hardware even.

WD is not the first to do this unload nonsense, I've noticed it with
Toshiba before: http://lkml.org/lkml/2006/11/15/413 Since then, I am
dependent on the thkd[1] module and no hdparm is going to fix it; the
module does a dumb read every now and then on one device, causing
streaming performance to kink periodically, but it works at least at
keeping the disk alive. (Reason it's in kernel: better to have even when
userspace is not running.) Improvements welcome.

[1] ftp://ftp5.gwdg.de/pub/linux/misc/suser-jengelh/rawkernel/
somewhere in there since recently.
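
For anyone who wants the same keep-alive effect without a kernel module, a
userspace approximation might look like the sketch below. It is not the thkd
module and does not do a raw read; it swaps in a small write followed by
fsync, because an ordinary read would usually be served from the page cache
and never reach the drive. The function names, the file path and the
5-second interval are all assumptions, and whether a write absorbed by the
drive's own cache resets the unload timer is untested here.

```python
import os
import time


def poke_disk(path):
    """Write a timestamp to `path` and fsync it, so the kernel pushes
    the write out to the device instead of leaving it in the page cache."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, str(time.time()).encode())
        os.fsync(fd)
    finally:
        os.close(fd)


def keep_busy(path, interval_s=5.0, iterations=None):
    """Poke the disk every `interval_s` seconds. Runs forever when
    `iterations` is None; returns the number of pokes otherwise."""
    count = 0
    while iterations is None or count < iterations:
        poke_disk(path)
        count += 1
        time.sleep(interval_s)
    return count


if __name__ == "__main__":
    # Hypothetical path: any file on a filesystem that lives on the
    # affected drive would do.
    keep_busy("/mnt/greenpower/.keepalive")
```

Unlike the in-kernel approach, this stops working whenever userspace is not
running, and it adds a small amount of write traffic to the drive.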
--
..
Indeed. Seagate beat them to it by a decade or so.
Hence the hdparm -Z flag for old Seagate drives. :)
--
