Is It Hopeless?

Previous thread: by kernel.majianpeng on Friday, December 24, 2010 - 7:49 pm. (4 messages)

Next thread: md-raid and block sizes by D.S. Ljungmark on Monday, December 27, 2010 - 5:00 am. (9 messages)
From: Carl Cook
Date: Sunday, December 26, 2010 - 11:19 am

I went in to turn on my home theater system today, and found a blank screen.  I rebooted and it would not mount /home, which is a 4TB RAID10 array with every movie and show I've recorded over the past two years.  When I try to mount it manually, I get "wrong fs or bad superblock".  The array is getting set up fine, but the filesystem seems to be destroyed.

Unbelievable.  This isn't supposed to happen.  It happened once before when I wasn't using RAID, but that was the BTRFS filesystem and I blamed it for being pre-release.  But now it's RAID10 with JFS.

The only sign of trouble:
Dec 25 16:14:56 cygnus shutdown[2180]: shutting down for system reboot
Dec 25 16:14:58 cygnus kernel: [16607.840197] md: md2 stopped.
Dec 25 16:14:58 cygnus kernel: [16607.840210] md: unbind<sdb3>
Dec 25 16:14:58 cygnus kernel: [16607.852029] md: export_rdev(sdb3)
Dec 25 16:14:58 cygnus kernel: [16607.852083] md: unbind<sdc3>
Dec 25 16:14:58 cygnus kernel: [16607.864031] md: export_rdev(sdc3)
Dec 25 16:14:58 cygnus kernel: [16607.864092] md2: detected capacity change from 1913403736064 to 0
Dec 25 16:15:00 cygnus kernel: Kernel logging (proc) stopped.

Reboot:
Dec 25 16:15:48 cygnus kernel: [    1.156657] Uniform CD-ROM driver Revision: 3.20
Dec 25 16:15:48 cygnus kernel: [    1.464298] md: raid10 personality registered for level 10
Dec 25 16:15:48 cygnus kernel: [    1.469307] md: md2 stopped.
Dec 25 16:15:48 cygnus kernel: [    1.470540] md: bind<sdc3>
Dec 25 16:15:48 cygnus kernel: [    1.470642] md: bind<sdb3>
Dec 25 16:15:48 cygnus kernel: [    1.471381] raid10: raid set md2 active with 2 out of 2 devices
Dec 25 16:15:48 cygnus kernel: [    1.476048] md2: bitmap initialized from disk: read 14/14 pags, set 0 bits
Dec 25 16:15:48 cygnus kernel: [    1.476050] created bitmap (223 pages) for device md2
Dec 25 16:15:48 cygnus kernel: [    1.488465] md2: detected capacity change from 0 to 1913403736064
Dec 25 16:15:48 cygnus kernel: [    1.488942]  md2: unknown partition table
Dec 25 16:15:48 cygnus kernel: [    ...
From: Neil Brown
Date: Sunday, December 26, 2010 - 1:11 pm

None of your logs show anything about jfs....

What does
  fsck.jfs /dev/md2
report?
What about
  mount -t jfs /dev/md2 /home

??


--

From: Carl Cook
Date: Sunday, December 26, 2010 - 1:19 pm

My God, it didn't have the command fsck.jfs, so I reinstalled jfsutils.  Now the array mounts.

I don't understand it.  I thought the JFS driver is in the kernel?


--

From: CoolCold
Date: Sunday, December 26, 2010 - 1:19 pm

-- 
Best regards,
[COOLCOLD-RIPN]
--

From: Neil Brown
Date: Sunday, December 26, 2010 - 1:33 pm

Like many parts of Linux, most of JFS is in the kernel, but some support
tools are separate.   Most filesystems have a separate mkfs.$FSTYPE and
fsck.$FSTYPE.  ALSA (sound subsystem) has alsamixer etc.  md/RAID has mdadm,
nfs has nfs-utils, and so on.  Each of these is primarily a kernel subsystem,
but needs user-space tools to configure and manage it.
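The kernel/user-space split described above can be checked from a script. This is a minimal sketch, assuming the helpers follow the usual fsck.<fstype> and mkfs.<fstype> naming and are installed somewhere on PATH; the function name is made up for illustration.

```python
import shutil

def missing_fs_helpers(fstype, tools=("fsck", "mkfs")):
    """Return the user-space helpers for fstype that are not found on PATH."""
    names = [f"{tool}.{fstype}" for tool in tools]
    return [name for name in names if shutil.which(name) is None]

if __name__ == "__main__":
    # "jfs" matches the filesystem in this thread; any other type works too.
    missing = missing_fs_helpers("jfs")
    if missing:
        print("missing helpers:", ", ".join(missing))
    else:
        print("all jfs helpers found")
```

An empty result only says the helper binaries exist; it says nothing about whether the kernel driver itself is loaded.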

But the important thing is that you have your data back, preparing you for a
Happy New Year!


--

From: Berkey B Walker
Date: Sunday, December 26, 2010 - 2:14 pm

Excellent save!!!  The OP might want to continue the "Giving Season" by 
giving himself a brand new system backup.

--

From: Carl Cook
Date: Sunday, December 26, 2010 - 5:06 pm

Indeed.  Thank you Neil, you saved me (along with my not touching anything).



Actually I have much too much data to back anything up, so I've been stalling for over a year on building a backup server.  It's fixin' to happen now, you betcha.  A cube case with ITX and a BTRFS array the same size as my RAID10.  NoMachine NX over SSH for admin.  It'll be in a far corner of the garage down low, so in case of fire or theft I don't lose my data.  I'll sync it with the main system over gigabit Ethernet, at some interval, in some way (suggestions?), but keep it offline most of the time.  All my systems are named after constellations, so this will be called "Gemini" (the twin).


--

From: Stan Hoeppner
Date: Sunday, December 26, 2010 - 9:45 pm

<snip>

Every time I read/hear this I cringe.  If that is the case your data is
worthless to begin with so just delete it all right now.  You are
literally saying the same thing with your statement.  The difference is
that I _KNOW_ your drives will fail, or you'll lose an array due to
corruption, etc.  Your HTPC storage system will fail.  It's not an _IF_
but a _WHEN_ issue.  The question then becomes, what is the best
backup/restore strategy to fit HTPC needs.  Build another system with
similar technology and you have the same failure modes and risks as the
first.

Spinning Rust Disks (SRDs) are not a suitable long term backup/restore
solution.  What happens when your disk-to-disk backup server solution
drops an MD array for no reason such as just happened, _during_ a
restore operation?  Or you suffer a disk failure on the backup server
during a restore operation (which is very common today)?  Will your
backup server contain a 4 x 2 TB disk RAID 10 set?

I'd suggest tape as the better solution to D2D in the HTPC case,
primarily based on cost and availability of library and media, and the
fact the disaster recovery procedure is much easier and much more
straightforward:

8 drive LTO-2 autoloader

http://www.msrcglobal.com/p-216-af203a.aspx?gclid=CI789Mm_i6YCFQTrKgodIg0pnA&
http://h18000.www1.hp.com/products/quickspecs/11841_div/11841_div.HTML
Ultrium 448 drive
3.2 TB compressed max per library
U160 LVD/SE SCSI interface
172 GB/hour capacity -
"desktop" model
$650 USD

http://www.newegg.com/Product/Product.aspx?Item=N82E16816118057
$90 USD

http://www.newegg.com/Product/Product.aspx?Item=N82E16840999118
8 x $22 = $176 USD

Total = $916 + shipping
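The arithmetic behind that total, plus a rough time estimate, can be written out as a small sketch. All figures are copied from the list above; the decimal-unit conversion and the assumption that the quoted 172 GB/hour is sustained for a full pass are mine, and the $90 item is left unnamed because the post does not name it.

```python
# Figures copied from the post above; nothing here is a new price quote.
AUTOLOADER_USD = 650         # 8-slot LTO-2 autoloader
SECOND_ITEM_USD = 90         # Newegg item, not named in the post
UNIT_USD, UNITS = 22, 8      # the "8 x $22" line

LIBRARY_TB_COMPRESSED = 3.2  # "3.2 TB compressed max per library"
RATE_GB_PER_HOUR = 172       # "172 GB/hour"

def total_cost_usd():
    """Sum the three quoted line items, shipping excluded."""
    return AUTOLOADER_USD + SECOND_ITEM_USD + UNIT_USD * UNITS

def hours_to_fill_library():
    """Hours for one full pass, assuming 1 TB = 1000 GB and a sustained rate."""
    return LIBRARY_TB_COMPRESSED * 1000 / RATE_GB_PER_HOUR

if __name__ == "__main__":
    print(f"total: ${total_cost_usd()} + shipping")
    print(f"full library pass: about {hours_to_fill_library():.1f} hours")
```

The time figure is only as good as the quoted rate, which is likely a compressed-data number and depends on how well the video files compress.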

Using the correct backup strategy this should easily meet your needs.
The 8 tapes in the library will handle 75% of your RAW level 10 mdraid
device capacity.  Once a filesystem is laid down, and you take overhead
concerns into account, you won't be putting more than 3.2 TB of data on
it anyway (or, at least, you ...
From: Phil Turmel
Date: Sunday, December 26, 2010 - 10:35 pm

(Heh.  I name my systems after constellations, too.)

I, too, use an alternate server to back up to, but my daily changes are small enough to rsync over the net.  To reduce the chance of double failures in my arrays, a cron job kicks off a "resync" weekly.  I also use a super-cheap SATA hot-swap bay (carriage-less) to support weekly and monthly rotations to a fire safe at a third site.  Although Stan is right that tape is the best choice for really large backups, I find the commodity 1.5T drives hard to beat for convenience.
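For the rsync-over-the-net part, a minimal sketch of how such a job might be assembled follows. It is not Phil's setup: the paths are hypothetical, "gemini" is the host name proposed earlier in the thread, and the choice of --archive with --delete is an assumption about what a mirror should do.

```python
import subprocess

def rsync_command(source, dest, dry_run=True):
    """Build an rsync argv that mirrors source to dest over SSH."""
    argv = ["rsync", "--archive", "--delete", "-e", "ssh"]
    if dry_run:
        argv.append("--dry-run")   # report what would change, copy nothing
    # Trailing slash on the source: copy its contents, not the directory itself.
    argv += [source.rstrip("/") + "/", dest]
    return argv

def run_sync(source, dest, dry_run=True):
    """Run rsync and raise CalledProcessError on a non-zero exit status."""
    return subprocess.run(rsync_command(source, dest, dry_run), check=True)

if __name__ == "__main__":
    # Hypothetical paths; this only prints the command, it does not run it.
    print(" ".join(rsync_command("/home", "gemini:/backup/home")))
```

Note that --delete propagates deletions: if the source filesystem comes up empty or damaged, a mirror run would remove the good copies too, which is one reason to keep dry-run as the default and to rotate media as described above.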

Just my $0.02.

HTH,

Phil
--

From: Carl Cook
Date: Monday, December 27, 2010 - 6:10 am

> Every time I read/hear this I cringe.  If that is the case your data is worthless to begin with so just delete it all right now.  You are literally saying the same thing with your statement. 

No, I'm saying that the MTBF of disk drives is astronomical, and the likelihood of a failure during backup is minuscule.  The MTBF of tape is hundreds of times shorter.  Not to mention that tape would take forever, and require constant tending.  This is why it's not used anymore.  My storage is 2TB now, but my library is growing all the time.  Backing up to off-line disk storage is the only practical way now, given the extremely low cost and high capacity and speed.  Each WD 2TB drive is $99 from Newegg!  Astounding.  Thanks for the input though.



Can you please give some detail on your sync scripts?  I've never done this and am not a programmer, but I'm a pretty good shade-tree admin.


--

From: Phil Turmel
Date: Monday, December 27, 2010 - 8:04 am

Sure.  Attached.  Note that the script doesn't set the sysctls for speed limits...  The defaults are fine for me.
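The attachment is not reproduced in this archive. As an independent sketch (not Phil's script), a weekly cron job could start a scrub by writing to the md sysfs file sync_action. The array name and function name are illustrative, the sysfs_root parameter exists only so the function can be tried against a scratch directory, and the speed-limit sysctls (dev.raid.speed_limit_min and dev.raid.speed_limit_max) are left at their defaults here as well.

```python
from pathlib import Path

# Values the md driver accepts for a scrub request in sync_action.
VALID_ACTIONS = {"check", "repair", "idle"}

def request_sync_action(md_name, action="check", sysfs_root="/sys/block"):
    """Ask the md driver to start (or, with "idle", stop) a scrub on one array."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    path = Path(sysfs_root) / md_name / "md" / "sync_action"
    path.write_text(action + "\n")   # needs root on a real system
    return path

# Example call on a real system (as root): request_sync_action("md2", "check")
```

The default is "check" rather than "repair" on purpose: a check only counts inconsistencies, while a repair rewrites blocks.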

HTH,

Phil
From: Brad Campbell
Date: Monday, December 27, 2010 - 2:34 pm

Hrm.. I used to do this too, until a silent corruption issue with a controller trashed 8TB of data.
Now if I'd done a "check" instead of a "repair", and been alerted to the fact that the controller was
corrupting data rather than blindly overwriting the parity blocks, I'd have had a chance of saving
the array.

Checksum and test regularly!

I do have a backup regime, however I do a full rotation every 3 months and it was approximately 3 
months and 5 days before I noticed I really had a problem.
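One way to get that alert is to look at the md mismatch counter once a "check" pass has finished. This is a minimal sketch, assuming the array exposes mismatch_cnt under /sys/block/<array>/md/; the function names are made up, and sysfs_root is a parameter only so the code can be exercised against a scratch directory.

```python
from pathlib import Path

def read_mismatch_count(md_name, sysfs_root="/sys/block"):
    """Return the mismatch counter the md driver recorded for the last scrub."""
    path = Path(sysfs_root) / md_name / "md" / "mismatch_cnt"
    return int(path.read_text().strip())

def scrub_looks_clean(md_name, sysfs_root="/sys/block"):
    """True when the last check found no inconsistent blocks."""
    return read_mismatch_count(md_name, sysfs_root) == 0

# Example on a real system: scrub_looks_clean("md2")
```

A non-zero count is a prompt to investigate, not proof of a bad controller; mirrored arrays can report mismatches for benign reasons.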

Brad
-- 
Dolphins are so intelligent that within a few weeks they can
train Americans to stand at the edge of the pool and throw them
fish.
--

From: Stan Hoeppner
Date: Monday, December 27, 2010 - 9:37 am

Interesting statement.  You're arguing the reliability of these modern
giant drives, yet use RAID 10 instead of simple spanning or striping.  I

Failure during backup is of less concern.  Failure of the source
media/system during _restore_ is.  Read this thread and the previous
thread from Eli a few months prior.  It will open your eyes to the peril
of using a D2D backup system with 2TB WD Green drives.  His choice of
such, along with other poor choices based on acquisition cost, almost
cost him his job.  Everything is cheap until it costs you dearly.

http://comments.gmane.org/gmane.comp.file-systems.xfs.general/35555

Too many (especially younger) IT people _only_ consider up front
acquisition cost of systems and not long term support of such systems.
Total system cost _must_ include a reliable DRS (Disaster Recovery
System).  If you can't afford the DRS to go with a new system, then you
can't afford that system, and must downsize it or reduce its costs in
some way to allow inclusion of DRS.

There is no free lunch.  Eli nearly lost his job over poor acquisition
and architecture choices.  In that thread he makes the same excuses you
do regarding his total storage size needs and his "budget for backup".
There is no such thing as "budget for backup".  DRS _must_ be included
in all acquisition costs.  If not, someone will pay dear consequences at
some point in time if the lost data has real value.  In Eli's case the
lost data was Ph.D. student research data.  If one student lost all his
data, he may likely have to redo an entire year of school.  Who pays for
that?  Who pays for his year of lost earnings since he can't (re)enter
the workforce at Ph.D. pay scale?  This snafu may cost a single Ph.D.
student, the university, or both, $200K or more depending on career field.


Really?  Eli had those WD20EARS online in his D2D backup system for less
than 5 months.  LTO tape reliability is less than 5 months?  Show data
to back that argument up please.

Tape isn't perfect either, ...
From: Berkey B Walker
Date: Monday, December 27, 2010 - 6:36 pm

Whereas a tape and drive do not offer an "everything is the single
point of failure" as a permanently _sealed_ storage disc/drive does,
tape is definitely NOT "fool proof", nor even proof against the lazy,
everyday operator.  The operator and system requirements for use and
maintenance are still, basically, as they were a half century ago.
Imagine if the operator were required to retension his/her RAID disks
and replace them after so many uses.  Money drives it all, creating
street-level IT techs and SysAdmins approaching the commodity level.
This is not intended to point at the Original Poster but at the market
and marketing of today.
berk

--

From: Carl Cook
Date: Monday, December 27, 2010 - 9:16 pm

Alll righty then, we are slipping close to hysteria.

Please try not to worry about me...  I am switching to BTRFS and a backup server for my frickin' movies, and will be fine.


--

From: Stan Hoeppner
Date: Tuesday, December 28, 2010 - 8:04 pm

BTRFS?  You must be pulling our chains.  It's an experimental filesystem
for Pete's sake.  A month ago it still didn't have a check/repair tool.
 Does it yet?

-- 
Stan
--

From: Roman Mamedov
Date: Tuesday, December 28, 2010 - 10:34 pm

On Tue, 28 Dec 2010 21:04:10 -0600

AFAIK it does have a check tool, but not a repair tool yet :)

To be fair, it can be argued that if you have a backup [server], this doesn't
matter too much.

But personally I find btrfs to be good, not yet for primary storage, but for
precisely the backup storage mentioned: its snapshot feature allows (fast)
incremental backups, while at the same time keeping the full state saved
at each backup step instantly accessible with no special software required
(snapshots of older state just sit there looking like plain directories in the
filesystem).
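A minimal sketch of that snapshot-per-backup idea: after each sync, take a date-named snapshot of the subvolume holding the backup. The paths and function name are hypothetical, and the -r (read-only) flag is taken from current btrfs-progs and may postdate this thread.

```python
from datetime import date

def snapshot_command(subvolume, snapshot_dir, day):
    """Build the argv for a read-only, date-named snapshot of subvolume."""
    target = f"{snapshot_dir.rstrip('/')}/{day.isoformat()}"
    return ["btrfs", "subvolume", "snapshot", "-r", subvolume, target]

if __name__ == "__main__":
    # Hypothetical layout: the sync lands in /backup/current, snapshots beside it.
    # This only prints the command; it does not run it.
    print(" ".join(snapshot_command("/backup/current", "/backup/snapshots", date.today())))
```

Each snapshot then shows up as an ordinary directory named after its date, which matches the "plain directories" behaviour described above.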

-- 
With respect,
Roman