I went in to turn on my home theater system today, and found a blank screen. I rebooted and it would not mount /home, which is a 4TB RAID10 array with every movie and show I've recorded over the past two years. I tried to mount it manually and got "wrong fs or bad superblock". The array is getting set up fine, but the filesystem seems to be destroyed. Unbelievable. This isn't supposed to happen. It happened once before when I wasn't using RAID, but that was the BTRFS filesystem and I blamed it for being pre-release. But now it's RAID10 with JFS. The only sign of trouble:

Dec 25 16:14:56 cygnus shutdown[2180]: shutting down for system reboot
Dec 25 16:14:58 cygnus kernel: [16607.840197] md: md2 stopped.
Dec 25 16:14:58 cygnus kernel: [16607.840210] md: unbind<sdb3>
Dec 25 16:14:58 cygnus kernel: [16607.852029] md: export_rdev(sdb3)
Dec 25 16:14:58 cygnus kernel: [16607.852083] md: unbind<sdc3>
Dec 25 16:14:58 cygnus kernel: [16607.864031] md: export_rdev(sdc3)
Dec 25 16:14:58 cygnus kernel: [16607.864092] md2: detected capacity change from 1913403736064 to 0
Dec 25 16:15:00 cygnus kernel: Kernel logging (proc) stopped.

Reboot:

Dec 25 16:15:48 cygnus kernel: [ 1.156657] Uniform CD-ROM driver Revision: 3.20
Dec 25 16:15:48 cygnus kernel: [ 1.464298] md: raid10 personality registered for level 10
Dec 25 16:15:48 cygnus kernel: [ 1.469307] md: md2 stopped.
Dec 25 16:15:48 cygnus kernel: [ 1.470540] md: bind<sdc3>
Dec 25 16:15:48 cygnus kernel: [ 1.470642] md: bind<sdb3>
Dec 25 16:15:48 cygnus kernel: [ 1.471381] raid10: raid set md2 active with 2 out of 2 devices
Dec 25 16:15:48 cygnus kernel: [ 1.476048] md2: bitmap initialized from disk: read 14/14 pags, set 0 bits
Dec 25 16:15:48 cygnus kernel: [ 1.476050] created bitmap (223 pages) for device md2
Dec 25 16:15:48 cygnus kernel: [ 1.488465] md2: detected capacity change from 0 to 1913403736064
Dec 25 16:15:48 cygnus kernel: [ 1.488942] md2: unknown partition table
Dec 25 16:15:48 cygnus kernel: [ ...
None of your logs show anything about jfs.... What does fsck.jfs /dev/md2 report? What about mount -t jfs /dev/md2 /home ?? --
My God, it didn't have the command fsck.jfs, so I reinstalled jfsutils. Now the array mounts. I don't understand it. I thought the JFS driver is in the kernel? --
Like many parts of Linux, most of JFS is in the kernel, but some support tools are separate. Most filesystems have a separate mkfs.$FSTYPE and fsck.$FSTYPE. ALSA (sound subsystem) has alsamixer etc. md/RAID has mdadm, nfs has nfs-utils etc etc. Each of these is primarily a kernel subsystem, but needs user-space tools to configure and manage it. But the important thing is that you have your data back, preparing you for a Happy New Year! --
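As a rough illustration of that kernel/user-space split, here is a minimal sketch that checks whether the user-space helpers for a filesystem are on the PATH. The tool names follow the mkfs.$FSTYPE / fsck.$FSTYPE pattern described above; the function name find_fs_tools is made up for this example, and the check says nothing about whether the kernel driver itself is available.

```python
import shutil

def find_fs_tools(fstype, which=shutil.which):
    """Return {tool_name: path_or_None} for the user-space helpers of one filesystem type.

    find_fs_tools is a hypothetical helper for illustration only.
    """
    tools = [f"mkfs.{fstype}", f"fsck.{fstype}"]
    return {tool: which(tool) for tool in tools}

if __name__ == "__main__":
    for tool, path in find_fs_tools("jfs").items():
        print(f"{tool}: {path or 'not found on PATH'}")
```

These tools usually live in a sbin directory, so an ordinary user's PATH may not include them; run the check as root or with sbin on the PATH before concluding a package is missing.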
Excellent save!!! The OP might want to continue the "Giving Season" by giving himself a brand new system backup. --
Indeed. Thank you Neil, you saved me (along with my not touching anything). Actually I have much too much data to back anything up, so I've been stalling for over a year on building a backup server. It's fixin' to happen now, you betcha. A cube case with ITX and a BTRFS array the same size as my RAID10. NoMachine NX over SSH for admin. It'll be in a far corner of the garage down low, so that in a fire or theft I don't lose my data. I'll sync it with the main system over gigabit Ethernet, at some interval, in some way (suggestions?), but keep it offline most of the time. All my systems are named after constellations, so this will be called "Gemini". (the twin) --
<snip>

Every time I read/hear this I cringe. If that is the case your data is worthless to begin with so just delete it all right now. You are literally saying the same thing with your statement. The difference is that I _KNOW_ your drives will fail, or you'll lose an array due to corruption, etc. Your HTPC storage system will fail. It's not an _IF_ but a _WHEN_ issue. The question then becomes, what is the best backup/restore strategy to fit HTPC needs.

Build another system with similar technology and you have the same failure modes and risks as the first. Spinning Rust Disks (SRDs) are not a suitable long term backup/restore solution. What happens when your disk-to-disk backup server solution drops an MD array for no reason such as just happened, _during_ a restore operation? Or you suffer a disk failure on the backup server during a restore operation (which is very common today)? Will your backup server contain a 4 x 2 TB disk RAID 10 set?

I'd suggest tape as the better solution to D2D in the HTPC case, primarily based on cost and availability of library and media, and the fact the disaster recovery procedure is much easier and much more straightforward:

8 drive LTO-2 autoloader
http://www.msrcglobal.com/p-216-af203a.aspx?gclid=CI789Mm_i6YCFQTrKgodIg0pnA&
http://h18000.www1.hp.com/products/quickspecs/11841_div/11841_div.HTML
Ultrium 448 drive
3.2 TB compressed max per library
U160 LVD/SE SCSI interface
172 GB/hour capacity - "desktop" model
$650 USD

http://www.newegg.com/Product/Product.aspx?Item=N82E16816118057
$90 USD

http://www.newegg.com/Product/Product.aspx?Item=N82E16840999118
8 x $22 = $176 USD

Total = $916 + shipping

Using the correct backup strategy this should easily meet your needs. The 8 tapes in the library will handle 75% of your RAW level 10 mdraid device capacity. Once a filesystem is laid down, and you take overhead concerns into account, you won't be putting more than 3.2 TB of data on it anyway (or, at least, you ...
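For anyone checking the numbers in that shopping list, here is a small worked calculation. It only uses the figures quoted in the post ($650, $90, 8 x $22, 3.2 TB per library, 172 GB/hour); the variable names are made up, the $90 item is not identified in the post, and treating 1 TB as 1000 GB is an assumption. The time figure is for streaming one full library at the quoted rate and ignores tape changes and real-world compression ratios.

```python
# Figures quoted in the post above; nothing here is measured.
AUTOLOADER_USD = 650
SECOND_ITEM_USD = 90       # the $90 Newegg item; the post does not say what it is
TAPE_USD = 22
TAPES = 8
LIBRARY_TB = 3.2           # "3.2 TB compressed max per library"
RATE_GB_PER_HOUR = 172     # "172 GB/hour"

def total_cost():
    """Sum the quoted prices, before shipping."""
    return AUTOLOADER_USD + SECOND_ITEM_USD + TAPES * TAPE_USD

def hours_to_fill_library():
    """Hours to write one full library at the quoted rate (assumes 1 TB = 1000 GB)."""
    return LIBRARY_TB * 1000 / RATE_GB_PER_HOUR

if __name__ == "__main__":
    print(f"total: ${total_cost()} + shipping")
    print(f"full library write: about {hours_to_fill_library():.1f} hours")
```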
(Heh. I name my systems after constellations, too.) I, too, use an alternate server to back up to, but my daily changes are small enough to rsync over the net. To reduce the chance of double failures in my arrays, a cron job kicks off a "resync" weekly. I also use a super-cheap SATA hot-swap bay (carriage-less) to support weekly and monthly rotations to a fire safe at a third site. Although Stan is right that tape is the best choice for really large backups, I find the commodity 1.5T drives hard to beat for convenience. Just my $0.02. HTH, Phil --
> Every time I read/hear this I cringe. If that is the case your data is worthless to begin with so just delete it all right now. You are literally saying the same thing with your statement.

No, I'm saying that the MTBF of disk drives is astronomical, and the likelihood of a failure during backup is minuscule. The MTBF of tape is hundreds of times shorter. Not to mention that tape would take forever, and require constant tending. This is why it's not used anymore. My storage is 2TB now, but my library is growing all the time. Backing up to off-line disk storage is the only practical way now, given the extremely low cost and high capacity and speed. Each WD 2TB drive is $99 from Newegg! Astounding. Thanks for the input though. Can you please give some detail on your sync scripts? I've never done this and am not a programmer, but I'm a pretty good shade-tree admin. --
Sure. Attached. Note that the script doesn't set the sysctls for speed limits... The defaults are fine for me. HTH, Phil
Hrm.. I used to do this too, until a silent corruption issue with a controller trashed 8TB of data. Now if I'd done a "check" instead of a repair, and been alerted to the fact the controller was corrupting data rather than blindly overwriting the parity blocks, I'd have had a chance of saving the array. Checksum and test regularly! I do have a backup regime, however I do a full rotation every 3 months and it was approximately 3 months and 5 days before I noticed I really had a problem. Brad -- Dolphins are so intelligent that within a few weeks they can train Americans to stand at the edge of the pool and throw them fish. --
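To make the "check" versus "repair" distinction concrete, here is a minimal sketch of a read-only scrub using the md sysfs files sync_action and mismatch_cnt. This is not Phil's attached script, which is not reproduced in this thread; the function names and the sysfs_root parameter (there so the code can be tried against a scratch directory) are assumptions for illustration. On a real system it needs root, and the counter is only meaningful after the check has finished.

```python
from pathlib import Path

def start_check(md_name, sysfs_root="/sys"):
    """Ask md to run a read-only 'check' pass on the array (writes nothing to the disks)."""
    action = Path(sysfs_root) / "block" / md_name / "md" / "sync_action"
    action.write_text("check\n")

def mismatch_count(md_name, sysfs_root="/sys"):
    """Read the mismatch counter left behind by the last check or repair pass."""
    counter = Path(sysfs_root) / "block" / md_name / "md" / "mismatch_cnt"
    return int(counter.read_text().strip())

if __name__ == "__main__":
    # "md2" is the array name from the logs earlier in the thread.
    start_check("md2")
    print("check started; read mismatch_cnt once sync_action returns to idle")
```

A nonzero count after a check is a prompt to investigate before running a repair, not proof of corruption by itself.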
Interesting statement. You're arguing the reliability of these modern giant drives, yet use RAID 10 instead of simple spanning or striping. Failure during backup is of less concern. Failure of the source media/system during _restore_ is. Read this thread and the previous thread from Eli a few months prior. It will open your eyes to the peril of using a D2D backup system with 2TB WD Green drives. His choice of such, along with other poor choices based on acquisition cost, almost cost him his job. Everything is cheap until it costs you dearly. http://comments.gmane.org/gmane.comp.file-systems.xfs.general/35555 Too many (especially younger) IT people _only_ consider up front acquisition cost of systems and not long term support of such systems. Total system cost _must_ include a reliable DRS (Disaster Recovery System). If you can't afford the DRS to go with a new system, then you can't afford that system, and must downsize it or reduce its costs in some way to allow inclusion of DRS. There is no free lunch. Eli nearly lost his job over poor acquisition and architecture choices. In that thread he makes the same excuses you do regarding his total storage size needs and his "budget for backup". There is no such thing as "budget for backup". DRS _must_ be included in all acquisition costs. If not, someone will pay dear consequences at some point in time if the lost data has real value. In Eli's case the lost data was Ph.D. student research data. If one student lost all his data, he may likely have to redo an entire year of school. Who pays for that? Who pays for his year of lost earnings since he can't (re)enter the workforce at Ph.D. pay scale? This snafu may cost a single Ph.D. student, the university, or both, $200K or more depending on career field. Really? Eli had those WD20EARS online in his D2D backup system for less than 5 months. LTO tape reliability is less than 5 months? Show data to back that argument up please. Tape isn't perfect either, ...
Whereas a tape and drive does not offer an "Everything is the single point of failure" as does a permanently _sealed_ storage disc/drive, tape is definitely NOT "fool proof", nor even proof against the lazy, everyday operator. The operator and system requirements for use and maintenance are still, basically, as they were a half century ago. Imagine if the operator were required to retension his/her RAID disks and replace them after so many uses. Money drives it all, creating street-level IT techs and SysAdmins approaching the commodity level. This is not intended to be pointing at the Original Poster but at the market and marketing of today. berk --
All righty then, we are slipping close to hysteria. Please try not to worry about me... I am switching to BTRFS and a backup server for my frickin' movies, and will be fine. --
BTRFS? You must be pulling our chains. It's an experimental filesystem for Pete's sake. A month ago it still didn't have a check/repair tool. Does it yet? -- Stan --
On Tue, 28 Dec 2010 21:04:10 -0600 AFAIK it does have a check tool, but not a repair tool yet :) To be fair, it can be argued that if you have a backup [server], this doesn't matter too much. But personally I find btrfs to be good not for primary storage yet, but for precisely the mentioned backup storage: its snapshot feature lets you do (fast) incremental backups while keeping the full state saved at each backup step instantly accessible, with no special software required (snapshots of older state just sit there looking like plain directories in the filesystem). -- With respect, Roman
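One way the snapshot-based incremental backup described above could look, as a sketch only: rsync the source into a live btrfs subvolume, then freeze that subvolume as a read-only snapshot named after the date. The paths and function names are hypothetical, the target is assumed to already be a btrfs subvolume with the snapshot directory on the same filesystem, and this is not anyone's actual script from the thread.

```python
import datetime
import subprocess

def backup_commands(source, subvol, snap_dir, stamp):
    """Build the two commands: sync into the live subvolume, then snapshot it read-only."""
    return [
        # Trailing slashes make rsync copy the directory contents, not the directory itself.
        ["rsync", "-a", "--delete", f"{source.rstrip('/')}/", f"{subvol.rstrip('/')}/"],
        ["btrfs", "subvolume", "snapshot", "-r", subvol, f"{snap_dir.rstrip('/')}/{stamp}"],
    ]

def run_backup(source, subvol, snap_dir):
    """Run one backup step; stops at the first command that fails."""
    stamp = datetime.date.today().isoformat()
    for cmd in backup_commands(source, subvol, snap_dir, stamp):
        subprocess.run(cmd, check=True)

if __name__ == "__main__":
    # Hypothetical layout on the backup server.
    run_backup("/home", "/backup/current", "/backup/snapshots")
```

Each dated snapshot then shows up as an ordinary directory under the snapshot path, which matches the "plain directories" behaviour described above; unchanged files share storage between snapshots.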
