I have four identical 160GB IDE drives connected to two PDC20269 U133TX2 cards. I created two RAID1 arrays, md0 and md1, and then created an LV striped across both of them, which basically gives me a software RAID10 configuration. I then formatted the LV with mkfs.xfs. The problem occurred after I had copied about 5GB of data. I started to see hundreds of these repeating errors:
Aug 15 23:36:15 server kernel: ide: failed opcode was: unknown
Aug 15 23:36:15 server kernel: hdi: dma_intr: bad DMA status (dma_stat=35)
Aug 15 23:36:15 server kernel: hdi: dma_intr: status=0x50 { DriveReady SeekComplete }
Then I saw this:
Aug 15 23:36:15 server kernel: ide: failed opcode was: unknown
Aug 15 23:36:15 server kernel: hdi: DMA disabled
Aug 15 23:36:15 server kernel: PDC202XX: Primary channel reset.
Aug 15 23:36:15 server kernel: ide4: reset: success
After that, I saw many of these:
Aug 15 23:36:17 server kernel: ide: failed opcode was: unknown
Aug 15 23:36:17 server kernel: hdi: task_out_intr: status=0x58 { DriveReady SeekComplete DataRequest }
Aug 15 23:36:17 server kernel:
Aug 15 23:36:17 server kernel: ide: failed opcode was: unknown
Aug 15 23:36:20 server kernel: hdi: task_out_intr: status=0x50 { DriveReady SeekComplete }
with the status codes alternating between 0x50 and 0x58. Finally, the system marked /dev/hdi faulty and kicked it out of the software RAID1. I checked the drive with the vendor's toolset and found no errors on the disk at all.
I experimented with JFS and ran into the same situation. However, when I used ext3, no matter how I stressed the disk system, it worked without any problem. By the way, I am running RHEL4 with an FC3 2.6.11 kernel (an nvidia driver thing).
I am posting this because I have just started with the kernel and it is over my head to figure out what was going on. Hopefully this post gives the real kernel hackers something to think about ;)
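For anyone who wants to read the 0x50 and 0x58 status bytes by hand, here is a minimal sketch of how they break down into the flag names shown in the log. It assumes the standard ATA status register bit layout; the names ATA_STATUS_BITS and decode_status are made up for illustration and are not kernel symbols.

```python
# Flag names match what the kernel prints in the log lines above.
# The bit values are the standard ATA status register layout
# (an assumption of this sketch, not something stated in the post).
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of all flags set in an ATA status byte, high bit first."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


if __name__ == "__main__":
    for status in (0x50, 0x58):
        names = " ".join(decode_status(status))
        print("status=0x%02x { %s }" % (status, names))
```

Decoded this way, the two values differ only in the DataRequest bit (0x08), and neither has the Error bit (0x01) set, which matches the flag names in the log lines.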
XFS error log in dmesg
I guess this is essential:
Starting XFS recovery on filesystem: dm-9 (dev: dm-9)
XFS: xlog_recover_process_data: bad clientid
XFS: log mount/recovery failed: error 5
XFS: log mount failed
JFS also has errors related to its journal log:
Aug 15 11:42:10 server kernel: ERROR: (device dm-9): dbAllocAG: unable to allocate blocks