[SOLVED] HD103UJ I/O Errors with LINUX

After booting my Linux box (kernel 4.4.13) and starting to use it, I noticed I/O (input/output) errors from the hard drive a few seconds into working with it.


My hard drive is a Samsung HD103UJ and my motherboard is an Asus P5ne-sli with BIOS firmware update 1403.
According to some forums, the nv chipset and this disk in particular don't get along too well; in others, the kernel version was blamed.

The same problem occurs with the Western Digital HD103UJ SATA disk.





  • This is what my dmesg outputs (keep reading after the output):

ata5: EH in SWNCQ mode,QC:qc_active 0x3F80000 sactive 0x3F80000
ata5: SWNCQ:qc_active 0x180000 defer_bits 0x3E00000 last_issue_tag 0x14
  dhfis 0x80000 dmafis 0x0 sdbfis 0x0
ata5: ATA_REG 0x40 ERR_REG 0x0
ata5: tag : dhfis dmafis sdbfis sactive
ata5: tag 0x13: 1 0 0 1 
ata5: tag 0x14: 0 0 0 1 
ata5.00: exception Emask 0x0 SAct 0x3f80000 SErr 0x0 action 0x6 frozen
ata5.00: failed command: WRITE FPDMA QUEUED
ata5.00: cmd 61/08:98:00:4b:5a/00:00:10:00:00/40 tag 19 ncq 4096 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: status: { DRDY }
ata5.00: failed command: WRITE FPDMA QUEUED
ata5.00: cmd 61/08:a0:10:08:00/00:00:00:00:00/40 tag 20 ncq 4096 out
         res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: status: { DRDY }
ata5.00: failed command: WRITE FPDMA QUEUED
ata5.00: cmd 61/08:a8:08:aa:42/00:00:00:00:00/40 tag 21 ncq 4096 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: status: { DRDY }
ata5.00: failed command: WRITE FPDMA QUEUED
ata5.00: cmd 61/08:b0:b0:4d:46/00:00:00:00:00/40 tag 22 ncq 4096 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: status: { DRDY }
ata5.00: failed command: WRITE FPDMA QUEUED
ata5.00: cmd 61/08:b8:00:08:60/00:00:00:00:00/40 tag 23 ncq 4096 out
         res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: status: { DRDY }
ata5.00: failed command: WRITE FPDMA QUEUED
ata5.00: cmd 61/08:c0:10:de:62/00:00:00:00:00/40 tag 24 ncq 4096 out
         res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: status: { DRDY }
ata5.00: failed command: WRITE FPDMA QUEUED
ata5.00: cmd 61/08:c8:b0:eb:62/00:00:00:00:00/40 tag 25 ncq 4096 out
         res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: status: { DRDY }
ata5: hard resetting link
ata5: nv: skipping hardreset on occupied port
ata5: link is slow to respond, please be patient (ready=0)
ata5: SRST failed (errno=-16)
ata5: hard resetting link
ata5: nv: skipping hardreset on occupied port
ata5: link is slow to respond, please be patient (ready=0)

ata5: SRST failed (errno=-16)
ata5: hard resetting link
ata5: nv: skipping hardreset on occupied port
ata5: link is slow to respond, please be patient (ready=0)
ata5: SRST failed (errno=-16)
ata5: hard resetting link
ata5: nv: skipping hardreset on occupied port
ata5: link is slow to respond, please be patient (ready=0)
ata5: SRST failed (errno=-16)
ata5: limiting SATA link speed to 1.5 Gbps
ata5: hard resetting link
ata5: nv: skipping hardreset on occupied port
ata5: SRST failed (errno=-16)
ata5: reset failed, giving up
ata5.00: disabled
ata5.00: device reported invalid CHS sector 0
ata5.00: device reported invalid CHS sector 0
ata5: EH complete
sd 4:0:0:0: [sdc] tag#27 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
sd 4:0:0:0: [sdc] tag#27 CDB: opcode=0x2a 2a 00 00 62 eb b0 00 00 08 00
blk_update_request: I/O error, dev sdc, sector 6482864
Buffer I/O error on dev sdc1, logical block 810102, lost async page write
sd 4:0:0:0: [sdc] tag#28 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
sd 4:0:0:0: [sdc] tag#28 CDB: opcode=0x2a 2a 00 00 62 de 10 00 00 08 00
blk_update_request: I/O error, dev sdc, sector 6479376
Buffer I/O error on dev sdc1, logical block 809666, lost async page write
sd 4:0:0:0: [sdc] tag#29 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
sd 4:0:0:0: [sdc] tag#29 CDB: opcode=0x2a 2a 00 00 60 08 00 00 00 08 00
blk_update_request: I/O error, dev sdc, sector 6293504
Buffer I/O error on dev sdc1, logical block 786432, lost async page write
sd 4:0:0:0: [sdc] tag#30 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
sd 4:0:0:0: [sdc] tag#30 CDB: opcode=0x2a 2a 00 00 46 4d b0 00 00 08 00
blk_update_request: I/O error, dev sdc, sector 4607408
Buffer I/O error on dev sdc1, logical block 575670, lost async page write
sd 4:0:0:0: [sdc] tag#31 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
sd 4:0:0:0: [sdc] tag#31 CDB: opcode=0x2a 2a 00 00 42 aa 08 00 00 08 00
blk_update_request: I/O error, dev sdc, sector 4368904
Buffer I/O error on dev sdc1, logical block 545857, lost async page write
sd 4:0:0:0: [sdc] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
sd 4:0:0:0: [sdc] tag#0 CDB: opcode=0x2a 2a 00 00 00 08 10 00 00 08 00
blk_update_request: I/O error, dev sdc, sector 2064
Buffer I/O error on dev sdc1, logical block 2, lost async page write
sd 4:0:0:0: [sdc] tag#1 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
sd 4:0:0:0: [sdc] tag#1 CDB: opcode=0x2a 2a 00 10 5a 4b 00 00 00 08 00
blk_update_request: I/O error, dev sdc, sector 274352896
Buffer I/O error on dev sdc1, logical block 34293856, lost async page write
sd 4:0:0:0: [sdc] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
sd 4:0:0:0: [sdc] tag#4 CDB: opcode=0x2a 2a 00 00 62 eb b0 00 00 08 00
blk_update_request: I/O error, dev sdc, sector 6482864
Buffer I/O error on dev sdc1, logical block 810102, lost async page write
sd 4:0:0:0: [sdc] tag#5 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
sd 4:0:0:0: [sdc] tag#5 CDB: opcode=0x2a 2a 00 00 62 de 10 00 00 08 00
blk_update_request: I/O error, dev sdc, sector 6479376
Buffer I/O error on dev sdc1, logical block 809666, lost async page write
sd 4:0:0:0: [sdc] tag#6 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
sd 4:0:0:0: [sdc] tag#6 CDB: opcode=0x2a 2a 00 10 5a 4b 00 00 00 08 00
blk_update_request: I/O error, dev sdc, sector 274352896
Buffer I/O error on dev sdc1, logical block 34293856, lost async page write
sd 4:0:0:0: [sdc] tag#7 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
sd 4:0:0:0: [sdc] tag#7 CDB: opcode=0x2a 2a 00 00 42 aa 08 00 00 08 00
blk_update_request: I/O error, dev sdc, sector 4368904
Buffer I/O error on dev sdc1, logical block 545857, lost async page write
sd 4:0:0:0: [sdc] tag#8 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
sd 4:0:0:0: [sdc] tag#8 CDB: opcode=0x2a 2a 00 00 46 4d b0 00 00 08 00
blk_update_request: I/O error, dev sdc, sector 4607408
Buffer I/O error on dev sdc1, logical block 575670, lost async page write
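
A dump like the one above is easier to compare before and after a change when it is reduced to a few counts. The following is only a sketch: the function name ata_error_summary is made up for this post, and it simply matches the three message types that appear in my log (failed commands, failed soft resets, and block-layer I/O errors).

```shell
#!/bin/sh
# Summarize libata error lines from kernel log text read on stdin.
# ata_error_summary is a hypothetical helper name, not a standard tool.
ata_error_summary() {
    awk '
        /failed command: /                 { failed++ }
        /SRST failed/                      { srst++ }
        /blk_update_request: I\/O error/   { ioerr++ }
        END {
            # Counters that never matched print as 0.
            printf "failed commands: %d\n", failed
            printf "failed soft resets: %d\n", srst
            printf "block I/O errors: %d\n", ioerr
        }
    '
}
```

To use it on a live system, pipe the kernel log into it: dmesg | ata_error_summary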

 

Before throwing the hard drive in the trash, I tried tweaking the device settings and lowered the queue depth value to 1.

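The queue depth is the number of I/O requests that can be pending on the disk at the same time. With a depth of 1 the drive is handed a single command at a time, so the queued writes that were timing out (the WRITE FPDMA QUEUED failures in the log above) are no longer issued, and it won't crash.
It is advised to use it only as a storage device, but I tried it for the system and it worked perfectly fine. I didn't notice any performance slowdown whatsoever.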

To check the queue depth in Linux, use "hdparm". As root (use "sudo" or "sudo su" to become root), type:


# hdparm -Q /dev/sdX

/dev/sdc:
 queue_depth    =  31

(Note: X is the letter of your device; in my case it is /dev/sdc, and 31 is the queue_depth value.)
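
If hdparm is not installed, the same value can usually be read straight from sysfs, where the kernel exposes it as a queue_depth attribute of the SCSI device. This is a small sketch, not part of my original procedure. The function name queue_depth_of and the SYSFS_ROOT variable are made up here; SYSFS_ROOT only exists so the function can be tried against a fake directory tree.

```shell
#!/bin/sh
# Print the queue depth the kernel reports for a disk, e.g. "sdc".
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
queue_depth_of() {
    dev=$1
    cat "${SYSFS_ROOT:-/sys}/block/$dev/device/queue_depth"
}
```

For my disk this would be called as: queue_depth_of sdc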

SOLUTION:

Change the queue_depth value to "1" (without quotes). Again as root, type:


# hdparm -Q 1 /dev/sdX
 setting queue_depth to 1
 queue_depth     1

DONE!

After a lot of tests I didn't get any noticeable difference in performance.
Stability is 100%.
In my tests, any value other than 1 gave the same crash result seen above.
Only queue_depth = 1 was stable.
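
The value can also be written through sysfs and read back to confirm the kernel accepted it. The sketch below is an alternative to the hdparm call, under the assumption that your kernel exposes a writable queue_depth attribute for the disk. The names set_queue_depth and SYSFS_ROOT are made up for this post, and on a real system the function needs root.

```shell
#!/bin/sh
# Set the queue depth of a disk through sysfs and confirm the result.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
set_queue_depth() {
    dev=$1
    depth=$2
    attr="${SYSFS_ROOT:-/sys}/block/$dev/device/queue_depth"
    echo "$depth" > "$attr" || return 1
    # Read the value back: the kernel may clamp or reject the request.
    actual=$(cat "$attr")
    if [ "$actual" = "$depth" ]; then
        echo "$dev: queue_depth is now $actual"
    else
        echo "$dev: asked for $depth but queue_depth is $actual" >&2
        return 1
    fi
}
```

For my disk this would be called as: set_queue_depth sdc 1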


To make the change persistent on your Linux machine, create a udev rule so that every time the system detects the disk (for example at boot time) it sets this value to 1.

Here is the custom rule I've created.


STEPS
  1. Create a file called 999-discHD103UJ.rules inside /etc/udev/rules.d/ (note the ".rules" extension). Use nano, vim, leafpad, or whatever text editor you have installed, as root or with sudo.
  2. Find your device by its id.
    • In a terminal, type:
    • ls /dev/disk/by-id/
    • Find your device in the output.

    My output is:
    ata-SAMSUNG_HD103UJ_S13PJ90Z108076
    ata-SAMSUNG_HD103UJ_S13PJ90Z108076-part1
    ata-SAMSUNG_HD103UJ_S13PJ90Z108076-part5
    Note: DO NOT use the lines ending in "-part1" or "-part5", as they are the partitions of the drive.
    You only want the first line, the one without a "-partN" suffix.
    Use it to replace the device name after /dev/disk/by-id/ in the rule in step 4.
  3. Write the line in step 4 into that file (it only works if ATTRS{model}=="SAMSUNG HD103UJ "; also make sure you understood step 2).
  4. SUBSYSTEMS=="scsi", ATTRS{model}=="SAMSUNG HD103UJ ", RUN+="/usr/bin/hdparm -Q 1 /dev/disk/by-id/ata-SAMSUNG_HD103UJ_S13PJ90Z108076"
  5. Save your changes and exit.
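
The steps above can be sketched as two small shell helpers: one that drops the "-partN" entries from a by-id listing, and one that prints a rule line in the same shape as the one in step 4. Both function names (whole_disks_only and make_queue_depth_rule) are made up for this post. The /usr/bin/hdparm path is copied from my rule; on your system hdparm may live elsewhere, so check with "command -v hdparm".

```shell
#!/bin/sh
# Keep only whole-disk names from a /dev/disk/by-id listing read on stdin,
# dropping the "-partN" partition entries.
whole_disks_only() {
    grep -v -e '-part[0-9][0-9]*$'
}

# Print a udev rule line in the same shape as the one in step 4.
make_queue_depth_rule() {
    model=$1   # must match ATTRS{model} exactly, including trailing spaces
    by_id=$2   # whole-disk name from /dev/disk/by-id/
    printf 'SUBSYSTEMS=="scsi", ATTRS{model}=="%s", RUN+="/usr/bin/hdparm -Q 1 /dev/disk/by-id/%s"\n' "$model" "$by_id"
}
```

A possible way to use them as root: run "ls /dev/disk/by-id/ | whole_disks_only" to pick the name, then redirect the output of make_queue_depth_rule "SAMSUNG HD103UJ " ata-SAMSUNG_HD103UJ_S13PJ90Z108076 into /etc/udev/rules.d/999-discHD103UJ.rules. The exact model string for your disk can be looked up with "udevadm info --attribute-walk --name=/dev/sdX", and "udevadm control --reload-rules" followed by "udevadm trigger" should apply the rule without a reboot.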


  • After these changes I never had any more issues with the disk, and dmesg showed no more errors at all.


Please give it a thumbs up or share it if it helped in some way.

Thanks for passing by.











