alt.os.linux.mint
Paul to Felix
Re: Hard drive question (1/2)
25 Jul 25 10:01:09
   From: nospam@needed.invalid   
      
   On Fri, 7/25/2025 12:47 AM, Felix wrote:   
   >   
   > How does LM treat HD bad sectors? Can it identify and   
   > mark them (if any) 'not for use'? or is there an app   
   > that will do it? thanks all,   
      
   https://askubuntu.com/questions/1127377/mark-ext4-blocks-as-bad-manually-without-trying-to-read-write
      
   The problem I have with this idea is that later, if you buy a
   new hard drive and you want to clone over the drive (say using
   ddrescue), you would also copy the portion of the file system that
   declares some blocks bad. The badblock information is really
   "private" to that particular drive and should not follow a clone.
      
   What you have to decide for yourself is how far to push
   the HDD before transferring the data to a second drive.
      
   *******   
      
   The hard drive has automatic sparing, which means if there   
   is trouble with a sector, the drive has some spare sectors   
   in the immediate area. And a table of spared blocks is   
   maintained by the drive, independent of anything the user   
   is doing. When the drive is getting low on spare sectors,   
   the SMART "Reallocated" statistic raw data field goes non-zero,   
   indicating drive life is on the warning track.   
      
   The "smartctl" utility from the smartmontools package can tell
   you how healthy the drive is.
      
       sudo smartctl -a /dev/sda   
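   
   If you only want the one number, a small filter can pull it out of the
   report. This is a sketch, not part of smartmontools: the function name
   reallocated_raw is made up, and it assumes the usual ATA attribute table
   where Reallocated_Sector_Ct carries its raw value in the tenth column.
   NVMe drives report health differently and will not match.

```shell
# Hypothetical helper: print the raw Reallocated_Sector_Ct value from a
# smartctl report read on stdin. Assumes the standard ATA attribute table
# layout (ID, name, flag, value, worst, thresh, type, updated, when_failed,
# raw_value), so the raw value is field 10.
reallocated_raw() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10; found = 1 }
         END { exit !found }'   # non-zero exit status if the attribute is absent
}

# Usage sketch (not run here):
#   sudo smartctl -A /dev/sda | reallocated_raw
```

   Saving the number with a date each time you check makes it easy to see
   whether the count is creeping upward.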
      
   SMART gives its best warnings, when the drive errors are   
   independent of one another, and uniformly spread out. SMART   
   gives a less-useful warning, when the drive has a "bad spot",   
   as all the spares in the bad spot can be exhausted and yet   
   the drive health will be declared as "Good".   
      
   A bad spot in a disk can be detected (though not all that accurately)
   by testing the disk with a transfer benchmark. For example, on
   one drive I had, there was a 70GB wide area that transferred data
   at 10MB/sec (which is abnormally low). The drive health was listed
   as "Good", which is rubbish, as the drive was obviously not normal
   at that point. I transferred the data off the drive.
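   
   If you have no benchmark program handy, the same idea can be roughed out
   with dd. This is only a sketch: scan_speed is a made-up name, the timing
   uses GNU date (%N), and the numbers are coarse. It reads the target in
   fixed-size chunks and prints the starting offset and an approximate
   MiB/sec figure for each chunk, so a slow region stands out in the list.

```shell
# Hypothetical helper: read a device or file chunk by chunk and print
# "offset_MiB approx_MiB_per_sec" for each chunk. Read-only.
#   $1 = target (e.g. /dev/sda, needs root)
#   $2 = chunk size in MiB
#   $3 = total MiB to cover
scan_speed() {
    target=$1
    chunk_mib=$2
    total_mib=$3
    offset=0
    while [ "$offset" -lt "$total_mib" ]; do
        start=$(date +%s%N)                       # nanoseconds, GNU date
        # For a real drive, consider adding iflag=direct to bypass the page cache.
        dd if="$target" of=/dev/null bs=1M skip="$offset" count="$chunk_mib" 2>/dev/null
        end=$(date +%s%N)
        elapsed_ms=$(( (end - start) / 1000000 ))
        [ "$elapsed_ms" -gt 0 ] || elapsed_ms=1   # avoid dividing by zero
        echo "$offset $(( chunk_mib * 1000 / elapsed_ms ))"
        offset=$(( offset + chunk_mib ))
    done
}

# Usage sketch (not run here): 1 GiB chunks over the first 500 GiB
#   sudo sh -c '. ./scan_speed.sh; scan_speed /dev/sda 1024 512000'
```

   A healthy drive shows a smooth decline from the outer tracks to the
   inner ones. A run of chunks far below their neighbours is the kind of
   bad spot described above.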
      
   *******   
      
   Blocks with problems are kept in a queue for maintenance activity
   when an attempt is made to write the block. The drive checks whether
   the write is working or not and whether the block needs to be spared,
   spares the block out if so, and so on. This is all automated and may
   slow the drive down a bit while the determination is made.
      
   If you write the drive surface:   
      
       # Do a backup first, *before* the next command   
      
       sudo smartctl -a /dev/sda   # Record health info before run begins.
      
       sudo dd if=/dev/zero of=/dev/sda bs=221184   # Destructive write test   
      
   Then do some reads:   
      
       # Read verify, test will stop if bad block present.
       sudo dd if=/dev/sda of=/dev/null bs=221184
   
       # Alternately, ddrescue of gddrescue package can be used to generate
       # a logfile with badblock info. This sequence differs from the
       # previous command in that the command should always finish.
       sudo ddrescue -f -n /dev/sda /dev/null /root/rescue.log
       xed /root/rescue.log
      
       # Look to see if Reallocated raw data increased by a couple hundred,
       # indicating the questionable blocks have been permanently harvested.
       # The raw data field might have a range of 0..5500 or so, just to give
       # some idea how worried you should be when Reallocated = 300.
       sudo smartctl -a /dev/sda
      
       # Restore disk from backup once the harvesting is complete and you
       # are happy.
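   
   Rather than reading rescue.log by eye, you can total up the bad areas.
   This is a sketch with a made-up function name (bad_bytes). It assumes the
   GNU ddrescue logfile/mapfile layout: comment lines starting with "#", one
   status line, then "position size status" lines with hex values, where a
   status of "-" marks a bad area.

```shell
# Hypothetical helper: sum the sizes of all areas marked "-" (bad) in a
# GNU ddrescue logfile/mapfile and print the total in bytes.
bad_bytes() {
    total=0
    while read -r pos size status rest; do
        case $pos in
            '#'*|'') continue ;;          # skip comments and blank lines
        esac
        if [ "$status" = "-" ]; then
            total=$(( total + $size ))    # sizes are hex like 0x200
        fi
    done < "$1"
    echo "$total"
}

# Usage sketch (not run here; the file is root-owned in the example above):
#   sudo sh -c '. ./bad_bytes.sh; bad_bytes /root/rescue.log'
```

   A result of 0 means ddrescue read everything. Anything else, divided by
   the sector size, gives a rough count of sectors the drive could not read.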
      
   But when the Reallocated SMART parameter raw data field goes non-zero,
   it is time to move the data off the disk and onto another disk.
   You can punish a drive and use up all the spares in a region,
   forcing the drive to declare an actual "CRC error" on a block there,
   but then you need to start using badblocks for EXT4 to manage the
   defects and keep the file system from using the now non-functional
   blocks. And if you do that, if you resort to manual badblock management,
   the main danger is accidentally transferring the (inaccurate for a second
   hard drive) badblock data to a new disk. You are really better off
   with the disks doing their own bad block management, and you, the
   operator, monitoring SMART Reallocated plus watching for "benchmark
   bad spots" as indicators the drive is at end-of-life.
      
   *******   
      
   The last hard drive I opened, a Seagate, I was shocked at what I found.   
   The drive only had about 10,000 hours on it, when taken out of service.   
   The Reallocated might have been 300. What did I find? A single platter,
   which is to be expected on some of your hard drive fleet of course.   
   What I didn't expect to find, is there was no landing ramp for the   
   heads inside the drive. The head just sits on the platter. I looked it   
   up, and after the "stiction era" (Quantum Fireball era or so), they
   had found a way to "laser pattern" the area near the platter hub and   
   make a "non-stiction area" for the heads to park when the drive   
   spins down. While modern lubricants (polymer finish) are fairly   
   robust, not having a landing ramp for the head, that is just not a   
   best practice, and guarantees if you cycle the power every day   
   on the computer, the drive does lots of spinning down and wearing   
   the heads as the heads skate over the surface.   
      
   And that's why the drive had lasted only 10,000 hours. It was because   
   even though the drives are in the modern era and science had discovered   
   the benefits of landing ramps, my drive didn't have a plastic landing ramp.   
      
   And this is just in case you do not understand why you didn't get
   50,000 hours from an HDD. But you only figure things like this out
   by examining the drive after it reaches end of life, to see whether   
   the drive was too cheaply made. I never expected to find such an   
   idiotic development, as to be dragging the heads across the platter   
   when I opened the drive. I had expected to find dirt or rubbish inside   
   the drive, proportional to a surface degradation, but the filter   
   pack was still lily white and the platter surface was impeccable
   to the eye, yet it had spared out enough blocks to be end-of-life.   
   This means I'd need a microscope to find the damage that was   
   present on the drive platter.   
      
   *******   
      
   When the first hard drives came out for consumers, I tested them   
   in the lab. I took the factory bad block list, and the grown   
   defect list, reset them, and had the drive scan for bad blocks.   
   What was interesting, is the drive exactly reproduced the same   
   defect list as was present in the lists. This is just in case   
   you were thinking "oh, those blocks aren't really bad and   
      
   [continued in next message]   
      