Gentoo Forums
HW problem I am afraid[Solved]
apiaio
Apprentice


Joined: 04 Dec 2008
Posts: 208

Posted: Fri Jul 08, 2016 5:34 pm    Post subject: HW problem I am afraid[Solved]

On the last boot, my Gentoo installation on the SSD did not start the X server. Console message:
Quote:
This is (none).unknown_domain(Linux x86_64 4.1.15-gentoo-r1)
dmesg
Quote:
...[sda] tag#0 FAILED Result: hostbyte=DID_OK...
[sda] tag#0 Sense Key : Medium Error [current] [descriptor]
[sda] tag#0 Add. Sense: Unrecovered read error - auto reallocate failed...etc
When I try to write something to the SSD:
Quote:
read-only file system

Does this mean that the life of my SSD is over?
How can I check this SSD?


Last edited by apiaio on Sat Jul 09, 2016 3:04 pm; edited 1 time in total
frostschutz
Advocate


Joined: 22 Feb 2005
Posts: 2971
Location: Germany

Posted: Fri Jul 08, 2016 5:57 pm

smartctl -a /dev/sda?
apiaio
Apprentice


Joined: 04 Dec 2008
Posts: 208

Posted: Fri Jul 08, 2016 6:10 pm

frostschutz wrote:
smartctl -a /dev/sda?
Code:
sabayonx86-64 miro # smartctl -a /dev/sda
bash: smartctl: command not found
Right now I am booted into Sabayon, which is installed on sdb.
frostschutz
Advocate


Joined: 22 Feb 2005
Posts: 2971
Location: Germany

Posted: Fri Jul 08, 2016 6:27 pm

it's part of smartmontools :roll:
The Doctor
Moderator


Joined: 27 Jul 2010
Posts: 2585

Posted: Fri Jul 08, 2016 6:30 pm

So install sys-apps/smartmontools ;)


Reading the documentation on it might also help since this is one of those tools you really want to have on all your systems.
_________________
First things first, but not necessarily in that order.

Apologies if I take a while to respond. I'm currently working on the dematerialization circuit for my blue box.
apiaio
Apprentice


Joined: 04 Dec 2008
Posts: 208

Posted: Fri Jul 08, 2016 6:39 pm

Well.

I have switched to another Gentoo installation on sdc and installed sys-apps/smartmontools.
BTW
Quote:
localhost miro # emerge smartmontools -vp
* Last emerge --sync was 2y 168d 2h 26m 38s ago.
Code:
localhost miro # smartctl -a /dev/sda
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.10.25-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Indilinx Barefoot based SSDs
Device Model:     Corsair CSSD-V32GB2
Serial Number:    1106650500FF10200281
Firmware Version: 2.2
User Capacity:    32,017,047,552 bytes [32.0 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
Local Time is:    Fri Jul  8 18:34:02 2016 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x1d) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Abort Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x00) Error logging NOT supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   0) minutes.
Extended self-test routine
recommended polling time:        (   0) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0000   ---   ---   ---    Old_age   Offline      -       6
  9 Power_On_Hours          0x0000   ---   ---   ---    Old_age   Offline      -       5808
 12 Power_Cycle_Count       0x0000   ---   ---   ---    Old_age   Offline      -       8158
184 Initial_Bad_Block_Count 0x0000   ---   ---   ---    Old_age   Offline      -       28
195 Program_Failure_Blk_Ct  0x0000   ---   ---   ---    Old_age   Offline      -       1
196 Erase_Failure_Blk_Ct    0x0000   ---   ---   ---    Old_age   Offline      -       0
197 Read_Failure_Blk_Ct     0x0000   ---   ---   ---    Old_age   Offline      -       3
198 Read_Sectors_Tot_Ct     0x0000   ---   ---   ---    Old_age   Offline      -       2026892145
199 Write_Sectors_Tot_Ct    0x0000   ---   ---   ---    Old_age   Offline      -       1410713922
200 Read_Commands_Tot_Ct    0x0000   ---   ---   ---    Old_age   Offline      -       27692167
201 Write_Commands_Tot_Ct   0x0000   ---   ---   ---    Old_age   Offline      -       16148660
202 Error_Bits_Flash_Tot_Ct 0x0000   ---   ---   ---    Old_age   Offline      -       6937082
203 Corr_Read_Errors_Tot_Ct 0x0000   ---   ---   ---    Old_age   Offline      -       6287996
204 Bad_Block_Full_Flag     0x0000   ---   ---   ---    Old_age   Offline      -       0
205 Max_PE_Count_Spec       0x0000   ---   ---   ---    Old_age   Offline      -       5000
206 Min_Erase_Count         0x0000   ---   ---   ---    Old_age   Offline      -       691
207 Max_Erase_Count         0x0000   ---   ---   ---    Old_age   Offline      -       3676
208 Average_Erase_Count     0x0000   ---   ---   ---    Old_age   Offline      -       1666
209 Remaining_Lifetime_Perc 0x0000   ---   ---   ---    Old_age   Offline      -       67
211 SATA_Error_Ct_CRC       0x0000   ---   ---   ---    Old_age   Offline      -       0
212 SATA_Error_Ct_Handshake 0x0000   ---   ---   ---    Old_age   Offline      -       0
213 Indilinx_Internal       0x0000   ---   ---   ---    Old_age   Offline      -       0

Warning! SMART ATA Error Log Structure error: invalid SMART checksum.
SMART Error Log Version: 1
No Errors Logged

Warning! SMART Self-Test Log Structure error: invalid SMART checksum.
SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


Selective Self-tests/Logging not supported
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 43378
Location: 56N 3W

Posted: Fri Jul 08, 2016 7:03 pm

apiaio,

Code:
Error logging capability:        (0x00) Error logging NOT supported.
                                        General Purpose Logging supported.

There is nothing useful there.

Run the long self-test and look at the results. That's the same as reading the entire content of the drive to /dev/null, except it's all internal to the drive.
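
A minimal sketch of that step, assuming the suspect drive is /dev/sda as in this thread. `-t long` and `-l selftest` are standard smartctl options. The `run` helper and the `DRY_RUN` variable are made up for this example, so that pasting the block only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# DRY_RUN and run() are illustrative helpers, not part of smartmontools.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"        # only show what would be executed
    else
        "$@"
    fi
}

DEV=${DEV:-/dev/sda}     # assumption: the suspect drive from this thread

# Start the extended (long) self-test; the drive runs it in the background.
run smartctl -t long "$DEV"

# Once the test has had time to finish, read the self-test log.
run smartctl -l selftest "$DEV"

# Rough external equivalent: read every sector and discard the data.
run dd if="$DEV" of=/dev/null bs=1M
```

The dd read goes over the SATA link, so unlike the self-test it also exercises the cable and controller.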

Quote:
[sda] tag#0 Sense Key : Medium Error [current] [descriptor]
[sda] tag#0 Add. Sense: Unrecovered read error - auto reallocate failed...etc
suggests that a read failed and that the failure is internal to the drive.
However, cheap, poor-quality SATA data cables have been known to cause similar effects, and a SATA cable costs much less than an SSD.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
Logicien
Veteran


Joined: 16 Sep 2005
Posts: 1369
Location: Montréal

Posted: Fri Jul 08, 2016 8:29 pm

Can you read and write the defective Gentoo SSD from Sabayon? That is, can you mount the filesystems on its partitions and access the data?
_________________
Paul
apiaio
Apprentice


Joined: 04 Dec 2008
Posts: 208

Posted: Sat Jul 09, 2016 6:46 am

Logicien wrote:
Can you read and write the defective Gentoo SSD from Sabayon? That is, can you mount the filesystems on its partitions and access the data?
Yes
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 43378
Location: 56N 3W

Posted: Sat Jul 09, 2016 6:51 am

apiaio,

That you can read/write elsewhere on the filesystem suggests that the SATA data cable is OK.

What did the smartctl long test report?

dmesg should have given you a block number for the error.
Can you read that block with dd?
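
One way to try that read, as a sketch: dd can fetch a single 512-byte sector by number. The sector number has to come from the kernel message (the excerpts posted in this thread do not include it), so BLOCK below is only a placeholder. To keep the example harmless it reads from a scratch file standing in for the drive; the function name `read_sector` is made up for illustration.

```shell
#!/bin/sh
# read_sector DEVICE SECTOR: read one 512-byte sector and discard it.
# The exit status is non-zero if dd reports a read error.
read_sector() {
    dd if="$1" of=/dev/null bs=512 skip="$2" count=1 2>/dev/null
}

# Stand-in for the real drive so the example is safe to run as-is:
# an 8-sector scratch file. On the real system DEV would be /dev/sda.
DEV=$(mktemp)
dd if=/dev/zero of="$DEV" bs=512 count=8 2>/dev/null

BLOCK=5    # placeholder; use the sector number reported in dmesg

if read_sector "$DEV" "$BLOCK"; then
    echo "sector $BLOCK is readable"
else
    echo "sector $BLOCK could not be read"
fi
```

On a real drive, GNU dd's `iflag=direct` can be added so the read bypasses the page cache and actually reaches the device.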
apiaio
Apprentice


Joined: 04 Dec 2008
Posts: 208

Posted: Sat Jul 09, 2016 6:59 am

NeddySeagoon wrote:
apiaio,

That you can read/write elsewhere on the filesystem suggests that the SATA data cable is OK.

What did the smartctl long test report?

dmesg should have given you a block number for the error.
Can you read that block with dd?
I am not sure if I am using the smartctl command correctly:
Code:
localhost gen # smartctl —test=long /dev/sda
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.10.25-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

ERROR: smartctl takes ONE device name as the final command-line argument.
You have provided 2 device names:
—test=long
/dev/sda

Use smartctl -h to get a usage summary

Code:
localhost gen # dmesg|grep sda
[    0.777125] sd 0:0:1:0: [sda] 62533296 512-byte logical blocks: (32.0 GB/29.8 GiB)
[    0.777302] sd 0:0:1:0: [sda] Write Protect is off
[    0.777307] sd 0:0:1:0: [sda] Mode Sense: 00 3a 00 00
[    0.777333] sd 0:0:1:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    0.777651]  sda: sda1
[    0.777888] sd 0:0:1:0: [sda] Attached SCSI disk
[  124.823372] EXT3-fs (sda): error: can't find ext3 filesystem on dev sda.
[  124.823505] EXT4-fs (sda): VFS: Can't find ext4 filesystem
[  124.823638] EXT4-fs (sda): VFS: Can't find ext4 filesystem
[  124.823762] FAT-fs (sda): bogus number of FAT structure
[  124.823765] FAT-fs (sda): Can't find a valid FAT filesystem
[  124.823879] FAT-fs (sda): bogus number of FAT structure
[  124.823880] FAT-fs (sda): Can't find a valid FAT filesystem
[  124.824104] EXT3-fs (sda): error: can't find ext3 filesystem on dev sda.
[  124.824218] EXT4-fs (sda): VFS: Can't find ext4 filesystem
[  124.824356] EXT4-fs (sda): VFS: Can't find ext4 filesystem
[  124.824483] FAT-fs (sda): bogus number of FAT structure
[  124.824485] FAT-fs (sda): Can't find a valid FAT filesystem
[  124.824606] FAT-fs (sda): bogus number of FAT structure
[  124.824608] FAT-fs (sda): Can't find a valid FAT filesystem
[  124.835542] NTFS-fs warning (device sda): is_boot_sector_ntfs(): Invalid boot sector checksum.
[  124.835545] NTFS-fs error (device sda): read_ntfs_boot_sector(): Primary boot sector is invalid.
[  124.835547] NTFS-fs error (device sda): read_ntfs_boot_sector(): Mount option errors=recover not used. Aborting without trying to recover.
[  124.835550] NTFS-fs error (device sda): ntfs_fill_super(): Not an NTFS volume.
[  124.836138] XFS (sda): bad magic number
[  124.836156] XFS (sda): Internal error xfs_sb_read_verify at line 730 of file fs/xfs/xfs_mount.c.  Caller 0xffffffff812e2aa5
[  124.836207] XFS (sda): Corruption detected. Unmount and run xfs_repair
[  124.836244] XFS (sda): SB validate failed with error 22.
[  256.317421] EXT4-fs (sda1): warning: mounting fs with errors, running e2fsck is recommended
[  256.317760] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[  435.604341] EXT4-fs (sda1): re-mounted. Opts: commit=0
[  556.677658] EXT4-fs (sda1): error count: 1
[  556.677664] EXT4-fs (sda1): initial error at 1467917189: ext4_find_entry:1457: inode 1049072
[  556.677668] EXT4-fs (sda1): last error at 1467917189: ext4_find_entry:1457: inode 1049072
How do I read a block with dd? :oops:
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 43378
Location: 56N 3W

Posted: Sat Jul 09, 2016 7:24 am

apiaio,

The syntax is
Code:
-t TEST, --test=TEST
Notice one hyphen to introduce short options and two for long options.
Your message shows that smartctl did not understand the command.
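
One plausible reading of that error, sketched below: the option in the failing command starts with a single long typographic dash (U+2014) rather than two ASCII hyphens, so it was treated as a device name. The helper `is_option` is made up for illustration and only mimics the "starts with a hyphen" rule that option parsers apply; the bad argument is rebuilt from its UTF-8 bytes.

```shell
#!/bin/sh
# Return success if the argument begins with an ASCII hyphen, which is what
# command-line option parsers look for.
is_option() {
    case "$1" in
        -*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Hypothetical reconstruction of the failing argument:
# U+2014 (bytes E2 80 94 in UTF-8) followed by "test=long".
bad_arg=$(printf '\342\200\224test=long')

for arg in "-t" "--test=long" "$bad_arg"; do
    if is_option "$arg"; then
        echo "option:     $arg"
    else
        echo "not option: $arg"
    fi
done
```

Retyping the command by hand as `smartctl -t long /dev/sda` or `smartctl --test=long /dev/sda` avoids the problem.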

Code:
[  256.317421] EXT4-fs (sda1): warning: mounting fs with errors, running e2fsck is recommended
[  256.317760] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)

Don't run e2fsck yet. That can make a bad situation worse. Do not mount the filesystem read-write either.
It's a really bad idea to write to a damaged filesystem.

Make an image of the drive using ddrescue before you attempt any data recovery.
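
A sketch of that imaging step, assuming the damaged filesystem is on /dev/sda1 and that there is room for an image on a different, healthy disk. The paths under /mnt/backup are hypothetical. `-n`, `-d` and `-r` are standard ddrescue options, and the third argument is the mapfile (called a logfile in older releases). The `run` helper and `DRY_RUN` are made up so that the block only prints the commands by default.

```shell
#!/bin/sh
# DRY_RUN and run() are illustrative helpers, not part of ddrescue.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"        # only show what would be executed
    else
        "$@"
    fi
}

SRC=${SRC:-/dev/sda1}               # damaged partition (from this thread)
IMG=${IMG:-/mnt/backup/sda1.img}    # hypothetical image file on a healthy disk
MAP=${MAP:-/mnt/backup/sda1.map}    # hypothetical mapfile, lets ddrescue resume

# Pass 1: copy everything that reads cleanly, skipping the slow
# scraping of bad areas (-n).
run ddrescue -n "$SRC" "$IMG" "$MAP"

# Pass 2: go back for the bad areas with direct disc access (-d),
# retrying up to 3 times (-r3).
run ddrescue -d -r3 "$SRC" "$IMG" "$MAP"

# Check the copy without changing it (-n answers "no" to every question).
run e2fsck -f -n "$IMG"
```

Any repair is then done on the copy, which leaves the original untouched if the repair goes wrong.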

Code:
[  556.677664] EXT4-fs (sda1): initial error at 1467917189: ext4_find_entry:1457: inode 1049072

inode 1049072 is a pointer to the damaged object in the filesystem. It can be a directory or a file.
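
The number after "initial error at" in that log line is a Unix timestamp (seconds since 1970-01-01 UTC), so it also tells you when the filesystem first noticed the problem. A quick sketch of the conversion, assuming GNU date (the `-d @N` form is a GNU extension):

```shell
#!/bin/sh
# Convert the epoch value from the EXT4-fs error line to a readable UTC time.
ts=1467917189
date -u -d "@$ts" '+%Y-%m-%d %H:%M:%S UTC'
```

This prints 2016-07-07 18:46:29 UTC, the evening before the first post in this thread.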

Code:
ls -Ri /mnt/point | grep 1049072
will recursively read all of the directories starting at /mnt/point, then print the name of the object using inode 1049072.
It's possible that inode 1049072 is used for filesystem metadata, but that's a long way down the drive for anything but a backup superblock.
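
An alternative sketch using find, which matches the inode number exactly instead of as a substring of the ls output. `-inum` and `-xdev` are standard find options. The scratch directory below only stands in for /mnt/point so that the example can be run safely; its path names are made up.

```shell
#!/bin/sh
# Scratch tree standing in for the mounted filesystem (/mnt/point in the post).
root=$(mktemp -d)
mkdir -p "$root/home/user/profile"

# Look up the inode number of the directory we just made ...
inode=$(ls -di "$root/home/user/profile" | awk '{print $1}')

# ... and find the path again from the number alone.
# -xdev keeps find on one filesystem, since inode numbers are only
# unique within a filesystem.
find "$root" -xdev -inum "$inode"
```

On the real system the equivalent would be `find /mnt/point -xdev -inum 1049072`, with the filesystem mounted read-only.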
apiaio
Apprentice


Joined: 04 Dec 2008
Posts: 208

Posted: Sat Jul 09, 2016 8:55 am

So far I have done
Code:
localhost miro # ddrescue -f -n /dev/sda1 /dev/sdc6 logfile


GNU ddrescue 1.16
Press Ctrl-C to interrupt
rescued:    32015 MB,  errsize:    4096 B,  current rate:   39055 kB/s
   ipos:    17213 MB,   errors:       1,    average rate:   67974 kB/s
   opos:    17213 MB,     time since last successful read:       0 s
Finished     
and
Code:
localhost miro # e2fsck -v -f /dev/sdc6
e2fsck 1.42.7 (21-Jan-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Directory inode 1049072, block #0, offset 0: directory corrupted
Salvage<y>? yes
Missing '.' in directory inode 1049072.
Fix<y>? yes
Setting filetype for entry '.' in ??? (1049072) to 2.
Missing '..' in directory inode 1049072.
Fix<y>? yes
Setting filetype for entry '..' in ??? (1049072) to 2.
Pass 3: Checking directory connectivity
'..' in /home/miro/.mozilla/firefox/ci7aym0p.default/minidumps (1049072) is <The NULL inode> (0), should be /home/miro/.mozilla/firefox/ci7aym0p.default (1049066).
Fix<y>? yes
Pass 4: Checking reference counts
Inode 2 ref count is 26, should be 27.  Fix<y>? yes
Inode 1049066 ref count is 14, should be 13.  Fix<y>? yes
Pass 5: Checking group summary information

/dev/sdc6: ***** FILE SYSTEM WAS MODIFIED *****

      467941 inodes used (23.90%, out of 1957888)
         680 non-contiguous files (0.1%)
         177 non-contiguous directories (0.0%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 454045/46
     2499673 blocks used (31.98%, out of 7816406)
           0 bad blocks
           1 large file

      412159 regular files
       41735 directories
         174 character device files
          97 block device files
           2 fifos
         244 links
       13765 symbolic links (13569 fast symbolic links)
           0 sockets
------------
      468176 files

What should be the next step? May I format sda1 and
Code:
ddrescue -f -n /dev/sdc6 /dev/sda1
?
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 43378
Location: 56N 3W

Posted: Sat Jul 09, 2016 2:31 pm

apiaio,

Your filesystem is self consistent. You have lost exactly one block.

However, it was
Code:
Directory inode 1049072, block #0, offset 0: directory corrupted

That means that all the files indexed from that directory block are no longer accessible.
We know that it was the first or only directory block in that directory as
Code:
Missing '.' in directory inode 1049072.
Fix<y>? yes
Setting filetype for entry '.' in ??? (1049072) to 2.
Missing '..' in directory inode 1049072.
Those are the entries for this directory ('.') and its parent ('..'), which are made by mkdir.

It looks like the damage is confined to /home/miro/.mozilla/firefox/ci7aym0p.default, which looks like user miro's Firefox profile.
There is no need to format and restore your backup. You will end up doing the fsck and getting to the same place you are now.
Do keep your backup.

If that were a conventional HDD, I would suspect mechanical problems and only use it for things I could afford to lose.
In effect, the drive can no longer read its own writing. However, SSDs are different. They don't have mechanical problems.
It's likely that only a single memory cell has failed, so that the drive cannot read the data to remap it.
One cell failing (there are four memory cells per byte) says nothing about any other cells.

I would keep using the drive and run the long test every few days to check for more errors.
I might even make a new profile for Firefox, but that's a pain as all your history, bookmarks and so on will vanish.
Unlike a HDD, you cannot force a sector remap by writing to the faulty sector, as SSDs do a remap on write anyway, to avoid the erase penalty.
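
For the periodic check, the counters worth watching can be pulled out of the attribute table with a little awk. This is only a sketch: `smart_raw` is a made-up helper that reads `smartctl -A` style output on stdin and prints the RAW_VALUE column for one attribute ID. The sample rows are copied from the smartctl output earlier in this thread.

```shell
#!/bin/sh
# smart_raw ID: print the last column (RAW_VALUE) of the row whose first
# column equals ID. Expects `smartctl -A /dev/sdX` output on stdin.
smart_raw() {
    awk -v id="$1" '$1 == id { print $NF }'
}

# Sample rows taken from the smartctl output posted above.
sample='197 Read_Failure_Blk_Ct     0x0000   ---   ---   ---    Old_age   Offline      -       3
209 Remaining_Lifetime_Perc 0x0000   ---   ---   ---    Old_age   Offline      -       67'

echo "read failures:      $(printf '%s\n' "$sample" | smart_raw 197)"
echo "remaining lifetime: $(printf '%s\n' "$sample" | smart_raw 209)"
```

On a live system this would be `smartctl -A /dev/sda | smart_raw 197`. Attribute names and meanings are vendor specific, but a value for 197 that keeps rising between runs would suggest that more blocks are becoming unreadable.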
apiaio
Apprentice


Joined: 04 Dec 2008
Posts: 208

Posted: Sat Jul 09, 2016 3:03 pm

After fsck everything works again, even Firefox.
Thanks
All times are GMT