Gentoo Forums
Seagate portable drive crashes
depontius
Advocate


Joined: 05 May 2004
Posts: 3383

Posted: Fri Jun 24, 2016 1:25 am    Post subject: Seagate portable drive crashes

For my backup, I have three portable 2TB hard drives and two USB3 ports to plug them into. The mainboard is a few years old, so the USB3 ports are on a PCIe card. The idea here is to always have two drives plugged in, and one "offsite" - in my cabinet at work. Every night I back up my RAID1 onto the older portable drive. Every Friday morning I take the newer portable backup to work, bring the one from work home, and plug it in. There is always at least one backup drive plugged into the system. Most of the time I have two days' backups, sometimes only one. This already saved me once, when my btrfs volume "went away" and lost absolutely everything, including the ability to scan for a btrfs filesystem.

Anyway, two of the drives are Western Digital and one is Seagate. Some time back the Seagate drive became intermittent, taking two plugins to get it to respond, then three, and finally I left it plugged in, rotating only the Western Digital drives, because I was scared the Seagate might never be detected again.

Realizing the Seagate was almost a year old, I finally decided to start warranty proceedings. Of course they insisted I test it on Windows. It passes, repeated plugs. Then I tested it on my work computer, running RedHat 6.7 - it passes repeated plugs. So tonight I tested it on my laptop, running Gentoo - it passes repeated plugs.

It only fails on the server where I want to use it. I did more testing tonight, including plugging it into one of the mainboard's native USB2 jacks, thinking it might be an interaction with xhci. Fails, same signature. So here is the signature of a plug/unplug, and it's repeatable. I've got multiple copies, and USB2 looks the same as USB3, and Western Digital still plugs and unplugs happily.

Code:
Jun 22 06:28:23 secretHostname kernel: usb 4-4: new SuperSpeed USB device number 18 using xhci_hcd
Jun 22 06:28:23 secretHostname kernel: scsi host26: uas
Jun 22 06:28:23 secretHostname kernel: kworker/1:2: page allocation failure: order:7, mode:0x2204020
Jun 22 06:28:23 secretHostname kernel: CPU: 1 PID: 21500 Comm: kworker/1:2 Not tainted 4.4.2-hardened #1
Jun 22 06:28:23 secretHostname kernel: Hardware name: System manufacturer System Product Name/M4A785TD-M EVO, BIOS 2103    06/30/2010
Jun 22 06:28:23 secretHostname kernel: Workqueue: usb_hub_wq ffffffffa0052fdf
Jun 22 06:28:23 secretHostname kernel:  0000000000000000 0000000000000000 ffffc90006593558 ffffffff8131b53c
Jun 22 06:28:23 secretHostname kernel:  0000000000000006 0000000000000000 ffffc900065935f0 ffffffff810df82c
Jun 22 06:28:23 secretHostname kernel:  00000001065935a0 ffffffff81a4de78 ffffffffffffff80 0220402000000000
Jun 22 06:28:23 secretHostname kernel: Call Trace:
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff8131b53c>] dump_stack+0x45/0x63
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff810df82c>] warn_alloc_failed+0x113/0x131
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff810e1d8c>] __alloc_pages_nodemask+0x6bd/0x6f8
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff811133a8>] cache_alloc_refill+0x248/0x45e
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff81113729>] __kmalloc+0x8b/0xe0
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff812f2181>] init_tag_map+0xec/0x151
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff812f2222>] __blk_queue_init_tags+0x3c/0x77
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff812f2222>] ? __blk_queue_init_tags+0x3c/0x77
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff812f2275>] blk_init_tags+0x18/0x23
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff814017c0>] scsi_add_host_with_dma+0xd7/0x300
Jun 22 06:28:23 secretHostname kernel:  [<ffffffffa02f5341>] uas_probe+0x349/0x3b1 [uas]
Jun 22 06:28:23 secretHostname kernel:  [<ffffffffa005c876>] usb_probe_interface+0x161/0x1ef [usbcore]
Jun 22 06:28:23 secretHostname kernel:  [<ffffffffa00691d0>] ? __func__.40822+0x18/0x18 [usbcore]
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e41f4>] driver_probe_device+0x119/0x2a2
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e44c2>] __device_attach_driver+0x7a/0x87
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e4448>] ? driver_allows_async_probing+0x37/0x37
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e2772>] bus_for_each_drv+0x8a/0x9f
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e4037>] __device_attach+0xa1/0x107
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e4644>] device_initial_probe+0x17/0x1f
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e36d4>] bus_probe_device+0x37/0xad
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e18a4>] device_add+0x400/0x50b
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff8154d8da>] ? mutex_unlock+0x12/0x1a
Jun 22 06:28:23 secretHostname kernel:  [<ffffffffa005ae69>] usb_set_configuration+0x64e/0x6d7 [usbcore]
Jun 22 06:28:23 secretHostname kernel:  [<ffffffffa0068be0>] ? usb_bus_nb+0x20/0x20 [usbcore]
Jun 22 06:28:23 secretHostname kernel:  [<ffffffffa006640a>] generic_probe+0x43/0x7b [usbcore]
Jun 22 06:28:23 secretHostname kernel:  [<ffffffffa006640a>] ? generic_probe+0x43/0x7b [usbcore]
Jun 22 06:28:23 secretHostname kernel:  [<ffffffffa005c6fc>] usb_probe_device+0x37/0x50 [usbcore]
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e41f4>] driver_probe_device+0x119/0x2a2
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e44c2>] __device_attach_driver+0x7a/0x87
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e4448>] ? driver_allows_async_probing+0x37/0x37
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e2772>] bus_for_each_drv+0x8a/0x9f
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e4037>] __device_attach+0xa1/0x107
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e4644>] device_initial_probe+0x17/0x1f
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e36d4>] bus_probe_device+0x37/0xad
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff813e18a4>] device_add+0x400/0x50b
Jun 22 06:28:23 secretHostname kernel:  [<ffffffffa00525b6>] usb_new_device+0x279/0x3c8 [usbcore]
Jun 22 06:28:23 secretHostname kernel:  [<ffffffffa0053b7c>] hub_event+0xb9d/0xfac [usbcore]
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff81061ee2>] process_one_work+0x1af/0x2d9
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff81061ee2>] ? process_one_work+0x1af/0x2d9
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff81062546>] worker_thread+0x27e/0x383
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff810622c8>] ? rescuer_thread+0x28c/0x28c
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff81066ade>] kthread+0xdd/0xe5
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff81066a01>] ? kthread_create_on_node+0x17f/0x17f
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff8154fdae>] ret_from_fork+0x3e/0x70
Jun 22 06:28:23 secretHostname kernel:  [<ffffffff81066a01>] ? kthread_create_on_node+0x17f/0x17f
Jun 22 06:28:23 secretHostname kernel: Mem-Info:
Jun 22 06:28:23 secretHostname kernel: active_anon:25601 inactive_anon:41661 isolated_anon:0\x0a active_file:662966 inactive_file:558549 isolated_file:32\x0a unevictable:2025 dirty:0 writeback:0 unstable:0\x0a slab_reclaimable:282120 slab_unreclaimable:13809\x0a mapped:6080 shmem:68 pagetables:1088 bounce:0\x0a free:381067 free_pcp:704 free_cma:0
Jun 22 06:28:23 secretHostname kernel: DMA free:15896kB min:20kB low:24kB high:28kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jun 22 06:28:23 secretHostname kernel: lowmem_reserve[]: 0 3231 7716 7716
Jun 22 06:28:23 secretHostname kernel: DMA32 free:703796kB min:4700kB low:5872kB high:7048kB active_anon:25240kB inactive_anon:83812kB active_file:1071760kB inactive_file:925600kB unevictable:4964kB isolated(anon):0kB isolated(file):128kB present:3390016kB managed:3311376kB mlocked:4964kB dirty:0kB writeback:0kB mapped:11472kB shmem:136kB slab_reclaimable:459240kB slab_unreclaimable:18684kB kernel_stack:1232kB pagetables:1236kB unstable:0kB bounce:0kB free_pcp:1452kB local_pcp:716kB free_cma:0kB writeback_tmp:0kB pages_scanned:128 all_unreclaimable? no
Jun 22 06:28:23 secretHostname kernel: lowmem_reserve[]: 0 0 4485 4485
Jun 22 06:28:23 secretHostname kernel: Normal free:804576kB min:6524kB low:8152kB high:9784kB active_anon:77164kB inactive_anon:82832kB active_file:1579976kB inactive_file:1308596kB unevictable:3136kB isolated(anon):0kB isolated(file):0kB present:4718592kB managed:4592704kB mlocked:3136kB dirty:0kB writeback:0kB mapped:12848kB shmem:136kB slab_reclaimable:669240kB slab_unreclaimable:36552kB kernel_stack:3008kB pagetables:3116kB unstable:0kB bounce:0kB free_pcp:1432kB local_pcp:784kB free_cma:0kB writeback_tmp:0kB pages_scanned:128 all_unreclaimable? no
Jun 22 06:28:23 secretHostname kernel: lowmem_reserve[]: 0 0 0 0
Jun 22 06:28:23 secretHostname kernel: DMA: 2*4kB (U) 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (U) 3*4096kB (M) = 15896kB
Jun 22 06:28:23 secretHostname kernel: DMA32: 41444*4kB (UME) 44694*8kB (UME) 10170*16kB (UME) 540*32kB (UME) 10*64kB (UE) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 703968kB
Jun 22 06:28:23 secretHostname kernel: Normal: 54311*4kB (UME) 53590*8kB (UME) 9284*16kB (UME) 315*32kB (UME) 1*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 804652kB
Jun 22 06:28:23 secretHostname kernel: 1223345 total pagecache pages
Jun 22 06:28:23 secretHostname kernel: 166 pages in swap cache
Jun 22 06:28:23 secretHostname kernel: Swap cache stats: add 24241, delete 24075, find 12737/13300
Jun 22 06:28:23 secretHostname kernel: Free swap  = 16292484kB
Jun 22 06:28:23 secretHostname kernel: Total swap = 16383996kB
Jun 22 06:28:23 secretHostname kernel: 2031148 pages RAM
Jun 22 06:28:23 secretHostname kernel: 0 pages HighMem/MovableOnly
Jun 22 06:28:23 secretHostname kernel: 51153 pages reserved
Jun 22 06:28:23 secretHostname kernel: uas: probe of 4-4:1.0 failed with error -12
Jun 22 07:22:47 secretHostname kernel: usb 4-4: USB disconnect, device number 18


Anyone have a clue here? Obviously it's saying "page allocation failure", and near the end I see "0 pages HighMem/MovableOnly". But Western Digital works, and:
Code:
$ free
              total        used        free      shared  buff/cache   available
Mem:        7919980      307948      212768         276     7399264     7509164
Swap:      16383996      103812    16280184


This is an 8G system, and though it's running mythbackend, there just isn't that much memory in actual use. Most of the usage is in buffers and cache, which are readily reclaimed for normal use. I also see the per-zone memory statistics in the trace above, but am not sure how to interpret them. Is this indicating that I have a memory leak somewhere, and I need to (shudder) apply the "Windows Fix" and reboot? This system has only been up for 44 days. I suspect the correct information is here, but it's beyond my experience to interpret it and devise a fix. Someone else had the problem here, but no response yet: https://bugzilla.redhat.com/show_bug.cgi?id=1293155
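
For reading the per-zone lines in the trace ("DMA:", "DMA32:", "Normal:"), a small sketch may help. Each "N*SkB" entry is a count of free contiguous blocks of that size, and the failure line reports order:7, which would be a 512kB block if pages are 4kB (an assumption here, though it is the usual x86-64 value). The function name free_blocks_at_least is made up for illustration; it reads one such line on stdin and counts free blocks at least as large as the given order.

```shell
# Count free blocks of at least the given order in one per-zone line of a
# page allocation failure report, e.g.
#   "Normal: 54311*4kB (UME) ... 0*512kB ... = 804652kB"
# Assumes 4 kB pages. The line is read from stdin; $1 is the order.
free_blocks_at_least() {
    min_kb=$(( (1 << $1) * 4 ))
    # One token per line, then keep only tokens shaped like COUNT*SIZEkB.
    tr ' ' '\n' | awk -F'[*k]' -v min="$min_kb" '
        /^[0-9]+[*][0-9]+kB$/ { if ($2 + 0 >= min) total += $1 }
        END { print total + 0 }'
}
```

Fed the "Normal:" or "DMA32:" line from the trace with order 7, this prints 0: plenty of memory is free, but none of it in pieces of 512kB or more. The "DMA:" line prints 5, but that zone is only about 16MB and is presumably held back for devices that need it.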

This didn't help, either, from: http://serverfault.com/questions/236170/page-allocation-failure-am-i-running-out-of-memory
Code:
#change value for this boot
sysctl -w vm.min_free_kbytes=65536

#change value for subsequent boots
echo "vm.min_free_kbytes=65536" >> /etc/sysctl.conf


Pardon the somewhat scatterbrained post - as I've been writing it I've been turning more stones - to no avail.
_________________
.sigs waste space and bandwidth
Hu
Moderator


Joined: 06 Mar 2007
Posts: 13836

Posted: Sat Jun 25, 2016 5:13 pm    Post subject:

Total memory use may be relatively low, but the error message says it wants an order 7 allocation. As I understand it, that is a fairly large contiguous allocation (2**7 pages = 128 pages). If available memory is fragmented, you might be unable to satisfy an order 7 allocation even if your total free memory is much greater than the number of pages this driver needs. The failure may not apply to the Western Digital drive if the kernel decides not to use an order 7 allocation when servicing it. This could be because the Western Digital has different DMA properties.
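
To put numbers on this, here is a rough sketch, assuming 4096-byte pages (the x86-64 default, not something stated in the log). The two function names are invented for illustration. The first converts an order to bytes; the second reads input in the format of /proc/buddyinfo, where the columns after the zone name are the free-block counts for order 0, 1, 2 and so on, and reports how many free blocks of at least the requested order each zone has.

```shell
# Size in bytes of one allocation of the given order.
# The 4096-byte page size is an assumption (x86-64 default).
order_bytes() {
    echo $(( (1 << $1) * 4096 ))
}

# For each zone in /proc/buddyinfo-format input on stdin, print the zone
# name and the number of free blocks of at least the given order.
# Field 5 is the order-0 count, field 6 the order-1 count, and so on.
blocks_at_least_order() {
    awk -v order="$1" '
        $1 == "Node" && $3 == "zone" {
            total = 0
            for (i = 5 + order; i <= NF; i++) total += $i
            print $4, total
        }'
}
```

On a live system this could be run as "order_bytes 7" (which prints 524288, i.e. 512 KiB) and "blocks_at_least_order 7 < /proc/buddyinfo". A zone that reports 0 cannot satisfy an order 7 request without reclaim or compaction, however much total memory is free.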
depontius
Advocate


Joined: 05 May 2004
Posts: 3383

Posted: Sat Jun 25, 2016 5:24 pm    Post subject:

Interesting the way search works.

On my first searches, I only put in the "page allocation failure" and got some stuff related to vm settings, which didn't work. Then I noticed that most of the hits had a value of "order=" with some number, so I added that to my search terms. I basically got the same hits as before. Then I put quotes around the whole thing, and it led me to this: http://www.spinics.net/lists/linux-scsi/msg94814.html

Well really I'm giving you the thread at its conclusion. Amusing thing is, I checked my server kernel(s) and found that the suggested fix is indeed in 4.4.8-hardened-r1, but was not in 4.4.2-hardened. Ironically I had already built 4.4.8-r1, but hadn't rebooted yet, so I was still running 4.4.2. Rebooting to my new kernel appears to have fixed it. The drive is now plugged in and recognized. I'm going into my desired three-drive backup rotation, and we'll see how things go.

In the meantime, I guess I should put some sort of response on my Seagate ticket. I'm not at all impressed with the attention they give Linux, and that will probably be in there. If I were purchasing a drive again, knowing what I know now, it would be Western Digital instead of Seagate, even though things are now working.
Hu
Moderator


Joined: 06 Mar 2007
Posts: 13836

Posted: Sat Jun 25, 2016 5:33 pm    Post subject:

I cannot comment on the quality of their attention to Linux, but I would caution that most vendors seem to employ front-line CSRs trained to seem useless. It is possible that, if the vendors were reversed and the Western Digital drive had been the troublesome one, you would have had just as little help from Western Digital and would be just as unhappy with them. As I understand the linked thread, this was a kernel problem in which the USB Attached SCSI (uas) driver tried to perform far too large an allocation. It may be that the Western Digital drive worked solely because it did not count as UAS, and took some other code path. If so, this failure is neither an indictment of Seagate nor an endorsement of Western Digital.
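
Whether a given drive is handled by uas or by the older usb-storage driver can be checked from the "lsusb -t" tree, which lists a Driver= field per interface. A minimal sketch, with the caveat that the function name storage_drivers is invented here and the matching assumes the usual usbutils layout ("Class=Mass Storage, Driver=..."), which may differ between versions:

```shell
# Print the kernel driver bound to each USB mass-storage interface.
# Reads `lsusb -t` style output on stdin; the line layout is assumed to be
# "... Class=Mass Storage, Driver=NAME, SPEED".
storage_drivers() {
    grep 'Class=Mass Storage' | sed -n 's/.*Driver=\([^,]*\).*/\1/p'
}
```

Usage would be "lsusb -t | storage_drivers" with both drives plugged in. If the Seagate shows uas and the Western Digital shows usb-storage, that would fit the theory that the two drives simply took different code paths.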
depontius
Advocate


Joined: 05 May 2004
Posts: 3383

Posted: Sat Jun 25, 2016 5:49 pm    Post subject:

I understand that. On the other hand, the WD drives have been trouble-free under Linux.
Had the situation been reversed, you're probably correct in your estimation.

Then again, knowing what hardware tends to be trouble-free under Linux is sometimes almost as important as knowing who supports Linux. In practical terms, I want my system to work, not send me to Google.
All times are GMT