Gentoo Forums
memtest fixed bad RAM?!
Cyker
Veteran
Joined: 15 Jun 2006
Posts: 1746

Posted: Fri Nov 13, 2015 9:38 am    Post subject: memtest fixed bad RAM?!

Summary: Bad RAM suddenly 'healed' after long memtest; Explanations?!


Okay, so a friend was complaining their laptop kept crashing; they'd be doing something, then weird lines would appear on the screen, and then it would crash.
Classic bad RAM symptoms on systems that use integrated graphics.

So to be sure, ran memtest and small-violin! Indeed there were many errors.

I think I had a phone call or something then as for some reason I went away and forgot about it.

2 days later I remember to check it - It's been running for near enough 2 days at this point and despite racking up 90,000 errors and many fails, I noticed with interest that it had 7 passes. WTF?

So I power-cycle the machine and run memtest again - No errors!!!

Waaaaaaaat?!

Has this happened to anyone before? I had already done the usual stuff of cleaning the contacts with switch cleaner, reseating the modules etc.
The only thing that currently springs to mind is that there were some tin whiskers or something on the BGA creating a teeny tiny short, which got fried by the RAM being hammered for 2 days straight.

I am now not sure whether to bother replacing the RAM or not...!

I've left it off to cool down for a while now just to see if it starts failing again, but this is pretty weird; Never seen this happen before!
frostschutz
Advocate
Joined: 22 Feb 2005
Posts: 2970
Location: Germany

Posted: Fri Nov 13, 2015 10:09 am    Post subject: Re: memtest fixed bad RAM?!

Cyker wrote:
Summary: Bad RAM suddenly 'healed' after long memtest; Explanations?!


Maybe it was infested with nargles?

Quote:
I had already done the usual stuff of cleaning the contacts with switch cleaner, reseating the modules etc.


Maybe some cleaner residue left which evaporated thanks to the RAM modules heating up? :D

I would not trust it, and I would keep a very close watch on it. Memory errors are strange. I have had memtest run for days without error, then show errors immediately after a reboot. The RAM was without question bad, and booting into Linux would cause a system freeze within 24 hours regardless.

As long as the error is in a very specific region, you can work around it and keep using the module, losing a few KB/MB of RAM depending on the size of the damage. If it's random or large-area, you have no choice but to get a new memory module and hope it wasn't the mainboard / CPU.
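
For what it's worth, one way to fence off a small bad region on Linux is the kernel's memmap= boot parameter, which marks an address range as reserved so it is never handed out. Here is a minimal sketch of turning failing addresses into such a parameter. The addresses, the 4 KiB page size, the one-page safety margin and the helper name memmap_param are all assumptions made up for illustration, not values from this laptop.

```python
PAGE = 4096  # assumed page size in bytes


def memmap_param(bad_addresses, margin_pages=1):
    """Build a 'memmap=SIZE$START' string covering every failing address.

    The range is rounded out to whole pages and padded by margin_pages
    on each side (the padding is an arbitrary safety choice).
    """
    if not bad_addresses:
        raise ValueError("no failing addresses given")
    first_page = min(bad_addresses) // PAGE
    last_page = max(bad_addresses) // PAGE
    start = max(first_page - margin_pages, 0) * PAGE
    end = (last_page + 1 + margin_pages) * PAGE
    return f"memmap={end - start:#x}${start:#x}"


# Hypothetical failing addresses, as a memtest run might report them.
print(memmap_param([0x3A7F1234, 0x3A7F1F00, 0x3A7F2008]))
```

The `$` normally needs escaping in the bootloader config, and this only makes sense if the failing addresses stay put between runs.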
Syl20
Guru
Joined: 04 Aug 2005
Posts: 564
Location: France

Posted: Fri Nov 13, 2015 11:48 am    Post subject: Re: memtest fixed bad RAM?!

Cyker wrote:
Summary: Bad RAM suddenly 'healed' after long memtest; Explanations?!

Maybe a power weakness? The 1 or 0 state of each memory cell is just a voltage difference, after all. If the battery or the AC adapter doesn't supply enough power to the motherboard, that might cause such errors.
Was the battery charged when you got the laptop?

However, getting new memory modules is encouraged. Just in case... :wink:
Cyker
Veteran
Joined: 15 Jun 2006
Posts: 1746

Posted: Fri Nov 13, 2015 3:01 pm

Yeah, at this point it'll be up to my friend to decide whether they want to pay for the new sodimms...

Would have been easier if it stayed broken so I could show them, but now I have no proof!!!


The switch cleaner shouldn't have left any residue (I learned the hard way that carb cleaner is NOT a substitute for electrical switch/contact cleaner when cleaning a car MAF sensor for this reason!)

Hadn't considered the weak power supply angle... the battery in this laptop appears to be dead so it could be sapping the volts down I suppose...

Personally I think it's the nargles... >_>
krinn
Watchman
Joined: 02 May 2003
Posts: 7071

Posted: Fri Nov 13, 2015 3:45 pm

Heat as well: overheating RAM appears dead; when it's cooler (out of the critical heat range), the RAM appears fine.

And when you say laptop, you should just always think of it as a "computer with heat trouble", because that's nearly always the case.
Many times the line between "too hot" and "hot but still OK" for a laptop can be crossed by raising it a little, allowing it to get fresh airflow from below (and you may think "but I didn't change anything", when changing its position may be enough).
Goverp
l33t
Joined: 07 Mar 2007
Posts: 668

Posted: Sat Nov 14, 2015 9:18 am

I wonder if there were dirty connexions - my first action on electronic problems is to reseat all the connectors and chips, to scrape them clean of oxide, cruft and chocolate.
_________________
Greybeard
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43195
Location: 56N 3W

Posted: Sat Nov 14, 2015 11:17 am

Cyker,

Something drying out due to two days of running?
Electrolytic capacitors reforming/healing due to being left powered up?
Mechanical stresses causing issues that are only present during thermal transitions?

Remove the DIMMs, wipe the contacts with an ink eraser (it's mildly abrasive), and put them back.
Let everything cool down and run memtest again.

When memtest finds errors, it indicates an issue with memory accesses. It does not mean that the RAM is faulty.
If the errors occur at the same address and in the same test every pass, RAM is the likely problem.
Random errors indicate the issue is elsewhere.
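
A rough sketch of that rule of thumb, assuming you have noted each error down as a (pass, test, address) tuple. The record format, the 0.8 threshold and the function name classify are my own assumptions for illustration, not something memtest outputs.

```python
from collections import Counter


def classify(errors, threshold=0.8):
    """Classify error records given as (pass_no, test_no, address) tuples.

    Returns 'no errors', 'repeatable' (same test and address keep failing
    across passes, so the RAM is the likely suspect) or 'random' (errors
    move around, so look elsewhere).
    """
    if not errors:
        return "no errors"
    unique = set(errors)  # ignore duplicate records
    # For each (test, address) pair, count the passes in which it failed.
    passes_per_location = Counter((test, addr) for _, test, addr in unique)
    # Records belonging to locations that failed in at least two passes.
    recurring = sum(n for n in passes_per_location.values() if n >= 2)
    share = recurring / len(unique)
    return "repeatable" if share >= threshold else "random"


# Hypothetical logs: one stuck location versus scattered failures.
same_spot = [(1, 5, 0x1000), (2, 5, 0x1000), (3, 5, 0x1000)]
scattered = [(1, 5, 0x1000), (1, 7, 0x88000), (2, 3, 0x4000), (3, 8, 0x123450)]
print(classify(same_spot), classify(scattered))
```

The threshold is arbitrary; the point is only to separate "same place every pass" from "all over the place".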

As the machine has graphics in shared memory, you can determine if the issue is reads, writes or refreshes. I'm not sure that it helps.
Have a static display image. If the image is generally correct but pixels or lines twinkle, you are seeing read errors.
The memory content for the image is correct but it's being misread from time to time.
If the image decays over time, the RAM is forgetting .. that's a refresh issue.
If it's written incorrectly, the screen display will always be incorrect, but you probably won't get booted far enough to see it.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
Cyker
Veteran
Joined: 15 Jun 2006
Posts: 1746

Posted: Sun Nov 15, 2015 7:46 pm

Wow, thanks for all the replies!

The RAM hasn't missed a beat since it mysteriously healed itself; Tried it from cold, tried it hot, tried it leaned up at various angles, but we're replacing it anyway and also going from 6GB to 8GB so it's all good :)

Lots of plausible theories here on the cause, although alas we'll never know what caused it for sure!

I don't think it was dirty connections as I cleaned them and anyway, dirty connections don't usually get better on their own, which is what was so perplexing about this!

Hadn't considered the liquid one, but that would fit the symptoms if the heat from the memtesting evaporated it!
Ant P.
Watchman
Joined: 18 Apr 2009
Posts: 5761

Posted: Sun Nov 15, 2015 8:33 pm

All computers try to pretend they're these abstract, neat digital circuits, but in the end they're really analog electronics with tons of state, crosstalk, reality and other nasty things happening all the time. So with that in mind, there's no point overthinking things like this.
All times are GMT