On Thu, Nov 13, 2014 at 06:12:36PM -0500, Bill Davidsen wrote:
Bill Davidsen wrote:
>David A. De Graaf wrote:
>>On Fri, Oct 03, 2014 at 02:19:55PM -0400, David A. De Graaf wrote:
>>>On Fri, Oct 03, 2014 at 04:01:30AM +0930, Tim wrote:
>>>>Allegedly, on or about 02 October 2014, Chris Murphy sent:
>>>>>Cables are often the source of weird problems. Specifically it's the
>>>>>connectors that are flakey, not the cable portion itself.
>>>>
>>>>Though, if you savagely bend SATA leads, the way some of them are
>>>>supplied in a flattened up zig-zag style, with a rubber band around
>>>>them, you can mess up the data transmission.
>>>>
>>>
>>>Some quick feedback: It's now apparent that the cables or SATA
>>>sockets have nothing to do with my problem. The finger of guilt
>>>now seems to point to the RAM sticks. However, experiments are
>>>slow. More later.
>>
>>After weeks of experimentation it's clear that my machine crashes have
>>nothing to do with the SATA connections or the harddrives.
>>They are caused by a too-small swap space!
>>
>>Zero is OK; large is OK; but small is NG.
>>
>>For reasons I can't recall, the system is set up with only a 2 GB swap
>>partition, and for a long while it had a single 4 GB RAM memory stick.
>>This was OK.
>>
>>Then I added a second 4 GB memory stick, identical to the first.
>>With 8 GB RAM and 2 GB swap the system crashed - froze - after a
>>random few hours.
>>
>>This was maddening. Not knowing the real cause, I bought a different
>>motherboard, changed power supply, tried different SATA and ATA harddrive
>>connections, changed the SATA cable, removed the extra data drive,
>>removed the ATA CD drive, used one or the other RAM stick, disconnected
>>everything and ran with only a Live F20 Xfce USB stick. I ran memtest86
>>for days without error. The only thing that worked was to revert to
>>only a single memory stick - 4 GB. Either stick was OK.
>>
>>I put everything back together, using an ATA/SATA converter for the
>>350 GB primary disk, the SATA 1TB data harddrive, and the ATA CD.
>>
>>Then I noticed the size of the swap partition was 2 GB and, having
>>nothing else to try, added an 8 GB swap file.
>>
>>Eureka! It ran.
>>
>>I have a matrix of test cases which I won't bore you with.
>>They can be summarized as follows:
>>1 - with 4 GB RAM, either 0 or 2 GB swap space is OK.
>>2 - with 8 GB RAM, 0 swap space is OK.
>>3 - with 8 GB RAM, 2 GB swap space will reliably freeze the system
>>4 - with 8 GB RAM, 4 GB swap file is OK.
>>5 - with 8 GB RAM, 2 GB swap partition + 8 GB swap file is OK,
>> even if the priority of the smaller one is forced higher.
>>
>>At no time during these experiments was swap space actually used
>>according to the gkrellm display; the RAM usage remained well
>>below what was available.
>>
>>This is clearly a bug. No rational design would work like this.
>>Is it a kernel bug? Some other component?
>>Which one gets the Bugzilla?
>>
>
>It's probably too late to check now, but did you try taking the 2GB swap offline
>and running mkswap on it to check for a glitch somewhere? Yes, I know that's
>nominally a "can't happen" thing, but having had success with that, I mention
>it. My sample size (one) is pretty small.
>
And to reply to my own suggestion, my notes on that also say you may want to
change to deadline scheduler on the swap device.

Thanks, Bill Davidsen.

I am now using *only* an 8 GB swap file on the mostly unused 1 TB SATA
disk. Yesterday the machine froze again.
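
For anyone who wants to confirm which swap areas are really active on
their own machine: the kernel lists them in /proc/swaps, with Size and
Used given in KiB. Below is a minimal sketch of a helper that prints
them in MiB. The function name swap_summary and its optional file
argument are made up for illustration; they are not part of any
standard tool.

```shell
#!/bin/sh
# List active swap areas with their size and current usage in MiB.
# Reads /proc/swaps by default; another file may be passed instead
# (hypothetical convenience, handy for trying the function out).
swap_summary() {
    swaps_file="${1:-/proc/swaps}"
    # Line 1 is the header. Columns: Filename Type Size Used Priority,
    # with Size and Used in KiB. %d truncates to whole MiB.
    awk 'NR > 1 {
        printf "%s %s size=%dMiB used=%dMiB prio=%s\n", $1, $2, $3/1024, $4/1024, $5
    }' "$swaps_file"
}
```

Calling swap_summary with no argument reads the live /proc/swaps. If
only the swap file is active, it should print a single line of type
"file".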

Just for grins I will try your suggestion to use the deadline
scheduler. Googling shows that the way to do that is to

    echo SCHEDULER > /sys/block/DEVICE/queue/scheduler

where SCHEDULER is one of cfq, noop, or deadline and DEVICE the block
device (sda for example).

    [root@datwiz /sys/block/sdb/queue]
    # cat scheduler
    noop deadline [cfq]
    [root@datwiz /sys/block/sdb/queue]
    # echo deadline > scheduler
    [root@datwiz /sys/block/sdb/queue]
    # cat scheduler
    noop [deadline] cfq

As you can see, the original scheduler was 'cfq'; it is now 'deadline'.
Evidently, the type of scheduler applies to the entire /dev/sdb and
not to just the swapfile or the partition that it's in.
That should be OK.
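
The manual steps above could be wrapped in a small helper, sketched
here under a few assumptions. The function names active_scheduler and
set_scheduler, and the SYSFS_ROOT override, are hypothetical and exist
only for this example. The set of schedulers on offer differs between
kernels, so the sketch checks the list before writing. As far as I
know, a value written to sysfs this way is not remembered across a
reboot.

```shell
#!/bin/sh
# SYSFS_ROOT is an assumption added for illustration; it defaults to /sys.

# Print the active I/O scheduler for a block device. The kernel marks
# the active one with brackets, e.g. "noop deadline [cfq]".
active_scheduler() {
    dev="$1"
    sed -n 's/.*\[\(.*\)\].*/\1/p' "${SYSFS_ROOT:-/sys}/block/$dev/queue/scheduler"
}

# Select a scheduler for a whole block device, but only if the kernel
# lists it as available for that device.
set_scheduler() {
    dev="$1"
    want="$2"
    file="${SYSFS_ROOT:-/sys}/block/$dev/queue/scheduler"
    # Strip the brackets so the currently active entry also counts.
    if sed 's/[][]//g' "$file" | tr ' ' '\n' | grep -qxF "$want"; then
        echo "$want" > "$file"
    else
        echo "scheduler '$want' not offered for $dev" >&2
        return 1
    fi
}
```

Run as root, "set_scheduler sdb deadline" followed by
"active_scheduler sdb" would mirror the session shown above.
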
If there's an improvement, I'll report here.
Thanks, again.
--
David A. De Graaf DATIX, Inc. Hendersonville, NC
dad(a)datix.us
www.datix.us
"Physics is like sex...it may give some practical results, but that's
not why we do it." -- Richard Feynman