EOL (End Of Life) Versions: a forum to discuss problems and workarounds for versions of Fedora that have passed End of Life.

4th December 2006, 09:54 PM
Registered User
Join Date: Oct 2006
Posts: 8

Random Reboots - FC5
Hello, I have a dedicated server at a data center that has rebooted at random several times with no explanation.
The server has dual AMD Athlon MP 2200+ CPUs, 2 GB of RAM, and a Tyan motherboard. I installed the 2.6.18-1.2200.fc5smp kernel package to add SMP support to FC5.
The server worked fine for about two weeks, then the random reboots started.
The server logs do not show anything unusual. It's as if the server were losing power momentarily, but the host says there have been no power issues in that rack.
The host's technician tested the RAM and the hard drive and found no issues.
Do you have any suggestions for tracking down the cause of the reboots?
Thank you,
Hector

5th December 2006, 12:52 AM
Registered User
Join Date: Jul 2005
Posts: 175

By "random" do you mean that these reboots are occurring at various hours of various days? What's the average time between reboots?
And you're sure no one has physical access to the server and is doing this manually (either by accident or on purpose)?

5th December 2006, 12:56 AM
Registered User
Join Date: Aug 2006
Posts: 356

How's your power supply? This could be a power supply problem.

5th December 2006, 01:00 AM
Registered User
Join Date: Jul 2005
Location: Wine Country, California
Posts: 2,862

It could be the motherboard, the CPU, or even a CPU fan, a heat sink, or built-up dust bunnies. Do you have lm_sensors installed?
__________________
Mark N.
Perpetual Newbie
--
I wanted to proclaim myself "The Typo King" but there's way too much competion. :p
411874 Get Counted

5th December 2006, 05:34 AM
Registered User
Join Date: Oct 2006
Posts: 8

Thank you for your suggestions.
There does not appear to be any pattern to the reboots:
Nov 24 19:40:58 sh1 syslogd 1.4.1: restart.
Nov 28 16:12:47 sh1 syslogd 1.4.1: restart.
Nov 29 00:47:45 sh1 syslogd 1.4.1: restart.
Dec 4 00:09:44 sh1 syslogd 1.4.1: restart.
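To put a number on the time between reboots, the four syslogd restart lines above can be fed to a short script. This is only a sketch and not part of the original post: it assumes the year is 2006 (syslog lines omit the year) and an English locale for the month abbreviations, and the helper names `restart_times` and `gaps_in_hours` are made up for illustration.

```python
from datetime import datetime

# The four restart lines quoted above.
LOG_LINES = [
    "Nov 24 19:40:58 sh1 syslogd 1.4.1: restart.",
    "Nov 28 16:12:47 sh1 syslogd 1.4.1: restart.",
    "Nov 29 00:47:45 sh1 syslogd 1.4.1: restart.",
    "Dec 4 00:09:44 sh1 syslogd 1.4.1: restart.",
]


def restart_times(lines, year=2006):
    """Return a datetime for each syslogd restart line (year is an assumption)."""
    times = []
    for line in lines:
        if "syslogd" in line and "restart" in line:
            month, day, clock = line.split()[:3]
            stamp = f"{year} {month} {day} {clock}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def gaps_in_hours(times):
    """Return the gap between consecutive restarts, in hours."""
    return [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]


times = restart_times(LOG_LINES)
gaps = gaps_in_hours(times)
print("gaps (h):", ", ".join(f"{g:.1f}" for g in gaps))
print(f"mean gap (h): {sum(gaps) / len(gaps):.1f}")
```

For these four lines the gaps work out to roughly 92.5, 8.6, and 119.4 hours, about 73.5 hours on average, which fits the impression that there is no regular schedule.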
The server is leased from the hosting company, and only the hosting company has physical access to it. I'm willing to rule out foul play, since it is in their best interest to keep a paying customer.
I didn't have lm_sensors installed, so I installed it.
From information gathered from dmesg and a Google search (dual AMD Athlon, AMD 760MP, TYAN PAULANER motherboard), I was able to identify the motherboard as a Tyan Tiger MPX (S2466). According to Tyan, it has a Winbond W83782D ASIC.
sensors-detect found a Winbond W83782D on the SMBus and Winbond W83627HF Super I/O sensors on the ISA bus. Since the motherboard datasheet specifies a W83782D, I selected SMBus in the sensors-detect configuration.
I suspect something is wrong with the lm_sensors configuration, since this is what I get from 'service lm_sensors status':
Code:
[root@sh1 /etc]# service lm_sensors status
w83627hf-isa-0c00
Adapter: ISA adapter
VCore 1: +1.62 V (min = +1.57 V, max = +1.73 V)
VCore 2: +1.63 V (min = +1.57 V, max = +1.73 V)
+3.3V: +3.28 V (min = +3.14 V, max = +3.47 V)
+5V: +5.00 V (min = +4.76 V, max = +5.24 V)
+12V: +9.30 V (min = +10.82 V, max = +13.19 V) ALARM
-12V: -12.11 V (min = -13.18 V, max = -10.80 V)
-5V: +0.23 V (min = -5.25 V, max = -4.75 V) ALARM
V5SB: +5.43 V (min = +4.76 V, max = +5.24 V) ALARM
VBat: +0.00 V (min = +2.40 V, max = +3.60 V) ALARM
fan1: 4591 RPM (min = -1 RPM, div = 2) ALARM
fan2: 4856 RPM (min = 6887 RPM, div = 2) ALARM
fan3: 0 RPM (min = 4687 RPM, div = 2) ALARM
temp1: +80°C (high = +64°C, hyst = -111°C) sensor = thermistor ALARM
temp2: +80.0°C (high = +80°C, hyst = +75°C) sensor = thermistor ALARM
temp3: +80.0°C (high = +80°C, hyst = +75°C) sensor = thermistor ALARM
vid: +1.650 V (VRM Version 9.0)
alarms: Chassis intrusion detection ALARM
beep_enable:
Sound alarm enabled
w83627hf-i2c-0-2c
Adapter: SMBus AMD768 adapter at 80e0
VCore 1: +4.08 V (min = +4.08 V, max = +4.08 V)
VCore 2: +4.08 V (min = +4.08 V, max = +4.08 V)
+3.3V: +4.08 V (min = +4.08 V, max = +4.08 V)
+5V: +6.85 V (min = +6.85 V, max = +6.85 V)
+12V: +15.50 V (min = +15.50 V, max = +15.50 V) (beep)
-12V: +6.06 V (min = +6.06 V, max = +6.06 V) (beep)
-5V: +5.10 V (min = +5.10 V, max = +5.10 V) (beep)
V5SB: +6.85 V (min = +6.85 V, max = +6.85 V) (beep)
VBat: +4.08 V (min = +4.08 V, max = +4.08 V) (beep)
fan1: 0 RPM (min = 0 RPM, div = 128)
fan2: 0 RPM (min = 0 RPM, div = 128)
fan3: 0 RPM (min = 0 RPM, div = 128) (beep)
temp1: -1°C (high = -1°C, hyst = -1°C) sensor = thermistor
temp2: +81.0°C (high = +80°C, hyst = +75°C) sensor = thermistor
temp3: +80.5°C (high = +80°C, hyst = +75°C) sensor = thermistor (beep)
vid: +0.000 V (VRM Version 9.0)
alarms:
beep_enable:
Sound alarm enabled
The voltages and temperatures appear to be way out of range. I suspect something is wrong with my lm_sensors config.
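One way to double-check which voltage readings fall outside their configured limits, independently of the ALARM flags, is to parse the lines and compare each value against its min and max. The following is a rough sketch that assumes the `label: value V (min = ..., max = ...)` layout shown above; the regular expression and the `out_of_range` helper are illustrative and not part of lm_sensors.

```python
import re

# Matches voltage lines such as "+12V: +9.30 V (min = +10.82 V, max = +13.19 V)".
NUM = r"[+-]?\d+(?:\.\d+)?"
VOLTAGE_LINE = re.compile(
    rf"^(?P<label>[^:]+):\s*(?P<value>{NUM}) V\s*"
    rf"\(min = (?P<min>{NUM}) V, max = (?P<max>{NUM}) V\)"
)

# A few lines copied from the first sensors listing above.
SAMPLE = [
    "VCore 1: +1.62 V (min = +1.57 V, max = +1.73 V)",
    "+12V: +9.30 V (min = +10.82 V, max = +13.19 V) ALARM",
    "-5V: +0.23 V (min = -5.25 V, max = -4.75 V) ALARM",
    "V5SB: +5.43 V (min = +4.76 V, max = +5.24 V) ALARM",
    "fan1: 4591 RPM (min = -1 RPM, div = 2) ALARM",
]


def out_of_range(lines):
    """Return (label, value, min, max) for voltage lines outside their limits."""
    flagged = []
    for line in lines:
        m = VOLTAGE_LINE.match(line.strip())
        if m is None:
            continue  # fan, temperature, and other non-voltage lines are skipped
        value, low, high = float(m["value"]), float(m["min"]), float(m["max"])
        if not low <= value <= high:
            flagged.append((m["label"].strip(), value, low, high))
    return flagged


for label, value, low, high in out_of_range(SAMPLE):
    print(f"{label}: {value} V is outside [{low}, {high}]")
```

A reading flagged this way can mean either a real hardware problem or, as suspected here, that the wrong chip or scaling is configured, so the script only narrows down where to look.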
Hector

5th December 2006, 05:43 AM
Registered User
Join Date: Jul 2005
Location: Wine Country, California
Posts: 2,862

It's been a while since I did it myself, but you can look through the config file (I don't recall which one offhand; maybe someone else will post) and tweak the settings (on some chipsets). But if these temps really are that high, I believe they could affect the voltages, or if the voltages are high, that could affect the temps. And since it's rebooting randomly, there's a good chance these values are correct (IMHO).
EDIT: If this machine is out of your physical control, it's out of your hands, except for informing the owners of the "risk", since fixing it is in their best interest.....
Last edited by u-noneinc-s; 5th December 2006 at 05:45 AM.
Reason: added content

5th December 2006, 06:13 AM
Registered User
Join Date: Oct 2006
Posts: 8

I found an lm_sensors config file for my motherboard. I'll fix some of the errors in it and upload it to the lm_sensors wiki for future users.
Here are the correct values (and they are OK ... not great, but OK):
Code:
[root@sh1 /etc]# service lm_sensors status
w83627hf-isa-0c00
Adapter: ISA adapter
VCore1: +1.63 V (min = +1.57 V, max = +1.73 V)
VCore2: +1.63 V (min = +1.57 V, max = +1.73 V)
+3.3 V: +3.30 V (min = +3.14 V, max = +3.47 V)
+12 V: +11.75 V (min = +13.21 V, max = +10.83 V) ALARM
-12 V: -12.11 V (min = -13.18 V, max = -10.80 V)
CPU1 Fan: 4560 RPM (min = -1 RPM, div = 2) ALARM
CPU2 Fan: 4821 RPM (min = 6887 RPM, div = 2) ALARM
VRM1 Temp: +60°C (high = +64°C, hyst = -111°C) sensor = transistor ALARM
AGP Temp: +59.0°C (high = +80°C, hyst = +75°C) sensor = transistor
DDR Temp: +57.5°C (high = +80°C, hyst = +75°C) sensor = transistor
alarms: Chassis intrusion detection ALARM
beep_enable:
Sound alarm disabled
w83782d-i2c-0-2d
Adapter: SMBus AMD768 adapter at 80e0
AGP V: +3.26 V (min = +3.14 V, max = +3.46 V)
+5 V: +4.81 V (min = +4.73 V, max = +5.24 V)
DDR V: +1.22 V (min = +2.05 V, max = +0.16 V) ALARM
3 VSB: +3.30 V (min = +2.85 V, max = +3.15 V) ALARM
Bat V: +0.00 V (min = +2.64 V, max = +3.95 V) ALARM
chs1 Fan: 0 RPM (min = 84375 RPM, div = 2) ALARM
chs2 Fan: 0 RPM (min = 13500 RPM, div = 2) ALARM
chs3 Fan: 0 RPM (min = 42187 RPM, div = 2) ALARM
VRM2 Temp: +64°C (high = +9°C, hyst = -75°C) sensor = transistor ALARM
CPU1 Temp: +64.5°C (high = +80°C, hyst = +75°C) sensor = transistor
CPU2 Temp: +62.5°C (high = +80°C, hyst = +75°C) sensor = transistor
alarms: Chassis intrusion detection ALARM
beep_enable:
Sound alarm enabled
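Some of the limits in this listing have a min that is greater than the max (the +12 V and DDR V lines, for example), so an ALARM on those lines may reflect unset or misconfigured limits rather than a bad reading. The sketch below lists such lines. It assumes the same `(min = ... V, max = ... V)` layout as above, and `inverted_limits` is a made-up helper name, not an lm_sensors feature.

```python
import re

# Captures the label and the min/max limits of a voltage line.
NUM = r"[+-]?\d+(?:\.\d+)?"
LIMITS = re.compile(
    rf"^(?P<label>[^:]+):.*\(min = (?P<min>{NUM}) V, max = (?P<max>{NUM}) V\)"
)

# Voltage lines copied from the second sensors listing above.
SAMPLE = [
    "+12 V: +11.75 V (min = +13.21 V, max = +10.83 V) ALARM",
    "-12 V: -12.11 V (min = -13.18 V, max = -10.80 V)",
    "+5 V: +4.81 V (min = +4.73 V, max = +5.24 V)",
    "DDR V: +1.22 V (min = +2.05 V, max = +0.16 V) ALARM",
    "3 VSB: +3.30 V (min = +2.85 V, max = +3.15 V) ALARM",
]


def inverted_limits(lines):
    """Return (label, min, max) for lines whose min limit exceeds the max."""
    suspicious = []
    for line in lines:
        m = LIMITS.match(line.strip())
        if m is None:
            continue  # not a voltage line with min/max limits
        low, high = float(m["min"]), float(m["max"])
        if low > high:
            suspicious.append((m["label"].strip(), low, high))
    return suspicious


for label, low, high in inverted_limits(SAMPLE):
    print(f"{label}: min {low} V > max {high} V, limits look misconfigured")
```

Lines that are not reported here, such as 3 VSB, have consistent limits, so their alarms are worth a closer look.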
I'll set up MRTG to keep an eye on the temperatures ...
Other than that, do you have any other suggestions for things to look at?
Thank you all for your help!
Hector

5th December 2006, 06:38 AM
Registered User
Join Date: Jul 2005
Location: Wine Country, California
Posts: 2,862

There are still a lot of alarms, and it appears the chassis fans aren't turning.