View Full Version : Intermittent network card failure


Quickstep
2005-07-24, 01:37 PM CDT
I have a small home network setup, consisting of 1 Win2k machine and 2 Linux machines (Fedora Core 3 & Suse 9.1). Lately, I'm having a really strange problem with my Fedora box and I don't even know where to begin troubleshooting it.

The problem: Networking on this box keeps suddenly becoming unavailable, with no apparent cause. It will be working one minute, and then it will stop responding to anything I try to do over the network. It has happened while I was sitting at the Fedora machine browsing the web or a Windows share, while I was at my Windows machine logged in to the Fedora machine remotely via SSH, and sometimes it is simply non-operational when I go to use it. I have checked my physical network connections many times. It's something with the box, not the cable or connection.

I have found a way to get it working again. Originally, I would just reboot the machine and it worked fine. BUT, this isn't Windows... I shouldn't have to reboot all the time. So, I wrote a small shell script which unmounts all of my SMB and NFS shares, restarts the network (/etc/init.d/network restart), and then remounts all remote shares. This seems to work fine, but I don't want to have to keep doing this. I want it to work!
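For anyone following along, a workaround script of the kind described above might look roughly like this. It is only a sketch, not the actual script: it assumes the remote shares are listed in /etc/fstab with the types smbfs and nfs, and the helper names run and restart_network are made up for illustration. DRY_RUN defaults to 1 so that nothing is changed unless asked.

```shell
#!/bin/sh
# Sketch of an "unmount shares, restart network, remount" workaround.
# Assumptions: shares are in /etc/fstab as smbfs/nfs; run as root with
# DRY_RUN=0 to actually execute. By default commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

restart_network() {
    # Unmount every mounted SMB and NFS filesystem first
    run umount -a -t smbfs,nfs
    # Restart networking through the init script
    run /etc/init.d/network restart
    # Remount the smbfs/nfs entries from /etc/fstab
    run mount -a -t smbfs,nfs
}

restart_network
```

Running it as-is only prints the three commands; running it as root with DRY_RUN=0 performs them.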

I've been using Linux for about 4 years now and totally love it. As I encounter problems like these, I will usually try to troubleshoot them myself, and then do some good ole Google searching. 95% of the time, I find my solution rather quickly. With this one, I don't even know where to begin.

Is there a log I can view that might give me some hints as to the cause of this problem?

Are there any known recent updates that might adversely affect my network configuration?
(I manually do a yum update a few times a week so that I can SEE what is being updated)

Again, this is a recent problem. It was working PERFECTLY, up until the past few weeks (just in case that might give anyone a hint as to the cause).

Let me know if you need me to post any logs or check anything on the machine. I GREATLY appreciate anyone's help with this.

Thanks!

bvgsy
2005-07-24, 02:41 PM CDT
If you haven't made any changes to your network configurations that would
affect your connectivity, it could be that your NIC is about to die.

Run dmesg | grep -i eth and see how many times the link went up and down.
If it happens too often, it's time for a new NIC.

HTH!

bvgsy
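To put a number on "how many times it went up and down", the dmesg output can be counted rather than eyeballed. The sketch below is an assumption-laden illustration: the function name count_link_events is made up, and the exact wording of link messages differs between drivers, so the patterns ("link up", "link is up" and the matching "down" forms on lines mentioning ethN) may need adjusting for a given card.

```shell
#!/bin/sh
# Sketch: count link up/down messages for ethN interfaces in kernel log text.
# Assumption: the driver logs something like "eth0: link up" or
# "eth0: ... Link is Down"; other drivers may word it differently.
count_link_events() {
    # Reads kernel log text on stdin; prints "up=<n> down=<n>"
    awk '
        { line = tolower($0) }
        line ~ /eth[0-9]+/ && line ~ /link( is)? up/   { up++ }
        line ~ /eth[0-9]+/ && line ~ /link( is)? down/ { down++ }
        END { printf "up=%d down=%d\n", up, down }
    '
}

# Usage (not run here):
#   dmesg | count_link_events
```

A handful of events right after boot is normal; what "too often" means is a judgment call, but counts that keep growing while the cable is untouched would point at the card or the port.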

crackers
2005-07-24, 09:25 PM CDT
Simpler than checking dmesg is to run /sbin/ifconfig and look at the RX/TX packets and the "errors" number. If it's really large or even a significant percentage of the RX or TX packets, it's a good indication that you may have a hardware issue.

Our work file-server had this exact same issue and, after doing much the same as you, the admin did as above and ended up replacing the NIC. Problem solved.
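The "errors as a percentage of packets" check can also be scripted. This sketch reads /proc/net/dev, which exposes per-interface packet and error counters, instead of parsing ifconfig output, since the ifconfig layout varies between versions. The function name nic_error_rates and the interface name eth0 are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: error rate per direction for one interface, from /proc/net/dev text.
# Column layout after the interface name (per the file's own header):
#   RX: bytes packets errs drop fifo frame compressed multicast
#   TX: bytes packets errs drop fifo colls carrier compressed
nic_error_rates() {
    # $1 = interface name; reads /proc/net/dev-format text on stdin
    awk -v ifc="$1" '
        { sub(/:/, " ") }   # some kernels print "eth0:1234" with no space
        $1 == ifc {
            rx_pkts = $3;  rx_errs = $4
            tx_pkts = $11; tx_errs = $12
            rx_pct = (rx_pkts > 0) ? 100 * rx_errs / rx_pkts : 0
            tx_pct = (tx_pkts > 0) ? 100 * tx_errs / tx_pkts : 0
            printf "%s rx_err_pct=%.2f tx_err_pct=%.2f\n", ifc, rx_pct, tx_pct
        }
    '
}

# Usage (not run here):
#   nic_error_rates eth0 < /proc/net/dev
```

A result of 0.00 in both directions matches an ifconfig report of zero errors, in which case the counters alone do not implicate the hardware.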

Quickstep
2005-07-25, 08:15 AM CDT
crackers wrote: "Simpler than checking dmesg is to run /sbin/ifconfig and look at the RX/TX packets and the "errors" number. If it's really large or even a significant percentage of the RX or TX packets, it's a good indication that you may have a hardware issue."

0 errors, 0 packets dropped

I had considered, as suggested, just throwing a new NIC in there, but I'd like to troubleshoot it a bit more, just in case it's something else. If it comes down to it, I know I've got a bunch of NICs lying around.