alt.os.linux.slackware
Henrik Carlqvist to S.K.R. de Jong
Re: High system load on NFS snafu
06 Jun 22 17:57:15
   From: Henrik.Carlqvist@deadspam.com   
      
   On Mon, 06 Jun 2022 15:04:46 +0000, S.K.R. de Jong wrote:   
   > I have a Slackware64 15.0 system on which I had several directories   
   > mounted by NFS from a remote system. That remote system was actually   
   > rebooted a few times - for maintenance purposes - but I was stupid   
   > enough not to unmount those directories in my system.   
      
   Usually that is not needed when an NFS server is rebooted; once it is
   back up again everything is supposed to be fine.
      
   > In fact, I had at least one terminal emulator where I was in one of the   
   > NFS-mounted directories. I foolishly tried to list the contents of that   
   > directory, and the shell just froze up on me. I had to kill the   
   > terminal emulator.   
      
   Most likely, somehow, the NFS server has not come back as it should.   
      
   > The system load has shot up to at least 4.00 ever since, even when,   
   > according to top, nothing much is going on in the system. I mean,   
   > I have a few things running, but nothing to justify that load: all the   
   > cores are at least 95% idle at any given time.   
      
   Even if you killed the terminal, your ls process is probably still there
   in a "D" state (uninterruptible sleep, usually waiting for I/O), and your
   system load counts both the processes waiting for CPU and the processes
   in that uninterruptible state.
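
   As a rough way to check this, the sketch below counts processes by the
   first letter of their STAT value, so the number of "R" and "D" processes
   can be compared with the load average. The function names count_states
   and show_load are made up for this illustration, and show_load assumes
   a Linux system with procps ps and /proc/loadavg.

```shell
# Count processes by the first letter of their state.
# Expects one STAT value per line on stdin (for example from "ps -eo stat=").
count_states() {
    awk '{ s = substr($1, 1, 1); n[s]++ }
         END { printf "R=%d D=%d\n", n["R"], n["D"] }'
}

# Hypothetical helper: print the current R/D counts next to the load average.
# Assumes Linux (procps ps, /proc/loadavg); not called automatically.
show_load() {
    ps -eo stat= | count_states
    cat /proc/loadavg
}
```

   Running show_load on the affected machine should show a "D" count close
   to the unexplained part of the load, since the load average is a decaying
   average rather than an exact count.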
      
   > 	I was able to unmount those NFS directories - forcefully, on   
   > occasion - and I was able to stop the RPC and NFSD daemons. However, the   
   > high load issue did not disappear.   
      
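   To get rid of the high load you will need to kill the processes in "D"
   state. On old kernels this was only possible if you mounted the NFS
   directories with the "intr" option; since kernel 2.6.25 that option is
   ignored and only SIGKILL can interrupt a pending NFS operation.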
      
   Stopping the rpc and nfsd daemons on the NFS server will, from the NFS
   client's point of view, be just as bad as shutting down the NFS server
   completely. Any processes hung in "D" state will stay that way until the
   NFS service is restored. Instead of stopping the NFS service you should
   do something like "/etc/rc.d/rc.nfsd restart".
      
   > 	Anybody got any suggestions as to how to diagnose and solve this   
   > problem, without rebooting? top is not helping, and I see nothing   
   > relevant in dmesg, or any of the /var/log files.   
      
   In both dmesg and your log files you should see something like this:   
      
   nfs: server foo.example.com  not responding, still trying   
      
   As long as this is the latest message you see about that NFS server,
   processes will stay stuck in "D" state. Once the NFS server is rebooted
   and up again you should see:
      
   nfs: server foo.example.com  OK   
      
   and all your processes in "D" state should get back to normal again.   
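
   To see which of the two messages came last for each server, something
   like the following sketch could be used. It only assumes the two message
   formats quoted above; the function name nfs_server_status is invented
   for this example, and on some systems reading dmesg requires root.

```shell
# Report the most recent state logged for each NFS server.
# Reads kernel log lines on stdin; later lines override earlier ones.
nfs_server_status() {
    awk '
        # Find the word following "server" on the current line.
        function server_name(   i) {
            for (i = 1; i < NF; i++)
                if ($i == "server") return $(i + 1)
            return ""
        }
        /nfs: server .* not responding/ { state[server_name()] = "not responding" }
        /nfs: server .* OK[ \t]*$/      { state[server_name()] = "OK" }
        END { for (s in state) print s ": " state[s] }
    '
}
```

   Usage would be "dmesg | nfs_server_status", or the same with a log file
   fed in on stdin. A server still listed as "not responding" is the one
   holding processes in "D" state.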
      
   > More precisely, there are relevant entries, but they are all old and   
   > not being updated - but the high load stubbornly remains.   
      
   If you do:   
      
   ps aux | grep D   
      
   and look for processes with a "D" in the STAT column, those processes
   might explain your high load. There are other tools like lsof and fuser
   to find out which processes are in an NFS mounted directory (or any other
   directory), but you should focus on bringing that NFS server back instead
   of killing unfinished processes.
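
   A plain "grep D" also matches any line with a capital D anywhere in it,
   such as in a user name or command line. A slightly stricter sketch that
   tests only the STAT column is shown below. The function names
   filter_d_state and list_d_state are made up here, and the column layout
   assumes procps ps as shipped on Linux.

```shell
# Keep the header line plus rows whose STAT field (column 2) starts with "D".
# Expects input in the layout produced by "ps -eo pid,stat,wchan:32,args".
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Hypothetical wrapper: list uninterruptible processes on the live system.
# WCHAN shows the kernel function each process is sleeping in.
list_d_state() {
    ps -eo pid,stat,wchan:32,args | filter_d_state
}
```

   Calling list_d_state prints only the stuck processes, and the WCHAN
   column gives a hint about what they are waiting for.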
      
   regards Henrik   
      