|    comp.protocols.tcp-ip    |    TCP and IP network protocols.    |    14,669 messages    |
|    Message 12,816 of 14,669    |
|    Barry Margolin to skendric@fhcrc.org    |
|    Re: interpreting "input ICMP message fai    |
|    23 Apr 09 20:40:58    |
From: barmar@alum.mit.edu

In article
<79031e5b-db3f-4134-8c4b-3677180e2868@v35g2000pro.googlegroups.com>,
 skendric@fhcrc.org wrote:

> I have a monitoring application which emits pings. Most of the time,
> those ICMP Echos leave the box, arrive at their destination, and come
> back as ICMP Replies -- this is good. However, intermittently, those
> pings don't even leave the box. [I know this because I have a sniffer
> positioned just outside the box, mirroring traffic via the switch.]
> Naturally, my application then thinks that *all* its monitored devices
> have gone down, whereupon it becomes agitated, emits pages, and so
> forth. I'm adding debugging code to the application to try to
> understand where the failure is. Thus far, checking the return code
> on my call to send(), things seem fine, i.e. the OS claims that my
> send() call completed. OK, so the ICMP Echo gets dropped somewhere
> after my application has handed it off to the kernel.
>
> Looking at the output of "netstat -s -w":
>
> gnat> netstat -s -w
> [...]
> Icmp:
>     3510705 ICMP messages received
>     34835 input ICMP message failed.
>     ICMP input histogram:
>         destination unreachable: 111508
>         timeout in transit: 26581
>         echo requests: 84611
>         echo replies: 3288005
>     7155838 ICMP messages sent
>     0 ICMP messages failed
>     ICMP output histogram:
>         destination unreachable: 64167
>         echo request: 7007060
>         echo replies: 84611
>
> (1) How to interpret the 'input ICMP message failed' counter?
>
> Does this mean ... that the OS was asked to *transmit* an ICMP message
> but was unable to (due to resource constraints perhaps) and that the
> OS threw away this message? Does it mean that the OS *received* an
> ICMP message but was unable to process it for some reason (full buffer
> perhaps) and tossed it?

"Input" means received messages, not transmitted messages. My guess
would be that it's ICMP messages with a type or code that the stack
doesn't know how to process.

> Checking the output of "netstat -i"; I don't see any sign that the NIC
> is dropping frames.
>
> gnat> netstat -i
> Kernel Interface table
> Iface     MTU     RX-OK RX-ERR RX-DRP RX-OVR     TX-OK TX-ERR TX-DRP TX-OVR
> bond0    1500 177785514      0      0      0  59109219      0      0      0
> bond0:1  1500 - no statistics available -
> eth0     1500 162487138      0      0      0  59109219      0      0      0
> eth1     1500  15298376      0      0      0         0      0      0      0
> eth2     1500   1791025      0      0      0    652286      0      0      0
> eth3     1500   1362124      0      0      0        54      0      0      0
> lo      16436  18460803      0      0      0  18460803      0      0      0
> gnat>
>
> (2) How reliably do NIC drivers update the counters which 'netstat -i'
> is querying? How confident can I be that, in fact, the NIC is *not*
> dropping frames?

Things break. Have you seen this on more than one machine? Have you
tried replacing the NIC?

> (3) And finally, any recommendations for books on this topic? I've
> skimmed through "Advanced Unix Programming" by Rochkind, "Unix Network
> Programming, Volume 1: The Sockets Network API", and "TCP/IP
> Illustrated Volume 2: The Implementation", without success thus far.
> [helpful in other ways, but not in how to interpret 'netstat' output]

I don't think you'll find any programming book that explains how to
troubleshoot network problems like this.
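On question (1), one way to test a guess about the counter is to watch which counters move during a failure window. What follows is a minimal sketch in Python, not a definitive tool. It assumes a Linux host where "netstat -s" takes its ICMP numbers from /proc/net/snmp, and it assumes that the "input ICMP message failed" line corresponds to the InErrors field there (worth checking against your net-tools version). MIB-II describes icmpInErrors as received ICMP messages with ICMP-specific errors such as bad checksums or bad length. The function names are illustrative only.

```python
from typing import Dict


def parse_snmp_section(text: str, section: str = "Icmp") -> Dict[str, int]:
    """Parse one section of /proc/net/snmp-style text into {counter: value}.

    The format pairs a header line ("Icmp: InMsgs InErrors ...") with a
    value line ("Icmp: 3510705 34835 ...") carrying the same prefix.
    """
    prefix = section + ":"
    # Keep only lines whose first token is exactly the section prefix,
    # so "IcmpMsg:" is not mistaken for "Icmp:".
    rows = [line.split()[1:] for line in text.splitlines()
            if line.split()[:1] == [prefix]]
    if len(rows) < 2:
        raise ValueError(f"section {section!r} not found")
    names, values = rows[0], rows[1]
    return {name: int(value) for name, value in zip(names, values)}


def counter_deltas(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return only the counters that changed between two snapshots."""
    return {name: after[name] - before[name]
            for name in after
            if name in before and after[name] != before[name]}


def read_icmp_counters(path: str = "/proc/net/snmp") -> Dict[str, int]:
    """Snapshot the Icmp counters (path is the assumed Linux location)."""
    with open(path) as handle:
        return parse_snmp_section(handle.read())
```

To use it, call read_icmp_counters() before and after a period in which the pings go missing, then pass both snapshots to counter_deltas(). If InErrors does not move while the echoes fail to leave the box, that counter is probably unrelated to the outbound problem.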
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE don't copy me on replies, I'll read them in the group ***