I Can't Spool, Take Two
I was in a shop that bought a new UNIX host, moved some print queues over, and noticed during testing that certain print jobs would just sit there for long periods of time before printing. We noticed that this only happened with wide area-connected printers, as well as printers that were connected to dedicated print servers. We further noticed that only one printer at a time was printing on any given standalone print server. Our host vendor claimed that it had to be our network, whereas our print server vendor said it had to be the host. After sending print jobs to multiple queues from the host to one of our standalone print servers, I went to the host and typed the following:

netstat -an | grep 515

I only saw one socket being opened to the print server. That was enough evidence for me: apparently, the print services on this (proprietary) version of UNIX didn't support more than one printer on a given network host, and it was only sending one job at a time. This was why the wide area stuff was acting funny, too: there was more than one printer on the other side of the WAN link, and instead of multiple print jobs going out at once, one big print job could block all the other, smaller print jobs. Network traces bore this out. The vendor claimed that this aberrant behavior was "as designed," and declined to fix it. So we went and bought something that worked right.
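The netstat check from this story can be scripted when there are many print servers to watch. What follows is only a rough sketch in Python, not anything the host vendor supplied: the function name, the sample addresses, and the assumption that netstat prints Linux/BSD-style columns (remote address second to last, state last) are all mine.

```python
from collections import Counter

LPD_PORT = 515  # the port grepped for in the story (the lpd print service)

def connections_to_port(netstat_text, port=LPD_PORT):
    """Count ESTABLISHED TCP connections per remote host for one remote port.

    Assumes Linux/BSD-style `netstat -an` columns: the remote address is the
    second-to-last field and the state is the last field.
    """
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[-1] != "ESTABLISHED":
            continue
        remote = fields[-2]
        # Linux writes host:port, BSD writes host.port; try ':' first.
        host, sep, remote_port = remote.rpartition(":")
        if not sep:
            host, sep, remote_port = remote.rpartition(".")
        if remote_port == str(port):
            counts[host] += 1
    return counts

# Made-up sample output; every address here is hypothetical.
SAMPLE_NETSTAT = """\
tcp        0      0 10.1.1.5:1021      10.2.2.9:515      ESTABLISHED
tcp        0      0 10.1.1.5:1022      10.3.3.7:515      ESTABLISHED
tcp        0      0 10.1.1.5:23        10.1.1.77:40312   ESTABLISHED
tcp        0      0 0.0.0.0:515        0.0.0.0:*         LISTEN
"""

if __name__ == "__main__":
    for host, count in connections_to_port(SAMPLE_NETSTAT).items():
        print(host, count)
```

To run it against a live host, the text could come from `subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout`. If you have sent jobs to three queues on one print server and the count for that server stays at 1, you are looking at the same symptom described above.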
Content Checker
For certain services, such as HTTP, you can actually check content. For instance, Listing 18.1 shows a troubleshooting session with a Web server.
Listing 18.1 A Troubleshooting Session with a Web Server
mori A ~$ telnet 167.195.160.6 80
Trying 167.195.160.6
Connected to 167.195.160.6.
Escape character is '^]'.
GET /
<title>Neato Geeky Stuff(tm)</title>
<B>Neato Geeky Stuff(tm)</B><P>
<img src=jonny/smguru.gif>
<B>Leo sez:</B>
<p>
Check it out. <I>Lotsa</I> neato geeky stuff.
Whoa, it's the whole HTML page. This definitely tells you more than a ping; it tells you that your Web server is up and serving HTML. In other words, who cares if your Web server is responding to pings? You don't have it there to respond to pings, you have it there to serve Web pages. If it's responding to pings but not serving Web pages, it is for all intents and purposes down. By checking the content like this, you know that it's functioning properly.
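If you want to repeat this check on a schedule instead of typing it into Telnet, a small script can do the same thing. This is a minimal sketch, assuming Python is available; the function names are my own, and it sends an HTTP/1.0 request with a Host header rather than the bare GET / typed in the listing, because many servers today won't answer the bare form.

```python
import socket

def fetch_raw(host, port=80, path="/", timeout=5.0):
    """Open a TCP connection, send a GET, and return the raw reply bytes."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection, so the reply is complete
                break
            chunks.append(data)
    return b"".join(chunks)

def content_ok(host, marker, port=80, path="/", timeout=5.0):
    """True only if the server answers and the reply contains the marker text."""
    try:
        reply = fetch_raw(host, port, path, timeout)
    except OSError:  # refused, unreachable, or timed out
        return False
    return marker.encode("utf-8") in reply
```

For the server in Listing 18.1, the call would look like `content_ok("167.195.160.6", "Neato Geeky Stuff")`. Pick a marker that only appears when the page is really being served, so that an error page from the same server doesn't count as a pass.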
Mail Fail
If you can telnet to a socket but are still having application problems, it's time to point the finger at the app. I saw a proprietary mail system that was failing at a remote site: the users were connecting but then getting hung up for long periods of time while they were trying to access their mail. We put in a mail server to serve them locally (and get the wide area out of the picture), and the users were then fine. However, when the new mail server tried to talk to the main server, we got lots of disconnected sockets, that is, lots of sockets in the TIME_WAIT state. By "lots," I mean 30 to 50. This indicated that there were many successful connections, followed by disconnections. Odd. Usually, when a connection is established, it will sit there and do its work merrily; disconnects are usually caused by network problems, not the application. However, I could stay connected to a socket using Telnet as long as I liked. This really, really pointed to the app. A search of the vendor's Web site on the socket state revealed that, with certain router configurations, this problem would occur. The vendor recommended fixing the router but also provided a patch and a workaround applicable to the server and client software, which ended up fixing the problem.
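Counting socket states by eye gets old once there are dozens of them. Here is a hedged sketch of a tally in Python; the function name is made up, the sample lines and port 2500 are hypothetical (the story never names the mail system's port), and it assumes Linux-style netstat output in which TCP lines start with "tcp" and end with the state.

```python
from collections import Counter

def tally_states(netstat_text):
    """Tally TCP socket states from `netstat -an` style text.

    Assumes each TCP line starts with "tcp" and ends with the state name.
    """
    states = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[-1]] += 1
    return states

# Hypothetical sample: three torn-down connections, one live, one listener.
SAMPLE = """\
tcp   0   0 10.1.1.20:2500   10.9.9.1:3411   TIME_WAIT
tcp   0   0 10.1.1.20:2500   10.9.9.1:3412   TIME_WAIT
tcp   0   0 10.1.1.20:2500   10.9.9.1:3413   TIME_WAIT
tcp   0   0 10.1.1.20:2500   10.9.9.1:3414   ESTABLISHED
tcp   0   0 0.0.0.0:2500     0.0.0.0:*       LISTEN
"""

if __name__ == "__main__":
    for state, count in tally_states(SAMPLE).most_common():
        print(state, count)
```

A TIME_WAIT count that stays well above the ESTABLISHED count between two servers that ought to hold a steady conversation is the pattern that raised the flag in this story.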
Client/server and file and print networking are the bread and butter of most networked shops. Although file and print networking is, underneath it all, a huge agglomeration of client/server protocols, it's complex enough to warrant being treated differently. File and print services rely on their clients heavily, so you'll want to keep everybody on the same sheet of music.
Printing is one of the most important and aggravating functions on your network. Understanding the print process can help you to quickly pinpoint where a problem is. Printer-oriented documentation helps tremendously, particularly when it's slapped on a label on the physical printer. Knowing how to trace a print problem is really helpful for complex problems.
File errors aren't always accurate: someone who opens a "read-only" file might be opening a file that's only read-only to him or her because of security attributes on that file. Knowing how to navigate your particular server's security is really important here. You can rule out security-related problems by trying the same operation after assigning super-user rights to a user, and then quickly removing them.
Disk space problems can bring your entire operation to a halt. Apart from finding the culprit, you can alleviate the problem by keeping some spare disk space on the side.
Socket-level troubleshooting is the keystone of client/server application troubleshooting after you've ruled out network-level problems. The netstat -a command is one of your best friends here. Knowing your service numbers, as well as socket states, can help in diagnosing a problem. In particular, performing the Telnet trick can quickly point out whether a service is running on a given host, and it can sometimes show you whether it's serving up content properly.
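The first half of the Telnet trick, finding out whether anything is listening on a service's port, is easy to script. The sketch below is one possible way to do it in Python; the function name and the little table of service numbers are my own choices (the port numbers themselves are the standard well-known ones).

```python
import socket

# A few well-known TCP service numbers.
SERVICE_PORTS = {"telnet": 23, "smtp": 25, "http": 80, "printer": 515}

def service_listening(host, port, timeout=3.0):
    """True if a TCP connection to host:port is accepted within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False
```

A call such as `service_listening("167.195.160.6", SERVICE_PORTS["http"])` answers the same question as watching for Telnet's "Connected to" line. It says nothing about whether the service behind the port is healthy, which is why a content check is still worth doing where the protocol allows it.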