Jan. 15, 2011, 8:13 p.m.
posted by effect
What to do first when you just can't get your wireless connection working.
As ethereal as wireless networks seem to be, they work surprisingly well. Once you are within range of a properly configured wireless network, there is usually very little work required on the part of the end user. Typically, you simply open your laptop and it all "just works."
Except, of course, when it doesn't. If you are having trouble getting online, it's time to practice your troubleshooting skills. Here are some simple steps that should help you to quickly pinpoint the source of the trouble.
Is your wireless card installed and turned on? Many laptops have the ability to disable the wireless card, either through software or a physical switch. Is your card plugged in (all the way!), is it turned on, and does it have all of the proper drivers installed? This is the troubleshooting equivalent of "is it plugged in," but is certainly worth checking first.
Are you in range of an AP? When in doubt, always check your signal meter. Do you have enough signal strength to talk to the AP? You could simply be out of range. If your client software shows noise levels, check them as well to be sure that you have a high signal-to-noise ratio. It is always possible that a neighbor has just started microwaving a burrito, or maybe they just answered their 2.4 GHz phone.
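As a rough illustration of the signal-to-noise check: when a client reports signal and noise in dBm, the ratio in dB is simply the difference between the two. The sketch below assumes that reporting format, and the 20 dB cutoff is an arbitrary illustrative value, not a figure from any standard or from this article.

```python
def snr_db(signal_dbm: float, noise_dbm: float) -> float:
    """Signal-to-noise ratio in dB; both inputs are in dBm, so it is a plain difference."""
    return signal_dbm - noise_dbm


# Assumed cutoff for illustration only; tune it for your own hardware.
MIN_USABLE_SNR_DB = 20.0


def link_looks_usable(signal_dbm: float, noise_dbm: float,
                      min_snr_db: float = MIN_USABLE_SNR_DB) -> bool:
    """True when the SNR is at or above the chosen cutoff."""
    return snr_db(signal_dbm, noise_dbm) >= min_snr_db


if __name__ == "__main__":
    # Hypothetical readings: a strong signal, then a weak one, over the same noise floor.
    print(snr_db(-60, -92), link_looks_usable(-60, -92))
    print(snr_db(-85, -92), link_looks_usable(-85, -92))
```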
Are you associated with the proper network? This step sounds silly, but is becoming more important to check every day. For example, I live in an apartment building in a busy part of Seattle. I once sat down at my laptop and tried to log into my local file server, only to find that it was unreachable. I thought I was having connectivity problems, but there was plenty of signal strength, and I could get out to web pages with no problem. I plugged in the console to my file server and everything seemed fine. Puzzled, I decided to try to ping my laptop from the file server. It was then that I realized that my laptop was using a strange IP address, not even in the same network that I use for my home. How could it possibly be able to get to web pages if it is using the wrong network numbers?
Then it dawned on me that earlier that day, I had set my laptop to use the network with the strongest signal, regardless of the ESSID. It turns out that a neighbor had just installed an access point, and I was associated with it instead! Since my neighbor and I are both running open networks, my laptop dutifully associated and started using my neighbor's DSL line. Of course my home machines were unreachable; they are all on a private network behind my router.
So the moral of the story is: be sure that you know which network you are talking to.
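One quick way to catch this kind of mistake is to compare the address your laptop was given against the range you expect at home. The following is a minimal sketch using Python's standard ipaddress module; the 10.15.6.0/24 home range and the sample addresses are hypothetical.

```python
import ipaddress


def on_expected_network(address: str, expected_cidr: str) -> bool:
    """Return True if `address` falls inside the network given in CIDR form."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(expected_cidr)


if __name__ == "__main__":
    home = "10.15.6.0/24"  # hypothetical home network
    for addr in ("10.15.6.23", "192.168.0.5"):
        status = "home network" if on_expected_network(addr, home) else "somebody else's network?"
        print(addr, "->", status)
```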
Do you have an IP address? If your IP is listed as 0.0.0.0, is missing, or starts with 169.254 (a self-assigned link-local address), then you have no usable IP address. That means that you don't have a DHCP lease. Be absolutely sure that your WEP settings are correct, and if you are using MAC address filtering, be sure that the MAC address of your wireless card matches the list in your AP.
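The "do I really have an address?" test can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and the "starts with 169" case is interpreted here as the link-local range 169.254.0.0/16, which is what the standard ipaddress module reports as link-local.

```python
import ipaddress
from typing import Optional


def has_usable_address(address: Optional[str]) -> bool:
    """Return False for a missing, unparsable, all-zero, or link-local address."""
    if not address:
        return False  # nothing assigned at all
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False  # not a valid IP address string
    # 0.0.0.0 is "unspecified"; 169.254.x.x is self-assigned link-local.
    return not (ip.is_unspecified or ip.is_link_local)


if __name__ == "__main__":
    for candidate in ("0.0.0.0", None, "169.254.10.20", "10.15.6.23"):
        print(candidate, "->", has_usable_address(candidate))
```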
If all of your wireless settings are correct and you have plenty of signal, then for some reason you simply haven't received a DHCP lease. This can happen for a variety of reasons. Is your card configured to request DHCP? Is the DHCP server up and running on your network? If you are serving a large number of clients, have you run out of DHCP leases? If in doubt, this level of troubleshooting is best done with the help of your network administrator. If you are the network admin, try using a traffic sniffer like tcpdump [Hack #37] or Ethereal [Hack #38]. Can you see traffic from the AP? What happens when your machine requests a DHCP lease? A good sniffer can help find the source of the problem very quickly.
If you absolutely need to get onto a network that isn't offering DHCP leases, and you have access to a sniffer, you can always "camp." This is not recommended except in the most dire of emergencies, but it wouldn't be a hack if I didn't tell you how, would it? Using a sniffer on a busy network, you can quickly discern the network layout, including the IP range being used and the likely default gateway. Pick an IP address in that range and assign it statically. Then define your default router, and you should be all set. This is highly discouraged, as it is difficult to tell whether you are "sharing" an IP address with another machine on the network, which could cause problems for both of you. Also, any self-respecting network admin who figures out what you are doing will likely bring their wrath down upon your head. But if they are so self-respecting, be sure to inquire why their DHCP server didn't work in the first place.
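To make the "pick an address in that range" step concrete, here is a small sketch that lists host addresses in a network that did not show up in the sniffed traffic. The function name, the network, and the sample addresses are all hypothetical. As the text warns, an address that was not observed is not proven to be free, so treat the result as a guess at best.

```python
import ipaddress
from typing import Iterable, List


def candidate_addresses(cidr: str, seen: Iterable[str], limit: int = 5) -> List[str]:
    """List up to `limit` host addresses in `cidr` that were not seen in sniffed traffic."""
    network = ipaddress.ip_network(cidr)
    taken = {ipaddress.ip_address(a) for a in seen}
    free: List[str] = []
    for host in network.hosts():  # skips the network and broadcast addresses
        if host not in taken:
            free.append(str(host))
            if len(free) == limit:
                break
    return free


if __name__ == "__main__":
    # Hypothetical observations from a sniffer session.
    observed = ["10.15.6.1", "10.15.6.2", "10.15.6.4"]
    print(candidate_addresses("10.15.6.0/29", observed, limit=2))
```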
Before continuing on to the remaining troubleshooting steps, it is a good idea to disable any encrypted tunnels or proxies that you might be running, in order to establish basic connectivity without too many variables getting in the way.
Can you ping the default gateway? Try to ping the gateway IP address listed in your network settings. An unreachable gateway by itself doesn't necessarily indicate a problem, as not all routers respond to ping requests. However, if you can't reach the gateway, and you can't get out to the rest of the Internet, then make sure that the gateway is up. If you aren't the network admin, then you had better find her.
Can you ping an IP on the Internet? This step is important and frequently overlooked. Try to ping any popular IP on the Internet. For example, 184.108.40.206 is listed as the IP address of www.google.com. You should memorize one IP address that is guaranteed to be up and try to ping it. A successful ping here establishes basic connectivity to the rest of the Internet. If you have been successful in all tests so far, but can't ping an Internet IP, then traffic isn't getting beyond your default gateway or it isn't coming back. See Using traceroute.
Can you ping www.google.com? This is an important but separate step from the previous step. If you can ping an Internet host by its IP address, but attempts to ping www.google.com take a long time and eventually fail, then it is very likely that your routing is fine, but DNS name resolution isn't working. Check the DNS servers that your DHCP server handed to you.
If you have no DNS servers listed, then your DHCP server didn't hand one out. Assign one manually, or better yet, fix your DHCP server. If you have DNS servers listed but you can't resolve DNS names, then something is probably wrong with your DNS configuration. At this point, contact your network admin, as here there be dragons.
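Name resolution can also be tested on its own, separately from ping. The sketch below wraps Python's standard socket.gethostbyname call; the wrapper name and the injectable resolver parameter are illustrative choices made here so that the function can be exercised without a live network.

```python
import socket
from typing import Callable, Optional


def resolve(name: str,
            resolver: Callable[[str], str] = socket.gethostbyname) -> Optional[str]:
    """Return the IPv4 address for `name`, or None if the lookup fails."""
    try:
        return resolver(name)
    except OSError:  # socket.gaierror and friends are OSError subclasses
        return None


if __name__ == "__main__":
    address = resolve("www.google.com")
    if address is None:
        print("DNS lookup failed; check the DNS servers you were handed")
    else:
        print("www.google.com resolves to", address)
```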
Can you browse to www.google.com? This is, of course, an obvious test, which may have been what started you troubleshooting in the first place. If this test works, then you have some sort of connectivity to the Internet. But it is always worth trying as part of your regular routine, particularly if you are using a public access point. Many public networks run a captive portal (such as NoCatAuth [Hack #89]) that prohibits most network connectivity before logging into a web page. This can be very confusing, particularly if you are able to perform many of these tests but can't establish an SSH connection or check your email, for example. Always try to browse to a popular web page if you are having trouble connecting to a public network, because you might find yourself redirected to instructions on how to gain further network access.
This can also turn up unexpected problems with intervening transparent proxies, such as squid. It may also indicate that there is a manual proxy that you are supposed to use for Internet traffic. If you can ping and resolve hostnames, but you can't browse to web sites, check with your network admin to see if there is a proxy server somewhere on the network and that it is functioning properly.
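The ping and browse steps above can be read as a small decision procedure. The following sketch encodes that reading; the function and its wording are an interpretation of the checklist, not a tool from the original hack, and the four inputs are simply the outcomes of the tests you ran by hand.

```python
def diagnose(gateway_ok: bool, internet_ip_ok: bool, dns_ok: bool, web_ok: bool) -> str:
    """Map the results of the four manual tests to the most likely culprit."""
    if not internet_ip_ok:
        if not gateway_ok:
            # A silent gateway alone is not conclusive, but combined with
            # no Internet access it is the first thing to check.
            return "gateway: make sure the default gateway is up"
        return "routing: traffic is not getting past the gateway or not coming back; try traceroute"
    if not dns_ok:
        return "dns: routing works but name resolution fails; check your DNS servers"
    if not web_ok:
        return "web: look for a captive portal or a required proxy"
    return "ok: basic connectivity looks fine"


if __name__ == "__main__":
    # Example: pings to IP addresses work, but names do not resolve.
    print(diagnose(gateway_ok=True, internet_ip_ok=True, dns_ok=False, web_ok=False))
```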
One very handy tool for finding the source of network problems is traceroute. While not completely infallible, it can help to pinpoint exactly where communications are breaking down. It is best used when you can reach some machines (for example, your default gateway) but not others.
traceroute attempts to contact every machine along the route between your local computer and the ultimate destination, and reports how long it takes to contact each one. Not all routers will allow traceroute traffic to pass through them, but it is worth trying if you are having network troubles.
Under Linux, BSD, or OS X, run:
traceroute -n 18.104.22.168
In any version of Windows, from the command shell, run (note that tracert uses -d rather than -n to skip name lookups):
tracert -d 22.214.171.124
Of course, you can use any Internet IP address you like. You should see something like the following:
traceroute to 126.96.36.199 (188.8.131.52), 30 hops max, 40 byte packets
 1 10.15.6.1 4.802 ms 4.411 ms 4.886 ms
 2 184.108.40.206 11.341 ms 11.202 ms 10.797 ms
 3 220.127.116.11 14.212 ms 25.894 ms 11.811 ms
 4 18.104.22.168 14.362 ms 13.564 ms 23.587 ms
 5 22.214.171.124 13.046 ms 13.244 ms 13.595 ms
 6 126.96.36.199 147.823 ms 16.747 ms 17.827 ms
 7 188.8.131.52 19.723 ms 156.864 ms 23.545 ms
 8 184.108.40.206 22.393 ms 32.006 ms 18.52 ms
 9 220.127.116.11 14.93 ms 115.795 ms 34.949 ms
10 18.104.22.168 35.249 ms 139.869 ms 32.841 ms
11 22.214.171.124 38.268 ms 148.991 ms 33.852 ms
12 126.96.36.199 33.457 ms 49.736 ms 90.575 ms
13 188.8.131.52 34.17 ms 32.661 ms 32.978 ms
14 184.108.40.206 53.416 ms 67.974 ms 35.621 ms
15 220.127.116.11 41.108 ms 60.794 ms 92.63 ms
16 18.104.22.168 40.331 ms 54.544 ms 144.794 ms
17 22.214.171.124 49.154 ms 36.918 ms 124.526 ms
This shows the IP address of each intervening hop on the way to the destination, and the approximate amount of time it took to reach each hop. It makes three attempts to contact each hop, and reports the timing of each to help establish something of an average. If you are absolutely sure that name resolution is working properly, you can omit the -n switch, which will cause traceroute to look up the name of each hop along the way as well.
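If you want the average that the three timings hint at, it is easy to compute from a line of traceroute -n output. This is a minimal sketch that assumes the "hop address time ms time ms time ms" layout shown above; other traceroute implementations may format their output differently, and the function names are made up for the example.

```python
import re
from statistics import mean
from typing import List, Optional, Tuple

# One hop line: hop number, address, then one or more "<number> ms" timings.
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)((?:\s+\d+(?:\.\d+)?\s+ms)+)\s*$")
TIME_RE = re.compile(r"(\d+(?:\.\d+)?)\s+ms")


def parse_hop(line: str) -> Optional[Tuple[int, str, List[float]]]:
    """Return (hop, address, timings in ms), or None if the line is not a timed hop."""
    match = HOP_RE.match(line)
    if match is None:
        return None  # header line, or a hop that answered with stars
    hop, address, rest = match.groups()
    return int(hop), address, [float(t) for t in TIME_RE.findall(rest)]


def average_ms(line: str) -> Optional[float]:
    """Average round-trip time for one hop line, or None if it has no timings."""
    parsed = parse_hop(line)
    return mean(parsed[2]) if parsed else None


if __name__ == "__main__":
    print(parse_hop(" 1 10.15.6.1 4.802 ms 4.411 ms 4.886 ms"))
    print(average_ms(" 1 10.15.6.1 4.802 ms 4.411 ms 4.886 ms"))
```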
Connectivity problems are indicated by the presence of stars in the IP address field. For example, here is a traceroute to a nonexistent IP address:
$ traceroute -n 192.168.1.1
traceroute to 192.168.1.1 (192.168.1.1), 30 hops max, 40 byte packets
 1 10.15.6.1 4.795 ms 4.586 ms 4.3 ms
 2 126.96.36.199 169.344 ms 17.067 ms 15.115 ms
 3 188.8.131.52 15.13 ms 24.71 ms 16.03 ms
 4 * * *
^C
The router at the third hop likely realized that 192.168.1.1 is a non-routable IP address, and is silently discarding packets with that destination. If there are connectivity problems between any two points along a traceroute, you will see stars in the IP field, or very large response times at each hop. Generally, anything in excess of a couple of hundred milliseconds means that you should probably try your call again later.
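Spotting trouble in a long traceroute by eye gets tedious, so here is a rough sketch that flags hops showing stars or slow replies. It assumes the traceroute -n output layout shown above, and it takes the "couple of hundred milliseconds" rule of thumb as a 200 ms default, which is an approximation you may want to adjust.

```python
import re
from typing import Iterable, List, Tuple

TIME_RE = re.compile(r"(\d+(?:\.\d+)?)\s+ms")


def suspicious_hops(lines: Iterable[str],
                    threshold_ms: float = 200.0) -> List[Tuple[int, str]]:
    """Return (hop number, reason) for hops with stars or a reply slower than the threshold."""
    flagged: List[Tuple[int, str]] = []
    for line in lines:
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip the header, the shell prompt, and blank lines
        hop = int(fields[0])
        times = [float(t) for t in TIME_RE.findall(line)]
        if "*" in fields[1:]:
            flagged.append((hop, "no reply"))
        elif times and max(times) > threshold_ms:
            flagged.append((hop, "slow"))
    return flagged


if __name__ == "__main__":
    sample = [
        "traceroute to 192.168.1.1 (192.168.1.1), 30 hops max, 40 byte packets",
        " 1 10.15.6.1 4.795 ms 4.586 ms 4.3 ms",
        " 4 * * *",
    ]
    print(suspicious_hops(sample))
```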
In my experience, it is hardly ever necessary to go through this entire checklist to establish connectivity. But when it is necessary, it is important to remember to go slow, eliminate variables as you go, and try to fix things so that the same problem doesn't happen again. For really tough connectivity issues that these steps won't solve, you'll need more powerful measures. A powerful protocol analyzer like tcpdump or Ethereal can quickly bring the trickiest of network problems to light.