Knowing vs Guessing: Diagnosing Network Speed Problems

So every time network speeds go to crap, people start posting here, bemoaning the state of the universe and all.

At that point folks like me explain what the likely problem is and suggest that, instead of blaming their ISP, their provider, or the Illuminati or something, they take a rational diagnostic approach and figure out for themselves where the problem is.

It is like some kind of déjà vu all over again... almost, not quite, as bad as the almost daily posts asking for the bestest cheapest seedbox plan.

So again I take up the mantle of explaining the wonders of MTR.

MTR is a Linux tool, kind of a combo of ping and traceroute. MTR (My traceroute) sends out ICMP packets asking each network stop to ping back. This way the tool can tell you the latency (the time the round trip took) to each turning point on a network route from origin to destination. With this tool you are generally looking for two things that can tell you if there are issues: jitter and packet loss.

Jitter is an indication of congestion, measured as variation in latency. The more the variation, the more likely there is congestion, and you can see it in two ways: in mtr's StDev column (standard deviation), and as multiple routes showing up on a single hop. Think of it as sorta the rubbernecking around a freeway accident. Because of the accident, things slow down at that particular point in the freeway. Some cars are able to pass quickly by changing lanes and ignoring things, others are slowed to a crawl, and that speed variation indicates a problem.
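To make "variation in latency" concrete, here is a rough Python sketch of the kind of number a standard deviation column summarizes. The round-trip times are made up for illustration (they are not real measurements), and jitter_ms is just a helper name picked for this example.

```python
import statistics

def jitter_ms(rtt_samples_ms):
    """Return the standard deviation of a list of round-trip times (ms)."""
    return statistics.stdev(rtt_samples_ms)

# Hypothetical round-trip times (ms) to one hop, for illustration only.
steady_hop = [20.1, 20.4, 19.8, 20.2, 20.0, 20.3]
congested_hop = [20.1, 95.4, 22.8, 140.2, 21.0, 61.3]

# The congested hop has some fast replies and some very slow ones,
# so its standard deviation is far larger than the steady hop's.
print(f"steady hop:    {jitter_ms(steady_hop):.1f} ms")
print(f"congested hop: {jitter_ms(congested_hop):.1f} ms")
```

The average alone can hide this: it is the spread between the fast cars and the crawling ones that gives the congestion away.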

Packet loss is a death in the family: a packet is never heard from again. This isn't supposed to happen. A perfect network has 100% packet delivery, so if you see packet loss it means something is broken, a pipe has burst and water is spraying out everywhere (or just slowly leaking).
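The loss percentage itself is simple arithmetic: the share of probes that never came back. A minimal sketch, with hypothetical probe counts and a helper name (loss_percent) chosen just for this example:

```python
def loss_percent(sent, received):
    """Percentage of probes that were sent but never answered."""
    if sent <= 0:
        raise ValueError("sent must be a positive number of probes")
    return 100.0 * (sent - received) / sent

# Hypothetical counts: 1000 probes sent, 977 replies received.
print(f"{loss_percent(1000, 977):.1f}% loss")
```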

The general rule of thumb is packet loss gets fixed, and jitter just gets worse.

So how, where and when do you run mtr?

First, there is a common confusion. When communicating over a network there is not just one route. It isn't like freeway traffic, where the way you get there is also the way you get back. There are at least two routes: the outbound route that your ISP has sent you on to get to your destination, and the return route, the one that was determined by the guy wanting to talk to you.

That means that the data you send to a website is on a different path than the data the website is sending to you.

Though it isn't always the case, network speed problems are generally on the path where the bulk of the data is traveling. In the case of FTP, for example, that would be from your seedbox to home. So the best place to start is from your seedbox to home.

The how is amazingly simple. First, from home, google "What is my IP Address" and Google will tell you your address.

Next, log in to your seedbox with an SSH session, using PuTTY, Xshell, VNC, X2Go, whatever you use to get a command line prompt. I use PuTTY.

Then from that shell prompt type mtr followed by the home IP address Google just gave you (that is, mtr <your-home-ip>):

This will fill your screen with a display that looks like:

This will refresh once a second until you stop it. You generally want to let it run 5-15 minutes to get a complete picture.
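If you would rather not babysit the screen, mtr also has a non-interactive report mode that runs a fixed number of cycles and then prints a summary. The sketch below wraps it in Python. The --report, --report-wide and --report-cycles options are standard mtr flags, but the function names are made up for this example and 203.0.113.7 is a placeholder address, not a real home IP. It assumes mtr is installed on the seedbox and is using its default one-second interval.

```python
import subprocess

def build_mtr_command(target_ip, minutes=5):
    """Build an mtr report-mode command that runs for roughly `minutes`."""
    # With mtr's default interval of one second, one cycle is about one second.
    cycles = minutes * 60
    return ["mtr", "--report", "--report-wide",
            "--report-cycles", str(cycles), target_ip]

def run_report(target_ip, minutes=5):
    """Run mtr and return its text report (blocks until mtr finishes)."""
    result = subprocess.run(build_mtr_command(target_ip, minutes),
                            capture_output=True, text=True, check=True)
    return result.stdout

# Show the command that would be run against a placeholder address.
print(" ".join(build_mtr_command("203.0.113.7", minutes=5)))
```

Calling run_report with your own home IP on the seedbox gives you the same per-hop loss and latency columns as the live display, just as a one-shot summary you can paste into a ticket.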

In the example display, Cogent is the backbone and Comcast is the ISP. There is both jitter and packet loss. You can see the jitter on lines 9 and 10: 64.9 is significant (but not horrible) jitter, and indicates that the interchange of traffic between Cogent and Comcast is congested. This is a known problem: Comcast is known to run public interchange points hot, over 50% full. Folks suggest that they do this to save money and to encourage private peering, which is more lucrative to Comcast. I don't think that is the case; I think they do it simply because they're shitheads.

The larger problem is the packet loss, indicating bad and broken network plumbing. 2.3% is not insignificant and is enough to significantly slow down your FTP traffic. It causes retransmits and smaller and smaller send windows (the amount of data TCP will put in flight before waiting for an acknowledgement), meaning more and more waiting for less and less data.
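To get a feel for how hard loss hits a single TCP transfer, one common back-of-the-envelope estimate is the Mathis approximation, which caps throughput at roughly (MSS / RTT) * (C / sqrt(loss)) with C about 1.22. This is a rough model for one classic TCP flow, not a prediction for any particular connection. The 80 ms round-trip time and 1460-byte segment size below are assumptions for illustration, not values from the example.

```python
import math

def mathis_ceiling_mbps(loss_rate, rtt_ms, mss_bytes=1460):
    """Approximate per-flow TCP throughput ceiling in Mbit/s (Mathis model)."""
    c = math.sqrt(3.0 / 2.0)          # constant from the model, about 1.22
    rtt_s = rtt_ms / 1000.0
    bits_per_second = (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))
    return bits_per_second / 1e6

# Assumed 80 ms round trip; compare a little loss with the 2.3% case.
for loss in (0.001, 0.01, 0.023):
    print(f"{loss * 100:.1f}% loss -> about "
          f"{mathis_ceiling_mbps(loss, rtt_ms=80):.1f} Mbit/s per flow")
```

Under those assumptions 2.3% loss holds a single flow to roughly 1 Mbit/s, no matter how fat the pipe at either end is.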

Now if you run an mtr and it looks perfect, without issue, then the problem can be your service provider, your hard disk, or your home machine. That is the time to get a ticket open with your provider so they can take a harder look.