TCP DUMP ---- Part -II

Capture Files

In addition to its normal behavior of printing packet descriptions to the screen, tcpdump also supports a mode of operation where it writes packets to a file instead. This mode is activated when the -w option is used to specify an output capture file.

When writing to a file, tcpdump uses a completely different format from when it writes to the screen. When writing to the screen, formatted text descriptions of packets are printed. When writing to a file, the raw packets are recorded as is, without analysis.

Instead of doing a live capture, tcpdump also can read from an existing capture file as input with the -r option. Because tcpdump capture files use the universally supported "pcap" format, they also can be opened by other applications, including Wireshark.

This gives you the option to capture packets with tcpdump on one host, but perform analysis on a different host by transferring and loading the capture file. This lets you use Wireshark on your local workstation without having to attach it to the network and location you need to capture from.

Analyzing TCP-Based Application Protocols

tcpdump is a packet-based analyzer, and it works great for connectionless, packet-based protocols like IP, UDP, DHCP, DNS and ICMP. However, it cannot directly analyze "connection-oriented" protocols, such as HTTP, SMTP and IMAP, because they work completely different.

They do not have the concept of "packets". Instead, they operate over the stream-based connections of TCP, which provide an abstracted communications layer. These application protocols are really more like interactive console programs than packet-based network protocols.

TCP transparently handles all of the underlying details required to provide these reliable, end-to-end, session-style connections. This includes encapsulating the stream-based data into packets (called segments) that can be sent across the network. All of these details are hidden below the application layer.

In order to capture TCP-based application protocols, an extra step is needed beyond capturing packets. Because each TCP segment is only a slice of application data, it can't be used individually to obtain any meaningful information. You first must reassemble TCP sessions (or flows) from the combined sets of individual segments/packets. The application protocol data is contained directly within the sessions.

tcpdump doesn't have an option to assemble TCP sessions from packets directly, but you can "fake" it by using what I call "the tcpdump strings trick".

The tcpdump Strings Trick

Usually when I'm capturing traffic, it's just for the purpose of casual analysis. The data doesn't need to be perfect if it shows me what I'm looking for and helps me gain some insight.

In these cases, speed and convenience reign supreme. The following trick is along these lines and is one of my favorite tcpdump techniques. It works because:

TCP segments usually are sent in chronological order.
Text-based application protocols produce TCP segments with text payloads.
The data surrounding the text payloads, such as packet headers, is usually not text.
The UNIX command strings filters out binary data from streams preserving only text (printable characters).
When tcpdump is called with -w - it prints raw packets to STDOUT.

Put it all together, and you get a command that dumps real-time HTTP session data:


tcpdump -l -s0 -w - tcp dst port 80 | strings

The -l option above turns on line buffering, which makes sure data gets printed to the screen right away.

What is happening here is tcpdump is printing the raw, binary data to the screen. This uses a twist on the -w option where the special filename - writes to STDOUT instead of a file. Normally, doing this would display all kinds of gibberish, but that's where the strings command comes in—it allows only data recognized as text through to the screen.

There are few caveats to be aware of. First, data from multiple sessions received simultaneously is displayed simultaneously, clobbering your output. The more refined you make the filter expression, the less of a problem this will be. You also should run separate commands (in separate shells) for the client and server side of a session:


tcpdump -l -s0 -w - tcp dst port 80 | strings
tcpdump -l -s0 -w - tcp src port 80 | strings

Also, you should expect to see a few gibberish characters here and there whenever a sequence of binary data also happens to look like text characters. You can cut down on this by increasing min-len (see the strings man page).

This trick works just as well for other text-based protocols.

HTTP and SMTP Analysis

Using the strings trick in the previous section, you can capture HTTP data even though tcpdump doesn't actually understand anything about it. You then can "analyze" it further in any number of ways.

If you wanted to see all the Web sites being accessed by "davepc" in real time, for example, you could run this command on the firewall (assume the internal interface is eth1):


tcpdump -i eth1 -l -s0 -w - host davepc and port 80 \
  | strings | grep 'GET\|Host'

In this example, I'm using a simple grep command to display only lines with GET or Host. These strings show up in HTTP requests and together show the URLs being accessed.

This works just as well for SMTP. You could run this on your mail server to watch e-mail senders and recipients:


tcpdump -l -s0 -w - tcp dst port 25 | strings \
  | grep -i 'MAIL FROM\|RCPT TO'

These are just a few silly examples to illustrate what's possible. You obviously could take it beyond grep. You could go as far as to write a Perl script to do all sorts of sophisticated things. You probably wouldn't take that too far, however, because at that point, there are better tools that actually are designed to do that sort of thing.

The real value of tcpdump is the ability to do these kinds of things interactively and on a whim. It's the power to look inside any aspect of your network whenever you want without a lot of effort.

Debugging Routes and VPN Links

tcpdump is really handy when debugging VPNs and other network connections by showing where packets are showing up and where they aren't. Let's say you've set up a standard routable network-to-network VPN between 10.0.50.0/24 and 192.168.5.0/24 (Figure 2).

illustration of a standard routable network-to-network VPN

Figure 2. Example VPN Topology

If it's operating properly, hosts from either network should be able to ping one another. However, if you are not getting replies when pinging host D from host A, for instance, you can use tcpdump to zero in on exactly where the breakdown is occurring:


tcpdump -tn icmp and host 10.0.50.2

In this example, during a ping from 10.0.50.2 to 192.168.5.38, each round trip should show up as a pair of packets like the following, regardless of from which of the four systems the tcpdump command is run:


IP 10.0.50.2 > 192.168.5.38: ICMP echo request, 
 ↪id 46687, seq 1, length 64
IP 192.168.5.38 > 10.0.50.2: ICMP echo reply, 
 ↪id 46687, seq 1, length 64

If the request packets make it to host C (the remote gateway) but not to D, this indicates that the VPN itself is working, but there could be a routing problem. If host D receives the request but doesn't generate a reply, it probably has ICMP blocked. If it does generate a reply but it doesn't make it back to C, then D might not have the right gateway configured to get back to 10.0.50.0/24.

Using tcpdump, you can follow the ping through all eight possible points of failure as it makes its way across the network and back again.

Conclusion

I hope this article has piqued your interest in tcpdump and given you some new ideas. Hopefully, you also enjoyed the examples that barely scratch the surface of what's possible.

Besides its many built-in features and options, as I showed in several examples, tcpdump can be used as a packet-data conduit by piping into other commands to expand the possibilities further—even if you do manage to exhaust its extensive "advertised" capabilities. The ways to use tcpdump are limited only by your imagination.

tcpdump also is an incredible learning tool. There is no better way to learn how networks and protocols work than from watching their actual packets.

Don't forget to check out the tcpdump and pcap-filter man pages for additional details and information.

The tcpdump/libpcap Legacy

tcpdump has been the de facto packet capture tool for the past 25 years. It really did spawn the whole genre of network utilities based on sniffing and analyzing packets. Prior to tcpdump, packet capture had such high processing demands that it was largely impractical. tcpdump introduced some key innovations and optimizations that helped make packet capture more viable, both for regular systems and for networks with a lot of traffic.

The utilities that came along afterward not only followed tcpdump's lead, but also directly incorporated its packet capture functionality. This was possible because very early on, tcpdump's authors decided to move the packet capture code into a separate portable library called libpcap.

Wireshark, ntop, snort, iftop, ngrep and hundreds of other applications and utilities available today are all based on libpcap. Even most packet capture applications for Windows are based on a port of libpcap called WinPcap.

__________________________________________________________________________________________________________________________________________

Resources

tcpdump and libpcap: http://www.tcpdump.org

Wireshark: http://www.wireshark.org

TCP/IP Model: http://en.wikipedia.org/wiki/TCP/IP_model

Kernel Open Source

Sunday, 19 August 2012