Tuesday, October 4, 2011

Episode #159: Portalogical Exam

Tim finally has an idea

Sadly, we've been away for two weeks due to lack of new, original ideas for posts. BUT! I came up with and idea. Yep, all by myself too. (By the way, if you have an idea for an episode send it in)

During my day job pen testing, I regularly look at nmap results to see what services are available. I like to get a high level look at the open ports. For example, lots of tcp/445 means a bunch of Windows boxes. It is also useful to quickly see the one off ports, and in this line of work, the one offs can be quite important. One unique service may be legacy, special (a.k.a. not patched), or forgotten.

Nmap has a number of output options: XML, grep'able output, and standard nmap output. PowerShell really favors objects, which means that XML will work great. So let's start off by reading the file and parsing it as XML.

PS C:\> [xml](Get-Content nmap.xml)

xml xml-stylesheet #comment nmaprun
--- -------------- -------- -------
version="1.0" href="file:///usr/local/sh... Nmap 5.51 scan initiated ... nmaprun


Get-Content (alias gc, cat, type) is used to read the file, then [xml] parses it and converts it to an XML object. After we have an XML object, we can see all the nodes of the document. To access each node we access it like any property:

PS C:\> ([xml](gc nmap.xml)).nmaprun.host


But each host has a ports property that needs to be expanded, and each ports property has multiple port properties to be expanded. To do this we use a pair of Select-Object cmdlets with the -ExpandProperty switch (-ex for short).

PS C:\> ([xml](gc .nmap.xml)).nmaprun.host | select -expand ports | 
select -ExpandProperty port


protocol portid state service
-------- ------ ----- -------
tcp 22 state service
tcp 23 state service
tcp 80 state service
tcp 443 state service
tcp 80 state service
tcp 443 state service
...


Nmap can have information on closed ports, so I like to make sure that I am only looking at open ports. We use the Where-Object cmdlet (alias ?) to filter for ports that are open. Each port has a state element and a state property, and we'll check if the state of the state (yep, that's right) is open:

PS C> ... | ? { $_.state.state -eq "open" }


The output is the same, just with extra filtering. Now all we need to do is count. To do that, we use the Group-Object cmdlet (alias group).

PS C:\> ([xml](gc nmap.xml)).nmaprun.host | select -expand ports | 
select -ExpandProperty port | ? { $_.state.state -eq "open" } |
group protocol,portid -NoElement


Count Name
----- ----
12 tcp, 80
1 tcp, 25
12 tcp, 443
2 tcp, 53
...


The -NoElement switch tells the cmdlet to discard the individual objects and just give us the group information.

Of course, if we are looking for patterns or one off ports we need use the Sort-Object cmdlet (alias sort) by the Count and use the -Descending switch (-desc for short)..

PS C:\> ([xml](gc nmap.xml)).nmaprun.host | select -expand ports | 
select -ExpandProperty port | ? { $_.state.state -eq "open" } |
group protocol,portid -NoElement | sort count -desc


Count Name
----- ----
12 tcp, 443
12 tcp, 80
2 tcp, 53
2 tcp, 18264
...



Now that's handy, but many times I have multiple scans, like a UDP scan and a TCP scan. If we want to combine multiple scans into one table we can do it relatively easily.

PS C:\> ls *.xml | % { ([xml](gc $_)).nmaprun.host } | select -expand ports |
select -ExpandProperty port | ? { $_.state.state -eq "open" } |
group protocol,portid -NoElement | sort count -desc


Count Name
----- ----
12 tcp, 443
12 tcp, 80
3 udp, 161
2 tcp, 53
2 tcp, 18264
...


The beauty of PowerShell's pipeline is that we can use any method we want to pick the files, then feed them into the next command with a ForEach-Object loop (alias %).

Now that I've checked all my ports, it's time for Hal to get his checked.

Hal examines his options

Tim, when you get to be my age, you'll get all of your ports checked on an annual basis.

Now let's examine this so-called idea of Tim's. Oh sure, XML is all fine and dandy for fancy scripting languages like Powershell. But you'll notice he didn't even attempt to do this in CMD.EXE. Weakling.

While XML is generally a multi-line format and not typically conducive to shell utilities that operate on a "line at a time" basis, for something simple like this we can easily hack together some code. In the XML format, the lines that show open ports have a regular format:

<port protocol="tcp" portid="443"><state state="open" />...</port>

So pardon me while I throw down some sed:

$ sed 's/^<port protocol="\([^"]*\)" portid="\([^"]*\)"><state state="open".*/\2\/\1/; 
t p; d; :p' test.xml | sort | uniq -c | sort -nr -k1 -k2

6 443/tcp
5 80/tcp
3 22/tcp
2 3306/tcp
1 9100/tcp
...

The first part of the sed expression is a substitution that matches the protocol name and port number and replaces the entire line with just "<port>/<protocol>". Now I only want to output the lines where this substitution succeeded, so I use "t p" to branch to the label ":p" whenever the substitution happens. If we don't branch, then we hit the "d" command to drop the pattern space without printing and move onto the next line. Since the ":p" label we jump to on a successful substitution is an empty block, sed just prints the pattern space and moves onto the next line. This is a useful sed idiom for only printing our matching lines.

The rest of the pipeline puts the output lines from sed into sorted order so we can feed them into "uniq -c" to count the occurrences of each line. After that we use sort again to do a descending numeric sort ("-nr") of first the counts ("-k1") and then the port numbers ("-k2"). And that give us the output we want.

I actually find the so-called "grep-able" output format of Nmap kind of a pain to deal with for this kind of thing. That's because Nmap insists on jamming all of the port information together into delimited, but variable length lines like this:

Host: 192.168.1.2 (test.deer-run.com) Ports: 22/open/tcp//ssh///, 
25/open/tcp//smtp///, 53/open/tcp//domain///, 80/open/tcp//http///,
139/open/tcp//netbios-ssn///, 143/open/tcp//imap///, 443/open/tcp//https///,
445/open/tcp//microsoft-ds///, 514/open/tcp//shell///, 587/open/tcp//submission///,
601/open/tcp/////, 902/open/tcp//iss-realsecure-sensor///, 993/open/tcp//imaps///,
1723/open/tcp//pptp///, 8009/open/tcp//ajp13/// Seq Index: 3221019...

So to handle this problem, I'm just going to use tr to convert the spaces to newlines, forcing the output to have a single port entry per line. After that, it's just awk:

$ cat test.gnmap | tr ' ' \\n | awk -F/ '/\/\/\// {print $1 "/" $3}' | 
sort | uniq -c | sort -nr -k1 -k2

6 443/tcp
5 80/tcp
3 22/tcp
2 3306/tcp
1 9100/tcp
...

The port listings all end with "///", so I use awk to match those lines and output the port and protocol fields. Notice the "-F/" option so that awk uses the slash character as the field delimiter instead of whitespace. After that it's the same "sort ... | uniq -c | sort ..." pipeline we used in the last case to format the output.

The easiest case is actually the regular Nmap output:

$ awk '/^[0-9]/ {print $1}' test.nmap | sort | uniq -c | sort -nr -k1 -k2
6 443/tcp
5 80/tcp
3 22/tcp
2 3306/tcp
1 9100/tcp
...

The lines about open ports are the only ones in the output that start with digits. So it's a quick awk expression to match these lines and output the port specifier. After that, we use the same pipeline we used in the previous examples to format the output appropriately.

So, Tim, my shell may not have built-in tools to parse XML but it's apparently three times the shell that yours is. Stick that in your port and smoke it.

Because Davide sed So

Proving he's not just a whiz at awk, Davide Brini wrote in with a more elegant sed idiom for just printing the lines that match our substitution:

$ sed -n 's/^<port protocol="\([^"]*\)" portid="\([^"]*\)"><state state="open".*/\2\/\1/p' test.xml | 
sort | uniq -c | sort -nr -k1 -k2

6 443/tcp
5 80/tcp
3 22/tcp
2 3306/tcp
1 9100/tcp
...

"sed -n ..." suppresses the normal sed output and "s/.../.../p" causes the lines that match our substitution to be printed out. And that's much easier. Thanks, Davide!