A couple of weeks ago I promised some answers to the exercises I proposed at the end of my last post. What we have here is a case of, "Better late than never!"
1. If you go back and look at the example where I counted the number of processes per user, you'll notice that the "UID" header from the ps command ends up being counted. How would you suppress this?
There's a couple of different ways you could attack this using the material I showed you in the previous post. One way would be to do string comparison on field $1:
$ ps -ef | awk '$1 != "UID" {print $1}' | sort | uniq -c | sort -nr
178 root
58 hal
2 www-data
...
An alternative approach would be to use pattern matching to print lines that don't match the string "UID". The "!" operator means "not", so the expression "!/UID/" does what we want:
$ ps -ef | awk '!/UID/ {print $1}' | sort | uniq -c | sort -nr
178 root
57 hal
2 www-data
...
You'll notice that the "!/UID/" version counts one less process for user "hal" than the string comparison version. That's because the pattern match is matching the "UID" in the awk code and not showing you that process. So the string comparison version is slightly more accurate.
2. Print out the usernames of all accounts with superuser privileges (UID is 0 in /etc/passwd).
Remember that /etc/passwd file is colon-delimited, so we'll use awk's "-F" operator to split on colons. UID is field #3 and the username is field #1:
$ awk -F: '$3 == 0 {print $1}' /etc/passwd
root
Normally, a Unix-like OS will only have a single UID 0 account named "root". If you find other UID 0 accounts in your password file, they could be a sign that somebody's doing something naughty.
3. Print out the usernames of all accounts with null password fields in /etc/shadow.
You'll need to be root to do this one, since /etc/shadow is only readable by the superuser:
# awk -F: '$2 == "" {print $1}' /etc/shadow
Again, we use "-F:" to split the fields in /etc/shadow. We look for lines where the second field (containing the password hash) is the empty string and print the first field (the username) when this condition is true. It's really not much different from the previous /etc/passwd example.
You should get no output. There shouldn't be any entries in /etc/shadow with null password hashes!
4. Print out process data for all commands being run as root by interactive users on the system (HINT: If the command is interactive, then the "TTY" column will have something other than a "?" in it)
The "TTY" column in the "ps" output is field #6 and the username field is #1:
# ps -ef | awk '$1 == "root" && $6 != "?" {print}'
root 1422 1 0 Jan05 tty4 00:00:00 /sbin/getty -8 38400 tty4
root 1427 1 0 Jan05 tty5 00:00:00 /sbin/getty -8 38400 tty5
root 1434 1 0 Jan05 tty2 00:00:00 /sbin/getty -8 38400 tty2
root 1435 1 0 Jan05 tty3 00:00:00 /sbin/getty -8 38400 tty3
root 1438 1 0 Jan05 tty6 00:00:00 /sbin/getty -8 38400 tty6
root 1614 1523 0 Jan05 tty7 00:09:00 /usr/bin/X :0 -nr -verbose -auth ...
root 2082 1 0 Jan05 tty1 00:00:00 /sbin/getty -8 38400 tty1
root 5909 5864 0 13:42 pts/3 00:00:00 su -
root 5938 5909 0 13:42 pts/3 00:00:00 -su
root 5968 5938 0 13:47 pts/3 00:00:00 ps -ef
root 5969 5938 0 13:47 pts/3 00:00:00 awk $1 == "root" && $6 != "?" {print}
We look for the keyword "root" in the first field, and anything that's not "?" in the sixth field. If both conditions are true, then we just print out the entire line with "{print}".
Actually, "{print}" is the default action for awk. So we could shorten our code just a bit:
# ps -ef | awk '$1 == "root" && $6 != "?"'
root 1422 1 0 Jan05 tty4 00:00:00 /sbin/getty -8 38400 tty4
root 1427 1 0 Jan05 tty5 00:00:00 /sbin/getty -8 38400 tty5
root 1434 1 0 Jan05 tty2 00:00:00 /sbin/getty -8 38400 tty2
...
5. I mentioned that if you kill all the sshd processes while logged in via SSH, you'll be kicked out of the box (you killed your own sshd process) and unable to log back in (you've killed the master SSH daemon). Fix the awk so that it only prints out the PIDs of SSH daemon processes that (a) don't belong to you, and (b) aren't the master SSH daemon (HINT: The master SSH daemon is the one who's parent process ID is 1).
This one's a little tricky. Take a look at the sshd processes on my system:
# ps -ef | grep sshd
root 3394 1 0 2012 ? 00:00:00 /usr/sbin/sshd
root 13248 3394 0 Jan05 ? 00:00:00 sshd: hal [priv]
hal 13250 13248 0 Jan05 ? 00:00:02 sshd: hal@pts/0
root 25189 3394 0 08:27 ? 00:00:00 sshd: hal [priv]
hal 25191 25189 0 08:27 ? 00:00:00 sshd: hal@pts/1
root 25835 25807 0 15:33 pts/1 00:00:00 grep sshd
For modern SSH daemons with "Privilege Separation" enabled, there are actually two sshd processes per login. There's a root-owned process marked as "sshd: <user> [priv]" and a process owned by the user marked as "sshd: <user>@<tty>". Life would be a whole lot easier if both processes were identified with the associated pty, but alas things didn't work out that way. So here's what I came up with:
# ps -ef | awk '/sshd/ && !($3 == 1 || /sshd: hal[@ ]/) {print $2}'
First we eliminate all processes except for the sshd processes with "/sshd/". We only want to print out the process IDs if it's not the master SSH daemon ("$3 == 1" to make sure the PPID isn't 1) or if it's not one of my SSH daemons ("/sshd: hal[@ ]/" means the string "sshd: hal" followed by either "@" or space). If everything looks good, then print the process ID of the process ("{print $2}").
Frankly, that's some pretty nasty awk. I'm not sure it's something I'd come up with easily on the spur of the moment.
6. Use awk to parse the output of the ifconfig command and print out the IP address of the local system.
Here's the output from ifconfig on my system:
$ ifconfig eth0
eth0 Link encap:Ethernet HWaddr f0:de:f1:29:c7:18
inet addr:192.168.0.14 Bcast:192.168.0.255 Mask:255.255.255.0
inet6 addr: fe80::f2de:f1ff:fe29:c718/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:7724312 errors:0 dropped:0 overruns:0 frame:0
TX packets:13553720 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:711630936 (711.6 MB) TX bytes:17529013051 (17.5 GB)
Memory:f2500000-f2520000
So this is a reasonable first approximation:
$ ifconfig eth0 | awk '/inet addr:/ {print $2}'
addr:192.168.0.14
The only problem is the "addr:" bit that's still hanging on. awk has a number of built-in functions, including substr() which can help us in this case:
$ ifconfig eth0 | awk '/inet addr:/ {print substr($2, 6)}'
192.168.0.14
substr() takes as arguments the string we're working on (field $2 in this case) and the place in the string where you want to start (for us, that's the sixth character so we skip over the "addr:"). There's an optional third argument which is the number of characters to grab. If you leave that off, then you just get the rest of the string, which is what we want here.
There are lots of other useful built-in functions in awk. Consult the manual page for further info.
7. Parse the output of "lsof -nPi" and output the unique process name, PID, user ID, and port combinations for all processes that are in "LISTEN" mode on ports on the system.
Let's take a look at the "lsof -nPi" output using awk to match only the lines for "LISTEN" mode:
# lsof -nPi | awk '/LISTEN/'
sshd 1216 root 3u IPv4 5264 0t0 TCP *:22 (LISTEN)
sshd 1216 root 4u IPv6 5266 0t0 TCP *:22 (LISTEN)
mysqld 1610 mysql 10u IPv4 6146 0t0 TCP 127.0.0.1:3306 (LISTEN)
vmware-au 1804 root 8u IPv4 6440 0t0 TCP *:902 (LISTEN)
cupsd 1879 root 6u IPv6 73057 0t0 TCP [::1]:631 (LISTEN)
cupsd 1879 root 8u IPv4 73058 0t0 TCP 127.0.0.1:631 (LISTEN)
apache2 1964 root 4u IPv4 7412 0t0 TCP *:80 (LISTEN)
apache2 1964 root 5u IPv4 7414 0t0 TCP *:443 (LISTEN)
apache2 4112 www-data 4u IPv4 7412 0t0 TCP *:80 (LISTEN)
apache2 4112 www-data 5u IPv4 7414 0t0 TCP *:443 (LISTEN)
apache2 4113 www-data 4u IPv4 7412 0t0 TCP *:80 (LISTEN)
apache2 4113 www-data 5u IPv4 7414 0t0 TCP *:443 (LISTEN)
skype 5133 hal 41u IPv4 104783 0t0 TCP *:6553 (LISTEN)
Process name, PID, and process owner are fields 1-3 and the protocol and port are in fields 8-9. So that suggests the following awk:
# lsof -nPi | awk '/LISTEN/ {print $1, $2, $3, $8, $9}'
sshd 1216 root TCP *:22
sshd 1216 root TCP *:22
mysqld 1610 mysql TCP 127.0.0.1:3306
vmware-au 1804 root TCP *:902
cupsd 1879 root TCP [::1]:631
cupsd 1879 root TCP 127.0.0.1:631
apache2 1964 root TCP *:80
apache2 1964 root TCP *:443
apache2 4112 www-data TCP *:80
apache2 4112 www-data TCP *:443
apache2 4113 www-data TCP *:80
apache2 4113 www-data TCP *:443
skype 5133 hal TCP *:6553
And if we want the unique entries, then just use "sort -u":
# lsof -nPi | awk '/LISTEN/ {print $1, $2, $3, $8, $9}' | sort -u
apache2 1964 root TCP *:443
apache2 1964 root TCP *:80
apache2 4112 www-data TCP *:443
apache2 4112 www-data TCP *:80
apache2 4113 www-data TCP *:443
apache2 4113 www-data TCP *:80
cupsd 1879 root TCP 127.0.0.1:631
cupsd 1879 root TCP [::1]:631
mysqld 1610 mysql TCP 127.0.0.1:3306
skype 5133 hal TCP *:6553
sshd 1216 root TCP *:22
vmware-au 1804 root TCP *:902
Looking at the output, I'm not sure I care about all of the different apache2 instances. All I really want to know is which program is using port 80/tcp and 443/tcp. So perhaps we should just drop the PID and process owner:
# lsof -nPi | awk '/LISTEN/ {print $1, $8, $9}' | sort -u
apache2 TCP *:443
apache2 TCP *:80
cupsd TCP 127.0.0.1:631
cupsd TCP [::1]:631
mysqld TCP 127.0.0.1:3306
skype TCP *:6553
sshd TCP *:22
vmware-au TCP *:902
In the above output you see cupsd bound to both the IPv4 and IPv6 loopback address. If you just care about the port numbers, we can flash a little sed to clean things up:
# lsof -nPi | awk '/LISTEN/ {print $1, $8, $9}' | \
sed 's/[^ ]*:\([0-9]*\)/\1/' | sort -u -n -k3
sshd TCP 22
apache2 TCP 80
apache2 TCP 443
cupsd TCP 631
vmware-au TCP 902
mysqld TCP 3306
skype TCP 6553
In the sed expression I'm matching "some non-space characters followed by a colon" ("[^ ]*:") with some digits afterwards ("[0-9]*"). The digits are the port number, so we replace the matching expression with just the port number. Notice I used "\(...\)" around the "[0-9]*" to create a sub-expression that I can substitute on the right-hand side as "\1".
I've also modified the final "sort" command so that we get a numeric ("-n") sort on the port number ("-k3" for the third column). That makes the output look more natural to me.
I guess the moral of the story here is that awk is good for many things, but not necessarily for everything. Don't forget that there are other standard commands like sed and sort that can help produce the output that you're looking for.
Happy awk-ing everyone!