Monday, June 1, 2009

Episode #43: Users & Groups

Ed rushes in:

Here's an easy one that I use all the time when analyzing a system. When auditing a box or investigating a compromised system, I often want to double check which groups are defined locally, along with the membership of each group. I especially focus on who is in the admin group. We can dump a list of groups as follows:

C:\> net localgroup

Then, we can check the accounts associated with each group using:

C:\> net localgroup [groupname]

That's all well and good, but how can we get all of this information at one time? We could use a FOR /F loop to iterate on the output of our first command (group names) showing the membership (user names). Put it all together, and you have:

C:\> for /F "skip=4 delims=*" %i in ('net localgroup ^| find /v "The command completed successfully"') do @net localgroup "%i"

Here, I'm using a FOR /F loop to parse through the output of "net localgroup". I filter (find) through the output of that command to choose lines that do not have (/v) the annoying "The command completed successfully" output. Otherwise, I'd get errors. I define some custom parsing in my FOR /F loop to skip the first 4 lines of cruft, and set a delimiter of * to remove the garbage that Microsoft prepends to each group name. BTW, what's with Microsoft making the output of their commands so ugly? Why do we have to parse all this garbage? How about they make the output useful as is? Oh well... Anyway, once I've parsed the output of "net localgroup" to get a group list, I push each group name through "net localgroup" again to get a list of members.

Hal's going off the deep end:

Ed's challenge seemed deceptively simple when I first read it. Then my brain kicked in, and as usual that made things infinitely more difficult. At first I thought this was going to be as simple as using "cut" to pull the appropriate fields out of /etc/group:

$ cut -f1,4 -d: /etc/group
root:
daemon:
bin:
sys:
adm:hal
[...]

Alternatively, if you only cared about groups that actually had users listed in the last field, you could do:

$ cut -f1,4 -d: /etc/group | grep -v ':$'
adm:hal
dialout:hal
cdrom:hal
audio:pulse
plugdev:hal
lpadmin:hal
admin:hal
sambashare:hal

But now my uppity brain intruded with an ugly fact: by only looking at /etc/group, we're ignoring the users' default group assignments in /etc/passwd. What we really need to do is merge the information in /etc/passwd with the group assignments in /etc/group. I'll warn you up front that my final solution skirts perilously close to the edge of our "no scripting" rule, but here goes.
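To make that gap concrete, here's a minimal sketch using two made-up files in a temp directory. The file contents are hypothetical, loosely modeled on the sshd and nogroup entries that show up later in this post. Looking only at the group file, nogroup appears empty, even though the passwd file assigns sshd to that GID.

```shell
# Hypothetical sample data, not read from a real system
tmp=$(mktemp -d)
printf '%s\n' 'nogroup:x:65534:' 'adm:x:4:hal' > "$tmp/group"
printf '%s\n' 'sshd:x:114:65534::/var/run/sshd:/usr/sbin/nologin' > "$tmp/passwd"

# The group file alone says nogroup has no members...
group_view() { cut -f1,4 -d: "$tmp/group"; }

# ...but field 4 of the passwd file (the default GID) puts sshd in 65534
passwd_view() { cut -f1,4 -d: "$tmp/passwd"; }

group_view
passwd_view
```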

We're going to be using the "join" command to stitch together the /etc/passwd and /etc/group files on the group ID column. However, "join" requires both of its input files to be sorted on the join field. So before we do anything we need to accomplish this:

$ sort -n -t: -k4 /etc/passwd >passwd.sorted
$ sort -n -t: -k3 /etc/group >group.sorted

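In the "sort" commands, "-n" means do a numeric sort, "-t" specifies the field delimiter, and "-k" is used to specify the field(s) to sort on. One caveat: "join" expects its inputs in plain (non-numeric) sort order on the join field, so if your version of "join" complains that a file is not in sorted order, try sorting without "-n" and with the key limited to the one field ("-k4,4" and "-k3,3").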
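A quick way to see what those three flags do is to run the same kind of sort against a tiny made-up file. Everything below (the temp directory, the file name passwd.sample, and its three lines) is hypothetical sample data, not output from a real system.

```shell
# Hypothetical sample data in passwd format: name:pw:UID:GID:...
tmp=$(mktemp -d)
printf '%s\n' \
  'hal:x:1000:1000:Hal:/home/hal:/bin/bash' \
  'sync:x:4:65534:sync:/bin:/bin/sync' \
  'ntp:x:112:125::/home/ntp:/bin/false' > "$tmp/passwd.sample"

# Numeric sort on the GID column (field 4, colon-delimited),
# then trim the output down to just user name and GID
sorted_gids() {
  sort -n -t: -k4 "$tmp/passwd.sample" | cut -d: -f1,4
}
sorted_gids
```

The lines should come back ordered by GID (125, 1000, 65534) rather than in the order they were written.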

Once we have the sorted files, producing the output we want is trivial:

$ join -a 1 -t: -1 3 -2 4 group.sorted passwd.sorted | \
awk -F: '{ grps[$2] = grps[$2] "," $4 "," $5 }
END { for ( g in grps ) print g ":" grps[g] }' | \
sed -r 's/(:|,),*/\1/g; s/,$//' | sort

[...]
list:list
lpadmin:hal
lp:hplip,lp
mail:mail
man:man
messagebus:messagebus
mlocate:
netdev:
news:news
nogroup:nobody,sshd,sync
[...]

While I hate to belabor the obvious, let me go over the above example line-by-line for the two or three folks reading this blog who might be confused:


  • "join" is a bit funky. The "-t" option specifies the column delimiter, just like "sort", and you can probably guess that "-1 3" and "-2 4" are how we're specifying the join column in file 1 ("-1") and file 2 ("-2"). Normally "join" will only output lines when it can find lines in both files that it can merge together. However, the "-a 1" option tells "join" to output all lines from file 1, even if there's no corresponding line in file 2.

    So that you can understand the rest of the command-line above, let me show you some of the output from the "join" command by itself:

    $ join -a 1 -t: -1 3 -2 4 group.sorted passwd.sorted
    [...]
    124:sambashare:x:hal
    125:ntp:x::ntp:x:112::/home/ntp:/bin/false
    126:bind:x::bind:x:113::/var/cache/bind:/bin/false
    1000:hal:x::hal:x:1000:Hal Pomeranz,,,:/home/hal:/bin/bash
    65534:nogroup:x::nobody:x:65534:nobody:/nonexistent:/bin/sh
    65534:nogroup:x::sshd:x:114::/var/run/sshd:/usr/sbin/nologin
    65534:nogroup:x::sync:x:4:sync:/bin:/bin/sync

    When doing its output, "join" puts the merge column value (the GID in our case) up at the front of each line of output. Then you see the remaining fields of the first input file (group name, group password, user list), followed by the remaining fields of the second input file (user name, BSD password, UID, and so on). The "sambashare" line at the top of our sample output is an example of a group that had no corresponding users in /etc/passwd. The "nogroup" lines toward the bottom of the output are an example of a single group that actually has several users associated with it in /etc/passwd.

    Somehow we've got to pull the output from the "join" command into a consolidated output format. That's going to require some pretty flexible text processing, plus the ability to merge user names from multiple lines of output, like the "nogroup" lines in our sample output. Sounds like a job for awk.

  • In the awk expression I'm using "-F:" to tell awk to split the input lines on colons, rather than on whitespace, which is the default. Now the group name is always in field 2, the list of users from /etc/group is in field 4, and the user name from /etc/passwd is in field 5. As I read each line of input, I'm building up an array indexed by group name that contains a list of all the values in fields 4 and 5, separated by commas. In the "END" block that gets processed when the input is exhausted I'm outputting the group name, a colon, and the list of users.

    The only problem is that sometimes field 4 or field 5 (or both) is empty, so you get some extra commas in the output:

    $ join -a 1 -t: -1 3 -2 4 group.sorted passwd.sorted | \
    awk -F: '{ grps[$2] = grps[$2] "," $4 "," $5 }
    END { for ( g in grps ) print g ":" grps[g] }'

    [...]
    sambashare:,hal,
    nogroup:,,nobody,,sshd,,sync
    [...]

    A little "sed" will clean that right up.

  • Our "sed" expression actually contains two substitution operations separated by a semicolon: "s/(:|,),*/\1/g" and "s/,$//". Both substitutions will be applied to all input lines.

    The first substitution is the most complex. We're matching either a colon or a comma followed by some number of extra commas and replacing that with the initial colon or comma. This allows us to remove all of the extra commas in the middle of the output lines.

    The second substitution matches commas at the end of the line and removes them (replaces them with nothing).
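To watch the two substitutions work, you can feed the sample lines from the awk output through the same sed expression on their own. This is just a sketch: the "cleanup" function name is mine, the third input line is what I'd expect an empty group like mlocate to look like before cleanup, and I'm using "-E" on the assumption that your sed accepts it in place of "-r" (recent GNU and BSD versions do).

```shell
# Same two substitutions as in the pipeline, wrapped in a helper
cleanup() {
  sed -E 's/(:|,),*/\1/g; s/,$//'
}

# Two lines from the awk output, plus an empty group
printf '%s\n' \
  'sambashare:,hal,' \
  'nogroup:,,nobody,,sshd,,sync' \
  'mlocate:,,' | cleanup
```

That should print "sambashare:hal", "nogroup:nobody,sshd,sync", and "mlocate:", one per line.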


We throw a final "sort" command at the end of the pipeline so we get the output sorted by group name, but the hard part is basically over.

Clever readers will note that there's a potential problem with my solution. What if the "nogroup" entry in /etc/group had a user list like "nogroup:x:65534:foo,bar"? Because there were multiple /etc/passwd lines associated with "nogroup", I'd end up repeating the users from the list in /etc/group multiple times:

$ join -a 1 -t: -1 3 -2 4 group.sorted passwd.sorted | ... | grep nogroup
nogroup:foo,bar,nobody,foo,bar,sshd,foo,bar,sync

The real solution requires introducing some conditional logic into the middle of the awk expression in order to avoid this duplication:

$ join -a 1 -t: -1 3 -2 4 group.sorted passwd.sorted | \
awk -F: '{ if (grps[$2]) { grps[$2] = grps[$2] "," $5 }
else { grps[$2] = $4 "," $5 } }
END { for ( g in grps ) print g ":" grps[g] }' | \
sed -r 's/(:|,),*/\1/g; s/,$//' | sort

[...]
nogroup:foo,bar,nobody,sshd,sync
[...]

The "if" statement in the middle of the awk code is checking to see whether we've seen this group before or not. The first time we see a group (the "else" clause), we make a new entry in the "grps" array with both the user list from /etc/group ($4) and the user name from the /etc/passwd entry ($5). Otherwise, we just append the user name info from the /etc/passwd entry and don't bother re-appending the group list from /etc/group.
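Here's a sketch of the whole thing run end-to-end against throwaway sample files, so you can check the de-duplication without touching the real /etc/passwd and /etc/group. The users, GIDs, temp directory, and the "group_members" function name are all made up for illustration. I picked three-digit GIDs so that the numeric sort and the plain ordering that "join" looks for agree, and I've swapped "sed -r" for "sed -E" on the assumption that your sed accepts it.

```shell
# Hypothetical sample files (made-up users and GIDs)
tmp=$(mktemp -d)
printf '%s\n' 'nogroup:x:999:foo,bar' 'adm:x:104:hal' 'hal:x:500:' > "$tmp/group"
printf '%s\n' \
  'sshd:x:114:999::/var/run/sshd:/usr/sbin/nologin' \
  'hal:x:500:500:Hal:/home/hal:/bin/bash' \
  'nobody:x:998:999:nobody:/nonexistent:/bin/sh' > "$tmp/passwd"

# Sort both files on the GID column, as in the post
sort -n -t: -k4 "$tmp/passwd" > "$tmp/passwd.sorted"
sort -n -t: -k3 "$tmp/group"  > "$tmp/group.sorted"

group_members() {
  join -a 1 -t: -1 3 -2 4 "$tmp/group.sorted" "$tmp/passwd.sorted" |
  awk -F: '{ if (grps[$2]) { grps[$2] = grps[$2] "," $5 }  # seen before: add user only
             else          { grps[$2] = $4 "," $5 } }      # first time: group list + user
           END { for ( g in grps ) print g ":" grps[g] }' |
  sed -E 's/(:|,),*/\1/g; s/,$//' | sort
}
group_members
```

With this sample data the output should be "adm:hal", "hal:hal", and "nogroup:foo,bar,nobody,sshd", with "foo,bar" appearing only once even though two passwd lines map to nogroup.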

I was able to successfully type the above code into a single command-line, but it's clearly a small script at this point. So I'd say that it at least goes against the spirit of the rules of this blog.