Command Line Kung Fu: Episode #127: Making a Difference

Hal went to school

I recently got the opportunity to sit in on (fellow SANS instructor) Lenny Zeltser's "Reverse Engineering Malware" class. It's a terrific course, and I highly recommend it.

During the material on memory analysis, we were comparing the output of "volatility pslist" and "volatility psscan2". It's relatively straightforward for rootkits to hide themselves from pslist, but psscan2 does a much more thorough job of finding the hidden processes. So the differences in the output are always very interesting to the analyst. Here's an example of what I mean:

$ volatility pslist -f memory.img
Name                 Pid    PPid   Thds   Hnds   Time  
System               4      0      55     260    Thu Jan 01 00:00:00 1970  
smss.exe             540    4      3      21     Thu Jan 28 16:11:40 2010  
csrss.exe            604    540    12     363    Thu Jan 28 16:11:46 2010  
lsass.exe            684    628    18     341    Thu Jan 28 16:11:47 2010  
vmacthlp.exe         836    672    1      24     Thu Jan 28 16:11:47 2010  
svchost.exe          848    672    18     201    Thu Jan 28 16:11:47 2010  
svchost.exe          1024   672    51     1178   Thu Jan 28 16:11:47 2010  
svchost.exe          1072   672    4      75     Thu Jan 28 16:11:47 2010  
svchost.exe          1132   672    15     212    Thu Jan 28 16:11:48 2010  
spoolsv.exe          1476   672    10     115    Thu Jan 28 16:11:49 2010  
explorer.exe         1592   1572   12     4021   Thu Jan 28 16:11:50 2010  
VMwareUser.exe       1656   1592   8      416    Thu Jan 28 16:11:50 2010  
VMwareService.e      1996   672    3      1026   Thu Jan 28 16:11:58 2010  
wscntfy.exe          1396   1024   1      27     Thu Jan 28 16:12:03 2010  
taskmgr.exe          1624   628    3      20201  Tue Feb 02 02:45:05 2010  
mike022.exe          1956   672    2      30     Tue Feb 02 03:25:29 2010  
wordpad.exe          1992   1260   4      102    Tue Feb 02 22:17:03 2010  
calc.exe             828    1592   1      26     Thu Feb 04 00:01:00 2010  
cmd.exe              968    1592   1      32     Thu Feb 04 00:01:13 2010  
wordpad.exe          2008   1256   5      101    Thu Feb 04 00:02:56 2010  
$ volatility psscan2 -f memory.img
PID    PPID   Time created             Time exited              Offset     PDB        Remarks
------ ------ ------------------------ ------------------------ ---------- ---------- ----------------

   932    672 Thu Jan 28 16:11:47 2010                          0x01ea3558 0x082c0100 svchost.exe     
  1744    848 Thu Feb 04 00:02:53 2010 Thu Feb 04 00:04:23 2010 0x01eaea88 0x082c0380 wmiprvse.exe    
  1132    672 Thu Jan 28 16:11:48 2010                          0x01eb4970 0x082c0160 svchost.exe     
  1956    672 Tue Feb 02 03:25:29 2010                          0x020155d8 0x082c02c0 mike022.exe     
  1072    672 Thu Jan 28 16:11:47 2010                          0x02016978 0x082c0140 svchost.exe     
  1172   1592 Tue Feb 02 02:40:48 2010                          0x0204c850 0x082c01c0 cmd.exe         
  1476    672 Thu Jan 28 16:11:49 2010                          0x0209db38 0x082c01a0 spoolsv.exe     
  1996    672 Thu Jan 28 16:11:58 2010                          0x021f0da0 0x082c0180 VMwareService.e 
  1664   1592 Thu Jan 28 16:11:50 2010                          0x021feb88 0x082c0240 msmsgs.exe      
  1024    672 Thu Jan 28 16:11:47 2010                          0x02202880 0x082c0120 svchost.exe     
   604    540 Thu Jan 28 16:11:46 2010                          0x0221f020 0x082c0040 csrss.exe       
  1624    628 Tue Feb 02 02:45:05 2010                          0x02256da0 0x082c02e0 taskmgr.exe     
   272   1820 Thu Feb 04 00:00:55 2010                          0x02293b08 0x082c0300 wordpad.exe     
  1012    672 Thu Jan 28 16:12:02 2010                          0x023a78b0 0x082c0260 alg.exe         
  1656   1592 Thu Jan 28 16:11:50 2010                          0x023a9c28 0x082c0220 VMwareUser.exe  
  1648   1592 Thu Jan 28 16:11:50 2010                          0x023ae980 0x082c0200 VMwareTray.exe  
   848    672 Thu Jan 28 16:11:47 2010                          0x023b3020 0x082c00e0 svchost.exe     
  1748   1592 Thu Feb 04 00:02:10 2010 Thu Feb 04 00:06:19 2010 0x0240b9a0 0x082c03a0 cmd.exe         
   836    672 Thu Jan 28 16:11:47 2010                          0x02412b58 0x082c00c0 vmacthlp.exe    
   672    628 Thu Jan 28 16:11:47 2010                          0x02448cf8 0x082c0080 services.exe    
   968   1592 Thu Feb 04 00:01:13 2010                          0x024707e8 0x082c0340 cmd.exe         
   684    628 Thu Jan 28 16:11:47 2010                          0x02483da0 0x082c00a0 lsass.exe       
  1992   1260 Tue Feb 02 22:17:03 2010                          0x02491130 0x082c0360 wordpad.exe     
  1396   1024 Thu Jan 28 16:12:03 2010                          0x02492d78 0x082c0280 wscntfy.exe     
  2008   1256 Thu Feb 04 00:02:56 2010                          0x02494988 0x082c03e0 wordpad.exe     
   828   1592 Thu Feb 04 00:01:00 2010                          0x024c86b8 0x082c02a0 calc.exe        
  1592   1572 Thu Jan 28 16:11:50 2010                          0x024ddda0 0x082c01e0 explorer.exe    
   540      4 Thu Jan 28 16:11:40 2010                          0x024f8368 0x082c0020 smss.exe        
   628    540 Thu Jan 28 16:11:46 2010                          0x025314e8 0x082c0060 winlogon.exe    
     4      0                                                   0x025c8830 0x00319000 System

Visually you can see that the psscan2 output lists several more processes than pslist, but just using your eyeballs it can be difficult to figure out exactly what the differences are. Seems like a job for command-line kung fu!

My first thought was to simply extract the list of .EXEs from each command and diff them. In order to do the diff properly, I'll need to sort them into canonical order, but that's no problem. Here's how we manage the output from pslist:

$ volatility pslist -f memory.img | tail -n +2 | awk '{print $1}' | sort
calc.exe
cmd.exe
csrss.exe
...

I use tail to chop off the header line, then awk to extract the name of the .EXE from the first column, and finally pipe the whole thing into sort.

Dealing with the psscan2 output is very similar:

$ volatility psscan2 -f memory.img | tail -n +4 | awk '{print $NF}' | sort
alg.exe
calc.exe
cmd.exe
...

In this case, there are three header lines we need to skip. Also the .EXE name is in the last column of output-- "print $NF" is a useful awk idiom for printing the value in the last column.

So now we need to diff the output of these two commands. We could do this by creating temporary files, but why bother when we have the magic bash "<(...)" syntax that lets us substitute command output in a place where a command would normally be looking for a file name:

diff <(volatility psscan2 -f memory.img | tail -n +4 | awk '{print $NF}' | sort) \
     <(volatility pslist -f memory.img | tail -n +2 | awk '{print $1}' | sort)
1d0
< alg.exe
4,5d2
< cmd.exe
< cmd.exe
10,11d6
< msmsgs.exe
< services.exe
18d12
< svchost.exe
23d16
< VMwareTray.exe
25,27d17
< winlogon.exe
< wmiprvse.exe
< wordpad.exe

Wicked! There are 10 processes that appear in the psscan2 output that don't show up in the pslist output. Since we don't see any lines starting with ">" there are no processes in the pslist output that don't show up in psscan2-- this is what we'd expect, but it's always nice to get confirmation.
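If you'd like to see what that diff is really computing, here's a rough Python sketch of the same idea as a multiset difference. The name lists are copied by hand from the sample output above rather than parsed from volatility, and the variable names are my own, so treat it as an illustration only:

```python
from collections import Counter

# Process names copied by hand from the sample pslist output above.
pslist_names = """System smss.exe csrss.exe lsass.exe vmacthlp.exe svchost.exe
svchost.exe svchost.exe svchost.exe spoolsv.exe explorer.exe VMwareUser.exe
VMwareService.e wscntfy.exe taskmgr.exe mike022.exe wordpad.exe calc.exe
cmd.exe wordpad.exe""".split()

# Process names copied by hand from the sample psscan2 output above.
psscan2_names = """svchost.exe wmiprvse.exe svchost.exe mike022.exe svchost.exe
cmd.exe spoolsv.exe VMwareService.e msmsgs.exe svchost.exe csrss.exe
taskmgr.exe wordpad.exe alg.exe VMwareUser.exe VMwareTray.exe svchost.exe
cmd.exe vmacthlp.exe services.exe cmd.exe lsass.exe wordpad.exe wscntfy.exe
wordpad.exe calc.exe explorer.exe smss.exe winlogon.exe System""".split()

# Counter subtraction keeps duplicates straight: three cmd.exe entries
# minus one cmd.exe entry leaves two, just like the "<" lines from diff.
only_in_psscan2 = Counter(psscan2_names) - Counter(pslist_names)
only_in_pslist = Counter(pslist_names) - Counter(psscan2_names)

for name in sorted(only_in_psscan2.elements()):
    print("<", name)
for name in sorted(only_in_pslist.elements()):
    print(">", name)
```

Counting duplicates matters here because several different processes share the same name, which is exactly the limitation that comes up next.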

The only problem here is that as we got further into the in-class exercises, I realized I really wanted all of the extra detail about each of the hidden processes from the psscan2 output. For example, the hex offset values end up being very useful, and I'd like to know exactly which two of the three cmd.exe processes are the hidden ones. Let me show you the command line I came up with and then explain it to you:

$ join -v 1 -1 1 -2 2 \
    <(volatility psscan2 -f memory.img | tail -n +4 | sort -n -k 1,1) \
    <(volatility pslist -f memory.img | tail -n +2 | sort -n -k2,2)
272 1820 Thu Feb 04 00:00:55 2010 0x02293b08 0x082c0300 wordpad.exe 
628 540 Thu Jan 28 16:11:46 2010 0x025314e8 0x082c0060 winlogon.exe 
672 628 Thu Jan 28 16:11:47 2010 0x02448cf8 0x082c0080 services.exe 
932 672 Thu Jan 28 16:11:47 2010 0x01ea3558 0x082c0100 svchost.exe 
join: file 1 is not in sorted order
join: file 2 is not in sorted order
1012 672 Thu Jan 28 16:12:02 2010 0x023a78b0 0x082c0260 alg.exe 
1172 1592 Tue Feb 02 02:40:48 2010 0x0204c850 0x082c01c0 cmd.exe 
1648 1592 Thu Jan 28 16:11:50 2010 0x023ae980 0x082c0200 VMwareTray.exe 
1664 1592 Thu Jan 28 16:11:50 2010 0x021feb88 0x082c0240 msmsgs.exe 
1744 848 Thu Feb 04 00:02:53 2010 Thu Feb 04 00:04:23 2010 0x01eaea88 0x082c0380 wmiprvse.exe 
1748 1592 Thu Feb 04 00:02:10 2010 Thu Feb 04 00:06:19 2010 0x0240b9a0 0x082c03a0 cmd.exe

In this case I'm using join rather than diff because the output of the two commands is so differently formatted. Essentially I'm doing a join on the PID columns of the psscan2 ("-1 1") and pslist ("-2 2") output and telling join to output the non-matching lines from psscan2 ("-v 1"). The tricky bit is that each command output needs to be sorted by its PID column for join to work. So if you look in the "<(...)" clauses, you'll see that the final element of the pipeline in each case is a numeric sort on the PID column. Easy, right?

The only fly in the ointment is the "not in sorted order" error messages from join. The problem is that join only understands alphabetic sorting. So when we go from 9xx PIDs to 1xxx PIDs, join thinks the file has gone all unsorted. There's no "-n" option to join like there is for sort, but in some versions of join we can use the "--nocheck-order" option to suppress the error messages:

$ join -v 1 -1 1 -2 2 --nocheck-order \
    <(volatility psscan2 -f memory.img | tail -n +4 | sort -n -k 1,1) \
    <(volatility pslist -f memory.img | tail -n +2 | sort -n -k2,2)
272 1820 Thu Feb 04 00:00:55 2010 0x02293b08 0x082c0300 wordpad.exe 
628 540 Thu Jan 28 16:11:46 2010 0x025314e8 0x082c0060 winlogon.exe 
672 628 Thu Jan 28 16:11:47 2010 0x02448cf8 0x082c0080 services.exe 
932 672 Thu Jan 28 16:11:47 2010 0x01ea3558 0x082c0100 svchost.exe 
1012 672 Thu Jan 28 16:12:02 2010 0x023a78b0 0x082c0260 alg.exe 
1172 1592 Tue Feb 02 02:40:48 2010 0x0204c850 0x082c01c0 cmd.exe 
1648 1592 Thu Jan 28 16:11:50 2010 0x023ae980 0x082c0200 VMwareTray.exe 
1664 1592 Thu Jan 28 16:11:50 2010 0x021feb88 0x082c0240 msmsgs.exe 
1744 848 Thu Feb 04 00:02:53 2010 Thu Feb 04 00:04:23 2010 0x01eaea88 0x082c0380 wmiprvse.exe 
1748 1592 Thu Feb 04 00:02:10 2010 Thu Feb 04 00:06:19 2010 0x0240b9a0 0x082c03a0 cmd.exe

The other alternative is obviously to sort the PID columns alphabetically, but that offends my sensibilities somehow.
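To see why join grumbled in the first place, it helps to put the two orderings side by side. This little Python sketch is not part of the pipeline above. It just sorts a handful of PIDs from the sample output both ways, on the assumption that plain string comparison is a fair stand-in for the ordering join expects:

```python
# PIDs taken from the sample output above, kept as strings the way join sees them.
pids = ["272", "628", "672", "932", "1012", "1172"]

numeric_order = sorted(pids, key=int)   # what "sort -n" produces
lexical_order = sorted(pids)            # character-by-character string order

print(numeric_order)
print(lexical_order)

# In the numeric ordering "932" is followed by "1012", but as strings
# "1012" sorts before "932" because the character "1" is less than "9".
print("1012" < "932")
```

That jump from "932" to "1012" is the point where join decides its input is no longer sorted.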

Mmmm, hmmm! That was some tasty fu! Hey Tim, volatility runs on Windows-- what can you do with the output? I double-dog-dare you to try it in CMD.EXE first...

Tim skipped school:

Do cmd.exe, dang Hal. Happy Freaking New Year to me, huh?

Here is what I came up with based on the assumption that pslist returns a subset of psscan2.

C:\> python.exe volatility pslist -f memory.img > pslist.txt
C:\> cmd /v:on /c "for /F "skip=2 tokens=1,5,10,15" %a in ('python.exe volatility psscan2 -f memory.img') do
  @(if not "%d"=="" (set name=%d) else (if not "%c"=="" (set name=%c) else (set name=%b))) &
  set pid=%a & (type pslist.txt | findstr /B /R /C:"!name! *!pid! " > NUL || echo !name! !pid!)"

svchost.exe 932
wmiprvse.exe 1744
cmd.exe 1172
msmsgs.exe 1664
wordpad.exe 272
alg.exe 1012
VMwareTray.exe 1648
cmd.exe 1748
services.exe 672
winlogon.exe 628

I split this command into two for the sake of readability; however, it could be easily combined into a one-liner. But I'll leave that simple experiment to you. The first line takes the output of pslist and dumps the contents into a file. This file will be read numerous times so it is significantly faster to just read the file in the second "half" of our command. Now, regarding that second half...

We start off by invoking our shell with /v:on to enable delayed variable expansion and /c to cause our spawned shell to exit upon completion. Inside the shell we use our trusty For loop. The "skip=2" option skips the two header lines, and For /F ignores the blank line that follows them on its own, so all three lines of header material are out of the way. The For loop then splits the line based on white space. We are trying to get the name of the process, and due to spacing, it may be in the 5th, 10th, or 15th token. Yes, it is that confusing. Here is a little diagram of what I mean:

PID    PPID   Time created             Time exited              Offset     PDB        Remarks
------ ------ ------------------------ ------------------------ ---------- ---------- ----------------

Token1      2   3   4  5        6    7                                   8          9     10
   932    672 Thu Jan 28 16:11:47 2010                          0x01ea3558 0x082c0100 svchost.exe

Token1      2   3   4  5        6    7   8   9 10       11   12         13         14     15
  1744    848 Thu Feb 04 00:02:53 2010 Thu Feb 04 00:04:23 2010 0x01eaea88 0x082c0380 wmiprvse.exe

Token1      2                                                            3          4      5
     4      0                                                   0x025c8830 0x00319000 System

Our for loop will give us 4 variables a, b, c, and d which represent the 1st, 5th, 10th, and 15th token. We have to use a little trick to figure out which of the three variables contains the process name by checking each variable from right to left. If %d is not empty, then it contains the process name so we set Name equal to %d. If %d is empty we try %c, and if %c is empty we use %b. For the sake of nice variable names we set !pid! equal to %a. We then have the variable !pid!, which contains the process id, and !name!, which contains the process name.
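If the token juggling is hard to picture, this short Python sketch (my own illustration, using the three sample lines from the diagram above) splits each line on runs of whitespace, much as For /F does with its default delimiters, and shows where the process name lands:

```python
# Three sample psscan2 lines from the diagram above.
lines = [
    "   932    672 Thu Jan 28 16:11:47 2010                          0x01ea3558 0x082c0100 svchost.exe",
    "  1744    848 Thu Feb 04 00:02:53 2010 Thu Feb 04 00:04:23 2010 0x01eaea88 0x082c0380 wmiprvse.exe",
    "     4      0                                                   0x025c8830 0x00319000 System",
]

results = []
for line in lines:
    tokens = line.split()      # runs of spaces collapse into a single delimiter
    # The PID is always the first token; the name is always the last one,
    # but "last" is a different position depending on which dates are present.
    results.append((len(tokens), tokens[0], tokens[-1]))

for count, pid, name in results:
    print(count, pid, name)
```

The token counts come out as 10, 15, and 5, which is why the cmd.exe version has to probe %d, %c, and %b in turn.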

We then search the pslist.txt file to see if the current process, represented by !name! and !pid!, is in the file. We output the file, using the Type command, and use FindStr to search for the matching name and process id. The /B switch says our search string must be at the beginning of the line, the /R enables regular expression searches. The default FindStr setting is to treat a space in our search string as a logical OR, but the /C switch "uses [the] specified string as a literal search string," meaning it doesn't treat a space as a logical OR. In short, it looks for the process name at the beginning of the line, followed by some number of spaces, then the process id, and then another space.

We then use the logical OR (||) in conjunction with the FindStr command to determine whether FindStr found something or not. This trick has been used repeatedly, but most recently in episode 122. If FindStr doesn't find anything we then output the process name and PID. This effectively gives us a list of processes that are found with psscan2 but not pslist.

Now for a more robust solution using...

PowerShell

I'm going to deviate into script land here, only because this mini-script may be very useful for manipulating the output of these commands. It will take the output and objectify it.

Objectifying pslist:

PS C:\> $null, $pslist = python volatility pslist -f memory.img
PS C:\> [regex]$regex = '(?<Name>\S+)\s+(?<PID>[0-9]+)\s+(?<PPID>[0-9]+)\s+(?<Threads>[0-9]+)\s+(?<Handles>[0-9]+)\s+(?<Time>.*)'
PS C:\> $pslistobjects = foreach ($p in $pslist) {
...        $psobj = "" | Select-Object Name, PID, PPID, Threads, Handles, Time
...        $p -match $regex | Out-Null
...        $psobj.Name = $matches.Name
...        $psobj.PID = $matches.PID
...        $psobj.PPID = $matches.PPID
...        $psobj.Threads = $matches.Threads
...        $psobj.Handles = $matches.Handles
...        $psobj.Time = [datetime]::ParseExact($matches.Time.Trim(), "ddd MMM dd HH:mm:ss yyyy", $null)
...        $psobj
...     }

PS C:\> $pslistobjects | Format-Table
Name            PID  PPID Threads Handles Time
----            ---  ---- ------- ------- ----
System          4    0    55      260     1/1/1970 12:00:00 AM
smss.exe        540  4    3       21      1/28/2010 4:11:40 PM
csrss.exe       604  540  12      363     1/28/2010 4:11:46 PM
...

This takes the output from pslist and converts it to PowerShell objects. Let's look at each line, one at a time.

PS C:\> $null, $pslist = python volatility pslist -f memory.img

Here we get the output from pslist, send the first line to null, and the remainder is put into the variable pslist. This effectively skips the first line (header).

PS C:\> [regex]$regex = '(?<Name>\S+)\s+(?<PID>[0-9]+)\s+(?<PPID>[0-9]+)\s+(?<Threads>[0-9]+)\s+(?<Handles>[0-9]+)\s+(?<Time>.*)'

The next chunk sets up our Regular Expression with named groupings.

PS C:\> $pslistobjects = foreach ($p in $pslist) {
...        $psobj = "" | Select-Object Name, PID, PPID, Threads, Handles, Time
...        $p -match $regex | Out-Null
...        $psobj.Name = $matches.Name
...        $psobj.PID = $matches.PID
...        $psobj.PPID = $matches.PPID
...        $psobj.Threads = $matches.Threads
...        $psobj.Handles = $matches.Handles
...        $psobj.Time = [datetime]::ParseExact($matches.Time.Trim(), "ddd MMM dd HH:mm:ss yyyy", $null)
...        $psobj
...     }

Inside the foreach loop is where the heavy lifting is done. First, an empty object is created. Then the Match operator is used to match the string using the regular expression and automatically populate the $matches variable. We then set each property of our object. The Time property is a bit special since the time format used by pslist isn't one of the formats that PowerShell/Windows natively understands. The variable $pslistobjects then contains PowerShell'ed objects from volatility's pslist. We can then sort, filter, or perform all sorts of tricks once it has been PowerShellized.
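For comparison, here is roughly the same parse sketched in Python, which is what volatility itself is written in. The function name parse_pslist_line and the dictionary layout are made up for this example. The regular expression is the one above with Python's spelling of named groups, and the weekday and month names are assumed to be English, as in the sample output:

```python
import re
from datetime import datetime

# Python spells named groups (?P<name>...) rather than (?<name>...).
PSLIST_RE = re.compile(
    r"(?P<Name>\S+)\s+(?P<PID>[0-9]+)\s+(?P<PPID>[0-9]+)\s+"
    r"(?P<Threads>[0-9]+)\s+(?P<Handles>[0-9]+)\s+(?P<Time>.*)"
)

def parse_pslist_line(line):
    """Return a dict for one pslist data line, or None if it doesn't match."""
    m = PSLIST_RE.match(line)
    if m is None:
        return None            # header lines and blank lines end up here
    row = m.groupdict()
    for field in ("PID", "PPID", "Threads", "Handles"):
        row[field] = int(row[field])
    # Same layout as the ParseExact format string: "Thu Jan 28 16:11:40 2010".
    row["Time"] = datetime.strptime(row["Time"].strip(), "%a %b %d %H:%M:%S %Y")
    return row

sample = "smss.exe             540    4      3      21     Thu Jan 28 16:11:40 2010  "
print(parse_pslist_line(sample))
```

Returning None for non-matching lines means the header doesn't need special handling, at the cost of silently dropping anything the pattern doesn't understand.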

A similar mini-script will objectify the output from psscan2:

PS C:\> $null, $null, $null, $psscan2 = \python25\python.exe volatility psscan2 -f memory.img
PS C:\> [regex]$regex = '\s*?(?<PID>[0-9]+)\s+(?<PPID>[0-9]+)\s(?<Created>.{24})\s(?<Exited>.{24})\s(?<Offset>[0-9a-fx]{10})\s(?<PDB>[0-9a-fx]{10})\s(?<Name>.+)'
PS C:\> $psscan2objects = foreach ($p in $psscan2) {
...        $psobj = "" | Select-Object Name, PID, PPID, Created, Exited, Offset, PDB
...        $p -match $regex | Out-Null
...        $psobj.Name = $matches.Name
...        $psobj.PID = $matches.PID
...        $psobj.PPID = $matches.PPID
...        $psobj.Offset = $matches.Offset
...        $psobj.PDB = $matches.PDB
...        if ($matches.Created.Trim()) {
...            $psobj.Created = [datetime]::ParseExact($matches.Created, "ddd MMM dd HH:mm:ss yyyy", $null)
...        }
...        if ($matches.Exited.Trim()) {
...            $psobj.Exited = [datetime]::ParseExact($matches.Exited, "ddd MMM dd HH:mm:ss yyyy", $null)
...        }
...        $psobj
...     }

PS C:\> $psscan2objects | ft

Name             PID  PPID Created              Exited               Offset     PDB
----             ---  ---- -------              ------               ------     ---
svchost.exe      932  672  1/28/2010 4:11:47 PM                      0x01ea3558 0x082c0100
wmiprvse.exe     1744 848  2/4/2010 12:02:53 AM 2/4/2010 12:04:23 AM 0x01eaea88 0x082c0380
svchost.exe      1132 672  1/28/2010 4:11:48 PM                      0x01eb4970 0x082c0160
mike022.exe      1956 672  2/2/2010 3:25:29 AM                       0x020155d8 0x082c02c0
...

If you are going to use these commands often I would highly suggest making these into script files. You could even pass the file name to these scripts and have them wrap the volatility commands.

Ok, so now we have two variables, each containing the output of the respective volatility command.

PS C:\> $pslistobjects | ft

Name            PID  PPID Threads Handles Time
----            ---  ---- ------- ------- ----
System          4    0    55      260     1/1/1970 12:00:00 AM
smss.exe        540  4    3       21      1/28/2010 4:11:40 PM
csrss.exe       604  540  12      363     1/28/2010 4:11:46 PM
lsass.exe       684  628  18      341     1/28/2010 4:11:47 PM
...

PS C:\> $psscan2objects | ft

Name             PID  PPID Created              Exited               Offset     PDB
----             ---  ---- -------              ------               ------     ---
svchost.exe      932  672  1/28/2010 4:11:47 PM                      0x01ea3558 0x082c0100
wmiprvse.exe     1744 848  2/4/2010 12:02:53 AM 2/4/2010 12:04:23 AM 0x01eaea88 0x082c0380
svchost.exe      1132 672  1/28/2010 4:11:48 PM                      0x01eb4970 0x082c0160
mike022.exe      1956 672  2/2/2010 3:25:29 AM                       0x020155d8 0x082c02c0
...

~~Finally~~ Now we can use the Compare-Object cmdlet to compare the two sets of processes.

PS C:\> Compare-Object $pslistobjects $psscan2objects -Property name,pid

name           pid  SideIndicator
----           ---  -------------
svchost.exe    932  =>
wmiprvse.exe   1744 =>
cmd.exe        1172 =>
msmsgs.exe     1664 =>
wordpad.exe    272  =>
alg.exe        1012 =>
VMwareTray.exe 1648 =>
cmd.exe        1748 =>
services.exe   672  =>
winlogon.exe   628  =>

The Property parameter is used to specify the properties to use for comparison. We can either use a single property or a comma separated list of property names.

From this output it is quickly apparent that there are 10 processes found by psscan2 that were not found by pslist.

Whew, that was a lot of work this week. I hope it gets me on Santa's Nice list...next year.

Davide is too cool for school

Davide Brini has once again punk'd me with this full-on awk attack:

awk 'FNR>1 && NR==FNR {a[$1,$2]; next} 
     FNR>3 && !(($NF,$1) in a)' \
        <(volatility pslist -f memory.img) \
        <(volatility psscan2 -f memory.img)

Obviously, Davide has a PhD in awk, so let me explain what's going on here. FNR is an internal awk variable that tracks the current "input record number"-- usually the line number-- of the current file. NR, on the other hand, tracks the total number of records (lines) seen so far across all files.

If you look at the first awk clause, the "FNR>1" is how Davide is skipping the first header line in the pslist output. The "NR==FNR" expression will only be true if we're processing the first input "file", i.e. the output of "volatility pslist ...". Once awk moves on to the second "file" (the psscan2 output), NR will keep on accumulating, but FNR will start over again at 1.

So the first clause is for handling the pslist output. If you look at what's happening in the curly braces, Davide is creating empty array entries indexed by process name ($1) and PID ($2). The "next" just tells awk to read and process the next line of input, skipping the second clause which applies to the psscan2 output.

So let's look at that second clause. We can only get here if "NR!=FNR", which means we're dealing with the psscan2 output from the second input "file". Here Davide is using "FNR>3" to skip the header lines. For all the other lines, "!(($NF,$1) in a)" is true if and only if there is no entry in the array "a" for this combination of process name ($NF) and PID ($1). If we don't find an entry then psscan2 is telling us about a process that's been hidden from pslist and we want to output the information about this process. Davide is relying on the implicit "{print}" behavior of awk to make this happen.
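If the NR/FNR bookkeeping still feels slippery, here's a hedged Python re-enactment of the same two clauses. It's my own sketch, not Davide's code: the counters nr and fnr stand in for awk's NR and FNR, the set seen plays the part of the array "a", and the input is a trimmed-down copy of the sample listings rather than live volatility output:

```python
# Trimmed-down sample data from the listings above (header lines included).
pslist_out = [
    "Name                 Pid    PPid   Thds   Hnds   Time",
    "csrss.exe            604    540    12     363    Thu Jan 28 16:11:46 2010",
    "cmd.exe              968    1592   1      32     Thu Feb 04 00:01:13 2010",
]
psscan2_out = [
    "PID    PPID   Time created             Time exited              Offset     PDB        Remarks",
    "------ ------ ------------------------ ------------------------ ---------- ---------- ----------------",
    "",
    "  1172   1592 Tue Feb 02 02:40:48 2010                          0x0204c850 0x082c01c0 cmd.exe",
    "   604    540 Thu Jan 28 16:11:46 2010                          0x0221f020 0x082c0040 csrss.exe",
    "   968   1592 Thu Feb 04 00:01:13 2010                          0x024707e8 0x082c0340 cmd.exe",
]

seen = set()      # plays the role of awk's array "a"
hidden = []       # psscan2 lines with no matching pslist entry
nr = 0            # like NR: records read across all inputs
for source in (pslist_out, psscan2_out):
    for fnr, line in enumerate(source, start=1):   # like FNR: restarts per input
        nr += 1
        fields = line.split()
        if fnr > 1 and nr == fnr:
            # First clause: still in the first input, past its header line.
            seen.add((fields[0], fields[1]))
            continue                               # awk's "next"
        if fnr > 3 and (fields[-1], fields[0]) not in seen:
            # Second clause: (name, PID) from psscan2 was never seen in pslist.
            hidden.append(line)

print("\n".join(hidden))
```

With this cut-down input, only the cmd.exe with PID 1172 survives the second clause, which matches one of the hidden processes in the full output.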

Davide points out that the output from the above command will not be sorted, but you can always pipe the results into sort if that's important to you:

awk 'FNR>1 && NR==FNR {a[$1,$2]; next} 
     FNR>3 && !(($NF,$1) in a)' \
        <(volatility pslist -f memory.img) \
        <(volatility psscan2 -f memory.img) | sort -n -k2,2

Nice job, Davide!

Michael has to stay late for passing notes

Wow, this Episode sure provoked a lot of interesting commentary. Michael Hale Ligh gave us a shout out from the volatility camp. He even wrote a small plugin for volatility, psdiff.py, that does the same thing as our command line kung fu:

# For http://volatility.googlecode.com/svn/branches/Volatility-1.4_rc1

import volatility.plugins.psscan as psscan 
import volatility.win32.tasks as tasks
import volatility.utils as utils

class PsDiff(psscan.PSScan):
    """Produce a process diff"""

    def calculate(self):
        addr_space = utils.load_as(self._config)

        # Build a dictionary of processes found by scanning. The keys are 
        # physical addresses and the values are the objects
        procs_scan = dict((p.obj_offset, p) for p in psscan.PSScan.calculate(self))

        # Build a dictionary of processes found by walking the linked list. 
        # The virtual addresses are converted to physical with vtop. 
        procs_list = dict((addr_space.vtop(p.obj_offset), p) for p in tasks.pslist(addr_space))
        
        # Create two sets of addresses so we can easily compute the difference 
        scan_addrs = set(procs_scan.keys())
        list_addrs = set(procs_list.keys())

        # Yield any objects that are found by psscan but not pslist 
        for addr in (scan_addrs - list_addrs):
            yield procs_scan[addr]

    def render_text(self, outfd, data):
        for p in data:
            outfd.write("{0:<8} {1:<16} {2}\n".format(p.UniqueProcessId, p.ImageFileName, p.ExitTime))

Michael's plugin uses "psscan" instead of "psscan2", so the output will be slightly different, but it shouldn't be that hard to switch things over to use "psscan2" instead if you prefer. Michael also provided a bit more explanation in his original email:

$ python volatility.py psdiff -f memory.dmp

Volatile Systems Volatility Framework 1.4_rc1
0 Idle 1970-01-01 00:00:00
940 cmd.exe 2008-11-26 07:45:49
660 services.exe 1970-01-01 00:00:00
808 taskmgr.exe 2008-11-26 07:45:40
924 svchost.exe 1970-01-01 00:00:00
592 csrss.exe 1970-01-01 00:00:00
992 alg.exe 1970-01-01 00:00:00
1016 svchost.exe 1970-01-01 00:00:00
828 svchost.exe 1970-01-01 00:00:00

The exit time of "1970-01-01 00:00:00" just means the field is empty (process is still active). I am doing the diff based on the address of EPROCESS objects; however, it's possible, though not very likely, that an address could get re-used...so for a more robust diff you may check other fields as well.

If you want to see other fields in the output, it's rather easy because the Volatility types are auto-generated from Microsoft's PDB symbol files. For example since Windows defines a structure like this:

typedef struct _EPROCESS {
...
char ImageFileName[16];
DWORD UniqueProcessId;
...
} EPROCESS, *PEPROCESS;

You can print those fields like p.ImageFileName and p.UniqueProcessId in the plugin.

Lastly, the csrpslist plugin discussed in Malware Analyst's Cookbook produces a diff using two alternate sources of process listings (the csrss.exe handle table and an internal linked list found in the memory of csrss.exe). There are many other sources as well...

Tuesday, December 28, 2010