Tuesday, September 28, 2010

Episode #114: Pod People

Tim pumps up the jam:

I enjoy listening to podcasts, and I listen to a lot of them. On Windows I use Juice to download my podcasts. It downloads all the files to a directory of my choosing and creates a subdirectory for each podcast.

Since I have so many mp3s to listen to, I speed them up and normalize the volume. To make these outside tools easier to use, I move all of the mp3s to one directory. This is where I break out some fu.

I first used cmd.exe for this task since PowerShell wasn't out yet.

C:\> for /f "tokens=*" %i in ('dir C:\Podcasts /b /s ^| find ".mp3" ^|
find /v "!ToFix"') do @move /Y "%i" "C:\Podcasts\!ToFix\"
We use our friend the For loop to get each file. The "tokens=*" option grabs each line as a single token, so our variable will contain the full path and won't be broken by spaces. Inside the For loop we get a directory listing with the recursive (/s) and bare format (/b) switches so we get full paths and no headers. The first Find command filters for lines containing ".mp3", and the second Find filters out any mp3 file that contains "!ToFix" in the path, since it is already in the correct location.

At this point the variable %i contains the full path to the file. We then use the Move command to move the file to our directory. I use the /Y option with the Move command to force overwriting of files with the same name. I only do this because Juice sometimes will download an old episode and I don't want to manually confirm each overwrite.

Now, how do we do the same thing in PowerShell?

PS C:\> Get-ChildItem -Recurse \podcasts -Include *.mp3 -Exclude !ToFix |
Move-Item -Destination C:\temp\podcasts\!ToFix -Force
Shortened by using aliases and shortened parameter names:

PS C:\> ls -r \podcasts -In *.mp3 -Ex "!ToFix" |
move -Dest C:\temp\podcasts\!ToFix -Force
We use Get-ChildItem with the recurse option to get all the objects in the podcast directory. The Include option filters for files that contain ".mp3" and the Exclude option removes the !ToFix directory. The results are piped into Move-Item where the destination directory is specified. The Force option is used to force overwriting.

Now I'm ready to nerd out with my podcasts. Hal is a nerd too, but I wonder if he listens to any podcasts.

Hal gets jiggy with it:

No I don't listen to podcasts. And you damn kids stay off my lawn!

Here's the first solution I rattled off for this week's challenge:

find podcasts -name \*.mp3 | grep -v /ToFix/ | xargs -I{} cp {} podcasts/ToFix
This one's pretty straightforward: find all files named *.mp3, filter out pathnames containing the "ToFix" target directory, and copy all other files into the target dir via "xargs ... cp".

We have to use "-I{}" here because the cp command will be terminated by the target directory path. The path names we're feeding in on the standard input are the middle arguments to the command, and we use the "{}" to place them appropriately in the command line. One bonus to using "-I" is that it forces xargs to treat each line of input as a single argument and ignore spaces in file names, so we don't need to worry about escaping or quoting our arguments or messing around with "find ... -print0".

But since we just covered "find ... -prune" in Episode 112, you may be wondering why I used "grep -v" above when I could just filter out the "ToFix" directory within the find command itself:

find podcasts \( -name \*.mp3 \) -o \( -name ToFix -prune -false \) | 
xargs -I{} cp {} podcasts/ToFix
Comparing the two commands, the question sort of answers itself, doesn't it? Personally, I think the version with grep is a whole lot clearer. It's certainly easier to type. So while prunes are good for grumpy old men like me, that doesn't mean I have to use them all the time.

Are you kids still on my lawn?

Hal follows up

Woo-wee! We sure got a lot of feedback after this Episode!

Davide Brini wrote in to point out that newer versions of find have a "-path" option that can be used to simplify things considerably:

find podcasts -name \*.mp3 ! -path podcasts/ToFix/\* | xargs -I{} cp {} podcasts/ToFix
Davide also points out that if you live in the Linux world or at least in a world where you have the GNU fileutils installed everywhere, you can do things like this:

find podcasts -name \*.mp3 ! -path podcasts/ToFix/\* -exec cp -t podcasts/ToFix {} +
First, note the use of "cp -t ...", which allows me to specify the target directory as an initial argument. That means the remaining arguments are just the files that need to be copied, so I don't need to muck around with the tedious "-I{}" construct in xargs.

But speaking of xargs, the GNU version of find has a "-exec ... +" construct that basically does the same thing as xargs does. Rather than executing the specified command once for each file found, the "-exec ... +" option will aggregate multiple found files to reduce the number of executions.
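You can watch the difference by using echo as a harmless stand-in for the real command (hypothetical episode files shown):

$ find podcasts -name \*.mp3 -exec echo {} \;   # one command per file
podcasts/show/ep1.mp3
podcasts/show/ep2.mp3
$ find podcasts -name \*.mp3 -exec echo {} +    # files batched into one command
podcasts/show/ep1.mp3 podcasts/show/ep2.mp3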

In a separate email, Ole Tange wrote in to point out that my solutions fail miserably if any of the files we're copying contain quotes in the file names. Ole suggests using either the "-print0" option to GNU find or GNU Parallel, which properly handles input lines containing quotes:

find podcasts -name \*.mp3 | grep -v /ToFix/ | parallel -X cp {} podcasts/ToFix
Davide's final solution using "-exec ... +" also properly deals with the "file names containing quotes" issue.

So the bottom line is that if you live in a GNU software environment, then these problems can all be solved quite easily. But I wondered how I'd solve all of these issues without the shiny GNU tools. Here's the most robust solution I could come up with:

find podcasts -name \*.mp3 | grep -v ^podcasts/ToFix/ |
while IFS= read -r file; do n=`basename "$file"`; cp "$file" "podcasts/ToFix/$n"; done

Not pretty, but it works.

Tuesday, September 21, 2010

Episode #113: Checking for Prints

Hal is in transit

Greetings from SANS Network Security in Las Vegas! For those of you who are in town for the conference, I'll be giving a Command Line Kung Fu talk on Wednesday at 8pm-- hope to see you there!

Right, so I needed to come up with something quick for this week because of my travel time crunch. And as I was prepping to head to Vegas, the perfect idea occurred to me as I typed the following command:

$ lp travel-calendar.txt
request id is HP-LaserJet-4050-89 (1 file(s))

For the three or four of you who still print documents from your Linux or Unix systems, let's talk about managing print jobs from the command line!

The lp command-- yes, that's short for "line printer", which tells you how old the Unix printing interface is-- is used to submit print jobs. With no additional arguments, the file(s) you're printing are just sent to the system's default printer. You can use "-d" to specify an alternate destination printer (the BSD-style lpr command spells this option "-P"). You can also set the environment variable $PRINTER to choose a different default for your sessions.
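For example, to aim a job at another queue, or change your default for one command via the environment (queue name borrowed from the lpstat output below):

$ lp -d LJ9000N travel-calendar.txt
$ PRINTER=LJ9000N lp travel-calendar.txt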

You can use lpstat to peek at the status of your print jobs:

$ lpstat
HP-LaserJet-4050-89 hal 1024 Sun 19 Sep 2010 08:27:02 AM PDT

But you can also use "lpstat -a" to get the current status of all of the available printers on your system:

$ lpstat -a
HP-LaserJet-4050 accepting requests since Sun 19 Sep 2010 08:27:05 AM PDT
LJ9000N accepting requests since Thu 09 Sep 2010 11:10:54 AM PDT

Finally, you can use the cancel command ("lprm" also works on most Unix systems) to stop your print jobs by job ID:

$ cancel HP-LaserJet-4050-89
$ lpstat

However, note that in the modern era of networked printers with their own local print servers, you'll have to be pretty quick with your cancel command to interrupt the print job before it gets spooled over the network to the remote printer. Once that happens, the cancel command you issue on your local machine probably won't stop the print job.

Similarly, there's also an lpmove command that allows you to switch a job from one printer to another. Again, you need to move the job before it gets spooled off your system. Still, lpmove can be useful when your jobs are stalled in a print queue on the local machine because the remote printer is down.
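For example, to shunt our earlier job over to the other queue (job ID and queue name taken from the examples above):

$ lpmove HP-LaserJet-4050-89 LJ9000N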

Let's see what Tim's spooling off on the Windows side of the house, shall we?

Tim's been taking his pretty little time:

I was initially going to go first on this since Hal was so busy. And I even wrote it way ahead of time, but I completely forgot to post it. I guess you could say I forgot to hit the print button.

Ok, so I can tell ahead of time that that attempt at a joke failed miserably, but work with me here, people; I'm trying to come up with a segue.

Soooo anyway, print jobs. In Windows we have to use our good ol' pal WMI. I'm sure no one is surprised. Of course, when it comes to cmd.exe, there only seem to be two choices: a For loop and wmic.

This week I'm going to combine the cmd.exe and PowerShell portions since they are very similar. Let's first start off by listing print jobs.

PS C:\> gwmi win32_printjob
Document JobId JobStatus Owner Priority Size Name
-------- ----- --------- ----- -------- ---- ----
Command Line Kung Fu 6 tim 1 331444 Canon MX350
Command Line Kung Fu 7 tim 1 331444 Canon MX350

C:\> wmic printjob get "Document,JobId,Name,Owner,Priority,Size"
Document JobId Name Owner Priority Size
Command Line Kung Fu 5 Canon MX350, 5 tim 1 228396
Command Line Kung Fu 6 Canon MX350, 5 tim 1 228396
We can select a specific job:

PS C:\> gwmi win32_printjob | ? { $_.JobId -eq 5 }
C:\> wmic printjob where JobId=5 get "Document,JobId,Name,Owner,Priority,Size"
And we can kill that job:
PS C:\> gwmi win32_printjob | ? { $_.JobId -eq 5 } | % { $_.Delete() }
C:\> wmic printjob where JobId=5 delete
Or kill all jobs:

PS C:\> gwmi win32_printjob | % { $_.Delete() }
C:\> wmic printjob delete
And that wraps up the print job portion, but I wanted to address one weird thing I discovered with WMI. For some reason, WMI doesn't tell PowerShell that it has a Delete method for print jobs.

PS C:\> gwmi win32_printjob | gm -MemberType Method

TypeName: System.Management.ManagementObject#root\cimv2\Win32_PrintJob

Name MemberType Definition
---- ---------- ----------
Pause Method System.Management.ManagementBaseObject Pause()
Resume Method System.Management.ManagementBaseObject Resume()
We use Get-Member (alias gm) to look at the members of an object, with the -MemberType parameter to specifically look for methods. But it doesn't show the Delete method, even though it works, as we saw above. Weird, right? I know you are surprised: weirdness in WMI, right? Of course, after working with WMI so much, I'm pretty sure the W in WMI stands for weirdness.

Tuesday, September 14, 2010

Episode #112: Sensitivity Training

Tim gets PO'ed:

Jason Jarvis sends a question via the Post Office:

I need to conduct a folder permissions audit on folders with specific names and then check to make sure that a specific group is explicitly denied.

I produced some PowerShell code to do that and was fairly happy:

PS C:\> Get-Childitem -path S: -recurse -include *classified*,*sensitive*,*restricted*
-exclude *notsensitive* | where { $_.Attributes -match "d" } | Get-Acl | where {
$_.AccessToString -notmatch "DOMAIN\\GROUP" } | select PSPath, AccessToString |
export-csv outputfilename.csv


This worked a treat until I realized that there are 140 remote locations where I don't have PowerShell installed.


It is a nice little bit of fu. The command starts off by getting a recursive directory listing of the S drive and looking for objects with classified, sensitive, or restricted in the name, but not containing the word notsensitive. The objects are filtered for directories by using Where-Object and checking the Attributes of the object. I usually filter for directories by checking the PSIsContainer property since it is faster, but it isn't a huge difference (~ 10%).
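If you want to measure the difference on your own tree, Measure-Command gives a quick (if rough) benchmark; here's a minimal sketch assuming the same S: drive (run each a few times to warm the cache):

PS C:\> Measure-Command { ls -r S: | ? { $_.PSIsContainer } }
PS C:\> Measure-Command { ls -r S: | ? { $_.Attributes -match "d" } }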

The objects are piped into Get-Acl and subsequently into another Where-Object filter, which only passes objects that do *not* have permissions defined for our group. Finally, the results are exported to a CSV.

The AccessToString property converts the ACL to a nice string which contains the principal, allow or deny, and the right (full, read, write, etc). It looks like this:

AccessToString: Domain\Group Deny FullControl
BUILTIN\Administrators Allow FullControl
NT AUTHORITY\SYSTEM Allow FullControl
BUILTIN\Users Allow ReadAndExecute, Synchronize
NT AUTHORITY\Authenticated Users Allow Modify, Synchronize


We can do a few things to shorten the original command by using aliases, but the biggest suggestion would be to change the filter on AccessToString. We want to make sure the group in question isn't just explicitly defined, but is also denied. Here is the shortened bit with the fix.

PS C:\> ls S: -r -include *classified*,*sensitive*,*restricted* -exclude *notsensitive* |
? { $_.PSIsContainer } | Get-Acl | ? { $_.AccessToString -notmatch "DOMAIN\\GROUP Deny" } |
select PSPath, AccessToString | export-csv outputfilename.csv


Now how do we do this in cmd.exe?

Let's build up to the final command, and start off by getting a list of all the directories with the matching names.

C:\> dir S: /a:d /s /b | findstr /i "classified$ sensitive$ restricted$" |
findstr /i /v "notsensitive$"


We use dir with a few switches to get directories on the S drive. The /a:d switch returns only directories, the /s switch enables recursion, and the /b switch gives us the full path with no header info. We then use Findstr to filter for directories ending with our keywords. The spaces in the search string are treated as logical ORs. Finally, we use a second Findstr with the /v switch to filter out anything ending in notsensitive. Edit: We also use the /i switch to make the searches case insensitive.

To parse it, we'll have to use our friend, the For loop.

C:\> for /F "tokens=*" %i in ('dir S: /a:d /s /b ^| findstr /i "classified$ sensitive$ restricted$" ^|
findstr /v /i "notsensitive$"') do @echo %i


S:\sensitive
S:\classified
...


Using the For loop we have the variable %i that contains our directory, but how do we check permissions? We use cacls, and to do our searching we have to know what the output looks like.

C:\> cacls S:\sensitive
S:\sensitive domain\group:(OI)(CI)N
BUILTIN\Administrators:(ID)F
BUILTIN\Administrators:(OI)(CI)(IO)(ID)F
NT AUTHORITY\SYSTEM:(ID)F
...


The N at the end of the line means No Access (or access denied). So we need to search for the group name and N at the end of the line. We can do it like this:

C:\> cacls S:\sensitive | findstr /i "domain\\group:.*)N"
S:\sensitive domain\group:(OI)(CI)N


The FindStr command uses a regular expression to perform our search. Now, let's put it all together:

C:\> for /F "tokens=*" %i in ('dir S: /a:d /s /b ^| findstr /i "classified$ sensitive$ restricted$" ^|
findstr /i /v "notsensitive$"') do @cacls "%i" | findstr /i
"domain\\group:.*)N domain\\group:.*DENY" > nul || @echo %i >> outputfilename.txt


Edit: The output of cacls seems to depend on the version of Windows. The regular expression used with findstr needs to catch both "domain\group:(OI)(CI)N" and "FN2131\user123:(OI)(CI)(DENY)".

We use a little trick with the Logical OR (||). If the first command is successful, then the second command won't be executed. Likewise, if the first command is not successful, then the second command executes. So, if there is no output from our cacls/find, meaning nothing was found, then it executes the next command. In our case, the second command writes the variable %i to a file. We are then left with a file that contains all of the directories.
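If the short-circuit logic is new to you, here's a tiny demonstration you can run anywhere:

C:\> echo hello | find "hello" > nul || echo no match
C:\> echo hello | find "nope" > nul || echo no match
no match

The first find succeeds, so the echo after || never runs; the second find fails, so it does.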

Let's see if Hal can wade his way through his sensitive and non-sensitive sides.

Hal gets wacky

Hey, man, I am one sensitive m------f-----!

There are probably lots of ways to solve this challenge in the Unix shell, but I decided to try and answer the question using only a single find command. The result looks quite complicated, but it's really not too bad:

find . -type d \
\( \( -group users -a -perm -0010 \) -o -perm -0001 -o \( -prune -false \) \) \
\( \( -name '*classified*' -o -name '*restricted*' -o -name '*sensitive*' \) \
-a ! -name '*nonsensitive*' \) \
-ls

If you take this command apart into pieces, you'll see that there are four major chunks:


  1. First there's the basic "find . -type d" bit that descends from the current directory looking for any sub-directories.


  2. Next we check the permissions on the directory. For the directory to be accessible, the directory must be owned by the group in question (I'm using group "users" in this example) AND the group execute bit must be set ("-group users -a -perm -0010"). The directory can also be accessed if the execute bit is set for "everybody" ("-perm -0001").

    If neither of these conditions is true, then there's no point in checking any of the directories beneath the inaccessible directory, because members of our group won't be able to reach any of the subdirectories below this point. So we use "-prune" to prevent find from going further. However, "-prune" always returns "true", which means find would actually output information about this directory. But we don't want that because we've already determined that this directory is not accessible. So the idiom "-prune -false" prunes the directory structure but returns false so that we don't output information about the top-level directory.


  3. The next complicated looking logical operation filters out the directory names. Per Jason's request, we're matching directory names that contain "classified", "restricted", or "sensitive" and then explicitly excluding directories with the word "nonsensitive".


  4. Finally we use the "-ls" operator to output a detailed listing of our matching directories. If you really need a CSV file, I'd recommend using GNU find so that you can use the "-printf" operator to dump the output in whatever format you require. But I'll leave that as an exercise for the reader.


It's interesting to note that the order of the two main logical blocks is important-- you'll get the wrong answer if you check names before permissions. Consider what would happen if we had a directory called "foo" that was mode 700 and owned by root that contained a subdirectory "sensitive" that was mode 755. In my example above, the find command hits the "foo" directory, determines that it is inaccessible, and immediately prunes the search at that point.

If we were to check names first, however, then the directory name "foo" would not match any of our keywords. So that clause would evaluate to "false" and the permissions on the directory "foo" would never be checked. We'd never get to the "-prune" statement! This means that find would happily descend into the "foo" directory and erroneously report that "foo/sensitive" was accessible, when in fact it's not because "foo" is closed.
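If you'd like to watch this happen, here's a minimal sketch of the test scenario (assuming you have root privileges to create the closed directory):

# Build a root-owned, mode-700 "foo" hiding a mode-755 "sensitive"
$ sudo mkdir -p foo/sensitive
$ sudo chmod 755 foo/sensitive
$ sudo chmod 700 foo
# Run the find command above: "foo" is pruned and "foo/sensitive" is
# (correctly) never reported. Swap the name and permission clauses and
# "foo/sensitive" erroneously appears.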

Thanks for your interesting problem this week, Jason! Keep those cards and letters coming!

Tuesday, September 7, 2010

Episode #111: What's in a Name Server?

Hal is feeling puckish

Loyal reader Matt Raspberry writes:

Recently I needed to ensure that all the configured dns servers of a server had reverse DNS setup correctly:

for i in `awk '/^nameserver/ {print $2}' /etc/resolv.conf`; do host $i; done;

Honestly that's probably the way I would have solved the same challenge. However, once I consulted my command-line muse, the following alternate solution occurred to me:

awk '/^nameserver/ {print $2}' /etc/resolv.conf | xargs -L 1 host

Normally the xargs command would take as many input lines as possible and stack them all up as arguments in a single host command. But host interprets a second argument as the DNS server to query, rather than another name to look up, so you won't get the output you're expecting.

But "xargs -L <n>" lets you specify a maximum number of input lines to use for each command invocation. In the example above, we're telling xargs to take one line at a time and call host on each individual input. So basically we're swapping out Matt's original for loop for an xargs command. This saves a little bit of typing, but it's in no way a real optimization.

Honestly, I picked this one out of the mailbag for two reasons. First, Matt did all the work for me this week (thanks Matt!), which is nice because I'm going on vacation. Second, it's going to be a lot more work for Tim. Let's watch the carnage, shall we? What fools these mortals be!

Tim brings the carnage:

This isn't pretty, but WMI rarely (read never) is. WMI is the only way to get information on the DNS servers in use. Ugh! But let's start off easy and get a list of the DNS servers in use.

PS C:\> gwmi Win32_NetworkAdapterConfiguration | ? { $_.DNSServerSearchOrder } |
select DNSServerSearchOrder


DNSServerSearchOrder
--------------------
{208.67.222.222, 208.67.220.220}
The Get-WmiObject cmdlet (alias gwmi) is used to get the configuration of all network adapters. The results are piped into the Where-Object cmdlet (alias ?) to filter for objects where the DNSServerSearchOrder property has a value. Select-Object (alias select) is used to return only the property we want.

One problem: the result is an array of IP addresses. But we can easily deal with that by using the ExpandProperty switch with the Select-Object cmdlet.

PS C:\> gwmi Win32_NetworkAdapterConfiguration  | ? { $_.DNSServerSearchOrder } |
select -expand DNSServerSearchOrder


208.67.222.222
208.67.220.220
The ExpandProperty switch will explode the property in question, so we get two strings instead of one array of strings. Doesn't this sound like the same thing? Not if you think of how the objects are going to be sent down the pipeline. We need individual strings to do our lookup, not an array of strings.
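If you want to convince yourself, count the objects each version sends down the pipeline (this assumes the single adapter and two DNS servers shown above):

PS C:\> (gwmi Win32_NetworkAdapterConfiguration | ? { $_.DNSServerSearchOrder } |
select DNSServerSearchOrder | measure).Count
1
PS C:\> (gwmi Win32_NetworkAdapterConfiguration | ? { $_.DNSServerSearchOrder } |
select -expand DNSServerSearchOrder | measure).Count
2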

Now that we have the IP addresses of our DNS servers, we can do the lookup.

PS C:\> gwmi Win32_NetworkAdapterConfiguration  | ? { $_.DNSServerSearchOrder } |
select -expand DNSServerSearchOrder | % { nslookup $_ }
So there you have it, carnage. But what about the carnage we can cause with old school cmd.exe? Time to go Freddy Krueger on this thing.

Let's get the DNS servers.

C:\> wmic nicconfig get dnsserversearchorder | find ","

{"208.67.222.222", "208.67.220.220"}
It returns a pseudo array of the DNS servers. That part is easy, but as you all know, parsing in Windows is not easy. When all you have is a hammer (read: a crappy For loop), the world starts to look like a nail.

Here is the most simplistic way to parse this:

C:\> for /f "tokens=1-3 delims={}, " %a in
('wmic nicconfig get dnsserversearchor ^| find ","')
do @nslookup %a & @nslookup %b 2>gt;nul & @nslookup %c 2>gt;nul
We use the handy-dandy For loop to split the output. The results are split using the delimiters left brace ({), right brace (}), comma, and space. We only take the first three tokens, since more than three DNS servers is pretty rare in Windows. We then output the results of each nslookup and send any errors to nul.

Without some really, really, really ugly cmd.exe fu that no one would actually use, we can't parse the list and guarantee that we get the full list. In our case, we don't know if there was a 4th DNS server configured. We could change the number of tokens and use all 26 variables (a-z), but what if there were a 27th server? It can be done. In fact, Ed has done some crazy fu like that, but we are talking Quentin Tarantino level carnage, and that is just ridiculous.

Tuesday, August 31, 2010

Episode #110: Insert Title Here

In this corner, Tim, the disputed PowerShell Heavyweight Champion of the World:

What is in a title? Titles are important and they can be really useful. An aptly named window can be useful for screen shots, managing multiple sessions, and for basic organization. The default PowerShell title of Administrator: C:\WINDOWS\system32\WindowsPowerShell\v1.0\powershell.exe can be really useless. So how do we rename the window to something more useful?

PS C:\> (Get-Host).UI.RawUI.WindowTitle = "Shell to Take over the World"
Boom! Now we have a window that is distinct from our other windows, and clearly displays its use for world domination. So how does the command work?

Get-Host returns an object that represents the host. By accessing the UI, RawUI, and WindowTitle property hierarchy we can modify the title. Not straight-forward, but rather simple. As for the actual plans to take over the world, those aren't simple, nor are they for sale.
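And since the title is just a property, you can read it back (or stash it away and restore it later) the same way:

PS C:\> (Get-Host).UI.RawUI.WindowTitle
Shell to Take over the World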

And in the red trunks, Hal is undisputedly heavy

Oh sure, namby-pamby Powershell can have an elegant API for changing the window title, but in Unix we prefer things a little more raw:

$ echo -e "\e]0;im in ur windoz, changin ur titlz\a"

Yep, that's the highly intuitive sequence "<Esc>-right bracket-zero-semicolon-<title>-<BEL>" to set your window title to "<title>". Since this obviously doesn't roll trippingly off the fingers, I generally create the following shell function in my .bashrc:

function ct { echo -e "\e]0;$*\a"; }

See the "$*" in the middle of all the line noise? That takes all of the "arguments" to the function (anything we type on the command line after the function name, "ct") and uses them as the title for your window. So now you can change titles in your windows by typing:

$ ct im in ur windoz, changin ur titlz

And that's a whole lot nicer.

Another cool place to use this escape sequence is in your shell prompt. Yes, I know we've done several Episodes at this point that deal with wacky shell prompt games, but this is a fun hack:

$ export PS1='\e]0;\u@\h: $PWD\a\h\$ '
elk$

Here we've embedded our title-setting escape sequence into the PS1 variable that sets up our command-line prompt. We're using it to set the window title to "\u@\h: $PWD", where \u is expanded by the shell as our username, \h is the unqualified hostname, and $PWD is our current working directory. So now we can easily look at the window titles and know exactly who and where we are. This is useful for picking out the one window we want if we have a stack of minimized terminal windows. Notice we also have "\h\$ " after the escape sequence which will actually set the visible shell prompt we see in our terminal window.

Whee!

Mike Cardosa slips something a little extra into his gloves, cmd.exe

Only a true Unix guy would use the word "elegant" to describe the PowerShell method for setting the window title. I do not think that word means what you think it means.

It turns out that the cmd.exe developers failed to achieve the level of obscurity that we've come to expect when dealing with this humble shell.

Ladies and gentlemen, introducing quite possibly the most intuitive command available in cmd.exe:

C:\> title h@xor at work
That's it! We've managed to change the window title using the simple 'title' command. Just feed it any string you'd like and you are good to go. Including environment variables is also startlingly simple:

C:\> title netcat listener on %computername%
Look at that. The window title - including environment variable - is set without using a single cryptic command line switch or for loop.

Fortunately for sys admins (and certain bloggers), this sort of simplicity is the exception rather than the rule, so the world still needs those with a mastery of obscure commands.

Tuesday, August 24, 2010

Episode #109: The $PATH Less Taken

Hal is in a reflective mood:

I was perusing SHELLdorado the other day and came across a tip from glong-at-openwave-dot-com for printing the elements of your $PATH with each directory on a separate line:

$ IFS=':'; for i in $PATH; do echo $i; done
/bin
/usr/bin
/usr/X11R6/bin
/usr/local/bin
/sbin
/usr/sbin
/usr/local/sbin
/usr/games
/home/hal/bin

It's an interesting example of using IFS to break up your input on something other than whitespace, but in this particular case there are obviously more terse ways to accomplish the same thing:

$ echo $PATH | sed 's/:/\n/g'
/bin
/usr/bin
/usr/X11R6/bin
/usr/local/bin
/sbin
/usr/sbin
/usr/local/sbin
/usr/games
/home/hal/bin

Or to be even more terse:

$ echo $PATH | tr : \\n
/bin
...

The important piece of advice here is that if you're just exchanging one character for another (or even one set of characters for another), then tr is probably the quickest way for you to accomplish your mission. sed is obviously a superset of this functionality, but you have to use a more complex operator to do the same thing.

I think this example also nicely brings home the fact that both sed and tr (as well as awk and many other Unix input-processing primitives) are implicit loops over their input. The for loop from the first example has been subsumed by the functionality of sed and tr, but you still pay the performance cost of reading the entire input. So if you have a multi-stage pipeline that has several sed, tr, and/or awk commands in it, you might try to look at ways to combine operations in order to reduce the number of times you have to read your input.
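A classic instance of this advice is folding a grep into the awk that follows it; the second form below reads the input once instead of twice (the same idiom Matt uses in Episode #111):

$ grep '^nameserver' /etc/resolv.conf | awk '{print $2}'   # two processes
$ awk '/^nameserver/ {print $2}' /etc/resolv.conf          # one process, same output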

Tim takes the high road:

In PowerShell we can do the same thing, and just as easily.

PS C:\> $env:path -replace ";","`n"
C:\WINDOWS\system32
C:\WINDOWS
C:\WINDOWS\System32\Wbem
C:\WINDOWS\system32\WindowsPowerShell\v1.0
Just as the registry and the filesystem have a provider, environment variables have their own too. The providers allow PowerShell to access the objects using a common set of cmdlets, like Get-ChildItem (alias gci, ls, and dir). Just as we can list the contents of a drive by typing gci c: we can list all the environment variables with gci env:.
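For instance, you can browse the environment with the same cmdlet you'd use on a filesystem drive (names and values will obviously vary from machine to machine):

PS C:\> gci env: | ? { $_.Name -like "P*" }

Name                           Value
----                           -----
Path                           C:\WINDOWS\system32;C:\WINDOWS;...
PATHEXT                        .COM;.EXE;.BAT;.CMD;...
PROCESSOR_ARCHITECTURE         x86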

To access a specific variable we use $env: followed by the variable name.

PS C:\> $env:path
C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\system32\WindowsPowerShell\v1.0
Once we get the variable's value, we just replace the semicolon with the new line character (`n) by using the replace operator.

Bonus! Guest CMD.EXE solution:

We've all been missing Ed's CMD.EXE wisdom, but loyal reader Vince has moved in before the body is even cold and sent us this little tidbit:

C:\> for %i in ("%PATH:;=" "%") do @echo %i
"C:\Program Files\Perl\site\bin"
"C:\Program Files\Perl\bin"
"C:\WINDOWS\system32"
"C:\WINDOWS"

Thanks, Vince!

Tuesday, August 17, 2010

Episode #108: Access List Listing

Hal's turn in the mailbag

Loyal reader Rick Miner sent us an interesting challenge recently. He's got several dozen Cisco IOS and PIX configuration files containing access-list rules. He'd like to have an easy way to audit the access-lists across all the files and see which rules are common to all files and where rules might be missing from some files.

Basically we're looking through the files for lines that start with "access-list", like this one:

access-list 1 deny   any log

However, access-lists can also contain comment lines (indicated by the keyword "remark"), and we don't care about these:

access-list 1 remark This is a comment

We also want to be careful to ignore any extra spaces that may have been introduced for readability. So the following two lines should be treated as the same:

access-list 1 deny   any log
access-list 1 deny any log

Rick sent us a sample access-list, which he'd sanitized with generic IP addresses, etc. I created a directory with a few slightly modified versions of his original sample-- giving me 5 different files to test with.

Now I love challenges like this, because they always allow me to construct some really fun pipelines. Here's my solution:

$ grep -h ^access-list rules0* | grep -v remark | sed 's/  */ /g' | 
sort | uniq -c | sort -nr

5 access-list 4 permit any
...
4 access-list 1 deny any log
...

First I use grep to pull the access-list lines out of my sample files (named rules01, rules02, ...). Normally when you run grep against multiple files it will prepend the file name to each matching line, but I don't want that because I plan on feeding the output to sort and uniq later. So I use the "-h" option with grep to suppress the file names.

Next we have a little cleanup action. I use a second grep command to strip out all the "remark" lines. The output then goes to sed to replace instances of multiple spaces with a single space. Note that the sed substitution is "s/<space><space>*/<space>/g", though it's a little difficult to read in this format.

Finally we have to process our output to answer Rick's question. We sort the lines and then use "uniq -c" to count the number of occurrences of each rule. The second sort gives us a descending numeric sort of the lines, using the number of instances of each rule as the sort criteria. Since I'm working with five sample files, rules like "access-list 4 permit any" must appear in each file (assuming no duplicate rules, which seems unlikely). On the other hand, "access-list 1 deny any log" appears to be missing from one file.

But which file is our rule missing from? One way to answer this question is to look for the files where the rule is present:

$ grep -l 'access-list 1 deny any log' rules0*

Wait a minute! What just happened here? We should have gotten four matching files! Well remember how we canonicalized the lines by converting multiple spaces to a single space? Let's try making our rule a bit more flexible:

$ grep -l 'access-list *1 *deny *any *log' rules0*
rules01
rules02
rules03
rules05

That's better! We use the stars to match any number of spaces and we find our rules. "grep -l" (that's an "el" not a "one") means just display the matching file names and not the matching lines so that we can easily see that "rules04" is the file missing the rule.

But what if you were Rick with dozens of files to sort through. It wouldn't necessarily be clear which files weren't included in the output from our grep command. It would be nicer if we could output the names of the files that were missing the rule, rather than listing the files that included the rule. Easier done than said:

$ sort <(ls) <(grep -l 'access-list *1 *deny *any *log' rules0*) | uniq -u
rules04

"<(...)" is an output substitution that allows you to insert the output of a command in a spot where you would normally expect to use a filename. Here I'm using sort to merge the output of ls, which gives me a list of all files in the directory, with our previous command for selecting the files that contain the rule we're interested in. "uniq -u" gives you the lines that only appear once in the output (the unique lines). Of course these are the files that appear in the ls output but which are not matched by our grep expression, and thus they're the files that don't contain the rule that we're looking for. And that's the answer we wanted.

You can do so much with sort and uniq on the Unix command line. They're some of my absolute favorite utilities. I've laid off the sort and uniq action because Windows CMD.EXE didn't have anything like them and it always made Ed grumpy when I pulled them out of my tool chest. But now that we've murdered Ed and buried him in a shallow grave out back (er, bid farewell to Ed), I get to bring out more of my favorites. Still, I fear this may be another one of those "character building" Episodes for Tim. Let's watch, shall we?

Tim's got plenty of character:

Alright Hal, you used to push Ed around, but you are going to have a slightly tougher time pushing me around. And not just because I wear sticky shoes.

This is going to be easy. Ready to watch, Hal?

PS C:\> ls rules* | ? { -not (Select-String "access-list *1 *deny *any *log" $_) }

Directory: C:\temp

Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 8/16/2010 9:44 PM 1409 rules05.txt
Files whose names begin with "rules" are piped into our filter. The Where-Object filter (alias ?) uses a logical Not in conjunction with Select-String to find files that do not contain our search string. The search string used is the same as Hal's.

Now to crank it up a few notches...

But what if we have a file containing our gold standard, and we want to compare it against all of our config files to find the ones that don't comply with our standard? Your wish is my command (unless your wish involves a water buffalo, a nine iron, and some peanut butter).

PS C:\> cat gold.txt | select-string -NotMatch '^ *$' | % { $_ -replace "\s+", "\s+" } | % {
$a = $_;
ls rules* | ? { -not (select-string $a $_) }
} | get-unique


Directory: C:\temp

Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 8/16/2010 9:44 PM 1409 rules02.txt
-a--- 8/16/2010 9:44 PM 1337 rules05.txt
In the command above, the first line of our command gets each non-blank line of our gold config and changes any runs of spaces into \s+ for use in our search string. The \s+ is the regular expression equivalent of "one or more whitespace characters". Now that we have generated our search strings, let's search each file like we did earlier. Finally, we use the Get-Unique cmdlet to remove duplicates.

Hal, you may have buried Ed, but you haven't killed me off...yet.