Tuesday, March 30, 2010

Episode #88: Massage Techniques

Ed's Getting in the Mood:

The GUI can be so enticing. Sitting there with a bunch of GUI windows on the screen can beguile even a shell geek into doing things the hard way without even thinking about it.

Case in point: A few days ago, I needed to send e-mail to about a hundred people. No, I'm not a spammer... I needed to thank some folks for taking my SANS class. SANS sent me a list of e-mail addresses in a format that was kinda ugly -- a space-delimited list of a hundred lines, each formatted as follows:
EmailAddress FirstName LastName SomeCrazyNumber
So, I simply needed to strip off the e-mail addresses from this list. I first highlighted the list and copied it. Then, without thinking, my reptile brain launched Excel and pasted the list into a new spreadsheet. I guess I was kind of expecting that the spreadsheet paste action would parse things into columns based on those spaces. But, that's dumb... why would it do that? My reptile brain then started moving the mouse up to the menu bar to figure out a way to break my one column into four, so I could just peel off the e-mail addresses of the first column.

Happily, before my mouse even clicked on the first menu, higher brain functionality kicked in. "Skodo," my brain said to itself, "You could spend 5 minutes farting around in Excel trying to do this, or you could do it in, like, 10 seconds in a shell." Suitably chastised by my own brain, I went to a cmd.exe on my Vista box and did the following:

C:\> notepad names.txt
Hit Enter (to say Yes, I want to create the file)

CTRL-V (to paste)

ALT-F4 (to close notepad)

Enter (to tell notepad to save the file)

C:\> (for /f %i in (names.txt) do @echo %i >> email.txt) & clip < email.txt & del email.txt names.txt

Then, with the list of e-mail addresses safely ensconced in my clipboard, I just pasted that list of e-mail addresses into my mail program. No fuss, no muss.

But, let's look at what's going on here in a little more detail, because we've introduced a new friend here, namely clip.exe. I start out by creating a file called names.txt using notepad. Yes, that's a GUI tool, but it's a quick way of taking your clipboard contents (which still had my list of addresses, names, and crazy numbers) and moving them into a file. Then, four simple keystrokes (which even the lizard brain knows) allow me to create the file, paste the data into it, close notepad, and save the file while notepad is closing. All pretty routine stuff, accomplished in about 2 seconds.

My remaining 8 seconds went into the FOR loop. I invoked a FOR /F loop to parse the names.txt file, with a single iterator variable (%i), using default delims (spaces and tabs). At each iteration through my loop, in my do clause, I turn off command display (@) and append (>>) the first column of the file (%i) to a file I call email.txt. That first column in names.txt contains e-mail addresses, making my email.txt file contain only the e-mail addresses. I surround that whole FOR loop in parens so that I can allow that command to finish running, letting me follow it up with more commands.

After the loop is done, I then invoke the clip command, which is included in Windows 2003, Vista, Win7, and 2008 Server. It's not in XP, although you can copy it there from another Windows box, and it'll run (consult your lawyer regarding the license implications of that maneuver). The clip command can take the contents of a file and put them into the Windows clipboard (via the syntax "clip < filename"). Or, you could take the output of any command and dump it into the clipboard with "command | clip".

The clip command really does come in handy. Unfortunately, it only deals with shuffling file contents or command output TO the clipboard. It does not allow you to take clipboard content and pipe it to another program as input or dump it into a file. For that capability, there are numerous free third-party tools, or the handy reptile-brain-memory technique using Notepad pasting described above.

Finally, with my email.txt file copied over to my clipboard, I then run the del command to remove my temporary email.txt and names.txt files, cleaning up after myself. Remember, del can delete multiple files at a time by simply listing them.

Bada-bing, bada-boom. I've now got the data that I want in the format I want it, loaded into my clipboard and ready to paste.

So, next time you need to massage some data, don't just blindly reach for a spreadsheet program. Instead, consider what your command-line friends can do for you!

Tim is feeling a little stiff:

Ah, the days of Vista. If Ed had just used Windows 7, he would have had the built-in capability to use PowerShell, version 2 even.

Per usual, we can do this multiple ways. I selected the option that didn't require the temporary input file. Instead of pasting the data into a text file, we can paste it right into our shell. Here is how it works:

PS C:\> @"
Hit Enter

Paste - Right click, Select Paste

>> "@ | % { $_.Split("`n") } | % { $_.Split(" ")[0] } | clip
Here is what it would look like in our command window.

PS C:\> @"
>> tim@yomammashouse.com Tim Medin 00001
>> ed@yomammashouse.com Ed Skoudis 31337
...
>> hal@yomammashouse.com Hal Pomeranz 80085
>> "@ | % { $_.Split("`n") } | % { $_.Split(" ")[0] } | clip
>> <hit enter a second time>
In PowerShell, the >> prompt means that PowerShell is expecting more input. In our case, the shell is waiting for us to finish our multiline string and multiline command. I typically omit this prompt in my write-ups so they are easier to read and less confusing. I don't want someone to think that >> needs to be typed. To finish our multiline command, hit enter twice. We now have all of the email addresses on the clipboard and we can paste them into our favorite email program.

Now let's see how this command works.

We first use a multiline string that holds all the email addresses, names, and weird numbers. This multiline string is actually called a here-string. A here-string begins with @" on a line by itself and ends with "@ on a line by itself. The text in between can contain line breaks, single quotes, double quotes, and blank spaces, and it doesn't require any delimiters. It is a pretty cool construct.

The here-string is piped into a ForEach-Object loop to break the string into multiple lines by splitting on the line break character (`n). The output is then piped into another ForEach-Object loop. In the second loop, we simultaneously split the line, using the space character as a delimiter, and output the 0th item (email address) of the newly created array. The results are piped into clip, which stores the results in the clipboard.

There's the rubdown in PowerShell, let's see what Hal's got oiled up for us.

Hal is very relaxed:

My goodness! Once again my Windows compatriots have worked themselves into a lather over something that is very simple to do in the Unix shell. Man, if I only got a free massage every time that happens, I'd be the most relaxed person on the planet!

Our first option would be to just send the email directly from the command-line:

$ cat email-msg.txt | mailx -s 'Thank you!' `awk '{print $1}' names.txt`

Assuming we already have our canned "Thank you" email in the file email-msg.txt, we just shove that into the standard input of the mailx command. The "-s" option allows us to specify a Subject: line, and then we just use awk to extract the list of recipients from the first column of our text file.
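
Before firing off a hundred emails, it may be worth eyeballing exactly what awk is going to hand to mailx. Here is a minimal sketch, assuming the four-column layout Ed described. The name first_column is a made-up helper, the NF test is an extra guard against blank lines that is not part of the one-liner above, and the sample lines are borrowed from Tim's example rather than real data:

```shell
# first_column is a hypothetical helper name, not a standard command.
# It prints the first whitespace-delimited field of each non-blank line.
# With no file arguments, awk reads standard input.
first_column() {
    awk 'NF { print $1 }' "$@"
}

# Sample input borrowed from the PowerShell example; the blank line
# is there to show that empty lines produce no output.
first_column <<'EOF'
tim@yomammashouse.com Tim Medin 00001

ed@yomammashouse.com Ed Skoudis 31337
hal@yomammashouse.com Hal Pomeranz 80085
EOF
```

Once the preview looks right, the same helper could be called as "first_column names.txt" inside the backticks in place of the bare awk command.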

But perhaps you'd prefer to use a GUI mail client to compose your "Thank you" email. Just as on Windows, you can copy command-line output to the standard X clipboards. There are actually several tools that do this, but I generally will use xsel:

$ awk '{print $1}' names.txt | xsel
$ awk '{print $1}' names.txt | xsel --clipboard

The first form copies the standard input to the "primary" selection, which you can then paste into your GUI program by clicking the middle mouse button. The second form copies the standard input to the clipboard, which is normally accessed by right-clicking and selecting "Paste" from the context-sensitive pop-up menu.

And while I feel almost bad about harshing on Ed and Tim's mellow, I feel I must disclose that, unlike Windows, the xsel command also lets you get the output from the clipboard into your commands as well. You can even append the output of multiple commands into the current selection:

$ echo Unix rules! | xsel
$ xsel -o
Unix rules!
$ echo Windows drools! | xsel -a
$ xsel -o
Unix rules!
Windows drools!

As you can see, the "-o" (output) option outputs the value of the current selection. The "-a" (append) option can be used to add text to the current selection.
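
Since xsel can both read and write the clipboard, the whole massage could in principle happen without a temporary file at all. The following is only a sketch under a few assumptions: addresses_csv is a hypothetical helper name, and the de-duplication and comma-joining are illustrative extras for mail clients that want a single comma-separated recipient line. The xsel round trip is left as a comment because it needs a running X session:

```shell
# addresses_csv is a hypothetical helper: it reads the raw list on
# standard input, keeps the first column of each non-blank line,
# removes duplicates, and joins the result with commas.
addresses_csv() {
    awk 'NF { print $1 }' | LC_ALL=C sort -u | paste -sd, -
}

# Stand-in for clipboard contents (note the duplicated line).
addresses_csv <<'EOF'
tim@yomammashouse.com Tim Medin 00001
ed@yomammashouse.com Ed Skoudis 31337
tim@yomammashouse.com Tim Medin 00001
EOF
# prints: ed@yomammashouse.com,tim@yomammashouse.com

# With the raw list already copied to the X clipboard, the round trip
# would look something like this:
#   xsel -o --clipboard | addresses_csv | xsel -i --clipboard
```

The "-i" (input) option is the explicit form of what xsel already does when it is fed from a pipe.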

I'll let Ed and Tim get back to their massages now. Poor guys, they obviously need all the relaxation they can get after having to work so hard.