Tuesday, December 29, 2009

Episode #75: Yule Be Wanting an Explanation Then

Hal returns to the scene of the crime

I opened last week's post saying there would be no "explanations or excuses", but apparently that wasn't good enough for some of you. So at the request of our loyal readers, we'd like to revisit last week's episode and explain some of the code. Especially that crazy cmd.exe stuff that Ed was throwing around.

Of course the bash code is completely straightforward:

$ ct=12; while read line; do
[ $ct == 1 ] && echo -n Plus || echo -n $ct;
echo " $line";
((ct--));
done <<EoLines
keyboards drumming
... lines removed ...
command line hist-or-y!
EoLines

First we're initializing the "ct" variable we'll be using to count down the 12 Days of Fu. Then we start a while loop to read lines from the standard input.

Within the body of the loop, we use the quick and dirty "[...] && ... || ..." idiom that I've used in previous Episodes as a shortcut instead of a full-blown "if ... then ... else ..." clause. Basically, if we've counted down to one then we want to output the word "Plus"; otherwise we just output the value of $ct. Notice we use "echo -n ..." so that we don't output a newline. This allows us to output the text we've read from the standard input as the remainder of the line. Finally, we decrement $ct and continue reading lines from stdin.

The interesting stuff happens after the loop is over. Yeah, I could have put the text into a file and read the file. But I was looking for a cool command-line way of entering the text. The "<<EoLines" syntax at the end of the loop starts what's called a "here document". Basically, I'm saying that the text I type in on the following lines should be associated with the standard input (of the while loop in this case). The input ends when the string "EoLines" appears by itself on a line. So all I have to do is type in the 12 lines of text for our 12 Days of Fu and then finish it off with "EoLines". After that we get our output. Neat!

Everybody clear? Cool. I now throw it over to Tim to get the lowdown on his PowerShell madness.

Tim goes back in time

Let's unwrap what we did last week.

PS C:\> $ct=12; "keyboards drumming
admins smiling
... lines removed ...
command line hist-or-y!".split("`n") |
% { if ($ct -eq 1) {"Plus $_"} else {"$ct $_"}; $ct-- }


Here is the break down:

First, we initialize the $ct variable to begin our count down.

ct=12;


That was easy. Next, we take a multi-line string and split it in to an array using the new line character (`n) as the delimiter.

"keyboards drumming
... lines removed ...
command line hist-or-y!".split("`n")


We could have just as easily read in content from a file using Get-Content, but I wanted to demonstrate some new Fu that was similar to Hal's Fu. The nice thing is that reading from a file would be an easy drop-in replacement.

Once we have an array of Fu Text, we pipe it into ForEach-Object so we can work with each line individually.

if ($ct -eq 1) {"Plus $_"} else {"$ct $_"}


Inside the ForEach-Object loop we use an IF statement to format the output. If the count is one, then we output "Plus" and the Fu Text, otherwise we output the value of $ct and the Fu Text.

Finally, we decrement the value of $ct.

$ct--


Simple right? That was pretty straightforward. But sit back, and grab some spiked Egg Nog before you proceed further.

Ed Surveys the Madcap Mayhem:
I really liked this episode because it required the use of a few techniques that we haven't yet highlighted in this blog before. Let's check them out, first reiterating that command:

c:\> copy con TempLines.txt > nul & cmd.exe /v:on /c "set ct=12& for /f
"delims=" %i in (TempLines.txt) do @(if not !ct!==1 (set prefix=!ct!) else (set prefix=
Plus)) & echo !prefix! %i & set /a ct=ct-1>nul" & del TempLines.txt
keyboards drumming
----snip----
command line hist-or-y!
^Z (i.e., hit CTRL-Z and then hit Enter)

OK... we start out with the copy command, which of course copies stuff from one file to another. But, we use a reserved name here for our source file: con. That'll pull information in from the console, line by line, dumping the results into a temporary file, very cleverly named TempLines.txt. Of course, there is the little matter of telling "copy con" when we're done entering input. While there are several ways to do that, the most efficient way to do so that has minimal impact on the contents of the file is to hit CTRL-Z and then Enter. Voila... we've got a file with our input. By the way, I throw away the output of "copy con" with a "> nul" because I didn't want the ugly "1 file(s) copied." message to spoil my output. By the way, it kinda stinks that that message is put on Standard Output, doesn't it? Where I come from, that is much more of a Standard Error thing. But, I'm sure we could get into a big philosophical debate about what should go on Std Out and what should go on Std Err. But, let's just cut the argument short and say that many Windows command line tools are all screwed up on this front, regardless of your philosophy. That's because there are no reasonable and consistent rules for directing stuff to Std Out versus Std Err in Windows command line output.

Anyway, back to the point. I then run "cmd.exe /v:on" so I can do delayed variable expansion. That'll let me use variables with values that can change as my command runs. Otherwise, cmd.exe will expand all variables to their values instantly when I hit Enter. I need to let these puppies float. I use the cmd.exe to execute a command, with the /c option, enclosing the command in double quotes.

So far, so good. But, now is where things get interesting.

To map to Hal and Tim's fu, I then set a variable called ct to a value of 12, just a simple little integer. But, what's with that & right after the 12? Well, if you leave it out, you'll see that the output will contain an extra space in "12 keyboards drumming"... it'll look like "12[space][space]keyboards drumming". Where does the extra space come from? Well, check this out:
c:\> cmd.exe /v:on /c "set var=foo & echo !var!bar"
foo bar

c:\> cmd.exe /v:on /c "set var=foo& echo !var!bar"
foobar

Do you see what's happening here? In the first command, the set command puts everything in the variable called var, starting right after the = and going up to the & (which separates the next command), and that includes the space! We have to omit that space by having [value] with the & right after it. For a more extreme example, consider this:
c:\> cmd.exe /v:on /c "set var=stuff      & echo !var!blah"
stuff blah

I typically like to include a space before and after my & command separators when wielding my fu, for ease of readability. However, sometimes that extra space has meaning, so it has to be taken out, especially when used with the set command to assign a string to a variable, like the ct variable in that big command above.

Wait a second... earlier I referred to ct as an integer, and now I'm calling it a string? What gives? Just hold on a second... I'll come back to that in just a bit.

We have to deal with our FOR loop next. I'm using a routine FOR /F loop to iterate through the contents of the TempLines.txt file. I'm specifying custom delimiters, though, to override the default space and tab delimiters that FOR /F loops like to use. With "delims=", I'm specifying a delimiter of... what exactly? The equals sign? No... that's actually part of the syntax of specifying a delimiter. To make an equals a delimiter, I'd have to use "delims==". So, what is the delimiter I'm setting here? Well, friends, it's nothing. Yeah. I'm turning off parsing, because I want the full line of my input to be set to my iterator variable. In the past, when I was but a young cmd.exe grasshopper, I would turn off such parsing by setting a delim of a crazy character I would never expect in my input file, such as a ^, with the syntax "delims=^". But, I now realize that the most effective way to use FOR /F loops is to simply let them use you. I turn off parsing by making a custom delimiter of the empty set. Do not try and bend the spoon... That's impossible. Instead try to realize the truth.... that there is no spoon.

Anyway, so where was I? Oh yea, with my delimiterless lines now being assigned one by one to my iterator variable of %i, I'm off and running. In the DO clause of my FOR loop, I turn off the display of commands (@) and jump right into an IF statement, to check the value of my ct variable. I expand ct into its value with a !ct!, because I'm using delayed variable expansion. Without delayed expansion, variables are referred to a %var%. My IF statement checks to see if !ct! is NOT equal (==) to 1. If it's not, I set another variable called prefix to the value of ct.

Then, I get to my ELSE clause. Although I use ELSE a lot in my work, I have to say that this is the first time I've had to use it in one of our challenges on this blog. The important thing to remember with ELSE clauses in single commands (not in batch scripts) is that you have to put everything in parentheses. So, if !ct! is equal to 1, my ELSE clause kicks in and sets the prefix variable to the string "Plus". That way, later on, I can simply print out prefix, which will contain the ct number for most of the days, but the word "Plus" for the last day.

And, here we back to that string/integer thing I alluded to above. The cmd.exe shell uses loosely typed variables. No, this is not a reference to how hard you hit the keys on your keyboard when typing. Instead, like many interpreted languages, variable types (such as strings and integers) are not hard and fast. In cmd.exe, they are evaluated in real time based on context, and they can even change in a single command. My ct variable behaves like an integer, for the most part. I can add to it, subtract from it, and store its value in another variable. But, when I defined it originally with the set command, if I had used "set ct=12 &...", it would have been a string with the trailing space until I used it in some math, and then that space would disappear. Also, the prefix variable is given the value of ct most of the time, which is just an integer. But, when ct is one, I give the prefix variable a string of "Plus". I'm an old C-language programmer, so this loose type enforcement kinda weirds me out. Still, it's quite flexible.

Then, I echo the prefix (!prefix!) and the line of text (%i). I then subtract one from my ct with the "set /a ct=ct-1", throwing the output of the set command away (>nul). Note that I want to show the prefix and the text on the same line, so I use a single echo command to display both variables on the same line. Most cmd.exe command-line tools actually put their output on standard out with a Carriage Return Line Feed (CRLF) right afterward. Thus, two echo commands, one for each variable, would have broken the prefix and the file content on separate lines, a no-no when trying to reproduce exactly the output of Hal and Tim. When formulating commands that need to display a lot of different items on a single line, I often chunk them into variables and then echo them exactly as I need them on a single line with a single echo statement.

Now, there is one interesting counter-example to the general rule that cmd.exe command-line tools insert a CRLF at the end of their output: the "set /a" command. It does its math, and then displays the output without any extraneous return, as in:
c:\> set /a 8-500 & echo hey
-492hey

I used that little fact in this fun command to spew Matrix-like gibberish on the screen from Episode #58:

C:\> cmd.exe /v:on /c "for /L %i in (1,0,2) do @set /a !random!"

When I first was working on this 12-days challenge, I was thinking about using set /a to display !ct! and then the line from the file. It would all be on the same line because of that special "no-CRLF" property of "set /a". But, I ran into the little problem of the "Plus" for the last line of input, so I instead introduced the prefix variable and played on the loose typing. There are dozens of other ways to do this, but I tried to focus on one that, believe it or not, I thought made the most sense and was simplest.

Oh, and to close things out, I delete the TempLines.txt file. Can't litter our file system with crap, now can we?

So, as you can see, there were a bunch of ideas we haven't used in this blog so far that popped out of cmd.exe in this innocuous-seeming challenge, including empty-set delims, an ELSE clause, weak type enforcement, and variable building for a single line of output. That's a lot of holiday cheer, and it makes me happy.

With that said, all of us at the Command Line Kung Fu blog would like to wish our readers a Happy and Prosperous New Year!