Tuesday, August 23, 2011

Episode #157: I Ain't No Fortunate One

Hal to the rescue!

We were kicking around ideas for this week's Episode and Tim suggested a little command-line "Russian Roulette". The plan was to come up with some shell fu that would pick a random number between one and six. When the result came up one, you "lost" and the command would randomly delete a file in your home directory.

Holy carp, Tim! This is the kind of thing you do for fun? Why don't we do an Episode about seeing how many files you can delete from your OS before it stops working? It's not like our readers would come looking for us with torches and pitchforks or anything. Geez.

Now I'm as big a fan of rolling the dice as anybody, but let's try something a bit more gentle. What I'm going to do is pick random sayings out of the data files used by the "fortune" program. For those of you who've never looked at these files before, they're just text files with various pithy quotes delimited by "%" markers:

$ head -15 /usr/share/games/fortunes/linux

"How do you pronounce SunOS?" "Just like you hear it, with a big SOS"
-- dedicated to Roland Kaltefleiter
%
finlandia:~> apropos win
win: nothing appropriate.
%
C:\> WIN
Bad command or filename

C:\> LOSE
Loading Microsoft Windows ...
%
Linux ext2fs has been stable for a long time, now it's time to break it
-- Linuxkongreß '95 in Berlin
%

In order to pick one of these quotes randomly, I'm going to need to know how many there are in the file:

$ numfortunes=$(grep '^%$' /usr/share/games/fortunes/linux | wc -l)

$ echo $numfortunes
334

By the way, there's no off-by-one error here because there actually is a trailing "%" as the last line of the file.
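To see why the trailing "%" matters, here is a minimal sketch on a made-up three-fortune file. The temp file and its contents are assumptions for illustration, not the real fortune data, and I'm using "grep -c" as a shorthand for "grep ... | wc -l".

```shell
# Build a small sample file in the fortune format (hypothetical contents).
sample=$(mktemp)
printf '%s\n' 'first fortune' '%' 'second fortune' '-- somebody' '%' 'third fortune' '%' > "$sample"

# Count the lines that consist solely of "%". Because the file ends with
# a "%" line, there is exactly one delimiter per fortune.
numfortunes=$(grep -c '^%$' "$sample")
echo "$numfortunes"

rm -f "$sample"
```

If the file did not end with a "%" line, the count would come up one short of the number of fortunes.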

OK, now that we know the number of fortunes we can pick from, I can choose which numbered fortune I want with a little modular arithmetic:

$ echo $(( $RANDOM % $numfortunes + 1 ))

109
$ echo $(( $RANDOM % $numfortunes + 1 ))
128
$ echo $(( $RANDOM % $numfortunes + 1 ))
325

I've used $RANDOM a couple of times in past Episodes. It's simply a special shell variable that produces a random value between 0 and 32767. I'm just using arithmetic here to turn that into a value between 1 and $numfortunes.
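As a quick sanity check on that arithmetic, the sketch below draws a batch of values and tracks the smallest and largest. The count of 334 is taken from the example above; the loop size of 1000 is an arbitrary choice, and $RANDOM itself assumes bash.

```shell
numfortunes=334          # count from the linux fortune file above
min=$numfortunes
max=1
i=0
while [ "$i" -lt 1000 ]; do
    # "% numfortunes" yields 0..333, and the "+ 1" shifts that to 1..334
    pick=$(( RANDOM % numfortunes + 1 ))
    if [ "$pick" -lt "$min" ]; then min=$pick; fi
    if [ "$pick" -gt "$max" ]; then max=$pick; fi
    i=$(( i + 1 ))
done
echo "smallest=$min largest=$max"
```

Whatever values come up, the smallest can never drop below 1 and the largest can never exceed $numfortunes.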

But having selected the number of the fortune we want to output, how do we actually pull it out of the file and print it? Sounds like a job for awk:

$ awk "BEGIN { RS = \"%\" };
       NR == $(( $RANDOM % $numfortunes + 1 ))" /usr/share/games/fortunes/linux


#if _FP_W_TYPE_SIZE < 32
#error "Here's a nickel kid. Go buy yourself a real computer."
#endif
-- linux/arch/sparc64/double.h

In awk, the "BEGIN { ... }" block happens before the input file(s) get read or any of the other awk statements get executed. Here I'm setting the "record separator" (RS) variable to the percent sign. So rather than pulling the file apart line-by-line (awk's default RS value is newline), awk will treat each block of text between percent signs as an individual record.

Once that's happening, selecting the correct record is easy. We use our expression for picking a random fortune number and wait until awk has read that many records. The variable NR tracks the number of records seen, so when NR equals our random value we've reached the record we want to output. Since I don't have an action block after the conditional expression, "{ print }" is assumed and my fortune gets printed.
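Here is a small sketch of the record-separator trick on a made-up sample file, picking record 2 explicitly rather than randomly so the result is predictable. The sample contents are hypothetical, and I've used single quotes around the awk program since there is no shell arithmetic to interpolate.

```shell
# Hypothetical three-fortune file in the same "%"-delimited format.
sample=$(mktemp)
printf '%s\n' 'first fortune' '%' 'second fortune' '-- somebody' '%' 'third fortune' '%' > "$sample"

# With RS set to "%", each fortune is one record. A pattern with no
# action block prints the matching record, so NR == 2 prints fortune 2.
second=$(awk 'BEGIN { RS = "%" }; NR == 2' "$sample")
echo "$second"

rm -f "$sample"
```

The record keeps the newline that followed the previous "%", which is why the output starts with a blank line, just like the real examples.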

By the way, I'm sure that some of you are wondering why I'm using $RANDOM rather than the built-in rand() function in awk. Turns out that some versions of awk don't support rand(), so my method above is more portable. If your awk does support rand(), then the command would be:


$ awk "BEGIN { RS = \"%\"; srand(); sel = int(rand()*$numfortunes)+1 }; NR == sel" \
       /usr/share/games/fortunes/linux


panic("Foooooooood fight!");
-- In the kernel source aha1542.c, after detecting a bad segment list

Frankly, the need to call srand() to reseed the random number generator at the start of the program makes using the built-in rand() function a lot less attractive than just going with $RANDOM. By the way, our arithmetic is a little different here because rand() produces a floating point number between 0 and 1 (never reaching 1), so we multiply by $numfortunes and truncate with int() before adding one.
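The int(rand()*n)+1 arithmetic can be checked the same way as the $RANDOM version. This is only a sketch: the count of 334 comes from the example above, the loop size is arbitrary, and passing the count with "-v n=..." is my own variation on interpolating $numfortunes into the program text.

```shell
numfortunes=334
range=$(awk -v n="$numfortunes" 'BEGIN {
    srand()                          # seed from the clock, as in the command above
    min = n; max = 1
    for (i = 0; i < 1000; i++) {
        sel = int(rand() * n) + 1    # rand() is in [0,1), so sel is in 1..n
        if (sel < min) min = sel
        if (sel > max) max = sel
    }
    print min, max
}')
echo "$range"
```

The two numbers printed are the smallest and largest selections seen, and they always fall between 1 and $numfortunes.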

Meh. I like the $RANDOM version better.

So Tim, if you can stop deleting your own files for a second, let's see what you've got this week.

Tim steps into Namby-Pamby Land

We are 157 Episodes in and Hal (and Ed) still aren't up for manly commands; not willing to put it all on the line. Instead, we get fortune cookies. Alright, but once you guys grow some chest hair, let's throw down mano a mano computo a computo.

Let's start with cmd.exe. Similar to what Hal did we first need to figure out how many lines only contain a percent sign.

C:\> findstr /r "^%$" fortunes.txt | find /c "%"

431


We use FindStr with the /r switch to use a regular expression that looks for the beginning of the line (^), a percent sign, and the end of the line ($). Note that the file has to be saved with the Carriage Return Line Feed (CRLF) line endings that Windows is used to, and not just the Line Feed (LF) that text files normally use on Linux. The results are piped into Find with the /c switch to actually do the counting. But you may ask, "Why both commands?"

Unfortunately, we can't just use Find, since there is no mechanism to ensure the percent sign is on a line by itself. We also can't just use FindStr as it doesn't count. Now that we have the number, let's cram it into a variable as an integer.

C:\> set /a count="findstr /r "%$" fortunes.txt ^| find /c ^"%^""

Divide by zero error.


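I tried all sorts of syntax options, different quotes, and escaping (using ^) to fix this error, but no luck. The catch is that set /a never runs a command; it treats its argument as arithmetic, so "findstr /r" appears to be read as the unset variable findstr divided by the unset variable r, and unset variables count as zero. If you wrap the command in a For loop and use the loop to handle the command output, it works. Don't come to cmd.exe if you are looking for things to make sense.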

C:\> cmd.exe /v:on /c "for /F %i in ('findstr /r "^%$" fortunes.txt ^| find /c "%"') do @set /a x=%i"

334


This command uses delayed variable expansion (/v:on) so we can set a variable and use it right away. We then use a For loop that "loops" (only one loop) through the command output.

With a slight modification we can get a random fortune number.

C:\> cmd.exe /v:on /c "for /F %i in ('findstr /r "^%$" fortunes.txt ^| find /c "%"') do @set /a rnd=%random% % %i"

12
C:\> cmd.exe /v:on /c "for /F %i in ('findstr /r "^%$" fortunes.txt ^| find /c "%"') do @set /a rnd=%random% % %i"
169
C:\> cmd.exe /v:on /c "for /F %i in ('findstr /r "^%$" fortunes.txt ^| find /c "%"') do @set /a rnd=%random% % %i"
252
C:\> cmd.exe /v:on /c "for /F %i in ('findstr /r "^%$" fortunes.txt ^| find /c "%"') do @set /a rnd=%random% % %i"
42


We use the variable %RANDOM% and the modulus operator (%) to select a random number between 0 and 333 by using the method developed in Episode #49.

Now we need to find our relevant line(s) and display them. Of course, we will need another For loop to do this.

C:\> cmd.exe /v:on /c "(for /F %i in ('findstr /r "^%$" fortunes.txt ^| find /c "%"') do @set /a rnd=%random% % %i > NUL) & @set /a itemnum=0 > NUL & for /F "tokens=* delims=" %j in (fortunes.txt) do @(echo %j| findstr /r "^%$" > NUL && set /a itemnum=!itemnum!+1 > NUL || if !itemnum!==!rnd! echo %j)"

Be cheerful while you are alive.
-- Phathotep, 24th Century B.C.


Before our second For loop we initialize the itemnum counter, which will be used to keep track of the current fortune number. We use base 0 for counting as that is what the modulus output gives us.

The options used with the For loop set tokens and delims so we get the whole line (tokens=*) including leading spaces (delims=<nothing>). Next we use Echo and FindStr to check if the current line contains only a percent sign. If the command has output it is successful, and with our short circuit logical And (&&) we increment the itemnum counter. If the line is not a percent sign, then the logical Or (||) will execute our If statement.

If our itemnum counter matches the random number, then we output the current line. As the itemnum counter does not increment until the next time it sees a percent sign, it can output multiple lines of text.
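For readers who find the one-liner hard to follow, the same counter logic can be sketched in shell on a made-up sample file. This is a translation for illustration only, not the cmd.exe command itself; the file contents are hypothetical and rnd is fixed at 1 instead of being random so the result is predictable.

```shell
# Hypothetical three-fortune file in the same "%"-delimited format.
sample=$(mktemp)
printf '%s\n' 'first fortune' '%' 'second fortune' '-- somebody' '%' 'third fortune' '%' > "$sample"

rnd=1        # zero-based fortune number (fixed here rather than random)
itemnum=0    # counts the "%" delimiter lines seen so far

picked=$(
    while IFS= read -r line; do
        if [ "$line" = "%" ]; then
            itemnum=$(( itemnum + 1 ))    # delimiter: move on to the next fortune
        elif [ "$itemnum" -eq "$rnd" ]; then
            printf '%s\n' "$line"         # inside the chosen fortune: print the line
        fi
    done < "$sample"
)
echo "$picked"

rm -f "$sample"
```

Because the counter only changes on a "%" line, every line of the chosen fortune passes the equality test, which is how a multi-line fortune gets printed in full.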

To be honest, this command was a big pain. More than once I wished my `puter had been shot by that Russian bullet. At least the PowerShell version is much easier.

PowerShell

PowerShell is great with objects, so let's turn each fortune into an object.

PS C:\> $f = ((gc fortunes.txt) -join "`n") -split '^%$', 0, "multiline"


This command gives us an array of fortunes. We read in the file with Get-Content (alias gc). Get-Content returns an array of rows, which isn't what we want, so we recombine all the lines with a New Line character (`n) between each element. We then recut the string using the Split operator and some fancy options.

We give the split operator three parameters. The first is the regular expression to use in splitting. The second is the maximum number of substrings, where 0 means return everything. The third parameter enables the MultiLine option so that ^ and $ match at the start and end of each line rather than only at the start and end of the whole string.

Now we have a list of fortunes and we can count how many.

PS C:\> $f.length

335


Wait, 335? What is going on? Let's check the last fortune. Remember, we are working with base 0, so the last item is 334.

PS C:\> $f[334]

<nothing>


This happens because the file ends with a % delimiter, so the split leaves one more (empty) piece after the final fortune. As long as we know this we can work around it. Now to output a random fortune.

PS C:\> $f = ((gc fortunes.txt) -join "`n") -split '^%$', 0, "multiline"

PS C:\> $f[(Get-Random -Maximum ($f.length - 1))]

Questionable day.

Ask somebody something.

PS C:\> $f[(Get-Random -Maximum ($f.length - 1))]

Don't look back, the lemmings are gaining on you.

PS C:\> $f[(Get-Random -Maximum ($f.length - 1))]

You need no longer worry about the future. This time tomorrow you'll be dead.


This may be my last week as I was just informed that "You will be traveling and coming into a fortune." YIPEE! I'm off to Tahiti! (Hopefully)