Tuesday, August 24, 2010

Episode #109: The $PATH Less Taken

Hal is in a reflective mood:

I was perusing SHELLdorado the other day and came across a tip from glong-at-openwave-dot-com for printing the elements of your $PATH with each directory on a separate line:

$ IFS=':'; for i in $PATH; do echo $i; done
/bin
/usr/bin
/usr/X11R6/bin
/usr/local/bin
/sbin
/usr/sbin
/usr/local/sbin
/usr/games
/home/hal/bin
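
One caveat with the tip as written: the IFS=':' assignment sticks around in your current shell after the loop finishes. Here's a minimal sketch of the same loop with the IFS change confined to a subshell. The function name print_path_dirs and the sample_path value are my own illustrations, not part of the original tip.

```shell
# Hypothetical sample value so the output is predictable; pass "$PATH" in practice.
sample_path='/bin:/usr/bin:/usr/local/bin'

# The function body is wrapped in parentheses, so it runs in a subshell
# and the IFS change disappears when the function returns.
print_path_dirs() (
    IFS=':'
    # $1 is left unquoted on purpose so the shell splits it on IFS.
    for dir in $1; do
        printf '%s\n' "$dir"
    done
)

print_path_dirs "$sample_path"
```

After the function returns, IFS in the calling shell is whatever it was before.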

It's an interesting example of using IFS to break up your input on something other than whitespace, but in this particular case there are obviously more terse ways to accomplish the same thing:

$ echo $PATH | sed 's/:/\n/g'
/bin
/usr/bin
/usr/X11R6/bin
/usr/local/bin
/sbin
/usr/sbin
/usr/local/sbin
/usr/games
/home/hal/bin

Or to be even more terse:

$ echo $PATH | tr : \\n
/bin
...

The important piece of advice here is that if you're just exchanging one character for another (or even one set of characters for another), then tr is probably the quickest way for you to accomplish your mission. sed can do everything tr can and more, but you have to use a more complex expression to do the same thing, and the \n in the replacement above is only reliably treated as a newline by GNU sed.
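
To make the "one set of characters for another" idea concrete, here's a small sketch. The function names and the sample string are made up for illustration. The first function uses tr to turn both colons and semicolons into newlines; the second uses sed's y command, which is sed's closest equivalent to tr, and maps to commas instead so the example doesn't depend on how a given sed handles newline escapes.

```shell
# Hypothetical input that mixes two separators.
sample='/bin:/usr/bin;/sbin'

# tr maps each character in the first set to the character at the
# same position in the second set: ':' and ';' both become newlines.
split_on_separators() {
    tr ':;' '\n\n'
}

# sed's y command does the same kind of character-for-character mapping.
# Here ':' and ';' both become commas.
commas_with_sed() {
    sed 'y/:;/,,/'
}

printf '%s\n' "$sample" | split_on_separators
printf '%s\n' "$sample" | commas_with_sed
```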

I think this example also nicely brings home the fact that both sed and tr (as well as awk and many other Unix input-processing primitives) are implicit loops over their input. The for loop from the first example has been subsumed by the functionality of sed and tr, but you still pay the performance cost of reading the entire input. So if you have a multi-stage pipeline that has several sed, tr, and/or awk commands in it, you might try to look at ways to combine operations in order to reduce the number of times you have to read your input.
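
As a rough sketch of what combining operations can look like, here's a three-stage pipeline next to a single awk command that does the same job in one pass. The task (list only the directories ending in "bin", with a label), the function names, and the sample value are all hypothetical, chosen just to have something to combine.

```shell
# Hypothetical sample value; pass "$PATH" in practice.
sample_path='/bin:/usr/games:/usr/local/bin'

# Three stages, so the data is read three times:
# split on ':', keep entries ending in "bin", add a label.
multi_stage() {
    printf '%s\n' "$1" | tr ':' '\n' | grep 'bin$' | sed 's/^/dir: /'
}

# One stage: setting RS to ':' makes awk treat each PATH element as a
# record, so splitting, filtering, and labeling happen in a single pass.
# printf '%s' (no trailing newline) keeps a newline out of the last record.
single_stage() {
    printf '%s' "$1" | awk 'BEGIN { RS = ":" } /bin$/ { print "dir: " $0 }'
}

multi_stage "$sample_path"
single_stage "$sample_path"
```

For something as short as $PATH the difference is not measurable, of course. The payoff comes when the input is a large file.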

Tim takes the high road:

In PowerShell we can do the same thing, and just as easily.

PS C:\> $env:path -replace ";","`n"
C:\WINDOWS\system32
C:\WINDOWS
C:\WINDOWS\System32\Wbem
C:\WINDOWS\system32\WindowsPowerShell\v1.0

Just as the registry and the filesystem have a provider, environment variables have their own too. The providers allow PowerShell to access the objects using a common set of cmdlets, like Get-ChildItem (aliases gci, ls, and dir). Just as we can list the contents of a drive by typing gci c:, we can list all the environment variables with gci env:.

To access a specific variable we use $env: followed by the variable name.

PS C:\> $env:path
C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\system32\WindowsPowerShell\v1.0

Once we get the variable's value, we just replace each semicolon with the newline character (`n) by using the -replace operator.

Bonus! Guest CMD.EXE solution:

We've all been missing Ed's CMD.EXE wisdom, but loyal reader Vince has moved in before the body is even cold and sent us this little tidbit:

C:\> for %i in ("%PATH:;=" "%") do @echo %i
"C:\Program Files\Perl\site\bin"
"C:\Program Files\Perl\bin"
"C:\WINDOWS\system32"
"C:\WINDOWS"

Thanks, Vince!