Tuesday, October 3, 2017

Episode #181: Making Contact

Hal wanders back on stage
Whew! Sure is dusty in here!
Man, those were the days! It started with Ed jamming on Twitter and me heckling from the audience. Then Ed invited me up on stage (once we built the stage), and that was some pretty sweet kung fu. Then Tim joined the band, Ed left, and the miles, and the booze, and the groupies got to be too much.
But we still get fan mail! Here's one from superfan Josh Wright that came in just the other day:
I have a bunch of sub-directories, all of which have files of various names. I want to produce a list of directories that do not have a file starting with "ContactSheet-*.jpg".
I thought I would just use "find" with "-exec test":
find . -type d \! -exec test -e "{}/ContactSheet\*" \; -print
Unfortunately, "test" doesn't support globbing, so this always fails.
Here's a sample directory tree and files. I thought this might be an interesting CLKF topic.
$ ls
Adriana  Alessandra  Gisele  Heidi  Jelena  Kendall  Kim  Miranda
$ ls *
Adriana:
Adriana-1.jpg  Adriana-3.jpg  Adriana-5.jpg
Adriana-2.jpg  Adriana-4.jpg  Adriana-6.jpg

Alessandra:
Alessandra-1.jpg  Alessandra-4.jpg  ContactSheet-Alessandra.jpg
Alessandra-2.jpg  Alessandra-5.jpg
Alessandra-3.jpg  Alessandra-6.jpg

Gisele:
Gisele-1.jpg  Gisele-3.jpg  Gisele-5.jpg
Gisele-2.jpg  Gisele-4.jpg  Gisele-6.jpg

Heidi:
ContactSheet-Heidi.jpg  Heidi-2.jpg  Heidi-4.jpg  Heidi-6.jpg
Heidi-1.jpg             Heidi-3.jpg  Heidi-5.jpg

Jelena:
ContactSheet-Jelena.jpg  Jelena-2.jpg  Jelena-4.jpg  Jelena-6.jpg
Jelena-1.jpg             Jelena-3.jpg  Jelena-5.jpg

Kendall:
Kendall-1.jpg  Kendall-3.jpg  Kendall-5.jpg
Kendall-2.jpg  Kendall-4.jpg  Kendall-6.jpg

Kim:
ContactSheet-Kim.jpg  Kim-2.jpg  Kim-4.jpg  Kim-6.jpg
Kim-1.jpg             Kim-3.jpg  Kim-5.jpg

Miranda:
Miranda-1.jpg  Miranda-3.jpg  Miranda-5.jpg
Miranda-2.jpg  Miranda-4.jpg  Miranda-6.jpg
OK, Josh. I'm feeling you on this one. Maybe I can find some of the lost magic.
Because Josh started this riff with a "find" command, my brain went there first. But my solutions all ended up being variants on running two "find" commands: get a list of the directories with "ContactSheet" files and "subtract" that from the list of all directories. Here's one of those solutions:
$ sort <(find * -type d) <(find * -name ContactSheet-\* | xargs dirname) | uniq -u
Adriana
Gisele
Kendall
Miranda
The first "find" gets me all the directory names. The second "find" gets all of the "ContactSheet" files, and then that output gets turned into a list of directory names with "xargs dirname". Then I use the "<(...)" construct to feed both lists of directories into the "sort" command. "uniq -u" gives me a list of the directories that only appear once, which are exactly the directories that do not have a "ContactSheet" file in them.
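If you want to try the "subtract one list from another" trick without touching real photos, here is a minimal sketch on a scratch tree. The directory and file names are borrowed from Josh's sample, the temp directory is an assumption of the example, and I've swapped "<(...)" for a "{ ...; }" group so it also runs under plain sh. "xargs -n 1" is there only because some older "dirname" implementations accept a single argument.

```shell
# Build a small scratch tree (a cut-down version of the sample listing).
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir Adriana Heidi Kim Miranda
touch Adriana/Adriana-1.jpg Miranda/Miranda-1.jpg
touch Heidi/ContactSheet-Heidi.jpg Kim/ContactSheet-Kim.jpg

# List 1: every directory.
# List 2: the directory of every ContactSheet file.
# A directory that shows up in both lists is repeated after sorting,
# so "uniq -u" throws it away and keeps only the directories seen once.
missing=$(
  {
    find * -type d
    find * -name 'ContactSheet-*' | xargs -n 1 dirname
  } | sort | uniq -u
)
echo "$missing"
```

On this tree the sketch should print Adriana and Miranda, the two directories with no contact sheet.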
But I wasn't cool with running the two "find" commands, especially when we might have a big set of directories. And then it hit me. Just like our CLKF jam was better when I had my friends Ed and Tim rocking out with me, we can make this solution better by combining our selection criteria into a single "find" command:
$ find * \( -type d -o -name ContactSheet\* \) | sed 's/\/ContactSheet.*//' | uniq -u
Adriana
Gisele
Kendall
Miranda
By itself, the "find" command gives me output like this:
$ find * \( -type d -o -name ContactSheet\* \)
Adriana
Alessandra
Alessandra/ContactSheet-Alessandra.jpg
Gisele
Heidi
Heidi/ContactSheet-Heidi.jpg
Jelena
Jelena/ContactSheet-Jelena.jpg
Kendall
Kim
Kim/ContactSheet-Kim.jpg
Miranda
Then I use "sed" to strip off the file name, and I end up with a directory list where each directory that has a contact sheet shows up twice, on adjacent lines. That means I can just feed the results into "uniq -u" without sorting first, and everything is groovy!
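To see the two stages separately, here is a small sketch that feeds a few of the lines from the "find" output above through "sed" and then "uniq -u". The sample text is pasted into a variable (an assumption of the example), so no real directories are needed.

```shell
# A few lines exactly as "find" printed them above.
sample='Adriana
Alessandra
Alessandra/ContactSheet-Alessandra.jpg
Gisele
Heidi
Heidi/ContactSheet-Heidi.jpg'

# Stage 1: cut off "/ContactSheet..." so a file line collapses
# to the name of the directory that holds it.
stripped=$(printf '%s\n' "$sample" | sed 's/\/ContactSheet.*//')
printf '%s\n' "$stripped"

# Stage 2: "uniq -u" keeps only lines that are not repeated
# on a neighboring line.
result=$(printf '%s\n' "$stripped" | uniq -u)
printf '%s\n' "$result"
```

Note that "uniq" only compares neighboring lines. The trick leans on "find" printing each contact sheet right after its directory, so on a deeper tree with subdirectories it is probably safer to put a "sort" in front of "uniq -u".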
Cool, man. That was cool. Now if only my friends Ed and Tim were here, that would be something else.

A Wild-Eyed Man Appears on Stage for the First Time Since December 2013
Wh-wh-where am I?  Why am I covered with dust?  Who are you people?  What's going on?

This all looks so familiar.  It reminds me of... of... those halcyon days of the early Command Line Kung Fu blog.  A strapping young Tim Medin throwing down some amazing shell tricks.  Master Hal Pomeranz busting out beautiful bash fu.  Wow... those were the days.  Where the heck have I been?

Oh wait... what?  You want me to solve Josh's dilemma using cmd.exe?  What, am I a trained monkey who dances for you when you turn the crank on the music box?  Oh I can hear the sounds now.  That lovely music box... with its beautiful tunes that are so so so hypnotizing... so hypnotizing...

...and then Ed starts to dance...

I originally thought, "Uh-oh... a challenge posed by Josh Wright and then smashed by Hal is gonna be an absolute pain in the neck in cmd.exe, the Ruprecht of shells."  But then, much to my amazement, the answer all came together in about 3 minutes.  Here ya go, Big Josh.

c:\> for /D /R %i in (*) do @dir /b %i | find "ContactSheet" > nul || echo %i

The logic works thusly:

I've got a "for" loop, iterating over directories (/D) in a recursive fashion (/R) with an iterator variable of %i which will hold the directory names.  I do this for everything in the current working directory and down (in (*)... although you could put a directory path in there to start at a different directory).  That'll spit out each directory. At each iteration through the loop, I do the following:
  • Turn off the display of commands so it doesn't clutter the output (@)
  • Get a directory listing of the current directory indicated by the iterator variable, %i.  I want this directory listing in bare form, just the names without the Volume Name and Size cruft (/b).  That'll spew the contents of each directory on standard out.  Remember, I'm recursing using the /R in my for loop, so I don't need to use a /s in my dir command here.
  • I take the output of the dir command and pipe it through the find command to look for the string "ContactSheet".  I throw out the output of the find command (> nul), because the matching lines themselves are just cruft.  All I care about is whether find succeeded or not.
  • But, if the find command FAILS to see the string "ContactSheet" (||), I want to display the path of the directory where it failed, so I echo %i.
Voilamundo!  There you go!  The output looks like this:

c:\tmp\test>for /D /r %i in (*) do @dir /b %i | find "ContactSheet" > nul || echo %i
c:\tmp\test\ChrisFall
c:\tmp\test\dan
c:\tmp\test\Fred\Fred2
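
For anyone following along in a Unix shell rather than cmd.exe, the same "list it, search it, shout only on failure" logic can be sketched roughly as below. This is an illustration of the || idea, not a translation Ed wrote. The scratch tree is hypothetical, with "ls" standing in for "dir /b" and "grep -q" standing in for "find ... > nul".

```shell
# Hypothetical scratch tree: Fred has a contact sheet, dan and Fred2 do not.
tmp=$(mktemp -d)
mkdir -p "$tmp/dan" "$tmp/Fred/Fred2"
touch "$tmp/dan/photo.jpg" "$tmp/Fred/ContactSheet-Fred.jpg" "$tmp/Fred/Fred2/photo.jpg"

report=$(
  # Walk every directory below the starting point, like "for /D /R".
  find "$tmp"/* -type d | while IFS= read -r d; do
    # List the directory, search the listing quietly, and echo the
    # directory name only when the search FAILS (||).
    ls "$d" | grep -q "ContactSheet" || echo "$d"
  done
)
echo "$report"
```

Here the loop should report the dan and Fred/Fred2 directories and stay quiet about Fred, because only the listing for Fred contains the string.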

I'd like to thank Josh for the most engaging challenge!  I'll now go back into my hibernative state.... Zzzzzzzzz...

...and then Tim grabs the mic...

What a fantastic throwback, a reunion of sorts. Like the return of Guns n' Roses, but more Welcome to the Jungle than Paradise City, it gets worse here everyday. We got everything you want honey, we know the commands. We are the people that can find whatever you may need.

Uh, sorry... I've missed the spotlight here. Let's get back to the commands. Here is a working command in long form and shortened form:

PS C:\Photos> Get-ChildItem -Directory | Where-Object { -not (Get-ChildItem -Path $_ -Filter ContactSheet*) }

PS C:\Photos> ls -di | ? { -not (ls $_ -Filter ContactSheet*) }

Let's take it day piece by day piece. If you want it you're gonna bleed but it's the price to pay. Well, actually, it isn't that difficult so no bleeding will be involved.

The first portion simply gets a list of directories in the current directory.

Next, we have a Where-Object filter that decides which objects get passed down the pipeline. Inside the filter, we look in each directory handed to us by the pipeline ($_) for files whose names start with ContactSheet. The -not operator inverts the test, so only the directories with no such file make it through.

With this command you can have everything you want but you better not take it from me... the other guys did some excellent work. We've missed y'all and hopefully we will see you again. Now back into the cave to hibernate.