Tuesday, July 12, 2011

Episode #153: I'll Have What She's Having

Tim gets off easy

This week friend of the blog Jeff "I am the Hammer" Haemer writes in looking for a solution to this problem:

Given two directories with the same file and directory names, but with different file contents and perms/ownerships, copy the perms and ownerships from the "master" directory to the new dir while preserving the content that's different in the new directory (i.e., copy the ownerships and perms without overwriting any files).

Jeff, the *nix ninja, set me up for a super easy week. Upon receiving this email I quickly checked out robocopy's help.

C:\> robocopy /?

...
/COPY:copyflag[s] :: what to COPY for files (default is /COPY:DAT).
(copyflags : D=Data, A=Attributes, T=Timestamps).
(S=Security=NTFS ACLs, O=Owner info, U=aUditing info).
...


Bingo! Looks like we can use the S, O and A flags to copy the security, owner, and attribute information while leaving out D, the data. But let's double check.

To test it I created two directories, dir1 and dir2, that looked like this:

C:\> tree /f
Folder PATH listing
Volume serial number is DEAD-BEEF

C:.
|
+---dir1
|       aaaa.txt
|       bbbb.txt
|       cccc.txt
|
+---dir2
        aaaa.txt
        bbbb.txt


The content of each file is different. Also, I added permissions for "Anonymous Logon" to aaaa.txt and I completely changed the permissions of bbbb.txt. The file cccc.txt was left with the default permissions.

C:\dir1> cacls *
C:\dir1\aaaa.txt NT AUTHORITY\ANONYMOUS LOGON:R
BUILTIN\Administrators:(ID)F
NT AUTHORITY\SYSTEM:(ID)F
BUILTIN\Users:(ID)R
NT AUTHORITY\Authenticated Users:(ID)C

C:\dir1\bbbb.txt NT AUTHORITY\SYSTEM:F
FNSCORP\tm:F

C:\dir1\cccc.txt BUILTIN\Administrators:(ID)F
NT AUTHORITY\SYSTEM:(ID)F
BUILTIN\Users:(ID)R
NT AUTHORITY\Authenticated Users:(ID)C


All the files in dir2 look like this:

C:\dir2> cacls *
C:\dir2\????.txt NT AUTHORITY\ANONYMOUS LOGON:R
BUILTIN\Administrators:(ID)F
NT AUTHORITY\SYSTEM:(ID)F
BUILTIN\Users:(ID)R
NT AUTHORITY\Authenticated Users:(ID)C


Now we run our command.

PS C:\> robocopy c:\dir1 c:\dir2 /COPY:ASO


...and check the permissions:

C:\dir2> cacls *
C:\dir2\aaaa.txt NT AUTHORITY\ANONYMOUS LOGON:R
BUILTIN\Administrators:(ID)F
NT AUTHORITY\SYSTEM:(ID)F
BUILTIN\Users:(ID)R
NT AUTHORITY\Authenticated Users:(ID)C

C:\dir2\bbbb.txt NT AUTHORITY\SYSTEM:F
FNSCORP\tm:F


Permissions match, the file content hasn't changed, and the additional file wasn't copied. That was super easy. Even better, it works in PowerShell and CMD. Unfortunately, it doesn't work in *nix land, so Hal has got some work ahead of him. Hal, get to it.

Hal gets screwed

I'm considering revoking Jeff's "friend of the blog" status for setting me up for failure on this one. Unfortunately, rsync/tar/cpio/etc don't have an option like robocopy does for copying permissions and ownerships but not content. So we're left with cobbling together our own command line madness.

But it turns out that getting the file permissions in a usable format is a difficult thing to do in a portable fashion. It's no problem on Linux or BSD, where we have the stat command. Here's the Linux solution:

# cd /your/source/dir
# find * -print0 | xargs -0 stat -c '%a %u %g %n' |
    while read -r perms user group file; do
        chown "$user:$group" "/path/to/target/dir/$file"
        chmod "$perms" "/path/to/target/dir/$file"
    done

Essentially I'm using "find * -print0 | xargs -0 stat -c '%a %u %g %n'" as a fill-in for the "ls" command to get my file info in the form that I need it. For each file the stat format I'm specifying with "-c" will give me the permissions in octal, the numeric UID and GID, and the file name. From there it's just a matter of using this data appropriately inside the while loop to do the chown and chmod.

Notice that I'm being careful to use "-print0" and "xargs -0" so that we handle files with spaces properly. The "read" statement at the top of the while loop will schlurp up everything after the GID as the file name, so that takes care of the spaces in file names problem there. However, inside the loop we need to be careful with our quoting so things work out OK.
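If you want to check the quoting before changing anything, one option is a dry run that prints the commands instead of executing them. The sketch below assumes GNU stat (Linux); the function name preview_perms and the /tgt path in the usage line are made up for illustration and are not part of the original solution.

```shell
# Hypothetical dry-run helper (assumes GNU stat): print the chown/chmod
# commands the loop would run for each file under the source directory,
# without touching the target directory.
# The function body is a subshell, so the cd does not leak to the caller.
preview_perms() (
    src=$1
    dst=$2
    cd "$src" || exit 1
    find * -print0 | xargs -0 stat -c '%a %u %g %n' |
        while read -r perms user group file; do
            # "file" receives everything after the GID, spaces included
            printf 'chown %s:%s "%s"\n' "$user" "$group" "$dst/$file"
            printf 'chmod %s "%s"\n' "$perms" "$dst/$file"
        done
)
```

Running something like "preview_perms /your/source/dir /tgt" should show a file such as "my file.txt" arriving as a single quoted path. File names containing newlines would still confuse the line-oriented read, so treat this as a sanity check rather than a guarantee.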

The BSD version of our command is nearly identical except for the stat command. On BSD the correct stat command to plug into xargs is "stat -f '%Mp%Lp %u %g %N'". The permissions bits on BSD are returned with "%p", but unfortunately "%p" includes the file type as part of the octal sequence, so you get unhelpful output like "100644" for regular files, "40755" for directories, etc. The "%Mp%Lp" sequence means to output just the suid/sgid/sticky bits ("%Mp", the middle permissions bits) and the normal r/w/x info ("%Lp", the lower permissions bits). By the way, notice that the BSD stat command also uses "-f" for the format option and "%N" for the file name, while Linux uses "-c" and "%n" respectively.
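Since the two dialects differ only in the option letter and the format string, a rough way to cover both is to probe which stat is installed and pick the arguments accordingly. This is a sketch under that assumption; the variable names stat_opt and stat_fmt and the function name list_perms are invented here.

```shell
# Probe the installed stat: GNU stat accepts -c, while BSD stat is
# expected to reject it, in which case we fall back to -f.
if stat -c '%a' / >/dev/null 2>&1; then
    stat_opt=-c
    stat_fmt='%a %u %g %n'          # GNU coreutils (Linux)
else
    stat_opt=-f
    stat_fmt='%Mp%Lp %u %g %N'      # BSD
fi

# Hypothetical helper: list "perms uid gid name" for everything under
# the current directory, ready to pipe into the while loop.
list_perms() {
    find * -print0 | xargs -0 stat "$stat_opt" "$stat_fmt"
}
```

The BSD format may print a leading digit for the special bits (for example 0644 where Linux prints 644), which chmod accepts either way.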

But what about other Unix flavors that don't have a built-in stat command? It would be against the rules of the blog to even suggest you go download and install the GNU coreutils package. So the only thing I can think of doing is using "ls -ln" instead of stat to get the information about each file. The problem is parsing the permissions bits and converting them into octal notation to use with chmod. Remember that such a conversion routine would need to deal with things like "rwxrw-r--", "r-sr-s--x", and "rwxrwxrwt" (to say nothing of crazy corner cases like "S" and "l"). That's almost certainly going to turn into a script. In fact, you're probably better off just coding the whole thing in Perl or Python in the first place so you can just call stat() on the files directly.
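To give a feel for what that conversion routine involves, here is a minimal sketch in plain shell. The function name sym_to_octal is hypothetical, and it assumes that "l" in the group execute position means the same as "S" (setgid set, execute clear), which is how I understand the Solaris convention. It accepts either the bare nine characters or the full mode column from "ls -l", and it ignores any trailing ACL marker such as "+".

```shell
# Hypothetical helper: convert symbolic permissions such as "rwxrw-r--"
# or "-r-sr-s--x" into the four-digit octal form that chmod accepts.
# The function body is a subshell, so its variables stay private.
sym_to_octal() (
    mode=$1
    case ${#mode} in
        9) perms=$mode ;;          # bare permissions string
        *) perms=${mode#?} ;;      # drop the leading file-type character
    esac
    special=0
    result=
    # Triplets arrive in user, group, other order; their special bits
    # are worth 4 (setuid), 2 (setgid), and 1 (sticky) respectively.
    for weight in 4 2 1; do
        triplet=$(printf '%s' "$perms" | cut -c1-3)
        perms=${perms#???}
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac
        case $triplet in
            ??x)     digit=$((digit + 1)) ;;
            ??[st])  digit=$((digit + 1)); special=$((special + weight)) ;;
            ??[STl]) special=$((special + weight)) ;;   # "l" treated like "S" (assumption)
        esac
        result=$result$digit
    done
    printf '%s%s\n' "$special" "$result"
)
```

With this in place, "sym_to_octal rwxrwxrwt" prints 1777 and "sym_to_octal -r-sr-s--x" prints 6551. You would still need to pull the mode, UID, GID, and file name out of the "ls -ln" output, and that is where file names with spaces make life interesting again.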

So we've got a decent solution for Linux and BSD, but a trip to Scriptistan on all other platforms. That's a bummer. I think I'll make myself feel better by watching one of my favorite movie scenes (as a bit of movie trivia, that's Director Rob Reiner's mom delivering the punch line at the end of the scene).