Tuesday, August 10, 2010

Episode #107: Email for Natural File Enhancement

Tim pretends he is Cliff Clavin:

Another dive into the mailbag this week. Jonathon English writes in asking how to find files that are larger than 10MB, determine the owner, and send him/her an email.

Let's start off with the easy part, getting the owner of files bigger than 10MB.

PS C:\> ls -r | ? { $_.Length -ge 10485760 } | get-acl | select path, owner

Path Owner
---- -----
Microsoft.PowerShell.Core\FileSystem::C:\bigfile.txt DOMAIN\user1
Microsoft.PowerShell.Core\FileSystem::C:\biggerfile.txt DOMAIN\user2
Microsoft.PowerShell.Core\FileSystem::C:\biggestfile.txt DOMAIN\user3
The command starts off with Get-ChildItem (aliases dir, ls). The -r[ecurse] option is used to search all subdirectories. Files are filtered with the Where-Object cmdlet (alias ?) to return only files that are 10MB or larger (10485760 bytes; PowerShell also accepts the literal 10MB here). The results are piped into Get-Acl, which gets the owner. Finally, the properties owner and path are selected. As you may notice, the path looks a little funny. The Get-Acl cmdlet works with multiple providers, and the path returned by the cmdlet includes the provider.

The path is a bit ugly, so it would be good to clean it up, but how? There are a few ways to do it. Here is the easiest and quickest.

PS C:\> ls -r | ? { $_.Length -ge 10485760 } |
select FullName, @{Name="Owner";Expression={(Get-Acl $_).owner }}


FullName Owner
-------- -----
C:\bigfile.txt DOMAIN\user1
C:\biggerfile.txt DOMAIN\user2
C:\biggestfile.txt DOMAIN\user3
Here the Select-Object cmdlet (alias select) is used with a calculated property to get the file's owner.

So we have the info, now to email it. Some of this will depend on the security settings used by your mail server(s) and you may have to customize this yourself. Also, we may cross into scriptland and violate rule #1. Ed isn't around, so no one tell him.


PS C:\> $SmtpClient = new-object system.net.mail.smtpClient
PS C:\> $SmtpClient.Host = "smtpserver.domain.com"
PS C:\> ls -r | ? { $_.Length -ge 10485760 } | select FullName, @{Name="Owner";Expression={(Get-Acl $_).owner } } | % {
$MailMessage = New-Object system.net.mail.mailmessage
$mailmessage.from = "bigbrother@mydomain.com"  # placeholder sender; From must be a valid address
$mailmessage.To.add(($_.Owner -replace "DOMAIN\\","") + "@mydomain.com")
$mailmessage.Subject = "Your big file won't copy"
$mailmessage.IsBodyHtml = 1
$mailmessage.Body = "This file is too big and won't be copied: " + $_.FullName
$smtpclient.Credentials = [System.Net.CredentialCache]::DefaultNetworkCredentials
$smtpclient.Send($mailmessage)
}
These are the same commands used above; the only difference is that we have ported some .NET code to send the email. In the code above, I assumed that the username is the local part of the email address or an alias for it. If that isn't the case, we have to look up the user's email address, and the easy way to do that requires the Exchange snap-in. (OK, not a default install, but pretty common. And if I am going to violate one rule, I may as well break them all.)

PS C:\> Get-User tim | select WindowsEmailAddress

WindowsEmailAddress
-------------------
Tim.Medin@commandlinekungfu.com
Here is the rewritten command using the Exchange cmdlet.

PS C:\> ls -r \\sharepoint\data\it\general | ? { $_.Length -ge 10485760 } |
select FullName, @{Name="Owner";Expression={(Get-User (Get-Acl $_).owner).WindowsEmailAddress } }


FullName Owner
-------- -----
C:\bigfile.txt user1@domain.com
C:\biggerfile.txt user2@domain.com
C:\biggestfile.txt user3@domain.com
Here is how it would look with our...ahem...script...


PS C:\> $SmtpClient = new-object system.net.mail.smtpClient
PS C:\> $SmtpClient.Host = "smtpserver.domain.com"
PS C:\> ls -r \\sharepoint\data\it\general | ? { $_.Length -ge 10485760 } |
select FullName, @{Name="Owner";Expression={(Get-User (Get-Acl $_).owner).WindowsEmailAddress } } | % {
$MailMessage = New-Object system.net.mail.mailmessage
$mailmessage.from = "bigbrother@mydomain.com"  # placeholder sender; From must be a valid address
$mailmessage.To.add($_.Owner)
$mailmessage.Subject = "Your big file won't copy"
$mailmessage.IsBodyHtml = 1
$mailmessage.Body = "This file is too big and won't be copied: " + $_.FullName
$smtpclient.Credentials = [System.Net.CredentialCache]::DefaultNetworkCredentials
$smtpclient.Send($mailmessage)
}
So that is how it looks. Let's see if Hal is manly enough to bend some rules.

Hal thinks hard

Wait, the guy who brought up "Natural Enhancement" is questioning my manhood? Hey, sport, in Unix-land we can actually send email from the command line without resorting to writing a script.

Actually, I did end up relaxing one of my own personal rules for this week's Episode. Generally I try very hard to come up with a solution that's a single command-line. It may be a really long pipeline and spawn lots of subshells, but you should only have to hit the "Go" button once. But this time, it was frankly easier to come up with a clear solution that uses two primary command lines. Sue me.

First is the task of identifying large files and sorting them by user. I'm going to do this work in a temporary directory just to make my life easier:

# mkdir /tmp/emails
# cd /tmp/emails
# find / -size +10000000c -ls | awk '{ print $5, $0 }' |
while read user line; do echo $line >>$user; done

The find command uses the same sort of expression we saw back in Episode 102: find all files whose size is greater than 10,000,000 bytes. The trailing "c" matters, because without a suffix find measures size in 512-byte blocks rather than bytes. I've added the "-ls" action so that instead of just the file names we get a detailed listing à la "ls -dils". This output gets piped into awk, where I pull out the username in field number five and re-output the line with the username duplicated at the front of the line. The modified lines then go into the while loop where I pull the username off the front of each line and append the remainder of the line (the "find ... -ls" output) to a file named for the user. So at the end of the command, I've got the detailed listing information for all the big files on the system broken out into separate files by user.
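
If you want to watch the bucketing step work without crawling the whole filesystem, here is a minimal sketch that feeds a few made-up lines in "find ... -ls" format through the same awk and while loop. The users (alice, bob), the paths, and the split_by_owner function name are all invented for illustration, and I've added "read -r" and some quoting that the one-liner above skips.

```shell
# Split "find -ls" style lines on stdin into one file per owner under directory $1.
split_by_owner() {
    outdir=$1
    mkdir -p "$outdir"
    # Copy the owner (field 5) to the front of each line, then peel it off again
    awk '{ print $5, $0 }' |
        while read -r user line; do
            echo "$line" >>"$outdir/$user"
        done
}

# Made-up sample input: inode, blocks, mode, links, owner, group, size, date, name
bucket_demo=$(mktemp -d)
split_by_owner "$bucket_demo" <<'EOF'
131 20480 -rw-r--r-- 1 alice staff 20971520 Aug 10 2010 /home/alice/dump.iso
132 12288 -rw-r--r-- 1 bob staff 12582912 Aug 9 2010 /home/bob/core
133 30720 -rw-r--r-- 1 alice staff 31457280 Aug 8 2010 /var/tmp/backup.tar
EOF

# One file per owner, each holding that owner's lines
wc -l "$bucket_demo"/*
```

Against a real system you would replace the here-document with the find command, e.g. "find / -size +10000000c -ls | split_by_owner /tmp/emails".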

That's actually the hard part. Sending email to the users is easy (assuming outgoing email on your machine is configured properly):

# for user in *; do cat $user | mailx -s 'Large File Report' $user; done

All we need to do is shove the contents of each file into the standard input of a mailx process with the username specified as the recipient. The "-s" option even lets us specify a subject line for the message.

Note that I used "cat ... | mailx ..." here rather than "mailx ... <$user" because I wanted to make it easier to add a canned message to the top of the report in addition to the find output. For example, if you had the boilerplate in /tmp/mymsg, then you would use a command line like:

# for user in *; do cat /tmp/mymsg $user | mailx -s 'Large File Report' $user; done
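
Before turning a loop like this loose on real inboxes, it can be worth previewing what each user would receive. The following is only a sketch: fake_mailx, send_reports, and the MAILER variable are names I made up, and the report data is invented. The stand-in accepts the same "-s subject recipient" arguments as the mailx call above but just prints the message.

```shell
# Hypothetical stand-in for mailx: prints what would be sent instead of sending it.
# Called as: fake_mailx -s SUBJECT RECIPIENT, with the message body on stdin.
fake_mailx() {
    echo "To: $3"
    echo "Subject: $2"
    echo
    cat
}

# MAILER is an invented knob: set MAILER=mailx to send for real.
MAILER=${MAILER:-fake_mailx}

# $1 = directory of per-user report files, $2 = boilerplate file
send_reports() {
    for report in "$1"/*; do
        user=$(basename "$report")
        cat "$2" "$report" | "$MAILER" -s 'Large File Report' "$user"
    done
}

# Made-up demo data
preview_dir=$(mktemp -d)
mkdir "$preview_dir/emails"
echo 'Please clean up the following files:' >"$preview_dir/mymsg"
echo '131 20480 -rw-r--r-- 1 alice staff 20971520 Aug 10 2010 /home/alice/dump.iso' \
    >"$preview_dir/emails/alice"

send_reports "$preview_dir/emails" "$preview_dir/mymsg"
```

Once the preview looks right, running with MAILER=mailx set should behave like the loop above, assuming outgoing mail works on the machine.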

I actually am feeling a little... well let's just say "impotent"... for having to do this in two commands. There's probably some crazy output redirection stunt I could have done to pack this into a single command line, but my solution accomplishes the task and is straightforward to type and understand.

If anybody feels man enough to accept the challenge of coming up with a single command version, mail your answers to suggestions[at]commandlinekungfu[dot]com. I will choose the "best" solutions using some completely arbitrary set of criteria that I haven't developed yet and post them here on the blog. Prizes will probably not be awarded.
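
To get the ball rolling, here is one possible direction, offered as a sketch rather than a winning entry. awk can print into a pipe, and every line printed to the same command string goes to the same process, so each owner's lines pile up in that owner's mail command until awk exits. The mail_by_owner function, the fakemail stand-in script, and the sample data are all invented so the idea can be tried without sending any mail.

```shell
pipe_demo=$(mktemp -d)

# Hypothetical stand-in for "mailx -s SUBJECT USER": saves the body to USER.mail
cat >"$pipe_demo/fakemail" <<EOF
cat >"$pipe_demo/\$3.mail"
EOF

# $1 = mail command; reads "find -ls" style lines on stdin.
# awk opens one pipe per distinct command string, i.e. one per owner (field 5).
mail_by_owner() {
    awk -v mailer="$1" '{ print | (mailer " -s \"Large File Report\" " $5) }'
}

# Made-up sample input in "find -ls" format
mail_by_owner "sh $pipe_demo/fakemail" <<'EOF'
131 20480 -rw-r--r-- 1 alice staff 20971520 Aug 10 2010 /home/alice/dump.iso
132 12288 -rw-r--r-- 1 bob staff 12582912 Aug 9 2010 /home/bob/core
133 30720 -rw-r--r-- 1 alice staff 31457280 Aug 8 2010 /var/tmp/backup.tar
EOF

ls "$pipe_demo"
```

The real thing would presumably be "find / -size +10000000c -ls | mail_by_owner mailx" (the "c" suffix makes find count bytes), though I have not run it against a live mail setup. Note that awk keeps one pipe open per owner until it finishes, so a system with a great many distinct owners could run into open-file limits.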