Tuesday, July 6, 2010

Episode #103: Size Might Matter... But Timing is Everything

Ed checks the mailbag:

Diligent reader and command-line warrior Esther Yee writes in:

Would I get your advice on how to track all the files which I have
accessed in my pc (Window XP) on yesterday, with the time of
accessed pls? what would be the command line then? I only know
about last modified, last created. how about last accessed?

Great question, Esther! On first read, it sounds like a cousin of the issue we faced last week, looking for files based on their size. But, from the cmd.exe perspective, there are some important subtleties here with big implications.

I'm gonna start out by pretending that you asked about last modified time, instead of last accessed. Yup... I'm gonna ignore the substance of the question for now, because we have to build up to it. So, how can we find files that were modified on a given date (such as yesterday) and pluck out their modified time? Taking a cue from Episode #102, we could run:
C:\> cmd.exe /v:on /c "for /r c:\ %i in (*) do @set datetime=%~ti& set 
date=!datetime:~0,10!& if !date! equ 07/03/2010 set time=!datetime:~-8,8!&
echo !time! %~fi"
05:06 AM c:\tmp\what.txt
05:13 AM c:\WINDOWS\WindowsUpdate.log
06:24 AM c:\WINDOWS\Prefetch\
05:32 AM c:\WINDOWS\Prefetch\
05:32 AM c:\WINDOWS\Prefetch\

Here, I'm invoking delayed variable expansion (cmd.exe /v:on /c) and then running a FOR /R loop, just like last week's episode, to iterate through files recursively. I'm looking in c:\, with file names assigned to iterator variable %i. I'm looking for any type of file in my set, with in (*). In the body of my loop (do), I'm turning off display of commands (@). I then store the date/time associated with a file (%~ti) in the variable datetime so that I can perform substring operations on it (you can't do substring ops on an iterator variable directly). I've smushed the & right after the %~ti so that I don't get an extraneous space after it. I then set a variable called date to the first 10 characters of datetime (set date=!datetime:~0,10!).

Next, I check to see if the date is equal to the date in question (I put 07/03/2010 here as an example). If that is the case, I set a variable called time to the last eight characters of datetime (set time=!datetime:~-8,8!&), and then I simply echo the time as well as the full path to the file. With the filename stored in %i, our FOR /R loop puts the full path to the file in %~fi.

That's not too bad from a complexity perspective... but it doesn't answer Esther's question. This shows files that were last modified on the given date, not last accessed. It turns out getting the last accessed time is more difficult, because FOR /R loops give us the %~ti variable for time only in terms of last modified. If we want last accessed info, we can't rely on our FOR /R loop to give us the time option. We're going to have to rely on "dir /ta" instead (I wrote about using dir with /ta to get last access times in Episode #79). It's important to note right up front that, while dir /ta does show Windows last access date and time, this field is often not updated appropriately on Windows machines. Still, the timestamp given by "dir /ta" is officially the last access time, so we'll work with it.

So, let's toss out our FOR /R loop and just plow through with this, using a FOR /F loop to iterate over the output of a dir /s /ta command, creating something that looks a bit like what we did for last modified time:
C:\> cmd.exe /v:on /c "for /f "tokens=1-4,*" %i in ('dir /a /s /ta c:\') do @set 
date=%i& if !date! equ 07/03/2010 @echo %j %k %m" | more
05:56 AM Documents and Settings
06:19 AM downloads
06:19 AM icecasttemp
06:19 AM Program Files
05:32 AM System Volume Information

Here, I'm invoking delayed variable expansion again. Then, I start a FOR /F loop, which will let me iterate over the output of a command. I'm using some parsing logic to split up the output of my command, assigning iterator variables starting at %i to the first four fields of output, plus one more for whatever remains ("tokens=1-4,*" %i). So, %i will get the first field, %j the second, %k the third, %l the fourth. Then, %m will get everything left through the end of the line, which may be a file name with spaces in it.

The command whose output I'm iterating over is 'dir /a /s /ta'. I put a /a here so that I can get files with any kind of attributes (including hidden files). My FOR /R loop in the earlier command got files independent of their attributes, so I figured I should make my dir iterator comparable. To tell my FOR /F loop that I want this dir stuff to be interpreted as a command, and not a file or a string, I put it in single quotes in my in () clause. My dir command is recursing (/s) starting from the c:\ and displaying the last access time in its output. The output of dir has the following columns, separated by spaces and tabs:
DATE  TIME  AM/PM  <DIR>-or-SIZE  NAME

With my parsing logic, %i will be DATE, %j will be TIME, and so on, up to %m, which will hold the name, even if it includes spaces.

In my do clause, I simply set the date to %i, the first column of the dir output (if it's not a date because of the cruft at the start or end of dir's output, that's ok, because my next command, an if statement, will find that it doesn't match what I'm looking for, a date). If the date equals what we're looking for (if !date! equ 07/03/2010), I echo out %j, %k, and %m. What are those? Well, that would be the TIME, AM/PM, and NAME.

Sweet! So, what's the problem here? Well, the name is just, uh, the name. It's not the full path. We don't have access to the full path like we did in the nice little %~fi trick we had with FOR /R.

Now, normally, what you'd do with dir /s to get the full path is to use a /b, which gives the "bare" form of output (no volume name and total size cruft) but also has the useful side effect of showing full paths when used with /s. However, the /b option, when used with the /ta, overrides the /ta, giving us NO DATE OR TIME FIELDS. Don't ya just love cmd.exe?

OK, so we have a dilemma: to get the full path with dir /s, we use /b, which causes us to lose the datetime field, which kinda screws everything up. This happens a lot when using the dir command. You want to access something (like a timestamp with /t or even the owner of a file with /q), and you want the full path, but the /b which gives you the full path removes the other fields you want. Clearly, we need another approach. If this were going to be easy, Esther wouldn't have asked.

We can accomplish this by using some of the piece parts from above. Let's recurse through the file system using FOR /R, which gives us all those marvelous ways of referring to file properties (with the full path of %~fi). Then, we can run dir with /ta on each individual file to pull out the last accessed date and time, which we can check with some if logic. If the date matches what we're looking for, we can then print the time (which we'll parse out of our dir output using a FOR /F loop) and the full path, which will still be hanging around from our FOR /R loop. Yeah, that's the ticket. Here it is:
C:\> cmd.exe /v:on /c "for /r c:\ %a in (*) do @for /f "tokens=1-5" %i in 
('dir /a /ta "%~fa"') do @set date=%i& if !date! equ 07/03/2010 echo %j %k %~fa"
04:34 AM c:\Documents and Settings\All Users\Start Menu\New Office Document.lnk
04:34 AM c:\Documents and Settings\All Users\Start Menu\Open Office Document.lnk
04:34 AM c:\Documents and Settings\All Users\Start Menu\Set Program Access and Defaults.lnk
04:34 AM c:\Documents and Settings\All Users\Start Menu\Windows Catalog.lnk
04:34 AM c:\Documents and Settings\All Users\Start Menu\Windows Update.lnk

So, here, I've invoked delayed variable expansion and kicked off a FOR /R loop to go through c:\ and grab all files, setting each to iterator variable %a. Then, in the do clause of my FOR /R loop, I run a FOR /F loop to do some string parsing on the output of my 'dir /a /ta' command, which is used to pull the directory listing information from the full path of the file assigned by my FOR /R loop ("%~fa"). I have to put that full path in double quotes, or else spaces in directory or file names could cause trouble.

Then, in the do clause of my FOR /F loop, I put the date (%i) in a variable called date. I check to see if the date is equal to the date we had in mind. If it is, I echo out the time (%j) the AM/PM (%k) and the full path of our file (which still lives in %~fa courtesy of the FOR /R loop). Voila! Easy as... uh... pi.

Hal's rocking out

I seriously thought about just posting the Unix solution for this week's Episode without any explanation. But then I thought that would be just rubbing it in, and I'm bigger than that. Here's the solution, though:

find / -type f -atime -1 -print0 | xargs -0 ls -lu

Some commentary is in order here:

  • Notice that I'm just looking for regular files here ("-type f"). If you care about other kinds of files, you might want to suppress directories ("\! -type d"), since the atime on a directory gets updated every time the directory is listed. That means that directory atime info is mostly just noise.

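  • The minus sign before the one in "-atime -1" means "less than". So you read that clause as "atime less than one day old". "+1" would mean "greater than one day" (find throws away fractional days when it does the comparison, so in practice "+1" only matches files last accessed at least two days ago).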

  • Esther was kind enough to want to search for files with one-day granularity, which is all find can handle internally. If you need finer control than that, see the trick in Episode 29.

  • Since I can't be sure whether or not the file names are going to have spaces, quotes, or other funny characters in them, I'm using "-print0" to tell find to output the file names as null-terminated strings. "xargs -0" tells xargs to look for input formatted in this way.

  • "ls -lu" gives a detailed listing ("-l") showing last access times ("-u") instead of last modified times. We had to use "xargs -0 ls -lu" here because the built-in "-ls" operator in find only displays last modified times.
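
If you'd like to convince yourself of the points above without crawling your whole file system, here's a minimal sketch you can run in a scratch directory. It assumes GNU touch (for the -d '3 days ago' syntax) and a find/xargs pair that understands -print0 and -0; the directory and file names are invented for the demo. The second half shows one way to get finer-than-a-day granularity using find's -anewer test against a reference file. That's offered as an illustration only and isn't necessarily the same trick as the one in Episode 29.

```shell
#!/bin/sh
# Sketch only: the scratch directory and file names are invented for the demo.
# Assumes GNU touch (for -d '3 days ago') and a find/xargs that support -print0 / -0.
demo=$(mktemp -d)
ref=$(mktemp)                                # reference file, kept outside $demo

touch "$demo/read today.txt"                 # atime = now; note the space in the name
touch "$demo/stale.txt"
touch -a -d '3 days ago' "$demo/stale.txt"   # move only the access time back

# The pipeline from above, pointed at the scratch directory.
recent=$(find "$demo" -type f -atime -1 -print0 | xargs -0 ls -lu)
printf '%s\n' "$recent"

# Finer than one day: compare each file's atime to the mtime of a reference file.
touch -d '2 hours ago' "$ref"
fine=$(find "$demo" -type f -anewer "$ref" -print)
printf '%s\n' "$fine"

rm -rf "$demo" "$ref"
```

Both listings should show only "read today.txt". The file name with the space survives the trip through xargs because of the -print0 / -0 pairing, and "stale.txt" drops out because its access time was pushed back three days.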

So there you go, a solution that fits easily into a 140 character tweet. Boy, that takes me back to the early days when Paul, Ed, and I were just a group of crazy young kids with visions of command-line glory in our eyes...

Tim does it quickly:

Ah, the glory days. Oh wait, I wasn't around then, but I'm still just as cynical as the old timers. And that vision of command line glory...I lost it moons ago.

If I had been tweeting fu back in the day, I would have had no trouble posting this one. Even the long version of the PowerShell command fits in 140 characters.

PS C:\> Get-ChildItem -Recurse -Force | Where-Object { $_.LastAccessTime -gt (Get-Date).AddDays(-1) }

So how does it work? A recursive directory listing, which includes system and hidden files by using the -Force option, is filtered based on the last access time. The filter looks for files where the LastAccessTime is greater than (-gt) yesterday at this time. The "date math" is accomplished by getting a date object and using its AddDays method to subtract a day.

The only shortcoming is that the default directory listing displays the LastWriteTime property, not the LastAccessTime. To display the LastAccessTime, or other properties we might want, we can pipe the output into the Select-Object cmdlet. Here is how we do just that:

PS C:\> ls -r -fo | ? { $_.LastAccessTime -gt (Get-Date).AddDays(-1) } | select LastAccessTime, Name

That's it. Short and sweet.