Tuesday, July 7, 2009

Episode #50: Scheduling Stuff

Ed begins:

The other day, I was going through the 49 episodes of this blog we've developed so far when it occurred to me -- besides an occasional minor mention of the cron daemon by Hal, we haven't done much at all with scheduled tasks! Heaven forbid! And, there are some pretty important gotchas to avoid when scheduling stuff in Windows.

Back in the Mesozoic Era, dinosaurs like me scheduled our jobs on Windows using the "at" command. It was nice and simple, with no bells or whistles. And it's still there, offering a quick and easy way to schedule a job:

C:\> at [\\machine] HH:MM[A|P] [/every:day] "command"

If you don't specify a \\machine, the command is run locally. You need to have admin credentials on the local or remote machine where you're scheduling the job. The job itself will run with system privileges. Some versions of Windows support 24-hour military time, and some do not. But, all versions of Windows support HH:MM followed by a cap-A or cap-P. If you omit the /every: option, the command runs just once at the time you specify. If you do provide a /every:day, you can specify the day as a day of the month (1-31) or a day of the week (monday, tuesday, etc.) Remember, if you schedule something for the 31st day of the month, it won't run in months with fewer days: February, April, June... well, you remember the mnemonics, I'm sure.

The at command by itself will show which jobs were scheduled using the at command:

C:\> at \\PaulsComputer
Status ID   Day                     Time          Command Line
-------------------------------------------------------------------------------
        1   Today                   7:00 PM       WriteTenableBlog.bat
        1   Each Th                 7:00 PM       RecordPodcast.bat
        2   Each F                  3:00 PM       WashHalsCar.bat
        3   Each F                  3:00 PM       PickupHalsDrycleaning.bat
        4   Each F                  4:00 PM       GiveHalBackrub.bat
        5   Each F                  6:00 PM       GoToTherapistSession.bat
        6   Each 28                 12:00 PM      PayPsychBill.bat

Note that there is no need to list who the job will run as, because, when scheduled using the at command, it runs with SYSTEM privileges.

See those little ID numbers at the beginning of each task? We can use them to refer to tasks, especially to delete them. A simple "at 1 /del" will delete task 1. To kill all the tasks, and let God sort them out, we could run:

C:\> FOR /F "skip=2" %i in ('at') do @at %i /del

Regular readers of this blog should instantly know what I'm doing here: just running the at command, parsing its output by skipping column headers and ----- to zoom in on the ID number, which I then delete.

I still use the at command for quick and dirty scheduling where I want simple syntax to make something run with system privs. But, a far more flexible way to schedule jobs is to use the more recent schtasks command. This is the way Microsoft wants us to go for modern task scheduling. The syntax includes a myriad of options for creating, deleting, querying, changing, and invoking tasks. The syntax is so complicated, by the way, I haven't been able to actually memorize it all, and I've tried. I won't reproduce all the usage options here (run "schtasks /?" for details), but will instead focus on creating and querying tasks.

To create a task, you could use the following syntax:

C:\> schtasks /create [/s system_name] [/u username] [/p password] [/ru runuser]
[/rp runpassword] /sc schedule /mo modifier /tn taskname /tr taskrun
/st starttime /sd startdate

As with the at command, schtasks schedules locally unless you specify a remote machine with the /s option (oh, and no \\ is used before the system name here). The /u and /ru give some people a bit of trouble. The /u and /p refer to the credentials you want to use for a remote machine to schedule the task. The at command always used your current credentials, while schtasks gives you an option to use other credentials to do the scheduling. When it actually starts, the task itself will run with the /ru and /rp credentials. If you want system credentials, just use "/ru SYSTEM" and leave off the /rp.

The /sc schedule can be minute, hourly, daily, weekly, once, onlogon (when you specify a given user) and much more. The /mo modifier specifies how often you want it to run within that schedule. So, for example, to run every 2 hours, you could use "/sc hourly /mo 2".

The /tn taskname is a name of your choosing. The /tr is the actual command you want to run.

The /st starttime is specified in 24-hour military time. Some versions of Windows will accept it in HH:MM format, but many versions of Windows require HH:MM:SS. I always use the latter to make sure my command works across Windows versions. And, finally, /sd has a date in the form of MM/DD/YYYY.

When run by itself (with no options), schtasks shows scheduled tasks (the same output we get from "schtasks /query").

AND HERE IS A REALLY IMPORTANT POINT! The at command shows only the jobs scheduled via the at command itself. The schtasks command shows all jobs scheduled via schtasks as well as the at command. If you are relying only on the at command to display jobs, you are missing all those tasks scheduled via schtasks. Also, wmic has a job alias. You might think that it would show all jobs, right? Wrong! Like the at command, "wmic job list full" displays only those jobs created using the at command, and does not display jobs created via schtasks. You've been warned!

Both the schtasks and at commands, when they schedule a job, create a file summarizing the job in C:\windows\tasks on Win XP Pro and later, and in c:\windows\system32\tasks in Vista, 2008 Server, and Windows 7. The XP-style file is an ugly blob of non-ASCII printable data. The Vista and later files are considerably less ugly XML. I know you may be tempted to delete tasks by removing these files. I caution you against it. I've found that deleting at-style tasks by removing their files in XP works just fine, but it doesn't remove all traces of at-style tasks in Vista and later. Your best bet is to use "at [n] /del" and "schtasks /delete [options]".

Even though it's got a bazillion-plus options, the schtasks command does not have the option of displaying only those tasks associated with a given user or scheduled to run on a certain frequency or date. But, we can use a little command-line kung fu to tease that information out. The technique is all based on a useful option in "schtasks /query" which allows us to specify the output format with /fo, with options including table, list, or csv. The table and list formats are nice, but csv is especially useful. The /v option gives us verbose output, which holds all of the attributes of our tasks.

With that info, you can create a nice CSV file with all of your tasks to open in a spreadsheet by running:

C:\> schtasks /query /v /fo csv > tasks.csv

C:\> tasks.csv


In your spreadsheet program, you can see the various column names for all the fields. We could then search just for some specific output in the tasks.csv file using findstr for strings or even regex. For example, if you want a list of jobs scheduled to run weekly, you could use:

C:\> schtasks /query /v /fo csv | findstr /i weekly

Or, if you want a list of jobs associated with the SYSTEM user, you could run:

C:\> schtasks /query /v /fo csv | findstr /i system

Using this as your base command, there are a myriad of other options you can search for, including dates and times.

Hal chimes in:

Ed, I'm so stoked you brought up this topic! For the Unix people reading this blog, I want to see a show of hands from people who knew that Unix had its own "at" command. OK, for the three of you who have your hands raised, put your hands down unless you've actually used an "at" job within the last year. Yeah, I thought so.

It's a real shame that the "at" command has dropped so far out of the "common knowledge" for Unix folks because at jobs are really useful. At their most basic level, you can think of an at job as a one-shot cron job:

# echo find /tmp -type f -mtime +7 -exec rm {} \\\; | at 00:00
warning: commands will be executed using /bin/sh
job 1 at Tue Jul 7 00:00:00 2009

You feed in the commands you want executed on the standard input, or you can use "at -f" to read in commands from a file.

You can use "atq" to get a list of all the pending jobs along with their job numbers:

# atq
1 Tue Jul 7 00:00:00 2009 a root

But if you want to view the details of a pending job, you need to use "at -c jobnum" ("-c" for "confirm" is how I always remember it):

# at -c 1
#!/bin/sh
# atrun uid=0 gid=0
# mail root 0
umask 22
USER=root; export USER
PWD=/home/hal; export PWD
HOME=/root; export HOME
LOGNAME=root; export LOGNAME
cd /home/hal || {
echo 'Execution directory inaccessible' >&2
exit 1
}
find /tmp -type f -mtime +7 -exec rm {} \;

You'll notice that when at sets up the job, it's careful to preserve the environment variable settings in the current shell, including the umask value. It even makes sure to cd into the directory from which you scheduled the at job, just in case your at job uses relative path names. Very smart little program.
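
To make that bookkeeping concrete, here's a minimal sketch of the same idea in plain shell. To be clear, this is not how at is implemented, and the function name emit_job_script is made up for illustration. It only captures the umask and the working directory (no environment variables), then prints a little wrapper script in the same shape as the "at -c" output above.

```shell
#!/bin/sh
# emit_job_script COMMAND
# Hypothetical helper, NOT how at is implemented: prints a standalone
# script that restores the caller's umask and working directory and then
# runs COMMAND. Environment variables are left out to keep the sketch short.
emit_job_script() {
    # Escape any single quotes so the directory can be safely single-quoted.
    dir=$(pwd | sed "s/'/'\\\\''/g")
    printf '#!/bin/sh\n'
    printf 'umask %s\n' "$(umask)"
    printf "cd '%s' || {\n" "$dir"
    printf "    echo 'Execution directory inaccessible' >&2\n"
    printf '    exit 1\n'
    printf '}\n'
    # The command text itself is copied through untouched.
    printf '%s\n' "$1"
}
```

Calling emit_job_script 'find /tmp -type f -mtime +7 -exec rm {} \;' from /home/hal would print a script that sets the umask, does the guarded cd into /home/hal, and then runs the find.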

Finally, "atrm" allows you to cancel (remove) jobs from the queue:

# atq
1 Tue Jul 7 00:00:00 2009 a root
# atrm 1
# atq

What's cool about the Unix at command compared to the Windows version is that you have much more flexibility as far as time scheduling formats. All of the following are valid:

# at -f mycommands 00:00 7/10        # run mycommands at midnight on Jul 10
# at -f mycommands midnight 7/10     # "noon" and "teatime" (4pm) are also valid
# at -f mycommands 00:00 + 4 days    # expressed as relative date offset
# echo init 0 | at now + 2 hours     # power off system in two hours

Many more time specifications are allowed-- please consult the at manual page.

For jobs that need to run more than once, you're supposed to use cron. Typically cron jobs are created using an interactive editor via "crontab -e". But sometimes you want to set up a cron job directly from the command line without using an editor-- for example when I'm using a tool like fanout to set up the same cron job on multiple systems in parallel. In these cases, my usual tactic is to do a sequence of commands like:

# crontab -l > /root/crontab     # dumps current cron jobs to file
# echo '12 4 * * * /usr/local/bin/nightly-cleanup' >> /root/crontab
# crontab /root/crontab          # replaces current jobs with modified file
# rm /root/crontab               # cleaning up

It's kind of a hassle actually, but it works. By the way if you're root but want to operate on another user's crontab file you can just add the "-u" flag, e.g. "crontab -u hal -l".
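
If you're pushing the same entry out to a pile of machines, it helps when the update is safe to run twice. Here's a minimal sketch of that idea written as a filter: it copies an existing crontab from stdin to stdout and appends the new entry only if an identical line isn't already there. The function name add_cron_entry is made up for illustration, and piping the result into "crontab -" assumes your crontab command reads a new table from standard input (Vixie cron does; check your man page).

```shell
#!/bin/sh
# add_cron_entry ENTRY
# Hypothetical helper (not part of cron): copies the crontab text on stdin
# to stdout and appends ENTRY unless an identical line is already present.
add_cron_entry() {
    # Pass the entry through the environment rather than "awk -v", because
    # -v would interpret backslashes (think "-exec gzip {} \;").
    CRON_ENTRY=$1 awk '
        { print }                                   # copy every existing line
        $0 == ENVIRON["CRON_ENTRY"] { found = 1 }   # exact whole-line match
        END { if (!found) print ENVIRON["CRON_ENTRY"] }
    '
}
```

With that defined, something like "crontab -l | add_cron_entry '12 4 * * * /usr/local/bin/nightly-cleanup' | crontab -" should do the whole dance without a temp file, and running it a second time leaves the crontab unchanged.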

The first five columns in the cron entry are the time and day spec for when you want the cron job to operate. The first column is "minutes" (0-59), the second "hours" (0-23), then "day of month" (1-31), "month" (1-12), and "day of week" (0-7, Sunday is 0 or 7). Here are some examples:

12 4 * * * /usr/local/bin/nightly-cleanup             # every morning at 4:12am
0,15,30,45 * * * * /usr/sbin/sendmail -Ac -q          # every 15 minutes
15 0 * * 0 /usr/local/sbin/rotate-tape                # Sundays at 15 past midnight
0 0 1 * * find /var/log -mtime +30 -exec gzip {} \;   # the first of every month
0 0 22 8 * /usr/local/bin/anniversary-reminder        # every year on 8/22

Frankly, I don't find myself using the 3rd and 4th columns all that frequently, or really even the 5th column all that much. Most of my jobs tend to run nightly at some regular time.
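
If you can never remember which column is which, a tiny labeler helps. The sketch below splits one cron entry into the five time fields plus the command and prints each with its name. The function name explain_cron and the labels are my own invention, and it assumes a plain user crontab line (no extra user column like you'd see in /etc/crontab).

```shell
#!/bin/sh
# explain_cron ENTRY
# Hypothetical helper: splits one crontab line into its five time fields
# plus the command, and prints each with a label.
explain_cron() {
    printf '%s\n' "$1" | {
        # read splits on whitespace but does no filename expansion, so a
        # bare "*" survives; -r keeps backslashes in the command intact.
        # The last variable soaks up the rest of the line.
        read -r minute hour dom month dow command
        printf 'minute=%s\n' "$minute"
        printf 'hour=%s\n' "$hour"
        printf 'day-of-month=%s\n' "$dom"
        printf 'month=%s\n' "$month"
        printf 'day-of-week=%s\n' "$dow"
        printf 'command=%s\n' "$command"
    }
}
```

For example, explain_cron '15 0 * * 0 /usr/local/sbin/rotate-tape' labels 15 as the minute, 0 as the hour, and the final 0 as the day of week (Sunday).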

By the way, you'll discover that old farts like me have a thing about not scheduling cron jobs between 2am and 3am. That's because older cron daemons had trouble dealing with Daylight Saving Time shifts, and would either skip jobs entirely or run them twice depending on which way the clock was shifting. This was fixed by Paul Vixie in his "Vixie cron" package, which is the standard cron daemon on Linux these days, but may still be an issue if you have older, proprietary Unix systems still running around. Check the man page for the cron daemon on your system if you're not sure.
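
If you inherit a crontab and want to spot entries sitting in that 2am hour, a rough lint like the following may help. It's only a sketch: the function name dst_risky_entries is made up, and it only recognizes an hour field of "2" by itself or inside a comma list. Ranges like "1-3", steps like "*/2", and a bare "*" are deliberately not expanded here.

```shell
#!/bin/sh
# dst_risky_entries [FILE...]
# Hypothetical lint: prints crontab lines whose hour field names hour 2,
# either alone ("2") or in a comma list ("1,2,3"). Ranges, steps, and "*"
# are not expanded in this sketch. Reads stdin when no FILE is given.
dst_risky_entries() {
    awk '
        /^[[:space:]]*(#|$)/ { next }   # skip comments and blank lines
        NF < 6               { next }   # skip VAR=value lines and fragments
        {
            n = split($2, hours, ",")
            for (i = 1; i <= n; i++)
                if (hours[i] == "2" || hours[i] == "02") { print; break }
        }
    ' "$@"
}
```

Typical use would be "crontab -l | dst_risky_entries", or pointing it at a saved copy of the crontab file.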