Tuesday, October 27, 2009

Episode #66: Log Jam

Tim logs in:

This episode we take a look at logs, the window to the soul of your computer. Ok, maybe not, but we'll still look at them anyway. Windows has different ways to view the Event Log via the command line depending on the version.

In Windows 2003, XP, and older versions the classic event logs were Application, System, and Security. Beginning with Vista, the entire event log system changed. There is now a plethora of new logs in addition to the classic logs. To fully discuss the differences we would need an entire white paper, so we will stick with getting the familiar information from the event logs. To make it even more fun, since Windows 7 and Server 2008 R2 are officially out, we now have PowerShell Version 2 with its additional cmdlets and another way to access the Event Log.

We will kick off this week with PowerShell and its methods for retrieving information from the Event Log. In v1, the only option for viewing the event log was Get-EventLog. It would "get" the standard event logs such as System or Application. In v2 we have Get-WinEvent, which allows retrieval of a wider range of logs. It also allows us to work with saved event log files (.evtx) which is a great new feature! With Vista and beyond, Get-WinEvent is the recommended method, but we will describe both cmdlets since there are a lot of XP and Windows 2003 machines in addition to Vista and Windows 2008 (R1) machines without PowerShell v2.

Here is a quick demonstration of the difference in the number of logs accessible by each PowerShell cmdlet:

PS C:\> (Get-WinEvent -ListLog *).Count
PS C:\> (Get-EventLog -List).Count

Whoa, there are a lot more logs! We don't have the space to go over them all, so let's look at a practical example of searching the logs for specific events, such as an account lockout. I have a cheat sheet of Event IDs at my desk, but I'm not there right now so let's find any event containing the word "locked".

PS C:\> Get-WinEvent -logname security | ? {$_.Message -like "*locked*"} | fl
TimeCreated : 10/24/2009 8:57:20 AM
ProviderName : Microsoft-Windows-Security-Auditing
Id : 4740
Message : A user account was locked out.
Security ID: S-1-5-19
Account Name: MYWIN7BOX$
Account Domain: MYDOMAIN
Logon ID: 0x3e7

Account That Was Locked Out:
Security ID: S-1-5-21-1111111111-2222222222-3333333333-4000
Account Name: mrjones

Additional Information:
Caller Computer Name: MYWIN7BOX

PS C:\> Get-EventLog -LogName security -Message *locked* | fl
Index : 3533232
EntryType : SuccessAudit
InstanceId : 4740
Message : A user account was locked out.

Security ID: S-1-5-19
Account Name: MYWIN7BOX$
Account Domain: MYDOMAIN
Logon ID: 0x3e7

Account That Was Locked Out:
Security ID: S-1-5-21-1111111111-2222222222-3333333333-4000
Account Name: mrjones

Additional Information:
Caller Computer Name: MYWIN7BOX
Category : (13824)
CategoryNumber : 13824
ReplacementStrings : {test, MYWIN7BOX, S-1-5-21-1111111111-2222222222-33...}
Source : Microsoft-Windows-Security-Auditing
TimeGenerated : 10/24/2009 8:57:20 AM
TimeWritten : 10/24/2009 8:57:20 AM

You'll notice that Get-EventLog has a parameter that allows searching for a specific string in the message while Get-WinEvent does not. Oh well, you can't win them all. Now we know the Event ID we are looking for, so we can use that for the search. We pipe the command into fl, which is the alias for Format-List, so that it is easier to read. To save space going forward we'll let PowerShell use its default format (Format-Table).

PS C:\> Get-EventLog Security | ? { $_.EventId -eq 4740}
Index Time EntryType Source InstanceID Message
----- ---- --------- ------ ---------- -------
3533232 Oct 24 08:57 SuccessA... Microsoft-Windows... 4740 A user ...

PS C:\> Get-WinEvent -FilterHashtable @{LogName="Security"; ID=4740}
TimeCreated ProviderName Id Message
----------- ------------ -- -------
10/24/2009 8:57:... Microsoft-Window... 4740 A user account w...

Using the FilterHashtable parameter available with Get-WinEvent allows us to do the filtering within the command instead of further down the pipeline, which makes it much, much faster. Here is the time difference:

PS C:\> measure-command {Get-EventLog Security | ? { $_.EventId -eq 4740}} |
select TotalMilliseconds

TotalMilliseconds : 1898.7783

PS C:\> measure-command {Get-WinEvent -logname security | ? {$_.ID -eq 4740}} |
select TotalMilliseconds

TotalMilliseconds : 24219.317

PS C:\> measure-command {Get-WinEvent -FilterHashtable @{LogName="Security"; ID=4740}} |
select TotalMilliseconds

TotalMilliseconds : 61.8189

The keys typically used for filtering are LogName, ID, StartTime, EndTime, and UserID (SID). You can see a full list in the help page for that parameter (Get-Help Get-WinEvent -Parameter FilterHashtable). The help page has a decent description of the parameter and its usage. The examples (Get-Help Get-WinEvent -Examples) are better at describing how it works.

Now we know an account was locked out at 8:57. To see what happened within a minute of that event, we use Get-EventLog's Before and After parameters (or Get-WinEvent's StartTime and EndTime keys) to narrow the scope.

PS C:\> Get-EventLog -logname security -After 8:56 -Before 8:58
Index Time EntryType Source InstanceID Message
----- ---- --------- ------ ---------- -------
3533474 Oct 24 08:58 SuccessA... Microsoft-Windows... 5447 A Windo...
3533473 Oct 24 08:58 SuccessA... Microsoft-Windows... 5447 A Windo...

PS C:\> Get-WinEvent -FilterHashtable @{ logname="Security"; StartTime = "08:56";
EndTime = "08:58" }

TimeCreated ProviderName Id Message
----------- ------------ -- -------
10/24/2009 08:58... Microsoft-Window... 5447 A Windows Filter...
10/24/2009 08:58... Microsoft-Window... 5447 A Windows Filter...

Now let's try to find some other events, such as a service start (7036) or stop (7035), the event log service starting (6005) or stopping (6006), or a system shutdown (1074). These are all in the System event log, and we can find them with one big command.

PS C:\> Get-EventLog -logname system | ? { $_.EventID -eq 7036 -or
$_.EventID -eq 7035 -or $_.EventID -eq 6006 -or $_.EventID -eq 6005
-or $_.EventID -eq 1074 }

PS C:\> Get-WinEvent -FilterHashtable @{LogName="System";
ID=7035,7036,6005,6006,1074}

In this example the Get-WinEvent command is much faster; my tests showed it to be roughly 10x faster.

Getting context across event logs is always handy. Unlike Get-EventLog, which limits us to viewing one log at a time, Get-WinEvent lets us look across multiple logs.

PS C:\> Get-WinEvent -FilterHashtable @{LogName="System";
ID=7035,7036,6005,6006,1074},@{LogName="Security"; ID=4740} -Oldest

We can use an array of hash tables for filtering to get all the account lockouts plus all of the interesting events from the System log. The filters are a union of the results, not an intersection (adds, not subtracts).

Both commands also support the -ComputerName parameter, which allows us to interrogate another system. Get-WinEvent additionally lets us connect with another set of credentials.
PS C:\> Get-EventLog -LogName Security -ComputerName Ed
PS C:\> $cred = Get-Credential
PS C:\> Get-WinEvent -ComputerName Ed -Credential $cred

The Get-Credential cmdlet pops up a prompt asking for your credentials. The credentials are stored in a variable and then used to connect to Ed's machine.

So that is how we do it in PowerShell, but what about accessing the "new" event log with the regular old Windows command line?

Windows Command Line - Vista, 2008 and 7

Since Vista there is a new sheriff in town for dealing with the Event Log: wevtutil.exe. Too bad he's a bad sheriff. In Episode #15, Ed stated, "The wevtutil query syntax is impossibly complex, and something I frankly loathe." I completely agree. Everything about this command is sideways; even the default output format isn't readable. I highly suggest using PowerShell, since anything but the most basic query gets ugly. However, this command gives us a lot of control over the event log besides querying, such as enumerating logs, getting or setting log configuration, getting log status, and exporting, archiving, and clearing logs. Querying is the goal of this episode, so here is the syntax with the most common options:

wevtutil { qe | query-events } <LogName> /q:<XPathQuery>
/c:<# of events to return> /f:[XML|Text|RenderedXml]

Let's use the command to view the System log.

C:\> wevtutil qe System
<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'><System><Pr
ovider Name='Service Control Manager' Guid='{555908d1-a6d7-4695-8e1e-26931d2012f
4}' EventSourceName='Service Control Manager'/><EventID Qualifiers='49152'>7000<
/EventID><Version>0</Version><Level>2</Level><Task>0</Task><Opcode>0</Opcode><Ke
ywords>0x8080000000000000</Keywords><TimeCreated SystemTime='2009-10-23T19:07:02
.954026500Z'/><EventRecordID>433362</EventRecordID><Correlation/><Execution Proc
essID='460' ThreadID='5740'/><Channel>System</Channel><Computer>mycomputer.mydom
ain.locl</Computer><Security/></System><EventData><Data Name='param1'>Diagnostic
Service Host</Data><Data Name='param2'>%%1297</Data></EventData></Event>
<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'><System><Pr

Uh oh, by default the output is in XML, which means the /f:Text (or /format:Text) option is something we are going to use a lot.

C:\> wevtutil qe System /f:Text
Log Name: System
Source: Service Control Manager
Date: 2009-10-23T14:07:02.954
Event ID: 7000
Task: N/A
Level: Error
Opcode: N/A
Keyword: Classic
User: N/A
User Name: N/A
Computer: mywin7box

To search the logs you need to use an XPath query. As you may remember, above we were trying to find an account lockout event; here is the equivalent search using wevtutil.

C:\> wevtutil qe Security /q:*[System[(EventID=4740)]] /f:text

Log Name: Security
Source: Microsoft-Windows-Security-Auditing
Date: 2009-10-24T14:08:59.279
Event ID: 4740
Task: User Account Management
Level: Information

In the PowerShell section there were five different Event IDs we were interested in; wevtutil can do a similar search.

C:\> wevtutil qe System /q:"*[System[(EventID=7035 or EventID=7036 or
EventID=6005 or EventID=6006 or EventID=1074)]]" /f:text | find "Event ID"

Event ID: 7036
Event ID: 7036
Event ID: 1074
Event ID: 6006
Event ID: 7036

Unfortunately, there isn't a nice way to get results from two logs and combine them. Even worse, querying for a specific time range isn't pretty.

C:\> wevtutil.exe qe System /q:"*[System[(EventID=1074 and
TimeCreated[@SystemTime > '2009-10-23T21:04:15.000000000Z'])]]" /f:text

Enough of that nasty command, what do the rest of you guys have?

Ed kicks it Old-Skool:

On XP and 2003 boxen, we don't have wevtutil.exe... thank goodness. Instead, we have the really nice built-in VB script (Dear Mr. Editor... that's not a typo... I really did mean "really nice VB script." Please don't remove.) called eventquery.vbs. By default, eventquery.vbs runs locally, although you can provide it with a "/s [machine]" and it'll pull logs off of remote Windows boxen, provided that you have administrative SMB access to said machines. Of course, you should avoid logging in directly to a console or terminal session with admin privs, and instead use "/u [domain]\[user] /p [password]" to specify some user other than your current logon.

We can then specify which event log we're interested in with the /l option followed by the log name, which could include "application", "system", "security", "DNS Server", or user-defined log names. To hit all logs, specify a log of "/l *". To specify multiple logs, you can re-use /l multiple times.

The real magick with eventquery.vbs is its filter language, specified with /fi followed by a filter in quotes. Our filters are built of an event attribute name, followed by an operator, followed by a value. Attribute names include Datetime, Type, (event) ID, User, Computer, Source, and Category. We then have operators, such as eq (equals), ne (not equals), gt (I don't have to keep spelling them out, do I?), lt (of course not), le (this really isn't cute anymore), and ge (so I'll stop doing it). To specify more complex filters, just put multiple /fi "[filters]", one after the other, and the system will AND them all together.

For output, we specify /fo (which stands for "format output") followed by list, table, or csv. The default is table, but I find csv to be the most useful when I want to do more in-depth analysis at the command line or in a spreadsheet. For quick looks, though, table is prolly best.

You can list the most recent logs with the /r option followed by an integer N. If N is negative, it'll list the N oldest events. You can specify a range of integers to get the matching set of events in the logs, but I don't find that all that useful.

By default, the output of eventquery.vbs is really skimpy, showing only the Type, Event ID, Date Time, Source, and ComputerName of each event. To get more data (including descriptions and associated accounts), we can specify verbose mode, with a handy /v. That /v is hugely important, because, without it, eventquery.vbs doesn't include a description or many other really important details. I almost always use /v with this command.

So, let's apply our handy little eventquery.vbs to match Tim's searches above.

Let's start by looking for all events associated with the word "locked":

C:\> eventquery.vbs /L security /fo csv /v | find /i "locked"
"Audit Success","644","10/26/2009 5:26:48 AM","Security","WILMA","Account Manage
ment","NT AUTHORITY\SYSTEM","User Account Locked Out: Target Account Name:
bob Target Account ID: WILMA\bob Caller Machine Name: WILMA
Caller User Name: WILMA$ Caller Domain: WORKGROUP Caller L
ogon ID: (0x0,0x3E7)"

Note that I told find to look for the string "locked" in a case-insensitive fashion with the /i option. Also, note that the Event ID here is 644 (the XP/2003 equivalent of Vista's 4740).

I can search based on that Event ID by specifying a filter (with /fi):

C:\> eventquery.vbs /L security /fo csv /v /fi "id eq 644"
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.

"Type","Event","Date Time","Source","ComputerName","Category","User","Descriptio
"Audit Success","644","10/26/2009 5:26:48 AM","Security","WILMA","Account Manage
ment","NT AUTHORITY\SYSTEM","User Account Locked Out: Target Account Name:
bob Target Account ID: WILMA\bob Caller Machine Name: WILMA
Caller User Name: WILMA$ Caller Domain: WORKGROUP Caller L
ogon ID: (0x0,0x3E7)"

Like Tim, let's see what happened from a minute before up to a minute after this event occurred:

C:\> eventquery.vbs /L security /fo csv /v /fi "datetime gt 10/26/2009,05:25:48AM"
/fi "datetime lt 10/26/2009,05:27:48AM"
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.

"Type","Event","Date Time","Source","ComputerName","Category","User","Descriptio
"Audit Success","644","10/26/2009 5:26:48 AM","Security","WILMA","Account Manage
ment","NT AUTHORITY\SYSTEM","User Account Locked Out: Target Account Name:
bob Target Account ID: WILMA\bob Caller Machine Name: WILMA
Caller User Name: WILMA$ Caller Domain: WORKGROUP Caller L
ogon ID: (0x0,0x3E7)"
"Audit Failure","680","10/26/2009 5:26:47 AM","Security","WILMA","Account Logon"
_V1_0 Logon account: bob Source Workstation: WILMA Error Code:
"Audit Failure","680","10/26/2009 5:26:45 AM","Security","WILMA","Account Logon"
_V1_0 Logon account: bob Source Workstation: WILMA Error Code:
"Audit Success","517","10/26/2009 5:26:38 AM","Security","WILMA","System Event",
"NT AUTHORITY\SYSTEM","The audit log was cleared Primary User Name:
SYSTEM Primary Domain: NT AUTHORITY Primary Logon ID: (0x0,0x3
E7) Client User Name: Administrator Client Domain: WILMA Client L
ogon ID: (0x0,0x11E5A)"
"Audit Failure","680","10/26/2009 5:26:48 AM","Security","WILMA","Account Logon"
_V1_0 Logon account: bob Source Workstation: WILMA Error Code:
"Audit Failure","680","10/26/2009 5:26:50 AM","Security","WILMA","Account Logon"
_V1_0 Logon account: bob Source Workstation: WILMA Error Code:

See here how I've bundled together two filters to implement a time range. And we see... a bunch of failed logon attempts. Well, that makes sense. That's what locked out the account.

Continuing to mimic Tim's fu, here's how we can find service started events:

C:\> eventquery.vbs /l system /fi "id eq 7036" /v
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.

Listing the events in 'system' log of host 'WILMA'
Type Event Date Time Source ComputerName
Category User Description
------------- ------ ----------------------- ----------------- ---------------
--------------- -------------------- -----------
Information 7036 10/26/2009 5:50:58 AM Service Control M WILMA
None N/A The Task Scheduler service entered the runn
ing state.
Information 7036 10/26/2009 5:50:16 AM Service Control M WILMA
None N/A The Task Scheduler service entered the stop
ped state.

To get information about event ID 7035 (service stopped), 6005 (event log service started), 6006 (event log service stopped), or 1074 (system shutdown), just substitute those Event IDs into this query. "But," you might say, "Tim did all of those in a single command, with PowerShell doing a logical OR between them." Yes, and if Tim ran with scissors or jumped off the Brooklyn Bridge, would you do that too? Well, unfortunately, while eventquery.vbs does have "AND" (intersection) capabilities in its filters, I haven't been able to get it to implement "OR" (union) style filters. If I want to do that, I simply run the command multiple times with different filters, usually separated on a single command line with &.

So there. :P

Hal gets cut off at the knees:

You know something? I suggested the topic for this Episode. What the heck was I thinking? There's no possible way I'm going to be able to cover all of the different log file locations and log file formats for every flavor of Unix. So I'm going to stick with Red Hat type operating systems (including RHEL, CentOS, and Fedora) and hope you all will get enough hints to figure this out for the particular forest of Unix you find yourself in. Ready? Let's do it!

As far as login type events go, one fruitful source of information is wherever your system puts its LOG_AUTH type logs. The tricky bit is that Linux systems use LOG_AUTHPRIV for this instead of LOG_AUTH. So, let's first check out syslog.conf and find out where these logs are going:

# grep auth /etc/syslog.conf
# Don't log private authentication messages!
*.info;mail.none;news.none;authpriv.none;cron.none /var/log/messages
# The authpriv file has restricted access.
authpriv.* /var/log/secure

So on this system, the authpriv.* stuff is ending up in /var/log/secure. Note that the line that talks about /var/log/messages above explicitly prevents authpriv logs from ending up in the messages file-- that's what the "authpriv.none" bit is about.

To go along with Tim and Ed's example of looking for account lockout events, I set up a test system and deliberately failed logging in to activate the account lockout feature. Let's try some obvious grep searches on /var/log/secure:

# grep lock /var/log/secure
# grep hal /var/log/secure
Oct 26 17:19:03 deer sshd[10960]: Failed password for hal from port 45903 ssh2
Oct 26 17:19:07 deer sshd[10960]: Failed password for hal from port 45903 ssh2
Oct 26 17:19:11 deer sshd[10960]: Failed password for hal from port 45903 ssh2
Oct 26 17:19:11 deer sshd[10960]: PAM 2 more authentication failures; logname= uid=0 euid=0 tty=ssh ruser= rhost=associates.deer-run.com user=hal
Oct 26 17:19:18 deer sshd[10962]: pam_tally(sshd:auth): user hal (500) tally 4, deny 3
Oct 26 17:19:20 deer sshd[10962]: Failed password for hal from port 48941 ssh2
Oct 26 17:20:17 deer sshd[10962]: pam_tally(sshd:auth): user hal (500) tally 5, deny 3

Yes, that's right: the account lockout messages in Linux don't actually contain the string "lock" anywhere in the message, which is more than a little annoying. Unless you're searching for the messages related to a specific user, as we're doing here, you need to know to search for the string "pam_tally" to find the account lockout events; pam_tally is the Linux PAM module that handles account lockout on failure. Yes, this is extremely non-obvious. Sorry about that -- nobody asked for my opinion when pam_tally was being developed.
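In practice, then, the search is simply "grep pam_tally /var/log/secure". Here's a self-contained sketch of that search, fed sample lines in the same format as the output above (re-created with printf so you can try it anywhere):

```shell
# Search for account-lockout events by the PAM module name rather than the
# word "lock". The sample lines mimic /var/log/secure entries shown above;
# on a real system you would run:  grep pam_tally /var/log/secure
printf '%s\n' \
  'Oct 26 17:19:18 deer sshd[10962]: pam_tally(sshd:auth): user hal (500) tally 4, deny 3' \
  'Oct 26 17:19:20 deer sshd[10962]: Failed password for hal from port 48941 ssh2' \
  | grep pam_tally
```

Only the pam_tally line survives the filter; the plain "Failed password" line does not.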

By the way, I elided out some other login messages from /var/log/secure in the example above so that it would be easier to see what the failure messages looked like. But here is the rest of the grep output so that you can see some successful authentication logs:

# grep hal /var/log/secure
Oct 26 15:23:00 deer sshd[9508]: Accepted password for hal from port 27691 ssh2
Oct 26 15:23:00 deer sshd[9508]: pam_unix(sshd:session): session opened for user hal by (uid=0)
Oct 26 15:27:07 deer sshd[9508]: pam_unix(sshd:session): session closed for user hal
Oct 26 15:52:27 deer sshd[10667]: Accepted password for hal from port 20043 ssh2
Oct 26 15:52:27 deer sshd[10667]: pam_unix(sshd:session): session opened for user hal by (uid=0)
Oct 26 17:17:30 deer su: pam_unix(su:session): session opened for user root by hal(uid=500)

As you can see, grep-ing for "pam_unix" (yet another Linux PAM module) will get you not only log in and log out events, but even su attempts. But those logs don't show you the remote IP address that the user is connecting in from-- you'll need to look for the "Accepted password" lines for that. Are we having fun yet?

You may find it easier to just use the last command:

# last
hal pts/0 Mon Oct 26 15:52 still logged in
hal pts/0 Mon Oct 26 15:23 - 15:27 (00:04)

Of course last only shows you successful logins. On Linux systems, lastb will show you failed logins:

# lastb
hal ssh:notty Mon Oct 26 17:19 - 17:19 (00:00)
hal ssh:notty Mon Oct 26 17:19 - 17:19 (00:00)
hal ssh:notty Mon Oct 26 17:19 - 17:19 (00:00)
hal ssh:notty Mon Oct 26 17:19 - 17:19 (00:00)
hal ssh:notty Mon Oct 19 10:56 - 10:56 (00:00)

Now let's talk about service start-up events. Actually, let's not. It turns out that there is no consistently logged record of when particular services are restarted on the system. Oh, the daemon itself may choose to log some start-up events, but there's no global system-level logging of these kind of events.

Finding the logs created by a given daemon presents another problem. Often Unix daemons will log to LOG_DAEMON, so you can look at /etc/syslog.conf and find out where these logs end up (hint: it's /var/log/messages on a typical Linux system). But there will always be oddball daemons like Apache with their own application-specific log files (/var/log/httpd/* on Red Hat). It's a mess.

Maybe the easiest way to figure out when a daemon was started is to just look at the output of ps:

# ps -ef | grep http
root 11253 1 0 18:05 ? 00:00:00 /usr/sbin/httpd
apache 11255 11253 0 18:05 ? 00:00:00 /usr/sbin/httpd
apache 11256 11253 0 18:05 ? 00:00:00 /usr/sbin/httpd
apache 11257 11253 0 18:05 ? 00:00:00 /usr/sbin/httpd
apache 11258 11253 0 18:05 ? 00:00:00 /usr/sbin/httpd
apache 11259 11253 0 18:05 ? 00:00:00 /usr/sbin/httpd
# ps -ef | grep sshd
root 5725 1 0 Jul27 ? 00:00:00 /usr/sbin/sshd
root 11032 5725 0 17:27 ? 00:00:00 sshd: hal [priv]
hal 11034 11032 0 17:27 ? 00:00:00 sshd: hal@pts/2

As you can see, the web server was started at 18:05 today. But all we know about the ssh server is that it was started sometime on Jul 27.
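If your ps supports the lstart and etime format specifiers (procps on Linux does, as do most modern Unix ps implementations), you can recover a full start timestamp even when the STIME column has collapsed to a bare date. A sketch, using the current shell's PID in place of the sshd PID from the listing above:

```shell
# lstart prints the full start timestamp and etime the elapsed time since
# start. The trailing '=' after each specifier suppresses the header line.
# $$ (this shell's own PID) stands in for a daemon PID such as sshd's 5725.
ps -o lstart= -o etime= -p $$
```

Run against PID 5725 on the box above, this would tell us exactly when sshd came up on Jul 27.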

At this point you're probably asking yourself, "Why is this all so difficult?" Historically, Unix has always left logging up to the discretion of the application developer. There isn't a single central auditing service (or consistent formatting requirements) for applications in general. If you're working on a system that enforces kernel-level auditing (e.g. enabling BSM under Solaris) then you can pull these sorts of events out of the kernel audit logs if you know what you're doing. But kernel-level auditing is an optional feature, and there are still plenty of sites out there that don't enable it. So we often just have to take what the app developers decide to give us, even when it's not very much.

Tuesday, October 20, 2009

Episode #65: Feeling Loopy

Ed is back in the saddle again:

Well, I'm back from my adventures on the other side of Planet Earth. Many thanks to Tim Medin for holding down the Windows fort while I was away. He really did an awesome job sparring with Hal! In fact, Tim was so good that we're going to have him work as a regular contributor here, adding his thoughts from a PowerShell perspective to each episode. Before now, he'd throw in his insights on occasion, but now he's a regular -- our very own Command Line Kung Fu blog FNG, if you will. Oh, and Tim... don't forget to empty the wastebaskets and scrub the bathroom floor before you leave tonight. No, you don't have to wear the maid's outfit. Hal just thought you'd like it.

Anyway, where was I? Oh yeah, writing a new episode for this week.

Faithful readers (yes, both of you) know that we often use various kinds of loops in the commands we construct here. Individual commands are certainly powerful, but to really mess up your computer, it's helpful to have _iteration_: repeating a process to do the same thing again and again, with some subtle variations. If you look at our episodes, I think about 80% of them actually use some sort of loop. And that got me thinking. I have an intuitive feel for what kinds of loops are available in cmd.exe and when to use each kind. But, I'd like to learn more about the looping options within bash and PowerShell, and what specific uses are best for each kind of loop. So, I figured the easiest way for me to learn about bash and PowerShell looping was to throw down some cmd.exe options, and invite my Kung Fu partners to respond in kind. I'll show you mine... if you show me yours. So, here goes.

In cmd.exe, we really have just one command that implements loops: FOR. Sadly, we don't have a WHILE. I'm not going to talk about GOTO, which we do have, but it is for scripts and not for individual commands, the relentless focus of our blog. Within the FOR command, however, we have numerous different kinds of looping options. Let me explain each, and talk about what it's most useful for. Depending on how you count, there are 5 or 6 different kinds of FOR loops! (The 5 versus 6 depends on whether you consider a FOR /R, a FOR /D, and a FOR /D /R to be two or three different kinds of loops.) What was Microsoft thinking? Well, when you only have a hammer, the whole world looks like a nail... and with our FOR loops in cmd.exe, we can attack many different types of problems.

Note that each loop has a similar structure: a FOR statement, an iterator variable, the IN component, a (set) that describes what we iterate over (and is always included inside of parentheses ()), a DO clause, and a command for our iteration.

FOR /L loops: These are iterating counters, working their way through integers. Sorry, but they don't work through fractions, letters, or words.... just integers. Their syntax is:

C:\> FOR /L %[var] in ([start],[step],[stop]) do [command]

The %[var] is the iterator variable, a value that will change at each iteration through the loop. You can use any one letter of the alphabet for this variable, such as %a or %i. Most people use %i as the canonical variable, unless there is a specific reason to use something else. Also, note that %i and %I are different variables, which gives us a total of 52 possible different letters, the upper case and lower case sets.

So, if you want to count from 1 to 100, you could run:

C:\> FOR /L %i in (1,1,100) do @echo %i

Or, if you want a loop that'll run forever, you start counting at 1, count in steps of zero, and count all the way to 2:

C:\> FOR /L %i in (1,0,2) do @echo Infinite Loop

FOR /L loops are useful any time you have to count (obviously) but also any time you need the equivalent of a "while (1)" loop to run forever.

I covered FOR /L loops first, because they are both very easy and very useful, and I wanted to set them aside before we start covering loops that iterate over objects in the directory structure, namely FOR, FOR /D, FOR /R, and FOR /R /D.

Plain ol' FOR loops: These loops iterate over files, with the iterator variable taking on the value of the names of files you specify in the (set). For example, to list all .ini files inside of c:\windows, you could run:

C:\> FOR %i in (c:\windows\*.ini) do @echo %i

It's a little-known fact that the (set) in these file/directory FOR loops can have a space-separated list of file specifiers, so you could get all of the .ini files in c:\windows\*.ini and c:\windows\system32\*.ini by just running:

C:\> FOR %i in (c:\windows\*.ini c:\windows\system32\*.ini) do @echo %i

Now, you might think, "Dude... I can do that same thing with the dir command" and you'd be right. But, there is another aspect of file-iterating FOR loops that gives us more flexibility than the dir command. By using variations of the iterator variable, we can get other information about files, including their size, their date/time, their attributes, and whatnot. Access to these items is available via:

%~fi - expands %i to a fully qualified path name
%~di - expands %i to a drive letter only
%~pi - expands %i to a path only
%~ni - expands %i to a file name only
%~xi - expands %i to a file extension only
%~si - expanded path contains short names only
%~ai - expands %i to the file attributes of the file
%~ti - expands %i to the date/time of the file
%~zi - expands %i to the size of the file

So, we could list the file's name, attributes, and size by running:

C:\> FOR %i in (c:\windows\*.ini) do @echo %i %~ai %~zi

FOR /D loops: These loops iterate through directories instead of files. So, if you want all directory names inside of c:\windows, you could run:

C:\> FOR /D %i in (c:\windows\*) do @echo %i

FOR /R loops: Ahhh... but you may have noted that neither the plain ol' FOR loops nor the FOR /D loops listed above actually recurse through the directory structure. To make them do that, you'd need to do a /R. The FOR /R loop has a slightly different syntax, though, in that we need to specify a path before the iterator variable to tell it where to start recursion. By itself, FOR /R recurses the directory structure, pulling out files names:

C:\> FOR /R c:\windows %i in (*.ini) do @echo %i

That one will go through c:\windows and find all .ini files, displaying their names.

Now, what if you want just directories and not files? Well, you do a FOR /D with a /R, as follows:

C:\> FOR /D /R c:\windows %i in (*) do @echo %i

This will list all directories inside of c:\windows and its subdirectories.

And that leaves us with the most complex kind of FOR loop in all of Windows.

FOR /F loops: These loops iterate through... uhhh... stuff. Yeah, stuff. The syntax is:

C:\> FOR /F ["options"] %[var] IN (stuff) DO [command]

The stuff can be all manner of things. If the (stuff) has no special punctuation around it, it's interpreted as a file set. But, the file set will be iterated over in a different manner than what we saw with plain ol' FOR loop and even FOR /R loops. With FOR /F, you'll actually iterate over each line of the _contents_ of every file in the file set! The iterator variable will take on the value of the line, which you can then do all kinds of funky stuff with, searching for specific text, parsing it out, using it as a password, etc.

If we specify the stuff with double quotes, as in ("stuff"), the FOR /F loop will interpret it as a string, which we can then parse.

If we specify the stuff with single quotes, as in ('stuff'), the FOR /F loop will interpret stuff as a command, and run the command, iterating on each line of output from the command.

Regardless of the stuff (whether it be files, a string, or a command), we can parse the iterator variable using those "options" in the FOR /F loop. I covered that parsing in more detail in Episode #48, Parse-a-palooza, and I won't repeat it here. There are also some examples of FOR /F in action there.

Suffice it to say, though, that if you master each of these FOR loops, you are rockin' and rollin' at the cmd.exe command line!

Tim, reporting for duty, Sirs!

After washing Ed's car and mowing Hal's lawn, they sent me on a hunt to find a strings command in the standard Windows shell. I haven't found it yet, but I'll keep looking after I finish painting. Anyway, back to the hazing, er, episode.

PowerShell also has five or six different types of loops. The difference is that they aren't all named FOR, and we do have the While loop. The available loop types are:
While
Do While
Do Until
For
ForEach-Object (& ForEach statement)

The first three loops are very similar so I'll cover them together. Also, since you are reading a blog such as this, I'll assume you have at least a fundamental understanding of programming and control flow, so I won't go into great depth on the basics.

While, Do While, and Do Until loops

Do While Loop
do {code block} while (condition)

Executes the code block, then checks the condition and repeats "while" it is true (a post-test loop).

While Loop
while (condition) {code block}

Same as above, except the condition is checked before the block is executed. This control structure is also known as a pre-test loop.

Do Until Loop
do {code block} until (condition)

Executes "until" the condition is true. In other words it runs while the condition value is False.

These loops are much more commonly used in scripts and not in one-liner commands. However, I use the following command to beep when a host goes down (drops four pings).

PS C:\> do {ping targethost} while ($?); write-host `a

...and this command to let me know when a host comes back up (four successful pings in a row)

PS C:\> do {ping targethost} until ($?); write-host `a

The $? variable contains a boolean value which represents the result status of the previous command. A true value indicates the command completed successfully. The first loop continues to run while the ping command result is successful. The second loops runs until the ping command is successful. After exiting either loop the write-host `a command produces the beep. Note, the `a uses a back quote, not the standard single quote.
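The same pattern carries over to bash, where $? plays the identical role. Here's a self-contained sketch (with a stand-in probe function instead of ping, so no real host name is needed):

```shell
# Stand-in for a ping check: fails twice, then succeeds, simulating
# a host that comes back up on the third probe.
tries=0
probe() { tries=$((tries + 1)); [ "$tries" -ge 3 ]; }

# Equivalent of: do {ping} until ($?) -- loop until the probe succeeds
until probe; do sleep 0; done
printf 'host up after %d probes\a\n' "$tries"
```

As in the PowerShell version, the loop's condition is just the exit status of the last command run inside it.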

For loop
The standard use of the For statement is to run the code block a specified number of times.

for (initialization; condition; repeat) {code block}

If we wanted to count to 100 by 2's we could use this command.
PS C:\> for ($a=2; $a -le 100; $a=$a+2) {echo $a}

So far nothing new, but now it gets cool.

ForEach-Object is a looping cmdlet that executes in the pipeline and uses $_ to reference the current object. The ForEach-Object cmdlet is the most powerful and most commonly used loop in PowerShell. It is used so much that it is given the single character alias %. Here is the typical syntax of the ForEach-Object cmdlet:

... | ForEach-Object { script block } ...

Let's use it to view the contents of all the files in the current directory:

PS C:\> Get-ChildItem | ForEach-Object { Get-Content $_ } 

Shorter versions using built-in aliases:
PS C:\> dir | % { gc $_ } 
PS C:\> gci | % { gc $_ }

This command gets the files in the current directory using Get-ChildItem. Within our script block the current file is referenced by $_, the current pipeline variable. In our script block, denoted with the curly braces "{}", we call the Get-Content cmdlet on the current file. The loop automatically handles iterating through the objects passed down the pipeline and we get the contents of all the files.

With the addition of PowerShell to the regularly scheduled programming, you will see the ForEach cmdlet used regularly in the coming weeks.

The ForEach statement is very similar to the ForEach-Object. The differences are formatting, performance, and memory utilization.

The formatting is different, but not so much different that it should be confusing.

ForEach ($item in $collection) {command_block}

If we rewrote the example above using the ForEach statement this is how it would look:

PS C:\> ForEach ($f in Get-ChildItem) { Get-Content $f }

Not a huge difference. The big difference comes with the resource usage. ForEach will load the entire collection into memory before executing the script block, and it is usually a bit faster if it doesn't have to load something too large. Conversely, the ForEach-Object cmdlet will process each object as it receives it.

If we use each method to multiply the numbers from 1 to 100,000 by 2, we can see that the ForEach statement is 30 times faster. In short, the reason for the speed difference is that the ForEach is run as a single function instead of three or more functions.

PS C:\> Measure-Command { 1..100000 | %{$_*2} } |
select TotalMilliseconds


PS C:\> Measure-Command { foreach ($i in (1..100000) ){$i*2} } |
select TotalMilliseconds


This difference is much less noticeable when there are other factors involved, such as disk access, rather than just pure computing power. Here is a similar test when accessing the Windows Security Event Log.

PS C:\> measure-command {get-eventlog -logname security | 
% {echo $_.eventid}} | select TotalMilliseconds


PS C:\> measure-command {foreach ($i in get-eventlog -logname
security) { echo $i.eventid}} | select TotalMilliseconds


I use ForEach-Object with the Get-EventLog cmdlet so my results are displayed as soon as they are processed and the time difference isn't as great. Personally, I think the ForEach-Object is more readable and is much easier to tack on to the end of an existing command.

I look forward to showing more PowerShell tips in the coming weeks. Now back to polishing Hal's car.

Hal finishes up:

Bash looping constructs are actually very simple: there are essentially two different types of for loops plus while loops, and that's it. The most common type of loop in command-line tasks is the simple "for <var> in <list of values> ..." type loop:

for f in *.gz; do
echo ===== $f
zcat $f | grep -i pattern
done

The trick is that the "<list of values>" can be pretty much anything you can imagine, because Unix makes command output substitution so natural. For example, here's one of our previous solutions from Episode #56: Find the Missing JPEG:

for i in $(seq -w 1 1300); do [ ! -f $i.jpg ] && echo $i.jpg; done

You can have the for loop iterate over a directory structure simply by having it iterate over the output of a find command, though usually "find ... -exec ..." or "find ... | xargs ..." suffices instead of a loop. In any event, the ability to do arbitrary command substitution for the list of values the for loop iterates over is why bash only needs a single simple for loop construct rather than separate "for /D", "for /R", etc like Windows does.
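Here's a hedged sketch of both styles against a throwaway directory (the /tmp paths are invented for the demo). The command-substitution form is fine for simple names; the -print0/read form is the safe variant when file names may contain spaces or newlines:

```shell
# Build a tiny demo tree
mkdir -p /tmp/loopdemo/sub
touch /tmp/loopdemo/a.txt /tmp/loopdemo/sub/b.txt

# Simple: iterate over find's output (word-splits on whitespace)
for f in $(find /tmp/loopdemo -name '*.txt'); do echo "got $f"; done

# Safer: NUL-delimited names survive spaces and newlines (bash)
find /tmp/loopdemo -name '*.txt' -print0 |
    while IFS= read -r -d '' f; do echo "safe $f"; done
```
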

Bash does have a C-style for loop for iterating over a series of numbers. For example, here's the alternate solution from Episode #56 that doesn't require the seq command:

for ((i=1; $i <= 1300; i++)); do file=$(printf "%04d.jpg" $i); \
[ ! -f $file ] && echo $file; done

On systems that have seq, I actually find it easier to type "for i in $(seq ...); do ..." than the C-style for loop, but your mileage, as always, may vary.

The other loop construct that bash has is a while loop. The simplest kind of while loop is an infinite loop. For example, there's our first solution in Episode #3 for watching the file count in a directory:

while :; do ls | wc -l; sleep 5; done

The ":" in this context is a special marker in bash that always evaluates to true.

However, you can use any conditional expression in the while loop that you wish. One example is the common idiom for reading data out of a file:

while read l; do ...; done </path/to/some/file

In this case, the read command returns true as long as it is able to read a line from the input file. When EOF is reached, read returns false and the loop terminates.
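For example, a self-contained sketch that parses a colon-delimited file line by line (the file and field names are made up for the demo):

```shell
# Build a small colon-delimited demo file
printf 'moe:1001\nlarry:1002\ncurly:1003\n' > /tmp/users.txt

# read splits each line on IFS and returns false at EOF, ending the loop
while IFS=: read -r name uid; do
    echo "user $name has uid $uid"
done < /tmp/users.txt
```

Setting IFS only for the read command keeps the rest of the shell's word splitting untouched.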

Here's another example with a more general conditional statement at the top of the loop. This little bit of code tries to periodically unmount a busy file system. It will continue to iterate until the umount command actually succeeds and the mount point no longer appears in the output of df:

umount $MOUNTPT
while [[ "X$(df -P $MOUNTPT | grep $MOUNTPT)" != "X" ]]; do
sleep 10
umount $MOUNTPT
done

What a lot of folks don't know is that bash also has an "until" loop. But until loops are really just while loops where the condition has been negated. So we could use an until loop to rewrite the example above very easily:

umount $MOUNTPT
until [[ "X$(df -P $MOUNTPT | grep $MOUNTPT)" = "X" ]]; do
sleep 10
umount $MOUNTPT
done

The only changes are replacing "while" with "until" and "!=" with "=".

There are also other commands in Unix that are essentially implicit iteration operators: find which iterates over a list of directories, xargs which iterates over a list of input values, and sed and awk which iterate over the lines of a file. Very often you can use these operators instead of a traditional for or while loop.
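To make that concrete, here's a small sketch where an explicit for loop and find-plus-xargs do the same job (the /tmp paths are invented for the demo):

```shell
# Demo files
mkdir -p /tmp/iterdemo
touch /tmp/iterdemo/one.log /tmp/iterdemo/two.log

# Explicit loop over a glob
for f in /tmp/iterdemo/*.log; do wc -c "$f"; done

# Implicit iteration: find produces the names, xargs batches them
find /tmp/iterdemo -name '*.log' | xargs wc -c
```

The xargs version also tends to be faster for large file sets, since it invokes wc once per batch of names rather than once per file.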

Tuesday, October 13, 2009

Episode #64: The Times (OK, Dates) They Are a Changing

Hal finds an interesting topic:

Recently Rich Shepard, one of my colleagues on the Portland Linux User Group mailing list, posted an interesting problem. He had a data set with pipe-delimited records like:

1993-1|Water Quality|WVR|Yamhill, City of|Yamhill|Hamlin|Holt|Npv|
NPDES-Waste Discharge Limits|7/6/1993|0

All he wanted to do was convert the date column in the 10th field to YYYY-MM-DD format so that the file could be more easily imported into a relational database. He was curious if there was a simple command-line he could use to accomplish this.

To me, this seemed like a task that was tailor made for awk (apparently Joe Pruett agreed, since his solution on the mailing list was essentially identical to the one I'm presenting here). While awk normally splits fields on whitespace, we can use the "-F" option to specify an alternate delimiter. Once we've got the fields split up, we can work a little magic with the built-in split() and sprintf() operators in awk:

$ awk -F'|' '{split($10,f,"/");
$10=sprintf("%d-%02d-%02d", f[3], f[1], f[2]);
print}' data

1993-1 Water Quality WVR Yamhill, City of Yamhill Hamlin Holt Npv
NPDES-Waste Discharge Limits 1993-07-06 0

The split() function in the example breaks up field #10 on "/" characters and puts the results into the array named "f". Actually the last argument to split() can be a full-on egrep-style regular expression delimited with "/.../". But since we're just splitting on literal slash characters, "/" is a lot easier to type than "/\//".
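You can watch split() work in isolation with a one-liner (the date is arbitrary):

```shell
# split() returns the number of pieces and fills the array f
echo '7/6/1993' |
    awk '{ n = split($0, f, "/");
           printf "%d pieces: y=%s m=%s d=%s\n", n, f[3], f[1], f[2] }'
# prints: 3 pieces: y=1993 m=7 d=6
```
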

Once we have the year, month, and day split into an array, we then replace the contents of the 10th field with the output of the sprintf() routine. This puts our data in the desired format. The final "print" statement outputs all of the fields from the original line, including our reformatted field.

Now you'll notice that the output is space-delimited rather than pipe-delimited. That's because awk's default "output field separator" (OFS for short) is space. You can actually change this by changing the value of the OFS variable. The trick is you need to set variables like this in a "BEGIN" block at the front of your awk code so that the new value is set before you begin processing your input file:

$ awk -F'|' 'BEGIN { OFS="|" }
{split($10,f,"/");
$10=sprintf("%d-%02d-%02d", f[3], f[1], f[2]);
print}' data

1993-1|Water Quality|WVR|Yamhill, City of|Yamhill|Hamlin|Holt|Npv|
NPDES-Waste Discharge Limits|1993-07-06|0

Of course we could set OFS to anything. For example, we could set it to comma to produce CSV files (though there are possible quoting issues if your data contains commas). There are other variables we can set to control awk's splitting behavior. For instance, the "-F" option is equivalent to setting the FS ("field separator") variable. Similarly, there are the RS ("record separator") and ORS ("output record separator") variables, which are normally set to newline since awk operates on a line-at-a-time basis.
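One subtlety worth a quick demo: awk only rebuilds the record with the new OFS when some field is actually assigned, which is why the example above works (it modifies $10 before printing). The assignment "$1 = $1" is the usual idiom to force the rebuild:

```shell
# Without a field assignment, print emits the original record untouched
printf 'a|b|c\n' | awk -F'|' 'BEGIN { OFS="," } { print }'        # prints a|b|c

# Assigning any field (even to itself) rebuilds $0 using OFS
printf 'a|b|c\n' | awk -F'|' 'BEGIN { OFS="," } { $1 = $1; print }'   # prints a,b,c
```
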

Anyway, if your task is chopping up data and dumping it into a different format, awk is always one good tool to reach for. I could have solved this a bit more tersely using Perl, but that would be breaking the rules for this blog. For those of you who are thinking that even my awk code is breaking the "no scripting languages" rule, it is possible to do this with cut instead of awk or sed, but the result is pretty nasty:

$ IFS='|'
$ while read -a F; do
printf -v d "%d-%02d-%02d" \
`echo ${F[9]} | cut -d/ -f3` \
`echo ${F[9]} | cut -d/ -f1` \
`echo ${F[9]} | cut -d/ -f2`;
F[9]=$d;
echo "${F[*]}";
done < data

1993-1|Water Quality|WVR|Yamhill, City of|Yamhill|Hamlin|Holt|Npv|
NPDES-Waste Discharge Limits|1993-07-06|0

"read -a F" splits each line using the delimiter we specified in the IFS variable and assigns the fields to elements of the array named F. Notice, however, that bash indexes its arrays starting at zero (like C programs) while awk starts with one. So the date we're reformatting is in F[9], not F[10].

The real difficulty here is that cut doesn't let us reorder multiple fields in a single command, so we're forced to do three instances of the "echo ... | cut ..." pipeline to get the date fields in the order we want. Another minor annoyance is that "printf -v ..." doesn't let us assign directly to array variables, so I have to use $d as a temporary variable.

It's also worth pointing out that the double quotes in the last echo statement in the loop are significant. If I just wrote "echo ${F[*]}" without the double quotes, then I'd get space-separated output. Using the double quotes causes the output to be delimited with the first character of $IFS (similar to setting OFS in awk).
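That join behavior is easy to verify on its own (bash, with a throwaway array):

```shell
A=(one two three)

IFS='|'
echo "${A[*]}"    # quoted: joined with the first char of IFS -> one|two|three
echo ${A[*]}      # unquoted: re-split on IFS, printed space-separated
unset IFS         # back to default word splitting
```
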

So there you go: an awk solution and a nasty shell-only version. Somehow I think that Tim's Windows solution is going to look even uglier though...

Tim brings the ugly:

First off, the date format of our sample wasn't specified, so I will assume the sample date is July 6th, 1993. My apologies to military and European followers who think the date should be June 7th, 1993.

Linux may have all sorts of different "cool" commands to use, but in the Windows world we use the FOR loop...for everything.

We use our FOR loop to split the fields using the "|" and "/" characters as delimiters. Then all we need to do is rearrange the date parts and put it all back together.

C:\> for /F "tokens=1-14 delims=|/" %a in (c:\file.txt) do @echo
%a^|%b^|%c^|%d^|%e^|%f^|%g^|%h^|%i^|%l-%j-%k^|%m

1993-1|Water Quality|WVR|Yamhill, City of|Yamhill|Hamlin|Holt|Npv|
NPDES-Waste Discharge Limits|1993-7-6|0

Regular readers will remember the FOR loop represents the tokens by using sequential letters of the alphabet. We have chosen %a to represent the first token, so %b will represent the second token, %j the 10th (month), %k the 11th (day), and %l the 12th (year). We recreate the original format by adding the "|" and "-" characters between the rearranged tokens. The problem is, if there is a "/" character in any of the text fields our results will be messed up. If we change "Water Quality" to "Water Quality/Temp" we get these results.

C:\> for /F "tokens=1-14 delims=|/" %a in (c:\file.txt) do @echo
%a^|%b^|%c^|%d^|%e^|%f^|%g^|%h^|%i^|%l-%j-%k^|%m

1993-1|Water Quality|Temp|WVR|Yamhill, City of|Yamhill|Hamlin|Holt|Npv|
6-NPDES-Waste Discharge Limits-7|1993

We need a more robust solution that will only use the "/" character to split the date, but not the rest of the string. How do we do that? Well, if one FOR loop is good, then two must be better.

C:\> for /F "tokens=1-12 delims=|" %a in (c:\file.txt) do @for /F
"tokens=1-3 delims=/" %x in ('echo %j') do
@echo %a^|%b^|%c^|%d^|%e^|%f^|%g^|%h^|%i^|%z-%x-%y^|%k

1993-1|Water Quality|WVR|Yamhill, City of|Yamhill|Hamlin|Holt|Npv|
NPDES-Waste Discharge Limits|1993-7-6|0

The first FOR loop is used to split the string using the "|" character as the delimiter. The second FOR loop is used to split only the date field using the "/" character as a delimiter. The variables can be a little confusing so let's take a deeper look into the second FOR loop.

..for /F "tokens=1-3 delims=/" %x in ('echo %j') do...

This FOR loop operates on the output of "echo %j", which is the entire date field. Using the delims option we slice the date field using the "/" character as our delimiter. The iterator in this loop is %x and it will contain the first token (month). The second and third tokens are represented by %y (day) and %z (year). Finally, we glue it all back together in the order we like using the variables created by both FOR loops.

Some of you detail-oriented folks may have noticed that I neglected one point: the month and day need a leading zero. I ignored this point because this tiny change makes things really ugly. We have to use our old friend "delayed environment variable expansion", which you can read about in Episodes #48, #12, and #46. Since it has been covered so many times I'll skip some of the details for the sake of brevity (ironic, I know). Here is our final result:

C:\> cmd.exe /v:on /c "for /F "tokens=1-12 delims=^|" %a in (c:\file.txt) do
@for /F "tokens=1-3 delims=/" %x in ('echo %j') do @set month=0%x& @set day=0%y&
@echo %a^|%b^|%c^|%d^|%e^|%f^|%g^|%h^|%i^|%z-!month:~-2!-!day:~-2!^|%k"

1993-1|Water Quality|WVR|Yamhill, City of|Yamhill|Hamlin|Holt|Npv|
NPDES-Waste Discharge Limits|1993-07-06|0

That is a big mess and it may be difficult to see where and how the leading zero was added. Using the delayed variable expansion we set the month variable, with a leading zero, like this:

... set month=0%x& ...!month:~-2!....

The variable %x could contain 7 (July) or 11 (November). We set the variable, month, equal to the concatenation of zero and %x. The month variable would contain 07 (July) or 011 (November). Notice when the month variable is set there is no space between the variable (%x) and the "&" character. If we did leave a space then our month variable would contain a trailing space which would later have to be removed. When we echo the month variable we only want the two rightmost characters so July is displayed as 07 and November as 11. The same process is used for the day of the month.
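For comparison, the zero-padding that costs all this delayed-expansion gymnastics in cmd.exe is a single printf in bash (sample values are arbitrary):

```shell
# %02d pads to two digits only when needed
printf '%04d-%02d-%02d\n' 1993 7 6     # prints 1993-07-06
printf '%04d-%02d-%02d\n' 1993 11 25   # prints 1993-11-25
```
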


PowerShell gives us the ability to use regular expressions, which makes everything much easier. We can reformat any date in our file using this command:

PS C:\> Get-Content file.txt | ForEach-Object { $_ -replace
'(\d{1,2})/(\d{1,2})/(\d{4})','$3-$1-$2' }

1993-1|Water Quality|WVR|Yamhill, City of|Yamhill|Hamlin|Holt|Npv|
NPDES-Waste Discharge Limits|1993-7-6|0

The Get-Content cmdlet (alias gc) returns each line of the given file. Using ForEach-Object (alias %) we operate on each line of the file. The "current pipeline object", represented as $_, contains the content of the current line in the file (in our example we only have one line in our file).

Our regular expression search and replace finds the month/day/year and rearranges it as year-month-day. Again we have the problem of adding that pesky leading zero so we need to use a slightly different command.

PS C:\> gc file.txt | % { $_ -replace '(\d{1,2})/(\d{1,2})/(\d{4})',
'$3-0$1-0$2' } | % { $_ -replace '(\d{4}-)\d?(\d{2}-)\d?(\d{2})','$1$2$3'}

1993-1|Water Quality|WVR|Yamhill, City of|Yamhill|Hamlin|Holt|Npv|
NPDES-Waste Discharge Limits|1993-07-06|0

We use two replace commands in order to add our leading zero. The first replace command adds a leading zero and rearranges our month/day/year, resulting in year-0month-0day. The second command removes the leading zeros if they are unnecessary.

Tuesday, October 6, 2009

Episode #63: Death To Users!

Tim kicks it off:

Last week we discussed ways to determine who is logged in to a system. Now what? Well, what is the most fun thing to do with that information? Of course, kick those users off the system.

Windows has two commands we can use, logoff and rwinsta (Reset WINdows STAtion), and both do the same thing. Both commands require either a session name or session id, and both accept a /server option. How do we get the session name or id? Last week's post explained how to use qwinsta to get that info.

I didn't cover it last week, but there is another command, query session, that gives the same output as qwinsta. It has an undocumented switch, /sm, which returns the session id first, making parsing easier. Unfortunately, it isn't available in XP so we will skip it.

C:\> qwinsta /server:Alpha
console shemp 0 Conn wdcon
rdp-tcp 65537 Listen rdpwd
rdp-tcp#5 larry 1 Active wdica
rdp-tcp#6 moe 2 Active wdica
curly 16 Disc wdica

We want to kick Larry and we have four ways to do it.

C:\> logoff /server:Alpha rdp-tcp#5
C:\> logoff /server:Alpha 1
C:\> rwinsta /server:Alpha rdp-tcp#5
C:\> rwinsta /server:Alpha 1

Why stop with just Larry? We want to kick everyone off! What if we try to logoff the listener?

C:\> logoff /server:Alpha rdp-tcp
If you reset this session, all users using this protocol will be logged off,
Continue (n=no)? y
C:\> qwinsta /server:Alpha
console shemp 0 Conn wdcon
rdp-tcp 65537 Listen rdpwd
larry 1 Disc wdica
moe 2 Disc wdica
curly 16 Disc wdica

We disconnected the users, but we didn't kill their sessions. The behavior differs depending on whether a listener or an active/disconnected session is specified. Unfortunately, rwinsta acts the same way, so that won't help. So how do we kill the sessions? We will need to get the session ids and log off each one.

C:\> for /F "tokens=2,3" %i in ('qwinsta /server:xen03 ^| findstr  "Active Disc"') do
@echo %i | findstr /v "[a-z]" && logoff /server:xen03 %i || logoff /server:xen03 %j

I'll just explain the differences since you can get most of the details from the last post. Previously, we worked with tokens 1 and 2 in order to find the username. This week we want token 2 or 3 in order to get the session id. Remember, a space is the default delimiter and is therefore ignored at the beginning of the line. The first token is either the session name or the username, the second token is either the username or session id.

Now let's look at the logic used to find the session id and ultimately logoff the user. Stealing from Episode 47 we use "the shorthand [command1] && [command2] || [command3]. If command1 succeeds, command2 will run. If command1 has an error, command3 will execute." In our example, command1 looks at the variable %i to ensure it does NOT contain a letter (and is therefore a number). If %i is determined to be a number (session id), then we use it to logoff the user, if %i is not a number then %j is our session id and is used to logoff the user.
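One caveat on that idiom, easy to check in any Bourne-style shell as well: the || branch also fires when command1 succeeds but command2 then fails, so it isn't a true if/else:

```shell
# "false" fails after "true" succeeds, so the || branch still runs
true && false || echo "fallback ran anyway"   # prints: fallback ran anyway
```

In the logoff example above this is harmless, since the second logoff just fails against an already-cleared session, but it's worth knowing when you reuse the shorthand elsewhere.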

So now we have everyone off the server. Now we can take it offline and install Windows 2008 R2 (unless you're Hal).

Hal weighs in:

Just for that crack, Tim, I'm not going to invite you to my Windows 7 release party...

In general, to kick a user off a Unix system you need to either kill their command shell or one of the ancestors of that process-- like the SSH daemon that spawned their shell or the X server that's supporting their windows. Back in Episode 22 I noted that the pkill command was very handy for this, so let's review:

# pkill -u hal         # kicks hal off the system by killing all hal-owned procs
# pkill -P 1 sshd # kicks all SSH logins off system, master daemon still running
# pkill X # terminate GUI session on console

The above commands are all fairly indiscriminate. For example, the first command kicks a single user off the system by terminating all processes owned by that user. This includes not only their command shells, but also any other jobs that user might have running. That might not be the best idea if the user had a legitimate but long-running job that shouldn't have been terminated.

However, pkill also lets us be more selective. For example, "pkill -u hal bash" would kill only the bash command shells running as user hal. Actually bash apparently traps SIGTERM, so we need to explicitly use SIGKILL:

# pkill -9 -u hal bash

The other, less discriminating versions of the pkill command I showed you earlier work without the "-9" because they're terminating ancestor processes of users' shells, which forces those shells to exit.

Another approach is to terminate only the processes associated with a login session on a particular tty. The who command will show us the tty associated with each logged in user, and we can use "pkill -t ..." to terminate only processes associated with that pty:

# who
moe pts/0 2009-10-03 07:51 (host1.deer-run.com)
larry pts/2 2009-10-03 07:51 (host2.deer-run.com)
hal pts/3 2009-10-03 07:52 (host3.deer-run.com)
# pkill -9 -t pts/3

By the way, the "-t" option is not unique to pkill: many other Unix commands allow you to get information for a single tty. For example, on older systems that don't have pkill I can use "ps -t ..." to do something similar:

# who
moe pts/0 2009-10-03 07:51 (host1.deer-run.com)
larry pts/2 2009-10-03 07:51 (host2.deer-run.com)
hal pts/3 2009-10-03 08:06 (host3.deer-run.com)
# ps -t pts/3
1511 pts/3 00:00:00 bash
# kill -9 1511

Similarly, the other pkill variants I've shown you in this Episode can be accomplished with a combination of ps, grep, and kill if you happen to have a Unix machine without pkill installed.
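Here's a sketch of that pkill-less fallback, using a disposable sleep process as the target so it's safe to run (the "sleep 300" pattern is just this demo's marker):

```shell
# Start a throwaway process to act as the victim
sleep 300 &
victim=$!

# The classic idiom: list, filter (dropping the grep itself), pull the
# PID column, and kill. Column 2 of "ps -ef" is the PID.
# -r tells GNU xargs to skip kill entirely if nothing matched.
ps -ef | grep 'sleep 300' | grep -v grep | awk '{print $2}' | xargs -r kill -9

wait "$victim" 2>/dev/null || true   # reap it so the PID is really gone
kill -0 "$victim" 2>/dev/null && echo "still alive" || echo "terminated"
```
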