Tuesday, November 17, 2009

Episode #69: Destroy All Connections

Ed looks out on the serene waters of Tokyo Bay:

Mr. Byte Bucket sent in a request from the ever insightful Pauldotcom IRC channel:

Can anyone suggest a Windows cmd to disconnect a specific socket?

Nice question! Unfortunately, Windows doesn't offer much in the way of built-in tools that are fine grained enough to operate at the socket level. But, we can either restart the service handling the connection, or, if you want to be a little more violent, just kill its service.

We'll start by taking the violent rout. First off, we need to figure out what processid is associated with the connection you seek. You can get a list of all connections to a given port using the following command:

C:\> netstat -nao | find ":[port]"

The right-most (i.e., fifth) column is the processid number.

If you have multiple different clients clients connected to that same destination port, you can select out the given process that is associated with a specific client connection using:

C:\> netstat -nao | find ":[port]" | find "[ClientIPaddr]"

You can then kill that process using wmic. However, be very careful! That process may be really important, and killing it could end you up in a world of hurt. In fact, a lot of Windows built-in features (such as file sharing and IIS) are all associated with the "System" service (with a PID of 4 or 8 depending on the version of Windows you are using).

C:\> wmic process where processid="[PID]" delete

You can wrap this all up in one very dangerous command... but I really don't recommend doing this. Again, if you inadvertently kill the wrong process, your system could come crashing down around you. I recommend figuring out what process is at issue first, investigating it, and only then killing it.

C:\> for /f "tokens=5" %i in ('netstat -nao ^| find ":[port]" ^| 
find "[ClientIPaddr]"') do @wmic process where processid="%i" delete
Deleting instance \\WILMA\ROOT\CIMV2:Win32_Process.Handle="[PID]"
Instance deletion successful.

Our second approach involves restarting the service associated with the process. Based on the processID information we retrieved above, we can get a list of the associated services by running:

C:\> tasklist /fi "pid eq [PID]" /svc

That command tells us all of the services that are associated with a given process. We'll then have to do a little research to figure out which specific service it is that is handling our given connection. Unfortunately, Windows doesn't give us the ability to map a connection directly to a service, but we must instead map a connection to a process, which we then map to a set of services running from within that process. Now, if there's only one service in that process, we're golden. But, if there are more, we have to do this research step to find out which service it really is that we need to focus on.

Once you've discovered the appropriate service, you can then restart the service using:

C:\> sc stop [service] & sc start [service]

It's kind of annoying, I think, that there is no "restart" option explicitly here, so we have to stop the service and then start it. Not a huge liability, but still a little frustrating.

Tim Sees Something Starting to Stir in the Otherwise Tranquil Waters of the Bay:

PowerShell doesn't (yet) have any cmdlets similar to netstat. The Grand Poobah of PowerShell (@halr9000 on twitter) has a PowerShell script to "objectize" netstat, but that crosses in to scriptland so we will steer clear.

To find the connection and its associated process you will have to refer to the first portion of Ed's section. But once we have the process id we can use PowerShell cmdlets.

So we have the PID, now we can get the process object by using the aptly named command Get-Process. The default method for Get-Process is the process name, so we need to use the Id parameter.

PS C:\> Get-Process -Id 3004
Handles NPM(K) PM(K) WS(K) VM(M) CPU(s) Id ProcessName
------- ------ ----- ----- ----- ------ -- -----------
210 7 5116 13344 61 6.67 3004 Evil


The process can be killed using PowerShell, but as Ed said, "Be very careful!" The Stop-Process cmdlet's default method is the process Id so we aren't required to use Id parameter (but you can if you wish).

PS C:\> Stop-Process 3004


If the offending process is a service, we can't retrieve the service from the process id using by using just Get-Proccess since it doesn't include the Id property. However, we can use wmi in conjunction with Get-Service to "get" the service.

PS C:\> Get-WmiObject win32_service | ? { $_.ProcessID -eq 3004 } | % { Get-Serv
ice $_.Name }


To paraphrase Ed (again), you will have to do some research to since there can be multiple services running from within that process (multiple services be in one process).

Once we find the service that needs a kick, we can stop it using Stop-Service. We also have the ability to restart the service using Restart-Service.

If you felt gutsy you could pipe the command above into Restart-Service or Stop-Service.

PS C:\> Get-WmiObject win32_service | ? { $_.ProcessID -eq 3004 } | % { Get-Serv
ice $_.Name } | Restart-Service


PS C:\> Get-WmiObject win32_service | ? { $_.ProcessID -eq 3004 } | % { Get-Serv
ice $_.Name } | Stop-Service


..or you could do it manually.

PS C:\> Stop-Service [Service]


PS C:\> Restart-Service [Service]


It would be nice if there was a facility in PowerShell to just kill a connection. Maybe we can get that in version 3, but while we wait for the additional capability I get the uneasy feeling Hal is getting ready to squash us like bugs.

And the waters of Tokyo bay begin to boil:

Oh dear. I really am going to have to open my Godzilla-sized can of whup-ass on my poor Windows-using bretheren. But first, let me try to make them feel not so bad by doing a quick netstat-based solution.

On Linux systems at least, netstat has a "-p" option to display process information. Let's take a quick look at some sample output:

# netstat -anp --tcp -4 | grep :22
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 15805/sshd
tcp 0 0 192.168.1.4:60279 192.168.1.2:22 ESTABLISHED 18054/ssh
tcp 0 0 192.168.1.4:32921 192.168.1.2:22 ESTABLISHED 19409/ssh

Here I'm listing all TCP sockets that are using IPv4 ("--tcp -4"), and using "-n" so that the socket numbers are not converted into human-readable names. Anyway, as you can see, the 7th column is "PID/name" (this is what the "-p" option does).

So with the help of awk and cut, we can pull out just the PID of the master SSH daemon:

# netstat -anp --tcp -4 | awk '/:22/ && /LISTEN/ { print $7 }' | cut -f1 -d/
15805

Killing that process just requires appropriate use of backticks:

# kill `netstat -anp --tcp -4 | awk '/:22/ && /LISTEN/ { print $7 }' | cut -f1 -d/`

We could also use cut to pull out the second field, if we wanted to shut down the process using it's init script:

# /etc/init.d/`netstat -anp --tcp -4 | awk '/:22/ && /LISTEN/ { print $7 }' | cut -f2 -d/` stop
bash: /etc/init.d/sshd: No such file or directory

Rats, it's /etc/init.d/ssh on this system, and not /etc/init.d/sshd, but you get the idea.

But really, the 300 foot giant mutated atomic lizard we want to employ here is lsof. You might be aware that we can get lsof to show us just the port 22 stuff with it's "-i" flag:

# lsof -i :22
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
sshd 15805 root 3u IPv6 37028 TCP *:ssh (LISTEN)
sshd 15805 root 4u IPv4 37030 TCP *:ssh (LISTEN)
ssh 18054 hal 3u IPv4 44514 TCP elk.deer-run.com:60279->deer.deer-run.com:ssh (ESTABLISHED)
ssh 19409 hal 3u IPv4 53249 TCP elk.deer-run.com:32921->deer.deer-run.com:ssh (ESTABLISHED)

If we wanted to kill the SSH daemon, we could easily adapt the awk fu from the netstat example to pull out the appropriate PID value and then use backticks to feed this value into the kill command.

But we don't need awk, because lsof has the "-t" ("terse") option, which just spits out the PIDs:

# lsof -t -i :22
15805
18054
19409

The "-t" option is specifically designed so that you can do things like "kill `lsof -t -i :22`". Of course, the problem with doing that is that I'd also end up killing my SSH sessions to the remote machine deer.deer-run.com-- PIDs 18054 and 19409-- which I don't really want to do.

So how do we pull out just the daemon PID? Well looking at the lsof output above, I can see that I want to kill the sshd processes that are listening on port 22 but leave the regular ssh processes alone. I could use the "-c" option to select the PIDs by command name:

# lsof -t -c sshd
15805

But that means I'd have to already know the command name associated with the port in question-- in this case by having already run lsof once to search by port number. And if I knew the command name already, why wouldn't I just use pkill (see Episode #22) instead of lsof?

What I really want is a single command-line using just kill and lsof that lets me reliably destroy the master server process as effectively as Godzilla destroys Tokyo. Luckily, lsof has some more atom-age madness up its sleeve. You see, lsof's "-c" option isn't just limited to simple substring matching: you can use "/.../" to specify a full-on egrep-style regular expression. Because Unix server processes almost always end in "d" (think sshd, httpd, ftpd, and so on), we can construct a similar regular expression to match the daemon process name associated with an arbitrary port number:

# lsof -a -i :22 -c /d$/
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
sshd 15805 root 3u IPv6 37028 TCP *:ssh (LISTEN)
sshd 15805 root 4u IPv4 37030 TCP *:ssh (LISTEN)

The "-a" option means to logically "and" your search criteria together (the default is logical "or" for some strange reason, although this is almost never what you want). So here we're looking for everything using port 22 and where the process name ends with "d".

Adding the "-t" option and some backtick action, we have our final answer:

# kill `lsof -t -a -i :22 -c /d$/`

And with that, I shall swim home to Monster Island and await my next challenge.