Tuesday, February 17, 2009

Episode #1 - Convert DOS to UNIX

Okay, so we'll start simple here at Command Line Kung Fu. I think many of us run into the problem where we get those funny Windows characters in our files (^M). These are, of course, the carriage returns from Windows line breaks. The simple solution (which should work on OS X and Linux):

 tr '\r' '\n' < file.txt > file.txt


I just had to do this on a file that someone shared with me. Next maybe we'll look at how I chopped up and hacked that file with grep and cut :)

Paul Asadoorian
PaulDotCom

Davide Brini points out that there are a couple of problems with the solution above:

  1. The output redirections here "... <file.txt >file.txt" are going to leave you with an empty file. That's because the shell truncates file.txt for ">file.txt" before tr ever runs, so "<file.txt" has no data left to read. What you need to do instead is write to a different file, something like "... <file.txt >newfile.txt".


  2. Paul's tr command results in double newlines for DOS-formatted files, whose lines typically end with "\r\n": each "\r" is turned into a second "\n". A more correct solution would be "sed 's/\r$//' file.txt >newfile.txt".
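
To make both fixes concrete, here's a minimal sketch that builds a small DOS-formatted file and converts it without clobbering it. The file names and the dos_to_unix helper are made up for illustration. It also swaps sed for "tr -d", since "\r" in a sed pattern is a GNU extension that other seds (the BSD sed on OS X, for example) may not understand. Note that "tr -d" removes every carriage return in the file, not just the ones at line ends, which is fine for ordinary text files.

```shell
# Hypothetical helper: convert DOS line endings in $1, writing the result to $2.
# Writing to a second file avoids the ">file.txt" truncation problem.
dos_to_unix() {
    # tr -d deletes each carriage return instead of turning it into a newline,
    # so "\r\n" becomes "\n" rather than "\n\n".
    tr -d '\r' < "$1" > "$2"
}

# Build a small DOS-formatted sample file (each line ends in \r\n).
printf 'one\r\ntwo\r\n' > sample_dos.txt

dos_to_unix sample_dos.txt sample_unix.txt
```

If you want the converted text to end up under the original name, move the new file over the old one only after the conversion has finished, for example "mv sample_unix.txt sample_dos.txt".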


As Davide also points out, there's the dos2unix utility, an open source tool that's often found on Unix-like OSes.