Jan 09, 2014
 

How do I convert between Unix and Windows text files?

The format of Windows and Unix text files differs slightly. In Windows, each line ends with two ASCII characters, a carriage return followed by a line feed, whereas Unix uses only a line feed. As a consequence, some Windows applications will not show the line breaks in Unix-format files. Likewise, Unix programs may display the carriage returns in Windows text files as Ctrl-m ( ^M ) characters at the end of each line.
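To see the difference for yourself, you can dump the bytes of a small sample in each format. This is only an illustrative sketch; the file names sample_unix.txt and sample_win.txt are made up for the example.

```shell
# Create a two-line sample in each format (file names are illustrative).
printf 'one\ntwo\n'     > sample_unix.txt   # Unix: LF only
printf 'one\r\ntwo\r\n' > sample_win.txt    # Windows: CR followed by LF

# od -c prints each byte; a carriage return shows as \r, a line feed as \n.
od -c sample_unix.txt
od -c sample_win.txt

# The Windows file is larger by one byte (the CR) per line.
wc -c < sample_unix.txt
wc -c < sample_win.txt
```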

There are many ways to solve this problem. This document provides instructions for using FTP, unix2dos and dos2unix, tr, awk, Perl, and vi to do the conversion. To use these utilities, the files you are converting must be on a Unix computer.

Note: In the instructions below, replace unixfile.txt with the name of your Unix file, and replace winfile.txt with the Windows filename.

FTP
When using an FTP program to move a text file between Unix and Windows, be sure the file is transferred in ASCII format, so the document is transformed into a text format appropriate for the host. Some FTP programs, especially graphical applications (e.g., Hummingbird FTP), do this automatically. If you are using command line FTP, before you begin the transfer, enter:

ascii
Note: You need to use a client that supports secure FTP to transfer files to and from Indiana University’s central systems. For more, see At IU, what SSH/SFTP clients are supported and where can I get them?

dos2unix and unix2dos
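The utilities dos2unix and unix2dos are available for converting files from the Unix command line. The two-filename form shown here is the traditional (Solaris-style) syntax; the dos2unix package found on most Linux systems instead converts every file named on the command line in place, unless you put -n before the input and output names.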

To convert a Windows file to a Unix file, enter:

dos2unix winfile.txt unixfile.txt

To convert a Unix file to Windows, enter:

unix2dos unixfile.txt winfile.txt
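
On many Linux systems, the dos2unix package converts each file named on the command line in place, and its -n option is needed to write to a separate output file. The sketch below assumes that version of dos2unix; to_unix is a made-up helper name, not part of either utility. It falls back to tr when dos2unix is not installed.

```shell
# to_unix IN OUT: hypothetical helper; writes a Unix-format copy of IN to OUT.
to_unix() {
    if command -v dos2unix >/dev/null 2>&1; then
        # -n (new-file mode) leaves IN untouched and writes the result to OUT.
        dos2unix -n "$1" "$2" >/dev/null 2>&1
    else
        # Fallback when dos2unix is not installed: delete every carriage return.
        tr -d '\r' < "$1" > "$2"
    fi
}

# Usage: to_unix winfile.txt unixfile.txt
```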

tr
You can use tr to remove all carriage returns and Ctrl-z ( ^Z ) characters from a Windows file:

tr -d '\15\32' < winfile.txt > unixfile.txt

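However, you cannot use tr to convert a document from Unix format to Windows, because tr can only delete characters or replace them one for one; it cannot insert a carriage return before each line feed.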
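
For reference, the escapes in the tr command above are octal: \15 is the carriage return and \32 is Ctrl-z. A small check, using a made-up sample in place of a real Windows file, might look like this:

```shell
# Build a sample Windows file that ends with a Ctrl-z (octal 032),
# as some old DOS tools wrote. The content is illustrative only.
printf 'one\r\ntwo\r\n\032' > winfile.txt

# Delete every carriage return (\15) and Ctrl-z (\32).
tr -d '\15\32' < winfile.txt > unixfile.txt

# Inspect the result; only \n line endings should remain.
od -c unixfile.txt
```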

awk
To use awk to convert a Windows file to Unix, enter:

awk '{ sub("\r$", ""); print }' winfile.txt > unixfile.txt

To convert a Unix file to Windows, enter:

awk 'sub("$", "\r")' unixfile.txt > winfile.txt

Older versions of awk do not include the sub function. In such cases, use the same command, but replace awk with gawk or nawk.
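
One way to gain confidence in the two awk commands is a round trip: convert a Unix file to Windows format and back, then compare the result with the original. This is a sketch with made-up sample content, and roundtrip.txt is an illustrative name.

```shell
# Sample Unix file (content is illustrative).
printf 'alpha\nbeta\n' > unixfile.txt

# Unix -> Windows, then Windows -> Unix again.
awk 'sub("$", "\r")' unixfile.txt > winfile.txt
awk '{ sub("\r$", ""); print }' winfile.txt > roundtrip.txt

# cmp is silent and returns success when the two files are byte-identical.
cmp unixfile.txt roundtrip.txt && echo "round trip OK"
```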

Perl
To convert a Windows text file to a Unix text file using Perl, enter:

perl -p -e 's/\r$//' < winfile.txt > unixfile.txt

To convert from a Unix text file to a Windows text file, enter:

perl -p -e 's/\n/\r\n/' < unixfile.txt > winfile.txt

You must use single quotation marks in both commands. They prevent your shell from interpreting the backslashes and the $ inside the Perl expression.

vi
In vi, you can remove carriage return ( ^M ) characters with the following command:

:1,$s/^M//g

Note: To input the ^M character, press Ctrl-v , and then press Enter or return.
In command mode in the vi editor I have also used Ctrl-v followed by Ctrl-M. Together they insert one literal carriage return, which vi displays as ^M (it is a single character, not a caret followed by an M). Example:
:%s/^M//g inside the vi editor to remove the ^M characters.

The same works for ^[ (the Escape character) inside the vi editor: press Ctrl-V and then
Ctrl-[ to enter it. Example find and replace: :%s/^[//g will remove the ^[ characters from
your document in the vi editor.

In vim, use :set ff=unix to convert to Unix or :set ff=dos to convert to Windows, then save the file with :w; the new line endings are applied when the file is written.

Recursive conversion of files

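To recursively convert text files in a directory tree, use dos2unix in combination
with the find and xargs commands.

For instance, to convert all .txt files under the current directory, type:

find . -name '*.txt' -print0 | xargs -0 dos2unix

Quoting '*.txt' keeps the shell from expanding the pattern before find sees it. The -print0 and -0 options, available in the GNU and BSD versions of find and xargs, keep file names that contain spaces intact.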
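
Before converting a whole tree, it may be worth checking which files actually contain carriage returns. A minimal sketch, assuming a POSIX shell; list_cr_files is a hypothetical helper name, not a standard command.

```shell
# list_cr_files [DIR]: print the .txt files under DIR (default: current
# directory) that contain at least one carriage return.
list_cr_files() {
    cr=$(printf '\r')    # a literal CR; command substitution strips only newlines
    # grep -l prints just the names of the files that match.
    find "${1:-.}" -type f -name '*.txt' -exec grep -l "$cr" {} +
}

# Usage: list_cr_files .
```

Using -exec ... {} + passes the file names straight from find to grep, so names with spaces need no special handling.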

Source for most of this document.

Jul 20, 2012
 

As simple and straightforward as they may seem, text files still harbor an opportunity for compatibility problems. Different operating systems have traditionally used different ways to indicate line endings (line breaks). Mac OS has traditionally used the Carriage Return character (ASCII character 13, aka CR or ^M) to indicate line breaks; unix has traditionally used the Line Feed character (ASCII 10, aka LF or ^J). Since Mac OS X derives from both heritages, it winds up using a mix of the two in various contexts. But most command line utilities only understand (and produce) files with unix-style breaks.

Just to make things even more fun, there’s actually a third variant: MS-DOS and its successors use a carriage return followed by a line feed to indicate a line break. Few Macintosh programs will generate such files, but if you need to deal with a file that came from a PC, you’ll probably want to convert it to a more native format on the Mac.

Fortunately, it’s fairly easy to convert between the formats on the command line. Here are some examples:

tr '\r' '\n' <macfile.txt >unixfile.txt
convert the Mac-format file macfile.txt to unix format, and save
the result as unixfile.txt. tr is a program that does character
substitution, and in this case it's simply being used to replace
CR (written \r on the command line) with LF (written \n)
throughout the file.

tr '\r' '\n' <macfile.txt | grep fnord
convert the Mac-format file macfile.txt to unix format, then use
grep to search the file for the word "fnord". (Note: grep doesn't
understand Mac-style line breaks.)

tr '\n' '\r' <unixfile.txt >macfile.txt
convert the unix-format file unixfile.txt to Mac format, and save
the result as macfile.txt.

perl -p -e 's/\r/\n/g' macfile.txt >unixfile.txt
convert the Mac-format file macfile.txt to unix format, and save
the result as unixfile.txt. This is functionally identical to the
first example, but since perl is actually a very general
programming language, it can also do some other useful things...
BTW, the -e means the program will be the next thing on the command
line ('s/\r/\n/g' - perlese for replace all \r's with \n's), and
the -p means do this for each line of the file.

perl -pi -e 's/\r/\n/g' textfile.txt
convert the file textfile.txt from Mac-style (CR) line breaks to
unix-style (LF), and replace the original file with the converted
version (that's what the -i means).

perl -pi -e 's/\r\n?/\n/g' textfile.txt
convert the file textfile.txt from Mac-style (CR) or PC-style
(CRLF) line breaks to unix-style (LF), and replace the original
file.

perl -pi -e 's/\r\n?/\n/g' *.txt
convert all text files (or rather, files with .txt extensions)
in the current directory to unix-style breaks. Note that any that
were already in unix format will not be changed.

perl -pi -e 's/\n/\r/g' textfile.txt
convert the file textfile.txt from unix-style (LF) line breaks to
Mac-style (CR), and replace the original file.
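
Before picking one of the commands above, it can help to check which convention a file uses. The following is a rough sketch, not a definitive test: eol_style is a made-up helper name, and it only counts CR and LF characters, so the "dos" answer is a heuristic based on the two counts being equal.

```shell
# eol_style FILE: guess the line-break convention of FILE by counting
# carriage returns and line feeds (hypothetical helper, heuristic only).
eol_style() {
    # tr -cd keeps only the named character; wc -c counts what is left.
    cr=$(tr -cd '\r' < "$1" | wc -c | tr -d ' ')
    lf=$(tr -cd '\n' < "$1" | wc -c | tr -d ' ')
    if   [ "$cr" -eq 0 ] && [ "$lf" -gt 0 ]; then echo unix
    elif [ "$lf" -eq 0 ] && [ "$cr" -gt 0 ]; then echo mac
    elif [ "$cr" -eq "$lf" ] && [ "$cr" -gt 0 ]; then echo dos
    elif [ "$cr" -eq 0 ]; then echo none      # no line breaks at all
    else echo mixed
    fi
}

# Usage: eol_style textfile.txt
```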

Source