[linux] How to convert DOS/Windows newline (CRLF) to Unix newline (LF) in a Bash script?

How can I programmatically (i.e., not using vi) convert DOS/Windows newlines to Unix?

The dos2unix and unix2dos commands are not available on certain systems. How can I emulate these with commands like sed/awk/tr?


I tried sed 's/^M$//' file.txt on OSX as well as several other methods (http://www.thingy-ma-jig.co.uk/blog/25-11-2010/fixing-dos-line-endings or http://hintsforums.macworld.com/archive/index.php/t-125.html). None worked; the file remained unchanged (by the way, Ctrl-V Enter was needed to type the ^M). In the end I used TextWrangler. It's not strictly command line, but it works and it doesn't complain.


tr -d "\r" < file

Take a look at these examples using sed:

# IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
sed 's/.$//'               # assumes that all lines end with CR/LF
sed 's/^M$//'              # in bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//'            # works on ssed, gsed 3.02.80 or higher

# IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format.
sed "s/$/`echo -e \\\r`/"            # command line under ksh
sed 's/$'"/`echo \\\r`/"             # command line under bash
sed "s/$/`echo \\\r`/"               # command line under zsh
sed 's/$/\r/'                        # gsed 3.02.80 or higher

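Use sed -i for in-place conversion, e.g. sed -i 's/..../' file. (BSD/macOS sed requires a backup-suffix argument after -i, which may be empty: sed -i '' 's/..../' file.)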
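Not every sed understands \r or \x0D, and typing Ctrl-V Ctrl-M is awkward inside a script. One way around both is to let printf produce the carriage return and splice it into the sed expression. This is only a minimal sketch; the function name strip_cr is made up for illustration.

```shell
# strip_cr (hypothetical name): read DOS text on stdin, write Unix text on stdout.
strip_cr() {
    # printf emits a literal carriage return; $(...) only strips trailing LFs,
    # so the CR survives in the variable.
    cr=$(printf '\r')
    # Delete a CR only when it is the last character of a line.
    sed "s/${cr}\$//"
}

# usage: strip_cr < dos.txt > unix.txt
```

Because it is a filter, it sidesteps the differences between sed -i implementations; redirect to a new file and rename it afterwards.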


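This worked for me (note that it turns every CR into an LF, so it suits old Mac-style CR-only files; on a CRLF file it leaves an empty line after every line):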

tr "\r" "\n" < sampledata.csv > sampledata2.csv 

I made a function based on the accepted answer so you can convert a file directly, without having to remove and rename an additional file by hand afterwards.

convert-crlf-to-lf() {
    local file="$1"
    # Write the converted text to a temporary sibling file, then replace the
    # original only if tr succeeded (mv overwrites, so no rm is needed).
    tr -d '\015' <"$file" >"$file"2 && mv "$file"2 "$file"
}

Just make sure that, if you have a file like "file1.txt", "file1.txt2" doesn't already exist, or it will be overwritten; I use it as a temporary place to store the converted file.


Super duper easy with PCRE;

As a script, or replace $@ with your files.

#!/usr/bin/env bash
perl -pi -e 's/\r\n/\n/g' -- "$@"

This will overwrite your files in place!

I recommend only doing this with a backup (version control or otherwise)


The solutions posted so far only deal with part of the problem, converting DOS/Windows' CRLF into Unix's LF; the part they're missing is that DOS uses CRLF as a line separator, while Unix uses LF as a line terminator. The difference is that a DOS file (usually) won't have anything after the last line in the file, while a Unix file will. To do the conversion properly, you need to add that final LF (unless the file is zero-length, i.e. has no lines in it at all). My favorite incantation for this (with a little added logic to handle Mac-style CR-separated files, and to leave alone files that are already in Unix format) is a bit of perl:

perl -pe 'if ( s/\r\n?/\n/g ) { $f=1 }; if ( $f || ! $m ) { s/([^\n])\z/$1\n/ }; $m=1' PCfile.txt

Note that this sends the Unixified version of the file to stdout. If you want to replace the file with a Unixified version, add perl's -i flag.
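If you already converted the line endings with one of the simpler commands and only need the missing final LF, a check on the last byte is enough. A minimal sketch in plain sh; the function name ensure_final_newline is made up for illustration, and it assumes an ordinary text file.

```shell
# ensure_final_newline (hypothetical name): append an LF if the file lacks one.
ensure_final_newline() {
    f=$1
    # A zero-length file has no lines, so there is nothing to terminate.
    [ -s "$f" ] || return 0
    # $(...) strips a trailing LF, so a non-empty result means the
    # last byte of the file is something other than LF.
    if [ -n "$(tail -c 1 "$f")" ]; then
        printf '\n' >> "$f"
    fi
}

# usage: ensure_final_newline unix.txt
```

Running it a second time on the same file changes nothing.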


Just install dos2unix; then, to convert a file in place, use

dos2unix <filename>

To output converted text to a different file use

dos2unix -n <input-file> <output-file>

You can install it on Ubuntu or Debian with

sudo apt install dos2unix

or on macOS using homebrew

brew install dos2unix

Doing this with POSIX is tricky:

  • POSIX Sed does not support \r or \15. Even if it did, the in place option -i is not POSIX

  • POSIX Awk does support \r and \15, however the -i inplace option is not POSIX

  • d2u and dos2unix are not POSIX utilities, but ex is

  • POSIX ex does not support \r, \15, \n or \12

To remove carriage returns:

ex -bsc '%!awk "{sub(/\r/,\"\")}1"' -cx file

To add carriage returns:

ex -bsc '%!awk "{sub(/$/,\"\r\")}1"' -cx file

TIMTOWTDI!

perl -pe 's/\r\n/\n/; s/([^\n])\z/$1\n/ if eof' PCfile.txt

Based on @GordonDavisson

One must consider the possibility of [noeol] ...


You can use awk. Set the record separator (RS) to a regexp that matches every possible newline sequence, and set the output record separator (ORS) to the Unix-style newline character. (A regexp RS is an extension found in gawk and mawk; POSIX only guarantees a single-character RS.)

awk 'BEGIN{RS="\r|\n|\r\n|\n\r";ORS="\n"}{print}' windows_or_macos.txt > unix.txt

With bash 4.2 and newer you can use something like this to strip the trailing CR, which only uses bash built-ins:

if [[ "${str: -1}" == $'\r' ]]; then
    str="${str:: -1}"
fi

This problem can be solved with standard tools, but there are sufficiently many traps for the unwary that I recommend you install the flip command, which was written over 20 years ago by Rahul Dhesi, the author of zoo. It does an excellent job converting file formats while, for example, avoiding the inadvertent destruction of binary files, which is a little too easy if you just race around altering every CRLF you see...
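If you stay with standard tools, you can at least put a crude guard in front of the conversion. The sketch below treats any file containing a NUL byte as binary and refuses to touch it. That heuristic is my own assumption and is far simpler than whatever flip does; both function names are made up for illustration.

```shell
# has_nul_byte (hypothetical name): succeed if the file contains a NUL byte.
has_nul_byte() {
    # If deleting NULs leaves the data identical, cmp exits 0 (no NULs),
    # and the leading ! turns that into "false".
    ! tr -d '\000' < "$1" | cmp -s - "$1"
}

# strip_cr_if_text (hypothetical name): print the file without CRs,
# but skip anything that looks binary.
strip_cr_if_text() {
    if has_nul_byte "$1"; then
        echo "skipping (looks binary): $1" >&2
        return 1
    fi
    tr -d '\r' < "$1"
}

# usage: strip_cr_if_text dos.txt > unix.txt
```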


An even simpler awk solution, without writing a real program:

awk -v ORS='\r\n' '1' unix.txt > dos.txt

Technically '1' is your program, because awk requires one when it is given an option.

UPDATE: After revisiting this page for the first time in a long time, I realized that no one has yet posted a solution that uses only shell built-ins, so here is one:

while IFS= read -r line || [ -n "$line" ]; do
    printf '%s\n' "${line%$'\r'}"
done < dos.txt > unix.txt

For Mac OS X, if you have Homebrew installed (http://brew.sh/):

brew install dos2unix

for csv in *.csv; do dos2unix -c mac "${csv}"; done

Make sure you have made copies of the files, as this command modifies them in place. The -c mac option selects Mac conversion mode, which converts old Mac-style CR line endings to LF; for files with DOS CRLF endings, leave it out.


Using AWK you can do:

awk '{ sub("\r$", ""); print }' dos.txt > unix.txt

Using Perl you can do:

perl -pe 's/\r$//' < dos.txt > unix.txt

As an extension to Jonathan Leffler's Unix to DOS solution, to safely convert to DOS when you're unsure of the file's current line endings:

sed '/^M$/! s/$/^M/'

This checks that the line does not already end in CRLF before converting to CRLF.
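The same idea works without typing a literal control-M: strip any CR that is already there, then always add CRLF. A minimal sketch with awk; the function name to_crlf is made up for illustration.

```shell
# to_crlf (hypothetical name): read text on stdin, write it with CRLF endings.
to_crlf() {
    # Remove an existing trailing CR first, so lines that already end in
    # CRLF are not turned into CR CR LF; then emit the line with CRLF.
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}

# usage: to_crlf < mixed.txt > dos.txt
```

Running the output through to_crlf again leaves it unchanged, so it is safe on files with mixed or unknown line endings.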


You can use tr to convert from DOS to Unix; however, you can only do this safely if CR appears in your file only as the first byte of a CRLF byte pair. This is usually the case. You then use:

tr -d '\015' <DOS-file >UNIX-file

Note that the name DOS-file is different from the name UNIX-file; if you try to use the same name twice, you will end up with no data in the file.

You can't do it the other way round (with standard 'tr').

If you know how to enter carriage return into a script (control-V, control-M to enter control-M), then:

sed 's/^M$//'     # DOS to Unix
sed 's/$/^M/'     # Unix to DOS

where the '^M' is the control-M character. You can also use the bash ANSI-C Quoting mechanism to specify the carriage return:

sed $'s/\r$//'     # DOS to Unix
sed $'s/$/\r/'     # Unix to DOS

However, if you're going to have to do this very often (more than once, roughly speaking), it is far more sensible to install the conversion programs (e.g. dos2unix and unix2dos, or perhaps dtou and utod) and use them.

If you need to process entire directories and subdirectories, you can use zip:

zip -r -ll zipfile.zip somedir/
unzip zipfile.zip

This creates a zip archive in which line endings are converted from CRLF to LF. unzip then puts the converted files back in place (asking file by file whether to overwrite; you can answer Yes-to-all). Credit to @vmsnomad for pointing this out.
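If zip is not installed either, find can do the directory walk. This is a rough sketch, assuming mktemp is available (it is not part of POSIX, though it is nearly everywhere); the function name convert_tree and its two arguments are made up for illustration.

```shell
# convert_tree (hypothetical name): strip CRs from every file under a
# directory whose name matches a glob pattern.
convert_tree() {
    dir=$1
    pattern=$2
    find "$dir" -type f -name "$pattern" -exec sh -c '
        for f do
            tmp=$(mktemp) || exit 1
            # Copy the result back with cat so the original file keeps
            # its permissions and hard links.
            tr -d "\r" < "$f" > "$tmp" && cat "$tmp" > "$f"
            rm -f "$tmp"
        done
    ' sh {} +
}

# usage: convert_tree somedir "*.txt"
```

Restricting the pattern to known text extensions is what keeps binary files out of harm's way here, so choose it carefully.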


I just had to ponder that same question (on the Windows side, but equally applicable to Linux). Surprisingly, nobody mentioned a very much automated way of doing CRLF<->LF conversion for text files using the good old zip -ll option (Info-ZIP):

zip -ll textfiles-lf.zip files-with-crlf-eol.*
unzip textfiles-lf.zip 

NOTE: this would create a zip file preserving the original file names but converting the line endings to LF. Then unzip would extract the files as zip'ed, that is with their original names (but with LF-endings), thus prompting to overwrite the local original files if any.

Relevant excerpt from the zip --help:

zip --help
...
-l   convert LF to CR LF (-ll CR LF to LF)

sed --expression='s/\r$//'

Since the question mentions sed, this is the most straightforward way to use sed to achieve this. The expression removes the carriage return at the end of each line, which is what you need when you go from Windows to Unix. Note that sed hands each line to the expression without its trailing line feed, so a pattern such as 's/\r\n/\n/g' never matches; anchor on \r$ instead (the \r escape needs GNU sed).


Interestingly, in my git-bash on Windows, sed "" did the trick already:

$ echo -e "abc\r" >tst.txt
$ file tst.txt
tst.txt: ASCII text, with CRLF line terminators
$ sed -i "" tst.txt
$ file tst.txt
tst.txt: ASCII text

My guess is that sed ignores them when reading lines from input and always writes unix line endings on output.
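If the file command is not at hand, or you want a number rather than a description, counting the lines that end in a carriage return gives a quick before/after check. A minimal sketch; the function name count_crlf_lines is made up for illustration.

```shell
# count_crlf_lines (hypothetical name): print how many lines end in CR.
count_crlf_lines() {
    # Build a literal carriage return so no \r escape support is needed in grep.
    cr=$(printf '\r')
    # grep -c prints the number of matching lines (0 when there are none;
    # in that case its exit status is non-zero, as usual for grep).
    grep -c "${cr}\$" "$1"
}

# usage: count_crlf_lines tst.txt
```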


If you don't have access to dos2unix, but can read this page, then you can copy/paste dos2unix.py from here.

#!/usr/bin/env python
"""\
convert dos linefeeds (crlf) to unix (lf)
usage: dos2unix.py <input> <output>
"""
import sys

if len(sys.argv[1:]) != 2:
  sys.exit(__doc__)

content = b''
outsize = 0
with open(sys.argv[1], 'rb') as infile:
  content = infile.read()
with open(sys.argv[2], 'wb') as output:
  for line in content.splitlines():
    outsize += len(line) + 1
    # both files are opened in binary mode, so the newline must be bytes
    output.write(line + b'\n')

print("Done. Saved %s bytes." % (len(content)-outsize))

Cross-posted from superuser.


On Linux it's easy to convert ^M (ctrl-M) to *nix newlines (^J) with sed.

It will look something like this on the CLI; there will actually be a line break in the text. However, the \ passes that ^J along to sed:

sed 's/^M/\
/g' < ffmpeg.log > new.log

You get this by using ^V (ctrl-V), ^M (ctrl-M) and \ (backslash) as you type:

sed 's/^V^M/\^V^J/g' < ffmpeg.log > new.log

You can use vim programmatically with the option -c {command} :

Dos to Unix:

vim file.txt -c "set ff=unix" -c ":wq"

Unix to dos:

vim file.txt -c "set ff=dos" -c ":wq"

"set ff=unix/dos" means change fileformat (ff) of the file to Unix/DOS end of line format

":wq" means write file to disk and quit the editor (allowing to use the command in a loop)

