[sed] Remove non-ASCII characters from CSV

I want to remove all the non-ASCII characters from a file in place.

I found one solution with tr, but I think I then have to write the result back to the file afterwards.

I need to do it in place with relatively good performance.

Any suggestions?

This question is related to sed and awk.

The answer is


sed -i 's/[^[:print:]]//g' FILENAME   # the g flag removes every match on each line, not just the first

Also, because this strips carriage returns along with the other non-printable characters, it acts like dos2unix.
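A quick way to see that effect is to pipe a made-up sample through the same substitution (note that in a UTF-8 locale [:print:] treats accented letters as printable, so they survive; combine with LANG=C, or use a byte range as in the answers below, to strip those as well):

printf 'foo,bar\r\n' | sed 's/[^[:print:]]//g' | od -c   # the trailing \r (octal 015) is gone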


A Perl one-liner will do it: perl -i.bak -pe 's/[^[:ascii:]]//g' <your file>

-i says that the file is edited in place, and a backup of the original is saved with the extension .bak.
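A minimal usage sketch (data.csv is just a placeholder name):

perl -i.bak -pe 's/[^[:ascii:]]//g' data.csv   # data.csv is rewritten in place
ls data.csv data.csv.bak                       # the untouched original is kept as data.csv.bak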


I appreciate the tips I found on this site.

But on my Windows 10 machine, I had to use double quotes for this to work:

sed -i "s/[\d128-\d255]//g" FILENAME

I noticed these things ...

  1. For FILENAME, the entire path\name needs to be quoted. This didn't work -- %TEMP%\"FILENAME". This did -- "%TEMP%\FILENAME" (see the sketch after the list).

  2. sed leaves behind temp files in the current directory, named sed*
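Putting those notes together, a hedged sketch of the working form in cmd.exe (data.csv is a made-up name; the sed expression itself is unchanged from above):

rem the whole expanded path goes inside one pair of double quotes
sed -i "s/[\d128-\d255]//g" "%TEMP%\data.csv"
rem optional cleanup of the sed* temp files from point 2 (assumes nothing else here starts with sed)
del sed*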


awk '{ gsub(/[^]a-zA-Z0-9"!@#$%^&*|_[(){}]/, ""); print }' MYinputfile.txt > pipe_out_to_CONVERTED_FILE.txt

(gsub rather than sub is needed so every occurrence on a line is removed, and the ] is placed right after the ^ so it stays literal inside the bracket expression. Add a space and a comma to the allow-list if those should survive in a CSV.)

As an alternative to sed or perl, you may consider using ed(1) and POSIX character classes.

Note: ed(1) reads the entire file into memory to edit it in place, so for really large files you should use sed -i ... or perl -i ... instead.

# see:
# - http://wiki.bash-hackers.org/doku.php?id=howto:edit-ed
# - http://en.wikipedia.org/wiki/Regular_expression#POSIX_character_classes

# test
echo $'aaa \177 bbb \200 \214 ccc \254 ddd\r\n' > testfile
ed -s testfile <<< $',l'                  # list the file (with escapes) before
# H switches on verbose error messages; the g command deletes every byte that is
# neither graphic, whitespace nor a control character, and wq writes the file back
ed -s testfile <<< $'H\ng/[^[:graph:][:space:][:cntrl:]]/s///g\nwq'
ed -s testfile <<< $',l'                  # list the file again after the cleanup

I tried all the solutions and nothing worked. The following, however, does:

tr -cd '\11\12\15\40-\176'   # keep only tab (\11), newline (\12), carriage return (\15) and printable ASCII (\40-\176); delete everything else

Which I found here:

https://alvinalexander.com/blog/post/linux-unix/how-remove-non-printable-ascii-characters-file-unix

My problem needed it in a series of piped programs rather than reading directly from a file, so modify as needed; a sketch of the file-based variant follows.
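For the in-place case from the original question, a minimal sketch using a temporary file (the file names are placeholders; tr itself cannot rewrite a file in place):

tr -cd '\11\12\15\40-\176' < input.csv > input.csv.tmp && mv input.csv.tmp input.csv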


I'm using a very minimal busybox system, in which there is no support for ranges in tr or for POSIX character classes, so I have to do it the crappy old-fashioned way. Here's the solution with sed, stripping all non-printable and non-ASCII characters from the file:

sed -i 's/[^]a-zA-Z 0-9`~!@#$%^&*()_+[\\{}|;'\'':",.\/<>?]//g' FILE

(The ] is listed first, right after the ^, so it stays literal inside the bracket expression. Note that the allow-list contains no - or =, so those characters are stripped too; add them, with the - last before the closing ], if you need to keep them.)

Try tr instead of sed

tr -cd '[:print:]' < file.txt
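One caveat: [:print:] does not include newlines or tabs, so this also deletes line breaks and joins the whole file onto one line. A hedged variant that keeps them, using the same file.txt:

tr -cd '[:print:]\n\t' < file.txt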

This worked for me:

sed -i 's/[^[:print:]]//g'

# -i edits the file in place

LANG=C sed -i -E "s|[\d128-\d255]||g" /path/to/file(s)

The LANG=C part is there to avoid an "Invalid collation character" error.

Based on Ivan's answer and Patrick's comment.
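If you want to double-check the result afterwards, a quick hedged sketch (it assumes GNU grep built with -P/PCRE support, and the file name is a placeholder):

LC_ALL=C grep -nP '[\x80-\xff]' /path/to/file || echo "no non-ASCII bytes left"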