The answer of @PaulDixon, which originally removed the printable extended ASCII characters 128-255, has been partially corrected. I don't know why he still wants to delete 128-255 from the 7-bit ASCII set, which has only 128 characters (0-127) and therefore no extended ASCII characters.
But in the end it was important not to delete 128-255 because, for example, chr(128) (\x80) is the euro sign in Windows-1252 (often loosely called 8-bit ASCII), and according to my own tests many fonts on Windows and Android display it as a euro sign.
And it will kill many UTF-8 characters if you remove the bytes 128-255 from a UTF-8 string, because every byte of a multi-byte UTF-8 character (lead bytes and continuation bytes alike) lies in that range. So don't do that! They are completely legal characters in all currently used file systems. The only reserved range is 0-31.
Instead use this to delete the non-printable characters 0-31 and 127:
$string = preg_replace('/[\x00-\x1F\x7F]/', '', $string);
It works in ASCII and UTF-8 because both share the same control set range.
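A minimal sketch of why this is safe for UTF-8 input: bytes 0-127 only ever appear in UTF-8 as single-byte characters, never inside a multi-byte sequence, so the byte-wise pattern (no /u modifier) cannot cut a character in half. The function name strip_control_chars is hypothetical and only wraps the one-liner above; the euro sign is written as its UTF-8 bytes E2 82 AC so the example does not depend on the file encoding.

```php
<?php
// Hypothetical helper: wraps the preg_replace() call from the answer.
function strip_control_chars(string $s): string
{
    // Removes bytes 0-31 and 127; all other bytes are left untouched.
    return preg_replace('/[\x00-\x1F\x7F]/', '', $s);
}

// "Price: 5 €" with a NUL, a unit separator and a DEL mixed in.
$input = "Price:\x00 5\x1F \xE2\x82\xAC\x7F";

$clean = strip_control_chars($input);
var_dump($clean === "Price: 5 \xE2\x82\xAC"); // bool(true), euro sign intact

// For contrast: stripping bytes 128-255 deletes all three bytes of the euro sign.
$broken = preg_replace('/[\x80-\xFF]/', '', $clean);
var_dump($broken === "Price: 5 ");            // bool(true), euro sign gone
```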
A slower¹ alternative without using regular expressions:
$string = str_replace(array(
// control characters
chr(0), chr(1), chr(2), chr(3), chr(4), chr(5), chr(6), chr(7), chr(8), chr(9), chr(10),
chr(11), chr(12), chr(13), chr(14), chr(15), chr(16), chr(17), chr(18), chr(19), chr(20),
chr(21), chr(22), chr(23), chr(24), chr(25), chr(26), chr(27), chr(28), chr(29), chr(30),
chr(31),
// non-printing characters
chr(127)
), '', $string);
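If typing out the list feels error-prone, the same search array can probably be built programmatically. This is only a sketch of an equivalent construction, not something benchmarked here; the variable name $controlChars and the sample input are made up for illustration.

```php
<?php
// Build the same 33 one-byte strings as the list above:
// chr(0) .. chr(31) plus chr(127).
$controlChars = array_map('chr', array_merge(range(0, 31), array(127)));

$string = "tab\there\x7F";                       // contains a tab and a DEL
$string = str_replace($controlChars, '', $string);
var_dump($string);                                // string(7) "tabhere"
```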
If you want to keep all whitespace characters \t, \n and \r, then remove chr(9), chr(10) and chr(13) from this list. Note: The usual space character is chr(32), so it stays in the result. Decide for yourself whether you want to remove the non-breaking space chr(160), as it can cause problems.
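For the regex variant, keeping \t, \n and \r means leaving 0x09, 0x0A and 0x0D out of the character class. The sketch below also handles the non-breaking space under one stated assumption: chr(160) is a complete character only in single-byte encodings such as ISO-8859-1 or Windows-1252. In UTF-8 the non-breaking space is the two-byte sequence C2 A0, and deleting lone A0 bytes would damage other characters (for example "à" is C3 A0), so there it has to be removed as a unit. Both function names are hypothetical.

```php
<?php
// Hypothetical helper: strips control characters but keeps \t, \n and \r.
function strip_control_keep_whitespace(string $s): string
{
    // 0x00-0x08, 0x0B, 0x0C, 0x0E-0x1F and 0x7F; skips 0x09, 0x0A, 0x0D.
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/', '', $s);
}

// Hypothetical helper, assuming $s is UTF-8: removes U+00A0 as the
// two-byte sequence C2 A0 instead of the single byte chr(160).
function remove_nbsp_utf8(string $s): string
{
    return str_replace("\xC2\xA0", '', $s);
}

$input = "one\ttwo\x00\nsix\xC2\xA0ten";
$kept  = strip_control_keep_whitespace($input);
var_dump($kept === "one\ttwo\nsix\xC2\xA0ten");        // bool(true)
var_dump(remove_nbsp_utf8($kept) === "one\ttwo\nsixten"); // bool(true)
```

Whether to delete the non-breaking space or replace it with a normal space (pass ' ' instead of '') depends on what the cleaned string is used for.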
¹ Tested by @PaulDixon and verified by myself.