Some of my scripts use different encodings, and this has become an issue when I try to combine them.
I can't change the encoding they use; instead, I want to change the encoding of the result from script A and use it as a parameter in script B.
So: is there any simple way to convert a string from UTF-8 to ISO-8859-1 in PHP? I have looked at utf8_encode and utf8_decode, but they don't do what I want. Why doesn't there exist any "utf2iso()" function, or similar?
I don't think I have characters that can't be written in ISO-8859-1, so that shouldn't be a huge issue.
This question is related to: php, encoding, utf-8, iso-8859-1
First of all, don't use different encodings. It leads to a mess, and UTF-8 is definitely the one you should be using everywhere.
Chances are your input is not ISO-8859-1, but something else (ISO-8859-15, Windows-1252). To convert from those, use iconv or mb_convert_encoding.
Nevertheless, utf8_encode and utf8_decode should work for ISO-8859-1. It would be nice if you could post a link to a file or a uuencoded or base64 example string for which the conversion fails or yields unexpected results.
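As a rough sketch of the advice above, the snippet below converts a UTF-8 string with mb_convert_encoding and prints a hex and base64 dump of the bytes, which is the kind of sample that could be posted when a conversion misbehaves. The sample strings and the helper name dumpBytes are illustrative assumptions, and the mbstring extension is assumed to be loaded.

```php
<?php
// "Zürich" written with explicit UTF-8 bytes so the example does not
// depend on how this source file is saved.
$utf8 = "Z\xC3\xBCrich";

// mb_convert_encoding takes (string, target encoding, source encoding).
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// The euro sign exists in Windows-1252 (byte 0x80) but not in ISO-8859-1.
$euroCp1252 = mb_convert_encoding("\xE2\x82\xAC", 'Windows-1252', 'UTF-8');

// Hypothetical helper: show raw bytes in a form that is safe to post.
function dumpBytes(string $label, string $bytes): string
{
    return sprintf('%s: hex=%s base64=%s', $label, bin2hex($bytes), base64_encode($bytes));
}

echo dumpBytes('utf-8', $utf8), PHP_EOL;
echo dumpBytes('iso-8859-1', $latin1), PHP_EOL;
echo dumpBytes('windows-1252 euro', $euroCp1252), PHP_EOL;
```

Comparing the hex dumps makes it easy to see whether a string really is UTF-8 (ü as c3bc) or already a single-byte encoding (ü as fc).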
I used:
function utf8_to_html($data) {
    // Replace each UTF-8 character with its HTML entity.
    return preg_replace(
        array('/ä/', '/ö/', '/ü/', '/é/', '/à/', '/è/'),
        array('&auml;', '&ouml;', '&uuml;', '&eacute;', '&agrave;', '&egrave;'),
        $data
    );
}
Use html_entity_decode() and htmlentities().
$html = html_entity_decode(htmlentities($html, ENT_QUOTES, 'UTF-8'), ENT_QUOTES, 'ISO-8859-1');
htmlentities() converts your UTF-8 input into HTML entities, and html_entity_decode() converts those entities back into characters encoded as ISO-8859-1.
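To make the round trip through entities concrete, here is a small sketch of the two steps run separately. The variable names and the sample string are assumptions made for illustration; only htmlentities() and html_entity_decode() come from the answer.

```php
<?php
// "café" with explicit UTF-8 bytes (é is 0xC3 0xA9 in UTF-8).
$utf8 = "caf\xC3\xA9";

// Step 1: UTF-8 characters that have a named entity become entities.
$entities = htmlentities($utf8, ENT_QUOTES, 'UTF-8');

// Step 2: entities are decoded into single-byte ISO-8859-1 characters.
$latin1 = html_entity_decode($entities, ENT_QUOTES, 'ISO-8859-1');

echo $entities, PHP_EOL;        // caf&eacute;
echo bin2hex($latin1), PHP_EOL; // 636166e9
```

Characters that have no ISO-8859-1 representation are likely to stay behind as entities in the result, so this approach is probably best suited to text that ends up in HTML anyway; that behavior is worth checking on your own PHP version.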
You need to use the iconv package, specifically its iconv function.
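A minimal sketch of what that could look like, assuming the iconv extension is available. The wrapper name utf8ToLatin1 is hypothetical; iconv() itself takes the source encoding first, then the target encoding, then the string, and returns false on failure.

```php
<?php
// Hypothetical wrapper around iconv(); only iconv() is a real PHP function.
function utf8ToLatin1(string $utf8): string
{
    // @ suppresses the notice iconv() may raise on bad input; the return
    // value is checked explicitly instead.
    $converted = @iconv('UTF-8', 'ISO-8859-1', $utf8);
    if ($converted === false) {
        throw new RuntimeException('String could not be converted to ISO-8859-1');
    }
    return $converted;
}

// "café" with explicit UTF-8 bytes.
echo bin2hex(utf8ToLatin1("caf\xC3\xA9")), PHP_EOL; // 636166e9
```

How iconv treats characters that ISO-8859-1 cannot represent differs between iconv implementations, so the failure branch here is a precaution rather than a guaranteed behavior.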
Set the meta tag in the head as
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
Use the table at http://www.i18nqa.com/debug/utf8-debug.html to look up the garbled character sequences you want to replace.
Then use str_replace like
$find = array('â€œ', 'â€™', 'â€¦', 'â€”', 'â€“', 'â€˜', 'Ã©', 'Â', 'â€¢', 'Ëœ', 'â€'); // garbled sequences as they appear on the page
$replace = array('“', '’', '…', '—', '–', '‘', 'é', '', '•', '˜', '”');
$content = str_replace($find, $replace, $content);
It's the method I use, and it helps a lot. Thanks!
function parseUtf8ToIso88591(&$string) {
    if (!is_null($string)) {
        // Three ways to do the conversion; only the last one is applied to $string.
        $iso88591_1 = utf8_decode($string); // deprecated as of PHP 8.2
        $iso88591_2 = iconv('UTF-8', 'ISO-8859-1', $string);
        $string = mb_convert_encoding($string, 'ISO-8859-1', 'UTF-8');
    }
}
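When results from several scripts are combined, it may not be certain that every string really is UTF-8. As a hedged variation on the function above, the sketch below converts only input that passes mb_check_encoding; the helper name toLatin1IfUtf8 is an assumption, and the mbstring extension is assumed to be loaded.

```php
<?php
// Hypothetical helper: convert only when the input is valid UTF-8.
function toLatin1IfUtf8(?string $value): ?string
{
    if ($value === null || !mb_check_encoding($value, 'UTF-8')) {
        return $value; // leave null and non-UTF-8 input untouched
    }
    return mb_convert_encoding($value, 'ISO-8859-1', 'UTF-8');
}

// UTF-8 "Zürich" is converted; a string that is already ISO-8859-1
// (ü as the single byte 0xFC) is not valid UTF-8 and passes through.
echo bin2hex(toLatin1IfUtf8("Z\xC3\xBCrich")), PHP_EOL; // 5afc72696368
echo bin2hex(toLatin1IfUtf8("Z\xFCrich")), PHP_EOL;     // 5afc72696368
```

This check is only a heuristic: some ISO-8859-1 byte sequences also happen to be valid UTF-8, so knowing each script's encoding up front remains the more reliable option.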
In my case, after files with names containing those characters were uploaded, they were not even visible in FileZilla! In the cPanel file manager they were shown as ? (on a black background). This combination made them display correctly in the browser (the HTML document is Western-encoded):
$dspFileName = utf8_decode(htmlspecialchars(iconv(mb_internal_encoding(), 'utf-8', basename($thisFile['path']))) );
It is much better to use
$value = mb_convert_encoding($value, 'HTML-ENTITIES', 'UTF-8');
Especially when you are using an AJAX call to submit 'ISO-8859-1' characters. It works for Chinese, Japanese, Czech, German and many more languages.
I use this function:
function formatcell($data, $num, $fill = " ") {
    $data = trim($data);
    $data = str_replace(chr(13), ' ', $data);
    $data = str_replace(chr(10), ' ', $data);
    // transliterate UTF-8 to plain ASCII characters
    $data = iconv('UTF-8', 'ASCII//TRANSLIT', $data);
    // strip the accent marks some iconv implementations emit (e.g. "u for ü)
    $data = preg_replace("/[\'\"\^\~\`]/i", '', $data);
    // pad with the fill character up to $num characters
    for ($i = strlen($data); $i < $num; $i++) {
        $data .= $fill;
    }
    // limit string to $num characters
    $data = substr($data, 0, $num);
    return $data;
}
echo formatcell("YES UTF8 String Zürich", 25, 'x'); // YES UTF8 String Zurichxxx (exact transliteration depends on the iconv implementation and locale)
echo formatcell("NON UTF8 String Zurich", 25, 'x'); // NON UTF8 String Zurichxxx
Check out my function on my blog: http://www.unexpectedit.com/php/php-handling-non-english-characters-utf8
Source: Stackoverflow.com