[unicode] FPDF utf-8 encoding (HOW-TO)

Does anybody know how to set the encoding in FPDF package to utf-8? Or at least to ISO-8859-7 (Greek) that support greek characters?

Basically I want to create a pdf file containing greek characters.

Any suggestions would help. George

This question is related to unicode utf-8 character-encoding fpdf

The answer is


there is a really simple solution for this problem.

In the file fpdf.php go to the line that says:

if($txt!=='')
{

It is line 648 in my version of fpdf. Insert the following line of code:

$txt = iconv('utf-8', 'cp1252', $txt);

(above the line of code)

if($align=='R')

This works for all German special characters and should also work for Greek special characters. Otherwise simply replace cp1252 with the respective alphabet you require. You can see all supported characters here: http://en.wikipedia.org/wiki/Windows-1252

I saw the solution here: http://fudforum.org/forum/index.php?t=msg&goto=167345 Please use my example code above, as the original author forgot to insert a dash between utf and 8.

Hope the above was helpful.

Daan


I wanted to answer this for anyone who hasn't switched over to TFPDF for whatever reason (framework integration, etc.)

Go to: http://www.fpdf.org/makefont/index.php

Use a .ttf compatible font for the language you want to use. Make sure to choose the encoding number that is correct for your language. Download the files and paste them in your current FPDF font directory.

Use this to activate the new font: $pdf->AddFont($font_name,'','Your_Font_Here.php');

Then you can use $pdf->SetFont normally.

On the font itself, use iconv to convert to UTF-8. So if for example you're using Hebrew, you would do iconv('UTF-8', 'windows-1255', $first_name).

Substitute the windows encoding number for your language encoding.

For right to left, a quick fix is doing something like strrev(iconv('UTF-8', 'windows-1255', $first_name)).


There's an extention to FPDF called UFDPF http://acko.net/blog/ufpdf-unicode-utf-8-extension-for-fpdf/

But, imho, it's better to use mpdf if you're it's possible for you to change class.


None of the above solutions are going to work.

Try this:

function filter_html($value){
    $value = mb_convert_encoding($value, 'ISO-8859-1', 'UTF-8');
    return $value;
}

You need to generate a font first. You must use the MakeFont utility included within the FPDF package. I used on Linux this a bit extended script from the demo:

<?php
// Generation of font definition file for tutorial 7
require('../makefont/makefont.php');

$dir = opendir('/usr/share/fonts/truetype/ttf-dejavu/');
while (($relativeName = readdir($dir)) !== false) {
    if ($relativeName == '..' || $relativeName == '.')
        continue;
    MakeFont("/usr/share/fonts/truetype/ttf-dejavu/$relativeName",'ISO-8859-2');
}
?>

Then I copied generated files to the font directory of my web and used this:

$pdf->Cell(80,70, iconv('UTF-8', 'ISO-8859-2', 'Bunka jedna'),1);

(I was working on a table.) That worked for my language (Bunka jedna is czech for Cell one). Czech language belongs to central european languages, also ISO-8859-2. Regrettably the user of FPDF is forced to lost advantages of UTF-8 encoding. You cannot get this in your PDF:

Mestecko Fruens Bøge

Danish letter ø becomes r in ISO-8859-2.

Suggestion of solution: You need to get a Greek font, generate the font using proper encoding (ISO-8859-7) and use iconv with the same target encoding as the one the font has been generated with.


just edit the function cell in the fpdf.php file, look for the line that looks like this

function cell ($w, $h = 0, $txt = '', $border = 0, $ln = 0, $align = '', $fill = false, $link = '')
{ 

after finding the line

write after the {,

$txt = utf8_decode($txt);

save the file and ready, the accents and the utf8 encoding will be working :)


You can make a class to extend FPDF and add this:

class utfFPDF extends FPDF {

function Cell($w, $h=0, $txt="", $border=0, $ln=0, $align='', $fill=false, $link='')
{
    if (!empty($txt)){
        if (mb_detect_encoding($txt, 'UTF-8', false)){
            $txt = iconv('UTF-8', 'ISO-8859-5', $txt);

        }
    }
    parent::Cell($w, $h, $txt, $border, $ln, $align, $fill, $link);

} 

}


This answer didn't work for me, I needed to run html decode on the string also. See

iconv('UTF-8', 'windows-1252', html_entity_decode($str));

Props go to emfi from html_entity_decode in FPDF(using tFPDF extention)


I know that this question is old but I think my answer would help those who haven't found solution in other answers. So, my problem was that I couldn't display croatian characters in my PDF. Firstly, I used FPDF but, I think, it does not support Unicode. Finally, what solved my problem is tFPDF which is the version of FPDF that supports Unicode. This is the example that worked for me:

require('tFPDF/tfpdf.php');
$pdf = new tFPDF();
$pdf->AddPage();
$pdf->AddFont('DejaVu','','DejaVuSansCondensed.ttf',true);
$pdf->AddFont('DejaVu', 'B', 'DejaVuSansCondensed-Bold.ttf', true);

$pdf->SetFont('DejaVu','',14);

$txt = 'ccžšdCCŽŠÐ';
$pdf->Write(8,$txt);

$pdf->Output();

How do I create PDF's in FPDF that support Chinese, Japanese, Russian, etc.?

(snapshots of code in use below)

I'd like to provide: a summary of the problem, the solution, a github project with the working code, and an online example with the expected, resultant PDF.

The Problem :

  1. As stated by Tarsis, swap FPDF to TFPDF.
  2. You actually need a font that supports the UTF-8 characters you are using.

    I.E., merely using Helvetica and trying to display Japanese will not work. If you use Font Forge, or some other font tool, you can scroll to the Chinese characters of the font, and see that they are blank.

    Google has a font (Noto font) that contains all languages, and it is 20mb, which is usually several factors the size of your text. So, you can see why many fonts simply won't cover every single language.

The Solution :

I'm using rounded-mgenplus-20140828.ttf and ZCOOL_QingKe_HuangYou.ttf font packs for Japanese and Chinese, which are open source and can be found in many open source projects. In tFPDF itself, or a new inheriting class of it, like class HTMLtoPDF extends tFPDF {...}, you'll do this...

$this->AddFont('japanese', '', 'rounded-mgenplus-20140828.ttf', true);
$this->SetFont('japanese', '', 14);
$this->Write(14, '???');

Should be nothing more to it!

Code Package on GitHub :

https://github.com/HoldOffHunger/php-html-to-pdf

Working, Online Demo of Japanese :

https://www.earthfluent.com/privacy.pdf?language=ja


You can apply this function on your text :

 $yourtext = iconv('UTF-8', 'windows-1252', $yourtext);

Thanks


I use FPDF for ASP, and the iconv function is not available. It seems strange, by I solved the UTF-8 problem by adding a fake image (an 1x1px jpeg) to the pdf, just after the AddPage() function:

pdf.Image "images/fpdf.jpg",0,0,1

In this way, accented characters are correctly added to my pdf, don't ask me why but it works.


There is an extension of FPDF called mPDF that allows Unicode fonts.

http://www.mpdf1.com/mpdf/index.php


Not sure if it will do for Greek, but I had the same issue for Brazilian Portuguese characters and my solution was to use html entities. I had basically two cases:

  1. String may contain UTF-8 characters.

For these, I first encoded it to html entities with htmlentities() and then decoded them to iso-8859-1. Example:

$s = html_entity_decode(htmlentities($my_variable_text), ENT_COMPAT | ENT_HTML401, 'iso-8859-1');
  1. Fixed string with html entities:

For these, I just left htmlentities() call out. Example:

$s = html_entity_decode("Treasurer/Tr&eacute;sorier", ENT_COMPAT | ENT_HTML401, 'iso-8859-1');

Then I passed $s to FPDF, like in this example:

$pdf->Cell(100, 20, $s, 0, 0, 'L');

Note: ENT_COMPAT | ENT_HTML401 is the standard value for parameter #2, as in http://php.net/manual/en/function.html-entity-decode.php

Hope that helps.


There also is a official UTF-8 Version of FPDF called tFPDF http://www.fpdf.org/en/script/script92.php

You can easyly switch from the original FPDF, just make sure you also use a unicode Font as shown in the example in the above link or my code:

<?php

//this is a UTF-8 file, we won't need any encode/decode/iconv workarounds

//define the path to the .ttf files you want to use
define('FPDF_FONTPATH',"../fonts/");
require('tfpdf.php');

$pdf = new tFPDF();
$pdf->AddPage();

// Add Unicode fonts (.ttf files)
$fontName = 'Helvetica';
$pdf->AddFont($fontName,'','HelveticaNeue LightCond.ttf',true);
$pdf->AddFont($fontName,'B','HelveticaNeue MediumCond.ttf',true);

//now use the Unicode font in bold
$pdf->SetFont($fontName,'B',12);

//anything else is identical to the old FPDF, just use Write(),Cell(),MultiCell()... 
//without any encoding trouble
$pdf->Cell(100,20, "Some UTF-8 String");

//...
?>

I think its much more elegant to use this instead of spaming utf8_decode() everywhere and the ability to use .ttf files directly in AddFont() is an upside too.

Any other answer here is just a way to avoid or work around the problem, and avoiding UTF-8 is no real option for an up to date project.

There are also alternatives like mPDF or TCPDF (and others) wich base on FPDF but offer advanced functions, have UTF-8 Support and can interpret HTML Code (limited of course as there is no direct way to convert HTML to PDF). Most of the FPDF code can be used directly in those librarys, so its pretty easy to migrate the code.

https://github.com/mpdf/mpdf http://www.tcpdf.org/


For offsprings.

How I managed to add russian language to fpdf on my Linux machine:

1) Go to http://www.fpdf.org/makefont/ and convert your ttf font(for example AerialRegular.ttf) into 2 files using ISO-8859-5 encoding: AerialRegular.php and AerialRegular.z

2) Put these 2 files into fpdf/font directory

3) Use it in your code:

$pdf = new \FPDI();
    $pdf->AddFont('ArialMT','','ArialRegular.php');
    $pdf->AddPage();
    $tplIdx = $pdf->importPage(1);
    $pdf->useTemplate($tplIdx, 0, 0, 211, 297); //width and height in mms
    $pdf->SetFont('ArialMT','',35);
    $pdf->SetTextColor(255,0,0);
    $fullName = iconv('UTF-8', 'ISO-8859-5', '???????');
    $pdf->SetXY(60, 54);
    $pdf->Write(0, $fullName);

Instead of this iconv solution:

$str = iconv('UTF-8', 'windows-1252', $str);

You could use the following:

$str = mb_convert_encoding($str, "UTF-8", "Windows-1252");

See: How to convert Windows-1252 characters to values in php?


Examples related to unicode

How to resolve TypeError: can only concatenate str (not "int") to str (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape UnicodeEncodeError: 'ascii' codec can't encode character at special name Python NLTK: SyntaxError: Non-ASCII character '\xc3' in file (Sentiment Analysis -NLP) HTML for the Pause symbol in audio and video control Javascript: Unicode string to hex Concrete Javascript Regex for Accented Characters (Diacritics) Replace non-ASCII characters with a single space UTF-8 in Windows 7 CMD NameError: global name 'unicode' is not defined - in Python 3

Examples related to utf-8

error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte Changing PowerShell's default output encoding to UTF-8 'Malformed UTF-8 characters, possibly incorrectly encoded' in Laravel Encoding Error in Panda read_csv Using Javascript's atob to decode base64 doesn't properly decode utf-8 strings What is the difference between utf8mb4 and utf8 charsets in MySQL? what is <meta charset="utf-8">? Pandas df.to_csv("file.csv" encode="utf-8") still gives trash characters for minus sign UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 23: ordinal not in range(128) Android Studio : unmappable character for encoding UTF-8

Examples related to character-encoding

Changing PowerShell's default output encoding to UTF-8 JsonParseException : Illegal unquoted character ((CTRL-CHAR, code 10) Change the encoding of a file in Visual Studio Code What is the difference between utf8mb4 and utf8 charsets in MySQL? How to open html file? All inclusive Charset to avoid "java.nio.charset.MalformedInputException: Input length = 1"? UTF-8 output from PowerShell ERROR 1115 (42000): Unknown character set: 'utf8mb4' "for line in..." results in UnicodeDecodeError: 'utf-8' codec can't decode byte How to make php display \t \n as tab and new line instead of characters

Examples related to fpdf

Setting paper size in FPDF FPDF error: Some data has already been output, can't send PDF FPDF utf-8 encoding (HOW-TO) Make text wrap in a cell with FPDF? Inserting an image with PHP and FPDF Inserting line breaks into PDF