@S-Lott gives the right procedure, but expanding on the Unicode issues, the Python interpreter can provide more insights.
Jon Skeet is right (unusual) about the codecs
module - it contains byte strings:
>>> import codecs
>>> codecs.BOM
'\xff\xfe'
>>> codecs.BOM_UTF8
'\xef\xbb\xbf'
>>>
Picking another nit, the BOM
has a standard Unicode name, and it can be entered as:
>>> bom= u"\N{ZERO WIDTH NO-BREAK SPACE}"
>>> bom
u'\ufeff'
It is also accessible via unicodedata
:
>>> import unicodedata
>>> unicodedata.lookup('ZERO WIDTH NO-BREAK SPACE')
u'\ufeff'
>>>