Attempts to detect the character encoding of the XML file given by the
file object fp. fp must not be a codec-wrapped file object, because the
detection works on the raw bytes!
The return value can be:
-
If detection of the BOM succeeds, the codec name of the corresponding
Unicode encoding is returned.
-
If BOM detection fails, the XML declaration is searched for the
encoding attribute and its value is returned. In that case the "<"
character has to be the very first character in the file (that is what
the XML standard requires, after all).
-
If both the BOM and the XML declaration fail, None is returned.
According to XML 1.0 the encoding should be utf_8 then, but it wasn't
detected by the means offered here. At least one can be pretty sure that
a character encoding including most of ASCII is used :-/
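The three outcomes listed above can be sketched roughly as follows. This is
only an illustration, not the actual implementation: the function name
detect_xml_encoding, the codec names returned for each BOM, the 1024-byte
read window, and the assumption that fp is seekable are all choices made for
this example.

```python
import codecs
import io
import re

# Assumed mapping from BOM to codec name. The UTF-32 marks are tested first
# because the UTF-32 LE mark begins with the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf_32"),
    (codecs.BOM_UTF32_BE, "utf_32"),
    (codecs.BOM_UTF8, "utf_8_sig"),
    (codecs.BOM_UTF16_LE, "utf_16"),
    (codecs.BOM_UTF16_BE, "utf_16"),
]

# XML declaration with an encoding attribute. It is applied with match(),
# so "<" has to be the very first byte of the data.
_DECL = re.compile(
    rb"<\?xml\s+version[^>]*?\sencoding\s*=\s*"
    rb"([\"'])([A-Za-z][A-Za-z0-9._-]*)\1"
)


def detect_xml_encoding(fp):
    """Return a codec name, the declared encoding, or None (hypothetical sketch)."""
    start = fp.tell()
    head = fp.read(1024)  # assumed to be enough for BOM plus declaration
    fp.seek(start)        # leave the file position as it was found
    if not isinstance(head, bytes):
        raise TypeError("fp must yield raw bytes, not decoded text")

    # 1. Byte order mark.
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name

    # 2. Encoding attribute of the XML declaration.
    found = _DECL.match(head)
    if found:
        return found.group(2).decode("ascii")

    # 3. Nothing detected.
    return None
```

A caller that gets None back would typically fall back to utf_8, as XML 1.0
suggests. Note that step 2 of this sketch can only match when the file uses an
ASCII-compatible encoding.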