Bug report
Bug description:
Hi, when I use the mimetypes module and one of the known mime.types files includes a non-UTF-8 encoded comment, the operation fails with UnicodeDecodeError:
......
  File "/usr/lib/python3.9/urllib/request.py", line 1506, in open_local_file
    mtype = mimetypes.guess_type(filename)[0]
  File "/usr/lib/python3.9/mimetypes.py", line 289, in guess_type
    init()
  File "/usr/lib/python3.9/mimetypes.py", line 362, in init
    db.read(file)
  File "/usr/lib/python3.9/mimetypes.py", line 204, in read
    self.readfp(fp, strict)
  File "/usr/lib/python3.9/mimetypes.py", line 215, in readfp
    line = fp.readline()
  File "/usr/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x83 in position 168: invalid start byte
The same can be forced with:
import mimetypes
mimetypes.init(files=["mimefile"])
and occurs because the file is opened in text mode and decoded as UTF-8:
Line 215 in 2e098ab
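For anyone who wants to reproduce this without touching a system mime.types file, here is a minimal sketch. The file name, the comment text, and the `reproduce` helper are made up for illustration. It uses a private `MimeTypes` instance rather than `mimetypes.init()` so the global tables are left alone. On the versions I tested I would expect it to return the `UnicodeDecodeError`.

```python
import mimetypes
import os
import tempfile


def reproduce():
    """Try to load a mime.types file whose comment holds a non-UTF-8 byte.

    Returns the UnicodeDecodeError if one is raised, otherwise None.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mimefile")  # hypothetical file name
        with open(path, "wb") as fh:
            # 0x83 is not a valid UTF-8 start byte, as in the traceback above.
            fh.write(b"# comment with a stray byte: \x83\n")
            fh.write(b"text/x-example example\n")

        db = mimetypes.MimeTypes()
        try:
            db.read(path)
        except UnicodeDecodeError as exc:
            return exc
    return None
```

Calling `reproduce()` and printing the result shows whether a given interpreter is affected.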
I am not sure whether there is a convention for which encoding the mime.types file will use, but I feel that at least comments should be allowed in any encoding?
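As a possible workaround until this is settled, the file can be opened by the caller with a tolerant error handler and passed to `MimeTypes.readfp()`. This is only a sketch, not a proposed fix for the stdlib. The helper names `read_tolerantly` and `demo` are hypothetical, and `errors="replace"` is my assumption about what is acceptable: it also replaces undecodable bytes outside comments.

```python
import mimetypes
import os
import tempfile


def read_tolerantly(path, strict=True):
    """Load a mime.types file, replacing undecodable bytes instead of failing."""
    db = mimetypes.MimeTypes()
    # Assumption: replacing bad bytes with U+FFFD is acceptable. Comments are
    # discarded by readfp() anyway, so their exact content does not matter.
    with open(path, encoding="utf-8", errors="replace") as fp:
        db.readfp(fp, strict)
    return db


def demo():
    """Write a file with a non-UTF-8 comment and look up a type from it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mimefile")  # hypothetical file name
        with open(path, "wb") as fh:
            fh.write(b"# comment with a stray byte: \x83\n")
            fh.write(b"text/x-example example\n")
        db = read_tolerantly(path)
    # The mapping from the file is still usable despite the bad comment byte.
    return db.guess_type("report.example")[0]
```

With this, `demo()` should return the type declared in the file. The drawback is that callers such as urllib still go through `mimetypes.init()` and so still hit the error.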
CPython versions tested on:
3.9, 3.11
Operating systems tested on:
Linux, Other