mimetypes raises UnicodeDecodeError when map files are not unicode encoded #117807

Open
@kulikjak

Bug report

Bug description:

Hi, when I use the mimetypes module and one of the known mime.types files includes a comment that is not UTF-8 encoded, the operation fails with UnicodeDecodeError:

......
  File "/usr/lib/python3.9/urllib/request.py", line 1506, in open_local_file
    mtype = mimetypes.guess_type(filename)[0]
  File "/usr/lib/python3.9/mimetypes.py", line 289, in guess_type
    init()
  File "/usr/lib/python3.9/mimetypes.py", line 362, in init
    db.read(file)
  File "/usr/lib/python3.9/mimetypes.py", line 204, in read
    self.readfp(fp, strict)
  File "/usr/lib/python3.9/mimetypes.py", line 215, in readfp
    line = fp.readline()
  File "/usr/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x83 in position 168: invalid start byte

The same can be forced with:

import mimetypes
mimetypes.init(files=["mimefile"]) 

It occurs because the file is opened in text mode with UTF-8 as the expected encoding:

with open(filename, encoding='utf-8') as fp:
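To illustrate the failure mode in isolation, here is a minimal sketch that does not call mimetypes at all. It assumes a hypothetical file named "mimefile" whose comment line contains the byte 0x83 (the file content and the entry in it are made up for the example), and reads it the same way as the line above:

```python
import os
import tempfile

# Hypothetical mime.types content: the comment holds byte 0x83,
# which cannot start a UTF-8 sequence.
content = b"# stray byte in a comment: \x83\ntext/x-example xexample\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mimefile")
    with open(path, "wb") as fp:
        fp.write(content)

    error = None
    try:
        # Same open() arguments as in mimetypes: strict UTF-8 decoding.
        with open(path, encoding="utf-8") as fp:
            fp.readline()
    except UnicodeDecodeError as exc:
        error = exc

print(error)
```

The decoder fails on the first line even though the offending byte sits inside a comment, because decoding happens before any comment stripping.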

I am not sure whether there is a convention for which encoding mime.types files use, but I feel that comments, at least, should be allowed in any encoding.
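One possible user-side workaround, sketched under assumptions rather than proposed as the fix: open the file with a lenient error handler and pass the file object to MimeTypes.readfp(). The helper name read_mime_file_tolerantly, the file content, and the ".xexample" extension are hypothetical. Note that constructing MimeTypes() may itself trigger the module's default initialization, which reads the known system files, so this only helps for files passed explicitly.

```python
import mimetypes
import os
import tempfile


def read_mime_file_tolerantly(path, db=None):
    """Load a mime.types-style file, replacing undecodable bytes.

    Hypothetical helper, not part of the mimetypes module.
    """
    if db is None:
        db = mimetypes.MimeTypes()
    # errors="replace" turns invalid bytes into U+FFFD instead of raising.
    with open(path, encoding="utf-8", errors="replace") as fp:
        db.readfp(fp)
    return db


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mimefile")
    with open(path, "wb") as fp:
        # Made-up entry preceded by a comment containing byte 0x83.
        fp.write(b"# stray byte in a comment: \x83\napplication/x-example xexample\n")
    db = read_mime_file_tolerantly(path)

guessed = db.guess_type("report.xexample")[0]
print(guessed)
```

With errors="replace", an invalid byte outside a comment would also be silently replaced, so a damaged type or extension name would be registered in altered form rather than reported.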

CPython versions tested on:

3.9, 3.11

Operating systems tested on:

Linux, Other

Labels: stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)
