Correctly block&allow IPv6 domains in http.cookiejar #135768

Open
@LamentXU123

Bug report

Bug description:

Consider a small Flask app:

from flask import Flask, make_response

app = Flask(__name__)

@app.route('/')
def set_cookie():

    response = make_response("Cookie has been set!")
    response.set_cookie(
        'foo',
        value='bar',   
    )

    return response

if __name__ == '__main__':
    app.run()

This web app sets a cookie foo=bar. Then we use http.cookiejar to process it:

import urllib.request
from http.cookiejar import CookieJar, DefaultCookiePolicy

policy = DefaultCookiePolicy(blocked_domains=['']) # no blockers
cj = CookieJar(policy)
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://127.0.0.1:5000")
print(r.read().decode())  # print the response body
for item in cj:
    print('Name = %s' % item.name)
    print('Value = %s' % item.value)

# this should print:

'''
Cookie has been set!
Name = foo
Value = bar
'''

blocked_policy = DefaultCookiePolicy(blocked_domains=["127.0.0.1"]) # block cookies
cj = CookieJar(blocked_policy)
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://127.0.0.1:5000")
print(r.read().decode())  # print the response body
for item in cj:
    print('Name = %s' % item.name)
    print('Value = %s' % item.value)
# this should print only the body, because the cookie is blocked:

'''
Cookie has been set!
'''

Everything works as expected so far, right? But now run the Flask app on an IPv6 host:

from flask import Flask, make_response

app = Flask(__name__)

@app.route('/')
def set_cookie():

    response = make_response("Cookie has been set!")
    response.set_cookie(
        'foo',
        value='bar',   
    )

    return response

if __name__ == '__main__':
    app.run(host='::1')

Then we process it with http.cookiejar:

import urllib.request
from http.cookiejar import CookieJar, DefaultCookiePolicy

policy = DefaultCookiePolicy(blocked_domains=['']) # no blockers
cj = CookieJar(policy)
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://[::1]:5000")
print(r.read().decode())  # print the response body
for item in cj:
    print('Name = %s' % item.name)
    print('Value = %s' % item.value)

# this should print:

'''
Cookie has been set!
Name = foo
Value = bar
'''

blocked_policy = DefaultCookiePolicy(blocked_domains=["[::1]"]) # block cookies
cj = CookieJar(blocked_policy)
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://[::1]:5000")
print(r.read().decode())  # print the response body
for item in cj:
    print('Name = %s' % item.name)
    print('Value = %s' % item.value)
# this actually prints (the cookie is NOT blocked):

'''
Cookie has been set!
Name = foo
Value = bar
'''

blocked_policy = DefaultCookiePolicy(blocked_domains=["::1"]) # block cookies
cj = CookieJar(blocked_policy)
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://[::1]:5000")
print(r.read().decode())  # print the response body
for item in cj:
    print('Name = %s' % item.name)
    print('Value = %s' % item.value)
# this actually prints (the cookie is NOT blocked):

'''
Cookie has been set!
Name = foo
Value = bar
'''

NO COOKIES ARE BLOCKED.

I traced the problem to http.cookiejar.DefaultCookiePolicy.is_blocked:

    def is_blocked(self, domain):
        for blocked_domain in self._blocked_domains:
            if user_domain_match(domain, blocked_domain):
                return True
        return False

It uses user_domain_match, shown below:

def user_domain_match(A, B):
    """For blocking/accepting domains.

    A and B may be host domain names or IP addresses.

    """
    A = A.lower()
    B = B.lower()
    if not (liberal_is_HDN(A) and liberal_is_HDN(B)):
        if A == B:
            # equal IP addresses
            return True
        return False
    initial_dot = B.startswith(".")
    if initial_dot and A.endswith(B):
        return True
    if not initial_dot and A == B:
        return True
    return False

It looks like liberal_is_HDN is used to decide whether A and B are HDNs or IP addresses. That function is:

def liberal_is_HDN(text):
    """Return True if text is a sort-of-like a host domain name.

    For accepting/blocking domains.

    """
    if IPV4_RE.search(text):
        return False
    return True

And the IPV4_RE regex:

IPV4_RE = re.compile(r"\.\d+$", re.ASCII)

Since the check only recognizes IPv4, an IPv6 address is always treated as an HDN, which is completely wrong. So user_domain_match always returns False for it: the blocked entry has no initial dot, and the exact comparison fails too.
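To make the misclassification concrete, here is a minimal sketch that runs a few hosts through the check. It uses local copies of IPV4_RE and liberal_is_HDN as quoted above (docstring omitted), so the result does not depend on the installed Python version; the list of sample hosts is my own choice.

```python
import re

# Local copies of the definitions quoted in this report.
IPV4_RE = re.compile(r"\.\d+$", re.ASCII)


def liberal_is_HDN(text):
    # Anything that does not end in ".<digits>" counts as a host domain name.
    if IPV4_RE.search(text):
        return False
    return True


# Sample hosts (chosen for illustration only).
hosts = ["127.0.0.1", "example.com", "[::1]", "::1"]
classified = {host: liberal_is_HDN(host) for host in hosts}

for host, is_hdn in classified.items():
    print(f"{host!r}: treated as {'HDN' if is_hdn else 'IP address'}")
```

Only the IPv4 literal is recognized as an IP address. Both spellings of the IPv6 loopback address fall through to the HDN case.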

Besides blocked_domains there is also allowed_domains, which uses the same matching logic, so the match always returns False there as well.

Why does it return False? Because the IPv6 address gets .local appended to it, since it is treated as an unusual HDN. So by the time user_domain_match is called, A is [::1].local and B is [::1].
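The mismatch can be sketched in a few lines. The matching functions below are local copies of the code quoted above. with_local_suffix is a hypothetical helper of mine that only imitates the ".local" step described here (a host with no dot that is not IPv4 gets the suffix); it is an assumption about that step, not the library's own code.

```python
import re

IPV4_RE = re.compile(r"\.\d+$", re.ASCII)


def liberal_is_HDN(text):
    # Copy of the function quoted in this report.
    return not IPV4_RE.search(text)


def user_domain_match(A, B):
    # Copy of the function quoted in this report.
    A = A.lower()
    B = B.lower()
    if not (liberal_is_HDN(A) and liberal_is_HDN(B)):
        return A == B  # equal IP addresses
    initial_dot = B.startswith(".")
    if initial_dot and A.endswith(B):
        return True
    if not initial_dot and A == B:
        return True
    return False


def with_local_suffix(host):
    # Hypothetical stand-in for the ".local" step (an assumption):
    # dotless, non-IPv4 hosts get ".local" appended.
    if "." not in host and not IPV4_RE.search(host):
        return host + ".local"
    return host


ipv6_domain = with_local_suffix("[::1]")
ipv4_domain = with_local_suffix("127.0.0.1")

print(ipv6_domain, user_domain_match(ipv6_domain, "[::1]"))
print(ipv4_domain, user_domain_match(ipv4_domain, "127.0.0.1"))
```

The IPv4 host is compared unchanged and matches. The IPv6 host is compared as [::1].local against [::1], so neither the exact comparison nor the initial-dot suffix rule can succeed.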

That is, every IPv6 address is allowed by DefaultCookiePolicy no matter what blocked_domains is set to.
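This can also be checked without starting a server, by calling the policy directly. The sketch below assumes that the domain string the policy sees for the IPv6 host is [::1].local, as described above. The result for that host depends on the Python version, so no expected output is shown for it; per this report, affected versions should report it as not blocked.

```python
from http.cookiejar import DefaultCookiePolicy

# Block the IPv4 loopback address and both spellings of the IPv6 one.
policy = DefaultCookiePolicy(blocked_domains=["127.0.0.1", "[::1]", "::1"])

# Assumed domain strings: the IPv4 host as-is, the IPv6 host with ".local".
ipv4_blocked = policy.is_blocked("127.0.0.1")
ipv6_blocked = policy.is_blocked("[::1].local")

print("127.0.0.1 blocked:", ipv4_blocked)
print("[::1].local blocked:", ipv6_blocked)
```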

Since DefaultCookiePolicy is mainly about privacy, this could allow a bypass, so I think this is quite serious.

This issue was previously discussed in:

#135500
https://discuss.python.org/t/support-ipv6-in-http-cookiejar-when-deciding-whether-a-string-is-a-hdn-or-a-ip-addr/95439

My previous solution is in #135502, which uses ipaddress.ip_address() to identify the address. That is NOT good, because the IPv6 address is wrapped in []. I am writing the test script and completing the PR now.
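The bracket problem is easy to see in isolation: ipaddress.ip_address() rejects the URL-style form [::1]. The sketch below shows that, plus one possible workaround that strips the brackets first. is_ip_literal is a hypothetical helper for illustration only, not the code from either PR.

```python
import ipaddress


def is_ip_literal(text):
    """Hypothetical helper: True if text is an IP address, optionally in []."""
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]  # drop the URL-style brackets
    try:
        ipaddress.ip_address(text)
    except ValueError:
        return False
    return True


# The bracketed form is not accepted by ipaddress.ip_address() itself.
try:
    ipaddress.ip_address("[::1]")
    bracketed_parses = True
except ValueError:
    bracketed_parses = False

print("ip_address('[::1]') parses:", bracketed_parses)
for candidate in ["[::1]", "::1", "127.0.0.1", "example.com", "[::1].local"]:
    print(f"{candidate!r}: {is_ip_literal(candidate)}")
```

Note that this helper also accepts a bracketed IPv4 address such as [127.0.0.1], which a real fix may want to reject.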

@ericvsmith Thanks!

CPython versions tested on:

3.14

Operating systems tested on:

Windows

