[BUG] Concurrency issue with Deltaproxy lazyloader cache invalidation. #61734

@n-holmstedt

Description

Under deltaproxy, importlib's invalidate_caches() throws a KeyError for a cache entry that has already been deleted. This causes the module function being executed to fail.

Setup
Running 210 IOS-XR napalm minions controlled by a total of 30 deltaproxies (max 8 per deltaproxy), connected to one master, all deployed in Kubernetes using the official image.

Steps to Reproduce the behavior
This bug is a bit hard to reproduce and only seems to occur when scheduling custom module commands. The example here is a small abstraction of the napalm 'cli' function, scheduled to run every night as:

        device_version:
          function: iosxr.version
          when: 2:10am
          splay: 10

[ERROR   ] Unhandled exception running iosxr.version
Traceback (most recent call last):
  File "/var/cache/salt/proxy/extmods/har3-28597-510/modules/iosxr.py", line 46, in _send_command
    ret = __salt__['net.cli'](command, textfsm_parse=parse, textfsm_template=template_path)
  File "/usr/local/lib/python3.7/site-packages/salt/loader/context.py", line 78, in __getitem__
    return self.value()[item]
  File "/usr/local/lib/python3.7/site-packages/salt/loader/lazy.py", line 334, in __getitem__
    super().__getitem__(item)  # try to get the item from the dictionary
  File "/usr/local/lib/python3.7/site-packages/salt/utils/lazy.py", line 98, in __getitem__
    if self._load(key):
  File "/usr/local/lib/python3.7/site-packages/salt/loader/lazy.py", line 1033, in _load
    ret = _inner_load(mod_name)
  File "/usr/local/lib/python3.7/site-packages/salt/loader/lazy.py", line 1022, in _inner_load
    if self._load_module(name) and key in self._dict:
  File "/usr/local/lib/python3.7/site-packages/salt/loader/lazy.py", line 822, in _load_module
    self.__clean_sys_path()
  File "/usr/local/lib/python3.7/site-packages/salt/loader/lazy.py", line 641, in __clean_sys_path
    importlib.invalidate_caches()
  File "/usr/local/lib/python3.7/importlib/__init__.py", line 71, in invalidate_caches
    finder.invalidate_caches()
  File "<frozen importlib._bootstrap_external>", line 1186, in invalidate_caches
KeyError: '/usr/local/lib/python37.zip'

Looking at importlib's PathFinder.invalidate_caches (Python 3.7):

    def invalidate_caches(cls):
        """Call the invalidate_caches() method on all path entry finders
        stored in sys.path_importer_caches (where implemented)."""
        for name, finder in list(sys.path_importer_cache.items()):
            if finder is None:
                del sys.path_importer_cache[name]
            elif hasattr(finder, 'invalidate_caches'):
                finder.invalidate_caches()

It seems the deltaproxy-hosted minions share this cache (sys.path_importer_cache). Under certain circumstances, a minion that has evaluated 'finder' as None won't be able to delete the entry, because another minion has already deleted it between the list() snapshot and the del. Might this issue scale somewhat linearly with the number of minions hosted on a deltaproxy?
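
The suspected interleaving can be replayed deterministically. The sketch below is not Salt code. It assumes the hosted minions run in one process and so share the cache, and it uses a plain dict as a stand-in for sys.path_importer_cache so the real import machinery is untouched. The `after_snapshot` hook and the `other_minion` function are hypothetical names that simulate a second caller running between the snapshot and the delete.

```python
# Stand-in for sys.path_importer_cache (assumption: a plain dict is enough
# to show the failure). The key and the None value mirror the traceback.
cache = {"/usr/local/lib/python37.zip": None}


def invalidate(cache, after_snapshot=None):
    """Same loop shape as PathFinder.invalidate_caches in Python 3.7."""
    snapshot = list(cache.items())
    if after_snapshot is not None:
        # Hypothetical hook: simulates another thread being scheduled here.
        after_snapshot()
    for name, finder in snapshot:
        if finder is None:
            del cache[name]  # raises KeyError if the entry is already gone
        elif hasattr(finder, "invalidate_caches"):
            finder.invalidate_caches()


def other_minion():
    # A second caller finishes its own invalidation first.
    invalidate(cache)


try:
    invalidate(cache, after_snapshot=other_minion)
    outcome = "no error"
except KeyError as exc:
    outcome = "KeyError: {}".format(exc)

print(outcome)  # KeyError: '/usr/local/lib/python37.zip'
```

The first caller still holds the stale snapshot, so its `del` targets a key that the second caller has already removed, which matches the KeyError in the traceback.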

We've solved this internally by catching the KeyError in the lazy loader.
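
A minimal sketch of that kind of workaround, not the actual internal patch: `safe_invalidate_caches` and `_INVALIDATE_LOCK` are hypothetical names, and the lock is an extra assumption that only serializes callers going through this helper.

```python
import importlib
import threading

# Hypothetical module-level lock; not part of Salt.
_INVALIDATE_LOCK = threading.Lock()


def safe_invalidate_caches():
    """Invalidate importlib's finder caches, tolerating a concurrent delete.

    Returns True if invalidation completed, False if a KeyError was swallowed.
    """
    with _INVALIDATE_LOCK:
        try:
            importlib.invalidate_caches()
        except KeyError:
            # A caller that bypassed this helper already removed the entry
            # from sys.path_importer_cache, so there is nothing left to delete.
            return False
    return True
```

Under this assumption, __clean_sys_path would call safe_invalidate_caches() in place of importlib.invalidate_caches(). Swallowing the KeyError should be harmless here, since the entry the loop tried to delete is already gone.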

Expected behavior
The function being scheduled for all minions under the deltaproxy should be executed.

Versions Report

salt --versions-report
Salt Version:
          Salt: 3004

Dependency Versions:
          cffi: 1.14.6
      cherrypy: unknown
      dateutil: 2.8.1
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.11.3
       libgit2: 1.1.0
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: 2.17
      pycrypto: Not Installed
  pycryptodome: 3.9.8
        pygit2: 1.6.1
        Python: 3.7.12 (default, Sep  8 2021, 01:55:52)
  python-gnupg: 0.4.4
        PyYAML: 5.4.1
         PyZMQ: 18.0.1
         smmap: Not Installed
       timelib: 0.2.4
       Tornado: 4.5.3
           ZMQ: 4.3.1

System Versions:
          dist: alpine 3.14.2
        locale: UTF-8
       machine: x86_64
       release: 5.4.72-microsoft-standard-WSL2
        system: Linux
       version: Alpine Linux 3.14.2

Additional context
N/A

Labels

Bug (broken, incorrect, or confusing behavior), Delta-Proxy
