deduplication is not working #12320

Open
phuget opened this issue Apr 28, 2025 · 3 comments

phuget commented Apr 28, 2025

Hey,

I have a problem with deduplication. I use the trivy-dojo-report-operator to import my reports into DefectDojo, but I keep getting clones of vulnerabilities that differ only in creation time and description.

I enabled deduplication in DefectDojo and set the maximum number of duplicates to 0. I think the cause could be the description field: it contains our resource name, which ends with a hash that changes every time we deploy. I already tried changing the deduplication algorithm, but nothing has worked so far. Is there a workaround?

I looked into the logs of the deployed DefectDojo pods but didn't see any errors.

Here are the values of one of the findings that have not been recognized as duplicates:

Title: CVE-2024-7254 com.google.protobuf:protobuf-java 3.25.4 (same for both)
Product name: Testrun (same for both)
Service name: Testrun (same for both)
Component Version: 3.25.4 (same for both)
Component Name: com.google.protobuf:protobuf-java (same for both)
Vulnerability Ids: CVE-2024-7254 (same for both)
Severity: high (same for both)
Description:
      protobuf: StackOverflow vulnerability in Protocol Buffers (same for both)
      Fixed version: 3.25.5, 4.27.5, 4.28.2 (same for both)
      container.name: Testrun (same for both)
      resource.kind: ReplicaSet (same for both)
      resource.name: Testrun-5b66c55585 (the hash suffix differs between the two)
      resource.namespace: dev (same for both)

Defect-Dojo-Django Version Docker: 2.42.0-alpine
Helm Version: 1.6.183

phuget added the bug label Apr 28, 2025
valentijnscholten (Member) commented Apr 28, 2025

The default dedupe config for the Trivy Operator parser is:

"Trivy Operator Scan": ["title", "severity", "vulnerability_ids", "description"],

You can recalculate the hash_codes via:

docker compose exec uwsgi /bin/bash -c "python manage.py dedupe --parser 'Trivy Operator Scan' --hash_code_only"
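
To illustrate why the "description" entry matters here, the following is a minimal sketch, not DefectDojo's actual hashing code. It assumes the hash_code is derived only from the configured fields; the helper `sketch_hash_code` and the second resource name are made up for illustration.

```python
import hashlib

# Field list as quoted in the default config above.
DEFAULT_FIELDS = ["title", "severity", "vulnerability_ids", "description"]


def sketch_hash_code(finding: dict, fields: list[str]) -> str:
    """Hash only the listed fields (illustrative, not DefectDojo's implementation)."""
    joined = "|".join(str(finding.get(name, "")) for name in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


base = {
    "title": "CVE-2024-7254 com.google.protobuf:protobuf-java 3.25.4",
    "severity": "High",
    "vulnerability_ids": "CVE-2024-7254",
}
# Same finding from two deployments; only the ReplicaSet hash suffix differs.
first = {**base, "description": "resource.name: Testrun-5b66c55585"}
second = {**base, "description": "resource.name: Testrun-0000000000"}  # made-up suffix

reduced_fields = [name for name in DEFAULT_FIELDS if name != "description"]

with_description_match = (
    sketch_hash_code(first, DEFAULT_FIELDS) == sketch_hash_code(second, DEFAULT_FIELDS)
)
without_description_match = (
    sketch_hash_code(first, reduced_fields) == sketch_hash_code(second, reduced_fields)
)

print("match with description:", with_description_match)
print("match without description:", without_description_match)
```

Under these assumptions, the two findings only hash to the same value once "description" is left out of the field list, which is consistent with the behaviour reported in this issue.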

MPritsch commented

Thanks @valentijnscholten, I'm a colleague of phuget. This seems to be working. I actually found this before your reply by reading through different issues on GitHub and following linked Markdown files. Might I suggest adding this information to the official documentation, in the deduplication section here: https://docs.defectdojo.com/en/working_with_findings/finding_deduplication/about_deduplication/

We had trouble understanding what parsers do, how they are connected to Tests, and how hash codes are involved.
It was not obvious that the keys in the parser mapping correspond to the "Test Type". I assumed the key was a typo, since spaces in the keys of key-value mappings are rare. We configured the HASHCODE_FIELDS_PER_SCANNER value for "Trivy Operator Scan" without the "description" field and regenerated the hash_codes.
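
As a rough sketch of the change described above, the override amounts to dropping one field from the entry keyed by the Test Type name. The excerpt below models the mapping as a plain dict with only the entry quoted earlier in this thread; the helper `without_field` is hypothetical, and where such an override lives in a given deployment (for example a local settings file) is an assumption to verify against your own setup.

```python
import copy

# Excerpt modelled on the default quoted earlier in this thread;
# the real mapping contains many more Test Types.
HASHCODE_FIELDS_PER_SCANNER = {
    "Trivy Operator Scan": ["title", "severity", "vulnerability_ids", "description"],
}


def without_field(mapping: dict, test_type: str, field: str) -> dict:
    """Return a copy of the mapping with one field removed for one Test Type."""
    updated = copy.deepcopy(mapping)
    updated[test_type] = [name for name in updated[test_type] if name != field]
    return updated


# The key must match the Test Type name exactly, including the spaces.
HASHCODE_FIELDS_PER_SCANNER = without_field(
    HASHCODE_FIELDS_PER_SCANNER, "Trivy Operator Scan", "description"
)

print(HASHCODE_FIELDS_PER_SCANNER["Trivy Operator Scan"])
# prints ['title', 'severity', 'vulnerability_ids']
```

Existing findings keep their old hash_code until it is recalculated, which is why the regeneration step was needed after changing the field list.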

None of this was mentioned or linked in the documentation linked above.

We found the information we needed in this document and the subsequent chapters: https://github.com/DefectDojo/django-DefectDojo/blob/master/docs/content/en/open_source/archived_docs/usage/features.md#deduplication-algorithms
My problem with the location is that it is part of the "archived_docs" folder, where I would assume the information to be outdated.

All in all, we spent about 2-3 hours searching for this.

valentijnscholten (Member) commented

Copying in @paulOsinski
