Status: Open
Labels: question (Further information is requested)
Description
I'm experiencing very slow performance when running a CodeQL query on a Python project using TaintTracking::Global. The analysis never finishes, even after more than 2 hours, on a project that I believe is not very large. Below are some details:
- CVE project: CVE-2024-23637
- Python files: 263
- Total lines: ~88,981
- Sources: < 200
- Sinks: < 200
- Tracking config: TaintTracking::Global<RemoteToFileConfiguration>
My query looks like this:
// Standard imports for the new data-flow library (assumed; omitted from the simplified snippet)
import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking

module RemoteToFileConfiguration implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) {
    MySources::isSource(source)
  }

  predicate isSink(DataFlow::Node sink) {
    MySinks::isMySink(sink)
  }
}

module Flow = TaintTracking::Global<RemoteToFileConfiguration>;
import Flow::PathGraph

from Flow::PathNode source, Flow::PathNode sink
where Flow::flowPath(source, sink)
select sink.getNode(), source, sink, "Flow path from source to sink"
I defined the sinks and sources like this (simplified):
module MySinks {
  class Sink extends DataFlow::Node {
    Sink() {
      exists(FunctionValue func, Call call |
        func.getQualifiedName() = "run_code" or
        func.getQualifiedName() = "check_syntax_error" or
        ...
        call.getFunc().pointsTo(func) and
        this = DataFlow::exprNode(call.getAnArg())
      )
    }
  }

  predicate isMySink(DataFlow::Node sink) {
    exists(Sink s | s = sink)
  }
}
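One thing that may be worth checking in the simplified Sink above: in QL, `and` binds more tightly than `or`. If the full version chains the name comparisons with `or` and has no parentheses around them, the `call.getFunc().pointsTo(func)` and `this = ...` clauses attach only to the last comparison. For the earlier names, `call` and `this` would then be unconstrained, so every data-flow node could count as a sink, which might account for the run time. Below is a minimal sketch of a grouped variant. It assumes the two names shown are representative (the elided names are not filled in), and the module name MySinksSketch is hypothetical.

```ql
import python
import semmle.python.dataflow.new.DataFlow

// Hypothetical module name, used only for this sketch.
module MySinksSketch {
  class Sink extends DataFlow::Node {
    Sink() {
      exists(FunctionValue func, Call call |
        // A set literal groups the name alternatives, so the two
        // clauses below apply to every listed name.
        func.getQualifiedName() = ["run_code", "check_syntax_error"] and
        call.getFunc().pointsTo(func) and
        this = DataFlow::exprNode(call.getAnArg())
      )
    }
  }

  predicate isMySink(DataFlow::Node sink) { sink instanceof Sink }
}
```

Running a quick `select` over `Sink` and counting the results is one way to see whether the sink set is near the expected size (under 200) or far larger.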
My questions:
- Why is the performance so slow in this case?
- Are there any best practices for optimizing TaintTracking::Global on Python?
- I tried using func.getQualifiedName() with a full path like "Module xml.etree.ElementInclude.Function default_loader", but it didn't work in VS Code (the function wasn't found). Is there a correct way to define sinks using fully qualified names for Python?
Thank you very much for any guidance or suggestions!
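On the fully qualified name question, one approach that may fit library functions is the API graphs library (`semmle.python.ApiGraphs`), which finds calls by following the import path instead of matching a name string. The sketch below is an illustration under assumptions, not a confirmed fix. The predicate name `defaultLoaderArg` is hypothetical, and taking the first positional argument as the sink is an assumption. Whether this suits project-internal functions such as run_code is not covered here.

```ql
import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.ApiGraphs

// Hypothetical helper: the first positional argument of any call to
// xml.etree.ElementInclude.default_loader, resolved through imports.
DataFlow::Node defaultLoaderArg() {
  result =
    API::moduleImport("xml")
        .getMember("etree")
        .getMember("ElementInclude")
        .getMember("default_loader")
        .getACall()
        .getArg(0)
}
```

A sink predicate could then use `sink = defaultLoaderArg()` instead of a points-to lookup on the qualified name.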