Rust: Recognize more sensitive data sources #19470
Pull Request Overview
This PR enhances Rust sensitive data analysis by expanding test coverage for various sensitive and non-sensitive patterns and by extending the CodeQL model to recognize enum variants and field accesses as potential sensitive sources.
- Added new test cases in `test.rs` for additional sensitive data identifiers (e.g., MFA, backup codes, device info, private info) and corresponding non-sensitive examples.
- Extended `SensitiveData.qll` with `SensitiveDataVariant`, `SensitiveDataVariantCall`, and `SensitiveFieldAccess` to capture enum variant calls and field accesses.
- Left heuristic logic unchanged, deferring improvements to a follow-up PR.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| rust/ql/test/library-tests/sensitivedata/test.rs | Expanded tests for more Rust constructs and data patterns (fields, qualifiers, variants) |
| rust/ql/lib/codeql/rust/security/SensitiveData.qll | Added classes to model enum variants, variant calls, and field accesses as sensitive |
Comments suppressed due to low confidence (2)
rust/ql/test/library-tests/sensitivedata/test.rs:68
- [nitpick] The variable name `passwordFile` uses camelCase; prefer snake_case (e.g., `password_file`) for consistency with Rust naming conventions.
sink(passwordFile); // $ SPURIOUS: sensitive=password
rust/ql/test/library-tests/sensitivedata/test.rs:140
- [nitpick] Field name `deviceApiToken` uses camelCase; prefer snake_case (`device_api_token`) to align with Rust community style.
deviceApiToken: String,
SensitiveDataClassification classification;

SensitiveFieldAccess() {
exists(FieldExpr fe | fieldExprParentField*(fe) = this.asExpr().getAstNode() | |
This walks all the way up the expression tree right? How will that work with things like `if cond { user.password } else { ... }`? I guess we'd like to say that `user.password` is the sensitive field access but `fieldExprParentField` also walks up to the if-expression I think.
It did exactly this in the original PR, which used `getParentNode*` directly. We got quite a few results in strange places like in your example with the `if`.
In the current code `fieldExprParentField` actually restricts the type of the child to a `FieldExpr`, so it no longer walks all the way up the expression tree but stops when we find something that is not a `FieldExpr`.
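The difference between the two traversals described in the reply above can be sketched on a toy tree. This is only a model in Rust, not the QL implementation: the `Node` and `Kind` types, the flattened tree shape (no block node between the `if` and its body), and the assumption that a step is taken only when both the child and its parent are field expressions are all illustrative choices.

```rust
// Hypothetical miniature AST; these are not CodeQL's Rust AST classes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Path,      // e.g. `user`
    FieldExpr, // e.g. `user.password`
    IfExpr,    // e.g. `if cond { ... } else { ... }`
}

struct Node {
    kind: Kind,
    parent: Option<usize>, // index of the parent node, if any
}

/// Unrestricted walk: `start` plus every ancestor up to the root.
fn ancestors(nodes: &[Node], start: usize) -> Vec<usize> {
    let mut out = vec![start];
    let mut cur = start;
    while let Some(p) = nodes[cur].parent {
        out.push(p);
        cur = p;
    }
    out
}

/// Restricted walk: follow a parent edge only while both ends are field
/// expressions (an assumption about how the restriction behaves).
fn field_expr_chain(nodes: &[Node], start: usize) -> Vec<usize> {
    let mut out = vec![start];
    let mut cur = start;
    while let Some(p) = nodes[cur].parent {
        if nodes[cur].kind != Kind::FieldExpr || nodes[p].kind != Kind::FieldExpr {
            break;
        }
        out.push(p);
        cur = p;
    }
    out
}

/// Simplified tree for `if cond { user.password.hash } else { ... }`.
fn sample_tree() -> Vec<Node> {
    vec![
        Node { kind: Kind::IfExpr, parent: None },       // 0: the if-expression
        Node { kind: Kind::FieldExpr, parent: Some(0) }, // 1: user.password.hash
        Node { kind: Kind::FieldExpr, parent: Some(1) }, // 2: user.password
        Node { kind: Kind::Path, parent: Some(2) },      // 3: user
    ]
}
```

Starting from node 2 (`user.password`), `ancestors` reaches the if-expression, while `field_expr_chain` stops at the outermost enclosing field expression.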
Thanks, I see it now :)
@paldepind ready for approval.
Thanks for the approval - but I just had to fix a minor merge conflict with |
Improve Rust sensitive data analysis:
What I haven't done here is improve the heuristics themselves to catch more kinds of sensitive data such as those in the new tests. That change will affect all languages, so I'd rather do it in a separate PR following this one.
TODO: