Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Add MaD models for SysAllocString and friends #18463

Merged
merged 3 commits into from
Jan 10, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions cpp/ql/lib/ext/oleauto.model.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
extensions:
- addsTo:
pack: codeql/cpp-all
extensible: summaryModel
data: # namespace, type, subtypes, name, signature, ext, input, output, kind, provenance
- ["", "", False, "SysAllocString", "", "", "Argument[*0]", "ReturnValue[*]", "value", "manual"]
- ["", "", False, "SysAllocStringByteLen", "", "", "Argument[*0]", "ReturnValue[*]", "value", "manual"]
- ["", "", False, "SysAllocStringLen", "", "", "Argument[*0]", "ReturnValue[*]", "value", "manual"]
Comment on lines +7 to +8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two presumably can cut the tail of the string passed in as an argument (when the passed length is too short). I assume we still consider that value flow (although part of the data has gone missing)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. We model that somewhat inconsistency across different models. For example, strncpy is currently modeled as taint because the length argument could mean that we only did a partial copy. In practice, I have a feeling that most of these length-bounded functions are used simply as defensive programming and do provide full copies in most cases. So nowadays I tend to lean towards modeling them as value-preserving.

4 changes: 4 additions & 0 deletions cpp/ql/src/change-notes/2025-01-09-SysAllocString.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
---
category: minorAnalysis
---
* Added dataflow models for `SysAllocString` and related functions.
Original file line number Diff line number Diff line change
Expand Up @@ -10,14 +10,14 @@ edges
| asio_streams.cpp:100:44:100:62 | call to buffer | asio_streams.cpp:103:29:103:39 | *send_buffer | provenance | Sink:MaD:6 |
| asio_streams.cpp:100:64:100:71 | *send_str | asio_streams.cpp:56:18:56:23 | [summary param] *0 in buffer | provenance | |
| asio_streams.cpp:100:64:100:71 | *send_str | asio_streams.cpp:100:44:100:62 | call to buffer | provenance | MaD:10 |
| test.cpp:4:5:4:11 | [summary param] 0 in ymlStep | test.cpp:4:5:4:11 | [summary] to write: ReturnValue in ymlStep | provenance | MaD:966 |
| test.cpp:7:10:7:18 | call to ymlSource | test.cpp:7:10:7:18 | call to ymlSource | provenance | Src:MaD:964 |
| test.cpp:7:10:7:18 | call to ymlSource | test.cpp:11:10:11:10 | x | provenance | Sink:MaD:965 |
| test.cpp:4:5:4:11 | [summary param] 0 in ymlStep | test.cpp:4:5:4:11 | [summary] to write: ReturnValue in ymlStep | provenance | MaD:969 |
| test.cpp:7:10:7:18 | call to ymlSource | test.cpp:7:10:7:18 | call to ymlSource | provenance | Src:MaD:967 |
| test.cpp:7:10:7:18 | call to ymlSource | test.cpp:11:10:11:10 | x | provenance | Sink:MaD:968 |
| test.cpp:7:10:7:18 | call to ymlSource | test.cpp:13:18:13:18 | x | provenance | |
| test.cpp:13:10:13:16 | call to ymlStep | test.cpp:13:10:13:16 | call to ymlStep | provenance | |
| test.cpp:13:10:13:16 | call to ymlStep | test.cpp:15:10:15:10 | y | provenance | Sink:MaD:965 |
| test.cpp:13:10:13:16 | call to ymlStep | test.cpp:15:10:15:10 | y | provenance | Sink:MaD:968 |
| test.cpp:13:18:13:18 | x | test.cpp:4:5:4:11 | [summary param] 0 in ymlStep | provenance | |
| test.cpp:13:18:13:18 | x | test.cpp:13:10:13:16 | call to ymlStep | provenance | MaD:966 |
| test.cpp:13:18:13:18 | x | test.cpp:13:10:13:16 | call to ymlStep | provenance | MaD:969 |
nodes
| asio_streams.cpp:56:18:56:23 | [summary param] *0 in buffer | semmle.label | [summary param] *0 in buffer |
| asio_streams.cpp:56:18:56:23 | [summary] to write: ReturnValue in buffer | semmle.label | [summary] to write: ReturnValue in buffer |
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7735,6 +7735,12 @@ WARNING: module 'TaintTracking' has been deprecated and may be removed in future
| taint.cpp:790:10:790:12 | ref arg & ... | taint.cpp:790:11:790:12 | f2 [inner post update] | |
| taint.cpp:790:10:790:12 | ref arg & ... | taint.cpp:791:7:791:8 | f2 | |
| taint.cpp:790:11:790:12 | f2 | taint.cpp:790:10:790:12 | & ... | |
| taint.cpp:805:12:805:25 | call to SysAllocString | taint.cpp:806:8:806:9 | p1 | |
| taint.cpp:806:8:806:9 | p1 | taint.cpp:806:7:806:9 | * ... | TAINT |
| taint.cpp:808:12:808:32 | call to SysAllocStringByteLen | taint.cpp:809:8:809:9 | p2 | |
| taint.cpp:809:8:809:9 | p2 | taint.cpp:809:7:809:9 | * ... | TAINT |
| taint.cpp:811:12:811:28 | call to SysAllocStringLen | taint.cpp:812:8:812:9 | p3 | |
| taint.cpp:812:8:812:9 | p3 | taint.cpp:812:7:812:9 | * ... | TAINT |
| vector.cpp:16:43:16:49 | source1 | vector.cpp:17:26:17:32 | source1 | |
| vector.cpp:16:43:16:49 | source1 | vector.cpp:31:38:31:44 | source1 | |
| vector.cpp:17:21:17:33 | call to vector | vector.cpp:19:14:19:14 | v | |
Expand Down
21 changes: 21 additions & 0 deletions cpp/ql/test/library-tests/dataflow/taint-tests/taint.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -789,4 +789,25 @@ void fopen_test(char* source) {
FILE* f2;
fopen_s(&f2, source, "r");
sink(f2); // $ ast,ir
}

typedef wchar_t OLECHAR;
typedef OLECHAR* LPOLESTR;
typedef const LPOLESTR LPCOLESTR;
typedef OLECHAR* BSTR;
typedef const char* LPCSTR;

BSTR SysAllocString(const OLECHAR *);
BSTR SysAllocStringByteLen(LPCSTR, unsigned );
BSTR SysAllocStringLen(const OLECHAR *,unsigned);

void test_sysalloc() {
auto p1 = SysAllocString((LPOLESTR)indirect_source());
sink(*p1); // $ ir MISSING: ast

auto p2 = SysAllocStringByteLen(indirect_source(), 10);
sink(*p2); // $ ir MISSING: ast

auto p3 = SysAllocStringLen((LPOLESTR)indirect_source(), 10);
sink(*p3); // $ ir MISSING: ast
}
Original file line number Diff line number Diff line change
Expand Up @@ -2021,6 +2021,14 @@ getParameterTypeName
| taint.cpp:783:5:783:11 | fopen_s | 1 | const char * |
| taint.cpp:783:5:783:11 | fopen_s | 2 | const char * |
| taint.cpp:785:6:785:15 | fopen_test | 0 | char * |
| taint.cpp:800:6:800:19 | SysAllocString | 0 | const OLECHAR * |
| taint.cpp:800:6:800:19 | SysAllocString | 0 | const wchar_t * |
| taint.cpp:801:6:801:26 | SysAllocStringByteLen | 0 | LPCSTR |
| taint.cpp:801:6:801:26 | SysAllocStringByteLen | 0 | const char * |
| taint.cpp:801:6:801:26 | SysAllocStringByteLen | 1 | unsigned int |
| taint.cpp:802:6:802:22 | SysAllocStringLen | 0 | const OLECHAR * |
| taint.cpp:802:6:802:22 | SysAllocStringLen | 0 | const wchar_t * |
| taint.cpp:802:6:802:22 | SysAllocStringLen | 1 | unsigned int |
| vector.cpp:13:6:13:9 | sink | 0 | int |
| vector.cpp:14:27:14:30 | sink | 0 | vector> & |
| vector.cpp:14:27:14:30 | sink | 0 | vector> & |
Expand Down
Loading