Discovering a Real World Vulnerability

This tutorial is a technical write-up of the blog entry about the discovery of CVE-2018-19859, a vulnerability that allows an attacker to perform arbitrary file writes through OpenRefine.

OpenRefine is described by its authors as "a free, open source power tool for working with messy data and improving it". A common use case is cleaning up messy public data sets before running statistical calculations on them. To support this, OpenRefine can import and export data that may be scattered across multiple files or archives.

CVE-2018-19859

CVE-2018-19859 describes a directory traversal attack that can be exploited to achieve an arbitrary file write. The vulnerability is rooted in the unsafe handling of ZIP files, a vulnerability pattern often seen by security researchers, analysts, and penetration testers.
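To make the traversal concrete, here is a minimal sketch of what happens when an archive entry name is joined onto an extraction directory without validation. It uses Python's posixpath module as a stand-in for the path handling and is not OpenRefine code; the directory /srv/refine/import-job and both entry names are hypothetical.

```python
import posixpath

def resolve_entry(dest_dir, entry_name):
    """Join an archive entry name onto the extraction directory, as a naive unzip routine would."""
    # No validation happens here: ".." segments in the entry name are simply resolved.
    return posixpath.normpath(posixpath.join(dest_dir, entry_name))

# Hypothetical extraction directory and entry names, for illustration only.
DEST = "/srv/refine/import-job"

print(resolve_entry(DEST, "data/table.csv"))           # /srv/refine/import-job/data/table.csv
print(resolve_entry(DEST, "../../../etc/cron.d/job"))  # /etc/cron.d/job
```

The second entry name resolves to a location outside the extraction directory, which is what turns a directory traversal into an arbitrary file write once the entry's content is written to that path.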

Creating the CPG

OpenRefine is not distributed as a single JAR file, which is what java2cpg expects. However, java2cpg does not require its input to follow a particular file structure as long as the input is in ZIP format, which is essentially what a JAR archive is: a ZIP file with a .jar suffix that contains the .class files of interest is sufficient. Starting from the OpenRefine release, which you can download as a .tar.gz archive, we can prepare the .jar for java2cpg by executing:

wget https://github.com/OpenRefine/OpenRefine/releases/download/3.1/openrefine-linux-3.1.tar.gz
tar xfz openrefine-linux-3.1.tar.gz
find openrefine-3.1 -name "*.class" | zip openrefine.jar -@

We then run java2cpg with the -w flag followed by a comma-separated list of package names. Only the parts of the application whose package prefix is com.google.refine or org.openrefine are then included in the CPG. For relatively large applications such as OpenRefine, restricting the CPG to the relevant parts of the application saves computing and analysis time.

./java2cpg.sh openrefine.jar -w com.google.refine,org.openrefine -nb -o openrefine.bin.zip
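Before building the CPG, it can be useful to preview which class files fall under the chosen package prefixes. The sketch below is only an approximation for that purpose and does not reproduce how java2cpg applies -w: it assumes that the directory layout inside the archive mirrors the package names, and the entry paths in the example are hypothetical.

```python
def classes_under(entries, prefixes):
    """Return the .class entries whose path contains one of the package prefixes."""
    # "com.google.refine" corresponds to the directory fragment "/com/google/refine/".
    fragments = ["/" + prefix.replace(".", "/") + "/" for prefix in prefixes]
    return [
        entry for entry in entries
        # Prepend "/" so that a package at the archive root is matched as well.
        if entry.endswith(".class")
        and any(fragment in "/" + entry for fragment in fragments)
    ]

# Hypothetical archive entries, for illustration only.
entries = [
    "openrefine-3.1/webapp/WEB-INF/classes/com/google/refine/importing/ImportingUtilities.class",
    "openrefine-3.1/webapp/extensions/example/classes/org/example/Helper.class",
    "openrefine-3.1/README.txt",
]

for entry in classes_under(entries, ["com.google.refine", "org.openrefine"]):
    print(entry)
```

For the real archive, the entry list could be obtained with zipfile.ZipFile("openrefine.jar").namelist() from the Python standard library.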

Finally, we load the newly created CPG into ocular:

./ocular.sh
ocular> loadCpg("openrefine.bin.zip")
[..]

Identifying importers (sources)

Input sources represent program points where potentially malicious (attacker-controlled) data may enter the system. In this section, we will show how to search and define an input source with ocular.

As mentioned in the introduction, OpenRefine relies on importing and exporting data. Hence, as a first guess, it may be a good idea to look for imports. We can do this by looking for all methods that contain the substring Import in their full method name. This search strategy yields 502 methods, which is too many to inspect manually. Hence, we have to make our search filter more specific to narrow down the list.

cpg.method.fullName(".*Import.*").toList.size
res1: Int = 502

Based on a naming convention often found in Java Servlet code, we can narrow down the list to methods that are accessible via HTTP by adding do(Get|Post) to our search pattern: HTTP GET handlers are named doGet, while HTTP POST handlers are named doPost. The query below reduces the number of results to only 13 methods, a number small enough to inspect manually.

cpg.method.fullName(".*Import.*do(Get|Post).*").toList.size
res2: Int = 13
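The effect of tightening the pattern can be reproduced outside of ocular on a handful of method full names taken from this tutorial. This is only a sketch: it uses Python's re module and assumes that the pattern has to match the entire full name, which is what the leading and trailing .* in the queries suggest.

```python
import re

# Method full names as they appear in the query output and flows of this tutorial.
FULL_NAMES = [
    "com.google.refine.importing.DefaultImportingController.doGet:void(javax.servlet.http.HttpServletRequest,javax.servlet.http.HttpServletResponse)",
    "com.google.refine.importing.DefaultImportingController.doPost:void(javax.servlet.http.HttpServletRequest,javax.servlet.http.HttpServletResponse)",
    "com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String)",
    "java.util.zip.ZipEntry.getName:java.lang.String()",
]

def select(pattern, names):
    """Keep the names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]

broad = select(".*Import.*", FULL_NAMES)
narrow = select(".*Import.*do(Get|Post).*", FULL_NAMES)

print(len(broad), "names contain Import")
print(len(narrow), "of them are doGet/doPost handlers")
```

On this small sample, the broad pattern keeps every name that contains Import, while the narrower pattern keeps only the two servlet handlers.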

Some of these methods belong to a class named DefaultImportingController which, as its name suggests, may be interesting from a security standpoint. We refine our query by replacing Import with DefaultImportingController. The new search result consists of a doGet and a doPost method.

cpg.method.fullName(".*DefaultImportingController.*do(Get|Post).*").fullName.p
com.google.refine.importing.DefaultImportingController.doGet:void(javax.servlet.http.HttpServletRequest,javax.servlet.http.HttpServletResponse)
com.google.refine.importing.DefaultImportingController.doPost:void(javax.servlet.http.HttpServletRequest,javax.servlet.http.HttpServletResponse)

Based on these results, we have found a source to serve as the starting point for our security analysis after issuing only three queries. The query below defines the source by applying our search filter. To be more specific, we also restrict the parameters to the required type, HttpServletRequest.

val source = cpg.method.fullName(".*DefaultImportingController.*do(Get|Post).*").parameter.evalType(".*HttpServletRequest.*")

Looking for the sink

Sinks are security-sensitive program points to which malicious, attacker-controlled input (coming from the source) may flow.

This section uses the OpenRefine CVE to illustrate how you can find and define sinks with ocular. Recall that we are specifically looking for unzipping vulnerabilities.

We can use a very basic query to first identify methods that belong to the zip package and have getName as part of their name, and then list the methods that call them. As a result, we find the method explodeArchive:

ocular> cpg.method.fullName(".*zip.*getName.*").caller.fullName.p
com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)

Judging from its name, the method explodeArchive appears to be worth investigating. The query below tags the parameters of explodeArchive as sinks.

val sink = cpg.method.name("explodeArchive").parameter

To find possible data flows between sources and sinks, we can issue a reachableBy query as in the snippet below:

sink.reachableBy(source).flows.p

Resulting flow

The reachableBy query above gives us a rather detailed picture of the data flow, which starts at the parameter named request, of type HttpServletRequest, of the doPost method. Skimming over the flow, we can see that OpenRefine retrieves content from a POST request (retrieveContentFromPostRequest); afterwards, a method named download uses a variable named urlString; saveStream consumes a url; a file is allocated (allocateFile); and eventually the method tryOpenAsArchive is called before we end up in explodeArchive with a ZipInputStream passed in as the variable archiveIS.

In summary, OpenRefine downloads data based on a URL, reads it as a ZipInputStream and tries to unpack it with the method explodeArchive.

________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
| param | type | method | signature |
|=======================================================================================================================================================================================================================================================================================================================================================================|
| request(1) | javax.servlet.http.HttpServletRequest | doPost | com.google.refine.importing.DefaultImportingController.doPost:void(javax.servlet.http.HttpServletRequest,javax.servlet.http.HttpServletResponse) |
| request | javax.servlet.http.HttpServletRequest | doPost | com.google.refine.importing.DefaultImportingController.doPost:void(javax.servlet.http.HttpServletRequest,javax.servlet.http.HttpServletResponse) |
| request(1) | javax.servlet.http.HttpServletRequest | doLoadRawData | com.google.refine.importing.DefaultImportingController.doLoadRawData:void(javax.servlet.http.HttpServletRequest,javax.servlet.http.HttpServletResponse,java.util.Properties) |
| request | javax.servlet.http.HttpServletRequest | doLoadRawData | com.google.refine.importing.DefaultImportingController.doLoadRawData:void(javax.servlet.http.HttpServletRequest,javax.servlet.http.HttpServletResponse,java.util.Properties) |
| request(1) | javax.servlet.http.HttpServletRequest | loadDataAndPrepareJob | com.google.refine.importing.ImportingUtilities.loadDataAndPrepareJob:void(javax.servlet.http.HttpServletRequest,javax.servlet.http.HttpServletResponse,java.util.Properties,com.google.refine.importing.ImportingJob,org.json.JSONObject) |
| request | javax.servlet.http.HttpServletRequest | loadDataAndPrepareJob | com.google.refine.importing.ImportingUtilities.loadDataAndPrepareJob:void(javax.servlet.http.HttpServletRequest,javax.servlet.http.HttpServletResponse,java.util.Properties,com.google.refine.importing.ImportingJob,org.json.JSONObject) |
| request(1) | javax.servlet.http.HttpServletRequest | retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| request | javax.servlet.http.HttpServletRequest | retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| param0(1) | javax.servlet.http.HttpServletRequest | parseRequest | org.apache.commons.fileupload.servlet.ServletFileUpload.parseRequest:java.util.List(javax.servlet.http.HttpServletRequest) |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| tempFiles | java.util.List | retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| tempFiles | java.util.List | retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| this(0) | java.util.List | iterator | java.util.List.iterator:java.util.Iterator() |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| l12_0 | java.util.Iterator | retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| l12_0 | java.util.Iterator | retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| this(0) | java.util.Iterator | next | java.util.Iterator.next:java.lang.Object() |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| $r2 | java.lang.Object | retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| $r2 | java.lang.Object | retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| param1(2) | ANY | <operator>.cast | <operator>.cast |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| fileItem | org.apache.commons.fileupload.FileItem| retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| fileItem | org.apache.commons.fileupload.FileItem| retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| this(0) | org.apache.commons.fileupload.FileItem| getInputStream | org.apache.commons.fileupload.FileItem.getInputStream:java.io.InputStream() |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| stream | java.io.InputStream | retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| stream | java.io.InputStream | retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| param0(1) | java.io.InputStream | asString | org.apache.commons.fileupload.util.Streams.asString:java.lang.String(java.io.InputStream) |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| urlString | java.lang.String | retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| urlString | java.lang.String | retrieveContentFromPostRequest| com.google.refine.importing.ImportingUtilities.retrieveContentFromPostRequest:void(javax.servlet.http.HttpServletRequest,java.util.Properties,java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress) |
| urlString(6)| java.lang.String | download | com.google.refine.importing.ImportingUtilities.download:void(java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$SavingUpdate,java.lang.String) |
| urlString | java.lang.String | download | com.google.refine.importing.ImportingUtilities.download:void(java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$SavingUpdate,java.lang.String) |
| urlString(6)| java.lang.String | download | com.google.refine.importing.ImportingUtilities.download:void(java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$SavingUpdate,java.lang.String,java.lang.String) |
| urlString | java.lang.String | download | com.google.refine.importing.ImportingUtilities.download:void(java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$SavingUpdate,java.lang.String,java.lang.String) |
| param0(1) | java.lang.String | <init> | java.net.URL.<init>:void(java.lang.String) |
| url | java.net.URL | download | com.google.refine.importing.ImportingUtilities.download:void(java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$SavingUpdate,java.lang.String,java.lang.String) |
| url | java.net.URL | download | com.google.refine.importing.ImportingUtilities.download:void(java.io.File,org.json.JSONObject,com.google.refine.importing.ImportingUtilities$Progress,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$SavingUpdate,java.lang.String,java.lang.String) |
| url(2) | java.net.URL | saveStream | com.google.refine.importing.ImportingUtilities.saveStream:boolean(java.io.InputStream,java.net.URL,java.io.File,com.google.refine.importing.ImportingUtilities$Progress,com.google.refine.importing.ImportingUtilities$SavingUpdate,org.json.JSONObject,org.json.JSONArray,long)|
| url | java.net.URL | saveStream | com.google.refine.importing.ImportingUtilities.saveStream:boolean(java.io.InputStream,java.net.URL,java.io.File,com.google.refine.importing.ImportingUtilities$Progress,com.google.refine.importing.ImportingUtilities$SavingUpdate,org.json.JSONObject,org.json.JSONArray,long)|
| this(0) | java.net.URL | getPath | java.net.URL.getPath:java.lang.String() |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| localname | java.lang.String | saveStream | com.google.refine.importing.ImportingUtilities.saveStream:boolean(java.io.InputStream,java.net.URL,java.io.File,com.google.refine.importing.ImportingUtilities$Progress,com.google.refine.importing.ImportingUtilities$SavingUpdate,org.json.JSONObject,org.json.JSONArray,long)|
| localname | java.lang.String | saveStream | com.google.refine.importing.ImportingUtilities.saveStream:boolean(java.io.InputStream,java.net.URL,java.io.File,com.google.refine.importing.ImportingUtilities$Progress,com.google.refine.importing.ImportingUtilities$SavingUpdate,org.json.JSONObject,org.json.JSONArray,long)|
| param0(1) | java.lang.String | append | java.lang.StringBuilder.append:java.lang.StringBuilder(java.lang.String) |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| $r1 | java.lang.StringBuilder | saveStream | com.google.refine.importing.ImportingUtilities.saveStream:boolean(java.io.InputStream,java.net.URL,java.io.File,com.google.refine.importing.ImportingUtilities$Progress,com.google.refine.importing.ImportingUtilities$SavingUpdate,org.json.JSONObject,org.json.JSONArray,long)|
| $r1 | java.lang.StringBuilder | saveStream | com.google.refine.importing.ImportingUtilities.saveStream:boolean(java.io.InputStream,java.net.URL,java.io.File,com.google.refine.importing.ImportingUtilities$Progress,com.google.refine.importing.ImportingUtilities$SavingUpdate,org.json.JSONObject,org.json.JSONArray,long)|
| this(0) | java.lang.StringBuilder | append | java.lang.StringBuilder.append:java.lang.StringBuilder(java.lang.String) |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| $r2 | java.lang.StringBuilder | saveStream | com.google.refine.importing.ImportingUtilities.saveStream:boolean(java.io.InputStream,java.net.URL,java.io.File,com.google.refine.importing.ImportingUtilities$Progress,com.google.refine.importing.ImportingUtilities$SavingUpdate,org.json.JSONObject,org.json.JSONArray,long)|
| $r2 | java.lang.StringBuilder | saveStream | com.google.refine.importing.ImportingUtilities.saveStream:boolean(java.io.InputStream,java.net.URL,java.io.File,com.google.refine.importing.ImportingUtilities$Progress,com.google.refine.importing.ImportingUtilities$SavingUpdate,org.json.JSONObject,org.json.JSONArray,long)|
| this(0) | java.lang.StringBuilder | toString | java.lang.StringBuilder.toString:java.lang.String() |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| localname | java.lang.String | saveStream | com.google.refine.importing.ImportingUtilities.saveStream:boolean(java.io.InputStream,java.net.URL,java.io.File,com.google.refine.importing.ImportingUtilities$Progress,com.google.refine.importing.ImportingUtilities$SavingUpdate,org.json.JSONObject,org.json.JSONArray,long)|
| localname | java.lang.String | saveStream | com.google.refine.importing.ImportingUtilities.saveStream:boolean(java.io.InputStream,java.net.URL,java.io.File,com.google.refine.importing.ImportingUtilities$Progress,com.google.refine.importing.ImportingUtilities$SavingUpdate,org.json.JSONObject,org.json.JSONArray,long)|
| name(2) | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| name | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| this(0) | java.lang.String | substring | java.lang.String.substring:java.lang.String(int,int) |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| name | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| name | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| name | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| name | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| param1(2) | java.lang.String | <init> | java.io.File.<init>:void(java.io.File,java.lang.String) |
| file | java.io.File | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| file | java.io.File | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| file | java.io.File | saveStream | com.google.refine.importing.ImportingUtilities.saveStream:boolean(java.io.InputStream,java.net.URL,java.io.File,com.google.refine.importing.ImportingUtilities$Progress,com.google.refine.importing.ImportingUtilities$SavingUpdate,org.json.JSONObject,org.json.JSONArray,long)|
| file | java.io.File | saveStream | com.google.refine.importing.ImportingUtilities.saveStream:boolean(java.io.InputStream,java.net.URL,java.io.File,com.google.refine.importing.ImportingUtilities$Progress,com.google.refine.importing.ImportingUtilities$SavingUpdate,org.json.JSONObject,org.json.JSONArray,long)|
| file(2) | java.io.File | postProcessRetrievedFile | com.google.refine.importing.ImportingUtilities.postProcessRetrievedFile:boolean(java.io.File,java.io.File,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress) |
| file | java.io.File | postProcessRetrievedFile | com.google.refine.importing.ImportingUtilities.postProcessRetrievedFile:boolean(java.io.File,java.io.File,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress) |
| file(1) | java.io.File | tryOpenAsArchive | com.google.refine.importing.ImportingUtilities.tryOpenAsArchive:java.io.InputStream(java.io.File,java.lang.String,java.lang.String) |
| file | java.io.File | tryOpenAsArchive | com.google.refine.importing.ImportingUtilities.tryOpenAsArchive:java.io.InputStream(java.io.File,java.lang.String,java.lang.String) |
| param0(1) | java.io.File | <init> | java.io.FileInputStream.<init>:void(java.io.File) |
| $r7 | java.io.FileInputStream | tryOpenAsArchive | com.google.refine.importing.ImportingUtilities.tryOpenAsArchive:java.io.InputStream(java.io.File,java.lang.String,java.lang.String) |
| $r7 | java.io.FileInputStream | tryOpenAsArchive | com.google.refine.importing.ImportingUtilities.tryOpenAsArchive:java.io.InputStream(java.io.File,java.lang.String,java.lang.String) |
| param0(1) | java.io.InputStream | <init> | java.util.zip.ZipInputStream.<init>:void(java.io.InputStream) |
| $r6 | java.util.zip.ZipInputStream | tryOpenAsArchive | com.google.refine.importing.ImportingUtilities.tryOpenAsArchive:java.io.InputStream(java.io.File,java.lang.String,java.lang.String) |
| $r6 | java.util.zip.ZipInputStream | tryOpenAsArchive | com.google.refine.importing.ImportingUtilities.tryOpenAsArchive:java.io.InputStream(java.io.File,java.lang.String,java.lang.String) |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| archiveIS | java.io.InputStream | postProcessRetrievedFile | com.google.refine.importing.ImportingUtilities.postProcessRetrievedFile:boolean(java.io.File,java.io.File,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress) |
| archiveIS | java.io.InputStream | postProcessRetrievedFile | com.google.refine.importing.ImportingUtilities.postProcessRetrievedFile:boolean(java.io.File,java.io.File,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress) |
| archiveIS(2)| java.io.InputStream | explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress) |
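A flow table of this length is easier to skim once it is condensed to the sequence of methods it traverses. The helper below is a small, hypothetical post-processing sketch and not part of ocular: it assumes the pipe-separated layout shown above, with the method name in the third column, and is demonstrated on a few rows copied from the table.

```python
def methods_in_flow(table_text):
    """Return the method names of a flow table in order, collapsing consecutive repeats."""
    methods = []
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # A data row splits into: '', param, type, method, signature, ''.
        if len(cells) < 5:
            continue  # border lines and separators carry no data
        method = cells[3]
        if method == "method" or method.startswith("<operator>"):
            continue  # skip the header row and operator pseudo-methods
        if not methods or methods[-1] != method:
            methods.append(method)
    return methods

# A few rows copied from the flow above.
SAMPLE = """
| param | type | method | signature |
| this(0) | java.net.URL | getPath | java.net.URL.getPath:java.lang.String() |
| param1(2) | ANY | <operator>.assignment | <operator>.assignment |
| name(2) | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| name | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| this(0) | java.lang.String | substring | java.lang.String.substring:java.lang.String(int,int) |
| param1(2) | java.lang.String | <init> | java.io.File.<init>:void(java.io.File,java.lang.String) |
"""

print(" -> ".join(methods_in_flow(SAMPLE)))
```

Applied to the whole table, the same helper would list the chain from doPost down to explodeArchive in a single line.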

Vulnerable flow

Now that we know how to look for sources and sinks, we can refine our search to detect an actual vulnerability.

We already control a ZipInputStream, but for our vulnerability we need to find a flow to a File or, better, to a FileOutputStream. In the query below, we use the FileOutputStream constructor as the sink or, more specifically, the first parameter of the constructor.

val source = cpg.method.name("explodeArchive").parameter
val sink = cpg.method.fullName(".*FileOutputStream.*init.*").parameter.index(1)

By issuing a reachableBy query, we find a large number of flows, since there are many FileOutputStream sinks. Hence, we have to narrow down the number of flows by providing additional expert knowledge: we know that, for an attacker to control the destination path to which a ZIP entry is extracted, malicious input has to pass through the getName method. After applying the passes filter to the flows, the number of flows drops from 1847 to 2.

sink.reachableBy(source).flows.l.size
res28: Int = 1847
sink.reachableBy(source).flows.passes("getName").l.size
res29: Int = 2
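Conceptually, the passes filter keeps only those flows that traverse a given method. The sketch below is a simplified model of that idea and not ocular's implementation: each flow is represented as an ordered list of method names, and the filter checks for an exact name. The first flow follows the method sequence of the flow reported in this tutorial; the second one is hypothetical.

```python
def passes(flows, method_name):
    """Keep the flows in which at least one step belongs to the named method."""
    return [flow for flow in flows if method_name in flow]

# Each flow is modelled as the ordered list of method names it traverses.
flows = [
    # Entry name taken from the archive and turned into a File (as reported in this tutorial).
    ["explodeArchive", "getNextEntry", "getName", "allocateFile",
     "substring", "<init>", "saveStreamToFile", "<init>"],
    # Hypothetical flow that reaches the same sink without touching getName.
    ["explodeArchive", "saveStreamToFile", "<init>"],
]

relevant = passes(flows, "getName")
print(len(flows), "flows in total,", len(relevant), "pass through getName")
```

Flows that never touch getName cannot carry an attacker-chosen entry name into the destination path, which is why discarding them is safe for this particular question.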

The resulting flow

___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
| param | type | method | signature |
|==========================================================================================================================================================================================================================================================================|
| archiveIS(2)| java.io.InputStream | explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| archiveIS | java.io.InputStream | explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| param1(2) | ANY | <operator>.cast | <operator>.cast |
| param1(2) | ANY | <operator>.assignment| <operator>.assignment |
| zis | java.util.zip.ZipInputStream| explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| zis | java.util.zip.ZipInputStream| explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| this(0) | java.util.zip.ZipInputStream| getNextEntry | java.util.zip.ZipInputStream.getNextEntry:java.util.zip.ZipEntry() |
| param1(2) | ANY | <operator>.assignment| <operator>.assignment |
| ze | java.util.zip.ZipEntry | explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| ze | java.util.zip.ZipEntry | explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| param1(2) | ANY | <operator>.assignment| <operator>.assignment |
| ze | java.util.zip.ZipEntry | explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| ze | java.util.zip.ZipEntry | explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| this(0) | java.util.zip.ZipEntry | getName | java.util.zip.ZipEntry.getName:java.lang.String() |
| param1(2) | ANY | <operator>.assignment| <operator>.assignment |
| fileName2 | java.lang.String | explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| fileName2 | java.lang.String | explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| name(2) | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| name | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| this(0) | java.lang.String | substring | java.lang.String.substring:java.lang.String(int,int) |
| param1(2) | ANY | <operator>.assignment| <operator>.assignment |
| name | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| name | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| param1(2) | ANY | <operator>.assignment| <operator>.assignment |
| name | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| name | java.lang.String | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| param1(2) | java.lang.String | <init> | java.io.File.<init>:void(java.io.File,java.lang.String) |
| file | java.io.File | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| file | java.io.File | allocateFile | com.google.refine.importing.ImportingUtilities.allocateFile:java.io.File(java.io.File,java.lang.String) |
| param1(2) | ANY | <operator>.assignment| <operator>.assignment |
| file2 | java.io.File | explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| file2 | java.io.File | explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| file(2) | java.io.File | saveStreamToFile | com.google.refine.importing.ImportingUtilities.saveStreamToFile:long(java.io.InputStream,java.io.File,com.google.refine.importing.ImportingUtilities$SavingUpdate) |
| file | java.io.File | saveStreamToFile | com.google.refine.importing.ImportingUtilities.saveStreamToFile:long(java.io.InputStream,java.io.File,com.google.refine.importing.ImportingUtilities$SavingUpdate) |
| param0(1) | java.io.File | <init> | java.io.FileOutputStream.<init>:void(java.io.File) |
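The flow above shows the name returned by ZipEntry.getName reaching java.io.File and FileOutputStream without any check in between. One common way to break such a flow is to resolve the destination of every entry and reject it if it falls outside the extraction directory. The following is a minimal sketch of that check, written in Python for illustration; it is not the fix applied in OpenRefine, and the function name extract_checked is hypothetical.

```python
import io
import os
import zipfile

def extract_checked(zip_bytes, dest_dir):
    """Extract entries that resolve inside dest_dir; return the names that were rejected."""
    root = os.path.realpath(dest_dir)
    rejected = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Resolve the would-be destination before anything is written.
            target = os.path.realpath(os.path.join(root, info.filename))
            if os.path.commonpath([root, target]) != root:
                rejected.append(info.filename)  # entry would escape dest_dir
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with archive.open(info) as src, open(target, "wb") as dst:
                dst.write(src.read())
    return rejected
```

The important design choice is that the comparison happens on the resolved path rather than on the raw entry name, so that names such as ../evil.txt or absolute paths are caught regardless of how they are spelled.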

At this point we know that we control a flow from doPost to explodeArchive and from explodeArchive to a FileOutputStream instance. To verify that the vulnerability exists, one question remains: is the file actually written? To answer it, we use the write method of FileOutputStream as the sink. In addition to the reachableBy query, we apply the passesNot filter, which keeps only those flows that do not pass through uncompressFile, a method that handles .gz and .bz2 files.

val source = cpg.method.name("explodeArchive").parameter
val sink = cpg.method.fullName(".*FileOutputStream.*write.*").parameter.index(1)
sink.reachableBy(source).passesNot("uncompressFile").p

The query returns the following flow:

___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
| param | type | method | signature |
|==========================================================================================================================================================================================================================================================================|
| archiveIS(2)| java.io.InputStream | explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| archiveIS | java.io.InputStream | explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| param1(2) | ANY | <operator>.cast | <operator>.cast |
| param1(2) | ANY | <operator>.assignment| <operator>.assignment |
| zis | java.util.zip.ZipInputStream| explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| zis | java.util.zip.ZipInputStream| explodeArchive | com.google.refine.importing.ImportingUtilities.explodeArchive:boolean(java.io.File,java.io.InputStream,org.json.JSONObject,org.json.JSONArray,com.google.refine.importing.ImportingUtilities$Progress)|
| stream(1) | java.io.InputStream | saveStreamToFile | com.google.refine.importing.ImportingUtilities.saveStreamToFile:long(java.io.InputStream,java.io.File,com.google.refine.importing.ImportingUtilities$SavingUpdate) |
| stream | java.io.InputStream | saveStreamToFile | com.google.refine.importing.ImportingUtilities.saveStreamToFile:long(java.io.InputStream,java.io.File,com.google.refine.importing.ImportingUtilities$SavingUpdate) |
| this(0) | java.io.InputStream | read | java.io.InputStream.read:int(byte[]) |
| bytes | byte[] | saveStreamToFile | com.google.refine.importing.ImportingUtilities.saveStreamToFile:long(java.io.InputStream,java.io.File,com.google.refine.importing.ImportingUtilities$SavingUpdate) |
| bytes | byte[] | saveStreamToFile | com.google.refine.importing.ImportingUtilities.saveStreamToFile:long(java.io.InputStream,java.io.File,com.google.refine.importing.ImportingUtilities$SavingUpdate) |
| param0(1) | byte[] | write | java.io.FileOutputStream.write:void(byte[],int,int) |
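
This flow can be pictured with a small Java sketch of a naive extraction loop of the same shape: an InputStream is wrapped as a ZipInputStream, read into a byte[] buffer, and the buffer is passed to FileOutputStream.write. The class ZipWriteSketch and all of its methods are hypothetical and simplified; this is not the code of explodeArchive or saveStreamToFile. All files are created inside a fresh temporary directory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipWriteSketch {
    // Creates a fresh temporary directory that serves as a sandbox.
    public static File newSandbox() {
        try {
            return Files.createTempDirectory("zip-sketch").toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a one-entry ZIP archive in memory.
    public static byte[] zipWith(String entryName, String content) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ZipOutputStream zos = new ZipOutputStream(buffer)) {
                zos.putNextEntry(new ZipEntry(entryName));
                zos.write(content.getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Naive extraction (assumption: no validation of the entry name).
    // Writes the first entry of the archive below dir and returns the target.
    public static File extractFirstEntry(File dir, InputStream archiveIS) {
        try (ZipInputStream zis = new ZipInputStream(archiveIS)) {
            ZipEntry entry = zis.getNextEntry();
            if (entry == null) {
                throw new IllegalArgumentException("empty archive");
            }
            File target = new File(dir, entry.getName());
            try (FileOutputStream fos = new FileOutputStream(target)) {
                byte[] bytes = new byte[4096];
                int n;
                while ((n = zis.read(bytes)) != -1) {
                    fos.write(bytes, 0, n); // the sink of the flow
                }
            }
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File root = newSandbox();
        File extractDir = new File(root, "extract");
        extractDir.mkdir();
        byte[] archive = zipWith("../escaped.txt", "payload");
        extractFirstEntry(extractDir, new ByteArrayInputStream(archive));
        // Reports whether the entry landed next to, not inside, extractDir.
        System.out.println(new File(root, "escaped.txt").isFile());
    }
}
```

In this sketch the entry name alone decides where the bytes end up, which is the property the two flows establish together for the real code: one flow covers the file name, the other the file content.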

Conclusion

We have demonstrated how to detect CVE-2018-19859 in OpenRefine with Ocular, using explodeArchive as the source and write (from FileOutputStream) as the sink. The queries verify that the vulnerable data flow exists. To determine whether it is actually exploitable, however, an exploit still has to be created. Interested readers may want to look at this issue, where we have documented a description of the vulnerability and the exploitation steps.