Filter Steps
Filter Steps are CPGQL Steps which filter nodes in a traversal according to a criterion. Ocular supports four Generic Filter Steps, which can be added to any other step, and a number of specific filter steps called Property Filter Steps, which can be used on nodes of a certain type.
The Generic Filter Steps are `filter`, `filterNot`, `where` and `whereNot`. The Property Filter Steps for each node type correspond to the Property Directives it has defined.
We will look at the behaviour of each of these steps while analyzing a simple program named X42:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[]) {
  if (argc > 1 && strcmp(argv[1], "42") == 0) {
    fprintf(stderr, "It depends!\n");
    exit(42);
  }
  printf("What is the meaning of life?\n");
  exit(0);
}
Property Filter Steps
Property Filter Steps are Filter Steps which continue a traversal if the properties of the nodes they point to pass a certain criterion.
For example, to query the Code Property Graph for all CALL nodes which have the string `exit` as the value for their NAME property, and return their CODE property:
ocular> cpg.call.name("exit").code.l
res0: List[String] = List("exit(0)", "exit(42)")
The criterion is unique to every Property Filter Step. In the case of the `name` Property Filter Step of the `call` Node-Type Step, the criterion is a string that represents a regular expression; that is, all of the following queries deliver the same result for the X42 program:
ocular> cpg.call.name("exit").code.l
res0: List[String] = List("exit(0)", "exit(42)")
ocular> cpg.call.name("[eE]xit").code.l
res1: List[String] = List("exit(0)", "exit(42)")
ocular> cpg.call.name("ex.*").code.l
res2: List[String] = List("exit(0)", "exit(42)")
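The three patterns agree because each one matches the complete string `exit`. The plain-Scala sketch below illustrates that idea outside of Ocular. It assumes the `name` step compares its pattern against the whole property value, as the results above suggest; the `callNames` list and the `matching` helper are hypothetical stand-ins and not part of CPGQL.

```scala
object RegexFilterSketch {
  // Hypothetical stand-in for the NAME property of a few CALL nodes in X42.
  val callNames: List[String] = List("exit", "exit", "printf", "fprintf", "strcmp")

  // Keep the names that the pattern matches as a whole:
  // String.matches only succeeds if the entire string matches.
  def matching(pattern: String): List[String] =
    callNames.filter(_.matches(pattern))

  val exact: List[String]     = matching("exit")    // literal pattern
  val charClass: List[String] = matching("[eE]xit") // character class
  val wildcard: List[String]  = matching("ex.*")    // prefix plus wildcard
  val partial: List[String]   = matching("ex")      // covers only part of the name
}
```

Under that assumption a pattern such as `ex` selects nothing, which is why the prefix query is written as `ex.*`.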
Just like all other Filter Steps, Property Filter Steps can be chained together:
ocular> cpg.call.name("exit").code(".*0.*").code.l
res0: List[String] = List("exit(0)")
Unlike Property Directives, Property Filter Steps usually outnumber the properties defined on a node type. Most Property Filter Steps also have a negated version available:
ocular> cpg.call.name("exit").codeNot(".*0.*").code.l
res0: List[String] = List("exit(42)")
where
The `where` Step is a Filter Step which continues a traversal for all nodes which pass its criterion. The criterion of the `where` step is represented by an expression which has one argument, a variable that points to the node matched in the previous step, and which returns a boolean. For example, suppose you'd like to query the Code Property Graph of the X42 program for all CALL nodes which have `exit` as the value of their NAME property, and return their CODE property in a list:
ocular> cpg.call.where(node => node.name == "exit").code.l
res0: List[String] = List("exit(0)", "exit(42)")
`where` steps can be chained together:
ocular> cpg.call.where(node => node.name == "exit").where(node => node.code.contains("42")).code.l
res0: List[String] = List("exit(42)")
And their expression can contain any combination of boolean statements:
ocular> cpg.call.where(node => node.name == "exit" && node.code.contains("42")).code.l
res0: List[String] = List("exit(42)")
// equivalent in logic to the query above
ocular> cpg.call.where(node => true && 1 == 1 && node.name == "exit" && node.code.contains("42")).code.l
res0: List[String] = List("exit(42)")
One helpful trick is to use the shorthand `_` operator in `where` expressions, which points to the single argument that is passed into it, that is, the node.
// long form
ocular> cpg.call.where(node => node.name == "exit").code.l
res0: List[String] = List("exit(0)", "exit(42)")
// short form
ocular> cpg.call.where(_.name == "exit").code.l
res1: List[String] = List("exit(0)", "exit(42)")
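Because the `where` criterion is an ordinary function from a node to a boolean, its behaviour can be mimicked with Scala's own collection `filter`. The sketch below is only an analogy that runs without Ocular: the `Call` case class and the `calls` list are hypothetical, simplified stand-ins for CALL nodes of X42.

```scala
object WhereSketch {
  // Hypothetical, simplified model of a CALL node: only NAME and CODE.
  final case class Call(name: String, code: String)

  val calls: List[Call] = List(
    Call("exit", "exit(0)"),
    Call("exit", "exit(42)"),
    Call("strcmp", "strcmp(argv[1], \"42\")")
  )

  // Long form: a named argument that points to the node.
  val longForm: List[String] =
    calls.filter(node => node.name == "exit").map(_.code)

  // Short form: `_` stands for the single argument.
  val shortForm: List[String] =
    calls.filter(_.name == "exit").map(_.code)

  // Chaining two predicates.
  val chained: List[String] =
    calls.filter(_.name == "exit").filter(_.code.contains("42")).map(_.code)

  // Combining the same predicates in one boolean expression.
  val combined: List[String] =
    calls.filter(n => n.name == "exit" && n.code.contains("42")).map(_.code)
}
```

Chaining two predicates and joining them with `&&` select the same nodes, which mirrors the equivalent `where` queries above.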
filter
Just like `where`, the `filter` Step is a Filter Step (ಠ~ಠ) which continues a traversal for all nodes which pass its criterion. The expression representing the criterion takes one argument, a variable that represents the traversal of the previous step, and returns another traversal. Say you'd like to query the Code Property Graph of the X42 program again for all CALL nodes which have `exit` as the value of their NAME property, and return the value of their CODE property in a list:
ocular> cpg.call.filter(node => node.name("exit")).code.l
res0: List[String] = List("exit(0)", "exit(42)")
// or using the `_` shorthand:
ocular> cpg.call.filter(_.name("exit")).code.l
res1: List[String] = List("exit(0)", "exit(42)")
`filter`'s expression supports traversals of any length:
ocular> cpg.call.filter(_.name("exit").argument.code("42")).code.l
res0: List[String] = List("exit(42)")
And just like `where`, it supports chaining:
// equivalent to the previous query
ocular> cpg.call.filter(_.name("exit")).filter(_.argument.code("42")).code.l
res0: List[String] = List("exit(42)")
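One way to read these results is that `filter` keeps a node whenever the traversal started from that node yields at least one element. The sketch below models that reading with plain Scala lists. The reading itself is an assumption inferred from the outputs above, and `Call`, `calls` and `keepIfNonEmpty` are hypothetical names that do not exist in Ocular.

```scala
object FilterSketch {
  // Hypothetical, simplified CALL node with the CODE of its arguments.
  final case class Call(name: String, code: String, arguments: List[String])

  val calls: List[Call] = List(
    Call("exit", "exit(0)", List("0")),
    Call("exit", "exit(42)", List("42")),
    Call("strcmp", "strcmp(argv[1], \"42\")", List("argv[1]", "\"42\""))
  )

  // Assumed semantics: keep a node when the sub-traversal that starts
  // from that single node produces at least one result.
  def keepIfNonEmpty[A, B](nodes: List[A])(sub: List[A] => List[B]): List[A] =
    nodes.filter(node => sub(List(node)).nonEmpty)

  // Roughly: filter(_.name("exit").argument.code("42"))
  val single: List[String] =
    keepIfNonEmpty(calls) { t =>
      t.filter(_.name == "exit").flatMap(_.arguments).filter(_ == "42")
    }.map(_.code)

  // Roughly: filter(_.name("exit")).filter(_.argument.code("42"))
  val exitsOnly: List[Call] =
    keepIfNonEmpty(calls)(t => t.filter(_.name == "exit"))
  val chained: List[String] =
    keepIfNonEmpty(exitsOnly)(t => t.flatMap(_.arguments).filter(_ == "42")).map(_.code)
}
```

In this model the sub-traversal only decides whether a node is kept; the elements it produces (here, argument code) are discarded, so the outer traversal still continues from the CALL nodes.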
filterNot
`filterNot` is the inverse operation of the `filter` Step.
ocular> cpg.call.filterNot(_.name("exit")).code.l
res0: List[String] = List(
"printf(\"What is the meaning of life?\\n\")",
"fprintf(stderr, \"It depends!\\n\")",
"argv[1]",
"strcmp(argv[1], \"42\")",
"strcmp(argv[1], \"42\") == 0",
"argc > 1",
"argc > 1 && strcmp(argv[1], \"42\") == 0"
)
It supports chaining as well:
ocular> cpg.call.filterNot(_.name("exit")).filterNot(_.name("strcmp")).code.l
res0: List[String] = List(
"printf(\"What is the meaning of life?\\n\")",
"fprintf(stderr, \"It depends!\\n\")",
"argv[1]",
"strcmp(argv[1], \"42\") == 0",
"argc > 1",
"argc > 1 && strcmp(argv[1], \"42\") == 0"
)
And traversals of any length:
ocular> cpg.call.filterNot(_.name("exit").argument.code("42")).filterNot(_.name("strcmp")).code.l
res0: List[String] = List(
"exit(0)",
"printf(\"What is the meaning of life?\\n\")",
"fprintf(stderr, \"It depends!\\n\")",
"argv[1]",
"strcmp(argv[1], \"42\") == 0",
"argc > 1",
"argc > 1 && strcmp(argv[1], \"42\") == 0"
)
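The last result still contains `exit(0)` because the first `filterNot` only rejects calls for which the whole inner traversal produces something, and `exit(0)` has no argument with the code `42`. A plain-Scala sketch of that reading follows; it assumes `filterNot` drops a node exactly when the traversal started from it is non-empty, and `Call`, `calls` and `keepIfEmpty` are hypothetical names used for illustration only.

```scala
object FilterNotSketch {
  // Hypothetical, simplified CALL node with the CODE of its arguments.
  final case class Call(name: String, code: String, arguments: List[String])

  val calls: List[Call] = List(
    Call("exit", "exit(0)", List("0")),
    Call("exit", "exit(42)", List("42")),
    Call("strcmp", "strcmp(argv[1], \"42\")", List("argv[1]", "\"42\"")),
    Call("printf", "printf(...)", List("..."))
  )

  // Assumed semantics: keep a node when the sub-traversal that starts
  // from that single node produces no result at all.
  def keepIfEmpty[A, B](nodes: List[A])(sub: List[A] => List[B]): List[A] =
    nodes.filter(node => sub(List(node)).isEmpty)

  // Roughly: filterNot(_.name("exit").argument.code("42"))
  val withoutExit42: List[Call] =
    keepIfEmpty(calls) { t =>
      t.filter(_.name == "exit").flatMap(_.arguments).filter(_ == "42")
    }

  // Roughly: ... .filterNot(_.name("strcmp"))
  val result: List[String] =
    keepIfEmpty(withoutExit42)(t => t.filter(_.name == "strcmp")).map(_.code)
}
```

Only a subset of the X42 calls is modelled here, so the list is shorter than the Ocular output, but `exit(0)` survives and `exit(42)` and the `strcmp` call are removed in the same way.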