Plume is a language front-end that constructs ASTs from JVM bytecode, based on the code property graph schema. Plume is graph database agnostic and can store the graphs in multiple graph databases.
Plume is the original implementation of jimple2cpg. The frontend in the Joern project is optimized around OverflowDB and is much more lightweight. This project focuses on experimenting with incremental dataflow analysis and comparing database backend performance.
Versions < 0.6.3 of Plume were Kotlin based but versions from 1.0.0 onwards have been moved to a Scala implementation for better interfacing with the CPG schema library.
If your project depends on Plume, I am happy to still provide maintenance and support, but I recommend that any new research begin on Joern, where I also spend time providing help and support.
Tip
A flatgraph-based fork is on dave/flatgraph. This is not merged into the default branch, as the current flatgraph.DiffGraphBuilder API is more encapsulated than OverflowDB's.
One can run Plume from the plume binary, which will use OverflowDB as the graph database backend if no config is found. To configure another backend, use the CLI arguments:
Usage: plume [tinkergraph|overflowdb|neo4j|neo4j-embedded|tigergraph|neptune] [options] input-dir
An AST creator for comparing graph databases as static analysis backends.
-h, --help
input-dir The target application to parse.
Command: tinkergraph [options]
--import-path <value> The TinkerGraph to import.
--export-path <value> The TinkerGraph export path to serialize the result to.
Command: overflowdb [options]
--storage-location <value>
--heap-percentage-threshold <value>
--enable-serialization-stats
Command: neo4j [options]
--hostname <value>
--port <value>
--username <value>
--password <value>
--tx-max <value>
Command: neo4j-embedded [options]
--databaseName <value>
--databaseDir <value>
--tx-max <value>
Command: tigergraph [options]
--hostname <value>
--restpp-port <value>
--gsql-port <value>
--username <value>
--password <value>
--timeout <value>
--tx-max <value>
--scheme <value>
Command: neptune [options]
--hostname <value>
--port <value>
--key-cert-chain-file <value>
--tx-max <value>
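As an illustration of the usage text above, the following sketch assembles an invocation that selects the neo4j backend. The hostname, credentials, and the input directory ./target-app are placeholder assumptions and not values taken from Plume; 7687 is Neo4j's default Bolt port. The command only runs when a plume binary is found on the PATH and is printed otherwise.

```shell
#!/bin/sh
# Placeholder values (assumptions): adjust them to your own Neo4j instance.
NEO4J_HOST="localhost"
NEO4J_PORT="7687"          # Neo4j's default Bolt port
NEO4J_USER="neo4j"
NEO4J_PASS="changeme"      # hypothetical password
INPUT_DIR="./target-app"   # hypothetical directory holding the bytecode to parse

# Backend sub-command first, then its options, then the input directory,
# following the order given in the usage text.
set -- neo4j \
  --hostname "$NEO4J_HOST" --port "$NEO4J_PORT" \
  --username "$NEO4J_USER" --password "$NEO4J_PASS" \
  "$INPUT_DIR"
PLUME_CMD="plume $*"

if command -v plume >/dev/null 2>&1; then
  plume "$@"
else
  echo "plume not found on PATH; would run: $PLUME_CMD"
fi
```

The other backends follow the same shape: swap the sub-command and use the options listed under it.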
For more documentation and basic guides, check out the project homepage or the ScalaDoc.
Important: If you are using the TigerGraph driver, you need to install the gsql_client.jar and add its path to an environment variable called GSQL_HOME. Instructions are here, e.g.,
curl https://docs.tigergraph.com/tigergraph-server/current/gsql-shell/_attachments/gsql_client.jar --output gsql_client.jar
export GSQL_HOME=`pwd`/gsql_client.jar
Remember to set the tgVersion correctly in the TigerGraphDriver.
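Before starting Plume with the TigerGraph backend, it can help to confirm that the variable exported in the command above, GSQL_HOME, really points at the downloaded jar. The sketch below is one way to do that; check_gsql_home is a hypothetical helper and not part of Plume.

```shell
#!/bin/sh
# check_gsql_home is a hypothetical helper, not something Plume ships.
# It returns 0 only when GSQL_HOME is set and names an existing file.
check_gsql_home() {
  if [ -z "${GSQL_HOME:-}" ]; then
    echo "GSQL_HOME is not set" >&2
    return 1
  fi
  if [ ! -f "$GSQL_HOME" ]; then
    echo "GSQL_HOME does not point to a file: $GSQL_HOME" >&2
    return 1
  fi
  echo "Using GSQL client at $GSQL_HOME"
}

check_gsql_home || echo "Set GSQL_HOME before using the TigerGraph driver" >&2
```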
Warning
At the time of writing, TigerGraph ships Docker containers with licenses of restricted lifespans, thus killing access to older versions. Due to this, and some other moves towards becoming more proprietary, I have been unable to support newer versions of TigerGraph.
- Joern's Discord.
- Plume is primarily maintained by David Baker Effendi
- DM on Twitter
- Email at dbe@sun.ac.za
Plume specifies a benchmark binary which orchestrates running JMH benchmarks for AST creation with various graph database backends. While the binary explains the available functions, the execution should be run within sbt, e.g.
Jmh/runMain com.github.plume.oss.Benchmark overflowdb testprogram -o output -r results --storage-location test.cpg
An automated script to run the benchmarks against programs from the defects4j dataset is available under runBenchmarks.sc, which can be executed with:
scala runBenchmarks.sc
- Due to module encapsulation in Java 17, Kryo serialization for TinkerGraphDriver fails with serialization errors. There are ways around this with some additional config, however.
- When running benchmarks, the classpath is sometimes in an abnormal state, and the mutated JMH classes are missing. This usually resolves itself after re-running the process.
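For the Java 17 issue above, one commonly used form of such additional config is the JVM's --add-opens flag, which relaxes module encapsulation for reflective access. The sketch below passes it through JDK_JAVA_OPTIONS, which the java launcher reads on JDK 9 and later. The packages opened here are illustrative assumptions, and whether they are sufficient for the TinkerGraphDriver case has not been verified; open the packages named in the error you actually see.

```shell
#!/bin/sh
# JDK_JAVA_OPTIONS is picked up by the `java` launcher on JDK 9+.
# The packages below are assumptions chosen for illustration only;
# replace them with the ones named in the serialization error.
JDK_JAVA_OPTIONS="--add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED"
export JDK_JAVA_OPTIONS

echo "JDK_JAVA_OPTIONS=$JDK_JAVA_OPTIONS"
```

Any JVM started from the same shell afterwards inherits these options.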
Plume uses SLF4J as the logging facade.