Using Soot programmatically to analyze .java source files - code-analysis

I have just started playing around with Soot in order to analyze .java files programmatically. From what I've read, Soot seems to be a very powerful tool for source code analysis but most of the material I found online talks about using it as a command-line tool.
I need to programmatically load classes from .java files in a given directory, construct a Program Dependence Graph (PDG) and do some Program Slicing. I am still not sure if Soot offers slicing but I can implement that myself once I have the PDG.
To get started, I tried using the code below:
Options.v().set_whole_program(true);
Options.v().set_soot_classpath("...");
SootClass c = Scene.v().loadClassAndSupport("MyClass");
c.setApplicationClass();
CHATransformer.v().transform();
CallGraph cg = Scene.v().getCallGraph();
However, it does not work. It gets stuck at the loadClassAndSupport call for a few seconds and then my program just exits abruptly, without throwing an exception or printing anything.
If anyone has tried to use Soot programmatically, are there any other options that I need to set? Or can you point me to a tutorial where they set up Soot programmatically from scratch?

You should not use loadClassAndSupport. Insert a "Scene transformer" instead. Slicing can be achieved by using the FlowDroid extension to Soot. It supports slicing of both Android and Java code.

Related

Compile pbtxt into binarypb

I'm playing around with Mediapipe and I'm trying to better understand how the graph works and what is the input/output of the different calculators.
If I understand correctly, the .pbtxt files are just plain-text instructions that describe how each calculator should interact with the rest of the calculators. These files are compiled into .binarypb files, which are fed to Mediapipe.
For example, this .pbtxt file got compiled into this .binarypb file.
I have a few questions:
I saw https://viz.mediapipe.dev/ , which seems to be Mediapipe's playground. That playground seems to be compiling the text in the textarea on the right. If that is correct, how does it do it? Is there any documentation I can read about it? How are .pbtxt compiled into .binarypb?
I'm especially interested in the web capabilities of Mediapipe and I'd like to create a small POC using both the face-mesh and depth-to-iris features. Unfortunately, there isn't a "solution" for the second one, but there is a demo in Mediapipe's viz claiming depth-to-iris web support (the demo doesn't seem to be working correctly though). If I were able to create a .pbtxt with a pipeline containing the features that I'm interested in, how would I "compile" the .wasm and .data files required to deploy the code to the web?

Adding two new phases to an Xcode framework project

I am building a project on Github written in Objective-C. It resolves MAC addresses down to manufacturer details. The lookup table is currently stored as text file manuf.txt (from the Wireshark project), which is parsed at run-time, which is costly. I would prefer to compile this down to archived objects at build-time, and load that instead.
I would like to amend the build phases such that I:
1. Build a simple compiler
2. Run the compiler, parsing manuf.txt and outputting archived objects
3. Build the framework
4. Copy the archived objects into the framework
I am looking for wisdom on how to achieve steps 1 and 2 using Xcode v7.3 as Xcode provides only a Copy Files phase or a Run Script phase. An example of other projects achieving similar goals would be inspiring.
I suspect that what you are asking is possible, but tricky. The reason is that you will need to write a bunch of class files and then dynamically add them to the project.
Firstly you will need to employ a run script phase to run various tools from the command line to parse your file and generate a number of class files from it. I would suggest looking into various templating engines. For example appledoc uses Mustache templates to generate API documentation files. You could use the same technique to generate header and implementation files.
Next, rather than generating archived objects and trying to import them into a framework, I think you may be better off generating raw source code, adding it to a project and compiling it into a framework. Probably simpler in the long run.
To automatically include the generated code I would look into (which means I haven't actually tried this :-) adding a folder reference to the project rather than an Xcode group. Folder references are an option in the 'Add files to ...' dialog.
Folder references refer to a directory and automatically add the entire contents of that directory to a project. So you can use one to point to the directory where you have generated the source code. This is a much better option than trying to manipulate the project or injecting things into an established framework.
I would prefer to parse the file at runtime. After launch you can look for an already existing output, otherwise parse it one time.
However, I have to do something similar at Objective-Cloud. I simply added a run script build phase and put the compiler call into it.
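The compiler call in such a run script phase can be quite small. Here is a minimal sketch in Python, assuming manuf.txt consists of tab-separated lines (prefix, short name, optional long name) with # comments. The names parse_manuf and compile_to_plist are hypothetical, and the sample lines are illustrative rather than copied from the Wireshark file. It emits a binary property list instead of archived objects, which is a swapped-in format chosen because it needs no Objective-C code at build time.

```python
import plistlib


def parse_manuf(text):
    """Map each address prefix to a manufacturer name.

    Assumes tab-separated fields: prefix, short name, optional long name.
    Blank lines and lines starting with '#' are skipped.
    """
    table = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) < 2:
            continue  # not a prefix/name pair; ignore it
        prefix = fields[0].upper()
        # Prefer the long name when present, otherwise the short one.
        name = fields[2] if len(fields) > 2 and fields[2] else fields[1]
        table[prefix] = name
    return table


def compile_to_plist(text):
    """Serialize the parsed table as a binary property list (bytes)."""
    return plistlib.dumps(parse_manuf(text), fmt=plistlib.FMT_BINARY)


# Illustrative sample input, not taken from the real manuf.txt.
SAMPLE = "# comment\n00:00:0C\tCisco\tCisco Systems, Inc\n00:00:0E\tFujitsu\n"

if __name__ == "__main__":
    data = compile_to_plist(SAMPLE)
    print(plistlib.loads(data))
```

In a run script phase, the script would read the real file and write the resulting bytes to a resource that the framework bundles. A property list of this shape can then be loaded as a dictionary at run time instead of re-parsing the text.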

Compiling .hx code directly (or indirectly) to a dynamic library (ndll)

I am working on a project and I have a plan to separate certain sections out into separate dlls/ndlls in the final program. The main reason I want to do this is to support plugins and add ons, so more functionality can be added if needed, but the core app can still be used if that's the only requirement.
I have done something similar in C# (albeit through an IDE, so I never had to write any linker/compiler commands), so I know the general process, but I can't seem to find a way to write HX code and then have it compile into an ndll.
I found this http://old.haxe.org/doc/cpp/ffi?lang=en which shows how to compile cpp code into a ndll using hxcpp and g++. I would think there should be a way I can use LIME or HXCPP to create a build file that will allow me to do it in one step instead of having to make a "fake" main function to compile the HX to CPP or CS.
If anyone knows of a project that does this and has a build.hxml or build.xml file that describes it, or a tutorial or guide that talks about it, I would love to see it.
Try this:
lime create extension TestExt
lime rebuild TestExt windows
Replace "windows" with "mac" or "linux" as appropriate. Assuming it works, the ndll will show up in a subfolder of TestExt/ndll/.
As for tutorials, I wrote this one. It's targeted at OpenFL programmers, but the "Writing code for iOS" section covers what you'll need to know. (You can also just model your code on the template.)
In case it helps, I've made a tool to generate some of the boilerplate code that Lime requires.

Using Apparat dump with FDT and ant

I am totally new to Flash development and don't even know ActionScript yet.
I have to improve some existing flash application, so at first I need to understand the code.
I want to use some tool for code analysis, something to visualize class dependencies and code structure. I googled and found out about Apparat tool. Now I'm struggling with it because I can not find documentation that describes how to use Apparat. I'm frustrated, but it seems to be the only such tool.
So I started with example.
I've set up apparat running on FDT following this guide:
http://www.webdevotion.be/blog/2010/06/02/how-to-get-up-and-running-with-apparat/
The example (http://blog.joa-ebert.com/2010/05/26/new-apparat-example/) builds well and creates two SWF files. (I'm using ANT builder)
Now I want to analyze existing swf and see a PNG with class dependencies.
How should I do that?
What do I have to add and where?
Or maybe someone can explain how to use dump from windows command line? Something like
dump example.swf exampleAnalysis.png
After resolving all dependencies (which was tricky), I managed to get dump running
dump -i example.swf -uml
But it saves the UML diagram in .DOT format, which is really hard to read: Graphviz GVedit cannot zoom and exports to PNG only what is visible on screen (a messy, zoomed-out graph that is impossible to read), smyrna doesn't work, and zgrviewer fails to load some files.
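One way around the viewer problems may be to skip the GUI tools and render the DOT file with Graphviz's dot command-line program, which writes the whole graph to PNG or SVG (SVG stays readable when zoomed in a browser). Here is a minimal sketch in Python, assuming dot is installed and on the PATH. The names render_command and render, and the file names, are hypothetical.

```python
import subprocess


def render_command(dot_file, out_file, fmt="svg"):
    """Build the Graphviz command line that renders dot_file to out_file."""
    return ["dot", "-T" + fmt, dot_file, "-o", out_file]


def render(dot_file, out_file, fmt="svg"):
    """Run Graphviz; raises CalledProcessError if dot reports a failure."""
    subprocess.run(render_command(dot_file, out_file, fmt), check=True)


if __name__ == "__main__":
    # Hypothetical file names; this only prints the command, it does not run it.
    print(" ".join(render_command("example.dot", "example.svg")))
```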

CDash Custom Dynamic Analysis

I'm trying to integrate custom dynamic analysis tools, such as KWStyle, CppCheck and Visual Leak Detector, into CDash.
I've figured out that I need to generate a DynamicAnalysis.xml file and submit it to CDash from CTest scripts.
I think I know how to run the external tool as a part of the ctest script.
Either by using these variables to change how ctest_memcheck() works
CTEST_MEMORYCHECK_COMMAND
CTEST_MEMORYCHECK_SUPPRESSIONS_FILE
CTEST_MEMORYCHECK_COMMAND_OPTIONS
or by running the tool from the execute_process() command.
But I'm a bit uncertain which one to use.
The main problem I think I have is, how can I extract errors from the output of the custom tool and include that information into the DynamicAnalysis.xml to submit?
The extreme solution I see is that I'd need to write a program that generates a valid DynamicAnalysis.xml file.
But the problem is that I don't know the syntax of the DefectList element in the XML file. I have found no answer from google and even the XML Schema for that file is unhelpful.
EDIT:
Looking at this:
http://www.cdash.org/CDash/viewDynamicAnalysis.php?buildid=987149
What draws my attention are the labels, especially the empty ones. I don't see how these would come from the DynamicAnalysis.xml file. Maybe it tracks any labels that have ever appeared? Can I create my own custom labels somehow?
Does CDash create the labels automatically, depending on the tool type? Does this block custom defect types?
I'm just guessing here, so the question is: can I create custom labels for my custom tool just by generating a DynamicAnalysis.xml file?
It occurred to me that the amount of different errors from CppCheck (static code analysis) is huge, compared to valgrind for instance. I'm not that certain that I should use the dynamic analysis. Maybe a custom build type (Continuous / Experimental / Nightly) thing would work better. Like this:
http://www.cdash.org/CDash/buildSummary.php?buildid=930174
I have no idea how to do this; I guess it requires meddling with the CDash code?
Which one would work better?
If you are using valgrind, you can simply set CTEST_MEMORYCHECK_COMMAND to the full path to valgrind, and ctest will generate the DynamicAnalysis.xml file for you from the valgrind output when you call ctest_memcheck.
The best way to understand the possible values that can appear in the DynamicAnalysis.xml file is to analyze the source code of CTest.
The file CMake/Source/CTest/cmCTestMemCheckHandler.cxx has the list of defect types in a variable named "cmCTestMemCheckResultLongStrings". Search through that file for references to that variable to see what the possible values are and how they are used to generate "<Defect/>" xml elements.
EDIT (for additional information):
You can also easily see what XML elements CDash is expecting by inspecting its source code. Specifically, the file "CDash/xml_handlers/dynamic_analysis_handler.php".
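For a tool such as CppCheck, the per-type counts can be tallied from the tool's own report before any XML is written for CDash. Here is a minimal sketch in Python, assuming cppcheck was run with --xml and produced its version 2 report, in which each finding is an error element with id and severity attributes. The sample report is made up for illustration, and count_defects is a hypothetical name. How these counts map onto DynamicAnalysis.xml elements is deliberately left out, since that depends on what the CTest and CDash sources mentioned above actually accept.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up report in the shape of cppcheck's --xml (version 2) output.
SAMPLE_REPORT = """<results version="2">
  <errors>
    <error id="memleak" severity="error" msg="Memory leak: p">
      <location file="a.cpp" line="10"/>
    </error>
    <error id="memleak" severity="error" msg="Memory leak: q">
      <location file="b.cpp" line="4"/>
    </error>
    <error id="unusedVariable" severity="style" msg="Unused variable: i">
      <location file="a.cpp" line="3"/>
    </error>
  </errors>
</results>
"""


def count_defects(report_xml):
    """Count findings per cppcheck error id."""
    root = ET.fromstring(report_xml)
    return Counter(err.get("id") for err in root.iter("error"))


if __name__ == "__main__":
    for defect_type, count in sorted(count_defects(SAMPLE_REPORT).items()):
        print(defect_type, count)
```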
From what I've learned so far, for a tool that runs on the tests defined in the CMake script, Dynamic Analysis is the right fit.
For tools that run on the entire program, a custom Build.xml is the thing you need.
I found out that I can submit those files with the ctest_submit command by using the FILES parameter.
I also found out that you can add custom "build names" to the side of Continuous, Nightly, and others.
And that you can set the builds from certain machines to be automatically transferred under these.
The custom labels under DynamicAnalysis did come from somewhere in CDash; I can't remember where anymore.