Dynamic Bytecode Instrumentation fails without any error - instrumentation
Objective
I'm doing dynamic bytecode instrumentation using a JVMTI agent. I have to instrument the methods that are "hot", that is, the methods that trigger the JIT compiler. To do so, I listen for the CompiledMethodLoad event and, inside its callback, call RetransformClasses. This in turn triggers ClassFileLoadHook for the class containing the "hot" method, and the actual instrumentation begins.
Problem Premises
Currently I'm instrumenting my class to spawn some threads. I also listen for thread starts and print them from within my agent. With a plain ClassFileLoadHook at class load time (without RetransformClasses), my instrumentation works perfectly and spawns new threads. I get the following output when ClassFileLoadHook instruments at class load time:
Running Thread: Signal Dispatcher, Priority: 9, context class loader:Not Null
Running Thread: main, Priority: 5, context class loader:Not Null
Running Thread: Thread-0, Priority: 5, context class loader:Not Null
Running Thread: Thread-1, Priority: 5, context class loader:Not Null
Running Thread: Thread-2, Priority: 5, context class loader:Not Null
Running Thread: Thread-3, Priority: 5, context class loader:Not Null
Running Thread: Thread-4, Priority: 5, context class loader:Not Null
Running Thread: Thread-6, Priority: 5, context class loader:Not Null
Running Thread: Thread-5, Priority: 5, context class loader:Not Null
Running Thread: Thread-7, Priority: 5, context class loader:Not Null
Running Thread: DestroyJavaVM, Priority: 5, context class loader:: NULL
When I instrument the class by invoking RetransformClasses, which then triggers ClassFileLoadHook, everything runs without errors, but no threads are spawned, so no effective instrumentation takes place. The VM also takes a long time just to execute the original code.
I double-checked both instrumentation paths using -XX:+TraceClassLoading. All the retransformed classes are loaded in both cases. Even the class I generate at runtime gets loaded, but no instrumentation happens. Below is the class loading trace:
[Loaded Test from __VM_RedefineClasses__]
[Loaded Test_Worker_main_0 from file:/home/saqib/workspace/test/bin]
The second class is generated at runtime and is loaded into the VM, but I don't get any thread spawning.
Questions
Given my understanding of the problem (there is a high probability that I'm wrong), why does ClassFileLoadHook transform the class successfully at load time, but not behave correctly when it is triggered after JIT compilation?
Just calling RetransformClasses, with an empty ClassFileLoadHook callback, also takes a lot of time without reporting any error. What could be taking the time?
Agent Code
Compiled Load Event Call Back
    static int x = 1;

    void JNICALL
    compiled_method_load(jvmtiEnv *jvmti, jmethodID method, jint code_size,
            const void* code_addr, jint map_length, const jvmtiAddrLocationMap* map,
            const void* compile_info) {
        jvmtiError err;
        jclass klass;
        char* name = NULL;
        char* signature = NULL;
        char* generic_ptr = NULL;

        err = (*jvmti)->RawMonitorEnter(jvmti, lock);
        check_jvmti_error(jvmti, err, "raw monitor enter");

        err = (*jvmti)->GetMethodName(jvmti, method, &name, &signature,
                &generic_ptr);
        check_jvmti_error(jvmti, err, "Get Method Name");

        printf("\nCompiled method load event\n");
        printf("Method name %s %s %s\n\n", name, signature,
                generic_ptr == NULL ? "" : generic_ptr);

        if (strstr(name, "main") != NULL && x == 1) {
            x++;
            err = (*jvmti)->GetMethodDeclaringClass(jvmti, method, &klass);
            check_jvmti_error(jvmti, err, "Get Declaring Class");
            err = (*jvmti)->RetransformClasses(jvmti, 1, &klass);
            check_jvmti_error(jvmti, err, "Retransform class");
        }

        if (name != NULL) {
            err = (*jvmti)->Deallocate(jvmti, (unsigned char*) name);
            check_jvmti_error(jvmti, err, "deallocate name");
        }
        if (signature != NULL) {
            err = (*jvmti)->Deallocate(jvmti, (unsigned char*) signature);
            check_jvmti_error(jvmti, err, "deallocate signature");
        }
        if (generic_ptr != NULL) {
            err = (*jvmti)->Deallocate(jvmti, (unsigned char*) generic_ptr);
            check_jvmti_error(jvmti, err, "deallocate generic_ptr");
        }

        err = (*jvmti)->RawMonitorExit(jvmti, lock);
        check_jvmti_error(jvmti, err, "raw monitor exit");
    }
Class File Load Hook
    void JNICALL
    Class_File_Load_Hook(jvmtiEnv *jvmti_env, JNIEnv* jni_env,
            jclass class_being_redefined, jobject loader, const char* name,
            jobject protection_domain, jint class_data_len,
            const unsigned char* class_data, jint* new_class_data_len,
            unsigned char** new_class_data) {
        jvmtiError err;
        unsigned char* jvmti_space = NULL;

        /* name may be NULL for classes defined without a name, so check it first. */
        if (name != NULL && strstr(name, "Test") != NULL && x == 2) {
            char* args = "op";
            /* javab_main leaves the instrumented bytes in new_class_ptr / global_pos. */
            javab_main(2, args, class_data, class_data_len);

            /* The replacement class must live in memory allocated through JVMTI. */
            err = (*jvmti_env)->Allocate(jvmti_env, (jlong)global_pos, &jvmti_space);
            check_jvmti_error(jvmti_env, err, "Allocate new class Buffer.");
            (void)memcpy((void*)jvmti_space, (void*)new_class_ptr, (int)global_pos);
            *new_class_data_len = (jint)global_pos;
            *new_class_data = jvmti_space;
            if ( new_class_ptr != NULL ) {
                (void)free((void*)new_class_ptr);
            }
    #if DEBUG
            printf("Size of the class is: %d\n", class_data_len);
            /* Dump the instrumented bytes (jvmti_space), not the unsigned char** itself. */
            for (int i = 0; i + 3 < *new_class_data_len; i += 4) {
                if (i % 16 == 0)
                    printf("\n");
                printf("%02x%02x %02x%02x ", jvmti_space[i],
                        jvmti_space[i + 1], jvmti_space[i + 2],
                        jvmti_space[i + 3]);
            }
            printf("\n");
            system("javap -c -v Test_debug");
    #endif
            x++;
        }
    }
Here javab_main produces the instrumented class bytes, which are correct. The result is stored in a global buffer, new_class_ptr (with its length in global_pos), and is copied into the JVMTI-allocated buffer returned through new_class_data. To debug the instrumentation output, I also write the instrumented class to a file called Test_debug; running javap on it produces the desired result.
The complete agent file is given here:
Agent.c
Original Code:
    for (int i = 0; i < s; i++)
        for (int j = 0; j < s; j++) {
            c2[i][j] = 0;
            for (int k = 0; k < s; k++)
                c2[i][j] += a[i][k] * b[k][j];
        }
Instrumented Code: (Equivalent)
    Thread[] threads = new Thread[NTHREADS];
    for (int i = 0; i < NTHREADS ; i++) {
        final int lb = i * SIZE/NTHREADS;
        final int ub = (i+1) * SIZE/NTHREADS;
        threads[i] = new Thread(new Runnable() {
            public void run() {
                for (int i = lb; i < ub; i++)
                    for (int j = 0; j < SIZE; j++) {
                        c2[i][j] = 0;
                        for (int k = 0; k < SIZE; k++)
                            c2[i][j] += a[i][k] * b[k][j];
                    }
            }
        });
        threads[i].start();
    }
    // wait for completion
    for (int i = 0; i < NTHREADS; i++) {
        try {
            threads[i].join();
        } catch (InterruptedException ignore) {
        }
    }
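To check the source-level equivalent outside the agent, the fragment above can be sketched as a self-contained program. This is only an illustration: the class name ParallelMultiply, the method names, the matrix size, and the thread count passed in main are assumptions (the thread log earlier suggests eight workers, but the original does not state NTHREADS or SIZE).

```java
// Illustrative sketch only: ParallelMultiply and its method names are
// hypothetical and do not come from the original agent or test program.
class ParallelMultiply {

    // Plain triple loop, matching the original (uninstrumented) code.
    static int[][] sequential(int[][] a, int[][] b) {
        int s = a.length;
        int[][] c = new int[s][s];
        for (int i = 0; i < s; i++)
            for (int j = 0; j < s; j++)
                for (int k = 0; k < s; k++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    // Row-partitioned version: each thread computes rows [lb, ub).
    static int[][] parallel(final int[][] a, final int[][] b, int nThreads) {
        final int s = a.length;
        final int[][] c = new int[s][s];
        Thread[] threads = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            final int lb = t * s / nThreads;
            final int ub = (t + 1) * s / nThreads;
            threads[t] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int i = lb; i < ub; i++)
                        for (int j = 0; j < s; j++) {
                            c[i][j] = 0;
                            for (int k = 0; k < s; k++)
                                c[i][j] += a[i][k] * b[k][j];
                        }
                }
            });
            threads[t].start();
        }
        // Wait for completion; join() also makes the workers' writes visible.
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return c;
    }

    public static void main(String[] args) {
        int s = 64; // assumed matrix size
        int[][] a = new int[s][s];
        int[][] b = new int[s][s];
        for (int i = 0; i < s; i++)
            for (int j = 0; j < s; j++) {
                a[i][j] = i + j;
                b[i][j] = i - j;
            }
        boolean same = java.util.Arrays.deepEquals(
                sequential(a, b), parallel(a, b, 8)); // 8 threads is an assumption
        System.out.println("results match: " + same);
    }
}
```

Comparing the threaded result against the sequential one gives a quick sanity check that the row partitioning covers every row exactly once, including when the size is not a multiple of the thread count.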
Java Version
openjdk version "1.8.0-internal-debug"
OpenJDK Runtime Environment (build 1.8.0-internal-debug-saqib_2016_12_26_10_52-b00)
OpenJDK 64-Bit Server VM (build 25.71-b00-debug, mixed mode)
I'm constructing this answer mainly from the comments. Some riddles remain unsolved, but the main question has been resolved. Bytecode instrumentation does not fail in my case; it never actually takes effect. According to the theory,
Dynamic bytecode instrumentation of an executing function takes effect at the subsequent call of the function. If a function has only one invocation, it cannot be instrumented while executing using the current hotswap techniques in the JVM.
I was trying to instrument a class that had only one method, main, and I was trying to instrument it at runtime. This was the main problem: main was already executing when RetransformClasses ran, so its active frame kept running the original bytecode, and main is never invoked a second time. To check this argument, I moved my code into another method and called it from main in a loop. It got instrumented and everything worked. Nothing had to be changed in the agent.
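The restructuring described above might look roughly like the following. This is a minimal sketch, not the actual test program: the class name HotLoopTest, the method name multiply, the matrix size, and the number of rounds are all assumptions chosen for illustration.

```java
// Hypothetical test program: class and method names are illustrative only.
class HotLoopTest {

    // The work lives in its own method, so every call made after a
    // retransformation enters the newest version of the bytecode.
    static int[][] multiply(int[][] a, int[][] b, int s) {
        int[][] c = new int[s][s];
        for (int i = 0; i < s; i++)
            for (int j = 0; j < s; j++) {
                c[i][j] = 0;
                for (int k = 0; k < s; k++)
                    c[i][j] += a[i][k] * b[k][j];
            }
        return c;
    }

    public static void main(String[] args) {
        int s = 64; // assumed matrix size
        int[][] a = new int[s][s];
        int[][] b = new int[s][s];
        for (int i = 0; i < s; i++)
            for (int j = 0; j < s; j++) {
                a[i][j] = i + j;
                b[i][j] = i - j;
            }
        long checksum = 0;
        // Repeated calls make multiply hot and provide later invocations
        // on which a retransformed version can take effect.
        for (int round = 0; round < 200; round++) {
            int[][] c = multiply(a, b, s);
            checksum += c[round % s][round % s];
        }
        System.out.println("checksum = " + checksum);
    }
}
```

The point of the design is only that the hot code is entered more than once; how many rounds are needed before the JIT compiles the method depends on the VM and its flags.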