Building release version of a project with Android NDK r6 - optimization

I am compiling the helloworld example from Android NDK r6b using Cygwin on Windows Vista. I have noticed that the following code takes between 14 and 20 milliseconds on my Android phone (it has an 800 MHz Qualcomm MSM7227T chipset, with hardware floating point support):
float *v1, *v2, *v3, tot;
int num = 50000;
v1 = new float[num];
v2 = new float[num];
v3 = new float[num];
// Initialize vectors. RandomEqualREAL() returns a floating point number in a specified range.
for ( int i = 0; i < num; i++ )
{
v1[i] = RandomEqualREAL( -10.0f, 10.0f );
if (v1[i] == 0.0f) v1[i] = 1.0f;
v2[i] = RandomEqualREAL( -10.0f, 10.0f );
if (v2[i] == 0.0f) v2[i] = 1.0f;
}
clock_t start = clock() / (CLOCKS_PER_SEC / 1000);
tot = 0.0f;
for ( int k = 0; k < 1000; k++)
{
for ( int i = 0; i < num; i++ )
{
v3[i] = v1[i] / (v2[i]);
tot += v3[i];
}
}
clock_t end = clock() / (CLOCKS_PER_SEC / 1000);
printf("tot %f time %f\n", tot, (end-start)/1000.0f);
On my 2.4 GHz notebook it takes 0.45 milliseconds (timings taken while the system was busy running other programs such as Chrome, two or three IDEs, open PDFs, etc.). I wonder whether the helloworld application is built as a release version. I noticed that g++ gets called with
-msoft-float.
Does this mean that it is using floating point emulation?
Which command line options do I need to use in order to build an optimized version of the program, and how do I specify them?
This is how g++ gets called:
/cygdrive/d/android/android-ndk-r6b/toolchains/arm-linux-androideabi-4.4.3/prebu
ilt/windows/bin/arm-linux-androideabi-g++ -MMD -MP -MF D:/android/workspace/hell
oworld/obj/local/armeabi/objs/ndkfoo/ndkfoo.o.d.org -fpic -ffunction-sections -f
unwind-tables -fstack-protector -D__ARM_ARCH_5__ -D__ARM_ARCH_5T__ -D__ARM_ARCH_
5E__ -D__ARM_ARCH_5TE__ -Wno-psabi -march=armv5te -mtune=xscale -msoft-float -f
no-exceptions -fno-rtti -mthumb -Os -fomit-frame-pointer -fno-strict-aliasing -f
inline-limit=64 -ID:/android/workspace/helloworld/jni/boost -ID:/android/workspa
ce/helloworld/jni/../../mylib/jni -ID:/android/android-ndk-r6b/sources/cxx-stl/g
nu-libstdc++/include -ID:/android/android-ndk-r6b/sources/cxx-stl/gnu-libstdc++/
libs/armeabi/include -ID:/android/workspace/helloworld/jni -DANDROID -Wa,--noex
ecstack -fexceptions -frtti -O2 -DNDEBUG -g -ID:/android/android-ndk-r6b/plat
forms/android-9/arch-arm/usr/include -c D:/android/workspace/helloworld/jni/ndk
foo.cpp -o D:/android/workspace/helloworld/obj/local/armeabi/objs/ndkfoo/ndkfoo.
o && ( if [ -f "D:/android/workspace/helloworld/obj/local/armeabi/objs/ndkfoo/nd
kfoo.o.d.org" ]; then awk -f /cygdrive/d/android/android-ndk-r6b/build/awk/conve
rt-deps-to-cygwin.awk D:/android/workspace/helloworld/obj/local/armeabi/objs/ndk
foo/ndkfoo.o.d.org > D:/android/workspace/helloworld/obj/local/armeabi/objs/ndkf
oo/ndkfoo.o.d && rm -f D:/android/workspace/helloworld/obj/local/armeabi/objs/nd
kfoo/ndkfoo.o.d.org; fi )
Prebuilt : libstdc++.a <= <NDK>/sources/cxx-stl/gnu-libstdc++/libs/armeabi
/
cp -f /cygdrive/d/android/android-ndk-r6b/sources/cxx-stl/gnu-libstdc++/libs/arm
eabi/libstdc++.a /cygdrive/d/android/workspace/helloworld/obj/local/armeabi/libs
tdc++.a
SharedLibrary : libndkfoo.so
/cygdrive/d/android/android-ndk-r6b/toolchains/arm-linux-androideabi-4.4.3/prebu
ilt/windows/bin/arm-linux-androideabi-g++ -Wl,-soname,libndkfoo.so -shared --sys
root=D:/android/android-ndk-r6b/platforms/android-9/arch-arm D:/android/workspac
e/helloworld/obj/local/armeabi/objs/ndkfoo/ndkfoo.o D:/android/workspace/hellow
orld/obj/local/armeabi/libstdc++.a D:/android/android-ndk-r6b/toolchains/arm-lin
ux-androideabi-4.4.3/prebuilt/windows/bin/../lib/gcc/arm-linux-androideabi/4.4.3
/libgcc.a -Wl,--no-undefined -Wl,-z,noexecstack -lc -lm -lsupc++ -o D:/androi
d/workspace/helloworld/obj/local/armeabi/libndkfoo.so
Install : libndkfoo.so => libs/armeabi/libndkfoo.so
mkdir -p /cygdrive/d/android/workspace/helloworld/libs/armeabi
install -p /cygdrive/d/android/workspace/helloworld/obj/local/armeabi/libndkfoo.
so /cygdrive/d/android/workspace/helloworld/libs/armeabi/libndkfoo.so
/cygdrive/d/android/android-ndk-r6b/toolchains/arm-linux-androideabi-4.4.3/prebu
ilt/windows/bin/arm-linux-androideabi-strip --strip-unneeded D:/android/workspac
e/helloworld/libs/armeabi/libndkfoo.so
Edit.
I have run the command adb shell cat /proc/cpuinfo. This is the result:
Processor : ARMv6-compatible processor rev 5 (v6l)
BogoMIPS : 532.48
Features : swp half thumb fastmult vfp edsp java
CPU implementer : 0x41
CPU architecture: 6TEJ
CPU variant : 0x1
CPU part : 0xb36
CPU revision : 5
Hardware : GELATO Global board (LGE LGP690)
Revision : 0000
Serial : 0000000000000000
I don't understand what swp, half, thumb, fastmult, vfp, edsp and java mean, but I don't like that 'vfp'! Does it mean virtual floating point? That processor should have a floating point unit...

You are right, -msoft-float is a synonym for -mfloat-abi=soft (see list of gcc ARM options) and means floating point emulation.
For hardware floating point the following flags can be used:
LOCAL_CFLAGS += -march=armv6 -marm -mfloat-abi=softfp -mfpu=vfp
To see which floating point unit your device really has, check the output of the adb shell cat /proc/cpuinfo command. The units form a compatibility chain: vfp < vfpv3-d16 < vfpv3 < neon, so if you have vfpv3, then vfp also works for you.
Also you might want to add the line
APP_OPTIM := release
into your Application.mk file. This setting overrides automatic 'debug' mode for native part of application if the manifest sets android:debuggable to 'true'
But even with all these settings NDK will put -march=armv5te -mtune=xscale -msoft-float into the beginning of compiler options. And this behavior can not be changed without modifications in NDK sources (these options are hardcoded in file $NDKROOT\toolchains\arm-linux-androideabi-4.4.3\setup.mk).

How to mix compile HIP and Fortran using CMake

I don't know how to compile a mixed HIP and Fortran project using CMake. The following is a demo.
I have 3 files:
fcode.f90
SUBROUTINE fcode()
implicit double precision (a-h, o-z)
parameter (M=64)
dimension x(M),y(M),z(M)
do k=1, M
x(k) = 1.0
y(k) = 2.0
z(k) = 0.0
end do
call hipcode(x,y,z,M)
do k = 1, M
if ( z(k) .ne. 3.0 ) then
write(6,*) 'u fail !'
return
endif
end do
write(6,*)' PASSED !'
return
end
hipcode.cpp
#include <cassert>
#include <hip/hip_runtime.h>
#define HIP_ASSERT(status) assert((status) == hipSuccess)
__global__ void add(double *x, double *y, double *z, const unsigned int M) {
z[threadIdx.x] = x[threadIdx.x] + y[threadIdx.x];
}
extern "C" void hipcode_(double *h_x, double *h_y, double *h_z, int &M) {
HIP_ASSERT(hipSetDevice(0));
double *d_x, *d_y, *d_z;
HIP_ASSERT(hipMalloc((void **)&d_x, M * sizeof(double)));
HIP_ASSERT(hipMalloc((void **)&d_y, M * sizeof(double)));
HIP_ASSERT(hipMalloc((void **)&d_z, M * sizeof(double)));
HIP_ASSERT(hipMemcpy(d_x, h_x, M * sizeof(double), hipMemcpyHostToDevice));
HIP_ASSERT(hipMemcpy(d_y, h_y, M * sizeof(double), hipMemcpyHostToDevice));
HIP_ASSERT(hipMemcpy(d_z, h_z, M * sizeof(double), hipMemcpyHostToDevice));
hipLaunchKernelGGL(add, 1, 64, 0, 0, d_x, d_y, d_z, M);
HIP_ASSERT(hipMemcpy(h_z, d_z, M * sizeof(double), hipMemcpyDeviceToHost));
hipFree(d_x);
hipFree(d_y);
hipFree(d_z);
}
main.f90
call fcode()
stop
end
I wrote a Makefile to compile it, and it works, but I don't know how to do the same with CMake.
OBJS=main.o fcode.o hipcode.o
FC=gfortran
HIPCC=hipcc
FCFLAGS=-c
HIPCCFLAGS=-c
LDFLAGS=-lgfortran
all :
$(HIPCC) $(HIPCCFLAGS) hipcode.cpp
$(FC) $(FCFLAGS) fcode.f90
$(FC) $(FCFLAGS) main.f90
$(HIPCC) $(OBJS) $(LDFLAGS) -o test
Here is my CMakeLists.txt
cmake_minimum_required(VERSION 3.15)
project(test LANGUAGES Fortran CXX)
# source file: hipcode.cpp fcode.f90 main.f90
# target:
# hipcc -c hipcode.cpp
# gfortran -c fcode.f90
# gfortran -c main.f90
# hipcc hipcode.o fcode.o main.o -lgfortran -o test
set(sources_list hipcode.cpp)
set(raw_sources_list_f90 fcode.f90)
# find hip
find_package(HIP QUIET)
set(CMAKE_HIP_FLAGS "${CMAKE_CXX_FLAGS} -D__HIP_PLATFORM_HCC__ --offload-arch=gfx906")
set(HIP_CLANG_FLAGS "${HIP_CLANG_FLAG} --hip-link")
set_source_files_properties(${sources_list} PROPERTIES HIP_SOURCE_PROPERTY_FORMAT 1)
set(MY_SOURCE_FILES ${sources_list})
set(MY_TARGET_NAME hipcode)
set(MY_HIPCC_OPTIONS "--hip-link")
set(HIP_TARGET_LINK_LIB "rocm/hip/lib/libamdhip64.so" )
set(CMAKE_CXX_STANDARD 14)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -x hip -fgpu-rdc --hip-link -std=c++14 -g")
set(CMAKE_CXX_COMPILER "hipcc")
set(CMAKE_HIP_FLAGS "${CMAKE_HIP_FLAGS} --hip-link")
set(CMAKE_HIP_LINKER_WRAPPER_FLAG "--hip-link")
set(CMAKE_CXX_LINK_FLAGS " -fgpu-rdc --hip-link -std=c++14 ")
set(HIP_HIPCC_CMAKE_LINKER_HELPER "hipcc")
set(HIP_CLANG_PATH " ")
set(HIP_CLANG_PARALLEL_BUILD_LINK_OPTIONS " ")
add_library(${MY_TARGET_NAME} ${MY_SOURCE_FILES})
target_link_libraries(${MY_TARGET_NAME} ${HIP_TARGET_LINK_LIB} )
add_library(fcodef90 STATIC ${raw_sources_list_f90})
target_link_libraries(fcodef90 hipcode)
I use CXX=hipcc cmake .. && make -j to build the demo, and the build passes.
But then I get an error: "undefined reference to `hipcode_'". How should I modify the CMakeLists.txt?
First of all, thank you again. After using nm, I realized that I probably needed to add the separate-compilation options, and then I found the answer. Thank you very much!
The final compile commands are:
gfortran -c main.f90
hipcc -fgpu-rdc --hip-link main.o libfcode.a libhipcode.a -lgfortran

Tensorflow XLA AOT: Eigen related Error Building Project

I'm currently trying to work through the TensorFlow XLA ahead-of-time compilation workflow for the first time, and I've hit a problem while building the final executable, which includes the AOT-compiled object.
I've used the tutorial here to generate the test_graph_tfgather.pb and test_graph_tfgather.config.pbtxt files. Then I've used the tfcompile tool directly to produce MyClass.o and MyClass.h. So far so good.
I'm now building a simple makefile project which includes this compiled model, but I'm getting some errors related to Eigen. Could this be due to a different version of eigen3 being installed on my computer? I've also had to comment out the Eigen::ThreadPool lines because of Eigen errors, so a version mismatch may be the problem. Has anyone seen this problem before, or does anyone have any ideas how to get this working?
Thanks.
The build errors:
g++ -c -std=c++11 -I . -I /usr/include/eigen3 -I /home/user/tensorflow_xla/tensorflow -I /usr/include main.cpp
In file included from /home/user/tensorflow_xla/tensorflow/tensorflow/compiler/xla/types.h:22:0,
from /home/user/tensorflow_xla/tensorflow/tensorflow/compiler/xla/executable_run_options.h:20,
from /home/user/tensorflow_xla/tensorflow/tensorflow/compiler/tf2xla/xla_compiled_cpu_function.h:22,
from MyClass.h:14,
from main.cpp:6:
/home/user/tensorflow_xla/tensorflow/tensorflow/core/framework/numeric_types.h: In static member function ‘static tensorflow::bfloat16 Eigen::NumTraits<tensorflow::bfloat16>::infinity()’:
/home/user/tensorflow_xla/tensorflow/tensorflow/core/framework/numeric_types.h:79:28: error: ‘infinity’ is not a member of ‘Eigen::NumTraits<float>’
return FloatToBFloat16(NumTraits<float>::infinity());
^
/home/user/tensorflow_xla/tensorflow/tensorflow/core/framework/numeric_types.h: In static member function ‘static tensorflow::bfloat16 Eigen::NumTraits<tensorflow::bfloat16>::quiet_NaN()’:
/home/user/tensorflow_xla/tensorflow/tensorflow/core/framework/numeric_types.h:83:28: error: ‘quiet_NaN’ is not a member of ‘Eigen::NumTraits<float>’
return FloatToBFloat16(NumTraits<float>::quiet_NaN());
^
/home/user/tensorflow_xla/tensorflow/tensorflow/core/framework/numeric_types.h: At global scope:
/home/user/tensorflow_xla/tensorflow/tensorflow/core/framework/numeric_types.h:95:34: error: ‘log’ is not a template function
const tensorflow::bfloat16& x) {
^
/home/user/tensorflow_xla/tensorflow/tensorflow/core/framework/numeric_types.h:101:34: error: ‘exp’ is not a template function
const tensorflow::bfloat16& x) {
^
/home/user/tensorflow_xla/tensorflow/tensorflow/core/framework/numeric_types.h:107:34: error: ‘abs’ is not a template function
const tensorflow::bfloat16& x) {
^
Makefile:10: recipe for target 'main.o' failed
main.cpp source:
#define EIGEN_USE_THREADS
#define EIGEN_USE_CUSTOM_THREAD_POOL
#include <algorithm> // std::copy
#include <iostream>
#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
#include "MyClass.h" // generated
int main(int argc, char** argv) {
//Eigen::ThreadPool tp(2); // Size the thread pool as appropriate.
//Eigen::ThreadPoolDevice device(&tp, tp.NumThreads());
MyClass matmul;
//matmul.set_thread_pool(&device);
// Set up args and run the computation.
const float args[12] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
std::copy(args + 0, args + 6, matmul.arg0_data());
std::copy(args + 6, args + 12, matmul.arg1_data());
matmul.Run();
// Check result
if (matmul.result0(0, 0) == 58) {
std::cout << "Success" << std::endl;
} else {
std::cout << "Failed. Expected value 58 at 0,0. Got:"
<< matmul.result0(0, 0) << std::endl;
}
return 0;
}
Makefile
EIGEN_INC=-I /usr/include/eigen3
TF_INC=-I /home/user/tensorflow_xla/tensorflow
CPPFLAGS=-c -std=c++11
xla_hw: main.o MyClass.o
g++ -o xla_hw main.o MyClass.o
main.o: main.cpp
g++ $(CPPFLAGS) -I . $(TF_INC) $(EIGEN_INC) -I /usr/include main.cpp
I've solved this problem now. It turns out that a specific version of eigen3 is included with TensorFlow, and you need to use this version for it to work. Once TensorFlow has been built, the correct version of eigen3 is located at <tensorflow path>/bazel-tensorflow/external/eigen_archive.
Below is the working makefile which includes the correct Eigen path as well as the libraries needed to link the project.
TF_INC=-I /home/user/tensorflow_xla/tensorflow/bazel-tensorflow/external/eigen_archive -I /home/user/tensorflow_xla/tensorflow
TF_LIBS=-L/home/user/tensorflow_xla/tensorflow/bazel-bin/tensorflow/compiler/tf2xla/ -lxla_compiled_cpu_function -L/home/user/tensorflow_xla/tensorflow/bazel-bin/tensorflow/compiler/aot -lruntime
CPPFLAGS=-c -std=c++11
xla_hw: main.o MyClass.o
g++ -o xla_hw main.o MyClass.o $(TF_LIBS)
main.o: main.cpp
g++ $(CPPFLAGS) -I . $(TF_INC) -I /usr/include main.cpp

Rust optimizing out loops?

I was doing some very simple benchmarks to compare performance of C and Rust. I used a function adding integers 1 + 2 + ... + n (something that I could verify by a computation by hand), where n = 10^10.
The code in Rust looks like this:
fn main() {
let limit: u64 = 10000000000;
let mut buf: u64 = 0;
for u64::range(1, limit) |i| {
buf = buf + i;
}
io::println(buf.to_str());
}
The C code is as follows:
#include <stdio.h>
int main()
{
unsigned long long buf = 0;
for(unsigned long long i = 0; i < 10000000000; ++i) {
buf = buf + i;
}
printf("%llu\n", buf);
return 0;
}
I compiled and ran them:
$ rustc sum.rs -o sum_rust
$ time ./sum_rust
13106511847580896768
real 6m43.122s
user 6m42.597s
sys 0m0.076s
$ gcc -Wall -std=c99 sum.c -o sum_c
$ time ./sum_c
13106511847580896768
real 1m3.296s
user 1m3.172s
sys 0m0.024s
Then I tried with optimizations flags on, again both C and Rust:
$ rustc sum.rs -o sum_rust -O
$ time ./sum_rust
13106511847580896768
real 0m0.018s
user 0m0.004s
sys 0m0.012s
$ gcc -Wall -std=c99 sum.c -o sum_c -O9
$ time ./sum_c
13106511847580896768
real 0m16.779s
user 0m16.725s
sys 0m0.008s
These results surprised me. I did expect the optimizations to have some effect, but the optimized Rust version is 100000 times faster :).
I tried changing n (the only limitation was u64, the run time was still virtually zero), and even tried a different problem (1^5 + 2^5 + 3^5 + ... + n^5), with similar results: executables compiled with rustc -O are several orders of magnitude faster than without the flag, and are also many times faster than the same algorithm compiled with gcc -O9.
So my question is: what's going on? :) I could understand a compiler optimizing 1 + 2 + .. + n = (n*n + n)/2, but I can't imagine that any compiler could derive a formula for 1^5 + 2^5 + 3^5 + .. + n^5. On the other hand, as far as I can see, the result must've been computed somehow (and it seems to be correct).
Oh, and:
$ gcc --version
gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
$ rustc --version
rustc 0.6 (dba9337 2013-05-10 05:52:48 -0700)
host: i686-unknown-linux-gnu
Yes, compilers do use the 1 + ... + n = n*(n+1)/2 optimisation to remove the loop, and there are similar tricks for any power of the summation variable, e.g. the sums of k^1 are the triangular numbers, the sums of k^2 are the pyramidal numbers, the sums of k^3 are the squared triangular numbers, etc. In general, there is even a formula to calculate ∑ k^p for any p.
You can use a more complicated expression, so that the compiler doesn't have any tricks to remove the loop. e.g.
fn main() {
let limit: u64 = 1000000000;
let mut buf: u64 = 0;
for u64::range(1, limit) |i| {
buf += i + i ^ (i*i);
}
io::println(buf.to_str());
}
and
#include <stdio.h>
int main()
{
unsigned long long buf = 0;
for(unsigned long long i = 0; i < 1000000000; ++i) {
buf += i + i ^ (i * i);
}
printf("%llu\n", buf);
return 0;
}
which gives me
real 0m0.700s
user 0m0.692s
sys 0m0.004s
and
real 0m0.698s
user 0m0.692s
sys 0m0.000s
respectively (with -O for both compilers).

Undefined reference to `oslIsWlanPowerOn'

I am developing a PSP homebrew application and I am using the makefile from the example, but it won't link because the stupid (excuse my French) linker says that oslIsWlanPowerOn is undefined. I know I am linking the right library, and I am following an example, so it should compile. I know most Stack Overflow users don't use OSLib or do much PSP programming, but any help would be appreciated. I have also tried reordering the libs, but I still get the same linker errors. Here is the code:
Makefile
TARGET = main
OBJS = main.o
CFLAGS = -O2 -g -G0 -Wall
CXXFLAGS = $(CFLAGS) -fno-exceptions -fno-rtti
ASFLAGS = $(CFLAGS)
LIBDIR =
LIBS= -lpspwlan -losl -lpng -lz -lpspnet \
-lpsphprm -lpspsdk -lpspctrl -lpsprtc -lpsppower -lpspgu -lpspgum -lpspaudiolib -lpspaudio \
-lpspnet_adhocmatching -lpspnet_adhoc -lpspnet_adhocctl -lm -ljpeg
LDFLAGS =
EXTRA_TARGETS = EBOOT.PBP
PSP_EBOOT_TITLE = PSP Chat
#PSP_EBOOT_ICON = ICON0.PNG
PSPSDK=$(shell psp-config --pspsdk-path)
include $(PSPSDK)/lib/build.mak
Error details:
1>------ Build started: Project: PSP Chat, Configuration: Debug Win32 ------
1> psp-gcc -I. -IC:/pspsdk/psp/sdk/include -O2 -g -G0 -Wall -D_PSP_FW_VERSION=150 -L. -LC:/pspsdk/psp/sdk/lib main.o -lpspwlan -losl -lpng -lz -lpspnet -lpsphprm -lpspsdk -lpspctrl -lpsprtc -lpsppower -lpspgu -lpspgum -lpspaudiolib -lpspaudio -lpspnet_adhocmatching -lpspnet_adhoc -lpspnet_adhocctl -lm -ljpeg -lpspdebug -lpspdisplay -lpspge -lpspctrl -lpspsdk -lc -lpspnet -lpspnet_inet -lpspnet_apctl -lpspnet_resolver -lpsputility -lpspuser -lpspkernel -o main.elf
1> main.o: In function `main':
1> c:\Users\Danny\documents\visual studio 2010\Projects\PSP Chat\PSP Chat/main.cpp (24) : undefined reference to `oslIsWlanPowerOn'
1> c:\Users\Danny\documents\visual studio 2010\Projects\PSP Chat\PSP Chat/main.cpp (52) : undefined reference to `oslIsWlanPowerOn'
1> C:\pspsdk\bin\make: *** [main.elf] Error 1
========== Build: 1 succeeded, 0 failed, 0 up-to-date, 0 skipped ==========
main.cpp
#include <pspkernel.h>
#include <oslib/oslib.h>
PSP_MODULE_INFO("PSP Chat", 0, 1, 0);
OSL_FONT* font;
int main()
{
char* screename = (char*)malloc(100);
int skip = 0;
printf("Initializing OSL...");
oslInit(0);
printf("Loading Font...");
oslIntraFontInit(INTRAFONT_CACHE_MED);
font = oslLoadFontFile("flash0:/font/ltn0.pgf");
printf("Configuring Font Style...");
oslIntraFontSetStyle(font, 1.0, RGBA(0, 0, 255, 255), RGBA(0, 0, 0, 0), INTRAFONT_ALIGN_LEFT);
printf("Setting Font...");
oslSetFont(font);
while(!osl_quit)
{
if (!skip)
{
oslStartDrawing();
if (oslIsWlanPowerOn())
{
oslDrawString(10, 10, "Please Enter Screename By Pressing X (Client)...");
oslDrawString(10, 25, "Please Press O To Act As Server...");
if (oslOskIsActive()){
oslDrawOsk();
if (oslGetOskStatus() == PSP_UTILITY_DIALOG_NONE)
{
if (oslOskGetResult() == OSL_OSK_CANCEL)
{
screename = (char*)"Client";
}
else
{
oslOskGetText(screename);
}
oslEndOsk();
}
}
else
{
oslDrawString(10, 10, "Please turn on the wlan switch!");
}
oslEndDrawing();
}
oslEndFrame();
skip = oslSyncFrame();
oslReadKeys();
if (osl_keys->released.cross && oslIsWlanPowerOn())
{
oslInitOsk((char*)"Please enter screename!", (char*)"Client", 99, 1, -1);
}
}
}
sceKernelExitGame();
return 0;
}
There was a problem with the installation of the SDK, so I reinstalled it. Voila, it worked.
Thanks to everyone who tried to diagnose the problem.

Finding dylib version using dlopen

Is there a way to find the version of a dylib using its path? I am looking for something that accepts the same arguments as dlopen. I have looked at NSVersionOfRunTimeLibrary, but from my reading of the documentation it looks like it gets the version of the current dylib, not the one specified in the path.
Thank you
Run otool -L on it, and it will show its actual version. I chose libSystem.B as it has different versions in the 10.4 and 10.5 SDKs:
$ otool -L /Developer/SDKs/MacOSX10.4u.sdk/usr/lib/libSystem.B.dylib
/Developer/SDKs/MacOSX10.4u.sdk/usr/lib/libSystem.B.dylib:
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.3.11)
/usr/lib/system/libmathCommon.A.dylib (compatibility version 1.0.0, current version 220.0.0)
$ otool -L /Developer/SDKs/MacOSX10.5.sdk/usr/lib/libSystem.B.dylib
/Developer/SDKs/MacOSX10.5.sdk/usr/lib/libSystem.B.dylib:
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 111.1.4)
/usr/lib/system/libmathCommon.A.dylib (compatibility version 1.0.0, current version 292.4.0)
(note that the first one has version 88.3.11, while the second has 111.1.4). This example also shows that not all libraries are symbolic links to files with the version number in them:
$ ll /Developer/SDKs/MacOSX10.*.sdk/usr/lib/libSystem.B.dylib
-rwxr-xr-x 1 root wheel 749K May 15 2009 /Developer/SDKs/MacOSX10.4u.sdk/usr/lib/libSystem.B.dylib
-rwxr-xr-x 1 root wheel 670K May 15 2009 /Developer/SDKs/MacOSX10.5.sdk/usr/lib/libSystem.B.dylib
-rwxr-xr-x 1 root wheel 901K Sep 25 00:21 /Developer/SDKs/MacOSX10.6.sdk/usr/lib/libSystem.B.dylib
Here, the files don't have the version number in their name.
EDIT: a second solution is to use NSVersionOfRunTimeLibrary in a test program, in which you force load the library you want to check. Create a program libversion from the following C source:
#include <stdio.h>
#include <mach-o/dyld.h>
int main (int argc, char **argv)
{
printf ("%x\n", NSVersionOfRunTimeLibrary (argv[1]));
return 0;
}
Then, you call it like this:
$ DYLD_INSERT_LIBRARIES=/usr/lib/libpam.2.dylib ./a.out libpam.2.dylib
30000
(here, the version number is printed as hexadecimal, but you can adapt to your needs.)
You can check the source code of NSVersionOfRunTimeLibrary here:
http://www.opensource.apple.com/source/dyld/dyld-132.13/src/dyldAPIsInLibSystem.cpp
Based on that you can create your own version, which replaces if(names_match(install_name, libraryName) == TRUE) with if(strcmp(_dyld_get_image_name(i), libraryName) == 0).
With this change the function expects the full path, whereas the original expects the bare library name. It still only searches the dylibs that are already loaded.
#include <mach-o/dyld.h>
#include <mach-o/loader.h>
#include <string.h>
int32_t
library_version(const char* libraryName)
{
unsigned long i, j, n;
struct load_command *load_commands, *lc;
struct dylib_command *dl;
const struct mach_header *mh;
n = _dyld_image_count();
for(i = 0; i < n; i++){
mh = _dyld_get_image_header(i);
if(mh->filetype != MH_DYLIB)
continue;
load_commands = (struct load_command *)
#if __LP64__
((char *)mh + sizeof(struct mach_header_64));
#else
((char *)mh + sizeof(struct mach_header));
#endif
lc = load_commands;
for(j = 0; j < mh->ncmds; j++){
if(lc->cmd == LC_ID_DYLIB){
dl = (struct dylib_command *)lc;
if(strcmp(_dyld_get_image_name(i), libraryName) == 0)
return(dl->dylib.current_version);
}
lc = (struct load_command *)((char *)lc + lc->cmdsize);
}
}
return(-1);
}