Shared executable memory - executable

I have this short snippet of C:
const char *name = "/asdf";
int desc = shm_open(name, O_RDWR | O_CREAT, 0777);
ftruncate(desc, 4096);
void *block = mmap(NULL, 4096, PROT_EXEC, MAP_SHARED, desc, 0);
shm_unlink(name);
It creates a shared memory object, writable, readable, and execuable by everyone; and then maps it in memory with executable permissions. However, the mmap call fails with EPERM for some reason.
Strace gives the following:
statfs("/dev/shm/", {f_type=0x1021994, f_bsize=4096, f_blocks=450722, f_bfree=450372, f_bavail=450372, f_files=450722, f_ffree=450713, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
futex(0x7fa7f4c8b330, FUTEX_WAKE_PRIVATE, 2147483647) = 0
open("/dev/shm/asdf", O_RDWR|O_CREAT|O_NOFOLLOW|O_CLOEXEC, 0777) = 3
fcntl(3, F_GETFD) = 0x1 (flags FD_CLOEXEC)
ftruncate(3, 4096) = 0
mmap(NULL, 4096, PROT_EXEC, MAP_SHARED, 3, 0) = -1 EPERM (Operation not permitted)
The problem seems to arise from the use of PROT_EXEC, as mmap succeeds with any combination of permissions as long as there's no PROT_EXEC.
mmaping without execute permissions and then trying to mprotect it fails as well.

Assuming you're running Linux here, the [no]exec'ness of the /dev/shm mount gets propagated to the shared memory areas created by shm_open in current Linux kernels. So your issue will happen if your /dev/shm is mounted with the noexec mount option (which some distributions (Debian?) might do by default).
I've modified your code to add a perror (and return 2) in case of mmap failure:
testbox $ mount|grep shm
tmpfs on /dev/shm type tmpfs (rw)
testbox $ ./a.out ; echo $?
0
testbox $ sudo mount -o remount,noexec /dev/shm
testbox $ mount|grep shm
tmpfs on /dev/shm type tmpfs (rw,noexec)
testbox $ ./a.out ; echo $?
mmap: Operation not permitted
2

Related

objcopy: bloated binary output file

While working on my bootloader project, I noticed that the resulting binary file is larger than the sum of the sizes of each section in the original ELF file.
After linking the bootloader image via ld, the ELF file is structured as follows:
$ readelf -S elfboot.elf
There are 10 section headers, starting at offset 0xa708:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00007e00 000e00 004bc7 00 AX 0 0 16
[ 2] .rodata PROGBITS 0000c9d0 0059d0 000638 00 A 0 0 32
[ 3] .initcalls PROGBITS 0000ec20 007c20 00000c 00 WA 0 0 4
[ 4] .exitcalls PROGBITS 0000ec30 007c30 00000c 00 WA 0 0 4
[ 5] .data PROGBITS 0000ec40 007c40 0001e0 00 WA 0 0 32
[ 6] .bss NOBITS 0000ee20 007e20 00025c 00 WA 0 0 32
[ 7] .symtab SYMTAB 00000000 007e20 0016e0 10 8 166 4
[ 8] .strtab STRTAB 00000000 009500 0011bb 00 0 0 1
[ 9] .shstrtab STRTAB 00000000 00a6bb 00004a 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
p (processor specific)
I noticed that there is a rather big gap (~7KB) between the sections .rodata and .initcalls. I also checked out the content of each section:
$ objdump -s elfboot.elf
[...]
Contents of section .rodata:
[...]
cfc0 23cf0000 80000000 2fcf0000 00010000 #......./.......
cfd0 3bcf0000 00020000 47cf0000 00040000 ;.......G.......
cfe0 54cf0000 00080000 61cf0000 00100000 T.......a.......
cff0 6ecf0000 00200000 7bcf0000 00400000 n.... ..{....#..
d000 89cf0000 00800000 ........
Contents of section .initcalls:
ec20 02b50000 6db50000 feac0000 ....m.......
Contents of section .exitcalls:
[...]
My linker script tells to align every section at a 16 byte (0x10) boundary. After
objcopy -O binary elfboot.elf elfboot.bin
the resulting binary contains a huge chunk of zero filled bytes at offset 0x5200 (0xd000 - 0x7e00) all the way to offset 0x6e20 (0xec20 - 0x7e00). Now that I know where the zero filled bytes come from, how do I remove them? For reference I added the verbose output of the linker step including the linker script used:
i686-elfboot-gcc -o elfboot.elf -T elfboot.ld ./src/arch/x86/bootstrap.o ./src/arch/x86/a20.o ./src/arch/x86/bda.o ./src/arch/x86/bios.o ./src/arch/x86/copy.o ./src/arch/x86/e820.o ./src/arch/x86/entry.o ./src/arch/x86/idt.o ./src/arch/x86/realmode_jmp.o ./src/arch/x86/setup.o ./src/arch/x86/opmode.o ./src/arch/x86/pic.o ./src/arch/x86/ptrace.o ./src/arch/x86/video.o ./src/core/bdev.o ./src/core/cdev.o ./src/core/elf.o ./src/core/input.o ./src/core/interrupt.o ./src/core/loader.o ./src/core/main.o ./src/core/module.o ./src/core/pci.o ./src/core/printf.o ./src/core/string.o ./src/core/symbol.o ./src/crypto/crc32.o ./src/drivers/ide/ide.o ./src/fs/fs.o ./src/fs/file.o ./src/fs/super.o ./src/fs/ramfs/ramfs.o ./src/fs/isofs/isofs.o ./src/lib/ata/libata.o ./src/lib/tmg/libtmg.o ./src/mm/memblock.o ./src/mm/page_alloc.o ./src/mm/slub.o ./src/mm/util.o -O2 -Wl,--verbose -nostdlib -lgcc
GNU ld (GNU Binutils) 2.31.1
Supported emulations:
elf_i386
elf_iamcu
opened script file elfboot.ld
using external linker script:
==================================================
OUTPUT_FORMAT(elf32-i386)
ENTRY(_arch_start)
SECTIONS
{
/*
* The boot stack starts at 0x6000 and grows towards lower addresses.
* The stack is designed to be only one page in size, which should be
* sufficient for the bootstrap stage.
*/
. = 0x6000;
__stack_start = .;
/*
* Buffer for reading files from the boot device. Usually boot devices
* are designed to read data in chunks of 0x200 (512) or 0x800 (2048)
* bytes. The buffer is large enough to read 4 512 or 2 2048 chunks of
* data from the disk.
*/
__buffer_start = .;
. = 0x7000;
__buffer_end = .;
/*
* Actual bootstrap stage code and data.
*/
. = 0x7E00;
__bootstrap_start = .;
.text ALIGN(0x10) : {
__text_start = .;
*(.text*)
__text_end = .;
}
.rodata ALIGN(0x10) : {
__rodata_start = .;
*(.rodata*)
__rodata_end = .;
}
/*
* This section is dedicated to all built-in modules. If a module will
* be included in the elfboot binary, the module_init function pointer
* of that module is placed in this section.
*
* Make sure to initialize built-in filesystems first as we need it to
* setup the root file system node.
*/
.initcalls ALIGN(0x10) : {
__initcalls_vfs_start = .;
/* Built-in file systems */
*(.initcalls_vfs*)
__initcalls_vfs_end = .;
__initcalls_dev_start = .;
/* Built-in devices */
*(.initcalls_dev*)
__initcalls_dev_end = .;
__initcalls_start = .;
/* Built-in modules */
*(.initcalls*)
__initcalls_end = .;
}
.exitcalls ALIGN(0x10) : {
__exitcalls_start = .;
*(.exitcalls*)
__exitcalls_end = .;
}
.data ALIGN(0x10) : {
__data_start = .;
*(.data*)
__data_end = .;
}
.bss ALIGN(0x10) : {
__bss_start = .;
*(.bss*)
__bss_end = .;
}
__bootstrap_end = .;
}
==================================================
attempt to open ./src/arch/x86/bootstrap.o succeeded
./src/arch/x86/bootstrap.o
attempt to open ./src/arch/x86/a20.o succeeded
./src/arch/x86/a20.o
attempt to open ./src/arch/x86/bda.o succeeded
./src/arch/x86/bda.o
attempt to open ./src/arch/x86/bios.o succeeded
./src/arch/x86/bios.o
attempt to open ./src/arch/x86/copy.o succeeded
./src/arch/x86/copy.o
attempt to open ./src/arch/x86/e820.o succeeded
./src/arch/x86/e820.o
attempt to open ./src/arch/x86/entry.o succeeded
./src/arch/x86/entry.o
attempt to open ./src/arch/x86/idt.o succeeded
./src/arch/x86/idt.o
attempt to open ./src/arch/x86/realmode_jmp.o succeeded
./src/arch/x86/realmode_jmp.o
attempt to open ./src/arch/x86/setup.o succeeded
./src/arch/x86/setup.o
attempt to open ./src/arch/x86/opmode.o succeeded
./src/arch/x86/opmode.o
attempt to open ./src/arch/x86/pic.o succeeded
./src/arch/x86/pic.o
attempt to open ./src/arch/x86/ptrace.o succeeded
./src/arch/x86/ptrace.o
attempt to open ./src/arch/x86/video.o succeeded
./src/arch/x86/video.o
attempt to open ./src/core/bdev.o succeeded
./src/core/bdev.o
attempt to open ./src/core/cdev.o succeeded
./src/core/cdev.o
attempt to open ./src/core/elf.o succeeded
./src/core/elf.o
attempt to open ./src/core/input.o succeeded
./src/core/input.o
attempt to open ./src/core/interrupt.o succeeded
./src/core/interrupt.o
attempt to open ./src/core/loader.o succeeded
./src/core/loader.o
attempt to open ./src/core/main.o succeeded
./src/core/main.o
attempt to open ./src/core/module.o succeeded
./src/core/module.o
attempt to open ./src/core/pci.o succeeded
./src/core/pci.o
attempt to open ./src/core/printf.o succeeded
./src/core/printf.o
attempt to open ./src/core/string.o succeeded
./src/core/string.o
attempt to open ./src/core/symbol.o succeeded
./src/core/symbol.o
attempt to open ./src/crypto/crc32.o succeeded
./src/crypto/crc32.o
attempt to open ./src/drivers/ide/ide.o succeeded
./src/drivers/ide/ide.o
attempt to open ./src/fs/fs.o succeeded
./src/fs/fs.o
attempt to open ./src/fs/file.o succeeded
./src/fs/file.o
attempt to open ./src/fs/super.o succeeded
./src/fs/super.o
attempt to open ./src/fs/ramfs/ramfs.o succeeded
./src/fs/ramfs/ramfs.o
attempt to open ./src/fs/isofs/isofs.o succeeded
./src/fs/isofs/isofs.o
attempt to open ./src/lib/ata/libata.o succeeded
./src/lib/ata/libata.o
attempt to open ./src/lib/tmg/libtmg.o succeeded
./src/lib/tmg/libtmg.o
attempt to open ./src/mm/memblock.o succeeded
./src/mm/memblock.o
attempt to open ./src/mm/page_alloc.o succeeded
./src/mm/page_alloc.o
attempt to open ./src/mm/slub.o succeeded
./src/mm/slub.o
attempt to open ./src/mm/util.o succeeded
./src/mm/util.o
attempt to open /home/croemheld/Repositories/elfboot/elfboot-toolchain/lib/gcc/i686-elfboot/8.2.0/libgcc.so failed
attempt to open /home/croemheld/Repositories/elfboot/elfboot-toolchain/lib/gcc/i686-elfboot/8.2.0/libgcc.a succeeded
(/home/croemheld/Repositories/elfboot/elfboot-toolchain/lib/gcc/i686-elfboot/8.2.0/libgcc.a)_umoddi3.o
(/home/croemheld/Repositories/elfboot/elfboot-toolchain/lib/gcc/i686-elfboot/8.2.0/libgcc.a)_udivmoddi4.o
Please note that I am not able to load the sections individually into memory, since my first stage bootloader (boot sector) simply loads the entire file at a specified offset from the boot device. To reduce the size of the second stage bootloader image, I want to try to remove the gap at link time or also if possible at all, at post link time (objcopy, ...).

Why perror() changes orientation of stream when it is redirected?

The standard says that:
The perror() function shall not change the orientation of the standard error stream.
This is the implementation of perror() in GNU libc.
Following are the tests when stderr is wide-oriented, multibyte-oriented and not oriented, prior to calling perror().
Tests 1) and 2) are OK. The issue is in test 3).
1) stderr is wide-oriented:
#include <stdio.h>
#include <wchar.h>
#include <errno.h>
int main(void)
{
fwide(stderr, 1);
errno = EINVAL;
perror("");
int x = fwide(stderr, 0);
printf("fwide: %d\n",x);
return 0;
}
$ ./a.out
Invalid argument
fwide: 1
$ ./a.out 2>/dev/null
fwide: 1
2) stderr is multibyte-oriented:
#include <stdio.h>
#include <wchar.h>
#include <errno.h>
int main(void)
{
fwide(stderr, -1);
errno = EINVAL;
perror("");
int x = fwide(stderr, 0);
printf("fwide: %d\n",x);
return 0;
}
$ ./a.out
Invalid argument
fwide: -1
$ ./a.out 2>/dev/null
fwide: -1
3) stderr is not oriented:
#include <stdio.h>
#include <wchar.h>
#include <errno.h>
int main(void)
{
printf("initial fwide: %d\n", fwide(stderr, 0));
errno = EINVAL;
perror("");
int x = fwide(stderr, 0);
printf("fwide: %d\n", x);
return 0;
}
$ ./a.out
initial fwide: 0
Invalid argument
fwide: 0
$ ./a.out 2>/dev/null
initial fwide: 0
fwide: -1
Why perror() changes orientation of stream if it is redirected? Is it proper behavior?
How does this code work? What is this __dup trick all about?
TL;DR: Yes, it's a bug in glibc. If you care about it, you should report it.
The quoted requirement that perror not change the stream orientation is in Posix, but does not seem to be required by the C standard itself. However, Posix seems quite insistent that the orientation of stderr not be changed by perror, even if stderr is not yet oriented. XSH 2.5 Standard I/O Streams:
The perror(), psiginfo(), and psignal() functions shall behave as described above for the byte output functions if the stream is already byte-oriented, and shall behave as described above for the wide-character output functions if the stream is already wide-oriented. If the stream has no orientation, they shall behave as described for the byte output functions except that they shall not change the orientation of the stream.
And glibc attempts to implement Posix semantics. Unfortunately, it doesn't quite get it right.
Of course, it is impossible to write to a stream without setting its orientation. So in an attempt to comply with this curious requirement, glibc attempts to make a new stream based on the same fd as stderr, using the code pointed to at the end of the OP:
58 if (__builtin_expect (_IO_fwide (stderr, 0) != 0, 1)
59 || (fd = __fileno (stderr)) == -1
60 || (fd = __dup (fd)) == -1
61 || (fp = fdopen (fd, "w+")) == NULL)
62 { ...
which, stripping out the internal symbols, is essentially equivalent to:
if (fwide (stderr, 0) != 0
|| (fd = fileno (stderr)) == -1
|| (fd = dup (fd)) == -1
|| (fp = fdopen (fd, "w+")) == NULL)
{
/* Either stderr has an orientation or the duplication failed,
* so just write to stderr
*/
if (fd != -1) close(fd);
perror_internal(stderr, s, errnum);
}
else
{
/* Write the message to fp instead of stderr */
perror_internal(fp, s, errnum);
fclose(fp);
}
fileno extracts the fd from a standard C library stream. dup takes an fd, duplicates it, and returns the number of the copy. And fdopen creates a standard C library stream from an fd. In short, that doesn't reopen stderr; rather, it creates (or attempts to create) a copy of stderr which can be written to without affecting the orientation of stderr.
Unfortunately, it doesn't work reliably because of the mode:
fp = fdopen(fd, "w+");
That attempts to open a stream which allows both reading and writing. And it will work with the original stderr, which is just a copy of the console fd, originally opened for both reading and writing. But when you bind stderr to some other device with a redirect:
$ ./a.out 2>/dev/null
you are passing the executable an fd opened only for output. And fdopen won't let you get away with that:
The application shall ensure that the mode of the stream as expressed by the mode argument is allowed by the file access mode of the open file description to which fildes refers.
The glibc implementation of fdopen actually checks, and returns NULL with errno set to EINVAL if you specify a mode which requires access rights not available to the fd.
So you could get your test to pass if you redirect stderr for both reading and writing:
$ ./a.out 2<>/dev/null
But what you probably wanted in the first place was to redirect stderr in append mode:
$ ./a.out 2>>/dev/null
and as far as I know, bash does not provide a way to read/append redirect.
I don't know why the glibc code uses "w+" as a mode argument, since it has no intention of reading from stderr. "w" should work fine, although it probably won't preserve append mode, which might have unfortunate consequences.
I'm not sure if there's a good answer to "why" without asking the glibc developers - it may just be a bug - but the POSIX requirement seems to conflict with ISO C, which reads in 7.21.2, ¶4:
Each stream has an orientation. After a stream is associated with an external file, but before any operations are performed on it, the stream is without orientation. Once a wide character input/output function has been applied to a stream without orientation, the stream becomes a wide-oriented stream. Similarly, once a byte input/output function has been applied to a stream without orientation, the stream becomes a byte-oriented stream. Only a call to the freopen function or the fwide function can otherwise alter the orientation of a stream. (A successful call to freopen removes any orientation.)
Further, perror seems to qualify as a "byte I/O function" since it takes a char * and, per 7.21.10.4 ¶2, "writes a sequence of characters".
Since POSIX defers to ISO C in the event of a conflict, there is an argument to be made that the POSIX requirement here is void.
As for the actual examples in the question:
Undefined behavior. A byte I/O function is called on a wide-oriented stream.
Nothing at all controversial. The orientation was correct for calling perror and did not change as a result of the call.
Calling perror oriented the stream to byte orientation. This seems to be required by ISO C but disallowed by POSIX.

apache2 cross-build failed for arm

On cross-building apache2 2.4.12 I get the following Error during complitation:
ptxdist/platform/build-target/httpd-2.4.12/modules/mappers -prefer-non-pic -static -c exports.c && touch exports.lo
exports.c:4117:55: error: 'apr_hash_this_key' undeclared here (not in a function)
exports.c:4121:59: error: 'apr_hash_this_key_len' undeclared here (not in a function)
exports.c:4125:55: error: 'apr_hash_this_val' undeclared here (not in a function)
exports.c:4568:62: error: 'apr_sockaddr_is_wildcard' undeclared here (not in a function)
exports.c:5307:55: error: 'apr_shm_create_ex' undeclared here (not in a function)
exports.c:5323:55: error: 'apr_shm_attach_ex' undeclared here (not in a function)
It seems that there is a dirty hack that generate httpd-2.4.12/server/exports.c during compile time from file make_exports.awk
How do I adjust this make_exports.awk file to get a working cross-build for arm?
The solution was to create a patch for apr-1.5.1 which disable apr_hash_this_key, apr_hash_this_key_len and so on. Then recompile apr, apr-util and apache 2.4.12.
diff -rupN AY/include/apr_hash.h AZ/include/apr_hash.h
--- AY/include/apr_hash.h 2015-02-27 10:35:07.436548295 +0100
+++ AZ/include/apr_hash.h 2015-02-27 10:33:30.038497258 +0100
## -171,21 +171,21 ## APR_DECLARE(void) apr_hash_this(apr_hash
* #param hi The iteration state
* #return The pointer to the key
*/
-APR_DECLARE(const void*) apr_hash_this_key(apr_hash_index_t *hi);
+//APR_DECLARE(const void*) apr_hash_this_key(apr_hash_index_t *hi);
/**
* Get the current entry's key length from the iteration state.
* #param hi The iteration state
* #return The key length
*/
-APR_DECLARE(apr_ssize_t) apr_hash_this_key_len(apr_hash_index_t *hi);
+//APR_DECLARE(apr_ssize_t) apr_hash_this_key_len(apr_hash_index_t *hi);
/**
* Get the current entry's value from the iteration state.
* #param hi The iteration state
* #return The pointer to the value
*/
-APR_DECLARE(void*) apr_hash_this_val(apr_hash_index_t *hi);
+//APR_DECLARE(void*) apr_hash_this_val(apr_hash_index_t *hi);
/**
* Get the number of key/value pairs in the hash table.
diff -rupN AY/include/apr_network_io.h AZ/include/apr_network_io.h
--- AY/include/apr_network_io.h 2013-12-06 18:10:34.000000000 +0100
+++ AZ/include/apr_network_io.h 2015-02-27 11:50:10.847966181 +0100
## -730,7 +730,7 ## APR_DECLARE(int) apr_sockaddr_equal(cons
* #remark The return value will be non-zero if the address is
* initialized and is the wildcard address.
*/
-APR_DECLARE(int) apr_sockaddr_is_wildcard(const apr_sockaddr_t *addr);
+//APR_DECLARE(int) apr_sockaddr_is_wildcard(const apr_sockaddr_t *addr);
/**
* Return the type of the socket.
diff -rupN AY/include/apr_shm.h AZ/include/apr_shm.h
--- AY/include/apr_shm.h 2015-02-27 12:00:01.119541200 +0100
+++ AZ/include/apr_shm.h 2015-02-27 12:02:01.294089137 +0100
## -111,12 +111,12 ## APR_DECLARE(apr_status_t) apr_shm_create
* function will return the first usable byte of memory.
*
*/
-APR_DECLARE(apr_status_t) apr_shm_create_ex(apr_shm_t **m,
+/*APR_DECLARE(apr_status_t) apr_shm_create_ex(apr_shm_t **m,
apr_size_t reqsize,
const char *filename,
apr_pool_t *pool,
apr_int32_t flags);
-
+*/
/**
* Remove named resource associated with a shared memory segment,
* preventing attachments to the resource, but not destroying it.
## -164,11 +164,11 ## APR_DECLARE(apr_status_t) apr_shm_attach
* structure for this process.
* #param flags mask of APR_SHM_* (defined above)
*/
-APR_DECLARE(apr_status_t) apr_shm_attach_ex(apr_shm_t **m,
+/*APR_DECLARE(apr_status_t) apr_shm_attach_ex(apr_shm_t **m,
const char *filename,
apr_pool_t *pool,
apr_int32_t flags);
-
+*/
/**
* Detach from a shared memory segment without destroying it.
* #param m The shared memory structure representing the segment
diff -rupN AY/Makefile.in AZ/Makefile.in
--- AY/Makefile.in 2015-02-27 10:17:32.817392979 +0100
+++ AZ/Makefile.in 2015-02-27 10:17:44.075619396 +0100
## -113,7 +113,7 ## apr.exp: exports.c export_vars.c
#echo "#! lib#APR_LIBNAME#.so" > $#
#echo "* This file was AUTOGENERATED at build time." >> $#
#echo "* Please do not edit by hand." >> $#
- $(CPP) $(ALL_CPPFLAGS) $(ALL_INCLUDES) exports.c | grep "ap_hack_" | sed -e 's/^.*[)]\(.*\);$$/\1/' >> $#
+ #$(CPP) $(ALL_CPPFLAGS) $(ALL_INCLUDES) exports.c | grep "ap_hack_" | sed -e 's/^.*[)]\(.*\);$$/\1/' >> $#
$(CPP) $(ALL_CPPFLAGS) $(ALL_INCLUDES) export_vars.c | sed -e 's/^\#[^!]*//' | sed -e '/^$$/d' >> $#
dox:

Building release version of a project with Android NDK r6

I am compiling helloworld example of Android NDK r6b using cygwin and Windows Vista. I have noticed that the following code takes between 14 and 20 mseconds on my Android phone (it has an 800mhz CPU Qualcomm MSM7227T chipset, with hardware floating point support):
float *v1, *v2, *v3, tot;
int num = 50000;
v1 = new float[num];
v2 = new float[num];
v3 = new float[num];
// Initialize vectors. RandomEqualREAL() returns a floating point number in a specified range.
for ( int i = 0; i < num; i++ )
{
v1[i] = RandomEqualREAL( -10.0f, 10.0f );
if (v1[i] == 0.0f) v1[i] = 1.0f;
v2[i] = RandomEqualREAL( -10.0f, 10.0f );
if (v2[i] == 0.0f) v2[i] = 1.0f;
}
clock_t start = clock() / (CLOCKS_PER_SEC / 1000);
tot = 0.0f;
for ( int k = 0; k < 1000; k++)
{
for ( int i = 0; i < num; i++ )
{
v3[i] = v1[i] / (v2[i]);
tot += v3[i];
}
}
clock_t end = clock() / (CLOCKS_PER_SEC / 1000);
printf("time %f\n", tot, (end-start)/1000.0f);
On my 2.4ghz notebook it takes .45 msec (timings taken when the system is full of other programs running, like Chrome, 2/3 ides, .pdf opens etc...). I wonder if the helloworld application is builded as a release version. I noticed that g++ get called with
-msoft-float.
Does this means that it is using floating point emulations?
What command line options i need to use in order to build an optimized version of the program? How to specify those options?
This is how g++ get called.:
/cygdrive/d/android/android-ndk-r6b/toolchains/arm-linux-androideabi-4.4.3/prebu
ilt/windows/bin/arm-linux-androideabi-g++ -MMD -MP -MF D:/android/workspace/hell
oworld/obj/local/armeabi/objs/ndkfoo/ndkfoo.o.d.org -fpic -ffunction-sections -f
unwind-tables -fstack-protector -D__ARM_ARCH_5__ -D__ARM_ARCH_5T__ -D__ARM_ARCH_
5E__ -D__ARM_ARCH_5TE__ -Wno-psabi -march=armv5te -mtune=xscale -msoft-float -f
no-exceptions -fno-rtti -mthumb -Os -fomit-frame-pointer -fno-strict-aliasing -f
inline-limit=64 -ID:/android/workspace/helloworld/jni/boost -ID:/android/workspa
ce/helloworld/jni/../../mylib/jni -ID:/android/android-ndk-r6b/sources/cxx-stl/g
nu-libstdc++/include -ID:/android/android-ndk-r6b/sources/cxx-stl/gnu-libstdc++/
libs/armeabi/include -ID:/android/workspace/helloworld/jni -DANDROID -Wa,--noex
ecstack -fexceptions -frtti -O2 -DNDEBUG -g -ID:/android/android-ndk-r6b/plat
forms/android-9/arch-arm/usr/include -c D:/android/workspace/helloworld/jni/ndk
foo.cpp -o D:/android/workspace/helloworld/obj/local/armeabi/objs/ndkfoo/ndkfoo.
o && ( if [ -f "D:/android/workspace/helloworld/obj/local/armeabi/objs/ndkfoo/nd
kfoo.o.d.org" ]; then awk -f /cygdrive/d/android/android-ndk-r6b/build/awk/conve
rt-deps-to-cygwin.awk D:/android/workspace/helloworld/obj/local/armeabi/objs/ndk
foo/ndkfoo.o.d.org > D:/android/workspace/helloworld/obj/local/armeabi/objs/ndkf
oo/ndkfoo.o.d && rm -f D:/android/workspace/helloworld/obj/local/armeabi/objs/nd
kfoo/ndkfoo.o.d.org; fi )
Prebuilt : libstdc++.a <= <NDK>/sources/cxx-stl/gnu-libstdc++/libs/armeabi
/
cp -f /cygdrive/d/android/android-ndk-r6b/sources/cxx-stl/gnu-libstdc++/libs/arm
eabi/libstdc++.a /cygdrive/d/android/workspace/helloworld/obj/local/armeabi/libs
tdc++.a
SharedLibrary : libndkfoo.so
/cygdrive/d/android/android-ndk-r6b/toolchains/arm-linux-androideabi-4.4.3/prebu
ilt/windows/bin/arm-linux-androideabi-g++ -Wl,-soname,libndkfoo.so -shared --sys
root=D:/android/android-ndk-r6b/platforms/android-9/arch-arm D:/android/workspac
e/helloworld/obj/local/armeabi/objs/ndkfoo/ndkfoo.o D:/android/workspace/hellow
orld/obj/local/armeabi/libstdc++.a D:/android/android-ndk-r6b/toolchains/arm-lin
ux-androideabi-4.4.3/prebuilt/windows/bin/../lib/gcc/arm-linux-androideabi/4.4.3
/libgcc.a -Wl,--no-undefined -Wl,-z,noexecstack -lc -lm -lsupc++ -o D:/androi
d/workspace/helloworld/obj/local/armeabi/libndkfoo.so
Install : libndkfoo.so => libs/armeabi/libndkfoo.so
mkdir -p /cygdrive/d/android/workspace/helloworld/libs/armeabi
install -p /cygdrive/d/android/workspace/helloworld/obj/local/armeabi/libndkfoo.
so /cygdrive/d/android/workspace/helloworld/libs/armeabi/libndkfoo.so
/cygdrive/d/android/android-ndk-r6b/toolchains/arm-linux-androideabi-4.4.3/prebu
ilt/windows/bin/arm-linux-androideabi-strip --strip-unneeded D:/android/workspac
e/helloworld/libs/armeabi/libndkfoo.so
Edit.
I have run the commnad adb shell cat /proc/cpuinfo. This is the result:
Processor : ARMv6-compatible processor rev 5 (v6l)
BogoMIPS : 532.48
Features : swp half thumb fastmult vfp edsp java
CPU implementer : 0x41
CPU architecture: 6TEJ
CPU variant : 0x1
CPU part : 0xb36
CPU revision : 5
Hardware : GELATO Global board (LGE LGP690)
Revision : 0000
Serial : 0000000000000000
I don't understand what swp, half thumb fastmult vfp edsp and java means, but i don't like that 'vfp'!! Does it means virtual-floating points? That processor should have a floating point unit...
You are right, -msoft-float is a synonym for -mfloat-abi=soft (see list of gcc ARM options) and means floating point emulation.
For hardware floating point the following flags can be used:
LOCAL_CFLAGS += -march=armv6 -marm -mfloat-abi=softfp -mfpu=vfp
To see what floating point unit you really have on your device you can check the output of adb shell cat /proc/cpuinfo command. Some units are compatible with another: vfp < vfpv3-d16 < vfpv3 < neon - so if you have vfpv3, then vfp also works for you.
Also you might want to add the line
APP_OPTIM := release
into your Application.mk file. This setting overrides automatic 'debug' mode for native part of application if the manifest sets android:debuggable to 'true'
But even with all these settings NDK will put -march=armv5te -mtune=xscale -msoft-float into the beginning of compiler options. And this behavior can not be changed without modifications in NDK sources (these options are hardcoded in file $NDKROOT\toolchains\arm-linux-androideabi-4.4.3\setup.mk).

Linux splice() returning EINVAL ("Invalid argument")

I'm trying to experiment with using splice (man 2 splice) to copy data from a UDP socket directly to a file. Unfortunately the first call to splice() returns EINVAL.
The man page states:
EINVAL Target file system doesn't support splicing; target file is opened in
append mode; neither of the descriptors refers to a pipe; or offset
given for nonseekable device.
However, I believe none of those conditions apply. I'm using Fedora 15 (kernel 2.6.40-4) so I believe splice() is supported on all filesystems. The target file should be irrelevant in the first call to splice, but for completeness I'm opening it via open(path, O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR). Both calls use a pipe and neither call uses an offset besides NULL.
Here's my sample code:
int sz = splice(sock_fd, 0, mPipeFds[1], 0, 8192, SPLICE_F_MORE);
if (-1 == sz)
{
int err = errno;
LOG4CXX_ERROR(spLogger, "splice from: " << strerror(err));
return 0;
}
sz = splice(mPipeFds[0], 0, file_fd, 0, sz, SPLICE_F_MORE);
if (-1 == sz)
{
int err = errno;
LOG4CXX_ERROR(spLogger, "splice to: " << strerror(err));
}
return 0;
sock_fd is initialized by the following psuedocode:
int sock_fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
setsockopt(sock_fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
fcntl(sock_fd, F_SETFL, flags | O_NONBLOCK);
bind(sock_fd, ...);
Possibly related is that this code snippet is running inside a libevent loop. libevent is using epoll() to determine if the UDP socket is hot.
Found my answer. tl;dr - UDP isn't supported on the inbound side.
After enough Googling I stumbled upon a forum discussion and some test code which prints out a table of in/out fd types and their support:
$ ./a.out
in\out pipe reg chr unix tcp udp
pipe yes yes yes yes yes yes
reg yes no no no no no
chr yes no no no no no
unix no no no no no no
tcp yes no no no no no
udp no no no no no no
Yeah, it is definitely not supported for reading from a UDP socket, even in the latest kernels. References to the kernel source follow.
splice invokes do_splice in the kernel, which calls do_splice_to, which calls the splice_read member in the file_operations structure for the file.
For sockets, that structure is defined as socket_file_ops in net/socket.c, which initializes the splice_read field to sock_splice_read.
That function, in turn, contains this line of code:
if (unlikely(!sock->ops->splice_read))
return -EINVAL;
The ops field of the socket is a struct proto_ops. For an IPv4 UDP socket, it is initialized to inet_dgram_ops in net/ipv4/af_inet.c. Finally, that structure does not explicitly initialize the splice_read field of struct proto_ops; i.e., it initializes it to zero.
So sock_splice_read returns -EINVAL, and that propagates up.