How objdump handles global variables

I have made the following dummy code for testing
/tmp/test.c contains the following:
#include "test.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* memset */
struct s* p;
unsigned char *c;
void main(int argc, char ** argv) {
    memset(c, 0, 10);
    p->a = 10;
    p->b = 20;
}
/tmp/test.h contains the following:
struct s {
int a;
int b;
};
I compile and run objdump as follows:
cd /tmp
gcc -c test.c -o test.o
objdump -gdsMIntel test.o
I get the following output:
test.o: file format elf32-i386
Contents of section .text:
0000 5589e5a1 00000000 c7000000 0000c740 U..............@
0010 04000000 0066c740 080000a1 00000000 .....f.@........
0020 c7000a00 0000a100 000000c7 40041400 ............@...
0030 00005dc3 ..].
Contents of section .comment:
0000 00474343 3a202855 62756e74 752f4c69 .GCC: (Ubuntu/Li
0010 6e61726f 20342e36 2e332d31 7562756e naro 4.6.3-1ubun
0020 74753529 20342e36 2e3300 tu5) 4.6.3.
Contents of section .eh_frame:
0000 14000000 00000000 017a5200 017c0801 .........zR..|..
0010 1b0c0404 88010000 1c000000 1c000000 ................
0020 00000000 34000000 00410e08 8502420d ....4....A....B.
0030 05700c04 04c50000 .p......
Disassembly of section .text:
00000000 <main>:
0: 55 push ebp
1: 89 e5 mov ebp,esp
3: a1 00 00 00 00 mov eax,ds:0x0 ;;;; should be the address of unsigned char *c
8: c7 00 00 00 00 00 mov DWORD PTR [eax],0x0 ;;;; setting 10 bytes to 0
e: c7 40 04 00 00 00 00 mov DWORD PTR [eax+0x4],0x0
15: 66 c7 40 08 00 00 mov WORD PTR [eax+0x8],0x0
1b: a1 00 00 00 00 mov eax,ds:0x0
20: c7 00 0a 00 00 00 mov DWORD PTR [eax],0xa ;;;; p->a = 10;
26: a1 00 00 00 00 mov eax,ds:0x0
2b: c7 40 04 14 00 00 00 mov DWORD PTR [eax+0x4],0x14 ;;;; p->b = 20;
32: 5d pop ebp
33: c3 ret
In the above disassembly, I find that:
In the case of c, the following is done:
mov eax, ds:0x0
mov DWORD PTR [eax], 0
In the case of p->a the following is done:
mov eax, ds:0x0
mov DWORD PTR [eax],0xa
In that case, are both c and p located at the same address (ds:0x0)?

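The .o file is a relocatable object, not yet an executable: relocation entries point at those zeroed address fields, and the linker fixes them up later. So c and p do not share an address; the 0x0 is only a placeholder. Adding -r to the objdump options prints each relocation next to the instruction it patches. Since your source file seems mostly self-contained, can you drop the -c flag to produce a fully linked executable rather than a relocatable object file?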
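To make the fixup concrete, here is a minimal sketch of what the linker does for an absolute 32-bit relocation such as R_386_32. It is a simplified model, not real linker code: the Reloc struct, the apply_abs32 function and any addresses passed to it are hypothetical names and values chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record: where the 4-byte field lives in .text and the address
// the linker finally assigned to the symbol (made-up values, not from test.o).
struct Reloc {
    std::size_t offset;
    std::uint32_t symbol_addr;
};

// Applies an absolute 32-bit fixup: field = symbol address + addend.
// On i386 the addend is stored in the field itself; in the dump above it is 0.
void apply_abs32(std::vector<std::uint8_t>& text, const Reloc& r) {
    assert(r.offset + 4 <= text.size());
    std::uint32_t addend = 0;
    for (int i = 3; i >= 0; --i)  // read the field as little-endian
        addend = (addend << 8) | text[r.offset + i];
    const std::uint32_t value = r.symbol_addr + addend;
    for (int i = 0; i < 4; ++i)   // write the patched value back, little-endian
        text[r.offset + i] = static_cast<std::uint8_t>(value >> (8 * i));
}
```

For the bytes a1 00 00 00 00 and an assumed address of 0x0804a020 for c, applying the fixup at offset 1 turns the field into 20 a0 04 08. The loads for p would be patched with whatever different address the linker gives p.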

Related

Relocation R_X86_64_PC32: why value is x-4 instead of x?

Consider this code:
int arr[4];
void foo(void)
{
arr[0] = arr[1];
}
compiled and objdumped as:
gcc t57.c -O3 -c && objdump -Dr t57.o
leading to:
0000000000000000 <foo>:
0: f3 0f 1e fa endbr64
4: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # a <foo+0xa>
6: R_X86_64_PC32 arr
a: 89 05 00 00 00 00 mov %eax,0x0(%rip) # 10 <foo+0x10>
c: R_X86_64_PC32 arr-0x4
10: c3 retq
Here we see arr and arr-0x4.
Question: why not arr+0x4 and arr? Where does this -0x4 come from?
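One way to see where the -0x4 comes from is to do the arithmetic. R_X86_64_PC32 stores S + A - P, where P is the address of the 4-byte field itself, while the CPU adds the displacement to the address of the next instruction. The sketch below computes the addend A that makes the two agree; pc32_addend is a hypothetical helper written for this explanation, not part of any toolchain.

```cpp
#include <cassert>
#include <cstdint>

// The CPU computes: target = next_insn + disp32.
// The linker writes: disp32 = S + A - P  (P = address of the disp32 field).
// For target = S + offset_into_symbol this gives:
//   A = offset_into_symbol - (next_insn - P)
std::int64_t pc32_addend(std::int64_t offset_into_symbol,
                         std::uint64_t field_offset,
                         std::uint64_t next_insn_offset) {
    assert(next_insn_offset >= field_offset);
    return offset_into_symbol -
           static_cast<std::int64_t>(next_insn_offset - field_offset);
}
```

In the listing above, the load of arr[1] (4 bytes into arr) has its field at 0x6 and the next instruction at 0xa, so the addend is 4 - 4 = 0 and objdump prints plain arr. The store to arr[0] has its field at 0xc and the next instruction at 0x10, so the addend is 0 - 4 = -4, printed as arr-0x4.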

gcc address sanitizer heap-buffer-overflow error during sort caused by change of condition?

This is the bare-bone version of my function which illustrates the error.
File runtime-error.cpp:
#include <vector>
#include <algorithm>
using namespace std;
int main() {
using vi = vector<int>;
vector<vi> result = {{0}, {0}, {0}, {0}, {0}, {0}, {0}, {0}, {0}, {0},
{0}, {0}, {0}, {0}, {0}, {0}, {0}, {0}, {0}, {0},
{0}, {0}, {0}, {0}, {0}, {2}, {2}, {2}, {2}, {2},
{2}, {2}, {2}, {2}, {2}, {2}, {2}, {2}, {2}, {2},
{2}, {2}, {2}, {2}, {2}};
sort(result.begin(), result.end(), [](const vi& v1, const vi& v2) {
const auto v10 = v1[0];
const auto v20 = v2[0]; // line 15
return (v10 <= v20); // error condition
// return (v10 < v20); // no error condition
});
return 0;
}
I build it and run it with the following command:
g++ -Wall -g -fsanitize=address -fno-omit-frame-pointer runtime-error.cpp -o re && ./re
gcc version:
gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
If the "no error condition" line is used instead, there is no problem. As written, the following error occurs:
==4046513==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x619000000068 at pc 0x55dec650f945 bp 0x7ffc95746530 sp 0x7ffc95746520
READ of size 8 at 0x619000000068 thread T0
#0 0x55dec650f944 in std::vector<int, std::allocator<int> >::operator[](unsigned long) const /usr/include/c++/9/bits/stl_vector.h:1061
#1 0x55dec650851e in operator() ./runtime-error.cpp:15
#2 0x55dec650d424 in operator()<__gnu_cxx::__normal_iterator<std::vector<int>*, std::vector<std::vector<int> > >, __gnu_cxx::__normal_iterator<std::vector<int>*, std::vector<std::vector<int> > > > /usr/include/c++/9/bits/predefined_ops.h:143
#3 0x55dec650d8c1 in __unguarded_partition<__gnu_cxx::__normal_iterator<std::vector<int>*, std::vector<std::vector<int> > >, __gnu_cxx::__ops::_Iter_comp_iter<main()::<lambda(const vi&, const vi&)> > > /usr/include/c++/9/bits/stl_algo.h:1910
#4 0x55dec650c8ed in __unguarded_partition_pivot<__gnu_cxx::__normal_iterator<std::vector<int>*, std::vector<std::vector<int> > >, __gnu_cxx::__ops::_Iter_comp_iter<main()::<lambda(const vi&, const vi&)> > > /usr/include/c++/9/bits/stl_algo.h:1928
#5 0x55dec650c311 in __introsort_loop<__gnu_cxx::__normal_iterator<std::vector<int>*, std::vector<std::vector<int> > >, long int, __gnu_cxx::__ops::_Iter_comp_iter<main()::<lambda(const vi&, const vi&)> > > /usr/include/c++/9/bits/stl_algo.h:1958
#6 0x55dec650c011 in __sort<__gnu_cxx::__normal_iterator<std::vector<int>*, std::vector<std::vector<int> > >, __gnu_cxx::__ops::_Iter_comp_iter<main()::<lambda(const vi&, const vi&)> > > /usr/include/c++/9/bits/stl_algo.h:1973
#7 0x55dec650bd55 in sort<__gnu_cxx::__normal_iterator<std::vector<int>*, std::vector<std::vector<int> > >, main()::<lambda(const vi&, const vi&)> > /usr/include/c++/9/bits/stl_algo.h:4905
#8 0x55dec650aec7 in main ./runtime-error.cpp:13
#9 0x7f42966df082 in __libc_start_main ../csu/libc-start.c:308
#10 0x55dec65083ed in _start (./re+0x13ed)
0x619000000068 is located 24 bytes to the left of 1080-byte region [0x619000000080,0x6190000004b8)
allocated by thread T0 here:
#0 0x7f4296d08587 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cc:104
#1 0x55dec65113bf in __gnu_cxx::new_allocator<std::vector<int, std::allocator<int> > >::allocate(unsigned long, void const*) /usr/include/c++/9/ext/new_allocator.h:114
#2 0x55dec6510f2e in std::allocator_traits<std::allocator<std::vector<int, std::allocator<int> > > >::allocate(std::allocator<std::vector<int, std::allocator<int> > >&, unsigned long) /usr/include/c++/9/bits/alloc_traits.h:443
#3 0x55dec6510749 in std::_Vector_base<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > >::_M_allocate(unsigned long) /usr/include/c++/9/bits/stl_vector.h:343
#4 0x55dec650fe40 in void std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > >::_M_range_initialize<std::vector<int, std::allocator<int> > const*>(std::vector<int, std::allocator<int> > const*, std::vector<int, std::allocator<int> > const*, std::forward_iterator_tag) /usr/include/c++/9/bits/stl_vector.h:1579
#5 0x55dec650f588 in std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > >::vector(std::initializer_list<std::vector<int, std::allocator<int> > >, std::allocator<std::vector<int, std::allocator<int> > > const&) /usr/include/c++/9/bits/stl_vector.h:626
#6 0x55dec650a47a in main ./runtime-error.cpp:11
#7 0x7f42966df082 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/include/c++/9/bits/stl_vector.h:1061 in std::vector<int, std::allocator<int> >::operator[](unsigned long) const
Shadow bytes around the buggy address:
0x0c327fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c327fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c327fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c327fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c327fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c327fff8000: fa fa fa fa fa fa fa fa fa fa fa fa fa[fa]fa fa
0x0c327fff8010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c327fff8020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c327fff8030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c327fff8040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c327fff8050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==4046513==ABORTING
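<= does not satisfy the strict weak ordering that std::sort requires of its comparator (comp(a, a) must be false), while < does. With <= the behaviour is undefined, and here the unguarded partition loop runs past the start of the buffer.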
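As a minimal sketch of the requirement, the following compares a strict comparator with a non-strict one by checking irreflexivity, that is, that comp(x, x) is false for every element. The names first_less and is_irreflexive are hypothetical helpers, not standard library functions, and irreflexivity is only one of the conditions a valid comparator has to meet.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using vi = std::vector<int>;

// Strict comparison on the first element: first_less(a, a) is always false.
bool first_less(const vi& a, const vi& b) {
    assert(!a.empty() && !b.empty());
    return a[0] < b[0];
}

// Returns true if comp(x, x) is false for every element of items.
template <class Comp>
bool is_irreflexive(const std::vector<vi>& items, Comp comp) {
    return std::none_of(items.begin(), items.end(),
                        [&](const vi& x) { return comp(x, x); });
}
```

A comparator built on <= fails this check as soon as the input is non-empty, which is a cheap thing to assert in a debug build before handing the comparator to std::sort.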

Boost asio io service memcpy()

I have built an application based on boost::asio. Sometimes (not regularly) I get this kind of core dump. I tried to investigate what's going on, but I have run out of ideas.
From my point of view, there could be a problem inside the io_service object; maybe a bug? Should I update Boost?
Can anybody explain what memcpy() does in this case? What is the reason for the core dump?
More details:
Platform: SunOS
Boost: 1.49
/app/bin/executor_3'executor_dumpstack+0x13 [0x426645]
/app/bin/executor_3'signal_dumpstack+0x9d [0x426625]
/lib/amd64/libc.so.1'__sighndlr+0x6 [0xfffffd7fff224ea6]
/lib/amd64/libc.so.1'call_user_handler+0x2a4 [0xfffffd7fff217b5c]
/lib/amd64/libc.so.1'memcpy+0x1929 [0xfffffd7fff18a449] [Signal 11 (SEGV)]
/opt/lib/extralibs/exe_io.so'_ZNK5boost4_mfi3mf2Iv3GETRKNS_6system10error_codeEmE4callINS_10shared_ptrIS2_EES5_mEEvRT_PKvRT0_RT1_+0x8b [0xfffffd7ff64ff869]
/opt/lib/extralibs/exe_io.so'_ZNK5boost4_mfi3mf2Iv3GETRKNS_6system10error_codeEmEclINS_10shared_ptrIS2_EEEEvRT_S6_m+0x3c [0xfffffd7ff64fe47e]
/opt/lib/extralibs/exe_io.so'_ZN5boost3_bi5list3INS0_5valueINS_10shared_ptrI3GETEEEEPFNS_3argILi1EEEvEPFNS7_ILi2EEEvEEclINS_4_mfi3mf2IvS4_RKNS_6system10error_codeEmEENS0_5list2ISL_RKmEEEEvNS0_4typeIvEERT_RT0_i+0x72 [0xfffffd7ff64fce58]
/opt/lib/extralibs/exe_io.so'_ZN5boost3_bi6bind_tIvNS_4_mfi3mf2Iv3GETRKNS_6system10error_codeEmEENS0_5list3INS0_5valueINS_10shared_ptrIS4_EEEEPFNS_3argILi1EEEvEPFNSF_ILi2EEEvEEEEclIS6_mEEvRKT_RKT0_+0x43 [0xfffffd7ff64fbfa7]
/opt/lib/extralibs/exe_io.so'_ZN5boost4asio6detail17read_streambuf_opINS0_19basic_stream_socketINS0_2ip3tcpENS0_21stream_socket_serviceIS5_EEEESaIcENS1_18transfer_exactly_tENS_3_bi6bind_tIvNS_4_mfi3mf2Iv3GETRKNS_6system10error_codeEmEENSB_5list3INSB_5valueINS_10shared_ptrISF_EEEEPFNS_3argILi1EEEvEPFNSQ_ILi2EEEvEEEEEEclESJ_mi+0x16f [0xfffffd7ff64fa31b]
/opt/lib/extralibs/exe_io.so'_ZN5boost4asio6detail7binder2INS1_17read_streambuf_opINS0_19basic_stream_socketINS0_2ip3tcpENS0_21stream_socket_serviceIS6_EEEESaIcENS1_18transfer_exactly_tENS_3_bi6bind_tIvNS_4_mfi3mf2Iv3GETRKNS_6system10error_codeEmEENSC_5list3INSC_5valueINS_10shared_ptrISG_EEEEPFNS_3argILi1EEEvEPFNSR_ILi2EEEvEEEEEEESI_mEclEv+0x2d [0xfffffd7ff6503345]
/opt/lib/extralibs/exe_io.so'_ZN5boost4asio19asio_handler_invokeINS0_6detail7binder2INS2_17read_streambuf_opINS0_19basic_stream_socketINS0_2ip3tcpENS0_21stream_socket_serviceIS7_EEEESaIcENS2_18transfer_exactly_tENS_3_bi6bind_tIvNS_4_mfi3mf2Iv3GETRKNS_6system10error_codeEmEENSD_5list3INSD_5valueINS_10shared_ptrISH_EEEEPFNS_3argILi1EEEvEPFNSS_ILi2EEEvEEEEEEESJ_mEEEEvT_z+0x8e [0xfffffd7ff6502d76]
/opt/lib/extralibs/exe_io.so'_ZN33boost_asio_handler_invoke_helpers6invokeIN5boost4asio6detail7binder2INS3_17read_streambuf_opINS2_19basic_stream_socketINS2_2ip3tcpENS2_21stream_socket_serviceIS8_EEEESaIcENS3_18transfer_exactly_tENS1_3_bi6bind_tIvNS1_4_mfi3mf2Iv3GETRKNS1_6system10error_codeEmEENSE_5list3INSE_5valueINS1_10shared_ptrISI_EEEEPFNS1_3argILi1EEEvEPFNST_ILi2EEEvEEEEEEESK_mEES11_EEvRT_RT0_+0x3e [0xfffffd7ff650250a]
/opt/lib/extralibs/exe_io.so'_ZN5boost4asio6detail19asio_handler_invokeINS1_7binder2INS1_17read_streambuf_opINS0_19basic_stream_socketINS0_2ip3tcpENS0_21stream_socket_serviceIS7_EEEESaIcENS1_18transfer_exactly_tENS_3_bi6bind_tIvNS_4_mfi3mf2Iv3GETRKNS_6system10error_codeEmEENSD_5list3INSD_5valueINS_10shared_ptrISH_EEEEPFNS_3argILi1EEEvEPFNSS_ILi2EEEvEEEEEEESJ_mEESA_SB_SC_S10_EEvRT_PNS4_IT0_T1_T2_T3_EE+0x21 [0xfffffd7ff6501f61]
/opt/lib/extralibs/exe_io.so'_ZN33boost_asio_handler_invoke_helpers6invokeIN5boost4asio6detail7binder2INS3_17read_streambuf_opINS2_19basic_stream_socketINS2_2ip3tcpENS2_21stream_socket_serviceIS8_EEEESaIcENS3_18transfer_exactly_tENS1_3_bi6bind_tIvNS1_4_mfi3mf2Iv3GETRKNS1_6system10error_codeEmEENSE_5list3INSE_5valueINS1_10shared_ptrISI_EEEEPFNS1_3argILi1EEEvEPFNST_ILi2EEEvEEEEEEESK_mEES12_EEvRT_RT0_+0x25 [0xfffffd7ff65019a3]
/opt/lib/extralibs/exe_io.so'_ZN5boost4asio6detail23reactive_socket_recv_opINS0_17mutable_buffers_1ENS1_17read_streambuf_opINS0_19basic_stream_socketINS0_2ip3tcpENS0_21stream_socket_serviceIS7_EEEESaIcENS1_18transfer_exactly_tENS_3_bi6bind_tIvNS_4_mfi3mf2Iv3GETRKNS_6system10error_codeEmEENSD_5list3INSD_5valueINS_10shared_ptrISH_EEEEPFNS_3argILi1EEEvEPFNSS_ILi2EEEvEEEEEEEE11do_completeEPNS1_15task_io_serviceEPNS1_25task_io_service_operationESL_m+0xc4 [0xfffffd7ff65007cc]
/opt/lib/extralibs/exe_io.so'_ZN5boost4asio6detail25task_io_service_operation8completeERNS1_15task_io_serviceERKNS_6system10error_codeEm+0x32 [0xfffffd7ff6433dc8]
/opt/lib/extralibs/exe_io.so'_ZN5boost4asio6detail15task_io_service10do_run_oneERNS1_11scoped_lockINS1_11posix_mutexEEERNS2_11thread_infoERNS1_8op_queueINS1_25task_io_service_operationEEERKNS_6system10error_codeE+0x202 [0xfffffd7ff6433c9e]
/opt/lib/extralibs/exe_io.so'_ZN5boost4asio6detail15task_io_service3runERNS_6system10error_codeE+0xff [0xfffffd7ff6433925]
/opt/lib/extralibs/exe_io.so'_ZN5boost4asio10io_service3runEv+0x26 [0xfffffd7ff64335d6]
/opt/lib/extralibs/exe_io.so'_ZN6ZohaIO9RunIOEv+0x19 [0xfffffd7ff64335ad]
/opt/lib/extralibs/exe_io.so'_ZNK5boost4_mfi3mf0Iv6ZohaIOEclEPS2_+0x64 [0xfffffd7ff6440cee]
/opt/lib/extralibs/exe_io.so'_ZN5boost3_bi5list1INS0_5valueIP6ZohaIOEEEclINS_4_mfi3mf0IvS3_EENS0_5list0EEEvNS0_4typeIvEERT_RT0_i+0x41 [0xfffffd7ff6440c4d]
/opt/lib/extralibs/exe_io.so'_ZN5boost3_bi6bind_tIvNS_4_mfi3mf0Iv6ZohaIOEENS0_5list1INS0_5valueIPS4_EEEEEclEv+0x33 [0xfffffd7ff6440bfb]
/opt/lib/extralibs/exe_io.so'_ZN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf0Iv6ZohaIOEENS2_5list1INS2_5valueIPS6_EEEEEEE3runEv+0x1c [0xfffffd7ff6440514]
/opt/csw/gxx/lib/amd64/libboost_thread.so.1.49.0'0xf655 [0xfffffd7ffa97f655]
/lib/amd64/libc.so.1'_thrp_setup+0xbc [0xfffffd7fff224b14]
/lib/amd64/libc.so.1'_lwp_start+0x0 [0xfffffd7fff224de0]
objdump output.
0000000000000000 <_ZNK5boost4_mfi3mf2Iv3GETRKNS_6system10error_codeEmE4callINS_10shared_ptrIS2_EES5_mEEvRT_PKvRT0_RT1_>:
template<class U, class B1, class B2> R call(U & u, T const *, B1 & b1, B2 & b2) const
{
BOOST_MEM_FN_RETURN (u.*f_)(b1, b2);
}
template<class U, class B1, class B2> R call(U & u, void const *, B1 & b1, B2 & b2) const
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 53 push %rbx
5: 48 83 ec 38 sub $0x38,%rsp
9: 48 89 7d e8 mov %rdi,-0x18(%rbp)
d: 48 89 75 e0 mov %rsi,-0x20(%rbp)
11: 48 89 55 d8 mov %rdx,-0x28(%rbp)
15: 48 89 4d d0 mov %rcx,-0x30(%rbp)
19: 4c 89 45 c8 mov %r8,-0x38(%rbp)
{
BOOST_MEM_FN_RETURN (get_pointer(u)->*f_)(b1, b2);
1d: 48 8b 45 e0 mov -0x20(%rbp),%rax
21: 48 89 c7 mov %rax,%rdi
24: e8 00 00 00 00 callq 29 <_ZNK5boost4_mfi3mf2Iv3GETRKNS_6system10error_codeEmE4callINS_10shared_ptrIS2_EES5_mEEvRT_PKvRT0_RT1_+0x29>
29: 48 89 c2 mov %rax,%rdx
2c: 48 8b 45 e8 mov -0x18(%rbp),%rax
30: 48 8b 00 mov (%rax),%rax
33: 83 e0 01 and $0x1,%eax
36: 84 c0 test %al,%al
38: 74 23 je 5d <_ZNK5boost4_mfi3mf2Iv3GETRKNS_6system10error_codeEmE4callINS_10shared_ptrIS2_EES5_mEEvRT_PKvRT0_RT1_+0x5d>
3a: 48 8b 45 e8 mov -0x18(%rbp),%rax
3e: 48 8b 40 08 mov 0x8(%rax),%rax
42: 48 8d 04 02 lea (%rdx,%rax,1),%rax
46: 48 8b 08 mov (%rax),%rcx
49: 48 8b 45 e8 mov -0x18(%rbp),%rax
4d: 48 8b 00 mov (%rax),%rax
50: 48 83 e8 01 sub $0x1,%rax
54: 48 8d 04 01 lea (%rcx,%rax,1),%rax
58: 48 8b 00 mov (%rax),%rax
5b: eb 07 jmp 64 <_ZNK5boost4_mfi3mf2Iv3GETRKNS_6system10error_codeEmE4callINS_10shared_ptrIS2_EES5_mEEvRT_PKvRT0_RT1_+0x64>
5d: 48 8b 45 e8 mov -0x18(%rbp),%rax
61: 48 8b 00 mov (%rax),%rax
64: 48 8b 4d c8 mov -0x38(%rbp),%rcx
68: 48 8b 19 mov (%rcx),%rbx
6b: 48 8b 4d e8 mov -0x18(%rbp),%rcx
6f: 48 8b 49 08 mov 0x8(%rcx),%rcx
73: 48 8d 3c 0a lea (%rdx,%rcx,1),%rdi
77: 48 8b 4d d0 mov -0x30(%rbp),%rcx
7b: 48 89 da mov %rbx,%rdx
7e: 48 89 ce mov %rcx,%rsi
Hint: use the c++filt utility to make your backtrace readable: cat backtrace | c++filt
Something goes wrong when the async_read completion handler on GET is invoked. Maybe the object has already been destroyed by the time the handler runs, or the parameters are messed up. I can't say precisely without the code, but the backtrace points at the read callback.

GNU inline assembly optimisation

I am trying to write a small library for highly optimised x86-64 bit operation code and am fiddling with inline asm.
While testing, this particular case caught my attention:
unsigned long test = 0;
unsigned long bsr;
// bit test and set 39th bit
__asm__ ("btsq\t%1, %0 " : "+rm" (test) : "rJ" (39) );
// bit scan reverse (get most significant bit id)
__asm__ ("bsrq\t%1, %0" : "=r" (bsr) : "rm" (test) );
printf("test = %lu, bsr = %lu\n", test, bsr);
compiles and runs fine in both gcc and icc, but when I inspect the assembly I get differences
gcc -S -fverbose-asm -std=gnu99 -O3
movq $0, -8(%rbp)
## InlineAsm Start
btsq $39, -8(%rbp)
## InlineAsm End
movq -8(%rbp), %rax
movq %rax, -16(%rbp)
## InlineAsm Start
bsrq -16(%rbp), %rdx
## InlineAsm End
movq -8(%rbp), %rsi
leaq L_.str(%rip), %rdi
xorb %al, %al
callq _printf
Why is this so complicated? I am writing high performance code in which the number of instructions is critical. In particular, why does gcc make a copy of my variable test before passing it to the second inline asm?
Same code compiled with icc gives far better results:
xorl %esi, %esi # test = 0
movl $.L_2__STRING.0, %edi # has something to do with printf
orl $32832, (%rsp) # part of function initiation
xorl %eax, %eax # has something to do with printf
ldmxcsr (%rsp) # part of function initiation
btsq $39, %rsi #106.0
bsrq %rsi, %rdx #109.0
call printf #111.2
Leaving aside the fact that gcc decides to keep my variables on the stack rather than in registers, what I do not understand is why it makes a copy of test before passing it to the second asm.
If I put test in as an input/output variable in the second asm
__asm__ ("bsrq\t%1, %0" : "=r" (bsr) , "+rm" (test) );
then those lines disappear.
movq $0, -8(%rbp)
## InlineAsm Start
btsq $39, -8(%rbp)
## InlineAsm End
## InlineAsm Start
bsrq -8(%rbp), %rdx
## InlineAsm End
movq -8(%rbp), %rsi
leaq L_.str(%rip), %rdi
xorb %al, %al
callq _printf
Is this a botched optimisation in gcc, or am I missing some vital compiler switches? I do have icc for my production system, but if I decide to distribute the source code at some point then it will have to compile with gcc too.
compilers used:
gcc version 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.1.00)
icc Version 12.0.2
I've tried your example on Linux like this (making it "evil" by forcing a stack location for test by using &test in the printf):
#include <stdio.h>
int main(int argc, char **argv)
{
    unsigned long test = 0;
    unsigned long bsr;
    // bit test and set 39th bit
    asm ("btsq\t%1, %0 " : "+rm" (test) : "rJ" (39) );
    // bit scan reverse (get most significant bit id)
    asm ("bsrq\t%1, %0" : "=r" (bsr) : "rm" (test) );
    printf("test = %lu, bsr = %lu, &test = %p\n", test, bsr, (void *)&test);
    return 0;
}
and compiled it with various versions of gcc -O3 ... to the following results:
code generated gcc version
================================================================================
400630: 48 83 ec 18 sub $0x18,%rsp 4.7.2,
400634: 31 c0 xor %eax,%eax 4.6.2,
400636: bf 50 07 40 00 mov $0x400750,%edi 4.4.6
40063b: 48 8d 4c 24 08 lea 0x8(%rsp),%rcx
400640: 48 0f ba e8 27 bts $0x27,%rax
400645: 48 89 44 24 08 mov %rax,0x8(%rsp)
40064a: 48 89 c6 mov %rax,%rsi
40064d: 48 0f bd d0 bsr %rax,%rdx
400651: 31 c0 xor %eax,%eax
400653: e8 68 fe ff ff callq 4004c0
[ ... ]
---------------------------------------------------------------------------------
4004f0: 48 83 ec 18 sub $0x18,%rsp 4.1
4004f4: 31 c0 xor %eax,%eax
4004f6: bf 28 06 40 00 mov $0x400628,%edi
4004fb: 48 8d 4c 24 10 lea 0x10(%rsp),%rcx
400500: 48 c7 44 24 10 00 00 00 00 movq $0x0,0x10(%rsp)
400509: 48 0f ba e8 27 bts $0x27,%rax
40050e: 48 89 44 24 10 mov %rax,0x10(%rsp)
400513: 48 89 c6 mov %rax,%rsi
400516: 48 0f bd d0 bsr %rax,%rdx
40051a: 31 c0 xor %eax,%eax
40051c: e8 c7 fe ff ff callq 4003e8
[ ... ]
---------------------------------------------------------------------------------
400500: 48 83 ec 08 sub $0x8,%rsp 3.4.5
400504: bf 30 06 40 00 mov $0x400630,%edi
400509: 31 c0 xor %eax,%eax
40050b: 48 c7 04 24 00 00 00 00 movq $0x0,(%rsp)
400513: 48 89 e1 mov %rsp,%rcx
400516: 48 0f ba 2c 24 27 btsq $0x27,(%rsp)
40051c: 48 8b 34 24 mov (%rsp),%rsi
400520: 48 0f bd 14 24 bsr (%rsp),%rdx
400525: e8 fe fe ff ff callq 400428
[ ... ]
---------------------------------------------------------------------------------
4004e0: 48 83 ec 08 sub $0x8,%rsp 3.2.3
4004e4: bf 10 06 40 00 mov $0x400610,%edi
4004e9: 31 c0 xor %eax,%eax
4004eb: 48 c7 04 24 00 00 00 00 movq $0x0,(%rsp)
4004f3: 48 0f ba 2c 24 27 btsq $0x27,(%rsp)
4004f9: 48 8b 34 24 mov (%rsp),%rsi
4004fd: 48 89 e1 mov %rsp,%rcx
400500: 48 0f bd 14 24 bsr (%rsp),%rdx
400505: e8 ee fe ff ff callq 4003f8
[ ... ]
And while there's a significant difference in the generated code (including whether the bsr accesses test as a register or as memory), none of the tested versions reproduce the assembly that you've shown. I'd suspect a bug in the 4.2.x version you used on MacOSX, but I have neither your test case nor that specific compiler version available.
Edit: The code above is obviously different in the sense that it forces test into the stack; if that is not done, then all "plain" gcc versions I've tested do a direct pair bts $39, %rsi / bsr %rsi, %rdx.
I have found, though, that clang creates different code there:
140: 50 push %rax
141: 48 c7 04 24 00 00 00 00 movq $0x0,(%rsp)
149: 31 f6 xor %esi,%esi
14b: 48 0f ba ee 27 bts $0x27,%rsi
150: 48 89 34 24 mov %rsi,(%rsp)
154: 48 0f bd d6 bsr %rsi,%rdx
158: bf 00 00 00 00 mov $0x0,%edi
15d: 30 c0 xor %al,%al
15f: e8 00 00 00 00 callq <printf@plt>
So the difference does indeed seem to be between the code generators of clang/llvm and "gcc proper".

Problem with publishing rtmp stream to FMS with librtmp

I'm trying to capture video from the webcam, encode it, and publish the stream to FMS. Now I'm having a problem when I try to publish the RTMP stream to FMS with librtmp.
My code:
char uri[]="rtmp://127.0.0.1/live/bolton";
r= RTMP_Alloc();
RTMP_Init(r);
RTMP_SetupURL(r, (char*)uri);
r->Link.lFlags |= RTMP_LF_LIVE;
r->Link.lFlags |= RTMP_LF_BUFX;
RTMP_EnableWrite(r);
//RTMP_SetBufferMS(r, bufferTime);
RTMP_Connect(r, NULL);
RTMP_ConnectStream(r,0);
And log:
DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
DEBUG: HandShake: Handshaking finished....
DEBUG: RTMP_Connect1, handshaked
DEBUG2: RTMP_SendPacket: fd=768, size=85
DEBUG2: 0000: 03 00 00 00 00 00 55 14 00 00 00 00 ......U.....
DEBUG2: 0000: 02 00 07 63 6f 6e 6e 65 63 74 00 3f f0 00 00 00 ...connect.?....
DEBUG2: 0010: 00 00 00 03 00 03 61 70 70 02 00 04 6c 69 76 65 ......app...live
DEBUG2: 0020: 00 04 74 79 70 65 02 00 0a 6e 6f 6e 70 72 69 76 ..type...nonpriv
DEBUG2: 0030: 61 74 65 00 05 74 63 55 72 6c 02 00 15 72 74 6d ate..tcUrl...rtm
DEBUG2: 0040: 70 3a 2f 2f 31 32 37 2e 30 2e 30 2e 31 2f 6c 69 p://127.0.0.1/li
DEBUG2: 0050: 76 65 00 00 09 ve...
DEBUG: Invoking connect
DEBUG2: RTMP_ReadPacket: fd=768
ERROR: RTMP_ReadPacket, failed to read RTMP packet header
DEBUG2: RTMP_SendPacket: fd=-1, size=307
It seems that RTMP_Connect connected correctly, but it fails in RTMP_ConnectStream. I'm not familiar with the RTMP connect sequence, and it's killing me.
What should I do to track down the problem? Thanks very much!
I'm dealing with the same problem, but using librtmp with ffmpeg. If you look in the function RTMP_ReadPacket(), you'll see the error is thrown when trying to read the packet header with the method ReadN():
int
RTMP_ReadPacket(RTMP *r, RTMPPacket *packet)
{
uint8_t hbuf[RTMP_MAX_HEADER_SIZE] = { 0 };
char *header = (char *)hbuf;
int nSize, hSize, nToRead, nChunk;
int didAlloc = FALSE;
RTMP_Log(RTMP_LOGDEBUG2, "%s: fd=%d", __FUNCTION__, r->m_sb.sb_socket);
if (ReadN(r, (char *)hbuf, 1) == 0)
{
RTMP_Log(RTMP_LOGERROR, "%s, failed to read RTMP packet header", __FUNCTION__);
return FALSE;
}
That error is only logged if ReadN returns 0. I haven't figured out yet why that's happening on my end, though.
I hit the same issue in Visual Studio.
It turned out that in debug mode, librtmp sets part of the handshake data to 0 instead of generating random data.
//handshake.h
/* generate random data */
#ifdef _DEBUG
memset(serversig+8, 0, RTMP_SIG_SIZE-8);
#else
ip = (int32_t *)(serversig+8);
for (i = 2; i < RTMP_SIG_SIZE/4; i++)
*ip++ = rand();
#endif
Just make sure the random data is actually generated (for example, by building without _DEBUG defined).