assertion from valgrind mc_main.c - valgrind

My valgrind runs reports errors like this
Memcheck: mc_main.c:8292 (mc_pre_clo_init): Assertion 'MAX_PRIMARY_ADDRESS == 0x1FFFFFFFFFULL' failed.
What does this mean? Is it a valgrind internal error or an error from my program?

This is a valgrind internal error. This is very weird, as this failed
assertion is a self check done very early on.
You should file a bug on valgrind bugzilla, reporting all the needed
details (version, platform, ...)

From the Valgrind source code (from git HEAD)
/* Only change this. N_PRIMARY_MAP *must* be a power of 2. */
#if VG_WORDSIZE == 4
/* cover the entire address space */
# define N_PRIMARY_BITS 16
#else
/* Just handle the first 128G fast and the rest via auxiliary
primaries. If you change this, Memcheck will assert at startup.
See the definition of UNALIGNED_OR_HIGH for extensive comments. */
# define N_PRIMARY_BITS 21
#endif
/* Do not change this. */
#define N_PRIMARY_MAP ( ((UWord)1) << N_PRIMARY_BITS)
/* Do not change this. */
#define MAX_PRIMARY_ADDRESS (Addr)((((Addr)65536) * N_PRIMARY_MAP)-1)
...
tl_assert(MAX_PRIMARY_ADDRESS == 0x1FFFFFFFFFULL);
So it looks like something has been changed that shouldn't have been.

Related

DLL silently ignored even though the library links

I'm using a third part DLL that I've used successfully for ages. Now the linker links the dll lib without complaint but the exe doesn't load the dll.
I recently upgraded from the 32 bit to 64 bit cygwin.
I'm doing a mingw cross compile to 32 bits.
I'm trying to use the FTDI USB interface FTD2XX dll.
I have the version 2.04.06 FTD2XX lib, .h, and dll.
I had been using that dll successfully for ages but with older versions of cygwin and mingw.
Recently upgraded to cygwin64.
The app appears to link with the FTD2XX.lib without complaint.
But when I run the app it doesn't seem to look for or load the FTD2XX.dll.
The app runs but crashes as soon as it tries to call something in FTD2XX dll.
I created a simple hello_dll.dll for side by side test. That works.
The app.c does calls on both hello_dll.dll and ftd2xx.dll.
Is starts without complain, successfully calls function in hello_dll, and then it crashes on a call to ft2xx.dll.
(I renamed the lib to ftd2xx_2.04.06 to distinguish them from other versions I have. Newer versions don't work any better.)
Link with -verbose gives:
i686-w64-mingw32-gcc -Wall -m32 -g -O2 -c -I . -o app.o app.c
i686-w64-mingw32-gcc -Wall -m32 -o app.exe app.o -Wl,-verbose -L. -lhello_dll -lftd2xx_2.04.06
GNU ld (GNU Binutils) 2.34.50.20200227
Supported emulations:
i386pe
using internal linker script:
<snip>
/usr/lib/gcc/i686-w64-mingw32/9.2.0/../../../../i686-w64-mingw32/bin/ld: mode i386pe
attempt to open /usr/i686-w64-mingw32/sys-root/mingw/lib/../lib/crt2.o succeeded
/usr/i686-w64-mingw32/sys-root/mingw/lib/../lib/crt2.o
attempt to open /usr/lib/gcc/i686-w64-mingw32/9.2.0/crtbegin.o succeeded
/usr/lib/gcc/i686-w64-mingw32/9.2.0/crtbegin.o
attempt to open app.o succeeded
app.o
<snip>
attempt to open ./hello_dll.lib succeeded
./hello_dll.lib
(./hello_dll.lib)d000001.o
(./hello_dll.lib)d000000.o
(./hello_dll.lib)d000002.o
<snip>
attempt to open ./ftd2xx_2.04.06.lib succeeded
./ftd2xx_2.04.06.lib
(./ftd2xx_2.04.06.lib)FTD2XX.dll
(./ftd2xx_2.04.06.lib)FTD2XX.dll
(./ftd2xx_2.04.06.lib)FTD2XX.dll
(./ftd2xx_2.04.06.lib)FTD2XX.dll
::::::::::::::::::::::::::::
I obtained a 32 bit compatible version of gdb. When I run gdb:
GNU gdb (GDB) 7.7.50.20140303-cvs
<snip>
This GDB was configured as "i686-pc-mingw32".
<snip>
(gdb) break main
(gdb) Breakpoint 1 at 0x40267b: file app.c, line 28.
(gdb) run
(gdb) Starting program: C:\_d\aaa\pd\src\dll\pathological\app.exe
[New Thread 1428.0x2528]
Breakpoint 1, main (argc=1, argv=0x9b2f70) at app.c:28
28 dostuff();
(gdb) info share
(gdb) From To Syms Read Shared Object Library
0x774e0000 0x77644ccc Yes (*) C:\Windows\SysWOW64\ntdll.dll
0x753d0000 0x754cadec Yes (*) C:\Windows\syswow64\kernel32.dll
0x75ea1000 0x75ee6a3a Yes (*) C:\Windows\syswow64\KernelBase.dll
0x64081000 0x6408a1d8 Yes C:\_d\aaa\pd\src\dll\pathological\hello_dll.dll
0x75041000 0x750eb2c4 Yes (*) C:\Windows\syswow64\msvcrt.dll
(*): Shared library is missing debugging information.
(gdb) A debugging session is active.
(gdb) c
Continuing.
Hello dll. <--- The function in hello_dll.dll prints this.
Program received signal SIGSEGV, Segmentation fault.
0x8000004c in ?? () <----- call to FT_GetLibraryVersion()
(gdb) bt
#0 0x8000004c in ?? ()
#1 0x0040158e in dostuff () at app.c:49
#2 0x00402680 in main (argc=1, argv=0x8e2f70) at app.c:28
(gdb)
It links with the lib without complaint but when I run the exe it (silently) doesn't load the dll.
Anybody have any ideas? Is there some linker control that I am missing? Are there other diagnostic or debug tools to dig into this further?
:::::::::::::::::::::::
edit 7/11/20
I'll post some code. (If I know how. I'm new here.)
It should be shown in the "info share", but it isn't, as you can see above.
I'm suspecting name decoration. Objdump -x of the .exe shows an entry for FTD2XX.dll in the Import Tables. But it doesn't show any vma or bound name under it. I suspect that at program load the loader sees no vma/name and decides it doesn't really need to load the dll.
There is an import table in .idata at 0x406000
<snip>
The Import Tables (interpreted .idata section contents)
vma: Hint Time Forward DLL First
Table Stamp Chain Name Thunk
00006000 0000607c 00000000 00000000 00006218 0000614c
DLL Name: FTD2XX.dll
vma: Hint/Ord Member-Name Bound-To
<----- empty?
00006014 00006080 00000000 00000000 000064f8 00006150
DLL Name: hello_dll.dll
vma: Hint/Ord Member-Name Bound-To
6224 1 hello_dll
00006028 00006088 00000000 00000000 00006554 00006158
DLL Name: KERNEL32.dll
vma: Hint/Ord Member-Name Bound-To
6230 277 DeleteCriticalSection
6248 310 EnterCriticalSection
<snip>
:::::::::::::::::::::::::::::::::::::::::::::
edit 2, 7/11/20
This is the program that calls functions in the DLLs.
/* app.c
Demonstrates using the function imported from the DLL.
*/
// 200708 pathological case. Based on the simple hello_dll.
//#include <stdlib.h>
// for sleep
#include <unistd.h>
#include <stdio.h>
// for dword
#include <windef.h>
// for lpoverlapped
#include <minwinbase.h>
#include "hello_dll.h"
// My legacy app, and really all others too, use 2.04.06.h
#include "ftd2xx_2.04.06.h"
//#include "ftd2xx_2.02.04.h"
///////////////////////////
void dostuff( void );
void call_ft_listdevices( void );
///////////////////////////
int main(int argc, char** argv)
{
FT_STATUS status;
DWORD libver;
//dostuff();
printf( "Calling hello_dll():\n" );
fflush( stdout );
hello_dll();
fflush( stdout );
printf( "Back from hello_dll()\n" );
fflush( stdout );
sleep( 1 );
printf( "Calling FT_GetLibraryVersion().\n" );
fflush( stdout );
status = FT_GetLibraryVersion( &libver );
if( status == FT_OK ){
printf( "FTD2XX library version 0x%lx\n", libver );
fflush( stdout );
}
else{
printf( "Error reading FTD2XX library version.\n" );
fflush( stdout );
}
// 200710 Adding call to different ft function did
// not result in entries in the import table.
//call_ft_listdevices( );
return 0;
}
I don't think there is a need to include the code for my hello_dll. It works.
I have three versions of the FTD2XX. I'm pretty careful about tracking versions. Plus, when one is beating one's head against the wall, double checking the versions appeals early on as a way to end the pain.
I found a surprise copy of FTD2XX.dll. It's in c:/Windows/SysWOW64. It is the oldest of the three versions I have. Versions of my app that were compiled before this problem started run correctly using that dll in that place.
Solved.
There's a bug in the 2.34.50.20200227 i686-w64-mingw32-ld.exe. It won't work with ftd2xx.lib, regardless of ftd2xx version as far as I can tell.
2.25.51.20150320 and 2.29.1.20171006 work with ftd2xx.lib. I've reverted back to 2.29 mingw64-i686-binutils. I'm running again.

Mono 5 on Solaris Build issues

I am trying to build Mono on Solaris 10 and have ran into an issue with error "Posix system lacks support for recursive mutexes". I am trying to build this with gcc 3.4.3 and I have installed gmake, gar, granlib, and gstrip to replace their Solaris alternatives. I have found a possible solution, but I can not locate the blog that references this line "1) patch /usr/lib/pkgconfig/gthread-2.0.pc to replace the -mt option (see Jonel's blog)" located at https://lists.dot.net/pipermail/mono-list/2007-January/034101.html. The blog is no longer active. Does anyone have any ideas what the patch they are referring to may be? Thanks in advance.
Solaris 10 does support recursive mutexes.
Per the Solaris 10 pthread_mutexattr_settype man page:
PTHREAD_MUTEX_RECURSIVE
A thread attempting to relock this mutex without first unlocking it
will succeed in locking the mutex. The relocking deadlock that can
occur with mutexes of type PTHREAD_MUTEX_NORMAL cannot occur with
this type of mutex. Multiple locks of this mutex require the same
number of unlocks to release the mutex before another thread can
acquire the mutex. A thread attempting to unlock a mutex that another
thread has locked will return with an error. A thread attempting to
unlock an unlocked mutex will return with an error. This type of mutex
is only supported for mutexes whose process shared attribute is
PTHREAD_PROCESS_PRIVATE.
Also per the man page, the required compile/link options, and the proper #include statement:
cc –mt [ flag... ] file... –lpthread [ library... ]
#include <pthread.h>
Note the addition of -lpthread. I suspect the blog you're referring to said to replace -mt with -mt -lpthread.
(A bit more research into this problem, prodded by the Unix question How to install .net Mono on Solaris 11 (source code compile)? produced this answer, which I'm repeating here.)
The exact value that you set _XOPEN_SOURCE to doesn't matter. Mono is incorrectly defining the _XOPEN_SOURCE_EXTENDED macro:
case "${host}" in
*solaris* )
AC_MSG_CHECKING(for Solaris XPG4 support)
if test -f /usr/lib/libxnet.so; then
CPPFLAGS="$CPPFLAGS -D_XOPEN_SOURCE=600"
CPPFLAGS="$CPPFLAGS -D__EXTENSIONS__"
CPPFLAGS="$CPPFLAGS -D_XOPEN_SOURCE_EXTENDED=1" <-- WRONG
LIBS="$LIBS -lxnet"
The _XOPEN_SOURCE_EXTENDED macro does not exist in either the POSIX 6 or the POSIX 7 standard.
Even Linux agrees that _XOPEN_SOURCE_EXTENDED should not be defined here. Per the Linux feature_test_macros man page:
_XOPEN_SOURCE_EXTENDED
If this macro is defined, and _XOPEN_SOURCE is defined, then expose definitions corresponding to the XPG4v2 (SUSv1) UNIX extensions (UNIX 95). Defining _XOPEN_SOURCE with a value of 500 or more also produces the same effect as defining _XOPEN_SOURCE_EXTENDED. Use of _XOPEN_SOURCE_EXTENDED in new source code should be avoided.
Since defining _XOPEN_SOURCE with a value of 500 or more has the same effect as defining _XOPEN_SOURCE_EXTENDED, the latter (obsolete) feature test macro is generally not described in the SYNOPSIS in man pages.
Note the precise wording:
If this macro (_XOPEN_SOURCE_EXTENDED) is defined, and _XOPEN_SOURCE is defined, then expose definitions corresponding to the XPG4v2 (SUSv1) UNIX extensions (UNIX 95). ...
Defining _XOPEN_SOURCE to any value while also defining _XOPEN_SOURCE_EXTENDED results in XPG4v2, and that's NOT the XPG6 necessary to get recursive mutexes.
You're likely running into this check in the Solaris 11 /usr/include/sys/feature_tests.h:
/*
* It is invalid to compile an XPG3, XPG4, XPG4v2, or XPG5 application
* using c99. The same is true for POSIX.1-1990, POSIX.2-1992, POSIX.1b,
* and POSIX.1c applications. Likewise, it is invalid to compile an XPG6
* or a POSIX.1-2001 application with anything other than a c99 or later
* compiler. Therefore, we force an error in both cases.
*/
#if defined(_STDC_C99) && (defined(__XOPEN_OR_POSIX) && !defined(_XPG6))
#error "Compiler or options invalid for pre-UNIX 03 X/Open applications \
and pre-2001 POSIX applications"
#elif !defined(_STDC_C99) && \
(defined(__XOPEN_OR_POSIX) && defined(_XPG6))
#error "Compiler or options invalid; UNIX 03 and POSIX.1-2001 applications \
require the use of c99"
#endif
_XPG6 gets defined earlier in the file, in this block:
/*
* Use of _XOPEN_SOURCE
*
* The following X/Open specifications are supported:
*
* X/Open Portability Guide, Issue 3 (XPG3)
* X/Open CAE Specification, Issue 4 (XPG4)
* X/Open CAE Specification, Issue 4, Version 2 (XPG4v2)
* X/Open CAE Specification, Issue 5 (XPG5)
* Open Group Technical Standard, Issue 6 (XPG6), also referred to as
* IEEE Std. 1003.1-2001 and ISO/IEC 9945:2002.
*
* XPG4v2 is also referred to as UNIX 95 (SUS or SUSv1).
* XPG5 is also referred to as UNIX 98 or the Single Unix Specification,
* Version 2 (SUSv2)
* XPG6 is the result of a merge of the X/Open and POSIX specifications
* and as such is also referred to as IEEE Std. 1003.1-2001 in
* addition to UNIX 03 and SUSv3.
*
* When writing a conforming X/Open application, as per the specification
* requirements, the appropriate feature test macros must be defined at
* compile time. These are as follows. For more info, see standards(5).
*
* Feature Test Macro Specification
* ------------------------------------------------ -------------
* _XOPEN_SOURCE XPG3
* _XOPEN_SOURCE && _XOPEN_VERSION = 4 XPG4
* _XOPEN_SOURCE && _XOPEN_SOURCE_EXTENDED = 1 XPG4v2
* _XOPEN_SOURCE = 500 XPG5
* _XOPEN_SOURCE = 600 (or POSIX_C_SOURCE=200112L) XPG6
*
* In order to simplify the guards within the headers, the following
* implementation private test macros have been created. Applications
* must NOT use these private test macros as unexpected results will
* occur.
*
* Note that in general, the use of these private macros is cumulative.
* For example, the use of _XPG3 with no other restrictions on the X/Open
* namespace will make the symbols visible for XPG3 through XPG6
* compilation environments. The use of _XPG4_2 with no other X/Open
* namespace restrictions indicates that the symbols were introduced in
* XPG4v2 and are therefore visible for XPG4v2 through XPG6 compilation
* environments, but not for XPG3 or XPG4 compilation environments.
*
* _XPG3 X/Open Portability Guide, Issue 3 (XPG3)
* _XPG4 X/Open CAE Specification, Issue 4 (XPG4)
* _XPG4_2 X/Open CAE Specification, Issue 4, Version 2 (XPG4v2/UNIX 95/SUS)
* _XPG5 X/Open CAE Specification, Issue 5 (XPG5/UNIX 98/SUSv2)
* _XPG6 Open Group Technical Standard, Issue 6 (XPG6/UNIX 03/SUSv3)
*/
/* X/Open Portability Guide, Issue 3 */
#if defined(_XOPEN_SOURCE) && (_XOPEN_SOURCE - 0 < 500) && \
(_XOPEN_VERSION - 0 < 4) && !defined(_XOPEN_SOURCE_EXTENDED)
#define _XPG3
/* X/Open CAE Specification, Issue 4 */
#elif (defined(_XOPEN_SOURCE) && _XOPEN_VERSION - 0 == 4)
#define _XPG4
#define _XPG3
/* X/Open CAE Specification, Issue 4, Version 2 */
#elif (defined(_XOPEN_SOURCE) && _XOPEN_SOURCE_EXTENDED - 0 == 1)
#define _XPG4_2
#define _XPG4
#define _XPG3
/* X/Open CAE Specification, Issue 5 */
#elif (_XOPEN_SOURCE - 0 == 500)
#define _XPG5
#define _XPG4_2
#define _XPG4
#define _XPG3
#undef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199506L
/* Open Group Technical Standard , Issue 6 */
#elif (_XOPEN_SOURCE - 0 == 600) || (_POSIX_C_SOURCE - 0 == 200112L)
#define _XPG6
#define _XPG5
#define _XPG4_2
#define _XPG4
#define _XPG3
#undef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#undef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600
#endif

Linux ioctl return value interpreted by who?

I'm working with a custom kernel char device which sometimes returns large negative values (around the thousands, say -2000) for its ioctl().
In userspace, I don't get these values returned from the ioctl call. Instead I get a return value of -1 back with errno set to the negated value from the kernel module (+2000).
As far as I can read and google, __syscall_return() is the macro which is supposed to interpret negative return values as errors. But, it only seems to look for values between -1 and -125. So I didn't expect these large negative values to be translated.
Where are these return values translated? Is it expected behaviour?
I am on Linux 2.6.35.10 with EGLIBC 2.11.3-4+deb6u6.
The translation and move to errno occur on the libc level. Both Gnu libc and μClibc treat negative numbers down to at least -4095 as error conditions, per http://www.makelinux.net/ldd3/chp-6-sect-1
See https://github.molgen.mpg.de/git-mirror/glibc/blob/85b290451e4d3ab460a57f1c5966c5827ca807ca/sysdeps/unix/sysv/linux/aarch64/ioctl.S for the Gnu libc implementation of ioctl.
So, with the help of BRPocock I will report my findings here.
The linux kernel will do a error check for all syscalls along the lines of (from unistd.h):
#define __syscall_return(type, res) \
do { \
if ((unsigned long)(res) >= (unsigned long)(-125)) { \
errno = -(res); \
res = -1; \
} \
return (type) (res); \
} while (0)
Libc will also do an error check for all syscalls along the lines of (from syscall.S):
.text
ENTRY (syscall)
PUSHARGS_6 /* Save register contents. */
_DOARGS_6(44) /* Load arguments. */
movl 20(%esp), %eax /* Load syscall number into %eax. */
ENTER_KERNEL /* Do the system call. */
POPARGS_6 /* Restore register contents. */
cmpl $-4095, %eax /* Check %eax for error. */
jae SYSCALL_ERROR_LABEL /* Jump to error handler if error. */
ret /* Return to caller. */
PSEUDO_END (syscall)
Glibc gives a reason for the 4096 value (from sysdep.h):
/* Linux uses a negative return value to indicate syscall errors,
unlike most Unices, which use the condition codes' carry flag.
Since version 2.1 the return value of a system call might be
negative even if the call succeeded. E.g., the `lseek' system call
might return a large offset. Therefore we must not anymore test
for < 0, but test for a real error by making sure the value in %eax
is a real error number. Linus said he will make sure the no syscall
returns a value in -1 .. -4095 as a valid result so we can savely
test with -4095. */
__syscall_return seems to be missing from newer kernels, I haven't researched that yet.

making valgrind abort on error for heap corruption checking?

I'd like to try using valgrind to do some heap corruption detection. With the following corruption "unit test":
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main()
{
char * c = (char *) malloc(10) ;
memset( c, 0xAB, 20 ) ;
printf("not aborted\n") ;
return 0 ;
}
I was suprised to find that valgrind doesn't abort on error, but just produces a message:
valgrind -q --leak-check=no a.out
==11097== Invalid write of size 4
==11097== at 0x40061F: main (in /home/hotellnx94/peeterj/tmp/a.out)
==11097== Address 0x51c6048 is 8 bytes inside a block of size 10 alloc'd
==11097== at 0x4A2058F: malloc (vg_replace_malloc.c:236)
==11097== by 0x400609: main (in /home/hotellnx94/peeterj/tmp/a.out)
...
not aborted
I don't see a valgrind option to abort on error (like gnu-libc's mcheck does, but I can't use mcheck because it isn't thread safe). Does anybody know if that is possible (our code dup2's stdout to /dev/null since it runs as a daemon, so a report isn't useful and I'd rather catch the culprit in the act or closer to it).
There is no such option in valgrind.
Consider adding a non-daemon mode (debug mode) into your daemon.
http://valgrind.org/docs/manual/mc-manual.html#mc-manual.clientreqs 4.6 explains some requests from debugged program to valgrind+memcheck, so you can use some of this in your daemon to do some checks at fixed code positions.

Executing a script from inside code in VxWorks 6.7

In VxWorks 5.5.1 you could run a script using the execute command. In VxWorks 6.7 the execute command is no longer supported. Does anyone now if there is a replacement? I am specifically talking about from inside code not command line.
Through much research it appears like there are a few ways to accomplish this but none is exactly the same as the execute command from before. As I stated in the comment below it turns out that the execute command is not an official API call.
1) shellCmdExec can be used but most be called from inside the shell task.
2) The solution we choose to employ - which is to call it from within our startup script
3) And a hack way:
fd = open("/y/startup.go", 0, 0) /* open the script you want to execute /
v=shellFromNameGet("tShell0") / Get the shell i.d. */
/* Use shellinOutGet to save off the standard in of the shell /
shellInOutSet (v, fd, -1, -1) / Set the standard in of the shell to the file */
/* Here you should restore the standard in (do a shellInOutGet beforehand). Do it after the shell is done with the script. I would say that your script should increrment a variable when ti is done. */
close(fd)
There's a solution in the VxWorks Kernel programmer's guide 6.7, the problem is that it did not work for me, but it could help you:
shellGenericInit ("INTERPRETER=Cmd", 0, NULL, &shellTaskName, FALSE, FALSE,fdScript, STD_OUT, STD_ERR); do
taskDelay (sysClkRateGet ());
while (taskNameToId (shellTaskName) != ERROR); close (fdScript);
Check Section 15.2.15 of the document.
You can do it in the serial driver layer. Try the following code. It shows how to send text to the shell's input.
For example,
pass_to_sio("memShow; ifconfig"); in your c code.
-> sp pass_to_sio, "memShow; ifconfig" in the shell.
pass_to_sio("< test.scr"); in your c code if you want to run a script file.
-> sp pass_to_sio, "< test.scr" in the shell if you want to run a script file.
void pass_to_sio(char *input)
{
int old_priority;
taskPriorityGet(taskIdSelf(),&old_priority);
taskPrioritySet(taskIdSelf(),250); /* task priority must be lower than tShell0 */
NS16550_CHAN *pChan = &ns16550Chan[0]; /* this line depends on your BSP */
while (input != NULL && *input != NULL)
{
(*pChan->putRcvChar) (pChan->putRcvArg, *input);
input++;
}
(*pChan->putRcvChar) (pChan->putRcvArg, '\r');
taskPrioritySet(taskIdSelf(),old_priority);
}