Github CI blowing up memory

Github CI blowing up memory - testing

I have a CI setup on GitHub that runs all the tests I currently have (about 200). They are e2e tests using Jest and superagent and follow NestJS's standard testing pattern, which also means I have one or more test suites (files) for each module.
Everything was fine until I recently had to implement a queue using BullMQ (which is also integrated in the framework). Since then my PR keeps failing its checks, even when I skip the test suites that actually test the endpoints using the queue: the run exhausts memory, and other, theoretically unrelated suites time out. This is the final error trace:
<--- Last few GCs --->
[2400:0x57d01a0] 287364 ms: Mark-sweep 1846.7 (2086.3) -> 1830.1 (2083.3) MB, 1710.5 / 9.9 ms (average mu = 0.173, current mu = 0.177) allocation failure scavenge might not succeed
[2400:0x57d01a0] 289587 ms: Mark-sweep 1846.6 (2083.3) -> 1833.2 (2080.8) MB, 1649.9 / 0.1 ms (average mu = 0.216, current mu = 0.258) task scavenge might not succeed
<--- JS stacktrace --->
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
1: 0xb00d90 node::Abort() [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
2: 0xa1823b node::FatalError(char const*, char const*) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
3: 0xcedbce v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
4: 0xcedf47 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
5: 0xea6105 [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
6: 0xea6be6 [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
7: 0xeb4b1e [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
8: 0xeb5560 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
9: 0xeb84de v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
10: 0xe7990a v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
11: 0x11f2f06 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
12: 0x15e7819 [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
Aborted (core dumped)
The amount of custom code I added is very small: as I said, I mostly use the BullMQ modules/decorators already integrated in NestJS, and the tests pass fine locally.
I believed that if the issue were related to the queue implementation, or to the Redis setup I have now added to the CI, then skipping the tests that build the modules using this implementation should make everything fine, but the failure keeps happening.
I tried increasing the global timeout setting, but nothing has changed so far.
I have been wandering around trying to find the underlying cause for some days now, but have found nothing.
I feel completely lost at this point. Does anyone have any idea what could be happening here, or any clue on how to hunt down the underlying cause?
Thanks in advance

Related

How to generate valgrind suppressions without manual cut and paste?

I want to generate a suppressions file with --gen-suppressions in valgrind.
However, I do not want to have to go through thousands of lines of output to cut and paste out the suppressions and remove the valgrind stack traces / other valgrind output, and resolve .
Is there a way to do this easily? This seems like a very basic use case...
// I want this part vvvvv
{
<insert_a_suppression_name_here>
Memcheck:Leak
match-leak-kinds: reachable
fun:malloc
fun:strdup
fun:_XlcCreateLC
fun:_XlcDefaultLoader
fun:_XOpenLC
fun:_XrmInitParseInfo
obj:/usr/lib/x86_64-linux-gnu/libX11.so.6.3.0
fun:XrmGetStringDatabase
obj:/usr/lib/x86_64-linux-gnu/libX11.so.6.3.0
fun:XGetDefault
fun:GetXftDPI
fun:X11_InitModes_XRandR
fun:X11_InitModes
fun:X11_VideoInit
}
// I do not want this part vvvv
==187526== 2 bytes in 1 blocks are still reachable in loss record 2 of 137
==187526== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==187526== by 0x4B7C50E: strdup (strdup.c:42)
==187526== by 0x5922D81: _XlcResolveLocaleName (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==187526== by 0x5926387: ??? (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==187526== by 0x5925956: ??? (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==187526== by 0x592615C: _XlcCreateLC (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==187526== by 0x5943664: _XlcDefaultLoader (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==187526== by 0x592D995: _XOpenLC (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)

It is quite unlikely that all of the suppressions are different.
If you create a suppression like
{
XINIT-1
Memcheck:Leak
match-leak-kinds: reachable
fun:malloc
fun:strdup
fun:_XlcCreateLC
fun:_XlcDefaultLoader
fun:_XOpenLC
fun:_XrmInitParseInfo
obj:/usr/lib/x86_64-linux-gnu/libX11.so.6.3.0
}
Then re-run. Typically the error count will go down very quickly, and you will only need to add a fairly small number of suppressions (single or low double digits).
(You need to apply your knowledge of the code and lib(s) to get a sensible stack depth for suppressions: with too many stack entries the suppression will be too specific and you will need more suppressions; with too few you risk suppressing real problems.)
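For the mechanical part of the question (keeping only the suppression blocks and dropping the "==PID==" report lines), a small filter may be enough. The following is a minimal sketch, assuming valgrind was run with --gen-suppressions=all and --log-file so that the log contains only valgrind output, and that every suppression block starts with a line beginning with '{' and ends with a line beginning with '}'. The function names keep_line and filter_suppressions are made up for this example.

```c
#include <stdio.h>

/*
 * Decide whether one line of valgrind output belongs to a suppression block.
 * *inside remembers whether we are currently between '{' and '}'.
 * Returns 1 if the line should be kept, 0 if it should be dropped.
 * ASSUMPTION: blocks open and close with '{' / '}' in the first column.
 */
int keep_line(const char *line, int *inside)
{
    if (!*inside) {
        if (line[0] == '{') {
            *inside = 1;
            return 1;
        }
        return 0; /* report text such as "==187526== ..." is dropped */
    }
    if (line[0] == '}') {
        *inside = 0; /* the closing brace itself is still kept */
    }
    return 1;
}

/*
 * Copy only the suppression blocks from in to out.
 * Lines longer than the buffer are processed in chunks, which is good
 * enough for this sketch because only the first character is inspected.
 */
void filter_suppressions(FILE *in, FILE *out)
{
    char line[4096];
    int inside = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (keep_line(line, &inside)) {
            fputs(line, out);
        }
    }
}
```

Calling filter_suppressions(stdin, stdout) from main turns this into a pipe filter. The <insert_a_suppression_name_here> lines are left untouched and still need names, and the advice above about choosing a sensible stack depth still applies to the extracted blocks.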

DeepSleepLock underflow error when doing pow(2, ((m - 69.0f) / 12.0f)) - MBed OS

I'm using MBed OS on a NUCLEO_L432KC and the MBed CLI to compile, flash, and test, with OpenOCD and gdb to debug. MBed has its own GreenTea test automation tool for unit testing on the embedded hardware, and it uses the utest and Unity testing frameworks.
When I use GreenTea to unit test this function:
float Piano::midiNumToFrequency(uint8_t m)
{
    float exp = (m - 69.0f) / 12.0f;
    return pow(2, exp);
}
I get a DeepSleepLock underflow error:
[1589410046.26][CONN][RXD] ++ MbedOS Error Info ++
[1589410046.30][CONN][RXD] Error Status: 0x80040124 Code: 292 Module: 4
[1589410046.35][CONN][RXD] Error Message: DeepSleepLock underflow (< 0)
[1589410046.37][CONN][RXD] Location: 0x8003B09
[1589410046.40][CONN][RXD] File: mbed_power_mgmt.c+197
[1589410046.43][CONN][RXD] Error Value: 0xFFFF
[1589410046.53][CONN][RXD] Current Thread: main Id: 0x20001200 Entry: 0x80044A7 StackSize: 0x1000 StackMem: 0x20001C18 SP: 0x2000FF04
[1589410046.62][CONN][RXD] For more info, visit: https://mbed.com/s/error?error=0x80040124&tgt=NUCLEO_L432KC
[1589410046.64][CONN][RXD] -- MbedOS Error Info --
Yet when I change the function to this:
float Piano::midiNumToFrequency(uint8_t m)
{
    float exp = (m - 69.0f);
    return pow(2, exp);
}
it works and tests fine.
MBed has an error status decoder here which says
Use the "Location" reported to figure out the address of the location
which caused the error or try building a non-release version with
MBED_CONF_PLATFORM_ERROR_FILENAME_CAPTURE_ENABLED configuration
enabled to capture the filename and line number where this error
originates from.
When I enable MBED_CONF_PLATFORM_ERROR_FILENAME_CAPTURE_ENABLED, it says the location is in mbed_power_mgmt.c, line 197, which is in this function:
/** Send the microcontroller to sleep
 *
 * @note This function can be a noop if not implemented by the platform.
 * @note This function will be a noop in debug mode (debug build profile when MBED_DEBUG is defined).
 * @note This function will be a noop if the following conditions are met:
 *   - The RTOS is present
 *   - The processor turn off the Systick clock during sleep
 *   - The target does not implement tickless mode
 *
 * The processor is setup ready for sleep, and sent to sleep using __WFI(). In this mode, the
 * system clock to the core is stopped until a reset or an interrupt occurs. This eliminates
 * dynamic power used by the processor, memory systems and buses. The processor, peripheral and
 * memory state are maintained, and the peripherals continue to work and can generate interrupts.
 *
 * The processor can be woken up by any internal peripheral interrupt or external pin interrupt.
 *
 * @note
 * The mbed interface semihosting is disconnected as part of going to sleep, and can not be restored.
 * Flash re-programming and the USB serial port will remain active, but the mbed program will no longer be
 * able to access the LocalFileSystem
 */
static inline void sleep(void)
{
#if DEVICE_SLEEP
#if (MBED_CONF_RTOS_PRESENT == 0) || (DEVICE_SYSTICK_CLK_OFF_DURING_SLEEP == 0) || defined(MBED_TICKLESS)
    sleep_manager_sleep_auto();
#endif /* (MBED_CONF_RTOS_PRESENT == 0) || (DEVICE_SYSTICK_CLK_OFF_DURING_SLEEP == 0) || defined(MBED_TICKLESS) */
#endif /* DEVICE_SLEEP */
}
Any ideas why this is happening or how to troubleshoot further?

This part:
StackSize: 0x1000
StackMem: 0x20001C18
SP: 0x2000FF04
suggests that the stack pointer is no longer within the task's own stack.
The cause of that cannot really be determined from just the code posted, but the reported location is irrelevant; when a function pops a return address from a corrupted stack, or uses a corrupted stack pointer, the program counter could end up anywhere or nowhere.
It is possible, for example, that your test thread has an insufficient stack allocation and the overflow has corrupted the stack or TCB of some other thread, which then crashes. That kind of behaviour could lead to the kind of error you are seeing, where the code indicated is unrelated to the source of the error. That is purely speculation, however; there are other error mechanisms, such as a buffer overrun, that might cause similar non-deterministic behaviour.
The critical thing to understand is that just because modifying this function appears to affect the result, it does not follow that this function is itself at fault.
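The range check behind that observation can be written down directly. This is only a sketch of the arithmetic, using the three values from the error report; the function name sp_within_stack is made up for illustration, and it treats an SP equal to the top of the region as valid, on the assumption of a full-descending stack as used on Cortex-M.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if sp lies inside the region [stack_mem, stack_mem + stack_size].
 * The upper bound is inclusive because an empty full-descending stack
 * has its SP at the top of the region (assumption, see lead-in).
 */
bool sp_within_stack(uint32_t sp, uint32_t stack_mem, uint32_t stack_size)
{
    return sp >= stack_mem && (sp - stack_mem) <= stack_size;
}
```

With the reported values (StackMem 0x20001C18, StackSize 0x1000, SP 0x2000FF04) the region ends at 0x20002C18, so the check returns false.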

Favorability of alloca for array allocation vs simple [] array declaration

Reading some Apple code, I stumbled upon the following C chunk
alloca(sizeof(CMTimeRange) * 3)
Is this the same thing as allocating stack memory via
CMTimeRange *p = CMTimeRange[3] ?
Are there any implications for performance? Is there a need to free the memory?

If you really only want to allocate 3 elements of something on the stack the use of alloca makes no sense at all. It only makes sense if you have a variable length that depends on some dynamic parameter at runtime, or if you do an unknown number of such allocations in the same function.
alloca is not a standard function and differs from platform to platform. The C standard has preferred to introduce variable length arrays (VLAs) as a replacement.
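To make the distinction concrete, here is a minimal sketch contrasting the two standard forms. It uses int as a stand-in for CMTimeRange, and the function names are made up for this example. Note that VLAs are standard in C99 but an optional feature from C11 on, so this assumes a compiler that supports them (GCC and Clang do).

```c
#include <stddef.h>

/* Fixed size known at compile time: a plain array is all that is needed. */
int sum_fixed(void)
{
    int values[3] = {1, 2, 3};
    return values[0] + values[1] + values[2];
}

/*
 * Size known only at run time: a VLA is the standard counterpart of alloca.
 * ASSUMPTION: n is greater than zero and small enough to fit on the stack;
 * like alloca, a VLA has no way to report that the stack was exceeded.
 */
int sum_first_n(size_t n)
{
    int values[n]; /* released automatically when the function returns */
    int total = 0;

    for (size_t i = 0; i < n; i++) {
        values[i] = (int)(i + 1);
        total += values[i];
    }
    return total;
}
```

In both functions the memory is released when the function returns, so there is nothing to free.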

Is this the same thing as allocating stack memory via...
I would think not quite. Declaring a local variable causes the memory to be reserved when the stack frame is entered (by subtracting the size of variable from the stack pointer and adjusting for alignment).
It looks like alloca(3) works by adjusting the stack pointer at the moment it is encountered. Note the "Bugs" section of the man page.
alloca() is machine and compiler dependent; its use is discouraged.
alloca() is slightly unsafe because it cannot ensure that the pointer returned points to a valid and usable block of memory. The allocation made may exceed the bounds of the stack, or even go further into other objects in memory, and alloca() cannot determine such an error. Avoid alloca() with large unbounded allocations.
These two points together add up to the following in my opinion:
DO NOT USE ALLOCA

Assuming as Joachim points out you mean CMTimeRange someVariableName[3]...
Both will allocate memory on the stack.
I'm guessing alloca() will have to add extra code after your function prologue to do the allocation... The function prologue is code that the compiler automatically generates for you to create room on the stack. The upshot is that your function may be slightly larger once compiled but not by much... a few extra instructions to modify the stack pointer and possibly stack frame. I guess a compiler could optimize the call out if it wasn't in a conditional branch, or just even lift it outside of a conditional branch though?
I experimented on my MQX compiler with no optimisations... it's not Objective-C, just C, and also a different platform, but hopefully that's a good enough approximation, and it does show a difference in emitted code. I used two simple functions with a large array on the stack to make sure stack space had to be used (so the variable couldn't exist solely in registers).
Obviously it is not advisable to put large arrays on the stack... this is just for demo purposes.
unsigned int TEST1(unsigned int stuff)
{
    unsigned int a1[100]; // Make sure it must go on stack
    unsigned int a2[100]; // Make sure it must go on stack
    a1[0] = 0xdead;
    a2[0] = stuff + 10;
    return a2[0];
}
unsigned int TEST2(unsigned int stuff)
{
    unsigned int a1[100]; // Make sure it must go on stack
    unsigned int *a2 = alloca(sizeof(unsigned int)*100);
    a1[0] = 0xdead;
    a2[0] = stuff + 10;
    return a2[0];
}
The following assembler was generated:
TEST1:
Both arrays a1 and a2 are put on the stack in the function prologue...
0: 1cfcb6c8 push %fp
4: 230a3700 mov %fp,%sp
8: 24993901 sub3 %sp,%sp,100 # Both arrays put on stack
c: 7108 mov_s %r1,%r0
e: 1b38bf98 0000dead st 0xdead,[%fp,0xffff_fce0] ; 0xdead
16: e00a add_s %r0,%r0,10
18: 1b9cb018 st %r0,[%fp,0xffff_fe70]
1c: 240a36c0 mov %sp,%fp
20: 1404341b pop %fp
24: 7ee0 j_s [%blink]
TEST2:
Only array a1 is put on the stack in the prologue... Extra lines of code have to be generated to deal with the alloca.
0: 1cfcb6c8 push %fp
4: 230a3700 mov %fp,%sp
8: 24593c9c sub3 %sp,%sp,50 # Only one array put on stack
c: 240a07c0 mov %r4,%blink
10: 220a0000 mov %r2,%r0
14: 218a0406 mov %r1,0x190 # Extra for alloca()
18: 2402305c sub %sp,%sp,%r1 # Extra for alloca()
1c: 08020000r bl _stkchk # Extra for alloca()
20: 738b mov_s %r3,%sp # Extra, r3 to access write via pointer
22: 1b9cbf98 0000dead st 0xdead,[%fp,0xffff_fe70] ; 0xdead
2a: 22400280 add %r0,%r2,10
2e: a300 st_s %r0,[%r3] # r3 to access write via pointer
30: 270a3100 mov %blink,%r4
34: 240a36c0 mov %sp,%fp
38: 1404341b pop %fp
3c: 7ee0 j_s [%blink]
Also, your alloca() memory will be accessed through pointers (unless there are clever compiler optimisations for this... I don't know), so it causes actual memory accesses. Automatic variables might be optimized into just register accesses, which is better... the compiler can figure out, using register colouring, which automatic variables are best left in registers and whether they ever need to be on the stack.
I had a quick search through the C99 standard (C11 is out... my reference is a little out of date) and could not see a reference to alloca, so it is probably not a standard-defined function. A possible disadvantage?

Control a robotic arm

I have a Cyber Robot CYBER 310 and a Sciento CS-113 robotic arm with no documentation. Both use a parallel port.
How could I program those?
For the Cyber one, I found this:
Nothing at all on the Sciento one.
Any pointers or examples in Python/Java/C/whatever appreciated.
[update] This page contains some information, but I'm still lost: http://www.anf.nildram.co.uk/beebcontrol/arms/cyber/software.html

I am not entirely sure I understand what the question is.
Are you unfamiliar with programming the parallel port?
My memory on it is hazy, but iirc it's pretty simple. It's a "dumb" interface so you simply need to write to it.
If you are running under linux then there are some great resources on it:
Linux Device Drivers: Chapter 9: An Overview of the Parallel port - Talks a bit about parallel port programming and goes on to talk about writing device drivers for it. A bit overkill I think for your application, but the entire book is fascinating, and enlightening.
Linux I/O port programming - essentially you can write to /dev/port, or include asm/io.h and use inb() and outb() (I haven't done this in a while, but I'm sure that if you run into a specific problem there will be a multitude of answers out there once you have it narrowed down to something specific)
If you are on Windows or Mac, then I'd still suggest reading the above so you know what you are trying to do (they are straightforward, in my opinion), then search for the Windows/Mac equivalent.
Now for what I assume the crux of the question is, what do you write to the ports?
For the Cyber 310 you have the pin layouts, although there seem to be multiple different pin layouts if you browse the site you have listed. If we follow anf.nildram.co.uk, we can find some PIC assembly that shows how to rotate the base.
I had never touched PIC assembly before today, but with some help from the internet and the comments, I think we can translate what this is trying to do (I snipped out the relevant portion, as most of it is timing and looping):
; 6: Symbol prf = PORTA.0
; The address of 'prf' is 0x5,0
; 7: Symbol strobe = PORTA.1
; The address of 'strobe' is 0x5,1
; 8: Symbol base = PORTB.0
; The address of 'base' is 0x6,0
; 9: Symbol shoulder = PORTB.1
; The address of 'shoulder' is 0x6,1
...
; 16: main:
L0001:
; 17: base = 1
BSF 0x06,0 // set bit 0 at 0x06 to 1 essentially set base bit to 1
; 18: strobe = 1
BSF 0x05,1 // set strobe bit to 1
; 19: strobe = 0
BCF 0x05,1 // set strobe bit to 0
; 20: While a <> 730 // now we loop 729 more times
So it appears, from my naive perspective, that to rotate the arm you need to set the motor bits (grabbed from your pinout) then set and clear strobe.
Let me know if I am completely off base, this is a fascinating project.
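As a rough illustration of that sequence on a PC parallel port, the sketch below builds the three register writes for one step: motor bits on the data lines, strobe set, strobe cleared. It is not taken from the arm's documentation. The data register at offset 0 and the control register at offset 2 are the standard PC layout, and strobe is control bit 0, but the motor bit layout (bit 0 = base, bit 1 = shoulder, mirroring the PIC listing) and all names here are assumptions that must be checked against the real pinout.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard PC parallel-port register offsets from the port's base address. */
enum { LPT_DATA = 0, LPT_CONTROL = 2 };

/* Control-register bit 0 drives the strobe pin. */
#define LPT_STROBE 0x01u

/* One register write: which register (offset from base) and what value. */
struct port_write {
    uint16_t offset;
    uint8_t value;
};

/*
 * Fill out[0..2] with the writes for one step and return their count.
 * ASSUMPTION: motor_bits uses the same bit positions as the PIC listing
 * (bit 0 = base, bit 1 = shoulder); the real wiring may differ.
 * NOTE: on PC hardware the strobe line is inverted, so the set/clear
 * order may need to be swapped to get the pulse polarity the arm expects.
 * Writing 0 to the control register also clears its other bits.
 */
size_t build_step(uint8_t motor_bits, struct port_write out[3])
{
    out[0] = (struct port_write){ LPT_DATA, motor_bits };
    out[1] = (struct port_write){ LPT_CONTROL, LPT_STROBE };
    out[2] = (struct port_write){ LPT_CONTROL, 0x00u };
    return 3;
}
```

On Linux, each entry could then be issued with outb(value, base + offset) from sys/io.h after obtaining access with ioperm(); the base address (commonly 0x378 for the first port) is machine dependent. As in the PIC code, the step would be repeated in a loop with suitable delays.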

Chris is right about the parallel port being a dumb interface. The parallel port has an address to which you can output an 8-bit binary number whose bits match the positions of the digital outputs.
I found this to be a really good example of programming the Parallel port using C#.
http://www.codeproject.com/Articles/4981/I-O-Ports-Uncensored-1-Controlling-LEDs-Light-Emit
To match your project to his example: C0 is strobe, and your digital outputs, from left to right, match his D0-D6.
Seems like a really fun project. Have fun.

What does this Windows crash dump mean?

Yesterday my system software crashed on Windows Server 2003. The call stack from the core dump is shown below.
kernel32.dll!_RaiseException@16() + 0x3c bytes
rpcrt4.dll!_RpcpRaiseException@4() + 0x21 bytes
rpcrt4.dll!_NdrGetBuffer@12() - 0x1d3fe bytes
rpcrt4.dll!_NdrClientCall2() + 0x132 bytes
hnetcfg.dll!_FwOpenDynamicFwPort@16() + 0x1d bytes
hnetcfg.dll!_IcfOpenDynamicFwPort@12() + 0x6a bytes
mswsock.dll!_WSPBind@16() + 0xa55 bytes
ws2_32.dll!_bind@12() + 0x4e bytes
sal.dll!s_SktBind(s_Socket * sp=0x05943800, SAL_AddrBuf_t * addrp=0x057cfe00, unsigned int addrsz=0x00000042) Line 76 + 0x14 bytes C++
sal.dll!SAL_SktBind(SAL_SktHandle_t * sh=0x05943800, SAL_AddrBuf_t * addrp=0x057cfe00, unsigned int addrsz=0x00000042) Line 101 + 0xe bytes C++
Note: sal.dll is my software module. It calls the system call bind() from our function SktBind().
Could you please tell me why it crashed and how I can solve this problem?
If you have any comments or suggestions, please share them with me.

The call to bind() from function s_SktBind() in sal.dll has caused the crash.
The first thing I would check is that bind() is being called with proper arguments.

This doesn't look like kernel programming to me (re the tag).
Which process faulted? Looks like your program, since you have line number info.
What was the fault? AV? Or some other exception?
Paste the line and surrounding code that crashed. (Line 101 of the file that defines SAL_SktBind).

hnetcfg.dll is a module associated with the Home Networking Configuration Manager from Microsoft Corporation.
Search Microsoft support for relevant articles (e.g. maybe this one)
The code that is the origin of the problem seems to be dealing with networking. Is this correct?

I don't think you have posted the full call stack, but sal.dll is a dll that is provided by Novell and this is from where the error originates. So you might want to check if a newer version of this dll is available.
