Objective-C plugin crashes with initialization of large array

I am working on a plugin in Objective-C which I am running in OsiriX. At some point in the plugin I initialize two large arrays (which are meant to accept a 512x512 image that is later fed into a CoreML model). The original plugin used a CoreML model which accepted 200x200 images, the variable IMG_SQSIZE was set to 200^2, and everything worked fine. Now I have increased IMG_SQSIZE to 512^2 but this crashes the plugin. If I remove the array initialization and everything after it there is no crash; if I keep this line but remove everything after it the crash persists...so I've concluded that this line is causing the problem. I'm new to Objective-C and Xcode but this seems like a memory issue. I'm wondering if this is memory I need to allocate in the code (it builds fine) or if this is an issue with the program running the plugin. Any advice would be great, thanks
#define IMG_SQSIZE 262144 //(512^2)
double tmp_var[IMG_SQSIZE], tmp_var2[IMG_SQSIZE];

Note that 512^2 is a LOT larger than 200^2, and I suspect, as you do, that it is a memory issue: declared like that, the two arrays are local (stack) variables, and 2 x 262,144 x 8 bytes is about 4 MB, which can easily overflow the stack, especially if the plugin code runs on a secondary thread whose stack is much smaller than the main thread's.
You should allocate the memory on the heap using plain C malloc. This is my suggestion based on the requirements: lots of memory and arrays of doubles. If you could use floats or even ints here it would also dramatically reduce the memory requirements, so see if that can work.
At least with Objective-C this is fairly easy to do, but you should probably wrap all of this inside its own autoreleasepool as well.
Let's try the easy way first. See if the following works.
// Allocate the memory
tmp_var = malloc( IMG_SQSIZE * sizeof( double ) );
tmp_var2 = malloc( IMG_SQSIZE * sizeof( double ) );
if ( tmp_var && tmp_var2 )
{
// ... do stuff
// ... see if it works
// ... if it crashes you have trouble
// ... when done free - below
}
else
{
// Needs better handling but for now just to test
NSLog( #"Out of memory" );
}
// You must call this - make sure you do not return
// without passing this point (free(NULL) is safe, so
// both frees are fine even if one malloc failed)
free ( tmp_var );
free ( tmp_var2 );
EDIT
Here is another version that does a single malloc. Not sure which will be better but worth a shot ... if memory is not a problem this one should perform better.
// Supersizeme
tmp_var = malloc( 2 * IMG_SQSIZE * sizeof( double ) );
if ( tmp_var )
{
// This points to the latter half of the same block
tmp_var2 = tmp_var + IMG_SQSIZE;
// ... do stuff
// ... see if it works
// ... if it crashes you have trouble
// ... when done free - below
}
else
{
// Needs better handling but for now just to test
NSLog( #"Out of memory" );
}
// You must call this - make sure you do not return
// without passing this point. Do not free tmp_var2;
// it points into the same allocation
free ( tmp_var );
Also, in both cases, you need to define the variables as
double * tmp_var;
double * tmp_var2;
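Putting it together, a minimal sketch of the single-allocation variant (the method name is just a placeholder and error handling is kept to the bare minimum):
- (void)runModelOnImage
{
    // One heap block holding both 512x512 buffers
    double * tmp_var = malloc( 2 * IMG_SQSIZE * sizeof( double ) );
    if ( !tmp_var )
    {
        NSLog( @"Out of memory" );
        return;
    }
    double * tmp_var2 = tmp_var + IMG_SQSIZE;

    // ... fill tmp_var / tmp_var2 and feed them to the CoreML model ...

    // A single free releases both buffers (tmp_var2 points into the same block)
    free( tmp_var );
}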

Related

Write UART on PIC18

I need help with the UART communication I am trying to implement in my Proteus simulation. I use a PIC18F4520 and I want to display on the virtual terminal the values that have been calculated by the microcontroller.
Here is a snapshot of my design in Proteus.
Right now, this is what my UART code looks like:
#define _XTAL_FREQ 20000000
#define _BAUDRATE 9600
void Configuration_ISR(void) {
IPR1bits.TMR1IP = 1; // TMR1 Overflow Interrupt Priority - High
PIE1bits.TMR1IE = 1; // TMR1 Overflow Interrupt Enable
PIR1bits.TMR1IF = 0; // TMR1 Overflow Interrupt Flag
// 0 = TMR1 register did not overflow
// 1 = TMR1 register overflowed (must be cleared in software)
RCONbits.IPEN = 1; // Interrupt Priority High level
INTCONbits.PEIE = 1; // Enables all low-priority peripheral interrupts
//INTCONbits.GIE = 1; // Enables all high-priority interrupts
}
void Configuration_UART(void) {
TRISCbits.TRISC6 = 0;
TRISCbits.TRISC7 = 1;
SPBRG = ((_XTAL_FREQ/16)/_BAUDRATE)-1;
//RCSTA REG
RCSTAbits.SPEN = 1; // enable serial port pins
RCSTAbits.RX9 = 0;
//TXSTA REG
TXSTAbits.BRGH = 1; // fast baudrate
TXSTAbits.SYNC = 0; // asynchronous
TXSTAbits.TX9 = 0; // 8-bit transmission
TXSTAbits.TXEN = 1; // enable transmitter
}
void WriteByte_UART(unsigned char ch) {
while(!PIR1bits.TXIF); // Wait for TXIF flag Set which indicates
// TXREG register is empty
TXREG = ch; // Transmit data to UART
}
void WriteString_UART(char *data) {
while(*data){
WriteByte_UART(*data++);
}
}
unsigned char ReceiveByte_UART(void) {
if(RCSTAbits.OERR) {
RCSTAbits.CREN = 0;
RCSTAbits.CREN = 1;
}
while(!PIR1bits.RCIF); //Wait for a byte
return RCREG;
}
And in the main loop :
while(1) {
WriteByte_UART('a'); // This works. I can see the As in the terminal
WriteString_UART("Hello World !"); //Nothing displayed :(
}//end while(1)
I have tried different solutions for WriteString_UART but none have worked so far.
I don't want to use printf because it impacts other operations I'm doing with the PIC by adding delay.
So I really want to make it work with WriteString_UART.
In the end I would like to have something like "Error rate is : [a value]%" on the terminal.
Thanks for your help, and please tell me if something isn't clear.
In your WriteByte_UART() function, try polling the TRMT bit. In particular, change:
while(!PIR1bits.TXIF);
to
while(!TXSTA1bits.TRMT);
I don't know if this is your particular issue, but there is a race condition due to the fact that TXIF is not immediately cleared upon loading TXREG. Another option would be to try:
...
Nop();
while(!PIR1bits.TXIF);
...
EDIT BASED ON COMMENTS
The issue is due to the fact that the PIC18 is a Harvard-architecture part, so the compiler uses two different pointer types for data memory and program memory. Try changing your declaration to void WriteString_UART(const rom char *data) and see what happens. You will need to change your WriteByte_UART() declaration as well, to void WriteByte_UART(const unsigned char ch).
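For example (a sketch, assuming the C18 compiler, whose rom qualifier marks program-memory pointers), the two functions would become:
void WriteByte_UART(const unsigned char ch) {
    while(!PIR1bits.TXIF);   // Wait until TXREG is empty
    TXREG = ch;              // Transmit the byte
}
void WriteString_UART(const rom char *data) {
    while(*data) {
        WriteByte_UART(*data++);
    }
}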
Add a delay of a few milliseconds after the line
TXREG = ch;
Also verify that the pointer data in WriteString_UART(char *data) actually points to the
string "Hello World !".
It seems you found a solution, but the reason why it wasn't working in the first place is still not clear. What compiler are you using?
I learned the hard way that C18 and XC8 handle memory spaces differently. With both compilers, a string declared literally like char string[] = "Hello!" will be stored in ROM (program memory). They differ in the way functions use strings.
C18 string functions have variants to access strings either in RAM or ROM (for example strcpypgm2ram, strcpyram2pgm, etc.). XC8, on the other hand, does the job for you, and you will not need to use specific functions to choose which memory you want to access.
If you are using C18, I would highly recommend you switch to XC8, which is more recent and easier to work with. If you still want to use C18 or another compiler which requires you to deal with program/data memory spaces, then below are two solutions you may want to try. The C18 documentation says that putsUSART prints a string from data memory to the USART, while putrsUSART prints a string from program memory. So you can simply use putrsUSART to print your string.
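For instance (a sketch, assuming the C18 peripheral library's putrsUSART and an already-configured USART):
// String literals live in program memory under C18,
// so putrsUSART can print one directly
putrsUSART("Hello World !");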
You may also want to try the following, which consists of copying your string from program memory to data memory (it may be a waste of memory if your application is tight on memory, though):
char pgmstring[] = "Hello";
char datstring[16];
strcpypgm2ram(datstring, pgmstring);
putsUSART(datstring);
In this example, datstring is a buffer in data memory, while the string "Hello" is stored in program memory. A function that accepts a string stored in data memory (such as putsUSART) can NOT be used directly with a string stored in program memory, so the only way to use it here is to first make a copy of the string in data memory.
I hope this helps you understand a bit better how to work with Harvard-architecture microcontrollers, where program and data memories are separate.

Is it possible to cast a managed byte array to a native struct without pin_ptr, so as not to bug the GC too much?

It is possible to cast a managed array<Byte>^ to some non-managed struct only by using pin_ptr, AFAIK, like:
void Example(array<Byte>^ bfr) {
pin_ptr<Byte> ptr = &bfr[0];
auto data = reinterpret_cast<NonManagedStruct*>(ptr);
data->Header = 7;
data->Length = sizeof(data);
data->CRC = CalculateCRC(data);
}
However, is it possible with interior_ptr in any way?
I'd rather work on managed data the low-level way (using unions, struct bit-fields, and so on) without pinning the data - I could be holding this data for quite a long time and don't want to harass the GC.
Clarification:
I do not want to copy managed-data to native and back (so the Marshaling way is not an option here...)
You likely won't harass the GC with pin_ptr - it's pretty lightweight unlike GCHandle.
GCHandle::Alloc(someObject, GCHandleType::Pinned) will actually register the object as being pinned in the GC. This lets you pin an object for extended periods of time and across function calls, but the GC has to track that object.
On the other hand, pin_ptr gets translated to a pinned local in IL code. The GC isn't notified about it; it only sees that the object is pinned during a collection, that is, when it looks for object references on the stack.
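To make the contrast concrete, here is a sketch of both approaches, reusing the bfr array and NonManagedStruct from the question (the scoping is what matters, not the exact code):
void ExampleContrast(array<Byte>^ bfr) {
    // Heavyweight: the GC tracks this handle until Free() is called,
    // so the object can stay pinned across calls for a long time
    // (requires using namespace System::Runtime::InteropServices)
    GCHandle handle = GCHandle::Alloc(bfr, GCHandleType::Pinned);
    auto data = reinterpret_cast<NonManagedStruct*>(
        handle.AddrOfPinnedObject().ToPointer());
    // ... use data, possibly across a long stretch of work ...
    handle.Free();

    // Lightweight: just a pinned local; the pin only has an effect
    // if a collection happens while ptr is in scope
    {
        pin_ptr<Byte> ptr = &bfr[0];
        auto data2 = reinterpret_cast<NonManagedStruct*>(ptr);
        // ... use data2 only within this scope ...
    }
}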
If you really want to, you can access stack memory in the following way:
[StructLayout(LayoutKind::Explicit, Size = 256)]
public value struct ManagedStruct
{
};
struct NativeStruct
{
char data[256];
};
static void DoSomething()
{
ManagedStruct managed;
auto nativePtr = reinterpret_cast<NativeStruct*>(&managed);
nativePtr->data[42] = 42;
}
There's no pinning at all here, but this is only due to the fact that the managed struct is stored on the stack, and therefore is not relocatable in the first place.
It's a convoluted example, because you could just write:
static void DoSomething()
{
NativeStruct native;
native.data[42] = 42;
}
...and the compiler would perform a similar trick under the covers for you.

How does one add vertices to a mesh object in OpenGL?

I am new to OpenGL and I have been using The Red Book, and the Super Bible. In the SB, I have gotten to the section about using objects loaded from files. So far, I don't think I have a problem understanding what is going on and how to do it, but it got me thinking about making my own mesh within my own app--in essence, a modeling app. I have done a lot of searching through both of my references as well as the internet, and I have yet to find a nice tutorial about implementing such functionality into one's own App. I found an API that just provides this functionality, but I am trying to understand the implementation; not just the interface.
Thus far, I have created an "app" (I use this term lightly) that gives you a view that you can click in to add vertices. The vertices don't connect, they are just displayed where you click. My concern is that this method I stumbled upon while experimenting is not the way I should be implementing this process.
I am working on a Mac and using Objective-C and C in Xcode.
MyOpenGLView.m
#import "MyOpenGLView.h"
@interface MyOpenGLView () {
NSTimer *_renderTimer;
GLuint VAO, VBO;
GLuint totalVertices;
GLsizei bufferSize;
}
@end
@implementation MyOpenGLView
/* Set up OpenGL view with a context and pixelFormat with doubleBuffering */
/* NSTimer implementation */
- (void)drawS3DView {
currentTime = CACurrentMediaTime();
NSOpenGLContext *currentContext = self.openGLContext;
[currentContext makeCurrentContext];
CGLLockContext([currentContext CGLContextObj]);
const GLfloat color[] = {
sinf(currentTime * 0.2),
sinf(currentTime * 0.3),
cosf(currentTime * 0.4),
1.0
};
glClearBufferfv(GL_COLOR, 0, color);
glUseProgram(shaderProgram);
glBindVertexArray(VAO);
glPointSize(10);
glDrawArrays(GL_POINTS, 0, totalVertices);
CGLFlushDrawable([currentContext CGLContextObj]);
CGLUnlockContext([currentContext CGLContextObj]);
}
#pragma mark - User Interaction
- (void)mouseUp:(NSEvent *)theEvent {
NSPoint mouseLocation = [theEvent locationInWindow];
NSPoint mouseLocationInView = [self convertPoint:mouseLocation fromView:self];
GLfloat x = -1 + mouseLocationInView.x * 2/(GLfloat)self.bounds.size.width;
GLfloat y = -1 + mouseLocationInView.y * 2/(GLfloat)self.bounds.size.height;
NSOpenGLContext *currentContext = self.openGLContext;
[currentContext makeCurrentContext];
CGLLockContext([currentContext CGLContextObj]);
[_renderer addVertexWithLocationX:x locationY:y];
CGLUnlockContext([currentContext CGLContextObj]);
}
- (void)addVertexWithLocationX:(GLfloat)x locationY:(GLfloat)y {
glBindBuffer(GL_ARRAY_BUFFER, VBO);
GLfloat vertices[(totalVertices * 2) + 2];
glGetBufferSubData(GL_ARRAY_BUFFER, 0, (totalVertices * 2), vertices);
for (int i = 0; i < ((totalVertices * 2) + 2); i++) {
if (i == (totalVertices * 2)) {
vertices[i] = x;
} else if (i == (totalVertices * 2) + 1) {
vertices[i] = y;
}
}
glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);
totalVertices ++;
}
@end
The app is supposed to take the location of the mouse click and provide it as a vertex location. With each added vertex, I first bind the VBO to make sure it is active. Next, I create a new array to hold my current vertex locations (totalVertices) plus space for one more vertex (+2 for x and y). Then I use glGetBufferSubData to bring the data back from the VBO and put it into this array. Using a for loop I add the X and Y numbers to the end of the array. Finally, I send this data back to the GPU into a VBO and call totalVertices++ so I know how many vertices I have in the array next time I want to add a vertex.
This brings me to my question: Am I doing this right? Put another way, should I be keeping a copy of the BufferData on the CPU side so that I don't have to call out to the GPU and have the data sent back for editing? In that way, I wouldn't call glGetBufferSubData, I would just create a bigger array, add the new vertex to the end, and then call glBufferData to realloc the VBO with the updated vertex data.
** I tried to include my thinking process so that someone like myself who is very inexperienced in programming can hopefully understand what I am trying to do. I don't want anyone to be offended by my explanations of what I did. **
I would certainly avoid reading the data back. Not only because of the extra data copy, but also to avoid synchronization between CPU and GPU.
When you make an OpenGL call, you can picture the driver building a GPU command, queuing it up for later submission to the GPU, and then returning. These commands will then be submitted to the GPU at a later point. The idea is that the GPU can run as independently as possible from whatever runs on the CPU, which includes your application. CPU and GPU operating in parallel with minimal dependencies is very desirable for performance.
For most glGet*() calls, this asynchronous execution model breaks down. They will often have to wait until the GPU completed all (or at least some) pending commands before they can return the data. So the CPU might block while only the GPU is running, which is undesirable.
For that reason, you should definitely keep your CPU copy of the data so that you don't ever have to read it back.
Beyond that, there are a few options. It will all depend on your usage pattern, the performance characteristics of the specific platform, etc. To really get the maximum out of it, there's no way around implementing multiple variations, and benchmarking them.
For what you're describing, I would probably start with something that works similar to a std::vector in C++. You allocate a certain amount of memory (typically named capacity) that is larger than what you need at the moment. Then you can add data without reallocating, until you fill the allocated capacity. At that point, you can for example double the capacity.
Applying this to OpenGL, you can reserve a certain amount of memory by calling glBufferData() with NULL as the data pointer. Keep track of the capacity you allocated, and populate the buffer with calls to glBufferSubData(). When adding a single point in your example code, you would call glBufferSubData() with just the new point. Only when you run out of capacity, you call glBufferData() with a new capacity, and then fill it with all the data you already have.
In pseudo-code, the initialization would look something like this:
int capacity = 10;
glBufferData(GL_ARRAY_BUFFER,
capacity * sizeof(Point), NULL, GL_DYNAMIC_DRAW);
std::vector<Point> data;
Then each time you add a point:
data.push_back(newPoint);
if (data.size() <= capacity) {
glBufferSubData(GL_ARRAY_BUFFER,
(data.size() - 1) * sizeof(Point), sizeof(Point), &newPoint);
} else {
capacity *= 2;
glBufferData(GL_ARRAY_BUFFER,
capacity * sizeof(Point), NULL, GL_DYNAMIC_DRAW);
glBufferSubData(GL_ARRAY_BUFFER,
0, data.size() * sizeof(Point), &data[0]);
}
As an alternative to glBufferSubData(), glMapBufferRange() is another option to consider for updating buffer data. Going further, you can look into using multiple buffers and cycling through them instead of updating just a single buffer. This is where benchmarking comes into play, because there isn't a single approach that will be best for every possible platform and use case.
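A rough sketch of the mapping variant for the single-point update above, using the same Point/data/newPoint names as the pseudo-code:
// Map only the range that will hold the new point and write it in place
GLintptr offset = (data.size() - 1) * sizeof(Point);
void* dst = glMapBufferRange(GL_ARRAY_BUFFER, offset, sizeof(Point),
    GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_RANGE_BIT);
if (dst) {
    memcpy(dst, &newPoint, sizeof(Point));
    glUnmapBuffer(GL_ARRAY_BUFFER);
}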

Pre-processing a loop in Objective-C

I am currently writing a program to help me control complex light installations. The idea is that I tell the program to start a preset, and then the app has three options (depending on the preset type):
1) the lights go to one position (so only one group of data is sent when the preset starts)
2) the lights follow a mathematical equation (e.g. a sine with a timer to make smooth circles)
3) the lights respond to a flow of data (e.g. a MIDI controller)
So I decided to go with an object I call the AppBrain, which receives data from the controllers and the templates, but is also able to send processed data to the lights.
Now, I come from non-native programming, and I kind of have trust issues about working with a lot of processing, events and timing, as well as trouble understanding the Cocoa logic 100%.
This is where the actual question starts, sorry
What I want to do is, when I load the preset, parse it to prepare the timer/data-receive event so it doesn't have to go through every option for 100 lights 100 times per second.
To explain more deeply, here's how I would do it in JavaScript (crappy pseudo-code, of course):
var lightsFunctions = {};
function prepareTemplate(theTemplate){
//Let's assume here the template is just an array, and I won't show all the processing
switch(theTemplate.typeOfTemplate){
case "simpledata":
sendAllDataToLights(); // Simple here
break;
case "periodic":
for(light in theTemplate.lights){
switch(light.typeOfEquation){
case "sin":
lightsFunctions[light.id] = doTheSinus; // doTheSinus being an existing function
break;
case "cos":
...
}
}
function onFrame(){
for(light in lightsFunctions){
lightsFunctions[light]();
}
}
var theTimer = setTimeout(onFrame, theTemplate.delay);
break;
case "controller":
//do the same pre-processing without the timer, to know which function to execute for which light
break;
}
}
So, my idea is to store the processing functions I need in an NSArray, so I don't need to test the type on each frame and lose time/CPU.
I don't know if I'm clear, or if my idea is possible/the right way to go. I'm mostly looking for algorithm ideas, and for some code that might point me in the right direction... (I know of performSelector, but I don't know if it is the best fit for this situation.)
Thanks;
I_
First of all, don't spend time optimizing what you don't know is a performance problem. 100 iterations of this kind are nothing in the native world, even on the weaker mobile CPUs.
Now, to your problem. I take it you are writing some kind of configuration / DSL to specify the light control sequences. One way of doing it is to store blocks in your NSArray. A block is the equivalent of a function object in JavaScript. So for example:
typedef void (^LightFunction)(void);
- (NSArray*) parseProgram ... {
NSMutableArray* result = [NSMutableArray array];
if(...) {
LightFunction simpleData = ^{ sendDataToLights(); };
[result addObject:simpleData];
} else if(...) {
Light* light = [self getSomeLight:...];
LightFunction periodic = ^{
// Note how you can access the local scope of the outside function.
// Make sure you use automatic reference counting for this.
[light doSomethingWithParam:someParam];
};
[result addObject:periodic];
}
return result;
}
...
NSArray* program = [self parseProgram:...];
// To run your program
for(LightFunction func in program) {
func();
}
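For the periodic presets you can then drive that loop from a repeating timer, much like the setTimeout in your JavaScript sketch. A rough sketch (Preset, self.program and self.frameTimer are placeholder names, not an existing API):
// Build the program once when the preset is loaded, then let a
// repeating NSTimer execute it on every frame
- (void)startPeriodicPreset:(Preset *)preset {
    self.program = [self parseProgram:preset];
    self.frameTimer = [NSTimer scheduledTimerWithTimeInterval:preset.delay
                                                       target:self
                                                     selector:@selector(onFrame:)
                                                     userInfo:nil
                                                      repeats:YES];
}
- (void)onFrame:(NSTimer *)timer {
    for (LightFunction func in self.program) {
        func();
    }
}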

Can't figure out where the memory leak is in the function

I am having a problem identifying a memory leak. I tried Instruments and it says that there is a memory leak every time I call the function described below.
CFStringRef getStringFromLocalizedNIB(int cmdId)
{
IBNibRef nibRef;
WindowRef wind=NULL;
CFStringRef alertString;
CreateNibReference(CFSTR("main"), &nibRef);
CreateWindowFromNib(nibRef, CFSTR("Localized Strings"), &wind);
DisposeNibReference(nibRef);
ControlID alertID = {'strn',cmdId};
ControlRef alertRef;
GetControlByID(wind, &alertID,&alertRef);
GetControlData(alertRef, kControlNoPart, kControlStaticTextCFStringTag, sizeof(CFStringRef), &alertString, NULL);
return alertString;
}
Every time I call the function, I release the returned object.
CFStringRef lstr;
lstr = getStringFromLocalizedNIB(20);
//Use lstr;
CFRelease(lstr);
So can anybody please explain where the leak is?
If I understand correctly, you aren't showing the window created using CreateWindowFromNib(). I would expect the window to have the Carbon equivalent of release-on-close, and the CreateWindowFromNib() to be balanced by a ShowWindow(). I haven't done Carbon in 9 years though, so I'm not sure.
Try calling DisposeWindow() on wind to balance the create:
...
DisposeWindow(wind);
return alertString;
}
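Putting it together, the function would look something like this (a sketch; GetControlData with kControlStaticTextCFStringTag hands back a copy of the string, so it should remain valid after the window is disposed):
CFStringRef getStringFromLocalizedNIB(int cmdId)
{
    IBNibRef nibRef;
    WindowRef wind = NULL;
    CFStringRef alertString = NULL;
    CreateNibReference(CFSTR("main"), &nibRef);
    CreateWindowFromNib(nibRef, CFSTR("Localized Strings"), &wind);
    DisposeNibReference(nibRef);
    ControlID alertID = {'strn', cmdId};
    ControlRef alertRef;
    GetControlByID(wind, &alertID, &alertRef);
    GetControlData(alertRef, kControlNoPart, kControlStaticTextCFStringTag,
                   sizeof(CFStringRef), &alertString, NULL);
    // Balance CreateWindowFromNib() so the window itself is not leaked;
    // the caller still owns alertString and must CFRelease it
    DisposeWindow(wind);
    return alertString;
}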