I'm writing an interface to a C library. A C function allocates some memory, reads a value, and returns a void * pointer to that buffer, to be subsequently freed.
I want to be sure that when I assign the output of a call to nativecast(Str, $data) to a Raku Str variable, the data is actually copied into the variable rather than just bound to it, so that I can free the space allocated by the C function soon after the assignment.
This is approximately what the code looks like:
my Pointer $data = c_read($source);
my Str $value = nativecast(Str, $data);
c_free($data);
# $value is now ready to be used
I ran this code through valgrind, which didn't report any attempt to reference a freed memory buffer. Still, I'm curious.
The internals of Str are completely incompatible with C strings, so they have to be decoded before they are used.
More specifically, MoarVM stores grapheme clusters as (negative) synthetic codepoints if there isn't already an NFC codepoint for them. This means that even two instances of the same program may use different synthetic codepoints for the same grapheme cluster.
Even ignoring that, MoarVM stores strings as immutable data structures, which means it can't just keep using the C string: C code could change it out from under MoarVM, breaking that assumption.
I'm sure there are a bunch more reasons that it can't use C strings as-is.
Like I said, the internals of Str are completely incompatible with C strings. So there is zero chance of it continuing to use the space allocated by the C function.
The biggest problem here would be to call nativecast after freeing the buffer.
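A minimal sketch of how the whole read-copy-free pattern can be wrapped up, using the c_read/c_free bindings from the question (the sub name is arbitrary):

use NativeCall;

sub read-value($source) {
    my Pointer $data = c_read($source);   # buffer allocated by the C library
    LEAVE c_free($data);                  # runs on scope exit, after the cast below
    nativecast(Str, $data);               # decodes and copies into a fresh Raku Str
}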
Related: in the question Fortran Functions with a pointer result in a normal assignment, it is stated that functions returning pointers are not recommended.
My question concerns constructors of user defined types. Consider the code below:
module PointMod
  implicit none

  type PointType
    real(8), dimension(:), allocatable :: array
  contains
    final :: Finalizer
  end type PointType

  interface PointType
    procedure NewPointType
  end interface PointType

contains

  function NewPointType(n) result(TypePointer)
    integer, intent(in) :: n
    type(PointType), pointer :: TypePointer
    allocate(TypePointer)
    allocate(TypePointer%array(n))
  end function NewPointType

  subroutine Finalizer(this)
    type(PointType) :: this
    print *, 'Finalizer called'
  end subroutine Finalizer

end module PointMod

program PointTest
  use PointMod, only: PointType
  implicit none
  class(PointType), allocatable :: TypeObject
  TypeObject = PointType(10)
end program PointTest
In the code, I have defined a type with a constructor that allocates the object and then allocates an array in the object. It then returns a pointer to the object.
If the constructor just returned the object, the object and the array would be copied and then deallocated (at least with standard compliant compilers). This could cause overhead and mess with our memory tracking.
Compiling the above code with ifort gives no warnings with -warn all (except an unused-variable warning in the finalizer), and the code behaves the way I expect. It also works fine with gfortran, except that I get a warning when using -Wall:
TypeObject = PointType(10)
1
Warning: POINTER-valued function appears on right-hand side of assignment at (1) [-Wsurprising]
What are the risks of using constructors like these? As far as I can tell, there will be no dangling pointers, and we will have more control over when objects are allocated. One workaround that would achieve the same result is to explicitly allocate the object and turn the constructor into a subroutine that sets the variables and allocates array, but that looks a lot less elegant. Are there other solutions? Our code follows the Fortran 2008 standard.
Do not use pointer-valued functions. As a rule, I never write functions that return pointers. They are bad and confusing, and they lead to nasty bugs, especially when one confuses => and =.
What the function does is allocate a new object (along with its array component) and return a pointer to it.
What
TypeObject = PointType(10)
does is copy the value of the object the pointer points to. The pointer is then forgotten, and the memory it pointed to is leaked and lost forever.
You write "As far as I can tell, there will be no dangling pointers and we will have more control over when objects are allocated." However, I do not see a way to avoid leaking the memory allocated inside the function; not even a finalizer can help here. I also do not see how you gain more control. The memory you explicitly allocated is simply lost. TypeObject occupies different memory (it is an allocatable variable belonging to the main program), and the array inside the type is allocated again during the copy at the intrinsic assignment TypeObject = PointType(10).
The finalizer could take care of the array component, so the array allocated inside the function does not have to be lost. However, the object itself, the one TypePointer points to, with its non-allocatable, non-pointer components, descriptors and so on, cannot be deallocated from the finalizer; it stays allocated but unreachable, and that memory is leaked.
Do not be afraid of functions that return objects as values; that is not a problem. Compilers are smart and can optimize away an unnecessary copy. A compiler can often see that you are just assigning the function result and use the memory location of the assignment target for the function result variable (if the result does not have to be allocatable).
Many other optimizations exist.
function NewPointType(n) result(TypePointer)
  integer, intent(in) :: n
  type(PointType) :: TypePointer
  allocate(TypePointer%array(n))
end function NewPointType
is simpler and should work just fine. With optimizations it could even be faster. If using a non-pointer non-allocatable result is not possible, use allocatable. Do not use pointers for function results.
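If a plain, non-allocatable result really is not an option, a minimal sketch of the allocatable-result variant mentioned above could look like this (the result name TypeResult is just for illustration):

function NewPointType(n) result(TypeResult)
  integer, intent(in) :: n
  type(PointType), allocatable :: TypeResult
  allocate(TypeResult)            ! allocatable result, not a pointer: nothing is leaked
  allocate(TypeResult%array(n))
end function NewPointType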
void f()
{
    int a[1];
    int b;
    int c;
    int d[1];
}
I have found that these local variables, for this example, are not pushed onto the stack in order. b and c are pushed in the order of their declaration, but a and d are grouped together. So the compiler is allocating the arrays differently from the other built-in types and objects.
Is this a C/C++ requirement or a gcc implementation detail?
The C standard says nothing about the order in which local variables are allocated. It doesn't even use the word "stack". It only requires that local variables have a lifetime that begins on entry to the nearest enclosing block (basically when execution reaches the {) and ends on exit from that block (reaching the }), and that each object has a unique address. It does acknowledge that two unrelated variables might happen to be adjacent in memory (for obscure technical reasons involving pointer arithmetic), but doesn't say when this might happen.
The order in which variables are allocated is entirely up to the whim of the compiler, and you should not write code that depends on any particular ordering. A compiler might lay out local variables in the order in which they're declared, or alphabetically by name, or it might group some variables together if that happens to result in faster code.
If you need variables to be allocated in a particular order, you can wrap them in an array or a structure.
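For example, a minimal sketch (the struct name is only illustrative): struct members are guaranteed to appear in declaration order with increasing addresses, so wrapping the locals fixes their relative layout.

void f(void)
{
    /* members are laid out in declaration order, so their addresses
       increase from a to d (possibly with padding in between) */
    struct locals {
        int a[1];
        int b;
        int c;
        int d[1];
    } v;
    (void)v;   /* suppress the unused-variable warning */
}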
(If you were to look at the generated machine code, you'd most likely find that the variables are not "pushed onto the stack" one by one. Instead, the compiler will probably generate a single instruction to adjust the stack pointer by a certain number of bytes, effectively allocating a single chunk of memory to hold all the local variables for the function or block. Code that accesses a given variable will then use its offset within the stack frame.)
And since your function doesn't do anything with its local variables, the compiler might just not bother allocating space for them at all, particularly if you request optimization with -O3 or something similar.
The compiler can order the local variables however it wants. It may even choose not to allocate them at all (for example, if they're not used, or if they're optimized away through propagation, ciscizing, being kept in a register, etc.), or to allocate the same stack location for multiple locals that have disjoint live ranges.
There is no common implementation detail to outline how a particular compiler does it, as it may change at any time.
Typically, compilers will try to group similar sized variables (and/or alignments) together to minimize wasted space through "gaps", but there are so many other factors involved.
structs and arrays have slightly different requirements, but that's beyond the scope of this question I believe.
Putting aside good programming practices (I'll give context after):
With respect to Objective-C string literals such as @"foobar":
Does this structure...
NSString *kFoobar = @"foobar";
[thing1 setValue:xyz forKey:kFoobar];
[thing2 setValue:abc forKey:kFoobar];
[thing3 setValue:def forKey:kFoobar];
[thing4 setValue:ghi forKey:kFoobar];
Use more runtime memory than this structure...
[thing1 setValue:xyz forKey:@"foobar"];
[thing2 setValue:abc forKey:@"foobar"];
[thing3 setValue:def forKey:@"foobar"];
[thing4 setValue:ghi forKey:@"foobar"];
Or does the compiler sort things out and merge all instances of @"foobar" into a single reference in the TEXT section?
Context...
I have inherited a large amount of source code in which most keys are expressed as string literals rather than string constants. It's not mine, and the owner isn't going to pay for nice-to-haves. Is there any point in spending time constantifying the strings, from a runtime point of view?
I did pass the executable through strings, and it appears the compiler does the heavy lifting, but I'm not sure.
The two are, for all intents and purposes, identical. Only one instance of a given literal string is created per compilation unit. (And, in fact, in some cases even less, since the system will attempt to combine them.)
The var kFoobar used in the first example would, if a local var, be a temporary which may never be more than a register. At most it would occupy 8 bytes in the stack frame that goes away on method exit. And the compiler would likely load a temp to point to the literal anyway, for the second case. So the code for the two examples could actually be identical.
If kFoobar were some sort of instance or global variable, then the pointer variable itself would of course occupy instance or global storage, but it would have no other effect.
And the NSMutableDictionary does not need to make a local copy of the string (when it's used as a key) because NSString is immutable. The single copy is shared by all referencing objects.
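If the keys ever do get constantified anyway, the benefit would be typo-catching and maintainability rather than memory; the conventional form (the constant name here is just an example) is:

// one shared compile-time constant, pointing at the same uniqued literal
static NSString * const kFoobarKey = @"foobar";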
I'm implementing a dynamic language that will compile to C#, and it's implementing its own reflection API (.NET's is too slow, and the DLR is limited only to more recent and resourceful implementations).
For this, I've implemented a simple .GetField(string f) and .SetField(string f, object val) interface. Until recently, the implementation just switched over all possible field string values and performed the corresponding action.
Also, this dynamic language has the possibility to define anonymous objects. For those anonymous objects, at first, I had implemented a simple hash algorithm.
By now, I am looking for ways to optimize the dynamic parts of the language, and I have come to suspect that a hash algorithm for anonymous objects is overkill, because the objects are usually small. I'd say they contain 2 or 3 fields, normally, and very rarely more than 15. It would probably take more time to hash the string and perform the lookup than to simply test for equality against all of them. (This is not tested, just theoretical.)
The first thing I did was to create, at compile time, a red-black tree for each anonymous object declaration and lay it out in an array, so that the object can perform lookups in a very optimized way.
I am still torn, though, about whether that's the best way to do this. I could go for a perfect hash function. Even more radically, I'm thinking about dropping the need for strings and actually working with a struct of 2 longs.
Each of those two longs will be encoded to hold 10 chars (A-Za-z0-9_), which covers the typical length of field names. For fields longer than that, a special (slower) function taking a string will also be provided.
The result will be that strings will be inlined (not references), and their comparisons will be as cheap as a long comparison.
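As a rough sketch of the encoding I have in mind (the method name and exact bit assignments here are illustrative, not final): each character of A-Za-z0-9_ fits in 6 bits, so 10 characters fit in one 64-bit long, and two longs cover 20 characters.

// sketch only: pack up to 10 identifier chars into one long, 6 bits per char;
// code 0 is reserved to mean "no more characters"
static long PackName10(string name, int start)
{
    long packed = 0;
    for (int i = 0; i < 10 && start + i < name.Length; i++)
    {
        char c = name[start + i];
        long code;
        if (c == '_') code = 1;
        else if (c >= '0' && c <= '9') code = 2 + (c - '0');
        else if (c >= 'A' && c <= 'Z') code = 12 + (c - 'A');
        else if (c >= 'a' && c <= 'z') code = 38 + (c - 'a');
        else throw new ArgumentException("unsupported character in field name");
        packed |= code << (6 * i);
    }
    return packed;
}

// a field name of up to 20 chars then becomes the pair
// (PackName10(name, 0), PackName10(name, 10)), comparable as two long comparisons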
Anyway, it's a little hard to find good information about this kind of optimization, since it is normally discussed at the VM level, not for a static-language compilation target.
Does anyone have any thoughts or tips about the best data structure to handle dynamic calls?
Edit:
For now, I'm going with the string-as-longs representation and a binary tree laid out in a linear array for lookups.
I don't know if this is helpful, but I'll chuck it out in case;
If this is compiling to C#, do you know the complete list of fields at compile time? So as an idea, if your code reads
// dynamic
myObject.foo = "some value";
myObject.bar = 32;
then during the parse, your symbol table can build an int for each field name;
// parsing code
symbols[0] == "foo"
symbols[1] == "bar"
then generate code using arrays or lists;
// generated c#
runtimeObject[0] = "some value"; // assign myobject.foo
runtimeObject[1] = 32; // assign myobject.bar
and build up reflection as a separate array;
runtimeObject.FieldNames[0] == "foo"; // Dictionary<int, string>
runtimeObject.FieldIds["foo"] == 0; // Dictionary<string, int>
As I say, thrown out in the hope it'll be useful. No idea if it will!
Since you are likely to be using the same field and method names repeatedly, something like string interning would work well to quickly generate keys for your hash tables. It would also make string equality comparisons constant-time.
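For example, with plain .NET interning (the dynamic language could equally keep its own intern table):

// both references end up pointing at the single interned "foo" instance,
// so key equality can be checked by reference instead of char-by-char
string a = string.Intern(new string(new[] { 'f', 'o', 'o' }));
string b = string.Intern("foo");
bool sameKey = ReferenceEquals(a, b);   // true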
For such a small data set (an expected upper bound of 15) I think almost any hashing will be more expensive than a tree or even a list lookup, but that really depends on your hashing algorithm.
If you want to use a dictionary/hash then you'll need to make sure the objects you use for the key return a hash code quickly (perhaps a single constant hash code that's built once). If you can prevent collisions inside of an object (sounds pretty doable) then you'll gain the speed and scalability (well for any realistic object/class size) of a hash table.
Something that comes to mind is Ruby's symbols and message passing. I believe Ruby's symbols are essentially constants backed by a single memory reference, so comparison is constant-time, they are very lightweight, and you can use symbols like variables (I'm a little hazy on this and don't have a Ruby interpreter on this machine). Ruby's method "calling" really turns into message passing: something like obj.func(arg) turns into obj.send(:func, arg) (:func is the symbol). I would imagine that the symbol makes looking up the message handler (as I'll call it) inside the object pretty efficient, since its hash code most likely doesn't need to be calculated the way it does for most objects.
Perhaps something similar could be done in .NET.
I'm confused about COM string assignments. Which of the following string assignments is correct, and why?
CComBSTR str;
.
.
Obj->str = L"";           // Option 1
or should it be
Obj->str = CComBSTR(L""); // Option 2
What is the reason?
A real BSTR is:
- temporarily allocated from the COM heap (via SysAllocString() and family)
- a data structure in which the string data is preceded by its length, stored in a 32-bit value
- passed as a pointer to the fifth byte of that data structure, where the string data resides
See the documentation:
MSDN: BSTR
Most functions which accept a BSTR will not crash when passed a fake BSTR created by simple assignment of a WCHAR * literal. This leads to confusion, as people observe what seems to be working code and infer from it that a BSTR can be initialized just like any WCHAR *. That inference is incorrect.
Only real BSTRs can be passed to OLE Automation interfaces.
By using the CComBSTR() constructor, which calls SysAllocString(), your code will create a real BSTR. The CComBSTR() destructor will take care of returning the allocated storage to the system via SysFreeString().
If you pass the CComBSTR to an API which takes ownership of the BSTR, be sure to call the .Detach() method so that the CComBSTR does not also free it. BSTRs are not reference counted (unlike COM objects, which are), and therefore an attempt to free a BSTR more than once will crash.
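A minimal sketch, where pWidget and SetName stand in for any interface method that takes ownership of its BSTR argument:

CComBSTR name(L"example");
HRESULT hr = pWidget->SetName(name.Detach());  // ownership transferred; the CComBSTR
                                               // destructor will not free it a second time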
If you use str = CComBSTR(L"") you use the constructor:
CComBSTR(LPCOLESTR pSrc);
If you use str = L"" you use the assignment operator:
CComBSTR& operator=(LPCOLESTR pSrc);
They both would initialize the CComBSTR object correctly.
Personally, I'd prefer option 1, because that doesn't require constructing a new CComBSTR object. (Whether their code does so behind the scenes is a different story, of course.)
Option 1 is preferred because it does only one allocation for the string, whereas option 2 does two (not counting the creation of a temporary object for no particular reason). Unlike the _bstr_t type in VC++, the ATL CComBSTR does not use reference-counted strings, so it will copy the entire string across.