Out of interest, how are method names stored in memory in compiled Objective-C? I'm mainly asking because I want to understand dynamic typing better.
Thanks in advance!
The source for the runtime is available, btw, if you really want to go deep.
In short: method names -- their selectors -- are stored as C strings in the Mach-O file of the binary. For example, if you have a method -(void)foo:(int)a bar:(int)b;, there will be a selector string foo:bar: in the Mach-O file.
Type encoding information is also stored, in a separate section of the Mach-O file. That type information -- for which there is API in the runtime to retrieve it -- describes the type of the return value and arguments to the method.
Note that the type information is incomplete. Note also that using the type information to figure out how to generically encode/decode the arguments to and return value from a method is a downright pain.
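To make the idea concrete, here is a toy model in plain C of a method table keyed by selector strings. It is only a sketch and not the Objective-C runtime: ToyMethod, toy_lookup and toy_send are made-up names, the signature is fixed to two ints, and the real runtime uniques selectors and caches lookups instead of comparing strings on every call. The type string uses the encoding characters i (int), @ (object) and : (selector).

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: every name below is hypothetical, not runtime API. */

/* Like a real IMP, the function receives the receiver and the selector
   as its first two arguments. */
typedef int (*ToyImp)(void *self, const char *cmd, int a, int b);

typedef struct {
    const char *name;   /* selector string, e.g. "add:to:" */
    const char *types;  /* type encoding: return, self, _cmd, then arguments */
    ToyImp imp;         /* function that implements the method */
} ToyMethod;

static int toy_add(void *self, const char *cmd, int a, int b)
{
    (void)self; (void)cmd;
    return a + b;
}

static int toy_mul(void *self, const char *cmd, int a, int b)
{
    (void)self; (void)cmd;
    return a * b;
}

/* The "class" is just a table of methods keyed by name. */
static const ToyMethod toy_methods[] = {
    { "add:to:",      "i@:ii", toy_add },
    { "multiply:by:", "i@:ii", toy_mul },
};

/* Look a method up by its selector string at run time; NULL if absent. */
const ToyMethod *toy_lookup(const char *selector)
{
    for (size_t i = 0; i < sizeof toy_methods / sizeof toy_methods[0]; i++) {
        if (strcmp(toy_methods[i].name, selector) == 0)
            return &toy_methods[i];
    }
    return NULL;
}

/* "Send a message": resolve the name first, then call through the pointer.
   *ok is set to 0 when no method has that name. */
int toy_send(void *self, const char *selector, int a, int b, int *ok)
{
    const ToyMethod *m = toy_lookup(selector);
    *ok = (m != NULL);
    return m ? m->imp(self, selector, a, b) : 0;
}
```

Calling toy_send(NULL, "add:to:", 2, 3, &ok) resolves the string to toy_add only when the call happens, which is the essence of dynamic dispatch by name.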
I'm using reflection to serialize an object. Getting the values as objects is a real murder on performance due to late binding penalties. CType / DirectCast can get rid of most of it but I can't feed a type variable into it so currently I'm using a switch case block on the type variable to select the correct DirectCast.
It came to my attention that CTypeDynamic exists and takes type variables but the return type is Object so... it converts an object into an object, cool. That got me wondering, what is the purpose of this function?
The CTypeDynamic function looks for dynamic information and performs the cast/conversion appropriately. This is different from the CType operator which looks for static information at compile time or relies on the types being IConvertible.
This function examines the object at runtime including looking for Shared (aka static) custom operators. As always, if you know the type then use CType, but if you need dynamic casting then you need to use CTypeDynamic.
More information here: http://blogs.msmvps.com/bill/2010/01/24/ctypedynamic/
Is it possible to create variables to be a specific type in Lua?
E.g. int x = 4
If this is not possible, is there at least some way to have a fake "type" shown before the variable so that anyone reading the code will know what type the variable is supposed to be?
E.g. function addInt(int x=4, int y=5), but x/y could still be any type of variable? I find it much easier to write the variable's type before it rather than putting a comment above the function to let any readers know what type of variable it is supposed to be.
The sole reason I'm asking isn't to limit the variable to a specific data type, but simply to have the ability to put a data type before the variable, whether it does anything or not, to let the reader know what type of variable that it is supposed to be without getting an error.
You can do this using comments:
local x = 4 -- int
function addInt(x --[[int]],
y --[[int]] )
return x + y
end
You can make the syntax a = int(5) from your other comment work using the following:
function int(a) return a end
function string(a) return a end -- caution: this replaces the global string library table
function dictionary(a) return a end
a = int(5)
b = string "hello, world!"
c = dictionary({foo = "hey"})
Still, this doesn't really offer any benefits over a comment.
The only way I can think of to do this would be to create a custom type in C.
Lua Integer type
No. But I understand your goal is to improve understanding when reading and writing functions calls.
Stating the expected data type of parameters adds only a little in terms of giving a specification for the function. Also, some function parameters are polymorphic, accepting a specific value, or a function or table from which to obtain the value for a context in which the function operates. See string.gsub, for example.
When reading a function call, the only thing known at the call site is the name of the variable or field whose value is being invoked as a function (sometimes read as the "name" of the function) and the expressions being passed as actual parameters. It is sometimes helpful to refactor parameter expressions into named local variables to add to the readability.
When writing a function call, the name of the function is key. The names of the formal parameters are also helpful. But still, names (like types) do not comprise much of a specification. The most help comes from embedded structured documentation used in conjunction with an IDE that infers the context of a name and performs content assistance and presentations of available documentation.
luadoc is one such system of documentation. You can write luadoc for the functions you declare.
Eclipse Koneki LDT is one such IDE. Due to the dynamic nature of Lua, this is a difficult problem, so LDT is not always as helpful as one would like. (To be clear, LDT does not use luadoc; it evolved its own embedded documentation system.)
Is it possible to create an IMP where the number of parameters matches the selector for the instance method being resolved?
I could use an 'if' statement and a finite number of parameters (say between 0 and 10), but is it possible to use e.g. imp_implementationWithBlock with va_args?
You can't create a function at runtime in C; the number of parameters has to be known at compile time.
You can use a variadic function to pretend that you have a function with any number of arguments (I've used this in a recent project), but this may not be portable and is probably Undefined Behavior.
If you need to move arguments between functions where the signatures and arguments are not known until runtime, you almost certainly want to look into libffi.
Mike Ash has a few really useful posts about it: http://www.mikeash.com/pyblog/?tag=libffi
That's where I got started and learned most of what I know about it.
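For comparison, the "finite number of parameters" idea from the question can be sketched in plain C without variadics or libffi. This is only an illustration under narrow assumptions: every argument and the return value are int, at most three arguments are supported, and GenericFn and call_with_ints are hypothetical names. Each case casts the stored pointer back to the exact type the function was defined with, which is what keeps the call well-defined.

```c
#include <stddef.h>

/* Hypothetical names; plain C only, no runtime or libffi calls. */
typedef void (*GenericFn)(void);   /* opaque storage for any function pointer */

/* Call fn with argc int arguments taken from argv.
   The caller must pass an argc that matches fn's real parameter count.
   Returns 0 on success, -1 if argc is outside the supported range. */
int call_with_ints(GenericFn fn, int argc, const int *argv, int *result)
{
    switch (argc) {
    case 0:
        *result = ((int (*)(void))fn)();
        return 0;
    case 1:
        *result = ((int (*)(int))fn)(argv[0]);
        return 0;
    case 2:
        *result = ((int (*)(int, int))fn)(argv[0], argv[1]);
        return 0;
    case 3:
        *result = ((int (*)(int, int, int))fn)(argv[0], argv[1], argv[2]);
        return 0;
    default:
        return -1;   /* arity not covered by the switch */
    }
}

/* Sample targets with different arities. */
int sample_seven(void)              { return 7; }
int sample_negate(int a)            { return -a; }
int sample_add2(int a, int b)       { return a + b; }
int sample_add3(int a, int b, int c) { return a + b + c; }
```

The switch grows with every arity and every mix of argument types you need to support, which is exactly the combinatorial problem libffi exists to solve.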
I have the following in my constants file:
typedef enum
{
AnimalTypeBear,
AnimalTypeCamel,
AnimalTypeCow,
AnimalTypeCount
}
AnimalType;
If I declare an AnimalType variable somewhere in my code like the following and set it to AnimalTypeBear:
AnimalType animalType = 0;
Is there a way to somehow derive the string "Bear" from that animalType variable, or more generally to access the string of its corresponding constant (in this case AnimalTypeBear)?
Enum constants are compile-time constants, much like #define values. The compiler replaces each enumerator with its integer value (while a #define is expanded by the preprocessor before compilation), so the names themselves are not available to your code at run time. So basically it is not possible to reference the enum string in this way.
As suggested by others you can use a string array.
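A minimal sketch of the string-array approach, using the enum from the question. The names kAnimalTypeNames and AnimalTypeName are made up for this example; designated initializers are used so each label stays attached to its enumerator even if the order changes.

```c
#include <stddef.h>

typedef enum
{
    AnimalTypeBear,
    AnimalTypeCamel,
    AnimalTypeCow,
    AnimalTypeCount
}
AnimalType;

/* One label per enumerator, indexed by the enumerator's value.
   (Hypothetical name; must be updated whenever the enum changes.) */
static const char *const kAnimalTypeNames[AnimalTypeCount] = {
    [AnimalTypeBear]  = "Bear",
    [AnimalTypeCamel] = "Camel",
    [AnimalTypeCow]   = "Cow",
};

/* Returns the label for a valid AnimalType, or NULL for anything else. */
const char *AnimalTypeName(AnimalType type)
{
    if ((int)type < 0 || (int)type >= AnimalTypeCount)
        return NULL;
    return kAnimalTypeNames[type];
}
```

With AnimalType animalType = 0; as in the question, AnimalTypeName(animalType) gives "Bear". In Objective-C you could wrap the result with [NSString stringWithUTF8String:] if you need an NSString.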
You cannot do this without code in (Objective-)C. If you want to be able to use actual enumeration literals as strings in your code, or during I/O, with language support then you need to use a language with enumeration type support such as Pascal or Ada.
If you are keen to have this and don't mind the work as long as it is reusable, then you need to learn about reading the symbol table structures from a binary and make sure that the information is not stripped from your application. You'll see the debugger can show the correct literals; also, if you use Xcode's "Product > Generate Output > Assembly File" menu item you'll see the literals are in there as strings. It will be a lot of work for you, but would be reusable once done.
After that give up and write some code - a simple static array of labels and an index operation. Yes, it's a maintenance headache if you ever change your enumeration.
Alternatively you can write some different code, say in Ruby... Xcode supports adding your own file "types" and running scripts to (pre-)process them. So you could define, say, a file type ".enum" and use a Ruby script to convert that into a C enumeration definition and code to provide the strings. Apple has examples of using Ruby to pre-process files in this way. Once you have your script Xcode will do the rest: on each compilation it will run your script to convert your ".enum" into ".m" (or ".c") and then compile the result. This approach is usually best, though, for files which contain only one thing, e.g. localised string file processing; you don't usually write your enum declarations in their own files.
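If the goal is a single definition that produces both the enum and its strings, a different technique that stays inside C is the "X-macro" pattern. This is not the Ruby pre-processing described above, just a sketch of the same single-source idea using the C preprocessor; ANIMAL_TYPES, AS_ENUMERATOR, AS_LABEL and AnimalTypeLabel are hypothetical names.

```c
#include <stddef.h>

/* Single source of truth: add a new animal here and both the enum and
   the label table below pick it up. */
#define ANIMAL_TYPES(X) \
    X(Bear)             \
    X(Camel)            \
    X(Cow)

/* First expansion: AnimalTypeBear, AnimalTypeCamel, AnimalTypeCow. */
#define AS_ENUMERATOR(name) AnimalType##name,
typedef enum
{
    ANIMAL_TYPES(AS_ENUMERATOR)
    AnimalTypeCount
}
AnimalType;
#undef AS_ENUMERATOR

/* Second expansion: "Bear", "Camel", "Cow", in the same order. */
#define AS_LABEL(name) #name,
static const char *const kAnimalTypeLabels[] = { ANIMAL_TYPES(AS_LABEL) };
#undef AS_LABEL

/* Returns the label for a valid AnimalType, or NULL for anything else. */
const char *AnimalTypeLabel(AnimalType type)
{
    if ((int)type < 0 || (int)type >= AnimalTypeCount)
        return NULL;
    return kAnimalTypeLabels[type];
}
```

The trade-off is readability: the enum declaration is no longer plain C that a reader can scan, which is one reason some people prefer a generated file instead.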
I'm having a hard time understanding what I'm supposed to do. The only thing I've figured out is I need to use yacc on the cminus.y file. I'm totally confused about everything after that. Can someone explain this to me differently so that I can understand what I need to do?
INTRODUCTION:
We will use lex/flex and yacc/Bison to generate an LALR parser. I will give you a file called cminus.y. This is a yacc format grammar file for a simple C-like language called C-minus, from the book Compiler Construction by Kenneth C. Louden. I think the grammar should be fairly obvious.
The Yahoo group has links to several descriptions of how to use yacc. Now that you know flex it should be fairly easy to learn yacc. The only base type is int. An int is 4 bytes. Booleans are handled as ints, as in C. (Actually the grammar allows you to declare a variable as a type void, but let's not do that.) You can have one-dimensional arrays.
There are no pointers, but references to array elements should be treated as pointers (as in C).
The language provides for assignment, IF-ELSE, WHILE, and function calls and returns.
We want our compiler to output MIPS assembly code, and then we will be able to run it on SPIM. For a simple compiler like this with no optimization, an IR should not be necessary. We can output assembly code directly in one pass. However, our first step is to generate a symbol table.
SYMBOL TABLE:
I like Dr. Barrett’s approach here, which uses a lot of pointers to handle objects of different types. In essence the elements of the symbol table are identifier, type and pointer to an attribute object. The structure of the attribute object will differ according to the type. We only have a small number of types to deal with. I suggest using a linear search to find symbols in the table, at least to start. You can change it to hashing later if you want better performance. (If you want to keep it in C, you can do dynamic allocation of objects using malloc.)
First you need to make a list of all the different types of symbols that there are—there are not many—and what attributes would be necessary for each. Be sure to allow for new attributes to be added, because we have not covered all the issues yet. Looking at the grammar, the question of parameter lists for functions is a place where some thought needs to be put into the design. I suggest more symbol table entries and pointers.
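One way to read the symbol table description above, as a rough C sketch: each entry holds an identifier, a kind, and a void pointer to a kind-specific attribute object, and lookup is a linear search. All names and attribute fields here (SymKind, VarAttr, ArrayAttr, FuncAttr, sym_find, sym_insert and so on) are assumptions made for illustration; the assignment does not prescribe them.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; the attribute fields are guesses at what might be needed. */
typedef enum { SYM_INT_VAR, SYM_INT_ARRAY, SYM_FUNCTION } SymKind;

typedef struct { int offset; } VarAttr;                /* scalar int */
typedef struct { int offset; int length; } ArrayAttr;  /* one-dimensional array */
typedef struct { int returns_void; int param_count; } FuncAttr;

typedef struct Symbol {
    char *name;           /* identifier (owned copy) */
    SymKind kind;         /* tells which struct attr points to */
    void *attr;           /* VarAttr, ArrayAttr or FuncAttr, per kind */
    struct Symbol *next;  /* next entry in the linked list */
} Symbol;

typedef struct { Symbol *head; } SymTable;

/* Linear search over the list; returns NULL if the name is not present. */
Symbol *sym_find(const SymTable *t, const char *name)
{
    for (Symbol *s = t->head; s != NULL; s = s->next)
        if (strcmp(s->name, name) == 0)
            return s;
    return NULL;
}

/* Adds an entry and takes ownership of attr on success.
   Returns NULL if the name already exists or allocation fails;
   in that case the caller still owns attr. */
Symbol *sym_insert(SymTable *t, const char *name, SymKind kind, void *attr)
{
    if (sym_find(t, name) != NULL)
        return NULL;
    Symbol *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->name = malloc(strlen(name) + 1);
    if (s->name == NULL) {
        free(s);
        return NULL;
    }
    strcpy(s->name, name);
    s->kind = kind;
    s->attr = attr;
    s->next = t->head;
    t->head = s;
    return s;
}

/* Convenience constructor for an array attribute object. */
ArrayAttr *array_attr_new(int offset, int length)
{
    ArrayAttr *a = malloc(sizeof *a);
    if (a != NULL) {
        a->offset = offset;
        a->length = length;
    }
    return a;
}

/* Frees every entry together with its name and attribute object. */
void sym_table_free(SymTable *t)
{
    while (t->head != NULL) {
        Symbol *s = t->head;
        t->head = s->next;
        free(s->attr);
        free(s->name);
        free(s);
    }
}
```

Because attr is an untyped pointer, new attribute structs (for example a parameter list hanging off FuncAttr) can be added later without changing Symbol itself.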
TESTING:
The grammar is correct, so taking the existing grammar as it is and generating a parser, the parser will accept a correct C-minus program but it won’t produce any output, because there are no code snippets associated with the rules.
We want to add code snippets to build the symbol table and print information as it does so.
When an identifier is declared, you should print the information being entered into the symbol table. If a previous declaration of the same symbol in the same scope is found, an error message should be printed.
When an identifier is referenced, you should look it up in the table to make sure it is there. An error message should be printed if it has not been declared in the current scope.
When closing a scope, warnings should be generated for unreferenced identifiers.
Your test input should be a correctly formed C-minus program, but at this point nothing much will happen on most of the production rules.
SCOPING:
The most basic approach has a global scope and a scope for each function declared.
The language allows declarations within any compound statement, i.e. scope nesting. Implementing this will require some kind of scope numbering or stacking scheme. (Stacking works best for a one-pass compiler, which is what we are building.)
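The stacking scheme can be sketched roughly as follows in C. This is one possible design, not the one the course expects: ScopeStack and the scope_* functions are hypothetical names, the limits are arbitrary, identifiers longer than 31 characters are truncated, and a reference is resolved by searching from the innermost scope outward. Closing a scope reports unreferenced identifiers, as the testing notes ask.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SYMS  256   /* arbitrary capacity for this sketch */
#define MAX_DEPTH 32    /* arbitrary nesting limit */

typedef struct {
    char name[32];      /* identifier, truncated to 31 characters */
    int referenced;     /* set once the identifier has been used */
} ScopedSym;

typedef struct {
    ScopedSym syms[MAX_SYMS];
    int count;                   /* number of live symbols */
    int scope_start[MAX_DEPTH];  /* index of the first symbol of each open scope */
    int depth;                   /* number of open scopes */
} ScopeStack;

void scope_init(ScopeStack *s) { s->count = 0; s->depth = 0; }

/* Opens a new innermost scope. Returns 0, or -1 if nested too deeply. */
int scope_open(ScopeStack *s)
{
    if (s->depth == MAX_DEPTH)
        return -1;
    s->scope_start[s->depth++] = s->count;
    return 0;
}

/* Declares name in the innermost scope. Returns 0, or -1 if no scope is
   open, the table is full, or name is already declared in this scope. */
int scope_declare(ScopeStack *s, const char *name)
{
    if (s->depth == 0 || s->count == MAX_SYMS)
        return -1;
    for (int i = s->scope_start[s->depth - 1]; i < s->count; i++)
        if (strcmp(s->syms[i].name, name) == 0)
            return -1;
    ScopedSym *e = &s->syms[s->count++];
    strncpy(e->name, name, sizeof e->name - 1);
    e->name[sizeof e->name - 1] = '\0';
    e->referenced = 0;
    return 0;
}

/* Marks the innermost visible declaration of name as referenced.
   Returns 0 if found, -1 if name is undeclared in every open scope. */
int scope_reference(ScopeStack *s, const char *name)
{
    for (int i = s->count - 1; i >= 0; i--) {
        if (strcmp(s->syms[i].name, name) == 0) {
            s->syms[i].referenced = 1;
            return 0;
        }
    }
    return -1;
}

/* Closes the innermost scope, warns about unreferenced identifiers on
   stderr, and returns how many there were (-1 if no scope is open). */
int scope_close(ScopeStack *s)
{
    if (s->depth == 0)
        return -1;
    int start = s->scope_start[--s->depth];
    int unused = 0;
    for (int i = start; i < s->count; i++) {
        if (!s->syms[i].referenced) {
            fprintf(stderr, "warning: '%s' declared but never referenced\n",
                    s->syms[i].name);
            unused++;
        }
    }
    s->count = start;   /* pop every symbol declared in this scope */
    return unused;
}
```

In a yacc grammar you would call scope_open and scope_close from the actions around the compound-statement rule, scope_declare from the declaration rules, and scope_reference wherever an identifier is used.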
(disclaimer) I don't have much experience with compiler classes (as in school courses on compilers) but here's what I understand:
1) You need to use the tools mentioned to create a parser which, when given input, will tell the user whether the input is a correct program according to the grammar defined in cminus.y. I've never used yacc/bison so I don't know exactly how it is done, but this is what seems to be required:
(input) file-of-some-sort which contains the program to be parsed
(output) reply-of-some-sort which tells if the (input) is correct with respect to the provided grammar.
2) It also seems that the parser needs to check for variable consistency (i.e., you can't use a variable you haven't declared, same as in any programming language), which is done via a symbol table. In short, every time something is declared you add it to the symbol table. When you encounter an identifier, if it is not one of the language keywords (like if or while), you'll look it up in the symbol table to determine if it has been declared. If it is there, go on. If it's not, print some-sort-of-error.
Note: point(2) there is a simplified take on a symbol table; in reality there's more to them than I just wrote but that should get you started.
I'd start with yacc examples - see what yacc can do and how it does it. I guess there must be some big example-complete-with-symbol-table out there which you can read to understand further.
Example:
Let's take input A:
int main()
{
int a;
a = 5;
return 0;
}
And input B:
int main()
{
int a;
b = 5;
return 0;
}
and assume we're using C syntax for parsing. Your parser should deem Input A all right, but should yell "b is undeclared" for Input B.