What is the purpose of Jane Street Base's *_intf.ml files? - module

This is a general OCaml question.
A .mli file usually provides the "interface" for its corresponding .ml file.
But in Jane Street's Base sometimes there are three files with the same name-minus-extension, for example (commit):
binary_searchable.ml
binary_searchable.mli
binary_searchable_intf.ml
Why do only some .mls have corresponding .ml_intf.ml files? And, more generally, what are the limitations of the .mli format such that _intf.ml files are sometimes needed?

This is explained in detail in this article on "the _intf trick", but in short, as I understand it, it's to avoid having to repeat large type definitions, especially module types, in both the interface and implementation files.
Essentially, if you have a large type definition that is not intended to be abstracted away externally, say:
type t =
| A of int
| B of string
...
| Z of (int array, string * float)
Instead of duplicating this definition in both Foo.ml and Foo.mli, you can put it in Foo_intf.ml, and then in the .ml and .mli file simply write:
include Foo_intf

Related

Good practice in naming derived types in fortran

I'd like to optimize the readability of my codes in Fortran by using OOP.
I thus use derived types. what is the best practice to name the types and derived types?
For example, is it better to:
type a
real :: var
end type
type(a) :: mya
or always begin type names by type_ like in type_a? I like this one but maybe better ideas can be foud.
Also, is it better (then why) to use short names that are less readable or longer names that end up quite difficult to read if the type has too many "levels". For example, in a%b%c%d%e, if a, b, c, d and e are 8 or more letters long as in country%hospital%service%patient%name, then once again readability seems to be a concern.
Advices from experts are really welcome.
This not anything special to Fortran. You can use coding recommendation for other languages.
Usually, type names are not designated by any prefix or suffix. In many languages class names start with a capital letter. You can use this in Fortran also, even if it is not case sensitive. Just be sure not to reuse the name with a small letter as a variable name.
One example of a good coding guideline is this and you can adapt it for Fortran very easily. Also, have a look on some Fortran examples in books like MRC or RXX. OS can be also useful.
I would recommend not to use too short component names, if the letter is not the same as used in the written equation. In that case it can be used.
Use the associate construct or pointers to make aliases to nested names like country%hospital%service%patient%name.
In my experience, naming issues come up in OO Fortran more than other languages (e.g. C++) because of Fortran's named modules, lack of namespaces, and case-insensitivity, among other things. Case-insensitivity hurts because if you name a type Foo, you cannot have a variable named foo or you will get a compiler error (tested with gfortran 4.9).
The rules I have settled on are:
Each Fortran module provides a single, primary class named Namespace_Foo.
The class Namespace_Foo can be located in your source tree as Namespace/Foo_M.f90.
Class variables are nouns with descriptive, lower case names like bar or bar_baz.
Class methods are verbs with descriptive (but short if possible) names and use a rename search => Namespace_Foo_search.
Instances of class Namespace_Foo can be named foo (without namespace) when there is no easy alternative.
These rules make it particularly easy to mirror a C/C++ class Namespace::Foo in Fortran or bind (using BIND(C)) a C++ class to Fortran. They also avoid all of the common name collisions I've run into.
Here's a working example (tested with gfortran 4.9).
module Namespace_Foo_M
implicit none
type :: Namespace_Foo
integer :: bar
real :: bar_baz
contains
procedure, pass(this) :: search => Namespace_Foo_search
end type
contains
function Namespace_Foo_search(this, offset) result(index)
class(Namespace_Foo) :: this
integer,intent(in) :: offset !input
integer :: index !return value
index = this%bar + int(this%bar_baz) + offset
end function
end module
program main
use Namespace_Foo_M !src/Namespace/Foo_M.f90
type(Namespace_Foo) :: foo
foo % bar = 1
foo % bar_baz = 7.3
print *, foo % search(3) !should print 11
end program
Note that for the purpose of running the example, you can copy/paste everything above into a single file.
Final Thoughts
I have found the lack of namespaces extremely frustrating in Fortran and the only way to hack it is to just include it in the names themselves. We have some nested "namespaces", e.g. in C++ Utils::IO::PrettyPrinter and in Fortran Utils_IO_PrettyPrinter. One reason I use CamelCase for classes, e.g. PrettyPrinter instead of Pretty_Printer, is to disambiguate what is a namespace. It does not really matter to me if namespaces are upper or lower case, but the same case should be used in the name and file path, e.g. class utils_io_PrettyPrinter should live at utils/io/PrettyPrinter_M.f90. In large/unfamiliar projects, you will spend a lot of time searching the source tree for where specific modules live and developing a convention between module name and file path can be a major time saver.

configuration file of consts to many classes

Using objective c , I have 2 classes that are using hardware, and are written in c +objC .
The other classes in the project are objective c, and they create instance of those classes.
My question .
Lets say I have classA.m and classB.m . they both have an integer const that needs to be the same say : const int numOfSamples=7;
I am seeking for the best solution to create some configuration file, that will holds all this const variables, that both A and B can see them .
I know some ways,but I wonder whats the RIGHT thing to do .
I wonder if I can just create a : configuration.m and write them into it .
to use a singleton file that holds all globals .
Number 1 seems to me the best , but how exactly should I do it ?
Thanks.
For approach 1 to work, you need to define two files:
a header file where you declare all of your constants;
a .m file where your constants are defined and initialized.
In your example:
/* .h file */
extern const int numOfSamples;
/* .m or .c file */
const int numOfSamples = 7;
Then, you include the .h header in every other file where you need those constants. Notice the extern keyword, this will declare the variable without also defining; in this way, you can include the .h file multiple times without having a duplicate symbol error.
EDIT:
The approach I suggest is the correct way to handle global variables in a C program.
Now, if global variables are a good thing or not, well, that is a longer story.
Generally speaking, global variables are tricky and go against a 40 years long effort towards better encapsulation (aka, information hiding) of data and behavior in a program (see "On the Criteria to Be Used in Decomposing Systems Into Modules", David Parnas, 1972).
To further explain this, one aspect of the problem is exactly what you mention in your comment: the possibility of one module changing the value of a global variable and thus affecting the whole behavior of the program. This is recognizedly bad and leads to uncontrollable side effects (in any non trivially sized program).
In your case, I think things are a bit different, in that you are talking about "configuration" and "const" values. This is an altogether different case than the general one and I think you could safely use a header file of consts to that aim.
That said, you understand that the whole theme of configuration is a huge one, in general. E.g., you could need mechanisms to change your program configuration on the fly; in this case the constant variable header approach would be not correct. Or, your program configuration could depend on the state of some remote system (imagine: you are logged in vs. you are not logged in).
I can't guarantee that using a header file is the best approach for your case, but I hope that the above discussion and the example I gave you can help you figure that out.
I think the best way is to use a plist file with all your configuration values.
If you have few configuration values, you can use the Info.plist file.

Ocaml naming convention

I am wondering if there exists already some naming conventions for Ocaml, especially for names of constructors, names of variables, names of functions, and names for labels of record.
For instance, if I want to define a type condition, do you suggest to annote its constructors explicitly (for example Condition_None) so as to know directly it is a constructor of condition?
Also how would you name a variable of this type? c or a_condition? I always hesitate to use a, an or the.
To declare a function, is it necessary to give it a name which allows to infer the types of arguments from its name, for example remove_condition_from_list: condition -> condition list -> condition list?
In addition, I use record a lot in my programs. How do you name a record so that it looks different from a normal variable?
There are really thousands of ways to name something, I would like to find a conventional one with a good taste, stick to it, so that I do not need to think before naming. This is an open discussion, any suggestion will be welcome. Thank you!
You may be interested in the Caml programming guidelines. They cover variable naming, but do not answer your precise questions.
Regarding constructor namespacing : in theory, you should be able to use modules as namespaces rather than adding prefixes to your constructor names. You could have, say, a Constructor module and use Constructor.None to avoid confusion with the standard None constructor of the option type. You could then use open or the local open syntax of ocaml 3.12, or use module aliasing module C = Constructor then C.None when useful, to avoid long names.
In practice, people still tend to use a short prefix, such as the first letter of the type name capitalized, CNone, to avoid any confusion when you manipulate two modules with the same constructor names; this often happen, for example, when you are writing a compiler and have several passes manipulating different AST types with similar types: after-parsing Let form, after-typing Let form, etc.
Regarding your second question, I would favor concision. Inference mean the type information can most of the time stay implicit, you don't need to enforce explicit annotation in your naming conventions. It will often be obvious from the context -- or unimportant -- what types are manipulated, eg. remove cond (l1 # l2). It's even less useful if your remove value is defined inside a Condition submodule.
Edit: record labels have the same scoping behavior than sum type constructors. If you have defined a {x: int; y : int} record in a Coord submodule, you access fields with foo.Coord.x outside the module, or with an alias foo.C.x, or Coord.(foo.x) using the "local open" feature of 3.12. That's basically the same thing as sum constructors.
Before 3.12, you had to write that module on each field of a record, eg. {Coord.x = 2; Coord.y = 3}. Since 3.12 you can just qualify the first field: {Coord.x = 2; y = 3}. This also works in pattern position.
If you want naming convention suggestions, look at the standard library. Beyond that you'll find many people with their own naming conventions, and it's up to you to decide who to trust (just be consistent, i.e. pick one, not many). The standard library is the only thing that's shared by all Ocaml programmers.
Often you would define a single type, or a single bunch of closely related types, in a module. So rather than having a type called condition, you'd have a module called Condition with a type t. (You should give your module some other name though, because there is already a module called Condition in the standard library!). A function to remove a condition from a list would be Condition.remove_from_list or ConditionList.remove. See for example the modules List, Array, Hashtbl,Map.Make`, etc. in the standard library.
For an example of a module that defines many types, look at Unix. This is a bit of a special case because the names are mostly taken from the preexisting C API. Many constructors have a short prefix, e.g. O_ for open_flag, SEEK_ for seek_command, etc.; this is a reasonable convention.
There's no reason to encode the type of a variable in its name. The compiler won't use the name to deduce the type. If the type of a variable isn't clear to a casual reader from the context, put a type annotation when you define it; that way the information provided to the reader is validated by the compiler.

Description format for an embedded structure

I have a C structure that allow users to configure options in an embedded system. Currently the GUI we use for this is custom written for every different version of this configuration structure. What I'd like for is to be able to describe the structure members in some format that can be read by the client configuration application, making it universal across all of our systems.
I've experimented with describing the structure in XML and having the client read the file; this works in most cases except those where some of the fields have inter-dependencies. So the format that I use needs to have a way to specify these; for instance, member A must always be less than or equal to half of member B.
Thanks in advance for your thoughts and suggestions.
EDIT:
After reading the first reply I realized that my question is indeed a little too vague, so here's another attempt:
The embedded system needs to have access to the data as a C struct, running any other language on the processor is not an option. Basically, all I need is a way to define metadata with the structure, this metadata will be downloaded onto flash along with firmware. The client configuration utility will then read the metadata file over RS-232, CAN etc. and populate a window (a tree-view) that the user can then use to edit options.
The XML file that I mentioned tinkering with was doing exactly that, it contained the structure member name, data type, number of elements etc. The location of the member within the XML file implicitly defined its position in the C struct. This file resides on flash and is read by the configuration program; the only thing lacking is a way to define dependencies between structure fields.
The code is generated automatically using MATLAB / Simulink so I do have access to a scripting language to help with the structure creation. For example, if I do end up using XML the structure will only be defined in the XML format and I'll use a script to create the C structure during code generation.
Hope this is clearer.
For the simple case where there is either no relationship or a relationship with a single other field, you could add two fields to the structure: the "other" field number and a pointer to a function that compares the two. Then you'd need to create functions that compared two values and return true or false depending upon whether or not the relationship is met. Well, guess you'd need to create two functions that tested the relationship and the inverse of the relationship (i.e. if field 1 needs to be greater than field 2, then field 2 needs to be less than or equal to field 1). If you need to place more than one restriction on the range, you can store a pointer to a list of function/field pairs.
An alternative is to create a validation function for every field and call it when the field is changed. Obviously this function could be as complex as you wanted but might require more hand coding.
In theory you could generate the validation functions for either of the above techniques from the XML description that you described.
I would have expected you to get some answers by now, but let me see what I can do.
Your question is a bit vague, but it sounds like you want one of
Code generation
An embedded extension language
A hand coded run-time mini language
Code Generation
You say that you are currently hand tooling the configuration code each time you change this. I'm willing to bet that this is a highly repetitive task, so there is no reason that you can't write program to do it for you. Your generator should consume some domain specific language and emit c code and header files which you subsequently build into you application. An example of what I'm talking about here would be GNU gengetopt. There is nothing wrong with the idea of using xml for the input language.
Advantages:
the resulting code can be both fast and compact
there is no need for an interpreter running on the target platform
Disadvantages:
you have to write the generator
changing things requires a recompile
Extension Language
Tcl, python and other languages work well in conjunction with c code, and will allow you to specify the configuration behavior in a dynamic language rather than mucking around with c typing and strings and and and...
Advantages:
dynamic language probably means the configuration code is simpler
change configuration options without recompiling
Disadvantages:
you need the dynamic language running on the target platform
Mini language
You could write your own embedded mini-language.
Advantages:
No need to recompile
Because you write it it will run on your target
Disadvantages:
You have to write it yourself
How much does the struct change from version to version? When I did this kind of thing I hardcoded it into the PC app, which then worked out what the packet meant from the firmware version - but the only changes were usually an extra field added onto the end every couple of months.
I suppose I would use something like the following if I wanted to go down the metadata route.
typedef struct
{
unsigned char field1;
unsigned short field2;
unsigned char a_string[4];
} data;
typedef struct
{
unsigned char name[16];
unsigned char type;
unsigned char min;
unsigned char max;
} field_info;
field_info fields[3];
void init_meta(void)
{
strcpy(fields[0].name, "field1");
fields[0].type = TYPE_UCHAR;
fields[0].min = 1;
fields[0].max = 250;
strcpy(fields[1].name, "field2");
fields[1].type = TYPE_USHORT;
fields[1].min = 0;
fields[1].max = 0xffff;
strcpy(fields[2].name, "a_string");
fields[2].type = TYPE_STRING;
fields[2].min = 0 // n/a
fields[2].max = 0 // n/a
}
void send_meta(void)
{
rs232_packet packet;
memcpy(packet.payload, fields, sizeof(fields));
packet.length = sizeof(fields);
send_packet(packet);
}

Objective C - Why do constants start with k

Why do constants in all examples I've seen always start with k? And should I #define constants in header or .m file?
I'm new to Objective C, and I don't know C. Is there some tutorial somewhere that explains these sorts of things without assuming knowledge of C?
Starting constants with a "k" is a legacy of the pre-Mac OS X days. In fact, I think the practice might even come from way back in the day, when the Mac OS was written mostly in Pascal, and the predominant development language was Pascal. In C, #define'd constants are typically written in ALL CAPS, rather than prefixing with a "k".
As for where to #define constants: #define them where you're going to use them. If you expect people who #import your code to use the constants, put them in the header file; if the constants are only going to be used internally, put them in the .m file.
Current recommendations from Apple for naming constants don't include the 'k' prefix, but many organizations adopted that convention and still use it, so you still see it quite a lot.
The question of what the "k" means is answered in this question.
And if you intend for files other than that particular .m to use these constants, you have to put the constants in the header, since they can't import the .m file.
You might be interested in Cocoa Dev Central's C tutorial for Cocoa programmers. It explains a lot of the core concepts.
The k prefix comes from a time where many developers loved to use Hungarian notation in their code. In Hungarian notation, every variable has a prefix that tells you what type it is. pSize would be a pointer named "size" whereas iSize would be an integer named "size". Just looking at the name, you know the type of a variable. This can be pretty helpful in absence of modern IDEs that can show you the type of any variable at any time, otherwise you'd always have to search the declaration to know it. Following the trend of the time, Apple wanted to have a common prefix for all constants.
Okay, why not c then, like c for "constant"? Because c was already taken, in Hungarian notation, c is for "counter" (cApple means "count of apples"). There's a similar problem with the class, being a keyword in many languages, so how do you name a variable that points to a class? You will find tons of code naming this variable klass and thus k was chosen, k as in "konstant". In many languages this word actually does start with a k, see here.
Regarding your second question: You should not use #define for constant at all, if you can avoid it, as #define is typeless.
const int x = 10; // Type is int
const short y = 20; // Type is short
const uint64_t z = 30; // Type is for sure UInt64
const double d = 5000; // Type is for sure double
const char * str = "Hello"; // Type is for sure char *
#define FOO 90
What type is FOO? It's some kind of number. But what kind of number? So far any type or no type at all. Type will depend on how and where you use FOO in your code.
Also if you have a fixed set of numbers, use an enum as then the compiler can verify you are using a valid value and enum values are always constant.
If you have to use a define, it won't matter where you define it. Header files are files you share among multiple code files, so if you need the same define in more than one place, you write it into a header file and include that header file wherever that define is needed. What you write into a code file is only visible within that code file, except for non-static functions and Obj-C classes that are both globally visible by default. But unless a function is declared in a header file and that header file is included into a code file where you want to use that function, the compiler will not know how this function looks like (what parameters it expects, what result value it returns), so it cannot check any of this and must rely that you call it correctly (usually this will cause it to create a warning). Obj-C classes cannot be used at all, unless you tell the current code file at least that this name is the name of a class, yet if you want to actually do something with that class (other than just passing it around), the compiler needs to know the interface of the class, that's why interfaces go into header files (if the class is only used within the current code file, writing interface and implementation into the file is legal and will work, too).
k for "konvention". Seriously; it is just convention.
You can put a #define wherever you like; in a header, in the .m at the top, in the .m right next to where you use it. Just put it before any code that uses it.
The "intro to objective-c" documentation provided with the Xcode tool suite is actually quite good. Read it a few times (I like to re-read it once every 2 to 5 years).
However, neither it nor any of the C books that I'm aware of will answer these particular questions. The answers sort of become obvious through experience.
I believe it is because of the former prevalence of Hungarian Notation, so k was chosen because c stood for character. ( http://en.wikipedia.org/wiki/Hungarian_notation )
--Alan