Structure of QuickTime's 'dref' atom 'alis' element - alias

I need to rewrite a QuickTime reference movie, making it point to another set of files.
I'm working in Windows environment, so I don't have acces to the QuickTime API, and being the referenced files unaccesible, I can't also use the COM interface to load the movie because it can't resolve the referenced paths.
The documentation in the "QuickTime File Format Specification" says that the 'dref' atom can have a list of 'alis', 'url ' and 'rsrc' data references. In this case I need to parse the 'alis' elements. According to the reference, "Data reference is a Macintosh alias".
So long, I have not been able to see a declaration of the structure or any related information. Do you know the structure of an alias record? Where can I find detailed information about it's structure?
Thank you a lot for your help!

The format is very similar to the sort of alias that you could generate in the Finder by right-clicking an item, and creating an alias to it.
Aside: When the QuickTime format was originally specified, Apple intelligently chose to incorporate a number of other standards and paradigms that were extensively already being used elsewhere in the OS. This is one of the reasons why QT is (or was) able to do really clever things like reference movies. Unfortunately, there's also now a lot of cruft leftover from OS features that are no longer relevant (ie. AppleShare). Back in its heyday, QuickTime was slick, especially compared to its competitors; today, it's vastly underappreciated due to the buggy Windows port, and the relatively low processing power of the desktop systems of its time.
Back ontopic, unfortunately, the format for alias files is not an open/published standard, and there is precious little documentation on the topic on the 'net. There's one really old doc that deconstructs the alias format used in Mac OS Classic. Although the structure used in OS X is very similar, the alias files themselves tend to be much larger, as they contain numerous extra data strings at the end of the file that are not documented in the above-linked documentation.
Also, aliases created in the finder do look a bit different from the ones contained within the dref atom, although I've never run through them bit-by-bit to deduce the actual differences. If you want to take a peek at what those files, and have the OS X Developer Tools installed, you can run
setfile -a a [filename]
on a Finder-generated alias to strip the file of its alias-ness so that you can look at its contents in a hex editor (otherwise, the OS will just redirect you to the linked file - doh!). You can re-set the file's alias attribute, or arbitrarily designate any file as an alias by running
setfile -a A [filename]
Unfortunately, during my experiments, dumping the alis portion of a QT movie's dref atom has never seemed to generate an alias that Mac OS was able to interpret.
Fortunately (or not, as it was in my case), the functions that Mac OS allegedly uses to create/handle aliases are part of a public API called the Alias Manager, which is part of the very-low-level CoreServices framework. If you've got time to delve into this further, you can write some code to experiment with Mac OS's built-in alias-generating and interpreting capabilities.
Unfortunately, if you're dealing with an old/buggy file, you have no way of knowing if the file was actually generated by CoreServices' Alias Manager, or if that framework has changed/evolved/regressed since then. Because it's a closed format, 3rd-party developers who opt to not use the Alias Manager can only take guesses as to the format's "legal" structure.

You can use this Java program to see what is in the header, and extract data (it's a bit old, but may still work). What is more useful, though, is the thorough discussion by the author about the Quicktime header.
But I think you may just be looking for the Apple documentation, currently found here.

Related

Emacs equivalent of Xcode's "Open Quickly"

I'm trying to get a Cocoa development environment working in Emacs, and I'm 80% of the way there. The one feature I miss is Xcode's "Open Quickly", which basically performs a fuzzy match of the string you type against the filenames referenced in the Xcode workspace and the symbols defined in those files.
My problem is that our project is huge: if I generate a TAGS file using etags for the .h and .m files in our project's sub-directories, the result is over a gig in size and Emacs complains "TAGS file is large. Really open?", and if I say yes, then Emacs hangs and becomes essentially unusable. Of course, this is before I've even considered indexing tags for system libraries. I've also tried projectile, but unfortunately it's similarly unusable on a project of my size (on the order of a full minute to find a match).
It occurs to me that all the indexing information I really want is in the Xcode projects themselves, so if I had an Emacs package that could parse them and traverse their dependencies, that might be a start, but I'm not aware of any such package.
Any suggestions/solutions in this respect?
I've never found a single function quite as convenient as Xcode's "Open Quickly", but these days I use
helm-projectile-git-grep when I want to match on strings I know to be in the filenames, and
helm-git-grep for quick searches through the contents of the files themselves.
I've found that this gets me really close to what I wanted in my original question.

Can I write to the resource fork using NSDocument?

I'd like to store some additional information along with a document, but I can't use bundles or packages, and I cannot store it inside the document itself.
The application is a text editor, and I'd like it to store code folding and bookmark locations with the document, but obviously this cannot be embedded into the code directly, and I don't want to alter the code with ugly comments.
Can I use NSDocument to store information in the resource fork of a document? If so, how can I do this? Should I directly write to <filename>/..namedfork/rsrc or is there an API available?
First, don't use the resource fork. It's virtually deprecated. Instead, use extended attributes. They can be set programmatically at the BSD level via setxattr and getxattr. Extended attributes are used in many places... for example, in the latest OS X, the resource fork itself is implemented as a special type of extended attributes.
For example, the Cocoa text system automatically adds an extended attribute to a file to specify the encoding.
I thought NSFileManager and NSFileWrapper supported extended attributes since Snow Leopard, but I can't find any documentation :p You can always use the BSD level functions, though.
Does the state need to move with the file if it's copied to another computer? If not, you could do a lot worse than emulating the way Bare Bones handles document state with BBEdit. They store state for all documents in ~/Library/Preferences/com.barebones.bbedit.PreferenceData/Document State.plist.
The resource fork documentation is here. But it contains plenty of suggestions to not use the resource fork.
I have a class on my web site for reading and writing resource forks, which I have never got around to moving to my GitHub repository because, as Yuji points out, they are not really used any more.
I was going to say alias files and web Internet location file are the only places they are used, but I used and tested it on Mac OS X v10.7 (Lion), and they are not even used there any more; they may still be used for custom icons. I didn't test for that exclusively. I will have to see how that affect my NDAlias class on 10.7.
ndresourcefork

How to get metadata from video-movie file using Objective-c?

Any help? Now can get NSSize, duration and its all.
You can do this almost entirely using Spotlight's metadata.
For example, I do the following in one of my apps
MDItemRef fileMetadata=MDItemCreate(NULL,(CFStringRef)eachPath);
NSDictionary *metadataDictionary = (NSDictionary*)MDItemCopyAttributes (fileMetadata,
(CFArrayRef)[NSArray arrayWithObjects:(id)kMDItemPixelHeight,(id)kMDItemPixelWidth,nil]);
This code essentially asks for the pixel width and height for a movie file (to determine if it's the dimension of an HD movie or not is the reason).
The Spotlight Metadata Attributes Reference lists all the available keys for various file types by category. You can probably get the required data this way without doing anything significant, provided that the media type you're examining has a Spotlight plug-in.
This functionality may not be built in (I'm honestly not sure), but I do know of two third-party libraries which can tell you the information you need.
VLCKit, the framework being used by the newest beta versions of VLC for Mac.
libmediainfo, a multi-purpose library that can read practically any bit of information you need out of practically any media file.
I can go into more depth with how to use either of these, but I'd rather only do so if you end up needing me to. Let me know!

How to decide on document file extension?

I'm writing a new document-based cross-platform chemistry application (Win, Mac, Unix), which saves files in its own format (no standard format exists for this field). I'm trying to decide on a file extension for the saved files. My questions are:
How important is it nowadays to stick to 3 characters?
Where can you check how much this file extension is already used? (Google helps, of course, but it does not tell me how much a given app is popular)
Do I really need to use a file-specific extension? My save format is gzip'ed XML, so I could name it .xml.gz, but I fear it would confuse beginning users (i.e. when you see it, it does not immediately "ring a bell").
Finally, do you have other important guidelines when choosing for your own programs?
PS: I tried to keep the right balance between "giving too little information" and "being too specific to be really useful to others". I'll happily provide more information in comments if the need arises.
FileInfo.com lists a lot of file extensions along with their own estimation of how much it is ued.
I suggest a unique extension (rather then xml.gz) so that the OS can identify the file type to users when looking at a file listing or whatever. 'Ringing a bell' is important, especially if you will have less sophisticated users.
I don't see any need to stick to 3 characters, but I wouldn't go bigger than 5 (I don't suppose I have a real reason for this, other than personal preference).
How important is it nowadays to stick to 3 characters?
It's not unless you have to support older operating systems. All current OSes handle >3 char file extensions without any problems. Think of .html, .config, .resx, and I'm sure there are more.
Where can you check how much this file extension is already used?
check out FileExt.
Do I really need to use a file-specific extension? My save
format is gzip'ed XML, so I could name
it .xml.gz, but I fear it would
confuse beginning users (i.e. when you
see it, it does not immediately "ring
a bell").
Remember that windows (and windows users) associate files with applications by extension, so using something too generic like .xml.gz may cause problems. You are probably better coming up with something that is more specific to your file type or application. Users don't care weather your format is gzipped xml internally, they care about what is in the file. Think about abstraction layers, your users will think of it as a file containing chemistry info not gzipped xml, so .chem is far more appropriate than .xml.gz
Some suggestions of things to thing about:
Obviously, don't clash with anything big - Don't use .doc, .xls, .exe, etc.
Don't clash with anything common in your industry domain that your user demographic is likely to have installed. For example, if you are writing a programming tool, don't use .cs or .cpp. You probably know your domain best, so write a list of all the apps you and your users are likely to have installed, and any of their competitors and avoid them.
Make sure your app includes the options to register and unregister the extension. don't just automatically do it in the installation, make sure it's an option.
Remember unix/linux and Mac are case sensitive, so consider sticking to always all lower case by default.
Remember CD/DVD file naming rules are stricter, so don't use non alpha numeric characters.
Finally, remember that most non-tech users are going to have file extensions turned off, so don't stress about it too much.
There is more info here.
Wikipedia has lists of files extensions here (by type) and here (alphabetical), and also some general information
Depends on the platform, but in general, not very important for newer Operating Systems. Check the documentation for the platforms you're targeting.
I'm not aware of better alternatives to Google. Hopefully someone else has a better suggestion for this one.
Not unless you have some reason to do so. Examples would be "I want to ensure that Windows always opens this program with my app". I'm not sure that your users need to be concerned with the extension anyway. The default configuration on Windows, for example, is to hide extensions for known file types. BUT if you have a compelling reason (such as allowing your program to easily identify files it should be able to handle, for example) then you could use the extension, or you could come up with something else.
I have only ever once written a program where I thought I needed to come up with my own extension. I used my initials. Then later I realized I didn't really need a special extension and reverted to ".xml". However, most extensions seem to be something that seems to mean something. (.doc for documents, etc.) so something meaningful is a good idea if you do need to go this route.
It sure depends on the OSes you want to support, but people have globally moved over the 3-characters extension limit these days: .html is well used for webpages, for example.
Of course, if you go to much longer extensions, people will stop visually recognizing it as a file extension, I think...
Barring your needing to be compatible with a specific OS that you know still has the three-letter limitation, no need to keep it to three characters. It may be useful to have a three-character version of it if you end up supporting those platforms.
The Wikipedia list of file formats is pretty good. Some mime mapping lists will list common extensions associated with those mappings. Ray already mentioned FileInfo.com.
It's a convenience thing; I'd probably go with your own but document the fact that they're just gzipped XML files conforming to a specific DTD and make it easy for users to use .xml.gz instead. Be sure that your software doesn't care about the extension, so that users could even choose their own if they wanted, although I'd tend to avoid encouraging them to by providing a reasonable default.
I'd go for typeability, clarity, uniqueness, and brevity -- in that order. For instance, .config is a lot easier to type than .q2z but it falls down on uniqueness. (I'm not suggesting it for your app; it's an example.) Similarly, .q2z is just a pain. :-) So for instance, .chemstuff is easy to type and probably not in wide use elsewhere. (Again, not a suggestion, just an example.)
Have it as document_name.app_name.xml.gz where document_name and app_name are variables, the latter some easily readable and recognisable short string of your application's title.
Modern systems are quite flexible, and there is absolutely no need to drag the 3-character extensions further along in time with us.
I agree that .xml.gz would confuse users, however keep in mind that modern systems are moving into recognizing files not based on extensions but by probing their headers and even contents instead. In fact, users do not often even see the extensions. For gzipped XML files, a system may decide to first unpack the file stream in memory, then find out it is a literal XML file, then it may take its 'xmlns' as the application identifier. However, such systems are not yet widespread use. In any case, don't make the mistake of only opening files by extension - be smart and raise the bar - do exactly the above to find out if the file can be considered a document for your application.

Process for reducing the size of an executable

I'm producing a hex file to run on an ARM processor which I want to keep below 32K. It's currently a lot larger than that and I wondered if someone might have some advice on what's the best approach to slim it down?
Here's what I've done so far
So I've run 'size' on it to determine how big the hex file is.
Then 'size' again to see how big each of the object files are that link to create the hex files. It seems the majority of the size comes from external libraries.
Then I used 'readelf' to see which functions take up the most memory.
I searched through the code to see if I could eliminate calls to those functions.
Here's where I get stuck, there's some functions which I don't call directly (e.g. _vfprintf) and I can't find what calls it so I can remove the call (as I think I don't need it).
So what are the next steps?
Response to answers:
As I can see there are functions being called which take up a lot of memory. I cannot however find what is calling it.
I want to omit those functions (if possible) but I can't find what's calling them! Could be called from any number of library functions I guess.
The linker is working as desired, I think, it only includes the relevant library files. How do you know if only the relevant functions are being included? Can you set a flag or something for that?
I'm using GCC
General list:
Make sure that you have the compiler and linker debug options disabled
Compile and link with all size options turned on (-Os in gcc)
Run strip on the executable
Generate a map file and check your function sizes. You can either get your linker to generate your map file (-M when using ld), or you can use objdump on the final executable (note that this will only work on an unstripped executable!) This won't actually fix the problem, but it will let you know of the worst offenders.
Use nm to investigate the symbols that are called from each of your object files. This should help in finding who's calling functions that you don't want called.
In the original question was a sub-question about including only relevant functions. gcc will include all functions within every object file that is used. To put that another way, if you have an object file that contains 10 functions, all 10 functions are included in your executable even if one 1 is actually called.
The standard libraries (eg. libc) will split functions into many separate object files, which are then archived. The executable is then linked against the archive.
By splitting into many object files the linker is able to include only the functions that are actually called. (this assumes that you're statically linking)
There is no reason why you can't do the same trick. Of course, you could argue that if the functions aren't called the you can probably remove them yourself.
If you're statically linking against other libraries you can run the tools listed above over them too to make sure that they're following similar rules.
Another optimization that might save you work is -ffunction-sections, -Wl,--gc-sections, assuming you're using GCC. A good toolchain will not need to be told that, though.
Explanation: GNU ld links sections, and GCC emits one section per translation unit unless you tell it otherwise. But in C++, the nodes in the dependecy graph are objects and functions.
On deeply embedded projects I always try to avoid using any standard library functions. Even simple functions like "strtol()" blow up the binary size. If possible just simply avoid those calls.
In most deeply embedded projects you don't need a versatile "printf()" or dynamic memory allocation (many controllers have 32kb or less RAM).
Instead of just using "printf()" I use a very simple custom "printf()", this function can only print numbers in hexadecimal or decimal format not more. Most data structures are preallocated at compile time.
Andrew EdgeCombe has a great list, but if you really want to scrape every last byte, sstrip is a good tool that is missing from the list and and can shave off a few more kB.
For example, when run on strip itself, it can shave off ~2kB.
From an old README (see the comments at the top of this indirect source file):
sstrip is a small utility that removes the contents at the end of an
ELF file that are not part of the program's memory image.
Most ELF executables are built with both a program header table and a
section header table. However, only the former is required in order
for the OS to load, link and execute a program. sstrip attempts to
extract the ELF header, the program header table, and its contents,
leaving everything else in the bit bucket. It can only remove parts of
the file that occur at the end, after the parts to be saved. However,
this almost always includes the section header table, and occasionally
a few random sections that are not used when running a program.
Note that due to some of the information that it removes, a sstrip'd executable is rumoured to have issues with some tools. This is discussed more in the comments of the source.
Also... for an entertaining/crazy read on how to make the smallest possible executable, this article is worth a read.
Just to double-check and document for future reference, but do you use Thumb instructions? They're 16 bit versions of the normal instructions. Sometimes you might need 2 16 bit instructions, so it won't save 50% in code space.
A decent linker should take just the functions needed. However, you might need compiler & linke settings to package functions for individual linking.
Ok so in the end I just reduced the project to it's simplest form, then slowly added files one by one until the function that I wanted to remove appeared in the 'readelf' file. Then when I had the file I commented everything out and slowly add things back in until the function popped up again. So in the end I found out what called it and removed all those calls...Now it works as desired...sweet!
Must be a better way to do it though.
To answer this specific need:
•I want to omit those functions (if possible) but I can't find what's
calling them!! Could be called from any number of library functions I
guess.
If you want to analyze your code base to see who calls what, by whom a given function is being called and things like that, there is a great tool out there called "Understand C" provided by SciTools.
https://scitools.com/
I have used it very often in the past to perform static code analysis. It can really help to determine library dependency tree. It allows to easily browse up and down the calling tree among other things.
They provide a limited time evaluation, then you must purchase a license.
You could look at something like executable compression.