Is there a limit of text data that can be stored in an executable binary? - objective-c

My OS X app features an in-application help system that consists of static strings worth roughly 4 MB of raw text data.
Normally, one would store these help texts in a lightweight database (SQLite springs to mind) bundled with the application and fetch them from it on access. Instead, for simplicity's sake, I chose to store the help text in a large NSDictionary of NSStrings, generated automatically at compile time. Access is reasonably fast, and the only "drawback" I can think of is that the NSDictionary constantly consumes 4 MB of memory even when it's not in use - which is really not an issue on modern hardware.
My solution is pragmatic and works fine for now; it makes for a compact app that doesn't spill its internal data onto disk - and yet it gives me an uneasy feeling.
So, I think my question is whether what I'm doing is okay, or whether it is bad practice in some way. Concisely:
Is it, from a technical point of view, okay to "bake in" large amounts of text into an application binary?
Is there a size limit on the static data that can be stored in (64-bit) Darwin Mach-O images?
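For scale, the "baked in" approach amounts to a compile-time-generated static table like the following - a plain C sketch of what such a generated source file does (the topic names and texts here are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generated at compile time: one entry per help topic. The strings live in
   the binary's read-only data segment, so they cost no load-time parsing. */
typedef struct { const char *key; const char *text; } HelpEntry;

static const HelpEntry kHelpTable[] = {
    { "getting-started", "To begin, open the preferences window ..." },
    { "shortcuts",       "Press Cmd-? to open this help system ..." },
};
static const size_t kHelpCount = sizeof kHelpTable / sizeof kHelpTable[0];

/* Linear lookup; fine for a few thousand entries, or sort and use bsearch. */
static const char *help_text(const char *key) {
    for (size_t i = 0; i < kHelpCount; i++)
        if (strcmp(kHelpTable[i].key, key) == 0)
            return kHelpTable[i].text;
    return NULL;
}
```

The strings add to the binary's size but are mapped lazily by the VM system, so the 4 MB isn't necessarily resident until touched.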

Also, when you find a typographical error, you have to recompile and redeploy the entire app, instead of just providing an update to the database. Keeping the text in a database makes deployment smoother for the customer.
And when it happens that your app is in such demand that you want to provide (e.g.) a German language version, you have to rebuild everything from scratch.
As a rule of thumb: small binary, large database, assets stored separately.

Related

How to deal with thousands of small audio files?

Need to implement an app that has a feature to play sounds. Each sound will be some word sound, number of expected sounds is about one thousand. So, the most simple solution would be to store those sounds as sound files, each word sound in separate sound file, and to play them on demand. Would there be any potential problems with such a large number of files?
No problem with that many files, but they will take up more space than just the total of their sizes. Each file occupies a whole number of blocks on the device. On average you will waste half a block per file (as a rule of thumb) - unless all your files are significantly smaller than one block, in which case you will always use 1,000 blocks (one per file) and waste 1000 * (blocksize - average file size).
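The waste estimate above can be made concrete. Assuming a 4 KB block size and files averaging 2.5 KB (both figures are illustrative, not measured):

```c
#include <assert.h>

/* Space actually consumed on disk: file size rounded up to whole blocks. */
static long long on_disk_size(long long file_size, long long block_size) {
    if (file_size == 0) return 0;
    long long blocks = (file_size + block_size - 1) / block_size;
    return blocks * block_size;
}

/* Total slack (wasted space) for n files of a given average size. */
static long long total_waste(long long n_files, long long avg_file_size,
                             long long block_size) {
    return n_files * (on_disk_size(avg_file_size, block_size) - avg_file_size);
}
```

With those numbers, 1,000 files waste 1000 * (4096 - 2560) = 1,536,000 bytes, i.e. about 1.5 MB of pure slack.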
Things you could do:
Concatenate the files into one big file, store the start and length of each subfile, either read the chunk into memory or copy to a temporary file.
Drop the files in a database as BLOB fields for easier retrieval. This won't save space, but may make your code simpler or more reliable.
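The first option - one big file plus a table of (offset, length) pairs - can be sketched in plain C, which is usable as-is from Objective-C. The pack layout here (raw concatenation with an in-memory index) is an assumption, not a standard format:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One index entry per sub-file in the pack. */
typedef struct { long offset; long length; } PackEntry;

/* Concatenate n buffers into `path`; fills `index` with one entry each. */
static int pack_write(const char *path, const unsigned char **bufs,
                      const long *lens, int n, PackEntry *index) {
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    long pos = 0;
    for (int i = 0; i < n; i++) {
        index[i].offset = pos;
        index[i].length = lens[i];
        fwrite(bufs[i], 1, (size_t)lens[i], f);
        pos += lens[i];
    }
    fclose(f);
    return 0;
}

/* Read one sub-file back into a malloc'd buffer using its index entry. */
static unsigned char *pack_read(const char *path, PackEntry e) {
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    unsigned char *buf = malloc((size_t)e.length);
    fseek(f, e.offset, SEEK_SET);
    fread(buf, 1, (size_t)e.length, f);
    fclose(f);
    return buf;
}
```

In a real app you would persist the index alongside the pack (or as a header in it) so it survives restarts.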
I don't think you need to build your own caching mechanism. Most likely iOS has a system-wide cache that does a far better job. That only becomes relevant if you experience performance issues and need much shorter load times. In that case, perhaps consider using blocks for loading and dispatching the playback, as that's an easy way to hide the load latency and avoid UI freezes.
If your audio is uncompressed, the App Store will report the compressed size. If that differs a lot from the unpacked size, some (nitpicking) customers will definitely notice and complain, as they think the advertised size is the install size. I know from personal experience. They will generally not take a technical answer for an answer, and may even bypass talking to you and just downvote you based on this. I s#it you not.
You should be fine storing 1000 audio clip files within the IPA but it is important to take note about the space requirements and organisation.
Also take into consideration that accessing the disk is slower than accessing memory and drains more battery, so it may be ideal to load the most frequently used audio clips into memory.
If you can afford it, use FMOD, which I believe can extract audio from various compression schemes. If you want to handle all those files yourself, create a .zip file and extract them on the fly using libz (libz.dylib on iOS).

Store images in sqlite or just a reference to it?

I have made a couple of apps using Core Data, and I was storing images in SQLite, but somewhere I read that this is bad. I've searched the net, but all I've found is this suggestion:
image size < 100kb store in the same table as the relevant data
image size < 1mb store in a separate table attached via a relationship
to avoid loading unnecessarily
image size > 1mb store on disk and reference it inside of Core Data
So my question is: what are the pros and cons of saving an image in the SQLite db as NSData, versus storing just a reference while the image itself is saved in the file system?
Apple provides some guidance on this topic in its guide on Core Data Performance. In general, although SQLite scales pretty well and can handle databases that are many gigabytes in size with ease, large binary blobs are not queryable or indexable, and they inflate the size of the database with little return.
If you're targeting iOS 4 and above, you can set the "Allows External Binary Data Storage" flag on your attributes that contain such data, and Core Data will automatically store them separately on the file system (if it deems appropriate), and automatically manage the link to that data in your data store.
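The rule of thumb quoted in the question can be written down as a tiny decision helper. The thresholds come from that advice, not from anything Apple specifies:

```c
#include <assert.h>

typedef enum {
    STORE_IN_ROW,             /* blob lives in the same table as its record */
    STORE_IN_SEPARATE_TABLE,  /* blob behind a relationship, loaded lazily  */
    STORE_ON_DISK             /* file on disk, only a path in the database  */
} BlobStrategy;

/* Pick a storage strategy from the blob size, per the <100 KB / <1 MB rule. */
static BlobStrategy strategy_for_size(long bytes) {
    if (bytes < 100 * 1024)  return STORE_IN_ROW;
    if (bytes < 1024 * 1024) return STORE_IN_SEPARATE_TABLE;
    return STORE_ON_DISK;
}
```

With external binary data storage enabled, Core Data makes a similar decision for you behind the scenes.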
Benefits: Not so sure, but I can think of couple of benefits of storing just links in the database.
Native file-system access would be faster than fetching image blobs through SQLite (overall faster performance).
Clean and scalable database -- (with size being the concern, migration would be easier)
You may want to check the answer I got for a similar, if not identical, question. Like you, I only found people giving advice; no one really provided benchmarks or a real technical answer.
Provide example for why it is not advisable to store images in CoreData?
Besides that, now that my app has been built with all images in the db and shipped to the App Store,
I can tell you that things are easier if you use iCloud. If you use small images in a UITableView with thumbnail icons, you can completely avoid asynchronous image loading.
One piece of advice: provide an entity for each image size, rather than storing all sizes in a set attached to the main entity.
The only downside I found with iCloud is the larger transaction log generated each time I change an image. But in my case the images are small, and the need to update them is rare. Also, iCloud+CoreData is quite buggy at the moment, so I removed it before shipping; for now it is really not a problem for me.

Writing many values received in real time to database on iPhone with SQLite3

I'm currently writing an iOS app and I have many records that I'm writing to a database.
Even though the iPhone writes to flash memory, RAM still has a faster access time.
To improve performance, I write to a temporary cache in RAM and then, at some point, append that cache to the database.
What is a standard practice / technique with knowing how often to write the cache to the database?
How can I fine tune this?
Thanks in advance!
I had a similar issue with a cache that needed to be flushed to a server instead of a local DB. I used Instruments to find the "typical" size of one of the cached objects (mine were fairly uniform), and I simply maintain a count of how many are in the cache; when I cross the threshold, I empty the cache to the server. I later learned about NSCache, which has much of this same behavior. I investigated ways to dynamically determine the size of each object in the cache, but found it tedious and brittle.
Basically, I think you need to decide what makes sense for your app based on the usage characteristics gathered with Instruments. I found the video from the 2011 WWDC conference, "Session 318 - iOS Performance in Depth", to be very helpful for similar situations. You can find it on iTunes U.
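The count-based threshold described above can be sketched in a few lines of C. The threshold value is an assumed tuning parameter - in practice you would pick it by measuring with Instruments, as suggested:

```c
#include <assert.h>

#define FLUSH_THRESHOLD 64  /* assumed value; tune by measuring with Instruments */

typedef struct {
    int count;    /* records currently buffered in RAM */
    int flushes;  /* how many batch writes to the database have happened */
} RecordCache;

/* Buffer one record; once the count crosses the threshold, flush the whole
   batch to the database in one transaction. The actual write is a stub. */
static void cache_add(RecordCache *c) {
    c->count++;
    if (c->count >= FLUSH_THRESHOLD) {
        /* ... write all buffered records inside a single transaction ... */
        c->flushes++;
        c->count = 0;
    }
}
```

Batching like this also plays well with SQLite specifically, since wrapping many inserts in one transaction avoids a commit (and an fsync) per record.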

Saving large objects to file

I'm working on a project in Objective-C where I need to work with large quantities of data stored in an NSDictionary (around ~2 GB in RAM at most). After all the computations that I perform on it, it seems like it would be quicker to save/load the data when needed (versus re-parsing the original file).
So I started to look into saving large amounts of data. I've tried using NSKeyedArchiver and [NSDictionary writeToFile:atomically:], but both failed with malloc errors (Can not allocate ____ bytes).
I've looked around SO, Apple's dev forums, and Google, but was unable to find anything. I'm wondering if it might be better to create the file bit by bit instead of all at once, but I can't find any way to append to an existing file. I'm not completely opposed to saving to a bunch of small files, but I would much rather use one big file.
Thanks!
Edited to include more information: I'm not sure how much overhead NSDictionary adds, as I don't keep all the information from the text files. I have a 1.5 GB file (of which I keep about half), and it turns out to be around 900 MB to 1 GB in RAM. There will be some more data that I need to add eventually, but it will be constructed from references to what's already loaded into memory - it shouldn't double the size, but it may come close.
The data is all serial, and could be separated in storage, but needs to all be in memory for execution. I currently have integer/string pairs, and will eventually end up with string/strings pairs (with all the values also being a key for a different set of strings, so the final storage requirements will be the same strings that I currently have, plus a bunch of references).
In the end, I will need to associate ~3 million strings with some other set of strings. However, the only important thing is the relationship between those strings - I could hash all of them, but NSNumber (as NSDictionary needs objects) might give me just as much overhead.
NSDictionary isn't going to give you the scalable storage that you're looking for, at least not for persistence. You should implement your own type of data structure/serialisation process.
Have you considered using an embedded SQLite database? Then you could process the data while loading only a fragment of the data structure at a time.
If you can, rebuilding your application in 64-bit mode will give you a much larger heap space.
If that's not an option for you, you'll need to create your own data structure and define your own load/save routines that don't allocate as much memory.
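A minimal sketch of such a load/save routine: streaming length-prefixed key/value pairs to disk, so that neither writing nor reading needs the whole multi-gigabyte structure in one allocation. The record format here is an assumption, chosen only to show the incremental approach (note the "ab" mode, which appends to an existing file - the thing the question couldn't find a way to do):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Append one pair as [klen][vlen][key bytes][value bytes]. */
static int kv_append(const char *path, const char *key, const char *val) {
    FILE *f = fopen(path, "ab");  /* "ab" creates or appends, never truncates */
    if (!f) return -1;
    unsigned int klen = (unsigned int)strlen(key);
    unsigned int vlen = (unsigned int)strlen(val);
    fwrite(&klen, sizeof klen, 1, f);
    fwrite(&vlen, sizeof vlen, 1, f);
    fwrite(key, 1, klen, f);
    fwrite(val, 1, vlen, f);
    fclose(f);
    return 0;
}

/* Stream through the file looking up one key; O(1) memory per record. */
static int kv_find(const char *path, const char *key, char *out, size_t outsz) {
    FILE *f = fopen(path, "rb");
    if (!f) return 0;
    unsigned int klen, vlen;
    int found = 0;
    while (!found && fread(&klen, sizeof klen, 1, f) == 1 &&
           fread(&vlen, sizeof vlen, 1, f) == 1) {
        char *k = malloc(klen + 1);
        char *v = malloc(vlen + 1);
        fread(k, 1, klen, f); k[klen] = '\0';
        fread(v, 1, vlen, f); v[vlen] = '\0';
        if (strcmp(k, key) == 0 && vlen + 1 <= outsz) {
            strcpy(out, v);
            found = 1;
        }
        free(k); free(v);
    }
    fclose(f);
    return found;
}
```

For 3 million pairs you'd add a sorted offset index (or just use SQLite, which is essentially this with the bookkeeping done for you), but the point is that each record is written and read independently, so peak memory stays flat.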

How important is size in an application?

When creating applications (Java, run on a normal computer), how important is program size to users? For example, would it be necessary to replace .png files with .jpg files, convert .wav files to .midi, or strip down libraries to save space? Or do users generally not care if my program is 5 MB when it could be 50 KB stripped down?
Thanks.
That depends on the delivery mechanism.
Size is generally only relevant in terms of the bandwidth required to download it. If users download it often, it matters a lot. If it's only once, it matters less, and you have to weigh the time involved in reducing the size against how much space you save.
After that, nobody cares until you get into gigabytes. Well, mobile applications will probably start caring at about 10MB+.
Users definitely care (after all, not only does space cost money, but size affects program load time). However, the question is how much you optimize. I suggest the 80/20 rule: 80% of your benefit comes from the first 20% of the effort.
If you use a utility like TreePie you might be able to see what parts of a large application are consuming most of your resources. If you find it's just a few large images, or one big DLL with a bunch of embedded resources, it's probably worth taking a look at reducing the size, if it's easy.
But there's a cost/benefit tradeoff. I just saw a terabyte drive for $100 the other day. Saving the user 1 gig is about 10 cents in terms of storage space, plus some hard-to-quantify amount of time saved on every load. If you have 100,000 users, it's probably worth your time to optimize a bit; but if you're writing custom software for one user, it's probably not worth it unless they're complaining.
As mentioned by Graham Lee, a great deal of this is very dependent on your users. If you are writing something that needs to be optimized to fit on the chip of a 68000 processor, then you'd better believe that program size matters. Assuming you're not programming 30 years ago, you probably won't run across that particular issue.
But in general, you should make your application as small as possible while still achieving the quality you want. That is to say, if your application is likely to be viewed on a 640x480 screen, then you don't need hi-res 6 MB PNGs for all your images. On the other hand, if your application is designed to be blown up on a big screen at conferences, then you probably want to upsize your images.
Another very common option is creating installers with separate options ranging from full to minimal. That way you can let your users decide whether size matters to them. It allows you to ship the pretty version of your app, and a scaled-back version that doesn't include tutorials or mp3 files of a soothing woman's voice telling you that you've pushed the wrong button.
Know your users. And if you don't, then let them decide for themselves.
Consider yourself: which would you use? Would you rather save space with a 5 KB program or waste it with a 5 MB one?
I think that smaller is better, especially if the program doesn't use/need much graphics and can be optimized.
I would say not important at all, unless it's obscenely large.
I would argue that startup time is far more important to users than application size.
However, if you include a lot of media files with your system, it is logical to optimise this data as much as possible. But don't compromise quality - switching to JPEG might be okay for photos, but it sucks for technical diagrams. A .wav could be an .aac or .mp3, but not if you're writing a professional audio application.