How to deal with thousands of small audio files?

I need to implement an app that has a feature to play sounds. Each sound will be a spoken word, and the expected number of sounds is about one thousand. So the simplest solution would be to store each word in a separate sound file and play the files on demand. Would such a large number of files cause any problems?

No problem with that many files, but they will take up more space than just the total of their sizes. Each file occupies a whole number of storage blocks on the device. As a rule of thumb you will then waste half a block per file on average, unless all your files are significantly smaller than one block, in which case you will always use 1000 blocks (one per file) and waste 1000 * (block size - average file size).
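To put numbers on that rule of thumb, here is a small sketch in Python (used only as a neutral notation, not iOS code). The 4096-byte block size and the 1500-byte clip size are made-up values, and real file systems may behave differently, for example by storing very small files inline.

```python
def allocated_bytes(file_size, block_size):
    """Bytes occupied on disk, assuming a file always takes whole blocks."""
    blocks = -(-file_size // block_size)  # ceiling division without floats
    return blocks * block_size


def wasted_bytes(file_sizes, block_size):
    """Total slack space: allocated bytes minus the bytes actually used."""
    return sum(allocated_bytes(size, block_size) - size for size in file_sizes)


# Hypothetical case: 1000 clips of 1500 bytes each on 4096-byte blocks.
clip_sizes = [1500] * 1000
print(wasted_bytes(clip_sizes, 4096))  # 1000 * (4096 - 1500) = 2596000
```

With these assumed numbers the clips hold about 1.5 MB of audio but occupy about 4.1 MB on disk.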
Things you could do:
Concatenate the files into one big file, store the start and length of each subfile, either read the chunk into memory or copy to a temporary file.
Drop the files in a database as BLOB fields for easier retrieval. This won't save space, but may make your code simpler or more reliable.
I don't think you need to make your own caching mechanism. Most likely iOS has a system-wide cache that does a far better job. A custom cache should only be relevant if you experience performance issues and need much shorter load times. In that case, perhaps consider using blocks for loading and dispatching the playing, as that's an easier way to hide the load latency and avoid UI freezes.
If your audio is uncompressed, the App Store will report the compressed size. If that differs a lot from the unpacked size, some (nitpicking) customers will definitely notice and complain, as they think the advertised size is the install size. I know from personal experience. They will generally not take a technical answer for an answer, and may even bypass talking to you and just downvote you based on this. I s#it you not.
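The first suggestion in the list above (one big file plus the start and length of each subfile) can be sketched roughly like this. It is a Python illustration, not iOS code; `build_pack`, `read_clip`, the `words.pack` name and the placeholder bytes are all hypothetical.

```python
import os
import tempfile


def build_pack(clips, pack_path):
    """Write all clips back to back and return {name: (offset, length)}."""
    index = {}
    with open(pack_path, "wb") as pack:
        for name, data in clips.items():
            index[name] = (pack.tell(), len(data))
            pack.write(data)
    return index


def read_clip(pack_path, index, name):
    """Seek to the clip's start and read only its bytes into memory."""
    offset, length = index[name]
    with open(pack_path, "rb") as pack:
        pack.seek(offset)
        return pack.read(length)


# Demo with placeholder bytes standing in for real audio data.
with tempfile.TemporaryDirectory() as folder:
    pack_path = os.path.join(folder, "words.pack")
    index = build_pack({"hello": b"HELLO-AUDIO", "world": b"WORLD-AUDIO!"}, pack_path)
    print(index["world"])  # (11, 12)
    print(read_clip(pack_path, index, "world"))
```

In a real app the index would have to be stored somewhere as well, for example in a small side file shipped next to the pack.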

You should be fine storing 1000 audio clip files within the IPA, but it is important to take note of the space requirements and organisation.
Also take into consideration that accessing the disk is slower than accessing memory and uses more battery, so it may be ideal to load the most frequently used audio clips into memory.
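One way to keep hot clips in memory is a small bounded cache. The sketch below is Python and evicts the least recently used clip, which is only an approximation of "most frequently used". `ClipCache` and the `loader` callback are hypothetical names; `loader` stands for whatever reads a clip from disk.

```python
from collections import OrderedDict


class ClipCache:
    """Keeps at most `capacity` clips in memory, evicting the least recently used."""

    def __init__(self, loader, capacity):
        self._loader = loader        # function: clip name -> bytes (reads from disk)
        self._capacity = capacity
        self._clips = OrderedDict()  # oldest entry first, newest last

    def get(self, name):
        if name in self._clips:
            self._clips.move_to_end(name)    # mark as recently used
            return self._clips[name]
        data = self._loader(name)            # cache miss: go to disk
        self._clips[name] = data
        if len(self._clips) > self._capacity:
            self._clips.popitem(last=False)  # drop the least recently used clip
        return data
```

Whether this beats the operating system's own file cache is something only profiling on a device can tell.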

If you can afford it, use FMOD, which I believe can extract audio from various compression schemes. If you just want to handle all those files yourself, create a .zip file and extract them on the fly using libz (the iOS library libz.dylib).
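To show what "extract on the fly" means, here is a sketch using Python's standard `zipfile` module rather than libz on iOS. `make_archive` and `extract_clip` are hypothetical helper names, and the archive is built in memory only to keep the example self-contained.

```python
import io
import zipfile


def make_archive(clips):
    """Build a .zip archive in memory from {name: bytes}."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in clips.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def extract_clip(archive_bytes, name):
    """Decompress a single member without unpacking the rest of the archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return archive.read(name)
```

Note that audio formats which are already compressed will not shrink much further inside a zip; the gain is mainly having one file instead of a thousand.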

Related

Thumbnail storage strategy

I am working on a portion of an app that requires a "Photos" type presentation of multiple thumbnail images. The full size images are quite large, and generating the thumbnails every time is taking too long, so I am going to cache the thumbnails.
I am having a hard time determining how best to store the thumbnails on the filesystem once I create them. I can think of a few possibilities but I don't like any of them:
Save the thumbnail in the same directory as the original file, with _Thumb added to the filename (image.png and image_Thumb.png). This makes for a messy directory and I would think performance would become a problem because of reading so many different files to load at once.
Save the thumbnails in their own sub-directory, with the same filename as the original. I think that this is slightly cleaner, but I'm still opening lots of different files.
Save all of the thumbnails to a Thumbnails file. I think that this is commonly done in Windows and OS X? I like the idea because I can open one file and read multiple thumbnails from it, but I'm not sure how to store all of them in the same file and associate them with the original files. EDIT: I thought of using NSKeyedArchiver/unArchiver but from what I can find, anytime a thumbnail is added/removed, I would have to re-create the entire archive. Perhaps there is something that I am overlooking?
EDIT Store the thumbnails in a core data/sqlite database file. I have heard over the years that it is a bad idea to store images in a database file due to slow performance and the possibility of database corruption on writes that take a (relatively) long time to complete. Does anyone have experience using either one this way?
Any suggestions on the best approach to take?

I would go for the second option. iDevices use flash memory, so the performance penalty for accessing many files is very low compared to HDDs. You can also cache some thumbnails in memory to avoid reading the same file too often. The SDWebImage caching mechanism is a great example of how to do it.
The third option, using one file, would probably mean using a database file. You could get some performance improvements there if you store uncompressed data. You'll need to do some performance tests, because loading more data (the uncompressed form of the thumbs) might slow things down: you save CPU time but pay for it with more storage access.
A combined approach would be to store the thumbnails as files but in an uncompressed format (not .jpg, .png etc.).
A fourth option worth considering, as long as the thumbnails are reasonably small: save them in CoreData.
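For the database route, the point is that single thumbnails can be added or removed without rewriting everything, unlike an archive. The sketch below uses plain SQLite through Python's `sqlite3` module as a stand-in for Core Data; the `thumbs` table and the function names are made up for illustration.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the thumbnail store; one row per original image."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS thumbs ("
        "original TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return db


def put_thumb(db, original, data):
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT OR REPLACE INTO thumbs (original, data) VALUES (?, ?)",
            (original, data),
        )


def get_thumb(db, original):
    row = db.execute(
        "SELECT data FROM thumbs WHERE original = ?", (original,)
    ).fetchone()
    return None if row is None else bytes(row[0])


def delete_thumb(db, original):
    with db:
        db.execute("DELETE FROM thumbs WHERE original = ?", (original,))
```

Keying rows by the original file's name also answers the question of how to associate each thumbnail with its source image.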

Is there any performance difference between creating an NSFileHandle for a large versus a small file?

This question strikes me as almost silly, but I just want to sanity check myself. For a variety of reasons, I'm welding together a bunch of files into a single megafile before packing this as a resource in my iOS app. I'm then using NSFileHandle to open the file, seek to the right place, and read out just the bytes I want.
Is there any performance difference between doing it this way and reading loose files? Or, supposing I could choose to use just one monolithic megafile, versus, say, 10 medium-sized (but still joined) files, is there any performance difference between "opening" the large versus a smaller file?
Since I know exactly where to seek to, and I'm reading just the bytes I want, I don't see how there could be a difference. But, hey -- Stranger things have proved to be. Thanks in advance!

There could be a difference if it was an extremely large number of files. Every open file uses up resources in memory (file handles, and the like), and on some storage devices, a file will take up an entire block even if it doesn't fill it. That can lead to wasted space in extreme cases. But in practice, it probably won't be a problem. To know for sure, you can profile your code and see if it's faster one way vs. the other, and see what sort of space it takes up on a typical device.

Saving large objects to file

I'm working on a project in Objective-C where I need to work with large quantities of data stored in an NSDictionary (at most around 2 gigs in RAM). After all the computations that I perform on it, it seems like it would be quicker to save/load the data when needed (versus re-parsing the original file).
So I started to look into saving large amounts of data. I've tried using NSKeyedArchiver and [NSDictionary writeToFile:atomically:], but both failed with malloc errors (Can not allocate ____ bytes).
I've looked around SO, Apple's Dev forums and Google, but was unable to find anything. I'm wondering if it might be better to create the file bit-by-bit instead of all at once, but I can't find any way to add to an existing file. I'm not completely opposed to saving with a bunch of small files, but I would much rather use one big file.
Thanks!
Edited to include more information: I'm not sure how much overhead NSDictionary gives me, as I don't take all the information from the text files. I have a 1.5 gig file (of which I keep ~1/2), and it turns out to be around 900 megs through 1 gig in ram. There will be some more data that I need to add eventually, but it will be constructed with references to what's already loaded into memory - it shouldn't double the size, but it may come close.
The data is all serial, and could be separated in storage, but needs to all be in memory for execution. I currently have integer/string pairs, and will eventually end up with string/strings pairs (with all the values also being a key for a different set of strings, so the final storage requirements will be the same strings that I currently have, plus a bunch of references).
In the end, I will need to associate ~3 million strings with some other set of strings. However, the only important thing is the relationship between those strings - I could hash all of them, but NSNumber (as NSDictionary needs objects) might give me just as much overhead.

NSDictionary isn't going to give you the scalable storage that you're looking for, at least not for persistence. You should implement your own type of data structure/serialisation process.
Have you considered using an embedded SQLite database? Then you can process the data while loading only a fragment of the data structure at a time.

If you can, rebuilding your application in 64-bit mode will give you a much larger heap space.
If that's not an option for you, you'll need to create your own data structure and define your own load/save routines that don't allocate as much memory.
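As a rough idea of what such load/save routines could look like, here is a Python sketch that appends integer/string pairs to a file one record at a time and reads them back lazily, so the whole data set is never serialised in one allocation. The record layout (a 64-bit key, a 32-bit length, then UTF-8 bytes) and the function names are assumptions made for this example.

```python
import struct

# Assumed record layout: 64-bit signed key, 32-bit byte length, then UTF-8 text.
_HEADER = struct.Struct("<qI")


def append_pairs(path, pairs):
    """Append (int, str) records to the file without loading what is already there."""
    with open(path, "ab") as out:
        for key, text in pairs:
            raw = text.encode("utf-8")
            out.write(_HEADER.pack(key, len(raw)))
            out.write(raw)


def iter_pairs(path):
    """Yield (int, str) records one at a time instead of reading the whole file."""
    with open(path, "rb") as src:
        while True:
            header = src.read(_HEADER.size)
            if not header:
                return  # clean end of file
            if len(header) < _HEADER.size:
                raise ValueError("truncated record header")
            key, length = _HEADER.unpack(header)
            raw = src.read(length)
            if len(raw) < length:
                raise ValueError("truncated record body")
            yield key, raw.decode("utf-8")
```

Because the file is opened in append mode, the data can be written in batches as it is computed, which is the "bit-by-bit" approach the question asks about.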

considerations for saving data to ONE file or MULTIPLE?

I am going to be saving data with DPAPI encryption. I am not sure whether I should have one big file with all the data or break the data up into separate files, where every file is its own record. I suspect the entire dataset would be less than 10 MB, so I am not sure whether it's worth breaking it down into a few hundred separate files or whether I should just keep it as one file.
Will it take a long time to decrypt 10 MB of data?

For 10 megabytes, I wouldn't worry about splitting it up. The cost of encrypting/decrypting
a given volume of data will be pretty much the same, whether it's one big file or a
group of small files. If you needed the ability to selectively decrypt individual records,
as opposed to all at once, splitting the file might be useful.

If you can't predict the hardware your app is going to run on, make it scalable. It can then run from 10 parallel floppy drives if it's too slow reading from 1.
If your scope is limited to high-performance computers, and the file size is not likely to grow within the next 10 years, put it in 1 file.

How important is size in an application?

When creating applications (Java, run on a normal computer), how important is program size for users? For example, would it be necessary to replace .png's with .jpg's, convert .wav's to .midi's, or strip down libraries to save space, or do users generally not care if my program is 5 MB when it could be 50 KB if stripped down?
Thanks.

That depends on the delivery mechanism.
Size is generally only relevant in terms of the bandwidth required to download it. If it is downloaded often, then it matters a lot. If it's only downloaded once, it matters less, and you have to weigh the time involved in reducing the size against how much space you save.
After that, nobody cares until you get into gigabytes. Well, mobile applications will probably start caring at about 10MB+.

Users definitely care (after all, not only does space cost money, it also affects program load time). However, the question becomes how much you should optimize. I suggest the 80/20 rule: 80% of your benefit comes from the first 20% of the effort.
If you use a utility like TreePie you might be able to see what parts of a large application are consuming most of your resources. If you find it's just a few large images, or one big DLL with a bunch of embedded resources, it's probably worth taking a look at reducing the size, if it's easy.
But there's a cost/benefit tradeoff. I just saw a terabyte drive for $100 the other day. Saving the user 1 gig is worth about 10 cents in terms of storage space, plus some hard-to-quantify amount of time spent loading every time they start the program. If you have 100,000 users, it's probably worth your time to optimize a bit, but if you're writing custom software for one user it's probably not worth it unless they're complaining.

As mentioned by Graham Lee, a great deal of this is very dependent on your users. If you are writing something that needs to be optimized to fit on the chip of a 68000 processor, then you'd better believe that program size matters. Assuming you're not programming 30 years ago, you probably won't run across that particular issue.
But in general, you should be making your application as small as possible while still achieving the quality you want. That is to say, if your application is likely to be viewed on a 640x480 screen, then you don't need hi-res 6 MB PNGs for all your images. On the other hand, if your application is designed to be blown up on a big screen at conferences, then you probably want to upsize your images.
Another very common option is creating installers with separate options ranging from full to minimal. That way you can allow your users to decide whether size matters to them. It allows you to create the pretty pretty version of your app, and a scaled back version that doesn't include tutorials or mp3 files of a soothing woman's voice telling you that you've pushed the wrong button.
Know your users. And if you don't, then let them decide for themselves.

Consider yourself, what would you use? Would you rather save space with 5KB programs or waste it with 5MB programs?
I think that smaller is better, especially if the program doesn't use/need much graphics and can be optimized.

I would say not important at all, unless it's obscenely large.

I would argue that startup time is far more important to users than application size.
However if you include a lot of media files with your system it is logical to optimise this data as much as possible. But don't compromise the quality - switching to jpeg might be okay for photos, but it sucks for technical diagrams. A .wav could be an .aac or .mp3, but not if you're writing a professional audio application.
