Storing large amounts of text in Core Data

Storing large amounts of text in Core Data - objective-c

I'm trying to see what the best way to store large amounts of text (more than 255 characters) in Cocoa would be. Being a big fan of Core Data, I would assume there's an effective way to do so. However, I feel like 'string' is the wrong data type for this type of thing. Does anyone have any info on this? I don't see an option for BLOB in Core Data

Well you can't very well compress the text or store it as a binary that must be translated, otherwise you give up SQLite's querying speed (because all text-stored-as-binary-encoded-data) records must be read into memory, translated/decompressed, then searched). Otherwise, you'd have to mirror (and maintain) the text-only representation in your Core Data store alongside the more full-featured stuff.
How about a hybrid solution? Core Data stores all but the actual text; the text itself is archived a one-file-per-entry-in-Core-Data on the file system. Each file named for its unique identifier in the Core Data store. This way a search could do two things (in the background, of course): search the Core Data store for things like titles, dates, etc; search the files (maybe even with Spotlight) for content search. If there's a file search match, its file name is used to find the matching record in Core Data for display in your app's UI.
This lets you leverage your app-specific internal search criteria and Spotlight's programmatic asynchronous search. It's a little more work, granted, but if you're talking about a LOT of text, I can't think of a better way.

The BLOB data type is called "Binary data" in Core Data. As middaparka has pointed out, the Core Data Programming Guide offers some guidance on how to deal with binary data in Core Data. Depending on your requirements, an alternative to using BLOBs would be to just store references to files on disk.

I'd recommend a read of Apple's Core Data Programming Guide (specifically the "Core Data Performance" section). This specifically mentions BLOBs (see the "Large Data Objects (BLOBs)" section) and gives some, albeit vague, guidelines.

Related

best way of searching a list of files suign core data

I currently hace a core data project where I'm handling a list of file with some customs meta data, this list can be huge, so I'm trying figure it out what will be the best way to search for properties (meta data) with core data, to be honest what I want to know is if core data is already optimized to handle quick search or I need to create some sort of algorything like a binary tree, what will be best way to do this, I just need some guidance, thanks for any help

Yes if you create a core Data model with indexed properties then searches will be fast. However this does not take into consideration the overhead associated with loading and maintaining the list in core data

Saving Data - Core Data vs plist file

I'm writing an iOS applications that saves Music albums(just an exercise I'm doing for the fun of it) .
For every Album there's a singer, song names, time, and a picture
The final result will be a lot of objects with a lot of details including a picture attached to every object. Should I even consider doing something like that with plist? (can pictures be stored in a plist?)
What's the best way to save and access that data?
I'm new to iOS and from the training videos I've seen Core Data is not recommend for the beginner user. Is that really the case?
If I'm going with plist, should I create one plist for every genre for example rap.plist , rock.plist etc' or just a big data.plist?
Thanks

I would go for core data. If you choose the right template when you create your new project in xcode then reduce the once-off overhead work significantly.
With that simple structure I would say that the templates provides nearly everything you need. Just define your model and layout and off you go.
There is just the images where I would spend a bit more time in thinking it over. I personally never put the image data into core data itself. I rather stored them as file and within my core data model I just stored the path and filename to access it. As file name I simply used a timestamp. You could use some auto-increment or other unique id technique but a time stamp would do as well. It will not be visible to the user anyway.

I think the best way you can do this, since you are new to IOS is by using sqlite. Save all the information you want in your local database and display it on the screen.
You can use plist if you have data structure that is small.
Note that property lists should be used for data that consists primarily
of strings and numbers. They are very inefficient when used with large blocks
of binary data. A property list is probably the easiest to maintain, but it will be loaded into memory all at once. This could eat up a lot of the device's memory.
With Sqlite you will easily be able delete , edit, insert your data into the database.
Core data also uses sqlite for data storage only it helps you to manage your data objects, their relationships and dependencies with minimal code.
And since your are new getting started with core data would not be such a good idea i think.. so i would suggest start off with normal sqlite. Save the data in one of your folders of your app and store their path in the database.
You dont have to write different plists.. You can use the same one if you are using.
EDIT : here is a link that will help you with learning sqlite
http://www.iosdevelopment.be/sqlite-tutorial/

you need some more code to set up the core data stack (the store coordinator, the store, the object model, and a context)
it is a tad more complicated but that shouldnt scare you off.
Reading a plist is indeed dead easy but while good for smaller data (like the info.plist) it doesnt scale and soon you need a fullblown DB

As you edited your original question an decided to go with plist now.
In that case I would go for one plist per ablum and one overall plist for the list of albums.
You could, of course, use more plists for categories etc.
However, if you are thinking of data structures like categories you are far better off with core data. Especially when it comes to searching.

No one seems to be mentioning SQLLite, I would go that way and for reasons that I explain here ( https://stackoverflow.com/a/12619813/1104563 ). Hope this helps!

coredata is a apple provided persistant tool, while plist is XML file. The reason why core data is not recommended for beginner, I think, is core data is more difficult than plist from programming perspective. For your application, obviously core data is more suitable. But alternatively, you may also use archive file, that's between core data and plist.

is there an ocaml library store/use data structure on disk

like bdb. However, I looked at the ocaml-bdb, seems like it's made to store only string. My problem is I have arrays that store giant data. Sure, I can serialize them into many files, or encode/decode my data and put them on database or those key-value db things, which is my last resort. I'm wondering if there's a better way.

The HDF4 / HDF5 file format might suit your needs. See http://forge.ocamlcore.org/projects/ocaml-hdf/

In addition to the HDF4 bindings mentioned by jrouquie there are HDF5 bindings available (http://opam.ocaml.org/packages/hdf5/). Depending on the type of data you're storing there are bindings to GDAL (http://opam.ocaml.org/packages/gdal/).
For data which can fit in a bigarray you also have the option of memory mapping a large file on disk. See https://caml.inria.fr/pub/docs/manual-ocaml/libref/Bigarray.Genarray.html#VALmap_file for example. While it ties you to a rather strict on-disk format, it does make it relatively simple to manipulate arrays which are larger than the available RAM.

there was an ocaml BerkeleyDB wrapper in the past:
OCamlDB
Apparently someone looked into it recently:
recent patch for OCamlDB
However, the GDAL bindings from hcarty are probably production ready and in intensive usage somewhere.
Also, there are bindings for dbm in opam: dbm and cryptodbm

HDF5 is prolly the answer, but given the question is somewhat vague, another solution is possible.
Disclaimer: I don't know ocaml (but I knew caml-light) and I know berkeley database (AKA. bsddb (AKA bdb)).
However, I looked at the ocaml-bdb, seems like it's made to store only string.
That maybe true in ocaml-bdb but in reality it stores bytes. I am not sure about your case, because in Python2 there was no difference between bytes and strings of unicode chars. It's until recently that Python 3 got a proper byte type and the bdb bindings take and spit bytes. That said, the difference is subtile but you'd rather work with bytes because that what bdb understand and use.
My problem is I have arrays that store giant data. Sure, I can serialize them into many files, or encode/decode my data and put them on database
or use those key-value db things, which is my last resort.
I'm wondering if there's a better way.
It depends on you need and how the data looks.
If the data can all stay in memory, you'd rather dump memory to a file and load it back.
If you need to share than data among several architectures or Operating system you'd rather use a serialisation framework like HDF5. Remember is that HDF5 doesn't handle circular references.
If the data can not stay all in memory, then you need to use something like bdb (or wiredtiger).
Why bdb (or wiredtiger)
Simply said, several decades of work have gone into:
splitting data
storing it on disk
retrieve data
As fast as possible.
wiredtiger is the successor of bdb.
So yes you could split the files yourself et al. but that will require a lot of work. Only specialized compagnies do that (bloomberg included...), among people that manage themself all the above there is the famous postgresql, mariadb, google and algolia.
ordered key value stores like wiredtiger and bdb use similar algorithm to higher level databases like postgresql and mysql or specialized one like lucene/solr or sphinx ie. mvcc, btree, lsm, PSSI etc...
MongoDB since 3.2 use wiredtiger backend for storing all the data.
Some people argue that key-value store are not good at storing relational data, that said several project started doing distributed databases on top of key value stores. This is a clue that it's useful. E.g. FoundationDB or CockroachDB.
The idea behind key-value stores is to deliver a generic framework for:
splitting data
storing it on disk
retrieve data
As fast as possible, giving some guarantees (like ACID) and other nice to haves (like compression or cryptography).
To take advantage of the power offer by those libraries. You need to learn about key-value composition.

Provide example for why it is not advisable to store images in CoreData?

this question has been asked many times, I have read many users telling that it is not advisable to store images in a DB, in particular within CoreData. By they all seems to omit the reason why they would do so. Even Apple documentation state this, and everybody points to that direction, and every discussion end like this "well you can, but storing the path is better".
Apart from opinions, I would like to have a concrete example of why it is not a good solution.
I explain better, I have a strong background in building Web Application. A concrete example I would give from my point of view could be: do not store images in a DB, but rather the path to them, because you can have them served them by the web server, which can apply all of its caching issues.
But in a desktop environment, especially in iOS application, what are the downside of having stored in Core Data using sqllite, providing that:
There's a separate entity holding the images, it is not an attribute
of main entity
Also seems to be a limit of 100kb for images. Why ? What does happen with a 110,120...200kb ecc ?
thanks

There's nothing special about what Core Data normally does here. It's just using an SQLite database. You can put large blobs of data into it, but it just doesn't scale all that well. You can read more about it here: Internal Versus External BLOBs in SQLite.
That said, Core Data has support for external blobs which in Core Data terminology is called stored in external record (iOS 5.0 and later). Again, there's nothing magic about it, it's just storing the large pieces of data in the file system separately from the SQLite db itself. The benefit is that Core Data updates all this for you.
When you're in Xcode, there'll be a checkbox called Allows External Storage that you can check for Binary Data properties.

The filesystem, and the API:s surrounding it is (just like a webserver) optimized to serve files, of any size, and to apply caching where appropriate.
CoreData is optimized for handling an object graph with tiny pieces of data, like integers and short strings.
Also, there are a number of other issues that tend to creep up on you, like periodically vacuuming the SQLite database CoreData uses, or it won't be able to shrink, just grow.

Leonardo,
With Lion/iOS 5, Core Data started handling file system storage of large BLOBs for you.
The choice is really determined by how many images you are going to have open. If you have many, then you should keep them in the DB. Why? Because you only have a modest number of file descriptors, one of which is used for each open image stored in the file system.
That said, there is still a reason to manage the files yourself. If your BLOBs are really big, say 2+ MB, you will want to map them into memory and not just read them in. (When the memory warnings come, this lets the OS automatically purge them from your resident memory. This is a very good thing.) Even so, you still have the limited number of file descriptors problem.
Andrew

How should I store a bunch of data locally to display on a UITableView?

I have to display a lot of text preferably pre-formatted or formatted with NSString methods. Each row will have a detailed screen. In the detailed screen another UITableView will have sections for lets say "Definition", "Examples" etc each with only one row. In these rows I will be displaying the text, which spans multiple lines. Should I store all the text in a SQLite database like a column for each section? Are there other ways to store data locally?

You have quite a few different ways to store application data on your iPhone:
Using Flat Files
These are files that contain data in the format you decide it would be best to store it. They are useful for persisting small bits of text data that don't require a complicated structure and a strong relational organization in order to make sense.
Using Plist files
Property list files already have a key-value structure in place, that you can use to your advantage if your data lends itself well to this format. Native data types, such as NSDictionary and NSarray can be serialized and deserialized easily to and from this format.
Storing key-value data in NSUserDefaults
Typically used to store application settings and other small amount of data, NSUserDefaults are useful for holding simple data types without excessive complications.
Storing Information in an SQL Database
Useful for when your data is strongly structured and relational and you want to avoid rolling your own file-based data storage structure for time and performance reasons. The SQL language is a powerful tool for retrieving and persisting relational information and you can manage the complexity of your implementation by resorting to wrappers around SQLite, such as FMDB.
Using Core Data
If you plan on persisting and managing a complicated, dynamic object graph, without worrying on how to serialize and deserialize it from storage yourself, Core Data is your best bet. It can help you in many ways, from change tracking and undo support to relational structure maintenance and migration.
Here is a detailed Oreilly article explaining in more detail the particularities of most of these methods, a great read if you want to develop a solid understanding of the fundamentals.

Some methods are...
Core Data
Write to NSUserDefaults
Write to a file, like for example an NSDictionary to a plist
If all you're storing is a bunch of strings, it might make sense for you to keep it simple and just use NSUserDefaults or make a plist file and load it into an NSDictionary when you start your app. If you have actual objects with relationships and sub properties, then I'd look into core data.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas