I have asked a few questions on this topic but still can't get it to work. I have a Core Data store with 10k+ rows of people's names that I am showing in a table view. I would like to be able to search and update the table with every letter typed, but it's very laggy. As suggested, I watched the WWDC '10 Core Data presentation and was trying to implement
[request setFetchBatchSize:50];
It doesn't seem to work. When I use Instruments to check Core Data, it still shows 10k+ rows being fetched when loading the table view, and when I search it also fetches all the results.
Is there anything else that needs to be done to set the batch size, or is that not something that will help me?
The only thing that seems to work is setting the fetch limit to 100 when I search. Do you think that's a good solution?
Thanks in advance!
The batch size just tells it how many objects to fetch at a time. This is probably not going to help you very much. Let's consider your use case a bit...
The user types "F" and you tell the database, "Go find all the names that start with 'F'" and the database looks at all 10k+ records to find the ones that start with 'F'
Then, the user types 'r', so you tell the database to go find all the records that start with "Fr" and it again looks at all 10k+ records to find the ones that start with "Fr."
All fetchBatchSize is doing is telling it "Hey, when you fault in a record, bring in 50 at once because I'm going to probably need all those anyway." That does nothing to limit your search.
However, setting fetchLimit to 100 does help some: the database still starts hunting through the 10k+ records, but as soon as it has found 100 records that satisfy the request, it stops searching because the request is already filled.
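To make the difference concrete, here is a minimal sketch of both settings on a fetch request; the entity name "Person", the attribute "name", and the `context` and `searchText` variables are assumptions for illustration, not taken from your code:

    // Minimal sketch, assuming an entity "Person" with a string attribute "name"
    // and an existing NSManagedObjectContext called `context`.
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Person"];
    request.predicate = [NSPredicate predicateWithFormat:@"name BEGINSWITH %@", searchText];
    request.sortDescriptors = @[[NSSortDescriptor sortDescriptorWithKey:@"name" ascending:YES]];

    // fetchBatchSize only controls how many full rows are faulted in at a time;
    // every matching record is still found and represented in the result array.
    request.fetchBatchSize = 50;

    // fetchLimit caps the number of matches; the search stops once 100 are found.
    request.fetchLimit = 100;

    NSError *error = nil;
    NSArray *results = [context executeFetchRequest:request error:&error];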
So, there are several things you can do, all depending on your other use cases.
The easiest thing to try is adding an index on the field that you are searching. You can set that in the Xcode model editor (the section that says Indexes, right under where you name the entity in the inspector). This allows the database to set up a special index on that field, and searching on it will be much faster.
Second, after your initial request you already have an array of names that begin with 'F', so there is no need to go back to the database to ask for names that begin with 'Fr'. If a name begins with 'Fr' it also begins with 'F', and you already have NSManagedObject pointers for all of those. Now you can just search the array you got back.
Even better, if you gave it a sort descriptor, the array is sorted. Thus, you can do a simple binary search on the array. Or, if you prefer, you can just use the same predicate, and apply it to the results array instead of the database.
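As a rough sketch (assuming you kept the earlier results in an array called `previousResults` and the attribute is called `name`), refining in memory might look like this:

    // Sketch: refine the previous fetch in memory instead of hitting the store again.
    // `previousResults` is assumed to hold the managed objects matching the shorter prefix.
    NSPredicate *refine =
        [NSPredicate predicateWithFormat:@"name BEGINSWITH %@", newSearchText];
    NSArray *narrowed = [previousResults filteredArrayUsingPredicate:refine];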
Even if you don't use the results-pruning I just discussed, I think indexing the attribute will help dramatically.
EDIT
Maybe you should run Instruments to see how much time you are spending where. Also, a badly formed predicate can bring any index scheme to its knees. Your code would help.
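As an illustration only (not your code, since we haven't seen it, and `searchText` is again assumed), a plain prefix predicate on the indexed attribute gives SQLite the best chance to use the index, while substring or case/diacritic-insensitive matching generally forces a scan of every row:

    // Sketch: index-friendly vs. index-defeating predicates on an indexed "name" attribute.
    NSPredicate *indexFriendly =
        [NSPredicate predicateWithFormat:@"name BEGINSWITH %@", searchText];

    // CONTAINS (and heavy use of the [cd] modifiers) typically cannot use the index
    // and ends up examining every record.
    NSPredicate *likelySlow =
        [NSPredicate predicateWithFormat:@"name CONTAINS[cd] %@", searchText];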
Finally, consider how many elements you are bringing into memory. Core Data does not fault all the information in, but it does create shells (faults) for everything in the array.
I don't know exactly how SQLite implements its search on an index, but a B-tree lookup has complexity log_B(N), so even on 30k records that is not a lot of searching. Unless you have another problem, the indexing should have given you a pretty big improvement.
Once you have the index, you should not be examining all records. However, you still may have a very large set of data. Try fetchBatchSize on those, because it will limit the number of records fetched, and create proxies for the rest.
You can also call countForFetchRequest:error: instead of executeFetchRequest:error: to get the number of items. Then you can use fetchLimit to restrict the number you actually fetch.
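A rough sketch of that combination, reusing a fetch request like the one sketched earlier (with `context` again assumed):

    // Sketch: count the matches first, then materialize only a capped page of them.
    NSError *error = nil;
    NSUInteger total = [context countForFetchRequest:request error:&error];
    if (total != NSNotFound) {
        request.fetchLimit = MIN(total, 100);   // never pull more than 100 objects
        NSArray *page = [context executeFetchRequest:request error:&error];
        // `total` can drive the UI (e.g. "showing 100 of N"), `page` is what you display
    }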
As far as making all this work with a fetched results controller... well, that guy has to know about the records, so it still has to do the search.
Also, one place to look... are you doing sections? If you have a user-defined comparator for anything (like translating names into sections), it will get called for every single record.
Thus, I guess my big suggestion, after making the index change, is to run instruments and really study it to determine where you are spending your time. It should be pretty obvious. That will help steer you toward the real issue.
My bet is that you are still accessing all of the elements for some reason...
Hello World,
I'm in research mode for a feature to be built into our software, and there is one thing we have never faced before.

On one form we have a drop-down with a list of items. The user can select the default, which means all items need to be considered, or they can selectively opt for certain list items.

The form drives a filter: depending on the user's input, the data gets filtered and displayed in the UI.

The main problem we are trying to solve: suppose the user selects the default, which means all list item IDs are going to be included in the POST call to the API. The list can be huge, say 1 to 1K items and beyond.

Under such circumstances we can build the query string, but it seems it is going to be very long, and I have read that browsers only support query strings up to certain standard limits.

So currently I have the following doubts in mind:
Will shortening the query string work here?
Which technique can handle this efficiently?
What performance considerations do I need to take care of while doing so?
Any suggestions or thoughts are welcome; they would boost my software design thinking.
Based on what I understand from your question, here is my opinion (if I understood wrong, please correct me so I can help you):

You need to send the query in the URL and not in the body or as JSON; is that correct?
I don't think you need to send every one of the selected items individually.

If they are selected in sequence, you can send ranges in your query, like http://abcd/test?id=1-43,6-765 (take the id parameter as a string and then extract the useful data in the back end). With this approach you can shorten your query.
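A rough sketch of that idea (written in Objective-C only because the other snippets on this page are; the approach itself is language-agnostic, and the function name and input format are made up) compresses a sorted list of IDs into ranges:

    #import <Foundation/Foundation.h>

    // Sketch: compress a sorted list of integer IDs into "a-b,c,d-e" form for a query string.
    static NSString *CompressIDs(NSArray<NSNumber *> *sortedIDs) {
        NSMutableArray<NSString *> *parts = [NSMutableArray array];
        NSUInteger start = 0;
        while (start < sortedIDs.count) {
            NSUInteger end = start;
            while (end + 1 < sortedIDs.count &&
                   sortedIDs[end + 1].integerValue == sortedIDs[end].integerValue + 1) {
                end++;
            }
            if (end > start) {
                [parts addObject:[NSString stringWithFormat:@"%@-%@",
                                  sortedIDs[start], sortedIDs[end]]];
            } else {
                [parts addObject:sortedIDs[start].stringValue];
            }
            start = end + 1;
        }
        return [parts componentsJoinedByString:@","];
    }
    // Example: IDs 1,2,3,7,9,10 become "1-3,7,9-10", i.e. ?id=1-3,7,9-10

Note that if the selected IDs are mostly non-contiguous this will not shorten much, so measure it against your real data.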
Also think about the database (if there is one): querying this much data uses a lot of I/O and makes the query slow.
I have inherited a database that's causing me issues.
I need to describe something horrible to stakeholders. So far, naming anti-patterns and sending them away with a pointer to a Google search has been the most efficient way to buy myself some time.
Trouble is, I have not come across this before. Here's what's happening.
I have a simple single table, with a couple of columns. One of these columns contains values like:
660x90_SomeCity_SomeCountryISO_ImageName_SomeRubbish
or
SomeIataAirportCode_SomeCountry_660x90_SomeRubbish_ImageName
Now the database contains (admittedly, so far and on the current data) faultless logic to extract and look up things so that the output has additional columns such as:
AdSize
Country
City
The trouble is that this is achieved through gradual conversions implemented in a labyrinth of 50 (not joking) different views. I now have to formalize the logic into something like:
View One: Extract the first column and work out the length of it.
View Two: Now split off the 2nd column using that length.
View Three: If, after replacing the 'x' in the first column, the value is numeric, store it in "AdSize" and place the second value in the "CityCandidateOne" column.
To me this is a horrible antipattern and should all be done either in custom functions, or preferably during the ETL process, in one place so the logic can be captured.
However, I'm not being given the time, and I wonder if this is a known anti-pattern. Usually I can then use the credibility of a Google search to buy a little time to really sort this out.
I'd start with this answer which covers the violation of First Normal Form.
I also found this free ebook that might be of value.
I understand that what you are facing is something on a grander scale than just putting a couple of values in a field with a comma or other token to separate them, but I don't know of any anti-pattern that covers such a baroque mess.
Finally, here you can find more about "replacing SQL logic with views" as an anti-pattern (just look for "Views as SQL Building Blocks Anti-Pattern" in the article), but take into account that in that case the problem seems to be about inefficient access to the data.
Last minute edit: maybe this is just a special case of the general Golden Hammer antipattern? (see also: http://en.wikipedia.org/wiki/Golden_hammer)
Why not simply rewrite the SQL how you would rather do it, then print out the execution plans of both, and show the performance and timing of both. That should be enough to show them that it needs to change (and if there is no major performance difference, then your only other argument can be one of maintainability and that's something you're going to have to argue by showing them what it takes to make changes).
I'm making a patient database program using Visual C#. It will have forms and will consist of 3 tabs with information about the patient. It will also have add, save, previous, and next buttons, plus a search function. The most important thing is that each record will have around 60 columns/attributes, and the number of records could reach 50k-100k or more.
Now my question is: which is better for my program? Should I use SQLite or serialization/deserialization?
Thanks
The "database" word in the question strongly suggests that just serialization/deserialization isn't enough. Of course if you can fit all of your data into memory and you're happy to perform all the querying yourself, it could work - but you'll need to consider the cost of potentially reading everything into memory on startup, and possibly writing everything out whenever you change anything.
A database does sound like a better fit to me, to be honest. Whether SQLite is the most appropriate database for you or not is a different question though.
Having said all of this, for the C# in Depth website I keep all the information about comments / errata in a simple XML file, which is loaded lazily and saved every time I make a change. It works well, it's easy to manage, and the file is human readable in source control when I want it. However, I have vastly fewer records than you, and they're much simpler too. I don't have any search requirements - I just need to list everything and fetch by ID. My guess is that your needs are rather more complex, hence my recommendation to use a database.
So, I have an array of various names and I have populated the table with section headers A-Z.
Is it correct to find the first character of each item and then put it into the correct section, or is there a faster way to do it?

I believe what I am doing is wrong: I am thinking of making an 'A' array, for example, then finding every element starting with 'A' and inserting it there. But this seems a bit crazy, as I would then need to create arrays for A through Z, which I seriously doubt is the correct way.

I'm sorry if this is covered in the documentation, but I don't seem to be able to find it.
Any help from you guys in this matter?
Well, I actually did it this way too, and I don't see any reason why it shouldn't be done, as it decouples the fetching/sorting process from refreshing the table view, which in turn keeps loading and scrolling the table view smooth.

Having A-Z arrays is not a big issue memory-wise, since they just hold references to your data objects anyway and not the data itself. And it gives you fast access to your data objects without the need for expensive comparison operations.
Just make sure your arrays are kept up to date if data objects are added or deleted.
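A minimal sketch of that kind of bucketing (the `names` array and the key scheme here are placeholders, not your code) groups the strings into a dictionary of per-letter arrays once, up front:

    // Sketch: bucket name strings by first letter for A-Z table sections.
    NSMutableDictionary<NSString *, NSMutableArray<NSString *> *> *sections =
        [NSMutableDictionary dictionary];
    for (NSString *name in names) {
        if (name.length == 0) continue;
        NSString *key = [[name substringToIndex:1] uppercaseString];
        NSMutableArray<NSString *> *bucket = sections[key];
        if (bucket == nil) {
            bucket = [NSMutableArray array];
            sections[key] = bucket;
        }
        [bucket addObject:name];   // the bucket stores a reference, not a copy
    }
    // Sorted section titles (A, B, C, ...) for the table view's section headers.
    NSArray<NSString *> *sectionTitles =
        [sections.allKeys sortedArrayUsingSelector:@selector(localizedCaseInsensitiveCompare:)];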
An iPhone app we are producing will potentially have a table with 100,000+ records (scaling-wise), and I am concerned that loading all this data via a SELECT * statement into an array or an object will either make the app very slow or leave me with memory issues later on.

Obviously, loading 100,000+ records into an array when the screen/viewport only shows 30-odd records at a time is a tad silly.
In addition, I am not sure if storing all this data in an object is the right thing to do either.
So my question is: is it possible to page SQLite through the records, keeping, say, 50 records in a cache, and then loading an appropriate amount into the cache as you swipe up or down? I guess it's similar to the jQuery lazy-loading approach, where only a little is loaded into the viewport and more is loaded as you scroll down.

I'm looking at JSON, but it appears to be only for web services, as it requires a URL, and I am not sure if it works for files that are on the phone.

Therefore: is there a proper way to load SQLite data into Objective-C arrays/objects without causing problems when the data starts to scale?
Thanks for your help.
You definitely want to avoid loading in everything at once. Instead, you want to use a cursor and smarter queries so that you only pull data out of the database as you actually need it for display.
Generally, you also want to avoid blindly selecting all columns. You may at some point update the schema to add a column that your existing code would not know how to extract; it is far better to explicitly name the columns that you expect and need.
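As a sketch of the paging idea using the SQLite C API (the table name, column names, and the `self.db` handle are placeholders, and LIMIT/OFFSET is just one common way to page), each call fetches only the rows needed for the visible region:

    // Sketch: fetch one "page" of rows with explicit columns, LIMIT and OFFSET.
    // Assumes #import <sqlite3.h> and that self.db is an open sqlite3 * handle.
    - (NSArray *)loadPage:(NSInteger)page pageSize:(NSInteger)pageSize {
        NSMutableArray *rows = [NSMutableArray array];
        const char *sql = "SELECT id, name FROM people ORDER BY name LIMIT ? OFFSET ?";
        sqlite3_stmt *stmt = NULL;
        if (sqlite3_prepare_v2(self.db, sql, -1, &stmt, NULL) == SQLITE_OK) {
            sqlite3_bind_int(stmt, 1, (int)pageSize);
            sqlite3_bind_int(stmt, 2, (int)(page * pageSize));
            while (sqlite3_step(stmt) == SQLITE_ROW) {
                int rowId = sqlite3_column_int(stmt, 0);
                const unsigned char *name = sqlite3_column_text(stmt, 1);
                [rows addObject:@{ @"id"   : @(rowId),
                                   @"name" : name ? @((const char *)name) : @"" }];
            }
        }
        sqlite3_finalize(stmt);
        return rows;
    }

Your table view's data source can then request the next page whenever the user scrolls near the end of what has already been cached.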