Performance overhead for frequent (5Hz) Core Data saves - objective-c

For an iPhone app that plays audio files, I'm working on a system to track the user's progress in any episode they've listened to (e.g., they listen to the first 4:35 of file1, then start another file, and when they go back to file1 it starts at 4:35).
I've set up a Core Data model to store the metadata, but I'm wondering how aggressively I could/should cache the current location during playback.
Currently I have just stuck the save: call in a method that was previously being used to update the time labels and UISlider playhead. That method is being called by an NSTimer every 0.2 seconds.
0.2 seconds is much more precision than I need to keep for the progress cache. The values are rounded to the nearest second anyway, so essentially 4 out of every 5 saves are redundant.
Given, though, that this is pretty much all Core Data is doing (it's only ever dealing with a single value for a single record at any given time), I'm wondering whether it makes more sense to just do the extra, unnecessary save: calls, or to manage a second timer so the update happens less frequently.
As is, Instruments reports the Save Duration of each event as ~800, peaking around 2000. I'm not really sure how to interpret those results. Actual app performance in the simulator doesn't appear to be significantly impacted.
If this kind of save is so cheap that it makes sense to keep code complexity low (only managing a single timer), I would keep it as is, but my gut instinct is that that's a lot of operations, no matter how cheap.

You shouldn't see as much of a difference in performance as you may see in battery consumption.
Writing to the flash storage in an iOS device is much faster than writing to a spinning-platter HDD on a computer. Also, a write to an HDD does not cost much extra electricity compared to just keeping the platters spinning anyway. However, writing to flash storage takes more power than a read, or than leaving the flash alone.
In other words, the power consumption for a write on an iOS device is not negligible. If you can get away with saving once a second instead of five times a second, that could easily result in a notable improvement in battery consumption for your app.
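For what it's worth, you can keep the single timer and still skip the redundant saves by only touching Core Data when the rounded second changes. A minimal sketch, assuming an AVAudioPlayer and an episode entity with a playbackPosition attribute; the property and method names (episode, lastSavedSecond, updateTimeLabelsAndSlider:) are illustrative, not from the question:

    // Called by the existing 0.2s timer. The UI still updates at 5Hz, but the
    // Core Data save only fires when the rounded second actually changes,
    // so saves drop to at most 1Hz.
    - (void)playbackTimerFired:(NSTimer *)timer {
        NSTimeInterval position = self.audioPlayer.currentTime;
        [self updateTimeLabelsAndSlider:position];          // existing 5Hz UI work

        NSInteger roundedSecond = (NSInteger)(position + 0.5);  // round to nearest second
        if (roundedSecond == self.lastSavedSecond) {
            return;                                          // nothing new to persist
        }
        self.lastSavedSecond = roundedSecond;
        self.episode.playbackPosition = @(roundedSecond);

        NSError *error = nil;
        if (![self.managedObjectContext save:&error]) {
            NSLog(@"Could not save playback position: %@", error);
        }
    }

That keeps the single-timer simplicity while cutting out the four redundant writes per second.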

Related

How can I estimate if a feature is going to take up too many resources on an FPGA?

I'm starting on my first commercial sized application, and I often find myself making a design, but stopping myself from coding and implementing it, because it seems like a huge use of resources. This is especially true when it's on a piece that is peripheral (for example an enable for the output taps of a shift register). It gets even worse when I think about how large the generic implementation can get (4k bits for the taps example). The cleanest implementation would have these, but in my head it adds a great amount of overhead.
Is there any kind of rule I can use to make a quick decision on whether a design option is worth coding and evaluating? In general I worry less about the number of flip-flops, and more about the width of signals. This may just come from a CS background where all application boundaries should be as small as feasibly possible to prevent overhead.
Point 1. We learn by playing, so play! Try a couple of things. See what the tools do. Get a feel for the problem. You won't get past this if you don't try something. Often the problems aren't where you think they're going to be.
Point 2. You need to get some context for these decisions. How big is adding an enable to a shift register compared to the capacity of the FPGA / your design?
Point 3. There are two major types of 'resource' to consider: cells and time.
Cells are relatively easy to estimate in broad terms. How many flops? How much logic in identifiable blocks (e.g. in an ALU: multipliers, adders, etc.)? Often this is defined by the design you're trying to do. You can't build an ALU without registers, a multiplier, an adder, and so on.
Time is more subtle, and is invariably traded off against cells. You'll be trying to hit some performance target, and recognising the structures that will make that hard is where the experience from point 1 comes in.
Things to look out for include:
A single net driving a large number of things. Large fan-outs cause a heavy load on a single driver which slows it down. The tool will then have to use cells to buffer that signal. Classic time vs cells trade off.
Deep clumps of logic between register stages. Again the tool will have to spend more cells to make logic meet timing if it's close to the edge. Simple logic is fast and small. Sometimes introducing a pipeline stage can decrease the size of a design if it makes the logic on either side far simpler.
Don't worry so much about large buses, if each bit is low fanout and you've budgeted for the registers. Large buses are often inherent in fast designs because you need high bandwidth. It can be easier to go wide than to go to a higher clock speed. On the other hand, think about the control logic for a wide bus, because it's likely to have a large fan-out.
Different tools and target devices have different characteristics, so you have to play and learn the rules for your set-up. There's always a size vs speed (and these days 'vs power') compromise. You need to understand what moves you along that curve in each direction. That comes with experience.
Is there any kind of rule I can use to make a quick decision on whether a design option is worth coding and evaluating?
The only rule I can come up with is 'Have I got time or not?'
If I have, I'll explore. If not, I'd better just make something work.
Ahhh, the life of doing design to a deadline!
It's something that comes with experience. Here's some pointers:
adding numbers is fairly cheap
choosing between them (multiplexing) gets big quite quickly if you have a lot of inputs to the multiplexer (the width of each input is a secondary issue also).
Multiplications are free if you have spare multipliers in your chip, but they suddenly become expensive when you run out of hard DSP blocks.
memory is also cheap, until you run out. For example, your 4Kbit shift register easily fits within a single Xilinx block RAM, which is fine if you have one to spare. If not, it'll take a large number of LUTs (depending on the device - an older Spartan 3 can fit 17 bits into a LUT (including the in-CLB register), so it will require ~235 LUTs). And not all LUTs can be shift registers. If you are only worried about the enable for the register, don't be. Unless you are pushing the performance of the device, routing that sort of signal to a few hundred LUTs is unlikely to cause major timing issues.

Best cache size for iOS apps

I'm currently developing an application that loads lots of images from the internet and saves them locally (I'm using SDURLCache). However, old images have to get removed from the disk again, so I was wondering what the best cache size is.
The advantage of a big cache is obviously that more images get saved which leads to better UX.
The disadvantage is that images need a lot of space and the user will run out of disk space faster. The size I am thinking of is 20MB. It seems rather big to me, though, so I'm asking you what your opinion is.
The best way to decide on an appropriate cache size is to test. Run the app under Instruments to measure both performance and battery usage. Keep increasing the cache size until you can't discern a difference in performance. That's the largest size you'd need, at least under the test conditions. Once you've established that size, reduce the size until performance is just barely acceptable to determine the smallest acceptable size.
The right size is somewhere between those two sizes, depending on what you think is important. If you can't determine a right size, then either pick a size or add a slider to the app's settings to let the user decide. (I'd avoid making it user-adjustable if you can -- users shouldn't have to think about such things.)
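If you do end up comparing sizes under Instruments like this, the capacity is just an initializer argument, so it's cheap to change between test runs. A rough sketch, assuming SDURLCache keeps NSURLCache's initWithMemoryCapacity:diskCapacity:diskPath: initializer (it subclasses NSURLCache); the 20MB figure is the one from the question, and the memory capacity and cache path are illustrative:

    // Install SDURLCache as the shared URL cache with a 20MB disk budget.
    static const NSUInteger kDiskCacheCapacity = 20 * 1024 * 1024;   // candidate size under test

    NSString *cachesDir = [NSSearchPathForDirectoriesInDomains(NSCachesDirectory,
                                                               NSUserDomainMask, YES) lastObject];
    NSString *cachePath = [cachesDir stringByAppendingPathComponent:@"URLCache"];

    SDURLCache *urlCache = [[SDURLCache alloc] initWithMemoryCapacity:1024 * 1024       // 1MB in memory
                                                          diskCapacity:kDiskCacheCapacity
                                                              diskPath:cachePath];
    [NSURLCache setSharedURLCache:urlCache];
    [urlCache release];   // omit under ARC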
Considering that the smallest iDevices have 8GB of storage, I don't think a 20MB cache is too big, especially if it significantly improves the performance of the app. Also, keep in mind the huge advantage a network cache can have for battery life, since network usage is very expensive in battery time.
Determining the ideal size, however, is hard without some more information. How often is the same picture accessed? How large is each picture (i.e. how many pictures can 20MB hold)? How often will images need to be removed from the cache to make room for new ones?
If you are constantly changing the images in the cache, it could actually have an adverse effect on the battery life due to the increased disk usage.

Trying to work out some leaks for iPad game

I have a slight problem where, when the user plays my game for more than 20 minutes or so, it begins to slow quite considerably. I have been trying to work through the issues pretty hard as of late, but still no luck. I have tried the Leaks instrument and I now have that squared away, but I read at "bbum's weblog" about using the Allocations instrument and taking heap shots. I don't quite understand what I am looking at, though; could someone give me a hand with this?
My game involves users selecting words. I took a heap shot after each word was selected, but I am not too sure exactly how to read this. Is the Heap Growth column what is currently running, or is it what has been added to what is currently running?
And what is the # Persistent?
Also, why does the # Persistent jump so much? Could that be my memory problem?
Thanks for all the help!
The heap growth column represents all of the allocations in that iteration that did not exist prior to that iteration but continue to exist in all subsequent iterations.
I.e. Heapshot 4 shows a 10.27KB permanent growth in the size of your heap.
If you were to take an additional Heapshot and any of the objects in any of the previous iterations were deallocated for whatever reason, the corresponding iteration's heapshot would decrease in size.
In this case, the heapshot data is not going to be terribly useful. Sure, you can dive in and look at the various objects sticking around, but you don't have a consistent pattern across each iteration.
I wrote considerably more about this in a weblog post.
If it's slowing down, why not try CPU profiling instead? Unless you're getting memory warnings, what makes you think it's a leak?
Tim's comment is correct in that you should be focusing on CPU usage. However, it is often a good bet that an app is slowing down because of the increased algorithmic cost associated with a growing working set. I.e. if there are more objects in memory, and those objects are still in use, then it takes more time to muck with 'em.
That isn't the case here; your heap isn't growing that significantly and, thus, it sounds like you have a pure algorithmic issue if your app is truly slowing down.
Does your game save data to NSUserDefaults or to any arrays? If so, as the game is played and more and more stuff is added to the array, it will take longer to loop through it, hence gradually slowing down the game.
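To illustrate that last point, here's the kind of pattern that slows down with play time even though Leaks reports nothing; the names (selectedWords, scoreForWord:) are purely hypothetical:

    // Anti-pattern: per-selection work that re-scans an ever-growing history.
    - (void)wordSelected:(NSString *)word {
        [self.selectedWords addObject:word];            // grows for the whole session

        NSInteger total = 0;
        for (NSString *candidate in self.selectedWords) {
            total += [self scoreForWord:candidate];     // cost grows with every word played
        }
        self.totalScore = total;

        // Cheaper alternative: keep a running total and only touch the new word:
        // self.totalScore += [self scoreForWord:word];
    }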

Scattered-write speed versus scattered-read speed on modern Intel or AMD CPUs?

I'm thinking of optimizing a program by taking a linear array and writing each element to an arbitrary location (random-like from the perspective of the CPU) in another array. I am only doing simple writes and not reading the elements back.
I understand that a scattered read on a classical CPU can be quite slow, as each access will cause a cache miss and thus a processor wait. But I was thinking that a scattered write could technically be fast, because the processor isn't waiting for a result and thus may not have to wait for the transaction to complete.
I am unfortunately unfamiliar with all the details of the classical CPU memory architecture and thus there may be some complications that may cause this also to be quite slow.
Has anyone tried this?
(I should say that I am trying to invert a problem I have. I currently have a linear array from which I am reading arbitrary values -- a scattered read -- and it is incredibly slow because of all the cache misses. My thought is that I can invert this operation into a scattered write for a significant speed benefit.)
In general you pay a high penalty for scattered writes to addresses which are not already in cache, since you have to load and store an entire cache line for each write, hence FSB and DRAM bandwidth requirements will be much higher than for sequential writes. And of course you'll incur a cache miss on every write (a couple of hundred cycles typically on modern CPUs), and there will be no help from any automatic prefetch mechanism.
I must admit, this sounds kind of hardcore. But I take the risk and answer anyway.
Is it possible to divide the work into pages and read/scan the input multiple times? On every pass, you only process (or output) the data that belongs to a limited set of destination pages. This way you only take cache misses when each destination page is first touched.
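A rough C-style sketch of that multi-pass idea (the function and names are made up for illustration; whether it wins depends on how cheap the sequential re-reads of the source are compared to the scattered-write misses you're taking today):

    // Scatter src[i] -> dst[dest_index[i]], but restrict each pass to one
    // window ("page") of the destination so the writes stay cache-resident.
    #include <stddef.h>

    void blocked_scatter(const double *src, const size_t *dest_index, size_t n,
                         double *dst, size_t dst_len) {
        const size_t kWindow = 1 << 15;        // ~256KB of doubles; tune to your cache

        for (size_t lo = 0; lo < dst_len; lo += kWindow) {
            size_t hi = lo + kWindow;
            // Sequential, prefetch-friendly scan of the source on every pass;
            // only the writes that land inside [lo, hi) are performed this time.
            for (size_t i = 0; i < n; i++) {
                size_t d = dest_index[i];
                if (d >= lo && d < hi) {
                    dst[d] = src[i];
                }
            }
        }
    }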

How important is size in an application?

When creating applications (Java, run on a normal computer). How important is program size for users? For example, would it be necessary to replace .png's with .jpg's, convert .wav's to .midi's, or strip down libraries to save space, or do users generally not care if my program is 5mb when it could be 50kb if stripped down?
Thanks.
That depends on the delivery mechanism.
Size is generally only relevant in terms of the bandwidth required to download it. If you download it often, then it matters a lot. If it's only once, it matters less, and you have to weigh up the time involved in reducing it against how much space you save.
After that, nobody cares until you get into gigabytes. Well, mobile applications will probably start caring at about 10MB+.
Users definitely care (after all, not only does space cost money, but it also affects program load time). However, the question becomes how much you optimize. I suggest the 80/20 rule: 80% of your benefit comes from the first 20% of the effort.
If you use a utility like TreePie you might be able to see what parts of a large application are consuming most of your resources. If you find it's just a few large images, or one big DLL with a bunch of embedded resources, it's probably worth taking a look at reducing the size, if it's easy.
But there's a cost/benefit tradeoff. I just saw a terabyte drive for $100 the other day. Saving the user 1 gig is about 10 cents in terms of storage space, plus some hard-to-quantify amount of time spent loading every time they load. If you have 100,000 users, it's probably worth your time to optimize a bit, but if you're writing custom software for one user it's probably not worth it unless they're complaining.
As mentioned by Graham Lee, a great deal of this is very dependent on your users. If you are writing something that needs to be optimized to fit on the chip of a 68000 processor, then you'd better believe that program size matters. Assuming you're not programming 30 years ago, you probably won't run across that particular issue.
But in general, you should be making your application as small as possible while still achieving the quality you want. That is to say, if your application is likely to be viewed on a 640x480 screen, then you don't need hi-res 6MB PNGs for all your images. On the other hand, if your application is designed to be blown up on a big screen at conferences, then you probably want to upsize your images.
Another option that is very common is creating installers with separate options ranging from full to minimal. That way you can allow your users to decide whether size matters to them. It allows you to create the pretty pretty version of your app, and a scaled-back version that doesn't include tutorials or mp3 files of a soothing woman's voice telling you that you've pushed the wrong button.
Know your users. And if you don't, then let them decide for themselves.
Consider yourself, what would you use? Would you rather save space with 5KB programs or waste it with 5MB programs?
I think that smaller is better, especially if the program doesn't use/need much graphics and can be optimized.
I would say not important at all, unless it's obscenely large.
I would argue that startup time is far more important to users than application size.
However if you include a lot of media files with your system it is logical to optimise this data as much as possible. But don't compromise the quality - switching to jpeg might be okay for photos, but it sucks for technical diagrams. A .wav could be an .aac or .mp3, but not if you're writing a professional audio application.