I have a distributed cache of floats (not ints). My process will frequently increment these floats and access them only occasionally.
If it were local, an atomic float data structure (or a float adder, if there is one) with an increment method would probably be the best way to go. Non-blocking and async would be ideal, since the order of increments does not matter as long as each increment is eventually applied.
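(For reference, the kind of local structure I have in mind is something like `java.util.concurrent.atomic.DoubleAdder`, which does exist for doubles; a minimal sketch of what I'd do locally, one adder per key:)

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.DoubleAdder;

// Local-only illustration: one non-blocking adder per key.
// DoubleAdder spreads contended updates across internal cells,
// so concurrent increments don't serialize on a single CAS.
public class LocalFloatCounters {
    private final ConcurrentHashMap<String, DoubleAdder> counters = new ConcurrentHashMap<>();

    public void increment(String key, double delta) {
        counters.computeIfAbsent(key, k -> new DoubleAdder()).add(delta);
    }

    public double get(String key) {
        DoubleAdder adder = counters.get(key);
        return adder == null ? 0.0 : adder.sum();
    }
}
```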
What's the best way of incrementing numerical values to achieve high throughput?
My current method is:
batch several increment operations for different keys
use the invokeAll method on IgniteCache, passing in a CacheEntryProcessor that carries the increment value for each key (sketched below)
set the CacheAtomicityMode configuration to ATOMIC
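Roughly like the following sketch (the class names and cache type are mine; the Ignite API calls are the real ones):

```java
import java.util.HashMap;
import java.util.Map;
import javax.cache.processor.MutableEntry;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.CacheEntryProcessor;

public class BatchIncrementer {
    // Serializable processor that adds a delta to the entry it runs on.
    static class AddProcessor implements CacheEntryProcessor<String, Float, Void> {
        private final float delta;

        AddProcessor(float delta) { this.delta = delta; }

        @Override
        public Void process(MutableEntry<String, Float> entry, Object... args) {
            Float current = entry.exists() ? entry.getValue() : 0f;
            entry.setValue(current + delta); // runs on the node that owns the key
            return null;
        }
    }

    // Batch several per-key increments into a single invokeAll call.
    public static void incrementAll(IgniteCache<String, Float> cache, Map<String, Float> deltas) {
        Map<String, CacheEntryProcessor<String, Float, Void>> batch = new HashMap<>();
        deltas.forEach((key, delta) -> batch.put(key, new AddProcessor(delta)));
        cache.invokeAll(batch);
    }
}
```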
Is this the best way to go?
Is there any configuration I should pay attention to for a performance boost, e.g. using the binary format, on-heap memory, or avoiding unnecessary serialization?
I think you are definitely on the right track.
Binary format will be used by default, so you do not need any special configuration for it. I do not think you should worry about serialization of floats, so I would not configure an on-heap cache, unless you run into performance issues.
I would also suggest taking a look at the internal implementation of IgniteAtomicSequence, as it may give you more useful ideas.
Lastly, I would suggest implementing it as an Ignite service, which would allow you to provide a custom API tailored to this functionality.
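For example, something along these lines (a rough sketch; the service interface, the names, and the "counters" cache are placeholders you would adapt):

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.CacheEntryProcessor;
import org.apache.ignite.resources.IgniteInstanceResource;
import org.apache.ignite.services.Service;
import org.apache.ignite.services.ServiceContext;

// Placeholder API tailored to the increment use case.
interface CounterService {
    void increment(String key, float delta);
}

public class CounterServiceImpl implements CounterService, Service {
    @IgniteInstanceResource
    private Ignite ignite; // injected by Ignite on deployment

    private IgniteCache<String, Float> cache;

    @Override public void init(ServiceContext ctx) {
        cache = ignite.cache("counters"); // assumes this cache is already configured
    }

    @Override public void increment(String key, float delta) {
        cache.invoke(key, (CacheEntryProcessor<String, Float, Void>) (entry, args) -> {
            entry.setValue((entry.exists() ? entry.getValue() : 0f) + delta);
            return null;
        });
    }

    @Override public void execute(ServiceContext ctx) { /* no background loop needed */ }
    @Override public void cancel(ServiceContext ctx) { /* nothing to clean up */ }
}

// Deployment, e.g. on startup:
//   ignite.services().deployClusterSingleton("counterService", new CounterServiceImpl());
//   CounterService svc = ignite.services().serviceProxy("counterService", CounterService.class, false);
```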
I'm building a database, and I'm working on the storage format. For each record I have a logical clock which increases by 1 every time there's a write operation, and I would like to be able to compare the clocks of two writes to understand which happened first.
How can I deal with overflowing/wrapping of the counter?
Solutions I've thought of:
Use a very large counter (u64?) and ignore the possibility of overflow (at a billion writes per second, a u64 takes roughly 584 years to wrap)
Use a second "epoch counter" in order to keep track of overflows (sketched after this list)
Periodically reset all the counters when I deem it safe
Use BigInt and variable length encoding
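To make the second option concrete, here is roughly what I have in mind (Java just for illustration; a record used for brevity):

```java
// Sketch of the "epoch counter" idea: a logical clock is an
// (epoch, counter) pair; when the counter would wrap, bump the
// epoch and reset the counter. Comparison is lexicographic.
public record LogicalClock(long epoch, long counter) implements Comparable<LogicalClock> {
    public LogicalClock tick() {
        return counter == Long.MAX_VALUE
            ? new LogicalClock(epoch + 1, 0)
            : new LogicalClock(epoch, counter + 1);
    }

    @Override public int compareTo(LogicalClock other) {
        int byEpoch = Long.compare(epoch, other.epoch);
        return byEpoch != 0 ? byEpoch : Long.compare(counter, other.counter);
    }
}
```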
All the strategies I envisioned have ugly drawbacks. I saw in some papers that counter overflow is treated as a solved problem, without any mention of how it is solved, which makes me think I'm missing something. Any advice on how to deal with this problem?
Am I mistaken that an immutable sampler is baked into the pipeline's descriptor set layout, whereas a non-immutable descriptor holds a pointer to a sampler, so essentially a non-immutable sampler means one extra indirection to read data from it? What kind of performance increase are we talking about?
If there is any performance increase, it would likely be quite trivial, and heavily hardware- and algorithm-dependent.
Vulkan is an explicit, low-level API. This allows it to better match the hardware, but it also means that you get to specify more precisely what it is that you want to do. In the vast majority of cases, the sampler you're using with a texture will be fixed for that particular use. As such, the API allows you to explicitly state this. While this can potentially allow for some hardware optimization, the main thing it allows you to do is to stop carrying around VkSampler objects when you don't need them. You specify them in the set layout, and you're done.
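To make the "you specify them in the set layout" part concrete, here is a rough sketch using the LWJGL Java bindings (error handling and memory lifetime elided; samplerHandle is assumed to be a valid VkSampler created earlier):

```java
import static org.lwjgl.vulkan.VK10.*;

import java.nio.LongBuffer;
import org.lwjgl.system.MemoryUtil;
import org.lwjgl.vulkan.VkDescriptorSetLayoutBinding;

public class ImmutableSamplerExample {
    // Builds a combined image/sampler binding whose sampler is baked
    // into the set layout itself, so no VkSampler ever has to be
    // written into the descriptor set afterwards.
    static VkDescriptorSetLayoutBinding.Buffer makeBinding(long samplerHandle) {
        LongBuffer samplers = MemoryUtil.memAllocLong(1).put(0, samplerHandle);

        VkDescriptorSetLayoutBinding.Buffer bindings = VkDescriptorSetLayoutBinding.calloc(1);
        bindings.get(0)
                .binding(0)
                .descriptorType(VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER)
                .descriptorCount(1)
                .stageFlags(VK_SHADER_STAGE_FRAGMENT_BIT)
                .pImmutableSamplers(samplers); // this is what makes the sampler immutable
        return bindings; // in real code, free the buffers once the layout is created
    }
}
```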
I am trying to do research on Commutative Replicated Data Types (CRDTs), and I cannot find any good definition that isn't mired in a ton of technical terms, which makes it hard to understand how they allow replication of data in a distributed environment without using consensus.
In layman's terms, you can think of CRDTs as follows:
CRDTs are data types that achieve strong eventual consistency in distributed environments without explicit synchronization. Their attractive property is that they are both conflict-free and synchronization-free, which is a bit confusing at first, since you'd think there must be some sort of synchrony. For example, what happens if I write 2 and then 3 to the replicas, and replica A receives update 3 before 2 while replica B receives them in the correct order, first 2 then 3? Don't we have a conflict?
The key to CRDTs is that they are restricted to specific operations for which the scenario above yields no conflict. The simplest scenario is incrementing an aggregated value: if A and B simply add up all incoming values, they will both eventually end up with the value 5, without conflict, relying only on the weak guarantees of eventual consistency. Specifically, the typical requirements are that the operations commute and that they don't violate causal order.
Basically, CRDTs guarantee that all concurrent operations commute with each other.
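A minimal sketch of the standard introductory example, a state-based grow-only counter (G-Counter), may help (assuming each replica knows its own ID):

```java
import java.util.HashMap;
import java.util.Map;

// Grow-only counter (G-Counter): each replica increments only its
// own slot; merging two states takes the per-replica maximum, so
// merges commute, are associative and idempotent, and every replica
// converges to the same total regardless of delivery order.
public class GCounter {
    private final String replicaId;
    private final Map<String, Long> counts = new HashMap<>();

    public GCounter(String replicaId) { this.replicaId = replicaId; }

    public void increment() {
        counts.merge(replicaId, 1L, Long::sum);
    }

    // Merge in the state received from another replica.
    public void merge(Map<String, Long> remote) {
        remote.forEach((id, c) -> counts.merge(id, c, Math::max));
    }

    public long value() {
        return counts.values().stream().mapToLong(Long::longValue).sum();
    }

    public Map<String, Long> state() { return Map.copyOf(counts); }
}
```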
Of course, if CRDTs could only implement simple summation, they would not be very interesting. However, it turns out that clever people have developed CRDT algorithms for more useful things, such as editing a shared document (see Logoot), grow-only sets, etc.
But still, bear in mind that by removing the need for consensus, CRDTs are inherently limited, and there are many simple things they cannot do.
I hope this description makes some sense; for a more exact treatment, I actually find the mathematical definition the most intuitive.
I'd like to know if there is a guideline for the maximum number of attributes any one NSManagedObject subclass should have. I've read Apple's docs carefully, but find no mention of a limit where performance starts to erode.
I DO see a compiler flag option in Xcode that provides a warning when an NSManagedObject has more than 100 attributes, but I can't find any documentation on it. Does anyone here have experience with Core Data MOs that have a large number of attributes?
I'm focused on performance, not memory usage. In my app, there will only ever be about 10-20 instances of this MO that has a large number of attributes and I'm developing on OS X rather than iOS, so memory usage isn't a factor. But if there is a point where performance (especially when faulting) starts to die, I'd like to know now so I can change the structure of my data model accordingly.
Thanks!
Each attribute gets mapped to a table column, if you're using SQLite backing stores. SQLite has a hard limit on the number of columns you can use (2000 by default, though it's compile-time configurable so Apple's implementation could differ), and they recommend not using more than one hundred. That could well be why the Xcode warning sets its threshold at 100.
The same SQLite page on limits also notes that there are some O(N^2) algorithms where N is the number of columns, so it sounds like you should generally avoid high column counts.
For other file formats, I don't know of any limits or recommendations. But I'd expect similar things - i.e. there's probably some algorithm in there that's O(N^2) or worse, so you want to avoid becoming an uncommon edge case.
Not that I have run across, even on iOS. The biggest limiting factor on performance is the cache size in the NSPersistentStoreCoordinator, which on OS X is pretty big.
If your attributes are strings, numbers, dates, etc. (i.e. not binary data) then you can probably have a huge number of attributes before you start to see a performance hit. If you are working with binary data then I would caution you against blowing the cache and consider storing binary data outside of SQLite. More recent versions of the OS can even do this automatically for you.
However, I would question why you would want to do this. Surely there are attributes that are going to be less vital than others and can be abstracted away into child entities on the other side of a one-to-one relationship?
I'm writing an API that gets information about the CPU (using CPUID). What I'm wondering is should I store the values from the bit field returned by calling CPUID in separate integer values, or should I just store the entire bit field in a value and write functions to get the different values on-the-fly?
What is preferable in this case? Memory usage or speed? If it's memory usage, I'll just store the entire bit field in a single variable. If it's speed, I'll store each value in a separate variable.
You're only going to query a CPU once. With modern computers having both huge amounts of memory and processing power, it would make no difference either way.
Just do what would make more sense for the next person who reads it.
Programs must be written for people to read, and only incidentally for machines to execute.
— Structure and Interpretation of Computer Programs
I think it does not matter here, because you will not call your CPUID code 10,000 times per second... will you?
I think you can define a different interface (method) for each value; this is clearer and easier to use. A clear, accurate, and easy-to-use interface should be the first thing to consider, then performance (memory usage and speed).
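For illustration (Java here, but the same shape works in any language): the "store the raw register, decode on the fly" option is just a mask and a shift per accessor, which any compiler will inline. The field layout below is the real CPUID leaf-1 EAX layout; how you obtain the raw value in the first place is outside the scope of this sketch.

```java
// Store the raw bit field once; expose one accessor per value.
// CPUID leaf 1, EAX: stepping in bits 3:0, model in 7:4, family in 11:8.
public class CpuidLeaf1 {
    private final int eax; // raw register value, stored as-is

    public CpuidLeaf1(int eax) { this.eax = eax; }

    public int stepping() { return eax & 0xF; }
    public int model()    { return (eax >>> 4) & 0xF; }
    public int family()   { return (eax >>> 8) & 0xF; }
}
```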