Chronicle Map v3.9.0 returning different sizes - chronicle-map

I am using v3.9.0 of Chronicle Map where I have two processes where Process A writes to a ChronicleMap and Process B just initializes with the same persistent file that A uses. After loading, I print Map.size in Process A and Process B but I get different Map size. I am expecting both sizes to be the same. In what cases, can I see this behaviour?
How can I troubleshoot this problem? Is there any kind of flush operation required?
One thing, I tried to do is to dump the file using getAll method but it dumps everything as a json on a single file which is pretty much killing any of the editors I have. I tried to use MapEntryOperations in Process B to see if anything interesting happening but seems like it is mainly invoked when something is written into the map but not when Map is initialized directly from the persistent store.

I was using createOrRecoverPersistedTo instead of createPersistedTo method. Because of this, my other process was not seeing the entire data.
As described in Recovery section in the tutorial elaborates:
.recoverPersistedTo() needs to access the Chronicle Map exclusively. If a concurrent process is accessing the Chronicle Map while another process is attempting to perform recovery, result of operations on the accessing process side, and results of recovery are unspecified. The data could be corrupted further. You must ensure no other process is accessing the Chronicle Map store when calling for .recoverPersistedTo() on this store."

This seems pretty strange. Size in Chronicle Map is not stored in a single memory location. ChronicleMap.size() walks through and sums each segment's size. So size is "weakly consistent" and if one process constantly writes to a Map, size() calls from multiple threads/processes could return slightly different values. But if nobody writes to a map (e. g. in your case just after loading, when process A hasn't yet started writing) all callers should see the same value.
Instead of getAll() and analyzing output manually, you can try to "count" entries by something like
int entries = 0;
for (K k : map.keySet()) {
entries++;
}

Related

calling getall in apache geode to getall data for the keys in the list is slow

I am using region getall method to get values for all keys, but what i found is that for the key present in apache geode it gets data quickly but for one which is not present in apache geode. it calls the cache loader one by one. Is there any mechanism so that calling cacheloader can be made parallel.
I don't think there's a way of achieving this out of the box, at least not while using a single Region.getAll() operation. If I recall correctly, the servers just iterate through the keys and performs a get on every single one, which ends up triggering the CacheLoader execution.
You could, however, achieve some degree of parallelism by splitting the set of keys into multiple sets, and launching different threads to execute the Region.getAll() operation using these smaller sets. The actual size of each set and the number of threads will depend on the ratio of cache hits / misses you expect, and your SLA requirements, of course.

Flink batching Sink

I'm trying to use flink in both a streaming and batch way, to add a lot of data into Accumulo (A few million a minute). I want to batch up records before sending them to Accumulo.
I ingest data either from a directory or via kafka, convert the data using a flatmap and then pass to a RichSinkFunction, which adds the data to a collection.
With the streaming data, batching seems ok, in that I can add the records to a collection of fixed size which get sent to accumulo once the batch threshold is reached. But for the batch data which is finite, I'm struggling to find a good approach to batching as it would require a flush time out in case there is no further data within a specified time.
There doesn't seem to be an Accumulo connector unlike for Elastic search or other alternative sinks.
I thought about using a Process Function with a trigger for batch size and time interval, but this requires a keyed window. I didn't want to go down the keyed route as data looks to be very skewed, in that some keys would have a tonne of records and some would have very few. If I don't use a windowed approach, then I understand that the operator won't be parallel. I was hoping to lazily batch, so each sink only cares about numbers or an interval of time.
Has anybody got any pointers on how best to address this?
You can access timers in a sink by implementing ProcessingTimeCallback. For an example, look at the BucketingSink -- its open and onProcessingTime methods should get you started.

Aerospike: Device Overload Error when size of map is too big

We got "device overload" error after the program ran successfully on production for a few months. And we find that some maps' sizes are very big, which may be bigger than 1,000.
After I inspected the source code, I found that the reason of "devcie overload" is that the write queue is beyond limitations, and the length of the write queue is related to the effiency of processing.
So I checked the "particle_map" file, and I suspect that the whole map will be rewritten even if we just want to insert one pair of KV into the map.
But I am not so sure about this. Any advice ?
So I checked the "particle_map" file, and I suspect that the whole map will be rewritten even if we just want to insert one pair of KV into the map.
You are correct. When using persistence, Aerospike does not update records in-place. Each update/insert is buffered into an in-memory write-block which, when full, is queued to be written to disk. This queue allows for short bursts that exceed your disks max IO but if the burst is sustained for too long the server will begin to fail the writes with the 'device overload' error you have mentioned. How far behind the disk is allowed to get is controlled by the max-write-cache namespace storage-engine parameter.
You can find more about our storage layer at https://www.aerospike.com/docs/architecture/index.html.

Retrieving and using partial results from Pool

I have three functions that read, process and write respectively. Each function was optimized (to the best of my knowledge) to work independently. Now, I am trying to pass each result of each function the next one in the chain as soon as it is available instead of waiting for the entire list. I am not really sure how I can connect them. Here's what I have so far.
def main(files_to_load):
loaded_files = load(files_to_load)
with ThreadPool(processes=cpu_count()) as pool:
proccessed_files = pool.map_async(processing_function_with_Pool, iterable=loaded_files).get()
write(proccessed_files)
As you can see, my main() function waits for all the files to load (about 500Mb) stores them to memory and sends them to processing_function_with_Pool() which divides the files into chunks to be processed.After all the processing is done, the files will start to be written to disk. I feel like there's a lot of unnecessary waiting between these three steps. How can I connect everything?
Now your logic is reading all the files sequentially (I guess) and storing them at once in memory.
I'd recommend you to send to processing_function_with_Pool just a list with the file names to be processed.
The processing_function_with_Pool will take care of reading, processing the file and writing the results back.
In this way you'll take advantage of dealing with IO concurrently.
If the processing_function_with_Pool is doing CPU-bound work, I'd suggest you to switch to a Pool of processes.

Trident or Storm topology that writes on Redis

I have a problem with a topology. I try to explain the workflow...
I have a source that emits ~500k tuples every 2 minutes, these tuples must be read by a spout and processed exatly once like a single object (i think a batch in trident).
After that, a bolt/function/what else?...must appends a timestamp and save the tuples into Redis.
I tried to implement a Trident topology with a Function that save all the tuples into Redis using a Jedis object (Redis library for Java) into this Function class, but when i deploy i receive a NotSerializable Exception on this object.
My question is.How can i implement a Function that writes on Redis this batch of tuples? Reading on the web i cannot found any example that writes from a function to Redis or any example using State object in Trident (probably i have to use it...)
My simple topology:
TridentTopology topology = new TridentTopology();
topology.newStream("myStream", new mySpout()).each(new Fields("field1", "field2"), new myFunction("redis_ip", "6379"));
Thanks in advance
(replying about state in general since the specific issue related to Redis seems solved in other comments)
The concepts of DB updates in Storm becomes clearer when we keep in mind that Storm reads from distributed (or "partitioned") data sources (through Storm "spouts"), processes streams of data on many nodes in parallel, optionally perform calculations on those streams of data (called "aggregations") and saves the results to distributed data stores (called "states"). Aggregation is a very broad term that just means "computing stuff": for example computing the minimum value over a stream is seen in Storm as an aggregation of the previously known minimum value with the new values currently processed in some node of the cluster.
With the concepts of aggregations and partition in mind, we can have a look at the two main primitives in Storm that allow to save something in a state: partitionPersist and persistentAggregate, the first one runs at the level of each cluster node without coordination with the other partitions and feels a bit like talking to the DB through a DAO, while the second one involves "repartitioning" the tuples (i.e. re-distributing them across the cluster, typically along some groupby logic), doing some calculation (an "aggregate") before reading/saving something to DB and it feels a bit like talking to a HashMap rather than a DB (Storm calls the DB a "MapState" in that case, or a "Snapshot" if there's only one key in the map).
One more thing to have in mind is that the exactly once semantic of Storm is not achieved by processing each tuple exactly once: this would be too brittle since there are potentially several read/write operations per tuple defined in our topology, we want to avoid 2-phase commits for scalability reasons and at large scale, network partitions become more likely. Rather, Storm will typically continue replaying the tuples until he's sure they have been completely successfully processed at least once. The important relationship of this to state updates is that Storm gives us primitive (OpaqueMap) that allows idempotent state update so that those replays do not corrupt previously stored data. For example, if we are summing up the numbers [1,2,3,4,5], the resulting thing saved in DB will always be 15 even if they are replayed and processed in the "sum" operation several times due to some transient failure. OpaqueMap has a slight impact on the format used to save data in DB. Note that those replay and opaque logic are only present if we tell Storm to act like that, but we usually do.
If you're interested in reading more, I posted 2 blog articles here on the subject.
http://svendvanderveken.wordpress.com/2013/07/30/scalable-real-time-state-update-with-storm/
http://svendvanderveken.wordpress.com/2014/02/05/error-handling-in-storm-trident-topologies/
One last thing: as hinted by the replay stuff above, Storm is a very asynchronous mechanism by nature: we typically have some data producer that post event in a queueing system (e,g. Kafka or 0MQ) and Storm reads from there. As a result, assigning a timestamp from within storm as suggested in the question may or may not have the desired effect: this timestamp will reflect the "latest successful processing time", not the data ingestion time, and of course it will not be identical in case of replayed tuples.
Have you tried trident-state for redis. There is a code on github that does it already:
https://github.com/kstyrc/trident-redis.
Let me know if this answers your question or not.