B+ Tree and Index Page in Apache Ignite

I'm trying to understand the purpose of B+ Trees and Index Pages for Apache Ignite as described here: https://apacheignite.readme.io/docs/page-memory
I have a few questions:
What exactly does an Index Page contain? An ordered list of hash code values for the keys that fall into the index page, plus "other" information used to locate and index into the data page to store/get the key-value pair?
Since hash codes are used in the index pages, what happens if a collision occurs?
For a "typical" application, do we expect the number of data pages to be much higher than the number of index pages? (since data pages contain the key-value pairs)
What type of relation exists between a distributed cache that we create using ignite.getOrCreateCache(name) and a memory region? 1-to-1, Many-to-1, 1-to-Many, or Many-to-Many?
Consider the following pseudo code:
Ignite ignite = Ignition.start("two_server_node_config");
IgniteCache<Integer,String> cache = ignite.getOrCreateCache("my_cache");
cache.put(7, "abcd");
How does Ignite determine the node to put the key into?
Once the target node is determined, how does Ignite locate the specific memory region the key belongs to?
Thanks

An index page contains an ordered list of hash values along with links to the key-value pairs stored in durable memory. A link is a page ID plus an offset inside that page.
All links to objects whose hashes collide are present in the index page. To perform a lookup, Ignite dereferences the links and compares the actual keys.
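As a rough sketch of the idea (the bit widths and layout here are assumptions for illustration; Ignite's real internal encoding differs), a link can be packed into a single long like this:
public final class PageLink {
    // Assumed width for the in-page offset -- illustrative only.
    private static final int OFFSET_BITS = 16;
    private static final long OFFSET_MASK = (1L << OFFSET_BITS) - 1;

    // Pack a page ID and an in-page offset into one 8-byte link.
    static long link(long pageId, int offset) {
        return (pageId << OFFSET_BITS) | (offset & OFFSET_MASK);
    }

    static long pageId(long link) { return link >>> OFFSET_BITS; }

    static int offset(long link) { return (int) (link & OFFSET_MASK); }

    public static void main(String[] args) {
        long l = link(42L, 128);
        System.out.println(pageId(l) + " @ " + offset(l)); // prints: 42 @ 128
    }
}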
This depends on object size. For a "typical" application you can roughly estimate the ratio of data pages to index pages as 90 to 10. However, the share of index pages will grow if you add extra indexes: https://apacheignite.readme.io/v2.1/docs/indexes#section-registering-indexed-types
You may also find useful the most recent version of docs: https://apacheignite.readme.io/v2.1/docs/memory-architecture

Answering the last two questions:
Many-to-1. The same memory region can be used for multiple caches.
This is based on affinity. Basically, the cache key is mapped to an affinity key (by default they are the same), and then the affinity function is called to determine the partition and node. More details about affinity here: https://apacheignite.readme.io/docs/affinity-collocation
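If you want to observe this mapping yourself, Ignite's public Affinity API exposes it. A minimal sketch, reusing the config name from the pseudo code above:
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.affinity.Affinity;
import org.apache.ignite.cluster.ClusterNode;

public class AffinityDemo {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start("two_server_node_config");
        IgniteCache<Integer, String> cache = ignite.getOrCreateCache("my_cache");
        cache.put(7, "abcd");

        // Key -> partition -> primary node, as computed by the affinity function.
        Affinity<Integer> aff = ignite.affinity("my_cache");
        int part = aff.partition(7);                // which partition key 7 falls into
        ClusterNode primary = aff.mapKeyToNode(7);  // which node holds the primary copy
        System.out.println("key 7 -> partition " + part + " on node " + primary.id());
    }
}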

Related

Is there a way of shuffling partition data on Apache Ignite?

I've got a question that is related to data repartitioning.
Suppose there's a cache with a pre-defined affinity key. Assume I need to repartition data with a new affinity key. I'm wondering whether there is a way of shuffling partition data across all nodes by a new affinity key?
You need to repopulate the data in that case.
First, the affinity key is static configuration and can't be changed on the fly.
Second, you will most likely need to clear the meta-information for that particular type, i.e. clean the work/binary_meta folder.
Last, once you have changed it, you won't be able to locate the existing data, since it will most likely be stored in a different partition.
In other words, say you had a cache key with two fields A and B: K(A,B), where A is your affinity key. Say your Key(1,2) was mapped to partition 5. In that case, to locate the value, Ignite will search for partition 5 on whichever node holds the primary copy of it. Later you want B to be the affinity key and re-configure the cache accordingly. Key(1,2) might now be mapped to partition 780, meaning that Ignite will never search partition 5 and won't be able to locate the previously stored data.
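For illustration, here is how the K(A,B) key above could be declared so that A acts as the affinity key (a hypothetical key class, not from the original question):
import org.apache.ignite.cache.affinity.AffinityKeyMapped;

public class K {
    /** Affinity key: entries with equal 'a' are collocated in the same partition. */
    @AffinityKeyMapped
    private int a;

    private int b;

    public K(int a, int b) { this.a = a; this.b = b; }

    // equals() and hashCode() over both fields are required for a cache key.
}
Moving the annotation from a to b changes the key->partition mapping, which is exactly why previously stored entries can no longer be located without repopulating.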

Performance difference in Couchbase's get by Key and select by index

As part of benchmark tests on our Couchbase DB, we tried to compare searching for items by their id/key against searching for items via a query that uses a secondary index.
Following this article about indexing and performance in Couchbase, we thought the performance of the two would be the same.
However, in our tests we discovered that the search by key/id was sometimes much faster than the search that uses the secondary index.
E.g. ~3 ms to search using the index vs. ~0.3 ms to search by the key (a factor of 10).
The point is that this difference is not consistent: the search by key varies from 0.3 ms to 15 ms.
We are wondering:
Should search by key perform better than search by a secondary index?
Should there be such time variance between key searches?
The results you get are consistent with what I would expect. Couchbase works as a key-value store when you do any operation using the id. A key-value store is roughly a big distributed hashmap, and in this data structure you get very good performance on get/save/delete while using the id.
Whenever you store a new document, Couchbase hashes the key and assigns a virtual bucket to it (something similar to a shard). When you need to get this document back, it uses the same algorithm to find out in which virtual bucket the document is located. As the SDK has the cluster map and knows exactly which node has which shards, your application requests the document directly from the node that owns it.
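For intuition, the key-to-virtual-bucket mapping is just a deterministic hash. A simplified sketch (Couchbase uses a CRC32-based hash over 1024 vBuckets; treat the exact bit manipulation as an implementation detail):
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class VBucketSketch {
    static final int NUM_VBUCKETS = 1024;

    // Simplified: Couchbase derives the vBucket from a CRC32 hash of the key.
    static int vbucketOf(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) ((crc.getValue() >> 16) & 0x7fff) & (NUM_VBUCKETS - 1);
    }

    public static void main(String[] args) {
        // The same key always maps to the same vBucket, so the SDK can route
        // the request straight to the node that currently owns that vBucket.
        System.out.println(vbucketOf("user::000000000001"));
    }
}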
On the other hand, when you query the database, Couchbase internally has to do a map/reduce to find out where the documents are located, which is why operations by id are faster.
About the results ranging from 0.3 ms to 15 ms, it is hard to tell without debugging your environment. However, a number of factors could contribute to it, e.g. whether the document is cached or not, whether the node is undersized, etc.
To add to @deniswrosa's answer, the secondary index will always be slower, because the index must first be traversed based on your query to find the document key, and then a key lookup is performed. Doing just the key lookup is faster if you already have the key. The amount of work needed to traverse the index varies depending on how selective the index is, whether the entire index is in memory, etc. Memory-optimized indexes can ensure that the whole index is in memory, if you have enough memory to support that.
Of course, even a simple key lookup can be slower if the document in question is not in the cache and needs to be brought into memory from storage.
It is possible to achieve sub-millisecond secondary lookups at scale, but it requires some tuning of your query, your index, and possibly some of Couchbase's system parameters. Consider the following simple example:
Sample document in userBucket:
"user::000000000001" : {
"email" : "benjamin1#couchbase.com",
"userId" : "000000000001"
}
This query:
SELECT userId
FROM userBucket
WHERE
email = "benjamin1#couchbase.com"
AND userId IS NOT NULL
;
...should be able to achieve sub-millisecond performance with a properly tuned secondary index:
CREATE INDEX idx01 ON userBucket(email, userId);
Since the index completely covers the query, there is no need for the query engine to FETCH the document from the K/V store. However, "SELECT * ..." will always cause the query service to FETCH the document, and will thus be slower than a simple k/v GET("user::000000000001").
For the best latencies, make sure to review your query plan (using EXPLAIN syntax) and make sure your query is not FETCHing. https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/explain.html
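Assuming the Couchbase Java SDK 2.x, the two access paths discussed above look roughly like this (connection details are placeholders):
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonArray;
import com.couchbase.client.java.query.N1qlQuery;
import com.couchbase.client.java.query.N1qlQueryResult;

public class LookupComparison {
    public static void main(String[] args) {
        Bucket bucket = CouchbaseCluster.create("localhost").openBucket("userBucket");

        // Path 1: plain k/v GET -- one hash, one hop to the owning node.
        JsonDocument doc = bucket.get("user::000000000001");

        // Path 2: covered N1QL query -- served by the Query/Index services.
        N1qlQueryResult result = bucket.query(N1qlQuery.parameterized(
            "SELECT userId FROM userBucket WHERE email = $1 AND userId IS NOT NULL",
            JsonArray.from("benjamin1#couchbase.com")));

        System.out.println(doc.content());
        result.forEach(row -> System.out.println(row));
    }
}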

Aerospike: How Primary & Secondary Index works internally

We are using Aerospike DB and was going through the documentation.
I could not find a good explanation of the algorithm behind how the primary and secondary indexes work.
The documentation says it uses some sort of distributed hash + B-tree.
Could someone please explain it?
The primary index is a mix of a distributed hash and distributed trees. It holds the metadata for every record in the Aerospike cluster.
Each namespace has 4096 partitions that are evenly distributed to the nodes of the cluster, by way of the partition map. Within the node, the primary index is an in-memory structure that indexes only the partitions assigned to the node.
The primary index has a hash table that leads to sprigs. Each sprig is a red-black tree that holds a portion of the metadata. The number of sprigs per-partition is configurable through partition-tree-sprigs.
Therefore, to find any record in the cluster, the client first uses the record's digest to find the correct node with one lookup against the partition map. Then, the node holding the master partition for the record will look up its metadata in the primary index. If this namespace stores data on SSD, the metadata includes the device, block ID and byte offset of the record, so it can be read with a single read operation. The records are stored contiguously, whether on disk or in memory.
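As a sketch of the digest-to-partition step (the 4096-partition constant matches the description above, but treat the exact byte layout as illustrative):
public class PartitionSketch {
    static final int PARTITIONS = 4096; // per namespace, as described above

    // Aerospike hashes (set, key) with RIPEMD-160 into a 20-byte digest;
    // the partition ID comes from the digest's low-order bits.
    static int partitionId(byte[] digest) {
        int lowBits = (digest[0] & 0xFF) | ((digest[1] & 0xFF) << 8);
        return lowBits & (PARTITIONS - 1);
    }

    public static void main(String[] args) {
        byte[] digest = new byte[20]; // placeholder; a real digest comes from RIPEMD-160
        digest[0] = 0x2A;
        System.out.println(partitionId(digest)); // prints: 42
    }
}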
The primary index is used for operations against a single record (identified by its key), or batch operations against multiple records (identified by a list of keys). It's also used by scans.
Secondary indexes are optional in-memory structures within each node of the cluster that also index only the records of the partitions assigned to that node. They're used for query operations, which are intended to return many records based on a non-key predicate.
Because Aerospike is a distributed database, a query must go to all the nodes. The concurrency level (how many nodes are queried at a time) is controlled through a query policy in the client. Each node receiving the query has to look up the criteria of the predicate against the appropriate secondary index. This returns zero to many records. At this point the optional predicate filter can be applied. The records found by the secondary index query are then streamed back to the client. See the documentation on managing indexes.

Understanding Cache Keys, Index, Partition and Affinity w.r.t reads and writes

I am new to Apache Ignite and come from a Data Warehousing background.
So pardon me if I try to relate to Ignite through DBMS jargon.
I have gone through forums but I am still unclear about some of the basics.
I also would like specific answers to the scenario I have posted later.
1.) CacheMode=PARTITIONED
a.) When a cache is declared as partitioned, does the data get equally partitioned across all nodes by default?
b.) Is there an option to provide a "partition key" based on which the data would be distributed across the nodes? Is this what we call the Affinity Key?
c.) How is partitioning different from affinity, and can a cache have both a partition key and an affinity key?
2.) Affinity Concept
With an Affinity Key defined, when I load data (using loadCache()) into a partitioned cache, will the source rows be sent to the node they belong to, or to all the nodes in the cluster?
3.) If I create one index on the cache, does it by default become the partition/affinity key as well? In such a scenario, how is a partition different from an index?
SCENARIO DESCRIPTION
I want to load data from a persistence layer into a Staging Cache (assume ~2B records) using loadCache(). The cache resides on a 4-node cluster.
a.) How do I load data such that each node has to process only 0.5B records?
Is it by using Partitioned cache mode and defining an Affinity Key?
Then I want to read transactions from the Staging Cache in TRANSACTIONAL atomicity mode, look up a Target Cache and do some operations.
b.) When I do the lookup on the Target Cache, how can I ensure that the lookup happens only on the node where the data resides, and not on all the nodes on which the Target Cache resides?
Would that be done using the AffinityKeyMapper API? If yes, how?
c.) Let's say I wanted to do a lookup on a key other than the Affinity Key column; can creating an index on the lookup column help? Would I end up scanning all nodes in that case?
Staging Cache:
  CustomerID
  CustomerEmail
  CustomerPhone
Target Cache:
  Seq_Num
  CustomerID
  CustomerEmail
  CustomerPhone
  StartDate
  EndDate
This is answered on Apache Ignite users forum: http://apache-ignite-users.70518.x6.nabble.com/Understanding-Cache-Key-Indexes-Partition-and-Affinity-td11212.html
Ignite uses an AffinityFunction [1] for data distribution. The AF implements two mappings: key->partition and partition->node.
The key->partition mapping determines which partition an entry belongs to. It is not concerned with backups, only with data collocation/distribution over partitions.
Usually the entry key (actually, its hash code) is used to calculate the partition the entry belongs to.
But you can use an AffinityKey [2] that will be used instead, to manage data collocation. See also the 'org.apache.ignite.cache.affinity.AffinityKey' javadoc.
The partition->node mapping determines the primary and backup nodes for a partition. It is not concerned with data collocation, only with backups and partition distribution among nodes.
Cache.loadCache just makes all nodes call the localLoadCache method, which calls CacheStore.loadCache. So each grid node will load all the data from the cache store and then discard the data that is not local to the node.
The same data may reside on several nodes if you use backups. The AffinityKey should be a part of the entry key; if AffinityKey mapping is configured, then the AffinityKey will be used instead of the entry key for the entry->partition mapping, and the AffinityKey will be passed to the AffinityFunction.
Indexes always reside on the same node as the data.
a. To achieve this, you should implement the CacheStore.loadCache method so that it loads data only for certain partitions. E.g. you can store a partitionID for each row in the database; a sketch follows below.
However, if you change the AF or the number of partitions, you should update the partitionID for the entries in the database as well.
The other way: if it is possible, you can load all the data on a single node and then add the other nodes to the grid. The data will be rebalanced over the nodes automatically.
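A minimal sketch of the first option, assuming a hypothetical loadRowsForPartition DAO method, a hypothetical Customer value class, and a cache named stagingCache (none of these are from the original question):
import org.apache.ignite.Ignite;
import org.apache.ignite.cache.store.CacheStoreAdapter;
import org.apache.ignite.lang.IgniteBiInClosure;
import org.apache.ignite.resources.IgniteInstanceResource;

public class StagingCacheStore extends CacheStoreAdapter<Integer, Customer> {
    @IgniteInstanceResource
    private Ignite ignite;

    @Override
    public void loadCache(IgniteBiInClosure<Integer, Customer> clo, Object... args) {
        // Only the partitions this node is primary for.
        int[] parts = ignite.affinity("stagingCache")
            .primaryPartitions(ignite.cluster().localNode());

        for (int part : parts) {
            // Hypothetical DAO call, e.g. SELECT ... WHERE partitionID = ?
            for (Customer c : loadRowsForPartition(part))
                clo.apply(c.getCustomerId(), c);
        }
    }

    // load(), write() and delete() overrides omitted for brevity.
}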
b. The AffinityKey is always used if configured, since it must be part of the entry key. So the lookup will always happen on the node where the data resides.
c. I can't understand the question. Would you please clarify if it is still relevant?

Neo4j - Find node by ID - How to get the ID for querying?

I want to be able to find a specific node by its ID for performance reasons (IDs are more efficient than indexes).
In order to execute the following example:
MATCH (s)
WHERE ID(s) = 65110
RETURN s
I will need the ID of the node (65110 in this case)
But how do I get it? Since the ID is auto-generated, it's impossible to find the ID without querying the graph, which kind of defeats the purpose, since I will already have the node.
Am I missing something?
TL;DR: use an indexed property for lookups unless you absolutely need to optimise and can measure the difference.
Typically you use an index lookup as an entry point to the graph, that is, to obtain the node that provides the start of an edge traversal. While the pointer-like nature of Neo4j node IDs means they are theoretically faster, index lookups are also very efficient so you should not discount them on performance grounds unless you are sure it will make a measurable difference.
You should also consider that Neo4j node IDs are not stable. If you delete a node it is possible for the same ID to be re-used in future. For this reason they should really be considered an internal implementation detail and not one that should be relied on as part of your application's external interface.
That said, I have an application that stores Neo4j IDs in a Solr index for looking up nodes in bulk, but this index is considered volatile and the nodes also contain an indexed, application-generated UUID property (with a unique constraint) that serves as their main "primary key".
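For example, with the Neo4j Java driver (4.x API assumed; the :Item label, UUID value and credentials are placeholders), the UUID-based lookup looks like this:
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;
import org.neo4j.driver.Values;

public class UuidLookup {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {

            // One-time setup: CREATE CONSTRAINT ON (i:Item) ASSERT i.uuid IS UNIQUE
            Result result = session.run(
                "MATCH (i:Item {uuid: $uuid}) RETURN i",
                Values.parameters("uuid", "a-previously-generated-uuid"));

            if (result.hasNext())
                System.out.println(result.next().get("i").asNode());
        }
    }
}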
Further reading and discussion: https://github.com/neo4j/neo4j/issues/258