Moving data from one node to another node in the same cluster in Apache Ignite - ignite

In a baseline cluster of 8 nodes, we have data in a table created with the partitioned template and no backups. Assume each of the 8 nodes holds an average of 28K entries of SampleTable (cache). Total data = 28K * 8 = 224K entries.
CREATE TABLE SampleTable(....) WITH "template=partitioned"
Now I want to shut down one node, and before shutting it down I want to move the data from the 8th node to the other nodes, so approximately 32K entries per node (32K * 7 = 224K) across the remaining 7 nodes. Can I move data from any node to the other nodes?
How can I move all the data from one node to the other nodes in the cluster before shutting that node down, keeping the data balanced and distributed across the remaining 7 nodes?
I created the table (SampleTable) with a CREATE statement and inserted the data with INSERT statements (over a JDBC connection).
Persistence is enabled.

I think the most straightforward way is to use backups.
In any case, if you need to avoid data loss, backups (and/or persistence) are a must.
As a simple workaround you can try the following steps:
Scan the local data on the node you want to shut down using a ScanQuery and store the data in an external database.
After that, shut down the node and exclude it from the baseline topology.
Load the data back from the database.
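A minimal sketch of the first step, assuming the cache is named "SQL_PUBLIC_SAMPLETABLE" (the name Ignite generates for a table created via SQL in the default schema) and that saveToDatabase is a hypothetical helper that writes one entry to your external store:

```java
// Run this on the node being decommissioned.
IgniteCache<Object, Object> cache = ignite.cache("SQL_PUBLIC_SAMPLETABLE");

// setLocal(true) restricts the scan to data held on this node only.
ScanQuery<Object, Object> qry = new ScanQuery<>();
qry.setLocal(true);

try (QueryCursor<Cache.Entry<Object, Object>> cursor = cache.query(qry)) {
    for (Cache.Entry<Object, Object> e : cursor) {
        saveToDatabase(e.getKey(), e.getValue()); // hypothetical helper
    }
}
```

The try-with-resources block ensures the query cursor is closed even if the export fails partway through.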

The approach described below works only if backups are configured in the cluster (backups > 0).
To remove a node from the baseline topology and rebalance its data across the remaining 7 nodes, you can use the control script (control.sh):
Stop the node you want to remove from topology.
Wait until the node is stopped. The message "Ignite node stopped OK" should appear in the logs.
Check that the node is offline:
$IGNITE_HOME/bin/control.sh --baseline
Cluster state: active
Current topology version: 8
Baseline nodes:
ConsistentID=<node1_id>, STATE=ONLINE
ConsistentID=<node2_id>, STATE=ONLINE
...
ConsistentID=<node8_id>, STATE=OFFLINE
--------------------------------------------------------------------------------
Number of baseline nodes: 8
Other nodes not found.
Remove the node from the baseline topology:
$IGNITE_HOME/bin/control.sh --baseline remove <node8_id>
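The same baseline change can also be made programmatically; a sketch assuming Ignite 2.4+ (when baseline topology was introduced), which resets the baseline to the server nodes that are currently alive after the 8th node has been stopped:

```java
// Obtain a reference to the (already started) local Ignite instance.
Ignite ignite = Ignition.ignite();

// The stopped node is no longer among the alive server nodes, so setting
// the baseline to this collection removes it and triggers rebalancing.
Collection<ClusterNode> aliveServers = ignite.cluster().forServers().nodes();
ignite.cluster().setBaselineTopology(aliveServers);
```

Rebalancing then redistributes the stopped node's partitions across the remaining 7 nodes.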

Related

Ignite with backup count zero

I have set the backup count of an Ignite cache to zero. I have created two server nodes (say s1 and s2) and one client node (c1). I have set the cache mode to Partitioned and inserted data into the cache. I stopped server 2 and tried to access the data, but it is not returned. If the backup count is 0, how do I copy data from one server node to another server node? Does Ignite do this automatically when we stop a node?
The way Ignite manages this is with backups. If you set it to zero, you have no resilience and removing a node will result in data loss (unless you enable persistence). You can configure how Ignite responds to this situation with the Partition Loss Policy.
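A sketch of a cache configuration with one backup and an explicit partition loss policy (the cache name "myCache" and key/value types are illustrative):

```java
CacheConfiguration<Integer, String> cfg = new CacheConfiguration<>("myCache");
cfg.setCacheMode(CacheMode.PARTITIONED);
cfg.setBackups(1); // keep one backup copy of every partition

// Fail reads and writes against lost partitions instead of
// silently returning nulls when a node drops out.
cfg.setPartitionLossPolicy(PartitionLossPolicy.READ_WRITE_SAFE);

IgniteCache<Integer, String> cache = ignite.getOrCreateCache(cfg);
```

With backups set to 1, the cluster survives the loss of any single node without losing data.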

Merge two persistent caches in Apache Ignite

My application uses Apache Ignite persistent storage. For some weeks I ran the application storing the persistent data in, let's say, c:\db1. Later I ran the same application with persistent data in c:\db2. The data was only stored on this one server node.
Is there a way to merge the data from the db1 folder into the db2 folder?
No, you can't, at least not easily.
The best way would be to start two nodes in separate clusters, one using c:\db1 and one using c:\db2, and stream data from one to the other:
Start the two clusters
Start a helper application that will load the data
In the application, start two client nodes with different configurations - one connected to the first cluster, one connected to the second
Transfer the data, roughly like this (the code is untested!):
IgniteCache<Object, Object> cache1 = client1.cache("mycache");
IgniteCache<Object, Object> cache2 = client2.cache("mycache");
try (QueryCursor<Cache.Entry<Object, Object>> cursor = cache1.query(new ScanQuery<>())) {
    for (Cache.Entry<Object, Object> e : cursor) {
        cache2.put(e.getKey(), e.getValue());
    }
}

Is it possible to perform SQL Query with distributed join over a local cache and a partitioned cache?

I am currently using Apache Ignite 2.3.0 and the Java API. I have a data grid with two nodes and two different caches. One is local and the other partitioned.
Let's say my local cache is on node #1.
I want to perform an SQL query (SqlFieldsQuery) with distributed join so that it returns data from local cache on node #1 and data from partitioned cache on node #2.
Is it possible? Do I need to specify the join in some particular order or activate a specific flag?
All my current tests return no rows from the partitioned cache that are not located on the same node as the local cache.
I tested the same query with a distributed join over two different partitioned caches with no affinity, and it returned data from different nodes properly. Is there a reason why this wouldn't work with a local cache too?
Thanks
It is not possible to perform joins (either distributed or co-located) between LOCAL and PARTITIONED caches. The workaround is to use two PARTITIONED caches.
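A sketch of the workaround, assuming two partitioned caches; the table/value types (Person, Company), the "companies" schema name, and the field names are illustrative:

```java
// Join rows from two PARTITIONED caches; with distributed joins
// enabled, matching rows may live on different nodes.
SqlFieldsQuery qry = new SqlFieldsQuery(
    "SELECT p.name, c.name " +
    "FROM Person p JOIN \"companies\".Company c ON p.companyId = c.id");
qry.setDistributedJoins(true);

try (FieldsQueryCursor<List<?>> cursor = personCache.query(qry)) {
    for (List<?> row : cursor) {
        System.out.println(row);
    }
}
```

Note that distributed joins are slower than co-located joins, since rows must be shipped between nodes at query time; co-locating the two caches by affinity key avoids that cost.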

redis keys not getting deleted after expire time

Keys set with an expire are not getting cleared after the expire time. For example, in redis-cli:
> set hi bye
> expire hi 10
> ttl hi #=> 9
#(after 10 seconds)
> ttl hi #=> 0
> get hi #=> bye
The Redis version is 2.8.4. This is a master node in a Sentinel setup with a single slave. Persistence is turned off. Kindly help me debug this issue.
If any of the data in Redis is very large, slave nodes may have problems syncing it from the master; the TTLs of that data may not sync to the slave, so the data on the slave nodes never gets deleted.
You can use a script to delete the specific data on the master node; the slave nodes will then delete the data whose keys can no longer be found on the master.
Update the redis.conf file to set notify-keyspace-events Ex, and then restart the Redis server using redis-server /usr/local/etc/redis.conf

Aerospike Data Recovery Other Cluster Or Local HDD

I use a cluster with a storage-engine device configuration.
When I restart one node, is the data recovered from the other cluster nodes or from the local HDD?
When I restart the whole cluster, where is the data restored from?
I want to understand the whole process.
Version: Community Edition
I have 3 nodes.
storage-engine device {
    file /opt/aerospike/datafile
    filesize 1G
    data-in-memory true
}
This is my config.
Scenario: I stop node1, so the cluster has 2 nodes, and then I modify data (data that was previously on node1).
Then I stop node2 and node3. After the whole cluster is stopped, I start node1, then node2, then node3.
Will this produce dirty (stale) data?
Can I assume node3 has all the data?
Let me try to answer from what I could get from your question. Correct me if my understanding is wrong.
You have a file-backed namespace in Aerospike. The data will be persisted to the file, and it is also kept in memory (because of the data-in-memory true setting). The default replication factor is 2, so in a stable state your data resides on 2 nodes.
When you shut down the 3 nodes one by one, the unchanged data will still be in the persistent files. So, when the nodes are restarted, the data will come back from the persistent files.
The data that changed during the shutdown (node1 down but node2 & node3 up) is the tricky part. When node1 is down, a copy of its data will be on one of node2 & node3 (because of replication factor = 2). So, when you update a record, Aerospike performs what is called duplicate resolution, which fetches the latest version of the record and updates it on the new master node, where it is then persisted.
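For reference, the replication factor is configured per namespace in aerospike.conf; a fragment sketching the setup described above (the namespace name test is illustrative, the storage-engine block mirrors your config):

```
namespace test {
    replication-factor 2
    storage-engine device {
        file /opt/aerospike/datafile
        filesize 1G
        data-in-memory true
    }
}
```

With replication-factor 2 on a 3-node cluster, every record has a master copy and one replica on two different nodes.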