Multiple Persistence Stores for Apache Ignite

I have a use case where I need to support multiple persistence stores for my Ignite cluster. For example, cache A1 should be primed from database db1 and cache B1 should be primed from database db2. Can this be done? In the Ignite configuration XML I can only provide the details of one persistence store:
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:util="http://www.springframework.org/schema/util"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/util
http://www.springframework.org/schema/util/spring-util.xsd">
<!-- Datasource for Persistence. -->
<bean name="dataSource"
class="org.springframework.jdbc.datasource.DriverManagerDataSource">
<property name="driverClassName" value="oracle.jdbc.driver.OracleDriver" />
<property name="url" value="jdbc:oracle:thin:@localhost:1521:roc12c" />
<property name="username" value="test" />
<property name="password" value="test" />
</bean>
In my CacheStore implementation I can only access this database, right?

I've not tried this, but if it's similar to other bean-configured systems, you should be able to create another bean with a different name and configuration, then specify the different data sources in the cache configurations for A1 and B1. That said, this is only a theoretical guess.
It may be that you are already doing so, but I can't tell from your question. If you instead implement your caches as described at https://apacheignite.readme.io/docs/persistent-store, you can definitely configure two caches to use different data sources. This is how I currently implement multiple caches: in the cache store I use, I specifically call out which database to go to.
Here is the cache configuration I use for mine:
<property name="cacheConfiguration">
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<!-- Set a cache name. -->
<property name="name" value="recordData"/>
<property name="rebalanceMode" value="ASYNC"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="backups" value="1"/>
<!-- Enable off-heap memory; a max size of 0 means unlimited. -->
<property name="memoryMode" value="OFFHEAP_TIERED"/>
<property name="offHeapMaxMemory" value="0"/>
<property name="swapEnabled" value="false"/>
<property name="cacheStoreFactory">
<bean class="javax.cache.configuration.FactoryBuilder" factory-method="factoryOf">
<constructor-arg value="com.company.util.MyDataStore"/>
</bean>
</property>
<property name="readThrough" value="true"/>
<property name="writeThrough" value="true"/>
</bean>
</property>

Cache store is configured per cache, so you just need to inject different data sources to different stores. What you showed is just a standalone data source bean, it's not even a part of IgniteConfiguration. You can have multiple data source beans with different IDs.
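As a rough sketch of the per-cache wiring described above, the fragment below declares two data source beans and points each cache's store factory at one of them. The bean IDs `dataSourceDb1` and `dataSourceDb2`, the JDBC URLs, and the choice of `CacheJdbcPojoStoreFactory` are assumptions for illustration only. A custom store would need its own way of receiving the data source, and the `types` mapping that the POJO store needs is left out here. The fragment is meant to sit inside the `<beans>` element.

```xml
<!-- Two independent data sources; the IDs and URLs are hypothetical. -->
<bean id="dataSourceDb1" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    <property name="driverClassName" value="oracle.jdbc.driver.OracleDriver"/>
    <property name="url" value="jdbc:oracle:thin:@localhost:1521:db1"/>
    <property name="username" value="test"/>
    <property name="password" value="test"/>
</bean>

<bean id="dataSourceDb2" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    <property name="driverClassName" value="oracle.jdbc.driver.OracleDriver"/>
    <property name="url" value="jdbc:oracle:thin:@localhost:1521:db2"/>
    <property name="username" value="test"/>
    <property name="password" value="test"/>
</bean>

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="cacheConfiguration">
        <list>
            <!-- Cache A1 reads through to db1. -->
            <bean class="org.apache.ignite.configuration.CacheConfiguration">
                <property name="name" value="A1"/>
                <property name="readThrough" value="true"/>
                <property name="cacheStoreFactory">
                    <bean class="org.apache.ignite.cache.store.jdbc.CacheJdbcPojoStoreFactory">
                        <!-- Refers to the data source bean by its ID. -->
                        <property name="dataSourceBean" value="dataSourceDb1"/>
                        <!-- The "types" mapping is omitted in this sketch. -->
                    </bean>
                </property>
            </bean>
            <!-- Cache B1 reads through to db2. -->
            <bean class="org.apache.ignite.configuration.CacheConfiguration">
                <property name="name" value="B1"/>
                <property name="readThrough" value="true"/>
                <property name="cacheStoreFactory">
                    <bean class="org.apache.ignite.cache.store.jdbc.CacheJdbcPojoStoreFactory">
                        <property name="dataSourceBean" value="dataSourceDb2"/>
                        <!-- The "types" mapping is omitted in this sketch. -->
                    </bean>
                </property>
            </bean>
        </list>
    </property>
</bean>
```

The point of the sketch is only that each `CacheConfiguration` carries its own store factory, so nothing forces the two caches to share a database.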

Related

GridGain Near Cache Not storing data

I have a question about the setup of the GridGain near cache. We have a single server node with the config listed below and a single thick client connecting to it successfully:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd">
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
<!-- PEER CLASS LOADING -->
<property name="peerClassLoadingEnabled" value="true"/>
<!-- CACHE CONFIG-->
<property name="cacheConfiguration">
<list>
<!-- ENTER CACHE TEMPLATE-->
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="cache1"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="rebalanceMode" value="SYNC"/>
<property name="nearConfiguration">
<bean class="org.apache.ignite.configuration.NearCacheConfiguration">
<property name="nearEvictionPolicyFactory">
<bean class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicyFactory">
<property name="maxSize" value="100000"/>
</bean>
</property>
</bean>
</property>
</bean>
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="cache2"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="rebalanceMode" value="SYNC"/>
<property name="nearConfiguration">
<bean class="org.apache.ignite.configuration.NearCacheConfiguration">
<property name="nearEvictionPolicyFactory">
<bean class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicyFactory">
<property name="maxSize" value="100000"/>
</bean>
</property>
</bean>
</property>
</bean>
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="cache3"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="rebalanceMode" value="SYNC"/>
<property name="nearConfiguration">
<bean class="org.apache.ignite.configuration.NearCacheConfiguration">
<property name="nearEvictionPolicyFactory">
<bean class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicyFactory">
<property name="maxSize" value="100000"/>
</bean>
</property>
</bean>
</property>
</bean>
</list>
</property>
<!-- DISCOVERY-->
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder">
<property name="namespace" value="gridgain"/>
<property name="serviceName" value="gridgain-service"/>
</bean>
</property>
</bean>
</property>
</bean>
</beans>
My understanding from setting the server up like this was that, as per the documentation, "Once configured in this way, the near cache is created on any node that requests data from the underlying cache, including both server nodes and client nodes. When you get an instance of the cache, as shown in the following example, the data requests go through the near cache."
IgniteCache<Integer, Integer> cache = ignite.cache("myCache");
int value = cache.get(1);
Based on this, I do not believe I need to create the near cache config on our client, and have just implemented the code as:
IgniteCache<Object, Object> cache = ignite.cache(ourCacheName);
The issue I see is that when I peek at the local cache to try to find values after searching for them:
cache_.localPeek(key, CachePeekMode.NEAR)
The objects are not found, despite being searched for several times. It looks like they are not added to our near cache, and everything just refers to the underlying cache. Previously we had created the near cache programmatically on the client and it had worked, but we would like to configure the solution on the server if possible. Our client node is just using the default config, if this makes a difference.
Any thoughts why we are not seeing a near cache?
Thanks,
LS
To use the near cache, I suggest you create it explicitly using the following syntax:
IgniteCache<Integer, Integer> clientCache = client.getOrCreateNearCache(cacheCfg.getName(), nearCfg);
...
clientCache.get(1);
System.out.println(clientCache.localPeek(1, CachePeekMode.NEAR));
There are some tickets, such as IGNITE-15960 and IGNITE-1163, with discussions about API improvements. I suppose the cache has to be declared on the servers first, and then you would be able to create the near cache explicitly on the clients. I agree that the docs and API are very confusing and need to be reworked.
Also, the near cache is local to a node, i.e. you might have them for some clients/servers and do not want to create it for other ones.

Apache Ignite high data availability - partition & backup setup

I run Apache Ignite to store a large data set for computation and retrieval. For now I am trying to see whether in-memory storage alone can address the caching problem.
I have partitioned the cache and set the backup count to 1. I believe the data will be copied to another node to cover any failure, which means that if any one of the nodes goes down, the respective data should be available from the backup node. So querying the cache should not be affected.
In my setup, when I shut down one of the nodes, the data becomes unavailable and a query to the cache returns null. Below is my Ignite setup (run locally). What's the right way to configure the partitions with the right backups to handle node failures?
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:util="http://www.springframework.org/schema/util" xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/util
http://www.springframework.org/schema/util/spring-util.xsd">
<bean abstract="true" id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
<!-- Set to true to enable distributed class loading for examples, default is false. -->
<property name="peerClassLoadingEnabled" value="true" />
<property name="rebalanceThreadPoolSize" value="4" />
<property name="cacheConfiguration">
<list>
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="mycache" />
<property name="cacheMode" value="PARTITIONED" />
<property name="backups" value="1" />
<property name="rebalanceMode" value="SYNC" />
<property name="writeSynchronizationMode" value="FULL_SYNC" />
<property name="partitionLossPolicy" value="READ_ONLY_SAFE" />
</bean>
</list>
</property>
<!-- Enable task execution events for examples. -->
<property name="includeEventTypes">
<list>
<!--Task execution events-->
<util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_STARTED" />
<util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_FINISHED" />
<util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_FAILED" />
<util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_TIMEDOUT" />
<util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_SESSION_ATTR_SET" />
<util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_REDUCED" />
<!--Cache events-->
<util:constant static-field="org.apache.ignite.events.EventType.EVT_CACHE_OBJECT_PUT" />
<util:constant static-field="org.apache.ignite.events.EventType.EVT_CACHE_OBJECT_READ" />
<util:constant static-field="org.apache.ignite.events.EventType.EVT_CACHE_OBJECT_REMOVED" />
</list>
</property>
<!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
<property name="addresses">
<list>
<!-- In distributed environment, replace with actual host IP address. -->
<value>127.0.0.1:47500..47509</value>
</list>
</property>
</bean>
</property>
</bean>
</property>
</bean>
</beans>
Check that your nodes are a part of the same cluster. You should see info in the logs that the topology includes multiple servers.
Your understanding of the cache configuration is correct - the data should be available after one node goes down. Moreover, with the READ_ONLY_SAFE policy you cannot silently lose data even if the cluster doesn't have enough copies - your cache reads would start throwing errors.
Since you're getting null I'm guessing that your client just reconnected to the second server, which is not connected to the first one (and is therefore empty).
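One way to rule out the split-cluster scenario described above when running everything on one machine is to give every node the same static address list, so that discovery does not depend on multicast. The fragment below is a minimal sketch of that idea; swapping in `TcpDiscoveryVmIpFinder` is an assumption made here, not something the original configuration used. It would replace the `discoverySpi` property in each node's `IgniteConfiguration`.

```xml
<property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="ipFinder">
            <!-- Static IP finder: every node probes the same fixed address range. -->
            <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                <property name="addresses">
                    <list>
                        <!-- Local port range covering the locally started nodes. -->
                        <value>127.0.0.1:47500..47509</value>
                    </list>
                </property>
            </bean>
        </property>
    </bean>
</property>
```

With both servers started from the same configuration, the topology information in each node's log should report two servers before any node is stopped. If it reports one, the backup copy never had anywhere to go.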

Apache Ignite zone(rack)-aware partitions

I'm battling to configure Apache Ignite to distribute partitions in zone-aware manner. I have Ignite 2.8.0 with 4 nodes running as StatefulSet pods in GKE 1.14 split in two zones. I followed the guide, and the example:
Propagated zone names into pod under AVAILABILITY_ZONE env var.
Then using Web-Console I verified that this env var was loaded correctly for each node.
I set up a cache template in the node XML config as shown below and created a cache from it using GET /ignite?cmd=getorcreate&cacheName=zone-aware-cache&templateName=zone-aware-cache (I can't see the affinityBackupFilter settings in the UI, but other parameters from the template were applied, so I assume it worked).
To simplify verification of the partition distribution, the number of partitions is set to just 2. After creating the cache I observed the following partition distribution:
Then I mapped nodes ids to values in AVAILABILITY_ZONE env var, as reported by nodes, with the following results:
AA146954 us-central1-a
3943ECC8 us-central1-c
F7B7AB67 us-central1-a
A94EE82C us-central1-c
As one can easily see, the primary and backup copies of partition 0 reside on nodes 3943ECC8 and A94EE82C, which are both in the same zone. What am I missing to make it work?
Another odd thing is that when the number of partitions is low (e.g. 2 or 4), only 3 out of 4 nodes are used. When using 1024 partitions, all nodes are utilized, but the problem still exists: 346 out of 1024 partitions had their primary and backup colocated in the same zone.
Here is my node config XML:
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd">
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
<!-- Enabling Apache Ignite Persistent Store. -->
<property name="dataStorageConfiguration">
<bean class="org.apache.ignite.configuration.DataStorageConfiguration">
<property name="defaultDataRegionConfiguration">
<bean class="org.apache.ignite.configuration.DataRegionConfiguration">
<property name="persistenceEnabled" value="true"/>
</bean>
</property>
</bean>
</property>
<!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<!-- Enables Kubernetes IP finder and setting custom namespace and service names. -->
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder">
<property name="namespace" value="ignite"/>
</bean>
</property>
</bean>
</property>
<property name="cacheConfiguration">
<list>
<bean id="zone-aware-cache-template" abstract="true" class="org.apache.ignite.configuration.CacheConfiguration">
<!-- when you create a template via XML configuration, you must add an asterisk to the name of the template -->
<property name="name" value="zone-aware-cache*"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="atomicityMode" value="ATOMIC"/>
<property name="backups" value="1"/>
<property name="readFromBackup" value="true"/>
<property name="partitionLossPolicy" value="READ_WRITE_SAFE"/>
<property name="copyOnRead" value="true"/>
<property name="eagerTtl" value="true"/>
<property name="statisticsEnabled" value="true"/>
<property name="affinity">
<bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
<property name="partitions" value="2"/> <!-- for debugging only! -->
<property name="excludeNeighbors" value="true"/>
<property name="affinityBackupFilter">
<bean class="org.apache.ignite.cache.affinity.rendezvous.ClusterNodeAttributeAffinityBackupFilter">
<constructor-arg>
<array value-type="java.lang.String">
<!-- Backups must go to different AZs -->
<value>AVAILABILITY_ZONE</value>
</array>
</constructor-arg>
</bean>
</property>
</bean>
</property>
</bean>
</list>
</property>
</bean>
</beans>
Update: In the end, whether excludeNeighbors is false or true makes or breaks zone awareness. I'm not sure why it didn't work for me with excludeNeighbors=false previously. I wrote some scripts to automate my testing, and now it's definite that the excludeNeighbors setting is the cause. It's all here: https://github.com/doitintl/ignite-gke. Regardless, I also opened a bug in the Ignite Jira: https://issues.apache.org/jira/browse/IGNITE-12896. Many thanks to @alamar for his suggestions.
I recommend setting excludeNeighbors to false. It is set to true in your config but is not needed, and I get the correct partition mapping when I set it to false (of course, I also run all four nodes locally).
Environment property was enough, did not need to add it manually to user attributes.
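For reference, the affinity section of the template with the suggested change applied might look like the sketch below. It is the same `RendezvousAffinityFunction` block from the question with `excludeNeighbors` switched to false; the partition count of 1024 is taken from the question's second test and is otherwise an assumption.

```xml
<property name="affinity">
    <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
        <property name="partitions" value="1024"/>
        <!-- Set to false, as suggested in the answer above. -->
        <property name="excludeNeighbors" value="false"/>
        <property name="affinityBackupFilter">
            <bean class="org.apache.ignite.cache.affinity.rendezvous.ClusterNodeAttributeAffinityBackupFilter">
                <constructor-arg>
                    <array value-type="java.lang.String">
                        <!-- Backups must go to a node with a different value of this attribute. -->
                        <value>AVAILABILITY_ZONE</value>
                    </array>
                </constructor-arg>
            </bean>
        </property>
    </bean>
</property>
```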

Apache Ignite CacheConfiguration repeat for each data set?

I am trying to modify default-config.xml by adding cacheConfiguration tags. Do I need to repeat the cacheConfiguration XML tag for each data set (RDD) that I am trying to keep in memory? Can I set backups to 0 if I don't want them?
ex:
<property name="cacheConfiguration">
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="TEST1_RDD"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="backups" value="0"/>
</bean>
</property> <property name="cacheConfiguration">
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="TEST2_RDD"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="backups" value="0"/>
</bean>
</property>
Also, do I need to specify the write synchronization mode explicitly, and which one does Ignite use by default?
ex:
<property name="writeSynchronizationMode" value="FULL_SYNC"/>
Appreciate your response.
Yes, you have to write a configuration for each cache, because each cache may have a different function or purpose and needs to be configured accordingly.
The default value for backups is 0, and the default CacheWriteSynchronizationMode is PRIMARY_SYNC.
If you don't want to repeat the same configuration for several caches, you can define cache templates: https://apacheignite.readme.io/docs/cache-template
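A minimal sketch of what "one configuration per cache" could look like in practice: `cacheConfiguration` appears once and takes a list of `CacheConfiguration` beans, the same shape used by the multi-cache configurations earlier on this page. The cache names come from the question. The `backups` and `writeSynchronizationMode` values simply restate the defaults mentioned above and could be left out.

```xml
<property name="cacheConfiguration">
    <list>
        <bean class="org.apache.ignite.configuration.CacheConfiguration">
            <property name="name" value="TEST1_RDD"/>
            <property name="cacheMode" value="PARTITIONED"/>
            <!-- 0 is already the default; shown only to make the intent explicit. -->
            <property name="backups" value="0"/>
            <!-- PRIMARY_SYNC is already the default. -->
            <property name="writeSynchronizationMode" value="PRIMARY_SYNC"/>
        </bean>
        <bean class="org.apache.ignite.configuration.CacheConfiguration">
            <property name="name" value="TEST2_RDD"/>
            <property name="cacheMode" value="PARTITIONED"/>
            <property name="backups" value="0"/>
            <property name="writeSynchronizationMode" value="PRIMARY_SYNC"/>
        </bean>
    </list>
</property>
```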

Ignite C++ client mode, Near cache

I have an Ignite server running in replicated mode and many clients on the same node with near cache enabled. I don't see a significant performance difference between running a client with the near cache and without it.
My understanding of the near cache is that frequently used keys and values are stored on the client itself, so no actual Get() call is made to the server. Please correct me if I am wrong.
Can someone share a working near cache configuration XML?
SERVER CONFIG:
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:util="http://www.springframework.org/schema/util"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/util
http://www.springframework.org/schema/util/spring-util.xsd">
<bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
<property name="cacheConfiguration">
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="cacheMode" value="LOCAL" />
<!-- Enable near cache to cache recently accessed data. -->
<!-- <property name="nearConfiguration">
<bean class="org.apache.ignite.configuration.NearCacheConfiguration"/>
</property> -->
<property name="nearConfiguration">
<bean class="org.apache.ignite.configuration.NearCacheConfiguration">
</bean>
</property>
</bean>
</property>
<!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<!--
Ignite provides several options for automatic discovery that can be used
instead of static IP based discovery.
-->
<!-- Uncomment static IP finder to enable static-based discovery of initial nodes. -->
<!-- <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder"> -->
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
<property name="addresses">
<list>
<!-- In distributed environment, replace with actual host IP address. -->
<!-- <value>127.0.0.1:48550..48551</value> -->
<value>XXX.ZZZ.yyy.36:47500..47501</value>
</list>
</property>
</bean>
</property>
</bean>
</property>
</bean>
</beans>
CLIENT CONFIG:
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:util="http://www.springframework.org/schema/util"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/util
http://www.springframework.org/schema/util/spring-util.xsd">
<bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
<property name="cacheConfiguration">
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="cacheMode" value="LOCAL" />
<!-- Enable near cache to cache recently accessed data. -->
<!-- <property name="nearConfiguration">
<bean class="org.apache.ignite.configuration.NearCacheConfiguration"/>
</property> -->
<property name="nearConfiguration">
<bean class="org.apache.ignite.configuration.NearCacheConfiguration">
</bean>
</property>
</bean>
</property>
<!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<!--
Ignite provides several options for automatic discovery that can be used
instead of static IP based discovery.
-->
<!-- Uncomment static IP finder to enable static-based discovery of initial nodes. -->
<!-- <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder"> -->
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
<property name="addresses">
<list>
<!-- In distributed environment, replace with actual host IP address. -->
<!-- <value>127.0.0.1:48550..48551</value> -->
<value>XXX.ZZZ.yyy.38:47500..47501</value>
</list>
</property>
</bean>
</property>
</bean>
</property>
</bean>
</beans>
Yes, a near cache improves performance by caching frequently used entries locally on the node, but it doesn't make sense if you run all tests on a single machine or JVM. A near cache avoids going to a remote node for data, but in your test everything already works locally.
Also, a near cache makes no sense for server nodes with a REPLICATED cache, or with a PARTITIONED cache where the number of backups is equal to or greater than the number of data nodes, because the whole data set is already available locally on each node.
So, to get a performance boost, you need to configure the client node to use a near cache while the server nodes run on remote machines. Do not forget to warm up the near cache before measuring.
Here is an XML snippet for setting up a near cache:
...
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<!-- Your other cache config -->
<property name="nearConfiguration">
<bean class="org.apache.ignite.configuration.NearCacheConfiguration"/>
</property>
</bean>
...
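To make the client-side setup a little more concrete, here is a hedged sketch of a client configuration that combines the snippet above with client mode and a bounded near cache. The cache name `myCache`, the PARTITIONED mode, and the eviction size are assumptions chosen for illustration. The discovery section is omitted and would need to point at the remote server nodes.

```xml
<bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Start this node as a client so that it holds no primary or backup data. -->
    <property name="clientMode" value="true"/>
    <property name="cacheConfiguration">
        <bean class="org.apache.ignite.configuration.CacheConfiguration">
            <!-- Hypothetical cache name; it should match the cache defined on the servers. -->
            <property name="name" value="myCache"/>
            <property name="cacheMode" value="PARTITIONED"/>
            <property name="nearConfiguration">
                <bean class="org.apache.ignite.configuration.NearCacheConfiguration">
                    <!-- Keep at most 100000 entries in the near cache, evicting least recently used first. -->
                    <property name="nearEvictionPolicyFactory">
                        <bean class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicyFactory">
                            <property name="maxSize" value="100000"/>
                        </bean>
                    </property>
                </bean>
            </property>
        </bean>
    </property>
</bean>
```

Any benefit should only show up when repeated reads of the same keys would otherwise cross the network, which is why the answer above stresses remote servers and a warm-up phase.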