I have 400M records in an Ignite cache, and native persistence is enabled. I want to enable an expiry policy. To do so I have added the following to my XML config:
<!-- Enabling expiry policy -->
<property name="cacheConfiguration">
<list>
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="CACHE_L4_TRIGGER_NOTIFICATION"/>
<property name="expiryPolicyFactory">
<bean class="javax.cache.expiry.CreatedExpiryPolicy" factory-method="factoryOf">
<constructor-arg>
<bean class="javax.cache.expiry.Duration">
<constructor-arg value="MINUTES"/>
<constructor-arg value="60"/>
</bean>
</constructor-arg>
</bean>
</property>
</bean>
</list>
</property>
It worked for newly added data, but I still have the old 400M records. I need help removing data older than 30 days from them. How can I do this? I have searched but can't find anything. Also, I can't purge all the data, as it is important.
You can't do this for existing data. Ignite doesn't keep track of when an entry was created or modified in any way if no expiry policy is set. You have to iterate over all your data and clean it manually based on the contents (e.g. if you have a creation timestamp attribute).
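If it helps, here is a minimal sketch of that manual cleanup using a ScanQuery. It assumes ignite is a started node or thick client, values are read in binary form, and each value carries a creation-timestamp field; the field name createdAt is hypothetical, so adjust it to your model. The filter class must also be deployable to the servers (e.g. via peer class loading).

import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import javax.cache.Cache;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.binary.BinaryObject;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.ScanQuery;

long cutoff = System.currentTimeMillis() - TimeUnit.DAYS.toMillis(30);

IgniteCache<Object, BinaryObject> cache =
        ignite.<Object, BinaryObject>cache("CACHE_L4_TRIGGER_NOTIFICATION").withKeepBinary();

// The filter runs on the server nodes, so only stale entries travel to the client.
ScanQuery<Object, BinaryObject> scan = new ScanQuery<>(
        (key, val) -> val.<Long>field("createdAt") < cutoff);

Set<Object> staleKeys = new HashSet<>();

try (QueryCursor<Cache.Entry<Object, BinaryObject>> cur = cache.query(scan)) {
    for (Cache.Entry<Object, BinaryObject> e : cur) {
        staleKeys.add(e.getKey());

        if (staleKeys.size() == 10_000) { // remove in batches to bound client memory
            cache.removeAll(staleKeys);
            staleKeys.clear();
        }
    }
}

if (!staleKeys.isEmpty())
    cache.removeAll(staleKeys);

If the cache had SQL indexing configured, a single DELETE FROM ... WHERE createdAt < ? statement would achieve the same in one shot.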
I have a question regarding the setup of the GridGain near cache. We have a single server node with the config listed below, and a single thick client connecting successfully to it:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd">
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
<!-- PEER CLASS LOADING -->
<property name="peerClassLoadingEnabled" value="true"/>
<!-- CACHE CONFIG-->
<property name="cacheConfiguration">
<list>
<!-- ENTER CACHE TEMPLATE-->
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="cache1"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="rebalanceMode" value="SYNC"/>
<property name="nearConfiguration">
<bean class="org.apache.ignite.configuration.NearCacheConfiguration">
<property name="nearEvictionPolicyFactory">
<bean class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicyFactory">
<property name="maxSize" value="100000"/>
</bean>
</property>
</bean>
</property>
</bean>
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="cache2"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="rebalanceMode" value="SYNC"/>
<property name="nearConfiguration">
<bean class="org.apache.ignite.configuration.NearCacheConfiguration">
<property name="nearEvictionPolicyFactory">
<bean class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicyFactory">
<property name="maxSize" value="100000"/>
</bean>
</property>
</bean>
</property>
</bean>
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="cache3"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="rebalanceMode" value="SYNC"/>
<property name="nearConfiguration">
<bean class="org.apache.ignite.configuration.NearCacheConfiguration">
<property name="nearEvictionPolicyFactory">
<bean class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicyFactory">
<property name="maxSize" value="100000"/>
</bean>
</property>
</bean>
</property>
</bean>
</list>
</property>
<!-- DISCOVERY-->
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder">
<property name="namespace" value="gridgain"/>
<property name="serviceName" value="gridgain-service"/>
</bean>
</property>
</bean>
</property>
</bean>
</beans>
In setting the server up like this, it was my understanding, as per the documentation, that "Once configured in this way, the near cache is created on any node that requests data from the underlying cache, including both server nodes and client nodes. When you get an instance of the cache, as shown in the following example, the data requests go through the near cache."
IgniteCache<Integer, Integer> cache = ignite.cache("myCache");
int value = cache.get(1);
Based on this, I do not believe that I have any need to create the near cache config on our client, and have just implemented the code as:
IgniteCache<Object, Object> cache = ignite.cache(ourCacheName);
The issue I see is that when I peek at the local cache to try and find values in there, after searching for them:
cache_.localPeek(key, CachePeekMode.NEAR)
The objects are not found, despite being searched for several times, and it looks like they are not added to our near cache setup; everything just refers to the underlying cache. Previously we had programmatically created the near cache on the client and it had worked, but we would like to configure the solution on the server if possible. Our client node is just using the default config, if this makes a difference.
Any thoughts on why we are not seeing a near cache?
Thanks,
LS
In order to use the near cache, I suggest you create it explicitly on the client using the following syntax:
IgniteCache<Integer, Integer> clientCache = client.getOrCreateNearCache(cacheCfg.getName(), nearCfg);
...
clientCache.get(1);
System.out.println(clientCache.localPeek(1, CachePeekMode.NEAR));
There are some tickets, like IGNITE-15960 or IGNITE-1163, with discussions about API improvements. I suppose the cache has to be declared on the servers first, and then you would be able to create it explicitly on the clients. I agree that the docs and API are confusing and need to be reworked.
Also, the near cache is local to a node, i.e. you might have one on some clients/servers while not wanting to create it on others.
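For completeness, a minimal sketch of that explicit client-side creation, mirroring the LRU settings from the server XML above; the cache name cache1 is taken from your config, and client-config.xml stands for whatever Spring file your thick client starts with:

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CachePeekMode;
import org.apache.ignite.cache.eviction.lru.LruEvictionPolicyFactory;
import org.apache.ignite.configuration.NearCacheConfiguration;

Ignite client = Ignition.start("client-config.xml"); // thick client node

NearCacheConfiguration<Object, Object> nearCfg = new NearCacheConfiguration<>();
nearCfg.setNearEvictionPolicyFactory(new LruEvictionPolicyFactory<>(100_000));

// The cache is already declared on the servers; this only adds the near cache on this node.
IgniteCache<Object, Object> clientCache = client.getOrCreateNearCache("cache1", nearCfg);

Object key = 1; // any key you expect to read repeatedly

clientCache.get(key);                                               // first read populates the near cache
System.out.println(clientCache.localPeek(key, CachePeekMode.NEAR)); // should now print the cached value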
I want to use a network shared folder as the persistent store path in DataStorageConfiguration. Ignite gets stuck there.
Can anyone please tell me how to do this in Ignite?
I wouldn’t recommend putting Ignite’s persistent files on a network volume. The performance and locking characteristics often lead to problems. Fast, local disks are very much preferable.
But to directly answer your question, as per the documentation:
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
<property name="dataStorageConfiguration">
<bean class="org.apache.ignite.configuration.DataStorageConfiguration">
<property name="defaultDataRegionConfiguration">
<bean class="org.apache.ignite.configuration.DataRegionConfiguration">
<property name="persistenceEnabled" value="true"/>
</bean>
</property>
<property name="storagePath" value="/opt/storage"/>
<property name="walPath" value="/opt/wal"/>
<property name="walArchivePath" value="/opt/walarch"/>
</bean>
</property>
</bean>
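If you prefer to configure this in code, the equivalent Java setup looks roughly like this, with the same example paths as the XML above:

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

DataStorageConfiguration dsCfg = new DataStorageConfiguration();
dsCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

dsCfg.setStoragePath("/opt/storage");    // data files
dsCfg.setWalPath("/opt/wal");            // write-ahead log
dsCfg.setWalArchivePath("/opt/walarch"); // WAL archive

IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setDataStorageConfiguration(dsCfg);

Ignite ignite = Ignition.start(cfg);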
I'm battling to configure Apache Ignite to distribute partitions in a zone-aware manner. I have Ignite 2.8.0 with 4 nodes running as StatefulSet pods in GKE 1.14, split across two zones. I followed the guide and the example:
Propagated zone names into pod under AVAILABILITY_ZONE env var.
Then using Web-Console I verified that this env var was loaded correctly for each node.
I set up a cache template in the node XML config as shown below and created a cache from it using GET /ignite?cmd=getorcreate&cacheName=zone-aware-cache&templateName=zone-aware-cache (I can't see affinityBackupFilter settings in the UI, but other parameters from the template got applied, so I assume it worked)
To simplify verification of the partition distribution, I set the partition number to just 2. After creating the cache I observed the following partition distribution:
Then I mapped the node IDs to the values of the AVAILABILITY_ZONE env var, as reported by the nodes, with the following results:
AA146954 us-central1-a
3943ECC8 us-central1-c
F7B7AB67 us-central1-a
A94EE82C us-central1-c
As one can easily see, partition 0's primary/backup copies reside on nodes 3943ECC8 and A94EE82C, which are both in the same zone. What am I missing to make it work?
Another odd thing: when specifying a low partition number (e.g. 2 or 4), only 3 out of 4 nodes are used. When using 1024 partitions, all nodes are utilized, but the problem still exists: 346 out of 1024 partitions had their primary/backup colocated in the same zone.
Here is my node config XML:
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd">
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
<!-- Enabling Apache Ignite Persistent Store. -->
<property name="dataStorageConfiguration">
<bean class="org.apache.ignite.configuration.DataStorageConfiguration">
<property name="defaultDataRegionConfiguration">
<bean class="org.apache.ignite.configuration.DataRegionConfiguration">
<property name="persistenceEnabled" value="true"/>
</bean>
</property>
</bean>
</property>
<!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<!-- Enables Kubernetes IP finder and setting custom namespace and service names. -->
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder">
<property name="namespace" value="ignite"/>
</bean>
</property>
</bean>
</property>
<property name="cacheConfiguration">
<list>
<bean id="zone-aware-cache-template" abstract="true" class="org.apache.ignite.configuration.CacheConfiguration">
<!-- when you create a template via XML configuration, you must add an asterisk to the name of the template -->
<property name="name" value="zone-aware-cache*"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="atomicityMode" value="ATOMIC"/>
<property name="backups" value="1"/>
<property name="readFromBackup" value="true"/>
<property name="partitionLossPolicy" value="READ_WRITE_SAFE"/>
<property name="copyOnRead" value="true"/>
<property name="eagerTtl" value="true"/>
<property name="statisticsEnabled" value="true"/>
<property name="affinity">
<bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
<property name="partitions" value="2"/> <!-- for debugging only! -->
<property name="excludeNeighbors" value="true"/>
<property name="affinityBackupFilter">
<bean class="org.apache.ignite.cache.affinity.rendezvous.ClusterNodeAttributeAffinityBackupFilter">
<constructor-arg>
<array value-type="java.lang.String">
<!-- Backups must go to different AZs -->
<value>AVAILABILITY_ZONE</value>
</array>
</constructor-arg>
</bean>
</property>
</bean>
</property>
</bean>
</list>
</property>
</bean>
</beans>
Update: It turns out excludeNeighbors false/true makes or breaks zone awareness. I'm not sure why it didn't work with excludeNeighbors=false previously for me. I made some scripts to automate my testing, and now it's definite that it's the excludeNeighbors setting. It's all here: https://github.com/doitintl/ignite-gke. Regardless, I also opened a bug in the Ignite Jira: https://issues.apache.org/jira/browse/IGNITE-12896. Many thanks to @alamar for his suggestions.
I recommend setting excludeNeighbors to false. It is true in your case, but it is not needed, and I get a correct partition mapping when I set it to false (of course, I also ran all four nodes locally).
The environment property was enough; I did not need to add it manually to the user attributes.
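For reference, with that change applied (and the real partition count restored in place of the debugging value of 2), the affinity block of the template above would look like:

<property name="affinity">
    <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
        <property name="partitions" value="1024"/>
        <!-- Leave excludeNeighbors false so the backup filter alone drives placement. -->
        <property name="excludeNeighbors" value="false"/>
        <property name="affinityBackupFilter">
            <bean class="org.apache.ignite.cache.affinity.rendezvous.ClusterNodeAttributeAffinityBackupFilter">
                <constructor-arg>
                    <array value-type="java.lang.String">
                        <value>AVAILABILITY_ZONE</value>
                    </array>
                </constructor-arg>
            </bean>
        </property>
    </bean>
</property>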
I am trying to modify default-config.xml by adding cacheConfiguration tags. Do I need to repeat the cacheConfiguration XML tag for each data set RDD that I am trying to keep in memory? Can I set backups to 0 if I don't want any?
ex:
<property name="cacheConfiguration">
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="TEST1_RDD"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="backups" value="0"/>
</bean>
</property>
<property name="cacheConfiguration">
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="TEST2_RDD"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="backups" value="0"/>
</bean>
</property>
Also, do I need to explicitly specify the write synchronization mode? And which one does Ignite use by default?
ex:
<property name="writeSynchronizationMode" value="FULL_SYNC"/>
Appreciate your response.
Yes, you have to write a configuration for each cache, as each cache may have a different purpose, and you have to set its configuration accordingly.
The default value for backups is 0, and the default CacheWriteSynchronizationMode is PRIMARY_SYNC.
It is also possible to define cache templates if you don't want to repeat the same configuration for every cache: https://apacheignite.readme.io/docs/cache-template
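As a sketch of the first point: in the Spring XML, all cache beans go inside a single cacheConfiguration property backed by a list, rather than repeating the property tag as in your example. Using your cache names, with the defaults mentioned above made explicit:

<property name="cacheConfiguration">
    <list>
        <bean class="org.apache.ignite.configuration.CacheConfiguration">
            <property name="name" value="TEST1_RDD"/>
            <property name="cacheMode" value="PARTITIONED"/>
            <property name="backups" value="0"/>
            <property name="writeSynchronizationMode" value="PRIMARY_SYNC"/>
        </bean>
        <bean class="org.apache.ignite.configuration.CacheConfiguration">
            <property name="name" value="TEST2_RDD"/>
            <property name="cacheMode" value="PARTITIONED"/>
            <property name="backups" value="0"/>
        </bean>
    </list>
</property>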
I have created an Ignite cache "contact" and added "Person" objects to it.
When I use the Ignite JDBC client mode I am able to query this cache. But when I use the JDBC thin client, it says that the table Person does not exist.
I tried the query this way:
Select * from Person
Select * from contact.Person
Neither worked with the thin client. I am using Ignite 2.1.
I would appreciate your help on how to query an existing cache using the thin client.
Thank you.
Cache Configuration in default-config.xml
<bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
<!-- Enabling Apache Ignite Persistent Store. -->
<property name="persistentStoreConfiguration">
<bean class="org.apache.ignite.configuration.PersistentStoreConfiguration"/>
</property>
<property name="binaryConfiguration">
<bean class="org.apache.ignite.configuration.BinaryConfiguration">
<property name="compactFooter" value="false"/>
</bean>
</property>
<property name="memoryConfiguration">
<bean class="org.apache.ignite.configuration.MemoryConfiguration">
<!-- Setting the page size to 4 KB -->
<property name="pageSize" value="#{4 * 1024}"/>
</bean>
</property>
<!-- Explicitly configure TCP discovery SPI to provide a list of initial nodes. -->
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
<property name="addresses">
<list>
<!-- In distributed environment, replace with actual host IP address. -->
<value>127.0.0.1:55500..55502</value>
</list>
</property>
</bean>
</property>
</bean>
</property>
</bean>
</beans>
Cache Configuration in the Server-Side Code
CacheConfiguration<Long, Person> cc = new CacheConfiguration<>(cacheName);
cc.setCacheMode(CacheMode.REPLICATED);
cc.setRebalanceMode(CacheRebalanceMode.ASYNC);
cc.setIndexedTypes(Long.class, Person.class);
cache = ignite.getOrCreateCache(cc);
Thin Client JDBC URL
Class.forName("org.apache.ignite.IgniteJdbcThinDriver");
// Open the JDBC connection.
Connection conn = DriverManager.getConnection("jdbc:ignite:thin://192.168.1.111:10800");
Statement st = conn.createStatement();
If you want to query data from an existing cache using SQL, you should specify an SQL schema in the cache configuration. Add the following code before the cache creation:
cc.setSqlSchema("PUBLIC");
Note that you have persistence configured, so when you call ignite.getOrCreateCache(cc), the new configuration won't be applied if a cache with this name is already persisted. You should, for example, remove the persistence data or use the createCache(...) method instead.
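Putting it together, a minimal sketch of the corrected flow, with the host, port and types taken from your snippets (ignite is the server-side instance; Person is your own class):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.CacheConfiguration;

// Server side: expose the cache's table in the default PUBLIC schema.
CacheConfiguration<Long, Person> cc = new CacheConfiguration<>("contact");
cc.setCacheMode(CacheMode.REPLICATED);
cc.setIndexedTypes(Long.class, Person.class);
cc.setSqlSchema("PUBLIC");

IgniteCache<Long, Person> cache = ignite.createCache(cc); // createCache, not getOrCreateCache

// Thin client side: the table is now visible as plain Person.
try (Connection conn = DriverManager.getConnection("jdbc:ignite:thin://192.168.1.111:10800");
     Statement st = conn.createStatement();
     ResultSet rs = st.executeQuery("SELECT * FROM Person")) {
    while (rs.next())
        System.out.println(rs.getObject(1));
}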