Ignite reads stale data from backup node

I ran into data inconsistency issues on Ignite version 2.8.1. I have three nodes running as a cluster, and the cache is configured as:
CacheConfiguration<String, Balance> cacheConfiguration = new CacheConfiguration<>(Balance.class.getSimpleName());
cacheConfiguration.setIndexedTypes(String.class, Balance.class);
cacheConfiguration.setSqlIndexMaxInlineSize(100);
cacheConfiguration.setSqlSchema("PUBLIC");
cacheConfiguration.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
cacheConfiguration.setBackups(4);
cacheConfiguration.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
Then I have a very simple piece of code that increments the balance in a loop:
for (int i = 0; i < 10000; i++) {
    balance = balanceDao.findByKey(accountId, "USD");
    balance.setQuantity(balance.getQuantity().add(BigDecimal.ONE));
    balanceDao.save(balance);
}
When I run the above on the primary node, the balance always increases by exactly 10000. However, when I run it on a backup node, the balance sometimes only increases by around 8k, and sometimes by around 9k.
If writeSynchronizationMode is set to PRIMARY_SYNC and readFromBackup is set to false, I get the correct balance on all nodes.
Is this a bug in 2.8.1, or is something wrong with my configuration?

If readFromBackup == true (which is the default), it is possible to read stale values on the backup nodes. Set it to false to avoid stale reads.
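For example, a minimal sketch of the suggested change, applied to the configuration from the question:

// In addition to the settings above, disable reads from backup copies
// so every read is served by the primary node that owns the key.
cacheConfiguration.setReadFromBackup(false);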

Related

SQL queries returning incorrect results during High Load

I have a table in which, during the performance runs, inserts happen at the beginning when the job starts; during that insertion time there are also parallel operations (GET/UPDATE queries) on the same table. The GET operation also updates a column value, marking that record as picked. However, the next GET performed on the table returns the same record again, even though the record was already marked as in progress.
P.S. --> Both operations are done by the same single thread in the system. Logs below for reference: the record is marked in progress at line 1 at 20:36:42,864, yet it is returned in the result set of the query executed at 20:36:42,891 by the same thread.
We also observed that during high load (usually in the same scenario as above), some updates intermittently did not take effect on the table, even though the update executed successfully without throwing an exception (validated using the returned result and then doing a GET just after that to check the updated value).
13 Apr 2020 20:36:42,864 [SHT-4083-initial] FINEST - AbstractCacheHelper.markContactInProgress:2321 - Action state after mark in progresss contactId.ATTR=: 514409 for jobId : 4083 is actionState : 128
13 Apr 2020 20:36:42,891 [SHT-4083-initial] FINEST - CacheAdvListMgmtHelper.getNextContactToProcess:347 - Query :
select priority, contact_id, action_state, pim_contact_store_id, action_id, retry_session_id, attempt_type, zone_id, action_pos
from pim_4083
where handler_id = ? and attempt_type != ? and next_attempt_after <= ? and action_state = ? and exclude_flag = ?
order by attempt_type desc, priority desc, next_attempt_after asc, contact_id asc limit 1
This usually happens during the performance runs, when parallel jobs that work on Ignite are started. Can anyone suggest what can be done to avoid such a situation?
We have 2 Ignite data nodes deployed as Spring Boot services in the cluster, accessed by 3 client nodes.
Ignite version -> 2.7.6. The cache configuration is as follows:
IgniteConfiguration cfg = new IgniteConfiguration();
CacheConfiguration cachecfg = new CacheConfiguration(CACHE_NAME);
cachecfg.setRebalanceThrottle(100);
cachecfg.setBackups(1);
cachecfg.setCacheMode(CacheMode.REPLICATED);
cachecfg.setRebalanceMode(CacheRebalanceMode.ASYNC);
cachecfg.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
cachecfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
// Defining and creating a new cache to be used by Ignite Spring Data repository.
CacheConfiguration ccfg = new CacheConfiguration(CACHE_TEMPLATE);
ccfg.setStatisticsEnabled(true);
ccfg.setCacheMode(CacheMode.REPLICATED);
ccfg.setBackups(1);
DataStorageConfiguration dsCfg = new DataStorageConfiguration();
dsCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
dsCfg.setStoragePath(storagePath);
dsCfg.setWalMode(WALMode.FSYNC);
dsCfg.setWalPath(walStoragePath);
dsCfg.setWalArchivePath(archiveWalStoragePath);
dsCfg.setWriteThrottlingEnabled(true);
cfg.setAuthenticationEnabled(true);
dsCfg.getDefaultDataRegionConfiguration().setInitialSize(Long.parseLong(cacheInitialMemSize) * 1024 * 1024);
dsCfg.getDefaultDataRegionConfiguration().setMaxSize(Long.parseLong(cacheMaxMemSize) * 1024 * 1024);
cfg.setDataStorageConfiguration(dsCfg);
cfg.setClientConnectorConfiguration(clientCfg);
// Run the command to alter the default user credentials
// ALTER USER "ignite" WITH PASSWORD 'new_passwd'
cfg.setCacheConfiguration(cachecfg);
cfg.setFailureDetectionTimeout(Long.parseLong(cacheFailureTimeout));
ccfg.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
ccfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
ccfg.setRebalanceMode(CacheRebalanceMode.ASYNC);
ccfg.setRebalanceThrottle(100);
int pool = cfg.getSystemThreadPoolSize();
cfg.setRebalanceThreadPoolSize(2);
cfg.setLifecycleBeans(new MyLifecycleBean());
logger.info(methodName, "Starting ignite service");
ignite = Ignition.start(cfg);
ignite.cluster().active(true);
// Get all server nodes that are already up and running.
Collection<ClusterNode> nodes = ignite.cluster().forServers().nodes();
// Set the baseline topology that is represented by these nodes.
ignite.cluster().setBaselineTopology(nodes);
ignite.addCacheConfiguration(ccfg);

Inconsistent behavior of Quartz2 scheduler in Apache Camel

I have an Apache Camel project that uses Quartz2 as the scheduler. The requirement is to make it a cluster. The code is deployed to WebLogic 12c, and Quartz is configured as in many samples, with clustering enabled.
This is my properties file (without the datasource)
org.quartz.scheduler.instanceName = MyScheduler
org.quartz.scheduler.instanceId = AUTO
org.quartz.scheduler.skipUpdateCheck = true
org.quartz.scheduler.jobFactory.class = org.quartz.simpl.SimpleJobFactory
org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 10
org.quartz.threadPool.threadPriority = 5
org.quartz.jobStore.misfireThreshold = 60000
org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.oracle.OracleDelegate
org.quartz.jobStore.useProperties=true
org.quartz.JobBuilder.requestRecovery=true
org.quartz.jobStore.isClustered = true
org.quartz.jobStore.clusterCheckinInterval = 20000
When I deploy and start both nodes, I see that the QRTZ_SCHEDULER_STATE table has an extra entry for one of the nodes:
MyScheduler-routerContext server_node21567108546690
MyScheduler-routerContext-1 server_node11565896495100
MyScheduler-routerContext-1 server_node11567108547295
And I am guessing that because of this, one node is called only once in a while, while the other node gets called all the time (so occasionally both nodes are invoked at the same time).
I have tried a clean restart of the WebLogic nodes, but the issue is still there.
This is what my route(s) look like:
from("quartz2://provRegGroup/createUsersTrigger?cron={{create_users_cron}}&job.name=createUsersJob")
.routeId("createUsersRB")
.log("**** starting check for create users");
//where
//create_users_cron=0+0,5,10,15,20,25,30,35,40,45,50,55+*+*+*+?
//expecting one node being called by the scheduler at a time..
I figured out what caused the issue. Apparently there were orphan WebLogic processes running on one (or even both) of the nodes - this would be a question for our tech archs as to why it was such a mess. ps was showing two WebLogic servers running on a node: one that I started recently and one that had been there for, say, a month.
Expecting this would never happen in the production environment, I assume the issue has been resolved.

Apache Ignite sql query returns only cache contents, not complete results from database

My Ignite nodes (2 server nodes - let's call them A and B) are configured as follows:
ccfg.setCacheMode(CacheMode.PARTITIONED);
ccfg.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
ccfg.setReadThrough(true);
ccfg.setWriteThrough(true);
ccfg.setWriteBehindEnabled(true);
ccfg.setWriteBehindBatchSize(10000);
Node A is started first, from command line as follows:
apache-ignite-fabric-2.2.0-bin>bin/ignite.bat config/default-config.xml
Node B is started from java code by running
public static void main(String[] args) throws Exception {
    Ignite ignite = Ignition.start(ServerConfigurationFactory.createConfiguration());
    ignite.cache("MyCache").loadCache(null);
    ...
}
(The jar containing ServerConfigurationFactory is put in the apache-ignite-fabric-2.2.0-bin\libs directory so that nodes A and B are in the same cluster; otherwise there is an error.)
I have a query that is supposed to return 9061 results from the database. After the cache loading process on node B, I went to the Web Console and ran a simple count SQL statement against the caches. There is a button "Execute on selected node" that allows you to choose a specific node to query. I queried node A and got a count of 2341, and on node B I got a count of 2064. If I just use the "Execute" button I get 4405, which is simply the total of nodes A and B. They are obviously missing 4656 records (9061 total records in the db - 4405 in nodes A and B). I also ran the same count query in Java code using SqlFieldsQuery and I also got 4405.
Since readThrough is set to true, I expected Ignite to also return results that are not in memory. But this is not the case, because it just returns whatever is in the cache. Am I doing something wrong here? Thank you.
Read-through works only for key-value APIs, so the SQL engine assumes that all required data is preloaded from the database prior to running a query.
If your data set doesn't fit in memory and you can't preload all of the data, you can use Ignite native persistence: https://apacheignite.readme.io/docs/distributed-persistent-store
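To illustrate the difference, here is a minimal sketch, assuming the "MyCache" cache from the question, a String key, a hypothetical Person value type, and a configured CacheStore:

IgniteCache<String, Person> cache = ignite.cache("MyCache");

// Key-value access: if the key is not in memory, readThrough=true makes Ignite
// load the value from the database via the configured CacheStore.
Person p = cache.get("some-key");

// SQL access: the query runs only over data already loaded into the cache.
// Rows that exist only in the database are not fetched, so counts stay low
// unless the cache is fully preloaded first (e.g. via loadCache).
List<List<?>> rows = cache.query(new SqlFieldsQuery("select count(*) from Person")).getAll();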

Sync Framework 2.1 taking too much time to sync for first time

I am using Sync Framework 2.1 to sync my database, and it is unidirectional only, i.e. source to destination database. I am facing a problem while syncing the database for the first time. I have divided the tables into different groups so that there is less overhead while syncing. One table has 900k records in it. While syncing that table, the SyncOrchestrator.Synchronize() method never returns. Network usage, disk I/O, everything goes high. I waited 2 days for the sync process to complete, but nothing happened. I also checked the SQL database using "sp_who2" and the process is in suspended mode. I also used some queries found online, and they show that table_selectchanges takes too much time.
I have used the following code to sync my database:
//setup the connections
var serverConn = new SqlConnection(sourceConnectionString);
var clientConn = new SqlConnection(destinationConnectionString);
// create the sync orchestrator
var syncOrchestrator = new SyncOrchestrator();
//setup providers
var localProvider = new SqlSyncProvider(scopeName, clientConn);
var remoteProvider = new SqlSyncProvider(scopeName, serverConn);
localProvider.CommandTimeout = 0;
remoteProvider.CommandTimeout = 0;
localProvider.ObjectSchema = schemaName;
remoteProvider.ObjectSchema = schemaName;
syncOrchestrator.LocalProvider = localProvider;
syncOrchestrator.RemoteProvider = remoteProvider;
// set the direction of sync session Download
syncOrchestrator.Direction = SyncDirectionOrder.Download;
// execute the synchronization process
syncOrchestrator.Synchronize();
We had this problem. We removed all the foreign key constraints from the database for the first sync. This reduced the initial sync from over 4 hours to about 10 minutes. After the initial sync completed we replaced the foreign key constraints.

Apache Ignite - Distributed Queue and Executors

I am planning to use an Apache Ignite distributed queue.
I am using Ignite with a Spring Boot application, so on bootup I will be adding 20 names to a queue. But since there are 3 servers in the cluster, the same 20 names get added 3 times. I want to add them to the queue only once.
Ignite ignite = Ignition.ignite();
IgniteQueue<String> queue = ignite.queue(
    "queueName", // Queue name.
    0,           // Queue capacity. 0 for unbounded queue.
    null         // Collection configuration.
);
Distributed executors will be able to poll from the queue and run the task. Here, the executor is expected to poll, run the task, and then add the same name back to the queue. I am trying to achieve round robin here.
Only one executor should be running the same task at any point in time, even though there are multiple servers in the cluster.
Any suggestions for this?
You can launch an Ignite cluster singleton service (https://apacheignite.readme.io/docs/cluster-singletons), which will fill the queue with the data. Alternatively, you can add the data only from the coordinator node (the oldest node in the cluster), checked via ignite.cluster().forOldest().node().isLocal().
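For example, a minimal sketch of the coordinator-node approach, assuming a hypothetical names collection and the queue from the question:

Ignite ignite = Ignition.ignite();
// Only the oldest node in the cluster (the coordinator) populates the queue,
// so the 20 names are added exactly once even with 3 server nodes.
if (ignite.cluster().forOldest().node().isLocal()) {
    IgniteQueue<String> queue = ignite.queue("queueName", 0, new CollectionConfiguration());
    for (String name : names)
        queue.add(name);
}

Keep in mind that the oldest node can change if it leaves the cluster, which is why the cluster singleton service is the more robust of the two options.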
I fixed the boot-time duplicate cache loading issue this way:
final IgniteAtomicLong cacheLoadCnt = ignite.atomicLong(cacheName + "Cnt", 0, true);
if (cacheLoadCnt.get() == 0) {
    loadCache();
    cacheLoadCnt.addAndGet(1);
}
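A small variation on this guard, assuming the same atomic long, is to use compareAndSet so the check and the increment happen as a single atomic step, closing the window where two nodes could both read 0:

final IgniteAtomicLong cacheLoadCnt = ignite.atomicLong(cacheName + "Cnt", 0, true);
// Only the node that wins the compare-and-set performs the load.
if (cacheLoadCnt.compareAndSet(0, 1)) {
    loadCache();
}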