Sync Framework 2.1 taking too much time to sync for the first time - SQL

I am using Sync Framework 2.1 to sync my database, and the sync is unidirectional only, i.e. from the source database to the destination database. I am facing a problem while syncing the database for the first time. I have divided the tables into different groups so that there is less overhead while syncing. One of the tables has 900k records in it. While syncing that table, the SyncOrchestrator.Synchronize() method never returns; network usage and disk I/O go high. I waited two days for the sync process to complete, but nothing happened. I also checked in the SQL database using "sp_who2" and the process is in suspended mode. I also ran some queries I found online, and they indicate that table_selectchanges takes too much time.
I have used the following code to sync my database.
//setup the connections
var serverConn = new SqlConnection(sourceConnectionString);
var clientConn = new SqlConnection(destinationConnectionString);
// create the sync orchestrator
var syncOrchestrator = new SyncOrchestrator();
//setup providers
var localProvider = new SqlSyncProvider(scopeName, clientConn);
var remoteProvider = new SqlSyncProvider(scopeName, serverConn);
localProvider.CommandTimeout = 0;
remoteProvider.CommandTimeout = 0;
localProvider.ObjectSchema = schemaName;
remoteProvider.ObjectSchema = schemaName;
syncOrchestrator.LocalProvider = localProvider;
syncOrchestrator.RemoteProvider = remoteProvider;
// set the direction of sync session Download
syncOrchestrator.Direction = SyncDirectionOrder.Download;
// execute the synchronization process
syncOrchestrator.Synchronize();

We had this problem. We removed all the foreign key constraints from the database for the first sync. This reduced the initial sync from over 4 hours to about 10 minutes. After the initial sync completed, we put the foreign key constraints back.
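For reference, here is a minimal sketch of how that step could be scripted around the sync call, assuming a SQL Server destination. It disables and later re-validates the constraints instead of dropping and re-creating them, which has a similar effect on the initial load; the SetForeignKeyChecks helper is hypothetical, and sp_MSforeachtable is an undocumented but commonly available SQL Server procedure.
// A sketch only: disable or re-enable every foreign key constraint on the destination.
using System.Data.SqlClient;

static void SetForeignKeyChecks(string destinationConnectionString, bool enable)
{
    string sql = enable
        ? "EXEC sp_MSforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT ALL'"
        : "EXEC sp_MSforeachtable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL'";

    using (var conn = new SqlConnection(destinationConnectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.CommandTimeout = 0; // re-validation can take a while on large tables
        conn.Open();
        cmd.ExecuteNonQuery();
    }
}

// Usage around the initial sync:
// SetForeignKeyChecks(destinationConnectionString, enable: false);
// syncOrchestrator.Synchronize();
// SetForeignKeyChecks(destinationConnectionString, enable: true);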

Related

SQL queries returning incorrect results during High Load

I have a table into which inserts happen at the beginning of the performance runs, when the job starts. During the insertion there are also parallel operations (GET/UPDATE queries) happening on that table. The GET operation also updates a value in a column, marking that record as picked. However, the next GET performed on the table returns the same record again, even though the record was already marked as in progress.
P.S.: both operations are done by the same single thread in the system. Logs are below for reference: the record is marked in progress on the first line at 20:36:42,864, yet it is returned in the result set of the query executed after 20:36:42,891 by the same thread.
We also observed that during high load (usually in the same scenario as mentioned above) some update operations were intermittently not applied to the table, even though the update executed successfully without throwing an exception (validated using the returned result and then doing a GET just after that to check the updated value).
13 Apr 2020 20:36:42,864 [SHT-4083-initial] FINEST - AbstractCacheHelper.markContactInProgress:2321 - Action state after mark in progresss contactId.ATTR=: 514409 for jobId : 4083 is actionState : 128
13 Apr 2020 20:36:42,891 [SHT-4083-initial] FINEST - CacheAdvListMgmtHelper.getNextContactToProcess:347 - Query : select priority, contact_id, action_state, pim_contact_store_id, action_id
, retry_session_id, attempt_type, zone_id, action_pos from pim_4083 where handler_id = ? and attempt_type != ? and next_attempt_after <= ? and action_state = ? and exclude_flag = ? order
by attempt_type desc, priority desc, next_attempt_after asc,contact_id asc limit 1
This usually happens during the performance runs when parallel jobs that work on Ignite are started. Can anyone suggest what can be done to avoid such a situation?
We have 2 Ignite data nodes deployed as Spring Boot services in the cluster, accessed by 3 client nodes.
Ignite version: 2.7.6. The cache configuration is as follows:
IgniteConfiguration cfg = new IgniteConfiguration();
CacheConfiguration cachecfg = new CacheConfiguration(CACHE_NAME);
cachecfg.setRebalanceThrottle(100);
cachecfg.setBackups(1);
cachecfg.setCacheMode(CacheMode.REPLICATED);
cachecfg.setRebalanceMode(CacheRebalanceMode.ASYNC);
cachecfg.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
cachecfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
// Defining and creating a new cache to be used by Ignite Spring Data repository.
CacheConfiguration ccfg = new CacheConfiguration(CACHE_TEMPLATE);
ccfg.setStatisticsEnabled(true);
ccfg.setCacheMode(CacheMode.REPLICATED);
ccfg.setBackups(1);
DataStorageConfiguration dsCfg = new DataStorageConfiguration();
dsCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
dsCfg.setStoragePath(storagePath);
dsCfg.setWalMode(WALMode.FSYNC);
dsCfg.setWalPath(walStoragePath);
dsCfg.setWalArchivePath(archiveWalStoragePath);
dsCfg.setWriteThrottlingEnabled(true);
cfg.setAuthenticationEnabled(true);
dsCfg.getDefaultDataRegionConfiguration()
.setInitialSize(Long.parseLong(cacheInitialMemSize) * 1024 * 1024);
dsCfg.getDefaultDataRegionConfiguration().setMaxSize(Long.parseLong(cacheMaxMemSize) * 1024 * 1024);
cfg.setDataStorageConfiguration(dsCfg);
cfg.setClientConnectorConfiguration(clientCfg);
// Run the command to alter the default user credentials
// ALTER USER "ignite" WITH PASSWORD 'new_passwd'
cfg.setCacheConfiguration(cachecfg);
cfg.setFailureDetectionTimeout(Long.parseLong(cacheFailureTimeout));
ccfg.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
ccfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
ccfg.setRebalanceMode(CacheRebalanceMode.ASYNC);
ccfg.setRebalanceThrottle(100);
int pool = cfg.getSystemThreadPoolSize();
cfg.setRebalanceThreadPoolSize(2);
cfg.setLifecycleBeans(new MyLifecycleBean());
logger.info(methodName, "Starting ignite service");
ignite = Ignition.start(cfg);
ignite.cluster().active(true);
// Get all server nodes that are already up and running.
Collection<ClusterNode> nodes = ignite.cluster().forServers().nodes();
// Set the baseline topology that is represented by these nodes.
ignite.cluster().setBaselineTopology(nodes);
ignite.addCacheConfiguration(ccfg);

Querying over large data under NHibernate transaction

I understand that explicit transactions should be used even for reading data, but I am unable to understand why the code below runs much slower under an NHibernate transaction (as opposed to running without one):
session.BeginTransaction();
var result = session.Query<Order>().Where(o=>o.OrderNumber > 0).Take(100).ToList();
session.Transaction.Commit();
I can post more detailed unit-test code if needed, but when querying over 50,000 Order records, this query takes about 1 second to run under NHibernate's explicit transaction, and only about 15-20 ms without one.
Update 1/15/2019
Here is the detailed code
[Test]
public void TestQueryLargeDataUnderTransaction()
{
    int count = 50000;
    using (var session = _sessionFactory.OpenSession())
    {
        Order order;
        // write a large amount of data
        session.BeginTransaction();
        for (int i = 0; i < count; i++)
        {
            order = new Order { OrderNumber = i, OrderDate = DateTime.Today };
            OrderLine ol1 = new OrderLine { Amount = 1 + i, ProductName = $"sun screen {i}", Order = order };
            OrderLine ol2 = new OrderLine { Amount = 2 + i, ProductName = $"banjo {i}", Order = order };
            order.OrderLines = new List<OrderLine> { ol1, ol2 };
            session.Save(order);
            session.Save(ol1);
            session.Save(ol2);
        }
        session.Transaction.Commit();
        Stopwatch s = Stopwatch.StartNew();
        // read the same data back
        session.BeginTransaction();
        var result = session.Query<Order>().Where(o => o.OrderNumber > 0).Skip(0).Take(100).ToList();
        session.Transaction.Commit();
        s.Stop();
        Console.WriteLine(s.ElapsedMilliseconds);
    }
}
Your for-loop iterates 50,000 times, and each iteration creates 3 objects. So by the time you reach the first call to Commit(), the session knows about 150,000 objects that it will flush to the database at commit time or earlier (subject to your id generator policy and flush mode).
So far, so good. NHibernate is not necessarily optimised to handle so many objects in the session, but it can be acceptable providing one is careful.
On to the problem...
It's important to realize that committing the transaction does not remove the 150000 objects from the session.
When you later perform the query, it will notice that it is inside a transaction, in which case, by default, "auto-flushing" will be performed. This means that before sending the SQL query to the database, NHibernate will check if any of the objects known to the session has changes that might affect the outcome of the query (this is somewhat simplified). If such changes are found, they will be transmitted to the database before performing the actual SQL query. This ensures that the executed query will be able to filter based on changes made in the same session.
The extra second you notice is the time it takes for NHibernate to iterate over the 150,000 objects known to the session to check for any changes. The primary use cases for NHibernate rarely involve more than tens or a few hundred objects, in which case the time needed to check for changes is negligible.
You can use a new session for the query to not see this effect, or you can call session.Clear() immediately after the first commit. (Note that for production code, session.Clear() can be dangerous.)
Additional: The auto-flushing happens when querying but only if inside a transaction. This behaviour can be controlled using session.FlushMode. During auto-flush NHibernate will aim to flush only objects that may affect the outcome of the query (i.e. which database tables are affected).
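To make those two options concrete, here is a minimal sketch based on the test code above (FlushMode.Commit tells NHibernate to flush only at commit rather than before queries; note that Clear() also discards any pending unflushed changes):
// Option 1: evict everything the session is tracking right after the bulk insert.
session.Transaction.Commit();
session.Clear(); // the 150,000 objects are forgotten; later queries have nothing to auto-flush

// Option 2: keep the session contents but disable auto-flush before queries.
session.BeginTransaction();
session.FlushMode = FlushMode.Commit; // flush at commit only, not before each query
var result = session.Query<Order>()
                    .Where(o => o.OrderNumber > 0)
                    .Take(100)
                    .ToList();
session.Transaction.Commit();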
There is an additional effect to be aware of with regards to keeping sessions around. Consider this code:
using (var session = _sessionFactory.OpenSession())
{
    Order order;
    session.BeginTransaction();
    for (int i = 0; i < count; i++)
    {
        // Your code from above.
    }
    session.Transaction.Commit();

    // The order variable references the last order created. Let's modify it.
    order.OrderDate = DateTime.Today.AddDays(4);

    session.BeginTransaction();
    var result = session.Query<Order>().Skip(0).Take(100).ToList();
    session.Transaction.Commit();
}
What will happen with the change to the order date done after the first call to Commit()? That change will be persisted to the database when the query is performed in the second transaction despite the fact that the object modification itself happened before the transaction was started. Conversely, if you remove the second transaction, that modification will not be persisted of course.
There are multiple ways to manage sessions and transactions that can be used for different purposes. However, by far the easiest is to always follow this simple unit-of-work pattern (a code sketch follows the list):
Open session.
Immediately open transaction.
Perform a reasonable amount of work.
Commit or rollback transaction.
Dispose transaction.
Dispose session.
Discard all objects loaded using the session. At this point they can still be used in memory, but any changes will not be persisted. Safer to just get rid of them.
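A minimal sketch of that unit-of-work pattern, reusing the session factory and Order entity from the question:
using (var session = _sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
    // a reasonable amount of work
    var orders = session.Query<Order>()
                        .Where(o => o.OrderNumber > 0)
                        .Take(100)
                        .ToList();
    tx.Commit();
}
// Both the transaction and the session are disposed here; treat the loaded orders
// as detached snapshots and do not expect further changes to them to be persisted.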

Data is not properly stored to HSQLDB when using a pooled data source from DBCP

I'm using HSQLDB to create cached tables and indexed tables.
The data being stored arrives at a pretty high frequency, so I need to use a connection pool.
Also, because there is a lot of data, I do not call checkpoint on every commit, but rather expect the data to be flushed after 50,000 rows are inserted.
The thing is, I can see the .data file growing, but when I connect with an HSQLDB client I don't see the tables or the data.
I ran 2 simple tests: one inserted a single row and one inserted 60,000 rows into a new table. In both cases I couldn't see the result in any HSQLDB client.
(Note that I use shutdown=true.)
When I add a checkpoint after each commit, it solves the problem.
Also, if I specify in the connection string to use the log, it solves the problem (though I don't want the log in production). Not using a pooled connection also solved the problem, and lastly, so did using the pooled data source but explicitly closing it before shutdown.
So I guess that some connections in the connection pool are not being closed, which somehow prevents the DB from committing the changes and making them available to the client. But then, why couldn't I see the result even with 60,000 rows?
I would also expect the pool to be closed automatically...
What am I doing wrong? What is happening behind the scenes?
The code to get the data source looks like this:
Class.forName("org.hsqldb.jdbcDriver");
String url = "jdbc:hsqldb:" + m_dbRoot + dbName + "/db" + ";hsqldb.log_data=false;shutdown=true;hsqldb.nio_data_file=false";
ConnectionFactory connectionFactory = new DriverManagerConnectionFactory(url, user, password);
GenericObjectPool connectionPool = new GenericObjectPool();
KeyedObjectPoolFactory stmtPool = new GenericKeyedObjectPoolFactory(null);
new PoolableConnectionFactory(connectionFactory, connectionPool, stmtPool, null, false, true);
DataSource ds = new PoolingDataSource(connectionPool);
And I'm using this pooled data source to create a table:
Connection c = m_dataSource.getConnection();
Statement st = c.createStatement();
String script = String.format("CREATE CACHED TABLE IF NOT EXISTS %s (id %s NOT NULL, entity %s NOT NULL, PRIMARY KEY (id));", m_tableName, m_idGenerator.getIdType(), TABLE_ENTITY_TYPE);
st.execute(script);
st.close();
c.close();
And to insert rows:
Connection c = m_dataSource.getConnection();
c.setAutoCommit(false);
PreparedStatement stmt = c.prepareStatement(m_sqlInsert);
stmt.setObject(1, id);
stmt.setBinaryStream(2, Serializer.Helper.serialize(m_serializer, entity));
stmt.executeUpdate();
c.commit();
stmt.close();
c.close();
So the above seems to add data, but it cannot be seen.
When I explicitly called
connectionPool.close();
then, and only then, could I see the result.
I also tried to use JDBCDataSource and it worked as well.
So what is going on? And what is the right way to do this?
Your method of accessing the database from outside your application process is simply wrong.
Only one Java process at a time is supposed to connect to a file: database.
In order to achieve your aim, launch an HSQLDB server within your application, using exactly the same JDBC URL. Then connect to this server from the external client.
See the Guide:
http://www.hsqldb.org/doc/2.0/guide/listeners-chapt.html#lsc_app_start
Update: The OP commented that the external client was used after the application had stopped. Because you have turned the log off with hsqldb.log_data=false, nothing is persisted permanently. You need to perform an explicit CHECKPOINT or SHUTDOWN when your application completes its work. You cannot rely on shutdown=true at all, even without connection pooling.
See the Guide:
http://www.hsqldb.org/doc/2.0/guide/deployment-chapt.html#dec_bulk_operations

Releasing XASession XAResource - Manual enlisting

In our MDB we have an XA transaction between the DB and a Tibco foreign-server queue. We have enlisted the foreign server XAResource using the code below.
The MDB is on WebLogic Server 10.3.6, JDK 1.6.
init()---
XAConnection tempXAConn = xaConn;
TibjmsXAConnectionFactory xaConnFactory = (TibjmsXAConnectionFactory)ServiceLocator.getInstance().getJNDIReferencedObject(JMS_Q_CONNECTION_FACTORY_JNDI_XA);
xaConn = xaConnFactory.createXAConnection(JMS_USER,JMS_PSWD);
getsession()---
XASession xaSession = xaConn.createXASession();
TransactionHelper txHelper = TransactionHelper.popTransactionHelper();
Transaction tx = txHelper.getTransaction();
tx.enlistResource(xaSession.getXAResource());
Transactions are working fine. We are using one connection and creating a new XASession for every message.
The problem is releasing resources: after a few thousand messages I see the heap holding a corresponding number of Tibjmsxasession, Tibjmsxaresource, and Tibjmslongkey objects, and this leads to an out-of-memory issue.
We cannot call session.close() in the middle of the transaction.
The transactions are container-managed; only the enlisting is done manually.
I used
tx.registerSynchronization(new SessionSynchronization());
SessionSynchronization implements Synchronization and has two methods, afterCompletion and beforeCompletion. session.close() can be called inside afterCompletion, and the session can be kept in a ThreadLocal.

Transaction timeout expired while using Linq2Sql DataContext.SubmitChanges()

Please help me resolve this problem:
There is an ambient MSMQ transaction. I'm trying to use a new transaction for logging, but I get the following error while attempting to submit changes: "Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding." Here is the code:
public static void SaveTransaction(InfoToLog info)
{
    using (TransactionScope scope =
        new TransactionScope(TransactionScopeOption.RequiresNew))
    {
        using (TransactionLogDataContext transactionDC =
            new TransactionLogDataContext())
        {
            transactionDC.MyInfo.InsertOnSubmit(info);
            transactionDC.SubmitChanges();
        }
        scope.Complete();
    }
}
Thanks.
You could consider increasing the timeout or eliminating it altogether.
Something like:
using (TransactionLogDataContext transactionDC = new TransactionLogDataContext())
{
    transactionDC.CommandTimeout = 0; // No timeout.
}
Be careful
You said:
Thank you, but this solution raises a new question: if the transaction scope was changed, why does the submit operation become so time-consuming? The database and application are on the same machine.
That is because you are creating a new DataContext right there:
TransactionLogDataContext transactionDC = new TransactionLogDataContext();
With a new data context, ADO.NET opens up a new connection (even if the connection strings are the same, unless you do some clever connection pooling).
When you work with more than one connection instance within a transaction context (which you just did), ADO.NET automatically promotes the transaction to a distributed transaction and tries to enlist it in MSDTC. Enlisting the very first transaction per connection into MSDTC takes time (for me it takes 30+ seconds); consecutive transactions are fast, however (in my case 60 ms). Take a look at http://support.microsoft.com/Default.aspx?id=922430
What you can do is reuse the transaction and connection (if possible) when you create the new DataContext.
TransactionLogDataContext tempDataContext =
new TransactionLogDataContext(ExistingDataContext.Transaction.Connection);
tempDataContext.Transaction = ExistingDataContext.Transaction;
where ExistingDataContext is the one which started the ambient transaction.
Or attempt to speed up your MS DTC.
Also, do use SQL Profiler as suggested by billb and look at the SessionId across the different commands (save and savelog in your case). If the SessionId changes, you are in fact using 2 different connections, and in that case you will have to reuse the transaction (if you don't want it to be promoted to MS DTC).