Ignite SQL query is taking time
We are currently using GridGain Community Edition 8.8.10. We have set up the Ignite cluster in Kubernetes using the Ignite operator. The cluster consists of 2 nodes with native persistence enabled, and we use a thick client to connect to it. The clients are also deployed in the same Kubernetes cluster. The JVM and memory configuration of the cluster is as follows:
-DIGNITE_WAL_MMAP=false -DIGNITE_QUIET=false -Xms6g -Xmx6g -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC
<bean class="org.apache.ignite.configuration.DataRegionConfiguration">
    <property name="name" value="Knowledge_Region"/>
    <!-- Memory region of 20 MB initial size. -->
    <property name="initialSize" value="#{20 * 1024 * 1024}"/>
    <!-- Maximum size is 9 GB. -->
    <property name="maxSize" value="#{9L * 1024 * 1024 * 1024}"/>
    <!-- Enabling eviction for this memory region. -->
    <property name="pageEvictionMode" value="RANDOM_2_LRU"/>
    <property name="persistenceEnabled" value="true"/>
    <!-- Enabling SEGMENTED_LRU page replacement for this region. -->
    <property name="pageReplacementMode" value="SEGMENTED_LRU"/>
</bean>
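For reference, the same region can also be defined programmatically. This is only a sketch mirroring the XML above (the SEGMENTED_LRU page replacement setter is GridGain-specific, so its exact enum/package may differ in your version); note that in Apache Ignite, pageEvictionMode applies only to regions without persistence, so with persistenceEnabled=true the page replacement setting is the one that matters:

```java
import org.apache.ignite.configuration.DataPageEvictionMode;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

// Sketch: Java equivalent of the XML region configuration above.
DataRegionConfiguration region = new DataRegionConfiguration()
    .setName("Knowledge_Region")
    .setInitialSize(20L * 1024 * 1024)       // 20 MB
    .setMaxSize(9L * 1024 * 1024 * 1024)     // 9 GB
    // Ignored for persistent regions; listed only to mirror the XML.
    .setPageEvictionMode(DataPageEvictionMode.RANDOM_2_LRU)
    .setPersistenceEnabled(true);

IgniteConfiguration cfg = new IgniteConfiguration()
    .setDataStorageConfiguration(
        new DataStorageConfiguration().setDataRegionConfigurations(region));
```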
We are using an Ignite string function (REGEXP_LIKE) to query the cache. The cache value class is as follows:
@QuerySqlField(index = true, inlineSize = 100)
private String value;

@QuerySqlField(name = "label", index = true, inlineSize = 100)
private String label;

@QuerySqlField(name = "type", index = true, inlineSize = 100)
@AffinityKeyMapped
private String type;

private String typeLabel;
private List<String> synonyms;
The SQL query we are using to fetch the data is as follows:
select _key, _val from TESTCACHEVALUE USE INDEX(TESTCACHEVALUE_label_IDX) WHERE REGEXP_LIKE(label, 'unit.*s.*','i') LIMIT 8
The query plan that gets generated:
[05:04:56,613][WARNING][long-qry-#36][LongRunningQueryManager] Query execution is too long [duration=1124ms, type=MAP, distributedJoin=false, enforceJoinOrder=false, lazy=false, schema=staging_infrastructuretesting_business_object, sql='SELECT
"__Z0"."_KEY" AS "__C0_0",
"__Z0"."_VAL" AS "__C0_1"
FROM "staging_infrastructuretesting_business_object"."TESTCACHEVALUE" AS "__Z0" USE INDEX ("TESTCACHEVALUE_LABEL_IDX")
WHERE REGEXP_LIKE("__Z0"."LABEL", 'uni.*', 'i') FETCH FIRST 8 ROWS ONLY', plan=SELECT
__Z0._KEY AS __C0_0,
__Z0._VAL AS __C0_1
FROM staging_infrastructuretesting_business_object.TESTCACHEVALUE __Z0 USE INDEX (TESTCACHEVALUE_LABEL_IDX)
/* staging_infrastructuretesting_business_object.TESTCACHEVALUE.__SCAN_ */
/* scanCount: 289643 */
/* lookupCount: 1 */
WHERE REGEXP_LIKE(__Z0.LABEL, 'uni.*', 'i')
FETCH FIRST 8 ROWS ONLY
As the plan shows (the __SCAN_ marker and scanCount: 289643), the query performs a full scan and does not use the index specified in the hint.
The cache contains 5 million objects.
The memory statistics of the cluster are as follows:
^-- Node [id=d87d1212, uptime=00:30:00.229]
^-- Cluster [hosts=6, CPUs=20, servers=2, clients=4, topVer=12, minorTopVer=25]
^-- Network [addrs=[10.57.5.10, 127.0.0.1], discoPort=47500, commPort=47100]
^-- CPU [CPUs=1, curLoad=16%, avgLoad=38.3%, GC=0%]
^-- Heap [used=4265MB, free=30.58%, comm=6144MB]
^-- Off-heap memory [used=4872MB, free=58.58%, allocated=11564MB]
^-- Page memory [pages=620072]
^-- sysMemPlc region [type=internal, persistence=true, lazyAlloc=false,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=99.96%, allocRam=100MB, allocTotal=0MB]
^-- metastoreMemPlc region [type=internal, persistence=true, lazyAlloc=false,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=99.87%, allocRam=0MB, allocTotal=0MB]
^-- TxLog region [type=internal, persistence=true, lazyAlloc=false,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=100MB, allocTotal=0MB]
^-- volatileDsMemPlc region [type=internal, persistence=false, lazyAlloc=true,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
^-- Default_Region region [type=default, persistence=true, lazyAlloc=true,
... initCfg=20MB, maxCfg=9216MB, usedRam=4781MB, freeRam=48.12%, allocRam=9216MB, allocTotal=4753MB]
^-- Ignite persistence [used=4844MB]
^-- Outbound messages queue [size=0]
^-- Public thread pool [active=0, idle=0, qSize=0]
^-- System thread pool [active=0, idle=8, qSize=0]
^-- Striped thread pool [active=0, idle=8, qSize=0]
From the memory snapshot it seems we have enough memory in the cluster.
What I have tried so far:
- Index hint in the query
- Applied a LIMIT to the query
- Partitioned cache with query parallelism 3
- skipReducerOnUpdate set to true
- onheapCacheEnabled set to true
Not sure why the query is taking so long. Please let me know if I have missed anything.
One more observation: the query execution plan reports around 2 seconds, but on the client side the response arrives in about 5 seconds.
Thanks in advance.
It seems you are missing the fact that the Apache Ignite SQL engine uses a B+Tree data structure internally for indexes. A B+Tree relies on an ordering of the stored values (there must be a way to compare them). The only kind of textual search this structure can accelerate is a prefix search, because a prefix establishes a branching condition for the search algorithm. Here is an example:
select _key, _val from TESTCACHEVALUE WHERE label LIKE 'unit%'
In that case you would see the TESTCACHEVALUE_label_IDX index being used even without a hint.
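The "branching condition" above can be illustrated with any sorted structure: a prefix bounds the search to one contiguous key range, which is exactly what a B+Tree index exploits. A plain-Java sketch using TreeMap (not the Ignite API):

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class PrefixScanSketch {
    public static void main(String[] args) {
        NavigableMap<String, Integer> index = new TreeMap<>();
        index.put("alpha", 1);
        index.put("unit", 2);
        index.put("unit tests", 3);
        index.put("universe", 4);
        index.put("zebra", 5);

        // All keys starting with "unit" form one contiguous range:
        // ["unit", "unit\uffff"). A B+Tree answers this with a single
        // seek plus a short ordered scan, never touching other keys.
        NavigableMap<String, Integer> hits =
            index.subMap("unit", true, "unit\uffff", false);
        System.out.println(hits.keySet()); // [unit, unit tests]
    }
}
```

A regular expression like `unit.*s.*` offers no such lower/upper bound, so the tree order cannot prune anything.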
For your scenario, REGEXP_LIKE is just an iteration that applies Matcher.find() to each label one by one.
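A rough sketch of what that per-row work looks like in plain java.util.regex (assuming the 'i' flag maps to CASE_INSENSITIVE, which is how H2 implements it):

```java
import java.util.regex.Pattern;

public class RegexpScanSketch {
    public static void main(String[] args) {
        // The pattern is compiled once, then applied row by row: every
        // row must be materialized and matched; index order cannot help.
        Pattern p = Pattern.compile("unit.*s.*", Pattern.CASE_INSENSITIVE);
        String[] labels = {"Unit tests", "universe", "unrelated"};
        for (String label : labels) {
            System.out.println(label + " -> " + p.matcher(label).find());
        }
        // Unit tests -> true, universe -> false, unrelated -> false
    }
}
```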
Try the Ignite Text Query machinery. It's based on Apache Lucene and looks more suitable for the case.
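A minimal sketch of the text-query approach. This assumes a running server node, that the searched field is annotated with @QueryTextField, and illustrative names (ignite, TestCacheValue, the cache name) taken from the question; it is not a drop-in implementation:

```java
// Sketch only: requires a running Ignite node and a value class whose
// searched field carries @QueryTextField so Lucene indexes it.
IgniteCache<String, TestCacheValue> cache = ignite.cache("TESTCACHEVALUE");

// Lucene wildcard query over the text-indexed field.
TextQuery<String, TestCacheValue> qry =
    new TextQuery<>(TestCacheValue.class, "uni*");
qry.setLimit(8); // cap results, analogous to LIMIT 8 in the SQL query

try (QueryCursor<Cache.Entry<String, TestCacheValue>> cursor = cache.query(qry)) {
    for (Cache.Entry<String, TestCacheValue> e : cursor)
        System.out.println(e.getKey() + " -> " + e.getValue());
}
```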
No, you don't want to force the index for this problem. With a regular-expression predicate, the hint only makes the engine walk the index top to bottom, which is still a full scan and can make the query slower.
The query below should work instead (note that LIKE uses % as its wildcard, so the regex unit.*s.* becomes unit%s%):
select _key, _val from TESTCACHEVALUE WHERE label LIKE 'unit%s%'
Related
Spring JMS - Message-driven-channel-adapter The number of consumers doesn't reduce to the standard level
I have a message-driven-channel-adapter and I defined max-concurrent-consumers as 100 and concurrent-consumers as 2. When I ran a load test, I saw the number of concurrent consumers increase, but after the load test it did not come back down to the standard level. I am checking it with the RabbitMQ management portal. When the project restarts (no load test), Get (Empty) is 650/s, but after the load test it stays around 2500/s and does not return to 650/s. I think the concurrent-consumers count is being increased but never reduced back to its original value. How can I make it reduce to the normal level again? Here is my message-driven-channel-adapter definition:
<int-jms:message-driven-channel-adapter id="inboundAdapter"
    auto-startup="true"
    connection-factory="jmsConnectionFactory"
    destination="inboundQueue"
    channel="requestChannel"
    error-channel="errorHandlerChannel"
    receive-timeout="-1"
    concurrent-consumers="2"
    max-concurrent-consumers="100" />
With receiveTimeout=-1, the container has no control over an idle consumer (it is blocked inside the JMS client). You also need to set max-messages-per-task for the container to consider stopping a consumer:
<int-jms:message-driven-channel-adapter id="inboundAdapter"
    auto-startup="true"
    connection-factory="jmsConnectionFactory"
    destination-name="inboundQueue"
    channel="requestChannel"
    error-channel="errorHandlerChannel"
    receive-timeout="5000"
    concurrent-consumers="2"
    max-messages-per-task="10"
    max-concurrent-consumers="100" />
The time allowed for an idle consumer is receiveTimeout * maxMessagesPerTask (here 5000 ms * 10 = 50 seconds).
Hibernate search Monitoring the Index process
I am using Hibernate Search to index data from a PostgreSQL database. Since the process takes quite long, I want to display a progress bar to estimate how long indexing will take, and also to show which entity is currently being indexed. First I enabled jmx_enabled and generate_statistics in my persistence.xml:
<property name="hibernate.search.generate_statistics" value="true"/>
<property name="hibernate.search.jmx_enabled" value="true"/>
Then I added the progress monitor to the FullTextSession in my indexing class like this:
MassIndexerProgressMonitor monitor = new SimpleIndexingProgressMonitor();
FullTextSession fullTextSession = Search.getFullTextSession(em.unwrap(Session.class));
fullTextSession.getStatistics();
fullTextSession.createIndexer(TCase.class).progressMonitor(monitor).startAndWait();
The problem is that I still don't know how to print the progress to the console while indexing.
According to the documentation of SimpleIndexingProgressMonitor, you need the INFO level enabled at the package level for org.hibernate.search.batchindexing.impl, or at the class level for org.hibernate.search.batchindexing.impl.SimpleIndexingProgressMonitor. Can you check your log level?
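If you use Logback, for example, enabling that logger could look like the fragment below (a hypothetical logback.xml sketch; adapt the logger configuration to whichever logging backend you actually use):

```xml
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{HH:mm:ss} %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <!-- INFO level so SimpleIndexingProgressMonitor's progress messages appear -->
  <logger name="org.hibernate.search.batchindexing.impl" level="INFO"/>

  <root level="WARN">
    <appender-ref ref="STDOUT"/>
  </root>
</configuration>
```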
Getting out of memory on Ignite server node
I have set up an Ignite 2.3 server node along with 32 client nodes. After running multiple queries, I observed an OutOfMemoryError in the server node logs.
Server configuration:
- 4 GB Java max heap memory
- Ignite persistence is disabled
- Using the default data region
- Using Spring Data to run queries against the Ignite node
Captured memory snapshot of the Ignite server node:
- Node [id= =44:33:12.948]
^-- H/N/C [hosts=32, nodes=32, CPUs=39]
^-- CPU [cur=3.7%, avg=0.23%, GC=0%]
^-- Page Memory [pages=303325]
^-- Heap [used=2404MB, free=36.21%, comm=3769MB]
^-- Non heap [used=78MB, free=-1%, comm=80MB]
^-- Public thread pool [active=0, idle=0, qSize=0]
^-- System thread pool [active=0, idle=6, qSize=0]
^-- Outbound messages queue [size=0]
Heap dump analysis:
query-#8779
at java.nio.Bits$1.newDirectByteBuffer(JILjava/lang/Object;)Ljava/nio/ByteBuffer; (Bits.java:758)
at org.apache.ignite.internal.util.GridUnsafe.wrapPointer(JI)Ljava/nio/ByteBuffer; (GridUnsafe.java:113)
at org.apache.ignite.internal.pagemem.impl.PageMemoryNoStoreImpl.pageBuffer(J)Ljava/nio/ByteBuffer; (PageMemoryNoStoreImpl.java:253)
at org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(Lorg/apache/ignite/internal/processors/cache/CacheGroupContext;Lorg/apache/ignite/internal/processors/cache/GridCacheSharedContext;Lorg/apache/ignite/internal/pagemem/PageMemory;Lorg/apache/ignite/internal/processors/cache/persistence/CacheDataRowAdapter$RowData;)V (CacheDataRowAdapter.java:167)
at org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(Lorg/apache/ignite/internal/processors/cache/CacheGroupContext;Lorg/apache/ignite/internal/processors/cache/persistence/CacheDataRowAdapter$RowData;)V (CacheDataRowAdapter.java:102)
at org.apache.ignite.internal.processors.query.h2.database.H2RowFactory.getRow(J)Lorg/apache/ignite/internal/processors/query/h2/opt/GridH2Row; (H2RowFactory.java:62)
at org.apache.ignite.internal.processors.query.h2.database.io.H2ExtrasLeafIO.getLookupRow(Lorg/apache/ignite/internal/processors/cache/persistence/tree/BPlusTree;JI)Lorg/h2/result/SearchRow; (H2ExtrasLeafIO.java:126)
at org.apache.ignite.internal.processors.query.h2.database.io.H2ExtrasLeafIO.getLookupRow(Lorg/apache/ignite/internal/processors/cache/persistence/tree/BPlusTree;JI)Ljava/lang/Object; (H2ExtrasLeafIO.java:36)
at org.apache.ignite.internal.processors.query.h2.database.H2Tree.getRow(Lorg/apache/ignite/internal/processors/cache/persistence/tree/io/BPlusIO;JILjava/lang/Object;)Lorg/apache/ignite/internal/processors/query/h2/opt/GridH2Row; (H2Tree.java:123)
at org.apache.ignite.internal.processors.query.h2.database.H2Tree.getRow(Lorg/apache/ignite/internal/processors/cache/persistence/tree/io/BPlusIO;JILjava/lang/Object;)Ljava/lang/Object; (H2Tree.java:40)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.fillFromBuffer(JLorg/apache/ignite/internal/processors/cache/persistence/tree/io/BPlusIO;II)Z (BPlusTree.java:4548)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.nextPage()Z (BPlusTree.java:4641)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.next()Z (BPlusTree.java:4570)
at org.apache.ignite.internal.processors.query.h2.H2Cursor.next()Z (H2Cursor.java:78)
at org.h2.index.IndexCursor.next()Z (IndexCursor.java:305)
at org.h2.table.TableFilter.next()Z (TableFilter.java:499)
at org.h2.command.dml.Select$LazyResultQueryFlat.fetchNextRow()[Lorg/h2/value/Value; (Select.java:1452)
at org.h2.result.LazyResult.hasNext()Z (LazyResult.java:79)
at org.h2.result.LazyResult.next()Z (LazyResult.java:59)
at org.h2.command.dml.Select.queryFlat(ILorg/h2/result/ResultTarget;J)Lorg/h2/result/LazyResult; (Select.java:519)
at org.h2.command.dml.Select.queryWithoutCache(ILorg/h2/result/ResultTarget;)Lorg/h2/result/ResultInterface; (Select.java:625)
at org.h2.command.dml.Query.queryWithoutCacheLazyCheck(ILorg/h2/result/ResultTarget;)Lorg/h2/result/ResultInterface; (Query.java:114)
at org.h2.command.dml.Query.query(ILorg/h2/result/ResultTarget;)Lorg/h2/result/ResultInterface; (Query.java:352)
at org.h2.command.dml.Query.query(I)Lorg/h2/result/ResultInterface; (Query.java:333)
at org.h2.command.CommandContainer.query(I)Lorg/h2/result/ResultInterface; (CommandContainer.java:113)
at org.h2.command.Command.executeQuery(IZ)Lorg/h2/result/ResultInterface; (Command.java:201)
at org.h2.jdbc.JdbcPreparedStatement.executeQuery()Ljava/sql/ResultSet; (JdbcPreparedStatement.java:111)
at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSqlQuery(Ljava/sql/Connection;Ljava/sql/PreparedStatement;ILorg/apache/ignite/internal/processors/query/GridQueryCancel;)Ljava/sql/ResultSet; (IgniteH2Indexing.java:961)
at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSqlQueryWithTimer(Ljava/sql/PreparedStatement;Ljava/sql/Connection;Ljava/lang/String;Ljava/util/Collection;ILorg/apache/ignite/internal/processors/query/GridQueryCancel;)Ljava/sql/ResultSet; (IgniteH2Indexing.java:1027)
at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSqlQueryWithTimer(Ljava/sql/Connection;Ljava/lang/String;Ljava/util/Collection;ZILorg/apache/ignite/internal/processors/query/GridQueryCancel;)Ljava/sql/ResultSet; (IgniteH2Indexing.java:1006)
Is there a chance that you executed SELECT * without a WHERE clause, or a similar request with a huge result set? The result set is retained on the heap, which leads to OOM when serving such a request. Either use a LIMIT clause in your SQL query, or set lazy=true on your Connection/SqlFieldsQuery.
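For the SQL API path, the lazy flag can be set on the query object. A sketch only (the cache variable and table name are illustrative, and it requires a running node; the lazy flag was introduced in Ignite 2.3 as a hint):

```java
// Sketch: stream results page by page instead of materializing the
// whole result set on the server heap.
SqlFieldsQuery qry = new SqlFieldsQuery(
    "select _key, _val from SOME_TABLE")   // hypothetical table name
    .setLazy(true);                        // avoid buffering the full result set

try (FieldsQueryCursor<List<?>> cursor = cache.query(qry)) {
    for (List<?> row : cursor) {
        // process row by row; pages are fetched incrementally
    }
}
```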
Hive: acquire explicit exclusive lock
Configuration (Hortonworks):
hive: BUILD hive-1.2.1.2.3.0.0
Hadoop 2.7.1.2.3.0.0-2557
I'm trying to execute:
lock table event_metadata EXCLUSIVE;
Hive response:
Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Current transaction manager does not support explicit lock requests. Transaction manager: org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
In the code there is an obvious place where explicit locks are disabled: http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hive/hive-exec/1.2.0/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java#DbTxnManager
@Override
public boolean supportsExplicitLock() {
    return false;
}
Questions: how can I make explicit locks work? In what version of Hive did they appear? Here is an example (http://www.ericlin.me/how-table-locking-works-in-hive) for Cloudera showing that explicit locks work.
You may set the concurrency parameter on the fly:
set hive.support.concurrency=true;
After this you may try executing your command.
Hive includes a locking feature that uses Apache ZooKeeper for locking. ZooKeeper implements highly reliable distributed coordination. Other than some additional setup and configuration steps, ZooKeeper is invisible to Hive users. In the $HIVE_HOME/hive-site.xml file, set the following properties:
<property>
  <name>hive.zookeeper.quorum</name>
  <value>zk1.site.pvt,zk1.site.pvt,zk1.site.pvt</value>
  <description>The list of zookeeper servers to talk to. This is only needed for read/write locks.</description>
</property>
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
  <description>Whether Hive supports concurrency or not. A Zookeeper instance must be up and running for the default Hive lock manager to support read-write locks.</description>
</property>
After restarting Hive, run the command:
hive> lock table event_metadata EXCLUSIVE;
Reference: Programming Hive, O'Reilly
EDIT: DummyTxnManager.java, which provides the default Hive behavior, has:
@Override
public boolean supportsExplicitLock() {
    return true;
}
DummyTxnManager replicates pre-Hive-0.13 behavior and doesn't support transactions, whereas DbTxnManager.java, which stores transactions in the metastore database, has:
@Override
public boolean supportsExplicitLock() {
    return false;
}
Try the following:
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;
unlock table tablename;
ActiveMQ memory limit exceeded
I am trying to configure ActiveMQ for the following behavior: when the broker exceeds its memory limit, it should store messages in persistent storage. I use the following configuration:
BrokerService broker = new BrokerService();
broker.setBrokerName("activemq");
KahaDBPersistenceAdapter persistence = new KahaDBPersistenceAdapter();
persistence.setDirectory(new File(config.getProperty("amq.persistenceDir", "amq")));
broker.setPersistenceAdapter(persistence);
broker.setVmConnectorURI(new URI("vm://activemq"));
broker.getSystemUsage().getMemoryUsage().setLimit(64 * 1024 * 1024L);
broker.getSystemUsage().getStoreUsage().setLimit(1024 * 1024 * 1024 * 100L);
broker.getSystemUsage().getTempUsage().setLimit(1024 * 1024 * 1024 * 100L);
PolicyEntry policyEntry = new PolicyEntry();
policyEntry.setCursorMemoryHighWaterMark(50);
policyEntry.setExpireMessagesPeriod(0L);
policyEntry.setPendingDurableSubscriberPolicy(new StorePendingDurableSubscriberMessageStoragePolicy());
policyEntry.setMemoryLimit(64 * 1024 * 1024L);
policyEntry.setProducerFlowControl(false);
broker.setDestinationPolicy(new PolicyMap());
broker.getDestinationPolicy().setDefaultEntry(policyEntry);
broker.setUseJmx(true);
broker.setPersistent(true);
broker.start();
However, this does not work: ActiveMQ still consumes as much memory as needed to store the full queue. I also tried removing the PolicyEntry; that caused the broker to stop producers after the memory limit was reached. I could find nothing in the documentation about what I am doing wrong.
We use a storeCursor and set the memory limit as follows; this limits the amount of memory for all queues to 100MB:
<destinationPolicy>
  <policyMap>
    <policyEntries>
      <policyEntry queue=">" producerFlowControl="false" memoryLimit="100mb">
        <pendingQueuePolicy>
          <storeCursor/>
        </pendingQueuePolicy>
      </policyEntry>
    </policyEntries>
  </policyMap>
</destinationPolicy>
Make sure you set the destinations that your policy should apply to. In the XML example this is done with queue=">", but your example just creates a new PolicyMap(); try calling policyEntry.setQueue(">") instead to apply the entry to all queues, or add specific destinations to your PolicyMap. See this test for a full example:
https://github.com/apache/activemq/blob/master/activemq-unit-tests/src/test/java/org/apache/activemq/PerDestinationStoreLimitTest.java
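In the programmatic style used in the question, that would look roughly like the sketch below (setQueue(">") is inherited from DestinationMapEntry and matches all queues; the store-cursor policy class is the programmatic counterpart of <storeCursor/>):

```java
// Sketch: per-destination policy applied to all queues, mirroring the XML above.
PolicyEntry policyEntry = new PolicyEntry();
policyEntry.setQueue(">");                     // match every queue
policyEntry.setProducerFlowControl(false);
policyEntry.setMemoryLimit(100 * 1024 * 1024L); // 100MB per queue
// Programmatic equivalent of <storeCursor/> for queues.
policyEntry.setPendingQueuePolicy(new StorePendingQueueMessageStoragePolicy());

PolicyMap policyMap = new PolicyMap();
policyMap.setDefaultEntry(policyEntry);        // or put() entries per destination
broker.setDestinationPolicy(policyMap);
```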