Enviornment : Ignite-2.8.1, Java 8
I am getting heap full for my application after few hours of start. On analyzing heap dump, I see instances class org.apache.internal.processors.query.*. Looks like after query execution, it is not getting cleaned up from the heap and after some time leading to failure due to heap full.
One thing I have realized is all these entries are for queries that are triggered via ignite executor services or normal task scheduling services.
Please suggest. Attaching snapshot.
Ignite is just trying to execute the SQL queries that are being sent to it. You would need to investigate the client.
On the Ignite side, you can make sure you're using the right garbage collector and setting the heap size appropriately. There's no One Good Answer, but in general, use the G1 GC and I'd start with 4Gb of heap space if you're using a lot of SQL.
Related
We're running Matillion (v1.54) on an AWS EC2 instance (CentOS), based on Tomcat 8.5.
We have developped a few ETL jobs by now, and their execution takes quite a lot of time (that is, up to hours). We'd like to speed up the execution of our jobs, and I wonder how to identify the bottle neck.
What confuses me is that both the m5.2xlarge EC2 instance (8 vCPU, 32G RAM) and the database (Snowflake) don't get very busy and seem to be sort of idle most of the time (regarding CPU and RAM usage as shown by top).
Our environment is configured to use up to 16 parallel connections.
We also added JVM options -Xms20g -Xmx30g to /etc/sysconfig/tomcat8 to make sure the JVM gets enough RAM allocated.
Our Matillion jobs do transformations and loads into a lot of tables, most of which can (and should) be done in parallel. Still we see, that most of the tasks are processed in sequence.
How can we enhance this?
By default there is only one JDBC connection to Snowflake, so your transformation jobs might be getting forced serial for that reason.
You could try bumping up the number of concurrent connections under the Edit Environment dialog, like this:
There is more information here about concurrent connections.
If you do that, a couple of things to avoid are:
Transactions (begin, commit etc) will force transformation jobs to
run in serial again
If you have a parameterized transformation job,
only one instance of it can ever be running at a time. More information on that subject is here
Because the Matillion server is just generating SQL statements and running them in Snowflake, the Matillion server is not likely to be the bottleneck. You should make sure that your orchestration jobs are submitting everything to Snowflake at the same time and there are no dependencies (unless required) built into your flow.
These steps will be done in sequence:
These steps will be done in parallel (and will depend on Snowflake warehouse size to scale):
Also - try the Alter Warehouse Component with a higher concurrency level
Enviornment : Ignite-2.8.1, Java 11
I am getting out of memory for my application after few minutes of start. On analyzing heap dump created on OOM, I see millions of instances of class org.apache.internal.processors.continuous.GridContinuousMessage
I do not see any direct references of these from my code.
Please suggest. Attaching snapshot.
You seem to have a Continuous Query running, and it is too slow/hanging and not able to process notifications in time, leading to their pile-up.
I'm performing some queries over a tpch 100gb dataset on presto, I have 4 nodes, 1 master, 3 workers. When I try to run some queries, not all of them, I see on Presto web interface that the nodes die during the execution, resulting in query failure, the error is the following:
.facebook.presto.operator.PageTransportTimeoutException: Encountered too many errors talking to a worker node. The node may have crashed or been under too much load. This is probably a transient issue, so please retry your query in a few minutes.
I rebooted all nodes and presto service but the error remains, this problem doesn't exist if I run the same queries over a smaller dataset.Can someone provide some help on this problem?
Thanks
3 possible causes for this kind of error. You may ssh into one of worker to find out what the problem is when the query is running.
High CPU
Tune down the task.concurrency to, for example, 8
High memory
In the jvm.config, -Xmx should no more than 80% total memory. In the config.properties, query.max-memory-per-node should be no more than the half of Xmx number.
Low open file limit
Set in the /etc/security/limits.conf a larger number for the Presto process. The default is definitely way too low.
It might be an issue for configuration. For example, if the local maximum memory is not set appropriately and the query use too much heap memory, full GC might happen to cause such errors. I would suggest to ask in the Presto Google Group and describe someway to reproduce the issue :)
I was running presto on Mac with 16GB of ram below is the configuration of java.config file.
-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p
I was getting following error even for running the Query
Select now();
Query 20200817_134204_00005_ud7tk failed: Encountered too many errors talking to a worker node. The node may have crashed or be under too much load. This is probably a transient issue, so please retry your query in a few minutes.
I changed my -Xmx16G value to -Xmx10G and it works fine.
I used following link to install presto on my system.
Link for Presto Installation
I deployed an ignite cluster in yarn. The cluster has 5 servers. Each server has 10GB memory and 8GB heap. I was trying to write a lot of data to ignite cache. Each item is an integer array whose length is 100K. The backups is 2. When I write 3980 items to the ignite cache, the cluster's heap is almost full. But instead of reject writing, the servers went down one by one.
My questions are:
Is there a configuration or way to control the cache ratio of servers, so the heap won't be full and servers won't go down?
Apparently, servers go down when write too much into cache seems not good for users. I'm wondering why ignite will let this happen, if user uses default configuration.
Apache Ignite, as well as Java Virtual Machine, is NOT responsible for managing or controlling a size of data sets that are placed into Java heap. This is the reason why OutOfMemoryError is presented in Java API because it's a responsibility of an application to handle its data sets and make sure that they fit into the heap.
You can set up eviction policy and Ignite may either move data to off-heap region or swap or completely remove from the memory.
Refer to my foreword above. This is the responsibility of an application. Ignite can assist here wit its eviction policy, off-heap mode and ability to scale out.
I am trying to load a dataset to GraphDB 7.0. I wrote a Python script to transform and load the data on Sublime Text 3. The program suddenly stopped working and closed, the computer threatened to restart but didn't, and I lost several hours worth of computing as GraphDB doesn't let me query the inserts. This is the error I get on GraphDB:
The currently selected repository cannot be used for queries due to an error:
org.openrdf.repository.RepositoryException: java.lang.RuntimeException: There is not enough memory for the entity pool to load: 65728645 bytes are required but there are 0 left. Maybe cache-memory/tuple-index-memory is too big.
I set the JVM as follows:
-Xms8g
-Xmx9g
I don't exactly remember what I set as the values for the cache and index memories. How do I resolve this issue?
For the record, the database I need to parse has about 300k records. The program shut shop at about 50k. What do I need to do to resolve this issue?
Open the workbench and check the amount of memory you have given to cache memory.
Xmx should be a value that is enough for
cache-memory + memory-for-queries + entity-pool-hash-memory
sadly the latter cannot be calculated easily because it depends on the number of entities in the repository. You will either have to:
Increase the java memory with a bigger value for Xmx
Decrease the value for cache memory