Maximum connection pool size in Gremlin Server - TinkerPop

Does anyone know the maximum number of connections that can be simultaneously open in a connection pool in Gremlin Server, for version 3.3 and for the latest version respectively?

I'm not sure there is an absolute answer to this question. The "maximum size" is dependent on the available resources allotted to Gremlin Server and the workload being sent to it for processing. Some tuning options are described in the Reference Documentation, but you will need to run tests to find out what works best in your situation and what maximum setting yields the best performance for you.
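For example (the values below are placeholders, not recommendations), two of the server-side settings covered there live in gremlin-server.yaml, and on the driver side the Java client sizes its per-host pool through Cluster.build().maxConnectionPoolSize(...):

threadPoolWorker: 2
gremlinPool: 8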

Related

Matillion: How to identify performance bottleneck

We're running Matillion (v1.54) on an AWS EC2 instance (CentOS), based on Tomcat 8.5.
We have developed a few ETL jobs by now, and their execution takes quite a lot of time (that is, up to hours). We'd like to speed up the execution of our jobs, and I wonder how to identify the bottleneck.
What confuses me is that both the m5.2xlarge EC2 instance (8 vCPU, 32G RAM) and the database (Snowflake) don't get very busy and seem to be sort of idle most of the time (regarding CPU and RAM usage as shown by top).
Our environment is configured to use up to 16 parallel connections.
We also added JVM options -Xms20g -Xmx30g to /etc/sysconfig/tomcat8 to make sure the JVM gets enough RAM allocated.
Our Matillion jobs do transformations and loads into a lot of tables, most of which can (and should) be done in parallel. Still, we see that most of the tasks are processed in sequence.
How can we enhance this?
By default there is only one JDBC connection to Snowflake, so your transformation jobs might be getting forced serial for that reason.
You could try bumping up the number of concurrent connections under the Edit Environment dialog (the original answer included a screenshot of that dialog).
There is more information here about concurrent connections.
If you do that, a couple of things to avoid are:
Transactions (begin, commit etc.) will force transformation jobs to run in serial again.
If you have a parameterized transformation job, only one instance of it can ever be running at a time. More information on that subject is here.
Because the Matillion server is just generating SQL statements and running them in Snowflake, it is not likely to be the bottleneck. You should make sure that your orchestration jobs are submitting everything to Snowflake at the same time and that there are no dependencies (unless required) built into your flow.
The original answer included screenshots of an orchestration job showing which steps are done in sequence and which are done in parallel (the parallel steps depend on the Snowflake warehouse size to scale).
Also - try the Alter Warehouse component with a higher concurrency level (a sketch of the underlying SQL is below).
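For illustration only - the component presumably issues something along these lines, where the warehouse name and the value are placeholders and MAX_CONCURRENCY_LEVEL is the parameter being raised:

ALTER WAREHOUSE my_wh SET MAX_CONCURRENCY_LEVEL = 16;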

Presto Nodes with too much load

I'm performing some queries over a tpch 100gb dataset on presto, I have 4 nodes, 1 master, 3 workers. When I try to run some queries, not all of them, I see on Presto web interface that the nodes die during the execution, resulting in query failure, the error is the following:
com.facebook.presto.operator.PageTransportTimeoutException: Encountered too many errors talking to a worker node. The node may have crashed or been under too much load. This is probably a transient issue, so please retry your query in a few minutes.
I rebooted all nodes and the Presto service, but the error remains. This problem doesn't exist if I run the same queries over a smaller dataset. Can someone provide some help on this problem?
Thanks
There are 3 possible causes for this kind of error. You may SSH into one of the workers to find out which one it is while the query is running; illustrative settings for all three are sketched after this list.
High CPU
Tune down task.concurrency to, for example, 8.
High memory
In jvm.config, -Xmx should be no more than 80% of total memory. In config.properties, query.max-memory-per-node should be no more than half of the -Xmx value.
Low open file limit
Set a larger limit for the Presto process in /etc/security/limits.conf. The default is definitely way too low.
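As a sketch only, with placeholder numbers for a hypothetical 64 GB worker whose OS user is assumed to be presto (adjust to your own hardware and user):

In config.properties:
task.concurrency=8
query.max-memory-per-node=24GB

In jvm.config:
-Xmx48G

In /etc/security/limits.conf:
presto soft nofile 131072
presto hard nofile 131072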
It might be a configuration issue. For example, if the local maximum memory is not set appropriately and the query uses too much heap memory, a full GC might happen and cause such errors. I would suggest asking in the Presto Google Group and describing some way to reproduce the issue :)
I was running Presto on a Mac with 16 GB of RAM; below is the configuration of my jvm.config file.
-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p
I was getting the following error even when running the query
Select now();
Query 20200817_134204_00005_ud7tk failed: Encountered too many errors talking to a worker node. The node may have crashed or be under too much load. This is probably a transient issue, so please retry your query in a few minutes.
I changed my -Xmx16G value to -Xmx10G and it worked fine.
I used the following link to install Presto on my system.
Link for Presto Installation

Running out of connections with a WCF service in Azure with SQL Azure

We have a multi-instance WCF service (more than 2 instances) which receives requests from Service Bus topics (there can be more than 10000 requests in a subscription).
The nature of the requests is that we mainly do inserts into our database, with very minimal processing. Our database is a P1 in SQL Azure.
After some time, we keep running out of connections and receive timeouts. I have increased the pool size to 1000 and the connection timeout to 120 secs. We have checked, and connection pools are definitely getting disposed of correctly.
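For reference, those settings correspond to the standard ADO.NET connection-string keywords; a sketch with placeholder server, database, and credentials:

Server=tcp:myserver.database.windows.net,1433;Database=mydb;User ID=myuser;Password=...;Max Pool Size=1000;Connect Timeout=120;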
Any Idea where we should start digging?
Thanks
The higher latencies and the resulting timeouts could be due to reaching the max write capacity of the database.
You can check if this is the case by querying the view sys.dm_db_resource_stats in the database. It shows the resource utilization in percent for the last hour.
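For example, a quick check could look like this (end_time, avg_cpu_percent, avg_data_io_percent and avg_log_write_percent are documented columns of that view):

SELECT TOP 20 end_time, avg_cpu_percent, avg_data_io_percent, avg_log_write_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;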
If you indeed reach the log write limits, you should consider upgrading your server to the latest service version (V12), which will give you higher log write rates. If you are already running V12, you may want to consider upgrading to P2.

Multi-threaded performance testing MS SQL server DB

Let's assume the following situation:
I have a database server that uses 4 core CPU;
My machine has 2 core CPU;
Assume they are of equal speed in terms of GHZ;
Systems are connected over a network (two lines 200mb/s each);
The test tool that I use provides a number-of-threads parameter and will issue commands to the server in parallel.
QUESTIONS:
How would you test parallel reads/writes via stored procedures? Please brainstorm, as any advice is appreciated;
How can I prove that many threads are executing the queries on the server (or should I not pay attention to this, as this is the server's and the DB's responsibility)?
What controls how many threads are executed at any time, primarily in the case of SQL Server? I checked the "server properties" > processors > # of processors and threads section - what more should I check?
How can I check that my application truly executes on all my machine's cores - in other words, uses real threads instead of virtual ones? Or should I pay attention only to the virtual ones?
Should I pay attention to the network bandwidth? Can it be a bottleneck? (I don't send any big data, only commands with variables.)
1.) Not sure - perhaps someone else can answer.
2.) SQL Sentry allows you to monitor your SQL activity (use the free trial and buy it if you like it).
3.) MAXDOP controls the number of processors used, and the cost threshold for parallelism also affects parallelism (see the sketch after this list).
4.) Same as 2, perhaps - I'm not sure I understand the question.
5.) Depends on what you are doing and where you see a problem; SQL Sentry will show wait stats that may help.
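As a sketch of the server-wide settings from point 3 (the values here are illustrative only; the options take effect after RECONFIGURE):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;
EXEC sp_configure 'cost threshold for parallelism', 25;
RECONFIGURE;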

Where to modify threadcount in Weblogic 10.0 MP2

I need to modify the number of threads available in my Weblogic 10.0 MP2 environment for some perf benchmarking, but I cannot seem to find where exactly that option lies.
Can anyone share this info please? thank you.
Weblogic 10 does not use execute thread queues like in previous versions (i.e. Weblogic 8.1 and older)
This concept is now replaced with Work Managers.
These are self-tuned, i.e. WLS will auto-tune the number of threads every 2 seconds based on how it sees the need to increase threads for the application load.
You can confirm this from the console; it will show the increasing number of execute threads as the load increases.
You can use work managers and constraints to make sure your applications get certain criteria met, such as certain web apps or EJBs getting a higher share of threads and so on (a rough weblogic.xml sketch follows the links below).
For a quick read see http://www.oracle.com/technetwork/articles/entarch/workload-management-088692.html
and
http://m-button.blogspot.com/2009/02/tuning-default-workmanager-on-weblogic.html
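As a rough sketch only (element names and placement should be verified against the 10.0 MP2 deployment descriptor schema; names and count are placeholders), a work manager with a max threads constraint can be declared in weblogic.xml like this and then referenced from the servlet's wl-dispatch-policy:

<work-manager>
  <name>BenchmarkWM</name>
  <max-threads-constraint>
    <name>BenchmarkMaxThreads</name>
    <count>50</count>
  </max-threads-constraint>
</work-manager>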
Secondly, are you running in dev mode or production mode?
If dev mode, you can try this cmd line parameter
-Dweblogic.threadpool.MinPoolSize=100
but I am not sure if it will work, so it's better to leave it to the Work Managers.