Get uWSGI busyness info in the Python layer - load-balancing

I am planning to write a load balancer on top of a uWSGI instance. I was hoping to use the cheaper busyness algorithm to get the load on the uWSGI instance.
The uWSGI log prints the busyness values from time to time. Is it possible to get this information in the Python layer?
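As far as I know, the busyness figures in the log come from the cheaper busyness algorithm and are not handed directly to application code. What you can do instead is enable the stats server (for example --stats 127.0.0.1:9191) and read its JSON, or call uwsgi.workers() from inside the app; both report how many workers are currently busy. Below is a minimal sketch against the stats server; the address and the "busy worker ratio" metric are assumptions for illustration, not the exact logged busyness value.

```python
# Minimal sketch: approximate "busyness" by reading the uWSGI stats server.
# Assumes uWSGI was started with something like: --stats 127.0.0.1:9191
# (exact JSON fields can vary between uWSGI versions).
import json
import socket

def read_stats(host="127.0.0.1", port=9191):
    """Read the raw JSON document that the uWSGI stats server emits."""
    chunks = []
    with socket.create_connection((host, port)) as s:
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return json.loads(b"".join(chunks))

def busyness_percent(stats):
    """Fraction of workers currently marked 'busy' (a rough load indicator)."""
    workers = stats.get("workers", [])
    if not workers:
        return 0.0
    busy = sum(1 for w in workers if w.get("status") == "busy")
    return 100.0 * busy / len(workers)

if __name__ == "__main__":
    print("busyness: %.1f%%" % busyness_percent(read_stats()))
```

From inside the application itself you can get similar per-worker data with import uwsgi; uwsgi.workers(), which avoids the extra socket round trip.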

Related

HTTPD response time is increasing after 90 TPS

I am doing a load test to tune my Apache httpd to serve the maximum number of concurrent HTTPS requests. Below are the details of my test.
System
I dockerized my httpd and deployed it in OpenShift; the pod configuration is 4 CPU, 8 GB RAM.
I am running load from JMeter with 200 threads, 600 s ramp-up time, infinite loop, and a long-run duration (JMeter is running in the same network on a VM with 16 CPU, 32 GB RAM).
I compiled httpd with the worker MPM and deployed it in OpenShift.
Issue
1. httpd is not scaling beyond 90 TPS, even after trying multiple mpm_worker configurations (no difference between the default and higher settings).
2. After 90 TPS, the average response time increases and TPS drops.
Please let me know what the issue could be; I can provide any further information required.
I don't have the answer, but I do have questions.
1/ What does your Dockerfile look like?
2/ What does your OpenShift cluster look like? How many nodes? Separate control plane and workers? What version?
2b/ Specifically, how is traffic entering the pod? (If you are going in via a route, you'll want to look at your load balancer; if you want to exclude OpenShift from the equation then, for the short term, expose a NodePort and have JMeter hit that directly.)
3/ Do I read correctly that your single pod was assigned an 8G RAM limit? Did you mean the worker node has 8G of RAM?
4/ How did you deploy the app -- raw pod, deployment config? Any cpu/memory limits set, or assumed? Assuming a deployment, how many pods does it spawn? What happens if you double it? Doubled TPS or not - that'll help point to whether the problem is inside httpd or inside the ingress route.
5/ What's the nature of the test request? Does it make use of any files stored on the network, or "local" files provisioned in a network PV?
And,
6/ What are you looking to achieve? Maximum concurrent requests in one container, or maximum requests in the cluster? If you've not already, look to divide and conquer -- more pods on more nodes.
Most likely you have run into a bottleneck/limitation at the system under test (SUT). See the following post for a detailed answer:
JMeter load is not increasing when we increase the threads count

Apache threads stay in state reading after queries

My configuration is Apache and Tomcat behind an AWS ELB. Apache is configured with no keepalive and MaxClients set to a low number, because each query is very CPU-intensive. I load test the machine with queries, and the number of available request slots goes to zero, as can be seen by curl -s localhost/server-status?auto not responding immediately. When I stop the load test, the scoreboard from curl -s localhost/server-status?auto is still full of R's, even though the Tomcat logs make it clear nothing is happening. Does anyone have an idea what the possible causes might be?
If your Apache shows 'R' in the status, it means there are open TCP connections from the ELB to Apache (just an open TCP connection, with no data sent yet).
There is no complete official documentation on this subject (how the number of pre-opened connections is optimized), but the Amazon documentation states (on this page: https://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/how-elastic-load-balancing-works.html ) that:
Classic Load Balancers use pre-open connections but Application Load Balancers do not.
So the answer is: it is an optimization from Amazon (opening a TCP connection has a small cost).
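If you want to watch this from the machine itself, you can poll the same server-status?auto output and count the scoreboard states over time; once the load test stops, the lingering 'R' slots correspond to those pre-opened ELB connections. A small sketch, assuming mod_status is reachable at the usual localhost URL:

```python
# Rough sketch: poll Apache mod_status (?auto) and count scoreboard states,
# so the number of 'R' (reading) slots can be watched over time.
# Assumes server-status is reachable at http://localhost/server-status?auto.
from collections import Counter
from urllib.request import urlopen

def scoreboard_counts(url="http://localhost/server-status?auto"):
    text = urlopen(url).read().decode()
    for line in text.splitlines():
        if line.startswith("Scoreboard:"):
            board = line.split(":", 1)[1].strip()
            return Counter(board)
    return Counter()

if __name__ == "__main__":
    counts = scoreboard_counts()
    # '_' = waiting, 'R' = reading request, 'W' = sending reply, '.' = open slot
    print("reading (R):", counts.get("R", 0))
    print("waiting (_):", counts.get("_", 0))
```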

Docker Swarm - load-balancing to closest node first

I'm trying to optimize Docker Swarm load-balancing so that it routes requests to services with the following priority:
Same machine
Same DC
Anywhere else.
Given the following setup:
DataCenter-I
  Server-I
    Nginx:80
  Server-II
    Nginx:80
    Worker
DataCenter-II
  Server-I
    Nginx:80
    Worker
In case DataCenter-I::Server-II::Worker issues an API request over port 80, the desired behavior is:
Check whether there are any tasks (containers) mapped to port 80 on the local server (DataCenter-I::Server-II)
Fall back and check in the local datacenter (i.e. DataCenter-I::Server-I)
Fall back and check in all clusters (i.e. DataCenter-II::Server-I)
This is very useful when using workers, where response time doesn't matter but bandwidth does.
Please advise,
Thanks!
According to this question I asked before, Docker Swarm currently only uses round-robin, with no indication yet that the behaviour will become pluggable.
However, Nginx Plus supports the least_time load-balancing method (and I think there will be a similar open-source module), which is close to what you need with perhaps the least effort.
PS: Don't run Nginx as a Docker Swarm service. Instead, run Nginx with regular docker or docker-compose in the same Docker network as your app. You don't want Docker Swarm load-balancing your load balancer.
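If you want to route around the swarm mesh yourself, one option is to do the selection client-side: on an overlay network, the tasks.<service-name> DNS name resolves to the individual task IPs, so a worker can pick the "closest" target before connecting. A rough sketch follows; the service name nginx and the "same subnet means same DC" heuristic are illustrative assumptions, not something Swarm provides out of the box.

```python
# Hedged sketch: client-side "closest task first" selection instead of the
# swarm routing mesh. Relies on tasks.<service> resolving to the individual
# task IPs on the overlay network. The service name "nginx" and the
# subnet-based locality heuristic are illustrative assumptions.
import ipaddress
import socket

LOCAL_IP = socket.gethostbyname(socket.gethostname())

def task_ips(service="nginx"):
    """Resolve all task IPs of a swarm service via its tasks.* DNS name."""
    return sorted({info[4][0] for info in socket.getaddrinfo(f"tasks.{service}", 80)})

def pick_closest(ips, local_dc_subnet="10.0.1.0/24"):
    """Prefer a task on this host, then one in the local DC subnet, then any."""
    dc_net = ipaddress.ip_network(local_dc_subnet)
    same_host = [ip for ip in ips if ip == LOCAL_IP]
    same_dc = [ip for ip in ips if ipaddress.ip_address(ip) in dc_net]
    for group in (same_host, same_dc, ips):
        if group:
            return group[0]
    return None

if __name__ == "__main__":
    print("would send request to:", pick_closest(task_ips()))
```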

How can I view the UI of Elastic Load Balancer 2.1.0?

Hi, I just downloaded Elastic Load Balancer 2.1.0 from WSO2. It is running in a terminal on Linux (Ubuntu), but it is not showing the management console URL. If it does not show a URL, where can I get the UI of the Elastic Load Balancer?
I have multiple ESB servers with the same configuration. If my a1 server goes down, the load should shift to my a2 server. Is this what Elastic Load Balancer is used for? Could you explain what exactly it is used for?
No, there is no UI component for ELB. Everything has to be done through configuring physical files.
Elastic Load Balancer 2.1.0 is based on Hazelcast-dependent clustering. It has two parts: one is load balancing and the other is elasticity. Load balancing is simply distributing the workload among a number of endpoints configured in a static or dynamic manner. Elasticity is simply scaling, i.e. monitoring the load on worker nodes and starting or terminating nodes based on need in an IaaS environment.
It not only manages the case when a node goes down; depending on the load, it can also spawn new nodes to handle it, and if the load is low it can kill unwanted instances in an IaaS environment.
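To make the elasticity part concrete, the decision logic boils down to something like the sketch below. The thresholds and the spawn/terminate actions are hypothetical placeholders for illustration, not WSO2's actual implementation.

```python
# Illustrative sketch of the elasticity idea described above: watch the load
# on worker nodes and scale the pool up or down. Thresholds and the implied
# spawn/terminate actions are hypothetical, not the WSO2 implementation.
def autoscale(nodes, avg_requests_in_flight,
              scale_up_at=50, scale_down_at=5,
              min_nodes=1, max_nodes=10):
    """Return 'scale-up', 'scale-down' or 'hold' for the current load."""
    if avg_requests_in_flight > scale_up_at and len(nodes) < max_nodes:
        return "scale-up"      # spawn a new worker instance on the IaaS
    if avg_requests_in_flight < scale_down_at and len(nodes) > min_nodes:
        return "scale-down"    # terminate an unneeded instance
    return "hold"

print(autoscale(nodes=["a1", "a2"], avg_requests_in_flight=70))  # -> scale-up
```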

WSGI / Apache clarification

I have a Pyramid application running on Apache with mod_wsgi.
What exactly is the lifecycle of my application when a request is made?
Does my application get created (which entails loading the configuration, creating the database engine) every time a request comes in? When using paste serve, this isn't the case. But with mod_wsgi - how does it work? When does the application "terminate"?
For a start, read:
http://blog.dscpl.com.au/2009/03/python-interpreter-is-not-created-for.html
http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
Initialisation is not done on a per-request basis. In general, the application should persist in memory between requests. In the case of embedded mode, you may be at the mercy of Apache as to when it recycles processes.
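To make that concrete, here is a minimal plain-WSGI sketch (not your Pyramid app) showing what persists: the module-level setup runs once per process when mod_wsgi loads the script, while only application() runs per request, until Apache recycles the process.

```python
# Minimal sketch of what "persists between requests" means under mod_wsgi:
# module-level setup runs once per process, not once per request.
import os
import time

# Runs when mod_wsgi first imports this file in a given daemon/embedded
# process -- e.g. where you would load configuration or create a DB engine.
STARTED_AT = time.time()
REQUEST_COUNT = 0

def application(environ, start_response):
    global REQUEST_COUNT
    REQUEST_COUNT += 1  # keeps incrementing until Apache recycles the process
    body = ("pid=%d started_at=%.0f requests_served=%d\n"
            % (os.getpid(), STARTED_AT, REQUEST_COUNT)).encode()
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Hitting this repeatedly shows requests_served climbing within the same pid, which is the persistence the answer describes; when Apache or mod_wsgi recycles the process, the counter starts over.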