Limit CPU usage when creating containers using the Docker API

I'm trying to create a container using the Docker API with a limit on CPU usage. I was able to find documentation regarding the --cpus=<value> option that you can pass when using Docker on the command line, but I'm not sure if this feature is available using the HTTP API.
Is there a way to pass this option using the HTTP API, or is there a comparable alternative for limiting CPU usage of a container? There are options that are documented for the API like CPUCount and CPUPercent that are marked "Windows Only", but I'm running Docker on Ubuntu.

A little late, but if someone is still looking for this: in the HTTP API you have access to CpuQuota and CpuPeriod under HostConfig, which you can use to limit the CPU.
For example, if you have only one CPU and want to limit usage to 50%, you can do it with the following options:
"HostConfig": {
"CpuPeriod": 100000,
"CpuQuota": 50000,
}
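For reference, here is a minimal sketch of passing those fields to the container-create endpoint over the local Docker socket; the image name, container name, and API version are placeholder assumptions, not taken from the question:
curl --unix-socket /var/run/docker.sock \
  -X POST "http://localhost/v1.40/containers/create?name=cpu-limited" \
  -H "Content-Type: application/json" \
  -d '{
        "Image": "ubuntu:20.04",
        "Cmd": ["sleep", "3600"],
        "HostConfig": { "CpuPeriod": 100000, "CpuQuota": 50000 }
      }'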

"The search engine appears to be down or failing to respond to the search query"

I've installed FusionAuth (awesome product) into a Docker Swarm cluster using the official docker-compose.yml file and everything seems to work brilliantly.
EXCEPT
Periodically, when a user goes to login they will be presented with the above error stating that the search engine is not available. If they try again immediately then everything works correctly! I would, obviously, prefer that they never saw the error.
Elasticsearch is definitely running and is responding to API calls correctly, and I can see the fusionauth_user index is present and populated with docs.
I guess my question is twofold:
1) What role does the ElasticSearch engine play in the FusionAuth ecosystem and can it be disabled?
2) Is there a configurable timeout somewhere that is causing the error message and, if so, where can I change it?
I've searched the docs for answers to the above but I can't seem to find anything :-(
Thanks for the kind feedback.
1) What role does the ElasticSearch engine play in the FusionAuth ecosystem and can it be disabled?
Elasticsearch provides full text search of user data. Each time a user is created or updated the user is re-indexed. In this case during login, we are updating the search index with the last login instant.
This service is required and cannot be disabled. We have had clients request to make this service optional for embedded applications or small scale scenarios where Elasticsearch may not be required. While this is not currently in plan, it is possible we may revisit this option in the future.
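As a side note, if you want to verify from your side that the index FusionAuth depends on is reachable and populated, a quick check against Elasticsearch could look like this (assuming Elasticsearch is reachable on the default port 9200 from wherever you run the command; adjust the host for your Swarm setup):
curl -s "http://localhost:9200/_cluster/health?pretty"
curl -s "http://localhost:9200/fusionauth_user/_count?pretty"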
2) Is there a configurable timeout somewhere that is causing the error message and, if so, where can I change it?
Not currently.
Full disclosure, I am not a Docker or Docker Swarm expert at all - perhaps there are some nuances to Swarm and response time due to spin up and spin down of resources?
Do you see any exceptions in the log when a user sees this error on the login?

How to set a specific port for single-user Jupyterhub server REST API calls?

I have set up Spark SQL on JupyterHub using the Apache Toree SQL kernel. I wrote a Python function to update Spark configuration options in the kernel.json file so that my team can change the configuration based on their queries and cluster configuration. But I have to shut down the running notebook and re-open it or restart the kernel after running the Python function. In this way, I'm forcing the Toree kernel to read the JSON file to pick up the new configuration.
I thought of implementing this shutdown and restart of the kernel programmatically. I found the JupyterHub REST API documentation and am able to implement it by invoking the related APIs. But the problem is that the single-user server port is set randomly by the JupyterHub Spawner object, and it keeps changing every time I spin up a cluster. I want this to be fixed before launching the JupyterHub service.
Here is a solution I tried based on Jupyterhub docs:
sudo echo "c.Spawner.port = 35289
c.Spawner.ip = '127.0.0.1'" >> /etc/jupyterhub/jupyterhub_config.py
But this did not work as the port was again set by the Spawner randomly. I think there is a way to fix this. Any help on this would be greatly appreciated. Thanks
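Not a definitive fix, but a couple of things worth checking, sketched below: sudo echo "..." >> file performs the redirection as your unprivileged shell, so the lines may never reach the config file, and c.Spawner.port is only honoured if JupyterHub actually loads that file (a fixed port also assumes only one single-user server runs per host). The file path and port below are the ones from the question:
sudo tee -a /etc/jupyterhub/jupyterhub_config.py > /dev/null <<'EOF'
c.Spawner.port = 35289
c.Spawner.ip = '127.0.0.1'
EOF
# Start (or restart) the hub with the config file made explicit.
sudo jupyterhub -f /etc/jupyterhub/jupyterhub_config.py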

how to configure hard memory limit for builds in drone.io

Perhaps I am missing it, but I see no method to control the hard memory limit for any given build (I have builds being killed because of it). Is the build memory limit based on the build params supplied by the client (this means a single client can bring down everything), or is there someplace I can configure the service to only allow 512 MB (for example) per build?
You can limit the maximum amount of memory per container by setting the global DRONE_LIMIT_MEM environment variable on the server. This should be set to the amount of memory in bytes, for example:
DRONE_LIMIT_MEM_SWAP=512000000
DRONE_LIMIT_MEM=512000000
These limits are passed to Docker when Drone starts a container [1]. It is equivalent to the following Docker command:
docker run --memory=512000000 <image>
[1] https://docs.docker.com/config/containers/resource_constraints/#limit-a-containers-access-to-memory
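For completeness, a minimal sketch of where these variables are set, assuming the server runs as the drone/drone container (the container name is a placeholder, and the rest of the usual server configuration, such as ports, volumes, and secrets, is omitted):
docker run -d \
  --name drone-server \
  -e DRONE_LIMIT_MEM=512000000 \
  -e DRONE_LIMIT_MEM_SWAP=512000000 \
  drone/drone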

How can I get CPU usage of a VM (KVM)

How can I get the CPU usage of a VM in KVM, like virt-manager shows?
(screenshot: virt-manager monitoring VM CPU usage)
Libvirt doesn't seem to provide an API for this.
Does anyone know how to get VM CPU usage from the host?
If you have command-line access to the server, and you have the virsh command, you can use it to get stats.
There are several dom* subcommands that give you access to different things:
domifstat domain interface-device
    Get network interface stats for a running domain.
dommemstat domain [--period seconds] [[--config] [--live] | [--current]]
    Get memory stats for a running domain.
domstats [--raw] [--enforce] [--backing] [--state] [--cpu-total] [--balloon] [--vcpu] [--interface] [--block] [[--list-active] [--list-inactive] [--list-persistent] [--list-transient] [--list-running] [--list-paused] [--list-shutoff] [--list-other]] | [domain ...]
    Get statistics for multiple or all domains. Without any argument this command prints all available statistics for all domains.
So you might:
# virsh domstats --cpu-total server1
Domain: 'server1'
cpu.time=144940157444
cpu.user=65260000000
cpu.system=14450000000
By polling that you can get the data you want.
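For example, a rough polling sketch: take two cpu.time samples and turn the delta into a percentage. The domain name and interval are placeholders; cpu.time is cumulative CPU time in nanoseconds, so the result can exceed 100% for multi-vCPU guests:
DOMAIN="server1"   # placeholder domain name
INTERVAL=5         # seconds between samples
t1=$(virsh domstats --cpu-total "$DOMAIN" | awk -F= '/cpu\.time=/ {print $2}')
sleep "$INTERVAL"
t2=$(virsh domstats --cpu-total "$DOMAIN" | awk -F= '/cpu\.time=/ {print $2}')
# Convert the nanosecond delta over the interval into a CPU percentage.
echo "scale=2; ($t2 - $t1) * 100 / ($INTERVAL * 1000000000)" | bc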
Read the man page on virsh for more details.
Edit: note that virsh is just a thin wrapper around the libvirt API, and this data is available via API calls as well.
If you use C or C++ you can look into virDomainGetCPUStats in the libvirt C API. However, if you use Java, you won't have much luck.

How to fetch Spark Streaming job statistics using REST calls when running in yarn-cluster mode

I have a Spark Streaming program running on a YARN cluster in "yarn-cluster" mode (--master yarn-cluster).
I want to fetch spark job statistics using REST APIs in json format.
I am able to fetch basic statistics using a REST URL call:
http://yarn-cluster:8088/proxy/application_1446697245218_0091/metrics/json. But this gives only very basic statistics.
However, I want to fetch per-executor or per-RDD statistics.
How do I do that using REST calls, and where can I find the exact REST URL to get these statistics?
The $SPARK_HOME/conf/metrics.properties file sheds some light regarding URLs, i.e.
5. MetricsServlet is added by default as a sink in master, worker and client driver, you can send http request "/metrics/json" to get a snapshot of all the registered metrics in json format. For master, requests "/metrics/master/json" and "/metrics/applications/json" can be sent seperately to get metrics snapshot of instance master and applications. MetricsServlet may not be configured by self.
but those return HTML pages, not JSON. Only "/metrics/json" fetches stats in JSON format.
On top of that, knowing the application_id programmatically is a challenge in itself when running in yarn-cluster mode.
I checked the REST API section of the Spark Monitoring page, but that didn't work when running the Spark job in yarn-cluster mode. Any pointers/answers are welcome.
You should be able to access the Spark REST API using:
http://yarn-cluster:8088/proxy/application_1446697245218_0091/api/v1/applications/
From here you can select the app-id from the list and then use the following endpoint to get information about executors, for example:
http://yarn-cluster:8088/proxy/application_1446697245218_0091/api/v1/applications/{app-id}/executors
I verified this with my Spark Streaming application running in yarn-cluster mode.
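For reference, the same executors endpoint can be queried with curl; the host and application id below are the placeholders from the URLs above:
APP_ID="application_1446697245218_0091"
curl -s "http://yarn-cluster:8088/proxy/${APP_ID}/api/v1/applications/${APP_ID}/executors"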
I'll explain how I arrived at the JSON response using a web browser. (This is for a Spark 1.5.2 streaming application in yarn-cluster mode).
First, use the Hadoop URL to view the RUNNING applications: http://{yarn-cluster}:8088/cluster/apps/RUNNING.
Next, select a running application, say http://{yarn-cluster}:8088/cluster/app/application_1450927949656_0021.
Next, click on the TrackingUrl link. This uses a proxy, and the port is different in my case: http://{yarn-proxy}:20888/proxy/application_1450927949656_0021/. This shows the Spark UI. Now, append api/v1/applications to this URL: http://{yarn-proxy}:20888/proxy/application_1450927949656_0021/api/v1/applications.
You should see a JSON response with the application name supplied to SparkConf and the start time of the application.
I was able to reconstruct the metrics in the columns seen in the Spark Streaming web UI (batch start time, processing delay, scheduling delay) using the /jobs/ endpoint.
The script I used is available here. I wrote a short post describing it and tying its functionality back to the Spark codebase. This does not need any web-scraping.
It works for Spark 2.0.0 and YARN 2.7.2, but may work for other version combinations too.
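For reference, the /jobs/ endpoint can be queried the same way over the YARN proxy; the proxy host, port, and application id below are placeholders taken from the walkthrough above:
PROXY_HOST="yarn-proxy.example.com:20888"   # assumption: your YARN proxy host and port
APP_ID="application_1450927949656_0021"
curl -s "http://${PROXY_HOST}/proxy/${APP_ID}/api/v1/applications/${APP_ID}/jobs"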
You'll need to scrape through the HTML page to get the relevant metrics. There isn't a Spark REST endpoint for capturing this info.