Google Compute Engine: Load balacing(cross-region) only when health-check fails - load-balancing

According to the documentation using "Cross Region load balancing" requests are routed to the closest region automatically.
Is there any way to disable this? I would like to switch over other region only when the health-check fails.

Related

"The search engine appears to be down or failing to respond to the search query"

I've installed FusionAuth (awesome product) into a Docker Swarm cluster using the official docker-compose.yml file and everything seems to work brilliantly.
EXCEPT
Periodically, when a user goes to login they will be presented with the above error stating that the search engine is not available. If they try again immediately then everything works correctly! I would, obviously, prefer that they never saw the error.
Elasticsearch is definitely running and is responding to API calls correctly, and I can see the fusionauth_user index is present and populated with docs.
I guess my question is two fold:
1) What role does the ElasticSearch engine play in the FusionAuth ecosystem and can it be disabled?
2) Is there a configurable timeout somewhere that is causing the error message and, if so, where can change it?
I've search the docs for answers to the above but I can't seem to find anything :-(
Thanks for the kind feedback.
1) What role does the ElasticSearch engine play in the FusionAuth ecosystem and can it be disabled?
Elasticsearch provides full text search of user data. Each time a user is created or updated the user is re-indexed. In this case during login, we are updating the search index with the last login instant.
This service is required and cannot be disabled. We have had clients request to make this service optional for embedded applications or small scale scenarios where Elasticsearch may not be required. While this is not currently in plan, it is possible we may revisit this option in the future.
2) Is there a configurable timeout somewhere that is causing the error message and, if so, where can change it?
Not currently.
Full disclosure, I am not a Docker or Docker Swarm expert at all - perhaps there are some nuances to Swarm and response time due to spin up and spin down of resources?
Do you see any exceptions in the log when a user sees this error on the login?

Disabling reallocation of a resource with Apache Helix 0.7.1

My use case is to disable the auto reallocation by Helix to a new node temporarily.
I tried using the
ZKHelixAdmin.enableResource()
API. I see a change in the ideal state[
{ "HELIX_ENABLED" : "false",...}
] in the Exhibitor but the same resource is getting reallocated to a live instance.
What is the functionality of enableResource() API?The API doc doesn't have much info.
The balancer checks if the resource is enabled and only if its enabled it will allocate it to another live instance.
In my case, I was using a custom re balancer[was using the USER_DEFINED rebalancer algorithm] and had to add this check explicitly in my custom re balancer.
Works like a charm.

Google Cloud Dataflow error: "The Application Default Credentials are not available"

We have a Google Cloud Dataflow job, which writes to Bigtable (via HBase API). Unfortunately, it fails due to:
java.io.IOException: The Application Default Credentials are not available. They are available if running in Google Compute Engine. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information. at com.google.bigtable.repackaged.com.google.auth.oauth2.DefaultCredentialsProvider.getDefaultCredentials(DefaultCredentialsProvider.java:74) at com.google.bigtable.repackaged.com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(GoogleCredentials.java:54) at com.google.bigtable.repackaged.com.google.cloud.config.CredentialFactory.getApplicationDefaultCredential(CredentialFactory.java:181) at com.google.bigtable.repackaged.com.google.cloud.config.CredentialFactory.getCredentials(CredentialFactory.java:100) at com.google.bigtable.repackaged.com.google.cloud.grpc.io.CredentialInterceptorCache.getCredentialsInterceptor(CredentialInterceptorCache.java:85) at com.google.bigtable.repackaged.com.google.cloud.grpc.BigtableSession.<init>(BigtableSession.java:257) at org.apache.hadoop.hbase.client.AbstractBigtableConnection.<init>(AbstractBigtableConnection.java:123) at org.apache.hadoop.hbase.client.AbstractBigtableConnection.<init>(AbstractBigtableConnection.java:91) at com.google.cloud.bigtable.hbase1_0.BigtableConnection.<init>(BigtableConnection.java:33) at com.google.cloud.bigtable.dataflow.CloudBigtableConnectionPool$1.<init>(CloudBigtableConnectionPool.java:72) at com.google.cloud.bigtable.dataflow.CloudBigtableConnectionPool.createConnection(CloudBigtableConnectionPool.java:72) at com.google.cloud.bigtable.dataflow.CloudBigtableConnectionPool.getConnection(CloudBigtableConnectionPool.java:64) at com.google.cloud.bigtable.dataflow.CloudBigtableConnectionPool.getConnection(CloudBigtableConnectionPool.java:57) at com.google.cloud.bigtable.dataflow.AbstractCloudBigtableTableDoFn.getConnection(AbstractCloudBigtableTableDoFn.java:96) at com.google.cloud.bigtable.dataflow.CloudBigtableIO$CloudBigtableSingleTableBufferedWriteFn.getBufferedMutator(CloudBigtableIO.java:836) at com.google.cloud.bigtable.dataflow.CloudBigtableIO$CloudBigtableSingleTableBufferedWriteFn.processElement(CloudBigtableIO.java:861)
Which makes very little sense, because the job is already running on Cloud Dataflow service/VMs.
The Cloud Dataflow job id: 2016-05-13_11_11_57-8485496303848899541
We are using bigtable-hbase-dataflow version 0.3.0, and we want to use HBase API.
I believe this is a known issue where GCE instances are very rarely unable to get the default credentials during startup.
We have been working on a fix which should be part of the next release (1.6.0) which should be coming soon. In the meantime we'd suggest re-submitting the job which should work. If you run into problems consistently or want to discuss other workarounds (such as backporting the 1.6.0 fix) please reach out to us.
1.7.0 is released so this should be fixed now https://cloud.google.com/dataflow/release-notes/release-notes-java

How can I get CPU usage of VM(KVM)

How can I get the cpu usage of vm in KVM like virt-manager?
virt-manager monitoring vm cpu usage
Libvirt didn't provide API.
Does anyone know how to get vm cpu usage from host?
If you have command-line access to the server, and you have the virsh command, you can use it to get stats.
There are several dom* subcommands that give you access to different things:
domifstat domain interface-device
Get network interface stats for a running domain.
dommemstat domain [--period seconds] [[--config] [--live] | [--current]]
Get memory stats for a running domain.
domstats [--raw] [--enforce] [--backing] [--state] [--cpu-total] [--balloon] [--vcpu] [--interface] [--block]
[[--list-active] [--list-inactive] [--list-persistent] [--list-transient] [--list-running] [--list-paused]
[--list-shutoff] [--list-other]] | [domain ...]
Get statistics for multiple or all domains. Without any argument this command prints all available statistics for
all domains.
So you might:
#virsh domstats --cpu-total server1
Domain: 'server1'
cpu.time=144940157444
cpu.user=65260000000
cpu.system=14450000000
By polling that you can get the data you want.
Read the man page on virsh for more details.
edit: note that virsh is just a thin wrapper around the libvirt api - and this data is available via api calls also
If you use c or c++ you can try to look into virDomainGetCPUStats that is based on C API. However if you use Java, you won't get much luck.

How to fetch Spark Streaming job statistics using REST calls when running in yarn-cluster mode

I have a spark streaming program running on Yarn Cluster in "yarn-cluster" mode. (-master yarn-cluster).
I want to fetch spark job statistics using REST APIs in json format.
I am able to fetch basic statistics using REST url call:
http://yarn-cluster:8088/proxy/application_1446697245218_0091/metrics/json. But this is giving very basic statistics.
However I want to fetch per executor or per RDD based statistics.
How to do that using REST calls and where I can find the exact REST url to get these statistics.
Though $SPARK_HOME/conf/metrics.properties file sheds some light regarding urls i.e.
5. MetricsServlet is added by default as a sink in master, worker and client driver, you can send http request "/metrics/json" to get a snapshot of all the registered metrics in json format. For master, requests "/metrics/master/json" and "/metrics/applications/json" can be sent seperately to get metrics snapshot of instance master and applications. MetricsServlet may not be configured by self.
but that is fetching html pages not json. Only "/metrics/json" fetches stats in json format.
On top of that knowing application_id pro-grammatically is a challenge in itself when running in yarn-cluster mode.
I checked REST API section of Spark Monitoring page, but that didn't worked when we run spark job in yarn-cluster mode. Any pointers/answers are welcomed.
You should be able to access the Spark REST API using:
http://yarn-cluster:8088/proxy/application_1446697245218_0091/api/v1/applications/
From here you can select the app-id from the list and then use the following endpoint to get information about executors, for example:
http://yarn-cluster:8088/proxy/application_1446697245218_0091/api/v1/applications/{app-id}/executors
I verified this with my spark streaming application that is running in yarn cluster mode.
I'll explain how I arrived at the JSON response using a web browser. (This is for a Spark 1.5.2 streaming application in yarn-cluster mode).
First, use the hadoop url to view the RUNNING applications. http://{yarn-cluster}:8088/cluster/apps/RUNNING.
Next, select a running application, say http://{yarn-cluster}:8088/cluster/app/application_1450927949656_0021.
Next, click on the TrackingUrl link. This uses a proxy and the port is different in my case: http://{yarn-proxy}l:20888/proxy/application_1450927949656_0021/. This shows the spark UI. Now, append the api/v1/applications to this URL: http://{yarn-proxy}l:20888/proxy/application_1450927949656_0021/api/v1/applications.
You should see a JSON response with the application name supplied to SparkConf and the start time of the application.
I was able to reconstruct the metrics in the columns seen in the Spark Streaming web UI (batch start time, processing delay, scheduling delay) using the /jobs/ endpoint.
The script I used is available here. I wrote a short post describing and tying its functionality back to the Spark codebase. This does not need any web-scraping.
It works for Spark 2.0.0 and YARN 2.7.2, but may work for other version combinations too.
You'll need to scrape through the HTML page to get the relevant metrics. There isn't a Spark rest endpoint for capturing this info.