Apache Beam on Cloud Dataflow - Failed to query cAdvisor

I have a Cloud Dataflow job that reads from Pub/Sub and pushes data out to BQ. Recently the job has been reporting the error below and not writing any data to BQ.
{
  insertId: "3878608796276796502:822931:0:1075"
  jsonPayload: {
    line: "work_service_client.cc:490"
    message: "gcpnoelevationcall-01211413-b90e-harness-n1wd Failed to query CAdvisor at URL=<IPAddress>:<PORT>/api/v2.0/stats?count=1, error: INTERNAL: Couldn't connect to server"
    thread: "231"
  }
  labels: {
    compute.googleapis.com/resource_id: "3878608796276796502"
    compute.googleapis.com/resource_name: "gcpnoelevationcall-01211413-b90e-harness-n1wd"
    compute.googleapis.com/resource_type: "instance"
    dataflow.googleapis.com/job_id: "2018-01-21_14_13_45"
    dataflow.googleapis.com/job_name: "gcpnoelevationcall"
    dataflow.googleapis.com/region: "global"
  }
  logName: "projects/poc/logs/dataflow.googleapis.com%2Fshuffler"
  receiveTimestamp: "2018-01-21T22:41:40.053806623Z"
  resource: {
    labels: {
      job_id: "2018-01-21_14_13_45"
      job_name: "gcpnoelevationcall"
      project_id: "poc"
      region: "global"
      step_id: ""
    }
    type: "dataflow_step"
  }
  severity: "ERROR"
  timestamp: "2018-01-21T22:41:39.524005Z"
}
Any ideas on how I could fix this? Has anyone faced a similar issue before?

If this only happened once, it could be attributed to a transient issue. The process running on the worker node can't reach cAdvisor: either the cAdvisor container is not running, or there is a temporary problem on the worker that prevents it from contacting cAdvisor, and the job gets stuck.

Related

Unable to run relay-compiler

I installed Relay for my React Native project following the steps on relay.dev. Running the compiler works fine when I have an empty schema file, but adding a schema to the file starts throwing this error:
thread 'main' panicked at 'Expect GraphQLAsts to exist.', /home/runner/work/relay/relay/compiler/crates/relay-compiler/src/compiler.rs:335:14
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
My schema file is
// relay_schema.graphql
type Query {
  tasks: [TaskNode]
}
type TaskNode {
  id: ID!
}
and my relay config is:
module.exports = {
  // ...
  // Configuration options accepted by the `relay-compiler` command-line tool and `babel-plugin-relay`.
  src: './src/',
  language: 'flow',
  schema: './relay_schema.graphql',
  exclude: ['**/node_modules/**', '**/__mocks__/**', '**/__generated__/**'],
};
I'm completely lost on what to do here.

It started working after updating src to src: './'. It's worth double-checking paths such as src and schema; the error messages shown by the compiler are not much help when you're stuck here.
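For reference, here is a sketch of the relay.config.js with the adjusted src, based on the config from the question; adjust the paths to wherever your code and schema actually live:
// relay.config.js (sketch with the adjusted src path)
module.exports = {
  src: './',                        // was './src/'; point this at the directory that contains your app code
  language: 'flow',
  schema: './relay_schema.graphql', // relative to the project root
  exclude: ['**/node_modules/**', '**/__mocks__/**', '**/__generated__/**'],
};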

How to run a Lambda Docker with serverless offline

I would like to run serverless offline using a Lambda function that points to a Docker image.
When I try to run serverless offline, I am just receiving:
Offline [http for lambda] listening on http://localhost:3002
Function names exposed for local invocation by aws-sdk:
* hello-function: sample-app3-dev-hello-function
If I try to access http://localhost:3002/hello, a 404 error is returned
serverless.yml
service: sample-app3
frameworkVersion: '3'
plugins:
  - serverless-offline
provider:
  name: aws
  ecr:
    images:
      sampleapp3image:
        path: ./app/
        platform: linux/amd64
functions:
  hello-function:
    image:
      name: sampleapp3image
    events:
      - httpApi:
          path: /hello
          method: GET
app/myfunction.py
def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'body': 'Hello World!'
    }
app/Dockerfile
FROM public.ecr.aws/lambda/python:3.9
COPY myfunction.py ./
CMD ["myfunction.lambda_handler"]

At the moment such functionality is not supported by the serverless-offline plugin. There's an open issue where discussion has started around supporting this use case: https://github.com/dherault/serverless-offline/issues/1324

'serverless invoke -f hello' gives KeyError

I am following a tutorial in order to learn how to work with the Serverless Framework. The goal is to deploy a Django application. The tutorial suggests putting the necessary environment variables in a separate yml file. Unfortunately, following the tutorial gets me a KeyError.
I have a serverless.yml, a variables.yml and a handler.py. I will insert all the code below, together with the resulting error.
serverless.yml:
service: serverless-django
custom: ${file(./variables.yml)}
provider:
  name: aws
  runtime: python3.8
functions:
  hello:
    environment:
      - THE_ANSWER: ${self:custom.THE_ANSWER}
    handler: handler.hello
variables.yml:
THE_ANSWER: 42
handler.py:
import os
def hello(event, context):
    return {
        "statusCode": 200,
        "body": "The answer is: " + os.environ["THE_ANSWER"]
    }
The error in my terminal:
{
  "errorMessage": "'THE_ANSWER'",
  "errorType": "KeyError",
  "stackTrace": [
    " File \"/var/task/handler.py\", line 7, in hello\n \"body\": \"The answer is: \" + os.environ[\"THE_ANSWER\"]\n",
    " File \"/var/lang/lib/python3.8/os.py\", line 675, in __getitem__\n raise KeyError(key) from None\n"
  ]
}
Error --------------------------------------------------
Error: Invoked function failed
at AwsInvoke.log (/snapshot/serverless/lib/plugins/aws/invoke/index.js:105:31)
at AwsInvoke.tryCatcher (/snapshot/serverless/node_modules/bluebird/js/release/util.js:16:23)
at Promise._settlePromiseFromHandler (/snapshot/serverless/node_modules/bluebird/js/release/promise.js:547:31)
at Promise._settlePromise (/snapshot/serverless/node_modules/bluebird/js/release/promise.js:604:18)
at Promise._settlePromise0 (/snapshot/serverless/node_modules/bluebird/js/release/promise.js:649:10)
at Promise._settlePromises (/snapshot/serverless/node_modules/bluebird/js/release/promise.js:729:18)
at _drainQueueStep (/snapshot/serverless/node_modules/bluebird/js/release/async.js:93:12)
at _drainQueue (/snapshot/serverless/node_modules/bluebird/js/release/async.js:86:9)
at Async._drainQueues (/snapshot/serverless/node_modules/bluebird/js/release/async.js:102:5)
at Immediate._onImmediate (/snapshot/serverless/node_modules/bluebird/js/release/async.js:15:14)
at processImmediate (internal/timers.js:456:21)
at process.topLevelDomainCallback (domain.js:137:15)
For debugging logs, run again after setting the "SLS_DEBUG=*" environment variable.
Get Support --------------------------------------------
Docs: docs.serverless.com
Bugs: github.com/serverless/serverless/issues
Issues: forum.serverless.com
Your Environment Information ---------------------------
Operating System: linux
Node Version: 12.18.1
Framework Version: 2.0.0 (standalone)
Plugin Version: 4.0.2
SDK Version: 2.3.1
Components Version: 3.1.2
The command I'm trying is 'sls invoke -f hello'. The command 'sls deploy' has already been executed successfully.
I am new to serverless, so please let me know how to fix this, or if any more information is needed.

First of all, there is an error in the yml script:
Serverless Error ---------------------------------------
Invalid characters in environment variable 0
The error comes from defining the environment variables as an array instead of as key-value pairs.
After fixing that and redeploying (sls deploy -v), everything works smoothly:
sls invoke -f hello
{
  "statusCode": 200,
  "body": "The answer is: 42"
}
serverless.yml
service: sls-example
custom: ${file(./variables.yml)}
provider:
  name: aws
  runtime: python3.8
functions:
  hello:
    environment:
      THE_ANSWER: ${self:custom.THE_ANSWER}
    handler: handler.hello
variables.yml
THE_ANSWER: 42

Databricks spark_jar_task failed when submitted via API

I am using the API to submit a sample spark_jar_task.
My sample spark_jar_task request to calculate Pi:
"libraries": [
{
"jar": "dbfs:/mnt/test-prd-foundational-projects1/spark-examples_2.11-2.4.5.jar"
}
],
"spark_jar_task": {
"main_class_name": "org.apache.spark.examples.SparkPi"
}
Databricks stdout logs, where it prints the Pi value as expected:
....
(This session will block until Rserve is shut down) Spark package found in SPARK_HOME: /databricks/spark DATABRICKS_STDOUT_END-19fc0fbc-b643-4801-b87c-9d22b9e01cd2-1589148096455
Executing command, time = 1589148103046.
Executing command, time = 1589148115170.
Pi is roughly 3.1370956854784273
Heap
.....
Although the spark_jar_task prints the Pi value in the log, the job terminated with a failed status without stating the error. Below is the response of the API /api/2.0/jobs/runs/list/?job_id=23.
{
  "runs": [
    {
      "job_id": 23,
      "run_id": 23,
      "number_in_job": 1,
      "state": {
        "life_cycle_state": "TERMINATED",
        "result_state": "FAILED",
        "state_message": ""
      },
      "task": {
        "spark_jar_task": {
          "jar_uri": "",
          "main_class_name": "org.apache.spark.examples.SparkPi",
          "run_as_repl": true
        }
      },
      "cluster_spec": {
        "new_cluster": {
          "spark_version": "6.4.x-scala2.11",
          ......
          .......
Why did the job fail here? Any suggestions will be appreciated!
EDIT:
The error log says:
20/05/11 18:24:15 INFO ProgressReporter$: Removed result fetcher for 740457789401555410_9000204515761834296_job-34-run-1-action-34
20/05/11 18:24:15 WARN ScalaDriverWrapper: Spark is detected to be down after running a command
20/05/11 18:24:15 WARN ScalaDriverWrapper: Fatal exception (spark down) in ReplId-a46a2-6fb47-361d2
com.databricks.backend.common.rpc.SparkStoppedException: Spark down:
at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:493)
at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:597)
at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:390)
at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337)
at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219)
at java.lang.Thread.run(Thread.java:748)
20/05/11 18:24:17 INFO ShutdownHookManager: Shutdown hook called

I found the answer in this post: https://github.com/dotnet/spark/issues/126
It looks like we shouldn't deliberately call
spark.stop()
when running as a jar in Databricks.

Redis Enterprise Clustering Command Error 'CLUSTER'

We just installed Redis Enterprise and set some configuration on the database.
We created a simple script because the cluster command doesn't work in our app, and sure enough it doesn't work:
var RedisClustr = require('redis-clustr');
var redis = new RedisClustr({
  servers: [
    {
      host: 'URL',
      port: 18611
    }
  ],
  redisOptions: {
    password: 'ourpassword'
  }
});
redis.get('KSHJDK', function(err, res) {
  console.log(res, err);
});
Error on the shell:
undefined Error: couldn't get slot allocation'
at tryClient (/Users/machine/Sites/redis-testing/node_modules/redis-clustr/src/RedisClustr.js:194:17)
at /Users/machine/Sites/redis-testing/node_modules/redis-clustr/src/RedisClustr.js:205:16
at Object.callbackOrEmit [as callback_or_emit] (/Users/machine/Sites/redis-testing/node_modules/redis-clustr/node_modules/redis/lib/utils.js:89:9)
at RedisClient.return_error (/Users/machine/Sites/redis-testing/node_modules/redis-clustr/node_modules/redis/index.js:706:11)
at JavascriptRedisParser.returnError (/Users/machine/Sites/redis-testing/node_modules/redis-clustr/node_modules/redis/index.js:196:18)
at JavascriptRedisParser.execute (/Users/machine/Sites/redis-testing/node_modules/redis-clustr/node_modules/redis-parser/lib/parser.js:572:12)
at Socket.<anonymous> (/Users/machine/Sites/redis-testing/node_modules/redis-clustr/node_modules/redis/index.js:274:27)
at Socket.emit (events.js:321:20)
at addChunk (_stream_readable.js:297:12)
at readableAddChunk (_stream_readable.js:273:9) {
errors: [
ReplyError: ERR command is not allowed
at parseError (/Users/machine/Sites/redis-testing/node_modules/redis-clustr/node_modules/redis-parser/lib/parser.js:193:12)
at parseType (/Users/machine/Sites/redis-testing/node_modules/redis-clustr/node_modules/redis-parser/lib/parser.js:303:14) {
command: 'CLUSTER',
args: [Array],
code: 'ERR'
}
]
}
Are we missing something in the configuration?
We don't know if it's an error with the clustering or with Redis Enterprise.

Redis Enterprise supports two clustering flavors.
With the regular OSS cluster API you need a cluster-aware client like the one you are using.
The flavor you have set up is for non-cluster-aware clients, so you should connect with a regular client (as if you were connecting to a single Redis process).
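For example, here is a minimal sketch using the plain node_redis client (callback style, as in your snippet); the host, port, and password below are just the placeholders from the question:
// Sketch: connecting with a regular (non-cluster-aware) client,
// treating the Redis Enterprise endpoint as a single Redis process.
var redis = require('redis');
var client = redis.createClient({
  host: 'URL',            // placeholder endpoint from the question
  port: 18611,
  password: 'ourpassword'
});
client.get('KSHJDK', function(err, res) {
  console.log(res, err);
});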