Is it possible to query data from Whisper (Graphite DB) from console?

I have configured Graphite to monitor my application metrics, and I configured Zabbix to monitor my servers' CPU and other metrics.
Now I want to pass some critical Graphite metrics to Zabbix to add triggers for them.
So I want to do something like
$ whisper get prefix1.prefix2.metricName
> 155
Is it possible?
P.S. I know about Graphite-API project, I don't want to install extra app.

You can use the whisper-fetch program which is provided in the whisper installation package.
Use it like this:
whisper-fetch /path/to/dot.wsp
Or to get e.g. data from the last 5 minutes:
whisper-fetch --from=$(date +%s -d "-5 min") /path/to/dot.wsp
Defaults will result in output like this:
1482318960 21.187000
1482319020 None
1482319080 21.187000
1482319140 None
1482319200 21.187000
You can change it to json using the --json option.
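To hand a single number to Zabbix, the default output above can be reduced to its most recent non-None value. The following is a minimal sketch, not part of the whisper package: the function names and the 300-second window are assumptions for illustration, and it relies only on the `--from` option and the "timestamp value" line format shown above.

```python
import subprocess
import time


def latest_whisper_value(fetch_output):
    """Return the last non-None value from whisper-fetch's default output."""
    latest = None
    for line in fetch_output.splitlines():
        parts = line.split()
        # Expect "<timestamp> <value>"; skip blanks and empty slots ("None").
        if len(parts) != 2 or parts[1] == "None":
            continue
        latest = float(parts[1])
    return latest


def read_metric(wsp_path, seconds=300):
    """Hypothetical wrapper: run whisper-fetch for the last `seconds` seconds."""
    since = int(time.time()) - seconds
    result = subprocess.run(
        ["whisper-fetch", f"--from={since}", wsp_path],
        capture_output=True, text=True, check=True,
    )
    return latest_whisper_value(result.stdout)
```

On some installations the script may be installed as whisper-fetch.py, in which case the command name in read_metric would need to be adjusted.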

OK! I found it myself: http://graphite.readthedocs.io/en/latest/render_api.html?highlight=rawJson (I can use curl and return csv or json).
The answer was found in the question "custom querying in graphite".
Also see: https://github.com/graphite-project/graphite-web/blob/master/docs/render_api.rst
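As a rough sketch of the render API approach, the snippet below builds a render URL and pulls the latest value out of a `format=json` response. The host name and function names are assumptions; the response shape (a list of series, each with a "datapoints" list of [value, timestamp] pairs) is the one described in the render API documentation linked above.

```python
import json
from urllib.parse import urlencode


def build_render_url(base_url, target, window="-5min"):
    """Build a render API URL that asks for JSON output."""
    query = urlencode({"target": target, "from": window, "format": "json"})
    return f"{base_url.rstrip('/')}/render?{query}"


def latest_value(render_json):
    """Return the newest non-null value of the first series, or None."""
    series = json.loads(render_json)
    if not series:
        return None
    # Each datapoint is a [value, timestamp] pair; value is null for gaps.
    for value, _timestamp in reversed(series[0]["datapoints"]):
        if value is not None:
            return value
    return None
```

The URL can be fetched with curl, as mentioned in the answer, and the body passed to latest_value.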

Related

Deploy sql workflow with DBX

I am developing deployment via DBX to Azure Databricks. In this regard I need a data job written in SQL to happen everyday. The job is located in the file data.sql. I know how to do it with a python file. Here I would do the following:
build:
  python: "pip"
environments:
  default:
    workflows:
      - name: "workflow-name"
        #schedule:
        quartz_cron_expression: "0 0 9 * * ?" # every day at 9.00
        timezone_id: "Europe"
        format: MULTI_TASK #
        job_clusters:
          - job_cluster_key: "basic-job-cluster"
            <<: *base-job-cluster
        tasks:
          - task_key: "task-name"
            job_cluster_key: "basic-job-cluster"
            spark_python_task:
              python_file: "file://filename.py"
But how can I change it so that I can run a SQL job instead? I imagine it is the last two lines (spark_python_task: and python_file: "file://filename.py") that need to be changed.
There are various ways to do that.
(1) One of the simplest is to add a SQL query in the Databricks SQL lens, and then reference this query via sql_task as described here.
(2) If you want to have a Python project that re-uses SQL statements from a static file, you can add this file to your Python Package and then call it from your package, e.g.:
sql_statement = ... # code to read from the file
spark.sql(sql_statement)
(3) A third option is to use the DBT framework with Databricks. In this case you probably would like to use dbt_task as described here.
I found a simple workaround (although it might not be the prettiest): change data.sql to a Python file and run the queries using Spark. This way I could use the same spark_python_task.
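A minimal sketch of how option (2) and the workaround could look if data.sql is kept as a file and read from Python. The function names are hypothetical, `spark` is assumed to be an existing SparkSession, and the splitter is deliberately naive: it breaks on every semicolon, so it would mishandle semicolons inside string literals or comments.

```python
from pathlib import Path


def split_sql_statements(sql_text):
    """Naively split a SQL script into statements on ';' (assumption: no
    semicolons appear inside string literals or comments)."""
    return [part.strip() for part in sql_text.split(";") if part.strip()]


def run_sql_file(spark, path="data.sql"):
    """Run each statement from the file through spark.sql, in order."""
    for statement in split_sql_statements(Path(path).read_text()):
        spark.sql(statement)
```

A file containing this code could then be referenced from python_file in the existing spark_python_task.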

gcloud: How to get specific field from pod log in Stackdriver?

How can I get specific field of some log message using gcloud?
I am currently using this command:
gcloud logging read "logName=projects/some_project/logs/stdout AND resource.type:k8s_container and resource.labels.cluster_name=testing AND resource.labels.namespace_name=test" --limit 10 --format json
I'm guessing this should be something related to SELECT (as read in gcloud's standard sql guide for bigquery: https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax)
OK so this seems to do the trick: --format="value(textPayload)"
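If the JSON output is kept instead (for example because several fields are needed later), the same field can be picked out after the fact. This is only a sketch: the function name is made up, and it assumes the output of `--format json` is a JSON list of log entries in which text logs carry a textPayload key.

```python
import json


def text_payloads(gcloud_json):
    """Return the textPayload of every entry that has one."""
    entries = json.loads(gcloud_json)
    # Entries with a structured payload have no textPayload key, so skip them.
    return [entry["textPayload"] for entry in entries if "textPayload" in entry]
```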

How to get information on latest successful pod deployment in OpenShift 3.6

I am currently working on a CI/CD script to deploy a complex environment into another environment. We have multiple technologies involved, and I want to optimize this script because it takes too much time to fetch information on each environment.
In the OpenShift 3.6 section, I need to get the last successful deployment for each application in a specific project. I tried to find a quick way to do so, but so far I have only found this solution:
oc rollout history dc -n <Project_name>
This will give me the following output
deploymentconfigs "<Application_name>"
REVISION STATUS CAUSE
1 Complete config change
2 Complete config change
3 Failed manual change
4 Running config change
deploymentconfigs "<Application_name2>"
REVISION STATUS CAUSE
18 Complete config change
19 Complete config change
20 Complete manual change
21 Failed config change
....
I then take this output and parse each line to find the latest revision that has the status "Complete".
In the above example, I would get this list :
<Application_name> : 2
<Application_name2> : 20
Then for each application and each revision I do :
oc rollout history dc/<Application_name> -n <Project_name> --revision=<Latest_Revision>
In the above example the Latest_Revision for Application_name is 2 which is the latest complete revision not building and not failed.
This gives me the output with the information I need, which is the version of the ear and the version of the configuration that were used to create the image used for this successful deployment.
But since I have multiple applications, this process can take up to 2 minutes per environment.
Would anybody have a better way of fetching the information I require?
Unless I am mistaken, it looks like there is no "one-liner" that returns this information for the currently running and accessible application.
Thanks
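For reference, the parsing step described in the question (finding the latest "Complete" revision per application) can be sketched as below. The function name is hypothetical and the parser assumes the output layout shown in the question: a deploymentconfigs "<name>" line followed by rows whose first two columns are the revision and the status.

```python
import re


def latest_complete_revisions(history_text):
    """Map each deployment config name to its highest 'Complete' revision."""
    latest = {}
    current = None
    for raw_line in history_text.splitlines():
        line = raw_line.strip()
        header = re.match(r'^deploymentconfigs?\s+"(.+)"$', line)
        if header:
            current = header.group(1)
            continue
        fields = line.split()
        # Skip the column header, blank lines, and anything that is not a row.
        if current is None or len(fields) < 2 or not fields[0].isdigit():
            continue
        revision, status = int(fields[0]), fields[1]
        if status == "Complete":
            latest[current] = max(revision, latest.get(current, 0))
    return latest
```

Applications with no completed revision are simply absent from the result.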
Assuming that the currently active deployment is the latest successful one, you may try the following:
oc get dc -a --no-headers | awk '{print "oc rollout history dc "$1" --revision="$2}' | . /dev/stdin
It gets a list of deployments, feeds it to awk to extract the name $1 and revision $2, then compiles your command to extract the details, finally sends it to standard input to execute. It may be frowned upon for not using xargs or the like, but I found it easier for debugging (just drop the last part and see the commands printed out).
UPDATE:
On second thoughts, you might actually like this one better:
oc get dc -a -o jsonpath='{range .items[*]}{.metadata.name}{"\n\t"}{.spec.template.spec.containers[0].env}{"\n\t"}{.spec.template.spec.containers[0].image}{"\n-------\n"}{end}'
The example output:
daily-checks
[map[name:SQL_QUERIES_DIR value:daily-checks/]]
docker-registry.default.svc:5000/ptrk-testing/daily-checks@sha256:b299434622b5f9e9958ae753b7211f1928318e57848e992bbf33a6e9ee0f6d94
-------
jboss-webserver31-tomcat
registry.access.redhat.com/jboss-webserver-3/webserver31-tomcat7-openshift@sha256:b5fac47d43939b82ce1e7ef864a7c2ee79db7920df5764b631f2783c4b73f044
-------
jtask
172.30.31.183:5000/ptrk-testing/app-txeq:build
-------
lifebicycle
docker-registry.default.svc:5000/ptrk-testing/lifebicycle@sha256:a93cfaf9efd9b806b0d4d3f0c087b369a9963ea05404c2c7445cc01f07344a35
You get the idea, with expressions like .spec.template.spec.containers[0].env you can reach for specific variables, labels, etc. Unfortunately the jsonpath output is not available with oc rollout history.
UPDATE 2:
You could also use post-deployment hooks to collect the data, if you can set up a listener for the hooks. Hopefully the information you need is inherited by the PODs. More info here: https://docs.openshift.com/container-platform/3.10/dev_guide/deployments/deployment_strategies.html#lifecycle-hooks

In Lettuce(4.x) for Redis how to reduce round trips and use output of one command as input for another command, especially for Georadius

I have seen this pass results to another command in redis
and using via command line this command works well :
src/redis-cli keys '*' | xargs src/redis-cli mget
However, how can we achieve the same effect via Lettuce? (I started trying out 4.0.2.Final.)
Also a solution to this is particularly important in the following scenario :
Say we are using geolocation capabilities, and we add a set of locations of "my-location-category"
using GEOADD
GEOADD "category-1" 8.6638775 49.5282537 "location-id:1" 8.3796281 48.9978127 "location-id:2" 8.665351 49.553302 "location-id:3"
Next, say we do a GeoRadius to get locations within 10 km radius of 8.6582361 49.5285495 for "category-1"
Now when we get "location-id:1" & "location-id:3"
Given that I already set values for above keys "location-id:1" & "location-id:3"
I want to pipe commands to do the GEORADIUS as well as do mget on all the matching results.
Does Redis provide a feature to do that?
And/or how can we achieve this via the Lettuce client library without first manually iterating through the results of GEORADIUS and then doing a manual MGET?
That would be more efficient for the program that uses it.
Does anyone know how we can do this ?
Update
This is the piped command for the scenario I discussed above :
src/redis-cli GEORADIUS "category-1" 8.6582361 49.5285495 10 km | xargs src/redis-cli mget
Now we need to know how to do this via Lettuce
IMPORTANT: never use KEYS, always use SCAN instead if you must.
This isn't really a question about Lettuce nor Java so I can actually answer it :)
What you're trying to do is use the results from a read operation (GEORADIUS) as input (key names) for another read operation (MGET). This type of flow can't be pipelined, well, just because of that: pipelining means that you don't need the answers for operations right away, but in your case you do.
However.
Since you're reading String keys with MGET, you might as well just denormalize everything (remember, we're NoSQL) and store the contents of these keys in the Sorted Set's members, e.g.:
GEOADD "category-1" 8.6638775 49.5282537 "location-id:1:moredata:evenmoredata:{maybe a JSON document here}:orperhapsmsgpack"
This will allow you to get the locations and their "data" with one GEORADIUS call. Of course, any updates to location:1's data will need to be done across all categories.
A note about Lua scripts: while a Lua script could definitely save on the back and forth in this case, any such script will be against best practices/not cluster safe.
After digging around and studying Lua script, my conclusion is that removing round-trips in such a way can only be done via Lua scripts as suggested by Itamar Haber.
I ended up creating a lua script file (myscript.lua) as below
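-- Proof of concept: GEORADIUS followed by MGET in one server-side call
local locationKeys = redis.call('GEORADIUS', 'category-1', '8.6582361', '49.5285495', '10', 'km')
if #locationKeys == 0 then
  -- no members within the radius, so there is nothing to MGET
  return nil
else
  return redis.call('MGET', unpack(locationKeys))
end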
** of course we should be sending in parameters to this... this is just a poc :)
now you can execute it via command
src/redis-cli EVAL "$(cat myscript.lua)" 0
Then to reduce the network-overhead of sending across the entire script to Redis for execution, we have the option of registering the script with Redis.
Redis will give us a sha1 digested code for future references for that script, which can be used for next calls to that script.
This can be done as below :
src/redis-cli SCRIPT LOAD "$(cat myscript.lua)"
this should give back a sha1 code, something like this: 49730aa2ed3034ee48f818e486b2bdf1b500b19e
next calls can be done using this code
eg
src/redis-cli evalsha 49730aa2ed3034ee48f818e486b2bdf1b500b19e 0
The sad part here, however, is that the sha1 digest is remembered only as long as the Redis instance is running. If it is restarted, the sha1 digest is lost and you have to run SCRIPT LOAD once again. If nothing changes in the script, the sha1 digest will be the same.
Ideally, when going through a client API, we should first attempt EVALSHA; if that returns a "No matching script" error, fall back to SCRIPT LOAD, obtain the sha1 digest again, keep it in an internal map, and use it for further calls.
This can well be done via Lettuce. I could find the methods for those. Hope this gives a good insight into solution for the problem.
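The EVALSHA-first, SCRIPT LOAD-on-miss flow described above can be sketched independently of the client library. The code below is an illustration in Python rather than Lettuce code: ScriptRunner and FakeRedis are hypothetical names, the client is assumed to expose evalsha(sha, numkeys, *args) and script_load(script), and the in-memory FakeRedis exists only to demonstrate the control flow.

```python
import hashlib


class ScriptRunner:
    """Try EVALSHA first; on a 'No matching script' error, reload and retry."""

    def __init__(self, client, script):
        self.client = client
        self.script = script
        # SCRIPT LOAD returns the SHA1 of the script body, so it can be precomputed.
        self.sha = hashlib.sha1(script.encode("utf-8")).hexdigest()

    def run(self, numkeys=0, *keys_and_args):
        try:
            return self.client.evalsha(self.sha, numkeys, *keys_and_args)
        except Exception as exc:
            message = str(exc)
            if "NOSCRIPT" not in message and "No matching script" not in message:
                raise
            # Script cache was empty (e.g. after a restart): register and retry.
            self.sha = self.client.script_load(self.script)
            return self.client.evalsha(self.sha, numkeys, *keys_and_args)


class FakeRedis:
    """In-memory stand-in for a Redis client, for demonstration only."""

    def __init__(self):
        self.cache = {}
        self.loads = 0

    def script_load(self, script):
        self.loads += 1
        sha = hashlib.sha1(script.encode("utf-8")).hexdigest()
        self.cache[sha] = script
        return sha

    def evalsha(self, sha, numkeys, *keys_and_args):
        if sha not in self.cache:
            raise RuntimeError("NOSCRIPT No matching script. Please use EVAL.")
        return "ran " + sha[:8]
```

With this shape, the script is sent over the network only when the server does not already know it; every other call sends just the digest.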

How to get a list of internal IP addresses of GCE instances

I have a bunch of instances running in GCE. I want to programmatically get a list of the internal IP addresses of them without logging into the instances (locally).
I know I can run:
gcloud compute instances list
But are there any flags I can pass to just get the information I want?
e.g.
gcloud compute instances list --internal-ips
or similar? Or am I going to have to dust off my sed/awk brain and parse the output?
I also know that I can get the output in JSON using --format=json, but I'm trying to do this in a bash script.
The simplest way to programmatically get a list of internal IPs (or external IPs) without a dependency on any tools other than gcloud is:
$ gcloud --format="value(networkInterfaces[0].networkIP)" compute instances list
$ gcloud --format="value(networkInterfaces[0].accessConfigs[0].natIP)" compute instances list
This uses --format=value which also requires a projection which is a list of resource keys that select resource data values. For any command you can use --format=flattened to get the list of resource key/value pairs:
$ gcloud --format=flattened compute instances list
A few things here.
First gcloud's default output format for listing is not guaranteed to be stable, and new columns may be added in the future. Don't script against this!
The three output modes accessible with the format flag, --format=json, --format=yaml, and --format=text, are based on key=value pairs and can be scripted against even if new fields are introduced in the future.
Two good ways to do what you want are to use JSON and the jq tool,
gcloud compute instances list --format=json \
| jq '.[].networkInterfaces[].networkIP'
or to use the text format with grep and other line-oriented tools,
gcloud compute instances list --format=text \
| grep '^networkInterfaces\[[0-9]\+\]\.networkIP:' | sed 's/^.* //g'
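If jq is not available, the same extraction can be done on the JSON output with a few lines of Python. This is a sketch under the assumption that `--format=json` prints a JSON list of instances whose networkInterfaces entries carry a networkIP key, as the jq filter above implies; the function name is made up.

```python
import json


def internal_ips(instances_json):
    """Collect networkIP from every network interface of every instance."""
    instances = json.loads(instances_json)
    return [
        nic["networkIP"]
        for instance in instances
        for nic in instance.get("networkInterfaces", [])
        if "networkIP" in nic
    ]
```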
I hunted around and couldn't find a straight answer, probably because efficient tools weren't available when others replied to the original question. GCP constantly updates their libraries & APIs and we can use the filter and projections to extract targeted attributes.
Here I outline how to reserve an external static IP, see how its attributes are named and organised, and then export the external IP address so that I can use it in other scripts (e.g. assign it to a VM instance or authorise this network (IP address) on a Cloud SQL instance).
Reserve a static IP in a region of your choice
gcloud compute --project=[PROJECT] addresses create [NAME] --region=[REGION]
[Informational] View the details of the regional static IP that was reserved
gcloud compute addresses describe [NAME] --region [REGION] --format=flattened
[Informational] List the attributes of the static IP in the form of key-value pairs
gcloud compute addresses describe [NAME] --region [REGION] --format='value(address)'
Extract the desired value (e.g. external IP address) as a parameter
export STATIC_IP=$(gcloud compute addresses describe [NAME] --region [REGION] --format='value(address)')
Use the exported parameter in other scripts
echo $STATIC_IP
The best approach is to keep a ready-made gcloud command on hand and reuse it as and when needed.
This can be achieved using table() format option with gcloud as per below:
gcloud compute instances list --format='table(id,name,status,zone,networkInterfaces[0].networkIP :label=Internal_IP,networkInterfaces[0].accessConfigs[0].natIP :label=External_IP)'
What does it do for you?
It gets you the data in a clean format
It gives you the option to add or remove columns
Need additional columns? How do you find a column name even before you run the above command?
Execute the following, which gives you the data in raw JSON format with each value and its name; copy those names and add them to your table() list. :-)
gcloud compute instances list --format=json
Plus point: you can tweak pretty much the same syntax to fetch data for any GCP resource, including with gcloud, kubectl, etc.
As far as I know you can't filter on specific fields in the gcloud tool.
Something like this will work for a Bash script, but it still feels a bit brittle:
gcloud compute instances list --format=yaml | grep " networkIP:" | cut -c 14-100
I agree with @Christiaan. Currently there is no automated way to get the internal IPs using the gcloud command.
You can use the following command to print the internal IPs (4th column):
gcloud compute instances list | tail -n+2 | awk '{print $4}'
or the following one if you want to have the pair <instance_name> <internal_ip> (1st and 4th column)
gcloud compute instances list | tail -n+2 | awk '{print $1, $4}'
I hope it helps.