JQ Pulling Inconsistent Results - iteration

If I run this straightforward query on the JSON output of an AWS CLI command, I get the correct number of server instances in my account:
aws ec2 describe-instances | jq -r '.Reservations[].Instances[].InstanceId'
Produces a list of 47 instance IDs which corresponds to the number of server instances I have in the account. For example:
i-01adbf1408ef1a333
i-0f92d078ce975c138
i-0e4e117c44b17b417
and so on, up to 47 instances.
This next query still produces the correct number of results:
aws ec2 describe-instances | jq -r '.Reservations[].Instances[] | [( .InstanceId ) ]'
However, if I extend the query to include the Name tags of the servers, far fewer server instances are reported:
aws ec2 describe-instances | jq -r '.Reservations[].Instances[] | [( (.Tags[]|select(.Key=="Name")|.Value), .InstanceId ) ]'
This is the output of that command:
"i-08d3c05eed1316c9d"
"USAMZLAB10003","i-79eebb29"
"EOMLABAMZ1306","i-dbc98af4"
"USAMZLAB10002","i-d1dc1d83"
"i-0366c9bf18d27eb96"
"i-04d061334bc2f2d6b"
"USAMZLAB10007","i-f7a680a7"
"i-090e84eff4fece2b3"
"EOMLABAMZ1303","i-7cc98a53"
"EOMLABCSE713","i-08233926"
"i-0705eb3039cd56e04"
jq: error (at <stdin>:5013): Cannot iterate over null (null)
For some reason that query reports only 11 AWS server instances (when there should be 47). It does show servers both with and without Name tags, but the total is wrong.
It also produces the jq error "Cannot iterate over null".
I have put the original JSON into this paste:
Original JSON
How can I make the error more verbose so I can find out what's going on?
And why does adding the name tag to the query dramatically reduce the number of results?

In your JSON, not all instances have a set of Tags, hence the error. You have to handle that case, for example by substituting an empty array with (.Tags // []). Overall, I would write it like this:
.Reservations[].Instances[] | [ (.Tags // [] | from_entries.Name), .InstanceId ]

How can I make the error more verbose so I can find out what's going on?
You could use the debug filter, which prints the value passing through it to stderr and then hands it on unchanged.
why does adding the name tag to the query dramatically reduce the number of results?
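Because your jq program does not do what you expect: you have overlooked what happens when .Tags evaluates to null. Iterating over null with .Tags[] raises an error, and jq stops at the first instance that has no Tags, so the remaining instances are never printed. To see the failure in isolation, consider: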
$ jq -n '{} | .Tags[]|select(.Key=="Name")|.Value'
This fails with "Cannot iterate over null". Another issue is the handling of empty arrays, where the Name lookup produces no output at all. You might handle that case along these lines:
$ jq -n '{Tags: []} | (.Tags[] | select(.Key=="Name")|.Value) // null'
null
One solution
If you want null to appear whenever there isn't a tag:
.Reservations[].Instances[]
| [ ((.Tags // [])[] | select(.Key=="Name") | .Value) // null,
.InstanceId ]
Given your input, the first two lines of the output would be:
[null,"i-08d3c05eed1316c9d"]
["USAMZLAB10003","i-79eebb29"]
Variant using try
.Reservations[].Instances[]
| [ try (.Tags[] | select(.Key=="Name")|.Value) // null,
.InstanceId ]

Related

Get value by position in Redis

When getting all the keys from Redis, like this:
redis.server.com:6379> keys *
1) "z13235jxby03knne1w1gucl5"
Instead of manually copying the long key to execute get z13235jxby03knne1w1gucl5, I'd like to run something like get $(1) (pseudo code) to get the value at position 1, as output by the keys command.
Is this possible? If not, is there a workaround that avoids manual copy and paste?
Note: I don't want to solve this with a script; if that is the only option, I'd rather just copy and paste.
Off the top of my head, I'm not aware of a way to do this from inside the CLI.
However, performance implications aside, you can pipe CLI invocations together, but you have to do it from the shell:
redis-cli --raw KEYS "*" | sed -n 1p | xargs redis-cli GET
where the 1 in :
sed -n 1p
is the line number (1-based index) inside KEYS output.
You still need to do your own validation, such as making sure that the index is within the number of keys returned by the KEYS command and that all keys are of the simple string type (not sets, hashes, etc.).
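The index check mentioned above could be sketched as a small shell function. This is only an illustration under assumptions: nth_line is a hypothetical helper name, the second key in the sample is made up, and the printf line stands in for the output of redis-cli --raw KEYS "*".

```shell
# nth_line N: print line N (1-based) of stdin.
# Returns 2 if N is not a string of digits, 1 if stdin has no line N.
nth_line() {
  case $1 in
    ''|*[!0-9]*) echo "nth_line: index must be a number" >&2; return 2 ;;
  esac
  # Print the matching line, then report through the exit status
  # whether such a line existed at all.
  awk -v n="$1" 'NR == n { print; found = 1; exit } END { exit !found }'
}

# Stand-in for the output of: redis-cli --raw KEYS "*"
# (the second key is a made-up example)
printf '%s\n' z13235jxby03knne1w1gucl5 some:other:key | nth_line 1
```

Against a real server, something like key=$(redis-cli --raw KEYS "*" | nth_line 1) && redis-cli GET "$key" would run GET only when the index exists, and redis-cli TYPE "$key" can confirm beforehand that the key holds a string.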

GitLab API: pipeline not returning all jobs

I'm using the GitLab api, to list out the jobs in a pipeline. It's always been fine in the past, but I've added a couple of extra items to the flow and now it doesn't return all of the jobs:
$ curl --globoff -sSH "$CURL_HEADER" https://.../api/v4/projects/$CI_PROJECT_ID/pipelines/$PIPEID/jobs?scope[]=success | jq --raw-output '.[] | "\(.id)"' | wc -l
20
The jobs that are missing aren't retries (as noted here).
I can see the missing jobids in the web interface.
Is there a maximum of 20 jobs via this method?
It turns out this API response is paginated; the docs for this endpoint give no indication of that.
There is a general section describing pagination here, but it doesn't list the routes it applies to. If it did, it would probably be far easier to find in a search.
All I needed to do was append &per_page=100 (quoting the & for my use case). Alternatively, you can check the response headers for the X-Next-Page value and append &page=X to fetch the subsequent pages.
Related page variables are:
x-next-page: 2
x-page: 1
x-per-page: 20
x-prev-page:
x-total: 23
x-total-pages: 2
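One way to follow these headers from the shell is sketched below. It is an illustration rather than a tested client: next_page and fetch_all_jobs are hypothetical helper names, JOBS_URL is a placeholder for the jobs endpoint used in the question, and headers.tmp is an arbitrary scratch file.

```shell
# next_page: read HTTP response headers on stdin and print the value of
# the x-next-page header (prints an empty line on the last page).
next_page() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "x-next-page" { print $2; exit }'
}

# fetch_all_jobs: request page after page until x-next-page is empty.
# Assumes JOBS_URL and CURL_HEADER are set by the caller (placeholders).
fetch_all_jobs() {
  page=1
  while [ -n "$page" ]; do
    # -D writes the response headers to a file; the JSON body goes to stdout.
    curl --globoff -sS -H "$CURL_HEADER" -D headers.tmp \
      "$JOBS_URL?scope[]=success&per_page=100&page=$page" || return 1
    page=$(next_page < headers.tmp)
  done
  rm -f headers.tmp
}

# Demo on the headers listed above (no network access needed).
printf '%s\r\n' 'x-next-page: 2' 'x-page: 1' 'x-per-page: 20' | next_page
```

The demo line prints 2. fetch_all_jobs emits one JSON array per page, which the jq filter from the question can consume unchanged because jq processes each array in turn.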

Get dependency map API [pact-broker]

Is there a way to get full dependency map of all contracts from the Pact Broker (preferably in json format)?
There is an API call used in the graph: https://<broker-url>/groups/<service>.csv to get data to draw the graph, but that is not great for parsing and requires a call to find all services and then a call for each service to get the dependencies.
It would be nice to have one call with a full dependency map in json format.
Yes! There is a HAL browser built into the broker, which lets you follow the graph programmatically.
For example, you could run a query like this and filter with jq on the subset of properties you need, and re-order the output:
curl -v -u 'dXfltyFMgNOFZAxr8io9wJ37iUpY42M:O5AIZWxelWbLvqMd8PkAVycBJh2Psyg1' https://test.pact.dius.com.au/pacts/latest | jq '.pacts[]._embedded | select(.consumer.name | contains("AWSSummiteer")) | .consumer.name + "->" + .provider.name'
Which produces something like:
"AWSSummiteerSentimentSNSProvider->AWSSummiteerTwitterSNSProvider"
"AWSSummiteerTwitterSNSConsumer->AWSSummiteerTwitterSNSProvider"
"AWSSummiteerTwitterSNSProvider->Twitter"
"AWSSummiteerWeb->AWSSummiteerIoT"
"AWSSummiteerWeb->AWSSummiteerIoTPresignedUrl"
"AWSSummiteerWeb->AWSSummiteerSentimentSNSProvider"
"AWSSummiteerWeb->AWSSummiteerTwitterSNSConsumer"
"AWSSummiteerWeb->AWSSummiteerWeb"
which you could pipe into graphviz to create pretty charts but of course you could translate this into any format you like.
Here is the full graphviz visualisation:
echo 'digraph { ranksep=3; ratio=auto; overlap=false; node [ shape = plaintext, fontname = "Helvetica" ];' > latest.dot ; curl -v -u 'dXfltyFMgNOFZAxr8io9wJ37iUpY42M:O5AIZWxelWbLvqMd8PkAVycBJh2Psyg1' https://test.pact.dius.com.au/pacts/latest | jq '.pacts[]._embedded | select(.consumer.name | contains("AWSSummiteer")) | .consumer.name + "->" + .provider.name' | tr -d '"' | sed 's/-/_/g' | sed 's/_>/->/g' >> latest.dot; echo "}" >> latest.dot
dot latest.dot -otest.png -Tpng
which renders the graph to test.png.

Execute an Impala query and get query time

I want to be able to execute a number of Impala queries and return the time it took for each query to execute. Using the Impala shell, I can do this with the following command:
impl -q "select count(*) from database.table;"
This gives me the output
Using service name 'impala'
SSL is enabled. Impala server certificates will NOT be verified (set --ca_cert to change)
Connected to *****.************:21000
Server version: impalad version 2.6.0-cdh5.8.3 RELEASE (build c644f476b774db9db87a619628f7a6ecc5f843e0)
Query: select count(*) from database.table
+----------+
| count(*) |
+----------+
| 1130976 |
+----------+
Fetched 1 row(s) in 0.86s
I want to be able to fetch that last line and extract the time. It doesn't really matter how, which is why I haven't tagged a language. I have tried using grep like this:
impl -q "select count(*) from database.table" | grep -Po "\d+\.\d+"
But that does nothing except remove the table. Putting the query in a Python script and using subprocess failed because impl could not be found as a command, and the same happened with Scala.
The weird thing is that impala-shell writes those messages to stderr rather than stdout, so to capture the last line you have to append 2>&1 to redirect stderr to stdout:
impala-shell -q "query string" 2>&1 | grep -Po "\d+\.\d+(?=s)"
Note that the positive lookahead (?=s) is probably required to avoid capturing version numbers.
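If grep -P is not available, a similar result can be sketched with sed by anchoring on the whole "Fetched ... in N.NNs" line instead of using a lookahead. This assumes the summary line always has the shape shown in the question; query_time is a hypothetical helper name, and the sample lines are copied from the question's output.

```shell
# query_time: print the seconds value from the "Fetched ... in 0.86s" line.
# Lines that do not match, such as the version banner, produce no output.
query_time() {
  sed -n 's/^Fetched .* in \([0-9][0-9.]*\)s$/\1/p'
}

# Demo on sample output; the version line must not yield a match.
printf '%s\n' \
  'Server version: impalad version 2.6.0-cdh5.8.3 RELEASE (build c644f476b774db9db87a619628f7a6ecc5f843e0)' \
  'Query: select count(*) from database.table' \
  'Fetched 1 row(s) in 0.86s' | query_time
```

The demo prints 0.86. With the real shell the call would look like impala-shell -q "query string" 2>&1 | query_time, keeping the stderr redirect described above.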

Using tr or cut skews column formatting

I'm using a script that uses curl to obtain specific array values from a configuration. Tom Fenech helped with the column formatting. I've run into another issue where the column formatting is skewed by the tr and cut commands. The command output of one array set contains square brackets surrounding the object's key. I'm using tr or cut to remove the surrounding brackets, and they do the job, however the column format is skewed. Here's my code:
# get the device groups
d_groups=`curl -H "X-Person-Token: $auth_token" -H "X-Person-Email: $auth_email" -k "$api_host/api/v1/device_groups"`
grp_uuid=`echo $d_groups | jq '.[] | .uuid'`
grp_name=`echo $d_groups | jq '.[] | .name'`
grp_desc=`echo $d_groups | jq '.[] | .description'`
grp_dev=`echo $d_groups | jq '.[] | .devices'`
echo "========== DEVICES =========="
paste <(printf 'DEVICE_GROUP_NAME\n%s\n' "$grp_name") <(printf 'DESCRIPTION\n%s\n' "$grp_desc") <(printf 'DEVICE_GROUP_UUID\n%s\n' "$grp_uuid") <(printf 'DEVICES\n%s\n' "$grp_dev" | cut -c2-39) | column -t
echo ""
exit 0
I've also tried using 'tr -d "[]"' with the same results as using cut. Here are the results of the above script:
========== DEVICES ==========
DEVICE_GROUP_NAME DESCRIPTION DEVICE_GROUP_UUID DEVICES
"Auto_API_GP1" "Auto_API_GP1" "03be550b-3744-484e-88c5-d25e1ed865c0"
"Auto_API_GP2" "Auto_API_GP2" "3a8e2ee4-3a59-4fba-aaf0-0d527d20fe13" "1a7a2092-29bd-4178-88e9-1373ee5886de"
"Auto_API_GP3" "Auto_API_GP3" "ba1ead89-34f5-4084-a9a2-8a681e83164d"
"81a3969c-1cc3-4f13-8602-afcb981d5295"
"8bbfd1a7-e048-4148-abeb-354763b6aef7"
What I'm expecting for results:
========== DEVICES ==========
DEVICE_GROUP_NAME DESCRIPTION DEVICE_GROUP_UUID DEVICES
"Auto_API_GP1" "Auto_API_GP1" "03be550b-3744-484e-88c5-d25e1ed865c0" "1a7a2092-29bd-4178-88e9-1373ee5886de"
"Auto_API_GP2" "Auto_API_GP2" "3a8e2ee4-3a59-4fba-aaf0-0d527d20fe13" "81a3969c-1cc3-4f13-8602-afcb981d5295"
"Auto_API_GP3" "Auto_API_GP3" "ba1ead89-34f5-4084-a9a2-8a681e83164d" "8bbfd1a7-e048-4148-abeb-354763b6aef7"
I'm an absolute beginner at this, any insight is very much appreciated. Thanks!
So I stopped staring at the script long enough to get lunch and coffee and figured that the code I posted above was removing the brackets on the first output but was either entering a new line or replacing the bracket with a new line. So I simply replaced
tr -d '[]'
with
tr -s '[]' '\n'
and it now works as expected.
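The difference between the two tr calls can be seen on a small sample. The sketch below assumes that $grp_dev holds jq's default pretty-printed output, with each .devices array spread over several lines; the UUIDs are taken from the output above, and delete_brackets and squeeze_brackets are hypothetical names used only for the comparison.

```shell
# Assumed shape of $grp_dev: one pretty-printed array per device group.
grp_dev='[
  "1a7a2092-29bd-4178-88e9-1373ee5886de"
]
[
  "81a3969c-1cc3-4f13-8602-afcb981d5295"
]'

# Deleting the brackets keeps the newline that followed each one,
# so blank lines remain and push later values down a row in paste.
delete_brackets()  { tr -d '[]'; }

# Translating the brackets to newlines and squeezing repeats (-s)
# collapses those blank lines, leaving one value per line.
squeeze_brackets() { tr -s '[]' '\n'; }

printf 'DEVICES\n%s\n' "$grp_dev" | delete_brackets  | wc -l   # 7 lines
printf 'DEVICES\n%s\n' "$grp_dev" | squeeze_brackets | wc -l   # 3 lines
```

The second form yields the header plus one line per device, which is what paste needs to keep the columns aligned. It still assumes exactly one device per group.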
I humbly accept any corrections or suggestions for a more efficient way. :)