Fish Shell | Command Substitution using curl and JSON for variable assignment

I cannot find any Fish shell documentation on using command substitution more than once in a single statement.
I'm trying to assign the state and city from the JSON result (parsed with jq) of a curl query against the LocationIQ API. That makes two command substitutions: one for curl and one for jq. I don't need the location variable assignment if I can get the address variable assignment.
Purpose of Function:
#Take 2 arguments (Latitude, Longitude) and return 2 variables $State, $City
The JSON:
{
"address": {
"city": "Aurora",
"country": "United States of America",
"country_code": "us",
"county": "Kane County",
"postcode": "60504",
"road": "Ridge Road",
"state": "Illinois"
},
"boundingbox": [
"41.729347",
"41.730247",
"-88.264466",
"-88.261979"
],
"display_name": "Ridge Road, Aurora, Kane County, Illinois, 60504, USA",
"importance": 0.2,
"lat": "41.729476",
"licence": "https://locationiq.com/attribution",
"lon": "-88.263423",
"place_id": "333878957973"
}
My Function:
function getLocation
set key 'hidden'
set exifLat $argv[1]
set exifLon $argv[2]
set location (curl -s "https://us1.locationiq.com/v1/reverse.php?key=$key&lat=$exifLat&lon=$exifLon&format=json" | set address (jq --raw-output '.address.state,.address.city') )
echo "Location: $location"
echo "state: $address[1]"
echo "city: $address[2]"
end
Error: fish Command substitution not allowed
Works fine using only the curl Command substitution ->removing the: set address & parens for jq.
set location (curl -s "https://us1.locationiq.com/v1/reverse.php?key=$key&lat=$exifLat&lon=$exifLon&format=json" | jq --raw-output '.address.state,.address.city')
I'm still pretty novice - maybe there is a better way to achieve my desired result: Assign the JSON State to a variable and City to a variable?
I originally tried (slicing the location[17] - City, location[19] - State) and getting inconsistent results as the fields seem to be dynamic and affecting how many results which affects the ordering.
Any help appreciated!

I find the nested set confusing. Did you intend to use $location to hold the downloaded JSON data, and $address to hold the results of jq? If so, split them out into separate statements:
set url "https://us1.locationiq.com/v1/reverse.php?key=$key&lat=$exifLat&lon=$exifLon&format=json"
set location (curl -s $url)
set address (echo $location | jq --raw-output '.address.state,.address.city')
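If shelling out to jq is not a requirement, the same two-field extraction can be sketched in Python. This is only an illustration built on a trimmed copy of the sample response from the question; the function name extract_state_city is hypothetical, and .get is used because the question notes that the address fields vary between results.

```python
import json

# Trimmed copy of the sample LocationIQ response from the question.
SAMPLE_RESPONSE = """
{
  "address": {
    "city": "Aurora",
    "county": "Kane County",
    "postcode": "60504",
    "state": "Illinois"
  },
  "display_name": "Ridge Road, Aurora, Kane County, Illinois, 60504, USA"
}
"""


def extract_state_city(raw_json):
    """Return (state, city); a field missing from the response becomes None."""
    address = json.loads(raw_json).get("address", {})
    return address.get("state"), address.get("city")


state, city = extract_state_city(SAMPLE_RESPONSE)
print(f"state: {state}")
print(f"city: {city}")
```

Looking the fields up by name rather than by position avoids the ordering problem described in the question, where slicing by index gave inconsistent results.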

Related

Extracting particular nested properties with a $ prefix in Amazon Redshift or Quicksight

I am using PostHog for product analytics and have exported some event data to Amazon Redshift as well as S3 to be used in Quicksight.
Under the personal properties part of the JSON, each individual property is nested but begins with a $
I am quite new to SQL queries, as well as to extracting specific details from JSON in Quicksight using parseJson.
Here is an example of the JSON from PostHog
"properties": {
"$active_feature_flags": [],
"$browser": "Chrome",
"$browser_version": 98,
"$ce_version": 1,
"$device_type": "Desktop",
"$environment": "test",
"$event_type": "click",
"$lib": "web",
"$lib_version": "1.17.8",
"$os": "Mac OS X",
"$pathname": "/events",
"$plugins_deferred": [],
"$plugins_failed": [],
"$plugins_succeeded": [
"First Event Today (4914)",
"GeoIP (5539)"
],
I have sought help from a few sources who have mentioned it isn't as simple because of the $ symbol at the beginning.
So my question would be,
How would I query this in Redshift to successfully extract $device_type and $os for example?
How would I pull the same properties using parseJson in Amazon Quicksight?

I can answer #1.
The json provided looks to be a snippet and invalid as is. So I removed the trailing ',' and used SQL to provide the surrounding '{}'. Once it is valid json this runs fine:
create table test as select '"properties": {
"$active_feature_flags": [],
"$browser": "Chrome",
"$browser_version": 98,
"$ce_version": 1,
"$device_type": "Desktop",
"$environment": "test",
"$event_type": "click",
"$lib": "web",
"$lib_version": "1.17.8",
"$os": "Mac OS X",
"$pathname": "/events",
"$plugins_deferred": [],
"$plugins_failed": [],
"$plugins_succeeded": [
"First Event Today (4914)",
"GeoIP (5539)"
]
}' as json_text;
select json_extract_path_text('{' || json_text ||'}', 'properties' ,'$device_type') as device_type,
json_extract_path_text('{' || json_text ||'}', 'properties' ,'$os') as os
from test;
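The two steps used above (wrap the fragment in braces, then look up the $-prefixed keys as plain strings) can be checked outside Redshift. The following is a rough Python illustration, not Redshift or Quicksight code; the fragment is shortened from the question and the variable names are made up for the example.

```python
import json

# Shortened fragment from the question, with the trailing comma removed.
fragment = '''"properties": {
    "$browser": "Chrome",
    "$device_type": "Desktop",
    "$os": "Mac OS X",
    "$plugins_succeeded": ["First Event Today (4914)", "GeoIP (5539)"]
}'''

# Wrap the fragment in braces so it becomes a valid JSON object.
document = json.loads("{" + fragment + "}")

# "$device_type" and "$os" are ordinary string keys; no escaping is needed here.
properties = document["properties"]
device_type = properties["$device_type"]
os_name = properties["$os"]
print(device_type, os_name)
```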

How to set nested values (objects) using ReJSON

If I insert the following object using ReJSON:
JSON.SET testing . '{"person":{"name":"John","surname":"Doe"}}'
Is there a way to "append" a nested structure? I would like to add "address.name" for an example to get the following JSON:
{
"person": {
"name": "John",
"surname": "Doe"
},
"address": {
"name": "Imaginary Street"
}
}
I was trying to use JSON.SET testing .address.name '"Imaginary Street 7"' but this results in (error) ERR missing key at non-terminal path level.
The docs read:
A key (with its respective value) is added to a JSON Object (in a
Redis ReJSON data type key) if and only if it is the last child in the
path.
Is "address.name" not the last child in the path? What am I doing wrong?

Since you're adding a dictionary ('address'), the way to go about this is:
JSON.SET testing .address '{"name": "Imaginary Street"}'
Alternatively, if you do just:
JSON.SET testing .address '{}'
you'll be able to use the command from your question without any errors.
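The rule quoted from the docs can be modelled with plain Python dictionaries. This is only a sketch of the behaviour described in this thread, not ReJSON's implementation; set_path is a hypothetical helper that creates the final key of a path but refuses to create missing parents.

```python
def set_path(document, path, value):
    """Set a dotted path in nested dicts, creating only the final key."""
    keys = path.strip(".").split(".")
    node = document
    for key in keys[:-1]:
        # Every non-terminal key must already exist, as in the quoted rule.
        if key not in node:
            raise KeyError(f"missing key at non-terminal path level: {key}")
        node = node[key]
    node[keys[-1]] = value


doc = {"person": {"name": "John", "surname": "Doe"}}

try:
    # "address" does not exist yet, so "name" cannot be attached to it.
    set_path(doc, ".address.name", "Imaginary Street")
except KeyError as error:
    print("rejected:", error)

# Creating the parent object first makes the nested assignment valid.
set_path(doc, ".address", {})
set_path(doc, ".address.name", "Imaginary Street")
print(doc)
```

In this model, .address.name fails at first because address is a non-terminal key that is missing; name is the last child, but its parent has to exist before it can be set.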

Output specific key value in object for each element in array with jq for JSON

I have an array:
[
{
"AssetId": 14462955,
"Name": "Cultural Item"
},
{
"AssetId": 114385498,
"Name": "Redspybot"
},
{
"AssetId": 29715011,
"Name": "American Cowboy"
},
{
"AssetId": 98253651,
"Name": "Mahem"
}
]
I would like to loop through each object in this array, and pick out the value of each key called AssetId and output it.
How would I do this using jq for the command line?

The command-line tool jq writes to STDOUT and/or STDERR. If you want to write the .AssetId information to STDOUT, then one possibility would be as follows:
jq -r ".[] | .AssetId" input.json
Output:
14462955
114385498
29715011
98253651
A more robust incantation would be: .[] | .AssetId? but your choice will depend on what you want if there is no key named "AssetId".
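For comparison, here is a rough Python counterpart of the same loop over the array from the question. It is a sketch, not a reproduction of jq's semantics: it applies one possible policy for odd input, namely skipping any element that is not an object or has no AssetId key.

```python
import json

raw = """
[
  {"AssetId": 14462955, "Name": "Cultural Item"},
  {"AssetId": 114385498, "Name": "Redspybot"},
  {"AssetId": 29715011, "Name": "American Cowboy"},
  {"AssetId": 98253651, "Name": "Mahem"}
]
"""

items = json.loads(raw)

# Keep only elements that are objects and actually carry an "AssetId" key.
asset_ids = [
    item["AssetId"]
    for item in items
    if isinstance(item, dict) and "AssetId" in item
]

for asset_id in asset_ids:
    print(asset_id)
```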

You can also do it with this command:
jq ".[].AssetId" input.json
If the array is nested inside an object, as it is in my case:
{
"resultCode":0,
"resultMsg":"SUCCESS",
"uniqueRefNo":"111222333",
"list":[
{
"cardType":"CREDIT CARD",
"isBusinessCard":"N",
"memberName":"Bank A",
"memberNo":10,
"prefixNo":404591
},
{
"cardType":"DEBIT CARD",
"isBusinessCard":"N",
"memberName":"Bank A",
"memberNo":10,
"prefixNo":407814
},
{
"cardType":"CREDIT CARD",
"isBusinessCard":"N",
"memberName":"Bank A",
"memberNo":10,
"prefixNo":413226
}
]
}
you can get prefixNo with the jq command below:
jq ".list[].prefixNo" input.json
For more specific cases of array iteration in jq, you can check this blog post.

You have a couple of choices for the loop itself. You can take peak's answer and wrap a shell loop around it; replace echo with the script you want to run.
via xargs
$ jq -r ".[] | .AssetId" input.json | xargs -n1 echo # this would print
14462955
114385498
29715011
98253651
via raw loop
$ for i in $(jq -r ".[] | .AssetId" input.json)
do
echo $i
done
14462955
114385498
29715011
98253651

An alternative using map:
jq "map ( .AssetId ) | .[]"

For your case, jq -r '.[].AssetId' should work.
You can also use the online jq parser: https://jqplay.org/
If you want to loop through each value, you can use the following:
for i in $(echo "$api_response" | jq -r ".[].AssetId")
do
echo "$i"
done

What is the best way to create a subset of my data in Elasticsearch?

I have an index in elasticsearch containing apache log data. Here is what I want to do:
Identify all visitors (by ip number) that accessed a certain file (e.g. /signup.php).
Do a search/query/aggregation on my data, but limit the documents that are examined to those containing an ip number found in step 1.
In the sql world, I would just create a temporary table and insert all the matching IP numbers from step one. Next I would query my main table and limit the result set by joining in my temporary table on IP number.
I understand joins are not possible in elasticsearch. The elasticsearch documentation suggests a few ways to handle situations like this:
Application side joins
This does not seem practical, because the list of IP numbers may be very large and it seems inefficient to send the results to the client and then pass it back to elasticsearch in one huge terms filter.
Denormalizing the data
This would involve iterating over the matching IP numbers and updating every document in the index for any given IP number with something like "in_group": true, so I can use that in my query later on. This also seems very impractical and inefficient, especially since the source query (step 1) is dynamic.
Nested Object and/or parent-Child relationship
I'm not sure if dynamically creating new documents with nested objects is practical in this case. It seems to me that I would end up copying huge parts of my data.
I'm new to elasticsearch and noSQL in general, so perhaps I'm just looking at the problem the wrong way and I shouldn't be trying to emulate a JOIN in the first place.
But this seems like such a common case for segmenting a dataset, it makes me wonder if I am overlooking some other obvious way of doing this?
Any help would be appreciated!

If I understood your question correctly, you are trying to get a subset of your documents based on a certain condition and then query, search, or aggregate that subset further.
If so, why would you want to store it in a separate view, as you would in SQL? A major strength of Elasticsearch is its ability to cache filters, which greatly reduces query time. Using this feature, every query, search, or aggregation you need to run would include a term filter expressing the condition from step 1. Whatever other operations you want can then be done in the same query, on the already reduced dataset.
If you have other use cases, consider changing how the documents are stored (the mapping) for easier and faster retrieval.

This is a current workaround that I use:
Run this bash script to save the IP list from the first query to a temp index, then use a terms-query filter (in Kibana) to query with the IP list from step 1.
#!/usr/bin/env bash
es_host='https://************'
elk_user='************'
cred=($(pass ELK/************ | tr "\n" " ")) ##password
index_name='iis-************'
index_hostname='"************"'
temp_index_path='temp1/_doc/1'
results_limit=1000
timestamp_gte='"2018-03-20T13:00:00"' #UTC
timestamp_lte='"now"' #UTC
resp_data="$(curl -X POST $es_host/$index_name/_search -u $elk_user:${cred[0]} -H 'Content-Type: application/json; charset=utf-8' -d @- << EOF
{
"query": {
"bool": {
"must": [{
"match": {
"index_hostname": {
"query": $index_hostname
}
}
},
{
"regexp": {
"iis.access.url":{
"value": ".*((jpg)|(jpeg)|(png))"
}
}
}],
"must_not": {
"match": {
"iis.access.agent": {
"query": "Amazon+CloudFront"
}
}
},
"filter": {
"range": {
"@timestamp": {
"gte": $timestamp_gte,
"lte": $timestamp_lte
}
}
}
}
},
"aggs" : {
"whatever" : {
"terms" : { "field" : "iis.access.remote_ip", "size":$results_limit }
}
},
"size" : 0
}
EOF
)"
ip_list="$(echo "$resp_data" | jq '.aggregations.whatever.buckets[].key' | tr "\n" ",\ " | head -c -1)"
resp_data2="$(curl -X PUT $es_host/$temp_index_path -u $elk_user:${cred[0]} -H 'Content-Type: application/json; charset=utf-8' -d @- << EOF
{
"ips" : [$ip_list]
}
EOF
)"
echo "$resp_data2"
Query DSL - "terms-query" filter:
{
"query": {
"terms": {
"iis.access.remote_ip": {
"id": "1",
"index": "temp1",
"path": "ips",
"type": "_doc"
}
}
}
}
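The jq and tr step in the script, which turns the aggregation buckets into the "ips" array stored in the temp index, could also be written as a small Python function. This is a sketch under assumptions: build_ip_document is a hypothetical name, the example response is made up (documentation-range IP addresses), and only the bucket shape read by the jq filter in the script is relied on.

```python
import json


def build_ip_document(search_response):
    """Collect the bucket keys of the 'whatever' terms aggregation."""
    buckets = search_response["aggregations"]["whatever"]["buckets"]
    return {"ips": [bucket["key"] for bucket in buckets]}


# Made-up response fragment with the shape the script's jq filter reads.
example_response = {
    "aggregations": {
        "whatever": {
            "buckets": [
                {"key": "203.0.113.7", "doc_count": 42},
                {"key": "198.51.100.23", "doc_count": 17},
            ]
        }
    }
}

# This JSON text is what would be sent as the body of the PUT request.
print(json.dumps(build_ip_document(example_response)))
```

Serializing with json.dumps handles the quoting and comma placement that the script does by hand with tr and head.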

Using the Instagram API to get ALL followers

I'm using the Instagram API to get the number of people who follow a given account as follows.
$follow_info = file_get_contents('https://api.instagram.com/v1/users/477644454/followed-by?access_token=ACESS_TOKEN&count=-1');
$follow_info = @json_decode($follow_info, true);
This returns a set of 50 results. They do have a next_url key in the array, but it becomes time consuming to keep on going to the next page of followers when dealing with tens of thousands.
I read on StackOverflow that setting the count parameter to -1 would return the entire set. But, it doesn't seem to...

Instagram limits the number of results returned in their API for all sorts of endpoints, and they change these limits arbitrarily, without warning, presumably to handle server load.
Several similar threads exist:
Instagram API not fufilling count parameter
Displaying more than 20 photos in instagram API
Instagram API: How to get all user media? (see comments on answer too, -1 returns 1 less result).
350 Request Limit for Instagram API
Instagram API: How to get all user media?
In short, you won't be able to increase the maximum returned rows, and you'll be stuck paginating.
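Since pagination is unavoidable, the loop itself can at least be kept small. Below is a minimal Python sketch of following next_url until it runs out. The response shape (a "data" list plus a "pagination" object holding "next_url") is an assumption for illustration, collect_all is a hypothetical helper, and the fetch function is a stand-in that serves two fake pages instead of calling the real API.

```python
def collect_all(first_url, fetch):
    """Follow next_url links until a page has none, gathering every item."""
    items = []
    url = first_url
    while url:
        page = fetch(url)
        items.extend(page.get("data", []))
        # Assumed response shape: {"data": [...], "pagination": {"next_url": ...}}
        url = page.get("pagination", {}).get("next_url")
    return items


# Stand-in for real HTTP calls: two fake pages keyed by URL.
FAKE_PAGES = {
    "page-1": {"data": ["alice", "bob"], "pagination": {"next_url": "page-2"}},
    "page-2": {"data": ["carol"], "pagination": {}},
}

followers = collect_all("page-1", FAKE_PAGES.__getitem__)
print(followers)
```

Passing fetch in as a parameter keeps the loop independent of any HTTP library, so a real implementation could add its own rate limiting or retries there.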

$follow_info = file_get_contents('https://api.instagram.com/v1/users/USER_ID?access_token=ACCES_TOKEN');
$follow_info = json_decode($follow_info);
print_r($follow_info->data);
This returns:
{
"meta": {
"code": 200
},
"data": {
"username": "i_errorw",
"bio": "A Casa do Júlio é um espaço para quem gosta da ideia de cuidar da saúde com uma alimentação saudável e saborosa.",
"website": "",
"profile_picture": "",
"full_name": "",
"counts": {
"media": 5,
"followed_by": 10,
"follows": 120000
},
"id": "1066376857"
}
}

If using the API is optional:
Using the mobile version of Twitter, you can extract the full list of followers for a given target with a very simple bash script.
The sleep time must be chosen carefully to avoid a temporary IP block.
The script can be executed with:
./scriptname.sh targetusername
Content:
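#!/bin/bash
# Usage: ./scriptname.sh targetusername
# Fetch the first page of followers.
wget --load-cookies ./twitter.cookies -O - "https://mobile.twitter.com/$1/followers?" > page
while true; do
# Extract the usernames on the current page and append them to userlist.
grep -i "@" page | grep -vi "fullname" | grep -vi "$1" | awk -F">" '{print $5}' | awk -F"<" '{print $1}' >> userlist
# The cursor link points at the next page; it is empty on the last page.
nextpage=$(grep -i "cursor" page | awk -F'"' '{print $4}')
if [ -z "$nextpage" ]; then
exit 0
fi
wget --load-cookies ./twitter.cookies -O - "https://mobile.twitter.com/$nextpage" > page
# Pause between requests to avoid a temporary IP block.
sleep 5
done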
It creates a file "userlist" containing the usernames of all followers of the target, one per line.
PS: wget needs a cookies file filled with your credentials to authenticate the requests.

$follow_info = file_get_contents('https://api.instagram.com/v1/users/USER_ID?access_token=ACCES_TOKEN');
$follow_info = json_decode($follow_info);
print_r($follow_info->data);
This returns:
{
"meta": {
"code": 200
},
"data": {
"username": "casadojulio",
"bio": "A Casa do Júlio é um espaço para quem gosta da ideia de cuidar da saúde com uma alimentação saudável e saborosa.",
"website": "",
"profile_picture": "",
"full_name": "",
"counts": {
"media": 5,
"followed_by": 25,
"follows": 12
},
"id": "1066376857"
}
}
