How to set nested values (objects) using ReJSON - redis

If I insert the following object using ReJSON:
JSON.SET testing . '{"person":{"name":"John","surname":"Doe"}}'
Is there a way to "append" a nested structure? I would like to add "address.name" for an example to get the following JSON:
{
"person": {
"name": "John",
"surname": "Doe"
},
"address": {
"name": "Imaginary Street"
}
}
I was trying to use JSON.SET testing .address.name '"Imaginary Street 7"' but this results in (error) ERR missing key at non-terminal path level.
The docs read:
A key (with its respective value) is added to a JSON Object (in a
Redis ReJSON data type key) if and only if it is the last child in the
path.
Is "address.name" not the last child in the path? What am I doing wrong?

Since you're adding a dictionary ('address'), the way to go about this is:
JSON.SET testing .address '{"name": "Imaginary Street"}'
Alternatively, if you do just:
JSON.SET testing .address '{}'
you'll be able to use the command from your question without any errors.

Related

Proper way to convert Data type of a field in MongoDB

Possible Replication of How to change the type of a field?
I am currently newly learning MongoDB and I am facing problem while converting Data type of field value to another data type.
Below is an example of my document
[
{
"Name of Restaurant": "Briyani Center",
"Address": " 336 & 338, Main Road",
"Location": "XYZQWE",
"PriceFor2": "500.0",
"Dining Rating": "4.3",
"Dining Rating Count": "1500",
},
{
"Name of Restaurant": "Veggie Conner",
"Address": " New 14, Old 11/3Q, Railway Station Road",
"Location": "ABCDEF",
"PriceFor2": "1000.0",
"Dining Rating": "4.4",
}]
Like above I have 12k documents. Notice the datatype of PriceFor2 is a string. I would like to convert the data type to Integer data type.
I have referred many amazing answers given in the above link. But when I try to run the query, I get .save() is not a function error. Please advice what is the problem.
Below is the code I used
db.chennaiData.find().forEach( function(x){ x.priceFor2= new NumberInt(x.priceFor2);
db.chennaiData.save(x);
db.chennaiData.save(x);});
This is the error I am getting..
TypeError: db.chennaiData.save is not a function
From MongoDB's save documentation:
Starting in MongoDB 4.2, the
db.collection.save()
method is deprecated. Use db.collection.insertOne() or db.collection.replaceOne() instead.
Likely you are having a MongoDB with version 4.2+, so the save function is no longer available. Consider migrate to the usage of insertOne and replaceOne as suggested.
For your specific scenario, it is actually preferred to do with a single update as mentioned in another SO answer. It only does one db call(while your approach fetches all documents in the collection to the application level) and performs n db call to save them back.
db.collection.update({},
[
{
$set: {
PriceFor2: {
$toDouble: "$PriceFor2"
}
}
}
],
{
multi: true
})
Mongo Playground

AWS Glue / Hive struct with undetermined struct

Adding data to a AWS Glue table where one of the columns is a struct where one of the values has undetermined form.
More specifically there's a known key called 'name', that is a string and another called 'metadata' that can be a dict with any structure.
Ex:
# Row 1
{
"name": "Jane",
"metadata": {
"foo": 123,
"bar": "something"
}
}
# Row 2
{
"name": "Bill",
"metadata": {
"baz": "something else"
}
}
Note how metadata is a different dictionary in the two entries.
How can this be specified as a struct?
struct<
name:string,
metadata:?
>
Ended up doing what I mentioned in the comment, which is to make the column a string and have the JSON blob serialized to string.
SQL queries will then need to deserialize the JSON blob, which is supported in several different implementations, including AWS Athena (the one I'm using).

Check if an array contains a value Query in Cumulocity REST API

I have a managed object of type “ABC” with a fragment “A”, that has sub-structure as following:
{
"type": "ABC",
"A": {
"value": ["B", "C"]
}
}
How would one create a filter/query that would check if "A" fragment contains “C” in the "value" array?
That query fails:
{{url}}/inventory/managedObjects?query=$filter=(type+eq+'ABC'+and+A.value+has+‘C‘)
With
{
"error": "inventory/Invalid Data",
"message": "Find by filter query failed : Query '$filter=(type eq 'ABC' and A.value has ‘C‘)' could not be understood. Please try again.",
"info": "https://www.cumulocity.com/guides/reference-guide/#error_reporting"
}
Cumulocity doc about querying REST API.
Solution:
Use eq instead of has:
{{url}}/inventory/managedObjects?query=$filter=(type+eq+'ABC'+and+A.value+eq+‘C‘)
I couldn't find a source but the following is working for me with the expected result:
{{url}}/inventory/managedObjects?query=$filter=(type+eq+'ABC'+and+A.value+eq+'C')
So basically you need to use the eq operator for your use case.

Pushing array to Firebase via REST API

Question
How do I (with a single HTTP request to the REST API) write an array to Firebase and give each array element a (non-integer) unique ID?
As described here.
Data
The data I have to write looks like the following.
data-to-write.js
myArray = [ {"user_id": "jack", "text": "Ahoy!"},
{"user_id": "jill", "text": "Ohai!"} ];
Goal
When finished, I want my Firebase to look like this following.
my-firebase.firebaseio.com
{
"posts": {
"-JRHTHaIs-jNPLXOQivY": { // <- unique ID (non-integer)
"user_id": "jack",
"text": "Ahoy!"
},
"-JRHTHaKuITFIhnj02kE": { // <- unique ID (non-integer)
"user_id": "jill",
"text": "Ohai!"
}
}
}
I do not want it to look like this following...
my-anti-firebase.firebaseio.com
// NOT RECOMMENDED - use push() instead!
{
"posts": {
"0": { // <- ordered array index (integer)
"user_id": "jack",
"text": "Ahoy!"
},
"1": { // <- ordered array index (integer)
"user_id": "jill",
"text": "Ohai!"
}
}
}
I note this page where it says:
[...] if all of the keys are integers, and more than half of the keys between 0 and the maximum key in the object have non-empty values, then Firebase will render it as an array.
Code
Because I want to do this in a single HTTP request, I want to avoid iterating over each element in the array and, instead, I want to push a batch in a single request.
In other words, I want to do something like this:
pseudocode.js
curl -X POST -d '[{"user_id": "jack", "text": "Ahoy!"},
{"user_id": "jill", "text": "Ohai!"}]' \
// I want some type of batch operation here
'https://my-firebase.firebaseio.com/posts.json'
However, when I do this, I get exactly what I describe above that I don't want (i.e., sequential integer keys).
I want to avoid doing something like this:
anti-pseudocode.js
for(i=0; i<=myArray.length; i++;){ // I want to avoid iterating over myArray
curl -X POST -d '{"user_id": myArray[i]["user_id"],
"text": myArray[i]["text"]}' \
'https://my-firebase.firebaseio.com/posts.json'
}
Is it possible to accomplish what I have described? If so, how?
I don't think there is a way to use the Firebase API to do this as described in the OP.
However, it can be done with a server script as follows:
Iterate through each array element.
Assign each element a unique id (generated by server script).
Create a return object with keys being the unique IDs and values being the corresponding array elements.
Write object to Firebase with a single HTTP request using the patch method. Because post creates a new Firebase generated ID for the entire object itself. Whereas, patch does not; it writes directly to the parent node.
script.js
var myObject = {},
i = myArray.length;
while(i--){
var key = function(){ /* return unique ID */ }();
myObject[key] = myArray[i];
}
curl -X PATCH -d JSON.stringify(myObject) \
'https://my-firebase.firebaseio.com/posts.json'
Your decision to use POST is correct. The one which cause numeric indexes as a key is because your payload is an array. Whenever you post/put and array, the key will always be indexes. Post your object one by one if you want the server generate key for you.
Firebase will generate unique ID only if you use POST. If you use PATCH unique ID is not generated.
Hence for the given case, will need to iterate through using some server/ client side code to save data in firebase.
Peseudo Code:
For each array
curl -X POST -d
"user_id": "jack",
"text": "Ahoy!"
'https://my-firebase.firebaseio.com/posts.json'
Next

What is the best way to create a subset of my data in Elasticsearch?

I have an index in elasticsearch containing apache log data. Here is what I want to do:
Identify all visitors (by ip number) that accessed a certain file (e.g. /signup.php).
Do a search/query/aggregation on my data, but limit the documents that are examined to those containing an ip number found in step 1.
In the sql world, I would just create a temporary table and insert all the matching IP numbers from step one. Next I would query my main table and limit the result set by joining in my temporary table on IP number.
I understand joins are not possible in elasticsearch. The elasticsearch documentation suggests a few ways to handle situations like this:
Application side joins
This does not seem practical, because the list of IP numbers may be very large and it seems inefficient to send the results to the client and then pass it back to elasticsearch in one huge terms filter.
Denormalizing the data
This would involve iterating over the matching IP numbers and updating every document in the index for any given IP number with something like "in_group": true, so I can use that in my query later on. This also seems very impractical and inefficient, especially since the source query (step 1) is dynamic.
Nested Object and/or parent-Child relationship
I'm not sure if dynamically creating new documents with nested objects is practical in this case. It seems to me that I would end up copying huge parts of my data.
I'm new to elasticsearch and noSQL in general, so perhaps I'm just looking at the problem the wrong way and I shouldn't be trying to emulate a JOIN in the first place.
But this seems like such a common case for segmenting a dataset, it makes me wonder if I am overlooking some other obvious way of doing this?
Any help would be appreciated!
If I understood your question correctly, you are trying to get a subset of your documents based on certain condition and use that sub set to query/search/aggregate it further.
If true, why would you like to store it in another view(sql types). The main power of elasticsearch is it's caching capability of filters and thus it highly reduces your query time. Using this feature, all the queries/searches/aggregation you need to perform on, would require a term filter which would specify the condition you are trying to do in step 1. Now, whatever other operations you want to do, you can do it in the same query on the already shrinked dataset.
If you have other different use cases, then the storage of document(mapping) might be considered to get changed for easier and faster retrieval.
This is a current workaround that I use:
Run this bash script to save the first query ip-list to a temp index, then use a terms-query filter (in Kibana) to query using the ip-list from step1.
#!/usr/bin/env bash
es_host='https://************'
elk_user='************'
cred=($(pass ELK/************ | tr "\n" " ")) ##password
index_name='iis-************'
index_hostname='"************"'
temp_index_path='temp1/_doc/1'
results_limit=1000
timestamp_gte='"2018-03-20T13:00:00"' #UTC
timestamp_lte='"now"' #UTC
resp_data="$(curl -X POST $es_host/$index_name/_search -u $elk_user:${cred[0]} -H 'Content-Type: application/json; charset=utf-8' -d #- << EOF
{
"query": {
"bool": {
"must": [{
"match": {
"index_hostname": {
"query": $index_hostname
}
}
},
{
"regexp": {
"iis.access.url":{
"value": ".*((jpg)|(jpeg)|(png))"
}
}
}],
"must_not": {
"match": {
"iis.access.agent": {
"query": "Amazon+CloudFront"
}
}
},
"filter": {
"range": {
"#timestamp": {
"gte": $timestamp_gte,
"lte": $timestamp_lte
}
}
}
}
},
"aggs" : {
"whatever" : {
"terms" : { "field" : "iis.access.remote_ip", "size":$results_limit }
}
},
"size" : 0
}
EOF
)"
ip_list="$(echo "$resp_data" | jq '.aggregations.whatever.buckets[].key' | tr "\n" ",\ " | head -c -1)"
resp_data2="$(curl -X PUT $es_host/$temp_index_path -u $elk_user:${cred[0]} -H 'Content-Type: application/json; charset=utf-8' -d #- << EOF
{
"ips" : [$ip_list]
}
EOF
)"
echo "$resp_data2"
Query DSL - "terms-query" filter:
{
"query": {
"terms": {
"iis.access.remote_ip": {
"id": "1",
"index": "temp1",
"path": "ips",
"type": "_doc"
}
}
}
}