How to modify the following Cypher syntax in AgensGraph?

MATCH (wu:wiki_user)
OPTIONAL MATCH (n:wiki_doc{author:wu.uid}), (o:wiki_doc{editor:wu.uid})
RETURN wu.uid AS User_id, wu.org AS Organization, wu.email AS email, wu.token AS balance,
count(n) AS Writing, count(o) AS Modifying;
 user_id | organization | email           | balance | writing | modifying
---------+--------------+-----------------+---------+---------+-----------
 "ailee" | "Org2"       | "hazel#gbc.com" |       5 |       0 |         0
 "hazel" | "Org1"       | "hazel#gbc.com" |       5 |       2 |         2
match (n:wiki_doc{editor:'hazel'}) return n;
n
wiki_doc[9.11]{"bid": "hazel_doc1", "cid": "Basic", "org": "Org1", "title": "Hello world!",
 "author": "hazel", "editor": "hazel", "revnum": 1, "created": "2018-09-25 09:00:000",
 "hasfile": 2, "contents": "I was wrong", "modified": "2018-09-25 10:00:000"}
(1 row)
In fact, the number of documents hazel has modified is 1, but the query above returns 2 for Modifying.
How can the query be modified so that the correct count of 1 is returned?

Because the two patterns in the single OPTIONAL MATCH are matched together, every author match is paired with every editor match, so the plain count() over-counts; counting distinct vertex ids removes those duplicates:
MATCH (wu:wiki_user)
OPTIONAL MATCH (n:wiki_doc{author:wu.uid}), (o:wiki_doc{editor:wu.uid})
RETURN wu.uid AS User_id, wu.org AS Organization, wu.email AS email, wu.token AS balance,
count(distinct id(n)) AS Writing, count(distinct id(o)) AS Modifying;
 user_id | organization | email           | balance | writing | modifying
---------+--------------+-----------------+---------+---------+-----------
 "ailee" | "Org2"       | "hazel#gbc.com" |       5 |       0 |         0
 "hazel" | "Org1"       | "hazel#gbc.com" |       5 |       2 |         1
(2 rows)

Related

How to extract elements from Presto ARRAY(MAP(VARCHAR, VARCHAR))

I have an array of maps and the data format is ARRAY(MAP(VARCHAR, VARCHAR)); I'd like to extract the "product" and "amount" entries from this "Item_Details" column:
+---------+---------+--------------+
| Company | Country | Item_Details |
+---------+---------+--------------+
| Apple   | US      | [{"created":"2019-09-15","product":"apple watch", "amount": "$7,900"},{"created":"2022-09-19","product":"iPhone", "amount": "$78,300"},{"created":"2021-01-13","product":"Macbook Pro", "amount": "$163,980"}] |
| Google  | US      | [{"created":"2020-07-15","product":"Nest", "amount": "$78,300"},{"created":"2021-07-15","product":"Google phone", "amount": "$178,900"}] |
+---------+---------+--------------+
My expected outputs would be:
+---------+---------+--------------+
| Company | Country | Item_Details |
+---------+---------+--------------+
| Apple   | US      | {"product": ["apple watch", "iPhone", "Macbook Pro"], "amount": ["$7,900", "$78,300", "$163,980"]} |
| Google  | US      | {"product": ["Nest", "Google phone"], "amount": ["$78,300", "$178,900"]} |
+---------+---------+--------------+
or
+---------+---------+-------------+----------+
| Company | Country | Product     | Amount   |
+---------+---------+-------------+----------+
| Apple   | US      | apple watch | $7,900   |
| Apple   | US      | iPhone      | $78,300  |
| Apple   | US      | Macbook Pro | $163,980 |
...
+---------+---------+-------------+----------+
I tried element_at(Item_Details, 'product') and json_extract_scalar(Item_Details, '$.product') but received error "Unexpected parameters (array(map(varchar,varchar)), varchar(23)) for function element_at. "
Any suggestions are much appreciated! Thank you in advance.
For the second output you can unnest the array and access the elements of each map:
-- sample data
WITH dataset(Company, Country, Item_Details) AS (
    values ('Google', 'US', array[
        map(array['created', 'product', 'amount'], array['2019-09-15', 'Nest', '$78,300']),
        map(array['created', 'product', 'amount'], array['2019-09-16', 'Nest1', '$79,300'])
    ])
)
-- query
select Company,
       Country,
       m['product'] product,
       m['amount'] amount
from dataset d,
     unnest(Item_Details) as t(m);
Output:
Company | Country | product | amount
--------+---------+---------+--------
Google  | US      | Nest    | $78,300
Google  | US      | Nest1   | $79,300
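For the first output shape (one row per company, with the products and amounts collected into separate arrays), a hedged sketch using Presto's transform() over the same sample data could look like the following; the products/amounts aliases are only illustrative:
-- sample data (same as above)
WITH dataset(Company, Country, Item_Details) AS (
    values ('Google', 'US', array[
        map(array['created', 'product', 'amount'], array['2019-09-15', 'Nest', '$78,300']),
        map(array['created', 'product', 'amount'], array['2019-09-16', 'Nest1', '$79,300'])
    ])
)
-- collect one array of products and one array of amounts per row
select Company,
       Country,
       transform(Item_Details, m -> m['product']) as products,
       transform(Item_Details, m -> m['amount']) as amounts
from dataset;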

PowerBI / SQL Query to verify records

I am working on a PowerBI report that is grabbing information from SQL and I cannot find a way to solve my problem using PowerBI or how to write the required code. My first table, Certifications, includes a list of certifications and required trainings that must be obtained in order to have an active certification.
My second table, UserCertifications, includes a list of UserIDs, certifications, and the trainings associated with a certification.
How can I write SQL code or a PowerBI measure to tell whether a user has all required trainings for a certification? I.e., if UserID 1 has the A certification, how can I verify that they have the TrainingIDs of 1, 10, and 150 associated with it?
Certifications: (image of the CertificationsTable; sample data is reproduced below)
UserCertifications: (image of the UserCertificationsTable; sample data is reproduced below)
This is a DAX pattern to test whether one set of values contains all of the required values.
Certifications:
| Certification | TrainingID |
|---------------|------------|
| A             | 1          |
| A             | 10         |
| A             | 150        |
| B             | 7          |
| B             | 9          |
UserCertifications:
| UserID | Certification | Training |
|--------|---------------|----------|
| 1      | A             | 1        |
| 1      | A             | 10       |
| 1      | A             | 300      |
| 2      | A             | 150      |
| 2      | B             | 9        |
| 2      | B             | 90       |
| 3      | A             | 7        |
| 4      | A             | 1        |
| 4      | A             | 10       |
| 4      | A             | 150      |
| 4      | A             | 1000     |
In the above scenario, DAX needs to find out whether the mandatory trainings (Certifications[TrainingID]) for each Certifications[Certification] have been completed within each UserCertifications[UserID] && UserCertifications[Certification] partition.
In the above scenario, DAX should return TRUE only for UserCertifications[UserID] = 4, as that is the only user who has completed at least all of the mandatory trainings.
The way to achieve this is through the following measure:
areAllMandatoryTrainingCompleted =
VAR _alreadyCompleted =
    CONCATENATEX (
        UserCertifications,
        UserCertifications[Training],
        "-",
        UserCertifications[Training]
    ) // what has been completed in the fact table; the fourth argument is very important as it decides the sort order
VAR _0 =
    MAX ( UserCertifications[Certification] )
VAR _supposedToComplete =
    CONCATENATEX (
        FILTER ( Certifications, Certifications[Certification] = _0 ),
        Certifications[TrainingID],
        "-",
        Certifications[TrainingID]
    ) // what is supposed to be completed per the Certifications table; the fourth argument is very important as it decides the sort order
VAR _isMandatoryTrainingCompleted =
    CONTAINSSTRING ( _alreadyCompleted, _supposedToComplete ) // CONTAINSSTRING ( <within_text>, <find_text> ); returns TRUE/FALSE
RETURN
    _isMandatoryTrainingCompleted
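Since the question also asks for a SQL option, here is a hedged SQL sketch of the same check using relational division; the table and column names (Certifications(Certification, TrainingID) and UserCertifications(UserID, Certification, Training)) are assumed from the sample data above:
-- user/certification pairs for which no mandatory training is missing
select uc.UserID, uc.Certification
from (select distinct UserID, Certification from UserCertifications) uc
where not exists (
    select 1
    from Certifications c
    where c.Certification = uc.Certification
      and not exists (
          select 1
          from UserCertifications u
          where u.UserID = uc.UserID
            and u.Certification = uc.Certification
            and u.Training = c.TrainingID
      )
);
With the sample data above this should return only UserID 4 with certification A, matching what the DAX measure reports.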

Snowflake - using json_parse and select distinct to un-nested column and compare with another column

I have 2 columns: one is a nested column named custom_field and the other is sales_id. I want to compare the sales_id_2 values inside custom_field with the sales_id column.
I've tried this but it didn't work:
select distinct parse_json(custom_fields) as CUSTOM_FIELDS
from my_table where custom_fields:sales_id_2 = sales_id;
but I get the error:
SQL compilation error: error line 1 at position 111 Invalid argument
types for function 'GET': (VARCHAR(16777216), VARCHAR(2)).
+----------------------------+-----------+
| custom_field               | sales_id  |
|----------------------------+-----------|
| {                          | 235324115 |
|   "sales_id_2": 235324115, | 1234351   |
|   "g": 12,                 |           |
|   "r": 255                 |           |
| }                          |           |
| {                          | 678322341 |
|   "sales_id_2": 1234351,   | 5648561   |
|   "g": 13,                 |           |
|   "r": 254                 |           |
| }                          |           |
+----------------------------+-----------+
I'm hoping to see empty results, because I believe sales_id_2 is the same as sales_id
:: is for casting, and you are attempting a JSON operation on a VARCHAR column. Try this:
select distinct parse_json(custom_fields) as CUSTOM_FIELDS
from my_table
where parse_json(custom_fields):sales_id_2 = sales_id;
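If the comparison still behaves unexpectedly, a hedged variant is to cast both sides explicitly; this sketch assumes sales_id is numeric, and an empty result means every sales_id_2 matches its sales_id:
-- rows where the nested id and the column disagree; an empty result
-- means sales_id_2 matches sales_id on every row (numeric casts assumed)
select parse_json(custom_fields) as CUSTOM_FIELDS, sales_id
from my_table
where parse_json(custom_fields):sales_id_2::number <> sales_id::number;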

SELECT 1 ID and all belonging elements

I am trying to create a JSON select query which can give me back the result in the following way:
one row contains one main_message_id and its belonging messages (like the table at the bottom). The JSON format is not a requirement; if it works with other methods, that is fine too.
I store the data like this:
+-----------------+---------+----------------+
| main_message_id | message | sub_message_id |
+-----------------+---------+----------------+
| 1 | test 1 | 1 |
| 1 | test 2 | 2 |
| 1 | test 3 | 3 |
| 2 | test 4 | 4 |
| 2 | test 5 | 5 |
| 3 | test 6 | 6 |
+-----------------+---------+----------------+
I would like to create a query which gives me back the data like this:
+-----------------+--------------------------+
| main_message_id | message                  |
+-----------------+--------------------------+
| 1               | {test 1}{test 2}{test 3} |
| 2               | {test 4}{test 5}         |
| 3               | {test 6}                 |
+-----------------+--------------------------+
You can use json_agg() for that:
select main_message_id, json_agg(message) as messages
from the_table
group by main_message_id;
Note that {test 1}{test 2}{test 3} is not valid JSON; the above will return a valid JSON array, e.g. ["test 1", "test 2", "test 3"].
If you just want a comma separated list, use string_agg():
select main_message_id, string_agg(message, ', ') as messages
from the_table
group by main_message_id;
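If the exact {test 1}{test 2}{test 3} formatting from the expected output is wanted, one possible sketch (assuming the table and column names shown above) wraps each message in braces before aggregating:
-- concatenate each message wrapped in braces, ordered by sub_message_id
select main_message_id,
       string_agg('{' || message || '}', '' order by sub_message_id) as messages
from the_table
group by main_message_id;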

Rank over partition from postgresql in elasticsearch

We are facing a problem with migrating a large data set from Postgres into Elasticsearch (backup or whatever).
We have a schema similar to this:
+---------------+--------------+------------+-----------+
| user_id | created_at | latitude | longitude |
+---------------+--------------+------------+-----------+
| 5 | 23.1.2015 | 12.49 | 20.39 |
+---------------+--------------+------------+-----------+
| 2 | 23.1.2015 | 12.42 | 20.32 |
+---------------+--------------+------------+-----------+
| 2 | 24.1.2015 | 12.41 | 20.31 |
+---------------+--------------+------------+-----------+
| 5 | 25.1.2015 | 12.45 | 20.32 |
+---------------+--------------+------------+-----------+
| 1 | 23.1.2015 | 12.43 | 20.34 |
+---------------+--------------+------------+-----------+
| 1 | 24.1.2015 | 12.42 | 20.31 |
+---------------+--------------+------------+-----------+
And we are able to find the latest position by created_at thanks to the rank() function in SQL:
... WITH locations AS (
    select user_id, lat, lon, rank() over (partition by user_id order by created_at desc) as r
    FROM locations)
SELECT user_id, lat, lon FROM locations WHERE r = 1
and the result contains only the newest created location for each user:
+---------------+--------------+------------+-----------+
| user_id | created_at | latitude | longitude |
+---------------+--------------+------------+-----------+
| 2 | 24.1.2015 | 12.41 | 20.31 |
+---------------+--------------+------------+-----------+
| 5 | 25.1.2015 | 12.45 | 20.32 |
+---------------+--------------+------------+-----------+
| 1 | 24.1.2015 | 12.42 | 20.31 |
+---------------+--------------+------------+-----------+
After we import the data into elasticsearch, our document model looks like:
{
"location" : { "lat" : 12.45, "lon" : 46.84 },
"user_id" : 5,
"created_at" : "2015-01-24T07:55:20.606+00:00"
}
etc...
I am looking for an alternative to this SQL query as an Elasticsearch query. I think it must be possible, but I did not find out how yet.
You can achieve this using field collapsing combined with inner_hits:
{
  "collapse": {
    "field": "user_id",
    "inner_hits": {
      "name": "order by created_at",
      "size": 1,
      "sort": [
        { "created_at": "desc" }
      ]
    }
  }
}
Detailed Article: https://blog.francium.tech/sql-window-function-partition-by-in-elasticsearch-c2e3941495b6
It is simple: if you want to find the newest record (for a given id), you just need the records for which no newer ones (with the same id) exist. (This assumes that for a given id, no two records exist with the same created_at date.)
SELECT * FROM locations ll
WHERE NOT EXISTS (
SELECT * FROM locations nx
WHERE nx.user_id = ll.user_id
AND nx.created_at > ll.created_at
);
EDITED (it appears the OP wants the newest observation, not the oldest)
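A related Postgres-side alternative is DISTINCT ON; this is only a sketch, and it assumes the column names shown in the question's table (latitude/longitude rather than the lat/lon used in the query above):
-- newest row per user_id, picked by the DESC sort within each user_id group
SELECT DISTINCT ON (user_id) user_id, created_at, latitude, longitude
FROM locations
ORDER BY user_id, created_at DESC;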
Use top_hits:
{
  "aggs": {
    "user_id": {
      "terms": { "field": "user_id" },
      "aggs": {
        "top_location": {
          "top_hits": {
            "size": 1,
            "sort": { "created_at": "desc" },
            "_source": []
          }
        }
      }
    }
  }
}