CosmosDB: Group By + Join

I've been having trouble joining grouped data to the source data. This would be easy with relational SQL, but I've spent hours trying to do it with the CosmosDB SQL API without success. Any suggestions would be greatly appreciated!
Here are the source documents:
[
{
"stream":{
"id":"L1",
"version":1,
"versionName":"abc1"
}
},
{
"stream":{
"id":"L1",
"version":2,
"versionName":"abc2"
}
},
{
"stream":{
"id":"L2",
"version":1,
"versionName":"xyz1"
}
},
{
"stream":{
"id":"L2",
"version":2,
"versionName":"xyz2"
}
},
{
"stream":{
"id":"L2",
"version":3,
"versionName":"xyz3"
}
}
]
Here is the goal (grouped by id):
[
{
"id":"L1",
"versions":[
{
"version":1,
"versionName":"abc1"
},
{
"version":2,
"versionName":"abc2"
}
]
},
{
"id":"L2",
"versions":[
{
"version":1,
"versionName":"xyz1"
},
{
"version":2,
"versionName":"xyz2"
},
{
"version":3,
"versionName":"xyz3"
}
]
}
]
I wonder whether this can be done in the query, or whether the results can only be joined together in JavaScript after the query returns?

I wonder whether this can be done in the query, or whether the results can only be joined together
in JavaScript after the query returns?
I'm afraid this is not supported in Cosmos DB. The SQL dialect of a NoSQL database is very different from that of a relational database. Collecting values into an array is not one of the available aggregate functions, so it can't be combined with GROUP BY.
JOIN is also different. In Cosmos DB it joins a document with its own nested arrays; it does not join separate documents to each other.
So I think this has to be implemented in code: sort the data with ORDER BY on the id, then loop over the results to build the target structure.
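The client-side approach described above could look roughly like the following. This is only a minimal sketch in plain JavaScript: it assumes the documents have already been fetched and are sorted by stream.id, and groupVersions is a hypothetical helper name, not part of any Cosmos DB SDK.

```javascript
// Groups documents that are already sorted by stream.id into
// { id, versions: [...] } objects. groupVersions is a hypothetical name.
function groupVersions(sortedDocs) {
  const groups = [];
  for (const { stream } of sortedDocs) {
    let last = groups[groups.length - 1];
    // The input is sorted by id, so a changed id always starts a new group.
    if (!last || last.id !== stream.id) {
      last = { id: stream.id, versions: [] };
      groups.push(last);
    }
    last.versions.push({
      version: stream.version,
      versionName: stream.versionName,
    });
  }
  return groups;
}

// Sample input copied from the question.
const docs = [
  { stream: { id: "L1", version: 1, versionName: "abc1" } },
  { stream: { id: "L1", version: 2, versionName: "abc2" } },
  { stream: { id: "L2", version: 1, versionName: "xyz1" } },
  { stream: { id: "L2", version: 2, versionName: "xyz2" } },
  { stream: { id: "L2", version: 3, versionName: "xyz3" } },
];

console.log(JSON.stringify(groupVersions(docs), null, 2));
```

Because the loop only compares each row with the previous group, it runs in a single pass, but it does depend on the sort order. If the rows cannot be sorted in the query, grouping through a Map keyed by id would be the safer choice.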

Related

Filtering items by relation using knex query

I'm trying to filter items by relation using a knex query. I'm almost there (I think) but struggling a little and could use some help as this is new to me.
I have a list of users who are following people and have followers. I'm trying to return a list of users who I'm not already following. Below is my code so far:
const users = await knex("users-permissions_user").whereNotExists(
function () {
this.select("*")
.from("users_followers__users_followings")
.where("user_id", "users-permissions_user.id")
.where("follower_id", id);
}
);
This returns a list of users who currently have no followers and users where I'm the only follower. Any users who I follow and who also have other followers are still returned. I thought this would achieve this type of filter, but I must be doing it wrong.
Here is how the table for the followers/following relation appears in my db:
And here is the data that would be returned from the above query:
[
{
"id": "138",
"followers": [
{
"id": "143"
}
]
},
{
"id": "140",
"followers": [
{
"id": "160"
},
{
"id": "136"
}
]
},
{
"id": "135",
"followers": []
},
{
"id": "136",
"followers": []
}
]
As you can see, users with no followers are returned as are users who I'm not already following but users who have multiple followers, including me (ID 160), are returned when they should be omitted.
Any advice would be greatly appreciated!

The reason users like 140 are returned even though you (160) follow them is that they have at least one other follower who isn't you, which means your where clause still matches one of their rows.
If you want to return only users you are not following, you can replace your left join and where clause with a where-not-exists clause. In knex that would look something like:
qb.whereNotExists(function() {
this.select(1)
.from('users_followers__users_followings')
.where('users_followers__users_followings.user_id', knex.ref('users-permissions_user.id'))
.where('users_followers__users_followings.follower_id', id);
});
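The difference between the two filters can be reproduced without a database. The sketch below is plain JavaScript, not knex; it models the join table in memory using the ids from the question, and it assumes the original query was a left join filtered on follower_id, as the explanation above implies.

```javascript
// Rows of the join table: user_id is the followed user,
// follower_id is the user doing the following.
const follows = [
  { user_id: "138", follower_id: "143" },
  { user_id: "140", follower_id: "160" },
  { user_id: "140", follower_id: "136" },
];
const users = ["135", "136", "138", "140"];
const me = "160";

// Join-then-filter: a user survives if it has no rows at all, or if ANY
// of its rows has a different follower. 140 slips through via follower 136.
const joinFilter = users.filter((u) => {
  const rows = follows.filter((f) => f.user_id === u);
  return rows.length === 0 || rows.some((f) => f.follower_id !== me);
});

// NOT EXISTS: a user survives only if NO row links that user to me.
const notExists = users.filter(
  (u) => !follows.some((f) => f.user_id === u && f.follower_id === me)
);

console.log(joinFilter); // [ '135', '136', '138', '140' ]
console.log(notExists); // [ '135', '136', '138' ]
```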

Need to convert this SQL query to MongoDB

I am new to MongoDB. I need to convert this SQL code to MongoDB
select TOP 5 r.regionName, COUNT(c.RegionID)
from region as r,
company as c
where c.RegionID = r._id
group by r.regionName
order by COUNT(c.RegionID) DESC;

Option 1. You can use the aggregation framework with the $lookup, $group, $project, $sort and $limit stages, but this seems like the wrong approach: the real benefit of moving from a relational database to MongoDB is denormalization and avoiding join-like ($lookup) queries.
Option 2. Convert your multi-table relational schema to a document model and use a simple aggregation query with $group, $project, $sort and $limit stages.
Since you have not provided any example MongoDB documents, it is hard to show what your queries would look like.

Despite my comment, here is an attempted translation (not tested):
db.region.aggregate([
  {
    $lookup: // left outer join collections
    {
      from: "company",
      localField: "_id",
      foreignField: "RegionID",
      as: "c"
    }
  },
  { $match: { c: { $ne: [] } } }, // remove non-matching documents (i.e. INNER JOIN)
  { $group: { _id: "$regionName", count: { $sum: { $size: "$c" } } } }, // count joined companies per region name
  { $project: { regionName: "$_id", count: 1, _id: 0 } }, // some cosmetics
  { $sort: { count: -1 } }, // ORDER BY COUNT(...) DESC
  { $limit: 5 } // TOP 5
])
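As a reference for checking any translation, the result the SQL query asks for can be computed in memory. The sketch below is plain JavaScript with made-up sample data (the region names and counts are invented for illustration), and topRegions is a hypothetical helper name.

```javascript
// Made-up sample data; field names follow the SQL in the question.
const regions = [
  { _id: 1, regionName: "North" },
  { _id: 2, regionName: "South" },
  { _id: 3, regionName: "East" }, // no companies, so the inner join drops it
];
const companies = [{ RegionID: 1 }, { RegionID: 2 }, { RegionID: 2 }];

// topRegions is a hypothetical helper mirroring the SQL step by step.
function topRegions(regions, companies, limit = 5) {
  const counts = new Map();
  for (const r of regions) {
    // WHERE c.RegionID = r._id
    const n = companies.filter((c) => c.RegionID === r._id).length;
    if (n === 0) continue; // inner join: regions without companies vanish
    // GROUP BY r.regionName, COUNT(c.RegionID)
    counts.set(r.regionName, (counts.get(r.regionName) || 0) + n);
  }
  return [...counts]
    .map(([regionName, count]) => ({ regionName, count }))
    .sort((a, b) => b.count - a.count) // ORDER BY COUNT(...) DESC
    .slice(0, limit); // TOP 5
}

console.log(JSON.stringify(topRegions(regions, companies)));
// [{"regionName":"South","count":2},{"regionName":"North","count":1}]
```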

How to select a single item and get its relations in FaunaDB?

I have two collections which have the data in the following format
{
"ref": Ref(Collection("Leads"), "267824207030650373"),
"ts": 1591675917565000,
"data": {
"notes": "voicemail ",
"source": "key-name",
"name": "Glenn"
}
}
{
"ref": Ref(Collection("Sources"), "266777079541924357"),
"ts": 1590677298970000,
"data": {
"key": "key-name",
"value": "Google Ads"
}
}
I want to query the Leads collection and retrieve the corresponding Sources document in a single query.
I came up with the following query, which tries to use an index, but I couldn't get it to run:
Let(
{
data: Get(Ref(Collection('Leads'), '267824207030650373'))
},
{
data: Select(['data'],Var('data')),
source: q.Lambda('data',
Match(Index('LeadSourceByKey'), Get(Select(['source'], Var('data') )) )
)
}
)
Is there an easy way to retrieve the Sources document ?

What you are looking for is the following query which I broke down for you in multiple steps:
Let(
{
// Get the Lead document
lead: Get(Ref(Collection("Leads"), "269038063157510661")),
// Get the source key out of the lead document
sourceKey: Select(["data", "source"], Var("lead")),
// use the index to get the values via match
sourceValues: Paginate(Match(Index("LeadSourceValuesByKey"), Var("sourceKey")))
},
{
lead: Var("lead"),
sourceValues: Var("sourceValues")
}
)
The result is:
{
lead: {
ref: Ref(Collection("Leads"), "269038063157510661"),
ts: 1592833540970000,
data: {
notes: "voicemail ",
source: "key-name",
name: "Glenn"
}
},
sourceValues: {
data: [["key-name", "Google Ads"]]
}
}
sourceValues.data is an array of arrays. Each index entry is itself an array because your index is defined to return two values, the key and the value. And since your Match could return several entries if the relation were not one-to-one, those entries are wrapped in an outer array.
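To make that shape concrete, a client could unpack the page as in the sketch below. It is plain JavaScript working on the result object shown above; no Fauna driver calls are involved, and the variable names are only illustrative.

```javascript
// Result object copied from the query output above (only the part needed here).
const result = { sourceValues: { data: [["key-name", "Google Ads"]] } };

// Outer array: one element per matching index entry.
// Inner array: the values the index returns for that entry (key, then value).
const sources = result.sourceValues.data.map(([key, value]) => ({ key, value }));

console.log(sources); // [ { key: 'key-name', value: 'Google Ads' } ]
```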
This is only one approach, you could also make the index return a reference and Map/Get to get the actual document as explained on the forum.
However, I assume you asked the same question here. Although I applaud asking questions on Stack Overflow rather than Slack or even our own forum, please do not post the same question everywhere without linking to the others. Otherwise many people spend time on a question that has already been answered elsewhere.

You could change the Leads document so that source holds the Ref of the Sources document:
{
"ref": Ref(Collection("Leads"), "267824207030650373"),
"ts": 1591675917565000,
"data": {
"notes": "voicemail ",
"source": Ref(Collection("Sources"), "266777079541924357"),
"name": "Glenn"
}
}
{
"ref": Ref(Collection("Sources"), "266777079541924357"),
"ts": 1590677298970000,
"data": {
"key": "key-name",
"value": "Google Ads"
}
}
And then query this way:
Let(
{
lead: Select(['data'],Get(Ref(Collection('Leads'), '267824207030650373'))),
source:Select(['source'],Var('lead'))
},
{
data: Var('lead'),
source: Select(['data'],Get(Var('source')))
}
)

How to convert sql query with exist into mongodb query

I have two collections in MongoDB: percentages and items. I'm comfortable with SQL and can write the PL/SQL query below, but I can't convert it to a MongoDB query because I'm just starting out with MongoDB. I know I have to use $gt for the comparison conditions, but I don't know how to express NOT EXISTS or UNION in MongoDB. How can I write this as a MongoDB query, and which keywords should I search for?
select p.*, "to_top" as list
from percentages p
where p.percentage > 5
and p.updatetime > sysdate - 1/24
and not exists (select 1
from items i
where i.id = p.p_id
and i.seller = p.seller)
order by p.percentage desc
union
select p2.*, "to_bottom" as list
from percentages p2
where p2.percentage > 5
and p2.updatetime > sysdate - 1/24
and exists (select 1
from items i2
where i2.id = p2.p_id
and i2.seller = p2.seller)
order by p2.percentage desc

MongoDB has no direct UNION before version 4.4 (which added the $unionWith stage). Luckily, both queries run on the same collection and have very similar conditions, so we can write a single query the "Mongo way".
Explanation
Normally, almost all complex SQL queries are done with the MongoDB aggregation framework.
We filter documents by percentage and updatetime. $expr is needed so that the date comparison can use aggregation expressions such as $subtract and $$NOW.
An SQL JOIN or subquery is done with the $lookup operator.
SQL SYSDATE corresponds to the $$NOW or $$CLUSTER_TIME variable in MongoDB.
db.percentages.aggregate([
{
$match: {
percentage: { $gt: 5 },
$expr: {
$gt: [
"$updatetime",
{
$subtract: [
ISODate("2020-06-14T13:00:00Z"), //Change to $$NOW or $$CLUSTER_TIME
3600000
]
}
]
}
}
},
{
$lookup: {
from: "items",
let: {
p_id: "$p_id",
seller: "$seller"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{
$eq: [ "$$p_id", "$id"]
},
{
$eq: [ "$$seller", "$seller"]
}
]
}
}
},
{
$limit: 1
}
],
as: "items"
}
},
{
$addFields: {
list: {
$cond: [
{
$eq: [{$size: "$items"}, 0]
},
"to_top",
"to_bottom"
]
},
items: "$$REMOVE"
}
},
{
$sort: { percentage: -1 }
}
])
Note: The MongoDB aggregation framework has the $facet stage, which lets you run several different pipelines on the same collection.
SCHEMA:
db.percentages.aggregate([
{$facet:{
q1:[...],
q2:[...],
}},
//"UNION": concatenate the result documents of each pipeline into a single array
{$project:{
data:{$concatArrays:["$q1","$q2"]}
}},
//Flatten the array into one document per element
{$unwind:"$data"},
//Replace the top-level document
{$replaceWith:"$data"}
])

Why don't you import your MongoDB data into Oracle and use SQL? (It is easier and more powerful than MongoDB.)

Creating a couchdb view to index if item in an array exists

I have the following sample documents in my couchdb. The original table in production has about 2M records.
[
  {
    "_id": "someid|goes|here",
    "collected": {
      "tags": ["abc", "def", "ghi"]
    }
  },
  {
    "_id": "someid1|goes|here",
    "collected": {
      "tags": ["abc", "klm", "pqr"]
    }
  },
  {
    "_id": "someid2|goes|here",
    "collected": {
      "tags": ["efg", "hij", "klm"]
    }
  }
]
Based on my previous question here, how to search for values when the selector is an array,
I currently have an index added for the collected.tags field, but the search is still taking a long time. Here is the search query I have.
{
"selector": {
"collected.tags": {
"$elemMatch": {
"$regex": "abc"
}
}
}
}
About 300k records match the above condition, and the search takes a long time. So I want to create an indexed view to look up the documents faster instead of using find/search. I am new to CouchDB and am not sure how to set up the map function to create the indexed view.

Figured the map function out myself. Now all the documents are indexed and retrievals are faster
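function (doc) {
  // Guard against documents without collected.tags so the map function does not throw on them
  if (doc.collected && Array.isArray(doc.collected.tags) &&
      doc.collected.tags.indexOf('abc') > -1) {
    emit(doc._id, doc);
  }
}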
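The map function above hard-codes the tag 'abc'. A possible generalisation, sketched below under stated assumptions, is to emit one row per tag so that the same view can answer lookups for any tag with a key parameter (for example ?key="abc"). Note that this matches whole tags exactly, unlike the $regex in the original selector, which also matches substrings. The emit stub, the rows array and the loop at the bottom exist only so the sketch can be run outside CouchDB; inside a design document only the body of map is needed.

```javascript
// Local stand-ins for CouchDB's view engine, for testing only.
const rows = [];
let currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key, value });
}

// Hypothetical alternative map function: one index row per tag.
function map(doc) {
  if (doc.collected && Array.isArray(doc.collected.tags)) {
    doc.collected.tags.forEach(function (tag) {
      // A null value keeps the index small; documents can be fetched
      // at query time with include_docs=true.
      emit(tag, null);
    });
  }
}

// Sample documents from the question, plus one without tags.
const docs = [
  { _id: "someid|goes|here", collected: { tags: ["abc", "def", "ghi"] } },
  { _id: "someid1|goes|here", collected: { tags: ["abc", "klm", "pqr"] } },
  { _id: "someid2|goes|here", collected: { tags: ["efg", "hij", "klm"] } },
  { _id: "no-tags" },
];
for (const doc of docs) {
  currentId = doc._id;
  map(doc);
}

// Equivalent of querying the view with key="abc".
const abcIds = rows.filter((r) => r.key === "abc").map((r) => r.id);
console.log(abcIds); // [ 'someid|goes|here', 'someid1|goes|here' ]
```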
