Mongo Group By query - sql

I have data stored in a Mongo collection that is structured like this:
{
"numberAtPending" : 3,
"numberAtInProgress" : 5,
"numberAtCancelled" : 1,
"numberAtShipped" : 50,
"timeOfRequest" : ISODate("2022-01-10T12:52:15.813Z"),
"requestingSupplier" : "SUPPLIER_1",
},
{
"numberAtPending" : 5,
"numberAtInProgress" : 3,
"numberAtCancelled" : 4,
"numberAtShipped" : 35,
"timeOfRequest" : ISODate("2022-01-15T09:11:02.992Z"),
"requestingSupplier" : "SUPPLIER_1",
},
{
"numberAtPending" : 12,
"numberAtInProgress" : 3,
"numberAtCancelled" : 1,
"numberAtShipped" : 21,
"timeOfRequest" : ISODate("2022-01-10T14:21:55.221Z"),
"requestingSupplier" : "SUPPLIER_2",
}
I wish to construct a query that would let me sum up each count in each entry and group by requestingSupplier.
For example, for January 2022 I would like to get the sum of each count per supplier, with a response similar to:
"TotalNumberAtPending": 300
"TotalNumberAtInProgress" : 150,
"TotalNumberAtCancelled" : 70,
"TotalNumberAtShipped" : 400
"Supplier" : "SUPPLIER_1",
"TotalNumberAtPending": 230
"TotalNumberAtInProgress" : 110,
"TotalNumberAtCancelled" : 40,
"TotalNumberAtShipped" : 300
"Supplier" : "SUPPLIER_2",
Any help most appreciated!
thanks and regards

You can try this query. (I'm assuming the output you show is only an example and not the real values, because I can't see where 300, 150, 400... would come from.)
You have two options for matching the documents of a given month.
If you want to pass the month and the year as numbers, use $expr and $eq with $month and $year:
{
"$match": {
"$expr": {
"$and": [
{
"$eq": [
{
"$month": "$timeOfRequest"
},
1
]
},
{
"$eq": [
{
"$year": "$timeOfRequest"
},
2022
]
}
]
}
}
}
Or you can match by date range. To get all documents from January 2022, use this $match stage, where the range runs from the first instant of January (inclusive) to the first instant of February (exclusive), so it covers everything up to the end of January. Unlike the $expr version, a plain range condition like this can use an index on timeOfRequest.
{
"$match": {
"timeOfRequest": {
"$gte": ISODate("2022-01-01T00:00:00Z"),
"$lt": ISODate("2022-02-01T00:00:00Z")
}
}
}
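If the month and year arrive as numbers, the same half-open range can be built rather than hard-coded. This is only a sketch: monthRange is a hypothetical helper name, and it assumes the dates are meant to be interpreted in UTC.

```javascript
// Build the half-open [start, end) range for one calendar month, in UTC.
// monthRange is a hypothetical helper; month is 1-based (1 = January).
function monthRange(year, month) {
  return {
    $gte: new Date(Date.UTC(year, month - 1, 1)), // first instant of the month
    $lt: new Date(Date.UTC(year, month, 1)),      // first instant of the next month
  };
}

// $match stage equivalent to the hard-coded January 2022 range above.
const matchStage = { $match: { timeOfRequest: monthRange(2022, 1) } };
```

Date.UTC rolls a month index of 12 over into the next year, so December needs no special case.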
With the filter in place, you only need a $group stage like this to generate the desired fields:
{
"$group": {
"_id": "$requestingSupplier",
"TotalNumberAtPending": {
"$sum": "$numberAtPending"
},
"TotalNumberAtInProgress": {
"$sum": "$numberAtInProgress"
},
"TotalNumberAtCancelled": {
"$sum": "$numberAtCancelled"
},
"TotalNumberAtShipped": {
"$sum": "$numberAtShipped"
},
"Supplier": {
"$first": "$requestingSupplier"
}
}
}
Example here and here
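For reference, here is a sketch of the range $match and the $group combined into one pipeline, run against the three sample documents from the question. The sumBySupplier function is a plain-JavaScript stand-in written for this illustration (it is not a MongoDB API), and ISODate(...) is replaced with new Date(...) so the snippet also runs outside the shell.

```javascript
// Documents from the question; ISODate(...) is written as a plain JS Date here.
const docs = [
  { numberAtPending: 3, numberAtInProgress: 5, numberAtCancelled: 1, numberAtShipped: 50,
    timeOfRequest: new Date("2022-01-10T12:52:15.813Z"), requestingSupplier: "SUPPLIER_1" },
  { numberAtPending: 5, numberAtInProgress: 3, numberAtCancelled: 4, numberAtShipped: 35,
    timeOfRequest: new Date("2022-01-15T09:11:02.992Z"), requestingSupplier: "SUPPLIER_1" },
  { numberAtPending: 12, numberAtInProgress: 3, numberAtCancelled: 1, numberAtShipped: 21,
    timeOfRequest: new Date("2022-01-10T14:21:55.221Z"), requestingSupplier: "SUPPLIER_2" },
];

const from = new Date("2022-01-01T00:00:00Z");
const to = new Date("2022-02-01T00:00:00Z");

// The two stages as one pipeline, to be passed to aggregate() on your collection.
const supplierPipeline = [
  { $match: { timeOfRequest: { $gte: from, $lt: to } } },
  { $group: {
      _id: "$requestingSupplier",
      TotalNumberAtPending: { $sum: "$numberAtPending" },
      TotalNumberAtInProgress: { $sum: "$numberAtInProgress" },
      TotalNumberAtCancelled: { $sum: "$numberAtCancelled" },
      TotalNumberAtShipped: { $sum: "$numberAtShipped" },
  } },
];

// Plain-JS stand-in for that pipeline, only to show the shape of the result.
function sumBySupplier(documents) {
  const totals = new Map();
  for (const d of documents) {
    if (d.timeOfRequest < from || d.timeOfRequest >= to) continue; // $match
    const t = totals.get(d.requestingSupplier) || {
      _id: d.requestingSupplier,
      TotalNumberAtPending: 0, TotalNumberAtInProgress: 0,
      TotalNumberAtCancelled: 0, TotalNumberAtShipped: 0,
    };
    t.TotalNumberAtPending += d.numberAtPending;       // $sum per field
    t.TotalNumberAtInProgress += d.numberAtInProgress;
    t.TotalNumberAtCancelled += d.numberAtCancelled;
    t.TotalNumberAtShipped += d.numberAtShipped;
    totals.set(d.requestingSupplier, t);
  }
  return [...totals.values()];
}

const result = sumBySupplier(docs);
```

With the sample data, SUPPLIER_1 ends up with 8 pending, 8 in progress, 5 cancelled and 85 shipped, and SUPPLIER_2 with 12, 3, 1 and 21.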

Related

How do I sum up two fields in the Mongoshell?

I have a find statement in a store database that looks like this:
db.Purchases.find( {}, { store: 1, total: 1, _id: 0 } ).sort( { "store" : 1} ) =
{ "store" : DBRef("Location", ObjectId("5dae22702486f7d89ba7633c")), "total" : "$1500" }
{ "store" : DBRef("Location", ObjectId("5dae227f2486f7d89ba7633d")), "total" : "$156.88" }
{ "store" : DBRef("Location", ObjectId("5dae22992486f7d89ba7633e")), "total" : "$1510" }
{ "store" : DBRef("Location", ObjectId("5dae22992486f7d89ba7633e")), "total" : "$3000" }
{ "store" : DBRef("Location", ObjectId("5dae22cd2486f7d89ba76340")), "total" : "$156.88" }
I need to sum the totals from output 3 and 4 (i.e $1510 and $3000) and display the result as one line in the output along all the other outputs. How do I do this?
Try this:
db.Purchases.aggregate(
[
{
$group:
{
_id: "$store",
totalAmount: { $sum: "$total"}
}
}
]
)
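Note: total must be a numeric type in MongoDB. $sum ignores non-numeric values, so with totals stored as strings such as "$1500" the result would be 0 until they are converted to (or stored as) numbers.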
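If the totals have to stay as strings, one option is to convert them inside the pipeline. The sketch below assumes MongoDB 4.0 or newer (for $ltrim and $toDouble) and that every total is a string with a single leading "$"; parseAmount is a hypothetical plain-JavaScript helper that mirrors the conversion.

```javascript
// Strip the leading "$", convert to a number, then sum per store.
// { $literal: "$" } is needed because a bare "$" string would be read as a field path.
const totalsPipeline = [
  { $group: {
      _id: "$store",
      totalAmount: {
        $sum: { $toDouble: { $ltrim: { input: "$total", chars: { $literal: "$" } } } },
      },
  } },
];

// Plain-JS equivalent of the conversion, for illustration only.
function parseAmount(total) {
  return Number(total.replace(/^\$/, ""));
}

// The two purchases that share a store in the question.
const combined = parseAmount("$1510") + parseAmount("$3000");
```

Storing the amounts as numbers in the first place avoids the conversion entirely.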

MongoDB Array Filters

I am trying to update a nested array using array filters. To get some hands-on practice I started with a basic array filter update query, copied from the MongoDB tutorial, but I am getting this error: "No array filter found for identifier 'elem' in path 'grades.$[elem].mean'".
I am using db version v4.0.2 and MongoDB shell version v4.0.2.
Here are my collection details:
{
"_id" : 1,
"grades" : [
{
"grade" : 80,
"mean" : 75,
"std" : 6
},
{
"grade" : 85,
"mean" : 90,
"std" : 4
},
{
"grade" : 85,
"mean" : 85,
"std" : 6
}
]
}
//End of First Record
{
"_id" : 2,
"grades" : [
{
"grade" : 90,
"mean" : 75,
"std" : 6
},
{
"grade" : 87,
"mean" : 90,
"std" : 3
},
{
"grade" : 85,
"mean" : 85,
"std" : 4
}
]
}
//End of Second record
Update query:
db.getCollection('students2').update(
{ },
{ $set: { "grades.$[elem].mean" : 100 } },
{
multi: true,
arrayFilters: [ { "elem.grade": { $gte: 85 } } ]
}
)
It throws the error:
No array filter found for identifier 'elem' in path 'grades.$[elem].mean'
Read the comments of this StackOverflow issue:
arrayFilters not working
It doesn't work in "older shells". I'm using the Robo 3T client and running into the same issue. The shell is apparently removing the arrayFilters object.
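One possible workaround, assuming the server supports arrayFilters (3.6 or newer) and only the client's update() helper is dropping the option, is to send the same update as a raw database command with db.runCommand(updateCommand). This is a sketch and has not been checked against Robo 3T specifically; applyFilter is a plain-JavaScript illustration of which elements the filter selects, not a MongoDB API.

```javascript
// The same update expressed as the "update" database command.
const updateCommand = {
  update: "students2",
  updates: [
    {
      q: {},                                          // match all documents
      u: { $set: { "grades.$[elem].mean": 100 } },
      multi: true,
      arrayFilters: [{ "elem.grade": { $gte: 85 } }], // binds the elem identifier
    },
  ],
};

// Plain-JS illustration of the effect on one document.
function applyFilter(doc) {
  return {
    ...doc,
    grades: doc.grades.map((g) => (g.grade >= 85 ? { ...g, mean: 100 } : g)),
  };
}

const updated = applyFilter({
  _id: 1,
  grades: [
    { grade: 80, mean: 75, std: 6 },
    { grade: 85, mean: 90, std: 4 },
    { grade: 85, mean: 85, std: 6 },
  ],
});
```

For the first sample record only the two elements with grade 85 get mean set to 100; the element with grade 80 is left alone.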

mongodb aggregate distinct count

I realise this topic has been asked many times, but the advice hasn't helped me solve this problem.
The following query is trying to determine the presence of sales on a given weekday using ISODay. Because the query will be run at the start of the month, I need to know how many occurrences of the specific ISOday occur in the month.
var query = { eventType: 'Sale', site : 4, tank: 1, txnDate : { "$gt" : new Date('2018-08-01T00:00:00') } };
db.tankevent.aggregate([
{ $match: query },
{ $project : {
isoDay: { $isoDayOfWeek: "$txnDate" },
dayDate: { $dateToString: { format: "%d", date:"$txnDate" } }
}
},
{ $group:
{ _id : { isoday: "$isoDay", dday: "$dayDate" }, count: { "$sum" : 1 } }
},
{ $sort: { "_id.isoday": 1, "_id.dday": 1 } }
])
provides the following output
/* 1 */
{
"_id" : {
"isoday" : 1,
"dday" : "06"
},
"count" : 62.0
}
/* 2 */
{
"_id" : {
"isoday" : 1,
"dday" : "13"
},
"count" : 69.0
}
/* 3 */
{
"_id" : {
"isoday" : 1,
"dday" : "20"
},
"count" : 72.0
}
/* 4 */
{
"_id" : {
"isoday" : 2,
"dday" : "07"
},
"count" : 75.0
}
I am trying to have "count" represent the number of unique "dday" records, so using the output above, I want count to be 3 for isoDay = 1. At the moment count reports the number of sales events that occurred for each group combination.
All you need to do is group twice: first by (isoday, dday) so that each distinct day appears once, then by isoday alone to count those days.
db.tankevent.aggregate([
{ $match: query },
{ $project : {
isoDay: { $isoDayOfWeek: "$txnDate" },
dayDate: { $dateToString: { format: "%d", date:"$txnDate" } }
}
},
{ $group:
{ _id : { isoday: "$isoDay", dday: "$dayDate" }, count: { "$sum" : 1 } }
},
{ $project : {
isoDay_Final: "$_id.isoday"
}
},
{ $group:
{ _id : "$isoDay_Final", count: { "$sum" : 1 } }
},
{ $sort: { "_id": 1 } }
])
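The idea behind the double grouping can be modelled in plain JavaScript. This is only an illustration with made-up sale timestamps (not a MongoDB API): the first step collapses multiple sales on the same day, the second counts the distinct days per ISO weekday.

```javascript
// Hypothetical sale timestamps in August 2018; the 6th has two sales.
const sales = [
  "2018-08-06T09:00:00Z", "2018-08-06T15:30:00Z", // Monday the 6th
  "2018-08-13T10:00:00Z",                          // Monday the 13th
  "2018-08-20T11:00:00Z",                          // Monday the 20th
  "2018-08-07T12:00:00Z",                          // Tuesday the 7th
].map((s) => new Date(s));

// Step 1 (first $group): distinct day-of-month values per ISO weekday.
const daysByIsoDay = new Map();
for (const d of sales) {
  const isoDay = d.getUTCDay() === 0 ? 7 : d.getUTCDay(); // ISO: Monday = 1 ... Sunday = 7
  if (!daysByIsoDay.has(isoDay)) daysByIsoDay.set(isoDay, new Set());
  daysByIsoDay.get(isoDay).add(d.getUTCDate());
}

// Step 2 (second $group plus $sort): count the distinct days per ISO weekday.
const counts = [...daysByIsoDay]
  .map(([isoDay, days]) => ({ _id: isoDay, count: days.size }))
  .sort((a, b) => a._id - b._id);
```

Here Monday (1) gets a count of 3 and Tuesday (2) a count of 1, regardless of how many sales happened on each day.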

how to count number of keys in embedded mongodb document

I have a mongodb query: (Give me the settings where account='test1')
db.collection_name.find({"account" : "test1"}, {settings : 1}).pretty();
where I get the following sample output:
{
"_id" : ObjectId("49830ede4bz08bc0b495f123"),
"settings" : {
"clusterData" : {
"us-south-1" : "cluster1",
"us-east-1" : "cluster2"
},
},
What I'm looking for now is the accounts where clusterData has more than one key.
I'm only interested in listing those accounts with two or more keys.
I've tried this, but it doesn't work:
db.collection_name.find({'settings.clusterData.1': {$exists: true}}, {account : 1}).pretty();
Is this possible to do with the current data structure? I don't have the option to redesign this schema.
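Your clusterData field is an embedded document, not an array, which is why you cannot simply filter on the number of elements it has. There is a way, though, to get what you want via the aggregation framework (it needs $objectToArray, available from MongoDB 3.4.4). Try this:
db.collection_name.aggregate([{
    // Optional: restrict to one account; drop this stage to check every account.
    $match: {
        "account" : "test1"
    }
}, {
    // Turn clusterData into an array of key/value pairs and take its length.
    $project: {
        "settingsAsArraySize": { $size: { $objectToArray: "$settings.clusterData" } },
        "settings.clusterData": 1
    }
}, {
    // Keep only documents with more than one key.
    $match: {
        "settingsAsArraySize": { $gt: 1 }
    }
}, {
    $project: {
        "_id": 0,
        "settings.clusterData": 1
    }
}]).pretty();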
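What the $objectToArray and $size combination computes is simply the number of keys in the embedded document. A plain-JavaScript model (not a MongoDB API) with one made-up second account for contrast:

```javascript
// The first account mirrors the sample output; the second is hypothetical.
const accounts = [
  { account: "test1", settings: { clusterData: { "us-south-1": "cluster1", "us-east-1": "cluster2" } } },
  { account: "test2", settings: { clusterData: { "us-south-1": "cluster1" } } },
];

// Keep accounts whose clusterData has more than one key,
// the same test as $size of $objectToArray being greater than 1.
const multiCluster = accounts
  .filter((a) => Object.keys(a.settings.clusterData).length > 1)
  .map((a) => a.account);
```

Only test1 survives the filter, since test2 has a single cluster entry.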

Why are items apparently duplicated in my mongoDB database when I use find()?

I am going through the try.mongodb.org tutorial on their website (embedded terminal emulator on the webpage). I am on items t4 and t5 (you type tx for items in the tutorial).
In t4 we populate a database.
> t4. Saving and Querying
> Try adding some documents to the scores collection:
> for(i=0; i<10; i++) { db.scores.save({a: i, exam: 5}) };
>
> Try that, then enter
> db.scores.find();
> to see if the save succeeded. Since the shell only displays 10 results at time,
> you'll need to enter the 'it' command to iterate over the rest.
>
> (enter 'next' when you're ready)
I made exam 5 + i just for fun:
for(i=0; i<10; i++) { db.scores.save({a: i, exam: 5+i}) };
So what is in the database? I type in db.scores.find(); and get the following, which is what I had expected, although the order seems random. Fine.
>
[
{ "exam" : 14, "a" : 9, "_id" : { "$oid" : "52b1d16bcc937439340649c4" } },
{ "exam" : 5, "a" : 0, "_id" : { "$oid" : "52b1d191cc937439340649c5" } },
{ "exam" : 6, "a" : 1, "_id" : { "$oid" : "52b1d191cc937439340649c6" } },
{ "exam" : 7, "a" : 2, "_id" : { "$oid" : "52b1d191cc937439340649c7" } },
{ "exam" : 8, "a" : 3, "_id" : { "$oid" : "52b1d191cc937439340649c8" } },
{ "exam" : 10, "a" : 5, "_id" : { "$oid" : "52b1d191cc937439340649c9" } },
{ "exam" : 9, "a" : 4, "_id" : { "$oid" : "52b1d191cc937439340649ca" } },
{ "exam" : 11, "a" : 6, "_id" : { "$oid" : "52b1d191cc937439340649cb" } },
{ "exam" : 12, "a" : 7, "_id" : { "$oid" : "52b1d191cc937439340649cc" } },
{ "exam" : 13, "a" : 8, "_id" : { "$oid" : "52b1d191cc937439340649cd" } }
]
In t5 we search for items in that database:
>
5. Basic Queries You've already tried a few queries, but let's make them more specific. How about finding all documents where a == 2:
db.scores.find({a: 2});
Or what about documents where a > 15? db.scores.find({a: {'$gt': 15}});
The a== 2 search worked, but the > 15 one did not. First of all, based on item t4, there should be no entry for a greater than 15.
So I try greater than 6: db.scores.find({a: {'$gt': 6}});
And I get the following output, which is really surprising to me since there should only be 3 entries for a == 7, a == 8, and a == 9.
>
[
{ "exam" : 14, "a" : 9, "_id" : { "$oid" : "52b1d16bcc937439340649c4" } },
{ "exam" : 12, "a" : 7, "_id" : { "$oid" : "52b1d191cc937439340649cc" } },
{ "exam" : 13, "a" : 8, "_id" : { "$oid" : "52b1d191cc937439340649cd" } },
{ "exam" : 14, "a" : 9, "_id" : { "$oid" : "52b1d191cc937439340649ce" } },
{ "exam" : 12, "a" : 7, "_id" : { "$oid" : "52b1d1a8cc937439340649d6" } },
{ "exam" : 13, "a" : 8, "_id" : { "$oid" : "52b1d1a8cc937439340649d7" } },
{ "exam" : 14, "a" : 9, "_id" : { "$oid" : "52b1d1a8cc937439340649d8" } },
{ "exam" : 5, "a" : 7, "_id" : { "$oid" : "52b1d49fcc937439340649f1" } },
{ "exam" : 5, "a" : 9, "_id" : { "$oid" : "52b1d49fcc937439340649f3" } },
{ "exam" : 5, "a" : 8, "_id" : { "$oid" : "52b1d49fcc937439340649f4" } }
]
If you look at the initially outputted db.scores.find() id's on the right, the last character goes up with each entry -- 4, 5, 6, 7, 8, 9, a, b, c, d. But in the duplicated entries, take a look at the entries for a == 9. We have one ending in 4, one ending in e, and one ending in 3. It seems like in the brains of the operation the database has 30 entries, not 10.
{ "exam" : 14, "a" : 9, "_id" : { "$oid" : "52b1d16bcc937439340649c4" } },
{ "exam" : 14, "a" : 9, "_id" : { "$oid" : "52b1d191cc937439340649ce" } },
{ "exam" : 5, "a" : 9, "_id" : { "$oid" : "52b1d49fcc937439340649f3" } },
I noticed that if I try to repopulate the database using the loop in t4, it doesn't seem to rewrite the values, i.e. if I use for(i=0; i<10; i++) { db.scores.save({a: i, exam: 5}) }; as the example had suggested instead of my just-for-fun for(i=0; i<10; i++) { db.scores.save({a: i, exam: 5+i}) };. Not sure if that is helpful to diagnose the problem, but it is another observation.
You're missing something very important:
> I noticed that if I try to repopulate the database using the loop
> in t4 it doesn't seem to re-write the values. i.e. if I use for(i=0;
> i<10; i++) { db.scores.save({a: i, exam: 5}) }; as the example had
> suggested instead of my just for fun for(i=0; i<10; i++) {
> db.scores.save({a: i, exam: 5+i}) };. Not sure if that is helpful to
> diagnose the problem but it is another observation.
Running the loop more than once inserts 10 new documents every single time. db.scores.save doesn't know which document to update because the documents you pass have no _id field, so it always inserts. To update existing documents you would have to supply the _id values from the previous inserts. I'm sure you ran the loop more than once expecting to still have 10 documents, but each run added another 10.
Try removing the collection, running the loop once, and then executing your find; it will work.
Are you sure you didn't run the commands more than once? What do you see if you run db.scores.find().count()? That will tell you how many documents are in the collection.
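The behaviour can be illustrated with a toy in-memory collection. ToyCollection is made up for this sketch and is not the MongoDB driver; it only mimics the rule that a document without an _id is inserted while one with a known _id replaces the existing document.

```javascript
// Toy model of save(): insert when _id is missing, replace when it is known.
class ToyCollection {
  constructor() {
    this.docs = new Map();
    this.nextId = 1;
  }
  save(doc) {
    const _id = doc._id ?? this.nextId++; // generate an _id only if none was given
    this.docs.set(_id, { ...doc, _id });
    return _id;
  }
  count() {
    return this.docs.size;
  }
}

const scores = new ToyCollection();
for (let i = 0; i < 10; i++) scores.save({ a: i, exam: 5 });     // first run: 10 inserts
for (let i = 0; i < 10; i++) scores.save({ a: i, exam: 5 + i }); // second run: 10 more inserts
const afterTwoRuns = scores.count();

scores.save({ _id: 1, a: 0, exam: 99 }); // known _id: replaces, does not insert
const afterReplace = scores.count();
```

After two runs of the loop the toy collection holds 20 documents, and the count only stops growing once an existing _id is supplied.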