MongoDB: How to retrieve data that is newly constructed instead of the original documents in the collection?

I have a collection in which all documents have this format:
{"user_id": ObjectId, "book_id": ObjectId}
Each document represents a relationship between a user and a book. The relationship is one-to-many: a user can have more than one book.
Now I have three book_id values, for example:
["507f191e810c19729de860ea", "507f191e810c19729de345ez", "507f191e810c19729de860efr"]
I want to find the users who have all three of these books. The result I want is not the documents in this collection but a newly constructed array of user_id values. It seems complicated and I have no idea how to write the query. Please help.
NOTE:
The reason I didn't use a structure like:
{"user_id": ObjectId, "book_ids": [ObjectId, ...]}
is that in my system books are added frequently and without limit. A user may read thousands of books, so I think it's better to store the relationship the traditional way.
This question is not restricted to MongoDB; answers in terms of relational databases are welcome.

Using a regular find, you cannot get back the user_id of every user who owns all of the given book_id values, because the collection is normalized (flattened): each document holds a single user/book pair.
You can do it with the aggregation framework:
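db.collection.aggregate([
  // Keep only the documents for one of the three books.
  // If book_id is stored as an ObjectId, wrap each value in ObjectId(...).
  {
    $match: {
      book_id: {
        $in: [
          "507f191e810c19729de860ea",
          "507f191e810c19729de345ez",
          "507f191e810c19729de860efr"
        ]
      }
    }
  },
  // Count the matching documents per user.
  {
    $group: {
      _id: "$user_id",
      count: { $sum: 1 }
    }
  },
  // Keep only the users who matched all three books.
  {
    $match: {
      count: 3
    }
  },
  // Collect the remaining user ids into a single array.
  {
    $group: {
      _id: null,
      users: { $addToSet: "$_id" }
    }
  }
]);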
The pipeline first keeps only the documents that match one of the three book_id values, then groups them by user_id and counts how many matches each user has. Users with a count of three pass to the last stage, which collects them into a single array of user_id values. This solution assumes that each (user_id, book_id) pair appears at most once in the collection.
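To make the counting logic concrete, here is a minimal in-memory sketch of the same steps in plain JavaScript. The sample data, the short string ids, and the helper name usersOwningAll are assumptions made for illustration; they are not part of the original collection or of MongoDB.

```javascript
// Hypothetical stand-in for the collection; plain strings replace ObjectIds.
const ownership = [
  { user_id: "u1", book_id: "b1" },
  { user_id: "u1", book_id: "b2" },
  { user_id: "u1", book_id: "b3" },
  { user_id: "u2", book_id: "b1" },
  { user_id: "u2", book_id: "b4" },
  { user_id: "u3", book_id: "b1" },
  { user_id: "u3", book_id: "b2" },
  { user_id: "u3", book_id: "b3" },
  { user_id: "u3", book_id: "b5" },
];

// Hypothetical helper mirroring the pipeline stages.
function usersOwningAll(docs, bookIds) {
  const wanted = new Set(bookIds);
  const counts = new Map();
  // First $match + $group: count the wanted books per user.
  for (const { user_id, book_id } of docs) {
    if (wanted.has(book_id)) {
      counts.set(user_id, (counts.get(user_id) || 0) + 1);
    }
  }
  // Second $match + final $group: keep users who matched every wanted book.
  return [...counts]
    .filter(([, count]) => count === wanted.size)
    .map(([userId]) => userId);
}

const owners = usersOwningAll(ownership, ["b1", "b2", "b3"]);
console.log(owners);
```

As with the pipeline, the count is only meaningful if each user/book pair is stored once; a duplicated pair would inflate the count.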

Related

Can I refer to the Row/Document internal variables when filtering in Prisma?

How can I use row/document variables in filters and sorting?
As you know, in SQL we can filter a join on conditions besides the foreign key, something like this:
Select * From A LEFT JOIN B on A.key = B.foreignKey AND B.key IN A.currentSelection
or even in a MongoDB $lookup:
collection('A').aggregate([{
  $lookup: {
    from: "B",
    localField: "key",
    foreignField: "foreignKey",
    let: { A_currentSelection: "$currentSelection" },
    pipeline: [{
      $match: {
        $expr: { $in: ["$key", "$$A_currentSelection"] }
      }
    }],
    as: "matches"
  }
},
])
But you can't do the following in Prisma:
prisma.A.findMany({
  include: {
    B: {
      where: {
        '$A.currentSelection': {
          has: "$B.key"
        }
      }
    }
  }
})
Regardless of the query itself, the idea is to access the current row/document's variables inside the query. I know I could restructure the database to get around this kind of issue, but the database already has a specific structure, changing it might break parts of the code, and it's not viable to change the structure just because Prisma is lacking in this area.
At first I used a raw query to get around this, and now I've created more complex relationships in the schema to handle it in Prisma, but if anyone knows a more elegant solution I'd be grateful.
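For reference, the condition that the SQL and $lookup versions express can be sketched as a post-filter in application code. This is only an illustration of the intended semantics, not a Prisma feature: the sample rows are made up, they are assumed to be shaped like the result of a plain include of B, and filterBySelection is a hypothetical helper.

```javascript
// Hypothetical rows, shaped like A records loaded together with their B relation.
const rowsOfA = [
  {
    key: 1,
    currentSelection: [10, 11],
    B: [{ key: 10, foreignKey: 1 }, { key: 12, foreignKey: 1 }],
  },
  {
    key: 2,
    currentSelection: [],
    B: [{ key: 13, foreignKey: 2 }],
  },
];

// Keep only the B rows whose key appears in the parent's currentSelection.
function filterBySelection(rows) {
  return rows.map((a) => ({
    ...a,
    B: a.B.filter((b) => a.currentSelection.includes(b.key)),
  }));
}

const filtered = filterBySelection(rowsOfA);
console.log(JSON.stringify(filtered, null, 2));
```

Filtering after the fetch loads more rows than the database-side versions, so it only suits relations that stay reasonably small.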

FaunaDB: how to fetch a custom column

I'm just learning FaunaDB and FQL and having some trouble (mainly because I come from MySQL). I can successfully query a table (eg: users) and fetch a specific user. This user has a property users.expiry_date which is a faunadb Time() type.
What I would like to do is know if this date has expired by using the function LT(Now(), users.expiry_date), but I don't know how to create this query. Do I have to create an Index first?
So in short, just fetching one of the users documents gets me this:
{
  id: 1,
  username: 'test',
  expiry_date: Time("2022-01-10T16:01:47.394Z")
}
But I would like to get this:
{
  id: 1,
  username: 'test',
  expiry_date: Time("2022-01-10T16:01:47.394Z"),
  has_expired: true,
}
I have this FQL query now (ignore oauthInfo):
Query(
  Let(
    {
      oauthInfo: Select(['data'], Get(Ref(Collection('user_oauth_info'), refId))),
      user: Select(['data'], Get(Select(['user_id'], Var('oauthInfo'))))
    },
    Merge({ oauthInfo: Var('oauthInfo') }, { user: Var('user') })
  )
)
How would I do the equivalent of the MySQL query SELECT users.*, IF(users.expiry_date < NOW(), 1, 0) as is_expired FROM users in FQL?

Your use of Let and Merge shows that you are thinking about FQL in a good way. These functions can go a long way toward making your queries more organized and readable!
I will start with some notes, but they will be relevant to the final answer, so please stick with me.
The Query function
https://docs.fauna.com/fauna/current/api/fql/functions/query
First, you should not need to wrap anything in the Query function here. Query is necessary for defining functions in FQL that will be run later, for example in the body of a User-Defined Function. You will always see it as Query(Lambda(...)).
Fauna IDs
https://docs.fauna.com/fauna/current/learn/understanding/documents
Remember that Fauna assigns unique IDs for every Document for you. When I see fields named id, that is a bit of a red flag, so I want to highlight that. There are plenty of reasons that you might store some business-ID in a Document, but be sure that you need it.
Getting an ID
A Document in Fauna is shaped like:
{
  ref: Ref(Collection("users"), "101"), // <-- "id" is 101
  ts: 1641508095450000,
  data: { /* ... */ }
}
In the JS driver you can read this id with documentResult.ref.id (other drivers offer similar ways).
You can access the ID directly in FQL as well, using the Select function.
Let(
  {
    user: Get(Select(['user_id'], Var('oauthInfo'))),
    id: Select(["ref", "id"], Var("user"))
  },
  Var("id")
)
More about the Select function.
https://docs.fauna.com/fauna/current/api/fql/functions/select
You are already using Select and that's the function you are looking for. It's what you use to grab any piece of an object or array.
Here's a contrived example that gets the zip code for the 3rd user in the Collection:
Let(
  {
    // Paginate returns Refs, so Get each one to read its data
    page: Map(
      Paginate(Documents(Collection("user"))),
      Lambda("ref", Get(Var("ref")))
    )
  },
  Select(["data", 2, "data", "address", "zip"], Var("page"))
)
Bring it together
That said, your Let function is a great start. Let's break things down into smaller steps.
Let(
  {
    oauthInfo_ref: Ref(Collection('user_oauth_info'), refId),
    oauthInfo_doc: Get(Var("oauthInfo_ref")),
    // make sure that user_oauth_info.user_id is a full Ref, not just a number
    user_ref: Select(["data", "user_id"], Var("oauthInfo_doc")),
    user_doc: Get(Var("user_ref")),
    user_id: Select("id", Var("user_ref")),
    // calculate expired: true once expiry_date is earlier than now,
    // matching the MySQL expression users.expiry_date < NOW()
    expiry_date: Select(["data", "expiry_date"], Var("user_doc")),
    has_expired: LT(Var("expiry_date"), Now())
  },
  // if the data does not overlap, Merge is not required.
  // you can build plain objects in FQL
  {
    oauthInfo: Var("oauthInfo_doc"), // entire Document
    user: Var("user_doc"), // entire Document
    has_expired: Var("has_expired") // an extra field
  }
)
Instead of returning the auth info and the user as separate fields, you can Merge them and/or add additional fields if you prefer:
  // ...
  Merge(
    Select("data", Var("user_doc")), // just the data
    {
      user_id: Var("user_id"), // added field
      has_expired: Var("has_expired") // added field
    }
  )
)
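The comparison direction is easy to get backwards, so it can help to see the same check outside FQL. The MySQL expression in the question, IF(users.expiry_date < NOW(), 1, 0), corresponds to the plain-JavaScript sketch below. The helper name withExpiry is hypothetical, and a JavaScript Date is assumed in place of a Fauna Time value.

```javascript
// Hypothetical helper: return a copy of the user with a computed has_expired flag.
// A user has expired when expiry_date is earlier than "now".
function withExpiry(user, now = new Date()) {
  return {
    ...user,
    has_expired: user.expiry_date.getTime() < now.getTime(),
  };
}

const user = {
  id: 1,
  username: "test",
  expiry_date: new Date("2022-01-10T16:01:47.394Z"),
};

// A fixed "now" keeps the example deterministic.
const checked = withExpiry(user, new Date("2022-02-01T00:00:00Z"));
console.log(checked.has_expired);
```

Passing the current time as a parameter, instead of reading the clock inside the function, makes the check easy to test.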

Prisma nested recursive relations depth

I must query a group and all of its subgroups from the same model.
However, when fetching from the Group table as shown below, Prisma doesn't include more than one level of depth in the resulting Subgroups relation (subgroups of subgroups are left out). The Subgroups attribute holds an array whose elements are of the same type as the model itself (recursive).
model Group {
  id        Int     @id @default(autoincrement())
  parentId  Int?
  Parent    Group?  @relation("parentId", fields: [parentId], references: [id])
  Subgroups Group[] @relation("parentId")
}
GroupModel.findFirst({
  where: { id: _id },
  include: { Subgroups: true }
});
I guess this might be some sort of safeguard to avoid infinite recursive models when generating results. Is there any way of dodging this limitation (if it's one), and if so, how?
Thanks

You can query subgroups more than one level deep by nesting include, like so:
GroupModel.findFirst({
  where: { id: _id },
  include: {
    Subgroups: { include: { Subgroups: { include: { Subgroups: true /* and so on... */ } } } }
  }
});
But, as mentioned by @TasinIshmam, something like includeRecursive is not supported by Prisma at the moment.
The workaround would be to use $queryRaw (https://www.prisma.io/docs/concepts/components/prisma-client/raw-database-access#queryraw) together with SQL recursive queries (https://www.postgresql.org/docs/current/queries-with.html#QUERIES-WITH-RECURSIVE).
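If a fixed maximum depth is acceptable, the nested include object can be generated instead of written by hand. The sketch below is plain JavaScript and does not call Prisma. buildSubgroupInclude is a hypothetical helper, and it assumes the relation field is named Subgroups as in the model above.

```javascript
// Hypothetical helper: build a nested `include` object for `depth` levels of Subgroups.
function buildSubgroupInclude(depth) {
  if (!Number.isInteger(depth) || depth < 1) {
    throw new RangeError("depth must be a positive integer");
  }
  // Innermost level: include the subgroups themselves, nothing deeper.
  let include = { Subgroups: true };
  // Wrap once per additional level.
  for (let level = 1; level < depth; level++) {
    include = { Subgroups: { include } };
  }
  return include;
}

console.log(JSON.stringify(buildSubgroupInclude(3)));
```

The result would be passed as the include option, for example GroupModel.findFirst({ where: { id: _id }, include: buildSubgroupInclude(3) }). Each extra level makes the query heavier, so for trees of unknown depth the recursive SQL approach remains the better fit.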

Is there a way to use the graphLookup aggregation pipeline stage for arrays?

I am currently working on an application that uses MongoDB as the data repository. I am mainly concerned about the graphLookup query to establish links between different people, based on what flights they took. My document contains an array field, that in turn contains key value pairs. I need to establish the links based on one of the key:value pairs of that array.
I have already tried some queries of aggregation pipeline with $graphLookup as one of the stages and they have all worked fine. But now that I am trying to use it with an array, I am hitting a blank.
Below is the array field from the first document:
"movementSegments":[
  {
    "carrierCode":"MO269",
    "departureDateTimeMillis":1550932676000,
    "arrivalDateTimeMillis":1551019076000,
    "departurePort":"DOH",
    "arrivalPort":"LHR",
    "departurePortText":"HAMAD INTERNATIONAL AIRPORT",
    "arrivalPortText":"LONDON HEATHROW",
    "serviceNameText":"",
    "serviceKey":"BA007_1550932676000",
    "departurePortLatLong":"25.273056,51.608056",
    "arrivalPortLatLong":"51.4706,-0.461941",
    "departureWeeklyTemporalSpatialWindow":"DOH_8",
    "departureMonthlyTemporalSpatialWindow":"DOH_2",
    "arrivalWeeklyTemporalSpatialWindow":"LHR_8",
    "arrivalMonthlyTemporalSpatialWindow":"LHR_2"
  }
]
The other document has the field below:
"movementSegments":[
  {
    "carrierCode":"MO269",
    "departureDateTimeMillis":1548254276000,
    "arrivalDateTimeMillis":1548340676000,
    "departurePort":"DOH",
    "arrivalPort":"LHR",
    "departurePortText":"HAMAD INTERNATIONAL AIRPORT",
    "arrivalPortText":"LONDON HEATHROW",
    "serviceNameText":"",
    "serviceKey":"BA003_1548254276000",
    "departurePortLatLong":"25.273056,51.608056",
    "arrivalPortLatLong":"51.4706,-0.461941",
    "departureWeeklyTemporalSpatialWindow":"DOH_4",
    "departureMonthlyTemporalSpatialWindow":"DOH_1",
    "arrivalWeeklyTemporalSpatialWindow":"LHR_4",
    "arrivalMonthlyTemporalSpatialWindow":"LHR_1"
  },
  {
    "carrierCode":"MO270",
    "departureDateTimeMillis":1548254276000,
    "arrivalDateTimeMillis":1548340676000,
    "departurePort":"DOH",
    "arrivalPort":"LHR",
    "departurePortText":"HAMAD INTERNATIONAL AIRPORT",
    "arrivalPortText":"LONDON HEATHROW",
    "serviceNameText":"",
    "serviceKey":"BA003_1548254276000",
    "departurePortLatLong":"25.273056,51.608056",
    "arrivalPortLatLong":"51.4706,-0.461941",
    "departureWeeklyTemporalSpatialWindow":"DOH_4",
    "departureMonthlyTemporalSpatialWindow":"DOH_1",
    "arrivalWeeklyTemporalSpatialWindow":"LHR_4",
    "arrivalMonthlyTemporalSpatialWindow":"LHR_1"
  }
]
And I am running the query below:
db.person_events.aggregate([
  { $match: { eventId: "22446688" } },
  {
    $graphLookup: {
      from: 'person_events',
      startWith: '$movementSegments.carrierCode',
      connectFromField: 'carrierCode',
      connectToField: 'carrierCode',
      as: 'carrier_connections'
    }
  }
])
The query creates the array field in the document, but the array is empty. I expected both of my documents to be linked by carrier code.
Just to be clear about the query: the documents contain an eventId field, and the $match stage returns one document.

Well, I don't know how I missed it, but here is the solution to my problem, which gives me the required results. Because carrierCode lives inside the movementSegments array, connectFromField and connectToField need the full dotted path, just like startWith:
db.person_events.aggregate([
  { $match: { eventId: "22446688" } },
  {
    $graphLookup: {
      from: 'person_events',
      startWith: '$movementSegments.carrierCode',
      connectFromField: 'movementSegments.carrierCode',
      connectToField: 'movementSegments.carrierCode',
      as: 'carrier_connections'
    }
  }
])
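The difference between the two field paths can be illustrated with a small plain-JavaScript sketch. valuesAtPath is a hypothetical helper and a deliberately simplified model of dotted-path lookup through arrays; it is not MongoDB's actual implementation. The document below is a trimmed, made-up version of the ones in the question.

```javascript
// Hypothetical helper: collect the values found at a dotted path,
// descending into arrays along the way (simplified model only).
function valuesAtPath(doc, path) {
  let current = [doc];
  for (const key of path.split(".")) {
    current = current
      .flatMap((value) => (Array.isArray(value) ? value : [value]))
      .filter((value) => value !== null && typeof value === "object" && key in value)
      .map((value) => value[key]);
  }
  // Flatten a trailing array so the result is a flat list of values.
  return current.flatMap((value) => (Array.isArray(value) ? value : [value]));
}

const person = {
  eventId: "22446688",
  movementSegments: [{ carrierCode: "MO269" }, { carrierCode: "MO270" }],
};

// There is no top-level carrierCode field, so nothing can be matched.
console.log(valuesAtPath(person, "carrierCode"));
// The full path reaches the values inside the array.
console.log(valuesAtPath(person, "movementSegments.carrierCode"));
```

With the short path there is nothing to connect from or to, which is consistent with the empty carrier_connections array in the first attempt.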

Sequelize Querying with Op.or and Op.ne with same array of numbers

I'm having trouble getting the correct query with Sequelize.
I have an array of entry ids, let's say like this:
userVacationsIds = [1,2,3]
I made the first query like this:
Vacation.findAll({
  where: {
    id: {
      [Op.or]: userVacationsIds
    }
  }
})
.then(vacationSpec => {
  Vacation.findAll({
    where: {
      // Here I need to get all entries that DON'T have the ids from the array
    }
  })
})
I can't work out the query described in the code comment.
I've tried the Sequelize documentation, but I can't understand how to chain these queries specifically. I also tried an online converter, but that failed too.
So I just need some help getting this query right, please.
In the end I expect to get two arrays: one containing all entries with the ids from the array, the other containing everything else (id is NOT in the array).

I figured it out.
I feel silly.
This is the query that worked:
Vacation.findAll({
  where: {
    id: {
      [Op.or]: userVacationsIds
    }
  }
}).then(vacationSpec => {
  // Everything whose id is NOT in the array
  return Vacation.findAll({
    where: {
      id: {
        [Op.notIn]: userVacationsIds
      }
    }
  });
});
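The two result sets are complementary: every entry lands in exactly one of them. A minimal in-memory sketch of that split, in plain JavaScript without Sequelize, looks like this. The sample rows and the helper name partitionByIds are assumptions for illustration.

```javascript
// Hypothetical helper: split rows into [id in ids, id not in ids].
function partitionByIds(rows, ids) {
  const wanted = new Set(ids);
  const inside = [];
  const outside = [];
  for (const row of rows) {
    (wanted.has(row.id) ? inside : outside).push(row);
  }
  return [inside, outside];
}

// Made-up vacation rows.
const vacations = [
  { id: 1, name: "Rome" },
  { id: 2, name: "Oslo" },
  { id: 3, name: "Lima" },
  { id: 4, name: "Kyoto" },
  { id: 5, name: "Cairo" },
];

const [userVacations, otherVacations] = partitionByIds(vacations, [1, 2, 3]);
console.log(userVacations.map((v) => v.id), otherVacations.map((v) => v.id));
```

If the table is small enough to load in full, a single findAll followed by a split like this avoids the second round trip; otherwise the two separate queries keep the filtering in the database.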
