How to get a set of not null documents in FaunaDB

I am following this official tutorial and have created a similar index.
Below is my code, which calculates the difference between a start and an end, where end is nullable.
CreateIndex({
name: "foo_index",
source: {
collection: Collection("bar_collection"),
fields: {
interval: Query(
Lambda(
"bazDoc",
If(
Or(
IsNull(Select(["data", "start"], Var("bazDoc"), null)),
IsNull(Select(["data", "end"], Var("bazDoc"), null))
),
null,
Subtract(
Select(["data", "end"], Var("bazDoc")),
Select(["data", "start"], Var("bazDoc"))
)
)
)
)
}
},
values: [
{ binding: "interval"},
{ field: ["ref", "id"]}
]
})
This is the result. I want to filter out all documents whose interval is null. How can I achieve this?
{
data: [
[9, "353542771515064533"],
[10, "353542807600758997"],
[null, "353542787197567188"],
[null, "353542814197350613"]
]
}
Btw, I'm new to FaunaDB, please suggest some resources to learn other than Fauna's own document.

The docs state
When a document is indexed, and all of the index’s defined values evaluate to null, no index entry is stored for the document.
Since you are also including the ref ID, which is never null, there will always be an index entry created.
However, you can add another binding that also returns null in case the interval is null or the Ref otherwise. This way, both values will be null, and no entry will be created.
CreateIndex({
name: "foo_index",
source: {
collection: Collection("bar_collection"),
fields: {
interval: Query(
Lambda(
"bazDoc",
If(
Or(
IsNull(Select(["data", "start"], Var("bazDoc"), null)),
IsNull(Select(["data", "end"], Var("bazDoc"), null))
),
null,
Subtract(
Select(["data", "end"], Var("bazDoc")),
Select(["data", "start"], Var("bazDoc"))
)
)
)
),
ref: Query(
Lambda(
"bazDoc",
If(
Or(
IsNull(Select(["data", "start"], Var("bazDoc"), null)),
IsNull(Select(["data", "end"], Var("bazDoc"), null))
),
null,
Select(["ref"], Var("bazDoc"))
)
)
)
}
},
values: [
{ binding: "interval" },
{ binding: "ref" }
]
})
I’ve included the whole Ref as the value here, which is what I recommend. It makes it easier to reuse in combination with other queries, and all the drivers let you use the Ref values in additional queries and get the ID like ref.id in your app where you need it.
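To make the rule from the docs concrete, here is a minimal in-memory sketch in plain JavaScript. It is not the Fauna API: buildIndex, interval, and the sample documents are hypothetical names made up for illustration, and the only behaviour it models is "skip the entry when every value is null".

```javascript
// In-memory model only (not Fauna): an entry is skipped when ALL of its values are null.
const docs = [
  { id: "1", start: 1, end: 10 },
  { id: "2", start: 5, end: 15 },
  { id: "3", start: 3, end: null }, // no end, so no interval
];

// Mirrors the "interval" binding: null unless both start and end are present.
const interval = (d) => (d.start == null || d.end == null ? null : d.end - d.start);

// Hypothetical helper: compute one entry per document, then drop all-null entries.
function buildIndex(documents, valueFns) {
  return documents
    .map((d) => valueFns.map((fn) => fn(d)))
    .filter((entry) => entry.some((v) => v !== null));
}

// Original index: [interval, id]. The id is never null, so every document is kept.
const withRawId = buildIndex(docs, [interval, (d) => d.id]);

// Revised index: the second value is also null whenever the interval is null.
const withNullableRef = buildIndex(docs, [
  interval,
  (d) => (interval(d) === null ? null : d.id),
]);

console.log(withRawId);       // [[9, "1"], [10, "2"], [null, "3"]]
console.log(withNullableRef); // [[9, "1"], [10, "2"]]
```

The second index drops document 3 because both of its values are null, which is the effect the extra binding is meant to achieve.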
please suggest some resources to learn other than Fauna's own document.
I hope that others can also pitch in here, too. I am on the Customer Support team at Fauna and want to highlight that we have discourse forums and a discord server. Myself and other Fauna employees make a best effort to see that every forums question is addressed in a reasonable time, either by the community, or by answering ourselves; and we’re active in discord, which is better for general discussions.

Related

Laravel where, orWhereHas and whereNotIn

Hello great people of SO!
I hope you all have a good day and have a good health
Note: I'm not good at SQL
Sorry for bad english, but I will try my best to explain my issue
I'm using Laravel v8.x for my app, and after setting up model relationships, events, queues, etc., I'm now working on the SQL
ATM, I have 2 Models,
User
Post
Relationships:
User hasMany Post
User belongsToMany User (Block)
User belongsToMany User (Follow)
Post belongsTo User
Database:
5 record for User
2 record for Block
3 records for Post
Table: (Using faker)
users
[
{ id: 1, name: 'Jonathan Beatrice', username: 'kiana.fay', ... },
{ id: 2, name: 'Lacey Kirlin', username: 'kenna.turner', ... },
{ id: 3, name: 'Alexander Schiller', username: 'cassandra95', ... },
{ id: 4, name: 'Daniel Wickozky', username: 'nkoepp', ... },
{ id: 5, name: 'Maymie Lehner', username: 'frami.felton', ... }
]
block
[
{ id: 1, by_id: 1, to_id: 2 }, // User #1 block user #2
{ id: 2, by_id: 4, to_id: 1 } // User #4 block user #1
]
posts
[
{ id: 1, user_id: 2, body: 'Test post', ... },
{ id: 2, user_id: 5, body: 'Lorem ipsum dolor sit amet ...', ... },
{ id: 3, user_id: 4, body: 'ABCD festival soon! ...', ... },
]
Everything works fine and smooth
Now that I want to implement a search system, I have a problem, since I'm not good with SQL
Here's my code
SearchController.php
use ...;
use ...;
...
public function posts(Request $request)
{
// For testing purpose
$user = User::with(['userBlocks', 'blocksUser'])->find(1);
// Get all id of user that $user block
// return [2]
$user_blocks = $user->userBlocks->pluck('pivot')->pluck('to_id')->toArray();
// Get all id of user that block $user
// return [4]
$blocks_user = $user->blocksUser->pluck('pivot')->pluck('by_id')->toArray();
// Merge all ids above (must be unique())
// return [2, 4]
$blocks = array_merge($user_blocks, $blocks_user);
// .../search?q=xxx
$query = $request->query('q');
$sql = Post::query();
// Search for posts that has `posts`.`body` LIKE ? ($query)
$sql->where('body', 'LIKE', "%$query%");
// This is where I got confused
$sql->orWhereHas('user', function ($post_user) use ($blocks, $query) {
$post_user
->whereNotIn('id', $blocks) // Exclude posts that has user and their id not in (x, x, x, x, ... ; $block variable above)
->where('name', 'LIKE', "%$query%") // Find user that has name LIKE ? ($query)
->orWhere('username', 'LIKE', "%$query%"); // or Find user that has username LIKE ? ($query)
});
$sql->orderBy('created_at', 'DESC');
$sql->with(['user']);
$posts = $sql->simplePaginate(10, ['*'], 'p');
return $posts;
}
I run the code, .../search?q=e
Note:
All users have the letter E in their names
All posts also have the letter E in their body
We (as User #1) block User #2, and User #4 blocks us (User #1)
Result: Controller returned all posts
This is the query when I use DB::enableQueryLog() and DB::getQueryLog()
SELECT
*
FROM
`posts`
WHERE `body` LIKE ?
AND EXISTS
(SELECT
*
FROM
`users`
WHERE `posts`.`user_id` = `users`.`id`
AND (
`id` NOT IN (?)
AND `username` LIKE ?
OR `name` LIKE ?
))
ORDER BY `created_at` ASC
LIMIT 11 OFFSET 0
Goal: Search all posts that have body LIKE ?, OR posts whose user has username LIKE ? or name LIKE ? (but also exclude the users we block and the users that block us)
Thanks in advance
If there's any unclear explanation, I will edit it A.S.A.P
If I run on my recent laravel install, with my proposed change for one of your issues, version 7.19.1, I get this query:
SELECT
*
FROM
`posts`
WHERE `body` LIKE ?
OR EXISTS -- line of interest
(SELECT
*
FROM
`users`
WHERE `posts`.`user_id` = `users`.`id`
AND (
`id` NOT IN (?)
AND (`username` LIKE ?
OR `name` LIKE ?) -- extra brackets I've added
))
ORDER BY `created_at` ASC
LIMIT 11 OFFSET 0
Have a look at the line of interest, and compare it with the query your version of Laravel is running. The AND EXISTS line is being incorrectly generated by Laravel. orWhereHas isn't behaving correctly in your version; I can't find the release number to see where it was fixed.
I'd recommend upgrading to the latest version if possible, but that's not always an option. I've had a dig around, and it looks like the user in this question encountered a similar problem:
WhereHas() / orWhereHas not constraining the query as expected
You can try moving your $sql->with(['user']); to before your orWhereHas clause. I'm not sure if that will change it to OR, but it's worth a try.
Second, I've added whereNested to your OR clause to ensure the precedence is correct, which adds the extra brackets in the query above. That is, you don't want:
(`id` NOT IN (1, 2, 3)
AND `name` LIKE % test %)
OR `username` LIKE % test %
Otherwise the exists clause would also match posts from blocked users.
So the final changes look like this, which I think fulfills your description:
$sql->with(['user']); // removed from its original position and moved here
$sql->where('body', 'LIKE', "%$query%")->whereNotIn('user_id', $blocks); // additional line: exclude posts written by blocked users
$sql->orWhereHas('user', function ($post_user) use ($blocks, $query) {
    $post_user->whereNotIn('id', $blocks);
    $post_user->whereNested(function ($post_user) use ($query) { // new bit
        $post_user->where('name', 'LIKE', "%$query%")
            ->orWhere('username', 'LIKE', "%$query%");
    });
});
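The precedence issue can be seen without a database. The sketch below is plain JavaScript rather than PHP or SQL, and only models the boolean logic: in SQL, AND binds tighter than OR, just as && binds tighter than || in JavaScript. The two users are taken from the question's sample data; the like helper is a hypothetical, case-sensitive stand-in for LIKE '%e%'.

```javascript
// Model of the WHERE clause inside the EXISTS subquery (logic only, no database).
const blocked = [2, 4]; // ids we block or that block us
const users = [
  { id: 2, name: "Lacey Kirlin", username: "kenna.turner" },      // blocked
  { id: 3, name: "Alexander Schiller", username: "cassandra95" }, // not blocked
];

// Hypothetical stand-in for LIKE '%e%' (case-sensitive in this sketch).
const like = (s) => s.includes("e");

// Without grouping: (id NOT IN blocked AND name LIKE ?) OR username LIKE ?
const ungrouped = users
  .filter((u) => !blocked.includes(u.id) && like(u.name) || like(u.username))
  .map((u) => u.id);

// With grouping: id NOT IN blocked AND (name LIKE ? OR username LIKE ?)
const grouped = users
  .filter((u) => !blocked.includes(u.id) && (like(u.name) || like(u.username)))
  .map((u) => u.id);

console.log(ungrouped); // [2, 3]  the blocked user leaks through the OR
console.log(grouped);   // [3]
```

The grouped form is what whereNested produces, so the block list applies to both the name and the username match.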

Creating an index for all active items

I have a collection of documents that follow this schema {label: String, status: Number}.
I want to introduce a new field, deleted_at: Date that will hold information if a document has already been deleted. Seems like a perfect use case for an index, to be able to search for all undeleted tasks.
CreateIndex({
name: "activeTasks",
source: Collection("tasks"),
terms: [
{ field: ["data", "deleted_at"] }
]
})
And then filter by undefined / null value in shell:
Paginate(Match(Index("activeTasks"), null))
Paginate(Match(Index("activeTasks"), undefined))
It returns nothing, even for documents where I explicitly set deleted_at to null.
That's not my point, though. I want to get documents that do not have the deleted_at defined at all, so that I do not have to update the whole collection.
PS. When I add a document where deleted_at: "test" and query for it, the shell does return the expected result.
What am I missing?
The reason is that FaunaDB doesn't support reading empty/null values the way you think it does. You need to use a binding to do that.
Make sure to check out https://docs.fauna.com/fauna/current/tutorials/indexes/bindings.html#empty for a more thorough explanation and examples.
My understanding of how bindings work would yield the following code. I haven't tested it though and I'm not sure it works.
You need a special binding index:
CreateIndex({
name: "activeTasks",
source: [{
collection: Collection("tasks"),
fields: {
null_deleted_at: Query(
Lambda(
"doc",
Equals(Select(["data", "deleted_at"], Var("doc"), null), null)
)
)
}
}],
terms: [ {binding: "null_deleted_at"} ],
})
Usage:
Map(
Paginate(Match(Index("activeTasks"), true)),
Lambda("X", Get(Var("X")))
)
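For intuition, here is a rough in-memory sketch of what the binding does, in plain JavaScript rather than FQL. It is only a model under my own assumptions: the sample tasks and the nullDeletedAt and indexByTerm names are made up. The point is that the binding turns "field is missing or null" into a concrete boolean term that can be matched.

```javascript
// In-memory model only (not Fauna): index documents by a computed boolean term.
const tasks = [
  { id: "a", label: "write", status: 1 },                          // no deleted_at field
  { id: "b", label: "test", status: 2, deleted_at: "2020-05-01" }, // deleted
  { id: "c", label: "ship", status: 0, deleted_at: null },         // explicit null
];

// Mirrors the binding: true when deleted_at is missing or null.
const nullDeletedAt = (doc) => (doc.deleted_at ?? null) === null;

// Group document ids by the computed term, like an index with one term.
const indexByTerm = new Map();
for (const task of tasks) {
  const term = nullDeletedAt(task);
  if (!indexByTerm.has(term)) indexByTerm.set(term, []);
  indexByTerm.get(term).push(task.id);
}

// Equivalent in spirit to matching the index on the term true.
const activeIds = indexByTerm.get(true);
console.log(activeIds); // ["a", "c"]
```

Because the term is always true or false and never null, every document gets an entry, including the ones that have no deleted_at field at all.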

Cannot update document by index in FaunaDB

I'm attempting to update a document using an index in my FaunaDB collection using FQL.
Update(
Match(
Index('users_by_id'),
'user-1'
),
{
data: {
name: 'John'
}
}
)
This query gives me the following error:
Error: [
{
"position": [
"update"
],
"code": "invalid argument",
"description": "Ref expected, Set provided."
}
]
How can I update the document using the index users_by_id?
Match returns a set reference, not a document reference, because there could be zero or more matching documents.
If you are certain that there is a single document that matches, you can use Get. When you call Get with a set reference (instead of a document reference), the first item of the set is retrieved. Since Update requires a document reference, you can then use Select to retrieve the fetched document's reference.
For example:
Update(
Select(
"ref",
Get(Match(Index('users_by_id'), 'user-1'))
),
{
data: {
name: 'John'
}
}
)
If you have more than one match, you should use Paginate to "realize" the set into an array of matching documents, and then Map over the array to perform a bulk update:
Map(
Paginate(
Match(Index('users_by_id'), 'user-1')
),
Lambda(
"ref",
Update(
Var("ref"),
{
data: {
name: "John",
}
}
)
)
)
Note: For this to work, your index has to have an empty values definition, or it must explicitly define the ref field as the one and only value. If your index returns multiple fields, the Lambda function has to be updated to accept the same number of parameters as are defined in your index's values definition.

How to select specific fields on FaunaDB Query Language?

I can't find anything about how to do this type of query in FaunaDB. I need to select only specific fields from a document, not all fields. I can select one field using the Select function, like below:
serverClient.query(
q.Map(
q.Paginate(q.Documents(q.Collection('products')), {
size: 12,
}),
q.Lambda('X', q.Select(['data', 'title'], q.Get(q.Var('X'))))
)
)
Forget the selectAll function, it's deprecated.
You can also return an object literal like this:
serverClient.query(
q.Map(
q.Paginate(q.Documents(q.Collection('products')), {
size: 12,
}),
q.Lambda(
'X',
{
title: q.Select(['data', 'title'], q.Get(q.Var('X'))),
otherField: q.Select(['data', 'other'], q.Get(q.Var('X')))
}
)
)
)
Also you are missing the end and beginning quotation marks in your question at ['data, title']
One way to achieve this would be to create an index that returns the values required. For example, if using the shell:
CreateIndex({
name: "<name of index>",
source: Collection("products"),
values: [
{ field: ["data", "title"] },
{ field: ["data", "<another field name>"] }
]
})
Then querying that index would return you the fields defined in the values of the index.
Map(
Paginate(
Match(Index("<name of index>"))
),
Lambda("product", Var("product"))
)
Although these examples are to be used in the shell, they can easily be used in code by adding a q. in front of each built-in function.

How to query by multiple conditions in faunadb?

I try to improve my understanding of FaunaDB.
I have a collection that contains records like:
{
"ref": Ref(Collection("regions"), "261442015390073344"),
"ts": 1587576285055000,
"data": {
"name": "italy",
"attributes": {
"amenities": {
"camping": 1,
"swimming": 7,
"hiking": 3,
"culture": 7,
"nightlife": 10,
"budget": 6
}
}
}
}
I would like to query in a flexible way by different attributes like:
data.attributes.amenities.camping > 5
data.attributes.amenities.camping > 5 AND data.attributes.amenities.hiking > 6
data.attributes.amenities.camping < 6 AND data.attributes.amenities.culture > 6 AND hiking > 5 AND ...
I created an index containing all attributes, but I don't know how to do greater-than-or-equal filtering in an index that contains multiple terms.
My fallback would be to create an index for each attribute and use Intersection to get the records that are in all subqueries that I want to check, but this feels somehow wrong:
The query: budget >= 6 AND camping >=8 would be:
Index:
{
name: "all_regions_by_all_attributes",
unique: false,
serialized: true,
source: "regions",
terms: [],
values: [
{
field: ["data", "attributes", "amenities", "culture"]
},
{
field: ["data", "attributes", "amenities", "hiking"]
},
{
field: ["data", "attributes", "amenities", "swimming"]
},
{
field: ["data", "attributes", "amenities", "budget"]
},
{
field: ["data", "attributes", "amenities", "nightlife"]
},
{
field: ["data", "attributes", "amenities", "camping"]
},
{
field: ["ref"]
}
]
}
Query:
Map(
Paginate(
Intersection(
Range(Match(Index("all_regions_by_all_attributes")), [0, 0, 0, 6, 0, 8], [10, 10, 10, 10, 10, 10]),
)
),
Lambda(
["culture", "hiking", "swimming", "budget", "nightlife", "camping", "ref"],
Get(Var("ref"))
)
)
This approach has the following disadvantages:
It does not work as expected: if, for example, the first (culture) attribute is in this range but the second (hiking) is not, I still get results back
It causes a lot of reads due to the reference that I need to follow for each result.
Is it possible to store all values in this kind of index that would contain all the data? I know I can just add more values to the index and access them. But this would mean I have to create a new index as soon as we add more fields to the entity. But maybe this is a common thing.
thanks in advance
Thanks for your question. Ben already wrote out a complete example that shows what you can do and I'll base myself on his recommendations and try to clarify further.
FaunaDB's FQL is quite powerful which means there are multiple ways to do that, yet with such power comes a small learning curve so I'm happy to help :). The reason it took a while to answer this question is that such an elaborate answer actually deserves a complete blog post. Well, I've never written a blog post in Stack Overflow, there is a first for everything!
There are three ways to do 'compound range-like queries' but there is one way that will be most performant for your use-case and we'll see that the first approach is actually not entirely what you need. Spoiler, the third option we describe here is what you need.
Preparation - Let's throw in some data just like Ben did
I'll keep it in one collection to keep it simpler and am using the JavaScript flavour of the Fauna Query Language here. There is a good reason to separate data in a second collection though which is related to your second map/get question (see the end of this answer)
Create the collection
CreateCollection({ name: 'place' })
Throw in some data
Do(
Select(
['ref'],
Create(Collection('place'), {
data: {
name: 'mullion',
focus: 'team-building',
camping: 1,
swimming: 7,
hiking: 3,
culture: 7,
nightlife: 10,
budget: 6
}
})
),
Select(
['ref'],
Create(Collection('place'), {
data: {
name: 'church covet',
focus: 'private',
camping: 1,
swimming: 7,
hiking: 9,
culture: 7,
nightlife: 10,
budget: 6
}
})
),
Select(
['ref'],
Create(Collection('place'), {
data: {
name: 'the great outdoors',
focus: 'private',
camping: 5,
swimming: 3,
hiking: 2,
culture: 1,
nightlife: 9,
budget: 3
}
})
)
)
OPTION 1: Composite indexes with multiple values
We can put as many terms as values in an index and use Match and Range to query those. However! Range probably gives you something different than you would expect if you use multiple values. Range gives you exactly what the index does and the index sorts values lexically. If we look at the example of Range in the docs we see an example there which we can extend upon for multiple values.
Imagine we would have an index with two values and we write:
Range(Match(Index('people_by_age_first')), [80, 'Leslie'], [92, 'Marvin'])
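Then the result is every index entry that sorts between [80, 'Leslie'] and [92, 'Marvin'] in the index's order: entries with age 80 and a name from 'Leslie' onward, all entries with ages 81 to 91 regardless of name, and entries with age 92 and a name up to 'Marvin'. It is not the set of entries whose age lies between 80 and 92 and whose name independently lies between 'Leslie' and 'Marvin'. This is a very scalable behaviour and exposes the raw power of the underlying index without overhead, but it is not exactly what you are looking for!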
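A small in-memory sketch may help to see the difference. This is plain JavaScript, not FQL: the entries are invented, compareTuples and inRange are hypothetical helpers, and treating both bounds as inclusive is an assumption of the sketch. It only models tuples of equal length compared element by element.

```javascript
// Compare two equal-length tuples element by element, the way sorted index entries compare.
function compareTuples(a, b) {
  for (let i = 0; i < a.length; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return 0;
}

// Invented [age, name] entries, already in index order.
const entries = [
  [75, "Alan"], [80, "Grace"], [80, "Lucy"],
  [85, "Zoe"], [92, "Ada"], [92, "Tim"],
];

// Tuple range: keep entries that sort between lo and hi (inclusive in this sketch).
const inRange = (t, lo, hi) => compareTuples(lo, t) <= 0 && compareTuples(t, hi) <= 0;
const lexical = entries.filter((t) => inRange(t, [80, "Leslie"], [92, "Marvin"]));

// What one might expect instead: each field checked against its own bounds.
const perField = entries.filter(
  ([age, name]) => age >= 80 && age <= 92 && name >= "Leslie" && name <= "Marvin"
);

console.log(lexical);  // [[80, "Lucy"], [85, "Zoe"], [92, "Ada"]]
console.log(perField); // [[80, "Lucy"]]
```

In the tuple range, [85, 'Zoe'] and [92, 'Ada'] are returned even though their names fall outside 'Leslie'..'Marvin', because the name only matters when the age equals one of the bounds.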
So let's move on to another solution!
OPTION 2: First Range, then Filter
Another quite flexible solution is to use Range and then Filter. This is a less good idea, however, if you filter out a lot with Filter, since your pages will become emptier. Imagine that you have 10 items in a page after the Range and then use Filter: you will end up with pages of 2, 5, or 4 elements depending on what is filtered out. It is a great idea, however, if one of these properties has such a high cardinality that it will filter out most of the entities. E.g. imagine everything is timestamped: you first get a date range and then continue filtering on something that only eliminates a small percentage of the result set. I believe that in your case all of these values are roughly equivalent, so the third solution (see below) will be the best for you.
We could in this case just throw all values in so that they all get returned which avoids a Get. For example, let's say that 'camping' is our most important filter.
CreateIndex({
name: 'all_camping_first',
source: Collection('place'),
values: [
{ field: ['data', 'camping'] },
// and the rest will not be used for filter
// but we want to return them to avoid Map/Get
{ field: ['data', 'swimming'] },
{ field: ['data', 'hiking'] },
{ field: ['data', 'culture'] },
{ field: ['data', 'nightlife'] },
{ field: ['data', 'budget'] },
{ field: ['data', 'name'] },
{ field: ['data', 'focus'] },
]
})
You can now write a query that just gets a range based on the camping value:
Paginate(Range(Match(Index('all_camping_first')), [1], [3]))
Which should return two elements (the third has camping === 5)
Now imagine that we want to filter over these and we set our pages small to avoid unnecessary work
Filter(
Paginate(Range(Match(Index('all_camping_first')), [1], [3]), { size: 2 }),
Lambda(
['camping', 'swimming', 'hiking', 'culture', 'nightlife', 'budget', 'name', 'focus'],
And(GTE(Var('hiking'), 0), GTE(7, Var('hiking')))
)
)
Since I want to be clear on both the advantages and disadvantages of each approach, let's show exactly how Filter works by adding another place whose attributes match our query.
Create(Collection('place'), {
data: {
name: 'the safari',
focus: 'team-building',
camping: 1,
swimming: 9,
hiking: 2,
culture: 4,
nightlife: 3,
budget: 10
}
})
Running the same query:
Filter(
Paginate(Range(Match(Index('all_camping_first')), [1], [3]), { size: 2 }),
Lambda(
['camping', 'swimming', 'hiking', 'culture', 'nightlife', 'budget', 'name', 'focus'],
And(GTE(Var('hiking'), 0), GTE(7, Var('hiking')))
)
)
This still returns only one value, but it now provides an 'after' cursor that points to the next page. You might think: "huh? My page size was 2?". Well, that's because Filter runs after Paginate: your page originally had two entities, of which one got filtered out. So you are left with a page of 1 value and a pointer to the next page.
{
"after": [
...
],
"data": [
[
1,
7,
3,
7,
10,
6,
"mullion",
"team-building"
]
]
}
You could also opt to Filter directly on the SetRef and only paginate afterwards. In that case, your pages will have the requested size. However, keep in mind that this is an O(n) operation on the number of elements that come back from Range. Range uses an index, but from the moment you use Filter, it will loop over each of the elements.
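The page-size effect is easy to reproduce in memory. The following is a plain JavaScript model, not FQL: paginate here is a hypothetical helper that just takes the first page, and the rows are the three camping === 1 places from this answer, listed in the order I would expect the index to sort them.

```javascript
// The three places with camping === 1, in assumed index order.
const rows = [
  { name: "mullion", camping: 1, hiking: 3 },
  { name: "church covet", camping: 1, hiking: 9 },
  { name: "the safari", camping: 1, hiking: 2 },
];

// Same condition as the Filter lambda: 0 <= hiking <= 7.
const matches = (row) => row.hiking >= 0 && row.hiking <= 7;

// Hypothetical helper: return only the first page of the given size.
const paginate = (items, size) => items.slice(0, size);
const pageSize = 2;

// Paginate first, then filter: the page can shrink below the requested size.
const filterAfterPaging = paginate(rows, pageSize).filter(matches).map((r) => r.name);

// Filter first, then paginate: the page is full, but every row had to be checked.
const pageAfterFiltering = paginate(rows.filter(matches), pageSize).map((r) => r.name);

console.log(filterAfterPaging);  // ["mullion"]
console.log(pageAfterFiltering); // ["mullion", "the safari"]
```

The first variant touches only one page of rows but returns a short page; the second returns a full page at the cost of scanning everything that came back from the range.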
OPTION 3: Indexes on one value + Intersections!
This is the best solution for your use-case but it requires a bit more understanding and an intermediate index.
When we look at the doc examples for intersection we see this example:
Paginate(
Intersection(
Match(Index('spells_by_element'), 'fire'),
Match(Index('spells_by_element'), 'water'),
)
)
This works because it's the same index twice, which means that the results are similar values (references in this case).
Let's say we add a few indexes.
CreateIndex({
name: 'by_camping',
source: Collection('place'),
values: [
{ field: ['data', 'camping']}, {field: ['ref']}
]
})
CreateIndex({
name: 'by_swimming',
source: Collection('place'),
values: [
{ field: ['data', 'swimming']}, {field: ['ref']}
]
})
CreateIndex({
name: 'by_hiking',
source: Collection('place'),
values: [
{ field: ['data', 'hiking']}, {field: ['ref']}
]
})
We can intersect on them now but it will not give us the right result. For example... let's call this:
Paginate(
Intersection(
Range(Match(Index("by_camping")), [3], []),
Range(Match(Index("by_swimming")), [3], [])
)
)
The result is empty. Although we had one with swimming 3 and camping 5.
That is exactly the problem. If swimming and camping had the same value, we would get a result. So it's important to notice that Intersection intersects the whole index entries, which include both the camping/swimming value and the reference. That means that we have to drop the value, since we only need the reference. The way to do that before pagination is with a join. Essentially, we are going to join with another index that just returns the ref (not specifying values defaults to only the ref):
CreateIndex({
name: 'ref_by_ref',
source: Collection('place'),
terms: [{field: ['ref']}]
})
This join looks as follows
Paginate(Join(
Range(Match(Index('by_camping')), [4], [9]),
Lambda(['value', 'ref'], Match(Index('ref_by_ref'), Var('ref'))
)))
Here we just took the result of Match(Index('by_camping')) and just dropped the value by joining with an index that only returns the ref. Now let's combine this and just do an AND kind of range query ;)
Paginate(Intersection(
Join(
Range(Match(Index('by_camping')), [1], [3]),
Lambda(['value', 'ref'], Match(Index('ref_by_ref'), Var('ref'))
)),
Join(
Range(Match(Index('by_hiking')), [0], [7]),
Lambda(['value', 'ref'], Match(Index('ref_by_ref'), Var('ref'))
))
))
The result is two values, and both in the same page!
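Why the value has to be dropped before intersecting can be shown with a tiny in-memory model. This is plain JavaScript, not FQL: the place names stand in for refs, and range, intersect, and dropValue are hypothetical helpers that only mimic the idea of Range, Intersection, and the join onto ref_by_ref.

```javascript
// [value, ref] entries, with the place name standing in for the ref.
const byCamping = [[1, "mullion"], [1, "church covet"], [5, "the great outdoors"]];
const bySwimming = [[3, "the great outdoors"], [7, "mullion"], [7, "church covet"]];

// Keep entries whose first element is at least lo (open-ended upper bound).
const range = (entries, lo) => entries.filter(([value]) => value >= lo);

// Intersect two lists of tuples by comparing whole tuples.
const key = (tuple) => JSON.stringify(tuple);
const intersect = (a, b) => {
  const seen = new Set(b.map(key));
  return a.filter((tuple) => seen.has(key(tuple)));
};

// Stand-in for the join: keep only the ref from each entry.
const dropValue = (entries) => entries.map(([, ref]) => [ref]);

// Whole tuples differ ([5, ...] vs [3, ...]), so nothing matches.
const onTuples = intersect(range(byCamping, 3), range(bySwimming, 3));

// After dropping the values, the shared ref is found.
const onRefs = intersect(dropValue(range(byCamping, 3)), dropValue(range(bySwimming, 3)));

console.log(onTuples); // []
console.log(onRefs);   // [["the great outdoors"]]
```

This matches the empty result seen earlier for camping >= 3 and swimming >= 3, and shows what the join recovers.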
Note that you can easily extend or compose FQL by just using the native language (in this case JS) to make this look much nicer (note I didn't test this piece of code)
const DropAllButRef = function(RangeMatch) {
return Join(
RangeMatch,
Lambda(['value', 'ref'], Match(Index('ref_by_ref'), Var('ref'))
))
}
Paginate(Intersection(
DropAllButRef (Range(Match(Index('by_camping')), [1], [3])),
DropAllButRef (Range(Match(Index('by_hiking')), [0], [7]))
))
And a final extension: this only returns references, so you'll need to Map/Get. There is of course a way around this if you really want to, by just using another index :)
const index = CreateIndex({
name: 'all_values_by_ref',
source: Collection('place'),
values: [
{ field: ['data', 'camping'] },
{ field: ['data', 'swimming'] },
{ field: ['data', 'hiking'] },
{ field: ['data', 'culture'] },
{ field: ['data', 'nightlife'] },
{ field: ['data', 'budget'] },
{ field: ['data', 'name'] },
{ field: ['data', 'focus'] }
],
terms: [
{ field: ['ref'] }
]
})
Now the range query returns everything without a Map/Get:
Paginate(
Intersection(
Join(
Range(Match(Index('by_camping')), [1], [3]),
Lambda(['value', 'ref'], Match(Index('all_values_by_ref'), Var('ref'))
)),
Join(
Range(Match(Index('by_hiking')), [0], [7]),
Lambda(['value', 'ref'], Match(Index('all_values_by_ref'), Var('ref'))
))
)
)
With this join approach you could even do range indexes on different collections as long as you join them to the same reference before intersecting! Pretty cool huh?
Can I store more values in the index?
Yes you can, indexes in FaunaDB are views, so let's call them indiviews. It's a tradeoff, essentially you are exchanging compute for storage. By making a view with many values you get very fast access to a certain subset of your data. But there is another tradeoff and that is flexibility. You can not just go adding elements since that would require you to rewrite your whole index. In that case you will have to make a new index and wait for it to build if you have much data (and yes, that is quite common) and make sure that the queries you do (look at the lambda parameters in map filter) match your new index. You can always delete the other index afterwards. Just using Map/Get will be more flexible, everything in databases is a tradeoff and FaunaDB gives you both options :). I would suggest to use such an approach from the moment your datamodel is fixed and you see a specific part in your app that you want to optimise.
Avoiding MapGet
The second question on Map/Get requires some explanation. Separating out the values that you will search on from the places (as Ben did) is a great idea if you want to use Join to get the actual places more efficiently. This will not require a Map/Get and will therefore cost you far fewer reads. Do notice, however, that Join is rather a traversal (it replaces the current references with the target references it joins to), so if you need both the values and the actual place data in one object at the end of your query, then you will require Map/Get. Look at it from this perspective: indexes are ridiculously cheap in terms of reads and you can go quite far with them, but for some operations there is just no way around Map/Get, and a Get is still only 1 read. Given that you get 100 000 for free per day, that is still not expensive :). You could also keep your pages relatively small (the size parameter in Paginate) to make sure you don't do unnecessary Gets unless your users or app require more pages.
For people reading this that do not know this yet:
1 index page === 1 read
1 get === 1 read
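As a back-of-the-envelope sketch, the two rules above can be turned into a tiny estimator. This is plain JavaScript with a hypothetical estimateReads helper; it assumes a query over a single index and uses only the two rules just stated, so treat the numbers as rough estimates rather than billing figures.

```javascript
// Rules of thumb from the text above.
const READS_PER_INDEX_PAGE = 1;
const READS_PER_GET = 1;

// Hypothetical estimator: each page costs one index read plus one read per Get.
function estimateReads({ pages, docsFetchedPerPage }) {
  return pages * (READS_PER_INDEX_PAGE + docsFetchedPerPage * READS_PER_GET);
}

// One page served entirely from index values: no Get needed.
const indexOnly = estimateReads({ pages: 1, docsFetchedPerPage: 0 });

// One page of 64 refs followed by a Map/Get over each ref.
const mapGet = estimateReads({ pages: 1, docsFetchedPerPage: 64 });

console.log(indexOnly); // 1
console.log(mapGet);    // 65
```

This is why smaller pages keep Map/Get cheap: the Get cost scales with the number of documents actually fetched, not with the size of the collection.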
Final notes
We can and will make this easier in the future. However, note that you are working with a scalable distributed database and often these things are just not even possible in other solutions or very inefficient. FaunaDB provides you with very powerful structures and raw access to how indexes work and gives you many options. It does not try to be clever for you behind the scenes as this might result in very inefficient queries in case we get it wrong (that would be a bummer in a scalable pay-as-you-go system).
There are a couple of misconceptions that I think are leading you astray. The most important one: Match(Index($x)) generates a set reference, which is an ordered set of tuples. The tuples correspond to the array of fields that are present in the values section of an index. By default this will just be a one-tuple containing a reference to a document in the collection selected by the index. Range operates on a set reference and knows nothing about the terms used to select the returned set ref. So how do we compose the query?
Starting from first principles, let's imagine we just had this stuff in memory. If we had a set of (attribute, score) pairs ordered by attribute, then score, taking only those where attribute == $attribute would get us close, and then filtering by score > $score would get us what we wanted. This corresponds exactly to a range query over scores with attributes as terms, assuming we modeled the attribute-value pairs as documents. We can also embed pointers back to the location so we can retrieve that as well in the same query. Enough chatter, let's do it:
First stop: our collections.
jnr> CreateCollection({name: "place_attribute"})
{
ref: Collection("place_attribute"),
ts: 1588528443250000,
history_days: 30,
name: 'place_attribute'
}
jnr> CreateCollection({name: "place"})
{
ref: Collection("place"),
ts: 1588528453350000,
history_days: 30,
name: 'place'
}
Next up, some data. We'll choose a couple of places and give them some attributes.
jnr> Create(Collection("place"), {data: {"name": "mullion"}})
jnr> Create(Collection("place"), {data: {"name": "church cove"}})
jnr> Create(Collection("place_attribute"), {data: {"attribute": "swimming", "score": 3, "place": Ref(Collection("place"), 264525084639625739)}})
jnr> Create(Collection("place_attribute"), {data: {"attribute": "hiking", "score": 1, "place": Ref(Collection("place"), 264525084639625739)}})
jnr> Create(Collection("place_attribute"), {data: {"attribute": "hiking", "score": 7, "place": Ref(Collection("place"), 264525091487875586)}})
Now for the more interesting part. The index.
jnr> CreateIndex({name: "attr_score", source: Collection("place_attribute"), terms:[{"field":["data", "attribute"]}], values:[{"field": ["data", "score"]}, {"field": ["data", "place"]}]})
{
ref: Index("attr_score"),
ts: 1588529816460000,
active: true,
serialized: true,
name: 'attr_score',
source: Collection("place_attribute"),
terms: [ { field: [ 'data', 'attribute' ] } ],
values: [ { field: [ 'data', 'score' ] }, { field: [ 'data', 'place' ] } ],
partitions: 1
}
Ok. A simple query. Who has Hiking?
jnr> Paginate(Match(Index("attr_score"), "hiking"))
{
data: [
[ 1, Ref(Collection("place"), "264525084639625730") ],
[ 7, Ref(Collection("place"), "264525091487875600") ]
]
}
Without too much imagination one could sneak a Get call into that to pull the place out.
What about only hiking with a score over 5? We have an ordered set of tuples, so just supplying the first component (the score) is enough to get us what we want.
jnr> Paginate(Range(Match(Index("attr_score"), "hiking"), [5], null))
{ data: [ [ 7, Ref(Collection("place"), "264525091487875600") ] ] }
What about a compound condition? Hiking under 5 and swimming (any score). This is where things take a bit of a turn. We want to model conjunction, which in fauna means intersecting sets. The problem we have is that up until now we have been using an index that returns the score as well as the place ref. For intersection to work we need just the refs. Time for a sleight of hand:
jnr> Get(Index("doc_by_doc"))
{
ref: Index("doc_by_doc"),
ts: 1588530936380000,
active: true,
serialized: true,
name: 'doc_by_doc',
source: Collection("place"),
terms: [ { field: [ 'ref' ] } ],
partitions: 1
}
What's the point of such an index you ask? Well we can use it to drop any data we like from any index and be left with just the refs via join. This gives us the place refs with a hiking score less than 5 (the empty array sorts before anything, so works as a placeholder for a lower bound).
jnr> Paginate(Join(Range(Match(Index("attr_score"), "hiking"), [], [5]), Lambda(["s", "p"], Match(Index("doc_by_doc"), Var("p")))))
{ data: [ Ref(Collection("place"), "264525084639625739") ] }
So finally the piece de resistance: all places with swimming and (hiking < 5):
jnr> Let({
... hiking: Join(Range(Match(Index("attr_score"), "hiking"), [], [5]), Lambda(["s", "p"], Match(Index("doc_by_doc"), Var("p")))),
... swimming: Join(Match(Index("attr_score"), "swimming"), Lambda(["s", "p"], Match(Index("doc_by_doc"), Var("p"))))
... },
... Map(Paginate(Intersection(Var("hiking"), Var("swimming"))), Lambda("ref", Get(Var("ref"))))
... )
{
data: [
{
ref: Ref(Collection("place"), "264525084639625739"),
ts: 1588529629270000,
data: { name: 'mullion' }
}
]
}
Tada. This could be neatened up a lot with a couple of UDFs, an exercise left to the reader. Conditions involving OR can be managed with Union in much the same way.
I think an easy way to query with multiple conditions is to use set differences. In my solution it looks like this:
const response = await client.query(
q.Let(
{
activeUsers: q.Difference(
q.Match(q.Index("allUsers")),
q.Match(q.Index("usersByStatus"), "ARCHIVE")
),
paginatedDocuments: q.Map(
q.Paginate(q.Var("activeUsers"), {
size,
before: reqBefore,
after: reqAfter
}),
q.Lambda("x", q.Get(q.Var("x")))
),
total: q.Count(q.Var("activeUsers"))
},
{
documents: q.Var("paginatedDocuments"),
total: q.Var("total")
}
)
);
const {
documents: {
data: dbData = [],
before: dbBefore = [],
after: dbAfter = []
} = {},
total = 0
} = response || {};
const respBefore = dbBefore[0]?.value?.id || null;
const respAfter = dbAfter[0]?.value?.id || null;
const data = dbData.map((userData) => {
const {
ref: { id = null } = {},
data: { firstName = "", lastName = "" }
} = userData;
return {
id,
firstName,
lastName
};
});
So in the query builder you can filter the documents bound to each variable in the Let section by whichever index you want.
Here is another variant of filtering. In SQL it looks like:
SELECT * FROM clients WHERE salary > 2000 AND age > 30;
For fauna query:
const response = await client.query(
q.Let(
{
allClients: q.Match(q.Index("allClients")),
filteredClients: q.Filter(
q.Var("allClients"),
q.Lambda(
"client",
q.And(
q.GT(q.Select(["data", "salary"], q.Get(q.Var("client"))), 2000),
q.GT(q.Select(["data", "age"], q.Get(q.Var("client"))), 30)
)
)
),
paginatedDocuments: q.Map(
q.Paginate(q.Var("filteredClients")),
q.Lambda("x", q.Get(q.Var("x")))
),
total: q.Count(q.Var("filteredClients"))
},
{
documents: q.Var("paginatedDocuments"),
total: q.Var("total")
}
)
);
This works like filtering in JavaScript: if the condition returns true, the item is included in the response. Example:
const filteredClients = allClients.filter((client) => {
const { salary, age } = client;
return ( salary > 2000 ) && (age > 30)
})