Functional programming RamdaJs groupBy with transformation - ramda.js

I want to create a function that groups the array by a specific key, as follows:
var items = [
{name: 'n1', prop: 'p1', value: 90},
{name: 'b', prop: 'p2', value: 1},
{name: 'n1', prop: 'p3', value: 3}];
Into this:
{n1: {p1: 90, p3: 3}, b: {p2: 1}}
Basically, group by the column "name" and set each prop name as a key with its value.
I know there is a groupBy function in RamdaJs, but it only accepts a function to generate the group key.
I know I can format the data afterwards, but that would be inefficient.
Is there any way to pass some kind of "transform" function which prepares the data for each item?
Thanks

There is a trade-off between using a generic library and writing custom code for every scenario. A library like Ramda, with several hundred functions, will offer many tools that can help, but they are not likely to cover every scenario. Ramda does have a specific function to combine groupBy with some sort of fold, reduceBy. But if I didn't know that, I would write a custom version.
I would start with what works and remains simple, only worrying about performance if tests showed an issue with this specific code. Below I show a number of steps, changing the function each time to improve performance. But I'll make the main point here: I would actually stick with my first version, which I find easily readable, and not bother with any of the performance enhancements unless I had hard numbers showing that this was a bottleneck in my application.
Plain Ramda version
My first pass might look like this:
const addTo = (obj, {prop, value}) =>
  assoc (prop, value, obj)

const transform1 = pipe (
  groupBy (prop ('name')),
  map (reduce (addTo, {}))
)
const items = [{name: 'n1', prop: 'p1', value: 90}, {name: 'b', prop: 'p2', value: 1}, {name: 'n1', prop: 'p3', value: 3}];
console .log (
transform1 (items)
)
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.js"></script>
<script>const {assoc, pipe, groupBy, prop, map, reduce} = R </script>
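For reference, the reduceBy route mentioned above might look roughly like this (just a sketch reusing the addTo reducer; the second answer below works out a similar approach with mergeRight):
// sketch: reduceBy groups and folds in a single pass
// (reduceBy would also need to be destructured from R)
const transform1b = reduceBy (addTo, {}, prop ('name'))

transform1b (items) //=> {n1: {p1: 90, p3: 3}, b: {p2: 1}}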
Only Loop Once
This to me is clear and easy to read.
But there is certainly a question of efficiency, given that we have to loop over the list to group and then loop over each group to fold. So perhaps we'd be better off with a custom function. Here's a fairly straightforward modern JS version:
const transform2 = (items) =>
  items .reduce(
    (a, {name, prop, value}) => ({...a, [name]: {...a[name], [prop]: value}}),
    {}
  )
const items = [{name: 'n1', prop: 'p1', value: 90}, {name: 'b', prop: 'p2', value: 1}, {name: 'n1', prop: 'p3', value: 3}];
console .log (
transform2 (items)
)
Don't reduce ({...spread})
This version only loops once, which sounds like a nice improvement... but there is a real question about the performance of what Rich Snapp calls the reduce ({...spread}) anti-pattern. So perhaps we want to use a mutating reduce instead. This shouldn't cause problems, as the mutation is only internal to our function. We can write an equivalent version that doesn't involve this reduce ({...spread}) pattern:
const transform3 = (items) =>
  items .reduce (
    (a, {name, prop, value}) => {
      const obj = a [name] || {}
      obj[prop] = value
      a[name] = obj
      return a
    },
    {}
  )
const items = [{name: 'n1', prop: 'p1', value: 90}, {name: 'b', prop: 'p2', value: 1}, {name: 'n1', prop: 'p3', value: 3}];
console .log (
transform3 (items)
)
More Performant Looping
Now that we've removed that pattern (I don't in fact agree that it's always an anti-pattern), we have a more performant bit of code, but there is still one thing we can do. It's well known that the Array.prototype functions such as reduce are not as fast as their plain-loop counterparts. So we can go one step further and write this with a for-loop:
const transform4 = (items) => {
  const res = {}
  for (let i = 0; i < items .length; i++) {
    const {name, prop, value} = items [i]
    const obj = res [name] || {}
    obj[prop] = value
  }
  return res
}
const items = [{name: 'n1', prop: 'p1', value: 90}, {name: 'b', prop: 'p2', value: 1}, {name: 'n1', prop: 'p3', value: 3}];
console .log (
transform4 (items)
)
We've reached the limit of what I can think of in terms of performance optimizations.
... And we've made the code much worse! Comparing that last version with the first,
const transform1 = pipe (
  groupBy (prop ('name')),
  map (reduce (addTo, {}))
)
we see a hands-down winner in terms of code clarity. Without knowing the details of the addTo helper, we can still get a very good sense of what this function does on a first reading. And if we want those details more obvious, we could simply in-line that helper. The final version, though, will take a careful reading to understand how it works.
Oh wait; it doesn't work. Did you test it and see that? Do you see what's missing? I pulled this line from the end of the for-loop:
res[name] = obj;
Did you notice that in the code? It's not particularly difficult to spot, but it's not necessarily obvious at a quick glance.
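For completeness, here is the for-loop version with that line restored (otherwise unchanged):
const transform4 = (items) => {
  const res = {}
  for (let i = 0; i < items .length; i++) {
    const {name, prop, value} = items [i]
    const obj = res [name] || {}
    obj[prop] = value
    res[name] = obj // the line that was missing above
  }
  return res
}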
Summary
Performance optimization, when it's needed, has to be done very carefully, as you can't take advantage of many of the tools you get used to using. So, there are times when it's very important, and I do it then, but if my cleaner, easier-to-read code performs well enough, then I'll leave it there.
Point-free (pointless?) Aside
A similar argument applies to pushing too hard for point-free code. It's a useful technique, and many functions become cleaner by using it. But it can be pushed beyond its usefulness. Note that the helper function, addTo, from the initial version above is not point-free. We can make a point-free version of it. There may be simpler ways, but the first thing that comes to my mind is pipe (lift (objOf) (prop ('prop'), prop ('value')), mergeAll). We could write an entirely point-free version of this function by in-lining that, like this:
const transform5 = pipe (
  groupBy (prop ('name')),
  map (pipe (
    map (lift (objOf) (
      prop ('prop'),
      prop ('value')
    )),
    mergeAll
  ))
)
const items = [{name: 'n1', prop: 'p1', value: 90}, {name: 'b', prop: 'p2', value: 1}, {name: 'n1', prop: 'p3', value: 3}];
console .log (
transform5 (items)
)
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.js"></script>
<script>const {pipe, groupBy, prop, map, lift, objOf, mergeAll} = R </script>
Does this gain us anything? Not that I can see. The code is much more complex and much less expressive. This is as hard to read as the for-loop variant.
So again, focus on keeping the code simple. That's my advice, and I'm sticking to it!

I would use reduceBy instead:
it allows a function to generate a key
and a function to transform the data
const items = [
{name: 'n1', prop: 'p1', value: 90},
{name: 'b', prop: 'p2', value: 1},
{name: 'n1', prop: 'p3', value: 3}];
// {name: 'n1', prop: 'p1', value: 90} => {p1: 90}
const kv = obj => ({[obj.prop]: obj.value});
// {p1: 90}, {name: 'n1', prop: 'p3', value: 3} -> {p1: 90, p3: 3}
const reducer = (acc, obj) => mergeRight(acc, kv(obj));
console.log(
reduceBy(reducer, {}, prop('name'), items)
)
<script src="https://cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.min.js"></script>
<script>const {reduceBy, prop, mergeRight} = R;</script>

An imperative for...of loop, with a bit of destructuring, is readable, albeit verbose, and performant.
const fn = arr => {
  const obj = {}
  for (const { name, prop, value } of arr) {
    if (!obj[name]) obj[name] = {} // initialize the group if it doesn't exist
    obj[name][prop] = value // add the prop and its value to the group
  }
  return obj
}
const items = [{name: 'n1', prop: 'p1', value: 90}, {name: 'b', prop: 'p2', value: 1}, {name: 'n1', prop: 'p3', value: 3}]
const result = fn(items)
console.log(result)
A functional solution using Ramda would be slower, but depending on the number of items in the array the difference might be negligible. I usually start with a functional solution, and only if I have performance issues do I profile and then fall back to the more performant imperative option.
A readable point-free solution using Ramda would have R.groupBy and R.map as its basis. In this case I map each group's items to their [prop, value] pairs, and then use R.fromPairs to generate the object.
const { pipe, groupBy, prop, map, props, fromPairs } = R
const fn = pipe(
  groupBy(prop('name')),
  map(pipe(
    map(props(['prop', 'value'])),
    fromPairs
  ))
)
const items = [{name: 'n1', prop: 'p1', value: 90}, {name: 'b', prop: 'p2', value: 1}, {name: 'n1', prop: 'p3', value: 3}]
const result = fn(items)
console.log(result)
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.js"></script>

Related

Ramda - how to pass dynamic argument to function inside pipe

I am trying to add/use a variable inside the pipe to get the name of an object from a different object. Here is what I got so far:
I have an array of IDs allOutgoingNodes which I am using in the pipe.
Then I filter the results using the tableItemId property, add an additional property externalStartingPoint, and after that I would like to append the name of the tableItem from the tableItems object to content -> html using concat.
const startingPointId = 395;
const allNodes = {
"818": {
"id": "818",
"content": {
"html": "<p>1</p>"
},
"outgoingNodes": [
"819"
],
"tableItemId": 395
},
"821": {
"id": "821",
"content": {
"html": "<p>4</p>"
},
"tableItemId": 396
}
}
const tableItems = {
"395": {
"id": "395",
"name": "SP1",
"code": "SP1"
},
"396": {
"id": "396",
"name": "SP2",
"code": "SP2"
}
}
const allOutgoingNodes = R.pipe(
R.values,
R.pluck('outgoingNodes'),
R.flatten
)(tableItemNodes);
const result = R.pipe(
R.pick(allOutgoingNodes),
R.reject(R.propEq('tableItemId', startingPointId)),
R.map(
R.compose(
R.assoc('externalStartingPoint', true),
SomeMagicFunction(node.tableItemId),
R.over(
R.lensPath(['content', 'html']),
R.concat(R.__, '<!-- Table item name should display here -->')
)
)
),
)(allNodes);
Here is a complete working example: ramda editor
Any help and suggestions on how to improve this piece of code will be appreciated.
Thank you.
Update
In the comments, OriDrori noted a problem with my first version. I didn't really understand one of the requirements. This version tries to address that issue.
const {compose, chain, prop, values, lensPath,
pipe, pick, reject, propEq, map, assoc, over} = R
const getOutgoing = compose (chain (prop('outgoingNodes')), values)
const htmlLens = lensPath (['content', 'html'])
const addName = (tableItems) => ({tableItemId}) => (html) =>
html + ` <!-- ${tableItems [tableItemId] ?.name} -->`
const convert = (tableItemNodes, tableItems, startingPointId) => pipe (
pick (getOutgoing (tableItemNodes)),
reject (propEq ('tableItemId', startingPointId)),
map (assoc ('externalStartingPoint', true)),
map (chain (over (htmlLens), addName (tableItems)))
)
const startingPointId = 395;
const tableItemNodes = {818: {id: "818", content: {html: "<p>1</p>"}, outgoingNodes: ["819"], tableItemId: 395}, 819: {id: "819", content: {html: "<p>2</p>"}, outgoingNodes: ["820"], tableItemId: 395}};
const tableItems = {395: {id: "395", name: "SP1", code: "SP1"}, 396: {id: "396", name: "SP2", code: "SP2"}}
const allNodes = {818: {id: "818", content: {html: "<p>1</p>"}, outgoingNodes: ["819"], tableItemId: 395}, 819: {id: "819", content: {html: "<p>2</p>"}, outgoingNodes: ["820"], tableItemId: 395}, 820: {id: "820", content: {html: "<p>3</p>"}, outgoingNodes: ["821"], tableItemId: 396}, 821: {id: "821", content: {html: "<p>4</p>"}, tableItemId: 396}}
console .log (
convert (tableItemNodes, tableItems, startingPointId) (allNodes)
)
.as-console-wrapper {max-height: 100% !important; top: 0}
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.27.1/ramda.min.js"></script>
While most of the comments on the version below still apply, we should also note that chain, when applied to functions, acts like this:
chain (f, g) (x) //~> f (g (x)) (x)
So chain (over (htmlLens), addName (tableItems))
ends up being something like
(node) => over (htmlLens) (addName (tableItems) (node)) (node)
which in Ramda is equivalent to
(node) => over (htmlLens, addName (tableItems) (node), node)
which we then map over the nodes coming to it. (You can also see this in the Ramda REPL.)
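A tiny standalone illustration of that behaviour, with made-up values just to show the shape (concat and toUpper are Ramda functions):
const shout = chain (concat, toUpper)
shout ('hi') //=> 'HIhi', i.e. concat (toUpper ('hi')) ('hi')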
Original Answer
It's not trivial to weave extra arguments through a pipeline, because pipelines are designed for the simple purpose of passing a single argument down the line, transforming it at every step. There are of course techniques we could figure out for that, but I would expect them not to be worth the effort, because the only thing they would gain us is the ability to write our code point-free. And point-free should not be a goal on its own. Use it when it makes your code simpler and more readable; skip it when it doesn't.
Instead, I would break this apart with some helper functions, and then write a main function that took our arguments and passed them as necessary to helper functions inside our main pipeline. Expand this snippet to see one approach:
const {compose, chain, prop, values, lensPath, flip, concat,
pipe, pick, reject, propEq, map, assoc, over} = R
const getOutgoing = compose (chain (prop ('outgoingNodes')), values)
const htmlLens = lensPath (['content', 'html'])
const addName = flip (concat) ('Table item name goes here')
const convert = (tableItemNodes, startingPointId) => pipe (
pick (getOutgoing (tableItemNodes)),
reject (propEq ('tableItemId', startingPointId)),
map (assoc ('externalStartingPoint', true)),
map (over (htmlLens, addName))
)
const startingPointId = 395;
const tableItemNodes = {818: {id: "818", content: {html: "<p>1</p>"}, outgoingNodes: ["819"], tableItemId: 395}, 819: {id: "819", content: {html: "<p>2</p>"}, outgoingNodes: ["820"], tableItemId: 395}};
const allNodes = {818: {id: "818", content: {html: "<p>1</p>"}, outgoingNodes: ["819"], tableItemId: 395}, 819: {id: "819", content: {html: "<p>2</p>"}, outgoingNodes: ["820"], tableItemId: 395}, 820: {id: "820", content: {html: "<p>3</p>"}, outgoingNodes: ["821"], tableItemId: 396}, 821: {id: "821", content: {html: "<p>4</p>"}, tableItemId: 396}}
console .log (
convert (tableItemNodes, startingPointId) (allNodes)
)
.as-console-wrapper {max-height: 100% !important; top: 0}
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.27.1/ramda.min.js"></script>
(You can also see this on the Ramda REPL.)
Things to note
I find compose (chain (prop ('outgoingNodes')), values) to be slightly simpler than pipe (values, pluck('outgoingNodes'), flatten), but they work similarly.
I often separate out the lens definitions even if I'm only going to use them once to make the call site cleaner.
There is probably no good reason to use Ramda in addName. This would work just as well: const addName = (s) => s + 'Table item name goes here' and is cleaner. I just wanted to show flip as an alternative to using the placeholder.
There is an argument to be made for replacing
map (assoc ('externalStartingPoint', true)),
map (over (htmlLens, addName))
with
map (pipe (
assoc ('externalStartingPoint', true),
over (htmlLens, addName)
))
as was done in the original. The Functor composition law states that they have the same result. And that requires one fewer iteration through the data. But it adds some complexity to the code that I wouldn't bother with unless a performance test pointed to this as a problem.
Before I saw your answer I managed to do something like in the example below:
return R.pipe(
R.pick(allOutgoingNodes),
R.reject(R.propEq('tableItemId', startingPointId)),
R.map((node: Node) => {
const startingPointName = allTableItems[node.tableItemId].name;
return R.compose(
R.assoc('externalStartingPoint', true),
R.over(
R.lensPath(['content', 'html']),
R.concat(
R.__,
`<p class='test'>See node in ${startingPointName}</p>`
)
)
)(node);
}),
R.merge(newNodesObject)
)(allNodes);
What do you think?

How to query by multiple conditions in faunadb?

I'm trying to improve my understanding of FaunaDB.
I have a collection that contains records like:
{
"ref": Ref(Collection("regions"), "261442015390073344"),
"ts": 1587576285055000,
"data": {
"name": "italy",
"attributes": {
"amenities": {
"camping": 1,
"swimming": 7,
"hiking": 3,
"culture": 7,
"nightlife": 10,
"budget": 6
}
}
}
}
I would like to query in a flexible way by different attributes like:
data.attributes.amenities.camping > 5
data.attributes.amenities.camping > 5 AND data.attributes.amenities.hiking > 6
data.attributes.amenities.camping < 6 AND data.attributes.amenities.culture > 6 AND hiking > 5 AND ...
I created an index containing all attributes, but I don't know how to do greater-than-or-equal filtering in an index that contains multiple terms.
My fallback would be to create an index for each attribute and use Intersection to get the records that are in all subqueries that I want to check, but this feels somehow wrong:
The query: budget >= 6 AND camping >=8 would be:
Index:
{
name: "all_regions_by_all_attributes",
unique: false,
serialized: true,
source: "regions",
terms: [],
values: [
{
field: ["data", "attributes", "amenities", "culture"]
},
{
field: ["data", "attributes", "amenities", "hiking"]
},
{
field: ["data", "attributes", "amenities", "swimming"]
},
{
field: ["data", "attributes", "amenities", "budget"]
},
{
field: ["data", "attributes", "amenities", "nightlife"]
},
{
field: ["data", "attributes", "amenities", "camping"]
},
{
field: ["ref"]
}
]
}
Query:
Map(
Paginate(
Intersection(
Range(Match(Index("all_regions_by_all_attributes")), [0, 0, 0, 6, 0, 8], [10, 10, 10, 10, 10, 10]),
)
),
Lambda(
["culture", "hiking", "swimming", "budget", "nightlife", "camping", "ref"],
Get(Var("ref"))
)
)
This approach has the following disadvantages:
It does not work as expected: if, for example, the first (culture) attribute is in the range but the second (hiking) is not, I would still get returned values.
It causes a lot of reads due to the reference that I need to follow for each result.
Is it possible to store all values in this kind of index that would contain all the data? I know I can just add more values to the index and access them. But this would mean I have to create a new index as soon as we add more fields to the entity. But maybe this is a common thing.
thanks in advance
Thanks for your question. Ben already wrote out a complete example that shows what you can do and I'll base myself on his recommendations and try to clarify further.
FaunaDB's FQL is quite powerful, which means there are multiple ways to do this, yet with such power comes a small learning curve, so I'm happy to help :). The reason it took a while to answer this question is that such an elaborate answer actually deserves a complete blog post. Well, I've never written a blog post on Stack Overflow; there is a first for everything!
There are three ways to do 'compound range-like queries', but only one of them will be the most performant for your use-case, and we'll see that the first approach is actually not entirely what you need. Spoiler: the third option described here is what you need.
Preparation - Let's throw in some data just like Ben did
I'll keep it in one collection to keep things simpler, and I'm using the JavaScript flavour of the Fauna Query Language here. There is a good reason to separate the data into a second collection, though, which is related to your second Map/Get question (see the end of this answer).
Create the collection
CreateCollection({ name: 'place' })
Throw in some data
Do(
Select(
['ref'],
Create(Collection('place'), {
data: {
name: 'mullion',
focus: 'team-building',
camping: 1,
swimming: 7,
hiking: 3,
culture: 7,
nightlife: 10,
budget: 6
}
})
),
Select(
['ref'],
Create(Collection('place'), {
data: {
name: 'church covet',
focus: 'private',
camping: 1,
swimming: 7,
hiking: 9,
culture: 7,
nightlife: 10,
budget: 6
}
})
),
Select(
['ref'],
Create(Collection('place'), {
data: {
name: 'the great outdoors',
focus: 'private',
camping: 5,
swimming: 3,
hiking: 2,
culture: 1,
nightlife: 9,
budget: 3
}
})
)
)
OPTION 1: Composite indexes with multiple values
We can put as many terms and values in an index as we want and use Match and Range to query those. However! Range probably gives you something different from what you would expect if you use multiple values. Range gives you exactly what the index does, and the index sorts values lexically. The example of Range in the docs can be extended to multiple values.
Imagine we would have an index with two values and we write:
Range(Match(Index('people_by_age_first')), [80, 'Leslie'], [92, 'Marvin'])
Then the result will be every tuple that sorts lexically between [80, 'Leslie'] and [92, 'Marvin'], not the rows where each value independently falls within its own bounds. This is very scalable behaviour and exposes the raw power of the underlying index without overhead, but it is not exactly what you are looking for!
So let's move on to another solution!
OPTION 2: First Range, then Filter
Another quite flexible solution is to use Range and then Filter. This is a less good idea, however, if you are filtering out a lot, since your pages will come back emptier. Imagine that you have 10 items in a page after the Range and then use Filter; you could end up with pages of 2, 5, or 4 elements depending on what gets filtered out. It is a great idea, however, if one of these properties has such a high cardinality that it filters out most of the entities. E.g. imagine everything is timestamped, you want to get a date range first, and then continue filtering on something that will only eliminate a small percentage of the result set. I believe that in your case all of these values are quite similar, so the third solution (see below) will be the best for you.
We could in this case just throw all values in so that they all get returned which avoids a Get. For example, let's say that 'camping' is our most important filter.
CreateIndex({
name: 'all_camping_first',
source: Collection('place'),
values: [
{ field: ['data', 'camping'] },
// and the rest will not be used for filter
// but we want to return them to avoid Map/Get
{ field: ['data', 'swimming'] },
{ field: ['data', 'hiking'] },
{ field: ['data', 'culture'] },
{ field: ['data', 'nightlife'] },
{ field: ['data', 'budget'] },
{ field: ['data', 'name'] },
{ field: ['data', 'focus'] },
]
})
You can now write a query that just gets a range based on the camping value:
Paginate(Range(Match('all_camping_first'), [1], [3]))
Which should return two elements (the third has camping === 5)
Now imagine that we want to filter over these and we set our pages small to avoid unnecessary work
Filter(
Paginate(Range(Match('all_camping_first'), [1], [3]), { size: 2 }),
Lambda(
['camping', 'swimming', 'hiking', 'culture', 'nightlife', 'budget', 'name', 'focus'],
And(GTE(Var('hiking'), 0), GTE(7, Var('hiking')))
)
)
Since I want to be clear on both the advantages and disadvantages of each approach, let's show exactly how Filter works by adding another document whose attributes match our query.
Create(Collection('place'), {
data: {
name: 'the safari',
focus: 'team-building',
camping: 1,
swimming: 9,
hiking: 2,
culture: 4,
nightlife: 3,
budget: 10
}
})
Running the same query:
Filter(
Paginate(Range(Match('all_camping_first'), [1], [3]), { size: 2 }),
Lambda(
['camping', 'swimming', 'hiking', 'culture', 'nightlife', 'budget', 'name', 'focus'],
And(GTE(Var('hiking'), 0), GTE(7, Var('hiking')))
)
)
This still returns only one value, but now provides you with an 'after' cursor that points to the next page. You might think: "huh? My page size was 2?". Well, that's because Filter works after pagination, and your page originally had two entities, one of which got filtered out. So you are left with a page of 1 value and a pointer to the next page.
{
  "after": [
    ...
  ],
  "data": [
    [
      1,
      7,
      3,
      7,
      10,
      6,
      "mullion",
      "team-building"
    ]
  ]
}
You could also opt to Filter directly on the SetRef as well and only paginate afterwards. In that case, the size of your pages will contain the required size. However, keep in mind that this is an O(n) operation on the amount of elements that comes back from Range. Range uses an index but from the moment you use Filter, it will loop over each of the elements.
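A rough sketch of that variant, filtering the set first and paginating afterwards (same index, bounds, and lambda as above; untested):
Paginate(
  Filter(
    Range(Match('all_camping_first'), [1], [3]),
    Lambda(
      ['camping', 'swimming', 'hiking', 'culture', 'nightlife', 'budget', 'name', 'focus'],
      And(GTE(Var('hiking'), 0), GTE(7, Var('hiking')))
    )
  ),
  { size: 2 }
)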
OPTION 3: Indexes on one value + Intersections!
This is the best solution for your use-case but it requires a bit more understanding and an intermediate index.
When we look at the doc examples for intersection we see this example:
Paginate(
Intersection(
Match(q.Index('spells_by_element'), 'fire'),
Match(q.Index('spells_by_element'), 'water'),
)
)
This works because it's the same index twice, which means that the results are similar values (references in this case).
Let's say we add a few indexes.
CreateIndex({
name: 'by_camping',
source: Collection('place'),
values: [
{ field: ['data', 'camping']}, {field: ['ref']}
]
})
CreateIndex({
name: 'by_swimming',
source: Collection('place'),
values: [
{ field: ['data', 'swimming']}, {field: ['ref']}
]
})
CreateIndex({
name: 'by_hiking',
source: Collection('place'),
values: [
{ field: ['data', 'hiking']}, {field: ['ref']}
]
})
We can intersect on them now but it will not give us the right result. For example... let's call this:
Paginate(
Intersection(
Range(Match(Index("by_camping")), [3], []),
Range(Match(Index("by_swimming")), [3], [])
)
)
The result is empty, although we had one place with swimming 3 and camping 5.
That is exactly the problem. If swimming and camping were both the same value, we would get a result. So it's important to notice that Intersection intersects the values, and that includes both the camping/swimming value and the reference. That means we have to drop the value, since we only need the reference. The way to do that before pagination is with a join. Essentially we are going to join with another index that will just return the ref (not specifying values defaults to only the ref).
CreateIndex({
name: 'ref_by_ref',
source: Collection('place'),
terms: [{field: ['ref']}]
})
This join looks as follows:
Paginate(Join(
Range(Match(Index('by_camping')), [4], [9]),
Lambda(['value', 'ref'], Match(Index('ref_by_ref'), Var('ref'))
)))
Here we took the result of Match(Index('by_camping')) and dropped the value by joining with an index that only returns the ref. Now let's combine this and do an AND kind of range query ;)
Paginate(Intersection(
Join(
Range(Match(Index('by_camping')), [1], [3]),
Lambda(['value', 'ref'], Match(Index('ref_by_ref'), Var('ref'))
)),
Join(
Range(Match(Index('by_hiking')), [0], [7]),
Lambda(['value', 'ref'], Match(Index('ref_by_ref'), Var('ref'))
))
))
The result is two values, and both in the same page!
Note that you can easily extend or compose FQL by just using the native language (in this case JS) to make this look much nicer (note I didn't test this piece of code)
const DropAllButRef = function(RangeMatch) {
return Join(
RangeMatch,
Lambda(['value', 'ref'], Match(Index('ref_by_ref'), Var('ref'))
))
}
Paginate(Intersection(
DropAllButRef (Range(Match(Index('by_camping')), [1], [3])),
DropAllButRef (Range(Match(Index('by_hiking')), [0], [7]))
))
And a final extension: the queries above only return refs, so you'd still need a Map/Get. There is of course a way around this if you really want it, by just using another index :)
const index = CreateIndex({
name: 'all_values_by_ref',
source: Collection('place'),
values: [
{ field: ['data', 'camping'] },
{ field: ['data', 'swimming'] },
{ field: ['data', 'hiking'] },
{ field: ['data', 'culture'] },
{ field: ['data', 'nightlife'] },
{ field: ['data', 'budget'] },
{ field: ['data', 'name'] },
{ field: ['data', 'focus'] }
],
terms: [
{ field: ['ref'] }
]
})
Now your range query will get everything back without a Map/Get:
Paginate(
Intersection(
Join(
Range(Match(Index('by_camping')), [1], [3]),
Lambda(['value', 'ref'], Match(Index('all_values_by_ref'), Var('ref'))
)),
Join(
Range(Match(Index('by_hiking')), [0], [7]),
Lambda(['value', 'ref'], Match(Index('all_values_by_ref'), Var('ref'))
))
)
)
With this join approach you could even do range indexes on different collections as long as you join them to the same reference before intersecting! Pretty cool huh?
Can I store more values in the index?
Yes you can. Indexes in FaunaDB are views, so let's call them indiviews. It's a trade-off: essentially you are exchanging compute for storage. By making a view with many values you get very fast access to a certain subset of your data. But there is another trade-off, and that is flexibility. You can't just go adding elements, since that would require rewriting your whole index. In that case you would have to make a new index, wait for it to build if you have a lot of data (and yes, that is quite common), and make sure that the queries you run (look at the lambda parameters in Map/Filter) match your new index. You can always delete the other index afterwards. Just using Map/Get will be more flexible; everything in databases is a trade-off, and FaunaDB gives you both options :). I would suggest using such an approach from the moment your data model is fixed and you see a specific part in your app that you want to optimise.
Avoiding MapGet
The second question, on Map/Get, requires some explanation. Separating out the values that you will search on from the places (as Ben did) is a great idea if you want to use Join to get the actual places more efficiently. This will not require a Map/Get and therefore costs you far fewer reads, but do notice that Join is really a traverse (it will replace the current references with the target references it joins to), so if you need both the values and the actual place data in one object at the end of your query, then you will require Map/Get. Look at it from this perspective: indexes are ridiculously cheap in terms of reads and you can go quite far with those, but for some operations there is just no way around Map/Get, and Get is still only 1 read. Given that you get 100,000 for free per day, that is still not expensive :). You could also keep your pages relatively small (the size parameter in Paginate) to make sure you don't do unnecessary Gets unless your users or app require more pages.
For people reading this that do not know this yet:
1 index page === 1 read
1 get === 1 read
Final notes
We can and will make this easier in the future. However, note that you are working with a scalable distributed database and often these things are just not even possible in other solutions or very inefficient. FaunaDB provides you with very powerful structures and raw access to how indexes work and gives you many options. It does not try to be clever for you behind the scenes as this might result in very inefficient queries in case we get it wrong (that would be a bummer in a scalable pay-as-you-go system).
There are a couple of misconceptions that I think are leading you astray. The most important one: Match(Index($x)) generates a set reference, which is an ordered set of tuples. The tuples correspond to the array of fields that are present in the values section of an index. By default this will just be a one-tuple containing a reference to a document in the collection selected by the index. Range operates on a set reference and knows nothing about the terms used to select the returned set ref. So how do we compose the query?
Starting from first principles, let's imagine we just had this stuff in memory. If we had a set of (attribute, score) pairs ordered by attribute, then score, then taking only those where attribute == $attribute would get us close, and filtering by score > $score would get us what we wanted. This corresponds exactly to a range query over scores with attributes as terms, assuming we modeled the attribute/value pairs as documents. We can also embed pointers back to the location so we can retrieve that as well in the same query. Enough chatter, let's do it:
First stop: our collections.
jnr> CreateCollection({name: "place_attribute"})
{
ref: Collection("place_attribute"),
ts: 1588528443250000,
history_days: 30,
name: 'place_attribute'
}
jnr> CreateCollection({name: "place"})
{
ref: Collection("place"),
ts: 1588528453350000,
history_days: 30,
name: 'place'
}
Next up, some data. We'll choose a couple of places and give them some attributes.
jnr> Create(Collection("place"), {data: {"name": "mullion"}})
jnr> Create(Collection("place"), {data: {"name": "church cove"}})
jnr> Create(Collection("place_attribute"), {data: {"attribute": "swimming", "score": 3, "place": Ref(Collection("place"), 264525084639625739)}})
jnr> Create(Collection("place_attribute"), {data: {"attribute": "hiking", "score": 1, "place": Ref(Collection("place"), 264525084639625739)}})
jnr> Create(Collection("place_attribute"), {data: {"attribute": "hiking", "score": 7, "place": Ref(Collection("place"), 264525091487875586)}})
Now for the more interesting part. The index.
jnr> CreateIndex({name: "attr_score", source: Collection("place_attribute"), terms:[{"field":["data", "attribute"]}], values:[{"field": ["data", "score"]}, {"field": ["data", "place"]}]})
{
ref: Index("attr_score"),
ts: 1588529816460000,
active: true,
serialized: true,
name: 'attr_score',
source: Collection("place_attribute"),
terms: [ { field: [ 'data', 'attribute' ] } ],
values: [ { field: [ 'data', 'score' ] }, { field: [ 'data', 'place' ] } ],
partitions: 1
}
Ok. A simple query. Who has Hiking?
jnr> Paginate(Match(Index("attr_score"), "hiking"))
{
data: [
[ 1, Ref(Collection("place"), "264525084639625730") ],
[ 7, Ref(Collection("place"), "264525091487875600") ]
]
}
Without too much imagination one could sneak a Get call into that to pull the place out.
What about only hiking with a score over 5? We have an ordered set of tuples, so just supplying the first component (the score) is enough to get us what we want.
jnr> Paginate(Range(Match(Index("attr_score"), "hiking"), [5], null))
{ data: [ [ 7, Ref(Collection("place"), "264525091487875600") ] ] }
What about a compound condition? Hiking under 5 and swimming (any score). This is where things take a bit of a turn. We want to model conjunction, which in fauna means intersecting sets. The problem we have is that up until now we have been using an index that returns the score as well as the place ref. For intersection to work we need just the refs. Time for a sleight of hand:
jnr> Get(Index("doc_by_doc"))
{
ref: Index("doc_by_doc"),
ts: 1588530936380000,
active: true,
serialized: true,
name: 'doc_by_doc',
source: Collection("place"),
terms: [ { field: [ 'ref' ] } ],
partitions: 1
}
What's the point of such an index you ask? Well we can use it to drop any data we like from any index and be left with just the refs via join. This gives us the place refs with a hiking score less than 5 (the empty array sorts before anything, so works as a placeholder for a lower bound).
jnr> Paginate(Join(Range(Match(Index("attr_score"), "hiking"), [], [5]), Lambda(["s", "p"], Match(Index("doc_by_doc"), Var("p")))))
{ data: [ Ref(Collection("place"), "264525084639625739") ] }
So finally the piece de resistance: all places with swimming and (hiking < 5):
jnr> Let({
... hiking: Join(Range(Match(Index("attr_score"), "hiking"), [], [5]), Lambda(["s", "p"], Match(Index("doc_by_doc"), Var("p")))),
... swimming: Join(Match(Index("attr_score"), "swimming"), Lambda(["s", "p"], Match(Index("doc_by_doc"), Var("p"))))
... },
... Map(Paginate(Intersection(Var("hiking"), Var("swimming"))), Lambda("ref", Get(Var("ref"))))
... )
{
data: [
{
ref: Ref(Collection("place"), "264525084639625739"),
ts: 1588529629270000,
data: { name: 'mullion' }
}
]
}
Tada. This could be neatened up a lot with a couple of UDFs (exercise left to the reader). Conditions involving OR can be managed with Union in much the same way.
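For example, an OR over the same two sets (swimming, or hiking under 5) might look roughly like this (a sketch reusing the bindings from the query above; untested):
Let({
  hiking: Join(Range(Match(Index("attr_score"), "hiking"), [], [5]),
               Lambda(["s", "p"], Match(Index("doc_by_doc"), Var("p")))),
  swimming: Join(Match(Index("attr_score"), "swimming"),
                 Lambda(["s", "p"], Match(Index("doc_by_doc"), Var("p"))))
},
  Map(Paginate(Union(Var("hiking"), Var("swimming"))), Lambda("ref", Get(Var("ref"))))
)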
An easy way to query with multiple conditions, I think, is to query using Difference between document sets. In my solution it looks like this:
const response = await client.query(
q.Let(
{
activeUsers: q.Difference(
q.Match(q.Index("allUsers")),
q.Match(q.Index("usersByStatus"), "ARCHIVE")
),
paginatedDocuments: q.Map(
q.Paginate(q.Var("activeUsers"), {
size,
before: reqBefore,
after: reqAfter
}),
q.Lambda("x", q.Get(q.Var("x")))
),
total: q.Count(q.Var("activeUsers"))
},
{
documents: q.Var("paginatedDocuments"),
total: q.Var("total")
}
)
);
const {
documents: {
data: dbData = [],
before: dbBefore = [],
after: dbAfter = []
} = {},
total = 0
} = response || {};
const respBefore = dbBefore[0]?.value?.id || null;
const respAfter = dbAfter[0]?.value?.id || null;
const data = await dbData.map((userData) => {
const {
ref: { id = null } = {},
data: { firstName = "", lastName = "" }
} = userData;
return {
id,
firstName,
lastName
};
});
So in the query you can filter the documents bound in the Let section by whatever index you want.
Here is another variant of filtering; in SQL it would look like:
SELECT * FROM clients WHERE salary > 2000 AND age > 30;
For fauna query:
const response = await client.query(
q.Let(
{
allClients: q.Match(q.Index("allClients")),
filteredClients: q.Filter(
q.Var("allClients"),
q.Lambda(
"client",
q.And(
q.GT(q.Select(["data", "salary"], q.Get(q.Var("client"))), 2000),
q.GT(q.Select(["data", "age"], q.Get(q.Var("client"))), 30)
)
)
),
paginatedDocuments: q.Map(
q.Paginate(q.Var("filteredClients")),
q.Lambda("x", q.Get(q.Var("x")))
),
total: q.Count(q.Var("filteredClients"))
},
{
documents: q.Var("paginatedDocuments"),
total: q.Var("total")
}
)
);
This is like filtering in JavaScript, where an item ends up in the response only if the condition returns true. Example:
const filteredClients = allClients.filter((client) => {
const { salary, age } = client;
return ( salary > 2000 ) && (age > 30)
})

Use Ramda.js to pull off items from object

This question is about how to perform a task using RamdaJS.
First, assume I have an object with this structure:
let myObj = {
allItems: [
{
name: 'firstthing',
args: [
{
name: 'arg0'
},
{
name: 'arg1'
}
],
type: {
name: 'type_name_1'
}
},
{
name: 'otherthing',
args: [
{
name: 'arg0'
}
]
}
]
}
I am trying to create an object that looks like:
{
arg0: 'arg0', // myObj.allItems[0].args[0].name
typeName: 'type_name_1' // myObj.allItems[0].type.name
}
(I know the names are stupid, arg0, typeName. It's not important)
So if we weren't using Ramda, this is how I'd do it imperatively:
// The thing I'm searching for in the array (allItems)
let myName = 'firstthing';
// Here's how I'd find it in the array
let myMatch = myObj.allItems.find(item => item.name === myName);
// Here is the desired result, by manually using dot
// notation to access properties on the object (non-functional)
let myResult = {
arg0: myMatch.args[0].name,
typeName: myMatch.type.name
};
// Yields: {"arg0":"arg0","typeName":"type_name_1"}
console.log(myResult)
Finally, just for good measure, this is as far as I've gotten so far. Note that I'd really like to accomplish this in a single compose/pipe.
(An object goes in, and an object with the desired data comes out)
const ramdaResult = R.compose(
R.path(['type', 'name']),
R.find(
R.propEq('name', myName)
)
)(R.prop('allItems', myObj))
Thanks
A combination of applySpec and path should work:
const transform = applySpec ({
arg0: path (['allItems', 0, 'args', 0, 'name']),
typeName: path (['allItems', 0, 'type', 'name'])
})
const myObj = {allItems: [{name: 'firstthing', args: [{name: 'arg0'}, {name: 'arg1'}], type: {name: 'type_name_1'}}, {name: 'otherthing', args: [{name: 'arg0'}]}]}
console .log (
transform (myObj)
)
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.js"></script>
<script>const {applySpec, path} = R </script>
But depending upon your preferences, a helper function might be useful to make a slightly simpler API:
const splitPath = useWith (path, [split('.'), identity] )
// or splitPath = curry ( (str, obj) => path (split ('.') (str), obj))
const transform = applySpec({
arg0: splitPath('allItems.0.args.0.name'),
typeName: splitPath('allItems.0.type.name'),
})
const myObj = {allItems: [{name: 'firstthing', args: [{name: 'arg0'}, {name: 'arg1'}], type: {name: 'type_name_1'}}, {name: 'otherthing', args: [{name: 'arg0'}]}]}
console .log (
transform (myObj)
)
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.js"></script>
<script>const {applySpec, path, useWith, split, identity} = R </script>
splitPath is not appropriate for Ramda, but it's a useful function I often include, especially if the paths are coming from a source outside my control.
Update
Yes, I did miss that requirement. Serves me right for looking only at the input and the requested output. There are always multiple incompatible algorithms that give the same result for a specific input. So here's my mea culpa, an attempt to break this into several reusable functions.
Lenses are probably your best bet for this. Ramda has a generic lens function, and specific ones for an object property (lensProp), for an array index(lensIndex), and for a deeper path(lensPath), but it does not include one to find a matching value in an array by id. It's not hard to make our own, though.
A lens is made by passing two functions to lens: a getter which takes the object and returns the corresponding value, and a setter which takes the new value and the object and returns an updated version of the object.
An important fact about lenses is that they compose, although for technical reasons the order in which you supply them feels opposite to what you might expect.
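As a tiny illustration of that ordering, with made-up data (lensIndex and set are also Ramda functions): when composing lenses, compose reads outermost-first, left to right:
const firstOfA = compose (lensProp ('a'), lensIndex (0))
view (firstOfA, {a: [10, 20]}) //=> 10
set (firstOfA, 99, {a: [10, 20]}) //=> {a: [99, 20]}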
Here we write lensMatch, which finds or sets the value in the array where the value at a given path matches the supplied key. And we write applyLensSpec, which acts like applySpec but takes lenses in place of vanilla functions.
Using any lens, we have the view, set, and over functions which, respectively, get, set, and update the value. Here we only need view, so we could theoretically make a simpler version of lensMatch, but this could be a useful reusable function, so I keep it complete.
const lensMatch = (path) => (key) =>
  lens
    ( find (pathEq (path, key))
    , ( val
      , arr
      , idx = findIndex (pathEq (path, key), arr)
      ) =>
        update (idx > -1 ? idx : length (arr), val, arr)
    )

const applyLensSpec = (spec) => (obj) =>
  map (lens => view (lens, obj), spec)

const lensName = (name) => lensMatch (['name']) (name)

const transform = (
  name,
  nameLens = compose (lensProp ('allItems'), lensName (name))
) => applyLensSpec ({
  arg0: compose (nameLens, lensPath (['args', 0, 'name'])),
  typeName: compose (nameLens, lensPath (['type', 'name']))
})
const myObj = {allItems: [{name: 'firstthing', args: [{name: 'arg0'}, {name: 'arg1'}], type: {name: 'type_name_1'}}, {name: 'otherthing', args: [{name: 'arg0'}]}]}
console .log (
transform ('firstthing') (myObj)
)
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.js"></script>
<script>const {lens, find, pathEq, findIndex, update, length, map, view, compose, lensProp, lensPath} = R </script>
While this may feel like more work than some other solutions, the main function, transform, is pretty simple, and it's obvious how to extend it with additional behavior. And lensMatch and applyLensSpec are genuinely useful in their own right.

Only run map once, ramda js

const arr = [{
  _id: 'z11231',
  _typename: 'items',
  id: '123',
  comment: null,
  title: 'hello'
}, {
  _id: 'z11231',
  _typename: 'items',
  id: 'qqq',
  comment: 'test',
  title: 'abc'
}]
Wanted output:
[['123', null, 'hello'], ['qqq', 'test', 'abc']];
export const convertObjectsWithValues = R.map(R.values);
export const removeMongoIdAndGraphqlTypeName = R.map(R.omit(['_id', '__typename']));
export const getExcelRows = R.pipe(removeMongoIdAndGraphqlTypeName, convertObjectsWithValues);
The problem here is that I'm running two separate maps, and it's too slow. Can I combine this in a way where only one map is executed, and still keep it clean in three separate functions?
I'd be curious to see whether you've actually tested that it's too slow. The Knuth quote always seems apropos: "premature optimization is the root of all evil".
But if you've tested, and if multiple iterations are an actual bottleneck in your application, then the composition law of Functors should help. In Ramda terms this law states that
compose ( map (f), map (g) ) ≍ map (compose (f, g) )
and of course similarly that
pipe ( map (g), map (f) ) ≍ map (pipe (g, f) )
That means that you can rewrite your function like this:
const getExcelRows = map (pipe (omit ( ['_id', '_typename'] ), values ))
const arr = [
{_id: 'z11231', _typename: 'items', id: '123', comment: null, title: 'hello'},
{_id: 'z11231', _typename: 'items', id: 'qqq', comment: 'test', title: 'abc'}
]
console .log (
getExcelRows (arr)
)
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.js"></script> <script>
const {map, pipe, omit, values} = R </script>
Use R.map with R.props to state which properties you want, in the order that you want them. This will always maintain the correct order, unlike R.values, which is constrained by the way JS orders keys.
const arr = [{"_id":"z11231","_typename":"items","id":"123","comment":null,"title":"hello"},{"_id":"z11231","_typename":"items","id":"qqq","comment":"test","title":"abc"}]
const getExcelRows = keys => R.map(R.props(keys))
const result = getExcelRows(['id', 'comment', 'title'])(arr)
console.log(result)
<script src="https://cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.js"></script>

Ramda: How to filter based on a value in a nested array

I'm trying to accomplish this in a functional manner (with Ramda). My JSON is structured like this
[
{username: 'bob', age: 30, tags: ['work', 'boring']},
{username: 'jim', age: 25, tags: ['home', 'fun']},
{username: 'jane', age: 30, tags: ['vacation', 'fun']}
]
and I am trying to filter based on a value in the 'tags' property, but have not been successful. I am able to filter on ints/strings (age and username), but I can't figure out how to do so with values in nested arrays (tags). Any help would be much appreciated.
There are many ways you could do this. But I think the cleanest one would be:
R.filter(R.where({tags: R.includes('fun')}))
You can see it in action in the Ramda REPL.
Other options, especially if the field is more deeply nested is to compose (or pipe) prop or path calls with contains or possibly to take advantage of lenses.
Still, I think the answer above is most readable.
const arr = [
{username: 'bob', age: 30, tags: ['work', 'boring']},
{username: 'jim', age: 25, tags: ['home', 'fun']},
{username: 'jane', age: 30, tags: ['vacation', 'fun']}
];
res = R.filter(R.where({tags: R.contains('home')}), arr);
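For comparison, the compose-based option mentioned above might look roughly like this (a sketch, using the same data):
const funOnly = R.filter(R.compose(R.includes('fun'), R.prop('tags')))
funOnly(arr) //=> the 'jim' and 'jane' records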