Algolia Kotlin DSL query using AND + OR - kotlin

Example Search Data Structure:
{
"tpe": "HOME",
"sid": "fyyb1-YQWMAs6Y8vGrk6OAcgjZ-XzTY03Ngfr",
"sessionCreatedAtUtc": 1623854018195,
"title": "Baked Enchiladas",
"recipeCreatedAtUtc": 1623854008999,
"releaseStatus": 0,
"rid": "iBtk3PJ7HS9JKLRu4PHa",
"uid": "SelHOKTaw1k4WZpTH9y",
"desc": "Some info about this recipe...",
"objectID": "iBtk3PJ7HS9JKLRu4PHa"
}
Query I am unable to build:
Search for text, where uid = "SelHOKTaw1k4WZpTH9y" OR (tpe = "PRO" AND releaseStatus=1)
So far I have only been able to get the latter part of filtering to work:
filters {
and {
facet("tpe", "PRO")
facet("releaseStatus", 1)
}
}

For anyone else that struggled through this, I found a solution.
First and foremost, from the Algolia docs:
Combining ANDs and ORs
While you may use as many ANDs and ORs as you
need, you’ll need to be careful about how you combine them.
For performance reasons, we do not allow you to put groups of ANDs
within ORs. Here are some considerations to take into account:
We allow ( x AND (y OR z) AND (a OR b) )
We allow ( x AND y OR z AND a OR b )
We don’t allow ( x OR ( y AND z) OR ( a AND b) )
We don’t allow (a AND b) OR (c AND d)
Source
So the query I was attempting to develop was not easily possible.
In the Kotlin Algolia SDK:
and {
facet("tpe", "PRO")
facet("releaseStatus", 1)
}
Means tpe = PRO AND releaseStatus = 1. You can verify this when you attach your debugger to the result from the query.filter DSL block.
filters {
and {
facet("tpe", "PRO")
facet("releaseStatus", 1)
}
orFacet {
facet("uid", "SelHOKTaw1k4WZpTH9y")
facet("rid", "iBtk3PJ7HS9JKLRu4PHa")
}
}
Means (tpe = PRO AND releaseStatus = 1) AND (uid = SelHOKTaw1k4WZpTH9y OR rid = iBtk3PJ7HS9JKLRu4PHa). The junction between all blocks in the filter block is always AND. And that aligns with the documentation.
My solution: add another attribute to each search record. I combined "tpe" and "releaseStatus" into a single attribute, tpeRs, so a record with tpe = PRO and releaseStatus = 1 gets tpeRs = PRO-1.
The resulting query that fits my original question:
filters {
orFacet {
facet("uid", userId)
facet("tpeRs", "PRO-1")
}
}
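The combined attribute has to be written to every record before it is indexed. A minimal sketch of that step, in JavaScript for brevity: the helper name withCombinedFacet is hypothetical, and the value format simply follows the PRO-1 convention above. The same logic applies in whichever language builds your records.

```javascript
// Hypothetical helper: derive the combined facet value before a record is indexed.
function withCombinedFacet(record) {
  // Copy the record and add tpeRs, e.g. "PRO" + "-" + 1 => "PRO-1"
  return { ...record, tpeRs: `${record.tpe}-${record.releaseStatus}` };
}

const record = {
  objectID: "iBtk3PJ7HS9JKLRu4PHa",
  tpe: "PRO",
  releaseStatus: 1,
  uid: "SelHOKTaw1k4WZpTH9y",
};
const indexed = withCombinedFacet(record);
console.log(indexed.tpeRs); // PRO-1
```

Records indexed before this change would need to be re-indexed so that they carry the new attribute too.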

How do I perform a graph query and join?

I apologize for the title, I don't exactly know how to word it. But essentially, this is a graph-type query but I know RavenDB's graph functionality will be going away so this probably needs to be solved with Javascript.
Here is the scenario:
I have a bunch of documents of different types, call them A, B, C, D. Each of these particular types of documents have some common properties. The one that I'm interested in right now is "Owner". The owner field is an ID which points to one of two other document types; it can be a Group or a User.
The Group document has a 'Members' field which contains an ID which either points to a User or another Group. Something like this
It's worth noting that the documents in play have custom IDs that begin with their entity type. For example Users and Groups begin with user: and group: respectively. Example IDs look like this: user:john#castleblack.com or group:the-nights-watch. This comes into play later.
What I want to be able to do is the following type of query:
"Given that I have either a group id or a user id, return all documents of type a, b, or c where the group/user id is equal to or is a descendant of the document's owner."
In other words, I need to be able to return all documents that are owned by a particular user or group either explicitly or implicitly through a hierarchy.
I've considered solving this a couple different ways with no luck. Here are the two approaches I've tried:
Using a function within a query
With Dejan's help in an email thread, I was able to devise a function that would walk its way down the ownership graph. What this attempted to do was build a flat array of IDs which represented explicit and implicit owners (i.e. root + descendants):
declare function hierarchy(doc, owners){
owners = owners || [];
while(doc != null) {
let ownerId = id(doc)
if(ownerId.startsWith('user:')) {
owners.push(ownerId);
} else if(ownerId.startsWith('group:')) {
owners.push(ownerId);
doc.Members.forEach(m => {
let owner = load(m, 'Users') || load(m, 'Groups');
owners = hierarchy(owner, owners);
});
}
}
return owners;
}
I had two issues with this. First, I don't actually know how to use this in a query. I tried to use it as part of the where clause, but apparently that's not allowed:
from #all_docs as d
where hierarchy(d) = 'group:my-group-d'
// error: method hierarchy not allowed
Second, if I tried anything in the select statement, I got an error that I had exceeded the number of allowed statements.
As a custom index
I tried the same idea through a custom index. Essentially, I tried to create an index that would produce an array of IDs using roughly the same function above, so that I could just query where my id was in that array
map('#all_docs', function(doc) {
function hierarchy(n, graph) {
while(n != null) {
let ownerId = id(n);
if(ownerId.startsWith('user:')) {
graph.push(ownerId);
return graph;
} else if(ownerId.startsWith('group:')){
graph.push(ownerId);
n.Members.forEach(g => {
let owner = load(g, 'Groups') || load(g, 'Users');
hierarchy(owner, graph);
});
return graph;
}
}
}
function distinct(value, index, self){ return self.indexOf(value) === index; }
let ownerGraph = []
if(doc.Owner) {
let owner = load(doc.Owner, 'Groups') || load(doc.Owner, 'Users');
ownerGraph = hierarchy(owner, ownerGraph).filter(distinct);
}
return { Owners: ownerGraph };
})
// error: recursion is not allowed by the javascript host
The problem with this is that I'm getting an error that recursion is not allowed.
So I'm stumped now. Am I going about this wrong? I feel like this could be a subquery of sorts or a filter by function, but I'm not sure how to do that either. Am I going to have to do this in two separate queries (i.e. two round-trips), one to get the IDs and the other to get the docs?
Update 1
I've revised my attempt at the index to the following and I'm not getting the recursion error anymore, but assuming my queries are correct, it's not returning anything
// Entity/ByOwnerGraph
map('#all_docs', function(doc) {
function walkGraph(ownerId) {
let owners = []
let idsToProcess = [ownerId]
while(idsToProcess.length > 0) {
let current = idsToProcess.shift();
if(current.startsWith('user:')){
owners.push(current);
} else if(current.startsWith('group:')) {
owners.push(current);
let group = load(current, 'Groups')
if(!group) { continue; }
idsToProcess.concat(group.Members)
}
}
return owners;
}
let owners = [];
if(doc.Owner) {
owners.concat(walkGraph(doc.Owner))
}
return { Owners: owners };
})
// query (no results)
from index Entity/ByOwnerGraph as x
where x.Owners = "group:my-group-id"
// alternate query (no results)
from index Entity/ByOwnerGraph as x
where x.Owners ALL IN ("group:my-group-id")
I still can't use this approach in a query either as I get the same error that there are too many statements.
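One thing worth checking in the revised index: Array.prototype.concat returns a new array and leaves the original untouched, so both idsToProcess.concat(group.Members) and owners.concat(walkGraph(doc.Owner)) discard their results, which would leave Owners empty for every document. Below is a plain JavaScript sketch of the same iterative walk that appends with push instead. The docs map and the load function are in-memory stand-ins made up so the snippet runs outside RavenDB; they are not RavenDB's API, and the group:rangers data is invented for illustration.

```javascript
// In-memory stand-in for the database (hypothetical data, not RavenDB's API).
const docs = {
  "group:the-nights-watch": { Members: ["user:john#castleblack.com", "group:rangers"] },
  "group:rangers": { Members: ["user:benjen#castleblack.com"] },
};
const load = (id) => docs[id] || null;

// Iteratively collect the owner id plus every descendant member id.
function walkGraph(ownerId) {
  const owners = [];
  const seen = new Set(); // guards against cyclic group membership
  const idsToProcess = [ownerId];
  while (idsToProcess.length > 0) {
    const current = idsToProcess.shift();
    if (seen.has(current)) continue;
    seen.add(current);
    if (current.startsWith("user:")) {
      owners.push(current);
    } else if (current.startsWith("group:")) {
      owners.push(current);
      const group = load(current);
      if (!group) continue;
      idsToProcess.push(...group.Members); // push mutates the queue; concat would not
    }
  }
  return owners;
}

const result = walkGraph("group:the-nights-watch");
console.log(result);
```

In the index itself the equivalent change would be to assign the returned array (owners = walkGraph(doc.Owner)) or push into it; whether the index and queries then behave as intended in RavenDB is not verified here.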

In Kotlin, How to groupBy only subsequent items? [duplicate]

I want to groupBy a list of items by its value, but only if subsequent, and ignore grouping otherwise:
input:
val values = listOf("Apple", "Apple", "Grape", "Grape", "Apple", "Cherry", "Cherry", "Grape")
output: {"Apple"=2, "Grape"=2, "Apple"=1, "Cherry"=2, "Grape"=1}
There's no built in option for this in Kotlin - it has to be custom, so there are many different options.
Because you need to keep track of the previous element, to compare the current one against, you need to have some sort of state. To achieve this you could use zipWithNext or windowed to group elements. Or use fold and accumulate the values into a list - removing and adding the last element depending on whether there's a break in the sequence.
To try and keep things a bit clearer (even if it breaks the norms a bit) I recommend using vars and a single loop. I used the buildList { } DSL, which creates a clear scope for the operation.
val result: List<Pair<String, Int>> = buildList {
    var previousElement: String? = null
    var currentCount = 0
    // iterate over each incoming value
    values.forEach { currentElement: String ->
        if (currentElement == previousElement) {
            // same as the previous element - bump the count of the last entry
            currentCount++
            set(size - 1, currentElement to currentCount)
        } else {
            // break in the sequence - start a new entry with a count of 1
            currentCount = 1
            add(currentElement to currentCount)
        }
        // end this iteration - update 'previous'
        previousElement = currentElement
    }
}
Note that result will match the order of your initial list.
You could use a MultiValueMap, which can hold duplicate keys. Since there is no built-in type for this, you would have to implement it yourself or use an open-source library.
Here is a reference.
Map implementation with duplicate keys
For comparison purposes, here's a short but inefficient solution written in the functional style using fold():
fun <E> List<E>.mergeConsecutive(): List<Pair<E, Int>>
= fold(listOf()) { acc, e ->
if (acc.isNotEmpty() && acc.last().first == e) {
val currentTotal = acc.last().second
acc.dropLast(1) + (e to currentTotal + 1)
} else
acc + (e to 1)
}
The accumulator builds up the list of pairs, incrementing its last entry when we get a duplicate, or appending a new entry when there's a different item. (You could make it slightly shorter by replacing the currentTotal with a call to let(), but that would be even harder to read.)
It uses immutable Lists and Pairs, and so has to create a load of temporary ones as it goes — which makes this pretty inefficient (𝒪(𝑛²)), and I wouldn't recommend it for production code. But hopefully it's instructive.

How do I read data in one cell and write data into another cell using Google Sheets?

So let's all assume that column B is filled with multiple, short statements. These statements may be used more than once, not at all, or just once throughout the column. I want to be able to read what's in each cell of column B and assign a category to it in column F using the Google Sheets script editor. I'll include some pseudo-code of how I would do something like this normally.
for (var i = 0; i < statements.length; i++) {
if (statements[i] == 'Description One') {
category[i] = 'Category One';
}
else if (statements[i] == 'Description Two') {
category[i] = 'Category Two';
}
// and so on for all known categories....
}
How do I go about accessing a cell for a read and accessing a different cell for a write?
Thanks in advance for the help!
Ok, so after a little more thought on the subject, I've arrived at a solution. It's super simple, albeit tedious
function assignCategory(description) {
if (description == 'Description One') {
return 'Category One';
}
// and so on for all known categories
}
Hopefully someone will see this and be helped. If you think of a more efficient and easier-to-maintain way of doing this, by all means chime in.
Assuming a sheet such as this one, which has a header and six different columns (where B is the description, and F the category); you could use a dictionary to translate your values as follows:
// (description -> category) dictionary
var translations = {
"cooking": "Cooking",
"sports": "Sport",
"leisure": "Leisure",
"music": "Music",
"others": "Other"
}
function assignCategories() {
var dataRange = SpreadsheetApp.getActiveSheet().getDataRange();
for (var i=2; i<=dataRange.getNumRows(); i++) {
var description = dataRange.getCell(i, 2).getValue();
var category = translations[description];
dataRange.getCell(i, 6).setValue(category);
}
}
In case you need additional ruling (i.e. descriptions that contain cricket must be classified as sport), you could accomplish your desired results by implementing your own custom function and using string functions (such as indexOf) or regular expressions.
Using indexOf
// (description -> category) dictionary
var translations = {
"cooking": "Cooking",
"sports": "Sport",
"leisure": "Leisure",
"music": "Music",
"others": "Other"
}
function assignCategories() {
var dataRange = SpreadsheetApp.getActiveSheet().getDataRange();
for (var i=2; i<=dataRange.getNumRows(); i++) {
var description = dataRange.getCell(i, 2).getValue()
var category = assignCategory(description);
if (category) dataRange.getCell(i, 6).setValue(category);
}
}
function assignCategory(description) {
description = description.toLowerCase();
var keys = Object.keys(translations);
for (var i=0; i<keys.length; i++) {
var currentKey = keys[i];
if (description.indexOf(currentKey) > -1)
return translations[currentKey];
}
}
This version is a bit more sophisticated. It will make the 'description' of each row lowercase in order to better compare with your dictionary, and also uses indexOf for checking whether the 'translation key' appears in the description, rather than checking for an exact match.
You should be aware, however, that this method will be considerably slower, and that the script may time out (see GAS Quotas). If that becomes a problem, you could implement a way to 'resume' the script so that you can re-run it and continue where it left off.
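One way to cut down on the per-cell calls, sketched under assumptions: read the whole description column with a single getValues() call, map it in memory, and write the results back with a single setValues() call. The pure function below does the in-memory part and runs in any JavaScript environment. The name categorize is made up for this example, and the Apps Script wiring is shown only as comments because it needs a live spreadsheet.

```javascript
// (description -> category) dictionary, as in the answer above
var translations = {
  "cooking": "Cooking",
  "sports": "Sport",
  "leisure": "Leisure",
  "music": "Music",
  "others": "Other"
};

// Takes a 2D array shaped like Range.getValues() output ([[desc], [desc], ...])
// and returns a 2D array of the same height, suitable for Range.setValues().
function categorize(rows, dictionary) {
  var keys = Object.keys(dictionary);
  return rows.map(function (row) {
    var description = String(row[0]).toLowerCase();
    for (var i = 0; i < keys.length; i++) {
      if (description.indexOf(keys[i]) > -1) return [dictionary[keys[i]]];
    }
    return [""]; // no match: the category cell is written as blank
  });
}

// Possible wiring in Apps Script (untested sketch):
//   var sheet = SpreadsheetApp.getActiveSheet();
//   var n = sheet.getLastRow() - 1;                       // rows below the header
//   var input = sheet.getRange(2, 2, n, 1).getValues();   // column B
//   sheet.getRange(2, 6, n, 1).setValues(categorize(input, translations)); // column F

var sample = categorize([["Cooking pasta"], ["Live music night"], ["taxes"]], translations);
console.log(sample);
```

Unlike the per-cell version, this sketch overwrites column F with a blank when nothing matches, so existing categories would be lost for those rows; keep the per-cell check if that matters.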

What is the most optimized way to get a set of rows which is present in the middle of the list in java 8?

I've a list of items. I want to process a set of items which are in the middle of the list.
Ex: Assume a list of employees who have id, first name, last name and middle name as attributes.
I want to consider all rows between lastName "xxx" and "yyy" and process them further.
How can this be optimized in Java8? Optimization is my first concern.
I tried Java 8 streams and parallel streams, but a stream's forEach cannot be terminated early with break, and a lambda cannot modify a local variable declared outside it (such as the "start" variable below).
Below is the code which I need to optimize:
boolean start = false;
for(Employee employee: employees) {
if(employee.getLastname().equals("yyy")) {
break;
}
if(start) {
// My code to process
}
if(employee.getLastname().equals("xxx")) {
start = true;
}
}
What is the best way to handle the above problem in Java8?
That is possible in java-9 via (I've simplified your example):
Stream.of(1, 2, 3, 4, 5, 6)
.dropWhile(x -> x != 2)
.takeWhile(x -> x != 6)
.skip(1)
.forEach(System.out::println);
This will get the values in the range 2 - 6, that is it will print 3,4,5.
Or for your example:
employees.stream()
.dropWhile(e -> !e.getLastname().equals("xxx"))
.takeWhile(e -> !e.getLastname().equals("yyy"))
.skip(1)
.forEach(....)
There are back-ports for dropWhile and takeWhile, see here and here
EDIT
Or you can get the indexes of those delimiters first and then do a subList (but this assumes that xxx and yyy are unique in the list of employees, and that xxx comes before yyy):
int[] indexes = IntStream.range(0, employees.size())
.filter(x -> employees.get(x).getLastname().equals("xxx") || employees.get(x).getLastname().equals("yyy"))
.toArray();
employees.subList(indexes[0] + 1, indexes[1])
.forEach(System.out::println);

SQL - Optimization - Select those that do and do not satisfy a condition into 2 separate tables

To select those that do and do not satisfy a condition into 2 separate tables I can use the following:
select * from myTable where someThing=value into qMatchCondition
select * from myTable where someThing<>value into qNotMatchCondition
However, I expect this would be a waste of time! That is, when SQL interprets the code, will it not do something like this:
results = []
database.getTable("myTable").records.foreach(record){
if(record.someThing == value){
results.push(record)
}
}
QueryTable.new("qMatchCondition",results)
results = []
database.getTable("myTable").records.foreach(record){
if(record.someThing != value){
results.push(record)
}
}
QueryTable.new("qNotMatchCondition",results)
When writing the algorithm manually I would do something akin to the following:
results1 = []
results2 = []
database.getTable("myTable").records.foreach(record){
if(record.someThing == value){
results1.push(record)
} else {
results2.push(record)
}
}
QueryTable.new("qMatchCondition",results1)
QueryTable.new("qNotMatchCondition",results2)
I expect this would depend on the SQL engine you are using, but are there any ways to force the SQL engine to "compile" to an optimized algorithm? Or do SQL engines generally compile statements anyway?