RavenDb facet search in string array field with wildcard - ravendb

Is it possible to have a RavenDb faceted search, in a string[] field, where I would want to show facets (counts) only for values starting with a particular string, rather than a range?
I'll try to explain myself better with a simple example. Imagine having an index with the entries below:
ID | Values
-------------------------
1 | CatPersian, CatNormal, DogLabrador
2 | CatPersian, Camel, DogPoodle
3 | CatNormal, CatBengali, DogNormal
4 | DogNormal
I would perform a query on the above documents, and the Facet search would include a range of 'Cat*', on the 'Values' field. Is this possible? Then, I would get a result based on just the different values for cats, like:
CatPersian [2]
CatNormal [2]
CatBengali [1]

Yes, you can do that. Index the array, and then just use facets normally.
Let's see the full example. You have the following documents:
{
"Name": "John",
"FavoriteAnimals": [
"Cats",
"Dogs",
"Snails"
],
"@metadata": {
"@collection": "Kids"
}
}
{
"Name": "Jane",
"FavoriteAnimals": [
"Cats",
"Rabbits"
],
"@metadata": {
"@collection": "Kids"
}
}
Now, you create the following index:
from k in docs.Kids
from animal in k.FavoriteAnimals
select new { Animal = animal }
And run this query:
from index 'YourIndex'
where startsWith(Animal , 'ca')
select facet('Animal')
And the result will be:
{
"Name": "Animal",
"Values": [
{
"Count": 2,
"Range": "cats"
}
]
}
Alternatively, you can use this index:
from k in docs.Kids
select new { k.FavoriteAnimals }
And run this query:
from index 'YourIndex'
where startsWith(FavoriteAnimals , 'ca')
select facet('FavoriteAnimals')
The difference here is that the filter is applied per document, so the facet counts every value of each document that has at least one match, not just the matching values.
So in this case the result is:
{
"Name": "FavoriteAnimals",
"Values": [
{
"Count": 2,
"Range": "cats"
},
{
"Count": 1,
"Range": "dogs"// also, snails, rabbits
}
]
}
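The difference between the two indexes can be mimicked in plain Python. This is only a sketch of the counting logic, not RavenDB code: the function names are made up for illustration, and lowercasing the terms is an assumption that mirrors the lowercase values in the results above.

```python
from collections import Counter

# The two sample documents, reduced to the fields that matter here.
kids = [
    {"Name": "John", "FavoriteAnimals": ["Cats", "Dogs", "Snails"]},
    {"Name": "Jane", "FavoriteAnimals": ["Cats", "Rabbits"]},
]

def facet_per_animal(docs, prefix):
    """First index: one entry per animal, so the prefix filter applies to each value."""
    entries = [animal.lower() for doc in docs for animal in doc["FavoriteAnimals"]]
    return Counter(term for term in entries if term.startswith(prefix))

def facet_per_document(docs, prefix):
    """Second index: one entry per document, so a matching document counts all its values."""
    counts = Counter()
    for doc in docs:
        terms = [animal.lower() for animal in doc["FavoriteAnimals"]]
        if any(term.startswith(prefix) for term in terms):
            counts.update(terms)
    return counts

per_animal = facet_per_animal(kids, "ca")
per_document = facet_per_document(kids, "ca")
print(per_animal)
print(per_document)
```

If you only want counts for the values that start with the prefix, the first (one entry per animal) index is the one that behaves that way.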

Related

How to unpack Array to Rows in Snowflake?

I have a table that looks like the following in Snowflake:
ID | CODES
2 | [ { "list": [ { "item": "CODE1" }, { "item": "CODE2" } ] } ]
And I want to make it into:
ID | CODES
2 | 'CODE1'
2 | 'CODE2'
So far I've tried
SELECT ID,CODES[0]:list
FROM MY_TABLE
But that only gets me as far as:
ID | CODES
2 | [ { "item": "CODE1" }, { "item": "CODE2" } ]
How can I break out every 'item' element from every index of this list into its own row with each CODE as a string?
Update: Here is the answer I got working at the same time as the answer below, looks like we both used FLATTEN:
SELECT ID,f.value:item
FROM MY_TABLE,
lateral flatten(input => MY_TABLE.CODES[0]:list) f
As you note, you have hard-coded your access into the codes via codes[0], which gives you only the first item of that array. If you use FLATTEN on codes instead, you get one row per object in the outer array:
WITH my_table(id,codes) AS (
SELECT 2, parse_json('[ { "list": [ { "item": "CODE1" }, { "item": "CODE2" } ] } ]')
)
SELECT ID, c.*
FROM my_table,
table(flatten(codes)) c;
gives:
2 1 [0] 0 { "list": [ { "item": "CODE1" }, { "item": "CODE2" }]} [ { "list": [{"item": "CODE1"}, { "item": "CODE2" }]}]
Now you want to loop across the items in list, so we use another FLATTEN on that:
WITH my_table(id,codes) AS (
SELECT 2, parse_json('[ { "list": [ { "item": "CODE1" }, { "item": "CODE2" } ] } ]')
)
SELECT ID, c.value, l.value
FROM my_table,
table(flatten(codes)) c,
table(flatten(c.value:list)) l;
gives:
2 {"list":[{"item": "CODE1"},{"item":"CODE2"}]} {"item":"CODE1"}
2 {"list":[{"item": "CODE1"},{"item":"CODE2"}]} {"item":"CODE2"}
From there you can pull apart l.value to reach the parts you need, for example l.value:item, as in the update to the question.
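As a rough mental model of the two FLATTEN calls, here is a plain-Python sketch of the same nested iteration. It is not Snowflake code; the function name unpack_codes and the tuple layout are assumptions made for illustration.

```python
import json

# One table row: (ID, CODES), with CODES parsed from the sample JSON.
my_table = [
    (2, json.loads('[ { "list": [ { "item": "CODE1" }, { "item": "CODE2" } ] } ]')),
]

def unpack_codes(rows):
    """Return one (id, item) pair per inner list element."""
    result = []
    for row_id, codes in rows:
        for c in codes:            # plays the role of flatten(codes)
            for l in c["list"]:    # plays the role of flatten(c.value:list)
                result.append((row_id, l["item"]))
    return result

unpacked = unpack_codes(my_table)
print(unpacked)
```

Each level of nesting in the JSON corresponds to one loop here, and to one FLATTEN in the query.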

How to select field values from array of objects?

I have a JSON column with following JSON
{
"metadata": { "value": "JABC" },
"force": false,
"users": [
{ "id": "111", "comment": "abc" },
{ "id": "222", "comment": "abc" },
{ "id": "333" }
]
}
I am expecting a list of IDs from the query output: ["111", "222", "333"]. I tried the following query, but I am getting a null value.
select colName->'users'->>'id' ids from tableName
How to get this specific field value from the array of object?
You need to extract the array as rows and then get the id:
select json_array_elements(colName->'users')->>'id' ids from tableName;
If you're using jsonb rather than json, the function is jsonb_array_elements.
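To see why the original query returns null, here is a small Python sketch of the lookup semantics. It only imitates the behaviour and is not PostgreSQL code; the helper get_field is hypothetical.

```python
import json

col = json.loads("""
{
  "metadata": { "value": "JABC" },
  "force": false,
  "users": [
    { "id": "111", "comment": "abc" },
    { "id": "222", "comment": "abc" },
    { "id": "333" }
  ]
}
""")

def get_field(value, key):
    """Imitates a key lookup: only objects have keys, so an array yields None (NULL)."""
    return value.get(key) if isinstance(value, dict) else None

# Looking up 'id' on the users array itself finds nothing.
direct = get_field(get_field(col, "users"), "id")

# Expanding the array first, then reading 'id' from each element, works.
ids = [get_field(user, "id") for user in get_field(col, "users")]
print(direct, ids)
```

The list comprehension plays the part of json_array_elements: it turns the array into individual elements before the key is read.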

Filter nested list with kotlin

I want to filter a list within a list. All of my data models are immutable.
My JSON structure looks like this
{
"root": [
{
"id": 2,
"val": 1231.12,
"fruit": [
{
"id": 2,
"name": "apple"
}
]
},
{
"id": 3,
"val": 1231.12,
"fruit": [
{
"id": 2,
"name": "apple"
},
{
"id": 3,
"name": "orange"
}
]
}
],
"fruits": [
{
"id": 1,
"name": "apple"
},
{
"id": 2,
"name": "guava"
},
{
"id": 3,
"name": "banana"
}
]
}
Problem statement: I want to create a list of all items of root where the fruit name is apple. Currently, my naive solution looks like this. It involves creating a temporary mutable list and then adding specific items to it.
The solution below works fine, but is there a better way to achieve the same thing?
val tempList = arrayListOf<RootItem>()
root?.forEach { item ->
item.fruit.filter {
// filter condition
it.id != null && it.name == "apple"
}
tempList.add(item)
}
A combination of filter and any will do the work:
val result = root?.filter { item ->
item.fruit.any { it.id != null && it.name == "apple" }
}
BTW: Your solution will not work. The filter function returns a new list, which you are not using, and you add every item to the list regardless of whether the predicate returns true.
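The same filter-plus-any idea can be sketched in Python (used here purely for illustration; it is not Kotlin). The function name items_with_fruit is hypothetical and the data is the root list from the question.

```python
root = [
    {"id": 2, "val": 1231.12, "fruit": [{"id": 2, "name": "apple"}]},
    {"id": 3, "val": 1231.12, "fruit": [{"id": 2, "name": "apple"},
                                        {"id": 3, "name": "orange"}]},
]

def items_with_fruit(items, name):
    """Keep an item only if at least one of its fruits matches; nothing is mutated."""
    return [
        item for item in items
        if any(f.get("id") is not None and f.get("name") == name for f in item["fruit"])
    ]

apple_ids = [item["id"] for item in items_with_fruit(root, "apple")]
orange_ids = [item["id"] for item in items_with_fruit(root, "orange")]
print(apple_ids, orange_ids)
```

As in the Kotlin version, the predicate decides whether the outer item is kept, so no temporary mutable list is needed.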

Using $or selector, There is no index available for this selector

I'd like to retrieve
document with _id of 1
OR
document with value === 13 AND anotherValue === 56
Error:
There is no index available for this selector.
This is my query:
{
"selector": {
"$or": [
{
"_id": "1"
},
{
"value": "13",
"anotherValue": "56"
}
]
}
}
Indexes setup:
Your available Indexes:
special: _id
json: value, anotherValue
json: _id, value, anotherValue
For this query you need to add a selector to get all the IDs like so:
{
"selector": {
"_id": {"$gt":null},
"$or": [
{
"_id": "1"
},
{
"value": "13",
"anotherValue": "56"
}
]
}
}
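If you build such queries in code, the extra "_id" clause can be added programmatically. The following is a minimal Python sketch that only constructs the request body; the helper name with_id_guard is an assumption, and nothing here talks to CouchDB or Cloudant.

```python
import json

def with_id_guard(selector):
    """Return a copy of the selector with "_id": {"$gt": null} added at the top level."""
    guarded = {"_id": {"$gt": None}}
    guarded.update(selector)
    return {"selector": guarded}

original = {
    "$or": [
        {"_id": "1"},
        {"value": "13", "anotherValue": "56"},
    ]
}

query = with_id_guard(original)
body = json.dumps(query, indent=2)  # None is serialised as JSON null
print(body)
```

The original "$or" clause is left untouched; only the top-level "_id" condition is added.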
You can learn more here:
https://cloudant.com/blog/mango-json-vs-text-indexes/
And this SO post:
index and query items in an array with mango query for cloudant and couchdb 2.0
Alternatively, you can add a text index on all fields:
{
"index": {},
"type": "text"
}
And then your original selector should work.

SQL to mongodb conversion

I have two fields in mongodb, A and B
I would like to perform the following sql query in mongo
SELECT DISTINCT A FROM table WHERE B LIKE 'asdf'
EDIT for clarification
foo ={
bar: [{
baz:[
'one',
'two'
]
},{...}
]
}
I would like to select distinct foo objects where bar.baz contains ‘one’.  
The query:
db.runCommand({
    "distinct": "foo",
    "query": {
        "bar.baz": "one"
    },
    "key": "bar.baz"
});
This query, oddly enough, returns foo objects whose bar.baz /doesn't/ contain 'one'.
There seems to be a misunderstanding here of how the MongoDB distinct command works or indeed how any query works with arrays.
I am going to consider that you actually have documents that look something like this:
{
"_id" : ObjectId("5398f8bf0b5d1b43d3e26816"),
"bar" : [
{
"baz" : [
"one",
"two"
]
},
{
"baz" : [
"three"
]
},
{
"baz" : [
"one",
"four"
]
}
]
}
So the query that you have run, and these two forms are equivalent:
db.runCommand({
"distinct": "foo",
"query": { "bar.baz": "one" },
"key": "bar.baz"
})
db.foo.distinct("bar.baz", { "bar.baz": "one" })
Returns essentially this:
[ "four", "one", "three", "two" ]
Why? Well, because you asked it to. Let's consider a declarative way of describing what you actually invoked.
Your "query" essentially says 'Find me all the "documents" that have "bar.baz" equal to "one"', and then you are asking 'And return me all of the "distinct" values for "bar.baz"'.
So the "query" part of your statement does exactly that: it matches "documents", not the array members that equal the value you specified. In the above example you are then asking for the "distinct" values of "bar.baz", which is exactly what you get: every value of "bar.baz" in the matched documents, with "one" returned only once even though it occurs twice.
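The behaviour just described can be imitated in a few lines of Python. This is only a sketch of the semantics, not MongoDB code; the function names and the simplified "_id" value are assumptions.

```python
docs = [
    {"_id": 1, "bar": [{"baz": ["one", "two"]},
                       {"baz": ["three"]},
                       {"baz": ["one", "four"]}]},
]

def bar_baz_values(doc):
    """All values reachable through the path bar.baz in one document."""
    return [value for bar in doc["bar"] for value in bar["baz"]]

def distinct_bar_baz(collection, value):
    # Step 1: the query selects whole documents that contain the value somewhere.
    matched = [doc for doc in collection if value in bar_baz_values(doc)]
    # Step 2: distinct collects every bar.baz value from those documents.
    return sorted({v for doc in matched for v in bar_baz_values(doc)})

distinct_values = distinct_bar_baz(docs, "one")
print(distinct_values)
```

The filter in step 1 never removes array members, which is why "two", "three" and "four" still appear.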
So "query" statements do not "filter" array contents they just "match" where the condition exists. The above document matches the condition and "bar.baz" has a value of "one", and twice even. So selecting the distinct "foo" or basically the document is really:
db.foo.find({ "bar.baz": "one" })
Matching all documents that meet the condition. This is how embedding works, but perhaps you wanted something like filtering the results. So looking at returning only those items of "bar" whose "baz" has a value of "one" you would do:
db.collection.aggregate([
// Matches documents
{ "$match": { "bar.baz": "one" } },
// Unwind to de-normalize arrays as documents
{ "$unwind": "$bar" },
// Match to "filter" documents without "bar.baz" matching "one"
{ "$match": { "bar.baz": "one" } },
// Maybe group back to document with the array
{ "$group": {
"_id": "$_id",
"bar": { "$push": "$bar" }
}}
])
The result of this .aggregate() statement is the document without the member of "bar" that does not contain "one" under "baz":
{
"_id" : ObjectId("5398f8bf0b5d1b43d3e26816"),
"bar" : [
{
"baz" : [
"one",
"two"
]
},
{
"baz" : [
"one",
"four"
]
}
]
}
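The effect of the $unwind, second $match and $group stages can be approximated in Python as follows. This is a sketch of the logic only; the function name filter_bar and the simplified "_id" are assumptions.

```python
docs = [
    {"_id": 1, "bar": [{"baz": ["one", "two"]},
                       {"baz": ["three"]},
                       {"baz": ["one", "four"]}]},
]

def filter_bar(collection, value):
    """Keep only the members of "bar" whose "baz" contains the value."""
    result = []
    for doc in collection:
        kept = [bar for bar in doc["bar"] if value in bar["baz"]]  # unwind + match
        if kept:                                                   # group back by _id
            result.append({"_id": doc["_id"], "bar": kept})
    return result

filtered = filter_bar(docs, "one")
print(filtered)
```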
But then suppose you actually want just the element "bar.baz" equal to "one" and the total count of those occurrences over your whole collection, then you would want to do this:
db.collection.aggregate([
// Matches documents
{ "$match": { "bar.baz": "one" } },
// Unwind to de-normalize arrays as documents
{ "$unwind": "$bar" },
// And the inner array as well
{ "$unwind": "$bar.baz" },
// Then just match and filter out everything but the matching items
{ "$match": { "bar.baz": "one" } },
// Group to get the count
{ "$group": {
"_id": "$bar.baz",
"count": { "$sum": 1 }
}}
])
And from our single document collection sample you get:
{ "_id": "one", "count": 2 }
As there are two occurrences of that matching value.
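A Python sketch of the same counting logic, with both array levels unwound before filtering (the function name count_value is an assumption, and this is not MongoDB code):

```python
from collections import Counter

docs = [
    {"_id": 1, "bar": [{"baz": ["one", "two"]},
                       {"baz": ["three"]},
                       {"baz": ["one", "four"]}]},
]

def count_value(collection, value):
    """Unwind bar, then bar.baz, keep the matching values and count them."""
    unwound = [v for doc in collection for bar in doc["bar"] for v in bar["baz"]]
    return Counter(v for v in unwound if v == value)

counts = count_value(docs, "one")
print(counts)
```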
As for your SQL at the head of your question, that really doesn't apply to this sort of data. The more practical example would be something with data like this:
{ "A": "A", "B": "BASDFJJ" }
{ "A": "A", "B": "ASDFTT" }
{ "A": "B", "B": "CASDF" }
{ "A": "B", "B": "DKITB" }
So the "distinct" values of "A" where "B" is like "ASDF", again using aggregate and noting you are not wildcarding on either side:
db.foo.aggregate([
{ "$match": { "B": "ASDF" } },
{ "$group": { "_id": "$A" } }
])
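Against a collection that contains a document with "A": "A" and "B" exactly equal to "ASDF", this essentially produces the following (none of the four sample documents above is an exact match, so on that sample alone the result is empty):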
{ "_id": "A" }
Or, with wildcards on either side ("%ASDF%"), use a $regex query to match:
db.foo.aggregate([
{ "$match": { "B": { "$regex": "ASDF" } } },
{ "$group": { "_id": "$A" } }
])
So only two results:
{ "_id": "A" }
{ "_id": "B" }
Where if you were "counting" the distinct matches then you would see 2 and 1 as the counts respectively according to the documents that matched.
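The wildcard case, including the counts just mentioned, can be checked with a short Python sketch over the four sample documents. It imitates the $regex match and the grouping on "A"; the function name distinct_a is an assumption.

```python
import re
from collections import Counter

docs = [
    {"A": "A", "B": "BASDFJJ"},
    {"A": "A", "B": "ASDFTT"},
    {"A": "B", "B": "CASDF"},
    {"A": "B", "B": "DKITB"},
]

def distinct_a(collection, pattern):
    """Count documents per value of "A" where "B" contains the pattern anywhere."""
    return Counter(doc["A"] for doc in collection if re.search(pattern, doc["B"]))

matches = distinct_a(docs, "ASDF")
print(sorted(matches))  # the distinct values of "A"
print(matches)          # how many documents matched for each
```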
Take a further look at the SQL Mapping Chart and the SQL to Aggregation Mapping Chart contained within the documentation. It should help you in understanding how common actions actually translate.