How do I filter/sort a sequenced document that belongs in multiple categories in Solr without grouping?

I'm looking for some help and wisdom on how to properly design the schema for indexing documents for my situation. Basically I have products which can belong in multiple categories. Within those categories these products may or may not be sequenced. Ideally I'd like to keep just one unique document per product.
I'm using Solr 3.4.0 and currently have documents with this structure:
{
productId : "1",
sku : "ABC123",
productName : "My Product",
categorySequence : ["123-1", "456-7", "789-noseq", "000-noseq"],
description : "Product description",
rating: "4.36"
}
The categorySequence field is where I'm having trouble. It's a multivalued field containing strings formatted as the category id and the sequence of my product within that category, separated by a dash. Where the product is not sequenced in a category, I've arbitrarily appended "noseq".
Since my product can exist in multiple categories, I do a filter query on the categorySequence field like this:
fq=categorySequence:123-*
which works for returning only the products in the category with the id "123".
However, I have since discovered that you can't sort on multivalued fields. I was initially hoping this would be a quick way to sort the filtered products into the appropriate sequence.
I've seen some other suggestions on here regarding grouping and having multiple documents for the same product. However, my products can exist in lots of categories, which, as you can imagine, would create a lot of documents.
I'm hoping to stick with a single document representing a single product. Can someone help point me in the right direction? I guess I'm basically looking at doing a filter and a sort on a two dimensional field?

We faced a similar issue, and here is what we implemented.
Create a dynamic field for each category, named with the category id and holding the product's sort sequence in that category.
Field definition:
<dynamicField name="*_sort_seq" type="string" indexed="true" stored="false" sortMissingLast="true"/>
Data fed to Solr:
123_sort_seq=1
456_sort_seq=7
There is no need to store anything for categories without a sort sequence. The position of those products can be handled with the sortMissingLast and sortMissingFirst attributes.
These fields maintain the position/sequence of the product within each category.
Since you know the category id at query time, you can easily filter and sort the products:
fq=categorySequence:123-*&sort=123_sort_seq asc
This way you won't need to maintain multiple copies of the products.
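As a rough illustration of the approach above, here is a minimal Python sketch that derives the per-category sort fields from the question's category/sequence pairs and builds the matching filter-and-sort parameters. The helper names (build_solr_doc, category_query) are hypothetical, and nothing here talks to Solr; it only prepares the data. One caveat worth keeping in mind: a string-typed field sorts lexicographically, so "10" would sort before "2", and a numeric field type is one way around that if sequences go beyond a single digit.

```python
from urllib.parse import urlencode

NO_SEQUENCE = "noseq"  # marker the question uses for unsequenced categories


def build_solr_doc(product, category_sequences):
    """Return a document with one <categoryId>_sort_seq field per sequenced category."""
    doc = dict(product)
    # Keep the multivalued field so fq=categorySequence:<id>-* keeps working.
    doc["categorySequence"] = [
        f"{cat}-{seq}" for cat, seq in category_sequences.items()
    ]
    for cat, seq in category_sequences.items():
        if seq != NO_SEQUENCE:
            # Unsequenced categories get no field at all;
            # sortMissingLast then places those products last.
            doc[f"{cat}_sort_seq"] = seq
    return doc


def category_query(category_id):
    """Return request parameters that filter by category and sort by its sequence."""
    return {
        "q": "*:*",
        "fq": f"categorySequence:{category_id}-*",
        "sort": f"{category_id}_sort_seq asc",
    }


doc = build_solr_doc(
    {"productId": "1", "sku": "ABC123"},
    {"123": "1", "456": "7", "789": NO_SEQUENCE, "000": NO_SEQUENCE},
)
print(doc)
print(urlencode(category_query("123")))
```

The document still carries categorySequence for filtering, while sorting uses only the single-valued dynamic field for the category being viewed.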

Related

Elastic Search, Nest. functional sorting

I'm building a filter page, with facets etc, which works as it should.
Now our customer has a request to, basically, "be able to decide the order in which the items come out".
Each product is decorated with a Product Display Order and is in a Product Line.
We have these example Product Display Orders:
1. Featured Item
2. Core Item
3. Spare Part
4. Utility
And these Product Lines:
1. Hammers
2. Saw
3. Wood
and the sorting works like this:
Sorting should be based first on Product Display Order, second on Product Line, and third alphabetically.
So all products that are a Featured Item are listed first, and those Featured Items are then sorted by their Product Line; if several products share the same Product Display Order and Product Line, they are sorted alphabetically.
The challenge is: I can't get the sort position of the Product Display Orders and Product Lines as a number on the product; I only have a name/id.
We've thought of boosting based on whether the product is in the different categories, but that seems a bit messy.
OR
See if it is possible to have some logic in the sorting:
Sort by productDisplayOrder:
1. featured, 2. core Item ...
Then by ProductLines:
1. Hammers, 2. Saw ...
Then by Name DESC.
Which is the best way to get this sorting? Is it possible to give this logic to Elasticsearch, so that it checks for a match and then sorts accordingly? Or do we need to tweak the boosts of the products?
Hopefully this makes sense to you.
Thanks in advance! :)
Option 1) The quickest and best-performing solution would be to create new, separate integer fields for productDisplayOrder and ProductLine and then use those in your sort criteria as described (after reindexing and validating that the data is indexed as expected).
Option 2) If you want more nuance than described (e.g. higher-scoring matches can 'break through' the ordering ceiling described), you can explore using a Function Score Query to implement a custom scoring strategy that takes productDisplayOrder and ProductLine into consideration when generating an overall match score.
Option 3) If you can't change the mapping and reindex your data, you can use Script-Based Sorting to generate sort values from the currently indexed productDisplayOrder/ProductLine text using a script (e.g. Groovy). Keep in mind that query performance will be worse than with the first two options.
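Option 1 could look roughly like the Python sketch below. The rank tables use the names from the question, but the field names displayOrderRank and productLineRank, and the assumption that name is a sortable (not analyzed) field, are illustrative only. The sort clause is just the request-body fragment; no client library is involved.

```python
# Hypothetical rank tables: names come from the question,
# numbers encode the order the customer wants.
DISPLAY_ORDER_RANK = {"Featured Item": 1, "Core Item": 2, "Spare Part": 3, "Utility": 4}
PRODUCT_LINE_RANK = {"Hammers": 1, "Saw": 2, "Wood": 3}


def add_rank_fields(product):
    """Return a copy of the product with integer rank fields, to be indexed."""
    enriched = dict(product)
    enriched["displayOrderRank"] = DISPLAY_ORDER_RANK[product["productDisplayOrder"]]
    enriched["productLineRank"] = PRODUCT_LINE_RANK[product["productLine"]]
    return enriched


# Sort fragment for the search request body (field names are assumptions).
SORT_CLAUSE = [
    {"displayOrderRank": {"order": "asc"}},
    {"productLineRank": {"order": "asc"}},
    {"name": {"order": "asc"}},
]

products = [
    {"name": "Claw Hammer", "productDisplayOrder": "Core Item", "productLine": "Hammers"},
    {"name": "Hand Saw", "productDisplayOrder": "Featured Item", "productLine": "Saw"},
    {"name": "Mallet", "productDisplayOrder": "Featured Item", "productLine": "Hammers"},
    {"name": "Ball Peen", "productDisplayOrder": "Featured Item", "productLine": "Hammers"},
]

# Local equivalent of the three-level sort, to check the intended ordering.
enriched = [add_rank_fields(p) for p in products]
ordered = sorted(
    enriched,
    key=lambda p: (p["displayOrderRank"], p["productLineRank"], p["name"]),
)
print([p["name"] for p in ordered])
```

Keeping the rank tables in one place means the customer's ordering can change without touching the query; only a reindex of the rank fields is needed.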

Searching products using details.name and details.value using the Best Buy API

The Best Buy search API allows searching for products by specifying criteria on the details.name and details.value fields.
http://api.remix.bestbuy.com/v1/products(details.name="Processor Speed" & details.value="2.4Ghz")?apiKey=YOURKEY
However, details is a collection. The query above actually returns all products that have a detail entry named "Processor Speed" and a detail entry whose value is "2.4Ghz", but not necessarily in the same details entry. Is there a way to create a query that returns only products for which the name and the value belong to the same details entry?
Unfortunately there is no way to do this unless the particular detail you are interested in has been exposed as a top level attribute (processor speed has not). To accomplish this you will need to run your query as you have described, and then comb through the results and remove the irrelevant products in your own code.
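The client-side filtering step could be sketched as follows in Python. This assumes each returned product has been parsed into a dict with a details list of {"name": ..., "value": ...} entries, which is inferred from the details.name and details.value query fields rather than taken from the API documentation; the function names are hypothetical.

```python
def has_matching_detail(product, name, value):
    """True if a single details entry carries both the given name and value."""
    return any(
        entry.get("name") == name and entry.get("value") == value
        for entry in product.get("details", [])
    )


def filter_products(products, name, value):
    """Keep only products where name and value occur in the same details entry."""
    return [p for p in products if has_matching_detail(p, name, value)]


# Made-up sample data in the assumed shape.
results = [
    {"sku": 1, "details": [{"name": "Processor Speed", "value": "2.4Ghz"}]},
    {"sku": 2, "details": [
        {"name": "Processor Speed", "value": "3.0Ghz"},
        {"name": "Wireless Frequency", "value": "2.4Ghz"},
    ]},
]
kept = filter_products(results, "Processor Speed", "2.4Ghz")
print([p["sku"] for p in kept])
```

The second sample product is the false positive described in the question: both criteria match, but in different entries, so it is dropped.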

How to work with map property in RQL (Oracle's ATG Web Commerce)

We use Oracle's ATG Web Commerce for our project, and we currently need to construct an RQL query that fetches products whose SKUs' tacticalTradeStatuses contain a certain status, ordered by the status value.
To briefly describe the relationship between the entities: the product item descriptor contains a list of SKUs. Each SKU contains a map, tacticalTradeStatuses (key: tactical trade status, value: sequence).
For example, how do we obtain all products whose SKUs' tacticalTradeStatuses property contains the key 'BEST_SELLER', ordered by the value associated with the key 'BEST_SELLER'?
We want to pass the key by which we select products as an RQL parameter.
I can think of two ways of doing that.
The first way:
1) First create a query that fetches all the products based on the map key BEST_SELLER.
2) Then pass the result to the ForEach droplet and add sort properties, which sort the result according to your requirements.
For sorting, please refer to the link below:
http://docs.oracle.com/cd/E23095_01/Platform.93/PageDevGuide/html/s1316foreach01.html
The second way, I think, is to use query options in RQLStatement, which work the same way as the sort properties in ForEach.
It would help if you could provide some of the XML repository structure. Hope this helps.

How do I make a shopping list from multiple recipes in an SQL database?

I'm working on using a recipe database in SQL, like the one in the question "Structuring a recipe database" (which has helped a lot already), to combine the ingredients of several user-selected recipes into a shopping list.
Also, the items on this shopping list are to be divided in two categories (eg: "groceries" and "check pantry")
Example case:
User can select 7 recipes to make a weekly mealplan in a form (almost got this part)
The given output is a shopping list of all the ingredients marked as "groceries" and a "check stock" list of all the ingredients marked as "pantry".
Any help at all would be much appreciated!
I had just posted a full solution, but given the subject here looks like it may be homework, I'm just going to point you in the right direction. If this isn't homework, leave a comment and I'll put the full solution back.
Since you have multiple recipes, a normal selection based on joins would give you back multiple rows per ingredient. You want some way to roll up all of the rows for a given ingredient into a single row and show a total of the quantity that you need.
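To make the roll-up idea concrete, here is a small sketch using Python's sqlite3 module with an in-memory database. The tables ingredient and recipe_ingredient, their columns, and the sample rows are all hypothetical; the real schema from the linked question may differ. It shows one way to collapse the joined rows into one row per ingredient with GROUP BY and SUM, not the answerer's withheld solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ingredient (id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE recipe_ingredient (recipe_id INTEGER, ingredient_id INTEGER, quantity INTEGER);
    INSERT INTO ingredient VALUES
        (1, 'flour', 'pantry'), (2, 'eggs', 'groceries'), (3, 'milk', 'groceries');
    INSERT INTO recipe_ingredient VALUES
        (1, 1, 200), (1, 2, 2), (2, 1, 100), (2, 2, 3), (2, 3, 250), (3, 3, 500);
""")


def shopping_list(connection, recipe_ids):
    """Return (category, ingredient, total quantity) rows for the selected recipes."""
    placeholders = ",".join("?" * len(recipe_ids))
    sql = f"""
        SELECT i.category, i.name, SUM(ri.quantity) AS total
        FROM recipe_ingredient AS ri
        JOIN ingredient AS i ON i.id = ri.ingredient_id
        WHERE ri.recipe_id IN ({placeholders})
        GROUP BY i.id, i.category, i.name   -- one row per ingredient
        ORDER BY i.category, i.name
    """
    return connection.execute(sql, list(recipe_ids)).fetchall()


rows = shopping_list(conn, [1, 2])
for category, name, total in rows:
    print(category, name, total)
```

Splitting the result into the "groceries" and "pantry" lists is then a matter of partitioning the rows on the category column, either in SQL with a WHERE clause or in application code.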

Apache SOLR search by category

I am using apache-solr-1.4.1 and jdk1.6.0_14.
I have the following scenario.
I have 3 categories of data indexed in SOLR i.e. CITIES, STATES, COUNTRIES.
When I query SOLR, I need the search results to meet the following criterion:
In a single query, I need the data grouped by category, with a predefined result count for each category.
How can I specify this condition in SOLR?
I have tried to use SOLR Field Collapsing feature, but I am not able to get the desired output from SOLR.
Please suggest.
My solution is not exactly what you have asked for, but it is my take on what SOLR does best, which is full-text search. Instead of grouping the results by "category", I'd suggest you order the results by relevance score but also provide a facet count for the category values. In my experience users expect a "search" to behave like Google, with the best matches at the top. Deviating from this norm confuses the user in most cases.
If you want exactly as you have asked (actual results grouped by category) then you could use a relational database and do a group_by or write a custom function query with SOLR (I cannot advise on this as I've never done it).
More info: index the data with the appropriate fields, e.g. name, population, etc. But also add a field called "category", which would have a value of either CITIES, STATES or COUNTRIES. Then perform a standard SOLR search, which will return results in order of relevance, i.e. best matches at the top. As part of the request, you can specify facet.field=category, which will return counts of the search results for each of the given categories (in the "facet" section of the response). In the UI you can then create a link for each category facet that performs the original search plus &fq=category:CITIES, etc., thus restricting results to just that category. See the faceting overview on the SOLR wiki for more info.
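The two requests described above can be sketched in Python as plain query strings. The function name search_params and the sample query text are made up for illustration; the parameters themselves (q, facet, facet.field, fq) are the standard ones named in the answer, and the field name category is the one it suggests adding.

```python
from urllib.parse import urlencode


def search_params(user_query, category=None):
    """Build request parameters for a relevance-ordered search with category facets."""
    params = [
        ("q", user_query),
        ("facet", "true"),
        ("facet.field", "category"),
    ]
    if category is not None:
        # Added when the user clicks a facet link, e.g. CITIES.
        params.append(("fq", f"category:{category}"))
    return params


# Initial search: results by relevance, plus counts per category.
print(urlencode(search_params("springfield")))
# Follow-up after clicking the CITIES facet link.
print(urlencode(search_params("springfield", "CITIES")))
```

The resulting string would be appended to the select handler URL of whichever SOLR instance is being queried.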