SOLR: populate with data from children - indexing

I have Products in my SOLR index. I need to create calculated fields for each product, based on the product's children.
Is it possible to create such calculated fields?
For example, for the Product with id 1, I need to pull in all the Detail entities whose "parentId" field has the value 1. Here is a brief schema: https://www.screencast.com/t/EkNG8NpFp.
For that product I need to end up with the values "v1" and "v3" from the example above.

I'm not sure exactly what you mean by "create such calculated fields".
If you mean querying for Products and then, for example, getting the average of the field 'value' over their children: yes, you can do that. Look at JSON facets and how they can use child docs.
If you mean adding a new field to your Product doc based on the values of the child docs, you can probably do it with Streaming Expressions. Use the current collection as the source, compute the new fields, and finally add the new docs (including the new field) to a new collection.
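To make the second option concrete, here is a minimal client-side sketch in plain Python of the computation itself. It does not call Solr or use Streaming Expressions; it only shows the grouping step that would happen before the enriched docs are re-indexed. It assumes the children are flat docs with `parentId` and `value` fields; the target field name `detailValues` and the sample docs are made up for illustration.

```python
from collections import defaultdict


def add_child_values(products, details, target_field="detailValues"):
    """Return copies of the product docs with their children's values attached."""
    # Group the children's values by the parent they point to.
    values_by_parent = defaultdict(list)
    for detail in details:
        values_by_parent[detail["parentId"]].append(detail["value"])

    enriched = []
    for product in products:
        doc = dict(product)  # copy, so the input docs stay untouched
        doc[target_field] = values_by_parent.get(product["id"], [])
        enriched.append(doc)
    return enriched


# Hypothetical sample data mirroring the question: product 1 owns "v1" and "v3".
products = [{"id": 1}, {"id": 2}]
details = [
    {"id": 10, "parentId": 1, "value": "v1"},
    {"id": 11, "parentId": 2, "value": "v2"},
    {"id": 12, "parentId": 1, "value": "v3"},
]
enriched = add_child_values(products, details)
```

The enriched docs would then be sent back to Solr as ordinary documents, with the new field declared as multi-valued in the schema.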

Few questions about Grails' createCriteria

I have read about createCriteria and am interested in how it works and how it can be used to provide values for a dropdown box.
Say I have a Resource table in the database, mapped by the domain class Resource.groovy. The table has 10 columns in total, 5 of which are:
Material Id
Material description
Resource
Resource Id
Product Code
Using createCriteria, I can write something like a query to return the items I want:
def resList = Resource.createCriteria().list {
    and {
        eq('resource', resourceInstance)
        ne('materialId', '-')
    }
}
Here I want the rows where resource equals resourceInstance and materialId is not equal to '-'.
I want to use the data returned by createCriteria in my form, with some of the columns feeding a select dropdown. Below is the code I use for the dropdown:
<g:select id="resourceId" name="resourceId"
from="${resList}"
disabled="${actionName != 'show' ? false : true}" />
How do I make the dropdown show only the values from the Product Code column? I believe the list returned by createCriteria contains all 10 columns for each matching row, but I only want the Product Code values in my dropdown.
How do I customize the data if, in one of the select dropdowns in my form, I want to show the values as "Resource Id - Resource Description"? Each entry combines more than one column, and I don't know how to combine them in a single select dropdown.
I read that HQL and GORM queries are better ways of fetching data from a table than createCriteria. Is this true?
Thanks
First of all, refer to the documentation for the select tag in Grails. To answer the questions in order:
Yes, the list to select from in the dropdown can be customized. In this case it should be something like from="${resList*.productCode}"
Yes, this can be customized as well, with something like
from="${resList.collect { \"${it.resourceId} - ${it.resourceDesc}\" } }"
It depends. If the domain has associations, using Criteria will lead to eager fetches, which might not be required, whereas HQL gives you the flexibility to tailor the query as needed. With the latest versions of Grails those boundaries have narrowed a lot. Using DetachedCriteria, where queries, etc. is recommended wherever possible. So it comes down to mixing and matching for the scenario under consideration.

How to work with map property in RQL (Oracle's ATG Web Commerce)

We use Oracle's ATG Web Commerce for our project. We currently need to construct an RQL query that returns the products whose SKUs' tacticalTradeStatuses contain a certain status, ordered by the value for that status.
To briefly describe the relationship between the entities: the product item descriptor contains a list of SKUs. Each SKU contains a map, tacticalTradeStatuses (key: tactical trade status, value: sequence).
For example, how can we obtain all products whose SKUs' tacticalTradeStatuses property contains the key 'BEST_SELLER', ordered by the value associated with the key 'BEST_SELLER'?
We want to pass the key by which products are selected as an RQL parameter.
I can think of two ways of doing that.
The first way:
1) Create a query that fetches all the products based on the map key BEST_SELLER.
2) Pass the result to the ForEach droplet and add sort properties, which sort the result according to your requirements.
For sorting, please refer to the link below:
http://docs.oracle.com/cd/E23095_01/Platform.93/PageDevGuide/html/s1316foreach01.html
The second way, I think, is to use query options in RQLStatement, which work the same way as the sort properties in ForEach.
It would help if you could provide some of the XML repository structure. Hope this helps.

Duplicate a record and its references in web2py

In my web2py application I have a requirement to duplicate a record and all its references.
For example, one user has a product (sponserid is the user), and this product has many features stored in other tables (which reference the product id).
My requirement is that if another user copies this product, a new record is created in the product table with a new productid and the new sponserid, and all the records in the referencing tables are duplicated with the new product id. Effectively, a duplicate entry is created in all the tables; the only changes are the product id and sponserid.
The product table's fields will change, so I have to write a dynamic query.
I would like to write code like the following:
product = db(db.tbl_product.id==productid).select(db.tbl_product.ALL).first()
newproduct = db.tbl_product.insert(sponserid=newsponserid)
for field,value in product.iteritems():
    if field!='sponserid':
        db(db.tbl_product.id==newproduct).update(field=value)
But I cannot refer to a field name like this in the update function.
I would also like to know whether there is a better approach to this requirement.
I would greatly appreciate any suggestions.
For the specific problem of using the .update() method when the field name is stored in a variable, you can do:
db(db.tbl_product.id==newproduct).update(**{field: value})
But an easier approach altogether would be something like this:
product = db(db.tbl_product.id==productid).select(db.tbl_product.ALL).first()
product.update(sponserid=newsponserid)
db.tbl_product.insert(**db.tbl_product._filter_fields(product))
The .update() method applied to the Row object updates only the Row object, not the original record in the db. The ._filter_fields() method of the table takes a record (Row, Storage, or plain dict) and returns a dict including only the fields that belong to the table (it also filters out the id field, which the db will auto-generate).
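The `**{field: value}` form is ordinary Python keyword unpacking, so it can be tried without a database. Below is a small sketch with stand-ins: `update` and `filter_fields` are hypothetical helpers that only imitate the behaviour described above, not web2py's own implementations, and the column name `productname` and the sample record are made up.

```python
def update(**fields):
    """Stand-in for a DAL-style update(); it only returns the keyword arguments it received."""
    return fields


def filter_fields(record, table_fields):
    """Stand-in for the filtering described above: keep the table's own columns, minus id."""
    return {k: v for k, v in record.items() if k in table_fields and k != "id"}


field, value = "productname", "Widget"  # hypothetical column name and value

literal = update(field=value)        # the keyword is literally named "field"
dynamic = update(**{field: value})   # the keyword is named by the variable's content

# Copying a record: override the sponsor, then drop id and any non-column keys.
table_fields = {"id", "sponserid", "productname"}
product = {"id": 7, "sponserid": 3, "productname": "Widget", "extra": "not a column"}
product.update(sponserid=42)  # changes only the in-memory dict
new_row = filter_fields(product, table_fields)
```

Here `literal` ends up keyed by the word "field", which is why the loop in the question does not work, while `dynamic` is keyed by the real column name. The same copy-then-insert pattern would presumably be repeated for each referencing table, replacing the product reference with the new id.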

PrefixQuery on multiple fields and based on another field's value?

I am working on an autocomplete solution with Lucene. Do I need to create a PrefixQuery for each field I want to search on? Also, what if I only want to search a small set of items based on another field's ID?
For example: Let's say I have a list of users that I have indexed. Those users belong to a specific project. I only want to PrefixQuery search users that are on, say, projectId 1.
Assuming your schema has fields "projectid" and "name", you would query for documents (users) matching the query:
+projectid:1 +name:prefix*
where 1 is the projectid and "prefix" is the name prefix you want to search for.
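For the multi-field part of the question, the same query string can be extended with one prefix clause per field, grouped so that at least one of them must match. Here is a rough sketch of building that string in Python. The helper name `build_prefix_query` and the second field `email` are made up for illustration, and the sketch assumes the prefix contains no characters that need escaping in the query syntax.

```python
def build_prefix_query(prefix, fields, filter_field, filter_value):
    """Build a query string: a required filter plus a required group of prefix clauses."""
    # One prefix clause per field; inside the group they are optional alternatives.
    prefix_clauses = " ".join(f"{field}:{prefix}*" for field in fields)
    return f"+{filter_field}:{filter_value} +({prefix_clauses})"


# Hypothetical usage: users on project 1 whose name or email starts with "jo".
query = build_prefix_query("jo", ["name", "email"], "projectid", 1)
# query == "+projectid:1 +(name:jo* email:jo*)"
```

When building the query programmatically rather than as a string, the equivalent is one PrefixQuery per field, since each one targets a single field, combined in a BooleanQuery.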

Boosting Multi-Value Fields

I have a set of documents containing scored items that I'd like to index. Our data structure looks like:
Document
    ID
    Text
    List<RelatedScore>
RelatedScore
    ID
    Score
My first thought was to add each RelatedScore as a multi-value field using the Boost property of the Field to modify the value of the particular score when searching.
foreach (var relatedScore in document.RelatedScores) {
    var field = new Field("RelatedScore", relatedScore.ID,
        Field.Store.YES, Field.Index.UN_TOKENIZED);
    field.SetBoost(relatedScore.Score);
    luceneDoc.Add(field);
}
However, it appears that the "Norm" that is calculated applies to the entire multi-value field: all the "RelatedScore" values for a document will end up having the same score.
Is there a mechanism in Lucene to allow for this functionality? I would rather not create another index just to account for this; it feels like there should be a way using a single index. If there isn't a means to accomplish this, a few ideas that we have to compensate are:
Insert the multi-value field items in order of descending value. Then somehow add a positional-aware analysis to assign higher boost/score to the first items in the field.
Add a high value score multiple times to the field. So, a RelatedScore with Score==1 might be added three times, while a RelatedScore with Score==.3 would only be added once.
Both of these will result in a loss of search fidelity on these fields, yes, but they may be good enough. Any thoughts on this?
This appears to be a use case for Payloads. I'm not sure if this is available in Lucene.NET, as I've only used the Java version.
Another hacky way to do this, if the absolute values of the scores aren't that important, is to discretize them (place them in buckets based on value) and create a field for each bucket. So if you have scores that range from 1 to 100, create say, 10 buckets called RelatedScore0_10, RelatedScore10_20, etc, and for any document that has a RelatedScore in that bucket, add a "true" value in that field. Then for every search that gets executed tack on an OR query like:
(RelatedScore0_10:true^1 RelatedScore10_20:true^2 ...)
The nice thing about this is that you can tweak the boost values for each one of your buckets on the fly. Otherwise you'd need to reindex to change the field norm (boost) values for each field.
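To illustrate the bucketing, here is a short Python sketch that maps a score to its bucket field name and assembles the boost clause. The helper names, the bucket width of 10, and the choice to put a score of exactly 100 into the last bucket are my assumptions; the field naming follows the RelatedScore0_10 pattern above.

```python
def bucket_field(score, width=10, max_score=100):
    """Name of the bucket field a score falls into, e.g. 15 -> RelatedScore10_20."""
    # Clamp so that a score equal to max_score lands in the last bucket.
    low = min(int(score // width) * width, max_score - width)
    return f"RelatedScore{low}_{low + width}"


def boost_query(boosts):
    """Build the OR clause to tack onto a search from a {field name: boost} mapping."""
    clauses = " ".join(f"{field}:true^{boost}" for field, boost in boosts.items())
    return f"({clauses})"


# Index time: decide which bucket field gets the "true" value for a given score.
field_for_15 = bucket_field(15)

# Query time: the boosts live in the query, so they can change without reindexing.
clause = boost_query({"RelatedScore0_10": 1, "RelatedScore10_20": 2})
```

With these inputs `field_for_15` is "RelatedScore10_20" and `clause` is "(RelatedScore0_10:true^1 RelatedScore10_20:true^2)".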
If you use Lucene.Net you might not have payloads functionality yet. What you can do is convert the 0-100 relevancy score to a bucket (integer division by 10), then add each indexed value that many extra times, but only store the value once. Then, if you search on that field, Lucene's built-in scoring will take into account the frequency of the indexed value (one stored copy plus one extra copy per bucket step). The results can therefore be sorted by variable relevance.
foreach (var relatedScore in document.RelatedScores) {
    // Get the bucket for this relevance score (assumes Score is in the 0-100 range).
    // The explicit cast is needed if Score is a floating-point value.
    int bucket = (int)(relatedScore.Score / 10);
    // Stored copy: added exactly once.
    var field = new Field("RelatedScore", relatedScore.ID,
        Field.Store.YES, Field.Index.UN_TOKENIZED);
    luceneDoc.Add(field);
    // Add `bucket` more indexed-only copies to raise the term frequency.
    for (int i = 0; i < bucket; i++)
    {
        luceneDoc.Add(new Field("RelatedScore", relatedScore.ID,
            Field.Store.NO, Field.Index.UN_TOKENIZED));
    }
}