Orchard Search multiple fields with same term - lucene

I am trying to create a custom search module based on Orchard.Search. I have created a custom field called keywords, which I have successfully added to the index. I want to match content where the title, body or keywords match. Adding the fields with .WithField, or passing a string array of field names, requires every field to match the term; I need content to be returned if there is a match in any of the fields. I have included examples of how I am using both methods below.
Using the search builder with .WithField:
var searchBuilder = Search()
.WithField("type", "Cell").Mandatory().ExactMatch()
.WithField("body", query)
.WithField("title", query)
.WithField("cell-keywords", query);
Using a string array of field names:
string[] searchFields = new[] { "body", "title", "cell-keywords" };
var searchBuilder = Search().WithField("type", "Cell").Mandatory().ExactMatch().Parse(searchFields, query, false);
If anyone could point me in the right direction, that would be fantastic :)

A colleague wrote an article about this on his blog that should prove helpful: http://breakoutdeveloper.com/orchard-cms/creating-an-advanced-search

I have resolved my issue!
The problem was in how I was adding my keywords field to the index in the part handler. Some content items had a NULL value for the field, which caused an error that I missed!

Microsoft Graph API not returning custom column from list

I am working in VB.Net, using the Microsoft.Graph API to communicate with SharePoint.
I have a list on a SharePoint site.
Let's say:
List name : ListTestName
Columns: ListColumnTest1, ListColumnTest2, ListColumnTest3
Dim queryFields As List(Of QueryOption) = New List(Of QueryOption) From {New QueryOption("$expand", "fields")}
Dim items As IListItemsCollectionPage = Await GraphClient.Sites(sharepointSessionId).Lists("ListTestName").Items.Request(queryFields).GetAsync()
This is the code I use to grab the list and try to get all of its fields (columns), but when I look into the Fields of the "items" variable I do not see any of the fields that I have added to the list. I only see the built-in SharePoint fields such as "title" or "Id".
I really don't understand why this is not working.
Even when I look via the Graph Explorer site (https://developer.microsoft.com/en-us/graph/graph-explorer) using:
GET https://graph.microsoft.com/v1.0/sites/<SiteId's>/lists/ListTestName/items?expand=fields
I do not see my custom columns. However, if I select one of the columns directly, like this:
GET https://graph.microsoft.com/v1.0/sites/<SiteId's>/lists/ListTestName/items?expand=fields(select=ListColumnTest1)
This does return my custom field.
So I tried changing the query option to {New QueryOption("$expand", "fields(select=ListColumnTest1")}, but this just crashed when I called the request.
Edit: I asked this question slightly wrong and will be posting a second question that is closer to what I need. However, the answer below is marked correct because it is the correct solution for what I asked. :)
Have you tried this endpoint?
GET https://graph.microsoft.com/v1.0/sites/{site-id}/lists/{list-id}?expand=columns,items(expand=fields)
I could get the custom columns with this endpoint.
Updated:
IListColumnsCollectionPage columns = graphClient.Sites["b57886ef-vvvv-4d56-ad29-27266638ac3b,b62d1450-vvvv-vvvv-84a3-f6600fd6cc14"].Lists["538191ae-7802-43b5-90ec-c566b4c954b3"].Columns.Request().GetAsync().Result;
I would avoid creating a QueryOption. Try using the Expand and Select methods instead.
Example (in C#; apologies, I'm not familiar with VB, but I hope it will be easy for you to rewrite):
await GraphClient.client.Sites[sharepointSessionId].Lists["ListTestName"].Items.Request()
.Expand(x => new
{
ListColumnTest1 = x.Fields.AdditionalData["ListColumnTest1"],
ListColumnTest2 = x.Fields.AdditionalData["ListColumnTest2"]
})
.Select(x => new
{
ListColumnTest1 = x.Fields.AdditionalData["ListColumnTest1"],
ListColumnTest2 = x.Fields.AdditionalData["ListColumnTest2"]
})
.GetAsync();

Using lucene to search data differently for different users conditionally

Consider that the entities on which I need to perform text search are as follows:
Sample{
int ID, //Unique ID
string Name,//Searchable field
string Description //Searchable field
}
Now, I have several such entities which are shared by all users, but each user can associate different tags, notes, etc. with any of these entities. For simplicity, let's say a user can add tags to a Sample entity.
UserSampleData{
int ID, //Sample ID
int UserID, //For condition
string tags //Searchable field
}
When a user performs a search, I want to search for the given string in the fields Name and Description, and in the tags associated with that Sample by the current user. I am pretty new to Lucene indexing and I am not able to figure out how to design an index, or the queries, for such a situation. I need the results sorted by relevance to the search query. The following approaches crossed my mind, but I have a feeling there could be better solutions:
Separately query the two entities, Samples and UserSampleData, and somehow mix the two results. For results that intersect, we would need to combine the match scores, maybe by averaging.
Flatten the data by combining both entities, which gives multiple entries for the same ID.
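To make the first approach more concrete, here is a rough sketch of the merge step only, in Java. It assumes each of the two searches has already produced a map from Sample ID to relevance score. ScoreMerger, merge and the averaging rule for IDs found by both searches are illustrative choices, not part of the Lucene API. Scores coming from two different queries are generally not on a common scale, so plain averaging should be treated as a starting point only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: combines the scores of two independent searches.
class ScoreMerger {
    // Returns Sample IDs ordered by combined score, best first.
    static List<Integer> merge(Map<Integer, Float> sampleScores,
                               Map<Integer, Float> userScores) {
        Map<Integer, Float> combined = new HashMap<>(sampleScores);
        for (Map.Entry<Integer, Float> e : userScores.entrySet()) {
            // Average when both searches matched the ID,
            // otherwise keep the single score as it is.
            combined.merge(e.getKey(), e.getValue(), (a, b) -> (a + b) / 2f);
        }
        List<Integer> ids = new ArrayList<>(combined.keySet());
        // Sort descending by combined score.
        ids.sort((x, y) -> Float.compare(combined.get(y), combined.get(x)));
        return ids;
    }
}
```

The input maps would be filled from the hits of the Samples search and the UserSampleData search (restricted to the current UserID) respectively.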
You could use Lucene's JoinUtil class, but you must rename the "ID" field of the UserSampleData document to SAMPLE_ID (or another name different from "ID").
Below is an example:
final IndexReader r = DirectoryReader.open(dir); // dir is your index Directory
final Version version = Version.LUCENE_47; // Your lucene version
final IndexSearcher searcher = new IndexSearcher(r);
final String fromField = "ID";
final boolean multipleValuesPerDocument = false;
final String toField = "SAMPLE_ID";
String querystr = "UserID:xxxx AND yourQueryString"; //the userID condition and your query String
Query fromQuery = new QueryParser(version, "NAME", new WhitespaceAnalyzer(version)).parse(querystr);
final Query joinQuery = JoinUtil.createJoinQuery(fromField, multipleValuesPerDocument, toField, fromQuery, searcher, ScoreMode.None);
final TopDocs topDocs = searcher.search(joinQuery, 10);
See also the bug https://issues.apache.org/jira/browse/LUCENE-4824. I don't know whether it has been fixed in the current version of Lucene; if not, I think you must convert the type of your ID fields to String.
I think that you need Relational Data. Handling relational data is not simple with Lucene. This is a useful blog post for.

Can you avoid for loops with solrj?

I was curious if there was a way to avoid using loops when using SolrJ.
For example, if I were to use straight SQL with an appropriate Java library, I could return a query result, cast it to a List and pass it on up to my view (in a webapp).
SolrJ (SolrQuery and QueryResponse) seems to have no way of returning succinct lists. This would imply I have to create an iterator to go through each returned doc and get the value I want, which isn't ideal.
Is there something I am missing here? Is there a way to avoid these seemingly useless loops?
The SolrJ wiki gives an example that does what you want:
https://wiki.apache.org/solr/Solrj#Reading_Data_from_Solr
Basically:
QueryResponse rsp = server.query( query );
SolrDocumentList docs = rsp.getResults();
List<Item> beans = rsp.getBeans(Item.class);
EDIT:
Based on your comments below, it appears what you want is a non-looping transform of the SOLR response (e.g. a map function in a functional language). Google's guava library provides something like this. My example below assumes that your SOLR response has a "name" field that you want to return a list of:
QueryResponse rsp = server.query(query);
SolrDocumentList docs = rsp.getResults();
List<String> names = Lists.transform(docs, new Function<SolrDocument, String>() {
@Override
public String apply(SolrDocument d) {
return (String)d.get("name");
}
});
Unfortunately, Java does not support this style of programming very well, so the functional approach ends up being more verbose (and probably less clear) than a simple loop:
QueryResponse rsp = server.query(query);
SolrDocumentList docs = rsp.getResults();
List<String> names = new ArrayList<String>();
for (SolrDocument d : docs) names.add((String) d.get("name"));
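On Java 8 or later, the same transform can also be written with the standard Stream API, without Guava. The sketch below is not taken from the SolrJ documentation. It uses plain Map<String, Object> values as stand-ins for SolrDocument (which, as far as I know, implements that interface) so that it compiles without Solr on the classpath. FieldExtractor and fieldValues are made-up names.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: pulls one field out of every document in a result list.
class FieldExtractor {
    // Documents that do not contain the field are skipped.
    static List<String> fieldValues(List<? extends Map<String, Object>> docs,
                                    String field) {
        return docs.stream()
                .map(d -> d.get(field))      // look the field up in each document
                .filter(v -> v != null)      // ignore documents without it
                .map(Object::toString)       // normalise the value to a String
                .collect(Collectors.toList());
    }
}
```

With a real response this would be called as FieldExtractor.fieldValues(rsp.getResults(), "name"), assuming SolrDocumentList is a List of SolrDocument. There is still a loop inside the stream, it is just no longer written by hand.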

RavenDB Index created incorrectly

I have a document in RavenDB that looks like:
{
"ItemId": 1,
"Title": "Villa"
}
With the following metadata:
Raven-Clr-Type: MyNamespace.Item, MyNamespace
Raven-Entity-Name: Doelkaarten
So I serialized it with the type MyNamespace.Item, but gave it my own Raven-Entity-Name so that it gets its own collection.
In my code I define an index:
public class DoelkaartenIndex : AbstractIndexCreationTask<Item>
{
public DoelkaartenIndex()
{
// MetadataFor(doc)["Raven-Entity-Name"].ToString() == "Doelkaarten"
Map = items => from item in items
where MetadataFor(item)["Raven-Entity-Name"].ToString() == "Doelkaarten"
select new {Id = item.ItemId, Name = item.Title};
}
}
In the Index it is translated in the "Maps" field to:
docs.Items
.Where(item => item["@metadata"]["Raven-Entity-Name"].ToString() == "Doelkaarten")
.Select(item => new {Id = item.ItemId, Name = item.Title})
A query on the index never gives results.
If the Maps field is manually changed to the code below it works...
from doc in docs
where doc["@metadata"]["Raven-Entity-Name"] == "Doelkaarten"
select new { Id = doc.ItemId, Name=doc.Title };
How is it possible to define in code the index that gives the required result?
RavenDB used: RavenHQ, Build #961
UPDATE:
What I'm doing is the following: I want to use SharePoint as a CMS, and use RavenDB as a read-only replica of the SharePoint list data. I created a tool to sync from SharePoint lists to RavenDB. I have a generic type Item that I create from a SharePoint list item and that I serialize into RavenDB. So all my docs are of type Item. But they come from different lists with different properties, so I want to be able to differentiate. You propose to differentiate on an additional property, and this would work perfectly. But then I would see all list items from all lists in one big Items collection... What do you think is the best approach to this problem? Or should I just live with it? I want to use the indexes to create projections from all data in an Item to the actual data that I need.
You can't easily change the name of a collection this way. The server side will use the Raven-Entity-Name metadata, but the client side will determine the collection name via the conventions registered with the document store, and the default convention is to use the type name of the entity.
You can provide your own custom convention by assigning a new function to DocumentStore.Conventions.FindTypeTagName - but it would probably be cumbersome to do that for every entity. You could create a custom attribute to apply to your entities and then write the function to look for and understand that attribute.
Really the simplest way is just to call your entity Doelkaarten instead of Item.
Regarding why the change in indexing works - it's not because of the switch in linq syntax. It's because you said from doc in docs instead of from doc in docs.Items. You probably could have done from doc in docs.Doelkaartens instead of using the where clause. They are equivalent. See this page in the docs for further examples.

Lucene query - "Match exactly one of x, y, z"

I have a Lucene index containing documents that have a "type" field. This field can have one of three values: "article", "forum" or "blog". I want the user to be able to search within these types (there is a checkbox for each document type).
How do I create a Lucene query dependent on which types the user has selected?
A couple of prerequisites are:
If the user doesn't select one of the types, I want no results from that type.
The ordering of the results should not be affected by restricting the type field.
For reference if I were to write this in SQL (for a "blog or forum search") I'd write:
SELECT * FROM Docs
WHERE [type] in ('blog', 'forum')
For reference, should anyone else come across this problem, here is my solution:
IList<string> ALL_TYPES = new[] { "article", "blog", "forum" };
string q = ...; // The user's search string
IList<string> includeTypes = ...; // List of types to include
Query searchQuery = parser.Parse(q);
BooleanQuery parentQuery = new BooleanQuery();
parentQuery.Add(searchQuery, BooleanClause.Occur.SHOULD);
// Invert the logic, exclude the other types
foreach (var type in ALL_TYPES.Except(includeTypes))
{
parentQuery.Add(
new TermQuery(new Term("type", type)),
BooleanClause.Occur.MUST_NOT
);
}
searchQuery = parentQuery;
I inverted the logic (i.e. excluded the types the user had not selected), because if you don't the ordering of the results is lost. I'm not sure why though...! It is a shame as it makes the code less clear / maintainable, but at least it works!
Add a constraint to reject documents whose types weren't selected. For example, if only "article" was checked, the constraint would be
-(type:forum type:blog)
While erickson's suggestion seems fine, you could instead use a positive constraint ANDed with your search term, such as text:foo AND type:article when only "article" is checked,
or text:foo AND (type:article OR type:forum) when both "article" and "forum" are checked.
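The positive constraint above can be assembled from the checked types with plain string building. This is only a sketch, written in Java rather than the C# of the question. TypeFilter and withTypes are hypothetical names, text is the field name used in the example above, and the user's term is assumed to be already escaped for the query parser.

```java
import java.util.List;

// Hypothetical helper: builds a query string restricted to the selected types.
class TypeFilter {
    // Builds e.g. "text:foo AND (type:article OR type:forum)".
    // Returns null when no type is selected, meaning no search should be run.
    static String withTypes(String term, List<String> selectedTypes) {
        if (selectedTypes.isEmpty()) {
            return null;
        }
        StringBuilder types = new StringBuilder();
        for (String type : selectedTypes) {
            if (types.length() > 0) {
                types.append(" OR ");
            }
            types.append("type:").append(type);
        }
        // Parentheses are only needed when more than one type is ORed together.
        String clause = selectedTypes.size() == 1
                ? types.toString()
                : "(" + types + ")";
        return "text:" + term + " AND " + clause;
    }
}
```

The resulting string would then be handed to the query parser in place of the raw user input. Returning null for an empty selection reflects the first prerequisite in the question: with no type checked there is nothing to search.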