Elastica returns an empty result set when it should return some results

Hello, I have a problem with Elastica, the Elasticsearch PHP API.
If I run this:
$elasticaQueryMatch= new Elastica\Query\Match();
$elasticaQueryMatch->setField('fax', "16147591649");
$elasticaResultSet = $elasticaIndex->search($elasticaQueryMatch);
var_dump($elasticaResultSet);
I get 7 results, and the telephone number for all of the results is "16147591649".
Then, if I run this:
$elasticaQueryMatch= new Elastica\Query\Match();
$elasticaQueryMatch->setField('telephone', "16147591649");
$elasticaResultSet = $elasticaIndex->search($elasticaQueryMatch);
var_dump($elasticaResultSet);
I get 0 results.

I fixed it by creating a new index, changing my mapping, and then rebuilding the index. The mapping and the analyzers for certain fields were causing the issue.

Related

How to code a simple algorithm to fetch a list of data through pagination in a brand-new application?

I'm making a clone of a social app, with GraphQL as my backend. My problem is that every time I query a list of data, it returns the same result. When I release the app, the user base will be very small, so there will be little data. The issue is described below:
1. My data in the database looks like this:
id=1 title=hello1
id=2 title=hello2
id=3 title=hello3
2. When I query the data through pagination with limit=3, I get this list of items:
Query 1
id=1 title=hello1
id=2 title=hello2
id=3 title=hello3
3. When I add new items to the database, they are inserted in between the existing items, like this:
id=1 title=hello1
id=4 title=hello4
id=2 title=hello2
id=3 title=hello3
id=5 title=hello5
4. So the next fresh query (limit=3) returns:
Query 2
id=1 title=hello1
id=4 title=hello4
id=2 title=hello2
Compare the two result sets: the first query returned id=1, 2 and 3, and the new one returns id=1, 4 and 2, so the user sees repeated items because id=1 and id=2 are in the new list as well.
If I save the pagination nextToken/cursor (id=3) from the first query (Query 1), then after the new data is added the next query starts from id=5, because that is the item after id=3. In the new data set this misses id=4: the nextToken was saved for id=3, so the query resumes from id=5. I hope that makes sense.
If your suggestion is to add a "created at" sort key: once I add some filters, the data set becomes so selective that the feed may contain only a limited number of items, and a feed should be able to query unlimited data.
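The scenario described above can be reproduced with a small sketch. This is plain Python over an in-memory list, not GraphQL; the helper name page_by_cursor and the list-based "database" are hypothetical and only mirror the four steps of the question.

```python
def page_by_cursor(items, after_id, limit):
    """Return up to `limit` items that come after the item with id `after_id`.

    Passing after_id=None starts from the beginning (a "fresh" query).
    """
    if after_id is None:
        start = 0
    else:
        ids = [item["id"] for item in items]
        start = ids.index(after_id) + 1
    return items[start:start + limit]


def ids_of(page):
    return [item["id"] for item in page]


# Step 1 and 2: three items, first query with limit=3.
feed = [{"id": i, "title": f"hello{i}"} for i in (1, 2, 3)]
query1 = page_by_cursor(feed, None, 3)
cursor = query1[-1]["id"]  # saved nextToken, here id=3

# Step 3: new items land in between the existing ones.
order_after_insert = (1, 4, 2, 3, 5)
feed = [{"id": i, "title": f"hello{i}"} for i in order_after_insert]

# Step 4: a fresh query repeats id=1 and id=2.
fresh = page_by_cursor(feed, None, 3)

# Resuming from the saved cursor skips id=4.
resumed = page_by_cursor(feed, cursor, 3)

print(ids_of(query1), ids_of(fresh), ids_of(resumed))
```

The fresh query overlaps with what the user has already seen, while the resumed query never returns id=4, which are the two symptoms the question describes.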

How to use Bioproject ID, for example, PRJNA12997, in biopython?

I have an Excel file listing more than 2,000 organisms, each with an associated BioProject ID (like PRJNA12997). The idea is to use these IDs to get the sequences for a later multiple alignment with five other sequences that I have in a text file.
Can anyone help me understand how I can do this using Biopython? At least the part with the BioProject ID.
You can first get the info using Bio.Entrez:
from Bio import Entrez
Entrez.email = "Your.Name.Here@example.org"
# This call to efetch fails sometimes with a 400 error.
handle = Entrez.efetch(db="bioproject", id="PRJNA12997")
I've been trying, and Entrez.read(handle) doesn't seem to work. But if you do record_xml = handle.read(), you'll get the XML entry for this record. From this XML you can get the ID for the organism, in this case 12997.
handle = Entrez.esearch(db="nuccore", term="12997[BioProject]")
search_results = Entrez.read(handle)
Now you can efetch from your search results. At this point you should use Biopython to parse whatever the efetch step returns, experimenting with the rettype values listed at http://www.ncbi.nlm.nih.gov/books/NBK25499/table/chapter4.T._valid_values_of__retmode_and/
for result in search_results["IdList"]:
    entry = Entrez.efetch(db="nuccore", id=result, rettype="fasta")
    this_seq_in_fasta = entry.read()
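The step of pulling the numeric ID out of record_xml could look roughly like the sketch below. The XML fragment is hand-written for illustration, not a real response, and the element and attribute names (ArchiveID, accession, id) are an assumption about the BioProject record layout; check them against what handle.read() actually returns. The helper name numeric_project_id is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for record_xml = handle.read(); the structure is assumed.
record_xml = """
<RecordSet>
  <DocumentSummary>
    <Project>
      <ProjectID>
        <ArchiveID accession="PRJNA12997" archive="NCBI" id="12997"/>
      </ProjectID>
    </Project>
  </DocumentSummary>
</RecordSet>
"""


def numeric_project_id(xml_text, accession):
    """Return the id attribute of the ArchiveID element for `accession`, or None."""
    root = ET.fromstring(xml_text)
    for element in root.iter("ArchiveID"):
        if element.get("accession") == accession:
            return element.get("id")
    return None


project_id = numeric_project_id(record_xml, "PRJNA12997")
# Search term in the same form as the esearch call above.
term = f"{project_id}[BioProject]"
print(term)
```

With 2,000 accessions from the Excel file, the same helper would be called once per record before running the esearch and efetch steps.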

I am executing a query using Lucene's WildcardQuery, but it doesn't work

I am executing a query using Lucene's WildcardQuery, but I don't know why no results are found.
Below are the details.
Here is the code that creates the WildcardQuery. A record with field name 'Full Name' and value 'ABC123DD456CC' exists in the indexed document.
BooleanQuery booleanQuery = new BooleanQuery();
for (IndexQueryField field : quickSearchFields)
{
    // One wildcard clause per quick-search field; any of them may match.
    Query query = new WildcardQuery(new Term(field.getFieldName(), "ABC*DD*CC"));
    booleanQuery.add(query, BooleanClause.Occur.SHOULD);
}
The part of the code that executes the query:
Session hibernateSession = (Session) em.getDelegate();
FullTextSession session = SwitchSession.getFullTextSession(hibernateSession, specifyIndexName);
// Set Hibernate flushMode
session.setFlushMode(FlushMode.MANUAL);
// Ignore Hibernate Cache
session.setCacheMode(CacheMode.IGNORE);
FullTextQuery query = session.createFullTextQuery(booleanQuery,XXX.class);
List list = query.setFirstResult(1).setMaxResults(100).list();
The list is empty, but I am sure that 'ABC123DD456CC' exists in the Lucene document.
I just want to do this with WildcardQuery. Any help would be appreciated!
I believe that last line should be:
List list = query.setFirstResult(0).setMaxResults(100).list();
Since results are numbered from 0. If there is only 1 document matching that search, which seems likely enough, that probably explains why you're getting nothing (having skipped the first and only result, at index 0).
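The effect of the zero-based offset can be shown with a plain Python analogy. This does not use Lucene or Hibernate Search: fnmatchcase stands in for the wildcard match (assuming the value is indexed as a single term), and the hypothetical page helper imitates setFirstResult and setMaxResults with list slicing.

```python
from fnmatch import fnmatchcase

documents = ["ABC123DD456CC"]
pattern = "ABC*DD*CC"

# The wildcard pattern itself does match the stored value.
hits = [doc for doc in documents if fnmatchcase(doc, pattern)]


def page(results, first_result, max_results):
    """Imitate setFirstResult/setMaxResults with a zero-based offset."""
    return results[first_result:first_result + max_results]


skipped = page(hits, 1, 100)  # offset 1 skips the only hit
found = page(hits, 0, 100)    # offset 0 returns it

print(skipped, found)
```

With a single matching document, an offset of 1 leaves nothing to return, even though the query itself matched.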

Get ID from Ravendb query

I am using the client API to query an index (Cards) in RavenDB like so:
Dim cards = Raven.CurrentSession.Query(Of Cards)("Cards").ToArray()
This works well and returns all the documents, but how can I get the ID of the documents it returns?
Eystein,
for each of the returned cards, you do
Raven.CurrentSession.Advanced.GetDocumentId(card)

Nutch field problem

I was using something like:
Field notdirectory = new Field("notdirectory","1", Field.Store.NO, Field.Index.UN_TOKENIZED);
and queries like "notdirectory:1" were processed fine every time.
But recently I used the same "Field.Store.NO, Field.Index.UN_TOKENIZED" settings to index a non-numeric string:
Field stateField = new Field("state","irn_" + state, Field.Store.NO, Field.Index.UN_TOKENIZED);
and queries like "state:irn_CA" never fetch any results, even though I can see in the Hadoop logs that "irn_CA" is in fact added to the "state" field.
So I suspect that, for fields using "Field.Store.NO, Field.Index.UN_TOKENIZED", only numeric fields are searchable, but I haven't seen any documentation about that.
So what's the real reason for this?
I think you are using StandardAnalyzer to parse the input query, which tokenizes "irn_CA" into two tokens, "irn" and "CA". Since the index holds "irn_CA" as a single token, it won't match.
Try using KeywordAnalyzer while searching. It generates a single token for the query string, which matches the indexed token correctly.
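The mismatch described in this answer can be imitated with a toy model. The sketch below is plain Python, not Lucene: standard_like_tokens is a hypothetical stand-in that assumes the analyzer splits on the underscore and lowercases, as the answer suggests, and keyword_tokens assumes the whole string is kept as one token.

```python
import re


def standard_like_tokens(text):
    """Toy analyzer: split on non-alphanumeric characters and lowercase (assumed behaviour)."""
    return [token.lower() for token in re.split(r"[^A-Za-z0-9]+", text) if token]


def keyword_tokens(text):
    """Toy analyzer: keep the whole input as a single token."""
    return [text]


# An untokenized field stores the value as one exact term.
indexed_terms = {"irn_CA"}


def matches(query_tokens):
    return any(token in indexed_terms for token in query_tokens)


print(standard_like_tokens("irn_CA"), matches(standard_like_tokens("irn_CA")))
print(keyword_tokens("irn_CA"), matches(keyword_tokens("irn_CA")))
```

In this model neither "irn" nor "ca" equals the stored term "irn_CA", while the single keyword token does.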
I think the searcher bean forces everything to lowercase, so make sure the state is in lower case when adding it to the index:
Field stateField = new Field("state","irn_" + state.toLowerCase(), Field.Store.NO, Field.Index.UN_TOKENIZED);
and when you query, use 'state:irn_ca' instead of 'state:irn_CA'.
I also note you prefixed the value with 'irn_'. Good call; otherwise the highlighter flags up the query.