How does FAST ESP's xrank operator work? - fast-esp

I have a bunch of documents in my index.
They all have "text" in field1. One has "boosttext" in field2.
I want FAST to put the document with "boosttext" to the front of the result set.
I tried this FQL query:
and(field1:string("text"), xrank(field2:string("boosttext", mode="AND"))
However, this will filter out all documents that do not have "boosttext" in field2 !!!
Has anyone successfully used xrank and can give me a hint? Thanks in advance.
-- Bob

... it seems that the following FQL expression works:
rank(field1:string("text"), xrank(field2:string("boosttext"))
-- Bob

xrank(field1:string("text"), field2:string("boosttext"), boost=100)
See: http://msdn.microsoft.com/en-us/library/ff394462.aspx
xrank(or(cat, dog), thoroughbred, boost=500, boostall=yes)

Related

Using regexp in Big Query to extract URLs

I've been trying to extract any URL present within my 'Text' column in Big Query. The column contains a mixture of text and URLs dotted throughout (a cell might contain more than one URL) I'm trying to use this regexp:
SELECT
REGEXP_EXTRACT (Text, r'(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9%_:?\+.~#&//=]*')
FROM
Data.Text_Files
I currently get 'failed to parse regular expression' when I try to run the query. I've tried modifying it but to no avail.
The regexp works in an online builder but I'm just not sure how to incorporate it into Big Query.
Any help would be much appreciated - or at least pointers on how to incorporate regular expressions into Big Query!
Try below - it is for BigQuery Standard SQL (see Enabling Standard SQL and Migrating from legacy SQL)
WITH YourTable AS (
SELECT 1 AS id, 'What have you tried so far? Please edit your question to show a [Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) of the code that you are having problems with, then we can try to help with the specific problem. You can also read [How to Ask](http://stackoverflow.com/help/how-to-ask). ' AS Text UNION ALL
SELECT 2 AS id, 'Important on SO, you can mark accepted answer by using the tick on the left of the posted answer, below the voting. see http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work#5235 for why it is important. There are more ... You can check about what to do when someone answers your question - http://stackoverflow.com/help/someone-answers.' AS Text UNION ALL
SELECT 3 AS id, 'If an answer has helped you solve your problem and you accept it you should also consider voting it up. See more at http://stackoverflow.com/help/someone-answers and Upvote section in http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work#5235' AS Text
)
SELECT
id,
REGEXP_EXTRACT_ALL(Text, r'(?i:(?:(?:(?:ftp|https?):\/\/)(?:www\.)?|www\.)(?:[\da-z-_\.]+)(?:[a-z\.]{2,7})(?:[\/\w\.-_\?\&]*)*\/?)') AS URL
FROM YourTable
This gives you output with id field, and repeated field with all respective URLs
If you need flattened result - you can use below variation
WITH YourTable AS (
SELECT 1 AS id, 'What have you tried so far? Please edit your question to show a [Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) of the code that you are having problems with, then we can try to help with the specific problem. You can also read [How to Ask](http://stackoverflow.com/help/how-to-ask). ' AS Text UNION ALL
SELECT 2 AS id, 'Important on SO, you can mark accepted answer by using the tick on the left of the posted answer, below the voting. see http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work#5235 for why it is important. There are more ... You can check about what to do when someone answers your question - http://stackoverflow.com/help/someone-answers.' AS Text UNION ALL
SELECT 3 AS id, 'If an answer has helped you solve your problem and you accept it you should also consider voting it up. See more at http://stackoverflow.com/help/someone-answers and Upvote section in http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work#5235' AS Text
)
SELECT
id, URL
FROM (
SELECT id, REGEXP_EXTRACT_ALL(Text, r'(?i:(?:(?:(?:ftp|https?):\/\/)(?:www\.)?|www\.)(?:[\da-z-_\.]+)(?:[a-z\.]{2,7})(?:[\/\w\.-_\?\&]*)*\/?)') AS URL
FROM YourTable
), UNNEST(URL) as URL
Note: you can use here any regexp that you will be able to find on web - but what a must is - there is only one matching group is allowed! so all inner matching group should be escaped with ?: as you can see it in above examples. So the ONLY group that you expect to see in output should be left as is - w/o ?:
Your regex has an incomplete capturing group, and has 2 unescaped characters. I don't know which online regex builder you're using, but maybe you forgot to put your new regex into it?
The problems are as follows:
(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9%_:?\+.~#&//=]*
POINTERS TO PROBLEMS ON THIS LINE ---> ^1 ^^2
This is the start of a capturing group with no end. You probably want the ) right before the *.
All slashes need to be escaped. This should probably be \/ or maybe even \/\\.
Here is an example with both of my suggestions implemented: https://regex101.com/r/pt1hqS/1
Good luck fixing it!

Product Index Using Django ORM

I have a list of Products with a field called 'Title' and I have been trying to get a list of initial letters with not much luck. The closes I have is the following that dosn't work as 'Distinct' fails to work.
atoz = Product.objects.all().only('title').extra(select={'letter': "UPPER(SUBSTR(title,1,1))"}).distinct('letter')
I must be going wrong somewhere,
I hope someone can help.
You can get it in python after the queryset got in, which is trivial:
products = Project.objects.values_list('title', flat=True).distinct()
atoz = set([i[0] for i in products])
If you are using mysql, I found another answer useful, albeit using sql(django execute sql directly):
SELECT DISTINCT LEFT(title, 1) FROM product;
The best answer I could come up with, which isn't 100% ideal as it requires post processing is this.
atoz = sorted(set(Product.objects.all().extra(select={'letter': "UPPER(SUBSTR(title,1,1))"}).values_list('letter', flat=True)))

Unable to query using 4 conditions with WHERE clause

I am trying to query a database to obtain rows that matches 4 conditions.
The code I'm using is the following:
$result = db_query("SELECT * FROM transportesgeneral WHERE CiudadOrigen LIKE '$origen%' AND DepartamentoOrigen LIKE '$origendep' AND DepartamentoDestino LIKE '$destinodep' AND CiudadDestino LIKE '$destino%'");
But it is not working; Nevertheless, when I try it using only 3 conditions; ie:
$result = db_query("SELECT * FROM transportesgeneral WHERE CiudadOrigen LIKE '$origen%' AND DepartamentoOrigen LIKE '$origendep' AND DepartamentoDestino LIKE '$destinodep'");
It does work. Any idea what I'm doing wrong? Or is it not possible at all?
Thank you so much for your clarification smozgur.
Apparently this was the problem:
I was trying to query the database by using the word that contained a tittle "Petén" so I changed the database info and replaced that word to the same one without the tittle "Peten" and it worked.
Now, im not sure why it does not accept the tittle but that was the problem.
If you have any ideas on how I can use the tittle, I would appreciate that very much.

URL encoding a SPARQL query for dbpedia.org

I am trying to write a code (in ABAP) for URL encoding a SPARQL query for dbpedia.org.
Is there any reference code or method?
Ex: Input:
select distinct ?Concept where {[] a ?Concept} LIMIT 100
Output:
http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=select+distinct+%3FConcept+where+%7B%5B%5D+a+%3FConcept%7D+LIMIT+100&format=text%2Fhtml&timeout=0&debug=on
Thanks
Krishna
Disclaimer: I know nothing about ABAP.
Having said that, I'd be surprised if there isn't a library function to do that. If not, you can try porting the one from the JDK:
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7-b147/java/net/URLEncoder.java#URLEncoder

Lucene query syntax

I want to write a Lucene query which is the equivalent of the following SQL
where age = 25
and name in ("tom", "dick", "harry")
The best I've come up with so far is:
(age:25 name:tom) OR
(age:25 name:dick) OR
(age:25 name:harry)
Is there a more succinct way to write this?
Thanks,
Don
age:25 AND name:(tom OR dick OR harry)
alternatively
+age:25 +name:(tom OR dick OR harry)
Does this work?
age:25 AND (name:tom OR name:dick OR name:harry)
I understand this may not be what you're looking for. I didn't know if the purpose of your question was to factor out the age:25 clause or if it was to eliminate the name: prefixes.
If you make name your QueryParser's default field, you could reduce this down to:
age:25 AND (tom OR dick OR harry)
It's not much more succinct, but you can try:
(age:25) AND (name:tom OR name:dick OR name:harry)