My Lucene/Solr database contains a date column (created_at) that I need to use as a condition in a query.
I'm new to RoR and assume that RoR automatically uses its own date object upon anyObject.save, and that Solr in turn reindexes that column in its own way.
Regardless, the date is in this format: "2008-06-03 11:15:20"
I can write a quick parser to parse my query string into the above format, but when I query
Object.find(keyword:foo created_at >= '2008-06-03 11:15:20')
Solr throws a parsing error. I've tried several standard variations on this without success. Any suggestions?
I hate to ask the obvious, but have you checked the solr docs for the query language for dates?
http://wiki.apache.org/solr/SolrQuerySyntax
Experience also suggests that ">=" is not a valid Solr operator. You can do range queries on date fields, but the correct format for your example query would be:
Object.find("keyword:foo AND created_at:[2008-06-03T11:15:20.000Z TO *]")
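The query string itself is language-agnostic, so the "quick parser" can live anywhere in your stack. As a sketch (in Python rather than Ruby, purely for illustration; the field name created_at comes from the question), converting a timestamp into Solr's canonical UTC format and building the range query might look like:

```python
from datetime import datetime, timezone

def solr_date(dt):
    # Solr expects ISO 8601 in UTC with a trailing 'Z',
    # e.g. 2008-06-03T11:15:20.000Z (milliseconds included).
    utc = dt.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + "%03dZ" % (utc.microsecond // 1000)

created = datetime(2008, 6, 3, 11, 15, 20, tzinfo=timezone.utc)
query = "keyword:foo AND created_at:[%s TO *]" % solr_date(created)
print(query)  # keyword:foo AND created_at:[2008-06-03T11:15:20.000Z TO *]
```

The open upper bound (`TO *`) is Solr's way of expressing "on or after", which is what the ">=" in the original attempt was reaching for.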
Related
I am trying to do a fairly simple query that gets all the records between two dates via a specific column.
The raw SQL works fine in ISQL Firebird:
SELECT * FROM OPS_HEADER
WHERE PB_BOL_DT
BETWEEN '2020-09-01' AND '2020-09-10';
Here is my ActiveRecord Conversion:
OpsHeader.where('pb_bol_dt BETWEEN 2020-09-01 AND 2020-09-10')
This above line gives me this error:
expression evaluation not supported expression evaluation not supported Only one operand can be of type TIMESTAMP
I may be converting it wrong but it sure seems like this is the exact way to do it... I have no idea why it's giving me so much trouble.
You're missing quotes on the date literals; you'd want to start with:
OpsHeader.where(%q(pb_bol_dt BETWEEN '2020-09-01' AND '2020-09-10'))
But you can get ActiveRecord to build a BETWEEN by passing it a range:
OpsHeader.where(pb_bol_dt: '2020-09-01' .. '2020-09-10')
and letting AR deal with the quoting. You could also pass the end points separately using positional or named placeholders:
OpsHeader.where('pb_bol_dt between ? and ?', '2020-09-01', '2020-09-10')
OpsHeader.where('pb_bol_dt between :lower and :upper', lower: '2020-09-01', upper: '2020-09-10')
All of these will end up sending the same SQL to the database; the only difference is the small amount of string processing and type handling needed to build the latter three queries.
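The original error makes more sense once you see what an unquoted date literal is to a SQL parser: 2020-09-01 is integer subtraction, not a date. A small sketch using Python's built-in sqlite3 as a stand-in for Firebird (table and column names borrowed from the question; the exact Firebird error differs, but the parse is the same):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Without quotes, 2020-09-01 is arithmetic: 2020 - 9 - 1 = 2010.
print(conn.execute("SELECT 2020-09-01").fetchone()[0])  # 2010

# With quoted literals, BETWEEN compares against the column correctly.
conn.execute("CREATE TABLE ops_header (pb_bol_dt TEXT)")
conn.execute("INSERT INTO ops_header VALUES ('2020-09-05'), ('2020-10-01')")
rows = conn.execute(
    "SELECT pb_bol_dt FROM ops_header "
    "WHERE pb_bol_dt BETWEEN '2020-09-01' AND '2020-09-10'"
).fetchall()
print(rows)  # [('2020-09-05',)]
```

So Firebird was being asked to compare a TIMESTAMP column against the integer 2010, hence "Only one operand can be of type TIMESTAMP".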
I'm new to Airflow and I'm currently stuck on an issue with the BigQuery operator.
I'm trying to execute a simple query on a table from a given dataset and copy the result to a new table in the same dataset. I'm using the BigQuery operator to do so, since according to the docs the 'destination_dataset_table' parameter is supposed to do exactly what I'm looking for (source: https://airflow.apache.org/docs/stable/_api/airflow/contrib/operators/bigquery_operator/index.html#airflow.contrib.operators.bigquery_operator.BigQueryOperator).
But instead of copying the data, all I get is a new empty table with the schema of the one I'm querying from.
Here's my code:

from datetime import datetime, timedelta

from airflow import DAG
from airflow.contrib.operators.bigquery_operator import BigQueryOperator

default_args = {
    'owner': 'me',
    'depends_on_past': False,
    'start_date': datetime(2019, 1, 1),
    'end_date': datetime(2019, 1, 3),
    'retries': 10,
    'retry_delay': timedelta(minutes=1),
}

dag = DAG(
    dag_id='my_dag',
    default_args=default_args,
    schedule_interval=timedelta(days=1),
)

copyData = BigQueryOperator(
    task_id='copyData',
    dag=dag,
    sql="SELECT some_columns,x,y,z FROM dataset_d.table_t WHERE some_columns=some_value",
    destination_dataset_table='dataset_d.table_u',
    bigquery_conn_id='something',
)
I don't get any warnings or errors, the code is running and the tasks are marked as success. It does create the table I wanted, with the columns I specified, but totally empty.
Any idea what I'm doing wrong?
EDIT: I tried the same code on a much smaller table (from 10 GB down to a few KB), performing a query with a much smaller result (from 500 MB down to a few KB), and it did work this time. Do you think the size of the table or of the query result matters? Is it limited? Or does too large a query cause a lag of some sort?
EDIT2: After a few more tests I can confirm that this issue is not related to the size of the query or the table. It seems to have something to do with the date format. In my code the WHERE condition is actually checking whether a date_column = 'YYYY-MM-DD'. When I replace this condition with an int or string comparison it works perfectly. Do you know if BigQuery uses a particular date format or requires a particular syntax?
EDIT3: Finally getting somewhere: When I cast my date_column as a date (CAST(date_column AS DATE)) to force its type to DATE, I get an error that says that my field is actually an int-32 (Argument type mismatch). But I'm SURE that this field is a date, so that implies that either Bigquery stores it as an int while displaying it as a date, or that the Bigquery operator does some kind of hidden type conversion while loading tables. Any idea on how to fix this?
I had a similar issue when transferring data from data sources other than BigQuery.
I suggest casting the date_column as follows: to_char(date_column, 'YYYY-MM-DD') as date
In general, I have seen that BigQuery's schema auto-detection is often problematic. The safest way is to always specify the schema before executing the corresponding query, or to use operators that support schema definition.
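One more thing worth checking, given the EDIT3 symptoms: the contrib BigQueryOperator defaults to BigQuery's legacy SQL dialect, and legacy SQL handles date literals and comparisons differently from standard SQL, which can silently change what a date-based WHERE clause matches. A hedged sketch of the operator with the dialect pinned to standard SQL (use_legacy_sql and write_disposition are documented parameters of the contrib operator; the SQL itself is illustrative, not the asker's real query):

```python
copyData = BigQueryOperator(
    task_id='copyData',
    dag=dag,
    sql="SELECT some_columns, x, y, z FROM dataset_d.table_t "
        "WHERE date_column = DATE '2019-01-01'",  # standard-SQL DATE literal
    use_legacy_sql=False,                 # contrib operator defaults to legacy SQL
    destination_dataset_table='dataset_d.table_u',
    write_disposition='WRITE_TRUNCATE',   # replace the destination table each run
    bigquery_conn_id='something',
)
```

This is only a sketch; it needs a working Airflow deployment and BigQuery connection to actually run.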
I am using Lucene.Net v3.0.3.0 for indexing and searching. I have a "CreateDateTime" field which stores the document creation datetime. I would like to create a DateTime range query with a boolean "NOT" condition; that is, I would like to retrieve all documents whose CreateDateTime is not in a given range. I am able to build the query, but it returns no results.
Dates are indexed in yyyyMMddHHmmssfff format (date, time, milliseconds).
my date range is 7/15/2014 12:00:00 AM To 3/31/2015 11:59:59 PM
My final query is as follows,
-CreateDateTime:[20140715000000000 TO 20150331235959000]
I tried the same query with the Luke tool as well; it also returns no results. The index was created normally and I am able to run all types of queries on it except a date-range query with a NOT condition. NOT works perfectly fine on other fields.
Any Suggestions ?
Is this your only query in the search request? You can't just provide a negative query; you need some matching query too. Add a MatchAllDocsQuery to your BooleanQuery; the result should end up as *:* -CreateDateTime:[...]
I have some documents stored in Raven, which have a CreationDate property, of type "DateTimeOffset".
I have been attempting to get these documents returned in a query from C#, and they are never returned if I use the CreationDate in the query criteria.
After watching the Raven console, I saw the query being issued was:
Query: (FeedOwner:25eb541c\-b04a\-4f08\-b468\-65714f259ac2) AND ( CreationDate:[20120524120000000 TO NULL] AND CreationDate:{* TO 20120525120000000})
I ran this query directly against HTTP, and changed the date format to:
Query: (FeedOwner:25eb541c\-b04a\-4f08\-b468\-65714f259ac2) AND ( CreationDate:[2012-05-24120000000 TO NULL] AND CreationDate:{* TO 2012-05-25120000000})
And now it works - it returns my documents which DEFINITELY fall within the range.
Is Raven generating the incorrect date format for lucene? If so, how do I fix this?
Notes:
Yes I need time zone support
Yes I need Time as well as Date in my index.
Thanks
[EDIT]
Err... I just changed my entities to use DateTime, just for giggles... and it still fails to return the data... what's going on??? I'm using RavenDB.Server.1.2.2002-Unstable.
Adam,
You are using RavenDB Server 1.2 pre release bits with the RavenDB Client 1.0 Stable.
Those are incompatible.
I'm using ColdFusion 9.0.1 and the integrated SOLR full text search engine.
I have dates stored in my SQL Server database as datetime fields for upcoming events. I took these records and inserted them into a SOLR collection with the custom3 and custom4 fields being the dateStart and dateEnd dates respectively. Users want to query the collection against a date range and sort by closest date to now.
First question: How do we set the datatype for the custom1-4 fields? Or, can we? Based on this post, Optimizing Solr for Sorting, the field should be set to either tdate or date rather than string for best performance. Or does SOLR automatically make the field have the correct datatype based on this post, Sort by date in Solr/Lucene performance problems?
Second question: How would the search criteria be structured to pull records? How about between May 1, 2011 and July 31, 2011, for example?
I don't tell too many people this, but for you, I believe it's time to ditch CFINDEX/CFSEARCH, and start using Solr directly.
CF's implementation is built for indexing a large block of text with some attributes, not a query. If you start using Solr directly, you can create your own schema, and have far more granular control of how your search works. Yes, it's going to take longer to implement, but you will love the results. Filtering by date is just the beginning.
Here's a quick overview of the steps:
Create a new index using the CFAdmin. This is the easy way to create all the files you need.
Modify the schema. The schema is in [cfroot]/solr/multicore/[your index name]/conf/
The top half of the schema is <types>. This defines all the datatypes you could use. The bottom half is the <fields>, and this is where you're going to be making most of your changes. It's pretty straightforward, just like a table. Create a field for each "column" you want to include. "indexed" means that you want to make that field searchable. "stored" means that you want the exact data stored, so that you can use it to display results. Because I'm using CF9's ORM, I don't store much beyond the primary key, and I use loadEntityByPK() on my results page.
After modifying the schema, you need to restart the solr service/daemon.
Use http://cfsolrlib.riaforge.org/ to index your data (the add method is a 'insert or modify' style method), and to perform the search.
To do a search, check out this example. It shows how to sort and filter by date. I didn't test it, so the format of the dates might be wrong, but you'll get the idea. http://pastebin.com/eBBYkvCW
Sorry this answer is so general; I hope I can get you going down the right path here :)
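To make the date parts of the steps above concrete, here is a hedged sketch of what the <fields> additions and the search criteria might look like. The field names dateStart/dateEnd and the tdate type are assumptions based on the question and the stock Solr example schema of that era, not anything CF generates for you:

```xml
<!-- conf/schema.xml: store the event dates as real date fields.
     tdate is a trie-based date type tuned for fast range queries. -->
<field name="dateStart" type="tdate" indexed="true" stored="true"/>
<field name="dateEnd"   type="tdate" indexed="true" stored="true"/>
```

The "between May 1, 2011 and July 31, 2011" criteria then becomes an ordinary Solr range query, e.g. dateStart:[2011-05-01T00:00:00Z TO 2011-07-31T23:59:59Z], and "closest date to now" is a sort parameter such as sort=dateStart asc.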