Strange behaviour when using "revert" parameter in Cumulocity API

When using the REST API to get measurements I can pass the revert parameter to reverse the order of the measurements:
https://tenant.cumulocity.com/measurement/measurements?revert=true
However, as soon as I also pass a device as a source parameter,
https://tenant.cumulocity.com/measurement/measurements?revert=true&source=876123
the revert keyword seems to stop working. The first value shown is sometimes from a day before the actual most recent value. In fact, when specifying revert=false or omitting the parameter, the timestamp of the first value shown is chronologically AFTER the timestamp of the value shown first with revert=true.
First Item with no source specified and revert=true: "2015-12-20T18:15:00.000+01:00"
First Item with source specified and revert=true: "2015-12-19T01:25:00.000+01:00"
First Item with source specified and revert=false: "2015-12-19T12:50:00.000+01:00"
Is there any explanation for this or is the revert keyword not valid when specifying a source?

For full understanding:
When someone queries our database, we pick a suitable index and return results in that index's order (for performance reasons); indexes are selected according to the query parameters.
The 'revert' parameter operates on this index: strictly speaking, it returns results in the reverse of the index order.
Why you get unexpected results in your cases:
First item with no source specified and revert=true ...
Here we use the 'natural' index (the order in which documents were added to the database) and return elements in the reverse of insertion order. Any match between insertion order and time order is accidental.
First item with source specified ...
In this case we use the 'bySource' index, which is a b-tree on the 'source' property, so results are sorted only by 'source'; the data is not ordered by 'time'.
If you want results for a specific 'source' ordered by 'time', you must force the application to use the bySourceAndTime index :)
To do this, simply use this URL:
https://tenant.cumulocity.com/measurement/measurements?dateTo=2015-12-22&dateFrom=1970-01-01&source=10307&revert=false
In this case the application picks the bySourceAndTime index, whose data is ordered by source and time, so when you add revert=true the results will be in reverse time order.
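In other words, to get the most recent measurements first for one device, use the same URL with revert=true:
https://tenant.cumulocity.com/measurement/measurements?dateFrom=1970-01-01&dateTo=2015-12-22&source=10307&revert=true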
Best regards,
Arkadiusz

Related

"Cannot construct data type datetime" when filtering data, but all values filtered DO have valid dates

I am convinced that this question is NOT a duplicate of:
Cannot construct data type datetime, some of the arguments have values which are not valid
In that case the values passed in are explicitly invalid, whereas in this case all the values that the function could reasonably be expected to be called on are valid.
I know what the actual problem is, and it's not something that would help most people that find the other question. But it IS something that would be good to be findable on SO.
Please read the answer, and understand why it's different from the linked question before voting to close as dupe of that question.
I've run some SQL that's errored with the error message: Cannot construct data type datetime, some of the arguments have values which are not valid.
My SQL uses DATETIMEFROMPARTS, but it's fine evaluating that function in the select - it's only a problem when I filter on the selected value.
It's also demonstrating weird, can't-possibly-be-happening behaviour w.r.t. other changes to the query.
My query looks roughly like this:
WITH FilteredDataWithDate AS (
SELECT *, DATETIMEFROMPARTS(...some integer columns representing date data...) AS Date
FROM Table
WHERE <unrelated pre-condition filter>
)
SELECT * FROM FilteredDataWithDate
WHERE Date > '2020-01-01'
If I run that query, then it errors with the invalid data error.
But if I omit the final Date > filter, then it happily renders every result record, so clearly none of the values it's filtering on are invalid.
I've also manually examined the contents of Table WHERE <unrelated pre-condition filter> and verified that everything is a valid date.
It also has a wild collection of other behaviours:
If I replace all of ...some integer columns representing date data... with hard-coded numbers then it's fine.
If I replace some parts of that data with hardcoded values, that fixes it, but replacing other parts doesn't; I can't find any particular pattern in what does or doesn't help.
If I remove most of the * columns from the Table select, then it starts to be fine again.
Specifically, it appears to break any time I include an nvarchar(max) column in the CTE.
If I add an additional filter to the CTE that limits the results to Id values in the following ranges, then the results are:
between 130,000 and 140,000: Error.
between 130,000 and 135,000: Fine.
between 135,000 and 140,000: Fine?!
Filtering by the Date column breaks everything ... but ORDER BY Date is fine. (and confirms that all dates lie within perfectly sensible bounds.)
Adding TOP 1000000 makes it work ... even though there are only about 1000 rows.
... WTAF?!
This took me a while to decode, but it turns out that the SQL Server compiler doesn't necessarily restrict its execution of the function to rows that are, or could be, relevant to the result set.
Depending on the execution plan it arrives at, the function could get called on any record in Table, even one that doesn't satisfy WHERE <unrelated pre-condition filter>.
This was found by another user, for another function, over here.
So the fact that it could return all the results without the filter wasn't actually proving that every input into the function was valid. And indeed there were some records in the table that weren't in the result set, but still had invalid data.
That actually means that even if you were to add an explicit WHERE filter to exclude rows containing invalid date-component data ... that isn't actually guaranteed to fix it, because the function may still get called against the 'excluded' rows.
Each of the random other things I did will have been influencing the query plan in one way or another that happened to fix/break things.
The solution is, naturally, to fix the underlying table data.
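If it helps to hunt down the offending rows, a query along these lines is one option. This is only a sketch: Y, M and D are placeholders for the actual integer date-component columns, and TRY_CONVERT (SQL Server 2012+) returns NULL instead of raising an error, so it is safe to filter on:
-- Find rows whose components cannot form a valid date.
-- Y, M, D are placeholders for the real integer columns.
SELECT *
FROM Table
WHERE TRY_CONVERT(date, CONCAT(Y, '-', M, '-', D)) IS NULL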

Lucene DocValuesField, SortedDocValuesField usage for filtering and sorting

I am going to switch to newest (4.10.2) version of Lucene and I'd like to make some optimization in my index and code.
I would like to use DocValuesField to get values but also for filtering and sorting.
So here I have some questions:
If I'd like to use a range filter (FieldCacheRangeFilter) I need to store the value in an XxxDocValuesField,
but if I want to use a terms filter (FieldCacheTermsFilter) I need to store the value in a SortedDocValuesField.
So it looks like if I want to use both range and terms filters I need two different fields. Am I right? Am I using this correctly?
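For concreteness, here is roughly what I mean; the field names are illustrative and writer is an already-open IndexWriter:
import org.apache.lucene.document.Document;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.SortedDocValuesField;
import org.apache.lucene.util.BytesRef;

Document doc = new Document();
// numeric doc values, intended for the range filter and numeric sorting
doc.add(new NumericDocValuesField("price", 42L));
// sorted (ordinal) doc values, intended for the terms filter
doc.add(new SortedDocValuesField("priceTerm", new BytesRef("42")));
writer.addDocument(doc);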
Another thing is Sort. I can choose between SortedNumericSortField and SortField. The first one requires a SortedNumericDocValuesField, the other a NumericDocValuesField. Is there any (big) difference in performance?
Should I use SortedNumericSortField (adding another field to the index)?
And the last one: am I right that all corresponding DocValuesFields are removed from the index when the document is deleted? I see an IndexWriter method for updating a doc value, but no delete method for doc values.
Regards
Piotr

filter lucene search based on a particular field

I want to return all matched documents found after a document with a certain value. The value is unique.
I have tried to use a numeric range filter (NumericRangeFilter). This is not a good solution, as the field values may be in any order.
Using a numeric range is the correct way to accomplish what you want, if I understand what you need. In order to sort on the same field, you'll need to pass a Sort argument to your search call, something like:
Sort sort = new Sort(new SortField("myNumericField", SortField.Type.INT));
searcher.search(query, maxDocs, sort);
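If I understand the question correctly (you want the documents that come after a given anchor document in that field's order), you can combine a range query on the same field with that sort. A sketch, reusing the sort from above and assuming anchorValue and maxDocs are defined:
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;

// match only documents strictly greater than the anchor document's value
Query afterAnchor = NumericRangeQuery.newIntRange(
    "myNumericField", anchorValue, Integer.MAX_VALUE, false, true);
// return them ordered by the same field
TopDocs hits = searcher.search(afterAnchor, maxDocs, sort);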

solr unable to search with exact value

I am using Solr 4.1.0 and I'm facing a strange issue. If I give a value to search for in a field, be it exact or involving a wildcard, I get 0 search results. On the other hand, if I just give the field name and a * in place of the value, I get all the results.
Also, if I search in the text field, i.e. the field into which I have copied the values of all my other fields, it gives me the correct output. text is, by default, my catch-all for all fields. feature is a field which has the value Butter.
So if I search the actual feature field with the exact value, or even with the starting letter and a *, I get nothing, while if I search the text catch-all field I can retrieve the value. Searching the feature field with just a *, however, correctly gives me the complete result list.
You can view the logs for text:Butter here, logs for feature:Butter here, logs for feature:B* here and logs for feature:* here
I'm facing this issue with this particular field only. Any pointers to what could be the reason behind this strange problem?
If you search without the field name, Solr is going to search in the default search field.
So make sure you are marking the fields you want to search on as default.
If you are using the dismax query handler, you can add them to the qf parameter.
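For example (the handler path and field names here are just illustrative):
http://localhost:8983/solr/select?defType=dismax&q=Butter&qf=feature+text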
Also, for wildcard queries, check your analyzers: on wildcard and fuzzy searches, no text analysis is performed on the search word. Since no analysis is done at query time for wildcard searches, lower-casing and stemming are applied only at index time, not at query time.
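For example, if the feature field's index-time analysis lower-cases tokens (an assumption about your schema), then Butter is indexed as butter: a query for feature:b* would match, but feature:B* would not, because the wildcard term is never lower-cased at query time.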

Lucene not giving results when specifying field

I have a database which I have indexed in Lucene (using Pylucene) by section (specified by markup in the document) using lucene's fields. This index seems to work fine. I can search it using the default field which is simply the entire document and get reasonable results.
The problem is, when I search it using a specific section (not the default), I expect to get a certain number of results back (as specified by IndexSearcher.search(query, results)), but instead it might simply return nothing. So my question is: how can I get it to return a ranked list with the number of results I specify?
The only place I specify the field is in the QueryParser, by calling:
QueryParser(Version.LUCENE_CURRENT, field, StandardAnalyzer(Version.LUCENE_CURRENT))
I would verify the index using Luke (which is something I do often when modifying my index strategy).
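Beyond that, one thing worth double-checking in Luke is that the query-time analyzer matches the one used at index time for that section field. A minimal sketch of a field-specific search in the Java API (which Pylucene mirrors); "section" is a placeholder for your field name and searcher is an already-open IndexSearcher:
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.util.Version;

// parse the terms against the specific field rather than the default one
QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "section",
    new StandardAnalyzer(Version.LUCENE_CURRENT));
Query query = parser.parse("some search terms"); // may throw ParseException
TopDocs hits = searcher.search(query, 50);       // request up to 50 ranked hits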