Filter inputs in custom ContentProvider functions - sql

In a custom ContentProvider I need to filter out some columns specified in the inputs. Given the text-oriented Android interfaces this is giving me a hard time.
For example the input on MyContentProvider.query() would effectively ask something like:
SELECT column_a, column_b FROM my_table WHERE column_a=1 AND column_b=red;
The problem is that at this particular MyContentProvider _column_b_ might not make any sense and would not be present in the table. Filtering the projection so that only relevant columns remain can be easily done since it's a String[]. However, filtering the String "where" (selection) and "selectionArgs" inputs for these columns is not trivial. If done properly it would become:
SELECT column_a FROM my_table WHERE column_a=1;
Otherwise one would get a SQLiteException "no such column".
So, is there any easy way to ignore or filter columns from such an sql statement or do I need to go and write some smart albeit very limited regexp parsing code for the selection part?
The reason I'm not getting the right inputs is because I maintain a custom ContentProvider as an interface to address, but I talk to multiple custom ContentProviders herein (in the background). One way or another, I would need to filter the selection somewhere.
Please note that I am not asking simply how to do a query or use the SELECT ... WHERE statement. However it concerns my implementation of the query() function.

Since you are extending your MyContentProvider with ContentProvider why don't you just overload the query() method?
Look at ContentProvider - Sharing Content using the ContentProvider for someone elses example on how to create a custom ContentProvider. You should have full control over what data you fetch from your SQLiteDatabase.
More importantly, look at the arguments provided to query(), as they contain the information you need to you in a way where you can dynamically build the query from what is passed into the method call.
Depending on if you can find a good query builder, you have an opportunity to build a small but powerful abstraction layer to build your queries, so that you minimize the amount of actual SQL that you write yourself.
Also, always remember to sanitize your inputs!

Related

Keyword based JPA query with statuses as Enums and with NOT clause [Kotlin]

I have a Keyword based JPA query I need to modify in order to exclude records with a particular status. Currently, I have the following:
findAllByLatestVersion_Entity_DataFieldGreaterThanEqualAndLatestVersion_AnotherFieldNull(datefield: Istant, pageable: Pageable)
I do not want to parameterise, therefore I would like to have the query to work as there was a WHERE clause stating that the status IS NOT C, for example. I am struggling to find clear documentation on how to go about. Is it possible to write something along these lines:
findAllByLatestVersion_Entity_DataFieldGreaterThanEqualAndLatestVersion_AnotherFieldNullAndLatestVersion_StatusCNot(datefield: Istant, pageable: Pageable)
Thank you
No this is not possible with query derivation, i.e. the feature you are using here. And even if it were possible you shouldn't do it.
Query derivation is intended for simple queries where the name of the repository method that you would choose anyway perfectly expresses everything one needs to know about the query to generate it.
It is not intended as a replacement for JPQL or SQL.
It should never be used when the resulting method name isn't a good method name.
So just formulate the query as a JPQL query and use a #Query annotation to specify it.

Endeca search query on multiple fields

How to create an Endeca query on combination of multiple fields [just like where clause in sql query]. Suppose we have three fields indexed are -
empId
empName
empGender
Now, I need a query like "where empName like 's%' AND empGender=male"
Thanks.
Firstly,
Checkout Record Filters in the Advanced Development Guide.
If you are trying to use a Record Filter on a property, you will need to enable it explicitly in Developer Studio for that property, while your Dimensions will automatically have the ability to apply a Record Filter. This will help when you have explicit values to filter on, for example empGender.
Your Record Filter can then look as follow:
Nr=AND(empGender:male)
You can further use the Ntk parameter to specify fields to search on so assuming your empName field is enabled for wildcard searching (configure this in Developer Studio) searching this field will look as follow:
Ntk=empName&Ntt=s*
So assuming your properties have been configured correctly, your example above will probably end up looking as follow:
Nr=AND(empGender:male)&Ntk=empName&Ntt=s*
To take this one step further, you can specify Search Filters (ie. Ntk + Ntt parameters) together. I haven't tried this for wildcards so you'll need to confirm that yourself but to combine Search Filters you delimit them with |
Ntk=empName|empId&Ntt=s*|1234*
I suggest you manually build up queries in the Reference Application to confirm you get your expected results and then start to code this up in your application.
radimbe, the problem with record filters for this use case is that they need to be precise. This means you don't get pelling correction, thesaurus expansion, case insensitivity or stemming. It's very unlikely that a user will input precise information like this.
Saraubh, you can do a boolean search to do OR text search queries. You can also use the Endeca Query Language to specify a complex set of boolean logic that goes beyond boolean search and which would incorporate spelling correction, stemming, etc.
In general though, I think for an application like this, you should move away from searching specific individual fields simultaneously and make use of the faceting capabilities of dimensions to guide the user. Additionally, a search box that searches many fields in combination simultaneously in order of importance is really the way to go for a simplified user interface for this sort of application.

Expanding an arbitrary Lucene Query

I am using Lucene 3.6.1. I receive a query from a user. This query may contain + or - operators, and may also contain phrases. In certain circumstances, I would like to expand the query by adding some extra terms that I compute. These terms are optional. However, any required include/exclude constraints specified by the user must be respected.
My initial strategy was to create a BooleanQuery, add a clause to it that contains the parsed user query, and then add further clauses that contain my expansion terms. The expansion terms would all be added as Occur.SHOULD. My question is how to constrain the user's query. I can imagine three possibilities:
The user's query contains no operators, which means I can include it as an Occur.SHOULD clause.
The user's query contains a + operator, so I need to include it as an Occur.MUST clause.
The user's query contains a - operator, but also other terms: Do I still include it as an Occur.MUST clause?
The question implicit in these three choices is how do I tell which condition is appropriate? I suppose I can rewrite the query and test for BooleanQuery instances, but that seems brittle.
I suppose can also try to tactic of creating a single string from the user's input and from my expansion terms, like this:
(fld1:userterm1 userterm2 -fld2:userterm3 +userterm4)^10 (fld1:expterm1)^8 (fld2:expterm2)^7 ...
Is this the best way to go? Or is there some elegant programmatic solution?
Okay, Not sure how useful this answer will be, but can't seem to come up with a hard and fast answer here, so I'll list a couple possibilities that come to mind:
First, a problem:
Modifying the query to look like:
(userquery) (other) (stuff)
I makes some sense to add the + with he rules you've shown, but a '-' prohibited term will be hard to respect correctly, since (query -prohibition) (other) will allow matches on other with prohibition present as well, and +(query -prohibition) (other) will require 'query' be matched.
The only way I see to really do that part right is to propagate the prohibited term into your automatically added terms as well, or extract it out to a parent query layer, more like (query -prohibition) --> (query) (other) -(prohibition).
And with user entered queries of arbitrary complexity, that may not be a great strategy.
If you want to tackle it by modifying the query string, then you should probably just add any terms to the end of the query. Nothing more to it.
I don't believe
(fld1:userterm1 userterm2 -fld2:userterm3 +userterm4)^10 (fld1:expterm1)^8 (fld2:expterm2)^7 ...
Is satisfactory, because userterm4 is only required within it's subquery, but a match Only on expterm1 is still acceptable. However, a query like:
fld1:userterm1 userterm2 -fld2:userterm3 +userterm4 (fld1:expterm1)^.8 (fld2:expterm2)^.7 ...
Should, I think, satisfy your needs, and prevents you from having to worry about the internals of your queryparser. I think this is the best approach.
I can also see logic in a query structured like
+(parsed userquery) (other stuff)
Effectively, always requiring a match on the user query. Lucene implicitly does this, in a sense, as it won't return a result that matches no term, even if no required fields are present in the query. This would then be using your added terms to impact scoring, rather than return a larger set of documents. This doesn't quite address what your asking, but might be worth considering.
If, despite the aforementioned problems of applying them, you still want to detect '+' and '-' operators, I think it can be reasonably assumed that a StandardQueryParser will return a BooleanQuery at base level for any query that you need to check for these operands on. You might have to worry about, for instance, DisjunctionMaxQueries, as well as what will happen when you have a simple query with an operator, like:
+myterm
I don't know if QueryParser would simply return a TermQuery, losing the plus (since it would be redundant without another term present). Concerns like that make me hesitant to address it in this way.
Similarly, attempting to detect these values from the query string must make assumptions about how things are parsed, and could become complicated.
To sumamrize, I think the best options are to, either: add terms to the end of the raw query string before doing any parsing, or treat the user query as atomic, and define the appropriate booleanclause independant of it's contents when adding to a boolean clause wrapping it with whatever other queries you need to include.

RESTful API Design OR Predicates

I'm designing a RESTful API and I'm trying to work out how I could represent a predicate with OR an operator when querying for a resource.
For example if I had a resource Foo with a property Name, how would you search for all Foo resources with a name matching "Name1" OR "Name2"?
This is straight forward when it's an AND operator as I could do the following:
http://www.website.com/Foo?Name=Name1&Age=19
The other approach I've seen is to post the search in the body.
You will need to pick your own approach, but I can name few that seem to be pretty logical (although not without disadvantages):
Option 1.: Using | operator:
http://www.website.com/Foo?Name=Name1|Name2
Option 2.: Using modified query param to allow selection by one of the values from the set (list of possible comma-separated values):
http://www.website.com/Foo?Name_in=Name1,Name2
Option 3.: Using PHP-like notation to provide list instead of single string:
http://www.website.com/Foo?Name[]=Name1&Name[]=Name2
All of the above mentioned options have one huge advantage: they do not interfere with other query params.
But as I mentioned, pick your own approach and be consistent about it across your API.
Well one quick way to fixing that is to add an additional parameter that is identifying the relationship between your parameters wether they're an and or an or for example:
http://www.website.com/Foo?Name=Name1&Age=19&or=true
Or for much more complex queries just keep a single parameter and in it include your whole query by making up your own little query language and on the server side you would parse the whole string and extract the information and the statement.

Android Notepad Uri Explanation

In the android Notes demo, it accepts the URI:
sUriMatcher.addURI(NotePad.AUTHORITY, "notes", NOTES);
sUriMatcher.addURI(NotePad.AUTHORITY, "notes/#", NOTE_ID);
Where the difference between notes and notes/# is that notes/# returns the note who's ID matches #.
However, the managedQuery() method that is used to get data from the content provider has the following parameters:
Parameters
uri The URI of the content provider to query.
projection List of columns to return.
selection SQL WHERE clause.
selectionArgs The arguments to selection, if any ?s are pesent
sortOrder SQL ORDER BY clause.
So, is there any particular cause for the design decision of providing a URI for that, rather than just using the selection parameter? Or is it just a matter of taste?
Thank you.
I thinks its so you can do more complex lookups without having to complicate your selections and arguments. For example in my project I have multiple tables but use the same selection and arguments. To filter content. By using the URI I don't have interpret the query, I can just switch on the URI. It.might be personal taste to begin with. But in more complex scenarios you appreciate the URI. You can also use * to match strings in the same.way you can with#.
I think it's mostly a matter of taste. IMHO, putting the id in the Uri is a little cleaner since you can make the id opaque rather than require the client to know that it actually represents a specific row id. For instance, you can pass a lookup key (like in the the Contacts API) rather than a specific row id.