Lucene syntax as objects instead of query string in Azure search - lucene

I would like to send the filter as a syntax tree, and not a query string, to Azure search. Is that possible?
All I can find is to send the filter as a string.
I have a filter syntax like ( State eq 1 ) or ( Domain eq 'Love' ) but I'd like to send it parameterised to Azure search instead of as a string.
(It's a security thing - I'd prefer not to have to escape/sanitise the input data myself, and instead let Microsoft/Azure/Lucene take care of the details, as they know more about the inner workings than I do.)
Basically: I'd like to
filter =
Or (
Equal( "State", stateValue ),
Equal( "FieldName", domainValue )
)
Instead of me doing it like
filter = $"( State eq {MyStringEscapeFunction(stateValue)} ) " +
$"or ( Domain eq {MyStringEscapeFunction(domainValue)} )"

Filters in Azure Cognitive Search must be specified via the $filter parameter using OData-syntax.
https://learn.microsoft.com/en-us/azure/search/search-query-odata-filter
Your example filter is a valid OData filter, provided that your index defines State as a numeric field and Domain as a text field.
$filter=(State eq 1) or (Domain eq 'Love')
If I understand your question correctly, you have an application where the values 1 and 'Love' are inputs from end users. The Azure Search API will validate that the filter values are valid according to the datatype. Other than that, you are responsible for validating input to your application.
For example, assume that your input parameters are s and d for State and Domain, respectively. You risk someone trying to manipulate your filter to return results you did not intend:
yourpage.aspx?s=1&d=Love%27%20or%20Domain%20eq%20%27Hate
This could potentially cause your $filter query to become:
$filter=(State eq 1) or (Domain eq 'Love' or Domain eq 'Hate')
You are responsible for implementing validation. You must build a layer that validates end-user inputs before using them in a $filter query. Here you can check that the end user's state and domain inputs are limited to valid values before creating an OData filter. See examples here:
https://learn.microsoft.com/en-us/aspnet/core/mvc/models/validation?view=aspnetcore-7.0
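To make the validation layer above concrete, here is a minimal sketch in Python of building the filter from typed values instead of concatenating raw input. The helper names (odata_literal, equal, or_) and the ALLOWED_FIELDS set are made up for illustration and are not part of any Azure SDK. The only rule it relies on is that OData escapes a single quote inside a string literal by doubling it.

```python
from typing import Union

# Hypothetical allowlist of filterable fields; adjust to your own index schema.
ALLOWED_FIELDS = {"State", "Domain"}


def odata_literal(value: Union[str, int, float, bool]) -> str:
    """Render a Python value as an OData literal."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        # A single quote inside an OData string literal is escaped by doubling it.
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported filter value: {value!r}")


def equal(field: str, value: Union[str, int, float, bool]) -> str:
    """Build '(field eq literal)' for a field taken from the allowlist."""
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unknown field: {field!r}")
    return f"({field} eq {odata_literal(value)})"


def or_(*clauses: str) -> str:
    """Join already-built clauses with 'or'."""
    return " or ".join(clauses)


# The manipulated input from the example stays inside one string literal.
print(or_(equal("State", 1), equal("Domain", "Love' or Domain eq 'Hate")))
```

With the manipulated d parameter from above, the whole input ends up inside a single string literal rather than adding a second condition.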

Related

PostgreSQL RPCs : allow required array parameters that will be processed in ANY/IN keywords to be null/empty

I have a PostgreSQL RPC that aims to select filtered rows of a view.
This RPC requires some parameters (name_article, catg_article, color_article, etc).
Most of these parameters are int[]/bigint[] because I want the user to be able to request, say, "all blue articles or all red articles". I also want the user to be able to send empty parameters, in which case the request should assume they don't care about the color or category and return all possibilities.
The problem is that, from what I have read in many threads online, ANY() and IN() can't take an empty list. I'd like to allow that, because otherwise my filter system would have to handle every combination separately, which I really want to avoid.
This is what I found online and tried ( param is null or in()/any() ), but it doesn't work: it returns no articles. (The first WHERE condition is fine. Also, ignore the cast: catg_et_type is JSON, so I have to cast id_catgarticle from that JSON to BIGINT for the comparison to work.)
SELECT *
FROM dev.get_all_articles
WHERE get_all_articles.lib_article ILIKE '%' || $1 || '%'
AND ($2 is null or CAST(get_all_articles.catg_et_type->>'id_catgarticle' AS BIGINT) = any ($2));
Do you have any idea how I could allow empty arrays that will be processed with IN/ANY commands ?
Thanks a lot.
Problem solved, as mentioned in my reply to @LaurenceIsla's answer.
When sending an array parameter to a PostgREST API endpoint, the syntax is /rpc/endpoint?param={1,2,3}. So, to make the request accept an empty parameter in the URL (endpoint?param={}), I had to add this to the WHERE clause: OR $2 = '{}'. That's all. The syntax is tricky when you don't know it.
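For the client side, a small sketch in Python of producing that array literal, including the empty case. The function names are hypothetical, and the URL is left unencoded to match the form shown above.

```python
from typing import Iterable


def pg_int_array_literal(values: Iterable[int]) -> str:
    """Render integers as a PostgreSQL array literal, e.g. {1,2,3}; no values gives {}."""
    return "{" + ",".join(str(int(v)) for v in values) + "}"


def rpc_url(endpoint: str, param: str, values: Iterable[int]) -> str:
    """Build the RPC path in the same shape as /rpc/endpoint?param={1,2,3}."""
    return f"/rpc/{endpoint}?{param}={pg_int_array_literal(values)}"


print(rpc_url("endpoint", "param", [1, 2, 3]))
print(rpc_url("endpoint", "param", []))
```

The empty list yields param={}, which is the value the extra OR $2 = '{}' condition matches.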

Does gorm interpret the content of a struct with a logical OR?

I am new to SQL and, as an exercise, I am writing an API middleware that checks whether the information contained in some headers matches a database entry ("token-based authentication"). Database access is based on GORM.
To this end, I have defined my model as follows:
type User struct {
    ID       uint
    UserName string
    Token    string
}
In my middleware I retrieve the content of relevant headers and end up with the variables userHeader and tokenHeader. They are supposed to be matched to the database in order to do the authentication.
The user table has one single entry:
select * from users
// 1,admin,admintoken
The authentication code is
var auth User
res := db.Where(&User{UserName: userHeader, Token: tokenHeader}).Find(&auth)
if res.RowsAffected == 1 {
    // authentication succeeded
}
When testing this, I end up with the following two incorrect results (other combinations are correct):
with only one header set to a correct value (and the other one not present), the authentication succeeds (adding the other header with an incorrect value behaves correctly, i.e. authentication fails)
no headers set → authentication goes through
I expected my query to mean (in the context of the incorrect results above)
select * from users where users.user_name = 'admin' and users.token = ''
select * from users where users.user_name = '' and users.token = ''
and these queries behave correctly when run directly against the database from the console, i.e. they return zero rows.
The ORM query, however, seems to discard the missing headers and assume they are fine (this is at least my understanding).
I also tried to chain the Where clauses via
db.Where(&User{UserName: userHeader}).Where(&User{Token: tokenHeader}).Find(&auth)
but the result is the same.
What should be the correct query?
The gorm.io documentation says the following on the use of structs in Where conditionals:
When querying with struct, GORM will only query with non-zero fields,
that means if your field’s value is 0, '', false or other zero
values, it won’t be used to build query conditions ...
The suggested solution to this is:
To include zero values in the query conditions, you can use a map,
which will include all key-values as query conditions ...
So, when the token header or both headers are empty, but you still want to include them in the WHERE clause of the generated query, you need to use a map instead of the struct as the argument to the Where method.
db.Where(map[string]interface{}{"user_name": userHeader, "token": tokenHeader}).Find(&auth)
You can use Debug() to inspect the generated SQL (GORM's logger prints it); use it whenever you are unsure what SQL your code generates.
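To see why the two styles behave differently, here is a rough simulation in Python, with sqlite3 standing in for the real database. It is not GORM code and not how GORM is implemented; the function names are made up. It only mimics the documented rule: struct-style conditions drop zero values, map-style conditions keep every key.

```python
import sqlite3
from typing import Dict, List, Tuple


def _build(conditions: Dict[str, str]) -> Tuple[str, List[str]]:
    # Column names come from code, never from user input; values are bound parameters.
    clause = " AND ".join(f"{col} = ?" for col in conditions) or "1 = 1"
    return clause, list(conditions.values())


def where_skipping_zero_values(conditions: Dict[str, str]) -> Tuple[str, List[str]]:
    """Mimic struct-style querying: empty (zero) values are silently dropped."""
    return _build({col: val for col, val in conditions.items() if val})


def where_keeping_all(conditions: Dict[str, str]) -> Tuple[str, List[str]]:
    """Mimic map-style querying: every key becomes a condition."""
    return _build(conditions)


def count_matches(db: sqlite3.Connection, clause: str, params: List[str]) -> int:
    return db.execute(f"SELECT COUNT(*) FROM users WHERE {clause}", params).fetchone()[0]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, user_name TEXT, token TEXT)")
db.execute("INSERT INTO users VALUES (1, 'admin', 'admintoken')")

headers = {"user_name": "admin", "token": ""}  # token header is missing

print(count_matches(db, *where_skipping_zero_values(headers)))  # 1 row: auth wrongly succeeds
print(count_matches(db, *where_keeping_all(headers)))           # 0 rows: auth fails as intended
```

With both headers empty, the struct-style variant ends up with no conditions at all and matches the single row, which is the second incorrect result described in the question.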

Odoo API search criteria not fully recognised

I am calling the Odoo object/search API and am trying to return a list of IR records for a list of external IDs.
The model I am searching is ir.model.data, and I want to match the following:
model = res.country.state
complete_name = l10n_uk.state_uk_99
The search criteria I am using, in PHP, are:
$ir_criteria = array(
    array('model', '=', 'res.country.state'),
    array('complete_name', 'in', 'l10n_uk.state_uk_99'),
);
What I get back is all ir.model.data records that match the model, not limited to the given complete_name.
Why wouldn't that be working?
In the "External Identifiers" admin page, I do get the right result - for a single external ID at least - by searching for:
Model Name = res.country.state
Module = l10n_uk
External Identifier = state_uk_99
so that might be exactly what I need to search for through the API?
This is the search criteria I am using to search model ir.model.data, which is equivalent to the above:
array(
    array("model", "=", "res.country.state"),
    array("module", "=", "l10n_uk"),
    array("name", "in", array("state_uk_99")),
)
The "complete_name" is split into module and name. Where there is no module, it is set to "" in the search criteria.
If I need to search over several modules at once, which seems to be necessary when the data has been imported over time in different ways, then prefix (Polish) notation can be used, with each operator placed before its operands. So pulling out states 'l10n_uk.state_uk_99', 'l10n_uk.state_uk_98' and 'base.state_us_10' can be done with these search criteria:
array(
    array("model", "=", "res.country.state"),
    // The following element repeated for number of modules minus 1.
    '|',
    // The first module.
    '&',
    array("module", "=", "l10n_uk"),
    array("name", "in", array("state_uk_99", "state_uk_98")),
    // The second module.
    '&',
    array("module", "=", "base"),
    array("name", "in", array("state_us_10")),
    // Further modules, as needed.
)
That returns the database IDs of the external ID records, which are then used to fetch the res_id of each external ID, which in turn points to the state records of res.country.state.
It would be nice if each interfaced system had full control over its own set of external IDs on OpenERP, but it doesn't, so we are always stuck with the potential for a good old mix of modules and formats on the external IDs, all created by importing data in different ways.
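Building that structure by hand gets tedious with a mixed bag of IDs, so here is a sketch in Python of generating it from a list of "module.name" strings. The function name external_id_domain is made up; the output follows the same layout as the criteria above, with one '|' per module beyond the first, placed before the groups it joins.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def external_id_domain(model: str, external_ids: Iterable[str]) -> List:
    """Build ir.model.data search criteria for a list of 'module.name' external IDs."""
    names_by_module: Dict[str, List[str]] = defaultdict(list)
    for xml_id in external_ids:
        if "." in xml_id:
            module, name = xml_id.split(".", 1)
        else:
            module, name = "", xml_id  # no module part: use "" as described above
        names_by_module[module].append(name)

    domain: List = [("model", "=", model)]
    groups = list(names_by_module.items())
    # n module groups need n - 1 leading '|' operators.
    domain.extend(["|"] * (len(groups) - 1))
    for module, names in groups:
        domain.extend(["&", ("module", "=", module), ("name", "in", names)])
    return domain


print(external_id_domain(
    "res.country.state",
    ["l10n_uk.state_uk_99", "l10n_uk.state_uk_98", "base.state_us_10"],
))
```

Over XML-RPC the tuples are sent as arrays, so the result has the same shape as the PHP arrays above.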

Endeca UrlENEQuery java API search

I'm currently trying to create an Endeca query using the Java API for a UrlENEQuery. The current query is:
collection()/record[CONTACT_ID = "xxxxx" and SALES_OFFICE = "yyyy"]
I need it to be:
collection()/record[(CONTACT_ID = "xxxxx" or CONTACT_ID = "zzzzz") and
SALES_OFFICE = "yyyy"]
Currently this is being done with an ERecSearchList with CONTACT_ID and the string I'm trying to match in an ERecSearch object, but I'm having difficulty figuring out how to get the UrlENEQuery to generate the or in the correct fashion as I have above. Does anyone know how I can do this?
One of us is confused on multiple levels:
Let me try to explain why I am confused:
If Contact_ID and Sales_Office are different dimensions, where Contact_ID is a multi-or dimension, then you don't need to use EQL (the XPath-like language) to do anything. Just select the appropriate dimension values and your navigation state will reflect the query you are trying to build with XPath, i.e. the CONTACT_IDs "ORed together" and then "ANDed" with SALES_OFFICE.
If you do have to use EQL, then the only way to modify it (provided that you have to modify it from the returned results) is via string manipulation.
ERecSearchList gives you the ability to use "Search Within" functionality, which works completely differently from EQL filtering, though you can achieve similar results with tricks like searching only a specified field (which would be separate from the generic search interface). I am still not sure what the connection is between ERecSearchList and the EQL expression above.
Having expressed my confusion, I think what you need to do is use string manipulation to build the EQL expression dynamically and add it to the query.
A code example of what you are doing would be extremely helpful as well.
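As a rough illustration of the string-manipulation approach, here is a sketch in Python (not the Endeca Java API) that assembles the expression from the question. The helper names are made up, and since I am not sure of EQL's escaping rules, the sketch simply rejects values containing a double quote instead of trying to escape them.

```python
from typing import Iterable


def eql_equals(field: str, value: str) -> str:
    # Assumption: legitimate values never contain a double quote, so reject
    # them rather than guess at an escaping rule.
    if '"' in value:
        raise ValueError(f"unsupported character in value: {value!r}")
    return f'{field} = "{value}"'


def eql_any_of(field: str, values: Iterable[str]) -> str:
    """OR together one equality test per value, wrapped in parentheses."""
    parts = [eql_equals(field, v) for v in values]
    if not parts:
        raise ValueError("at least one value is required")
    return "(" + " or ".join(parts) + ")"


def build_record_filter(contact_ids: Iterable[str], sales_office: str) -> str:
    condition = (
        f"{eql_any_of('CONTACT_ID', contact_ids)} "
        f"and {eql_equals('SALES_OFFICE', sales_office)}"
    )
    return f"collection()/record[{condition}]"


print(build_record_filter(["xxxxx", "zzzzz"], "yyyy"))
```

The resulting string would then be set on the query in place of the current expression.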

SQL Injection: is this secure?

I have this site with the following parameters:
http://www.example.com.com/pagination.php?page=4&order=comment_time&sc=desc
I use the values of each of the parameters as a value in a SQL query.
I am trying to test my application and ultimately hack my own application for learning purposes.
I'm trying to inject this statement:
http://www.example.com.com/pagination.php?page=4&order=comment_time&sc=desc' or 1=1 --
But it fails, and I get this warning:
Warning: mysql_fetch_assoc() expects parameter 1 to be resource,
boolean given in /home/dir/public_html/pagination.php on line 132
Is my application completely free from SQL injection, or is it still possible?
EDIT: Is it possible for me to find a valid sql injection statement to input into one of the parameters of the URL?
An application that is secured against SQL injection never produces invalid queries.
So obviously you still have some issues.
A well-written application produces valid, expected output for any input.
That's completely vulnerable, and the fact that you can cause a syntax error proves it.
There is no function to escape column names or ORDER BY directions. Those functions do not exist because exposing the DB logic directly in the URL is bad style: it makes the URLs dependent on changes to your database logic.
I'd suggest something like an array mapping the "order" parameter values to column names:
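$order_cols = array(
    'time' => 'comment_time',
    'popular' => 'comment_score',
    // ... and so on ...
);
// Fall back to the default when the parameter is missing or not in the map.
if (!isset($_GET['order']) || !isset($order_cols[$_GET['order']])) {
    $_GET['order'] = 'time';
}
$order = $order_cols[$_GET['order']];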
Restrict "sc" manually:
if ($_GET['sc'] == 'asc' || $_GET['sc'] == 'desc') {
    $order .= ' ' . $_GET['sc'];
} else {
    $order .= ' desc';
}
Then you're guaranteed safe to append that to the query, and the URL is not tied to the DB implementation.
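The same allowlist idea, sketched in Python for anyone not on PHP. The two dictionaries mirror the mapping above, and order_by_clause is a made-up name. Anything that is not a known key falls back to the default, so nothing from the URL is ever copied into the SQL text.

```python
from typing import Mapping

# Public parameter values mapped to real column names (mirrors the PHP array above).
ORDER_COLUMNS = {"time": "comment_time", "popular": "comment_score"}
SORT_DIRECTIONS = {"asc": "ASC", "desc": "DESC"}


def order_by_clause(params: Mapping[str, str]) -> str:
    """Return an ORDER BY clause built only from allowlisted fragments."""
    column = ORDER_COLUMNS.get(params.get("order"), ORDER_COLUMNS["time"])
    direction = SORT_DIRECTIONS.get(params.get("sc"), "DESC")
    return f"ORDER BY {column} {direction}"


print(order_by_clause({"order": "popular", "sc": "asc"}))
print(order_by_clause({"order": "comment_time", "sc": "desc' or 1=1 --"}))
```

The second call shows the injection attempt from the question collapsing to the default ordering.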
I'm not 100% certain, but I'd say it still seems vulnerable to me -- the fact that it's accepting the single-quote (') as a delimiter and then generating an error off the subsequent injected code says to me that it's passing things it shouldn't on to MySQL.
Any data that could possibly be taken from somewhere other than your application itself should go through mysql_real_escape_string() first. This way the whole ' or 1=1 part gets passed as a value to MySQL... unless you're passing "sc" straight through for the sort order, such as
$sql = "SELECT * FROM foo WHERE page='{$_REQUEST['page']}' ORDER BY data {$_REQUEST['sc']}";
... which you also shouldn't be doing. Try something along these lines:
$page = mysql_real_escape_string($_REQUEST['page']);
if ($_REQUEST['sc'] == "desc")
    $sortorder = "DESC";
else
    $sortorder = "ASC";
$sql = "SELECT * FROM foo WHERE page='{$page}' ORDER BY data {$sortorder}";
I still couldn't say it's TOTALLY injection-proof, but it's definitely more robust.
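A related option, if your database layer supports it, is to bind the value as a parameter instead of escaping it. Below is a sketch in Python using the built-in sqlite3 module as a stand-in for MySQL; the foo table and its columns are taken from the example query above, the sample rows are invented, and fetch_page is a made-up name. The sort direction still cannot be bound, so it is picked from two fixed strings as before.

```python
import sqlite3
from typing import List

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE foo (page TEXT, data TEXT)")
db.executemany("INSERT INTO foo VALUES (?, ?)", [("4", "b"), ("4", "a"), ("5", "c")])


def fetch_page(conn: sqlite3.Connection, page: str, sc: str) -> List[str]:
    # Only two possible fragments can ever reach the SQL text.
    sort_order = "DESC" if sc == "desc" else "ASC"
    sql = f"SELECT data FROM foo WHERE page = ? ORDER BY data {sort_order}"
    # The page value is bound by the driver and never parsed as SQL.
    return [row[0] for row in conn.execute(sql, (page,))]


print(fetch_page(db, "4", "desc"))
print(fetch_page(db, "4' or 1=1 --", "desc"))
```

The injected string is compared as a plain value, so it simply matches no rows.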
I am assuming that your generated query does something like
select <some number of fields>
from <some table>
where sc=desc
order by comment_time
Now, if I were to attack the order by statement instead of the WHERE, I might be able to get some results... Imagine I added the following
comment_time; select top 5 * from sysobjects
the result returned to your front end would be the top 5 rows from sysobjects, rather than the result of the query you intended to run (depending a lot on the front end)...
It really depends on how PHP validates those arguments. If MySQL is giving you a warning, it means that an attacker has already gotten past your first line of defence, which is your PHP script.
Use if(!preg_match('/^regex_pattern$/', $your_input)) to filter all your inputs before passing them to MySQL.
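As an example of such a pattern check, here is a small sketch in Python for the page parameter. The rule itself (one to six ASCII digits) is only an assumption about what a valid page number looks like, and is_valid_page is a made-up name.

```python
import re

# Hypothetical rule: a page number is 1 to 6 ASCII digits.
PAGE_PATTERN = re.compile(r"[0-9]{1,6}")


def is_valid_page(raw: str) -> bool:
    # fullmatch anchors both ends, so trailing junk after the digits is rejected.
    return PAGE_PATTERN.fullmatch(raw) is not None


print(is_valid_page("4"))
print(is_valid_page("4' or 1=1 --"))
```

Inputs that fail the check should be rejected or replaced with a default before any query is built.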