Understanding LDAP OR filter

I'm trying to understand OR filters in LDAP queries (specifically for blind LDAP injection).
Am I right in saying that, in order to infer a value that objectClass can take (here U), the following filter sent to the LDAP server is correct?
(|(objectClass=void)(objectClass=U))(&(objectClass=void)(type=P*))
Supposing the web application returns an object, can I safely say that the LDAP directory includes an object class called U? Is my reasoning correct?
Thanks a lot

Is my reasoning correct?
No. 'Or' is 'or'. As you have written it, either (i) the object class will be void (whatever that means), or (ii) the object class will be U.
The remainder of the filter isn't valid. There is an inner OR filter and an inner AND filter, but there is no outer operator to state how they are joined. A filter parser might be justified in stopping at the first )) for example, as there is no valid continuation of the parse.
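To make the "no outer operator" point concrete, here is a minimal Python sketch that splits a filter string into its top-level parenthesised groups. The helper name top_level_groups is made up for illustration, and the sketch assumes literal parentheses inside values are escaped (as \28 and \29), so it only counts brackets. A well-formed filter should yield exactly one top-level group; the filter from the question yields two.

```python
def top_level_groups(ldap_filter: str) -> list[str]:
    """Split a filter string into its top-level parenthesised groups.

    Hypothetical helper: it only tracks bracket depth and does not
    validate attribute names, operators, or escapes.
    """
    groups, depth, start = [], 0, 0
    for i, ch in enumerate(ldap_filter):
        if ch == "(":
            if depth == 0:
                start = i  # a new top-level group begins here
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError(f"unbalanced ')' at position {i}")
            if depth == 0:
                groups.append(ldap_filter[start:i + 1])
    if depth != 0:
        raise ValueError("unbalanced '('")
    return groups


injected = "(|(objectClass=void)(objectClass=U))(&(objectClass=void)(type=P*))"
for group in top_level_groups(injected):
    print(group)
```

Two groups are printed, and nothing in the string says how they combine, which is why a parser may simply stop after the first one.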

If your goal is to understand the OR operator (per se) inside an LDAP query, I found the article "or-operator in LDAP queries" very helpful:
To summarize, "&" is the "And" operator, "!" is the "Not" operator,
"|" is the "Or" operator, and "*" is the wildcard. Conditions can be
nested in parentheses. The wildcard cannot be used in DN attributes.
Example: (|(givenName=foo*)(middlename=foo*)(mail=foo*)) implements a threefold OR and returns all entries in which the given name, the middle name, or the mail address starts with foo. For more details, see the post "LDAP multiple or syntax".

Lucene operator precedence for boolean operators

What is the order of operations for boolean operators? Left to right? Right to left? Specific operators have higher priority?
For example, if I search for:
jakarta OR apache AND website
What do I get? Is it
Anything with "jakarta" as well as anything with both "apache" and "website"?
Anything with "website" that also has either "jakarta" or "apache"?
Something else?
Short answer:
In Lucene, the AND operator takes precedence over the OR operator. So, you are effectively doing this:
jakarta OR (apache AND website)
You can verify this for yourself by parsing your query string and seeing how it converts AND and OR to the "required" and "optional" operators.
And the NOT operator takes precedence over the AND operator, since we are discussing precedence.
But you need to be very careful when dealing with Lucene's so-called "boolean" operators, as they do not behave the way you may expect based on their collective name ("boolean").
(Unfortunately I have never seen any official documentation which provides a citation for these precedence rules - but instead I am relying on empirical observations. See below for more about that. If the documentation for this does exist, that would be great to see.)
Longer Answer
One key thing to understand is that Lucene boolean operators are not really "boolean" in the sense that you may think, based on Boolean algebra, where you use parentheses to help avoid ambiguity (or where you need to know what rules a programming language may be applying) - and where everything evaluates to TRUE or FALSE.
Lucene boolean operators serve a subtly different purpose.
They are not purely concerned with TRUE/FALSE inclusion/exclusion, but also concerned with how to score results so that the more relevant results have higher scores than less relevant results.
The Lucene query jakarta OR apache AND website is equivalent to the following:
jakarta +apache +website
This means the document's field must contain apache and website, but may also include jakarta (for a higher relevance score).
You can see this for yourself by taking your original query string and parsing it:
Query query = parser.parse(queryString);
...and then printing the resulting string representation of the query. The + operator is the "required" operator. It:
requires that the term after the "+" symbol exist somewhere in the field
And the lack of a + operator means the default of "may" as in "may contain" - meaning the term is optional: it does not need to be present, if there is some other clause in the query which does match a document.
The use of AND forces the terms on either side of the AND to be required.
You can encounter some potentially surprising situations.
Consider this:
foo AND bar OR baz AND bat
This parses to the following:
+foo +bar +baz +bat
This is because the AND operators are transformed to + operators for every term, rendering the OR redundant.
It's the same result as if you had written this:
foo AND bar AND baz AND bat
But not the same as this:
(foo AND bar) OR (baz AND bat)
which is parsed to this, where the parentheses are retained:
(+foo +bar) (+baz +bat)
Bottom Line:
Use parentheses to explicitly make your intentions clear, when using AND and OR and also NOT.
Regarding NOT, since we mentioned it - that takes precedence over AND.
The query:
foo AND bar NOT baz AND bat
Is parsed as:
+foo +bar -baz +bat
So, a document field must contain foo, bar and bat - and must not contain baz.
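As a rough illustration of the rewriting described above, here is a small Python sketch that reproduces the three flat examples from this answer. It is not Lucene's parser and not its algorithm: to_occur_syntax is a made-up name, and the rule it applies (a term next to AND becomes required, a term after NOT becomes prohibited, everything else stays optional) is only a simplified model that assumes the default operator is OR and that the query has no parentheses, phrases, or field names.

```python
OPERATORS = {"AND", "OR", "NOT"}


def to_occur_syntax(query: str) -> str:
    """Simplified model of how flat AND/OR/NOT queries map to +/- clauses.

    Hypothetical helper for illustration only; it handles whitespace-separated
    terms and operators, nothing else.
    """
    tokens = query.split()
    clauses = []
    for i, token in enumerate(tokens):
        if token in OPERATORS:
            continue
        previous = tokens[i - 1] if i > 0 else None
        following = tokens[i + 1] if i + 1 < len(tokens) else None
        if previous == "NOT":
            clauses.append("-" + token)   # prohibited
        elif previous == "AND" or following == "AND":
            clauses.append("+" + token)   # required
        else:
            clauses.append(token)         # optional
    return " ".join(clauses)


for q in ("jakarta OR apache AND website",
          "foo AND bar OR baz AND bat",
          "foo AND bar NOT baz AND bat"):
    print(q, "=>", to_occur_syntax(q))
```

For real queries, printing the parsed Query object as shown earlier remains the reliable check.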
Why does this situation exist?
I don't know, but I think Lucene originally did not include AND, OR and NOT - but instead used + (must include), - (must not include) and "nothing" (may include). The so-called boolean operators AND, OR, NOT were added later on, as a kind of "syntactic sugar" for these original operators - introduced for people who were more familiar with AND, OR and NOT from other contexts. I'm basing this on the following thread:
Getting a Better Understanding of Lucene's Search Operators
A summary of that thread is included in this answer about the NOT operator.

URL-parameters input seems inconsistent

I have reviewed multiple instructions on URL parameters, which all suggest two approaches:
Parameters can follow forward slashes (/), or be specified by parameter name followed by parameter value. So either:
1) http://numbersapi.com/42
or
2) http://numbersapi.com/random?min=10&max=20
For the 2nd one, I provide parameter name and then parameter value by using the ?. I also provide multiple parameters using ampersand.
Now I have seen the request below, which works fine but does not fit the rules above:
http://numbersapi.com/42?json
I understand that the request sets 42 as a parameter, but why is the ? followed just by a value and not by a parameter name? Also, the ? seems to be used like an ampersand?
From Wikipedia:
Every HTTP URL conforms to the syntax of a generic URI. The URI generic syntax consists of a hierarchical sequence of five components:
URI = scheme:[//authority]path[?query][#fragment]
where the authority component divides into three subcomponents:
authority = [userinfo@]host[:port]
As you can see, the ? ends the path part of the URL and starts the query part.
The query part is usually a &-separated string of name=value pairs, but it doesn't have to be, so json is a valid value for the query part.
Or, as the Wikipedia article puts it:
An optional query component preceded by a question mark (?), containing a query string of non-hierarchical data. Its syntax is not well defined, but by convention is most often a sequence of attribute–value pairs separated by a delimiter.
It is also fairly common for request processors to treat a name=value pair that is missing the = sign as if it were name=.
E.g. if you're writing Servlet code and call servletRequest.getParameter("json"), it would return an empty string ("") for that last URL in the question.
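The same split can be seen with Python's standard urllib.parse module. This is only a sketch of how one parser treats the three URLs from the question; the helper name split_url is made up, and other frameworks may handle a bare name differently. Note that parse_qs drops a name without a value unless keep_blank_values=True is passed.

```python
from urllib.parse import urlsplit, parse_qs


def split_url(url: str):
    """Return (path, raw query string, parsed parameters) for a URL."""
    parts = urlsplit(url)
    # keep_blank_values=True keeps names that have no "=value" part.
    params = parse_qs(parts.query, keep_blank_values=True)
    return parts.path, parts.query, params


for url in ("http://numbersapi.com/42",
            "http://numbersapi.com/random?min=10&max=20",
            "http://numbersapi.com/42?json"):
    print(split_url(url))
```

In the last URL, 42 is part of the path and json is the whole query string, which this parser reads as a parameter named json with an empty value.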

How to factorize a string to check its belonging to language that is generated from alphabet?

Let S = {a, bb, bab, abaab} be an alphabet, and let S* be its Kleene closure, i.e. all possible concatenations of elements of S.
Does the string abaabbabbaab belong to S*?
What is the method to factorize a string in order to check whether it is in S* or not?
I have tried it in the following ways.
Possible factorization:
(abaab)(bab)(b)(a)(a)(b)
(abaab)(bab)(b)(aa)(b)
(abaab)(bab)(ba)(ab)
(abaab)(bab)(baa)(b)
(abaab)(bab)(b)(aab)
We can see that the prefix (abaab)(bab) matches, but the remaining part does not match any combination of elements of S. I have factorized the remaining part in many ways, but it still does not match.
I want to ask:
Is it correct?
Is this the correct way to factorize (tokenize) the string?
Are all of these factorizations correct?
Is this the correct method to check whether a string belongs to a language or not?
Some of your factorizations contain $(b)$, which is not in $S$. So they are not correct.
I think your method is exhaustive trial and error. If you do that correctly, it is a correct way to find a factorization. For checking membership of a language, it works if the language is given in the form of the Kleene closure of a finite language.
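The exhaustive search can be organised so that no prefix is examined twice. Below is a minimal Python sketch using dynamic programming over prefix lengths; this is a swapped-in technique, not the trial-and-error procedure from the question, and the function name in_kleene_closure is made up for illustration. A prefix is marked reachable if it is empty, or if it ends with an element of S whose removal leaves a reachable prefix.

```python
def in_kleene_closure(text: str, words: set[str]) -> bool:
    """Check whether text is a concatenation of zero or more elements of words."""
    # reachable[i] is True when text[:i] can be factorized over words.
    reachable = [False] * (len(text) + 1)
    reachable[0] = True  # the empty string is always in S*
    for end in range(1, len(text) + 1):
        for word in words:
            start = end - len(word)
            if start >= 0 and reachable[start] and text[start:end] == word:
                reachable[end] = True
                break
    return reachable[-1]


S = {"a", "bb", "bab", "abaab"}
print(in_kleene_closure("abaabbabbaab", S))  # the string from the question
print(in_kleene_closure("abaabbab", S))      # the prefix (abaab)(bab)
```

For the string in the question, the only reachable prefix lengths are 0, 1, 5 and 8, so the whole string is not in S*, while the prefix (abaab)(bab) is.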

Orient-db regex modifiers

I'm working with an OrientDB database, and I have issues with regex pattern matching. I really need a case-insensitive modifier in the query, but somehow it doesn't work as I expect.
Query:
select from UserAccounts where email MATCHES '^ther.*'
As expected, this returns the lowercase matches.
Whenever I try to add a modifier outside the delimiters, i.e.
select from UserAccounts where email MATCHES '\^ther.*\i'
I get an empty collection. Actually the query returns an empty collection whenever delimiters are present.
If there is no way to attach modifiers, I could probably replace each alphabetic character with an expression in square brackets, i.e.
select from UserAccounts where email MATCHES "^[tT][hH][eE][rR].*"
But I'm not really happy with this solution.
Using the Java case-insensitive regex modifier (from Pattern's special constructs) works in OrientDB 1.7.9 - for your example:
select from UserAccounts where email MATCHES '(?i)^ther.*'
(See also: Pattern - Special Constructs)
I've added a comment to the corresponding OrientDB issue as well.
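To see what the inline (?i) flag does without a database at hand, here is a small Python sketch. Python's re module is only a stand-in for the Java Pattern class mentioned above; it happens to accept the same (?i) prefix. The e-mail addresses are invented, and using fullmatch assumes that MATCHES compares the pattern against the whole value.

```python
import re

# Inline flag (?i) at the start of the pattern switches on case-insensitive matching.
pattern = re.compile(r"(?i)^ther.*")

# Hypothetical sample data, not taken from the question.
emails = ["theresa@example.com", "THERON@example.com", "other@example.com"]

matches = [email for email in emails if pattern.fullmatch(email)]
print(matches)
```

Both the lowercase and the uppercase address starting with "ther" are selected, while "other@example.com" is not.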
Unfortunately, there is no way to specify modifiers for a regex in the MATCHES operator.
For now, a good solution would be to create a custom function, where you can use the whole power of JS regexps.
But we definitely should add the ability to specify modifiers in MATCHES. Could you create a feature request?

Is there a way to use the LIKE operator from an entity framework query?

OK, I want to use the LIKE keyword from an Entity Framework query for a rather unorthodox reason - I want to match strings more precisely than when using the equals operator.
Because the equals operator automatically pads the string to be matched with spaces such that col = 'foo ' will actually return a row where col equals 'foo' OR 'foo ', I want to force trailing whitespaces to be taken into account, and the LIKE operator actually does that.
I know that you can coerce Entity Framework into using the LIKE operator using .StartsWith, .EndsWith, and .Contains in a query. However, as might be expected, this causes EF to prefix, suffix, and surround the queried text with wildcard % characters. Is there a way I can actually get Entity Framework to directly use the LIKE operator in SQL to match a string in a query of mine, without adding wildcard characters? Ideally it would look like this:
string usernameToMatch = "admin ";
if (context.Users.Where(usr => usr.Username.Like(usernameToMatch)).Any()) {
    // An account with username 'admin ' ACTUALLY exists
}
else {
    // An account with username 'admin' may exist, but 'admin ' doesn't
}
I can't find a way to do this directly; right now, the best I can think of is this hack:
context.Users.Where(usr =>
    usr.Username.StartsWith(usernameToMatch) &&
    usr.Username.EndsWith(usernameToMatch) &&
    usr.Username == usernameToMatch
)
Is there a better way? By the way I don't want to use PATINDEX because it looks like a SQL Server-specific thing, not portable between databases.
There isn't a way to get EF to use LIKE in its query. However, you could write a stored procedure that finds users using LIKE with an input parameter, and use EF to call your stored procedure.
Your particular situation however seems to be more of a data integrity issue though. You shouldn't be allowing users to register usernames that start or end with a space (username.Trim()) for pretty much this reason. Once you do that then this particular issue goes away entirely.
Also, allowing 'rough' matches on authentication details is beyond insecure. Don't do it.
Well there doesn't seem to be a way to get EF to use the LIKE operator without padding it at the beginning or end with wildcard characters, as I mentioned in my question, so I ended up using this combination which, while a bit ugly, has the same effect as a LIKE without any wildcards:
context.Users.Where(usr =>
    usr.Username.StartsWith(usernameToMatch) &&
    usr.Username.EndsWith(usernameToMatch) &&
    usr.Username == usernameToMatch
)
So, if the value is LIKE '[usernameToMatch]%' and it's LIKE '%[usernameToMatch]' and it = '[usernameToMatch]' then it matches exactly.