SQL from where like with or statement - sql

I have a SQL statement to search for an application by name.
def apps = Application.findAll("from Application as app where lower(app.name) like '%${params.query.toLowerCase()}%' ")
I want to be able to search not just by the application's name, but also by other properties such as type and language. How would I add OR conditions to allow this? Thanks!

It's not clear (to me) what framework you're using, or what specific RDBMS your database is on, but the overall thrust of SQL conditions is usually pretty simple.
As in most other computer languages, you combine conditions with and/or/not. In SQL, though, you use the words instead of (what are usually) symbols:
AND instead of &&
OR instead of ||
NOT instead of !
along with the standard comparison operators:
<, <=, =, <>, >, >=
Note that 'equals' is only a single = sign, and 'not equals' is usually opposed angle brackets <>.
So, given your current statement, if you wanted to only get those applications in the user's language, and exclude apps that are 'disabled', you could do something like the following:
def apps = Application.findAll("""from Application as app
    where lower(app.name) like :query
    and app.language = :language
    and app.status <> :inactiveStatus""",
    [query: "%${params.query.toLowerCase()}%".toString(),
     language: userLanguageParameter,
     inactiveStatus: inactiveAppStatus])
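To search by name, type, or language instead, the same kind of conditions can be joined with OR. The following is a minimal sketch in Python with an in-memory SQLite database rather than the asker's framework; the table name `application`, its columns, and the helper `search_applications` are assumptions made up for illustration.

```python
import sqlite3


def search_applications(conn, query):
    """Return names of applications whose name, type, or language contains query."""
    # Bind the pattern as a parameter instead of splicing user input into the SQL.
    pattern = f"%{query.lower()}%"
    sql = """
        SELECT name FROM application
        WHERE lower(name) LIKE :pattern
           OR lower(type) LIKE :pattern
           OR lower(language) LIKE :pattern
        ORDER BY name
    """
    return [row[0] for row in conn.execute(sql, {"pattern": pattern})]


# Hypothetical sample data, only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE application (name TEXT, type TEXT, language TEXT)")
conn.executemany(
    "INSERT INTO application VALUES (?, ?, ?)",
    [
        ("Notepad", "editor", "English"),
        ("Gimp", "graphics", "German"),
        ("Vim", "Editor", "English"),
    ],
)

print(search_applications(conn, "edit"))
```

If the OR group has to be combined with AND conditions such as the status check, wrapping the OR group in parentheses keeps the intent explicit.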

Related

Lucene operator precedence for boolean operators

What is the order of operations for boolean operators? Left to right? Right to left? Specific operators have higher priority?
For example, if I search for:
jakarta OR apache AND website
What do I get? Is it
Anything with "jakarta" as well as anything with both "apache" and "website"?
Anything with "website" that also has either "jakarta" or "apache"?
Something else?
Short answer:
In Lucene, the AND operator takes precedence over the OR operator. So, you are effectively doing this:
jakarta OR (apache AND website)
You can verify this for yourself by parsing your query string and seeing how it converts AND and OR to the "required" and "optional" operators.
And, since we are discussing precedence, the NOT operator takes precedence over the AND operator.
But you need to be very careful when dealing with Lucene's so-called "boolean" operators, as they do not behave the way you may expect based on their collective name ("boolean").
(Unfortunately I have never seen any official documentation which provides a citation for these precedence rules - but instead I am relying on empirical observations. See below for more about that. If the documentation for this does exist, that would be great to see.)
Longer Answer
One key thing to understand is that Lucene boolean operators are not really "boolean" in the sense that you may think, based on Boolean algebra, where you use parentheses to help avoid ambiguity (or where you need to know what rules a programming language may be applying) - and where everything evaluates to TRUE or FALSE.
Lucene boolean operators serve a subtly different purpose.
They are not purely concerned with TRUE/FALSE inclusion/exclusion, but also concerned with how to score results so that the more relevant results have higher scores than less relevant results.
The Lucene query jakarta OR apache AND website is equivalent to the following:
jakarta +apache +website
This means the document's field must contain apache and website, but may also include jakarta (for a higher relevance score).
You can see this for yourself by taking your original query string and parsing it:
Query query = parser.parse(queryString);
...and then printing the resulting string representation of the query. The + operator is the "required" operator. It:
requires that the term after the "+" symbol exist somewhere in the field
And the lack of a + operator means the default of "may" as in "may contain" - meaning the term is optional: it does not need to be present, if there is some other clause in the query which does match a document.
The use of AND forces the terms on either side of the AND to be required.
You can encounter some potentially surprising situations.
Consider this:
foo AND bar OR baz AND bat
This parses to the following:
+foo +bar +baz +bat
This is because the AND operators are transformed to + operators for every term, rendering the OR redundant.
It's the same result as if you had written this:
foo AND bar AND baz AND bat
But not the same as this:
(foo AND bar) OR (baz AND bat)
which is parsed to this, where the parentheses are retained:
(+foo +bar) (+baz +bat)
Bottom Line:
Use parentheses to explicitly make your intentions clear, when using AND and OR and also NOT.
Regarding NOT, since we mentioned it - that takes precedence over AND.
The query:
foo AND bar NOT baz AND bat
Is parsed as:
+foo +bar -baz +bat
So, a document field must contain foo, bar and bat - and must not contain baz.
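The parsed forms shown above can be reproduced with a small model of how the operators are rewritten. The Python sketch below is not Lucene code and only handles flat queries without parentheses; the function name `to_required_optional` and the rewrite rules (a term introduced by AND becomes required, and so does the term before it unless that term is prohibited) are assumptions inferred from the examples in this answer.

```python
def to_required_optional(query: str) -> str:
    """Rewrite a flat AND/OR/NOT query into +required / -prohibited / optional terms."""
    clauses = []      # each entry is [term, occur], occur being "+", "-" or ""
    conjunction = None
    negate = False

    for token in query.split():
        if token in ("AND", "OR"):
            conjunction = token
            continue
        if token == "NOT":
            negate = True
            continue

        # A term introduced by AND also makes the previous term required,
        # unless that previous term is prohibited.
        if conjunction == "AND" and clauses and clauses[-1][1] != "-":
            clauses[-1][1] = "+"

        if negate:
            occur = "-"
        elif conjunction == "AND":
            occur = "+"
        else:
            occur = ""   # optional: OR never forces anything in this model

        clauses.append([token, occur])
        conjunction = None
        negate = False

    return " ".join(occur + term for term, occur in clauses)


for q in (
    "jakarta OR apache AND website",
    "foo AND bar OR baz AND bat",
    "foo AND bar NOT baz AND bat",
):
    print(q, "=>", to_required_optional(q))
```

In this model OR never changes anything, which is why it ends up redundant as soon as its neighbours are touched by an AND.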
Why does this situation exist?
I don't know, but I think Lucene originally did not include AND, OR and NOT - but instead used + (must include), - (must not include) and "nothing" (may include). The so-called boolean operators AND, OR, NOT were added later on, as a kind of "syntactic sugar" for these original operators - introduced for people who were more familiar with AND, OR and NOT from other contexts. I'm basing this on the following thread:
Getting a Better Understanding of Lucene's Search Operators
A summary of that thread is included in this answer about the NOT operator.

MS Project equivalent to "xlAnd". Enumeration for logical operators

I'm trying to write a language independent filter for MS Project in VBA. I'm using the syntax:
FilterEdit (Name, Taskfilter, Create, Fieldname, Test, Value, Operation...)
I have managed to get the Fieldnames and Tests to be language independent, but I struggle with the Operation:= expression. For an English locale one would write: Operation:="and" but that doesn't work for other locales.
Is there a way to write the logical operator (and/or) as an enumeration? (not as a string?)
For Excel one could write xlAnd, and Project has a lot of enumerations starting with Pj, ie. PjTaskStart. I also know there's a Filter.LogicalOperationType, but I haven't managed to figure out if this could work for me or not. I have also experimented with FieldConstantToFieldName, but I reckon there's no fieldname for the logical operator?
I know I could use If LocaleID = xxxx Then..., but I'd like to not assume what locales will be in use.
Edit: I solved the first part of my problem!
By leaving it blank Operation:="", Project returns "And". But I haven't figured out yet how to return "Or"...
Operation:="" works for FilterEdit, but not for SetAutoFilter.
So I ended up using the dreaded If LocaleID.
Teaching moment:
I found out most operators can be language independent, except for:
And, Or, Contains and Does Not Contain.
These need to be translated for each locale. I'll get to those in a minute. First I'll list all the language independent operators:
< Less than
<= Less than or equal to
> Greater than
>= Greater than or equal to
= Equal to
<> Not equal to
My trick for finding the translations I need for the language dependent operators is the following MS Office Support page.
Notice the category named "Filter for specific text" in the English support page. Here we can read all the "words" we need. Now go to the bottom of the web page and change the language:
This opens up a new page listing all the different languages (not locale specific). Remembering where you found the word for Contains in English, then changing the language to for instance "Magyar (Magyarország)", we can now see that Contains = "Tartalmazza" in Magyar.
The next step is to google "Magyar language" and learn that this is actually Hungarian. So now you can go to this MSDN web page to see that Hungarian = LocaleID: 1038.
Putting all this together in VBA gives the following code:
Dim LocalContains As String
If LocaleID = 1038 Then
    LocalContains = "Tartalmazza" 'Hungarian
ElseIf LocaleID = 1044 Then
    LocalContains = "inneholder" 'Norwegian
Else
    LocalContains = "contains" 'English
End If
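As the number of supported locales grows, the If/ElseIf chain can be replaced by a lookup table with an English fallback. The sketch below shows the idea in Python rather than VBA; the names `CONTAINS_BY_LOCALE` and `local_contains` are hypothetical, and only the two translations found above are included.

```python
# Locale ID -> localized word for the "contains" test (values taken from the text above).
CONTAINS_BY_LOCALE = {
    1038: "Tartalmazza",  # Hungarian
    1044: "inneholder",   # Norwegian
}


def local_contains(locale_id: int) -> str:
    """Return the localized 'contains' keyword, falling back to English."""
    return CONTAINS_BY_LOCALE.get(locale_id, "contains")


print(local_contains(1038))
print(local_contains(1033))
```

The same shape could hold And, Or and Does Not Contain once their translations have been looked up.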

Is there a way to use the LIKE operator from an entity framework query?

OK, I want to use the LIKE keyword from an Entity Framework query for a rather unorthodox reason - I want to match strings more precisely than when using the equals operator.
Because the equals operator automatically pads the string to be matched with spaces such that col = 'foo ' will actually return a row where col equals 'foo' OR 'foo ', I want to force trailing whitespaces to be taken into account, and the LIKE operator actually does that.
I know that you can coerce Entity Framework into using the LIKE operator using .StartsWith, .EndsWith, and .Contains in a query. However, as might be expected, this causes EF to prefix, suffix, and surround the queried text with wildcard % characters. Is there a way I can actually get Entity Framework to directly use the LIKE operator in SQL to match a string in a query of mine, without adding wildcard characters? Ideally it would look like this:
string usernameToMatch = "admin ";
if (context.Users.Where(usr => usr.Username.Like(usernameToMatch)).Any()) {
// An account with username 'admin ' ACTUALLY exists
}
else {
// An account with username 'admin' may exist, but 'admin ' doesn't
}
I can't find a way to do this directly; right now, the best I can think of is this hack:
context.Users.Where(usr =>
usr.Username.StartsWith(usernameToMatch) &&
usr.Username.EndsWith(usernameToMatch) &&
usr.Username == usernameToMatch
)
Is there a better way? By the way I don't want to use PATINDEX because it looks like a SQL Server-specific thing, not portable between databases.
There isn't a way to get EF to use LIKE in its query. However, you could write a stored procedure that finds users using LIKE with an input parameter, and use EF to call your stored procedure.
Your particular situation, however, seems to be more of a data integrity issue. You shouldn't be allowing users to register usernames that start or end with a space (username.Trim()) for pretty much this reason. Once you do that, this particular issue goes away entirely.
Also, allowing 'rough' matches on authentication details is beyond insecure. Don't do it.
Well there doesn't seem to be a way to get EF to use the LIKE operator without padding it at the beginning or end with wildcard characters, as I mentioned in my question, so I ended up using this combination which, while a bit ugly, has the same effect as a LIKE without any wildcards:
context.Users.Where(usr =>
usr.Username.StartsWith(usernameToMatch) &&
usr.Username.EndsWith(usernameToMatch) &&
usr.Username == usernameToMatch
)
So, if the value is LIKE '[usernameToMatch]%' and it's LIKE '%[usernameToMatch]' and it = '[usernameToMatch]' then it matches exactly.
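The reasoning can be checked with a small model. The Python sketch below is not Entity Framework code: `padded_equals` is an assumed stand-in for an equals comparison that ignores trailing spaces (the behaviour described in the question), and `startswith`/`endswith` stand in for the two LIKE patterns. It also assumes the value being matched contains no LIKE wildcard characters.

```python
def padded_equals(column_value: str, wanted: str) -> bool:
    """Model of '=' that ignores trailing spaces on both sides."""
    return column_value.rstrip(" ") == wanted.rstrip(" ")


def exact_match(column_value: str, wanted: str) -> bool:
    """StartsWith AND EndsWith AND padded equality, as in the query above."""
    return (
        column_value.startswith(wanted)
        and column_value.endswith(wanted)
        and padded_equals(column_value, wanted)
    )


wanted = "admin "
for stored in ("admin", "admin ", "admin  "):
    print(repr(stored), padded_equals(stored, wanted), exact_match(stored, wanted))
```

All three stored values pass the padded comparison, but only the one with exactly one trailing space passes the combined check.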

What advantages are there to using either AND or &&?

Currently, I'm using && and || instead of AND and OR because that's how I was taught. In most languages, however, both are valid syntax. Are there any advantages to one or the other in any language?
I did try to search for this question, but it's a bit hard. It doesn't interpret my input correctly.
You ask “Are there any advantages to one or the other in any language?” This is, of course, dependent on the programming language.
Languages that implement both and and && (and correspondingly or and ||) will do it one of two ways:
Both behave exactly the same way. In which case, there is no advantage provided by the language in using one over the other.
Each behaves differently. In which case, the advantage is that you can get different behaviour by using one or the other.
That all sounds a bit facetious, but it's really as specific as one can get without talking about a specific language. Your question explicitly wants to know about all languages, but it's a question that needs to be answered per language.
Perl has all four of {&& || and or} but they differ in their precedence. "and" and "or" have really low precedence so you can do things like "complex-function-call-here or die $!" and you won't accidentally have "or" slurp up something on its left side that you didn't want it to.
It depends on the language, but in PHP I'd be careful about using "and" in place of &&, because "and" has lower precedence than the assignment operator. The ones I usually use are "&&" and "||".
http://us3.php.net/manual/en/language.operators.logical.php
$g = true && false; // $g will be assigned to (true && false) which is false
$h = true and false; // $h will be assigned to true
In some languages && will have a higher operator precedence than AND.
If both work fine, then I would say it's really personal preference; in most cases they are compiled into the same binary code, like this: 11100010001000101001001010 [not real code, just an example].
&& = two keystrokes of the same key.
AND = three keystrokes of different keys.
I'm not sure what language you are using, but some languages differentiate between a normal boolean operator and a short-circuit operator. For example, the following are normal boolean operators in MATLAB:
C = or(A,B);
C = A | B; % Exactly the same as above
However, this is a short-circuit operator:
C = A || B;
The short-circuit syntax will evaluate the first argument and then, depending on the value, will potentially skip over evaluating the second argument. For example, if A is already true, B doesn't have to be evaluated for an OR operation, since the result is guaranteed to be true. This is helpful when B is replaced with a logical operation that involves some kind of expensive computation.
Here's a wikipedia link discussing short-circuit operators and their syntax for a few languages.
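The same distinction can be observed in Python, where `or` short-circuits and `|` on booleans always evaluates both operands. This is a minimal sketch; `expensive_check` is a hypothetical function that records each call so the difference becomes visible.

```python
calls = []


def expensive_check() -> bool:
    """Stand-in for a costly computation; records that it was invoked."""
    calls.append("called")
    return True


a = True

# Short-circuit: a is already True, so expensive_check() is never called.
result_short = a or expensive_check()
calls_after_short = len(calls)

# Non-short-circuit: both operands are evaluated before | is applied.
result_eager = a | expensive_check()
calls_after_eager = len(calls)

print(result_short, calls_after_short)
print(result_eager, calls_after_eager)
```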
Unless there are precedence issues, I'd say the difference is in readability. Consider the following:
if (&x == &y && &y == &z) {
// ..
}
#define AND &&
if (&x == &y AND &y == &z) {
// ..
}

T-SQL: checking for email format

I have this scenario where I need data integrity in the physical database. For example, I have a variable @email_address VARCHAR(200) and I want to check whether its value is in email format. Does anyone have any idea how to check the format in T-SQL?
Many thanks!
I tested the following query with many different wrong and valid email addresses. It should do the job.
IF (
    CHARINDEX(' ',LTRIM(RTRIM(@email_address))) = 0
    AND LEFT(LTRIM(@email_address),1) <> '@'
    AND RIGHT(RTRIM(@email_address),1) <> '.'
    AND CHARINDEX('.',@email_address ,CHARINDEX('@',@email_address)) - CHARINDEX('@',@email_address ) > 1
    AND LEN(LTRIM(RTRIM(@email_address ))) - LEN(REPLACE(LTRIM(RTRIM(@email_address)),'@','')) = 1
    AND CHARINDEX('.',REVERSE(LTRIM(RTRIM(@email_address)))) >= 3
    AND (CHARINDEX('.@',@email_address ) = 0 AND CHARINDEX('..',@email_address ) = 0)
)
    print 'valid email address'
ELSE
    print 'not valid'
It checks these conditions:
No embedded spaces
'@' can't be the first character of an email address
'.' can't be the last character of an email address
There must be a '.' somewhere after '@'
Only one '@' sign is allowed
Domain name should end with at least 2 character extension
can't have patterns like '.@' and '..'
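For comparison, or for validating in application code before the value reaches the database, the conditions listed above can be sketched in Python. This is a loose port rather than an exact translation of the T-SQL: it assumes the at sign '@' separates the local part from the domain, trims the address once up front, and uses a hypothetical function name `looks_like_email`. Like the original, it is a rough format check, not full RFC validation.

```python
def looks_like_email(address: str) -> bool:
    """Apply the rough format checks listed above to a single address."""
    s = address.strip(" ")

    if " " in s:                              # no embedded spaces
        return False
    if s.startswith("@") or s.endswith("."):  # '@' not first, '.' not last
        return False
    if s.count("@") != 1:                     # only one '@' allowed
        return False

    at = s.index("@")
    dot = s.find(".", at)                     # first '.' after the '@', or -1
    if dot - at <= 1:                         # missing, or directly after '@'
        return False

    if len(s) - s.rfind(".") - 1 < 2:         # extension of at least 2 characters
        return False
    if ".@" in s or ".." in s:                # forbidden patterns
        return False
    return True


for candidate in ("user@example.com", "user@example", "user.@example.com"):
    print(candidate, looks_like_email(candidate))
```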
AFAIK there is no good way to do this.
The email format standard is so complex that parsers have been known to run to thousands of lines of code. Even if you were to use a simpler form, which would fail some obscure but valid addresses, you'd have to do it without regular expressions, which are not natively supported by T-SQL (again, I'm not 100% on that), leaving you with a simple fallback of something like:
LIKE '%_@_%_.__%'
..or similar.
My feeling is generally that you shouldn't be doing this at the last possible moment (as you insert into a DB); you should be doing it at the first opportunity and/or at a common gateway (the controller which actually makes the SQL insert request), where incidentally you would have the advantage of regex, and possibly even a library which does the "real" validation for you.
If you use SQL Server 2005 or 2008 you might want to look at writing CLR stored procedures and use the .NET regex engine like this. If you're using SQL Server 2000 or earlier you can use the VBScript scripting engine's regular expressions like this. You could also use an extended stored procedure like this.
There is no easy way to do it in T-SQL, I am afraid. To validate all the varieties of email address allowed by RFC 2822 you will need to use a regular expression.
More info here.
You will need to define your scope, if you want to simplify it.