SQL Difference Between "a%" and "a%_" - sql

I have been looking into good use cases for LIKE versus =, and the differences between them, when I ran into this question about LIKE.
Since LIKE "_r%" means those with 'r' as their second character, doesn't it hold to assume that "r%_" means those with 'r' as their first character, essentially making it functionally the same as "r%"?
I am asking because our lecture slides say otherwise, and I am not sure whether I am wrong. I have also run this SQL test program (https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_like_underscore) to see it firsthand, and this also proves my point.
Have a good day.

No, r%_ and r% do not mean the same thing. The first pattern, r%_, matches any string starting with r, followed by zero or more of any character, followed by any single character. It matches ra and ran, but it does not match r on its own. The pattern r%, on the other hand, does match r, since % allows for zero characters following the leading r.

The difference is that a% will match anything that starts with a, including a by itself (% matches zero, one, or multiple characters), while a%_ will match anything that starts with a, but must be followed by at least one other character (_ matches exactly one character).
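To see the difference concretely, here is a minimal sketch that evaluates both patterns against a few sample values. It assumes Python's built-in sqlite3 module as a stand-in engine (the question does not say which database is used); the helper name `like` and the sample values are only for illustration.

```python
import sqlite3

# In-memory SQLite database, used only to evaluate LIKE expressions.
conn = sqlite3.connect(":memory:")

def like(value, pattern):
    """Return True if `value LIKE pattern` holds in SQLite."""
    row = conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()
    return bool(row[0])

# Print each sample value with the result for a% and for a%_.
for value in ["a", "ab", "abc", "b"]:
    print(value, like(value, "a%"), like(value, "a%_"))
```

The only value on which the two patterns disagree is the single character a: a% accepts it, while a%_ needs at least one more character after the a.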

Related

Difference between _%_% and __% in sql server

I am learning the basics of SQL through W3Schools, and while studying wildcards I came across the following query:
--Finds any values that start with "a" and are at least 3 characters in length
WHERE CustomerName LIKE 'a_%_%'
As per the example, this query searches the table for rows where the CustomerName column starts with 'a' and is at least 3 characters long.
However, I also tried the following query:
WHERE CustomerName LIKE 'a__%'
The above query also gives me the exact same result.
I want to know whether there is any difference between the two queries. Does the second query produce a different output in some specific scenario? If yes, what would that scenario be?
Both start with A, and end with %. In the middle part, the first says "one char, then between zero and many chars, then one char", while the second one says "one char, then one char".
Considering that the part that comes after them (the final part) is %, which means "between zero and many chars", I can only see both clauses as identical, as they both essentially just want a string starting with A then at least two following characters. Perhaps if there were at least some limitations on what characters were allowed by the _, then maybe they could have been different.
If I had to choose, I'd go with the second one for being more intuitive. After all, many other masks (e.g. a%%%%%%_%%_%%%%%) will yield the same effect, but why the weird complexity?
For the LIKE operator, a single underscore "_" means any single character, so if you use one underscore, as in
ColumnName LIKE 'a_%'
you are basically saying you need a string whose first letter is 'a', followed by another single character, and then followed by anything or nothing.
ColumnName LIKE 'a__%' OR ColumnName LIKE 'a_%_%'
Both expressions mean: first letter 'a', followed by two characters, and then followed by anything or nothing. Or in simple English, any string of 3 or more characters starting with a.
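If you want to convince yourself that the two patterns agree, a brute-force check is easy to write. The sketch below assumes SQLite (via Python's sqlite3 module) rather than SQL Server, and only tries strings over the letters a and b up to length 5, so it is a sanity check and not a proof. The names `like`, `candidates`, `mismatches` and `matches` are made up for this example.

```python
import itertools
import sqlite3

conn = sqlite3.connect(":memory:")

def like(value, pattern):
    """Return True if `value LIKE pattern` holds in SQLite."""
    return bool(conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()[0])

# Every string over {a, b} of length 0 through 5.
candidates = [
    "".join(chars)
    for n in range(6)
    for chars in itertools.product("ab", repeat=n)
]

# Strings on which the two patterns give different answers.
mismatches = [s for s in candidates if like(s, "a_%_%") != like(s, "a__%")]

# Strings accepted by the shorter pattern.
matches = [s for s in candidates if like(s, "a__%")]

print("mismatches:", mismatches)
print("shortest match length:", min(len(s) for s in matches))
```

In this sample the two patterns never disagree, and every accepted string starts with a and has at least three characters.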

No space before ORDER BY - Why does this work?

I came across some SQL in an application which had no space before the "ORDER BY" clause. I was surprised that this even works.
Given a table of numbers called [counter] with a single column, counter_id, that holds an incrementing list of integers, this SQL works fine in Microsoft SQL Server 2012:
select
*
FROM [counter] c
where c.counter_id = 1000ORDER by counter_id
This also works with strings, e.g.:
WHERE some_string = 'test'ORDER BY something
My question is: are there any potential pitfalls or dangers with this query? And conversely, are there any benefits, other than saving, what, 8 bits of network traffic for that whitespace (which may well be a consideration in some applications)?
Let me explain the reason why this works with numbers and strings.
The reason is that numbers cannot start identifiers, unless the name is escaped. Basically, the first thing that happens to a SQL query is tokenization. That is, the components of the query are broken into identifiers and keywords, which are then analyzed.
In SQL Server, keywords and identifiers and function names (and so on) cannot start with a digit (unless the name is escaped, of course). So, when the tokenizer encounters a digit, it knows that it has a number. The number ends when a non-digit character is encountered. So, a sequence of characters such as 1000ORDER BY is easily turned into three tokens, 1000, ORDER, and BY.
Similarly, the first time that a single quote is encountered, it always represents a string literal. The string literal ends when the final single quote is encountered. The next set of characters represents another token.
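The idea can be illustrated with a toy tokenizer. This is only a sketch of the general technique described above, not SQL Server's actual lexer: the token categories, the `TOKEN_RE` pattern and the `tokenize` function are all assumptions made for the example.

```python
import re

# Toy token rules, tried in order at each position.
TOKEN_RE = re.compile(
    r"""
      (?P<number>\d+)                     # a run of digits
    | (?P<string>'(?:[^']|'')*')          # a single-quoted string literal
    | (?P<word>[A-Za-z_][A-Za-z0-9_]*)    # keyword or identifier
    | (?P<symbol>[^\s\w'])                # any other single symbol
    | (?P<space>\s+)                      # whitespace, skipped below
    """,
    re.VERBOSE,
)

def tokenize(sql):
    """Split `sql` into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at position {pos}")
        if match.lastgroup != "space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("where c.counter_id = 1000ORDER by counter_id"))
print(tokenize("WHERE some_string = 'test'ORDER BY something"))
```

Because a number token stops at the first non-digit and a string token stops at its closing quote, ORDER comes out as its own token in both cases even without the space.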
Let me add that there is exactly zero reason to ever use these nuances. First, these rules are properties of SQL Server's tokenization and do not necessarily apply to other databases. Second, the purpose of SQL is for humans to be able to express queries. It is way, way more important that we read them.
As jarlh mentioned, there might be a difference while the tokens are scanned and parsed, but the execution plan is still built correctly, so there is unlikely to be any significant advantage or disadvantage.
When the parser examines the characters, it checks for keywords, identifiers and string constants, and matches them against the overall semantic and syntactic structure of the language. Since ORDER BY is a keyword and the SQL parser knows where it can appear in a query, it interprets it accordingly without throwing an error. This is why your ORDER BY does not throw any error.
Parsing sql query
Parsing SQL

Find each of the following languages? (grammar)

For each of the following languages over Τ={a, b, c}, I want to construct the corresponding regular expression and regular grammar:
All strings containing exactly three a’s.
All strings containing at most three b’s.
How can I do this?
You may always use unions, concatenations and Kleene stars in addition to the given symbols (unless the task explicitly forbids it). So if you don't know how those work, read up on those first. Afterwards, here's a hint to the first task: take any string that contains three or more b's, say, acbaacbbaacbacb. Each character is either one of the first three b's or not: xxbxxxbbxxxxxxx. So the structure of such a string is a sequence of any characters (or maybe none if it starts with a b), and then a b, then more other characters (maybe), then another b, more characters (maybe), the third b, and finally more characters (maybe). How do you express "any character", and how do you express the alternating sequence of b's and "any character, zero or more times"?
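One way to check a candidate expression for the hint's example ("three or more b's") is to compare it against a direct count on all short strings. The sketch below uses Python's re module; the expression `AT_LEAST_THREE_BS` is one possible reading of the structure described in the hint, and the helper `all_strings` is made up for the example. It does not solve the two tasks in the question itself.

```python
import itertools
import re

# "Any character, zero or more times" over the alphabet {a, b, c}.
ANY = "[abc]*"

# Alternating sequence from the hint: any, b, any, b, any, b, any.
AT_LEAST_THREE_BS = re.compile(f"{ANY}b{ANY}b{ANY}b{ANY}")

def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` with length 0..max_len."""
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            yield "".join(chars)

# Strings where the expression and the direct count disagree.
disagreements = [
    s
    for s in all_strings("abc", 6)
    if (AT_LEAST_THREE_BS.fullmatch(s) is not None) != (s.count("b") >= 3)
]

print("disagreements:", disagreements)
print(AT_LEAST_THREE_BS.fullmatch("acbaacbbaacbacb") is not None)
```

The same kind of check can be reused for your own answers to the two tasks by swapping in a different expression and a different counting condition.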

Turing machine that accepts strings with an equal beginning and end length

I need help creating a single tape deterministic Turing machine for this language
Here I am not sure how to determine which strings the TM will accept. How can I make the machine accept strings where a=c, given that the b part has elements from both a and c?
Maybe you can try to adapt a machine which accepts palindromes: you read a character at the left end. If it belongs to {0,1}, you delete it and go to the right (the last character). If that character belongs to {2,3}, you delete it and go back to the left (the first character). Repeat this until you find a character which does not belong to the "a" or "c" side (and check the last character if you were on the left); the remaining characters should belong to the "b" block.

Regular expression filter

I have this regular expression in my sql query
DECLARE @RETURN_VALUE VARCHAR(MAX)
IF @value LIKE '%[0-9]%[^A-Z]%[0-9]%'
BEGIN
SET @RETURN_VALUE = NULL
END
I am not sure why, but whenever my row contains 12 TEST it gives me the value of 12, whereas if the row has a three-digit number, it filters the three-digit number out. How can I modify the regular expression to return the three-digit numbers too?
Any help will be appreciated.
SQL doesn't have regular expressions: it has SQL wildcard expressions. They are much simpler than regular expressions. For instance, there is no way to specify alternation (a|b) or repetition ( a*, a+, a?, a{m,n} ) such as you might find in a regular expression.
The 'like expression' that you have
LIKE '%[0-9]%[^A-Z]%[0-9]%'
will match any string containing the following pattern anywhere in the string
zero or more of any character, followed by...
a single decimal digit, followed by...
zero or more of any character, followed by...
a single character other than A–Z (whether it's case sensitive or not depends on the collating sequence in use), followed by...
zero or more of any character, followed by...
a single decimal digit, followed by...
zero or more of any character
One should note that the % is likely to match perhaps more than you might like.
Have you tried ([0-9]*)? I believe that this will capture every digit for you. However, I am not that strong at regex. When I ran this through rubular, it worked, though :) BTW, rubular is a great way to test out regular expressions.
You can easily create a SQL CLR function and use this in your queries. Visual Studio has a project template for this and makes deploying the functions a snap.
Here is more information from Microsoft about how to create the function and how to use it (for boolean matches and for data extraction).
First of all, note that this is not really a "regular expression", it's a SQL-specific form of wildcard matching. You are very limited in what you can accomplish with SQL wildcards. As one example, you cannot "optionally" match a specific character or character set.
Your expression, as you've written it, will match any value that contains two digits with at least one non-letter character in between them, meaning it will match:
111
1^1
1?7
1AAAAAAAAAAA?AAAAAAAAA1
-----------------------5-----------------3-------
And infinitely more items of a similar structure.
Oddly, one string that would not match this pattern is "12 TEST" because there is no character between the 1 and 2. The pattern also won't "give you" the value of 12 back because it's not a parsing expression, just a matching expression: it returns 1 (true) or 0 (false).
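To check these cases without a SQL Server instance at hand, the wildcard pattern can be translated into a Python regular expression. This is a rough sketch that only covers %, _, [set] and [^set]; the function names are invented for the example, and it assumes a case-insensitive collation, which may not be what your database uses.

```python
import re

def tsql_like_to_regex(pattern):
    """Translate a simple T-SQL LIKE pattern into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        end = pattern.find("]", i + 1) if ch == "[" else -1
        if ch == "%":
            parts.append(".*")   # zero or more of any character
        elif ch == "_":
            parts.append(".")    # exactly one character
        elif ch == "[" and end > i + 1:
            body = pattern[i + 1:end]
            negate = body.startswith("^")
            if negate:
                body = body[1:]
            body = body.replace("\\", "\\\\")
            parts.append("[" + ("^" if negate else "") + body + "]")
            i = end              # skip past the closing bracket
        else:
            parts.append(re.escape(ch))
        i += 1
    # Assumption: case-insensitive collation.
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)

def like(value, pattern):
    """Return True if the whole value matches the LIKE pattern."""
    return tsql_like_to_regex(pattern).fullmatch(value) is not None

PATTERN = "%[0-9]%[^A-Z]%[0-9]%"
for value in ["111", "1^1", "1?7", "12 TEST", "123 TEST"]:
    print(repr(value), like(value, PATTERN))
```

With this translation, "12 TEST" does not match while "123 TEST" does, which is consistent with three-digit values being the ones that hit the IF branch and get set to NULL.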
There is clearly something else going on in your application, possibly even an actual regular expression, but it has nothing to do with the SQL you've included here.