PostgreSQL: when to use parentheses for keywords, ON vs USING()

It seems arbitrary (to me) which keywords must be followed by parentheses. If tables are JOINed on identically named columns, you use USING(ID). The alternative is ON table1.column = table2.column.
Why and when are parentheses required?

USING() can be applied to a list of columns, as in USING(id,seq). Without the parentheses, a comma could introduce the next item of the FROM clause, so the parser can only safely treat the comma between id and seq as a list separator when the list is enclosed in parentheses.
ON, on the other hand, is always followed by a single Boolean expression, which is straightforward to parse: in the simplest case, expression - comparison operator - expression.
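To make the difference concrete, here is a minimal sketch that runs both join forms and compares the results. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere (SQLite accepts the same USING (...) and ON syntax for this case); the tables t1 and t2 and their contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, seq INTEGER, a TEXT);
    CREATE TABLE t2 (id INTEGER, seq INTEGER, b TEXT);
    INSERT INTO t1 VALUES (1, 1, 'a11'), (1, 2, 'a12'), (2, 1, 'a21');
    INSERT INTO t2 VALUES (1, 1, 'b11'), (1, 2, 'b12'), (3, 1, 'b31');
""")

# USING takes a column list; the parentheses mark where the list ends.
using_rows = conn.execute(
    "SELECT t1.a, t2.b FROM t1 JOIN t2 USING (id, seq) ORDER BY t1.a"
).fetchall()

# ON takes one Boolean expression, so no list delimiters are needed.
on_rows = conn.execute(
    "SELECT t1.a, t2.b FROM t1 JOIN t2 "
    "ON t1.id = t2.id AND t1.seq = t2.seq ORDER BY t1.a"
).fetchall()

print(using_rows)
print(on_rows)
```

Both queries should return the same two rows, since USING (id, seq) is shorthand for equality on each listed column.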

Django ORM underscore wildcard

I have been searching for a way of using Django ORM to use the SQL underscore wildcard, and do something equivalent to this:
SELECT * FROM table
WHERE field LIKE 'abc_wxyz'
Currently, I am doing:
field_like = 'abc_wxyz'
result = MyClass.objects.extra(where=["field LIKE %s"], params=[field_like])
I already tried contains and icontains, but they are not what I need: they wrap the value in % wildcards and escape the underscore, so the query becomes:
SELECT * FROM table
WHERE field LIKE '%abc\_wxyz%'
Thanks!
You can use the __regex lookup to build more complex lookup expressions than __contains, __startswith or __endswith allow (prefix any of these with "i" to make the lookup case-insensitive, as in icontains). In your case, I think
MyClass.objects.filter(field__regex=r'^abc.wxyz$')
would do what you are trying to do.
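As a quick sanity check of that pattern outside Django, here is a small sketch using Python's re module. The sample values are made up, and the database's own regex engine may differ in details from Python's, so treat this only as an illustration of how '.' stands in for LIKE's '_'.

```python
import re

# Hypothetical sample values for illustration.
values = ["abc_wxyz", "abcXwxyz", "abcwxyz", "abcXXwxyz", "zabc_wxyz"]

# '.' plays the role of LIKE's '_' (exactly one arbitrary character);
# '^' and '$' anchor the match to the whole value, as LIKE does.
pattern = re.compile(r"^abc.wxyz$")

matches = [v for v in values if pattern.search(v)]
print(matches)  # ['abc_wxyz', 'abcXwxyz']
```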
You can use the field__contains attribute.
for example:
MyClass.objects.filter(field__contains='abc_wxyz')
This is equivalent to:
SELECT * FROM MyClass WHERE field LIKE 'abc_wxyz'
Lord Elron's answer is incorrect. Django escapes all developer supplied wildcard characters to the LIKE-type lookups. The statement is equivalent to
SELECT * FROM MyClass WHERE field LIKE '%abc\_wxyz%'
(as the OP discovered) and the underscore has no effect.
See Escaping percent signs and underscores in LIKE statements
The field lookups that equate to LIKE SQL statements (iexact, contains, icontains, startswith, istartswith, endswith and iendswith) will automatically escape the two special characters used in LIKE statements – the percent sign and the underscore.
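The effect of that escaping can be seen without Django. The sketch below uses sqlite3 as a stand-in database; escape_like is a hypothetical helper written for this example, not Django's actual code, and the backslash escape character is an assumption made explicit with an ESCAPE clause.

```python
import sqlite3

def escape_like(text: str, escape: str = "\\") -> str:
    """Hypothetical helper: make LIKE wildcards in text match literally."""
    return (
        text.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [("abc_wxyz",), ("abcXwxyz",)])

def like(pattern: str) -> list:
    """Run field LIKE pattern with backslash as the escape character."""
    rows = conn.execute(
        "SELECT field FROM t WHERE field LIKE ? ESCAPE '\\' ORDER BY field",
        (pattern,),
    )
    return [r[0] for r in rows]

raw = like("abc_wxyz")                               # '_' is a wildcard
escaped = like("%" + escape_like("abc_wxyz") + "%")  # '_' is literal

print(raw)      # ['abcXwxyz', 'abc_wxyz']
print(escaped)  # ['abc_wxyz']
```

The second call mimics what a contains-style lookup does: the supplied underscore is escaped first and only then wrapped in % wildcards, so it can no longer act as a wildcard.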

How to select values around .(dot) using sql

I am running the query below in Teradata:
sel requesttext from dbc.tables
where tablename='old_employee_table'
Result:
alter table DB_NAME.employee_table,no fallback ;
I want to get the result below using SQL:
DB_NAME.employee_table
Requesttext can be:
create set table DB_NAME.employee_table;
The database name and table name can occur anywhere in the result. Since a . (dot) joins them, I want to split on the dot.
Basically, I need SQL that returns the values on either side of the dot.
I want both the database name and the table name in the result.
I'm not a Teradata person, but this should work for both strings given so far, as long as teradata's regexp_substr() supports positive look-behind and positive look-ahead assertions (I might have the Teradata syntax wrong, so a little tweaking may be needed):
SELECT REGEXP_SUBSTR(requesttext, '(?<= )(\w+\.\w+)(?=[,$]?)', 1, 1)
FROM dbc.tables
WHERE tablename='old_employee_table'
See the regex101 example. Hopefully it translates to Teradata easily.
The regex looks for and returns the words either side of and including the period, when preceded by a space, and followed by an optional comma or the end of the line.
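Since I can't test on Teradata, here is the same pattern exercised with Python's re module against the two request texts from the question. This only shows what the pattern does in Python's regex flavor; whether Teradata's REGEXP_SUBSTR behaves identically is an assumption.

```python
import re

# The two request texts quoted in the question.
samples = [
    "alter table DB_NAME.employee_table,no fallback ;",
    "create set table DB_NAME.employee_table;",
]

# Same pattern as in the query: a preceding space (look-behind),
# then word.word, then an optional look-ahead.
pattern = re.compile(r"(?<= )(\w+\.\w+)(?=[,$]?)")

results = [pattern.search(s).group(1) for s in samples]
print(results)  # ['DB_NAME.employee_table', 'DB_NAME.employee_table']
```

Note that the trailing look-ahead is optional, so in practice it does not restrict the match; the look-behind and the word.word part do the work.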
You could do this with either regexp_substr() or strtok().
As Jamie Zawinski said:
Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems.
So I would go with the strtok() method. Also I'm lazy and regular expressions are hard.
Function strtok() takes three arguments:
The string being split
The delimiter to split the string
The number of the token to grab.
To get at the <database>.<table> from that string that is returned in your query, we can split by a space, grab the third token, then split that by a comma and grab the first token.
That would look like:
SELECT strtok(strtok(requestText,' ',3),',',1)
FROM dbc.tables
WHERE tablename='old_employee_table'
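The nested split can be traced step by step in Python. The strtok function below is a hypothetical stand-in written for this sketch (1-based token index, empty tokens skipped); it is assumed to approximate Teradata's STRTOK, not to reproduce it exactly.

```python
import re
from typing import Optional

def strtok(text: str, delimiters: str, n: int) -> Optional[str]:
    """Hypothetical stand-in for STRTOK: n-th (1-based) non-empty token."""
    parts = re.split("[" + re.escape(delimiters) + "]", text)
    tokens = [p for p in parts if p]
    return tokens[n - 1] if 1 <= n <= len(tokens) else None

alter = "alter table DB_NAME.employee_table,no fallback ;"
create = "create set table DB_NAME.employee_table;"

# Split on spaces, take token 3, then split on commas and take token 1.
alter_name = strtok(strtok(alter, " ", 3), ",", 1)
print(alter_name)  # DB_NAME.employee_table

# In the second sample the third space-separated token is just 'table'.
create_third = strtok(create, " ", 3)
print(create_third)  # table

# Here the name is token 4 and carries a trailing ';'.
create_name = strtok(strtok(create, " ", 4), ",;", 1)
print(create_name)  # DB_NAME.employee_table
```

So the fixed token position works for the alter table text but would need adjusting for statements with a different number of leading words.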

How to determine whether a varchar field DOES NOT contain characters in set

I need to determine whether any rows in a varchar column in a db contain characters outside of the particular set below:
abcdefghijklmonpqrstuvwxyzABCDEFGHIJKLMONPQRSTUVWXYZ.-#,1234567890/\&%();:+#_*?|=''
I tried this but am not sure if it is correct:
select AccName
from Transactions
where AccName not like '%[!abcdefghijklmonpqrstuvwxyzABCDEFGHIJKLMONPQRSTUVWXYZ.-#,1234567890/\&%();:+#_*?|='']%'
Should this work?
Any help appreciated.
You cannot use a regular expression inside an ordinary LIKE condition in a query. If you want to use regular expressions, you will have to use a special operator. In MySQL, you could try the following:
SELECT AccName
FROM Transactions
WHERE AccName REGEXP '[^abcdefghijklmonpqrstuvwxyzABCDEFGHIJKLMONPQRSTUVWXYZ.-#,1234567890/\&%();:+#_*?|='']';
If this doesn't run to boot, then you may have to tidy up the regular expression you gave. And as marc_s asked, the exact regular expression and query will depend on the DB system you are using.
Database management systems vary in their support for matching regular expressions. Examples below use PostgreSQL, which supports POSIX regular expressions, along with other flavors. Examples below also test for case-sensitive matches to avoid sentences like "'Mike' doesn't not match the regular expression".
AFAIK, no DBMS lets you mix the like operator with a regular expression.
A like expression in the form column_name like '%a%' will match 'a' if it appears anywhere in the column. But you need your regular expression to match on the whole value of the column. Anchor the regular expression at the start and end of each value (^ and $), and tell the dbms to match one or more instances (+) of the atom.
select 'Mike' ~ '^[a-zA-Z0-9]+$'; -- 'Mike' matches the regex
Write a failing test.
select 'Mike?' ~ '^[a-zA-Z0-9]+$'; -- 'Mike?' doesn't match the regex
Add the question mark to the regex, and verify the test succeeds.
select 'Mike?' ~ '^[a-zA-Z0-9?]+$'; -- 'Mike?' matches the regex
Repeat failing test and succeeding test for each character. When you've caught all the characters you want, invert the logic using the !~ operator in place of the ~ operator.
When your data is clean move this into a CHECK constraint.
PostgreSQL pattern matching
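The same test-first approach can be sketched in Python's re module, which is handy for checking the character class before putting it into a query. The ALLOWED string is the set from the question (with the doubled quote read as a single apostrophe, which is an assumption), the sample names are made up, and Python's regex flavor is not identical to PostgreSQL's, so this is only an illustration.

```python
import re

# Character set from the question; duplicates in it are harmless.
ALLOWED = (
    "abcdefghijklmonpqrstuvwxyzABCDEFGHIJKLMONPQRSTUVWXYZ"
    ".-#,1234567890/\\&%();:+#_*?|='"
)

# re.escape keeps characters such as '-', '\' and '.' literal inside [...].
only_allowed = re.compile("[" + re.escape(ALLOWED) + "]+")

def is_clean(value: str) -> bool:
    """True when value is non-empty and every character is in ALLOWED."""
    return only_allowed.fullmatch(value) is not None

# Hypothetical sample values.
names = ["Mike", "Mike?", "O'Neil", "Mike<>", "Mike Smith"]

# Inverted logic, like switching from ~ to !~ in the query.
offending = [n for n in names if not is_clean(n)]
print(offending)  # ['Mike<>', 'Mike Smith']
```

Here fullmatch plays the role of the ^ and $ anchors, and the space in 'Mike Smith' is reported because the question's set does not include a space.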

Regular expression filter

I have this regular expression in my sql query
DECLARE @RETURN_VALUE VARCHAR(MAX)
IF @value LIKE '%[0-9]%[^A-Z]%[0-9]%'
BEGIN
SET @RETURN_VALUE = NULL
END
I am not sure why, but whenever a row contains 12 TEST it gives me the value 12, whereas values with a three-digit number get filtered out. How can I modify the regular expression so that it returns the three-digit numbers too?
Any help will be appreciated.
SQL doesn't have regular expressions: it has SQL wildcard expressions, which are much simpler than regular expressions. For instance, there is no way to specify alternation (a|b) or repetition ( a*, a+, a?, a{m,n} ) such as you might find in a regular expression.
The 'like expression' that you have
LIKE '%[0-9]%[^A-Z]%[0-9]%'
will match any string containing the following pattern anywhere in the string
zero or more of any character, followed by...
a single decimal digit, followed by...
zero or more of any character, followed by...
a single character other than A–Z (whether it's case sensitive or not depends on the collating sequence in use), followed by...
zero or more of any character, followed by...
a single decimal digit, followed by...
zero or more of any character
One should note that the % is likely to match perhaps more than you might like.
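The breakdown above maps almost one-to-one onto a regular expression. The sketch below is a hypothetical like_to_regex helper written for illustration: it handles only %, _ and simple bracket sets, copies bracket sets through unchanged (which assumes they contain nothing regex-special beyond ranges and a leading ^), and ignores collation, so Python's [^A-Z] is case-sensitive here.

```python
import re

def like_to_regex(pattern: str) -> str:
    """Hypothetical helper: translate a simple LIKE pattern to a regex."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%":
            out.append(".*")                    # zero or more of any character
        elif ch == "_":
            out.append(".")                     # exactly one character
        elif ch == "[":
            end = pattern.find("]", i + 1)
            if end == -1:
                out.append(re.escape(ch))       # unterminated: literal '['
            else:
                out.append(pattern[i:end + 1])  # sets such as [0-9] carry over
                i = end
        else:
            out.append(re.escape(ch))           # everything else is literal
        i += 1
    return "".join(out)

def like(value: str, pattern: str) -> bool:
    """True when the whole value matches the translated pattern."""
    return re.fullmatch(like_to_regex(pattern), value, re.DOTALL) is not None

regex = like_to_regex("%[0-9]%[^A-Z]%[0-9]%")
print(regex)  # .*[0-9].*[^A-Z].*[0-9].*
print(like("x1-2y", "%[0-9]%[^A-Z]%[0-9]%"))      # True
print(like("no digits", "%[0-9]%[^A-Z]%[0-9]%"))  # False
```

Each % becomes .*, which is why the pattern can match far more than intended: anything at all may sit before, between and after the three required characters.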
Have you tried ([0-9]*)? I believe this will capture every digit for you, although I am not that strong at regex. It worked when I ran it through Rubular, though :) BTW, Rubular is a great way to test out regular expressions.
You can easily create a SQL CLR function and use this in your queries. Visual Studio has a project template for this and makes deploying the functions a snap.
Here is more information from Microsoft about how to create the function and how to use it (for boolean matches and for data extraction).
First of all, note that this is not really a "regular expression", it's a SQL-specific form of wildcard matching. You are very limited in what you can accomplish with SQL wildcards. As one example, you cannot "optionally" match a specific character or character set.
Your expression, as you've written it, will match any value that contains two digits with at least one non-letter character in between them, meaning it will match:
111
1^1
1?7
1AAAAAAAAAAA?AAAAAAAAA1
-----------------------5-----------------3-------
And infinitely more items of a similar structure.
Oddly, one string that would not match this pattern is "12 TEST" because there is no character between the 1 and 2. The pattern also won't "give you" the value of 12 back because it's not a parsing expression, just a matching expression: it returns 1 (true) or 0 (false).
There is clearly something else going on in your application, possibly even an actual regular expression, but it has nothing to do with the SQL you've included here.
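To check those examples mechanically, here is a small sketch that rewrites the LIKE pattern as a Python regular expression. It assumes a case-insensitive collation, so [^A-Z] is written as [^A-Za-z]; the '123 TEST' value is an extra sample added for comparison.

```python
import re

# Regex reading of LIKE '%[0-9]%[^A-Z]%[0-9]%': a digit, later a
# non-letter, later another digit, with anything in between.
pattern = re.compile(r"[0-9].*[^A-Za-z].*[0-9]", re.DOTALL)

samples = [
    "111",
    "1^1",
    "1?7",
    "1AAAAAAAAAAA?AAAAAAAAA1",
    "12 TEST",
    "123 TEST",
]

results = {s: bool(pattern.search(s)) for s in samples}
for value, matched in results.items():
    print(repr(value), matched)
```

Under this reading '12 TEST' does not match but '123 TEST' does, because the middle digit itself counts as the non-letter character. That would be consistent with three-digit values being set to NULL by the IF block, though the surrounding application code would have to confirm it.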

Return sql rows where field contains ONLY non-alphanumeric characters

I need to find out how many rows in my SQL Server table have a particular field that contains ONLY non-alphanumeric characters.
I'm thinking it's a regular expression that I need along the lines of [^a-zA-Z0-9] but I'm not sure of the exact syntax I need to return the rows if there are no valid alphanumeric chars in there.
SQL Server doesn't have regular expressions. It uses the LIKE pattern matching syntax which isn't the same.
As it happens, you are close. You just need leading and trailing wildcards, and to move the NOT:
WHERE whatever NOT LIKE '%[a-z0-9]%'
If you have short strings you should be able to create a few LIKE patterns ('[^a-zA-Z0-9]', '[^a-zA-Z0-9][^a-zA-Z0-9]', ...) to match strings of different length. Otherwise you should use CLR user defined function and a proper regular expression - Regular Expressions Make Pattern Matching And Data Extraction Easier.
This will not work correctly, e.g. abcÑxyz will pass thru this as it has a,b,c... you need to work with Collate or check each byte.
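The difference between an ASCII-only check and a collation-aware one can be illustrated in Python. This is only a sketch with made-up sample rows: the first function mirrors the idea of NOT LIKE '%[a-z0-9]%' restricted to ASCII letters and digits, and the second uses str.isalnum as a rough stand-in for a check that also treats characters such as Ñ as letters. How SQL Server actually classifies such characters depends on the collation in use.

```python
import re

ASCII_ALNUM = re.compile(r"[A-Za-z0-9]")

def only_non_alnum_ascii(value: str) -> bool:
    """No ASCII letter or digit anywhere in value."""
    return ASCII_ALNUM.search(value) is None

def only_non_alnum_unicode(value: str) -> bool:
    """No character that Python classifies as alphanumeric."""
    return not any(ch.isalnum() for ch in value)

# Hypothetical sample rows.
rows = ["---", "#!?", "abc", "a-1", "Ñ"]

ascii_hits = [r for r in rows if only_non_alnum_ascii(r)]
unicode_hits = [r for r in rows if only_non_alnum_unicode(r)]

print(ascii_hits)    # ['---', '#!?', 'Ñ']
print(unicode_hits)  # ['---', '#!?']
```

The two checks disagree only on 'Ñ', which is the kind of value where an explicit collation (or a stricter check) matters.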