Postgres built-in function to escape special characters - sql

I have 2 columns that I want to compare using Postgres pattern matching; in the example below it is the ~* operator (case-insensitive regular-expression match). Let's call the columns COLUMN1 and COLUMN2.
It works on most of the rows, but after further investigation I found that the query fails if the value of either column contains "++".
SELECT
CASE
WHEN 'gcc' ~* 'gcc++' THEN 1
ELSE 2
END
with the following message
ERROR: invalid regular expression: quantifier operand invalid
Obviously the gcc++ string above needs to be escaped.
Question:
Is there a built-in function in Postgres that escapes the column values so that the comparison can be made safely? I'm thinking of something like the following ...
SELECT
CASE
WHEN escapeMe(COLUMN1) ~* escapeMe(COLUMN2) THEN 1
ELSE 2
END
FROM TABLE_T1
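As far as I know, core Postgres does not ship a dedicated regex-escaping function, so a common workaround is to backslash-escape the metacharacters yourself. The Python sketch below only illustrates that idea: escape_regex_meta and the chosen metacharacter set are assumptions made for the example, not a Postgres API, and Python's regex flavor is not identical to the one Postgres uses.

```python
import re

# Characters treated as regex metacharacters in this sketch (an assumption;
# adjust the set to match the regex flavor you target).
_META = re.compile(r'([\\.^$*+?()\[\]{}|])')


def escape_regex_meta(value: str) -> str:
    """Prefix every regex metacharacter in value with a backslash."""
    return _META.sub(r'\\\1', value)


escaped = escape_regex_meta('gcc++')
print(escaped)  # gcc\+\+

# The escaped pattern now matches the literal text only.
print(bool(re.search(escaped, 'gcc++', re.IGNORECASE)))  # True
print(bool(re.search(escaped, 'gcc', re.IGNORECASE)))    # False
```

In Postgres itself the same substitution could probably be written with regexp_replace and the 'g' flag on the pattern column. If a literal, case-insensitive substring test is all that is needed, avoiding regular expressions altogether (for example strpos(lower(COLUMN1), lower(COLUMN2)) > 0) sidesteps the escaping problem.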

verify function SAS to sql

I have a verify function in SAS in a select query, as below. How can I convert it to SQL for use in PySpark?
select substr(upcase(strip(emp)),verify(strip(emp),"0")) as id from Emp_table.
VERIFY(target-expression, search-expression)
The VERIFY function returns the position of the first character in target-expression that is not present in search-expression. If there are no characters in target-expression that are unique from those in search-expression, VERIFY returns a 0.
Your search-expression is the single character 0, so the expression selects the emp value without its leading zeroes.
It looks like you want an ltrim, with an expression such as trim(LEADING '0' ...
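To make the VERIFY semantics described above concrete, here is a small Python sketch. The helper names verify and strip_leading_zeros are hypothetical, and returning an empty string when the value consists only of zeroes is an assumption of this sketch, not documented SAS behaviour.

```python
def verify(target: str, search: str) -> int:
    """Mimic SAS VERIFY: 1-based position of the first character of target
    that is not in search, or 0 if every character is in search."""
    for position, char in enumerate(target, start=1):
        if char not in search:
            return position
    return 0


def strip_leading_zeros(emp: str) -> str:
    """Sketch of substr(upcase(strip(emp)), verify(strip(emp), "0"))."""
    stripped = emp.strip()
    start = verify(stripped, "0")
    if start == 0:
        # Only zeroes (or empty input): assumed to leave nothing behind.
        return ""
    return stripped.upper()[start - 1:]


print(verify("00123", "0"))             # 3
print(strip_leading_zeros(" 00ab1 "))   # AB1
```

For values that contain at least one non-zero character, the result is the same as stripping, upper-casing, and removing leading zeroes, which is what a leading trim does in SQL.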

SQL: Using <= and >= to compare string with wildcard

Assuming I have table that looks like this:
Id | Name | Age
=====================
1 | Jose | 19
2 | Yolly | 26
20 | Abby | 3
29 | Tara | 4
And my query statement is:
1) Select * from thisTable where Name <= '*Abby';
it returns 0 row
2) Select * from thisTable where Name <= 'Abby';
returns row with Abby
3) Select * from thisTable where Name >= 'Abby';
returns all rows // row 1-4
4) Select * from thisTable where Name >= '*Abby';
returns all rows; // row 1-4
5) Select * from thisTable where Name >= '*Abby' and Name <= "*Abby";
returns 0 row.
6) Select * from thisTable where Name >= 'Abby' and Name <= 'Abby';
returns row with Abby;
My question: why did I get these results? How does the wildcard affect the result of the query? Why don't I get any result when the condition is Name <= '*Abby'?
Wildcards are only interpreted when you use the LIKE operator.
So when you compare against the string, it is treated literally, and your comparisons use lexicographical order.
1) No name sorts before *, so no rows are returned.
2) A is the first letter of the alphabet, so the rest of the names are bigger than Abby; only Abby is equal to itself.
3) Opposite of 2): every name is greater than or equal to Abby.
4) Opposite of 1): every name sorts after *, so all rows are returned.
5) See 1): no name can be both >= '*Abby' and <= '*Abby' unless it equals '*Abby'.
6) This condition is equivalent to Name = 'Abby'.
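The six cases can be reproduced with a short sketch. It uses Python's sqlite3 module with an in-memory table as a stand-in, so it assumes SQLite's default byte-wise (BINARY) collation rather than whatever collation the original SQL Server database uses; the helper names is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thisTable (Id INTEGER, Name TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO thisTable VALUES (?, ?, ?)",
    [(1, "Jose", 19), (2, "Yolly", 26), (20, "Abby", 3), (29, "Tara", 4)],
)


def names(where: str) -> list:
    """Return the sorted names matching the given WHERE condition."""
    rows = conn.execute(f"SELECT Name FROM thisTable WHERE {where}")
    return sorted(row[0] for row in rows)


results = {
    1: names("Name <= '*Abby'"),
    2: names("Name <= 'Abby'"),
    3: names("Name >= 'Abby'"),
    4: names("Name >= '*Abby'"),
    5: names("Name >= '*Abby' AND Name <= '*Abby'"),
    6: names("Name >= 'Abby' AND Name <= 'Abby'"),
}
for number, matched in results.items():
    print(number, matched)
```

Cases 1 and 5 come back empty, 2 and 6 return only Abby, and 3 and 4 return all four names, matching the results in the question.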
When working with strings in SQL Server, ordering is done letter by letter, and the order those letters are sorted in depends on the collation. For some characters the sorting method is easy to understand: it's alphabetical or numerical order. For example, 'a' < 'b' and '4' > '2'. Depending on the collation, this might be done by letter and then case ('AaBbCc....') or by case and then letter ('ABC...Zabc').
Let's take a string like 'Abby'. It is sorted by its letters in order: A, b, b, y (where each letter falls depends on your collation; I don't know what yours is, so I'm going to assume an 'AaBbCc....' collation, as they are more common). Any string starting with something like 'Aba' has a value less than 'Abby', as the third character (the first that differs) has a "lower value". So does a value like 'Abbie' ('i' has a lower value than 'y'). Similarly, a string like 'Abc' has a greater value, as 'c' has a higher value than 'b' (the first character that differs).
If we throw numbers into the mix, you might be surprised. For example, the string (important: I didn't say number) '123456789' has a lower value than the string '9'. This is because the first character that differs is the first character: '9' is greater than '1', and so '9' has the "higher" value. This is one reason why it's so important to store numbers in numerical datatypes, as the behaviour is unlikely to be what you expect or want otherwise.
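The difference between comparing digits as text and as numbers can be checked quickly. This sketch uses Python's sqlite3 module as a convenient stand-in; the same result is expected in other engines, but that is an assumption here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Text comparison: decided by the first differing character ('1' < '9').
text_less = conn.execute("SELECT '123456789' < '9'").fetchone()[0]

# Numeric comparison: decided by magnitude.
number_less = conn.execute("SELECT 123456789 < 9").fetchone()[0]

print(text_less, number_less)  # 1 0
```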
To what you are asking, however: the wildcards for SQL Server are '%' and '_' (there is also '^', but I won't cover that here). A '%' represents any number of characters, while '_' represents a single character. If you want to look specifically for one of those characters, you have to quote it in brackets ([]).
The equals (=) operator doesn't parse wildcards; you need an operator that does, like LIKE. Thus, if you want a word that starts with 'A', you would use the expression WHERE ColumnName LIKE 'A%'. If you wanted to search for one that consists of 6 characters and ends with 'ed', you would use WHERE ColumnName LIKE '____ed'.
Like I said before, if you want to search for one of those specific characters, you quote it. So, if you wanted to search for a string that contains an underscore, the syntax would be WHERE ColumnName LIKE '%[_]%'
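The LIKE patterns above can be tried out in a small sketch. It runs against SQLite through Python's sqlite3 module, which does not support the SQL Server bracket syntax, so the literal underscore is matched with an ESCAPE clause instead; the table words, the helper like, and the choice of '!' as escape character are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?)",
    [("Abby",), ("Jose",), ("parked",), ("ended",), ("snake_case",)],
)


def like(pattern_sql: str) -> list:
    """Return the sorted words matching LIKE <pattern_sql>."""
    rows = conn.execute(f"SELECT w FROM words WHERE w LIKE {pattern_sql}")
    return sorted(row[0] for row in rows)


starts_with_a = like("'A%'")              # words starting with A
six_chars_ending_ed = like("'____ed'")    # exactly 6 characters, ending in ed
any_one_char = like("'%_%'")              # unescaped _ matches any character
contains_underscore = like("'%!_%' ESCAPE '!'")  # literal underscore

print(starts_with_a, six_chars_ending_ed, contains_underscore)
```

Note that the unescaped pattern '%_%' matches every non-empty value, which is why the underscore has to be quoted or escaped when it is meant literally.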
Edit: it's also worth noting that operators like LIKE are affected by the collation's sensitivity, for example to case and accents. If you're using a case-sensitive collation, then the statement WHERE 'Abby' LIKE 'abb%' is not true, as 'A' and 'a' are not treated as the same character. Likewise, the statement WHERE 'Covea' = 'Covéa' would be false in an accent-sensitive collation ('e' and 'é' are not treated as the same character).
A wildcard character is used to substitute any other characters in a string. Wildcards are used in conjunction with the SQL LIKE operator in the WHERE clause. For example:
Select * from thisTable WHERE name LIKE '%Abby%'
This will return any values with Abby anywhere within the string.
Have a look at this link for an explanation of all wildcards https://www.w3schools.com/sql/sql_wildcards.asp
That is because >= and <= are comparison operators. They compare strings on the basis of their ASCII values.
The ASCII value of * is 42, and the ASCII values of capital letters start at 65. So for name <= '*Abby', SQL Server compares the first character of your string (value 42) first; since no value in your data starts with a character whose ASCII value is below 42, no rows were selected.
You can refer to an ASCII table for more detail:
http://www.asciitable.com/
There are a few answers, and a few comments - I'll try to summarize.
Firstly, the wildcard in SQL is %, not * (for multiple matches). So your queries including an * ask for a comparison with that literal string.
Secondly, comparing strings with greater/less than operators probably does not do what you want - it uses the collation order to see which other strings are "earlier" or "later" in the ordering sequence. Collation order is a moderately complex concept, and varies between machine installations.
The SQL operator for string pattern matching is LIKE.
I'm not sure I understand your intent with the >= or <= statements - do you mean that you want to return rows where the name's first letter is after 'A' in the alphabet?

How to make to_number ignore non-numerical values

Column xy is of type 'nvarchar2(40)' in table ABC.
The column consists mainly of numerical strings.
How can I write a
select to_number(trim(xy)) from ABC
query that ignores non-numerical strings?
In general in relational databases, the order of evaluation is not defined, so it is possible that the select functions are called before the where clause filters the data. I know this is the case in SQL Server. Here is a post that suggests that the same can happen in Oracle.
The case statement, however, does cascade, so it is evaluated in order. For that reason, I prefer:
select (case when NOT regexp_like(xy,'[^[:digit:]]') then to_number(xy)
end)
from ABC;
This will return NULL for values that are not numbers.
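The guard-then-convert logic of that CASE expression can be sketched in Python. The function name to_number_or_none is hypothetical, [^0-9] stands in for the POSIX class [^[:digit:]] (which Python's re module does not support), and treating the empty string like NULL mirrors Oracle's behaviour as an assumption of this sketch.

```python
import re
from typing import Optional


def to_number_or_none(value: Optional[str]) -> Optional[int]:
    """Convert a digits-only string to int; return None for anything else."""
    if value is None or value == "":
        # Oracle treats the empty string as NULL, so no conversion happens.
        return None
    if re.search(r"[^0-9]", value):
        # At least one non-digit character: skip the conversion entirely.
        return None
    return int(value)


print(to_number_or_none("123"))   # 123
print(to_number_or_none("12a"))   # None
print(to_number_or_none("-5"))    # None (signs are rejected by this check)
```

As in the SQL version, the check runs before the conversion, so the conversion is never attempted on a value that would make it fail.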
You could use regexp_like to find out if it is a number (with/without plus/minus sign, decimal separator followed by at least one digit, thousand separators in the correct places if any) and use it like this:
SELECT TO_NUMBER( CASE WHEN regexp_like(xy,'.....') THEN xy ELSE NULL END )
FROM ABC;
However, as the built-in function TO_NUMBER is not able to deal with all numbers (it fails at least when a number contains thousand separators), I would suggest writing a PL/SQL function TO_NUMBER_OR_DEFAULT(numberstring, defaultnumber) to do what you want.
EDIT: You may want to read my answer on using regexp_like to determine if a string contains a number here: https://stackoverflow.com/a/21235443/2270762.
You can add a WHERE clause:
SELECT TO_NUMBER(TRIM(xy)) FROM ABC WHERE REGEXP_INSTR(xy, '[A-Za-z]') = 0
The WHERE clause skips rows whose value contains letters. See the documentation

how to make sqlite column aliases numeric or computed

i'm trying with sqlite to make code like this working:
select
data1 as 1
data2 as 'hello' + 'hi [or 'hello' || 'hi']
data3 as 6 * 3
from
table
but every effort was in vain...
I tried with the || concatenation and with the +
I tried with something like
...AS select 6 * 3
but it seems like no operation is allowed in column aliases!
How can I properly set the column alias of a select?
Thanks, everyone
Having an expression as a column name is not possible in sqlite. Here is the full allowed syntax of sqlite. The column-alias after AS doesn't accept an expression:
A column alias is an identifier like a table name or column name.
To use an arbitrary string, you have to quote it, like with other identifiers:
SELECT 42 AS "42 is the answer!!!"
You cannot do computations on identifiers.
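A short sketch with Python's sqlite3 module shows both points: a quoted alias is accepted, while an expression in the alias position is rejected. Building the alias in the host language, as in the last step, is one possible workaround and an assumption of this sketch rather than something the answers above propose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A quoted identifier may contain any text, including digits and spaces.
cursor = conn.execute('SELECT 42 AS "42 is the answer!!!", 7 AS "1"')
column_names = [column[0] for column in cursor.description]
print(column_names)  # ['42 is the answer!!!', '1']

# An expression in the alias position is rejected by the parser.
try:
    conn.execute("SELECT 42 AS 6 * 3")
    alias_expression_allowed = True
except sqlite3.OperationalError:
    alias_expression_allowed = False
print(alias_expression_allowed)  # False

# A computed alias has to be built outside SQL, then quoted as an identifier.
computed_alias = str(6 * 3)
cursor = conn.execute(f'SELECT 42 AS "{computed_alias}"')
computed_name = cursor.description[0][0]
print(computed_name)  # 18
```

If the computed alias can itself contain a double quote, it would need to be doubled before being placed inside the quoted identifier.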

Name a case expression?

I have a case expression in a DB2 statement like the following.
SELECT A,
CASE WHEN B LIKE ' %'
THEN C
ELSE B
END CASE,
D
FROM TAB
I'd like to name the column that the case expression results in, but I get a syntax error with both the AS immediately following END CASE and by wrapping the entire expression in parentheses and following that with an AS.
Adding the AS (without the parens) results in the following error
199: SQL0199N The use of the reserved word "AS" following "" is not valid. Expected tokens may include: ", FROM INTO". SQLSTATE=42601
How can I name this column?
The ending keyword is not END CASE, it's just END:
SELECT A,
CASE WHEN B LIKE ' %'
THEN C
ELSE B
END AS D --- the AS is optional
FROM TAB
Here's the documentation for CASE expression in DB2 - although it's not needed really, that's standard SQL.
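Since CASE ... END AS alias is standard SQL, the naming can be tried out in any engine. The sketch below uses SQLite through Python's sqlite3 module rather than DB2; the sample rows and the alias B_OR_C are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TAB (A TEXT, B TEXT, C TEXT, D TEXT)")
conn.executemany(
    "INSERT INTO TAB VALUES (?, ?, ?, ?)",
    [("a1", " blank-led", "c1", "d1"), ("a2", "b2", "c2", "d2")],
)

# The CASE expression ends with END; the alias follows it directly.
cursor = conn.execute(
    """
    SELECT A,
           CASE WHEN B LIKE ' %' THEN C ELSE B END AS B_OR_C,
           D
    FROM TAB
    ORDER BY A
    """
)
column_names = [column[0] for column in cursor.description]
rows = cursor.fetchall()

print(column_names)  # ['A', 'B_OR_C', 'D']
print(rows)          # [('a1', 'c1', 'd1'), ('a2', 'b2', 'd2')]
```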