A regex to extract the SQL WHERE clause - sql

First, I confess I'm not really experienced with regular expressions. I know how to use them, but building one myself is something else... I'm going to read up on them.
I want to extract the WHERE clause from a SQL query. My goal is to be able to add a condition, like this:
SELECT * FROM myTbl WHERE columnA = 'B' AND columnB = 'C' ORDER BY columnX GROUP BY columnZ LIMIT 5
to:
SELECT * FROM myTbl WHERE columnC = 'D' AND (columnA = 'B' AND columnB = 'C') ORDER BY columnX GROUP BY columnZ LIMIT 5
I tried a few expressions, but I'm stuck...
(where (.*)(?<=order by))
I wanted to capture everything between 'where' and ('order by' or 'limit' or 'group by')...
Does anyone have any advice? I searched and didn't find anything like this. I found SQL parsers, but those engines are too heavy for the task I want to complete.
Thank you.

Since the WHERE clause can be quite complex (including subqueries, which may themselves include an ORDER BY in some cases, for instance when used with FOR XML), you will not be able to find a regex solution that works reliably in every case.
A better solution would be to use a proper SQL parser which generates an AST, and then you can just extract the WHERE clause from that. For T-SQL, you could use the parser from the bsn ModuleStore project (LGPL license). Modifying the AST is easy and you can re-script the statement afterwards.

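You're using a lookbehind assertion (?<=) where you need a lookahead assertion (?=). With the lookbehind, the captured group itself has to end with 'order by', so the keyword ends up inside the match instead of marking where it stops.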
There's more on that.
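As a rough illustration of the lookahead approach, here is a minimal Python sketch. The helper name `add_condition` and the pattern are assumptions made for this example, not something from the question. It only handles a single top-level WHERE clause and will break on subqueries or on keywords that appear inside string literals, which is exactly the limitation described in the other answer.

```python
import re

# Capture everything after WHERE up to (but not including) the first
# ORDER BY / GROUP BY / LIMIT keyword, or the end of the string.
WHERE_BODY = re.compile(
    r"\bWHERE\s+(.*?)(?=\s+(?:ORDER\s+BY|GROUP\s+BY|LIMIT)\b|\s*$)",
    re.IGNORECASE | re.DOTALL,
)


def add_condition(sql: str, extra: str) -> str:
    """Prepend `extra` to the existing WHERE body (hypothetical helper)."""
    match = WHERE_BODY.search(sql)
    if match is None:
        return sql  # no WHERE clause found; leave the query untouched
    start, end = match.span(1)
    return f"{sql[:start]}{extra} AND ({sql[start:end]}){sql[end:]}"


query = ("SELECT * FROM myTbl WHERE columnA = 'B' AND columnB = 'C' "
         "ORDER BY columnX GROUP BY columnZ LIMIT 5")
rewritten = add_condition(query, "columnC = 'D'")
print(rewritten)
```

The lookahead only checks what follows the captured text, so the trailing keyword stays outside group 1 and the rest of the query is copied back unchanged.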

This might get you going:
declare
sql_stmt varchar2(4000) := q'!SELECT * FROM myTbl WHERE columnA = 'B' AND columnB = 'C' ORDER BY columnX GROUP BY columnZ LIMIT 5!';
where_stmt varchar2(4000) ;
begin
where_stmt := regexp_replace(sql_stmt, '.*(WHERE.*?)ORDER BY.*', '\1');
dbms_output.put_line(where_stmt);
end;
/
The snippet above will output WHERE columnA = 'B' AND columnB = 'C'.

Related

SQL use intermediate results

I have a column with numbers (float) that I would like to categorize and store a category as integer and as label (string). For now assume that the category is simply defined by the FLOOR(x).
This works:
SELECT salary,
FLOOR(salary) AS category_integer,
CASE WHEN FLOOR(salary) = 0
THEN 'foo'
ELSE 'bar'
END AS category_label
FROM test01
but I was wondering if I could use the intermediate variable 'category_integer' defined in the beginning of my query in a later part, something like this:
SELECT salary,
FLOOR(salary) AS category_integer,
CASE WHEN category_integer = 0
THEN 'foo'
ELSE 'bar'
END AS category_label
FROM test01
but this is apparently not how SQL works. I've looked into Common table Expressions but got lost there. Is there a way to reuse intermediate variables in an SQL expression?
I must have missed this but I couldn't find related questions so far.
You may resort to common table expressions - basically a query that produces a labelled result set you can refer to in subsequent queries.
Adapted to your example:
with cte as (
select salary
, floor(salary) as category_integer
from test01
)
SELECT salary
, category_integer
, CASE WHEN category_integer = 0
THEN 'foo'
ELSE 'bar'
END AS category_label
FROM cte
;
Consult the reference for more details: CTE / WITH in pgSQL 9.6.
See it at work in SQL fiddle.
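To try the CTE locally without a PostgreSQL server, the following sketch runs the same query against an in-memory SQLite database from Python. SQLite is only a stand-in here, the sample salaries are made up, and `FLOOR` is registered by hand because not every SQLite build ships it; none of that comes from the original answer.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: register FLOOR ourselves so the query does not depend on
# how the local SQLite library was compiled.
conn.create_function("FLOOR", 1, math.floor)
conn.execute("CREATE TABLE test01 (salary REAL)")
conn.executemany("INSERT INTO test01 VALUES (?)", [(0.5,), (1.2,), (2.9,)])

rows = conn.execute(
    """
    WITH cte AS (
        SELECT salary, FLOOR(salary) AS category_integer
        FROM test01
    )
    SELECT salary,
           category_integer,
           CASE WHEN category_integer = 0 THEN 'foo' ELSE 'bar' END AS category_label
    FROM cte
    ORDER BY salary
    """
).fetchall()

for row in rows:
    print(row)
```

Inside the outer SELECT, category_integer is an ordinary column of cte, which is why it can be used in the CASE expression.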
There are pre- and post-selection operations. For example, ORDER BY and GROUP BY are post-selection instructions, whereas DISTINCT filters out duplicate results during the selection process itself, so duplicates never even enter the result set to be ordered or grouped.
When you use AS, you are telling PostgreSQL to take the result and put it in a column named category_integer in the output. You are not actually making a variable here that's available during query execution, as the result is only available after the query executes. As such, you can only do this with subselects where you have the result available as a virtual table in itself, where category_integer is a column in a table rather than a variable.
SELECT category_integer,
CASE WHEN category_integer = 0
THEN 'foo'
ELSE 'bar'
END AS category_label
FROM (SELECT FLOOR(salary) AS category_integer FROM test01) AS test02
https://www.postgresql.org/docs/current/static/queries-select-lists.html
https://www.postgresql.org/docs/current/static/queries-table-expressions.html#QUERIES-TABLE-ALIASES

'In' clause in SQL server with multiple columns

I have a component that retrieves data from database based on the keys provided.
However, I want my Java application to fetch the data for all keys in a single database hit to speed things up.
I can use an 'in' clause when I have only one key.
With more than one key column, I can use the query below in Oracle:
SELECT * FROM <table_name>
where (value_type,CODE1) IN (('I','COMM'),('I','CORE'));
which is similar to running
SELECT * FROM <table_name>
where value_type = 'I' and CODE1 = 'COMM'
and
SELECT * FROM <table_name>
where value_type = 'I' and CODE1 = 'CORE'
together.
However, using the 'in' clause as above gives the error below in SQL Server:
ERROR: An expression of non-boolean type specified in a context where a condition is expected, near ','.
Please let me know if there is any way to achieve the same in SQL Server.
This syntax doesn't exist in SQL Server. Use a combination of AND and OR.
SELECT *
FROM <table_name>
WHERE
(value_type = 'I' and CODE1 = 'COMM')
OR (value_type = 'I' and CODE1 = 'CORE')
(In this case, you could make it shorter, because value_type is compared to the same value in both combinations. I just wanted to show the pattern that works like IN in Oracle with multiple fields.)
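Since the application builds this query for a variable number of key pairs, the AND/OR pattern can be generated with bind parameters rather than hand-written. The sketch below is in Python for brevity (the question uses Java, where the same idea applies with a PreparedStatement). The helper name `build_pair_filter`, the table name `items` and the sample rows are assumptions for illustration, and SQLite stands in for SQL Server.

```python
import sqlite3


def build_pair_filter(columns, pairs):
    """Return (where_sql, params) matching any of the given value tuples.

    Column names are interpolated into the SQL text, so they must come
    from trusted code, never from user input.
    """
    if not pairs:
        return "1 = 0", []  # no keys requested: match nothing
    group = "(" + " AND ".join(f"{col} = ?" for col in columns) + ")"
    where = " OR ".join([group] * len(pairs))
    params = [value for pair in pairs for value in pair]
    return where, params


where, params = build_pair_filter(
    ["value_type", "CODE1"], [("I", "COMM"), ("I", "CORE")]
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (value_type TEXT, CODE1 TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("I", "COMM"), ("I", "CORE"), ("X", "COMM"), ("I", "OTHER")],
)
found = conn.execute(
    f"SELECT value_type, CODE1 FROM items WHERE {where} ORDER BY CODE1", params
).fetchall()
print(where)
print(found)
```

Only the values travel as parameters, so the statement text depends on the number of pairs but not on their content.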
When using IN with a subquery, you need to rephrase it like this:
Oracle:
SELECT *
FROM foo
WHERE
(value_type, CODE1) IN (
SELECT type, code
FROM bar
WHERE <some conditions>)
SQL Server:
SELECT *
FROM foo
WHERE
EXISTS (
SELECT *
FROM bar
WHERE <some conditions>
AND foo.value_type = bar.type
AND foo.CODE1 = bar.code)
There are other ways to do it, depending on the case, like inner joins and the like.
If you have under 1000 tuples you want to check against and you're using SQL Server 2008+, you can use a table value constructor and perform a join against it. You can only specify up to 1000 rows in a table value constructor, hence the 1000-tuple limitation. Here's how it would look in your situation:
SELECT <table_name>.* FROM <table_name>
JOIN ( VALUES
('I', 'COMM'),
('I', 'CORE')
) AS MyTable(a, b) ON a = value_type AND b = CODE1;
This is only a good idea if your list of values is going to be unique, otherwise you'll get duplicate rows. I'm not sure how the performance of this compares to using many ANDs and ORs, but the SQL query is at least much cleaner to look at, in my opinion.
You can also write this to use EXISTS instead of JOIN. That may have different performance characteristics, and it avoids the problem of producing duplicate results if your values aren't unique. It may be worth trying both EXISTS and JOIN on your use case to see which is a better fit. Here's how EXISTS would look:
SELECT * FROM <table_name>
WHERE EXISTS (
SELECT 1
FROM (
VALUES
('I', 'COMM'),
('I', 'CORE')
) AS MyTable(a, b)
WHERE a = value_type AND b = CODE1
);
In conclusion, I think the best choice is to create a temporary table and query against that. But sometimes that's not possible, e.g. your user lacks the permission to create temporary tables, and then using a table value constructor may be your best choice. Use EXISTS or JOIN, depending on which gives you better performance on your database.
Normally you can not do it, but can use the following technique.
SELECT * FROM <table_name>
where (value_type+'/'+CODE1) IN (('I'+'/'+'COMM'),('I'+'/'+'CORE'));
A better solution is to avoid hardcoding your values and put them in a temporary or persistent table:
CREATE TABLE #t (ValueType VARCHAR(16), Code VARCHAR(16))
INSERT INTO #t VALUES ('I','COMM'),('I','CORE')
SELECT DT.*
FROM <table_name> DT
JOIN #t T ON T.ValueType = DT.value_type AND T.Code = DT.CODE1
This way, you avoid storing data in your code (persistent table version) and can easily modify the filters without changing the code.
I think you can try this: combine AND and OR at the same time.
SELECT
*
FROM
<table_name>
WHERE
value_type = 'I'
AND (CODE1 = 'COMM' OR CODE1 = 'CORE')
What you can do is 'join' the columns as a string, and pass your values also combined as strings.
where (cast(column1 as text) ||','|| cast(column2 as text)) in (?1)
The other way is to do multiple ands and ors.
I had a similar, though slightly different, problem in MS SQL. Maybe it will help somebody in the future; in my case I found this solution (not the full code, just an example):
SELECT Table1.Campaign
,Table1.Coupon
FROM [CRM].[dbo].[Coupons] AS Table1
INNER JOIN [CRM].[dbo].[Coupons] AS Table2 ON Table1.Campaign = Table2.Campaign AND Table1.Coupon = Table2.Coupon
WHERE Table1.Coupon IN ('0000000001', '0000000002') AND Table2.Campaign IN ('XXX000000001', 'XYX000000001')
Of course, I have an index on Coupon and Campaign in the table for fast searching.
Compute it in MS SQL:
SELECT * FROM <table_name>
where value_type + '|' + CODE1 IN ('I|COMM', 'I|CORE');
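One caveat with the concatenation approaches above: if a value can itself contain the separator, two different pairs may collapse into the same string and produce a false match. The Python sketch below shows the effect with made-up values; `concat_key` is a hypothetical helper that mirrors the `value_type + '|' + CODE1` expression.

```python
def concat_key(value_type: str, code: str, sep: str = "|") -> str:
    """Mimic the SQL expression value_type + '|' + CODE1."""
    return value_type + sep + code


# We only want the pair ('I', 'A|B').
wanted = {concat_key("I", "A|B")}

# Made-up rows: the second one is a different pair with the same joined key.
rows = [("I", "A|B"), ("I|A", "B"), ("I", "CORE")]
matches = [row for row in rows if concat_key(*row) in wanted]
print(matches)
```

Choosing a separator that cannot occur in the data avoids this. Comparing the columns individually, as in the AND/OR and join variants, avoids it entirely and also does not wrap the columns in an expression, which typically prevents the use of an index on them.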

sql statement inside decode clause

The decode works like this:
SELECT DECODE('col1', 'x', 'result1','y','result2') resultFinal
FROM table1;
Is it possible to accomplish this in SQL:
SELECT *
FROM (SELECT DECODE('col1', 'x', (someSql), 'y', (someOthersql)) result
FROM table1)
So instead of result1 and result2 being fixed values, they would be SQL statements. If that is not possible, how can I achieve the same result without a stored procedure?
EDIT: someSql and someOthersql are both complex queries with many joins, returning many columns, but the same number of columns with the same column names.
If someSql and someOthersql return exactly one row with one column, then this should work.
The following works for me:
select decode(col, (select 'foo' from dual), (select 'bar' from dual))
from some table
I think you may need to create a PL/SQL procedure to handle the complex logic.

how to select * from tableA where columnA values don't start with letter 'F'

Can I have a SQL query to get the data from columnA in tableA whose values don't start with 'F'?
For example:
select * from tableA where columnA
where values don't start with letter 'F'.
For an MSSQL scenario, you should be able to use the NOT operator in conjunction with the LIKE operator. Your SQL would look roughly like:
select * from tableA where columnA NOT LIKE 'F%'
@Evan: the statement about SQL Server being case insensitive is actually not entirely true. Case sensitivity depends on collation. The server has a collation (chosen on install), a database has a collation (chosen on DB creation) and text columns have a collation (chosen when creating the column). When no collation is specified on DB creation, the server collation is the default. When no collation is specified on column creation, the column gets the same collation as the DB.
But in most cases, people (luckily) install their server using a case insensitive collation, such as Latin1_General_CI_AS. CI = case insensitive, AS = accent sensitive.
On SQL Server, if I needed to get both the small f and capital F, I would go for:
where columnA NOT LIKE 'F%' and columnA NOT LIKE 'f%'
SELECT columnA
FROM tableA
WHERE SUBSTR(columnA,1,1) <> 'f'
If you need both 'f' and 'F':
SELECT columnA
FROM tableA
WHERE SUBSTR(columnA,1,1) NOT IN ('f','F')
Going off of Lerxst's example, some DBMSs will also let you do fun stuff like this:
SELECT columnA
FROM tableA
WHERE columnA NOT LIKE ALL ('f%','F%')
I like all of the ideas above, but I usually take a different approach.
SELECT *
FROM tableA
WHERE LEFT(columnA,1) <> 'F'
T-SQL really offers a million ways to skin a cat.
Searching for both F and f seems like way too much work
SELECT *
FROM tableA
WHERE upper(substr(columnA,1,1)) != 'F'
Or, to quote my friend Ritchie: when searching in SQL, trim it and then force it to upper case.
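One thing none of the variants above mention is NULL handling: a row whose columnA is NULL satisfies neither "starts with F" nor "does not start with F", so it is silently dropped unless you ask for it. The sketch below demonstrates this in Python with an in-memory SQLite database. SQLite is only a stand-in for the engines discussed above, and the sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (columnA TEXT)")
conn.executemany(
    "INSERT INTO tableA VALUES (?)",
    [("Foo",), ("foo",), ("Bar",), ("baz",), (None,)],
)

# Upper-casing the first character handles 'f' and 'F' in one comparison.
without_f = [
    row[0]
    for row in conn.execute(
        "SELECT columnA FROM tableA "
        "WHERE UPPER(SUBSTR(columnA, 1, 1)) <> 'F' ORDER BY columnA"
    )
]

# NULL <> 'F' evaluates to NULL, not true, so NULL rows need an explicit test.
with_nulls = [
    row[0]
    for row in conn.execute(
        "SELECT columnA FROM tableA "
        "WHERE UPPER(SUBSTR(columnA, 1, 1)) <> 'F' OR columnA IS NULL "
        "ORDER BY columnA"
    )
]
print(without_f)
print(with_nulls)
```

Whether NULL rows should count as "not starting with F" depends on the use case; the point is only that the choice has to be made explicitly.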

How to select an empty result set?

I'm using a stored procedure in MySQL, with a CASE statement.
In the ELSE clause of the CASE (the equivalent of default:) I want to select and return an empty result set, so that instead of throwing an SQL error for the unhandled ELSE case, the procedure returns an empty result set, as if a regular query had returned no rows.
So far I've managed to do so using something like:
Select NULL From users Where False
But I have to name an existing table, like 'users' in this example.
It works, but I would prefer a way that doesn't break if eventually the table name used is renamed or dropped.
I've tried Select NULL Where False but it doesn't work.
Using Select NULL does not return an empty set, but one row with a column named NULL and with a NULL value.
There's a dummy-table in MySQL called 'dual', which you should be able to use.
select
1
from
dual
where
false
This will always give you an empty result.
This should work on most DBs, tested on Postgres and Netezza:
SELECT NULL LIMIT 0;
T-SQL (MSSQL):
SELECT Top 0 1;
How about
SELECT * FROM (SELECT 1) AS TBL WHERE 2=3
Checked in myphp, and it also works in sqlite and probably in any other db engine.
This will probably work across all databases.
SELECT * FROM (SELECT NULL AS col0) AS inner0 WHERE col0 IS NOT NULL;
SELECT TOP 0 * FROM [dbo].[TableName]
This is a reasonable approach to constant scan operator.
SELECT NULL WHERE FALSE;
It works in PostgreSQL and MySQL, including as a subquery in MySQL.
How about this?
SELECT 'MyName' AS EmptyColumn
FROM dual
WHERE 'Me' = 'Funny'
SELECT * FROM (SELECT NULL) WHERE 0
In PostgreSQL a simple
SELECT;
works. You won't even get any columns labeled 'unknown'.
Note however, it still says 1 row retrieved.
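Because support for these forms differs between engines, it is worth checking each candidate against the database you actually use. The sketch below does that for a few of the table-free statements from the answers, using Python and an in-memory SQLite database as a stand-in. It says nothing about MySQL, PostgreSQL or SQL Server, where some of these statements are rejected or behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Candidate statements taken from the answers above; none names a real table.
candidates = [
    "SELECT NULL WHERE 0",
    "SELECT NULL LIMIT 0",
    "SELECT * FROM (SELECT 1) AS tbl WHERE 2 = 3",
    "SELECT * FROM (SELECT NULL AS col0) AS inner0 WHERE col0 IS NOT NULL",
]
results = {sql: conn.execute(sql).fetchall() for sql in candidates}

# For contrast: a bare SELECT NULL returns one row holding a NULL value.
plain = conn.execute("SELECT NULL").fetchall()

for sql, rows in results.items():
    print(f"{sql!r} -> {len(rows)} row(s)")
print(f"'SELECT NULL' -> {len(plain)} row(s)")
```

The same loop can be pointed at a connection to the target database to see which statements are accepted there.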