I'm running a Netezza SQL process as part of a shell script, and in one of the SQL files I want it to raise an error or exception if the row counts from 2 different tables don't match.
SQL Code:
/* The following 2 tables should return the same number of rows to make sure the process is correct */
select count(*)
from (
select distinct col1, col2,col3
from table_a
where week > 0 and rec >= 1
) as x ;
select count(*)
from (
select distinct col1, col2, col3
from table_b
) as y ;
How do I compare the 2 row counts and raise an exception/error in the Netezza SQL process, so that it exits if the 2 row counts aren't equal?
I agree a script is the best option. However, you could still do the check in the SQL itself by using a cross join:
Select a.*
from Next_Step_table a cross join
(select case when y.y_cnt is null then 'No Match' else 'Match' end as match
from (select count(*) as x_cnt
from ( select distinct col1, col2,col3
from table_a
where week > 0 and rec >= 1
) xa) x left outer join
(select count(*) as y_cnt
from (select distinct col1, col2, col3
from table_b
) yb) y on x.x_cnt=y.y_cnt) match_tbl
where match_tbl.match='Match'
I'm guessing the best solution here is to do it in the script, i.e. store the result of each count(*) in a variable, then compare them. nzsql has command-line options to return only the result data of a single query.
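A minimal sketch of that script-side check, written in Python rather than shell, with the standard-library sqlite3 module standing in for the Netezza connection (an assumption made only so the example runs anywhere). The helper names `check_counts_match` and `make_demo_db` and the sample rows are hypothetical; in the real script the two counts would come from nzsql output instead.

```python
import sqlite3

COUNT_A = """
select count(*) from (
    select distinct col1, col2, col3
    from table_a
    where week > 0 and rec >= 1
) as x
"""

COUNT_B = """
select count(*) from (
    select distinct col1, col2, col3
    from table_b
) as y
"""


def check_counts_match(conn):
    """Return the shared row count, or raise ValueError if the counts differ."""
    count_a = conn.execute(COUNT_A).fetchone()[0]
    count_b = conn.execute(COUNT_B).fetchone()[0]
    if count_a != count_b:
        raise ValueError(
            f"row counts differ: table_a={count_a}, table_b={count_b}"
        )
    return count_a


def make_demo_db(rows_a, rows_b):
    """Build an in-memory database with made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table table_a (col1, col2, col3, week, rec)")
    conn.execute("create table table_b (col1, col2, col3)")
    conn.executemany("insert into table_a values (?, ?, ?, ?, ?)", rows_a)
    conn.executemany("insert into table_b values (?, ?, ?)", rows_b)
    return conn
```

If the exception is left uncaught, the Python process ends with a non-zero exit status, which a calling shell script can test to stop the rest of the process.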
If it must be done in plain SQL, a horrible kludge that will work is to use divide-by-zero. It's ugly, but I've used it before when testing stuff. Off the top of my head:
with
subq_x as (select count(*) c1 .... ),
subq_y as (select count(*) c2 ... )
select (case when (subq_x.c1 != subq_y.c2) then 1/0 else 1 end) counts_match
from subq_x cross join subq_y;
Did I mention this is ugly?
I am trying to define an SQL query in my SSRS report, but I am getting some syntax errors:
Error in SELECT clause: expression near 'WHEN'.
Missing FROM clause.
Error in SELECT clause: expression near ','.
Unable to parse query text.
The query is not that complicated, just a bit convoluted, so I'll try to convey at least its structure here:
select
(CASE WHEN columnA1 is null THEN columnA2 ELSE columnA1 END) as columnA,
(CASE WHEN columnB1 is null THEN columnB2 ELSE columnB1 END) as "custom_name_for_columnB"
from
(
(select a.columnA1, ...
from myTable a, ...
// join conditions
)
union
select * from
(select a.columnA1, ...
from myTable a, ...
// join conditions
order by someColumn) source
)
);
I don't think it really matters what the query does, since I ran it successfully in my DBMS, so I'm pretty sure it's correct SQL syntax (I'm working on an Oracle DB). I think what I'm missing is some syntax specific to SSRS. I'm completely new to it, so I don't know whether it supports the entire SQL syntax, such as CASE WHEN, unions, etc.
As it complains about CASE (as if it doesn't recognize the syntax), try some more options:
COALESCE:
select coalesce(columnA1, columnA2) columnA,
coalesce(columnB1, columnB2) columnB
from ...
NVL:
select nvl(columnA1, columnA2) columnA,
nvl(columnB1, columnB2) columnB
from ...
DECODE:
select decode(columnA1, null, columnA2, columnA1) columnA,
decode(columnB1, null, columnB2, columnB1) columnB
from ...
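As a quick sanity check that COALESCE picks the same value as the original CASE expression, here is a small sketch in Python using sqlite3 as a stand-in database (an assumption; NVL and DECODE are Oracle-specific and are not available there, so only COALESCE is exercised). The table `demo`, the function name and the rows are made up for illustration.

```python
import sqlite3


def compare_case_and_coalesce(rows):
    """Return (case_result, coalesce_result) pairs for each input row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table demo (columnA1, columnA2)")
    conn.executemany("insert into demo values (?, ?)", rows)
    return conn.execute(
        """
        select case when columnA1 is null then columnA2 else columnA1 end,
               coalesce(columnA1, columnA2)
        from demo
        order by rowid
        """
    ).fetchall()
```

Both columns of every returned pair should be equal, including the case where both inputs are NULL.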
Correct me if I made too many changes to your query, but I think there are a couple of things wrong:
SELECT (CASE WHEN t.columnA1 IS NULL THEN t.columnA2 ELSE t.columnA1 END) as columnA,
(CASE WHEN t.columnB1 IS NULL THEN t.columnB2 ELSE t.columnB1 END) as [custom_name_for_columnB]
FROM(
SELECT a.columnA1, ...
FROM myTable a, ...
-- join conditions
UNION
SELECT * FROM
(
SELECT a.columnA1, ...
FROM myTable a, ...
-- join conditions
--ORDER BY someColumn --Can't have ORDER BY in subquery without TOP(n) / FOR XML.
) source
)t; --Needs an alias
I am unable to write this, please help. Below will give an idea of what I'm trying to achieve.
WITH monthly_data AS
(SELECT MAX(some_date) latest_dt FROM monthly_data
)
SELECT SUM(data)
FROM daily_data
WHERE (monthly_data.latest_dt IS NULL
OR daily_data.some_date > monthly_data.latest_dt)
table: monthly_data
id some_date
007 08-MAY-2018
table: daily_data
some_date data
07-MAY-2018 1
08-MAY-2018 1
09-MAY-2018 1
Expected result
Case 1: 1 row exist in table monthly_data.
Query should return 1.
Case 2: No rows exist in table monthly_data.
Query should return 3.
The joins in the above query are incorrect; it is written only to give you an idea of what I'm trying to do. Also, when I say no rows exist in table monthly_data, that is a simplified explanation. There are other conditions in the actual query that filter out the data.
This has to go in a procedure.
Edit
Thanks to #D-Shih, I'm in a much better position than where I started, using the EXISTS query that he provided.
In terms of performance, can we write it in a faster way? Something that evaluates to the query below would be fastest, I believe:
WITH CTE AS
( SELECT MAX(some_date) latest_dt FROM monthly_data
)
SELECT SUM(d.some_data)
FROM daily_data d
WHERE (d.some_date > '08-MAY-2018'
OR '08-MAY-2018' IS NULL)
If I understand correctly, I think this will work.
You didn't provide sample data or the expected result, so if this doesn't give what you expect, please add some sample data and the expected result and I will edit my answer.
WITH CTE AS (
SELECT Max(some_date) latest_dt
FROM monthly_data
)
SELECT Sum(d.data)
FROM daily_data d
WHERE Exists (
SELECT 1
FROM CTE c
WHERE
d.some_date > c.latest_dt
OR
c.latest_dt IS NULL
)
Edit
You can try joining the CTE to the daily_data table:
WITH CTE AS (
SELECT Max(some_date) latest_dt
FROM monthly_data
)
SELECT SUM(d.data)
FROM CTE c JOIN daily_data d
ON d.some_date > c.latest_dt OR c.latest_dt IS NULL;
sqlfiddle: http://sqlfiddle.com/#!4/33c64e/28
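To check the join version against the two expected cases from the question, here is a small sketch in Python with sqlite3 as a stand-in for Oracle. The dates are stored as ISO-8601 strings so that string comparison orders them correctly (an assumption of this sketch, not how Oracle DATE columns work), and the function name `sum_after_latest` is hypothetical.

```python
import sqlite3

QUERY = """
WITH cte AS (
    SELECT MAX(some_date) AS latest_dt
    FROM monthly_data
)
SELECT SUM(d.data)
FROM cte c JOIN daily_data d
  ON d.some_date > c.latest_dt OR c.latest_dt IS NULL
"""

DAILY_ROWS = [("2018-05-07", 1), ("2018-05-08", 1), ("2018-05-09", 1)]


def sum_after_latest(monthly_rows):
    """Load the sample data and return the single SUM value from QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table monthly_data (id text, some_date text)")
    conn.execute("create table daily_data (some_date text, data integer)")
    conn.executemany("insert into monthly_data values (?, ?)", monthly_rows)
    conn.executemany("insert into daily_data values (?, ?)", DAILY_ROWS)
    return conn.execute(QUERY).fetchone()[0]
```

The second case works because MAX over an empty table still returns one row (containing NULL), so the CTE always has a row to join against.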
Try this:
SELECT CASE WHEN SUM(CASE WHEN md.Sdate IS NOT NULL THEN 1 ELSE 0 END) > 0 THEN
SUM(CASE WHEN md.Sdate IS NOT NULL THEN 1 ELSE 0 END)
ELSE
SUM(CASE WHEN md.Sdate IS NULL THEN 1 ELSE 0 END)
END cnt
FROM daily_data dd
LEFT JOIN monthly_data md ON md.Sdate = dd.Sdate
....... {other conditions}
Say I have a table t with 2 columns:
a int
b int
I can do a query such as:
select b
from t
where b > a
and a in(1,2,3)
order by b
where 1,2,3 is provided from the outside.
Obviously, the query can return no rows. In that case, I'd like to select everything as if the query did not have the and a in(1,2,3) part. That is, I'd like:
if exists (
select b
from t
where b > a
and a in(1,2,3)
)
select b
from t
where b > a
and a in(1,2,3)
order by b
else
select b
from t
where b > a
order by b
Is there a way to do this:
Without running two queries (one for exists, the other one the actual query)
That is less verbose than repeating queries (real queries are quite long, so DRY and all that stuff)
Using NOT EXISTS with a subquery to determine whether the condition matches any row:
SELECT b
FROM
t
WHERE
b > a
AND (
NOT EXISTS (SELECT 1 FROM t WHERE b > a AND a IN (1,2,3))
OR a IN (1,2,3)
)
ORDER BY
b
This works because, if any row satisfies the condition, NOT EXISTS is false and the OR keeps only the rows where a IN (1,2,3); if no row satisfies it, NOT EXISTS is true and all rows are included.
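A small runnable sketch of the NOT EXISTS fallback, in Python with sqlite3 as a stand-in database. It assumes the subquery reads from the same table t and repeats the b > a filter, so that the fallback triggers exactly when the constrained query would return no rows; the function name and sample rows are hypothetical.

```python
import sqlite3

QUERY = """
SELECT b
FROM t
WHERE b > a
  AND (
        NOT EXISTS (SELECT 1 FROM t WHERE b > a AND a IN (1, 2, 3))
        OR a IN (1, 2, 3)
      )
ORDER BY b
"""


def select_with_fallback(rows):
    """Return b values: constrained if any row matches, otherwise all b > a."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (a int, b int)")
    conn.executemany("insert into t values (?, ?)", rows)
    return [row[0] for row in conn.execute(QUERY)]
```

With rows (1,5), (2,1), (7,9) only the constrained match is returned; with rows (7,9), (8,10), (2,1) nothing satisfies both b > a and a IN (1,2,3), so every row with b > a comes back.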
Or with a common table expression and a window function with conditional aggregation:
WITH cte AS (
SELECT
b
,CASE WHEN a IN (1,2,3) THEN 1 ELSE 0 END as MeetsCondition
,COUNT(CASE WHEN a IN (1,2,3) THEN a END) OVER () as ConditionCount
FROM
t
WHERE
b > a
)
SELECT
b
FROM
cte
WHERE
(ConditionCount > 0 AND MeetsCondition = 1)
OR (ConditionCount = 0)
ORDER BY
b
I find it a bit "ugly". It might be better to materialize the output of your query in a temp table and then, based on the count from the temp table, run the first or the second query (this reduces the number of accesses to the original table from 3 to 2, and you can add a flag for the rows that satisfy your condition so you don't have to repeat it). Other than that, read below.
Bear in mind, though, that an EXISTS query should execute pretty fast: it stops as soon as it finds any row that satisfies the condition.
You could achieve this by using UNION ALL to combine the result set of the constrained query with that of the full query (without the constraint on the column), and then deciding what to show with a CASE expression based on the output of the first query.
How the CASE expression works: when the constrained part of your query returns any row, return the result set of the constrained query; otherwise return everything, omitting the constraint.
If your database supports CTEs, use this solution:
with tmp_data as (
select *
from (
select 'constraint' as type, b
from t
where b > a
and a in (1,2,3) -- here goes your constraint
union all
select 'full query' as type, b
from t
where b > a
) foo
)
SELECT b
FROM tmp_data
WHERE
CASE WHEN (select count(*) from tmp_data where type = 'constraint') > 0
THEN type = 'constraint'
ELSE type = 'full query'
END
;
I have three tables A, B and C. I have to detect if any of them has zero rows. As soon as a table with zero rows is detected, I do not need to check the others.
So, one way is to execute three queries separately and check the number of returned rows after each one. Only if it's non-zero do I execute the query for the next table.
The second way is to write a single query using CASE WHEN, something like:
select case
when (select count(*) from A = 0)
then 1
else (
select case
when (select count(*) from B = 0)
then 1
else (
select case
when (select count(*) from B = 0)
then 1
else 0
)
)
end as matchResult;
The second method requires less code, as I only have to write a single query and the DB will do the comparison for me.
My question is whether it's overkill, or can I further optimize the query?
EDIT
On further study, I realise that the query above is wrong. However, I can simply do it as
select case
when (select count(*) from A) = 0 and
(select count(*) from B) = 0 and
(select count(*) from C) = 0
then 1
else 0
end as matchResult;
And if I am not wrong, AND conditions are checked from left to right, and if any one is false, the conditions to its right are not checked.
Please confirm this point.
Count is kind of expensive:
select 1
where not exists (select * from a)
or not exists (select * from b)
or not exists (select * from c)
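A short runnable sketch of the NOT EXISTS check, in Python with sqlite3 as a stand-in database (an assumption; other engines may require a FROM clause such as Oracle's `FROM dual`). The function name `any_table_empty` and the single `id` column are hypothetical.

```python
import sqlite3

QUERY = """
select 1
where not exists (select * from a)
   or not exists (select * from b)
   or not exists (select * from c)
"""


def any_table_empty(rows_a, rows_b, rows_c):
    """Return True if at least one of the three tables has no rows."""
    conn = sqlite3.connect(":memory:")
    for name, rows in (("a", rows_a), ("b", rows_b), ("c", rows_c)):
        conn.execute(f"create table {name} (id integer)")
        conn.executemany(
            f"insert into {name} values (?)", [(value,) for value in rows]
        )
    # The query yields one row when some table is empty, and no row otherwise.
    return conn.execute(QUERY).fetchone() is not None
```

Unlike count(*), an EXISTS test can stop at the first row it finds, so it does not need to scan a whole table to prove that the table is non-empty.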
One query with three results:
select (select count(*) from A) as Acount,
(select count(*) from B) as Bcount,
(select count(*) from C) as Ccount
This instead gives the name of the first table that is empty:
select case
when (select count(*) from A)=0 then 'A'
when (select count(*) from B)=0 then 'B'
when (select count(*) from C)=0 then 'C'
else 'ops, all have records' -- remove this to have a null
end as first_empty_table
I shortened the code quite a bit, but hopefully someone will get the idea of what I am trying to do. I need to sum totals from two different selects. I tried putting each of them in a LEFT OUTER JOIN (I tried inner joins too). If I run with either LEFT OUTER JOIN commented out, I get the correct data, but when I run them together, I get really screwed-up counts. So I know joins are probably not the correct approach to summing data from the same table, but I can't simply do it in a WHERE clause, because there is another table involved in the code I commented out.
I guess I am trying to sum together 2 different queries.
SELECT eeoc.EEOCode AS 'Test1',
SUM(eeosum.Col_One) AS 'Col_One',
FROM EEO1Analysis eeo
LEFT OUTER JOIN (
SELECT eeor.AnalysisID, eeor.Test1,
SUM(CASE eeor.ZZZ WHEN 1 THEN (CASE eeor.AAAA WHEN 1 THEN 1 ELSE 0 END) ELSE 0 END) AS 'Col_One',
FROM EEO1Roster eeor
..........
WHERE eeor.AnalysisID = 7
GROUP BY eeor.AnalysisID, eeor.EEOCode
) AS eeosum2 ON eeosum2.AnalysisID = eeo.AnalysisID
LEFT OUTER JOIN (
SELECT eeor.AnalysisID, eeor.Test1,
SUM(CASE eeor.ZZZ WHEN 1 THEN (CASE eeor.AAAA WHEN 1 THEN 1 ELSE 0 END) ELSE 0 END) AS 'Col_One',
FROM EEO1Roster eeor
........
) AS eeosum ON eeosum.AnalysisID = eeo.AnalysisID
WHERE eeo.AnalysisID = 7
GROUP BY eeoc.Test1
You could UNION ALL the 2 queries and then do a SUM + GROUP BY, i.e.:
SELECT Col1, Col2, SUM(Col_One) AS Col_One FROM
(SELECT Col1, Col2, SUM(Col_One) AS Col_One
FROM Table1
WHERE <Conditionset1>
GROUP BY Col1, Col2
UNION ALL
SELECT Col1, Col2, SUM(Col_One) AS Col_One
FROM Table1
WHERE <Conditionset2>
GROUP BY Col1, Col2) AS combined
GROUP BY
Col1, Col2
Of course, if there are row(s) returned by both <Conditionset1> and <Conditionset2>, they would be double counted.
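The double-counting caveat can be seen in a small sketch, written in Python with sqlite3 as a stand-in database. The columns `Flag_A` and `Flag_B` are hypothetical stand-ins for the two condition sets, and the function name and rows are made up for illustration.

```python
import sqlite3

QUERY = """
SELECT Col1, SUM(Col_One) AS total
FROM (
    SELECT Col1, SUM(Col_One) AS Col_One
    FROM Table1
    WHERE Flag_A = 1      -- hypothetical stand-in for the first condition set
    GROUP BY Col1
    UNION ALL
    SELECT Col1, SUM(Col_One) AS Col_One
    FROM Table1
    WHERE Flag_B = 1      -- hypothetical stand-in for the second condition set
    GROUP BY Col1
) AS combined
GROUP BY Col1
ORDER BY Col1
"""


def combined_totals(rows):
    """Return (Col1, total) pairs after summing both branches of the union."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table Table1 "
        "(Col1 text, Col_One integer, Flag_A integer, Flag_B integer)"
    )
    conn.executemany("insert into Table1 values (?, ?, ?, ?)", rows)
    return conn.execute(QUERY).fetchall()
```

For example, a row ("y", 7, 1, 1) satisfies both conditions, so it contributes 7 from each branch and "y" ends up with a total of 14 rather than 7.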
What about
SELECT ... FROM EEO1Analysis eeo,
(SELECT ... LEFT OUTER JOIN ... GROUP BY ... ) AS data
...
?
And, if you can, I'd recommend preparing the data into separate tables and then operating on them with different analysis IDs. That could at least save some execution time.
Need to sum totals from two different selects
If you expect a single-row, single-column result, this is enough:
SELECT
((SELECT SUM(...) FROM ... GROUP BY...) +
(SELECT SUM(...) FROM ... GROUP BY...)) as TheSumOfTwoSums
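A minimal runnable sketch of adding two scalar subqueries, in Python with sqlite3 as a stand-in database. The table `sales`, its columns and the two region filters are hypothetical and only illustrate the shape of the query; the GROUP BY is left out so that each subquery is guaranteed to return a single value.

```python
import sqlite3

QUERY = """
SELECT (SELECT SUM(amount) FROM sales WHERE region = 'north')
     + (SELECT SUM(amount) FROM sales WHERE region = 'south') AS TheSumOfTwoSums
"""


def sum_of_two_sums(rows):
    """Return the single value produced by adding the two scalar subqueries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table sales (region text, amount integer)")
    conn.executemany("insert into sales values (?, ?)", rows)
    return conn.execute(QUERY).fetchone()[0]
```

One thing to watch for: if either subquery matches no rows, its SUM is NULL and so is the total. Wrapping each subquery in COALESCE(..., 0) avoids that if a missing side should count as zero.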