SQL - Snowflake Minus Operator - sql

Hi I am running a query to check for any changes in a table between two dates....
SELECT * FROM TABLE_A where run_time = current_date()
MINUS
SELECT * FROM TABLE_A where run_time = current_date()-1
The first SELECT statement (where run_time = current_date()) returns 3,357,210 records.
The second SELECT statement (where run_time = current_date()-1) returns 0 records.
Using the MINUS operator, I was expecting to see 3,357,210 records (3,357,210 - 0), but instead I get 2,026,434.
Any thoughts on why? Thanks

https://docs.snowflake.com/en/sql-reference/operators-query.html#minus-except
Removes rows from one query’s result set which appear in another query’s result set, with duplicate elimination.
So your first query returns only 2,026,434 distinct rows. The missing 1,330,776 rows are duplicates, which MINUS has eliminated.
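The effect is easy to reproduce on a small scale. The sketch below uses Python's built-in sqlite3 module with a hypothetical three-row version of TABLE_A and made-up dates. SQLite spells the operator EXCEPT, which Snowflake accepts as a synonym for MINUS.

```python
import sqlite3

# Hypothetical miniature of TABLE_A; column names and dates are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (col1 INTEGER, col2 TEXT, run_time TEXT)")
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?)",
    [
        (1, "x", "2024-01-02"),
        (1, "x", "2024-01-02"),  # exact duplicate of the row above
        (2, "y", "2024-01-02"),
    ],
)

# Plain row count for "today".
total_rows = conn.execute(
    "SELECT COUNT(*) FROM table_a WHERE run_time = '2024-01-02'"
).fetchone()[0]

# The right-hand query matches nothing, yet the duplicate still disappears.
minus_rows = conn.execute(
    """
    SELECT * FROM table_a WHERE run_time = '2024-01-02'
    EXCEPT
    SELECT * FROM table_a WHERE run_time = '2024-01-01'
    """
).fetchall()

print(total_rows, len(minus_rows))
```

Here the count is 3 but the set difference has only 2 rows, which is the same gap as in the question, just smaller.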

This query:
SELECT * FROM TABLE_A where run_time = current_date()
MINUS
SELECT * FROM TABLE_A where run_time = current_date()-1
will always return all distinct rows from the first query. Why? Because run_time is one of the columns, and its value differs between the two queries. MINUS compares all the columns, so no row from the second query can ever match a row from the first. Note that this is true even if the second query returns rows.
If the result count is lower than the number of rows the first query returns, then you have duplicate rows in the table.
Here are two ways to get the new records. Let me assume that identical records are identified by col1/col2:
select col1, col2
from table_a
where run_time in (current_date(), current_date() -1)
group by col1, col2
having min(run_time) = current_date();
That is, the first occurrence is the current date.
Or:
select col1, col2
from table_a a
where a.run_time = current_date() and
not exists (select 1
from table_a a2
where a2.run_time = current_date() - 1 and
a2.col1 = a.col1 and a2.col2 = a.col2
);
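Both queries can be tried out with Python's sqlite3 module. This is only a sketch: the table contents are invented, and two fixed date strings (TODAY, YESTERDAY) stand in for current_date() and current_date() - 1.

```python
import sqlite3

# Made-up stand-ins for current_date() and current_date() - 1.
TODAY, YESTERDAY = "2024-01-02", "2024-01-01"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (col1 INTEGER, col2 TEXT, run_time TEXT)")
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?)",
    [
        (1, "x", YESTERDAY),
        (1, "x", TODAY),  # already present yesterday, so not new
        (2, "y", TODAY),  # first seen today
    ],
)

# Approach 1: keep the col1/col2 pairs whose earliest run_time is today.
new_by_group = conn.execute(
    """
    SELECT col1, col2
    FROM table_a
    WHERE run_time IN (?, ?)
    GROUP BY col1, col2
    HAVING MIN(run_time) = ?
    """,
    (TODAY, YESTERDAY, TODAY),
).fetchall()

# Approach 2: keep today's rows that have no counterpart from yesterday.
new_by_not_exists = conn.execute(
    """
    SELECT a.col1, a.col2
    FROM table_a a
    WHERE a.run_time = ?
      AND NOT EXISTS (SELECT 1
                      FROM table_a a2
                      WHERE a2.run_time = ?
                        AND a2.col1 = a.col1 AND a2.col2 = a.col2)
    """,
    (TODAY, YESTERDAY),
).fetchall()

print(new_by_group, new_by_not_exists)
```

Both return only the (2, 'y') pair, because run_time is left out of the comparison.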

Related

How to exclude all rows with the same ID based on one record's value in psql?

Say I have the results above, and want to exclude all rows with ID of 14010497 because at least one of the rows has a date of 2/25. How would I filter it down? Using a WHERE table.end_date > '2019-02-25' would still include the row with a date of 2-23
Try something like this:
select * from your_table
where id not in (
select distinct id
from your_table
where end_date > '2019-02-25'
)
/
I would use not exists:
select t.*
from t
where not exists (select 1
from t t2
where t2.id = t.id and t2.end_date = '2019-02-25'
);
I strongly advise using not exists over not in because it handles NULL values much more intuitively. NOT IN will return no rows at all if any value in the subquery is NULL.
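To see the NULL behaviour, here is a small sketch using Python's sqlite3 module. The rows are invented, and both queries filter on end_date = '2019-02-25'. The row with a NULL id is there only to put a NULL into the NOT IN subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, end_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (1, "2019-02-23"),
        (1, "2019-02-25"),     # this row should knock out every id = 1 row
        (2, "2019-02-20"),
        (None, "2019-02-25"),  # NULL id that ends up in the NOT IN subquery
    ],
)

# The subquery yields {1, NULL}; "id NOT IN (1, NULL)" is never true.
not_in_rows = conn.execute(
    """
    SELECT id, end_date FROM t
    WHERE id NOT IN (SELECT id FROM t WHERE end_date = '2019-02-25')
    ORDER BY end_date
    """
).fetchall()

# NOT EXISTS compares row by row, so the NULL cannot poison the result.
not_exists_rows = conn.execute(
    """
    SELECT t.id, t.end_date FROM t
    WHERE NOT EXISTS (SELECT 1 FROM t t2
                      WHERE t2.id = t.id AND t2.end_date = '2019-02-25')
    ORDER BY t.end_date
    """
).fetchall()

print(not_in_rows)
print(not_exists_rows)
```

NOT IN returns nothing at all, while NOT EXISTS keeps the id = 2 row. It also keeps the NULL-id row, since a NULL id matches no other id.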

Different WHERE clause depending on subquery result

I would like to SELECT WHERE column IS NULL or =value depending on result of subquery.
Here is an example incorrect solution that demonstrates the problem:
SELECT *
FROM table
WHERE column=(
SELECT (CASE WHEN COUNT(*) = COUNT(COLUMN) THEN MIN(column) END)
FROM table
)
When the subquery returns NULL the other query will return nothing because column=NULL is never true. How do I fix this?
(Subquery source: https://stackoverflow.com/a/51341498/7810882)
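Just add OR column IS NULL to the WHERE clause.
The query then returns the rows that match the subquery result, plus the rows where column IS NULL. The extra condition adds nothing when the subquery returns a value, because the subquery is non-NULL only when the column contains no NULLs at all.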
SELECT *
FROM table
WHERE column= (
SELECT (CASE WHEN COUNT(*) = COUNT(COLUMN) THEN MIN(column) END)
FROM table
) OR column IS NULL
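A quick way to check this is with Python's sqlite3 module. The sketch below is an illustration only: t and val are hypothetical names standing in for table and column (which are reserved words), and select_rows is a helper invented for the example.

```python
import sqlite3

# Same scalar subquery as above, on a hypothetical table t with column val.
SUBQUERY = "(SELECT CASE WHEN COUNT(*) = COUNT(val) THEN MIN(val) END FROM t)"


def select_rows(values, with_null_branch):
    """Load values into a fresh table and run the query, optionally with the fix."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (val INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
    sql = f"SELECT val FROM t WHERE val = {SUBQUERY}"
    if with_null_branch:
        sql += " OR val IS NULL"
    return conn.execute(sql).fetchall()


# Column contains a NULL: the subquery yields NULL and "val = NULL" matches nothing.
without_fix = select_rows([3, 5, None], with_null_branch=False)
# With the extra branch, the NULL row comes back.
with_fix = select_rows([3, 5, None], with_null_branch=True)
# No NULLs in the column: the subquery yields MIN(val) and the branch is inert.
no_nulls = select_rows([3, 5], with_null_branch=True)

print(without_fix, with_fix, no_nulls)
```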
If you are only looking for one row, I would suggest:
select t.*
from table t
order by column nulls first
fetch first 1 row only;

How to get not equal rows in SQL query

I have two tables and I want to fetch the rows that are not equal. How do I write the query?
For example, table a contains 10 rows and table b contains 10 rows.
5 rows are equal in a and b.
I want the rows of a that are not equal (that is, not in table b).
How do I fetch the rows of table a that have no equal row in table b?
The result should be 5 records.
To take rows in A but not in B:
select * from A minus select * from B
To take rows that are in A or in B, but not in both:
(select * from A union select * from B) minus (select * from A intersect select * from B)
This problem has been solved long ago. The optimal solution only reads each table once (unlike the "symmetric difference" solution which reads each table twice and does some additional work).
select max(source) as source, col1, col2, ...
from (
  select 'A' as source, col1, col2, ...
  from table_A
  union all
  select 'B' as source, col1, col2, ...
  from table_B
)
group by col1, col2, ...
having count(*) = 1
;
If a row is present in both tables, then the count will be 2.
This assumes there are no duplicate rows in either table; if there may be duplicate rows, the HAVING condition can be modified, for example:
having count(case when source = 'A' then 1 end) = 0
or count(case when source = 'B' then 1 end) = 0
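Here is a runnable sketch of the single-pass comparison, using Python's sqlite3 module and two invented three-row tables. The UNION ALL is wrapped in a derived table so that the GROUP BY applies to both halves, and MAX(source) reports which table the unmatched row came from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (col1 INTEGER, col2 TEXT)")
conn.execute("CREATE TABLE table_b (col1 INTEGER, col2 TEXT)")
# Invented sample data: rows 2 and 3 are shared, 1 and 4 are not.
conn.executemany("INSERT INTO table_a VALUES (?, ?)", [(1, "x"), (2, "y"), (3, "z")])
conn.executemany("INSERT INTO table_b VALUES (?, ?)", [(2, "y"), (3, "z"), (4, "w")])

# Rows present in both tables form groups of size 2 and are filtered out.
unmatched = conn.execute(
    """
    SELECT MAX(source) AS source, col1, col2
    FROM (
        SELECT 'A' AS source, col1, col2 FROM table_a
        UNION ALL
        SELECT 'B' AS source, col1, col2 FROM table_b
    )
    GROUP BY col1, col2
    HAVING COUNT(*) = 1
    ORDER BY col1
    """
).fetchall()

# Restrict to rows that are in A but not in B.
only_in_a = [row for row in unmatched if row[0] == "A"]

print(unmatched)
print(only_in_a)
```

Like the query above, this assumes neither table contains duplicate rows.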
Use EXCEPT. The syntax is similar to INTERSECT:
https://www.tutorialspoint.com/sql/sql-intersect-clause.htm

Using a value from one query in second query sql

SELECT AS, COUNT(*)
FROM Table1
HAVING COUNT(AS)>1
group BY AS;
This produces the result
AS COUNT
5 2
I then want to use the AS value in another query and output only the end result. Is this possible? I was thinking of something like:
SELECT *
FROM
TABLE 2
Where AS =(
SELECT AS, COUNT(*)
FROM Table1
HAVING COUNT(AS)>1
group BY AS;
);
This is called a subquery. To be safe, you would use IN instead of =, because the subquery may return more than one row (and AS is a bad name for a column, because it is a SQL keyword):
SELECT *
FROM TABLE2
WHERE col IN (SELECT col
FROM Table1
GROUP BY col
HAVING COUNT(col) > 1
);
Your first query is also incorrect, because the having clause goes after the group by.
You could use a subquery with the in operator:
SELECT *
FROM table2
WHERE AS IN (SELECT AS
FROM table1
GROUP BY AS
HAVING COUNT(*) > 1)
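For a quick check, the IN subquery can be run with Python's sqlite3 module. The data is invented to match the question's output (value 5 appears twice), the column is called col rather than AS, and the label column in table2 is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col INTEGER)")
conn.execute("CREATE TABLE table2 (col INTEGER, label TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?)", [(5,), (5,), (7,)])
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)", [(5, "a"), (7, "b"), (5, "c")]
)

# The inner query on its own, with HAVING placed after GROUP BY.
repeated = conn.execute(
    "SELECT col, COUNT(*) FROM table1 GROUP BY col HAVING COUNT(col) > 1"
).fetchall()

# The outer query keeps the table2 rows whose col is one of those values.
matches = conn.execute(
    """
    SELECT *
    FROM table2
    WHERE col IN (SELECT col
                  FROM table1
                  GROUP BY col
                  HAVING COUNT(col) > 1)
    ORDER BY label
    """
).fetchall()

print(repeated)
print(matches)
```

Note that the subquery selects only col. The COUNT(*) column from the original query has to be dropped, because IN compares against a single column.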

Select 10 records after id=somevalue, (id>somevalue) and select first 10 records if id=somevalue doesn't exist

I have an SQL query which selects 10 records after id = somevalue, but I want to select the first 10 records if that record doesn't exist. The query has the structure below.
SELECT * FROM TABLE WHERE ID > x ORDER BY METRIC LIMIT 10
Provided, id here is a varchar field which is sorted based on some field.
This comes close to what you want:
SELECT *
FROM TABLE
ORDER BY (CASE WHEN ID > X THEN 1 ELSE 0 END) DESC,
METRIC
LIMIT 10
It will always return 10 records (assuming you have at least 10 records in the table). It will put the ones with id > x first. If there are not enough of those, then it will fill in with other records.
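The fill-in behaviour can be seen in a small sketch with Python's sqlite3 module. The items table, its five rows, and the next_ids helper are all made up for illustration, and the limit is 3 instead of 10 to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT, metric INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("a", 5), ("b", 1), ("c", 4), ("d", 2), ("e", 3)],
)


def next_ids(x, limit=3):
    """Return ids with id > x first (by metric), then fill up with the rest."""
    rows = conn.execute(
        """
        SELECT id
        FROM items
        ORDER BY (CASE WHEN id > ? THEN 1 ELSE 0 END) DESC,
                 metric
        LIMIT ?
        """,
        (x, limit),
    ).fetchall()
    return [row[0] for row in rows]


# Two ids are greater than 'c'; the third slot is filled by the lowest metric.
after_c = next_ids("c")
# Nothing is greater than 'z', so this is simply the first three by metric.
after_z = next_ids("z")

print(after_c, after_z)
```

With x = 'c' the result is d, e, b: the two matching rows first, then one filler row. That is why the answer says it only comes close to what was asked.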
This will also work:
SELECT TOP 10 col1, col2
FROM #yourtable
WHERE col1 > #ID
UNION ALL
SELECT TOP 10 col1, col2
FROM #yourtable
WHERE NOT EXISTS (SELECT * FROM #yourtable WHERE col1 = #ID)
However, this assumes you have an ID that you can query on using greater than/less than to retrieve the desired "next ten" records. Also, you would probably need to add an "ORDER BY" clause to ensure the records have the desired values.