SQL UNION vs OR, INTERSECT vs AND - sql

I would like to know what the difference is between an INTERSECT and an AND, as well as between a UNION and an OR.
Is there a specific scenario where one is recommended over the other, and could I always use OR/AND instead of UNION/INTERSECT?

Use AND or OR between terms in a WHERE clause. If the complete boolean expression evaluates as true, then the row is included in the query's result set.
WHERE country = 'Canada' AND age > 21
Use INTERSECT or UNION between SELECT queries. If a row appears in both result sets, or either result set, respectively, then the row is included in the compound query's result set.
SELECT customer_id FROM archived_orders
UNION
SELECT customer_id FROM recent_orders
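A minimal sketch of the distinction, using Python's built-in sqlite3 module so it can be run as-is. The people table and all sample rows are made up for illustration; only the archived_orders and recent_orders names come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (name TEXT, country TEXT, age INTEGER);
    INSERT INTO people VALUES
        ('Ann', 'Canada', 30), ('Bob', 'Canada', 18), ('Cid', 'France', 40);
    CREATE TABLE archived_orders (customer_id INTEGER);
    CREATE TABLE recent_orders (customer_id INTEGER);
    INSERT INTO archived_orders VALUES (1), (2), (2);
    INSERT INTO recent_orders VALUES (2), (3);
""")


def rows(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# AND / OR combine conditions that are tested against each single row.
and_rows = rows(
    "SELECT name FROM people WHERE country = 'Canada' AND age > 21 ORDER BY name"
)

# UNION / INTERSECT combine whole result sets, here from two different
# tables, which a WHERE clause on one table cannot express. Both also
# remove duplicate rows from the combined result.
union_rows = rows(
    "SELECT customer_id FROM archived_orders "
    "UNION SELECT customer_id FROM recent_orders ORDER BY customer_id"
)
intersect_rows = rows(
    "SELECT customer_id FROM archived_orders "
    "INTERSECT SELECT customer_id FROM recent_orders ORDER BY customer_id"
)

print(and_rows, union_rows, intersect_rows)
```

Note that customer 2 appears twice in archived_orders but only once in the UNION result, which is something an OR in a WHERE clause would not do on its own.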

Related

Oracle case statement not returning values for no row results

I have a simple case statement as follows:
select
case WHEN upper(VALUE) is null then 'A_VALUE_ANYWAY' end test
FROM
V$SYSTEM_PARAMETER
WHERE
UPPER(VALUE)= 'NO_VALUE_IS_HERE'
This code is designed to return 'A_VALUE_ANYWAY' because there is no output from the SQL.
However, it does not return anything at all.
Essentially, what I would like is a value being forced to return from the case statement instead of just no rows.
Am I able to do that with the case statement? Is there some form of no data found handler I should be using instead?
I have examined this similar question but this is very complex and does not seem possible with this more simple statement
SQL CASE Statement for no data
Also, this, which uses a union to get a value from dual:
Select Case, when no data return
Which seems like a "fudge". I feel like there must be some designed way to handle no data being found in a case statement.
From Oracle 12, you can use the FETCH syntax so that you do not have to query the table multiple times:
SELECT value
FROM (
SELECT value,
1 AS priority
FROM V$SYSTEM_PARAMETER
WHERE UPPER(VALUE)= 'NO_VALUE_IS_HERE'
UNION ALL
SELECT 'A_VALUE_ANYWAY',
2
FROM DUAL
ORDER BY priority
FETCH FIRST ROW WITH TIES
)
What you are asking is: "if my query returns no rows, I want to see a value". That cannot be solved with a case expression. A case expression transforms the results of your query. If there are no results, nothing can be transformed. Instead you could modify your query and union it with another select from dual that returns a string only if the query itself returns no results. That way one part of the UNION ALL will always return something.
SELECT
VALUE
FROM
V$SYSTEM_PARAMETER
WHERE
UPPER(VALUE)= 'NO_VALUE_IS_HERE'
UNION ALL
SELECT 'A_VALUE_ANYWAY'
FROM
DUAL
WHERE NOT EXISTS (SELECT 1
FROM
V$SYSTEM_PARAMETER
WHERE UPPER(VALUE)= 'NO_VALUE_IS_HERE')
This is the same technique as in the SQL Case statement for no data question.
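The UNION ALL plus NOT EXISTS fallback can be tried outside Oracle as well. The following is only a sketch in Python with sqlite3: system_parameter is a hypothetical stand-in for V$SYSTEM_PARAMETER, the sample row is invented, and SQLite needs no DUAL table because it allows a SELECT without FROM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for V$SYSTEM_PARAMETER.
    CREATE TABLE system_parameter (name TEXT, value TEXT);
    INSERT INTO system_parameter VALUES ('cursor_sharing', 'EXACT');
""")

# First branch: the real query. Second branch: a single fallback row that
# is produced only when the real query would return nothing.
FALLBACK_QUERY = """
    SELECT value FROM system_parameter WHERE UPPER(value) = :wanted
    UNION ALL
    SELECT 'A_VALUE_ANYWAY'
    WHERE NOT EXISTS (
        SELECT 1 FROM system_parameter WHERE UPPER(value) = :wanted
    )
"""


def lookup(wanted):
    """Return matching values, or the fallback string if there are none."""
    return [row[0] for row in conn.execute(FALLBACK_QUERY, {"wanted": wanted})]


missing = lookup("NO_VALUE_IS_HERE")
found = lookup("EXACT")
print(missing, found)
```

The price of this approach is that the filter is written (and evaluated) twice, which is what the priority/FETCH FIRST variant tries to avoid.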

Do 'set operations' have a prescribed order of execution, or do they execute in order of evaluation?

Do set operations have a prescribed order of execution (e.g. first UNION, then MINUS, then INTERSECT), or do they execute in the order of which they are scripted and evaluated?
For example, let's say I want to have a starting cohort of customer_ids, then remove some, and then add some back in. Will the set operators execute here as Qry 1 minus Qry 2 union Qry 3?
select cust_id from tbl A
MINUS
select cust_id from tbl B where field = 'abc'
UNION
select cust_id from tbl A where field = 'xyz'
Since you didn't specify the RDBMS, I'll add SQL Server for completeness. This is the order of operations:
1. Expressions in parentheses
2. The INTERSECT operator
3. EXCEPT (the equivalent of Oracle's MINUS) and UNION, evaluated from left to right based on their position in the expression
In Oracle, all set operators have equal precedence. The documentation says:
If a SQL statement contains multiple set operators, then Oracle Database evaluates them from the left to right unless parentheses explicitly specify another order.
Well, not exactly "order of execution". SQL queries represent the result set.
They specify neither the exact operations being run nor the order of execution.
That said, there is an order of precedence for set operations. So, your query is going to be interpreted as:
(select cust_id from tbl A
MINUS
select cust_id from tbl B where field = 'abc'
)
UNION
select cust_id from tbl A where field = 'xyz'
This is specified by -- or more accurately, interpreted from -- the ANSI rules on set operations.
Just because the query is interpreted this way does not mean that it is executed this way.
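To see that the grouping matters, here is a small sketch in Python with sqlite3. SQLite spells MINUS as EXCEPT and, like the Oracle behavior quoted above, groups compound operators from left to right. The tables a and b and their rows are invented, and because SQLite does not accept bare parentheses around a compound branch, the alternative grouping is written as a subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (cust_id INTEGER, field TEXT);
    CREATE TABLE b (cust_id INTEGER, field TEXT);
    INSERT INTO a VALUES (1, 'xyz'), (2, 'other'), (3, 'other');
    INSERT INTO b VALUES (1, 'abc'), (2, 'abc');
""")

# No parentheses: interpreted as (A EXCEPT B) UNION C.
left_to_right = conn.execute("""
    SELECT cust_id FROM a
    EXCEPT
    SELECT cust_id FROM b WHERE field = 'abc'
    UNION
    SELECT cust_id FROM a WHERE field = 'xyz'
    ORDER BY cust_id
""").fetchall()

# Forced grouping A EXCEPT (B UNION C), expressed with a subquery.
union_first = conn.execute("""
    SELECT cust_id FROM a
    EXCEPT
    SELECT cust_id FROM (
        SELECT cust_id FROM b WHERE field = 'abc'
        UNION
        SELECT cust_id FROM a WHERE field = 'xyz'
    )
    ORDER BY cust_id
""").fetchall()

print(left_to_right, union_first)
```

With this data, customer 1 is removed by the EXCEPT and then added back by the UNION in the first query, but stays removed in the second.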

Confused syntax in Where clause

what does the line (rowid,0) mean in the following query
select * from emp
WHERE (ROWID,0) in (
select rowid, mod(rownum,2) from emp
);
I don't get the line WHERE (ROWID,0).
What is it?
Thanks in advance.
IN clause in Oracle SQL can support column groups. You can do things like this:
select ...
from tab1
where (tab1.col1, tab1.col2) in (
select tab2.refcol1, tab2.refcol2
from tab2
)
That can be useful in many cases.
In your particular case, the subquery uses mod(rownum,2) as its second expression. Since there is no order by, rownum is assigned in whichever order the database retrieves the rows - that might be a full table scan or a fast full index scan.
Because of the mod, the rows in the subquery alternate between the values 1 and 0.
The IN clause then filters on second value in the subquery being equal to 0. The end result is that this query retrieves half of your employees. Which half will depend on which access path the optimizer chooses.
Not sure what dialect of SQL you're using, but it appears that since the subquery in the IN clause has two columns in its select list, (ROWID,0) indicates which values align with the subquery's columns. I have never seen multiple columns in an IN statement's select list before.
This is a syntax used by some databases (but not all) that allows you to do in with multiple values.
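With in, this is conceptually the same as the following (in Oracle the rewrite is not literally equivalent, because rownum in a correlated subquery is assigned separately for each outer row):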
where exists (select 1
from emp e2
where e2.rowid = emp.rowid and
mod(rownum, 2) = 0
)
I should note that if you are using Oracle (which allows this syntax), then you are using rownum in a subquery with no order by. The results are going to be rather arbitrary. However, the intention seems to be to return every other row, in some sense.
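The multi-column IN itself is not Oracle-only. As a hedged illustration, SQLite (3.15 and later, which is an assumption about the installed version) accepts the same row-value syntax, so it can be tried from Python. The tab1/tab2 names follow the generic example above; the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (col1 INTEGER, col2 TEXT);
    CREATE TABLE tab2 (refcol1 INTEGER, refcol2 TEXT);
    INSERT INTO tab1 VALUES (1, 'a'), (1, 'b'), (2, 'a');
    INSERT INTO tab2 VALUES (1, 'a'), (2, 'b');
""")

# Row-value IN: the pair (col1, col2) must match one subquery row as a whole.
pair_matches = conn.execute("""
    SELECT col1, col2 FROM tab1
    WHERE (col1, col2) IN (SELECT refcol1, refcol2 FROM tab2)
    ORDER BY col1, col2
""").fetchall()

# Two independent IN tests are weaker: each column only has to appear
# somewhere in tab2, not necessarily in the same row.
column_matches = conn.execute("""
    SELECT col1, col2 FROM tab1
    WHERE col1 IN (SELECT refcol1 FROM tab2)
      AND col2 IN (SELECT refcol2 FROM tab2)
    ORDER BY col1, col2
""").fetchall()

print(pair_matches, column_matches)
```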

Combining two SQL SELECT statements on the same table

I would like to combine these two SQL queries:
SELECT * FROM "Contracts" WHERE
"productType" = 'RINsell' AND
"clearTime" IS NULL AND
"holdTime" IS NOT NULL
ORDER BY "generationTime";
and
SELECT * FROM "Contracts" WHERE
"productType" = 'RINsell' AND
"clearTime" IS NULL AND
"holdTime" IS NULL
ORDER BY "contractLimitPrice";
When I run each statement, I get exactly the results I want; I would just like both results sequentially. My first thought was to use UNION ALL since these selections will be disjoint, but I found that you can't use a UNION after an ORDER BY. I've searched quite a bit and most people suggest doing the ORDER BY after the UNION, but each query has different ORDER BY conditions.
If you want the results of the first query before the results of the second, you can remove "holdTime" from the where clause, and use an order by like
order by
case when "holdTime" is not null then 0 else 1 end, --first query comes first
case when "holdTime" is not null then "generationTime" end, --order within the first query
case when "holdTime" is null then "contractLimitPrice" end --order within the second query
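A runnable sketch of the single-query ordering idea, in Python with sqlite3 rather than Postgres. The contracts table, its snake_case column names and the sample rows are all made up for illustration. It uses one CASE per sort key, so a timestamp column and a price column never have to share a single expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contracts (
        id INTEGER,
        hold_time TEXT,
        generation_time TEXT,
        contract_limit_price REAL
    );
    INSERT INTO contracts VALUES
        (1, NULL,         '2020-01-03', 5.0),
        (2, '2020-02-01', '2020-01-02', 9.0),
        (3, NULL,         '2020-01-01', 2.0),
        (4, '2020-02-02', '2020-01-01', 1.0);
""")

ordered_ids = [
    row[0]
    for row in conn.execute("""
        SELECT id FROM contracts
        ORDER BY
            -- held contracts first, then the rest
            CASE WHEN hold_time IS NOT NULL THEN 0 ELSE 1 END,
            -- held contracts are sorted by generation time
            CASE WHEN hold_time IS NOT NULL THEN generation_time END,
            -- the remaining contracts are sorted by limit price
            CASE WHEN hold_time IS NULL THEN contract_limit_price END
    """)
]
print(ordered_ids)
```

Within each group the other group's sort key is NULL for every row, so it does not disturb the ordering.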
... but I found that you can't use a UNION after an ORDER BY.
Well, you didn't look hard enough:
(
SELECT *
FROM "Contracts"
WHERE "productType" = 'RINsell'
AND "clearTime" IS NULL
AND "holdTime" IS NOT NULL
ORDER BY "generationTime"
)
UNION ALL
(
SELECT *
FROM "Contracts"
WHERE "productType" = 'RINsell'
AND "clearTime" IS NULL
AND "holdTime" IS NULL
ORDER BY "contractLimitPrice"
)
Note the parentheses. Per documentation:
(ORDER BY and LIMIT can be attached to a subexpression if it is
enclosed in parentheses. Without parentheses, these clauses will be
taken to apply to the result of the UNION, not to its right-hand input
expression.)
Closely related answer:
Sum results of a few queries and then find top 5 in SQL
Aside: I really would get rid of those CaMeL case identifiers. Your life is much easier with all-lower case legal identifiers in Postgres.

What is the difference between HAVING and WHERE in SQL?

What is the difference between HAVING and WHERE in an SQL SELECT statement?
EDIT: I have marked Steven's answer as the correct one as it contained the key bit of information on the link:
When GROUP BY is not used, HAVING behaves like a WHERE clause
The situation I had seen the WHERE in did not have GROUP BY and is where my confusion started. Of course, until you know this you can't specify it in the question.
HAVING: is used to check conditions after the aggregation takes place.
WHERE: is used to check conditions before the aggregation takes place.
This code:
select City, CNT=Count(1)
From Address
Where State = 'MA'
Group By City
Gives you a table of all cities in MA and the number of addresses in each city.
This code:
select City, CNT=Count(1)
From Address
Where State = 'MA'
Group By City
Having Count(1)>5
Gives you a table of cities in MA with more than 5 addresses and the number of addresses in each city.
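The two queries can be compared side by side with a small sqlite3 sketch in Python. The address rows are invented, the threshold is lowered from 5 to 2 to keep the sample small, and the T-SQL style CNT=Count(1) is written as COUNT(*) AS cnt, which SQLite understands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address (city TEXT, state TEXT);
    INSERT INTO address VALUES
        ('Boston', 'MA'), ('Boston', 'MA'), ('Boston', 'MA'),
        ('Salem', 'MA'),
        ('Albany', 'NY'), ('Albany', 'NY'), ('Albany', 'NY');
""")

# WHERE removes the non-MA rows before any grouping happens.
all_ma_cities = conn.execute("""
    SELECT city, COUNT(*) AS cnt
    FROM address
    WHERE state = 'MA'
    GROUP BY city
    ORDER BY city
""").fetchall()

# HAVING then discards whole groups, based on the aggregated count.
busy_ma_cities = conn.execute("""
    SELECT city, COUNT(*) AS cnt
    FROM address
    WHERE state = 'MA'
    GROUP BY city
    HAVING COUNT(*) > 2
    ORDER BY city
""").fetchall()

print(all_ma_cities, busy_ma_cities)
```

Albany has three addresses too, but it never reaches the HAVING step because WHERE has already removed its rows.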
HAVING specifies a search condition for a
group or an aggregate function used in SELECT statement.
Source
Number one difference for me: if HAVING was removed from the SQL language then life would go on more or less as before. Certainly, a minority of queries would need to be rewritten using a derived table, CTE, etc. but they would arguably be easier to understand and maintain as a result. Maybe vendors' optimizer code would need to be rewritten to account for this, again an opportunity for improvement within the industry.
Now consider for a moment removing WHERE from the language. This time the majority of queries in existence would need to be rewritten without an obvious alternative construct. Coders would have to get creative, e.g. inner join to a table known to contain exactly one row (e.g. DUAL in Oracle) using the ON clause to simulate the prior WHERE clause. Such constructions would be contrived; it would be obvious that something was missing from the language and the situation would be worse as a result.
TL;DR we could lose HAVING tomorrow and things would be no worse, possibly better, but the same cannot be said of WHERE.
From the answers here, it seems that many folk don't realize that a HAVING clause may be used without a GROUP BY clause. In this case, the HAVING clause is applied to the entire table expression and requires that only constants appear in the SELECT clause. Typically the HAVING clause will involve aggregates.
This is more useful than it sounds. For example, consider this query to test whether the name column is unique for all values in T:
SELECT 1 AS result
FROM T
HAVING COUNT( DISTINCT name ) = COUNT( name );
There are only two possible results: if the HAVING clause is true then the result will be a single row containing the value 1, otherwise the result will be the empty set.
The HAVING clause was added to SQL because the WHERE keyword could not be used with aggregate functions.
Check out this w3schools link for more information
Syntax:
SELECT column_name, aggregate_function(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name
HAVING aggregate_function(column_name) operator value
A query such as this:
SELECT column_name, COUNT( column_name ) AS column_name_tally
FROM table_name
WHERE column_name < 3
GROUP
BY column_name
HAVING COUNT( column_name ) >= 3;
...may be rewritten using a derived table (and omitting the HAVING) like this:
SELECT column_name, column_name_tally
FROM (
SELECT column_name, COUNT(column_name) AS column_name_tally
FROM table_name
WHERE column_name < 3
GROUP
BY column_name
) pointless_range_variable_required_here
WHERE column_name_tally >= 3;
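As a quick check that the two forms agree, here is a sketch in Python with sqlite3. The sample values are invented; table_name and column_name are kept from the queries above, and the derived table gets the hypothetical alias grouped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (column_name INTEGER);
    INSERT INTO table_name VALUES (1), (1), (1), (2), (2), (5), (5), (5);
""")

# Filter the groups with HAVING.
with_having = conn.execute("""
    SELECT column_name, COUNT(column_name) AS column_name_tally
    FROM table_name
    WHERE column_name < 3
    GROUP BY column_name
    HAVING COUNT(column_name) >= 3
    ORDER BY column_name
""").fetchall()

# Same filter, applied as a WHERE on a derived table instead.
with_derived_table = conn.execute("""
    SELECT column_name, column_name_tally
    FROM (
        SELECT column_name, COUNT(column_name) AS column_name_tally
        FROM table_name
        WHERE column_name < 3
        GROUP BY column_name
    ) AS grouped
    WHERE column_name_tally >= 3
    ORDER BY column_name
""").fetchall()

print(with_having, with_derived_table)
```

Value 5 occurs three times but is dropped by the inner WHERE; value 2 passes the WHERE but its group is too small.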
The difference between the two is in the relationship to the GROUP BY clause:
WHERE comes before GROUP BY; SQL evaluates the WHERE clause before it groups records.
HAVING comes after GROUP BY; SQL evaluates HAVING after it groups records.
References
SQLite SELECT Statement Syntax/Railroad Diagram
Informix SELECT Statement Syntax/Railroad Diagram
HAVING is used when you want to filter on an aggregate in a query that uses GROUP BY.
SELECT edc_country, COUNT(*)
FROM Ed_Centers
GROUP BY edc_country
HAVING COUNT(*) > 1
ORDER BY edc_country;
WHERE is applied as a limitation on the set returned by SQL; it uses SQL's built-in set operations and indexes and is therefore usually the fastest way to filter result sets. Use WHERE whenever possible.
HAVING is necessary for some aggregate filters. It filters the query AFTER SQL has retrieved and grouped the rows. Therefore, it is generally slower than WHERE and should be avoided except in those situations that require it.
SQL Server will let you get away with using HAVING even when WHERE would be much faster. Don't do it.
The WHERE clause does not work with aggregate functions.
That means you should not write a query like this
(bonus is the table name):
SELECT name
FROM bonus
GROUP BY name
WHERE sum(salary) > 200
Here, instead of the WHERE clause, you have to use HAVING:
SELECT name
FROM bonus
GROUP BY name
HAVING sum(salary) > 200
Without a GROUP BY clause, the HAVING clause works just like a WHERE clause.
Difference between the WHERE and HAVING clauses:
The main difference is that WHERE filters individual rows, while HAVING filters groups of rows.
Why do we need the HAVING clause?
Aggregate functions are computed over groups of rows, so they cannot be used in a WHERE clause, which is evaluated row by row before grouping. Therefore, conditions on aggregate functions go in the HAVING clause.
One way to think of it is that the having clause is an additional filter to the where clause.
A WHERE clause is used to filter records from a result. The filter occurs before any groupings are made. A HAVING clause is used to filter values from a group.
In an aggregate query (any query where an aggregate function is used), predicates in a WHERE clause are evaluated before the aggregated intermediate result set is generated.
Predicates in a HAVING clause are applied to the aggregate result set AFTER it has been generated. That's why predicate conditions on aggregate values must be placed in the HAVING clause, not in the WHERE clause, and why some databases (MySQL, for example) let you use aliases defined in the SELECT clause in a HAVING clause, but not in a WHERE clause.
I had a problem and found another difference between WHERE and HAVING: they do not act the same way on indexed columns.
WHERE my_indexed_row = 123 will show rows and automatically perform a "ORDER ASC" on other indexed rows.
HAVING my_indexed_row = 123 shows everything from the oldest "inserted" row to the newest one, no ordering.
When GROUP BY is not used, the WHERE and HAVING clauses are essentially equivalent.
However, when GROUP BY is used:
The WHERE clause is used to filter records from a result. The
filtering occurs before any groupings are made.
The HAVING clause is used to filter values from a group (i.e., to
check conditions after aggregation into groups has been performed).
Resource from Here
From here.
the SQL standard requires that HAVING
must reference only columns in the
GROUP BY clause or columns used in
aggregate functions
as opposed to the WHERE clause which is applied to database rows
While working on a project, this was also my question. As stated above, HAVING checks a condition on the grouped result the query has already produced, whereas WHERE checks a condition on each row as the query reads it.
Let me give an example to illustrate this. Suppose you have a database table like this.
usertable{ int userid, date datefield, int dailyincome }
Suppose, the following rows are in table:
1, 2011-05-20, 100
1, 2011-05-21, 50
1, 2011-05-30, 10
2, 2011-05-30, 10
2, 2011-05-20, 20
Now, we want to get the userids and sum(dailyincome) whose sum(dailyincome)>100
If we write:
SELECT userid, sum(dailyincome) FROM usertable WHERE
sum(dailyincome)>100 GROUP BY userid
This will be an error. The correct query would be:
SELECT userid, sum(dailyincome) FROM usertable GROUP BY userid HAVING
sum(dailyincome)>100
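Both statements can be tried against the sample rows above with Python's sqlite3 module. This is only a sketch: SQLite stores the date as text here, and the exact error message for the WHERE version differs between databases, so the code just records that an error occurred.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE usertable (userid INTEGER, datefield TEXT, dailyincome INTEGER);
    INSERT INTO usertable VALUES
        (1, '2011-05-20', 100),
        (1, '2011-05-21', 50),
        (1, '2011-05-30', 10),
        (2, '2011-05-30', 10),
        (2, '2011-05-20', 20);
""")

# An aggregate inside WHERE is rejected, because WHERE is evaluated
# row by row before the groups (and their sums) exist.
try:
    conn.execute("""
        SELECT userid, SUM(dailyincome) FROM usertable
        WHERE SUM(dailyincome) > 100 GROUP BY userid
    """)
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

# The same condition in HAVING is checked once per group, after summing.
having_rows = conn.execute("""
    SELECT userid, SUM(dailyincome) FROM usertable
    GROUP BY userid HAVING SUM(dailyincome) > 100
""").fetchall()

print(where_error, having_rows)
```

User 1 totals 160 and is returned; user 2 totals 30 and is filtered out.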
WHERE clause is used for comparing values in the base table, whereas the HAVING clause can be used for filtering the results of aggregate functions in the result set of the query
I use HAVING for constraining a query based on the results of an aggregate function, e.g. select * from blahblahblah group by SOMETHING having count(SOMETHING)>0