Order of execution of a SQL query: exception for SELECT/HAVING

I understand that the order of execution is as follows:
FROM
ON
JOIN
WHERE
GROUP BY
WITH CUBE or WITH ROLLUP
HAVING
SELECT
DISTINCT
ORDER BY
TOP
according to this SO answer as well as the Microsoft documentation.
However, in my query below, the column total is built on the fly and is later used in the having clause. This would mean that having executes AFTER select and not before, because the column 'total' does not exist in the Orders table.
Am I interpreting it wrong or simply missing something?
Query
select customer_id,
sum(CASE
WHEN product_name = 'A' THEN 1
WHEN product_name = 'B' THEN 1
WHEN product_name = 'C' THEN -1
ELSE 0 END
) as total
from Orders
group by customer_id
having total > 1;
Orders table
+------------+-------------+--------------+
| order_id | customer_id | product_name |
+------------+-------------+--------------+
| 10 | 1 | A |
| 20 | 1 | B |
| 30 | 1 | D |
| 40 | 1 | C |
| 50 | 2 | A |
| 60 | 3 | A |
| 70 | 3 | B |
| 80 | 3 | D |
| 90 | 4 | C |
+------------+-------------+--------------+
Result
+-------------+-------+
| customer_id | total |
+-------------+-------+
| 3 | 2 |
+-------------+-------+

What you have described is NOT the "order of execution". It is the order of scoping for identifiers defined in the query.
It says that an identifier defined in the from clause is known in the clauses beneath it. Similarly, an identifier defined in the select is not recognized in the having. I should note that many databases do allow the having clause to use column aliases. SQL Server is not one of them.
SQL is a descriptive language, not a procedural language. That means that a query describes the result set. It does not state the steps used to generate the result. The compiler and optimizer produce the execution plan, which looks nothing like the original query.
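To make the scoping point concrete, here is a minimal sketch that runs the question's query twice through Python's built-in sqlite3 module. SQLite is only an assumption here, chosen because it is one of the databases that accepts a select alias in having; the second form repeats the aggregate expression, which is the variant that does not rely on alias scoping. The variable names (total_expr, alias_rows, portable_rows) are made up for illustration.

```python
import sqlite3

# In-memory database loaded with the Orders rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (order_id INTEGER, customer_id INTEGER, product_name TEXT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(10, 1, "A"), (20, 1, "B"), (30, 1, "D"), (40, 1, "C"), (50, 2, "A"),
     (60, 3, "A"), (70, 3, "B"), (80, 3, "D"), (90, 4, "C")],
)

# The aggregate from the question, kept in one place so it can be reused.
total_expr = """SUM(CASE
        WHEN product_name = 'A' THEN 1
        WHEN product_name = 'B' THEN 1
        WHEN product_name = 'C' THEN -1
        ELSE 0 END)"""

# Variant 1: HAVING refers to the SELECT alias (SQLite accepts this).
alias_rows = conn.execute(
    f"SELECT customer_id, {total_expr} AS total FROM Orders "
    "GROUP BY customer_id HAVING total > 1"
).fetchall()

# Variant 2: HAVING repeats the expression, so no alias lookup is needed.
portable_rows = conn.execute(
    f"SELECT customer_id, {total_expr} AS total FROM Orders "
    f"GROUP BY customer_id HAVING {total_expr} > 1"
).fetchall()

print(alias_rows)
print(portable_rows)
```

Both variants describe the same result set; which one a given database accepts is a question of identifier scoping, not of the order in which anything is executed.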

Related

How to select from a table with additional where clause on a single column

I'm having trouble formulating a SQL query in Oracle. Here's my sample table:
+----+-----------+-----------+--------+
| id | start | end | number |
+----+-----------+-----------+--------+
| 1 | 21-dec-19 | 03-jan-20 | 12 |
| 2 | 23-dec-19 | 05-jan-20 | 10 |
| 3 | 02-jan-20 | 15-jan-20 | 9 |
| 4 | 09-jan-20 | NULL | 11 |
+----+-----------+-----------+--------+
And here's what I have so far:
SELECT
SUM(number) AS total_number,
SUM(number) AS total_ended_number -- (WHERE end IS NOT NULL)
FROM table
WHERE ... -- a lot of where clauses
And the desired result:
+--------------+--------------------+
| total_number | total_ended_number |
+--------------+--------------------+
| 42 | 31 |
+--------------+--------------------+
I understand I could do a separate select inside 'total_ended_number', but the initial select has a bunch of where clauses already which would need to be applied to the internal select as well.
I'm capable of formulating it in 2 separate selects or 2 nested selects with all the where clauses duplicated, but my intended goal is to not duplicate the where clauses that would both be used on the table.
You could sum over a case expression with this logic:
SELECT
SUM(number) AS total_number,
SUM(CASE WHEN end IS NOT NULL THEN number END) AS total_ended_number
FROM table
WHERE ... -- a lot of where clauses
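Alternatively, quote the column name and add else 0, so that the sum is 0 rather than NULL when no matching row has an end date:
SUM(case when "end" is not null then number else 0 end) AS total_ended_number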
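A small runnable sketch of the conditional sum, using Python's sqlite3 module instead of Oracle purely for convenience. The table name periods and the stand-in filter number > 0 are assumptions, since the question elides both the real table name and its where clauses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "start" and "end" are quoted to keep them from being parsed as keywords.
conn.execute(
    'CREATE TABLE periods (id INTEGER, "start" TEXT, "end" TEXT, number INTEGER)'
)
conn.executemany(
    "INSERT INTO periods VALUES (?, ?, ?, ?)",
    [(1, "21-dec-19", "03-jan-20", 12),
     (2, "23-dec-19", "05-jan-20", 10),
     (3, "02-jan-20", "15-jan-20", 9),
     (4, "09-jan-20", None, 11)],
)

# One pass over the table: the WHERE clause is written once and both sums share it.
totals = conn.execute(
    """
    SELECT SUM(number) AS total_number,
           SUM(CASE WHEN "end" IS NOT NULL THEN number END) AS total_ended_number
    FROM periods
    WHERE number > 0  -- stand-in for the question's real filters
    """
).fetchone()

# When no selected row has an end date, the two spellings differ:
# without ELSE the sum is NULL, with ELSE 0 it is 0.
open_only = conn.execute(
    """
    SELECT SUM(CASE WHEN "end" IS NOT NULL THEN number END),
           SUM(CASE WHEN "end" IS NOT NULL THEN number ELSE 0 END)
    FROM periods
    WHERE id = 4
    """
).fetchone()

print(totals, open_only)
```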

Make a query making groups on the same result row

I have two tables. Like this.
select * from extrafieldvalues;
+----------------------------+
| id | value | type | idItem |
+----------------------------+
| 1 | 100 | 1 | 10 |
| 2 | 150 | 2 | 10 |
| 3 | 101 | 1 | 11 |
| 4 | 90 | 2 | 11 |
+----------------------------+
select * from items
+------------+
| id | name |
+------------+
| 10 | foo |
| 11 | bar |
+------------+
I need to make a query and get something like this:
+--------------------------------------+
| idItem | valtype1 | valtype2 | name |
+--------------------------------------+
| 10 | 100 | 150 | foo |
| 11 | 101 | 90 | bar |
+--------------------------------------+
The quantity of types of extra field values is variable, but every item ALWAYS uses every extra field.
If you have only two fields, then left join is an option for this:
select i.*, efv1.value as value_1, efv2.value as value_2
from items i left join
extrafieldvalues efv1
on efv1.iditem = i.id and
efv1.type = 1 left join
extrafieldvalues efv2
on efv2.iditem = i.id and
efv2.type = 2 ;
In terms of performance, two joins are probably faster than an aggregation -- and it makes it easier to bring in more columns from items. On the other hand, conditional aggregation generalizes more easily, and the performance changes little as more columns from extrafieldvalues are added to the select.
Use conditional aggregation:
select iditem,
max(case when type=1 then value end) as valtype1,
max(case when type=2 then value end) as valtype2,
name
from extrafieldvalues a inner join items b on a.iditem=b.id
group by iditem,name
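Both approaches can be compared side by side in a quick sketch with Python's sqlite3 module (an assumption made only so the example runs without a server). Because the question says the number of types is variable, the aggregation query builds one column per distinct type; that generation step and the names pivot_cols, join_rows and agg_rows are illustrative additions, not part of either answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE extrafieldvalues (id INTEGER, value INTEGER, type INTEGER, idItem INTEGER)"
)
conn.executemany("INSERT INTO items VALUES (?, ?)", [(10, "foo"), (11, "bar")])
conn.executemany(
    "INSERT INTO extrafieldvalues VALUES (?, ?, ?, ?)",
    [(1, 100, 1, 10), (2, 150, 2, 10), (3, 101, 1, 11), (4, 90, 2, 11)],
)

# Approach 1: one LEFT JOIN per type, each join using its own alias.
join_rows = conn.execute(
    """
    SELECT i.id, efv1.value, efv2.value, i.name
    FROM items i
    LEFT JOIN extrafieldvalues efv1 ON efv1.idItem = i.id AND efv1.type = 1
    LEFT JOIN extrafieldvalues efv2 ON efv2.idItem = i.id AND efv2.type = 2
    ORDER BY i.id
    """
).fetchall()

# Approach 2: conditional aggregation, one MAX(CASE ...) per distinct type.
# The type values are integers read from the table itself; int() guards the
# string formatting, which would otherwise be unsafe for arbitrary input.
types = [row[0] for row in conn.execute(
    "SELECT DISTINCT type FROM extrafieldvalues ORDER BY type")]
pivot_cols = ", ".join(
    f"MAX(CASE WHEN e.type = {int(t)} THEN e.value END) AS valtype{int(t)}"
    for t in types
)
agg_rows = conn.execute(
    f"""
    SELECT e.idItem, {pivot_cols}, i.name
    FROM extrafieldvalues e
    JOIN items i ON e.idItem = i.id
    GROUP BY e.idItem, i.name
    ORDER BY e.idItem
    """
).fetchall()

print(join_rows)
print(agg_rows)
```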

How to group table SQL ORACLE table with specific rows also present

I have table like this:
Area | Client | Month
a | A | 1
a | B | 1
b | C | 1
a | A | 2
b | B | 2
How can I group and rollup this table, to achieve results like below:
Area | Client | Month | Count
a | A | 1 | 1
a | B | 1 | 1
a | | 1 | 2
b | C | 1 | 1
b | | 1 | 1
| | 1 | 3
a | A | 2 | 1
a | B | 2 | 1
a | | 2 | 2
| | 2 | 2
| | | 5
I would like to count clients by area and months, but to also list client column. I'm having hard time using "group by" with client column present.
I would also like to "order by" month, but with summaries properly ordered too.
I prefer grouping sets to cube or rollup, because it is more flexible.
However, the key to using them is that you need an aggregation. So, I think you want a fourth column:
select area, client, month, count(*) as cnt
from t
group by grouping sets ( (area, client, month), (area, month), (month), () );
Oracle already has rollup and cube grouping extensions for this kind of query. Use:
select area, client, month, count(*) as cnt
from mytable
group by rollup(month, area, client)
order by month, area, client;
or, to get subtotals for every combination of the grouped columns:
select area, client, month, count(*) as cnt
from mytable
group by cube(area, client, month)
order by area, client, month;
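To see what the subtotal rows actually contain, the sketch below spells out the four grouping levels as a UNION ALL of plain GROUP BY queries. This is an emulation, not the Oracle syntax: it runs on SQLite through Python's sqlite3 module, which has no ROLLUP or GROUPING SETS, and the table name t and variable names are assumptions. The data is the question's sample table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (area TEXT, client TEXT, month INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("a", "A", 1), ("a", "B", 1), ("b", "C", 1), ("a", "A", 2), ("b", "B", 2)],
)

# Each branch is one grouping set; NULL marks the columns that were rolled up.
rollup_sql = """
SELECT area, client, month, COUNT(*) AS cnt FROM t GROUP BY area, client, month
UNION ALL
SELECT area, NULL, month, COUNT(*) FROM t GROUP BY area, month
UNION ALL
SELECT NULL, NULL, month, COUNT(*) FROM t GROUP BY month
UNION ALL
SELECT NULL, NULL, NULL, COUNT(*) FROM t
"""
rollup_rows = conn.execute(rollup_sql).fetchall()

for row in rollup_rows:
    print(row)
```

Note that the counts come from count(*): summing the month column would only look right for month 1.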

SQL Using Ungrouped Columns in SELECT statement

I have a GROUP BY Query which appears to use non-aggregated data not in the GROUP BY clause, which I thought would not work.
I was asked to write a query which converted the following data:
| item | type | cost | category |
|------|------|------|----------|
| 1 | X | 10 | A |
| 1 | Y | 20 | A |
| 2 | X | 30 | B |
| 2 | Y | 40 | B |
| 3 | X | 50 | C |
| 3 | Y | 60 | C |
| 4 | X | 70 | D |
| 4 | Y | 80 | D |
into this:
| item | x | y | category |
|------|----|----|----------|
| 1 | 10 | 20 | A |
| 2 | 30 | 40 | B |
| 3 | 50 | 60 | C |
| 4 | 70 | 80 | D |
Note:
The incoming data is clearly not normalised
The item is meant to be unique, but it is repeated for each type value
The category is the same for rows of the same item
I ended up with the following solution:
SELECT
item,
sum(CASE WHEN type='X' THEN cost END) as X,
sum(CASE WHEN type='Y' THEN cost END) as Y,
category
FROM data
GROUP BY item,category;
What surprised me is that it worked. What surprised me more is that it works for PostgreSQL, MariaDB (ANSI Mode), Microsoft SQL and SQLite.
Note:
- I have included category in the GROUP BY simply to allow it to appear in the SELECT clause.
- I have used the sum() function, even though there will only be one value, also simply to include it in the SELECT clause.
I thought I would not be able to use the type column in the SELECT clause because it is not in the GROUP BY and it is not aggregated. Indeed, if I try to select it by itself, the query will fail.
The question is, how is it that I can use the type column with the CASE operator, when I can’t use it by itself?
Your usage of the "ungrouped" columns is perfectly fine.
The rule is: "Every expression in the SELECT list must either be an aggregate function or be part of the GROUP BY".
The column type is used inside an aggregate. sum(CASE WHEN type='X' THEN cost END) as X is not really different to sum(cost) or max(type).
This becomes more obvious if you use the standard SQL filter option:
sum(CASE WHEN type='X' THEN cost END)
is the same as:
sum(cost) filter (where type = 'X')
However, only a few DBMS support this standard syntax.
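The equivalence of the two spellings can be checked with a short sketch on SQLite via Python's sqlite3 module. The assumption here is that the bundled SQLite is version 3.30 or newer, which is when aggregate FILTER support was added; the sketch checks the version and skips the FILTER query otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (item INTEGER, type TEXT, cost INTEGER, category TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [(1, "X", 10, "A"), (1, "Y", 20, "A"), (2, "X", 30, "B"), (2, "Y", 40, "B"),
     (3, "X", 50, "C"), (3, "Y", 60, "C"), (4, "X", 70, "D"), (4, "Y", 80, "D")],
)

# type only ever appears inside an aggregate, so the GROUP BY rule is satisfied.
case_rows = conn.execute(
    """
    SELECT item,
           SUM(CASE WHEN type = 'X' THEN cost END) AS x,
           SUM(CASE WHEN type = 'Y' THEN cost END) AS y,
           category
    FROM data
    GROUP BY item, category
    ORDER BY item
    """
).fetchall()

# The same query with the standard FILTER clause, where the engine supports it.
filter_rows = None
if sqlite3.sqlite_version_info >= (3, 30, 0):
    filter_rows = conn.execute(
        """
        SELECT item,
               SUM(cost) FILTER (WHERE type = 'X') AS x,
               SUM(cost) FILTER (WHERE type = 'Y') AS y,
               category
        FROM data
        GROUP BY item, category
        ORDER BY item
        """
    ).fetchall()

print(case_rows)
print(filter_rows)
```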

SQL Group by one column and decide which column to choose

Let's say I have data like this :
| id | code | name | number |
-----------------------------------------------
| 1 | 20 | A | 10 |
| 2 | 20 | B | 20 |
| 3 | 10 | C | 30 |
| 4 | 10 | D | 80 |
I would like to group rows by code value, but get real rows back (not some aggregate function).
I know that just
select *
from table
group by code
won't work because the database doesn't know which row to return when the code is the same.
So my question is how to tell the database to select (for example) the row with the lowest number, so in my case
| id | code | name | number |
-----------------------------------------------
| 1 | 20 | A | 10 |
| 3 | 10 | C | 30 |
P.S.
I know how to do this by PARTITION but this is only allowed in Oracle databases and can't be created in JPA criteria builder (which is my ultimate goal).
Why don't you use code like this?
SELECT
id,
code,
name,
number
FROM
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY code ORDER BY number ASC) AS RowNo
FROM table
) s
WHERE s.RowNo = 1
You can also look at this site: Data Partitioning
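For comparison, here is a sketch that runs the ROW_NUMBER query next to a correlated-subquery version that uses no window function. The subquery form is a different technique from the answer above, added only because it may be easier to express where PARTITION is unavailable. It runs on SQLite through Python's sqlite3 module; the table name t and the result variable names are assumptions, and the window query is skipped on SQLite builds older than 3.25.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, code INTEGER, name TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 20, "A", 10), (2, 20, "B", 20), (3, 10, "C", 30), (4, 10, "D", 80)],
)

# Window-function version: number the rows within each code, keep the first.
window_rows = None
if sqlite3.sqlite_version_info >= (3, 25, 0):
    window_rows = conn.execute(
        """
        SELECT id, code, name, number
        FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (PARTITION BY code ORDER BY number ASC) AS RowNo
            FROM t
        ) s
        WHERE s.RowNo = 1
        ORDER BY id
        """
    ).fetchall()

# Correlated-subquery version: keep rows whose number is the minimum for their code.
# Unlike ROW_NUMBER, this returns every tied row if a code has duplicate minimums.
subquery_rows = conn.execute(
    """
    SELECT id, code, name, number
    FROM t AS outer_t
    WHERE number = (SELECT MIN(number) FROM t AS inner_t
                    WHERE inner_t.code = outer_t.code)
    ORDER BY id
    """
).fetchall()

print(window_rows)
print(subquery_rows)
```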