How to group by two tables with CTE - sql

I have three tables (Oracle):
sales_order
-------------
int so_key (pk)
int part_key (fk)
int condition_key (fk)
number unit_price
int qty_ordered
number unit_cost
date entry_date
quote
-------------
int q_key (pk)
int part_key (fk)
int condition_key (fk)
number unit_price
int qty_quoted
date entry_date
stock
-------------
int stock_key (pk)
int part_key (fk)
int condition_key (fk)
int qty_available
number unit_cost
And all three have foreign key references to these two tables:
part
-------------
int part_key (pk)
condition
-------------
int condition_key (pk)
I am writing a query that will aggregate the data into rows grouped by part and condition. However, I am unable to figure out how to group by BOTH part and condition. Here is the (functional) query that I have that groups by part only:
WITH
ctePart_Quotes AS
(
SELECT q.part_key
, COUNT(*) AS quotes_count
, SUM(q.unit_price * q.qty_quoted) AS quotes_amt_total
, SUM(q.qty_quoted) AS quotes_qty_total
FROM quote q
WHERE q.entry_date BETWEEN TO_DATE('01-Jan-2011', 'dd-mm-yyyy') AND TO_DATE('31-Dec-2011', 'dd-mm-yyyy')
GROUP BY q.part_key
)
, ctePart_Sales AS
(
SELECT so.part_key
, COUNT(*) AS sales_count
, SUM(so.unit_price * so.qty_ordered) AS sales_amt_total
, SUM(so.qty_ordered) AS sales_qty_total
, SUM(so.qty_ordered * so.unit_cost) AS cost_total
FROM sales_order so
WHERE so.entry_date BETWEEN TO_DATE('01-Jan-2011', 'dd-mm-yyyy') AND TO_DATE('31-Dec-2011', 'dd-mm-yyyy')
GROUP BY so.part_key
)
, ctePart_Stock AS
(
SELECT stm.part_key
, SUM(stm.qty_available) AS total_available
, SUM(stm.qty_available * stm.unit_cost) AS inv_cost
FROM stock stm
GROUP BY stm.part_key
)
SELECT p.part_key,
part_stock.total_available,
part_stock.inv_cost,
sales.sales_amt_total,
sales.sales_qty_total,
sales.sales_count,
sales.cost_total,
quotes.quotes_amt_total,
quotes.quotes_qty_total,
quotes.quotes_count
FROM part p
LEFT OUTER JOIN ctePart_Quotes quotes
ON quotes.part_key = p.part_key
LEFT OUTER JOIN ctePart_Sales sales
ON sales.part_key = p.part_key
LEFT OUTER JOIN ctePart_Stock part_stock
ON part_stock.part_key = p.part_key
WHERE NOT(sales_amt_total IS NULL
AND sales_qty_total IS NULL
AND sales_count IS NULL
AND cost_total IS NULL
AND quotes_amt_total IS NULL
AND quotes_qty_total IS NULL
AND quotes_count IS NULL)
AND SALES_AMT_TOTAL > 10000
This query produces this output (totals grouped by part_key):
part_key | total_available | inv_cost | sales_amt_total | ...
---------|-----------------|----------|-----------------| ...
234 | 59 | 4923.90 | 29403.48 | ...
185 | 21 | 192.64 | 9034.95 | ...
102 | 102 | 8738.34 | 50382.20 | ...
...
But I'm trying to modify the query to produce this (totals grouped by part_key and condition_key):
part_key | condition_key | total_available | inv_cost | sales_amt_total | ...
---------|---------------|-----------------|----------|-----------------| ...
234 | 3 | 24 | 2360.50 | 16947.18 | ...
234 | 7 | 35 | 2563.40 | 12456.30 | ...
...
How do you do this?
EDIT: for clarification:
The complexity lies in: how do you join the condition in the final select? Because you are selecting FROM part but the condition relationship is through the other tables (sales_order, etc.). So you'd have to join through each of the tables (LEFT OUTER JOIN condition cond ON quotes.condition_key = cond.condition_key, etc.) but those joins would each be separate columns.
EDIT #2: someone provided a good image of the data model that illustrates the (proper, legitimate) relationship between part/condition but also the subtle complexity faced in this problem:

The main problem here seems to be your data model. Converting the table "descriptions" in your question into DDL code, and reverse-engineering that into a relational model (using Oracle SQL Developer Data Modeler), we find something like this:
DDL code
create table part ( part_key number primary key ) ;
create table condition ( condition_key number primary key ) ;
create table sales_order (
so_key number generated always as identity start with 3000 primary key
, part_key number references part
, condition_key number references condition
, unit_price number
, qty_ordered number
, unit_cost number
, entry_date date ) ;
create table quote (
q_key number generated always as identity start with 4000 primary key
, part_key number references part
, condition_key number references condition
, qty_quoted number
, unit_price number
, entry_date date );
create table stock (
stock_key number generated always as identity start with 5000 primary key
, part_key number references part
, condition_key number references condition
, qty_available number
, unit_cost number ) ;
Relational model (Oracle SQL Developer Data Modeler)
Looking at the model, it becomes clear that each PART can have several CONDITIONs. Thus, you may need to decide which condition you are referring to, and that may not be easy. Suppose we have a part with part_key 1000. We can record 3 different conditions and use a different one in each of the 3 tables mentioned in your query.
-- one part, 3 conditions
begin
insert into part ( part_key ) values ( 1000 ) ;
insert into condition( condition_key ) values ( 2001 ) ;
insert into condition( condition_key ) values ( 2002 ) ;
insert into condition( condition_key ) values ( 2003 ) ;
insert into sales_order ( part_key, condition_key ) values ( 1000, 2001 ) ;
insert into quote ( part_key, condition_key ) values ( 1000, 2002 ) ;
insert into stock ( part_key, condition_key ) values ( 1000, 2003 ) ;
end ;
/
Which one of the 3 conditions is supposed to be used for the query? Hard to tell.
-- not using WITH (subquery factoring) here - for clarity
select
P.part_key
, SO.condition_key
, Q.condition_key
, ST.condition_key
from part P
join sales_order SO on SO.part_key = P.part_key
join quote Q on Q.part_key = P.part_key
join stock ST on ST.part_key = P.part_key
;
-- output
PART_KEY CONDITION_KEY CONDITION_KEY CONDITION_KEY
1000 2001 2002 2003
Well, we could pick one of the conditions, couldn't we? However, even more conditions can exist for one and the same part ...
begin
insert into condition( condition_key ) values ( 2004 ) ;
insert into condition( condition_key ) values ( 2005 ) ;
insert into condition( condition_key ) values ( 2006 ) ;
insert into sales_order ( part_key, condition_key ) values ( 1000, 2004 ) ;
insert into quote ( part_key, condition_key ) values ( 1000, 2005 ) ;
insert into stock ( part_key, condition_key ) values ( 1000, 2006 ) ;
end ;
/
-- Same query as above now gives us:
PART_KEY CONDITION_KEY CONDITION_KEY CONDITION_KEY
1000 2001 2005 2006
1000 2001 2005 2003
1000 2001 2002 2006
1000 2001 2002 2003
1000 2004 2005 2006
1000 2004 2005 2003
1000 2004 2002 2006
1000 2004 2002 2003
Conclusion: Fix your data model. (We know this is sometimes easier said than done ...) Then, it will make sense to do some more work on your query.
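The row multiplication shown above can be reproduced outside Oracle. The following is a minimal sketch using Python's built-in sqlite3 module (an assumption made for portability; the original uses Oracle). The function name `condition_combinations` is hypothetical. It loads part 1000 with a different condition in every row of each table and counts the rows produced when the three tables are joined on part_key only.

```python
import sqlite3


def condition_combinations(rows_per_table: int) -> int:
    """Count rows produced by joining the three tables on part_key only."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE sales_order (part_key INT, condition_key INT);
        CREATE TABLE quote       (part_key INT, condition_key INT);
        CREATE TABLE stock       (part_key INT, condition_key INT);
    """)
    # Part 1000 gets a distinct condition in every row of every table
    # (2001/2002/2003, then 2004/2005/2006, ... as in the inserts above).
    for i in range(rows_per_table):
        con.execute("INSERT INTO sales_order VALUES (1000, ?)", (2001 + 3 * i,))
        con.execute("INSERT INTO quote VALUES (1000, ?)", (2002 + 3 * i,))
        con.execute("INSERT INTO stock VALUES (1000, ?)", (2003 + 3 * i,))
    (count,) = con.execute("""
        SELECT COUNT(*)
        FROM sales_order so
        JOIN quote q  ON q.part_key  = so.part_key
        JOIN stock st ON st.part_key = so.part_key
    """).fetchone()
    con.close()
    return count


for n in (1, 2, 3):
    print(n, "rows per table ->", condition_combinations(n), "joined rows")
```

With n rows per table the join returns n * n * n rows, which is why the second run above lists 8 combinations for a single part.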
__Update__
Now that we know that nothing can be done about the tables and the constraints, maybe the following queries will give you a starting point. We do not have proper test data, so let's just add some random values to the tables ...
-- PART and CONDITION -> 1000 integers each
begin
for i in 1 .. 1000
loop
insert into part ( part_key ) values ( i ) ;
insert into condition( condition_key ) values ( i ) ;
end loop;
end ;
/
Table QUOTE
-- part_key 12 and part_key 18 each appear twice
SQL> select * from quote ;
Q_KEY PART_KEY CONDITION_KEY QTY_QUOTED UNIT_PRICE ENTRY_DATE
4000 10 100 55 500 01-MAY-11
4001 12 120 55 500 01-MAY-11
4002 12 37 56 501 01-MAY-11
4003 14 140 55 500 01-MAY-11
4004 15 46 56 501 01-MAY-11
4005 16 160 55 500 01-MAY-11
4006 18 180 55 500 01-MAY-11
4007 18 55 56 501 01-MAY-11
4008 20 200 55 500 01-MAY-11
Table SALES_ORDER
SQL> select * from sales_order ;
SO_KEY PART_KEY CONDITION_KEY UNIT_PRICE QTY_ORDERED UNIT_COST ENTRY_DATE
3000 10 100 500 55 400 05-MAY-11
3001 12 120 500 55 400 05-MAY-11
3002 14 140 500 55 400 05-MAY-11
3003 16 160 500 55 400 05-MAY-11
3004 18 180 500 55 400 05-MAY-11
3005 20 200 500 55 400 05-MAY-11
Table STOCK
SQL> select * from stock ;
STOCK_KEY PART_KEY CONDITION_KEY QTY_AVAILABLE UNIT_COST
5000 10 100 10 400
5001 12 120 10 400
5002 14 140 10 400
5003 14 100 12 402
5004 16 160 10 400
5005 18 180 10 400
5006 20 200 10 400
Assuming that only valid part/condition combinations are recorded, we can use FULL OUTER JOINs to get a first picture.
SQL> select
Q.part_key q_part , Q.condition_key q_cond
, SO.part_key so_part, SO.condition_key so_cond
, ST.part_key st_part, ST.condition_key st_cond
from quote Q
full join sales_order SO
on SO.part_key = Q.part_key and SO.condition_key = Q.condition_key
full join stock ST
on ST.part_key = SO.part_key and ST.condition_key = SO.condition_key
;
-- result
Q_PART Q_COND SO_PART SO_COND ST_PART ST_COND
10 100 10 100 10 100
12 120 12 120 12 120
12 37 NULL NULL NULL NULL
14 140 14 140 14 140
15 46 NULL NULL NULL NULL
16 160 16 160 16 160
18 180 18 180 18 180
18 55 NULL NULL NULL NULL
20 200 20 200 20 200
NULL NULL NULL NULL 14 100
Then, we can use analytic functions for the various calculations. Note that we do not use GROUP BY here; the grouping is done via ... partition by Q.part_key, Q.condition_key ... (More about analytic functions: Oracle documentation, and examples here).
-- Skeleton query ...
-- Note that you will need to write over(...) several times.
-- Add a WHERE clause and conditions as required.
select
Q.part_key as q_part, Q.condition_key as q_cond,
count( Q.part_key ) over ( partition by Q.part_key, Q.condition_key ) as q_count
--
-- Q example sums
-- , sum( Q.unit_price * Q.qty_quoted )
-- over ( partition by Q.part_key, Q.condition_key ) as qat -- quotes_amt_total
-- , sum( Q.qty_quoted )
-- over ( partition by Q.part_key, Q.condition_key ) as qqt -- quotes_qty_total
--
, SO.part_key as so_part, SO.condition_key as so_cond
, count( SO.part_key ) over ( partition by SO.part_key, SO.condition_key ) as so_count
--
-- SO sums here
--
, ST.part_key as st_part, ST.condition_key as st_cond
, count( ST.part_key ) over ( partition by ST.part_key, ST.condition_key ) as st_count
from sales_order SO
full join quote Q
on SO.part_key = Q.part_key and SO.condition_key = Q.condition_key
full join stock ST
on ST.part_key = SO.part_key and ST.condition_key = SO.condition_key
-- where ...
;
Result
-- output
Q_PART Q_COND Q_COUNT SO_PART SO_COND SO_COUNT ST_PART ST_COND ST_COUNT
10 100 1 10 100 1 10 100 1
12 37 1 NULL NULL 0 NULL NULL 0
12 120 1 12 120 1 12 120 1
14 140 1 14 140 1 14 140 1
15 46 1 NULL NULL 0 NULL NULL 0
16 160 1 16 160 1 16 160 1
18 55 1 NULL NULL 0 NULL NULL 0
18 180 1 18 180 1 18 180 1
20 200 1 20 200 1 20 200 1
NULL NULL 0 NULL NULL 0 14 100 1

The trick is to first create a Cartesian product (the condition table only has ~30 rows), and maybe suppress the unwanted result rows later.
This may look sub-optimal, but it avoids a join on COALESCE()d key fields, which could perform badly.
WITH
ctePart_Quotes AS
(
SELECT q.part_key, q.condition_key
, COUNT(*) AS quotes_count
, SUM(q.unit_price * q.qty_quoted) AS quotes_amt_total
, SUM(q.qty_quoted) AS quotes_qty_total
FROM quote q
WHERE q.entry_date BETWEEN TO_DATE('01-Jan-2011', 'dd-mm-yyyy') AND TO_DATE('31-Dec-2011', 'dd-mm-yyyy')
GROUP BY q.part_key, q.condition_key
)
, ctePart_Sales AS
(
SELECT so.part_key, so.condition_key
, COUNT(*) AS sales_count
, SUM(so.unit_price * so.qty_ordered) AS sales_amt_total
, SUM(so.qty_ordered) AS sales_qty_total
, SUM(so.qty_ordered * so.unit_cost) AS cost_total
FROM sales_order so
WHERE so.entry_date BETWEEN TO_DATE('01-Jan-2011', 'dd-mm-yyyy') AND TO_DATE('31-Dec-2011', 'dd-mm-yyyy')
GROUP BY so.part_key, so.condition_key
)
, ctePart_Stock AS
(
SELECT stm.part_key, stm.condition_key
, SUM(stm.qty_available) AS total_available
, SUM(stm.qty_available * stm.unit_cost) AS inv_cost
FROM stock stm
GROUP BY stm.part_key, stm.condition_key
)
SELECT p.part_key,
c.condition_key,
part_stock.total_available,
part_stock.inv_cost,
sales.sales_amt_total,
sales.sales_qty_total,
sales.sales_count,
sales.cost_total,
quotes.quotes_amt_total,
quotes.quotes_qty_total,
quotes.quotes_count
FROM part p
CROSS JOIN condition c -- <<-- Here
LEFT OUTER JOIN ctePart_Quotes quotes
ON quotes.part_key = p.part_key
AND quotes.condition_key = c.condition_key -- <<-- Here
LEFT OUTER JOIN ctePart_Sales sales
ON sales.part_key = p.part_key
AND sales.condition_key = c.condition_key -- <<-- Here
LEFT OUTER JOIN ctePart_Stock part_stock
ON part_stock.part_key = p.part_key
AND part_stock.condition_key = c.condition_key -- <<-- Here
WHERE NOT(sales_amt_total IS NULL
AND sales_qty_total IS NULL
AND sales_count IS NULL
AND cost_total IS NULL
AND quotes_amt_total IS NULL
AND quotes_qty_total IS NULL
AND quotes_count IS NULL) -- <<-- And maybe Here, too
AND SALES_AMT_TOTAL > 10000
;

Maybe this works for you:
WITH
ctePart_Quotes AS
(
SELECT q.part_key,
q.condition_key
, COUNT(*) AS quotes_count
, SUM(q.unit_price * q.qty_quoted) AS quotes_amt_total
, SUM(q.qty_quoted) AS quotes_qty_total
FROM quote q
WHERE q.entry_date BETWEEN TO_DATE('01-Jan-2011', 'dd-mm-yyyy') AND TO_DATE('31-Dec-2011', 'dd-mm-yyyy')
GROUP BY q.part_key,
q.condition_key
)
, ctePart_Sales AS
(
SELECT so.part_key,
so.condition_key
, COUNT(*) AS sales_count
, SUM(so.unit_price * so.qty_ordered) AS sales_amt_total
, SUM(so.qty_ordered) AS sales_qty_total
, SUM(so.qty_ordered * so.unit_cost) AS cost_total
FROM sales_order so
WHERE so.entry_date BETWEEN TO_DATE('01-Jan-2011', 'dd-mm-yyyy') AND TO_DATE('31-Dec-2011', 'dd-mm-yyyy')
GROUP BY so.part_key,
so.condition_key
)
, ctePart_Stock AS
(
SELECT stm.part_key,
stm.condition_key
, SUM(stm.qty_available) AS total_available
, SUM(stm.qty_available * stm.unit_cost) AS inv_cost
FROM stock stm
GROUP BY stm.part_key,
stm.condition_key
)
SELECT p.part_key,
cte.condition_key,
cte.total_available,
cte.inv_cost,
cte.sales_amt_total,
cte.sales_qty_total,
cte.sales_count,
cte.cost_total,
cte.quotes_amt_total,
cte.quotes_qty_total,
cte.quotes_count
FROM part p
LEFT OUTER JOIN (SELECT coalesce(quotes.part_key, sales.part_key, part_stock.part_key) part_key,
coalesce(quotes.condition_key, sales.condition_key, part_stock.condition_key) condition_key,
quotes.quotes_count,
quotes.quotes_amt_total,
quotes.quotes_qty_total,
sales.sales_count,
sales.sales_amt_total,
sales.sales_qty_total,
sales.cost_total,
part_stock.total_available,
part_stock.inv_cost
FROM ctePart_Quotes quotes
FULL JOIN ctePart_Sales sales
ON sales.part_key = quotes.part_key
AND sales.condition_key = quotes.condition_key
FULL JOIN ctePart_Stock part_stock
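ON part_stock.part_key = coalesce(quotes.part_key, sales.part_key) -- also match rows that have a quote but no sale
AND part_stock.condition_key = coalesce(quotes.condition_key, sales.condition_key)) cte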
ON cte.part_key = p.part_key
WHERE NOT(cte.sales_amt_total IS NULL
AND cte.sales_qty_total IS NULL
AND cte.sales_count IS NULL
AND cte.cost_total IS NULL
AND cte.quotes_amt_total IS NULL
AND cte.quotes_qty_total IS NULL
AND cte.quotes_count IS NULL)
AND cte.sales_amt_total > 10000;
It also groups by condition_key in the CTEs. It then FULL JOINs the CTEs together, using coalesce to compensate for null values of part_key or condition_key in the first tables. (There may be none, if every combination of part_key and condition_key that is present in one of the tables is also present in the other two.) The result is then LEFT JOINed to part using part_key.
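A different way to get one row per (part_key, condition_key) pair, without FULL JOIN or a cross join, is to collect the distinct key pairs with UNION and LEFT JOIN each aggregate to that key set. This is not what either answer above does; it is a minimal sketch of the same idea, run through Python's sqlite3 module (an assumption, since the question targets Oracle). The function name `totals_by_part_and_condition` and the CTE names are hypothetical, the date filter is omitted, and the rows are a subset of the sample data shown earlier.

```python
import sqlite3


def totals_by_part_and_condition():
    """Aggregate each table per (part_key, condition_key), then align on a shared key set."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE quote (part_key INT, condition_key INT, qty_quoted INT, unit_price INT);
        CREATE TABLE sales_order (part_key INT, condition_key INT, unit_price INT, qty_ordered INT);
        CREATE TABLE stock (part_key INT, condition_key INT, qty_available INT);
        INSERT INTO quote VALUES (12, 120, 55, 500), (12, 37, 56, 501);
        INSERT INTO sales_order VALUES (12, 120, 500, 55);
        INSERT INTO stock VALUES (12, 120, 10), (14, 100, 12);
    """)
    rows = con.execute("""
        WITH quotes_agg AS (
            SELECT part_key, condition_key,
                   SUM(unit_price * qty_quoted) AS quotes_amt_total
            FROM quote GROUP BY part_key, condition_key
        ), sales_agg AS (
            SELECT part_key, condition_key,
                   SUM(unit_price * qty_ordered) AS sales_amt_total
            FROM sales_order GROUP BY part_key, condition_key
        ), stock_agg AS (
            SELECT part_key, condition_key,
                   SUM(qty_available) AS total_available
            FROM stock GROUP BY part_key, condition_key
        ), all_keys AS (
            -- every pair that occurs in at least one of the three tables
            SELECT part_key, condition_key FROM quotes_agg
            UNION
            SELECT part_key, condition_key FROM sales_agg
            UNION
            SELECT part_key, condition_key FROM stock_agg
        )
        SELECT k.part_key, k.condition_key,
               st.total_available, s.sales_amt_total, q.quotes_amt_total
        FROM all_keys k
        LEFT JOIN quotes_agg q
               ON q.part_key = k.part_key AND q.condition_key = k.condition_key
        LEFT JOIN sales_agg s
               ON s.part_key = k.part_key AND s.condition_key = k.condition_key
        LEFT JOIN stock_agg st
               ON st.part_key = k.part_key AND st.condition_key = k.condition_key
        ORDER BY k.part_key, k.condition_key
    """).fetchall()
    con.close()
    return rows


for row in totals_by_part_and_condition():
    print(row)
```

Because every join is against the key set, a pair that exists only in quote or only in stock still produces exactly one output row, with NULLs (None in Python) for the missing totals.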

Related

Select most recent record (with expiration date)

Let's say that we have 2 tables named Records and Opportunities:
Records:
RecordID | CustomerID | CreateDate
777 | 1 | 1/1/2021
888 | 2 | 1/1/2021
999 | 1 | 2/1/2021
Opportunities:
OppID | CustomerID | OppCreateDate
10 | 1 | 12/31/2020
11 | 1 | 1/10/2021
12 | 2 | 2/1/2021
13 | 1 | 4/1/2021
14 | 1 | 8/5/2025
Desired Output:
RecordID | CustomerID | CreateDate | #Opportunities
777 | 1 | 1/1/2021 | 1
888 | 2 | 1/1/2021 | 1
999 | 1 | 2/1/2021 | 1
As you can see, the Records table provides the first 3 columns of the desired output, and the "#Opportunities" column is created by counting the number of opportunities that happen after the record is created for a given customer.
Two key things to note on this logic:
1. Only count opportunities when they occur within 6 months of a record.
2. If another record is created for a customer, only count opportunities for the most recent record.
More specifically, OppID = 11 will get credited to RecordID = 777; 12 to 888; and 13 to 999. 10 and 14 will not get credited to any RecordID.
I wrote the below code, which does not take into account #2 above:
CREATE TABLE #Records
(
RecordID int
, CustomerID int
, CreateDate Date
)
INSERT INTO #Records
VALUES
(777, 1, '2021-01-01')
, (888, 2, '2021-01-31')
, (999, 1, '2021-02-01')
CREATE TABLE #Opportunities
(
OppID int
, CustomerID int
, OppCreateDate Date
)
INSERT INTO #Opportunities
VALUES
(10, 1, '2020-12-31')
, (11, 1, '2021-01-10')
, (12, 2, '2021-02-01')
, (13, 1, '2021-04-01')
, (14, 1, '2025-08-25')
select *
from #Records
select *
from #Opportunities
select rec.*
, (select count(*)
from #Opportunities opp
where rec.CustomerID=opp.CustomerID
and rec.CreateDate<=opp.OppCreateDate --record happened on the same day or before the opportunity
and datediff(month,rec.CreateDate,opp.OppCreateDate) < 6 --opened and created within 6 months
) as [#Opportunities]
from #Records rec
Any suggestions to incorporate #2 above and generate the desired output?
Decide which #Records row each #Opportunities row belongs to based on #Records.CreateDate (the most recent qualifying record wins):
select RecordID, CustomerID, CreateDate, count(*) cnt
from (
select r.RecordID, r.CustomerID, r.CreateDate,
row_number() over(partition by op.OppID order by r.CreateDate desc) rn
from #records r
join #Opportunities op on r.CustomerID = op.CustomerID and datediff(month, r.CreateDate, op.OppCreateDate) < 6 and r.CreateDate <= op.OppCreateDate
) t
where rn = 1
group by RecordID, CustomerID, CreateDate
Returns
RecordID CustomerID CreateDate cnt
777 1 2021-01-01 1
888 2 2021-01-31 1
999 1 2021-02-01 1
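The attribution logic above can be tried without SQL Server. This is a minimal sketch using Python's sqlite3 module (an assumption; it needs a SQLite build with window functions, 3.25 or later). SQLite has no DATEDIFF, so the month-boundary difference is computed from strftime year and month parts, which mirrors what DATEDIFF(month, ...) counts. The function name `opportunities_per_record` and the snake_case column names are hypothetical; the data is the set from the question's script.

```python
import sqlite3


def opportunities_per_record():
    """Credit each opportunity to the latest qualifying record of the same customer."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE records (record_id INT, customer_id INT, create_date TEXT);
        INSERT INTO records VALUES
            (777, 1, '2021-01-01'), (888, 2, '2021-01-31'), (999, 1, '2021-02-01');
        CREATE TABLE opportunities (opp_id INT, customer_id INT, opp_create_date TEXT);
        INSERT INTO opportunities VALUES
            (10, 1, '2020-12-31'), (11, 1, '2021-01-10'), (12, 2, '2021-02-01'),
            (13, 1, '2021-04-01'), (14, 1, '2025-08-25');
    """)
    rows = con.execute("""
        SELECT record_id, customer_id, create_date, COUNT(*) AS cnt
        FROM (
            SELECT r.record_id, r.customer_id, r.create_date,
                   -- rn = 1 marks the most recent record that qualifies for this opportunity
                   ROW_NUMBER() OVER (PARTITION BY o.opp_id
                                      ORDER BY r.create_date DESC) AS rn
            FROM records r
            JOIN opportunities o
              ON o.customer_id = r.customer_id
             AND r.create_date <= o.opp_create_date
             -- month boundaries crossed, like DATEDIFF(month, ...) in T-SQL
             AND (CAST(strftime('%Y', o.opp_create_date) AS INTEGER)
                  - CAST(strftime('%Y', r.create_date) AS INTEGER)) * 12
                 + CAST(strftime('%m', o.opp_create_date) AS INTEGER)
                 - CAST(strftime('%m', r.create_date) AS INTEGER) < 6
        ) t
        WHERE rn = 1
        GROUP BY record_id, customer_id, create_date
        ORDER BY record_id
    """).fetchall()
    con.close()
    return rows


for row in opportunities_per_record():
    print(row)
```

Note that, as in the query above, the inner join drops any record that ends up with no credited opportunity; a LEFT JOIN from the records table to this result would be needed to show such records with a count of 0.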
Try this:
DECLARE @Records table ( RecordID int, CustomerID int, CreateDate date );
INSERT INTO @Records VALUES
( 777, 1, '2021-01-01' ), ( 888, 2, '2021-01-31' ), ( 999, 1, '2021-02-01' );
DECLARE @Opportunities table ( OppID int, CustomerID int, OppCreateDate date );
INSERT INTO @Opportunities VALUES
( 10, 1, '2020-12-31' )
, ( 11, 1, '2021-01-10' )
, ( 12, 2, '2021-02-01' )
, ( 13, 1, '2021-04-01' )
, ( 14, 1, '2025-08-25' );
SELECT
*
FROM @Records r
OUTER APPLY (
SELECT
COUNT ( * ) AS [#Opportunities]
FROM @Opportunities AS o
WHERE
o.CustomerID = r.CustomerID
AND o.OppCreateDate >= r.CreateDate
AND DATEDIFF ( month, r.CreateDate, o.OppCreateDate ) <= 6
AND o.OppID NOT IN (
SELECT
OppID
FROM @Records AS r2
INNER JOIN @Opportunities AS o2
ON r2.CustomerID = o2.CustomerID
WHERE
r2.CustomerID = o.CustomerID
AND o2.OppCreateDate >= r2.CreateDate
AND r2.RecordID > r.RecordID
)
) AS Opps
ORDER BY
r.RecordID;
RETURNS
+----------+------------+------------+----------------+
| RecordID | CustomerID | CreateDate | #Opportunities |
+----------+------------+------------+----------------+
| 777 | 1 | 2021-01-01 | 1 |
| 888 | 2 | 2021-01-31 | 1 |
| 999 | 1 | 2021-02-01 | 1 |
+----------+------------+------------+----------------+

How to calculate average in SQL?

Let's say I have the following table:
**FOOD** | **AMOUNT**
Bread | 2
Banana | 5
Pizza | 4
Apple | 57
Mandarin| 9
Orange | 8
Final result:
Bread | Percentage Of Total
Banana | percentage of total
etc
etc
I tried it in every single way, but couldn't find a solution. I hope someone can help me.
Using ANSI SQL (and SQL Server supports this syntax), you can do:
select food, sum(amount),
sum(amount) / sum(sum(amount)) over () as proportion_of_total
from t
group by food;
Note: Some databases do integer division, so you may need to convert to a floating point or fixed point type.
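The integer-division caveat is easy to see in practice. The sketch below uses Python's sqlite3 module (an assumption; SQLite is one of the databases that divides integers as integers, and the window function needs version 3.25 or later). Table name `t` follows the query above; the function names are hypothetical. The aggregation is done in a CTE first and the window SUM is applied to its result, which is equivalent to the nested sum(sum(amount)) over () form.

```python
import sqlite3


def integer_division_pitfall():
    """Return the same ratio computed with integer and with floating-point operands."""
    con = sqlite3.connect(":memory:")
    as_int = con.execute("SELECT 57 * 100 / 85").fetchone()[0]
    as_float = con.execute("SELECT 57 * 100.0 / 85").fetchone()[0]
    con.close()
    return as_int, as_float


def percent_of_total():
    """Percentage of the grand total per food, via a window SUM over the grouped rows."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE t (food TEXT, amount INT);
        INSERT INTO t VALUES ('Bread', 2), ('Banana', 5), ('Pizza', 4),
                             ('Apple', 57), ('Mandarin', 9), ('Orange', 8);
    """)
    rows = con.execute("""
        WITH per_food AS (
            SELECT food, SUM(amount) AS total FROM t GROUP BY food
        )
        SELECT food, total,
               -- 100.0 forces floating-point division
               ROUND(total * 100.0 / SUM(total) OVER (), 2) AS pct
        FROM per_food
        ORDER BY food
    """).fetchall()
    con.close()
    return rows


print(integer_division_pitfall())
for row in percent_of_total():
    print(row)
```

Multiplying by 100.0 (or casting one operand to a decimal type) before dividing is the usual way to avoid the truncated result.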
We can also try it like this:
DECLARE @tbl AS TABLE
(
food VARCHAR(15)
,amount INT
)
INSERT INTO @tbl VALUES
('bread', 2)
,('banana', 5)
,('pizza', 4)
,('apple', 57)
,('mandarin', 9)
,('orange', 8)
SELECT
DISTINCT
food
,SUM(amount) OVER() TotalAmount
,SUM(amount) OVER (PARTITION BY food) PerFoodTotal
,CAST(SUM(amount) OVER (PARTITION BY food) * 100. / (SUM(amount) OVER()) AS DECIMAL(10,2)) [Percentage Of Total]
FROM @tbl
OUTPUT
food TotalAmount PerFoodTotal Percentage Of Total
--------------- ----------- ------------ ---------------------------------------
apple 85 57 67.06
banana 85 5 5.88
bread 85 2 2.35
mandarin 85 9 10.59
orange 85 8 9.41
pizza 85 4 4.71
(6 row(s) affected)
You can try something like this:
declare @tbl as table (
food varchar(15)
,amount int
)
insert into @tbl values
('bread', 2)
,('banana', 5)
,('pizza', 4)
,('apple', 57)
,('mandarin', 9)
,('orange', 8)
select SUM(amount) from @tbl
select
food
,SUM(amount) as [food amount]
,(SUM(cast(amount as numeric(18,2))) / (select sum(cast(amount as numeric(18,2))) from @tbl)) * 100 as [Percentage Of Total]
,(select sum(amount) from @tbl) as total
from @tbl
group by food
Here is a way of getting the PercentageOfTotal, assuming that the sum of all amounts is not 0:
DECLARE @total INT = (SELECT SUM(AMOUNT) FROM Table1)
SELECT FOOD, CAST((CAST((100 * AMOUNT) AS DECIMAL (18,2)) / @total ) AS DECIMAL(18,2)) AS PercentageOfTotal from Table1
SQL Fiddle
MS SQL Server 2014 Schema Setup:
CREATE TABLE MusicGenres (name varchar(10)) ;
INSERT INTO MusicGenres (name)
VALUES ('Pop'),('Techno'),('Trance'),('trap'),('Hardcore'),('Electro') ;
CREATE TABLE Table2 (SongID int, MusicGenres varchar(10)) ;
INSERT INTO Table2 (SongID, MusicGenres)
VALUES (1,'Hardcore')
,(2,'Hardcore')
,(3,'Pop')
,(4,'Trap')
,(5,'Hardcore')
,(6,'Pop')
,(7,'Electro')
,(8,'Electro')
,(9,'Pop')
,(10,'Pop')
,(11,'Pop')
;
Query 1:
SELECT s1.name
, s1.recCount
, ( s1.recCount / CAST( ( SUM(recCount) OVER() ) AS decimal(5,2) ) )*100 AS pct
FROM (
SELECT m.name
, count(t.SongID) AS recCount
FROM MusicGenres m
LEFT OUTER JOIN Table2 t ON m.name = t.MusicGenres
GROUP BY m.name
) s1
Could be shortened to
SELECT m.name
, count(t.SongID) AS recCount
, ( count(t.SongID) / CAST( ( SUM(count(t.SongID)) OVER() ) AS decimal(5,2) )
)*100 AS pct
FROM MusicGenres m
LEFT OUTER JOIN Table2 t ON m.name = t.MusicGenres
GROUP BY m.name
Results:
| name | recCount | pct |
|----------|----------|---------|
| Electro | 2 | 18.1818 |
| Hardcore | 3 | 27.2727 |
| Pop | 5 | 45.4545 |
| Techno | 0 | 0 |
| Trance | 0 | 0 |
| trap | 1 | 9.0909 |

Select except where different in SQL

I need a bit of help with a SQL query.
Imagine I've got the following table
id | date | price
1 | 1999-01-01 | 10
2 | 1999-01-01 | 10
3 | 2000-02-02 | 15
4 | 2011-03-03 | 15
5 | 2011-04-04 | 16
6 | 2011-04-04 | 20
7 | 2017-08-15 | 20
What I need is all dates where only one price is present.
In this example I need to get rid of rows 5 and 6 (because there are two different prices for the same date) and either 1 or 2 (because they're duplicates).
How do I do that?
select date,
count(distinct price) as prices -- included to test
from MyTable
group by date
having count(distinct price) = 1 -- distinct for the duplicate pricing
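The HAVING filter above can be checked against the sample rows from the question. This is a minimal sketch using Python's sqlite3 module (an assumption; the query itself is plain SQL). The function name `single_price_dates` is hypothetical, and the MIN(id) and MIN(price) columns are additions that show how to pull one representative row per date, which the original query does not return.

```python
import sqlite3


def single_price_dates():
    """Return (date, lowest id, price) for every date that has exactly one distinct price."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE MyTable (id INT, date TEXT, price INT);
        INSERT INTO MyTable VALUES
            (1, '1999-01-01', 10), (2, '1999-01-01', 10), (3, '2000-02-02', 15),
            (4, '2011-03-03', 15), (5, '2011-04-04', 16), (6, '2011-04-04', 20),
            (7, '2017-08-15', 20);
    """)
    rows = con.execute("""
        SELECT date,
               MIN(id)    AS id,     -- keeps one of the duplicate rows (1 rather than 2)
               MIN(price) AS price   -- safe: only one distinct price survives the HAVING
        FROM MyTable
        GROUP BY date
        HAVING COUNT(DISTINCT price) = 1
        ORDER BY date
    """).fetchall()
    con.close()
    return rows


for row in single_price_dates():
    print(row)
```

The date 2011-04-04 is dropped because it carries two prices, and the duplicate pair on 1999-01-01 collapses to a single row.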
The following should work with any DBMS
SELECT id, date, price
FROM TheTable o
WHERE NOT EXISTS (
SELECT *
FROM TheTable i
WHERE i.date = o.date
AND (
i.price <> o.price
OR (i.price = o.price AND i.id < o.id)
)
)
;
JohnHC's answer is more readable and delivers the information the OP asked for ("[...] I need all the dates [...]").
My answer, though less readable at first, is more general (it allows for more complex tie-breaking criteria) and can also return the full row (with id and price, not just date).
;WITH CTE_1(ID ,DATE,PRICE)
AS
(
SELECT 1 , '1999-01-01',10 UNION ALL
SELECT 2 , '1999-01-01',10 UNION ALL
SELECT 3 , '2000-02-02',15 UNION ALL
SELECT 4 , '2011-03-03',15 UNION ALL
SELECT 5 , '2011-04-04',16 UNION ALL
SELECT 6 , '2011-04-04',20 UNION ALL
SELECT 7 , '2017-08-15',20
)
,CTE2
AS
(
SELECT A.*
FROM CTE_1 A
INNER JOIN
CTE_1 B
ON A.DATE=B.DATE AND A.PRICE!=B.PRICE
)
SELECT * FROM CTE_1 WHERE ID NOT IN (SELECT ID FROM CTE2)

grouping results based on time diff in sql

I have results like this
TimeDiffMin | OrdersCount
10 | 2
12 | 5
09 | 6
20 | 15
27 | 11
I would like the following
TimeDiffMin | OrdersCount
05 | 0
10 | 8
15 | 5
20 | 15
25 | 0
30 | 11
So you can see that I want to group by every 5 minutes and show the total order count in those 5 minutes, e.g. 0-5 minutes 0 orders, 5-10 minutes 8 orders.
Any help would be appreciated.
current query:
SELECT TimeDifferenceInMinutes, count(OrderId) NumberOfOrders FROM (
SELECT AO.OrderID, AO.OrderDate, AON.CreatedDate AS CancelledDate, DATEDIFF(minute, AO.OrderDate, AON.CreatedDate) AS TimeDifferenceInMinutes
FROM
(SELECT OrderID, OrderDate FROM AC_Orders) AO
JOIN
(SELECT OrderID, CreatedDate FROM AC_OrderNotes WHERE Comment LIKE '%has been cancelled.') AON
ON AO.OrderID = AON.OrderID
WHERE DATEDIFF(minute, AO.OrderDate, AON.CreatedDate) <= 100 AND AO.OrderDate >= '2016-12-01'
) AS Temp1
GROUP BY TimeDifferenceInMinutes
If you are open to a table-valued function (TVF), here is one option.
I use this UDF to create dynamic date/time ranges. You supply the range and the increment.
Declare @YourTable table (TimeDiffMin int,OrdersCount int)
Insert Into @YourTable values
(10, 2),
(12, 5),
(09, 6),
(20,15),
(27,11)
Select TimeDiffMin = cast(R2 as int)
,OrdersCount = isnull(sum(OrdersCount),0)
From (Select R1=RetVal,R2=RetVal+5 From [dbo].[udf-Range-Number](0,25,5)) A
Left Join (
-- Your Complicated Query
Select * From @YourTable
) B on TimeDiffMin >= R1 and TimeDiffMin<R2
Group By R1,R2
Order By 1
Returns
TimeDiffMin OrdersCount
5 0
10 6
15 7
20 0
25 15
30 11
The UDF if interested
CREATE FUNCTION [dbo].[udf-Range-Number] (@R1 money,@R2 money,@Incr money)
Returns Table
Return (
with cte0(M) As (Select cast((@R2-@R1)/@Incr as int)),
cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
cte2(N) As (Select Top (Select M from cte0) Row_Number() over (Order By (Select NULL)) From cte1 a,cte1 b,cte1 c,cte1 d,cte1 e,cte1 f,cte1 g,cte1 h )
Select RetSeq=1,RetVal=@R1 Union All Select N+1,(N*@Incr)+@R1
From cte2
)
-- Max 100 million observations
-- Select * from [dbo].[udf-Range-Number](0,4,0.25)
You can do this using a derived table to first build up your time difference windows and then joining from that to sum up all the Orders that fall within that window.
declare @t table(TimeDiffMin int
,OrdersCount int
);
insert into @t values
(10, 2)
,(12, 5)
,(09, 6)
,(20,15)
,(27,11);
declare @Increment int = 5; -- Set your desired time windows here.
with n(n)
as
( -- Select 10 rows to start with:
select n from(values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) as n(n)
),n2 as
( -- CROSS APPLY these 10 rows to get 10*10=100 rows we can use to generate incrementing ROW_NUMBERs. Use more CROSS APPLYs to get more rows:
select (row_number() over (order by (select 1))-1) * @Increment as StartMin
,(row_number() over (order by (select 1))) * @Increment as EndMin
from n -- 10 rows
cross apply n n2 -- 100 rows
--cross apply n n3 -- 1000 rows
--cross apply n n4 -- 10000 rows
)
select m.EndMin as TimeDiffMin
,isnull(sum(t.OrdersCount),0) as OrdersCount
from n2 as m
left join @t t
on(t.TimeDiffMin >= m.StartMin
and t.TimeDiffMin < m.EndMin
)
where m.EndMin <= 30 -- Filter as required
group by m.EndMin
order by m.EndMin
Query result:
TimeDiffMin OrdersCount
5 0
10 6
15 7
20 0
25 15
30 11
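The same windowing idea can also be written with a recursive CTE instead of a numbers table or a UDF. The sketch below is an alternative technique, not the code from either answer, and it runs on Python's sqlite3 module (an assumption; the answers target SQL Server). The function name `orders_per_bucket` and its `increment` and `max_minutes` parameters are hypothetical. Each bucket covers [start_min, end_min) and is labelled by its end, matching the output above.

```python
import sqlite3


def orders_per_bucket(increment=5, max_minutes=30):
    """Sum OrdersCount into fixed-width minute buckets, including empty buckets."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE t (TimeDiffMin INT, OrdersCount INT);
        INSERT INTO t VALUES (10, 2), (12, 5), (9, 6), (20, 15), (27, 11);
    """)
    rows = con.execute("""
        WITH RECURSIVE buckets(start_min, end_min) AS (
            SELECT 0, :inc
            UNION ALL
            -- the next bucket starts where the previous one ended
            SELECT end_min, end_min + :inc FROM buckets WHERE end_min < :max
        )
        SELECT b.end_min AS TimeDiffMin,
               COALESCE(SUM(t.OrdersCount), 0) AS OrdersCount
        FROM buckets b
        LEFT JOIN t
               ON t.TimeDiffMin >= b.start_min
              AND t.TimeDiffMin <  b.end_min
        GROUP BY b.end_min
        ORDER BY b.end_min
    """, {"inc": increment, "max": max_minutes}).fetchall()
    con.close()
    return rows


for row in orders_per_bucket():
    print(row)
```

The LEFT JOIN from the generated buckets is what keeps the empty windows in the result; COALESCE turns their NULL sums into 0.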

update of a column based on max date and group by

I have three tables: orders, orders_delivered, and orders_delivered_sta.
The data in the three tables looks like this:
table orders
orders_id
10
11
12
13
table orders_delivered
orders_id orders_delivered_id
10 1000
10 1001
11 1002
12 1003
12 1004
13 1005
13 1006
13 1007
table orders_delivered_sta
orders_delivered_sta_id orders_delivered_id date now_ind
1 1000 02/11/2011 0
2 1000 01/10/2006 0
3 1000 09/13/2011 0
4 1001 01/19/2010 0
5 1001 02/21/2011 0
6 1002 02/11/2009 0
7 1002 08/27/2010 0
8 1003 07/15/2012 0
9 1004 03/09/2007 0
10 1010 10/01/2010 0
11 1011 03/27/2011 0
12 1012 07/25/2010 0
13 1013 09/18/2004 0
So I need to update the orders_delivered_sta table so that now_ind is 1 on the row with the max date for each orders_delivered_id.
For example, for orders_delivered_id 1000 the max date is 09/13/2011, so for the pair (1000, 09/13/2011) now_ind should be 1. If an orders_delivered_id has only one row, that row should be set to 1.
Some rows in orders_delivered_sta have an orders_delivered_id that is not in the orders_delivered table; those must not be changed. Only rows whose orders_delivered_id is in orders_delivered should be updated.
so the desired output should look like
table orders_delivered_sta
orders_delivered_sta_id orders_delivered_id date now_ind
1 1000 02/11/2011 0
2 1000 01/10/2006 0
3 1000 09/13/2011 1
4 1001 01/19/2010 0
5 1001 02/21/2011 1
6 1002 02/11/2009 0
7 1002 08/27/2010 1
8 1003 07/15/2012 1
9 1004 03/09/2007 1
10 1010 10/01/2010 0
11 1011 03/27/2011 0
12 1012 07/25/2010 0
13 1013 09/18/2004 0
table structure:
create table orders
(
orders_id int primary key
)
insert into orders select 10
insert into orders select 11
insert into orders select 12
insert into orders select 13
create table orders_delivered
(
orders_delivered_id int primary key,
orders_id int FOREIGN KEY(orders_id)REFERENCES orders (orders_id)
)
insert into orders_delivered select 1000,10
insert into orders_delivered select 1001,10
insert into orders_delivered select 1002,11
insert into orders_delivered select 1003,12
insert into orders_delivered select 1004,12
insert into orders_delivered select 1005,13
insert into orders_delivered select 1006,13
insert into orders_delivered select 1007,13
create table orders_delivered_sta
(
orders_delivered_sta_id int primary key,
orders_delivered_id int FOREIGN KEY(orders_delivered_id)REFERENCES orders_delivered (orders_delivered_id),
date char(10),
now_ind int
)
insert into orders_delivered_sta select 1,1000,'02/11/2011', 0
insert into orders_delivered_sta select 2,1000,'01/10/2006', 0
insert into orders_delivered_sta select 3,1000,'09/13/2011', 0
insert into orders_delivered_sta select 4,1001,'01/19/2010', 0
insert into orders_delivered_sta select 5,1001,'02/21/2011', 0
insert into orders_delivered_sta select 6,1002,'02/11/2009', 0
insert into orders_delivered_sta select 7,1002,'08/27/2010', 0
insert into orders_delivered_sta select 8,1003,'07/15/2012', 0
insert into orders_delivered_sta select 9,1004,'03/09/2007', 0
insert into orders_delivered_sta select 10,1010,'10/01/2010', 0
insert into orders_delivered_sta select 11,1011,'03/27/2011', 0
insert into orders_delivered_sta select 12,1012,'07/25/2010', 0
insert into orders_delivered_sta select 13,1013,'09/18/2004', 0
You could use a CTE and a window MAX():
;
WITH max_dates AS (
SELECT
*,
max_date = MAX(date) OVER (PARTITION BY orders_delivered_id)
FROM orders_delivered_sta
WHERE orders_delivered_id IN (SELECT orders_delivered_id FROM orders_delivered)
)
UPDATE max_dates
SET now_ind = 1
WHERE date = max_date
References:
WITH common_table_expression (Transact-SQL)
OVER Clause (Transact-SQL)
This is the query in MySQL, but translating it to SQL Server should be straightforward, as I am using plain SQL. Notice I have changed the date to a different format (YYYY-MM-DD) to avoid casting from string to date.
update t3
set t3.now_ind = 1
where t3.orders_delivered_sta_id in (
select distinct t1.orders_delivered_sta_id from t1
left join (
select t2.orders_delivered_id, max(t2.adate) as MaxDate from t2
group by t2.orders_delivered_id
) t2 on (t1.orders_delivered_id = t2.orders_delivered_id) and (t1.adate = t2.MaxDate)
where t2.orders_delivered_id is not null
) and exists (
select * from o1
join od1 on (o1.order_id = od1.orders_id)
where (t3.orders_delivered_id = od1.orders_delivered_id)
)
Here is an example
Hope this helps
PS: You did need those 3 tables... I'll read questions better next time :)
Try this:
UPDATE orders_delivered_sta
SET now_ind = 1
WHERE orders_delivered_sta_id IN(
SELECT orders_delivered_sta_id
FROM (
SELECT orders_delivered_sta_id,
ROW_NUMBER() OVER(PARTITION BY orders_delivered_id ORDER BY date DESC) AS num
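FROM orders_delivered_sta
WHERE orders_delivered_id IN (SELECT orders_delivered_id FROM orders_delivered)) AS T -- skip rows with no matching delivery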
WHERE T.num = 1)
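The ROW_NUMBER approach can be exercised end to end with the question's data. This is a minimal sketch using Python's sqlite3 module (an assumption; it needs SQLite 3.25 or later for window functions, and the question targets SQL Server). Two things differ from the question's schema and are assumptions of this sketch: the date column is named `status_date`, and the MM/DD/YYYY strings are converted to ISO format (YYYY-MM-DD) before loading, because text dates only sort chronologically in that form. Foreign keys are left out so the rows for deliveries 1010 to 1013 can be loaded.

```python
import sqlite3
from datetime import datetime

# (orders_delivered_sta_id, orders_delivered_id, date as MM/DD/YYYY) from the question
STATUS_ROWS = [
    (1, 1000, "02/11/2011"), (2, 1000, "01/10/2006"), (3, 1000, "09/13/2011"),
    (4, 1001, "01/19/2010"), (5, 1001, "02/21/2011"), (6, 1002, "02/11/2009"),
    (7, 1002, "08/27/2010"), (8, 1003, "07/15/2012"), (9, 1004, "03/09/2007"),
    (10, 1010, "10/01/2010"), (11, 1011, "03/27/2011"), (12, 1012, "07/25/2010"),
    (13, 1013, "09/18/2004"),
]
DELIVERED_IDS = [1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007]


def flag_latest_status():
    """Set now_ind = 1 on the latest status row of each known delivery; return flagged ids."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE orders_delivered (orders_delivered_id INTEGER PRIMARY KEY);
        CREATE TABLE orders_delivered_sta (
            orders_delivered_sta_id INTEGER PRIMARY KEY,
            orders_delivered_id INTEGER,
            status_date TEXT,
            now_ind INTEGER
        );
    """)
    con.executemany("INSERT INTO orders_delivered VALUES (?)",
                    [(i,) for i in DELIVERED_IDS])
    con.executemany(
        "INSERT INTO orders_delivered_sta VALUES (?, ?, ?, 0)",
        [(sid, did, datetime.strptime(d, "%m/%d/%Y").date().isoformat())
         for sid, did, d in STATUS_ROWS],
    )
    con.execute("""
        UPDATE orders_delivered_sta
        SET now_ind = 1
        WHERE orders_delivered_sta_id IN (
            SELECT orders_delivered_sta_id
            FROM (
                SELECT orders_delivered_sta_id,
                       ROW_NUMBER() OVER (PARTITION BY orders_delivered_id
                                          ORDER BY status_date DESC) AS num
                FROM orders_delivered_sta
                -- only deliveries that exist in orders_delivered are eligible
                WHERE orders_delivered_id IN
                      (SELECT orders_delivered_id FROM orders_delivered)
            )
            WHERE num = 1
        )
    """)
    flagged = [row[0] for row in con.execute(
        "SELECT orders_delivered_sta_id FROM orders_delivered_sta "
        "WHERE now_ind = 1 ORDER BY orders_delivered_sta_id")]
    con.close()
    return flagged


print(flag_latest_status())
```

With the question's char(10) column holding MM/DD/YYYY text, MAX(date) and ORDER BY date compare strings character by character, so a real date column (or an ISO-formatted string, as here) is the safer basis for picking the latest row.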