Grouping problems using two CTEs

The code returns the correct numbers, but the grouping is giving me a problem, and this may be a fundamental issue with how I chose to write the code. The query is:
With
P as ( Select sum(r.qty) as proposed, rm.entity, id,
quarter (case When status = open then quarter = 1 ELSE 3 END) AS QUARTER,
year (case When status = open then year = 2017 ELSE 2016 END) AS YEAR,
FROM Db1
Group By proposed, quarter, id, entity, quarter, year)
A as ( Select sum(r.qty) as awarded, rm.entity, id,
quarter (case When status = open then quarter = 2 ELSE 4 END),
year(case When status = open then year = 2018 ELSE 2016 END)
From DB1
Group By proposed, quarter, id, entity, quarter, year
)
SELECT * FROM P right join a on p.id = a.id
Group By proposed, quarter, id, entity, quarter, year
My returns are something like:
ID  p.Quarter  p.Year  a.Quarter  a.Year  Proposed  Awarded
1   null       null    1          2017    null      1
2   2          2018    3          2017    1         1
2   1          2018    4          2016    1         1
3   null       null    2          2018    null      2
I want:
ID  p.Quarter  p.Year  a.Quarter  a.Year  Proposed  Awarded
1   null       null    1          2017    null      1
2   2          2018    null       null    1         null
2   1          2018    null       null    1         null
2   null       null    3          2017    null      1
2   null       null    4          2017    null      null
3   null       null    2          2018    null      2
The problem is this: if an ID has a proposed date and quantity as well as an awarded date and quantity, I want all of the years and quarters to be shown outside of the id grouping, so that each awarded or proposed count has its own row. Otherwise the counts come out wrong.
I am pulling from two different databases and my CASE statements are much more complex, but adding that large amount of code seemed irrelevant here.

Related

One SQL statement for multiple column totals based on year

I'm trying to return the total number of unique countries listed for each year, and the total number of unique countries in the entire table. The table is formatted like this:
Country | Year | State | V1 | V2 |
US 2020 NY 9 2
US 2020 MA 3 6
CA 2020 MAN 2 8
CA 2020 ONT 5 1
AU 2020 TAS 7 2
AU 2020 VIC 3 3
US 2021 NY 2 0
US 2021 MA 8 2
AU 2021 TAS 4 1
AU 2021 VIC 5 2
I want my query to return this:
2020 | 2021 | Total_Unique_Countries
3 2 3
I tried:
SELECT
SUM(CASE WHEN YEAR=2020 THEN 1 ELSE 0 END) AS "2020",
SUM(CASE WHEN YEAR=2021 THEN 1 ELSE 0 END) AS "2021",
COUNT(DISTINCT COUNTRY) AS Total_Unique_Countries
FROM MYTABLE GROUP BY YEAR
The result:
2020 | 2021 | Total_Unique_Countries
6 0 3
0 4 2

SELECT
COUNT(DISTINCT CASE WHEN YEAR=2020 THEN COUNTRY END) AS "2020",
COUNT(DISTINCT CASE WHEN YEAR=2021 THEN COUNTRY END) AS "2021",
COUNT(DISTINCT COUNTRY) AS Total_Unique_Countries
FROM MYTABLE
This should give you the result you are looking for.
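
As a quick check of the query above, here is a minimal sketch that runs it against the sample rows from the question. It assumes Python's built-in sqlite3 module as a stand-in for whatever database the asker uses; the helper name count_countries is hypothetical.

```python
import sqlite3

# Sample rows copied from the question: (Country, Year, State, V1, V2).
ROWS = [
    ("US", 2020, "NY", 9, 2), ("US", 2020, "MA", 3, 6),
    ("CA", 2020, "MAN", 2, 8), ("CA", 2020, "ONT", 5, 1),
    ("AU", 2020, "TAS", 7, 2), ("AU", 2020, "VIC", 3, 3),
    ("US", 2021, "NY", 2, 0), ("US", 2021, "MA", 8, 2),
    ("AU", 2021, "TAS", 4, 1), ("AU", 2021, "VIC", 5, 2),
]

# The CASE yields NULL for rows from other years, and COUNT(DISTINCT ...)
# ignores NULLs, so each column counts only that year's distinct countries.
QUERY = """
SELECT
    COUNT(DISTINCT CASE WHEN Year = 2020 THEN Country END) AS "2020",
    COUNT(DISTINCT CASE WHEN Year = 2021 THEN Country END) AS "2021",
    COUNT(DISTINCT Country) AS Total_Unique_Countries
FROM MYTABLE
"""


def count_countries():
    """Load the sample rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MYTABLE "
        "(Country TEXT, Year INTEGER, State TEXT, V1 INTEGER, V2 INTEGER)"
    )
    conn.executemany("INSERT INTO MYTABLE VALUES (?, ?, ?, ?, ?)", ROWS)
    row = conn.execute(QUERY).fetchone()
    conn.close()
    return row


print(count_countries())  # (3, 2, 3)
```

There is no GROUP BY here, which is why the result is a single row rather than one row per year.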

You can first eliminate the duplicates in a CTE and then count:
WITH CTE as (SELECT
DISTINCT "Country", "Year" FROM MYTABLE)
SELECT
SUM(CASE WHEN "Year"=2020 THEN 1 ELSE 0 END) AS "2020",
SUM(CASE WHEN "Year"=2021 THEN 1 ELSE 0 END) AS "2021",
COUNT(DISTINCT "Country") AS Total_Unique_Countries
FROM CTE
2020 | 2021 | total_unique_countries
3 2 3

JOIN by closer value to key

With the following sample data:
WITH values AS (
SELECT
1 AS shard,
2008 AS year,
1 AS value
UNION ALL
SELECT
1 AS shard,
20012 AS year,
2 AS value
UNION ALL
SELECT
2 AS shard,
2011 AS year,
3 AS value
UNION ALL
SELECT
2 AS shard,
1998 AS year,
4 AS value
UNION ALL
SELECT
2 AS shard,
2001 AS year,
5 AS value
UNION ALL
SELECT
4 AS shard,
1990 AS year,
6 AS value
ORDER BY year
),
data AS (
SELECT
1 AS id,
1 AS shard,
2010 AS year
UNION ALL
SELECT
1 AS id,
2 AS shard,
2000 AS year
UNION ALL
SELECT
1 AS id,
3 AS shard,
1990 AS year
UNION ALL
SELECT
2 AS id,
1 AS shard,
2010 AS year
UNION ALL
SELECT
2 AS id,
2 AS shard,
2000 AS year
UNION ALL
SELECT
2 AS id,
3 AS shard,
1990 AS year
)
I want to join my data collection with the values stored in the values collection. Each data row has an id that identifies a process, so I want to perform the JOIN for each id. The JOIN also has a two-part key: the shard and year fields. For each entry in data, I want to retrieve the value with the closest year among the rows of values that match its shard.
I have come up with the following code, but it does not work as expected: it ignores the values.shard field and matches every year regardless of which shard it belongs to.
SELECT *
FROM (
SELECT
data.id,
data.year,
values.year AS closer_year,
ABS(data.year - values.year) AS diff,
values.value,
ROW_NUMBER() OVER (PARTITION BY data.id, data.shard ORDER BY ABS(data.year - values.year)) AS rn
FROM data, values
)
WHERE rn = 1
For the sample data, the expected output should be:
id  year  closer_year  diff  value  rn
1   2010  2008         2     1      1
1   2000  2001         1     5      1
1   1990  null         null  null   1
2   2010  2008         2     1      1
2   2000  2001         1     5      1
2   1990  null         null  null   1
What am I missing?

I found what I was missing just after posting the question. I will answer it in case anyone has a similar use case.
When rereading the text, I noticed that the "match the shard" property I was missing was indeed a left join, so rewriting the query like this solved the problem:
SELECT *
FROM (
SELECT
data.id,
data.year,
values.year AS closer_year,
ABS(data.year - values.year) AS diff,
values.value,
ROW_NUMBER() OVER (PARTITION BY data.id, data.shard ORDER BY ABS(data.year - values.year)) AS rn
FROM data
LEFT JOIN values
ON data.shard = values.shard
)
WHERE rn = 1
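
To see the LEFT JOIN version reproduce the expected output, here is a minimal sketch using Python's built-in sqlite3 module. It is an assumption that SQLite stands in for the original engine: the values collection is renamed to a hypothetical vals table because VALUES is a keyword in SQLite, the aliases d, v and ranked are added for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample data copied from the question.
VALS = [  # (shard, year, value)
    (1, 2008, 1), (1, 20012, 2), (2, 2011, 3),
    (2, 1998, 4), (2, 2001, 5), (4, 1990, 6),
]
DATA = [  # (id, shard, year)
    (1, 1, 2010), (1, 2, 2000), (1, 3, 1990),
    (2, 1, 2010), (2, 2, 2000), (2, 3, 1990),
]

# The LEFT JOIN restricts candidates to the same shard while keeping data
# rows with no match; ROW_NUMBER then keeps the closest year per (id, shard).
QUERY = """
SELECT id, year, closer_year, diff, value
FROM (
    SELECT
        d.id,
        d.year,
        v.year AS closer_year,
        ABS(d.year - v.year) AS diff,
        v.value,
        ROW_NUMBER() OVER (
            PARTITION BY d.id, d.shard
            ORDER BY ABS(d.year - v.year)
        ) AS rn
    FROM data AS d
    LEFT JOIN vals AS v ON d.shard = v.shard
) AS ranked
WHERE rn = 1
ORDER BY id, year DESC
"""


def closest_values():
    """Build the two tables in memory and return the joined rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vals (shard INTEGER, year INTEGER, value INTEGER)")
    conn.execute("CREATE TABLE data (id INTEGER, shard INTEGER, year INTEGER)")
    conn.executemany("INSERT INTO vals VALUES (?, ?, ?)", VALS)
    conn.executemany("INSERT INTO data VALUES (?, ?, ?)", DATA)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for row in closest_values():
    print(row)
```

Shard 3 has no rows in vals, so those data rows come back with NULL (None in Python) for closer_year, diff and value, as in the expected output.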

How to extract and pivot in sql

I have a table like the following.
I tried to sum the scores in a pivoted layout.
product date score
A 2020/8/1 1
B 2018/8/1 2
B 2018/9/1 1
C 2017/9/1 2
I'd like to transform it into the following pivoted table.
The index is YEAR(t.date) and the columns are the products.
date A B C
2017 0 0 2
2018 0 3 0
2019 0 0 0
2020 1 0 0
Is there an effective way to achieve this?
Thanks

We can handle this by joining a calendar table containing all years of interest to your current table, aggregating by year, and then using conditional aggregation to find the sum of scores for each product.
WITH years AS (
SELECT 2017 AS year FROM dual UNION ALL
SELECT 2018 FROM dual UNION ALL
SELECT 2019 FROM dual UNION ALL
SELECT 2020 FROM dual
)
SELECT
y.year,
SUM(CASE WHEN t.product = 'A' THEN t.score ELSE 0 END) AS A,
SUM(CASE WHEN t.product = 'B' THEN t.score ELSE 0 END) AS B,
SUM(CASE WHEN t.product = 'C' THEN t.score ELSE 0 END) AS C
FROM years y
LEFT JOIN yourTable t
ON y.year = EXTRACT(YEAR FROM t."date")
GROUP BY
y.year
ORDER BY
y.year;
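
The calendar-table approach can be tried out with the sketch below, which runs the same idea on the sample rows through Python's built-in sqlite3 module. Several things here are assumptions for illustration: SQLite has no dual table or EXTRACT, so the years CTE uses plain SELECTs and the year is taken with strftime, and the dates are stored as ISO strings so that strftime can parse them.

```python
import sqlite3

# Sample rows from the question, with dates rewritten in ISO format.
SCORES = [
    ("A", "2020-08-01", 1),
    ("B", "2018-08-01", 2),
    ("B", "2018-09-01", 1),
    ("C", "2017-09-01", 2),
]

# The years CTE plays the role of the calendar table, so 2019 still
# appears even though no score falls in that year.
QUERY = """
WITH years AS (
    SELECT 2017 AS year UNION ALL
    SELECT 2018 UNION ALL
    SELECT 2019 UNION ALL
    SELECT 2020
)
SELECT
    y.year,
    SUM(CASE WHEN t.product = 'A' THEN t.score ELSE 0 END) AS A,
    SUM(CASE WHEN t.product = 'B' THEN t.score ELSE 0 END) AS B,
    SUM(CASE WHEN t.product = 'C' THEN t.score ELSE 0 END) AS C
FROM years AS y
LEFT JOIN yourTable AS t
    ON y.year = CAST(strftime('%Y', t."date") AS INTEGER)
GROUP BY y.year
ORDER BY y.year
"""


def pivot_scores():
    """Return one row per calendar year with a score column per product."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE yourTable (product TEXT, "date" TEXT, score INTEGER)')
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", SCORES)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for row in pivot_scores():
    print(row)
```

The ELSE 0 branch is what turns the unmatched 2019 row into zeros instead of NULLs.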

One option would be to use the PIVOT clause after determining the year range, outer joining that range with your data source, and setting the NULL scores to zero:
WITH years AS
(SELECT MIN(EXTRACT(year from "date")) AS min_year,
MAX(EXTRACT(year from "date")) AS max_year
FROM tab)
SELECT year, NVL(A,0) AS A, NVL(B,0) AS B, NVL(C,0) AS C
FROM (SELECT l.year, product, SUM(score) AS score
FROM tab t --> your original data source
RIGHT JOIN (SELECT level + min_year - 1 AS year
FROM years
CONNECT BY level BETWEEN 1 AND max_year - min_year + 1) l
ON l.year = EXTRACT(year from "date")
GROUP BY l.year, product)
PIVOT (SUM(score) FOR product IN('A' AS "A", 'B' AS "B", 'C' AS "C"))
ORDER BY year;
YEAR A B C
---- - - -
2017 0 0 2
2018 0 3 0
2019 0 0 0
2020 1 0 0

SQL to calculate Net Capacity

How can I write a SQL query to get the net change in capacity for each month, that is, the capacity when status is 1 or 2 minus the total capacity when status is 3? Thanks. Here is the table:
STATUS MONTH CAPACITY
1 01/16 5
3 01/16 2
1 02/16 11
3 02/16 20
1 03/16 8
3 03/16 12
1 04/16 4
2 04/16 10
3 04/16 18
2 05/16 14
3 05/16 37
2 06/16 4
3 06/16 8
For example, the net change in capacity for Jan. 16 is 5 minus 2 equals 3.

You need a conditional sum:
SUM(CASE WHEN STATUS IN (1,2) THEN CAPACITY ELSE 0 END) -
SUM(CASE WHEN STATUS IN (3) THEN CAPACITY ELSE 0 END)

dnoeth's answer can be simplified to:
SUM(CASE WHEN STATUS IN (1,2) THEN CAPACITY WHEN STATUS IN (3) THEN -CAPACITY ELSE 0 END)
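
The expression above is only a fragment, so here is a minimal sketch of a complete query around it, run on the table from the question with Python's built-in sqlite3 module. The table name capacity_log and the GROUP BY on month are assumptions added for illustration.

```python
import sqlite3

# Rows copied from the question: (status, month, capacity).
ROWS = [
    (1, "01/16", 5), (3, "01/16", 2),
    (1, "02/16", 11), (3, "02/16", 20),
    (1, "03/16", 8), (3, "03/16", 12),
    (1, "04/16", 4), (2, "04/16", 10), (3, "04/16", 18),
    (2, "05/16", 14), (3, "05/16", 37),
    (2, "06/16", 4), (3, "06/16", 8),
]

# Statuses 1 and 2 add capacity, status 3 subtracts it; one SUM per month.
QUERY = """
SELECT
    month,
    SUM(CASE WHEN status IN (1, 2) THEN capacity
             WHEN status IN (3) THEN -capacity
             ELSE 0 END) AS net_capacity
FROM capacity_log
GROUP BY month
ORDER BY month
"""


def net_capacity():
    """Return (month, net change in capacity) pairs for the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE capacity_log (status INTEGER, month TEXT, capacity INTEGER)"
    )
    conn.executemany("INSERT INTO capacity_log VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for row in net_capacity():
    print(row)
```

For January 2016 this gives 5 - 2 = 3, matching the example in the question.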

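This builds on the fact that 1 and 2 are less than 3: with integer division, STATUS/3 is 0 for statuses 1 and 2 and 1 for status 3.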
select MONTH, [Net change]=SUM(CASE STATUS/3 WHEN 0 THEN CAPACITY ELSE -CAPACITY END)
from t
group by MONTH;

Without a CASE expression:
select month, sum(capacity)-2*sum((status/3)*capacity) from table group by month;
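
The arithmetic version relies on status/3 being integer division, which holds in SQLite when both operands are integers. The sketch below checks it on the table from the question with Python's built-in sqlite3 module; the table name capacity_log is hypothetical, since table itself is a reserved word.

```python
import sqlite3

# Rows copied from the question: (status, month, capacity).
ROWS = [
    (1, "01/16", 5), (3, "01/16", 2),
    (1, "02/16", 11), (3, "02/16", 20),
    (1, "03/16", 8), (3, "03/16", 12),
    (1, "04/16", 4), (2, "04/16", 10), (3, "04/16", 18),
    (2, "05/16", 14), (3, "05/16", 37),
    (2, "06/16", 4), (3, "06/16", 8),
]

# status / 3 is 0 for statuses 1 and 2 and 1 for status 3, so the second
# SUM is the status-3 total. Subtracting it twice removes it from the
# overall sum and then subtracts it once more.
QUERY = """
SELECT month, SUM(capacity) - 2 * SUM((status / 3) * capacity) AS net_capacity
FROM capacity_log
GROUP BY month
ORDER BY month
"""


def net_capacity_arithmetic():
    """Return (month, net change) pairs using the integer-division trick."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE capacity_log (status INTEGER, month TEXT, capacity INTEGER)"
    )
    conn.executemany("INSERT INTO capacity_log VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for row in net_capacity_arithmetic():
    print(row)
```

The trick only works while the status values are limited to 1, 2 and 3; a status of 4 or more would also divide to a non-zero value, so the CASE form is the safer choice if new statuses may appear.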

You can join the table to itself and perform the calculation like so:
SELECT
a.status,
a.month,
a.capacity,
b.capacity AS total_capacity,
a.capacity - b.capacity AS net_capacity
FROM
table a
JOIN
table b
ON (a.month = b.month)
AND (b.status = 3)
WHERE
a.status IN (1,2);
-- If you don't want to have the status and instead aggregate in the event there are two within the same month:
SELECT
a.month,
SUM(a.capacity) AS capacity,
SUM(b.capacity) AS total_capacity,
SUM(a.capacity) - MAX(b.capacity) AS net_capacity
FROM
table a
JOIN
table b
ON (a.month = b.month)
AND (b.status = 3)
WHERE
a.status IN (1,2)
GROUP BY
a.month;

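Negate the status 3 capacities in a subquery, then sum per month. Grouping by status as well would keep the positive and negative rows apart, so only the month is grouped:
SELECT
"Month",
SUM(Capacity) AS Net_Capacity
FROM ( SELECT
"Month",
CASE WHEN "Status" = 3 THEN -1 * Capacity ELSE Capacity END AS Capacity FROM tbl
) t
GROUP BY
"Month"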

Partition by when NULL

I have a table that looks like
Year  Month  ID     Date      Status
--------------------------------------
2013  8      99999  8/1/2013  Status A
2013  9      99999  NULL      NULL
2013  10     99999  NULL      NULL
2013  11     99999  NULL      NULL
2013  12     99999  NULL      NULL
2014  1      99999  NULL      NULL
2014  2      99999  2/5/2014  Status B
2014  3      99999  NULL      NULL
2014  4      99999  NULL      NULL
2014  5      99999  NULL      NULL
2014  6      99999  NULL      NULL
2014  7      99999  NULL      NULL
I want to add a column that will give me the number of the status, repeated until the next occurrence of a status, where it will add 1.
Result:
Year  Month  ID     Date      Status    Value
--------------------------------------------
2013  8      99999  8/1/2013  Status A  1
2013  9      99999  NULL      NULL      1
2013  10     99999  NULL      NULL      1
2013  11     99999  NULL      NULL      1
2013  12     99999  NULL      NULL      1
2014  1      99999  NULL      NULL      1
2014  2      99999  2/5/2014  Status B  2
2014  3      99999  NULL      NULL      2
2014  4      99999  NULL      NULL      2
2014  5      99999  NULL      NULL      2
2014  6      99999  NULL      NULL      2
2014  7      99999  NULL      NULL      2
The NULLs are what's throwing me off. Thanks for the help!
Edit:
Here's my current query:
DECLARE @DateStart DATETIME
DECLARE @DateEnd DATETIME
SET @DateStart = '8/1/2013'
SET @DateEnd = '7/1/2014'
SELECT
P.Year, P.Month, P.ID,
PP.MaxStatusDate,
Status
FROM
(SELECT
*
FROM
(SELECT DISTINCT
year, Month
FROM
lu_Calendar
WHERE
Date BETWEEN @DateStart AND @DateEnd) AS A
CROSS JOIN
(SELECT DISTINCT
ID
FROM
dbo.StatusChangeData) AS B
) AS P
LEFT JOIN
(SELECT
yr, mnth, MaxStatusDate, Status, A.ID
FROM
(SELECT
ID, YEAR([ModifiedDate]) AS yr,
MONTH(ModifiedDate) AS mnth,
MAX([ModifiedDate]) AS MaxStatusDate
FROM
dbo.StatusChangeData
GROUP BY
ID, YEAR([ModifiedDate]), MONTH(ModifiedDate)) AS A
INNER JOIN
dbo.StatusChangeData sce ON sce.ID = A.ID AND A.MaxStatusDate = sce.[ModifiedDate]
) AS PP ON P.Month = pp.mnth AND P.YEAR = PP.yr AND P.ID = PP.ID
WHERE
P.ID = 99999

You can do this with a correlated subquery. Essentially, this counts the number of not-NULL values before any given value:
select scd.*,
(select count(*)
from StatusChangeData scd2
where scd2.id = scd.id and
scd2.status is not null and
scd2.year*100+scd2.month <= scd.year*100+scd.month
) as value
from StatusChangeData scd;
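
To illustrate the counting idea, here is a minimal sketch that runs the correlated subquery on the rows from the question using Python's built-in sqlite3 module. As in the answer, it assumes a single table holding year, month, id and status. The second query is an alternative that is not part of the original answer: a windowed COUNT(status), which skips NULLs and gives a running count per id (it needs SQLite 3.25 or newer).

```python
import sqlite3

# Rebuild the twelve rows from the question: (year, month, id, date, status).
MONTHS = [(2013, m) for m in range(8, 13)] + [(2014, m) for m in range(1, 8)]
CHANGES = {
    (2013, 8): ("8/1/2013", "Status A"),
    (2014, 2): ("2/5/2014", "Status B"),
}
ROWS = [(y, m, 99999) + CHANGES.get((y, m), (None, None)) for y, m in MONTHS]

# From the answer: count the non-NULL statuses up to and including each row.
CORRELATED = """
SELECT scd.year, scd.month,
       (SELECT COUNT(*)
        FROM StatusChangeData scd2
        WHERE scd2.id = scd.id
          AND scd2.status IS NOT NULL
          AND scd2.year * 100 + scd2.month <= scd.year * 100 + scd.month
       ) AS value
FROM StatusChangeData scd
ORDER BY scd.year, scd.month
"""

# Alternative (an assumption, not from the answer): running count via a window.
WINDOWED = """
SELECT year, month,
       COUNT(status) OVER (PARTITION BY id ORDER BY year, month) AS value
FROM StatusChangeData
ORDER BY year, month
"""


def run(query):
    """Load the sample rows into memory and return the query's rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE StatusChangeData "
        '(year INTEGER, month INTEGER, id INTEGER, "date" TEXT, status TEXT)'
    )
    conn.executemany("INSERT INTO StatusChangeData VALUES (?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


for row in run(CORRELATED):
    print(row)
```

Both queries return 1 for August 2013 through January 2014 and 2 from February 2014 onward, which is the Value column the question asks for.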
