Grouping SQL results by Year and count - sql

I have a table with the below structure:
I would like to retrieve the results using sql in the below format
I am new to SQL and can't figure out how to go about it. Is this possible without using procedures? How do I achieve this? (The actual data set is huge; I have given only a snapshot here.)

Part of it is pivoting. Totals by row and column (and really, even the pivoting) should be done in your reporting application, not in SQL. If you insist on doing it in SQL, there are fancier ways, but something like the silly query below will suffice.
with test_data (city, yr, ct) as (
select 'Tokyo' , 2016, 2 from dual union all
select 'Mumbai', 2013, 3 from dual union all
select 'Mumbai', 2014, 5 from dual union all
select 'Dubai' , 2011, 5 from dual union all
select 'Dubai' , 2015, 15 from dual union all
select 'Dubai' , 2016, 8 from dual union all
select 'London', 2011, 16 from dual union all
select 'London', 2012, 22 from dual union all
select 'London', 2013, 4 from dual union all
select 'London', 2014, 24 from dual union all
select 'London', 2015, 13 from dual union all
select 'London', 2016, 5 from dual
),
test_with_totals as (
select city, yr, ct from test_data union all
select city, 9999, sum(ct) from test_data group by city union all
select 'Grand Total', yr , sum(ct) from test_data group by yr union all
select 'Grand Total', 9999, sum(ct) from test_data
)
select * from test_with_totals
pivot ( sum (ct) for yr in (2011, 2012, 2013, 2014, 2015, 2016, 9999 as "Total"))
order by "Total";
Result:
CITY              2011       2012       2013       2014       2015       2016      Total
----------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
Tokyo                                                                       2          2
Mumbai                                     3          5                                8
Dubai                5                                          15          8         28
London              16         22          4         24         13          5         84
Grand Total         21         22          7         29         28         15        122
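Since the answer recommends computing the pivot and the totals in the reporting layer, here is a minimal Python sketch of what that could look like. It assumes the rows arrive as (city, year, count) tuples; the function name `pivot_with_totals` and the output layout are illustrative choices, not part of the original answer.

```python
from collections import defaultdict

# Same sample rows as test_data in the query above: (city, year, count).
rows = [
    ("Tokyo", 2016, 2),
    ("Mumbai", 2013, 3), ("Mumbai", 2014, 5),
    ("Dubai", 2011, 5), ("Dubai", 2015, 15), ("Dubai", 2016, 8),
    ("London", 2011, 16), ("London", 2012, 22), ("London", 2013, 4),
    ("London", 2014, 24), ("London", 2015, 13), ("London", 2016, 5),
]


def pivot_with_totals(rows):
    """Return (years, table); table maps row label -> {year: count, 'Total': n}."""
    table = defaultdict(lambda: defaultdict(int))
    for city, year, count in rows:
        # Every input row feeds its own city row and the "Grand Total" row.
        for row_key in (city, "Grand Total"):
            table[row_key][year] += count
            table[row_key]["Total"] += count
    years = sorted({year for _, year, _ in rows})
    return years, table


years, table = pivot_with_totals(rows)

header = ["CITY"] + [str(y) for y in years] + ["Total"]
print(" ".join(f"{h:>11}" for h in header))
# Sort by the row total, mirroring the ORDER BY "Total" in the query.
for label in sorted(table, key=lambda k: table[k]["Total"]):
    cells = [table[label].get(y, "") for y in years] + [table[label]["Total"]]
    print(" ".join(f"{c:>11}" for c in [label] + cells))
```

Years with no data for a city are left blank rather than shown as 0, which matches the NULLs that PIVOT produces.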

Related

Generate rows to fill in gaps between years, carry over a value from previous year

I have a table of road condition ratings (roads are rated from 1-20; 20 being good).
with road_inspections
(road_id, year, cond) as (
select 1, 2009, 17 from dual union all
select 1, 2011, 16 from dual union all
select 1, 2015, 14 from dual union all
select 1, 2016, 18.3 from dual union all
select 1, 2019, 18.1 from dual union all
select 2, 2013, 17.5 from dual union all
select 2, 2016, 18 from dual union all
select 2, 2019, 18 from dual union all
select 2, 2022, 18 from dual union all
select 3, 2022, 20 from dual)
select * from road_inspections
ROAD_ID YEAR COND
---------- ---------- ----------
1 2009 17
1 2011 16
1 2015 14
1 2016 18.3
1 2019 18.1
2 2013 17.5
2 2016 18
2 2019 18
2 2022 18
3 2022 20
In a query, for each road, I want to generate rows to fill in the gaps between the years.
For a given road, starting at the first row (the earliest inspection), there should be consecutive rows for each year all the way to the current year (the sysdate year; currently 2022).
For the filler rows, I want to carry over the condition rating from the last known inspection.
The result would look like this:
ROAD_ID YEAR COND
---------- ---------- ----------
1 2009 17
1 2010 17 *
1 2011 16
1 2012 16 *
1 2013 16 *
1 2014 16 *
1 2015 14
1 2016 18.3
1 2017 18.3 *
1 2018 18.3 *
1 2019 18.1
1 2020 18.1 *
1 2021 18.1 *
1 2022 18.1 *
2 2013 17.5
2 2014 17.5 *
2 2015 17.5 *
2 2016 18
2 2017 18 *
2 2018 18 *
2 2019 18
2 2020 18 *
2 2021 18 *
2 2022 18
3 2022 20
*=filler row
Question:
How can I create those filler rows using an Oracle SQL query?
(My priorities are: simplicity first, performance second.)
You can use the LEAD analytic function with a LATERAL joined hierarchical query to generate the missing rows from each row until the next row:
SELECT r.road_id,
y.year,
r.cond
FROM ( SELECT r.*,
LEAD(year, 1, EXTRACT(YEAR FROM SYSDATE) + 1)
OVER (PARTITION BY road_id ORDER BY year) AS next_year
FROM road_inspections r
) r
CROSS JOIN LATERAL (
SELECT r.year + LEVEL - 1 AS year
FROM DUAL
CONNECT BY r.year + LEVEL - 1 < r.next_year
) y
Which, for the sample data:
CREATE TABLE road_inspections (road_id, year, cond) as
select 1, 2009, 17 from dual union all
select 1, 2011, 16 from dual union all
select 1, 2015, 14 from dual union all
select 1, 2016, 18.3 from dual union all
select 1, 2019, 18.1 from dual union all
select 2, 2013, 17.5 from dual union all
select 2, 2016, 18 from dual union all
select 2, 2019, 18 from dual union all
select 2, 2022, 18 from dual union all
select 3, 2022, 20 from dual;
Outputs:
ROAD_ID YEAR COND
------- ---- ----
      1 2009   17
      1 2010   17
      1 2011   16
      1 2012   16
      1 2013   16
      1 2014   16
      1 2015   14
      1 2016 18.3
      1 2017 18.3
      1 2018 18.3
      1 2019 18.1
      1 2020 18.1
      1 2021 18.1
      1 2022 18.1
      2 2013 17.5
      2 2014 17.5
      2 2015 17.5
      2 2016   18
      2 2017   18
      2 2018   18
      2 2019   18
      2 2020   18
      2 2021   18
      2 2022   18
      3 2022   20
with
road_inspections (road_id, year_, cond) as (
select 1, 2009, 17 from dual union all
select 1, 2011, 16 from dual union all
select 1, 2015, 14 from dual union all
select 1, 2016, 18.3 from dual union all
select 1, 2019, 18.1 from dual union all
select 2, 2013, 17.5 from dual union all
select 2, 2016, 18 from dual union all
select 2, 2019, 18 from dual union all
select 2, 2022, 18 from dual union all
select 3, 2022, 20 from dual
)
, prep (road_id, first_year) as (
select road_id, min(year_)
from road_inspections
group by road_id
)
, all_years (road_id, year_) as (
select p.road_id, l.year_
from prep p cross join lateral (
select first_year + level - 1 as year_
from dual
connect by level <= 2022 - first_year + 1
) l
)
select road_id, year_,
last_value(ri.cond ignore nulls) over
(partition by road_id order by year_) as cond
from all_years ay left outer join road_inspections ri using (road_id, year_)
;
The first subquery, prep, finds the first year for each road id. This is used in the all_years subquery to generate all the years relevant for each road id.
Then left-outer-join to the original data, copy the cond wherever it is available, and use the analytic function last_value with the ignore nulls option to fill in the gaps.
Note that I changed the column name year to year_ (with a trailing underscore); year is an Oracle keyword, not a good choice for a column name.
Output:
ROAD_ID YEAR_ COND
---------- ---------- ----------
1 2009 17
1 2010 17
1 2011 16
1 2012 16
1 2013 16
1 2014 16
1 2015 14
1 2016 18.3
1 2017 18.3
1 2018 18.3
1 2019 18.1
1 2020 18.1
1 2021 18.1
1 2022 18.1
2 2013 17.5
2 2014 17.5
2 2015 17.5
2 2016 18
2 2017 18
2 2018 18
2 2019 18
2 2020 18
2 2021 18
2 2022 18
3 2022 20
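For readers who want to check the fill logic outside the database, the same idea (generate every year from the first inspection onward, then carry the last known rating forward) can be sketched in Python. This is only an illustration under assumptions: the function name `fill_gaps` and its `last_year` parameter are hypothetical, standing in for the hard-coded 2022 in the query.

```python
from collections import defaultdict

# Same sample rows as road_inspections: (road_id, year, cond).
inspections = [
    (1, 2009, 17), (1, 2011, 16), (1, 2015, 14), (1, 2016, 18.3), (1, 2019, 18.1),
    (2, 2013, 17.5), (2, 2016, 18), (2, 2019, 18), (2, 2022, 18),
    (3, 2022, 20),
]


def fill_gaps(inspections, last_year):
    """Return one (road_id, year, cond) row per year, carrying cond forward."""
    by_road = defaultdict(dict)
    for road_id, year, cond in inspections:
        by_road[road_id][year] = cond

    filled = []
    for road_id in sorted(by_road):
        known = by_road[road_id]
        cond = None
        # Start at the road's first inspection year, like the prep subquery.
        for year in range(min(known), last_year + 1):
            # Keep the previous rating when the year has no inspection,
            # which is what last_value(... ignore nulls) does in the query.
            cond = known.get(year, cond)
            filled.append((road_id, year, cond))
    return filled


filled = fill_gaps(inspections, last_year=2022)
for row in filled:
    print(row)
```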
The LEAD function combined with a CONNECT BY LEVEL row generator also produces these filler rows:
with r as (
select
*
from
road_inspections
union
select
road_id,
2022,
cond
from
road_inspections
where
(road_id, year) in(
select
road_id,
max(year) over (partition by road_id)
from
road_inspections a
where
not exists (
select
1
from
road_inspections b
where
a.road_id = b.road_id
and b.year = 2022
)
)
),
data as(
SELECT
r.*,
nvl(
lead(year, 1) over (
partition by road_id
order by
year
)- year,
0
) gaps
FROM
r
)
select
road_id,
year + level -1 year,
cond
from
(
select
a.road_id,
year,
cond,
rownum rn,
gaps
from
data a
) connect by level <= gaps
and prior rn = rn
and prior dbms_random.value != 1
order by
road_id,
year + level -1;

SQL Oracle - Sales Forecast

I am doing a sales forecast in Oracle SQL. I need to figure out what revenue I can expect for next year. I should calculate it for each month (in my example, January and February 2018) for each customer by state/city. I have data for 3 years.
The result should contain an estimated sales forecast for each month based on the city+state combination. I tried to use regr_slope, but it doesn't work. Here is my code:
select c.*,
max(year) +1 forecast_year,
regr_slope(revenue, year)
* (max(year) + 1)
+ regr_intercept(revenue, year) forecasted_revenue
from New_customer_data c
group by Cust_ID ,
State ,
City ,
year ,
id_month ,
revenue ;
I need to figure out what revenue I can expect for next year.
Remove revenue and year from the GROUP BY clause as those are the columns you want to perform the regression on:
select cust_id,
city,
state,
id_month,
max(year) +1 forecast_year,
regr_slope(revenue, year)
* (max(year) + 1)
+ regr_intercept(revenue, year) forecasted_revenue
from New_customer_data c
group by
Cust_ID,
city,
state,
id_month;
Which, for your sample data:
insert into New_customer_data
select 1, 'MN' , 'Minneapolis', 2016, 1, 679862 from dual union all
select 1, 'IL', 'Chicago' , 2016, 2, 11862 from dual union all
select 1, 'MN' , 'Minneapolis', 2017, 1,547365 from dual union all
select 1, 'IL', 'Chicago' , 2017, 2, 705365 from dual union all
select 2, 'CA', 'San Diego', 2016, 1, 51074 from dual union all
select 2, 'CA', 'LA', 2016, 2, 598862 from dual union all
select 2, 'CA', 'San Diego', 2017, 1, 705365 from dual union all
select 2,'CA', 'LA', 2017, 2, 50611 from dual union all
select 3, 'CA', 'Santa Monica', 2016, 1, 190706 from dual union all
select 3, 'IL', 'Evanston', 2016, 2, 679862 from dual union all
select 3, 'CA', 'Santa Monica', 2017, 1, 705365 from dual union all
select 3, 'IL', 'Evanston', 2017, 2, 90393 from dual union all
select 4, 'MN', 'Shakopee', 2016, 1, 31649 from dual union all
select 4, 'FL', 'Miami', 2016, 2,888862 from dual union all
select 4, 'MN', 'Shakopee', 2017, 1, 125365 from dual union all
select 4, 'FL', 'Miami', 2017, 2, 30566 from dual;
Outputs:
CUST_ID CITY         STATE ID_MONTH FORECAST_YEAR FORECASTED_REVENUE
------- ------------ ----- -------- ------------- ------------------
      1 Minneapolis  MN           1          2018             414868
      1 Chicago      IL           2          2018            1398868
      2 San Diego    CA           1          2018            1359656
      2 LA           CA           2          2018            -497640
      3 Santa Monica CA           1          2018            1220024
      3 Evanston     IL           2          2018            -499076
      4 Shakopee     MN           1          2018             219081
      4 Miami        FL           2          2018            -827730
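REGR_SLOPE and REGR_INTERCEPT fit an ordinary least-squares line, so the forecast is that line evaluated at next year. As a rough cross-check of the numbers above, the arithmetic can be sketched in Python; the helper name `linear_forecast` is made up for this illustration.

```python
def linear_forecast(points, target_x):
    """Least-squares line through (x, y) points, evaluated at target_x.

    Returns None when every x is the same, since no slope can be fitted
    (Oracle's REGR_SLOPE returns NULL in that situation).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope * target_x + intercept


# (year, revenue) for customer 1, month 1 (Minneapolis) and month 2 (Chicago).
minneapolis = [(2016, 679862), (2017, 547365)]
chicago = [(2016, 11862), (2017, 705365)]

print(round(linear_forecast(minneapolis, 2018)))
print(round(linear_forecast(chicago, 2018)))
```

With only two years per group, the fitted line passes through both points, so the forecast simply repeats last year's change. That is why some groups in the output above end up with negative revenue.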

How to group sales by month, quarter and year in the same row using case?

I'm trying to return the total number of sales for every month, every quarter, for the year 2016. I want to display annual sales on the first month row, and not on the other rows. Plus, I want to display the quarter sales on the first month of each quarter, and not on the others.
To further explain this, here's what I want to achieve:
MONTH MONTH_SALES QUARTER_SALES YEAR_SALES
    1        2183          5917      12505
    2        1712             -          -
    3        1972             -          -
    4        2230          6588          -
    5        2250             -          -
    6        2108             -          -
Here's my SQL query so far:
SELECT
Time.month,
SUM(Sales.sales) AS MONTH_SALES, -- display monthly sales.
CASE
WHEN MOD(Time.month, 3) = 1 THEN ( -- first month of quarter
SELECT
SUM(Sales.sales)
FROM
Sales,
Time
WHERE
Sales.Time_id = Time.Time_id
AND Time.year = 2016
GROUP BY
Time.quarter
FETCH FIRST 1 ROW ONLY
)
END AS QUARTER_SALES,
CASE
WHEN Time.month = 1 THEN ( -- display annual sales.
SELECT
SUM(Sales.sales)
FROM
Sales,
Time
WHERE
Sales.Time_id = Time.Time_id
AND Time.year = 2016
GROUP BY
Time.year
)
END AS YEAR_SALES
FROM
Sales,
Time
WHERE
Sales.Time_id = Time.Time_id
AND Time.year = 2016
GROUP BY
Time.month
ORDER BY
Time.month
I'm almost getting the desired output, but I'm getting the same duplicated 6588 value in quarter sales for the first and fourth month (because I'm fetching the first row that comes from first quarter).
MONTH MONTH_SALES QUARTER_SALES YEAR_SALES
    1        2183          6588      12505
    2        1712             -          -
    3        1972             -          -
    4        2230          6588          -
    5        2250             -          -
    6        2108             -          -
I even tried to put WHERE Time.quarter = ((Time.month * 4) / 12) but the month value from the outer query doesn't get passed in the subquery.
Unfortunately I don't have enough experience with CASE WHEN expressions to know how to pass the month row. Any tips would be awesome.
How about this?
Sample data:
with
  time (time_id, month, quarter, year) as
    (select 1, 1, 1, 2016 from dual union all
     select 2, 2, 1, 2016 from dual union all
     select 3, 3, 1, 2016 from dual union all
     select 4, 4, 2, 2016 from dual union all
     select 5, 5, 2, 2016 from dual union all
     select 6, 6, 2, 2016 from dual union all
     select 7, 7, 3, 2016 from dual union all
     select 8, 8, 3, 2016 from dual union all
     select 9, 9, 3, 2016 from dual
    ),
  sales (time_id, sales) as
    (select 1, 100 from dual union all
     select 1, 100 from dual union all
     select 2, 200 from dual union all
     select 3, 300 from dual union all
     select 4, 400 from dual union all
     select 5, 500 from dual union all
     select 6, 600 from dual union all
     select 7, 700 from dual union all
     select 8, 800 from dual union all
     select 9, 900 from dual
    ),
The query begins here. It uses the sum aggregate in its analytic form; the partition by clause defines the group (quarter or year) over which each total is computed. row_number, similarly, numbers the rows within each quarter/year in month order; the CASE expressions then use it to show the quarterly/yearly total only on the first row of each group.
  temp as
    (select t.month, t.quarter, t.year, sum(s.sales) month_sales
     from time t join sales s on s.time_id = t.time_id
     where t.year = 2016
     group by t.month, t.quarter, t.year
    ),
  temp2 as
    (select month, quarter, month_sales,
            sum(month_sales) over (partition by quarter) quarter_sales,
            sum(month_sales) over (partition by year   ) year_sales,
            -- number the months within each quarter/year so the first one can be picked
            row_number() over (partition by quarter order by month) rnq,
            row_number() over (partition by year    order by month) rny
     from temp
    )
select month,
       month_sales,
       case when rnq = 1 then quarter_sales end quarter_sales,
       case when rny = 1 then year_sales    end year_sales
from temp2
order by month;
     MONTH MONTH_SALES QUARTER_SALES YEAR_SALES
---------- ----------- ------------- ----------
         1         200           700       4600
         2         200
         3         300
         4         400          1500
         5         500
         6         600
         7         700          2400
         8         800
         9         900

9 rows selected.
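The "show the total only on the first row of each group" rule can also be sketched outside SQL. The Python below is an illustration only: it assumes calendar quarters derived from the month number (the query reads the quarter column instead), and the function name `sales_report` is hypothetical.

```python
# Monthly totals matching the MONTH_SALES column in the output above.
monthly = {1: 200, 2: 200, 3: 300, 4: 400, 5: 500,
           6: 600, 7: 700, 8: 800, 9: 900}


def sales_report(monthly):
    """Return (month, month_sales, quarter_sales, year_sales) rows.

    quarter_sales is filled only on the first month present in each quarter
    and year_sales only on the first month overall; other rows get None.
    """
    quarter_totals = {}
    for month, sales in monthly.items():
        quarter = (month - 1) // 3 + 1  # assumption: calendar quarters
        quarter_totals[quarter] = quarter_totals.get(quarter, 0) + sales
    year_total = sum(monthly.values())

    rows = []
    seen_quarters = set()
    for position, month in enumerate(sorted(monthly)):
        quarter = (month - 1) // 3 + 1
        first_in_quarter = quarter not in seen_quarters  # like rnq = 1
        seen_quarters.add(quarter)
        rows.append((
            month,
            monthly[month],
            quarter_totals[quarter] if first_in_quarter else None,
            year_total if position == 0 else None,       # like rny = 1
        ))
    return rows


for row in sales_report(monthly):
    print(row)
```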

How to separate range of year on oracle

I am working on an Oracle database and need to write a query that returns counts grouped by ranges of years. For example:
Suppose that I have a date field like this:
I need to take these dates and apply ranges of years to return something like:
|'0-5'|'6-10'|'11-15'|...
| 10 | 35 | 20 |...
where each column holds the number of people whose age in years falls within that range.
I tried to use SELECT CASE...
SELECT CASE
WHEN DATE_BORN <= DATE_BORN + 5 THEN '0 - 5'
WHEN DATE_BORN >= DATE_BORN + 6 AND DATE_BORN <= 10 THEN '6 - 10'
END AS AGE_RANGE,
COUNT(*)
FROM MY_TABLE
GROUP BY 1
I then realized that adding a number to a date this way shifts it by days, not years.
How can I write this query?
That's conditional aggregation:
with test (date_born) as
  (select date '2000-05-12' from dual union all
   select date '2001-05-12' from dual union all
   select date '2012-05-12' from dual union all
   select date '2013-05-12' from dual union all
   select date '2004-05-12' from dual union all
   select date '2008-05-12' from dual union all
   select date '2009-05-12' from dual union all
   select date '2001-05-12' from dual union all
   select date '2012-05-12' from dual union all
   select date '2001-05-12' from dual union all
   select date '2004-05-12' from dual union all
   select date '2005-05-12' from dual
  )
select
  sum(case when extract (year from date_born) between 2000 and 2005 then 1 else 0 end) as "2000 - 2005",
  sum(case when extract (year from date_born) between 2006 and 2010 then 1 else 0 end) as "2006 - 2010",
  sum(case when extract (year from date_born) between 2011 and 2015 then 1 else 0 end) as "2011 - 2015"
from test;

2000 - 2005 2006 - 2010 2011 - 2015
----------- ----------- -----------
          7           2           3
Here is a dynamic way to do this (using the sample table above).
First, I think it's easier to have the ranges in rows rather than columns, since the set of dates may change.
Second, your first grouping spans 6 years, so I changed it to a series of 5-year ranges:
with test (date_born) as
(select date '2000-05-12' from dual union all
select date '2001-05-12' from dual union all
select date '2012-05-12' from dual union all
select date '2013-05-12' from dual union all
select date '2004-05-12' from dual union all
select date '2008-05-12' from dual union all
select date '2009-05-12' from dual union all
select date '2001-05-12' from dual union all
select date '2012-05-12' from dual union all
select date '2001-05-12' from dual union all
select date '2004-05-12' from dual union all
select date '2005-05-12' from dual
)
,mydata AS (
SELECT
(SELECT min(extract(YEAR FROM date_born)) FROM test)+((LEVEL-1)*5)dt1
,(SELECT min(extract(YEAR FROM date_born)) FROM test)+((LEVEL-1)*5)+4 dt2
FROM dual CONNECT BY LEVEL*5 <=
(SELECT max(extract(YEAR FROM date_born))-min(extract(YEAR FROM date_born)) FROM test)+5)
SELECT d.*, count(t.date_born) cnt FROM mydata d
LEFT JOIN test t ON extract(YEAR FROM date_born) BETWEEN d.dt1 AND d.dt2
GROUP BY dt1, dt2
ORDER BY dt1;
This returns:
 DT1  DT2 CNT
---- ---- ---
2000 2004   6
2005 2009   3
2010 2014   3
The solution extracts the year from each date, finds the minimum and maximum year in the data set, uses CONNECT BY to generate the 5-year ranges between them, and then joins back to the data to count the matching records.
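The bucketing arithmetic behind that query can be checked with a short Python sketch. It assumes the birth years have already been extracted from the dates; the function name `bucket_counts` and the `width` parameter are illustrative, not from the answer.

```python
from collections import Counter

# Years extracted from the date_born values in the sample data above.
birth_years = [2000, 2001, 2012, 2013, 2004, 2008,
               2009, 2001, 2012, 2001, 2004, 2005]


def bucket_counts(years, width=5):
    """Return (range_start, range_end, count) for consecutive year ranges.

    Ranges start at the earliest year and are `width` years wide; ranges
    with no matching rows are kept with a count of 0, like the LEFT JOIN.
    """
    start = min(years)
    counts = Counter((year - start) // width for year in years)
    n_buckets = (max(years) - start) // width + 1
    return [
        (start + i * width, start + i * width + width - 1, counts.get(i, 0))
        for i in range(n_buckets)
    ]


for dt1, dt2, cnt in bucket_counts(birth_years):
    print(dt1, dt2, cnt)
```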

Store multiple results from one field in new columns in SQL

The current query returns 3 lines for one invoice if there are 3 denial codes, and more if there are multiple denial codes and multiple denial dates. I am trying to create a column for each denial code so all results can be on one line. The client requires each denial to be in its own column, so I am unable to use the listagg function.
The results should look like the below:
office, invoice, denial date, denial code 1, denial code 2, denial code 3, denial date2, denial code 1...etc
Oracle database. Current code:
SELECT
A.OFFICE_NBR,
A.INV_NBR,
TO_DATE(A.CRTD_DT,'MM/DD/YYYY') AS CARC_DT,
A.CLM_ID,
A.CLM_LN_ID,
A.RSN_CD
FROM DENIALS A
WHERE A.OFFICE_NBR = '1234'
AND A.INV_NBR = '123456'
I took a guess at some test data but this should get you going. Please add some before and after data, that helps a lot to understand what you are trying to do, and allows for building a more realistic solution.
Anyway the caveat is you have to define ahead of time a DENIAL_CD_X column for each possible denial code. I will be interested to see if someone can come up with a dynamic solution for the denial_cd columns.
with DENIALS(OFFICE_NBR, INV_NBR, CRTD_DT, RSN_CD) as (
  select '11', '1111', '07/31/2015', '1' from dual
  union
  select '11', '1111', '07/31/2015', '99' from dual
  union
  select '11', '1111', '07/31/2015', '50' from dual
  union
  select '11', '1113', '06/01/2014', '34' from dual
  union
  select '11', '1113', '06/01/2014', '71' from dual
  union
  select '32', '3232', '06/21/2015', '34' from dual
  union
  select '32', '3232', '07/31/2015', '99' from dual
)
select OFFICE_NBR, INV_NBR, TO_DATE(CRTD_DT,'MM/DD/YYYY') AS CARC_DT,
       DENIAL_CD_1, DENIAL_CD_2, DENIAL_CD_3
from
(
  select OFFICE_NBR, INV_NBR, CRTD_DT, rsn_cd,
         row_number() over(partition by OFFICE_NBR, INV_NBR, CRTD_DT order by OFFICE_NBR, INV_NBR, CRTD_DT, rsn_cd) rn
  from DENIALS
)
pivot
(
  max(rsn_cd)
  for rn in ('1' as DENIAL_CD_1, '2' as DENIAL_CD_2, '3' as DENIAL_CD_3)
)
order by OFFICE_NBR, INV_NBR, CRTD_DT;

OFFICE_NBR INV_NBR CARC_DT    DENIAL_CD_1 DENIAL_CD_2 DENIAL_CD_3
---------- ------- ---------- ----------- ----------- -----------
11         1111    31-JUL-15  1           50          99
11         1113    01-JUN-14  34          71
32         3232    21-JUN-15  34
32         3232    31-JUL-15  99
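The row_number-then-pivot idea above boils down to: group by invoice and date, sort the codes, and spread them across a fixed number of columns. A minimal Python sketch of that logic follows; the function name `pivot_codes` and the `max_codes` parameter are hypothetical, and like the PIVOT clause it silently drops any codes beyond the predefined columns.

```python
from collections import defaultdict

# Same guessed test data as the DENIALS subquery above:
# (office_nbr, inv_nbr, crtd_dt, rsn_cd), all stored as strings.
denials = [
    ("11", "1111", "07/31/2015", "1"),
    ("11", "1111", "07/31/2015", "99"),
    ("11", "1111", "07/31/2015", "50"),
    ("11", "1113", "06/01/2014", "34"),
    ("11", "1113", "06/01/2014", "71"),
    ("32", "3232", "06/21/2015", "34"),
    ("32", "3232", "07/31/2015", "99"),
]


def pivot_codes(denials, max_codes=3):
    """Return one row per (office, invoice, date) with codes in columns."""
    groups = defaultdict(list)
    for office, invoice, created, code in denials:
        groups[(office, invoice, created)].append(code)

    result = []
    for key in sorted(groups):
        # Codes are strings, so this is a string sort ('1' < '50' < '99'),
        # the same ordering row_number() uses on rsn_cd in the query.
        codes = sorted(groups[key])[:max_codes]
        codes += [None] * (max_codes - len(codes))  # pad missing columns
        result.append(key + tuple(codes))
    return result


for row in pivot_codes(denials):
    print(row)
```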