Rollup / recursive addition SQL Server 2008 - sql

I have a query with rollup that outputs data like (the query is a little busy, but I can post if necessary)
range subCounts Counts percent
1-9 3 100 3.0
10-19 13 100 13.0
20-29 30 100 33.0
30-39 74 100 74.0
NULL 100 100 100.0
How is it possible to keep a running summation total of percent? Say I need to find the bottom 15 percentile, in this case 3+13=16 so I would like for the last row to be returned read
range subCounts counts percent
10-19 13 100 13.0
EDIT1: here the query
select '$'+cast(+bin*10000 + ' ' as varchar(10)) + '-' + cast(bin*10000+9999 as varchar(10)) as bins,
count(*) as numbers,
(select count(distinct patient.patientid) from patient
inner join tblclaims on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on tblhospitals.hospitalnpi = patient.hospitalnpi
where (tblhospitals.hospitalname = 'X')
) as Totals
, round(100*count(*)/cast((select count(distinct patient.patientid) from patient
inner join tblclaims on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on tblhospitals.hospitalnpi = patient.hospitalnpi
where (tblhospitals.hospitalname = 'X')) as float),2) as binsPercent
from
(
select tblclaims.patientid, sum(claimsmedicarepaid) as TotalCosts,
cast(sum(claimsmedicarePaid)/10000 as int) as bin
from tblclaims inner join patient on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on patient.hospitalnpi = tblhospitals.hospitalnpi
where tblhospitals.hospitalname = 'X'
group by tblclaims.patientid
) as t
group by bin with rollup

OK, so for whomever might use this for reference I figured out what I needed to do.
I added row_number() over(bin) as rownum to the query and saved all of this as a view.
Then I used
SELECT *,
SUM(t2.binspercent) AS SUM
FROM t t1
INNER JOIN t t2 ON t1.rownum >= t2.rownum
GROUP BY t1.rownum,
t1.bins, t1.numbers, t1.uktotal, t1.binspercent
ORDER BY t1.rownum
by joining t1.rownum >=t2.rownum you can get the rolling count sort of thing.

This isn't exactly what i was looking for, but it's on the same track:
http://blog.tallan.com/2011/12/08/sql-server-2012-windowing-functions-part-1-of-2-running-and-sliding-aggregates/ and http://blog.tallan.com/2011/12/19/sql-server-2012-windowing-functions-part-2-of-2-new-analytic-functions/ - check out PERCENT_RANK
CUME_DIST
PERCENTILE_CONT
PERCENTILE_DISC
Sorry for the lame answer

Related

How to force postgres to return 0 even if there are no rows matching query, using coalesce, group by and join

I've been trying hopelessly to get the following SQL statement to return the query results and default to 0 if there are no rows matching the query.
This is the intended result:
vol | year
-------+------
0 | 2018
Instead I get:
vol | year
-----+------
(0 rows)
Here is the sql statement:
select coalesce(vol,0) as vol, year
from (select sum(vol) as vol, year
from schema.fact_data
join schema.period_data
on schema.fact_data.period_tag = schema.period_data.tag
join schema.product_data
on schema.fact_data.product_tag =
schema.product_data.tag
join schema.market_data
on schema.fact_data.market_tag = schema.market_data.tag
where "retailer"='MadeUpRetailer'
and "product_tag"='FakeProductTag'
and "year"='2018' group by year
) as DerivedTable;
I know the query works because it returns data when there is data. Just doesn't default to 0 as intended...
Any help in finding why this is the case would be much appreciated!
Using your subquery DerivedTable, you could write:
SELECT coalesce(DerivedTable.vol, 0) AS vol,
y.year
FROM (VALUES ('2018'::text)) AS y(year)
LEFT JOIN (SELECT ...) AS DerivedTable
ON DerivedTable.year = y.year;
Remove the GROUP BY (and the outer query):
select 2018 as year, coalesce(sum(vol), 0) as vol
from schema.fact_data f join
schema.period_data p
on f.period_tag = p.tag join
schema.product_data pr
on f.product_tag = pr.tag join
schema.market_data m
on fd.market_tag = m.tag
where "retailer" = 'MadeUpRetailer' and
"product_tag" = 'FakeProductTag' and
"year" = '2018';
An aggregation query with no GROUP BY always returns exactly one row, so this should do what you want.
EDIT:
The query would look something like this:
select v.yyyy as year, coalesce(sum(vol), 0) as vol
from (values (2018), (2019)) v(yyyy) left join
schema.fact_data f
on f.year = v.yyyy left join -- this is just an example. I have no idea where year is coming from
schema.period_data p
on f.period_tag = p.tag left join
schema.product_data pr
on f.product_tag = pr.tag left join
schema.market_data m
on fd.market_tag = m.tag
group by v.yyyy
However, you have to move the where conditions to the appropriate on clauses. I have no idea where the columns are coming from.
From the code you posted it is not clear in which table you have the year column.
You can use UNION to fetch just 1 row in case there are no rows in that table for the year 2018 like this:
select sum(vol) as vol, year
from schema.fact_data innrt join schema.period_data
on schema.fact_data.period_tag = schema.period_data.tag
inner join schema.product_data
on schema.fact_data.product_tag = schema.product_data.tag
inner join schema.market_data
on schema.fact_data.market_tag = schema.market_data.tag
where
"retailer"='MadeUpRetailer' and
"product_tag"='FakeProductTag' and
"year"='2018'
group by "year"
union
select 0 as vol, '2018' as year
where not exists (
select 1 from tablename where "year" = '2018'
)
In case there are rows for the year 2018, then nothing will be fetched by the 2nd query,

altering query in db2 to fix count from a join

I'm getting an aggregated count of records for orders and I'm getting the expected count on this basic query:
SELECT
count(*) as sales_180,
180/count(*) as velocity
FROM custgroup g
WHERE g.cstnoc = 10617
AND g.framec = 4847
AND g.covr1c = 1763
AND g.colr1c = 29
AND date(substr(g.extd1d,1,4)||'-'||substr(g.EXTD1d,5,2)||'-'||substr(g.EXTD1d,7,2) ) between current_Date - 180 DAY AND current_Date
But as soon as I add back in my joins and joined values then my count goes from 1 (which it should be) to over 200. All I need from these joins is the customer ID and the manager number. so even if my count is high, I'm basically just trying to say "for this cstnoc, give me the slsupr and xlsno"
How can I perform this below query without affecting the count? I only want my count (sales_180 and velocity) coming from the custgroup table based on my where clause, but I then just want one value of the xcstno and xslsno based on the cstnoc.
SELECT
count(*) as sales_180,
180/count(*) as velocity,
c.xslsno as CustID,
cr.slsupr as Manager
FROM custgroup g
inner join customers c
on g.cstnoc = c.xcstno
inner join managers cr
on c.xslsno = cr.xslsno
WHERE g.cstnoc = 10617
AND g.framec = 4847
AND g.covr1c = 1763
AND g.colr1c = 29
AND date(substr(g.extd1d,1,4)||'-'||substr(g.EXTD1d,5,2)||'-'||substr(g.EXTD1d,7,2) ) between current_Date - 180 DAY AND current_Date
GROUP BY c.xslsno, cr.slsupr
You are producing multiple rows when joining, so your count is now counting all the resulting rows with all that [unintended] multiplicity.
The solution? Use a table expression to pre-compute your count, and then you can join it to the other tables, as in:
select
g2.sales_180,
g2.velocity,
c.xslsno as CustID,
cr.slsupr as Manager
from customers c
join managers cr on c.xslsno = cr.xslsno
join ( -- here the Table Expression starts
SELECT
count(*) as sales_180,
180/count(*) as velocity
FROM custgroup g
WHERE g.cstnoc = 10617
AND g.framec = 4847
AND g.covr1c = 1763
AND g.colr1c = 29
AND date(substr(g.extd1d,1,4)||'-'||substr(g.EXTD1d,5,2)
||'-'||substr(g.EXTD1d,7,2) )
between current_Date - 180 DAY AND current_Date
) g2 on g2.cstnoc = c.xcstno
You can also use a Common Table Expression (CTE) that will produce the same result:
with g2 as (
SELECT
count(*) as sales_180,
180/count(*) as velocity
FROM custgroup g
WHERE g.cstnoc = 10617
AND g.framec = 4847
AND g.covr1c = 1763
AND g.colr1c = 29
AND date(substr(g.extd1d,1,4)||'-'||substr(g.EXTD1d,5,2)
||'-'||substr(g.EXTD1d,7,2) )
between current_Date - 180 DAY AND current_Date
)
select
g2.sales_180,
g2.velocity,
c.xslsno as CustID,
cr.slsupr as Manager
from customers c
join managers cr on c.xslsno = cr.xslsno
join g2 on g2.cstnoc = c.xcstno

aggregating nested SQL statements to fewer columns

I am trying to aggregate my data and group it with respect to SKU's and the cluster ID associated with that SKU.
My current output brings back roughly 40,000 rows (5 SKU's * 8,000 Stores) however I want just 35.
My code:
SELECT DISTINCT E.*
FROM ALC_ITEM_SOURCE P
RIGHT JOIN
(
SELECT D.* ,SUM(L.ALLOCATED_QTY) AS TOTAL_ALLOCATED
FROM ALC_ITEM_LOC L
RIGHT JOIN
(
SELECT C.*
FROM STORE S,
(
SELECT A.*, B.LOCATION AS STORE_NUMBER
FROM FDT_MAP_CLUSTER_LOCATION B,
(
SELECT DISTINCT SS.ALLOC_CLUSTER_ID, SS.ALLOC_CLUSTER_NAME, SS.SKU
from fdt_maptool_sas_data ss
WHERE SS.SKU IN (1099866,
1099896,
1000898,
1000960,
1000988
)
AND SS.ORDER_NO IS NOT NULL
AND ALLOC_CLUSTER_NAME NOT LIKE '%DC Cluster%'
GROUP BY SS.ALLOC_CLUSTER_ID, SS.ALLOC_CLUSTER_NAME, SS.WORKSHEET_ID, SS.SKU
)A
WHERE B.CLUSTER_ID = A.ALLOC_CLUSTER_ID
AND B.LOCATION_TYPE = 'S'
)C
WHERE S.STORE = C.STORE_NUMBER
AND S.STORE_CLOSE_DATE IS NULL
AND S.DISTRICT NOT IN (997, 998, 999)
AND S.STORE_OPEN_DATE <= SYSDATE
)D
ON L.ITEM_ID = D.SKU
AND L.LOCATION_ID = D.STORE_NUMBER
GROUP BY D.ALLOC_CLUSTER_ID, D.ALLOC_CLUSTER_NAME, D.SKU, D.STORE_NUMBER
)E
ON P.ITEM_ID = E.SKU
AND P.SOURCE_TYPE <> 4
AND P.RELEASE_DATE > '01-FEB-2018'
My desired result would contain:
SKU Cluster_ID Total_allocated Count(stores)
1000989 1AA STORES 258 200
1000989 2A STORES 78 600
1000989 B STORES 36 500
1000989 C STORES 114 100
1000989 D STORES 144 1222
1000989 E STORES 168 600
1000989 F STORES 60 501
Which is taking a sum of total allocated per store per cluster ID.
As you can see each SKU has a grade (AA-F), I would want to repeat this 5 times since I have 5 SKU's.
Basically I am asking how can I aggregate my data up to look like the above table from the 40,000 rows it is now.
Any help is appreciated!
Just to make your sql nicer and neat, you should avoid constructing joins in 'where' statement.
Also I think you have nothing to do with ALC_ITEM_SOURCE table, since you did not use it practically.
You may try this version, or at least start working on it:
select SS.ALLOC_CLUSTER_ID,SS.ALLOC_CLUSTER_NAME,SS.SKU,SUM (L.ALLOCATED_QTY) as total_allocated,count(b.location) as store_number
FROM fdt_maptool_sas_data ss
inner join
FDT_MAP_CLUSTER_LOCATION b on B.CLUSTER_ID =A.ALLOC_CLUSTER_ID AND B.LOCATION_TYPE = 'S'
inner join store s on S.STORE = b.location AND S.STORE_CLOSE_DATE IS NULL AND S.DISTRICT NOT IN (997, 998, 999) AND S.STORE_OPEN_DATE <= SYSDATE
left outer join ALC_ITEM_LOC L on L.ITEM_ID = ss.SKU AND L.LOCATION_ID = b.location
WHERE SS.SKU IN (1099866,
1099896,
1000898,
1000960,
1000988)
AND SS.ORDER_NO IS NOT NULL
AND ALLOC_CLUSTER_NAME NOT LIKE '%DC Cluster%'

How to use alias of a subquery to get the running total?

I have a UNION of 3 tables for calculating some balance and I need to get the running SUM of that balance but I can't use PARTITION OVER, because I must do it with a sql query that can work in Access.
My problem is that I cannot use JOIN on an alias subquery, it won't work.
How can I use alias in a JOIN to get the running total?
Or any other way to get the SUM that is not with PARTITION OVER, because it does not exist in Access.
This is my code so far:
SELECT korisnik_id, imePrezime, datum, Dug, Pot, (Dug - Pot) AS Balance
FROM (
SELECT korisnik_id, k.imePrezime, r.datum, SUM(IIF(u.jedinstven = 1, r.cena, k.kvadratura * r.cena)) AS Dug, '0' AS Pot
FROM Racun r
INNER JOIN Usluge u ON r.usluga_id = u.ID
INNER JOIN Korisnik k ON r.korisnik_id = k.ID
WHERE korisnik_id = 1
AND r.zgrada_id = 1
AND r.mesec = 1
AND r.godina = 2017
GROUP BY korisnik_id, k.imePrezime, r.datum
UNION ALL
SELECT korisnik_id, k.imePrezime, rp.datum, SUM(IIF(u.jedinstven = 1, rp.cena, k.kvadratura * rp.cena)) AS Dug, '0' AS Pot
FROM RacunP rp
INNER JOIN Usluge u ON rp.usluga_id = u.ID
INNER JOIN Korisnik k ON rp.korisnik_id = k.ID
WHERE korisnik_id = 1
AND rp.zgrada_id = 1
AND rp.mesec = 1
AND rp.godina = 2017
GROUP BY korisnik_id, k.imePrezime, rp.datum
UNION ALL
SELECT uu.korisnik_id, k.imePrezime, uu.datum, '0' AS Dug, SUM(uu.iznos) AS Pot
FROM UnosUplata uu
INNER JOIN Korisnik k ON uu.korisnik_id = k.ID
WHERE korisnik_id = 1
GROUP BY uu.korisnik_id, k.imePrezime, uu.datum
) AS a
ORDER BY korisnik_id
You can save a query (let's name it Query1) for the UNION of the 3 tables and then create another query that returns each row in the first query and calculates the sum of the rows that are before it (optionally checking that they are in the same group).
It should be something like this:
SELECT *, (
SELECT SUM(Value) FROM Query1 AS b
WHERE b.GroupNumber=a.GroupNumber
AND b.Position<=a.Position
) AS RunningSum
FROM Query1 AS a
However, it's more efficient to do that in the report.

Results of SQL grouping not as intended

Intro Details:
I have an issue with a query for Oracle SQL Developer using client 12c. I've researched several other questions on SO as well as searched Google for an answer to this, but ultimately the answers to all have been to include all columns without aggregate functions in the GROUP BY clause.
What I want is to get a single result for each category (PG.CAT) per year, week, and day (FD.YEAR, FD.WEEK, and FD.DT respectively). I want to sum the units, hours, errors (GE.QTY), and total hours. I also perform multiplication and division on two columns and join up to four other tables.
Query:
`SELECT
FD.YEAR,
FD.WEEK,
PG.DT,
PG.CAT,
SUM(PG.UNITS) AS UNITS,
SUM(PG.HOURS) AS HOURS,
PM.MTM,
PM.MTM * PG.HOURS AS ADJ_MTM,
(PG.UNITS / (PM.MTM * PG.HOURS)) AS PERC_STANDARD,
SUM(CASE WHEN GE.QTY IS NULL THEN 0 ELSE GE.QTY END) AS QTY,
SUM(WH.TOTALHOURS) AS TOTALHOURS
FROM
PROD_GUNS PG
INNER JOIN PROD_MTM PM
ON PG.CAT = PM.CATEGORY
AND PM.DEPTNO = '018'
AND PG.DT BETWEEN PM.START_DT AND PM.END_DT
INNER JOIN FISCAL_DATES_DAYS FD
ON PG.DT = FD.DT
LEFT OUTER JOIN PROD_GUNS_ERRORS GE
ON PG.EID = GE.EID
AND PG.DT = GE.DT
INNER JOIN WEEKLY_HOURS WH
ON FD.WEEK = WH.DT_WEEK
AND FD.YEAR = WH.DT_YEAR
AND PG.EID = WH.EEXX31
GROUP BY
FD.YEAR,
FD.WEEK,
PG.DT,
PG.CAT,
PM.MTM,
PM.MTM * PG.HOURS,
(PG.UNITS / ( PM.MTM * PG.HOURS))
HAVING
FD.YEAR = '2015'
AND FD.WEEK = '1'
AND PG.DT = '29-DEC-14'
AND PG.CAT = 'Picking'
ORDER BY
PG.DT;`
Actual Result:
2015 1 29-DEC-14 Picking 46 0.5 68 34 1.35294117647058823529411764705882352941 0 32.21
2015 1 29-DEC-14 Picking 831 7.72 68 524.96 1.58297775068576653459311185614142029869 0 29.35
Intended Result:
2015 1 20-Dec-14 Picking 877 8.22 68 558.96 1.21654501216545 0 61.59
Question:
With the aggregates and grouping that I have above, why would this not be giving me the intended result? Thank you all in advance for any guidance provided.
Try to SUM/AVG (depending on what you need) PM.MTM * PG.HOURS AS ADJ_MTM and (PG.UNITS / (PM.MTM * PG.HOURS)) AS PERC_STANDARD, not group by them:
SELECT
FD.YEAR,
FD.WEEK,
PG.DT,
PG.CAT,
SUM(PG.UNITS) AS UNITS,
SUM(PG.HOURS) AS HOURS,
PM.MTM,
SUM(PM.MTM * PG.HOURS )AS ADJ_MTM,
SUM((PG.UNITS / (PM.MTM * PG.HOURS))) AS PERC_STANDARD,
SUM(CASE WHEN GE.QTY IS NULL THEN 0 ELSE GE.QTY END) AS QTY,
SUM(WH.TOTALHOURS) AS TOTALHOURS
FROM
PROD_GUNS PG
INNER JOIN PROD_MTM PM
ON PG.CAT = PM.CATEGORY
AND PM.DEPTNO = '018'
AND PG.DT BETWEEN PM.START_DT AND PM.END_DT
INNER JOIN FISCAL_DATES_DAYS FD
ON PG.DT = FD.DT
LEFT OUTER JOIN PROD_GUNS_ERRORS GE
ON PG.EID = GE.EID
AND PG.DT = GE.DT
INNER JOIN WEEKLY_HOURS WH
ON FD.WEEK = WH.DT_WEEK
AND FD.YEAR = WH.DT_YEAR
AND PG.EID = WH.EEXX31
GROUP BY
FD.YEAR,
FD.WEEK,
PG.DT,
PG.CAT,
PM.MTM
HAVING
FD.YEAR = '2015'
AND FD.WEEK = '1'
AND PG.DT = '29-DEC-14'
AND PG.CAT = 'Picking'
ORDER BY
PG.DT;