aggregating nested SQL statements to fewer columns - sql

I am trying to aggregate my data and group it by SKU and by the cluster ID associated with each SKU.
My current output returns roughly 40,000 rows (5 SKUs * 8,000 stores), but I want just 35 (5 SKUs * 7 clusters).
My code:
SELECT DISTINCT E.*
FROM ALC_ITEM_SOURCE P
RIGHT JOIN
(
SELECT D.* ,SUM(L.ALLOCATED_QTY) AS TOTAL_ALLOCATED
FROM ALC_ITEM_LOC L
RIGHT JOIN
(
SELECT C.*
FROM STORE S,
(
SELECT A.*, B.LOCATION AS STORE_NUMBER
FROM FDT_MAP_CLUSTER_LOCATION B,
(
SELECT DISTINCT SS.ALLOC_CLUSTER_ID, SS.ALLOC_CLUSTER_NAME, SS.SKU
from fdt_maptool_sas_data ss
WHERE SS.SKU IN (1099866,
1099896,
1000898,
1000960,
1000988
)
AND SS.ORDER_NO IS NOT NULL
AND ALLOC_CLUSTER_NAME NOT LIKE '%DC Cluster%'
GROUP BY SS.ALLOC_CLUSTER_ID, SS.ALLOC_CLUSTER_NAME, SS.WORKSHEET_ID, SS.SKU
)A
WHERE B.CLUSTER_ID = A.ALLOC_CLUSTER_ID
AND B.LOCATION_TYPE = 'S'
)C
WHERE S.STORE = C.STORE_NUMBER
AND S.STORE_CLOSE_DATE IS NULL
AND S.DISTRICT NOT IN (997, 998, 999)
AND S.STORE_OPEN_DATE <= SYSDATE
)D
ON L.ITEM_ID = D.SKU
AND L.LOCATION_ID = D.STORE_NUMBER
GROUP BY D.ALLOC_CLUSTER_ID, D.ALLOC_CLUSTER_NAME, D.SKU, D.STORE_NUMBER
)E
ON P.ITEM_ID = E.SKU
AND P.SOURCE_TYPE <> 4
AND P.RELEASE_DATE > '01-FEB-2018'
My desired result would contain:
SKU      Cluster_ID  Total_allocated  Count(stores)
1000989  1AA STORES  258              200
1000989  2A STORES   78               600
1000989  B STORES    36               500
1000989  C STORES    114              100
1000989  D STORES    144              1222
1000989  E STORES    168              600
1000989  F STORES    60               501
That is, the allocated quantity summed over all stores in each cluster, plus the number of stores in that cluster.
As you can see, each SKU has a grade (AA-F). I want this block repeated 5 times, since I have 5 SKUs.
Basically, I am asking how to aggregate my data so it looks like the table above instead of the 40,000 rows I get now.
Any help is appreciated!

To keep your SQL neat, avoid writing join conditions in the WHERE clause; use explicit JOIN ... ON instead.
Also, I don't think you need the ALC_ITEM_SOURCE table at all, since none of its columns end up in your output.
You can try this version, or at least use it as a starting point:
SELECT SS.ALLOC_CLUSTER_ID,
SS.ALLOC_CLUSTER_NAME,
SS.SKU,
SUM(L.ALLOCATED_QTY) AS total_allocated,
COUNT(B.LOCATION) AS store_count
FROM fdt_maptool_sas_data SS
INNER JOIN FDT_MAP_CLUSTER_LOCATION B
ON B.CLUSTER_ID = SS.ALLOC_CLUSTER_ID AND B.LOCATION_TYPE = 'S'
INNER JOIN STORE S
ON S.STORE = B.LOCATION AND S.STORE_CLOSE_DATE IS NULL
AND S.DISTRICT NOT IN (997, 998, 999) AND S.STORE_OPEN_DATE <= SYSDATE
LEFT OUTER JOIN ALC_ITEM_LOC L
ON L.ITEM_ID = SS.SKU AND L.LOCATION_ID = B.LOCATION
WHERE SS.SKU IN (1099866, 1099896, 1000898, 1000960, 1000988)
AND SS.ORDER_NO IS NOT NULL
AND SS.ALLOC_CLUSTER_NAME NOT LIKE '%DC Cluster%'
-- GROUP BY is required because the SELECT list mixes plain columns with aggregates.
-- The store is deliberately left out so that rows collapse to one per cluster and SKU.
GROUP BY SS.ALLOC_CLUSTER_ID, SS.ALLOC_CLUSTER_NAME, SS.SKU
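To see why the grouping columns decide the row count, here is a minimal sketch using Python's built-in sqlite3 module. The table store_alloc and its rows are made up for illustration; it stands in for the store-level result of the joins above. Grouping by SKU and cluster only (and not by store) collapses the store rows into one row per cluster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical store-level data: one row per SKU, cluster and store.
CREATE TABLE store_alloc (
    sku INTEGER, cluster_name TEXT, store INTEGER, allocated_qty INTEGER
);
INSERT INTO store_alloc VALUES
    (1000988, 'A STORES', 1, 6),
    (1000988, 'A STORES', 2, 12),
    (1000988, 'B STORES', 3, 6),
    (1000988, 'B STORES', 4, NULL),  -- store with no allocation (outer join miss)
    (1000960, 'A STORES', 1, 18);
""")

# The store is not in GROUP BY, so each (sku, cluster) pair yields one row.
rows = conn.execute("""
    SELECT sku,
           cluster_name,
           COALESCE(SUM(allocated_qty), 0) AS total_allocated,
           COUNT(DISTINCT store)           AS store_count
    FROM store_alloc
    GROUP BY sku, cluster_name
    ORDER BY sku, cluster_name
""").fetchall()

for row in rows:
    print(row)
```

If the source table can hold several rows per SKU and cluster (for example, one per worksheet), the joins can repeat the same store; COUNT(DISTINCT ...) guards the store count against that, but the SUM would still need de-duplicated input.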

Related

Selecting Rows That Have One Value but Not Another

I need to get some rows from a join of two tables. The rows should have one value in a column (1047) but must not have another value (1403).
These are the tables and the query:
select a.job, a.date, b.group from log a inner join active_tmp b
on a.jobno=b.jobno and a.no=b.no where b.list = 'N'
AND LOGDATE = TO_CHAR(TRUNC(SYSDATE),'YYYYMMDD')
and a.job not like 'HOUSE%'
and a.job not like 'CAR%' and (errorCode=1047 and errorCode<>1403);
LOG
JOB  DATE      LOGDATE   JOBNO  NO  errorCode
MAM  20220123  20220125  33     22  1047
MAM  20220123  20220125  33     22  1403
DAD  20220122  20220125  11     99  1047
MAM  20220122  20220125  33     22  0323
DAD  20220122  20220125  11     99  0444
ACTIVE_TMP
JOB  JOBNO  NO  GROUP   LIST
MAM  33     22  LAPTOP  N
MAM  33     22  LAPTOP  N
DAD  11     99  KEY     N
But I get:
MAM,20220123,LAPTOP
DAD,20220122,KEY
I need:
DAD,20220122,KEY
Because MAM has both codes (1047 and 1403).
To rephrase, I think you mean "I want to return matching rows that have error code 1047 but for which the same values of jobno, no, list do not have a corresponding row with error code 1403"
This part is redundant:
AND (errorCode = 1047 AND errorCode <> 1403);
If you are saying errorCode must be 1047, you are also saying it is not equal to 1403.
I think you want to select some rows into some result set, then check that there's not another row that disqualifies one of the selected rows from the final result.
So,
SELECT a.job,
a.date,
b.group
FROM _log a
INNER JOIN _active_tmp b
ON a.jobno = b.jobno
AND a.no = b.no
WHERE b.list = 'N'
AND LOGDATE = TO_CHAR(CURRENT_TIMESTAMP,'YYYYMMDD')
AND a.job NOT LIKE 'HOUSE%'
AND a.job NOT LIKE 'CAR%'
AND a.errorCode = 1047
AND NOT EXISTS (SELECT 1
FROM _log c
INNER JOIN _active_tmp d
ON c.jobno = d.jobno
AND c.no = d.no
WHERE a.job = c.job
AND a.date = c.date
AND b.group = d.group
AND c.errorCode = 1403)
We select the rows that satisfy the join and have error code 1047 then subtract from that set those rows that also satisfy the join but have error code 1403. You could possibly make this more terse using CTE or a temp table, but this works too.
Note I had to change a few things to make it work in my engine (Postgres), so you may have to change a few things back to Oracle.
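For anyone who wants to try the NOT EXISTS pattern without an Oracle or Postgres instance, here is a small sketch using Python's built-in sqlite3 module and the LOG sample rows from the question. It is simplified on purpose: the join to ACTIVE_TMP is left out, and the column names log_date and error_code are my own choices to avoid reserved words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- LOG rows from the question (column names adapted, ACTIVE_TMP omitted).
CREATE TABLE log (
    job TEXT, log_date TEXT, jobno INTEGER, no INTEGER, error_code INTEGER
);
INSERT INTO log VALUES
    ('MAM', '20220123', 33, 22, 1047),
    ('MAM', '20220123', 33, 22, 1403),
    ('DAD', '20220122', 11, 99, 1047),
    ('MAM', '20220122', 33, 22, 323),
    ('DAD', '20220122', 11, 99, 444);
""")

# Keep the 1047 rows, then drop any whose job/date also has a 1403 row.
rows = conn.execute("""
    SELECT a.job, a.log_date
    FROM log a
    WHERE a.error_code = 1047
      AND NOT EXISTS (SELECT 1
                      FROM log c
                      WHERE c.job = a.job
                        AND c.log_date = a.log_date
                        AND c.error_code = 1403)
""").fetchall()

print(rows)
```

Only the DAD row survives, because MAM on 20220123 has a matching 1403 row.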
You need to change the error code logic: identify which JOB values have error code 1403, then exclude those values.
select distinct a.job, a.date, b.[group] from LOG a inner join active_tmp b
on a.jobno=b.jobno and a.no=b.no where b.list = 'N'
AND LOGDATE = TO_CHAR(TRUNC(SYSDATE),'YYYYMMDD')
and a.job not like 'HOUSE%'
and a.job not like 'CAR%' and a.job not in (select JOB from log where errorCode in(1403));

altering query in db2 to fix count from a join

I'm getting an aggregated count of records for orders and I'm getting the expected count on this basic query:
SELECT
count(*) as sales_180,
180/count(*) as velocity
FROM custgroup g
WHERE g.cstnoc = 10617
AND g.framec = 4847
AND g.covr1c = 1763
AND g.colr1c = 29
AND date(substr(g.extd1d,1,4)||'-'||substr(g.EXTD1d,5,2)||'-'||substr(g.EXTD1d,7,2) ) between current_Date - 180 DAY AND current_Date
But as soon as I add my joins and joined values back in, my count goes from 1 (which is what it should be) to over 200. All I need from these joins is the customer ID and the manager number. So even if my count is high, I'm basically just trying to say "for this cstnoc, give me the slsupr and xslsno".
How can I run the query below without affecting the count? I want the count (sales_180 and velocity) to come only from the custgroup table, based on my WHERE clause, and then I want just one value each of xslsno and slsupr for the cstnoc.
SELECT
count(*) as sales_180,
180/count(*) as velocity,
c.xslsno as CustID,
cr.slsupr as Manager
FROM custgroup g
inner join customers c
on g.cstnoc = c.xcstno
inner join managers cr
on c.xslsno = cr.xslsno
WHERE g.cstnoc = 10617
AND g.framec = 4847
AND g.covr1c = 1763
AND g.colr1c = 29
AND date(substr(g.extd1d,1,4)||'-'||substr(g.EXTD1d,5,2)||'-'||substr(g.EXTD1d,7,2) ) between current_Date - 180 DAY AND current_Date
GROUP BY c.xslsno, cr.slsupr
You are producing multiple rows when joining, so your count is now counting all the resulting rows with all that [unintended] multiplicity.
The solution? Use a table expression to pre-compute your count, and then you can join it to the other tables, as in:
select
g2.sales_180,
g2.velocity,
c.xslsno as CustID,
cr.slsupr as Manager
from customers c
join managers cr on c.xslsno = cr.xslsno
join ( -- here the Table Expression starts
SELECT
g.cstnoc, -- exposed so the outer join condition can reference it
count(*) as sales_180,
180/count(*) as velocity
FROM custgroup g
WHERE g.cstnoc = 10617
AND g.framec = 4847
AND g.covr1c = 1763
AND g.colr1c = 29
AND date(substr(g.extd1d,1,4)||'-'||substr(g.EXTD1d,5,2)
||'-'||substr(g.EXTD1d,7,2) )
between current_Date - 180 DAY AND current_Date
GROUP BY g.cstnoc
) g2 on g2.cstnoc = c.xcstno
You can also use a Common Table Expression (CTE) that will produce the same result:
with g2 as (
SELECT
g.cstnoc, -- exposed so the final join condition can reference it
count(*) as sales_180,
180/count(*) as velocity
FROM custgroup g
WHERE g.cstnoc = 10617
AND g.framec = 4847
AND g.covr1c = 1763
AND g.colr1c = 29
AND date(substr(g.extd1d,1,4)||'-'||substr(g.EXTD1d,5,2)
||'-'||substr(g.EXTD1d,7,2) )
between current_Date - 180 DAY AND current_Date
GROUP BY g.cstnoc
)
select
g2.sales_180,
g2.velocity,
c.xslsno as CustID,
cr.slsupr as Manager
from customers c
join managers cr on c.xslsno = cr.xslsno
join g2 on g2.cstnoc = c.xcstno
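To illustrate the row multiplication and the pre-aggregation fix outside Db2, here is a rough sketch with Python's built-in sqlite3 module. The data is invented, the managers table is left out, and only one of the filter columns is kept. The point is just that the count is computed before the join, so duplicates on the joined side cannot inflate it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, cut-down versions of the tables in the question.
CREATE TABLE custgroup (cstnoc INTEGER, framec INTEGER);
CREATE TABLE customers (xcstno INTEGER, xslsno INTEGER);
INSERT INTO custgroup VALUES (10617, 4847);
-- Three customer rows share one xcstno, so a plain join triples the count.
INSERT INTO customers VALUES (10617, 7), (10617, 7), (10617, 7);
""")

# Counting after the join counts the multiplied rows.
naive = conn.execute("""
    SELECT COUNT(*)
    FROM custgroup g
    JOIN customers c ON g.cstnoc = c.xcstno
    WHERE g.framec = 4847
""").fetchone()[0]

# Counting inside a table expression first keeps the count independent of the join.
fixed = conn.execute("""
    SELECT DISTINCT g2.sales_180, c.xslsno
    FROM (SELECT cstnoc, COUNT(*) AS sales_180
          FROM custgroup
          WHERE framec = 4847
          GROUP BY cstnoc) g2
    JOIN customers c ON g2.cstnoc = c.xcstno
""").fetchall()

print(naive, fixed)
```

The DISTINCT in the second query is my addition: the pre-computed count is already correct, but without it the result would still repeat once per matching customer row.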

Merge results to one column

I have the following query:
SELECT a.User1 as Employee
, isnull(sum(distinct b.Page_Count),0) AS Yesterday
, isnull(sum(distinct c.Page_Count),0) AS Today
, isnull(sum(distinct d.Page_Count),0) AS Week
, e.Material_Location as '(Yesterday)'
, f.Material_Location as '(Today)'
From TaskUser AS a
LEFT JOIN PaperMaterial AS b
ON b.Assigned_To = a.User1
AND b.Date_Assigned between ('06/09/2014') AND ('06/13/2014')
LEFT JOIN PaperMaterial AS c
ON c.Assigned_To = a.User1
AND c.Date_Assigned between ('06/13/2014') AND ('06/14/2014')
LEFT JOIN PaperMaterial AS d
ON d.Assigned_To = a.User1
AND d.Date_Assigned between ('06/09/2014') AND ('06/14/2014')
LEFT JOIN PaperMaterial AS e
ON e.Assigned_To = a.User1
AND e.Date_Assigned between ('06/12/2014') AND ('06/13/2014')
LEFT JOIN PaperMaterial AS f
ON f.Assigned_To = a.User1
AND f.Date_Assigned between ('06/13/2014') AND ('06/14/2014')
GROUP BY a.User1, e.Material_Location, f.Material_Location
Order By a.User1, e.Material_Location, f.Material_Location
If multiple records were entered for the same user on the same day, I get several rows for the same person. I only want one row per user, with the e results merged into one column and the f results merged into another.
Ie: Current Output =
Amy 0 640 640 NoTask Task
Amy 0 640 640 Task2 Task
Amy 0 640 640 Task3 Task4
Amy 0 640 640 Task1 NoTask
Requested output:
Amy 0 640 640 (NoTask, Task1, Task2, Task3) (NoTask, Task, Task4)
Here's a greatly over-simplified example of using stuff combined with a correlated subquery:
I used your output as a table, more or less:
select
name,
stuff(
(
select cast(',' as varchar(max)) + mt.one
from MyTable mt
WHERE mt.name = t1.name
order by mt.name
for xml path('')
), 1, 1, '')
from mytable t1
group by name
The FOR XML PATH('') subquery concatenates every value of the column I creatively named ONE for each NAME, each prefixed with a comma, and STUFF then removes the leading comma. The correlated subquery lets us relate each row coming out of it to the corresponding row coming out of the main query.
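If you want to experiment with the same idea without SQL Server, here is a small sketch using Python's built-in sqlite3 module. Note that it swaps in SQLite's group_concat aggregate, which is not the STUFF / FOR XML PATH technique above, only a rough analogue on a different engine. The table my_table and its column names are made up from the sample output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table shaped like the question's current output.
CREATE TABLE my_table (name TEXT, yesterday_loc TEXT, today_loc TEXT);
INSERT INTO my_table VALUES
    ('Amy', 'NoTask', 'Task'),
    ('Amy', 'Task2',  'Task'),
    ('Amy', 'Task3',  'Task4'),
    ('Amy', 'Task1',  'NoTask');
""")

# One row per name; DISTINCT drops repeated locations before concatenating.
rows = conn.execute("""
    SELECT name,
           group_concat(DISTINCT yesterday_loc) AS yesterday_locs,
           group_concat(DISTINCT today_loc)     AS today_locs
    FROM my_table
    GROUP BY name
""").fetchall()

# SQLite does not promise an order inside group_concat, so sort in Python.
merged = {
    name: (sorted(y.split(",")), sorted(t.split(",")))
    for name, y, t in rows
}
print(merged)
```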

Get percentages of larger group

The query below is kind of an ugly one, so I hope I've spaced it well enough to make it readable. It finds the percentage of people from a certain area who visit a given hospital. For instance, if 100 people live in county X and 20 go to hospital A and 80 go to hospital B, the query outputs:
hospital A 20
hospital B 80
The query below works exactly like I want it to, but it got me thinking: how could this be done for every county in my table? How the heck is this sort of thing done? Let me know if I need to document the query or do anything else to make it clearer.
select hospitalname, round(cast(counts as float)/cast(fayettestrokepop as float)*100,2)as percentSeen
from
(
SELECT tblHospitals.hospitalname, COUNT(tblHospitals.hospitalname) AS counts, tblStateCounties_1.countyName,
(SELECT COUNT(*) AS Expr1
FROM Patient INNER JOIN
tblStateCounties ON Patient.stateCode = tblStateCounties.stateCode AND Patient.countyCode = tblStateCounties.countyCode
WHERE (tblStateCounties.stateCode = '21') AND (tblStateCounties.countyName = 'fayette')) AS fayetteStrokePop
FROM Patient AS Patient_1 INNER JOIN
tblHospitals ON Patient_1.hospitalnpi = tblHospitals.hospitalnpi INNER JOIN
tblStateCounties AS tblStateCounties_1 ON Patient_1.stateCode = tblStateCounties_1.stateCode AND Patient_1.countyCode = tblStateCounties_1.countyCode
WHERE (tblStateCounties_1.stateCode = '21') AND (tblStateCounties_1.countyName = 'fayette')
GROUP BY tblHospitals.hospitalname, tblStateCounties_1.countyName
) as t
order by percentSeen desc
EDIT: sample data
The sample data below is without the outermost query (the as t order by part).
The countsInTheCounty column is the (select count(*)..) part after 'tblStateCounties_1.countyName'
hospitalName hospitalCounts countyName countsInTheCounty
st. james 23 X 300
st. jude 40 X 300
Now with the outer query we would get:
st. james 0.076 (23/300)
st. jude 0.1333 (40/300)
Here is my guess. You'll have to test against your data or provide proper DDL + sample data.
;WITH totalCounts AS
(
SELECT StateCode, countyCode, COUNT(*) AS totalcount
FROM dbo.Patient GROUP BY StateCode, countyCode
)
SELECT
h.hospitalName,
hospitalCounts = COUNT(p.hospitalnpi),
c.countyName,
countsInTheCounty = tc.totalCount,
percentseen = CONVERT(DECIMAL(5,2), COUNT(p.hospitalnpi)*100.0/tc.totalCount)
FROM
dbo.Patient AS p
INNER JOIN
dbo.tblHospitals AS h
ON p.hospitalnpi = h.hospitalnpi
INNER JOIN
totalCounts AS tc
ON p.StateCode = tc.StateCode
AND p.countyCode = tc.countyCode
INNER JOIN
dbo.tblStateCounties AS c
ON tc.StateCode = c.stateCode
AND tc.countyCode = c.countyCode
GROUP BY
h.hospitalname,
c.countyName,
tc.totalcount
ORDER BY
c.countyName,
percentseen DESC;
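Here is a stripped-down sketch of the same approach that can be run with Python's built-in sqlite3 module, in case it helps to see it on toy data. The single patient table with hospital and county columns is an assumption made to keep the example short; the real schema has separate hospital and county lookup tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (hospital TEXT, county TEXT)")

# Hypothetical data: county X has 4 patients, county Y has 5.
patients = (
    [("st. james", "X")] * 1 + [("st. jude", "X")] * 3
    + [("st. james", "Y")] * 1 + [("st. jude", "Y")] * 4
)
conn.executemany("INSERT INTO patient VALUES (?, ?)", patients)

# The CTE computes each county's total once; the main query divides each
# hospital's count within a county by that total.
rows = conn.execute("""
    WITH total_counts AS (
        SELECT county, COUNT(*) AS total
        FROM patient
        GROUP BY county
    )
    SELECT p.county,
           p.hospital,
           ROUND(COUNT(*) * 100.0 / tc.total, 2) AS percent_seen
    FROM patient p
    JOIN total_counts tc ON tc.county = p.county
    GROUP BY p.county, p.hospital, tc.total
    ORDER BY p.county, percent_seen DESC
""").fetchall()

for row in rows:
    print(row)
```

Multiplying by 100.0 rather than 100 forces floating-point division, which is the same reason the query above uses 100.0.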

Rollup / recursive addition SQL Server 2008

I have a query with rollup that outputs data like (the query is a little busy, but I can post if necessary)
range  subCounts  Counts  percent
1-9    3          100     3.0
10-19  13         100     13.0
20-29  30         100     33.0
30-39  74         100     74.0
NULL   100        100     100.0
How can I keep a running total of percent? Say I need to find the bottom 15th percentile: in this case 3+13=16, so I would like the row where the running total first reaches 15 to be returned:
range  subCounts  counts  percent
10-19  13         100     13.0
EDIT1: here the query
select '$'+cast(+bin*10000 + ' ' as varchar(10)) + '-' + cast(bin*10000+9999 as varchar(10)) as bins,
count(*) as numbers,
(select count(distinct patient.patientid) from patient
inner join tblclaims on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on tblhospitals.hospitalnpi = patient.hospitalnpi
where (tblhospitals.hospitalname = 'X')
) as Totals
, round(100*count(*)/cast((select count(distinct patient.patientid) from patient
inner join tblclaims on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on tblhospitals.hospitalnpi = patient.hospitalnpi
where (tblhospitals.hospitalname = 'X')) as float),2) as binsPercent
from
(
select tblclaims.patientid, sum(claimsmedicarepaid) as TotalCosts,
cast(sum(claimsmedicarePaid)/10000 as int) as bin
from tblclaims inner join patient on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on patient.hospitalnpi = tblhospitals.hospitalnpi
where tblhospitals.hospitalname = 'X'
group by tblclaims.patientid
) as t
group by bin with rollup
OK, so for whoever might use this for reference, I figured out what I needed to do.
I added row_number() over (order by bin) as rownum to the query and saved all of this as a view.
Then I used
SELECT t1.rownum, t1.bins, t1.numbers, t1.uktotal, t1.binspercent,
SUM(t2.binspercent) AS SUM
FROM t t1
INNER JOIN t t2 ON t1.rownum >= t2.rownum
GROUP BY t1.rownum,
t1.bins, t1.numbers, t1.uktotal, t1.binspercent
ORDER BY t1.rownum
Joining on t1.rownum >= t2.rownum pairs each row with itself and every earlier row, so the SUM gives a running total.
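A quick way to check the self-join running total is with Python's built-in sqlite3 module. This is only a sketch: the table t and its four rows are invented (the percentages are not the ones from the question), and the 15 percent cutoff is applied in Python afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for the view, already numbered by bin.
CREATE TABLE t (rownum INTEGER, bins TEXT, binspercent REAL);
INSERT INTO t VALUES
    (1, '1-9',   3.0),
    (2, '10-19', 13.0),
    (3, '20-29', 30.0),
    (4, '30-39', 54.0);
""")

# Each t1 row is joined to itself and all earlier rows, then summed.
running = conn.execute("""
    SELECT t1.rownum, t1.bins, t1.binspercent,
           SUM(t2.binspercent) AS running_percent
    FROM t t1
    INNER JOIN t t2 ON t1.rownum >= t2.rownum
    GROUP BY t1.rownum, t1.bins, t1.binspercent
    ORDER BY t1.rownum
""").fetchall()

# First bin whose running total reaches the 15 percent mark.
cutoff_row = next(r for r in running if r[3] >= 15)
print(running)
print(cutoff_row)
```

The self-join compares every row with every earlier row, so it gets slow on large tables; on engines that support it, SUM(...) OVER (ORDER BY ...) does the same job in one pass.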
This isn't exactly what I was looking for, but it's on the same track:
http://blog.tallan.com/2011/12/08/sql-server-2012-windowing-functions-part-1-of-2-running-and-sliding-aggregates/ and http://blog.tallan.com/2011/12/19/sql-server-2012-windowing-functions-part-2-of-2-new-analytic-functions/
Check out PERCENT_RANK, CUME_DIST, PERCENTILE_CONT and PERCENTILE_DISC.
Sorry for the lame answer