Should I use a subquery or sum in this situation?


I am trying to take line 11 of the query below (id1.[calc_area], which can return several square-footage rows per house for areas such as porches and garages, and also includes the living area) and subtract line 10 (pp.living_area) to get the total square footage of the areas of a house other than the living area. Something like this is what I have in mind:
-- sum(id1.[calc_area] - pp.living_area) as [other_ area],
My problem is that the two numbers come from different tables, and the select statement uses a different table, pv, in its FROM clause. What would be the easiest way to accomplish this?
select distinct pv.prop_id,
pv.hood_cd as neighborhood,
pv.abs_subdv_cd as subdivision,
cast (pv.[legal_desc] as char(16)) as legal,
[deed_date],
[consideration],
pv.prop_val_yr as year,
sts1.[situs_num] as address,
cast(sts1.[situs_street] as char(11)) as street,
pp.living_area,
id1.[calc_area] as [total_area],
cast(pp.[land_total_acres] as decimal(6,2)) as acres,
[sale_type],
case when [sale_date] >= '01/01/2014' then convert(varchar(18), [sale_date], 101) else '' end as 'sale date',
pp.ls_table,
(pv.land_hstd_val + pv.land_non_hstd_val + pv.ag_market + pv.timber_market) as land_val,
cast(pp.[main_land_total_adj] as decimal(5,2)) as land_adj_total,
(pv.imprv_hstd_val + pv.imprv_non_hstd_val) as imprv_val,
case when [sale_date] >= '01/01/2014' then [sale_price] else 0 end as 'sale price',
pv.market
from property_val pv with (nolock)
inner join prop_supp_assoc psa with (nolock) on
pv.prop_id = psa.prop_id
and pv.prop_val_yr = psa.owner_tax_yr
and pv.sup_num = psa.sup_num
inner join property p with (nolock)on
pv.prop_id = p.prop_id
inner join owner o with (nolock) on
pv.prop_id = o.prop_id
and pv.prop_val_yr = o.owner_tax_yr
and pv.sup_num = o.sup_num
inner join account ac with (nolock) on
o.owner_id = ac.acct_id
inner join property_profile pp with (nolock) on
pv.prop_id = pp.prop_id
and pv.prop_val_yr = pp.prop_val_yr
left outer join imprv_detail as id1 with (nolock) on
pv.prop_id = id1.prop_id
and pv.prop_val_yr = id1.prop_val_yr
and pv.sup_num = id1.sup_num
left outer join
(select cop.prop_id,
convert(varchar(20), co.deed_dt, 101) as deed_date,
co.consideration as consideration, s.sl_dt as sale_date,
s.sl_price as sale_price, s.sl_type_cd as sale_type
from chg_of_owner_prop_assoc cop with (nolock)
inner join chg_of_owner co with (nolock) on
co.chg_of_owner_id = cop.chg_of_owner_id
inner join sale s with (nolock) on
co.chg_of_owner_id = s.chg_of_owner_id
where cop.seq_num = 0
)as c
on c.prop_id = pv.prop_id
Basic results, with some columns hidden:
prop_id  address  street       living_area  total_area  acres
x        322      SURBER ST    939          48          0
x        322      SURBER ST    939          288         0
x        322      SURBER ST    939          939         0
xy       318      SURBER STRE  1202         0           0
xy       318      SURBER STRE  1202         120         0
xy       318      SURBER STRE  1202         340         0
xy       318      SURBER STRE  1202         1052        0

If you only need an additional sum and want to keep all the rows currently returned, use SUM with an OVER clause. Sum calc_area across the partition first, then subtract living_area once, so that it is not subtracted for every improvement row:
sum(id1.[calc_area]) over (PARTITION BY pv.prop_id, pv.prop_val_yr) - pp.living_area as [other_ area]
You can modify the partitioning, for example:
sum(id1.[calc_area]) over (PARTITION BY pv.prop_id, pv.prop_val_yr, pp.ls_table) - pp.living_area as [other_ area]
But if joining the table multiplies rows in a way you don't want, and you only need aggregated values from it, put the aggregate in a subquery inside an OUTER APPLY clause (and don't join that table in any other way), for example:
SELECT ...
, id1.total_area
...
OUTER APPLY (
SELECT Sum(calc_area) as total_area
FROM imprv_detail
WHERE prop_id = pv.prop_id
and prop_val_yr = pv.prop_val_yr
and sup_num = pv.sup_num
) AS id1
Then calculate other_area as id1.total_area - pp.living_area.
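
To try the "aggregate first, then join" idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no OUTER APPLY, so the sketch swaps in a LEFT JOIN to a pre-aggregated derived table, which has the same effect for this case. The two-table schema is a simplified assumption (only prop_id, living_area and calc_area are kept), and the rows are taken from the sample results in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical versions of the tables in the question.
CREATE TABLE property_profile (prop_id TEXT PRIMARY KEY, living_area INTEGER);
CREATE TABLE imprv_detail (prop_id TEXT, calc_area INTEGER);
INSERT INTO property_profile VALUES ('x', 939), ('xy', 1202);
INSERT INTO imprv_detail VALUES
    ('x', 48), ('x', 288), ('x', 939),
    ('xy', 0), ('xy', 120), ('xy', 340), ('xy', 1052);
""")

# Aggregate imprv_detail to one row per property BEFORE joining,
# so the join cannot multiply the property rows.
rows = conn.execute("""
    SELECT pp.prop_id,
           pp.living_area,
           id1.total_area,
           id1.total_area - pp.living_area AS other_area
    FROM property_profile AS pp
    LEFT JOIN (
        SELECT prop_id, SUM(calc_area) AS total_area
        FROM imprv_detail
        GROUP BY prop_id
    ) AS id1 ON id1.prop_id = pp.prop_id
    ORDER BY pp.prop_id
""").fetchall()

for row in rows:
    print(row)
```

Each property comes back as a single row, with other_area equal to the summed improvement area minus the living area (336 for x and 310 for xy with this sample data).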

Related

SQL query for most recent date

I'm trying to query to only pull the most recent sale date but keep the unique value of "strap". This is the query result I have.
nh_cd   strap     dor_cd  acreage  sqft    sale date                reception_num  price   asd_val  rea_cd
178.00  R0000001  AG      4.7160   205443  2019-07-11 00:00:00.000  3723615        890000  200      05
178.00  R0000001  AG      4.7160   205443  2020-05-29 00:00:00.000  3787823        880000  200      40
205.00  R0022222  AGRES   5.8030   252771  2019-06-10 00:00:00.000  3718473        647500  520200   40
This is what I've built so far, but it doesn't give me my desired result of a recent date.
SELECT distinct
parcel.nh_cd
,sales.strap
,parcel.dor_cd
,detail.acreage
,detail.sqft
,max(sales.dos)
,sales.reception_num
,sales.price
,parcel.asd_val
,sales.rea_cd
,sales.qu_flg
,sales.valid_cd
,sales.vi
,site.str_num
,site.str_pfx
,site.str
,site.str_sfx
,site.city
,parcel.status_cd
,strap_idx.folio
FROM detail INNER JOIN parcel ON parcel.strap = detail.strap
INNER JOIN sales ON parcel.strap = sales.strap
INNER JOIN site ON parcel.strap = site.strap
INNER JOIN strap_idx ON parcel.strap = strap_idx.strap
INNER JOIN lnd_a ON parcel.strap = lnd_a.strap
WHERE lnd_a.st_use_cd IN ('4117','4127','4137','4147','4167','4177','4180')
AND parcel.dor_cd LIKE 'AG%'
AND parcel.status_cd = 'A'
AND (sales.price > '0')
AND (site.ln_num = '1')
AND (sales.dos>='07/01/2018')
AND (sales.dos<='08/24/2020')
GROUP by parcel.nh_cd
,sales.strap
,parcel.dor_cd
,detail.acreage
,detail.sqft
,sales.dos
,sales.reception_num
,sales.price
,parcel.asd_val
,sales.rea_cd
,sales.qu_flg
,sales.valid_cd
,sales.vi
,site.str_num
,site.str_pfx
,site.str
,site.str_sfx
,site.city
,parcel.status_cd
,strap_idx.folio
This is the result I want:
nh_cd   strap     dor_cd  acreage  sqft    sale date                reception_num  price   asd_val  rea_cd
178.00  R0000001  AG      4.7160   205443  2020-05-29 00:00:00.000  3787823        880000  200      40
205.00  R0022222  AGRES   5.8030   252771  2019-06-10 00:00:00.000  3718473        647500  520200   40
How would I go about doing this?

You can use ROW_NUMBER() to number each strap's sales from newest to oldest, then keep only the rows numbered 1:
SELECT *
FROM (
SELECT distinct parcel.nh_cd
,sales.strap
,parcel.dor_cd
,detail.acreage
,detail.sqft
,sales.dos
,sales.reception_num
,sales.price
,parcel.asd_val
,sales.rea_cd
,sales.qu_flg
,sales.valid_cd
,sales.vi
,site.str_num
,site.str_pfx
,site.str
,site.str_sfx
,site.city
,parcel.status_cd
,strap_idx.folio
, ROW_NUMBER() OVER(PARTITION BY parcel.nh_cd, sales.strap ORDER BY sales.dos DESC) AS rn
FROM detail INNER JOIN parcel ON parcel.strap = detail.strap
INNER JOIN sales ON parcel.strap = sales.strap
INNER JOIN site ON parcel.strap = site.strap
INNER JOIN strap_idx ON parcel.strap = strap_idx.strap
INNER JOIN lnd_a ON parcel.strap = lnd_a.strap
WHERE lnd_a.st_use_cd IN ('4117','4127','4137','4147','4167','4177','4180')
AND parcel.dor_cd LIKE 'AG%'
AND parcel.status_cd = 'A'
AND (sales.price > '0')
AND (site.ln_num = '1')
AND (sales.dos>='07/01/2018')
AND (sales.dos<='08/24/2020')
) t
WHERE rn = 1
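
The same pattern can be tried in isolation with Python's built-in sqlite3 module. This is only a sketch: it assumes a SQLite build with window-function support (3.25 or later), and the single sales table with three columns is a cut-down, hypothetical version of the schema in the question, loaded with the sample rows shown there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, reduced sales table; dates stored as ISO strings so they sort correctly.
CREATE TABLE sales (strap TEXT, dos TEXT, price INTEGER);
INSERT INTO sales VALUES
    ('R0000001', '2019-07-11', 890000),
    ('R0000001', '2020-05-29', 880000),
    ('R0022222', '2019-06-10', 647500);
""")

# Number the sales of each strap from newest to oldest, then keep row 1.
latest = conn.execute("""
    SELECT strap, dos, price
    FROM (
        SELECT strap, dos, price,
               ROW_NUMBER() OVER (PARTITION BY strap ORDER BY dos DESC) AS rn
        FROM sales
    ) AS t
    WHERE rn = 1
    ORDER BY strap
""").fetchall()

for row in latest:
    print(row)
```

Unlike MAX(dos) with GROUP BY, this keeps the price and other columns from the same row as the most recent sale.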

Nesting Queries to get multiple column results

I have two queries: one counts move-ins by property and unit type, and the other counts move-outs for the same data. Run separately, they return the correct totals (6 move-outs and 11 move-ins). I have tried nesting one inside the other, in both the SELECT and the FROM clause, but I am not getting what I need. When nested within the SELECT, I get the correct move-outs per unit type, but every row shows the total move-ins. I recall that a subquery nested this way returns only one value, but I know there is a way to return a value for each row. Any assistance is appreciated.
SELECT
p.scode as PropNumber,
p.saddr1 propname,
ut.scode as UnitType,
COUNT(t.hmyperson) as Moveouts,
(
SELECT COUNT(t.hmyperson) as MoveIns
FROM
tenant t
JOIN unit u ON t.hunit = u.hmy
JOIN property p ON p.hmy = u.hproperty
JOIN unittype ut ON ut.hmy = u.HUNITTYPE
WHERE
t.dtmovein >= getdate() - 14
AND p.scode IN ('gsaff')
) mi
FROM
Property p
JOIN unit u ON u.hproperty = p.hmy
JOIN tenant t ON t.hunit = u.hmy
JOIN unittype ut ON ut.hmy = u.HUNITTYPE
WHERE
p.scode IN ('gsaff')
AND t.DTMOVEOUT >= getdate()- 14
GROUP BY
ut.scode,
p.scode,
p.saddr1
With this, data is coming out like:
PropNumber  Propname  UnitType  MoveOuts  MoveIns
1           x         tc2       1         11
1           x         tc3       2         11
1           x         tc4       1         11
1           x         tc5       1         11
1           x         tc6       1         11
The MoveIns column should display as:
2
5
1
0
3

You need to correlate the subquery with the row being processed in the outer query. This also requires the subquery to use table aliases different from those in the outer query.
It is hard to tell without seeing sample data, but I would expect that you need to correlate on all non-aggregated columns of the outer query.
Try changing:
(
SELECT COUNT(t.hmyperson) as MoveIns
FROM
tenant t
JOIN unit u ON t.hunit = u.hmy
JOIN property p ON p.hmy = u.hproperty
JOIN unittype ut ON ut.hmy = u.HUNITTYPE
WHERE
t.dtmovein >= getdate() - 14
AND p.scode IN ('gsaff')
) mi
To:
(
SELECT COUNT(t1.hmyperson) as MoveIns
FROM
tenant t1
JOIN unit u1 ON t1.hunit = u1.hmy
JOIN property p1 ON p1.hmy = u1.hproperty
JOIN unittype ut1 ON ut1.hmy = u1.HUNITTYPE
WHERE
t1.dtmovein >= getdate() - 14
AND p1.scode IN ('gsaff')
AND p1.scode = p.scode
AND p1.saddr1 = p.saddr1
AND ut1.scode = ut.scode
) mi
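
A minimal sketch of a correlated subquery, run through Python's built-in sqlite3 module, may make the effect easier to see. The single tenant table, its unit_type column, the sample rows and the fixed cutoff date (standing in for getdate() - 14) are all assumptions made for illustration; they are not the schema from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical flattened tenant table: one row per tenant with its unit type.
CREATE TABLE tenant (hmyperson INTEGER, unit_type TEXT, dtmovein TEXT, dtmoveout TEXT);
INSERT INTO tenant VALUES
    (1, 'tc2', '2020-01-01', '2024-06-10'),  -- recent move-out
    (2, 'tc2', '2024-06-08', NULL),          -- recent move-in
    (3, 'tc2', '2024-06-09', NULL),          -- recent move-in
    (4, 'tc3', '2019-05-01', '2024-06-11'),  -- recent move-out
    (5, 'tc3', '2019-07-01', '2024-06-12'),  -- recent move-out
    (6, 'tc3', '2024-06-05', NULL);          -- recent move-in
""")

cutoff = "2024-06-01"  # assumed stand-in for getdate() - 14

# The inner query uses its own alias (t1) and is tied to the outer row
# through t1.unit_type = t.unit_type, so it is evaluated per unit type.
rows = conn.execute("""
    SELECT t.unit_type,
           COUNT(t.hmyperson) AS moveouts,
           (SELECT COUNT(t1.hmyperson)
            FROM tenant AS t1
            WHERE t1.dtmovein >= :cutoff
              AND t1.unit_type = t.unit_type) AS moveins
    FROM tenant AS t
    WHERE t.dtmoveout >= :cutoff
    GROUP BY t.unit_type
    ORDER BY t.unit_type
""", {"cutoff": cutoff}).fetchall()

for row in rows:
    print(row)
```

Removing the `t1.unit_type = t.unit_type` condition reproduces the symptom in the question: every row would show the overall move-in count.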

aggregating nested SQL statements to fewer columns

I am trying to aggregate my data and group it with respect to SKU's and the cluster ID associated with that SKU.
My current output brings back roughly 40,000 rows (5 SKU's * 8,000 Stores) however I want just 35.
My code:
SELECT DISTINCT E.*
FROM ALC_ITEM_SOURCE P
RIGHT JOIN
(
SELECT D.* ,SUM(L.ALLOCATED_QTY) AS TOTAL_ALLOCATED
FROM ALC_ITEM_LOC L
RIGHT JOIN
(
SELECT C.*
FROM STORE S,
(
SELECT A.*, B.LOCATION AS STORE_NUMBER
FROM FDT_MAP_CLUSTER_LOCATION B,
(
SELECT DISTINCT SS.ALLOC_CLUSTER_ID, SS.ALLOC_CLUSTER_NAME, SS.SKU
from fdt_maptool_sas_data ss
WHERE SS.SKU IN (1099866,
1099896,
1000898,
1000960,
1000988
)
AND SS.ORDER_NO IS NOT NULL
AND ALLOC_CLUSTER_NAME NOT LIKE '%DC Cluster%'
GROUP BY SS.ALLOC_CLUSTER_ID, SS.ALLOC_CLUSTER_NAME, SS.WORKSHEET_ID, SS.SKU
)A
WHERE B.CLUSTER_ID = A.ALLOC_CLUSTER_ID
AND B.LOCATION_TYPE = 'S'
)C
WHERE S.STORE = C.STORE_NUMBER
AND S.STORE_CLOSE_DATE IS NULL
AND S.DISTRICT NOT IN (997, 998, 999)
AND S.STORE_OPEN_DATE <= SYSDATE
)D
ON L.ITEM_ID = D.SKU
AND L.LOCATION_ID = D.STORE_NUMBER
GROUP BY D.ALLOC_CLUSTER_ID, D.ALLOC_CLUSTER_NAME, D.SKU, D.STORE_NUMBER
)E
ON P.ITEM_ID = E.SKU
AND P.SOURCE_TYPE <> 4
AND P.RELEASE_DATE > '01-FEB-2018'
My desired result would contain:
SKU      Cluster_ID  Total_allocated  Count(stores)
1000989  1AA STORES  258              200
1000989  2A STORES   78               600
1000989  B STORES    36               500
1000989  C STORES    114              100
1000989  D STORES    144              1222
1000989  E STORES    168              600
1000989  F STORES    60               501
Which is taking a sum of total allocated per store per cluster ID.
As you can see each SKU has a grade (AA-F), I would want to repeat this 5 times since I have 5 SKU's.
Basically I am asking how can I aggregate my data up to look like the above table from the 40,000 rows it is now.
Any help is appreciated!

To make your SQL neater, avoid writing join conditions in the WHERE clause; use explicit JOIN ... ON instead.
Also, I don't think you need the ALC_ITEM_SOURCE table, since none of its columns are actually used.
You may try this version, or at least use it as a starting point:
select SS.ALLOC_CLUSTER_ID, SS.ALLOC_CLUSTER_NAME, SS.SKU,
SUM(L.ALLOCATED_QTY) as total_allocated,
count(b.location) as store_number
FROM fdt_maptool_sas_data ss
inner join FDT_MAP_CLUSTER_LOCATION b on B.CLUSTER_ID = SS.ALLOC_CLUSTER_ID AND B.LOCATION_TYPE = 'S'
inner join store s on S.STORE = b.location AND S.STORE_CLOSE_DATE IS NULL AND S.DISTRICT NOT IN (997, 998, 999) AND S.STORE_OPEN_DATE <= SYSDATE
left outer join ALC_ITEM_LOC L on L.ITEM_ID = ss.SKU AND L.LOCATION_ID = b.location
WHERE SS.SKU IN (1099866,
1099896,
1000898,
1000960,
1000988)
AND SS.ORDER_NO IS NOT NULL
AND SS.ALLOC_CLUSTER_NAME NOT LIKE '%DC Cluster%'
GROUP BY SS.ALLOC_CLUSTER_ID, SS.ALLOC_CLUSTER_NAME, SS.SKU
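
As a rough illustration of collapsing store-level rows to one row per SKU and cluster, here is a small sketch using Python's built-in sqlite3 module. The three tables (sku_cluster, cluster_store, alloc), their columns and the sample values are hypothetical stand-ins, much simpler than the Oracle schema in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified tables for illustration only.
CREATE TABLE sku_cluster (sku INTEGER, cluster_name TEXT);
CREATE TABLE cluster_store (cluster_name TEXT, store INTEGER);
CREATE TABLE alloc (sku INTEGER, store INTEGER, qty INTEGER);
INSERT INTO sku_cluster VALUES (1000988, 'A STORES'), (1000988, 'B STORES');
INSERT INTO cluster_store VALUES ('A STORES', 1), ('A STORES', 2), ('B STORES', 3);
INSERT INTO alloc VALUES (1000988, 1, 10), (1000988, 2, 5);
""")

# Group by SKU and cluster only (not by store), so each pair yields one row.
# The LEFT JOIN keeps stores that have no allocation; COALESCE turns their NULL sum into 0.
summary = conn.execute("""
    SELECT sc.sku,
           sc.cluster_name,
           COALESCE(SUM(a.qty), 0) AS total_allocated,
           COUNT(cs.store) AS store_count
    FROM sku_cluster AS sc
    JOIN cluster_store AS cs ON cs.cluster_name = sc.cluster_name
    LEFT JOIN alloc AS a ON a.sku = sc.sku AND a.store = cs.store
    GROUP BY sc.sku, sc.cluster_name
    ORDER BY sc.cluster_name
""").fetchall()

for row in summary:
    print(row)
```

The key point is that the store column appears only inside aggregates, never in GROUP BY; grouping by store as well is what produces one row per store.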

Merge results to one column

I have the following query:
SELECT a.User1 as Employee
, isnull(sum(distinct b.Page_Count),0) AS Yesterday
, isnull(sum(distinct c.Page_Count),0) AS Today
, isnull(sum(distinct d.Page_Count),0) AS Week
, e.Material_Location as '(Yesterday)'
, f.Material_Location as '(Today)'
From TaskUser AS a
LEFT JOIN PaperMaterial AS b
ON b.Assigned_To = a.User1
AND b.Date_Assigned between ('06/09/2014') AND ('06/13/2014')
LEFT JOIN PaperMaterial AS c
ON c.Assigned_To = a.User1
AND c.Date_Assigned between ('06/13/2014') AND ('06/14/2014')
LEFT JOIN PaperMaterial AS d
ON d.Assigned_To = a.User1
AND d.Date_Assigned between ('06/09/2014') AND ('06/14/2014')
LEFT JOIN PaperMaterial AS e
ON e.Assigned_To = a.User1
AND e.Date_Assigned between ('06/12/2014') AND ('06/13/2014')
LEFT JOIN PaperMaterial AS f
ON f.Assigned_To = a.User1
AND f.Date_Assigned between ('06/13/2014') AND ('06/14/2014')
GROUP BY a.User1, e.Material_Location, f.Material_Location
Order By a.User1, e.Material_Location, f.Material_Location
If multiple records were input for the same user on the same day, I am getting unique rows for the same person. I only want one row per user with the e and f results merged to the same column.
Ie: Current Output =
Amy 0 640 640 NoTask Task
Amy 0 640 640 Task2 Task
Amy 0 640 640 Task3 Task4
Amy 0 640 640 Task1 NoTask
Requested output:
Amy 0 640 640 (NoTask, Task1, Task2, Task3) (NoTask, Task, Task4)

Here's a greatly over-simplified example of using stuff combined with a correlated subquery:
I used your output as a table, more or less:
select
name,
stuff(
(
select cast(',' as varchar(max)) + mt.one
from MyTable mt
WHERE mt.name = t1.name
order by mt.one
for xml path('')
), 1, 1, '') as ones
from mytable t1
group by name
FOR XML PATH('') concatenates each value of the column I creatively named ONE for each NAME into a single string in which every value is preceded by a comma, and STUFF then removes the leading comma. The correlated subquery ties the values being concatenated to the corresponding row coming out of the main query.
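
For experimenting outside SQL Server, the sketch below uses Python's built-in sqlite3 module. SQLite has neither STUFF nor FOR XML PATH, so it swaps in SQLite's group_concat aggregate inside the same kind of correlated subquery; this is a different function achieving a similar result, not the technique above. The table and the sample values mirror the "(Yesterday)" column from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (name TEXT, one TEXT);
INSERT INTO mytable VALUES
    ('Amy', 'NoTask'), ('Amy', 'Task2'), ('Amy', 'Task3'), ('Amy', 'Task1');
""")

# One output row per name; the correlated subquery gathers that name's values.
# SQLite does not guarantee the order in which group_concat joins the values.
merged = conn.execute("""
    SELECT t1.name,
           (SELECT group_concat(mt.one, ', ')
            FROM mytable AS mt
            WHERE mt.name = t1.name) AS ones
    FROM mytable AS t1
    GROUP BY t1.name
""").fetchall()

for name, ones in merged:
    print(name, "(" + ones + ")")
```

On SQL Server 2017 and later, STRING_AGG plays a similar role to group_concat here.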

Rollup / recursive addition SQL Server 2008

I have a query with rollup that outputs data like (the query is a little busy, but I can post if necessary)
range  subCounts  Counts  percent
1-9    3          100     3.0
10-19  13         100     13.0
20-29  30         100     33.0
30-39  74         100     74.0
NULL   100        100     100.0
How can I keep a running total of percent? Say I need to find the bottom 15th percentile: in this case 3 + 13 = 16, so I would like the row at which the running total first reaches 15 to be returned:
range  subCounts  counts  percent
10-19  13         100     13.0
EDIT1: here the query
select '$'+cast(+bin*10000 + ' ' as varchar(10)) + '-' + cast(bin*10000+9999 as varchar(10)) as bins,
count(*) as numbers,
(select count(distinct patient.patientid) from patient
inner join tblclaims on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on tblhospitals.hospitalnpi = patient.hospitalnpi
where (tblhospitals.hospitalname = 'X')
) as Totals
, round(100*count(*)/cast((select count(distinct patient.patientid) from patient
inner join tblclaims on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on tblhospitals.hospitalnpi = patient.hospitalnpi
where (tblhospitals.hospitalname = 'X')) as float),2) as binsPercent
from
(
select tblclaims.patientid, sum(claimsmedicarepaid) as TotalCosts,
cast(sum(claimsmedicarePaid)/10000 as int) as bin
from tblclaims inner join patient on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on patient.hospitalnpi = tblhospitals.hospitalnpi
where tblhospitals.hospitalname = 'X'
group by tblclaims.patientid
) as t
group by bin with rollup

OK, so for whoever might use this for reference, I figured out what I needed to do.
I added row_number() over (order by bin) as rownum to the query and saved all of this as a view.
Then I used
SELECT t1.rownum, t1.bins, t1.numbers, t1.uktotal, t1.binspercent,
SUM(t2.binspercent) AS [SUM]
FROM t t1
INNER JOIN t t2 ON t1.rownum >= t2.rownum
GROUP BY t1.rownum,
t1.bins, t1.numbers, t1.uktotal, t1.binspercent
ORDER BY t1.rownum
Joining on t1.rownum >= t2.rownum pairs each row with itself and every earlier row, so the SUM gives a running total.
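
The self-join running total can be checked quickly with Python's built-in sqlite3 module. In this sketch the table t stands in for the saved view, reduced to three columns, and the four percent values are made-up illustration data that add up to 100; they are not the figures from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for the view: one row per bin, already numbered.
CREATE TABLE t (rownum INTEGER, bins TEXT, binspercent REAL);
INSERT INTO t VALUES
    (1, '1-9', 3.0), (2, '10-19', 13.0), (3, '20-29', 30.0), (4, '30-39', 54.0);
""")

# Each t1 row is joined to itself and all earlier rows (t2),
# so SUM(t2.binspercent) is the running total up to and including t1.
running = conn.execute("""
    SELECT t1.rownum, t1.bins, t1.binspercent,
           SUM(t2.binspercent) AS running_percent
    FROM t AS t1
    JOIN t AS t2 ON t1.rownum >= t2.rownum
    GROUP BY t1.rownum, t1.bins, t1.binspercent
    ORDER BY t1.rownum
""").fetchall()

# First bin whose running total reaches the 15 percent threshold.
first_over = next(row for row in running if row[3] >= 15)
print(first_over)
```

With this data the running totals are 3, 16, 46 and 100, so the bin picked out for a 15 percent threshold is 10-19.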

This isn't exactly what I was looking for, but it's on the same track:
http://blog.tallan.com/2011/12/08/sql-server-2012-windowing-functions-part-1-of-2-running-and-sliding-aggregates/ and http://blog.tallan.com/2011/12/19/sql-server-2012-windowing-functions-part-2-of-2-new-analytic-functions/
Check out PERCENT_RANK, CUME_DIST, PERCENTILE_CONT and PERCENTILE_DISC.
Sorry for the lame answer
