SQL Group and Join - sql

In SQL Server 2008, I have a table that looks like this:
ID | RefNum | Label       | Value  | Status
-------------------------------------------
1  | 123    | OrderNum    | 123456 | 0
2  | 123    | TrackingNum | 111111 | 0
3  | 123    | ConfNum     | 989898 | 0
4  | 234    | OrderNum    | 234567 | 1
5  | 234    | TrackingNum | 222222 | 1
6  | 234    | ConfNum     | 878787 | 0
7  | 567    | OrderNum    | 345678 | 1
8  | 567    | TrackingNum | 333333 | 0
9  | 567    | ConfNum     | 767676 | 0
I want to select all records where Status = 0 and join, based on RefNum, to the 'OrderNum' and 'TrackingNum' Label values, regardless of whether 'OrderNum' and TrackingNum Statuses are 1 or 0. For example, the query should produce:
ID | RefNum | Label       | Value  | Status | OrderNum | TrackingNum
--------------------------------------------------------------------
1  | 123    | OrderNum    | 123456 | 0      | 123456   | 111111
2  | 123    | TrackingNum | 111111 | 0      | 123456   | 111111
3  | 123    | ConfNum     | 989898 | 0      | 123456   | 111111
6  | 234    | ConfNum     | 878787 | 0      | 234567   | 222222
8  | 567    | TrackingNum | 333333 | 0      | 345678   | 333333
9  | 567    | ConfNum     | 767676 | 0      | 345678   | 333333
Right now, my query looks like this:
SELECT Id
,mT.RefNum
,Label
,Value
,Status
,OrderNum
,TrackingNum
FROM [dbo].[myTable] AS mT
INNER JOIN (
SELECT MAX(ID) As OrderRowId, RefNum, Value AS OrderNum
FROM [dbo].[myTable]
WHERE Label= 'OrderNum'
group by RefNum, Value) AS OrderNums
ON OrderNums.RefNum= mt.RefNum
INNER JOIN (
SELECT MAX(ID) As OrderRowId, RefNum, Value AS TrackingNum
FROM [dbo].[myTable]
WHERE Label= 'TrackingNum'
group by RefNum, Value) AS TrackingNums
ON TrackingNums.RefNum= mt.RefNum
WHERE mT.Status = 0
This appears to work, but requires a hash join. I would love for someone to shoot holes in this or provide a more efficient solution. Thanks.

If there can't be duplicate order numbers or tracking numbers per reference number, you can simplify the query somewhat with a regular LEFT JOIN or JOIN:
SELECT mt.id, mt.refnum, mt.label, mt.value, mt.status,
ordno.value ordernum, trackno.value trackingnum
FROM myTable mt
LEFT JOIN myTable ordno
ON ordno.label='ordernum' and mt.refnum=ordno.refnum
LEFT JOIN myTable trackno
ON trackno.label='trackingnum' and mt.refnum=trackno.refnum
WHERE mt.status = 0;
An SQLfiddle to test with.
If there may be duplicates, you can still do a single GROUP BY to get a result:
SELECT mt.id, mt.refnum, mt.label, mt.value, mt.status,
MAX(ordno.value) ordernum, MAX(trackno.value) trackingnum
FROM myTable mt
LEFT JOIN myTable ordno
ON ordno.label='ordernum' and mt.refnum=ordno.refnum
LEFT JOIN myTable trackno
ON trackno.label='trackingnum' and mt.refnum=trackno.refnum
WHERE mt.status = 0
GROUP BY mt.id,mt.refnum,mt.label,mt.value,mt.status;
Another SQLfiddle.
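As a variant of the queries above (not one of the posted answers), the order and tracking numbers can be collapsed to one row per RefNum with MAX(CASE ...) and joined back once. The sketch below runs that idea against the sample data; it assumes Python's built-in sqlite3 module as a stand-in for SQL Server, and it spells the labels with their exact case because SQLite compares text case-sensitively.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (an assumption made for this demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (ID INTEGER, RefNum INTEGER, Label TEXT, Value TEXT, Status INTEGER)"
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?, ?)",
    [
        (1, 123, "OrderNum", "123456", 0),
        (2, 123, "TrackingNum", "111111", 0),
        (3, 123, "ConfNum", "989898", 0),
        (4, 234, "OrderNum", "234567", 1),
        (5, 234, "TrackingNum", "222222", 1),
        (6, 234, "ConfNum", "878787", 0),
        (7, 567, "OrderNum", "345678", 1),
        (8, 567, "TrackingNum", "333333", 0),
        (9, 567, "ConfNum", "767676", 0),
    ],
)

# The derived table k holds one row per RefNum with both lookup values;
# the outer query keeps only the Status = 0 rows and joins to k once.
query = """
SELECT mt.ID, mt.RefNum, mt.Label, mt.Value, mt.Status,
       k.OrderNum, k.TrackingNum
FROM myTable mt
JOIN (SELECT RefNum,
             MAX(CASE WHEN Label = 'OrderNum'    THEN Value END) AS OrderNum,
             MAX(CASE WHEN Label = 'TrackingNum' THEN Value END) AS TrackingNum
      FROM myTable
      GROUP BY RefNum) k
  ON k.RefNum = mt.RefNum
WHERE mt.Status = 0
ORDER BY mt.ID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Whether this beats the two self-joins depends on the indexes and the optimizer, so it is worth comparing the actual execution plans.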

Related

Netezza SQL Join dataset A to dataset B but pull fields from B when b_date > a_date

I have 2 datasets from 2 different sources but many of the members are the same in both datasets. My select statement is :
Select a.member_id, a.start_date, a.customer_id, a.region_id, b.b_start_date, b.customer_id, b.region_id
from dataset1 a
left join dataset2 b
on a.member_id=b.member_id
I want to pick up all records in A, plus the records in B where a.member_id = b.member_id, but bring in the fields from A when a.start_date >= b.b_start_date and bring in the fields from B when b.b_start_date > a.start_date.
Here's a pretty small example:
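Dataset A:
member_id | start_date | customer_id | region_id
------------------------------------------------
1111      | 1/30/2021  | 123         | 555
2222      | 1/30/2021  | 222         | 555
3333      | 1/1/2021   | 345         | 678
Dataset B:
member_id | b_start_date | customer_id | region_id
--------------------------------------------------
1111      | 1/1/2022     | 567         | 444
2222      | 1/30/2021    | 222         | 555
Result:
member_id | customer_id | region_id
-----------------------------------
1111      | 567         | 444
2222      | 222         | 555
3333      | 345         | 678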
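/* try this; the left join keeps members of A that have no row in B (e.g. 3333) */
select a.member_id, a.customer_id, a.region_id
from dataset1 a left join dataset2 b on a.member_id = b.member_id
where b.member_id is null or a.start_date >= b.b_start_date
union all
select b.member_id, b.customer_id, b.region_id
from dataset1 a inner join dataset2 b on a.member_id = b.member_id
where b.b_start_date > a.start_date;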
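The same result can also be sketched with a single LEFT JOIN and a CASE per column, instead of a UNION ALL. The example below is an illustration only: it assumes Python's sqlite3 module in place of Netezza and stores the dates as ISO strings (YYYY-MM-DD) so that plain string comparison orders them correctly.

```python
import sqlite3

# SQLite stands in for Netezza here; dates are ISO text so '>' compares correctly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataset1 (member_id INTEGER, start_date TEXT, customer_id INTEGER, region_id INTEGER);
CREATE TABLE dataset2 (member_id INTEGER, b_start_date TEXT, customer_id INTEGER, region_id INTEGER);
INSERT INTO dataset1 VALUES (1111, '2021-01-30', 123, 555);
INSERT INTO dataset1 VALUES (2222, '2021-01-30', 222, 555);
INSERT INTO dataset1 VALUES (3333, '2021-01-01', 345, 678);
INSERT INTO dataset2 VALUES (1111, '2022-01-01', 567, 444);
INSERT INTO dataset2 VALUES (2222, '2021-01-30', 222, 555);
""")

# When B has no row for the member, b.b_start_date is NULL, the comparison
# is not true, and the ELSE branch falls back to the values from A.
query = """
SELECT a.member_id,
       CASE WHEN b.b_start_date > a.start_date THEN b.customer_id ELSE a.customer_id END AS customer_id,
       CASE WHEN b.b_start_date > a.start_date THEN b.region_id   ELSE a.region_id   END AS region_id
FROM dataset1 a
LEFT JOIN dataset2 b ON a.member_id = b.member_id
ORDER BY a.member_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

This form assumes at most one row per member_id in dataset B; with several rows per member the join would repeat the member.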

How to use a subselect in a LEFT JOIN ON clause?

I have a table t with
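ORD_DATE   | ORD_ID | ORD_REF | ORD_TYPE1 | ORD_TYPE2 | PRODNUM | PRODQUAL | PRICE
----------------------------------------------------------------------------------
2020-09-01 | 101    | 101     | ORDER     | ORDER     | 456     | F        | 555
2020-09-02 | 102    | 101     | CONF      | ORDER     | 456     | F        | 555
2020-11-30 | 103    | 102     | ORDER     | ORDER     | 123     | K        | 444
2020-12-01 | 104    | 102     | CONF      | ORDER     | 123     | K        | 444
2020-12-01 | 105    | 103     | ORDER     | ORDER     | 123     | K        | 444
2020-12-01 | 106    | 104     | ORDER     | ORDER     | 123     | K        | 333
2020-12-02 | 107    | 104     | CONF      | ORDER     | 123     | K        | 333
2020-12-08 | 108    | 104     | CONF      | RETURN    | 123     | K        | -333
2020-12-01 | 109    | 105     | ORDER     | ORDER     | 123     | F        | 222
2020-12-02 | 110    | 105     | CONF      | ORDER     | 123     | F        | 222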
and a table s with:
ORD_DATE                   | PROD_NUMBER | PROD_QUAL
----------------------------------------------------
2020-12-01-00.00.00.000000 | 123         | K
2020-12-01-00.00.00.000000 | 123         | L
In table t are all sales per day.
A sale has 2 stages: first the order is generated when the customer buys something ("ORDER"/"ORDER"). Then it gets confirmed, normally the next day or within the next few days ("CONF"/"ORDER"). If a customer sends the product back, it's a return ("CONF"/"RETURN").
In table s are the products that are "second hand".
If a product is in that table, it means all sales from table t with
ORDER_TYPE_1 = "ORDER"
AND ORDER_TYPE_2 = "ORDER"
AND t.ORD_DATE >= s.ORD_DATE
AND t.PROD_NUMBER = s.PROD_NUMBER
AND t.PROD_QUAL = s.PROD_QUAL
count as "second hand".
I need the sum of all "second hand" sales that are confirmed from the year 2021 and month 12. But only rows with CONF/ORDER or CONF/RETURN should be in the calculation. I have CAL_YEAR and CAL_MONTH in table t for that (omitted for less clutter).
From table t only ORD_REF 104 matches that, and the sum would be 0 because only these 2 rows matter:
| 2020-12-02 | 107 | 104 | CONF | ORDER | 123 | K | 333
| 2020-12-08 | 108 | 104 | CONF | RETURN | 123 | K | -333
My code so far:
SELECT SUM(PRICE)
FROM t
--
LEFT JOIN s
ON t.PRODNUM = s.PRODNUM
AND t.PRODQUAL = s.PRODQUAL
AND (SELECT ORD_DATE FROM t WHERE ORDER_TYPE_1 = 'ORDER' AND ORDER_TYPE_2 = 'ORDER') >= s.ORD_DATE
--
WHERE CAL_YEAR = 2021
AND CAL_MONTH = 12
AND ORDER_TYPE_1 = 'CONF'
AND ORDER_TYPE_2 IN ('ORDER', 'RETURN')
--
GROUP BY PRICE
;
SQL-Error: "single-row subquery returns more than one row"
My problem is limiting the LEFT JOIN to ORDER/ORDER (so that ORD_REF 104 is in) but only using CONF/ORDER and CONF/RETURN for the sum (so that ORD_REF 102 is out).
Can anyone help?
The simplest way I can think of would be to do a self-join, where you join a second copy of table t aliased t2 to use for the CONF/ORDER and CONF/RETURN rows, while you use t for the ORDER/ORDER rows.
SELECT SUM(t2.PRICE)
FROM t
--
INNER JOIN t t2
ON t2.ORD_REF = t.ORD_REF
AND t2.ORDER_TYPE_1 = 'CONF'
AND t2.ORDER_TYPE_2 IN ('ORDER', 'RETURN')
--
INNER JOIN s -- inner join, so that only orders matching a "second hand" product are summed
ON t.PRODNUM = s.PRODNUM
AND t.PRODQUAL = s.PRODQUAL
AND t.ORD_DATE >= s.ORD_DATE
--
WHERE t.CAL_YEAR = 2021
AND t.CAL_MONTH = 12
AND t.ORDER_TYPE_1 = 'ORDER'
AND t.ORDER_TYPE_2 = 'ORDER'
;
If you need it to be more efficient, you could use analytic/window functions to pull the summed price from the CONF rows into the ORDER/ORDER row as a new column. This way it will only query table t once instead of twice.
SELECT SUM(t2.order_price_sum)
FROM (select t.*,
sum(case when ORDER_TYPE_1 = 'CONF'
AND ORDER_TYPE_2 IN ('ORDER', 'RETURN')
then t.price
else 0 end) over (partition by ord_ref) as order_price_sum
from t) t2
--
INNER JOIN s -- inner join, so that only orders matching a "second hand" product are summed
ON t2.PRODNUM = s.PRODNUM
AND t2.PRODQUAL = s.PRODQUAL
AND t2.ord_date >= s.ORD_DATE
--
WHERE CAL_YEAR = 2021
AND CAL_MONTH = 12
AND ORDER_TYPE_1 = 'ORDER'
AND ORDER_TYPE_2 = 'ORDER'
;
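To check the self-join idea against the sample rows, here is a small sketch using Python's sqlite3 module (a stand-in for the real database, which is an assumption). It uses the column names from the queries (ORDER_TYPE_1, PRODNUM, ...), joins s with an inner join so that only second-hand products contribute, stores the dates in s as plain YYYY-MM-DD text so they compare correctly with t.ORD_DATE, and leaves out the CAL_YEAR / CAL_MONTH filter because those columns are not part of the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (ORD_DATE TEXT, ORD_ID INTEGER, ORD_REF INTEGER,
                ORDER_TYPE_1 TEXT, ORDER_TYPE_2 TEXT,
                PRODNUM INTEGER, PRODQUAL TEXT, PRICE INTEGER);
CREATE TABLE s (ORD_DATE TEXT, PRODNUM INTEGER, PRODQUAL TEXT);
""")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?, ?, ?)", [
    ("2020-09-01", 101, 101, "ORDER", "ORDER", 456, "F", 555),
    ("2020-09-02", 102, 101, "CONF", "ORDER", 456, "F", 555),
    ("2020-11-30", 103, 102, "ORDER", "ORDER", 123, "K", 444),
    ("2020-12-01", 104, 102, "CONF", "ORDER", 123, "K", 444),
    ("2020-12-01", 105, 103, "ORDER", "ORDER", 123, "K", 444),
    ("2020-12-01", 106, 104, "ORDER", "ORDER", 123, "K", 333),
    ("2020-12-02", 107, 104, "CONF", "ORDER", 123, "K", 333),
    ("2020-12-08", 108, 104, "CONF", "RETURN", 123, "K", -333),
    ("2020-12-01", 109, 105, "ORDER", "ORDER", 123, "F", 222),
    ("2020-12-02", 110, 105, "CONF", "ORDER", 123, "F", 222),
])
conn.executemany("INSERT INTO s VALUES (?, ?, ?)", [
    ("2020-12-01", 123, "K"),
    ("2020-12-01", 123, "L"),
])

# t supplies the ORDER/ORDER row (checked against s); t2 supplies the
# CONF rows of the same ORD_REF, which are the ones that get summed.
joins = """
FROM t
INNER JOIN t AS t2
        ON t2.ORD_REF = t.ORD_REF
       AND t2.ORDER_TYPE_1 = 'CONF'
       AND t2.ORDER_TYPE_2 IN ('ORDER', 'RETURN')
INNER JOIN s
        ON t.PRODNUM = s.PRODNUM
       AND t.PRODQUAL = s.PRODQUAL
       AND t.ORD_DATE >= s.ORD_DATE
WHERE t.ORDER_TYPE_1 = 'ORDER'
  AND t.ORDER_TYPE_2 = 'ORDER'
"""
detail = conn.execute(
    "SELECT t.ORD_REF, t2.ORD_ID, t2.PRICE " + joins + " ORDER BY t2.ORD_ID"
).fetchall()
total = conn.execute("SELECT SUM(t2.PRICE) " + joins).fetchone()[0]
print(detail, total)
```

Only the two confirmation rows of ORD_REF 104 survive the joins, so the sum comes out as 0, which matches the expectation stated in the question.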

Replace nulls in table B with values in table A using SQL

I have a table like this:
price_families (Table A):
ID UPC
1 123
1 456
2 789
2 111
1 121
And a second table:
sales_volume (Table B):
UPC sales volume
123 13.99 2.99
456 null null
121 14.99 1.99
789 31.88 22.99
111 null null
121 null null
What I want is to replace the null values in Table B with the sales/volume values of a different UPC of the same product family (using Table A to determine price families, joining on UPC) and to order by sales desc, volume desc (for each price family).
What is the optimal way to do this? Can I use coalesce() here, or perhaps a case statement?
My output should be this:
output (Table C):
UPC sales volume
123 13.99 2.99
456 13.99 2.99
121 14.99 1.99
789 31.88 22.99
111 31.88 22.99
121 13.99 2.99
Hmmm . . . One method uses tuples for the assignment:
update b
    set (sales, volume) = (select b2.sales, b2.volume
                           from b b2 join
                                a a2
                                on b2.upc = a2.upc join
                                a
                                on a.id = a2.id
                           where a.upc = b.upc and -- same price family as the row being updated
                                 b2.sales is not null and b2.volume is not null
                           order by b2.sales desc, b2.volume desc
                           limit 1
                          )
    where sales is null and volume is null;
EDIT:
If you just want the select query:
select b.upc,
       coalesce(b.sales, b2.sales) as sales,
       coalesce(b.volume, b2.volume) as volume
from b left join lateral
     (select b2.sales, b2.volume
      from b b2 join
           a a2
           on b2.upc = a2.upc join
           a
           on a.id = a2.id
      where a.upc = b.upc and -- same price family as the outer row
            b2.sales is not null and b2.volume is not null
      order by b2.sales desc, b2.volume desc
      limit 1
     ) b2
     on 1=1;
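For engines without LATERAL, the same family lookup can be sketched with correlated subqueries inside COALESCE. The example below is a rough illustration on the question's data, assuming Python's sqlite3 module and the short table names a (price_families) and b (sales_volume). It follows the stated rule "highest sales, then highest volume, within the same family", so UPC 456 picks up 14.99 / 1.99 from UPC 121; the question's sample output shows 13.99 / 2.99 there, which that ordering rule does not produce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER, upc INTEGER);                 -- price_families
CREATE TABLE b (upc INTEGER, sales REAL, volume REAL);    -- sales_volume
""")
conn.executemany("INSERT INTO a VALUES (?, ?)",
                 [(1, 123), (1, 456), (2, 789), (2, 111), (1, 121)])
conn.executemany("INSERT INTO b VALUES (?, ?, ?)", [
    (123, 13.99, 2.99),
    (456, None, None),
    (121, 14.99, 1.99),
    (789, 31.88, 22.99),
    (111, None, None),
    (121, None, None),
])

# Both subqueries use the same filter and ordering, so they read the
# sales and the volume from the same donor row of the family.
query = """
SELECT b.upc,
       COALESCE(b.sales,
                (SELECT b2.sales
                 FROM b b2 JOIN a a2 ON a2.upc = b2.upc
                 WHERE a2.id = a.id
                   AND b2.sales IS NOT NULL AND b2.volume IS NOT NULL
                 ORDER BY b2.sales DESC, b2.volume DESC
                 LIMIT 1)) AS sales,
       COALESCE(b.volume,
                (SELECT b2.volume
                 FROM b b2 JOIN a a2 ON a2.upc = b2.upc
                 WHERE a2.id = a.id
                   AND b2.sales IS NOT NULL AND b2.volume IS NOT NULL
                 ORDER BY b2.sales DESC, b2.volume DESC
                 LIMIT 1)) AS volume
FROM b
JOIN a ON a.upc = b.upc
ORDER BY b.rowid
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```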

Oracle query stumped - derived table

It's been a long time since I've done more than the most basic sql queries. But I ran into this one today and have spent a few hours on it and am stuck with my derived table attempt (this is for an Oracle db). Looking for a few tips. Thx.
TABLE: dtree
DataID Name
-------------
10001 A.doc
10002 B.doc
10003 C.doc
10004 D.doc
TABLE: collections
CollectionID DataID
---------------------
201 10001
201 10002
202 10003
203 10004
TABLE: rimsNodeClassification
DataID RimsSubject RimsRSI Status
---------------------------------------
10001 blah IS-03 Active
10002 blah LE-01 Active
10003 blah AD-02 Active
10004 blah AD-03 Active
TABLE: rsiEventSched
RimsRSI RetStage DateToUse RetYears
--------------------------------------
IS-03 SEM-PHYS 95 1
IS-03 ACT NULL 2
LE-01 SEM-PHYS 94 1
LE-01 INA-PHYS 95 2
LE-01 ACT NULL NULL
LE-01 OFC NULL NULL
LE-02 SEM-PHYS 94 2
Trying to query on CollectionID=201
INTENDED RESULT:
DataID Name RimsRSI Status SEMPHYS_DateToUse INAPHYS_DateToUse SEMPHYS_RetYears INAPHYS_RetYears
-------------------------------------------------------------------------------------------------------
10001 A.doc IS-03 Active 95 null 1 null
10002 B.doc LE-01 Active 94 95 1 2
You don't need a Derived Table, just join the tables (the last using a Left join) and then apply a MAX(CASE) aggregation:
select c.DataID, t.Name, rnc.RimsRSI, rnc.Status,
max(case when res.RetStage = 'SEM-PHYS' then res.DateToUse end) SEMPHYS_DateToUse,
max(case when res.RetStage = 'INA-PHYS' then res.DateToUse end) INAPHYS_DateToUse,
max(case when res.RetStage = 'SEM-PHYS' then res.RetYears end) SEMPHYS_RetYears,
max(case when res.RetStage = 'INA-PHYS' then res.RetYears end) INAPHYS_RetYears
from collections c
join dtree t
on c.DataID = t.DataID
join rimsNodeClassification rnc
on c.DataID = rnc.DataID
left join rsiEventSched res
on rnc.RimsRSI = res.RimsRSI
where c.CollectionID= 201
group by c.DataID, t.Name, rnc.RimsRSI, rnc.Status
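To see the MAX(CASE) aggregation produce the intended result, the query can be run against the sample tables. This is only a sketch: it assumes Python's sqlite3 module in place of Oracle, and the column types are guesses based on the sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dtree (DataID INTEGER, Name TEXT);
CREATE TABLE collections (CollectionID INTEGER, DataID INTEGER);
CREATE TABLE rimsNodeClassification (DataID INTEGER, RimsSubject TEXT, RimsRSI TEXT, Status TEXT);
CREATE TABLE rsiEventSched (RimsRSI TEXT, RetStage TEXT, DateToUse INTEGER, RetYears INTEGER);
""")
conn.executemany("INSERT INTO dtree VALUES (?, ?)",
                 [(10001, "A.doc"), (10002, "B.doc"), (10003, "C.doc"), (10004, "D.doc")])
conn.executemany("INSERT INTO collections VALUES (?, ?)",
                 [(201, 10001), (201, 10002), (202, 10003), (203, 10004)])
conn.executemany("INSERT INTO rimsNodeClassification VALUES (?, ?, ?, ?)", [
    (10001, "blah", "IS-03", "Active"),
    (10002, "blah", "LE-01", "Active"),
    (10003, "blah", "AD-02", "Active"),
    (10004, "blah", "AD-03", "Active"),
])
conn.executemany("INSERT INTO rsiEventSched VALUES (?, ?, ?, ?)", [
    ("IS-03", "SEM-PHYS", 95, 1),
    ("IS-03", "ACT", None, 2),
    ("LE-01", "SEM-PHYS", 94, 1),
    ("LE-01", "INA-PHYS", 95, 2),
    ("LE-01", "ACT", None, None),
    ("LE-01", "OFC", None, None),
    ("LE-02", "SEM-PHYS", 94, 2),
])

# Each CASE yields a value only for its own RetStage and NULL otherwise;
# MAX ignores the NULLs, leaving one value per group and stage.
query = """
SELECT c.DataID, t.Name, rnc.RimsRSI, rnc.Status,
       MAX(CASE WHEN res.RetStage = 'SEM-PHYS' THEN res.DateToUse END) AS SEMPHYS_DateToUse,
       MAX(CASE WHEN res.RetStage = 'INA-PHYS' THEN res.DateToUse END) AS INAPHYS_DateToUse,
       MAX(CASE WHEN res.RetStage = 'SEM-PHYS' THEN res.RetYears END)  AS SEMPHYS_RetYears,
       MAX(CASE WHEN res.RetStage = 'INA-PHYS' THEN res.RetYears END)  AS INAPHYS_RetYears
FROM collections c
JOIN dtree t ON c.DataID = t.DataID
JOIN rimsNodeClassification rnc ON c.DataID = rnc.DataID
LEFT JOIN rsiEventSched res ON rnc.RimsRSI = res.RimsRSI
WHERE c.CollectionID = 201
GROUP BY c.DataID, t.Name, rnc.RimsRSI, rnc.Status
ORDER BY c.DataID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```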

SQL Server : take 1 to many record set and make 1 record per id

I need some help. I need to take the data from these 3 tables and create an output that looks like below. The plan_name_x and pending_tallyx columns are derived to make one line per claim id. Each claim id can be associated to up to 3 plans and I want to show each plan and tally amounts in one record. What is the best way to do this?
Thanks for any ideas. :)
Output result set needed:
claim_id ac_name plan_name_1 pending_tally1 plan_name_2 Pending_tally2 plan_name_3 pending_tally3
-------- ------- ----------- -------------- ----------- -------------- ----------- --------------
1234 abc cooks delux_prime 22 prime_express 23 standard_prime 2
2341 zzz bakers delux_prime 22 standard_prime 2 NULL NULL
3412 azb pasta's prime_express 23 NULL NULL NULL NULL
SQL Server 2005 tables to use for the above result set:
company_claims
claim_id ac_name
1234 abc cooks
2341 zzz bakers
3412 azb pasta's
claim_plans
claim_id plan_id plan_name
1234 101 delux_prime
1234 102 Prime_express
1234 103 standard_prime
2341 101 delux_prime
2341 103 standard_prime
3412 102 Prime_express
Pending_amounts
claim_id plan_id Pending_tally
1234 101 22
1234 102 23
1234 103 2
2341 101 22
2341 103 2
3412 102 23
If you know that 3 is always the max amount of plans then some left joins will work fine:
select c.claim_id, c.ac_name,
       cp1.plan_name as plan_name_1, pa1.pending_tally as pending_tally1,
       cp2.plan_name as plan_name_2, pa2.pending_tally as pending_tally2,
       cp3.plan_name as plan_name_3, pa3.pending_tally as pending_tally3
from company_claims c
left join claim_plans cp1 on c.claim_id = cp1.claim_id and cp1.plan_id = 101
left join claim_plans cp2 on c.claim_id = cp2.claim_id and cp2.plan_id = 102
left join claim_plans cp3 on c.claim_id = cp3.claim_id and cp3.plan_id = 103
left join pending_amounts pa1 on cp1.claim_id = pa1.claim_id and cp1.plan_id = pa1.plan_id
left join pending_amounts pa2 on cp2.claim_id = pa2.claim_id and cp2.plan_id = pa2.plan_id
left join pending_amounts pa3 on cp3.claim_id = pa3.claim_id and cp3.plan_id = pa3.plan_id
I would first join all your data so that you get the relevant columns: claim_id, ac_name, plan_name, pending tally.
Then I would transform this to get plan name and plan tally on different rows, with a label tying them together.
Then it should be easy to pivot.
I would tie these together with common table expressions.
Here's the query:
with X as (
select cc.*, cp.plan_name, pa.pending_tally,
rank() over (partition by cc.claim_id order by plan_name) as r
from company_claims cc
join claim_plans cp on cp.claim_id = cc.claim_id
join pending_amounts pa on pa.claim_id = cp.claim_id
and pa.plan_id = cp.plan_id
), P as (
select
X.claim_id,
x.ac_name,
x.plan_name as value,
'plan_name_' + cast(r as varchar(max)) as label
from x
union all
select
X.claim_id,
x.ac_name,
cast(x.pending_tally as varchar(max)) as value,
'pending_tally' + cast(r as varchar(max)) as label
from x
)
select claim_id, ac_name, [plan_name_1], [pending_tally1],[plan_name_2], [pending_tally2],[plan_name_3], [pending_tally3]
from (select * from P) p
pivot (
max(value)
for label in ([plan_name_1], [pending_tally1],[plan_name_2], [pending_tally2],[plan_name_3], [pending_tally3])
) as pvt
order by pvt.claim_id, ac_name
Here's a fiddle showing it in action: http://sqlfiddle.com/#!3/68f62/10
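A variant that skips the UNION ALL and PIVOT steps is to number the plans per claim and spread them into columns with MAX(CASE ...). The sketch below is not the posted query: it assumes Python's sqlite3 module (with an SQLite build recent enough for window functions) as a stand-in for SQL Server, and it numbers the plans by plan_id rather than by plan_name so that the column order matches the requested output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company_claims (claim_id INTEGER, ac_name TEXT);
CREATE TABLE claim_plans (claim_id INTEGER, plan_id INTEGER, plan_name TEXT);
CREATE TABLE pending_amounts (claim_id INTEGER, plan_id INTEGER, pending_tally INTEGER);
""")
conn.executemany("INSERT INTO company_claims VALUES (?, ?)",
                 [(1234, "abc cooks"), (2341, "zzz bakers"), (3412, "azb pasta's")])
conn.executemany("INSERT INTO claim_plans VALUES (?, ?, ?)", [
    (1234, 101, "delux_prime"), (1234, 102, "Prime_express"), (1234, 103, "standard_prime"),
    (2341, 101, "delux_prime"), (2341, 103, "standard_prime"),
    (3412, 102, "Prime_express"),
])
conn.executemany("INSERT INTO pending_amounts VALUES (?, ?, ?)", [
    (1234, 101, 22), (1234, 102, 23), (1234, 103, 2),
    (2341, 101, 22), (2341, 103, 2),
    (3412, 102, 23),
])

# r is the position of the plan within its claim (1, 2, 3); each output
# column picks the plan at one position, or NULL if the claim has fewer plans.
query = """
WITH x AS (
    SELECT cc.claim_id, cc.ac_name, cp.plan_name, pa.pending_tally,
           ROW_NUMBER() OVER (PARTITION BY cc.claim_id ORDER BY cp.plan_id) AS r
    FROM company_claims cc
    JOIN claim_plans cp ON cp.claim_id = cc.claim_id
    JOIN pending_amounts pa ON pa.claim_id = cp.claim_id
                           AND pa.plan_id = cp.plan_id
)
SELECT claim_id, ac_name,
       MAX(CASE WHEN r = 1 THEN plan_name END)     AS plan_name_1,
       MAX(CASE WHEN r = 1 THEN pending_tally END) AS pending_tally1,
       MAX(CASE WHEN r = 2 THEN plan_name END)     AS plan_name_2,
       MAX(CASE WHEN r = 2 THEN pending_tally END) AS pending_tally2,
       MAX(CASE WHEN r = 3 THEN plan_name END)     AS plan_name_3,
       MAX(CASE WHEN r = 3 THEN pending_tally END) AS pending_tally3
FROM x
GROUP BY claim_id, ac_name
ORDER BY claim_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Because the tallies stay numeric here, no cast to varchar is needed, which is one practical difference from the label/value pivot.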