SQL PIVOT ON Across Multiple Tables - sql

I am trying to do a PIVOT (I'm running SQL Server 2008) across multiple tables, with no aggregate function involved. To be honest, I'm a little out of my depth here and am struggling to define the problem, so I figure I should just jump in and show you what I have. Firstly, I have three tables:
CHARTER_vessels
===============
vesselID vesselName
-------- ----------
1 The Titanic
2 The Pinafore
3 The Black Pearl
CHARTER_rateDateRange
=====================
rateDateRangeID rateDateRangeName
--------------- -----------------
1 Spring 2012
2 Summer 2012
3 Fall 2012
CHARTER_rates
=============
vesselID rateDateRangeID rateCost
-------- --------------- --------
1 1 434
1 2 445
1 3 231
2 1 675
2 2 545
2 3 768
3 1 543
3 2 654
3 3 658
And the output I'm trying to achieve is that the rates for each boat appear in the column for each season, like this:
vesselName Spring 2012 Summer 2012 Fall 2012
---------- ----------- ----------- ---------
The Titanic 434 445 231
The Pinafore 675 545 768
The Black Pearl 543 654 658
Obviously I would like to be able to sort the result set by the different columns if possible!

The query below assumes that each combination of vessel and date range is unique. If that isn't true and you don't want to aggregate, a pivot is not for you. An <aggregate>(rateCost) is required to use a SQL Server PIVOT: SQL Server needs a rule for what to return if a vessel has several rows for the same date range. If that never happens, the aggregate has no real effect on the result. The other option would be a series of self joins. Let me know if you need to see the self join solution.
SELECT pvt.vesselName, pvt.[Spring 2012], pvt.[Summer 2012], pvt.[Fall 2012]
FROM
(select vesselName, rateCost, rateDateRangeName
from CHARTER_rateDateRange crd
inner join CHARTER_rates cr on cr.rateDateRangeID = crd.rateDateRangeID
inner join CHARTER_vessels cv on cv.vesselID = cr.vesselID) AS src
PIVOT
(
max(rateCost)
FOR rateDateRangeName IN ([Spring 2012], [Summer 2012], [Fall 2012])
) AS pvt;
Ah, why not: in case someone else runs across this, here is the self join solution. Caution: not at all optimized.
with joinMe as (
select vesselName, rateCost, rateDateRangeName
from CHARTER_rateDateRange crd
inner join CHARTER_rates cr on cr.rateDateRangeID = crd.rateDateRangeID
inner join CHARTER_vessels cv on cv.vesselID = cr.vesselID
)
select a.vesselName,a.rateCost as 'Spring 2012',b.rateCost as 'Summer 2012',c.rateCost as 'Fall 2012'
from joinMe a
inner join joinMe b on b.vesselName = a.vesselName
and b.rateDateRangeName = 'Summer 2012'
inner join joinMe c on c.vesselName = a.vesselName
and c.rateDateRangeName = 'Fall 2012'
where a.rateDateRangeName = 'Spring 2012'
Due to the size limit I will write my reply to you here. What does the following return for you? Is there anything with a count greater than 1?
select vesselName, rateDateRangeName,count(rateCost)
from CHARTER_rateDateRange crd
inner join CHARTER_rates cr on cr.rateDateRangeID = crd.rateDateRangeID
inner join CHARTER_vessels cv on cv.vesselID = cr.vesselID
group by vesselName,rateDateRangeName
order by count(rateCost) desc
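For anyone who wants to check the pivoted shape without a SQL Server instance, here is a minimal sketch. It is an assumption-laden stand-in, not the answer's query: it runs on SQLite through Python's sqlite3 module, and since SQLite has no PIVOT operator it uses MAX(CASE ...) per season, which plays the same role as max(rateCost) inside PIVOT. The output column names (spring_2012 and so on) are made up for the example.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; MAX(CASE ...) replaces PIVOT.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CHARTER_vessels (vesselID INTEGER, vesselName TEXT);
CREATE TABLE CHARTER_rateDateRange (rateDateRangeID INTEGER, rateDateRangeName TEXT);
CREATE TABLE CHARTER_rates (vesselID INTEGER, rateDateRangeID INTEGER, rateCost INTEGER);
INSERT INTO CHARTER_vessels VALUES
  (1, 'The Titanic'), (2, 'The Pinafore'), (3, 'The Black Pearl');
INSERT INTO CHARTER_rateDateRange VALUES
  (1, 'Spring 2012'), (2, 'Summer 2012'), (3, 'Fall 2012');
INSERT INTO CHARTER_rates VALUES
  (1, 1, 434), (1, 2, 445), (1, 3, 231),
  (2, 1, 675), (2, 2, 545), (2, 3, 768),
  (3, 1, 543), (3, 2, 654), (3, 3, 658);
""")

# One output row per vessel; each CASE picks out the rate for one season.
pivot_rows = conn.execute("""
SELECT cv.vesselName,
       MAX(CASE WHEN crd.rateDateRangeName = 'Spring 2012' THEN cr.rateCost END) AS spring_2012,
       MAX(CASE WHEN crd.rateDateRangeName = 'Summer 2012' THEN cr.rateCost END) AS summer_2012,
       MAX(CASE WHEN crd.rateDateRangeName = 'Fall 2012'   THEN cr.rateCost END) AS fall_2012
FROM CHARTER_rates cr
JOIN CHARTER_rateDateRange crd ON crd.rateDateRangeID = cr.rateDateRangeID
JOIN CHARTER_vessels cv ON cv.vesselID = cr.vesselID
GROUP BY cv.vesselID, cv.vesselName
ORDER BY cv.vesselID
""").fetchall()

for row in pivot_rows:
    print(row)
```

To sort by a season, as the question asks, the ORDER BY can name one of the output columns instead of cv.vesselID.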

Related

Linear Interpolation in SQL

I work with crashes and mileage for the same year, which is the Year column in the table. Crashes are there for every record, but annual mileage is not. NULLs for mileage can occur at the beginning or at the end of the time period for a given customer, and a couple of annual mileage records in between can be missing as well. I do not know how to overcome this. I tried to do it with a CASE statement, but I do not know how to code it properly. The issue needs to be resolved in SQL, on SQL Server.
This is what the output looks like; I need to have mileage for every single year for each customer.
The data comes from a proprietary database and the records themselves must stay untouched. I just need code in the query that turns my current output into output where I have mileage for every year. I appreciate any input!
Year Customer Crashes Annual_Mileage
---- -------- ------- --------------
2009 123      5       3453453
2010 123      1       NULL
2011 123      0       54545
2012 123      14      376457435
2013 123      3       63453453
2014 123      4       NULL
2015 123      15      6346747
2016 123      0       NULL
2017 123      2       534534
2018 123      7       NULL
2019 123      11      NULL
2020 123      15      565435
2021 123      12      474567546
2022 123      7       NULL
Desired Results
Year Customer Crashes Annual_Mileage
---- -------- ------- --------------
2009 123      5       3453453
2010 123      1       175399 (prior value is taken)
2011 123      0       54545
2012 123      14      376457435
2013 123      3       63453453
2014 123      4       34900100 (avg of 2 adjacent values)
2015 123      15      6346747
2016 123      0       3440641 (avg of 2 adjacent values)
2017 123      2       534534
2018 123      7       534534 (prior value is taken)
2019 123      11      549985 (avg of 2 adjacent values)
2020 123      15      565435
2021 123      12      474567546
2022 123      7       474567546 (prior value is taken)
SELECT Year,
Customer,
Crashes,
CASE
WHEN Annual_Mlg IS NOT NULL THEN Annual_Mlg
WHEN Annual_Mlg IS NULL THEN
CASE
WHEN PREV.Annual_Mlg IS NOT NULL
AND NEXT.Annual_Mlg IS NOT NULL
THEN ( PREV.Annual_Mlg + NEXT.Annual_Mlg ) / 2
ELSE 0
END
END AS Annual_Mlg
FROM #table
The above code doesn't work, but I need to start somewhere and that is what I have currently.
I understand what I need to do; I just do not know how to code it in SQL.
After I applied the row_number() function I got this output (image not included) for the first 2 clients, while for the remaining 4 clients row_number() gave the correct output. I have no idea why that is. I thought it may be because I used a "full join" earlier to combine the mileage and crashes tables?
Your use of #table tells me that you're using MS SQL Server (a temporary table, probably in a stored procedure).
You want to:
select all the rows in #table
joined with the matching row (if any) for the previous year, and
joined with the matching row (if any) for the next year
Then it's easy. Assuming the primary key on your #table is composed of the year and customer columns, something like this ought to do you:
select t.year ,
t.customer ,
t.crashes ,
annual_mileage = coalesce(
t.annual_mileage ,
( coalesce( p.annual_mileage, 0 ) +
coalesce( n.annual_mileage, 0 )
) / 2
)
from #table t -- take all the rows
left join #table p on p.year = t.year - 1 -- with the matching row for
and p.customer = t.customer -- the previous year (if any)
left join #table n on n.year = t.year + 1 -- and the matching row for
and n.customer = t.customer -- the next year (if any)
Notes:
What value you default to if the previous or next year doesn't exist is up to you (zero? some arbitrary value?)
Is the previous/next year guaranteed to be the current year +/- 1?
If not, you may have to use derived tables as the source for the
prev/next data, selecting the closest previous/next year (that sort
of thing rather complicates the query significantly).
Edited To Note:
If you have discontiguous years for each customer, such that the "previous" and "next" years for a given customer are not necessarily the current year +/- 1, then something like this is probably the most straightforward way to find the previous/next year.
We use a derived table in our from clause and assign a sequential number in lieu of the year for each customer, using the ranking function row_number(). This query, then,
select row_nbr = row_number() over (
partition by x.customer
order by x.year
) ,
x.*
from #table x
would produce results along these lines:
row_nbr customer year ...
------- -------- ---- ---
1       123      1992 ...
2       123      1993 ...
3       123      1995 ...
4       123      2020 ...
1       456      2001 ...
2       456      2005 ...
3       456      2020 ...
And that leads us to this:
;with numbered as (
select row_nbr = row_number() over (
partition by x.customer
order by x.year
) ,
x.*
from #table x
)
select year = t.year ,
customer = t.customer ,
crashes = t.crashes ,
annual_mileage = coalesce(
t.annual_mileage,
(
coalesce(p.annual_mileage,0) +
coalesce(n.annual_mileage,0)
) / 2
)
from numbered t -- every row, numbered per customer
left join numbered p on p.customer = t.customer and p.row_nbr = t.row_nbr-1 -- previous row (if any)
left join numbered n on n.customer = t.customer and n.row_nbr = t.row_nbr+1 -- next row (if any)
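The row-number approach above can be tried out with a small sketch. It rests on several assumptions: it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module instead of SQL Server, the table name mileage and the four sample rows are made up, and a missing neighbour defaults to 0 as in the answer. That default is not the "prior value" rule the question asks for at the edges.

```python
import sqlite3

# Assumption: SQLite 3.25+ stands in for SQL Server; table and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mileage (year INTEGER, customer INTEGER, crashes INTEGER, annual_mileage INTEGER);
INSERT INTO mileage VALUES
  (2009, 123, 5, 100),
  (2010, 123, 1, NULL),
  (2012, 123, 14, 300),   -- note the gap: no row for 2011
  (2015, 123, 4, NULL);   -- last row, so there is no "next" neighbour
""")

filled = conn.execute("""
WITH numbered AS (
    -- number each customer's rows by year so neighbours are row_nbr +/- 1
    SELECT ROW_NUMBER() OVER (PARTITION BY customer ORDER BY year) AS row_nbr, m.*
    FROM mileage m
)
SELECT t.year, t.customer,
       COALESCE(t.annual_mileage,
                (COALESCE(p.annual_mileage, 0) + COALESCE(n.annual_mileage, 0)) / 2
       ) AS annual_mileage
FROM numbered t
LEFT JOIN numbered p ON p.customer = t.customer AND p.row_nbr = t.row_nbr - 1
LEFT JOIN numbered n ON n.customer = t.customer AND n.row_nbr = t.row_nbr + 1
ORDER BY t.customer, t.year
""").fetchall()

for row in filled:
    print(row)
```

The 2010 gap is filled with the average of its two neighbours even though the next row is 2012, not 2011. The 2015 row shows the cost of the zero default: it gets half of the previous value, because the missing "next" neighbour counts as 0.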

Aggregate payments per year per customer per type

Please consider the following payment data:
customerID paymentID paymentType paymentDate paymentAmount
---------------------------------------------------------------------
1 1 A 2015-11-28 500
1 2 A 2015-11-29 -150
1 3 B 2016-03-07 300
2 4 A 2015-03-03 200
2 5 B 2016-05-25 -100
2 6 C 2016-06-24 700
1 7 B 2015-09-22 110
2 8 B 2016-01-03 400
I need to tally per year, per customer, the sum of the diverse payment types (A = invoice, B = credit note, etc), as follows:
year customerID paymentType paymentSum
-----------------------------------------------
2015 1 A 350 : paymentID 1 + 2
2015 1 B 110 : paymentID 7
2015 1 C 0
2015 2 A 200 : paymentID 4
2015 2 B 0
2015 2 C 0
2016 1 A 0
2016 1 B 300 : paymentID 3
2016 1 C 0
2016 2 A 0
2016 2 B 300 : paymentID 5 + 8
2016 2 C 700 : paymentId 6
It is important that there are values for every category (so for 2015, customer 1 has 0 payment value for type C, but still it is good to see this).
In reality, there are over 10 payment types and about 30 customers. The total date range is 10 years.
Is this possible to do in only SQL, and if so could somebody show me how? If possible by using relatively easy queries so that I can learn from it, for instance by storing intermediary result into a #temptable.
Any help is greatly appreciated!
A simple GROUP BY with SUM() on paymentAmount will give you what you want:
select year = datepart(year, paymentDate),
customerID,
paymentType,
paymentSum = sum(paymentAmount)
from payment_data
group by datepart(year, paymentDate), customerID, paymentType
This is a simple query that generates the required 0s. Note that it may not be the most efficient way to generate this result set. If you already have lookup tables for customers or payment types, it would be preferable to use those rather than the CTEs [1] I use here:
create table #t (customerID int,paymentID int,paymentType char(1),paymentDate date,
paymentAmount int)
insert into #t(customerID,paymentID,paymentType,paymentDate,paymentAmount) values
(1,1,'A','20151128', 500),
(1,2,'A','20151129',-150),
(1,3,'B','20160307', 300),
(2,4,'A','20150303', 200),
(2,5,'B','20160525',-100),
(2,6,'C','20160624', 700),
(1,7,'B','20150922', 110),
(2,8,'B','20160103', 400)
;With Customers as (
select DISTINCT customerID from #t
), PaymentTypes as (
select DISTINCT paymentType from #t
), Years as (
select DISTINCT DATEPART(year,paymentDate) as Yr from #t
), Matrix as (
select
customerID,
paymentType,
Yr
from
Customers
cross join
PaymentTypes
cross join
Years
)
select
m.customerID,
m.paymentType,
m.Yr,
COALESCE(SUM(paymentAmount),0) as Total
from
Matrix m
left join
#t t
on
m.customerID = t.customerID and
m.paymentType = t.paymentType and
m.Yr = DATEPART(year,t.paymentDate)
group by
m.customerID,
m.paymentType,
m.Yr
Result:
customerID paymentType Yr Total
----------- ----------- ----------- -----------
1 A 2015 350
1 A 2016 0
1 B 2015 110
1 B 2016 300
1 C 2015 0
1 C 2016 0
2 A 2015 200
2 A 2016 0
2 B 2015 0
2 B 2016 300
2 C 2015 0
2 C 2016 700
(We may also want to play games with a numbers table and/or generate actual start and end dates for years if the date processing above needs to be able to use an index)
Note also how similar the top of my script is to the sample data in your question - except it's actual code that generates the sample data. You may wish to consider presenting sample code in such a way in the future since it simplifies the process of actually being able to test scripts in answers.
[1] CTEs - Common Table Expressions. They may be thought of as conceptually similar to temp tables - except we don't actually (necessarily) materialize the results. They also are incorporated into the single query that follows them and the whole query is optimized as a whole.
Your suggestion to use temp tables means that you'd be breaking this into multiple separate queries that then necessarily force SQL to perform the task in an order that we have selected rather than letting the optimizer choose the best approach for the above single query.
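To see what the cross-join matrix buys you over a plain GROUP BY, here is a small sketch that runs both on the question's sample rows. The assumptions: it uses SQLite through Python's sqlite3 module rather than SQL Server, strftime('%Y', ...) stands in for DATEPART(year, ...), and the view name payments_by_year is made up for the example.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the view name is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payment_data (customerID INTEGER, paymentID INTEGER,
                           paymentType TEXT, paymentDate TEXT, paymentAmount INTEGER);
INSERT INTO payment_data VALUES
  (1, 1, 'A', '2015-11-28',  500), (1, 2, 'A', '2015-11-29', -150),
  (1, 3, 'B', '2016-03-07',  300), (2, 4, 'A', '2015-03-03',  200),
  (2, 5, 'B', '2016-05-25', -100), (2, 6, 'C', '2016-06-24',  700),
  (1, 7, 'B', '2015-09-22',  110), (2, 8, 'B', '2016-01-03',  400);
CREATE VIEW payments_by_year AS
  SELECT CAST(strftime('%Y', paymentDate) AS INTEGER) AS yr,
         customerID, paymentType, paymentAmount
  FROM payment_data;
""")

# Plain GROUP BY: only combinations that actually have payments appear.
grouped = conn.execute("""
SELECT yr, customerID, paymentType, SUM(paymentAmount)
FROM payments_by_year
GROUP BY yr, customerID, paymentType
ORDER BY yr, customerID, paymentType
""").fetchall()

# Matrix of every year x customer x type, left-joined to the payments,
# so missing combinations come back with a total of 0.
matrix = conn.execute("""
WITH years AS (SELECT DISTINCT yr FROM payments_by_year),
     customers AS (SELECT DISTINCT customerID FROM payments_by_year),
     types AS (SELECT DISTINCT paymentType FROM payments_by_year)
SELECT y.yr, c.customerID, t.paymentType, COALESCE(SUM(p.paymentAmount), 0)
FROM years y
CROSS JOIN customers c
CROSS JOIN types t
LEFT JOIN payments_by_year p
       ON p.yr = y.yr AND p.customerID = c.customerID AND p.paymentType = t.paymentType
GROUP BY y.yr, c.customerID, t.paymentType
ORDER BY y.yr, c.customerID, t.paymentType
""").fetchall()

print(len(grouped), "rows from plain GROUP BY")
print(len(matrix), "rows from the cross-join matrix")
```

The plain GROUP BY gives the right sums but leaves out every combination with no payments, which is exactly the set of rows the question wants to see as 0.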

SQL How can this happen? - Query which normally returns 1 result alone actually resulted in multiple results when put inside WHERE clause

Question brief
I'm doing this practice exercise on w3resource and I can't understand why the solution works. I'm 2 days old to SQL. I'd very much appreciate it if someone could help explain it to me.
I have 2 tables, COMPANY(com_id, com_name) and PRODUCT(pro_name, pro_price, com_id). Each company has several products with different prices. Now I need to write a query to display companies' name together with their most expensive products respectively.
The sample answer on the practice is like this
SELECT c.com_name, p.pro_name, p.pro_price
FROM product p
INNER JOIN company c ON p.com_id = c.com_id
AND p.pro_price =
( SELECT MAX(p.pro_price)
FROM product p
WHERE p.com_id = c.com_id );
The query returned expected result.
com_name pro_name pro_price
--------- --------- -----------
Samsung Monitor 5000.00
iBall DVD drive 900.00
Epsion Printer 2600.00
Zebronics ZIP drive 250.00
Asus Mother Board 3200.00
Frontech Speaker 550.00
But I cannot understand how, especially the part inside the bottom sub-query. Isn't SELECT MAX(p.pro_price) supposed to return only 1 highest price of all companies together?
I also tried running this sub-query on its own, like this
SELECT MAX(p.pro_price)
FROM product p
INNER JOIN company c ON p.com_id = c.com_id
WHERE p.com_id = c.com_id;
... and it only returned 1 maximum value.
max(p.pro_price)
-----
5000.00
So how does the final result of the whole query include more than 1 record? There's no GROUP BY or anything.
By the way, the query seems to use 2 conditions for the INNER JOIN. I also tried moving the 2nd condition into a WHERE clause and it still worked the same. This is one more thing I don't understand.
The tables involved
COMPANY table
COM_ID | COM_NAME
----------------
11 | Samsung
12 | iBall
13 | Epsion
14 | Zebronics
15 | Asus
16 | Frontech
PRODUCT table
PRO_NAME PRO_PRICE COM_ID
-------------------- ---------- ---------
Mother Board 3200 15
Key Board 450 16
ZIP drive 250 14
Speaker 550 16
Monitor 5000 11
DVD drive 900 12
CD drive 800 12
Printer 2600 13
Refill cartridge 350 13
Mouse 250 12
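The sub-query is a correlated sub-query: it refers to c.com_id, a column of the outer query, so it is evaluated once for each row of the outer query rather than once for the whole table. Each evaluation returns the maximum price for that one company only. Your standalone test returned a single value because there c is joined inside the query itself, so nothing ties it to an outer row. The correlation comes from this condition:
WHERE p.com_id = c.com_id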
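If it helps, the "once per outer row" behaviour can be spelled out as a loop. The sketch below is only an illustration under some assumptions: it uses SQLite through Python's sqlite3 module, it gives the inner table its own alias p2 (the practice answer reuses p, which is legal but confusing), and the Python loop is a hand-written imitation of what the correlated sub-query does, not how any database engine actually executes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company (com_id INTEGER, com_name TEXT);
CREATE TABLE product (pro_name TEXT, pro_price INTEGER, com_id INTEGER);
INSERT INTO company VALUES
  (11, 'Samsung'), (12, 'iBall'), (13, 'Epsion'),
  (14, 'Zebronics'), (15, 'Asus'), (16, 'Frontech');
INSERT INTO product VALUES
  ('Mother Board', 3200, 15), ('Key Board', 450, 16), ('ZIP drive', 250, 14),
  ('Speaker', 550, 16), ('Monitor', 5000, 11), ('DVD drive', 900, 12),
  ('CD drive', 800, 12), ('Printer', 2600, 13), ('Refill cartridge', 350, 13),
  ('Mouse', 250, 12);
""")

# The practice query, with the inner alias renamed to p2 for clarity.
correlated = conn.execute("""
SELECT c.com_name, p.pro_name, p.pro_price
FROM product p
INNER JOIN company c ON p.com_id = c.com_id
WHERE p.pro_price = (SELECT MAX(p2.pro_price)
                     FROM product p2
                     WHERE p2.com_id = c.com_id)
ORDER BY c.com_id
""").fetchall()

# The same logic by hand: run the inner query once per company,
# plugging that company's com_id in where c.com_id appears.
by_hand = []
companies = conn.execute("SELECT com_id, com_name FROM company ORDER BY com_id").fetchall()
for com_id, com_name in companies:
    (max_price,) = conn.execute(
        "SELECT MAX(pro_price) FROM product WHERE com_id = ?", (com_id,)
    ).fetchone()
    matches = conn.execute(
        "SELECT pro_name, pro_price FROM product WHERE com_id = ? AND pro_price = ?",
        (com_id, max_price),
    ).fetchall()
    by_hand.extend((com_name, name, price) for name, price in matches)

print(correlated == by_hand)
```

Because the filter compares each product with the maximum of its own company, it makes no difference here whether the condition sits in the ON clause of the INNER JOIN or in a WHERE clause.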

SQL Server : take 1 to many record set and make 1 record per id

I need some help. I need to take the data from these 3 tables and create an output that looks like below. The plan_name_x and pending_tallyx columns are derived to make one line per claim id. Each claim id can be associated to up to 3 plans and I want to show each plan and tally amounts in one record. What is the best way to do this?
Thanks for any ideas. :)
Output result set needed:
claim_id ac_name plan_name_1 pending_tally1 plan_name_2 Pending_tally2 plan_name_3 pending_tally3
-------- ------- ----------- -------------- ----------- -------------- ----------- --------------
1234 abc cooks delux_prime 22 prime_express 23 standard_prime 2
2341 zzz bakers delux_prime 22 standard_prime 2 NULL NULL
3412 azb pasta's prime_express 23 NULL NULL NULL NULL
SQL Server 2005 table to use for the above result set:
company_claims
claim_id ac_name
1234 abc cooks
2341 zzz bakers
3412 azb pasta's
claim_plans
claim_id plan_id plan_name
1234 101 delux_prime
1234 102 Prime_express
1234 103 standard_prime
2341 101 delux_prime
2341 103 standard_prime
3412 102 Prime_express
Pending_amounts
claim_id plan_id Pending_tally
1234 101 22
1234 102 23
1234 103 2
2341 101 22
2341 103 2
3412 102 23
If you know that 3 is always the maximum number of plans, then some left joins will work fine:
select c.claim_id, c.ac_name,
cp1.plan_name as plan_name_1, pa1.pending_tally as pending_tally1,
cp2.plan_name as plan_name_2, pa2.pending_tally as pending_tally2,
cp3.plan_name as plan_name_3, pa3.pending_tally as pending_tally3
from company_claims c
left join claim_plans cp1 on c.claim_id = cp1.claim_id and cp1.plan_id = 101
left join claim_plans cp2 on c.claim_id = cp2.claim_id and cp2.plan_id = 102
left join claim_plans cp3 on c.claim_id = cp3.claim_id and cp3.plan_id = 103
left join pending_amounts pa1 on cp1.claim_id = pa1.claim_id and cp1.plan_id = pa1.plan_id
left join pending_amounts pa2 on cp2.claim_id = pa2.claim_id and cp2.plan_id = pa2.plan_id
left join pending_amounts pa3 on cp3.claim_id = pa3.claim_id and cp3.plan_id = pa3.plan_id
I would first join all your data so that you get the relevant columns: claim_id, ac_name, plan_name, pending tally.
Then I would transform this to get plan name and plan tally on different rows, with a label tying them together.
Then it should be easy to pivot.
I would tie these together with common table expressions.
Here's the query:
with X as (
select cc.*, cp.plan_name, pa.pending_tally,
rank() over (partition by cc.claim_id order by plan_name) as r
from company_claims cc
join claim_plans cp on cp.claim_id = cc.claim_id
join pending_amounts pa on pa.claim_id = cp.claim_id
and pa.plan_id = cp.plan_id
), P as (
select
X.claim_id,
x.ac_name,
x.plan_name as value,
'plan_name_' + cast(r as varchar(max)) as label
from x
union all
select
X.claim_id,
x.ac_name,
cast(x.pending_tally as varchar(max)) as value,
'pending_tally' + cast(r as varchar(max)) as label
from x
)
select claim_id, ac_name, [plan_name_1], [pending_tally1],[plan_name_2], [pending_tally2],[plan_name_3], [pending_tally3]
from (select * from P) p
pivot (
max(value)
for label in ([plan_name_1], [pending_tally1],[plan_name_2], [pending_tally2],[plan_name_3], [pending_tally3])
) as pvt
order by pvt.claim_id, ac_name
Here's a fiddle showing it in action: http://sqlfiddle.com/#!3/68f62/10

SQL multiple 1-to-many joins

I'm almost certain I've run into this before and am just having an extended senior moment. I am trying to pull work order data from three different tables across 2 databases on a SQL instance and combine it all into a report. I'm looking for the end result to contain the following columns:
WO | Production Recorded Qty | Inventory Qty | Variance
The Variance part is easy: I can just nest the select statement and subtract the two quantities in the outer statement. The problem I'm running into is that when I join the Production and Inventory tables in their corresponding databases, I end up with sums of the target columns that are way larger than they should be.
Sample Data:
Work Order, Short P/N, and Long P/N in Work Order Table:
dba.1
WO ShortPN LongPN
152 1234 Part1
Short P/N, Quantity on hand, location, and lot # in inventory table:
dba.2
ShortPN Qty Loc Lot
1234 31 Loc1 456
1234 0 Loc2 456
1234 0 Loc4 456
1234 19 Loc1 789
1234 25 Loc4 789
Work Order, Long P/N, and production count of the last 5min in Production table:
dbb.3
WO LongPN Count
152 Part3 6
152 Part3 8
152 Part3 9
152 Part3 4
152 Part3 6
152 Part3 7
With this example I've tried:
SELECT 1.WO AS WO
,SUM(2.Qty) AS "Qty On Hand"
,SUM(3.Count) AS "Produced Qty"
FROM dba.1
INNER JOIN dba.2 ON 1.ShortPN=2.ShortPN
INNER JOIN dbb.3 ON 1.WO = 3.WO
GROUP BY 1.WO
I've also tried selecting from 3, joining 1 to 3 on the WO, then 2 to 1 on ShortPN. Both approaches yield SUM() numbers that are far higher than they should be (i.e. what should be 15,xxx turns into over 2,000,000). However, if I remove one of the data points from the report and select just the inventory or production qty, I get the correct end results. I swear that I've run into this before but for the life of me can't remember how it was solved. Sorry if it's a duplicate question; I couldn't find anything by searching.
Thanks in advance for the help, it's greatly appreciated.
You can do something like this
select
WO.WO, isnull(i.Qty, 0) as Qty, isnull(p.[Count], 0) as [Count]
from WorkOrder as WO
left outer join (select t.ShortPN, sum(t.Qty) as Qty from inventory as t group by t.ShortPN) as i on
i.ShortPN = WO.ShortPN
left outer join (select t.WO, sum(t.[Count]) as [Count] from Production as t group by t.WO) as p on
p.WO = WO.WO
if you have SQL Server 2005 or higher, you can write it like this
select
WO.WO, isnull(i.Qty, 0) as Qty, isnull(p.[Count], 0) as [Count]
from WorkOrder as WO
outer apply (select sum(t.Qty) as Qty from inventory as t where t.ShortPN = WO.ShortPN) as i
outer apply (select sum(t.[Count]) as [Count] from Production as t where t.WO = WO.WO) as p
This happens because when you join the WO and inventory tables you get
WO SHORTPN QTY
-------------------
152 1234 31
152 1234 0
152 1234 0
152 1234 19
152 1234 25
You can see that there are now 5 rows with WO = 152. When you add the join with the Production table, each of those rows is matched with all 6 rows with WO = 152 from Production, so you get 30 rows in total and each QTY from inventory is listed 6 times. When you sum this up, instead of 75 you get 75 * 6 = 450. Likewise each Count is repeated 5 times, so instead of 40 you get 200.
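The arithmetic above is easy to reproduce. The following sketch makes a few assumptions: it runs on SQLite through Python's sqlite3 module rather than SQL Server, and the table and column names (work_order, inventory, production, cnt) are made up, since the question only uses placeholders. It compares the naive double join with sums that are aggregated per table before joining.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; all names here are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE work_order (wo INTEGER, short_pn INTEGER);
CREATE TABLE inventory (short_pn INTEGER, qty INTEGER);
CREATE TABLE production (wo INTEGER, cnt INTEGER);
INSERT INTO work_order VALUES (152, 1234);
INSERT INTO inventory VALUES (1234, 31), (1234, 0), (1234, 0), (1234, 19), (1234, 25);
INSERT INTO production VALUES (152, 6), (152, 8), (152, 9), (152, 4), (152, 6), (152, 7);
""")

# Naive: both one-to-many joins at once give 5 x 6 = 30 rows per work order,
# so each qty is counted 6 times and each cnt 5 times.
naive = conn.execute("""
SELECT w.wo, SUM(i.qty), SUM(p.cnt)
FROM work_order w
JOIN inventory i ON i.short_pn = w.short_pn
JOIN production p ON p.wo = w.wo
GROUP BY w.wo
""").fetchone()

# Pre-aggregated: collapse each child table to one row per key before joining.
fixed = conn.execute("""
SELECT w.wo, COALESCE(i.qty, 0), COALESCE(p.cnt, 0)
FROM work_order w
LEFT JOIN (SELECT short_pn, SUM(qty) AS qty FROM inventory GROUP BY short_pn) i
       ON i.short_pn = w.short_pn
LEFT JOIN (SELECT wo, SUM(cnt) AS cnt FROM production GROUP BY wo) p
       ON p.wo = w.wo
""").fetchone()

print("naive:", naive)
print("fixed:", fixed)
```

The inflation factor is always the row count of the other child table for that key, which is why the totals in a real report can be off by several orders of magnitude.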