SQL Crosstab with undetermined columns - sql

I see a lot of similar questions, but almost all of them wind up grouping results as column names (column names based on results); mine is a simpler list. I don't care whether it uses dynamic SQL or not (I'd think it has to).
Please don't tell me I need to restructure the tables, I'm working from a legacy system and don't have that option.
Basically, I just need a list of all valid table "B" entries that match a given record from table "A", in a row.
I don't have any code sample yet, because I'm not seeing a way to set this up correctly.
Table: Customer c
CustID  Name
1       Bill Smith
2       Jim Jones
3       Mary Adams
4       Wendy Williams
Table: Debt d
CustID  Creditor            Balance
1       ABC Loans           245
1       Citibank            815
2       Soprano Financial   74000
3       Citibank            24
3       Soprano Financial   93000
3       Wells Fargo         275
3       Midwestern S&L      2500
4       ABC Loans           1500
4       Fred's Payday Loan  1000
Desired Output:
Name            Cred1              Bal1   Cred2               Bal2   Cred3        Bal3    Cred4           Bal4
Bill Smith      ABC Loans          245    Citibank            815    (NULL)       (NULL)  (NULL)          (NULL)
Jim Jones       Soprano Financial  74000  (NULL)              (NULL) (NULL)       (NULL)  (NULL)          (NULL)
Mary Adams      Citibank           24     Soprano Financial   93000  Wells Fargo  275     Midwestern S&L  2500
Wendy Williams  ABC Loans          1500   Fred's Payday Loan  1000   (NULL)       (NULL)  (NULL)          (NULL)
Basically, I probably have to find the largest number of Debt records for any single CustID and define the output columns based on that. If this has already been answered, feel free to link and close this out; I did not see this specific scenario when I searched.

Here is another dynamic approach. We use Row_Number() to create the minimal number of columns.
Example
Declare @SQL varchar(max) = Stuff((Select Distinct ','+QuoteName(concat('Cred',ColNr))
                                                  +','+QuoteName(concat('Bal',ColNr))
                                   From (Select ColNr=Row_Number() over (Partition By CustID Order By Creditor) From Debt ) A
                                   Order By 1
                                   For XML Path('')),1,1,'')
Select @SQL = '
Select *
 From (
        Select C.Name
              ,B.*
         From (
                Select *,ColNr=Row_Number() over (Partition By CustID Order By Creditor)
                 From Debt
              ) A
         Cross Apply (values (concat(''Cred'',ColNr),[Creditor])
                            ,(concat(''Bal'' ,ColNr),cast(Balance as varchar(25)))
                     ) B (Item,Value)
         Join Customer C on A.CustID=C.CustID
      ) A
 Pivot (max([Value]) For [Item] in (' + @SQL + ') ) p'
--Print @SQL
Exec(@SQL);
Returns
If it helps, the generated SQL looks like this:
Select *
From (
Select C.Name
,B.*
From (
Select *,ColNr=Row_Number() over (Partition By CustID Order By Creditor)
From Debt
) A
Cross Apply (values (concat('Cred',ColNr),[Creditor])
,(concat('Bal' ,ColNr) ,cast(Balance as varchar(25)))
) B (Item,Value)
Join Customer C on A.CustID=C.CustID
) A
Pivot (max([Value]) For [Item] in ([Cred1],[Bal1],[Cred2],[Bal2],[Cred3],[Bal3],[Cred4],[Bal4]) ) p
Just for the Visualization, the query "feeding" the Pivot generates:

I'll assume you already know how to use a crosstab, so you only need to prepare your data for it.
STEP 1: Join both tables:
SELECT c.Name, d.Creditor, d.Balance
FROM Customer c
JOIN Debt d
ON c.CustID = d.CustID
STEP 2: Add a row number to each of a customer's rows; you will use it to build the crosstab column names
SELECT c.Name, d.Creditor, d.Balance,
ROW_NUMBER() over (PARTITION BY Name ORDER BY creditor) as rn
FROM Customer c
JOIN Debt d
ON c.CustID = d.CustID
Now you have:
CustID  Creditor            Balance  rn
1       ABC Loans           245      1
1       Citibank            815      2
2       Soprano Financial   74000    1
3       Citibank            24       1
3       Soprano Financial   93000    2
3       Wells Fargo         275      3
3       Midwestern S&L      2500     4
4       ABC Loans           1500     1
4       Fred's Payday Loan  1000     2
STEP 3: Create the SOURCE for the cross tab
WITH cte as (
<query from step2>
)
SELECT Name,
'CREDITOR_' + RIGHT('000' + CAST(rn AS VARCHAR(3)),3) as cross_tab,
Creditor as Value
FROM cte
UNION all
SELECT Name,
'DEBT_' + RIGHT('000' + CAST(rn AS VARCHAR(3)),3) as cross_tab,
CAST(Balance as VARCHAR(max)) as Value
FROM cte
Now you have:
CustID  cross_tab     Value
1       CREDITOR_001  ABC Loans
1       CREDITOR_002  Citibank
2       CREDITOR_001  Soprano Financial
3       CREDITOR_001  Citibank
3       CREDITOR_002  Soprano Financial
3       CREDITOR_003  Wells Fargo
3       CREDITOR_004  Midwestern S&L
4       CREDITOR_001  ABC Loans
4       CREDITOR_002  Fred's Payday Loan
1       DEBT_001      245
1       DEBT_002      815
2       DEBT_001      74000
3       DEBT_001      24
3       DEBT_002      93000
3       DEBT_003      275
3       DEBT_004      2500
4       DEBT_001      1500
4       DEBT_002      1000
EDIT: I use CustID instead of Name on the example but too lazy to change now.
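To round off the steps above, here is a minimal sketch of the final crosstab. It is not SQL Server code: it runs the same idea against an in-memory SQLite database through Python's sqlite3 module (SQLite 3.25 or later is assumed for ROW_NUMBER), and it swaps PIVOT for conditional aggregation with MAX(CASE ...). The function name build_crosstab and the Cred1/Bal1 column naming are assumptions made for illustration.

```python
import sqlite3


def build_crosstab(conn):
    """Return (column_names, rows) with one output row per customer."""
    # The customer with the most debts decides how many column pairs exist.
    (width,) = conn.execute(
        "SELECT COALESCE(MAX(n), 0) FROM "
        "(SELECT COUNT(*) AS n FROM Debt GROUP BY CustID)"
    ).fetchone()

    # Only integers are interpolated, so no user text ends up in the SQL.
    select_list = ["Name"]
    for i in range(1, width + 1):
        select_list.append(f"MAX(CASE WHEN rn = {i} THEN Creditor END) AS Cred{i}")
        select_list.append(f"MAX(CASE WHEN rn = {i} THEN Balance END) AS Bal{i}")

    sql = f"""
        WITH numbered AS (
            SELECT c.CustID, c.Name, d.Creditor, d.Balance,
                   ROW_NUMBER() OVER (PARTITION BY c.CustID
                                      ORDER BY d.Creditor) AS rn
            FROM Customer c
            JOIN Debt d ON d.CustID = c.CustID
        )
        SELECT {", ".join(select_list)}
        FROM numbered
        GROUP BY CustID, Name
        ORDER BY CustID
    """
    cur = conn.execute(sql)
    return [col[0] for col in cur.description], cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustID INTEGER, Name TEXT)")
conn.execute("CREATE TABLE Debt (CustID INTEGER, Creditor TEXT, Balance INTEGER)")
conn.executemany("INSERT INTO Customer VALUES (?, ?)", [
    (1, "Bill Smith"), (2, "Jim Jones"), (3, "Mary Adams"), (4, "Wendy Williams"),
])
conn.executemany("INSERT INTO Debt VALUES (?, ?, ?)", [
    (1, "ABC Loans", 245), (1, "Citibank", 815),
    (2, "Soprano Financial", 74000),
    (3, "Citibank", 24), (3, "Soprano Financial", 93000),
    (3, "Wells Fargo", 275), (3, "Midwestern S&L", 2500),
    (4, "ABC Loans", 1500), (4, "Fred's Payday Loan", 1000),
])

header, rows = build_crosstab(conn)
print(header)
for row in rows:
    print(row)
```

Because the rows are numbered by Creditor, Mary Adams's creditors come out alphabetically (Citibank, Midwestern S&L, Soprano Financial, Wells Fargo) rather than in the order listed in the question. In SQL Server the same select list could presumably be assembled into a string and executed dynamically, or the prepared rows could be fed to PIVOT.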

Related

How to assign filters to row number () function in sql

I am trying to extract only single row after name = system in each case where the town is not Austin.
In case 1001 there are 8 rows, row # 4 is system, output should be only the row with Name=Terry and Date Moved=7/4/2019 (Next entry with town /= Austin)
Case  Name      Town        Date Moved  Row # (not in table)
1001  Ted       Madisson    9/7/2018    1
1001  Joyal     Boston      10/4/2018   2
1001  Beatrice  Chicago     1/1/2019    3
1001  System    Chicago     1/5/2019    4
1001  John      Austin      4/11/2019   5
1001  Simon     Austin      6/11/2019   6
1001  Terry     Cleveland   7/4/2019    7
1001  Hawkins   Newyork     8/4/2019    8
1002  Devon     Boston      12/4/2018   1
1002  Joy       Austin      12/7/2018   2
1002  Rachael   Newyork     12/19/2018  3
1002  Bill      Chicago     1/4/2019    4
1002  System    Dallas      2/12/2019   5
1002  Phil      Austin      3/16/2019   6
1002  Dan       Seattle     5/18/2019   7
1002  Claire    Birmingham  7/7/2019    8
Tried sub query with row number function and not in ('Austin') filter
ROW_NUMBER() OVER(PARTITION BY Case ORDER BY Moved_date ASC) AS ROWNUM
Please note there are > 10k cases.
You can try the script below:
WITH CTE AS
(
SELECT [Case],[Name],Town,[Date Moved],
ROW_NUMBER() OVER (PARTITION BY [Case] ORDER BY [Date Moved]) [Row #]
FROM your_table
)
SELECT A.*
FROM CTE A
INNER JOIN
(
SELECT C.[Case],C.Town,MAX(C.[Row #]) MRN
FROM CTE C
INNER JOIN
(
SELECT *
FROM CTE A
WHERE A.Name = 'System'
)D ON C.[Case] = D.[Case] AND C.[Row #] > D.[Row #]
AND C.Town = 'Austin'
GROUP BY C.[Case],C.Town
)B ON A.[Case] = B.[Case] AND A.[Row #] = B.MRN+1
Output is:
Case  Name   Town       Date Moved  Row #
1001  Terry  Cleveland  7/4/2019    7
1002  Dan    Seattle    5/18/2019   7
Here are three possibilities. I'm still concerned about ties though. The first one will return multiple rows while the others only one per case:
-- Option 1: join back on the earliest qualifying date (can return ties)
with matches as (
    select t1."case", min(t2."Date Moved") as "Date Moved"
    from Movements t1 inner join Movements t2 on t1."case" = t2."case"
    where t1.name = 'System' and t2.Town <> 'Austin'
      and t2."Date Moved" > t1."Date Moved"
    group by t1."case"
)
select t.*
from Movements t inner join matches m
    on m."case" = t."case" and m."Date Moved" = t."Date Moved";

-- Option 2: cross apply, taking the first qualifying row of the same case
select nxt.*
from Movements m1 cross apply (
    select top 1 * from Movements m2
    where m2."case" = m1."case"
      and m2.Town <> 'Austin' and m2."Date Moved" > m1."Date Moved"
    order by m2."Date Moved"
) as nxt
where m1.name = 'System';

-- Option 3: running count flags rows from the System row onward
with m1 as (
    select *,
        count(case when name = 'System' then 1 end) over (partition by "case" order by "Date Moved") as flag
    from Movements
), m2 as (
    select *,
        row_number() over (partition by "case" order by "Date Moved") as rn
    from m1
    where flag = 1 and name <> 'System' and Town <> 'Austin'
)
select * from m2 where rn = 1;
I'm basically assuming this is SQL Server. You might need a few minor tweaks if not.
It also does not require a town named Austin to fall between the "System" row and the desired row as I do not believe that was a stated requirement.
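The last of the three variants (a running count that flags rows from the "System" row onward, then the first remaining non-Austin row per case) can be tried without a SQL Server instance. The sketch below makes several assumptions: it uses Python's sqlite3 module with SQLite 3.25 or later rather than T-SQL, the columns case_id and moved are renamed stand-ins for "Case" and "Date Moved", and the dates are stored as ISO strings so that text ordering matches date ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Movements (case_id INTEGER, name TEXT, town TEXT, moved TEXT)"
)
conn.executemany("INSERT INTO Movements VALUES (?, ?, ?, ?)", [
    (1001, "Ted", "Madisson", "2018-09-07"),
    (1001, "Joyal", "Boston", "2018-10-04"),
    (1001, "Beatrice", "Chicago", "2019-01-01"),
    (1001, "System", "Chicago", "2019-01-05"),
    (1001, "John", "Austin", "2019-04-11"),
    (1001, "Simon", "Austin", "2019-06-11"),
    (1001, "Terry", "Cleveland", "2019-07-04"),
    (1001, "Hawkins", "Newyork", "2019-08-04"),
    (1002, "Devon", "Boston", "2018-12-04"),
    (1002, "Joy", "Austin", "2018-12-07"),
    (1002, "Rachael", "Newyork", "2018-12-19"),
    (1002, "Bill", "Chicago", "2019-01-04"),
    (1002, "System", "Dallas", "2019-02-12"),
    (1002, "Phil", "Austin", "2019-03-16"),
    (1002, "Dan", "Seattle", "2019-05-18"),
    (1002, "Claire", "Birmingham", "2019-07-07"),
])

query = """
WITH flagged AS (
    -- flag is 0 before the System row and 1 from the System row onward
    SELECT *,
           COUNT(CASE WHEN name = 'System' THEN 1 END)
               OVER (PARTITION BY case_id ORDER BY moved) AS flag
    FROM Movements
), ranked AS (
    -- number the surviving rows per case, earliest first
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY case_id ORDER BY moved) AS rn
    FROM flagged
    WHERE flag = 1 AND name <> 'System' AND town <> 'Austin'
)
SELECT case_id, name, town, moved
FROM ranked
WHERE rn = 1
ORDER BY case_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With the sample data this picks Terry for case 1001 and Dan for case 1002. The filter flag = 1 assumes at most one "System" row per case; with several, a condition such as flag >= 1 might be closer to the intent.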

Include a column to count records with a specific value

I want to return all data in a table and append a column that counts the number of records in a subset (say, the number of houses in a neighborhood).
I tried
CASE
WHEN EXISTS (SELECT 1 as [parcels]
FROM dbo.parcels p2
WHERE p2.Neighborhood = p.Neighborhood)
THEN COUNT([parcels]) END -- can't count outside subquery
as [TotalProps]
The subquery itself returns a value of 1 for each property record in any given neighborhood, but I can't count/sum the [parcels] outside of the subquery in a THEN statement.
Input Table:
dbo.parcels
ID Address Neighborhood
== ======= ============
1 123 Main St MITO
2 124 Main St MITO
3 200 2nd St MITO
4 201 2nd St MITO
5 5 Park Ave FAIRWIND
6 1600 Baker St GALLERY
7 1601 Baker St GALLERY
8 1602 Baker St GALLERY
SELECT *, <<<COUNT(neighborhood props)>>> as [TotalProps]
FROM dbo.parcels p
Expected Output:
ID Address Neighborhood TotalProps
== ======= ============ ==========
1 123 Main St MITO 4
2 124 Main St MITO 4
3 200 2nd St MITO 4
4 201 2nd St MITO 4
5 5 Park Ave FAIRWIND 1
6 1600 Baker St GALLERY 3
7 1601 Baker St GALLERY 3
8 1602 Baker St GALLERY 3
You can use COUNT as a window aggregate with OVER (PARTITION BY ...):
SELECT
p.*,
COUNT(ID) OVER(PARTITION BY Neighborhood) AS TotalProps
FROM dbo.parcels p
Use window functions:
select p.*, count(*) over (partition by neighborhood) as TotalProps
from dbo.parcels p;
Keeping things simple - a basic subselect will give you what you need ...
SELECT
p.*,
(
select count(*)
FROM dbo.parcels p2
WHERE p2.neighborhood = p.neighborhood ) AS hoodcount
FROM dbo.parcels p
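As a quick check that the windowed count and the correlated subselect agree, here is a small sketch using Python's sqlite3 module. It assumes SQLite 3.25 or later for window functions, and the table is named parcels without the dbo schema, since SQLite has no such schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (ID INTEGER, Address TEXT, Neighborhood TEXT)")
conn.executemany("INSERT INTO parcels VALUES (?, ?, ?)", [
    (1, "123 Main St", "MITO"),
    (2, "124 Main St", "MITO"),
    (3, "200 2nd St", "MITO"),
    (4, "201 2nd St", "MITO"),
    (5, "5 Park Ave", "FAIRWIND"),
    (6, "1600 Baker St", "GALLERY"),
    (7, "1601 Baker St", "GALLERY"),
    (8, "1602 Baker St", "GALLERY"),
])

# Window aggregate: every row keeps its detail and gains the group count.
windowed = conn.execute("""
    SELECT p.ID, p.Neighborhood,
           COUNT(*) OVER (PARTITION BY p.Neighborhood) AS TotalProps
    FROM parcels p
    ORDER BY p.ID
""").fetchall()

# Correlated subselect: same result, counted once per outer row.
correlated = conn.execute("""
    SELECT p.ID, p.Neighborhood,
           (SELECT COUNT(*)
            FROM parcels p2
            WHERE p2.Neighborhood = p.Neighborhood) AS TotalProps
    FROM parcels p
    ORDER BY p.ID
""").fetchall()

for row in windowed:
    print(row)
```

Both queries yield the counts from the expected output (4 for MITO, 1 for FAIRWIND, 3 for GALLERY).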

SQL displaying results based on a value in column

So I have 2 tables in web SQL , one of them looks like this(there are thousands of rows):
customer_number | order_number
--------------------------------------------
1234 12
1234 13
1234 14
6793 20
6793 22
3210 53
etc.
And the other table like this(also thousands of rows):
customer_number | first_purchase_year
----------------------------------------------------
1234 2010
5313 2001
1632 2018
9853 2017
6793 2000
3210 2005
etc.
I have this code to select 10 customers from the first table and list all their purchases:
select top 10 * from
(select distinct t1.customer_number,
stuff((select '' + t2.order_number
from orders t2
where t1.customer_number = t2.customer_number
for xml path(''), type
).value('.','NVARCHAR(MAX)')
,1,0,'')DATA
from orders t1) a
Which outputs this:
customer_number | order_number
--------------------------------------------
1234 12 13 14
6793 20 22
3210 53
What I need to do is ONLY display 10 random customers that have first_purchase_year > 2010.
I am not sure how to check if first_purchase_year corresponding to a customer_number is greater than 2010.
Thank you!
You just need to fix the subquery in the outer from clause:
select c.customer_number,
stuff((select '' + o2.order_number
from orders o2
where c.customer_number = o2.customer_number
for xml path(''), type
).value('.','NVARCHAR(MAX)'
), 1, 0, ''
) as data
from (select top (10) c.customer_number
from table2 c
where c.first_purchase_year > 2010
) c;

get extra rows for each group where date doesn't exist

I've been playing with this for days, and can't seem to come up with something. I have this query:
select
v.emp_name as Name
,MONTH(v.YearMonth) as m
,v.SalesTotal as Amount
from SalesTotals v
Which gives me these results:
Name m Amount
Smith 1 123.50
Smith 2 40.21
Smith 3 444.21
Smith 4 23.21
Jones 1 121.00
Jones 2 499.00
Jones 3 23.23
Jones 4 41.82
etc....
What I need to do is use a JOIN or something, so that I get a NULL value for each month (1-12), for each name:
Name m Amount
Smith 1 123.50
Smith 2 40.21
Smith 3 444.21
Smith 4 23.21
Smith 5 NULL
Smith 6 NULL
Smith ... NULL
Smith 12 NULL
Jones 1 121.00
Jones 2 499.00
Jones 3 23.23
Jones 4 41.82
Jones 5 NULL
Jones ... NULL
Jones 12 NULL
etc....
I have a "Numbers" table, and have tried doing:
select
v.emp_name as Name
,MONTH(v.YearMonth) as m
,v.SalesTotal as Amount
from SalesTotals v
FULL JOIN Number n on n.Number = MONTH(v.YearMonth) and n.Number in (1,2,3,4,5,6,7,8,9,10,11,12)
But that only gives me 6 additional NULL rows, where what I want is actually 6 NULL rows for each group of names. I've tried using Group By, but not sure how to use it in a JOIN statement like that, and not even sure if that's the correct route to take.
Any advice or direction is much appreciated!
Here's one way to do it:
select
s.emp_name as Name
,s.Number as m
,st.salestotal as Amount
from (
select distinct emp_name, number
from salestotals, numbers
where number between 1 and 12) s left join salestotals st on
s.emp_name = st.emp_name and s.number = month(st.yearmonth)
You could do:
SELECT EN.emp_name Name,
N.Number M,
ST.SalesTotal Amount
FROM ( SELECT Number
FROM NumberTable
WHERE Number BETWEEN 1 AND 12) N
CROSS JOIN (SELECT DISTINCT emp_name
FROM SalesTotals) EN
LEFT JOIN SalesTotals ST
ON N.Number = MONTH(ST.YearMonth)
AND EN.emp_name = ST.emp_name
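The cross join of months and distinct names, followed by a left join back to the sales rows, can be sketched in SQLite through Python's sqlite3 module. Several things here are assumptions rather than part of the answers: a recursive CTE stands in for the Numbers table, strftime('%m', ...) replaces MONTH(), YearMonth is stored as an ISO date string, and the year 2019 is made up because the question shows only month numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesTotals (emp_name TEXT, YearMonth TEXT, SalesTotal REAL)"
)
conn.executemany("INSERT INTO SalesTotals VALUES (?, ?, ?)", [
    ("Smith", "2019-01-01", 123.50),
    ("Smith", "2019-02-01", 40.21),
    ("Smith", "2019-03-01", 444.21),
    ("Smith", "2019-04-01", 23.21),
    ("Jones", "2019-01-01", 121.00),
    ("Jones", "2019-02-01", 499.00),
    ("Jones", "2019-03-01", 23.23),
    ("Jones", "2019-04-01", 41.82),
])

query = """
WITH RECURSIVE months(m) AS (
    -- generates 1..12 in place of a Numbers table
    SELECT 1
    UNION ALL
    SELECT m + 1 FROM months WHERE m < 12
)
SELECT en.emp_name, months.m, st.SalesTotal
FROM months
CROSS JOIN (SELECT DISTINCT emp_name FROM SalesTotals) en
LEFT JOIN SalesTotals st
       ON st.emp_name = en.emp_name
      AND CAST(strftime('%m', st.YearMonth) AS INTEGER) = months.m
ORDER BY en.emp_name, months.m
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each name gets exactly twelve rows, and months without a sale come back with None (NULL) as the amount. If the table held more than one year, the join would also need a year condition.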

SQL Server : take 1 to many record set and make 1 record per id

I need some help. I need to take the data from these 3 tables and create an output that looks like below. The plan_name_x and pending_tallyx columns are derived to make one line per claim id. Each claim id can be associated to up to 3 plans and I want to show each plan and tally amounts in one record. What is the best way to do this?
Thanks for any ideas. :)
Output result set needed:
claim_id  ac_name      plan_name_1    pending_tally1  plan_name_2     pending_tally2  plan_name_3     pending_tally3
--------  -----------  -------------  --------------  --------------  --------------  --------------  --------------
1234      abc cooks    delux_prime    22              prime_express   23              standard_prime  2
2341      zzz bakers   delux_prime    22              standard_prime  2               NULL            NULL
3412      azb pasta's  prime_express  23              NULL            NULL            NULL            NULL
SQL Server 2005 table to use for the above result set:
company_claims
claim_id ac_name
1234 abc cooks
2341 zzz bakers
3412 azb pasta's
claim_plans
claim_id plan_id plan_name
1234 101 delux_prime
1234 102 Prime_express
1234 103 standard_prime
2341 101 delux_prime
2341 103 standard_prime
3412 102 Prime_express
Pending_amounts
claim_id plan_id Pending_tally
1234 101 22
1234 102 23
1234 103 2
2341 101 22
2341 103 2
3412 102 23
If you know that 3 is always the maximum number of plans, then some left joins will work fine:
select c.claim_id, c.ac_name,
cp1.plan_name as plan_name_1, pa1.pending_tally as pending_tally1,
cp2.plan_name as plan_name_2, pa2.pending_tally as pending_tally2,
cp3.plan_name as plan_name_3, pa3.pending_tally as pending_tally3
from company_claims c
left join claim_plans cp1 on c.claim_id = cp1.claim_id and cp1.plan_id = 101
left join claim_plans cp2 on c.claim_id = cp2.claim_id and cp2.plan_id = 102
left join claim_plans cp3 on c.claim_id = cp3.claim_id and cp3.plan_id = 103
left join pending_amounts pa1 on cp1.claim_id = pa1.claim_id and cp1.plan_id = pa1.plan_id
left join pending_amounts pa2 on cp2.claim_id = pa2.claim_id and cp2.plan_id = pa2.plan_id
left join pending_amounts pa3 on cp3.claim_id = pa3.claim_id and cp3.plan_id = pa3.plan_id
I would first join all your data so that you get the relevant columns: claim_id, ac_name, plan_name, pending tally.
Then I would transform this to get plan name and plan tally on different rows, with a label tying them together.
Then it should be easy to pivot.
I would tie these together with common table expressions.
Here's the query:
with X as (
select cc.*, cp.plan_name, pa.pending_tally,
rank() over (partition by cc.claim_id order by plan_name) as r
from company_claims cc
join claim_plans cp on cp.claim_id = cc.claim_id
join pending_amounts pa on pa.claim_id = cp.claim_id
and pa.plan_id = cp.plan_id
), P as (
select
X.claim_id,
x.ac_name,
x.plan_name as value,
'plan_name_' + cast(r as varchar(max)) as label
from x
union all
select
X.claim_id,
x.ac_name,
cast(x.pending_tally as varchar(max)) as value,
'pending_tally' + cast(r as varchar(max)) as label
from x
)
select claim_id, ac_name, [plan_name_1], [pending_tally1],[plan_name_2], [pending_tally2],[plan_name_3], [pending_tally3]
from (select * from P) p
pivot (
max(value)
for label in ([plan_name_1], [pending_tally1],[plan_name_2], [pending_tally2],[plan_name_3], [pending_tally3])
) as pvt
order by pvt.claim_id, ac_name
Here's a fiddle showing it in action: http://sqlfiddle.com/#!3/68f62/10
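For comparison, the same one-row-per-claim shape can be produced without PIVOT by numbering each claim's plans and using conditional aggregation. This is only a sketch under assumptions: it runs on SQLite (3.25 or later) through Python's sqlite3 module rather than on SQL Server 2005, it uses ROW_NUMBER instead of rank, and it orders plans by plan_id rather than plan_name, which is a choice made here so that the mixed-case names in the sample data do not affect the order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_claims (claim_id INTEGER, ac_name TEXT)")
conn.execute("CREATE TABLE claim_plans (claim_id INTEGER, plan_id INTEGER, plan_name TEXT)")
conn.execute("CREATE TABLE pending_amounts (claim_id INTEGER, plan_id INTEGER, pending_tally INTEGER)")
conn.executemany("INSERT INTO company_claims VALUES (?, ?)", [
    (1234, "abc cooks"), (2341, "zzz bakers"), (3412, "azb pasta's"),
])
conn.executemany("INSERT INTO claim_plans VALUES (?, ?, ?)", [
    (1234, 101, "delux_prime"), (1234, 102, "Prime_express"),
    (1234, 103, "standard_prime"), (2341, 101, "delux_prime"),
    (2341, 103, "standard_prime"), (3412, 102, "Prime_express"),
])
conn.executemany("INSERT INTO pending_amounts VALUES (?, ?, ?)", [
    (1234, 101, 22), (1234, 102, 23), (1234, 103, 2),
    (2341, 101, 22), (2341, 103, 2), (3412, 102, 23),
])

query = """
WITH x AS (
    -- r is the position of each plan within its claim (1, 2 or 3)
    SELECT cc.claim_id, cc.ac_name, cp.plan_name, pa.pending_tally,
           ROW_NUMBER() OVER (PARTITION BY cc.claim_id
                              ORDER BY cp.plan_id) AS r
    FROM company_claims cc
    JOIN claim_plans cp ON cp.claim_id = cc.claim_id
    JOIN pending_amounts pa ON pa.claim_id = cp.claim_id
                           AND pa.plan_id = cp.plan_id
)
SELECT claim_id, ac_name,
       MAX(CASE WHEN r = 1 THEN plan_name END)     AS plan_name_1,
       MAX(CASE WHEN r = 1 THEN pending_tally END) AS pending_tally1,
       MAX(CASE WHEN r = 2 THEN plan_name END)     AS plan_name_2,
       MAX(CASE WHEN r = 2 THEN pending_tally END) AS pending_tally2,
       MAX(CASE WHEN r = 3 THEN plan_name END)     AS plan_name_3,
       MAX(CASE WHEN r = 3 THEN pending_tally END) AS pending_tally3
FROM x
GROUP BY claim_id, ac_name
ORDER BY claim_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Unlike the PIVOT version, the tallies stay numeric here because the names and amounts are never forced into a single value column.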