Trouble pivoting data in DB2 - sql

Before this one is marked as a duplicate, please know that I have done my research on pivoting in DB2 (even though DB2 doesn't have PIVOT) from these links:
Pivoting in DB2 on SO and IBM Developers, but I just can't make sense of how to do it with my data and need some help. I tried to adapt my string using examples from both of those links and could not get it to work. I'm not asking for anyone to write the full code for me, just a pointer in the right direction on how to change my string to retrieve the desired result. Thank you in advance.
Current String:
SELECT
cfna1 AS "Customer Name", cfrisk AS "Risk Rating", cfrirc AS "Rated By", date(digits(decimal(cfrid7 + 0.090000, 7, 0))) AS "Risk Rated Date",cfuc3n3 AS "Credit Score", date(digits(decimal(cf3ud7 + 0.090000, 7, 0))) AS "CR Date"
FROM cncttp08.jhadat842.cfmast cfmast
WHERE cfcif# IN ('T000714', 'T000713', 'T000716', 'T000715')
ORDER BY
CASE cfcif#
WHEN 'T000714' THEN 1
WHEN 'T000713' THEN 2
WHEN 'T000716' THEN 3
WHEN 'T000715' THEN 4
END
Result as expected from String:
Customer Name | Risk Rating | Rated By | Risk Rated Date | Credit Score | CR Date
Elmer Fudd | 8 | MLA | 2018-02-08 | 777 | 2018-02-08
Result I would like to achieve:
Elmer Fudd
Risk Rating 8
Rated By MLA
Risk Rated Date 2018-02-08
Credit Score 777
CR Date 2018-02-08

Use the unpivot method suggested in the IBM Developers link, and use CAST to convert every column to VARCHAR so that all values can share a single output column.
Example:
select st1.id1, unpivot1.col1, unpivot1.val1
from (
select id1, char1 , date1, number1
from sometable
) st1,
lateral (values
('char col', cast(st1.char1 as varchar(100))),
('date col', cast(st1.date1 as varchar(100))),
('number col', cast(st1.number1 as varchar(100)))
) as unpivot1 (col1, val1)
order by st1.id1

I don't think that output is possible in sql -- do you mean something like this?
id_group | Data_Type | Value
1 | Name | Elmer Fudd
1 | Risk Rating | 8
1 | Rated By | MLA
1 | Risk Rated Date | 2018-02-08
1 | Credit Score | 777
1 | CR Date | 2018-02-08
To do this we need another column that ties all the elements together. I called it "id_group"; it is the column that identifies the group.
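For anyone without a DB2 instance to hand, the long id_group / Data_Type / Value shape can be tried out in SQLite from Python. This is only a sketch under assumptions: SQLite has no LATERAL, so UNION ALL is swapped in for the lateral VALUES unpivot from the other answer; the table and column names (cfmast, cif, name, risk, ...) are simplified stand-ins rather than the real schema; and the pos column exists only to keep the rows in a fixed order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for the real customer table.
conn.execute(
    "CREATE TABLE cfmast (cif TEXT, name TEXT, risk INTEGER, rated_by TEXT,"
    " risk_date TEXT, score INTEGER, cr_date TEXT)"
)
conn.execute(
    "INSERT INTO cfmast VALUES"
    " ('T000714', 'Elmer Fudd', 8, 'MLA', '2018-02-08', 777, '2018-02-08')"
)

# One SELECT per source column. Numeric values are cast to TEXT so every
# branch of the UNION ALL yields the same type in the value column.
unpivot_sql = """
SELECT cif AS id_group, 1 AS pos, 'Name' AS data_type, name AS value FROM cfmast
UNION ALL SELECT cif, 2, 'Risk Rating', CAST(risk AS TEXT) FROM cfmast
UNION ALL SELECT cif, 3, 'Rated By', rated_by FROM cfmast
UNION ALL SELECT cif, 4, 'Risk Rated Date', risk_date FROM cfmast
UNION ALL SELECT cif, 5, 'Credit Score', CAST(score AS TEXT) FROM cfmast
UNION ALL SELECT cif, 6, 'CR Date', cr_date FROM cfmast
ORDER BY id_group, pos
"""
rows = conn.execute(unpivot_sql).fetchall()
for id_group, _pos, data_type, value in rows:
    print(id_group, data_type, value)
```

On DB2 itself the lateral VALUES form should give the same rows without repeating the SELECT once per column.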

Related

SQL: SUM of MAX values WHERE date1 <= date2 returns "wrong" results

Hi stackoverflow users
I'm having a bit of a problem trying to combine SUM, MAX and WHERE in one query and after an intense Google search (my search engine skills usually don't fail me) you are my last hope to understand and fix the following issue.
My goal is to count people in a certain period of time and because a person can visit more than once in said period, I'm using MAX. Due to the fact that I'm defining people as male (m) or female (f) using a string (for statistic purposes), CHAR_LENGTH returns the numbers I'm in need of.
SELECT SUM(max_pers) AS "People"
FROM (
SELECT "guests"."id", MAX(CHAR_LENGTH("guests"."gender")) AS "max_pers"
FROM "guests"
GROUP BY "guests"."id")
So far, so good. But now, as stated before, I'd like to only count the guests which visited in a certain time interval (for statistic purposes as well).
SELECT "statistic"."id", SUM(max_pers) AS "People"
FROM (
SELECT "guests"."id", MAX(CHAR_LENGTH("guests"."gender")) AS "max_pers"
FROM "guests"
GROUP BY "guests"."id"),
"statistic", "guests"
WHERE ( "guests"."arrival" <= "statistic"."from" AND "guests"."departure" >= "statistic"."to")
GROUP BY "statistic"."id"
This query returns the following, x = desired result:
x * (x+1)
So if the result should be 3, it's 12. If it should be 5, it's 30 etc.
I could probably solve this algebraically, but I'd rather understand what I'm doing wrong and learn from it.
Thanks in advance and I'm certainly going to answer all further questions.
PS: I'm using LibreOffice Base.
EDIT: An example
guests table:
ID | arrival | departure | gender |
10 | 1.1.14 | 10.1.14 | mf |
10 | 15.1.14 | 17.1.14 | m |
11 | 5.1.14 | 6.1.14 | m |
12 | 10.2.14 | 24.2.14 | f |
13 | 27.2.14 | 28.2.14 | mmmmmf |
statistic table:
ID | from | to | name |
1 | 1.1.14 | 31.1.14 |January | expected result: 3
2 | 1.2.14 | 28.2.14 |February| expected result: 7
MAX(...) is the wrong function: You want COUNT(DISTINCT ...).
Add proper join syntax, simplify (and remove unnecessary quotes) and this should work:
SELECT s.id, COUNT(DISTINCT g.id) AS People
FROM statistic s
LEFT JOIN guests g ON g.arrival <= s."from" AND g.departure >= s."to"
GROUP BY s.id
Note: Using LEFT join means you'll get a result of zero for statistics ids that have no guests. If you would rather no row at all, remove the LEFT keyword.
You have a very strange data structure. In any case, I think you want:
SELECT g.stat_id, SUM(g.numpersons) AS People
FROM (select s.id as stat_id, g.id, max(char_length(g.gender)) as numpersons
from guests g join
statistic s
on g.arrival <= s."from" AND g.departure >= s."to"
group by s.id, g.id
) g
GROUP BY g.stat_id;
Thanks for all your inputs. I wasn't familiar with JOIN but it was necessary to solve my problem.
Since my database is designed in German, I made quite a big mistake while translating it, and I'm sorry if this caused confusion.
Selecting guests.id and later grouping by guests.id wouldn't make any sense, since the id is unique. What I actually wanted to do is select and group by guests.adr_id, which links a visiting guest to an address database.
The correct solution to my problem is the following code:
SELECT statname, SUM(numpers) FROM (
SELECT statistic.name AS statname, guests.adr_id, MAX( CHAR_LENGTH( guests.gender ) ) AS numpers
FROM guests
JOIN statistic ON (guests.arrival <= statistic."to" AND guests.departure >= statistic."from" )
GROUP BY guests.adr_id, statistic.name )
GROUP BY statname
I also noted that my database structure is a mess but I created it learning by doing and haven't found any time to rewrite it yet. Next time posting, I'll try better.
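For later readers, here is a rough way to see where the inflated totals come from, sketched in SQLite from Python. Assumptions: the sample rows from the question are loaded with ISO dates, the ID column is treated as adr_id, LENGTH stands in for CHAR_LENGTH, and the overlap test (arrival <= to AND departure >= from) is the one from the final query above. With comma-separated FROM items and no condition linking the subquery to guests, every per-guest maximum is paired with every matching guests row, so the sum is multiplied by the number of matching visits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE guests (adr_id INTEGER, arrival TEXT, departure TEXT, gender TEXT);
INSERT INTO guests VALUES
  (10, '2014-01-01', '2014-01-10', 'mf'),
  (10, '2014-01-15', '2014-01-17', 'm'),
  (11, '2014-01-05', '2014-01-06', 'm'),
  (12, '2014-02-10', '2014-02-24', 'f'),
  (13, '2014-02-27', '2014-02-28', 'mmmmmf');
CREATE TABLE statistic (id INTEGER, "from" TEXT, "to" TEXT, name TEXT);
INSERT INTO statistic VALUES
  (1, '2014-01-01', '2014-01-31', 'January'),
  (2, '2014-02-01', '2014-02-28', 'February');
""")

# Shape of the original attempt: the subquery m is not tied to s or g,
# so each of its rows is repeated once per matching guests row.
inflated = conn.execute("""
SELECT s.id, SUM(m.max_pers)
FROM (SELECT adr_id, MAX(LENGTH(gender)) AS max_pers
      FROM guests GROUP BY adr_id) AS m,
     statistic AS s,
     guests AS g
WHERE g.arrival <= s."to" AND g.departure >= s."from"
GROUP BY s.id
ORDER BY s.id
""").fetchall()

# Join first, take the maximum per (period, guest), then sum per period.
fixed = conn.execute("""
SELECT stat_id, SUM(numpers)
FROM (SELECT s.id AS stat_id, g.adr_id, MAX(LENGTH(g.gender)) AS numpers
      FROM guests AS g
      JOIN statistic AS s
        ON g.arrival <= s."to" AND g.departure >= s."from"
      GROUP BY s.id, g.adr_id)
GROUP BY stat_id
ORDER BY stat_id
""").fetchall()

print("inflated:", inflated)
print("fixed:   ", fixed)
```

On this sample the second query gives 3 for January and 7 for February, matching the expected results in the question.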

SQL query dynamic row generation with composite key

My question is made of 3 parts.
First part:
Is there a way to generate rows based on a value?
E.g:
I want to give each family a number of vouchers based on their family_members_count.
Each voucher should have a unique id:
Base table:
id name family_members_count
1 fadi 2
2 sami 3
3 ali 1
Result:
family_id name voucher_id
1 fadi 121
1 fadi 122
2 sami 123
2 sami 124
2 sami 125
3 ali 126
Second part:
Can I control the voucher_id composite key? I want the voucher_id to be like this
(location)(cycle)(sequence 5 digits)
If north = 08 and we are in the second cycle it should be:
080200001
080200002
... and so on.
Third part:
I need the solution in both MS Access 2010 SQL and PostgreSQL 9.1 SQL.
Question 1
Use generate_series(). (In the coming version 9.3 look for the key word LATERAL.)
SELECT id AS family_id
,name
,120 + generate_series(1, family_members_count) AS voucher_id
FROM fam;
-> sqlfiddle demo
Question 2
SELECT id AS family_id
,name
,location
|| to_char(cycle, 'FM00')
|| to_char(generate_series(1, family_members_count), 'FM00000')
AS voucher_id
FROM fam2;
-> sqlfiddle demo
Note the use of to_char() to format numbers as text - and in particular the use of the FM pattern modifier to avoid leading white space.
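As a portable illustration of the same idea (neither PostgreSQL nor MS Access), here is a sketch in SQLite driven from Python. Assumptions: a recursive CTE stands in for generate_series(), printf() stands in for to_char(), and the location and cycle columns are hypothetical additions to the base table. Because a per-family series restarts at 1 for every family, the sketch adds the member counts of all earlier families so that the sequence keeps running across families, as in the question's desired result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fam (id INTEGER, name TEXT, family_members_count INTEGER,
                  location TEXT, cycle INTEGER);
INSERT INTO fam VALUES (1, 'fadi', 2, '08', 2),
                       (2, 'sami', 3, '08', 2),
                       (3, 'ali',  1, '08', 2);
""")

vouchers = conn.execute("""
WITH RECURSIVE n(i) AS (
    -- numbers 1 .. largest family size
    SELECT 1
    UNION ALL
    SELECT i + 1 FROM n
    WHERE i < (SELECT MAX(family_members_count) FROM fam)
)
SELECT f.id AS family_id,
       f.name,
       f.location
         || printf('%02d', f.cycle)
         || printf('%05d',
                   -- offset by the vouchers of all earlier families
                   n.i + (SELECT COALESCE(SUM(p.family_members_count), 0)
                          FROM fam AS p WHERE p.id < f.id)) AS voucher_id
FROM fam AS f
JOIN n ON n.i <= f.family_members_count
ORDER BY f.id, n.i
""").fetchall()

for row in vouchers:
    print(row)
```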
Question 3
Sorry, I got MS Access out of my system 10 years ago and never looked back. Somebody else might fill in. I doubt it will be as simple.

Finding the next occurrence of a value in a table

Sorry in advance if this has already been covered.
I am working on a database which isn't particularly well structured, but it is owned by a third party and cannot be changed.
I need some assistance with T-SQL to find the next occurrence of a value within the table and return records based on the result. Let me first explain the data. I have simplified this to make it easier to understand.
Polref | Effective Date | Transaction Type | Suffix | Value
ABCD1 | 01/06/2010 | New Bus | 1 | 175.00
ABCD1 | 01/06/2011 | Ren | 2 | 200.00
ABCD1 | 19/08/2011 | Adjust | 3 | 50.00
ABCD1 | 23/04/2012 | Adjust | 4 | 50.00
ABCD1 | 01/06/2012 | Ren | 5 | 275.00
So if I ran my query for 2011, the code would need to return, in this example, the rows with suffix 2, 3 and 4. What I have been trying to do is find the first suffix of a New Bus or Ren for the specified year, then find the next suffix of a New Bus or Ren for the same polref, and then use those two suffix values to limit my recordset. It ain't working!!
I can't use MAX() as transactions for 2013 have already been added to the system, so I would get more records than I actually need.
The result I should be expecting for this example data would be:
ABCD1 300.00
Any help would be greatly appreciated.
To answer another question: if I select 2011 as my year to run the report, there should only be one New Bus or Ren transaction for 2011. If it's a New Bus transaction, the next main transaction will be a Ren; if it's a Ren, the next main transaction will also be a Ren. Again, in my example above, if I run for 2011, it should find the Ren from 01/06/2011, so I want to return that Ren and the two Adjust records.
Sorry, I've not used this forum before so apologies if I was a little vague.
The table I am using has many polrefs so I need this code to calculate totals for all polrefs that fall within the date range. Some polrefs may only have one row, a New Bus, some will have many rows depending on how many adjustments have been made throughout the year of the policy
Partial answer:
This query:
create table #t (PolRef char(5) not null, EffectiveDate date not null,TransactionType varchar(10) not null,Suffix int not null,Value decimal(10,2) not null)
insert into #t (Polref,EffectiveDate,TransactionType,Suffix,Value) values
('ABCD1','20100601','New Bus',1,175.00),
('ABCD1','20110601','Ren',2,200.00),
('ABCD1','20110819','Adjust',3,50.00),
('ABCD1','20120423','Adjust',4,50.00),
('ABCD1','20120601','Ren',5,275.00)
;With StartTransactions as (
select PolRef,Suffix,ROW_NUMBER() OVER (PARTITION BY PolRef ORDER BY Suffix) rn
from #t where TransactionType in ('New Bus','Ren')
), Periods as (
select st1.PolRef,st1.Suffix as StartSuffix,st2.Suffix as EndSuffix
from
StartTransactions st1
left join
StartTransactions st2
on
st1.PolRef = st2.PolRef and
st1.rn = st2.rn - 1
)
select
p.PolRef,t2.EffectiveDate,SUM(t.Value) as Total
from
Periods p
inner join
#t t
on
p.PolRef = t.PolRef and
p.StartSuffix <= t.Suffix and
(p.EndSuffix > t.Suffix or
p.EndSuffix is null)
inner join
#t t2
on
p.PolRef = t2.PolRef and
t2.Suffix = p.StartSuffix
group by
p.PolRef,t2.EffectiveDate
Groups each set of transactions based on each successive Ren or New Bus transaction:
PolRef EffectiveDate Total
------ ------------- ---------------------------------------
ABCD1 2010-06-01 175.00
ABCD1 2011-06-01 300.00
ABCD1 2012-06-01 275.00
From that, it should be trivial to e.g. select out only the ones you're interested in from a particular year. But your question is still vague on some specifics, so I'm not taking it any further at this point.
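One possible way to narrow this down to a single year is sketched below in SQLite from Python rather than T-SQL, so treat it as an illustration only. Assumptions: the table name trans and its column names are made up for the sketch, dates are stored as ISO text, and a correlated MIN() subquery is swapped in for the ROW_NUMBER() pairing above to find the next New Bus or Ren suffix. The year filter is applied to the starting transaction only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trans (polref TEXT, effective_date TEXT, txn_type TEXT,
                    suffix INTEGER, value REAL);
INSERT INTO trans VALUES
  ('ABCD1', '2010-06-01', 'New Bus', 1, 175.00),
  ('ABCD1', '2011-06-01', 'Ren',     2, 200.00),
  ('ABCD1', '2011-08-19', 'Adjust',  3,  50.00),
  ('ABCD1', '2012-04-23', 'Adjust',  4,  50.00),
  ('ABCD1', '2012-06-01', 'Ren',     5, 275.00);
""")

# s = the New Bus / Ren row that starts a period in the requested year.
# t = every row from that start up to, but not including, the next
#     New Bus / Ren row of the same polref (or to the end if none exists).
totals_2011 = conn.execute("""
SELECT s.polref, SUM(t.value) AS total
FROM trans AS s
JOIN trans AS t
  ON t.polref = s.polref
 AND t.suffix >= s.suffix
 AND t.suffix < COALESCE(
       (SELECT MIN(nx.suffix) FROM trans AS nx
        WHERE nx.polref = s.polref
          AND nx.txn_type IN ('New Bus', 'Ren')
          AND nx.suffix > s.suffix),
       t.suffix + 1)
WHERE s.txn_type IN ('New Bus', 'Ren')
  AND strftime('%Y', s.effective_date) = ?
GROUP BY s.polref
""", ("2011",)).fetchall()

print(totals_2011)
```

For the sample rows this sums suffixes 2, 3 and 4, which is the ABCD1 300.00 line asked for in the question.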

Optimal solution for interview question

Recently in a job interview, I was given the following problem.
Say I have the following table
widget_Name | widget_Costs | In_Stock
---------------------------------------------------------
a | 15.00 | 1
b | 30.00 | 1
c | 20.00 | 1
d | 25.00 | 1
where widget_name holds the name of the widget, widget_costs is the price of a widget, and in_stock is a constant of 1.
Now for my business insurance I have a certain deductible. I am looking for a SQL statement that will return every widget, with its price, that remains once the deductible has been met. So if my deductible is $50.00 the above would just return
widget_Name | widget_Costs | In_Stock
---------------------------------------------------------
a | 15.00 | 1
d | 25.00 | 1
since widgets b and c were used to meet the deductible.
The closest I could get is the following
SELECT
*
FROM (
SELECT
widget_name,
widget_price
FROM interview.tbl_widgets
minus
SELECT widget_name,widget_price
FROM (
SELECT
widget_name,
widget_price,
50 - sum(widget_price) over (ORDER BY widget_price ROWS between unbounded preceding and current row) as running_total
FROM interview.tbl_widgets
)
where running_total >= 0
)
;
Which gives me
widget_Name | widget_Costs | In_Stock
---------------------------------------------------------
c | 20.00 | 1
d | 25.00 | 1
because it uses a and b to meet the majority of the deductible
I was hoping someone might be able to show me the correct answer
EDIT: I understood the interview question to be asking this: given a table of widgets and their prices and given a dollar amount, subtract as many of the widgets as you can up to the dollar amount and return the widgets and prices that remain.
I'll put an answer up, just in case it's easier than it looks, but if the idea is just to return any widget that costs more than the deductible then you'd do something like this:
Select
Widget_Name, Widget_Cost, In_Stock
From
Widgets
Where
Widget_Cost > 50 -- SubSelect for variable deductibles?
For your sample data my query returns no rows.
I believe I understand your question, but I'm not 100%. Here is what I'm assuming you mean:
Your deductible is, say, $50. To meet the deductible you have to "use" two items. (Is this always two? How high can it go? Can it be just one? What if they don't total exactly $50? There is a lot of missing information.) You then want to return the widgets that aren't being used towards the deductible. I have the following.
CREATE TABLE #test
(
widget_name char(1),
widget_cost money
)
INSERT INTO #test (widget_name, widget_cost)
SELECT 'a', 15.00 UNION ALL
SELECT 'b', 30.00 UNION ALL
SELECT 'c', 20.00 UNION ALL
SELECT 'd', 25.00
SELECT * FROM #test t1
WHERE t1.widget_name NOT IN (
SELECT t1.widget_name FROM #test t1
CROSS JOIN #test t2
WHERE t1.widget_cost + t2.widget_cost = 50 AND t1.widget_name != t2.widget_name)
Which returns
widget_name widget_cost
----------- ---------------------
a 15.00
d 25.00
This looks like a bin packing problem. These are really hard to solve, especially with SQL.
If you search SO for Bin Packing + SQL, you'll find the question how to find Sum(field) in condition ie “select * from table where sum(field) < 150”, which is basically the same problem except that you want to add a NOT IN to it.
I couldn't get the accepted answer by brianegge to work, but what he wrote about it in general was interesting:
..the problem you
describe of wanting the selection of
users which would most closely fit
into a given size, is a bin packing
problem. This is an NP-Hard problem,
and won't be easily solved with ANSI
SQL. However, the above seems to
return the right result, but in fact
it simply starts with the smallest
item, and continues to add items until
the bin is full.
A general, more effective bin packing
algorithm would is to start with the
largest item and continue to add
smaller ones as they fit. This
algorithm would select users 5 and 4.
So with this advice you could write a cursor to loop over the table to do just this (it just wouldn't be pretty).
Aaron Alton gives a nice link to a series of articles that attempt to solve the bin packing problem with SQL, but they basically conclude that it's probably best to use a cursor to do it.
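To make the largest-first idea from the quote concrete, here is a small sketch in Python instead of a SQL cursor. The function name split_by_deductible and the tuple layout are made up for illustration, and this is only the greedy heuristic described above: it happens to land on b and c for the sample data, but it is not guaranteed to find the best combination in general.

```python
from decimal import Decimal


def split_by_deductible(widgets, deductible):
    """Greedy largest-first split into (used, remaining) lists of (name, cost)."""
    used, remaining = [], []
    left = deductible
    # Walk the widgets from most to least expensive and take each one
    # that still fits into what is left of the deductible.
    for name, cost in sorted(widgets, key=lambda w: w[1], reverse=True):
        if cost <= left:
            used.append((name, cost))
            left -= cost
        else:
            remaining.append((name, cost))
    return used, sorted(remaining)


widgets = [
    ("a", Decimal("15.00")),
    ("b", Decimal("30.00")),
    ("c", Decimal("20.00")),
    ("d", Decimal("25.00")),
]
used, remaining = split_by_deductible(widgets, Decimal("50.00"))
print("used:", used)
print("remaining:", remaining)
```

With a $50.00 deductible the loop takes b (30.00), skips d, takes c (20.00) and skips a, which leaves a and d as in the question.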

Need a Complex SQL Query

I need to make a rather complex query, and I need help bad. Below is an example I made.
Basically, I need a query that will return one row for each case_id where the type is support, status start, and date meaning the very first one created (so that in the example below, only the 2/1/2009 John's case gets returned, not the 3/1/2009). The search needs to be dynamic to the point of being able to return all similar rows with different case_id's etc from a table with thousands of rows.
There's more after that but I don't know all the details yet, and I think I can figure it out if you guys (and gals) can help me out here. :)
ID | Case_ID | Name | Date | Status | Type
48 | 450 | John | 6/1/2009 | Fixed | Support
47 | 450 | John | 4/1/2009 | Moved | Support
46 | 451 | Sarah | 3/1/2009 | |
45 | 432 | John | 3/1/2009 | Fixed | Critical
44 | 450 | John | 3/1/2009 | Start | Support
42 | 450 | John | 2/1/2009 | Start | Support
41 | 440 | Ben | 2/1/2009 | |
40 | 432 | John | 1/1/2009 | Start | Critical
...
Thanks a bunch!
Edit:
To answer some people's questions, I'm using SQL Server 2005. And the date is just plain date, not string.
Ok so now I got further in the problem. I ended up with Bliek's solution which worked like a charm. But now I ran into the problem that sometimes the status never starts, as it's solved immediately. I need to include this in as well. But only for a certain time period.
I imagine I'm going to have to check for the case table referenced by FK Case_ID here. So I'd need a way to check for each Case_ID created in the CaseTable within the past month, and then run a search for these in the same table and same manner as posted above, returning only the first result as before. How can I use the other table like that?
As usual I'll try to find the answer myself while waiting, thanks again!
Edit 2:
Seems this is the answer. I don't have access to the full DB yet so I can't fully test it, but it seems to be working with the dummy tables I created, to continue from Bliek's code's WHERE clause:
WHERE RowNumber = 1 AND Case_ID IN (SELECT Case_ID FROM CaseTable
WHERE (Date BETWEEN '2007/11/1' AND '2007/11/30'))
The date's screwed again but you get the idea I'm sure. Thanks for the help everyone! I'll get back if there're more problems, but I think with this info I can improvise my way through most of the SQL problems I currently have to deal with. :)
Maybe something like:
select Case_ID, Name, MIN(date), Status, Type
from table
where type = 'Support'
and status = 'Start'
group by Case_ID, Name, Status, Type
EDIT: You haven't provided a lot of details about what you really want, so I'd suggest that you read all the answers and choose one that suits your problem best. So far I'd say that Tomalak's answer is closest to what you're looking for...
SELECT
c.ID,
c.Case_ID,
c.Name,
c.Date,
c.Status,
c.Type
FROM
CaseTable c
WHERE
c.Type = 'Support'
AND c.Status = 'Start'
AND c.Date = (
SELECT MIN(Date)
FROM CaseTable
WHERE Case_ID = c.Case_ID AND Type = c.Type AND Status = c.Status)
/* GROUP BY only needed when for a given Case_ID several rows
exist that fulfill the WHERE clause */
GROUP BY
c.ID,
c.Case_ID,
c.Name,
c.Date,
c.Status,
c.Type
This query benefits greatly from indexes on the Case_ID, Date, Status and Type columns.
Added value comes from the fact that the filter on Type and Status only needs to be set in one place.
As an alternative to the GROUP BY clause, you can do SELECT DISTINCT, which would increase readability (this may or may not affect overall performance, I suggest you measure both variants against each other). If you are sure that for no Case_ID in your table two rows exist that have the same Date, you won't need GROUP BY or SELECT DISTINCT at all.
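To check the correlated MIN(Date) approach against the sample rows from the question, here is a quick sketch in SQLite from Python. Assumptions: dates are rewritten as ISO text (reading 2/1/2009 as 2009-02-01), the blank Status and Type cells are stored as NULL, and the GROUP BY is dropped because no Case_ID in the sample has two qualifying rows on the same date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CaseTable (ID INTEGER, Case_ID INTEGER, Name TEXT,
                        Date TEXT, Status TEXT, Type TEXT);
INSERT INTO CaseTable VALUES
  (48, 450, 'John',  '2009-06-01', 'Fixed', 'Support'),
  (47, 450, 'John',  '2009-04-01', 'Moved', 'Support'),
  (46, 451, 'Sarah', '2009-03-01', NULL,    NULL),
  (45, 432, 'John',  '2009-03-01', 'Fixed', 'Critical'),
  (44, 450, 'John',  '2009-03-01', 'Start', 'Support'),
  (42, 450, 'John',  '2009-02-01', 'Start', 'Support'),
  (41, 440, 'Ben',   '2009-02-01', NULL,    NULL),
  (40, 432, 'John',  '2009-01-01', 'Start', 'Critical');
""")

# Keep a Support/Start row only if its date is the earliest one among
# the rows with the same Case_ID, Type and Status.
first_starts = conn.execute("""
SELECT c.ID, c.Case_ID, c.Name, c.Date, c.Status, c.Type
FROM CaseTable AS c
WHERE c.Type = 'Support'
  AND c.Status = 'Start'
  AND c.Date = (SELECT MIN(Date)
                FROM CaseTable
                WHERE Case_ID = c.Case_ID
                  AND Type = c.Type
                  AND Status = c.Status)
ORDER BY c.Case_ID
""").fetchall()

print(first_starts)
```

Only the row with ID 42 comes back, which is the 2/1/2009 case the question asks for.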
In SQL Server 2005 and beyond I would use Common Table Expressions (CTE). This offers lots of possibilities like so:
With ResultTable (RowNumber
,ID
,Case_ID
,Name
,Date
,Status
,Type)
AS
(
SELECT Row_Number() OVER (PARTITION BY Case_ID
ORDER BY Date ASC)
,ID
,Case_ID
,Name
,Date
,Status
,Type
FROM CaseTable
WHERE Type = 'Support'
AND Status = 'Start'
)
SELECT ID
,Case_ID
,Name
,Date
,Status
,Type
FROM ResultTable
WHERE RowNumber = 1
Don't apologize for your date formatting, it makes more sense that way.
SELECT ID, Case_ID, Name, MIN(Date), Status, Type
FROM caseTable
WHERE Type = 'Support'
AND status = 'Start'
GROUP BY ID, Case_ID, Name, Status, Type