How to pivot wider in SQL or use a more "dynamic" form of the LEAD function? - sql

I have a table that looks as follows:
Policy Number   Benefit Code   Transaction Code
1               A              2
1               B              1
2               A              3
3               A              2
1               C              2
For analysis purposes, it would be much more convenient to have the table in the following form:
PN   BC 1   TC 1   BC 2   TC 2   BC 3   TC 3
1    A      2      B      1      C      2
2    A      3      NULL   NULL   NULL   NULL
3    A      2      NULL   NULL   NULL   NULL
I believe this can be done, for example, in R using the tidyverse package, where the concept is basically pivoting the table from long-form to wide-form. Now, I know that I could possibly use the LEAD function in SQL, but the problem/issue is that I do not know how many benefit codes and transaction codes each policy has (i.e. they are not fixed).
Thus, my query is:
How can I "pivot wider" my table to achieve something like the above?
Other than "pivoting wider", is there a more dynamic form of the LEAD function in SQL, where it takes all subsequent rows of a group (in my case, each policy number) and puts them in new columns?
Any intuitive explanations or suggestions will be greatly appreciated :)
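One possible approach, sketched here rather than taken from the thread: number the rows within each policy with ROW_NUMBER(), then spread them into columns with conditional aggregation. Because SQL needs a fixed column list, the sketch first asks the data for the largest number of rows any policy has and builds the column list from that. It runs against an in-memory SQLite database through Python's sqlite3 module (window functions need SQLite 3.25 or later). The table name `policy`, the column names, and the choice to order by benefit code are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policy (policy_number INTEGER, benefit_code TEXT, transaction_code INTEGER);
    INSERT INTO policy VALUES (1, 'A', 2), (1, 'B', 1), (2, 'A', 3), (3, 'A', 2), (1, 'C', 2);
""")

# Largest number of rows any single policy has; this decides how many column pairs to emit.
max_rows = conn.execute(
    "SELECT MAX(n) FROM (SELECT COUNT(*) AS n FROM policy GROUP BY policy_number)"
).fetchone()[0]

# One (benefit code, transaction code) column pair per row position.
# i comes from range(), so formatting it into the SQL text is safe.
pair_columns = ",\n       ".join(
    f"MAX(CASE WHEN rn = {i} THEN benefit_code END) AS bc_{i}, "
    f"MAX(CASE WHEN rn = {i} THEN transaction_code END) AS tc_{i}"
    for i in range(1, max_rows + 1)
)

pivot_sql = f"""
WITH numbered AS (
    SELECT policy_number, benefit_code, transaction_code,
           ROW_NUMBER() OVER (PARTITION BY policy_number ORDER BY benefit_code) AS rn
    FROM policy
)
SELECT policy_number,
       {pair_columns}
FROM numbered
GROUP BY policy_number
ORDER BY policy_number
"""

wide_rows = conn.execute(pivot_sql).fetchall()
for row in wide_rows:
    print(row)
```

The same pattern should carry over to other databases that support window functions; only the step that builds the column list has to live outside plain SQL.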

Related

How to rollup specific strings in a query

I would like to combine rows with duplicates in a specific column such that specific items are listed and others are excluded.
I have attempted to use string_agg, GROUP BY and self joins. I feel like I may simply need a better self join, but I am not sure.
one two three four
1 1 a NULL
2 4 b e
3 7 c x
3 7 c z
I would like it to look something like this (with the elements that were the same remaining unsegregated)
one two three four
1 1 a NULL
2 4 b e
3 7 c x,z
If you are using MySQL:
SELECT one, two, three, GROUP_CONCAT(four) AS four
FROM your_table
GROUP BY one, two, three
Otherwise, this is a bad thing to do in an RDBMS, because it is not a relational operation.
You should do this on the client side of your project.
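The grouping itself can be tried out quickly. The sketch below uses SQLite's group_concat (close in spirit to MySQL's GROUP_CONCAT) through Python's sqlite3 module. The table name `items` is an assumption, since the question does not name the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (one INTEGER, two INTEGER, three TEXT, four TEXT);
    INSERT INTO items VALUES (1, 1, 'a', NULL), (2, 4, 'b', 'e'), (3, 7, 'c', 'x'), (3, 7, 'c', 'z');
""")

# Rows that agree on (one, two, three) collapse into one row;
# their values of "four" are joined with commas, and NULLs are skipped.
rolled_up = conn.execute("""
    SELECT one, two, three, group_concat(four) AS four
    FROM items
    GROUP BY one, two, three
    ORDER BY one
""").fetchall()

for row in rolled_up:
    print(row)
```

Note that the order of the concatenated values is not guaranteed unless the query asks for one explicitly.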

Update query just showing zero values when there exists non-zero values. (ACCESS)

I have been struggling with this for hours. I am trying to update all values that have the same 'SHORT#'. If the 'SHORT#' is in 017_PolWpart2 I want this to be the value that updates the corresponding 'SHORT#' in 017_WithdrawalsYTD_changelater. This update query is just displaying zeroes, but these values are in fact non-zero.
So say 017_WithdrawalsYTD_changelater looks like this:
SHORT# WithdrawalsYTD
1 0
2 0
3 0
4 0
5 0
and 017_PolWpart2 looks like this:
SHORT# Sum_MTD_AGG
3 50
5 12
I want this:
SHORT# WithdrawalsYTD
1 0
2 0
3 50
4 0
5 12
But I get this:
SHORT# WithdrawalsYTD
1 0
2 0
3 0
4 0
5 0
I have attached the SQL for the Query below.
Thanks!
UPDATE 017_WithdrawalsYTD_changelater
INNER JOIN 017b_PolWpart2 ON [017_WithdrawalsYTD_changelater].[SHORT#] =
[017b_PolWpart2].[SHORT#]
SET [017_WithdrawalsYTD_changelater].WithdrawalsYTD = [017b_PolWpart2].[Sum_MTD_AGG];
EDIT:
As I must aggregate on the fly, I have tried to do so. Still getting all kinds of errors. Note the table 017a_PolicyWithdrawalMatch is of the form:
SHORT# MTD_AGG WithdrawalPeriod PolDurY
1 3 1 1
1 5 1 0
2 2 1 1
2 22 1 1
So I aggregate:
SHORT# MTD_AGG
1 3
2 24
And put these aggregated values in 017_WithdrawalsYTD_changelater.
I tried to do this like so:
SELECT [017a_PolicyWithdrawalMatch].[SHORT#], Sum([017a_PolicyWithdrawalMatch].MTD_AGG) AS Sum_MTD_AGG
WHERE ((([017a_PolicyWithdrawalMatch].WithdrawalPeriod)=[017a_PolicyWithdrawalMatch].[PolDurY]))
GROUP BY [017a_PolicyWithdrawalMatch].[SHORT#]
UPDATE 017_WithdrawalsYTD_changelater INNER JOIN 017a_PolicyWithdrawalMatch ON [017_WithdrawalsYTD_changelater].[SHORT#] = [017a_PolicyWithdrawalMatch].[SHORT#] SET 017_WithdrawalsYTD_changelater.WithdrawalsYTD =Sum_MTD_AGG;
I am getting no luck... I get told SELECT statement is using a reserved word... :(
Consider heeding @June7's comments and avoid saving aggregate data in a table: it uses storage redundantly, since such data can easily be queried in real time. Plus, such aggregate values immediately become historical figures, since they are saved inside a static table.
In MS Access, update queries must be sourced from updateable objects of which aggregate queries are not, being read-only types. Hence, they cannot be used in UPDATE statements.
However, if you really, really, really need to store aggregate data, consider using domain functions such as DSum inside the UPDATE. The query below assumes SHORT# is a string column (drop the single quotes around the value if it is numeric).
UPDATE [017_WithdrawalsYTD_changelater] c
SET c.WithdrawalsYTD = DSUM("MTD_AGG", "[017a_PolicyWithdrawalMatch]",
"[SHORT#] = '" & c.[SHORT#] & "' AND WithdrawalPeriod = [PolDurY]")
Nonetheless, the aggregate value can be queried and refreshed to current values as needed. Also, notice the use of table aliases to reduce length of long table names:
SELECT m.[SHORT#], SUM(m.MTD_AGG) AS Sum_MTD_AGG
FROM [017a_PolicyWithdrawalMatch] m
WHERE m.WithdrawalPeriod = m.[PolDurY]
GROUP BY m.[SHORT#]
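To see what the aggregate query produces on the sample rows from the question, here is a small sketch run against SQLite through Python's sqlite3 module. This is not Access: SQLite has no DSum, so a correlated subquery stands in for it in the UPDATE, and the COALESCE that turns a missing sum into 0 is my own addition. The three target rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE [017a_PolicyWithdrawalMatch]
        ([SHORT#] INTEGER, MTD_AGG INTEGER, WithdrawalPeriod INTEGER, PolDurY INTEGER);
    INSERT INTO [017a_PolicyWithdrawalMatch]
        VALUES (1, 3, 1, 1), (1, 5, 1, 0), (2, 2, 1, 1), (2, 22, 1, 1);
    CREATE TABLE [017_WithdrawalsYTD_changelater] ([SHORT#] INTEGER, WithdrawalsYTD INTEGER);
    INSERT INTO [017_WithdrawalsYTD_changelater] VALUES (1, 0), (2, 0), (3, 0);
""")

# The aggregate query from the answer, evaluated on demand.
sums = conn.execute("""
    SELECT m.[SHORT#], SUM(m.MTD_AGG) AS Sum_MTD_AGG
    FROM [017a_PolicyWithdrawalMatch] m
    WHERE m.WithdrawalPeriod = m.PolDurY
    GROUP BY m.[SHORT#]
    ORDER BY m.[SHORT#]
""").fetchall()

# Stand-in for the DSum-based UPDATE: a correlated subquery per target row.
conn.execute("""
    UPDATE [017_WithdrawalsYTD_changelater]
    SET WithdrawalsYTD = COALESCE((
        SELECT SUM(m.MTD_AGG)
        FROM [017a_PolicyWithdrawalMatch] m
        WHERE m.[SHORT#] = [017_WithdrawalsYTD_changelater].[SHORT#]
          AND m.WithdrawalPeriod = m.PolDurY
    ), 0)
""")
stored = conn.execute(
    "SELECT [SHORT#], WithdrawalsYTD FROM [017_WithdrawalsYTD_changelater] ORDER BY [SHORT#]"
).fetchall()

print(sums)
print(stored)
```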

T-SQL 2008 INSERT dummy row WHEN condition is met

**Schema & Dataset**
id version payment name
1 1 10 Rich
2 1 0 David
3 1 10 Marc
4 1 10 Jess
5 1 0 Steff
1 2 10 Rich
2 2 0 David
3 2 10 Marc
4 2 10 Jess
5 2 0 Steff
2 3 0 David
3 3 10 Marc
4 3 10 Jess
http://sqlfiddle.com/#!3/1c457/18 - Contains my schema and the dataset I'm working with.
Background
The data above is the final set after a stored proc has done its processing, so everything above is in one table and unfortunately I can't change it.
I need to identify in the dataset where a person has been deleted with a payment total greater than 0 in previous versions and insert a dummy row with a payment of 0. So in the above example, Rich has been deleted in version 3 with a payment total of 10 on previous versions. I need to first identify where this has happened in all instances and insert a dummy row with a 0 payment for that version. Steff has also been deleted on version 3 but she hasn't had a payment over 0 on previous versions so a dummy row is not needed for her.
Tried so far -
So I looked at Pinal Dave's example here, and I can look back to the previous row, which is great, so it's a step in the right direction. However, I'm not sure how to go about achieving the above requirement. I've been toying with the idea of a case statement, but I'm not certain that would be the best way to go about it. I'm really struggling with this one and would appreciate some advice on how to tackle it.
You can do this by generating all possible combinations of names and versions. Then filter out the ones you don't want according to the pay conditions:
select n.name, v.version, coalesce(de.payment, 0) as payment
from (select name, max(payment) as maxpay from dataextract group by name) n
cross join (select distinct version from dataextract) v
left outer join dataextract de
  on de.name = n.name and de.version = v.version
where de.name is not null or n.maxpay > 0;
Here is a SQL Fiddle.
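For anyone who wants to check the behaviour without the fiddle, the query can be run against the sample data in an in-memory SQLite database via Python's sqlite3 module. This is only a sketch: the question targets SQL Server 2008, and the ORDER BY is added here just to get stable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataextract (id INTEGER, version INTEGER, payment INTEGER, name TEXT);
    INSERT INTO dataextract VALUES
        (1, 1, 10, 'Rich'), (2, 1, 0, 'David'), (3, 1, 10, 'Marc'), (4, 1, 10, 'Jess'), (5, 1, 0, 'Steff'),
        (1, 2, 10, 'Rich'), (2, 2, 0, 'David'), (3, 2, 10, 'Marc'), (4, 2, 10, 'Jess'), (5, 2, 0, 'Steff'),
        (2, 3, 0, 'David'), (3, 3, 10, 'Marc'), (4, 3, 10, 'Jess');
""")

# Every name paired with every version; keep a pair if the row really exists,
# or if the person ever had a payment above 0 (then a dummy 0 row is produced).
result = conn.execute("""
    SELECT n.name, v.version, COALESCE(de.payment, 0) AS payment
    FROM (SELECT name, MAX(payment) AS maxpay FROM dataextract GROUP BY name) n
    CROSS JOIN (SELECT DISTINCT version FROM dataextract) v
    LEFT OUTER JOIN dataextract de
      ON de.name = n.name AND de.version = v.version
    WHERE de.name IS NOT NULL OR n.maxpay > 0
    ORDER BY v.version, n.name
""").fetchall()

for row in result:
    print(row)
```

Rich gets a dummy row for version 3, while Steff does not, because her payment was never above 0.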

SQL Query: Using Cursors

I need some directions for SQL Server & Cursors:
I have a table named Order:
OrderID Item Amount
1 A 10
1 B 1
2 A 5
2 C 4
2 D 21
3 B 11
I have a second table named Storage:
Item Amount
A 40
B 44
C 20
D 1
For every OrderID, I want to check if enough items are available. If not, I want to return an error message. How can this be done with Cursors at all? Are nested cursors the solution to this? My main issue is to understand how I can fetch the OrderID as actual "Group" of ID=1, 2, 3 etc. instead of line by line
Please don't use a cursor. You could use a try / catch if you need to throw an error.
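A set-based check along those lines might look like the sketch below, run against SQLite through Python's sqlite3 module rather than SQL Server. It assumes each order is checked on its own against the stock (not cumulatively across orders), which the question does not spell out. Order is a reserved word, so the table name is quoted here; in SQL Server it would be written [Order].

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Order" (OrderID INTEGER, Item TEXT, Amount INTEGER);
    INSERT INTO "Order" VALUES (1, 'A', 10), (1, 'B', 1), (2, 'A', 5), (2, 'C', 4), (2, 'D', 21), (3, 'B', 11);
    CREATE TABLE Storage (Item TEXT PRIMARY KEY, Amount INTEGER);
    INSERT INTO Storage VALUES ('A', 40), ('B', 44), ('C', 20), ('D', 1);
""")

# One join finds every order line that asks for more than is in stock
# (or for an item that is not stocked at all); no cursor is needed.
shortfalls = conn.execute("""
    SELECT o.OrderID, o.Item, o.Amount AS ordered, s.Amount AS in_stock
    FROM "Order" o
    LEFT JOIN Storage s ON s.Item = o.Item
    WHERE s.Item IS NULL OR o.Amount > s.Amount
    ORDER BY o.OrderID, o.Item
""").fetchall()

# Orders that cannot be fulfilled, as a sorted list of distinct IDs.
failing_orders = sorted({order_id for order_id, _, _, _ in shortfalls})

print(shortfalls)
print(failing_orders)
```

The rows returned by the join are exactly the ones an error message would need to mention.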

SQL Recursive Tables

I have the following tables, the groups table which contains hierarchically ordered groups and group_member which stores which groups a user belongs to.
groups
---------
id
parent_id
name
group_member
---------
id
group_id
user_id
ID PARENT_ID NAME
---------------------------
1 NULL Cerebra
2 1 CATS
3 2 CATS 2.0
4 1 Cerepedia
5 4 Cerepedia 2.0
6 1 CMS
ID GROUP_ID USER_ID
---------------------------
1 1 3
2 1 4
3 1 5
4 2 7
5 2 6
6 4 6
7 5 12
8 4 9
9 1 10
I want to retrieve the visible groups for a given user. That is to say, the groups a user belongs to and the children of those groups. For example, with the above data:
USER VISIBLE_GROUPS
9 4, 5
3 1,2,4,5,6
12 5
I am getting these values using recursion and several database queries. But I would like to know if it is possible to do this with a single SQL query to improve my app performance. I am using MySQL.
Two things come to mind:
1 - You can repeatedly outer-join the table to itself to recursively walk up your tree, as in:
SELECT *
FROM
MY_GROUPS MG1
,MY_GROUPS MG2
,MY_GROUPS MG3
,MY_GROUPS MG4
,MY_GROUPS MG5
,MY_GROUP_MEMBERS MGM
WHERE MG1.PARENT_ID = MG2.UNIQID (+)
AND MG1.UNIQID = MGM.GROUP_ID (+)
AND MG2.PARENT_ID = MG3.UNIQID (+)
AND MG3.PARENT_ID = MG4.UNIQID (+)
AND MG4.PARENT_ID = MG5.UNIQID (+)
AND MGM.USER_ID = 9
That's gonna give you results like this:
UNIQID PARENT_ID NAME UNIQID_1 PARENT_ID_1 NAME_1 UNIQID_2 PARENT_ID_2 NAME_2 UNIQID_3 PARENT_ID_3 NAME_3 UNIQID_4 PARENT_ID_4 NAME_4 UNIQID_5 GROUP_ID USER_ID
4 2 Cerepedia 2 1 CATS 1 null Cerebra null null null null null null 8 4 9
The limit here is that you must add a new join for each "level" you want to walk up the tree. If your tree has fewer than, say, 20 levels, then you could probably get away with it by creating a view that showed 20 levels from every user.
2 - The only other approach that I know of is to create a recursive database function, and call that from code. You'll still have some lookup overhead that way (i.e., your # of queries will still be equal to the # of levels you are walking on the tree), but overall it should be faster since it's all taking place within the database.
I'm not sure about MySQL, but in Oracle, such a function would be similar to this one (you'll have to change the table and field names; I'm just copying something I did in the past):
CREATE OR REPLACE FUNCTION GoUpLevel(WO_ID INTEGER, UPLEVEL INTEGER) RETURN INTEGER
IS
BEGIN
  DECLARE
    iResult INTEGER;
    iParent INTEGER;
  BEGIN
    IF UPLEVEL <= 0 THEN
      iResult := WO_ID;
    ELSE
      -- look up the parent of the current node
      SELECT PARENT_ID
      INTO iParent
      FROM WOTREE
      WHERE ID = WO_ID;
      iResult := GoUpLevel(iParent, UPLEVEL-1); --recursive
    END IF;
    RETURN iResult;
  EXCEPTION WHEN NO_DATA_FOUND THEN
    RETURN NULL;
  END;
END GoUpLevel;
/
Joe Celko's books "SQL for Smarties" and "Trees and Hierarchies in SQL for Smarties" describe methods that avoid recursion entirely, by using nested sets. That complicates the updating, but makes other queries (that would normally need recursion) comparatively straightforward. There are some examples in this article written by Joe back in 1996.
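To give a feel for the nested-set idea with the groups from the question, here is a minimal sketch run against SQLite through Python's sqlite3 module. The table name `nested_groups` and the `lft`/`rgt` columns are assumptions, and the numbers were assigned by hand from a depth-first walk of the sample tree. A group's descendants are then the rows whose `lft` falls between its own `lft` and `rgt`, so no recursion is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- lft/rgt are hand-assigned by walking the sample tree depth-first.
    CREATE TABLE nested_groups (id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER);
    INSERT INTO nested_groups VALUES
        (1, 'Cerebra', 1, 12), (2, 'CATS', 2, 5), (3, 'CATS 2.0', 3, 4),
        (4, 'Cerepedia', 6, 9), (5, 'Cerepedia 2.0', 7, 8), (6, 'CMS', 10, 11);
    CREATE TABLE group_member (id INTEGER PRIMARY KEY, group_id INTEGER, user_id INTEGER);
    INSERT INTO group_member VALUES
        (1, 1, 3), (2, 1, 4), (3, 1, 5), (4, 2, 7), (5, 2, 6),
        (6, 4, 6), (7, 5, 12), (8, 4, 9), (9, 1, 10);
""")

def visible_groups_nested(conn, user_id):
    """Groups the user belongs to, plus all their descendants, in one query."""
    rows = conn.execute("""
        SELECT DISTINCT c.id
        FROM group_member m
        JOIN nested_groups p ON p.id = m.group_id
        JOIN nested_groups c ON c.lft BETWEEN p.lft AND p.rgt
        WHERE m.user_id = ?
        ORDER BY c.id
    """, (user_id,)).fetchall()
    return [group_id for (group_id,) in rows]

print(visible_groups_nested(conn, 9))
print(visible_groups_nested(conn, 6))
```

The price, as noted above, is that inserting or moving a group means renumbering `lft` and `rgt` for part of the tree.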
I don't think that this can be accomplished without using recursion. You can accomplish it with a single stored procedure using MySQL, but recursion is not allowed in stored procedures by default. This article has information about how to enable recursion. I'm not certain how much impact this would have on performance versus the multiple-query approach. MySQL may do some optimization of stored procedures, but otherwise I would expect the performance to be similar.
Didn't know if you had a Users table, so I get the list via the User_ID's stored in the Group_Member table...
SELECT GroupUsers.User_ID,
(
SELECT
STUFF((SELECT ',' +
Cast(Group_ID As Varchar(10))
FROM Group_Member Member (nolock)
WHERE Member.User_ID=GroupUsers.User_ID
FOR XML PATH('')),1,1,'')
) As Groups
FROM (SELECT User_ID FROM Group_Member GROUP BY User_ID) GroupUsers
That returns:
User_ID Groups
3 1
4 1
5 1
6 2,4
7 2
9 4
10 1
12 5
Which seems right according to the data in your table. But doesn't match up with your expected value list (e.g. User 9 is only in one group in your table data but you show it in the results as belonging to two)
EDIT: Dang. Just noticed that you're using MySQL. My solution was for SQL Server. Sorry.
-- Kevin Fairchild
A similar question has already been raised.
Here is my answer (slightly edited):
I am not sure I understand your question correctly, but this could work: My take on trees in SQL.
The linked post describes a method of storing a tree in a database (PostgreSQL in that case), but the method is clear enough that it can easily be adapted to any database.
With this method you can easily update all the nodes that depend on a modified node K with about N simple SELECT queries, where N is the distance of K from the root node.
Good Luck!
I don't remember which SO question I found the link under, but this article on sitepoint.com (second page) shows another way of storing hierarchical trees in a table that makes it easy to find all child nodes, or the path to the top, things like that. Good explanation with example code.
PS. Newish to StackOverflow, is the above ok as an answer, or should it really have been a comment on the question since it's just a pointer to a different solution (not exactly answering the question itself)?
There's no way to do this in the SQL standard, but you can usually find vendor-specific extensions, e.g., CONNECT BY in Oracle.
UPDATE: As the comments point out, this was added in SQL 99.
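For reference, the standard recursive form is a recursive common table expression (WITH RECURSIVE). The sketch below runs it against the tables from the question in SQLite through Python's sqlite3 module. MySQL supports WITH RECURSIVE from version 8.0 on, though identifier quoting differs there (backticks by default), and groups is quoted here because it is a keyword in newer SQL dialects. The function name `visible_groups` is my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "groups" (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    INSERT INTO "groups" VALUES
        (1, NULL, 'Cerebra'), (2, 1, 'CATS'), (3, 2, 'CATS 2.0'),
        (4, 1, 'Cerepedia'), (5, 4, 'Cerepedia 2.0'), (6, 1, 'CMS');
    CREATE TABLE group_member (id INTEGER PRIMARY KEY, group_id INTEGER, user_id INTEGER);
    INSERT INTO group_member VALUES
        (1, 1, 3), (2, 1, 4), (3, 1, 5), (4, 2, 7), (5, 2, 6),
        (6, 4, 6), (7, 5, 12), (8, 4, 9), (9, 1, 10);
""")

def visible_groups(conn, user_id):
    """Groups the user is a member of, plus every descendant group."""
    rows = conn.execute("""
        WITH RECURSIVE visible(id) AS (
            -- anchor: groups the user belongs to directly
            SELECT group_id FROM group_member WHERE user_id = ?
            UNION
            -- step: children of any group found so far
            SELECT g.id FROM "groups" g JOIN visible v ON g.parent_id = v.id
        )
        SELECT id FROM visible ORDER BY id
    """, (user_id,)).fetchall()
    return [group_id for (group_id,) in rows]

print(visible_groups(conn, 9))
print(visible_groups(conn, 12))
print(visible_groups(conn, 3))
```

With the sample data, user 3 also gets group 3 (CATS 2.0), since it is a child of group 2; the expected list in the question leaves it out.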