Oracle SQL -- selecting specific data from multiple rows into one row

I'm trying to select data across multiple rows into one row.
For example, with this data set:
NAME THING DATE
----- ------ ------
JACK 1 EARLY
JACK 2 LATER
JACK 3 NOW
JANE 1 LATER
JANE 2 EARLY
JANE 3 NOW
I want to produce the following result:
NAME THING DATE
---- ---- -----
JACK 1, 2, 3 NOW
JANE 1, 2, 3 NOW
I know I can use the LISTAGG function to combine the "Thing" rows, but my biggest question is how to select across multiple rows to get the "NOW" value in the date field.
Any help would be appreciated. Thanks!

It isn't clear if you want the latest date for any thing (ordering the aggregated things either by date or by their own values):
select name,
listagg(thing, ',') within group (order by date_col) as things,
max(date_col) as now
from your_table
group by name
order by name;
or the date corresponding to the highest value of thing:
select name,
listagg(thing, ',') within group (order by thing) as things,
max(date_col) keep (dense_rank last order by thing) as now
from your_table
group by name
order by name;
Since you said those are actually dates, here is your sample data with real date values added, arranged slightly differently for the two names:
NAME THING DATE_COL
---- ---------- ----------
JACK 1 2019-01-01
JACK 2 2019-03-15
JACK 3 2019-04-30
JANE 1 2019-02-01
JANE 2 2019-05-03
JANE 3 2019-04-02
the first query gets:
NAME THINGS NOW
---- --------------- ----------
JACK 1,2,3 2019-04-30
JANE 1,3,2 2019-05-03
and the second query gets:
NAME THINGS NOW
---- --------------- ----------
JACK 1,2,3 2019-04-30
JANE 1,2,3 2019-04-02
db<>fiddle
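To make the difference between the two interpretations concrete, here is a minimal plain-Python sketch that emulates both queries on the sample rows above. It is not Oracle code: the function name `summarize` and the `by_thing` flag are made up for illustration, and dates are assumed to be ISO-formatted strings so that they sort chronologically.

```python
from collections import defaultdict

# Sample rows from the answer above: (name, thing, date_col as ISO text).
ROWS = [
    ("JACK", 1, "2019-01-01"), ("JACK", 2, "2019-03-15"), ("JACK", 3, "2019-04-30"),
    ("JANE", 1, "2019-02-01"), ("JANE", 2, "2019-05-03"), ("JANE", 3, "2019-04-02"),
]


def summarize(rows, by_thing):
    """Return {name: (things, now)} (hypothetical helper).

    by_thing=False mimics the first query: things ordered by date,
    plus the latest date overall.
    by_thing=True mimics the second query: things ordered by value,
    plus the date on the row with the highest thing.
    """
    groups = defaultdict(list)
    for name, thing, date_col in rows:
        groups[name].append((thing, date_col))

    result = {}
    for name, items in sorted(groups.items()):
        if by_thing:
            ordered = sorted(items)                      # order by thing
            now = ordered[-1][1]                         # date of highest thing
        else:
            ordered = sorted(items, key=lambda i: i[1])  # order by date
            now = max(date for _, date in items)         # latest date
        result[name] = (",".join(str(t) for t, _ in ordered), now)
    return result


latest_date = summarize(ROWS, by_thing=False)
date_of_highest_thing = summarize(ROWS, by_thing=True)
print(latest_date)
print(date_of_highest_thing)
```

For JANE the two calls disagree, exactly as the two result sets above do, because her highest thing (3) is not on her latest date.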

Related

How do I select a max date by person in a table

I am not too advanced with SSRS/SQL queries, and need to write a report that pulls out % allocations by person to then compare to a wage table to allocate the wages. These allocations change quarterly, but all allocations continue to be stored in the table. If a person's allocation did not change, they do NOT get a new entry in the table. Here is a sample table called Allocations.
First Name  Last Name  Date      Area  Percent
----------  ---------  --------  ----  -------
Smith       Bob        01/01/20  A     50.00
Smith       Bob        01/01/20  B     50.00
Doe         Jane       01/01/20  A     25.00
Doe         Jane       01/01/20  B     25.00
Doe         Jane       01/01/20  C     50.00
Doe         Jane       04/01/20  A     35.00
Doe         Jane       04/01/20  C     65.00
Wayne       Bruce      01/01/20  A     100.00
Wayne       Bruce      04/01/20  B     100.00
The results that I would want to have from this sample table when querying it are:
First Name  Last Name  Date      Area  Percent
----------  ---------  --------  ----  -------
Smith       Bob        01/01/20  A     50.00
Smith       Bob        01/01/20  B     50.00
Doe         Jane       04/01/20  A     35.00
Doe         Jane       04/01/20  C     65.00
Wayne       Bruce      04/01/20  B     100.00
However, I would also like to pull this by comparing it to a date that the user inputs, so that they could run this report at any point in time and get the correct "max" dates. So, for example, if there were also 7/1/20 dates in here, but the user input date was 6/30/20, I would NOT want to pull the 7/1/20 data. In other words, I would like to pull the rows with the maximum date by name w/o going over the user's input date.
Any idea on the best way to accomplish this?
Thanks in advance for any advice you can provide.
In SQL, ROW_NUMBER can be used to number the records within each group, ordered by a particular field.
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Last_Name, First_Name ORDER BY [Date] DESC) AS ROW_NUM
FROM Allocations
) AS T
WHERE ROW_NUM = 1
Then you filter for ROW_NUM = 1, which keeps a single latest record per person.
However, I noticed that some people have several records with the same date and you want all of them. In this case you'd want to use RANK, which allows for ties, so every record that shares the latest date gets rank 1.
SELECT * FROM (
SELECT *, RANK() OVER (PARTITION BY Last_Name, First_Name ORDER BY [Date] DESC) AS ROW_NUM
FROM Allocations
) AS T
WHERE ROW_NUM = 1
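The question also asks for a user-supplied cutoff date, which the queries above do not show. A minimal sketch of one way to add it is to filter on the date inside the subquery, before ranking. The example below runs against an in-memory SQLite database through Python's `sqlite3` module as a stand-in for SQL Server, and it needs SQLite 3.25 or later for window functions. The column names `Alloc_Date` and `Pct`, the `:as_of` parameter and the `allocations_as_of` helper are assumptions made for illustration, and dates are stored as ISO text so that string comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Allocations "
    "(First_Name TEXT, Last_Name TEXT, Alloc_Date TEXT, Area TEXT, Pct REAL)"
)
conn.executemany("INSERT INTO Allocations VALUES (?, ?, ?, ?, ?)", [
    ("Smith", "Bob", "2020-01-01", "A", 50.0),
    ("Smith", "Bob", "2020-01-01", "B", 50.0),
    ("Doe", "Jane", "2020-01-01", "A", 25.0),
    ("Doe", "Jane", "2020-01-01", "B", 25.0),
    ("Doe", "Jane", "2020-01-01", "C", 50.0),
    ("Doe", "Jane", "2020-04-01", "A", 35.0),
    ("Doe", "Jane", "2020-04-01", "C", 65.0),
    ("Wayne", "Bruce", "2020-01-01", "A", 100.0),
    ("Wayne", "Bruce", "2020-04-01", "B", 100.0),
])

# Rows dated after the cutoff are discarded first; RANK then keeps every
# row that shares the latest remaining date for each person.
QUERY = """
SELECT First_Name, Last_Name, Alloc_Date, Area, Pct
FROM (
    SELECT a.*,
           RANK() OVER (PARTITION BY Last_Name, First_Name
                        ORDER BY Alloc_Date DESC) AS rnk
    FROM Allocations AS a
    WHERE Alloc_Date <= :as_of
) AS ranked
WHERE rnk = 1
ORDER BY First_Name, Area
"""


def allocations_as_of(as_of):
    """Return the allocation rows in effect on the given ISO date."""
    return conn.execute(QUERY, {"as_of": as_of}).fetchall()


current = allocations_as_of("2020-06-30")
earlier = allocations_as_of("2020-03-31")
for row in current:
    print(row)
```

With a cutoff of 2020-06-30 this returns the five rows listed as the desired result in the question; with 2020-03-31 it falls back to the 01/01/20 rows for everyone. In SQL Server the same filter would presumably use a report parameter in place of `:as_of`.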

Join two rows that contain a common column value [duplicate]

Let's say I've got the following database table
Name | Nickname | ID
----------------------
Joe Joey 14
Joe null 14
Now I want to do a select statement that merges these two columns to one while replacing the null values. The result should look like this:
Joe, Joey, 14
Which sql statement manages this (if it's even possible)?
Simplest solution:
SQL> select * from t69
2 /
NAME       NICKNAME           ID
---------- ---------- ----------
Joe        Joey               14
Joe                           14
Michael                       15
           Mick               15
           Mickey             15
SQL> select max(name) as name
2 , max(nickname) as nickname
3 , id
4 from t69
5 group by id
6 /
NAME NICKNAME ID
---------- ---------- ----------
Joe Joey 14
Michael Mickey 15
SQL>
If you have 11gR2 you could use the new-fangled LISTAGG() function but otherwise it is simple enough to wrap the above statement in a SELECT which concatenates the NAME and NICKNAME columns.
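The reason this works is that aggregate functions such as MAX skip NULLs, so each group collapses to its non-null values. A small sketch using Python's `sqlite3` module (SQLite standing in for Oracle) shows the behaviour. The placement of the NULLs in the rows for id 15 is an assumption inferred from the query result in the transcript above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t69 (name TEXT, nickname TEXT, id INTEGER)")
conn.executemany("INSERT INTO t69 VALUES (?, ?, ?)", [
    ("Joe", "Joey", 14),
    ("Joe", None, 14),
    ("Michael", None, 15),   # assumed: name only
    (None, "Mick", 15),      # assumed: nickname only
    (None, "Mickey", 15),    # assumed: nickname only
])

# MAX ignores NULLs, so one row per id survives with the non-null values.
merged = conn.execute(
    "SELECT MAX(name), MAX(nickname), id FROM t69 GROUP BY id ORDER BY id"
).fetchall()
print(merged)
```

Note the trade-off: when a group holds several different non-null nicknames, MAX keeps only the one that sorts last ('Mickey' here) and silently drops the others ('Mick'). If all of them are needed, a string aggregation such as LISTAGG is the better fit.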
As far as I can tell, the question is not entirely clear, so I am making some assumptions here.
In your expected output, the first and third columns are the same for both rows.
Only the second column is different.
So you can simply write a select query:
select one.name, two.nickname, one.id from
(select name, id from your_tb group by name, id) one,
your_tb two
where two.nickname is not NULL
and two.name = one.name
and two.id = one.id;
This could probably be tuned, but I am not good at tuning SQL queries; this is how I would approach what you need.

How to query: "for which do these values apply"?

I'm trying to match and align data or, put another way, count occurrences and then list the values for which those occurrences occur.
Or, in a question: "How many times does each ID value occur, and for what names?"
For example, with this input
Name ID
-------------
jim 123
jim 234
jim 345
john 123
john 345
jane 234
jane 345
jan 45678
I want the output to be:
count ID name name name
------------------------------------
3 345 jim john jane
2 123 jim john
2 234 jim jane
1 45678 jan
Or similarly, the input could be (noticing that the ID values are not aligned),
jim john jane jan
----------------------------
123 345 234 45678
234 123 345
345
but that seems to complicate things.
The closest I have come to the desired result in SQL is
select ID, count(ID)
from table
group by (ID)
order by count desc
which outputs
ID count
------------
345 3
123 2
234 2
45678 1
I'll appreciate help.
You seem to want a pivot. In SQL, you have to specify the number of columns in advance (unless you construct the query as a string).
But the idea is:
select ID, count(*) as cnt,
max(case when seqnum = 1 then name end) as name_1,
max(case when seqnum = 2 then name end) as name_2,
max(case when seqnum = 3 then name end) as name_3
from (select t.*,
row_number() over (partition by id order by id) as seqnum -- arbitrary ordering
from table t
) t
group by ID
order by cnt desc;
If you have an unknown number of columns, you can aggregate the values into an array:
select ID, count(*) as cnt,
array_agg(name order by name) as names
from table t
group by ID
order by cnt desc;
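As a runnable illustration of the conditional-aggregation pivot, here is a sketch against the question's sample data using Python's `sqlite3` module (SQLite 3.25 or later is needed for ROW_NUMBER). The table name `name_ids` is hypothetical, and the sequence number is ordered by name rather than arbitrarily so that the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE name_ids (name TEXT, id INTEGER)")
conn.executemany("INSERT INTO name_ids VALUES (?, ?)", [
    ("jim", 123), ("jim", 234), ("jim", 345),
    ("john", 123), ("john", 345),
    ("jane", 234), ("jane", 345),
    ("jan", 45678),
])

# Number the names within each id, then spread them over fixed columns.
# Slots without a name stay NULL (None in Python).
PIVOT = """
SELECT COUNT(*) AS cnt, id,
       MAX(CASE WHEN seqnum = 1 THEN name END) AS name_1,
       MAX(CASE WHEN seqnum = 2 THEN name END) AS name_2,
       MAX(CASE WHEN seqnum = 3 THEN name END) AS name_3
FROM (
    SELECT n.*,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY name) AS seqnum
    FROM name_ids AS n
) AS numbered
GROUP BY id
ORDER BY cnt DESC, id
"""

pivoted = conn.execute(PIVOT).fetchall()
for row in pivoted:
    print(row)
```

The three name columns are fixed in the query text, so an id shared by a fourth person would still be counted in `cnt` but that person's name would not appear. That is the limitation the answer mentions when it says the number of columns has to be specified in advance.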
The query would look similar to this, if that's what you're looking for:
SELECT
name,
id,
COUNT(id) as count
FROM
dataSet
WHERE
dataSet.name = 'input'
AND dataSet.id = 'input'
GROUP BY
name,
id

SQL: Identify distinct blocks of treatment over multiple start and end date ranges for each member

Objective: Identify distinct episodes of continuous treatment for each member in a table. Each member has a diagnosis and a service date, and an episode is defined as all services where the time between each consecutive service is less than some number (let's say 90 days for this example). The query will need to loop through each row and calculate the difference between dates, and return the first and last date associated with each episode. The goal is to group results by member and episode start/end date.
A very similar question has been asked before, and was somewhat helpful. The problem is that in customizing the code, the returned tables are excluding first and last records. I'm not sure how to proceed.
My data currently looks like this:
MemberCode Diagnosis ServiceDate
1001 ----- ABC ----- 2010-02-04
1001 ----- ABC ----- 2010-03-20
1001 ----- ABC ----- 2010-04-18
1001 ----- ABC ----- 2010-05-22
1001 ----- ABC ----- 2010-09-26
1001 ----- ABC ----- 2010-10-11
1001 ----- ABC ----- 2010-10-19
2002 ----- XYZ ----- 2010-07-10
2002 ----- XYZ ----- 2010-07-21
2002 ----- XYZ ----- 2010-11-08
2002 ----- ABC ----- 2010-06-03
2002 ----- ABC ----- 2010-08-13
In the above data, the first record for Member 1001 is 2010-02-04, and there is not a difference of more than 90 days between consecutive services until 2010-09-26 (the date at which a new episode starts). So Member 1001 has two distinct episodes: (1) Diagnosis ABC, which goes from 2010-02-04 to 2010-05-22, and (2) Diagnosis ABC, which goes from 2010-09-26 to 2010-10-19.
Similarly, Member 2002 has three distinct episodes: (1) Diagnosis XYZ, which goes from 2010-07-10 to 2010-07-21, (2) Diagnosis XYZ, which begins and ends on 2010-11-08, and (3) Diagnosis ABC, which goes from 2010-06-03 to 2010-08-13.
Desired output:
MemberCode Diagnosis EpisodeStartDate EpisodeEndDate
1001 ----- ABC ----- 2010-02-04 ----- 2010-05-22
1001 ----- ABC ----- 2010-09-26 ----- 2010-10-19
2002 ----- XYZ ----- 2010-07-10 ----- 2010-07-21
2002 ----- XYZ ----- 2010-11-08 ----- 2010-11-08
2002 ----- ABC ----- 2010-06-03 ----- 2010-08-13
I've been working on this query for too long, and still can't get exactly what I need. Any help would be appreciated. Thanks in advance!
SQL Server 2012 has the lag() and cumulative sum functions, which make it easier to write such a query. The idea is to flag the first service in each sequence, then take the cumulative sum of that flag to identify each group. Here is the code:
select MemberCode, Diagnosis, min(ServiceDate) as EpisodeStartDate,
max(ServiceDate) as EpisodeEndDate
from (select t.*, sum(ServiceStartFlag) over (partition by MemberCode, Diagnosis order by ServiceDate) as grp
from (select t.*,
(case when datediff(day,
lag(ServiceDate) over (partition by MemberCode, Diagnosis
order by ServiceDate),
ServiceDate) < 90
then 0
else 1 -- handles both NULL and >= 90
end) as ServiceStartFlag
from table t
) t
) t
group by grp, MemberCode, Diagnosis;
You can do this in earlier versions of SQL Server but the code is more cumbersome.
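To check the flag-and-cumulative-sum idea against the question's data, here is a sketch that runs the same logic in SQLite through Python's `sqlite3` module (SQLite 3.25 or later for window functions). It is an adaptation, not the SQL Server query itself: SQLite has no datediff, so the day difference is assumed to be computed with `julianday`, and the table name `Treatments` is borrowed from the other answer in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Treatments (MemberCode INTEGER, Diagnosis TEXT, ServiceDate TEXT)"
)
conn.executemany("INSERT INTO Treatments VALUES (?, ?, ?)", [
    (1001, "ABC", "2010-02-04"), (1001, "ABC", "2010-03-20"),
    (1001, "ABC", "2010-04-18"), (1001, "ABC", "2010-05-22"),
    (1001, "ABC", "2010-09-26"), (1001, "ABC", "2010-10-11"),
    (1001, "ABC", "2010-10-19"),
    (2002, "XYZ", "2010-07-10"), (2002, "XYZ", "2010-07-21"),
    (2002, "XYZ", "2010-11-08"),
    (2002, "ABC", "2010-06-03"), (2002, "ABC", "2010-08-13"),
])

# Inner query: flag = 1 when there is no previous service (NULL gap) or
# the gap is 90 days or more. Middle query: running sum of the flag
# gives an episode number. Outer query: first and last date per episode.
EPISODES = """
SELECT MemberCode, Diagnosis,
       MIN(ServiceDate) AS EpisodeStartDate,
       MAX(ServiceDate) AS EpisodeEndDate
FROM (
    SELECT flagged.*,
           SUM(ServiceStartFlag) OVER (PARTITION BY MemberCode, Diagnosis
                                       ORDER BY ServiceDate) AS grp
    FROM (
        SELECT t.*,
               CASE WHEN julianday(ServiceDate)
                         - julianday(LAG(ServiceDate) OVER (
                               PARTITION BY MemberCode, Diagnosis
                               ORDER BY ServiceDate)) < 90
                    THEN 0 ELSE 1 END AS ServiceStartFlag
        FROM Treatments AS t
    ) AS flagged
) AS grouped
GROUP BY MemberCode, Diagnosis, grp
ORDER BY MemberCode, Diagnosis, EpisodeStartDate
"""

episodes = conn.execute(EPISODES).fetchall()
for row in episodes:
    print(row)
```

This yields the five episodes listed in the question's desired output, including the single-day XYZ episode on 2010-11-08, which comes out with the same start and end date.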
For versions of SQL Server prior to 2012, here are some code snippets that should work.
First, you'll need a temp table (as opposed to a CTE, because with a CTE the lookup of the edge event would fire the newid() function again rather than retrieving the value already generated for that row).
CREATE TABLE #Edges (MemberCode INT, Diagnosis VARCHAR(3), ServiceDate DATE, GroupID VARCHAR(40))
INSERT INTO #Edges
SELECT *
FROM Treatments E
CROSS APPLY (
SELECT
CASE
WHEN EXISTS (
SELECT TOP 1 E2.ServiceDate
FROM Treatments E2
WHERE E.MemberCode = E2.MemberCode
AND E.Diagnosis = E2.Diagnosis
AND E.ServiceDate > E2.ServiceDate
AND DATEDIFF(dd,E2.ServiceDate,E.ServiceDate) BETWEEN 1 AND 90
ORDER BY E2.ServiceDate DESC
) THEN 'Group'
ELSE CAST(NEWID() AS VARCHAR(40))
END AS GroupID
) z
The EXISTS operator contains a query that looks into the past for a date between 1 and 90 days ago. Once the Edge cases are gathered, this query will provide the results you posted as desired from the test data you posted.
SELECT MemberCode, Diagnosis, MIN(ServiceDate) AS StartDate, MAX(ServiceDate) AS EndDate
FROM (
SELECT
MemberCode
, Diagnosis
, ServiceDate
, CASE GroupID
WHEN 'Group' THEN (
SELECT TOP 1 GroupID
FROM #Edges E2
WHERE E.MemberCode = E2.MemberCode
AND E.Diagnosis = E2.Diagnosis
AND E.ServiceDate > E2.ServiceDate
AND GroupID != 'Group'
ORDER BY ServiceDate DESC
)
ELSE GroupID END AS GroupID
FROM #Edges E
) Z
GROUP BY MemberCode, Diagnosis, GroupID
ORDER BY MemberCode, Diagnosis, MIN(ServiceDate)
Like Gordon said, more cumbersome, but it can be done if your server is not SQL 2012 or greater.
