Set-based way to calculate family ranges in SQL? - sql

I have a table that contains parents and 0 or more children for each parent, with a flag indicating which records are parents. All of the members of a given family have the same parent id, and the parent always has the lowest id in a given family. Also, each child has a value associated with it. (Specifically, this is a database of emails and attachments, where each parent is an email and the children are the attachments.)
I have two fields I need to calculate:
Range = {lowest id in family} - {highest id in family} [populated for all members]
Value-list = {delimited list of the values of each child, in id order} [only for parent]
So, given this:
Id | Parent| HasChildren| Value | Range | Value-list
----------------------------------------|-----------
1 | 1 | 1 | | |
2 | 1 | 0 | a | |
3 | 1 | 0 | b | |
4 | 4 | 1 | | |
5 | 4 | 0 | c | |
6 | 6 | 0 | | |
I would like to end up with this:
Id | Parent| HasChildren| Value | Range | Value-list
----------------------------------------|-----------
1 | 1 | 1 | | 1-3 | a;b
2 | 1 | 0 | a | 1-3 |
3 | 1 | 0 | b | 1-3 |
4 | 4 | 1 | | 4-5 | c
5 | 4 | 0 | c | 4-5 |
6 | 6 | 0 | | 6-6 |
How can I do this efficiently? Ideally, I'd like to do this with just set-based logic, without cursors, or even stored procedures. Temporary tables are fine.
I'm working in T-SQL, if that makes a difference, though I'd be curious to see platform agnostic answers.

The following solution (originally posted as a SQLFiddle) should do the job for you; however, as @Allan mentioned, you might want to revise your database structure.
Using CTEs:
Note: my query uses table1 as the name of your table.
with cte as (
    select t1.parent
         , ValueList = stuff((select ';' + isnull(t2.Value, '')
                              from table1 t2
                              where t2.parent = t1.parent
                                and t2.id <> t2.parent      -- children only, skip the parent row
                              order by t2.id                -- the question asks for id order
                              for xml path(''), type
                             ).value('.', 'NVARCHAR(MAX)'), 1, 1, '')  -- strip the one leading ';'
    from table1 t1
    group by t1.parent
),
cte2 as (
    select parent
         , min(id) as FirstID
         , max(id) as LastID
    from table1
    group by parent
)
select t1.id, t1.parent, t1.haschildren, t1.value
     -- cast so that '+' concatenates strings even when id is numeric
     , cast(c2.FirstID as varchar(20)) + '-' + cast(c2.LastID as varchar(20)) as [Range]
     , case when t1.haschildren = 1 then c.ValueList end as [Value-list]
from table1 t1
join cte2 c2 on c2.parent = t1.parent
join cte c on c.parent = t1.parent
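Since the question also asks for platform-agnostic ideas, here is a minimal sketch of the same result using window functions instead of FOR XML. It runs against SQLite through Python purely so it can be tried locally; the table name `family`, the column name `id_range`, and the use of `GROUP_CONCAT` are assumptions of this sketch, not part of the original answer. On SQL Server the closest equivalent of `GROUP_CONCAT` would be `STRING_AGG` (2017 and later), which is not a window function, so it would need a grouped subquery there.

```python
import sqlite3

def family_ranges():
    """Return every row with its family id range and, for parents, the child value list."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE family (id INTEGER PRIMARY KEY, parent INTEGER, "
        "haschildren INTEGER, value TEXT)"
    )
    # Sample rows copied from the question.
    conn.executemany(
        "INSERT INTO family VALUES (?, ?, ?, ?)",
        [(1, 1, 1, None), (2, 1, 0, "a"), (3, 1, 0, "b"),
         (4, 4, 1, None), (5, 4, 0, "c"), (6, 6, 0, None)],
    )
    rows = conn.execute("""
        SELECT id, parent, haschildren, value,
               (MIN(id) OVER fam) || '-' || (MAX(id) OVER fam) AS id_range,
               -- GROUP_CONCAT skips NULLs, so the parent's empty value is ignored
               CASE WHEN haschildren = 1
                    THEN GROUP_CONCAT(value, ';') OVER fam
               END AS value_list
        FROM family
        -- whole-family frame, visited in id order
        WINDOW fam AS (PARTITION BY parent ORDER BY id
                       ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
        ORDER BY id
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in family_ranges():
        print(row)
```

The window needs SQLite 3.25 or later. Every member of a family sees the same MIN/MAX, and only rows flagged with haschildren = 1 keep the concatenated list.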

Related

How do I do an Oracle SQL update from select in a specific order?

I have a table with old values (some null) and new values for various attributes, all inserted at different add times over the months. I'm trying to update a second table whose records carry business month-end dates. Right now, these records contain only the most recent new values for all month-end dates. The goal is to create historical data by updating the previous month-end values with the old values from the first table. I am a beginner and was able to come up with a query that updates one object when there is one entry for it in the first table. Now I am trying to expand the query to cover multiple objects, possibly with multiple old values within the same month. I tried to use "order by" (since I need to apply the updates for a month in ascending order so that it ends up with the latest value), but read that it doesn't work in update statements without a subquery. So I tried my hand at a more complicated query, without success. I am getting the following error: single-row subquery returns more than one row. Thanks!
TableA:
| ID | TYPE | OLD_VALUE | NEW_VALUE | ADD_TIME|
-----------------------------------------------
| 1 | A | 2 | 3 | 1/11/2019 8:00:00am |
| 1 | B | 3 | 4 | 12/10/2018 8:00:00am|
| 1 | B | 4 | 5 | 12/11/2018 8:00:00am|
| 2 | A | 5 | 1 | 12/5/2018 08:00:00am|
| 2 | A | 1 | 2 | 12/5/2019 09:00:00am|
| 2 | A | 2 | 3 | 12/5/2019 10:00:00am|
| 2 | B | 1 | 2 | 12/5/2019 10:00:00am|
TableB
| ID | MONTH_END | TYPE_A | TYPE_B |
-----------------------------------
| 1 | 1/31/19 | 3 | 5 |
| 1 | 12/31/18 | 3 | 5 |
| 1 | 11/30/18 | 3 | 5 |
| 2 | 12/31/18 | 3 | 2 |
| 2 | 11/30/18 | 3 | 2 |
Desired Output for TableB
| ID | MONTH_END | TYPE_A | TYPE_B |
-----------------------------------
| 1 | 1/31/19 | 3 | 5 |
| 1 | 12/31/18 | 2 | 5 |
| 1 | 11/30/18 | 2 | 3 |
| 2 | 12/31/18 | 3 | 2 |
| 2 | 11/30/18 | 5 | 2 |
My Query for Type A (Which I plan to adapt for Type B and execute as well for the desired output)
update TableB B
set b.type_a =
(
with aa as
(
select id, nvl(old_value, new_value) typea, add_time
from TableA
where type = 'A'
order by id, add_time ascending
)
select typea
from aa
where aa.id = b.id
and b.month_end <= aa.add_time
)
where exists
(
with aa as
(
select id, nvl(old_value, new_value) typea, add_time
from TableA
where type = 'A'
order by id, add_time ascending
)
select typea
from aa
where aa.id = b.id
and b.month_end <= aa.add_time
)
Kudos for giving example input data and desired output. I found your question a bit confusing, so let me rephrase it as: "Provide the last type A value from TableA that is in the same month as the month end."
By matching on id, type and date of entry, we can get your answer. The "ROWNUM = 1" limits the result set to a single entry in case more than one row has the same add_time. This SQL is still a mess; maybe someone else can come up with a better one.
UPDATE tableb b
SET b.type_a =
    (SELECT a.old_value
     FROM tablea a
     WHERE a.id = b.id                                     -- same object
       AND a.type = 'A'
       AND LAST_DAY( TRUNC( a.add_time ) ) = b.month_end   -- same month
       AND a.add_time =
           (SELECT MAX( a2.add_time )                      -- latest entry of that month
            FROM tablea a2
            WHERE a2.id = a.id
              AND a2.type = 'A'
              AND LAST_DAY( TRUNC( a2.add_time ) ) = b.month_end)
       AND ROWNUM = 1)
WHERE EXISTS
    (SELECT 1
     FROM tablea a
     WHERE a.id = b.id
       AND a.type = 'A'
       AND LAST_DAY( TRUNC( a.add_time ) ) = b.month_end);
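To see the "latest entry in the same month" idea in isolation, here is a small sketch that runs on SQLite through Python rather than Oracle. It follows the rephrased requirement from this answer, not necessarily the asker's full desired output. The snake_case table names, the ISO date strings, and the sample rows (adjusted so that one month holds several changes) are all assumptions made for illustration; `ORDER BY ... LIMIT 1` stands in for the `MAX(add_time)` plus `ROWNUM = 1` combination.

```python
import sqlite3

def backfill_type_a():
    """Set type_a to the old_value of the latest 'A' change in the same month, per id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table_a (id INTEGER, type TEXT, old_value INTEGER,
                              new_value INTEGER, add_time TEXT);
        CREATE TABLE table_b (id INTEGER, month_end TEXT, type_a INTEGER);
    """)
    # Hypothetical rows: id 2 changes three times in December 2018.
    conn.executemany("INSERT INTO table_a VALUES (?, ?, ?, ?, ?)", [
        (1, "A", 2, 3, "2019-01-11 08:00:00"),
        (2, "A", 5, 1, "2018-12-05 08:00:00"),
        (2, "A", 1, 2, "2018-12-05 09:00:00"),
        (2, "A", 2, 3, "2018-12-05 10:00:00"),
    ])
    conn.executemany("INSERT INTO table_b VALUES (?, ?, ?)", [
        (1, "2019-01-31", 3), (1, "2018-12-31", 3), (1, "2018-11-30", 3),
        (2, "2018-12-31", 3), (2, "2018-11-30", 3),
    ])
    same_month = ("a.id = table_b.id AND a.type = 'A' AND "
                  "strftime('%Y-%m', a.add_time) = strftime('%Y-%m', table_b.month_end)")
    conn.execute(f"""
        UPDATE table_b
        SET type_a = (SELECT a.old_value
                      FROM table_a a
                      WHERE {same_month}
                      ORDER BY a.add_time DESC   -- latest change first
                      LIMIT 1)                   -- guarantees a single row
        WHERE EXISTS (SELECT 1 FROM table_a a WHERE {same_month})
    """)
    rows = conn.execute(
        "SELECT id, month_end, type_a FROM table_b ORDER BY id, month_end DESC"
    ).fetchall()
    conn.close()
    return rows
```

Only the month ends that have a matching change are touched; the WHERE EXISTS guard keeps the other rows from being overwritten with NULL.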

Split data by levels in hierarchy

Example of initial data:
| ID | ParentID |
|------|------------|
| 1 | NULL |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | NULL |
| 6 | 2 |
| 7 | 3 |
In my initial data I have the ID of each element and its parent ID.
Some elements have a parent, some do not, and some have a parent that itself has a parent.
The maximum number of levels in this hierarchy is 3.
I need to get this hierarchy by levels:
Lvl 1 - elements without a parent
Lvl 2 - elements whose parent has no parent
Lvl 3 - elements whose parent also has a parent
Expected result looks like:
| Lvl1 | Lvl2 | Lvl3 |
|-------|----------|----------|
| 1 | NULL | NULL |
| 1 | 2 | NULL |
| 1 | 3 | NULL |
| 1 | 2 | 4 |
| 5 | NULL | NULL |
| 1 | 2 | 6 |
| 1 | 3 | 7 |
How can I do it?
For a fixed depth of three, you can use CROSS APPLY.
It can be used like a JOIN, but also return extra records to give you the NULLs.
SELECT
Lvl1.ID AS lvl1,
Lvl2.ID AS lvl2,
Lvl3.ID AS lvl3
FROM
initial_data AS Lvl1
CROSS APPLY
(
SELECT ID FROM initial_data WHERE ParentID = Lvl1.ID
UNION ALL
SELECT NULL AS ID
)
AS Lvl2
CROSS APPLY
(
SELECT ID FROM initial_data WHERE ParentID = Lvl2.ID
UNION ALL
SELECT NULL AS ID
)
AS Lvl3
WHERE
Lvl1.ParentID IS NULL
ORDER BY
Lvl1.ID,
Lvl2.ID,
Lvl3.ID
But, as per my comment, this is often a sign that you're headed down a non-sql route. It might feel easier to start with, but later it turns and bites you, because SQL benefits tremendously from normalised structures (your starting data).
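For engines without CROSS APPLY, a different technique gives the same rows: look at each element, LEFT JOIN its parent and grandparent, and shift the ids into the right level columns. This is a swapped-in alternative, not the CROSS APPLY query above, and it is only a sketch for the fixed depth of three; it is run on SQLite through Python so it can be tested, and the aliases `e`, `p` and `g` are made up here.

```python
import sqlite3

def split_levels():
    """Return (Lvl1, Lvl2, Lvl3) for every element, in ID order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE initial_data (ID INTEGER PRIMARY KEY, ParentID INTEGER)")
    # Sample rows copied from the question.
    conn.executemany(
        "INSERT INTO initial_data VALUES (?, ?)",
        [(1, None), (2, 1), (3, 1), (4, 2), (5, None), (6, 2), (7, 3)],
    )
    rows = conn.execute("""
        SELECT COALESCE(g.ID, p.ID, e.ID) AS Lvl1,          -- topmost known ancestor
               CASE WHEN g.ID IS NOT NULL THEN p.ID         -- e is at level 3
                    WHEN p.ID IS NOT NULL THEN e.ID         -- e is at level 2
               END AS Lvl2,
               CASE WHEN g.ID IS NOT NULL THEN e.ID END AS Lvl3
        FROM initial_data e                                 -- the element itself
        LEFT JOIN initial_data p ON p.ID = e.ParentID       -- its parent, if any
        LEFT JOIN initial_data g ON g.ID = p.ParentID       -- its grandparent, if any
        ORDER BY e.ID
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in split_levels():
        print(row)
```

Each input row produces exactly one output row, so the result comes back in the same order as the expected table in the question.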

Efficient ROW_NUMBER increment when column matches value

I'm trying to find an efficient way to derive the column Expected below from only Id and State. What I want is for the number Expected to increase each time State is 0 (ordered by Id).
+----+-------+----------+
| Id | State | Expected |
+----+-------+----------+
| 1 | 0 | 1 |
| 2 | 1 | 1 |
| 3 | 0 | 2 |
| 4 | 1 | 2 |
| 5 | 4 | 2 |
| 6 | 2 | 2 |
| 7 | 3 | 2 |
| 8 | 0 | 3 |
| 9 | 5 | 3 |
| 10 | 3 | 3 |
| 11 | 1 | 3 |
+----+-------+----------+
I have managed to accomplish this with the following SQL, but the execution time is very poor when the data set is large:
WITH Groups AS
(
SELECT Id, ROW_NUMBER() OVER (ORDER BY Id) AS GroupId FROM tblState WHERE State=0
)
SELECT S.Id, S.[State], S.Expected, G.GroupId FROM tblState S
OUTER APPLY (SELECT TOP 1 GroupId FROM Groups WHERE Groups.Id <= S.Id ORDER BY Id DESC) G
Is there a simpler and more efficient way to produce this result? (In SQL Server 2012 or later)
Just use a cumulative sum:
select s.*,
sum(case when state = 0 then 1 else 0 end) over (order by id) as expected
from tblState s;
Another method uses a correlated subquery:
select s.*,
       (select count(*)
        from tblState s2
        where s2.id <= s.id and s2.state = 0   -- <= so a row with State 0 counts itself
       ) as expected
from tblState s;
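Both approaches can be checked side by side against the sample data. The sketch below runs them on SQLite through Python (window functions need SQLite 3.25 or later); the function name and the comparison are assumptions added for illustration. The correlated count uses `<=` so that a row with State 0 counts itself.

```python
import sqlite3

def state_groups():
    """Return (windowed, correlated) lists of (Id, Expected) for the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tblState (Id INTEGER PRIMARY KEY, State INTEGER)")
    states = [0, 1, 0, 1, 4, 2, 3, 0, 5, 3, 1]  # State column from the question
    conn.executemany(
        "INSERT INTO tblState VALUES (?, ?)",
        list(enumerate(states, start=1)),
    )
    # Running total: add 1 every time State is 0, walking the rows in Id order.
    windowed = conn.execute("""
        SELECT Id,
               SUM(CASE WHEN State = 0 THEN 1 ELSE 0 END) OVER (ORDER BY Id) AS Expected
        FROM tblState
        ORDER BY Id
    """).fetchall()
    # Correlated count: number of State = 0 rows up to and including this one.
    correlated = conn.execute("""
        SELECT s.Id,
               (SELECT COUNT(*) FROM tblState s2
                WHERE s2.Id <= s.Id AND s2.State = 0) AS Expected
        FROM tblState s
        ORDER BY s.Id
    """).fetchall()
    conn.close()
    return windowed, correlated
```

The windowed form reads the table once, while the correlated form re-counts for every row, which is why the cumulative sum tends to scale better on large tables.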

Select from cross-reference based on inclusion (column values being superset)

Given a cross-reference table t relating table a with b:
| id | a_id | b_id |
--------------------
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 2 | 7 |
| 5 | 2 | 3 |
| 6 | 3 | 2 |
| 7 | 3 | 3 |
What would be the conventional way of selecting all a_id whose b_id is a superset of a given set?
For example, for the set (2,3), I would expect the result:
| a_id |
--------
| 1 |
| 3 |
Since a_id 1 and 3 are the only ones whose set of b_id values is a superset of (2,3).
The best solution I've found so far (thanks to this answer):
select id
from a
where 2 = (select count(*)
from t
where t.a_id = a.id and t.b_id in (2,3)
);
But I'd prefer to avoid calculating stuff like cardinality before running the query.
You can simply adapt the query as:
select id
from a cross join
(select count(*) as cnt
from t
where . . .
) x
where x.cnt = (select count(*)
from t
where t.a_id = a.id and t.b_id in (2,3)
);
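One way to avoid hard-coding the cardinality is to keep the wanted values in their own table and let the database count them. The sketch below is an assumption-laden illustration of that idea, not the query from this answer: the `wanted` temp table and the `supersets_of` helper are hypothetical names, and it runs on SQLite through Python so it can be tested.

```python
import sqlite3

def supersets_of(wanted):
    """Return the a_id values whose b_id set contains every value in `wanted`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a_id INTEGER, b_id INTEGER)")
    # Cross-reference rows copied from the question.
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        [(1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 7),
         (5, 2, 3), (6, 3, 2), (7, 3, 3)],
    )
    # Hypothetical helper table holding the set to match.
    conn.execute("CREATE TEMP TABLE wanted (b_id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO wanted VALUES (?)", [(b,) for b in set(wanted)])
    rows = conn.execute("""
        SELECT t.a_id
        FROM t
        JOIN wanted w ON w.b_id = t.b_id           -- keep only the wanted b_id rows
        GROUP BY t.a_id
        HAVING COUNT(DISTINCT t.b_id) = (SELECT COUNT(*) FROM wanted)
        ORDER BY t.a_id
    """).fetchall()
    conn.close()
    return [a_id for (a_id,) in rows]

if __name__ == "__main__":
    print(supersets_of([2, 3]))
```

The set size is computed inside the query, so the caller only supplies the values. Note that with an empty `wanted` set the join returns no rows, so this sketch yields an empty list rather than every a_id.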

How to select hierarchy collection? (mixed with non hierarchy data, etc)

Having the table:
I need to show the following:
| ID | PERSONID | MASTERID | CHILDID | VALUE | DEPTHLEVEL |
---------------------------------------------------------------
| 1 | 3 | 78452 | 21456 | 100 | 1 |
| 2 | 3 | 21456 | | 0 | 2 |
| 3 | 3 | 652314 | 417859 | 115 | 1 |
| 4 | 3 | 417859 | | 0 | 2 |
| 5 | 4 | 998654 | 223655 | 300 | 1 |
| 6 | 4 | 223655 | | 0 | 2 |
| 7 | 4 | 201302 |789654,441592| 200 | 1 |
| 8 | 4 | 789654 | | 0 | 2 |
| 9 | 4 | 441592 | | 0 | 2 |
| 10 | 5 | 999852 | | 123 | 1 |
Look at the row with id 10: it has no relations (children), while the row with id 7 has two children.
I need to zero out the value (set it to 0) for every child/leaf.
For rows 1-9 I tried the following query:
select v.* from
(
select v.id, v.personid,
case when level > 1
then 0
else
v.value
end thevalue,
v.masterid, v.childid, level depthlevel
from tmpsimpleexample v
start with v.childid is not null
connect by v.masterid = prior v.childid
) v
order by v.id
Results:
Look at the rows with id 7 and 8: that is the master with two children, and I need to put it in one row.
This is the first problem.
I also need to show the data with no hierarchy relation (id 10 in the expected result table, id 11 in the image of the table data).
I think I can query all rows whose masterid is not referenced by any childid, and then make a union between the first query (above) and that query.
The query that searches for all rows whose masterid is not referenced by a childid will show me the rows without relations and the master rows of level 1.
select id, personid, value thevalue, masterid, childid, 1 depthlevel
from TMPSIMPLEEXAMPLE
where masterid not in
(select childid from TMPSIMPLEEXAMPLE where childid is not null)
Here I can do a union, and the result will fit my requirements (except for concatenating the childid values on the master row).
select v.* from
(
select v.id, v.personid,
case when level > 1
then 0
else
v.value
end thevalue,
v.masterid, v.childid, level depthlevel
from tmpsimpleexample v
start with v.childid is not null
connect by v.masterid = prior v.childid
union
select id, personid, value thevalue, masterid, childid, 1 depthlevel
from TMPSIMPLEEXAMPLE
where masterid not in
(select childid from TMPSIMPLEEXAMPLE where childid is not null)
) v
order by v.id
Almost final result:
But given that my real table has hundreds of thousands of records, is a union like that a good approach?
I've taken a stab at what I think your source data looks like:
| ID | PERSONID | MASTERID | CHILDID | VALUE |
-----------------------------------------------
| 1 | 3 | 78452 | 21456 | 100 |
| 2 | 3 | 21456 | | -1 |
| 3 | 3 | 652314 | 417859 | 115 |
| 4 | 3 | 417859 | | -1 |
| 5 | 4 | 998654 | 223655 | 300 |
| 6 | 4 | 223655 | | -1 |
| 7 | 4 | 201302 | 441592 | 200 |
| 7 | 4 | 201302 | 789654 | 200 |
| 9 | 4 | 441592 | | -1 |
| 8 | 4 | 789654 | | -1 |
| 10 | 4 | 999852 | | 123 |
-----------------------------------------------
The following query gets you your desired results:
select id,
personid,
masterid,
listagg(childid, ',') within group (order by childid) childid,
-- Took a guess that all values for a personid were the same and didn't need to be aggregated...
min(decode(depthlevel, 1, value, null)) value,
min(depthlevel) depthlevel
from (select v.*, level depthlevel
from tmpsimpleexample v
connect by v.masterid = prior v.childid
-- Trick here is to start with all of the desired starting conditions...
start with not exists ( select 'X' from tmpsimpleexample v2 where v2.childid = v.masterid ))
group by id, personid, masterid;
If ordering of your CHILDID is important, you would need to re-join the nested view with TMPSIMPLEEXAMPLE:
select v1.id,
v1.personid,
v1.masterid,
listagg(v1.childid, ',') within group (order by v2.id) childid,
min(decode(depthlevel, 1, v1.value, null)) value,
min(depthlevel) depthlevel
from (select v.*, level depthlevel
from tmpsimpleexample v
connect by v.masterid = prior v.childid
start with not exists ( select 'X' from tmpsimpleexample v2 where v2.childid = v.masterid )) v1,
tmpsimpleexample v2
-- Outer Join is important!
where v1.childid = v2.masterid (+)
group by v1.id, v1.personid, v1.masterid;
The real magic here is the LISTAGG function. If you are not yet on 11g Release 2 or later (why not?!?), the following article can guide you in building your own aggregate function:
http://www.oracle-base.com/articles/misc/string-aggregation-techniques.php
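For readers without CONNECT BY and LISTAGG, roughly the same shape can be sketched with a recursive CTE and a string aggregate. The example below runs on SQLite through Python against the guessed source data from this answer; using `GROUP_CONCAT`, setting leaf values to 0 instead of NULL (as the question asks), and sorting the child ids in Python are all choices of this sketch, not of the original query.

```python
import sqlite3

def flatten_hierarchy():
    """One row per id: (id, personid, masterid, [child ids], value, depthlevel)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tmpsimpleexample (id INTEGER, personid INTEGER, "
        "masterid INTEGER, childid INTEGER, value INTEGER)"
    )
    # Guessed source data from the answer (id 7 appears once per child).
    conn.executemany("INSERT INTO tmpsimpleexample VALUES (?, ?, ?, ?, ?)", [
        (1, 3, 78452, 21456, 100), (2, 3, 21456, None, -1),
        (3, 3, 652314, 417859, 115), (4, 3, 417859, None, -1),
        (5, 4, 998654, 223655, 300), (6, 4, 223655, None, -1),
        (7, 4, 201302, 441592, 200), (7, 4, 201302, 789654, 200),
        (9, 4, 441592, None, -1), (8, 4, 789654, None, -1),
        (10, 4, 999852, None, 123),
    ])
    rows = conn.execute("""
        WITH RECURSIVE walk(id, personid, masterid, childid, value, depthlevel) AS (
            -- start with every master that nobody references as a child
            SELECT v.id, v.personid, v.masterid, v.childid, v.value, 1
            FROM tmpsimpleexample v
            WHERE NOT EXISTS (SELECT 1 FROM tmpsimpleexample v2
                              WHERE v2.childid = v.masterid)
            UNION ALL
            -- then follow childid -> masterid one level at a time
            SELECT v.id, v.personid, v.masterid, v.childid, v.value, w.depthlevel + 1
            FROM tmpsimpleexample v
            JOIN walk w ON v.masterid = w.childid
        )
        SELECT id, personid, masterid,
               GROUP_CONCAT(childid, ',') AS childids,
               MIN(CASE WHEN depthlevel = 1 THEN value ELSE 0 END) AS value,
               MIN(depthlevel) AS depthlevel
        FROM walk
        GROUP BY id, personid, masterid
        ORDER BY id
    """).fetchall()
    conn.close()
    # GROUP_CONCAT order is not guaranteed here, so normalise it in Python.
    return [(i, p, m, sorted(c.split(",")) if c else [], v, d)
            for i, p, m, c, v, d in rows]
```

The standalone row (id 10) is picked up by the same starting condition as the level 1 masters, so no separate UNION branch is needed for it.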