Combining Rows in SQL Via Recursive Query - sql

I have the following table.
Animal Vaccine_Date Vaccine
Cat 2/1/2016 y
Cat 2/1/2016 z
Dog 2/1/2016 z
Dog 1/1/2016 x
Dog 2/1/2016 y
I would like to get the results to be as shown below.
Animal Vaccine_Date Vaccine
Dog 1/1/2016 x
Dog 2/1/2016 y,z
Cat 2/1/2016 y,z
I have the following code which was supplied via my other post at "Combine(concatenate) rows based on dates via SQL"
WITH RECURSIVE recCTE AS
(
SELECT
animal,
vaccine_date,
CAST(min(vaccine) as VARCHAR(50)) as vaccine, --big enough to hold concatenated list
cast (1 as int) as depth --used to determine the largest/last group_concate (the full group) in the final select
FROM TableOne
GROUP BY 1,2
UNION ALL
SELECT
recCTE.animal,
recCTE.vaccine_date,
trim(trim(recCTE.vaccine)|| ',' ||trim(TableOne.vaccine)) as vaccine,
recCTE.depth + cast(1 as int) as depth
FROM recCTE
INNER JOIN TableOne ON
recCTE.animal = TableOne.animal AND
recCTE.vaccine_date = TableOne.vaccine_date and
TableOne.vaccine > recCTE.vaccine
WHERE recCTE.depth < 5
)
--Now select the result with the largest depth for each animal/vaccine_date combo
SELECT * FROM recCTE
QUALIFY ROW_NUMBER() OVER (PARTITION BY animal,vaccine_date ORDER BY depth desc) =1
But this results in the following.
Animal Vaccine_Date vaccine depth
Cat 2/1/2016 y,z,z,z,z 5
Dog 1/1/2016 x 1
Dog 2/1/2016 y,z,z,z,z 5
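The "z" keeps repeating. This is because the join condition TableOne.vaccine > recCTE.vaccine compares against the concatenated string built so far (for example, 'z' > 'y,z' is still true), so "z" is appended again on every iteration until the depth limit of 5 is reached. To account for this, the code was changed to the following.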
WITH RECURSIVE recCTE AS
(
SELECT
animal,
vaccine_date,
CAST(min(vaccine) as VARCHAR(50)) as vaccine, --big enough to hold concatenated list
cast (1 as int) as depth, --used to determine the largest/last group_concate (the full group) in the final select
vaccine as vaccine_check
FROM TableOne
GROUP BY 1,2,5
UNION ALL
SELECT
recCTE.animal,
recCTE.vaccine_date,
trim(trim(recCTE.vaccine)|| ',' ||trim(TableOne.vaccine)) as vaccine,
recCTE.depth + cast(1 as int) as depth,
TableOne.vaccine as vaccine_check
FROM recCTE
INNER JOIN TableOne ON
recCTE.animal = TableOne.animal AND
recCTE.vaccine_date = TableOne.vaccine_date and
TableOne.vaccine > recCTE.vaccine and
vaccine_check <> recCTE.vaccine_check
WHERE recCTE.depth < 5
)
--Now select the result with the largest depth for each animal/vaccine_date combo
SELECT * FROM recCTE
QUALIFY ROW_NUMBER() OVER (PARTITION BY animal,vaccine_date ORDER BY depth desc) =1
However, this resulted in the following.
Animal Vaccine_Date vaccine depth vaccine_check
Cat 2/1/2016 y 1 y
Dog 1/1/2016 x 1 x
Dog 2/1/2016 y 1 y
What is missing in the code to get the desired results shown below?
Animal Vaccine_Date Vaccine
Dog 1/1/2016 x
Dog 2/1/2016 y,z
Cat 2/1/2016 y,z

Hmmm. I don't have Teradata on hand but this is a major shortcoming in the project (in my opinion). I think this will work for you, but it might need some tweaking:
with tt as (
select t.*,
-- order by vaccine so the concatenated list comes out in a deterministic order
row_number() over (partition by animal, vaccine_date order by vaccine) as seqnum,
count(*) over (partition by animal, vaccine_date) as cnt
from TableOne t
),
recursive cte as (
-- cast so the column is wide enough to hold the concatenated list
select animal, vaccine_date, cast(vaccine as varchar(50)) as vaccines, seqnum, cnt
from tt
where seqnum = 1
union all
select cte.animal, cte.vaccine_date, cte.vaccines || ',' || tt.vaccine, tt.seqnum, tt.cnt
from cte join
tt
on tt.animal = cte.animal and
tt.vaccine_date = cte.vaccine_date and
tt.seqnum = cte.seqnum + 1
)
select cte.*
from cte
where seqnum = cnt;
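The numbering-then-recursion idea above can be tried without Teradata. The following is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python; SQLite is only a stand-in (an assumption, as are the ISO date strings and the `combine` helper name), and Teradata-specific details such as column widths do not apply here.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings (assumption).
rows = [
    ("Cat", "2016-02-01", "y"),
    ("Cat", "2016-02-01", "z"),
    ("Dog", "2016-02-01", "z"),
    ("Dog", "2016-01-01", "x"),
    ("Dog", "2016-02-01", "y"),
]

QUERY = """
WITH RECURSIVE
tt AS (
    -- number the vaccines inside each (animal, vaccine_date) group
    SELECT animal, vaccine_date, vaccine,
           ROW_NUMBER() OVER (PARTITION BY animal, vaccine_date ORDER BY vaccine) AS seqnum,
           COUNT(*)     OVER (PARTITION BY animal, vaccine_date) AS cnt
    FROM TableOne
),
cte AS (
    -- seed: the first vaccine of every group
    SELECT animal, vaccine_date, vaccine AS vaccines, seqnum, cnt
    FROM tt
    WHERE seqnum = 1
    UNION ALL
    -- step: append exactly the next vaccine in sequence
    SELECT cte.animal, cte.vaccine_date, cte.vaccines || ',' || tt.vaccine, tt.seqnum, tt.cnt
    FROM cte
    JOIN tt
      ON tt.animal = cte.animal
     AND tt.vaccine_date = cte.vaccine_date
     AND tt.seqnum = cte.seqnum + 1
)
-- keep only the row in which the whole group has been concatenated
SELECT animal, vaccine_date, vaccines
FROM cte
WHERE seqnum = cnt
ORDER BY animal, vaccine_date
"""

def combine(sample_rows):
    """Load the sample rows into SQLite and return one row per group."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE TableOne (animal TEXT, vaccine_date TEXT, vaccine TEXT)")
    con.executemany("INSERT INTO TableOne VALUES (?, ?, ?)", sample_rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result

result = combine(rows)
```

Because each step joins on `seqnum + 1`, every vaccine is appended once, which avoids the repeated "z" seen in the question.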

If your Teradata Database version is 14.10 or higher, it supports the XML data type. This also means that the XMLAGG function is supported, which would be useful in your case and would let you avoid recursion.
First check whether the XMLAGG function exists; it is installed with XML Services as a UDF:
SELECT * FROM dbc.FunctionsV WHERE FunctionName = 'XMLAGG'
If it does, then the query would look like:
SELECT
animal,
vaccine_date,
TRIM(TRAILING ',' FROM CAST(XMLAGG(vaccine || ',' ORDER BY vaccine) AS VARCHAR(10000))) AS vaccines
FROM
tableone
GROUP BY 1,2
I have no way of testing this atm, but I believe this should work with possibility of minor tweaks.
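For readers who want to see what an ordered aggregate of this kind is expected to return, here is a small sketch in plain Python. It does not call XMLAGG or any Teradata API; the function name `aggregate_vaccines` and the ISO date strings are illustrative assumptions.

```python
from collections import defaultdict

def aggregate_vaccines(rows):
    """Group (animal, vaccine_date, vaccine) rows and join each group's
    vaccines with commas, sorted alphabetically."""
    groups = defaultdict(list)
    for animal, vaccine_date, vaccine in rows:
        groups[(animal, vaccine_date)].append(vaccine)
    return {key: ",".join(sorted(values)) for key, values in groups.items()}

# Sample rows from the question (dates as ISO strings, an assumption).
sample = [
    ("Cat", "2016-02-01", "y"),
    ("Cat", "2016-02-01", "z"),
    ("Dog", "2016-02-01", "z"),
    ("Dog", "2016-01-01", "x"),
    ("Dog", "2016-02-01", "y"),
]
combined = aggregate_vaccines(sample)
```

The sort inside each group plays the role of the ORDER BY inside the aggregate, and joining with a separator avoids the trailing comma that the SQL version has to trim.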

I was able to get the desired results with the following SQL. It does not seem very efficient and is not dynamic. However, I can add extra subqueries as needed to combine more vaccines by animal and date.
select
qrya.animal
,qrya.vaccine_date
,case when qrya.vac1 is not null then qrya.vac1 else null end ||','||case when qrya.animal=qryb.animal and qrya.vaccine_date=qryb.vaccine_date then qryb.Vac2 else 'End' end as vaccine_List
from
(
select
qry1.Animal
,qry1.Vaccine_Date
,case when qry1.Vaccine_Rank = 1 then qry1.vaccine end as Vac1
from
(
select
animal
,vaccine_date
,vaccine
,row_number() over (partition by animal,vaccine_date order by vaccine) as Vaccine_Rank
from TableOne
) as qry1
where vac1 is not null
group by qry1.Animal,
qry1.Vaccine_Date
,case when qry1.Vaccine_Rank = 1 then qry1.vaccine end
) as qrya
join
(
select
qry1.Animal
,qry1.Vaccine_Date
,case when qry1.Vaccine_Rank = 2 then qry1.vaccine end as Vac2
from
(
select
animal
,vaccine_date
,vaccine
,row_number() over (partition by animal,vaccine_date order by vaccine) as Vaccine_Rank
from TableOne
) as qry1
where vac2 is not null
group by qry1.Animal,
qry1.Vaccine_Date
,case when qry1.Vaccine_Rank = 2 then qry1.vaccine end
) as qryb
on qrya.Animal=qryb.Animal

Related

Separating out key value pairs

In T-SQL how can one separate columns; one with the key and the other with the value for strings that follow the pattern below?
Examples of the strings that need to be processed are:
country_code: "US"province_name: "NY"city_name: "Old Chatham"
postal_code: "11746-8031"country_code: "US"province_name: "NY"street_address: "151 Millet Street North"city_name: "Dix Hills"
street_address: "1036 Main Street, Holbrook, NY 11741"
Desired outcome for example 1 would be:
Key | Value
----------------------------
country_code | US
province_name | NY
city_name | Old Chatham
Nice to see Old Chatham ... a little touch of home
My first thought was to "correct" the JSON string, but that got risky.
Here is an option that will parse and pair the key/values
Example or dbFiddle
Select A.*
,C.*
From YourTable A
Cross Apply ( values ( replace(replace(replace(SomeCol,'"',':'),': :',':'),'::',':') ) ) B(CleanString)
Cross Apply (
Select [Key] =max(case when Seq=1 then Val end)
,[Value]=max(case when Seq=0 then Val end)
From (
Select Seq = row_number() over (order by convert(int,[Key])) % 2 -- OPENJSON returns [Key] as text; convert so 10 sorts after 9
,Grp = (row_number() over (order by convert(int,[Key]))-1) / 2
,Val = Value
From OpenJSON( '["'+replace(string_escape(CleanString,'json'),':','","')+'"]' )
Where ltrim(Value)<>''
) C1
Group By Grp
) C
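As a cross-check of the key/value pairing outside SQL Server, the same strings can be parsed with a regular expression. This is only a sketch in Python, not part of the T-SQL answer; the pattern and the `parse_pairs` name are assumptions, and the pattern expects every value to be wrapped in double quotes as in the examples.

```python
import re

# A key is a run of word characters, followed by a colon, optional spaces,
# and a double-quoted value that itself contains no double quote.
PAIR = re.compile(r'(\w+):\s*"([^"]*)"')

def parse_pairs(text):
    """Return the (key, value) pairs found in text, in order of appearance."""
    return PAIR.findall(text)

example1 = 'country_code: "US"province_name: "NY"city_name: "Old Chatham"'
example3 = 'street_address: "1036 Main Street, Holbrook, NY 11741"'

pairs1 = parse_pairs(example1)
pairs3 = parse_pairs(example3)
```

Matching on the quotes rather than splitting on colons means a value that contains a colon would still be kept whole.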

Consolidate information (time series) from two tables

MS SQL Server
I have two tables with different accounts from the same customer:
Table1:
ID | ACCOUNT | FROM | TO
---------------------------------------
1 | A | 01.10.2019 | 01.12.2019
1 | A | 01.02.2020 | 09.09.9999
and table2:
ID | ACCOUNT | FROM | TO
---------------------------------------
1 | B | 01.12.2019 | 01.01.2020
As a result, I want a table that summarizes the history of this customer and shows when they had an active account and when they did not.
Result:
ID | FROM | TO | ACTIV Y/N
-----------------------------------------
1 | 01.10.2019 | 01.01.2020 | Y
1 | 02.01.2020 | 31.01.2020 | N
1 | 01.02.2020 | 09.09.9999 | Y
Can someone help me with some ideas how to proceed?
This is a typical gaps-and-islands problem, and it is not usually easy to solve.
You can achieve your goal using the query below; I will explain it briefly.
You can test it on this db<>fiddle.
First of all, I have unified your two tables into one to simplify the query.
-- ##table1
select 1 as ID, 'A' as ACCOUNT, convert(date,'2019-10-01') as F, convert(date,'2019-12-01') as T into ##table1
union all
select 1 as ID, 'A' as ACCOUNT, convert(date,'2020-02-01') as F, convert(date,'9999-09-09') as T
-- ##table2
select 1 as ID, 'B' as ACCOUNT, convert(date,'2019-12-01') as F, convert(date,'2020-01-01') as T into ##table2
-- ##table3
select * into ##table3 from ##table1 union all select * from ##table2
You can then get your gaps and islands using, for example, a query like this.
It combines a recursive CTE that generates a calendar (cte_cal) with lag and lead operations that fetch the previous/next record to build the gaps.
with
cte_cal as (
select min(F) as D from ##table3
union all
select dateadd(day,1,D) from cte_cal where D <= '2021-01-01'
),
table4 as (
select t1.ID, t1.ACCOUNT, t1.F, isnull(t2.T, t1.T) as T, lag(t2.F, 1,null) over (order by t1.F) as SUP
from ##table3 t1
left join ##table3 t2
on t1.T=t2.F
)
select
ID,
case when T = D then F else D end as "FROM",
isnull(dateadd(day,-1,lead(D,1,null) over (order by D)),'9999-09-09') as "TO",
case when case when T = D then F else D end = F then 'Y' else 'N' end as "ACTIV Y/N"
from (
select *
from cte_cal c
cross apply (
select t.*
from table4 t
where t.SUP is null
and (
c.D = t.T or
c.D = dateadd(day,1,t.T)
)
) t
union all
select F, * from table4 where T = '9999-09-09'
) p
order by 1
option (maxrecursion 0)
Dates like '9999-09-09' must be treated as exceptions; otherwise I would have to generate a calendar up to that date, and the query would take a long time to run.
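The merge-then-find-gaps logic can also be sketched without a calendar table, which sidesteps the far-future date entirely. The following Python sketch is not the T-SQL answer above but a different technique (sorting and merging intervals); it assumes a single customer ID, a non-empty input, and inclusive date ranges, and the `timeline` name is illustrative.

```python
from datetime import date, timedelta

def timeline(intervals):
    """Merge overlapping or touching (start, end) intervals into islands and
    emit the gaps between them as (start, end, 'Y' or 'N') tuples."""
    ordered = sorted(intervals)
    one_day = timedelta(days=1)
    result = []
    cur_start, cur_end = ordered[0]
    for start, end in ordered[1:]:
        if (start - cur_end).days <= 1:
            # overlaps or directly follows the current island: extend it
            cur_end = max(cur_end, end)
        else:
            # close the island, then record the inactive period before the next one
            result.append((cur_start, cur_end, "Y"))
            result.append((cur_end + one_day, start - one_day, "N"))
            cur_start, cur_end = start, end
    result.append((cur_start, cur_end, "Y"))
    return result

# Rows of Table1 and table2 from the question, combined into one list.
accounts = [
    (date(2019, 10, 1), date(2019, 12, 1)),
    (date(2020, 2, 1), date(9999, 9, 9)),
    (date(2019, 12, 1), date(2020, 1, 1)),
]
history = timeline(accounts)
```

Since only interval boundaries are compared, an open-ended date such as 9999-09-09 costs nothing extra.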

SQL query to get column names if it has specific value

I have a table in which each column holds a flag (such as 'Y' or 'N'). For a given row, I need to select the names of the columns that hold a specific value.
My Table:
Name|sub-1|sub-2|sub-3|sub-4|sub-5|sub-6|
-----------------------------------------
Tom | Y | | Y | Y | | Y |
Jim | Y | Y | | | Y | Y |
Ram | | Y | | Y | Y | |
So I need to get all the subs that have the 'Y' flag for a particular Name.
For example:
If I select Tom, I need the list of column names flagged 'Y' in the query output.
Subs
____
sub-1
sub-3
sub-4
sub-6
Your help is much appreciated.
The problem is that your database model is not normalized. If it was properly normalized the query would be easy. So the workaround is to normalize the model "on-the-fly" to be able to make the query:
select col_name
from (
select name, sub_1 as val, 'sub_1' as col_name
from the_table
union all
select name, sub_2, 'sub_2'
from the_table
union all
select name, sub_3, 'sub_3'
from the_table
union all
select name, sub_4, 'sub_4'
from the_table
union all
select name, sub_5, 'sub_5'
from the_table
union all
select name, sub_6, 'sub_6'
from the_table
) t
where name = 'Tom'
and val = 'Y'
The above is standard SQL and should work on any (relational) DBMS.
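To try the UNION ALL approach quickly, here is a minimal sketch that builds the same query against an in-memory SQLite database from Python. SQLite, the `flagged_columns` helper, and the use of NULL for the blank cells are assumptions made for the demonstration; the column names use underscores as in the query above.

```python
import sqlite3

COLUMNS = ["sub_1", "sub_2", "sub_3", "sub_4", "sub_5", "sub_6"]

def flagged_columns(con, name, flag="Y"):
    """Return the column names whose value equals flag for the given name."""
    # One SELECT per column turns the wide row into (name, val, col_name) rows.
    parts = [
        f"SELECT name, {col} AS val, '{col}' AS col_name FROM the_table"
        for col in COLUMNS
    ]
    query = (
        "SELECT col_name FROM (" + " UNION ALL ".join(parts) + ") t "
        "WHERE name = ? AND val = ? ORDER BY col_name"
    )
    return [row[0] for row in con.execute(query, (name, flag))]

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE the_table (name TEXT, sub_1 TEXT, sub_2 TEXT, sub_3 TEXT, "
    "sub_4 TEXT, sub_5 TEXT, sub_6 TEXT)"
)
con.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("Tom", "Y", None, "Y", "Y", None, "Y"),
        ("Jim", "Y", "Y", None, None, "Y", "Y"),
        ("Ram", None, "Y", None, "Y", "Y", None),
    ],
)
tom_subs = flagged_columns(con, "Tom")
jim_subs = flagged_columns(con, "Jim")
```

The column list is a fixed constant here, so interpolating it into the SQL text is safe; the name and the flag are passed as bound parameters.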
The code below works for me.
select t.subs from (select name, u.subs, u.val
from TableName s
unpivot
(
val
for subs in ([sub-1], [sub-2], [sub-3], [sub-4], [sub-5], [sub-6])
) u where u.val='Y') t
where t.name='Tom'
I am somewhat close to the solution. I can get the result for all rows (I used just 2 columns here).
select col from ( select col, case s.col when 'sub-1' then sub-1 when 'sub-2' then sub-2 end AS val from mytable cross join ( select 'sub-1' AS col union all select 'sub-2' ) s ) s where val ='Y'
It gives the columns for all rows. I need the same data for a single row: if I select "Tom", I need the column names that have the 'Y' value.
I'm answering this under a few assumptions. The first is that you KNOW the names of the columns of the table in question. The second is that this is SQL Server. Oracle and MySQL have ways of doing this, but I don't know the syntax for them.
Anyway, what I'd do is perform an 'UNPIVOT' on the data.
There are a lot of parens involved, so to explain: the actual 'unpivot' clause (aliased as unpvt) takes the data and twists the columns into rows, and the SELECT associated with it provides the data that is being returned. Here I used the 'Name' column, and placed the column names under the 'Subs' column and the corresponding values in the 'Val' column. To be precise, I'm talking about this part of the code:
SELECT [Name], [Subs], [Val]
FROM
(SELECT [Name], [Sub-1], [Sub-2], [Sub-3], [Sub-4], [Sub-5], [Sub-6]
FROM pvt) p
UNPIVOT
([Val] FOR [Subs] IN
([Sub-1], [Sub-2], [Sub-3], [Sub-4], [Sub-5], [Sub-6])
) AS unpvt
My next step was to make that a 'sub-select' where I could find the specific name and val that was being hunted for. That would leave you with a SQL Statement that looks something along these lines
SELECT [Name], [Subs], [Val]
FROM (
SELECT [Name], [Subs], [Val]
FROM
(SELECT [Name], [Sub-1], [Sub-2], [Sub-3], [Sub-4], [Sub-5], [Sub-6]
FROM pvt) p
UNPIVOT
([Val] FOR [Subs] IN
([Sub-1], [Sub-2], [Sub-3], [Sub-4], [Sub-5], [Sub-6])
) AS unpvt
) AS pp
WHERE 1 = 1
AND pp.[Val] = 'Y'
AND pp.[Name] = 'Tom'
select col from (
select col,
case s.col
when 'sub-1' then sub-1
when 'sub-2' then sub-2
when 'sub-3' then sub-3
when 'sub-4' then sub-4
when 'sub-5' then sub-5
when 'sub-6' then sub-6
end AS val
from mytable
cross join
(
select 'sub-1' AS col union all
select 'sub-2' union all
select 'sub-3' union all
select 'sub-4' union all
select 'sub-5' union all
select 'sub-6'
) s on name="Tom"
) s
where val ='Y'
I included the join condition as
on name="Tom"

sort table with null as "insignificant"

I have a table with two columns: col_order (int) and name (text). I would like to retrieve ordered rows such that, when col_order is not null, it determines the order, but when it's null, name determines the order. I thought of an order by clause such as
order by coalesce( col_order, name )
However, this won't work because the two columns have different types. I am considering converting both into bytea, but: 1) to convert the integer, is there a better method than looping, taking the value modulo 256, and stacking up individual bytes in a function, and 2) how do I convert "name" to ensure some sort of sane collation order (assuming name has an order ... well, citext would be nice but I haven't bothered to rebuild to get that ... UTF8 for the moment).
Even if all this is possible (suggestions on details welcome) it seems like a lot of work. Is there a better way?
EDIT
I got an excellent answer by Gordon, but it shows I didn't phrase the question correctly. I want a sort order by name where col_order represents places where this order is overridden. This isn't a well posed problem, but here is one acceptable solution:
col_order| name
----------------
null | a
1 | foo
null | foo1
2 | bar
That is, when col_order is null, the name should be inserted after the closest name that precedes it alphabetically. Otherwise, this could be achieved with:
order by col_order nulls last, name
EDIT 2
Ok ... to get your creative juices flowing, this seems to be going in the right direction:
with x as ( select *,
case when col_order is null then null else row_number() over (order by col_order) end as ord
from temp )
select col_order, name, coalesce( ord, lag(ord,1) over (order by name) + .5) as ord from x;
It gets the order from the previous row, sorted by name, when there is no col_order. It isn't right in general... I guess I'd have to go back to the first row with non-null col_order ... it would seem that sql standard has "ignore nulls" for window functions which might do this, but isn't implemented in postgres. Any suggestions?
EDIT 3
The following would seem close -- but doesn't work. Perhaps window evaluation is a bit strange with recursive queries.
with recursive x(col_order, name, n) as (
select col_order, name, case when col_order is null then null
else row_number() over (order by col_order) * t end from temp, tot
union all
select col_order, name,
case when row_number() over (order by name) = 1 then 0
else lag(n,1) over (order by name) + 1 end from x
where x.n is null ),
tot as ( select count(*) as t from temp )
select * from x;
Just use multiple clauses:
order by (case when col_order is not null then 1 else 2 end),
col_order,
name
When col_order is not null, then 1 is assigned for the first sort key. When it is null, then 2 is assigned. Hence, the not-nulls will be first.
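The effect of the multi-key ORDER BY can be checked with a short sketch using SQLite from Python. SQLite is a stand-in for Postgres here, and the table name `items` is hypothetical (the question's table is called temp); the rows are the four from the question's example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (col_order INTEGER, name TEXT)")
con.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(None, "a"), (1, "foo"), (None, "foo1"), (2, "bar")],
)

# First key: 1 for rows that have a col_order, 2 for rows that do not,
# so rows with an explicit order come first. Ties fall through to
# col_order and then to name.
ordered = con.execute(
    """
    SELECT col_order, name
    FROM items
    ORDER BY (CASE WHEN col_order IS NOT NULL THEN 1 ELSE 2 END),
             col_order,
             name
    """
).fetchall()
```

This puts the rows that have a col_order first and the rest after them by name; it does not interleave the null rows among the others, which is the behaviour the edited question goes on to ask for.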
Ok .. the following seems to work -- I'll leave the question "unanswered" though pending criticism or better suggestions:
Using the last_agg aggregate from here:
with
tot as ( select count(*) as t from temp ),
x as (
select col_order, name,
case when col_order is null then null
else (row_number() over (order by col_order)) * t end as n,
row_number() over (order by name) - 1 as i
from temp, tot )
select x.col_order, x.name, coalesce(x.n,last_agg(y.n order by y.i)+x.i, 0 ) as n
from x
left join x as y on y.name < x.name
group by x.col_order, x.n, x.name, x.i
order by n;

Sql query to return one single record per each combination in a table

I need, for every combination of (from_id, to_id), the row that has the minimum value among the loops matching a criterion.
So basically I need the loop that has the minimum value, e.g. from A to B I need the minimum value and its loop_id.
The table has the following fields:
value from_id to_id loop_id
-------------------------------------
2.3 A B 2
0.1 A C 2
2.1 A B 4
5.4 A C 4
So the result would be:
value from_id to_id loop_id
-------------------------------------
2.1 A B 4
0.1 A C 2
I have tried with the following:
SELECT t.value, t.from_id, t.to_id,t.loop_id
FROM myresults t
INNER JOIN (
SELECT min(m.value), m.from_id, m.to_id, m.loop_id
FROM myresults m where m.loop_id % 2 = 0
GROUP BY m.from_id, m.to_id, m.loop_id
) x
ON (x.from_id = t.from_id and x.to_id=t.to_id and x.loop_id=t.loop_id )
AND x.from_id = t.from_id and x.to_id=t.to_id and x.loop_id=t.loop_id
But it is returning all the loops.
Thanks in advance!
As I understand the problem this will work:
SELECT t.value, t.from_id, t.to_id, t.loop_id
FROM MyResults t
INNER JOIN
( SELECT From_ID, To_ID, MIN(Value) [Value]
FROM MyResults
WHERE Loop_ID % 2 = 0
GROUP BY From_ID, To_ID
) MinT
ON MinT.From_ID = t.From_ID
AND MinT.To_ID = t.To_ID
AND MinT.Value = t.Value
However, if you had duplicate values for a From_ID and To_ID combination e.g.
value from_id to_id loop_id
-------------------------------------
0.1 A B 2
0.1 A B 4
This would return both rows.
If you are using SQL-Server 2005 or later and you want the duplicate rows as stated above you could use:
SELECT Value, From_ID, To_ID, Loop_ID
FROM ( SELECT *, MIN(Value) OVER(PARTITION BY From_ID, To_ID) [MinValue]
FROM MyResults
) t
WHERE Value = MinValue
If you did not want the duplicate rows you could use this:
SELECT Value, From_ID, To_ID, Loop_ID
FROM ( SELECT *, ROW_NUMBER() OVER(PARTITION BY From_ID, To_ID ORDER BY Value, Loop_ID) [RowNumber]
FROM MyResults
) t
WHERE RowNumber = 1
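The ROW_NUMBER variant can be exercised on the question's sample data with a small sketch that uses SQLite from Python. SQLite stands in for SQL Server here (an assumption), the alias is written with AS instead of square brackets, and the `loop_id % 2 = 0` filter from the question is included inside the subquery.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE myresults (value REAL, from_id TEXT, to_id TEXT, loop_id INTEGER)")
con.executemany(
    "INSERT INTO myresults VALUES (?, ?, ?, ?)",
    [(2.3, "A", "B", 2), (0.1, "A", "C", 2), (2.1, "A", "B", 4), (5.4, "A", "C", 4)],
)

# Number the rows of each (from_id, to_id) pair from the smallest value up,
# breaking ties by loop_id, and keep only the first row of every pair.
best = con.execute(
    """
    SELECT value, from_id, to_id, loop_id
    FROM (
        SELECT m.*,
               ROW_NUMBER() OVER (PARTITION BY from_id, to_id
                                  ORDER BY value, loop_id) AS rn
        FROM myresults m
        WHERE loop_id % 2 = 0
    ) t
    WHERE rn = 1
    ORDER BY from_id, to_id
    """
).fetchall()
```

Because exactly one row per pair gets row number 1, duplicated minimum values cannot produce extra rows.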
Can't you do this a lot more simply?
SELECT
from_id,
to_id,
MIN(value)
FROM
myresults
WHERE
loop_id % 2 = 0
GROUP BY
from_id,
to_id
Or maybe I'm misunderstanding the question.
EDIT: To include loop_id
SELECT
m2.from_id,
m2.to_id,
m2.value,
m2.loop_id
FROM
myresults m2 INNER JOIN
(SELECT
m1.from_id,
m1.to_id,
MIN(m1.value) AS value
FROM
myresults m1
WHERE
m1.loop_id % 2 = 0
GROUP BY
m1.from_id,
m1.to_id) minset
ON
m2.from_id = minset.from_id
AND m2.to_id = minset.to_id
AND m2.value = minset.value