Calculate multiple columns with each other using CTE - sql

I want to build columns that are calculated from each other. (Excuse my English.)
Example:
Id | Column1 | Column2 | Column3
1 | 5 | 5 (same as Column1) | 5 (same as Column2)
2 | 2 | 12 (Column1 current + Column2 previous + Column3 previous = 2+5+5) | 17 (Column2 current + Column3 previous = 12+5)
3 | 3 | 32 (3+12+17) | 49 (32+17)
An easier way to see it:
Id | Column1 | Column2 | Column3
1 | 5 | 5 | 5
2 | 2 | 12 = 2+5+5 | 17 = 12+5
3 | 3 | 32 = 3+12+17 | 49 = 32+17
so complicated??? :-(
The previous issue was calculating Column3 from the newly calculated Column2. Now each new value must be built from the just-calculated Column2 and the previous record's Column3 as well. If you want to have a look at the previous post, here it is.
Here is my previous recursive CTE code. It first calculates Column2 from the previous record's Column2 (c.Column2) in cteCalculation, and then calculates the new Column3 in cte2 from the Column2 just calculated in cteCalculation.
/copied from that previous post/
;with cteCalculation as (
select t.Id, t.Column1, t.Column1 as Column2
from table_1 t
where t.Id = 1
union all
select t.Id, t.Column1, (t.Column1 + c.Column2) as Column2
from table_1 t
inner join cteCalculation c
on t.Id-1 = c.id
),
cte2 as(
select t.Id, t.Column1 as Column3
from table_1 t
where t.Id = 1
union all
select t.Id, (select column2+1 from cteCalculation c where c.id = t.id) as Column3
from table_1 t
inner join cte2 c2
on t.Id-1 = c2.id
)
select c.Id, c.Column1, c.Column2, c2.column3
from cteCalculation c
inner join cte2 c2 on c.id = c2.id
Now I want to extend it so that two columns are calculated from each other's data: use the 2nd column to calculate the 3rd, and use the 3rd to get the new 2nd column value. I hope that makes sense.

This is an example of how to achieve this using a recursive CTE:
create table #tmp (id int identity (1,1), Column1 int)
insert into #tmp values(5)
insert into #tmp values(2)
insert into #tmp values(3);
with counter as
(
SELECT id, Column1, Column1 as Column2, Column1 as Column3 from #tmp WHERE id = 1 -- anchor on the first row explicitly; TOP 1 without ORDER BY is not deterministic
UNION ALL
SELECT t.id, t.Column1,
t.Column1 + counter.Column2 + counter.Column3,
(t.Column1 + counter.Column2 + counter.Column3) + counter.Column3 FROM counter
INNER JOIN #tmp t ON t.id = counter.id + 1
)
select * from counter

You'll need to use a Recursive CTE since the values of subsequent columns are dependent upon earlier results.
Do this in pieces, too. Have your first query just return the correct values for Column1. Your next (recursive CTE) query will add the results for Column2, and so on.
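The recurrence is easy to check outside the database. The sketch below is plain Python rather than SQL, written only to confirm the expected numbers from the question; the function name `running_columns` is made up for illustration and is not part of any answer here.

```python
def running_columns(column1_values):
    """Return (Column1, Column2, Column3) rows for the recurrence in the question.

    First row:  Column2 = Column3 = Column1
    Later rows: Column2 = Column1 + previous Column2 + previous Column3
                Column3 = Column2 + previous Column3
    """
    rows = []
    prev2 = prev3 = None
    for c1 in column1_values:
        if prev2 is None:
            # Anchor row: both derived columns start as Column1.
            c2 = c3 = c1
        else:
            # Recursive step: Column2 first, then Column3 from the new Column2.
            c2 = c1 + prev2 + prev3
            c3 = c2 + prev3
        rows.append((c1, c2, c3))
        prev2, prev3 = c2, c3
    return rows


result = running_columns([5, 2, 3])
print(result)  # [(5, 5, 5), (2, 12, 17), (3, 32, 49)]
```

The anchor branch and the recursive branch correspond to the two halves of a recursive CTE joined by UNION ALL.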

OK, I'm assuming you're doing inserts of various values into Column1 here.
Essentially Column2 always = new Column1 value + old Column2 value + old Column3 value
Column3 = new Column2 value + old Column3 value
so Column3 = (new Column1 value + old Column2 value + old Column3 value) + old Column3 value
So an INSTEAD OF INSERT trigger is probably the easiest way to implement this.
CREATE TRIGGER tr_xxxxx ON Tablename
INSTEAD OF INSERT
AS
-- Assumes Id is supplied by the caller and is consecutive, so the previous row is Id - 1.
-- The first row has no previous row and must be seeded separately.
INSERT INTO Tablename (Id, Column1, Column2, Column3)
SELECT ins.Id, ins.Column1, ins.Column1+t.Column2+t.Column3, ins.Column1+t.Column2+t.Column3+t.Column3
FROM Tablename t INNER JOIN Inserted ins on t.Id = ins.Id - 1
The trigger has access to both the existing (old) values in Tablename t, and the new value being inserted (Inserted.Column1).

Related

Group identifiers/values that are related with each other between multiple columns

I want to group identifiers that are related to each other across multiple columns and assign each group a unique group ID.
Also, when a new row arrives, it should get the right ID, consistent with the group IDs already assigned.
For example:
Col1 | Col2 | Col3 | Col4
AA | NULL | 33 | 12
BB | NULL | 45 | 12
AA | 123 | 65 | 15
CC | 123 | NULL | 42
DD | NULL | 10 | 42
EE | NULL | 20 | NULL
FF | 145 | 33 | NULL
GG | NULL | NULL | 11
Desired result:
Rows 1 and 3 get group ID 1 because they share the same value in Col1 (AA). Row 4 (CC) also gets ID 1 because it shares the Col2 value 123 with row 3 (AA).
If there is any match between rows in any of the columns, they share a generated ID.
Col1 | Col2 | Col3 | Col4 | Group ID
AA | NULL | 33 | 12 | 1
BB | NULL | 45 | 12 | 1
AA | 123 | 65 | 15 | 1
CC | 123 | NULL | 42 | 1
DD | NULL | 10 | 42 | 1
EE | NULL | 20 | NULL | 2
FF | 145 | 33 | NULL | 1
GG | NULL | NULL | 11 | 3
I've been doing some work on this and agree with Kashyap: I cannot find a way to do this in a single statement. You need either a recursive CTE or a loop. Synapse does not currently support recursive CTEs, which leaves using a loop to create the effect you want.
One concern came up while I was working with this. As you continue to add data, you'll have more and more overlaps and could eventually end up with just one group. That depends on your dataset; you might have something you can guarantee will have discrete divisions. The way the script I put together works, a new match will update any group IDs, even in existing data. You could modify it to set group IDs only for new rows, but then you could end up in a situation where one row matches multiple groups.
Certainly not the only option, but this is the script I pulled together. It is dependent on having a unique ID that will remain the same in each iteration. Because the loop uses updates instead of inserts, prepping the data would involve inserting the data into your new table without the group, and you can create your ID at that time using auto-increment or otherwise. The script works best with an INT ID column, but should work with a guid if that is necessary.
So the process is essentially this:
1. Do whatever initial prep you need to insert the data into the table and create an ID.
2. Join the table back onto itself, once for each column that could contain a match.
3. Update the group ID to be the minimum value across the IDs and current group IDs of that set of matches.
4. Check whether another round is needed. Because the minimum ID is used as the group number, each finished group contains a row where ID = group ID.
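The steps above can be sketched outside the database to see where the loop ends up. This is a plain Python illustration, not a translation of the script: it updates rows one at a time and loops until nothing changes instead of using a loop cap, and the name `assign_groups` is invented here. The data is the sample inserted by the script, with `None` standing in for NULL.

```python
def assign_groups(rows):
    """rows maps id -> (col1, col2, col3); None stands for NULL and never matches.

    Each row's group number is repeatedly lowered to the minimum of its own id,
    its current group number, and the ids and group numbers of every row it
    matches in the same column, until a full pass changes nothing.
    """
    group = dict.fromkeys(rows)  # every group number starts as None (NULL)
    changed = True
    while changed:
        changed = False
        for row_id, values in rows.items():
            candidates = [row_id]
            if group[row_id] is not None:
                candidates.append(group[row_id])
            for other_id, other_values in rows.items():
                # A match means equal, non-NULL values in the same column.
                if any(a is not None and a == b
                       for a, b in zip(values, other_values)):
                    candidates.append(other_id)
                    if group[other_id] is not None:
                        candidates.append(group[other_id])
            best = min(candidates)
            if group[row_id] != best:
                group[row_id] = best
                changed = True
    return group


sample = {
    1: (1, 5, 33),
    2: (2, None, 45),
    3: (1, 123, 65),
    4: (3, 123, None),
    5: (10, None, 10),
    6: (5, None, 45),
    7: (6, 145, 33),
}
print(assign_groups(sample))
# {1: 1, 2: 2, 3: 1, 4: 1, 5: 5, 6: 2, 7: 1}
```

Rows 1, 3, 4 and 7 end up in group 1, rows 2 and 6 in group 2, and row 5 alone in group 5, which lines up with the First/Second/Third comments in the sample insert.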
CREATE TABLE #testtable
(
[id] INT NOT NULL,
[col1] INT NOT NULL,
[col2] INT NULL,
[col3] INT NULL,
[groupnumber] INT NULL
)
INSERT INTO #testtable (id, col1, col2, col3)
SELECT 1, 1, 5, 33 UNION ALL -- First
SELECT 2, 2, null, 45 UNION ALL -- Second
SELECT 3, 1, 123, 65 UNION ALL -- First
SELECT 4, 3, 123, null UNION ALL -- First
SELECT 5, 10, null, 10 UNION ALL -- Third
SELECT 6, 5, null, 45 UNION ALL -- Second
SELECT 7, 6, 145, 33 -- First
DECLARE @RemainingRows INT,
@LoopCounter INT, @MaxLoops INT -- To protect against infinite loop
SET @RemainingRows = (SELECT COUNT([id]) FROM #testtable)
SET @LoopCounter = 0;
SET @MaxLoops = 10;
WHILE( @RemainingRows > 0
AND @LoopCounter < @MaxLoops )
BEGIN
WITH combineddata AS
(
SELECT
id,
col1,
col2,
col3,
groupnumber
FROM
#testtable
),
--Create a set a rows that contains all rows and all possible matches
matcheddata AS
(
SELECT
c1.id,
c1.col1 AS c1col1,
c1.col2 AS c1col2,
c1.col3 AS c1col3,
c1.groupnumber AS groupNumber1,
c2.id AS RowNum2,
c2.groupnumber AS groupNumber2,
c3.id AS RowNum3,
c3.groupnumber AS groupNumber3,
c4.id AS RowNum4,
c4.groupnumber AS groupNumber4
FROM
combineddata c1
LEFT JOIN
combineddata c2
ON c1.col1 = c2.col1
LEFT JOIN
combineddata c3
ON c1.col2 = c3.col2
LEFT JOIN
combineddata c4
ON c1.col3 = c4.col3
)
UPDATE #testtable
SET
groupnumber =
CASE
WHEN
NEW.groupnumber IS NULL
THEN
NULL
ELSE
NEW.groupnumber
END
FROM
(
SELECT
id,
c1col1,
c1col2,
c1col3,
MIN(groupnumber) AS GroupNumber
FROM
matcheddata CROSS apply (
SELECT
MIN(c) AS GroupNumber
FROM (VALUES
(id),
(RowNum2),
(RowNum3),
(RowNum4),
(groupNumber1),
(groupNumber2),
(groupNumber3),
(groupNumber4)
) AS v (C)
WHERE
c IS NOT NULL) g
GROUP BY
id,
c1col1,
c1col2,
c1col3
) NEW
INNER JOIN #testtable
ON NEW.id = #testtable.id
SET
@LoopCounter = @LoopCounter + 1
SET
@RemainingRows =
(
SELECT
COUNT(t1.id)
FROM
#testtable t1
LEFT JOIN
#testtable t2
ON t1.groupnumber = t2.[id]
WHERE
t2.id IS NULL
OR t2.id <> t2.groupnumber
)
PRINT 'Remaining Rows: ' + CAST(@RemainingRows AS VARCHAR)
PRINT 'Counter: ' + CAST(@LoopCounter AS VARCHAR);
END
SELECT * FROM #testtable
IF Object_id('tempdb..#testtable') IS NOT NULL
BEGIN
DROP TABLE #testtable
END

add column values of two tables that have the same date

for example, I have 2 tables
resto1
day 1 = 1 2 3 4
resto2
day 1 = 5 6 7 8
I want to add the values of the two tables that have the same date. The result would be:
day_1_earned = 6 8 10 12
please help
Assuming both tables have a date column, let's say named dt_col, you can use the query below to get the required result:
select t1.column1+t2.column2 added_values from
table1 t1,table2 t2
where t1.dt_col = t2.dt_col;
SELECT T1.Column1 + T2.Column As TotalColumn
From T1, T2 WHERE T1.Date= T2.Date
SELECT Cr_date, SUM(Column1) Column1, SUM(Column2) Column2
FROM
(
SELECT Cr_Date,Column1,Column2 FROM #T1
UNION ALL -- plain UNION would drop identical rows from the two tables before summing
SELECT Cr_Date,Column1,Column2 FROM #T2
) as res
GROUP BY Cr_Date

how to scan each row of a table, and update current row based on previous row?

I need to update the current row using the following logic:
if current row is null, then set it as previous row
if current row is not null, then no action
the 1st row is not null, then NULL appears randomly
Those NULLs need to be updated using the logic previously mentioned
e.g.
1. 1
2. null
3. null
4. 2
5. null
6. null
needs to be updated as
1. 1
2. 1
3. 1
4. 2
5. 2
6. 2
How to do it in SQL?
Thanks
Because two NULL values can occur in a row, you need to find the latest preceding non-null value in the table, so I think OUTER APPLY will handle your problem:
CREATE TABLE #TB(ID Int Identity(1, 1), Value Int)
INSERT INTO #TB([Value]) VALUES(1),(Null),(Null),(2),(Null),(Null)
UPDATE G SET G.Value = GG.Value
FROM
#TB AS G
OUTER APPLY
(SELECT
TOP 1 *
FROM
#TB AS GG
WHERE
GG.Value IS NOT NULL
AND
GG.ID < G.ID
ORDER BY
GG.ID DESC
) AS GG
WHERE
G.Value IS NULL
SELECT * FROM #TB AS T
Note, however, that if the first value is NULL it stays NULL, since you have not defined the logic for that scenario.
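To make the intended behaviour concrete, here is a small Python sketch of the same "carry the last non-null value forward" rule, with `None` standing in for NULL. It is only an illustration of the logic, not SQL, and the name `fill_forward` is made up.

```python
def fill_forward(values):
    """Replace each None with the most recent non-None value before it.

    Leading None values stay None because there is nothing to carry forward.
    """
    filled = []
    last_seen = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


print(fill_forward([1, None, None, 2, None, None]))  # [1, 1, 1, 2, 2, 2]
print(fill_forward([None, 3, None]))                 # [None, 3, 3]
print(fill_forward([5, None, 2, None]))              # [5, 5, 2, 2]
```

The last call shows that the rule depends on row order, not on which value is largest.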
This might help:
SELECT
t1.col1,
t1.col2 AS previous,
(SELECT
t2.col2
FROM table_1 t2
WHERE t2.col1 = (SELECT
MAX(t3.col1)
FROM table_1 t3
WHERE t3.col1 <= t1.col1
AND col2 IS NOT NULL))
AS new
FROM table_1 t1;
Where are you using this SQL code? If you are using Hive SQL, for example, there is a function that directly returns the last non-null value:
LAST_VALUE(col, true) over (PARTITION BY id ORDER BY date)
Oracle 10g also has a function to do this, as addressed in this thread:
Fill null values with last non-null amount - Oracle SQL
Are you familiar with window functions?
while (select count(*) FROM Table_1 where c1_derived = '') > 0
begin
update top(1) Table_1
set c1_derived = (select c1_derived from Table_1 t2 where (t2.id = [Table_1].id-1))
where c1_derived = ''
end
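Try the script below (SQL Server 2008+). Note that it relies on MAX, so it only gives the right answer when the non-null values never decrease down the table, as in the sample data.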
CREATE TABLE #table(id Int Identity(1, 1), value Int)
INSERT INTO #table([Value]) VALUES(1),(Null),(Null),(2),(Null),(Null)
;WITH cte AS
(
SELECT ID,Value,ROW_NUMBER() OVER(ORDER BY ID) AS row -- order by ID so the row numbers follow the table order
FROM #table
)
SELECT a.ID,max(b.Value)
FROM cte a
INNER JOIN cte b ON a.row >=b.row
GROUP BY a.ID
drop table #table
Edit 2: here is another script, using "UNBOUNDED PRECEDING" (the ROWS frame needs SQL Server 2012 or later, and the same MAX caveat applies):
CREATE TABLE #table(id Int Identity(1, 1), value Int)
INSERT INTO #table([Value]) VALUES(1),(Null),(Null),(2),(Null),(Null)
select * ,max(t.value) over(order by Id Rows UNBOUNDED PRECEDING) maxValue
from #table t
drop table #table
See the documentation for the OVER clause:
https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql

Cast varchar that holds some strings to integer field in informix

I have 2 columns from 2 tables in a database that I want to compare.
Column1 is on table1 and is an Integer field with entries like the following
column1
147518
187146
169592
Column2 is on table2 and is a Varchar(15) field with various entries, but for this example let's use these 3:
column2
169592
00010000089
DummyId
For my query part of it relies on checking if rows from table1 are linked to the rows in table2, but to do this, I need to compare column1 and column2.
SELECT * FROM table1 WHERE column1 IN (SELECT column2 FROM table2)
The result of this using the data above should be 1 row - 169592
Obviously this won't work (A character to numeric conversion process failed) because they cannot be compared as is, but how do I get the comparison to work?
I have tried
SELECT * FROM table1 WHERE column1 IN (SELECT CAST(column2 AS INTEGER) FROM table2)
and
SELECT * FROM table1 WHERE column1 IN (SELECT (column2::INTEGER) column2 FROM table2)
Using Server Studio 9.1 if that helps.
Try casting the int to a string:
SELECT * FROM table1 WHERE cast(column1 as varchar(15)) IN (SELECT column2 FROM table2)
You can try to use ISNUMERIC as follows:
SELECT * FROM table1 WHERE column1 IN (SELECT CASE WHEN ISNUMERIC(column2) = 1 THEN CAST(column2 AS INT) END FROM table2)
For this purpose there is no need to create a special function that you'll not find on other environments.
Let's create a test case for your example:
CREATE TABLE tab1 (
col1 INT,
col2 INT
);
CREATE TABLE tab2 (
col1 VARCHAR(15)
);
INSERT INTO tab1 VALUES(147518,1);
INSERT INTO tab1 VALUES(187146,2);
INSERT INTO tab1 VALUES(169592,3);
INSERT INTO tab2 VALUES(169592);
INSERT INTO tab2 VALUES('00010000089');
INSERT INTO tab2 VALUES('DummyId');
The first query you ran was like:
SELECT t1.*
FROM tab1 AS t1
WHERE t1.col1 IN (SELECT t2.col1 FROM tab2 AS t2);
This will raise an error because it tries to compare an INT with a VARCHAR
[infx1210@tardis ~]$ finderr 1213
-1213 A character to numeric conversion process failed.
A character value is being converted to numeric form for storage in a
numeric column or variable. However, the character string cannot be
interpreted as a number. It contains some characters other than white
space, digits, a sign, a decimal, or the letter e; or the parts are in
the wrong order, so the number cannot be deciphered.
If you are using NLS, the decimal character or thousands separator
might be wrong for your locale.
[infx1210@tardis ~]$
Then you tried to cast the VARCHAR to an INT, which resulted in the same error. You should have tried it the other way around:
> SELECT t1.*
> FROM tab1 AS t1
> WHERE t1.col1::CHAR(11) IN (SELECT t2.col1 FROM tab2 AS t2);
>
col1 col2
169592 3
1 row(s) retrieved.
>
Also check whether you get faster results using EXISTS:
> SELECT t1.*
> FROM tab1 AS t1
> WHERE EXISTS (
> SELECT 1
> FROM tab2 AS t2
> WHERE t1.col1::CHAR(11) = t2.col1
> );
col1 col2
169592 3
1 row(s) retrieved.
>
Another possible way is to simply join the tables:
> SELECT t1.*
> FROM tab1 AS t1
> INNER JOIN tab2 AS t2
> ON (t1.col1 = t2.col1);
col1 col2
169592 3
1 row(s) retrieved.
>
Part of this question was answered by @Stanislovas Kalašnikovas, who said to use the following:
SELECT * FROM table1 WHERE column1 IN (SELECT CASE WHEN ISNUMERIC(column2) = 1 THEN CAST(column2 AS INT) END FROM table2)
But Informix does not have a built-in ISNUMERIC function, so I created one as follows (named isnumeric2 here, so the query above has to call isnumeric2 instead):
create function isnumeric2(inputstr varchar(15)) returning integer;
define numeric_var decimal(15,0);
define function_rtn integer;
on exception in (-1213)
let function_rtn = 0;
end exception with resume
let function_rtn = 1;
let numeric_var = inputstr;
return function_rtn;
end function;
And then the first query above worked for me.

SELECT DISTINCT for data groups

I have following table:
ID Data
1 A
2 A
2 B
3 A
3 B
4 C
5 D
6 A
6 B
etc. In other words, I have groups of data per ID. You will notice that the data group (A, B) occurs multiple times. I want a query that can identify the distinct data groups and number them, such as:
DataID Data
101 A
102 A
102 B
103 C
104 D
So DataID 102 would represent data (A,B), DataID 103 would represent data (C), etc. This is so that I can rewrite my original table in this form:
ID DataID
1 101
2 102
3 102
4 103
5 104
6 102
How can I do that?
PS. Code to generate the first table:
CREATE TABLE #t1 (id INT, data VARCHAR(10))
INSERT INTO #t1
SELECT 1, 'A'
UNION ALL SELECT 2, 'A'
UNION ALL SELECT 2, 'B'
UNION ALL SELECT 3, 'A'
UNION ALL SELECT 3, 'B'
UNION ALL SELECT 4, 'C'
UNION ALL SELECT 5, 'D'
UNION ALL SELECT 6, 'A'
UNION ALL SELECT 6, 'B'
In my opinion, you have to create a custom aggregate that concatenates the data (for strings, a CLR approach is recommended for performance reasons).
Then I would group by ID and select distinct from the grouping, adding a row_number() or dense_rank() function, your choice. Anyway, it should look like this:
with groupings as (
select concat(data) groups -- concat stands for the custom aggregate described above, not a built-in
from Table1
group by ID
)
select groups, row_number() over (order by groups) from (select distinct groups from groupings) g
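The idea of concatenating each ID's data into one key and then numbering the distinct keys can be tried out in a few lines. The sketch below is Python rather than SQL and only illustrates the logic on the question's sample data; `number_data_groups` and the starting number 101 (taken from the question's desired output) are assumptions for the example.

```python
from collections import defaultdict


def number_data_groups(pairs, first_id=101):
    """Map each id to a DataID shared by all ids that have the same data set."""
    members = defaultdict(set)
    for row_id, data in pairs:
        members[row_id].add(data)

    # One canonical key per id: its data values in sorted order.
    keys = {row_id: tuple(sorted(values)) for row_id, values in members.items()}

    # Dense numbering of the distinct keys, in sorted key order.
    distinct_keys = sorted(set(keys.values()))
    data_ids = {key: first_id + n for n, key in enumerate(distinct_keys)}

    return {row_id: data_ids[key] for row_id, key in keys.items()}


pairs = [(1, 'A'), (2, 'A'), (2, 'B'), (3, 'A'), (3, 'B'),
         (4, 'C'), (5, 'D'), (6, 'A'), (6, 'B')]
print(number_data_groups(pairs))
# {1: 101, 2: 102, 3: 102, 4: 103, 5: 104, 6: 102}
```

Sorting the values before building the key plays the role of a deterministic concatenation order, so (A,B) and (B,A) count as the same group.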
The following query using CASE will give you the result shown below.
From there on, getting the distinct datagroups and proceeding further should not really be a problem.
SELECT
id,
MAX(CASE data WHEN 'A' THEN data ELSE '' END) +
MAX(CASE data WHEN 'B' THEN data ELSE '' END) +
MAX(CASE data WHEN 'C' THEN data ELSE '' END) +
MAX(CASE data WHEN 'D' THEN data ELSE '' END) AS DataGroups
FROM t1
GROUP BY id
ID DataGroups
1 A
2 AB
3 AB
4 C
5 D
6 AB
However, this kind of logic will only work if the "Data" values are both fixed and known beforehand.
In your case, you do say that is the case. However, considering that you also say that there are 1000 of them, this will frankly be a ridiculous-looking query :-)
LuckyLuke's suggestion above would be the more generic and probably saner way to implement the solution in your case.
From your sample data (having added the missing 2,'A' tuple), the following gives the renumbered (and uniquified) data:
with NonDups as (
select t1.id
from #t1 t1 left join #t1 t2
on t1.id > t2.id and t1.data = t2.data
group by t1.id
having COUNT(t1.data) > COUNT(t2.data)
), DataAddedBack as (
select ID,data
from #t1 where id in (select id from NonDups)
), Renumbered as (
select DENSE_RANK() OVER (ORDER BY id) as ID,Data from DataAddedBack
)
select * from Renumbered
Giving:
1 A
2 A
2 B
3 C
4 D
I think then, it's a matter of relational division to match up rows from this output with the rows in the original table.
Just to share my own dirty solution that I'm using for the moment:
SELECT DISTINCT t1.id, D.data
FROM #t1 t1
CROSS APPLY (
SELECT CAST(Data AS VARCHAR) + ','
FROM #t1 t2
WHERE t2.id = t1.id
ORDER BY Data ASC
FOR XML PATH('') )
D ( Data )
And then proceeding analogously to LuckyLuke's solution.