SQL script to aggregate column values

SQL script to aggregate column values - sql

i'd appreciate some help putting together a sql script to copy data from one table to another. essentially what i need to do for each row in the source table is aggregate the column values and store them into a single column in the target table.
TableA: ID, ColumnA, ColumnB, ColumnC
TableB: Identity, ColumnX
so, ColumnX needs to be something like 'ColumnA, ColumnB, ColumnC'.
in addition though, i need to keep track of each TableA.ID -> SCOPE_IDENTITY() mapping in order to update a third table.
thanks in advance!
EDIT: TableA.ID is not the same as TableB.Identity. TableB.Identity will return a new identity value on insert. so either i need to store the mapping in a temp table or update TableC with each insert into TableB.

Here is a row-by-row processing example. This will provide you the results in a way where you can process each row at a time. Or you can use TableC at the end and do whatever processing you need to do.
However, it would be a lot faster if you added an extra column to TableB (Called TableA_ID) and just INSERTED the result into it. You would have instant access to TableA.ID and TableB.Identity. But without knowing your exact situation, this may not be feasible. (But you could always add the column and then drop it afterwards!)
USE tempdb
GO
CREATE TABLE TableA (
ID int NOT NULL PRIMARY KEY,
ColumnA varchar(10) NOT NULL,
ColumnB varchar(10) NOT NULL,
ColumnC varchar(10) NOT NULL
)
CREATE TABLE TableB (
[Identity] int IDENTITY(1,1) NOT NULL PRIMARY KEY,
ColumnX varchar(30) NOT NULL
)
CREATE TABLE TableC (
TableA_ID int NOT NULL,
TableB_ID int NOT NULL,
PRIMARY KEY (TableA_ID, TableB_ID)
)
GO
INSERT INTO TableA VALUES (1, 'A', 'A', 'A')
INSERT INTO TableA VALUES (2, 'A', 'A', 'B')
INSERT INTO TableA VALUES (3, 'A', 'A', 'C')
INSERT INTO TableA VALUES (11, 'A', 'B', 'A')
INSERT INTO TableA VALUES (12, 'A', 'B', 'B')
INSERT INTO TableA VALUES (13, 'A', 'B', 'C')
INSERT INTO TableA VALUES (21, 'A', 'C', 'A')
INSERT INTO TableA VALUES (22, 'A', 'C', 'B')
INSERT INTO TableA VALUES (23, 'A', 'C', 'C')
GO
-- Do row-by-row processing to get the desired results
SET NOCOUNT ON
DECLARE #TableA_ID int
DECLARE #TableB_Identity int
DECLARE #ColumnX varchar(100)
SET #TableA_ID = 0
WHILE 1=1 BEGIN
-- Get the next row to process
SELECT TOP 1
#TableA_ID=ID,
#ColumnX = ColumnA + ColumnB + ColumnC
FROM TableA
WHERE ID > #TableA_ID
-- Check if we are all done
IF ##ROWCOUNT = 0
BREAK
-- Insert row into TableB
INSERT INTO TableB (ColumnX)
SELECT #ColumnX
-- Get the identity of the new row
SET #TableB_Identity = SCOPE_IDENTITY()
-- At this point, you have #TableA_ID and #TableB_Identity.
-- Go to town with whatever extra processing you need to do
INSERT INTO TableC (TableA_ID, TableB_ID)
SELECT #TableA_ID, #TableB_Identity
END
GO
SELECT * FROM TableC
GO
SELECT * FROM TableA
ID ColumnA ColumnB ColumnC
----------- ---------- ---------- ----------
1 A A A
2 A A B
3 A A C
11 A B A
12 A B B
13 A B C
21 A C A
22 A C B
23 A C C
SELECT * FROM TableB
Identity ColumnX
----------- ------------------------------
1 AAA
2 AAB
3 AAC
4 ABA
5 ABB
6 ABC
7 ACA
8 ACB
9 ACC
SELECT * FROM TableC
TableA_ID TableB_ID
----------- -----------
1 1
2 2
3 3
11 4
12 5
13 6
21 7
22 8
23 9

Assuming:
TableB exists
INSERT INTO TableB (ColumnX)
SELECT [TableA]![ColumnA]+","+[TableA]![ColumnB]+","+[TableA]![ColumnC] AS ColumnX
FROM TableA;

Related

Difference between delete statements

DELETE a
FROM TableA a
JOIN TableB b ON a.Field1 = b.Field1 AND a.Field2 = b.Field2;
vs.
DELETE
FROM TableA
WHERE Field1 IN (
SELECT Field1
FROM TableB
) AND Field2 IN (
SELECT Field2
FROM TableB
);

The logical conditions of the two statements are different.
The first statement will delete any row in TableA if both it's Field1 and Field2 correspond to the equivalent columns of a row in TableB.
The second statement will delete any row in TableA if the value of Field1 exists in Field1 of TableB, and the value of Field2 exists in Field2 of TableB - but that doesn't have to be in the same row.
It's easy to see the difference if you change the delete to select.
Here's an example. First, create and populate sample tables (Please save us this step in your future questions):
CREATE TABLE A
(
AInt int,
AChar char(1)
);
CREATE TABLE B
(
BInt int,
BChar char(1)
);
INSERT INTO A (AInt, AChar) VALUES
(1, 'a'), (2, 'a'), (3, 'a'),
(1, 'b'), (2, 'b'), (3, 'b');
INSERT INTO B (BInt, BChar) VALUES
(1, 'a'),
(2, 'b'),
(3, 'c');
The statements (translated to select statements):
SELECT A.*
FROM A
JOIN B
ON AInt = BInt AND AChar = BChar;
SELECT *
FROM A
WHERE AInt IN (
SELECT BInt
FROM B
) AND AChar IN (
SELECT BChar
FROM B
);
Results:
AInt AChar
1 a
2 b
AInt AChar
1 a
2 a
3 a
1 b
2 b
3 b
And you can see a live demo on DB<>Fiddle

JOIN tables based on 6 or 7 digits key in 1st table with 7 digits key in second table by using CASE and MAX function

Table 1:
Id1 Data1
123123 David
123124 Jan
1231344 Juro
1234126 Marco
Table 2:
Id2 Data2
1231230 Info 1
1231231 Info 2
1231232 Info 3
1231240 Info 4
1231241 Info 5
1231242 Info 6
Each id from Table 1 can have 1 or more matches in Table 2 based on first 6 digits.
For example 123123 from Table 1 matches 1231230, 1231231 and 1231232 in Table 2.
I'm trying to create join to match maximum id2 from Table 2 based on id1 from Table 1.

I would just join using LIKE:
SELECT
tb1.id1,
tb1.data1
MAX(tb2.[id2]) AS id2
FROM [dbo].[table1] tb1
LEFT JOIN [dbo].[table2] tb2
ON tb2.[id2] LIKE CONCAT(tb1.[id1], '%')
GROUP BY
tb1.id1,
tb1.data1
ORDER BY
tb1.id1 DESC;
This approach might still leave open the possibility of using an index on the second table. In any case, it is slightly easier to read than your version.

This is working solution:
SELECT tb1.*,
MAX(tb2.[id2]) as id2
FROM [dbo].[table1] tb1
LEFT JOIN [dbo].[table2] tb2
ON CASE
WHEN LEN(tb1.[id1]) = 7 and tb1.[id1] = tb2.[id2] THEN 1
WHEN LEN(tb1.[id1]) = 6 and tb1.[id1] = SUBSTRING(tb2.[id2],1,6) THEN 1
ELSE 0
END = 1
GROUP BY tb1.id1
,tb1.data1
ORDER BY tb1.id1 desc

You can try this as well:
Declare #t table (id1 varchar(50) , data1 varchar(50))
insert into #t values (123123,'David')
insert into #t values (123124,'Jan')
insert into #t values (1231344,'Juro')
Declare #t1 table (id2 varchar(50) , data2 varchar(50))
insert into #t1 values (1231230,'Info 1')
insert into #t1 values (1231231,'Info 2')
insert into #t1 values (1231232,'Info 3')
insert into #t1 values (1231240,'Info 4')
insert into #t1 values (1231241,'Info 5')
insert into #t1 values (1231242,'Info 6')
select * from #t a JOIN #t1 B
on b.id2 like '%' + a.id1 + '%'

Flag based on combination of variables

I have a table A which looks like below-
Column1 Column2
0001M 80050
0001M 80053
0001M 80076
0001T 0002T
0001T 34800
0001T 34802
0001T 34804
0001T 36000
0001U 80500
0001U 80502
0001U 81105
0001U 81106
CREATE TABLE mytable(
Column1 VARCHAR(5) NOT NULL PRIMARY KEY
,Column2 VARCHAR(5) NOT NULL
);
INSERT INTO mytable(Column1,Column2) VALUES ('0001M','80050');
INSERT INTO mytable(Column1,Column2) VALUES ('0001M','80053');
INSERT INTO mytable(Column1,Column2) VALUES ('0001M','80076');
INSERT INTO mytable(Column1,Column2) VALUES ('0001T','0002T');
INSERT INTO mytable(Column1,Column2) VALUES ('0001T','34800');
INSERT INTO mytable(Column1,Column2) VALUES ('0001T','34802');
INSERT INTO mytable(Column1,Column2) VALUES ('0001T','34804');
INSERT INTO mytable(Column1,Column2) VALUES ('0001T','36000');
INSERT INTO mytable(Column1,Column2) VALUES ('0001U','80500');
INSERT INTO mytable(Column1,Column2) VALUES ('0001U','80502');
INSERT INTO mytable(Column1,Column2) VALUES ('0001U','81105');
INSERT INTO mytable(Column1,Column2) VALUES ('0001U','81106');
I have another table B which has the following columns -
ID SubID
1 0001M
1 80050
1 80053
1 12500
2 0001T
2 0002T
2 34800
2 36000
2 12506
3 80500
3 80502
3 81106
CREATE TABLE mytable(
ID INTEGER NOT NULL PRIMARY KEY
,SubID VARCHAR(5) NOT NULL
);
INSERT INTO mytable(ID,SubID) VALUES (1,'0001M');
INSERT INTO mytable(ID,SubID) VALUES (1,'80050');
INSERT INTO mytable(ID,SubID) VALUES (1,'80053');
INSERT INTO mytable(ID,SubID) VALUES (1,'12500');
INSERT INTO mytable(ID,SubID) VALUES (2,'0001T');
INSERT INTO mytable(ID,SubID) VALUES (2,'0002T');
INSERT INTO mytable(ID,SubID) VALUES (2,'34800');
INSERT INTO mytable(ID,SubID) VALUES (2,'36000');
INSERT INTO mytable(ID,SubID) VALUES (2,'12506');
INSERT INTO mytable(ID,SubID) VALUES (3,'80500');
INSERT INTO mytable(ID,SubID) VALUES (3,'80502');
INSERT INTO mytable(ID,SubID) VALUES (3,'81106');
Both the values of column1 and column2 of tableA should not occur together in SubID column within each distinct ID column of tableB. If it happens, I need to flag it 1. Otherwise 0. For example, (0001M, 80050) and (0001M, 80053) are not allowed to occur together. Since these two combinations exist in ID=1 of table B, it should be flagged 1.
The output should like this -
ID Flag
1 1
2 1
3 0
Reason Flag = 0 for ID=3 --> Since (80500, 80502) and (80502, 81106) come from column2 only (not from both column 1 and 2), they are allowed to occur together, it is flagged 0. I am using SQL Server 2016 version.

select
b1.ID,
max(case
when exists (select 1 from tableA a where a.Column1=b1.SubId and a.Column2=b2.SubId)
then 1 else 0 end) as flag
from
tableB b1
inner join table B b2 on b1.ID=b2.ID
group by b1.ID

you can also use JOINS to get this done like below.
Logic is that if we get id in tableA from tableB for each column and they are same then flag should be 1
see live demo
select
distinct B.id, flag =case when b1.id = b2.id then 1 else NULL end
from
mytableA A
left join mytableB b1 on b1.subid=a.column1
left join mytableB b2 on b2.subid=a.column2
right join
mytableB B
on B.id= case when b1.id = b2.id then b1.id else NULL end

Hmmm . . . This is tricky. I think this does what you want.
select b.id,
(case when count(*) > 0 then 1 else 0 end) as flag
from (select distinct b.id from b) b cross join
a join
b b1
on b1.subid = a.column1 and b1.id = b.id join
b b2
on b2.subid = a2.column2 and b2.id = b.id
group by b.id;

How to know the column name from a table based on the column values

I am working in Informix and I want to know if there is a simple way to know the tabname/colname by its possible column values.
For example:
table1
Register 1
==========
id 1
col1 3
col2 Y
Register 2
==========
id 2
col1 43
col2 X
Register 3
==========
id 2
col1 0
col2 Z
Register 4
==========
id 2
col1 23
col2 F
table2
Register 1
==========
id 1
col1 X
col2 Y
Register 2
==========
id 2
col1 X
col2 X
Register 3
==========
id 2
col1 Z
col2 Z
Register 4
==========
id 2
col1 X
col2 X
table3
Register 1
==========
id 1
col1 ASX
With this database, if I want to know the colnames and their related tabnames of the database that contain X, Y and Z (amoung other values).
It could be something like this:
select tabname, colname
where ('X','Y','Z') in colnamevalues --this has been invented by me
And this should return the following values:
table1.col2
table2.col1
table2.col2
--Note that the columns fetched contains also other values
--different from 'X', 'Y' and 'Z' but T didn't fix in this case
--the whole list of values, only some of them
I have queried for other Q&A but all of them look to use some functions of other databases such as Oracle or SQL Server and I don't understand them very well.

You can get all the tables that exist on a database by querying the systables:
SELECT tabname
FROM systables
WHERE tabtype = 'T' --get only tables
AND tabid > 99; --skip catalog tables
You can join it to the syscolumns table to get the columns:
SELECT t.tabname, c.colname
FROM systables t
INNER JOIN syscolumns c ON (c.tabid = t.tabid)
WHERE t.tabtype = 'T' AND t.tabid > 99;
And if you know the type of values you can even filter it. Example if you're looking for "strings":
SELECT t.tabname, c.colname
FROM systables t
INNER JOIN syscolumns c ON (c.tabid = t.tabid)
WHERE t.tabtype = 'T' AND t.tabid > 99
AND MOD(c.coltype,256) IN (
0, --CHAR
13, --VARCHAR
15, --NCHAR
16, --NVARCHAR
40, --LVARCHAR
43 --LVARCHAR
);
The next example works, but it really should be optimized and bullet proof, but can get you kick off.
When I have time I get another look at it and check what can be optimized and put some error handling.
Another way to do it is scripting, what OS are you running?
Schema creation:
CREATE TABLE tab1(
id INT,
col1 CHAR(3),
col2 CHAR(3)
);
INSERT INTO tab1 VALUES (1, 3, 'Y');
INSERT INTO tab1 VALUES (2, 43, 'X');
INSERT INTO tab1 VALUES (2, 0, 'Z');
INSERT INTO tab1 VALUES (2, 23, 'F');
CREATE TABLE tab2(
id INT,
col1 CHAR(3),
col2 CHAR(3)
);
INSERT INTO tab2 VALUES (1, 'X', 'Y');
INSERT INTO tab2 VALUES (2, 'X', 'X');
INSERT INTO tab2 VALUES (2, 'Z', 'Z');
INSERT INTO tab2 VALUES (2, 'X', 'X');
CREATE TABLE tab3(
id INT,
col1 CHAR(3)
);
INSERT INTO tab3 VALUES (1, 'ASX');
Sample function:
CREATE FUNCTION get_columns()
RETURNING LVARCHAR(257) AS col;
DEFINE stmt VARCHAR(255);
DEFINE tab_name VARCHAR(128,0);
DEFINE tab_id INTEGER;
DEFINE col_name VARCHAR(128,0);
DEFINE o_tname VARCHAR(128,0);
DEFINE o_cname VARCHAR(128,0);
CREATE TEMP TABLE out_table(
t_name VARCHAR(128,0),
c_name VARCHAR(128,0)
);
CREATE TEMP TABLE tab_v (
col1 VARCHAR(255)
);
INSERT INTO tab_v VALUES ('X');
INSERT INTO tab_v VALUES ('Y');
INSERT INTO tab_v VALUES ('Z');
FOREACH tables FOR
SELECT tabname, tabid
INTO tab_name, tab_id
FROM systables
WHERE tabid > 99 AND tabtype = 'T'
FOREACH column FOR
SELECT colname
INTO col_name
FROM syscolumns
WHERE tabid = tab_id
AND MOD(coltype,256) IN (
0, --CHAR
13, --VARCHAR
15, --NCHAR
16, --NVARCHAR
40, --LVARCHAR
43 --LVARCHAR
)
LET stmt = "INSERT INTO out_table "||
"SELECT '"||tab_name||"', '"||col_name||"' "||
"FROM "||tab_name||" "||
"WHERE EXISTS (SELECT 1 FROM tab_v v WHERE v.col1 = "||col_name||");";
EXECUTE IMMEDIATE stmt;
END FOREACH
END FOREACH
FOREACH out FOR
SELECT UNIQUE t_name, c_name
INTO o_tname, o_cname
FROM out_table
RETURN o_tname||"."||o_cname WITH RESUME;
END FOREACH
DROP TABLE out_table;
DROP TABLE tab_v;
END FUNCTION;
EXECUTE FUNCTION get_columns();

How do I join an unknown number of rows to another row?

I have this scenario:
Table A:
---------------
ID| SOME_VALUE|
---------------
1 | 123223 |
2 | 1232ff |
---------------
Table B:
------------------
ID | KEY | VALUE |
------------------
23 | 1 | 435 |
24 | 1 | 436 |
------------------
KEY is a reference to to Table A's ID. Can I somehow join these tables so that I get the following result:
Table C
-------------------------
ID| SOME_VALUE| | |
-------------------------
1 | 123223 |435 |436 |
2 | 1232ff | | |
-------------------------
Table C should be able to have any given number of columns depending on how many matching values that are found in Table B.
I hope this enough to explain what I'm after here.
Thanks.

You need to use a Dynamic PIVOT clause in order to do this.
EDIT:
Ok so I've done some playing around and based on the following sample data:
Create Table TableA
(
IDCol int,
SomeValue varchar(50)
)
Create Table TableB
(
IDCol int,
KEYCol int,
Value varchar(50)
)
Insert into TableA
Values (1, '123223')
Insert Into TableA
Values (2,'1232ff')
Insert into TableA
Values (3, '222222')
Insert Into TableB
Values( 23, 1, 435)
Insert Into TableB
Values( 24, 1, 436)
Insert Into TableB
Values( 25, 3, 45)
Insert Into TableB
Values( 26, 3, 46)
Insert Into TableB
Values( 27, 3, 435)
Insert Into TableB
Values( 28, 3, 437)
You can execute the following Dynamic SQL.
declare #sql varchar(max)
declare #pivot_list varchar(max)
declare #pivot_select varchar(max)
Select
#pivot_list = Coalesce(#Pivot_List + ', ','') + '[' + Value +']',
#Pivot_select = Coalesce(#pivot_Select, ', ','') +'IsNull([' + Value +'],'''') as [' + Value + '],'
From
(
Select distinct Value From dbo.TableB
)PivotCodes
Set #Sql = '
;With p as (
Select a.IdCol,
a.SomeValue,
b.Value
From dbo.TableA a
Left Join dbo.TableB b on a.IdCol = b.KeyCol
)
Select IdCol, SomeValue ' + Left(#pivot_select, Len(#Pivot_Select)-1) + '
From p
Pivot ( Max(Value) for Value in (' + #pivot_list + '
)
)as pvt
'
exec (#sql)
This gives you the following output:
Although this works at the moment it would be a nightmare to maintain. I'd recommend trying to achieve these results somewhere else. i.e not in SQL!
Good luck!

As Barry has amply illustrated, it's possible to get multiple columns using a dynamic pivot.
I've got a solution that might get you what you need, except that it puts all of the values into a single VARCHAR column. If you can split those results, then you can get what you need.
This method is a trick in SQL Server 2005 that you can use to form a string out of a column of values.
CREATE TABLE #TableA (
ID INT,
SomeValue VARCHAR(50)
);
CREATE TABLE #TableB (
ID INT,
TableAKEY INT,
BValue VARCHAR(50)
);
INSERT INTO #TableA VALUES (1, '123223');
INSERT INTO #TableA VALUES (2, '1232ff');
INSERT INTO #TableA VALUES (3, '222222');
INSERT INTO #TableB VALUES (23, 1, 435);
INSERT INTO #TableB VALUES (24, 1, 436);
INSERT INTO #TableB VALUES (25, 3, 45);
INSERT INTO #TableB VALUES (26, 3, 46);
INSERT INTO #TableB VALUES (27, 3, 435);
INSERT INTO #TableB VALUES (28, 3, 437);
SELECT
a.ID
,a.SomeValue
,RTRIM(bvals.BValues) AS ValueList
FROM #TableA AS a
OUTER APPLY (
-- This has the effect of concatenating all of
-- the BValues for the given value of a.ID.
SELECT b.BValue + ' ' AS [text()]
FROM #TableB AS b
WHERE a.ID = b.TableAKEY
ORDER BY b.ID
FOR XML PATH('')
) AS bvals (BValues)
ORDER BY a.ID
;
You'll get this as a result:
ID SomeValue ValueList
--- ---------- --------------
1 123223 435 436
2 1232ff NULL
3 222222 45 46 435 437

This looks like something a database shouldn't do. Firstly; a table cannot have arbitrary number of columns depending on whatever you'll store. So you will have to put up a maximum number of values anyway. You can get around this by using comma seperated values as value for that cell (or a similar pivot-like solution).
However; if you do have table A and B; i recommend keeping to those two tables; as they seem to be pretty normalised. Should you need a list of b.value given an input a.some_value, the following sql query gives that list.
select b.value from a,b where b.key=a.id a.some_value='INPUT_VALUE';

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQL script to aggregate column values - sql

Assuming: TableB exists INSERT INTO TableB (ColumnX) SELECT [TableA]![ColumnA]+","+[TableA]![ColumnB]+","+[TableA]![ColumnC] AS ColumnX FROM TableA;

Related

Difference between delete statements

JOIN tables based on 6 or 7 digits key in 1st table with 7 digits key in second table by using CASE and MAX function

Flag based on combination of variables

How to know the column name from a table based on the column values

How do I join an unknown number of rows to another row?

Categories

Resources