How to create sequential ID number when using SELECT * INTO - sql

I have a complex stored procedure that collects data from many tables and inserts it into MyTable. It inserts over 1.5 million records.
What would be the most efficient way to create a sequential ID number when populating MyTable?
The query that builds the table looks like this:
IF OBJECT_ID ('MyTable', 'U') IS NOT NULL
DROP TABLE MyTable;
SELECT *
INTO MyTable
FROM
(SELECT
col1, col2, col3
FROM
Table1
INNER JOIN
Table2 ON...
INNER JOIN
Table3 ON...
INNER JOIN
Table4 ON...
WHERE
Condition1
AND Condition2) T

SELECT ID = IDENTITY(INT, 1, 1),* INTO MyTable FROM (
SELECT
col1,
col2,
col3
FROM Table1 INNER JOIN Table2 ON...
INNER JOIN Table3 ON...
INNER JOIN Table4 ON...
WHERE Condition1
AND Condition2
) T
ID = IDENTITY(INT, 1, 1) creates an identity column named ID that auto-increments. I didn't test the code.

I think a simple ROW_NUMBER() is helpful, as below:
IF OBJECT_ID ('MyTable', 'U') IS NOT NULL
DROP TABLE MyTable;
SELECT * INTO MyTable FROM (
SELECT
RowNum = Row_Number() over (order by (Select NULL)), -- alternatively, order by any column in the table
col1,
col2,
col3
FROM Table1 INNER JOIN Table2 ON...
INNER JOIN Table3 ON...
INNER JOIN Table4 ON...
WHERE Condition1
AND Condition2
) T

As Gabri demonstrated, you can use IDENTITY with a SELECT INTO statement. This will make that column an IDENTITY column. If you don't want it to be an IDENTITY column, you can use ROW_NUMBER; this will work on any SQL Server 2005+ system.
SELECT ID = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), *
INTO MyTable FROM
(
SELECT
col1,
col2,
col3
FROM Table1 INNER JOIN Table2 ON...
INNER JOIN Table3 ON...
INNER JOIN Table4 ON...
WHERE Condition1 AND Condition2
) T;
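If the numbering needs to be repeatable, note that ORDER BY (SELECT NULL) leaves the row order up to the engine, so a real sort column can be used inside OVER instead. A minimal T-SQL sketch, assuming hypothetical temp tables #Source and #Target that are not part of the original procedure:

```sql
-- Hypothetical source data, for illustration only.
CREATE TABLE #Source (col1 INT, col2 VARCHAR(10));
INSERT INTO #Source (col1, col2) VALUES (30, 'c'), (10, 'a'), (20, 'b');

-- Number the rows in col1 order while creating the target table.
SELECT ID = ROW_NUMBER() OVER (ORDER BY col1), col1, col2
INTO #Target
FROM #Source;

-- The row with col1 = 10 gets ID 1, col1 = 20 gets ID 2, col1 = 30 gets ID 3.
SELECT ID, col1, col2 FROM #Target ORDER BY ID;
```

ROW_NUMBER() returns BIGINT, so the resulting ID column is BIGINT unless the expression is cast, for example with CAST(... AS INT).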

Related

Standard SQL - Delete all rows in table 1 that exist in table 2

Table 1 has 101,915 rows and Table 2 has 49,466 where all of them exist in Table 1. Table 1 should have 101,915 - 49,466 = 52,449 after the query.
I tried querying a left join but it returns only 8,269 rows.
SELECT
A.*
FROM
`table1` A
LEFT JOIN
`table2` B
ON
A.interval_uid = B.interval_uid
WHERE
B.interval_uid IS NULL
I used interval_uid as key field but all the repeated rows are identical in both tables.
Try this. If the schemas are the same, the following should work:
INSERT INTO `table3` SELECT
*
FROM
`table1` A
WHERE
A.interval_uid NOT IN (SELECT B.interval_uid FROM `table2` B)
Try using the EXCEPT operator:
SELECT col1, col2, col3,..
FROM table1
EXCEPT
SELECT col1, col2, col3,..
FROM table2;
Insert query as follows:
Insert into table3
SELECT col1, col2, col3,..
FROM table1
EXCEPT
SELECT col1, col2, col3,..
FROM table2;
Note: The columns listed in the two SELECT statements must match in number, order, and data type between table1 and table2.
Use the query below for your desired result:
WITH CTE
AS (
SELECT *
FROM Table_1
EXCEPT
SELECT *
FROM Table_2
)
INSERT INTO Table_3
SELECT *
FROM CTE
Try this one if you don't want to insert into table 3:
SELECT * FROM Table_1
EXCEPT
SELECT * FROM Table_2
You can use the following logic.
Note: Delete is a risky operation. Please try with test data first.
DELETE table1
FROM table1
INNER JOIN table2 ON table1.interval_uid = table2.interval_uid
This is what I wanted:
WITH
unwanted_rows AS (
SELECT
a.*
FROM
`table` a
JOIN (
SELECT
interval_uid,
COUNT(*)
FROM
`table`
GROUP BY
interval_uid
HAVING
COUNT(*) > 1) b
ON
a.interval_uid = b.interval_uid
WHERE
duration IS NULL
)
SELECT
*
FROM
`table` EXCEPT DISTINCT
SELECT
*
FROM
unwanted_rows
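
The gap between the anti-join and EXCEPT appears to come from interval_uid repeating within a table: a join on the key alone drops every row that shares a key with table2, while EXCEPT DISTINCT compares whole rows. A small sketch in BigQuery standard SQL, assuming made-up sample rows and a single extra column named duration:

```sql
WITH table1 AS (
  SELECT 1 AS interval_uid, 10 AS duration UNION ALL
  SELECT 1, NULL UNION ALL
  SELECT 2, 20
),
table2 AS (
  SELECT 1 AS interval_uid, CAST(NULL AS INT64) AS duration
)
-- Whole-row comparison: only the row (1, NULL) should be removed,
-- leaving (1, 10) and (2, 20).
SELECT * FROM table1
EXCEPT DISTINCT
SELECT * FROM table2;
```

With the same sample rows, a LEFT JOIN on interval_uid filtered by B.interval_uid IS NULL would keep only (2, 20), because both rows with key 1 find a match. Keep in mind that EXCEPT DISTINCT also collapses identical rows in its output, so genuine duplicates in table1 would be returned once.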

How to spread a table into another one

We have a legacy table
create table table1 (col1 int);
insert into table1 values(1);
insert into table1 values(2);
insert into table1 values(3);
SELECT * FROM table1;
1
2
3
Now it gets a new column:
alter table table1 add column col2 int;
alter table table1 ADD CONSTRAINT unique1 UNIQUE (col2);
SELECT * FROM table1;
1;null
2;null
3;null
Then we have another table:
create table table2 (col1 int);
insert into table2 values(7);
insert into table2 values(8);
insert into table2 values(9);
SELECT * FROM table2;
7
8
9
Now we want to spread the values of table2 into table1.col2:
UPDATE table1 up
SET col2 = (SELECT col1
FROM table2 t2
WHERE NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t1.col2=t2.col1)
LIMIT 1);
But the UPDATE statement does not see the rows it has already updated, so every row gets the same value:
ERROR: duplicate key value violates unique constraint "unique1"
Any ideas how to do that? It would be OK if some rows in table1 keep col2 = NULL when table2 has fewer rows than table1.
This seems much easier with a join:
with t2 as (
select t2.*, row_number() over (order by col1) as seqnum
from table2 t2
)
update table1 t1
set col2 = t2.col1
from t2
where t1.col1 = t2.seqnum;
If col1 in table1 is not strictly sequential, you can still do this:
with t2 as (
select t2.*, row_number() over (order by col1) as seqnum
from table2 t2
),
t1 as (
select t1.*, row_number() over (order by col1) as seqnum
from table1 t1
)
update table1 toupdate
set col2 = t2.col1
from t1 join
t2
on t1.seqnum = t2.seqnum
where toupdate.col1 = t1.col1;
I think I found a solution:
WITH rownumbers AS (
SELECT col1, row_number() over (partition by 1 ORDER BY col1) FROM table1
)
UPDATE table1 up SET col2 =
(SELECT col1 FROM table2 t2 WHERE NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t1.col2=t2.col1)
LIMIT 1 OFFSET (
SELECT row_number-1 FROM rownumbers WHERE col1=up.col1
)
)
Any cons about it?
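
One possible drawback of the LIMIT ... OFFSET approach is that the inner query has no ORDER BY, so which table2 row each offset returns is not guaranteed. A sketch for PostgreSQL that adds an explicit order, reusing the tables from the question; the alias rn is an assumption introduced here for readability:

```sql
-- Setup based on the question.
CREATE TABLE table1 (col1 int, col2 int UNIQUE);
INSERT INTO table1 (col1) VALUES (1), (2), (3);
CREATE TABLE table2 (col1 int);
INSERT INTO table2 VALUES (7), (8), (9);

WITH rownumbers AS (
    -- Position of each table1 row, starting at 1.
    SELECT col1, row_number() OVER (ORDER BY col1) AS rn
    FROM table1
)
UPDATE table1 up
SET col2 = (
    SELECT t2.col1
    FROM table2 t2
    WHERE NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t1.col2 = t2.col1)
    ORDER BY t2.col1  -- makes the OFFSET pick a well-defined row
    LIMIT 1 OFFSET (SELECT rn - 1 FROM rownumbers r WHERE r.col1 = up.col1)
);

-- Expected pairs: (1, 7), (2, 8), (3, 9).
SELECT col1, col2 FROM table1 ORDER BY col1;
```

The scalar subquery is evaluated once per updated row, so on large tables the join-based versions above are likely to scale better.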

INSERT SELECT TOP (n) ORDER BY column not included in the destination table

Is it possible to insert from a SELECT when the SELECT statement has more columns than the table being inserted into?
Consider this scenario:
INSERT INTO table_1 --table_1 consists of only one column
SELECT TOP 10 col1, col2 --col2 is selected only because it is used in ORDER BY
FROM table_2
ORDER BY col2 DESC
The above statement results in an error. One way to accomplish this is to use a subquery like this:
INSERT INTO table_1
SELECT TOP 10 col1
FROM (
SELECT col1, col2
FROM table_2
) AS t
ORDER BY col2 DESC
But I'm wondering whether there is a more straightforward way, for example using the equals operator as in an UPDATE statement.
UPDATE
I apologize for submitting an oversimplified example. I took it for granted that it would apply to my scenario without actually testing it.
This is the reproduced context of my query (tested on sqlfiddle, as I have no SQL Server installed on my home PC):
CREATE TABLE table_1 (id INT)
CREATE TABLE table_2 (id INT, col2 INT)
CREATE TABLE table_3 (id INT, col2 INT)
INSERT INTO table_2 VALUES (1,3),(2,2),(3,1)
INSERT INTO table_3 VALUES (1,3),(1,2),(3,1)
INSERT INTO table_1
SELECT TOP 1 t.id, t.Qty
FROM table_2
INNER JOIN
(
SELECT table_2.id, COUNT(table_3.id) AS Qty
FROM table_2
INNER JOIN table_3 on table_3.id = table_2.id
GROUP BY table_2.id
) AS t ON (t.id = table_2.id)
ORDER BY t.Qty
The original query is much more complex, therefore I would like to avoid another sub-query if this is possible.
This query results in the following error:
Column name or number of supplied values does not match table definition.: INSERT INTO table_1 SELECT TOP 1 table_1.id FROM table_1 INNER JOIN ( SELECT table_2.id, COUNT(table_3.id) AS Qty FROM table_2 INNER JOIN table_3 on table_3.id = table_2.id GROUP BY table_2.id ) AS t ON (t.id = table_1.id) ORDER BY t.Qty
You don't need to include the ORDER BY column in the select list.
CREATE TABLE table_1 (id INT)
GO
CREATE TABLE table_2 (col1 INT, col2 INT)
GO
INSERT INTO table_2 VALUES (1,3),(2,2),(3,1)
GO
INSERT INTO table_1 --table_1 consists of only one column
SELECT TOP 2 col1
FROM table_2
ORDER BY col2 DESC
Have you tried specifying the column that you are trying to insert into?
INSERT INTO table_1 (id)
SELECT TOP 1 table_1.id
FROM table_1
INNER JOIN
(
SELECT table_2.id, COUNT(table_3.id) AS Qty
FROM table_2
INNER JOIN table_3 on table_3.id = table_2.id
GROUP BY table_2.id
) AS t ON (t.id = table_1.id)
ORDER BY t.Qty
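
Since the asker wanted to avoid another subquery, one option that may fit is to order by the aggregate directly, because in T-SQL an ORDER BY expression does not have to appear in the select list. A sketch using the sample tables from the question; the aliases t2 and t3 are introduced here for illustration:

```sql
CREATE TABLE table_1 (id INT);
CREATE TABLE table_2 (id INT, col2 INT);
CREATE TABLE table_3 (id INT, col2 INT);
INSERT INTO table_2 VALUES (1,3),(2,2),(3,1);
INSERT INTO table_3 VALUES (1,3),(1,2),(3,1);

-- Only id is selected; the count is used purely for ordering.
INSERT INTO table_1 (id)
SELECT TOP 1 t2.id
FROM table_2 AS t2
INNER JOIN table_3 AS t3 ON t3.id = t2.id
GROUP BY t2.id
ORDER BY COUNT(t3.id);

-- With the sample data, id 3 has the lowest count (1), so table_1 holds 3.
SELECT id FROM table_1;
```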

Sum on subqueries on SQL Server

I have a query with some subqueries inside and I want to add a sum query to sum them all.
How can I do that?
example:
Id,
(SELECT COUNT(*) FROM table1 LEFT JOIN table2 on ...) as col1,
(SELECT COUNT(*) FROM table3 LEFT JOIN table4 on ...) as col2,
** Sum of both col1 and col2 here **
Try this:
SELECT ID, col1, col2, [Total] = (col1 + col2)
FROM (
SELECT Id,
(SELECT COUNT(*) FROM table1 LEFT JOIN table2 on ...) as col1,
(SELECT COUNT(*) FROM table3 LEFT JOIN table4 on ...) as col2
FROM [TABLE]) T
Hope that helps.
The easiest way would be to treat your whole query as a subquery:
select Id, col1 + col2 as total
from
(<yourCode>) s
This is needed because a column alias cannot be referenced at the same query level within the SELECT clause.
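
For a concrete picture of the derived-table approach, here is a small self-contained T-SQL sketch. The temp tables #Items, #Orders and #Returns and their contents are hypothetical stand-ins for the elided tables and joins in the question:

```sql
-- Hypothetical sample tables, for illustration only.
CREATE TABLE #Items (Id INT);
CREATE TABLE #Orders (ItemId INT);
CREATE TABLE #Returns (ItemId INT);
INSERT INTO #Items VALUES (1), (2);
INSERT INTO #Orders VALUES (1), (1), (2);
INSERT INTO #Returns VALUES (1);

-- The inner query names the counts; the outer query can then add them.
SELECT Id, col1, col2, col1 + col2 AS Total
FROM (
    SELECT i.Id,
           (SELECT COUNT(*) FROM #Orders o WHERE o.ItemId = i.Id) AS col1,
           (SELECT COUNT(*) FROM #Returns r WHERE r.ItemId = i.Id) AS col2
    FROM #Items i
) AS T
ORDER BY Id;
-- Rows: (1, 2, 1, 3) and (2, 1, 0, 1).
```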

Use a Resultset as Table (TSQL)

Say, I have the following:
WITH T1 as (select xxx as MYALIAS FROM T)
T2 as (select yyy from T1)
SELECT T2.ID,
MAX(MYALIAS) AS MAX_ALIAS,
T.AN_ALIAS
FROM T2
WHERE MAX_ALIAS <> T.AN_ALIAS
GROUP BY T2.ID, T.AN_ALIAS
Let's imagine the result as a table named MY_RESULT_TABLE.
I need to join some other columns to MY_RESULT_TABLE.
I need to obtain
TEMP_TABLE = MY_RESULT_TABLE + SOME JOINS.
The question is how can I use this MY_RESULT_TABLE?
I tried to do
WITH MY_RESULT_TABLE AS (the code above)... didn't work...
EDIT
My other problem is that I need to insert into MY_TABLE some values from a SELECT on MY_RESULT_TABLE...
INSERT INTO MY_TABLE_TO_INSERT ...
SELECT a, b, c FROM MY_RESULT_TABLE
It should work as you've described; just remember to separate each CTE with a comma.
Edit: You can use CTEs for INSERTs and UPDATEs as well. Updated for the insert case.
;
WITH T1 AS
(
SELECT xxx
AS MYALIAS FROM T
),
T2 AS
(
SELECT yyy
FROM T1
),
MY_RESULT_TABLE AS
(
SELECT T2.ID,
MAX(MYALIAS) AS MAX_ALIAS,
T.AN_ALIAS
FROM T2
WHERE MAX_ALIAS <> T.AN_ALIAS
GROUP BY T2.ID, T.AN_ALIAS
)
INSERT INTO MY_TABLE_TO_INSERT(col1, col2, col3, ...)
SELECT ID, MAX_ALIAS, xyz.OtherColumnsHere
FROM MY_RESULT_TABLE
INNER JOIN xyz on ...
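
Because the snippets above use placeholders, a runnable T-SQL sketch of the same pattern may help. The temp tables #Readings, #Labels and #Target and their columns are hypothetical, and the filter on the aggregate is written with HAVING, since an aggregate cannot be filtered in WHERE:

```sql
-- Hypothetical sample tables, for illustration only.
CREATE TABLE #Readings (ID INT, Val INT);
CREATE TABLE #Labels (ID INT, Label VARCHAR(20));
CREATE TABLE #Target (ID INT, MaxVal INT, Label VARCHAR(20));
INSERT INTO #Readings VALUES (1, 5), (1, 9), (2, 4);
INSERT INTO #Labels VALUES (1, 'first'), (2, 'second');

-- CTEs are separated by commas; the final one plays the role of MY_RESULT_TABLE.
WITH T1 AS (
    SELECT ID, Val AS MYALIAS
    FROM #Readings
),
MY_RESULT_TABLE AS (
    SELECT ID, MAX(MYALIAS) AS MAX_ALIAS
    FROM T1
    GROUP BY ID
    HAVING MAX(MYALIAS) > 4
)
INSERT INTO #Target (ID, MaxVal, Label)
SELECT r.ID, r.MAX_ALIAS, l.Label
FROM MY_RESULT_TABLE AS r
INNER JOIN #Labels AS l ON l.ID = r.ID;

-- Only ID 1 passes the HAVING filter, so #Target holds (1, 9, 'first').
SELECT ID, MaxVal, Label FROM #Target;
```

A CTE is only visible to the single statement that follows it, so the INSERT has to be part of the same statement as the WITH clause.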