Way to randomly update a set of values using another table as a collection of random values without using a loop? - sql

I have a table with a column that contains some values. I want to replace all the existing values with random values from another table (but only the existing ones - so WHERE COL1 IS NOT NULL). There is no way to correlate the two tables. Each of the random values needs to be different (well... unless they are randomly the same, in which case it's fine).
For example:
CREATE TABLE T1 (ID NUMBER(11,0), COL1 VARCHAR2(20));
CREATE TABLE T2 (COL2 VARCHAR2(20));
INSERT INTO T1 VALUES (1, NULL);
INSERT INTO T1 VALUES (54, NULL);
INSERT INTO T1 VALUES (941, 'Some text');
INSERT INTO T1 VALUES (251, NULL);
INSERT INTO T1 VALUES (352, 'Some other text');
INSERT INTO T1 VALUES (354, NULL);
INSERT INTO T2 VALUES ('Val1');
INSERT INTO T2 VALUES ('Val2');
INSERT INTO T2 VALUES ('Val3');
INSERT INTO T2 VALUES ('Val4');
INSERT INTO T2 VALUES ('Val5');
INSERT INTO T2 VALUES ('Val6');
INSERT INTO T2 VALUES ('Val7');
I have tried a few things and have searched this site for answers. The answers I've found seem to require that there is some correlation between the two tables. A lot of the examples are for SQL Server. I've tried a few out anyway, but I can't seem to get a MERGE or CROSS APPLY approach to work (I appreciate this is most likely my failure...).
The only solution I have that actually works at the moment is the following:
BEGIN
FOR X IN (
SELECT ID FROM T1 WHERE COL1 IS NOT NULL
)
LOOP
UPDATE T1 SET COL1 = (
SELECT COL2 FROM (
SELECT COL2 FROM T2 ORDER BY SYS_GUID()
) WHERE ROWNUM = 1
)
WHERE T1.ID = X.ID;
END LOOP;
END;
/
This produces a (desired) result of:
SELECT * FROM T1;
ID COL1
---------------------------
1
54
941 Val3
251
352 Val7
354
(COL1 values are both random each time I run the loop).
... But I know there must be a set-based way to achieve this... right?

You could do it with a MERGE statement.
merge into t1
using ( select a1.id
, a2.col2
from ( select id
, row_number() over (order by dbms_random.value) rn
from t1
where col1 is not null ) a1
join ( select col2
, row_number() over (order by dbms_random.value) rn
from t2) a2
on a1.rn = a2.rn
) q
on ( q.id = t1.id)
when matched then
update set t1.col1 = q.col2
/
The USING query is a little unorthodox. There are two subqueries, one for each table, which generate analytic row_number() in a random order (this is tidier than using rownum. The two subqueries are joined on the random row numbers which gives a random combination of T1.ID and T2.COL_2. After that, it's a straightforward MERGE.

You can create a FUNCTION named GET_RANDOM_VALUE
create or replace FUNCTION GET_RANDOM_VALUE
RETURN VARCHAR2 AS
sValue VARCHAR2(100) := '?';
BEGIN
select col2 INTO sValue
from t2
order by dbms_random.value
fetch first 1 rows only;
RETURN sValue;
END GET_RANDOM_VALUE;
and the UPDATE command can be
UPDATE t1
SET col1 = GET_RANDOM_VALUE()
WHERE col1 IS NOT NULL;
Normally, I avoid using SELECT in function, but in this case, I found this solution more readable and more easy to understand.

Related

A sql query to create multiple rows in different tables using inserted id

I need to insert a row into one table and use this row's id to insert two more rows into a different table within one transaction. I've tried this
begin;
insert into table default values returning table.id as C;
insert into table1(table1_id, column1) values (C, 1);
insert into table1(table1_id, column1) values (C, 2);
commit;
But it doesn't work. How can I fix it?
updated
You need a CTE, and you don't need a begin/commit to do it in one transaction:
WITH inserted AS (
INSERT INTO ... RETURNING id
)
INSERT INTO other_table (id)
SELECT id
FROM inserted;
Edit:
To insert two rows into a single table using that id, you could do that two ways:
two separate INSERT statements, one in the CTE and one in the "main" part
a single INSERT which joins on a list of values; a row will be inserted for each of those values.
With these tables as the setup:
CREATE TEMP TABLE t1 (id INTEGER);
CREATE TEMP TABLE t2 (id INTEGER, t TEXT);
Method 1:
WITH inserted1 AS (
INSERT INTO t1
SELECT 9
RETURNING id
), inserted2 AS (
INSERT INTO t2
SELECT id, 'some val'
FROM inserted1
RETURNING id
)
INSERT INTO t2
SELECT id, 'other val'
FROM inserted1
Method 2:
WITH inserted AS (
INSERT INTO t1
SELECT 4
RETURNING id
)
INSERT INTO t2
SELECT id, v
FROM inserted
CROSS JOIN (
VALUES
('val1'),
('val2')
) vals(v)
If you run either, then check t2, you'll see it will contain the expected values.
Please find the below query:
insert into table1(columnName)values('stack2');
insert into table_2 values(SCOPE_IDENTITY(),'val1','val2');

SQL INSERT from select with NULL values

Good afternoon in my timezone.
i have to insert a row in a table but one of the columns are values from another table.So what i want to accomplish is something like this
INSERT TABLE_NAME(COL1,COL2,COL3,COL4) VALUES("VAL1","VAL2","VAL3",(SELECT COL_A FROM TABLE2 WHERE COL_B = 'X'))
But i think the above code is not possible so i use the following code:
INSERT INTO TABLE_NAME(COL1,COL2,COL3,COL4)
SELECT "COL1","COL2","COL3", COL_A FROM TABLE2 T2
WHERE COL_B= "X"
My question is:
I want to insert values even the select does not return values and in this case the COL4 will be NULL
How can i achieve this ?
Thanks in advance
Best regards
No, you cannot insert a row that is not in the table.
If you are expecting one row (in the case of a match), you can use aggregation:
INSERT INTO TABLE_NAME(COL1, COL2, COL3, COL4)
SELECT "COL1","COL2","COL3", MAX(COL_A)
FROM TABLE2 T2
WHERE COL_B = 'X';
This will return NULL if there is no match -- but only one row even if there are multiple matches in the table.
As i understood from your description, only one column is NULL, other 3 columns have values. You should use Select INTO as https://www.w3schools.com/sql/sql_select_into.asp
SELECT COL1,COL2,COL3,COL4 INTO TABLE2 FROM TABLE_NAME WHERE COL_B= "X"
You can create a temp table to store the primary keys of TABLE2 (for eg TABLE2 has primary key 'X', 'Y')
CREATE TABLE #TempPK (COLB int null);
insert into #TempPK(COLB) values ('X');
insert into #TempPK(COLB) values ('Y');
--then you do a FULL OUTER JOIN on your insert select from statement
INSERT INTO TABLE_NAME(COL1, COL2, COL3, COL4)
SELECT "COL1","COL2","COL3", MAX(COL_A)
FROM TABLE2 T2 FULL OUTER JOIN #TempPK TPK
ON T2.COL_B = TPK.COLB
This way you should be able to insert both rows(X and Y) and the Row X should show NULL values all across the row. Hope it makes sense.

How to add the sum of the 'sum of the two tables?'

I created Tables T1 and T2. I managed to add their sum, but I can't seem to add the sum of the T1 and T2 together (10+12 = 22) by adding a sum() in the beginning of the code.
CREATE TABLE T1(kW int)
CREATE TABLE T2(kW int)
SELECT T1C1, T2C1
FROM
( select SUM(Kw) T1C1 FROM T1 ) A
CROSS JOIN
( select SUM(Kw) T2C1 FROM T2 ) B
BEGIN
INSERT INTO T1 VALUES ('4');
INSERT INTO T1 VALUES ('1');
INSERT INTO T1 VALUES ('5');
INSERT INTO T2 VALUES ('7');
INSERT INTO T2 VALUES ('2');
INSERT INTO T2 VALUES ('3');
END
You should use union all to create a "virtual" column from the columns in the two tables:
SELECT SUM(kw)
FROM (SELECT kw FROM t1
UNION ALL
SELECT kw FROM t2) t
Try using a stored procedure. Doing so you will be able to store the sum of each table on a separated variable and then return the SUM of those two variables.
You can also make a UNION ALL and SUM the column you want. Notice that you should a UNION ALL to avoid eliminating duplicated values.
Another approach is to add the results of the two subqueries directly, using the built-in dummy table dual as the main driving table:
select ( select SUM(Kw) FROM T1 )
+ ( select SUM(Kw) FROM T2 ) as total
from dual;
TOTAL
----------
22

select table to select from, dependent on column-value of already given table

for my intention I have to select a table to select columns from dependent on the column-value of an already given table.
First I thought about a CASE construct, if this is possible with sqlite.
SELECT * FROM
CASE IF myTable.column1 = "value1" THEN (SELECT * FROM table1 WHERE ...)
ELSE IF myTable.column1 = "value2" THEN (SELECT * FROM table2 WHERE ...)
END;
I am new to SQL. What construct would be the most concise (not ugly) solution and if I cannot have it in sqlite, what RDBM would be the best fit?
Thanks
Here is a proposal for associating a value from one of two tables for each entry in mytable. I.e. this is making the assumption that mytable does not only contain a single entry for choosing the secondary table.
For details on what this means, see "MCVE" at the end of this answer.
If you want to switch between two secondary tables, based on a single entry in main table, see at the very end of this answer.
Details:
a hardcoded "value1"/"value2" as column1 added on the fly to the result from secondary tables
joining by the faked colummn1 and a secondary join-key, assumption here id
a union all to make a single table from both secondary tables (including the fake column1)
select *
from mytable
left join
(select 'value1' as column1, * from table1
UNION ALL
select 'value2' as column1, * from table2)
using(id, column1);
Output (for the MCVE provided below, "a-f" from table1, "A-Z" from table2):
value1|1|a
value2|2|B
value1|3|c
value1|4|d
value2|5|E
value2|6|F
MCVE:
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE mytable (column1 varchar(10), id int);
INSERT INTO mytable VALUES('value1',1);
INSERT INTO mytable VALUES('value2',2);
INSERT INTO mytable VALUES('value1',3);
INSERT INTO mytable VALUES('value1',4);
INSERT INTO mytable VALUES('value2',5);
INSERT INTO mytable VALUES('value2',6);
CREATE TABLE table2 (value varchar(2), id int);
INSERT INTO table2 VALUES('F',6);
INSERT INTO table2 VALUES('E',5);
INSERT INTO table2 VALUES('D',4);
INSERT INTO table2 VALUES('C',3);
INSERT INTO table2 VALUES('B',2);
INSERT INTO table2 VALUES('A',1);
CREATE TABLE table1 (value varchar(2), id int);
INSERT INTO table1 VALUES('a',1);
INSERT INTO table1 VALUES('b',2);
INSERT INTO table1 VALUES('c',3);
INSERT INTO table1 VALUES('d',4);
INSERT INTO table1 VALUES('e',5);
INSERT INTO table1 VALUES('f',6);
COMMIT;
For selecting between two tables based on a single entry in main table (in this case "mytable2":
select * from table1 where (select column1 from mytable2) = 'value1'
union all
select * from table2 where (select column1 from mytable2) = 'value2';
Output (with mytable2 only containing 'value1'):
a|1
b|2
c|3
d|4
e|5
f|6

How can I delete duplicate rows in a table

I have a table with say 3 columns. There's no primary key so there can be duplicate rows. I need to just keep one and delete the others. Any idea how to do this is Sql Server?
I'd SELECT DISTINCT the rows and throw them into a temporary table, then drop the source table and copy back the data from the temp.
EDIT: now with code snippet!
INSERT INTO TABLE_2
SELECT DISTINCT * FROM TABLE_1
GO
DELETE FROM TABLE_1
GO
INSERT INTO TABLE_1
SELECT * FROM TABLE_2
GO
Add an identity column to act as a surrogate primary key, and use this to identify two of the three rows to be deleted.
I would consider leaving the identity column in place afterwards, or if this is some kind of link table, create a compound primary key on the other columns.
The following example works as well when your PK is just a subset of all table columns.
(Note: I like the approach with inserting another surrogate id column more. But maybe this solution comes handy as well.)
First find the duplicate rows:
SELECT col1, col2, count(*)
FROM t1
GROUP BY col1, col2
HAVING count(*) > 1
If there are only few, you can delete them manually:
set rowcount 1
delete from t1
where col1=1 and col2=1
The value of "rowcount" should be n-1 times the number of duplicates. In this example there are 2 dulpicates, therefore rowcount is 1. If you get several duplicate rows, you have to do this for every unique primary key.
If you have many duplicates, then copy every key once into anoher table:
SELECT col1, col2, col3=count(*)
INTO holdkey
FROM t1
GROUP BY col1, col2
HAVING count(*) > 1
Then copy the keys, but eliminate the duplicates.
SELECT DISTINCT t1.*
INTO holddups
FROM t1, holdkey
WHERE t1.col1 = holdkey.col1
AND t1.col2 = holdkey.col2
In your keys you have now unique keys. Check if you don't get any result:
SELECT col1, col2, count(*)
FROM holddups
GROUP BY col1, col2
Delete the duplicates from the original table:
DELETE t1
FROM t1, holdkey
WHERE t1.col1 = holdkey.col1
AND t1.col2 = holdkey.col2
Insert the original rows:
INSERT t1 SELECT * FROM holddups
btw and for completeness: In Oracle there is a hidden field you could use (rowid):
DELETE FROM our_table
WHERE rowid not in
(SELECT MIN(rowid)
FROM our_table
GROUP BY column1, column2, column3... ;
see: Microsoft Knowledge Site
Here's the method I used when I asked this question -
DELETE MyTable
FROM MyTable
LEFT OUTER JOIN (
SELECT MIN(RowId) as RowId, Col1, Col2, Col3
FROM MyTable
GROUP BY Col1, Col2, Col3
) as KeepRows ON
MyTable.RowId = KeepRows.RowId
WHERE
KeepRows.RowId IS NULL
This is a way to do it with Common Table Expressions, CTE. It involves no loops, no new columns or anything and won't cause any unwanted triggers to fire (due to deletes+inserts).
Inspired by this article.
CREATE TABLE #temp (i INT)
INSERT INTO #temp VALUES (1)
INSERT INTO #temp VALUES (1)
INSERT INTO #temp VALUES (2)
INSERT INTO #temp VALUES (3)
INSERT INTO #temp VALUES (3)
INSERT INTO #temp VALUES (4)
SELECT * FROM #temp
;
WITH [#temp+rowid] AS
(SELECT ROW_NUMBER() OVER (ORDER BY i ASC) AS ROWID, * FROM #temp)
DELETE FROM [#temp+rowid] WHERE rowid IN
(SELECT MIN(rowid) FROM [#temp+rowid] GROUP BY i HAVING COUNT(*) > 1)
SELECT * FROM #temp
DROP TABLE #temp
This is a tough situation to be in. Without knowing your particular situation (table size etc) I think that your best shot is to add an identity column, populate it and then delete according to it. You may remove the column later but I would suggest that you should keep it as it is really a good thing to have in the table
After you clean up the current mess you could add a primary key that includes all the fields in the table. that will keep you from getting into the mess again.
Of course this solution could very well break existing code. That will have to be handled as well.
Can you add a primary key identity field to the table?
Manrico Corazzi - I specialize in Oracle, not MS SQL, so you'll have to tell me if this is possible as a performance boost:-
Leave the same as your first step - insert distinct values into TABLE2 from TABLE1.
Drop TABLE1. (Drop should be faster than delete I assume, much as truncate is faster than delete).
Rename TABLE2 as TABLE1 (saves you time, as you're renaming an object rather than copying data from one table to another).
Here's another way, with test data
create table #table1 (colWithDupes1 int, colWithDupes2 int)
insert into #table1
(colWithDupes1, colWithDupes2)
Select 1, 2 union all
Select 1, 2 union all
Select 2, 2 union all
Select 3, 4 union all
Select 3, 4 union all
Select 3, 4 union all
Select 4, 2 union all
Select 4, 2
select * from #table1
set rowcount 1
select 1
while ##rowcount > 0
delete #table1 where 1 < (select count(*) from #table1 a2
where #table1.colWithDupes1 = a2.colWithDupes1
and #table1.colWithDupes2 = a2.colWithDupes2
)
set rowcount 0
select * from #table1
What about this solution :
First you execute the following query :
select 'set rowcount ' + convert(varchar,COUNT(*)-1) + ' delete from MyTable where field=''' + field +'''' + ' set rowcount 0' from mytable group by field having COUNT(*)>1
And then you just have to execute the returned result set
set rowcount 3 delete from Mytable where field='foo' set rowcount 0
....
....
set rowcount 5 delete from Mytable where field='bar' set rowcount 0
I've handled the case when you've got only one column, but it's pretty easy to adapt the same approach tomore than one column. Let me know if you want me to post the code.
How about:
select distinct * into #t from duplicates_tbl
truncate duplicates_tbl
insert duplicates_tbl select * from #t
drop table #t
I'm not sure if this works with DELETE statements, but this is a way to find duplicate rows:
SELECT *
FROM myTable t1, myTable t2
WHERE t1.field = t2.field AND t1.id > t2.id
I'm not sure if you can just change the "SELECT" to a "DELETE" (someone wanna let me know?), but even if you can't, you could just make it into a subquery.