Quick comparison of two columns in other TABLE - sql

How to quickly find values ​​in a column that does not contain another column in another table
The problem is the speed of the query that is dynamically built in "execute immediate "stmt and average size of test tables test_table = 40mln and test_table2 = 1mln
Unfortunately I've been able to find similar topics and i will be grateful for any help
My queries:
select pole2 from test_table tt
where exists( select 1 from test_table2 tt2
where tt2.pole1 = 'ABC'
and tt.pole2 != tt.pole2)
select pole2 from test_table tt
where pole2 not in ( select pole2 from test_table2 tt2
where tt2.pole1 = 'ABC')

Related

Postgresql update column based on set of values from another table

Dummy data to illustrate my problem:
create table table1 (category_id int,unit varchar,is_valid bool);
insert into table1 (category_id, unit, is_valid)
VALUES (1, 'a', true), (2, 'z', true);
create table table2 (category_id int,unit varchar);
insert into table2 (category_id, unit)
values(1, 'a'),(1, 'b'),(1, 'c'),(2, 'd'),(2, 'e');
So the data looks like:
Table 1:
category_id
unit
is_valid
1
a
true
2
z
true
Table 2:
category_id
unit
1
a
1
b
1
c
2
d
2
e
I want to update the is_valid column in Table 1, if the category_id/unit combination from Table 1 doesn't match any of the rows in Table 2. For example, the first row in Table 1 is valid, since (1, a) is in Table 2. However, the second row in Table 1 is not valid, since (2, z) is not in Table 2.
How can I update the column using postgresql? I tried a few different where clauses of the form
UPDATE table1 SET is_valid = false WHERE...
but I cannot get a WHERE clause that works how I want.
You can just set the value of is_valid the the result of a ` where exists (select ...). See Demo.
update table1 t1
set is_valid = exists (select null
from table2 t2
where (t2.category_id, t2.unit) = (t1.category_id, t1.unit)
);
NOTES:
Advantage: Query correctly sets the is_valid column regardless of the current value and is a vary simple query.
Disadvantage: Query sets the value of is_valid for every row in the table; even thoes already correctly set.
You need to decide whether the disadvantage out ways the advantage. If so then the same basic technique in a much more complicated query:
with to_valid (category_id, unit, is_valid) as
(select category_id
, unit
, exists (select null
from table2 t2
where (t2.category_id, t2.unit) = (t1.category_id, t1.unit)
)
from table1 t1
)
update table1 tu
set is_valid = to_valid.is_valid
from to_valid
where (tu.category_id, tu.unit) = (to_valid.category_id, to_valid.unit)
and tu.is_valid is distinct from to_valid.is_valid;

Needing system defined function to select updated or unmatched new records from two tables

I am having a live data table in which the old values are placed,in a new table i am moving data from that live table to this one how to find updated or new records that are inserted or updated in new table with out using except,checksum(binary_checksum) and join ,i am looking for a solution using System Defined Function.
The requirement is interesting as the best solutions are to use EXCEPT or a FULL JOIN. What you are trying to do is what is referred to as an left anti semi join. Here's a good article about the topic.
Note this sample data and the solutions (note that my solution that does not use EXCEPT or a join is the last solution):
-- sample data
if object_id('tempdb.dbo.orig') is not null drop table dbo.orig;
if object_id('tempdb.dbo.new') is not null drop table dbo.new;
create table dbo.orig (someid int, col1 int, constraint uq_cl_orig unique (someid, col1));
create table dbo.new (someid int, col1 int, constraint uq_cl_new unique (someid, col1));
insert dbo.orig values (1,100),(2,110),(3,120),(4,2000)
insert dbo.new values (1,100),(2,110),(3,122),(5,999);
Here's the EXCEPT version
select someid
from
(
select * from dbo.new except
select * from dbo.orig
) n
union -- union "distict"
select someid
from
(
select * from dbo.orig except
select * from dbo.new
) o;
Here's a FULL JOIN Solution which will also tell you if the record was removed, changed or added:
select
someid = isnull(n.someid, o.someid),
[status] =
case
when count(isnull(n.someid, o.someid)) > 1 then 'changed'
when max(n.col1) is null then 'removed' else 'added'
end
from dbo.new n
full join dbo.orig o
on n.col1=o.col1 and n.someid = o.someid
where n.col1 is null or o.col1 is null
group by isnull(n.someid, o.someid);
But, because those efficient solutions are not an option - you will need to go with a NOT IN or NOT EXISTS subquery.... And because it has to be a function, I am encapsulating the logic into a function.
create function dbo.newOrChangedOrRemoved()
returns table as return
-- get the new records
select someid, [status] = 'new'
from dbo.new n
where n.someid not in (select someid from dbo.orig)
union all
-- get the removed records
select someid, 'removed'
from dbo.orig o
where o.someid not in (select someid from dbo.new)
union all
-- get the changed records
select someid, 'changed'
from dbo.orig o
where exists
(
select *
from dbo.new n
where o.someid = n.someid and o.col1 <> n.col1
);
Results:
someid status
----------- -------
5 new
4 removed
3 changed

If condition in SQL file vertica

I need to execute insertions for around 10 tables, before inserting I have to check for a condition, condition remains the same for each of the tables, instead of giving that condition within insert query, I wish I could give in if condition (a select query), if satisfied then execute insert statements, is there a way to give if condition in Vertica SQL file ? If condition is not satisfied I dont want to execute any of the insert queries.
If the condition is, for example, that you only insert the data on a Sunday, try this:
a) a test table:
CREATE LOCAL TEMPORARY TABLE input(id,name)
ON COMMIT PRESERVE ROWS AS
SELECT 42,'Arthur Dent'
UNION ALL SELECT 43,'Ford Prefect'
UNION ALL SELECT 44,'Tricia McMillan'
KSAFE 0;
From this table, select with a WHERE condition that tests whether it's sunday -- that's all, see here:
SELECT
*
FROM input
WHERE TRIM(TO_CHAR(CURRENT_DATE,'Day'))='Sunday'
;
id|name
42|Arthur Dent
43|Ford Prefect
44|Tricia McMillan
With a different value for the week day (I'm writing this on a Sunday...), you get this:
SELECT
*
FROM input
WHERE TRIM(TO_CHAR(CURRENT_DATE,'Day'))='Monday'
;
id|name
select succeeded; 0 rows fetched
I use this technique in SQL generating SQL to create a script or an empty file determining on circumstances, and then to call that script (full or empty), implementing a conditional SQL execution that way....
I know it is an old post, but i want to share what i recently solved my problem.
I need to insert in one table or another based on some condition. You first should have a field or value what will be your search condition.
create table tmp1 (
Col1 int null
,Col2 varchar(100) null
)
--Insert values
insert into tmp (Col1,Col2) Values
(1,'Text1')
,(2,'Text2')
--Insert into table001
insert into table001
select
t.field1
,t.field2
,......
from table1 t
inner join tmp t2
on t2.col1 = t.ColX
where 1 = case when t2.Col2 = 'Text1' then 1 else 0 end --Search condition; if 1<>0 then it doesn't do anything; otherwise insert.
--Insert into table002
insert into table002
select
t.field1
,t.field2
,......
from table2 t
inner join tmp t2
on t2.col1 = t.ColX
where 1 = case when t2.Col2 = 'Text2' then 1 else 0 end --Search condition; if 1<>0 then it doesn't do anything; otherwise insert.
Or you can use an UNION/UNION ALL based on this if it is the same working table.
Regards!

SQL Import Data - Insert Only if it doesn't exists

I am using SQL server management tool 2008 to import data to the web host database. I have tables with primary keys. (Id for each row) Now I can import data normally. But when I am importing data for the second time..I need to make sure only those rows that doesn't currently exists only then it's inserted. If there's a to do this using the wizard? If not, then what's the best practice?
Insert the data into a temp table
use left join with main table to identify which records to insert
--
CREATE TABLE T1(col1 int)
go
CREATE TABLE Temp(col1 int )
go
INSERT INTO T1
SELECT 1
UNION
SELECT 2
INSERT INTO TEMP
SELECT 1
UNION
SELECT 2
UNION
SELECT 3
UNION
SELECT 4
INSERT INTO T1
SELECT TEMP.col1
FROM Temp
LEFT JOIN T1
ON TEMP.col1 = T1.col1
WHERE T1.col1 IS NULL
I've used this some time ago, maybe it can help:
insert into timestudio.dbo.seccoes (Seccao,Descricao,IdRegiao,IdEmpresa)
select distinct CCT_COD_CENTRO_CUSTO, CCT_DESIGNACAO, '9', '10' from rhxxi.dbo.RH_CCTT0
where CCT_COD_CENTRO_CUSTO not in (select Seccao from TimeStudio.dbo.Seccoes where idempresa = '10')
Or, just use simple a IF Statement.

PostgreSQL: Select a single-row x amount of times

A single row in a table has a column with an integer value >= 1 and must be selected however many times the column says. So if the column had '2', I'd like the select query to return the single-row 2 times.
How can this be accomplished?
Don't know why you would want to do such a thing, but...
CREATE TABLE testy (a int,b text);
INSERT INTO testy VALUES (3,'test');
SELECT testy.*,generate_series(1,a) from testy; --returns 3 rows
You could make a table that is just full of numbers, like this:
CREATE TABLE numbers
(
num INT NOT NULL
, CONSTRAINT numbers_pk PRIMARY KEY (num)
);
and populate it with as many numbers as you need, starting from one:
INSERT INTO numbers VALUES(1);
INSERT INTO numbers VALUES(2);
INSERT INTO numbers VALUES(3);
...
Then, if you had the table "mydata" that han to repeat based on the column "repeat_count" you would query it like so:
SELECT mydata.*
FROM mydata
JOIN numbers
ON numbers.num <= mydata.repeat_count
WHERE ...
If course you need to know the maximum repeat count up front, and have your numbers table go that high.
No idea why you would want to do this thought. Care to share?
You can do it with a recursive query, check out the examples in
the postgresql docs.
something like
WITH RECURSIVE t(cnt, id, field2, field3) AS (
SELECT 1, id, field2, field3
FROM foo
UNION ALL
SELECT t.cnt+1, t.id, t.field2, t.field3
FROM t, foo f
WHERE t.id = f.id and t.cnt < f.repeat_cnt
)
SELECT id, field2, field3 FROM t;
The simplest way is making a simple select, like this:
SELECT generate_series(1,{xTimes}), a.field1, a.field2 FROM my_table a;