Exclude rows that have same value in two different columns - sql

I have a table which has 2 columns that sometimes have the same values. I want to know how to exclude the rows where the value of column1 is equal to a value in column2.
EXAMPLE:
COL1 | COL2
1 -------- 7
2 -------- 8
3 -------- 2
4 -------- 5
5 -------- 9
Here I would exclude rows 2 and 5.
Thanks

select
*
from yourtable
where col1 not in (
select
col2
from yourtable
)

Something like this should work:
SELECT *
FROM yourtable
WHERE COL1 NOT IN (SELECT COL2
FROM yourtable)

I tend to avoid using IN for long lists of values, as it performs poorly on some database systems. The following selects all values from col1 that are not present in col2:
SELECT col1
FROM
yourtable t1
LEFT JOIN
yourtable t2
ON
t1.col1 = t2.col2
WHERE
t2.col2 IS NULL
Why does it work? Normally a join links together only rows that have the same value. A left join also keeps the rows from the left table that have no match, and those are the ones we want. It takes the table on the left (t1) as the reference table and associates rows from the table on the right (the one after the word JOIN, here t2) with it. If a col1 value has a matching value in col2, the resulting row is fully populated with values from both. If a col1 value has no matching col2 value, the col2 cell on the resulting row is null. Because we want only the values that are not matched, we say "where col2 is null".
The other trick to getting to grips with this is understanding that the same table can appear twice in a query. We give it a different alias each time we use it so we can tell the two apart. You can think of it as virtually making a copy of the table before linking the two together.
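To see the null cells described above, here is a minimal sketch that runs the self-join on the question's sample data. It assumes SQLite through Python's sqlite3 module purely for convenience; the original question does not name a database.

```python
import sqlite3

# In-memory database holding the question's sample data.
# SQLite is an assumption; "yourtable" is the placeholder name used above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(1, 7), (2, 8), (3, 2), (4, 5), (5, 9)],
)

# The raw self-join: col1 values with no match get NULL (None) from t2.
joined = conn.execute(
    """
    SELECT t1.col1, t2.col2
    FROM yourtable t1
    LEFT JOIN yourtable t2 ON t1.col1 = t2.col2
    ORDER BY t1.col1
    """
).fetchall()

# Keeping only the unmatched rows turns the left join into an anti-join.
kept = conn.execute(
    """
    SELECT t1.col1
    FROM yourtable t1
    LEFT JOIN yourtable t2 ON t1.col1 = t2.col2
    WHERE t2.col2 IS NULL
    ORDER BY t1.col1
    """
).fetchall()

print(joined)  # [(1, None), (2, 2), (3, None), (4, None), (5, 5)]
print(kept)    # [(1,), (3,), (4,)]
```

The col1 values 2 and 5 find a partner in col2, so they are the ones filtered out, which matches the rows the question wants to exclude.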

Use EXCEPT together with a subquery, as shown below.
Read up on EXCEPT here: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/set-operators-except-and-intersect-transact-sql
SELECT *
FROM TEST
EXCEPT
SELECT *
FROM TEST
WHERE COL1 IN (
SELECT COL2
FROM TEST
)

not sure, but maybe...
SELECT t1.*
FROM my_table AS t1
LEFT JOIN my_table AS t2
ON t2.col_b = t1.col_a
WHERE t2.col_b IS NULL

Related

Create a view of a table with a column that has multiple values

I have a table (Table1) like the following:
Col1 | Col2
First | Code1,Code2,Code3
Second | Code2
So Col2 can contain multiple values comma separated, I have another table (Table2) that contains this:
ColA | ColB
Code1 | Value1
Code2 | Value2
Code3 | Value3
I need to create a view that joins the two tables (Table1 and Table2) and returns something like this:
Col1 | Col2
First | Value1,Value2,Value3
Second | Value2
Is that possible? (I'm on Oracle DB if that helps.)
It's a violation of first normal form to have a list in a column value like that. It causes a lot of difficulties in a relational database, like the one you are encountering now.
However, you can get what you want by using the LIKE operator to find colA values that are substrings of the Col2 column. Add delimiters before and after to catch the first and last ones. Then aggregate back up to a single list using LISTAGG.
SELECT table1.col1,
LISTAGG(table2.colB,',') WITHIN GROUP (ORDER BY table2.colB) value_list
FROM table1,
table2
WHERE ','||table1.col2||',' LIKE '%,'||table2.colA||',%'
GROUP BY table1.col1
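The delimiter trick in the query above can be tried out with a small sketch. This one assumes SQLite via Python's sqlite3 module and swaps Oracle's LISTAGG for SQLite's group_concat, so it is an illustration of the matching logic rather than the Oracle query itself. The 'Third' row with Code10 is a hypothetical addition, not part of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (col1 TEXT, col2 TEXT);
    CREATE TABLE table2 (colA TEXT, colB TEXT);
    INSERT INTO table1 VALUES
        ('First', 'Code1,Code2,Code3'),
        ('Second', 'Code2'),
        ('Third', 'Code10');  -- hypothetical row: must NOT match Code1
    INSERT INTO table2 VALUES
        ('Code1', 'Value1'), ('Code2', 'Value2'), ('Code3', 'Value3');
    """
)

# Wrapping both sides in commas makes every list element look like ',X,',
# so 'Code1' cannot match inside 'Code10'.
rows = conn.execute(
    """
    SELECT table1.col1, group_concat(table2.colB, ',') AS value_list
    FROM table1, table2
    WHERE ',' || table1.col2 || ',' LIKE '%,' || table2.colA || ',%'
    GROUP BY table1.col1
    """
).fetchall()

# group_concat does not promise an element order, so sort before comparing.
result = {col1: sorted(value_list.split(",")) for col1, value_list in rows}
print(result)
```

Without the added commas, the pattern '%Code1%' would also match 'Code10', which is why the delimiters are appended before and after.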
This will not perform well on large volumes, because without an equijoin it's going to use nested loops, and you can't use an index on a LIKE predicate with % at the beginning. The combination of nested loops and full table scans (FTS) is not pleasant with large volumes of data. Therefore, if this is your situation, you will need to fix the 1NF problem by transforming table1 into normal relational format, and then join it to table2 with an equijoin, which will enable it to use a hash join instead. So:
SELECT table1.col1,
LISTAGG(table2.colB,',') WITHIN GROUP (ORDER BY table2.colB) value_list
FROM (SELECT t.col1,
SUBSTR(t.col2,INSTR(t.col2,',',1,seq)+1,INSTR(t.col2,',',1,seq+1)-(INSTR(t.col2,',',1,seq)+1)) col2_piece
FROM (SELECT col1,
','||col2||',' col2
FROM table1) t,
(SELECT ROWNUM seq FROM dual CONNECT BY LEVEL < 10) x) table1,
table2
WHERE table1.col2_piece IS NOT NULL
AND table1.col2_piece = table2.colA
GROUP BY table1.col1
If you want the values in the same order in the list as the terms then you can use:
SELECT t1.col1,
LISTAGG(t2.colb, ',') WITHIN GROUP (
ORDER BY INSTR(','||t1.col2||',', ','||t2.colA||',')
) AS value2
FROM table1 t1
INNER JOIN table2 t2
ON INSTR(','||t1.col2||',', ','||t2.colA||',') > 0
GROUP BY
t1.col1
Which, for the sample data:
CREATE TABLE Table1 (Col1, Col2) AS
SELECT 'First', 'Code1,Code2,Code3' FROM DUAL UNION ALL
SELECT 'Second', 'Code2' FROM DUAL;
CREATE TABLE Table2 (ColA, ColB) AS
SELECT 'Code1', 'XXXX' FROM DUAL UNION ALL
SELECT 'Code2', 'ZZZZ' FROM DUAL UNION ALL
SELECT 'Code3', 'YYYY' FROM DUAL;
Outputs:
COL1 | VALUE2
First | XXXX,ZZZZ,YYYY
Second | ZZZZ

How to delete duplicate results in SQL

I have the following table with two columns which is generated by a query in SQL:
Lookup Value Result
1 2
2 1
4 3
3 4
As you can see it contains duplicate results. I only want it to show the first line and the third line. Does anyone know how to do this in SQL?
Thanks
There are several methods. Here is one using union all:
select t.*
from t
where col1 < col2
union all
select t1.*
from t t1
where col1 > col2 and
not exists (select 1 from t t2 where t1.col1 = t2.col2 and t1.col2 = t2.col1);
If you always know that both pairs exist (as in your sample data), you can just use:
select t.*
from t
where col1 < col2;
SELECT DISTINCT
CASE WHEN "Lookup Value" < Result
THEN "Lookup Value"
ELSE Result
END as first,
CASE WHEN "Lookup Value" < Result
THEN Result
ELSE "Lookup Value"
END as second
FROM YourTable
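A quick way to check the CASE approach is to run it on the sample data. This sketch assumes SQLite via Python's sqlite3 module and names the output columns lo and hi (hypothetical names chosen for the example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The column name contains a space, so it has to be quoted.
conn.execute('CREATE TABLE YourTable ("Lookup Value" INTEGER, Result INTEGER)')
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [(1, 2), (2, 1), (4, 3), (3, 4)],
)

# Put the smaller value first in every row, then let DISTINCT
# collapse the mirrored pairs into one row each.
pairs = conn.execute(
    """
    SELECT DISTINCT
        CASE WHEN "Lookup Value" < Result THEN "Lookup Value" ELSE Result END AS lo,
        CASE WHEN "Lookup Value" < Result THEN Result ELSE "Lookup Value" END AS hi
    FROM YourTable
    ORDER BY lo
    """
).fetchall()

print(pairs)  # [(1, 2), (3, 4)]
```

Note that this returns each pair in normalized order, so the third input row comes back as (3, 4) rather than the original (4, 3).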
Create Table T (
[Lookup Value] int,
Result int
)
Insert into T values (1,2),(2,1),(4,3),(3,4)
Select distinct T.[Lookup Value], T.Result
From T
where T.[Lookup Value]<=T.Result

SQL get records from table where results from another table is in the range defined by Col1 and Col2?

I have the following ranges from a query:
Col1 Col2
--------------
100-200
200-300
300-400
and this vector from another query:
Nbr
----
119
351
149
I want to get the ranges for the numbers on the vector.
Is there a way to do this in SQL without resorting to iteration? Something like:
SELECT Col1, Col2
FROM TB1
WHERE (SELECT Nbr FROM TB2) BETWEEN Col1 and Col2
The above query doesn't work because multiple results are returned.
Thank you.
Yes. Just use a join:
SELECT TB1.Col1, TB1.Col2
FROM TB1 JOIN
TB2
ON TB2.Nbr BETWEEN TB1.Col1 and TB1.Col2;
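As a rough check of the join above, here is a sketch on the question's data, assuming SQLite via Python's sqlite3 module and assuming the ranges are stored as two integer columns. Nbr is added to the select list only to show which number landed in which range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE TB1 (Col1 INTEGER, Col2 INTEGER);
    CREATE TABLE TB2 (Nbr INTEGER);
    INSERT INTO TB1 VALUES (100, 200), (200, 300), (300, 400);
    INSERT INTO TB2 VALUES (119), (351), (149);
    """
)

# A join on a range condition pairs every number with every range containing it.
matches = conn.execute(
    """
    SELECT TB2.Nbr, TB1.Col1, TB1.Col2
    FROM TB1 JOIN TB2
      ON TB2.Nbr BETWEEN TB1.Col1 AND TB1.Col2
    ORDER BY TB2.Nbr
    """
).fetchall()

# BETWEEN includes both endpoints, so a boundary value hits two ranges.
boundary_hits = conn.execute(
    "SELECT COUNT(*) FROM TB1 WHERE 200 BETWEEN Col1 AND Col2"
).fetchone()[0]

print(matches)        # [(119, 100, 200), (149, 100, 200), (351, 300, 400)]
print(boundary_hits)  # 2
```

Because BETWEEN is inclusive on both ends, a number such as 200 would be returned once for each of the two ranges that share it. If that matters, a half-open condition such as Nbr >= Col1 AND Nbr < Col2 is one option.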

Two equal tables (different column numbers) have different number of rows

AWS Redshift DB
I have two tables A and B
select col1, col2 from A
except
select col1, col2 from B
returns empty, the same
select col1, col2 from B
except
select col1, col2 from A
returns empty
but
select count(*) from A
returns for example 100, but
select count(*) from B
returns 200
how can that be ?
Because each table's distinct data set is contained in the other. A different count means that you have duplicate rows. This might make it clearer.
Distinct(A) is a subset of B
Distinct(B) is a subset of A
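The effect is easy to reproduce on a tiny scale. The sketch below assumes SQLite via Python's sqlite3 module rather than Redshift, and the rows are made up for illustration; EXCEPT compares distinct rows in both systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (col1 INTEGER, col2 TEXT);
    CREATE TABLE B (col1 INTEGER, col2 TEXT);
    INSERT INTO A VALUES (1, 'x'), (2, 'y');
    -- B holds the same rows, each one twice.
    INSERT INTO B VALUES (1, 'x'), (2, 'y'), (1, 'x'), (2, 'y');
    """
)

# EXCEPT works on distinct rows, so duplicates do not show up as differences.
a_minus_b = conn.execute(
    "SELECT col1, col2 FROM A EXCEPT SELECT col1, col2 FROM B"
).fetchall()
b_minus_a = conn.execute(
    "SELECT col1, col2 FROM B EXCEPT SELECT col1, col2 FROM A"
).fetchall()

count_a = conn.execute("SELECT COUNT(*) FROM A").fetchone()[0]
count_b = conn.execute("SELECT COUNT(*) FROM B").fetchone()[0]

print(a_minus_b, b_minus_a)  # [] []
print(count_a, count_b)      # 2 4
```

Both differences are empty even though the row counts differ, because only the duplicated rows account for the gap.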

Update Table Beginning At Record One SQL Server

I am trying to update a table with records from another table. Whenever I use an INSERT INTO statement, the records are simply appended. Instead, I want the records to be inserted from the top of the table. What is the easiest way to do this? I am thinking I could use an UPDATE statement, but that means I would have to join the tables. One of the tables (the one I am pulling records from) has only one column, so I would have to add another column to do the join. I am trying not to make it too complicated. If there is a simpler way, please let me know.
Sample:
Table One
Col1
1
2
3
4
Table 2
Col1 Col2
a
b
c
d
I want to move column 1 from table 1 to column 2 in table 2 such that table 2 will be:
Table 2
Col1 Col2
a 1
b 2
c 3
d 4
You can do the update using row_number(), but the rows will be assigned in an indeterminate order:
with toupdate as (
select t2.*, row_number() over (order by (select NULL)) as seqnum
from table2 t2
),
t1 as (
select t1.*, row_number() over (order by (select NULL)) as seqnum
from table1 t1
)
update toupdate
set col2 = t1.col1
from toupdate join
t1
on toupdate.seqnum = t1.seqnum;
Note: if you have an ordering in mind, then use the appropriate order by in the over clauses.
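For anyone who wants to try the row-number pairing outside SQL Server, here is a sketch assuming SQLite 3.25 or later (for window functions) via Python's sqlite3 module. SQLite cannot update through a CTE, so this variant uses a correlated subquery instead. It also assumes table2.col1 is unique and uses an explicit ORDER BY so the pairing is deterministic; the names src, dst, val and k are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (col1 INTEGER);
    CREATE TABLE table2 (col1 TEXT, col2 INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3), (4);
    INSERT INTO table2 (col1) VALUES ('a'), ('b'), ('c'), ('d');
    """
)

# Number the rows of both tables, then copy the value whose sequence
# number matches that of the row being updated.
conn.execute(
    """
    WITH src AS (
        SELECT col1 AS val, ROW_NUMBER() OVER (ORDER BY col1) AS seqnum
        FROM table1
    ),
    dst AS (
        SELECT col1 AS k, ROW_NUMBER() OVER (ORDER BY col1) AS seqnum
        FROM table2
    )
    UPDATE table2
    SET col2 = (
        SELECT src.val
        FROM src JOIN dst ON dst.seqnum = src.seqnum
        WHERE dst.k = table2.col1  -- assumes table2.col1 is unique
    )
    """
)

updated = conn.execute("SELECT col1, col2 FROM table2 ORDER BY col1").fetchall()
print(updated)  # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
```

With the ORDER BY in place, the first row of one table is always paired with the first row of the other, which is the behaviour the question asks for.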
Unless you explicitly define an ORDER BY clause in your SELECT statements, the order of your result set is not guaranteed. This is in line with how any RDBMS should operate. You should consider including a timestamp at the time of insertion to identify the latest rows.