I have a table (Table1) like the following:
Col1    Col2
First   Code1,Code2,Code3
Second  Code2
So Col2 can contain multiple comma-separated values. I have another table (Table2) that contains this:
ColA    ColB
Code1   Value1
Code2   Value2
Code3   Value3
I need to create a view that joins the two tables (Table1 and Table2) and returns something like this:
Col1    Col2
First   Value1,Value2,Value3
Second  Value2
Is that possible? (I'm on Oracle DB if that helps.)
It's a violation of first normal form to have a list in a column value like that. It causes a lot of difficulties in a relational database, like the one you are encountering now.
However, you can get what you want by using the LIKE operator to find colA values that are substrings of the Col2 column. Add delimiters before and after to catch the first and last ones. Then aggregate back up to a single list using LISTAGG.
SELECT table1.col1,
       LISTAGG(table2.colB,',') WITHIN GROUP (ORDER BY table2.colB) value_list
  FROM table1,
       table2
 WHERE ','||table1.col2||',' LIKE '%,'||table2.colA||',%'
 GROUP BY table1.col1
This will not perform well on large volumes: without an equijoin it's going to use nested loops, and an index can't be used for a LIKE predicate that starts with %. The combination of nested loops and full table scans (FTS) is not pleasant with large volumes of data. If that is your situation, you will need to fix the 1NF problem by transforming table1 into a normalized form and then joining it to table2 with an equijoin, which enables a hash join instead. So:
SELECT table1.col1,
       LISTAGG(table2.colB,',') WITHIN GROUP (ORDER BY table2.colB) value_list
  FROM (SELECT t.col1,
               -- text between the seq-th and (seq+1)-th comma of the wrapped list
               SUBSTR(t.col2,INSTR(t.col2,',',1,seq)+1,INSTR(t.col2,',',1,seq+1)-(INSTR(t.col2,',',1,seq)+1)) col2_piece
          FROM (SELECT col1,
                       ','||col2||',' col2
                  FROM table1) t,
               -- LEVEL < 10 handles at most 9 list items per row; raise the limit if needed
               (SELECT ROWNUM seq FROM dual CONNECT BY LEVEL < 10) x) table1,
       table2
 WHERE table1.col2_piece IS NOT NULL
   AND table1.col2_piece = table2.colA
 GROUP BY table1.col1
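To see the normalize-then-equijoin idea in isolation, here is a minimal sketch that runs outside Oracle. It assumes Python's built-in sqlite3 module as a stand-in database, and table1_codes is a hypothetical table name, not something from the question. The split is done once in Python rather than with INSTR/SUBSTR, so it illustrates the data shape, not the Oracle syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 TEXT, col2 TEXT);
    CREATE TABLE table2 (colA TEXT, colB TEXT);
    INSERT INTO table1 VALUES ('First', 'Code1,Code2,Code3'), ('Second', 'Code2');
    INSERT INTO table2 VALUES ('Code1', 'Value1'), ('Code2', 'Value2'), ('Code3', 'Value3');
    -- Hypothetical normalized table: one row per (col1, code) pair.
    CREATE TABLE table1_codes (col1 TEXT, code TEXT);
""")

# Split each comma-separated list once, so later joins can use plain equality.
for col1, col2 in conn.execute("SELECT col1, col2 FROM table1").fetchall():
    conn.executemany(
        "INSERT INTO table1_codes VALUES (?, ?)",
        [(col1, piece) for piece in col2.split(",") if piece],
    )

# Equijoin on the normalized codes; no LIKE or INSTR needed.
rows = conn.execute("""
    SELECT c.col1, t2.colB
    FROM table1_codes c
    JOIN table2 t2 ON t2.colA = c.code
    ORDER BY c.col1, t2.colB
""").fetchall()

# Re-aggregate in Python so the ordering of each list is explicit.
value_lists = {}
for col1, col_b in rows:
    value_lists.setdefault(col1, []).append(col_b)
result = {col1: ",".join(values) for col1, values in value_lists.items()}
print(result)
```

If the list column can be replaced by a table shaped like table1_codes permanently, the splitting step disappears and only the equijoin remains.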
If you want the values in the same order in the list as the terms then you can use:
SELECT t1.col1,
LISTAGG(t2.colb, ',') WITHIN GROUP (
ORDER BY INSTR(','||t1.col2||',', ','||t2.colA||',')
) AS value2
FROM table1 t1
INNER JOIN table2 t2
ON INSTR(','||t1.col2||',', ','||t2.colA||',') > 0
GROUP BY
t1.col1
Which, for the sample data:
CREATE TABLE Table1 (Col1, Col2) AS
SELECT 'First', 'Code1,Code2,Code3' FROM DUAL UNION ALL
SELECT 'Second', 'Code2' FROM DUAL;
CREATE TABLE Table2 (ColA, ColB) AS
SELECT 'Code1', 'XXXX' FROM DUAL UNION ALL
SELECT 'Code2', 'ZZZZ' FROM DUAL UNION ALL
SELECT 'Code3', 'YYYY' FROM DUAL;
Outputs:
COL1    VALUE2
First   XXXX,ZZZZ,YYYY
Second  ZZZZ
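The reason for wrapping both sides in commas is easiest to see with codes that are prefixes of one another. The sketch below is an illustration only: it assumes Python's sqlite3 module (whose instr() takes two arguments, like the Oracle calls above), and the Code10 row is made-up data, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 TEXT, col2 TEXT);
    CREATE TABLE table2 (colA TEXT, colB TEXT);
    -- Made-up data: Code1 is a prefix of Code10.
    INSERT INTO table1 VALUES ('First', 'Code10,Code2');
    INSERT INTO table2 VALUES ('Code1', 'XXXX'), ('Code2', 'ZZZZ'), ('Code10', 'YYYY');
""")

# Without delimiters, Code1 is found inside Code10: a false match.
naive = conn.execute("""
    SELECT t2.colA
    FROM table1 t1
    JOIN table2 t2 ON instr(t1.col2, t2.colA) > 0
    ORDER BY instr(t1.col2, t2.colA), t2.colA
""").fetchall()

# With delimiters on both sides, only whole list items match,
# and ordering by the match position preserves the list order.
delimited = conn.execute("""
    SELECT t2.colA
    FROM table1 t1
    JOIN table2 t2 ON instr(',' || t1.col2 || ',', ',' || t2.colA || ',') > 0
    ORDER BY instr(',' || t1.col2 || ',', ',' || t2.colA || ',')
""").fetchall()

print(naive)      # [('Code1',), ('Code10',), ('Code2',)]
print(delimited)  # [('Code10',), ('Code2',)]
```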
I have the following table with two columns which is generated by a query in SQL:
Lookup Value Result
1 2
2 1
4 3
3 4
As you can see it contains duplicate results. I only want it to show the first line and the third line. Does anyone know how to do this in SQL?
Thanks
There are several methods. Here is one using union all:
select t.*
from t
where col1 < col2
union all
select t1.*
from t t1
where t1.col1 > t1.col2 and
not exists (select 1 from t t2 where t1.col1 = t2.col2 and t1.col2 = t2.col1);
If you always know that both pairs exist (as in your sample data), you can just use:
select t.*
from t
where col1 < col2;
SELECT DISTINCT
CASE WHEN [Lookup Value] < Result
THEN [Lookup Value]
ELSE Result
END as first,
CASE WHEN [Lookup Value] < Result
THEN Result
ELSE [Lookup Value]
END as second
FROM YourTable
Create Table T (
[Lookup Value] int,
Result int
)
Insert into T values (1,2),(2,1),(4,3),(3,4)
Select distinct T.[Lookup Value], T.Result
From T
where T.[Lookup Value]<=T.Result
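For comparison, here is a small runnable sketch of the "normalize each pair, then DISTINCT" idea. It assumes Python's sqlite3 module, where the two-argument min() and max() play the role of the CASE expressions; the extra row (6, 5) and the aliases low_value and high_value are made up for the example. Unlike the `<=` filter, this keeps a pair that only appears in "reversed" form, although it reports it as (5, 6) rather than as stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t ("Lookup Value" INTEGER, Result INTEGER);
    -- (6, 5) has no mirrored (5, 6) row; it is made-up data for this sketch.
    INSERT INTO t VALUES (1, 2), (2, 1), (4, 3), (3, 4), (6, 5);
""")

# Put the smaller value first in every pair, then drop duplicates.
pairs = conn.execute("""
    SELECT DISTINCT
           min("Lookup Value", Result) AS low_value,
           max("Lookup Value", Result) AS high_value
    FROM t
    ORDER BY low_value, high_value
""").fetchall()

# Filtering on <= alone silently loses the unmatched (6, 5) row.
filtered = conn.execute("""
    SELECT DISTINCT "Lookup Value", Result
    FROM t
    WHERE "Lookup Value" <= Result
    ORDER BY 1, 2
""").fetchall()

print(pairs)     # [(1, 2), (3, 4), (5, 6)]
print(filtered)  # [(1, 2), (3, 4)]
```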
I have the following ranges from a query:
Col1 Col2
--------------
100-200
200-300
300-400
and this vector from another query:
Nbr
----
119
351
149
I want to get the ranges for the numbers on the vector.
Is there a way to do this in SQL without resorting to iteration? Something like:
SELECT Col1, Col2
FROM TB1
WHERE (SELECT Nbr FROM TB2) BETWEEN Col1 and Col2
The above query doesn't work because the subquery returns multiple rows.
Thank you.
Yes. Just use a join:
SELECT TB1.Col1, TB1.Col2
FROM TB1 JOIN
TB2
ON TB2.Nbr BETWEEN TB1.Col1 and TB1.Col2;
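A quick way to check the join against the sample data is the sketch below. It assumes Python's sqlite3 module as the database and adds Nbr to the select list so each number is shown next to its range. Note that BETWEEN is inclusive on both ends, so with these ranges a boundary value such as 200 would match two rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TB1 (Col1 INTEGER, Col2 INTEGER);
    CREATE TABLE TB2 (Nbr INTEGER);
    INSERT INTO TB1 VALUES (100, 200), (200, 300), (300, 400);
    INSERT INTO TB2 VALUES (119), (351), (149);
""")

# One output row per (number, matching range) combination.
rows = conn.execute("""
    SELECT TB2.Nbr, TB1.Col1, TB1.Col2
    FROM TB1
    JOIN TB2 ON TB2.Nbr BETWEEN TB1.Col1 AND TB1.Col2
    ORDER BY TB2.Nbr
""").fetchall()
print(rows)  # [(119, 100, 200), (149, 100, 200), (351, 300, 400)]

# BETWEEN includes both endpoints, so 200 falls into two of the ranges.
boundary_matches = conn.execute(
    "SELECT COUNT(*) FROM TB1 WHERE 200 BETWEEN Col1 AND Col2"
).fetchone()[0]
print(boundary_matches)  # 2
```

If the ranges are meant to be half-open, a condition like `Nbr >= Col1 AND Nbr < Col2` avoids the double match.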
AWS Redshift DB
I have two tables A and B
select col1, col2 from A
except
select col1, col2 from B
returns empty, the same
select col1, col2 from B
except
select col1, col2 from A
returns empty
but
select count(*) from A
returns for example 100, but
select count(*) from B
returns 200
How can that be?
Because each table's distinct set of rows is contained in the other's. EXCEPT compares distinct rows, so a different count means that you have duplicate rows. Put another way:
Distinct(A) is a subset of B
Distinct(B) is a subset of A
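Here is a tiny reproduction of that situation. It assumes Python's sqlite3 module rather than Redshift, and the rows are made up; the point is only that EXCEPT ignores how many times a row occurs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (col1 INTEGER, col2 TEXT);
    CREATE TABLE B (col1 INTEGER, col2 TEXT);
    INSERT INTO A VALUES (1, 'x'), (2, 'y');
    -- B holds the same distinct rows as A, each one twice.
    INSERT INTO B VALUES (1, 'x'), (1, 'x'), (2, 'y'), (2, 'y');
""")

a_minus_b = conn.execute(
    "SELECT col1, col2 FROM A EXCEPT SELECT col1, col2 FROM B"
).fetchall()
b_minus_a = conn.execute(
    "SELECT col1, col2 FROM B EXCEPT SELECT col1, col2 FROM A"
).fetchall()
count_a = conn.execute("SELECT COUNT(*) FROM A").fetchone()[0]
count_b = conn.execute("SELECT COUNT(*) FROM B").fetchone()[0]
# Counting distinct rows exposes the duplicates.
distinct_b = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT col1, col2 FROM B)"
).fetchone()[0]

print(a_minus_b, b_minus_a)          # [] []
print(count_a, count_b, distinct_b)  # 2 4 2
```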
I am trying to update a table with records from another table. Whenever I use an INSERT INTO statement, the records are simply appended. Instead, I want the records to be inserted starting from the top of the table. What is the easiest way to do this? I am thinking I could use an UPDATE statement, but that means I would have to join the tables. One of the tables (the one I am pulling records from) has only one column, so I would have to add another column to do the join. I am trying not to make it so complicated. If there is a simpler way, please let me know.
Sample:
Table One
Col1
1
2
3
4
Table 2
Col1 Col2
a
b
c
d
I want to move column 1 from table 1 to column 2 in table 2 such that table 2 will be:
Table 2
Col1 Col2
a 1
b 2
c 3
d 4
You can do the update using row_number(), but the rows will be assigned in an indeterminate order:
with toupdate as (
select t2.*, row_number() over (order by (select NULL)) as seqnum
from table2 t2
),
t1 as (
select t1.*, row_number() over (order by (select NULL)) as seqnum
from table1 t1
)
update toupdate
set col2 = t1.col1
from toupdate join
t1
on toupdate.seqnum = t1.seqnum;
Note: if you have an ordering in mind, then use the appropriate order by in the over clauses.
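For a version with a deterministic pairing that you can run locally, here is a sketch. It assumes Python's sqlite3 module with an SQLite build that supports window functions (3.25 or later), orders both tables by their own col1, and uses the hypothetical CTE names src and dst. SQLite cannot update through a CTE, so the update is written as a correlated subquery keyed on rowid instead of the syntax above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER);
    CREATE TABLE table2 (col1 TEXT, col2 INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3), (4);
    INSERT INTO table2 (col1) VALUES ('a'), ('b'), ('c'), ('d');
""")

# Number the rows of both tables with an explicit ORDER BY,
# then copy values across where the sequence numbers match.
conn.execute("""
    WITH src AS (
        SELECT col1, row_number() OVER (ORDER BY col1) AS seqnum
        FROM table1
    ),
    dst AS (
        SELECT rowid AS rid, row_number() OVER (ORDER BY col1) AS seqnum
        FROM table2
    )
    UPDATE table2
    SET col2 = (
        SELECT src.col1
        FROM src
        JOIN dst ON src.seqnum = dst.seqnum
        WHERE dst.rid = table2.rowid
    )
""")

rows = conn.execute("SELECT col1, col2 FROM table2 ORDER BY col1").fetchall()
print(rows)  # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
```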
Unless you explicitly define an ORDER BY clause in your SELECT statements, the order of your result set is arbitrary. This is in line with how any RDBMS should operate. You should consider storing a timestamp at the time of insertion to identify the latest rows.