I have a table Table1 with columns A and B (many to many table).
|---------|---------|
| ColumnA | ColumnB |
|---------|---------|
| a1      | b1      |
| a1      | b2      |
| a2      | b1      |
| a2      | b3      |
| a3      | b2      |
|---------|---------|
I want a list of the As whose Bs are ONLY in a given list of Bs.
So, from the above table, if the list is [b1, b2]
Expected: [a1, a3]
a2 is not included, as it is also associated with b3.
You can use aggregation and having:
select a
from ab
group by a
having sum(case when b not in ('b1', 'b2') then 1 else 0 end) = 0;
The having clause is checking the number of rows that are not in the list. The = 0 says there are none.
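As a quick check of the aggregation approach, here is a minimal sketch that runs the same query against the sample data. It assumes Python's built-in sqlite3 module and an in-memory table named ab with columns a and b (the names used in the query above); the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ab (a TEXT, b TEXT)")
conn.executemany(
    "INSERT INTO ab VALUES (?, ?)",
    [("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b3"), ("a3", "b2")],
)

# Per value of a, count the rows whose b is outside the allowed list
# and keep only the groups where that count is zero.
query = """
    SELECT a
    FROM ab
    GROUP BY a
    HAVING SUM(CASE WHEN b NOT IN ('b1', 'b2') THEN 1 ELSE 0 END) = 0
    ORDER BY a
"""
result = [row[0] for row in conn.execute(query)]
print(result)  # ['a1', 'a3']
```

a2 drops out because its b3 row contributes 1 to the sum.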
Assuming there are not any nulls in ColumnB you can use NOT EXISTS:
select t.*
from tablename t
where not exists (select 1 from tablename where ColumnA = t.ColumnA and ColumnB not in ('b1', 'b2'))
If you want only the distinct values of ColumnA:
select distinct t.ColumnA
from tablename t
where not exists (select 1 from tablename where ColumnA = t.ColumnA and ColumnB not in ('b1', 'b2'))
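To see why the no-NULLs assumption matters, here is a small sketch using Python's sqlite3 module (an assumption, since the question does not name a database). The two a4 rows are hypothetical additions to the sample data: one of them has a NULL in ColumnB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (ColumnA TEXT, ColumnB TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [
        ("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b3"), ("a3", "b2"),
        ("a4", "b1"), ("a4", None),  # hypothetical rows: a4 has a NULL ColumnB
    ],
)

query = """
    SELECT DISTINCT t.ColumnA
    FROM tablename t
    WHERE NOT EXISTS (
        SELECT 1
        FROM tablename
        WHERE ColumnA = t.ColumnA AND ColumnB NOT IN ('b1', 'b2')
    )
    ORDER BY t.ColumnA
"""
not_exists_result = [row[0] for row in conn.execute(query)]
print(not_exists_result)  # ['a1', 'a3', 'a4']
```

NULL NOT IN ('b1', 'b2') evaluates to NULL rather than true, so the subquery finds no offending row for a4 and a4 is returned even though one of its Bs is unknown. Whether that is acceptable depends on how NULLs should be treated.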
I have a table where rows 1 and 2 are almost duplicates: they have the same values but in reverse order. How can I delete these duplicates?
+--------+-------+
| COL_1 | COL_# |
+--------+-------+
| a1 | b1 |
| b1 | a1 | <- same as the 1st row but in reversed order, needs to be removed
| a2 | b2 |
| a3 | b3 |
| b3 | a3 | <- also a duplicate of the row above; one of these 2 rows needs to be removed
+--------+-------+
expected result:
+--------+-------+
| COL_1 | COL_# |
+--------+-------+
| b1 | a1 |
| a2 | b2 |
| a3 | b3 |
+--------+-------+
or
+--------+-------+
| COL_1 | COL_# |
+--------+-------+
| a1 | b1 |
| a2 | b2 |
| a3 | b3 |
+--------+-------+
Do you mean something like this?
DELETE FROM example e
WHERE EXISTS (
select 1
from example f
where f.COL_1 = e.COL_# and f.COL_# = e.COL_1 -- f is the mirrored row
and e.COL_1 > f.COL_1 ) -- delete only one row of each mirrored pair
If col_1 and col_# cannot be null, you can use LEAST and GREATEST to keep one row per combination:
delete from tbl
where rowid not in
(
select min(rowid) -- one rowid per combination to keep
from tbl
group by least(col_1, col_#), greatest(col_1, col_#)
);
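Here is a minimal sketch of the same idea that can be run locally. It assumes Python's sqlite3 module instead of Oracle: SQLite has no LEAST/GREATEST, so its multi-argument scalar MIN() and MAX() stand in for them, and SQLite's rowid plays the role of Oracle's rowid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tbl (col_1 TEXT, "col_#" TEXT)')
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [("a1", "b1"), ("b1", "a1"), ("a2", "b2"), ("a3", "b3"), ("b3", "a3")],
)

# Normalise each pair to (smaller, larger), keep the lowest rowid per
# normalised pair, and delete every other row.
conn.execute("""
    DELETE FROM tbl
    WHERE rowid NOT IN (
        SELECT MIN(rowid)
        FROM tbl
        GROUP BY MIN(col_1, "col_#"), MAX(col_1, "col_#")
    )
""")
remaining = conn.execute('SELECT col_1, "col_#" FROM tbl ORDER BY rowid').fetchall()
print(remaining)  # [('a1', 'b1'), ('a2', 'b2'), ('a3', 'b3')]
```

Because the lowest rowid wins, the row that was inserted first in each mirrored pair is the one that survives.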
I think it is a bit complicated to achieve it with a delete query.
Maybe it is OK for you to have such a select query first:
select A,B from (
select A,
B,
ROW_NUMBER() over (PARTITION BY LEAST(A, B), GREATEST(A, B) ORDER BY A) as RANK -- partition on the normalised pair; a product of hashes could collide for unrelated pairs
FROM <your_table_name>
) where RANK = 1;
You could save the result of this query as a new table with CREATE TABLE AS SELECT ...
And then you simply DROP your old table.
One possible solution is:
DELETE FROM T
WHERE (COL_1, "COL_#") IN (SELECT COL_1, "COL_#"
FROM (SELECT ROWNUM AS RN, t1.COL_1, t1."COL_#"
FROM T t1
INNER JOIN T t2
ON t2."COL_#" = t1.COL_1 AND
t2.COL_1 = t1."COL_#")
WHERE RN / 2 <> TRUNC(RN / 2));
Note that COL_# is quoted here. Oracle does accept # in an unquoted identifier, but strongly discourages it.
This doesn't seem so complicated:
delete t
where not (col1 < col2 or
not exists (select 1
from t t2
where t2.col1 = t.col2 and
t2.col2 = t.col1
)
);
Or:
delete t
where col1 > col2 and
exists (select 1
from t t2
where t2.col1 = t.col2 and
t2.col2 = t.col1
);
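A small sketch of the second form, assuming Python's sqlite3 module rather than Oracle (so it is written as DELETE FROM). The ('z9', 'a9') row is a hypothetical addition to show that a reverse-ordered row without a mirror is left alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("a1", "b1"), ("b1", "a1"), ("a2", "b2"), ("a3", "b3"), ("b3", "a3"),
        ("z9", "a9"),  # hypothetical row: col1 > col2 but no mirrored partner
    ],
)

# Delete a row only if it is the "larger first" half of a mirrored pair.
conn.execute("""
    DELETE FROM t
    WHERE col1 > col2
      AND EXISTS (
          SELECT 1
          FROM t AS t2
          WHERE t2.col1 = t.col2 AND t2.col2 = t.col1
      )
""")
kept = conn.execute("SELECT col1, col2 FROM t ORDER BY rowid").fetchall()
print(kept)  # [('a1', 'b1'), ('a2', 'b2'), ('a3', 'b3'), ('z9', 'a9')]
```

Unlike approaches that normalise every pair, this one never touches rows that have no reversed counterpart.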
I have a table with data that I want to join onto another table. The problem is that the join can happen on two columns of the same table: I want the first join to be used and, if it fails, the second join to give me a valid result.
Base table:
| ID1 | ID2 | Value |
| a1 | a2 | val_1 |
| b1 | b2 | val_2 |
| c1 | c2 | val_3 |
Join table:
| ID1 | ID2 | Join_Value |
| | a2 | join_val_1 |
| b1 | | join_val_2 |
| c1 | c2 | join_val_3 |
What I tried was this:
select base.id1, base.id2, Value, isnull(j1.Join_value,j2.Join_value) Join_Value from base
left join Join j1 on j1.id1 = base.id1
left join Join j2 on j2.id2 = base.id2
The result is this:
| ID1 | ID2 | Value | Join_Value |
| a1 | a2 | val_1 | join_val_1 |
| b1 | b2 | val_2 | join_val_2 |
| c1 | c2 | val_3 | join_val_3 |
| c1 | c2 | val_3 | join_val_3 |
What I want is this:
| ID1 | ID2 | Value | Join_Value |
| a1 | a2 | val_1 | join_val_1 |
| b1 | b2 | val_2 | join_val_2 |
| c1 | c2 | val_3 | join_val_3 |
I hope I made my problem clear.
You don't need to join the same table twice. Just specify both conditions in the ON clause:
select b.ID1, b.ID2, b.[Value], j.Join_Value
from [base] b
inner join [join] j on b.ID1 = j.ID1
or (
j.ID1 = ''
and b.ID2 = j.ID2
)
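A runnable sketch of the single-join approach on the sample data, assuming Python's sqlite3 module (the original appears to be SQL Server, so the brackets become double quotes here) and assuming the blank cells in the join table are empty strings, as the j.ID1 = '' test implies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE base (ID1 TEXT, ID2 TEXT, Value TEXT)")
conn.execute('CREATE TABLE "join" (ID1 TEXT, ID2 TEXT, Join_Value TEXT)')
conn.executemany(
    "INSERT INTO base VALUES (?, ?, ?)",
    [("a1", "a2", "val_1"), ("b1", "b2", "val_2"), ("c1", "c2", "val_3")],
)
conn.executemany(
    'INSERT INTO "join" VALUES (?, ?, ?)',
    [("", "a2", "join_val_1"), ("b1", "", "join_val_2"), ("c1", "c2", "join_val_3")],
)

# Match on ID1 first; fall back to ID2 only for join rows whose ID1 is blank.
query = """
    SELECT b.ID1, b.ID2, b.Value, j.Join_Value
    FROM base b
    INNER JOIN "join" j
        ON b.ID1 = j.ID1
        OR (j.ID1 = '' AND b.ID2 = j.ID2)
    ORDER BY b.ID1
"""
joined = conn.execute(query).fetchall()
for row in joined:
    print(row)
```

On this data each base row matches exactly one join row, so three rows come back. If the blanks are stored as NULL instead of '', the fallback condition would need j.ID1 IS NULL.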
You are going to get duplicate rows for the c1 and c2 rows because they match on both of your Join table joins (j1 and j2).
A quick fix is to add a DISTINCT to your query:
select DISTINCT base.id1, base.id2, Value, isnull(j1.Join_value,j2.Join_value) Join_Value
from base
left join Join j1 on j1.id1 = base.id1
left join Join j2 on j2.id2 = base.id2
A better fix, depending on your DBMS, is to use a window function:
select id1, id2, Value, Join_Value
FROM (
select base.id1, base.id2, Value, isnull(j1.Join_value,j2.Join_value) Join_Value,
ROW_NUMBER() OVER(
PARTITION BY base.id1, base.id2 -- Group rows based on (id1, id2) combination
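ORDER BY CASE WHEN j1.id1 IS NOT NULL THEN 0 ELSE 1 END -- If more than one row, give priority to row with "id1" value (a plain ORDER BY j1.id1 would sort NULLs first in SQL Server)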
) AS RowNum
from base
left join Join j1 on j1.id1 = base.id1
left join Join j2 on j2.id2 = base.id2
) src
WHERE RowNum = 1 -- Only return one row
This will make sure you always get at most one row per (id1, id2) combination.
Try:
select *
from base b
join [join] j on b.id1 = j.id1 or b.id2 = j.id2
First, your version does exactly what you want.
Second, for more control over the matching, you can use a lateral join. This allows you to choose only one matching row -- say the one where both ids match:
select b.id1, b.id2, b.value, jt.join_value
from base b cross apply
(select top (1) jt.*
from jointable jt
where b.id1 = jt.id1 or
b.id2 = jt.id2
order by (case when b.id1 = jt.id1 then 1 else 0 end) +
(case when b.id2 = jt.id2 then 1 else 0 end) desc
) jt ;
I am kind of new to Impala, and to SQL in general. I am trying to do a pivot operation, starting from this table.
Input:
Table name: MyName
+-----------+---------------------+-----------+
| Column A | Column B | Column C |
+-----------+---------------------+-----------+
| a1 | b1 | c1 |
| a2 | b2 | c2 |
| a3 | b3 | c3 |
+-----------+---------------------+-----------+
I want to obtain this other, transposed table, where b1, b2, b3 go from being values in a column to being column headers.
output:
+-----------+---------------------+-----------+
| b1 | b2 | b3 |
+-----------+---------------------+-----------+
| a1 | a2 | a3 |
| c1 | c2 | c3 |
+-----------+---------------------+-----------+
This is the code I have come up with so far:
select b_column,
max(case where b_column='b%' then column_a, column_c end) column_a, column_c
from MyName
group by b_column;
But it's not working and I am feeling pretty stuck.
Can anyone give me a hint/suggestion on how to solve the issue?
Thanks so much in advance!
If you are trying to do a pivot in Impala in general, you can't: per the 6.1 documentation, PIVOT is not currently available functionality.
https://www.cloudera.com/documentation/enterprise/6/6.1/topics/impala_reserved_words.html
select b_column,
max(case when b_column like 'b%' then column_a end) column_a,
max(case when b_column like 'c%' then column_c end) column_c
from MyName
group by b_column;
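For the exact transposed output in the question, one option is to hard-code one conditional aggregate per output column and stack one SELECT per source column with UNION ALL. The sketch below is only that: it runs on SQLite through Python's sqlite3 module and has not been tried on Impala, and the column names column_a, b_column and column_c are taken from the asker's attempt. The extra source column is an assumption, added to label which input column each output row came from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyName (column_a TEXT, b_column TEXT, column_c TEXT)")
conn.executemany(
    "INSERT INTO MyName VALUES (?, ?, ?)",
    [("a1", "b1", "c1"), ("a2", "b2", "c2"), ("a3", "b3", "c3")],
)

# One output row per source column; one MAX(CASE ...) per target column.
query = """
    SELECT 'column_a' AS source,
           MAX(CASE WHEN b_column = 'b1' THEN column_a END) AS b1,
           MAX(CASE WHEN b_column = 'b2' THEN column_a END) AS b2,
           MAX(CASE WHEN b_column = 'b3' THEN column_a END) AS b3
    FROM MyName
    UNION ALL
    SELECT 'column_c',
           MAX(CASE WHEN b_column = 'b1' THEN column_c END),
           MAX(CASE WHEN b_column = 'b2' THEN column_c END),
           MAX(CASE WHEN b_column = 'b3' THEN column_c END)
    FROM MyName
    ORDER BY 1
"""
pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
# ('column_a', 'a1', 'a2', 'a3')
# ('column_c', 'c1', 'c2', 'c3')
```

The obvious limitation is that the set of b values must be known when the query is written.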
I've got a table with two columns (id and value). There are two types of values, and each id has both of these values. How can I write a select against this table that returns three columns (id, value1 and value2)?
I've tried CASE and GROUP BY, but it shows only one result for each id.
Example of a db:
| id | value |
| 0 | a |
| 0 | b |
| 1 | a |
| 1 | b |
Example of the result I am looking for is:
| id | value_a | value_b |
| 0 | a | b |
| 1 | a | b |
UPDATE:
As was noted in the comments, the data in the example is too simple.
The problem is more complicated
An example that would better describe it:
DB:
| id | value | value2 | value3 |
| 0 | a | a2 | a3 |
| 0 | b | b2 | b3 |
| 1 | a | c2 | c3 |
| 1 | b | d2 | d3 |
RESULT:
| id | value_a | value_b | value2_a | value2_b | value3_a | value3_b |
| 0 | a | b | a2 | b2 | a3 | b3 |
| 1 | a | b | c2 | d2 | c3 | d3 |
The output should be sorted by id and have all the info from both rows of each id.
If there are always two values per ID, you can try an aggregation using min() and max().
SELECT id,
min(value) value_a,
max(value) value_b
FROM elbat
GROUP BY id;
select t0.id,t0.Value as Value_A, t1.Value as Value_B
from test t0
inner join test t1 on t0.id = t1.id
where t0.Value = 'a' and t1.value = 'b';
I have used this method to turn "rows" into "columns". Depending on the number of unique values that exist in the table, you may or may not want to use this :)
SELECT id, SUM(CASE WHEN value = 'a' then 1 else 0 END) value_a,
SUM(CASE WHEN value = 'b' then 1 else 0 END) value_b,
SUM(CASE WHEN value = 'c' then 1 else 0 END) value_c,
SUM(CASE WHEN value = 'a2' then 1 else 0 END) value_a2,
.
.
.
FROM table
GROUP BY id;
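Note that SUM(CASE ... THEN 1 ...) returns counts. To get the values themselves, as in the extended example from the update, a variant with MAX(CASE ... THEN column END) is one possibility. The following is a minimal sketch assuming Python's sqlite3 module and a hypothetical table name t; the question does not say which database is in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, value TEXT, value2 TEXT, value3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(0, "a", "a2", "a3"), (0, "b", "b2", "b3"),
     (1, "a", "c2", "c3"), (1, "b", "d2", "d3")],
)

# For each id, pick each column's value from the 'a' row and from the 'b' row.
query = """
    SELECT id,
           MAX(CASE WHEN value = 'a' THEN value  END) AS value_a,
           MAX(CASE WHEN value = 'b' THEN value  END) AS value_b,
           MAX(CASE WHEN value = 'a' THEN value2 END) AS value2_a,
           MAX(CASE WHEN value = 'b' THEN value2 END) AS value2_b,
           MAX(CASE WHEN value = 'a' THEN value3 END) AS value3_a,
           MAX(CASE WHEN value = 'b' THEN value3 END) AS value3_b
    FROM t
    GROUP BY id
    ORDER BY id
"""
wide = conn.execute(query).fetchall()
for row in wide:
    print(row)
# (0, 'a', 'b', 'a2', 'b2', 'a3', 'b3')
# (1, 'a', 'b', 'c2', 'd2', 'c3', 'd3')
```

MAX is used only to collapse the group to its single non-NULL value; it relies on there being at most one 'a' row and one 'b' row per id.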
Thanks all for the answers! This is how I did it:
WITH a_table AS
(
SELECT id, value, value2, value3 FROM table1 WHERE table1.value = 0
),
b_table AS
(
SELECT id, value, value2, value3 FROM table1 WHERE table1.value = 1
)
SELECT DISTINCT
a_table.id AS id,
a_table.value AS value_a,
a_table.value2 AS value2_a,
a_table.value3 AS value3_a,
b_table.value AS value_b,
b_table.value2 AS value2_b,
b_table.value3 AS value3_b
FROM a_table
JOIN b_table ON a_table.id = b_table.id
ORDER BY a_table.id; -- an unqualified GROUP BY id is ambiguous here, and DISTINCT already removes duplicates
I have a table like this
A1 | A2
a | b
c | d
b | a
a | b
And I want to select distinct pairs, regardless of order:
A1 | A2
a | b
c | d
I tried :
select a, b from (
select a, b , a|b as ab, b|a as ba from T
)t where ab!=ba group by a, b
Does anyone have a better idea of how I can do this?
Thanks
An ANSI compliant way of doing this would be to rearrange each pair of A1 and A2 values as min/max using CASE expressions. Then just select distinct on this derived table.
SELECT DISTINCT
A1, A2
FROM
(
SELECT
CASE WHEN A1 < A2 THEN A1 ELSE A2 END AS A1,
CASE WHEN A1 < A2 THEN A2 ELSE A1 END AS A2
FROM yourTable
) t
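As a quick check, here is the same query run on the sample rows. This is a sketch assuming Python's sqlite3 module and the placeholder table name yourTable from the answer; an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (A1 TEXT, A2 TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [("a", "b"), ("c", "d"), ("b", "a"), ("a", "b")],
)

# Put the smaller value first in every pair, then deduplicate.
query = """
    SELECT DISTINCT A1, A2
    FROM (
        SELECT CASE WHEN A1 < A2 THEN A1 ELSE A2 END AS A1,
               CASE WHEN A1 < A2 THEN A2 ELSE A1 END AS A2
        FROM yourTable
    ) t
    ORDER BY A1, A2
"""
pairs = conn.execute(query).fetchall()
print(pairs)  # [('a', 'b'), ('c', 'd')]
```

Both ('b', 'a') and the repeated ('a', 'b') collapse into a single ('a', 'b') row.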
This would be the cleanest way if NULL values are not involved:
select distinct
least (A1,A2) as A1
,greatest (A1,A2) as A2
from t
;
+-----+-----+
| a1 | a2 |
+-----+-----+
| a | b |
| c | d |
+-----+-----+