Find matching column data between two rows in the same table - sql

I want to find the matching value between two rows in the same sqlite table. For example, if I have the following table:
rowid, col1, col2, col3
----- ---- ---- ----
1 5 3 1
2 3 6 9
3 9 12 5
So comparing row 1 and 2, I get the value 3.
Row 2 and 3 will give 9.
Row 3 and 1 will give 5.
There will always be one and only one matching value between any two rows in the table.
What it the correct sqlite query for this?

I hardcoded the values for the rows because i do not know how to declare variables in sqllite.
select t1.rowid as r1, t2.rowid as r2, t2.col as matchvalue from <yourtable> t1 join
(
select rowid, col1 col from <yourtable> where rowid = 3 union all
select rowid, col2 from <yourtable> where rowid = 3 union all
select rowid, col3 from <yourtable> where rowid = 3
) t2
on t2.col in (t1.col1, t1.col2, t1.col3)
and t1.rowid < t2.rowid -- you don't need this if you have two specific rows
and t1.rowid = 1

select col from
(
select rid, c1 as col from yourtable
union
select rid, c2 from yourtable
union
select rid, c3 from yourtable
) v
where rid in (3,2)
group by col
order by COUNT(*) desc
limit 1

Related

Return rows where specific column has duplicate values

From the table below I want to show the two rows where the values in column 3 are duplicates:
ID
Col2
Col3
1
a
123
2
b
123
3
c
14
4
d
65
5
e
65
This means that the query that I need should return rows with ID 1, 2 and 4, 5.
I wrote query using having:
SELECT *
FROM t1
INNER JOIN (SELECT col3 FROM t1
GROUP BY col3
HAVING COUNT(*) > 1) a
ON t1.col3 = a.col3
This query though only returns 1 and 4 rows for example, not all duplicates.
I would appreciate the help.
Your query should work, but I would suggest window functions:
select t1.*
from (select t1.*, count(*) over (partition by col3) as cnt
from t1
) t1
where cnt > 1;

select query to fetch rows corresponding to all values in a column

Consider this example table "Table1".
Col1 Col2
A 1
B 1
A 4
A 5
A 3
A 2
D 1
B 2
C 3
B 4
I am trying to fetch those values from Col1 which corresponds to all values (in this case, 1,2,3,4,5). Here the result of the query should return 'A' as none of the others have all values 1,2,3,4,5 in Col2.
Note that the values in Col2 are decided by other parameters in the query and they will always return some numeric values. Out of those values the query needs to fetch values from Col1 corresponding to all in Col2. The values in Col2 could be 11,12,1,2,3,4 for instance (meaning not necessarily in sequence).
I have tried the following select query:
select distinct Col1 from Table1 where Col1 in (1,2,3,4,5);
select distinct Col1 from Table1 where Col1 exists (select distinct Col2 from Table1);
and its different variations. But the problem is that I need to apply an 'and' for Col2 not an 'or'.
like Return a value from Col1 where Col2 'contains' all values between 1 and 5.
Appreciate any suggestion.
You could use analytic ROW_NUMBER() function.
SQL FIddle for a setup and working demonstration.
SELECT col1
FROM
(SELECT col1,
col2,
row_number() OVER(PARTITION BY col1 ORDER BY col2) rn
FROM your_table
WHERE col2 IN (1,2,3,4,5)
)
WHERE rn =5;
UPDATE As requested by OP, some explanation about how the query works.
The inner sub-query gives you the following resultset:
SQL> SELECT col1,
2 col2,
3 row_number() OVER(PARTITION BY col1 ORDER BY col2) rn
4 FROM t
5 WHERE col2 IN (1,2,3,4,5);
C COL2 RN
- ---------- ----------
A 1 1
A 2 2
A 3 3
A 4 4
A 5 5
B 1 1
B 2 2
B 4 3
C 3 1
D 1 1
10 rows selected.
PARTITION BY clause will group each sets of col1, and ORDER BY will sort col2 in each group set of col1. Thus the sub-query gives you the row_number for each row in an ordered way. now you know that you only need those rows where row_number is at least 5. So, in the outer query all you need ot do is WHERE rn =5 to filter the rows.
You can use listagg function, like
SELECT Col1
FROM
(select Col1,listagg(Col2,',') within group (order by Col2) Col2List from Table1
group by Col1)
WHERE Col2List = '1,2,3,4,5'
You can also use below
SELECT COL1
FROM TABLE_NAME
GROUP BY COL1
HAVING
COUNT(COL1)=5
AND
SUM(
(CASE WHEN COL2=1 THEN 1 ELSE 0
END)
+
(CASE WHEN COL2=2 THEN 1 ELSE 0
END)
+
(CASE WHEN COL2=3 THEN 1 ELSE 0
END)
+
(CASE WHEN COL2=4 THEN 1 ELSE 0
END)
+
(CASE WHEN COL2=5 THEN 1 ELSE 0
END))=5

require to form a sql query

I was working on preparing a query where I was stuck.
Consider tables below:
table1
id key col1
-- --- -----
1 1 abc
2 2 d
3 3 s
4 4 xyz
table2
id col1 foreignkey
-- ---- ----------
1 12 1
2 13 1
3 14 1
4 12 2
5 13 2
Now what I need is to select only those records from table1 for which the corresponding entries in table2 does not have say col1 value as 12.
So the challenge is after applying join even though it will skip for value 1 corresponding to col1 equal to 12 it still has another multiple rows whose values are say 13, 14 for which also they have same foreignkey. Now what I want is if there is a single row having value 12 then it should not pick that id at all from table1.
How can I form a query with this?
The output which i need is say from above table structure i want to get those records from table1 for which col1 value from table2 does not have value as 14.
so my query should return me only row 2 from table1 and not row 1.
Another way of doing that. The first two queries are just for making the sample data.
;WITH t1(id ,[key] ,col1) AS
(
SELECT 1 , 1 , 'abc' UNION ALL
SELECT 2 , 2 , 'd' UNION ALL
SELECT 3 , 3 , 's' UNION ALL
SELECT 4 , 4 , 'xyz'
)
,t2(id ,col1, foreignkey) AS
(
SELECT 1 , 12 , 1 UNION ALL
SELECT 2 , 13 , 1 UNION ALL
SELECT 3 , 14 , 1 UNION ALL
SELECT 4 ,12 , 2 UNION ALL
SELECT 5 ,13 , 2
)
SELECT id, [key], col1
FROM t1
WHERE id NOT IN (SELECT t2.Id
FROM t2
INNER JOIN t1 ON t1.Id = t2.foreignkey
WHERE t2.col1 = 14)
This is a typical case for NOT EXISTS:
SELECT id, [key], col1
FROM table1 t1
WHERE NOT EXISTS (SELECT 1
FROM table2 t2
WHERE t2.foreignkey = t1.id AND t2.col1 = 14)
The above query will not select a row from table1 if there is a single correlated row in table2 having col1 = 14.
Output:
id key col1
-------------
2 2 d
3 3 s
4 4 xyz
If you want to return records that, in addition to the criterion set above, also have correlated records in table2, then you can use the following query:
SELECT t1.id, MAX(t1.[key]) AS [key], MAX(t1.col1) AS col1
FROM table1 t1
INNER JOIN table2 t2 ON t1.id = t2.foreignkey
GROUP BY t1.id
HAVING COUNT(CASE WHEN t2.col1 = 14 THEN 1 END) = 0
Output:
id key col1
-------------
2 2 d
You can also achieve the same result with the second query using a combination of EXISTS and NOT EXISTS:
SELECT id, [key], col1
FROM table1 t1
WHERE EXISTS (SELECT 1
FROM table2 t2
WHERE t2.foreignkey = t1.id)
AND
NOT EXISTS (SELECT 1
FROM table2 t3
WHERE t3.foreignkey = t1.id AND t3.col1 = 14)
select t1.id,t1.key,
(select ROW_NUMBER() OVER(PARTITION BY col1 ORDER BY col1 DESC) AS Row,* into
#Temp from table1)
from table1 t1
inner join table2 t2 on t1.id=t2.foreignkey
where t2.col1=(select col1 from #temp where row>1)

Apply the distinct on 2 fields and also fetch the unique data for each columns

According to some weird requirement, i need to select the record where all the output values in both the columns should be unique.
Input looks like this:
col1 col2
1 x
1 y
2 x
2 y
3 x
3 y
3 z
Expected Output is:
col1 col2
1 x
2 y
3 z
or
col1 col2
1 y
2 x
3 z
I tried applying the distinct on 2 fields but that returns all the records as overall they are distinct on both the fields. What we want to do is that if any value is present in the col1, then it cannot be repeated in the col2.
Please let me know if this is even possible and if yes, how to go about it.
Great problem! Armunin has picked up on the deeper structural issue here, this is a recursive enumerable problem description and can only be resolved with a recursive solution - base relational operators (join/union/etc) are not going to get you there. As Armunin cited, one approach is to bring out the PL/SQL, and though I haven't checked it in detail, I'd assume the PL/SQL code will work just fine. However, Oracle is kind enough to support recursive SQL, through which we can build the solution in just SQL:
-- Note - this SQL will generate every solution - you will need to filter for SOLUTION_NUMBER=1 at the end
with t as (
select 1 col1, 'x' col2 from dual union all
select 1 col1, 'y' col2 from dual union all
select 2 col1, 'x' col2 from dual union all
select 2 col1, 'y' col2 from dual union all
select 3 col1, 'x' col2 from dual union all
select 3 col1, 'y' col2 from dual union all
select 3 col1, 'z' col2 from dual
),
t0 as
(select t.*,
row_number() over (order by col1) id,
dense_rank() over (order by col2) c2_rnk
from t),
-- recursive step...
t1 (c2_rnk,ids, str) as
(-- base row
select c2_rnk, '('||id||')' ids, '('||col1||')' str
from t0
where c2_rnk=1
union all
-- induction
select t0.c2_rnk, ids||'('||t0.id||')' ids, str||','||'('||t0.col1||')'
from t1, t0
where t0.c2_rnk = t1.c2_rnk+1
and instr(t1.str,'('||t0.col1||')') =0
),
t2 as
(select t1.*,
rownum solution_number
from t1
where c2_rnk = (select max(c2_rnk) from t1)
)
select solution_number, col1, col2
from t0, t2
where instr(t2.ids,'('||t0.id||')') <> 0
order by 1,2,3
SOLUTION_NUMBER COL1 COL2
1 1 x
1 2 y
1 3 z
2 1 y
2 2 x
2 3 z
You can use a full outer join to merge two numbered lists together:
SELECT col1, col2
FROM ( SELECT col1, ROW_NUMBER() OVER ( ORDER BY col1 ) col1_num
FROM your_table
GROUP BY col1 )
FULL JOIN
( SELECT col2, ROW_NUMBER() OVER ( ORDER BY col2 ) col2_num
FROM your_table
GROUP BY col2 )
ON col1_num = col2_num
Change ORDER BY if you require a different order and use ORDER BY NULL if you're happy to let Oracle decide.
What would be the result if another row of
col1 value as 1 and col2 value as xx ?
A single row is better in this case:
SELECT DISTINCT TO_CHAR(col1) FROM your_table
UNION ALL
SELECT DISTINCT col2 FROM your_table;
My suggestion is something like this:
begin
EXECUTE IMMEDIATE 'CREATE global TEMPORARY TABLE tmp(col1 NUMBER, col2 VARCHAR2(50))';
end;
/
DECLARE
cur_print sys_refcursor;
col1 NUMBER;
col2 VARCHAR(50);
CURSOR cur_dist
IS
SELECT DISTINCT
col1
FROM
ttable;
filtered sys_refcursor;
BEGIN
FOR rec IN cur_dist
LOOP
INSERT INTO tmp
SELECT
col1,
col2
FROM
ttable t1
WHERE
t1.col1 = rec.col1
AND t1.col2 NOT IN
(
SELECT
tmp.col2
FROM
tmp
)
AND t1.col1 NOT IN
(
SELECT
tmp.col1
FROM
tmp
)
AND ROWNUM = 1;
END LOOP;
FOR rec in (select col1, col2 from tmp) LOOP
DBMS_OUTPUT.PUT_LINE('col1: ' || rec.col1 || '|| col2: ' || rec.col2);
END LOOP;
EXECUTE IMMEDIATE 'DROP TABLE tmp';
END;
/
May still need some refining, I am especially not happy with the ROWNUM = 1 part.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE tbl ( col1, col2 ) AS
SELECT 1, 'x' FROM DUAL
UNION ALL SELECT 1, 'y' FROM DUAL
UNION ALL SELECT 2, 'x' FROM DUAL
UNION ALL SELECT 2, 'y' FROM DUAL
UNION ALL SELECT 3, 'x' FROM DUAL
UNION ALL SELECT 3, 'y' FROM DUAL
UNION ALL SELECT 4, 'z' FROM DUAL;
Query 1:
WITH c1 AS (
SELECT DISTINCT
col1,
DENSE_RANK() OVER (ORDER BY col1) AS rank
FROM tbl
),
c2 AS (
SELECT DISTINCT
col2,
DENSE_RANK() OVER (ORDER BY col2) AS rank
FROM tbl
)
SELECT c1.col1,
c2.col2
FROM c1
FULL OUTER JOIN c2
ON ( c1.rank = c2.rank)
ORDER BY COALESCE( c1.rank, c2.rank)
Results:
| COL1 | COL2 |
|------|--------|
| 1 | x |
| 2 | y |
| 3 | z |
| 4 | (null) |
And to address the additional requirement:
What we want to do is that if any value is present in the col1, then it cannot be repeated in the col2.
Query 2:
WITH c1 AS (
SELECT DISTINCT
col1,
DENSE_RANK() OVER (ORDER BY col1) AS rank
FROM tbl
),
c2 AS (
SELECT DISTINCT
col2,
DENSE_RANK() OVER (ORDER BY col2) AS rank
FROM tbl
WHERE col2 NOT IN ( SELECT TO_CHAR( col1 ) FROM c1 )
)
SELECT c1.col1,
c2.col2
FROM c1
FULL OUTER JOIN c2
ON ( c1.rank = c2.rank)
ORDER BY COALESCE( c1.rank, c2.rank)

Tricky delete. How do i?

I got a table with 2 columns (both INT), and there are 400 000 records (a lot).
The first column is random numbers ordered ASC. The second column has a rule on it (which is not important right now)
In the table there are 1000 records, that are exceptions. So, instead of the "rule", there is only "-1" - valued cells.
How can I delete ~399 000 records, so i want to have in my table left only the ones with -1 and their "neighbors" (the records before and after the ones with -1)
UPDATE
sql server 2k5
first column values - yes unique, but not ID-s (it's not ++ :D)
example:
before:
20022518 13
20022882 364
20022885 -1
20022887 5
20022905 18
20023200 295
20023412 212
20023696 284
20024112 416
20025015 903
20025400 385
20025401 -1
20025683 283
20025981 298
20025989 8
20026752 763
20027779 1027
20028344 565
20028350 6
20028896 546
20028921 25
20028924 -1
20028998 77
20029031 33
20029051 20
20029492 441
20029530 38
20029890 360
after:
20022882 364
20022885 -1
20022887 5
20025400 385
20025401 -1
20025683 283
20028921 25
20028924 -1
20028998 77
If I understand correctly you want to keep all records with col2 = -1 and the records with the closest col1 to the records with -1. Assuming no duplicates in col1 I would do something like this
delete from table where not col1 in
(
(select col1 from table where col2 = -1)
union
(select (select max(t2.col1) from table t2 where t2.col1 < t1.col1) from table t1 where t1.col2 = -1)
union
(select (select min(t4.col1) from table t4 where t4.col1 > t3.col1) from table t3 where t3.col2 = -1)
)
Edit:
t4.col1 < t3.col1 should be t4.col1 > t3.col1
I created a test-table with col1 and col2, both int, col1 is PK, but not autonumber
SELECT * from adjacent
Gives
col1 col2
1 5
3 4
4 2
7 -1
11 8
12 2
With the above subselects:
SELECT * from adjacent
where
col1 in
(
(select col1 from adjacent where col2 = -1)
union
(select (select max(t2.col1) from adjacent t2 where t2.col1 < t1.col1) from adjacent t1 where t1.col2 = -1)
union
(select (select min(t4.col1) from adjacent t4 where t4.col1 > t3.col1) from adjacent t3 where t3.col2 = -1)
)
gives
col1 col2
4 2
7 -1
11 8
With the not also
SELECT * from adjacent
where
col1 not in
(
(select col1 from adjacent where col2 = -1)
union
(select (select max(t2.col1) from adjacent t2 where t2.col1 < t1.col1) from adjacent t1 where t1.col2 = -1)
union
(select (select min(t4.col1) from adjacent t4 where t4.col1 > t3.col1) from adjacent t3 where t3.col2 = -1)
)
gives
col1 col2
1 5
3 4
12 2
Finally a delete and select
delete from adjacent
where
col1 not in
(
(select col1 from adjacent where col2 = -1)
union
(select (select max(t2.col1) from adjacent t2 where t2.col1 < t1.col1) from adjacent t1 where t1.col2 = -1)
union
(select (select min(t4.col1) from adjacent t4 where t4.col1 > t3.col1) from adjacent t3 where t3.col2 = -1)
)
select * from adjacent
gives
col1 col2
4 2
7 -1
11 8
Assuming SQL Server here. Your best bet, if you are keeping a very small dataset, is to insert into a new table. I.E.:
SELECT *
INTO MyTable2
FROM MyTable
WHERE ColumnB = -1
DROP TABLE MyTable
exec sp_rename MyTable2 MyTable
This will be a minimally logged operation and will run in a fraction of the time of a DELETE.
Without another key there is no way to ensure you get the "neighbors" since this is not really a valid concept in a relational DB. If the first column is "random" you can't tell which ones are "before" and "after" a row with a -1 value.
If by "random" you mean it's like an IDENTITY column that increases automatically, AND YOU HAVE NO MISSING VALUES IN THE SEQUENCE you can do something like:
SELECT *
INTO MyTable2
FROM MyTable mt
WHERE ColumnB = -1
OR WHERE EXISTS (
SELECT * FROM MyTable mt2
WHERE mt2.id = mt.id + 1
OR mt2.id = mt.id -1)
DROP TABLE MyTable
exec sp_rename MyTable2 MyTable
The solution is to number the records first, identify those adjacent to the -1 rules and then use UNION to assemble the final result:
WITH Numbered(seq, id, ruleno) AS (
SELECT
ROW_NUMBER() OVER (ORDER BY id), id, ruleno
FROM
Tricky
),
Brothers(id, ruleno) AS (
SELECT
b.id, b.ruleno
FROM
Numbered a INNER JOIN Numbered b
ON a.ruleno = -1 AND
abs(a.seq - b.seq) = 1
),
Triplets(id, ruleno) AS (
SELECT
id, ruleno
FROM
Tricky
WHERE
ruleno = -1
UNION ALL
SELECT
id, ruleno
FROM
Brothers
)
-- Display results
SELECT
id, ruleno
FROM
Triplets
ORDER BY
id
Result:
id ruleno
20022882 364
20022885 -1
20022887 5
20025400 385
20025401 -1
20025683 283
20028921 25
20028924 -1
20028998 77
Finally:
DELETE FROM
Tricky
WHERE
id NOT IN (
SELECT
id
FROM
triplets
)
USe this tricky query:
for this I created a table by below statement:
create table t1 (val int, val2 int)
GO
-- below is the exact stmt:
With CTE as(select val, val2, row_number() over (order by val ASC) as rnum
from t1)
DELETE t1
From t1 inner join cte a
ON t1.val = a.val INNER JOIN (SELECT * fROM cte where val2 = -1) as b
on a.rnum = b.rnum
or a.rnum = b.rnum - 1
or a.rnum = b.rnum + 1
For more information baout CTE please see this post:
http://blog.sqlauthority.com/2009/08/08/sql-server-multiple-cte-in-one-select-statement-query/