SQL: how to find duplicated combinations in two adjacent columns

I have a table (field1, field2, field3, field4).
How can I select only those rows that contain duplicated combinations in two adjacent columns, field3 and field4?

Try this:
select *
from mytable t
join (
select field3, field4, count(*) as cnt from (
select field3, field4 from mytable where field3 <= field4
union all
select field4, field3 from mytable where field3 > field4) x
group by field3, field4
having count(*) > 1) y
on (t.field3 = y.field3 and t.field4 = y.field4)
or (t.field3 = y.field4 and t.field4 = y.field3)
The innermost query uses union all to line up the two values of every row in a consistent order (smaller value first) without removing duplicates, as union would. The two where clauses are mutually exclusive, so no row is selected twice.
The inner query is then grouped with a having clause to pick the duplicates.
The outer query joins to these both ways to get all the rows.
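To check the behaviour described above, here is a minimal sketch that runs the same query against an in-memory SQLite database through Python's sqlite3 module. The sample rows are invented for illustration, and SQLite is only an assumption; the original question does not name a database. Note that the query treats (10, 20) and (20, 10) as the same combination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (field1, field2, field3, field4)")
conn.executemany(
    "insert into mytable values (?, ?, ?, ?)",
    [
        (1, "a", 10, 20),  # pair {10, 20}
        (2, "b", 20, 10),  # same pair, reversed
        (3, "c", 10, 30),  # pair {10, 30}, appears once
        (4, "d", 30, 30),  # pair {30, 30}, appears once
        (5, "e", 10, 20),  # pair {10, 20} again
    ],
)

query = """
select t.field1
from mytable t
join (
    select field3, field4, count(*) as cnt from (
        select field3, field4 from mytable where field3 <= field4
        union all
        select field4, field3 from mytable where field3 > field4) x
    group by field3, field4
    having count(*) > 1) y
on (t.field3 = y.field3 and t.field4 = y.field4)
or (t.field3 = y.field4 and t.field4 = y.field3)
order by t.field1
"""

# Only the rows whose {field3, field4} pair occurs more than once come back.
duplicated_ids = [row[0] for row in conn.execute(query)]
print(duplicated_ids)
```

Rows 1, 2 and 5 share the pair {10, 20}, so they are the only ones returned; rows 3 and 4 each have a pair that occurs once.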

Related

Process TOP(n) records with an SQL CTE in an SQL WHILE Loop

First timer here, so if I am doing something wrong, please do not hesitate to tell me.
I have a situation where I know what to do, but I do not know how to do it. The situation is as follows:
Take records from one table; each record has to be split into two parts in a new table. This means I take 1 record and it ends up being 2 in another table. For this purpose I have written a CTE in SQL. This works perfectly. It splits the record into two parts. This is what the CTE looks like:
WITH
cteFirstLine (Field1, Field2, Field3) AS
(
SELECT TOP 1000 T.Field1, T.Field2, T.Field3
FROM Table1 T
LEFT JOIN Table2 P ON T.Field1 = P.Field4
ORDER BY Field2 DESC
)
,
cteSecondLine(Field1, Field2, Field3) AS
(
SELECT TOP 1000 T.Field1, T.Field2, T.Field3
FROM Table1 T
LEFT JOIN Table2 P ON T.Field1 = P.Field4
ORDER BY Field2 DESC
)
,
cteUnified (Field1, Field2, Field3)
AS
(
SELECT Field1, Field2, Field3
FROM cteFirstLine
UNION
SELECT Field1, Field2, Field3
FROM cteSecondLine
)
INSERT INTO Table_TEST(Field1, Field2, Field3)
SELECT TOP 2000 Field1, Field2, Field3
FROM cteUnified
OK, so this works. The problem I now have is that the old table, the one from which I have to process these records, contains almost a million records. I know it is a lot, but it used to be way more than that, hence this task.
I need to know: how can I use this CTE inside a loop? Will a loop suffice? How would I construct the loop so that it does everything automatically? By automatically I mean processing more than just 1000 records at a time.
Can anyone please help?

How to use LIKE statement with multiple values from another field?

I want to select all records where a field contains values from another field. How do I do that? Here is the code I am trying.
select field1 , field2, field3
from table1
where field1 like '%'+(select distinct field4 from table2)+'%'
Thanks in advance.
Just do your like as a join condition:
select field1 , field2, field3
from table1
join (select distinct field4 from table2) x
on field1 like '%'+field4+'%'
Using the original structure of your query, you can do:
select field1, field2, field3
from table1 t1
where exists (select 1
from table2
where t1.field1 like '%' + field4 + '%'
);
The advantage of this method is that it will not duplicate records in table1. For instance, if there are two rows in table2 with the values 'a' and 'b' respectively and one row in table1 with the value 'ab', then this method will only return the row from table1 once.
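The difference between the two approaches can be reproduced with the 'a' / 'b' / 'ab' example from the paragraph above. This is a small sketch using Python's sqlite3 module; SQLite is an assumption made for the demo, so string concatenation is written as || rather than the + used in the original queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (field1, field2, field3)")
conn.execute("create table table2 (field4)")
conn.execute("insert into table1 values ('ab', 1, 2)")
conn.executemany("insert into table2 values (?)", [("a",), ("b",)])

# Join version: one output row per matching value in table2.
join_rows = conn.execute("""
    select field1, field2, field3
    from table1
    join (select distinct field4 from table2) x
      on field1 like '%' || field4 || '%'
""").fetchall()

# EXISTS version: each table1 row appears at most once.
exists_rows = conn.execute("""
    select field1, field2, field3
    from table1 t1
    where exists (select 1
                  from table2
                  where t1.field1 like '%' || field4 || '%')
""").fetchall()

print(len(join_rows), len(exists_rows))
```

The join returns the 'ab' row twice (once for 'a', once for 'b'), while the exists version returns it once.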

select distinct value in sql with 35 columns

I have a table that contains 35 columns. How would I select only the distinct records from that table? This is what my query looks like:
`SELECT field1, field2, field3 etc... from table1 group by field1, field2, field3 etc...`
This gets me the unique results that I want, but with 35 columns it is too long to list all of them in the GROUP BY. Is there a more efficient way of doing this?
When I do this instead, I get repeated results:
SELECT distinct * from table1
DISTINCT is simpler to write and should be no slower.
Your query should look like:
SELECT distinct field1, field2, field3 etc...
from table1
Use distinct once.
This will affect all columns.
You don't need GROUP BY if you use DISTINCT. But you have to list all the fields that you want to show, and leave out the primary key, because a unique key makes every row distinct.
SELECT DISTINCT field1, field2, field3 etc...
FROM table1
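A short sketch of why `SELECT distinct *` can still return repeated-looking rows, using Python's sqlite3 module. The `id` primary key column is an assumption; the question does not say the table has one, but a unique column of that kind is the usual cause. Three columns stand in for the 35.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: a surrogate key plus the data columns.
conn.execute(
    "create table table1 (id integer primary key, field1, field2, field3)"
)
conn.executemany(
    "insert into table1 (field1, field2, field3) values (?, ?, ?)",
    [("x", 1, 2), ("x", 1, 2), ("y", 3, 4)],
)

# The key differs on every row, so DISTINCT * removes nothing.
with_key = conn.execute("select distinct * from table1").fetchall()

# Listing only the data columns collapses the duplicates.
without_key = conn.execute("""
    select distinct field1, field2, field3
    from table1
    order by field1
""").fetchall()

# The GROUP BY form from the question gives the same rows.
grouped = conn.execute("""
    select field1, field2, field3
    from table1
    group by field1, field2, field3
    order by field1
""").fetchall()

print(len(with_key), without_key, grouped)
```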

SQL Having Clause with multiple selected fields

I have a SQL query such as the following:
SELECT field1, field2, field3, field4, field5
FROM tablename
WHERE field1 = condition
GROUP BY 1,2,3,4,5
HAVING COUNT(field1) > 2
I expected the query to return only the results which have more than 2 rows in the result set; however, the query returns zero rows.
Could anyone point out what I'm doing wrong? I need to keep my query selecting the fields it has been, but limit the results coming back to only those who have at least 2 rows. If they only have 1, I don't want them included in my results.
The where clause restricts field1 to a single value, and the query then groups by all five selected columns.
Each group therefore contains only rows that are identical in every selected field, so count(field1) is 1 unless entire rows are duplicated.
That's why you get 0 results: the count is never > 2.
I suspect (although from my comment under the question I am not sure) that you want
SELECT field1, field2, field3, field4, field5
FROM tablename
WHERE field1 = condition
AND field1 IN
(SELECT field1
FROM tablename
GROUP BY 1
HAVING COUNT(field1) >=2 );
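The effect can be seen on a tiny data set. This is a sketch using Python's sqlite3 module; the rows and the literal 'k' standing in for `condition` are made up, and the columns are named in GROUP BY instead of using positions to keep the demo portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tablename (field1, field2, field3, field4, field5)"
)
conn.executemany(
    "insert into tablename values (?, ?, ?, ?, ?)",
    [
        ("k", 1, 1, 1, 1),
        ("k", 2, 2, 2, 2),
        ("k", 3, 3, 3, 3),
        ("z", 9, 9, 9, 9),
    ],
)

# Grouping by every selected column: each group holds exactly one row,
# so the HAVING filter discards everything.
grouped_all = conn.execute("""
    select field1, field2, field3, field4, field5
    from tablename
    where field1 = 'k'
    group by field1, field2, field3, field4, field5
    having count(field1) > 2
""").fetchall()

# Counting per field1 in a subquery keeps the detail rows.
filtered = conn.execute("""
    select field1, field2, field3, field4, field5
    from tablename
    where field1 = 'k'
      and field1 in (select field1
                     from tablename
                     group by field1
                     having count(field1) >= 2)
    order by field2
""").fetchall()

print(grouped_all, filtered)
```

The first query returns nothing, while the second returns all three 'k' rows because 'k' occurs at least twice in the table.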

ORACLE Select and group by excluding one field

I have a very simple query (on Oracle 11g) to select 3 fields:
select field1, field2, field3, count(*) from table
where...
group by field1, field2, field3
having count(*) > 10;
Now, what I need is to exclude "field3" from the "group by", since I only need field1 and field2 to be grouped, but I also need field3 in the output.
As far as I know, all the fields in the select must also appear in the "group by", so how can I handle that?
Thanks
Lucas
select t.field1, t.field2, t.field3, tc.Count
from table t
inner join (
select field1, field2, count(*) as Count
from table
where...
group by field1, field2
having count(*) > 10
) tc on t.field1 = tc.field1 and t.field2 = tc.field2
Use the analytical version of the "count" function:
select * from (
select field1, field2, field3, count(*) over(partition by field1, field2) mycounter
from table )
--simulate the having clause
where mycounter > 10;
If you don't group by field3 anymore, there can suddenly be different field3 per group. You must decide which one to show, e.g. the maximum:
select field1, field2, max(field3), count(*) from table
where...
group by field1, field2
having count(*) > 10;
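To compare the join-back approach with the max(field3) approach, here is a sketch run through Python's sqlite3 module. SQLite stands in for Oracle here, the table is called `t` because `table` is a reserved word, the sample rows are invented, and the threshold is lowered from 10 to 1 so that three rows are enough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (field1, field2, field3)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [("a", 1, "p"), ("a", 1, "q"), ("b", 2, "r")],
)

# Join back to the grouped counts: every field3 value is kept.
joined = conn.execute("""
    select t.field1, t.field2, t.field3, tc.cnt
    from t
    inner join (
        select field1, field2, count(*) as cnt
        from t
        group by field1, field2
        having count(*) > 1
    ) tc on t.field1 = tc.field1 and t.field2 = tc.field2
    order by t.field3
""").fetchall()

# Aggregate field3: one row per group, showing only the maximum.
collapsed = conn.execute("""
    select field1, field2, max(field3), count(*)
    from t
    group by field1, field2
    having count(*) > 1
""").fetchall()

print(joined)
print(collapsed)
```

The join keeps both the 'p' and 'q' rows of the ('a', 1) group along with the group count, whereas the aggregate version returns a single row with 'q'. Which one is right depends on whether every field3 value is needed or one representative is enough.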
The only way I know to handle that is to first isolate the Field1 and Field2 data as a derived table, then join it back to the original table to bring in Field3.
Select Table2.Field1, Table2.Field2, Table1.Field3
From Table1,
(Select Field1, max(Field2) as Field2
From Table1
Group By Field1) Table2
Where Table2.Field1 = Table1.Field1
And Table2.Field2 = Table1.Field2
Group By
Table2.Field1, Table2.Field2, Table1.Field3