Oracle - Selecting full rows from GROUP BY expression - sql

I want to select the full rows behind the groups produced by my GROUP BY, so that I can inspect the rows individually. This is what I came up with, but it doesn't return anything and I have no idea why.
SELECT * FROM myTable t ,
( SELECT
prop_1, prop_2 , prop_3
FROM
( SELECT COUNT(*) AS countx, prop_1, prop_2, prop_3 FROM myTable
GROUP BY prop_1, prop_2, prop_3 )
WHERE countx>1 ) subselect
WHERE
t.prop_1 = subselect.prop_1
AND t.prop_2 = subselect.prop_2
AND t.prop_3 = subselect.prop_3 ;
Maybe I should try a totally different approach, but please explain to me why this isn't working.

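I think you need something like this (a window function cannot appear directly in WHERE, so it goes in a subquery):
select * from (select t.*, count(*) over (partition by prop_1, prop_2, prop_3) cnt from myTable t)
where cnt > 1
Or, if you want the duplicates as pairs, a self-join may be cleaner: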
select * from myTable t1, myTable t2 where
t1.prop_1 = t2.prop_1 and
t1.prop_2 = t2.prop_2 and
t1.prop_3 = t2.prop_3 and
t1.id <t2.id
Otherwise, you can merge your two inner queries by putting having count(*) > 1 in the innermost one, instead of the where in the second.
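The HAVING variant mentioned above can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Oracle, and the id column and the sample rows are assumptions that do not come from the question.

```python
import sqlite3

# Hypothetical table and data, only to illustrate the query shape.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (id INTEGER PRIMARY KEY, prop_1 TEXT, prop_2 TEXT, prop_3 TEXT)"
)
conn.executemany(
    "INSERT INTO myTable (id, prop_1, prop_2, prop_3) VALUES (?, ?, ?, ?)",
    [(1, "a", "b", "c"), (2, "a", "b", "c"), (3, "x", "y", "z")],
)

# The inner query keeps only the groups that occur more than once (HAVING);
# the join then brings back the full rows belonging to those groups.
rows = conn.execute(
    """
    SELECT t.*
    FROM myTable t
    JOIN (
        SELECT prop_1, prop_2, prop_3
        FROM myTable
        GROUP BY prop_1, prop_2, prop_3
        HAVING COUNT(*) > 1
    ) dup
      ON t.prop_1 = dup.prop_1
     AND t.prop_2 = dup.prop_2
     AND t.prop_3 = dup.prop_3
    ORDER BY t.id
    """
).fetchall()

print(rows)
```

With this sample data only the two rows sharing ('a', 'b', 'c') come back; the single ('x', 'y', 'z') row is filtered out.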

Remove the middle sub-query and filter the groups with HAVING instead. Try this:
SELECT * FROM myTable t ,
( SELECT COUNT(*) AS countx, prop_1, prop_2, prop_3 FROM myTable
GROUP BY prop_1, prop_2, prop_3 HAVING COUNT(*) > 1 ) subselect
WHERE
t.prop_1 = subselect.prop_1
AND t.prop_2 = subselect.prop_2
AND t.prop_3 = subselect.prop_3 ;

Related

SQL Return only duplicate records

I want to return rows that have duplicate values in both the Full Name and Address columns in SQL. So in the example, I would just want the first two rows returned. How do I code this?
Why return duplicate values? Just aggregate and return the count:
select fullname, address, count(*) as cnt
from t
group by fullname, address
having count(*) >= 2;
One option uses window functions:
select *
from (
select t.*, count(*) over(partition by fullname, address) cnt
from mytable t
) t
where cnt > 1
If your table has a primary key, say id, you can also use exists:
select t.*
from mytable t
where exists (
select 1
from mytable t1
where t1.fullname = t.fullname and t1.address = t.address and t1.id <> t.id
)
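As a rough check that the window-function and EXISTS variants above select the same rows, here is a small sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The sample names and addresses are invented for illustration.

```python
import sqlite3

# Hypothetical data: rows 1 and 2 share both fullname and address.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, fullname TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO mytable (id, fullname, address) VALUES (?, ?, ?)",
    [
        (1, "Ann Lee", "1 Main St"),
        (2, "Ann Lee", "1 Main St"),
        (3, "Ann Lee", "9 Oak Ave"),
        (4, "Bob Ray", "1 Main St"),
    ],
)

# Variant 1: count rows per (fullname, address) with a window function.
window_ids = [
    r[0]
    for r in conn.execute(
        """
        SELECT id
        FROM (
            SELECT t.*, COUNT(*) OVER (PARTITION BY fullname, address) AS cnt
            FROM mytable t
        ) AS counted
        WHERE cnt > 1
        ORDER BY id
        """
    )
]

# Variant 2: keep a row if another row with a different id matches it.
exists_ids = [
    r[0]
    for r in conn.execute(
        """
        SELECT t.id
        FROM mytable t
        WHERE EXISTS (
            SELECT 1
            FROM mytable t1
            WHERE t1.fullname = t.fullname
              AND t1.address = t.address
              AND t1.id <> t.id
        )
        ORDER BY t.id
        """
    )
]

print(window_ids, exists_ids)
```

Rows 3 and 4 each match only one of the two columns, so neither variant returns them.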

Impala/SQL How can I put all the other fields on a group by statement?

I have a query grouped by 3 fields against a table with 100 fields. How can I put the other 97 fields in the select without a join?
This is my statement:
select a,b,c,max(d) as max_d
from mytable
group by a,b,c;
I know that the following query works, but it's very heavy :(
select myt.* from
(
select a,b,c,max(d) as max_d
from mytable
group by a,b,c
) uni
join mytable myt on (uni.a=myt.a AND uni.b=myt.b AND uni.c=myt.c AND uni.max_d=myt.d);
Thanks!!
Use window functions:
select t.*
from (select t.*, max(d) over (partition by a, b, c) as max_d
from mytable t
) t
where d = max_d;
You can use a correlated subquery instead:
select mt.*
from mytable mt
where mt.d = (select max(mt1.d)
from mytable mt1
where mt1.a = mt.a and mt1.b = mt.b and mt1.c = mt.c
);
You can use a correlated subquery:
select t.* from mytable t
where t.d in ( select max(d) from mytable t1
where t1.a=t.a and t1.b=t.b and t1.c=t.c
)
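To compare the window-function and correlated-subquery answers above, the sketch below runs both on a tiny table using Python's built-in sqlite3 module as a stand-in for Impala (SQLite 3.25+ for window functions). The extra column e and all values are assumptions made up for the example.

```python
import sqlite3

# Hypothetical table: a, b, c are the grouping keys, d is maximised,
# e stands in for the remaining columns that should come along.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (a INTEGER, b INTEGER, c INTEGER, d INTEGER, e TEXT)")
conn.executemany(
    "INSERT INTO mytable (a, b, c, d, e) VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 10, "x"), (1, 1, 1, 20, "y"), (2, 2, 2, 5, "z"), (2, 2, 2, 5, "w")],
)

# Window version: attach the per-group maximum to every row, then filter.
window_rows = conn.execute(
    """
    SELECT a, b, c, d, e
    FROM (
        SELECT t.*, MAX(d) OVER (PARTITION BY a, b, c) AS max_d
        FROM mytable t
    ) AS with_max
    WHERE d = max_d
    ORDER BY a, e
    """
).fetchall()

# Correlated-subquery version: look up the group maximum for each row.
subquery_rows = conn.execute(
    """
    SELECT mt.*
    FROM mytable mt
    WHERE mt.d = (
        SELECT MAX(mt1.d)
        FROM mytable mt1
        WHERE mt1.a = mt.a AND mt1.b = mt.b AND mt1.c = mt.c
    )
    ORDER BY mt.a, mt.e
    """
).fetchall()

print(window_rows)
print(subquery_rows)
```

Both versions return every row that ties for the maximum, so the (2, 2, 2) group contributes two rows here.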

How to ignore column in SQL Server

I have this query:
Select *
from
(Select
*,
ROW_NUMBER() OVER (PARTITION BY TID ORDER BY TID) AS RowNumber
from
MyTable
where
Eid = 'C1') as a
where
a.RowNumber = 1
and it displays these results:
Column1 Column2 RowNumber
------------------------------
Value1 value2 1
I want to leave the RowNumber column out of the result, and I don't want to list all the columns in the select (the table has 100+ columns; the ones shown are just an example).
How to do this in SQL Server?
Well, you would have to list all the columns in the outer select, if you use a subquery and row_number() to get a unique row.
An alternative method uses a correlated subquery, but requires having some unique column in the table. If you have one:
select t.*
from mytable t
where t.col = (select max(t2.col) from mytable t2 where t2.tid = t.tid and t2.eid = 'C1');
With the right indexes, this can have better performance than the row_number() version.
If you don't have a unique column, you can do:
select t.*
from (select distinct tid from mytable where eid = 'C1') tc cross apply
(select top 1 t.*
from mytable t
where t.tid = tc.tid and t.eid = 'C1'
) t;
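The correlated-subquery approach from this answer can be tried out with the sketch below. It runs on Python's built-in sqlite3 module rather than SQL Server, and the column names col and payload as well as the sample rows are hypothetical. The point is that t.* returns only the table's own columns, with no helper column to hide.

```python
import sqlite3

# Hypothetical table: col is the unique column the answer relies on.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col INTEGER PRIMARY KEY, tid TEXT, eid TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO mytable (col, tid, eid, payload) VALUES (?, ?, ?, ?)",
    [(1, "T1", "C1", "p1"), (2, "T1", "C1", "p2"), (3, "T2", "C1", "p3"), (4, "T2", "C2", "p4")],
)

# One row per tid among the eid = 'C1' rows, chosen by the largest col.
cur = conn.execute(
    """
    SELECT t.*
    FROM mytable t
    WHERE t.col = (
        SELECT MAX(t2.col)
        FROM mytable t2
        WHERE t2.tid = t.tid AND t2.eid = 'C1'
    )
    ORDER BY t.col
    """
)
column_names = [d[0] for d in cur.description]
rows = cur.fetchall()

print(column_names)
print(rows)
```

The result set carries exactly the four table columns, so nothing has to be excluded in an outer select.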
Wrap your query as a subquery and select specific columns from it like so:
SELECT x.Column1, x.Column2
FROM
(
Select * from (Select *, ROW_NUMBER() OVER (PARTITION BY TID ORDER BY TID)
AS RowNumber from MyTable where Eid='C1') as a where a.RowNumber=1
) AS x
OR Change your original Select to:
Select a.[Column1], a.[Column2]
from
(
Select *, ROW_NUMBER() OVER (PARTITION BY TID ORDER BY TID)
AS RowNumber from MyTable where Eid='C1'
) as a
Where a.RowNumber=1
Replace * in your query with exactly the columns you want:
select x.Column1, x.Column2 FROM (
Select * from (Select *, ROW_NUMBER() OVER (PARTITION BY TID ORDER BY TID)
AS RowNumber from MyTable where Eid='C1') as a where a.RowNumber=1) AS x

Select duplicated data from table

Query
select * from table1
where having count(reference)>1
I want to select all columns of the rows that have duplicate data. Any idea why my query is not working?
Below is my expected result.
You can make use of window function count to find number of rows per id and reference and then filter to get those which have count more than 1.
;with cte as (
select t.*, count(*) over (partition by id, reference) cnt
from table1 t
)
select * from cte where cnt > 1;
In the above solution, I have assumed that name and id have a one-to-one correspondence (which is true as per your given data). If that's not the case, add name too in the partition by clause:
;with cte as (
select t.*, count(*) over (partition by name, id, reference) cnt
from table1 t
)
select * from cte where cnt > 1;
I might actually approach this by using a subquery with GROUP BY:
SELECT t1.*
FROM table1 t1
INNER JOIN
(
SELECT Name, ID, reference
FROM table1
GROUP BY Name, ID, reference
HAVING COUNT(*) > 1
) t2
ON t1.Name = t2.Name AND
t1.ID = t2.ID AND
t1.reference = t2.reference
Try this: first I get the count per partition, then I keep the rows with count > 1.
select No, Name, ID, Reference
from (select count(*) over (partition by name, ID, reference) cnt, table1.* from table1) t
where cnt>1
The easy way (although maybe not the best for performance) would be:
select * from table1 where reference in (
select reference from table1 group by reference having count(*)>1
)
The subselect finds the duplicated references, and the outer select returns all the data for those references.
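One thing worth checking with the IN approach is that it matches on reference alone, while the earlier answers group by name, ID and reference together. The sketch below, using Python's built-in sqlite3 module and made-up rows (row_no is a hypothetical stand-in for the No column), shows a case where the two disagree.

```python
import sqlite3

# Hypothetical data: row 4 shares reference 'R1' but has a different name and ID.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (row_no INTEGER, Name TEXT, ID INTEGER, reference TEXT)")
conn.executemany(
    "INSERT INTO table1 (row_no, Name, ID, reference) VALUES (?, ?, ?, ?)",
    [(1, "Amy", 10, "R1"), (2, "Amy", 10, "R1"), (3, "Ben", 11, "R2"), (4, "Cat", 12, "R1")],
)

# Duplicates judged by reference only.
by_reference = [
    r[0]
    for r in conn.execute(
        """
        SELECT row_no FROM table1
        WHERE reference IN (
            SELECT reference FROM table1 GROUP BY reference HAVING COUNT(*) > 1
        )
        ORDER BY row_no
        """
    )
]

# Duplicates judged by the full (Name, ID, reference) combination.
by_all_three = [
    r[0]
    for r in conn.execute(
        """
        SELECT t1.row_no
        FROM table1 t1
        INNER JOIN (
            SELECT Name, ID, reference
            FROM table1
            GROUP BY Name, ID, reference
            HAVING COUNT(*) > 1
        ) t2
          ON t1.Name = t2.Name AND t1.ID = t2.ID AND t1.reference = t2.reference
        ORDER BY t1.row_no
        """
    )
]

print(by_reference, by_all_three)
```

Which result is right depends on what counts as a duplicate in the real data; if reference values are never shared between different names, the two queries agree.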

Select more columns with MAX function

I need to find the max value in the database, but then I need to read the other column values of that row.
Can this be done with one SQL command, or do I have to use these two commands?
SELECT MAX(id) FROM Table;
SELECT * FROM Table WHERE id = $value;
where $value is variable from 1st command
select * from your_table
where id = (select max(id) from your_table)
or
select t1.* from your_table t1
inner join
(
select max(id) as mid
from your_table
)
t2 on t1.id = t2.mid
Probably the simplest way is:
select *
from t
order by id desc
limit 1
Or use top 1 or where rownum = 1 or whatever is the right logic for your database.
Note: this only returns one row. If you have duplicate such rows, then comparison to the maximum will give you all of them.
Also, if you are using a database that supports window functions:
select *
from (select t.*, row_number() over (order by id desc) as seqnum
from t
) t
where seqnum = 1;
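The note about duplicate maximum rows can be illustrated with a short sketch. It uses Python's built-in sqlite3 module, and the table deliberately has a non-unique id column with invented values so that the maximum occurs twice; a real id column would usually be unique.

```python
import sqlite3

# Hypothetical table where the maximum id (3) appears in two rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, label TEXT)")
conn.executemany(
    "INSERT INTO t (id, label) VALUES (?, ?)",
    [(1, "a"), (3, "b"), (3, "c")],
)

# ORDER BY ... DESC LIMIT 1 returns a single row, whichever tie comes first.
top_one = conn.execute("SELECT * FROM t ORDER BY id DESC LIMIT 1").fetchall()

# Comparing against MAX(id) returns every row that holds the maximum.
all_max = conn.execute(
    "SELECT * FROM t WHERE id = (SELECT MAX(id) FROM t) ORDER BY label"
).fetchall()

print(top_one)
print(all_max)
```

If ties are possible and all of them are wanted, the comparison against the maximum is the safer choice.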