How to select all columns for rows where I check if just 1 or 2 columns contain duplicate values - sql

I'm having difficulty with what I figure should be an easy problem. I want to select all the columns in a table for which one particular column has duplicate values.
I've been trying to use aggregate functions, but that's constraining me as I want to just match on one column and display all values. Using aggregates seems to require that I 'group by' all columns I'm going to want to display.

If I understood you correctly, this should do:
SELECT *
FROM YourTable A
WHERE EXISTS(SELECT 1
FROM YourTable
WHERE Col1 = A.Col1
GROUP BY Col1
HAVING COUNT(*) > 1)

You can join on a derived table where you aggregate and determine "col" values which are duplicated:
SELECT a.*
FROM Table1 a
INNER JOIN
(
SELECT col
FROM Table1
GROUP BY col
HAVING COUNT(1) > 1
) b ON a.col = b.col

This query gives you a chance to ORDER BY cola in ascending or descending order and change Cola output.
Here's a Demo on SqlFiddle.
with cl
as
(
select *, ROW_NUMBER() OVER(partition by colb order by cola ) as rn
from tbl)
select *
from cl
where rn > 1

Related

How to intersect two tables without losing the duplicate values oracle

How to intersect two tables without losing the duplicate values in Oracle?
TAB1:
A
A
B
C
TAB2:
A
A
B
D
Output:
A
A
B
A subquery will filter the rows:
select *
from tab1
where col in (select col from tab2)
If I understand correctly:
select a.*, row_number() over (partition by col1 order by col1)
from a
intersect
select b.*, row_number() over (partition by col1 order by col1)
from b;
This adds a new sequential number to each row. Intersect will go up to the matching number.
This uses partition by col1 -- the col1 is arbitrary. You may need to include all columns in the partition by.

How to use order by with union all in sql?

I tried the sql query given below:
SELECT * FROM (SELECT *
FROM TABLE_A ORDER BY COLUMN_1)DUMMY_TABLE
UNION ALL
SELECT * FROM TABLE_B
It results in the following error:
The ORDER BY clause is invalid in views, inline functions, derived
tables, subqueries, and common table expressions, unless TOP or FOR
XML is also specified.
I need to use order by in union all. How do I accomplish this?
SELECT *
FROM
(
SELECT * FROM TABLE_A
UNION ALL
SELECT * FROM TABLE_B
) dum
-- ORDER BY .....
but if you want to have all records from Table_A on the top of the result list, the you can add user define value which you can use for ordering,
SELECT *
FROM
(
SELECT *, 1 sortby FROM TABLE_A
UNION ALL
SELECT *, 2 sortby FROM TABLE_B
) dum
ORDER BY sortby
You don't really need to have parenthesis. You can sort directly:
SELECT *, 1 AS RN FROM TABLE_A
UNION ALL
SELECT *, 2 AS RN FROM TABLE_B
ORDER BY RN, COLUMN_1
Not an OP direct response, but I thought I would jimmy in here responding to the the OP's ERROR messsage, which may point you in another direction entirely!
All these answers are referring to an overall ORDER BY once the record set has been retrieved and you sort the lot.
What if you want to ORDER BY each portion of the UNION independantly, and still have them "joined" in the same SELECT?
SELECT pass1.* FROM
(SELECT TOP 1000 tblA.ID, tblA.CustomerName
FROM TABLE_A AS tblA ORDER BY 2) AS pass1
UNION ALL
SELECT pass2.* FROM
(SELECT TOP 1000 tblB.ID, tblB.CustomerName
FROM TABLE_B AS tblB ORDER BY 2) AS pass2
Note the TOP 1000 is an arbitary number. Use a big enough number to capture all of the data you require.
There will be times when you need to do something like this :
Pull top 5 from table 1 based on a sort
and bottom 5 from table 2 based on another sort
and union these together.
solution
select * from (
-- top 5 records
select top 5 col1, col2, col3
from table1
group by col1, col2
order by col3 desc ) z
union all
select * from (
-- bottom 5 records
select top 5 col1, col2, col3
from table2
group by col1, col2
order by col3 ) z
this was the only way i was able to get around the error and worked fine for me.
SELECT * FROM (SELECT *
FROM TABLE_A ORDER BY COLUMN_1)DUMMY_TABLE
UNION ALL
SELECT * FROM TABLE_B
ORDER BY 2;
2 is column number here .. In Oracle SQL you can use the column number by which you want to sort the data
This solved my SELECT statement:
SELECT * FROM
(SELECT id,name FROM TABLE_A
UNION ALL
SELECT id,name FROM TABLE_B ) dum
order by dum.id , dum.name
where id and name columns available in tables and you can use your columns .
Simply use that , no need parenthesis or anything else
SELECT *, id as TABLE_A_ID FROM TABLE_A
UNION ALL
SELECT *, id as TABLE_B_ID FROM TABLE_B
ORDER BY TABLE_A_ID, TABLE_B_ID
ORDER BY after the last UNION should apply to both datasets joined by union.
The solution shown below:
SELECT *,id AS sameColumn1 FROM Locations
UNION ALL
SELECT *,id AS sameColumn2 FROM Cities
ORDER BY sameColumn1,sameColumn2
select CONCAT(Name, '(',substr(occupation, 1, 1), ')') AS f1
from OCCUPATIONS
union
select temp.str AS f1 from
(select count(occupation) AS counts, occupation, concat('There are a total of ' ,count(occupation) ,' ', lower(occupation),'s.') As str from OCCUPATIONS group by occupation order by counts ASC, occupation ASC
) As temp
order by f1

how to find the most appears one in a table using sql?

I have a table A with two columns named B and C as following:
('W1','F2')
('W1','F7')
('W2','F1')
('W2','F6')
('W2','F8')
('W4','F7')
('W6','F2')
('W6','F15')
('W7','F1')
('W7','F4')
('W7','F17')
('W8','F13')
How can I find which one in the B column appears with the most time using sql in oracle? (In this case, it's W2 and W7). Thank you!
Use a subquery to calculate the number of items in columC for each value in columnB and rank() the results of the subquery based on that count. Then in your main select return just the values of columnB where the rank of the rows returned by the subquery is 1:
SELECT ColB
FROM (
SELECT ColB,
Count(ColC),
rank() over (ORDER BY Count(ColC) DESC) AS rnk
FROM yourTable
GROUP BY ColB)
WHERE rnk = 1
Here's a sql fiddle: http://sqlfiddle.com/#!4/fa6bd/2
/*
C2 REFERS TO THE COLUMN B
T1 Refers to an alias
*/
WITH T1 AS
(
SELECT C2,COUNT(*) AS COUNT
FROM YOURTABLE
GROUP BY C2
)
SELECT C2,COUNT FROM T1 WHERE COUNT=(SELECT MAX(COUNT) FROM T1 )
;
Select ColB, Count(*)
FROM yourTable
GROUP BY ColB
ORDER BY count(*) desc

Wrap-around in SQL results ordering

If I have a table with items beginning with C, D, and J, is it in any way possible to arrange a query that orders these ascending, but starts with D, and ends with C wrapped around?
E.g. raw table = C,C,C,D,D,J
Desired result order = D,D,J,C,C,C
In ordinary SQL this is, not any functional language? I can't see how the desired order can be achieved without hard-coding, i.e. selecting each individual record in the desired order all union'd together.
You could introduce a custom sort key with a CASE statement.
Select Col1, Col2, Col3,
case left(Col3,1) when 'D' then 1
when 'J' then 2
when 'C' then 3
else 4
end as SortKey
from YourTable
order by SortKey, Col3
That does not seem like a regular order, so you won't be able to do it in a simple way. But you can do this:
(SELECT * FROM yourtable WHERE ID >=C ORDER BY ID) UNION (SELECT SELECT * FROM yourtable WHERE ID <C ORDER BY ID)
You can implement custom sort orders without hardcoding by using a table:
CREATE TABLE SortRules (
Prefix char(1) NOT NULL
,SortOrder int NOT NULL
)
Then join to the SortRules table:
SELECT *
FROM YourTable
LEFT JOIN SortRules
ON YourTable.YourColumn LIKE SortRules.Prefix + '%'
ORDER BY SortRules.SortOrder, YourTable.YourColumn
You can make SortOrder UNIQUE (although that's not required). You can also decide if you want missing ones (unmatched in the join) to be at the top or bottom:
SELECT *
FROM YourTable
LEFT JOIN SortRules
ON YourTable.YourColumn LIKE SortRules.Prefix + '%'
ORDER BY COALESCE(SortRules.SortOrder, 2147483647), YourTable.YourColumn
SELECT *
FROM YourTable
LEFT JOIN SortRules
ON YourTable.YourColumn LIKE SortRules.Prefix + '%'
ORDER BY COALESCE(SortRules.SortOrder, -2147483648), YourTable.YourColumn

Select DISTINCT, return entire row

I have a table with 10 columns.
I want to return all rows for which Col006 is distinct, but return all columns...
How can I do this?
if column 6 appears like this:
| Column 6 |
| item1 |
| item1 |
| item2 |
| item1 |
I want to return two rows, one of the records with item1 and the other with item2, along with all other columns.
In SQL Server 2005 and above:
;WITH q AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY col6 ORDER BY id) rn
FROM mytable
)
SELECT *
FROM q
WHERE rn = 1
In SQL Server 2000, provided that you have a primary key column:
SELECT mt.*
FROM (
SELECT DISTINCT col6
FROM mytable
) mto
JOIN mytable mt
ON mt.id =
(
SELECT TOP 1 id
FROM mytable mti
WHERE mti.col6 = mto.col6
-- ORDER BY
-- id
-- Uncomment the lines above if the order matters
)
Update:
Check your database version and compatibility level:
SELECT ##VERSION
SELECT COMPATIBILITY_LEVEL
FROM sys.databases
WHERE name = DB_NAME()
The key word "DISTINCT" in SQL has the meaning of "unique value". When applied to a column in a query it will return as many rows from the result set as there are unique, different values for that column. As a consequence it creates a grouped result set, and values of other columns are random unless defined by other functions (such as max, min, average, etc.)
If you meant to say you want to return all rows for which Col006 has a specific value, then use the "where Col006 = value" clause.
If you meant to say you want to return all rows for which Col006 is different from all other values of Col006, then you still need to specify what that value is => see above.
If you want to say that the value of Col006 can only be evaluated once all rows have been retrieved, then use the "having Col006 = value" clause. This has the same effect as the "where" clause, but "where" gets applied when rows are retrieved from the raw tables, whereas "having" is applied once all other calculations have been made (i.e. aggregation functions have been run etc.) and just before the result set is returned to the user.
UPDATE:
After having seen your edit, I have to point out that if you use any of the other suggestions, you will end up with random values in all other 9 columns for the row that contains the value "item1" in Col006, due to the constraint further up in my post.
You can group on Col006 to get the distinct values, but then you have to decide what to do with the multiple records in each group.
You can use aggregates to pick a value from the records. Example:
select Col006, min(Col001), max(Col002)
from TheTable
group by Col006
order by Col006
If you want the values to come from a specific record in each group, you have to identify it somehow. Example of using Col002 to identify the record in each group:
select Col006, Col001, Col002
from TheTable t
inner join (
select Col006, min(Col002)
from TheTable
group by Col006
) x on t.Col006 = x.Col006 and t.Col002 = x.Col002
order by Col006
SELECT *
FROM (SELECT DISTINCT YourDistinctField FROM YourTable) AS A
CROSS APPLY
( SELECT TOP 1 * FROM YourTable B
WHERE B.YourDistinctField = A.YourDistinctField ) AS NewTableName
I tried the answers posted above with no luck... but this does the trick!
select * from yourTable where column6 in (select distinct column6 from yourTable);
SELECT *
FROM harvest
GROUP BY estimated_total;
You can use GROUP BY and MIN() to get more specific result.
Lets say that you have id as the primary_key.
And we want to get all the DISTINCT values for a column lets say estimated_total, And you also need one sample of complete row with each distinct value in SQL. Following query should do the trick.
SELECT *, min(id)
FROM harvest
GROUP BY estimated_total;
create table #temp
(C1 TINYINT,
C2 TINYINT,
C3 TINYINT,
C4 TINYINT,
C5 TINYINT,
C6 TINYINT)
INSERT INTO #temp
SELECT 1,1,1,1,1,6
UNION ALL SELECT 1,1,1,1,1,6
UNION ALL SELECT 3,1,1,1,1,3
UNION ALL SELECT 4,2,1,1,1,6
SELECT * FROM #temp
SELECT *
FROM(
SELECT ROW_NUMBER() OVER (PARTITION BY C6 Order by C1) ID,* FROM #temp
)T
WHERE ID = 1