Finding rows that have many similar values and one different one


I'm trying to isolate a problem with a violation of a unique key index. I'm pretty certain that the cause is rows that have the same values in 3 columns but not the same value in the 4th (when they should). As an example...
Key Column1 Column2 Column3 Column4
1 A B C D
2 A B C D
3 A B C D
4 A B C Z
I basically want to select column 4, or some way to let me identify column 4. I know it's a matter of using aggregate functions but I'm not very familiar with them. Can anyone assist on a way to select Key, Column4 for rows that have a different column 4 value and the same column 1-3 values?

This is what you want:
select column1, column2, column3
from t
group by column1, column2, column3
having min(column4) <> max(column4)
Once you get the right values for the first three columns, you can join back in to get the specific rows.
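To make the "join back" step concrete, here is a minimal sketch. It runs the queries through Python's built-in sqlite3 module purely so it can be executed end to end; SQLite, the extra E/F/G rows and the alias g are assumptions for illustration, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (key integer primary key, "
    "column1 text, column2 text, column3 text, column4 text)"
)
conn.executemany("insert into t values (?, ?, ?, ?, ?)", [
    (1, "A", "B", "C", "D"),
    (2, "A", "B", "C", "D"),
    (3, "A", "B", "C", "D"),
    (4, "A", "B", "C", "Z"),
    (5, "E", "F", "G", "H"),  # hypothetical consistent group, should not be returned
    (6, "E", "F", "G", "H"),
])

# The inner query finds the (column1, column2, column3) groups with more than
# one column4 value; the join brings back every row of those groups.
query = """
select t.key, t.column4
from t
join (select column1, column2, column3
      from t
      group by column1, column2, column3
      having min(column4) <> max(column4)
     ) g
  on g.column1 = t.column1
 and g.column2 = t.column2
 and g.column3 = t.column3
order by t.key
"""
rows = conn.execute(query).fetchall()
print(rows)
```

This lists all four rows of the A/B/C group, so you can see which Key carries the odd column4 value.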
Or, you can use window functions like this:
select t.*
from (select t.*, min(column4) over (partition by column1, column2, column3) as min4,
max(column4) over (partition by column1, column2, column3) as max4
from t
) t
where min4 <> max4;
If NULL is a valid "other" value that you want to count, you will need additional logic for that.
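The reason is that min() and max() skip NULLs, so a group holding D, D and NULL looks consistent to the query above. One possible form of the additional logic, sketched with Python's sqlite3 module (SQLite and the sample rows are assumptions for illustration), is to also flag groups that mix NULL and non-NULL values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (key integer primary key, "
    "column1 text, column2 text, column3 text, column4 text)"
)
conn.executemany("insert into t values (?, ?, ?, ?, ?)", [
    (1, "A", "B", "C", "D"),
    (2, "A", "B", "C", "D"),
    (3, "A", "B", "C", None),  # NULL is the "different" value here
    (4, "E", "F", "G", "H"),
    (5, "E", "F", "G", "H"),
    (6, "I", "J", "K", None),  # all NULL: treated as consistent
    (7, "I", "J", "K", None),
])

base = "select column1 from t group by column1, column2, column3 having "

# Original condition: NULLs are ignored by min() and max().
plain = conn.execute(base + "min(column4) <> max(column4)").fetchall()

# Extended condition: count(column4) counts only non-NULL values, so a group
# with some but not all values NULL is flagged as well.
null_aware = conn.execute(
    base + "min(column4) <> max(column4) "
           "or (count(column4) > 0 and count(column4) < count(*))"
).fetchall()

print(plain)       # no group is flagged
print(null_aware)  # the A/B/C group is flagged
```

Whether an all-NULL group should count as consistent is a design choice; this sketch treats it as consistent.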

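If you want to get all columns, you can use the query below (it would be simpler if the windowed count supported distinct, but it does not). Note that it relies on select distinct * collapsing duplicate rows, so it only works as intended when the table has no unique column such as Key: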
with cte1 as (
select distinct * from Table1
), cte2 as (
select
*,
count(column4) over(partition by column1, column2, column3) as cnt
from cte1
)
select * from cte2 where cnt > 1;
If you just want the column1-3 combinations that have more than one column4 value:
select
column1, column2, column3
from Table1
group by column1, column2, column3
having count(distinct column4) > 1

Related

Select row after filter row has a coincident column in sql

I have a database as below
Column1 column2 column3
A123 abc Def
A123 xyz Abc
B456 Gh Ui
I want to select the rows whose column 1 value does not coincide with that of any other row, using a SQL command.
In this case, the expected result is only the 3rd row.
How can I do it?
Thanks

You could join to a subselect that keeps only the column1 values with count = 1:
select m.* from my_table m
inner join (
select column1, count(*) as cnt
from my_table
group by column1
having count(*) = 1
) t on t.column1 = m.column1

WITH CTE AS (
SELECT COUNT(Column1) OVER (PARTITION BY Column1) AS coincident, *
FROM my_table
)
SELECT * FROM CTE WHERE coincident = 1

I would use window functions:
select Column1, column2, column3
from (select t.*, count(*) over (partition by column1) as cnt
from t
) t
where cnt = 1;
However, there are other fun ways. For instance, aggregation:
select column1, max(column2) as column2, max(column3) as column3
from t
group by column1
having count(*) = 1;
Or if you know one of the other columns is going to have different values on different rows, then not exists may be the most efficient solution:
select t.*
from t
where not exists (select 1
from t t2
where t2.column1 = t.column1 and
t2.column2 <> t.column2
);

How to combine multiple columns into one column?

I'm writing a query and want the results in one column
My current results return like this
Column1 Column2 column3
1 A CAT
I want the results to return like this
Column1
1
A
CAT

SELECT Column1 FROM TableName
UNION ALL
SELECT Column2 FROM TableName
UNION ALL
SELECT Column3 FROM TableName
If you don't want duplicate values, use UNION instead of UNION ALL.
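To see the difference between the two operators, here is a small sketch run through Python's sqlite3 module. SQLite and the second sample row are assumptions added for illustration, since the row from the question contains no duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table TableName (Column1 text, Column2 text, Column3 text)")
conn.executemany("insert into TableName values (?, ?, ?)", [
    ("1", "A", "CAT"),
    ("1", "B", "CAT"),  # hypothetical second row that repeats '1' and 'CAT'
])

template = """
select Column1 from TableName
{op}
select Column2 from TableName
{op}
select Column3 from TableName
"""

# Results are sorted in Python only to make the comparison easy to read.
union_all = sorted(r[0] for r in conn.execute(template.format(op="union all")))
union = sorted(r[0] for r in conn.execute(template.format(op="union")))

print(union_all)  # ['1', '1', 'A', 'B', 'CAT', 'CAT']
print(union)      # ['1', 'A', 'B', 'CAT']
```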
You can also do this using the UNPIVOT operator:
SELECT Column123
FROM
(
SELECT Column1, Column2, Column3
FROM TableName
) AS tmp
UNPIVOT
(
Column123 FOR ColumnAll IN (Column1, Column2, Column3)
) AS unpvt;
https://www.w3schools.com/sql/sql_union.asp
https://www.mssqltips.com/sqlservertip/3000/use-sql-servers-unpivot-operator-to-help-normalize-output/

The answer is: it depends.
If the number of columns is unknown, then use UNPIVOT as UZI has suggested.
If you know all the columns and they are a small finite set, you can simply write:
select column1
from TableName
union all
select column2
from TableName
union all
select column3
from TableName

This takes the Cartesian product of the table #t with a derived table of 3 rows, so each row of #t appears 3 times (for a = 1, a = 2 and a = 3). For a = 1 we take the value from Column1, for a = 2 from Column2, and for a = 3 from column3.
There is certainly a union and a join here too, but in my opinion the question in the title implies scanning the table only once.
CREATE TABLE #t (Column1 NVARCHAR(25),Column2 NVARCHAR(25), column3 NVARCHAR(25))
INSERT INTO #t
SELECT '1','A','CAT'
SELECT
CASE a WHEN 1 THEN Column1 WHEN 2 THEN Column2 ELSE column3 END col
FROM #t, (SELECT 1 a UNION ALL SELECT 2 UNION ALL SELECT 3) B
DROP TABLE #t

Return column with running sequence number Oracle

My simple query returns data like this:
SELECT column1, column2 FROM table1
COLUMN1 COLUMN2
------- -------
CA A
CA B
CB C
CB D
I want to return column3 with these values (for same COLUMN1 value, I want to return same sequence number):
COLUMN3
-------
1
1
2
2

You can use analytic function DENSE_RANK.
SELECT column1,
column2,
DENSE_RANK() OVER(ORDER BY column1) as "column3"
FROM table1
See the following for some examples - oracle-base.com/articles/misc/rank-dense-rank-first-last-analytic-functions.php#dense_rank

Try this query,
Select column1, column2,
dense_rank() over (order by column1) as column3
from table1;

Get multiple records only using SELECT DISTINCT or similar

I have records like this:
Column1 Column2
A Blue
A Blue
B Red
B Green
C Blue
C Red
Using SELECT DISTINCT I get this:
Column1 Column2
A Blue
B Red
B Green
C Blue
C Red
What I'd like to get:
Column1 Column2
B Red
B Green
C Blue
C Red
So I need only the records whose Column1 value appears with more than one distinct value in Column2.
(I'm joining two tables)
With SELECT DISTINCT, I got closer to what I need, but I can't find a way to exclude records like "A" on column1 that have always the same value on column2...

Try this:
SELECT * FROM yourtable
WHERE Column1 IN
(SELECT Column1
FROM yourtable
GROUP BY Column1
HAVING COUNT(DISTINCT Column2) > 1
)
The DISTINCT in COUNT ensures that you only get those records where Column2 has multiple distinct values.
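A quick way to check that claim against the sample data is the sketch below, run through Python's sqlite3 module (SQLite and the helper groups_having are assumptions for illustration). It compares the HAVING condition with and without DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table yourtable (Column1 text, Column2 text)")
conn.executemany("insert into yourtable values (?, ?)", [
    ("A", "Blue"), ("A", "Blue"),
    ("B", "Red"), ("B", "Green"),
    ("C", "Blue"), ("C", "Red"),
])

def groups_having(condition):
    """Return the Column1 values whose group satisfies the HAVING condition."""
    sql = (
        "select Column1 from yourtable "
        f"group by Column1 having {condition} order by Column1"
    )
    return [r[0] for r in conn.execute(sql)]

with_distinct = groups_having("count(distinct Column2) > 1")
without_distinct = groups_having("count(Column2) > 1")

print(with_distinct)     # ['B', 'C']
print(without_distinct)  # ['A', 'B', 'C']
```

Without DISTINCT, "A" slips through because its two identical "Blue" rows are counted twice.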

I think this code will work in most systems:
SELECT Col1,Col2
FROM tbl
GROUP BY Col1,Col2
HAVING COUNT(*)<=1
See the results: http://sqlfiddle.com/#!6/47285/8/0

Assuming you're using MS SQL Server....that's not how DISTINCT works, DISTINCT is applied across the whole list of Columns in the SELECT statement, that's why you're getting the output you mentioned. Try using a combination of GROUP BY and HAVING like below...
SELECT Column1, Column2 FROM [table_name]
GROUP BY Column1, Column2
HAVING COUNT(*) < 2
ORDER BY Column1

Try the following Query :
SELECT Col1,Col2
FROM tbl
GROUP BY Col1,Col2
HAVING COUNT(Col1)<=1

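You can do it like this (this returns each Column1 value with its number of distinct Column2 values, not the individual rows):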
select a.Column1 ,count(a.Column1 )
from
(select Distinct Column1 ,Column2
from Items) a
group by a.Column1
having Count(a.Column1 ) > 1

Sampling unique set of records in Oracle table

I have an Oracle table from which I need to select a given percentage of records for each unique combination of a given set of columns.
For example,
SELECT distinct column1, column2, Column3 from TableX;
gives me all the unique combinations in that table. I need a percentage of the rows from each such combination. Currently I am using the following query to accomplish this, which is lengthy and slow.
SELECT *
FROM tableX Sample ( 3 )
WHERE Column1 = 'value1' and
Column2 = 'value2' and
Column3 = 'value3'
UNION
SELECT *
FROM tableX Sample ( 3 )
WHERE Column1 = 'value1' and
Column2 = 'value2' and
Column3 = 'value4'
UNION
…
…
SELECT *
FROM tableX Sample ( 3 )
WHERE Column1 = 'valueP' and
Column2 = 'valueQ' and
Column3 = 'valueR'
Each combination of values is unique in that table (obtained from the first query).
How can I shorten the query and speed it up?

Here is one approach:
select t.*
from (select t.*,
row_number() over (partition by column1, column2, column3 order by dbms_random.value
) as seqnum,
count(*) over (partition by column1, column2, column3) as totcnt
from tablex t
) t
where seqnum / totcnt <= 0.10 -- or whatever your threshold is
It uses row_number() to assign a sequential number to rows in each group, in a random order. The where clause chooses the proportion that you want.
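The same idea can be tried out locally with the sketch below. It uses Python's sqlite3 module instead of Oracle, so random() stands in for the Oracle random function, and the table contents, the id column and the 25% threshold are assumptions for illustration. It needs a SQLite build with window functions (3.25 or later). The outer query only counts the sampled rows per group so that the result is easy to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tablex (id integer primary key, "
    "column1 text, column2 text, column3 text)"
)
# Two hypothetical groups: 8 rows of (a, b, x) and 4 rows of (a, b, y).
data = [("a", "b", "x")] * 8 + [("a", "b", "y")] * 4
conn.executemany(
    "insert into tablex (column1, column2, column3) values (?, ?, ?)", data
)

fraction = 0.25  # proportion of each group to keep

# seqnum numbers the rows of each group in random order; totcnt is the group size.
# Multiplying by 1.0 avoids integer division, which SQLite would otherwise apply.
query = """
select column3, count(*) as sampled
from (select t.*,
             row_number() over (partition by column1, column2, column3
                                order by random()) as seqnum,
             count(*) over (partition by column1, column2, column3) as totcnt
      from tablex t
     ) t
where seqnum * 1.0 / totcnt <= ?
group by column3
order by column3
"""
sampled = conn.execute(query, (fraction,)).fetchall()
print(sampled)  # [('x', 2), ('y', 1)]
```

Which rows are picked changes from run to run, but the number taken from each group stays fixed at the chosen proportion of the group size.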
