Select multiple columns but distinct only one in SQL? - sql

Let's say I have a table called TABLE with the columns col1, col2, col3 and col4.
I want to select col1, col2 and col3, but with only distinct values of col2, and I can't get it to work.
I tried something like this:
SELECT DISTINCT "col1", "col2", "col3" FROM [Table] WHERE col1 = Values
But the output contains more than one record with the same col2 value.
I know that is because DISTINCT applies to all the columns I specified, but I don't know how to get all the columns while filtering only on the values of col2.
Is it possible to SELECT more than one column but apply SELECT DISTINCT to only one of them?

As you said, DISTINCT eliminates duplicates across the full set of selected columns. Instead, I'd use aggregate functions with a GROUP BY clause.
SELECT MAX(col1) AS col1, col2,
MAX(col3) AS col3
FROM tbl
GROUP BY col2
That returns the highest value (alphanumerically) of each aggregated column for every col2; note that the col1 and col3 values may then come from different rows. Or, to list all values separated by commas:
SELECT STRING_AGG(col1,',') AS col1, col2,
STRING_AGG(col3,',') AS col3
FROM tbl
GROUP BY col2
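To see what the two queries above return, here is a minimal sketch run against an in-memory SQLite database through Python's sqlite3 module. The sample rows are made up, and because SQLite has no STRING_AGG in older versions, its group_concat function stands in for it here; that substitution is an assumption of this sketch, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col1 TEXT, col2 TEXT, col3 TEXT)")
# Hypothetical sample data: col2 = 'x' appears twice.
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [("a", "x", "9"), ("b", "x", "1"), ("c", "y", "5")],
)

# One row per col2, taking the highest col1 and col3 in each group.
max_rows = conn.execute(
    "SELECT MAX(col1) AS col1, col2, MAX(col3) AS col3 "
    "FROM tbl GROUP BY col2 ORDER BY col2"
).fetchall()

# One row per col2, listing every col1 and col3 value in the group.
# group_concat is SQLite's counterpart to STRING_AGG (assumption of this sketch).
list_rows = conn.execute(
    "SELECT group_concat(col1, ',') AS col1, col2, "
    "group_concat(col3, ',') AS col3 "
    "FROM tbl GROUP BY col2 ORDER BY col2"
).fetchall()

print(max_rows)
print(list_rows)
```

For col2 = 'x' the MAX version yields col1 = 'b' together with col3 = '9', a combination that exists in no single source row, so this pattern only fits when the columns do not have to stay paired.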

Related

Oracle SQL: How to convert one column of Select to rows

I am new to Oracle and am looking for a way to convert 1 column in a Select to rows.
My first approach was using Listagg which does exactly what I want but the character limit for this is not enough for my case.
As an alternative I would like to do the following.
SELECT
t.col1
, t.col2
, t.col3
, t.col4
, t.col5
FROM
my_table t
Instead of the standard output of t.col1 t.col2 t.col3 t.col4 t.col5 I would like t.col2 to appear in rows (i.e. below each other) instead of in columns (next to each other). Col2 always contains a value and each of them should appear in a separate row.
When searching for a solution to this I came across Unpivot and Decode but am not sure if and how this could be applied here.
Can someone tell me how this can be achieved ?
Many thanks in advance for any help,
Mike
A simple method -- if your data is not too large -- is just to use union all. Your description makes it sound like you want this:
select col1, col2, col5
from t
where col2 is not null
union all
select col1, col3, col5
from t
where col2 is not null
union all
select col1, col4, col5
from t
where col2 is not null;
Hmmm, or if you just want the distinct values in col2:
select distinct col2
from t;
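The UNION ALL approach can be tried out quickly on SQLite through Python's sqlite3 module. This is only a sketch: the table contents are invented, and an ORDER BY is added purely so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1, col2, col3, col4, col5)")
# Hypothetical sample data: two source rows.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [(1, "a", "b", "c", "k1"), (2, "d", "e", "f", "k2")],
)

# Each branch contributes one of col2, col3, col4 as the middle column,
# so every source row turns into three output rows.
unpivoted = conn.execute(
    """
    SELECT col1, col2, col5 FROM t WHERE col2 IS NOT NULL
    UNION ALL
    SELECT col1, col3, col5 FROM t WHERE col2 IS NOT NULL
    UNION ALL
    SELECT col1, col4, col5 FROM t WHERE col2 IS NOT NULL
    ORDER BY 1, 2
    """
).fetchall()

for row in unpivoted:
    print(row)
```

The table is read once per branch, which is why this is only suggested for data that is not too large.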
You are looking for the UNPIVOT operator:
SELECT col
FROM my_table
UNPIVOT INCLUDE NULLS (col FOR source_column_name IN (col1, col2, col3, col4, col5))
COL
----
Value 1
Value 2
Value 3
Value 4
Value 5
The result contains five rows with a single column COL, holding the values from the columns COL1 to COL5. The (unused) column SOURCE_COLUMN_NAME contains the name of the column each value came from. You can remove INCLUDE NULLS if you are only interested in rows where COL IS NOT NULL.
See the ORACLE-BASE article on PIVOT and UNPIVOT operators for more details.

De-duplicating rows in a table with respect to certain columns and retaining the corresponding values in the other columns in HIVE

I need to create a temporary table in HIVE from an existing table that has 7 columns. I just want to get rid of duplicates with respect to the first three columns while retaining the corresponding values in the other 4 columns. I don't care which row is actually dropped when de-duplicating on the first three columns alone.
You could use something like the query below if you are not concerned about ordering:
create table table2 as
select col1, col2, col3
,split(agg_col,"\\|")[0] as col4
,split(agg_col,"\\|")[1] as col5
,split(agg_col,"\\|")[2] as col6
,split(agg_col,"\\|")[3] as col7
from (Select col1, col2, col3,
max(concat(cast(col4 as string),"|",
cast(col5 as string),"|",
cast(col6 as string),"|",
cast(col7 as string))) as agg_col
from table1
group by col1,col2,col3 ) A;
Below is another approach, which gives much more control over ordering but is slower than the one above:
create table table2 as
select col1, col2, col3, max(col4) as col4, max(col5) as col5, max(col6) as col6, max(col7) as col7
from (Select col1, col2, col3,col4, col5, col6, col7,
rank() over ( partition by col1, col2, col3
order by col4 desc, col5 desc, col6 desc, col7 desc ) as col_rank
from table1 ) A
where A.col_rank = 1
GROUP BY col1, col2, col3;
The rank() over(..) function assigns rank 1 to more than one row if the order by columns are all equal. In our case, if 2 rows have exactly the same values in all seven columns, there will still be duplicates after filtering on col_rank = 1. These duplicates are eliminated by the max and group by clauses in the query above.
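The tie behaviour of rank() is easy to reproduce. The sketch below runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module rather than on Hive, with invented data. It also tries row_number() in place of rank(); that is a swapped-in alternative, not what the answer uses, shown because it numbers tied rows 1, 2, ... and so keeps exactly one row per group without the extra max and group by.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1, col2, col3, col4, col5, col6, col7)")
# Hypothetical sample data: the first two rows are exact duplicates.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 1, 10, 20, 30, 40),
        (1, 1, 1, 10, 20, 30, 40),
        (1, 1, 1, 5, 6, 7, 8),
        (2, 2, 2, 1, 1, 1, 1),
    ],
)

# Same query shape for both window functions; only the function name changes.
QUERY = """
SELECT col1, col2, col3, col4, col5, col6, col7
FROM (SELECT *,
             {fn}() OVER (PARTITION BY col1, col2, col3
                          ORDER BY col4 DESC, col5 DESC, col6 DESC, col7 DESC) AS rn
      FROM table1)
WHERE rn = 1
ORDER BY col1
"""

ranked = conn.execute(QUERY.format(fn="rank")).fetchall()
deduped = conn.execute(QUERY.format(fn="row_number")).fetchall()

print(len(ranked))   # rank() keeps both tied rows of group (1, 1, 1)
print(deduped)       # row_number() keeps one row per group
```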

Oracle SQL - Join 2 table columns in 1 row

I have 2 SQL queries and both return the expected results. There is no relation between the 2 queries, but I want to see their columns side by side in a single row.
e.g.
Select col1,col2,sum(col3) as col3 from table a
select col4,col5 from table b
I would like the result to be
col1 col2 col3 col4 col5
If there is no equivalent row in either table a or table b, replace the missing values with zeroes.
Could someone help me with this? Thanks.
Since you didn't provide any information such as the table structure or the data in each table, you can cross join both tables.
select t.col1,t.col2,t.col3,t1.col1,t1.col2 from tab1 t,tab2 t1;
In both select statements, add a column based on rownum or row_number(), then full join the results on that column:
select nvl(col1, 0) col1, nvl(col2, 0) col2, nvl(col3, 0) col3,
nvl(col4, 0) col4, nvl(col5, 0) col5
from
(select rownum rn, col1, col2, col3 from (
select col1, col2, sum(col3) col3 from tableA group by col1, col2)) a
full join (select rownum rn, col4, col5 from tableB) b using (rn)
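The row-number pairing can be sketched on SQLite through Python's sqlite3 module. Several things here are assumptions of the sketch rather than part of the answer: the sample data is invented, row_number() with an explicit ORDER BY replaces Oracle's rownum so the pairing is deterministic, COALESCE replaces nvl, and the full join is emulated with a LEFT JOIN plus a UNION ALL branch for unmatched tableB rows, since older SQLite versions lack FULL JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (col1, col2, col3)")
conn.execute("CREATE TABLE tableB (col4, col5)")
# Hypothetical sample data: tableA aggregates to 2 rows, tableB has 3.
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?, ?)",
    [("a", "x", 1), ("a", "x", 2), ("b", "y", 5)],
)
conn.executemany(
    "INSERT INTO tableB VALUES (?, ?)", [(10, 11), (20, 21), (30, 31)]
)

paired = conn.execute(
    """
    WITH a AS (
        SELECT row_number() OVER (ORDER BY col1, col2) AS rn, col1, col2, col3
        FROM (SELECT col1, col2, SUM(col3) AS col3
              FROM tableA GROUP BY col1, col2)
    ),
    b AS (
        SELECT row_number() OVER (ORDER BY col4, col5) AS rn, col4, col5
        FROM tableB
    )
    -- rows of a, with the b row of the same number (zeroes if there is none)
    SELECT a.rn, a.col1, a.col2, a.col3,
           COALESCE(b.col4, 0), COALESCE(b.col5, 0)
    FROM a LEFT JOIN b ON b.rn = a.rn
    UNION ALL
    -- rows of b that have no partner in a
    SELECT b.rn, 0, 0, 0, b.col4, b.col5
    FROM b LEFT JOIN a ON a.rn = b.rn
    WHERE a.rn IS NULL
    ORDER BY 1
    """
).fetchall()

for row in paired:
    print(row)
```

The first output column is the row number used for pairing; it is kept only to make the ordering visible.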
I guess a UNION could be a pragmatic solution, since the 2 queries are not related. They are just 2 data sets that should be retrieved in one statement:
Select col1,col2,sum(col3) as col3 from table a
UNION
select col4,col5, to_number(null) col6 from table b
Note col6 in the example. Every SELECT in a UNION must return the same number of columns, and it is good practice for corresponding columns to have exactly the same datatype. Since sum(col3) yields a number, col6 should be a number too.
The values of col4 and col5 will appear under col1 and col2.

oracle sql query finding rows with multiple values in 3rd column matching columns 1 and 2

I have a dataset with about a million rows in an Oracle 11 db.
I'd like to find rows where col1 and col2 match but which have different values in col3.
I'm not sure how to do this well, though I can certainly write a query that never seems to finish:
select col1,col2,col3
from table tab1
where exists
(select 1
from table tab2
where tab1.col1 = tab2.col1
and tab1.col2 = tab2.col2
and tab1.col3 != tab2.col3);
I ran this and after an hour gave up waiting - I need to analyze the problems and present it to some people for figuring out how to move forward.
Thanks in any case,
Jeff
A query like this will indicate which rows having the same col1, col2 have differing values in col3:
SELECT col1, col2
FROM x
GROUP BY col1, col2
HAVING MIN(col3) <> MAX(col3)
To see how many of these col1, col2 pairs are affected:
SELECT COUNT(*)
FROM (SELECT col1, col2
FROM x
GROUP BY col1, col2
HAVING MIN(col3) <> MAX(col3)
)
You may also wish to know how many duplicates there are (i.e. rows having col1, col2 and col3 all the same):
SELECT col1, col2, col3
FROM x
GROUP BY col1, col2, col3
HAVING COUNT(*) > 1
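The three GROUP BY queries above can be checked on a small data set. This is a sketch on SQLite via Python's sqlite3 module with invented rows, not a statement about how the queries perform on a million-row Oracle table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (col1, col2, col3)")
# Hypothetical sample data:
#   (1, 1) has two different col3 values,
#   (2, 2) has the same row twice,
#   (3, 3) appears once.
conn.executemany(
    "INSERT INTO x VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 1, "b"), (2, 2, "c"), (2, 2, "c"), (3, 3, "d")],
)

# Pairs whose col3 values are not all identical.
differing = conn.execute(
    "SELECT col1, col2 FROM x "
    "GROUP BY col1, col2 HAVING MIN(col3) <> MAX(col3)"
).fetchall()

# Number of such pairs.
(pair_count,) = conn.execute(
    "SELECT COUNT(*) FROM (SELECT col1, col2 FROM x "
    "GROUP BY col1, col2 HAVING MIN(col3) <> MAX(col3))"
).fetchone()

# Fully duplicated rows.
duplicates = conn.execute(
    "SELECT col1, col2, col3 FROM x "
    "GROUP BY col1, col2, col3 HAVING COUNT(*) > 1"
).fetchall()

print(differing, pair_count, duplicates)
```

Each query reads the table once and groups it, instead of probing the table again for every row as the correlated EXISTS does.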
Did you mean something like this?
select col1,col2,col3
from table tab1
where col1 = col2
and col1 <> col3

select all columns with one column has different value

In my table, some records have the same values in all columns except one. I need to write a query to get those records. What's the best way to do it? The table looks like this:
colA colB colC
a b c
a b d
a b e
What's the best way to get all records with all the columns? Thanks for everyone's help.
Assuming you know that col3 is the column that differs, this gets the col1, col2 combinations that have more than one col3 value:
SELECT Col1, Col2
FROM Table t
GROUP BY Col1, Col2
HAVING COUNT(distinct col3) > 1
If you need all the values in the three columns, then you can join this back to the original table:
SELECT t.*
FROM table t join
(SELECT Col1, Col2
FROM Table t
GROUP BY Col1, Col2
HAVING COUNT(distinct col3) > 1
) cols
on t.col1 = cols.col1 and t.col2 = cols.col2
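A minimal sketch of the join-back query, run on SQLite through Python's sqlite3 module. The table name records and its rows are hypothetical; the first three rows mirror the sample in the question, with col1, col2, col3 standing in for colA, colB, colC.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (col1, col2, col3)")
# Hypothetical sample data: (a, b) has three different col3 values,
# (x, y) has only one and should not be returned.
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [("a", "b", "c"), ("a", "b", "d"), ("a", "b", "e"), ("x", "y", "z")],
)

# Find the col1, col2 pairs with more than one distinct col3,
# then join back to fetch the complete rows for those pairs.
rows = conn.execute(
    """
    SELECT r.col1, r.col2, r.col3
    FROM records r
    JOIN (SELECT col1, col2
          FROM records
          GROUP BY col1, col2
          HAVING COUNT(DISTINCT col3) > 1) cols
      ON r.col1 = cols.col1 AND r.col2 = cols.col2
    ORDER BY r.col3
    """
).fetchall()

for row in rows:
    print(row)
```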
Just select those rows that have the different values:
SELECT col1, col2
FROM myTable
WHERE colWanted != knownValue
If this is not what you are looking for, please post examples of the data in the table and the wanted output.
How about something like
SELECT Col1, Col2
FROM Table
GROUP BY Col1, Col2
HAVING COUNT(*) = 1
This will give you Col1, Col2 that have unique data.
Assuming col3 holds the differences:
SELECT Col1, Col2
FROM Table
GROUP BY Col1, Col2
HAVING COUNT(*) > 1
Or, to show all 3 columns:
SELECT Col1, Col2, Col3
FROM Table1
GROUP BY Col1, Col2, Col3
HAVING COUNT(Col3) > 1