Return distinct rows from not entirely distinct results - sql

Two columns: the first is distinct, the second not so much.
Col1 ---- Col2
1 ---- abc
1 ---- abc (123)
2 ---- def
2 ---- def (324)
etc
I need to bring back distinct records, but only the ones with the longer Col2.
I've tried using the CONTAINS function, but my table isn't full-text indexed.

One option is to use ROW_NUMBER(), ordering by the LEN() of Col2:
SELECT *
FROM (
SELECT Col1, Col2, ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY LEN(Col2) DESC) rn
FROM YourTable
) t
WHERE rn = 1
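To try the ROW_NUMBER() approach without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It assumes an in-memory table filled with the question's sample rows, and it uses SQLite's LENGTH() in place of SQL Server's LEN(). Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical in-memory table mirroring the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Col1 INTEGER, Col2 TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [(1, "abc"), (1, "abc (123)"), (2, "def"), (2, "def (324)")],
)

# SQLite spells the length function LENGTH(); SQL Server uses LEN().
longest_rows = conn.execute(
    """
    SELECT Col1, Col2
    FROM (
        SELECT Col1, Col2,
               ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY LENGTH(Col2) DESC) AS rn
        FROM YourTable
    ) t
    WHERE rn = 1
    ORDER BY Col1
    """
).fetchall()

print(longest_rows)  # [(1, 'abc (123)'), (2, 'def (324)')]
```

If two values in the same group have equal length, ROW_NUMBER() picks one of them arbitrarily unless a tie-breaker is added to the ORDER BY.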

SELECT col1 ,
col2
FROM ( SELECT col1 ,
col2 ,
Rank() OVER ( PARTITION BY col1 ORDER BY col2 DESC ) row
FROM dbo.table
) t
WHERE row = 1
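You can also try this. It ranks by col2 descending rather than by length, which works when the longer value starts with the shorter one, as in the sample data.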

Related

How can I use a COUNT(DISTINCT var) to return the count of unique values per group?

I need to return a count of unique values, but unique per group of the result set, not unique to the entire result set. For example I would like the following code:
SELECT col1 AS letters, count(DISTINCT col2) AS numbers
GROUP BY col1;
applied to this data:
col1 col2
a 5
a 5
a 6
b 1
b 2
b 6
To return this:
col1 col2
a 2
b 3
If the above code will not produce this, how can I accomplish this in T-SQL?
I hope this works for you. You need to group by col1 and take a count distinct of col2:
SELECT
col1,
COUNT(DISTINCT col2)
FROM
count_unique_values_per_group
GROUP BY
col1
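As a quick check of the GROUP BY answer, the following sketch loads the question's sample data into an in-memory SQLite database (an assumption made for illustration; the question targets T-SQL) and runs the same query.

```python
import sqlite3

# Hypothetical in-memory copy of the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE count_unique_values_per_group (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO count_unique_values_per_group VALUES (?, ?)",
    [("a", 5), ("a", 5), ("a", 6), ("b", 1), ("b", 2), ("b", 6)],
)

# COUNT(DISTINCT ...) is evaluated separately inside each GROUP BY group.
counts = conn.execute(
    """
    SELECT col1, COUNT(DISTINCT col2)
    FROM count_unique_values_per_group
    GROUP BY col1
    ORDER BY col1
    """
).fetchall()

print(counts)  # [('a', 2), ('b', 3)]
```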
Try this:
SELECT DISTINCT col1
,dense_rank() over (partition by col1 order by col2 asc) + dense_rank() over (partition by col1 order by col2 desc) - 1
FROM my_table
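The DENSE_RANK() trick can be checked the same way. This is a sketch against an in-memory SQLite table (assumed here for illustration, SQLite 3.25 or newer): within each group, a value's ascending rank plus its descending rank minus 1 equals the number of distinct values. Unlike COUNT(DISTINCT ...), this also counts NULL as a value, so the two can differ when col2 contains NULLs.

```python
import sqlite3

# Hypothetical in-memory copy of the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("a", 5), ("a", 5), ("a", 6), ("b", 1), ("b", 2), ("b", 6)],
)

# Ascending rank + descending rank - 1 is constant within a group
# and equals the number of distinct col2 values in that group.
rank_counts = conn.execute(
    """
    SELECT DISTINCT col1,
           DENSE_RANK() OVER (PARTITION BY col1 ORDER BY col2 ASC)
         + DENSE_RANK() OVER (PARTITION BY col1 ORDER BY col2 DESC) - 1
    FROM my_table
    ORDER BY col1
    """
).fetchall()

print(rank_counts)  # [('a', 2), ('b', 3)]
```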
Apply the concat function to get the unique count. Hope this helps.
SELECT col1, count(distinct col1 + col2) FROM table_name group by col1;
or
SELECT col1, count(distinct concat(col1,col2)) FROM table_name group by col1;

SQL DISTINCT based on a single column, but keep all columns as output

--mytable
col1 col2 col3
1 A red
2 A green
3 B purple
4 C blue
Let's call the table above mytable. I want to select only distinct values from col2:
SELECT DISTINCT
col2
FROM
mytable
When I do this the output looks like this, which is expected:
col2
A
B
C
But how do I perform the same type of query, yet keep all columns? The output would look like the one below. In essence, I'm going through mytable looking at col2, and when there are multiple occurrences of a col2 value I keep only the first row.
col1 col2 col3
1 A red
3 B purple
4 C blue
Do SQL functions (e.g. DISTINCT) have arguments I could set? I could imagine something like KeepAllColumns = TRUE for DISTINCT. Or do I need to perform JOINs to get what I want?
You can use window functions, particularly row_number():
select t.*
from (select t.*, row_number() over (partition by col2 order by col1) as seqnum
from mytable t
) t
where seqnum = 1;
row_number() enumerates the rows, starting with "1". The order by inside the window controls which row you get: the oldest, newest, biggest, smallest, and so on.
You can use the QUALIFY clause in Teradata:
SELECT col1, col2, col3
FROM mytable
QUALIFY ROW_NUMBER() OVER(PARTITION BY col2 ORDER BY col2) = 1 -- Get 1st row per group
If you want to change the ordering for how to determine which col2 row to get, just change the expression in the ORDER BY.
With NOT EXISTS:
select m.* from mytable m
where not exists (
select 1 from mytable
where col2 = m.col2 and col1 < m.col1
)
This code will return the rows for which there is not another row with the same col2 and a smaller value in col1.
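Here is a minimal sketch of the NOT EXISTS approach, run against an in-memory SQLite copy of mytable (the SQLite setup is an assumption made for illustration). It keeps, for each col2 value, the row with the smallest col1.

```python
import sqlite3

# Hypothetical in-memory copy of mytable from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 INTEGER, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "A", "red"), (2, "A", "green"), (3, "B", "purple"), (4, "C", "blue")],
)

# Keep a row only if no other row shares its col2 and has a smaller col1.
first_rows = conn.execute(
    """
    SELECT m.*
    FROM mytable m
    WHERE NOT EXISTS (
        SELECT 1 FROM mytable
        WHERE col2 = m.col2 AND col1 < m.col1
    )
    ORDER BY m.col1
    """
).fetchall()

print(first_rows)  # [(1, 'A', 'red'), (3, 'B', 'purple'), (4, 'C', 'blue')]
```

This relies on col1 being unique within each col2 group; with ties on col1, several rows per group would be returned.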

How to match the list of values in an IN clause with another IN clause in a linear manner

--Table_1
col1 col2
............
123 abc
456 def
123 def
select * from Table_1 where col1 in (123,456) and col2 in ('abc','def');
I want 123 in col1 to match only 'abc' in col2, and not 'def'.
The values in the two IN lists should be matched pairwise, position by position.
Desired output:
col1 col2
123 abc
456 def
You may use tuples for comparison of a combination of multiple columns.
select *
from Table_1
where (col1,col2) in ( (123,'abc'),(456,'def'), (789,'abc') );
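The tuple comparison can be tried with Python's sqlite3 module, as sketched below. One assumption to flag: SQLite (3.15 or newer) accepts a row value on the left of IN but requires a subquery on the right, so the pairs are wrapped in a VALUES clause instead of the bare list used above.

```python
import sqlite3

# Hypothetical in-memory copy of Table_1 from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table_1 (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO Table_1 VALUES (?, ?)",
    [(123, "abc"), (456, "def"), (123, "def")],
)

# Each (col1, col2) pair is compared as a unit, so (123, 'def') is excluded.
paired_rows = conn.execute(
    """
    SELECT col1, col2
    FROM Table_1
    WHERE (col1, col2) IN (VALUES (123, 'abc'), (456, 'def'), (789, 'abc'))
    ORDER BY col1
    """
).fetchall()

print(paired_rows)  # [(123, 'abc'), (456, 'def')]
```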
You can also try the row_number window function:
SELECT col1,col2
from (
select col1,col2,row_number() over(partition by col1 order by col2) rn
from Table_1
where col1 in (123,456) and col2 in ('abc','def')
) t1
where rn = 1

SQL script for retrieving 5 unique values in a table ( google big query )

I am looking for a query that returns 5 unique values from a table.
The table has more than 100 columns. Is there any way I can get the unique values?
I am using Google BigQuery and tried this option:
select col1, col2 ... coln
from tablename
where col1 is not null and col2 is not null
group by col1,col2... coln
order by col1, col2... coln
limit 5
But the problem is that it returns zero records if all the columns are null.
Thanks
R
I think you might be able to do this in Google BigQuery, assuming that the types for the columns are compatible:
select colname, colvalue
from (select 'col1' as colname, col1 as colvalue
from t
where col1 is not null
group by col1
limit 5
),
(select 'col2' as colname, col2 as colvalue
from t
where col2 is not null
group by col2
limit 5
),
. . .
For those not familiar with the syntax, a comma in the from clause means union all, not cross join, in this dialect. Why did they have to change this?
Try this one; I hope it works:
;With CTE as (
select * ,ROW_NUMBER () over (partition by isnull(col1,''),isnull(col2,'')... isnull(coln,'') order by isnull(col1,'')) row_id
from tablename
) select * from CTE where row_id =1

DISTINCT for only one Column and other column random?

I have a table named Demodata with two columns, col1 and col2. The table's data is:
col1 col2
1 5
1 6
2 7
3 8
3 9
4 10
After the SELECT command we need this data:
col1 col2
1    5
     6
2    7
3    8
     9
4    10
Is this possible? If so, what is the query? Please guide me.
Try this
SELECT CASE WHEN RN > 1 THEN NULL ELSE Col1 END,Col2
FROM
(
SELECT *,Row_Number() Over(Partition by col1 order by col2) AS RN
From yourTable
) AS T
No, it is not possible.
SQL Server result sets are row based, not tree based. You must have a value for each column (or alternatively a NULL value).
What you can do is group by col1 and aggregate the values of col2 (possibly concatenating them, e.g. with the STUFF and FOR XML PATH idiom).
You can do this in SQL, using row_number():
select (case when row_number() over (partition by col1 order by col2) = 1
then col1
end), col2
from Demodata t
order by col1, col2;
Notice that the ordering is important. The way you have written the result set, the data is ordered by col1 and then col2. Result sets do not have an inherent ordering, unless you include an order by clause.
Also, I have used NULL for the missing values.
And, finally, although this can be done in SQL, it is often preferable to do these types of manipulations on the client side.
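Both options mentioned here can be sketched with Python's sqlite3 module, assuming an in-memory copy of Demodata (SQLite 3.25 or newer for the window function). The first query blanks repeated col1 values in SQL; the loop after it does the same on the client side from a plain ordered result.

```python
import sqlite3

# Hypothetical in-memory copy of Demodata from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Demodata (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO Demodata VALUES (?, ?)",
    [(1, 5), (1, 6), (2, 7), (3, 8), (3, 9), (4, 10)],
)

# Option 1: in SQL, show col1 only on the first row of each group.
sql_side = conn.execute(
    """
    SELECT CASE WHEN ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) = 1
                THEN col1 END,
           col2
    FROM Demodata
    ORDER BY col1, col2
    """
).fetchall()

# Option 2: on the client, blank col1 whenever it repeats the previous row.
rows = conn.execute("SELECT col1, col2 FROM Demodata ORDER BY col1, col2").fetchall()
client_side = []
previous = object()  # sentinel that never equals a real col1 value
for col1, col2 in rows:
    client_side.append((col1 if col1 != previous else None, col2))
    previous = col1

print(sql_side)  # [(1, 5), (None, 6), (2, 7), (3, 8), (None, 9), (4, 10)]
```

The client-side version keeps the query simple and leaves the choice of placeholder (NULL, empty string, and so on) to the presentation layer.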
What do you want to select on the duplicates, an empty string, NULL, 0, ... ?
I presume NULL, you can use a CTE with ROW_NUMBER and CASE on col1:
WITH CTE AS(
SELECT RN = ROW_NUMBER() OVER (PARTITION BY col1
ORDER BY (SELECT 1))
, col1, col2
FROM Demodata
)
SELECT col1 = CASE WHEN RN = 1 THEN col1 ELSE NULL END, col2
FROM CTE