Select all except duplicate columns unless the length of the column's string is longest

I have a table with the following columns:
strWord, strWordType, strWordDescription
I'd like to be able to select all of the rows except the ones where there is a duplicate strWordDescription. In the case of duplicates, I only want to return the rows where strWord has the longest length. This should only take effect if strWordType is the same.
Notes: There are no duplicate rows of strWords/strWordType combinations only duplicate strWordDescriptions for specific strWordTypes. I would like to avoid using Distinct.
Example: myTable
strWord | strWordType | strWordDescription
blue | 2012 | This is a color
blue | 2014 | This is a color
green | 2012 | This is a color
ham | 2014 | This is a food
chicken | 2014 | This is a food
Expected Results:
strWord | strWordType | strWordDescription
green | 2012 | This is a color
blue | 2014 | This is a color
chicken | 2014 | This is a food

Hmmm . . . a correlated subquery comes to mind:
select t.*
from myTable t
where t.strWord = (select top (1) t2.strWord
                   from myTable t2
                   where t2.strWordDescription = t.strWordDescription and
                         t2.strWordType = t.strWordType
                   order by len(t2.strWord) desc
                  );

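Just solved it:
SELECT MAX(mt.strWord),
       mt.strWordType,
       mt.strWordDescription
FROM myTable mt
GROUP BY mt.strWordType, mt.strWordDescription
ORDER BY MAX(mt.strWord)
Note that MAX(mt.strWord) returns the alphabetically greatest word in each group, not the longest one. For the sample data it returns ham rather than chicken for the 2014 food rows.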
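To compare the two answers on the sample data, here is a minimal sketch that runs both in an in-memory SQLite database from Python. It assumes SQLite rather than SQL Server, so TOP (1) becomes LIMIT 1 and LEN() becomes length(). The extra `t2.strWord` tie-breaker in the ORDER BY is my own addition so that equal-length words resolve deterministically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (strWord TEXT, strWordType INTEGER, strWordDescription TEXT)"
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [
        ("blue", 2012, "This is a color"),
        ("blue", 2014, "This is a color"),
        ("green", 2012, "This is a color"),
        ("ham", 2014, "This is a food"),
        ("chicken", 2014, "This is a food"),
    ],
)

# Correlated subquery: keep the row whose word is the longest in its
# (strWordType, strWordDescription) group.
longest = conn.execute(
    """
    SELECT t.strWord, t.strWordType, t.strWordDescription
    FROM myTable t
    WHERE t.strWord = (SELECT t2.strWord
                       FROM myTable t2
                       WHERE t2.strWordDescription = t.strWordDescription
                         AND t2.strWordType = t.strWordType
                       ORDER BY length(t2.strWord) DESC, t2.strWord  -- tie-breaker is an assumption
                       LIMIT 1)
    ORDER BY t.strWordType, t.strWordDescription
    """
).fetchall()

# GROUP BY with MAX(): string MAX compares alphabetically, not by length.
alphabetical = conn.execute(
    """
    SELECT MAX(strWord), strWordType, strWordDescription
    FROM myTable
    GROUP BY strWordType, strWordDescription
    ORDER BY strWordType, strWordDescription
    """
).fetchall()

print(longest)
print(alphabetical)
```

The two result lists differ only in the 2014 food row, which is where word length and alphabetical order disagree.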

Related

How to filter out all records that have duplicates in SQL?

Trying to get this result from a table with duplicates
red
red
red
blue
green
to
blue
green
That is, completely omit every record that has duplicates and return only the records that are unique.
Use GROUP BY and HAVING clause...
select color
from table1
group by color
having count(color) = 1
If you need more than just the colours.
select *
from paintmess
where colour in (
select colour
from paintmess
group by colour
having count(*)=1
);
id | colour
4 | blue
5 | green
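The second query can be tried without a database server. This is a small sketch using Python's built-in sqlite3 module; the `paintmess` table layout (an auto-numbered `id` plus `colour`) is assumed from the result shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE paintmess (id INTEGER PRIMARY KEY, colour TEXT)")
conn.executemany(
    "INSERT INTO paintmess (colour) VALUES (?)",
    [("red",), ("red",), ("red",), ("blue",), ("green",)],  # ids 1..5
)

# Keep only rows whose colour occurs exactly once in the whole table.
unique_rows = conn.execute(
    """
    SELECT *
    FROM paintmess
    WHERE colour IN (SELECT colour
                     FROM paintmess
                     GROUP BY colour
                     HAVING COUNT(*) = 1)
    ORDER BY id
    """
).fetchall()

print(unique_rows)
```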

How to not lose records in full join

Let's say I have two tables, table A and table B, shown below:
A
Color ID
Blue 1
Green 2
Red 3
B
Color ID
Blue 1
Brown 2
Red 3
If I were to attempt to join them using a full join, the result would depend on which table I use in the select statement. For example the following query would produce the following result
select A.color, count(*)
from A
full join B on a.color = B.color
group by 1
order by 1
color count
Blue 1
Green 1
Red 1
NULL 1
If I decided to use B.color in the select statement instead of A.color, I would get the result below:
color count
Blue 1
Brown 1
Red 1
NULL 1
How would I get the result set to include all values for color? I know I could accomplish this using UNION ALL, or with a CASE expression in the select list that uses one column when the other is null, but is there a cleaner way?
Thanks,
Ben
Use coalesce to pick up the value from the other table in case the value exists in one table and not the other.
select coalesce(A.color,B.color) as color, count(*)
from A
full join B on a.color = B.color
group by 1
order by 1
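To see why COALESCE fixes the blank row, here is a rough pure-Python model of the join. It is only a sketch: `full_join` and `coalesce` are hypothetical helpers written for this illustration, not library functions, and plain lists stand in for the color columns of A and B.

```python
from collections import Counter

a_colors = ["Blue", "Green", "Red"]
b_colors = ["Blue", "Brown", "Red"]


def full_join(left, right):
    """Pair equal values; unmatched values get None on the other side."""
    rows = [(x, y) for x in left for y in right if x == y]
    matched_left = {x for x, _ in rows}
    matched_right = {y for _, y in rows}
    rows += [(x, None) for x in left if x not in matched_left]
    rows += [(None, y) for y in right if y not in matched_right]
    return rows


def coalesce(*values):
    """Return the first value that is not None, like SQL COALESCE."""
    return next((v for v in values if v is not None), None)


joined = full_join(a_colors, b_colors)

# Grouping on A.color alone: Brown has no A side, so it lands in the None group.
by_a = sorted(
    Counter(a for a, _ in joined).items(),
    key=lambda kv: (kv[0] is None, kv[0] or ""),
)

# Grouping on COALESCE(A.color, B.color): every color gets its own group.
by_either = sorted(Counter(coalesce(a, b) for a, b in joined).items())

print(by_a)
print(by_either)
```

Grouping on one side's column collapses the other side's unmatched rows into a single NULL group, while the coalesced key keeps them apart.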

Select one distinct row based on a case statement applied to a column

I'm unable to figure out a SQL query (using MS SQL Server). I'm trying to retrieve a single row from a dataset in which an item with one id can have more than one row. The part that is throwing me off is that the correct row should be chosen based on a "hierarchy". I have tried throwing a CASE statement at the problem.
Some sample data:
Id Class Date
100 Red 2012-12-12
100 Blue 2012-12-31
200 Red 2012-10-31
300 Green 2012-04-04
300 Blue 2011-09-01
I want to return a single row based on the value of Class.
Case When Red Then
Date
Case When Blue Then
Date
Case When Green Then
Date
Else
''
My final dataset should look like this:
Id Class Date
100 Red 2012-12-12
200 Red 2012-10-31
300 Blue 2011-09-01
So, if one of the duplicate rows has a value of Red, use the date from that row first. Then blue, then green.
Been working on this one all day, playing around with subqueries, group bys, havings, case statements, derived tables. I'm quite rusty on my sql skills, as it's been a while.
Any hints on the direction I should take?
You can try this
;WITH cte AS
(
SELECT Id, Class, Date,
row_number() OVER (PARTITION BY Id
ORDER BY CASE Class
WHEN 'Red' THEN 1
WHEN 'Blue' THEN 2
WHEN 'Green' THEN 3
ELSE 4 END) as rn
FROM MyTable
)
SELECT Id, Class, Date
FROM cte
WHERE rn = 1
Try:
select id, class, date
from TABLE
where class = COLOR
and date = (select min(date) from TABLE where class = COLOR)
I like @bobs' query and it'll work well on a T-SQL database.
This is just another way of doing the same thing that may be a little more portable to other SQL databases that have common table expressions but not the same row number syntax;
;WITH cte AS
(SELECT *, CASE Class WHEN 'Red' THEN 1 WHEN 'Blue' THEN 2
WHEN 'Green' THEN 3 ELSE 4 END c FROM myTable)
SELECT b1.Id, b1.Class, b1.Date
FROM cte b1
LEFT JOIN cte b2
ON b1.Id = b2.Id AND b1.c > b2.c
WHERE b2.Class IS NULL
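The self-join version can be checked against the sample data with Python's built-in sqlite3 module. This is a sketch under the assumption that SQLite behaves like the other databases here; the ORDER BY and the `b2.Id IS NULL` test (instead of `b2.Class`) are small changes of mine to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Id INTEGER, Class TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (100, "Red", "2012-12-12"),
        (100, "Blue", "2012-12-31"),
        (200, "Red", "2012-10-31"),
        (300, "Green", "2012-04-04"),
        (300, "Blue", "2011-09-01"),
    ],
)

# Rank each class, then keep a row only if no row with the same Id
# has a better (lower) rank.
rows = conn.execute(
    """
    WITH cte AS (
        SELECT Id, Class, Date,
               CASE Class WHEN 'Red' THEN 1
                          WHEN 'Blue' THEN 2
                          WHEN 'Green' THEN 3
                          ELSE 4 END AS c
        FROM MyTable
    )
    SELECT b1.Id, b1.Class, b1.Date
    FROM cte b1
    LEFT JOIN cte b2
           ON b1.Id = b2.Id AND b1.c > b2.c
    WHERE b2.Id IS NULL
    ORDER BY b1.Id
    """
).fetchall()

print(rows)
```

One difference from the row_number() approach: if an Id has two rows with the same class, this version returns both of them, whereas rn = 1 keeps exactly one.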

Conditionally append a character in select statement

Functionality I'm trying to add to my DB2 stored procedure:
Select a MIN() date from a joined table column.
IF there was more than one row in this joined table, append a " * " to the date.
Thanks, any help or guidance is much appreciated.
It's not clear which flavor of DB2 is needed nor if any suggestion worked. This works on DB2 for i:
SELECT
T1.joinCol1,
max( T2.somedateColumn ),
count( t2.somedateColumn ),
char(max( T2.somedateColumn )) concat case when count( T2.somedateColumn )>1 then '*' else '' end
FROM joinFile1 t1 join joinFile2 t2
on joinCol1 = joinCol2
GROUP BY T1.joinCol1
ORDER BY T1.joinCol1
The SQL is fairly generic, so it should translate to many environments and versions.
Substitute table and column names as needed. The COUNT() here actually counts rows from the JOIN rather than the number of times the specific date occurs. If a count of duplicate dates is needed, then some changes to this example are also needed.
Hope this helps
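For anyone without DB2 at hand, the same idea can be sketched in SQLite from Python. Assumptions: the table and column names are borrowed from the example above, the dates are made-up ISO strings, `||` replaces `concat`, and MIN() is used because that is what the question asked for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE joinFile1 (joinCol1 INTEGER)")
conn.execute("CREATE TABLE joinFile2 (joinCol2 INTEGER, somedateColumn TEXT)")
conn.executemany("INSERT INTO joinFile1 VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO joinFile2 VALUES (?, ?)",
    [(1, "2020-03-01"), (1, "2020-01-05"), (2, "2021-07-04")],  # hypothetical data
)

# Earliest date per key; append '*' when the join produced more than one row.
rows = conn.execute(
    """
    SELECT t1.joinCol1,
           MIN(t2.somedateColumn)
             || CASE WHEN COUNT(*) > 1 THEN '*' ELSE '' END AS first_date
    FROM joinFile1 t1
    JOIN joinFile2 t2 ON t1.joinCol1 = t2.joinCol2
    GROUP BY t1.joinCol1
    ORDER BY t1.joinCol1
    """
).fetchall()

print(rows)
```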
Say I have a result set like this:
1 Jeff 1
2 Jeff 333
3 Jeff 77
4 Jeff 1
5 Jeff 14
6 Bob 22
7 Bob 4
8 Bob 5
9 Bob 6
Here the value 1 appears twice (in the third column).
So this query returns the count as 2, with a * appended to it:
SELECT A.USER_VAL,
DECODE(A.CNT, '1', A.CNT, '0', A.CNT, CONCAT(A.CNT, '*')) AS CNT
FROM (SELECT DISTINCT BT.USER_VAL, CAST(COUNT(*) AS VARCHAR2(2)) AS CNT
FROM SO_BUFFER_TABLE_8 BT
GROUP BY BT.USER_VAL) A

How to get a proper count in sql server when retrieving a lot of fields?

Here is my scenario,
I have a query that returns a lot of fields. One of the fields is called ID, and I want to group by ID and show a count in descending order. However, since I am bringing back more fields, it becomes harder to show a true count because I have to group by those other fields too. Here is an example of what I am trying to do. If I just have 2 fields (ID, color) and I group by both, I may end up with something like this:
ID COLOR COUNT
== ===== =====
2 red 10
3 blue 5
4 green 24
Let's say I add another field for the name. The rows belong to the same person, but the name is spelled differently, which throws the count off, so I might have something like this:
ID COLOR NAME COUNT
== ===== ====== =====
2 Red Jim 5
2 Red Jimmy 5
3 Red Bob 3
3 Red Robert 2
4 Red Johnny 12
4 Red John 12
I want to be able to bring back ID, Color, Name, and Count, but display the counts like in the first table. Is there a way to do this using the ID?
If you want a single result set, you would have to omit the name, as in your first post
SELECT Id, Color, COUNT(*)
FROM YourTable
GROUP By Id, Color
Now, you could get your desired functionality with a subquery, although not elegant
SELECT Id, Color, Name, (SELECT COUNT(*)
FROM YourTable
Where Id = O.Id
AND Color = O.Color
) AS "Count"
FROM YourTable O
GROUP BY Id, Color, Name
This should work as you desire
Try this:-
SELECT DISTINCT a.ID, a.Color, a.Name, b.[Count]
FROM yourTable a
INNER JOIN (
    SELECT ID, Color, Count(1) [Count] FROM yourTable
    GROUP BY ID, Color
) b ON a.ID = b.ID AND a.Color = b.Color
ORDER BY [Count] DESC
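Here is a small sketch of the join-to-aggregate approach, run in SQLite from Python rather than SQL Server. The rows are made up to match part of the second table in the question (5 Jim + 5 Jimmy for ID 2, 3 Bob + 2 Robert for ID 3), and the count column is aliased `Cnt` to avoid quoting a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (ID INTEGER, Color TEXT, Name TEXT)")
sample = (
    [(2, "Red", "Jim")] * 5
    + [(2, "Red", "Jimmy")] * 5
    + [(3, "Red", "Bob")] * 3
    + [(3, "Red", "Robert")] * 2
)
conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", sample)

# Count per (ID, Color) in a derived table, then attach that count to
# every distinct (ID, Color, Name) combination.
result = conn.execute(
    """
    SELECT DISTINCT a.ID, a.Color, a.Name, b.Cnt
    FROM yourTable a
    INNER JOIN (SELECT ID, Color, COUNT(*) AS Cnt
                FROM yourTable
                GROUP BY ID, Color) b
            ON a.ID = b.ID AND a.Color = b.Color
    ORDER BY b.Cnt DESC, a.ID, a.Name
    """
).fetchall()

print(result)
```

Each name variant now carries the full count for its ID and color instead of its own partial count.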
Try doing a sub query to get the count.
-- MarkusQ