How to select the top n from a union of two queries where the resulting order needs to be ranked by individual query? - sql

Let's say I have a table with usernames:
Id | Name
-----------
1 | Bobby
20 | Bob
90 | Bob
100 | Joe-Bob
630 | Bobberino
820 | Bob Junior
I want to return a list of n matches on name for 'Bob' where the resulting set first contains exact matches followed by similar matches.
I thought something like this might work
SELECT TOP 4 a.* FROM
(
SELECT * from Usernames WHERE Name = 'Bob'
UNION
SELECT * from Usernames WHERE Name LIKE '%Bob%'
) AS a
but there are two problems:
It's an inefficient query since the sub-select could return many rows (looking at the execution plan shows a join happening before top)
(Almost) more importantly, the exact match(es) will not appear first in the results since the resulting set appears to be ordered by primary key.
I am looking for a query that will return (for TOP 4)
Id | Name
---------
20 | Bob
90 | Bob
(and then 2 results from the LIKE query, e.g. 1 Bobby and 100 Joe-Bob)
Is this possible in a single query?

You could use a case to place the exact matches on top:
select top 4 *
from Usernames
where Name like '%Bob%'
order by
case when Name = 'Bob' then 1 else 2 end
Or, if you're worried about performance and have an index on (Name):
select top 4 *
from (
select 1 as SortOrder
, *
from Usernames
where Name = 'Bob'
union all
select 2
, *
from Usernames
where Name like '%Bob%'
and Name <> 'Bob'
and 4 >
(
select count(*)
from Usernames
where Name = 'Bob'
)
) as SubqueryAlias
order by
SortOrder

A slight modification to your original query should solve this. You could add in an additional UNION that matches WHERE Name LIKE 'Bob%' and give this priority 2, changing the '%Bob' priority to 3 and you'd get an even better search IMHO.
SELECT TOP 4 a.* FROM
(
SELECT *, 1 AS Priority from Usernames WHERE Name = 'Bob'
UNION
SELECT *, 2 from Usernames WHERE Name LIKE '%Bob%'
) AS a
ORDER BY Priority ASC

This might do what you want with better performance.
SELECT TOP 4 a.* FROM
(
SELECT TOP 4 *, 1 AS Sort from Usernames WHERE Name = 'Bob'
UNION ALL
SELECT TOP 4 *, 2 AS Sort from Usernames WHERE Name LIKE '%Bob%' and Name <> 'Bob'
) AS a
ORDER BY Sort

This works for me:
SELECT TOP 4 * FROM (
SELECT 1 as Rank , I, name FROM Foo WHERE Name = 'Bob'
UNION ALL
SELECT 2 as Rank,i,name FROM Foo WHERE Name LIKE '%Bob%'
) as Q1
ORDER BY Q1.Rank, Q1.I

SET ROWCOUNT 4
SELECT * from Usernames WHERE Name = 'Bob'
UNION
SELECT * from Usernames WHERE Name LIKE '%Bob%'
SET ROWCOUNt 0

The answer from Will A got me over the line, but I'd like to add a quick note, that if you're trying to do the same thing and incorporate "FOR XML PATH", you need to write it slightly differently.
I was specifying XML attributes and so had things like :
SELECT Field_1 as [#attr_1]
What you have to do is remove the "#" symbol in the sub queries and then add them back in with the outer query. Like this:
SELECT top 1 a.SupervisorName as [#SupervisorName]
FROM
(
SELECT (FirstNames + ' ' + LastName) AS [SupervisorName],1 as OrderingVal
FROM ExamSupervisor SupervisorTable1
UNION ALL
SELECT (FirstNames + ' ' + LastName) AS [SupervisorName],2 as OrderingVal
FROM ExamSupervisor SupervisorTable2
) as a
ORDER BY a.OrderingVal ASC
FOR XML PATH('Supervisor')
This is a cut-down version of my final query, so it doesn't really make sense, but you should get the idea.

Related

Select ID with specific values in more than one field

I have a table as follows
groupCode
ProductIdentifier
1
dental
1
membership
2
dental
2
vision
2
health
3
dental
3
vision
I need to find out if a specific groupCode have "dental", "vision" and "health" (all three simultaneously)
The expected result is code 2
What I need to identify is if groupCode 2 has the three products (or two, or whatever the user enters). This is part of a huge kitchen sink query I'm building.
I'm doing
SELECT groupCode
FROM dbo.table
WHERE (productIdentifier = N'dental')
AND (productIdentifier = N'vision')
AND (productIdentifier = N'health')
AND (groupCode = 2)
But clearly is wrong because it's not working.
I tried to do something like its described here but it didn't return a result for me:
Select rows with same id but different value in another column
Thanks.
If each of 'dental','vision' and 'health' occur only once per group identifier, you can group by group identifier and filter by the groups having count(*) = 3:
WITH
-- your input ..
indata(groupCode,ProductIdentifier) AS (
SELECT 1,'dental'
UNION ALL SELECT 1,'membership'
UNION ALL SELECT 2,'dental'
UNION ALL SELECT 2,'vision'
UNION ALL SELECT 2,'health'
UNION ALL SELECT 3,'dental'
UNION ALL SELECT 3,'vision'
)
-- real query starts here ...
SELECT
groupcode
FROM indata
WHERE productidentifier IN ('dental','vision','health')
GROUP BY
groupcode
HAVING COUNT(*) = 3;
-- out groupcode
-- out -----------
-- out 2
As per Marcothesane answer, if you know the groupCode (2) and the number of products (vision, dental and health), 3 in this case, and you need to confirm if that code has those three specific products, this will work for you:
SELECT COUNT(groupCode) AS totalRecords
FROM dbo.table
WHERE (groupCode = 2) AND (productIdentifier IN ('dental', 'vision', 'health'))
HAVING (COUNT(groupCode) = 3)
This will return 3 (number of records = number of products).
Its basically's Marcothesane answer in a way you can "copy/paste" to your code by just changing the table name. You should accept Marcothesane answer.

Oracle: union all query 1 and query 2 want to minus some rows if query 1 have rowdata

my query as below , i want to minus some rows from query1 when query2 have rowdata , but i don't know how to do:
my query:
with query1 as(
select wm_concat(linkman_name) name,
wm_concat(phone_num) phone,
t.org_id
from (
select linkman_name, phone_num, LINK_ORG_ID, org_id
from TD_SM_LINKMAN
where STATE = '2'
and (LINK_ORG_ID is null or LINK_ORG_ID = '')) t
group by t.org_id) ,
query2 as(
select wm_concat(linkman_name) name,
wm_concat(phone_num) phone,
org_id
from (select linkman_name, phone_num, LINK_ORG_ID, org_id
from TD_SM_LINKMAN
where STATE = '2'
and (LINK_ORG_ID = '55')) t
group by org_id)
select *
from query1
union all
select *
from query2 minus
-- this doesn't work ,i want to minus the rowdata from query 1 when query1.org_id = query2.org_id. the query2 is marked as outer query column.
(select * from query1 where query1.ORG_ID = query2.ORG_ID)
;
sample table
name phone link_org_id org_id
lily 133 1
ming 144 1
hao 333 2
jane 1234 55 2
bob 666 3
herry 555 3
query 1 result:
name phone org_id
lily,ming 133,144 1
hao 333 2
bob,herry 666,555 3
query 2 result:
name phone org_id
jane 1234 2
such like this , jane selected by query2 and hao selected by query 1 . All of them are from a same org which org_id =2 . but i don't need hao ,i just need jane. how to do?
i means if query2 can find result , then no need query1's result. but if query2 can't find any data, then i need query1's data.
The way it is now, you'll first have to split names (and phones) into rows, and then apply set operators (UNION, MINUS) to such a data.
Which means that you shouldn't use WM_CONCAT at all; at least, not at the beginning, because
first you concatenate data
then you'd have to split it back into rows
UNION / MINUS sets
Doing useless job in the first 2 steps.
I'd suggest you to UNION / MINUS data first, then aggregate them using WM_CONCAT. By the way, which database version do you use? WM_CONCAT is a) undocumented, b) doesn't even exist in latest Oracle database versions so you'd rather switch to LISTAGG, if possible.

Query for transitive word translation

I have trouble constructing what I guess should be a CTE query, which would return a word translation.
I have two tables - WORDS:
ID | WORD | LANG
1 'Cat' 'ENG'
2 'Kot' 'POL'
3 'Katze' 'GER'
and CONNECTIONS:
ID | WORD_A | WORD_B
1 1 2
2 1 3
As ENG->POL, and ENG->GER translations already exist, I would like to receive a POL->GER translation.
So for any given word, I have to check it's connected translations, if target language is not there, I should check connected translations of those connected translations etc., up to the point where either translation is found, or nothing is returned;
I have no idea where to start, it wouldn't be a problem if there was a constant number of transitions required, but it might as well require transition like POL->ITA->...->ENG->GER.
This is a complicated graph walking problem. The idea is to set up a connections2 relationship that has words and languages in both directions, and then use this for walking the graph.
To get all German translations for Polish words, you can use:
with words AS (
select 1 as id, 'Cat' as word, 'ENG' as lang union all
select 2, 'Kot', 'POL' union all
select 3, 'Katze', 'GER'
),
connections as (
select 1 as id, 1 as word_a, 2 as word_b union all
select 2 as id, 1 as word_a, 3 as word_b
),
connections2 as (
select c.word_a as id_a, c.word_b as id_b, wa.lang as lang_a, wb.lang as lang_b
from connections c join
words wa
on c.word_a = wa.id join
words wb
on c.word_b = wb.id
union -- remove duplicates
select c.word_b, c.word_a, wb.lang as lang_a, wa.lang as lang_b
from connections c join
words wa
on c.word_a = wa.id join
words wb
on c.word_b = wb.id
),
cte as (
select id as pol_word, id as other_word, 'POL' as other_lang, 1 as lev, ',POL,' as langs
from words
where lang = 'POL'
union all
select cte.pol_word, c2.id_b, c2.lang_b, lev + 1, langs || c2.lang_b || ','
from cte join
connections2 c2
on cte.other_lang = c2.lang_a
where langs not like '%,' || c2.lang_b || ',%' or c2.lang_b = 'GER'
)
select *
from cte
where cte.other_lang = 'GER';
Here is a db<>fiddle.
Presumably, you want the shortest path. For that, you would use this query (after the CTEs):
select *
from cte
where cte.other_lang = 'GER' and
cte.lev = (select min(cte2.lev) from cte cte2 where ct2.pol_word = cte.pol_word);
I recommend changing your table design to something along these lines:
ID | WORD_ID | WORD | LANG
1 | 1 | 'Cat' | 'ENG'
2 | 1 | 'Kot' | 'POL'
3 | 1 | 'Katze' | 'GER'
Now suppose that you want to go from Polish to German, and you are starting off with the Polish word for cat, Kot. Then, you may use the following query:
SELECT t2.WORD
FROM yourTable t1
INNER JOIN yourTable t2
ON t1.WORD_ID = t1.WORD_ID AND t2.LANG = 'GER'
WHERE
t1.WORD = 'Kot' AND t1.LANG = 'POL'
Demo
The idea here is that you want one of your tables to maintain a mapping of every logically distinct word to some common id or reference. Then, you only need an incoming word and language, and a desired output language, to be able to do the mapping.
Note: For simplicity I am only showing a table with a single word across a handful of language, but the logic should still work with multiple words and languages. Your current table may not be normalized, but optimizing that might be another task separate from your immediate question.

SQL Query - Display Count & All ID's With Same Name

I'm trying to display the amount of table entries with the same name and the unique ID's associated with each of those entries.
So I have a table like so...
Table Names
------------------------------
ID Name
0 John
1 Mike
2 John
3 Mike
4 Adam
5 Mike
I would like the output to be something like:
Name | Count | IDs
---------------------
Mike 3 1,3,5
John 2 0,2
Adam 1 4
I have the following query which does this except display all the unique ID's:
select name, count(*) as ct from names group by name order by ct desc;
select name,
count(id) as ct,
group_concat(id) as IDs
from names
group by name
order by ct desc;
You can use GROUP_CONCAT for that
Depending on version of MSSQL you are using (2005+), you can use the FOR XML PATH option.
SELECT
Name,
COUNT(*) AS ct,
STUFF((SELECT ',' + CAST(ID AS varchar(MAX))
FROM names i
WHERE i.Name = n.Name FOR XML PATH(''))
, 1, 1, '') as IDs
FROM names n
GROUP BY Name
ORDER BY ct DESC
Closest thing to group_concat you'll get on MSSQL unless you use the SQLCLR option (which I have no experience doing). The STUFF function takes care of the leading comma. Also, you don't want to alias the inner SELECT as it will wrap the element you're selecting in an XML element (alias of TD causes each element to return as <TD>value</TD>).
Given the input above, here's the result I get:
Name ct IDs
Mike 3 1,3,5
John 2 0,2
Adam 1 4
EDIT: DISCLAIMER
This technique will not work as intended for string fields that could possibly contain special characters (like ampersands &, less than <, greater than >, and any number of other formatting characters). As such, this technique is most beneficial for simple integer values, although can still be used for text if you are ABSOLUTELY SURE there are no special characters that would need to be escaped. As such, read the solution posted HERE to ensure these characters get properly escaped.
Here is another SQL Server method, using recursive CTE:
Link to SQLFiddle
; with MyCTE(name,ids, name_id, seq)
as(
select name, CAST( '' AS VARCHAR(8000) ), -1, 0
from Data
group by name
union all
select d.name,
CAST( ids + CASE WHEN seq = 0 THEN '' ELSE ', ' END + cast(id as varchar) AS VARCHAR(8000) ),
CAST( id AS int),
seq + 1
from MyCTE cte
join Data d
on cte.name = d.name
where d.id > cte.name_id
)
SELECT name, ids
FROM ( SELECT name, ids,
RANK() OVER ( PARTITION BY name ORDER BY seq DESC )
FROM MyCTE ) D ( name, ids, rank )
WHERE rank = 1

SQL - Removing Duplicate without 'hard' coding?

Heres my scenario.
I have a table with 3 rows I want to return within a stored procedure, rows are email, name and id. id must = 3 or 4 and email must only be per user as some have multiple entries.
I have a Select statement as follows
SELECT
DISTINCT email,
name,
id
from table
where
id = 3
or id = 4
Ok fairly simple but there are some users whose have entries that are both 3 and 4 so they appear twice, if they appear twice I want only those with ids of 4 remaining. I'll give another example below as its hard to explain.
Table -
Email Name Id
jimmy#domain.com jimmy 4
brian#domain.com brian 4
kevin#domain.com kevin 3
jimmy#domain.com jimmy 3
So in the above scenario I would want to ignore the jimmy with the id of 3, any way of doing this without hard coding?
Thanks
SELECT
email,
name,
max(id)
from table
where
id in( 3, 4 )
group by email, name
Is this what you want to achieve?
SELECT Email, Name, MAX(Id) FROM Table WHERE Id IN (3, 4) GROUP BY Email;
Sometimes using Having Count(*) > 1 may be useful to find duplicated records.
select * from table group by Email having count(*) > 1
or
select * from table group by Email having count(*) > 1 and id > 3.
The solution provided before with the select MAX(ID) from table sounds good for this case.
This maybe an alternative solution.
What RDMS are you using? This will return only one "Jimmy", using RANK():
SELECT A.email, A.name,A.id
FROM SO_Table A
INNER JOIN(
SELECT
email, name,id,RANK() OVER (Partition BY name ORDER BY ID DESC) AS COUNTER
FROM SO_Table B
) X ON X.ID = A.ID AND X.NAME = A.NAME
WHERE X.COUNTER = 1
Returns:
email name id
------------------------------
jimmy#domain.com jimmy 4
brian#domain.com brian 4
kevin#domain.com kevin 3