Select rows with multiple occurrences over multiple tables - sql

DB: SQL Server 2008
I want to select id, title and userid from a table if that userid occurs more than once across 2 tables.
So if a user has 2 records in the locations table and 1 in the artists table, I want to return the id, title and userid of all 3 records.
This is what I have now, but it returns 0 records.
When I leave out the "having count(userid)>1" part, I get ALL 400 records in all tables.
select userid,id,title from (
select id,title,userid from locations l
union
select id,title,userid from artists a
) as info
group by userid,id,title
having count(userid)>1

Something like this should work:
select userid,id,title
from
( select id,title,userid from locations l union select id,title,userid from artists a )
as grabfromthis
where userid in (
select userid
from
( select id,title,userid from locations l union select id,title,userid from artists a )
as info
group by userid having count(userid)>1)

I modified the above slightly to make it just a bit more slick:
WITH combined AS
(
SELECT id FROM t1
UNION ALL
SELECT id FROM t2
-- etc...
)
SELECT id, count(id) as numOccurrences
FROM combined
GROUP BY id
HAVING count(id) > 1
The difference being that this will show you just how many times your number is repeated.
You could also add
ORDER BY numOccurrences DESC
to list the worst offenders first.

;WITH combined AS (
SELECT id, title, userid FROM locations
UNION ALL
SELECT id, title, userid FROM artists
)
SELECT c.id, c.title, c.userid
FROM combined c
INNER JOIN (
SELECT userid
FROM combined
GROUP BY userid
HAVING COUNT(*) > 1
) g ON c.userid = g.userid
Get the list of non-unique userids from the UNION ALL subquery and join it back to that subquery on userid. The subquery is written only once, as a CTE.
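For anyone who wants to try this without a SQL Server instance, here is a minimal sketch that runs the same CTE-and-join pattern against an in-memory SQLite database from Python. SQLite is only a stand-in, and the sample rows are made up for illustration. It also shows why the query in the question returns nothing: grouping by userid, id and title after a UNION leaves exactly one row per group, so count(userid) > 1 can never be true.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations (id INTEGER, title TEXT, userid INTEGER);
    CREATE TABLE artists   (id INTEGER, title TEXT, userid INTEGER);
    -- Made-up rows: user 10 appears three times, users 20 and 30 once each.
    INSERT INTO locations VALUES (1, 'Loc A', 10), (2, 'Loc B', 10), (3, 'Loc C', 20);
    INSERT INTO artists   VALUES (1, 'Art A', 10), (2, 'Art B', 30);
""")

# Group by userid only, then join back to fetch the full rows.
query = """
WITH combined AS (
    SELECT id, title, userid FROM locations
    UNION ALL
    SELECT id, title, userid FROM artists
)
SELECT c.id, c.title, c.userid
FROM combined c
INNER JOIN (
    SELECT userid FROM combined GROUP BY userid HAVING COUNT(*) > 1
) g ON c.userid = g.userid
ORDER BY c.title
"""
rows = conn.execute(query).fetchall()

# The question's original grouping: every group holds a single row.
original = conn.execute("""
    SELECT userid, id, title FROM (
        SELECT id, title, userid FROM locations
        UNION
        SELECT id, title, userid FROM artists
    ) AS info
    GROUP BY userid, id, title
    HAVING COUNT(userid) > 1
""").fetchall()

print(rows)      # the three rows belonging to user 10
print(original)  # empty
```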

Related

Select number of IDs in more than one table (from three tables)

I need the count of this:
select distinct ID
from (
select ID from A
union all
select ID from B
union all
select ID from C
) ids
GROUP BY ID HAVING COUNT(*) > 1;
but I have no idea how to do it.
Use a subquery:
select count(*)
from (select ID
from (select ID from A
union all
select ID from B
union all
select ID from C
) ids
group by ID
having count(*) > 1
) i;
SELECT DISTINCT is almost never needed with GROUP BY and definitely not in this case.
You want the IDs that appear two or more times across tables A, B and C. The SQL is below:
select count(1) from (
select
id,
count(1) as cnt
from
(
select ID from A
union all
select ID from B
union all
select ID from C
) ids
group by id having(count(1)>1)
) tmp
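One caveat with the UNION ALL approach is that an ID repeated inside a single table is also counted. Below is a rough sketch, again using SQLite from Python as a stand-in with made-up rows, that compares the query above with a variant that tags each row with its source table (the src column is my own addition, not something from the question) and counts distinct sources instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER);
    CREATE TABLE B (ID INTEGER);
    CREATE TABLE C (ID INTEGER);
    -- Made-up rows: 1 is repeated inside A only; 2 and 3 span two tables.
    INSERT INTO A VALUES (1), (1), (2);
    INSERT INTO B VALUES (2), (3);
    INSERT INTO C VALUES (3), (4);
""")

# IDs that occur more than once anywhere in the combined rows.
repeated = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT ID FROM (
            SELECT ID FROM A
            UNION ALL SELECT ID FROM B
            UNION ALL SELECT ID FROM C
        ) ids
        GROUP BY ID HAVING COUNT(*) > 1
    ) i
""").fetchone()[0]

# IDs that occur in more than one table (src is a hypothetical tag column).
multi_table = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT ID FROM (
            SELECT ID, 'A' AS src FROM A
            UNION ALL SELECT ID, 'B' FROM B
            UNION ALL SELECT ID, 'C' FROM C
        ) ids
        GROUP BY ID HAVING COUNT(DISTINCT src) > 1
    ) i
""").fetchone()[0]

print(repeated, multi_table)
```

If IDs are unique within each table, both queries give the same count and the simpler one is enough.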

Select a third column based on two distant rows within the same table

I want to select a third column based on two distant columns within the same table.
I could only think of this:
select tl.thirdcolumn
from table1 t1
WHERE
EXISTS
(
Select distinct tl.firstcolumn , t1.secondcolumn
From t1
)
This:
select distinct tl.thirdcolumn
from table t1
won't work, because I don't want the distinct values of the third column. I want the third column returned for each distinct combination of the first two columns.
I guess it's some kind of nested SQL statement with a select top 1, but I'm not sure.
CATEGORY   NAME                          Query
---------------------------------------------------
STUDENTS   NUMBER_OF_CHAPTERS            QueryA
STUDENTS   NUMBER_OF_STUDENT_MEMBERS     QueryB
STUDENTS   NUMBER_OF_STUDENT_MEMBERS     QueryB
MEMBERS    NUMBER_OF_MEMBERS_WORLDWIDE   QueryC
MEMBERS    NUMBER_OF_MEMBERS_WORLDWIDE   QueryC
Your question is rather hard to follow, but I think you might simply want group by:
select t1.firstcolumn, t1.secondcolumn, max(t1.thirdcolumn)
from table1 t1
group by t1.firstcolumn, t1.secondcolumn;
If you want rows where the pair of values only appears once, then add having count(*) = 1:
select t1.firstcolumn, t1.secondcolumn, max(t1.thirdcolumn)
from table1 t1
group by t1.firstcolumn, t1.secondcolumn
having count(*) = 1;
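Here is a small sketch of the having count(*) = 1 version applied to the sample rows from the question, run through SQLite from Python as a stand-in. The table name items and the column name query_name are placeholders I made up, since the question does not give the real names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "items" and "query_name" are hypothetical names; the rows are from the question.
conn.executescript("""
    CREATE TABLE items (category TEXT, name TEXT, query_name TEXT);
    INSERT INTO items VALUES
        ('STUDENTS', 'NUMBER_OF_CHAPTERS',          'QueryA'),
        ('STUDENTS', 'NUMBER_OF_STUDENT_MEMBERS',   'QueryB'),
        ('STUDENTS', 'NUMBER_OF_STUDENT_MEMBERS',   'QueryB'),
        ('MEMBERS',  'NUMBER_OF_MEMBERS_WORLDWIDE', 'QueryC'),
        ('MEMBERS',  'NUMBER_OF_MEMBERS_WORLDWIDE', 'QueryC');
""")

# Keep only (category, name) pairs that occur exactly once.
unique_pairs = conn.execute("""
    SELECT category, name, MAX(query_name)
    FROM items
    GROUP BY category, name
    HAVING COUNT(*) = 1
""").fetchall()

print(unique_pairs)
```

Dropping the HAVING clause returns one row per pair instead, which is the first variant above.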
Query -
SELECT
CATEGORY,NAME,QUERY
FROM
(
WITH TAB AS (
SELECT
'STUDENTS' AS CATEGORY,
'NUMBER_OF_CHAPTERS' AS NAME,
'QUERYA' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'STUDENTS' AS CATEGORY,
'NUMBER_OF_STUDENT_MEMBERS' AS NAME,
'QUERYB' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'STUDENTS' AS CATEGORY,
'NUMBER_OF_STUDENT_MEMBERS' AS NAME,
'QUERYB' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'MEMBERS' AS CATEGORY,
'NUMBER_OF_MEMBERS_WORLDWIDE' AS NAME,
'QUERYC' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'MEMBERS' AS CATEGORY,
'NUMBER_OF_MEMBERS_WORLDWIDE' AS NAME,
'QUERYC' AS QUERY
FROM
DUAL
) SELECT
CATEGORY,
NAME,
QUERY,
COUNT(*) OVER(PARTITION BY
CATEGORY,
NAME
ORDER BY
CATEGORY,
NAME,
QUERY
) AS RNK
FROM
TAB
)
WHERE
RNK = 1;
Output -
"CATEGORY","NAME","QUERY"
"STUDENTS","NUMBER_OF_CHAPTERS","QUERYA"

SQL - How to Order By in UNION query

Is there a way to union two tables but keep the rows from the first table appearing first in the result set? The catch is that the ORDER BY column is not in the select list.
For example:
Table 1
name surname
-------------------
John Doe
Bob Marley
Ras Tafari
Table 2
name surname
------------------
Lucky Dube
Abby Arnold
Expected Result:
name surname
-------------------
John Doe
Bob Marley
Ras Tafari
Lucky Dube
Abby Arnold
I am fetching the data with the following query:
SELECT name,surname FROM TABLE 1 ORDER BY ID
UNION
SELECT name,surname FROM TABLE 2
The above query does not preserve the ORDER BY after the union.
P.S. I don't want to show ID in my select query.
I am getting the ORDER BY column by joining tables. The following is my real query:
SELECT tbl_Event_Type_Sort_Orders.Appraisal_Event_Type_ID AS Appraisal_Event_Type_ID , ISNULL(tbl_Appraisal_Event_Types.Appraisal_Event_Type_Display_Name, 'UnCategorized') AS Appraisal_Event_Type_Display_Name
INTO #temptbl
FROM tbl_Event_Type_Sort_Orders
INNER JOIN tbl_Appraisal_Event_Types
ON tbl_Event_Type_Sort_Orders.Appraisal_Event_Type_ID = tbl_Appraisal_Event_Types.Appraisal_Event_Type_ID
WHERE 1=1
AND User_Name='abc'
ORDER BY tbl_Event_Type_Sort_Orders.Sort_Order
SELECT * FROM #temptbl
UNION
SELECT DISTINCT (tbl_Appraisal_Event_Types.Appraisal_Event_Type_ID) AS Appraisal_Event_Type_ID , ISNULL(tbl_Appraisal_Event_Types.Appraisal_Event_Type_Display_Name, 'UnCategorized') AS Appraisal_Event_Type_Display_Name
FROM tbl_Appraisal_Event_Types
INNER JOIN tbl_Appraisal_Events
ON tbl_Appraisal_Event_Types.Appraisal_Event_Type_ID = tbl_Appraisal_Events.Event_Type_ID
INNER JOIN tbl_Appraisals
ON tbl_Appraisal_Events.Appraisal_ID = tbl_Appraisal_Events.Appraisal_ID
WHERE 1=1
AND ((tbl_Appraisals.Assigned_To_Staff_User) = 'abc' OR (tbl_Appraisals.Assigned_To_Staff_User2) = 'abc' OR (tbl_Appraisals.Assigned_To_Staff_User3) = 'abc')
Put a UNION ALL in a derived table. To keep duplicate elimination, use select distinct and also add a NOT EXISTS to the second select, so the same person is not returned twice when found in both tables:
select name, surname
from
(
select distinct name, surname, 1 as tno
from table1
union all
select distinct name, surname, 2 as tno
from table2 t2
where not exists (select * from table1 t1
where t2.name = t1.name
and t2.surname = t1.surname)
) dt
order by tno, surname, name
You can use a column for the table and one for the ID to order by:
SELECT x.name, x.surname FROM (
SELECT ID, TableID = 1, name, surname
FROM table1
UNION ALL
SELECT ID = -1, TableID = 2, name, surname
FROM table2
) x
ORDER BY x.TableID, x.ID
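A quick way to check the table-tag idea is the sketch below, which uses SQLite from Python as a stand-in and the sample names from the question. The ID values are made up, and unlike the query above it assumes table2 also has an ID column, so the second table's rows are ordered among themselves too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID INTEGER, name TEXT, surname TEXT);
    CREATE TABLE table2 (ID INTEGER, name TEXT, surname TEXT);
    -- Names are from the question; the ID values are invented.
    INSERT INTO table1 VALUES (1, 'John', 'Doe'), (2, 'Bob', 'Marley'), (3, 'Ras', 'Tafari');
    INSERT INTO table2 VALUES (1, 'Lucky', 'Dube'), (2, 'Abby', 'Arnold');
""")

# The sort columns live in the derived table but not in the outer select list.
ordered = conn.execute("""
    SELECT x.name, x.surname FROM (
        SELECT ID, 1 AS TableID, name, surname FROM table1
        UNION ALL
        SELECT ID, 2 AS TableID, name, surname FROM table2
    ) x
    ORDER BY x.TableID, x.ID
""").fetchall()

for name, surname in ordered:
    print(name, surname)
```

The outer ORDER BY is what fixes the order; neither ID nor TableID needs to be shown.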
You can write it as below. If duplicate rows are acceptable, use UNION ALL instead; it will be faster:
SELECT NAME, surname FROM (
SELECT ID,name,surname FROM TABLE 1
UNION
SELECT ID,name,surname FROM TABLE 2 ) t ORDER BY ID
This orders the rows from the first table first, and then by anything else you need
(I haven't tested the code):
;with cte_1
as
(SELECT ID,name,surname,1 as table_id FROM TABLE 1
UNION
SELECT ID,name,surname,2 as table_id FROM TABLE 2 )
SELECT name, surname
FROM cte_1
ORDER BY table_id,ID
Simply use UNION without an ORDER BY:
SELECT name,surname FROM TABLE 1
UNION
SELECT name,surname FROM TABLE 2
If you want to order the first table, use the query below:
;WITH cte_1
AS
(SELECT name,surname,ROW_NUMBER()OVER(ORDER BY Id)b FROM TABLE 1 )
SELECT name,surname
FROM cte_1
UNION
SELECT name,surname
FROM TABLE 2

How to select top x from with params?

I'm using SQL Server 2005.
I have a Users table with userID and gender. I want to select the top 1000 males (0) and the top 1000 females (1), ordered by userID desc.
If I use a union, only one result set is ordered by userID desc. Is there another way to do this?
SELECT top 1000 *
FROM Users
where gender=0
union
SELECT top 1000 *
FROM Users
where gender=1
order by userID desc
Another way of doing it
WITH TopUsers AS
(
SELECT UserId,
Gender,
ROW_NUMBER() OVER (PARTITION BY Gender ORDER BY UserId DESC) AS RN
FROM Users
WHERE Gender IN (0,1) /*I guess this line might well not be needed*/
)
SELECT UserId, Gender
FROM TopUsers
WHERE RN <= 1000
ORDER BY UserId DESC
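To see the ROW_NUMBER approach on a small scale, here is a sketch using SQLite from Python as a stand-in (it needs SQLite 3.25 or later for window functions). The rows are made up, and the limit is 2 per gender instead of 1000 so the result is easy to check by eye.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (UserId INTEGER, Gender INTEGER);
    -- Made-up rows: three users of each gender.
    INSERT INTO Users VALUES (1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 1);
""")

TOP_N = 2  # stands in for the 1000 in the question

# Number the rows within each gender, newest UserId first, then keep the first N.
top_users = conn.execute("""
    WITH TopUsers AS (
        SELECT UserId,
               Gender,
               ROW_NUMBER() OVER (PARTITION BY Gender ORDER BY UserId DESC) AS RN
        FROM Users
    )
    SELECT UserId, Gender
    FROM TopUsers
    WHERE RN <= ?
    ORDER BY UserId DESC
""", (TOP_N,)).fetchall()

print(top_users)
```

Because the numbering restarts for each gender, a single pass over the table picks the top rows of both groups.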
Martin Smith's solution is better than the following.
SELECT UserID, Gender
FROM
(SELECT TOP 1000 UserId, Gender
FROM Users
WHERE gender = 0
ORDER BY UserId DESC) m
UNION ALL
SELECT UserID, Gender
FROM
(SELECT TOP 1000 UserId, Gender
FROM Users
WHERE gender = 1
ORDER BY UserId DESC) f
ORDER BY Gender, UserID DESC
This does what you want, just change the order by if you'd rather have the latest user first, but it will get you the top 1000 of each.
Done some testing, and the results are pretty strange. If you specify an order by in both parts of a union, SQL Server gives a syntax error:
select top 2 * from #users where gender = 0 order by id
union all
select top 2 * from #users where gender = 1 order by id
That makes sense, because the order by should only be at the end of the union. But if you use the same construct in a subquery, it compiles! And works as expected:
select * from (
select top 2 * from #users where gender = 0 order by id
union all
select top 2 * from #users where gender = 1 order by id
) sub
The strangest thing happens when you specify only one order by for the subquery union:
select * from (
select top 2 * from #users where gender = 0
union all
select top 2 * from #users where gender = 1 order by id
) sub
Now it orders the first half of the union at random, but the second half by id. That's pretty unexpected. The same thing happens with the order by in the first half:
select * from (
select top 2 * from #users where gender = 0 order by id desc
union all
select top 2 * from #users where gender = 1
) sub
I'd expect this to give a syntax error, but instead it orders the first half of the union. So it looks like union interacts with order by in a different way when the union is part of a subquery.
Like Chris Diver originally posted, a good way to get out of the confusion is not to rely on the order by in a union, and specify everything explicitly:
select *
from (
select *
from (
select top 2 *
from #users
where gender = 0
order by
id desc
) males
union all
select *
from (
select top 2 *
from #users
where gender = 1
order by
id desc
) females
) males_and_females
order by
id
Example data:
create table #users (id int identity, name varchar(50), gender bit)
insert into #users (name, gender)
select 'Joe', 0
union all select 'Alex', 0
union all select 'Fred', 0
union all select 'Catherine', 1
union all select 'Diana', 1
union all select 'Esther', 1
You need to ensure that you create a sub-select for the union, then do the ordering outside over the combined results.
Something like this should work:
SELECT u.*
FROM (SELECT u1a.* FROM (SELECT TOP 1000 u1.*
FROM USERS u1
WHERE u1.gender = 0
ORDER BY u1.userid DESC) u1a
UNION ALL
SELECT u2a.* FROM (SELECT TOP 1000 u2.*
FROM USERS u2
WHERE u2.gender = 1
ORDER BY u2.userid DESC) u2a
) u
ORDER BY u.userid DESC
Also, using a UNION ALL will give better performance as the db won't bother checking for duplicates (which there won't be in this query) in the results.

How to add 2 temporary tables together

I am creating temporary tables that have 2 columns, id and score, and I want to add them together.
If both tables contain the same id, I do not want to duplicate the id; instead I want to add the scores together.
Say I have 2 temp tables called t1 and t2,
and t1 had:
id 3 score 4
id 6 score 7
and t2 had:
id 3 score 5
id 5 score 2
I would end up with a new temp table containing:
id 3 score 9
id 5 score 2
id 6 score 7
The reason I want to do this is, I am trying to build a product search. I have a few algorithms I want to use, 1 using fulltext another not. And I want to use both algorithms so I want to create a temporary table based on algorithm1 and a temp table based on algorithm2. Then combine them.
How about:
SELECT id, SUM(score) AS score FROM (
SELECT id, score FROM t1
UNION ALL
SELECT id, score FROM t2
) t3
GROUP BY id
This is untested, but you should be able to perform a union on the two tables and then select from the result, grouping by id and adding up the scores:
SELECT id,SUM(score) FROM
(
SELECT id,score FROM t1
UNION ALL
SELECT id,score FROM t2
) joined
GROUP BY id
Perform a full outer join on the ID. Select on the ID and the sum of the two "score" columns after coalescing the values to 0.
SELECT id, SUM(score) FROM
(
SELECT id, score FROM #t1
UNION ALL
SELECT id, score FROM #t2
) AS Temp
GROUP BY id
select id, sum(score)
from (
select * from table1
union all
select * from table2
) tables
group by id
You need to create a union of those two tables; then you can easily group the results.
SELECT id, sum(score) FROM
(
SELECT id, score FROM t1
UNION ALL
SELECT id, score FROM t2
) as tmp
GROUP BY id;
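One thing to watch when combining the tables: plain UNION removes duplicate rows, so if both tables hold the same (id, score) pair, one copy is dropped before the SUM. The sketch below uses SQLite from Python as a stand-in and the rows from the question, plus one extra row (id 8, score 1) that I added to both tables purely to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, score INTEGER);
    CREATE TABLE t2 (id INTEGER, score INTEGER);
    -- Rows from the question, plus (8, 1) added to both tables for illustration.
    INSERT INTO t1 VALUES (3, 4), (6, 7), (8, 1);
    INSERT INTO t2 VALUES (3, 5), (5, 2), (8, 1);
""")

def combined_scores(set_operator):
    """Sum scores per id after combining t1 and t2 with the given set operator."""
    sql = f"""
        SELECT id, SUM(score) AS score FROM (
            SELECT id, score FROM t1
            {set_operator}
            SELECT id, score FROM t2
        ) t3
        GROUP BY id
        ORDER BY id
    """
    return conn.execute(sql).fetchall()

with_union_all = combined_scores("UNION ALL")  # keeps both copies of (8, 1)
with_union = combined_scores("UNION")          # collapses the two (8, 1) rows

print(with_union_all)
print(with_union)
```

For ids 3, 5 and 6 both versions agree with the expected result in the question; only the duplicated row is affected, which is why UNION ALL is the safer choice here.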