Retrieving Top 10 rows and sum all others in row 11 - sql

I have the following query that retrieves the number of users per country:
SELECT C.CountryID AS CountryID,
C.CountryName AS Country,
Count(FirstName) AS Origin
FROM Users AS U
INNER JOIN Country AS C ON C.CountryID = U.CountryOfOrgin
GROUP BY C.CountryName,
C.CountryID
What I need is a way to get the top 10 and then sum all other users in a single row. I know how to get the top 10, but I'm stuck on getting the remainder in a single row. Is there a simple way to do it?
For example, if the above query returns 17 records, the top ten are displayed and the sum of the users from the 7 remaining countries should appear on row 11. On that row, CountryID would be 0 and CountryName would be 'Others'.
Thanks for your help!

You did not specify how you are ranking the top 10, so I'm assuming the countries with the highest counts rank first:
With TopItems As
(
SELECT C.CountryID AS CountryID
, C.CountryName AS Country
, Count(FirstName) AS Origin
, ROW_NUMBER() OVER( ORDER BY Count(FirstName) DESC ) As Num
FROM Users AS U
JOIN Country AS C
ON C.CountryID = U.CountryOfOrgin
GROUP BY C.CountryName, C.CountryID
)
Select CountryId, Country, Origin
From TopItems
Where Num <= 10
Union ALL
Select 0, 'Others', Sum(Origin)
From TopItems
Where Num > 10

Something like this:
SELECT
-- show them
ROW_NUMBER() OVER (ORDER BY CASE WHEN countryID = 'Others' THEN 1 ELSE 0 END, SUM(n) DESC) AS nr,
countryID,
SUM(n)
FROM (
-- change name for some countries
SELECT
-- cast so both CASE branches are strings (assumes countryID is numeric)
CASE WHEN nr >= 11 THEN 'Others' ELSE CAST(countryID AS varchar(20)) END AS countryID,
n
-- select all countries
FROM (
SELECT
-- store number to recognize position
ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) AS nr,
countries.countryID,
COUNT(*) AS n
FROM
countries WITH (NOLOCK)
JOIN
users WITH (NOLOCK)
ON
users.countryID = countries.countryID
GROUP BY
countries.countryID
) AS x
) AS y
GROUP BY
countryID
ORDER BY
-- show Others as last one
CASE WHEN countryID = 'Others' THEN 1 ELSE 0 END,
SUM(n) DESC
works for me.

Related

Fetch the top nine rows and then get a tenth row with the total of everything else

I want to fetch nine rows of the count of experts in the country and the name of the country ordered in descending order by count of experts. For the tenth row I want to add a row that shows the total number of experts from all other countries.
Here is my code:
SELECT count(expert_id) as total_expert, cc.country_name
FROM expertsdb.ci_experts_master cem
INNER JOIN ci_city cct ON cct.city_id = cem.city_
INNER JOIN ci_country cc ON cc.country_id = cct.country_id
WHERE cem.city_ IS NOT NULL
group by cc.country_name
order by total_expert desc
limit 9
If I understand correctly, you want an "other" category after the first 9. You can use window functions:
SELECT (CASE WHEN seqnum <= 9 THEN country_name ELSE 'rest' END) as country_name,
SUM(total_expert)
FROM (SELECT cc.country_name, count(*) as total_expert,
ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) as seqnum
FROM expertsdb.ci_experts_master cem JOIN
ci_city cct
ON cct.city_id = cem.city_ JOIN
ci_country cc
ON cc.country_id = cct.country_id
WHERE cem.city_ IS NOT NULL
GROUP BY cc.country_name
) c
GROUP BY (CASE WHEN seqnum <= 9 THEN country_name ELSE 'rest' END)
ORDER BY MIN(seqnum);

Modify query to group by client identifier

I have the following query.
Base query
WITH CTE (clientid, dayZero)
AS
-- Define the CTE query.
(
SELECT
clientid,
DATEDIFF(
DAY,
MIN(calendar),
MIN(CASE
WHEN total = 0
THEN calendar
END)
) as dayZero
FROM (
SELECT
clientid,
CONVERT(datetime, convert(varchar(10), calendar)) calendar,
TOTAL
FROM STATS s1
) a
GROUP BY clientid
),
cteb as
-- Define the outer query referencing the CTE name.
(SELECT cte.*, c.company, v.Name, m.id as memberid
FROM CTE
JOIN client c
on c.id = cte.CLIENTID
join Domain v
on v.Id = c.domainID
join subscriber m
on m.ClientId = c.id
join activity a
on m.id = a.memberid
where c.id != 023
),
ctec as
(
select count(distinct memberid) as Number from cteb
group by clientid
)
select clientid, dayzero, company, name, Number from cteb , ctec
The output of this query is:
clientid  dayzero  company        name       Number
21        35       School Boards  Education  214
21        35       School Boards  Education  214
I want it to return only 1 row per client. Any ideas on how to modify this query?
Sub Query
select count(distinct memberid) as Number from cteb
group by clientid
When I only run the query until the above subquery and select like so -
select * from ctec
where clientid = 21
I get
clientid Number
21 214
22 423
This is what I would expect. But when I run the following select to get all the other columns I need, I start getting duplicates. The output makes sense because I am not grouping by clientid. But if I group by clientid, how do I get the other columns I need?
select clientid, dayzero, company, name, Number from cteb , ctec
UPDATE
When I run the below select
select clientid, dayzero, company, name, Number from cteb , ctec
group by clientid, dayzero, company, name, Number
I still get
clientid  dayzero  company        name       Number
21        35       School Boards  Education  214
21        35       School Boards  Education  215
I don't understand why I am getting different numbers in the Number column (214 and 215 in this case). But when I run it with the group by as shown below, I get the correct numbers.
select count(distinct memberid) as Number from cteb
group by clientid
select * from ctec
where clientid = 21
I get
clientid Number
21 2190
Neither 214 nor 215 is correct. The correct number is 2190, which I get when I group by clientid as shown above.
If you want to show unique rows based on a particular column, you can use ROW_NUMBER() as in the following query.
select * from
(
select clientid, dayzero, company, name, Number,
ROW_NUMBER() OVER(PARTITION BY clientid ORDER BY Number DESC) RN
from cteb , ctec
) t
where RN=1
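The changing values in Number (214, 215) are most likely a side effect of `from cteb , ctec` being a cross join: ctec as written does not select clientid, so every count gets paired with every client row. Here is a minimal sketch of an alternative, assuming ctec is changed to return clientid as well and is then joined on it. The inline sample rows are made up and only stand in for the real cteb.

```sql
-- Hypothetical sample rows standing in for the real cteb from the question.
WITH cteb (clientid, dayZero, company, name, memberid) AS (
    SELECT 21, 35, 'School Boards', 'Education', 1 UNION ALL
    SELECT 21, 35, 'School Boards', 'Education', 2 UNION ALL
    SELECT 22, 12, 'Acme', 'Retail', 3
),
ctec AS (
    -- Keep clientid so each count can be matched back to its own client.
    SELECT clientid, COUNT(DISTINCT memberid) AS Number
    FROM cteb
    GROUP BY clientid
)
-- Join on clientid instead of cross joining; DISTINCT collapses the
-- per-member rows of cteb into one row per client.
SELECT DISTINCT b.clientid, b.dayZero, b.company, b.name, c.Number
FROM cteb AS b
JOIN ctec AS c
  ON c.clientid = b.clientid;
```

With these sample rows the query returns one row per client: client 21 with Number 2 and client 22 with Number 1.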

SELECT a single field by ordered value

Consider the following two tables:
student_id  score  date
-------------------------
1           10     05-01-2013
2           100    05-15-2013
2           60     05-01-2012
2           95     05-14-2013
3           15     05-01-2011
3           40     05-01-2012
class_id  student_id
----------------------------
1         1
1         2
2         3
I want to get unique class_ids where the score is above a certain threshold for at least one student, ordered by the latest score.
So for instance, if I wanted to get a list of classes where the score was > 80, I would get class_id 1 as a result, since student 2's latest score was above 80.
How would I go about this in t-sql?
Are you asking for this?
SELECT DISTINCT
t2.[class_ID]
FROM
t1
JOIN t2
ON t2.[student_id] = t1.[student_id]
WHERE
t1.[score] > 80
Edit: based on your date requirement, you could use row_number() to get the result:
select c.class_id
from class_student c
inner join
(
select student_id,
score,
date,
row_number() over(partition by student_id order by date desc) rn
from student_score
) s
on c.student_id = s.student_id
where s.rn = 1
and s.score >80;
Or you can use a WHERE EXISTS:
select c.class_id
from class_student c
where exists (select 1
from student_score s
where c.student_id = s.student_id
and s.score > 80
and s.[date] = (select max(date)
from student_score s1
where s.student_id = s1.student_id));
select distinct(class_id) from table2 where student_id in
(select distinct(student_id) from table1 where score > thresholdScore)
This should do the trick:
SELECT DISTINCT
CS.Class_ID
FROM
dbo.ClassStudent CS
CROSS APPLY (
SELECT TOP 1 *
FROM dbo.StudentScore S
WHERE CS.Student_ID = S.Student_ID
ORDER BY S.Date DESC
) L
WHERE
L.Score > 80
;
And here's another way:
WITH LastScore AS (
SELECT TOP 1 WITH TIES *
FROM dbo.StudentScore
ORDER BY Row_Number() OVER (PARTITION BY Student_ID ORDER BY Date DESC)
)
SELECT DISTINCT
CS.Class_ID
FROM
dbo.ClassStudent CS
WHERE
EXISTS (
SELECT *
FROM LastScore L
WHERE
CS.Student_ID = L.Student_ID
AND L.Score > 80
)
;
Depending on the data and the indexes, these two queries could have very different performance characteristics. It is worth trying several to see if one stands out as superior to the others.
It seems like there could be some version of the query where the engine would stop looking as soon as it finds just one student with the requisite score, but I am not sure at this moment how to accomplish that.

Finding Count of Duplicate emails that has a different first and/or last name

Hi, I'm having trouble getting the right count for this problem. I'm trying to get a count of duplicate emails that have a different first name and/or a different last name.
(i.e
123#.com sam
123#.com ben
I need a count of that duplicate email)
I'm working with 2 tables. The email_address is in the mrtcustomer.customer_email table and the first and last name is in the mrtcustomer.customer_master table
My code
SELECT COUNT(*)
FROM
(SELECT e.customer_master_id, email_address, customer_first_name, customer_last_name,
ROW_NUMBER() OVER (PARTITION BY EMAIL_ADDRESS ORDER BY CUSTOMER_FIRST_NAME) RN
FROM mrtcustomer.customer_email e
JOIN mrtcustomer.customer_master t ON e.customer_master_id = t.customer_master_id
WHERE t.customer_first_name IS NOT NULL
AND t.customer_last_name IS NOT NULL
AND customer_FIRST_NAME != 'Unknown'
AND customer_LAST_NAME != 'Unknown'
GROUP BY e.customer_master_id, email_address, customer_first_name, customer_last_name
ORDER BY 1 DESC)
WHERE RN > 1
I'm guessing it's my WHERE clause that is wrong.
I would start with something like this (edited to reflect edits):
select email_address
, count( distinct customer_first_name ) f
, count( distinct customer_last_name ) l
from customer_email e, customer_master m
where e.customer_master_id = m.customer_master_id
group by email_address
Then, if either of the name counts is > 1, you have a problem, so wrap it like this:
select email_address from
(
select email_address
, count( distinct customer_first_name ) f
, count( distinct customer_last_name ) l
from customer_email e, customer_master m
where e.customer_master_id = m.customer_master_id
group by email_address
)
where f > 1 or l > 1
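If only the final number is needed, the same check can be moved into a HAVING clause and counted in one go. This is just a sketch under assumptions: the two inline tables are made-up sample rows shaped like customer_email and customer_master, the alias `d` and the column name `duplicate_emails` are mine, and the row constructors use SELECT without FROM, which some dialects (Oracle, for example) would need adjusted.

```sql
-- Hypothetical sample rows; the real tables live in the mrtcustomer schema.
WITH customer_email (customer_master_id, email_address) AS (
    SELECT 1, '123#.com' UNION ALL
    SELECT 2, '123#.com' UNION ALL
    SELECT 3, 'abc#.com'
),
customer_master (customer_master_id, customer_first_name, customer_last_name) AS (
    SELECT 1, 'sam', 'smith' UNION ALL
    SELECT 2, 'ben', 'smith' UNION ALL
    SELECT 3, 'ann', 'lee'
)
SELECT COUNT(*) AS duplicate_emails
FROM (
    -- One row per email that is attached to more than one distinct
    -- first name or more than one distinct last name.
    SELECT e.email_address
    FROM customer_email e
    JOIN customer_master m
      ON e.customer_master_id = m.customer_master_id
    GROUP BY e.email_address
    HAVING COUNT(DISTINCT m.customer_first_name) > 1
        OR COUNT(DISTINCT m.customer_last_name) > 1
) d;
```

With the sample rows above, only 123#.com qualifies (sam and ben), so the count is 1.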
Identify the distinct fname, lname, email records,
then group by email (keeping emails that have more than one record),
then do a count on that.
-- count
SELECT COUNT(DISTINCT email_address)
FROM
(
-- group by email , find where there is more than one distinct record for each email
SELECT email_address
FROM
(
-- get distinct Fname, Lname, Email combinations in derived table
SELECT customer_first_name , customer_last_name, email_address
FROM mrtcustomer.customer_email e
JOIN mrtcustomer.customer_master t ON e.customer_master_id = t.customer_master_id
WHERE t.customer_first_name IS NOT NULL
AND t.customer_last_name IS NOT NULL
AND customer_FIRST_NAME != 'Unknown'
AND customer_LAST_NAME != 'Unknown'
GROUP BY 1,2,3
) foo
GROUP BY 1
HAVING COUNT(*)>1
) bar

grouping and aggregates with subqueries

I have a query that is designed to find the number of people who went to a hospital more than once. What I have works, but is there a way to do it without the subquery?
SELECT count(*) as counts, hospitals.hospitalname
FROM Patient INNER JOIN
hospitals ON Patient.hospitalnpi = hospitals.npi
WHERE (hospitals.hospitalname = 'X')
group by patientid, hospitalname
having count(patient.patientid) >1
order by count(*) desc
This always returns the correct number of rows (30), but not the number 30 itself. If I remove the group by patientid, then I get the entire result set returned.
I solved this problem by doing
select COUNT(*),hospitalname
from
(
SELECT count(*) as counts,hospitals.hospitalname
FROM hospitals INNER JOIN
Patient ON hospitals.npi = Patient.hospitalnpi
group by patientid, hospitals.hospitalname
having count(patient.patientid) >1
) t
group by t.hospitalname
order by t.hospitalname desc
I feel that there has to be a more elegant solution than using subqueries all the time. How could this be improved?
sample data from first query
row # revisits
1 2
2 2
3 2
4 2
sample data from the second, working query
row# hosp. name revisitAggregate
1 x 30
2 y 15
3 z 5
Simple one-to-many relationship between patient and hospitals
It's super hacky, but here you are:
SELECT TOP 1
ROW_NUMBER() OVER (order by patient.patientid) as Count
FROM
Patient
INNER JOIN hospitals
ON Patient.hospitalnpi = hospitals.npi
WHERE
(hospitals.hospitalname = 'X')
GROUP BY
patientid,
hospitalname
HAVING
count(patient.patientid) >1
ORDER BY
Count desc
select distinct hospitalname, count(*) over (partition by hospitalname) from (
SELECT DISTINCT patientid, hospitalname, count(*) over (partition by patientid,
hospitals.hospitalname) as counter
FROM hospitals INNER JOIN
Patient ON hospitals.npi = Patient.hospitalnpi
WHERE (hospitals.hospitalname = 'X')
) Z
where counter > 1
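Another variant without a derived table, offered as a sketch rather than a definitive answer: window functions are evaluated after GROUP BY and HAVING, so a COUNT(*) OVER on the grouped query counts the surviving (patient, hospital) groups per hospital. The inline rows below are made up for illustration and assume one Patient row per visit; the aliases `p` and `h` are mine.

```sql
-- Hypothetical sample data: one Patient row per visit.
WITH Patient (patientid, hospitalnpi) AS (
    SELECT 1, 100 UNION ALL
    SELECT 1, 100 UNION ALL
    SELECT 2, 100 UNION ALL
    SELECT 2, 100 UNION ALL
    SELECT 3, 100 UNION ALL
    SELECT 4, 200 UNION ALL
    SELECT 4, 200
),
hospitals (npi, hospitalname) AS (
    SELECT 100, 'X' UNION ALL
    SELECT 200, 'Y'
)
SELECT DISTINCT
    h.hospitalname,
    -- Counts the groups left after HAVING, i.e. patients with more than one visit.
    COUNT(*) OVER (PARTITION BY h.hospitalname) AS revisitAggregate
FROM Patient AS p
JOIN hospitals AS h
  ON p.hospitalnpi = h.npi
GROUP BY p.patientid, h.hospitalname
HAVING COUNT(*) > 1;
```

With the sample rows this gives X with 2 (patients 1 and 2) and Y with 1 (patient 4). Whether it reads as more elegant than the subquery is a matter of taste; the DISTINCT is still doing the job of the outer GROUP BY.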