Top 10 Subquery in Access SQL - sql

SELECT TOP 10 [FINAL_FOR_DB].[Indemnity_Paid]/[FINAL_FOR_DB].[Claim_Count] AS Indemnity_Cost,
final_for_db.Claimant_Name,
final_for_db.Account_Name,
final_for_db.Claim_ID,
final_for_db.File_Date,
final_for_db.Resolution_Date,
final_for_db.Claim_Status,
final_for_db.State_Filed, final_for_db.Expense_Amount,
final_for_db.Claim_Count,
final_for_db.Indemnity_Paid AS [Total Indemnity]
FROM final_for_db
WHERE (((final_for_db.Account_Name)="Exxon"))
ORDER BY [FINAL_FOR_DB].[Indemnity_Paid]/[FINAL_FOR_DB].[Claim_Count] DESC;
This only gives me the top 10 entries for Exxon. Is there a way to get the top 10 entries for each account name, ordered from the highest indemnity cost to the lowest? I believe a subquery is needed. I would appreciate any help with this. Thanks.

Other RDBMSs support the RANK() and ROW_NUMBER() functions. Unfortunately, Access does not (to my knowledge). This should get you close to what you want. It does not handle duplicates well: two rows with the same indemnity cost get the same rank, possibly leaving you with the top 11 or so.
Select * From
(
Select *
, (
Select count(*)
From final_for_db as tbl2
where (tbl1.Indemnity_Paid/tbl1.Claim_Count) < (tbl2.Indemnity_Paid/tbl2.Claim_Count)
and tbl1.Account_Name= tbl2.Account_Name
) + 1 as rank from final_for_db tbl1
) x where x.Rank <= 10
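To see the correlated-count idea in isolation, here is a minimal sketch that runs it against SQLite from Python rather than Access. The claims table, its simplified columns, and TOP_N = 2 are made up for illustration; the indemnity cost is stored precomputed instead of being derived from Indemnity_Paid / Claim_Count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for final_for_db.
conn.execute(
    "CREATE TABLE claims (account_name TEXT, claim_id INTEGER, indemnity_cost REAL)"
)
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?)",
    [
        ("Exxon", 1, 500.0), ("Exxon", 2, 300.0), ("Exxon", 3, 100.0),
        ("Shell", 4, 900.0), ("Shell", 5, 200.0), ("Shell", 6, 50.0),
    ],
)

TOP_N = 2  # stands in for the 10 in the question

# Rank = 1 + number of rows in the same account with a strictly higher cost.
query = """
SELECT t1.account_name, t1.claim_id, t1.indemnity_cost
FROM claims AS t1
WHERE (SELECT COUNT(*)
       FROM claims AS t2
       WHERE t2.account_name = t1.account_name
         AND t2.indemnity_cost > t1.indemnity_cost) + 1 <= ?
ORDER BY t1.account_name, t1.indemnity_cost DESC
"""
top_per_account = conn.execute(query, (TOP_N,)).fetchall()
print(top_per_account)
```

Rows that tie on cost share a rank, so an account can still return more than TOP_N rows, as noted in the answer.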

Related

Display percentage of registered members that have not rated a Movie

I have the following three tables. See full db<>fiddle here
members
member_id | first_name | last_name
==================================
1         | Roby       | Dauncey
2         | Isa        | Garfoot
3         | Sullivan   | Carletto
4         | Jacintha   | Beacock
5         | Mikey      | Keat
6         | Cindy      | Stenett
7         | Alexina    | Deary
8         | Perkin     | Bachmann
10        | Suzann     | Genery
39        | Horatius   | Baukham
41        | Bendicty   | Willisch
movies
movie_id | movie_name                    | movie_genre
=======================================================
10       | The Bloody Olive              | Comedy,Crime,Film-Noir
56       | Attack of The Killer Tomatoes | (no genres listed)
ratings
rating_id | movie_id | member_id | rating
=========================================
19        | 10       | 39        | 2
10        | 56       | 41        | 1
Now the question is:
Out of the total number registered members, how many have actually left a movie rating? Display the result as a percentage
This is what I have tried:
SELECT CONVERT(VARCHAR,(CONVERT(FLOAT,COUNT([Number of Members])) / CONVERT(FLOAT,COUNT(*)) * 100)) + '%'
AS 'Members Percentage'
FROM (
SELECT COUNT(*) AS 'Number of Members'
FROM members
WHERE member_id IN (
SELECT member_id FROM members
EXCEPT
SELECT member_id FROM ratings
)
) MembersNORatings
And my query result is displayed as 100%, which is obviously wrong.
Members Percentage
100%
What I figured out is that, in the first line of the query, the COUNT(*) value is treated as equivalent to the value of the alias [Number of Members]. That's why it shows 100%.
I thought of replacing COUNT(*) with SELECT COUNT(*) FROM members, but before I could even run the query, it showed an error saying
Incorrect Syntax near SELECT.
What change do I need to make in my existing query in order to get the proper percentage result?
You can use a cross apply to determine using a sub-query whether a given member has left a rating or not (because you can't use a sub-query in an aggregation). Then divide (ensuring you use decimal division, not integer) to get the percentage.
select
count(*) TotalMembers
, sum(r.HasRating) TotalWithRatings
, convert(decimal(9,2), 100 * sum(r.HasRating) / (count(*) * 1.0)) PercentageWithRatings
from #members m
cross apply (
select case when exists (select 1 from #ratings r where r.member_id = m.member_id) then 1 else 0 end
) r (HasRating);
Returns:
TotalMembers | TotalWithRatings | PercentageWithRatings
=======================================================
50           | 2                | 4.00
As mentioned in the comments, there are several ways to approach this. For example:
Option #1 - OUTER JOIN + DISTINCT
SELECT TotalMembers
, TotalMembersWithRatings
, CAST( 100.0 * TotalMembersWithRatings
/ NULLIF(TotalMembers, 0 )
AS DECIMAL(10,2)) AS MemberPercentage
FROM (
SELECT COUNT(DISTINCT m.member_id) AS TotalMembers
, COUNT(DISTINCT r.member_id) AS TotalMembersWithRatings
FROM members m LEFT JOIN ratings r ON r.member_id = m.member_id
) t
Option #2 - CTE + ROW_NUMBER()
WITH memberRatings AS (
SELECT member_id, ROW_NUMBER() OVER(
PARTITION BY member_id
ORDER BY member_id
) AS RowNum
FROM ratings
)
SELECT COUNT(m.member_id) AS TotalMembers
, COUNT(mr.member_id) AS TotalWithRatings
, CAST( 100.0 * COUNT(mr.member_id)
/ NULLIF(COUNT(m.member_id), 0 )
AS DECIMAL(10,2)) AS MemberPercentage
FROM members m LEFT JOIN memberRatings mr ON mr.member_id = m.member_id
AND mr.RowNum = 1
Option #3 - CROSS APPLY
SELECT
COUNT(*) TotalMembers
, SUM(r.HasRating) TotalWithRatings
, CONVERT(decimal(9,2), 100 * sum(r.HasRating) / (count(*) * 1.0)) PercentageWithRatings
FROM members m
CROSS APPLY (
SELECT CASE WHEN exists (select 1 from ratings r where r.member_id = m.member_id) THEN 1
ELSE 0
END
) r (HasRating);
Execution Plans - Take #1
There's a LOT more to analyzing execution plans than just comparing a single number. However, high level plans do provide some useful indicators.
With the small data samples provided, the plans suggest options #2 (CTE) and #3 (APPLY) are likely to be the most performant (19%), and option #1 (OUTER JOIN + DISTINCT) the least (63%), likely due to the COUNT(DISTINCT), which can often be slower than the alternatives.
Original Sample Size:
TableName | TotalRows
=====================
movies    | 50
members   | 50
ratings   | 50
Execution Plans - Take #2
However, populate the tables with more than a few sample rows of data and the same rough comparison produces a different result. Option #2 (CTE) still seems likely to be the least expensive query (9%), but option #3 (APPLY) is now the most expensive (76%). The majority of that cost is the index spool, which is a result of how APPLY operates.
New Sample Size:
TableName | TotalRows
=====================
movies    | 4105
members   | 29941
ratings   | 14866
New Execution Plans
With the increased amount of data, STATISTICS IO shows that option #2 has far fewer logical reads and scans, and that option #3 (APPLY) has the most. Option #1 appears to have a lower cost overall (15%), yet it still has a much higher number of logical reads. (Add a non-clustered index on member_id and movie_id and the numbers, while similar, change once again.) So don't just look at a single number.
New Statistics IO
While overall, option #2 (CTE) would seem likely to be most efficient, there are a lot of factors involved (indexes, data volume, statistics, version, etc), so you should examine the actual execution plans in your own environment.
As with most things, the answer as to which is best is: it depends.
Late to the party, but you don't need to join the tables if you only want to know how many members made a rating, not who.
What you need is
count entries in members table
count (distinct) members in ratings
get quota of 'rating' members (rating members divided by total members)
to get non-rating members, subtract the quota from 1.0
multiply by 100 to get the percent value
This is how you could do the calculation step by step using CTEs:
with count_members as (
select count(member_id) as member_count from members
), count_raters as (
select count(distinct member_id) as rater_count from ratings
), convert_both as (
select top 1
cast(m.member_count as decimal(10,2)) as member_count,
cast(r.rater_count as decimal(10,2)) as rater_count
from count_members as m cross join count_raters as r
), calculate_quota as (
select (rater_count / member_count) as quota from convert_both
), invert_quota as (
select (1.0 - quota) as quota from calculate_quota
)
select (quota * 100) as percentage from invert_quota;
Alternatively, that's how you could roll it all into one:
select (
(1.0 - (
cast((select count(distinct member_id) from ratings) as decimal(10,2))
/
cast((select count(member_id) from members) as decimal(10,2))
) ) * 100
) as percentage;
dbfiddle here
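For a quick sanity check of the arithmetic, here is a small sketch that runs the "count raters, divide by count of members" idea on SQLite from Python. It loads only the eleven sample members and two ratings listed in the question (not the full fiddle data), and it assumes every member_id in ratings also exists in members and that members is not empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (member_id INTEGER PRIMARY KEY);
CREATE TABLE ratings (rating_id INTEGER PRIMARY KEY, movie_id INTEGER,
                      member_id INTEGER, rating INTEGER);
INSERT INTO members (member_id)
    VALUES (1),(2),(3),(4),(5),(6),(7),(8),(10),(39),(41);
INSERT INTO ratings VALUES (19, 10, 39, 2), (10, 56, 41, 1);
""")

# 100.0 forces floating-point division instead of integer division.
query = """
SELECT
    (SELECT COUNT(*) FROM members) AS total_members,
    (SELECT COUNT(DISTINCT member_id) FROM ratings) AS raters,
    100.0 * (SELECT COUNT(DISTINCT member_id) FROM ratings)
          / (SELECT COUNT(*) FROM members) AS pct_rated
"""
total_members, raters, pct_rated = conn.execute(query).fetchone()
pct_not_rated = 100.0 - pct_rated
print(f"{pct_rated:.2f}% rated, {pct_not_rated:.2f}% not rated")
```

With this sample, 2 of 11 members have left a rating, so roughly 18.18% have rated and 81.82% have not.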

How do I get the top 10 results of a query?

I have a postgresql query like this:
with r as (
select
1 as reason_type_id,
rarreason as reason_id,
count(*) over() count_all
from
workorderlines
where
rarreason != 0
and finalinsdate >= '2012-12-01'
)
select
r.reason_id,
rt.desc,
count(r.reason_id) as num,
round((count(r.reason_id)::float / (select count(*) as total from r) * 100.0)::numeric, 2) as pct
from r
left outer join
rtreasons as rt
on
r.reason_id = rt.rtreason
and r.reason_type_id = rt.rtreasontype
group by
r.reason_id,
rt.desc
order by r.reason_id asc
This returns a table of results with 4 columns: the reason id, the description associated with that reason id, the number of entries having that reason id, and the percent of the total that number represents.
This table looks like this:
What I would like to do is only display the top 10 results based off the total number of entries having a reason id. However, whatever is leftover, I would like to compile into another row with a description called "Other". How would I do this?
with r2 as (
...everything before the select list...
dense_rank() over(order by pct desc) cause_rank
...the rest of your query...
)
select * from r2 where cause_rank < 11
union
select
NULL as reason_id,
'Other' as desc,
sum(r2.num) over() as num,
sum(r2.pct) over() as pct,
11 as cause_rank
from r2
where cause_rank >= 11
As said above, use LIMIT to get the top rows; to skip them and get the rest, use OFFSET.
Not sure about PostgreSQL, but SELECT TOP 10 ... should do the trick if you sort correctly (PostgreSQL itself uses LIMIT 10 rather than TOP).
As for the second part: you might use a RIGHT JOIN for this. Join the top 10 result with the whole table and keep only the records that do not appear on the left side. If you calculate the sum of those, you should get your "sum of the rest" result.
I assume that vw_my_top_10 is the view showing you the top 10 records. vw_all_records shows all records (including the top 10).
Like this:
SELECT SUM(a_field)
FROM vw_my_top_10
RIGHT JOIN vw_all_records
ON (vw_my_top_10.Key = vw_all_records.Key)
WHERE vw_my_top_10.Key IS NULL
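A rough way to try this anti-join is the sketch below, which runs on SQLite from Python. The all_records table, the my_top view, the rec_key column, and the top 2 cut-off are all invented for the example, and the join is written as a LEFT JOIN starting from the full table, which is the same thing as the RIGHT JOIN above with the sides swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE all_records (rec_key TEXT PRIMARY KEY, a_field INTEGER);
INSERT INTO all_records
    VALUES ('a', 50), ('b', 40), ('c', 30), ('d', 20), ('e', 10);
-- Hypothetical "top" view; top 2 instead of top 10 to keep the data small.
CREATE VIEW my_top AS
    SELECT rec_key, a_field FROM all_records ORDER BY a_field DESC LIMIT 2;
""")

# Anti-join: keep rows of all_records with no match in my_top, then sum them.
rest_total = conn.execute("""
    SELECT SUM(a.a_field)
    FROM all_records AS a
    LEFT JOIN my_top AS t ON t.rec_key = a.rec_key
    WHERE t.rec_key IS NULL
""").fetchone()[0]
print(rest_total)  # 60
```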

select top 1 * returns different recordset each time

In my application I use a SELECT TOP 12 * clause to select the top 12 records from the database and show them to the user. In another case I have to show the same results one by one, so I use a SELECT TOP 1 * clause; the rest of the query is the same. I use the SQL row_number() function to select the items one by one, in sequence.
The problem is that SELECT TOP 1 * doesn't return the same row that comes first in SELECT TOP 12 *. Also, the result set of SELECT TOP 12 * changes each time I execute the query.
Can anybody explain why the results of SELECT TOP 12 * and SELECT TOP 1 * are not the same?
FYI: here is my sql
select distinct top 1 * from(
select row_number() over ( ORDER BY Ratings desc ) as Row, * from(
SELECT vw.IsHide, vw.UpdateDate, vw.UserID, vw.UploadPath, vw.MediaUploadID, vw.Ratings, vw.Caption, vw.UserName, vw.BirthYear, vw.BirthDay, vw.BirthMonth, vw.Gender, vw.CityProvince, vw.Approved
FROM VW_Media as vw ,Users as u WITH(NOLOCk)
WHERE vw.IsHide='false' and
GenderNVID=5 and
vw.UserID=u.UserID and
vw.UserID not in(205092) and
vw.UploadTypeNVID=1106 and
vw.IsDeleted='false' and
vw.Approved = 1 and
u.HideProfile=0 and
u.StatusNVID=126 and
vw.UserID not in(Select BlockedToUserID from BlockList WITH(NOLOCk) where UserID=205092) a) totalres where row >0
Thanks in Advance
Sachin
When you use SELECT TOP, you must also use an ORDER BY clause to avoid getting different results every time.
For performance reasons, the database is free to return the records in any order it likes if you don't specify any ordering.
So, you always have to specify in which order you want the records, if you want them in any specific order.
Up to some version of SQL Server (7 IIRC) the natural order of the table was preserved in the result if you didn't specify any ordering, but this feature was removed in later versions.
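One more point that may matter here: ordering by Ratings alone still leaves tied rows in an undefined order, so a unique column is usually added as a tie-breaker. The sketch below shows the idea on SQLite from Python, where LIMIT plays the role of TOP; the media table and its columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media (media_id INTEGER PRIMARY KEY, ratings INTEGER)")
conn.executemany(
    "INSERT INTO media VALUES (?, ?)",
    [(1, 5), (2, 5), (3, 5), (4, 4), (5, 3)],
)

# ratings alone does not define a total order (three rows tie at 5),
# so the unique media_id is added as a tie-breaker.
ORDERED = "SELECT media_id FROM media ORDER BY ratings DESC, media_id ASC LIMIT ?"

top_3 = [row[0] for row in conn.execute(ORDERED, (3,))]
top_1 = [row[0] for row in conn.execute(ORDERED, (1,))]
print(top_3, top_1)
```

With a total order like this, the single row from the "top 1" query is always the first row of the "top 3" query.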

SQL to produce Top 10 and Other

Imagine I have a table showing the sales of Acme Widgets, and where they were sold. It's fairly easy to produce a report grouping sales by country. It's fairly easy to find the top 10. But what I'd like is to show the top 10, and then have a final row saying Other. E.g.,
Ctry | Sales
=============
GB | 100
US | 80
ES | 60
...
IT | 10
Other | 50
I've been searching for ages but can't seem to find any help which takes me beyond the standard top 10.
TIA
I tried some of the other solutions here, however they seem to be either slightly off, or the ordering wasn't quite right.
My attempt at a Microsoft SQL Server solution appears to work correctly:
SELECT Ctry, Sales FROM
(
SELECT TOP 2
Ctry,
SUM(Sales) AS Sales
FROM
Table1
GROUP BY
Ctry
ORDER BY
Sales DESC
) AS Q1
UNION ALL
SELECT
'Other' AS Ctry,
SUM(Sales) AS Sales
FROM
Table1
WHERE
Ctry NOT IN (SELECT TOP 2
Ctry
FROM
Table1
GROUP BY
Ctry
ORDER BY
SUM(Sales) DESC)
Note that in my example, I'm only using TOP 2 rather than TOP 10. This is simply due to my test data being rather more limited. You can easily substitute the 2 for a 10 in your own data.
Here's the SQL Script to create the table:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[Table1](
[Ctry] [varchar](50) NOT NULL,
[Sales] [float] NOT NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
And my data looks like this:
GB 10
GB 21.2
GB 34
GB 16.75
US 10
US 11
US 56.43
FR 18.54
FR 98.58
WE 44.33
WE 11.54
WE 89.21
KR 10
PO 10
DE 10
Note that the query result is correctly ordered by the Sales value aggregate and not the alphabetic country code, and that the "Other" category is always last, even if its Sales value aggregate would ordinarily push it to the top of the list.
I'm not saying this is the best (read: most optimal) solution, however, for the dataset that I provided it seems to work pretty well.
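If you want to experiment with the same "top N plus Other" shape outside SQL Server, here is a sketch that runs on SQLite from Python with the test data above. It is not the T-SQL from this answer: LIMIT replaces TOP, and the is_other helper column and the totals and top_n names are my own additions to keep "Other" pinned to the last row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Ctry TEXT NOT NULL, Sales REAL NOT NULL)")
conn.executemany("INSERT INTO Table1 VALUES (?, ?)", [
    ("GB", 10), ("GB", 21.2), ("GB", 34), ("GB", 16.75),
    ("US", 10), ("US", 11), ("US", 56.43),
    ("FR", 18.54), ("FR", 98.58),
    ("WE", 44.33), ("WE", 11.54), ("WE", 89.21),
    ("KR", 10), ("PO", 10), ("DE", 10),
])

query = """
WITH totals AS (
    SELECT Ctry, SUM(Sales) AS Sales FROM Table1 GROUP BY Ctry
),
top_n AS (
    -- Ctry is a tie-breaker so the cut-off is deterministic.
    SELECT Ctry, Sales FROM totals ORDER BY Sales DESC, Ctry LIMIT :n
)
SELECT Ctry, Sales FROM (
    SELECT Ctry, Sales, 0 AS is_other FROM top_n
    UNION ALL
    SELECT 'Other', SUM(Sales), 1
    FROM totals
    WHERE Ctry NOT IN (SELECT Ctry FROM top_n)
)
ORDER BY is_other, Sales DESC
"""
result = [(ctry, round(sales, 2)) for ctry, sales in conn.execute(query, {"n": 2})]
print(result)
```

Note that if there are n or fewer countries, the "Other" row comes back with a NULL total, so a real report would need to filter that case out.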
SELECT Ctry, sum(Sales) Sales
FROM (SELECT COALESCE(T2.Ctry, 'OTHER') Ctry, T1.Sales
FROM (SELECT Ctry, sum(Sales) Sales
FROM Table1
GROUP BY Ctry) T1
LEFT JOIN
(SELECT TOP 10 Ctry, sum(sales) Sales
FROM Table1
GROUP BY Ctry
ORDER BY sum(sales) DESC) T2
on T1.Ctry = T2.Ctry
) T
GROUP BY Ctry
The pure SQL solutions to this problem make multiple passes over the individual records. The following solution queries the data only once and uses a SQL ranking function, ROW_NUMBER(), to determine whether a result belongs in the "Other" category. The ROW_NUMBER() function has been available in SQL Server since SQL Server 2005. In my database, this seems to have resulted in a more efficient query. Please note that the "Other" row will appear above some rows if the total of the "Other" sales exceeds theirs. If this is not desired, some adjustments would need to be made to this query:
SELECT CASE WHEN RowNumber > 10 THEN 'Other' ELSE Ctry END AS Ctry,
SUM(Sales) as Sales FROM
(
SELECT Ctry, SUM(Sales) as Sales,
ROW_NUMBER() OVER(ORDER BY SUM(Sales) DESC) AS RowNumber
FROM Table1 GROUP BY Ctry
) as AggregateQuery
GROUP BY CASE WHEN RowNumber > 10 THEN 'Other' ELSE Ctry END
ORDER BY SUM(Sales) DESC
Using a real analytics SQL engine, such as Apache Spark, you can use a Common Table Expression (WITH) to do this:
with t as (
select rank() over (order by sales desc) as r, sales,city
from DB
order by sales desc
)
select sales, city, r
from t where r <= 10
union
select sum(sales) as sales, "Other" as city, 11 as r
from t where r > 10
In pseudo SQL:
select top 10 order by sales
UNION
select 'Other',SUM(sales) where Ctry not in (select top 10 like above)
Union the top ten with an outer join of the top ten with the table itself to aggregate the rest.
I don't have access to SQL here, but I'll hazard a guess (assuming table1 already holds one row per country):
select Ctry, sales from (select top (10) Ctry, sales from table1 order by sales desc) as top10
union all
select 'other', sum(table1.sales)
from table1
left outer join (select top (10) Ctry, sales from table1 order by sales desc) as table2
on table1.Ctry = table2.Ctry
where table2.Ctry is null
Of course if this is a rapidly changing top(10) then you either lock or maintain a copy of the top(10) for the duration of the query.
Have in mind that depending on your use (and database volume / restrictions) you can achieve the same results using application code (python, node, C#, java etc). Sure it will depend on your use-case but hey, it's possible.
I ended up doing this in C# for instance:
// Requires: using System.Linq; using System.Collections.Generic;
// Mock-up class that has a category and its volume
class YourModel { public string category; public double volume; }
// wholeList is assumed to be sorted by volume, descending
List<YourModel> groupedList = wholeList.Take (5).ToList ();
groupedList.Add (new YourModel()
{
category = "Others",
volume = wholeList.Skip (5).Sum (t => t.volume)
});
Disclaimer
I understand that this is a "SQL Only" tagged question, but there might be other people like me out there who can make use of the application layer instead of relying only on SQL to make it happen. I am just trying to show people other ways of doing the same thing that might be helpful. Even if this gets downvoted to oblivion, I know that someone will be happy to read this because they were taught to use each tool to its best, and to think "outside the box".
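The same application-layer idea, sketched in Python for anyone not on C#. The function name top_n_with_others and the sample totals are made up for illustration; the input is assumed to be one (category, volume) pair per category, already aggregated.

```python
from typing import List, Tuple


def top_n_with_others(totals: List[Tuple[str, float]], n: int) -> List[Tuple[str, float]]:
    """Return the n largest (category, volume) pairs plus one 'Others' row."""
    ranked = sorted(totals, key=lambda item: item[1], reverse=True)
    head, tail = ranked[:n], ranked[n:]
    if tail:  # only add the bucket when something is left over
        head.append(("Others", sum(volume for _, volume in tail)))
    return head


grouped = top_n_with_others(
    [("GB", 100), ("US", 80), ("ES", 60), ("IT", 10), ("FR", 5)], 3
)
print(grouped)
```

Because "Others" is appended after sorting, it stays at the end even when its total is larger than some of the top rows.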

SQL query to get the top "n" scores out of a list

I'd like to find the different ways to solve a real-life problem I had: imagine you have a contest, or a game, during which the users collect points. You have to build a query to show the list of users with the best "n" scores.
I'm making an example to clarify. Let's say that this is the Users table, with the points earned:
UserId - Points
1 - 100
2 - 75
3 - 50
4 - 50
5 - 50
6 - 25
If I want the top 3 scores, the result will be:
UserId - Points
1 - 100
2 - 75
3 - 50
4 - 50
5 - 50
This can be implemented in a view or a stored procedure, as you prefer. My target DB is SQL Server. I actually solved this, but I think there are different ways to obtain the result, possibly faster or more efficient than mine.
Untested, but should work:
select * from users where points in
(select distinct top 3 points from users order by points desc)
Here's one that works - I don't know if it's more efficient, and it's SQL Server 2005+
with scores as (
select 1 userid, 100 points
union select 2, 75
union select 3, 50
union select 4, 50
union select 5, 50
union select 6, 25
),
results as (
select userid, points, RANK() over (order by points desc) as ranking
from scores
)
select userid, points, ranking
from results
where ranking <= 3
Obviously the first "with" is to set up the values, so you can test the second with, and final select work - you could start at "with results as..." if you were querying against an existing table.
How about:
select top 3 with ties points
from scores
order by points desc
Not sure if "with ties" works on anything other than SQL Server.
On SQL Server 2005 and up, you can pass the "top" number as an int parameter:
select top (@n) with ties points
from scores
order by points desc
Actually a modification to the WHERE IN, utilizing an INNER JOIN will be much faster.
SELECT
userid, points
FROM users u
INNER JOIN
(
SELECT DISTINCT TOP N
points
FROM users
ORDER BY points DESC
) AS p ON p.points = u.points
@bosnic, I don't think that will work as requested. I'm not that familiar with MS SQL, but I would expect it to return only 3 rows and ignore the fact that 3 users are tied for 3rd place.
Something like this should work:
select userid, points
from scores
where points in (select distinct top 3 points
from scores
order by points desc)
order by points desc
@Rob#37760:
select top N points from users order by points desc
This query will only select 3 rows if N is 3, see the question. "Top 3" should return 5 rows.
@Espo, thanks for the reality check. I added the sub-select to correct for that.
I think the easiest response is to:
select userid, points from users
where points in (select distinct top N points from users order by points desc)
If you want to put that in a stored proc which takes N as a parameter, then you'll either have to build the SQL in a variable and then execute it, or do the row count trick:
declare @SQL nvarchar(2000)
set @SQL = 'select userID, points from users '
set @SQL = @SQL + ' where points in (select distinct top ' + cast(@N as nvarchar(10))
set @SQL = @SQL + ' points from users order by points desc)'
execute (@SQL)
or
-- DENSE_RANK rather than ROW_NUMBER, so that tied scores share a rank
SELECT UserID, Points
FROM (SELECT DENSE_RANK() OVER (ORDER BY points DESC)
AS Row, UserID, Points FROM Users)
AS usersWithPoints
WHERE Row between 1 and @N
Both examples assume SQL Server and haven't been tested.
@Matt Hamilton
Your answer works with the example above but would not work if the data set was 100, 75, 75, 50, 50 (where it would return only 3 rows). TOP WITH TIES only includes the ties of the last row returned...
Crucible got it (assuming SQL 2005 is an option).
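To check the "top N distinct scores, with all ties" reading against both data sets mentioned in this thread, here is a small sketch on SQLite from Python. It uses the IN (SELECT DISTINCT ...) form with LIMIT instead of TOP; the top_scores helper is made up for the example, and userid values are simply auto-assigned in insertion order.

```python
import sqlite3


def top_scores(points, n):
    """Return all rows whose score is among the n highest distinct scores."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (userid INTEGER PRIMARY KEY, points INTEGER)")
    conn.executemany("INSERT INTO users (points) VALUES (?)", [(p,) for p in points])
    rows = conn.execute("""
        SELECT userid, points FROM users
        WHERE points IN (SELECT DISTINCT points FROM users
                         ORDER BY points DESC LIMIT ?)
        ORDER BY points DESC, userid
    """, (n,)).fetchall()
    conn.close()
    return rows


question_data = top_scores([100, 75, 50, 50, 50, 25], 3)
tied_second = top_scores([100, 75, 75, 50, 50], 3)
print(len(question_data), len(tied_second))  # 5 5
```

Both calls return five rows, including the 100, 75, 75, 50, 50 case where TOP 3 WITH TIES would stop after three.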
Hey, I found all the other answers a bit long and inefficient.
My answer would be:
select * from users order by points desc limit 0,5
This returns the first 5 rows by points (MySQL LIMIT syntax; SQL Server uses TOP instead).
Try this
select top N points from users order by points desc