Apache Hive command - sql

I have this question: show the top 5 game disciplines for the countries that won more than 10 gold medals.
My code is:
select distinct t.discipline, m.team from teams t join medals m on (t.noc=m.team and m.numbergold>10) order by m.team;
Could you please help me change this code so that it shows the top 5 disciplines for each country?

Use GROUP BY together with a window function to get the required result.
On 'numbergold', use a window function such as dense_rank so that you can filter on the top 5.
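The suggestion above can be sketched as follows. This is only an illustration under assumptions: the question does not say how a discipline should be ranked, so the sketch ranks each country's disciplines by the number of rows they have in teams, and the sample rows are made up. It runs the SQL through Python's sqlite3 module (SQLite 3.25 or later for window functions); the same nested query shape should carry over to Hive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medals (team TEXT, numbergold INTEGER)")
conn.execute("CREATE TABLE teams (noc TEXT, discipline TEXT)")

# Made-up sample data, not real medal counts.
conn.executemany("INSERT INTO medals VALUES (?, ?)",
                 [("USA", 39), ("JPN", 27), ("FRA", 10)])
sample = {
    "USA": ["Swimming", "Athletics", "Basketball", "Golf", "Tennis", "Rowing", "Judo"],
    "JPN": ["Judo", "Karate", "Swimming"],
    "FRA": ["Fencing", "Handball"],
}
for noc, disciplines in sample.items():
    for i, discipline in enumerate(disciplines):
        # Earlier disciplines get more rows, so every count is distinct.
        copies = len(disciplines) - i
        conn.executemany("INSERT INTO teams VALUES (?, ?)",
                         [(noc, discipline)] * copies)

query = """
SELECT team, discipline, entries
FROM (
    SELECT team, discipline, entries,
           DENSE_RANK() OVER (PARTITION BY team ORDER BY entries DESC) AS rnk
    FROM (
        -- Assumed ranking metric: rows per discipline per country.
        SELECT m.team AS team, t.discipline AS discipline, COUNT(*) AS entries
        FROM teams t
        JOIN medals m ON t.noc = m.team
        WHERE m.numbergold > 10
        GROUP BY m.team, t.discipline
    ) counted
) ranked
WHERE rnk <= 5
ORDER BY team, entries DESC
"""
top5 = conn.execute(query).fetchall()
for row in top5:
    print(row)
```

With dense_rank, disciplines that tie on the ranking value share a rank, so a country can return more than 5 rows; row_number would cap it at exactly 5.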

How do I create a view on data

I have a table with 13 fields including:
Computer, Application
I need to have a similar table/view that lists COUNT(Application) along with Application listed only once for each Computer. All fields must exist with the addition of the new field.
I need something similar to:
Computer | Application | AppCount | ...
USD9090 | MS Outlook | 3
UOD0909 | MS Outlook | 5
UDL4563 | Skype | 4
I've tried grouping by Computer with a COUNT of Application:
SELECT TOP 5 Computer, ComputerID, Application FROM AppReliability WHERE EXISTS
(SELECT TOP 5 Count(Application) AS App, Computer
FROM AppReliability
WHERE Date >= DATEADD(day,-30,GETDATE())
GROUP BY Computer
ORDER BY App DESC)
I can't get the correct output.
Do you just want an aggregation query?
select Computer, Application, count(*) as AppCount
from AppReliability
group by Computer, Application;
Your question doesn't mention anything about dates or why you would be using top (5).
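As a rough check of the aggregation above, here is a small sketch that wraps it in a view, since the question asks for one. It uses Python's sqlite3 module rather than SQL Server, the view name AppCounts is hypothetical, and the rows are invented so that the counts match the example in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AppReliability (Computer TEXT, Application TEXT)")

# Invented rows: 3, 5 and 4 records for the three computers in the question.
rows = ([("USD9090", "MS Outlook")] * 3
        + [("UOD0909", "MS Outlook")] * 5
        + [("UDL4563", "Skype")] * 4)
conn.executemany("INSERT INTO AppReliability VALUES (?, ?)", rows)

# Hypothetical view name; the SELECT is the aggregation query from the answer.
conn.execute("""
CREATE VIEW AppCounts AS
SELECT Computer, Application, COUNT(*) AS AppCount
FROM AppReliability
GROUP BY Computer, Application
""")

app_counts = conn.execute(
    "SELECT Computer, Application, AppCount FROM AppCounts ORDER BY Computer"
).fetchall()
for row in app_counts:
    print(row)
```

Any further columns that should appear in the view would need to be added to both the SELECT list and the GROUP BY, or wrapped in an aggregate.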

SQL - Find the horses that have run at ONLY 1 specific track

I have some horse data in a table and am practicing SQL while following my hobby. I would like to find an elegant way to solve this problem. Right now I have this convoluted way of getting the answer but I know there HAS to be an easier way.
Description:
I'll make it really simple (Assume these 8 rows are the entire table). I have a table with 4 columns. HORSE_ID, NAME, TRACK, Date
A horse might run at one track or at many different tracks. The end goal is to find only the horses that have run/campaigned at one specific track. In this case, I want to see the horses that have run all their races at SA (Santa Anita).
HORSE_ID NAME TRACK DATE
1 JUSTIFY SA FEB-2018
2 JUSTIFY PIM MAY-2018
3 JUSTIFY BEL JUN-2018
4 KANTHAKA SA DEC-2017
5 KANTHAKA SA JAN-2018
7 THREE RULES GP JUL-2016
8 DABSTER SA JAN-2018
So if I ran this query with this data, the only horses I would expect to see are KANTHAKA and DABSTER, because they are the only horses that ran all their races at the Santa Anita track. If KANTHAKA ran at a different track next month, then the next time the query was run, only DABSTER would show up.
Does this make sense?
Try using GROUP BY with HAVING:
SELECT NAME
FROM yourTable
GROUP BY NAME
HAVING MIN(TRACK) = MAX(TRACK);
Writing the HAVING clause as above is preferable to writing HAVING COUNT(DISTINCT TRACK) = 1. The reason for this is that the above query can make use of an index on (NAME, TRACK).
If in addition you wish to restrict to a single track, then we can try:
SELECT NAME
FROM yourTable
GROUP BY NAME
HAVING MIN(TRACK) = MAX(TRACK) AND MIN(TRACK) = 'SA';
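To see both HAVING variants run against the sample rows from the question, here is a minimal sketch using Python's sqlite3 module. The table name horses and the column name RACE_DATE are assumptions made for the sketch; the rows are the ones listed in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE horses (HORSE_ID INTEGER, NAME TEXT, TRACK TEXT, RACE_DATE TEXT)"
)
conn.executemany("INSERT INTO horses VALUES (?, ?, ?, ?)", [
    (1, "JUSTIFY", "SA", "FEB-2018"),
    (2, "JUSTIFY", "PIM", "MAY-2018"),
    (3, "JUSTIFY", "BEL", "JUN-2018"),
    (4, "KANTHAKA", "SA", "DEC-2017"),
    (5, "KANTHAKA", "SA", "JAN-2018"),
    (7, "THREE RULES", "GP", "JUL-2016"),
    (8, "DABSTER", "SA", "JAN-2018"),
])

# Horses that have only ever run at a single track (any track).
one_track = [r[0] for r in conn.execute("""
    SELECT NAME FROM horses
    GROUP BY NAME
    HAVING MIN(TRACK) = MAX(TRACK)
    ORDER BY NAME
""")]

# Horses whose only track is SA.
only_sa = [r[0] for r in conn.execute("""
    SELECT NAME FROM horses
    GROUP BY NAME
    HAVING MIN(TRACK) = MAX(TRACK) AND MIN(TRACK) = 'SA'
    ORDER BY NAME
""")]

print(one_track)
print(only_sa)
```

JUSTIFY drops out of both results because its minimum and maximum TRACK values differ, and THREE RULES drops out of the second because its single track is GP.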
You can do a subquery requesting the count of distinct tracks to be 1 using GROUP BY and HAVING COUNT DISTINCT and then select those WHERE the track is 'SA':
SELECT NAME
FROM
(SELECT NAME, MIN(TRACK) as TRACK
FROM HORSES
GROUP BY 1
HAVING COUNT (DISTINCT TRACK) = 1) horses_one_race
WHERE
TRACK = 'SA'

Using Count and Group By in Power BI

I have a table that contains data about different benefit plans and users enrolled in one or more of those plans. So basically the table contains two columns representing the benefit plan counts and total users enrolled in those plans.
I need to create visualization in Power BI to represent the number of total users enrolled in 1 plan, 2 plans, 3 plans, ...etc.
I wrote the query in SQL to get the desired result, but I am not sure how to do the same in Power BI.
Below is my sql query:
SELECT S.PlanCount, COUNT(S.UserName) AS Participants
FROM (
SELECT A.Username, COUNT(*) AS PlanCount
FROM [dbo].[vw_BenefitsCount_Plan_Participants] AS A
GROUP BY A.username
)AS S
GROUP BY S.PlanCount
ORDER BY S.PlanCount
The query result is shown in the image below:
Here, the PlanCount column represents the number of different benefit plans that users are enrolled in. For example, the first row means that a total of 6008 members are enrolled in only 1 plan, row 2 shows that 3030 members are enrolled in 2 plans, and row 5 means that only 10 users are enrolled in 6 plans.
I am new to Power BI and trying to understand DAX functions but couldn't find a reasonable example that could help me create my visualization.
I found something similar here and here, but those examples seem to be geared towards a single count and group by.
Here is a simple example. I have a table of home owners who have homes in multiple cities.
In this table, Alex, Dave and Julie each have a home in 1 city (basically, these 3 people own just 1 home each). Similarly, Jim owns a total of 2 homes, and Bob and Pam each have 3 homes in total.
Now the output that I need is a table with total number of home owners that own 1 home, 2 homes and so on. So the resulting table in SQL is this.
Where NameCount is basically count of total home owners and Homes is the count of total homes these home owners have.
Please let me know if this helps.
Thanks.
If I understood correctly, you have a table like this:
BenefitPlan | User
1 | Max
1 | Joe
2 | Max
3 | Anna
If so, you can simply use a bar chart (for example) where the Axis is BenefitPlan and the Value is User. When you drag a column into the Value field, it is grouped automatically (like GROUP BY in SQL), and the default aggregation is a count.
Hope it helps.
Regards.
You can use DAX to create a summary table from your data table:
https://community.powerbi.com/t5/Desktop/Creating-a-summary-table-out-of-existing-table-assistance/td-p/431485
Once you have counted plans by customer you will then have a field that will enable you to visualize the # of customers with each count.
Mock-up of the code:
PlanSummary = SUMMARIZE('vw_BenefitsCount_Plan_Participants', [Username], "PlanCount", COUNT([PLAN_ID]))
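For reference, the two-step grouping that such a summary table has to reproduce (count rows per person, then count people per count) can be checked outside Power BI. Below is a minimal sketch in Python with sqlite3, using the home-owner example from the question; the table name HomeOwners and the city names are made up, since the original table was only shown as an image.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE HomeOwners (Name TEXT, City TEXT)")

# Made-up cities; only the number of homes per owner follows the question.
conn.executemany("INSERT INTO HomeOwners VALUES (?, ?)", [
    ("Alex", "Austin"), ("Dave", "Boston"), ("Julie", "Chicago"),
    ("Jim", "Austin"), ("Jim", "Denver"),
    ("Bob", "Austin"), ("Bob", "Boston"), ("Bob", "Denver"),
    ("Pam", "Boston"), ("Pam", "Chicago"), ("Pam", "Denver"),
])

# Inner query: homes per owner. Outer query: owners per home count.
home_summary = conn.execute("""
    SELECT S.Homes, COUNT(S.Name) AS NameCount
    FROM (
        SELECT Name, COUNT(*) AS Homes
        FROM HomeOwners
        GROUP BY Name
    ) AS S
    GROUP BY S.Homes
    ORDER BY S.Homes
""").fetchall()

for homes, name_count in home_summary:
    print(homes, name_count)
```

In Power BI terms, the inner query corresponds to the summary table and the outer grouping is what a visual does when the count column goes on the axis and the user column is counted as the value.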

SQL AVG function

What is the correct syntax to display the data from all records using GROUP (or DISTINCT) and AVG?
My data looks like this:
TABLE NAME: Pensje
COLUMNS:
ID, Company, Position, Salary
Example:
ID Company Position Salary
1 Atari Designer 24000
2 Atari Designer 20000
3 Atari Programmer 35000
4 Amiga Director 40000
I need to arrange data in this way (only records from 1 company need to be displayed)
Position a , average Salary from all the records with same Company and Position
Position b , average Salary from all the records with same Company and Position
Ex. Atari
Designer, 22000
Programmer, 35000
My SQL looks like this:
SELECT Position, AVG(Salary)
FROM Pensje
WHERE Company = %s
GROUP BY Position
ORDER BY Position ASC
In the above example, "Position" is displayed correctly but "Salary" is not displayed at all. After removing AVG(), the salary is displayed, but only for the first position found in the table.
Many thanks for taking your time to help me!
Posting an answer from MCP_infiltrator which helped:
In your WHERE clause you should be using WHERE Company LIKE '%s', not =. The way you have it written should produce the results you want, so I do not understand why it is not working properly for you. You may also want to alias your AVG() result, for example AVG(Salary) AS Average_Salary; otherwise the column will come back with no name.
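One way to check the query independently of the application code is to run it against the sample rows. The sketch below uses Python's sqlite3 module, which is an assumption, as the question does not say which database or driver is in use. sqlite3 uses ? as its parameter placeholder, whereas some other Python drivers use %s; in both cases the company name is passed as a separate parameter rather than pasted into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Pensje (ID INTEGER, Company TEXT, Position TEXT, Salary REAL)"
)
conn.executemany("INSERT INTO Pensje VALUES (?, ?, ?, ?)", [
    (1, "Atari", "Designer", 24000),
    (2, "Atari", "Designer", 20000),
    (3, "Atari", "Programmer", 35000),
    (4, "Amiga", "Director", 40000),
])

query = """
SELECT Position, AVG(Salary) AS Average_Salary
FROM Pensje
WHERE Company = ?
GROUP BY Position
ORDER BY Position ASC
"""
cur = conn.execute(query, ("Atari",))

# The alias gives the averaged column a predictable name to read it by.
columns = [d[0] for d in cur.description]
averages = cur.fetchall()

print(columns)
for position, average_salary in averages:
    print(position, average_salary)
```

If the application reads result columns by name, the missing alias alone could explain why the averaged salary appeared to be absent.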

Conditional Drill Down Powerview

I'm not sure if this is doable, but suppose my data set has this structure:
Department, Team, Service, Area
and I have a measure, let's call it #count, that counts all rows in this table.
When I create a Power View chart, I want to be able to drill down by Department, Team, Service and Area.
However, suppose Area has the following values:
Area 1 - 10 rows
Area 2 - 5 rows
Area 3 - 3 rows
Is it possible to stop the drill-down if the row count in a certain area is less than, say, 5?
Appreciate all advice.
Many thanks
Peddie