I was recently asked the SQL query question below. Can someone help me with it? I have a table with three columns: Home Team, Away Team, and Winner Team, as shown below.
H_T A_T W_T
AUS IND IND
ENG AUS ENG
IND AUS AUS
AUS ENG AUS
ENG IND IND
IND ENG IND
The data above needs to be turned into a SQL report with the following attributes:
Team Name, Total Matches Played, Win Count, Draw Count, Loss Count, Points.
Points are calculated with one formula per result type (win/draw/loss):
Win = Win Count * 3
Draw = Draw Count * 1
Loss = Loss Count * 0
Points is the sum of these three values.
Thanks in advance
We need a little more information about what defines a draw. This answer assumes the w_t column contains the value DRAW instead of a team name.
Either way, you can use conditional aggregation to get your desired results. Normally you would have a teams table and join to it, but you can build the list of teams with a union in a subquery:
select t.*,
       (win_count * 3) + (draw_count) as Points
from (
    select t.team,
           count(*) Total_Matches_Played,
           count(case when y.w_t = t.team then 1 end) Win_Count,
           -- a draw is not a loss, so exclude it here
           count(case when y.w_t <> t.team and y.w_t <> 'DRAW' then 1 end) Loss_Count,
           count(case when y.w_t = 'DRAW' then 1 end) Draw_Count
    from (
        -- build the list of teams from both columns
        select h_t as team from yourtable
        union select a_t from yourtable
    ) t join yourtable y on t.team in (y.a_t, y.h_t)
    group by t.team
) t
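As a cross-check, here is a minimal sketch that runs the same kind of conditional aggregation against the sample data in an in-memory SQLite database through Python's sqlite3 module. The table name matches is hypothetical, the 'DRAW' marker in w_t is the same assumption the answer makes, and the UNION ALL unpivot (one row per team per match) is a swapped-in alternative to joining on t.team in (y.a_t, y.h_t), not the answer's exact query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; columns follow the question (home, away, winner).
conn.execute("CREATE TABLE matches (h_t TEXT, a_t TEXT, w_t TEXT)")
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?)",
    [
        ("AUS", "IND", "IND"),
        ("ENG", "AUS", "ENG"),
        ("IND", "AUS", "AUS"),
        ("AUS", "ENG", "AUS"),
        ("ENG", "IND", "IND"),
        ("IND", "ENG", "IND"),
    ],
)

# Each match contributes one row for the home team and one for the away team;
# the CASE expressions then classify that row as a win, draw, or loss.
# Assumption: a drawn match stores the literal 'DRAW' in w_t.
POINTS_SQL = """
SELECT team,
       COUNT(*) AS played,
       SUM(CASE WHEN w_t = team THEN 1 ELSE 0 END) AS wins,
       SUM(CASE WHEN w_t = 'DRAW' THEN 1 ELSE 0 END) AS draws,
       SUM(CASE WHEN w_t <> team AND w_t <> 'DRAW' THEN 1 ELSE 0 END) AS losses,
       SUM(CASE WHEN w_t = team THEN 3
                WHEN w_t = 'DRAW' THEN 1
                ELSE 0 END) AS points
FROM (
    SELECT h_t AS team, w_t FROM matches
    UNION ALL
    SELECT a_t AS team, w_t FROM matches
) AS per_team
GROUP BY team
ORDER BY points DESC, team
"""

standings = conn.execute(POINTS_SQL).fetchall()
for row in standings:
    print(row)
```

The sample data contains no draws, so the draws column is 0 for every team here; the draw branches only matter once such rows exist.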
I have data that looks like this:
State Sex
---- ---
GA M
GA M
GA F
GA F
GA F
NY M
NY M
NY M
NY M
NY F
NY F
NY F
NY F
NY F
I want the result to be:
one row per state
col1 State
col2 count of Males
col3 count of Females
col4 total count by state
col5 percent Male by state
The query I am using is:
select t.state State,
M.count Male,
F.count Female,
count(t.state) Total,
CONCAT(ROUND(CAST(M.count as float)/CAST(count(t.state) as float)*100, 2), '%') as calc
from MyTable t
join
(
select state, count(sex) as count
from MyTable where sex ='M'
group by state) M
on t.state = M.state
join (
select state, count(sex) as count
from MyTable where sex ='F'
group by state) F
ON M.state = F.state
group by t.state, m.count, F.count;
The above query works, but I am wondering whether I did this in the most efficient way.
This was done using SQL Server, but I think it should be the same for all RDBMSs.
The link is here: http://sqlfiddle.com/#!18/7a969/87
Use conditional aggregation:
select t.state,
sum(case when sex = 'M' then 1 else 0 end) as males,
sum(case when sex = 'F' then 1 else 0 end) as females,
count(*) as total,
avg(case when sex = 'M' then 1.0 else 0 end) as male_ratio
from MyTable t
group by t.state;
I would expect this to be the fastest method in just about any database.
Here is a SQL Fiddle.
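To make the single-pass idea concrete, here is a minimal sketch that runs this conditional aggregation on the question's sample data using SQLite through Python's sqlite3 module. SQLite is an assumption chosen only so the example is self-contained (the question used SQL Server), and the ROUND to a percentage is added for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (state TEXT, sex TEXT)")
# Sample rows from the question: GA has 2 M / 3 F, NY has 4 M / 5 F.
people = (
    [("GA", "M")] * 2 + [("GA", "F")] * 3 +
    [("NY", "M")] * 4 + [("NY", "F")] * 5
)
conn.executemany("INSERT INTO MyTable VALUES (?, ?)", people)

# One scan of the table: every aggregate is computed in the same GROUP BY.
by_state = conn.execute("""
    SELECT state,
           SUM(CASE WHEN sex = 'M' THEN 1 ELSE 0 END) AS males,
           SUM(CASE WHEN sex = 'F' THEN 1 ELSE 0 END) AS females,
           COUNT(*) AS total,
           ROUND(100 * AVG(CASE WHEN sex = 'M' THEN 1.0 ELSE 0 END), 2) AS male_pct
    FROM MyTable
    GROUP BY state
    ORDER BY state
""").fetchall()

for row in by_state:
    print(row)
```

Averaging a 1.0/0 flag gives the male share directly, which avoids a separate division and the integer-division pitfall some databases have.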
You can compute the number of females by subtracting the number of males from the total count per state. This way, only a single join is required:
with r as (select t.state s, count(*) c from testtable t group by t.state)
select r1.s, t1.m males, r1.c - t1.m females, r1.c total, 100*(t1.m/r1.c) m_percent
from r r1
join (select t.state s, t.sex, count(*) m from testtable t group by t.state, t.sex) t1 on r1.s = t1.s
where t1.sex = 'M';
Output:
state males females total m_percent
GA 2 3 5 40.0000
NY 4 5 9 44.4444
See demo.
It is not necessary to separate the data into derived tables for Male and Female.
From a performance point of view, the sub-queries might help if they could make optimal use of indexes, but in reality the chances are low that you would be aggregating over only indexed values.
For this simple query you could use inline CASE expressions to express the Male/Female columns as 0/1 flags and SUM those flags in a single aggregation. However, that would require you to write the CASE for Male twice: once for the Male column and once for the % Male column.
Instead of inline CASE, we can use CROSS APPLY to resolve the calculations once per row and then reference the results by name:
SELECT t.state State,
       SUM(Calcs.IsMale) Male,
       SUM(Calcs.IsFemale) Female,
       COUNT(1) Total,
       CONCAT(ROUND(SUM(Calcs.IsMale)/CAST(COUNT(1) as float)*100, 2), '%') as Calc
FROM MyTable t
-- ELSE 0 keeps the sums at 0 (not NULL) for a state with no rows of one sex
CROSS APPLY (SELECT
      CASE Sex WHEN 'M' THEN 1 ELSE 0 END as [IsMale]
     ,CASE Sex WHEN 'F' THEN 1 ELSE 0 END as [IsFemale]
    ) as calcs
GROUP BY [State]
Is this any more efficient, though? In general it should be: this execution plan is much simpler than joining multiple aggregated sets, but it's hard to say without a much larger dataset to test against.
Either way, I would expect this simple CROSS APPLY version to win, as we only have to process the result set once.
Running the original and the CROSS APPLY queries on the given dataset and looking at the actual execution plans, SQL Server reports the CROSS APPLY query at 25% of the relative cost for the batch:
I apologize in advance for posting this as an image, not really sure if there is a better way to have this discussion
This execution plan reports that the Original Query is 3 times the cost of the CROSS APPLY version, probably due to the 3 table scans in the first query, compared to the single table scan in the CROSS APPLY version.
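CROSS APPLY is SQL Server syntax, but the underlying idea (compute each row's flags once, then reference them by name in the aggregates) can be sketched portably with a derived table. The example below is an illustration under that assumption, run on SQLite through Python's sqlite3 module; it is not the setup whose execution plan was measured above and says nothing about relative cost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (state TEXT, sex TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("GA", "M")] * 2 + [("GA", "F")] * 3 +
    [("NY", "M")] * 4 + [("NY", "F")] * 5,
)

# The inner SELECT plays the role of the CROSS APPLY block: is_male and
# is_female are defined once and reused by name in the outer aggregates.
FLAGS_SQL = """
SELECT state,
       SUM(is_male) AS male,
       SUM(is_female) AS female,
       COUNT(*) AS total,
       ROUND(SUM(is_male) * 100.0 / COUNT(*), 2) AS male_pct
FROM (
    SELECT state,
           CASE sex WHEN 'M' THEN 1 ELSE 0 END AS is_male,
           CASE sex WHEN 'F' THEN 1 ELSE 0 END AS is_female
    FROM MyTable
) AS calcs
GROUP BY state
ORDER BY state
"""

report = conn.execute(FLAGS_SQL).fetchall()
for row in report:
    print(row)
```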
I'm using Microsoft SQL Server and trying to write the query below. I want to use the PIVOT clause to get the count of gold medals for USA and Russia from the 2000 Olympics, but my output is zero for both countries. I know that I can use GROUP BY to get the desired result (see the screenshot below), but how can I do this with the PIVOT clause?
Please see the screenshots of the dataset and the output below.
select
'Gold' as total_m,
['USA'] as USA, ['RUS'] as RUS
from
(select
country, medal, year
from
summer
where
medal = 'Gold'
and year = 2000
and country in ('USA', 'RUS')) as SourceTable
pivot
(count(medal)
for country in (['USA'],['RUS'])) as PivotTable;
[Screenshots not reproduced: dataset, pivot output, GROUP BY output]
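Remove the quotes from the pivot column list. Inside square brackets the single quotes become part of the identifier, so ['USA'] looks for a country value that literally includes the quote characters, matches nothing, and the count comes back as zero: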
select 'Gold' as total_m, [USA] as USA, [RUS] as RUS
from
(select country, medal, year
from summer
where medal = 'Gold'
and year = 2000
and country in ('USA', 'RUS')) as SourceTable
pivot
(count(medal)
for country in ([USA],[RUS])) as PivotTable;
I find that it is simpler to express this with conditional aggregation. This is a portable syntax that works across most databases, and it is somewhat less convoluted than vendor-specific pivot syntax:
select medal,
sum(case when country = 'USA' then 1 else 0 end) as USA,
sum(case when country = 'RUS' then 1 else 0 end) as RUS
from summer
where medal = 'Gold' and year = 2000 and country in ('USA', 'RUS')
group by medal
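Here is a minimal runnable sketch of that conditional aggregation, using SQLite through Python's sqlite3 module. The rows are hypothetical stand-ins invented for illustration (the real dataset was only shown as a screenshot), so the counts are not actual medal totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE summer (country TEXT, medal TEXT, year INTEGER)")
# Hypothetical sample rows, NOT real Olympic results.
conn.executemany(
    "INSERT INTO summer VALUES (?, ?, ?)",
    [
        ("USA", "Gold", 2000),
        ("USA", "Gold", 2000),
        ("USA", "Gold", 2000),
        ("RUS", "Gold", 2000),
        ("RUS", "Gold", 2000),
        ("USA", "Silver", 2000),  # filtered out: not gold
        ("RUS", "Gold", 1996),    # filtered out: wrong year
        ("GER", "Gold", 2000),    # filtered out: other country
    ],
)

# One output column per country, each counting only its own rows.
summary = conn.execute("""
    SELECT medal,
           SUM(CASE WHEN country = 'USA' THEN 1 ELSE 0 END) AS USA,
           SUM(CASE WHEN country = 'RUS' THEN 1 ELSE 0 END) AS RUS
    FROM summer
    WHERE medal = 'Gold' AND year = 2000 AND country IN ('USA', 'RUS')
    GROUP BY medal
""").fetchall()

print(summary)
```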
Is it possible to get only the countries that played just in the pre round?
Country Round
Germany Pre Round
Germany Quater final
Spain Pre Round
Portugal Pre Round
I want to get only the countries that played in the pre round and nothing else. So the result should look like this:
Country
Spain
Portugal
You can group by country and set the conditions in the having clause:
select country
from tablename
group by country
having count(*) = 1 and max(round) = 'Pre Round'
You can also try the following, using NOT EXISTS:
select country from c
where not exists
(select 1 from c as c1 where c.country=c1.country and roundval<>'Pre Round')
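The two approaches above (GROUP BY with HAVING, and NOT EXISTS) can be checked side by side. This is a minimal sketch on the question's four rows, run on SQLite through Python's sqlite3 module; the table name tablename and the column name round are assumptions that follow the first answer, and DISTINCT is added to the NOT EXISTS query so each country appears once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (country TEXT, round TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [
        ("Germany", "Pre Round"),
        ("Germany", "Quater final"),  # spelling kept as in the question's data
        ("Spain", "Pre Round"),
        ("Portugal", "Pre Round"),
    ],
)

# Approach 1: exactly one row per country, and that row is the pre round.
having_rows = conn.execute("""
    SELECT country
    FROM tablename
    GROUP BY country
    HAVING COUNT(*) = 1 AND MAX(round) = 'Pre Round'
    ORDER BY country
""").fetchall()

# Approach 2: keep countries that have no row for any other round.
not_exists_rows = conn.execute("""
    SELECT DISTINCT c.country
    FROM tablename AS c
    WHERE NOT EXISTS (
        SELECT 1
        FROM tablename AS c1
        WHERE c1.country = c.country AND c1.round <> 'Pre Round'
    )
    ORDER BY c.country
""").fetchall()

print(having_rows, not_exists_rows)
```

Both return Portugal and Spain here. They can differ on other data: the HAVING version assumes a country has at most one row per round, while the NOT EXISTS version tolerates duplicate pre-round rows.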
Two more for fun. The first is a variation on @forpas's answer: assign a numeric value to each round, representing the progression through the rounds, and then take the highest value for each country (this would be simpler if the rounds were stored separately with a round number):
select country
from your_table
group by country
having max(case round
when 'Pre Round' then 1
when 'Quater final' then 2
when 'Semi final' then 3
when 'Final' then 4
end) = 1;
If you wanted to find countries that were in the quarters but not semis then you just need to change to = 2, etc.
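The "change to = 2" remark can be sketched as a parameterised query. This is an illustration on SQLite through Python's sqlite3 module, using the question's four rows; the helper name furthest_round_is and the table name your_table are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (country TEXT, round TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [
        ("Germany", "Pre Round"),
        ("Germany", "Quater final"),  # spelling kept as in the question's data
        ("Spain", "Pre Round"),
        ("Portugal", "Pre Round"),
    ],
)

# Map each round to its position in the tournament, then compare the
# furthest position a country reached against the requested level.
RANKED_SQL = """
SELECT country
FROM your_table
GROUP BY country
HAVING MAX(CASE round
             WHEN 'Pre Round' THEN 1
             WHEN 'Quater final' THEN 2
             WHEN 'Semi final' THEN 3
             WHEN 'Final' THEN 4
           END) = ?
ORDER BY country
"""

def furthest_round_is(level):
    """Hypothetical helper: countries whose last round has this rank."""
    return [row[0] for row in conn.execute(RANKED_SQL, (level,))]

print(furthest_round_is(1))  # went out in the pre round
print(furthest_round_is(2))  # went out in the quarter final
```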
The second is overkill here, but could be useful to look for more complicated combinations in other types of data:
select country
from your_table
pivot (
count(*) for round in (
'Pre Round' as pre, 'Quater final' as quarter, 'Semi final' as semi, 'Final' as final
)
)
where pre = 1 and quarter = 0 and semi = 0 and final = 0;
Obviously in your example you wouldn't ever have quarter as 0 and then either semi or final as 1 - you can't get to those rounds without playing the quarters; but for other data you might want a mix.
You could use an inner join on a subquery that selects the countries with round 'Pre Round', and then check the distinct count of rounds:
select m.Country
from my_table m
inner join (
select Country
from my_table
where round ='Pre Round'
) t on t.country = m.country
group by m.Country
having count(distinct m.round ) = 1
So I have a table excerpt that looks like this:
Title Type Year Rating
-------------------------
Interstellar DVD 2014 8.4
Interstellar HD 2014 8.4
12 Angry Men DVD 1969 8.9
The Pianist HD 2001 8.1
The Pianist DVD 2001 8.1
Dragon Ball Z HD 2011 8.4
Dragon Ball Z DVD 1999 8.3
I want to retrieve all titles that have a DVD and HD released in the same year.
So I would want to get Interstellar and The Pianist, but not Dragon Ball Z, since its HD and DVD versions were released in different years.
I managed to do it by joining tables but is it possible to do it without any joins?
You can use INTERSECT:
SELECT Title, Year
FROM your_table
WHERE Type = 'DVD'
INTERSECT
SELECT Title, Year
FROM your_table
WHERE Type = 'HD';
LiveDemo
If you need only the title, wrap it in a subquery:
SELECT Title
FROM( SELECT Title, Year
FROM your_table
WHERE Type = 'DVD'
INTERSECT
SELECT Title, Year
FROM your_table
WHERE Type = 'HD') as t
LiveDemo2
One more possibility is to use EXISTS and correlated subquery:
SELECT Title
FROM your_table m1
WHERE Type = 'DVD'
AND EXISTS (SELECT 1
FROM your_table m2
WHERE m2.Title = m1.Title
AND m2.Type = 'HD'
AND m2.Year = m1.Year);
LiveDemo3
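Both join-free variants can be run on the question's rows to confirm they agree. This is a minimal sketch using SQLite through Python's sqlite3 module, which is an assumption made only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (Title TEXT, Type TEXT, Year INTEGER, Rating REAL)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        ("Interstellar", "DVD", 2014, 8.4),
        ("Interstellar", "HD", 2014, 8.4),
        ("12 Angry Men", "DVD", 1969, 8.9),
        ("The Pianist", "HD", 2001, 8.1),
        ("The Pianist", "DVD", 2001, 8.1),
        ("Dragon Ball Z", "HD", 2011, 8.4),
        ("Dragon Ball Z", "DVD", 1999, 8.3),
    ],
)

# INTERSECT keeps only the (Title, Year) pairs present in both result sets.
intersect_titles = conn.execute("""
    SELECT Title
    FROM (
        SELECT Title, Year FROM your_table WHERE Type = 'DVD'
        INTERSECT
        SELECT Title, Year FROM your_table WHERE Type = 'HD'
    ) AS t
    ORDER BY Title
""").fetchall()

# EXISTS: for each DVD row, look for an HD row with the same title and year.
exists_titles = conn.execute("""
    SELECT Title
    FROM your_table AS m1
    WHERE Type = 'DVD'
      AND EXISTS (SELECT 1
                  FROM your_table AS m2
                  WHERE m2.Title = m1.Title
                    AND m2.Type = 'HD'
                    AND m2.Year = m1.Year)
    ORDER BY Title
""").fetchall()

print(intersect_titles, exists_titles)
```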
You can do it using GROUP BY:
SELECT Title, Year
FROM mytable
GROUP BY Title, Year
HAVING MIN(Type) <> MAX(Type)
Assuming there are only two available Type values, 'HD' and 'DVD', the above query identifies Title, Year pairs that relate to both of these types.
If there are more types available then HAVING becomes a bit more complicated:
HAVING COUNT(CASE WHEN Type = 'DVD' THEN 1 END) <> 0 AND
COUNT(CASE WHEN Type = 'HD' THEN 1 END) <> 0
Output:
Title Year
-----------------------
The Pianist 2001
Interstellar 2014
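To show why the HAVING clause has to change once more than two types exist, here is a minimal sketch on SQLite through Python's sqlite3 module. It uses the question's rows plus one hypothetical row (a 'BluRay' release of 12 Angry Men in 1969) that is invented purely to trigger the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (Title TEXT, Type TEXT, Year INTEGER, Rating REAL)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("Interstellar", "DVD", 2014, 8.4),
        ("Interstellar", "HD", 2014, 8.4),
        ("12 Angry Men", "DVD", 1969, 8.9),
        ("12 Angry Men", "BluRay", 1969, 8.9),  # hypothetical third type
        ("The Pianist", "HD", 2001, 8.1),
        ("The Pianist", "DVD", 2001, 8.1),
        ("Dragon Ball Z", "HD", 2011, 8.4),
        ("Dragon Ball Z", "DVD", 1999, 8.3),
    ],
)

# MIN <> MAX only proves "at least two different types", not "DVD and HD".
min_max = conn.execute("""
    SELECT Title, Year
    FROM mytable
    GROUP BY Title, Year
    HAVING MIN(Type) <> MAX(Type)
    ORDER BY Title
""").fetchall()

# Counting each required type separately stays correct with extra types.
both_types = conn.execute("""
    SELECT Title, Year
    FROM mytable
    GROUP BY Title, Year
    HAVING COUNT(CASE WHEN Type = 'DVD' THEN 1 END) <> 0
       AND COUNT(CASE WHEN Type = 'HD' THEN 1 END) <> 0
    ORDER BY Title
""").fetchall()

print(min_max)
print(both_types)
```

With the extra row, the MIN/MAX version also returns 12 Angry Men (DVD plus BluRay), while the per-type counts return only Interstellar and The Pianist.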
I am trying to create a table that will count the occurrences of each position for various offices.
So if my data is as follows:
Office Position
A Manager
A Supervisor
A Entry Level
A Entry Level
B Manager
B Entry Level
I would want my code to return:
Office Managers Supervisors EntryLevel
A 1 1 2
B 1 0 1
My code is below. The issue is that it counts the total number of occurrences across all offices rather than the count within each office. The results are as follows:
A 2 1 3
B 2 1 3
CREATE TABLE OfficeTest AS
SELECT DISTINCT Office,
(Select COUNT(Position) FROM OfficeData WHERE Make_Name = 'Manager') as Managers,
(Select COUNT(Position) FROM OfficeData WHERE Make_Name = 'Supervisor') as Supervisors,
(Select COUNT(Position) FROM OfficeData WHERE Make_Name = 'Entry Level') as EntryLevel
FROM OfficeData
GROUP BY Office;
Any ideas on how to fix this?
The easiest way I can think of doing this is like this:
SELECT Office,
COUNT(CASE WHEN Make_Name = 'Manager' THEN Position END) AS Managers,
COUNT(CASE WHEN Make_Name = 'Supervisor' THEN Position END) AS Supervisors,
COUNT(CASE WHEN Make_Name = 'Entry Level' THEN Position END) AS EntryLevel
FROM OfficeData
GROUP BY Office
COUNT ignores MISSING values; if the Position is not the one specified in the CASE clause, it will return a MISSING value and won't be counted. This way each case considers only the value of Position you compare.
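The same pattern can be tried outside SAS. This is a minimal sketch on SQLite through Python's sqlite3 module, where NULL plays the role of a SAS missing value; it filters on the Position column because that is the only attribute in the sample data (the question's code refers to Make_Name), which is an assumption about the intended column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OfficeData (Office TEXT, Position TEXT)")
conn.executemany(
    "INSERT INTO OfficeData VALUES (?, ?)",
    [
        ("A", "Manager"),
        ("A", "Supervisor"),
        ("A", "Entry Level"),
        ("A", "Entry Level"),
        ("B", "Manager"),
        ("B", "Entry Level"),
    ],
)

# CASE yields NULL when the position does not match, and COUNT skips NULLs,
# so each column counts only its own position within the current office.
office_counts = conn.execute("""
    SELECT Office,
           COUNT(CASE WHEN Position = 'Manager' THEN 1 END) AS Managers,
           COUNT(CASE WHEN Position = 'Supervisor' THEN 1 END) AS Supervisors,
           COUNT(CASE WHEN Position = 'Entry Level' THEN 1 END) AS EntryLevel
    FROM OfficeData
    GROUP BY Office
    ORDER BY Office
""").fetchall()

for row in office_counts:
    print(row)
```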
Another option, as stated in the comments, would be pivoting the table. The SAS equivalent is the TRANSPOSE procedure. I don't have a SAS system to create and test a query using it, but here's the documentation in case you want to check it out.
Just to flesh out Danny's comment a bit, the SUM code would look like:
proc sql;
CREATE TABLE want AS
SELECT office,
SUM( (position='Manager') ) as Managers,
SUM( (position='Supervisor') ) as Supervisors,
SUM( (position='Entry Level') ) as EntryLevel
FROM OfficeData
GROUP BY office
;quit;
The (position='Manager') bit resolves to 0 or 1, depending on whether it's true for the current record. I find the SUM version a lot more concise and legible, but both should work for your situation. Plus, it's easily extensible to more than one criterion, like (position='Manager')*(sex='F') to count only female managers.
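SQLite also evaluates a comparison to 0 or 1, so the SUM-of-booleans idea can be sketched there through Python's sqlite3 module. The Sex column and its values are hypothetical, added only to illustrate multiplying two conditions; they are not part of the original data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sex is a hypothetical extra column used to demonstrate combined criteria.
conn.execute("CREATE TABLE OfficeData (Office TEXT, Position TEXT, Sex TEXT)")
conn.executemany(
    "INSERT INTO OfficeData VALUES (?, ?, ?)",
    [
        ("A", "Manager", "F"),
        ("A", "Supervisor", "M"),
        ("A", "Entry Level", "M"),
        ("A", "Entry Level", "F"),
        ("B", "Manager", "M"),
        ("B", "Entry Level", "F"),
    ],
)

# Each comparison is 1 when true and 0 when false, so SUM counts the matches.
# Multiplying two comparisons acts as a logical AND.
bool_sums = conn.execute("""
    SELECT Office,
           SUM(Position = 'Manager') AS Managers,
           SUM(Position = 'Supervisor') AS Supervisors,
           SUM(Position = 'Entry Level') AS EntryLevel,
           SUM((Position = 'Manager') * (Sex = 'F')) AS FemaleManagers
    FROM OfficeData
    GROUP BY Office
    ORDER BY Office
""").fetchall()

for row in bool_sums:
    print(row)
```

Not every database treats a comparison as a number (SQL Server, for example, does not), in which case the CASE form is the portable fallback.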
SUM with a CASE expression should resolve the issue. Below is some reference code:
proc sql;
create table result as
select age
, sum(case sex when 'F' then 1 else 0 end) as Female
, sum(case sex when 'M' then 1 else 0 end) as Male
from sashelp.class
group by age;
quit;
proc print data=result;run;