Constructing the SQL query below - sql

GOALS (~1700 rows)
YEAR COUNTRY NAME NUM_GOALS
-------------------------------------------
2018 England Harry Kane 6
2018 France Antoine Griezmann 4
2014 Argentina Lionel Messi 4
2014 Brazil Fred 1
2010 Germany Thomas Muller 5
2010 Japan Shinji Okazaki 1
1992 England Gary Linekar 6
CHAMPIONS (~500 rows)
YEAR COUNTRY NAME ROLE
-------------------------------------------------
2018 France Didier Deschamps Manager
2018 France Hugo Lloris Goalkeeper
2018 France Paul Pogba Midfielder
2014 Germany Joachim Loew Manager
2014 Germany Mesut Ozil Midfielder
2014 Germany Miroslav Klose Forward
2002 Brazil Da Silva Midfielder
1994 Brazil Da Silva Midfielder
1998 France Didier Deschamps Midfielder
Write a query showing all world cup winning players who have never scored a goal.
What I am unsure about is whether to use a join for this, and whether I need to specify any IDs if a join is used.
I'd be grateful for extra clarification and help with this, or if my query needs any tweaking.
This is what I have tried:
SELECT GOALS.NAME
FROM GOALS
INNER JOIN CHAMPIONS ON CHAMPIONS.COUNTRY = GOALS.NAME
WHERE GOALS.NUM_GOALS = 0;

Problems with your query:
1) The join condition is wrong: it compares CHAMPIONS.COUNTRY with GOALS.NAME.
2) Even if it were right, the query would search for players who had at least one world cup without scoring a goal, which is different from players who never scored a goal.
You could use not exists:
select c.*
from champions c
where not exists (
select 1
from goals g
where g.country = c.country and g.name = c.name and g.num_goals > 0
)
This assumes that (country, name) tuples do identify a player.
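As a rough check of the NOT EXISTS approach, here is a minimal sketch using Python's built-in sqlite3 module. The handful of rows are illustrative sample data only, the table layouts are assumed from the question, and DISTINCT is added on the assumption that one player may appear in CHAMPIONS for several years.

```python
import sqlite3

# In-memory database with a few illustrative rows (not the real tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE goals (year INT, country TEXT, name TEXT, num_goals INT);
    CREATE TABLE champions (year INT, country TEXT, name TEXT, role TEXT);
    INSERT INTO goals VALUES
        (2018, 'France', 'Antoine Griezmann', 4),
        (2018, 'France', 'Paul Pogba', 1),
        (2014, 'Germany', 'Miroslav Klose', 2);
    INSERT INTO champions VALUES
        (2018, 'France', 'Hugo Lloris', 'Goalkeeper'),
        (2018, 'France', 'Paul Pogba', 'Midfielder'),
        (2014, 'Germany', 'Mesut Ozil', 'Midfielder'),
        (2014, 'Germany', 'Miroslav Klose', 'Forward');
""")

# Champions for whom no goal-scoring row exists in any year.
never_scored = [row[0] for row in conn.execute("""
    SELECT DISTINCT c.name
    FROM champions c
    WHERE NOT EXISTS (
        SELECT 1
        FROM goals g
        WHERE g.country = c.country
          AND g.name = c.name
          AND g.num_goals > 0
    )
    ORDER BY c.name
""")]
print(never_scored)
```

Players with no row at all in GOALS are kept, which an inner join filtered on NUM_GOALS = 0 would miss.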
On the other hand, if you want players that won a world cup without scoring a goal in that particular event, then you can either add a correlation condition on year, or use a straight join:
select c.*
from champions c
inner join goals g
on g.country = c.country
and g.name = c.name
and g.year = c.year
where g.num_goals = 0

Your ON condition is comparing CHAMPIONS.COUNTRY = GOALS.NAME, which is not a good comparison for joining these two tables. I would suggest doing this:
SELECT
GOALS.NAME
FROM
GOALS
INNER JOIN
CHAMPIONS
ON
CHAMPIONS.COUNTRY = GOALS.COUNTRY
WHERE
GOALS.NUM_GOALS = 0;

Adding rows in a table from data that is not in a column

I'm trying to create a table to add all Medals won by the participant countries in the Olympics.
I scraped the data from Wikipedia and have something similar to this:
Year  Country_Name  Host_city    Host_Country   Gold  Silver  Bronze
--------------------------------------------------------------------
1986  146           Los Angeles  United States  41    32      30
1986  67            Los Angeles  United States  12    12      12
And so on
I double-checked the data for some years, and it seems very accurate. The Country_Name has an ID because I have a Country_ID table that I created and updated the names with the ID:
Country_ID  Country_Name
------------------------
1986        1
1986        2
So far so good. Now I want to create a new table where I'll have all countries in a specific year and the total medals for that country. I managed to easily do that for countries that participated in an edition, here's an example for the 1896 edition:
INSERT INTO Cumultative_Medals_by_Year(Country_ID, Year, Culmutative_Gold, Culmutative_Silver, Culmutative_Bronze, Total_Medals)
SELECT a.Country_Name, a.Year, SUM(a.Gold) As Cumultative_Gold, SUM(a.Silver) As Cumultative_Silver, SUM(a.Bronze) As Cumultative_Bronze, SUM(a.Gold) + SUM(a.Silver) + SUM(a.Bronze) AS Total_Medals
FROM Country_Medals a
Where a.Year >= 1896 AND Year < 1900
Group By a.Country_Name, a.Year
And I'll have this table:
Country_ID  Year  Cumultative_Gold  Cumultative_Silver  Cumultative_Bronze  Total_Medals
----------------------------------------------------------------------------------------
6           1986  2                 0                   0                   5
7           1986  2                 1                   2                   5
35          1986  1                 2                   3                   6
46          1986  5                 4                   2                   11
49          1986  6                 5                   2                   13
51          1986  2                 3                   2                   7
52          1986  10                18                  19                  47
58          1986  2                 1                   3                   6
85          1986  1                 0                   1                   2
131         1986  1                 2                   0                   3
146         1986  11                7                   2                   20
To add the other editions I just have to edit the dates, "Where a.Year >= 1900 AND Year < 1904", for example.
INSERT INTO Cumultative_Medals_by_Year(Country_ID, Year, Culmutative_Gold, Culmutative_Silver, Culmutative_Bronze, Total_Medals)
SELECT a.Country_Name, a.Year, SUM(a.Gold) As Cumultative_Gold, SUM(a.Silver) As Cumultative_Silver, SUM(a.Bronze) As Cumultative_Bronze, SUM(a.Gold) + SUM(a.Silver) + SUM(a.Bronze) AS Total_Medals
FROM Country_Medals a
Where a.Year >= 1900 AND Year < 1904
Group By a.Country_Name, a.Year
And the table will grow.
But I'd like to also add all the other countries for the year 1896. This way I'll have a full record of all countries. So for example, you see that Country 1 has no medals in the 1896 Olympic edition, but I'd like to also add it there, even if the sum becomes NULL (where I'll update with a 0).
Why do I want that? I'd like to build an animated bar chart race, and with the data I have, some countries drop out of the race. For example, the US didn't participate in the 1980 Olympics, so its bar briefly disappears from the chart and returns in 1984, when it participated again. Another example is the Soviet Union: it no longer participates, yet it has the second-most medals won (behind only the US), and because it has no participation after 1988, its bar disappears after that year. Keeping a record of medals for all countries in all editions would prevent that from happening.
I'm pretty sure there are lots of countries that have won medals that were not around in 1896. But if you want a row for every country and every year, then generate the rows you want using cross join. Then join in the available information:
select c.Country_Name, y.Year,
SUM(cm.Gold) As Cumulative_Gold,
SUM(cm.Silver) As Cumulative_Silver,
SUM(cm.Bronze) As Cumulative_Bronze,
COALESCE(SUM(cm.Gold), 0) + COALESCE(SUM(cm.Silver), 0) + COALESCE(SUM(cm.Bronze), 0) AS Total_Medals
from (select distinct year from Country_Medals) y cross join
(select distinct country_name from country_medals) c left join
country_medals cm
on cm.year = y.year and
cm.country_name = c.country_name
group By c.Country_Name, y.Year
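To see how the cross join fills in the missing country/year pairs, here is a small sketch run through Python's sqlite3 module. The three sample rows are invented for illustration, and every sum is wrapped in COALESCE so that missing editions show 0 rather than NULL, which is an assumption about what the chart needs.

```python
import sqlite3

# In-memory copy of Country_Medals with invented sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country_medals (
        year INT, country_name INT, gold INT, silver INT, bronze INT
    );
    INSERT INTO country_medals VALUES
        (1896, 1, 2, 0, 1),
        (1900, 1, 1, 1, 0),
        (1900, 2, 3, 0, 0);  -- country 2 has no 1896 row
""")

# Every (country, year) pair, with zeros where no medals were recorded.
rows = conn.execute("""
    SELECT c.country_name, y.year,
           COALESCE(SUM(cm.gold), 0)   AS gold,
           COALESCE(SUM(cm.silver), 0) AS silver,
           COALESCE(SUM(cm.bronze), 0) AS bronze,
           COALESCE(SUM(cm.gold), 0) + COALESCE(SUM(cm.silver), 0)
               + COALESCE(SUM(cm.bronze), 0) AS total_medals
    FROM (SELECT DISTINCT year FROM country_medals) y
    CROSS JOIN (SELECT DISTINCT country_name FROM country_medals) c
    LEFT JOIN country_medals cm
           ON cm.year = y.year AND cm.country_name = c.country_name
    GROUP BY c.country_name, y.year
    ORDER BY c.country_name, y.year
""").fetchall()
for row in rows:
    print(row)
```

Country 2 gets a row for 1896 with all zeros even though it has no medals recorded for that edition.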

sql select 1 item from list

I want to select a column from a table whose values can be referenced many times by a column in another table.
select t1.name
from ccp.ENTITIES t1
Non
Albania
Australia
China
Czech Republic
Egypt
Germany
Greece
Group
Hungary
India
Ireland
Italy
Luxembourg
Malaysia
Malta
Netherlands
Portugal
Romania
Spain
Turkey
UK
US
This gives me a list of names, and for each name I want one row from another table.
The table v_networks_by_lm holds records with the columns name (matching t1.name) and network. I want the network column only once for each item in the list, but v_networks_by_lm can hold many rows for each t1.name.
entity name
a Spain
b Spain
c Spain
d Spain
e Spain
f Spain
g Spain
h Germany
i Germany
j Germany
k Germany
l Germany
m Germany
n Germany
o Germany
p UK
q Germany
r Spain
s Spain
t Portugal
u Portugal
v Portugal
q Portugal
From the above data, which is in v_networks_by_lm, I only want each name returned once, with any value of entity. But I want to pick the name from ENTITIES, as the list can be dynamic.
I think aggregation does what you want:
SELECT MAX(n.network) as network, e.name
FROM ccp.ENTITIES e JOIN
ccp.v_networks_by_lm n
ON n.name = e.name
GROUP BY e.name;
Sounds like you want a subquery to get the single instance of name from the table, and then you do the join against entities.
Select sub.one_of_entity_values, sub.name
from ccp.entities e
inner join (
select max(entity) as one_of_entity_values, name
from v_networks_by_lm
group by name) sub on e.name = sub.name
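Both answers come down to grouping by name and picking one value with an aggregate. A minimal sketch with Python's sqlite3 module follows; the schema prefix is dropped, v_networks_by_lm is created as a plain table here, and the column is called entity as in the sample data, all of which are assumptions for the sake of a runnable example.

```python
import sqlite3

# In-memory stand-ins for ccp.ENTITIES and v_networks_by_lm.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (name TEXT);
    CREATE TABLE v_networks_by_lm (entity TEXT, name TEXT);
    INSERT INTO entities VALUES ('Spain'), ('Germany'), ('Malta');
    INSERT INTO v_networks_by_lm VALUES
        ('a', 'Spain'), ('b', 'Spain'),
        ('h', 'Germany'), ('q', 'Germany'),
        ('p', 'UK');
""")

# One row per name that exists in both tables; MAX picks an arbitrary
# but deterministic entity for each name.
rows = conn.execute("""
    SELECT e.name, MAX(n.entity) AS one_entity
    FROM entities e
    JOIN v_networks_by_lm n ON n.name = e.name
    GROUP BY e.name
    ORDER BY e.name
""").fetchall()
print(rows)
```

Names in ENTITIES with no matching row (Malta here) are dropped by the inner join; a LEFT JOIN would keep them with a NULL entity.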

SQL Queries problems

I'm new to this site because I recently started a database class, so I'm still learning the basics, and I need a little help. These are the two problems I have.
List of all apartments that were occupied on March 1, 2015 sorted by complex and apartment number. Example Results:
complexName apartmentNumber
Fox Run 101
Fox Run 102
Fox Run 204
Oak Meadows 103
Villa Maria 11
Villa Maria 12
List of all tenants that had a current lease on March 1, 2015 sorted by property and apartment number
Example results:
complexName apartmentNumber givenName familyName
Fox Run 101 Shannon McCoy
Fox Run 102 Larry Thomas
Fox Run 204 Mark Patterson
Oak Meadows 103 Jose Ortiz
Villa Maria 11 Cassie Lee
Villa Maria 12 Robert Woodward
My SQL for the first problem is...
SELECT DISTINCT name AS 'complexName', number AS 'apartmentNumber'
FROM week9wildwood.Complex AS c
INNER JOIN week9wildwood.Apartment AS a ON c.complexID = a.Complex_complexID
INNER JOIN week9wildwood.Lease AS l ON a.number = l.Apartment_number
WHERE startDate BETWEEN '2015-03-01' AND '2016-03-01'
ORDER BY name, number;
But I keep getting this back...
complexName apartmentNumber
Fox Run 102
Fox Run 103
Fox Run 104
Oak Meadows 102
Oak Meadows 103
Villa Maria 11
Villa Maria 21
I'm not sure what I'm doing wrong, or why it's coming back with different data. I also feel like the query for the first problem is almost the same as the one for the second problem, but the wording has me hesitant. Any suggestions would be greatly appreciated!
Untested but maybe just make sure the end date is greater than March 15, 2015
select name AS 'complexName', number AS 'apartmentNumber'
from week9wildwood.Complex AS c
inner join
week9wildwood.Apartment AS a
on c.complexID = a.Complex_complexID
inner join
week9wildwood.Lease AS l
on a.number = l.Apartment_number
where end_date > '2015-03-15'
ORDER BY name, number;
Have you tried Group By?
SELECT DISTINCT name AS 'complexName', number AS 'apartmentNumber'
FROM week9wildwood.Complex c
INNER JOIN week9wildwood.Apartment a
ON c.complexID = a.Complex_complexID
INNER JOIN week9wildwood.Lease l
ON a.number = l.Apartment_number
WHERE startDate BETWEEN '2015-03-01' AND '2016-03-01'
GROUP BY 1
ORDER BY name, number;
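A lease covers a given day when it starts on or before that day and ends on or after it, whereas the BETWEEN filter in the question selects leases that start during the following year. The sketch below shows that date test using Python's sqlite3 module. The single flattened lease table, the endDate column, and all rows are hypothetical, since the real week9wildwood schema is not shown.

```python
import sqlite3

# Hypothetical flattened lease table; dates are stored as ISO strings,
# which compare correctly as text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lease (
        complexName TEXT, apartmentNumber INT, startDate TEXT, endDate TEXT
    );
    INSERT INTO lease VALUES
        ('Fox Run', 101, '2014-06-01', '2015-05-31'),      -- active on the target date
        ('Fox Run', 102, '2015-03-01', '2016-02-29'),      -- starts on the target date
        ('Fox Run', 103, '2015-04-01', '2016-03-31'),      -- starts too late
        ('Oak Meadows', 103, '2013-01-01', '2014-12-31');  -- already ended
""")

target = "2015-03-01"
# Occupied on the target date: started on or before it, ends on or after it.
rows = conn.execute("""
    SELECT DISTINCT complexName, apartmentNumber
    FROM lease
    WHERE startDate <= ? AND endDate >= ?
    ORDER BY complexName, apartmentNumber
""", (target, target)).fetchall()
print(rows)
```

The second problem would use the same WHERE clause with an extra join to the tenant table for givenName and familyName.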

SQL Select Distinct returning duplicates

I am trying to return the country, golfer name, golfer age, and average drive for the golfers with the highest average drive from each country.
However I am getting a result set with duplicates of the same country. What am I doing wrong? here is my code:
select distinct country, name, age, avgdrive
from pga.golfers S1
inner join
(select max(avgdrive) as MaxDrive
from pga.golfers
group by country) S2
on S1.avgdrive = s2.MaxDrive
order by avgdrive;
These are some of the results I've been getting back, I should only be getting 15 rows, but instead I'm getting 20:
COUN NAME AGE AVGDRIVE
---- ------------------------------ ---------- ----------
Can Mike Weir 35 279.9
T&T Stephen Ames 41 285.8
USA Tim Petrovic 39 285.8
Ger Bernhard Langer 47 289.3
Swe Fredrik Jacobson 30 290
Jpn Ryuji Imada 28 290
Kor K.J. Choi 37 290.4
Eng Greg Owen 33 291.8
Ire Padraig Harrington 33 291.8
USA Scott McCarron 40 291.8
Eng Justin Rose 25 293.1
Ind Arjun Atwal 32 293.7
USA John Rollins 30 293.7
NIr Darren Clarke 37 294
Swe Daniel Chopra 31 297.2
Aus Adam Scott 25 300.6
Fij Vijay Singh 42 300.7
Spn Sergio Garcia 25 301.9
SAf Ernie Els 35 302.9
USA Tiger Woods 29 315.2
You are missing a join condition:
select s1.country, s1.name, s1.age, s1.avgdrive
from pga.golfers S1 inner join
(select country, max(avgdrive) as MaxDrive
from pga.golfers
group by country
) S2
on S1.avgdrive = s2.MaxDrive and s1.country = s2.country
order by s1.avgdrive;
Your problem is that some people in one country have the same average as the best in another country.
DISTINCT eliminates duplicate rows, not duplicate values in individual fields.
To get a list of countries with ages, names, and max drives, you would need to group the whole select by country.
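The effect of the missing join condition can be reproduced with three rows taken from the result listing above. This is a sketch using Python's sqlite3 module, with the pga schema prefix dropped; it runs the join once on avgdrive alone and once on avgdrive plus country.

```python
import sqlite3

# Three golfers from the listing: two share the same average drive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE golfers (country TEXT, name TEXT, age INT, avgdrive REAL);
    INSERT INTO golfers VALUES
        ('Ind', 'Arjun Atwal', 32, 293.7),
        ('USA', 'John Rollins', 30, 293.7),
        ('USA', 'Tiger Woods', 29, 315.2);
""")

base_query = """
    SELECT s1.country, s1.name
    FROM golfers s1
    JOIN (SELECT country, MAX(avgdrive) AS maxdrive
          FROM golfers
          GROUP BY country) s2
      ON s1.avgdrive = s2.maxdrive {extra}
    ORDER BY s1.avgdrive, s1.name
"""

# Joining on the drive alone lets Rollins match India's maximum.
loose = conn.execute(base_query.format(extra="")).fetchall()
# Adding the country keeps each golfer within their own country's group.
strict = conn.execute(
    base_query.format(extra="AND s1.country = s2.country")
).fetchall()
print(loose)
print(strict)
```

With the country condition in place, ties within a single country would still return more than one row for that country.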

Query to run through each instance. MS-Access

I have an MS-Access database with two tables that I would like to query; the basic table schema is shown below. I am looking to pull out the details for the earliest parish church in each parish, and where there is no church with 'parish' in the name, I would like to pull out the earliest church.
SITEDETAIL:
Site Reference No. | Civil Parish | Site Name            | NGR East | NGR North
1                  | Assynt       | Old Parish Church    | 6137     | 3172
2                  | Assynt       | St. Marys            | 6097     | 3870
3                  | Assynt       | New Parish Church    | 6249     | 3490
4                  | Bower        | Grimbister           | 2095     | 4067
5                  | Bower        | St. Andrews          | 2304     | 3194
6                  | Halkirk      | Firth Parish Church  | 7136     | 3450
7                  | Holm         | Strath Parish Church | 4586     | 2045
8                  | Holm         | St Nicholas Parish   | 4132     | 3146
SITEDATES:
Site Reference No. | Date
1                  | 1812
2                  | 1300
3                  | 1900
4                  | 1760
5                  | 1750
6                  | 1838
7                  | 1619
8                  | 1774
I have written a query that pulls out all the instances of ‘parish’:
SELECT SITEDETAIL.SITEREFNO, SITEDETAIL.CIVPARBUR_CDE, SITEDETAIL.SITENAME, SITEDETAIL.NGRE, SITEDETAIL.NGRN, SITEDATES.DATE
FROM SITEDETAIL INNER JOIN SITEDATES ON SITEDETAIL.SITEREFNO = SITEDATES.SITEREFNO
WHERE (((SITEDETAIL.SITENAME) Like "par*"));
However, this does not take into account the instances of multiple/no churches with ‘par*’ in the name.
Is it possible to create an SQL query that runs through each civil parish and selects the earliest ‘parish’ or earliest church, or is it necessary to write a perl script to run through them? Is this possible using DBI?
Desired output:
Site Reference No. | Civil Parish | Site Name            | NGR East | NGR North | Date
1                  | Assynt       | Old Parish Church    | 6137     | 3172      | 1812
5                  | Bower        | St. Andrews          | 2304     | 3194      | 1750
6                  | Halkirk      | Firth Parish Church  | 7136     | 3450      | 1838
7                  | Holm         | Strath Parish Church | 4586     | 2045      | 1619
NB: In the case of Assynt, 'Old Parish Church' is selected even though St. Marys is older, because it has 'parish' in the name.
The following query should get you what you need. It's a little long, but it does the trick:
select LIST.Civil_Parish, SD.Site_name, LIST.MSite_Date
from
(
    select Civil_Parish, min(Site_date) as MSite_date
    from SiteDetail
    where Boolean = 1
    group by Civil_Parish
    union
    select Civil_parish, min(Site_date) as MSite_date
    from SiteDetail
    where Civil_parish not in
        (select Civil_parish
         from SiteDetail
         where Boolean = 1)
    group by Civil_Parish) as LIST
left join sitedetail SD on LIST.Civil_Parish = SD.Civil_Parish and LIST.MSite_Date = SD.Site_Date
Please note the following:
1) I am using PowerUser's boolean suggestion. If the Boolean column has value 1, then the row is a Parish Church, and 0 if it is not.
2) I combined the tables "SiteDates" and "SiteDetails" for the purpose of this example, as they are 1 to 1.
The core of the query is A) finding the oldest Parish church in a Parish, then B) find Parishes without Parish Churches.
The code for A) is as follows:
select Civil_Parish, min(Site_date) as MSite_date
from SiteDetail
where Boolean = 1
group by Civil_Parish
We then union that with the oldest churches in parishes that do not have a parish church:
select Civil_parish, min(Site_date) as MSite_date
from SiteDetail
where Civil_parish not in
    (select Civil_parish
     from SiteDetail
     where Boolean = 1)
group by Civil_Parish
We then join the union query (named "LIST" here) with our original "SITEDETAIL" table on Parish and Date to bring in the church name.
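The union approach can be tried against the question's sample data. The sketch below uses Python's sqlite3 module rather than Access, merges the two tables into one as the answer does, and uses a hypothetical is_parish column in place of the Boolean flag, so treat it as an illustration of the technique rather than a drop-in Access query.

```python
import sqlite3

# Merged SITEDETAIL/SITEDATES sample data; is_parish is a hypothetical
# flag marking sites with 'parish' in the name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sitedetail (
        site_ref INT, civil_parish TEXT, site_name TEXT,
        site_date INT, is_parish INT
    );
    INSERT INTO sitedetail VALUES
        (1, 'Assynt',  'Old Parish Church',    1812, 1),
        (2, 'Assynt',  'St. Marys',            1300, 0),
        (3, 'Assynt',  'New Parish Church',    1900, 1),
        (4, 'Bower',   'Grimbister',           1760, 0),
        (5, 'Bower',   'St. Andrews',          1750, 0),
        (6, 'Halkirk', 'Firth Parish Church',  1838, 1),
        (7, 'Holm',    'Strath Parish Church', 1619, 1),
        (8, 'Holm',    'St Nicholas Parish',   1774, 1);
""")

rows = conn.execute("""
    SELECT sd.civil_parish, sd.site_name, list.first_date
    FROM (
        -- A) earliest parish church per parish
        SELECT civil_parish, MIN(site_date) AS first_date
        FROM sitedetail
        WHERE is_parish = 1
        GROUP BY civil_parish
        UNION
        -- B) earliest church in parishes with no parish church
        SELECT civil_parish, MIN(site_date)
        FROM sitedetail
        WHERE civil_parish NOT IN
            (SELECT civil_parish FROM sitedetail WHERE is_parish = 1)
        GROUP BY civil_parish
    ) AS list
    JOIN sitedetail sd
      ON sd.civil_parish = list.civil_parish
     AND sd.site_date = list.first_date
    ORDER BY sd.civil_parish
""").fetchall()
for row in rows:
    print(row)
```

Because the final join matches on parish and date only, two churches in the same parish with the same date would both be returned.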