How to join tables on column containing text - sql

I have 2 tables to merge:
t1
Continent Country City
-----------------------
Europe Germany Munich
NA Canada Ontario
Asia Singapore (blank)
Asia Japan Tokyo
AND
t2
Country Status
-----------------
Germany Complete
Canada Incomplete
Singapore Complete
Japan Complete
I want to get the continent with 2nd highest "Complete" status. I am new to SQL and I am trying hard to learn the basics, but I cannot get this done.

I understand that you want the continent with the second-highest number of countries marked as complete.
If so, you can join, aggregate, order by the count of completed countries per continent, and then keep only the second row:
select continent
from
(select distinct country, continent from t1) t1
inner join t2 on t1.country = t2.country
group by continent
order by sum(case when status = 'Complete' then 1 else 0 end) desc
limit 1, 1
Note the use of distinct when retrieving the association of countries and continents: your sample data looks like it could have more than one row per country/continent pair (since it also lists cities). Without the distinct, the join could generate duplicate rows, causing sum() to be wrong.
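To check the join, aggregate, and "second row" steps described above, here is a minimal sketch that loads the sample data from the question into an in-memory SQLite database. The function name is made up for illustration, and the query uses LIMIT 1 OFFSET 1, which is the portable spelling of MySQL's limit 1, 1.

```python
import sqlite3

def second_most_complete_continent():
    """Return the continent with the second-highest count of 'Complete' countries."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE t1 (continent TEXT, country TEXT, city TEXT);
        CREATE TABLE t2 (country TEXT, status TEXT);
        INSERT INTO t1 VALUES
            ('Europe', 'Germany', 'Munich'),
            ('NA', 'Canada', 'Ontario'),
            ('Asia', 'Singapore', NULL),
            ('Asia', 'Japan', 'Tokyo');
        INSERT INTO t2 VALUES
            ('Germany', 'Complete'),
            ('Canada', 'Incomplete'),
            ('Singapore', 'Complete'),
            ('Japan', 'Complete');
    """)
    row = con.execute("""
        SELECT c.continent
        FROM (SELECT DISTINCT country, continent FROM t1) AS c
        INNER JOIN t2 ON t2.country = c.country
        GROUP BY c.continent
        ORDER BY SUM(CASE WHEN t2.status = 'Complete' THEN 1 ELSE 0 END) DESC
        LIMIT 1 OFFSET 1  -- skip the top row, keep the second
    """).fetchone()
    con.close()
    return row[0]

if __name__ == "__main__":
    print(second_most_complete_continent())
```

With the sample data, Asia has two completed countries, Europe one, and NA none, so the second row is Europe.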

If you mean the country with the second-highest number of rows with status Complete, you can use a CTE and a subquery:
with a as
(
select a.country,
sum(case when status = 'Complete' then 1 else 0 end) as CompleteCount
from t1 a inner join t2 b on a.country = b.country
group by a.country
)
select country from
(
select country,
ROW_NUMBER() OVER( ORDER BY CompleteCount desc) as OrderComplete
from a
)a where OrderComplete = 2
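One thing to keep in mind with the row-number approach is ties: ROW_NUMBER() numbers tied rows arbitrarily, so "row 2" may not be the second-highest value. The sketch below is a variant, not the answer's exact query: it ranks continents rather than countries and uses DENSE_RANK() so that every continent sharing the second-highest count is returned. It assumes SQLite 3.25 or later for window-function support; the function name is hypothetical.

```python
import sqlite3

def continents_at_rank(rank):
    """Return all continents whose 'Complete' count has the given dense rank."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE t1 (continent TEXT, country TEXT, city TEXT);
        CREATE TABLE t2 (country TEXT, status TEXT);
        INSERT INTO t1 VALUES
            ('Europe', 'Germany', 'Munich'),
            ('NA', 'Canada', 'Ontario'),
            ('Asia', 'Singapore', NULL),
            ('Asia', 'Japan', 'Tokyo');
        INSERT INTO t2 VALUES
            ('Germany', 'Complete'),
            ('Canada', 'Incomplete'),
            ('Singapore', 'Complete'),
            ('Japan', 'Complete');
    """)
    rows = con.execute("""
        WITH counts AS (
            SELECT c.continent,
                   SUM(CASE WHEN t2.status = 'Complete' THEN 1 ELSE 0 END) AS complete_count
            FROM (SELECT DISTINCT country, continent FROM t1) AS c
            JOIN t2 ON t2.country = c.country
            GROUP BY c.continent
        ),
        ranked AS (
            -- DENSE_RANK gives tied counts the same rank and leaves no gaps
            SELECT continent,
                   DENSE_RANK() OVER (ORDER BY complete_count DESC) AS rnk
            FROM counts
        )
        SELECT continent FROM ranked WHERE rnk = ? ORDER BY continent
    """, (rank,)).fetchall()
    con.close()
    return [r[0] for r in rows]

if __name__ == "__main__":
    print(continents_at_rank(2))
```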

Related

Sql query for the lowest score per country

I have a database with 3 tables. The country table has id and name columns. The sport table also has id and name columns. Finally, the match table has id, player1 and player2 (the ids of the countries that play against each other), winner_id (the id of the country that won the match), and sport_id (the sport that was played). By "least wins" I mean only the sport in which a country had the fewest wins, no matter how many matches it played.
I want to show the sport per country with the least wins. It should look like this:
Country Sport Wins
--------------------------
France Basketball 2
How can I construct this query? I'm using SQL Server.
The data in the tables looks like this. Table countries:
country_id name
------------------
1 France
2 England
Table sport:
sport_id name
------------------
1 Footbal
2 Basketball
Table match:
match_id player1 player2 winner_id sport_id
----------------------------------------------
1 3 1 3 1
2 6 4 4 2
Note that the wording "least wins" is ambiguous. In my solution it means the sport with the fewest wins and, among sports tied on wins, the one with the most matches played.
To get this ranking, we need to know how many matches a country has played in each sport and how many of those have been won.
SELECT
country.name AS country,
sport.name AS sport,
sport_wins.wins
FROM
country
OUTER APPLY (
SELECT TOP 1
t.match_count,
COALESCE(t.wins, 0) AS wins,
t.sport_id
FROM (
SELECT
COUNT(*) AS match_count,
m_c.sport_id,
t.wins
FROM match m_c
OUTER APPLY (
SELECT
COUNT(*) AS wins,
match.sport_id
FROM match
WHERE country.country_id = match.winner_id
AND match.sport_id = m_c.sport_id
GROUP BY match.sport_id
) t
WHERE country.country_id IN (m_c.player1, m_c.player2)
GROUP BY m_c.sport_id, t.wins
) t
ORDER BY t.wins ASC, t.match_count DESC
) sport_wins
JOIN sport ON sport.sport_id = sport_wins.sport_id
If losses do not matter and only the number of wins is of interest, you can use a query like this one:
WITH cte AS (
SELECT
country.country_id,
sport.sport_id,
SUM(CASE WHEN match.winner_id = country.country_id THEN 1 ELSE 0 END) AS wins
FROM country
CROSS JOIN sport
JOIN match ON match.sport_id = sport.sport_id
AND country.country_id IN (match.player1, match.player2)
GROUP BY country.country_id, sport.sport_id
)
SELECT
country.name,
sport.name,
t.min_wins AS wins
FROM (
SELECT
country_id,
MIN(wins) AS min_wins
FROM cte
GROUP BY country_id
) t
JOIN cte ON cte.country_id = t.country_id AND cte.wins = min_wins
JOIN country ON cte.country_id = country.country_id
JOIN sport ON cte.sport_id = sport.sport_id
This query only counts sports in which the country has actually played matches. A sport the country never competed in is left out of the statistics; otherwise it would show 0 wins and would always be the minimum.
You need to first cross-join the sports with the countries, then get the total.
Then you can use a row-number approach to get the bottom country in each sport
SELECT
c.Country,
c.Sport,
c.Wins
FROM (
SELECT
c.name Country,
s.name Sport,
COUNT(m.winner_id) Wins,
ROW_NUMBER() OVER (PARTITION BY s.sport_id, s.name ORDER BY COUNT(m.winner_id)) rn
FROM country c
CROSS JOIN sport s
LEFT JOIN [match] m
ON s.sport_id = m.sport_id AND m.winner_id = c.country_id
GROUP BY
s.sport_id,
s.name,
c.country_id,
c.name
) c
WHERE c.rn = 1;
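The row-number query above partitions by sport, so it returns the weakest country per sport. If the goal is the weakest sport per country, as in the question, one option is to partition by country instead. The sketch below shows that variant in SQLite (3.25 or later assumed). The match rows are hypothetical, since the question's sample refers to country ids that are not listed, and ties are broken by sport name purely for a deterministic result.

```python
import sqlite3

def least_wins_per_country():
    """Return (country, sport, wins) for each country's lowest-win sport."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE country (country_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE sport (sport_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE "match" (match_id INTEGER PRIMARY KEY, player1 INTEGER,
                              player2 INTEGER, winner_id INTEGER, sport_id INTEGER);
        INSERT INTO country VALUES (1, 'France'), (2, 'England');
        INSERT INTO sport VALUES (1, 'Football'), (2, 'Basketball');
        -- Hypothetical results, not the asker's data.
        INSERT INTO "match" VALUES
            (1, 1, 2, 1, 1), (2, 1, 2, 1, 1), (3, 1, 2, 2, 1),
            (4, 1, 2, 1, 2), (5, 1, 2, 2, 2), (6, 1, 2, 2, 2);
    """)
    rows = con.execute("""
        WITH wins AS (
            -- every country/sport pair, with 0 wins when no match was won
            SELECT c.country_id, c.name AS country, s.name AS sport,
                   COUNT(m.winner_id) AS wins
            FROM country c
            CROSS JOIN sport s
            LEFT JOIN "match" m
                   ON m.sport_id = s.sport_id AND m.winner_id = c.country_id
            GROUP BY c.country_id, c.name, s.sport_id, s.name
        ),
        ranked AS (
            SELECT country, sport, wins,
                   ROW_NUMBER() OVER (PARTITION BY country_id
                                      ORDER BY wins, sport) AS rn
            FROM wins
        )
        SELECT country, sport, wins FROM ranked WHERE rn = 1 ORDER BY country
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in least_wins_per_country():
        print(row)
```

Because of the cross join, a sport a country never played counts as 0 wins here, which differs from the CTE answer that only considers sports the country took part in.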

Sql to get all cities in a country given a tree type database

I have been given a table like this
Id Name Type ParentId
1 US country -1
2 NY state 1
3 NYC city 2
4 Yonkers city 2
5 Washington state 1
6 Seattle city 5
7 Tacoma city 5
8 Canada country -1
9 Manitoba state 8
I want to write a SQL query that lists all the cities in a state.
Example
Country state city
US NY NYC
US NY Yonkers
I get that I need to write a recursive query, but I am not able to do so. I need help writing the SQL for this.
You can use a recursive common table expression:
with recursive cte as (
select id, name, type, parentid
from the_table
where type = 'state'
and name = 'NY'
union all
select c.id, c.name, c.type, c.parentid
from the_table c
join cte p on p.id = c.parentid
)
select *
from cte
where type <> 'state';
The above is standard ANSI SQL, but not all database products support this exact syntax.
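As a runnable illustration of the recursive approach, the sketch below loads the question's sample rows into SQLite and walks the tree from the country rows downward, carrying the country and state names along so the output has the Country / state / city shape the question asks for. This is a variation on the query above, not the same statement; the table name the_table is borrowed from that query and the function name is hypothetical.

```python
import sqlite3

def cities_in_state(state_name):
    """Return (country, state, city) rows for one state, found recursively."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE the_table (id INTEGER PRIMARY KEY, name TEXT,
                                type TEXT, parentid INTEGER);
        INSERT INTO the_table VALUES
            (1, 'US', 'country', -1),
            (2, 'NY', 'state', 1),
            (3, 'NYC', 'city', 2),
            (4, 'Yonkers', 'city', 2),
            (5, 'Washington', 'state', 1),
            (6, 'Seattle', 'city', 5),
            (7, 'Tacoma', 'city', 5),
            (8, 'Canada', 'country', -1),
            (9, 'Manitoba', 'state', 8);
    """)
    rows = con.execute("""
        WITH RECURSIVE tree(id, name, type, country, state) AS (
            -- anchor: root rows (countries) have parentid = -1
            SELECT id, name, type, name, NULL
            FROM the_table
            WHERE parentid = -1
            UNION ALL
            -- step: attach children, remembering the enclosing state
            SELECT c.id, c.name, c.type, p.country,
                   CASE WHEN c.type = 'state' THEN c.name ELSE p.state END
            FROM the_table c
            JOIN tree p ON p.id = c.parentid
        )
        SELECT country, state, name AS city
        FROM tree
        WHERE type = 'city' AND state = ?
        ORDER BY name
    """, (state_name,)).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in cities_in_state("NY"):
        print(row)
```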
If the number of levels is fixed (so it's always Country -> State -> City) and will never change, you can use a simpler query:
select c.*
from the_table c
where parentid in (select s.id
from the_table s
where s.type = 'state'
and s.name = 'NY');
SELECT t1.name country, t2.name state, t3.name city
FROM the_table t1
JOIN the_table t2 ON t1.id = t2.parentid
JOIN the_table t3 ON t2.id = t3.parentid
WHERE t2.name = 'NY';

Insert occurrence of something from one table into another

There are 2 tables in the above example. Table 1 holds university names and the teams they have in the respective sports. Table 2 holds universities and sports by block name. I want to store in table 1 the number of occurrences of each sport from table 2.
This is just an example table; the real tables are much larger, but the principle is the same.
If you are using SQL Server, you can do this:
update t1
set
Rugby = RugbyCount
, Basketball = BasketballCount
, Baseball = BaseballCount
from table1 t1
join ( select UNI
, count(case when sport ='Rugby' then 1 end) RugbyCount
, count(case when sport ='Basketball' then 1 end) BasketballCount
, count(case when sport ='Baseball' then 1 end) BaseballCount
from table2
group by UNI
) t2
on t1.uni = t2.uni
This is what worked for me on MySQL:
update db.tbl t1
join (select UNI,
count(case when SPORT = 'Rugby' then 1 end) CRugby,
count(case when SPORT = 'Basketball' then 1 end) CBasketball,
count(case when SPORT = 'Baseball' then 1 end) CBaseball
from db.tb2
group by UNI) t2 on t1.UNI = t2.UNI
set Rugby = CRugby,
Basketball = CBasketball,
Baseball = CBaseball;
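Since the original tables are not shown, here is a small sketch with made-up rows that fills the per-sport counters. It deliberately swaps the update-with-join technique for correlated subqueries in the SET clause, which also run on databases without UPDATE ... JOIN or UPDATE ... FROM (SQLite here). The table names table1 and table2 and the column names UNI, sport, Rugby, Basketball, and Baseball follow the answers above; the block column and all data values are assumptions.

```python
import sqlite3

def fill_sport_counts():
    """Populate per-sport counters in table1 from occurrences in table2."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE table1 (UNI TEXT PRIMARY KEY, Rugby INTEGER,
                             Basketball INTEGER, Baseball INTEGER);
        CREATE TABLE table2 (UNI TEXT, block TEXT, sport TEXT);
        -- Hypothetical data; the question's real tables are not shown.
        INSERT INTO table1 VALUES ('A', 0, 0, 0), ('B', 0, 0, 0);
        INSERT INTO table2 VALUES
            ('A', 'b1', 'Rugby'),
            ('A', 'b2', 'Rugby'),
            ('A', 'b1', 'Baseball'),
            ('B', 'b1', 'Basketball');
    """)
    # One correlated COUNT(*) per sport column; universities with no
    # matching rows in table2 get 0 rather than NULL.
    con.execute("""
        UPDATE table1 SET
            Rugby = (SELECT COUNT(*) FROM table2
                     WHERE table2.UNI = table1.UNI AND table2.sport = 'Rugby'),
            Basketball = (SELECT COUNT(*) FROM table2
                          WHERE table2.UNI = table1.UNI AND table2.sport = 'Basketball'),
            Baseball = (SELECT COUNT(*) FROM table2
                        WHERE table2.UNI = table1.UNI AND table2.sport = 'Baseball')
    """)
    rows = con.execute(
        "SELECT UNI, Rugby, Basketball, Baseball FROM table1 ORDER BY UNI"
    ).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in fill_sport_counts():
        print(row)
```

On large tables the join against a single grouped subquery, as in the answers above, is likely to do less work than three subqueries per row, so treat this as a portability sketch rather than a performance recommendation.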

How to use GROUP BY and JOIN in a SQL Statement?

I have two tables called "Player" and "Country":
Player
Person Goals Country
------- ----- -------
Pogba 1 France
Pavard 1 France
Griezmann 2 France
Neymar 3 Brazil
Silva 2 Brazil
Country
Name Continent
------- --------------
France Europe
Brazil South America
I want to show the sum of goals for each country and display the country name, continent, and total goals
So, I would like my output to look like this:
Country Continent Goals
------- ------------- ------
France Europe 4
Brazil South America 5
I can display Country & Continent together and Country & Goals together but I can't do all three.
Here is what I tried:
SELECT Country.Name, Country.Continent, SUM(Player.Goals)
FROM Player
INNER JOIN Country ON Player.Country = Country.Name
GROUP BY Player.Country;
Maybe I'm over simplifying it? I just don't know how I can get the desired result.
Try the below - add Country.Continent to group by too
SELECT Country.Name, Country.Continent, SUM(Player.Goals)
FROM Player
INNER JOIN Country ON Player.Country = Country.Name
GROUP BY Country.Name, Country.Continent
SELECT Country.Continent, Player.Country, SUM(Player.Goals)
FROM Player
INNER JOIN Country ON Player.Country = Country.Name
GROUP BY Country.Continent, Player.Country
Try this:
SELECT Country.Name, Country.Continent, SUM(Player.Goals) as Goals
FROM Player
INNER JOIN Country ON Player.Country = Country.Name
GROUP BY Country.Name, Country.Continent;
Although adding continent to the GROUP BY is definitely one solution, there are others.
First, you can use an aggregation function on continent:
SELECT c.Name, MAX(c.Continent) as Continent, SUM(p.Goals)
FROM Player p JOIN
Country c
ON p.Country = c.Name
GROUP BY c.Name;
Or, you can use a correlated subquery:
SELECT c.*,
(SELECT SUM(p.Goals)
FROM Player p
WHERE p.Country = c.Name
) as goals
FROM Country c;
Note that both of these include table aliases, which make the query easier to write and read.
The advantage of the last version is that it avoids an aggregation in the outer query and also it allows you to easily choose as many columns from Country as you like. It can also take advantage of an index on Player(Country), if available -- although that is irrelevant on smaller amounts of data.
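To compare the two shapes discussed here, the sketch below runs both the GROUP BY on Name and Continent and the correlated-subquery version against the question's sample data in SQLite. The function name is hypothetical.

```python
import sqlite3

def goals_by_country():
    """Return the results of the GROUP BY query and the correlated-subquery query."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Country (Name TEXT PRIMARY KEY, Continent TEXT);
        CREATE TABLE Player (Person TEXT, Goals INTEGER, Country TEXT);
        INSERT INTO Country VALUES
            ('France', 'Europe'),
            ('Brazil', 'South America');
        INSERT INTO Player VALUES
            ('Pogba', 1, 'France'),
            ('Pavard', 1, 'France'),
            ('Griezmann', 2, 'France'),
            ('Neymar', 3, 'Brazil'),
            ('Silva', 2, 'Brazil');
    """)
    # Every non-aggregated column in the SELECT list appears in GROUP BY.
    grouped = con.execute("""
        SELECT c.Name, c.Continent, SUM(p.Goals) AS Goals
        FROM Player p
        INNER JOIN Country c ON p.Country = c.Name
        GROUP BY c.Name, c.Continent
        ORDER BY c.Name
    """).fetchall()
    # No outer aggregation: the sum is computed per Country row.
    correlated = con.execute("""
        SELECT c.Name, c.Continent,
               (SELECT SUM(p.Goals) FROM Player p WHERE p.Country = c.Name) AS Goals
        FROM Country c
        ORDER BY c.Name
    """).fetchall()
    con.close()
    return grouped, correlated

if __name__ == "__main__":
    grouped, correlated = goals_by_country()
    print(grouped)
    print(correlated)
```

The two results match on this data. They would differ for a country with no players: the inner join drops it, while the correlated subquery keeps it with a NULL total.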

Calculate Distinct count when joining two tables

id1 id2 year State Gender
==== ====== ====== ===== =======
1 A 2008 ca M
1 B 2008 ca M
3 A 2009 ny F
3 A 2008 ny F
4 A 2009 tx F
select
state, gender, [year],
count (distinct(cast(id1 as varchar(10)) + id2))
from
tabl1
group by state, gender, [year]
With this query I can get the distinct count by state.
Now I need the distinct count by city. For example, CA has 3 cities: sfo, la, sanjose. I have a lookup table that maps states to cities.
table2 - city
====
cityid name
==== ====
1 sfo
2 la
3 sanjose
table 3 - state
====
stateid name
==== ====
1 CA
2 Az
table 4 lookup state city
====
pk_cityId pk_state_id
1 1
2 1
select state,city,gender, [year],
count (distinct(cast(id1 as varchar(10)) + id2))
from
tabl1 p
group by state, gender, [year],city
This is the query to find the city and state names:
select c.city,s.state from city_state sc
inner join (select * from state)s on sc.state_id = s.state_id
inner join (select * from city)c on sc.city_id = c.city_id
I wrote a query similar to this using the lookup table, but the problem is that I get the distinct count for the whole state, and the same count is repeated for each city in the state.
For example, if the count for CA is 10, the counts for the cities should be something like la - 5, sanjose - 4, sfo - 1.
But with my query I get sfo - 10, la - 10, sanjose - 10. I could not get the count at the lower level. Any help would be appreciated.
UPDATE:
I have updated the query and the lookup tables.
Your implied schema seems to have a flaw:
You're trying to get city-level aggregates, but you are joining your data table (table1) to your city table (table2) based on the state. This will cause EVERY city in the same state to have the same aggregate values; in your case, all California cities get a count of 10.
Can you provide actual DDL statements for your two tables? Perhaps you have other columns there (city_id?) that might provide the necessary data for you to correct your query.
I think you need something like the following, but I can't be sure without further information:
;WITH DistinctState AS
(
SELECT DISTINCT
id1
, id2
, [year]
, [State]
, Gender
FROM tab1
)
SELECT s.state
, c.city
, gender
, [year]
, count(*)
FROM DistinctState s
INNER JOIN
tab2 c
ON s.id1 = c.id1
AND s.id2 = c.id2
GROUP BY
s.state
, c.city
, gender
, [year]
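The repeated-count effect described in this answer can be reproduced with a tiny example. The sketch below assumes, hypothetically, that the data table carries a city column (the question does not show one) and compares a join on state only with grouping on the row's own city. Table names, column names, and rows are all made up for illustration, and SQLite's || stands in for the + string concatenation used in the SQL Server queries.

```python
import sqlite3

def city_counts():
    """Compare distinct counts when joining on state vs. using a city key."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- Hypothetical schema: 'city' on the data table is an assumption.
        CREATE TABLE people (id1 INTEGER, id2 TEXT, year INTEGER,
                             state TEXT, gender TEXT, city TEXT);
        CREATE TABLE city_state (city TEXT, state TEXT);
        INSERT INTO people VALUES
            (1, 'A', 2008, 'ca', 'M', 'la'),
            (1, 'B', 2008, 'ca', 'M', 'la'),
            (2, 'A', 2008, 'ca', 'M', 'sfo');
        INSERT INTO city_state VALUES
            ('la', 'ca'), ('sfo', 'ca'), ('sanjose', 'ca');
    """)
    # Joining on state alone pairs every row with every city in that state,
    # so each city reports the state-wide distinct count.
    by_state_join = con.execute("""
        SELECT cs.city, COUNT(DISTINCT CAST(p.id1 AS TEXT) || p.id2)
        FROM people p
        JOIN city_state cs ON cs.state = p.state
        GROUP BY cs.city
        ORDER BY cs.city
    """).fetchall()
    # Grouping on the row's own city gives true city-level counts.
    by_city_key = con.execute("""
        SELECT p.city, COUNT(DISTINCT CAST(p.id1 AS TEXT) || p.id2)
        FROM people p
        GROUP BY p.city
        ORDER BY p.city
    """).fetchall()
    con.close()
    return by_state_join, by_city_key

if __name__ == "__main__":
    fanned_out, per_city = city_counts()
    print(fanned_out)
    print(per_city)
```

Without some city-level key on the data rows, there is nothing in the query that can split a state's total across its cities.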