Can not use group by function - sql

Data on Table:-
wkt Partners Team Opponent Runs Balls
1 S Hope & E Lewis WEST INDIES SOUTH AFRICA 43 66
2 S Hope & S Hetmyer WEST INDIES SOUTH AFRICA 70 79
3 D Bravo & S Hetmyer WEST INDIES SOUTH AFRICA 84 97
1 J Malan & Q Kock SOUTH AFRICA WEST INDIES 3 4
2 J Malan & F Plessis SOUTH AFRICA WEST INDIES 32 44
3 J Malan & R Dussen SOUTH AFRICA WEST INDIES 100 90
1 S Dhawan & R Sharma INDIA IRELAND 3 8
2 V Kohli & R Sharma INDIA IRELAND 102 70
I want to return the pair of partners, team they belong to, opponent they play against only once for each wkt where runs are highest for that particular wkt
For above table I'd like result as follow
wkt Partners Team Opponent Runs Balls
1 S Hope & E Lewis WEST INDIES SOUTH AFRICA 43 66
2 V Kohli & R Sharma INDIA IRELAND 102 70
3 J Malan & R Dussen SOUTH AFRICA WEST INDIES 100 90
Following is the code that I've used
SELECT wkt, Partners, Team, Opponent, max(Runs), Balls
FROM Partnerships
GROUP BY wkt
But I've been stuck with following error
Column 'Partnerships.Partners' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

How about row_number()?
select p.*
from (select p.*, row_number() over (partition by wkt order by runs desc) as seqnum
from Partnerships p
) p
where seqnum = 1;

Related

extracting data using beautifulsoup from wiki

I pretty new to this,
What I am trying to accomplished is having a table with distrcits and their various neighborhoods but my final code just list all neighborhoods in a list format without assigning them to a specific district.
url = "https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Toronto"
html = urlopen(url)
soup = BeautifulSoup(html, 'lxml')
type(soup)
print(soup.prettify())
Toronto_table = soup.find('table',{'class':'wikitable sortable'})
links = Toronto_table.find_all('a')
neighborhoods = []
for link in links:
neighborhoods.append(link.get('title'))
print(neighborhoods)
df_neighborhoods = pd.DataFrame(neighborhoods)
df_neighborhoods
You can simply read_html and print the table.
import pandas as pd
f_states=pd.read_html('https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Toronto')
print(f_states[6])
Output :
District Number Neighbourhoods Included
0 C01 Downtown, Harbourfront, Little Italy, Little P...
1 C02 The Annex, Yorkville, South Hill, Summerhill, ...
2 C03 Forest Hill South, Oakwood–Vaughan, Humewood–C...
3 C04 Bedford Park, Lawrence Manor, North Toronto, F...
4 C06 North York, Clanton Park, Bathurst Manor
5 C07 Willowdale, Newtonbrook West, Westminster–Bran...
6 C08 Cabbagetown, St. Lawrence Market, Toronto wate...
7 C09 Moore Park, Rosedale
8 C10 Davisville Village, Midtown Toronto, Lawrence ...
9 C11 Leaside, Thorncliffe Park, Flemingdon Park
10 C13 Don Mills, Parkwoods–Donalda, Victoria Village
11 C14 Newtonbrook East, Willowdale East
12 C15 Hillcrest Village, Bayview Woods – Steeles, Ba...
13 E01 Riverdale, Danforth (Greektown), Leslieville
14 E02 The Beaches, Woodbine Corridor
15 E03 Danforth (Greektown), East York, Playter Estat...
16 E04 The Golden Mile, Dorset Park, Wexford, Maryval...
17 E05 Steeles, L'Amoreaux, Tam O'Shanter – Sullivan
18 E06 Birch Cliff, Oakridge, Hunt Club, Cliffside
19 E08 Scarborough Village, Cliffcrest, Guildwood, Eg...
20 E09 Scarborough City Centre, Woburn, Morningside, ...
21 E10 Rouge (South), Port Union (Centennial Scarboro...
22 E11 Rouge (West), Malvern
23 W01 High Park, South Parkdale, Swansea, Roncesvall...
24 W02 Bloor West Village, Baby Point, The Junction (...
25 W03 Keelesdale, Eglinton West, Rockcliffe–Smythe, ...
26 W04 York, Glen Park, Amesbury (Brookhaven), Pelmo ...
27 W05 Downsview, Humber Summit, Humbermede (Emery), ...
28 W06 New Toronto, Long Branch, Mimico, Alderwood
29 W07 Sunnylea (The Queensway – Humber Bay)
30 W08 The Kingsway, Central Etobicoke, Eringate – Ce...
31 W09 Kingsview Village-The Westway, Richview (Willo...
32 W10 Rexdale, Clairville, Thistletown - Beaumond He...

sql select 1 item from list

i want to select a column from a table which can have another column reference many times.
select t1.name
from ccp.ENTITIES t1
Non
Albania
Australia
China
Czech Republic
Egypt
Germany
Greece
Group
Hungary
India
Ireland
Italy
Luxembourg
Malaysia
Malta
Netherlands
Portugal
Romania
Spain
Turkey
UK
US
this will give me a list of names of which i want 1 row from another table
v_networks_by_lm this table holds records with column t1.name and network. i want the column network only once for each item in the list. v_networks_by_lmcan hold many t1.name
entity name
a Spain
b Spain
c Spain
d Spain
e Spain
f Spain
g Spain
h Germany
i Germany
j Germany
k Germany
l Germany
m Germany
n Germany
o Germany
p UK
q Germany
r Spain
s Spain
t Portugal
u Portugal
v Portugal
q Portugal
from the above data which is in v_networks_by_lm i only want name returned once with any value of entity. but i want to pick the name from ENTITIES as it can be dynamic
I think aggregation does what you want:
SELECT MAX(n.network) as network, e.name
FROM ccp.ENTITIES e JOIN
ccp.v_networks_by_lm n
ON n.name = e.name
GROUP BY e.name;
Sounds like you want a subquery to get the single instance of name from the table, and then you do the join against entities.
Select sub.one_of_entity_values, sub.name
from ccp.entities e
inner join (
select max(entity) as one_of_entity_values, name
from v_networks_by_lm
group by name) sub on e.name = sub.name

retrieving data from field which has multiple values

We have table person. It has sample fields with multiple values like
person
ID name tripNumber startPlace endPlace
1 xxx 20 Portland Atlanta
25 California Atlanta
40 America Africa
2 EKVV 40 America Africa
37 Argentina Carolina
We need to retrieve entire row of data in particular condition like tripNumber=40 and endPlace="Africa"
We need the result like this,
ID name tripNumber startPlace endPlace
1 xxx 40 America Africa
2 EKVV 40 America Africa
Below is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.person` AS (
SELECT 1 id, 'xxx' name, [20, 25, 40] tripNumber, ['Portland', 'California', 'America'] startPlace, ['Atlanta', 'Atlanta', 'Africa'] endPlace UNION ALL
SELECT 2, 'EKVV', [40, 37], ['America', 'Argentina'], ['Africa', 'Carolina']
)
SELECT id, name, tripNumber, startPlace[SAFE_OFFSET(off)] startPlace, endPlace[SAFE_OFFSET(off)] endPlace
FROM `project.dataset.person`,
UNNEST(tripNumber) tripNumber WITH OFFSET off
WHERE tripNumber = 40
with result
Row id name tripNumber startPlace endPlace
1 1 xxx 40 America Africa
2 2 EKVV 40 America Africa
Above solution assumes that you have independent repeated fields and match to be done based on positions in respective arrays
Below - is based on more common pattern of having repeated record
so if person table would look like below
Row id name trips.tripNumber trips.startPlace trips.endPlace
1 1 xxx 20 Portland Atlanta
25 California Atlanta
40 America Africa
2 2 EKVV 40 America Africa
37 Argentina Carolina
in this case solution would be
#standardSQL
WITH `project.dataset.person` AS (
SELECT 1 id, 'xxx' name, [STRUCT<tripNumber INT64, startPlace STRING, endPlace STRING>(20, 'Portland', 'Atlanta'),(25, 'California', 'Atlanta'),(40, 'America', 'Africa')] trips UNION ALL
SELECT 2, 'EKVV', [STRUCT(40, 'America', 'Africa'),(37, 'Argentina', 'Carolina')]
)
SELECT id, name, tripNumber, startPlace, endPlace
FROM `project.dataset.person`,
UNNEST(trips) trip
WHERE tripNumber = 40
with result
Row id name tripNumber startPlace endPlace
1 1 xxx 40 America Africa
2 2 EKVV 40 America Africa

SQL Select Distinct returning duplicates

I am trying to return the country, golfer name, golfer age, and average drive for the golfers with the highest average drive from each country.
However I am getting a result set with duplicates of the same country. What am I doing wrong? here is my code:
select distinct country, name, age, avgdrive
from pga.golfers S1
inner join
(select max(avgdrive) as MaxDrive
from pga.golfers
group by country) S2
on S1.avgdrive = s2.MaxDrive
order by avgdrive;
These are some of the results I've been getting back, I should only be getting 15 rows, but instead I'm getting 20:
COUN NAME AGE AVGDRIVE
---- ------------------------------ ---------- ----------
Can Mike Weir 35 279.9
T&T Stephen Ames 41 285.8
USA Tim Petrovic 39 285.8
Ger Bernhard Langer 47 289.3
Swe Fredrik Jacobson 30 290
Jpn Ryuji Imada 28 290
Kor K.J. Choi 37 290.4
Eng Greg Owen 33 291.8
Ire Padraig Harrington 33 291.8
USA Scott McCarron 40 291.8
Eng Justin Rose 25 293.1
Ind Arjun Atwal 32 293.7
USA John Rollins 30 293.7
NIr Darren Clarke 37 294
Swe Daniel Chopra 31 297.2
Aus Adam Scott 25 300.6
Fij Vijay Singh 42 300.7
Spn Sergio Garcia 25 301.9
SAf Ernie Els 35 302.9
USA Tiger Woods 29 315.2
You are missing a join condition:
select s1.country, s1.name, s1.age, s1.avgdrive
from pga.golfers S1 inner join
(select country, max(avgdrive) as MaxDrive
from pga.golfers
group by country
) S2
on S1.avgdrive = s2.MaxDrive and s1.country = s2.country
order by s1.avgdrive;
Your problem is that some people in one country have the same average as the best in another country.
DISTINCT eliminated duplicate rows, not values in some fields.
To get a list of countries with ages, names, and max drives, you would need to group the whole select by country.

Same-table Tree Table Query in SQL Server

I've searched but found nothing that could help.
I have the following table in a SQL Server 2005 database:
Parent Child Value
---- -------- ---------
America Mexico 8
America Canada 1
Asia Japan 5
Asia Korea 7
Europe Spain 0
Europe Italy 2
Africa Zimbabwe 1
Mexico Baja California 0
America USA 3
USA California 1
USA Texas 2
Parent and Child are Primary Key, value is not important (IMO). I would like to create a view that results in something like this:
Parent Child Value
---- -------- ---------
America USA 3
USA California 1
USA Texas 2
I would search for America, and the result will give back every nested child there is, recursively, no matter how many it has, since I could include cities, localities, etc.
What I need is similar to what some call a BOM explosion.
Here is how you can do it:
with cte as (
select parent, child
from t
union all
select cte.parent, t.child
from cte join
t
on cte.child = t.parent
)
select cte.*
from cte
where parent = 'America';
Here is a small SQL Fiddle example.