Calculated column - SQL - Football Teams - sql

I'd like to add a new second column to a 'teams' table that represents Premier League (UK) football rankings. At the moment the table contains only the name of each football team.
The column will be called 'Played' and will list the number of games each team has played. I'd like to calculate this number (integer data type) from a separate table called 'games', which records a historic log of game fixtures. This would probably involve SQL's native COUNT function.
I have tried to use a function to do this, but currently it returns '0' for every team:
CREATE FUNCTION [dbo].[GetPlayed](@Team VARCHAR)
RETURNS INT AS
BEGIN
    RETURN (SELECT COUNT(*) FROM games
            WHERE games.Home = @Team OR games.Away = @Team);
END
GO
ALTER TABLE teams ADD Played AS dbo.GetPlayed(Team)
The tables:
Crystal Palace
Hull City
Leicester City
Manchester City
Manchester United
Stoke City
Swansea City
Tottenham Hotspur
West Bromwich Albion
West Ham United
gameID Home HomeScore Away AwayScore GameDate
4 Arsenal 2 Chelsea 0 2018-05-26
5 Arsenal 5 Bournemouth 0 2018-04-22
6 Arsenal 1 Leicester City 1 2018-03-15
7 Bournemouth 5 Liverpool 0 2018-04-22
8 Burnley 5 Bournemouth 0 2018-04-22
9 Burnley 1 Swansea City 2 2017-11-22
10 Stoke City 0 Burnley 0 2018-01-08
11 Chelsea 1 Middlesborough 2 2017-11-22
12 Southampton 0 Chelsea 0 2018-01-01
13 Crystal Palace 1 Everton 2 2018-03-26
14 Manchester United 4 Crystal Palace 0 2018-06-01
15 Crystal Palace 0 Southampton 1 2018-04-16
16 Everton 1 Hull City 2 2017-11-20
17 Manchester City 4 Everton 0 2017-11-20
18 Hull City 0 Burnley 0 2018-06-01
19 Sunderland 2 Hull City 0 2018-06-15
20 Leicester City 3 Tottenham Hotspur 1 2017-09-20
21 Swansea City 2 Leicester City 5 2018-02-15
22 Sunderland 0 Leicester City 1 2018-01-29
23 Liverpool 3 Tottenham Hotspur 0 2018-02-28
24 Stoke City 1 Liverpool 2 2017-09-19
25 Manchester City 2 Manchester United 4 2018-05-02
26 Middlesborough 1 Southampton 1 2018-02-08
27 Stoke City 2 Middlesborough 2 2017-08-19
28 Swansea City 0 Manchester United 5 2018-06-27
29 Sunderland 1 Tottenham Hotspur 2 2017-09-01
Any help would be much appreciated!
Thanks, Rob

VARCHAR without a size defaults to 1 character in a parameter declaration, so you need to change your function declaration:
CREATE FUNCTION [dbo].[GetPlayed](@Team VARCHAR(32))
Without a size, your parameter @Team receives only the first letter of the team value you pass in, so the WHERE clause cannot match any row in your games table.
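The effect can be reproduced outside SQL Server. The following is a minimal sketch using Python's built-in sqlite3 module and four rows copied from the games table in the question. The helper name games_played is an assumption made for illustration, and the one-character truncation is simulated with a string slice because SQLite does not enforce VARCHAR lengths.

```python
import sqlite3

# Four rows copied from the games table in the question.
GAMES = [
    (4, "Arsenal", 2, "Chelsea", 0, "2018-05-26"),
    (5, "Arsenal", 5, "Bournemouth", 0, "2018-04-22"),
    (6, "Arsenal", 1, "Leicester City", 1, "2018-03-15"),
    (7, "Bournemouth", 5, "Liverpool", 0, "2018-04-22"),
]


def games_played(conn, team):
    """Count the games in which the team appears as Home or Away."""
    row = conn.execute(
        "SELECT COUNT(*) FROM games WHERE Home = ? OR Away = ?",
        (team, team),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE games (gameID INTEGER, Home TEXT, HomeScore INTEGER, "
    "Away TEXT, AwayScore INTEGER, GameDate TEXT)"
)
conn.executemany("INSERT INTO games VALUES (?, ?, ?, ?, ?, ?)", GAMES)

full_name = games_played(conn, "Arsenal")
# Simulate a VARCHAR(1) parameter: only the first character survives.
truncated = games_played(conn, "Arsenal"[:1])
print(full_name, truncated)
```

With the full name the count is non-zero, while the truncated value "A" matches no Home or Away entry, which is consistent with every team showing 0.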


How to retrieve the population of all parent and child rows in Oracle SQL?

I have a table "TB_Population" with population records from all over the world.
I want to calculate the total population for each title in its own row
and show each title's level in the hierarchy.
The table contains the following data:
1 WORLD 10
2 AFRICA 1 5
3 ASIA 1 10
4 EUROPE 1 4
6 FRANCE 4 10
7 ITALY 4 4
8 JAPAN 3 6
10 SPAIN 4 9
11 INDIA 3 8
13 USA 14 10
14 AMERICA 1 10
15 NEWYORK 13 5
The expected output table should be as below
1 WORLD 100 1
2 AFRICA 6 2
3 ASIA 24 2
4 EUROPE 35 2
6 FRANCE 10 3
7 ITALY 4 3
8 JAPAN 6 3
10 SPAIN 9 3
11 INDIA 8 3
13 USA 15 3
14 AMERICA 25 2
15 NEWYORK 5 4
Thanks and best regards
The tricky part which I see here is you want the LEVEL of title from "BOTTOM TO TOP" and POPULATION from "TOP TO BOTTOM". For example, AMERICA's level has to be 2 which means the LEVEL has to be measured from AMERICA -> WORLD, but AMERICA's population has to be 25 which is the sum of population measured from AMERICA -> NEWYORK. So, I tried this:
Hope this helps you
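The answerer's Oracle query is not reproduced above. As a rough illustration of the idea it describes (level measured upward to the root, population summed downward over descendants), here is a small Python sketch. It is not the original query. The column meanings (id, title, parent id, population) are assumed from the layout of the question's data, and summarize is a hypothetical helper name.

```python
from collections import defaultdict

# (id, title, parent_id, population) rows as listed in the question.
# The column names are assumed, since the question shows no header row.
ROWS = [
    (1, "WORLD", None, 10), (2, "AFRICA", 1, 5), (3, "ASIA", 1, 10),
    (4, "EUROPE", 1, 4), (6, "FRANCE", 4, 10), (7, "ITALY", 4, 4),
    (8, "JAPAN", 3, 6), (10, "SPAIN", 4, 9), (11, "INDIA", 3, 8),
    (13, "USA", 14, 10), (14, "AMERICA", 1, 10), (15, "NEWYORK", 13, 5),
]


def summarize(rows):
    """Return {title: (subtree_population, level)} for every row."""
    parent = {rid: pid for rid, _, pid, _ in rows}
    own = {rid: pop for rid, _, _, pop in rows}
    children = defaultdict(list)
    for rid, pid in parent.items():
        if pid is not None:
            children[pid].append(rid)

    def level(rid):
        # Bottom to top: count the steps up to the root (root is level 1).
        depth = 1
        while parent[rid] is not None:
            rid = parent[rid]
            depth += 1
        return depth

    def total(rid):
        # Top to bottom: own population plus that of all descendants.
        return own[rid] + sum(total(child) for child in children[rid])

    return {title: (total(rid), level(rid)) for rid, title, _, _ in rows}


result = summarize(ROWS)
print(result["AMERICA"])  # (25, 2)
```

For the AMERICA, USA and NEWYORK rows this gives the same population and level as the expected output in the question.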

Highest occupant with his details using view & max?

I need a query that returns the details of the house with the highest occupancy.
Two tables, Houses and Tenancy_histories, are used to calculate the highest occupancy.
First I use the DATEDIFF function to get the duration of each stay.
Then I create a view so that it acts as a virtual table and I can read the column to get the max value.
create view [dbo].[vWHouseStay]
as
select profile_id,
furnishing_type,
DATEDIFF(MONTH, move_in_date, move_out_date) AS [Total Length of stay]
from Tenancy_histories
join houses on tenancy_histories.house_Id = houses.house_id
What I need is highest occupant with house details. How do I do that? Basically it should just return one house with max [Total Length of stay].
Table structure (Tenancy_histories first, then Houses):
Field Type Null Key Default
id int(11) NO PRI auto_increment
profile_id int(11) NO FK
house_id int(11) NO FK
move_in_date date NO
move_out_date date YES
rent int(11) NO
Bed_type varchar(255) YES
move_out_reason varchar(255) YES
Field Type Null Key Default
house_id int(11) NO PRI auto_increment
house_type varchar(255) YES
bhk_details varchar(255) YES
bed_count int(11) NO
furnishing_type varchar(255) YES
Beds_vacant int(11) NO
sample data
house_id house_type bhk_details bed_count furnishing_type OccupancyDays
5 Independent 4 BHK 4 fully-furnished 443
7 Apartment 3 BHK 3 semifurnished 417
4 Apartment 2 BHK 2 fully-furnished 397
18 Independent 2 BHK 2 fully-furnished 358
16 Apartment 3 BHK 3 fully-furnished 324
19 Independent 3 BHK 3 fully-furnished 290
3 Apartment 3 BHK 6 fully-furnished 226
1 Apartment 3 BHK 5 unfurnished NULL
2 Apartment 3 BHK 3 unfurnished NULL
17 Independent 3 BHK 3 fully-furnished NULL
6 Apartment 3 BHK 3 semifurnished NULL
8 Apartment 2 BHK 4 fully-furnished NULL
sample data [tenancy histories]
id profile_id house_id move_in_date move_out_date rent Bed_type move_out_reason
242 1 5 2015-02-12 2016-04-30 7500 bed MOVE_OUT
243 2 2 2015-06-05 NULL 11000 room
244 3 4 2015-10-28 2016-11-28 12000 room RENT_CHANGE
245 4 1 2015-04-26 NULL 8000 bed
246 5 3 2015-05-15 2015-12-27 9000 bed MOVE_OUT
247 6 8 2015-12-25 NULL 10200 room
248 7 6 2015-11-20 NULL 6500 bed
249 8 7 2015-11-10 2016-12-31 7200 bed MOVE_OUT
250 9 9 2015-10-15 NULL 7500 bed
251 10 10 2015-06-20 NULL 7500 bed
252 11 19 2015-08-29 2016-06-14 8000 bed INTERNAL_TRANSFER
253 12 15 2015-02-24 NULL 11000 room
254 13 12 2015-02-25 NULL 12000 room
255 14 18 2016-01-07 2016-12-30 13500 room MOVE_OUT
256 15 13 2015-04-07 NULL 6500 bed
257 16 17 2015-04-23 NULL 6500 bed
258 17 14 2015-02-10 NULL 10500 room
259 18 16 2015-10-16 2016-09-04 8000 bed MOVE_OUT
260 19 20 2015-09-26 NULL 7500 bed
261 20 11 2015-09-30 NULL 9500 bed
It looks like you want to aggregate over all tenancy history. Here is how; this can be used in a view as well.
;with cte as (
    select h.house_id, h.house_type, h.bhk_details, h.bed_count, h.furnishing_type
        ,OccupancyDays = sum(datediff(day, th.move_in_date, th.move_out_date))
    from Houses h
    inner join Tenancy_histories th on th.house_id = h.house_id
    group by h.house_id, h.house_type, h.bhk_details, h.bed_count, h.furnishing_type)
select TOP 1 * --you can remove the TOP 1 to bring them all back
from cte
order by OccupancyDays desc
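To sanity-check the aggregation, the same sum-then-take-the-maximum logic can be sketched in plain Python. This is only an illustration using four rows from the sample tenancy data. The helper name occupancy_days is an assumption, and houses whose only stay has no move-out date are simply left out here, whereas the SQL above would list them with a NULL total.

```python
from datetime import date

# (house_id, move_in_date, move_out_date) from the sample tenancy rows;
# None stands for a NULL move_out_date.
TENANCIES = [
    (5, date(2015, 2, 12), date(2016, 4, 30)),
    (2, date(2015, 6, 5), None),
    (4, date(2015, 10, 28), date(2016, 11, 28)),
    (7, date(2015, 11, 10), date(2016, 12, 31)),
]


def occupancy_days(tenancies):
    """Sum the length of every completed stay per house, in days."""
    totals = {}
    for house_id, move_in, move_out in tenancies:
        if move_out is None:
            # An open-ended stay has no duration yet, so it is skipped.
            continue
        days = (move_out - move_in).days
        totals[house_id] = totals.get(house_id, 0) + days
    return totals


totals = occupancy_days(TENANCIES)
top_house = max(totals, key=totals.get)
print(top_house, totals[top_house])  # 5 443
```

The totals for houses 5, 7 and 4 agree with the OccupancyDays values shown in the sample output.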

If last names are similar in [Name] column, fill in missing values of another column

Below is a sample of a much larger dataframe.
Fare Cabin Pclass Ticket Name
257 86.5000 B77 1 110152 Cherry, Miss. Gladys
759 86.5000 B77 1 110152 Rothes, the Countess. of (Lucy Noel Martha Dye...
504 86.5000 B79 1 110152 Maioni, Miss. Roberta
262 79.6500 E67 1 110413 Taussig, Mr. Emil
558 79.6500 E67 1 110413 Taussig, Mrs. Emil (Tillie Mandelbaum)
585 79.6500 NaN 1 110413 Taussig, Miss. Ruth
475 52.0000 A14 1 110465 Clifford, Mr. George Quincy
110 52.0000 C110 1 110465 Porter, Mr. Walter Chamberlain
335 26.0000 C106 1 110469 Maguire, Mr. John Edward
158 26.5500 D22 1 110489 Borebank, Mr. John James
430 26.5500 C52 1 110564 Bjornstrom-Steffansson, Mr. Mauritz Hakan
236 75.2500 D37 1 110813 Warren, Mr. Frank Manley
366 75.2500 D37 1 110813 Warren, Mrs. Frank Manley (Anna Sophia Atkinson)
191 26.0000 NaN 1 111163 Salomon, Mr. Abraham L
170 33.5000 B19 1 111240 Van der hoef, Mr. Wyckoff
462 38.5000 E63 1 111320 Gee, Mr. Arthur H
329 57.9792 Nan 1 111361 Hippach, Miss. Jean Gertrude
523 57.9792 B18 1 111361 Hippach, Mrs. Louis Albert (Ida Sophia Fischer)
I want to fill in missing "Cabin" values with someone else's "Cabin" value, but only if
that other person (the one who has a cabin value) has the same last name and is adjacent in the dataframe (one row above or one row below).
So in the dataframe above, [Taussig, Miss. Ruth]'s missing Cabin value would be replaced with [Taussig, Mrs. Emil]'s Cabin value [E67], who is one row above her, because both conditions are met (same last name and adjacent).
And [Hippach, Miss. Jean Gertrude]'s missing Cabin value would be replaced with
[Hippach, Mrs. Louis Albert (Ida Sophia Fischer)]'s Cabin value of [B18].
I tried to think of iteration but this is as far as I got
for x in df.Name.str.split(',')[x][0] ==df.Name.str.split(',')[x+1][0]:
if df.Cabin[x] or df.Cabin[x+1] == np.nan:
I want to make sure the np.nan value is replaced with a real cabin value and not left as np.nan. I couldn't figure out how to do that.
Starting with your DataFrame
Fare Cabin Pclass Ticket \
0 86.5000 B77 1 110152
1 86.5000 B77 1 110152
2 86.5000 B79 1 110152
3 79.6500 E67 1 110413
4 79.6500 E67 1 110413
5 79.6500 NaN 1 110413
6 52.0000 A14 1 110465
7 52.0000 C110 1 110465
8 26.0000 C106 1 110469
9 26.5500 D22 1 110489
10 26.5500 C52 1 110564
11 75.2500 D37 1 110813
12 75.2500 D37 1 110813
13 26.0000 NaN 1 111163
14 33.5000 B19 1 111240
15 38.5000 E63 1 111320
16 57.9792 NaN 1 111361
17 57.9792 B18 1 111361
0 Cherry, Miss. Gladys
1 Rothes, the Countess. of (Lucy Noel Martha Dye...
2 Maioni, Miss. Roberta
3 Taussig, Mr. Emil
4 Taussig, Mrs. Emil (Tillie Mandelbaum)
5 Taussig, Miss. Ruth
6 Clifford, Mr. George Quincy
7 Porter, Mr. Walter Chamberlain
8 Maguire, Mr. John Edward
9 Borebank, Mr. John James
10 Bjornstrom-Steffansson, Mr. Mauritz Hakan
11 Warren, Mr. Frank Manley
12 Warren, Mrs. Frank Manley (Anna Sophia Atkinson)
13 Salomon, Mr. Abraham L
14 Van der hoef, Mr. Wyckoff
15 Gee, Mr. Arthur H
16 Hippach, Miss. Jean Gertrude
17 Hippach, Mrs. Louis Albert (Ida Sophia Fischer)
Creating a new column/series with just the LastName. Note, might be a better way to do this with pandas str methods, but I couldn't get anything to work
df['LastName'] = df['Name'].map(lambda x : x[:x.find(',')])
Then we leverage Pandas' shift and boolean indexing to see if the passenger above has the same last name (ie the Taussig case)
filter = (df['Cabin'].isnull()) & (df['LastName'] == df['LastName'].shift())
df.loc[filter,'Cabin'] = df['Cabin'].shift()
and then the passenger below by passing a -1 to shift() (ie the Hippach case)
filter = (df['Cabin'].isnull()) & (df['LastName'] == df['LastName'].shift(-1))
df.loc[filter,'Cabin'] = df['Cabin'].shift(-1)
Fare Cabin Pclass Ticket \
0 86.5000 B77 1 110152
1 86.5000 B77 1 110152
2 86.5000 B79 1 110152
3 79.6500 E67 1 110413
4 79.6500 E67 1 110413
5 79.6500 E67 1 110413
6 52.0000 A14 1 110465
7 52.0000 C110 1 110465
8 26.0000 C106 1 110469
9 26.5500 D22 1 110489
10 26.5500 C52 1 110564
11 75.2500 D37 1 110813
12 75.2500 D37 1 110813
13 26.0000 NaN 1 111163
14 33.5000 B19 1 111240
15 38.5000 E63 1 111320
16 57.9792 B18 1 111361
17 57.9792 B18 1 111361
Name LastName
0 Cherry, Miss. Gladys Cherry
1 Rothes, the Countess. of (Lucy Noel Martha Dye... Rothes
2 Maioni, Miss. Roberta Maioni
3 Taussig, Mr. Emil Taussig
4 Taussig, Mrs. Emil (Tillie Mandelbaum) Taussig
5 Taussig, Miss. Ruth Taussig
6 Clifford, Mr. George Quincy Clifford
7 Porter, Mr. Walter Chamberlain Porter
8 Maguire, Mr. John Edward Maguire
9 Borebank, Mr. John James Borebank
10 Bjornstrom-Steffansson, Mr. Mauritz Hakan Bjornstrom-Steffansson
11 Warren, Mr. Frank Manley Warren
12 Warren, Mrs. Frank Manley (Anna Sophia Atkinson) Warren
13 Salomon, Mr. Abraham L Salomon
14 Van der hoef, Mr. Wyckoff Van der hoef
15 Gee, Mr. Arthur H Gee
16 Hippach, Miss. Jean Gertrude Hippach
17 Hippach, Mrs. Louis Albert (Ida Sophia Fischer) Hippach
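For readers who want to trace the two shift passes without pandas, the same neighbour rule can be written as a dependency-free Python sketch. The function name fill_from_neighbours and the use of None for a missing cabin are assumptions made for illustration. The rows are a subset of the question's data.

```python
def fill_from_neighbours(rows):
    """rows: list of (name, cabin) pairs; cabin is None when missing.

    Returns the cabin list after copying from the row above, then from
    the row below, whenever the last names match.
    """
    last = [name.split(",")[0] for name, _ in rows]
    cabins = [cabin for _, cabin in rows]

    # Pass 1 (like shift()): look at the row above.
    above = [None] + cabins[:-1]
    for i in range(1, len(rows)):
        if cabins[i] is None and last[i] == last[i - 1]:
            cabins[i] = above[i]

    # Pass 2 (like shift(-1)): look at the row below.
    below = cabins[1:] + [None]
    for i in range(len(rows) - 1):
        if cabins[i] is None and last[i] == last[i + 1]:
            cabins[i] = below[i]
    return cabins


ROWS = [
    ("Taussig, Mr. Emil", "E67"),
    ("Taussig, Mrs. Emil (Tillie Mandelbaum)", "E67"),
    ("Taussig, Miss. Ruth", None),
    ("Salomon, Mr. Abraham L", None),
    ("Hippach, Miss. Jean Gertrude", None),
    ("Hippach, Mrs. Louis Albert (Ida Sophia Fischer)", "B18"),
]
filled = fill_from_neighbours(ROWS)
print(filled)
```

Salomon stays unfilled because neither neighbour shares the last name, which matches the NaN left in row 13 of the output above.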
groupby + fillna
# back fills, then forward fills
def bffill(x):
    return x.bfill().ffill()
# group by last name; transform keeps the original row index
df['Cabin'] = df.groupby(df.Name.str.split(',').str[0]).Cabin.transform(bffill)

How to write a query to identify names with similar sounds?

How do I write a query to identify names (possibly including non-English names) that sound similar? Soundex does not seem to handle non-English names well.
The code should be able to identify that, for example, the following (or most of them) are names with similar sounds:
Helena - Elena
Violet - Viola
Beatrix - Beatrice
Madeline - Madeleine (ma-duh-LINE vs ma-duh-LEN)
Alice - Elise
Madeline - Adeline
Kristen - Kirsten
Lily - Millie
Charlotte - Scarlett
Zara / Lara / Sara / Mara
Elena - Alana
Emily - Emmeline
Amelia - Amalia
Stella - Bella - Ella
Isabel - Isabeau
Holly - Hallie
Laura - Lara
Fiona - Finola
Louise - Eloise
Cara - Clara
Susanna vs Susannah
Nora vs Norah
Talia vs Tahlia vs Thalia
Catherine vs Katherine
Cecilia vs Cecelia
Lucy vs Lucie
Vivian vs Vivien
Lillian vs Lilian
Gwendolen vs Gwendolyn
Sofia vs Sophia
Isabel vs Isobel vs Isabelle
Seraphina vs Serafina
Juliet vs Juliette
Annabel vs Annabelle
Emily vs Emilie
Elisabeth vs Elizabeth
...and non-English names too.
Would it help to use an algorithm like Levenshtein distance to compare the similarity between two sequences?
Particularly in Oracle, you can use utl_match.
For example:
--Find closest names based on UTL_MATCH.EDIT_DISTANCE.
with names as
(
    --Names data (the list is truncated in this copy).
    select column_value name
    from table(sys.odcivarchar2list('Adeline','Alana','Alice','Amalia','Amelia','Annabel', ...))
)
--Name with the closest matches.
select name1, edit_distance, listagg(name2, ',') within group (order by name2) names
from
(
    --Compare strings.
    select names1.name name1, names2.name name2
        ,utl_match.edit_distance(names1.name, names2.name) edit_distance
        ,min(utl_match.edit_distance(names1.name, names2.name))
            over (partition by names1.name) min_edit_distance
    from names names1
    cross join names names2
    --This cross join could get expensive. It may help to add conditions here to
    --filter out obvious non-matches. For example, maybe throw out rows where the
    --string length is vastly different?
    where names1.name <> names2.name
    order by 1, 3, 2
)
where edit_distance = min_edit_distance
group by name1, edit_distance
order by 1;
NAME1 EDIT_DISTANCE NAMES
----- ------------- -----
Adeline 2 Madeline
Alana 2 Clara,Elena
Alice 2 Elise
Amalia 1 Amelia
Amelia 1 Amalia
Annabel 2 Annabelle
Annabelle 2 Annabel
Beatrice 2 Beatrix
Beatrix 2 Beatrice
Bella 2 Ella,Stella
Cara 1 Clara,Lara,Mara,Sara,Zara
Catherine 1 Katherine
Cecelia 1 Cecilia
Cecilia 1 Cecelia
Charlotte 4 Scarlett
Clara 1 Cara
Elena 2 Alana,Ella,Helena
Elisabeth 1 Elizabeth
Elise 1 Eloise
Elizabeth 1 Elisabeth
Ella 2 Bella,Elena
Eloise 1 Elise
Emilie 2 Emily
Emily 2 Emilie,Lily
Emmeline 3 Adeline,Emilie,Madeline
Finola 2 Fiona,Viola
Fiona 2 Finola,Viola
Gwendolen 1 Gwendolyn
Gwendolyn 1 Gwendolen
Hallie 2 Millie
Helena 2 Elena
Holly 3 Bella,Ella,Emily,Hallie,Lily
Isabeau 2 Isabel
Isabel 1 Isobel
Isabelle 2 Isabel
Isobel 1 Isabel
Juliet 2 Juliette
Juliette 2 Juliet
Katherine 1 Catherine
Kirsten 2 Kristen
Kristen 2 Kirsten
Lara 1 Cara,Laura,Mara,Sara,Zara
Laura 1 Lara
Lilian 1 Lillian
Lillian 1 Lilian
Lily 2 Emily,Lucy
Louise 3 Elise,Eloise,Lucie
Lucie 2 Lucy
Lucy 2 Lily,Lucie
Madeleine 1 Madeline
Madeline 1 Madeleine
Mara 1 Cara,Lara,Sara,Zara
Millie 2 Hallie
Nora 1 Norah
Norah 1 Nora
Sara 1 Cara,Lara,Mara,Zara
Scarlett 4 Charlotte
Serafina 2 Seraphina
Seraphina 2 Serafina
Sofia 2 Sophia
Sophia 2 Sofia
Stella 2 Bella
Susanna 1 Susannah
Susannah 1 Susanna
Tahlia 1 Talia
Talia 1 Tahlia,Thalia
Thalia 1 Talia
Viola 2 Finola,Fiona,Violet
Violet 2 Viola
Vivian 1 Vivien
Vivien 1 Vivian
Zara 1 Cara,Lara,Mara,Sara
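For comparison outside Oracle, the same closest-match idea can be sketched in Python with a hand-written Levenshtein distance. This is an illustrative sketch rather than a replacement for UTL_MATCH. The function names levenshtein and closest are assumptions, and the candidate list is a small sample of the names from the question.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def closest(name, candidates):
    """Return (minimum distance, sorted names at that distance)."""
    others = [c for c in candidates if c != name]
    best = min(levenshtein(name, c) for c in others)
    return best, sorted(c for c in others if levenshtein(name, c) == best)


NAMES = ["Cara", "Clara", "Lara", "Sara", "Kristen", "Kirsten"]
print(closest("Cara", NAMES))     # (1, ['Clara', 'Lara', 'Sara'])
print(closest("Kristen", NAMES))  # (2, ['Kirsten'])
```

As in the SQL version, comparing every name with every other name grows quadratically, so a cheap pre-filter (for example on string length) may be worth adding for large lists.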

Merge two tables using common fields

I have two tables, and I need to bring data from table 1 into table 2 by matching customer name and sale date. In the first table the name is split across two columns, but in the other table it is in one column.
> list(CustomerSales.CSV)
CustomerFirstName CustomerLastName SaleDate_Time InvoiceNo InvoiceValue
1 Hendricks Eric 30-09-2015 13:00 10 5000
2 Fier Marilyn 02-10-2015 15:30 15 18000
3 O'Brien Donna 03-10-2015 13:30 16 25000
4 Perez Barney 03-10-2015 16:10 17 20000
5 Fier Marilyn 04-10-2015 11:10 18 6000
6 Hendricks Eric 05-10-2015 14:00 19 8000
> list(ReturnSales.CSV)
CustomerName SaleDate_Time ReturnDate_Time ReturnNo ReturnValue
1 Hendricks Eric 05-10-2015 14:00 10-10-2015 14:00 1 1000
2 O'Brien Donna 03-10-2015 13:30 15-10-2015 13:30 2 2000
3 Perez Barney 03-10-2015 16:10 12-10-2015 16:10 3 1500
4 Fier Marilyn 02-10-2015 15:30 08-10-2015 15:30 4 2000
The result should be a table like this.
CustomerName SaleDate_Time InvoiceNo InvoiceValue ReturnDate_Time ReturnNo ReturnValue
1 Hendricks Eric 05-10-2015 14:00 19 8000 10-10-2015 14:00 1 1000
2 O'Brien Donna 03-10-2015 13:30 16 25000 15-10-2015 13:30 2 2000
3 Perez Barney 03-10-2015 16:10 17 20000 12-10-2015 16:10 3 1500
4 Fier Marilyn 02-10-2015 15:30 15 18000 08-10-2015 15:30 4 2000
Table 2's CustomerName and SaleDate_Time should be matched against table 1's CustomerFirstName, CustomerLastName, and SaleDate_Time. Then InvoiceNo and InvoiceValue from table 1 should be added to table 2.
Any suggestions?
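If you are looking for a SQL query for the above scenario, you can use something like the below.
SELECT RS.CustomerName, RS.SaleDate_Time, CS.InvoiceNo, CS.InvoiceValue,
       RS.ReturnDate_Time, RS.ReturnNo, RS.ReturnValue
FROM CustomerSales CS
INNER JOIN ReturnSales RS
ON RS.CustomerName = CS.CustomerFirstName + ' ' + CS.CustomerLastName
WHERE RS.SaleDate_Time = CS.SaleDate_Time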
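The matching rule in the query (first name plus a space plus last name, together with the sale timestamp) can also be sketched in plain Python. This is only an illustration of the join key. The function name merge_returns is an assumption, and the rows are copied from the two tables in the question.

```python
# (CustomerFirstName, CustomerLastName, SaleDate_Time, InvoiceNo, InvoiceValue)
SALES = [
    ("Hendricks", "Eric", "30-09-2015 13:00", 10, 5000),
    ("Fier", "Marilyn", "02-10-2015 15:30", 15, 18000),
    ("O'Brien", "Donna", "03-10-2015 13:30", 16, 25000),
    ("Perez", "Barney", "03-10-2015 16:10", 17, 20000),
    ("Fier", "Marilyn", "04-10-2015 11:10", 18, 6000),
    ("Hendricks", "Eric", "05-10-2015 14:00", 19, 8000),
]
# (CustomerName, SaleDate_Time, ReturnDate_Time, ReturnNo, ReturnValue)
RETURNS = [
    ("Hendricks Eric", "05-10-2015 14:00", "10-10-2015 14:00", 1, 1000),
    ("O'Brien Donna", "03-10-2015 13:30", "15-10-2015 13:30", 2, 2000),
    ("Perez Barney", "03-10-2015 16:10", "12-10-2015 16:10", 3, 1500),
    ("Fier Marilyn", "02-10-2015 15:30", "08-10-2015 15:30", 4, 2000),
]


def merge_returns(sales, returns):
    """Attach InvoiceNo and InvoiceValue to each return row."""
    # Key each sale by (full name, sale timestamp), as the SQL join does.
    invoices = {
        (f"{first} {last}", when): (number, value)
        for first, last, when, number, value in sales
    }
    merged = []
    for name, sale_time, return_time, return_no, return_value in returns:
        match = invoices.get((name, sale_time))
        if match is not None:  # inner join: drop returns with no sale
            merged.append(
                (name, sale_time, *match, return_time, return_no, return_value)
            )
    return merged


merged = merge_returns(SALES, RETURNS)
for row in merged:
    print(row)
```

Matching on the timestamp as well as the name is what separates the two Hendricks Eric sales, so the return is paired with invoice 19 rather than invoice 10.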