PostgreSQL query using recursion and self-join - sql

I have an assignment and I am having trouble with one question. Basically I have a table like the one below. Alex is a player and the table shows the team he has played with in every season. Note that a season starts in a specific year and ends in the following year. I need to use only SQL (no cursors) to produce the output illustrated in the second table, where Alex's career is shown in only two rows, as opposed to the first table, where it is shown in four rows.
I have tried hard to solve this question but cannot work out how to produce the output in the second table. I believe I have to use a CTE, since the Year_End of each row is equal to the Year_Start of the following row. I have also searched the net, but since this is a very specific question I could not find any relevant solutions. I have posted my query so far, since I think I am on the right track, but now I'm stuck.
**TABLE records**
Id | Name | Team_Name | Year_Start | Year_End
---------------------------------------------------
100 | Alex | New Team | 2010 | 2011
101 | Alex | New Team | 2011 | 2012
102 | Alex | Best Eleven | 2012 | 2013
103 | Alex | Best Eleven | 2013 | 2014
**Required result from query**
Name | Team Name | Year_Start | Year_End
-------------------------------------------
Alex | New Team | 2010 | 2012
Alex | Best Eleven | 2012 | 2014
My query so far...
WITH RECURSIVE cte(id, name, team_name, year_start, year_end) AS
(
SELECT *
FROM history
WHERE name = 'Alex'
UNION ALL
SELECT history.id, history.name, history.team_name, history.year_start, history.year_end
FROM cte, history
WHERE cte.year_start = history.year_end
)
SELECT *
FROM cte;
Query that produced the requested result.
WITH RECURSIVE cte(id, name, team_name, year_start, year_end) AS
(
SELECT *
FROM history
WHERE name = 'Alex'
UNION ALL
SELECT history.id, history.name, history.team_name, history.year_start, history.year_end
FROM cte, history
WHERE cte.year_start = history.year_end
)
SELECT team_name, MIN(year_start), MAX(year_end)
FROM cte
GROUP BY team_name;

You can use the query below to get your desired result:
Select Distinct t.Name,t.Team_Name,b.YStart,b.YEnd
FROM t INNER JOIN(
Select Team_Name, Min(Year_Start) YStart,Max(Year_End) YEnd
FROM t
Group BY Team_Name ) b
ON t.Team_Name = b.Team_Name

select Name, Team_Name, min(Year_Start) startY, max(Year_End) endY
from t group by Name, Team_Name
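As a quick check of the GROUP BY approach above, here is a minimal sketch that loads the four sample rows into an in-memory SQLite database through Python's sqlite3 module. The table name history comes from the question's own queries; the variable names and the ORDER BY are my additions. SQLite is only a stand-in here, although this particular query is plain SQL that PostgreSQL should accept as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history (id INTEGER, name TEXT, team_name TEXT,"
    " year_start INTEGER, year_end INTEGER)"
)
# The four sample rows from the question.
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?, ?, ?)",
    [
        (100, "Alex", "New Team", 2010, 2011),
        (101, "Alex", "New Team", 2011, 2012),
        (102, "Alex", "Best Eleven", 2012, 2013),
        (103, "Alex", "Best Eleven", 2013, 2014),
    ],
)

# One row per (player, team): earliest start and latest end.
career_rows = conn.execute(
    """
    SELECT name, team_name, MIN(year_start), MAX(year_end)
    FROM history
    GROUP BY name, team_name
    ORDER BY MIN(year_start)
    """
).fetchall()

for row in career_rows:
    print(row)
```

Note that grouping by name and team alone would merge two separate stints with the same team into one row; the sample data has no such case.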

Related

SQL (sqlite) compare sums of rows grouped by another repeating row

I have a table like:
day | name | trees_planted
--------------------------
1 | alice | 3
2 | alice | 4
1 | bob | 2
2 | bob | 4
I'm using SELECT name, SUM(trees_planted) FROM year2016 GROUP BY name to get:
name | trees_planted
alice | 7
bob | 6
But then I have another table from 2015, and I want to compare the results with the previous year. For example, if Alice planted more trees in 2016 than in 2015, I'd get a result like this:
name | tree_difference
alice | -2 (if she planted 5 trees the previous year, 5 - 7 = -2)
bob | 0 (planted the same number of trees last year)
You could use a sub-query to get the records from both 2016 and 2015, but negate the values from 2016. Then group and sum like you already did:
SELECT name,
SUM(trees_planted) AS tree_difference
FROM (SELECT name, trees_planted
FROM year2015
UNION ALL
SELECT name, -trees_planted
FROM year2016
) AS years
GROUP BY name
This will also work for cases where a number is only given in one of the two years.
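To see the negate-and-sum idea in action, here is a small sketch run against an in-memory SQLite database from Python. The 2016 rows are the ones from the question; the 2015 rows are made up to match the expected output, and carol is a hypothetical planter who appears in 2015 only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE year2015 (day INTEGER, name TEXT, trees_planted INTEGER);
    CREATE TABLE year2016 (day INTEGER, name TEXT, trees_planted INTEGER);
    -- 2016 rows as given in the question.
    INSERT INTO year2016 VALUES (1, 'alice', 3), (2, 'alice', 4),
                                (1, 'bob', 2), (2, 'bob', 4);
    -- 2015 rows are assumed; carol planted in 2015 only.
    INSERT INTO year2015 VALUES (1, 'alice', 5), (1, 'bob', 6),
                                (1, 'carol', 1);
    """
)

# Stack both years, with 2016 negated, then sum per name (2015 minus 2016).
tree_differences = conn.execute(
    """
    SELECT name, SUM(trees_planted) AS tree_difference
    FROM (SELECT name, trees_planted FROM year2015
          UNION ALL
          SELECT name, -trees_planted FROM year2016) AS years
    GROUP BY name
    ORDER BY name
    """
).fetchall()

for row in tree_differences:
    print(row)
```

Because the rows are stacked rather than joined, carol still shows up even though she has no 2016 rows.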
Assuming you can join on the name field, you can do:
select a.name, a.tp, b.tp, a.tp - b.tp
from
(
(select name, SUM(trees_planted) tp from year2016 group by name) a
inner join
(select name, SUM(trees_planted) tp from year2015 group by name) b
using(name)
)
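If you can't join on name (you have a different set of names in 2015 and 2016), it is easy to add the missing information with a couple of UNION clauses. Note that the last column is the 2016 total minus the 2015 total; swap the operands to get the sign convention used in the question.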

How can I get the MAX COUNT for multiple users?

I'm sorry if this happens to be a re-post however looking through all of the previous questions I could find with similar wording I have not been able to find a working answer.
I have a trainingHistory table that has a record for every new training. The training can be done by multiple trainers. Clients can have multiple trainers.
What I am trying to accomplish is to COUNT the number of clients that were last trained by each trainer.
Example:
clientID | trainDate | trainerID
101 | 2012-03-13 10:58:11| 10
101 | 2012-03-12 10:58:11| 11
102 | 2012-03-15 10:58:11| 10
102 | 2012-03-09 10:58:11| 12
103 | 2012-03-08 10:58:11| 7
So the end result I am looking for would be:
Results
trainerID | count
10 | 2
7 | 1
I've tried quite a few different queries and looked over quite a few answers, including this one here Using sub-queries in SQL to find max(count()) but have so far been unable to get the desired result.
What I keep getting is like this:
Results
trainerID | count
10 | 5
7 | 5
How can I get an accurate count per trainer as opposed to an overall total?
The closest I've gotten is this:
SELECT t.trainerName,
t.trainerID,
(
SELECT COUNT(lastTrainerCount)
FROM (
SELECT MAX(th.clientID) AS lastTrainerCount
FROM trainingHistory th
GROUP BY th.clientID
) AS lastTrainerCount
)
FROM trainers t
INNER JOIN trainingHistory th ON (th.trainerID = t.trainerID)
WHERE th.trainingDate BETWEEN '12/14/14' AND '02/07/15'
GROUP BY t.trainerName, t.trainerID
Which results in:
Results
trainerID | count
10 | 1072
7 | 1072
Using SQL Server 2012
Appreciate any help you can provide.
First number each client's rows by trainDate, newest first, in a sub-select, and keep only the latest row per clientID. Then count the rows per trainerID in the outer query. Try this.
select trainerID,count(trainerID) [Count]
From
(
select clientID,trainDate,trainerID,
row_number()over(partition by clientID order by trainDate Desc) Rn
From yourtable
) A
where Rn=1
Group by trainerID
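Here is a minimal sketch of the same ROW_NUMBER approach, run on the question's five sample rows in an in-memory SQLite database from Python. It assumes a SQLite build with window function support (3.25 or later); the table name trainingHistory is taken from the question, and the alias names are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE trainingHistory (clientID INTEGER, trainDate TEXT,
                                  trainerID INTEGER);
    INSERT INTO trainingHistory VALUES
        (101, '2012-03-13 10:58:11', 10),
        (101, '2012-03-12 10:58:11', 11),
        (102, '2012-03-15 10:58:11', 10),
        (102, '2012-03-09 10:58:11', 12),
        (103, '2012-03-08 10:58:11', 7);
    """
)

# rn = 1 marks the most recent training of each client;
# the outer query counts those rows per trainer.
last_trainer_counts = conn.execute(
    """
    SELECT trainerID, COUNT(*) AS clients
    FROM (SELECT clientID, trainerID,
                 ROW_NUMBER() OVER (PARTITION BY clientID
                                    ORDER BY trainDate DESC) AS rn
          FROM trainingHistory) AS latest
    WHERE rn = 1
    GROUP BY trainerID
    ORDER BY trainerID
    """
).fetchall()

for row in last_trainer_counts:
    print(row)
```

Trainers 11 and 12 drop out because neither of them was the last trainer of any client.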

updating nulls based on column

I have some very inconsistent records, for example (just an example):
Manager | Associate | FTE | Revenue
Bob | James | Y | 500
Bob | James | NULL | 100
Bob | James | Y | 200
Kelly | Rick | N | 200
Kelly | Rick | N | 500
Kelly | Rick | NULL | 300
The goal is to sum up the revenue, but the problem is that in the GROUP BY the NULLs split the groups apart. So I want to write an update statement saying basically "well, it looks like James and Bob are FTE, so let's update that to Y, and Kelly and Rick are not, so update that to N."
How can I fix this? I'm using MS Access, and of course my table is a lot bigger, with a lot of different name combinations.
You can "impute" the value by using an aggregation function. The following query aggregates by manager/associate and takes the maximum value of fte. This is then joined back to the original data to do the calculation:
select ma.fte, sum(Revenue)
from table as t inner join
(select manager, associate, max(fte) as fte
from table as t
group by manager, associate
) as ma
on t.manager = ma.manager and
t.associate = ma.associate
group by ma.fte;
EDIT:
Immediately after posting this, I realized the join is not necessary. Two aggregations are sufficient:
select ma.fte, sum(Revenue)
from (select manager, associate, max(fte) as fte, sum(Revenue) as Revenue
from table as t
group by manager, associate
) as ma
group by ma.fte;
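A quick way to try the two-aggregation version is the sketch below, which runs it on the question's six rows in an in-memory SQLite database from Python. SQLite stands in for MS Access here, and the table name records is an assumption, since the question does not name the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (manager TEXT, associate TEXT, fte TEXT,"
    " revenue INTEGER)"
)
# Sample rows from the question; None is stored as NULL.
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        ("Bob", "James", "Y", 500),
        ("Bob", "James", None, 100),
        ("Bob", "James", "Y", 200),
        ("Kelly", "Rick", "N", 200),
        ("Kelly", "Rick", "N", 500),
        ("Kelly", "Rick", None, 300),
    ],
)

# Inner query: one row per pair, MAX(fte) skips the NULLs.
# Outer query: total revenue per imputed fte value.
revenue_by_fte = conn.execute(
    """
    SELECT ma.fte, SUM(ma.revenue)
    FROM (SELECT manager, associate, MAX(fte) AS fte,
                 SUM(revenue) AS revenue
          FROM records
          GROUP BY manager, associate) AS ma
    GROUP BY ma.fte
    ORDER BY ma.fte
    """
).fetchall()

for row in revenue_by_fte:
    print(row)
```

The NULL rows are folded into their pair's group before the outer GROUP BY sees them, so they no longer form a group of their own.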
You haven't given the primary key columns, which makes it a bit harder, so below I match rows on the Manager and Associate columns instead.
With the nulls, many SQL dialects have an "IfNull" function, but it seems MS Access does not. You can get the same effect this way:
IIF(ISNULL(column),0,column)
You'd use that in a SELECT like so:
SELECT IIF(ISNULL(Revenue),0,Revenue) FROM ...
For a one-off fix you could do this (note IS NULL; a comparison with = NULL never matches any row):
UPDATE {table} SET Revenue=0 WHERE Revenue IS NULL;
Copying the FTE from another row of the same pair is more complex, and I don't have Access handy to see just what the limits and syntax are. The easy-to-understand way is a nested query:
UPDATE {table} a SET FTE = (SELECT max(FTE) FROM {table} b WHERE b.FTE IS NOT NULL AND a.Manager = b.Manager AND a.Associate = b.Associate) WHERE a.FTE IS NULL
The max() function works here because it ignores nulls, where some other functions return null if you pass a null in.
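The sketch below tries the nested-query update on the question's sample rows, using an in-memory SQLite database from Python as a stand-in, so it says nothing about what MS Access itself will accept. It assumes the table is called records and that rows belonging together share the same manager and associate values. It also shows why a NULL test has to be written with IS NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (manager TEXT, associate TEXT, fte TEXT,"
    " revenue INTEGER)"
)
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        ("Bob", "James", "Y", 500),
        ("Bob", "James", None, 100),
        ("Bob", "James", "Y", 200),
        ("Kelly", "Rick", "N", 200),
        ("Kelly", "Rick", "N", 500),
        ("Kelly", "Rick", None, 300),
    ],
)

# "= NULL" is never true, so it matches nothing; "IS NULL" finds the rows.
eq_null_count = conn.execute(
    "SELECT COUNT(*) FROM records WHERE fte = NULL"
).fetchone()[0]
is_null_count = conn.execute(
    "SELECT COUNT(*) FROM records WHERE fte IS NULL"
).fetchone()[0]

# Fill each NULL fte from the other rows of the same manager/associate pair.
conn.execute(
    """
    UPDATE records
    SET fte = (SELECT MAX(b.fte)
               FROM records AS b
               WHERE b.manager = records.manager
                 AND b.associate = records.associate)
    WHERE fte IS NULL
    """
)

remaining_nulls = conn.execute(
    "SELECT COUNT(*) FROM records WHERE fte IS NULL"
).fetchone()[0]
totals_after_update = conn.execute(
    "SELECT fte, SUM(revenue) FROM records GROUP BY fte ORDER BY fte"
).fetchall()

print(eq_null_count, is_null_count, remaining_nulls)
print(totals_after_update)
```

After the update a plain GROUP BY on fte gives one total per value, with no separate NULL group.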

Merging rows SQL - Access

I have this table on MS Access:
Name | Week | Manager | Sales
John | 201409 | Marcelo | 53
John | 201410 | Marcelo | 20
John | 201410 | Raquel | 30
John | 201411 | Raquel | 53
I have to merge the rows for Week 201410, keeping the Manager with the highest Sales. I'd also like to sum the Sales of the merged rows, to get this:
Name | Week | Manager | Sales
John | 201409 | Marcelo | 53
John | 201410 | Raquel | 50
John | 201411 | Raquel | 53
Could anybody help me? I tried a lot of SQL and couldn't get anything useful.
You can try this:
SELECT [Name], [Week], [Manager], SUM([Sales]) as Sales1
From [YourTable]
GROUP BY [Name], [Week], [Manager]
I did not test this so let me know what errors you get.
If each row had a unique identifier (Primary Key), it would be a lot simpler. However, you work with the data you have, not with the data you wish you had, so here's my circuitous way of accomplishing it. You could combine this all into one query and avoid using temporary tables; I split it out this way to make it convenient to understand, rather than being concise.
First, extract the highest Sales for each Name-Week combination:
SELECT Name, Week, MAX(Sales) AS Sales
INTO #MaxSales
FROM [YourTable]
GROUP BY Name, Week
Use this information to get the Manager that you should use for each week. (If two managers have the same sales for the same Name/Week, this query returns both of them; you could pick one with TOP 1 or an aggregate, but I'm not sure how you would want to resolve this.):
SELECT [YourTable].Name, [YourTable].Week, [YourTable].Manager
INTO #MaxSalesManager
FROM [YourTable]
INNER JOIN #MaxSales
ON [YourTable].Name = #MaxSales.Name
AND [YourTable].Week = #MaxSales.Week
WHERE [YourTable].Sales = #MaxSales.Sales
Now you can extract the information you need:
SELECT [YourTable].Name, [YourTable].Week, #MaxSalesManager.Manager, SUM([YourTable].Sales)
FROM [YourTable]
INNER JOIN #MaxSalesManager
ON [YourTable].Name = #MaxSalesManager.Name
AND [YourTable].Week = #MaxSalesManager.Week
GROUP BY [YourTable].Name, [YourTable].Week, #MaxSalesManager.Manager
Hope this helps!
EDIT:
Combining them all into one query:
SELECT [YourTable].Name,
[YourTable].Week,
#MaxSalesManager.Manager,
SUM([YourTable].Sales)
FROM [YourTable]
INNER JOIN
(SELECT [YourTable].Name, [YourTable].Week, [YourTable].Manager
FROM [YourTable]
INNER JOIN
(SELECT Name, Week, MAX(Sales) AS Sales
FROM [YourTable]
GROUP BY Name, Week) AS #MaxSales
ON [YourTable].Name = #MaxSales.Name
AND [YourTable].Week = #MaxSales.Week
WHERE [YourTable].Sales = #MaxSales.Sales) AS #MaxSalesManager
ON [YourTable].Name = #MaxSalesManager.Name
AND [YourTable].Week = #MaxSalesManager.Week
GROUP BY [YourTable].Name, [YourTable].Week, #MaxSalesManager.Manager
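For anyone who wants to try the single-query idea without temporary tables, here is a rough sketch on the question's four rows, run in an in-memory SQLite database from Python rather than in Access. The table name sales_data and the aliases are hypothetical, and breaking ties with MIN(manager) is my own assumption, since the question does not say how ties should be resolved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_data (name TEXT, week INTEGER, manager TEXT,"
    " sales INTEGER)"
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?, ?)",
    [
        ("John", 201409, "Marcelo", 53),
        ("John", 201410, "Marcelo", 20),
        ("John", 201410, "Raquel", 30),
        ("John", 201411, "Raquel", 53),
    ],
)

# mx: highest sales per name/week.
# top_mgr: the manager who made that sale (MIN breaks ties, an assumption).
# Outer query: total sales per name/week, labelled with that manager.
merged_rows = conn.execute(
    """
    SELECT s.name, s.week, top_mgr.manager, SUM(s.sales) AS sales
    FROM sales_data AS s
    INNER JOIN (SELECT t.name, t.week, MIN(t.manager) AS manager
                FROM sales_data AS t
                INNER JOIN (SELECT name, week, MAX(sales) AS sales
                            FROM sales_data
                            GROUP BY name, week) AS mx
                    ON t.name = mx.name
                   AND t.week = mx.week
                   AND t.sales = mx.sales
                GROUP BY t.name, t.week) AS top_mgr
        ON s.name = top_mgr.name AND s.week = top_mgr.week
    GROUP BY s.name, s.week, top_mgr.manager
    ORDER BY s.week
    """
).fetchall()

for row in merged_rows:
    print(row)
```

Week 201410 comes out as one row for Raquel with 20 + 30 = 50, which matches the table the question asks for.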

MIN() Function in SQL

Need help with Min Function in SQL
I have a table as shown below.
+------------+-------+-------+
| Date_ | Name | Score |
+------------+-------+-------+
| 2012/07/05 | Jack | 1 |
| 2012/07/05 | Jones | 1 |
| 2012/07/06 | Jill | 2 |
| 2012/07/06 | James | 3 |
| 2012/07/07 | Hugo | 1 |
| 2012/07/07 | Jack | 1 |
| 2012/07/07 | Jim | 2 |
+------------+-------+-------+
I would like to get the output like below
+------------+------+-------+
| Date_ | Name | Score |
+------------+------+-------+
| 2012/07/05 | Jack | 1 |
| 2012/07/06 | Jill | 2 |
| 2012/07/07 | Hugo | 1 |
+------------+------+-------+
When I use the MIN() function with just the Date_ and Score columns, I get the lowest score for each date, which is what I want. I don't care which row is returned if there is a tie in the score for the same date. The trouble starts when I also want the Name column in the output. I tried a few variations of SQL (e.g. MIN with a correlated subquery) but had no luck getting the output shown above. Can anyone help, please? :)
Query is as follows
SELECT DISTINCT
A.Name, A.Date_, A.Score
FROM TestTable AS A
INNER JOIN (SELECT Date_,MIN(Score) AS MinScore
FROM TestTable
GROUP BY Date_) AS B
ON (A.Score = B.MinScore) AND (A.Date_ = B.Date_);
Use this solution:
SELECT a.date_, MIN(name) AS name, a.score
FROM tbl a
INNER JOIN
(
SELECT date_, MIN(score) AS minscore
FROM tbl
GROUP BY date_
) b ON a.date_ = b.date_ AND a.score = b.minscore
GROUP BY a.date_, a.score
This will get the minimum score per date in the INNER JOIN subselect, which we use to join to the main table. Once we join the subselect, we will only have dates with names having the minimum score (with ties being displayed).
Since we only want one name per date, we then group by date and score, selecting whichever name: MIN(name).
If we want to display the name column, we must use an aggregate function on name to facilitate the GROUP BY on date and score columns, or else it will not work (We could also use MAX() on that column as well).
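The join-then-group query above can be checked on the question's seven rows with a small sketch like this one, which uses an in-memory SQLite database from Python. The table name tbl follows the answer's query; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (date_ TEXT, name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        ("2012/07/05", "Jack", 1),
        ("2012/07/05", "Jones", 1),
        ("2012/07/06", "Jill", 2),
        ("2012/07/06", "James", 3),
        ("2012/07/07", "Hugo", 1),
        ("2012/07/07", "Jack", 1),
        ("2012/07/07", "Jim", 2),
    ],
)

# b holds the lowest score per date; the join keeps only rows with that
# score, and MIN(name) picks one name when several players tie.
min_score_rows = conn.execute(
    """
    SELECT a.date_, MIN(a.name) AS name, a.score
    FROM tbl AS a
    INNER JOIN (SELECT date_, MIN(score) AS minscore
                FROM tbl
                GROUP BY date_) AS b
        ON a.date_ = b.date_ AND a.score = b.minscore
    GROUP BY a.date_, a.score
    ORDER BY a.date_
    """
).fetchall()

for row in min_score_rows:
    print(row)
```

On this data the result is the same three rows as the desired output in the question.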
Please learn about the GROUP BY functionality of RDBMS.
SELECT Date_,Name,MIN(Score)
FROM T
GROUP BY Name
This makes the assumption that EACH NAME and EACH date appears only once, and this will only work for MySQL.
To make it work on other RDBMSs, you need to apply another group function on the Date column, such as MAX or MIN:
SELECT T.Name, T.Date_, MIN(T.Score) as Score FROM T
GROUP BY T.Date_
Edit: This answer is not correct, as pointed out by JNK in the comments.
SELECT Date_,MAX(Name),MIN(Score)
FROM T
GROUP BY Date_
Here I am using MAX(Name); it will pick one name per date.
This will find the minimum score for each day (no duplicates), scored by any player. A name that starts with Z will be picked over a name that starts with A. Note that MAX(Name) is taken over all rows for the date, so the name shown is not necessarily the player who made the minimum score.
Edit: Fixed by removing group by name
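One thing worth checking with the MAX(Name), MIN(Score) query is whether the name and the score in a result row come from the same source row. The sketch below runs it on the question's data in an in-memory SQLite database from Python; the table name T follows the answer, and the variable names are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Date_ TEXT, Name TEXT, Score INTEGER)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [
        ("2012/07/05", "Jack", 1),
        ("2012/07/05", "Jones", 1),
        ("2012/07/06", "Jill", 2),
        ("2012/07/06", "James", 3),
        ("2012/07/07", "Hugo", 1),
        ("2012/07/07", "Jack", 1),
        ("2012/07/07", "Jim", 2),
    ],
)

# The two aggregates are computed independently over each date's rows.
independent_aggregates = conn.execute(
    """
    SELECT Date_, MAX(Name), MIN(Score)
    FROM T
    GROUP BY Date_
    ORDER BY Date_
    """
).fetchall()

# Jim's actual score on 2012/07/07, for comparison.
jim_score = conn.execute(
    "SELECT Score FROM T WHERE Date_ = '2012/07/07' AND Name = 'Jim'"
).fetchone()[0]

for row in independent_aggregates:
    print(row)
print("Jim actually scored", jim_score)
```

For 2012/07/07 the query pairs Jim with a score of 1, although Jim scored 2, so this form is only safe when any name for the date is acceptable.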