SQL query for until 2005 but not after 2005? - sql

Find all IDs that taught up to and including 2005 but did not teach after 2005.
For example:
year ID
2010 A
2009 C
2005 B
2002 D
2002 C
2001 B
2000 A
Then the result should give only B and D.
The table has columns ID and year and I want to print out ID.
SELECT ID
FROM university.teaches
WHERE year <= 2005 AND
year NOT IN (SELECT year FROM university.teaches WHERE year> 2005);
I am trying something like this, but the result includes A and C.

You should check for ids and not years with the IN operator:
SELECT DISTINCT ID
FROM university.teaches
WHERE id NOT IN (SELECT id FROM university.teaches WHERE year > 2005);
The subquery of IN returns all the ids that have taught after 2005, so NOT IN returns all the remaining ids.
Results:
| ID |
| --- |
| B |
| D |
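One caveat with NOT IN is worth a quick check: if the subquery can return a NULL id, NOT IN matches nothing. The sketch below runs the query through Python's sqlite3 module against the sample data. The unqualified table name teaches and the extra row with a NULL id are assumptions added for illustration, and NOT EXISTS is shown as a NULL-safe alternative, not as part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teaches (year INTEGER, id TEXT)")
conn.executemany(
    "INSERT INTO teaches VALUES (?, ?)",
    [(2010, "A"), (2009, "C"), (2005, "B"), (2002, "D"),
     (2002, "C"), (2001, "B"), (2000, "A"),
     (2011, None)],  # hypothetical row with a NULL id, not in the question
)

# NOT IN compares against NULL, so no row can ever pass the filter.
not_in = conn.execute("""
    SELECT DISTINCT id
    FROM teaches
    WHERE id NOT IN (SELECT id FROM teaches WHERE year > 2005)
    ORDER BY id
""").fetchall()

# NOT EXISTS only looks for a matching later row, so NULLs do no harm.
not_exists = conn.execute("""
    SELECT DISTINCT t.id
    FROM teaches AS t
    WHERE t.id IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM teaches AS later
                      WHERE later.id = t.id AND later.year > 2005)
    ORDER BY t.id
""").fetchall()

print(not_in)      # no rows once a NULL id is present
print(not_exists)  # B and D
```

If id is declared NOT NULL, the two forms return the same rows.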

Use GROUP BY and MAX():
select id
from university.teaches
group by id
having max(year) <= 2005;
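To see why this works, here is a minimal sketch using Python's sqlite3 module and the sample data from the question. The unqualified table name teaches is an assumption (the question uses university.teaches).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teaches (year INTEGER, id TEXT)")
conn.executemany(
    "INSERT INTO teaches VALUES (?, ?)",
    [(2010, "A"), (2009, "C"), (2005, "B"), (2002, "D"),
     (2002, "C"), (2001, "B"), (2000, "A")],
)

# The last year each id taught; HAVING filters on exactly this value.
last_year = conn.execute(
    "SELECT id, MAX(year) FROM teaches GROUP BY id ORDER BY id"
).fetchall()

result = conn.execute("""
    SELECT id
    FROM teaches
    GROUP BY id
    HAVING MAX(year) <= 2005
    ORDER BY id
""").fetchall()

print(last_year)  # A: 2010, B: 2005, C: 2009, D: 2002
print(result)     # B and D
```

An id whose latest year is 2005 or earlier cannot have taught after 2005, so a single pass over the table is enough.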

Related

SQL - Display Name ID from Consecutive Occurrences of values in a Table

I have a table created, as an example 'Table1', see below;
Name Year
John 2003
Lyla 1994
Faith 1996
John 2002
Carol 2000
Carol 1999
John 2001
Carol 2002
Lyla 1996
Lyla 1997
Carol 2001
John 2009
Based on the above table, I have summarised my findings.
Carol participated for 4 years in a row; 1999, 2000, 2001, 2002
John participated for 3 years in a row; 2001, 2002, 2003 – John also participated in 2009, but this does not count as part of the streak.
Lyla participated in 1994, 1996, 1997 but these were not three consecutive years.
Faith participated only 1 time.
What I am looking to do is write a SQL query that returns only the names of users who have participated for 3 or more consecutive years, so based on the above I should get only 'Carol' and 'John'.
I am not exactly sure how to write this and would hope that someone could guide me.
I have only come up with a short and basic start like the one below, but in all honesty I am not sure that is even the correct way to go about it.
Select Name From Table1
Where Year = ?
Order by Name asc
Group by Year
Assuming you have one row per person per year, you can use lag() and select distinct:
select distinct name
from (select t.*,
lag(year, 2) over (partition by name order by year) as prev2_year
from table1 t
) t
where prev2_year = year - 2;
This simply looks back two rows for each name, with the rows ordered by year, and compares the year on that row to the year on the current row. If the three rows are consecutive years, the year two rows back is exactly year - 2.
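A quick way to check the look-back logic is to run it on the sample data. This is a sketch using Python's sqlite3 module, assuming a SQLite build with window function support (3.25 or later); the window is ordered by year so that "two rows back" means "two participations earlier".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("John", 2003), ("Lyla", 1994), ("Faith", 1996), ("John", 2002),
     ("Carol", 2000), ("Carol", 1999), ("John", 2001), ("Carol", 2002),
     ("Lyla", 1996), ("Lyla", 1997), ("Carol", 2001), ("John", 2009)],
)

streak_names = conn.execute("""
    SELECT DISTINCT name
    FROM (SELECT name, year,
                 LAG(year, 2) OVER (PARTITION BY name ORDER BY year) AS prev2_year
          FROM table1) AS t
    WHERE prev2_year = year - 2
    ORDER BY name
""").fetchall()

print(streak_names)  # Carol and John
```

Lyla's row for 1997 looks back to 1994, which is not 1995, so she is excluded; Faith has no row two steps back at all.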
You could also do this with joins, but the above probably performs better:
select distinct t1.name
from table1 t1 join
table1 t1_1
on t1.name = t1_1.name and
t1.year = t1_1.year + 1 join
table1 t1_2
on t1.name = t1_2.name and
t1.year = t1_2.year + 2;
select
n1.name,
SUM(CASE WHEN n2.year is null then 0 else 1 end)+1 YearsInRow
from Table1 n1
left join Table1 n2 on n2.name=n1.name and (n2.year=n1.year+1 )
GROUP by n1.name
HAVING SUM(CASE WHEN n2.year is null then 0 else 1 end)+1 >=3
output:
name YearsInRow
---------- -----------
Carol 4
John 3
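Note that this query counts every year that has a successor, across the whole name, so two separate two-year runs also add up to 3. The sketch below uses Python's sqlite3 module to show that, with a hypothetical extra participant 'Dana' (not in the question) who took part in 2001, 2002, 2005 and 2006. As an alternative it uses the gaps-and-islands technique, where year minus ROW_NUMBER() is constant within each run of consecutive years. It assumes one row per name per year and SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [("John", 2003), ("Lyla", 1994), ("Faith", 1996), ("John", 2002),
     ("Carol", 2000), ("Carol", 1999), ("John", 2001), ("Carol", 2002),
     ("Lyla", 1996), ("Lyla", 1997), ("Carol", 2001), ("John", 2009),
     # Hypothetical rows: two separate 2-year runs, never 3 in a row.
     ("Dana", 2001), ("Dana", 2002), ("Dana", 2005), ("Dana", 2006)],
)

# Pair counting: every year with a successor adds 1, regardless of run.
pair_count_names = conn.execute("""
    SELECT n1.name
    FROM Table1 n1
    LEFT JOIN Table1 n2 ON n2.name = n1.name AND n2.year = n1.year + 1
    GROUP BY n1.name
    HAVING SUM(CASE WHEN n2.year IS NULL THEN 0 ELSE 1 END) + 1 >= 3
    ORDER BY n1.name
""").fetchall()

# Gaps and islands: rows of one consecutive run share the same grp value.
islands = conn.execute("""
    SELECT name, MIN(year), MAX(year), COUNT(*) AS years_in_row
    FROM (SELECT name, year,
                 year - ROW_NUMBER() OVER (PARTITION BY name ORDER BY year) AS grp
          FROM Table1) AS t
    GROUP BY name, grp
    HAVING COUNT(*) >= 3
    ORDER BY name
""").fetchall()

print(pair_count_names)  # includes Dana
print(islands)           # only Carol (1999-2002) and John (2001-2003)
```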

Hive: Add a column with a value repeated of a specific column in a specific row?

I have a table in Hive called Products that looks like this:
Root Product | Product | Date
A            | A       | 2012
A            | B       | 2013
A            | C       | 2013
D            | D       | 2014
D            | E       | 2015
Is it possible to add a fourth column that repeats the date of the root product, i.e. the value of Date on the row where Root Product == Product? Such that:
Root Product | Product | Date | Root Date
A            | A       | 2012 | 2012
A            | B       | 2013 | 2012
A            | C       | 2013 | 2012
D            | D       | 2014 | 2014
D            | E       | 2015 | 2014
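Use max() as a window function. The case expression is non-NULL only on the root row, so max() over the partition picks up that row's date for every row in the group:
select root_product
,product
,date
,max(case when root_product = product then date end) over(partition by root_product) as root_date
from products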
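The same expression can be tried outside Hive. This is a sketch using Python's sqlite3 module (SQLite 3.25 or later for window functions); the table name products and the underscore column names are assumptions based on the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (root_product TEXT, product TEXT, date INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("A", "A", 2012), ("A", "B", 2013), ("A", "C", 2013),
     ("D", "D", 2014), ("D", "E", 2015)],
)

rows = conn.execute("""
    SELECT root_product, product, date,
           MAX(CASE WHEN root_product = product THEN date END)
               OVER (PARTITION BY root_product) AS root_date
    FROM products
    ORDER BY root_product, product
""").fetchall()

for row in rows:
    print(row)  # last column is the root product's date for each group
```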

SQL Max Value for a Specified Limit

I'm trying to return a list of years when certain conditions are met but I am having trouble with the MAX function and having it work with the rest of my logic.
For the following two tables:
coach
coach | team | wins | year
------+------+------+------
nk01 | a | 4 | 2000
vx92 | b | 1 | 2000
nk01 | b | 5 | 2003
vx92 | a | 2 | 2003
team
team | worldcupwin | year
-----+-------------+------
a | Y | 2000
b | N | 2000
a | Y | 2003
b | N | 2003
I want to get the following output:
years
-----
2000
The years printed should be those in which the team of the coach with the most wins that year also won the world cup.
I decided to use the MAX function but quickly ran into the problem of not knowing how to use it to only be looking for max values for a certain year. This is what I've got so far:
SELECT y.year
FROM (SELECT c.year, MAX(c.wins), c.team
FROM coach AS c
WHERE c.year >= 1999
GROUP BY c.year, c.team) AS y, teams AS t
WHERE y.year = t.year AND t.worldcupwin = 'Y' AND y.team = t.team;
This query outputs all years greater than 1999 for me, rather than just those where a coach with the most wins also won the world cup.
(Using postgresql)
Any help is appreciated!
You can use a correlated subquery that compares each coach's wins with the maximum wins for the same year:
SELECT c.year, c.team
FROM coach AS c inner join team t on c.team = t.team and c.year=t.year
WHERE c.year >= 1999 and exists (select 1 from coach c1 where c1.year=c.year
having max(c1.wins)=c.wins)
and t.worldcupwin = 'Y'
OUTPUT:
year team
2000 a
The following query uses DISTINCT ON:
SELECT DISTINCT ON (year) c.year, wins, worldcupwin, c.team
FROM coach AS c
INNER JOIN team AS t ON c.team = t.team AND c.year = t.year
WHERE c.year > 1999
ORDER BY year, wins DESC
It returns the record with the most wins for each year:
year wins worldcupwin team
---------------------------------
2000 4 Y a
2003 5 N b
Filtering out teams that didn't win the world cup:
SELECT year, team
FROM (
SELECT DISTINCT ON (year) c.year, wins, worldcupwin, c.team
FROM coach AS c
INNER JOIN team AS t ON c.team = t.team AND c.year = t.year
WHERE c.year > 1999
ORDER BY year, wins DESC) AS t
WHERE t.worldcupwin = 'Y'
ORDER BY year, wins DESC
gives the expected result:
year team
-------------
2000 a
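You can use the query below to get the desired result for the sample data. It returns the year of the world-cup-winning row with the most wins overall, and uses LIMIT because PostgreSQL does not support TOP:
SELECT c.year
FROM coach AS c INNER JOIN team AS t ON c.team = t.team AND c.year = t.year
WHERE t.worldcupwin = 'Y'
ORDER BY c.wins DESC
LIMIT 1;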
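Use the row_number() window function, numbering the coaches within each year by wins:
select a.coach, a.team, a.wins, a.year from
(select c.coach, c.team, c.wins, c.year, t.worldcupwin,
row_number() over (partition by c.year order by c.wins desc) rn
from coach c join team t on c.team=t.team and c.year=t.year
) a where a.rn=1 and a.worldcupwin='Y'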
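The per-year ranking idea can be checked on the sample data. This is a sketch using Python's sqlite3 module (SQLite 3.25 or later); it uses RANK() rather than ROW_NUMBER() so that coaches tied for the most wins in a year are all kept, which is a design choice of this sketch rather than something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coach (coach TEXT, team TEXT, wins INTEGER, year INTEGER)")
conn.execute("CREATE TABLE team (team TEXT, worldcupwin TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO coach VALUES (?, ?, ?, ?)",
    [("nk01", "a", 4, 2000), ("vx92", "b", 1, 2000),
     ("nk01", "b", 5, 2003), ("vx92", "a", 2, 2003)],
)
conn.executemany(
    "INSERT INTO team VALUES (?, ?, ?)",
    [("a", "Y", 2000), ("b", "N", 2000), ("a", "Y", 2003), ("b", "N", 2003)],
)

years = conn.execute("""
    SELECT year
    FROM (SELECT c.year AS year, t.worldcupwin AS worldcupwin,
                 RANK() OVER (PARTITION BY c.year ORDER BY c.wins DESC) AS rnk
          FROM coach AS c
          JOIN team AS t ON c.team = t.team AND c.year = t.year
          WHERE c.year >= 1999) AS ranked
    WHERE rnk = 1 AND worldcupwin = 'Y'
    ORDER BY year
""").fetchall()

print(years)  # only 2000
```

In 2003 the top coach (5 wins) was with team b, which did not win the world cup, so that year is filtered out.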

How to get latest date group by employee of a column but without another column

I'm working on a query in SQL 2005.
I'm trying to get the latest date on which the Number column changed. The catch is that another column (Rate) shares the same Date column, so I end up fetching the wrong row.
An example will better explain my question.
This is my SQL table EmployeeRates:
----------------------------------
FkEmployee | Date | Rate | Number |
----------------------------------
1 2000 15 1.5
1 2001 16 1.5
1 2002 16 1.6
2 2000 12 1.5
2 2001 14 1.6
2 2002 15 1.6
So if I fetch the latest date, I currently get:
FkEmployee #1 = 2002 (which is correct, because it's the latest date for the Number column.)
FkEmployee #2 = 2002 (which is not what I want, because that year only the rate changed and the number is a duplicate.) What I want is 2001.
The code I have right now (2015-08-10 14:15)
SELECT t1.FkEmployee, t1.Date
FROM EmployeeRates t1
INNER JOIN
(
SELECT FkEmployee, MAX(Date) AS MaxDate
FROM EmployeeRates
GROUP BY FkEmployee
)
t2 ON t1.FkEmployee = t2.FkEmployee
AND t1.Date = t2.MaxDate
ORDER BY t1.FkEmployee
Thanks to anybody who can help =)
This should work. First find MIN date by Employee, Number, then get the MAX of that. This will ensure you are getting the earliest date per number, but latest date per employee:
SELECT t1.FkEmployee, t1.Date
FROM EmployeeRates t1
INNER JOIN
(SELECT FkEmployee, MAX(MinDate) AS MaxDate FROM
(SELECT FkEmployee, MIN(Date) AS MinDate
FROM EmployeeRates
GROUP BY FkEmployee, Number) a
GROUP BY FkEmployee
)
t2 ON t1.FkEmployee = t2.FkEmployee
AND t1.Date = t2.MaxDate
ORDER BY t1.FkEmployee
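The two aggregation steps can be inspected separately. This is a sketch using Python's sqlite3 module with the sample rows from the question; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeRates (FkEmployee INTEGER, Date INTEGER, Rate INTEGER, Number REAL)"
)
conn.executemany(
    "INSERT INTO EmployeeRates VALUES (?, ?, ?, ?)",
    [(1, 2000, 15, 1.5), (1, 2001, 16, 1.5), (1, 2002, 16, 1.6),
     (2, 2000, 12, 1.5), (2, 2001, 14, 1.6), (2, 2002, 15, 1.6)],
)

# Step 1: the first date on which each (employee, number) pair appears.
first_seen = conn.execute("""
    SELECT FkEmployee, Number, MIN(Date) AS MinDate
    FROM EmployeeRates
    GROUP BY FkEmployee, Number
    ORDER BY FkEmployee, Number
""").fetchall()

# Step 2: the latest of those first dates per employee.
latest_change = conn.execute("""
    SELECT FkEmployee, MAX(MinDate) AS MaxDate
    FROM (SELECT FkEmployee, MIN(Date) AS MinDate
          FROM EmployeeRates
          GROUP BY FkEmployee, Number) AS a
    GROUP BY FkEmployee
    ORDER BY FkEmployee
""").fetchall()

print(first_seen)
print(latest_change)  # employee 1: 2002, employee 2: 2001
```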

Select same number of rows even if data is not there

I want to write a query to always select the same number of rows, even if the data is not there. If the data is not there, I would still like to select something in its place.
For example, if I want to select the amount in my bank account for the last 5 years, but I only have data for the last 3 years, could I still select 5 rows and just have 0's for the two missing years?
| Year | Balance |
| 2014 | $5 |
| 2013 | $10 |
| 2012 | $31 |
| 2011 | $0 | << Doesn't exist
| 2010 | $0 | << Doesn't exist
Is this possible? Thanks for any help.
Using mssql, this will work. There are other similar functions for other DBs.
SELECT TOP 5 year, ISNULL(balance,0) FROM yourtable
The key to solving this is using an outer join. It is tempting to think of your actual table of data as the "main" table, but what you really want is a table of all possible years. I didn't have a table of years, so I made one up on the fly:
select 2010 as Year
union select 2011 as Year
union select 2012 as Year
union select 2013 as Year
union select 2014 as Year
This gives me a list of all the years I care about.
Then I use an outer join to join it to my table with real data. The outer join will just return NULL values for the stuff that isn't there. But I don't want NULLs, I want zeroes, so I use the isnull function to make them zeros if they are null. And then I end up with this:
select YearList.Year, isnull(Bals.Balance, 0) as theBalance
from
(select 2010 as Year
union select 2011 as Year
union select 2012 as Year
union select 2013 as Year
union select 2014 as Year) as YearList
left join (select Year, Balance from Balances) as Bals
on YearList.Year = Bals.Year
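For a longer range, writing out one union per year gets tedious. The sketch below, run through Python's sqlite3 module, generates the year list with a recursive CTE instead and uses COALESCE in place of isnull, since SQLite has no isnull function. Both substitutions are assumptions of this sketch, not part of the answer above, and balances are stored as plain integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Balances (Year INTEGER, Balance INTEGER)")
conn.executemany(
    "INSERT INTO Balances VALUES (?, ?)",
    [(2014, 5), (2013, 10), (2012, 31)],
)

rows = conn.execute("""
    WITH RECURSIVE YearList(Year) AS (
        SELECT 2010
        UNION ALL
        SELECT Year + 1 FROM YearList WHERE Year < 2014
    )
    SELECT YearList.Year, COALESCE(Bals.Balance, 0) AS theBalance
    FROM YearList
    LEFT JOIN Balances AS Bals ON YearList.Year = Bals.Year
    ORDER BY YearList.Year DESC
""").fetchall()

for row in rows:
    print(row)  # five rows; 2011 and 2010 show a balance of 0
```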