get data for a record in the past - sql

I have 2 tables, table1 and table2.
What I want to achieve:
I want to return the current month info about an employee
AND
when they created an account from table 2. They might have changed position since then, so I want to capture the point-in-time info and the current info on the same row.
College program table
Table 1
Name Acct_Cr_DT
a1 12/1/2018
b1 1/4/2018
c1 5/6/2018
Last Month (12/29) and current Month Data (1/29/2019). Assuming data refreshes on last day of every fiscal month.
Table 2
Name position gender Emp status FISCAL_MONTH_END_DATE
a1 Analyst M hourly 12/29/2018
b1 Intern F hourly 12/29/2018
c1 Director F hourly 12/29/2018
a1 Manager M hourly 1/29/2019
b1 Analyst F hourly 1/29/2019
c1 Director F hourly 1/29/2019
a1 was an analyst at the time of account creation.
b1 was an intern at the time of account creation.
Sample output: Need the info at the time of account creation before these got a promotion.
Name Acct_Cr_DT position gender Emp status FISCAL_MONTH_END_DATE
a1 12/1/2018 Analyst M hourly 1/29/2019
b1 1/4/2018 Intern F hourly 1/29/2019
c1 5/6/2018 Director F hourly 1/29/2019

If you want to return the current month info for table 1:
SELECT *
FROM YOUR_TABLE
WHERE MONTH(COLUMN_NAME) = MONTH(GETDATE())
AND YEAR(COLUMN_NAME) = YEAR(GETDATE())
However, judging from your explanation, you need a join to capture the point-in-time info and the current info on the same row. So you probably need this:
SELECT *
FROM TABLE_1 a inner join TABLE_2 b on a.id=b.id
WHERE MONTH(COLUMN_NAME) = MONTH(GETDATE())
AND YEAR(COLUMN_NAME) = YEAR(GETDATE())
Please provide sample output for further explanation.

Here you can try the following query.
SELECT tb1.name, Acct_Cr_DT, position, gender, [Emp status], FISCAL_MONTH_END_DATE
FROM Table1 tb1
INNER JOIN Table2 tb2 ON tb1.name = tb2.name
WHERE DATEPART(YEAR, Acct_Cr_DT) = DATEPART(YEAR, FISCAL_MONTH_END_DATE) AND DATEPART(MONTH, FISCAL_MONTH_END_DATE) = 12 AND DATEPART(DAY, FISCAL_MONTH_END_DATE) = 29
This gets the record for the year the account was created, taken from the last refresh of that year (12/29).

It should be possible to solve this using the wonderful DB2 OLAP functions (aka window functions).
The following code uses a subquery to pull out the first position of each employee and rank the records by fiscal month end date. The outer query then joins the results with table1 to get the account creation date, and keeps only the records for the most recent fiscal month end.
SELECT
tx.name, tx.first_position, tx.gender, tx.emp_status, tx.fiscal_month_end_date, t1.acct_cr_dt
FROM (
SELECT
t2.*,
FIRST_VALUE(t2.position) OVER(PARTITION BY t2.name ORDER BY fiscal_month_end_date) first_position,
DENSE_RANK() OVER (ORDER BY t2.fiscal_month_end_date desc) rnk
FROM table2 t2
) tx INNER JOIN table1 t1 ON t1.name = tx.name
WHERE tx.rnk = 1;
In this DB Fiddle demo, the query yields:
| name | first_position | gender | emp_status | fiscal_month_end_date | acct_cr_dt |
| ---- | -------------- | ------ | ---------- | --------------------- | ---------- |
| a1 | Analyst | M | hourly | 2019-01-29 | 2018-12-01 |
| b1 | Intern | F | hourly | 2019-01-29 | 2018-01-04 |
| c1 | Director | F | hourly | 2019-01-29 | 2018-05-06 |
NB: this is a MySQL 8.0 fiddle, since there is no DB2 fiddle available in the wild...

I found the answer to this on my own.
Basically, you have to join on Name and make sure "Acct_Cr_DT" is
<= "FISCAL_MONTH_END_DATE" to get the point-in-time info for that employee.
After that, create a subquery with a LEFT JOIN on Table 2 and extract the current
"FISCAL_MONTH_END_DATE" to return the current month's data.

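The steps described in that last answer can be sketched roughly as follows. This is only an illustration, not the poster's actual query: it runs against an in-memory SQLite database through Python's sqlite3 module, the dates are rewritten as ISO yyyy-mm-dd strings so that string comparison works, and the column names acct_cr_dt and emp_status and the aliases pit and cur are assumptions made for the example. "Point in time" is taken to mean the earliest fiscal month end on or after the account creation date.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
# Dates are stored as ISO strings (an assumption) so they compare correctly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (name TEXT, acct_cr_dt TEXT);
CREATE TABLE table2 (name TEXT, position TEXT, gender TEXT,
                     emp_status TEXT, fiscal_month_end_date TEXT);
INSERT INTO table1 VALUES
  ('a1', '2018-12-01'), ('b1', '2018-01-04'), ('c1', '2018-05-06');
INSERT INTO table2 VALUES
  ('a1', 'Analyst',  'M', 'hourly', '2018-12-29'),
  ('b1', 'Intern',   'F', 'hourly', '2018-12-29'),
  ('c1', 'Director', 'F', 'hourly', '2018-12-29'),
  ('a1', 'Manager',  'M', 'hourly', '2019-01-29'),
  ('b1', 'Analyst',  'F', 'hourly', '2019-01-29'),
  ('c1', 'Director', 'F', 'hourly', '2019-01-29');
""")

QUERY = """
SELECT t1.name, t1.acct_cr_dt,
       pit.position, pit.gender, pit.emp_status,
       cur.fiscal_month_end_date
FROM table1 t1
-- pit: the first snapshot taken on or after the account creation date
JOIN table2 pit
  ON pit.name = t1.name
 AND pit.fiscal_month_end_date = (
       SELECT MIN(x.fiscal_month_end_date)
       FROM table2 x
       WHERE x.name = t1.name
         AND x.fiscal_month_end_date >= t1.acct_cr_dt)
-- cur: the snapshot for the most recent fiscal month end
LEFT JOIN table2 cur
  ON cur.name = t1.name
 AND cur.fiscal_month_end_date = (
       SELECT MAX(fiscal_month_end_date) FROM table2)
ORDER BY t1.name
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With the sample data this should reproduce the sample output from the question: the position at account creation next to the latest fiscal month end date.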

Alias Reference Date_Diff Days. Need to Parse or create temp table with dates?

Below are my tables, my query, and the output it produces.
Table1
EmployeeID | StartDateTimestamp | CohortID | CohortName
---------- | ------------------ | -------- | ----------
1 | 20080101 01:30:00 | 1 | Peanut
1 | 20090204 01:01:00 | 2 | Apple
2 | 20190107 05:52:14 | 1 | Peanut
3 | 20190311 02:35:26 | 2 | Apple
Employee
EmployeeID | HireStartName | StartDateTimestamp2
---------- | ------------- | -------------------
1 | HiredStart | 20080501 01:30:00
1 | DeferredStart | 20090604 01:01:00
2 | HiredStart | 20190115 05:52:14
3 | HiredStart | 20190330 02:35:26
Query
select
t.cohortid,
min(e.startdatetimestamp2) first,
max(e.startdatetimestamp2) last
from table1 t
inner join employee e on e.employeeid = t.employeeid
group by t.cohortid
Output
ID | FIRST | LAST
1 |20190106 12:00:05 |20180214 03:45:12
2 |20180230 01:45:23 |20180315 01:45:23
My attempt:
SELECT DATE_DIFF(first, last, Day), ID, max(datecolumn1) first, min(datecolumn1) last
Error: Unrecognized name.
How do I enter the reference alias first and last in a Date_Diff?
Do I need to derive a table?
Clarity: Trying to avoid inputting in the dates, since I am looking to find the date diff of both first and last columns for as many rows as there is data.
This has been discussed here: Date Difference between consecutive rows
DateDiff has been deprecated, and it is now Date_Diff(first, last, day)
Then I tried:
SELECT ID, DATE_DIFF(PARSE_DATE('%y%m%d',t.first), PARSE_DATE('%y%m%d',t.last), DAY) days
FROM table
Failed to parse input string "20180125 01:00:05"
Tried this
SELECT CohortID, date_diff(first,last,day) as days
FROM (select cohortid,min(startdatetimestamp2) first,
max(startdatetimestamp2) last
FROM employee
JOIN table1 on table1.employeeid = employee.employeeid
group by cohortid)
I get days not found on either side of join
Regarding your first question about using aliases in a query: there are some restrictions on where you can use them, especially in the FROM, GROUP BY and ORDER BY clauses. I encourage you to have a look here to check these restrictions.
As for your main issue, obtaining the difference between two dates: I would like to point out that the timestamp values in both of your tables are actually DATETIME values in BigQuery. Therefore, you should use the built-in DATETIME functions to get the desired results.
The below query uses the data you provided to obtain the aimed output.
WITH
data AS
(
SELECT
t.cohortid AS ID,
PARSE_DATETIME('%Y%m%d %H:%M:%S', MIN(e.startdatetimestamp2)) AS first,
PARSE_DATETIME('%Y%m%d %H:%M:%S', MAX(e.startdatetimestamp2)) AS last
FROM
`test-proj-261014.sample.table1` t
INNER JOIN
`test-proj-261014.sample.employee` e
ON
e.employeeid = t.employeeid
GROUP BY t.cohortid
)
SELECT
ID,
first,
last,
DATETIME_DIFF(last, first, DAY ) AS diff_days
FROM
data
And the output:
Notice that I used a WITH clause (a common table expression) to parse the StartDateTimestamp2 values with PARSE_DATETIME(). Afterwards, I used DATETIME_DIFF() to obtain the difference in days between the first and last values.
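For readers without a BigQuery project at hand, the same parse-then-diff idea can be tried locally. The sketch below is not BigQuery code: it uses SQLite through Python's sqlite3 module to compute MIN and MAX per cohort on the sample rows, then parses the strings with datetime.strptime using the same layout as the PARSE_DATETIME format string. The difference is taken between calendar dates, which as far as I know is how DATETIME_DIFF counts days; treat that as an assumption.

```python
import sqlite3
from datetime import datetime

# Same layout as the '%Y%m%d %H:%M:%S' format passed to PARSE_DATETIME.
FMT = "%Y%m%d %H:%M:%S"

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (employeeid INTEGER, cohortid INTEGER);
CREATE TABLE employee (employeeid INTEGER, startdatetimestamp2 TEXT);
INSERT INTO table1 VALUES (1, 1), (1, 2), (2, 1), (3, 2);
INSERT INTO employee VALUES
  (1, '20080501 01:30:00'), (1, '20090604 01:01:00'),
  (2, '20190115 05:52:14'), (3, '20190330 02:35:26');
""")

# The fixed-width 'yyyymmdd hh:mm:ss' strings sort chronologically,
# so MIN/MAX on the raw text picks the earliest and latest values.
rows = conn.execute("""
SELECT t.cohortid,
       MIN(e.startdatetimestamp2) AS first_raw,
       MAX(e.startdatetimestamp2) AS last_raw
FROM table1 t
JOIN employee e ON e.employeeid = t.employeeid
GROUP BY t.cohortid
ORDER BY t.cohortid
""").fetchall()

diff_days = {}
for cohort_id, first_raw, last_raw in rows:
    first = datetime.strptime(first_raw, FMT)
    last = datetime.strptime(last_raw, FMT)
    # Difference between calendar dates, ignoring the time of day.
    diff_days[cohort_id] = (last.date() - first.date()).days

print(diff_days)
```

The key point is the same as in the BigQuery answer: the aggregates are computed first, and the difference is taken in a later step where both values already exist.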

How do I get the next record based on a condition using a SQL Query?

I have a car hire table which records all the dates a car is on hire Onhire and when it's returned Offhire. I've been asked to provide the next or follow on hire company name in the results table but I'm not sure how to do it. The hire table is structured as follows:
---------------
| Hire |
---------------
| Id |
| CarId |
| Onhire |
| Offhire |
| HireCompany |
|-------------|
If I run a basic select against that table I see the following data. I've added a WHERE to pull back a specific car that is still on hire and has a follow on hire shortly after (I am using UK date formatting).
Id | CarId | Onhire | Offhire | HireCompany
-------------------------------------------------------
10 | 272 | 2019-01-01 | 2019-03-01 | Company A
11 | 272 | 2019-03-02 | 2019-04-01 | Company B
-------------------------------------------------------
As you can see, the car is currently on hire until 01/03/2019 but after that, it is going on hire to Company B on the 02/03/2019. I need my query to show that the car is on hire at the moment but in a column called ForwardHire (or whatever) show the NEXT company that has it on hire as well as a column that shows the next hire start date.
So, my query would produce the following desired result:
Id | CarId | Onhire | Offhire | ForwardHire | ForwardHireDate
---------------------------------------------------------------------------
10 | 272 | 2019-01-01 | 2019-03-01 | Company B | 2019-03-02
Note: I am already aware of how to return a single result from my Hire
table using an outer apply, advice which I got in a different thread.
I hope my question has made sense and that someone can help. In terms of SQL queries, this is a first for me so any advice and guidance are appreciated.
Are you looking for the LEAD function?
SELECT h.*
FROM (SELECT h.*,
LEAD(HireCompany) OVER (PARTITION BY CarID ORDER BY Id) AS ForwardHire,
LEAD(Onhire) OVER (PARTITION BY CarID ORDER BY Id) AS ForwardHireDate
FROM Hire h
) h
WHERE ForwardHire IS NOT NULL AND ForwardHireDate IS NOT NULL;
Using OUTER APPLY:
SELECT
H.*,
T.ForwardHire,
T.ForwardHireDate
FROM
Hire AS H
OUTER APPLY (
SELECT TOP 1 -- Just the next record
ForwardHire = F.HireCompany,
ForwardHireDate = F.OnHire
FROM
Hire AS F
WHERE
H.CarId = F.CarId AND -- With the same car
F.OnHire > H.OffHire -- With later OnHire
ORDER BY
F.OnHire ASC -- Sorted by OnHire (closest one first)
) AS T
Do you just want lead()?
select h.*,
lead(h.hirecompany) over (partition by h.carid order by h.onhire) as next_hirecompany
from hire h;
Note: this will return the next company, even if there are gaps. If you want the "adjacent" next company, then I'd recommend a left join:
select h.*, hnext.hirecompany as next_hirecompany
from hire h left join
hire hnext
on hnext.carid = h.carid and
hnext.onhire = dateadd(day, 1, h.offhire);
Self join the hire table to the row that has the next onhire date:
select
h1.*,
h2.hirecompany ForwardHire,
h2.onhire ForwardHireDate
from hire h1 left join hire h2
on
(h2.carid = h1.carid)
and
(h2.onhire = (select min(onhire) from hire where carid = h1.carid and onhire > h1.offhire) )
where
h1.carid = 272
and
curdate() between h1.onhire and h1.offhire
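To try the LEAD approach from the answers above without a database server, here is a small sketch on the two sample rows. It assumes an in-memory SQLite database via Python's sqlite3 module and an SQLite build of 3.25 or newer (the first release with window functions); the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Hire (Id INTEGER, CarId INTEGER, Onhire TEXT,
                   Offhire TEXT, HireCompany TEXT);
INSERT INTO Hire VALUES
  (10, 272, '2019-01-01', '2019-03-01', 'Company A'),
  (11, 272, '2019-03-02', '2019-04-01', 'Company B');
""")

# LEAD() looks one row ahead within each car's hires, ordered by start date.
# The last hire of a car has no following row, so its LEAD() values are NULL
# and the outer WHERE drops it.
rows = conn.execute("""
SELECT Id, CarId, Onhire, Offhire, ForwardHire, ForwardHireDate
FROM (
  SELECT h.*,
         LEAD(HireCompany) OVER (PARTITION BY CarId ORDER BY Onhire)
           AS ForwardHire,
         LEAD(Onhire) OVER (PARTITION BY CarId ORDER BY Onhire)
           AS ForwardHireDate
  FROM Hire h
)
WHERE ForwardHire IS NOT NULL
""").fetchall()

for row in rows:
    print(row)
```

Ordering by Onhire rather than Id is a choice made for this sketch; it avoids relying on Ids being assigned in chronological order.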

Find name of employees hired on different joining date

I wrote a query to find the employees hired on the same date.
This is the query:
select a.name,b.name,a.joining,b.joining from [SportsStore].[dbo].[Employees] a,
[SportsStore].[dbo].[Employees] b where a.joining = b.joining and a.name>b.name
Then a question popped into my mind: how do I find only those employees who were hired on different dates? I tried something like this:
select a.name,b.name,a.joining,b.joining from [SportsStore].[dbo].[Employees] a,
[SportsStore].[dbo].[Employees] b where a.joining != b.joining and a.name>b.name
But then I realized this doesn't make sense. I thought about a subquery, but it won't work either because we are selecting from two tables.
So I searched and could not find anything.
So the question is: how do we find the names of employees hired on different joining dates?
Join the Employees table with a subquery that counts the employees per joining date.
where j.num = 1 returns employees hired on different dates
where j.num > 1 returns employees hired on the same date
select e.id, e.name, e.joining
from [SportsStore].[dbo].[Employees] e
inner join (select joining, count(*) num
from [SportsStore].[dbo].[Employees]
group by joining) j
on j.joining = e.joining
where j.num = 1;
Result with j.num > 1 (employees who share a joining date):
+----+------+---------------------+
| id | name | joining |
+----+------+---------------------+
| 1 | abc | 01.01.2017 00:00:00 |
+----+------+---------------------+
| 2 | def | 01.01.2017 00:00:00 |
+----+------+---------------------+
| 5 | mno | 01.01.2017 00:00:00 |
+----+------+---------------------+
Result with j.num = 1 (employees with a unique joining date):
+----+------+---------------------+
| id | name | joining |
+----+------+---------------------+
| 3 | ghi | 02.01.2017 00:00:00 |
+----+------+---------------------+
| 4 | jkl | 03.01.2017 00:00:00 |
+----+------+---------------------+
Can check it here: http://rextester.com/OOO96554
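If the rextester link is unavailable, the counting subquery can also be tried locally. The following is a minimal sketch using an in-memory SQLite database through Python's sqlite3 module, with the five rows shown in the result tables; the unqualified table name employees and the helper constant BASE_QUERY are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER, name TEXT, joining TEXT);
INSERT INTO employees VALUES
  (1, 'abc', '01.01.2017'), (2, 'def', '01.01.2017'),
  (3, 'ghi', '02.01.2017'), (4, 'jkl', '03.01.2017'),
  (5, 'mno', '01.01.2017');
""")

# The subquery j counts how many employees share each joining date.
BASE_QUERY = """
SELECT e.id, e.name, e.joining
FROM employees e
JOIN (SELECT joining, COUNT(*) AS num
      FROM employees
      GROUP BY joining) j
  ON j.joining = e.joining
"""

# num = 1: nobody else joined on that date.
unique_rows = conn.execute(
    BASE_QUERY + "WHERE j.num = 1 ORDER BY e.id").fetchall()
# num > 1: at least one other employee joined on the same date.
shared_rows = conn.execute(
    BASE_QUERY + "WHERE j.num > 1 ORDER BY e.id").fetchall()

print("unique joining date:", unique_rows)
print("shared joining date:", shared_rows)
```

Only equality on the joining value matters here, so the dates are kept as plain strings in the format shown in the tables.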
If you just need the names (and not the list of different hiring dates), the following rather simple query should do the job:
select id, name
from employee
group by id, name
having count(distinct joining) > 1
After getting the answer, I found another way to get the same result. Here it is. I hope it's helpful to others, and perhaps someone can explain which approach is better and in which scenario.
select name,joining from [SportsStore].[dbo].[Employees] where joining not in
(
select joining
from [SportsStore].[dbo].[Employees]
group by joining
having count(*)=1
)

Perform right outer join with a condition for left table

I have two tables,
Student:
rollno | name
1 | Abc
2 | efg
3 | hij
4 | klm
Attendance:
name | date |status
Abc | 10-10-2013 | A
efg | 10-10-2013 | A
Abc | 11-10-2013 | A
hij | 25-10-2013 | A
My required output is:
Some query with where condition as "where date between '10-09-2013' and '13-10-2013' "
rollno| name |count
1 | Abc | 2
2 | efg | 1
3 | hij | 0
4 | klm | 0
I tried using:
SELECT p.rollno,p.name,case when s.statuss='A' then COUNT(p.rollno) else '0' end as count
from attendance s
right outer join student p
on s.rollno=p.rollno
where s.date between '10-09-2013' and '13-10-2013'
group by p.rollno,p.regno,p.name,s.statuss
order by p.rollno
And the Output is:
rollno| name |count
1 | Abc | 2
2 | efg | 1
I want the remaining values from the student table to also be appended. I have tried many different queries, but all have been unsuccessful. Is there a query that will return the required output above?
You need to move the criteria from the where to the join:
SELECT p.rollno,p.name,case when s.statuss='A' then COUNT(p.rollno) else 0 end as count
from attendance s
right outer join student p
on s.rollno=p.rollno
and s.date between '10-09-2013' and '13-10-2013'
group by p.rollno,p.regno,p.name,s.statuss
order by p.rollno;
At the moment even though you have an outer join, by referring to the outer table in the where clause you effectively turn it into an inner join. Where there is no match in attendance, s.Date will be NULL, and because NULL is not between '10-09-2013' and '13-10-2013' the rows are excluded.
It is not apparent from the question, but I would imagine that what you are actually looking for is this. It appears you are just after a count of entries in attendance where status = 'A' by student:
SELECT p.rollno,
p.name,
COUNT(s.statuss) as count
from attendance s
right outer join student p
on s.rollno=p.rollno
and s.date between '10-09-2013' and '13-10-2013'
AND s.statuss = 'A'
group by p.rollno,p.regno,p.name
order by p.rollno;
I have removed s.statuss from the group by, and changed the count so that there is only one row per student, rather than one row per status per student. I have changed the column within the count to a column in the attendance table, to ensure that you get a count of 0 when there are no entries in attendance. If you use a column in student you will get a count of 1 even when there are no entries. Finally, since you are only interested in entries with statuss = 'A', I have also moved this to the join condition.
On one final note, it is advisable when using strings for dates to use the culture-insensitive format yyyyMMdd, as this is completely unambiguous: '20130201' is always the 1st of February, and never the 2nd of January, whereas in your query '10-09-2013' could be the 10th of September or the 9th of October, depending on your settings.
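The difference between filtering in WHERE and filtering in the join condition is easy to see on the sample data. The sketch below is an illustration under a few assumptions: it uses an in-memory SQLite database through Python's sqlite3 module, writes the join as student LEFT JOIN attendance (equivalent to the RIGHT OUTER JOIN in the answer, and available in older SQLite versions), joins on name because that is the only shared column in the sample tables, and stores the dates as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (rollno INTEGER, name TEXT);
CREATE TABLE attendance (name TEXT, date TEXT, status TEXT);
INSERT INTO student VALUES (1, 'Abc'), (2, 'efg'), (3, 'hij'), (4, 'klm');
INSERT INTO attendance VALUES
  ('Abc', '2013-10-10', 'A'), ('efg', '2013-10-10', 'A'),
  ('Abc', '2013-10-11', 'A'), ('hij', '2013-10-25', 'A');
""")

# Date filter in WHERE: students without a matching attendance row have
# s.date = NULL, fail the BETWEEN test, and disappear from the result.
FILTER_IN_WHERE = """
SELECT p.rollno, p.name, COUNT(s.status) AS cnt
FROM student p
LEFT JOIN attendance s ON s.name = p.name
WHERE s.date BETWEEN '2013-09-10' AND '2013-10-13'
GROUP BY p.rollno, p.name
ORDER BY p.rollno
"""

# Date and status filters in ON: every student is kept, and COUNT(s.status)
# counts only the attendance rows that actually matched.
FILTER_IN_ON = """
SELECT p.rollno, p.name, COUNT(s.status) AS cnt
FROM student p
LEFT JOIN attendance s
  ON s.name = p.name
 AND s.date BETWEEN '2013-09-10' AND '2013-10-13'
 AND s.status = 'A'
GROUP BY p.rollno, p.name
ORDER BY p.rollno
"""

where_rows = conn.execute(FILTER_IN_WHERE).fetchall()
on_rows = conn.execute(FILTER_IN_ON).fetchall()
print("filter in WHERE:", where_rows)
print("filter in ON:   ", on_rows)
```

The first result reproduces the two-row output from the question, while the second includes hij and klm with a count of 0.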

MIN() Function in SQL

Need help with Min Function in SQL
I have a table as shown below.
+------------+-------+-------+
| Date_ | Name | Score |
+------------+-------+-------+
| 2012/07/05 | Jack | 1 |
| 2012/07/05 | Jones | 1 |
| 2012/07/06 | Jill | 2 |
| 2012/07/06 | James | 3 |
| 2012/07/07 | Hugo | 1 |
| 2012/07/07 | Jack | 1 |
| 2012/07/07 | Jim | 2 |
+------------+-------+-------+
I would like to get the output like below
+------------+------+-------+
| Date_ | Name | Score |
+------------+------+-------+
| 2012/07/05 | Jack | 1 |
| 2012/07/06 | Jill | 2 |
| 2012/07/07 | Hugo | 1 |
+------------+------+-------+
When I use the MIN() function with just the date and Score columns I get the lowest score for each date, which is what I want. I don't care which row is returned if there is a tie in the score for the same date. Trouble starts when I also want the name column in the output. I tried a few variations of SQL (e.g. MIN with a correlated subquery) but had no luck getting the output shown above. Can anyone help please :)
Query is as follows
SELECT DISTINCT
A.USername, A.Date_, A.Score
FROM TestTable AS A
INNER JOIN (SELECT Date_,MIN(Score) AS MinScore
FROM TestTable
GROUP BY Date_) AS B
ON (A.Score = B.MinScore) AND (A.Date_ = B.Date_);
Use this solution:
SELECT a.date_, MIN(name) AS name, a.score
FROM tbl a
INNER JOIN
(
SELECT date_, MIN(score) AS minscore
FROM tbl
GROUP BY date_
) b ON a.date_ = b.date_ AND a.score = b.minscore
GROUP BY a.date_, a.score
SQL-Fiddle Demo
This will get the minimum score per date in the INNER JOIN subselect, which we use to join to the main table. Once we join the subselect, we will only have dates with names having the minimum score (with ties being displayed).
Since we only want one name per date, we then group by date and score, selecting whichever name: MIN(name).
If we want to display the name column, we must use an aggregate function on name to facilitate the GROUP BY on date and score columns, or else it will not work (We could also use MAX() on that column as well).
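The join-then-group approach explained above can be checked against the sample table with a short script. This is a sketch using an in-memory SQLite database through Python's sqlite3 module; the table name tbl follows the answer, and the lowercase column names are an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (date_ TEXT, name TEXT, score INTEGER);
INSERT INTO tbl VALUES
  ('2012/07/05', 'Jack', 1), ('2012/07/05', 'Jones', 1),
  ('2012/07/06', 'Jill', 2), ('2012/07/06', 'James', 3),
  ('2012/07/07', 'Hugo', 1), ('2012/07/07', 'Jack', 1),
  ('2012/07/07', 'Jim', 2);
""")

# Step 1 (subquery b): lowest score per date.
# Step 2 (join): keep only the rows that have that lowest score.
# Step 3 (GROUP BY + MIN(name)): collapse ties to a single name per date.
rows = conn.execute("""
SELECT a.date_, MIN(a.name) AS name, a.score
FROM tbl a
JOIN (SELECT date_, MIN(score) AS minscore
      FROM tbl
      GROUP BY date_) b
  ON a.date_ = b.date_ AND a.score = b.minscore
GROUP BY a.date_, a.score
ORDER BY a.date_
""").fetchall()

for row in rows:
    print(row)
```

On ties, MIN(name) picks the alphabetically first name, which is why 2012/07/07 returns Hugo rather than Jack.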
Please learn about the GROUP BY functionality of RDBMS.
SELECT Date_,Name,MIN(Score)
FROM T
GROUP BY Name
This assumes that each name and each date appears only once, and it will only work in MySQL.
To make it work on other RDBMSs, you need to apply another aggregate function to the Date column, like MAX, MIN, etc.
SELECT T.Name, T.Date_, MIN(T.Score) as Score FROM T
GROUP BY T.Date_
Edit: This answer is not correct, as pointed out by JNK in the comments.
SELECT Date_,MAX(Name),MIN(Score)
FROM T
GROUP BY Date_
Here I am using MAX(Name), which will pick one name if two names are found with the same score.
This will find the minimum score for each day (no duplicates), scored by any player. A name that starts with Z will be picked over a name that starts with A.
Edit: Fixed by removing group by name