How can I search by two consecutive rows in SQL?

Given the SQL table
id date employee_type employee_level
1 10/01/2015 other 2
1 09/13/2011 full-time 1
1 09/25/2010 intern 1
2 09/25/2013 full-time 3
2 09/25/2011 full-time 2
2 09/25/2008 full-time 1
3 09/23/2015 full-time 5
3 09/23/2013 full-time 4
Is it possible to search for ids that have one row with employee_type "intern" and, directly above it in the table (same id, next later date), a row with employee_type "full-time"?
In this case, id 1 meets my requirement.
Thanks a lot!

Assuming that you mean the row for the same id with the next later date, you can use lead(), an ANSI standard function supported by most databases:
select t.*
from table t
where t.id in (select id
from (select t.*,
lead(employee_type) over (partition by id order by date) as next_et
from table t
) tt
where tt.employee_type = 'intern' and tt.next_et = 'full-time'
);
If your database doesn't support lead(), you can do something similar with correlated subqueries.
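As a rough illustration of the correlated-subquery route, here is a minimal sketch that runs the idea against the question's sample rows using Python's built-in sqlite3 module. The table name emp and the ISO-formatted date strings are assumptions made for the example (ISO strings sort chronologically, which MM/DD/YYYY text would not); the subquery simply fetches the employee_type of the next later row for the same id.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
# Assumption: dates are stored as ISO 'YYYY-MM-DD' text so they sort correctly.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (id INTEGER, date TEXT, employee_type TEXT, employee_level INTEGER)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [
        (1, "2015-10-01", "other", 2),
        (1, "2011-09-13", "full-time", 1),
        (1, "2010-09-25", "intern", 1),
        (2, "2013-09-25", "full-time", 3),
        (2, "2011-09-25", "full-time", 2),
        (2, "2008-09-25", "full-time", 1),
        (3, "2015-09-23", "full-time", 5),
        (3, "2013-09-23", "full-time", 4),
    ],
)

# For each intern row, look up the type of the next later row with the same id.
query = """
SELECT DISTINCT t.id
FROM emp AS t
WHERE t.employee_type = 'intern'
  AND (SELECT t2.employee_type
       FROM emp AS t2
       WHERE t2.id = t.id AND t2.date > t.date
       ORDER BY t2.date
       LIMIT 1) = 'full-time'
"""
matching_ids = [row[0] for row in conn.execute(query)]
print(matching_ids)  # [1]
```

LIMIT is SQLite syntax; other databases would use TOP or FETCH FIRST in the subquery instead.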

I believe the request isn't as described in the question; instead, what you appear to want is to list all rows for people who have been interns.
SELECT
t1.*
FROM yourtable AS t1
INNER JOIN (
SELECT DISTINCT
id
FROM yourtable
WHERE employee_type = 'intern'
) AS t2 ON t1.id = t2.id
;
Alternatively, you might want only those people who have been both 'intern' and 'full-time', in which case you could use the query below, which uses a HAVING clause:
SELECT
t1.*
FROM yourtable AS t1
INNER JOIN (
SELECT id
FROM yourtable
WHERE employee_type = 'intern'
OR employee_type = 'full-time'
GROUP BY id
HAVING COUNT(DISTINCT employee_type) > 1
) AS t2 ON t1.id = t2.id
;

Related

Combining access sql tables in a query side by side

I have 2 tables containing different data, linked by a column "id", except the id is repeated multiple times
For example,
Table 1:
id grade
1 A
1 C
Table 2:
Id company
1 Alpha
1 Beta
1 Charlie
The number of rows would be inconsistent, table 1 may sometimes have more/less/equal rows compared to table 2. How am I able to combine/merge them into this outcome:
id grade company
1 A Alpha
1 C Beta
1 Charlie
I am using a Microsoft Access query.
This is a real pain in MS Access. But you can do it by using a subquery to generate sequence numbers. Here is one method assuming that the rows are unique:
select id, max(grade) as grade, max(company) as company
from ((select id, grade, null as company,
(select count(*)
from table1 as tt1
where tt1.id = t1.id and tt1.grade <= t1.grade
) as seqnum
from table1 as t1
) union all
(select id, null as grade, company,
(select count(*)
from table2 as tt2
where tt2.id = t2.id and tt2.company <= t2.company
) as seqnum
from table2 as t2
)
) t12
group by id, seqnum;
This would be much simpler in almost any other database.
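To check the sequence-number idea outside Access, here is a minimal sketch that runs it on the question's sample data with Python's built-in sqlite3 module. This is SQLite rather than Access SQL, so treat it as an illustration of the technique only; the explicit ORDER BY is added for a stable result and, like the answer, it assumes the values within each id are unique.

```python
import sqlite3

# In-memory copies of the two tables from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, grade TEXT)")
conn.execute("CREATE TABLE table2 (id INTEGER, company TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, "A"), (1, "C")])
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)", [(1, "Alpha"), (1, "Beta"), (1, "Charlie")]
)

# Number the rows of each table within an id by counting rows that sort at or
# before the current one, then pair rows that share the same sequence number.
query = """
SELECT id, MAX(grade) AS grade, MAX(company) AS company
FROM (
    SELECT t1.id, t1.grade, NULL AS company,
           (SELECT COUNT(*) FROM table1 AS tt1
            WHERE tt1.id = t1.id AND tt1.grade <= t1.grade) AS seqnum
    FROM table1 AS t1
    UNION ALL
    SELECT t2.id, NULL AS grade, t2.company,
           (SELECT COUNT(*) FROM table2 AS tt2
            WHERE tt2.id = t2.id AND tt2.company <= t2.company) AS seqnum
    FROM table2 AS t2
) AS t12
GROUP BY id, seqnum
ORDER BY id, seqnum
"""
combined = conn.execute(query).fetchall()
for row in combined:
    print(row)
```

The third row comes back with no grade (None in Python), which matches the blank cell in the desired outcome.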

How to select the top 3 values from a group based on date and exclude duplicate value?

Suppose I have three columns: one with an ID, one with a value and one with a date. For example, the ID column has ID1, ID2, ID3, and each ID has several numeric values, say 1, 2, 3, 4, 5.
How do I get only the 3 most recent rows for each ID, ordered by date descending?
I am using Sybase SQL. Is there any way I can write this?
I tried to use Row_number() and rank() but I don't get to use either of those functions with my SQL tool.
ID value Date
1 3 20190511
1 1 20190503
1 5 20190401
2 2 20190520
2 1 20190514
2 4 20190503
3 1 20190516
3 5 20190415
3 3 20190402
If you don't have row_number(), try this:
SELECT *
FROM yourTable t1
WHERE (SELECT COUNT(*)
FROM yourTable t2
WHERE t1.id = t2.id
AND t1.date < t2.date) < 3
A row that has 3 or more newer rows for the same id won't appear.
With row_number():
SELECT *
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) as rn
FROM YourTable t1
) as t
WHERE t.rn <= 3
I assume you can't have multiple rows with the same date. If you can, you may want to use RANK() or DENSE_RANK() and decide how to handle ties.
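To see the COUNT(*) filter at work, here is a minimal sketch using Python's built-in sqlite3 module (SQLite, not Sybase, so only the technique carries over). It loads the question's rows for ids 1 and 2 plus one hypothetical fourth row for id 1, added purely so that something gets filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INTEGER, value INTEGER, date INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        (1, 3, 20190511),
        (1, 1, 20190503),
        (1, 5, 20190401),
        (1, 2, 20190301),  # hypothetical extra row, not in the question's data
        (2, 2, 20190520),
        (2, 1, 20190514),
        (2, 4, 20190503),
    ],
)

# Keep a row only when fewer than 3 rows of the same id have a newer date.
query = """
SELECT t1.id, t1.value, t1.date
FROM yourTable AS t1
WHERE (SELECT COUNT(*)
       FROM yourTable AS t2
       WHERE t2.id = t1.id
         AND t2.date > t1.date) < 3
ORDER BY t1.id, t1.date DESC
"""
top3 = conn.execute(query).fetchall()
for row in top3:
    print(row)
```

The hypothetical 20190301 row has three newer rows for id 1, so it is the only one dropped.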
One method uses a correlated subquery with in:
select t.*
from t
where t.date in (select top (3) t2.date
from t t2
where t2.id = t.id
order by t2.date desc
);
Note that this assumes that the dates are unique.

SQL Case depending on previous status of record

I have a table containing the status of records. Something like this:
ID STATUS TIMESTAMP
1 I 01-01-2016
1 A 01-03-2016
1 P 01-04-2016
2 I 01-01-2016
2 P 01-02-2016
3 P 01-01-2016
I want to write a CASE expression that takes the newest version of each row and, for every P that has at some point been an I, reports 'G' instead of 'P'.
When I try to do something like
Select case when ID in (select ID from TABLE where ID = 'I') else ID END as status)
From TABLE
where ID in (select max(ID) from TABLE)
I get an error that this isn't possible using IN when casing.
So my question is, how do I do it then?
Want to end up with:
ID STATUS TIMESTAMP
1 G 01-04-2016
2 G 01-02-2016
3 P 01-01-2016
DBMS is IBM DB2
Have a derived table which returns each id with its newest timestamp. Join with that result:
select t1.ID, t1.STATUS, t1.TIMESTAMP
from tablename t1
join (select id, max(timestamp) as max_timestamp
from tablename
group by id) t2
ON t1.id = t2.id and t1.TIMESTAMP = t2.max_timestamp
Will return both rows in case of a tie (two rows with same newest timestamp.)
Note that ANSI SQL has TIMESTAMP as reserved word, so you may need to delimit it as "TIMESTAMP".
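The newest-row join can be combined with an EXISTS check for an 'I' status to get the 'G' relabelling the question asks for. Below is a minimal sketch under a few assumptions: it runs on SQLite through Python's built-in sqlite3 module rather than DB2, the table is called status_log, the timestamp column is renamed ts to sidestep the reserved word, and the sample dates are read as MM-DD-YYYY and stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE status_log (id INTEGER, status TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO status_log VALUES (?, ?, ?)",
    [
        (1, "I", "2016-01-01"),
        (1, "A", "2016-01-03"),
        (1, "P", "2016-01-04"),
        (2, "I", "2016-01-01"),
        (2, "P", "2016-01-02"),
        (3, "P", "2016-01-01"),
    ],
)

# Join each id to its newest timestamp, then relabel a newest 'P' as 'G'
# when the same id has any row with status 'I'.
query = """
SELECT t1.id,
       CASE WHEN t1.status = 'P'
                 AND EXISTS (SELECT 1 FROM status_log AS i
                             WHERE i.id = t1.id AND i.status = 'I')
            THEN 'G' ELSE t1.status END AS status,
       t1.ts
FROM status_log AS t1
JOIN (SELECT id, MAX(ts) AS max_ts
      FROM status_log
      GROUP BY id) AS t2
  ON t1.id = t2.id AND t1.ts = t2.max_ts
ORDER BY t1.id
"""
newest = conn.execute(query).fetchall()
for row in newest:
    print(row)
```

As with the join itself, ties on the newest timestamp would produce more than one row per id.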
You can do this by using a common table expression to find all IDs that have had a status of 'I', and then using an outer join with your table to determine which IDs have had a status of 'I' at some point.
To get the final result (with only the newest record), you can use the row_number() OLAP function and select only the "newest" record, as shown in the ranked common table expression below:
with irecs (ID) as (
select distinct
ID
from
TABLE
where
status = 'I'
),
ranked as (
select
rownumber() over (partition by t.ID order by t.timestamp desc) as rn,
t.id,
case when t.status = 'P' and i.id is not null then 'G' else t.status end as status,
t.timestamp
from
TABLE t
left outer join irecs i
on t.id = i.id
)
select
id,
status,
timestamp
from
ranked
where
rn = 1;
Another solution:
with yourtableranked as (
select f1.id,
case when (select count(*) from yourtable f2 where f2.ID=f1.ID and f2."TIMESTAMP"<f1."TIMESTAMP" and f2.STATUS='I')>0 then 'G' else f1.STATUS end as STATUS,
rownumber() over(partition by f1.id order by f1."TIMESTAMP" desc, rrn(f1) desc) rang,
f1."TIMESTAMP"
from yourtable f1
)
select * from yourtableranked f0
where f0.rang=1
ANSI SQL has TIMESTAMP as reserved word, so you may need to delimit it as "TIMESTAMP"
Alternatively, try this:
select distinct f1.id, f4.*
from yourtable f1
inner join lateral
(
select
case when (select count(*) from yourtable f3 where f3.ID=f2.ID and f3."TIMESTAMP"<f2."TIMESTAMP" and f3.STATUS='I')>0 then 'G' else f2.STATUS end as STATUS,
f2."TIMESTAMP"
from yourtable f2 where f2.ID=f1.ID
order by f2."TIMESTAMP" desc, rrn(f2) desc
fetch first 1 rows only
) f4 on 1=1
Ordering by rrn(f2) breaks ties when several rows share the latest timestamp.
ANSI SQL has TIMESTAMP as reserved word, so you may need to delimit it as "TIMESTAMP"

Is there something equivalent to putting an order by clause in a derived table?

This is sybase 15.
Here's my problem.
I have 2 tables.
t1.jobid t1.date
------------------------------
1 1/1/2012
2 4/1/2012
3 2/1/2012
4 3/1/2012
t2.jobid t2.userid t2.status
-----------------------------------------------
1 100 1
1 110 1
1 120 2
1 130 1
2 100 1
2 130 2
3 100 1
3 110 1
3 120 1
3 130 1
4 110 2
4 120 2
I want to find all the people whose status for THEIR two most recent jobs is 2.
My plan was to take the top 2 of a derived table that joined t1 and t2 and was ordered by date backwards for a given user. So the top two would be the most recent for a given user.
So that would give me that individuals most recent job numbers. Not everybody is in every job.
Then I was going to make an outer query that joined against the derived table searching for status 2's with a having a sum(status) = 4 or something like that. That would find the people with 2 status 2s.
But sybase won't let me use an order by clause in the derived table.
Any suggestions on how to go about this?
I can always write a little program to loop through all the users, but I was going to try to make one horrendous SQL statement out of it.
Juicy one, no?
You could rank the rows in the subquery by adding an extra column using a window function. Then select the rows that have the appropriate ranks within their groups.
I've never used Sybase, but the documentation seems to indicate that this is possible.
With Table1 As
(
Select 1 As jobid, '1/1/2012' As [date]
Union All Select 2, '4/1/2012'
Union All Select 3, '2/1/2012'
Union All Select 4, '3/1/2012'
)
, Table2 As
(
Select 1 jobid, 100 As userid, 1 as status
Union All Select 1,110,1
Union All Select 1,120,2
Union All Select 1,130,1
Union All Select 2,100,1
Union All Select 2,130,2
Union All Select 3,100,1
Union All Select 3,110,1
Union All Select 3,120,1
Union All Select 3,130,1
Union All Select 4,110,2
Union All Select 4,120,2
)
, MostRecentJobs As
(
Select T1.jobid, T1.date, T2.userid, T2.status
, Row_Number() Over ( Partition By T2.userid Order By T1.date Desc ) As JobCnt
From Table1 As T1
Join Table2 As T2
On T2.jobid = T1.jobid
)
Select *
From MostRecentJobs As M2
Where Not Exists (
Select 1
From MostRecentJobs As M1
Where M1.userid = M2.userid
And M1.JobCnt <= 2
And M1.status <> 2
)
And M2.JobCnt <= 2
I'm using a number of features here which do exist in Sybase 15. First, I'm using common-table expressions both for my sample data and to clump my queries together. Second, I'm using the ranking function Row_Number to order the jobs by date.
It should be noted that in the example data you gave, no user satisfies the requirement of having their two most recent jobs both be of status "2".
Edit
If you are using a version of Sybase that does not support ranking functions (e.g. Sybase 15 prior to 15.2), then you need to simulate the ranking function using Counts.
Create Table #JobRnks
(
jobid int not null
, userid int not null
, status int not null
, [date] datetime not null
, JobCnt int not null
, Primary Key ( jobid, userid, [date] )
)
Insert #JobRnks( jobid, userid, status, [date], JobCnt )
Select T1.jobid, T1.userid, T1.status, T1.[date], Count(T2.jobid)+ 1 As JobCnt
From (
Select T1.jobid, T2.userid, T2.status, T1.[date]
From #Table2 As T2
Join #Table1 As T1
On T1.jobid = T2.jobid
) As T1
Left Join (
Select T1.jobid, T2.userid, T2.status, T1.[date]
From #Table2 As T2
Join #Table1 As T1
On T1.jobid = T2.jobid
) As T2
On T2.userid = T1.userid
And T2.[date] > T1.[date]
Group By T1.jobid, T1.userid, T1.status, T1.[date]
Select *
From #JobRnks As J1
Where Not Exists (
Select 1
From #JobRnks As J2
Where J2.userid = J1.userid
And J2.JobCnt <= 2
And J2.status <> 2
)
And J1.JobCnt <= 2
The reason for using the temp table here is for performance and ease of reading. Technically, you could plug in the query for the temp table into the two places used as a derived table and achieve the same result.
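For a quick check of the count-based ranking, here is a minimal sketch that runs on SQLite through Python's built-in sqlite3 module instead of Sybase. The helper name users_with_two_recent_status2 is hypothetical, the dates are stored as ISO text (reading the question's dates as M/D/YYYY), and rank 1 is the most recent job because the count looks at newer jobs for the same user. User 140 in the second call is invented purely to show a positive match.

```python
import sqlite3

# Jobs from the question, with dates as ISO text (assumed M/D/YYYY originally).
JOBS = [(1, "2012-01-01"), (2, "2012-04-01"), (3, "2012-02-01"), (4, "2012-03-01")]

# (jobid, userid, status) rows from the question.
ASSIGNMENTS = [
    (1, 100, 1), (1, 110, 1), (1, 120, 2), (1, 130, 1),
    (2, 100, 1), (2, 130, 2),
    (3, 100, 1), (3, 110, 1), (3, 120, 1), (3, 130, 1),
    (4, 110, 2), (4, 120, 2),
]


def users_with_two_recent_status2(assignments):
    """Return userids whose two most recent jobs both have status 2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (jobid INTEGER, date TEXT)")
    conn.execute("CREATE TABLE t2 (jobid INTEGER, userid INTEGER, status INTEGER)")
    conn.executemany("INSERT INTO t1 VALUES (?, ?)", JOBS)
    conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)", assignments)
    query = """
    WITH joined AS (
        SELECT t2.userid, t2.status, t1.date
        FROM t1 JOIN t2 ON t1.jobid = t2.jobid
    ),
    ranked AS (
        -- rank = 1 + number of newer jobs for the same user
        SELECT a.userid, a.status,
               (SELECT COUNT(*) FROM joined AS b
                WHERE b.userid = a.userid AND b.date > a.date) + 1 AS job_rank
        FROM joined AS a
    )
    SELECT userid
    FROM ranked
    WHERE job_rank <= 2
    GROUP BY userid
    HAVING COUNT(*) = 2 AND MIN(status) = 2 AND MAX(status) = 2
    ORDER BY userid
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


original_matches = users_with_two_recent_status2(ASSIGNMENTS)
# Hypothetical user 140 with status 2 on jobs 2 and 4.
extended_matches = users_with_two_recent_status2(
    ASSIGNMENTS + [(2, 140, 2), (4, 140, 2)]
)
print(original_matches, extended_matches)  # [] [140]
```

The HAVING COUNT(*) = 2 condition excludes users with only one job; whether such users should qualify is a judgment call the question leaves open.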

How to filter out records grouped by date with a large date difference

I have some records, grouped by name and date.
I would like to find any records in a table that have a date difference between them larger than a week, from the most recent record.
Would this be possible to do with a cte?
I am thinking something along these lines (it is difficult to explain)
; with mycte as (
select *
from #GroupedRecords)
select *
from mycte a
join (select *
from #GroupedRecords) b on a.Name = b.Name
where datediff(day, a.DateCreated, b.DateCreated) > 7
For example:
Id Name Date
1 Foo 02/03/2010
2 Bar 23/02/2010
3 Ram 21/01/2010
4 Foo 29/02/2010
5 Foo 22/02/2010
6 Foo 05/12/2009
The results should be:
Id Name Date
1 Foo 02/03/2010
5 Foo 22/02/2010
6 Foo 05/12/2009
You can try:
SELECT id,
name,
DATE
FROM groupedrecords AS gr1
WHERE ( (SELECT MAX(DATE) AS md
FROM groupedrecords gr2
WHERE gr1.name = gr2.name) - gr1.DATE ) > 7;
Or probably better yet:
SELECT gr1.id,
gr1.name,
gr1.DATE
FROM groupedrecords AS gr1
INNER JOIN (SELECT name,
MAX(DATE) AS md
FROM groupedrecords AS gr2
GROUP BY name) AS q1
ON gr1.name = q1.name
WHERE ( q1.md - gr1.DATE ) > 7;
UPDATE: As suggested in the comments, here is a version that uses a union to get the id with the max date per group AND the ids of the records that are more than 7 days older than the max date. I used a CTE for fun; it was not necessary. Note that if more than one ID shares the max date in a group, this query will need to be modified:
WITH CTE
AS (SELECT name,
Max(date) AS MD
FROM Records
GROUP BY name)
SELECT R.ID,
R.name,
R.date
FROM CTE
INNER JOIN Records AS R
ON CTE.Name = R.Name
AND CTE.MD = R.date
UNION ALL
SELECT r1.id,
r1.name,
r1.DATE
FROM Records AS R1
INNER JOIN CTE
ON CTE.name = R1.name
WHERE ( CTE.md - R1.DATE ) > 7
ORDER BY name ASC,
date DESC
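To try the join against the per-name maximum date, here is a minimal sketch on SQLite via Python's built-in sqlite3 module. Several things are assumptions for the sake of a runnable example: the dates are read as DD/MM/YYYY and stored as ISO text, the day difference is computed with SQLite's julianday() rather than direct date subtraction, and the question's 29/02/2010 (not a real calendar date) is replaced by 2010-02-28.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE groupedrecords (id INTEGER, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO groupedrecords VALUES (?, ?, ?)",
    [
        (1, "Foo", "2010-03-02"),
        (2, "Bar", "2010-02-23"),
        (3, "Ram", "2010-01-21"),
        (4, "Foo", "2010-02-28"),  # stands in for the question's 29/02/2010
        (5, "Foo", "2010-02-22"),
        (6, "Foo", "2009-12-05"),
    ],
)

# Rows that are more than 7 days older than the newest row with the same name.
query = """
SELECT gr1.id, gr1.name, gr1.date
FROM groupedrecords AS gr1
JOIN (SELECT name, MAX(date) AS md
      FROM groupedrecords
      GROUP BY name) AS q1
  ON gr1.name = q1.name
WHERE julianday(q1.md) - julianday(gr1.date) > 7
ORDER BY gr1.id
"""
stale_ids = [row[0] for row in conn.execute(query)]
print(stale_ids)  # [5, 6]
```

This returns only the older rows; the most recent row per name (id 1 for Foo) is what the UNION ALL branch adds back.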
I wonder if this gets close to a solution:
; with tableWithRow as (
select *, row_number() over (order by name, date) as rowNum
from t
)
select t1.*, t2.id t2id, t2.name t2name, t2.date t2date, t2.rowNum t2rowNum
from tableWithRow t1
join tableWithRow t2
on t1.rowNum = t2.rowNum + 1 and t1.name = t2.name