T-SQL get values for specific group - sql

I have a table EmployeeContract similar like this:
ContractId
EmployeeId
ValidFrom
ValidTo
Salary
12
5
2018-02-01
2019-06-31
x
25
8
2015-01-01
2099-12-31
x
50
5
2019-07-01
2021-05-31
x
52
6
2011-08-01
2021-12-31
x
72
8
2010-08-01
2014-12-31
x
52
6
2011-08-01
2021-12-31
x
Table includes history contracts in company for each employee. I need to get date when employees started work and last date of contract. Sometime records has duplicates.
For example, based on data from above:
EmployeeId
ValidFrom
ValidTo
5
2018-02-01
2021-05-31
8
2010-08-01
2099-12-31
6
2011-08-01
2021-12-31
Base on this article: https://www.techcoil.com/blog/sql-statement-for-selecting-the-latest-record-in-each-group/
I prepared query like this:
select minv.*, maxv.maxvalidto from
(select distinct con.[EmployeeId], mvt.maxvalidto
from [EmployeeContract] con
join (select [EmployeeId], max(validto) as maxvalidto
FROM [EmployeeContract]
group by [EmployeeId]) mvt
on con.[EmployeeId] = mvt.[EmployeeId] and mvt.maxvalidto = con.validto) maxv
join
(select distinct con.[EmployeeId], mvf.minvalidfrom
from [EmployeeContract] con
join (select [EmployeeId], min(validfrom) as minvalidfrom
FROM [EmployeeContract]
group by [EmployeeId]) mvf
on con.[EmployeeId] = mvf.[EmployeeId] and mvf.minvalidfrom = con.validfrom) minv
on minv.[EmployeeId] = maxv.[EmployeeId]
order by 1
But I'm not satisfied, i think it's not easy to read, and probably optimize is poor. How can I do it better?

I think you want group by:
select employeeid, min(validfrom), max(validto)
from employeecontract
group by employeeid

Related

SQL Issue with Group By's . Choose a PK based on condition

I want to display only the most recent appointmentId by CustomerId . I know im doing the query all wrong but how can i achieve the desired result
SELECT TEMP.AppointmentId,TEMP.AppointmentDateTime,TEMP.PatientId,MAX(TEMP.AppointmentDateTime)
FROM
(SELECT
Appointment.Id AS AppointmentId,
Appointment.DateTime AS AppointmentDateTime,
Customer.Id AS CustomerId
FROM Customer
INNER JOIN Appointment ON Appointment.CustomerId = Customer.Id
INNER JOIN CustomerUser ON CustomerUser.CustomerId = Customer.Id
WHERE Appointment.UpdatedAt >= #StartDate AND Appointment.UpdatedAt <= #EndDate
AND Appointment.Is_Active = 1
AND Customer.Is_Active = 1
AND CustomerUser.Is_Active = 1
) AS TEMP
GROUP BY TEMP.AppointmentId,TEMP.AppointmentDateTime,TEMP.CustomerId
This is the result set that i have
AppointmentId AppointmentDateTime CustomerId
8909 2020-12-24 13:39:00 98
8931 2020-12-18 10:30:00 26
8932 2020-12-17 14:30:00 26
8933 2020-11-06 15:30:00 26
8934 2020-12-30 17:31:00 153
8936 2020-12-21 11:06:00 180
8938 2020-12-25 23:00:00 153
8943 2020-12-21 17:45:00 188
9046 2020-12-30 13:49:00 98
But this is the Expected result
AppointmentId AppointmentDateTime CustomerId
8931 2020-12-18 10:30:00 26
8934 2020-12-30 17:31:00 153
8936 2020-12-21 11:06:00 180
8943 2020-12-21 17:45:00 188
9046 2020-12-30 13:49:00 98
I want to display only the most recent appointmentId of a Patient based on the AppointmentDateTime.
One method is a correlated subquery:
select a.*
from Appointment a
where a.AppointmentDateTime = (select max(a2 AppointmentDateTime)
from Appointment a2
where a2.customerid = a.customerid
);
Your query is considerably more complicated than this, including tables and conditions not explained in the question. However, this answers the question that you have asked.
You made it too complicated. You only need one query and you need to remove your aggregated field from the group by statement.
SELECT
MAX(Appointment.Id) AS MaxAppointmentId,
MAX(Appointment.DateTime) AS AppointmentDateTime,
Customer.Id AS CustomerId
FROM Customer
INNER JOIN Appointment ON Appointment.CustomerId = Customer.Id
INNER JOIN CustomerUser ON CustomerUser.CustomerId = Customer.Id
WHERE Appointment.UpdatedAt >= #StartDate AND Appointment.UpdatedAt <= #EndDate
AND Appointment.Is_Active = 1
AND Customer.Is_Active = 1
AND CustomerUser.Is_Active = 1
GROUP BY Customer.Id
I think this will do what you want.
select *
from
( select *, row_number() over (partition by customerid order by AppointmentDateTime desc) as rn
from expandtable ) as t
where rn = 1

IF Else or Case Function for SQL select problem

Hi I would like to make a select expression using case or if/else which seems to be a simple solution from logic perspective but I can't seem to get it to work. Basically I am joining against two table here, the first table is customer record with date filter called min_del_date and then the second table for the model scoring table with BIN and update_date parameters.
There are two logics I want to display
Picking the model score that was the month before min_del_date
If model score month before delivery is greater than 50 (Bin > 50) then pick the model score for same month as min_del_date
My 1st logic code is below
with cust as (
select
distinct cust_no, max(del_date) as del_date, min(del_date) as min_del_date, (EXTRACT(YEAR FROM min(del_date)) -1900)*12 + EXTRACT(MONTH FROM min(del_date)) AS upd_seq
from customer.cust_history
group by 1
)
,model as (
select party_id, model_id, update_date, upd_seq, bin, var_data8, var_data2
from
(
select
party_id, update_date, bin, var_data8, var_data2,
(EXTRACT(YEAR FROM UPDATE_DATE) -1900)*12 + EXTRACT(MONTH FROM UPDATE_DATE) AS upd_seq,
dense_Rank() over (partition by (EXTRACT(YEAR FROM UPDATE_DATE) -1900)*12 + EXTRACT(MONTH FROM UPDATE_DATE) order by update_date desc) as rank1
from
(
select party_id,update_date, bin, var_data8, var_data2
from model.rpm_model
group by party_id,update_date, bin, var_data8, var_data2
) model
)model_final
where rank1 = 1
)
-- Add model scores
-- 1st logic Picking the model score that was the month before delivery date
select *
from
(
select cust.cust_no, cust.del_date, cust.min_del_date, model.upd_seq, model.bin
from cust
left join cust
on cust.cust_no = model.party_id
and cust.upd_seq = model.upd_seq + 1
)a
Now I am struggling in creating the 2nd logic in the same query?.. any assistance would be appreciated
cust table
cust_no
min_del_date
upd_seq
123
2021-01-11
1453
234
2020-06-29
1446
456
2020-07-20
1447
model table
party_id
update_date
upd_seq
BIN
123
2020-11-30
1451
22
123
2020-12-25
1452
54
123
2020-01-11
1453
14
234
2020-05-23
1445
76
234
2020-06-18
1446
48
234
2020-07-23
1447
12
456
2020-06-18
1446
23
456
2020-07-23
1447
39
456
2020-08-21
1448
21
desired results
cust_no
min_del_date
model.upd_seq
update_date
BIN
123
2021-01-11
1453
2020-01-11
14
234
2020-06-29
1446
2020-06-18
48
456
2020-07-20
1446
2020-06-18
23
Update
I managed to find the solution by myself, thanks for everyone who has attending this question. The solution is per below
select a.cust_no, a.del_date, a.min_del_date, b.update_date, b.upd_seq, b.bin
from
(
select cust.cust_no, cust.del_date, cust.min_del_date,
CASE WHEN model.BIN <=50 THEN model.upd_seq WHEN BIN > 50 THEN model.upd_seq +1 ELSE NULL END as upd_seq
from cust
inner join model
on cust.cust_no = model.party_id
and cust.upd_seq = model.upd_seq + 1
)a
inner join model b
on a.cust_no = b.party_id
and a.upd_seq = b.upd_seq

Take the last row Group By date

I need to select content statistics group By Date.
Here example of records :
id cid viewCount created_at
1 1 50 31-12-2018 18:00:00
2 1 50 01-01-2019 18:00:00
3 2 50 01-01-2019 18:00:00
4 2 100 01-01-2019 19:00:00
5 2 150 01-01-2019 20:00:00
6 3 1000 01-01-2019 15:00:00
Need to return :
id cid viewCount date
1 1 50 31-12-2018
2 1 50 01-01-2019
5 2 150 01-01-2019
6 3 1000 01-01-2019
I tried the following code
$qb = $this->createQueryBuilder('c');
$qb->select('a.id as id')
->addSelect('COALESCE(SUM(a.viewCount),0) as viewCount')
->addSelect('DATE_FORMAT(a.createdAt, \'%d-%m-%Y\') as date');
->innerJoin('c.analytics', 'a')
->groupBy('c.cid')
->addGroupBy('date')
->orderBy('a.createdAt', 'ASC');
return:
id cid viewCount date
1 1 50 31-12-2018
2 1 50 01-01-2019
3 2 50 01-01-2019
4 2 100 01-01-2019
5 2 150 01-01-2019
6 3 1000 01-01-2019
I have tried to create a subquery :
$qbLastHour = $this->createQueryBuilder('cc');
$qbLastHour->select('MAX(DATE_FORMAT(aa.createdAt, \'%H\'))')
->innerJoin('cc.analytics', 'aa')
->where('cc.id=c.id')
->groupBy('cc.cid')
->addGroupBy('s');
$qb->addSelect(sprintf("(%s) AS r", $qbLastHour->getDQl()));
But something go wrong because i dont groupBy date at the subquery.
If someone can help me. Thank you
Update
Here is an attempt, in sql again, to select only one row per date and cid based on the max time per day
SELECT id, c.cid, viewCount, max_date
FROM content a
JOIN content_analytic c ON a.id = c.content_id
RIGHT JOIN (SELECT c.cid, DATE_FORMAT(created_at, '%d-%m-%Y') dt, MAX(created_at) max_date
FROM content a
JOIN content_analytic c ON a.id = c.content_id
GROUP BY dt, c.cid) x ON x.max_date = a.created_at and x.cid = c.cid
This is how I believe the query should be in pure sql
SELECT c.cid, COALESCE(SUM(a.viewCount), 0), DATEFORMAT(a.created_at, ‘%d-%m-%Y’) as date
FROM content a
INNER JOIN content_analytic c ON a.id = c.content_id
GROUP BY c.cid, date
ORDER BY date

How duplicate a rows in SQL base on difference between date columns and divided aggregated column per duplicate row?

I have a table with some records about fuel consumption. The important columns in the table are: CONSUME_DATE_FROM and CONSUM_DATE_TO.
I want to calculate average fuel consumption per cars on a monthly basis but some rows are not in the same month. For example some have a three month difference between them and the total of gas per litre is aggregated in a single row.
Now I should find records that have difference more than a month between CONSUME_DATE_FROM and CONSUM_DATE_TO, and duplicate them in current or second table per count of month and divide the total gas per litre between related rows.
I've this table with the following data:
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER
1 100 2018-10-25 2018-12-01 600
2 101 2018-07-19 2018-07-24 100
3 102 2018-12-31 2019-01-01 400
4 103 2018-03-29 2018-05-29 200
5 104 2018-02-05 2018-02-09 50
The expected output table should be as below
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER
1 100 2018-10-25 2018-12-01 200
1 100 2018-10-25 2018-12-01 200
1 100 2018-10-25 2018-12-01 200
2 101 2018-07-19 2018-07-24 100
3 102 2018-12-31 2019-01-01 200
3 102 2018-12-31 2019-01-01 200
4 103 2018-03-29 2018-05-29 66.66
4 103 2018-03-29 2018-05-29 66.66
4 103 2018-03-29 2018-05-29 66.66
5 104 2018-02-05 2018-02-09 50
Or as below
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER DATE_RELOAD_GAS
1 100 2018-10-25 2018-12-01 200 2018-10-01
1 100 2018-10-25 2018-12-01 200 2018-11-01
1 100 2018-10-25 2018-12-01 200 2018-12-01
2 101 2018-07-19 2018-07-24 100 2018-07-01
3 102 2018-12-31 2019-01-01 200 2018-12-01
3 102 2018-12-31 2019-01-01 200 2019-01-01
4 103 2018-03-29 2018-05-29 66.66 2018-03-01
4 103 2018-03-29 2018-05-29 66.66 2018-04-01
4 103 2018-03-29 2018-05-29 66.66 2018-05-01
5 104 2018-02-05 2018-02-09 50 2018-02-01
Can someone please help me out with this query?
I'm using oracle database
Your business rule treats the difference between CONSUME_DATE_FROM and CONSUM_DATE_TO as absolute months. So you expect the difference between 2018-10-25 and 2018-12-01 to be three months whereas the difference in days actually equates to about 1.1 months. So we can't use simple date arithmetic to get your desired output, we need to do some additional massaging of the dates.
The query below implements your desired logic by deriving the first day of the month for CONSUME_DATE_FROM and the last day of the month for CONSUME_DATE_TO, then using ceil() to round the difference up to the nearest whole number of months.
This is calculated in a subquery which is used in the main query with the old connect by level trick to multiply a record by level number of times:
with cte as (
select f.*
, ceil(months_between(last_day(CONSUM_DATE_TO)
, trunc(CONSUME_DATE_FROM,'mm'))) as diff
from fuel_consumption f
)
select cte.id
, cte.VehicleId
, cte.CONSUME_DATE_FROM
, cte.CONSUM_DATE_TO
, cte.GAS_PER_LITER/cte.diff as GAS_PER_LITER
, add_months(trunc(cte.CONSUME_DATE_FROM, 'mm'), level-1) as DATE_RELOAD_GAS
from cte
connect by level <= cte.diff
and prior cte.id = cte.id
and prior sys_guid() is not null
;
"what about if add a additional column "DATE_RELOAD_GAS" that display difference date for similar rows"
From your posted sample it seems like DATE_RELOAD_GAS is the first day of the month for each month bounded by CONSUME_DATE_FROM and CONSUM_DATE_TO. I have amended my solution to implement this rule.
By using connect by level structure with considering to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') as month I was able to resolve as below :
select ID, VehicleId, myMonth, CONSUME_DATE_FROM, CONSUM_DATE_TO,
trunc(GAS_PER_LITER/max(rn) over (partition by ID order by ID),2) as GAS_PER_LITER,
'01.'||substr(myMonth,5,2)||'.'||substr(myMonth,1,4) as DATE_RELOAD_GAS
from
(
with consumption( ID, VehicleId, CONSUME_DATE_FROM, CONSUM_DATE_TO, GAS_PER_LITER ) as
(
select 1,100,date'2018-10-25',date'2018-12-01',600 from dual union all
select 2,101,date'2018-07-19',date'2018-07-24',100 from dual union all
select 3,102,date'2018-12-31',date'2019-01-01',400 from dual union all
select 4,103,date'2018-03-29',date'2018-05-29',200 from dual union all
select 5,104,date'2018-02-05',date'2018-02-09', 50 from dual
)
select ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') myMonth,
VehicleId, c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, GAS_PER_LITER,
row_number() over (partition by ID order by ID) as rn
from dual join consumption c
on c.ID >= 2
group by ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm'), VehicleId,
c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, c.GAS_PER_LITER
connect by level <= c.CONSUM_DATE_TO - c.CONSUME_DATE_FROM + 1
union all
select ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') myMonth,
VehicleId, c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, GAS_PER_LITER,
row_number() over (partition by ID order by ID) as rn
from dual join consumption c
on c.ID = 1
group by ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm'), VehicleId,
c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, c.GAS_PER_LITER
connect by level <= c.CONSUM_DATE_TO - c.CONSUME_DATE_FROM + 1
) q
group by ID, VehicleId, myMonth, CONSUME_DATE_FROM, CONSUM_DATE_TO, GAS_PER_LITER, rn
order by ID, myMonth;
I met an interesting issue that if I consider the join condition in the subquery as c.ID >= 1 query hangs on for huge period of time, so splitted into two parts by union all
as c.ID >= 2 and c.ID = 1
Rextester Demo

CREATE TEMP TABLE BASED ON SELECT DISTINCT ON 3 COLUMNS BUT WITH 1 EXTRA COLUMN

I need to make a temporary file with in it:
Partcode, MutationDate, MovementType, Qty
Every partcode has multiple mutationdates per Movementtype (there are max 9 movementtypes possible)
I need to get the last mutationdate per movementtype per partcode and the quantity that goes with that.
An example with partcode 003307
003307 2018-05-31 1 -100
003307 2018-06-11 2 -33
003307 2018-04-25 3 +25
and so on for all 9 movementtypes.
What did I get so far:
create table #LMUT(
MutationDate T_Date
,PartCode T_Code_Part
,CumInvQty T_Quantum_Qty10_3
,MovementType T_Type_PMOverInvt
)
insert #LMUT(
MutationDate,
Partcode,
CumInvQty,
MovementType)
SELECT
cast (max(MOV.MutationDate) as date)
,MOV.PartCode
,INV.MutationQty
,INV.PMOverInvtType
FROM dbo.T_PartMovementMain as MOV
inner join dbo.T_PartMovementOverInvt as INV on
INV.PMMainCode=MOV.PMMainCode
WHERE
MOV.PartMovementType = 1
group by MOV.PartCode,INV.PMOverInvtType,INV.MutationQty,MOV.MutationDate
SELECT * FROM #LMUT where partcode='003007'
drop table #LMUT
results in:
2016-12-06 00:00:00.000 003007 -24.000 2
2016-09-29 00:00:00.000 003007 -24.000 2
2016-11-09 00:00:00.000 003007 -24.000 2
2016-11-22 00:00:00.000 003007 -24.000 2
2016-10-26 00:00:00.000 003007 -24.000 2
2016-09-12 00:00:00.000 003007 -42.000 2
2016-10-13 00:00:00.000 003007 -24.000 2
2016-12-03 00:00:00.000 003007 100.000 5
2017-01-12 00:00:00.000 003007 -48.000 2
2016-10-04 00:00:00.000 003007 306.000 7
Not what I need, still have 8 times type 2
What else have I tried:
SELECT distinct MOV.Partcode,INV.PMOverInvtType,mov.MutationDate
FROM dbo.T_PartMovementMain as MOV
inner join dbo.T_PartMovementOverInvt as INV on
INV.PMMainCode=MOV.PMMainCode
WHERE
mov.MutationDate = (SELECT MAX (c.MutationDate) FROM
dbo.T_PartMovementMain as c
inner join dbo.T_PartMovementOverInvt as d on D.PMMainCode=c.PMMainCode
WHERE
C.PartMovementType = 1 AND
C.PartCode=mov.PartCode AND
D.PMMainCode = C.PMMainCode AND
D.PMOverInvtType=inv.PMOverInvtType
)
and MOV.PartMovementType = 1 and mov.partcode='003007'
order by MOV.Partcode,INV.PMOverInvtType
Results in:
3007 2 2017-01-12 00:00:00.000
3007 5 2016-12-03 00:00:00.000
3007 7 2016-10-04 00:00:00.000
That is what I want but I need to get the Qty too.
use row_number() window function
with cte as
( SELECT MOV.*,INV.*,
row_number() over(partition by INV.PMOverInvtType order by MOV.MutationDate desc)rn
FROM dbo.T_PartMovementMain as MOV
inner join dbo.T_PartMovementOverInvt as INV on
INV.PMMainCode=MOV.PMMainCode
) select cte.* from cte where rn=1
Solved it like this:
create table #LMUT(
PartCode T_Code_Part
,MovementType T_Type_PMOverInvt
,MutationDate T_Date
,CumInvQty T_Quantum_Qty10_3
)
insert #LMUT(Partcode,MovementType,MutationDate,CumInvQty)
select Artikel,Type,Datum,Aant
from (
SELECT MOV.Partcode as Artikel,INV.PMOverInvtType as Type,mov.MutationDate as Datum,INV.MutationQty as Aant,
row_number() over(partition by MOV.Partcode,INV.PMOverInvtType order by MOV.Partcode,INV.PMOverInvtType,MOV.MutationDate desc) rn
FROM dbo.T_PartMovementMain as MOV
inner join dbo.T_PartMovementOverInvt as INV on INV.PMMainCode=MOV.PMMainCode) cse
where rn=1
select * from #LMUT order by Partcode
drop table #LMUT