SQL values disappear when using max dates - sql

First time posting here. I have a query that I hope someone may be able to help with; I have tried to search for the answer but with no joy.
When I use the SQL below to find a value (in this case eb.annualvalue), it returns multiple values because no end dates have been entered into the eb table, and there are too many employees without end dates for me to close down.
LEFT JOIN
(
SELECT
eb.empid, eb.bencode, eb.currencycode AS [currencycode], eb.notes AS [notes], eb.annualvalue
FROM
employeebenefit AS [eb]
WHERE
eb.bencode IN ('US 401K Plan')
AND (eb.enddate IS NULL OR eb.enddate >= '20180101')
)
AS eb26
ON eb26.empid = e.empid
However, when I use MAX startdate (code below), it returns the correct number of rows, but the eb.annualvalue figure disappears.
LEFT JOIN
(
SELECT
eb.empid, eb.bencode, eb.currencycode AS [currencycode], eb.notes AS [notes], eb.annualvalue
FROM
employeebenefit AS [eb]
WHERE
eb.bencode IN ('US 401K Plan')
AND (eb.enddate IS NULL OR eb.enddate >= '20180101')
AND (eb.startdate = (SELECT MAX(eb.startdate) FROM employeebenefit AS [eb]))
)
AS eb26
ON eb26.empid = e.empid
Any help would be greatly appreciated. Thanks Dan.

This sounds like a greatest-n-per-group problem, you just want one row per employee, from a table with many rows per employee. I'm not 100% clear on how you want to select that one row, but I can give an example.
Ideally, you would use ROW_NUMBER(), but that is only available from SQL Server 2005 onward.
The two common alternatives are:
- Join on your data twice. Once to find the "highest date" per user, again to find the whole row.
- Use a correlated sub-query to work out an individual's best row (still really joining twice)
Simple-self-join:
LEFT JOIN
(
SELECT
empid,
MAX(startdate) AS max_startdate
FROM
employeebenefit
WHERE
bencode IN ('US 401K Plan')
AND (enddate IS NULL OR enddate >= '20180101')
GROUP BY
empid
)
latest_employeebenefit
ON latest_employeebenefit.empid = e.empid
LEFT JOIN
employeebenefit
ON employeebenefit.empid = latest_employeebenefit.empid
AND employeebenefit.startdate = latest_employeebenefit.max_startdate
AND employeebenefit.bencode IN ('US 401K Plan')
AND (employeebenefit.enddate IS NULL OR employeebenefit.enddate >= '20180101')
This has the "feature" that if two such records both match the max_startdate (a tie) then both will come through. Often that is impossible, often it's desirable, it depends on your data and your needs.
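To make the tie behaviour concrete, here is a minimal sketch of the same self-join run against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for SQL Server here, and the `employee` table, the `id` column and all sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (empid INTEGER PRIMARY KEY);
CREATE TABLE employeebenefit (
    id INTEGER PRIMARY KEY, empid INTEGER, bencode TEXT,
    startdate TEXT, enddate TEXT, annualvalue REAL);
INSERT INTO employee VALUES (1), (2), (3);
-- Hypothetical data: employee 2 has two rows with the same startdate (a tie),
-- employee 3 has no benefit rows at all.
INSERT INTO employeebenefit VALUES
    (1, 1, 'US 401K Plan', '2016-01-01', NULL, 1000),
    (2, 1, 'US 401K Plan', '2017-06-01', NULL, 1500),
    (3, 2, 'US 401K Plan', '2017-01-01', NULL, 2000),
    (4, 2, 'US 401K Plan', '2017-01-01', NULL, 2500);
""")

rows = conn.execute("""
SELECT e.empid, eb.annualvalue
FROM employee AS e
LEFT JOIN (
    -- Step 1: latest startdate per employee
    SELECT empid, MAX(startdate) AS max_startdate
    FROM employeebenefit
    WHERE bencode = 'US 401K Plan'
      AND (enddate IS NULL OR enddate >= '2018-01-01')
    GROUP BY empid
) AS latest ON latest.empid = e.empid
LEFT JOIN employeebenefit AS eb
    -- Step 2: fetch the whole row(s) that carry that startdate
    ON eb.empid = latest.empid
   AND eb.startdate = latest.max_startdate
   AND eb.bencode = 'US 401K Plan'
   AND (eb.enddate IS NULL OR eb.enddate >= '2018-01-01')
ORDER BY e.empid, eb.id
""").fetchall()

for row in rows:
    print(row)
```

Employee 1 comes back once with the later row, employee 2 comes back twice because of the tie, and employee 3 is kept by the LEFT JOIN with a NULL value.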
Correlated-sub-query for join:
LEFT JOIN
employeebenefit
ON employeebenefit.id =
(
SELECT TOP(1) lookup.id
FROM employeebenefit AS lookup
WHERE lookup.empid = e.empid -- the correlated bit
AND lookup.bencode IN ('US 401K Plan')
AND (lookup.enddate IS NULL OR lookup.enddate >= '20180101')
ORDER BY lookup.startdate DESC
)
This is slightly different in that it always returns just one row. If there can be a tie when only sorting by startdate it's generally best to add another column to the ORDER BY, even if it's just an id column, to ensure the results are deterministic.
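As a rough illustration of why the correlation matters, the sketch below compares an uncorrelated MAX(startdate) (one date for the whole table, as in the question) with the correlated lookup. It uses SQLite through Python's sqlite3 module as a stand-in for SQL Server, so TOP(1) becomes LIMIT 1; the `employee` table, the `id` column and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (empid INTEGER PRIMARY KEY);
CREATE TABLE employeebenefit (
    id INTEGER PRIMARY KEY, empid INTEGER,
    startdate TEXT, annualvalue REAL);
INSERT INTO employee VALUES (1), (2);
-- Hypothetical data: employee 2's latest startdate is earlier than employee 1's.
INSERT INTO employeebenefit VALUES
    (1, 1, '2016-01-01', 1000),
    (2, 1, '2017-06-01', 1500),
    (3, 2, '2017-01-01', 2000),
    (4, 2, '2017-01-01', 2500);
""")

# Uncorrelated: MAX(startdate) is computed once over the whole table, so only
# employees whose row happens to carry that global date keep a value.
uncorrelated = conn.execute("""
SELECT e.empid, eb.annualvalue
FROM employee AS e
LEFT JOIN employeebenefit AS eb
    ON eb.empid = e.empid
   AND eb.startdate = (SELECT MAX(startdate) FROM employeebenefit)
ORDER BY e.empid
""").fetchall()

# Correlated: the lookup runs per employee; the extra id in the ORDER BY
# breaks the startdate tie so the chosen row is deterministic.
correlated = conn.execute("""
SELECT e.empid, eb.annualvalue
FROM employee AS e
LEFT JOIN employeebenefit AS eb
    ON eb.id = (
        SELECT lookup.id
        FROM employeebenefit AS lookup
        WHERE lookup.empid = e.empid      -- the correlated bit
        ORDER BY lookup.startdate DESC, lookup.id DESC
        LIMIT 1
    )
ORDER BY e.empid
""").fetchall()

print(uncorrelated)
print(correlated)
```

With this data the uncorrelated version loses employee 2's value entirely, while the correlated version returns exactly one row per employee.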

You can use the code bellow , if I undestood your question
OUTER APPLY
(
SELECT TOP 1
eb.empid, eb.bencode, eb.currencycode AS [currencycode], eb.notes AS [notes], eb.annualvalue
FROM
employeebenefit AS [eb]
WHERE
eb.empid = e.empid
AND eb.bencode IN ('US 401K Plan')
AND (eb.enddate IS NULL OR eb.enddate >= '20180101')
ORDER BY
eb.startdate DESC
)
AS eb26

Related

Select latest and 2nd latest date rows per user

I have the following query to select rows where the LAST_UPDATE_DATE field is getting records that have a date value greater than or equal to the last 7 days, which works great.
SELECT 'NEW ROW' AS 'ROW_TYPE', A.EMPLID, B.FIRST_NAME, B.LAST_NAME,
A.BANK_CD, A.ACCOUNT_NUM, ACCOUNT_TYPE, PRIORITY, A.LAST_UPDATE_DATE
FROM PS_DIRECT_DEPOSIT D
INNER JOIN PS_DIR_DEP_DISTRIB A ON A.EMPLID = D.EMPLID AND A.EFFDT = D.EFFDT
INNER JOIN PS_EMPLOYEES B ON B.EMPLID = A.EMPLID
WHERE
B.EMPL_STATUS NOT IN ('T','R','D')
AND ((A.DEPOSIT_TYPE = 'P' AND A.AMOUNT_PCT = 100)
OR A.PRIORITY = 999
OR A.DEPOSIT_TYPE = 'B')
AND A.EFFDT = (SELECT MAX(A1.EFFDT)
FROM PS_DIR_DEP_DISTRIB A1
WHERE A1.EMPLID = A.EMPLID
AND A1.EFFDT <= GETDATE())
AND D.EFF_STATUS = 'A'
AND D.EFFDT = (SELECT MAX(D1.EFFDT)
FROM PS_DIRECT_DEPOSIT D1
WHERE D1.EMPLID = D.EMPLID
AND D1.EFFDT <= GETDATE())
AND A.LAST_UPDATE_DATE >= GETDATE() - 7
What I would like to add to this is the previous (2nd MAX) row per EMPLID, so that I can output the 'old' row (the one that preceded the latest row meeting the above criteria) along with the new row that the query already outputs.
ROW_TYPE EMPLID FIRST_NAME LAST_NAME BANK_CD ACCOUNT_NUM ACCOUNT_TYPE PRIORITY LAST_UPDATE_DATE
NEW ROW 12345 JOHN SMITH 123548999 45234879 C 999 2019-03-06 00:00:00.000
OLD ROW 12345 JOHN SMITH 214080046 92178616 C 999 2018-10-24 00:00:00.000
NEW ROW 56399 CHARLES MASTER 785816167 84314314 C 999 2019-03-07 00:00:00.000
OLD ROW 56399 CHARLES MASTER 345761227 547352 C 999 2017-05-16 00:00:00.000
So the EMPLID would be ordered by NEW ROW, followed by OLD ROW as shown above. In this example the 'NEW ROW' is getting the record that is within the past 7 days, as indicated by the LAST_UPDATE_DATE.
I would like to get feedback on how to modify the query so I can also get the 'old' row (which is the max row that is less than the 'NEW' row retrieved above).
It was a slow day for crime in Gotham, so I gave this a whirl. Might work.
This is unlikely to work right out of the box, though, but it should get you started.
Your LAST_UPDATE_DATE column is on the table PS_DIR_DEP_DISTRIB, so we'll start there. First, you want to identify all of the records that were updated in the last 7 days because those are the only ones you're interested in. Throughout this, I'm assuming, and I'm probably wrong, that the natural key for the table consists of EMPLID, BANK_CD, and ACCOUNT_NUM. You'll want to sub in the actual natural key for those columns in a few places. That said, the date limiter looks something like this:
SELECT
EMPLID
,BANK_CD
,ACCOUNT_NUM
FROM
PS_DIR_DEP_DISTRIB AS limit
WHERE
limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
AND
limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)
Now we'll use that as a correlated sub-query in a WHERE EXISTS clause that we'll correlate back to the base table to limit ourselves to records with natural key values that were updated in the last week. I altered the SELECT list to just SELECT 1, which is typical verbiage for a correlated sub, since it stops looking for a match when it finds one (1), and doesn't actually return any values at all.
Additionally, since we're filtering this record set anyway, I moved all the other WHERE clause filters for this table into this (soon to be) sub-query.
Finally, in the SELECT portion, I added a DENSE_RANK to force an order on the records. We'll use the DENSE_RANK value later to filter off only the first (N) records of interest.
So that leaves us with this:
SELECT
EMPLID
,BANK_CD
,ACCOUNT_NUM
--,ACCOUNT_TYPE --Might belong here. Can't tell without table alias in original SELECT
,PRIORITY
,EFFDT
,LAST_UPDATE_DATE
,DEPOSIT_TYPE
,AMOUNT_PCT
,DENSE_RANK() OVER (PARTITION BY --Add actual natural key columns here...
EMPLID
ORDER BY
LAST_UPDATE_DATE DESC
) AS RowNum
FROM
PS_DIR_DEP_DISTRIB AS sdist
WHERE
EXISTS
(
-- Get the set of records that were last updated in the last 7 days.
-- Correlate to the outer query so it only returns records related to this subset.
-- This uses a correlated subquery. A JOIN will work, too. Try both, pick the faster one.
-- Something like this, using the actual natural key columns in the WHERE
SELECT
1
FROM
PS_DIR_DEP_DISTRIB AS limit
WHERE
--The first two define the date range.
limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
AND limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)
AND
--And these are the correlations to the outer query.
limit.EMPLID = sdist.EMPLID
AND limit.BANK_CD = sdist.BANK_CD
AND limit.ACCOUNT_NUM = sdist.ACCOUNT_NUM
)
AND
(
(
sdist.DEPOSIT_TYPE = 'P'
AND sdist.AMOUNT_PCT = 100
)
OR sdist.PRIORITY = 999
OR sdist.DEPOSIT_TYPE = 'B'
)
Replace the original INNER JOIN to PS_DIR_DEP_DISTRIB with that query. In the SELECT list, the first hard-coded value is now dependent on the RowNum value, so that's a CASE expression now. In the WHERE clause, the dates are all driven by the subquery, so they're gone, several were folded into the subquery, and we're adding WHERE dist.RowNum <= 2 to bring back the top 2 records.
(I also replaced all the table aliases so I could keep track of what I was looking at.)
SELECT
CASE dist.RowNum
WHEN 1 THEN 'NEW ROW'
ELSE 'OLD ROW'
END AS ROW_TYPE
,dist.EMPLID
,emp.FIRST_NAME
,emp.LAST_NAME
,dist.BANK_CD
,dist.ACCOUNT_NUM
,ACCOUNT_TYPE
,dist.PRIORITY
,dist.LAST_UPDATE_DATE
FROM
PS_DIRECT_DEPOSIT AS dd
INNER JOIN
(
SELECT
EMPLID
,BANK_CD
,ACCOUNT_NUM
--,ACCOUNT_TYPE --Might belong here. Can't tell without table alias in original SELECT
,PRIORITY
,EFFDT
,LAST_UPDATE_DATE
,DEPOSIT_TYPE
,AMOUNT_PCT
,DENSE_RANK() OVER (PARTITION BY --Add actual natural key columns here...
EMPLID
ORDER BY
LAST_UPDATE_DATE DESC
) AS RowNum
FROM
PS_DIR_DEP_DISTRIB AS sdist
WHERE
EXISTS
(
-- Get the set of records that were last updated in the last 7 days.
-- Correlate to the outer query so it only returns records related to this subset.
-- This uses a correlated subquery. A JOIN will work, too. Try both, pick the faster one.
-- Something like this, using the actual natural key columns in the WHERE
SELECT
1
FROM
PS_DIR_DEP_DISTRIB AS limit
WHERE
--The first two define the date range.
limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
AND limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)
AND
--And these are the correlations to the outer query.
limit.EMPLID = sdist.EMPLID
AND limit.BANK_CD = sdist.BANK_CD
AND limit.ACCOUNT_NUM = sdist.ACCOUNT_NUM
)
AND
(
(
sdist.DEPOSIT_TYPE = 'P'
AND sdist.AMOUNT_PCT = 100
)
OR sdist.PRIORITY = 999
OR sdist.DEPOSIT_TYPE = 'B'
)
) AS dist
ON
dist.EMPLID = dd.EMPLID
AND dist.EFFDT = dd.EFFDT
INNER JOIN
PS_EMPLOYEES AS emp
ON
emp.EMPLID = dist.EMPLID
WHERE
dist.RowNum <= 2
AND
emp.EMPL_STATUS NOT IN ('T', 'R', 'D')
AND
dd.EFF_STATUS = 'A';
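The ranking and labelling step on its own can be tried out with a small sketch like the one below. It uses SQLite (3.25 or later, for window functions) through Python's sqlite3 module rather than SQL Server, a single simplified `distrib` table, and leaves out the 7-day filter and the joins. The first two rows per employee come from the sample output in the question; the third row for 12345 is invented to show that older rows are dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE distrib (emplid INTEGER, bank_cd TEXT, last_update_date TEXT);
INSERT INTO distrib VALUES
    (12345, '123548999', '2019-03-06'),
    (12345, '214080046', '2018-10-24'),
    (12345, '999999999', '2017-01-15'),  -- invented third row, should be filtered out
    (56399, '785816167', '2019-03-07'),
    (56399, '345761227', '2017-05-16');
""")

rows = conn.execute("""
SELECT CASE ranked.row_num WHEN 1 THEN 'NEW ROW' ELSE 'OLD ROW' END AS row_type,
       ranked.emplid, ranked.bank_cd, ranked.last_update_date
FROM (
    -- Rank each employee's rows from most to least recently updated
    SELECT emplid, bank_cd, last_update_date,
           DENSE_RANK() OVER (PARTITION BY emplid
                              ORDER BY last_update_date DESC) AS row_num
    FROM distrib
) AS ranked
WHERE ranked.row_num <= 2          -- keep only the latest two ranks
ORDER BY ranked.emplid, ranked.row_num
""").fetchall()

for row in rows:
    print(row)
```

Note that DENSE_RANK gives tied dates the same rank, so two rows updated on the same date would both be labelled 'NEW ROW'; ROW_NUMBER would be the choice if exactly two rows per employee are required.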

How to find the min, where TSQL groups by

I have found the first transaction (min), but when I add the column 'Winners', I get a row for their first win and a row for their first loss. I need only the first row, including whether they won or lost. I have tried aggregating the winners column to no avail. I would prefer not to sub-query if possible. Thanks in advance for checking this out.
SELECT
MIN(dbo.ADT.Time) AS FirstShowWager,
dbo.AD.Account, dbo.AD.FirstName,
dbo.AD.LastName, dbo.ADW.Winners
FROM
dbo.BLAH
WHERE
(dbo.ADT.RunDate = CONVERT(DATETIME, '2014-04-12 00:00:00', 102)) AND (dbo.ADW.Pool = N'shw')
GROUP BY
dbo.AD.Account,
dbo.AD.FirstName,
dbo.AD.LastName,
dbo.AD.RunDate,
dbo.ADW.Winners
ORDER BY
dbo.AD.Account
select sorted.*
from
(
SELECT dbo.ADT.Time AS FirstShowWager,
dbo.AD.Account, dbo.AD.FirstName,
dbo.AD.LastName, dbo.ADW.Winners,
ROW_NUMBER ( ) OVER (partition by dbo.AD.Account,
dbo.AD.FirstName,
dbo.AD.LastName,
dbo.AD.RunDate
order by dbo.ADT.Time) as rowNum
FROM dbo.AD
WHERE dbo.ADT.RunDate = CONVERT(DATETIME, '2014-04-12 00:00:00', 102)
AND dbo.ADW.Pool = N'shw'
) as sorted
where rowNum = 1
ROW_NUMBER
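A small sketch of the difference, using SQLite (3.25 or later) through Python's sqlite3 module in place of SQL Server: grouping by the Winners column yields one row per distinct value, while ROW_NUMBER keeps only the earliest wager per account and carries its Winners value along. The single `wagers` table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wagers (account INTEGER, wager_time TEXT, winners REAL);
-- Hypothetical data: each account has one losing (NULL) and one winning wager.
INSERT INTO wagers VALUES
    (1, '2014-04-12 13:05:00', NULL),
    (1, '2014-04-12 14:30:00', 12.5),
    (2, '2014-04-12 12:00:00', 8.0),
    (2, '2014-04-12 15:45:00', NULL);
""")

# Grouping by winners: one row per account AND per distinct winners value.
grouped = conn.execute("""
SELECT account, MIN(wager_time), winners
FROM wagers
GROUP BY account, winners
""").fetchall()

# ROW_NUMBER: number each account's wagers by time and keep only the first.
rows = conn.execute("""
SELECT account, wager_time, winners
FROM (
    SELECT account, wager_time, winners,
           ROW_NUMBER() OVER (PARTITION BY account
                              ORDER BY wager_time) AS row_num
    FROM wagers
) AS sorted
WHERE row_num = 1
ORDER BY account
""").fetchall()

print(len(grouped))
print(rows)
```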
It sounds like you don't care about the value of the Winners column. By grouping on Winners you get multiple rows: one for NULL and others for the non-NULL values. If you don't care about the amount they've won, only whether they won or lost, you can do something like this:
SELECT
MIN(dbo.ADT.Time) AS FirstShowWager,
dbo.AD.Account, dbo.AD.FirstName,
dbo.AD.LastName, CASE WHEN dbo.ADW.Winners IS NULL THEN 0 ELSE 1 END
FROM
dbo.BLAH
WHERE
(dbo.ADT.RunDate = CONVERT(DATETIME, '2014-04-12 00:00:00', 102)) AND (dbo.ADW.Pool = N'shw')
GROUP BY
dbo.AD.Account,
dbo.AD.FirstName,
dbo.AD.LastName,
dbo.AD.RunDate,
dbo.ADW.Winners
ORDER BY
dbo.AD.Account
Use this CASE expression instead of the Winners column, in both the SELECT list and the GROUP BY:
CASE WHEN winners IS NULL THEN 'Lose' ELSE 'Win' END
Usually this is done with a derived table that selects the record you want, which is then joined back to the original table on all the GROUP BY fields.
You can find the MIN in an inner query and then join it to the ADW table on by the ID to get if they are a winner.
SELECT b.*, ADW.Winners
FROM dbo.ADW ADW INNER JOIN (SELECT MIN(ADT.RunTime) AS FirstShowWager,
AD.Account, AD.FirstName,
AD.LastName, AD.ADID
FROM dbo.AD AD INNER JOIN dbo.ADT ADT ON AD.ADID = ADT.ADID
GROUP BY AD.Account, AD.FirstName, AD.LastName, AD.ADID) b
ON ADW.ADID = b.ADID
Assumptions: there is a foreign key from the ADT table to the AD table, and from the ADW table to the AD table.
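The derived-table idea can be sketched on a single made-up `wagers` table, again with SQLite through Python's sqlite3 module standing in for SQL Server. The inner query finds each account's earliest wager time, and the join back on account and time recovers the Winners value of that one row. Table, column names and data are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wagers (account INTEGER, wager_time TEXT, winners REAL);
-- Hypothetical data: each account has one losing (NULL) and one winning wager.
INSERT INTO wagers VALUES
    (1, '2014-04-12 13:05:00', NULL),
    (1, '2014-04-12 14:30:00', 12.5),
    (2, '2014-04-12 12:00:00', 8.0),
    (2, '2014-04-12 15:45:00', NULL);
""")

rows = conn.execute("""
SELECT w.account, w.wager_time, w.winners
FROM wagers AS w
INNER JOIN (
    -- Derived table: earliest wager time per account
    SELECT account, MIN(wager_time) AS first_time
    FROM wagers
    GROUP BY account
) AS first_wager
    ON first_wager.account = w.account
   AND first_wager.first_time = w.wager_time   -- join back to get the full row
ORDER BY w.account
""").fetchall()

print(rows)
```

If two wagers for the same account share the earliest time, both rows come back, so a tie-breaker is still needed in that case.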

Unpivot date columns to a single column of a complex query in Oracle

Hi guys, I am stuck with a stubborn problem which I am unable to solve. I am trying to compile a report in which all the dates coming from different tables need to go into a single date field. Of course, it is the max, i.e. the most recent, date from all these date columns that needs to go into the single date column of the report. I have multiple users of multiple branches/courses for whom the report would be generated.
There are multiple blogs and the latest date w.r.t to the blogtitle needs to be grouped, i.e. max(date_value) from the six date columns should give the greatest or latest date for that blogtitle.
Expected Result:
select u.batch_uid as ext_person_key, u.user_id, cm.batch_uid as ext_crs_key, cm.crs_id, ir.role_id as
insti_role, (CASE when b.JOURNAL_IND = 'N' then
'BLOG' else 'JOURNAL' end) as item_type, gm.title as item_name, gm.disp_title as ITEM_DISP_NAME, be.blog_pk1 as be_blogPk1, bc.blog_entry_pk1 as bc_blog_entry_pk1,bc.pk1,
b.ENTRY_mod_DATE as b_ENTRY_mod_DATE ,b.CMT_mod_DATE as BlogCmtModDate, be.CMT_mod_DATE as be_cmnt_mod_Date,
b.UPDATE_DATE as BlogUpDate, be.UPDATE_DATE as be_UPDATE_DATE,
bc.creation_date as bc_creation_date,
be.CREATOR_USER_ID as be_CREATOR_USER_ID , bc.creator_user_id as bc_creator_user_id,
b.TITLE as BlogTitle, be.TITLE as be_TITLE,
be.DESCRIPTION as be_DESCRIPTION, bc.DESCRIPTION as bc_DESCRIPTION
FROM users u
INNER JOIN insti_roles ir on u.insti_roles_pk1 = ir.pk1
INNER JOIN crs_users cu ON u.pk1 = cu.users_pk1
INNER JOIN crs_mast cm on cu.crsmast_pk1 = cm.pk1
INNER JOIN blogs b on b.crsmast_pk1 = cm.pk1
INNER JOIN blog_entry be on b.pk1=be.blog_pk1 AND be.creator_user_id = cu.pk1
LEFT JOIN blog_CMT bc on be.pk1=bc.blog_entry_pk1 and bc.CREATOR_USER_ID=cu.pk1
JOIN gradeledger_mast gm ON gm.crsmast_pk1 = cm.pk1 and b.grade_handler = gm.linkId
WHERE cu.ROLE='S' AND BE.STATUS='2' AND B.ALLOW_GRADING='Y' AND u.row_status='0'
AND u.available_ind ='Y' and cm.row_status='0' and u.batch_uid='userA_157'
I am getting a resultset for the above query with multiple date columns which I want to put into a single column. The dates have to be the most recent, i.e. the max of the dates in the date columns.
I have successfully done the unpivot by using a view to store the above resultset and put all the dates in one column. However, I do not want to use a view or a table to store the resultset and then unpivot it, simply because I cannot keep creating views for every user one would query for.
The max(date_value) from the date columns need to be put in one single column. They are as follows:
1) b.entry_mod_date, 2) b.cmt_mod_date, 3) be.cmt_mod_date, 4) b.update_date, 5) be.update_date, 6) bc.creation_date
Apologies that I could not provide the desc of all the tables and the fields being used.
Any help to get the above mentioned max of the dates from these multiple date columns into a single column without using a view or a table would be greatly appreciated.
It is not clear what results you want, but the easiest solution is to use greatest().
with t as (
YOURQUERYHERE
)
select t.*,
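greatest(b_ENTRY_mod_DATE, BlogCmtModDate, be_cmnt_mod_Date, BlogUpDate,
be_UPDATE_DATE, bc_creation_date
) as greatestdate -- uses the column aliases from the query above; GREATEST returns NULL if any argument is NULL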
from t;
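One thing to watch with this approach is NULL handling: the LEFT JOIN to the comment table can leave its date NULL, and a row-wise "greatest" then yields NULL. The sketch below shows the effect and a COALESCE workaround using SQLite through Python's sqlite3 module, where the multi-argument scalar max() plays the role of Oracle's GREATEST (it also returns NULL if any argument is NULL). The `blog_dates` table, its column names, the sample rows and the '0001-01-01' placeholder are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE blog_dates (
    blog_title TEXT, entry_mod_date TEXT,
    update_date TEXT, comment_creation_date TEXT);
-- Hypothetical data: Blog B has no comments, so its comment date is NULL.
INSERT INTO blog_dates VALUES
    ('Blog A', '2014-01-10', '2014-02-01', '2014-03-15'),
    ('Blog B', '2014-05-01', '2014-04-20', NULL);
""")

rows = conn.execute("""
SELECT blog_title,
       -- Row-wise maximum; NULL as soon as one input is NULL
       MAX(entry_mod_date, update_date, comment_creation_date) AS naive_latest,
       -- Replace NULLs with a very early placeholder date first
       MAX(COALESCE(entry_mod_date, '0001-01-01'),
           COALESCE(update_date, '0001-01-01'),
           COALESCE(comment_creation_date, '0001-01-01')) AS null_safe_latest
FROM blog_dates
ORDER BY blog_title
""").fetchall()

for row in rows:
    print(row)
```

The dates are stored as ISO-8601 text here, so string comparison orders them correctly; with real DATE columns in Oracle the placeholder would need to be a DATE value as well.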
select <columns>,
case
when greatest (b_ENTRY_mod_DATE) >= greatest (BlogCmtModDate) and greatest(b_ENTRY_mod_DATE) >= greatest(BlogUpDate)
then greatest( b_ENTRY_mod_DATE )
--<same implementation to compare each time BlogCmtModDate and BlogUpDate separately to get the greatest then 'date'>
,<columns>
FROM table
<rest of the query>
UNION ALL
Select <columns>,
case
when greatest (be_cmnt_mod_Date) >= greatest (be_UPDATE_DATE)
then greatest( be_cmnt_mod_Date )
when greatest (be_UPDATE_DATE) >= greatest (be_cmnt_mod_Date)
then greatest( be_UPDATE_DATE )
,<columns>
FROM table
<rest of the query>
UNION ALL
Select <columns>,
GREATEST(bc_creation_date)
,<columns>
FROM table
<rest of the query>

Skip rows for specific time in SQL

I need some help.
I have two timestamp columns, and I want to get the max and min values with a third column showing the time difference. I am skipping any 12 a.m. times, so I used the syntax below. Any help on how to achieve the third column, the time difference, would be appreciated. It is in DB2.
SELECT EMPID,MIN(STARTDATETIME),MAX(ENDDATETIME)
FROM TABLE
WHERE DATE(STARTDATETIME)= '2012-05-15' AND HOUR(STARTDATETIME)<>0 AND HOUR(ENDDATETIME)<>0
GROUP BY EMPID
You can use the results from that in an inner select, and use those values to define the TimeDifference column. My knowledge of DB2 is very limited, so I'm making some assumptions, but this should give you an idea. I'll update the answer if something is drastically incorrect.
Select EmpId,
MinStartDate,
MaxEndDate,
MaxEndDate - MinStartDate As TimeDifference
From
(
Select EMPID,
MIN(STARTDATETIME) As MinStartDate,
MAX(ENDDATETIME) As MaxEndDate
From Table
Where DATE(STARTDATETIME) = '2012-05-15'
And HOUR(STARTDATETIME) <> 0
And HOUR(ENDDATETIME) <> 0
Group By EMPID
) A
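The same inner-select shape can be tried out with SQLite through Python's sqlite3 module. This is only a sketch: SQLite's date functions differ from DB2's, so the hour test uses strftime('%H', ...) and the difference is computed in seconds from strftime('%s', ...). As far as I know, subtracting two timestamps in DB2 returns a timestamp duration rather than a plain number of seconds, so check what your version gives you. The `shifts` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shifts (empid INTEGER, startdatetime TEXT, enddatetime TEXT);
INSERT INTO shifts VALUES
    (1, '2012-05-15 00:10:00', '2012-05-15 08:00:00'),  -- starts in hour 0: skipped
    (1, '2012-05-15 09:00:00', '2012-05-15 12:30:00'),
    (1, '2012-05-15 13:00:00', '2012-05-15 17:15:00');
""")

rows = conn.execute("""
SELECT empid, min_start, max_end,
       -- Difference in whole seconds between the two aggregated timestamps
       CAST(strftime('%s', max_end) AS INTEGER)
     - CAST(strftime('%s', min_start) AS INTEGER) AS diff_seconds
FROM (
    SELECT empid,
           MIN(startdatetime) AS min_start,
           MAX(enddatetime)   AS max_end
    FROM shifts
    WHERE date(startdatetime) = '2012-05-15'
      AND CAST(strftime('%H', startdatetime) AS INTEGER) <> 0
      AND CAST(strftime('%H', enddatetime) AS INTEGER) <> 0
    GROUP BY empid
) AS a
""").fetchall()

print(rows)
```

For this data the span runs from 09:00:00 to 17:15:00, i.e. 8 hours 15 minutes, or 29700 seconds.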

SQL Server adjust each value in a column by another table

I have two tables, TblVal and TblAdj.
In TblVal I have a bunch of values that I need adjusted according to TblAdj for a given TblVal.PersonID and TblVal.Date and then returned in some ViewAdjustedValues. I must apply only those adjustments where TblAdj.Date >= TblVal.Date.
The trouble is that since all the adjustments are either a subtraction or a division, they need to be made in order. Here is the table structure:
TblVal: PersonID, Date, Value
TblAdj: PersonID, Date, SubtractAmount, DivideAmount
I want to return ViewAdjustedValues: PersonID, Date, AdjValue
Can I do this without iterating through TblAdj using a WHILE loop and an IF block to either subtract or divide as necessary? Is there some nested SELECT table magic I can perform that would be faster?
I think you can do it without a loop, but whether you want to or not is another question. A query that I think works is below (SQL Fiddle here). The key ideas are as follows:
Each SubtractAmount has the ultimate effect of subtracting SubtractAmount divided by the product of all later DivideAmounts for the same PersonID. The Date associated with the PersonID isn't relevant to this adjustment (fortunately). The CTE AdjustedAdjustments contains these adjusted SubtractAmount values.
The initial Value for a PersonID gets divided by the product of all DivideAmount values on or after that person's Date.
EXP(SUM(LOG(x))) works as an aggregate product if all values of x are positive. You should constrain your DivideAmount values to assure this, or adjust the code accordingly.
If there are no DivideAmounts, the associated product is NULL and changed to 1. Similarly, NULL sums of adjusted SubtractAmount values are changed to zero. A left join is used to preserve any values that are not subject to any adjustments.
SQL Server 2012 supports an OVER clause for aggregates, which was helpful here to aggregate "all later DivideAmounts."
WITH AdjustedAdjustments AS (
select
PersonID,
Date,
SubtractAmount/
EXP(
SUM(LOG(COALESCE(DivideAmount,1)))
OVER (
PARTITION BY PersonID
ORDER BY Date
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
)
) AS AdjustedSubtract,
DivideAmount
FROM TblAdj
)
SELECT
p.PersonID,
p.Value/COALESCE(EXP(SUM(LOG(COALESCE(DivideAmount,1)))),1)
-COALESCE(SUM(a.AdjustedSubtract),0) AS AmountAdjusted
FROM TblVal AS p
LEFT OUTER JOIN AdjustedAdjustments AS a
ON a.PersonID = p.PersonID
AND a.Date >= p.Date
GROUP BY p.PersonID, p.Value, p.Date;
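The arithmetic behind the query can be checked outside the database. The sketch below is plain Python rather than SQL: it applies a list of adjustments one by one, as a WHILE loop would, and compares the result with the closed form described above, using the exp(sum(log(x))) trick for the products. The function names and the sample adjustments are made up, and each adjustment is a (SubtractAmount, DivideAmount) pair in date order, with None standing in for NULL.

```python
import math

def product_via_logs(values):
    """Product of positive numbers as exp(sum(log(x))); an empty list gives 1."""
    return math.exp(sum(math.log(v) for v in values))

def adjust_sequentially(value, adjustments):
    """Apply each adjustment in date order, as a row-by-row loop would."""
    for subtract, divide in adjustments:
        if subtract is not None:
            value -= subtract
        if divide is not None:
            value /= divide
    return value

def adjust_closed_form(value, adjustments):
    """Same result without iterating state: divide the value by the product of
    all divisors, then remove each subtraction divided by the product of the
    divisors on or after it."""
    divides = [d if d is not None else 1.0 for _, d in adjustments]
    total = value / product_via_logs(divides)
    for i, (subtract, _) in enumerate(adjustments):
        if subtract is not None:
            total -= subtract / product_via_logs(divides[i:])
    return total

# Hypothetical adjustments: subtract 10, divide by 2, subtract 5, divide by 4.
adjustments = [(10.0, None), (None, 2.0), (5.0, None), (None, 4.0)]
sequential = adjust_sequentially(100.0, adjustments)
closed = adjust_closed_form(100.0, adjustments)
print(sequential, closed)
```

Because the product goes through logarithms, the closed form can differ from the loop by a tiny floating-point error, and it only works when every divisor is positive.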
Try something like the following:
with CTE_TblVal (PersonID,Date,Value)
as
(
select A.PersonID, A.Date, A.Value
from TblVal A
inner join TblAdj B
on A.PersonID = B.PersonID
where B.Date >= A.Date
)
update CTE_TblVal
set Date = TblAdj.Date,
Value = TblAdj.Value
from CTE_TblVal
inner join TblAdj
on CTE_Tblval.PersonID = TblAdj.PersonID
output inserted.* into ViewAdjustedValues
select * from ViewAdjustedValues