Name Change SQL Query - Positive Attrition - sql

I'm using SQL Server Management Studio, and I've got an Employee table which has a record for every time something changes with an employee, be it manager, pay scale, etc. Whenever a change is made, an EffectiveEndDateKey is set to the last day that the record was relevant, and the next record has the following day as its EffectiveBeginDateKey.
What I'm trying to do is extract the last record an employee had BEFORE they changed their job title in the last month. The goal is IF an employee changes their job title within a given month that would count as "Positive Attrition" and we're trying to figure out how much positive attrition we get in a given month. (It's always for the previous month so in the where statement I'm pulling just changes for the previous month).
Take a look below:
In this example, on July 4th, John Doe went from being an Apple Store Clerk to being a manager the next day, so there was a job title change. What I want is to pull the record in the red box (John's last day, in July, when he was an Apple Store Clerk before he became a manager) because that tells me that an EffectiveEndDateKey had a change that resulted in a job title change.
So the where statement is going to have a Cast in it to convert the EffectiveEndDateKey to a date and then look at last months data, pulling only records that have EffectiveEndDateKeys from last month (July) and what I need help with is the part where those records must ALSO have a different job title.
If, say, someone changed job titles on July 31st (so their new job title/EffectiveBeginDateKey was 20130801), that would still count as 1 positive attrition and we'd want to pull the last record from July 31st.
Any thoughts?

You can do this with a self-join:
select eprev.*
from Employee e join
Employee eprev
on e.EmployeeId = eprev.EmployeeId and
cast(cast(e.EffectiveBeginDateKey as varchar(255)) as datetime) =
cast(cast(eprev.EffectiveEndDateKey as varchar(255)) as datetime) + 1
where cast(cast(e.EffectiveBeginDateKey as varchar(255)) as datetime) >= dateadd(mm, -1, getdate()) and
eprev.JobTitle <> e.JobTitle;
The key here is the conversion of the number to a datetime. The format YYYYMMDD is easily convertible when it is a string. So, convert the number to a string first, then to a datetime. The rest is just the mechanics of the join.
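To make the join logic concrete, here is a minimal Python sketch of the same idea (not the T-SQL itself). It assumes each record is a dict with hypothetical field names `employee_id`, `job_title`, `begin_key` and `end_key`, converts the YYYYMMDD keys to dates via a string, and pairs each record with the one that ended the day before it began. The previous-month filter from the WHERE clause is left out for brevity.

```python
from datetime import datetime, timedelta

def key_to_date(key):
    # An integer such as 20130704 is turned into a string first, then parsed as YYYYMMDD.
    return datetime.strptime(str(key), "%Y%m%d").date()

def last_records_before_title_change(records):
    # Index every record by (employee, end date) so the "previous" record can be looked up.
    by_end = {(r["employee_id"], key_to_date(r["end_key"])): r for r in records}
    changed = []
    for e in records:
        # The previous record ends exactly one day before this one begins.
        prev_end = key_to_date(e["begin_key"]) - timedelta(days=1)
        eprev = by_end.get((e["employee_id"], prev_end))
        if eprev is not None and eprev["job_title"] != e["job_title"]:
            changed.append(eprev)
    return changed

# Made-up sample data: employee 1 changes title on July 5th, employee 2 does not.
records = [
    {"employee_id": 1, "job_title": "Apple Store Clerk", "begin_key": 20130101, "end_key": 20130704},
    {"employee_id": 1, "job_title": "Manager", "begin_key": 20130705, "end_key": 20131231},
    {"employee_id": 2, "job_title": "Clerk", "begin_key": 20130101, "end_key": 20130731},
    {"employee_id": 2, "job_title": "Clerk", "begin_key": 20130801, "end_key": 20131231},
]
result = last_records_before_title_change(records)
```

With this sample data only the Apple Store Clerk record for employee 1 is returned, which corresponds to the row the question wants to pull.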

Since your keys are effectively just dates in ISO format (yyyymmdd), you don't even have to convert them to datetime; you can simply look up the next row:
select E.EmployeeID, E.JobTitle, ENEXT.JobTitle as NextJobTitle
from Employee as E
outer apply (
select top 1 T.JobTitle
from Employee as T
where
T.EmployeeID = E.EmployeeID and
T.EffectiveEndDateKey > E.EffectiveEndDateKey
order by T.EffectiveEndDateKey asc
) as ENEXT
where
E.JobTitle <> ENEXT.JobTitle and
E.EffectiveEndDateKey >= convert(nvarchar(8), dateadd(mm, datediff(mm, 0, getdate()) - 1, 0), 112)
see SQL FIDDLE example
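The reason no conversion is needed is that yyyymmdd keys sort in the same order as the dates they represent. A small Python sketch of that property, and of building the "first day of the previous month" key that the last WHERE condition compares against, might look like this (the function name is hypothetical and only for illustration):

```python
from datetime import date, timedelta

def previous_month_start_key(today):
    # Step back to the last day of the previous month, then format as yyyymm + "01".
    last_of_prev = today.replace(day=1) - timedelta(days=1)
    return last_of_prev.strftime("%Y%m") + "01"

# yyyymmdd strings compare in the same order as the underlying dates.
keys = ["20130801", "20130704", "20121231"]
dates = [date(2013, 8, 1), date(2013, 7, 4), date(2012, 12, 31)]
same_order = sorted(keys) == [d.strftime("%Y%m%d") for d in sorted(dates)]
```

For a run date in August 2013 the key is "20130701", so any end-date key from July onwards passes the comparison.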

Related

Get the latest full week's data for analysis in SQL

I was given sales data, where I have items and sales on a particular date. Now, the company wants to analyze the latest full week’s data against the total sales of company.
Item    date        Sales
Apple   08/25/2020  10
Orange  08/24/2020  20
Orange  08/21/2020  30
Now the full week is defined as a complete week from Sunday to Saturday. In the made-up example above, the two rows Apple 08/25/2020 10 and Orange 08/24/2020 20 fall on a Tuesday and a Monday respectively, so that week is not complete and we cannot use its data. We need to check the previous week's data, which would be the row for 08/21/2020.
I was given 10 minutes to think on this. My immediate solution was to find the weekday number for the maximum date in the table and subtract it from 7. If that is equal to 0 then we have a full week, and we can take the max date as the end date of our analysis and use dateadd() to go back from the max date to get the start date. If I have something other than 0, then I use dateadd to go back from my max date to the most recent Saturday and use that as the end date, then again go back from there to get the start date.
CREATE TABLE SALES(Item nvarchar(10), dates date, Sales Numeric);
INSERT INTO SALES VALUES('Apple',CAST('08/25/2020' AS DATE),10),
('Orange',CAST('08/24/2020' AS DATE),20),
('Orange',CAST('08/21/2020' AS DATE),30);
-- Assumes SET DATEFIRST 7 (the us_english default), so DATEPART(dw, ...) is 1 for Sunday and 7 for Saturday
WITH end_dates AS
(
SELECT CASE WHEN 7-DATEPART(dw, max(dates))=0 THEN max(dates)
ELSE DATEADD(day,- DATEPART(dw, max(dates)),max(dates)) END AS end_date
FROM SALES
),
Full_Week_Date AS
(
SELECT DATEADD(day,-6,end_date) as start_date ,end_date FROM end_dates
)
SELECT (SELECT SUM(SALES.sales)*100 FROM SALES JOIN Full_Week_Date ON(dates BETWEEN start_date AND end_date))
/(SELECT SUM(SALES.sales) FROM SALES) AS revenue_per;
This is the best I could think of, but the interviewer said that, given a large amount of data, this would run forever. What would be an optimal solution for this problem? I only want to know how to get the start and end dates of the week that I want to analyze. The rest, the revenue and % revenue, will be fairly easy I believe once I have this in place.
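For reference, the week-boundary rule described above (take the max date, fall back to the most recent Saturday, then go back six days to the Sunday) can be sketched in Python. This is only an illustration of the date arithmetic, not a tuned query; the function name is hypothetical.

```python
from datetime import date, timedelta

def latest_full_week(max_date):
    # Python's weekday(): Monday = 0 ... Saturday = 5, Sunday = 6.
    # Days to step back to reach the most recent Saturday (0 if max_date is a Saturday).
    days_since_saturday = (max_date.weekday() - 5) % 7
    end_date = max_date - timedelta(days=days_since_saturday)
    start_date = end_date - timedelta(days=6)  # the Sunday that opens that week
    return start_date, end_date

start_date, end_date = latest_full_week(date(2020, 8, 25))
```

For the sample data the max date is 08/25/2020, which gives the week 08/16/2020 to 08/22/2020, the week that contains the 08/21/2020 row.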
In an actual database the query to use will depend on indexes and other things. For a basic answer, there are a few things to consider here.
They state "a complete week from Sunday-Saturday". Unless you are reporting at 11:59PM on Saturday you will never really have that full week's sales in the same week. Since that is the case, there is no reason to do all the checks you mentioned. They will cause unnecessary processing.
One thing you didn't mention is whether the total company sales include the week you are checking. I am going to assume they want to exclude that week's sales.
I am not going to claim this is the most efficient way, but I would do it like this.
-- Assumed table definition (not shown in the original answer); SalesDate and SalesAmt come from the query below, Item is a guess
CREATE TABLE #Sales (Item nvarchar(10), SalesDate date, SalesAmt numeric);
INSERT INTO #Sales
VALUES
('Apple', '08/25/2020', 10),
('Orange', '08/24/2020', 20),
('Orange', '08/21/2020', 30),
('Apple', '11/14/2020', 25);
-- Get week to check (last week)
DECLARE @curWeek int = DATEPART(WW, DATEADD(wk, -1, GETDATE()));
-- Get Sales
SELECT
SUM(COALESCE(SalesAmt, 0)) AS CompanySales
, (SELECT SUM(COALESCE(SalesAmt, 0)) FROM #Sales WHERE DATEPART(WW, SalesDate) = @curWeek) AS WeekSales
FROM #Sales
WHERE DATEPART(WW, SalesDate) <= @curWeek;

Update table based on last date of previous month

Please would you advise how I could create a column which showed a timestamp/date for each row indicating the last day of the previous month. For example:
Name Surname DOB Timestamp
John Smith 1970/04/20 2015/02/28
Cindy Smith 1975/03/20 2015/02/28
Now this could be for 5000 people and I've just given 2 rows to show you what I mean.
CREATE table employees (Name NVARCHAR(30),Surname NVARCHAR (30),DOB DATETIME,Timestamp DATETIME)
To tackle the problem of the dates not showing hours,minutes, seconds, I am using
CONVERT(CHAR(10),Timestamp,113)
Do you use a While loop or something to create a column which shows the same timestamp for each row?
Thanks.
I think the easiest way is to just subtract the day of the month from the date:
select t.*, dateadd(day, -day(timestamp), timestamp)
from table t;
EDIT: In an update:
update t
set timestamp = dateadd(day, -day(dob), dob)
In addition, you shouldn't use convert to remove the time component of a date; you should simply cast to date. If dob had a time component (which seems unlikely):
update t
set timestamp = cast(dateadd(day, -day(dob), dob) as date)
Assuming the table has records whose Timestamp column is NULL, the following update query will set all of them to the previous month's last day.
UPDATE employees
SET [Timestamp] = CAST(DATEADD(DAY,-1,DATEADD(month, DATEDIFF(month, 0, GETDATE()), 0))AS DATE)
WHERE [Timestamp] IS NULL
The inner DATEADD finds the first day of the current month, and the outer DATEADD subtracts one day, which gives the last day of the previous month.
When the month in GETDATE() is April, the records with NULL values will be updated with 31-Mar-2015. When GETDATE() moves into May, the records with NULL values will be updated with 30-Apr-2015, i.e. it won't update records that already have a value.
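Both answers rely on the same date arithmetic, which is easy to check outside the database. A small Python sketch (function names are hypothetical) of "first of the month minus one day" and "subtract the day of the month":

```python
from datetime import date, timedelta

def last_day_of_previous_month(d):
    # First day of d's month, minus one day.
    return d.replace(day=1) - timedelta(days=1)

def subtract_day_of_month(d):
    # Equivalent trick: step back by the day-of-month number.
    return d - timedelta(days=d.day)

example = last_day_of_previous_month(date(2015, 3, 15))
```

Both functions return the same date for any input, including leap years and the January to December rollover.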

Selecting sets of data and creating a new column in SQL Server

In SQL Server can you select the first set of values (i.e. week numbers 1 - 52) give them another value in a new column, then select the next lot of values.
The problem I am having is the data table I am working on has week numbers for each financial year, which starts the first Sunday after 1 October. So it simply iterates 1 - 52 for each financial year.
I am trying to make a column in a view that grabs the first 52 and gives them a financial year value of 1, then grabs the next 52 and gives them a financial year value of 2, etc. (obviously with year 1 starting at the first record). I also have the Week Ending Date column to work with.
Here is a snippet of the table:
Is this possible?
Leave the Sundays and Octobers aside. If I understand correctly, you only need to assign a rank to each occurrence of a week number, in order of the ending dates.
Please try this (but use copy of the table or transaction to check first; of course T is name of your table):
update T
set fiscal_year = YearNumbers.FiscalYear
from T
inner join
(
select WeekEndingDate, WeekNumber, DENSE_RANK() over (partition by WeekNumber order by WeekEndingDate) as FiscalYear
from T
) as YearNumbers
on T.WeekEndingDate = YearNumbers.WeekEndingDate and T.WeekNumber = YearNumbers.WeekNumber
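To see why ranking within each week number yields the fiscal year, here is a rough Python equivalent of the DENSE_RANK step. The input shape, a list of (week ending date, week number) tuples, and the sample dates are assumptions made up for illustration.

```python
from collections import defaultdict
from datetime import date

def assign_fiscal_years(rows):
    # Group the distinct ending dates by week number (the PARTITION BY part).
    endings_by_week = defaultdict(set)
    for week_ending, week_number in rows:
        endings_by_week[week_number].add(week_ending)
    # Within each week number, rank the ending dates in ascending order (the ORDER BY part).
    fiscal_year = {}
    for week_number, endings in endings_by_week.items():
        for rank, week_ending in enumerate(sorted(endings), start=1):
            fiscal_year[(week_ending, week_number)] = rank
    return fiscal_year

rows = [
    (date(2012, 10, 7), 1), (date(2012, 10, 14), 2),
    (date(2013, 10, 6), 1), (date(2013, 10, 13), 2),
]
years = assign_fiscal_years(rows)
```

The first time week 1 appears it gets fiscal year 1, the second time it gets 2, and the same holds for every other week number.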

SQL Server 2005, Calculating upcoming birthdays from date of birth

This one has bugged me for a while now. Recently when revisiting some code I wrote for a customer a few years ago I was wondering if there is a more elegant solution to the problem.
The customer stores all of their clients' information, including date of birth (a datetime field).
They run an extract every Monday that retrieves any customer whose birthday will fall within the following week.
I.e. if the extract was run on Monday Jan 1st, Customers whose birthday fell between (and including) Monday Jan 8th -> Sunday Jan 14th would be retrieved.
My solution was to use the Datepart(dy) function and calculate all upcoming birthdays based on the customer's date of birth converted to day of year, adding some logic to allow for the extract being run at the end of a year.
The problem was that using day of year throws results off by 1 day if the customer was born in a leap year and/or the extract is run in a leap year after the 29th of Feb, so once again I had to add more logic so the procedure returned the expected results.
This seemed like overkill for what should be a simple task.
For simplicity let’s say the table 'customer' contains 4 fields, first name, last name, dob, and address.
Any suggestions on how to simplify this would really be appreciated
Wes
Would something like this work for you?
select * from Customers c
where dateadd(year, 1900-year(dob), dob)
between dateadd(year, 1900-year(getdate()), getdate())
and dateadd(year, 1900-year(getdate()), getdate())+7
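The idea of moving every birthday into a common year, rather than comparing day-of-year numbers, can be sketched in Python as follows. This is an illustration under assumptions, not a translation of the query: the function names are hypothetical, the window is the Monday-to-Sunday week after the run date as described in the question, and a Feb 29 birthday is treated as Feb 28 in non-leap years (which is also what DATEADD(year, ...) does in SQL Server).

```python
from datetime import date, timedelta

def birthday_in_year(dob, year):
    # Move the birthday into the given year; Feb 29 falls back to Feb 28 in non-leap years.
    try:
        return dob.replace(year=year)
    except ValueError:
        return date(year, 2, 28)

def following_week(run_date):
    # Window starts 7 days after the run date and covers 7 days in total.
    start = run_date + timedelta(days=7)
    return start, start + timedelta(days=6)

def has_birthday_between(dob, start, end):
    # Checking both years handles a window that straddles New Year.
    return any(start <= birthday_in_year(dob, y) <= end for y in (start.year, end.year))

start, end = following_week(date(2024, 1, 1))
upcoming = has_birthday_between(date(1980, 1, 10), start, end)
```

Because the comparison is done on real dates in the window's own year(s), no leap-year offset correction is needed.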
Why not use DATEPART(wk) on this year's birthday?
SET DATEFIRST 1 -- Set first day of week to monday
SELECT * FROM customer
WHERE DATEPART(wk, DATEADD(yy, DATEPART(yy, GETDATE()) - DATEPART(yy, customer.dob), customer.dob)) = DATEPART(wk, GETDATE()) + 1
It selects all customers whose birthday's week number is one greater than the current week number.
I think DATEADD should do the proper thing.
YEAR(GETDATE() - dbo.Patients.Dob) - 1900
I can safely assume you will never have customers born before 1900
Please Try This one.
SELECT TOP 10 BirthDate, FirstName
FROM Customers
WHERE DATEPART(mm,BirthDate) >= DATEPART(mm,GETDATE())
AND DATEPART(day,BirthDate) >= DATEPART(day,getdate())
OR DATEPART(mm,BirthDate) > DATEPART(mm,getdate())
ORDER BY DatePart(mm,BirthDate),DatePart(day,BirthDate)
This query will get upcoming birthdays, including today itself.

T-SQL absence by month from start date end date

I have an interesting query to do and am trying to find the best way to do it. Basically I have an absence table in our personnel database; it records the staff id and then a start date and end date for the absence. The end date is null if not yet entered (not returned). I cannot change the design.
They would like a report by month on number of absences (12 month trend). With staff being off over the month change it obviously may be difficult to calculate.
e.g. Staff off 25/11/08 to 05/12/08 (dd/MM/yy) I would want the days in November to go into the November count and the ones in December in the December count.
I am currently thinking in order to count the number of days I need to separate the start and end date into a record for each day of the absence, assigning it to the month it is in. then group the data for reporting. As for the ones without an end date I would assume null is the current date as they are presently still absent.
What would be the best way to do this?
Any better ways?
Edit: This is SQL 2000 server currently. Hoping for an upgrade soon.
I have had a similar issue where there has been a table of start/end dates designed for data storage but not for reporting.
I sought out the "fastest executing" solution and found that it was to create a 2nd table with the monthly values in there. I populated it with the months from Jan 2000 to Jan 2070. I'm expecting it will suffice or that I get a large pay cheque in 2070 to come and update it...
CREATE TABLE months (start DATETIME)
-- Populate with all month start dates that may ever be needed
-- And I would recommend indexing / primary keying by start
SELECT
months.start,
data.id,
-- Clamp each absence to the month's boundaries before counting days
SUM(DATEDIFF(DAY,
CASE WHEN data.start < months.start THEN months.start ELSE data.start END,
CASE WHEN data.[end] > DATEADD(month, 1, months.start) THEN DATEADD(month, 1, months.start) ELSE data.[end] END
)) AS days
FROM
data
INNER JOIN
months
ON data.start < DATEADD(month, 1, months.start)
AND data.[end] > months.start
GROUP BY
months.start,
data.id
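The per-month calculation is just the overlap between the absence and each month. A minimal Python sketch of that overlap, assuming the end date is exclusive (as DATEDIFF between the two dates would count it) and using a hypothetical function name:

```python
from datetime import date, timedelta

def days_per_month(start, end):
    # Returns {(year, month): days of absence falling in that month}; end is exclusive.
    result = {}
    month_start = start.replace(day=1)
    while month_start < end:
        # Day 28 plus 4 days always lands in the next month; snap back to its first day.
        next_month = (month_start.replace(day=28) + timedelta(days=4)).replace(day=1)
        overlap = (min(end, next_month) - max(start, month_start)).days
        if overlap > 0:
            result[(month_start.year, month_start.month)] = overlap
        month_start = next_month
    return result

split = days_per_month(date(2008, 11, 25), date(2008, 12, 5))
```

For the absence from 25/11/08 to 05/12/08 this gives 6 days in November and 4 in December. An open-ended absence could be handled by passing today's date as the end, as the question suggests.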
That join can be quite slow for various reasons, I'll search out another answer to another question to show why and how to optimise the join.
EDIT:
Here is another answer relating to overlapping date ranges and how to speed up the joins...
Query max number of simultaneous events