How to freeze the historical data? - sql

I have an assignment that involves employee data; see the Excel screenshot for an example.
I need to freeze the data every month: if the data changes in August, it should not affect the data that has already been uploaded for July. As you can see, the red-marked cells should be removed for August, but they should still exist for July.
I am using a Redshift database and will be creating the reports in Tableau. There will be a filter to select the month, and the data will be filtered accordingly. My solution is to add a column named uploaded month, which tracks how many employees were active for a particular Job-ID in each month.
Is there a better way to do this? With this solution the history is kept, but not from the beginning of time. I am looking for a solution in SQL, Tableau, or PySpark.
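The "uploaded month" idea can be sketched as a separate snapshot table that receives a copy of the live rows once per month. This is only a minimal sketch: the table names `employee_current` and `employee_snapshot`, the columns `emp_id` and `job_id`, and the sample values are assumptions made for illustration, and SQLite (through Python) stands in for Redshift so the example can run on its own.

```python
import sqlite3


def freeze_month(conn, snapshot_month):
    """Copy the live employee rows into the snapshot table for one month.

    Returns False and changes nothing if that month was already frozen,
    so later edits to the live table cannot alter an earlier snapshot.
    """
    already = conn.execute(
        "SELECT 1 FROM employee_snapshot WHERE snapshot_month = ? LIMIT 1",
        (snapshot_month,),
    ).fetchone()
    if already:
        return False
    conn.execute(
        """
        INSERT INTO employee_snapshot (snapshot_month, emp_id, job_id)
        SELECT ?, emp_id, job_id FROM employee_current
        """,
        (snapshot_month,),
    )
    return True


# Hypothetical schema and sample data (not taken from the screenshot).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee_current (emp_id INTEGER, job_id TEXT);
    CREATE TABLE employee_snapshot (snapshot_month TEXT, emp_id INTEGER, job_id TEXT);
    INSERT INTO employee_current VALUES (1, 'J1'), (2, 'J1'), (3, 'J2');
    """
)

freeze_month(conn, "2021-07")  # July snapshot holds all three employees
conn.execute("DELETE FROM employee_current WHERE emp_id = 2")  # change made in August
freeze_month(conn, "2021-08")  # August snapshot holds the remaining two

# Active employees per month and job; this is what a month filter would slice.
counts = conn.execute(
    """
    SELECT snapshot_month, job_id, COUNT(*)
    FROM employee_snapshot
    GROUP BY snapshot_month, job_id
    ORDER BY snapshot_month, job_id
    """
).fetchall()
print(counts)
```

The report would then read only from the snapshot table and filter on `snapshot_month`, so the July rows stay as they were when frozen.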

Related

SQL - How do I expand a dataset to do a cohort analysis?

This is my first post so apologies if something is posted incorrectly - please let me know and I will fix it.
I am trying to build a SQL query in BigQuery that creates a cohort analysis so I can see how many customers have been retained over time by the month they joined (their cohort).
I work in insurance, so we have data on customers, when they joined, and any time they changed the policy (e.g., when they added a car to their coverage), but I do not have it laid out as every month of premium. The data is as follows:
Data vs how I need the data
Do you know how I could fill in the missing months?
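One common way to fill in missing months is to generate a list of month starts and, for each customer and month, carry forward the most recent change on or before that month. The sketch below is a guess at the shape of the data, since the actual layout is only shown in the image: the table `policy_changes`, its columns, and the sample values are all hypothetical. It uses SQLite through Python so that it runs standalone; in BigQuery the month list could come from `GENERATE_DATE_ARRAY` instead of a recursive CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical input: one row per customer only when something changed.
    CREATE TABLE policy_changes (customer_id INTEGER, change_month TEXT, premium REAL);
    INSERT INTO policy_changes VALUES
        (1, '2020-01-01', 100.0),
        (1, '2020-04-01', 120.0);
    """
)

rows = conn.execute(
    """
    WITH RECURSIVE months(month_start) AS (
        -- Month starts from 2020-01-01 through 2020-05-01 (assumed range).
        SELECT '2020-01-01'
        UNION ALL
        SELECT date(month_start, '+1 month') FROM months
        WHERE month_start < '2020-05-01'
    )
    SELECT c.customer_id,
           m.month_start,
           -- Carry forward the latest premium known on or before this month.
           (SELECT p.premium
              FROM policy_changes p
             WHERE p.customer_id = c.customer_id
               AND p.change_month <= m.month_start
             ORDER BY p.change_month DESC
             LIMIT 1) AS premium
    FROM (SELECT customer_id, MIN(change_month) AS first_month
            FROM policy_changes
           GROUP BY customer_id) AS c
    JOIN months m ON m.month_start >= c.first_month
    ORDER BY c.customer_id, m.month_start
    """
).fetchall()

for row in rows:
    print(row)
```

Each customer then has one row per month from the join month onward, which is the layout a cohort retention count needs.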

How to Extend the End Date in a SQL Calendar Table?

I have a report that, according to users, started miscalculating dates in one field in November 2015. After some digging around, I found that one of the tables the field referenced seemed to have an end date on 2015-10-31.
The "D" field seems to represent the day of the week, with Sunday being day 1 and Saturday being 7.
Is there a way to extend the calendar so that it ends further into the future, for example 2049-12-31?
Our calendar table, for a variety of reasons, goes to the end of the current year. We have written a query that adds a new year to this table. This query takes care of most of the fields in that table. It does not touch the holiday field, which is updated manually through a web page.
We send ourselves reminders. Starting in March, we send monthly reminders that we should think about adding another year. After ensuring that the database segment has space, and that none of the definitions, such as fiscal periods, have changed, we run the query that adds a year.
Later in the year we start mailing ourselves reminders about the holidays. Then we check to see if HR has declared them, and if so, update the records accordingly.
This meets our business requirements. Yours will be different of course.
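A minimal sketch of extending such a table is shown below. It is not the query the answer refers to: the table name `calendar` and the column names `cal_date` and `d` are assumptions, and only the date and the day-of-week field (Sunday = 1, Saturday = 7, as described in the question) are filled in. Fields such as holidays or fiscal periods would still need their own handling.

```python
import sqlite3
from datetime import date, timedelta


def extend_calendar(conn, new_end):
    """Append one row per day after the current last date, up to new_end."""
    (last,) = conn.execute("SELECT MAX(cal_date) FROM calendar").fetchone()
    day = date.fromisoformat(last) + timedelta(days=1)
    rows = []
    while day <= new_end:
        # isoweekday() is Monday=1..Sunday=7; shift it to Sunday=1..Saturday=7.
        rows.append((day.isoformat(), day.isoweekday() % 7 + 1))
        day += timedelta(days=1)
    conn.executemany("INSERT INTO calendar (cal_date, d) VALUES (?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendar (cal_date TEXT PRIMARY KEY, d INTEGER)")
conn.execute("INSERT INTO calendar VALUES ('2015-10-31', 7)")  # a Saturday

# Extend to the end of 2015; pass date(2049, 12, 31) to go further out.
added = extend_calendar(conn, date(2015, 12, 31))
print(added)
```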

automatically renewing records when the last day of year

I'm developing a C# application that serves as an incoming-document system for my police station.
In this system, the contents of various documents are saved to an SQL database, and I must give each one a "Document Number".
I have all of this working, but I want the numbering to reset each year: if the last document on the last day of the year, e.g. 31.12.2014, was given the number "2145", the first document on 01.01.2015 should get the number "1".
So the last record of 2014 would be 2014/2145, and the first record of 2015 would be 2015/1.
How can I achieve this?
You count the existing documents for the same year, then add one.
So if you want to store a document that belongs to 2013, you first count how many existing documents you have in 2013, then add one.
I can't write the SQL for you because you haven't described the data structure, but it should be simple enough using SQL COUNT, plus DATEPART to retrieve only the year from a date field.
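As a rough illustration of the count-then-add-one idea, here is a sketch with an assumed table `documents` and assumed columns `doc_number` and `received_on`. It runs against SQLite through Python, where `strftime('%Y', ...)` plays the role that DATEPART would play in SQL Server. Note that counting only works if documents are never deleted and only one writer inserts at a time.

```python
import sqlite3


def next_document_number(conn, received_on):
    """Return 'YYYY/N', where N is one more than the documents stored for that year."""
    year = received_on[:4]
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM documents WHERE strftime('%Y', received_on) = ?",
        (year,),
    ).fetchone()
    return f"{year}/{count + 1}"


def add_document(conn, received_on):
    """Store a document under the next free number for its year."""
    number = next_document_number(conn, received_on)
    conn.execute(
        "INSERT INTO documents (doc_number, received_on) VALUES (?, ?)",
        (number, received_on),
    )
    return number


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (doc_number TEXT, received_on TEXT)")

first = add_document(conn, "2014-12-30")
second = add_document(conn, "2014-12-31")
new_year = add_document(conn, "2015-01-01")  # numbering restarts for 2015
print(first, second, new_year)
```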

MS access latest date issue

I have 2 Excel files at work. In one I maintain the rates of assets and the dates when the rates were issued; the other has the list of assets and the dates when they were sold.
The first Excel file has the following columns:
    Asset   Rate   Rate_Issued_On
1.  X       1500   21-Apr-2014
2.  X       2000   28-Aug-2013
3.  Z       2200   11-Jan-2014
4.  X       3000   1-Jan-2014
The other Excel file has (let's suppose):
    Asset   Sold_Date
1.  X       1-Dec-2013
2.  Z       12-Mar-2014
Since the sold date of asset X (1-Dec-2013) lies between 28-Aug-2013 and 1-Jan-2014, it should take the rate of 2000. If, for example, the sold date were 22-Apr-2014, it should take the rate of 1500. If the sold date were 27-Aug-2013, it should display a blank record. So basically, for each sale, the rate should come from the latest issued date that falls before the sold date.
I can easily get this working in Excel, but the file has become so large that it runs very slowly. So I would like to do this in MS Access instead. Is this possible? (I am a novice in MS Access, so kindly go a little easy on me.)
Thanks
Yes - a few simple queries can match up the data the way you want. If your two tables are called Rates and Sales, you could use two queries to get the results you need. The 1st query would use the Sales and Rates tables to find the largest Rate_Issued_On that is less than the Sold_Date, and the second query would match this back to the Rates table to get the rate on that date.
A very similar problem is described in How to use another table fields as a criteria for MS Access
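The two-step approach can be sketched as one statement in which the inner query finds the latest rate date for each sale and the outer join looks up the rate on that date. This is a sketch run against SQLite through Python rather than Access, with dates stored as ISO strings so they compare correctly. Using `<=` (a rate issued on the sale day itself counts) is an assumption. The two extra sales are the hypothetical dates mentioned in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE rates (asset TEXT, rate INTEGER, rate_issued_on TEXT);
    CREATE TABLE sales (asset TEXT, sold_date TEXT);
    INSERT INTO rates VALUES
        ('X', 1500, '2014-04-21'),
        ('X', 2000, '2013-08-28'),
        ('Z', 2200, '2014-01-11'),
        ('X', 3000, '2014-01-01');
    INSERT INTO sales VALUES
        ('X', '2013-12-01'),
        ('Z', '2014-03-12'),
        ('X', '2014-04-22'),   -- hypothetical sale from the question text
        ('X', '2013-08-27');   -- hypothetical sale before any rate exists
    """
)

results = conn.execute(
    """
    SELECT m.asset, m.sold_date, r.rate
    FROM (
        -- Step 1: latest rate date on or before each sale.
        SELECT s.asset, s.sold_date, MAX(r.rate_issued_on) AS rate_date
        FROM sales s
        LEFT JOIN rates r
               ON r.asset = s.asset AND r.rate_issued_on <= s.sold_date
        GROUP BY s.asset, s.sold_date
    ) AS m
    -- Step 2: match that date back to the rates table.
    LEFT JOIN rates r
           ON r.asset = m.asset AND r.rate_issued_on = m.rate_date
    ORDER BY m.asset, m.sold_date
    """
).fetchall()

for row in results:
    print(row)
```

A sale with no earlier rate comes back with `None` as its rate, which corresponds to the blank record the question asks for.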

How do I automate a report on variance in the same SQL table fields on monthly basis?

I have a T-SQL view with integer fields. I need a monthly report on the difference from one month to the next, e.g. so many people were engaged in a particular activity at 8am on the 1st of this month, so many the previous month, and here is the difference. The numbers fluctuate all the time, so I need the variance between 2 snapshots in time.
I am using SSRS; however, in Reporting Services I can only display the "current" situation. I could run a report at 8am on the 1st of each month and then calculate the differences manually. But how could I automate this calculation and then report on the difference?
I have tried to import data from SQL to 1 Excel spreadsheet from 1 month, then to the 2nd spreadsheet from the 2nd month. The 3rd spreadsheet calculates the difference. But how do I create a nice looking report from Excel?
Additionally I cannot send the report by email. It has to be available online.
Furthermore, each office wants their figures to be confidential and not visible to another office.
Thanks in advance.
Can you add a UserCount table that stores each office's user count for each month? It could have columns like:
id
date
user_count
office_id
You would insert a new row each month based on what the view tells you that month for each office. Then it's as simple as exporting that table to Excel and graphing it using Excel's built-in graphing tools.
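Once the monthly rows exist, the month-to-month difference can be computed with a self-join on the previous month. The sketch below uses the column names suggested above with an assumed table name `user_count` and made-up sample numbers, and runs against SQLite through Python rather than SQL Server. Filtering by `office_id` is one way to keep each office's figures separate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_count (
        id INTEGER PRIMARY KEY,
        date TEXT,            -- snapshot taken on the 1st of each month
        user_count INTEGER,
        office_id INTEGER
    );
    -- Made-up snapshots for two offices.
    INSERT INTO user_count (date, user_count, office_id) VALUES
        ('2024-01-01', 40, 1), ('2024-02-01', 46, 1),
        ('2024-01-01', 12, 2), ('2024-02-01', 9, 2);
    """
)


def monthly_variance(conn, office_id):
    """Return (date, count, change versus the previous month) for one office."""
    return conn.execute(
        """
        SELECT cur.date,
               cur.user_count,
               cur.user_count - prev.user_count AS diff
        FROM user_count cur
        JOIN user_count prev
          ON prev.office_id = cur.office_id
         AND prev.date = date(cur.date, '-1 month')
        WHERE cur.office_id = ?
        ORDER BY cur.date
        """,
        (office_id,),
    ).fetchall()


office_1 = monthly_variance(conn, 1)
office_2 = monthly_variance(conn, 2)
print(office_1, office_2)
```

The first month for each office has no previous snapshot, so it does not appear in the result.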