Report on a point in time


I am about to create what I assume will be two new tables in SQL. The idea is for one to hold the "live" data and the second to hold all the changes. Dates are in DD/MM/YYYY format.
Active
ID | Name | Start Date | End Date
1 | Zac | 1/1/2016 | -
2 | John | 1/5/2016 | -
3 | Sam | 1/6/2016 | -
4 | Joel | 1/7/2016 | -
Changes
CID | UID | Name | Start Date | End Date
1 | 1 | Zac | 1/1/2016 | -
2 | 4 | Joel | 1/1/2016 | -
3 | 4 | Joel | - | 1/4/2016
4 | 2 | John | 1/5/2016 | -
5 | 3 | Sam | 1/6/2016 | -
6 | 4 | Joel | 1/7/2016 | -
In the above situation you can see that Joel worked from 1/1/2016 until 1/4/2016, took 3 months off and then worked again from 1/7/2016.
I need to build a query whereby I can pick a point in time and report on who was working then. The tables above only list the name, but there will be many more columns to report on for a point in time.
What would be the best way to structure the tables to support this query?

I started writing this last night and am finally coming back to it. Basically you would have to use your change table to build a Slowly Changing Dimension and then generate a row number to match each start record with its end record. This assumes, however, that your data never gets out of sync, i.e. that a UID never has 2 start records or 2 end records in a row.
This also assumes you are using an RDBMS that supports common table expressions and window functions, such as SQL Server, Oracle, PostgreSQL, DB2.... The query below uses SQL Server syntax (ISNULL, GETDATE).
WITH cte AS (
SELECT
*
,ROW_NUMBER() OVER (PARTITION BY UID ORDER BY ISNULL(StartDate,EndDate)) As RowNum
FROM
Changes c
)
SELECT
s.UID
,s.Name
,s.StartDate
,COALESCE(e.EndDate,GETDATE()) as EndDate
FROM
cte s
LEFT JOIN cte e
ON s.UID = e.UID
AND s.RowNum + 1 = e.RowNum
WHERE
s.StartDate IS NOT NULL
AND '2016-05-05' BETWEEN s.StartDate AND COALESCE(e.EndDate,GETDATE())
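
As an alternative sketch for the table-structure part of the question: if each employment period is stored as a single row holding both dates (an assumption, not what the Changes table above does), the point-in-time lookup needs no row pairing at all. The table name EmploymentPeriod, the PeriodID column and the working_on helper are made up for illustration, and SQLite through Python's sqlite3 module stands in for the real database. Dates are stored as ISO yyyy-mm-dd text so that plain string comparison orders them correctly.

```python
import sqlite3


def working_on(conn, as_of):
    """Return the sorted names of people whose period covers the ISO date as_of."""
    rows = conn.execute(
        """
        SELECT Name
        FROM EmploymentPeriod
        WHERE StartDate <= :d
          AND (EndDate IS NULL OR EndDate >= :d)  -- NULL means still working
        ORDER BY Name
        """,
        {"d": as_of},
    ).fetchall()
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE EmploymentPeriod (
        PeriodID  INTEGER PRIMARY KEY,   -- hypothetical surrogate key
        UID       INTEGER NOT NULL,
        Name      TEXT    NOT NULL,
        StartDate TEXT    NOT NULL,      -- ISO yyyy-mm-dd
        EndDate   TEXT                   -- NULL while the period is open
    )
    """
)
# The question's sample data, one row per period (DD/MM/YYYY converted to ISO).
conn.executemany(
    "INSERT INTO EmploymentPeriod (UID, Name, StartDate, EndDate) VALUES (?, ?, ?, ?)",
    [
        (1, "Zac", "2016-01-01", None),
        (4, "Joel", "2016-01-01", "2016-04-01"),
        (2, "John", "2016-05-01", None),
        (3, "Sam", "2016-06-01", None),
        (4, "Joel", "2016-07-01", None),
    ],
)

print(working_on(conn, "2016-05-05"))  # ['John', 'Zac']
```

With this layout, ending a period is an UPDATE of EndDate and a rehire is a new row, so the "live" view is simply the rows where EndDate IS NULL.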

Related

Identifying Records Where a String Appears More Than Once

I have a following dataset that looks like:
ID Medication Dose
1 Aspirin 4
1 Tylenol 7
1 Aspirin 2
1 Ibuprofen 1
2 Aspirin 6
2 Aspirin 2
2 Ibuprofen 6
2 Tylenol 4
3 Tylenol 3
3 Tylenol 7
3 Tylenol 2
I would like to write code that identifies patients who have been administered a medication more than once. So for example, ID 1 had Aspirin twice, ID 2 had Aspirin twice and ID 3 had Tylenol three times.
I could be wrong, but I think the easiest way to do this would be to concatenate the medications for each ID using code similar to the one below. I'm not quite sure what to do after that, though: is it possible to count whether a string appears twice within a cell?
SELECT DISTINCT ST2.[ID],
SUBSTRING(
(
SELECT ','+ST1.Medication AS [text()]
FROM ED_NOTES_MASTER ST1
WHERE ST1.[ID] = ST2.[ID]
Order BY [ID]
FOR XML PATH ('')
), 1, 200000) [Result]
FROM ED_NOTES_MASTER ST2
I would like the output to look like the following:
ID MEDICATION Aspirin2x Tylenol2x Ibuprofen2x
1 Aspirin, Tylenol , Aspirin YES NO NO
2 Ibuprofen, Aspirin, Aspirin YES NO NO
3 Tylenol, Tylenol ,Tylenol NO YES NO

For the first part of your question (identify patients that have had a particular medication more than once), you can do this using GROUP BY to group by the ID and medication, and then using COUNT to get how many times each medication was given to each patient. For example:
SELECT ID, Medication, COUNT(*) AS amount
FROM ED_NOTES_MASTER
GROUP BY ID, Medication
This will give you a list of all ID - Medication combinations that appear in the table and a count of how many times each combo appears. To limit these results to just the combinations that appear 2 or more times, add a condition on the count using HAVING (SQL Server does not accept the column alias there, so the aggregate is repeated):
SELECT ID, Medication, COUNT(*) AS amount
FROM ED_NOTES_MASTER
GROUP BY ID, Medication
HAVING COUNT(*) >= 2
The problem now is formatting the results in the way you want. What you will get from the query above is a list of all patient - medication combinations that came up in the table more than once, like this:
ID | Medication | amount
------+---------------+-------
1 | Aspirin | 2
2 | Aspirin | 2
3 | Tylenol | 3
I'd suggest that you try to work with this format if possible, because as you have found, getting multiple values returned as a comma-delimited list like your Medication column requires some hacks (although SQL Server 2017 and later provide proper group concatenation through STRING_AGG). If you really need the Aspirin2x etc. columns, take a look at the PIVOT operation in SQL Server.
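
If the Aspirin2x-style columns really are needed, conditional aggregation is one alternative to PIVOT (a swapped-in technique, not the PIVOT operator itself). The sketch below runs it against the question's sample rows in an in-memory SQLite database through Python's sqlite3 module, which is an assumption made only so the example is self-contained; the same SUM(CASE ...) pattern should carry over to SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ED_NOTES_MASTER (ID INTEGER, Medication TEXT, Dose INTEGER)")
conn.executemany(
    "INSERT INTO ED_NOTES_MASTER VALUES (?, ?, ?)",
    [
        (1, "Aspirin", 4), (1, "Tylenol", 7), (1, "Aspirin", 2), (1, "Ibuprofen", 1),
        (2, "Aspirin", 6), (2, "Aspirin", 2), (2, "Ibuprofen", 6), (2, "Tylenol", 4),
        (3, "Tylenol", 3), (3, "Tylenol", 7), (3, "Tylenol", 2),
    ],
)

# One output row per patient; each flag counts the rows for one medication
# and reports YES when that medication was given at least twice.
flags = conn.execute(
    """
    SELECT ID,
           CASE WHEN SUM(CASE WHEN Medication = 'Aspirin'   THEN 1 ELSE 0 END) >= 2
                THEN 'YES' ELSE 'NO' END AS Aspirin2x,
           CASE WHEN SUM(CASE WHEN Medication = 'Tylenol'   THEN 1 ELSE 0 END) >= 2
                THEN 'YES' ELSE 'NO' END AS Tylenol2x,
           CASE WHEN SUM(CASE WHEN Medication = 'Ibuprofen' THEN 1 ELSE 0 END) >= 2
                THEN 'YES' ELSE 'NO' END AS Ibuprofen2x
    FROM ED_NOTES_MASTER
    GROUP BY ID
    ORDER BY ID
    """
).fetchall()

for row in flags:
    print(row)
# (1, 'YES', 'NO', 'NO')
# (2, 'YES', 'NO', 'NO')
# (3, 'NO', 'YES', 'NO')
```

The medication names are hard-coded, so this only suits a short, known list of medications.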

SQL- Write script to show project name and dates

I'm new to SQL and just going through some exercises. I'm trying to write scripts but need some assistance, and would appreciate it if someone could help me with the topic below, which I am stuck on.
Table structure
Project
ID(PK) NAME Due_Date
1 Alpha 1/1/2040
2 Bravo 3/1/2030
3 Charlie 2/1/2017
4 Delta 4/1/2017
Employee
ID(PK) NAME
1 Kevin
2 Mike
3 Eric
4 Ira
5 Peter
Project Assignment
ID(PK) ProjectID(FK) EmployeeID(FK)
1 1 1
2 1 2
3 2 2
4 2 3
5 3 3
6 3 4
7 1 3
Question
Write a script that will return all project names and how much time (in days) is left until they are due for all projects which have not been completed yet.

If your question is asked correctly, then you only need the projects table. But I doubt that is what you want.
SELECT Name,
DATEDIFF (DAY, GETDATE(), Due_Date) AS DaysRemaining
FROM Project
WHERE Due_Date > GETDATE()
If you need employee data included, please adjust your question.

From my understanding, I would do this:
select pa.ID, e.NAME, p.NAME, p.Due_Date,
DATEDIFF(DAY, GETDATE(), p.Due_Date) AS DaysRemaining
from Project_Assignment pa
inner join Project p
on pa.ProjectID = p.ID
inner join Employee e
on pa.EmployeeID = e.ID
and p.Due_Date > GETDATE()
Let me know if anything needs clarifying.

SQL Calculating time from last transaction for each ID

Hello I'm stuck trying to calculate the difference in time between each transaction for each ID.
The data looks like
Customer_ID | Transaction_Time
1 00:30
1 00:35
1 00:37
1 00:38
2 00:20
2 00:21
2 00:23
I'm trying to get the result to look something like
Customer_ID | Time_diff
1 5
1 2
1 1
2 1
2 2
I would really appreciate any help.
Thanks

Most databases support the LAG() function. However, the date/time functions can depend on the database. Here is an example for SQL Server:
select t.*
from (select t.*,
datediff(second,
lag(transaction_time) over (partition by customer_id order by transaction_time),
transaction_time
) as diff
from t
) t
where diff is not null;
The logic would be similar in most databases, although the function for calculating the time difference varies.
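
As a rough check of the LAG() approach on the question's sample data, here is a sketch in SQLite run through Python's sqlite3 module (an assumption chosen for a self-contained example; window functions need SQLite 3.25 or newer). It also assumes the times are HH:MM strings and that the wanted difference is in minutes, which the question does not state explicitly. SQLite has no datediff, so the minutes are computed by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer_id INTEGER, transaction_time TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (1, "00:30"), (1, "00:35"), (1, "00:37"), (1, "00:38"),
        (2, "00:20"), (2, "00:21"), (2, "00:23"),
    ],
)

diffs = conn.execute(
    """
    WITH m AS (
        -- Convert 'HH:MM' text into minutes since midnight.
        SELECT customer_id,
               CAST(substr(transaction_time, 1, 2) AS INTEGER) * 60
             + CAST(substr(transaction_time, 4, 2) AS INTEGER) AS minutes
        FROM t
    ),
    d AS (
        -- LAG() fetches the previous transaction of the same customer.
        SELECT customer_id, minutes,
               minutes - LAG(minutes) OVER (
                   PARTITION BY customer_id ORDER BY minutes
               ) AS time_diff
        FROM m
    )
    SELECT customer_id, time_diff
    FROM d
    WHERE time_diff IS NOT NULL   -- drop each customer's first transaction
    ORDER BY customer_id, minutes
    """
).fetchall()

print(diffs)  # [(1, 5), (1, 2), (1, 1), (2, 1), (2, 2)]
```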

Calculating incremental differences in a given column

I searched the web and Stack Overflow but didn't find an answer. :( So please help me. I am still learning and reading, but I am not yet thinking the right way: there are no IFs and FOR loops to do stuff. :)
I have table1:
id| date |state_on_date|year_quantity
1|30.12.2013|23 |100
1|31.12.2013|25 |100
1|1.1.2014 |35 |150
1|2.1.2014 |12 |150
2|30.12.2013|34 |200
2|31.12.2013|65 |200
2|1.1.2014 |43 |300
I am trying to get:
table2:
id| date |state_on_date|year_quantity|state_on_date_compare
1|30.12.2013| 23 |100 |23
1|31.12.2013| 25 |100 |-2
1|1.1.2014 | 35 |150 |-10
1|2.1.2014 | 12 |150 |23
2|30.12.2013| 34 |200 |34
2|31.12.2013| 65 |200 |-31
2|1.1.2014 | 43 |300 |22
Rules to get numbers:
id|date |state_on_date|year_quantity|state_on_date_compare
1|30.12.2013| 23 |100| 23 (lowest state_on_date for id 1)
1|31.12.2013| 25 |100| -2 (23-25)
1| 1.1.2014| 35 |150|-10 (25-35)
1| 2.1.2014| 12 |150| 23 (35-12)
2|30.12.2013| 34 |200| 34 (lowest state_on_date for id 2)
2|31.12.2013| 65 |200|-31 (34-65)
2| 1.1.2014| 43 |300| 22 (65-43)
Thanks in advance for every suggestion or solution you will make.

You have to understand that SQL is misleading because of presentation issues. Like in The Matrix ("there is no spoon"), in a query there is no previous record.
SQL is based on set theory, for which there IS NO ORDER of records. All records are just set members. The theory behind SQL is that anything you do normally should be considered as though you are doing it to ALL RECORDS AT THE SAME TIME! The fact that a datasheet view of a SELECT query shows record A before record B is an artifact of presentation - not of actual record order.
In fact, the records returned by a query are in the same order as they appear in a table UNLESS you have included a GROUP BY or ORDER BY clause. And the order of record appearance in a table is usually the order in which they were created UNLESS there is a functional primary key on that table.
However, both of these statements leave you with the same problem. There is no SYNTAX for the concepts of NEXT and PREVIOUS because it is the CONCEPT of order that doesn't exist in SQL.
VBA recordsets, though based on SQL as recordsources, create an extra context that encapsulates the SQL context. That is why VBA can do what you want and SQL itself cannot. It is the "extra" context in which VBA can define variables holding what you wanted to remember until another record comes along.
Having now rained on your parade, here are some thoughts that MIGHT help.
When you want to see "previous record" data, there MUST be a way for Access to find what you consider to be the "previous record." Therefore, if you have not allowed for this situation, it is a design flaw. (Based on you not realizing the implications of SET theory, which is eminently forgivable for new Access users, so don't take it too hard.) This is based on the "Old Programmer's Rule" that says "Access can't tell you anything you didn't tell it first." Which means - in practical terms - that if order means something to you, you must give Access the data required to remember and later impose that order. If you have no variable to identify proper order with respect to your data set, you cannot impose the desired order later. In this case, it looks like a combination of id and date together will give you an ordering variable.
You can SOMETIMES do something like a DLookup in a query where you look for the record that would precede the current one based on some order identifier.
e.g. if you were ordering by date/time fields and meant "previous" to imply the record with the next earlier time than the record in focus, you would choose the record with the maximum date less than the date in focus. Look at the DMax function. Also notice I said "record in focus" not "current record." This is a fine point, but "Current" also implies ordering by connotation. ("Previous" and "Next" imply order by denotation, a stronger definition.)
Anyway, contemplate this little jewel:
DLookup( "[myvalue]", "mytable", "[mytable]![mydate] = #" & CStr( DMax( "[mydate]", "mytable", "[mytable]![mydate] < #" & CStr( [mydate] ) & "#" ) ) & "#" )
I don't guarantee that the parentheses are balanced for the functions and I don't guarantee that the syntax is exactly right. Use Access Help on DLookup, DMax, CStr, and on strings (in functions) in order to get the exact syntax. The idea is to use a query (implied by DMax) to find the largest date less than the date in focus in order to feed a query (implied by DLookup) to find the value for the record having that date. And the CStr converts the date/time variable to a string so you can use the "#" signs as date-string brackets.
If you are dealing with different dates for records with different qualifiers, you will also have to include the rest of the qualifiers in BOTH the DMax and DLookup functions. That syntax gets awfully nasty awfully fast. Which is why folks take up VBA in the first place.

Johnny Bones makes some good points in his answer, but in fact there is a way to have Access SQL perform the required calculations in this case. Our sample data is in a table named [table1]:
id date state_on_date year_quantity
-- ---------- ------------- -------------
1 2013-12-30 23 100
1 2013-12-31 25 100
1 2014-01-01 35 150
1 2014-01-02 12 150
2 2013-12-30 34 200
2 2013-12-31 65 200
2 2014-01-01 43 300
Step 1: Determining the initial rows for each [id]
We start by creating a saved query in Access named [StartDatesById] to give us the earliest date for each [id]
SELECT id, MIN([date]) AS MinOfDate
FROM table1
GROUP BY id
That gives us
id MinOfDate
-- ----------
1 2013-12-30
2 2013-12-30
Now we can use that in another query to give us the initial rows for each [id]
SELECT
table1.id,
table1.date,
table1.state_on_date,
table1.year_quantity,
table1.state_on_date AS state_on_date_compare
FROM
table1
INNER JOIN
StartDatesById
ON table1.id = StartDatesById.id
AND table1.date = StartDatesById.MinOfDate
which gives us
id date state_on_date year_quantity state_on_date_compare
-- ---------- ------------- ------------- ---------------------
1 2013-12-30 23 100 23
2 2013-12-30 34 200 34
Step 2: Calculating the subsequent rows
This step begins with creating a saved query named [PreviousDates] that uses a self-join on [table1] to give us the previous dates for each row in [table1] that is not the first row for that [id]
SELECT
t1a.id,
t1a.date,
MAX(t1b.date) AS previous_date
FROM
table1 AS t1a
INNER JOIN
table1 AS t1b
ON t1a.id = t1b.id
AND t1a.date > t1b.date
GROUP BY
t1a.id,
t1a.date
That query gives us
id date previous_date
-- ---------- -------------
1 2013-12-31 2013-12-30
1 2014-01-01 2013-12-31
1 2014-01-02 2014-01-01
2 2013-12-31 2013-12-30
2 2014-01-01 2013-12-31
Once again, we can use that query in another query to derive the subsequent records for each [id]
SELECT
curr.id,
curr.date,
curr.state_on_date,
curr.year_quantity,
prev.state_on_date - curr.state_on_date AS state_on_date_compare
FROM
(
table1 AS curr
INNER JOIN
PreviousDates
ON curr.id = PreviousDates.id
AND curr.date = PreviousDates.date
)
INNER JOIN
table1 AS prev
ON prev.id = PreviousDates.id
AND prev.date = PreviousDates.previous_date
which returns
id date state_on_date year_quantity state_on_date_compare
-- ---------- ------------- ------------- ---------------------
1 2013-12-31 25 100 -2
1 2014-01-01 35 150 -10
1 2014-01-02 12 150 23
2 2013-12-31 65 200 -31
2 2014-01-01 43 300 22
Step 3: Combining the results of steps 1 and 2
To combine the results from the previous two steps we just include them both in a UNION query and sort by the first two columns
SELECT
table1.id,
table1.date,
table1.state_on_date,
table1.year_quantity,
table1.state_on_date AS state_on_date_compare
FROM
table1
INNER JOIN
StartDatesById
ON table1.id = StartDatesById.id
AND table1.date = StartDatesById.MinOfDate
UNION ALL
SELECT
curr.id,
curr.date,
curr.state_on_date,
curr.year_quantity,
prev.state_on_date - curr.state_on_date AS state_on_date_compare
FROM
(
table1 AS curr
INNER JOIN
PreviousDates
ON curr.id = PreviousDates.id
AND curr.date = PreviousDates.date
)
INNER JOIN
table1 AS prev
ON prev.id = PreviousDates.id
AND prev.date = PreviousDates.previous_date
ORDER BY 1, 2
returning
id date state_on_date year_quantity state_on_date_compare
-- ---------- ------------- ------------- ---------------------
1 2013-12-30 23 100 23
1 2013-12-31 25 100 -2
1 2014-01-01 35 150 -10
1 2014-01-02 12 150 23
2 2013-12-30 34 200 34
2 2013-12-31 65 200 -31
2 2014-01-01 43 300 22
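
For comparison, on an engine that does have window functions (Access SQL does not), the three steps collapse into a single LAG() expression. The sketch below is an assumption-laden illustration rather than Access code: it uses SQLite 3.25 or newer through Python's sqlite3 module, with the dates stored as ISO text. COALESCE supplies the row's own state_on_date when there is no previous row for that id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER, date TEXT, state_on_date INTEGER, year_quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (1, "2013-12-30", 23, 100),
        (1, "2013-12-31", 25, 100),
        (1, "2014-01-01", 35, 150),
        (1, "2014-01-02", 12, 150),
        (2, "2013-12-30", 34, 200),
        (2, "2013-12-31", 65, 200),
        (2, "2014-01-01", 43, 300),
    ],
)

rows = conn.execute(
    """
    SELECT id, date, state_on_date, year_quantity,
           COALESCE(
               -- previous value minus current value; NULL on the first row per id
               LAG(state_on_date) OVER (PARTITION BY id ORDER BY date) - state_on_date,
               state_on_date
           ) AS state_on_date_compare
    FROM table1
    ORDER BY id, date
    """
).fetchall()

compare = [r[4] for r in rows]
print(compare)  # [23, -2, -10, 23, 34, -31, 22]
```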

I hope this is helpful:
http://blogs.lessthandot.com/index.php/DataMgmt/DataDesign/calculating-mean-median-and-mode-with-sq
You could also try SELECT * INTO table2 FROM table1 WHERE followed by your conditions. I am not sure whether this would work.

Total number of days for a task before going on to the next one, grouped by person

I am trying to figure out how to show how many days have been worked on a certain task by using the dates in between each “task login” for each person. I think this can be done with one query? I'm open to suggestions and/or ideas.
The Table:
--------+-----------+----------
Person | TaskLogin | Date
--------+-----------+----------
Jane | A | 2013-01-01
Jane | B | 2013-01-03
Jane | A | 2013-01-06
Jane | B | 2013-01-10
Bob | A | 2013-01-01
Bob | A | 2013-01-06
---------------------------------------------------------------------
Row 1: Jane starts task A starting 2013-01-01 and works on it until starting Task B on 2013-01-03 = 2 days worked on Task A
Row 2: Jane starts on task B starting 2013-01-03 and works on it until starting task A on 2013-01-06 = 3 days worked on Task B
Row 3: Jane starts on task A starting 2013-01-06 and works on it until starting task B on 2013-01-10 = 4 days worked on Task A
Row 4: Skip because that is the highest date for Jane (Jane may or may not finish task B 2013-01-10 but we will not count it)
Row 5: Bob starts task A starting on 2013-01-01 and works on it until continuing to work on task A by logging it again on 2013-01-06 = 5 days worked on task A
Row 6: Skip because that is the highest date for Bob
A = 11 days because 2 + 4 + 5
B = 3 days because of Row 2
The output:
------+---------------------
Tasks | Time between Tasks
------+---------------------
A | 11 days
B | 3 days
EDIT:
The solutions of Nicarus and Gordon Linoff (specifically the first, pre-2012 solution, with my edits in the comments) work. Note that (select distinct * from table t) t can be substituted for table in Gordon Linoff's solution to handle the case of someone logging in twice on the same day.

What you are looking for is the lead() function. This is only available in SQL Server 2012 and later. Before that, the easiest way is a correlated subquery that picks the earliest later date for the same person (table stands in for your actual table name):
select TaskLogin, sum(datediff(day, date, nextdate)) as days
from (select t.*,
(select top 1 t2.date
from table t2
where t2.person = t.person
and t2.date > t.date
order by t2.date
) as nextdate
from table t
) t
where nextdate is not null
group by TaskLogin;
In SQL Server 2012 or later, it would be:
select TaskLogin, sum(datediff(day, date, nextdate)) as days
from (select t.*, lead(date) over (partition by person order by date) as nextdate
from table t
) t
where nextdate is not null
group by TaskLogin;
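
The "next login date per person" idea can be checked against the question's sample rows with a small sketch. It uses SQLite through Python's sqlite3 module rather than SQL Server (an assumption for the sake of a self-contained example), so MIN() replaces TOP 1 and julianday() replaces datediff. The table name TaskLog and the column name LogDate are placeholders, since the original only says "table" and "date".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TaskLog (Person TEXT, TaskLogin TEXT, LogDate TEXT)")
conn.executemany(
    "INSERT INTO TaskLog VALUES (?, ?, ?)",
    [
        ("Jane", "A", "2013-01-01"),
        ("Jane", "B", "2013-01-03"),
        ("Jane", "A", "2013-01-06"),
        ("Jane", "B", "2013-01-10"),
        ("Bob", "A", "2013-01-01"),
        ("Bob", "A", "2013-01-06"),
    ],
)

totals = conn.execute(
    """
    SELECT TaskLogin,
           CAST(SUM(julianday(NextDate) - julianday(LogDate)) AS INTEGER) AS Days
    FROM (
        SELECT t.TaskLogin, t.LogDate,
               -- earliest later login of the same person, NULL for the last one
               (SELECT MIN(t2.LogDate)
                FROM TaskLog t2
                WHERE t2.Person = t.Person
                  AND t2.LogDate > t.LogDate) AS NextDate
        FROM TaskLog t
    ) AS x
    WHERE NextDate IS NOT NULL    -- skip each person's final login
    GROUP BY TaskLogin
    ORDER BY TaskLogin
    """
).fetchall()

print(totals)  # [('A', 11), ('B', 3)]
```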

Maybe not the most elegant way, but it certainly works:
-- Setup table/insert values --
IF OBJECT_ID('TempDB.dbo.#TaskAccounting') IS NOT NULL BEGIN
DROP TABLE #TaskAccounting
END
CREATE TABLE #TaskAccounting
(
Person VARCHAR(4) NOT NULL,
TaskLogin CHAR(1) NOT NULL,
TaskDate DATETIME NOT NULL
)
INSERT INTO #TaskAccounting
VALUES ('Jane','A','2013-01-01')
INSERT INTO #TaskAccounting
VALUES ('Jane','B','2013-01-03')
INSERT INTO #TaskAccounting
VALUES ('Jane','A','2013-01-06')
INSERT INTO #TaskAccounting
VALUES ('Jane','B','2013-01-10')
INSERT INTO #TaskAccounting
VALUES ('Bob','A','2013-01-01')
INSERT INTO #TaskAccounting
VALUES ('Bob','A','2013-01-06');
-- Use a CTE to add sequence and join on it --
WITH Tasks AS (
SELECT
Person,
TaskLogin,
TaskDate,
ROW_NUMBER() OVER(PARTITION BY Person ORDER BY TaskDate) AS Sequence
FROM
#TaskAccounting
)
SELECT
a.TaskLogin AS Tasks,
CAST(SUM(DATEDIFF(DD,a.TaskDate,b.TaskDate)) AS VARCHAR(10)) + ' days' AS TimeBetweenTasks
FROM
Tasks a
JOIN
Tasks b
ON (a.Person = b.Person)
AND (a.Sequence = b.Sequence - 1)
GROUP BY
a.TaskLogin
