Query for non-existent value in Access - SQL

I am trying to create an Access query that reports when someone misses a task, based on the fact that there is no record for that task for a shift.
I have a table that records task completion. The basic fields are:
Date
Shift
Task
Tech
When a tech completes a task, a record is created logging the event.
I need a query that identifies when a shift misses a task.
I have tried an unmatched query, to no avail, using a sample table with shifts as its only data, and even tried adding the task to this table.
So I am looking for some SQL help.
There has to be a way to do this...

Is it something like
SELECT [Date], Shift, Task, Tech
FROM TaskTable
WHERE Shift IS NOT NULL AND Task IS NULL

If we make the assumption (I know, that's dangerous) that you have a separate shift table, and the shift in the list you are looking for is a foreign key, then I think you are looking for a shift ID that has no associated task. You can use a LEFT JOIN and search for NULL values on your task table.
Example:
SELECT shifttable.shift, Tasktable.task, tasktable.tech, tasktable.date
FROM shiftTable
LEFT JOIN TaskTable
ON TaskTable.ShiftID = shiftTable.ShiftID
WHERE TaskTable.PrimaryKey IS NULL
Given the data:
Shift Table:
ShiftID | Shift
------- | -------
1       | Monday
2       | Tuesday
3       | Wednesday
Task Table:
TaskID | ShiftID | Date    | Tech | Task
------ | ------- | ------- | ---- | -----
1      | 1       | 5/15/15 | bob  | job 1
2      | 3       | 5/22/15 | Sam  | job 4
would yield the results:
Shift
Tuesday
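The join above finds shifts with no task at all. To catch one specific task missed on one specific date and shift, the same LEFT JOIN ... IS NULL idea can be applied to every expected (date, shift, task) combination. Below is a minimal sketch in Python with SQLite, not Access; the tables Shifts, Tasks and TaskLog, the column LogDate and the sample rows are all assumptions made up for illustration.

```python
import sqlite3

# Hypothetical schema: Shifts and Tasks are lookup tables listing what is
# expected; TaskLog holds one row per completed task.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Shifts  (Shift TEXT PRIMARY KEY);
    CREATE TABLE Tasks   (Task  TEXT PRIMARY KEY);
    CREATE TABLE TaskLog (LogDate TEXT, Shift TEXT, Task TEXT, Tech TEXT);
    INSERT INTO Shifts VALUES ('Day'), ('Night');
    INSERT INTO Tasks  VALUES ('job 1'), ('job 4');
    INSERT INTO TaskLog VALUES
        ('2015-05-15', 'Day',   'job 1', 'bob'),
        ('2015-05-15', 'Day',   'job 4', 'bob'),
        ('2015-05-15', 'Night', 'job 1', 'Sam');
""")

# Build every expected (date, shift, task) combination, then keep only the
# combinations that have no matching row in the log.
missed = conn.execute("""
    SELECT d.LogDate, s.Shift, t.Task
    FROM (SELECT DISTINCT LogDate FROM TaskLog) AS d
    CROSS JOIN Shifts AS s
    CROSS JOIN Tasks  AS t
    LEFT JOIN TaskLog AS l
           ON l.LogDate = d.LogDate
          AND l.Shift   = s.Shift
          AND l.Task    = t.Task
    WHERE l.Task IS NULL
    ORDER BY d.LogDate, s.Shift, t.Task
""").fetchall()

print(missed)
```

With this sample data the only missed combination is job 4 on the Night shift. In Access the Cartesian product is usually written by listing the tables separated by commas rather than with CROSS JOIN, so the query would need adapting.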


I would like to know if there's a way to complete this query

I'm trying to obtain the average time of an "activity" in a Moodle database. I am not an SQL expert, but I have managed to get to the point shown in the picture. My question is whether there is a way to first obtain the time difference per day (this "activity" does not have a starting-time column like many others) and then sum those differences to get the average for that activity. For the first step I tried the function EXTRACT() and comparing the dates in the format "%Y-%m-%d", but it only sums the first row where they are equal. By the way, I have been doing this with just an SQL statement; I know stored procedures exist, but my level of SQL is not that high.
Thanks in advance!
data obtained so far
Data on table logs (the most important, I think):
component | action    | objecttable   | userid | courseid | timecreated
--------- | --------- | ------------- | ------ | -------- | -----------
mod_quiz* | viewed    | quiz_attempts | 6      | 2        | 1645287525
mod_forum | viewed    | forum         | 5      | 2        | 1645288525
core      | loggedout | user          | 2      | 0        | 1645291745
mod_page  | viewed    | page          | 5      | 2        | 1645291955
Data I'm trying to get:
Activity | StartTime | EndTime | Total
-------- | --------- | ------- | ------------
forum    | 19:01     | 19:10   | 9 minute(s)
quiz     | 15:45     | 16:00   | 15 minute(s)
page     | ...       | ...     | ...
workshop | ...       | ...     | ...
But so far I have only managed to arrange the data in one column:
Time
2022-x-x h:m
....
But when I try to sum by day with the function EXTRACT(), matching the dates in a very long query, it just gets the first record.
NOTE: * Half of the "activities" were easy to calculate since they have "timestart" and "timeend" columns, but I cannot figure out how to handle the ones that do not have a "timestart" column.

Calculation of the difference between two dates based on multiple conditions

I will be very grateful for any help you can provide with the following situation.
I have 2 tables, and a 3rd table which is a join of those 2 tables.
Each table contains information on stage changes; for my calculation the important fields are the old value of the stage field, the new value, and the date of the change.
When there is only a date1 in table 1, I use the following SQLite code:
select *, case when `Duration1` = 0 then 1 else Duration1 end as "Duration1"
from
(select * ,
coalesce(JULIANDAY(`date2`) - JULIANDAY(`date1`), `report2: Stage Duration`)
as "Duration1"
from table)
In the ideal scenario, table 1 contains the project and 1st date (date1), and table 2 contains the project and 2nd date (date2). I can join them to get a 3rd table with the 1st and 2nd dates and calculate the difference between the 2 dates there.
The complication arises when I have 2 dates in table 1 and 1 date in table 2. This is where I need help. I would like to add a condition to the SQL code saying:
if count(dates from report 1/date1) > count(dates from report 2/date2)
then the difference (the duration I need) should be calculated as
today - max(JULIANDAY(`date1`))
This is my first question here. Thank you in advance for your help and understanding!
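One possible way to express that condition is to aggregate each report per project first and then compare the two counts in a CASE expression. The sketch below is only an illustration in Python with SQLite: the table names report1 and report2, the project column, the sample rows and the fixed "today" value are all assumptions, and the ELSE branch (latest date2 minus latest date1) is a guess at the normal case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables standing in for the two reports.
    CREATE TABLE report1 (project TEXT, date1 TEXT);
    CREATE TABLE report2 (project TEXT, date2 TEXT);
    INSERT INTO report1 VALUES
        ('A', '2023-01-01'),
        ('B', '2023-01-01'),
        ('B', '2023-02-01');
    INSERT INTO report2 VALUES
        ('A', '2023-01-11'),
        ('B', '2023-01-15');
""")

QUERY = """
SELECT r1.project,
       CASE
           -- More dates in report 1 than in report 2: the stage is still
           -- open, so measure from the latest date1 up to today.
           WHEN r1.n1 > COALESCE(r2.n2, 0)
               THEN julianday(:today) - r1.last_d1
           -- Otherwise use the difference between the latest dates.
           ELSE r2.last_d2 - r1.last_d1
       END AS duration
FROM (SELECT project,
             COUNT(date1)          AS n1,
             MAX(julianday(date1)) AS last_d1
      FROM report1
      GROUP BY project) AS r1
LEFT JOIN (SELECT project,
                  COUNT(date2)          AS n2,
                  MAX(julianday(date2)) AS last_d2
           FROM report2
           GROUP BY project) AS r2
       ON r2.project = r1.project
ORDER BY r1.project
"""

# "today" is passed in as a parameter so the result is reproducible;
# julianday('now') could be used instead in a live query.
durations = dict(conn.execute(QUERY, {"today": "2023-02-21"}).fetchall())
print(durations)
```

Here project A has one date in each report and gets the plain difference, while project B has two dates in report 1 and one in report 2, so its duration runs from its latest date1 to the supplied "today".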

Calculating the number of new ID numbers per month in powerpivot

My dataset provides a monthly snapshot of customer accounts. Below is a very simplified version:
Date_ID | Acc_ID
------- | -------
20160430| 1
20160430| 2
20160430| 3
20160531| 1
20160531| 2
20160531| 3
20160531| 4
20160531| 5
20160531| 6
20160531| 7
20160630| 4
20160630| 5
20160630| 6
20160630| 7
20160630| 8
Customers can open or close their accounts, and I want to calculate the number of 'new' customers every month. The number of 'exited' customers will also be helpful if this is possible.
So in the above example, I should get the following result:
Month | New Customers
------- | -------
20160430| 3
20160531| 4
20160630| 1
Basically I want to compare distinct account numbers in the selected and previous month, any that exist in the selected month and not previous are new members, any that were there last month and not in the selected are exited.
I've searched but I can't seem to find any similar problems, and I hardly know where to start myself - I've tried using CALCULATE and FILTER along with DATEADD to filter the data to get two months, and then count the unique values. My PowerPivot skills aren't up to scratch to solve this on my own however!
Getting the new users is relatively straightforward - I'd add a calculated column which counts rows for that user in earlier months and if they don't exist then they are a new user:
=IF(CALCULATE(COUNTROWS(data),
FILTER(data, [Acc_ID] = EARLIER([Acc_ID])
&& [Date_ID] < EARLIER([Date_ID]))) = BLANK(),
"new",
"existing")
Once this is in place you can simply write a measure for new_users:
=CALCULATE(COUNTROWS(data), data[customer_type] = "new")
Getting the cancelled users is a little harder because it means you have to be able to look backwards to the prior month - none of the time intelligence stuff in PowerPivot will work out of the box here as you don't have a true date column.
It's nearly always good practice to have a separate date table in your PowerPivot models and it is a good way to solve this problem - essentially the table should be 1 record per date with a unique key that can be used to create a relationship. Perhaps post back with a few more details.
This is an alternative method to Jacob's, which also works. It avoids creating a calculated column, but I actually find the calculated column useful to use as a flag against other measures.
=CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATESBETWEEN(
'Dates'[Date], 0, LASTDATE('Dates'[Date])
)
) - CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATESBETWEEN(
'Dates'[Date], 0, FIRSTDATE('Dates'[Date]) - 1
)
)
It basically uses the dates table to make a distinct count of all Acc_ID values from the beginning of time until the day before the first day of the selected period, and subtracts that from the distinct count of all Acc_ID values from the beginning of time until the last day of the selected period. This is essentially the number of new distinct Acc_IDs, although you can't work out which Acc_IDs these are using this method.
I could then calculate 'exited accounts' by taking the previous months total as 'existing accounts':
=CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATEADD('Dates'[Date], -1, MONTH)
)
Then adding the 'new accounts', and subtracting the 'total accounts':
=DISTINCTCOUNT('Accounts'[Acc_ID])
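As a sanity check on the numbers these measures should produce, the month-over-month comparison can be reproduced outside PowerPivot. The following is a small Python sketch, not DAX, using the sample snapshot from the question; it compares each month only with the snapshot immediately before it, which is an assumption that matches the question's wording rather than the "never seen before" definition used by the calculated column.

```python
from collections import defaultdict

# Sample snapshot rows from the question: (Date_ID, Acc_ID).
rows = [
    (20160430, 1), (20160430, 2), (20160430, 3),
    (20160531, 1), (20160531, 2), (20160531, 3), (20160531, 4),
    (20160531, 5), (20160531, 6), (20160531, 7),
    (20160630, 4), (20160630, 5), (20160630, 6), (20160630, 7),
    (20160630, 8),
]

# Group the account IDs by snapshot date.
accounts = defaultdict(set)
for date_id, acc_id in rows:
    accounts[date_id].add(acc_id)

new_counts = {}
exited_counts = {}
previous = set()
for date_id in sorted(accounts):
    current = accounts[date_id]
    # New: present this month but not in the previous snapshot.
    new_counts[date_id] = len(current - previous)
    # Exited: present in the previous snapshot but not this month.
    exited_counts[date_id] = len(previous - current)
    # Same figure via the measures: previous total + new - current total.
    assert exited_counts[date_id] == (
        len(previous) + new_counts[date_id] - len(current)
    )
    previous = current

print(new_counts)
print(exited_counts)
```

For the sample data this gives 3, 4 and 1 new accounts, matching the expected result in the question, and 0, 0 and 3 exited accounts.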

storing data ranges - effective representation

I need to store a value for every day in a timeline, i.e. every user of the database should have a status assigned for every day, like this:
from 1.1.2000 to 28.05.2011 - status 1
from 29.05.2011 to 30.01.2012 - status 3
from 1.2.2012 to infinity - status 4
Each day should have only one status assigned, and the last status has no end (until another one is given). My question is: what is an efficient representation in an SQL database? The obvious solution is to create a row for each change (with the last day the status is assigned in each range), like this:
uptodate status
28.05.2011 status 1
30.01.2012 status 3
01.01.9999 status 4
This has many problems: if I wanted to add another range, say from 15.02.2012, I would need to alter the last row too:
uptodate status
28.05.2011 status 1
30.01.2012 status 3
14.02.2012 status 4
01.01.9999 status 8
It also requires lots of checking to make sure there are no overlaps or errors, especially if someone wants to modify ranges in the middle of the list: inserting a new status from 29.01.2012 to 10.02.2012 is hard to implement (the date ranges of status 3 and status 4 would have to shrink accordingly to make space for the new status). Is there any better solution?
I thought about a completely different solution, like storing each day's status in a separate row, so there would be a row for every day in the timeline. This would make it easy to update: simply enter the new status for rows with a date between start and end. Of course this would generate a large amount of needless data, so it's a bad solution, but it is coherent and easy to manage. I was wondering if there is something in between, but I guess not.
More context: I want the moderator to be able to assign a status freely to any dates, and edit it if needed. But most often the moderator will be adding new status date ranges at the end. I don't really need the last status. After the moderator finishes editing a whole month, I need to generate a report based on the status on each day in that month. But at any time the moderator may want to edit data from months ago (which would be reflected in updated reports), and can set one status for, e.g., one year in advance.
You seem to want to use this table for two things: recording the current status and the history of status changes. You should separate the current status out and move it up to the parent (just like the registered date):
User
===============
Registered Date
Current Status
Status History
===============
Uptodate
Status
Your table structure should include the effective and end dates of the status period. This effectively "tiles" the statuses into groups that don't overlap. The last row should have a dummy end date (as you have above) or NULL. Using a value instead of NULL is useful if you have indexes on the end date.
With this structure, to get the status on any given date, you use the query:
select *
from t
where <date> between effdate and enddate
To add a new status at the end of the period requires two changes:
Modify the row in the table with the enddate = 01/01/9999 to have an enddate of yesterday.
Insert a new row with the effdate of today and an enddate of 01/01/9999
I would wrap this in a stored procedure.
To change a status on one date in the past requires splitting one of the historical records in two. Multiple dates may require changing multiple records.
If you have a date range, you can get all tiles that overlap a given time period with the query:
select *
from t
where <periodstart> <= enddate and <periodend> >= effdate
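The lookups and the two-step append described above can be tried out with a small script. This is a minimal sketch in Python with SQLite, assuming ISO-formatted date strings (so that BETWEEN and the comparisons sort correctly) and a dummy end date of 9999-01-01; the helper names status_on and append_status are made up for illustration, and the sample rows follow the ranges from the question.

```python
import sqlite3

OPEN_END = "9999-01-01"  # dummy end date marking the open-ended last tile

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (effdate TEXT, enddate TEXT, status INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    ("2000-01-01", "2011-05-28", 1),
    ("2011-05-29", "2012-01-30", 3),
    ("2012-02-01", OPEN_END, 4),
])


def status_on(day):
    """Return the status whose tile covers the given day, or None."""
    row = conn.execute(
        "SELECT status FROM t WHERE ? BETWEEN effdate AND enddate", (day,)
    ).fetchone()
    return row[0] if row else None


def append_status(start, status):
    """Close the open tile the day before start, then add a new open tile."""
    with conn:  # both statements are committed together
        conn.execute(
            "UPDATE t SET enddate = date(?, '-1 day') WHERE enddate = ?",
            (start, OPEN_END),
        )
        conn.execute("INSERT INTO t VALUES (?, ?, ?)", (start, OPEN_END, status))


append_status("2012-02-15", 8)

# All tiles overlapping February 2012.
overlapping = [r[0] for r in conn.execute(
    "SELECT status FROM t WHERE ? <= enddate AND ? >= effdate ORDER BY effdate",
    ("2012-02-01", "2012-02-29"),
)]

print(status_on("2011-06-01"), status_on("2012-02-14"), status_on("2012-02-15"))
print(overlapping)
```

Editing a range in the middle of the history is not covered here; as noted above, that requires splitting or shrinking the neighbouring tiles.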

GROUP BY with date range

I have a table with 4 columns, id, Stream which is text, Duration (int), and Timestamp (datetime). There is a row inserted for every time someone plays a specific audio stream on my website. Stream is the name, and Duration is the time in seconds that they are listening. I am currently using the following query to figure up total listen hours for each week in a year:
SELECT YEARWEEK(`Timestamp`), (SUM(`Duration`)/60/60) FROM logs_main
WHERE `Stream`="asdf" GROUP BY YEARWEEK(`Timestamp`);
This does what I expect... presenting a total of listen time for each week in the year that there is data.
However, I would like to build a query where I have a result row for weeks that there may not be any data. For example, if the 26th week of 2006 has no rows that fall within that week, then I would like the SUM result to be 0.
Is it possible to do this? Maybe via a JOIN over a date range somehow?
The tried and true old-school solution is to set up another table with a bunch of date ranges that you can outer join with for the grouping (as in, the other table would have all of the weeks in it with a begin / end date).
In this case, you could just get by with a table full of the values from YEARWEEK:
201100
201101
201102
201103
201104
And here is a sketch of a sql statement:
-- The Stream filter sits in the ON clause so that weeks without rows are kept
SELECT year_weeks.yearweek, (COALESCE(SUM(`Duration`), 0)/60/60)
FROM year_weeks LEFT OUTER JOIN logs_main
ON year_weeks.yearweek = YEARWEEK(logs_main.`Timestamp`)
AND logs_main.`Stream` = "asdf"
GROUP BY year_weeks.yearweek;
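The outer-join approach can be checked end to end with a small script. The sketch below uses Python with SQLite rather than MySQL, so it is only an approximation: SQLite has no YEARWEEK function, and strftime('%Y%W', ...) is used as a stand-in whose week numbering may differ from MySQL's. The sample rows are invented. The stream filter is placed in the ON clause, because a filter on logs_main in the WHERE clause would drop the weeks that have no matching rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logs_main (Stream TEXT, Duration INTEGER, Timestamp TEXT);
    CREATE TABLE year_weeks (yearweek TEXT PRIMARY KEY);
    -- Invented sample data: Duration is in seconds.
    INSERT INTO logs_main VALUES
        ('asdf',  3600, '2011-01-04 10:00:00'),
        ('asdf',  1800, '2011-01-05 11:00:00'),
        ('other', 7200, '2011-01-12 09:00:00'),
        ('asdf',  7200, '2011-01-18 20:00:00');
    INSERT INTO year_weeks VALUES ('201100'), ('201101'), ('201102'), ('201103');
""")

weekly_hours = conn.execute("""
    SELECT w.yearweek,
           COALESCE(SUM(l.Duration), 0) / 3600.0 AS hours
    FROM year_weeks AS w
    LEFT JOIN logs_main AS l
           ON strftime('%Y%W', l.Timestamp) = w.yearweek
          AND l.Stream = ?
    GROUP BY w.yearweek
    ORDER BY w.yearweek
""", ("asdf",)).fetchall()

for yearweek, hours in weekly_hours:
    print(yearweek, hours)
```

Weeks with no plays of the chosen stream, including the week where only another stream was played, come back with 0 hours instead of being omitted.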
Here is a suggestion; it might not be exactly what you are looking for.
But say you had a simple table with one column [year_week] that contained the values of 1, 2, 3, 4... 52
You could then theoretically:
SELECT
A.year_week,
(SELECT COALESCE(SUM(`Duration`), 0)/60/60 FROM logs_main WHERE
`Stream` = 'asdf' AND YEARWEEK(`Timestamp`) = A.year_week)
FROM
tblYearWeeks A
This obviously needs some tweaking (for one, year_week has to hold values in the same YYYYWW form that YEARWEEK returns)... I've done several similar queries in other projects and this works well enough depending on the situation.
If you're looking for a one-table/SQL-based solution, then that is definitely something I would be interested in as well!