SQL change over time query - sql-server-2012

I have created 2 tables. One table has 4 fields: a unique name, a date and 3 figures. The second table contains the same fields but records the output of a merge function, so it also has a date recording when the update or insert happened. What I want to do is retrieve either the difference between 2 days or, alternatively, the totals for the 2 days, to work out how much the value has changed over the day. The merge function only updates a row if a value has changed, or inserts one if a new value is needed.
So far I have this:
select sum(Change_Table_1.Disk_Space) as total,
Change_Table_1.Date_Updated
from VM_Info
left join Change_Table_1
on VM_Info.VM_Unique = Change_Table_1.VM_Unique
where VM_Info.Agency = 'test'
group by Change_Table_1.Date_Updated
But this just returns the sum of that day's updated rows rather than the difference between the two days. One answer would be to add all new records to the table, but it would then contain a number of duplicates. In my head, what I want is to loop over the current figures for the day, then loop over the next day, while also including all values that haven't been updated. Sorry if I haven't explained this well. What I want to achieve is some sort of change of the total over time. If it's poor design, I'm in a position to accept that too.
Any help is much appreciated.
Maybe this explains it better: show me the total for day 1; then for day 2, show me the same value if it hasn't changed, or the new value if it has, and so on.
OK, to elaborate further:
the Change_Table looks like
vm | date created | action | value_Field1 | value_field_2 | Disk_Space
------- | ------- | ------- | ------- | ------- | -------
abc | 14/10/2013 | insert | 5 | 5 | 30
def | 14/10/2013 | insert | 5 | 5 | 75
abc | 15/10/2013 | update | 5 | 5 | 75
So the output I want for the 14th is the total of the last column, which is 105. On the 15th, abc has changed from 30 to 75; def hasn't changed but still needs to be included, giving 150.
So the output would look like
date | disk_Space
------- | -------
14/10/2013 | 105
15/10/2013 | 150

Does this help? If not, can you provide a few rows of sample data, and an example of the desired result?
select
(VM_Info.Disk_Space - Change_Table_1.Disk_Space) as DiskSpaceChange,
Change_Table_1.Date_Updated
from
VM_Info
left join Change_Table_1 on VM_Info.VM_Unique = Change_Table_1.VM_Unique and VM_Info.Date = Change_Table_1.Date_Updated
where
VM_Info.Agency = 'test'
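For the sample rows added to the question, another way to read the requirement is: for each date, take the latest value each VM had on or before that date, and sum those. Below is a minimal sketch of that idea, run through Python's sqlite3 so it is self-contained. The in-memory table, the ISO-formatted date strings (so that string comparison orders them correctly) and the function name `daily_totals` are assumptions for illustration. This is not the asker's real schema, and the SQL may need adjusting for SQL Server.

```python
import sqlite3

def daily_totals(rows):
    """Return (date, total Disk_Space) using each VM's latest value as of that date."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical cut-down version of Change_Table_1 (assumption).
    conn.execute(
        "CREATE TABLE Change_Table_1 "
        "(VM_Unique TEXT, Date_Updated TEXT, Disk_Space INTEGER)"
    )
    conn.executemany("INSERT INTO Change_Table_1 VALUES (?, ?, ?)", rows)
    query = """
        SELECT d.Date_Updated, SUM(c.Disk_Space) AS total
        FROM (SELECT DISTINCT Date_Updated FROM Change_Table_1) AS d
        CROSS JOIN Change_Table_1 AS c
        -- keep only the most recent row per VM on or before the report date
        WHERE c.Date_Updated = (SELECT MAX(c2.Date_Updated)
                                FROM Change_Table_1 AS c2
                                WHERE c2.VM_Unique = c.VM_Unique
                                  AND c2.Date_Updated <= d.Date_Updated)
        GROUP BY d.Date_Updated
        ORDER BY d.Date_Updated
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

sample = [
    ("abc", "2013-10-14", 30),
    ("def", "2013-10-14", 75),
    ("abc", "2013-10-15", 75),
]
print(daily_totals(sample))
```

With the three sample rows this gives 105 for the 14th and 150 for the 15th, because def's unchanged value is carried forward. The day-to-day change is then just the difference between consecutive totals.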

Related

I would like to know if there's a way to complete this query

I'm trying to obtain the average time of an "activity" in a Moodle database. I am not an SQL expert, but I have managed to get to the point shown in the picture. My question is whether there is a way to obtain, first, the timestamp/time difference by day (this "activity" does not have a starting-time column like many others), and then to sum those differences to get the average for that activity. For the first part I tried the function 'EXTRACT()' and comparing the dates in the format "%Y-%m-%d", but it only sums the first row where they are equal. By the way, I have been doing this with just an SQL statement; I know stored procedures exist, but my level of SQL is not that high.
Thanks in advance!
[Image: data obtained so far]
Data on table logs (the most important, I think):
component | action | objecttable | userid | courseid | timecreated
------- | ------- | ------- | ------- | ------- | -------
mod_quiz* | viewed | quiz_attempts | 6 | 2 | 1645287525
mod_forum | viewed | forum | 5 | 2 | 1645288525
core | loggedout | user | 2 | 0 | 1645291745
mod_page | viewed | page | 5 | 2 | 1645291955
Data I'm trying to get:
Activity | StartTime | EndTime | Total
------- | ------- | ------- | -------
forum | 19:01 | 19:10 | 9 minute(s)
quiz | 15:45 | 16:00 | 15 minute(s)
page | ... | ... | ...
workshop | ... | ... | ...
But so far I have only managed to arrange the data in a single column:
Time
2022-x-x h:m
....
But when I try to sum by day with the function EXTRACT(), matching the dates in a very long query, it only returns the first record.
NOTE: * Half of the "activities" were easy to calculate, since they have "timestart" and "timeend" columns, but I cannot figure out how to handle the ones that do not have a "timestart" column.

Listing Unmatched Positions out of One Table where reference date is specific

I am pretty new to SQL, but I need to use it for my new job because the project requires it. As a non-IT person this is harder for me, since it's the first time I have worked with SQL professionally.
Hopefully you can help me with it. (Sorry for my English, I am a non-native speaker.)
I need to write a query that returns the unequal IDs between 2 different reference dates.
I have one table with the following data:
DATES | ID | AMOUNT | SID
------- | ------- | ------- | -------
201910 | 122424 | 99999 | 1
201911 | 41241242 | 99999 | 2
201912 | 12412424 | -22222 | 3
...
GOAL:
The IDs from DATE 201911 should be compared with those from 201910, and the query should show me only the unmatched IDs.
From the result of this query, the Amount should be summed and grouped by SID.
If you have two dates and you want the sids that appear on only one of them, then:
select sid
from t
where date in (201911, 201910)
group by sid
having count(distinct date) = 1;
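To see what the `having count(distinct date) = 1` filter keeps, here is a small self-contained sketch run through Python's sqlite3. The table name `t` follows the answer; the sample rows, the lower-case column names and the function name `sids_on_one_date_only` are made up for illustration and are not the asker's real data.

```python
import sqlite3

def sids_on_one_date_only(rows, date_a, date_b):
    """Return the sids that occur on exactly one of the two reference dates."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (dates INTEGER, id INTEGER, amount INTEGER, sid INTEGER)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT sid
        FROM t
        WHERE dates IN (?, ?)
        GROUP BY sid
        -- a sid seen on both dates has two distinct dates and is dropped
        HAVING COUNT(DISTINCT dates) = 1
        ORDER BY sid
    """
    result = [row[0] for row in conn.execute(query, (date_a, date_b))]
    conn.close()
    return result

# Hypothetical rows: (dates, id, amount, sid)
sample = [
    (201910, 1, 100, 1),
    (201911, 1, 100, 1),   # sid 1 appears on both dates
    (201910, 2, 50, 2),    # sid 2 only on 201910
    (201911, 3, 70, 3),    # sid 3 only on 201911
    (201912, 4, 10, 4),    # outside the two reference dates
]
print(sids_on_one_date_only(sample, 201911, 201910))
```

The same pattern should work grouped by ID instead of sid if it is the unmatched IDs that are wanted, with the amounts summed in an outer query.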

Get the number of records from 2 columns where the time is overlapping

I am new to MS Access and am having trouble getting the number of records from overlapping time ranges. This is an example of my data.
[Image: example of raw data]
What I am trying to do is get the column number_of_records. For example, if 4 records are added at 5.11, number_of_records should become 8, since 4 records were already added at 5.10.
[Image: example of raw data with no_of_records column]
There is a mistake in my image above: I forgot to mention that when the time hits 6:00, the count should not add on to the previous records but should start afresh.
Do any of you have any suggestions?
Consider the correlated count subquery:
SELECT t.time_column_1, t.time_column_2,
(SELECT Count(*) FROM myTable sub
WHERE sub.time_column_1 <= t.time_column_1
AND sub.time_column_2 = t.time_column_2) AS number_of_records
FROM mytable t
ORDER BY t.time_column_2, t.time_column_1
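The raw data is only available as an image, so the following is a sketch under assumptions rather than a reproduction of it: `time_column_1` is taken to be the time a record was added and `time_column_2` the hour bucket in which the count restarts. It runs the same correlated count subquery through Python's sqlite3 instead of Access; the function name `running_counts` and the sample rows are made up.

```python
import sqlite3

def running_counts(rows):
    """Return (time_column_1, time_column_2, number_of_records) for every row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE myTable (time_column_1 TEXT, time_column_2 TEXT)")
    conn.executemany("INSERT INTO myTable VALUES (?, ?)", rows)
    query = """
        SELECT t.time_column_1, t.time_column_2,
               (SELECT COUNT(*) FROM myTable sub
                WHERE sub.time_column_1 <= t.time_column_1
                  AND sub.time_column_2 = t.time_column_2) AS number_of_records
        FROM myTable t
        ORDER BY t.time_column_2, t.time_column_1
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Assumed sample: 4 records at 05:10, 4 at 05:11, 2 at 06:00
sample = [("05:10", "05")] * 4 + [("05:11", "05")] * 4 + [("06:00", "06")] * 2
for row in sorted(set(running_counts(sample))):
    print(row)
```

Under these assumptions the 05:11 rows show 8 and the 06:00 rows start again at 2, which matches the behaviour described in the question.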

Calculating the number of new ID numbers per month in powerpivot

My dataset provides a monthly snapshot of customer accounts. Below is a very simplified version:
Date_ID | Acc_ID
------- | -------
20160430| 1
20160430| 2
20160430| 3
20160531| 1
20160531| 2
20160531| 3
20160531| 4
20160531| 5
20160531| 6
20160531| 7
20160630| 4
20160630| 5
20160630| 6
20160630| 7
20160630| 8
Customers can open or close their accounts, and I want to calculate the number of 'new' customers every month. The number of 'exited' customers will also be helpful if this is possible.
So in the above example, I should get the following result:
Month | New Customers
------- | -------
20160430| 3
20160531| 4
20160630| 1
Basically I want to compare distinct account numbers in the selected and previous month, any that exist in the selected month and not previous are new members, any that were there last month and not in the selected are exited.
I've searched but I can't seem to find any similar problems, and I hardly know where to start myself - I've tried using CALCULATE and FILTER along with DATEADD to filter the data to get two months, and then count the unique values. My PowerPivot skills aren't up to scratch to solve this on my own however!
Getting the new users is relatively straightforward - I'd add a calculated column which counts rows for that user in earlier months and if they don't exist then they are a new user:
=IF(CALCULATE(COUNTROWS(data),
FILTER(data, [Acc_ID] = EARLIER([Acc_ID])
&& [Date_ID] < EARLIER([Date_ID]))) = BLANK(),
"new",
"existing")
Once this is in place you can simply write a measure for new_users:
=CALCULATE(COUNTROWS(data), data[customer_type] = "new")
Getting the cancelled users is a little harder because it means you have to be able to look backwards to the prior month - none of the time intelligence stuff in PowerPivot will work out of the box here as you don't have a true date column.
It's nearly always good practice to have a separate date table in your PowerPivot models and it is a good way to solve this problem - essentially the table should be 1 record per date with a unique key that can be used to create a relationship. Perhaps post back with a few more details.
This is an alternative method to Jacob's which also works. It avoids creating a calculated column, but I actually find the calculated column useful as a flag against other measures.
=CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATESBETWEEN(
'Dates'[Date], 0, LASTDATE('Dates'[Date])
)
) - CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATESBETWEEN(
'Dates'[Date], 0, FIRSTDATE('Dates'[Date]) - 1
)
)
It basically uses the dates table to make a distinct count of all Acc_ID from the beginning of time until the first day of the period of time selected, and subtracts that from the distinct count of all Acc_ID from the beginning of time until the last day of the period of time selected. This is essentially the number of new distinct Acc_ID, although you can't work out which Acc_ID's these are using this method.
I could then calculate 'exited accounts' by taking the previous months total as 'existing accounts':
=CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATEADD('Dates'[Date], -1, MONTH)
)
Then adding the 'new accounts', and subtracting the 'total accounts':
=DISTINCTCOUNT('Accounts'[Acc_ID])
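As a sanity check on the logic outside PowerPivot, the month-to-month comparison described in the question can be sketched with plain Python sets. This is only an illustration of the arithmetic, not DAX; the function name `new_and_exited` is made up, and the data is the sample from the question.

```python
from collections import defaultdict

def new_and_exited(snapshots):
    """Return (Date_ID, new accounts, exited accounts) per snapshot, in date order."""
    by_month = defaultdict(set)
    for date_id, acc_id in snapshots:
        by_month[date_id].add(acc_id)

    result = []
    previous = set()  # no earlier snapshot, so everything in the first month is new
    for date_id in sorted(by_month):
        current = by_month[date_id]
        new = current - previous      # in this month but not the previous one
        exited = previous - current   # in the previous month but not this one
        # same identity as the measures above: exited = existing + new - total
        assert len(exited) == len(previous) + len(new) - len(current)
        result.append((date_id, len(new), len(exited)))
        previous = current
    return result

snapshots = (
    [(20160430, a) for a in (1, 2, 3)]
    + [(20160531, a) for a in (1, 2, 3, 4, 5, 6, 7)]
    + [(20160630, a) for a in (4, 5, 6, 7, 8)]
)
for row in new_and_exited(snapshots):
    print(row)
```

For the sample this gives 3, 4 and 1 new accounts, matching the expected table, and 3 exited accounts in 20160630 (accounts 1, 2 and 3).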

GROUP BY with date range

I have a table with 4 columns, id, Stream which is text, Duration (int), and Timestamp (datetime). There is a row inserted for every time someone plays a specific audio stream on my website. Stream is the name, and Duration is the time in seconds that they are listening. I am currently using the following query to figure up total listen hours for each week in a year:
SELECT YEARWEEK(`Timestamp`), (SUM(`Duration`)/60/60) FROM logs_main
WHERE `Stream`="asdf" GROUP BY YEARWEEK(`Timestamp`);
This does what I expect... presenting a total of listen time for each week in the year that there is data.
However, I would like to build a query where I have a result row for weeks that there may not be any data. For example, if the 26th week of 2006 has no rows that fall within that week, then I would like the SUM result to be 0.
Is it possible to do this? Maybe via a JOIN over a date range somehow?
The tried and true old-school solution is to set up another table with a bunch of date ranges that you can outer join with for the grouping (that is, the other table would have all of the weeks in it, each with a begin / end date).
In this case, you could just get by with a table full of the values from YEARWEEK:
201100
201101
201102
201103
201104
And here is a sketch of a sql statement:
SELECT year_weeks.yearweek, COALESCE(SUM(`Duration`), 0)/60/60
FROM year_weeks LEFT OUTER JOIN logs_main
ON year_weeks.yearweek = YEARWEEK(logs_main.`Timestamp`)
-- filter on Stream in the ON clause; in WHERE it would discard the empty weeks
AND logs_main.`Stream` = "asdf"
GROUP BY year_weeks.yearweek;
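Here is a runnable sketch of the same outer-join idea using Python's sqlite3. SQLite has no YEARWEEK(), so `strftime('%Y%W', ...)` stands in for it as an assumption; its week numbering is not identical to MySQL's. The function name `weekly_listen_hours` and the sample rows are made up for illustration.

```python
import sqlite3

def weekly_listen_hours(rows, year_weeks, stream):
    """Return (yearweek, listen hours) for every week in year_weeks, 0 when empty."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE logs_main (Stream TEXT, Duration INTEGER, Timestamp TEXT)"
    )
    conn.execute("CREATE TABLE year_weeks (yearweek TEXT)")
    conn.executemany("INSERT INTO logs_main VALUES (?, ?, ?)", rows)
    conn.executemany("INSERT INTO year_weeks VALUES (?)", [(w,) for w in year_weeks])
    query = """
        SELECT yw.yearweek, COALESCE(SUM(l.Duration), 0) / 3600.0
        FROM year_weeks AS yw
        LEFT JOIN logs_main AS l
          ON strftime('%Y%W', l.Timestamp) = yw.yearweek  -- stand-in for YEARWEEK()
         AND l.Stream = ?   -- in the ON clause so weeks without data survive
        GROUP BY yw.yearweek
        ORDER BY yw.yearweek
    """
    result = conn.execute(query, (stream,)).fetchall()
    conn.close()
    return result

# Hypothetical log rows: (Stream, Duration in seconds, Timestamp)
sample = [
    ("asdf", 3600, "2011-01-03 10:00:00"),
    ("asdf", 1800, "2011-01-04 09:00:00"),
    ("other", 3600, "2011-01-10 08:00:00"),
    ("asdf", 7200, "2011-01-17 12:00:00"),
]
print(weekly_listen_hours(sample, ["201101", "201102", "201103"], "asdf"))
```

The week with no "asdf" rows still appears, with 0 hours, because the stream filter sits in the join condition rather than in a WHERE clause.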
Here is a suggestion; it might not be exactly what you are looking for.
But say you had a simple table with one column [year_week] that contained the values of 1, 2, 3, 4... 52
You could then theoretically:
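SELECT
A.year_week,
-- assumes year_week holds YEARWEEK-style values such as 201126;
-- for plain week numbers 1..52, compare against WEEK(`Timestamp`) instead
(SELECT COALESCE(SUM(`Duration`), 0)/60/60 FROM logs_main WHERE
`Stream` = 'asdf' AND YEARWEEK(`Timestamp`) = A.year_week) AS listen_hours
FROM
tblYearWeeks A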
This obviously needs some tweaking... I've done several similar queries in other projects and this works well enough, depending on the situation.
If you're looking for a one-table/SQL-based solution, then that is definitely something I would be interested in as well!