SQL - Use DATEADD in GROUP BY (?) - sql

I'm working on a query that is meant to retrieve sales by hour, which it does. However, in the database table all timestamps are stored as UTC+1, including those for sales made in a country that is on UTC+2.
What I'm trying to achieve is a result that can be used by local business units (a parameter, set depending on who is viewing the report, determines which country/store to display). So for sales in a UTC+2 country, the timestamp needs to be shifted by +1 hour.
I'm thinking this can be done in the GROUP BY, perhaps by using DATEADD together with a condition that checks the country name. For example, when the country is 'Greece' (the column exists in the database), use DATEADD to add 1 hour to the timestamp.
Is this a possible solution and, if so, how is it done?
This is the query I'm using at the moment:
SELECT
DATEPART(hour, sales.OrderDate) AS Hour,
SUM(CASE WHEN FORMAT(sales.OrderDate, 'yyyy-MM-dd') = Cast(GETDATE() AS date) THEN sales.SALES * (1 + sales.vv / 100) END) AS SALES,
COUNT(DISTINCT (CASE WHEN FORMAT(sales.OrderDate, 'yyyy-MM-dd') = Cast(GETDATE() AS date) THEN sales.OID END) ) AS CUSTOMERS,
MAX(CASE WHEN FORMAT(sv.Date_Time, 'yyyy-MM-dd') = Cast(GETDATE() AS date) THEN sv.CC END) AS VISITORS
FROM
[DW].[Tot_Sales] AS sales
LEFT JOIN [DW].[SV_24M] AS sv ON dateadd(hour, datediff(hour, 0, sales.[OrderDate]), 0) = dateadd(HOUR, 0, sv.Date_Time)
AND sales.SID = sv.SID
WHERE
sales.SID = @Store
AND FORMAT(sales.OrderDate, 'yyyy-MM-dd') > DATEADD(year, - 1, Cast(GETDATE() AS date))
GROUP BY
sales.SID,
DATEPART(hour, sales.OrderDate)
ORDER BY
DATEPART(hour, sales.OrderDate)
The adjustment needs to be applied to the SALES and CUSTOMERS columns, which come from the table aliased as sales. This is how the result looks:
Resultset
The issue is that the sales and customers shown at hour 9 actually occurred at hour 10 in UTC+2. Visitor data arrives in local time (UTC+2 in this case), so there is a mismatch between Visitors and Customers.
Sample data set:
Dataset

If you know the countries, you can use AT TIME ZONE in the query to convert the timestamps to the appropriate local time.
Refer to this
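As a rough illustration of that suggestion, the sketch below converts the stored timestamp to local time before grouping by hour. It assumes SQL Server 2016 or later (where AT TIME ZONE is available), that the stored values follow the Windows zone 'Central European Standard Time', and that Greek local time corresponds to 'GTB Standard Time'. The table variable, its sample rows and the Country values are hypothetical stand-ins for the real tables.

```sql
-- Hypothetical sample rows standing in for [DW].[Tot_Sales]
DECLARE @sales TABLE (
    OID       int,
    Country   varchar(20),
    OrderDate datetime2(0),   -- stored as UTC+1 wall-clock time (assumption)
    SALES     decimal(10, 2)
);

INSERT INTO @sales (OID, Country, OrderDate, SALES) VALUES
    (1, 'Greece', '2021-03-10T09:15:00', 100.00),
    (2, 'Greece', '2021-03-10T09:45:00',  50.00),
    (3, 'Sweden', '2021-03-10T09:30:00',  75.00);

SELECT
    s.Country,
    DATEPART(hour, lt.LocalOrderDate) AS [Hour],
    SUM(s.SALES)                      AS SALES,
    COUNT(DISTINCT s.OID)             AS CUSTOMERS
FROM @sales AS s
CROSS APPLY (
    -- Compute the local timestamp once so SELECT and GROUP BY can share it
    SELECT CASE
               WHEN s.Country = 'Greece'
               THEN CAST(s.OrderDate
                         AT TIME ZONE 'Central European Standard Time'  -- label the stored value
                         AT TIME ZONE 'GTB Standard Time'               -- convert to Greek local time
                         AS datetime2(0))
               ELSE s.OrderDate
           END AS LocalOrderDate
) AS lt
GROUP BY s.Country, DATEPART(hour, lt.LocalOrderDate)
ORDER BY s.Country, [Hour];
```

On these sample rows the two Greek orders should land in hour 10 while the Swedish order stays in hour 9. Replacing the AT TIME ZONE expression with DATEADD(hour, 1, s.OrderDate) inside the same CASE gives the DATEADD variant the question asks about, as long as both zones switch daylight saving at the same moment.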

Related

Matching date with calculated DATEADD date

I am trying to create a table with columns containing the current date, prior year date, and additional column for the total sum revenue as below:
cur_date | py_date | py_rev
I'm trying to compare revenue across any daily period across years. Assume all dates and revenue values are included in the same SQL Server table.
I attempted to use a CASE expression with [date] = DATEADD(wk,-52,[date]) as the condition to return the appropriate total. The full query is below:
select
[date] as cur_date,
DATEADD(wk,-52,[date]) py_date,
SUM(case when [date] = DATEADD(wk,-52,[date]) then sum_rev else 0 end) as py_rev
from summary
group by [date]
When running this code the py_date is as expected but py_rev returns 0 as if there is no match. What's most confusing is if I hard code a random date in place of the DATEADD portion then a total is returned. I have also tried CAST to format both date and the DATEADD portion as date with no luck.
Is there something about DATEADD that will not match to other date columns that I'm missing?
If you want the previous year's revenue, then lag() is one method. This works assuming that "previous year" means 52 weeks ago and you have records for all dates:
select [date] as cur_date,
dateadd(week, -52, [date]) as py_date,
lag(sum_rev, 52 * 7) over (order by date) as py_rev
from summary;
If you do not have records for all dates, then another approach is needed. You can use a LEFT JOIN:
select s.date, dateadd(week, -52, s.[date]),
sprev.sum_rev as py_rev
from summary s left join
summary sprev
on sprev.date = dateadd(week, -52, s.[date]);
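For context on why the original CASE returned 0: within a single row, [date] can never equal itself minus 52 weeks, so the condition is false for every row. The comparison has to be made against a different row, which is what the self-join does. Below is a minimal sketch of the LEFT JOIN variant on made-up data; the table variable and its three rows are hypothetical.

```sql
-- Hypothetical stand-in for the summary table
DECLARE @summary TABLE ([date] date, sum_rev decimal(10, 2));

INSERT INTO @summary ([date], sum_rev) VALUES
    ('20190107', 100.00),   -- exactly 52 weeks before 2020-01-06
    ('20200106', 150.00),
    ('20200107', 175.00);   -- no row exists 52 weeks earlier

SELECT s.[date]                      AS cur_date,
       DATEADD(week, -52, s.[date])  AS py_date,
       sprev.sum_rev                 AS py_rev   -- NULL when no prior-year row exists
FROM @summary AS s
LEFT JOIN @summary AS sprev
       ON sprev.[date] = DATEADD(week, -52, s.[date])
ORDER BY s.[date];
```

Only the 2020-01-06 row finds a match here; the other two rows return NULL for py_rev, which can be wrapped in ISNULL(..., 0) if a zero is preferred.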

SQL Query Average by Day of Week

I am trying to devise a query which will tell me the average number of procedures done on a given weekday as well as the total number of procedures on that week day for the entire time frame. The query I've developed looks like it works, but the values are not adding up correctly.
SELECT [Day], COUNT(*) AS "Week Day Count", AVG(Totals) AS [Avg]
FROM
(
SELECT
w = DATEDIFF(WEEK, 0, CompleteDate),
[Day] = DATENAME(WEEKDAY, CompleteDate),
Totals = COUNT(*)
FROM dbo.[order]
WHERE CompleteDate Between '2015-01-01' AND '2016-04-22'
AND PlacerFld2 IN ('CT','SAMR')
AND OrderStatusID = '2'
GROUP BY
DATEDIFF(WEEK, 0, CompleteDate),
DATENAME(WEEKDAY, CompleteDate),
DATEPART(WEEKDAY, CompleteDate)
) AS q
GROUP BY [Day]
ORDER BY [Day];
I feel like the Average results are correct, however, the "Week Day Count" does not come up nearly as high as I thought it should be and perhaps it's just the way I am computing it.
When I add up the values in the Week Day Count it comes up to be about 365, but when I do the query below, I get about 1750:
SELECT COUNT(*) AS "Total 2015-2016"
FROM [order]
WHERE CompleteDate Between '2015-01-01' AND '2016-04-22'
AND PlacerFld2 IN ('CT','SAMR')
AND OrderStatusID = '2'
I suspect that you actually want the sum of the total:
SUM(Totals) AS "Week Day Count"
Your query is (I think) counting the number of days in the data for each weekday.
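To make the difference concrete, here is a small sketch on invented data that returns both numbers side by side. The table variable and its rows are hypothetical, and the filters on PlacerFld2 and OrderStatusID are left out for brevity. AVG is applied to 1.0 * Totals because AVG over an integer column returns an integer in SQL Server.

```sql
-- Hypothetical stand-in for dbo.[order]
DECLARE @orders TABLE (CompleteDate datetime);

INSERT INTO @orders (CompleteDate) VALUES
    ('20150105'), ('20150105'), ('20150105'),  -- three procedures on a Monday
    ('20150112'),                              -- one on the following Monday
    ('20150106');                              -- one on a Tuesday

SELECT [Day],
       COUNT(*)          AS [Weeks With Data],  -- what the original query counted
       SUM(Totals)       AS [Week Day Count],   -- total procedures on that weekday
       AVG(1.0 * Totals) AS [Avg]               -- average per week, not truncated
FROM (
    SELECT DATEDIFF(WEEK, 0, CompleteDate) AS w,
           DATENAME(WEEKDAY, CompleteDate) AS [Day],
           COUNT(*)                        AS Totals
    FROM @orders
    GROUP BY DATEDIFF(WEEK, 0, CompleteDate),
             DATENAME(WEEKDAY, CompleteDate)
) AS q
GROUP BY [Day];
```

For Monday, the inner query produces two rows (one per week), so COUNT(*) gives 2 while SUM(Totals) gives 4, which is the figure that should match a plain COUNT(*) over the base table.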

Query to check number of records created in a month.

My table creates a new record with timestamp daily when an integration is successful. I am trying to create a query that would check (preferably automated) the number of days in a month vs number of records in the table within a time frame.
For example, January has 31 days, so I would like to know on how many days in January my process was not successful. If the number of records is less than 31, then I know the job failed 31 - x times.
I tried the following but was not getting very far:
SELECT COUNT (DISTINCT CompleteDate)
FROM table
WHERE CompleteDate BETWEEN '01/01/2015' AND '01/31/2015'
Every 7 days the system executes the job twice, so I get two records on the same day, but I am trying to determine the number of days on which nothing happened (failures), so I assume some truncation of the date field is needed?
One way to do this is to use a calendar/date table as the main source of dates in the range and left join with that and count the number of null values.
In absence of a proper date table you can generate a range of dates using a number sequence like the one found in the master..spt_values table:
select count(*) failed
from (
select dateadd(day, number, '2015-01-01') date
from master..spt_values where type='P' and number < 365
) a
left join your_table b on a.date = b.CompleteDate
where b.CompleteDate is null
and a.date BETWEEN '01/01/2015' AND '01/31/2015'
Sample SQL Fiddle (with count grouped by month)
Assuming you have an Integers table, this query will pull all dates where no record is found in the target table:
declare @StartDate datetime = '01/01/2013',
@EndDate datetime = '12/31/2013'
;with d as (
select *, date = dateadd(d, i - 1 , @StartDate)
from dbo.Integers
where i <= datediff(d, @StartDate, @EndDate) + 1
)
select d.date
from d
where not exists (
select 1 from <target> t
where DATEADD(dd, DATEDIFF(dd, 0, t.<timestamp>), 0) = DATEADD(dd, DATEDIFF(dd, 0, d.date), 0)
)
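BETWEEN is not safe here: if CompleteDate carries a time component, an upper bound of '01/31/2015' means midnight at the start of the 31st, so records created later that day are excluded. Use a half-open range and truncate the timestamp to a date instead: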
SELECT 31 - count(distinct(convert(date, CompleteDate)))
FROM table
WHERE CompleteDate >= '01/01/2015' AND CompleteDate < '02/01/2015'
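A minimal sketch of that approach on invented data, with the number of days in the month computed rather than hard-coded as 31. The table variable and its rows are hypothetical; the second of January deliberately has two runs to mimic the double execution described in the question.

```sql
-- Hypothetical stand-in for the integration log table
DECLARE @runs TABLE (CompleteDate datetime);

INSERT INTO @runs (CompleteDate) VALUES
    ('20150101 02:00'),
    ('20150102 02:00'),
    ('20150102 14:00'),   -- second run on the same day, must count once
    ('20150131 02:00');   -- last day of the month, after midnight

SELECT DATEDIFF(day, '20150101', '20150201')            AS DaysInMonth,
       COUNT(DISTINCT CONVERT(date, CompleteDate))      AS DaysWithRun,
       DATEDIFF(day, '20150101', '20150201')
         - COUNT(DISTINCT CONVERT(date, CompleteDate))  AS FailedDays
FROM @runs
WHERE CompleteDate >= '20150101'
  AND CompleteDate <  '20150201';   -- half-open range keeps all of Jan 31
```

With these four rows the query should report 3 days with a run out of 31. Using BETWEEN '20150101' AND '20150131' instead would drop the 02:00 run on the 31st.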
You can use the following query:
SELECT DATEDIFF(day, t.d, dateadd(month, 1, t.d)) - COUNT(DISTINCT CompleteDate)
FROM mytable
CROSS APPLY (SELECT CAST(YEAR(CompleteDate) AS VARCHAR(4)) +
RIGHT('0' + CAST(MONTH(CompleteDate) AS VARCHAR(2)), 2) +
'01') t(d)
GROUP BY t.d
SQL Fiddle Demo
Explanation:
The value CROSS APPLY-ied, i.e. t.d, is the ANSI string of the first day of the month of CompleteDate, e.g. '20150101' for 12/01/2015, or 18/01/2015.
DATEDIFF uses the above mentioned value, i.e. t.d, in order to calculate the number of days of the month that CompleteDate belongs to.
GROUP BY essentially groups by (Year, Month), hence COUNT(DISTINCT CompleteDate) returns the number of distinct records per month.
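The values returned by the query are the difference between the number of days in the month (the DATEDIFF result) and the number of distinct records in that month (the COUNT result), i.e. the number of failures per month, for each (Year, Month) of your initial data.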
If you want to query a specific Year, Month then just simply add a WHERE clause to the above:
WHERE YEAR(CompleteDate) = 2015 AND MONTH(CompleteDate) = 1

Combining daily averages for different time periods in a single query

I have a table with hourly entries for multiple products going back 2 years. I am trying to write a query which would look something like this:
PRODUCT, TODAY'S AVERAGE, LAST MONTHS DAILY AVERAGE, YEAR TO DATE DAILY AVERAGE
I am able to achieve this by writing separate queries for each of the averages and then joining them on the PRODUCT NAME. However, I want to be able to do the same, by writing one single query.
Is there a standard algorithm/method that I can apply?
This is an aggregation query. The subquery sums the values by day and computes a 0/1 flag for each of the time periods you want; the outer query then uses those flags to do the final calculations.
select product,
sum(DailySum * IsToday) as Today,
-- nullif avoids a divide-by-zero when a period has no rows
sum(1.0 * DailySum * IsLastMonth) / nullif(sum(IsLastMonth), 0) as LastMonthDailyAvg,
sum(1.0 * DailySum * IsYTD) / nullif(sum(IsYTD), 0) as YTDDailyAvg
from (select product, cast(dt as date) as thedate, sum(val) as DailySum,
(case when cast(dt as date) = cast(getdate() as date) then 1 else 0 end) as IsToday,
(case when year(cast(dt as date)) = year(dateadd(month, -1, getdate()))
and month(cast(dt as date)) = month(dateadd(month, -1, getdate()))
then 1 else 0
end) as IsLastMonth,
(case when year(cast(dt as date)) = year(getdate()) then 1 else 0
end) as IsYTD
from t
group by product, cast(dt as date)
) t
group by product

Calculating Open incidents per month

We have incidents in our system, each with a start time, a finish time and a project name (plus other info).
We would like a report showing how many incidents had 'open' status per month, per project.
Open status means: not finished.
If an incident is created in December 2009 and closed in March 2010, then it should be counted in December 2009, January 2010 and February 2010.
Needed structure should be like this:
Project Year Month Count
------- ------ ------- -------
Test 2009 December 2
Test 2010 January 10
Test 2010 February 12
....
In SQL Server:
SELECT
Project,
Year = YEAR(TimeWhenStillOpen),
Month = DATENAME(month, DATEADD(month, MONTH(TimeWhenStillOpen) - 1, 0)),
Count = COUNT(*)
FROM (
SELECT
i.Project,
i.Incident,
TimeWhenStillOpen = DATEADD(month, v.number, i.StartTime)
FROM (
SELECT
Project,
Incident,
StartTime,
FinishTime = ISNULL(FinishTime, GETDATE()),
MonthDiff = DATEDIFF(month, StartTime, ISNULL(FinishTime, GETDATE()))
FROM Incidents
) i
INNER JOIN master..spt_values v ON v.type = 'P'
AND v.number BETWEEN 0 AND MonthDiff - 1
) s
GROUP BY Project, YEAR(TimeWhenStillOpen), MONTH(TimeWhenStillOpen)
ORDER BY Project, YEAR(TimeWhenStillOpen), MONTH(TimeWhenStillOpen)
Briefly, how it works:
The innermost subselect, which works directly on the Incidents table, 'normalises' the table (it replaces NULL finish times with the current time) and adds a month difference column, MonthDiff. If there can be no NULLs in your case, just remove the ISNULL expressions.
The outer subselect uses MonthDiff to break the time range up into a series of timestamps, one for each month in which the incident was still open, i.e. the FinishTime month is not included. A system table called master..spt_values is used there as a ready-made numbers table.
Lastly, the main select is only left with the task of grouping the data.
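The expansion step can be seen in isolation in the sketch below. It uses a hypothetical table variable with two made-up incidents, and a short inline VALUES list in place of master..spt_values so that it does not depend on any system table.

```sql
-- Hypothetical stand-in for the Incidents table
DECLARE @Incidents TABLE (
    Project    varchar(20),
    Incident   int,
    StartTime  datetime,
    FinishTime datetime NULL
);

INSERT INTO @Incidents (Project, Incident, StartTime, FinishTime) VALUES
    ('Test', 1, '20091215', '20100310'),   -- open in Dec 2009, Jan 2010, Feb 2010
    ('Test', 2, '20100120', '20100205');   -- open in Jan 2010 only

SELECT i.Project,
       i.Incident,
       DATEADD(month, n.number, i.StartTime) AS TimeWhenStillOpen
FROM @Incidents AS i
INNER JOIN (VALUES (0), (1), (2), (3), (4), (5)) AS n(number)   -- small inline numbers table
        ON n.number < DATEDIFF(month, i.StartTime, ISNULL(i.FinishTime, GETDATE()))
ORDER BY i.Incident, n.number;
```

Incident 1 yields three rows (one per open month) and incident 2 yields one, so grouping the result by project, year and month gives the counts the report needs. The inline list only covers incidents open for up to six months; a real numbers table would be needed for longer ranges.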
A useful technique here is to create either a table of "all" dates (clearly that would be infinite so I mean a sufficiently large range for your purposes) OR create two tables: one of all the months (12 rows) and another of "all" years.
Let's assume you go for the 1st of these:
create table all_dates (d date)
and populate as appropriate. I'm going to define your incident table as follows
create table incident
(
incident_id int not null,
project_id int not null,
start_date date not null,
end_date date null
)
I'm not sure what RDBMS you are using and date functions vary a lot between them so the next bit may need adjusting for your needs.
select
project_id,
datepart(yy, all_dates.d) as "year",
datepart(mm, all_dates.d) as "month",
count(*) as "count"
from
incident,
all_dates
where
incident.start_date <= all_dates.d and
(incident.end_date >= all_dates.d or incident.end_date is null)
group by
project_id,
datepart(yy, all_dates.d),
datepart(mm, all_dates.d)
That is not going to quite work as we want as the counts will be for every day that the incident was open in each month. To fix this we either need to use a subquery or a temporary table and that really depends on the RDBMS...
Another problem is that incidents that are still open will be counted against all future months in your all_dates table. Adding an all_dates.d <= today condition solves that. Again, different RDBMSs have different methods of giving back now/today/systemtime...
Another approach is to have an all_months rather than all_dates table that just has the date of first of the month in it:
create table all_months (first_of_month date)
select
project_id,
datepart(yy, all_months.first_of_month) as "year",
datepart(mm, all_months.first_of_month) as "month",
count(*) as "count"
from
incident,
all_months
where
incident.start_date <= dateadd(day, -1, dateadd(month, 1, first_of_month)) and
(incident.end_date >= first_of_month or incident.end_date is null)
group by
project_id,
datepart(yy, all_months.first_of_month),
datepart(mm, all_months.first_of_month)