How to aggregate 7 days in SQL - sql

I was trying to aggregate a 7 days data for FY13 (starts on 10/1/2012 and ends on 9/30/2013) in SQL Server but so far no luck yet. Could someone please take a look. Below is my example data.
DATE BREAD MILK
10/1/12 1 3
10/2/12 2 4
10/3/12 2 3
10/4/12 0 4
10/5/12 4 0
10/6/12 2 1
10/7/12 1 3
10/8/12 2 4
10/9/12 2 3
10/10/12 0 4
10/11/12 4 0
10/12/12 2 1
10/13/12 2 1
So, my desired output would be like:
DATE BREAD MILK
10/1/12 1 3
10/2/12 2 4
10/3/12 2 3
10/4/12 0 4
10/5/12 4 0
10/6/12 2 1
Total 11 15
10/7/12 1 3
10/8/12 2 4
10/9/12 2 3
10/10/12 0 4
10/11/12 4 0
10/12/12 2 1
10/13/12 2 1
Total 13 16
--------through 9/30/2013
Please note, since FY13 starts on 10/1/2012 and ends on 9/30/2012, the first week of FY13 is 6 days instead of 7 days.
I am using SQL server 2008.

You could add a new computed column for the date values to group them by week and sum the other columns, something like this:
SELECT DATEPART(ww, DATEADD(d,-2,[DATE])) AS WEEK_NO,
SUM(Bread) AS Bread_Total, SUM(Milk) as Milk_Total
FROM YOUR_TABLE
GROUP BY DATEPART(ww, DATEADD(d,-2,[DATE]))
Note: I used DATEADD and subtracted 2 days to set the first day of the week to Monday based on your dates. You can modify this if required.

Use option with GROUP BY ROLLUP operator
SELECT CASE WHEN DATE IS NULL THEN 'Total' ELSE CONVERT(nvarchar(10), DATE, 101) END AS DATE,
SUM(BREAD) AS BREAD, SUM(MILK) AS MILK
FROM dbo.test54
GROUP BY ROLLUP(DATE),(DATENAME(week, DATE))
Demo on SQLFiddle
Result:
DATE BREAD MILK
10/01/2012 1 3
10/02/2012 2 4
10/03/2012 2 3
10/04/2012 0 4
10/05/2012 4 0
10/06/2012 2 1
Total 11 15
10/07/2012 1 3
10/08/2012 4 7
10/10/2012 0 4
10/11/2012 4 0
10/12/2012 2 1
10/13/2012 2 1
Total 13 16

You are looking for a rollup. In this case, you will need at least one more column to group by to do your rollup on, the easiest way to do that is to add a computed column that groups them into weeks by date.
Take a lookg at: Summarizing Data Using ROLLUP

Here is the general idea of how it could be done:
You need a derived column for each row to determine which fiscal week that record belongs to. In general you could subtract that record's date from 10/1, get the number of days that have elapsed, divide by 7, and floor the result.
Then you can GROUP BY that derived column and use the SUM aggregate function.
The biggest wrinkle is that 6 day week you start with. You may have to add some logic to make sure that the weeks start on Sunday or whatever day you use but this should get you started.

The WITH ROLLUP suggestions above can help; you'll need to save the data and transform it as you need.
The biggest thing you'll need to be able to do is identify your weeks properly. If you don't have those loaded into tables already so you can identify them, you can build them on the fly. Here's one way to do that:
CREATE TABLE #fy (fyear int, fstart datetime, fend datetime);
CREATE TABLE #fylist(fyyear int, fydate DATETIME, fyweek int);
INSERT INTO #fy
SELECT 2012, '2011-10-01', '2012-09-30'
UNION ALL
SELECT 2013, '2012-10-01', '2013-09-30';
INSERT INTO #fylist
( fyyear, fydate )
SELECT fyear, DATEADD(DAY, Number, DATEADD(DAY, -1, fy.fstart)) AS fydate
FROM Common.NUMBERS
CROSS APPLY (SELECT * FROM #fy WHERE fyear = 2013) fy
WHERE fy.fend >= DATEADD(DAY, Number, DATEADD(DAY, -1, fy.fstart));
WITH weekcalc AS
(
SELECT DISTINCT DATEPART(YEAR, fydate) yr, DATEPART(week, fydate) dt
FROM #fylist
),
ridcalc AS
(
SELECT
ROW_NUMBER() OVER (ORDER BY yr, dt) AS rid, yr, dt
FROM weekcalc
)
UPDATE #fylist
SET fyweek = rid
FROM #fylist
JOIN ridcalc
ON DATEPART(YEAR, fydate) = yr
AND DATEPART(week, fydate) = dt;
SELECT list.fyyear, list.fyweek, p.[date], COUNT(bread) AS Bread, COUNT(Milk) AS Milk
FROM products p
JOIN #fylist list
ON p.[date] = list.fydate
GROUP BY list.fyyear, list.fyweek, p.[date] WITH ROLLUP;
The Common.Numbers reference above is a simple numbers table that I use for this sort of thing (goes from 1 to 1M). You could also build that on the fly as needed.

Related

How to LEFT JOIN on ROW_NUM using WITH

Right now I'm in the testing phase of this query so I'm only testing it on two Queries. I've gotten stuck on the final part where I want to left join everything (this will have to be extended to 12 separate queries). The problem is basically as the title suggests--I want to join 12 queries on the created Row_Num column using the WITH() statement, instead of creating 12 separate tables and saving them as table in a database.
WITH Jan_Table AS
(SELECT ROW_NUMBER() OVER (ORDER BY a.SALE_DATE) as Row_ID, a.SALE_DATE, sum(a.revenue) as Jan_Rev
FROM ba.SALE_TABLE a
WHERE a.SALE_DATE BETWEEN '2015-01-01' and '2015-01-31'
GROUP BY a.SALE_DATE)
SELECT ROW_NUMBER() OVER (ORDER BY a.SALE_DATE) as Row_ID, a.SALE_DATE, sum(a.revenue) as Jun_Rev, j.Jan_Rev
FROM ba.SALE_TABLE a
LEFT JOIN Jan_Table j
on "j.Row_ID" = a.Row_ID
WHERE a.SALE_DATE BETWEEN '2015-06-01' and '2015-06-30'
GROUP BY a.SALE_DATE
And then I get this error message:
ERROR: column "j.Row_ID" does not exist
I put in the "j.Row_ID" because the previous message was:
ERROR: column a.row_id does not exist Hint: Perhaps you meant to
reference the column "j.row_id".
Each query works individually without the JOIN and WITH functions. I have one for every month of the year and want to join 12 of these together eventually.
The output should be a single column with ROW_NUM and 12 Monthly Revenues columns. Each row should be a day of the month. I know not every month has 31 days. So, for example, Feb only has 28 days, meaning I'd want days 29, 30, and 31 as NULLs. The query above still has the dates--but I will remove the "SALE_DATE" column after I can just get these two queries to join.
My initially thought was just to create 12 tables but I think that'd be a really bad use of space and not the most logical solution to this problem if I were to extend this solution.
edit
Below are the separate outputs of the two qaruies above and the third table is what I'm trying to make. I can't give you the raw data. Everything above has been altered from the actual column names and purposes of the data that I'm using. And I don't know how to create a dataset--that's too above my head in SQL.
Jan_Table (first five lines)
Row_Num Date Jan_Rev
1 2015-01-01 20
2 2015-01-02 20
3 2015-01-03 20
4 2015-01-04 20
5 2015-01-05 20
Jun_Table (first five lines)
Row_Num Date Jun_Rev
1 2015-06-01 30
2 2015-06-02 30
3 2015-06-03 30
4 2015-06-04 30
5 2015-06-05 30
JOINED_TABLE (first five lines)
Row_Num Date Jun_Rev Date Jan_Rev
1 2015-06-01 30 2015-01-01 20
2 2015-06-02 30 2015-01-02 20
3 2015-06-03 30 2015-01-03 20
4 2015-06-04 30 2015-01-04 20
5 2015-06-05 30 2015-01-05 20
It seems like you can just use group by and conditional aggregation for your full query:
select day(sale_date),
max(case when month(sale_date) = 1 then sale_date end) as jan_date,
max(case when month(sale_date) = 1 then revenue end) as jan_revenue,
max(case when month(sale_date) = 2 then sale_date end) as feb_date,
max(case when month(sale_date) = 2 then revenue end) as feb_revenue,
. . .
from sale_table s
group by day(sale_date)
order by day(sale_date);
You haven't specified the database you are using. DAY() is a common function to get the day of the month; MONTH() is a common function to get the months of the year. However, those particular functions might be different in your database.

SQL query to find out number of days in a week a user visited

I'd like to find out how many days in a week users have visited my site. For example, 1 day in a week, 2 days in a week, every day of the week (7).
I imagine the easiest way of doing this would be to set the date range and find out the number of days within that range (option 1). However, ideally I'd like the code to understand a week so I can run a number of weeks in one query (option 2). I'd like the users to be unique for each number of days (ie those who have visited 2 days have also visited 1 day but would only be counted in the 2 days row)
In my database (using SQLWorkbench64) I have user ids (id) and date (dt)
I'm relatively new to SQL so any help would be very much appreciated!!
Expected results (based on total users = 5540):
Option 1:
Number of Days Users
1 2000
2 1400
3 1000
4 700
5 300
6 100
7 40
Option 2:
Week Commencing Number of Days Users
06/05/2019 1 2000
06/05/2019 2 1400
06/05/2019 3 1000
06/05/2019 4 700
06/05/2019 5 300
06/05/2019 6 100
06/05/2019 7 40
You can find visitor count between a date range with below script. Its also consider if a visitor visits multi days in the given date range, s/he will be counted for the latest date only from the range-
Note: Dates are used as sample in the query.
SELECT date,COUNT(id)
FROM
(
SELECT id,max(date) date
FROM your_table
WHERE date BETWEEN '04/21/2019' AND '04/22/2019'
GROUP BY ID
)A
GROUP BY date
You can find the Monday of the week of a date and then group by that. After you have the week day there is a series of group by. Here is how I did this:
DECLARE #table TABLE
(
id INT,
date DATETIME,
MondayOfWeek DATETIME
)
DECLARE #info TABLE
(
CommencingWeek DATETIME,
NumberOfDays INT,
Users INT
)
INSERT INTO #table (id,date) VALUES
(1,'04/15/2019'), (2,'07/21/2018'), (3,'04/16/2019'), (4,'04/16/2018'), (1,'04/16/2019'), (2,'04/17/2019')
UPDATE #table
SET MondayOfWeek = CONVERT(varchar(50), (DATEADD(dd, ##DATEFIRST - DATEPART(dw, date) - 6, date)), 101)
INSERT INTO #info (CommencingWeek,NumberOfDays)
SELECT MondayOfWeek, NumberDaysInWeek FROM
(
SELECT id,MondayOfWeek,COUNT(*) AS NumberDaysInWeek FROM #table
GROUP BY id,MondayOfWeek
) T1
SELECT CommencingWeek,NumberOfDays,COUNT(*) AS Users FROM #info
GROUP BY CommencingWeek,NumberOfDays
ORDER BY CommencingWeek DESC
Here is the output from my query:
CommencingWeek NumberOfDays Users
2019-04-14 00:00:00.000 1 2
2019-04-14 00:00:00.000 2 1
2018-07-15 00:00:00.000 1 1
2018-04-15 00:00:00.000 1 1

SQL Server iterate in select query and update upon condition

I have the following result from query
empId totalPoints addPointsDate incidentDate
-------------------------------------------------
1 11 2015-06-04 2015-07-11
2 12 2015-07-04 2015-08-16
3 10 2015-08-04 2015-06-14
4 9 2015-06-14 2015-09-11
I have to update the total points if the addPointsDate was 5 weeks ago and I didn't have any incidentDate during this period.
Can I do that with stored procedure or I have to use sql function
A simple update is sufficient:
update mytable
set totalPoints = totalPoints + 1
, addPointsDate = getdate()
where
addPointsDate <= dateadd(week, -5, getdate()) -- add points was before 5 weeks ago
and incidentDate < addPointsDate -- last incident was before add points
I would use plain SQL (this is just a simple UPDATE statement) or a procedure -- functions expect return values which you don't need.
UPDATE
pointsTable
SET
totalPoints = totalPoints + 1
WHERE
-- More than 5 weeks ago
DATEDIFF(DAY, addPointsDate, GETDATE()) > 35 -- 5 weeks * 7 days
AND
-- No incidents, or incident was before the add points date
(incidentDate IS NULL OR incidentDate < addPointsDate)

Splitting SQL Data Into Months

I have a Datatable with several hundred rows for this year in it. (MS SqlServer 2k8)
I would like to split this data set out into customer enquiries / Month.
What I have so far is;
Select count(id) As Customers, DatePart(month, enquiryDate) as MonthTotal, productCode From customerEnquiries
where enquiryDate > '2012-01-01 00:00:00'
group by productCode, enquiryDate
But this then produces a row for each data item. (Whereas I want a row per month for each data item.)
So how do I change the above query, so that instead of getting
1 1 10
1 1 10
1 1 11
1 2 10
1 2 10
...
I get
2 1 10 <-- 2 enquiries for product code 10 in month 1
1 1 11 <-- 1 enquiries for product code 11 in month 1
2 2 10 <-- 2 enquiries for product code 10 in month 2
etc
And as a bonus question, is there an easy way of naming each month so the output is Jan, Feb, March instead of 1,2,3 in the month column?
Try this
Select count(id) As Customers, DatePart(month, enquiryDate) as MonthTotal, productCode From customerEnquiries
where enquiryDate > '2012-01-01 00:00:00'
group by productCode, DatePart(month, enquiryDate)
This may help you.
For the Bonus, DATENAME(MONTH, enquiryDate) will give you the name of the Month.

Can I use Oracle SQL to plot actual dates from Schedule Information?

I asked this question in regard to SQL Server, but what's the answer for an Oracle environment (10g)?
If I have a table containing schedule information that implies particular dates, is there a SQL statement that can be written to convert that information into actual rows, using something like MSSQL's Commom Table Expressions, perhaps?
Consider a payment schedule table with these columns:
StartDate - the date the schedule begins (1st payment is due on this date)
Term - the length in months of the schedule
Frequency - the number of months between recurrences
PaymentAmt - the payment amount :-)
SchedID StartDate Term Frequency PaymentAmt
-------------------------------------------------
1 05-Jan-2003 48 12 1000.00
2 20-Dec-2008 42 6 25.00
Is there a single SQL statement to allow me to go from the above to the following?
Running
SchedID Payment Due Expected
Num Date Total
--------------------------------------
1 1 05-Jan-2003 1000.00
1 2 05-Jan-2004 2000.00
1 3 05-Jan-2005 3000.00
1 4 05-Jan-2006 4000.00
2 1 20-Dec-2008 25.00
2 2 20-Jun-2009 50.00
2 3 20-Dec-2009 75.00
2 4 20-Jun-2010 100.00
2 5 20-Dec-2010 125.00
2 6 20-Jun-2011 150.00
2 7 20-Dec-2011 175.00
Your thoughts are appreciated.
Oracle actually has syntax for hierarchical queries using the CONNECT BY clause. SQL Server's use of the WITH clause looks like a hack in comparison:
SELECT t.SchedId,
CASE LEVEL
WHEN 1 THEN
t.StartDate
ELSE
ADD_MONTHS(t.StartDate, t.frequency)
END 'DueDate',
CASE LEVEL
WHEN 1 THEN
t.PaymentAmt
ELSE
SUM(t.paymentAmt)
END 'RunningExpectedTotal'
FROM PaymentScheduleTable t
WHERE t.PaymentNum <= t.Term / t.Frequency
CONNECT BY PRIOR t.startdate = t.startdate
GROUP BY t.schedid, t.startdate, t.frequency, t.paymentamt
ORDER BY t.SchedId, t.PaymentNum
I'm not 100% on that - I'm more confident about using:
SELECT t.SchedId,
t.StartDate 'DueDate',
t.PaymentAmt 'RunningExpectedTotal'
FROM PaymentScheduleTable t
WHERE t.PaymentNum <= t.Term / t.Frequency
CONNECT BY PRIOR t.startdate = t.startdate
ORDER BY t.SchedId, t.PaymentNum
...but it doesn't include the logic to handle when you're dealing with the 2nd+ entry in the chain to add months & sum the amounts. The summing could be done with GROUP BY CUBE or ROLLUP depending on the detail needed.
I don't understand why 5 payment days for schedid = 1 and 7 for scheid = 2?
48 /12 = 4 and 42 / 6 = 7. So I expected 4 payment days for schedid = 1.
Anyway I use the model clause:
create table PaymentScheduleTable
( schedid number(10)
, startdate date
, term number(3)
, frequency number(3)
, paymentamt number(5)
);
insert into PaymentScheduleTable
values (1,to_date('05-01-2003','dd-mm-yyyy')
, 48
, 12
, 1000);
insert into PaymentScheduleTable
values (2,to_date('20-12-2008','dd-mm-yyyy')
, 42
, 6
, 25);
commit;
And now the select with model clause:
select schedid, to_char(duedate,'dd-mm-yyyy') duedate, expected, i paymentnum
from paymentscheduletable
model
partition by (schedid)
dimension by (1 i)
measures (
startdate duedate
, paymentamt expected
, term
, frequency)
rules
( expected[for i from 1 to term[1]/frequency[1] increment 1]
= nvl(expected[cv()-1],0) + expected[1]
, duedate[for i from 1 to term[1]/frequency[1] increment 1]
= add_months(duedate[1], (cv(i)-1) * frequency[1])
)
order by schedid,i;
This outputs:
SCHEDID DUEDATE EXPECTED PAYMENTNUM
---------- ---------- ---------- ----------
1 05-01-2003 1000 1
1 05-01-2004 2000 2
1 05-01-2005 3000 3
1 05-01-2006 4000 4
2 20-12-2008 25 1
2 20-06-2009 50 2
2 20-12-2009 75 3
2 20-06-2010 100 4
2 20-12-2010 125 5
2 20-06-2011 150 6
2 20-12-2011 175 7
11 rows selected.
I didn't set out to answer my own question, but I'm doing work with Oracle now and I have had to learn some new Oracle-flavored things.
Anyway, the CONNECT BY statement is really nice--yes, much nicer than MSSQL's hierchical query approach, and using that construct, I was able to produce a very clean query that does what I was looking for:
SELECT DISTINCT
t.SchedID
,level as PaymentNum
,add_months(T.StartDate,level - 1) as DueDate
,(level * t.PaymentAmt) as RunningTotal
FROM SchedTest t
CONNECT BY level <= (t.Term / t.Frequency)
ORDER BY t.SchedID, level
My only remaining issue is that I had to use DISTINCT because I couldn't figure out how to select my rows from DUAL (the affable one-row Oracle table) instead of from my table of schedule data, which has at least 2 rows. If I could do the above with FROM DUAL, then my DISTINCT indicator wouldn't be necessary. Any thoughts?
Other than that, I think this is pretty nice. Et tu?