Hi I have a table with date and the number of views that we had in our channel at the same day
date views
03/06/2020 5
08/06/2020 49
09/06/2020 50
10/06/2020 1
13/06/2020 1
16/06/2020 1
17/06/2020 102
23/06/2020 97
29/06/2020 98
07/07/2020 2
08/07/2020 198
12/07/2020 1
14/07/2020 168
23/07/2020 292
No we want to see in each calendar date the sum of the past 7 and 30 days
so the result will be
date sum_of_7d sum_of_30d
01/06/2020 0 0
02/06/2020 0 0
03/06/2020 5 5
04/06/2020 5 5
05/06/2020 5 5
06/06/2020 5 5
07/06/2020 5 5
08/06/2020 54 54
09/06/2020 104 104
10/06/2020 100 105
11/06/2020 100 105
12/06/2020 100 105
13/06/2020 101 106
14/06/2020 101 106
15/06/2020 52 106
16/06/2020 53 107
17/06/2020 105 209
18/06/2020 105 209
so I was wondering what is the best SQL that I can write in order to get it
I'm working on redshift and the actual table (not this example) include over 40B rows
I used to do something like this:
select dates_helper.date
, tbl1.cnt
, sum(tbl1.cnt) over (order by date rows between 7 preceding and current row ) as sum_7d
, sum(tbl1.cnt) over (order by date rows between 30 preceding and current row ) as sum_7d
from bi_db.dates_helper
left join tbl1
on tbl1.invite_date = dates_helper.date
This is a sample Data Frame
Date Items_Sold
12/29/2019 10
12/30/2019 20
12/31/2019 30
1/1/2020 40
1/2/2020 50
1/3/2020 60
1/4/2020 35
1/5/2020 56
1/6/2020 34
1/7/2020 564
1/8/2020 6
1/9/2020 45
1/10/2020 56
1/11/2020 45
1/12/2020 37
1/13/2020 36
1/14/2020 479
1/15/2020 47
1/16/2020 47
1/17/2020 578
1/18/2020 478
1/19/2020 3578
1/20/2020 67
1/21/2020 578
1/22/2020 478
1/23/2020 4567
1/24/2020 7889
1/25/2020 8999
1/26/2020 99
1/27/2020 66
1/28/2020 678
1/29/2020 889
1/30/2020 990
1/31/2020 58585
2/1/2020 585
2/2/2020 555
2/3/2020 56
2/4/2020 66
2/5/2020 66
2/6/2020 6634
2/7/2020 588
2/8/2020 2588
2/9/2020 255
I am running this query
%sql
use my_items_table;
select weekofyear(Date), count(items_sold) as Sum
from my_items_table
where year(Date)=2020
group by weekofyear(Date)
order by weekofyear(Date)
I am getting this output. (IMP: I have added random values in Sum)
Week Sum
1 | 300091
2 | 312756
3 | 309363
4 | 307312
5 | 310985
6 | 296889
7 | 315611
But I want in which with week number one column should hold a start date of each week. Like this
Start_Date Week Sum
12/29/2019 1 300091
1/5/2020 2 312756
1/12/2020 3 309363
1/19/2020 4 307312
1/26/2020 5 310985
2/2/2020 6 296889
2/9/2020 7 315611
I am running the query on Azure Data Bricks.
If you have data for all days, then just use min():
select min(date), weekofyear(Date), count(items_sold) as Sum
from my_items_table
where year(Date) = 2020
group by weekofyear(Date)
order by weekofyear(Date);
Note: The year() is the calendar year starting on Jan 1. You are not going to get dates from other years using this query. If that is an issue, I would suggest that you ask a new question asking how to get the first day for the first week of the year.
I have the receiving and sending data for whole year. so i want to built the monthly report base on that data with the rule is Fisrt in first out. It means is the first receiving will be sent out first ...
DECLARE #ReceivingTbl AS TABLE(Id INT,ProId int, RecQty INT,ReceivingDate DateTime)
INSERT INTO #ReceivingTbl
VALUES (1,1001,210,'2019-03-12'),
(2,1001,315,'2019-06-15'),
(3,2001,500,'2019-04-01'),
(4,2001,10,'2019-06-15'),
(5,1001,105,'2019-07-10')
DECLARE #SendTbl AS TABLE(Id INT,ProId int, SentQty INT,SendMonth int)
INSERT INTO #SendTbl
VALUES (1,1001,50,3),
(2,1001,100,4),
(3,1001,80,5),
(4,1001,80,6),
(5,2001,200,6)
SELECT * FROM #ReceivingTbl ORDER BY ProId,ReceivingDate
SELECT * FROM #SendTbl ORDER BY ProId,SendMonth
Id ProId RecQty ReceivingDate
1 1001 210 2019-03-12
2 1001 315 2019-06-15
5 1001 105 2019-07-10
3 2001 500 2019-04-01
4 2001 10 2019-06-15
Id ProId SentQty SendMonth
1 1001 50 3
2 1001 100 4
3 1001 80 5
4 1001 80 6
5 2001 200 6
--- And the below is what i want:
Id ProId RecQty ReceivingDate ... Mar Apr May Jun
1 1001 210 2019-03-12 ... 50 100 60 0
2 1001 315 2019-06-15 ... 0 0 20 80
5 1001 105 2019-07-10 ... 0 0 0 0
3 2001 500 2019-04-01 ... 0 0 0 200
4 2001 10 2019-06-15 ... 0 0 0 0
Thanks!
Your question is not clear to me.
If you want to purely use the FIFO approach, therefore ignore any data the table contains, you necessarely need to order by ID, which in your example you are providing, and looks like it is in order of insert.
The first line inserted should be also the first line appearing in the select (FIFO), in order to do so you have to use:
ORDER BY Id ASC
Which will place the lower value of the ID first (1, 2, 3, ...)
To me though, this doesn't make much sense, so pay attention to the meaning o the data you actually have and leverage dates like ReceivingDate, and order by that, maybe even filtering by month of the date, below an example for January data:
WHERE MONTH(ReceivingDate) = 1
In order to verify if Deliveries are done on time, I need to match delivery Documents to PO schedule lines (SchLin) based on the comparison between Required Quantity (ReqQty) and Delivered Quantity (DlvQty).
The Delivery Docs have a reference to the PO and POItm but not to the SchLin.
Once a Delivery Doc is assigned to a Schedule Line I can calculate the Delivery Delta (DlvDelta) as the number of days it was delivered early or late compared to the requirement (ReqDate).
Examples of the two base tables are as follows:
Schedule lines
PO POItm SchLin ReqDate ReqQty
123 1 1 10/11 20
123 1 2 30/11 30
124 2 1 15/12 10
124 2 2 24/12 15
Delivery Docs
Doc Item PO POItm DlvDate DlvQty
810 1 123 1 29/10 12
816 1 123 1 02/11 07
823 1 123 1 04/11 13
828 1 123 1 06/11 08
856 1 123 1 10/11 05
873 1 123 1 14/11 09
902 1 124 2 27/11 05
908 1 124 2 30/11 07
911 1 124 2 08/12 08
923 1 124 2 27/12 09
Important: Schedule Lines and Deliveries should have the same PO and POItm.
The other logic to link is to sum the DlvQty until we reach (or exceed) ReqQty.
Those deliveries are then linked to the schedule line. Subsequent deliveries are used for the following schedule line(s). A delivery schould be matched to only one schedule line.
After comparing the ReqQty and DlvQty the assignments should result in following:
Result
Doc Item PO POItm Schlin ReqDate DlvDate DlvDelta
810 1 123 1 1 10/11 29/10 -11
816 1 123 1 1 10/11 02/11 -08
823 1 123 1 1 10/11 04/11 -06
828 1 123 1 2 30/11 06/11 -24
856 1 123 1 2 30/11 10/11 -20
873 1 123 1 2 30/11 14/11 -16
902 1 124 2 1 15/12 27/11 -18
908 1 124 2 1 15/12 30/11 -15
911 1 124 2 2 24/12 08/12 -16
923 1 124 2 2 24/12 27/12 +03
Up till now, I have done this with loops using cursors but performance is rather sluggish.
Is there another way in SQL (script) using e.g. joins by comparing measures to achieve the same result?
Regards,
Eric
If you can express the rule for matching a delivery with a schedule line, you can produce the results you want in a single query. And, yes, I promise it will be faster (and simpler) than executing the same logic in loops on cursors.
I can't reproduce your exact results because I don't quite understand how the two tables relate. Hopefully from the code below you'll be able to figure it out by adjusting the join criteria.
I don't have your DBMS. My code uses SQLite, which has its own peculiar date functions. You'll have to substitute the ones your system provides. In any event, I can't recommend 5-character strings for dates. Use a datetime type if you have one, and include 4-digit years regardless. Else how many days are there between Christmas and New Years Day?
create table S (
PO int not NULL,
POItm int not NULL,
SchLin int not NULL,
ReqDate char not NULL,
ReqQty int not NULL,
primary key (PO, POItm, SchLin)
);
insert into S values
(123, 1, 1, '10/11', 20 ),
(123, 1, 2, '30/11', 30 ),
(124, 2, 1, '15/12', 10 ),
(124, 2, 2, '24/12', 15 );
create table D (
Doc int not NULL,
Item int not NULL,
PO int not NULL,
POItm int not NULL,
DlvDate char not NULL,
DlvQty int not NULL,
primary key (Doc, Item)
);
insert into D values
(810, 1, 123, 1, '29/10', 12 ),
(816, 1, 123, 1, '02/11', 07 ),
(823, 1, 123, 1, '04/11', 13 ),
(828, 1, 123, 1, '06/11', 08 ),
(856, 1, 123, 1, '10/11', 05 ),
(873, 1, 123, 1, '14/11', 09 ),
(902, 1, 124, 2, '27/11', 05 ),
(908, 1, 124, 2, '30/11', 07 ),
(911, 1, 124, 2, '08/12', 08 ),
(923, 1, 124, 2, '27/12', 09 );
select D.Doc, D.Item, D.PO, S.SchLin, S.ReqDate, D.DlvDate
, cast(
julianday('2018-' || substr(DlvDate, 4,2) || '-' || substr(DlvDate, 1,2))
- julianday('2018-' || substr(ReqDate, 4,2) || '-' || substr(ReqDate, 1,2))
as int) as DlvDelta
from S join D on S.PO = D.PO and S.POItm = D.POItm
;
Result:
Doc Item PO SchLin ReqDate DlvDate DlvDelta
---------- ---------- ---------- ---------- ---------- ---------- ----------
810 1 123 1 10/11 29/10 -12
810 1 123 2 30/11 29/10 -32
816 1 123 1 10/11 02/11 -8
816 1 123 2 30/11 02/11 -28
823 1 123 1 10/11 04/11 -6
823 1 123 2 30/11 04/11 -26
828 1 123 1 10/11 06/11 -4
828 1 123 2 30/11 06/11 -24
856 1 123 1 10/11 10/11 0
856 1 123 2 30/11 10/11 -20
873 1 123 1 10/11 14/11 4
873 1 123 2 30/11 14/11 -16
902 1 124 1 15/12 27/11 -18
902 1 124 2 24/12 27/11 -27
908 1 124 1 15/12 30/11 -15
908 1 124 2 24/12 30/11 -24
911 1 124 1 15/12 08/12 -7
911 1 124 2 24/12 08/12 -16
923 1 124 1 15/12 27/12 12
923 1 124 2 24/12 27/12 3
Hi Here is my SQL code:
SELECT a."Date", a."Missed", b."Total Client Schedules", cast(100-((a."Missed"*100) / b."Total Client Schedules")AS decimal) as "Pct Completed" -
FROM -
( -
SELECT DATE(scheduled_start) as "Date",count(*) as "Missed" FROM -
events WHERE node_name IS NOT NULL AND status IN ('Missed') GROUP BY DATE(scheduled_start) -
) as a, -
( -
SELECT DATE(scheduled_start) as "Date", count(*) as -
"Total Client Schedules" FROM events WHERE node_name IS NOT NULL GROUP BY DATE(scheduled_start) -
) as b -
WHERE a."Date" = b."Date" ORDER BY "Date" desc
and Here is the output
Date Missed Total Client Schedules Pct Completed
----------- ------------ ----------------------- --------------
2013-02-20 2 805 100
2013-02-19 14 805 99
2013-02-18 29 805 97
2013-02-17 59 805 93
2013-02-16 29 806 97
2013-02-15 49 805 94
2013-02-14 33 805 96
2013-02-13 57 805 93
2013-02-12 21 805 98
2013-02-11 35 805 96
2013-02-10 34 805 96
it always seems to round to the highest number when i want it to be like 99.99% or 97.2% etc..
You don't specify what database you are using. However, some databases do integer arithmetic, so 1/2 is 0 not 0.5.
To fix this, just make the constants you are using numeric rather than integer:
cast(100.0-((a."Missed"*100.0) / b."Total Client Schedules")AS decimal)
It will then convert to a non-integer type for the arithmetic.