How to calculate the standard deviation of the previous 10 values of a measure over time in MDX

I have to calculate the standard deviation of the 10 previous values of a measure X over the date hierarchy.
Something like this:
+------------+---------+--------+
| Date       |       X | std10d |
+------------+---------+--------+
| 24/04/2019 |  238985 |        |
| 25/04/2019 |   61567 |        |
| 26/04/2019 |  -37350 |        |
| 27/04/2019 |   27482 |        |
| 28/04/2019 |   65499 |        |
| 29/04/2019 |    3373 |        |
| 30/04/2019 |   88660 |        |
| 01/05/2019 |   22094 |        |
| 02/05/2019 |   99731 |        |
| 03/05/2019 |   -4878 |        |
| 04/05/2019 | -100024 |  77268 |
| 05/05/2019 |  -54204 |  60966 |
| 06/05/2019 |   -9833 |  63679 |
+------------+---------+--------+
I know the MDX formula should look something like this:
stdev
(
    [00_Time].[0_dateHierarchy].CurrentMember.PrevMember.Lag(9) :
    [00_Time].[0_dateHierarchy].CurrentMember.PrevMember,
    [Measures].[X]
)
But I don't know what condition to add to prevent the calculation of std10d for the first 10 values.

The expression [00_Time].[0_dateHierarchy].CurrentMember.PrevMember.Lag(9).Name will return null for the first 10 members (where the 10-value window is incomplete). Just check for that null:
Case
    When [00_Time].[0_dateHierarchy].CurrentMember.PrevMember.Lag(9).Name = null
    Then null
    Else stdev
    (
        [00_Time].[0_dateHierarchy].CurrentMember.PrevMember.Lag(9) :
        [00_Time].[0_dateHierarchy].CurrentMember.PrevMember,
        [Measures].[X]
    )
End

Related

How to filter a column with isoweek values in BigQuery in Standard SQL?

I have the following table, where I want to filter only the last 4 weeks. The challenge: the date range of the underlying table must span 2018 to 2021 so that all the other columns can be filled. Filtering on the date did not work for me, because then I wouldn't get data for the previous years' columns.
How can I filter the table so that I always get the last 4 weeks from today, but still keep the data of all other columns?
+---------+---------------+---------------+---------------+---------------+
| isoweek | sessions_2021 | sessions_2020 | sessions_2019 | sessions_2018 |
+---------+---------------+---------------+---------------+---------------+
|      44 |        534260 |        156450 |        476604 |        539819 |
|      45 |        514197 |        133285 |        481228 |        491133 |
|      46 |        487541 |        122930 |        448876 |        485281 |
|      47 |        502791 |        169920 |        267869 |        491630 |
|      48 |        430129 |        144058 |          null |        459051 |
|      49 |        410885 |        127426 |          null |        468970 |
|      50 |        183323 |        147254 |          null |        438814 |
|      51 |          null |        122491 |          null |        455786 |
|      52 |          null |         70972 |          null |        478501 |
|      53 |          null |          null |         52712 |          null |
+---------+---------------+---------------+---------------+---------------+
If the input is the example table above, you can try the following:
with sample_data as (
    select isoweek from unnest(generate_array(1, 52, 1)) as isoweek
)
select isoweek
from sample_data
where isoweek between extract(isoweek from date_sub(current_date(), INTERVAL 4 WEEK))
                  and extract(isoweek from current_date());
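As a hedged sketch, the same predicate can then be applied to the real table (the table name sessions_by_week is an assumption, since the question doesn't name it). One caveat: comparing bare ISO week numbers breaks when the 4-week window crosses a year boundary, because a condition like "between 50 and 2" matches nothing.
-- assumption: the question's data lives in a table named sessions_by_week
select *
from sessions_by_week
where isoweek between extract(isoweek from date_sub(current_date(), interval 4 week))
                  and extract(isoweek from current_date());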

How can I "pick" (not aggregate) only one row if there are duplicate values in a specific column? [duplicate]

This question already has answers here:
Select first row in each GROUP BY group?
Right now I have this query:
SELECT DISTINCT
    stock_picking.id as delivery_order_id,
    sale_order.id as sale_order_id,
    sale_order.name as sale_order_name,
    stock_picking.origin as stock_picking_origin,
    stock_picking.name as stock_picking_name,
    stock_picking.create_date as stock_picking_create_date,
    sub.count_origin as sale_order_delivery_order_done_count
FROM
    (
        SELECT
            origin,
            COUNT(origin) as count_origin
        FROM stock_picking
        WHERE state = 'done'
        GROUP BY origin
        HAVING COUNT(origin) > 1
        ORDER BY origin
    ) sub
    JOIN sale_order ON sale_order.name = sub.origin
    JOIN account_invoice ON account_invoice.origin = sale_order.name
    JOIN stock_picking ON stock_picking.origin = sale_order.name
WHERE
    account_invoice.create_date >= '04/17/20' AND
    sale_order.create_date <= '04/01/20 07:00' AND
    sale_order.create_date >= '03/01/20'
ORDER BY sale_order.name;
It returns:
+-------------------+---------------+-----------------+----------------------+--------------------+----------------------------+--------------------------------------+
| delivery_order_id | sale_order_id | sale_order_name | stock_picking_origin | stock_picking_name | stock_picking_create_date | sale_order_delivery_order_done_count |
+-------------------+---------------+-----------------+----------------------+--------------------+----------------------------+--------------------------------------+
| 2053131 | 5840046 | 3258428 | 3258428 | WH/OUT/1804215 | 2020-03-01 07:10:32.144694 | 2 |
| 2071149 | 5840046 | 3258428 | 3258428 | WH/OUT/1819605 | 2020-03-03 18:00:25.208632 | 2 |
| 2154480 | 5840046 | 3258428 | 3258428 | WH/OUT/1894584 | 2020-03-11 08:39:33.514114 | 2 |
| 2053494 | 5840408 | 3258728 | 3258728 | WH/OUT/1804574 | 2020-03-01 07:41:26.728154 | 2 |
| 2105133 | 5840408 | 3258728 | 3258728 | WH/OUT/1849288 | 2020-03-07 13:59:10.049683 | 2 |
| 2192492 | 5840408 | 3258728 | 3258728 | WH/OUT/1929553 | 2020-03-13 09:10:26.18469 | 2 |
| 2061022 | 5861189 | 3279458 | 3279458 | WH/OUT/1811084 | 2020-03-02 14:37:35.803326 | 2 |
| 2170656 | 5861189 | 3279458 | 3279458 | WH/OUT/1909477 | 2020-03-12 08:57:15.434752 | 2 |
| 2072002 | 5885577 | 3294059 | 3294059 | WH/OUT/109633 | 2020-03-04 02:44:03.302924 | 2 |
| 2130430 | 5885577 | 3294059 | 3294059 | WH/OUT/114259 | 2020-03-10 03:13:58.33838 | 2 |
+-------------------+---------------+-----------------+----------------------+--------------------+----------------------------+--------------------------------------+
I want sale_order_id to be unique, keeping the row with the lowest delivery_order_id, without aggregating the other columns.
I want to have a result like this:
+-------------------+---------------+-----------------+----------------------+--------------------+----------------------------+--------------------------------------+
| delivery_order_id | sale_order_id | sale_order_name | stock_picking_origin | stock_picking_name | stock_picking_create_date | sale_order_delivery_order_done_count |
+-------------------+---------------+-----------------+----------------------+--------------------+----------------------------+--------------------------------------+
| 2053131 | 5840046 | 3258428 | 3258428 | WH/OUT/1804215 | 2020-03-01 07:10:32.144694 | 2 |
| 2053494 | 5840408 | 3258728 | 3258728 | WH/OUT/1804574 | 2020-03-01 07:41:26.728154 | 2 |
| 2061022 | 5861189 | 3279458 | 3279458 | WH/OUT/1811084 | 2020-03-02 14:37:35.803326 | 2 |
| 2072002 | 5885577 | 3294059 | 3294059 | WH/OUT/109633 | 2020-03-04 02:44:03.302924 | 2 |
+-------------------+---------------+-----------------+----------------------+--------------------+----------------------------+--------------------------------------+
You can use distinct on. Your query is complicated, so I'll encapsulate it in a CTE:
with q as (
. . .
)
select distinct on (sale_order_id) q.*
from q
order by sale_order_id, delivery_order_id;
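As a self-contained sketch of the distinct on semantics (inline sample rows, not the full query): within each sale_order_id group the order by puts the smallest delivery_order_id first, and distinct on keeps only that first row.
-- standalone illustration using values from the question's result set
select distinct on (sale_order_id)
       delivery_order_id, sale_order_id, stock_picking_name
from (values
          (2053131, 5840046, 'WH/OUT/1804215'),
          (2071149, 5840046, 'WH/OUT/1819605'),
          (2053494, 5840408, 'WH/OUT/1804574')
     ) as t(delivery_order_id, sale_order_id, stock_picking_name)
order by sale_order_id, delivery_order_id;
-- returns one row per sale_order_id: 2053131/5840046 and 2053494/5840408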

Finding MAX date aggregated by order - Oracle SQL

I have order data that looks like this:
| Order | Step | Step Complete Date |
|:-----:|:----:|:------------------:|
| A | 1 | 11/1/2019 |
| | 2 | 11/1/2019 |
| | 3 | 11/1/2019 |
| | 4 | 11/3/2019 |
| | 5 | 11/3/2019 |
| | 6 | 11/5/2019 |
| | 7 | 11/5/2019 |
| B | 1 | 12/1/2019 |
| | 2 | 12/2/2019 |
| | 3 | |
| C | 1 | 10/21/2019 |
| | 2 | 10/23/2019 |
| | 3 | 10/25/2019 |
| | 4 | 10/25/2019 |
| | 5 | 10/25/2019 |
| | 6 | |
| | 7 | 10/27/2019 |
| | 8 | 10/28/2019 |
| | 9 | 10/29/2019 |
| | 10 | 10/30/2019 |
| D | 1 | 10/30/2019 |
| | 2 | 11/1/2019 |
| | 3 | 11/1/2019 |
| | 4 | 11/2/2019 |
| | 5 | 11/2/2019 |
What I need to accomplish is the following:
For each order, assign the 'Order_Completion_Date' field as the most recent 'Step_Complete_Date'. If ANY 'Step_Complete_Date' is NULL, then the value for 'Order_Completion_Date' should be NULL.
I set up a SQL FIDDLE with this data and my attempt, below:
SELECT
    OrderNum,
    MAX(Step_Complete_Date)
FROM OrderNums
WHERE Step_Complete_Date IS NOT NULL
GROUP BY OrderNum
This is yielding:
ORDERNUM   MAX(STEP_COMPLETE_DATE)
D          11/2/2019
A          11/5/2019
B          12/2/2019
C          10/30/2019
How can I achieve:
| OrderNum | Order_Completed_Date |
|:--------:|:--------------------:|
| A | 11/5/2019 |
| B | NULL |
| C | NULL |
| D | 11/2/2019 |
An aggregate function with KEEP can handle this. Ordering by step_complete_date DESC NULLS FIRST puts any NULL date into the first dense-rank group, so for an order with a missing step the kept MAX runs over those NULL rows and returns NULL:
select ordernum,
       max(step_complete_date)
         keep (dense_rank first order by step_complete_date desc nulls first) as res
FROM OrderNums
GROUP BY OrderNum
You can use a CASE expression to first check whether there are any NULL values and, if not, find the maximum value:
Query 1:
SELECT OrderNum,
       CASE
           WHEN COUNT( CASE WHEN Step_Complete_Date IS NULL THEN 1 END ) > 0
           THEN NULL
           ELSE MAX(Step_Complete_Date)
       END AS Order_Completion_Date
FROM OrderNums
GROUP BY OrderNum
Results:
| ORDERNUM | ORDER_COMPLETION_DATE |
|----------|-----------------------|
| D | 11/2/2019 |
| A | 11/5/2019 |
| B | (null) |
| C | (null) |
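A slightly shorter variant of the same idea relies on the fact that COUNT(*) counts all rows while COUNT(column) skips NULLs, so the two differ exactly when a NULL is present (a sketch equivalent to Query 1):
SELECT OrderNum,
       CASE WHEN COUNT(*) > COUNT(Step_Complete_Date) THEN NULL
            ELSE MAX(Step_Complete_Date)
       END AS Order_Completion_Date
FROM OrderNums
GROUP BY OrderNum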
First, you are representing dates as varchars in mm/dd/yyyy format (at least in the fiddle). With the MAX function this can produce incorrect results; try, for example, an order with the dates '11/10/2019' and '11/2/2019'.
Second, the simplest solution is IMHO to use a fallback value for NULLs and get NULL back when the fallback wins:
SELECT
    OrderNum,
    NULLIF(MAX(NVL(Step_Complete_Date, '~')), '~')
FROM OrderNums
GROUP BY OrderNum
(The example is still for varchars, since the tilde sorts greater than any digit. For real dates you could use 9999-12-31, for instance.)
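For illustration, a sketch of the same trick on real DATE columns, using the far-future literal suggested above as the fallback (this assumes no legitimate completion date that late):
SELECT OrderNum,
       NULLIF(MAX(NVL(Step_Complete_Date, DATE '9999-12-31')),
              DATE '9999-12-31') AS Order_Completion_Date
FROM OrderNums
GROUP BY OrderNum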

Fill in data from another table using ranges in PostgreSQL

I'm filling in empty arrival times in a bus timetable. I have the start, end, and some midpoint times, and I currently do it by calculating the average speed between the start and end of the route, but the results are inexact, so I must use the interval speeds to compute the arrival times.
The general timetable is shown in the first table below, and the computed arrival times for the intervals in the second.
What I'm trying to do is fill in av_speed in the general timetable with the intervals' average speed, i.e. put 251.17 into timetable stop_sequences 1 to 5, 230.68 into sequences 6 to 10, and so on.
+============+===============+==============+==========+===========+===========+==========+
| stoptimeid | stop_sequence | arrival_time | distance | dist_trav | time_trav | av_speed |
+============+===============+==============+==========+===========+===========+==========+
| 54689 | 1 | 6:05:00 | 0,00 | | | 220,98 |
| 54690 | 2 | | 0,35 | | | 220,98 |
| 54691 | 3 | | 0,49 | | | 220,98 |
| 54692 | 4 | | 0,91 | | | 220,98 |
| 54693 | 5 | 6:10:00 | 1,19 | | | 220,98 |
| 54694 | 6 | | 1,50 | | | 220,98 |
| 54695 | 7 | | 1,67 | | | 220,98 |
| 54696 | 8 | | 1,96 | | | 220,98 |
| 54697 | 9 | | 2,16 | | | 220,98 |
| 54698 | 10 | 6:15:00 | 2,49 | | | 220,98 |
| 54699 | 11 | | 2,64 | | | 220,98 |
| 54700 | 12 | | 3,11 | | | 220,98 |
| 54701 | 13 | | 3,79 | | | 220,98 |
| 54702 | 14 | | 4,14 | | | 220,98 |
| 54703 | 15 | | 4,39 | | | 220,98 |
| 54704 | 16 | | 4,96 | | | 220,98 |
| 54705 | 17 | | 5,10 | | | 220,98 |
| 54706 | 18 | 6:25:00 | 5,21 | | | 220,98 |
+============+===============+==============+==========+===========+===========+==========+
+============+===============+==============+==========+===========+===========+==========+
| stoptimeid | stop_sequence | arrival_time | distance | dist_trav | time_trav | av_speed |
+============+===============+==============+==========+===========+===========+==========+
| 54689 | 1 | 6:05:00 | 0,00 | | | |
| 54693 | 5 | 6:10:00 | 1,19 | 1,194423 | 300 | 251,17 |
| 54698 | 10 | 6:15:00 | 2,49 | 1,300520 | 300 | 230,68 |
| 54706 | 18 | 6:25:00 | 5,21 | 2,715214 | 600 | 220,98 |
+============+===============+==============+==========+===========+===========+==========+
I have checked the use of a CASE expression that fills in descending order: it fills 18 to 1, then 10 to 5, and finally 5 to 1, so the final result is OK.
This is what I tried, but it fills everything with only one av_speed, the last one:
CREATE OR REPLACE VIEW arrivaltime_check AS
SELECT
    CASE
        WHEN arrivaltime.stop_sequence <= foo.stop_sequence THEN foo.av_speed
        ELSE foo.av_speed
    END,
    foo.trip_id2,
    foo.stop_sequence
FROM
    public.arrivaltime,
    (SELECT
         b.rid,
         b.trip_id2,
         b.stoptimeid,
         b.stop_sequence,
         b.arrival_time,
         extract(epoch from b.arrival_time) as time_epoch,
         b.distance,
         b.distance - lag(b.distance, 1) over (partition by b.trip_id2) as dist_trav,
         b.time_epoch - lag(b.time_epoch, 1) over (partition by b.trip_id2) as time_trav,
         (b.time_epoch - lag(b.time_epoch, 1) over (partition by b.trip_id2))
           / (b.distance - lag(b.distance, 1) over (partition by b.trip_id2)) as av_speed
     from public.timecalc1 b
     order by b.trip_id2, b.stop_sequence desc) foo;
The foo subquery only computes values where arrival_time is not null.
How can I fill in the timetable with the intervals' av_speed?
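A hedged sketch of one way to do the fill (the relation interval_speeds and its columns are assumptions, standing for the second snippet's output extended with each interval's first and last stop_sequence): join every stop to the interval whose sequence range contains it, instead of cross-joining against all intervals as the view above does.
-- assumption: interval_speeds(trip_id2, seq_from, seq_to, av_speed),
-- e.g. one row (trip, 1, 5, 251.17), one row (trip, 6, 10, 230.68), ...
SELECT t.stoptimeid,
       t.stop_sequence,
       i.av_speed
FROM public.arrivaltime t
JOIN interval_speeds i
  ON i.trip_id2 = t.trip_id2
 AND t.stop_sequence BETWEEN i.seq_from AND i.seq_to;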

Hiding inside-group columns from other columns that don't have values

I'm working on a report. How do I get the columns outside the matrix that display dates to sit next to the columns inside the matrix that display values?
For example, it is set up like this:
| HiredDt | TermDt | [Type] | LicDt | MedDt |
---------------------------------------------------------------------------------
ID | [HiredDt] | [TermDt] | SUM([Count_of_Type]) | [LicDt] | [MedDt] |
---------------------------------------------------------------------------------
And looks like this:
| HiredDt | TermDt | Lic | Med | App | LicDt | MedDt |
----------------------------------------------------------------------------------------
1 | 1/31/12 | 1/31/14 | 1 | 1 | 12 | 6/1/15 | 9/1/14 |
2 | 2/19/12 | 9/18/14 | 1 | 1 | 12 | 3/2/15 | 9/1/14 |
But when I use inside grouping to match up the date next to the associated document type I get:
| HiredDt | TermDt | Lic | | | Med | | | App | | |
----------------------------------------------------------------------------------------------------------------------------
1 | 1/31/12 | 1/31/14 | 1 | 6/1/15 | | 1 | | 9/1/2014 | 12 | | |
2 | 2/19/12 | 9/18/14 | 1 | 3/2/15 | | 1 | | 9/1/2014 | 12 | | |
What I'm trying to get this:
| HiredDt | TermDt | Lic | LicDt | Med | MedDt | App |
--------------------------------------------------------------------------------------
1 | 1/31/12 | 1/31/14 | 1 | 6/1/15 | 1 | 9/1/14 | 12 |
2 | 2/19/12 | 9/18/14 | 1 | 3/2/15 | 1 | 9/1/14 | 12 |
Is this possible?
I would right-click on the cell you have labelled SUM([Count_of_Type]) and choose Insert Column - Inside Group - Right.
In that new cell I would set the expression to =Max([LicDt]), and repeat with another inside-group column using =Max([MedDt]); the aggregate is needed because a cell inside the column group must resolve to a single value.