MDX: Aggregate last member


I have a data cube with one hierarchical dimension, "MyTime"; the elements are ordered within each level. It is essentially a time dimension, but it does not map exactly onto the Gregorian calendar.
One cube uses this dimension.
Extract of my OLAP schema:
<Dimension name="MyTime">
<Hierarchy hasAll="true">
<Level name="MyYear" type="Numeric" uniqueMembers="true"/>
<Level name="MyMonth" type="Numeric" uniqueMembers="true"/>
<Level name="MyDay" type="Numeric" uniqueMembers="true"/>
<Level name="MyShift" type="Numeric" uniqueMembers="true"/>
</Hierarchy>
</Dimension>
<Cube name="MyCube">
<DimensionUsage name="MyTime" source="MyTime"/>
<Measure name="Price" aggregator="avg"/>
</Cube>
DB-Tables look like this:
MyTimeDim
id | myYear | myMonth | myDay | myShift | ... other fields
---+--------+---------+-------+---------+-----------------
1 | 2014 | 6 | 3 | 1 |
2 | 2014 | 6 | 3 | 2 |
3 | 2014 | 6 | 3 | 3 |
4 | 2014 | 6 | 4 | 1 |
5 | 2014 | 6 | 4 | 2 |
6 | 2014 | 6 | 4 | 3 |
MyFact
id | timeDim | price
---+---------+------
1 | 1 | 20
2 | 2 | 9
3 | 3 | 25
4 | 4 | 3
5 | 5 | 37
6 | 6 | 5
The task is to provide a hierarchical evaluation that drills down by MyTime. On each level, different aggregates of the price have to be built. Min and max are the easy ones, but I also have to show the first and the last member.
That means that at Day level, the result should look like this:
Date | Min | Max | First | Last
-----------+-----+-----+-------+-----
2014-06-03 | 9 | 25 | 20 | 25
2014-06-04 | 3 | 37 | 3 | 5
I think I have to define a calculated member to provide this, but I could not figure out how to set up a "user-defined" aggregate.

Mondrian currently has no First or Last aggregator.
As far as I know, the only way to do it is at query time, with a calculated member like this:
MEMBER [Measures].[Last Period Measure] as ( [Measures].[My Measures], [MyTime].CurrentMember.LastChild)
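To make the intended Day-level semantics concrete, here is a minimal sketch in Python (not MDX, and not how Mondrian evaluates anything). It takes the sample rows from the question, already joined by hand, and computes min, max, first and last per day, where "first" and "last" follow shift order. The names `ROWS` and `day_summary` are made up for this illustration.

```python
from itertools import groupby

# Sample rows from the question: (year, month, day, shift, price),
# i.e. MyTimeDim joined to MyFact by hand.
ROWS = [
    (2014, 6, 3, 1, 20),
    (2014, 6, 3, 2, 9),
    (2014, 6, 3, 3, 25),
    (2014, 6, 4, 1, 3),
    (2014, 6, 4, 2, 37),
    (2014, 6, 4, 3, 5),
]


def day_summary(rows):
    """Return {(year, month, day): (min, max, first, last)}.

    'first' and 'last' are the prices of the lowest and highest shift.
    """
    ordered = sorted(rows)  # tuples sort by year, month, day, shift
    result = {}
    for day, group in groupby(ordered, key=lambda r: r[:3]):
        prices = [r[4] for r in group]
        result[day] = (min(prices), max(prices), prices[0], prices[-1])
    return result


for (y, m, d), stats in day_summary(ROWS).items():
    print(f"{y}-{m:02d}-{d:02d}", *stats)
```

The "last" column here corresponds to what the LastChild expression picks at Day level: the value of the final shift under the current day.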

Related

How to Generate new number each time when value changes

I need to generate a random number based on the three conditions below:
Generate a new PRCnumber when the color changes
Generate a new PRCnumber when the day changes
Generate a new PRCnumber when the color remains the same but the shirt number changes within the day.
The random number will be used as the identifier for the lot of products generated. The shirt table holds the shirt number and the respective color details.
Below is the query I will use to fetch the data; it takes the shirt number Shirt_num as input:
select shirt.facility,
shirt.color,
shirt.lotnum,
colorMgr.untpal,
colorMgr.style,
shirt.product_line
from shirt,
colorMgr,
products
where colorMgr.facility = shirt.facility
and colorMgr.color = shirt.color
and colorMgr.color_id = shirt.color_id
and shirt.product_line = products.product_line
and shirt.shirtnum = #shirtnum;
I want to write conditions something like the following, but I am not sure how to detect and compare when the color changes.
if day_changes ()
SELECT RAND() as prcnumber
else if shirt_num_changes()
SELECT RAND() as prcnumber
else if color_changes ()
SELECT RAND() as prcnumber
..........
The desired output will be something like this.
shirt_num RandomNum Color Day
1111011 384700412 Black 26 Feb, 2021
1111011 384700132 Black 27 Feb, 2021 (Random number when day changes)
1111017 384701792 Black 26 Feb, 2021 (Random number when shirt_num changes)
1111011 384700458 Blue 26 Feb, 2021 (Random number when color changes)

You can do what you want with window functions, but it would depend on the ordering of the query, and it would not be deterministic (you'd get a different number each query).
I would instead recommend using a checksum of the fields.
mysql> select *, sha1(concat_ws('|', shirtnum, color, day)) from shirts;
+----------+-------+------------+--------------------------------------------+
| shirtnum | color | day | sha1(concat_ws('|', shirtnum, color, day)) |
+----------+-------+------------+--------------------------------------------+
| 111 | black | 2021-02-26 | ad48a0f6b9b197ce11a78a85bca70a487af8f028 |
| 111 | black | 2021-02-26 | ad48a0f6b9b197ce11a78a85bca70a487af8f028 |
| 111 | blue | 2021-02-26 | d863fb4761e0deae3815b6f1ab53c69d5a1a2958 |
| 222 | black | 2021-02-26 | a5dbace4af2bd0510178dcb05065392f2a73d3c0 |
| 222 | black | 2021-02-27 | 1906c453b7bff2b140bb972be8ba29fa80260392 |
| 222 | black | 2021-02-27 | 1906c453b7bff2b140bb972be8ba29fa80260392 |
| 111 | black | 2021-02-27 | c0061ba0d2be5d261c0f7ddcb28ccd8de0ea7c52 |
+----------+-------+------------+--------------------------------------------+
The advantage is this is deterministic and does not depend on the order of the results.
You can implement this as a generated column if your database supports that, or with a view.
mysql> create view shirt_ids as select *, sha1(concat_ws('|', shirtnum, color, day)) as whatever_this_is from shirts;
Query OK, 0 rows affected (0.01 sec)
mysql> select * from shirt_ids;
+----------+-------+------------+------------------------------------------+
| shirtnum | color | day | whatever_this_is |
+----------+-------+------------+------------------------------------------+
| 111 | black | 2021-02-26 | ad48a0f6b9b197ce11a78a85bca70a487af8f028 |
| 111 | black | 2021-02-26 | ad48a0f6b9b197ce11a78a85bca70a487af8f028 |
| 111 | blue | 2021-02-26 | d863fb4761e0deae3815b6f1ab53c69d5a1a2958 |
| 222 | black | 2021-02-26 | a5dbace4af2bd0510178dcb05065392f2a73d3c0 |
| 222 | black | 2021-02-27 | 1906c453b7bff2b140bb972be8ba29fa80260392 |
| 222 | black | 2021-02-27 | 1906c453b7bff2b140bb972be8ba29fa80260392 |
| 111 | black | 2021-02-27 | c0061ba0d2be5d261c0f7ddcb28ccd8de0ea7c52 |
+----------+-------+------------+------------------------------------------+
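If the identifier has to be produced in application code rather than in the database, the same idea can be sketched in Python. This is only an illustration: the function name `lot_id` is made up, and the fields are joined with '|' to mirror the concat_ws call above.

```python
import hashlib


def lot_id(shirtnum, color, day):
    """Deterministic identifier: SHA-1 over the three fields joined with '|'."""
    key = "|".join(str(value) for value in (shirtnum, color, day))
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


# Same inputs always give the same id; changing any field gives a new one.
print(lot_id(111, "black", "2021-02-26"))
print(lot_id(111, "blue", "2021-02-26"))
print(lot_id(111, "black", "2021-02-27"))
```

As with the SQL version, the id does not depend on row order, so repeated queries return the same value for the same shirt, color and day.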

Specifying condition operator (AND/OR) for a column based on another column value in SQL

I have a recipe table with a many-to-many to a recipe_filter table. Here's some sample data:
recipe:
id | name
----+-----------
1 | test 2019
12 | slug-14
8 | dfadsfd
6 | test 4
4 | test 2
11 | slug-11
10 | Testology
13 | slug-15
5 | test 3
14 | slug-16
(10 rows)
recipe_filter_join:
recipeId | recipeFilterId
----------+----------------
1 | 1
2 | 2
3 | 3
4 | 1
6 | 5
7 | 6
8 | 4
9 | 7
6 | 8
14 | 9
14 | 4
5 | 9
5 | 38
filter:
id | slug | name | label
----+----------------------+-------------+----------------
2 | fdsfa | fdsfa | Category
3 | dsfds | dsfds | Category
6 | fdsaf | fdsaf | Category
7 | dfad | dfad | Category
8 | product-spice-2 | Spice #2 | Product
9 | product-spice-3 | Spice #3 | Product
5 | product-spice-4 | Spice #4 | Product
4 | product-spice-5 | Spice #5 | Product
1 | product-spice-6 | Spice #6 | Product
10 | product-spice-1 | Spice #1 | Product
40 | diet-halal | Halal | Diet
38 | diet-keto | Keto | Diet
41 | diet-gluten-free | Gluten free | Diet
37 | diet-vegan | Vegan | Diet
39 | diet-diabetic | Diabetic | Diet
42 | cooking-method-bake | Bake | Cooking method
43 | cooking-method-fry | Fry | Cooking method
44 | cooking-method-steam | Steam | Cooking method
45 | cooking-method-roast | Roast | Cooking method
(19 rows)
The input to my query is a list of filter slugs, for example product-spice-1, product-spice-5, cooking-method-fry, cooking-method-steam.
For the above example, I want to write a query that gets all recipes where the filter slug is (product-spice-1 or product-spice-5) and (cooking-method-fry or cooking-method-steam).
How do I create a generic query from the example above?
Update: In case it's not clear, for the list of filters given, I want to group them based on label and apply an OR between group members and an AND condition for other groups, if that makes any sense.

You want to INTERSECT two queries:
SELECT
rfj."recipeId"
FROM recipe_filter_join rfj
JOIN filter ON filter.id = rfj."recipeFilterId"
WHERE filter.slug IN ('product-spice-1','product-spice-5')
INTERSECT
SELECT
rfj."recipeId"
FROM recipe_filter_join rfj
JOIN filter ON filter.id = rfj."recipeFilterId"
WHERE filter.slug IN ('cooking-method-fry', 'cooking-method-steam')
This is quite generalizable. As you can see, the only difference between the two parts is in the WHERE clause. If you have other conditions on Diet or Category, you could generate the appropriate query string for each variation of the filter and join them with INTERSECT as the separator in your programming language of choice.
Regarding the update in the question ("I want to group them based on label and apply an OR between group members and an AND condition for other groups"):
If you would prefer to have your application code call the query with just a list of slugs, then the following solution is more general.
If we restate the problem as:
We want to find recipes whose filters intersect with the provided slug list, and whose distinct matched labels equal the distinct labels derived from the slug list (this last part is handled by the HAVING clause)
We can write
We can write
WITH distinct_labels AS (
SELECT
ARRAY_AGG(DISTINCT label ORDER BY label) distinct_labels_filtered
FROM filter
WHERE slug IN ('product-spice-1','product-spice-5','cooking-method-fry', 'cooking-method-steam')
)
SELECT
rfj."recipeId"
FROM filter
JOIN recipe_filter_join rfj
ON filter.id = rfj."recipeFilterId"
WHERE slug IN ('product-spice-1','product-spice-5','cooking-method-fry', 'cooking-method-steam')
GROUP BY 1
HAVING ARRAY_AGG(DISTINCT label ORDER BY label) = (SELECT distinct_labels_filtered FROM distinct_labels)
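The rule both queries implement (OR within a label, AND across labels) can also be sketched in plain Python, which may help when checking query results by hand. The dictionaries below are a hand-copied subset of the sample tables, and the names `SLUG_LABEL`, `RECIPE_SLUGS` and `matching_recipes` are made up for this illustration.

```python
from collections import defaultdict

# Subset of the sample filter table: slug -> label.
SLUG_LABEL = {
    "product-spice-2": "Product",
    "product-spice-3": "Product",
    "product-spice-4": "Product",
    "product-spice-5": "Product",
    "product-spice-6": "Product",
    "diet-keto": "Diet",
}

# Sample recipe_filter_join rows, resolved to slugs: recipe id -> slugs.
RECIPE_SLUGS = {
    1: {"product-spice-6"},
    4: {"product-spice-6"},
    5: {"product-spice-3", "diet-keto"},
    6: {"product-spice-4", "product-spice-2"},
    8: {"product-spice-5"},
    14: {"product-spice-3", "product-spice-5"},
}


def matching_recipes(slugs):
    """Recipes matching at least one slug from every label group."""
    groups = defaultdict(set)
    for slug in slugs:
        groups[SLUG_LABEL[slug]].add(slug)
    return {
        recipe_id
        for recipe_id, recipe_slugs in RECIPE_SLUGS.items()
        if all(recipe_slugs & group for group in groups.values())
    }


print(matching_recipes(["product-spice-3", "product-spice-5", "diet-keto"]))
```

With these inputs only recipe 5 has both a matching Product filter and a matching Diet filter; recipes 8 and 14 match on Product alone and are excluded.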

Query'd top 15 faults, need the accumulated downtime from another column

I'm currently trying to query up a list of the top 15 occurring faults on a PLC in the warehouse. I've gotten that part down:
Select top 15 fault_number, fault_message, count(*) FaultCount
from Faults_Stator
where T_stamp> dateadd(hour, -18, getdate())
Group by Fault_number, Fault_Message
Order by Faultcount desc
However, I now need to find the accumulated downtime of the faults in the top 15 list; that information is in another column, "Fault_duration". How would I go about doing this? Thanks in advance, you've all helped me so much already.
+--------------+---------------------------------------------+------------+
| Fault Number | Fault Message | FaultCount |
+--------------+---------------------------------------------+------------+
| 122 | ST10: Part A&B Failed | 23 |
| 4 | ST16: Part on Table B | 18 |
| 5 | ST7: No Spring Present on Part A | 15 |
| 6 | ST7: No Spring Present on Part B | 12 |
| 8 | ST3: No Pin Present B | 8 |
| 1 | ST5: No A Housing | 5 |
| 71 | ST4: Shuttle Right Not Loaded | 4 |
| 144 | ST15: Vertical Cylinder did not Retract | 3 |
| 98 | ST8: Plate Loader Can not Retract | 3 |
| 72 | ST4: Shuttle Left Not Loaded | 2 |
| 94 | ST8: Spring Gripper Cylinder did not Extend | 2 |
| 60 | ST8: Plate Loader Can not Retract | 1 |
| 83 | ST6: No A Spring Present | 1 |
| 2 | ST5: No B Housing | 1 |
| 51 | ST4: Vertical Cylinder did not Extend | 1 |
+--------------+---------------------------------------------+------------+
I know I wouldn't be using the same query, but I'm at a loss as to how to do this next step.
Fault_duration is a column that records how long the fault lasted, in ms. I'm trying to have those durations accumulated next to the corresponding fault. So the first offender would have the durations of its 23 individual occurrences summed next to it, in another column.

You should be able to use the SUM aggregate function:
Select top 15 fault_number, fault_message, count(*) FaultCount, SUM (Fault_duration) as FaultDuration
from Faults_Stator
where T_stamp> dateadd(hour, -18, getdate())
Group by Fault_number, Fault_Message
Order by Faultcount desc
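For reference, this is what the grouped query computes, sketched in Python: one count and one duration total per (fault_number, fault_message) pair, ranked by count. The function name `top_faults` and the rows in the usage line are hypothetical; they are not taken from the Faults_Stator table.

```python
from collections import defaultdict


def top_faults(rows, n=15):
    """rows: iterable of (fault_number, fault_message, duration_ms).

    Returns up to n tuples (number, message, count, total_duration_ms),
    most frequent fault first.
    """
    stats = defaultdict(lambda: [0, 0])  # key -> [count, total duration]
    for number, message, duration in rows:
        entry = stats[(number, message)]
        entry[0] += 1
        entry[1] += duration
    ranked = sorted(stats.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(num, msg, cnt, total) for (num, msg), (cnt, total) in ranked[:n]]


# Hypothetical rows, for illustration only.
sample = [(122, "A&B failed", 100), (4, "Part on table", 40), (122, "A&B failed", 250)]
print(top_faults(sample))
```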

influxdb/SQL get field count

I have an InfluxDB table; let's call it my_table.
my_table is structured like this (simplified):
+------+----+----+
| Time | m1 | m2 |
+======+====+====+
| 1    | 8  | 4  |
+------+----+----+
| 2    | 1  | 12 |
+------+----+----+
| 3    | 6  | 18 |
+------+----+----+
| 4    | 4  | 1  |
+------+----+----+
I was wondering whether it is possible to find out, for each time, how many of the metrics are larger than a certain (dynamic) threshold.
So let's say I want to know how many of the metrics (columns) are higher than 5.
I would want to do something like this:
select fieldcount(/m*/) from my_table where /m*/ > 5
Returning:
1
1
2
0
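To pin down the computation being asked for, here is a sketch of the same count done client-side in Python on the sample rows. It is not an InfluxDB query; `ROWS` and `count_above` are names invented for this illustration, and metric columns are assumed to be the ones whose name starts with "m".

```python
# Sample rows from the table above.
ROWS = [
    {"time": 1, "m1": 8, "m2": 4},
    {"time": 2, "m1": 1, "m2": 12},
    {"time": 3, "m1": 6, "m2": 18},
    {"time": 4, "m1": 4, "m2": 1},
]


def count_above(row, threshold, prefix="m"):
    """Count the metric fields (names starting with prefix) above threshold."""
    return sum(
        1
        for name, value in row.items()
        if name.startswith(prefix) and value > threshold
    )


for row in ROWS:
    print(row["time"], count_above(row, 5))
```

Because the threshold is an ordinary argument, it can be changed per call without storing an extra column.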
I am fairly restricted in how I can structure the database, as I'm using the Diamond collector (Python), which takes care of all data collection for me and flushes it to my InfluxDB without me specifying what the tables should look like.
EDIT
I am aware of a possible solution if I hardcode the threshold and add a third metric named mGreaterThan5:
+------+----+----+---------------+
| Time | m1 | m2 | mGreaterThan5 |
+======+====+====+===============+
| 1    | 8  | 4  | 1             |
+------+----+----+---------------+
| 2    | 1  | 12 | 1             |
+------+----+----+---------------+
| 3    | 6  | 18 | 2             |
+------+----+----+---------------+
| 4    | 4  | 1  | 0             |
+------+----+----+---------------+
However, this means that I can't easily change the threshold to 6 or any other number, which is why I would prefer a better solution if there is one.
EDIT2
Another similar problem occurs with trying to retrieve the highest x amount of metrics. Eg:
On Jan 1st what were the highest 3 values of m? Given table:
+-----+-----+----+-----+----+-----+----+
| Time| m1 | m2 | m3 | m4 | m5 | m6 |
+=====+=====+====+=====+====+=====+====+
| 1/1 | 8 | 4 | 1 | 7 | 2 | 0 |
+-----+-----+----+-----+----+-----+----+
Am I screwed if I keep the table structured this way?
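The "highest x metrics" variant can be sketched the same way, client-side in Python rather than in InfluxDB. The row is the one from the table above; `top_metrics` is a hypothetical helper name.

```python
import heapq

# The single sample row for 1/1 from the table above.
ROW = {"m1": 8, "m2": 4, "m3": 1, "m4": 7, "m5": 2, "m6": 0}


def top_metrics(row, x):
    """Return the x (name, value) pairs with the highest values."""
    return heapq.nlargest(x, row.items(), key=lambda item: item[1])


print(top_metrics(ROW, 3))
```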

SQL query for many-to-many self-join

I have a database table that has a companion many-to-many self-join table alongside it. The primary table is part and the other table is alternate_part (basically, alternate parts are identical to their main part with different #s). Every record in the alternate_part table is also in the part table. To illustrate:
`part`
| part_id | part_number | description |
|---------|-------------|-------------|
| 1 | 00001 | wheel |
| 2 | 00002 | tire |
| 3 | 00003 | window |
| 4 | 00004 | seat |
| 5 | 00005 | wheel |
| 6 | 00006 | tire |
| 7 | 00007 | window |
| 8 | 00008 | seat |
| 9 | 00009 | wheel |
| 10 | 00010 | tire |
| 11 | 00011 | window |
| 12 | 00012 | seat |
`alternate_part`
| main_part_id | alt_part_id |
|--------------|-------------|
| 1 | 5 | // Wheel
| 5 | 1 | // |
| 5 | 9 | // |
| 9 | 5 | // |
| 2 | 6 | // Tire
| 6 | 2 | // |
| ... | ... | // |
I am trying to produce a simple SQL query that will give me a list of all alternates for a main part. The tricky part is: some alternates are only listed as alternates of alternates, it is not guaranteed that every viable alternate for a part is listed as a direct alternate. e.g., if 'Part 3' is an alternate of 'Part 2' which is an alternate of 'Part 1', then Part 3 is an alternate of Part 1 (even if the alternate_part table doesn't list a direct link). The reverse is also true (Part 1 is an alternate of Part 3).
Basically, right now I'm pulling alternates and iterating through them
SELECT p.*, ap.*
FROM part p
INNER JOIN alternate_part ap ON p.part_id = ap.main_part_id
And then going back and doing the same again on those alternates. But, I think there's got to be a better way.
The SQL query I'm looking for will basically give me:
| part_id | alt_part_id |
|---------|-------------|
| 1 | 5 |
| 1 | 9 |
For part_id = 1, even when 1 & 9 are not explicitly linked in the alternates table.
Note: I have no control whatever over the structure of the DB, it is a distributed software solution.
Note 2: It is an Oracle platform, if that affects syntax.

You have to build a hierarchical tree; you will probably need a CONNECT BY PRIOR query with NOCYCLE,
something like this:
select distinct p.part_id,p.part_number,p.description,c.main_part_id
from part p
left join (
select main_part_id,connect_by_root(main_part_id) real_part_id
from alternate_part
connect by NOCYCLE prior main_part_id = alt_part_id
) c
on p.part_id = c.real_part_id and p.part_id != c.main_part_id
order by p.part_id
You can read full documentation about Hierarchical queries at http://docs.oracle.com/cd/B28359_01/server.111/b28286/queries003.htm
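What the hierarchical query has to compute is the set of parts reachable through any chain of alternate links. A minimal sketch of that traversal in Python, using the sample alternate_part rows from the question, is shown below. The names `EDGES` and `all_alternates` are made up for this illustration, and links are treated as bidirectional because the question states that the reverse relation also holds.

```python
from collections import defaultdict, deque

# Sample rows from alternate_part: (main_part_id, alt_part_id).
EDGES = [(1, 5), (5, 1), (5, 9), (9, 5), (2, 6), (6, 2)]


def all_alternates(part_id, edges=EDGES):
    """Return every part reachable from part_id via alternate links."""
    neighbours = defaultdict(set)
    for main_id, alt_id in edges:
        neighbours[main_id].add(alt_id)
        neighbours[alt_id].add(main_id)  # treat each link as two-way

    seen = {part_id}
    queue = deque([part_id])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current] - seen:
            seen.add(nxt)  # the seen set plays the role of NOCYCLE
            queue.append(nxt)
    return seen - {part_id}


print(sorted(all_alternates(1)))
```

For part 1 this yields parts 5 and 9, even though 1 and 9 are not linked directly in the table.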
