I'm no expert in MSAS cubes, so maybe this is obvious, but it is blocking an important feature for our team.
We have a fact table of "Indicators" (basically values from a calculator) that are computed for a specific date. Indicators have a VersionId to group them according to a functional rule.
It looks like this:
From Date, Value, NodeId, VersionId
D0 - 1.45 - N2 - V0
We have a second fact table, "VersionsAssociation", that lists all the versions (the very same versions as in the "Indicators" fact table) that are valid and visible, and for which date.
To fit with a customer need, some versions are visible at multiple dates.
For instance, a version computed for date D0 may be visible/recopied for dates D1, D2, ...; so for a specific version V0 we would have in "VersionAssociation":
VersionId , Date From (computed), Date To (Visible at what date)
V0 - D0 - D0
V0 - D0 - D1
V0 - D0 - D2
V0 - D0 - D3
...
In our cube model, "Indicators" facts have a "From Date", the date they are computed for, but no "To Date", because when they are visible is not up to the indicator; it is decided by "VersionAssociation".
This means that in our "Dimension Usage" panel, we have a many-to-many relation from "Indicators" pointing to "VersionAssociation" on the "To Date" dimension.
So far, this part works as expected. When we select "To Date" = D1 in Excel, we see the indicators recopied from D0 with the right values (no duplicates).
Then we have a feature called projection, where we split an indicator value along a specific dimension. For that we have a third measure group called "Projection", with values called "Weight".
Weights have a "To Date", because they are computed for a specific date: even if an indicator is copied from D0 into D1, when projected it is projected using the D1 weights.
We also duplicate the weights across all the available From Dates. That's strange, but without it the results are pure chaos.
Meaning we would have in the weights:
NodeId,From Date, To Date, Projection Axis, Weight
N2 , D0 , D0 , P1 , 0.75
N2 , D0 , D0 , P2 , 0.25 (a value on node N2 is split into two different values whose sum stays the same)
N2 , D0 , D1 , P1 , 0.70
N2 , D0 , D1 , P2 , 0.30
Here is the issue:
The "Projection" and "Indicators" measure groups are both directly linked to the "Projection" dimension.
"Projection" has a direct link to the "From Date" and "To Date" dimensions.
"Indicators" has a direct link to the "From Date" dimension, but only a many-to-many reference to the "To Date" dimension, through the "VersionAssociation" measure group.
To apply the projection weights, we use a measure expression on the measures of the "Indicators" measure group, something like "[Value Unit] * [Weight]".
For some reason, this causes MSAS to not properly discriminate which weights are eligible to apply to a given value in the "Indicators" measure group.
For instance, if we look in Excel and ask for date D1 (same behavior for all dates), on projection axis P1 we get:
Value Weight
1.45 * 0.75 (Weight: From Date D0, To Date D0, P1)
+ 1.45 * 0.70 (Weight: From Date D0, To Date D1, P1)
For D1 and P2 we have:
Value Weight
1.45 * 0.25 (Weight: From Date D0, To Date D0, P2)
+ 1.45 * 0.30 (Weight: From Date D0, To Date D1, P2)
This makes the values meaningless and unreadable.
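To make the arithmetic concrete, here is a minimal plain-Python sketch (hypothetical data, mirroring the tables above) of the aggregation the cube is effectively doing versus the one we want:

```python
# Hypothetical reproduction of the unwanted aggregation: the measure
# expression [Value Unit] * [Weight] is applied before the m2m relation
# can restrict weights to the selected To Date, so every weight sharing
# the indicator's From Date is summed in.
indicator = {"from_date": "D0", "node": "N2", "value": 1.45}

weights = [  # (from_date, to_date, axis, weight)
    ("D0", "D0", "P1", 0.75),
    ("D0", "D0", "P2", 0.25),
    ("D0", "D1", "P1", 0.70),
    ("D0", "D1", "P2", 0.30),
]

def observed(axis):
    # What the cube returns for To Date = D1: weights for every To Date
    # with a matching From Date are applied and summed.
    return sum(indicator["value"] * w
               for f, t, a, w in weights
               if f == indicator["from_date"] and a == axis)

def expected(axis, to_date):
    # What we actually want: only the weight for the selected To Date.
    return sum(indicator["value"] * w
               for f, t, a, w in weights
               if f == indicator["from_date"] and t == to_date and a == axis)

print(observed("P1"))        # 1.45*0.75 + 1.45*0.70
print(expected("P1", "D1"))  # 1.45*0.70 only
```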
So the point of all of this is to ask for a way to limit the weights that can be applied in the measure expression. We tried to use SCOPE on "From Date" and "To Date" with the "Weight" or "Value" measures, but the cube never steps into our SCOPE instructions.
This is long and complicated, but we're stuck.
I am not sure I understood your problem completely, but what I gathered is: since there is no projection axis in the Indicators fact table, for a given FromDate and ToDate the values repeat when a Projection member is selected.
Example from your data:
D0 , D0 , P1 , 0.75
D0 , D0 , P2 , 0.25
For these rows the indicator value 1.45 is repeated in both, whereas it should be 1.45*0.75 for the first row and 1.45*0.25 for the second.
If this is the issue, try the query below:
with member Measures.IndicatorTest
as
([DimFromDate].[FromDate].CurrentMember,
 [DimToDate].[ToDate].CurrentMember,
 [Value Unit])
member Measures.ProjectionTest
as
([DimFromDate].[FromDate].CurrentMember,
 [DimToDate].[ToDate].CurrentMember,
 [DimProjection].[Projection].CurrentMember,
 [Weight])
member Measures.WeightedIndicator
as
Measures.IndicatorTest * Measures.ProjectionTest
select Measures.WeightedIndicator
on columns,
nonempty
(
 [DimFromDate].[FromDate].[FromDate] *
 [DimToDate].[ToDate].[ToDate] *
 [DimProjection].[Projection].[Projection]
)
on rows
from yourCube
For closure: as it turns out, the expected behavior is not possible (as far as our team could tell), so we reverted to merging two of the three tables together and having only one many-to-many join in the measure groups.
Today I am trying to figure out the best way to create a solution to my problem.
I am trying to generate a due date (column J). The due date is based on another date (column N): starting from that date, I need to check the priority level (column K), which has 4 possible values: 2, 3, 4, or 5. Then I need to check the first two letters of the string in column C. Three different prefixes can appear there, such as DR, SR and A4, but A4 can be ignored altogether. Below are the formulas for DR and SR.
DR'S --------------------------------------------------------
2A (or B) - Column N + 29 = Column J(The solution)
=DATE(YEAR(N29)+0,MONTH(N29)+0,DAY(N29)+29)
3A (or B) - Column N + 89 = Column J(The solution)
=DATE(YEAR(N29)+0,MONTH(N29)+0,DAY(N29)+89)
4A (or B) - Column N + 179 = Column J(The solution)
=DATE(YEAR(N29)+0,MONTH(N29)+0,DAY(N29)+179)
5A (or B) - Column N + 364 = Column J(The solution)
=DATE(YEAR(N29)+0,MONTH(N29)+0,DAY(N29)+364)
SR'S -----------------------------------------------------------
2A (or B) - Column N + 89 = Column J(The solution)
=DATE(YEAR(N29)+0,MONTH(N29)+0,DAY(N29)+89)
3A (or B) - Column N + 179 = Column J(The solution)
=DATE(YEAR(N29)+0,MONTH(N29)+0,DAY(N29)+179)
4A (or B) - Column N + 279 = Column J(The solution)
=DATE(YEAR(N29)+0,MONTH(N29)+0,DAY(N29)+279)
5A (or B) - Column N + 364 = Column J(The solution)
=DATE(YEAR(N29)+0,MONTH(N29)+0,DAY(N29)+364)
I was hoping to get a nudge in the right direction, and some insight of the best way to implement this.
Try,
=N29+IF(LEFT(C29, 2)="DR", CHOOSE(LEFT(K29)-1, 29, 89, 179, 364), IF(LEFT(C29, 2)="SR", CHOOSE(LEFT(K29)-1, 89, 179, 279, 364), 0))
You may need to set the target cell's formatting to a date; otherwise you may receive a serial number like 43,089.
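If it helps to see the rule written out, here is a plain-Python restatement of the same lookup (column letters as in the question; the offsets follow the formula, including the 279-day SR tier):

```python
from datetime import date, timedelta

# Day offsets per type prefix (column C) and priority level (column K).
DAYS = {
    "DR": {2: 29, 3: 89, 4: 179, 5: 364},
    "SR": {2: 89, 3: 179, 4: 279, 5: 364},
}

def due_date(type_code, priority_code, base):
    prefix = type_code[:2].upper()
    if prefix not in DAYS:            # e.g. "A4" rows are ignored
        return base
    level = int(priority_code[0])     # "2A" / "2B" -> 2
    return base + timedelta(days=DAYS[prefix][level])

print(due_date("DR123", "2A", date(2018, 1, 1)))  # 2018-01-30
```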
I am trying to fill column D and column E.
Column A: varchar(64) - unique for each trip
Column B: smallint
Column C: timestamp without time zone (Excel messed it up in the image below, but you can assume this is a timestamp column)
Column D: numeric - need to find the time from origin in minutes
Column E: numeric - time to destination in minutes.
Each trip has different intermediate stations, and I am trying to figure out the time since origin and the time to destination:
Cell D2 = C2 - C2 = 0
Cell D3 = C3 - C2
Cell D4 = C4 - C2
Cell E2 = C6 - C2
Cell E3 = C6 - C3
Cell E6 = C6 - C6 = 0
The main issue is that each trip_id has a different number of stations. I can think of using a window function with PARTITION BY, but I can't figure out how to implement it.
Another sub-question: I am dealing with a very large table (100 million rows). How do PostgreSQL experts implement data modifications? Do you create a sample table from the original data and test everything on the sample before applying the modifications to the original table, or do you use something like BEGIN TRANSACTION on the original data so that you can roll back in case of an error?
PS: Help with question title appreciated.
You don't need to know the number of stops:
-- extract(epoch from ...)/60 gives total minutes; extract(minutes ...)
-- would return only the minutes component (wrong for trips over an hour)
with a as (
  select *,
         extract(epoch from c - min(c) over (partition by a)) / 60 as dd,
         extract(epoch from max(c) over (partition by a) - c) / 60 as ee
  from td
)
update td set d = dd, e = ee
from a
where a.a = td.a and a.b = td.b;
http://sqlfiddle.com/#!17/c9112/1
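To sanity-check the window logic, here is a plain-Python model of the same per-trip computation (hypothetical sample trips of different lengths):

```python
from datetime import datetime

def trip_times(rows):
    # rows: (trip_id, stop_seq, timestamp). For each row, compute
    # (minutes from origin, minutes to destination) within its trip,
    # regardless of how many stops each trip has.
    firsts, lasts = {}, {}
    for trip, _, ts in rows:
        firsts[trip] = min(firsts.get(trip, ts), ts)
        lasts[trip] = max(lasts.get(trip, ts), ts)
    return {(trip, seq): ((ts - firsts[trip]).total_seconds() / 60,
                          (lasts[trip] - ts).total_seconds() / 60)
            for trip, seq, ts in rows}

rows = [
    ("t1", 1, datetime(2019, 1, 1, 8, 0)),
    ("t1", 2, datetime(2019, 1, 1, 8, 45)),
    ("t1", 3, datetime(2019, 1, 1, 10, 0)),
    ("t2", 1, datetime(2019, 1, 1, 9, 0)),   # a shorter trip
    ("t2", 2, datetime(2019, 1, 1, 9, 30)),
]

print(trip_times(rows)[("t1", 2)])  # (45.0, 75.0)
```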
I am seeking to combine the results of two columns and view them in a single column:
select description1, description2 from daclog where description2 is not null;
It returns two rows:
1st row:
DESCRIPTION1
Initialization scans sent to RTU 1, 32 bit mask: 0x00000048. Initialization mask bits are as follows: B0 - status dump, B1 - analog dump B2 - accumulator dump, B3 - Group Data Dump, B4 - accumulat
(here begin DESCRIPTION2)
,or freeze, B5 - power fail reset, B6 - time sync.
2nd row:
DESCRIPTION1
Initialization scans sent to RTU 1, 32 bit mask: 0x00000048. Initialization mask bits are as follows: B0 - status dump, B1 - analog dump B2 - accumulator dump, B3 - Group Data Dump, B4 - accumulat
(here begin DESCRIPTION2)
,or freeze, B5 - power fail reset, B6 - time sync.
Then I need the values of description1 and description2 in the same column.
Is it possible?
Thank you!
You can combine two columns into one by using the || operator.
select description1 || description2 as description from daclog where description2 is not null;
If you would like to use only substrings of each description, you can apply string functions and then combine the results: FNC(description1) || FNC(description2), where FNC is a function returning the desired substring of your column.
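A quick way to try it out (SQLite in memory here purely as a stand-in; || is standard SQL and behaves the same way elsewhere; the sample strings are made up):

```python
import sqlite3

# Demonstrate || concatenation; rows with a NULL description2 are
# filtered out by the IS NOT NULL predicate, as in the question.
con = sqlite3.connect(":memory:")
con.execute("create table daclog (description1 text, description2 text)")
con.executemany("insert into daclog values (?, ?)",
                [("first half, ", "second half"),  # made-up sample rows
                 ("lone row", None)])

rows = con.execute(
    "select description1 || description2 as description "
    "from daclog where description2 is not null"
).fetchall()
print(rows)  # [('first half, second half',)]
```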
I am porting some code I wrote to NEON using inline assembly.
One of the things I need is to convert byte values in the range [0..128] to other byte values from a table spanning the full range [0..255].
The table is short, but the math behind it is not easy, so I think it is not worth computing it each time on the fly. Instead I want to try lookup tables.
I have used VTBL for a 32-byte case, and it works as expected.
For the full range, one idea would be to first compare which range the source value is in and do different lookups (i.e., have four 32-byte lookup tables).
My question is: Is there any more efficient way to do it?
EDIT
After some trials, I have done it with four lookups and (still not scheduled) I am happy with the results. I leave here a piece of the inline assembly, in case someone finds it useful or thinks it can be improved.
// Have the original data in d0
// d1 holds the constant #32
// d6,d7,d8,d9 hold the images for the values [0..31]
// First we look up the [0..31] images. Out-of-range indices yield 0.
"vtbl.u8 d2,{d6,d7,d8,d9},d0 \n\t"
// Now subtract #32 from d0 and look up the images for [32..63],
// previously loaded in d10,d11,d12,d13
"vsub.u8 d0,d0,d1 \n\t"
"vtbl.u8 d3,{d10,d11,d12,d13},d0 \n\t"
// Do the same for the [64..95] images
"vsub.u8 d0,d0,d1 \n\t"
"vtbl.u8 d4,{d14,d15,d16,d17},d0 \n\t"
// Last step: images for [96..127]
"vsub.u8 d0,d0,d1 \n\t"
"vtbl.u8 d5,{d18,d19,d20,d21},d0 \n\t"
// Now add them all. No need to saturate, since at most one of the four
// lookups is non-zero for each lane
"vadd.u8 d2,d2,d3 \n\t"
"vadd.u8 d4,d4,d5 \n\t"
"vadd.u8 d2,d2,d4 \n\t" // Leave the result in d2
The proper sequence is:
vtbl d0, { d2,d3,d4,d5 }, d1 // first value
vsub d1, d1, d31 // decrement index
vtbx d0, { d6,d7,d8,d9 }, d1 // all the subsequent values
vsub d1, d1, d31 // decrement index
vtbx d0, { q5,q6 }, d1 // q5 = d10,d11
vsub d1, d1, d31
vtbx d0, { q7,q8 }, d1
The difference between vtbl and vtbx is that vtbl zeroes the elements of d0 for which the index in d1 is >= 32, whereas vtbx leaves the original values in d0 intact. Thus there's no need for the trickery in my comment, and no need to merge the partial values.
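A plain-Python model of the semantics described (with a hypothetical 128-entry table) shows why the chained vtbx needs no final merge:

```python
# Model of the VTBL/VTBX semantics: vtbl writes 0 for out-of-range
# indices, vtbx preserves the existing destination byte instead.
def vtbl(table, idx):
    return [table[i] if i < len(table) else 0 for i in idx]

def vtbx(dest, table, idx):
    return [table[i] if i < len(table) else d for d, i in zip(dest, idx)]

# Full [0..127] lookup with four 32-byte tables, mirroring the answer.
full_table = [(255 - i) % 256 for i in range(128)]  # hypothetical LUT
tables = [full_table[k:k + 32] for k in range(0, 128, 32)]

def lookup(indices):
    dest = vtbl(tables[0], indices)          # first value: vtbl
    for t in tables[1:]:                     # subsequent values: vtbx
        indices = [i - 32 for i in indices]  # decrement index
        # negative indices count as "out of range" in this model
        dest = vtbx(dest, t, [i if i >= 0 else 256 for i in indices])
    return dest

print(lookup([0, 31, 32, 127]))  # [255, 224, 223, 128]
```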
I am running a fairly simple query where I am looking for the count of snapshots on Tasks within a certain time frame. Given dates d1, d2 and d3 where d1 < d2 < d3, I would expect the count of snapshots between d1 inclusive and d2 exclusive, plus the count between d2 inclusive and d3 exclusive, to equal the count between d1 inclusive and d3 exclusive. However, I am consistently getting different counts: the count between d1 and d3 is larger than the sum of the individual queries.
---- CASE 1: 01/13 – 01/15
Input
{
find:{
"_TypeHierarchy":"Task",
"_ValidFrom":{"$gte" : "2013-01-13T00:00:00.000Z"},
"_ValidTo": {"$lt" : "2013-01-15T00:00:00.000Z"}},
fields:["_id","ObjectID","_SnapshotNumber","_ValidFrom","_ValidTo"],
pagesize:1
}
Output
"TotalResultCount": 559,
---- CASE 2: 01/13 – 01/14
Input
{
find:{
"_TypeHierarchy":"Task",
"_ValidFrom":{"$gte" : "2013-01-13T00:00:00.000Z"},
"_ValidTo": {"$lt" : "2013-01-14T00:00:00.000Z"}},
fields:["_id","ObjectID","_SnapshotNumber","_ValidFrom","_ValidTo"],
pagesize:1
}
Output
"TotalResultCount": 52,
---- CASE 3: 01/14 – 01/15
Input
{
find:{
"_TypeHierarchy":"Task",
"_ValidFrom":{"$gte" : "2013-01-14T00:00:00.000Z"},
"_ValidTo": {"$lt" : "2013-01-15T00:00:00.000Z"}},
fields:["_id","ObjectID","_SnapshotNumber","_ValidFrom","_ValidTo"],
pagesize:1
}
Output
"TotalResultCount": 498,
In studying the result objects, I now see my mistake. If you want the count of snapshots in a given time frame, you should put both the start and end bounds, with their inequalities, in the _ValidFrom clause:
"_ValidFrom":{"$gte" : "2013-01-13T00:00:00.000Z", "$lt" : "2013-01-14T00:00:00.000Z"}},
In the original question, the queries in case 2 and case 3 would not include a snapshot whose _ValidFrom was after d1 and whose _ValidTo fell after d2 but before d3. The first case, however, would. Hence the total count is higher for case 1 than for cases 2 and 3 combined.
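A tiny model of this (plain Python, made-up snapshots, dates reduced to numbers) shows why the fixed query's counts add up while the original's don't:

```python
# The original query constrains _ValidFrom >= start AND _ValidTo < end,
# so a snapshot spanning the d2 boundary matches the d1..d3 window but
# neither sub-window. Putting both bounds on _ValidFrom fixes this.
snapshots = [  # (_ValidFrom, _ValidTo); hypothetical data
    (13.0, 13.5),  # inside d1..d2
    (13.2, 14.5),  # spans the d2 boundary
    (14.0, 14.8),  # inside d2..d3
]

def original_count(start, end):
    return sum(1 for f, t in snapshots if f >= start and t < end)

def fixed_count(start, end):  # both bounds on _ValidFrom
    return sum(1 for f, t in snapshots if start <= f < end)

d1, d2, d3 = 13, 14, 15
print(original_count(d1, d3), original_count(d1, d2), original_count(d2, d3))
print(fixed_count(d1, d3), fixed_count(d1, d2), fixed_count(d2, d3))
```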