I need to analyse user queries against an OLAP cube to find out which dimensions, attributes and measures aren't used. My idea is to take a list of the names of dimensions, attributes and measures and use regular expressions to count how many times each was used. However, I need to be sure that I haven't missed any formatting options, and I don't really know MDX. My assumptions are:
Dimension: The dimension name is always enclosed in square brackets [] or in apostrophes '': [['] dimensionName []']
Attribute: An attribute is always preceded by its dimension, formatted as above, and is itself enclosed in square brackets: [['] dimensionName []'].[attributeName]
Measure: A measure is always enclosed in square brackets, or surrounded by arithmetic operators, or a combination of the two
Are those assumptions correct? If not, how can I improve them to cover all cases?
[Measures] is a dimension as well.
[Measures].[MeasureName]: the measure was used.
[DimensionName].[AttributeName]: the dimension and its attribute were used.
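The two patterns above can be counted with a regular expression. The following is only a minimal sketch in Python: the function name count_usage is made up for illustration, it only recognises bracketed names (MDX also permits unbracketed identifiers for simple names, which this does not cover), it treats ]] inside brackets as an escaped ], and it compares names case-sensitively.

```python
import re
from collections import Counter

# One bracketed MDX identifier; "]]" inside the brackets is an escaped "]".
IDENT = r"\[((?:[^\]]|\]\])*)\]"
# Two identifiers joined by a dot, e.g. [Measures].[Amount].
PAIR = re.compile(IDENT + r"\s*\.\s*" + IDENT)

def count_usage(queries):
    """Count (first, second) name pairs across a list of MDX query strings."""
    counts = Counter()
    for query in queries:
        for match in PAIR.finditer(query):
            first, second = (part.replace("]]", "]") for part in match.groups())
            counts[(first, second)] += 1
    return counts

usage = count_usage([
    "SELECT [Measures].[Amount] ON 0 FROM [AllData] WHERE ([Measures].[Amount])",
])
print(usage[("Measures", "Amount")])
```

Any dimension/attribute or measure name from your list whose pair never shows up in the counter is a candidate for being unused. Only the first two parts of a longer name such as [Time].[Calendar Time].[Quarter] are counted as a pair.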
I noticed an inconsistency in how "HANA SQL" escapes single quotes in the context of the PLACEHOLDER clause. For example, consider the following PLACEHOLDER clause snippet:
('PLACEHOLDER' = ('$$CC_PARAM$$','''foo'',''an escaped single quote \'' '''))
The PLACEHOLDER clause above contains multiple values assigned to the CC_PARAM parameter. We can see that inside the second argument we have a single quote that's escaped with a backslash. However, we escape the single quotes outside each argument with another single quote (i.e. we do '' instead of \''). It's possible to use the \'' format for the first case, but it's not possible to use the '' format in the second case.
Why is there this discrepancy? It makes escaping quotes in multi-input input parameters tricky. I'm looking to programmatically craft SQL queries for HANA. Am I missing something here? Is it safe to use \'' over '' in all cases? Or do I need logic that can tell where a single quote occurs and escape as appropriate?
The implicit rule here - given by how the software is implemented - is that for parameter values of calculation views, the backslash \ is used to escape the single quotation mark.
For all standard SQL string occurrences, using the single-quotation mark twice '' is the correct way to differentiate between syntax element and string literal.
As for the why:
The PLACEHOLDER syntax is not SQL, but a HANA-specific command extension. So, there is no general standard that the current implementation violates.
That said, this command extension is embedded into (or rather bolted onto) the standard SQL syntax and has to be handled by the same parser.
But the parameters are parsed not just once by the SQL parser, but again by the component that instantiates the calculation scenario based on the calculation view. With a bit of squinting it's not hard to see that the parameters interface is a general key-value interface that allows all sorts of information to be handed over to the calc. engine.
One might argue that the whole approach of providing parameters via key-value pairs is not consistent with the general SQL syntax approach and be correct. On the flip side, this approach allows for general flexibility for adding new command elements to the HANA-specific parts, without structurally changing the syntax (and with it the parser).
The clear downside of this is that both the key names, as well as the values, are string-typed. To avoid losing the required escaping for the "inner string" an escape string different from the main SQL escape string needs to be used.
And here we are with two different ways of handing over a string value to be used as a filter condition.
Funnily enough, both approaches may still lead to the same query execution plan.
As a matter of fact, in many scenarios with input parameters, the string value will be internally converted into a SQL conforming form. This is the case when the input parameter is used for filtering or in expressions in the calc. view that can be converted into SQL expressions.
For example
SELECT
"AAA"
FROM "_SYS_BIC"."sp/ESC"
('PLACEHOLDER' = ('$$IP_TEST$$', 'this is a test\''s test'));
shows the following execution plan on my system
OPERATOR_NAME OPERATOR_DETAILS
PROJECT TEST.AAA
COLUMN TABLE FILTER CONDITION: TEST.AAA = 'this is a test's test'
(DETAIL: ([SCAN] TEST.AAA = 'this is a test's test'))
Note how the escape-\' has been removed.
All in all: when using PLACEHOLDER values, the \' escaping needs to be used and in all other cases, the '' escaping.
That should not be terribly difficult to implement for a query builder as you can consider this when dealing with the PLACEHOLDER syntax.
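For a query builder, the rule could be sketched in Python roughly as follows. This is only an illustration inferred from the two examples in this thread, not from HANA documentation: the function names are made up, and it assumes the value contains no backslashes of its own. It reads the examples as two layers: the quote inside the value gets a backslash for the calculation engine, and then the whole thing is turned into a normal SQL string literal by doubling every single quote.

```python
def sql_literal(value: str) -> str:
    """Standard SQL string literal: double every single quote."""
    return "'" + value.replace("'", "''") + "'"

def placeholder_value(value: str) -> str:
    """Single input-parameter value for a PLACEHOLDER clause (assumed rule)."""
    inner = value.replace("'", "\\'")   # layer 1: backslash for the calc. engine
    return sql_literal(inner)           # layer 2: ordinary SQL literal escaping

def placeholder_multi(values) -> str:
    """Multi-value input parameter: quote each value, join, then SQL-escape."""
    quoted = ["'" + v.replace("'", "\\'") + "'" for v in values]
    return sql_literal(",".join(quoted))

print(placeholder_value("this is a test's test"))
print(placeholder_multi(["foo", "an escaped single quote ' "]))
```

Under these assumptions the first call reproduces the literal used in the SELECT above, and the second reproduces the multi-value literal from the question.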
Since Keen is not strongly typed, I've noticed it is possible to send data of different types into the same property. For instance, some events may have a property whose value is a String (sent surrounded by quotes), and some whose value is an integer (sent without quotes). In the case of mathematical operations, what is the expected behavior?
Our comparator will only compute mathematical operations on numbers. If you have a property whose values are mixed, the operation will only apply to the numbers; strings will be ignored. You can see the values in your property by running a select_unique query with that property as the target_property, then (if you're using the Explorer) selecting JSON from the drop-down in the top-right. Any values you see there that are surrounded by quotes will be ignored by a mathematical query type (minimum, maximum, median, average, percentile, and sum).
If you are just starting out, and you know you want to be able to do mathematical operations on this property, we recommend making sure that you always send integers as numbers (without quotes). If you really want to keep your dataset clean, you can even start a new collection once you've made sure you are no longer sending any strings.
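To get a feel for what this means for a mixed property, here is a small local emulation in Python. It is not Keen's implementation or API, just a sketch of the behaviour described above; the helper name numeric_only and the sample values are made up.

```python
def numeric_only(values):
    """Keep plain numbers; skip strings, even ones that look numeric."""
    return [v for v in values if type(v) in (int, float)]

# Values as they might come back from a select_unique query in JSON form.
values = [14558, "14558", 3.5, "n/a", 10]

numbers = numeric_only(values)
total = sum(numbers)
print(numbers, total)
```

Note that "14558" (with quotes) does not contribute to the sum, which is why sending integers without quotes matters.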
Yes, you're correct, Keen can accept data of different types as the value for your properties. An example of Keen's lenient data typing is that a property such as VisitorID can contain either numbers (e.g. 14558) or strings (e.g. "14558").
This article from the Keen site is useful for seeing where you can check data types: https://keen.io/docs/data-collection/data-modeling-guide-200/#check-for-data-type-mismatch
I've been trying for a while to create a (slightly complicated?) cell calculation that needs a dynamic formula, but it is not working. In the end I get the MDX expression itself as the cell value, which is wrong. Here is my expression:
-----------------------------------------------------------------------
---SET With 2 elements(Sales Fcst(20131223), Sales Fcst(2014))
WITH SET [CategoryFcst] AS
FILTER([Category].[Category].children, InStr([Category].[Category].CURRENTMEMBER.MEMBER_CAPTION,"Sales Fcst") > 0 )
CELL CALCULATION SalesPlanAch FOR
'([Category].[Category].[Sales Plan Att%], [Measures].[Amount])'
AS
'(StrToMember("' + [CategoryFcst].item(0).UNIQUENAME + '"),[Measures].[Amount] )'
--The formula expression result is: (StrToMember("[Category].[Category].&[29]"),[Measures].[Amount] )
SELECT [Time].[Calendar Time].[Quarter].members on columns,
[Category].[Category].children on rows
from [AllData]
WHERE ([Measures].[Amount])
-----------------------------------------------------------------------
If I put directly '(StrToMember("[Category].[Category].&[29]"),[Measures].[Amount])' as the cell calculation formula it will work, but I need to calculate that dynamically.
Also, if I put '(StrToMember("' + '[Category].[Category].&[29]' + '"),[Measures].[Amount] )', it doesn't work. It sets that string as the cell value.
That is not the real cell calculation but the part that is causing me problems.
I'd appreciate any advice/answers.
Best regards,
Daniel
Just use
'(StrToMember([CategoryFcst].item(0).UNIQUENAME),[Measures].[Amount] )'
As far as I understand, the necessity of quotes around this expression is due to the history of Analysis Services as described here. And, as described in the same article, in version 2005 of Analysis Services, this was changed for member and set definitions in the WITH clause, but apparently not for the rarely used cell calculations. Hence, the quotes here are not something that encloses a string, but more or less a syntax requirement to enclose the expression.
I am actually not sure how you would use single quotes within this expression. But you can use either double or single quotes in MDX expressions to enclose strings, so you will rarely have an absolute need for single quotes in this type of expression.
I am working on a project that is using OLAP cubes, and we have a problem with an MDX statement.
The problem is that our source data has square brackets in. For example:
Customer [customer code]
This is an example query on a specific customer, whose name is "CustomerA [123]". The measures list is insignificant and has been left out:
SELECT <measures> ON COLUMNS,
[Customer_Dim].[All Customer_Dim].[CustomerA [123]] ON ROWS
FROM [cubename]
So when we submit this query for one specific customer, the MDX uses those square brackets as delimiters.
This causes an error, because it is reading the [123] and thinks there is a syntax error.
We want to protect the square brackets in the data from being read as delimiters. Is there a way to shield them, or are we forced to change the character to something else?
Thanks
You can have names with brackets in MDX; you just have to escape each closing bracket ] by doubling it:
[Customer_Dim].[All Customer_Dim].[CustomerA [123]]] // note 3 ]]]
The first ]] stands for the literal character ']'; the last ] is the usual MDX closing bracket.
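If the query text is built programmatically, the escaping can be applied when the name is wrapped in brackets. A minimal sketch in Python, with made-up helper names mdx_name and mdx_path:

```python
def mdx_name(name: str) -> str:
    """Wrap a name in MDX brackets, doubling any ']' it contains."""
    return "[" + name.replace("]", "]]") + "]"

def mdx_path(*parts: str) -> str:
    """Join several escaped names with dots."""
    return ".".join(mdx_name(p) for p in parts)

member = mdx_path("Customer_Dim", "All Customer_Dim", "CustomerA [123]")
print(member)
```

Only the closing bracket needs doubling; the opening [ inside the name is left as it is, as in the example above.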
Square brackets allow you to use names for columns or aliases that contain characters not permitted for column names or aliases.
a) I’ve noticed that lots of times table and column names are enclosed inside square brackets, even though their names consist of perfectly legal characters. Why is that?
b) As far as I know, square brackets enclosing object’s name aren’t actually a part of that name. Thus, if we create a table named [A]:
CREATE TABLE [A] ( … )
we can later reference it without using brackets:
SELECT * FROM A
But why isn’t the same true when I try to reference a column KEY from a table returned by CONTAINSTABLE function? Namely, if I omit the brackets enclosing column name, I get an error:
SELECT ct.KEY
FROM CONTAINSTABLE(fullText,*,'some_string') as ct
thanx
KEY is a reserved word in SQL, so it must be enclosed in brackets to be used as a column name (ct.[KEY]).
I think many of the superfluous square brackets you see are in code generated by a tool. Management Studio adds them when generating some scripts.
Brackets are not only for names containing otherwise illegal characters; they also allow reserved words to be used as column names, etc.