Mondrian cache is used when it shouldn't

I have a problem with what seems to be Mondrian's cache. I have this query:
SELECT
{ [Measures].[Searches] } ON COLUMNS,
{ [Date.Date].[2014].[4].[4] , [Date.Date].[2014].[4].[3] } ON ROWS
FROM [Searches]
that returns:
[Measures].[Searches]
[Date].[2014].[4].[4] 463
[Date].[2014].[4].[3] 381
which is correct. But if I run this query before the one above:
WITH
SET [TopCombinations] AS TOPCOUNT([Tags Group.Tag Group Combinations].[Combination].Members, 5000, [Measures].[Searches])
SELECT
{ [Measures].[Searches] } ON COLUMNS,
{ Filter( {[TopCombinations] }, [Measures].Searches > 5 ) } ON ROWS
FROM [Searches]
WHERE ( [Date.Date].[2014].[4].[4] )
and then run the first query, it returns a different result:
[Measures].[Searches]
[Date].[2014].[4].[4] 2,061
[Date].[2014].[4].[3] 381
It seems that running the TopCount query populates some cache, and the other query then reads the cached data and returns a different value for [2014].[4].[4] (2,061 instead of 463). Any thoughts on what is going on here?
Thanks

Related

Nested subquery in django ORM

I need to transform this query to django, but I can't figure out how.
SELECT SUM(income)
FROM (
SELECT COUNT(keyword)*
CASE
WHEN country='ca' THEN 390
WHEN country='fi' THEN 290
WHEN country='it' THEN 280
WHEN country='nl' THEN 260
ELSE 250
END AS income
FROM analytics_conversions
WHERE keyword = 'online'
AND click_time BETWEEN '2022-06-01' AND '2022-06-30'
GROUP BY country) as _
I currently have this code, but it returns multiple rows. The rows should be summed into a single row so that the result can be used in a subquery.
keywords_conversions_params = {
'keyword': OuterRef('keyword'),
'keyword_type': OuterRef('keyword_type')
}
keywords_conversions_value = Conversions.objects.filter(
**keywords_conversions_params).order_by().values('keyword').annotate(
value=Count('pk') * Case(
When(country='ca', then=350),
When(country='fi', then=290),
When(country='it', then=280),
When(country='nl', then=260),
default=250
)).values('value')

I managed to fix this issue by removing grouping and simply summing the values with a condition. This is the fixed code.
keywords_conversions_params = {
'keyword': OuterRef('keyword'),
'keyword_type': OuterRef('keyword_type')
}
keywords_conversions_value = Conversions.objects.filter(
**keywords_conversions_params).values('keyword').annotate(
value=Sum(Case(
When(country='ca', then=350),
When(country='fi', then=290),
When(country='it', then=280),
When(country='nl', then=260),
default=250
))).values('value')
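
Why the rewrite yields the same total as the grouped SQL can be checked outside Django. The following is a minimal pure-Python sketch, not the ORM itself: the RATES table, DEFAULT_RATE, and the sample rows are hypothetical values chosen for illustration. It shows that summing one rate per row gives the same result as counting per country, multiplying by the rate, and then summing the groups.

```python
from collections import Counter

# Hypothetical per-country rates, for illustration only.
RATES = {"ca": 350, "fi": 290, "it": 280, "nl": 260}
DEFAULT_RATE = 250


def rate(country):
    """Return the rate for a country, falling back to the default."""
    return RATES.get(country, DEFAULT_RATE)


def grouped_total(countries):
    """Mirror the SQL: COUNT per country times its rate, then SUM over groups."""
    counts = Counter(countries)
    return sum(n * rate(c) for c, n in counts.items())


def conditional_total(countries):
    """Mirror Sum(Case(...)): one rate per row, summed directly."""
    return sum(rate(c) for c in countries)


# Made-up rows: one country code per conversion.
sample = ["ca", "ca", "fi", "de", "nl", "de"]
print(grouped_total(sample), conditional_total(sample))
```

Both functions return the same number for any input, which is why the grouping step can be dropped when only the overall sum is needed.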

MDX query - Subselect implementation - Select all values of a column except one

I have to implement a query with the following requirements:
1) It needs multiple conditions (combined with AND and OR).
2) Some conditions need to exclude records that have a particular value.
SELECT {...} ON Columns, {...} ON ROWS
FROM
(SELECT {([Element1].[Value].&[98]&[002], [Element2].Value.&[Value1]),
([Element1].[Value].&[98]&[004], [Element2].Value.&[Value2]), ([Element1].[Value].&[98]&[005], [Element2].Value.NOTIN[value1, value2]), } ON Columns
FROM [CubeName])
I have written NOTIN[value1, value2] as a placeholder because I do not know how this can be implemented. I need to get all values except those listed. Please let me know if anyone can provide a solution.

You would generally use the function EXCEPT to exclude some members from a set:
SELECT
{...} ON 0
, {...} ON 1
FROM
(
SELECT
EXCEPT(
[Element1].[Value].[Value].MEMBERS //<<name of the full set
,{ //<<the set to be excluded
[Element1].[Value].&[98]&[002],
[Element1].[Value].&[98]&[004],
[Element1].[Value].&[98]&[005]
}) ON 0
FROM [CubeName]
);
The above could be expanded out to tuples but the first argument will need to be a cross-join:
SELECT
{...} ON 0
, {...} ON 1
FROM
(
SELECT
EXCEPT(
[Element1].[Hier1].[Hier1].MEMBERS
* [Element1].[Hier2].[Hier2].MEMBERS //<<name of the full set
,{ //<<the set to be excluded
([Element1].[Hier1].[Hier1].&[Value1],[Element1].[Hier2].[Hier2].&[Value1]),
([Element1].[Hier1].[Hier1].&[Value2],[Element1].[Hier2].[Hier2].&[Value2]),
([Element1].[Hier1].[Hier1].&[Value3],[Element1].[Hier2].[Hier2].&[Value3])
}) ON 0
FROM [CubeName]
);
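
The tuple version above amounts to a set difference over a cross join. The following is a rough Python sketch of that idea, not MDX semantics in full: the member names are hypothetical, and except_set is an illustrative helper that only approximates EXCEPT (it keeps the order of the first set and does no duplicate handling).

```python
from itertools import product


def except_set(full, excluded):
    """Keep the items of `full` that are not in `excluded`, preserving order."""
    excluded = set(excluded)
    return [item for item in full if item not in excluded]


# Hypothetical members of two hierarchies.
hier1 = ["Value1", "Value2", "Value3"]
hier2 = ["Value1", "Value2", "Value3"]

# Cross join: every (hier1, hier2) combination, like Hier1.MEMBERS * Hier2.MEMBERS.
full = list(product(hier1, hier2))

# The tuples to exclude must have the same shape as the tuples in `full`.
excluded = [("Value1", "Value1"), ("Value2", "Value2"), ("Value3", "Value3")]

remaining = except_set(full, excluded)
print(remaining)
```

The point of the sketch is the shape requirement: the excluded set is made of pairs, so the first argument has to be a set of pairs as well, which is what the cross join provides.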

MDX queries test

I want to combine three queries like the one below into a single query, but I don't know how to do it:
Query 1:
select {
Crossjoin({[Measures].[Store sales]}, {[Occupation].[Occupation].Members})
} on columns,
{
[Product].[Product Family].Members
} on rows
From test

Something like this is ok - but you need to bring All members into the script so that each set of tuples has the same dimensionality:
SELECT
[Measures].[Store sales]
*
{
[Occupation].[Occupation].MEMBERS * {[Essai].[All]}
,
{[Occupation].[All]} * [Essai].[Essai].MEMBERS
} ON COLUMNS
,{[Product].[Product Family].MEMBERS} ON ROWS
FROM test;

Sorry, the second query is
select {
Crossjoin({[Measures].[Store sales]}, {[Yearly_Income].[Yearly_Income].Members})
} on columns,
{
[Product].[Product Family].Members
} on rows
From test

MDX query optimization while using CrossJoin

I am writing an MDX query that selects some measures. Its WHERE condition is a cross join of two facts: one is a date and the other is a unique ID. I am passing around 2000 unique IDs, and the query takes around 20 minutes to execute and return the result.
The query is below:
SELECT {[Measures].[TOTAL1], [Measures].[TOTAL2], [Measures].[TOAL3]} ON COLUMNS,
" + " {TOPCOUNT(FILTER([ID].[Ids].MEMBERS,
[ID].CurrentMember > 0),
5,[Measures].[TOTAL])} " + "ON ROWS
FROM [CHARTS]
WHERE({[Date].&[2015-09-01 00:00:00.0]}*{[NUM].[1],[NUM].[10],"
+ "[NUM].[18],[NUM].[47],[NUM].[52],[NUM].[105],[NUM].[126],[NUM].[392],"
+ "[NUM].[588],[NUM].[656],[NUM].[995],[NUM].[1005],[NUM].[1010],[NUM].[1061]})";
The straight mdx without the string manipulation operators (+) is as follows:
SELECT
{
[Measures].[TOTAL1]
,[Measures].[TOTAL2]
,[Measures].[TOAL3]
} ON COLUMNS
,{
TopCount
(
Filter
(
[ID].[Ids].MEMBERS
,
[ID].CurrentMember > 0
)
,5
,[Measures].[TOTAL]
)
} ON ROWS
FROM [CHARTS]
WHERE
{[Date].&[2015-09-01 00:00:00.0]}
*
{
[NUM].[1]
,[NUM].[10]
,[NUM].[18]
,[NUM].[47]
,[NUM].[52]
,[NUM].[105]
,[NUM].[126]
,[NUM].[392]
,[NUM].[588]
,[NUM].[656]
,[NUM].[995]
,[NUM].[1005]
,[NUM].[1010]
,[NUM].[1061]
};
Can you please suggest performance optimization techniques for this query?

TopCount is slow if you use the third ordering parameter - it is better to order the data first and then feed your pre-ordered set into TopCount with just 2 parameters:
WITH
SET [S0] AS
Filter
(
[ID].[Ids].MEMBERS
,
[ID].CurrentMember > 0
)
SET [S1] AS
Order
(
[S0]
,[Measures].[TOTAL]
,BDESC
)
SET [S2] AS
TopCount
(
[S1]
,5
)
SELECT
{
[Measures].[TOTAL1]
,[Measures].[TOTAL2]
,[Measures].[TOAL3]
} ON COLUMNS
,[S2] ON ROWS
FROM [CHARTS]
WHERE
{[Date].&[2015-09-01 00:00:00.0]}
*
{
[NUM].[1]
,[NUM].[10]
,[NUM].[18]
,[NUM].[47]
,[NUM].[52]
,[NUM].[105]
,[NUM].[126]
,[NUM].[392]
,[NUM].[588]
,[NUM].[656]
,[NUM].[995]
,[NUM].[1005]
,[NUM].[1010]
,[NUM].[1061]
};
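
The Filter, Order, TopCount pipeline above can be sketched in Python to confirm that both forms select the same members. This is only an illustration under assumptions: the totals dictionary and member names are made up, the filter is simplified to "total greater than zero", and top_count is a hypothetical helper. It says nothing about speed, which depends on the MDX engine.

```python
def top_count(members, n, key=None):
    """With a key, sort descending and take n; without one, take the first n."""
    if key is not None:
        members = sorted(members, key=key, reverse=True)
    return list(members[:n])


# Hypothetical measure values per member.
totals = {"a": 7, "b": 0, "c": 12, "d": 3, "e": 9, "f": 1, "g": 5}

# S0: Filter, keeping members whose value is above zero.
s0 = [m for m in totals if totals[m] > 0]

# S1: Order descending by the measure (the role of BDESC).
s1 = sorted(s0, key=totals.get, reverse=True)

# S2: TopCount with two arguments over the pre-ordered set.
s2 = top_count(s1, 5)

# The three-argument form, ordering inside top_count.
direct = top_count(s0, 5, key=totals.get)
print(s2, direct)
```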

NHibernate SetFirstResult causes duplicate results

We are having trouble with the way NHibernate (version 4.0.0.4000 AND 4.0.4.4000 tested) returns duplicate results. In the sample below, I get 566 results (the correct number of results), but only 549 are unique, meaning there are 17 duplicates.
#region Get Record IDs
public IList<string> GetRecordIds(string user, string agency, DateTime utcFrom, DateTime utcTo, SearchDateRangeType dateRangeType, IEnumerable<string> status, IEnumerable<string> billingStatus, IEnumerable<string> qaStatus, IEnumerable<string> transmissionStatus, IEnumerable<string> scheduledTransmissions, int pageSize = -1, int pageNumber = -1)
{
using (ISession session = NHibernateHelper.OpenSession())
{
ICriteria crit = session.CreateCriteria<Metadata>();
var dateDisjunction = Restrictions.Disjunction();
dateDisjunction.Add(Restrictions.Between("IncidentDate", utcFrom, utcTo));
crit.Add(dateDisjunction);
if (string.IsNullOrEmpty(agency) == false)
{
crit.CreateAlias("Ownership._entities.AsIList", "entities");
crit.Add(Restrictions.Eq("entities._entityName._value", agency));
crit.Add(Restrictions.Eq("entities._isDeleted._value", false) || Restrictions.IsNull("entities._isDeleted._value"));
}
crit.AddOrder(Order.Asc(Projections.Property("RecordId")));
crit.SetProjection(Projections.Property("RecordId"));
if (pageSize > 0 && pageNumber > 0)
{
crit.SetFirstResult(pageSize * (pageNumber - 1)).SetMaxResults(pageSize);
}
var ret = crit.List<string>();
return ret;
}
}
#endregion
SQL Sample 1 is the generated first iteration code from NHibernate. Subsequent pages (second page onward) use ROW_NUMBER() OVER. SQL Sample 2 is a manually-created first page, which uses ROW_NUMBER() OVER as if it was a subsequent page. NHibernate has apparently "optimized" away the ROW_NUMBER() OVER for the first page and that seems(?) to be the cause of our issues.
SQL Sample 1: Generated by NHibernate. Causes duplicates.
SELECT
TOP (100) this_.RecordId as y0_
FROM
PcrMetadata this_
inner join
PcrEntities entities1_
on this_.Id=entities1_.ListKey
WHERE
(
this_.IncidentDate between '0001-01-01 00:00:00.0000000' and '9999-01-01 00:00:00.0000000'
)
and entities1_.Name = 'ClientIDNumber'
and (
entities1_.Entities_IsDeleted = 0
or entities1_.Entities_IsDeleted is null
)
SQL Sample 2: Manually created based on NHibernate second page on. Does not cause duplicates.
SELECT
TOP (100) this_.RecordId as y0_
FROM
(SELECT
this_.Record as y0_,
ROW_NUMBER() OVER(
ORDER BY
CURRENT_TIMESTAMP) as __hibernate_sort_row
FROM
PcrMetadata this_
inner join
PcrEntities entities1_
on this_.Id=entities1_.ListKey
WHERE
(
this_.IncidentDate between '0001-01-01 00:00:00.0000000' and '9999-01-01 00:00:00.0000000'
)
and entities1_.Name = 'ClientIDNumber'
and (
entities1_.Entities_IsDeleted = 0
or entities1_.Entities_IsDeleted is null
)) as query
WHERE
query.__hibernate_sort_row > 0 -- CHANGE THIS NUMBER
Am I doing something wrong? Or is there anything I can do to force NHibernate to use ROW_NUMBER?
Thanks in advance for any help!

We cannot JOIN collections and apply paging at the same time, because the join produces a cartesian product and it is that product which gets paged (the experience described above).
The solution I would suggest is to never join a collection (that is my own rule). To get similar results, we should:
use a subquery to apply the WHERE
use fetch batching to later receive all collection items without the 1 + N issue
There is a detailed answer about this issue.
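
The effect can be reproduced with plain lists. The following is a minimal Python sketch, assuming made-up record IDs and child rows; joined_ids, subquery_ids, and page are hypothetical helpers that imitate an inner join, an EXISTS-style subquery, and paging. It is not NHibernate code.

```python
# Hypothetical parent record IDs and (record_id, entity_name) child rows.
records = ["R1", "R2", "R3"]
entities = [("R1", "A"), ("R1", "A"), ("R2", "A"), ("R3", "B"), ("R3", "A")]


def page(rows, page_size, page_number):
    """Return one page of rows, using 1-based page numbers."""
    start = page_size * (page_number - 1)
    return rows[start:start + page_size]


def joined_ids(name):
    """Imitate an inner join: one output row per matching child row."""
    return [r for r in records for rid, n in entities if rid == r and n == name]


def subquery_ids(name):
    """Imitate WHERE EXISTS: each parent appears at most once."""
    return [r for r in records if any(rid == r and n == name for rid, n in entities)]


# Paging the joined rows repeats R1, because R1 has two matching children.
print(page(joined_ids("A"), 2, 1))

# Paging after the subquery filter returns distinct parents.
print(page(subquery_ids("A"), 2, 1))
```

With the join, the page is cut from the multiplied rows, so a parent with several matching children fills more than one slot. Filtering through a subquery keeps one row per parent before the page is taken.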
see also:
How to Eager Load Associations without duplication in NHibernate?
What is the solution for the N+1 issue in hibernate?
There is more about making results distinct, but that would not help here:
Criteria.DISTINCT_ROOT_ENTITY vs Projections.distinct
