mondrian.olap.ResourceLimitExceededException: Mondrian Error:Number of members to be read exceeded limit (10,000) - mdx

I have a problem similar to the one reported in the Pentaho forum thread below:
https://forums.pentaho.com/threads/47819-Help-regd-member-restriction/?p=141499
The query is very simple:
MDX: SELECT NON EMPTY {[Measures].[QTD_EMPRESAS]} on 0, NON EMPTY {[BAIRRO.BAIRRO_H].[TODOS_BAIRRO_H],[BAIRRO.BAIRRO_H].[06], [BAIRRO.BAIRRO_H].[061]} on 1 FROM [V_DM_EMPRESAS_LOCALIDADE] WHERE {[BAIRRO_F.BAIRRO_H].[06],[BAIRRO_F.BAIRRO_H].[061]}
If I use one filter, the query returns successfully, but with two filters, as in the query above, the error occurs. Both filters together add up to 200 records, well below the limit reported in the error.
The BAIRRO dimension has more than 10,000 records, but I am filtering on only two BAIRRO members.
Error occurred:
Caused by: mondrian.olap.ResourceLimitExceededException: Mondrian Error:Number of members to be read exceeded limit (10,000)
at mondrian.resource.MondrianResource$_Def11.ex(MondrianResource.java:1180)
at mondrian.rolap.SqlMemberSource.getMemberChildren2(SqlMemberSource.java:993)
at mondrian.rolap.SqlMemberSource.getMemberChildren(SqlMemberSource.java:891)
at mondrian.rolap.SqlMemberSource.getMemberChildren(SqlMemberSource.java:864)
at mondrian.rolap.NoCacheMemberReader.getMemberChildren(NoCacheMemberReader.java:179)
at mondrian.rolap.RolapCubeHierarchy$NoCacheRolapCubeHierarchyMemberReader.readMemberChildren(RolapCubeHierarchy.java:970)
at mondrian.rolap.RolapCubeHierarchy$NoCacheRolapCubeHierarchyMemberReader.getMemberChildren(RolapCubeHierarchy.java:1027)
at mondrian.rolap.NoCacheMemberReader.getMemberChildren(NoCacheMemberReader.java:159)
at mondrian.rolap.RolapSchemaReader.internalGetMemberChildren(RolapSchemaReader.java:186)
at mondrian.rolap.RolapSchemaReader.getMemberChildren(RolapSchemaReader.java:169)
at mondrian.rolap.RolapSchemaReader.getMemberChildren(RolapSchemaReader.java:162)
at mondrian.olap.DelegatingSchemaReader.getMemberChildren(DelegatingSchemaReader.java:78)
at mondrian.olap.fun.AggregateFunDef$AggregateCalc.getChildCount(AggregateFunDef.java:571)
at mondrian.olap.fun.AggregateFunDef$AggregateCalc.optimizeMemberSet(AggregateFunDef.java:490)
at mondrian.olap.fun.AggregateFunDef$AggregateCalc.optimizeChildren(AggregateFunDef.java:398)
at mondrian.olap.fun.AggregateFunDef$AggregateCalc.optimizeTupleList(AggregateFunDef.java:252)
at mondrian.rolap.RolapResult.<init>(RolapResult.java:314)
at mondrian.rolap.RolapConnection.executeInternal(RolapConnection.java:662)
at mondrian.rolap.RolapConnection.access$000(RolapConnection.java:52)
at mondrian.rolap.RolapConnection$1.call(RolapConnection.java:613)
at mondrian.rolap.RolapConnection$1.call(RolapConnection.java:611)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
To work around this, I configured a Role in the Mondrian schema, and in the HierarchyGrant configuration I set rollupPolicy to false.
Is this really a good strategy? Or is there a better one?

Role and hierarchy grants are for security, so I wouldn't set those unless you need them for data access control.
It looks like your mondrian.result.limit configuration setting is too low at 10,000. To optimize Mondrian performance, I would start with this mondrian.properties and tweak settings from there:
https://github.com/pentaho/pentaho-platform/blob/master/assemblies/pentaho-solutions/src/main/resources/pentaho-solutions/system/mondrian/mondrian.properties
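A minimal sketch of what raising that limit in mondrian.properties could look like (the value here is a placeholder; 0 is the default and disables the check entirely):

# mondrian.result.limit: 0 disables the check; otherwise member/row reads beyond this count fail
mondrian.result.limit=50000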

Related

LDAP Query for all uniquemembers of a group

I can accomplish this with memberOf, but due to API constraints with Okta, this goes over our rate limit and is too slow for production use. I am trying to do this with uniqueMember instead, since it supports indexing, but I am having a really hard time getting the query right.
Here is what I have, and I believe the issue is how I am thinking about uniqueMember: it is just an attribute on the group entries, but I don't know how to query a list of groups to get their uniqueMember values.
(&(objectClass=inetOrgPerson)(|(uniqueMember=CN=groupnamehere,ou=groups,dc=myorg,dc=okta,dc=com)))
The memberOf query that works:
(&(objectClass=inetOrgPerson)(|(memberOf=CN=group1,ou=groups,dc=myorg,dc=okta,dc=com)))
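Not a definitive answer, but one sketch of the group-side approach the question is circling around: since uniqueMember lives on the group entry, search the group itself (base ou=groups,dc=myorg,dc=okta,dc=com) and read its uniqueMember attribute. Assuming the groups use the groupOfUniqueNames object class, the filter could look like:
(&(objectClass=groupOfUniqueNames)(cn=groupnamehere))
Request only the uniqueMember attribute in that search; each returned value is a member DN.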

SOLR shards abruptly go down

I am indexing around 7 billion documents per day into my SolrCloud cluster of 10 instances, each running with 5 GB Xmx and Xms values, and everything is pushed into one collection named 'X'. Its schema has around 150+ fields, almost all of which are indexed. Collection X has 240 shards with a replication factor of 2.
The problem I am currently facing is that, out of the 240 shards, 3 to 4 shards randomly go down with the following exception:
org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms , collection: X slice: shard118
at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:747)
at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:733)
at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:305)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:221)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Following this, we found another exception in the Solr logs:
ERROR (zkCallback-4-thread-4-processing-n:<IP>:8983_solr) [c:X s:shard63 r:core_node57 x:X_shard63_replica1] o.a.s.c.Overseer Could not create Overseer node
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /overseer
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.apache.solr.common.cloud.SolrZkClient$9.execute(SolrZkClient.java:391)
at org.apache.solr.common.cloud.SolrZkClient$9.execute(SolrZkClient.java:388)
at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
at org.apache.solr.common.cloud.SolrZkClient.create(SolrZkClient.java:388)
at org.apache.solr.cloud.Overseer.createOverseerNode(Overseer.java:731)
at org.apache.solr.cloud.Overseer.getStateUpdateQueue(Overseer.java:604)
at org.apache.solr.cloud.Overseer.getStateUpdateQueue(Overseer.java:591)
at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:314)
at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:170)
at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:135)
at org.apache.solr.cloud.LeaderElector.access$200(LeaderElector.java:56)
at org.apache.solr.cloud.LeaderElector$ElectionWatcher.process(LeaderElector.java:348)
at org.apache.solr.common.cloud.SolrZkClient$3.lambda$process$0(SolrZkClient.java:268)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
As a workaround, I deleted and recreated the replicas for the shards that were down (sketched below). This fixed the problem, but only intermittently.
This also risks losing a good amount of data if no in-sync replicas are found.
Can anyone please suggest a better way to resolve this issue? This is happening in production, so I cannot recreate the collection (which solves the problem, but the same issue may reappear after some time), nor can I afford to restart ZooKeeper, since many other Spark jobs depend on it.
I have been stuck on this for a long time.
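For reference, a rough sketch of the Collections API calls that a delete-and-recreate workaround like this involves; the shard and replica names just echo the ones in the logs above, and the host and node values are placeholders:

curl 'http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=X&shard=shard118&replica=core_node57'
curl 'http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=X&shard=shard118&node=10.0.0.5:8983_solr'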
UPDATE:
We are not performing any operations on the SolrCloud that would cause shards to go down. The only operation is a Spark batch job that runs on top of this collection to process data. The Spark batch job runs twice a day, but the shards do not go down during that window.

Bigquery: Error running query: Query exceeded resource limits for tier 1. Tier 29 or higher required. in Redash

I would like to know how to increase the billing tier for only one BigQuery query in Redash (not for the whole project).
I am getting this error while trying to refresh the query in Redash:
"Error running query: Query exceeded resource limits for tier 1. Tier 29 or higher required".
According to BigQuery's documentation, there are three ways to increase this limit (https://cloud.google.com/bigquery/pricing#high-compute). However, I am not sure which of them applies to queries written directly in Redash's query editor. It would be great if you could provide an example.
Thanks for your help.
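Not Redash-specific, but for illustration: one of the documented options is the per-query maximumBillingTier setting, which can be passed through the BigQuery API or the bq CLI; whether Redash exposes it depends on how its BigQuery data source is configured. A sketch with a placeholder dataset, table, and query:

bq query --maximum_billing_tier=29 'SELECT field, COUNT(*) FROM [my_dataset.my_table] GROUP BY field'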

How to use BigQuery Slots

Hi there.
Recently, I wanted to run a query in the BigQuery web UI using GROUP BY over some tables (the table names follow the pattern xxx_mst_yyyymmdd). The row count will be over 10 million. Unfortunately, the query failed with this error:
Query Failed
Error: Resources exceeded during query execution.
I made some improvements to my query, and the error may not occur this time. But as my data grows, the error will appear again in the future. So I checked the latest BigQuery release notes; there seem to be two ways to solve this:
1. After 2016/01/01, BigQuery will change the query pricing tiers to support "High Compute Tiers", so that the "resourcesExceeded" error will not happen again.
2. BigQuery Slots.
I checked some Google documentation and didn't find anything on how to use BigQuery Slots. Is there any sample or use case for BigQuery Slots? Or do I have to contact the BigQuery team to enable the feature?
I hope someone can help me answer this question. Thanks very much!
A couple of points:
I'm surprised that a GROUP BY with a cardinality of 10M failed with resources exceeded. Can you provide a job ID of the failed query so we can investigate? You mention that you're concerned about hitting these errors more often as your data size increases; you should be able to increase your data size by a few more orders of magnitude without seeing this, so you've likely encountered either a bug or something strange about your query or your data.
"High Compute Tiers" won't necessarily get rid of resourcesExceeded. For the most part, resourcesExceeded means that BigQuery ran into memory limitations; high compute tiers only address CPU usage. (and note, they haven't been enabled yet).
BigQuery slots enable you to process data faster and with more reliable performance. For the most part, they also wouldn't help prevent resourcesExceeded errors.
There is currently (as of Nov 5) a bug where you may need to provide an EACH keyword with a GROUP BY. Recent changes should enable BigQuery to automatically select the execution strategy, so EACH shouldn't be needed, but there are a couple of cases where it doesn't pick the right one. When in doubt, add an EACH to your JOIN and GROUP BY operations (see the sketch after these points).
To get your project eligible for using slots you need to contact support.
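A minimal legacy-SQL sketch of the EACH hint mentioned above, using placeholder dataset, table, and column names in the xxx_mst_yyyymmdd pattern from the question:

SELECT some_column, COUNT(*) AS cnt
FROM [my_dataset.xxx_mst_20151105]
GROUP EACH BY some_column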

LDAP filter boolean expression maximum number of arguments

I was writing a small test case to see what's more efficient, multiple small queries or a single big query, when I encountered this limitation.
The query looks like this:
(| (clientid=1) (clientid=2) (clientid=3) ...)
When the number of clients goes beyond 2103 (?!), the LDAP server throws an error:
error code 1 - Operations Error
As far as I can tell, the actual filter string length (~69 KB) does not matter (for Microsoft AD, at least, the length limit is 10 MB). I tried with longer attribute names and got the same strange limit: 2103 operands.
Does anyone have more information about this limitation?
Is this something specified in the LDAP protocol specification or is it implementation specific?
Is it configurable?
I tested this against IBM Tivoli Directory Server V6.2 using both the UnboundID and JNDI Java libraries.
It cannot be more than 8099 characters. See http://www-01.ibm.com/support/docview.wss?uid=swg21295980
Also, what you are doing is not good practice. If there are common attributes these entries share (e.g., country code, department number, location), try to retrieve the results using common criteria given by those attributes. If not, divide your search filter into smaller ones, each with a few predicates, and execute multiple searches. It depends on the programming language you're using, but try to execute each search in a separate thread to speed up data retrieval (a sketch follows below).
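As an illustration of that split-and-parallelize suggestion (not the answerer's code): a JNDI sketch that chunks the clientid values, builds one OR filter per chunk, and runs the searches on a thread pool. The LDAP URL, base DN, chunk size, and attribute names are placeholder assumptions.

import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

public class ChunkedLdapSearch {

    // Placeholder connection settings; adjust for your directory.
    static DirContext newContext() throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389");
        return new InitialDirContext(env);
    }

    public static void main(String[] args) throws Exception {
        // Pretend we have 10,000 client ids to look up.
        List<String> clientIds = new ArrayList<>();
        for (int i = 1; i <= 10_000; i++) clientIds.add(String.valueOf(i));

        int chunkSize = 500; // well below the ~2103-operand limit observed above
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Integer>> pending = new ArrayList<>();

        for (int start = 0; start < clientIds.size(); start += chunkSize) {
            List<String> chunk = clientIds.subList(start, Math.min(start + chunkSize, clientIds.size()));
            // Build one (|(clientid=...)(clientid=...)...) filter per chunk.
            StringBuilder filter = new StringBuilder("(|");
            for (String id : chunk) filter.append("(clientid=").append(id).append(')');
            filter.append(')');
            pending.add(pool.submit(() -> {
                DirContext ctx = newContext();
                try {
                    SearchControls controls = new SearchControls();
                    controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
                    NamingEnumeration<SearchResult> results =
                            ctx.search("ou=clients,dc=example,dc=com", filter.toString(), controls);
                    int count = 0;
                    while (results.hasMore()) { results.next(); count++; }
                    return count;
                } finally {
                    ctx.close();
                }
            }));
        }

        int total = 0;
        for (Future<Integer> f : pending) total += f.get();
        pool.shutdown();
        System.out.println("Matched entries: " + total);
    }
}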