I have a sample star schema made in this way:
author (id, name)
book (id, name)
sample_fact_table (id, authorfk, bookfk, quantity)
where obviously authorfk is an FK to author.id and bookfk is an FK to book.id.
Dimensions are: "author", "book". Measure is "quantity".
I made this configuration for the cube, using the Pentaho Schema Workbench tool:
<Schema name="MySchema">
<Dimension type="StandardDimension" visible="true" name="Author">
<Hierarchy visible="true" hasAll="true" allMemberName="All Authors" primaryKey="id">
<Table name="author">
</Table>
<Level name="Name" visible="true" table="author" column="id" nameColumn="name" uniqueMembers="false">
</Level>
</Hierarchy>
</Dimension>
<Dimension type="StandardDimension" visible="true" name="Book">
<Hierarchy visible="true" hasAll="true" allMemberName="All Books" primaryKey="id">
<Table name="book">
</Table>
<Level name="Name" visible="true" table="book" column="id" nameColumn="name" uniqueMembers="false">
</Level>
</Hierarchy>
</Dimension>
<Cube name="TestCube" visible="true" cache="true" enabled="true">
<Table name="sample_fact_table">
</Table>
<DimensionUsage source="Author" name="Author" visible="true" foreignKey="authorfk">
</DimensionUsage>
<DimensionUsage source="Book" name="Book" visible="true" foreignKey="bookfk">
</DimensionUsage>
<Measure name="quantity" column="quantity" aggregator="sum" visible="true">
</Measure>
</Cube>
</Schema>
If I try to execute the MDX query:
select
Measures.quantity ON COLUMNS,
NON EMPTY Author.Children ON ROWS
from [TestCube]
I get the expected result:
Axis #0:
{}
Axis #1:
{[Measures].[quantity]}
Axis #2:
{[author].[Al]}
{[author].[John]}
{[author].[Jack]}
Row #0: 3
Row #1: 9
Row #2: 1
But if instead of Author I query on Book, like this:
select
Measures.quantity ON COLUMNS,
NON EMPTY Book.Children ON ROWS
from [TestCube]
I get this error:
Mondrian Error:Failed to parse query 'select
Measures.quantity ON COLUMNS,
NON EMPTY Book.Children ON ROWS
from [TestCube]'
Mondrian Error:MDX object 'Book' not found in cube 'TestCube'
What am I doing wrong?
Author and Book are both dimensions, both declared in the same way, and both referenced in the cube.
Thank you!
I have defined a schema in which I repeat the definition of the same view in each hierarchy within a dimension. Is there a way to define the view once for the dimension instead of in each hierarchy?
<Schema name="shop_cube">
<Dimension name="Product dimensions">
<Hierarchy name="gtin" hasAll="true" primaryKey="id">
<View alias="product_dimension">
<SQL dialect="generic">
<![CDATA[SELECT * FROM product_dimension where name="my_shop"]]>
</SQL>
</View>
<Level name="gtin" column="product_gtin" type="String" hideMemberIf="IfBlankName" uniqueMembers="false">
</Level>
</Hierarchy>
<Hierarchy name="Productgroup" hasAll="true" primaryKey="id">
<View alias="product_dimension">
<SQL dialect="generic">
<![CDATA[SELECT * FROM product_dimension where name="my_shop"]]>
</SQL>
</View>
<Level name="Productgroup" column="product_group" type="String" hideMemberIf="IfBlankName" uniqueMembers="false">
</Level>
</Hierarchy>
</Dimension>
...
...
<DimensionUsage name="Product dimensions" source="Product dimensions" foreignKey="product_dimension_id"/>
</Schema>
I have converted an SQL analysis to an MDX analysis and found that MDX is far more expensive than SQL in terms of execution time. Below is the SQL that I am trying to convert to MDX:
select inc.incident_id,
       inc.incident_start_time,
       inc.incident_end_time,
       inc_type.incident_type,
       inc.incident_desc,
       st_x(inc.the_geom) as incident_lon,
       st_y(inc.the_geom) as incident_lat,
       inc_type.class_name as incident_class,
       inc_type.incident_image,
       fir.modus_operandi,
       fir.fir_status,
       fir.fir_serial_no as fir_id
from analyst_mdx.incident as inc
join analyst_mdx.incident_types as inc_type on inc.incident_type = inc_type.incident_type_id
left outer join analyst_mdx.fir_details as fir on inc.fir_no = fir.fir_no
join analyst_mdx.district on (st_contains(district.the_geom, inc.the_geom) and district.district_id=1)
where inc.incident_type in ('10','8','14','11','5','9','19','21','6','7','16','3','4','12','20','13','1','2','23','17','18')
and ((inc.incident_start_time between '2012-01-29T00:00:00' and '2014-12-29T23:59:59'))
order by incident_start_time, incident_end_time
The fact tables are interconnected.
Here is the MDX that I wrote:
with member [Measures].[incident_type] as '[incident].CurrentMember.Properties("incident_type")'
member [Measures].[incident_desc] as '[incident].CurrentMember.Properties("incident_desc")'
member [Measures].[incident_class] as '[incident].CurrentMember.Properties("incident_class")'
member [Measures].[incident_image] as '[incident].CurrentMember.Properties("incident_image")'
member [Measures].[modus_operandi] as '[incident].CurrentMember.Properties("modus_operandi")'
member [Measures].[fir_status] as '[incident].CurrentMember.Properties("fir_status")'
member [Measures].[fir_id] as '[incident].CurrentMember.Properties("fir_id")'
member [Measures].[date_start_time] as '[incident].CurrentMember.Properties("date_start_time")'
member [Measures].[the_geom] as '[incident].CurrentMember.Properties("the_geom")'
member [Measures].[date_end_time] as '[incident].CurrentMember.Properties("date_end_time")'
member [time].[newtime] as 'Aggregate({([time].[all periods].[2012].[4].[15] : [time].[all periods].[2012].[4].[30]),([time].[all periods].[2012].[5]:[time].[all periods].[2012].[12]) ,([time].[all periods].[2013]),([time].[all periods].[2014].[1] : [time].[all periods].[2014].[6]),([time].[all periods].[2014].[7].[1] : [time].[all periods].[2014].[7].[15])})'
select {[Measures].[incident_type], [Measures].[incident_desc], [Measures].[incident_class],[Measures].[incident_image], [Measures].[modus_operandi], [Measures].[fir_status],[Measures].[fir_id], [Measures].[the_geom], [Measures].[date_start_time],[Measures].[date_end_time], [Measures].[incident_count]} ON COLUMNS,
NON EMPTY Filter([incident].[incident_id].Members, ST_Contains([district].[all district].[1].Properties("the_geom"),[incident].CurrentMember.Properties("the_geom"))) ON ROWS
from [pin_mapping]
where ([time].[newtime])
No table holds more than 65 thousand records. The SQL executes in no more than 3 seconds, while the MDX takes more than 10 minutes, which defeats the whole purpose of using OLAP for me. Can anyone explain where I am going wrong?
This is the cube that I am using:
<Schema name="analyst_haryana">
<Cube name="pin_mapping" cache="true" enabled="true">
<Table name="pin_mapping_fact_table" schema="analyst_mdx"></Table>
<Dimension type="StandardDimension" foreignKey="incident_id" name="incident">
<Hierarchy hasAll="true" allMemberName="all incident" primaryKey="incident_id">
<Table name="incident" schema="analyst_mdx"> </Table>
<Level name="incident_id" column="incident_id" type="Numeric" uniqueMembers="true" levelType="Regular" hideMemberIf="Never">
<Property name="the_geom" column="the_geom" type="Geometry"></Property>
<Property name="incident_type" column="incident_desc" type="String"></Property>
<Property name="incident_desc" column="incident_desc" type="String"></Property>
<Property name="fir_status" column="fir_status" type="String"></Property>
<Property name="incident_image" column="incident_image" type="String"></Property>
<Property name="modus_operandi" column="incident_desc" type="String"></Property>
<Property name="incident_class" column="incident_desc" type="String"></Property>
<Property name="date_start_time" column="date_start_time" type="String"></Property>
<Property name="date_end_time" column="date_end_time" type="String"></Property>
</Level>
</Hierarchy>
</Dimension>
<Dimension type="TimeDimension" foreignKey="time_id" name="time">
<Hierarchy hasAll="true" allMemberName="all periods" primaryKey="time_id">
<Table name="aux_dates" schema="analyst_mdx"></Table>
<Level name="year" column="the_year" type="Numeric" uniqueMembers="true" levelType="TimeYears" hideMemberIf="Never"></Level>
<Level name="quarter" column="the_quarter" type="String" uniqueMembers="false" levelType="TimeQuarters" hideMemberIf="IfParentsName"></Level>
<Level name="month" column="the_month" type="Numeric" uniqueMembers="false" levelType="TimeMonths" hideMemberIf="IfParentsName"></Level>
<Level name="day" column="the_day" type="Numeric" uniqueMembers="false" levelType="TimeDays" hideMemberIf="IfParentsName"></Level>
</Hierarchy>
</Dimension>
<Dimension type="StandardDimension" foreignKey="district_id" name="district">
<Hierarchy hasAll="true" allMemberName="all district" primaryKey="district_id">
<Table name="district" schema="analyst_mdx"></Table>
<Level name="district_id" column="district_id" type="Numeric" uniqueMembers="true" levelType="Regular" hideMemberIf="IfParentsName">
<Property name="the_geom" column="the_geom" type="Geometry"></Property>
</Level>
</Hierarchy>
</Dimension>
<Dimension type="StandardDimension" foreignKey="incident_type_id" name="incident_type">
<Hierarchy hasAll="true" allMemberName="all incident _types" primaryKey="incident_type_id">
<Table name="incident_types" schema="analyst_mdx"></Table>
<Level name="incident_type_id" column="incident_type_id" type="Numeric" uniqueMembers="true" levelType="Regular" hideMemberIf="Never"></Level>
</Hierarchy>
</Dimension>
<Dimension type="StandardDimension" foreignKey="fir_serial_no" name="fir_details">
<Hierarchy hasAll="true" allMemberName="all status" primaryKey="fir_serial_no">
<Table name="fir_details" schema="analyst_mdx"></Table>
<Level name="fir_status" column="fir_status" type="String" uniqueMembers="false" levelType="Regular" hideMemberIf="Never"></Level>
</Hierarchy>
</Dimension>
<Measure name="incident_count" column="incident_count" datatype="Numeric" aggregator="count" visible="true"></Measure>
</Cube>
</Schema>
I'm not exactly sure, since I don't have the execution plans, but here goes.
You have a well-tuned SQL query doing set-based computation, which the SQL engine is good at. Since the tables are relatively small, it is bound to be fast. In the MDX query, on the other hand, you are doing cell-by-cell computation (because of the CurrentMember function). The analysis engine is bad at that, so it is bound to be slow.
You are using the NON EMPTY clause, which basically iterates through every row and filters out the empty ones. This adds overhead on the engine. Run the query without the NON EMPTY clause and see whether that brings down the execution time (though I have observed that NON EMPTY sometimes actually improves rendering time).
The cube design should be revisited here. Where possible, expose the calculated members as attributes of the dimensions involved instead.
Enable Mondrian's SQL query logging (done in log4j.xml) and see which query is slow.
See whether it is a single query, or whether Mondrian is simply executing thousands of them.
If it is the latter, review the approxRowCount and uniqueMembers settings in your schema. If possible, post the schema and the results of the above changes so we can understand more. There is also the (deprecated) highCardinality option.
I have a table produced by the following MDX:
SELECT
{
[Measures].numTickets
}ON COLUMNS,
{
Descendants(DateCreacion.Children, DateCreacion.Month)
}ON ROWS
FROM tickets
I want to add another column alongside numTickets, but every time I add a dimension to the columns, I get an empty column.
select {[Clinica].Children} ON COLUMNS,
{Descendants([DateCreacion].Children, [DateCreacion.YQMD].[Month])} ON ROWS
from [tickets]
How would I show the same data as in the first picture, but in the second format?
<Schema name="New Schema1">
<Cube name="tickets" visible="true" cache="true" enabled="true">
<Table name="fact">
</Table>
<Dimension type="TimeDimension" visible="true" foreignKey="fecha_tickets_id" name="DateCreacion">
<Hierarchy name="YQMD" visible="true" hasAll="true">
<Table name="dim_fecha_creacion_tickets" alias="">
</Table>
<Level name="Year" visible="true" column="año" type="Numeric" uniqueMembers="false" levelType="TimeYears">
</Level>
<Level name="Quarter" visible="true" column="cuarto" type="Numeric" uniqueMembers="false" levelType="TimeQuarters">
</Level>
<Level name="Month" visible="true" column="mes" type="Numeric" uniqueMembers="false" levelType="TimeMonths">
</Level>
<Level name="Day" visible="true" column="dia" type="Numeric" uniqueMembers="false" levelType="TimeDays">
<Property name="date_iso" column="date_iso" type="Numeric">
</Property>
</Level>
</Hierarchy>
</Dimension>
<Dimension type="StandardDimension" visible="true" foreignKey="clinica_id" name="Clinica">
<Hierarchy name="New Hierarchy 0" visible="true" hasAll="true">
<Table name="dim_posicion" alias="">
</Table>
<Level name="Posicion" visible="true" column="sigla" type="String" uniqueMembers="false">
</Level>
</Hierarchy>
</Dimension>
<Measure name="numTickets" column="idTicket" datatype="Numeric" aggregator="count" visible="true">
</Measure>
</Cube>
</Schema>
When adding the [Clinica].Children to the columns, you removed the measures.
You probably want to keep them by using a cross join, which MDX lets you write with the * operator. Either
select {[Clinica].Children}
*
{ [Measures].numTickets }
ON COLUMNS,
...
or
select { [Measures].numTickets }
*
{[Clinica].Children}
ON COLUMNS,
...
depending on the order of columns you want to see.
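Combined with the rows axis from the question, the complete query might look like the sketch below. It only reuses the cube, hierarchy, and measure names from the schema shown in the question and has not been run against the original data, so treat it as a starting point rather than a verified result.

```mdx
SELECT
  -- one column per Clinica member, each showing the numTickets measure
  {[Clinica].Children} * {[Measures].[numTickets]} ON COLUMNS,
  -- months, as in the original query
  {Descendants([DateCreacion].Children, [DateCreacion.YQMD].[Month])} ON ROWS
FROM [tickets]
```

Swapping the two operands of * puts the measure above the Clinica members in the column headers instead.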
I need to create an OLAP View just from one table in MySQL.
I need to get information from the following columns in my table:
loginNote
logoutNote
timestamp
userFirstName
So I created this Mondrian Schema:
<Schema name="Login">
<Cube name="Login" visible="true" cache="true" enabled="true">
<Table name="event_log">
</Table>
<Dimension visible="true" highCardinality="false" name="UserFirstName">
<Hierarchy visible="true" hasAll="true" allMemberName="All UserFirstName">
<Level name="UserFirstName" visible="true" column="userFirstName" type="String" uniqueMembers="true" levelType="Regular" hideMemberIf="Never">
</Level>
</Hierarchy>
</Dimension>
<Dimension visible="true" highCardinality="false" name="LoginNote">
<Hierarchy visible="true" hasAll="true" allMemberName="All LoginNote">
<Level name="LoginNote" visible="true" column="loginNote" type="String" uniqueMembers="true" levelType="Regular" hideMemberIf="Never">
</Level>
</Hierarchy>
</Dimension>
<Dimension visible="true" highCardinality="false" name="LogoutNote">
<Hierarchy visible="true" hasAll="true" allMemberName="All LogoutNote">
<Level name="LogoutNote" visible="true" column="logoutNote" type="String" uniqueMembers="true" levelType="Regular" hideMemberIf="Never">
</Level>
</Hierarchy>
</Dimension>
<Measure name="Users" column="userFirstName" aggregator="count" description="Users">
</Measure>
</Cube>
</Schema>
I would like to know how to write an MDX query that shows the LoginNote and LogoutNote information on the rows and the UserFirstName on the columns.
I was able to run
Select
UserFirstName.Children ON COLUMNS,
LogoutNote.Children ON ROWS
FROM Login
or
Select
UserFirstName.Children ON COLUMNS,
LoginNote.Children ON ROWS
FROM Login
but I cannot run
Select
UserFirstName.Children ON COLUMNS,
{LogoutNote.Children,LoginNote.Children} ON ROWS
FROM Login
because an error is returned:
All arguments to function '{}' must have same hierarchy.
Any help will be appreciated!
Thanks!
The {...} notation builds one set out of its arguments, much like Union(...). All members of that set must come from the same hierarchy (as the error message says), but you are including members from LogoutNote and LoginNote, which are different dimensions and therefore different hierarchies.
If you want to combine hierarchies, you need to Crossjoin() them, which creates the Cartesian product of the two sets.
SELECT
UserFirstName.Children ON COLUMNS,
Crossjoin(LogoutNote.Children, LoginNote.Children) ON ROWS
FROM Login
I'm not sure this is exactly the result you expect from your query. You might also want to add NON EMPTY before the Crossjoin() to eliminate the combinations of LoginNote and LogoutNote that have no values.
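With NON EMPTY on the rows axis, the query would look roughly like this. It is a sketch that reuses the dimension and cube names from the question's schema and has not been run against the original data.

```mdx
SELECT
  UserFirstName.Children ON COLUMNS,
  -- NON EMPTY drops (LogoutNote, LoginNote) pairs that have no cell value in any column
  NON EMPTY Crossjoin(LogoutNote.Children, LoginNote.Children) ON ROWS
FROM Login
```

Each remaining row is a tuple of one LogoutNote member and one LoginNote member, not a flat list of members from both dimensions.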
Hope that sets you on the right track.