Check if SSAS cube is available (i.e. processed)

I am looking for a method to check whether a cube is accessible, i.e. processed and not broken.
Example: I have a working cube and I fully process a shared dimension, so the cube gets broken.
Is there any MDX or XMLA method of finding out which cubes are accessible/processed?

There is an XMLA command, DISCOVER_XML_METADATA, that can return the state of the database (processed/unprocessed) among other properties. I don't have the best handle on XMLA, so I don't know how to request just the part you need, but this query will return results in the form of XML, and you can parse it from there.
<Discover xmlns="urn:schemas-microsoft-com:xml-analysis">
  <RequestType>DISCOVER_XML_METADATA</RequestType>
  <Restrictions>
    <RestrictionList>
      <DatabaseID>AdventureWorks2012MD</DatabaseID>
    </RestrictionList>
  </Restrictions>
  <Properties>
    <PropertyList>
    </PropertyList>
  </Properties>
</Discover>
This request gets the properties of the objects related to the SSAS database called AdventureWorks2012MD. In the results you will see the following:
<Database>
  <Name>AdventureWorks2012MD</Name>
  <ID>AdventureWorks2012MD</ID>
  <CreatedTimestamp>2013-08-01T01:41:10.926667</CreatedTimestamp>
  <LastSchemaUpdate>2013-08-01T01:45:05.91</LastSchemaUpdate>
  <Description />
  <LastProcessed>2013-08-01T01:46:39.713333</LastProcessed>
  <State>Processed</State>
  <LastUpdate>2014-01-07T19:41:45.146667</LastUpdate>
  <AggregationPrefix />
  <Language>1033</Language>
  <Collation>Latin1_General_CI_AS</Collation>
  <Visible>true</Visible>
  ...
You care about <State>Processed</State> for that database. You can also get the state of each dimension and measure group by adding MeasureGroupID or DimensionID to the restriction list.
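If you want to automate the check, here is a minimal sketch in Python. It assumes you have an HTTP XMLA endpoint (msmdpump.dll) configured; the server URL is a placeholder, and the SOAP wrapping shown is the standard XMLA-over-HTTP envelope.

import requests
import xml.etree.ElementTree as ET

# Placeholder endpoint; point this at your configured msmdpump.dll.
URL = "http://yourserver/olap/msmdpump.dll"

DISCOVER = """<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
  <Body>
    <Discover xmlns="urn:schemas-microsoft-com:xml-analysis">
      <RequestType>DISCOVER_XML_METADATA</RequestType>
      <Restrictions>
        <RestrictionList>
          <DatabaseID>AdventureWorks2012MD</DatabaseID>
        </RestrictionList>
      </Restrictions>
      <Properties>
        <PropertyList/>
      </Properties>
    </Discover>
  </Body>
</Envelope>"""

response = requests.post(
    URL,
    data=DISCOVER,
    headers={
        "Content-Type": "text/xml",
        "SOAPAction": "urn:schemas-microsoft-com:xml-analysis:Discover",
    },
)
response.raise_for_status()

# Match on local tag names so we don't have to hard-code the
# namespaces used in the response document.
root = ET.fromstring(response.content)
for elem in root.iter():
    if elem.tag.rsplit("}", 1)[-1] == "State":
        print("State:", elem.text)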

Related

Errors in the OLAP storage engine: Rigid relationships between attributes cannot be changed during incremental processing of a dimension

I am new to SSAS and I'm facing a confusing problem.
I have a regular process for updating dimensions (with a ProcessUpdate).
<Process xmlns:xsd="http://www.w3.org/2001/XMLSchema"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2"
         xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2"
         xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100"
         xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200"
         xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200">
  <Object>
    <DatabaseID>Central</DatabaseID>
    <DimensionID>Prestatarios</DimensionID>
  </Object>
  <Type>ProcessUpdate</Type>
  <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
</Process>
It has been working fine but in the last run I got the following error:
<root xmlns="urn:schemas-microsoft-com:xml-analysis:empty">
  <Exception xmlns="urn:schemas-microsoft-com:xml-analysis:exception" />
  <Messages xmlns="urn:schemas-microsoft-com:xml-analysis:exception">
    <Error ErrorCode="3238002695" Description="Internal error: The operation terminated unsuccessfully." Source="Microsoft Analysis Services" HelpFile="" />
    <Error ErrorCode="3240034307" Description="Errors in the OLAP storage engine: Rigid relationships between attributes cannot be changed during incremental processing of a dimension. The error occurred when processing attribute 'Sub Grupo'. Table: 'dbo_Prestatarios', Column: 'SubGrupo', Value: 'A00377'. Source attribute: 'Prestatario'. Key column value(s) of the source attribute: '7384538'." Source="Microsoft Analysis Services" HelpFile="" />
    <Error ErrorCode="3240034317" Description="Errors in the OLAP storage engine: An error occurred while the 'Prestatario' attribute of the 'Prestatarios' dimension from the 'Central' database was being processed." Source="Microsoft Analysis Services" HelpFile="" />
    <Error ErrorCode="3239837702" Description="Server: The current operation was cancelled because another operation in the transaction failed." Source="Microsoft Analysis Services" HelpFile="" />
  </Messages>
</root>
I googled this, and the cause seems to be that some of the attributes in the source data have been changed. However, I reviewed it, and the offending record has not been updated at all:
Source data before T-SQL processing:

Key      Grupo  GrupoCubo  Correlativo  IDCubo
7384538  ARIV   A00377     2971         A003772971

Source data after T-SQL processing:

Key      Grupo  GrupoCubo  Correlativo  IDCubo
7384538  ARIV   A00377     2971         A003772971
So I'm not sure why it is failing. I restored backups, reprocessed, and got the same results.
I would very much appreciate any suggestion or advice.
Thanks for reading
If anyone has a similar problem, this is how I solved it.
I restored the last known "good" backup for both the transactional and OLAP databases involved in the process, and reran all the processes in sequence for all the periods that needed it -- in this case there were only 2 periods to reprocess.
All the processes re-executed this way ran smoothly and the error didn't appear again.
I'm assuming that the first time around there was an execution error at some point (not sure exactly when) that corrupted the OLAP database in a way we couldn't fix. Every attempt we made to repair it (reprocessing dimensions/partitions, etc.) only generated more errors.
This incident illustrates the importance of taking backups at key points in time as part of the process. Fortunately, we have that policy, so we had the proper backups to recover from.
Thanks for reading.

How to get more info within only one geosearch call via Wikipedia API?

I am using an API call similar to http://en.wikipedia.org/w/api.php?action=query&list=geosearch&gsradius=10000&gscoord=41.426140|26.099319.
It returns something like this:
<?xml version="1.0"?>
<api>
  <query>
    <geosearch>
      <gs pageid="27460829" ns="0" title="Kostilkovo" lat="41.416666666667" lon="26.05" dist="4245.1" primary="" />
      <gs pageid="27460781" ns="0" title="Belopolyane" lat="41.45" lon="26.15" dist="4988.7" primary="" />
      <gs pageid="27460862" ns="0" title="Siv Kladenets" lat="41.416666666667" lon="26.166666666667" dist="5713.5" primary="" />
      <gs pageid="13811116" ns="0" title="Svirachi" lat="41.483333333333" lon="26.116666666667" dist="6521.9" primary="" />
      <gs pageid="27460810" ns="0" title="Gorno Lukovo" lat="41.366666666667" lon="26.1" dist="6613.4" primary="" />
      <gs pageid="27460799" ns="0" title="Dolno Lukovo" lat="41.366666666667" lon="26.083333333333" dist="6746.2" primary="" />
      <gs pageid="27460827" ns="0" title="Kondovo" lat="41.433333333333" lon="26.016666666667" dist="6937" primary="" />
      <gs pageid="27460848" ns="0" title="Plevun" lat="41.45" lon="26.016666666667" dist="7383.1" primary="" />
      <gs pageid="24179704" ns="0" title="Villa Armira" lat="41.499069444444" lon="26.106263888889" dist="8130" primary="" />
      <gs pageid="27460871" ns="0" title="Zhelezari" lat="41.413333333333" lon="25.998333333333" dist="8540.1" primary="" />
    </geosearch>
  </query>
</api>
But since I am actually trying to get some pictures of those pages, subsequent calls are needed, like:
to get some page images
http://en.wikipedia.org/w/api.php?action=query&prop=images&pageids=13843906
then, to get image info
http://en.wikipedia.org/w/api.php?action=query&titles=File:Alexandru_Ioan_Cuza_Dealul_Patriarhiei.jpg&prop=imageinfo&iiprop=url
Well, even if this gets me what I ultimately need, it is not efficient at all.
I would like to know if there are parameters for these calls, or maybe completely different call(s), that would bring back all this info in at most 2 steps/calls. It would be great, though, if it could be only one.
Wow, I had no idea that such a feature exists nowadays! But to answer your question, since it's a list query, you can probably use it as a generator.
Let's try it:
Original geosearch query: http://en.wikipedia.org/w/api.php?action=query&list=geosearch&gsradius=10000&gscoord=41.426140|26.099319
Generator query to get images on matching pages: http://en.wikipedia.org/w/api.php?action=query&prop=images&imlimit=max&generator=geosearch&ggsradius=10000&ggscoord=41.426140|26.099319
The prop=images query can also be used as a generator, so you can also do this:
Get URLs for all images on a list of pages: http://en.wikipedia.org/w/api.php?action=query&prop=imageinfo&iiprop=url&generator=images&gimlimit=max&pageids=13811116|24179704|27460781|27460799|27460810|27460827|27460829|27460848|27460862|27460871
Alas, AFAIK you can't nest generators, so you can't do both steps in one query. You can either:
get the list of images in one query, and then use another query to get the URLs, or
start with the basic geosearch query to get the page IDs, and then get the images and their URLs in another query.
Alas, it turns out that both of these options fail to give you some information that you may want. If you use list=geosearch as a generator, you don't get the coordinate information that you may need if you e.g. wish to display the results on a map. On the other hand, using prop=images as a generator makes you miss out on something even more important: the knowledge of which images are used on which pages!
Thus, unfortunately, it seems that, if your goal is to place images on a map, you'll probably have to do it with three separate queries. At least you can still query multiple pages / images in one request, so you shouldn't need more than three (until you hit the query limits and need to use continuations, that is).
(Also, doing it in three steps lets you apply some filtering to the images before the third step. For example, most of the pages returned by your example query only have the same three images — Flag of Bulgaria.svg, Ivaylovgrad Reservoir.jpg and Oblast Khaskovo.png — all of which are used via templates, and none of which really look like good choices to represent the specific location.)
P.S. If you're just interested in finding images near a particular location, even if they're not used on any specific Wikipedia article, you might want to try using geosearch directly on Wikimedia Commons. It doesn't seem to return any results for your Bulgarian example coordinates, but it works just fine in a more crowded location.
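For what it's worth, here is a rough Python sketch of that two-query flow using the requests library (format=json is assumed for easier parsing; continuation handling and chunking past the 50-title limit are left out):

import requests

API = "https://en.wikipedia.org/w/api.php"

# Query 1: pages near the coordinates, each with the titles of its images.
r1 = requests.get(API, params={
    "action": "query", "format": "json",
    "generator": "geosearch", "ggsradius": 10000,
    "ggscoord": "41.426140|26.099319",
    "prop": "images", "imlimit": "max",
}).json()

image_titles = set()
for page in r1.get("query", {}).get("pages", {}).values():
    for img in page.get("images", []):
        image_titles.add(img["title"])

# Query 2: URLs for the collected image titles (anonymous requests
# accept at most 50 titles, so a real client would chunk this list).
r2 = requests.get(API, params={
    "action": "query", "format": "json",
    "titles": "|".join(sorted(image_titles)[:50]),
    "prop": "imageinfo", "iiprop": "url",
}).json()

for page in r2.get("query", {}).get("pages", {}).values():
    for info in page.get("imageinfo", []):
        print(page["title"], info["url"])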
Here is an alternative to build on the previous answer. If you start with this query as a partial answer:
https://en.wikipedia.org/w/api.php?action=query&prop=images&imlimit=max&generator=geosearch&ggsradius=10000&ggscoord=41.426140|26.099319
Then you can build on this to get the information in a single query. The pageimages property can work with the generator. You cannot nest generators, but you can chain properties. A query can use pageimages to get the page's main image URL for each of the geosearch results. It looks like this:
https://en.wikipedia.org/w/api.php?action=query&prop=images|pageimages&pilimit=max&piprop=thumbnail&iwurl=&imlimit=max&generator=geosearch&ggsradius=10000&ggscoord=41.426140|26.099319
This query returns the image "File" names (images property) and a single URL for the main image (pageimages property). The main image of the page is all I need. You might be able to extrapolate the "File" URLs by matching the changes from the file name to the URL that the query outputs, but I cannot recommend such a hack.
The images property has a setting, iwurl, that is supposed to return URLs for interwiki links, and I see the "File" entries as interwiki links. This parameter did not work for me, and images does not return a URL. Playing in the API sandbox might lead you to a better answer.
Intuitively it seems like you should be able to chain the images and imageinfo properties together. Doing so does not give the expected results.
If a single URL for the main image of the page is not enough, I encourage you to play in the API sandbox to try and get what you need with some combination of properties. I am using the geosearch generator and get the page image, text description, and lat/long coordinates so that I can get the address. Good luck!
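To make that concrete, here is roughly what the single chained query looks like from Python (requests and format=json assumed; piprop=thumbnail yields a scaled thumbnail URL via the thumbnail.source field, not the full-size original):

import requests

API = "https://en.wikipedia.org/w/api.php"

# One request: geosearch as generator, chained with the images
# and pageimages properties.
r = requests.get(API, params={
    "action": "query", "format": "json",
    "generator": "geosearch", "ggsradius": 10000,
    "ggscoord": "41.426140|26.099319",
    "prop": "images|pageimages", "imlimit": "max",
    "pilimit": "max", "piprop": "thumbnail",
}).json()

for page in r.get("query", {}).get("pages", {}).values():
    thumb = page.get("thumbnail", {}).get("source")
    print(page["title"], thumb)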

Using CalculatedMembers in Aggregated tables with mondrian 2.4.2

I'm trying to set up custom aggregated tables in our mondrian data warehouse.
I defined the custom aggregated table like this:
<Table>
  <AggName name="..." ...>
    <AggFactCount ...></AggFactCount>
    <AggMeasure name="[Measures].[Example]" column="..." />
  </AggName>
</Table>
As long as [Measures].[Example] points to a <Measure name="Example" column="..." .../> item it works fine.
When it points to a <CalculatedMember name="Example" formula="..." ... />, I get the following error message:
"Failed to find measure 'Example' for Cube 'ExampleCube'"
From what I found online, this should work in recent versions. We're using mondrian 2.4.2, which supports only Measures in aggregate tables.
Is there any workaround, e.g. something like defining a cache table?
Or maybe it is enough to provide a configuration item?
Thanks,
Tamas

Quickest method for matching nested XML data against database table structure

I have an application that creates datarequests, which can be quite complex. These need to be stored in the database as tables. An outline of a datarequest (as XML) would be...
<datarequest>
  <datatask view="vw_ContractData" db="reporting" index="1">
    <datefilter modifier="w0">
      <filter index="1" datatype="d" column="Contract Date" param1="2009-10-19 12:00:00" param2="2012-09-27 12:00:00" daterange="" operation="Between" />
    </datefilter>
    <filters>
      <alternation index="1">
        <filter index="1" datatype="t" column="Department" param1="Stock" param2="" operation="Equals" />
      </alternation>
      <alternation index="2">
        <filter index="1" datatype="t" column="Department" param1="HR" param2="" operation="Equals" />
      </alternation>
    </filters>
    <series column="Turnaround" aggregate="avg" split="0" splitfield="" index="1">
      <filters />
    </series>
    <series column="Requested 3" aggregate="avg" split="0" splitfield="" index="2">
      <filters>
        <alternation index="1">
          <filter index="1" datatype="t" column="Worker" param1="Malcom" param2="" operation="Equals" />
        </alternation>
      </filters>
    </series>
    <series column="Requested 2" aggregate="avg" split="0" splitfield="" index="3">
      <filters />
    </series>
    <series column="Reqested" aggregate="avg" split="0" splitfield="" index="4">
      <filters />
    </series>
  </datatask>
</datarequest>
This encodes a datarequest comprising a daterange, main filters, series and series filters. Basically any element which has the index attribute can occur multiple times within its parent element - the exception to this being the filter within datefilter.
But the structure of this is kind of academic, the problem is more fundamental:
When a request comes through, XML like this is sent to SQL Server as a parameter to a stored proc. This XML is shredded into a de-normalised table and then written iteratively to normalised tables such as tblDataRequest (DataRequestID PK), tblDataTask, tblFilter, tblSeries. This is fine.
The problem occurs when I want to match a given XML definition with one already held in the DB. I currently do this by...
Shredding the XML into a de-normalised table
Using a CTE to pull all the existing data in the database into that same de-normalised form
Matching using a huge WHERE condition (34 lines long)
...This will return any DataRequestID which exactly matches the XML given. I fear that this method will end up being painfully slow - partly because I don't believe the CTE will do any clever filtering; it will pull all the data every single time before applying the huge WHERE.
I have thought there must be better solutions to this, e.g.:
When storing a datarequest, also store a hash of the datarequest somehow and simply match on that. In the case of a collision, fall back to the current method. I wanted, however, to do this using set logic. And also, I'm concerned about irrelevant small differences in the XML changing the hash - spurious spaces etc.
Somehow perform the matching iteratively from the bottom up. E.g. produce a list of filters which match at the lowest level, use this as part of an IN to match Series, use that as part of an IN to match DataTasks, etc. The trouble is, I start to black out when I think about this for too long.
Basically - has anyone ever encountered this kind of problem before (they must have)? And what would be the recommended route for tackling it? Example (pseudo)code would be great :)
To get rid of the possibility of minor variances, I'd run the request through an XML transform (XSLT).
Alternatively, since you've already got the code to parse this out into a denormalized staging table, that's fine too. I would then simply use FOR XML to create a new XML doc.
Your goal here is to create a standardized XML document that respects ordering where appropriate and removes inconsistencies where it is not.
Once that is done, store this in a new table. Now you can run a direct comparison of the "standardized" request XML against existing data.
To do the actual comparison, you can use a hash, store the XML as a string and do a direct string comparison, or do a full XML comparison like this: http://beyondrelational.com/modules/2/blogs/28/posts/10317/xquery-lab-36-writing-a-tsql-function-to-compare-two-xml-values-part-2.aspx
My preference, as long as the XML is never over 8000 bytes, would be to create a unique string (either VARCHAR(8000), or NVARCHAR(4000) if you need special character support) and create a unique index on the column.
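To illustrate the hashing idea (in Python rather than T-SQL, purely to show the principle): canonicalizing the XML before hashing makes spurious spaces, attribute order, and quoting differences irrelevant. Note that C14N does not reorder sibling elements, so any parts of the request that should match order-insensitively (e.g. the alternation elements) would still need a domain-specific sort first.

import hashlib
import xml.etree.ElementTree as ET

def request_fingerprint(xml_text: str) -> str:
    # Canonical XML (Python 3.8+): normalizes attribute order, quoting,
    # and insignificant whitespace, so two trivially different
    # serializations of the same datarequest hash identically.
    canonical = ET.canonicalize(xml_text, strip_text=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = "<datarequest><datatask index='1'  db='reporting' /></datarequest>"
b = '<datarequest><datatask db="reporting" index="1"/></datarequest>'
assert request_fingerprint(a) == request_fingerprint(b)

Store the fingerprint in an indexed column next to the request and match on that; only a hash match needs the full comparison (and with SHA-256 you could reasonably treat a match as definitive).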

REST - allow GET resource/ to output different versions

For simplicity, say I have a resource users. The HTTP call GET users/ returns a list of links to concrete users:
<users>
  <link rel='user' href='/users/user/1/'/>
  <link rel='user' href='/users/user/2/'/>
  <link rel='user' href='/users/user/3/'/>
  ....
</users>
The result representation is described in a specific media type:
application/vnd.company.Users+xml
In our frontends, we want to display a table with all users. This means we need to be able to fetch user information to display, such as the name, gender, friends, ... I would like to avoid needing a separate request for each user (GET /users/user/x/) to retrieve this information. In addition, some frontends will only display the name, while other frontends will display the name and his/her friends. And so on.
In essence, we are still returning users, but with extensions depending on what the frontend needs.
Which option would you choose? Why?
(1) Make GET users/ customizable via parameters such that the customizations are listed. Depending on the customizations, different media types might be returned, since the syntax of one version/combination might be very different from that of another version/combination:
GET users/ -> application/vnd.company.Users+xml
GET users/?fields=name,gender -> application/vnd.company.Users+xml
GET users/?fields=name,gender,friends -> application/vnd.company.UsersWithFriends+xml
(2) Different resources are created to distinguish between different media types. Parameters are still used for basic customizations covered by the media type. This gives:
GET users?fields=name -> application/vnd.company.Users+xml
GET users?fields=name,gender -> application/vnd.company.Users+xml
GET users_with_friends?fields=gender -> application/vnd.company.UsersWithFriends+xml
(3) The same as (1), but instead of parameters, the desired media type is set by the client in the Accept header. Customizable fields covered by the media type are still set via parameters:
GET users/?fields=name ACCEPT application/vnd.company.Users+xml
GET users/?fields=name,gender ACCEPT application/vnd.company.Users+xml
GET users/?fields=name,gender ACCEPT application/vnd.company.UsersWithFriends+xml
(4) Something else?
To answer my own question, I think that:
Solution (1) is very, very wrong. The media type must not be dependent on parameters.
Solutions (2) and (3) are more or less equal and up to preference. I prefer (3) since it would not introduce an explosion of resources. In addition, in essence we are still returning users; the only difference is the amount of information, reflected by different media types, that is returned. So one might argue that there is no real need to introduce new resources as done in (2).
Do you agree? What do you think?
(3) is surely the best in terms of strict media-type usage, but it would require a specific HTTP request client and won't be accessible through a basic URL-open library or a browser.
Why not use solution (1) with an extra parameter, named "expect" or "as"?
i.e.:
users/?fields=name,gender&expect=application/vnd.company.Users+xml
users/?fields=name,gender&expect=application/vnd.company.UsersWithFriends+xml
This would be the same as the Accept solution but won't need a very custom client library to forge the request.
However, you'll have to parse the parameter to provide the correct output (option (3) would have the same requirement for parsing the Accept header).
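A toy sketch of what that dispatch could look like server-side (Python; the serializer functions and their XML output are invented for illustration, they are not part of any framework):

# Illustrative only: map supported "expect" values to serializers.
def serialize_users(fields):
    return "<users fields='%s'/>" % ",".join(fields)

def serialize_users_with_friends(fields):
    return "<users-with-friends fields='%s'/>" % ",".join(fields)

SERIALIZERS = {
    "application/vnd.company.Users+xml": serialize_users,
    "application/vnd.company.UsersWithFriends+xml": serialize_users_with_friends,
}

def handle_get_users(params):
    expect = params.get("expect", "application/vnd.company.Users+xml")
    serializer = SERIALIZERS.get(expect)
    if serializer is None:
        return 406, "Not Acceptable"  # unsupported representation requested
    fields = params.get("fields", "name").split(",")
    return 200, serializer(fields)

print(handle_get_users({"fields": "name,gender",
                        "expect": "application/vnd.company.UsersWithFriends+xml"}))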
Personally, I'm not a fan of using query string parameters to allow clients to pick the data elements they wish to include in a representation. I find it makes it hard to optimize the server and it pollutes the cache with many overlapping variants. Also, you really shouldn't try and use conneg to select between representations that contain different sets of data. Conneg is really just for selecting the serialization format.
With the Hal media type you can approach this problem a bit differently. Consider a service with a root representation that looks like:
<resource rel="self"
href="http://example.org/userservice"
xmlns:us="http://example.org/userservice/rels">
<link rel="us:users" name="users" href="http://example.org/users">
<link rel="us:userswithfriends" href="http://example.org/userswithfriends">
</resource>
When you use hal, instead of using the media type documentation to describe your application domain, you can use link relations. In this case, the us:users link points to a document that contains a list of users. I know the namespace stuff looks a bit weird, but it is not really being used as an XML namespace, just as a way of making a Compact URI (CURIE). When you invent your own rel values, they need to be specified in the form of a URI to try and ensure uniqueness.
The list of users would look something like:
<resource rel="self"
href="http://example.org/users"
xmlns:us="http://example.org/userservice/rels">
<resource rel="us:user" name="1" href="/user/1">
<name>Bob</name>
<age>45</age>
<resource>
<resource rel="us:user" name="2" href="/user/2">
<name>Fred</name>
<age>Bill</age>
<resource>
</resource>
and 'us:userswithfriends' points to a different resource that contains the list of users with each user containing a list of friends.
<resource rel="self"
href="http://example.org/users"
xmlns:us="http://example.org/userservice/rels">
<resource rel="us:user" name="1" href="/user/1">
<name>Bob</name>
<resource rel="us:friend" name="1" href="/user/10">
<name>Sheila</name>
<resource>
<resource rel="us:friend" name="2" href="/user/74">
<name>Robert</name>
<resource>
<resource>
<resource rel="user" name="2" href="/user/2">
<name>Fred</name>
<resource rel="us:friend" name="1" href="/user/14">
<name>Bill</name>
<resource>
<resource rel="us:friend" name="2" href="/user/33">
<name>Margaret</name>
<resource>
<resource>
</resource>
With hal, it is the documentation of your rels (us:users, us:friend) that describes what data elements are allowed to exist in the resource element. You are free to embed all of the data of the resource, or more likely just a subset of it. If the client wants to access a complete representation of the embedded resource, it can follow the provided link.