MDX query issue - ssas

In analysis services, I have cube that is based on hospitalization data. For each hospitalization there are potentially 9 icd codes and these are each stored in their own field in the view on which the cube is based. These are stored in a child table in the relational database on which the SSAS database is based.
I would like to query the cube to return all rows that have a certain ICD code in any one or more of the 9 icd code fields. It seems as if it should be simple to have this sort of "OR" in the WHERE or the Filter clause, but I'm not finding the correct method.
Thanks in advance,
Jeremy Schrader

As far as i understand, you are an SQL guy and new to MDX, so that's why you have difficulties for the query.
it would be better if you tell us what are the measures you want to select with ICD codes but i am going to try to show you an mdx query sample as simple as possible. Your query should like below;
select {Measure1,Measure2,...} on columns
ICDCodeDimension.Children on rows
//{ICDCodeDimension.ICDCode1,ICDCodeDimension.ICDCode5,...} on rows
from Cube
MDX is highly advanced query language and there are many more concept you should know/learn to use it effectively.
Hope this help.

I am guessing that you'd have a dimension called [ICD Codes] with a single level called [Codes] and 9 members called [Code A] and [Code B] or whatever. Perhaps even a member for [No code] too?
In that case your query would be able to tell you the total number of hospitalisation cases for each code, for a certain time period, across all hospitals:
SELECT {[ICD Codes].[Codes].members} ON ROWS,
{[Measures].[Number of Cases]} ON COLUMNS
FROM [CubeName]
WHERE ([Time].[2010].[Quarter 1])

Thanks for both of your feedback. After I researched further (particularly an article here: http://sqlblog.com/blogs/mosha/default.aspx that uses the SUBCUBE method to give some "OR" functionality but with very poor performance) I realized that the OR construct that I was looking for requires record-level information and so doesn't work after the aggregation that SSAS performs. Thus, I need to create a field on the fact table that has the result of the SQL "OR" statement that I need.
In this case I will just create a flag for any record that has a certain range of ICD codes in any of the 9 ICD code fields. Then, I'll create a measure that gives a count of these. Luckily, the requirements of my app are that only a limited number of diagnoses need to be looked at in this way(i.e. any hospitalization that is diabetes-related,tobacco-related,etc.). I'm still curious how one would approach this if you needed to allow the user to choose any ICD code. My understanding at this point is that you would then need to revert back to plain SQL.
Jeremy

Related

Dynamic where clause and operand SQL Server

Having read almost all topics related to dynamic where clauses, I still can't find a way through.
Here is my source table:
Source Table
And the result I want is:
Results
In fact I want to return all values satisfying the Test value condition but don't know how to implement it dynamicly (I have a table with 700K lines).
Thanks a lot for your help.
EDIT:
Following your answers, I will detailed a bit more the approach.
Unfortunately, as I'm a new user, I'm not allowed to post pictures directly in the post.
I'm basicly performing segregation of duties controls over the SAP system.
Basicly, I want to test if some of the access of a customer are conflictual based on SAP extractions against a knowledge template stating potential conflicts.
Here is an simplified example of the SAP extract:
And here is a simplified example of the Potential conflict template:
<table><tbody><tr><th>Field</th><th>Value</th></tr><tr><td>ACTVT</td><td>AZ0220</td></tr><tr><td>KOART</td><td>K</td></tr><tr><td>BUKRS</td><td>*</td></tr><tr><td>EKGRP</td><td>03</td></tr></tbody></table>
This is a faxe example of the raw data of the customer:
<table><tbody><tr><th>FIELD</th><th>RANGE_START</th><th>RANGE_END</th></tr><tr><td>ACTVT</td><td>AZ01*</td><td>AZ99*</td></tr><tr><td>KOART</td><td>A</td><td>L</td></tr><tr><td>BUKRS</td><td>011</td><td>099</td></tr><tr><td>EKGRP</td><td>2</td><td>10</td></tr></tbody></table>
I thought a way of doing this is to use dynamic where clause.
Thanks a lot for your help

Learning ExecuteSQL in FMP12, a few questions

I have joined a new job where I am required to use FileMaker (and gradually transition systems to other databases). I have been a DB Admin of a MS SQL Server database for ~2 years, and I am very well versed in PL/SQL and T-SQL. I am trying to pan my SQL knowledge to FMP using the ExecuteSQL functionaloty, and I'm kinda running into a lot of small pains :)
I have 2 tables: Movies and Genres. The relevant columns are:
Movies(MovieId, MovieName, GenreId, Rating)
Genres(GenreId, GenreName)
I'm trying to find the movie with the highest rating in each genre. The SQL query for this would be:
SELECT M.MovieName
FROM Movies M INNER JOIN Genres G ON M.GenreId=G.GenreId
WHERE M.Rating=
(
SELECT MAX(Rating) FROM Movies WHERE GenreId = M.GenreId
)
I translated this as best as I could to an ExecuteSQL query:
ExecuteSQL ("
SELECT M::MovieName FROM Movies M INNER JOIN Genres G ON M::GenreId=G::GenreId
WHERE M::Rating =
(SELECT MAX(M2::Rating) FROM Movies M2 WHERE M2::GenreId = M::GenreId)
"; "" ; "")
I set the field type to Text and also ensured values are not stored. But all I see are '?' marks.
What am I doing incorrectly here? I'm sorry if it's something really stupid, but I'm new to FMP and any suggestions would be appreciated.
Thank you!
--
Ram
UPDATE: Solution and the thought process it took to get there:
Thanks to everyone that helped me solve the problem. You guys made me realize that traditional SQL thought process does not exactly pan to FMP, and when I probed around, what I realized is that to best use SQL knowledge in FMP, I should be considering each column independently and not think of the entire result set when I write a query. This would mean that for my current functionality, the JOIN is no longer necessary. The JOIN was to bring in the GenreName, which is a different column that FMP automatically maps. I just needed to remove the JOIN, and it works perfectly.
TL;DR: The thought process context should be the current column, not the entire expected result set.
Once again, thank you #MissJack, #Chuck (how did you even get that username?), #pft221 and #michael.hor257k
I've found that FileMaker is very particular in its formatting of queries using the ExecuteSQL function. In many cases, standard SQL syntax will work fine, but in some cases you have to make some slight (but important) tweaks.
I can see two things here that might be causing the problem...
ExecuteSQL ("
SELECT M::MovieName FROM Movies M INNER JOIN Genres G ON
M::GenreId=G::GenreId
WHERE M::Rating =
(SELECT MAX(M2::Rating) FROM Movies M2 WHERE M2::GenreId = M::GenreId)
"; "" ; "")
You can't use the standard FMP table::field format inside the query.
Within the quotes inside the ExecuteSQL function, you should follow the SQL format of table.column. So M::MovieName should be M.MovieName.
I don't see an AS anywhere in your code.
In order to create an alias, you must state it explicitly. For example, in your FROM, it should be Movies AS M.
I think if you fix those two things, it should probably work. However, I've had some trouble with JOINs myself, as my primary experience is with FMP, and I'm only just now becoming more familiar with SQL syntax.
Because it's incredibly hard to debug SQL in FMP, the best advice I can give you here is to start small. Begin with a very basic query, and once you're sure that's working, gradually add more complicated elements one at a time until you encounter the dreaded ?.
There's a number of great posts on FileMaker Hacks all about ExecuteSQL:
Since you're already familiar with SQL, I'd start with this one: The Missing FM 12 ExecuteSQL Reference. There's a link to a PDF of the entire article if you scroll down to the bottom of the post.
I was going to recommend a few more specific articles (like the series on Robust Coding, or Dynamic Parameters), but since I'm new here and I can't include more than 2 links, just go to FileMaker Hacks and search for "ExecuteSQL". You'll find a number of useful posts.
NB If you're using FMP Advanced, the Data Viewer is a great tool for testing SQL. But beware: complex queries on large databases can sometimes send it into fits and freeze the program.
The first thing to keep in mind when working with FileMaker and ExecuteSQL() is the difference between tables and table occurrences. This is a concept that's somewhat unique to FileMaker. Succinctly, tables store the data, but table occurrences define the context of that data. Table occurrences are what you're seeing in FileMaker's relationship graph, and the ExecuteSQL() function needs to reference the table occurrences in its query.
I agree with MissJack regarding the need to start small in building the SQL statement and use the Data Viewer in FileMaker Pro Advanced, but there's one more recommendation I can offer, which is to use SeedCode's SQL Explorer. It does require the adding of table occurrences and fields to duplicate the naming in your existing solution, but this is pretty easy to do and the file they offer includes a wizard for building the SQL query.

Transpose SQL table when the value is text

I have a table containing a series of survey responses, structured like this:
Question Category | Question Number | Respondent ID | Answer
This seemed the most logical storage for the data. The combination of Question Category, Question Number, and Respondent ID is unique.
Problem is, I've been asked for a report with the Respondent ID as the columns, with Answer as the rows. Since Answer is free-text, the numeric-expecting PIVOT command doesn't help. It would be great if each row was a specific Question Category / Question Number pairing so that all of the information is displayed in a single table.
Is this possible? I'm guessing a certain amount of dynamic SQL will be required, especially with the expected 50-odd surveys to display.
I think this task should be done by your client code - trying to do this transposing on SQL side is not very good idea. Such SQL (even if it can be constructed) will likely be extremely complicated and fragile.
First of all, you should count how many distinct answers are there - you probably don't want to create report 1000 columns wide if all answers are different. Also, you probably want to make sure that answers are narrow - what if someone gave really bad 1KB wide answer?
Then, you should construct your report (would that be HTML or something else) dynamically based on results of your standard, non-transposed SQL.
For example, in HTML you can create as many columns as you want using <th>column</th> for table header and <td>value</td> for data cell - as long as you know already how many columns are going to be in your output result.
To me, it seems that the problem is the number of columns. You don't know how many respondents there are.
One idea would be to concatenate the respondent ids. You can do this in SQL Server as:
select distinct Answer,
(select cast(RespondentId as varchar(255))+'; '
from Responses r2
where r2.Answer = r.Answer
for xml path ('')
) AllResponders
from Responses r
(This is untested so may have syntax errors.)
If reporting services or excel power pivot are possibilities for the report then they could both probably accomplish what you want easier than a straight sql query. In RS you can use a tablix, and in power pivot a pivot table. Both avoid having to define your pivot columns in an sql pivot statement, and both can dynamically determine the column names from a tabular result set.

How could i write this code in a more performant way?

In our app people have 1 or multiple projects. These projects have a start and an end date. People have a limited amount of available days.
Now we have a page that displays the availability of a given person on a week by week basis. It currently shows 18 weeks.
The way we currently calculate the available time for a given week is like this:
def days_available(query_date=Date.today)
days_engaged = projects.current.where("start_date < ? AND finish_date > ?", query_date, query_date).sum(:days_on_project)
available = days_total - hours_engaged
end
This means that to display the page descibed above the app will fire 18(!) queries into the database. We have pages that lists the availability of multiple people in a table. For these pages the amount of queries is quickly becomes staggering.
It is also quite slow.
How could we handle the availability retrieval in a more performant manner?
This is quite a common scenario when working with date ranges in an entity. Easy and fastest way is in SQL:
Join your events to a number generated date table (see generate days from date range) so that you have a row for each day a person or people are occupied. Once you have the data in this form it is simply a matter of grouping by the week date part of the date and counting the rows per grouping.
You can extend this to group by person for multiple person queries.
From a SQL point of view, I'd advise using a stored procedure and pass in your date/range requirement, you can then return a recordset for a user or possibly multiple users. This way your code just has to access db once.
You can then output recordset data in one go, by iterating through.
Hope this helps.
USE Stored procedure to fire your query to SQL to get data.
Pass paramerts in your case it is today's date to the SQl query.
Apply your conditions and Logic in the SQL Stored procedure , Using procedure is the goood and fastest way to retrieve data from the SQL , also it will prevent your code from the SQL injection too.
Call that SP from your Code as i dont know the Ruby on raisl I cant provide you steps about how to Call the Stored procedure from it.
After that the data fdetched as per you stored procedure will be available in Data table or something like that.
After getting the data you can perform all you need
Hope this helps
see what query is executed. further you may make comand explain to your query
explain select * from project where start_date < any_date and end_date> any_date2
you see the plan of query . Use this plan to optimized your query.
for example :
if you have index using field end_date replace a condition(end_date> any_date2 and start_date < any_date) . this step will using index if you have index on this field. But it step is db dependent . example is for nysql. if you want use index in mysql you must have using index condition on left part of where
There's not really enough information in your question to know exactly what you're trying to achieve here, e.g. the code snippet doesn't make use of the returned database query, so you could just remove it to make it faster. Perhaps this is just a bug in the code you posted?
Having said that, there are some techniques you should look into to implement your functionality.
I would take a look at using data warehouse techniques. I would think of your 'availability information' as a Fact table in a star schema, with 'Dates' and 'People' as Dimension tables.
You can then use queries to get stuff like - list of users for this projects for this week, and their availability.
Data warehousing has a whole bunch of resources you can tap into to help make this perform well, but there's also a lot of terminology that can be confusing, but for this type of 'I need to slice and dice my data across several sets of things (people and time)', Data Warehousing techniques can be quite powerful.
As I dont understand ruby on rails,from sql point of view i suggest you to write a stored procedure and return a dataset.And do the necessary table operations on the dataset from front end.It will reduce the unnecessary calls to DB.

Can scalar functions be applied before filtering when executing a SQL Statement?

I suppose I have always naively assumed that scalar functions in the select part of a SQL query will only get applied to the rows that meet all the criteria of the where clause.
Today I was debugging some code from a vendor and had that assumption challenged. The only reason I can think of for this code failing is that the Substring() function is getting called on data that should have been filtered out by the WHERE clause. But it appears that the substring call is being applied before the filtering happens, the query is failing.
Here is an example of what I mean. Let's say we have two tables, each with 2 columns and having 2 rows and 1 row respectively. The first column in each is just an id. NAME is just a string, and NAME_LENGTH tells us how many characters in the name with the same ID. Note that only names with more than one character have a corresponding row in the LONG_NAMES table.
NAMES: ID, NAME
1, "Peter"
2, "X"
LONG_NAMES: ID, NAME_LENGTH
1, 5
If I want a query to print each name with the last 3 letters cut off, I might first try something like this (assuming SQL Server syntax for now):
SELECT substring(NAME,1,len(NAME)-3)
FROM NAMES;
I would soon find out that this would give me an error, because when it reaches "X" it will try using a negative number for in the substring call, and it will fail.
The way my vendor decided to solve this was by filtering out rows where the strings were too short for the len - 3 query to work. He did it by joining to another table:
SELECT substring(NAMES.NAME,1,len(NAMES.NAME)-3)
FROM NAMES
INNER JOIN LONG_NAMES
ON NAMES.ID = LONG_NAMES.ID;
At first glance, this query looks like it might work. The join condition will eliminate any rows that have NAME fields short enough for the substring call to fail.
However, from what I can observe, SQL Server will sometimes try to calculate the the substring expression for everything in the table, and then apply the join to filter out rows. Is this supposed to happen this way? Is there a documented order of operations where I can find out when certain things will happen? Is it specific to a particular Database engine or part of the SQL standard? If I decided to include some predicate on my NAMES table to filter out short names, (like len(NAME) > 3), could SQL Server also choose to apply that after trying to apply the substring? If so then it seems the only safe way to do a substring would be to wrap it in a "case when" construct in the select?
Martin gave this link that pretty much explains what is going on - the query optimizer has free rein to reorder things however it likes. I am including this as an answer so I can accept something. Martin, if you create an answer with your link in it i will gladly accept that instead of this one.
I do want to leave my question here because I think it is a tricky one to search for, and my particular phrasing of the issue may be easier for someone else to find in the future.
TSQL divide by zero encountered despite no columns containing 0
EDIT: As more responses have come in, I am again confused. It does not seem clear yet when exactly the optimizer is allowed to evaluate things in the select clause. I guess I'll have to go find the SQL standard myself and see if i can make sense of it.
Joe Celko, who helped write early SQL standards, has posted something similar to this several times in various USENET newsfroups. (I'm skipping over the clauses that don't apply to your SELECT statement.) He usually said something like "This is how statements are supposed to act like they work". In other words, SQL implementations should behave exactly as if they did these steps, without actually being required to do each of these steps.
Build a working table from all of
the table constructors in the FROM
clause.
Remove from the working table those
rows that do not satisfy the WHERE
clause.
Construct the expressions in the
SELECT clause against the working table.
So, following this, no SQL dbms should act like it evaluates functions in the SELECT clause before it acts like it applies the WHERE clause.
In a recent posting, Joe expands the steps to include CTEs.
CJ Date and Hugh Darwen say essentially the same thing in chapter 11 ("Table Expressions") of their book A Guide to the SQL Standard. They also note that this chapter corresponds to the "Query Specification" section (sections?) in the SQL standards.
You are thinking about something called query execution plan. It's based on query optimization rules, indexes, temporaty buffers and execution time statistics. If you are using SQL Managment Studio you have toolbox over your query editor where you can look at estimated execution plan, it shows how your query will change to gain some speed. So if just used your Name table and it is in buffer, engine might first try to subquery your data, and then join it with other table.