I'm getting a result from my cross join that I can't wrap my head around; see my first MDX query, which gives me a result as expected. (SQL Server Analysis Services 2012)
Alright, now I want to do the same thing, only I want to target a specific date, let's take "2020-05-05". This is what happens:
I get an empty result. Why don't I see the results from the first query (limited to the rows with 2020-05-05)? I really do not understand this.
It turned out that the SSAS cube was only partially processed; doing a full process of everything seems to have solved the problem.
Although, I would still appreciate an explanation of why the second query failed. The first query shows quite clearly that the cube, in that state, contained information for 2020-05-05.
I'm trying to optimize a 2M row SSAS query into Power BI by using MDX prior to the Power Query. I have experience in T-SQL and found a website to help translate that T-SQL experience into MDX, which was successful for some queries (basic rows/column selects, crossjoins, non empty, order by, filter, where). So now I want to bring in my sales data, which contains three dimensions and four measures, but I get the following error:
Executing the query ...
Query (3, 1) The 'Measures' hierarchy appears more than once in the tuple.
Run complete
I attempted a few variations related to crossjoining the measures and the dimensions, only selecting one measure (which still took too long), and specifying members vs children.
select
([Date].[OrderDate].children, [Customer].[CustID].children, [ProdLevel].[ProdNumber].children) on rows,
([Measures].[Revenue], [Measures].[Units], [Measures].[ASP], [Measures].[Profit]) on columns
from [RepProdDB]
where [ProdLevel].[Prod Description].[MyBusinessUnit]
Looking up the error, "The 'Measures' hierarchy appears more than once in the tuple." is a bit vague to me, as I have only a slight and probably incomplete understanding of tuples.
My hope is to have something that I can easily get in PivotTable OLAP, Power Pivot, and Power Query but using the actual MDX code. Thoughts?
So you need to understand the difference between tuples and sets. A tuple, written with parentheses, takes at most one member from any given hierarchy, so wrapping four measures in parentheses makes the Measures hierarchy appear four times in one tuple, hence the error. A set, written with braces, is a collection of members (or tuples) of the same dimensionality, which is what an axis expects:
select
non empty
(
[Date].[OrderDate].children,
[Customer].[CustID].children,
[ProdLevel].[ProdNumber].children
)
on rows,
{
[Measures].[Revenue],
[Measures].[Units],
[Measures].[ASP],
[Measures].[Profit]
}
on columns
from [RepProdDB]
where
[ProdLevel].[Prod Description].[MyBusinessUnit]
I am trying to calculate a new field in Tableau, similar to a subquery in SQL. However, my numbers are not matching up when I try to do this. I am stuck at this point and am trying to see what others have done.
To provide reference, below is the subquery that I am trying to duplicate in Tableau.
select
((sum(n.non_influenced_sales*n.calls)/sum(n.calls))-(sum(n.average_sales*n.calls)/sum(n.calls)))/
(sum(n.non_influenced_sales*n.calls)/sum(n.calls)) as impact
from (select
count(d.id) as calls
,avg(d.sale) as average_sales
,avg(case when d.non_influenced=1 then d.sale else null end) as non_influenced_sales
from data d
group by d.skill) n
When I try to build the same calculation in Tableau, I am able to get the same results as long as I comment out the group by skill. However, when I try to group by skill, my attempts to match the number have not been working.
The closest I have come is when I try to fix the level of detail expression by using include. Tableau code:
{ INCLUDE [skill] : ([non_influenced_sales]-[average_sales])/[non_influenced_sales] }
However, doing this or using fixed has not worked and I can't match the numbers I am getting from SQL.
FYI, Impact is an aggregated measure. I built the sub-query part in Tableau by just creating separate fields for the calculations I needed. So for example:
Non Influenced Sales calculated in Tableau:
avg(if [non_influenced]=1 then [non_influenced_sales] end)
However, I am not sure if this matters or not.
I have also tried creating custom SQL. I am able to get a rolled-up version using all of the dates correct. But when I want to get down to different dates or use other filters, things get messy very quickly. I am trying to build relationships at the date level, but that hasn't worked either.
Is there an easier way to do this?
I frequently do a static analysis of SQL databases, during which I have the luxury of nobody being able to change the data except me.
However, I have not found a way to 'tell' this to SQL in order to prevent running the same query multiple times.
Here is what I would like to do. First, I start with a complicated query that has a very small output.
SELECT * FROM MYTABLE WHERE MYPROPERTY = 1234
Then I run a simple query from the same window (mostly using SQL Server Management Studio, if that is relevant).
SELECT 1
Now I suddenly realize that I forgot to save the results from my first complicated (slow) query.
As I know the underlying data did not change (or even if it did) I would like to look one step back and simply get the result. However at the moment I don't know any trick to do this and I have to run the entire query again.
So the question summary is: how can I (automatically store and) get the results from recently executed queries?
I am particularly interested in simple SELECT queries, and would be happy to allocate, say, 100 MB of memory for automated result storage. I would prefer a solution that works in SQL Server Management Studio with T-SQL, but other SQL solutions are also welcome.
EDIT: I am not looking for a way to manually prevent this from happening. In the cases where I can anticipate the problem it will not happen.
This can't be done in Microsoft SQL Server. SQL Server does not cache results; instead, it caches the data pages that were accessed by your query. This should make your query go a lot faster the second time around, so it won't be as painful to re-run it.
Other databases, such as Oracle and MySQL, do have a query caching mechanism that allows you to retrieve the results directly the second time around.
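A small illustration of that page-level caching, reusing the MYTABLE query from the question: compare the I/O statistics of two consecutive runs.

SET STATISTICS IO ON;

-- First run: expect physical reads while the pages are fetched from disk
SELECT * FROM MYTABLE WHERE MYPROPERTY = 1234;

-- Second run: mostly logical reads, served from the buffer pool
SELECT * FROM MYTABLE WHERE MYPROPERTY = 1234;

SET STATISTICS IO OFF;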
I run into this frequently; I often just throw the results of longer-running queries into a temp table:
SELECT *
INTO #results1
FROM MYTABLE WHERE MYPROPERTY = 1234
SELECT *
FROM #results1
If the query is very long-running I might use a 'real' table. It's a good way to save on re-run time.
Downside is that it adds to your query.
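If a 'real' table is preferred so the results survive closing the window, the same pattern works; the table name below is just a placeholder:

-- Persist the results in a permanent table
SELECT *
INTO dbo.MyResults1
FROM MYTABLE WHERE MYPROPERTY = 1234

-- Read them back later, even from a new session
SELECT *
FROM dbo.MyResults1

-- Drop the table once you have saved what you need
DROP TABLE dbo.MyResults1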
You can also send query results to a file in SSMS, info on formatting the output is here: SSMS Results to File
The easiest way to do this is to run each query in its own SSMS window, the results will stay there until you close it, or run out of memory - besides that, I am not sure there is a way to accomplish what you want.
Once you close the SSMS window, I don't believe there is a way to get back 'cached' results.
This isn't a technical answer to your question. Having written queries and looking at results for many years, I am in the habit of saving the results in Excel, regardless of the database/query tool I'm using.
The format in Excel is rather methodical:
Each worksheet has the date. (Called something like "1 Jul".)
Each spreadsheet contains one month. (Typically with the month name like "work-201307".)
In the "B" column I copy and paste the query.
Underneath, in the "C" column, I copy and paste the results.
The next query starts a few lines after, one after the other.
I put the queries in the "B" column so I can go to the "A" column and jump straight to the first row. I put the results in the "C" column so I can go to the "B" column and jump between queries.
I mostly do this so I can go back and see the work I did many months ago. For instance, someone sends an email from February and says "do this again". I can go back to the February spreadsheet, go to the day it was created, and see what I was doing at that time.
In your question, though, I realize that I now instinctively solve this problem, because the "right click on the grid, copy with column headers, alt-tab to Excel, alt-V" routine is a behavior that comes quite naturally to me.
I was going to suggest you run each query from a script with a counter (stored in a table) that is increased each time the query is executed (i.e. i++), storing each query's results in a table called "tmpTable" + i, but it sounds very complicated to manage. Am I right?
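For what it's worth, a rough sketch of that idea (the QueryCounter table and the table naming are made up for illustration; a regular table is used rather than a # temp table so the results outlive the dynamic-SQL scope):

-- One-time setup: a single-row counter table
CREATE TABLE QueryCounter (i int NOT NULL)
INSERT INTO QueryCounter VALUES (0)

-- Pattern to wrap around each analysis query
DECLARE @i int
UPDATE QueryCounter SET @i = i = i + 1   -- increment and capture the counter

DECLARE @sql nvarchar(max) =
    N'SELECT * INTO tmpTable' + CAST(@i AS nvarchar(10)) +
    N' FROM MYTABLE WHERE MYPROPERTY = 1234'
EXEC sp_executesql @sql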
Then I googled and found this tool pack; I didn't try it, but you could take a look:
http://www.ssmstoolspack.com/Features
Hope it helps.
EDIT: added the following link. There's the option to output as an XML file, and they mention SQL Server Integration Services as a possible solution too.
http://michaeljswart.com/2012/03/sending-query-results-to-others/#method5
SECOND EDIT: There's this DBMS-Independent tool too, it sounds interesting:
http://www.sql-workbench.net/
I am not sure if this is what you want. Anyway, check my answer.
In SQL Server Management Studio you can open multiple tabs for executing queries. Open a new tab for each query; the result of each executed query will remain available under that tab.
After executing one query in a tab, don't use that tab for a new query; open a new tab for that job.
Have you considered using some kind of offline SQL client such as Excel? Specifically, Excel will retrieve the results into the spread sheet (using the Data ribbon/menus) where they are stored pretty much permanently as results. It will prompt you to refresh when necessary or you can do it on demand.
Whether this can be done in T-SQL or in other databases depends on the database and its results cache, and even then a results cache is an option the query processor can use, not a guarantee for an individual query.
I'm using SQL Server Analysis Services.
I have a calculated member that, for now, just does this:
[MyDimension].[MyOnlyHierarchy].CurrentMember.Properties("MEMBER_UNIQUE_NAME")
Previously, I had just written [MyDimension].[MyOnlyHierarchy].CurrentMember.UniqueName. They should be the same anyway.
Now, I used SQL Profiler to get hold of the query my application issues. For a simple calculated member in [MyDimension].[MyOnlyHierarchy] that just sums two different members, say with IDs 401 and 402, I get this result:
[MyDimension].[MyOnlyHierarchy].&[401][MyDimension].[MyOnlyHierarchy].&[402]
In other words, it is as if AS evaluates the underlying members and concatenates the results, rather than giving me the unique name of the calculated member...
The REALLY strange thing to me is that when I take the original query, and prepend the following:
WITH MEMBER [Measures].[GiveMeCalculatedMemberUniqueName]
AS
(
[MyDimension].[MyOnlyHierarchy].CurrentMember.Properties("MEMBER_UNIQUE_NAME")
)
...rest of query
I get the CORRECT results using this second measure! The context is the same (to me at least). Everything is the same... Yet the measure declared in the project file gives a different result than this inline calculated member.
What's going on here? Note, I've redeployed 10000 times, and I've checked the actual definition in the cube on the server and everything. It just doesn't make sense to me.
Calculations are evaluated based on solve order. Maybe because you moved it down, which is how the solve order was supposed to work, it gives you the correct result. I have a little blog post on solve order here, but there are many more articles on the internet.
HTH
Funny... I've been sitting for about 2-3 hours with this, exploring every thought; then, after posting this question, I decided to try one more thing:
move the calculated member definition to the bottom of the calculated members script file in the project.
This now gives the correct result.
I suppose I have always naively assumed that scalar functions in the select part of a SQL query will only get applied to the rows that meet all the criteria of the where clause.
Today I was debugging some code from a vendor and had that assumption challenged. The only reason I can think of for this code failing is that the Substring() function is getting called on data that should have been filtered out by the WHERE clause. But it appears that the substring call is being applied before the filtering happens, and the query is failing.
Here is an example of what I mean. Let's say we have two tables, each with 2 columns and having 2 rows and 1 row respectively. The first column in each is just an id. NAME is just a string, and NAME_LENGTH tells us how many characters are in the name with the same ID. Note that only names with more than one character have a corresponding row in the LONG_NAMES table.
NAMES: ID, NAME
1, "Peter"
2, "X"
LONG_NAMES: ID, NAME_LENGTH
1, 5
If I want a query to print each name with the last 3 letters cut off, I might first try something like this (assuming SQL Server syntax for now):
SELECT substring(NAME,1,len(NAME)-3)
FROM NAMES;
I would soon find out that this would give me an error, because when it reaches "X" it will try using a negative number for the length argument in the substring call, and it will fail.
The way my vendor decided to solve this was by filtering out rows where the strings were too short for the len - 3 query to work. He did it by joining to another table:
SELECT substring(NAMES.NAME,1,len(NAMES.NAME)-3)
FROM NAMES
INNER JOIN LONG_NAMES
ON NAMES.ID = LONG_NAMES.ID;
At first glance, this query looks like it might work. The join condition will eliminate any rows that have NAME fields short enough for the substring call to fail.
However, from what I can observe, SQL Server will sometimes try to calculate the substring expression for everything in the table, and then apply the join to filter out rows. Is this supposed to happen this way? Is there a documented order of operations where I can find out when certain things will happen? Is it specific to a particular database engine or part of the SQL standard? If I decided to include some predicate on my NAMES table to filter out short names (like len(NAME) > 3), could SQL Server also choose to apply that after trying to apply the substring? If so, then it seems the only safe way to do a substring would be to wrap it in a "case when" construct in the select?
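For reference, a sketch of that CASE-protected version against the NAMES table above:

SELECT CASE
         WHEN len(NAME) > 3 THEN substring(NAME, 1, len(NAME) - 3)
         ELSE NULL   -- or NAME itself, whatever short names should return
       END
FROM NAMES;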
Martin gave this link that pretty much explains what is going on - the query optimizer has free rein to reorder things however it likes. I am including this as an answer so I can accept something. Martin, if you create an answer with your link in it I will gladly accept that instead of this one.
I do want to leave my question here because I think it is a tricky one to search for, and my particular phrasing of the issue may be easier for someone else to find in the future.
The linked question: "TSQL divide by zero encountered despite no columns containing 0".
EDIT: As more responses have come in, I am again confused. It does not seem clear yet when exactly the optimizer is allowed to evaluate things in the select clause. I guess I'll have to go find the SQL standard myself and see if I can make sense of it.
Joe Celko, who helped write early SQL standards, has posted something similar to this several times in various USENET newsgroups. (I'm skipping over the clauses that don't apply to your SELECT statement.) He usually said something like "This is how statements are supposed to act like they work". In other words, SQL implementations should behave exactly as if they did these steps, without actually being required to do each of these steps.
1. Build a working table from all of the table constructors in the FROM clause.
2. Remove from the working table those rows that do not satisfy the WHERE clause.
3. Construct the expressions in the SELECT clause against the working table.
So, following this, no SQL dbms should act like it evaluates functions in the SELECT clause before it acts like it applies the WHERE clause.
In a recent posting, Joe expands the steps to include CTEs.
CJ Date and Hugh Darwen say essentially the same thing in chapter 11 ("Table Expressions") of their book A Guide to the SQL Standard. They also note that this chapter corresponds to the "Query Specification" section (sections?) in the SQL standards.
You are thinking about something called the query execution plan. It is based on query optimization rules, indexes, temporary buffers, and execution-time statistics. If you are using SQL Server Management Studio, there is a toolbar above the query editor where you can look at the estimated execution plan; it shows how your query will be transformed to gain some speed. So if the engine has just used your NAMES table and it is in the buffer, it might first evaluate the expression over your data and then join it with the other table.
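If you prefer to inspect that plan from T-SQL rather than the SSMS toolbar, a small sketch using the vendor query from the question (SET SHOWPLAN_XML returns the estimated plan as XML instead of executing the statement):

SET SHOWPLAN_XML ON;
GO

SELECT substring(NAMES.NAME,1,len(NAMES.NAME)-3)
FROM NAMES
INNER JOIN LONG_NAMES
ON NAMES.ID = LONG_NAMES.ID;
GO

SET SHOWPLAN_XML OFF;
GO

In the returned plan XML you can see where the Compute Scalar for the substring sits relative to the join.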