Grouping MDX query results - ssas

I have the following query (based on the sample data provided with the Microsoft® SQL Server® 2008 MDX Step by Step book):
WITH
SET important_months AS
{
([Product].[Product Categories].[Subcategory].&[28].CHILDREN , {[Date].[Month of Year].&[1], [Date].[Month of Year].&[2]}),
([Product].[Product Categories].[Product].&[477] , {[Date].[Month of Year].&[3]})
}
SELECT [Measures].[Order Count] ON COLUMNS,
important_months ON ROWS
FROM [Step-by-Step]
The query shows the number of orders placed for products in a particular subcategory in particular months. For all products in subcategory 28, I need the count of orders placed in January or February (month 1 or 2). The exception is product 477: in this case, I additionally need to include the number of orders placed in March.
In the end, however, I'm not really interested in the month details: all I want is the number of orders placed for each product (i.e. I want to lose/hide the information about which month the order was placed in).
So instead of
Mountain Bottle Cage, January, 176
Mountain Bottle Cage, February, 183
Road Bottle Cage, January, 141
Road Bottle Cage, February, 152
Water Bottle - 30 oz, January, 381
Water Bottle - 30 oz, February, 403
Water Bottle - 30 oz, March, 414
I need to have:
Mountain Bottle Cage, 359 (176 + 183)
Road Bottle Cage, 293 (141 + 152)
Water Bottle - 30 oz., 1198 (381 + 403 + 414)
I tried putting the important_months set into a WHERE clause, but (besides a circular reference error due to the custom set) I wouldn't be able to project the categories on the rows axis (would I?). Also, I thought of using a subquery, but it appears I cannot refer to the important_months set there either.
In other words: I need the result that in SQL I would get by issuing
SELECT SUM([Order Count])
FROM <MDX RESULT HERE>
GROUP BY Product
Can it be done?
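The SQL analogy above can be sketched with Python's built-in sqlite3 module. This is only an illustration of the desired grouping, not an MDX solution: the table name mdx_result is a hypothetical stand-in for <MDX RESULT HERE>, and the rows are the per-month figures listed earlier in the question.

```python
import sqlite3

# Per-month rows copied from the MDX result listed in the question.
mdx_rows = [
    ("Mountain Bottle Cage", "January", 176),
    ("Mountain Bottle Cage", "February", 183),
    ("Road Bottle Cage", "January", 141),
    ("Road Bottle Cage", "February", 152),
    ("Water Bottle - 30 oz", "January", 381),
    ("Water Bottle - 30 oz", "February", 403),
    ("Water Bottle - 30 oz", "March", 414),
]

conn = sqlite3.connect(":memory:")
# mdx_result is a hypothetical table standing in for <MDX RESULT HERE>.
conn.execute(
    "CREATE TABLE mdx_result (product TEXT, month TEXT, order_count INTEGER)"
)
conn.executemany("INSERT INTO mdx_result VALUES (?, ?, ?)", mdx_rows)

# Drop the month and sum the order counts per product.
totals = conn.execute(
    "SELECT product, SUM(order_count) "
    "FROM mdx_result GROUP BY product ORDER BY product"
).fetchall()
conn.close()

for product, total in totals:
    print(product, total)
```

The three totals match the figures I am after: 359, 293 and 1198.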

An educated guess is that MDX subqueries are the solution. Did you try using tuples in the subselect?
SELECT [Measures].[Order Count] ON COLUMNS,
{[Product].[Product Categories].[Subcategory].&[28].CHILDREN,[Product].[Product Categories].[Product].&[477]} ON ROWS
FROM (
SELECT
{([Product].[Product Categories].[Subcategory].&[28].CHILDREN,{[Date].[Month of Year].&[1], [Date].[Month of Year].&[2]}),
([Product].[Product Categories].[Product].&[477],{[Date].[Month of Year].&[3]})} ON 0
FROM [Step-by-Step]
)

You're creating an asymmetric set (with March for Water Bottle only), so you can't really slice this directly in the WHERE clause without including it for all other products.
icCube's answer looks good to me, with one small addition: add a DISTINCT to the row selection to combine Water Bottle back into one row.
SELECT [Measures].[Order Count] ON COLUMNS,
DISTINCT({[Product].[Product Categories].[Subcategory].&[28].CHILDREN, [Product].[Product Categories].[Product].&[477]}) ON ROWS
FROM (
SELECT
{([Product].[Product Categories].[Subcategory].&[28].CHILDREN,{[Date].[Month of Year].&[1], [Date].[Month of Year].&[2]}),
([Product].[Product Categories].[Product].&[477],{[Date].[Month of Year].&[3]})} ON 0
FROM [Step-by-Step]
)

Related

Need help to build a sales funnel report by sql query

I have created a view for sales. This view relates leads, opportunities and quotations. As the data shows, not every lead turns into an opportunity and a quotation.
LeadID  OfferingID  QuotationID  Product  Salesperson  Department     Date        Salesprice
L1      O1          Q1           X001     Mr.X         Machine Sales  11-01-2011  100
L2      O2          Q2           X002     Mr.Y         Marine Sales   10-02-2011  200
L3      O3                       X003     Mr.Z         Engine Sales   11-03-2011  300
L4      O4          Q3           X004     Mr.P         Parts Sales    13-04-2011  50
L5                               X001     Mr.X         Machine Sales  20-05-2012  100
L6      O5                       X001     Mr.X         Machine Sales  30-06-2012  100
My final output for the sales funnel across all departments should look like [total number of leads (6)] -> [total number of offerings (5)] -> [total number of quotations (3)].
If I filter it by the 'Machine Sales' department, the funnel should look like:
[total number of leads (3)] -> [total number of offerings (2)] -> [total number of quotations (1)].
I need to be able to filter the funnel by date, salesperson, product and department. Please help me build this sales funnel query.
I will then visualize the result as a funnel chart in Microsoft Power BI.
Is there anything stopping you from feeding this data directly into Power BI?
I think you might be over-engineering this problem and creating another table/view in your database that you'll have to remember and manage.
Leads = COUNT('YourTableNameHere'[LeadID])
Offers = COUNT('YourTableNameHere'[OfferingID])
Quotes = COUNT('YourTableNameHere'[QuotationID])
This is very straightforward conditional aggregation with a group by:
select date
,salesperson
,etc
,sum(case when LeadID <> '' then 1 end) as NumberOfLeads
,etc
from YourTable
group by date
,salesperson
,etc
If your LeadID, OfferingID and QuotationID columns hold NULL where there is no data, you don't even need the conditional inside the aggregate and can just use count, since it ignores NULL values:
select ...
,count(LeadID) as NumberOfLeads
,...
etc
I think you want:
select department, count(leadid) as num_leads, count(offeringid) as numoffers,
count(distinct quotationid) as numquotations
from t
group by department;
I don't think count(distinct) is needed for the first two columns, but your data has no examples of duplicates so it is unclear.
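To check the counting approach against the numbers in the question, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the view is called sales_view (a made-up name), that the blank cells in the sample are NULL, and that the dates are DD-MM-YYYY (rewritten as ISO strings here). The funnel helper is hypothetical and filters by department only; the other filters would follow the same pattern.

```python
import sqlite3

# Rows copied from the sample view in the question; blank cells become None (NULL).
SALES_ROWS = [
    ("L1", "O1", "Q1", "X001", "Mr.X", "Machine Sales", "2011-01-11", 100),
    ("L2", "O2", "Q2", "X002", "Mr.Y", "Marine Sales", "2011-02-10", 200),
    ("L3", "O3", None, "X003", "Mr.Z", "Engine Sales", "2011-03-11", 300),
    ("L4", "O4", "Q3", "X004", "Mr.P", "Parts Sales", "2011-04-13", 50),
    ("L5", None, None, "X001", "Mr.X", "Machine Sales", "2012-05-20", 100),
    ("L6", "O5", None, "X001", "Mr.X", "Machine Sales", "2012-06-30", 100),
]

conn = sqlite3.connect(":memory:")
# sales_view is an assumed name for the view described in the question.
conn.execute(
    "CREATE TABLE sales_view (LeadID TEXT, OfferingID TEXT, QuotationID TEXT, "
    "Product TEXT, Salesperson TEXT, Department TEXT, Date TEXT, Salesprice INTEGER)"
)
conn.executemany("INSERT INTO sales_view VALUES (?, ?, ?, ?, ?, ?, ?, ?)", SALES_ROWS)


def funnel(department=None):
    """Return (leads, offerings, quotations); COUNT(column) skips NULLs."""
    return conn.execute(
        "SELECT COUNT(LeadID), COUNT(OfferingID), COUNT(QuotationID) "
        "FROM sales_view WHERE (? IS NULL OR Department = ?)",
        (department, department),
    ).fetchone()


all_departments = funnel()
machine_sales = funnel("Machine Sales")
print(all_departments, machine_sales)
```

With the sample rows this gives 6, 5, 3 for all departments and 3, 2, 1 for 'Machine Sales', matching the funnel described in the question.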

JOIN the same table on two columns

I use JOINs to replace country and product IDs in import and export data with actual country and products names stored in separate tables. In the data source table (data), there are two columns with country IDs, for origin and destination, both of which I am replacing with country names.
The code I have come up with refers to the country_names table twice – as country_names, and country_names2, – which doesn’t seem to be very elegant. I expected to be able to refer to the table just once, by a single name. I would be grateful if someone pointed me to a more elegant and maybe more efficient way to achieve the same result.
SELECT
country_names.name AS origin,
country_names2.name AS dest,
product_names.name AS product,
SUM(data.export_val) AS export_val,
SUM(data.import_val) AS import_val
FROM
OEC.year_origin_destination_hs92_6 AS data
JOIN
OEC.products_hs_92 AS product_names
ON
data.hs92 = product_names.hs92
JOIN
OEC.country_names AS country_names
ON
data.origin = country_names.id_3char
JOIN
OEC.country_names AS country_names2
ON
data.dest = country_names2.id_3char
WHERE
data.year > 2012
AND data.export_val > 1E8
GROUP BY
origin,
dest,
product
The table to convert product IDs to product names has 6K+ rows. Here is a small sample:
id hs92 name
63215 3215 Ink
2130110 130110 Lac
21002 1002 Rye
2100200 100200 Rye
52706 2706 Tar
20902 902 Tea
42203 2203 Beer
42302 2302 Bran
178703 8703 Cars
The table to convert country IDs to country names (which is the table I have to JOIN on twice) has 264 rows for all countries in the world. (id_3char is the column used.) Here is a sample:
id id_3char name
euchi chi Channel Islands
askhm khm Cambodia
eublx blx Belgium-Luxembourg
eublr blr Belarus
eumne mne Montenegro
euhun hun Hungary
asmng mng Mongolia
nabhs bhs Bahamas
afsen sen Senegal
And here is a sample of data from the import and export data table with a total of 205M rows that has the two columns origin and dest that I am making a join on:
year origin dest hs92 export_val import_val
2009 can isr 300410 2152838.47 3199.24
1995 chn jpn 590190 275748.65 554154.24
2000 deu gmb 100610 1573508.44 1327.0
2008 deu jpn 540822 10000.0 202062.43
2010 deu ukr 950390 1626012.04 159423.38
2006 esp prt 080530 2470699.19 125291.33
2006 grc ind 844859 8667.0 3182.0
2000 ltu deu 630399 6018.12 5061.96
2005 usa zaf 290219 2126216.52 34561.61
1997 ven ecu 281122 155347.73 1010.0
I think you already have it done such that it can be considered good enough to just use as is :o)
Meanwhile, if for some reason you really want to avoid two joins on that country table, you can materialize the select statement below into, let's say, an `OEC.origin_destination_pairs` table:
SELECT
o.id_3char o_id_3char,
o.name o_name,
d.id_3char d_id_3char,
d.name d_name
FROM `OEC.country_names` o
CROSS JOIN `OEC.country_names` d
Then you can just join on that new table as below
SELECT
country_names.o_name AS origin,
country_names.d_name AS dest,
product_names.name AS product,
SUM(data.export_val) AS export_val,
SUM(data.import_val) AS import_val
FROM OEC.year_origin_destination_hs92_6 AS data
JOIN OEC.products_hs_92 AS product_names
ON data.hs92 = product_names.hs92
JOIN OEC.origin_destination_pairs AS country_names
ON data.origin = country_names.o_id_3char
AND data.dest = country_names.d_id_3char
WHERE data.year > 2012
AND data.export_val > 1E8
GROUP BY
origin,
dest,
product
The motivation behind the above is the cost of storing and querying in your particular case.
Your `OEC.country_names` table is only about 10 KB in size.
Each time you query it, you pay as if it were 10 MB (charges are rounded to the nearest MB, with a minimum of 10 MB data processed per table referenced by the query, and a minimum of 10 MB data processed per query).
So if you materialize the table mentioned above, it will still be less than 10 MB, so there is no difference in querying charges.
The situation is similar for storing that table: no visible change in charges.
See the pricing documentation for more details.
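As a sanity check that the pair table gives the same answer as joining the country table twice, here is a small sketch using Python's built-in sqlite3 module rather than the original warehouse. The country names come from the sample above; the three trade rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country_names (id_3char TEXT PRIMARY KEY, name TEXT);
INSERT INTO country_names VALUES
    ('hun', 'Hungary'), ('sen', 'Senegal'), ('khm', 'Cambodia');

-- Made-up trade rows, for illustration only.
CREATE TABLE data (origin TEXT, dest TEXT, export_val REAL);
INSERT INTO data VALUES
    ('hun', 'sen', 10.0), ('hun', 'sen', 5.0), ('khm', 'hun', 7.0);

-- Materialized pair table: one row per (origin, destination) combination.
CREATE TABLE origin_destination_pairs AS
SELECT o.id_3char AS o_id_3char, o.name AS o_name,
       d.id_3char AS d_id_3char, d.name AS d_name
FROM country_names o CROSS JOIN country_names d;
""")

# Original approach: join the country table twice under two aliases.
two_joins = conn.execute("""
    SELECT o.name, d.name, SUM(data.export_val)
    FROM data
    JOIN country_names o ON data.origin = o.id_3char
    JOIN country_names d ON data.dest = d.id_3char
    GROUP BY o.name, d.name
    ORDER BY o.name, d.name
""").fetchall()

# Alternative: a single join on the pair table, matching both columns.
one_join = conn.execute("""
    SELECT p.o_name, p.d_name, SUM(data.export_val)
    FROM data
    JOIN origin_destination_pairs p
      ON data.origin = p.o_id_3char AND data.dest = p.d_id_3char
    GROUP BY p.o_name, p.d_name
    ORDER BY p.o_name, p.d_name
""").fetchall()

pair_count = conn.execute(
    "SELECT COUNT(*) FROM origin_destination_pairs"
).fetchone()[0]
print(two_joins == one_join, pair_count)
```

Note that the pair table grows with the square of the country count: 3 countries give 9 pairs here, and 264 countries would give 69,696 rows.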

sum not calculating correct no. of units in SQL command

I have the following SQL script (the result is displayed under the script). The issue I am having is that I need to add up the quantity on the invoice. The quantity works fine when all the products on the invoice are different. When a product appears twice on the invoice, the result is incorrect. Any help appreciated.
The DISTINCT keyword acts on all columns you select.
A new product introduces a difference which makes it no longer distinct. Hence the extra row(s).
Where you had:
Order Product Total
1 Toaster $10
2 Chair $20
And another item is added to order 1:
Order Product Total
1 Toaster $99
1 Balloon $99 -- Yes that's a $89 balloon!
2 Chair $20
The new row (balloon) is distinct and isn't reduced into the previous row (toaster).
To make it distinct again, don't select the product name:
Order Total
1 $99
2 $20
Uniqueness kicks in and everyone's happy!
If you can remove the column from the select list that's "different", you should get the results you need.
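The toaster and balloon example can be reproduced with Python's built-in sqlite3 module. This is a sketch only; the table and column names (order_lines, order_no, order_total) are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table holding the second version of the example data.
CREATE TABLE order_lines (order_no INTEGER, product TEXT, order_total INTEGER);
INSERT INTO order_lines VALUES
    (1, 'Toaster', 99), (1, 'Balloon', 99), (2, 'Chair', 20);
""")

# DISTINCT compares whole rows, so the differing product keeps both rows of order 1.
with_product = conn.execute(
    "SELECT DISTINCT order_no, product, order_total "
    "FROM order_lines ORDER BY order_no, product"
).fetchall()

# Without the product column the two rows of order 1 are identical and collapse.
without_product = conn.execute(
    "SELECT DISTINCT order_no, order_total FROM order_lines ORDER BY order_no"
).fetchall()
conn.close()

print(with_product)
print(without_product)
```

The first query returns three rows and the second returns two, one per order.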

SQL SUM with Repeating Sub Entries - Best Practice?

I hit this issue regularly but here is an example....
I have Order and Delivery tables. Each order can have one to many deliveries.
I need to report totals based on the Order Table but also show deliveries line by line.
I can write the SQL and associated Access Report for this with ease ....
SELECT xxx
FROM
Order
LEFT OUTER JOIN
Delivery on Delivery.OrderNO = Order.OrderNo
until I get to the summing element. I obviously only want to sum each Order once, not the 1-many times there are deliveries for that order.
e.g. The SQL might return the following based on 2 Orders (ignore the banalness of the report, this is very much simplified)
Region OrderNo Value Delivery Date
North 1 £100 12-04-2012
North 1 £100 14-04-2012
North 2 £73 01-05-2012
North 2 £73 03-05-2012
North 2 £73 07-05-2012
South 3 £50 23-04-2012
I would want to report:
Total Sales North - £173
Delivery 12-04-2012
Delivery 14-04-2012
Delivery 01-05-2012
Delivery 03-05-2012
Delivery 07-05-2012
Total Sales South - £50
Delivery 23-04-2012
The bit I'm referring to is the calculation of the £173 and £50, the first of which obviously shouldn't be £419!
In the past I've used things like MAX (for a given Order) but that seems like a fudge.
Surely there must be a regular answer to this seemingly common problem but I can't find one.
I don't necessarily need the code - just a helpful point in the right direction.
Many thanks,
Chris.
A ROLLUP operator may not look pretty. However, it would compute the regular aggregates that you see now, and it would also show the subtotals. This is what you're looking for.
SELECT xxx
FROM
Order
LEFT OUTER JOIN
Delivery on Delivery.OrderNO = Order.OrderNo
GROUP BY xxx
WITH ROLLUP;
I'm not exactly sure how the rest of your query is set up, but it would look something like this:
Region OrderNo Value Delivery Date
North 1 £100 12-04-2012
North 1 £100 14-04-2012
North 2 £73 01-05-2012
North 2 £73 03-05-2012
North 2 £73 07-05-2012
NULL NULL £419 NULL
I believe what you want is called a windowing function for your aggregate operation. It looks like the following:
SELECT xxx, SUM(Value) OVER (PARTITION BY Order.Region) as OrderTotal
FROM
Order
LEFT OUTER JOIN
Delivery on Delivery.OrderNO = Order.OrderNo
Here's the MSDN article. The PARTITION BY tells the SUM to be done separately for each distinct Order.Region.
Edit: I just noticed that I missed what you said about orders being counted multiple times. One thing you could do is SUM() the values before joining, as a CTE (guessing at your schema a bit):
WITH RegionOrders AS (
SELECT Region, OrderNo, Value, SUM(Value) OVER (PARTITION BY Region) AS RegionTotal
FROM Order
)
SELECT Region, OrderNo, Value, DeliveryDate, RegionTotal
FROM RegionOrders RO
INNER JOIN Delivery D on D.OrderNo = RO.OrderNo
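The "sum before joining" idea can be tried out with Python's built-in sqlite3 module. This sketch swaps the window function for a plain GROUP BY subquery, which computes the same per-region total, and it guesses at the schema: the table is named Orders here (ORDER is a reserved word) and the DeliveryDate column name is an assumption. The data is taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed schema; values copied from the question.
CREATE TABLE Orders (OrderNo INTEGER PRIMARY KEY, Region TEXT, Value INTEGER);
INSERT INTO Orders VALUES (1, 'North', 100), (2, 'North', 73), (3, 'South', 50);

CREATE TABLE Delivery (OrderNo INTEGER, DeliveryDate TEXT);
INSERT INTO Delivery VALUES
    (1, '2012-04-12'), (1, '2012-04-14'),
    (2, '2012-05-01'), (2, '2012-05-03'), (2, '2012-05-07'),
    (3, '2012-04-23');
""")

# Naive version: summing after the join counts each order once per delivery.
naive = conn.execute("""
    SELECT o.Region, SUM(o.Value)
    FROM Orders o
    LEFT JOIN Delivery d ON d.OrderNo = o.OrderNo
    GROUP BY o.Region
    ORDER BY o.Region
""").fetchall()

# Aggregate per region first, then attach the deliveries line by line.
detail = conn.execute("""
    SELECT o.Region, o.OrderNo, d.DeliveryDate, t.RegionTotal
    FROM Orders o
    JOIN (SELECT Region, SUM(Value) AS RegionTotal
          FROM Orders
          GROUP BY Region) t ON t.Region = o.Region
    LEFT JOIN Delivery d ON d.OrderNo = o.OrderNo
    ORDER BY o.Region, d.DeliveryDate
""").fetchall()
conn.close()

region_totals = sorted({(row[0], row[3]) for row in detail})
print(naive)
print(region_totals)
```

The naive query reports £419 for North, while the pre-aggregated version carries £173 for North and £50 for South on every delivery line.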

MDX query to use a set but return a single row

I am new to MDX and have just started using Named sets to group several members of a dimension.
Whenever I use a SET in a query, the results returned are always detailed out for each individual member of the set. I am looking to get one row for the set.
For example: I have two Measures: Sales Dollars and Shipped Units. I then have a State dimension for each of the 50 states in the United States.
I want to see the Sales and Units measures for 3 specific states and then also for a group (Named Set) of 4 other states.
Example MDX:
With SET [My Favorite States] AS '{[States].[Illinois], [States].[Wisconsin]}'
select NON EMPTY {[Measures].[Sales], [Measures].[Shipped Units]} ON COLUMNS,
NON EMPTY {[States].[Alabama], [States].[New York], [My Favorite States]} ON ROWS
from [cubename]
This returns:
Measures
States Sales Shipped Units
Alabama $100 5
New York $500 20
Illinois $150 15
Wisconsin $900 25
What I want is for the Set to appear as a total on a single line. Similar to:
Measures
States Sales Shipped Units
Alabama $100 5
New York $500 20
My Favorite States $1,050 40
Is there an MDX function that will allow the set of specific members to be treated as a group?
You can use a calculated member to aggregate the separate states:
With Member [States].[My Favorite States] AS 'Aggregate({[States].[Illinois], [States].[Wisconsin]})'
select NON EMPTY {[Measures].[Sales], [Measures].[Shipped Units]} ON COLUMNS,
NON EMPTY {[States].[Alabama], [States].[New York], [States].[My Favorite States]} ON ROWS
from [cubename]
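What the calculated member computes can be illustrated in plain Python. This is a sketch under the assumption that both measures are additive (summed), in which case Aggregate over the two states amounts to adding their values; the cells dictionary and the aggregate helper are made up for the illustration, with figures taken from the result grid in the question.

```python
# Per-state values from the result grid in the question.
cells = {
    "Alabama": {"Sales": 100, "Shipped Units": 5},
    "New York": {"Sales": 500, "Shipped Units": 20},
    "Illinois": {"Sales": 150, "Shipped Units": 15},
    "Wisconsin": {"Sales": 900, "Shipped Units": 25},
}


def aggregate(members, measure):
    """Sum one measure over a set of members (assumes an additive measure)."""
    return sum(cells[member][measure] for member in members)


favorite_states = ["Illinois", "Wisconsin"]

# One row per individual state, then a single row for the whole group.
rows = [(state, cells[state]["Sales"], cells[state]["Shipped Units"])
        for state in ("Alabama", "New York")]
rows.append((
    "My Favorite States",
    aggregate(favorite_states, "Sales"),
    aggregate(favorite_states, "Shipped Units"),
))

for row in rows:
    print(row)
```

The last row comes out as 1050 and 40, the single-line total requested in the question.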