In my cube I have a Fact Order Line, which has a variable Order Cost. This variable is of course unique per Order and has the same value in every Order Line of an Order.
Now I want to create a calculated field that sums the Order Cost, but only takes this value once for every order.
So, using the calculated member, this
+-------------------+--------------+------------+
| Order Line Number | Order Number | Order Line |
| | | Order Cost |
+-------------------+--------------+------------+
| 10 | 1 | $0.20 |
| 11 | 1 | $0.20 |
| 20 | 2 | $0.25 |
+-------------------+--------------+------------+
has to become this
+--------------+------------+
| Order Number | Order Cost |
+--------------+------------+
| 1 | $0.20 |
| 2 | $0.25 |
+--------------+------------+
The MDX expression I currently have (see below) sums over the order lines, making the Order Cost $0.40 for Order Number 1.
SUM(
DISTINCT(
CROSSJOIN(
[Order Line Details].[Order Number].[All].Children, [Measures].[Order Line Order Cost]
)
)
)
What do I need to change to get the desired behavior?
Please let me know if there is anything unclear regarding the question.
Solution
OK, I found the problem. I changed the Aggregate Behaviour of [Measures].[Order Line Order Cost] to min. After that your initial solution worked. Thanks for the help!
Does the below work? I have got rid of the unnecessary CROSSJOIN, put the DISTINCT function on [Order Line Details].[Order Number].Children, and used the SUM function to add up the Order Line Order Cost against the order numbers.
SUM(
DISTINCT([Order Line Details].[Order Number].Children)
, [Measures].[Order Line Order Cost]
)
EDIT
Try the below code:
WITH
SET DistinctOrderNumbers AS
DISTINCT(EXISTING [Order Line Details].[Order Number].Children)
MEMBER [Measures].[Order Cost] AS
SUM(DistinctOrderNumbers, [Measures].[Order Line Order Cost])
SELECT NON EMPTY { [Measures].[Order Cost] } ON COLUMNS,
NON EMPTY { ([Reseller].[Reseller].[Reseller].ALLMEMBERS ) } ON ROWS
FROM [BI Cube]
EDIT2 (avg not sum)
WITH
SET DistinctOrderNumbers AS
DISTINCT(EXISTING [Order Line Details].[Order Number].Children)
MEMBER [Measures].[Order Cost] AS
AVG(DistinctOrderNumbers, [Measures].[Order Line Order Cost])
SELECT NON EMPTY { [Measures].[Order Cost] } ON COLUMNS,
NON EMPTY { ([Reseller].[Reseller].[Reseller].ALLMEMBERS ) } ON ROWS
FROM [BI Cube]
Please try the below:
MIN(
[Order Line Details].[Order Number].[All].Children
,[Measures].[Order Line Order Cost]
)
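For anyone more at home in SQL, the idea behind the MIN aggregation can be sketched outside the cube: collapse each order to a single cost first, then sum those. This is only an illustration in Python with an in-memory SQLite table, not MDX; the table and column names (order_line, order_number, order_cost_cents) are made up for the example, the three rows are the ones from the question, and costs are stored as whole cents to avoid floating-point noise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_line ("
    "line_number INTEGER, order_number INTEGER, order_cost_cents INTEGER)"
)
# The three order lines from the question ($0.20, $0.20, $0.25).
conn.executemany(
    "INSERT INTO order_line VALUES (?, ?, ?)",
    [(10, 1, 20), (11, 1, 20), (20, 2, 25)],
)

# Naive sum over order lines: counts order 1's cost twice.
naive_total = conn.execute(
    "SELECT SUM(order_cost_cents) FROM order_line"
).fetchone()[0]

# Take the cost once per order (MIN works because it is constant
# within an order), then add the per-order values.
per_order = conn.execute(
    "SELECT order_number, MIN(order_cost_cents) "
    "FROM order_line GROUP BY order_number ORDER BY order_number"
).fetchall()
once_per_order_total = sum(cost for _, cost in per_order)

print(naive_total, per_order, once_per_order_total)
```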
Write a query that will return a table with the following columns:
User ID, Site ID, User Name, Total Sales, Total Refunds, Net Amount Collected
I need to write a query that returns this table, and at the moment I'm still trying to figure it out. Thanks.
I tried a SELECT statement but failed.
I agree with the others that simply handing this over will not help your learning much. There are a bunch of concepts that you need to learn here. Submitting this answer for your homework might be awkward (and result in a score of 0) if you can't explain it!
Common Table Expressions
Aggregate Functions
Outer Joins
with cte_sales as
(
select
t.[User Id],
t.[Site Id],
sum(t.Amount) as [Total Sales]
from Transactions t
where t.[Transaction Type] = 'Sale'
group by t.[User Id],
t.[Site Id]
),
cte_refunds as
(
select
t.[User Id],
t.[Site Id],
sum(t.Amount) as [Total Refunds]
from Transactions t
where t.[Transaction Type] = 'Refund'
group by t.[User Id],
t.[Site Id]
)
select
u.[User Id],
u.[Site Id],
u.[Name] as [User Name],
coalesce(s.[Total Sales],0) as [Total Sales],
abs(coalesce(r.[Total Refunds],0)) as [Total Refunds],
(coalesce(s.[Total Sales],0) + coalesce(r.[Total Refunds],0)) as [Net Amount Collected]
from Users u
left join cte_sales s on s.[User Id] = u.[User Id]
and s.[Site Id] = u.[Site Id]
left join cte_refunds r on r.[User Id] = u.[User Id]
and r.[Site Id] = u.[Site Id]
order by u.[User Id],
u.[Site Id];
Demo
| User Id | Site Id | User Name | Total Sales | Total Refunds | Net Amount Collected |
|---------|---------|-----------|-------------|---------------|----------------------|
| 1 | 1 | Arthur | 120 | 120 | 0 |
| 2 | 1 | Aaron | 90 | 30 | 60 |
| 2 | 2 | Brett | 90 | 0 | 90 |
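To see why the outer joins and COALESCE matter, here is a small sketch in Python with an in-memory SQLite database. The snake_case table and column names and the individual transaction rows are assumptions chosen so the totals line up with the demo output; refunds are assumed to be stored as negative amounts, as the query above implies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER, site_id INTEGER, name TEXT);
CREATE TABLE transactions (
    user_id INTEGER, site_id INTEGER, type TEXT, amount INTEGER);
INSERT INTO users VALUES (1, 1, 'Arthur'), (2, 1, 'Aaron'), (2, 2, 'Brett');
INSERT INTO transactions VALUES
    (1, 1, 'Sale', 120), (1, 1, 'Refund', -120),
    (2, 1, 'Sale', 90),  (2, 1, 'Refund', -30),
    (2, 2, 'Sale', 90);
""")

# Brett has no refund rows, so the LEFT JOIN yields NULL for him;
# COALESCE turns that NULL into 0 so arithmetic on it still works.
rows = conn.execute("""
    WITH refunds AS (
        SELECT user_id, site_id, SUM(amount) AS total_refunds
        FROM transactions
        WHERE type = 'Refund'
        GROUP BY user_id, site_id
    )
    SELECT u.name, r.total_refunds, COALESCE(r.total_refunds, 0)
    FROM users u
    LEFT JOIN refunds r
           ON r.user_id = u.user_id AND r.site_id = u.site_id
    ORDER BY u.user_id, u.site_id
""").fetchall()

print(rows)
```

With an inner join instead, Brett would drop out of the result entirely.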
I have a PostgreSQL query that yields the following results:
SELECT o.order || '-' || osh.ordinal_number AS order,
o.company,
o.order_total,
SUM(osh.items) AS order_shipment_total,
o.order_type
FROM orders o
JOIN order_shipments osh ON o.order_id = osh.order_id
WHERE o.order = [some order number]
GROUP BY o.order,
o.company,
o.order_total,
o.order_type;
order | company | order_total | order_shipment_total | order_type
-------------------------------------------------------------------
123-1 | A corp. | null | 125.00 | new
123-2 | B corp. | null | 100.00 | new
I need to replace the o.order_total (it doesn't work properly) and sum up the sum of the order_shipment_total column so that, for the example above, each row winds up saying 225.00. I need the results above to look like this below:
order | company | order_total | order_shipment_total | order_type
-------------------------------------------------------------------
123-1 | A corp. | 225.00 | 125.00 | new
123-2 | B corp. | 225.00 | 100.00 | new
What I've Tried
1.) To replace o.order_total, I've tried SUM(SUM(osh.items)) but get the error message that you cannot nest aggregate functions.
2.) I've tried to put the entire query as a subquery and sum the order_shipment_total column, but when I do, it just repeats the column itself. See below:
SELECT order,
company,
SUM(order_shipment_total) AS order_shipment_total,
order_shipment_total,
order_type
FROM (
SELECT o.order || '-' || osh.ordinal_number AS order,
o.company,
o.order_total,
SUM(osh.items) AS order_shipment_total,
o.order_type
FROM orders o
JOIN order_shipments osh ON o.order_id = osh.order_id
WHERE o.order = [some order number]
GROUP BY o.order,
o.company,
o.order_total,
o.order_type
) subquery
GROUP BY order,
company,
order_shipment_total,
order_type;
order | company | order_total | order_shipment_total | order_type
-------------------------------------------------------------------
123-1 | A corp. | 125.00 | 125.00 | new
123-2 | B corp. | 100.00 | 100.00 | new
3.) I've tried to only include the rows I actually want to group by in my subquery/query example above, because I feel like I was able to do this in Oracle SQL. But when I do that, I get an error saying "column [name] must appear in the GROUP BY clause or be used in an aggregate function."
...
GROUP BY order,
company,
order_type;
ERROR: column "[a column name]" must appear in the GROUP BY clause or be used in an aggregate function.
How do I accomplish this? I was certain that a subquery would be the answer but I'm confused as to why this approach will not work.
The thing you're not quite grasping with your query / approach is that you're actually wanting two different levels of grouping in the same query row results. The subquery approach is half right, but when you do a subquery that groups, inside another query that groups you can only use the data you've already got (from the subquery) and you can only choose to keep it at the level of aggregate detail it already is, or you can choose to lose precision in favor of grouping more. You can't keep the detail AND lose the detail in order to sum up further. A query-of-subquery is hence (in practical terms) relatively senseless because you might as well group to the level you want in one hit:
SELECT groupkey1, sum(sumx) FROM
(SELECT groupkey1, groupkey2, sum(x) as sumx FROM table GROUP BY groupkey1, groupkey2)
GROUP BY groupkey1
Is the same as:
SELECT groupkey1, sum(x) FROM
table
GROUP BY groupkey1
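If you want to convince yourself of that equivalence, a quick check along these lines should do. It is a sketch in Python with an in-memory SQLite table; the table name t and the four rows are invented for the test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (groupkey1 TEXT, groupkey2 TEXT, x INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("a", "p", 1), ("a", "p", 2), ("a", "q", 3), ("b", "p", 4)],
)

# Group to the fine level first, then group the result again.
nested = conn.execute("""
    SELECT groupkey1, SUM(sumx)
    FROM (SELECT groupkey1, groupkey2, SUM(x) AS sumx
          FROM t
          GROUP BY groupkey1, groupkey2) AS inner_groups
    GROUP BY groupkey1
    ORDER BY groupkey1
""").fetchall()

# Group straight to the coarse level in one hit.
flat = conn.execute("""
    SELECT groupkey1, SUM(x)
    FROM t
    GROUP BY groupkey1
    ORDER BY groupkey1
""").fetchall()

print(nested, flat, nested == flat)
```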
Gordon's answer will probably work out (except for the same bug yours exhibits in that the grouping set is wrong/doesn't cover all the columns) but it probably doesn't help much in terms of your understanding because it's a code-only answer. Here's a breakdown of how you need to approach this problem but with simpler data and foregoing the window functions in favor of what you already know.
Suppose there are apples and melons, of different types, in stock. You want a query that gives a total of each specific kind of fruit, regardless of the date of purchase. You also want a column for the total for each fruit overall type:
Detail:
fruit | type | purchasedate | count
apple | golden delicious | 2017-01-01 | 3
apple | golden delicious | 2017-01-02 | 4
apple | granny smith | 2017-01-04 | 2
melon | honeydew | 2017-01-01 | 1
melon | cantaloupe | 2017-01-05 | 4
melon | cantaloupe | 2017-01-06 | 2
So that's 7 golden delicious, 2 granny smith, 1 honeydew, 6 cantaloupe, and it's also 9 apples and 7 melons.
You can't do it as one query*, because you want two different levels of grouping. You have to do it as two queries and then (critical understanding point) you have to join the less-precise (apples/melons) results back to the more precise (granny smiths/golden delicious/honeydew/cantaloupe):
SELECT * FROM
(
SELECT fruit, type, sum(count) as fruittypecount
FROM fruit
GROUP BY fruit, type
) fruittypesum
INNER JOIN
(
SELECT fruit, sum(count) as fruitcount
FROM fruit
GROUP BY fruit
) fruitsum
ON
fruittypesum.fruit = fruitsum.fruit
You'll get this:
fruit | type | fruittypecount | fruit | fruitcount
apple | golden delicious | 7 | apple | 9
apple | granny smith | 2 | apple | 9
melon | honeydew | 1 | melon | 7
melon | cantaloupe | 6 | melon | 7
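The fruit example can be run end to end. The sketch below loads the six detail rows into an in-memory SQLite table from Python and runs the same two-level join; the only liberties taken are listing the columns explicitly (so fruit appears once) and adding an ORDER BY so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fruit (fruit TEXT, type TEXT, purchasedate TEXT, count INTEGER)"
)
conn.executemany(
    "INSERT INTO fruit VALUES (?, ?, ?, ?)",
    [
        ("apple", "golden delicious", "2017-01-01", 3),
        ("apple", "golden delicious", "2017-01-02", 4),
        ("apple", "granny smith", "2017-01-04", 2),
        ("melon", "honeydew", "2017-01-01", 1),
        ("melon", "cantaloupe", "2017-01-05", 4),
        ("melon", "cantaloupe", "2017-01-06", 2),
    ],
)

# Detail (fruit + type) joined back to the summary (fruit only).
rows = conn.execute("""
    SELECT fruittypesum.fruit, fruittypesum.type,
           fruittypesum.fruittypecount, fruitsum.fruitcount
    FROM (SELECT fruit, type, SUM(count) AS fruittypecount
          FROM fruit GROUP BY fruit, type) fruittypesum
    INNER JOIN
         (SELECT fruit, SUM(count) AS fruitcount
          FROM fruit GROUP BY fruit) fruitsum
      ON fruittypesum.fruit = fruitsum.fruit
    ORDER BY fruittypesum.fruit, fruittypesum.type
""").fetchall()

for row in rows:
    print(row)
```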
Hence for your query, different groups, detail and summary:
SELECT
detail.order || '-' || detail.ordinal_number as order,
detail.company,
summary.order_total,
detail.order_shipment_total,
detail.order_type
FROM (
SELECT o.order,
osh.ordinal_number,
o.company,
SUM(osh.items) AS order_shipment_total,
o.order_type
FROM orders o
JOIN order_shipments osh ON o.order_id = osh.order_id
WHERE o.order = [some order number]
GROUP BY o.order,
osh.ordinal_number,
o.company,
o.order_type
) detail
INNER JOIN
(
SELECT o.order,
SUM(osh.items) AS order_total
FROM orders o
JOIN order_shipments osh ON o.order_id = osh.order_id
--don't need the where clause; we'll join on order number
GROUP BY o.order
) summary
ON
summary.order = detail.order
Gordon's query uses a window function to achieve the same effect; the window function runs after the grouping is done, and it establishes another level of grouping (PARTITION BY ordernumber) which is the effective equivalent of my GROUP BY ordernumber in the summary. The window function summary data is inherently connected to the detail data via ordernumber; it is implicit that a query saying:
SELECT
ordernumber,
lineitemnumber,
SUM(amount) linetotal,
sum(SUM(amount)) over(PARTITION BY ordernumber) ordertotal
GROUP BY
ordernumber,
lineitemnumber
...will have an ordertotal that is the total of all the linetotal values in the order. The GROUP BY prepares the data to the line level detail, and the window function prepares data to just the order level, repeating the total as many times as necessary to fill in for every line item. I wrote the SUM that belongs to the GROUP BY operation in capitals; the sum in lowercase belongs to the partition operation. It has to be sum(SUM()) and cannot simply say sum(amount), because amount as a column is not allowed on its own: it's not in the GROUP BY. Because amount is not allowed on its own and has to be SUMmed for the GROUP BY to work, we have to sum(SUM()) for the partition to run (it runs after the GROUP BY is done).
It behaves exactly the same as grouping to two different levels and joining together, and indeed I chose that way to explain it because it makes it more clear how it's working in relation to what you already know about groups and joins
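Here is a runnable sketch of that sum(SUM()) pattern, using Python with an in-memory SQLite database. The table name order_lines and its rows are hypothetical, and it assumes an SQLite build with window-function support (3.25 or newer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_lines ("
    "ordernumber INTEGER, lineitemnumber INTEGER, amount INTEGER)"
)
# Line item 1 of order 1 has two rows, so the GROUP BY has work to do.
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 1, 5), (1, 2, 20), (2, 1, 7)],
)

rows = conn.execute("""
    SELECT ordernumber,
           lineitemnumber,
           SUM(amount) AS linetotal,                 -- belongs to GROUP BY
           sum(SUM(amount)) OVER (PARTITION BY ordernumber)
               AS ordertotal                         -- runs after grouping
    FROM order_lines
    GROUP BY ordernumber, lineitemnumber
    ORDER BY ordernumber, lineitemnumber
""").fetchall()

for row in rows:
    print(row)
```

Each line keeps its own linetotal while ordertotal repeats for every line of the same order, which is the same shape the two-subquery join produces.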
Remember: JOINs make datasets grow sideways, UNIONs make them grow downwards. When you have some detail data and you want to grow it sideways with some more data (a summary), JOIN it on. (If you'd wanted totals to go at the bottom of each column, it would be unioned on.)
*you can do it as one query (without window functions), but it can get awfully confusing because it requires all sorts of trickery that ultimately isn't worth it because it's too hard to maintain
You should be able to use window functions:
SELECT o.order || '-' || osh.ordinal_number AS order, o.company,
SUM(SUM(osh.items)) OVER (PARTITION BY o.order) as order_total,
SUM(osh.items) AS order_shipment_total,
o.order_type
FROM orders o JOIN
order_shipments osh
ON o.order_id = osh.order_id
WHERE o.order = [some order number]
GROUP BY o.order, o.company, o.order_type;
I am having trouble getting my head around the following sum. I have a table of items, which shows the location, the item code and the size code. The unique feature of this table is there is a flag that determines whether the location will stock a particular item.
I have another table that shows the stock movements of the items. This table also shows location, item, size and either a positive or negative entry. The sum of the positive/negative entries give the current stock holding.
What I can't seem to do is say: SUM the stock movement quantities where the item and size are marked with 'location can stock this item'.
The first select statement brings back items that can be stocked by location
select
S.[Location Code],S.[Item No_],S.[size]
from [Stockkeeping Units] S
where [Range in Location] = 1
The results return a list as:
location Code| Item no | Size
1 | SHIRT1 | s
1 | SHIRT1 | m
1 | SHIRT2 | s
1 | SHIRT2 | m
2 | SHIRT1 | s
2 | SHIRT2 | m
The second select statement brings back the current stock for an item by location
select
L.[Location Code],L.[Item No_],L.[size],
sum(L.[Quantity]) as Quantity
from [Item Ledger Entry] L
group by L.[Location Code],L.[Item No_],L.[size]
location Code| Item no | Size | Quantity
1 | SHIRT1 | s | 5
1 | SHIRT1 | m | 3
1 | SHIRT2 | s | 5
1 | SHIRT2 | m | 7
2 | SHIRT1 | s | 3
2 | SHIRT2 | m | 0
It is when I try to join these tables, to bring back the combination of the first two select statements, that it goes astray
select L.[Location Code],L.[Item No_], L.[Variant Code],
sum(L.[Quantity]) as Quantity
from [$Item Ledger Entry] L
join [Stockkeeping Unit] on [Item Ledger Entry].[Item No] = [Stockkeeping Unit].[Item No_]
where [Stockkeeping Unit].[Range in Location] = 1
group by L.[Location Code],L.[Item No_],L.[Variant Code]
What i would like to see is:
location|item no|size|quantity where range in location is yes
The joined query is bringing back results that ignore the [Stockkeeping Unit].[Range in Location] = 1 condition.
The joined query is also not returning the same SUM results as the second SELECT query.
It looks like you meant this:
SELECT [Location Code], [Item No_], [size], SUM([Quantity]) AS Quantity
FROM [Item Ledger Entry] L
WHERE EXISTS (
SELECT *
FROM [Stockkeeping Units] S
WHERE S.[Range in Location]=1
AND S.[Location code]=L.[Location code]
AND S.[Item No_]=L.[Item No_]
AND S.[Size]=L.[Size]
)
GROUP BY L.[Location Code], L.[Item No_], L.[size];
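To show why the EXISTS form behaves differently from a join on the item number alone, here is a sketch in Python with an in-memory SQLite database. The table names (sku, ledger), the column names and the rows are simplified stand-ins invented for the example, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sku (location INTEGER, item TEXT, size TEXT, in_range INTEGER);
CREATE TABLE ledger (location INTEGER, item TEXT, size TEXT, quantity INTEGER);
INSERT INTO sku VALUES
    (1, 'SHIRT1', 's', 1), (1, 'SHIRT1', 'm', 1),
    (2, 'SHIRT1', 's', 1), (2, 'SHIRT1', 'm', 0);
INSERT INTO ledger VALUES
    (1, 'SHIRT1', 's', 8), (1, 'SHIRT1', 's', -3), (1, 'SHIRT1', 'm', 3),
    (2, 'SHIRT1', 's', 3), (2, 'SHIRT1', 'm', 4);
""")

# Joining on item only: every ledger row matches all three flagged
# SKU rows, so sums are tripled and location 2 / size m sneaks in.
joined = conn.execute("""
    SELECT l.location, l.item, l.size, SUM(l.quantity)
    FROM ledger l
    JOIN sku s ON s.item = l.item
    WHERE s.in_range = 1
    GROUP BY l.location, l.item, l.size
    ORDER BY l.location, l.item, l.size
""").fetchall()

# EXISTS on the full key: each ledger row is kept at most once,
# and only when its own location/item/size is flagged.
filtered = conn.execute("""
    SELECT l.location, l.item, l.size, SUM(l.quantity)
    FROM ledger l
    WHERE EXISTS (SELECT 1 FROM sku s
                  WHERE s.in_range = 1
                    AND s.location = l.location
                    AND s.item = l.item
                    AND s.size = l.size)
    GROUP BY l.location, l.item, l.size
    ORDER BY l.location, l.item, l.size
""").fetchall()

print(joined)
print(filtered)
```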
Apologies if this has been asked elsewhere. I have been looking on Stackoverflow all day and haven't found an answer yet. I am struggling to write the query to find the highest month's sales for each state from this example data.
The data looks like this:
| order_id | month | cust_id | state | prod_id | order_total |
+-----------+--------+----------+--------+----------+--------------+
| 67212 | June | 10001 | ca | 909 | 13 |
| 69090 | June | 10011 | fl | 44 | 76 |
... etc ...
My query
SELECT `month`, `state`, SUM(order_total) AS sales
FROM orders GROUP BY `month`, `state`
ORDER BY sales;
| month | state | sales |
+------------+--------+--------+
| September | wy | 435 |
| January | wy | 631 |
... etc ...
returns a few hundred rows: the sum of sales for each month for each state. I want it to only return the month with the highest sum of sales, but for each state. It might be a different month for different states.
This query
SELECT `state`, MAX(order_sum) as topmonth
FROM (SELECT `state`, SUM(order_total) order_sum FROM orders GROUP BY `month`,`state`)
GROUP BY `state`;
| state | topmonth |
+--------+-----------+
| ca | 119586 |
| ga | 30140 |
returns the correct number of rows with the correct data. BUT I would also like the query to give me the month column. Whatever I try with GROUP BY, I cannot find a way to limit the results to one record per state. I have tried PartitionBy without success, and have also tried unsuccessfully to do a join.
TL;DR: one query gives me the correct columns but too many rows; the other query gives me the correct number of rows (and the correct data) but insufficient columns.
Any suggestions to make this work would be most gratefully received.
I am using Apache Drill, which is apparently ANSI-SQL compliant. Hopefully that doesn't make much difference - I am assuming that the solution would be similar across all SQL engines.
This one should do the trick
SELECT t1.`month`, t1.`state`, t1.`sales`
FROM (
/* this one selects month, state and sales*/
SELECT `month`, `state`, SUM(order_total) AS sales
FROM orders
GROUP BY `month`, `state`
) AS t1
JOIN (
/* this one selects the best value for each state */
SELECT `state`, MAX(sales) AS best_month
FROM (
SELECT `month`, `state`, SUM(order_total) AS sales
FROM orders
GROUP BY `month`, `state`
)
GROUP BY `state`
) AS t2
ON t1.`state` = t2.`state` AND
t1.`sales` = t2.`best_month`
It's basically the combination of the two queries you wrote.
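A small runnable version of the same join, sketched in Python against an in-memory SQLite table rather than Drill. The five order rows are invented, and the derived-table aliases (monthly, best) are my own names. Note that if two months tie for the top spot in a state, this approach returns both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (month TEXT, state TEXT, order_total INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("June", "ca", 13), ("June", "ca", 20), ("July", "ca", 40),
        ("June", "fl", 76), ("July", "fl", 50),
    ],
)

# Monthly totals per state, reused for both sides of the join.
monthly = """
    SELECT month, state, SUM(order_total) AS sales
    FROM orders
    GROUP BY month, state
"""

rows = conn.execute(f"""
    SELECT t1.month, t1.state, t1.sales
    FROM ({monthly}) AS t1
    JOIN (SELECT state, MAX(sales) AS best_month
          FROM ({monthly}) AS m
          GROUP BY state) AS t2
      ON t1.state = t2.state AND t1.sales = t2.best_month
    ORDER BY t1.state
""").fetchall()

print(rows)
```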
Try this:
SELECT `month`, `state`, SUM(order_total) FROM orders WHERE `month` IN
( SELECT TOP 1 t.month FROM ( SELECT `month` AS month, SUM(order_total) order_sum FROM orders GROUP BY `month`
ORDER BY order_sum DESC) t)
GROUP BY `month`, state ;
I recently posted a question about a SQL Where Statement/Grouping here:
SQL statement using WHERE from a GROUP or RANK
Now I've got somewhat of a follow-up.
So similar to the previous question, let's assume I have a table of say 35,000 rows with these columns:
Sales Rep | Parent Account ID| Account ID | Total Contract Value | Date
Each row is individual by account id but multiple account IDs can fall under a parent account ID.
Similar to the responses on the first question, this is probably going to be a table w/i a table. So first, everything has to be grouped by Sales Rep. From that, everything needs to be grouped by Parent Account ID where the grouped total contract value of all the accounts is >= 10,000. Then everything will be displayed and ranked by the total TCV of the Parent account ID and I need the top 35 Parent account IDs by agent.
So the first couple of lines of data may look like this:
Sales Rep | Parent Account ID| Account ID | Total Contract Value | Date | Rank
John Doe | ParentABC12345 | ABC425 | 5,000 | 1/2/2013 |1
John Doe | ParentABC12345 | ABC426 | 10,000 | 1/2/2013 |1
John Doe | ParentDJE12345 | DJE523 | 11,000 | 1/2/2013 |2
John Doe | ParentFBC12345 | FBC6723 | 4,000 | 1/2/2013 |3
John Doe | ParentFBC12345 | FBC6727 | 4,000 | 1/2/2013 |3
Notice how the ranking works based off of the parent Account ID. The account ID DJE523 has the single greatest TCV but it's ranked second b/c the grouped value of parent account ID ParentABC12345 is greater. So there would be a ranking of 35 parent account IDs but in that ranking there could be say 100+ lines of actual data.
Any thoughts?
Always nice to follow up. The "parent rank" is added as an INNER JOIN.
Edit: As correctly mentioned by Dan Bracuk, my first answer was not correct. I altered the query to meet the correct conditions. I also applied the timespan to the Parent Accounts.
DECLARE @minimumValue decimal(20,2) = 10000
DECLARE @numberOfAccounts int = 35
DECLARE @from datetime = '1/1/2013'
DECLARE @till datetime = DATEADD(MONTH, 1, @from)
SELECT
[sub].[Sales Rep],
[sub].[Rank],
[sub].[Account ID],
[sub].[Total Contract Value],
[sub].[Parent Account ID],
[sub].[Total],
[sub].[ParentRank]
FROM
(
SELECT
[s].[Sales Rep],
[s].[Account ID],
[s].[Total Contract Value],
DENSE_RANK() OVER (PARTITION BY [s].[Sales Rep] ORDER BY [s].[Total Contract Value] DESC) AS [Rank],
[p].[Parent Account ID],
[p].[Total],
[p].[ParentRank]
FROM [Sales] [s]
INNER JOIN
(
SELECT
[Parent Account ID],
SUM([Total Contract Value]) AS [Total],
RANK() OVER(ORDER BY SUM([Total Contract Value]) DESC) AS [ParentRank]
FROM [Sales]
WHERE [Date] > @from AND [Date] < @till
GROUP BY [Parent Account ID]
HAVING SUM([Total Contract Value]) > @minimumValue
) AS [p] ON [s].[Parent Account ID] = [p].[Parent Account ID]
WHERE [Date] > @from AND [Date] < @till
) AS [sub]
WHERE [sub].[Rank] <= @numberOfAccounts
ORDER BY
[Sales Rep] ASC,
[ParentRank] ASC,
[Rank] ASC
And here is a new Fiddle.
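For reference, the ranking described in the question (rank the parent accounts within each sales rep by their grouped TCV, then show every account row with its parent's rank) can be sketched like this in Python with an in-memory SQLite database. It is an illustration of the idea, not the T-SQL above: the table and column names are simplified, the rows are the five sample lines from the question, the date filter is left out, and the 10,000 threshold is not applied because the sample itself shows a parent totalling 8,000. It assumes an SQLite build with window functions (3.25 or newer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sales_rep TEXT, parent_id TEXT, account_id TEXT, tcv INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("John Doe", "ParentABC12345", "ABC425", 5000),
        ("John Doe", "ParentABC12345", "ABC426", 10000),
        ("John Doe", "ParentDJE12345", "DJE523", 11000),
        ("John Doe", "ParentFBC12345", "FBC6723", 4000),
        ("John Doe", "ParentFBC12345", "FBC6727", 4000),
    ],
)

rows = conn.execute("""
    WITH parent_totals AS (
        SELECT sales_rep, parent_id, SUM(tcv) AS total
        FROM sales
        GROUP BY sales_rep, parent_id
    ),
    ranked AS (
        SELECT sales_rep, parent_id,
               DENSE_RANK() OVER (PARTITION BY sales_rep
                                  ORDER BY total DESC) AS parent_rank
        FROM parent_totals
    )
    SELECT s.account_id, s.tcv, r.parent_rank
    FROM sales s
    JOIN ranked r
      ON r.sales_rep = s.sales_rep AND r.parent_id = s.parent_id
    WHERE r.parent_rank <= 35          -- top 35 parents per rep
    ORDER BY s.sales_rep, r.parent_rank, s.account_id
""").fetchall()

for row in rows:
    print(row)
```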
I think this will do it for you, if you're using SQL Server:
Select top 35
SalesRep,
ParentAccountId,
sum(TotalContractValue) from Table
group by SalesRep, ParentAccountId
order by sum(TotalContractValue) desc