I have two dimensions called [Session Length] and [Date], and a measure called [Count - Logins] in my Cube. The [Session Length] dimension contains an attribute called [Session Length] whose members are integers from 0 to 240.
I would like to write an MDX query that aggregates [Count - Logins] over customized subsets of the [Session Length] dimension (i.e. I want to create a customized set based on the [Session Length] dimension and aggregate the count over the individual members of this custom set). Here is the query I have come up with so far, but unfortunately I have no clue how to move forward:
WITH SET [Description] AS {
[SessionLength].[Session Length].&[0], //Glimpse
[SessionLength].[Session Length].&[1]:[SessionLength].[Session Length].&[5], //Short
[SessionLength].[Session Length].&[6]:[SessionLength].[Session Length].&[30], //Medium
[SessionLength].[Session Length].&[31]:[SessionLength].[Session Length].&[90], //Long
[SessionLength].[Session Length].&[90]:[SessionLength].[Session Length].&[240]} //Extended
MEMBER [SessionLength].[Session Length].SessionDescription AS
Aggregate([Description])
SELECT
{ [Measures].[Count - Logins] }
ON COLUMNS,
NONEMPTY({[SessionLength].[Session Length].SessionDescription} * {[Date].[Date].[Date]}) ON ROWS
FROM MyCube
With the following sample result set:
Session Length | Date | Count - Logins
-------------------------------------------------
SessionDescription | 2014-02-01 | 22
SessionDescription | 2014-02-01 | 17
As you can see the count is being aggregated over the whole set and not each member individually. Here is the result I'm hoping to produce:
Session Length | Date | Count - Logins
-------------------------------------------------
Glimpse | 2014-02-01 | 3
Short | 2014-02-01 | 4
Medium | 2014-02-01 | 9
Long | 2014-02-01 | 5
Extended | 2014-02-01 | 1
Glimpse | 2014-02-02 | 2
Short | 2014-02-02 | 5
Medium | 2014-02-02 | 7
Long | 2014-02-02 | 2
Extended | 2014-02-02 | 1
Any help would be appreciated. I know this can be achieved by modifying the DSV but I don't want to alter the Cube's structure.
You must create separate SessionDescription members if you want to see separate entries on the rows, e.g. like this:
WITH
MEMBER [SessionLength].[Session Length].Glimpse AS
Aggregate([SessionLength].[Session Length].&[0])
MEMBER [SessionLength].[Session Length].Short AS
Aggregate([SessionLength].[Session Length].&[1]:[SessionLength].[Session Length].&[5])
MEMBER [SessionLength].[Session Length].Medium AS
Aggregate([SessionLength].[Session Length].&[6]:[SessionLength].[Session Length].&[30])
MEMBER [SessionLength].[Session Length].Long AS
Aggregate([SessionLength].[Session Length].&[31]:[SessionLength].[Session Length].&[90])
MEMBER [SessionLength].[Session Length].Extended AS
Aggregate([SessionLength].[Session Length].&[90]:[SessionLength].[Session Length].&[240])
SELECT
{ [Measures].[Count - Logins] }
ON COLUMNS,
NONEMPTY({
[SessionLength].[Session Length].Glimpse,
[SessionLength].[Session Length].Short,
[SessionLength].[Session Length].Medium,
[SessionLength].[Session Length].Long,
[SessionLength].[Session Length].Extended
}
* {[Date].[Date].[Date]})
ON ROWS
FROM MyCube
By the way, I left the 90 member in both Long and Extended, as it was in your original query. If you do not want to double-count it, you should remove it from one of them.
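To make the double-counting concrete, here is a minimal Python sketch of the same bucketing logic outside of MDX. The sample counts and the function name `aggregate_by_bucket` are made up for illustration; only the bucket boundaries come from the query above.

```python
# Bucket boundaries copied from the query, inclusive on both ends.
BUCKETS = {
    "Glimpse": (0, 0),
    "Short": (1, 5),
    "Medium": (6, 30),
    "Long": (31, 90),
    "Extended": (90, 240),  # 90 overlaps with Long, as in the original query
}


def aggregate_by_bucket(logins, buckets=BUCKETS):
    """Sum login counts per (bucket, date).

    `logins` maps (session_length, date) -> count (hypothetical input shape).
    A session length that falls into two buckets is counted in both.
    """
    totals = {}
    for (length, date), count in logins.items():
        for name, (low, high) in buckets.items():
            if low <= length <= high:
                key = (name, date)
                totals[key] = totals.get(key, 0) + count
    return totals


# Made-up sample data: 2 logins with a session length of exactly 90.
sample = {
    (0, "2014-02-01"): 3,
    (4, "2014-02-01"): 4,
    (90, "2014-02-01"): 2,
}
result = aggregate_by_bucket(sample)
```

With this sample, the two logins of length 90 show up under both Long and Extended, which is the double-counting mentioned above.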
Our accounting department needs to pull tax data from our MIS every month and submit it online to the Dept. of Revenue. Unfortunately, when pulling the data, it is duplicated a varying number of times depending on which jurisdictions we have to pay taxes to. All she needs is the dollar amount for one jurisdiction, for one line, because she enters that on the website.
I've tried using DISTINCT to pull only one record of each type, in conjunction with LEFT() to pull just the first 7 characters of the jurisdiction, but it ended up excluding certain results that should have been included. I believe this was because the posting date and the amount on a couple of transactions were identical. They were separate transactions, but the query treated them as duplicates and dropped them.
Here is an example of the queries I've run that pull most of the data, but usually either too much or not enough:
SELECT DISTINCT LEFT("Sales-Tax-Jurisdiction-Code", 7), "Taxable-Base", "Posting-Date"
FROM ARInvoiceTax
WHERE ("Posting-Date" >= '2019-09-01' AND "Posting-Date" <= '2019-09-30')
AND (("Sales-Tax-Jurisdiction-Code" BETWEEN '55001' AND '56763')
OR "Sales-Tax-Jurisdiction-Code" = 'Dakota Cty TT')
ORDER BY "Sales-Tax-Jurisdiction-Code"
Here is a query that I can use to pull all of the data; a sample of the result follows it:
SELECT "Sales-Tax-Jurisdiction-Code", "Taxable-Base", "Posting-Date"
FROM ARInvoiceTax
WHERE ("Posting-Date" >= '2019-09-01' AND "Posting-Date" <= '2019-09-30')
AND (("Sales-Tax-Jurisdiction-Code" BETWEEN '55001' AND '56763')
OR "Sales-Tax-Jurisdiction-Code" = 'Dakota Cty TT')
ORDER BY "Sales-Tax-Jurisdiction-Code"
Below is a sample of the output:
Jurisdiction | Tax Amount | Posting Date
-------------|------------|-------------
5512100City | $50.00 | 2019-09-02
5512100City | $50.00 | 2019-09-03
5512100City | $70.00 | 2019-09-02
5512100Cnty | $50.00 | 2019-09-02
5512100Cnty | $50.00 | 2019-09-03
5512100Cnty | $70.00 | 2019-09-02
5512100State | $70.00 | 2019-09-02
5512100State | $50.00 | 2019-09-02
5512100State | $50.00 | 2019-09-03
5513100Cnty | $25.00 | 2019-09-12
5513100State | $25.00 | 2019-09-12
5514100City | $9.00 | 2019-09-06
5514100City | $9.00 | 2019-09-06
5514100Cnty | $9.00 | 2019-09-06
5514100Cnty | $9.00 | 2019-09-06
5515100State | $12.00 | 2019-09-11
5516100City | $6.00 | 2019-09-13
5516100City | $7.00 | 2019-09-13
5516100State | $6.00 | 2019-09-13
5516100State | $7.00 | 2019-09-13
As you can see, the data can be all over the place. One zip code can have several different lines. What the accounting department does now is print a report with this information and, in a spreadsheet, record only one dollar amount per transaction. For example, for 55121 she would need to record $50.00, $50.00 and $70.00 (she tallies them and enters the total on the website); however, the SQL query gives me those three numbers three times.
I can't seem to figure out a query that will pull only one set of the data. Unfortunately, I can't do it based on the words/letters after the 00 because not all jurisdictions have all 3 (city, cnty, state) and thus trying to remove lines based on that removes valid lines as well.
Can you use select distinct? If the first five characters are the zip code and you just want that:
select distinct left(jurisdiction, 5), tax_amount
from t;
Take only City/County/.. whatever is first
select jurisdiction, tax_amount, Posting_Date
from (
    select *, dense_rank() over(partition by left(jurisdiction, 7) order by substring(jurisdiction, 8, len(jurisdiction))) rnk
    from taxes -- your output here
) t -- SQL Server requires an alias for a derived table
where rnk=1;
This is SQL Server syntax; you may need other string functions in your DBMS.
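The idea behind the dense_rank() query can be sketched in plain Python: group the rows by the first 7 characters of the jurisdiction and keep only the rows whose suffix sorts first within that group. The function name and the tuple layout below are assumptions for illustration; the sample rows are taken from the output in the question.

```python
from itertools import groupby


def first_jurisdiction_rows(rows, prefix_len=7):
    """Keep, per jurisdiction prefix, only rows of the alphabetically first suffix.

    `rows` is a list of (jurisdiction, amount, posting_date) tuples.
    """
    def prefix(row):
        return row[0][:prefix_len]

    kept = []
    # sorted() is stable, so the original row order inside a prefix is preserved.
    for _, group in groupby(sorted(rows, key=prefix), key=prefix):
        group = list(group)
        first_suffix = min(row[0][prefix_len:] for row in group)
        kept.extend(row for row in group if row[0][prefix_len:] == first_suffix)
    return kept


rows = [
    ("5512100City", 50.00, "2019-09-02"),
    ("5512100City", 50.00, "2019-09-03"),
    ("5512100City", 70.00, "2019-09-02"),
    ("5512100Cnty", 50.00, "2019-09-02"),
    ("5512100State", 70.00, "2019-09-02"),
    ("5513100Cnty", 25.00, "2019-09-12"),
    ("5513100State", 25.00, "2019-09-12"),
]
deduplicated = first_jurisdiction_rows(rows)
```

Because whole rows are kept rather than DISTINCT values, two separate transactions with the same amount and date would both survive as long as they belong to the first suffix.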
How can I get the previous date member only among the selected/visible members of a date dimension?
I've tried PREVMEMBER and LAG functions but they return previous calendar date (yesterday).
Data in OLAP cube:
DATE | SUM
-----------------
2018-09-01 | 500
2018-09-02 | 150
2018-09-03 | 300
2018-09-04 | 777
2018-09-05 | 900
2018-09-06 | 1200
2018-09-07 | 1500
In my query I'm selecting different dates in a filter and I need to get SUM of previous visible date:
DATE | SUM | PREV_SUM
-------------------------------
2018-09-02 | 150 | NULL
2018-09-04 | 777 | 150 (from 2018-09-02)
2018-09-07 | 1500 | 777 (from 2018-09-04)
My MDX query:
WITH
MEMBER PREV_MEMBER AS
MEMBERTOSTR([dim_date].[Day Id].CURRENTMEMBER.PREVMEMBER)
MEMBER PREV_MEMBER_LAG AS
MEMBERTOSTR([dim_date].[Day Id].CURRENTMEMBER.lag(1))
MEMBER PREV_SUM AS
SUM(
STRTOMEMBER(PREV_MEMBER),
[Measures].[SUM]
)
SELECT
NON EMPTY {
[Measures].[SUM],
PREV_SUM,
PREV_MEMBER,
PREV_MEMBER_LAG
} ON COLUMNS,
NON EMPTY {(
[dim_date].[Day Id].ALLMEMBERS
)} ON ROWS
FROM (
SELECT ({
[dim_date].[Day Id].&[20180902],
[dim_date].[Day Id].&[20180904],
[dim_date].[Day Id].&[20180907]
}) ON COLUMNS
FROM [cub_main]
)
My result (returns yesterday):
DATE | SUM | PREV_SUM | PREV_MEMBER | PREV_MEMBER_LAG
--------------------------------------------------------------------------------------------
20180902 | 150 | 500 | [dim_date].[Day Id].&[20180901] | [dim_date].[Day Id].&[20180901]
20180904 | 777 | 300 | [dim_date].[Day Id].&[20180903] | [dim_date].[Day Id].&[20180903]
20180907 | 1500 | 1200 | [dim_date].[Day Id].&[20180906] | [dim_date].[Day Id].&[20180906]
How can I get PREV_SUM only among selected/displayed members?
Try creating a custom set in your WITH clause. The set will be made up of the dates.
Then use the GENERATE function to iterate over the set. I think lag should then use only the dates in the set.
(Apologies I am away from a PC so unable to test)
WITH
// Create custom set
SET SSS AS
{
[dim_date].[Day Id].&[20180902],
[dim_date].[Day Id].&[20180904],
[dim_date].[Day Id].&[20180907]
}
// Find the current date member's rank in the custom set (RANK is 1-based)
// ITEM is 0-based, so subtract 2 from the rank to reach the previous member
// Use the ITEM function to fetch that member
MEMBER PREV_MEMBER AS
SETTOSTR(SSS.ITEM(RANK([dim_date].[Day Id].CURRENTMEMBER, SSS)-2))
MEMBER PREV_SUM AS
SUM(
STRTOSET(PREV_MEMBER),
[Measures].[SUM]
)
SELECT
NON EMPTY {
[Measures].[SUM],
PREV_MEMBER,
PREV_SUM
} ON COLUMNS,
NON EMPTY {(
// Use custom set in rows as a date filter
SSS
)} ON ROWS
FROM [cub_main]
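The rank-and-item lookup used above can be illustrated with a small Python sketch. It is not MDX and was not run against a cube; the function name `previous_visible` is hypothetical, and the values are the sample data from the question.

```python
def previous_visible(values, selected):
    """Map each selected key to the value of the previous *selected* key.

    The first selected key has no predecessor and maps to None.
    """
    ordered = sorted(selected)
    result = {}
    for index, key in enumerate(ordered):
        # Step back one position inside the selection, not inside the calendar.
        result[key] = values[ordered[index - 1]] if index > 0 else None
    return result


cube = {
    "2018-09-01": 500,
    "2018-09-02": 150,
    "2018-09-03": 300,
    "2018-09-04": 777,
    "2018-09-05": 900,
    "2018-09-06": 1200,
    "2018-09-07": 1500,
}
prev_sum = previous_visible(cube, ["2018-09-02", "2018-09-04", "2018-09-07"])
```

Here `prev_sum` pairs 2018-09-04 with 150 and 2018-09-07 with 777, which is the PREV_SUM column the question asks for.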
My data is like -
+-----------+------------------+-----------------+-------------+
| Issue Num | Created On | Closed at | Issue Owner |
+-----------+------------------+-----------------+-------------+
| 1 | 12/21/2016 15:26 | 1/13/2017 9:48 | Name 1 |
| 2 | 1/10/2017 7:38 | 1/13/2017 9:08 | Name 2 |
| 3 | 1/13/2017 8:57 | 1/13/2017 8:58 | Name 2 |
| 4 | 12/20/2016 20:30 | 1/13/2017 5:46 | Name 2 |
| 5 | 12/21/2016 19:30 | 1/13/2017 1:14 | Name 1 |
| 6 | 12/20/2016 20:30 | 1/12/2017 9:11 | Name 1 |
| 7 | 1/9/2017 17:44 | 1/12/2017 1:52 | Name 1 |
| 8 | 12/21/2016 19:36 | 1/11/2017 16:59 | Name 1 |
| 9 | 12/20/2016 19:54 | 1/11/2017 15:45 | Name 1 |
+-----------+------------------+-----------------+-------------+
What I am trying to achieve is
Number of issues created per week
Number of issues closed per week
Net number of issues remaining per week
I am able to resolve the top two points but unable to approach the last.
My attempt -
This gives me number of issues created every week.
Similarly I have done for Closed per week.
For Net number of issues (Created-Closed) -
I tried adding the Closed At column along with Created On, but I can't see a second bar in the chart alongside Created On.
Something like this
I tried doing the same in Excel -
I want something of this sort but with another column as the difference of
number of issues created that week - number of issues closed that week.
In this case, 8-6=2.
You could use a calculated field (Analysis -> Create Calculated Field). Something like this:
{FIXED [Create Date]:Count(if DATEPART('year',[Create Date]) = 2016 then [Number of Records] end)} - {FIXED [Closed Date]:Count(if DATEPART('year',[Closed Date]) = 2016 then [Number of Records] end)}
This calculation uses LOD expressions to pull back both sets of values. It filters both date fields to 2016 and then subtracts the closed count from the created count.
For more on LODs, see here:
https://www.tableau.com/about/blog/LOD-expressions
Use this as your measure and pull in one of your date fields as the dimension.
The normal way to solve this problem is to reshape the data so you have one row per status change instead of one row per issue, with a column named [Date] and a column named [Action]. The action can be submit or close (or, in a more complex world, approve, reject, or whatever else, so that the rows track the history).
You can do the reshaping without modifying your source data by using a UNION to get two copies of each row, with appropriate calculated fields to make the visible columns make sense (e.g., create a calculated field called Date that returns the submission date or closing date depending on whether the row is from the first or second union, with a similar one called Action whose value depends on that as well). Filter out Close actions that have a null date.
Or you can preprocess the data to reshape it.
Or you can use data blending to make two sources that point to the same data source, customizing the linking fields to line up the submit and close dates (e.g., duplicate the data connection and rename both date fields to have the same name). But in this case, you probably want to create a scaffolding source that has every date, but no other data, to use as the primary data source, to avoid filtering out data from the secondary for dates that don't appear in the primary. The blending approach can be brittle.
Assuming you used the UNION approach instead of Data Blending, then you can count the number of submissions and closures within a certain date range, or compute a running total of the difference to see the backlog size over time.
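The reshaping itself is easy to sketch outside Tableau. The Python below is only an illustration of the idea: it uses the dates from the sample table (times dropped), turns each issue into a Created event and a Closed event, and then counts per ISO week and keeps a running backlog. The week definition (ISO weeks) and all names are assumptions, not something Tableau prescribes.

```python
from collections import Counter
from datetime import date

# (created, closed) dates from the sample table, in issue-number order.
issues = [
    (date(2016, 12, 21), date(2017, 1, 13)),
    (date(2017, 1, 10), date(2017, 1, 13)),
    (date(2017, 1, 13), date(2017, 1, 13)),
    (date(2016, 12, 20), date(2017, 1, 13)),
    (date(2016, 12, 21), date(2017, 1, 13)),
    (date(2016, 12, 20), date(2017, 1, 12)),
    (date(2017, 1, 9), date(2017, 1, 12)),
    (date(2016, 12, 21), date(2017, 1, 11)),
    (date(2016, 12, 20), date(2017, 1, 11)),
]


def iso_week(day):
    """Return (ISO year, ISO week number) for a date."""
    year, week, _ = day.isocalendar()
    return (year, week)


# The "UNION" step: one row per status change instead of one row per issue.
events = [(created, "Created") for created, _ in issues]
events += [(closed, "Closed") for _, closed in issues if closed is not None]

counts = Counter((iso_week(day), action) for day, action in events)

weekly = []  # (week, created, closed, net, running backlog)
backlog = 0
for week in sorted({week for week, _ in counts}):
    created = counts[(week, "Created")]
    closed = counts[(week, "Closed")]
    backlog += created - closed
    weekly.append((week, created, closed, created - closed, backlog))
```

The `net` column is the per-week difference asked for in the question, and `backlog` is the running total mentioned above.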
I want to calculate a percentage value of a selected period. I don't know how to handle it.
           | Quantity | CalcMember |
January    | 5        |            |
2015-01-01 | 1        | 20%        |
2015-01-02 | 2        | 40%        |
2015-01-03 | 2        | 40%        |
I need only the total of my selected period from day X to X and not the result of the whole month for my calculation.
The issue is summarizing the filtered members within the calculated member.
edit: I found a solution!
I have to create a dynamic set
CREATE DYNAMIC SET CurrentCube.[SelectedDates] AS [Date].[YearMonth].[Date].Members;
CREATE MEMBER CURRENTCUBE.[Measures].[Percentage] AS
[Measures].[Qty] / SUM([SelectedDates], [Measures].[Qty]),
format_string = "Percent"
but this works only when the dates are in the rows...
Use the EXISTING function:
CREATE HIDDEN DYNAMIC SET [SelectedDates] AS
EXISTING [Date].[YearMonth].[----smallest level of your hierarchy---].Members;
CREATE MEMBER CURRENTCUBE.[Measures].[Percentage] AS
[Measures].[Qty] / // or something like [Original Value] from numeric calculations if you want to do that for multiple measures at once
Sum
([SelectedDates], [Measures].[Qty]
);
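As a plain illustration of what the calculated member is meant to compute, here is a short Python sketch: each day's quantity is divided by the total of the selected days only, not by the whole month. The function name and the fourth date (which stands for a day outside the selection) are hypothetical.

```python
def share_of_selection(quantities, selected):
    """Return each selected day's quantity as a fraction of the selected total."""
    total = sum(quantities[day] for day in selected)
    if total == 0:
        # Avoid dividing by zero when the selection has no quantity at all.
        return {day: None for day in selected}
    return {day: quantities[day] / total for day in selected}


quantities = {
    "2015-01-01": 1,
    "2015-01-02": 2,
    "2015-01-03": 2,
    "2015-01-04": 5,  # hypothetical day that is not part of the selection
}
shares = share_of_selection(quantities, ["2015-01-01", "2015-01-02", "2015-01-03"])
```

The unselected day does not influence the denominator, which is the behaviour EXISTING is used for in the set definition above.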
I have a table "AuctionResults" like below
Auction Action Shares ProfitperShare
-------------------------------------------
Round1 BUY 6 200
Round2 BUY 5 100
Round2 SELL -2 50
Round3 SELL -5 80
Now I need to aggregate results by every auction with BUYS after netting out SELLS in subsequent rounds on a "First Come First Net basis"
so in Round1 I bought 6 Shares and then sold 2 in Round2 and rest "4" in Round3 with a total NET profit of 6 * 200-2 * 50-4 * 80 = 780
and in Round2 I bought 5 shares and sold "1" in Round3(because earlier "4" belonged to Round1) with a NET Profit of 5 * 100-1 * 80 = 420
...so the Resulting Output should look like:
Auction NetProfit
------------------
Round1 780
Round2 420
Can we do this using just Oracle SQL (10g) and not PL/SQL?
Thanks in advance
I know this is an old question and won't be of use to the original poster, but I wanted to take a stab at this because it was an interesting question. I didn't test it out enough, so I would expect this still needs to be corrected and tuned. But I believe the approach is legitimate. I would not recommend using a query like this in a product because it would be difficult to maintain or understand (and I don't believe this is really scalable). You would be much better off creating some alternate data structures. Having said that, this is what I ran in Postgresql 9.1:
WITH x AS (
    SELECT round, action
          ,ABS(shares) AS shares
          ,profitpershare
          ,COALESCE( SUM(shares) OVER(ORDER BY round, action
                                      ROWS BETWEEN UNBOUNDED PRECEDING
                                               AND 1 PRECEDING)
                   , 0) AS previous_net_shares
          ,COALESCE( ABS( SUM(CASE WHEN action = 'SELL' THEN shares ELSE 0 END)
                          OVER(ORDER BY round, action
                               ROWS BETWEEN UNBOUNDED PRECEDING
                                        AND 1 PRECEDING) ), 0 ) AS previous_sells
      FROM AuctionResults
     ORDER BY 1,2
)
SELECT round, shares * profitpershare - deduction AS net
  FROM (
       SELECT buy.round, buy.shares, buy.profitpershare
             ,SUM( LEAST( LEAST( sell.shares, GREATEST(buy.shares - (sell.previous_sells - buy.previous_sells), 0)
                                ,GREATEST(sell.shares + (sell.previous_sells - buy.previous_sells) - buy.previous_net_shares, 0)
                               )
                        ) * sell.profitpershare ) AS deduction
         FROM x buy
             ,x sell
        WHERE sell.round > buy.round
          AND buy.action = 'BUY'
          AND sell.action = 'SELL'
        GROUP BY buy.round, buy.shares, buy.profitpershare
       ) AS y
And the result:
round | net
-------+-----
1 | 780
2 | 420
(2 rows)
To break it down into pieces, I started with this data set:
CREATE TABLE AuctionResults( round int, action varchar(4), shares int, profitpershare int);
INSERT INTO AuctionResults VALUES(1, 'BUY', 6, 200);
INSERT INTO AuctionResults VALUES(2, 'BUY', 5, 100);
INSERT INTO AuctionResults VALUES(2, 'SELL',-2, 50);
INSERT INTO AuctionResults VALUES(3, 'SELL',-5, 80);
INSERT INTO AuctionResults VALUES(4, 'SELL', -4, 150);
select * from auctionresults;
round | action | shares | profitpershare
-------+--------+--------+----------------
1 | BUY | 6 | 200
2 | BUY | 5 | 100
2 | SELL | -2 | 50
3 | SELL | -5 | 80
4 | SELL | -4 | 150
(5 rows)
The query in the "WITH" clause adds some running totals to the table.
"previous_net_shares" indicates how many shares are available to sell before the current record. This also tells me how many 'SELL' shares I need to skip before I can start allocating it to this 'BUY'.
"previous_sells" is a running count of the number of "SELL" shares encountered, so the difference between two "previous_sells" indicates the number of 'SELL' shares used in that time.
round | action | shares | profitpershare | previous_net_shares | previous_sells
-------+--------+--------+----------------+---------------------+----------------
1 | BUY | 6 | 200 | 0 | 0
2 | BUY | 5 | 100 | 6 | 0
2 | SELL | 2 | 50 | 11 | 0
3 | SELL | 5 | 80 | 9 | 2
4 | SELL | 4 | 150 | 4 | 7
(5 rows)
With this table, we can do a self-join where each "BUY" record is associated with each future "SELL" record. The result would look like this:
SELECT buy.round, buy.shares, buy.profitpershare
,sell.round AS sellRound, sell.shares AS sellShares, sell.profitpershare AS sellProfitpershare
FROM x buy
,x sell
WHERE sell.round > buy.round
AND buy.action = 'BUY'
AND sell.action = 'SELL'
round | shares | profitpershare | sellround | sellshares | sellprofitpershare
-------+--------+----------------+-----------+------------+--------------------
1 | 6 | 200 | 2 | 2 | 50
1 | 6 | 200 | 3 | 5 | 80
1 | 6 | 200 | 4 | 4 | 150
2 | 5 | 100 | 3 | 5 | 80
2 | 5 | 100 | 4 | 4 | 150
(5 rows)
And then comes the crazy part, which tries to calculate the number of shares available to sell in the order versus the number of shares not yet sold for a buy. Here are some notes to help follow that. The "greatest" calls with "0" just say that we can't allocate any shares if we are in the negative.
-- allocated sells
sell.previous_sells - buy.previous_sells
-- shares yet to sell for this buy, if < 0 then 0
GREATEST(buy.shares - (sell.previous_sells - buy.previous_sells), 0)
-- number of sell shares that need to be skipped
buy.previous_net_shares
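For comparison, the "first come, first net" rule from the question can be written procedurally. The Python sketch below is not a translation of the SQL above and does not answer the SQL-only requirement; it is just a reference for what the allocation is supposed to produce, using the four rows from the question. The function name `fifo_net_profit` and the tuple layout are assumptions.

```python
from collections import deque


def fifo_net_profit(rows):
    """Net each SELL against the oldest BUY that still has unsold shares.

    `rows` are (auction, action, shares, profit_per_share) tuples in
    chronological order; SELL shares are negative, as in the table.
    """
    open_buys = deque()  # each entry: [auction, shares still unsold]
    net = {}
    for auction, action, shares, price in rows:
        if action == "BUY":
            net[auction] = shares * price
            open_buys.append([auction, shares])
            continue
        remaining = abs(shares)
        while remaining and open_buys:
            lot = open_buys[0]
            matched = min(remaining, lot[1])
            net[lot[0]] -= matched * price  # deduct the sold shares' profit
            lot[1] -= matched
            remaining -= matched
            if lot[1] == 0:
                open_buys.popleft()
    return net


rows = [
    ("Round1", "BUY", 6, 200),
    ("Round2", "BUY", 5, 100),
    ("Round2", "SELL", -2, 50),
    ("Round3", "SELL", -5, 80),
]
net_profit = fifo_net_profit(rows)
```

For the question's data this yields 780 for Round1 and 420 for Round2, matching the expected output.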
Thanks to David for his assistance