I have a query like
WITH MEMBER [measures].[Count] AS
SUM(([Location].[Hierarchy].[Zone].[1].Children),[Measures].[Length])
SELECT {[measures].[Count]} ON 0,
{[Location].[Hierarchy].[Zone].&[1].Children} on 1
FROM [NTAP]
I'm a real beginner with MDX, but from my understanding this should give me a list of all of Zone 1's children, each with its own Length summed. The problem is that I get the list of children, but every row shows the sum of Length for all of Zone 1.
I get this:
1 103026769420
2 103026769420
3 103026769420
4 103026769420
But what I would like to get is something like this
1 84984958
2 9494949
3 934883
4 9458948588
Location is a hierarchy like:
Zone  Children
1
      1
      2
      3
2
      1
      2
      3
edit: I should probably say that the reason I use WITH MEMBER is that [Measures].[Length] will be combined with an IIf in the final version. But I can't even get this working :(
edit2: fixed spelling
You are getting the sum of all children of Zone 1 for each child of Zone 1.
You can rewrite it as:
WITH MEMBER [Measures].[Count] AS
SUM([Location].[Hierarchy].CurrentMember.Children, [Measures].[Length])
SELECT {[Measures].[Count]} ON 0,
{[Location].[Hierarchy].[Zone].&[1].Children} on 1
FROM [NTAP]
By the way, [1] <> &[1]. Without the & you are specifying the member name; with it, you are specifying the member key. If in your case key = name, you have nothing to worry about.
A query counting the children with L > 0 would be:
WITH
MEMBER [Measures].[Count of Children with L more than 0] AS
FILTER([Location].[Hierarchy].CurrentMember.Children,
[Measures].[Length] > 0).COUNT
SELECT
{
[Measures].[Count of Children with L more than 0]
} ON 0,
{
[Location].[Hierarchy].[Zone].&[1]
} ON 1
FROM [Your Cube]
This of course won't work if you put the children on rows: they are leaves with no children of their own, so you would get NULLs.
I've run into a subtlety around count(*) and join, and am hoping to get some confirmation that I've figured out what's going on correctly. For background, we commonly convert continuous timeline data into discrete bins, such as hours. And since we don't want gaps for bins with no content, we use generate_series to synthesize the buckets we want values for. If there's no entry for, say, 10AM, fine, we still get a result. However, I noticed that I'm sometimes getting 1 instead of 0. Here's what I'm trying to confirm:
The count is 1 if you count the "grid" series, and 0 if you count the data table.
This only has to do with count, and no other aggregate.
The code below sets up some sample data to show what I'm talking about:
DROP TABLE IF EXISTS analytics.measurement_table CASCADE;
CREATE TABLE IF NOT EXISTS analytics.measurement_table (
hour smallint NOT NULL DEFAULT NULL,
measurement smallint NOT NULL DEFAULT NULL
);
INSERT INTO measurement_table (hour, measurement)
VALUES ( 0, 1),
( 1, 1), ( 1, 1),
(10, 2), (10, 3), (10, 5);
Here are the goal results for the query. I'm using 12 hours to keep the example results shorter.
Hour Count sum
0 1 1
1 2 2
2 0 0
3 0 0
4 0 0
5 0 0
6 0 0
7 0 0
8 0 0
9 0 0
10 3 10
11 0 0
12 0 0
This works correctly:
WITH hour_series AS (
select * from generate_series (0,12) AS hour
)
SELECT hour_series.hour,
count(measurement_table.hour) AS frequency,
COALESCE(sum(measurement_table.measurement), 0) AS total
FROM hour_series
LEFT JOIN measurement_table ON (measurement_table.hour = hour_series.hour)
GROUP BY 1
ORDER BY 1
This returns misleading 1's for the hours with no match:
WITH hour_series AS (
select * from generate_series (0,12) AS hour
)
SELECT hour_series.hour,
count(*) AS frequency,
COALESCE(sum(measurement_table.measurement), 0) AS total
FROM hour_series
LEFT JOIN measurement_table ON (hour_series.hour = measurement_table.hour)
GROUP BY 1
ORDER BY 1
0 1 1
1 2 2
2 1 0
3 1 0
4 1 0
5 1 0
6 1 0
7 1 0
8 1 0
9 1 0
10 3 10
11 1 0
12 1 0
The only difference between these two examples is the count term:
count(*) -- A result of 1 on no match, and a correct count otherwise.
count(joined to table field) -- 0 on no match, correct count otherwise.
That seems to be it: you've got to make it explicit that you're counting the data table. Otherwise, you get a count of 1, since the series row itself is counted once. Is this a nuance of joining, or a nuance of count in Postgres?
Does this impact any other aggregate? It seems like it shouldn't.
P.S. generate_series is just about the best thing ever.
You figured out the problem correctly: count() behaves differently depending on the argument it is given.
count(*) counts how many rows belong to the group. This just cannot be 0 since there is always at least one row in a group (otherwise, there would be no group).
On the other hand, when given a column name or expression as argument, count() takes into account only non-null values and ignores nulls. For your query, this lets you distinguish groups that have no match in the left joined table from groups where there are matches.
Note that this behavior is not Postgres specific, but belongs to the standard ANSI SQL specification (all databases that I know conform to it).
Bottom line:
In general cases, use count(*); this is more efficient, since the database does not need to check for nulls (and it makes it clear to the reader of the query that you just want to know how many rows belong to the group).
In specific cases such as yours, put the relevant expression in the count().
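To see both counts side by side, here is a minimal sketch using Python's built-in sqlite3 module rather than Postgres. SQLite has no generate_series by default, so a recursive CTE stands in for it (that substitution is my assumption, not part of the question); the table and data are the ones from the question, and the column aliases rows_in_group and matched_rows are made up for illustration.

```python
import sqlite3

# In-memory database with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE measurement_table (
        hour        INTEGER NOT NULL,
        measurement INTEGER NOT NULL
    );
    INSERT INTO measurement_table (hour, measurement)
    VALUES (0, 1), (1, 1), (1, 1), (10, 2), (10, 3), (10, 5);
""")

rows = conn.execute("""
    -- Recursive CTE standing in for generate_series(0, 12).
    WITH RECURSIVE hour_series(hour) AS (
        SELECT 0
        UNION ALL
        SELECT hour + 1 FROM hour_series WHERE hour < 12
    )
    SELECT hour_series.hour,
           count(*)                      AS rows_in_group,  -- never 0
           count(measurement_table.hour) AS matched_rows,   -- ignores NULLs
           COALESCE(sum(measurement_table.measurement), 0) AS total
    FROM hour_series
    LEFT JOIN measurement_table
           ON measurement_table.hour = hour_series.hour
    GROUP BY hour_series.hour
    ORDER BY hour_series.hour
""").fetchall()

for row in rows:
    print(row)
```

For an unmatched hour the left join still produces one row (with NULLs on the right side), so count(*) sees one row while count(measurement_table.hour) sees no non-null values.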
I am trying to retrieve a tree structure, and I want specific levels of 'parenthood'. My table has a depth level, a pathIndex and a mapping. My first approach was to build substrings so I could look up the value via the mapping, but I am getting multiple errors on string conversion. One requirement is that if I query an item that is not at the lowest level, it should return null for the levels it is missing.
In the table below, suppose I were to query for the line marked with asterisks:
Id depth pathindex ItemNumber
4CF91F7F-832E-468D-B44A-E14DC66E710A 0 0 0.0
D34784A3-2134-4D09-828E-0EDA0C275C43 1 1 1
38158804-3EBC-4841-B1AF-1B86AD153010 2 1 1.1
8E25D494-322F-45F9-8A91-2A385F561C71 3 1 1.1.1
**64EB6C43-FF9C-0FF9-133F-01F4F21DA14F** 4 1 1.1.1.1
13AFA35C-80F8-405A-8980-33C3F7733EE2 2 2 1.2
3F1332E9-4D42-4BD8-9423-598430E94CB5 3 1 1.2.1
B3CC1306-A122-46F6-8F67-30FBABA3B590 4 1 1.2.1.1
C3F27C8E-F96B-4498-A85F-E4FC8EA90ED7 4 2 1.2.1.2
This is how it should look up the information. The static strings are the part I don't know how to generate in a way that gives nulls when asking for a level the item does not reach.
Select top 1 VehicleGroupId as Region
from GroupHierarchy where GroupHierarchy.numericalmapping = '1'
Select top 1 VehicleGroupId as gz
from GroupHierarchy where GroupHierarchy.numericalmapping = '1.1'
Select top 1 VehicleGroupId as cedis
from GroupHierarchy where GroupHierarchy.numericalmapping = '1.1.1'
Instead of putting decimal points between your hierarchy members, use forward slashes (1/2/3). Then you can use SQL Server's hierarchyid data type and its functions to easily join and retain the structure:
https://www.sqlshack.com/use-hierarchyid-sql-server/
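If it helps to see the lookup logic itself, here is a rough sketch in plain Python. It is not the hierarchyid approach; it just derives the ancestor paths by cutting the dotted ItemNumber into prefixes and returns None for levels the item does not reach. The rows are copied from the question; the function names and the level names (taken from the question's column aliases) are my own assumptions.

```python
from typing import Dict, List, Optional

# ItemNumber (dotted path) -> Id, copied from the table in the question.
GROUPS: Dict[str, str] = {
    "0.0": "4CF91F7F-832E-468D-B44A-E14DC66E710A",
    "1": "D34784A3-2134-4D09-828E-0EDA0C275C43",
    "1.1": "38158804-3EBC-4841-B1AF-1B86AD153010",
    "1.1.1": "8E25D494-322F-45F9-8A91-2A385F561C71",
    "1.1.1.1": "64EB6C43-FF9C-0FF9-133F-01F4F21DA14F",
    "1.2": "13AFA35C-80F8-405A-8980-33C3F7733EE2",
    "1.2.1": "3F1332E9-4D42-4BD8-9423-598430E94CB5",
    "1.2.1.1": "B3CC1306-A122-46F6-8F67-30FBABA3B590",
    "1.2.1.2": "C3F27C8E-F96B-4498-A85F-E4FC8EA90ED7",
}

# Level names borrowed from the aliases in the question's queries.
LEVEL_NAMES: List[str] = ["Region", "gz", "cedis"]


def ancestor_paths(item_number: str, levels: int) -> List[Optional[str]]:
    """Return the path prefix for each depth 1..levels, or None if too shallow."""
    parts = item_number.split(".")
    return [
        ".".join(parts[:depth]) if depth <= len(parts) else None
        for depth in range(1, levels + 1)
    ]


def ancestors(item_number: str) -> Dict[str, Optional[str]]:
    """Map each level name to the Id found at that depth, or None."""
    paths = ancestor_paths(item_number, len(LEVEL_NAMES))
    return {
        name: GROUPS.get(path) if path is not None else None
        for name, path in zip(LEVEL_NAMES, paths)
    }


print(ancestors("1.1.1.1"))  # all three levels resolve
print(ancestors("1.2"))      # the 'cedis' level is None
```

The same prefix idea could be written in SQL, but generating '1', '1.1', '1.1.1' from '1.1.1.1' is exactly the string handling the hierarchyid type is meant to spare you.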
I have a query, let's call it qry_01, that produces a set of data similar to this:
ID N CN Sum
1 4 0 0
2 3 3 3
5 4 4 7
8 3 3 10
The values shown in this query actually come from a chain of queries and from a bunch of different tables.
The corrected value CN is calculated within the query, and counts N if the ID is not 1, and 0 if it is 1.
The Sum is the value I want to calculate by progressively summing up the CN values.
I tried to use DSUM, but I came out with nothing.
Can anyone please help me?
You could use a correlated subquery in the following way:
select t.id, t.n, t.cn, (select sum(u.cn) from qry_01 u where u.id <= t.id) as [sum]
from qry_01 t
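As a quick check of the correlated subquery, here is a small sketch that runs it against the sample rows using Python's sqlite3 module instead of Access (an assumption made only so the example is runnable; qry_01 is modelled as a plain table and the alias running_sum is made up).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- qry_01 is a saved query in the question; a table stands in for it here.
    CREATE TABLE qry_01 (id INTEGER, n INTEGER, cn INTEGER);
    INSERT INTO qry_01 VALUES (1, 4, 0), (2, 3, 3), (5, 4, 4), (8, 3, 3);
""")

running = conn.execute("""
    SELECT t.id, t.n, t.cn,
           -- Sum CN over every row up to and including the current ID.
           (SELECT sum(u.cn) FROM qry_01 u WHERE u.id <= t.id) AS running_sum
    FROM qry_01 t
    ORDER BY t.id
""").fetchall()

for row in running:
    print(row)
```

The last column reproduces the Sum column from the question. Note that the subquery is re-evaluated for every row, so on a large or expensive qry_01 this can get slow.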
I am using Access with a table having over 200k rows of data. I am looking for counts on a column which is broken down by job descriptions. For example, I want to return the total count (id) for a location where a person is status = "active" and position like "cook" [should equal 20] also another output where I get a count (id) for the same location where a person is status = "active" and position = "Lead Cook" [should equal 5]. So, one is a partial of the total population.
I have a few others to do just like this (# Bakers, # Lead Bakers...). How can I do this with one grand query/subquery, or with one query for each grouping?
My attempt is more like this:
SELECT
a.location,
Count(a.EMPLOYEE_NUMBER) AS [# Cook Total], --- should equal 20
(SELECT count(b.EMPLOYEE_ID) FROM Table_abc AS b where b.STATUS="Active Assignment" AND b.POSITION Like "*cook*" AND b.EMPLOYEE_ID=a.EMPLOYEE_ID) AS [# Lead Cook], --- should equal 5
FROM Table_abc AS a
ORDER BY a.location;
Results should be similar to:
Location Total Cooks Lead Cooks Total Bakers Lead Bakers
1 20 4 15 2
2 45 7 12 2
3 22 2 16 1
4 19 2 17 2
5 5 1 9 1
Try using conditional aggregation -- no need for sub queries.
Something like this should work (although I may not understand your desired results completely):
select location,
count(EMPLOYEE_NUMBER) as CookTotal,
sum(IIf(POSITION Like "*cook*",1,0)) as AllCooks,
sum(IIf(POSITION = "Lead Cook",1,0)) as LeadCooks
from Table_abc
where STATUS="Active Assignment"
group by location
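Here is a minimal runnable sketch of the same conditional-aggregation idea, using Python's sqlite3 module in place of Access. The rows are invented for illustration, CASE replaces IIf, and % replaces the Access * wildcard; the baker columns are my guess at how the other groupings would be added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample rows; the real table has 200k+ of them.
    CREATE TABLE Table_abc (
        location INTEGER, employee_id INTEGER, status TEXT, position TEXT
    );
    INSERT INTO Table_abc VALUES
        (1, 101, 'Active Assignment', 'Cook'),
        (1, 102, 'Active Assignment', 'Lead Cook'),
        (1, 103, 'Active Assignment', 'Baker'),
        (1, 104, 'Terminated',        'Cook'),
        (2, 201, 'Active Assignment', 'Lead Cook'),
        (2, 202, 'Active Assignment', 'Lead Baker');
""")

counts = conn.execute("""
    SELECT location,
           -- Each sum counts only the rows that satisfy its own condition.
           sum(CASE WHEN position LIKE '%cook%'  THEN 1 ELSE 0 END) AS total_cooks,
           sum(CASE WHEN position = 'Lead Cook'  THEN 1 ELSE 0 END) AS lead_cooks,
           sum(CASE WHEN position LIKE '%baker%' THEN 1 ELSE 0 END) AS total_bakers,
           sum(CASE WHEN position = 'Lead Baker' THEN 1 ELSE 0 END) AS lead_bakers
    FROM Table_abc
    WHERE status = 'Active Assignment'
    GROUP BY location
    ORDER BY location
""").fetchall()

for row in counts:
    print(row)
```

One pass over the table produces every column, which is why no subqueries are needed.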
I have two dimensions, DimFlag and DimPNL, and a fact table FactAmount. I am looking to compute:
When the PNL is stat (Is Stat = 1): sum(Actual x FlagId)
For those PNLs I multiply the amounts by the FlagId field, so a FlagId of 0 gives 0 x Actual = 0.
DimFlag
FlagId FlagLabel
-----------------
1 NotClosed
0 IsClosed
DimPNL
PNLId PNLName Is Stat
1 a 1
2 test 1
3 test2 0
FactAmount
id PNLId FlagId Actual
1 1 1 100
2 2 1 10
3 3 0 120
I tried the following MDX but it didn't work. Any ideas, please?
Scope (
[Dim PNL].[PNL].members,[Measures].members
);
this = iif([Dim PNL].[PNL].CurrentMember.Properties("Is Stat") =1
,
aggregate([Dim PNL].[PNL].currentmember,[Measures].currentmember)* iif([Dim Flag].[Flag Label].[Flag Label].currentmember = 0, 0, 1),
aggregate([Dim PNL].[PNL].currentmember,[Measures].currentmember)
);
While this type of calculation can be done in MDX, the MDX can get complex and perform badly. I would suggest doing the calculation explicitly, e.g. in the DSV, or in a view on the fact table that you then use in the DSV instead of the fact table itself. The result of the calculation would then be another column on which you can base a standard measure.
To do it in the DSV, assuming you use a relational table as the base for the fact table, add a named calculation to it, name the column however you like, and use the expression Actual * FlagID. For the other calculation, you may need a subselect, i.e. the expression would be Actual * case when pnlId in(1,2) then 1 else 0 end. You can use any SQL that works as a column expression in the select list as the expression for a named calculation.
If you implement the same thing in a view on FactAmount, you can write the second expression better, because you can join table DimPNL in the view definition and use column IsStat in the calculation. You would then replace table FactAmount with the view, which has the two additional measure columns.
In either case, just define two measures on the two new columns in the cube, and you are done.
As a rule, calculations that are done at record level in the fact table before any aggregation should be done at data loading time, i.e. as described above.
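To make the view variant concrete, here is a rough sketch using Python's sqlite3 module in place of SQL Server. The view name vFactAmount, the column AdjustedActual and the extra fact row with id 4 are hypothetical, and the rule "multiply by FlagId only when IsStat = 1, otherwise keep Actual" is my reading of the question rather than something the answer spells out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DimPNL (PNLId INTEGER PRIMARY KEY, PNLName TEXT, IsStat INTEGER);
    INSERT INTO DimPNL VALUES (1, 'a', 1), (2, 'test', 1), (3, 'test2', 0);

    CREATE TABLE FactAmount (
        id INTEGER PRIMARY KEY, PNLId INTEGER, FlagId INTEGER, Actual INTEGER
    );
    INSERT INTO FactAmount VALUES
        (1, 1, 1, 100),
        (2, 2, 1, 10),
        (3, 3, 0, 120),
        (4, 1, 0, 50);  -- hypothetical row: a closed amount on a stat PNL

    -- The view does the record-level calculation before any aggregation.
    CREATE VIEW vFactAmount AS
    SELECT f.id, f.PNLId, f.FlagId, f.Actual,
           CASE WHEN p.IsStat = 1 THEN f.Actual * f.FlagId
                ELSE f.Actual
           END AS AdjustedActual
    FROM FactAmount f
    JOIN DimPNL p ON p.PNLId = f.PNLId;
""")

# A plain SUM over the new column is what the cube measure would do.
totals = conn.execute("""
    SELECT PNLId, sum(AdjustedActual)
    FROM vFactAmount
    GROUP BY PNLId
    ORDER BY PNLId
""").fetchall()

for row in totals:
    print(row)
```

In the cube you would then define an ordinary Sum measure on AdjustedActual, with no SCOPE assignment needed.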