MDX-Statement - skip level - mdx

I have the following hierarchy:
group -> country -> place
Some data:
group1 -> france -> place1
group1 -> france -> place2
group1 -> france -> place3
group1 -> italy -> place1
group1 -> italy -> place2
group1 -> spain -> place1
group2 -> france -> place1
group2 -> france -> place2
group2 -> italy -> place1
group3 -> italy -> place1
group3 -> italy -> place2
group3 -> italy -> place3
group3 -> spain -> place1
I need an MDX statement that is not a filter (I work with SAS OLAP and need to create an authorization on a dimension).
The authorization should be on level 2. The user is in a SAS group "italy", so the user should see all values on level 1, see only "italy" on level 2, and see the children of italy on level 3.
Can someone please help with the MDX statement? How can this be done?
Thank you very much
Kind regards
George
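To make the requested visibility rule concrete, here is a small Python sketch (not MDX, and not specific to SAS OLAP) that computes which members a user in the "italy" group should see on each level of the sample data above. The function name `visible_members` and the tuple layout are illustrative assumptions, not part of any SAS API.

```python
# Sample rows from the question: (group, country, place)
ROWS = [
    ("group1", "france", "place1"), ("group1", "france", "place2"),
    ("group1", "france", "place3"), ("group1", "italy", "place1"),
    ("group1", "italy", "place2"), ("group1", "spain", "place1"),
    ("group2", "france", "place1"), ("group2", "france", "place2"),
    ("group2", "italy", "place1"), ("group3", "italy", "place1"),
    ("group3", "italy", "place2"), ("group3", "italy", "place3"),
    ("group3", "spain", "place1"),
]


def visible_members(rows, allowed_country):
    """Return the members visible on each level for one allowed country."""
    # Level 1: every group stays visible.
    level1 = sorted({group for group, _, _ in rows})
    # Level 2: only the allowed country, under each group that contains it.
    level2 = sorted({(g, c) for g, c, _ in rows if c == allowed_country})
    # Level 3: only the places below the allowed country.
    level3 = sorted(row for row in rows if row[1] == allowed_country)
    return level1, level2, level3


groups, countries, places = visible_members(ROWS, "italy")
print(groups)
print(countries)
print(places)
```

An MDX set for the authorization would need to return the same three layers of members.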

Thank you very much, but with Siblings I also get all the other members of level 2, and I only want "italy" on level 2.
I thought that in SAS it was not possible to use filters in the authorization of a dimension. However, it is possible, so I tried the following MDX:
[DIM_NAME].[DIM_NAME_H01].[All GROUPS],
[DIM_NAME].[DIM_NAME_H01].[All GROUPS].Children,
FILTER([DIM_NAME].[DIM_NAME_H01].[All GROUPS].[COUNTRIES].ALLMEMBERS,
kupcase([DIM_NAME].[DIM_NAME_H01].CurrentMember.Properties("CAPTION")) = kupcase("italy"))
So I see all members on level 1. --> That's OK.
On level 2 I see only the "italy" members. --> That's OK.
But I get no members on level 3. --> Does someone have an idea how I can also get the members on level 3? Or are filters not the right approach?
Thank you very much
Kind regards
George

Take a look at the sample below. I am trying to get Queensland, its parent, and all siblings of that parent:
SELECT {[Measures].[Internet Sales Amount]} ON 0,
{([Customer].[Customer Geography].[State-Province].&[QLD]&[AU]),
([Customer].[Customer Geography].[State-Province].&[QLD]&[AU]).parent.Siblings}
ON 1
FROM [Adventure Works]
To add the children to the result as well:
SELECT {[Measures].[Internet Sales Amount]} ON 0,
{
([Customer].[Customer Geography].[State-Province].&[QLD]&[AU]),
([Customer].[Customer Geography].[State-Province].&[QLD]&[AU]).parent.Siblings,
([Customer].[Customer Geography].[State-Province].&[QLD]&[AU]).children
} ON 1
FROM [Adventure Works]


Concatenate multiple row and sum on a specific column

I want to combine multiple rows into a single row. I managed to concatenate the rows; however, when I try to apply a sum to a specific column, I get the error TypeError: can only concatenate str (not "float") to str
Item Sum Brand Type User ID
ABC 5 High Zinc John 20A
CDD 3 Low Iron Bail 10B
ABC 10 High Zinc John 20A
CDD 200 Low Iron Bail 10B
Below is my code:
df = df.groupby(['ID','User','Type','Brand']).agg({'Item':''.join, 'Sum':'sum'}).reset_index()
Desired Output:
Item Sum Brand Type User ID
ABC 15 High Zinc John 20A
CDD 203 Low Iron Bail 10B
Thank You in advance!
df = df.pivot_table(index=['Brand', 'Type', 'User', 'ID'],values=['Sum'], columns=['Item'], aggfunc=sum).stack().reset_index()
Brand Type User ID Item Sum
0 High Zinc John 20A ABC 15.0
1 Low Iron Bail 10B CDD 203.0
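An alternative sketch with `groupby`, for comparison. One plausible cause of the TypeError (an assumption, since the post does not show the column dtypes) is that `Sum` holds strings or mixed types, so the values are converted with `pd.to_numeric` first. `Item` is used as a grouping key instead of being joined, because the desired output keeps one `Item` value per group. The sample frame below is rebuilt from the question with `Sum` stored as strings on purpose.

```python
import pandas as pd

# Sample data from the question; Sum is stored as text here (assumption)
# to reproduce a column that cannot be summed directly.
df = pd.DataFrame({
    "Item": ["ABC", "CDD", "ABC", "CDD"],
    "Sum": ["5", "3", "10", "200"],
    "Brand": ["High", "Low", "High", "Low"],
    "Type": ["Zinc", "Iron", "Zinc", "Iron"],
    "User": ["John", "Bail", "John", "Bail"],
    "ID": ["20A", "10B", "20A", "10B"],
})

# Convert to numbers; values that cannot be parsed become NaN.
df["Sum"] = pd.to_numeric(df["Sum"], errors="coerce")

# Group on every descriptive column (including Item) and add up Sum.
result = df.groupby(
    ["Item", "Brand", "Type", "User", "ID"], as_index=False
)["Sum"].sum()

print(result)
```

If `Item` can differ within a group and really needs to be concatenated, it would have to go back into `agg` with a join function, after making sure the column contains only strings.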

Google Data Studio: Average Number of Sessions based on selected country values

Let's say that for the dimension country I have 4 values and for each of the 4 I have the respective number of Sessions. E.g.
+---------+----------+
| country | Sessions |
+---------+----------+
| Italy | 10 |
| France | 12 |
| Germany | 14 |
| Spain | 16 |
+---------+----------+
I want to compute and output in a scorecard the average number of Sessions, only for those specific countries. So, in the example, the output should be 13.
I tried with the following calculated field but it doesn't work:
Sessions * AVG(CASE
WHEN REGEXP_MATCH(country, '^Italy|France|Germany|Spain.*') THEN 1
ELSE 0 END)
Create a filter based on the country dimension using the Matching RegEx operator.
Then apply this filter to a scorecard with the metric Sessions. In the Data tab on the right-hand side, you should be able to click the small pencil icon for the metric and choose average instead of sum as the aggregation method.
You may not have this option if you're using the GA connector. In this case, there should be an Average Session metric in the data source.
One way it can be achieved is by using a Filter Control and a Calculated Field:
1) Filter Control
Add the component with the Dimension set to Country and then add a Default Selection (a comma separated list of the required countries):
Italy, France, Germany, Spain
2) Calculated Field (Scorecard)
Sessions / COUNT_DISTINCT(Country)
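The arithmetic behind `Sessions / COUNT_DISTINCT(Country)` can be checked with a few lines of Python. This is only a sketch of the calculation using the sample table from the question; it assumes the scorecard sums Sessions over the filtered rows, and the variable names are illustrative.

```python
# Sample data from the question.
sessions_by_country = {"Italy": 10, "France": 12, "Germany": 14, "Spain": 16}

# Countries chosen in the filter control's default selection.
selected = ["Italy", "France", "Germany", "Spain"]

# Keep only the selected countries, as the filter would.
filtered = {c: s for c, s in sessions_by_country.items() if c in selected}

total_sessions = sum(filtered.values())   # aggregated Sessions
distinct_countries = len(set(filtered))   # COUNT_DISTINCT(Country)
average = total_sessions / distinct_countries

print(average)  # 13.0
```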

SQLQuery COUNT number of games per team

I have 4 tables:
Teams
codTeam: 1
year: 1995
codYears: 1
codType: 1
name: FCP
points: 3
codTeam: 2
year: 1990
codYears: 1
codType: 1
name: SLB
points: 3
codTeam: 3
year: 1995
codYears: 3
codType: 2
name: BCP
points: 0
Trainers (People who train a team)
codTrainer: 1
name: Peter
street: Ghost street
cellphone: 252666337
birthdayDate: 1995-02-01
BI: 11111111
number: 121212121
codTrainer: 2
name: Pan
street: Ghost street Remade
cellphone: 253999666
birthdayDate: 1995-01-01
BI: 22222222
number: 212121212
TeamsTrainers (In which team is someone training)
codTeamTrainer: 1
codTeam: 1
codTrainer: 2
dataInicio: 1998-05-05
codTeamTrainer: 2
codTeam: 2
codTrainer: 2
dataInicio: 1998-06-07
codTeamTrainer: 3
codTeam: 2
codTrainer: 1
dataInicio: 1999-09-09
Games
codGame: 1
date: 2015-02-12 13:00:00
codTeamHome: 1
codTeamAgainst: 2
goalsHome: 3
goalsAgainst: 2
codTypeGame: 1
codGame: 2
date: 2015-02-12 15:00:00
codTeamHome: 2
codTeamAgainst: 1
goalsHome: 1
goalsAgainst: 2
codTypeGame: 3
So basically I want to:
Get the table Games and show:
Team Name | Trainer Name | Goals Home | Goals Against | Points | Amount of Games from the Home Team
I have the following code for that in SQLQuery:
SELECT Teams.name, Trainers.name, Games.goalsHome,
Games.goalsAgainst, Teams.points, COUNT(*)
FROM Teams, Trainers, Games, TeamsTrainers
WHERE Games.codTeamHome = Teams.codTeam AND
TeamsTrainers.codTeam = Teams.codTeam AND
TeamsTrainers.codTrainer = Trainers.codTrainer
GROUP BY Teams.name, Trainers.name, Games.goalsHome,
Games.goalsAgainst, Teams.points
(May have some errors as I translated)
However, the COUNT only shows 1 (probably because the WHERE clause matches on codTeamHome, so each team is counted only once). If that is the cause, how do I fix it?
Result:
FCP | Pan | 3 | 2 | 3 | 1 (Count)
SLB | Peter | 1 | 2 | 3 | 1 (Count)
SLB | Pan | 1 | 2 | 3 | 1 (Count)
The count should be 2 for each row.
Any ideas?
You get the wrong result because of the way the tables are joined. Instead of joining the data in the WHERE clause, use the appropriate explicit join: LEFT, RIGHT or INNER JOIN. Your data model has a 1-to-N relationship, so you need to pick the join type that fits it.
See: http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins
EDIT
SELECT Te.name, Tr.name, Ga.goalsHome, Ga.goalsAgainst, Te.points,
(SELECT COUNT(*)
FROM Games
WHERE codTeamHome = Te.codTeam OR codTeamAgainst = Te.codTeam)
AS CountOfGames
FROM TeamsTrainers AS Tt
LEFT JOIN Teams AS Te ON Tt.codTeam = Te.codTeam
LEFT JOIN Trainers AS Tr ON Tt.codTrainer = Tr.codTrainer
LEFT JOIN Games AS Ga ON Ga.codTeamHome = Te.codTeam
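To try the query above without a database server, here is a minimal sketch that loads the sample rows into an in-memory SQLite database through Python's `sqlite3` module. Only the columns the query needs are created, the second trainer is given `codTrainer` 2 (an assumption based on the TeamsTrainers rows), and an `ORDER BY` is added so the output order is stable. Other database engines may behave differently.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Teams (codTeam INTEGER, name TEXT, points INTEGER);
CREATE TABLE Trainers (codTrainer INTEGER, name TEXT);
CREATE TABLE TeamsTrainers (codTeamTrainer INTEGER, codTeam INTEGER,
                            codTrainer INTEGER);
CREATE TABLE Games (codGame INTEGER, codTeamHome INTEGER,
                    codTeamAgainst INTEGER, goalsHome INTEGER,
                    goalsAgainst INTEGER);
INSERT INTO Teams VALUES (1, 'FCP', 3), (2, 'SLB', 3), (3, 'BCP', 0);
INSERT INTO Trainers VALUES (1, 'Peter'), (2, 'Pan');  -- id 2 is assumed
INSERT INTO TeamsTrainers VALUES (1, 1, 2), (2, 2, 2), (3, 2, 1);
INSERT INTO Games VALUES (1, 1, 2, 3, 2), (2, 2, 1, 1, 2);
""")

# The correlated subquery counts every game a team played, home or away.
query = """
SELECT Te.name, Tr.name, Ga.goalsHome, Ga.goalsAgainst, Te.points,
       (SELECT COUNT(*)
        FROM Games
        WHERE codTeamHome = Te.codTeam OR codTeamAgainst = Te.codTeam)
       AS CountOfGames
FROM TeamsTrainers AS Tt
LEFT JOIN Teams AS Te ON Tt.codTeam = Te.codTeam
LEFT JOIN Trainers AS Tr ON Tt.codTrainer = Tr.codTrainer
LEFT JOIN Games AS Ga ON Ga.codTeamHome = Te.codTeam
ORDER BY Tt.codTeamTrainer
"""

rows = con.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data, each row reports 2 games in the last column, which is the count the question asks for.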
You can change your WHERE clause by saying
[what you have] OR (Games.codTeamAgainst = Teams.codTeam AND ...)
However, this probably causes other problems because you probably care about whether a particular team scores the goals, not whether the home team scores the goals in games that team plays on either side.
You might not notice the other problems for a while because your GROUP BY clause is probably pretty far from what you want, and you might want to be selecting aggregate functions for a much simpler grouping.

SPSS Chi-square test on raw data

I want to run a chi square test on my raw data but SPSS is not reading the raw data as I want it to.
The raw data summarizes several test conditions, locations, groups of participants and results in one file. Example data:
# country gender Question 1 Question 2
1 chile f yes no
2 chile f no no
3 chile m no yes
4 chile m yes no
5 chile f yes no
6 chile f yes yes
7 chile m no yes
8 france f yes no
9 france f no no
10 france f no no
11 france m yes yes
12 france m yes no
13 france f yes no
14 france f no yes
15 china m yes yes
16 china f no yes
17 china f no yes
18 china f yes yes
19 china m yes yes
20 china m yes no
.....
I want to do a chi-square test comparing French men and Chilean women on their responses to questions 1 through 14.
Problem 1: A filter will not make SPSS ignore the remaining rows such as Chinese participants.
Problem 2: Furthermore if I just try and compare male and female participants SPSS produces the following error message after calculating:
No statistics are computed because (Question) is a constant.
I cannot accept that I am forced to summarize the data manually into a frequency table, so how do I make SPSS automatically do this?
EDIT: Allow me to rephrase. (Protip: don't ask online when you are frustrated by the problem)
Problem 1 again: I could just move the rows I need to a different file, but if I need to do so many times it becomes increasingly cumbersome. Is there a way to select a combination of filters (for example) for the two columns in the contingency table?
I was using filters ("Select Cases...") wrong!
Solution: I first make a filter for the data I want to compare, using the menu Data > Select Cases... (in pseudocode): IF French AND Male.
I label this filter 'FilterA' in the Variable View. It creates a new filter column in the Data View with the value 1 for French men and 0 for every other row.
The women from Chile are now 0, which is what I want for this variable (this is important), but so are all the Chinese participants, the French women and so on.
So I define a second filter: IF Chile AND Female OR French AND Male
Now all other rows are filtered out, as shown by a diagonal line through the row-counter column on the far left in SPSS. When a statistical test is run, the filtered-out rows are not included.
Now I can finally perform the chi-square test: I choose Descriptive Statistics > Crosstabs.
For the column I choose "FilterA", which I created. For the row I choose QuestionX, or whatever variable I want tested.
Fin.
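For readers who want to check the logic outside SPSS, the same two steps (keep only the two groups, then cross-tabulate group against answer) can be sketched in plain Python. The rows below are the 20 sample rows from the question, restricted to Question 1. The statistic is the plain Pearson chi-square without continuity correction, so it may differ from other lines of the SPSS output, and with counts this small it is only an illustration.

```python
from collections import Counter

# country, gender, answer to Question 1 (the 20 sample rows from the post)
DATA = """chile f yes|chile f no|chile m no|chile m yes|chile f yes|
chile f yes|chile m no|france f yes|france f no|france f no|
france m yes|france m yes|france f yes|france f no|china m yes|
china f no|china f no|china f yes|china m yes|china m yes"""
rows = [tuple(item.split()) for item in DATA.replace("\n", "").split("|")]

# Step 1: the "filter", keeping only the two groups being compared.
groups = {("chile", "f"): "Chilean women", ("france", "m"): "French men"}
counts = Counter(
    (groups[(country, gender)], answer)
    for country, gender, answer in rows
    if (country, gender) in groups
)

# Step 2: the contingency table (one row per group, columns yes/no).
labels = sorted(set(groups.values()))
answers = ["yes", "no"]
table = [[counts[(g, a)] for a in answers] for g in labels]


def chi_square(table):
    """Pearson chi-square statistic, no continuity correction."""
    total = sum(map(sum, table))
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


print(labels, table)
print(chi_square(table))
```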

Rank within Groups in Pig 11

Pig question,
I have my data setup the following way.
Function Group Home Name
Rent MX 1 John
Rent MX 1 Jake
Rent MX 1 Pat
Rent DG 2 Jason
Rent DG 6 Patrick
Rent DG 6 Smith
Rent DG 6 Joe
What I want to do is group by Function, Group and Home, and then rank within each group.
Function Group Home Name Rank
Rent MX 1 John 1
Rent MX 1 Jake 2
Rent MX 1 Pat 3
Rent DG 6 Patrick 1
Rent DG 6 Smith 2
Rent DG 6 Joe 3
The RANK function in Pig does not allow me to rank within a group. Any suggestions? A Jython UDF?
Check out the Enumerate UDF in DataFu, it does this for you. http://datafu.incubator.apache.org/docs/datafu/1.1.0/datafu/pig/bags/Enumerate.html
Here are some pointers.
In the Cascading API, I used a Buffer, which lets you iterate over the values of each group.
I read that Cascading also has an API for Jython developers; you may want to explore that.
OK, this worked:
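def num_bag(input):
    # `num` is not defined in the post; the built-in enumerate is assumed here.
    output = []
    for rank, item in enumerate(input):
        # Prepend the 0-based position to each tuple in the bag.
        output.append(tuple([rank] + list(item)))
    return output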
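The same idea can be tried outside Pig. The sketch below is plain Python, not a Pig script: it groups the sample rows by (Function, Group, Home) and numbers the rows inside each group starting at 1. The function name `rank_within_groups` and the `key_len` parameter are illustrative assumptions.

```python
from itertools import groupby

# Sample rows from the question: (Function, Group, Home, Name)
ROWS = [
    ("Rent", "MX", 1, "John"), ("Rent", "MX", 1, "Jake"),
    ("Rent", "MX", 1, "Pat"), ("Rent", "DG", 2, "Jason"),
    ("Rent", "DG", 6, "Patrick"), ("Rent", "DG", 6, "Smith"),
    ("Rent", "DG", 6, "Joe"),
]


def rank_within_groups(rows, key_len=3):
    """Append a 1-based rank to each row, restarting for every group."""
    def key(row):
        return row[:key_len]

    ranked = []
    # groupby needs equal keys to be adjacent; sorted() is stable, so the
    # original order inside each group is preserved.
    for _, members in groupby(sorted(rows, key=key), key=key):
        for rank, row in enumerate(members, start=1):
            ranked.append(row + (rank,))
    return ranked


ranked = rank_within_groups(ROWS)
for row in ranked:
    print(row)
```

In Pig, the equivalent would be a GROUP on the three fields followed by a UDF that enumerates each bag.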