Set analysis not working as intended - QlikView

I'm creating a dashboard for a school project, and I just came to the conclusion that my set analysis is not working correctly. I have a funnel chart with the turnover of the employees who have been working at the company for less than 20 years.
I have the same funnel chart for employees who have been working at the company for more than 20 years. Now when I click on, for example, the red segment, all level 2 sales employees who have worked <20 years should be displayed in this table:
But it still shows the people who have been working there for longer than 20 years.
This is my code:
sum({1<SALES_STAFF_WORKEXPERIENCE_nr = {"<20"}, SALES_STAFF_POSITION_en = {'Level 1 Sales Representative', 'Level 2 Sales Representative', 'Level 3 Sales Representative'}>} ORDER_DETAILS_turnover)
sum({1<SALES_STAFF_WORKEXPERIENCE_nr = {">=20"}, SALES_STAFF_POSITION_en = {'Level 1 Sales Representative', 'Level 2 Sales Representative', 'Level 3 Sales Representative'}>} ORDER_DETAILS_turnover)

This is a common problem with set analysis. Because your set modifier specifies
SALES_STAFF_POSITION_en = {'Level 1 Sales Representative', 'Level 2 Sales Representative', 'Level 3 Sales Representative'}
any selection made in that field is ignored, so all three levels are always displayed.
You need to make it the intersection of the current selection and your set. It's a small change, but one that is often overlooked, see below:
sum({1<SALES_STAFF_WORKEXPERIENCE_nr = {"<20"}, SALES_STAFF_POSITION_en = p(SALES_STAFF_POSITION_en)* {'Level 1 Sales Representative', 'Level 2 Sales Representative', 'Level 3 Sales Representative'}>} ORDER_DETAILS_turnover)
sum({1<SALES_STAFF_WORKEXPERIENCE_nr = {">=20"}, SALES_STAFF_POSITION_en =p(SALES_STAFF_POSITION_en)* {'Level 1 Sales Representative', 'Level 2 Sales Representative', 'Level 3 Sales Representative'}>} ORDER_DETAILS_turnover)
The p() means the possible values in the field, and * intersects them with the listed values. I think there is a simpler syntax, but I don't use it because this version reads more easily for me.
Also, if those are the only three levels, you don't need to include them in the set analysis at all, unless you actually want to override any selections that get made in that dimension.
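If it helps to see the intersection idea outside of QlikView, here is a rough Python sketch. It only models the set logic, not QlikView itself; the function name effective_positions and the 'Sales Manager' value are made up for illustration.

```python
# The three levels hard-coded in the set modifier.
FIXED_LEVELS = {
    "Level 1 Sales Representative",
    "Level 2 Sales Representative",
    "Level 3 Sales Representative",
}


def effective_positions(possible_values):
    """Mimic p(field) * {...}: keep only the values present in both sets."""
    return set(possible_values) & FIXED_LEVELS


# Hypothetical selection: clicking "Level 2" leaves one possible value.
print(effective_positions({"Level 2 Sales Representative"}))

# Nothing selected: every value is possible, so only the fixed list survives.
# "Sales Manager" is an invented extra value, dropped by the intersection.
print(effective_positions(FIXED_LEVELS | {"Sales Manager"}))
```

Without the intersection, the fixed list would be used as-is whatever the user clicks, which is the behaviour described in the question.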

Only Show unique Customers per date cohort for repeat purchase rate

Scenario:
I have a table that holds all customer purchases by month, and each month has a period number. Within that table I am showing the customers who made purchases in each month/period. What I am trying to figure out is how to exclude any customer who already made a purchase in a previous month, so that repeat purchases are only counted for unique customers. The data looks like the following:
| customer_email | cohortMonth | month_number | orders_for_period |
| --- | --- | --- | --- |
| abc#gmail.com | 10/2019 | 0 | 2 |
| def#gmail.com | 10/2019 | 0 | 1 |
| ghi#gmail.com | 10/2019 | 0 | 1 |
| def#gmail.com | 10/2019 | 1 | 1 |
| abc#gmail.com | 10/2019 | 1 | 1 |
| def#gmail.com | 10/2019 | 2 | 1 |
In the table above, for month_number = 0 we have 3 customers in total, and within this period abc#gmail.com was the only repeat customer because they placed 2 orders. This gives a 33% repeat purchase rate for month_number 0. For month_number = 1, 2 customers purchased again, but only def#gmail.com is new to the repeat count, as abc#gmail.com was already counted in month 0. This brings the repeat rate to 66%, as 2 of the 3 customers who originally purchased have now come back and purchased again.
| cohortMonth | month_number | repeat_purchase_rate |
| --- | --- | --- |
| 10/2019 | 0 | 33% |
| 10/2019 | 1 | 66% |
| 10/2019 | 2 | 66% |
With every unique customer who purchases in a subsequent period, we want to add them to the total so we can understand the repeat rate at a cumulative level.
I have tried a ton of different ways to figure this out, but backing out the customers who made purchases in a previous period and showing only the unique customers is where I am struggling. Any help is greatly appreciated!
Side note: whenever I format a table, it looks the way I want in the preview, but when I review the post I get the error: "Your post appears to contain code that is not properly formatted as code. Please indent all code by 4 spaces using the code toolbar button or the CTRL+K keyboard shortcut. For more editing help, click the [?] toolbar icon."
I then indent and it breaks the way the table looks. Any help on that would be great as well. Thank you.
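One possible way to get the cumulative figures described above, sketched in Python rather than in any particular BI tool. It assumes a customer counts as a repeat customer from the first period in which their running order total within the cohort reaches 2, and that the cohort size is the number of distinct customers seen in month_number 0. The function name cumulative_repeat_rate is made up for this sketch.

```python
from collections import defaultdict

# Sample rows from the question: (customer_email, cohortMonth, month_number, orders_for_period)
rows = [
    ("abc#gmail.com", "10/2019", 0, 2),
    ("def#gmail.com", "10/2019", 0, 1),
    ("ghi#gmail.com", "10/2019", 0, 1),
    ("def#gmail.com", "10/2019", 1, 1),
    ("abc#gmail.com", "10/2019", 1, 1),
    ("def#gmail.com", "10/2019", 2, 1),
]


def cumulative_repeat_rate(rows):
    """Return (cohort, month_number, rate) tuples, counting each repeater once."""
    cohort_customers = defaultdict(set)                      # cohort -> customers in month 0
    orders_by_period = defaultdict(lambda: defaultdict(int))  # (cohort, month) -> customer -> orders
    for email, cohort, month, orders in rows:
        if month == 0:
            cohort_customers[cohort].add(email)
        orders_by_period[(cohort, month)][email] += orders

    running_orders = defaultdict(lambda: defaultdict(int))   # cohort -> customer -> running total
    repeaters = defaultdict(set)                             # cohort -> customers already counted
    result = []
    for cohort, month in sorted(orders_by_period):
        for email, orders in orders_by_period[(cohort, month)].items():
            running_orders[cohort][email] += orders
            if running_orders[cohort][email] >= 2:
                repeaters[cohort].add(email)  # a set, so nobody is counted twice
        size = len(cohort_customers[cohort])
        rate = len(repeaters[cohort]) / size if size else 0.0
        result.append((cohort, month, rate))
    return result


for cohort, month, rate in cumulative_repeat_rate(rows):
    print(cohort, month, f"{rate:.0%}")
```

Keeping the repeaters in a set is what backs out customers who were already counted in an earlier period. Note that 2/3 is 66.7%, so it will show as 67% when rounded rather than the truncated 66% in the table above.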

How do I return the latest month's submission from each site in a database?

Hi, I'm trying to update one of my pre-extraction check queries to ensure that all submitters have made their submissions before I extract the updated data, as the final data set is something like 49 columns by 80k+ rows.
Currently my test code returns a distinct list of submitters filtered for a certain year (financial year, in the format YY/YY) and the current period (financial month, in the format M). I restrict the dataset, which covers all submitters, to the region I cover, and then manually change the year or month to the correct values for the current period.
What I am hoping to do is change the code that says
SELECT DISTINCT
site,
activity_period
WHERE activity_period= '9'
AND (site = 'site 1'
OR site = 'site 2'
OR site = 'site x')
AND year = 'yy/yy';
FROM submissions
What I want to do is change this from needing the month to be entered manually each month into a statement that returns the max month in a given year for each of the sites in the site = statement. I'd then order this by month in ascending order so I can see the sites yet to submit and chase them accordingly. Does anyone know how I would do this?
Additionally, for a separate check I'm working on, it would be great to know how I could set the year/month to the current period by asking for the max value that the database contains.
Edit notes: changed month to activity_period so the columns make sense, and added a FROM statement. Edit 2: added a couple of sample tables, one showing a few lines that may be in the data and the other showing what the outcome should look like.
sample data
| site| value |submission month|
| --- | --- |--- |
| site 1| 40 |1|
| site 1| 40 |2|
| site 2| 5 |1|
| site 3| 400 |1|
| site 3| 409 |2|
| site 3| 4 |3|
output of query
| site| latest month received|
| ---|---|
| site 2|1|
| site 1|2|
| site 3|3|
I am not interested in the data itself (the value of the submission or other fields I haven't put in the dummy data). I just want to know which sites have not yet made submissions, by having the sites without the latest month's data (3 in the dummy data) shown above those with the latest data, so at a glance I can say whether or not I have all 18 of my sites in the latest month.

I just want to start by rewriting your query with the WHERE clause after FROM:
SELECT DISTINCT
site,
activity_period
FROM submissions
WHERE activity_period= '9'
AND (site = 'site 1'
OR site = 'site 2'
OR site = 'site x')
AND year = 'yy/yy';
If you want to pull only the "latest" month per site, you can join the submissions table to itself:
SELECT DISTINCT
s.site,
s.activity_period
FROM submissions s
inner join (select max(year) year, site from submissions group by site) s2
on s2.site = s.site and s2.year = s.year
WHERE s.activity_period= '9'
AND (s.site = 'site 1'
OR s.site = 'site 2'
OR s.site = 'site x');
The query provided differs a little bit from the sample data. Try running this to see how it works using your sample data
with a (site, submission, month) as(
select 'site 1' site, '40' submission, '1' month
union all select 'site 1', '40', '2'
union all select 'site 2', '5' ,'1'
union all select 'site 3', '400' ,'1'
union all select 'site 3', '409' ,'2'
union all select 'site 3', '4' ,'3'
)
select a.* from a
inner join (select site, max(month) month from a group by site) a2
on a2.site=a.site and a2.month=a.month
Output is as below
| site | submission | month |
| --- | --- | --- |
| site 1 | 40 | 2 |
| site 2 | 5 | 1 |
| site 3 | 4 | 3 |
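Explanation: what we're doing is creating a derived table that holds each site and its max (latest) month. If we join this to our initial table, matching on both month and site, we essentially filter the initial table down to the latest month per site. Note that if the month is stored as text, as in this example, max() compares strings, so '9' sorts after '10'; cast the column to a number if your months go beyond 9.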
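If you only need the site and its latest month, ordered so the laggards come first (as in the desired output in the question), a plain GROUP BY is probably enough. Below is a small sketch you can run with Python's built-in sqlite3 module. The table and column names (submissions, submission_month) are assumed from the sample data, and the month is stored as an integer here so MAX compares numerically.

```python
import sqlite3

# In-memory database loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE submissions (site TEXT, value INTEGER, submission_month INTEGER)"
)
conn.executemany(
    "INSERT INTO submissions VALUES (?, ?, ?)",
    [
        ("site 1", 40, 1),
        ("site 1", 40, 2),
        ("site 2", 5, 1),
        ("site 3", 400, 1),
        ("site 3", 409, 2),
        ("site 3", 4, 3),
    ],
)

# One row per site with its latest month; sites furthest behind are listed first.
latest = conn.execute(
    """
    SELECT site, MAX(submission_month) AS latest_month
    FROM submissions
    GROUP BY site
    ORDER BY latest_month ASC, site ASC
    """
).fetchall()

for site, latest_month in latest:
    print(site, latest_month)
```

With the sample data this lists site 2 first, then site 1, then site 3, matching the "output of query" table in the question. In the real table you would add your year and site filters in a WHERE clause before the GROUP BY.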

Average salary for graduating students by major

I want to get the average salary of graduating students for each major.
Three tables are involved:
Survey_Result(Student_ID, Annual_Salary)
Student_Degree(Student_ID, Degree_ID)
Degree(Degree_ID,Major_Name)
Survey_Result
9320000000, $1000
9320000001, $2000
9320000002, $3000
9320000003, $4000
Student_Degree
9320000000, 1
9320000001, 2
9320000002, 3
9320000003, 4
Degree
1, Accounting
2, Finance
3, Accounting
4, Finance
Required SQL result: Accounting: 2000, Finance: 3000

I think you can do this:
SELECT
AVG(Survey_Result.Annual_Salary),
Degree.Major_Name
FROM
Degree
JOIN Student_Degree
ON Student_Degree.Degree_ID=Degree.Degree_ID
JOIN Survey_Result
ON Student_Degree.Student_ID=Survey_Result.Student_ID
GROUP BY
Degree.Major_Name
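To check the query against the sample data from the question, here is a quick sketch using Python's built-in sqlite3 module. It assumes Annual_Salary is stored as a plain number (without the $ sign); if it is stored as text like '$1000' you would need to strip the symbol and cast it first.

```python
import sqlite3

# In-memory database with the three tables and sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Survey_Result (Student_ID INTEGER, Annual_Salary REAL);
    CREATE TABLE Student_Degree (Student_ID INTEGER, Degree_ID INTEGER);
    CREATE TABLE Degree (Degree_ID INTEGER, Major_Name TEXT);

    INSERT INTO Survey_Result VALUES
        (9320000000, 1000), (9320000001, 2000),
        (9320000002, 3000), (9320000003, 4000);
    INSERT INTO Student_Degree VALUES
        (9320000000, 1), (9320000001, 2),
        (9320000002, 3), (9320000003, 4);
    INSERT INTO Degree VALUES
        (1, 'Accounting'), (2, 'Finance'), (3, 'Accounting'), (4, 'Finance');
    """
)

# Same joins as the query above; grouping by Major_Name merges the two
# Degree_ID rows that share a major.
averages = conn.execute(
    """
    SELECT Degree.Major_Name, AVG(Survey_Result.Annual_Salary) AS Average_Salary
    FROM Degree
    JOIN Student_Degree ON Student_Degree.Degree_ID = Degree.Degree_ID
    JOIN Survey_Result ON Student_Degree.Student_ID = Survey_Result.Student_ID
    GROUP BY Degree.Major_Name
    ORDER BY Degree.Major_Name
    """
).fetchall()

for major, average in averages:
    print(major, average)
```

This gives 2000 for Accounting and 3000 for Finance, which is the result asked for.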

Storing a set of criteria in another table

I have a large table with sales data, useful data below:
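| RowID | Date | Customer | Salesperson | Product_Type | Manufacturer | Quantity | Value |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 01-06-2004 | James | Ian | Taps | Tap Ltd | 200 | £850 |
| 2 | 02-06-2004 | Apple | Fran | Hats | Hats Inc | 30 | £350 |
| 3 | 04-06-2004 | James | Lawrence | Pencils | ABC Ltd | 2000 | £980 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 185352 | 03-09-2012 | Apple | Ian | Washers | Tap Ltd | 600 | £80 |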
I need to calculate a large set of targets from a table containing values of different types. The target table is under my control and so far looks like this:
| TargetID | Year | Month | Salesperson | Target_Type | Quantity |
| --- | --- | --- | --- | --- | --- |
| 1 | 2012 | 7 | Ian | 1 | 6000 |
| 2 | 2012 | 8 | James | 2 | 2000 |
| 3 | 2012 | 9 | Ian | 2 | 6500 |
At present I am working out target types using a view of the first table which has a lot of extra columns:
SELECT YEAR(Date)
, MONTH(Date)
, Salesperson
, Quantity
, CASE WHEN Manufacturer IN ('Tap Ltd','Hats Inc') AND Product_Type = 'Hats' THEN True ELSE False END AS IsType1
, CASE WHEN Manufacturer = 'Hats Inc' AND Product_Type IN ('Hats','Coats') THEN True ELSE False END AS IsType2
...
...
, CASE WHEN Manufacturer IN ('Tap Ltd','Hats Inc') AND Product_Type = 'Hats' THEN True ELSE False END AS IsType24
, CASE WHEN Manufacturer IN ('Tap Ltd','Hats Inc') AND Product_Type = 'Hats' THEN True ELSE False END AS IsType25
FROM SalesTable
WHERE [some stuff here]
This is horrible to read/debug and I hate it!!
I've tried a few different ways of simplifying this but have been unable to get it to work.
The closest I have come is a third table holding the definition of each type, with the values for each field and the type number. This can be joined to the other tables to give me the full values, but I can't work out a way to cope with multiple values for each field.
Finally the question:
Is there a standard way this can be done or an easier/neater method other than one column for each type of target?
I know this is a complex problem so if anything is unclear please let me know.
Edit - What I need to get:
At the very end of the process I need to have targets displayed with actual sales:
| Type | Year | Month | Salesperson | TargetQty | ActualQty |
| --- | --- | --- | --- | --- | --- |
| 2 | 2012 | 8 | James | 2000 | 2809 |
| 2 | 2012 | 9 | Ian | 6500 | 6251 |
Each row of the sales table could potentially satisfy 8 of the types.
Some more points:
1. I have 5 different columns that need to be defined against the targets (or set to NULL to include any value).
2. I have between 30 and 40 different types that need to be defined, and several of the columns could contain as many as 10 different values.
For point 2, if I use a row for each permutation of values, 2 columns with 10 values each would give me 100 rows for each salesperson for each month. That is a lot, but if this is the only way to define multiple values I will have to do it.
Sorry if this makes no sense!

If I am correct that the "Target_Type" field in the target table is based on Manufacturer and Product_Type, then you can create a TargetType table like the one below and JOIN on Manufacturer and Product_Type to get your Target_Type_Value:
| ID | Product_Type | Manufacturer | Target_Type_Value |
| --- | --- | --- | --- |
| 1 | Taps | Tap Ltd | 1 |
| 2 | Hats | Hats Inc | 2 |
| 3 | Coats | Hats Inc | 2 |
| 4 | Hats | Caps Inc | 3 |
| 5 | Pencils | ABC Ltd | 6 |
This should address the "multiple values for each field" problem by having a row for each possible combination.
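Here is a rough sketch of how that join could replace the IsType1 ... IsType25 columns, runnable with Python's built-in sqlite3 module. The sales rows, the lowercase table and column names, and the ISO date format are all invented for the example; only the TargetType rows come from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (
        sale_date TEXT, salesperson TEXT,
        product_type TEXT, manufacturer TEXT, quantity INTEGER
    );
    CREATE TABLE target_type (
        product_type TEXT, manufacturer TEXT, target_type INTEGER
    );

    -- One row per (Product_Type, Manufacturer) combination, as in the table above.
    INSERT INTO target_type VALUES
        ('Taps', 'Tap Ltd', 1), ('Hats', 'Hats Inc', 2), ('Coats', 'Hats Inc', 2),
        ('Hats', 'Caps Inc', 3), ('Pencils', 'ABC Ltd', 6);

    -- Hypothetical sales rows, made up for this sketch.
    INSERT INTO sales VALUES
        ('2012-08-03', 'James', 'Hats', 'Hats Inc', 1000),
        ('2012-08-10', 'James', 'Coats', 'Hats Inc', 500),
        ('2012-08-15', 'James', 'Taps', 'Tap Ltd', 200),
        ('2012-09-01', 'Ian', 'Hats', 'Hats Inc', 300);
    """
)

# The join tags each sale with every type it satisfies; the GROUP BY then
# rolls the quantities up to one row per type, year, month and salesperson.
actuals = conn.execute(
    """
    SELECT t.target_type,
           CAST(strftime('%Y', s.sale_date) AS INTEGER) AS sale_year,
           CAST(strftime('%m', s.sale_date) AS INTEGER) AS sale_month,
           s.salesperson,
           SUM(s.quantity) AS actual_qty
    FROM sales s
    JOIN target_type t
      ON t.product_type = s.product_type
     AND t.manufacturer = s.manufacturer
    GROUP BY t.target_type, sale_year, sale_month, s.salesperson
    ORDER BY t.target_type, sale_year, sale_month, s.salesperson
    """
).fetchall()

for row in actuals:
    print(row)
```

Because the join returns one row per matching definition, a sale that satisfies several types is counted under each of them, which is what the one-column-per-type view was doing. The result can then be joined to the target table on type, year, month and salesperson to put TargetQty next to ActualQty.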

MDX query to use a set but return a single row

I am new to MDX and have just started using Named sets to group several members of a dimension.
Whenever I use a SET in a query, the results returned are always broken out for each individual member of the set. I am looking to get one row for the set.
For example: I have two measures, Sales Dollars and Shipped Units. I then have a State dimension with a member for each of the 50 states in the United States.
I want to see the Sales and Units measures for 3 specific states and then also for a group (named set) of 4 other states.
Example MDX:
With SET [My Favorite States] AS '{[States].[Illinois], [States].[Wisconsin]}'
select NON EMPTY {[Measures].[Sales], [Measures].[Shipped Units]} ON COLUMNS,
NON EMPTY {[States].[Alabama], [States].[New York], [My Favorite States]} ON ROWS
from [cubename]
This returns:
| States | Measures: Sales | Measures: Shipped Units |
| --- | --- | --- |
| Alabama | $100 | 5 |
| New York | $500 | 20 |
| Illinois | $150 | 15 |
| Wisconsin | $900 | 25 |
What I want is for the Set to appear as a total on a single line. Similar to:
| States | Measures: Sales | Measures: Shipped Units |
| --- | --- | --- |
| Alabama | $100 | 5 |
| New York | $500 | 20 |
| My Favorite States | $1,050 | 40 |
Is there an MDX function that will allow the set of specific members to be treated as a group?

You can use a calculated member to aggregate the separate states:
With Member [States].[My Favorite States] AS 'Aggregate({[States].[Illinois], [States].[Wisconsin]})'
select NON EMPTY {[Measures].[Sales], [Measures].[Shipped Units]} ON COLUMNS,
NON EMPTY {[States].[Alabama], [States].[New York], [States].[My Favorite States]} ON ROWS
from [cubename]
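To make the difference between a set and a calculated member concrete, here is a tiny Python model of what the calculated member does with the figures from the question. It is only an illustration, not MDX: Aggregate() applies each measure's own aggregation type, and this sketch assumes both measures are plain sums. The names state_measures and aggregate are made up.

```python
# Figures from the question: state -> (Sales, Shipped Units).
state_measures = {
    "Alabama": (100, 5),
    "New York": (500, 20),
    "Illinois": (150, 15),
    "Wisconsin": (900, 25),
}


def aggregate(members):
    """Collapse several members into one row, assuming additive (sum) measures."""
    sales = sum(state_measures[m][0] for m in members)
    units = sum(state_measures[m][1] for m in members)
    return sales, units


# A set puts each member on its own row; a calculated member is a single row.
report = {
    "Alabama": state_measures["Alabama"],
    "New York": state_measures["New York"],
    "My Favorite States": aggregate(["Illinois", "Wisconsin"]),
}

for name, (sales, units) in report.items():
    print(f"{name}: ${sales:,} {units}")
```

The last row comes out as $1,050 and 40, the single total line asked for in the question.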
