select lowest percentage values - vba

With a set of data that has for each user an age and a score, I would like to select and plot the lowest 15% of the scores for each age.
Is there a way to simply take out the lowest 15% of scores for each age? Preferably this would be automated (I don't want to have to repeat the process for all 100 ages).
I have tried conditional formatting, but then I would have to manually select what data to keep. I suspect it needs a complicated IF function that hopefully I won't have to rewrite for each age.
I've created a column that ranks each score against the other scores for that age:
RANK(B4,$B$4:$B$426,1)
Then in a new column convert that to a percentage of the total number of entries for that age:
(E4/COUNT($E$4:$E$426))*100
Then an IF statement that copies the score only if it is in the bottom 15% of scores:
IF(F4<15,B4,"-")
This process is long and messy, and it doesn't seem logical to repeat it 100 times, so how do I automate it?
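The three helper columns above (rank within the age, rank as a percentage of the count, keep if under 15) can be collapsed into one grouped pass. What follows is a minimal sketch in Python rather than VBA, assuming the data is available as (age, score) pairs. The function name `lowest_percent_by_age` is hypothetical, and ties are assumed to share the lowest rank, as RANK(..., 1) does.

```python
from collections import defaultdict


def lowest_percent_by_age(records, percent=15.0):
    """Group (age, score) pairs by age and keep the scores whose
    ascending rank, as a percentage of that age's count, is below `percent`."""
    by_age = defaultdict(list)
    for age, score in records:
        by_age[age].append(score)

    selected = {}
    for age, scores in by_age.items():
        ordered = sorted(scores)
        count = len(ordered)
        kept = []
        for score in ordered:
            # Ascending rank; tied scores share the lowest rank (assumption
            # chosen to mirror the RANK(..., 1) column in the question).
            rank = ordered.index(score) + 1
            if rank / count * 100 < percent:
                kept.append(score)
        selected[age] = kept
    return selected


if __name__ == "__main__":
    # Hypothetical data: twenty scores for age 30, three for age 40.
    data = [(30, s) for s in range(1, 21)] + [(40, 5), (40, 7), (40, 9)]
    print(lowest_percent_by_age(data))
```

With a strict "less than 15" test, an age with only a handful of scores may keep nothing, because even rank 1 can exceed 15% of the count.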

Related

How to calculate dynamic % of grand total as a measure on Power BI?

I have the below table connected into Power BI and I am looking for ways to create a formula calculating % of grand total of the Rating column and further subtracting with targets for each rating. For example, the % of grand total for Rating 1 is 3 divided by 7 (42.86%). The most important part of the formula is the denominator which has to remain at a total level and dynamic for any filters applied to either Grade or BU columns. For example, denominator at a total level would be 7 and when filtered down to Academy BU should be 3.
Sample Data Table:
Rating Target Table:
I want the end result to look like this,
I have used the following formula to achieve this,
Measure created: % of total calc = DIVIDE(COUNT('Table'[Rating]),CALCULATE(SUM('Table'[Count]),'Table'[Rating]))
To make the above formula work I had to add an extra column and include ones in it (see below)
I want to know if there are other ways of achieving this outcome?
ALLEXCEPT can produce this result: it removes the filters on the dimensions used in the visual while keeping mandatory filters such as date. There is one condition: rating, date, and any other dimension involved must be in the same table.
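The behaviour asked for (a denominator that ignores the Rating split but still follows filters on Grade or BU) can be illustrated outside Power BI. This is only a sketch in Python, not DAX. The rows are hypothetical, built to match the numbers quoted in the question (7 rows in total, 3 of them with Rating 1, 3 of them in the Academy BU), and `pct_of_total` is an invented helper name.

```python
from collections import Counter

# Hypothetical rows, consistent with the totals quoted in the question.
ROWS = [
    {"rating": 1, "bu": "Academy"},
    {"rating": 1, "bu": "Academy"},
    {"rating": 2, "bu": "Academy"},
    {"rating": 1, "bu": "Other"},
    {"rating": 2, "bu": "Other"},
    {"rating": 3, "bu": "Other"},
    {"rating": 3, "bu": "Other"},
]


def pct_of_total(rows, bu=None):
    """Share of each rating among the rows left after the optional BU filter.

    The denominator is the filtered row count, so it is the same for every
    rating but changes when the BU filter changes.
    """
    filtered = [r for r in rows if bu is None or r["bu"] == bu]
    total = len(filtered)
    counts = Counter(r["rating"] for r in filtered)
    return {rating: n / total for rating, n in counts.items()}


if __name__ == "__main__":
    print(pct_of_total(ROWS))             # denominator is 7
    print(pct_of_total(ROWS, "Academy"))  # denominator is 3
```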

Find outliers to each band of records

The goal is to find extremely small or large records for each band based on a formula.
Input:
Distance Rate
10 5
25 200
50 300
1000 5
2000 2000
Bands are defined by my input. For example, I want to have two bands for this input (actually there are more, like 10 bands) for distance: 1-100, 101-10000.
For each band, we want to find all records whose rates are outliers according to formula f (two standard deviations away from the mean, if you are interested in the formula).
The formula f I want to use:
(Rate- avg(Rate) over ()) / (stddev(Rate) over ()) > 2
Output:
Distance Rate
10 5
1000 5 (this number is for illustrative purpose only.)
The difficult part is I do not know how to do it for each band, and it makes applying formula more difficult.
Without knowing how you intend to apply your formula (my guess would be UDF), you can create your "bands" by grouping by a CASE expression:
GROUP BY CASE
WHEN Distance BETWEEN 1 AND 100 THEN 'Band1'
WHEN Distance BETWEEN 101 AND 10000 THEN 'Band2'
ETC
END
Similarly, you can use the same CASE expression in a RANK() OVER () function, if that works better for the rest of your query.
EDIT: based on your clarification, you need to handle this with a correlated sub-query in your WHERE clause. I would consider encapsulating it in a UDF to make the main query look cleaner. Something like:
WHERE (Rate - {Correlated query to select the AVG(rate) of all rows in this band (using the above CASE expression to determine "this band")} over ()) / (stddev(Rate) over ()) > 2
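One way to apply the formula per band is to compute the band statistics once in a grouped sub-query and join them back, instead of using a correlated sub-query. The following is a rough sketch using Python's built-in sqlite3 module, under several assumptions: the table name `records` and the function name `find_band_outliers` are hypothetical, only the two example bands are defined, and because SQLite has no STDDEV function the test is rewritten as a squared deviation compared against the population variance. Comparing squares also catches rates far below the mean, which the one-sided formula in the question would not.

```python
import sqlite3


def find_band_outliers(rows, threshold=2.0):
    """Return (distance, rate, band) for rows whose rate lies more than
    `threshold` population standard deviations from its band's mean."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (distance REAL, rate REAL)")
    conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
    query = """
        WITH banded AS (
            SELECT distance, rate,
                   CASE WHEN distance BETWEEN 1 AND 100 THEN 'Band1'
                        WHEN distance BETWEEN 101 AND 10000 THEN 'Band2'
                   END AS band
            FROM records
        ),
        stats AS (
            -- Population variance per band: E[x^2] - (E[x])^2
            SELECT band,
                   AVG(rate) AS mean,
                   AVG(rate * rate) - AVG(rate) * AVG(rate) AS variance
            FROM banded
            GROUP BY band
        )
        SELECT b.distance, b.rate, b.band
        FROM banded AS b
        JOIN stats AS s ON s.band = b.band
        -- (rate - mean)^2 > threshold^2 * variance  <=>  |z-score| > threshold
        WHERE (b.rate - s.mean) * (b.rate - s.mean) > ? * ? * s.variance
        ORDER BY b.distance
    """
    result = conn.execute(query, (threshold, threshold)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # Hypothetical data: nine ordinary rates and one extreme rate in Band1.
    sample = [(d, 10) for d in range(1, 10)] + [(10, 1000)]
    sample += [(200, 5), (300, 5), (400, 5)]
    print(find_band_outliers(sample))
```

Note that with only two or three records in a band, as in the sample input from the question, no record can be two standard deviations from the band mean, so the bands need enough rows for the test to be meaningful.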

Confused on this assignment, any guidance?

The Problem:
First create a table called amttopay that has three fields: rec_no, idno and amt (make amount a numeric field that can hold 3 decimal places). You are also going to use a copy of the donor table for this assignment. Take in a number that matches an idno on the donor table. Check the yrgoal for that record. If it is larger than 500 then double it to create a new goal and write four records on the amttopay table containing the quarterly payment number (1 through 4), the idno, and the quarterly amount to pay to achieve the new goal. If it is not larger than 500 then add 50% to the goal to make the new goal and process it by writing the four records with the same information.
I've created the table, and I understand I've gotta write PL/SQL code to accomplish this, but what I'm not understanding is how the question is worded.
"If it is larger than 500 then double it to create a new goal and write four records on the amttopay table containing the quarterly payment number (1 through 4), the idno, and the quarterly amount to pay to achieve the new goal."
What does that mean? How would I go about bringing logic into this?
Thanks so much for the help.
Assuming you are trying to actually understand the question, this is how you would do it:
Break your statement into parts:
Check the yrgoal for that record.
If it is larger than 500 then
double it to create a new goal
and write four records on the amttopay table containing the quarterly payment number (1 through 4), the idno, and the quarterly amount to pay to achieve the new goal.
If it is not larger than 500 then
add 50% to the goal to make the new goal
and process it by writing the four records with the same information.
Simplified, this gives the following:
Create new record
if yrgoal > 500 then
    double yrgoal
    Create 4 records with idno and the quarterly amount
else
    yrgoal * 1.5
    Create 4 records as before
The rest is up to you, of course …
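To make the branching concrete, here is a small sketch of the same logic in Python rather than PL/SQL. It assumes that rec_no holds the quarterly payment number and that the quarterly amount is the new goal divided by four, rounded to 3 decimal places; the function name `quarterly_payments` is hypothetical. Reading the donor row and inserting into amttopay are left out.

```python
def quarterly_payments(idno, yrgoal):
    """Return the four (rec_no, idno, amt) rows for one donor."""
    # Goals above 500 are doubled; all other goals are increased by 50%.
    if yrgoal > 500:
        new_goal = yrgoal * 2
    else:
        new_goal = yrgoal * 1.5
    # Assumption: four equal payments, stored with 3 decimal places.
    amt = round(new_goal / 4, 3)
    return [(rec_no, idno, amt) for rec_no in range(1, 5)]


if __name__ == "__main__":
    # 11111 is a made-up idno used only for illustration.
    for row in quarterly_payments(11111, 600):
        print(row)
```

In PL/SQL the same shape would be an IF/ELSE that sets the new goal, followed by a loop from 1 to 4 that inserts one row per quarter.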

No sum() with null values

I'm looking for a solution to create sums of roughly 10 scores and targets of a product over 6 different dimensions. There are some more I won't bother you with. I need a total for every dimension. For example:
SalesPeriod. Product: Bikes. Dimensions: bmx, size, colours, with bars etc. Targets: 1,2,3,4,5. Scores:1,2,3,4,5.
So 10 totals for bmx bikes with size x, colour red and bars, and 10 totals for bmx bikes, size x, colour red etc etc.
However, every score needs to be calculated only when none of the underlying values is a null. For example score 1 contains a null then no calculation, but score 2 does not contain a null thus should be calculated.
At this point the calculation is done via a case statement which basically checks the values within each column/score and only calculates the total when the count of scores is equal to the expected number of rows.
The calculation requires a lot of cpu and with a larger dataset this is very inefficient and it simply takes too long.
I'm looking for a solution that will be much more efficient. What could be my best option to try?
You can first filter (or group by) the products that have only non-null values, using your same count method. I don't think there is any other method.
SELECT columnid, SUM(column1)
FROM table
GROUP BY columnid
HAVING COUNT(column1)=COUNT(*);
Then you can join it on columnid with another similar query on another columnN as well.
(I'm not sure if I understood your problem completely, but you basically want an efficient query with sum(scores) and sum(targets) only when they are not null? Or only when they are both not null? Or only scores? Or only targets?)
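The HAVING filter works because COUNT(column1) skips nulls while COUNT(*) does not, so the two are equal only for groups with no null in that column. A minimal way to check this is the sketch below, which uses Python's built-in sqlite3 module. The table name `scores` and the sample rows are hypothetical; the column names follow the query above.

```python
import sqlite3


def sums_without_nulls(rows):
    """Sum column1 per columnid, keeping only groups that contain no NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE scores (columnid TEXT, column1 INTEGER)")
    conn.executemany("INSERT INTO scores VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT columnid, SUM(column1)
        FROM scores
        GROUP BY columnid
        HAVING COUNT(column1) = COUNT(*)  -- equal only when no NULL is present
        ORDER BY columnid
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # Group 'b' contains a NULL (None), so it is dropped from the result.
    print(sums_without_nulls([("a", 1), ("a", 2), ("b", 3), ("b", None)]))
```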

Retrieve names by ratio of their occurrence

I'm somewhat new to SQL queries, and I'm struggling with this particular problem.
Let's say I have query that returns the following 3 records (kept to one column for simplicity):
Tom
Jack
Tom
And I want to have those results grouped by the name and also include the fraction (ratio) of the occurrence of that name out of the total records returned.
So, the desired result would be (as two columns):
Tom | 2/3
Jack | 1/3
How would I go about it? Determining the numerator is pretty easy (I can just use COUNT() and GROUP BY name), but I'm having trouble translating that into a ratio out of the total rows returned.
SELECT name, COUNT(name) * 1.0 / (SELECT COUNT(*) FROM names) AS ratio FROM names GROUP BY name;
Since the denominator is fixed, the "ratio" is directly proportional to the numerator. Unless you really need to show the denominator, it'll be a lot easier to just use something like:
select name, count(*) from your_table_name
group by name
order by count(*) desc
and you'll get the right data in the right order, but the number that's shown will be the count instead of the ratio.
If you really want that denominator, you'd do a count(*) on a non-grouped version of the same select -- but depending on how long the select takes, that could be pretty slow.
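For the three-row example from the question, the count and the total can be fetched in one query and formatted as a fraction afterwards. This is a sketch using Python's built-in sqlite3 module. The table name `names` matches the first answer, while the function name `name_ratios` is hypothetical.

```python
import sqlite3


def name_ratios(names):
    """Return (name, count, total) per distinct name, most frequent first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE names (name TEXT)")
    conn.executemany("INSERT INTO names VALUES (?)", [(n,) for n in names])
    result = conn.execute(
        """
        SELECT name,
               COUNT(*) AS n,
               (SELECT COUNT(*) FROM names) AS total  -- non-grouped denominator
        FROM names
        GROUP BY name
        ORDER BY n DESC, name
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for name, n, total in name_ratios(["Tom", "Jack", "Tom"]):
        print(f"{name} | {n}/{total}")
```

Keeping the numerator and denominator as separate integer columns avoids integer division inside the database and leaves the display format (fraction or decimal) to the caller.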