How to compute percentage of total based on Counts of 1 field and filtered by another field - sql

Data and Desired Result:
I have the above data. I would like to compute the percentage and show the corresponding record counts. The CountKey is a concatenation of 3 fields, and I only want to count it when it is unique by LastName. I would then like to find the percentage of total for each Status type by last name. CountKeyTotal is the total count of unique CountKeys for a last name (Smith in this example); CountKey is the count of unique CountKeys by LastName and Status.
I am fairly new to SQL and have only been able to get either totals in whole (as an example using the data provided, Smith 40 3 12 25%
Any help would be appreciated.

You can use a GROUP BY together with a derived table (an inline subquery) that holds the total per name:
select
a.name, a.status
, count(distinct a.countKey) as CountKey
, b.c2 as CountKeyTotal
, count(distinct a.countKey) * 100.0 / b.c2 as percentage
from my_table as a
inner join (select name, count(distinct countKey) as c2 from my_table
group by name) b on a.name = b.name
group by a.name, a.status, b.c2
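To check the idea end to end, here is a minimal sketch that runs a distinct-count version of this query through Python's built-in sqlite3 module. The rows are made up for illustration (the question's data was only posted as an image), and the column names name, status and countKey follow the answer rather than the asker's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table my_table (name text, status text, countKey text)")

# Hypothetical rows: Smith has 4 distinct keys (k1 appears twice), Jones has 2.
conn.executemany(
    "insert into my_table values (?, ?, ?)",
    [
        ("Smith", "Open", "k1"),
        ("Smith", "Open", "k1"),  # duplicate key, counted once
        ("Smith", "Closed", "k2"),
        ("Smith", "Closed", "k3"),
        ("Smith", "Closed", "k4"),
        ("Jones", "Open", "k5"),
        ("Jones", "Closed", "k6"),
    ],
)

# Per (name, status) distinct count, joined to the per-name distinct total.
percentages = conn.execute(
    """
    select a.name, a.status,
           count(distinct a.countKey) as CountKey,
           b.c2 as CountKeyTotal,
           count(distinct a.countKey) * 100.0 / b.c2 as percentage
    from my_table as a
    inner join (select name, count(distinct countKey) as c2
                from my_table
                group by name) b on a.name = b.name
    group by a.name, a.status, b.c2
    order by a.name, a.status
    """
).fetchall()

for row in percentages:
    print(row)
```

Multiplying by 100.0 rather than 100 keeps the division from being done in integer arithmetic, which would otherwise truncate most percentages to 0.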

Related

Summing numbers don't match up?

I have a table with data like this (acc_v in query):
value | id
100 | 1
300 | 2
200 | 1
As you can see there are multiple values per id. I want to sum all of the values across the ids and end up with this:
value | id
300 | 1
300 | 2
;with accV as (SELECT
d_id
,[period_end_date]
,max(a_value) as value
,id
FROM bh
where period_end_date = '2021-6-30'
group by d_id, period_end_date, id)
SELECT bh.id, sum(value)
FROM bh join accV on accV.id = bh.id
group by bh.id
For some reason, the totals come out significantly larger than they should be. I verified this by summing the original values in Excel. If anyone knows what I am doing wrong, the help is much appreciated.
You can use window functions. This returns the overall total across all ids on each id's row:
select id, sum(sum(value)) over () as total_sum
from t
group by id;
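The inflated totals in the question look like join fan-out: joining the detail table back to a one-row-per-id summary repeats each summary value once per detail row. The sketch below reproduces that with Python's sqlite3 on a simplified table t(value, id) holding the sample rows. The names t and s are placeholders, not the asker's schema, and the windowed total is written over a subquery, which needs SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (value integer, id integer)")
conn.executemany("insert into t values (?, ?)", [(100, 1), (300, 2), (200, 1)])

# Fan-out: id 1 has two detail rows, so its summary value (200) is summed twice.
inflated = conn.execute(
    """
    select t.id, sum(s.value)
    from t
    join (select id, max(value) as value from t group by id) s on s.id = t.id
    group by t.id
    order by t.id
    """
).fetchall()

# A plain GROUP BY gives the per-id sums the question asks for.
per_id = conn.execute(
    "select id, sum(value) from t group by id order by id"
).fetchall()

# Window function over the grouped rows: per-id sum plus the overall total.
with_total = conn.execute(
    """
    select id, value, sum(value) over () as total_sum
    from (select id, sum(value) as value from t group by id)
    order by id
    """
).fetchall()

print(inflated)    # [(1, 400), (2, 300)]
print(per_id)      # [(1, 300), (2, 300)]
print(with_total)  # [(1, 300, 600), (2, 300, 600)]
```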

GROUP BY one column, then GROUP BY another column

I have a database table t with sales data:
ID | TYPE | AGE
1 | B | 20
1 | BP | 20
1 | BP | 20
1 | P | 20
2 | B | 30
2 | BP | 30
2 | BP | 30
3 | P | 40
If a person buys a bundle, the table contains the bundle sale (TYPE B) and the individual bundle products (TYPE BP), all with the same ID. So a bundle with 2 products appears 3 times (1x TYPE B and 2x TYPE BP) under the same ID.
A person can also buy any other product in that same sale (TYPE P), which also has the same ID.
I need to calculate the average/min/max age of the customers, but the multiple entries per sale distort the calculation.
The real average age is
(20 + 30 + 40) / 3 = 30
and not
(20+20+20+20 + 30+30+30 + 40) / 8 = 26.25
But I don't know how I can reduce the sales to a single row entry AND get the 4 needed values?
Do I need to GROUP BY twice (first by ID, then by AGE?) and if yes, how can I do it?
My code so far:
SELECT
AVERAGE(AGE)
, MIN(AGE)
, MAX(AGE)
, MEDIAN(AGE)
FROM t
but that counts every row.
Assuming the age is the same for all rows with the same ID (which in itself indicates a normalisation problem), you can use nested aggregation:
select avg(min(age)) from sales
group by id
AVG(MIN(AGE))
-------------
30
The example in the documentation is very similar, and is explained as:
This calculation evaluates the inner aggregate (MAX(salary)) for each group defined by the GROUP BY clause (department_id), and aggregates the results again.
So for your version:
This calculation evaluates the inner aggregate (MIN(age)) for each group defined by the GROUP BY clause (id), and aggregates the results again.
It doesn't really matter whether the inner aggregate is min or max - again, assuming they are all the same - it's just to get a single value per ID, which can then be averaged.
You can do the same for the other values in your original query:
select
avg(min(age)) as avg_age,
min(min(age)) as min_age,
max(min(age)) as max_age,
median(min(age)) as med_age
from sales
group by id;
AVG_AGE MIN_AGE MAX_AGE MED_AGE
------- ------- ------- -------
30 20 40 30
Or, if you prefer, you could get the one-age-per-ID values once in a CTE or subquery and apply the second layer of aggregation to that:
select
avg(age) as avg_age,
min(age) as min_age,
max(age) as max_age,
median(age) as med_age
from (
select min(age) as age
from sales
group by id
);
which gets the same result.
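The subquery form is also the portable one, since not every database accepts nested aggregates such as avg(min(age)). As a rough sketch, here it is run through Python's sqlite3 on the question's sample rows. SQLite has no built-in MEDIAN, so this example takes the median in Python with statistics.median, which is a substitution for the median() call above rather than the same technique.

```python
import sqlite3
from statistics import median

conn = sqlite3.connect(":memory:")
conn.execute("create table sales (id integer, type text, age integer)")
conn.executemany(
    "insert into sales values (?, ?, ?)",
    [
        (1, "B", 20), (1, "BP", 20), (1, "BP", 20), (1, "P", 20),
        (2, "B", 30), (2, "BP", 30), (2, "BP", 30),
        (3, "P", 40),
    ],
)

# Naive average over every row: each sale is weighted by its row count.
naive_avg = conn.execute("select avg(age) from sales").fetchone()[0]

# Collapse to one age per id first, then aggregate that result.
avg_age, min_age, max_age = conn.execute(
    """
    select avg(age), min(age), max(age)
    from (select min(age) as age from sales group by id)
    """
).fetchone()

# Median computed in Python because SQLite lacks a MEDIAN aggregate.
ages = [row[0] for row in conn.execute("select min(age) from sales group by id")]
med_age = median(ages)

print(naive_avg)                            # 26.25
print(avg_age, min_age, max_age, med_age)   # 30.0 20 40 30
```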

SQL: Checking value counts of a column

I'd like to check if a column in a table has values with a small number of value counts.
Consider the following table as an example:
RowID |Product
1 | A
2 | A
3 | B
...
200.000 | C
The following table is an aggregation of the table above:
Product |Count
A |204
B |682
C |553
D |1402
E |30855
F |357
G |1
H |542
What I'd like to know about the Product column of my table is whether any Product has a count that is less than 5% of the total. If so, the SQL statement should return: 'Some values of this field have a small number of value counts'
In other words: IF [MinValueCount]/[Count] <= .05 then 'Some values of this field have a small number of value counts' else 'null'
With the example above, I should get: 'Some values of this field have a small number of value counts'
as product G is less than 5% of the total count of products.
What should the SQL statement look like?
With kind regards,
Lazzanova
Use two levels of aggregation. You can get the total using window functions:
select max('Some values of this field have a small number of value counts')
from (select product, count(*) as cnt,
sum(count(*)) over () as total_cnt
from t
group by product
) t
where cnt < 0.05 * total_cnt;
The use of max() in the outer query is just to return one row. You could also use fetch or a similar clause (whatever your database supports):
select 'Some values of this field have a small number of value counts'
from (select product, count(*) as cnt,
sum(count(*)) over () as total_cnt
from t
group by product
) t
where cnt < 0.05 * total_cnt
fetch first 1 row only;
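Here is a small sketch of the same check run through Python's sqlite3, with the table filled from the aggregated counts in the question. It assumes SQLite 3.25 or later for window functions, and it computes the per-product count in a subquery before applying sum(cnt) over (), which is an equivalent but more conservative spelling of sum(count(*)) over ().

```python
import sqlite3

MESSAGE = "Some values of this field have a small number of value counts"

conn = sqlite3.connect(":memory:")
conn.execute("create table t (product text)")

# Rebuild one row per sale from the aggregated counts in the question.
counts = {"A": 204, "B": 682, "C": 553, "D": 1402,
          "E": 30855, "F": 357, "G": 1, "H": 542}
for product, n in counts.items():
    conn.executemany("insert into t values (?)", [(product,)] * n)

# Inner level: count per product. Middle level: attach the overall total.
flagged = """
    from (select product, cnt, sum(cnt) over () as total_cnt
          from (select product, count(*) as cnt from t group by product))
    where cnt < 0.05 * total_cnt
"""

# max() collapses the matching rows to a single row (NULL if none match).
message = conn.execute("select max(?) " + flagged, (MESSAGE,)).fetchone()[0]

# Listing the products that fall under the threshold.
small = [row[0] for row in conn.execute("select product " + flagged + " order by product")]

print(message)
print(small)
```

With these counts the total is 34,596 rows, so the 5% threshold is about 1,730, and every product except E falls below it, not only G.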

Get rollup group value in SQL Server

I have a table with following data:
Name | Score
A | 2
B | 3
A | 1
B | 3
I want a query which returns the following output.
Name | Score
A | 2
A | 1
Subtotal: A | 3
B | 3
B | 3
Subtotal: B | 6
I am able to get a "Subtotal" row with a GROUP BY ROLLUP query, but I want the subtotal label to include the group column value.
Please help me with some SQL code
If score has at most one value per name, you can use GROUPING SETS:
select name, sum(score) as score
from t
group by grouping sets ((name, score), (name));
If score is never null, I would just label the subtotal rows with:
case when score is null then 'Subtotal: ' + name else name end
Otherwise you need to use grouping().
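For comparison, here is a rough sketch of the requested output produced with a different technique: a UNION ALL of the detail rows and a per-name subtotal, run through Python's sqlite3. SQLite does not support GROUPING SETS, so this is a stand-in, not the SQL Server query above. The is_subtotal and label columns are names invented for the example. One difference worth noting is that UNION ALL keeps duplicate detail rows such as the two B | 3 rows, whereas grouping by (name, score) would merge them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (name text, score integer)")
conn.executemany("insert into t values (?, ?)", [("A", 2), ("B", 3), ("A", 1), ("B", 3)])

rows = conn.execute(
    """
    select label, score
    from (
        -- detail rows, flagged 0 so they sort before their subtotal
        select name, 0 as is_subtotal, name as label, score from t
        union all
        -- one subtotal row per name, flagged 1
        select name, 1, 'Subtotal: ' || name, sum(score) from t group by name
    )
    order by name, is_subtotal
    """
).fetchall()

for label, score in rows:
    print(label, "|", score)
```

The order of the detail rows within a name is not fixed by this query; only the position of each subtotal after its group is.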

SQL - Count Results of 2 Columns

I have the following table which contains ID's and UserId's.
ID UserID
1111 11
1111 300
1111 51
1122 11
1122 22
1122 3333
1122 45
I'm trying to count the distinct number of IDs so that I get a total, but I also need a total of the UserIDs that have seen each particular ID. To get the IDs, I've had to run a subquery against another table and pass the result into the main query. Now I just want the results to be displayed as follows.
So I get a total number of IDs and a total number of UserIDs. I would also like to add another column with the average per ID.
TotalID Total_UserID Average
2 7 3.5
If possible, I would also like to get an average, but I'm not sure how to calculate it. I would need to count all the UserIDs for each ID, add them all together, and then find the average. (Any advice on that calculation would be appreciated.)
Current Query.
SELECT DISTINCT(a.ID)
,COUNT(b.UserID)
FROM a
INNER JOIN b ON someID = someID
WHERE a.ID IN ( SELECT ID FROM c WHERE GROUPID = 9999)
GROUP BY a.ID
This lists all the IDs and counts the UserIDs for each. I would like a total of both columns. I've tried wrapping the query in a
SELECT COUNT(*) FROM (
but this only counts the IDs, which is great, but how do I count the UserID column as well?
You seem to want this:
SELECT COUNT(DISTINCT a.ID), COUNT(b.UserID),
COUNT(b.UserID) * 1.0 / COUNT(DISTINCT a.ID)
FROM a INNER JOIN
b
ON someID = someID
WHERE a.ID IN ( SELECT ID FROM c WHERE GROUPID = 9999);
Note: DISTINCT is not a function. It applies to the whole row, so it is misleading to put an expression in parentheses after it.
Also, the GROUP BY is unnecessary.
The 1.0 is because SQL Server does integer arithmetic and this is a simple way to convert a number to a decimal form.
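A quick way to see the effect of the 1.0 is to run both forms on the sample rows. The sketch below uses Python's sqlite3 and, as a simplifying assumption, puts ID and UserID in a single table b instead of reproducing the join and the IN filter. SQLite also does integer division on integer operands, so it shows the same truncation the note describes for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table b (ID integer, UserID integer)")
conn.executemany(
    "insert into b values (?, ?)",
    [(1111, 11), (1111, 300), (1111, 51),
     (1122, 11), (1122, 22), (1122, 3333), (1122, 45)],
)

# Distinct IDs, total UserID rows, and the average with decimal arithmetic.
totals = conn.execute(
    """
    select count(distinct ID), count(UserID),
           count(UserID) * 1.0 / count(distinct ID)
    from b
    """
).fetchone()

# Same ratio without the 1.0: integer division drops the fraction.
truncated = conn.execute(
    "select count(UserID) / count(distinct ID) from b"
).fetchone()[0]

print(totals)     # (2, 7, 3.5)
print(truncated)  # 3
```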
You can use
SELECT COUNT(DISTINCT a.ID) ...
to count all distinct values
I believe you want this:
select TotalID,
Total_UserID,
Total_UserID + TotalID as Total,
Total_UserID * 1.0 / TotalID as Average
from (
SELECT COUNT(DISTINCT a.ID) as TotalID
,COUNT(b.UserID) as Total_UserID
FROM a
INNER JOIN b ON someID = someID
WHERE a.ID IN ( SELECT ID FROM c WHERE GROUPID = 9999)
) x