SQL Get percent of bad records from total - sql

I am relatively new to SQL. Each employee accesses an account for testing with a tech; sometimes it's a good attempt, sometimes it's bad. I need to calculate the percentage of attempts per event, mainly the bad ones. My report should look something like this:
SELECT
employee, event, total, percentage
FROM my_table
employee | event | total | percentage|
user1 | good | 50 | 50% |
user1 | bad | 50 | 50% |

Calculate the total in a subquery and then JOIN to calculate the percentage on each row. The columns are qualified because employee exists in both the table and the subquery.
SELECT m.employee, m.event, COUNT(*) AS total, COUNT(*) * 100.0 / t.total AS percentage
FROM my_table m
JOIN (SELECT employee, COUNT(*) AS total
FROM my_table
GROUP BY employee) t
ON m.employee = t.employee
GROUP BY m.employee, m.event, t.total

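Try something like this to calculate the bad event percentage for each employee. Multiplying by 100.0 before dividing avoids integer division, which would otherwise truncate the ratio to 0 in many databases:
select employee, 100.0 * sum(case when event = 'bad' then 1 else 0 end) / count(*) as bad_percentage
from Yourtable
group by employee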
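To see the conditional-aggregate approach run end to end, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration, and SQLite stands in for whatever database the question actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (employee TEXT, event TEXT)")
# Hypothetical sample data: user1 has 2 bad out of 4, user2 has 1 bad out of 4.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("user1", "good"), ("user1", "bad"), ("user1", "bad"), ("user1", "good"),
     ("user2", "good"), ("user2", "good"), ("user2", "good"), ("user2", "bad")],
)

# Multiply by 100.0 first so the division is done in floating point.
bad_pct = conn.execute(
    """
    SELECT employee,
           100.0 * SUM(CASE WHEN event = 'bad' THEN 1 ELSE 0 END) / COUNT(*) AS bad_pct
    FROM my_table
    GROUP BY employee
    ORDER BY employee
    """
).fetchall()

# The pitfall: integer / integer truncates before the multiplication happens.
int_div = conn.execute("SELECT (1 / 4) * 100").fetchone()[0]

print(bad_pct)
print(int_div)
```

The second query shows why the order of operations matters: dividing two integers first discards the fraction.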

Related

Cumulative Sum Query in SQL table with distinct elements

I have a table like this, with column names as Date of Sale and insurance Salesman Names -
Date of Sale | Salesman Name | Sale Amount
2021-03-01 | Jack | 40
2021-03-02 | Mark | 60
2021-03-03 | Sam | 30
2021-03-03 | Mark | 70
2021-03-02 | Sam | 100
I want to do a group by, using the date of sale. The next column should display the cumulative count of the sellers who have made the sale till that date. But same sellers shouldn't be considered again.
For example,
The following table is incorrect,
Date of Sale | Count(Salesman Name) | Sum(Sale Amount)
2021-03-01 | 1 | 40
2021-03-02 | 3 | 200
2021-03-03 | 5 | 300
The following table is correct,
Date of Sale | Count(Salesman Name) | Sum(Sale Amount)
2021-03-01 | 1 | 40
2021-03-02 | 3 | 200
2021-03-03 | 3 | 300
I am not sure how to frame the SQL query, because there are two conditions involved here: a cumulative count, and ignoring the duplicates. I think the OVER clause along with UNBOUNDED PRECEDING may be of some use here? Any help would be appreciated.
Edit - I have added the Sale Amount as a column. I need the cumulative sum for the Sale Amount also. But in this case, all the sale amounts should be considered, unlike the salesman name case where only unique names are counted.
One approach uses a self join and aggregation:
WITH cte AS (
SELECT t1.SaleDate,
COUNT(CASE WHEN t2.Salesman IS NULL THEN 1 END) AS cnt,
SUM(t1.SaleAmount) AS amt
FROM yourTable t1
LEFT JOIN yourTable t2
ON t2.Salesman = t1.Salesman AND
t2.SaleDate < t1.SaleDate
GROUP BY t1.SaleDate
)
SELECT
SaleDate,
SUM(cnt) OVER (ORDER BY SaleDate) AS NumSalesman,
SUM(amt) OVER (ORDER BY SaleDate) AS TotalAmount
FROM cte
ORDER BY SaleDate;
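The logic in the CTE is that we try to find, for each salesman, an earlier record for the same salesman. If we can't find such a record, then we assume the record in question is the first appearance. Then we aggregate by date to get the counts per day, and finally take a rolling sum of counts in the outer query. One caveat: the join returns one row per earlier record, so if a salesman has several earlier sales, that row's SaleAmount is summed more than once. The amounts are only reliable when each row matches at most one earlier record, as in the sample data.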
The best way to do this uses window functions to determine the first time a sales person appears. Then, you just want cumulative sums:
select saledate,
sum(sum(case when seqnum = 1 then 1 else 0 end)) over (order by saledate) as num_salespersons,
sum(sum(sales)) over (order by saledate) as running_sales
from (select t.*,
row_number() over (partition by salesperson order by saledate) as seqnum
from t
) t
group by saledate
order by saledate;
Note that in addition to being more concise, this should perform much better than a solution that uses a self-join.
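As a quick check of the window-function approach, the sketch below loads the question's five sample rows into an in-memory SQLite database through Python's sqlite3 module. It assumes a SQLite build with window function support (3.25 or later), and the table and column names (t, saledate, salesperson, sales) follow the answer rather than the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (saledate TEXT, salesperson TEXT, sales INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    ("2021-03-01", "Jack", 40),
    ("2021-03-02", "Mark", 60),
    ("2021-03-03", "Sam", 30),
    ("2021-03-03", "Mark", 70),
    ("2021-03-02", "Sam", 100),
])

# Inner query: number each salesperson's rows by date, so seqnum = 1 marks
# the first appearance. Outer query: aggregate per day, then take a running
# sum of those per-day aggregates (hence the nested SUM(SUM(...)) OVER).
running = conn.execute("""
    SELECT saledate,
           SUM(SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END))
               OVER (ORDER BY saledate) AS num_salespersons,
           SUM(SUM(sales)) OVER (ORDER BY saledate) AS running_sales
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY salesperson
                                    ORDER BY saledate) AS seqnum
          FROM t) AS s
    GROUP BY saledate
    ORDER BY saledate
""").fetchall()

for row in running:
    print(row)
```

The three rows match the table the question marks as correct: counts of 1, 3, 3 and running amounts of 40, 200, 300.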

SQLite percentages with small values

I have a table of subscribers and the country each one is in.
UserID | Name | Country
-------+-------------------+------------
1 | Zaphod Beeblebrox | UK
2 | Arthur Dent | UK
3 | Gene Kelly | USA
4 | Nat King Cole | USA
I need to produce a list of all the users by percentage from each of the countries. I also need all the smaller member countries (under 1%) to be collapsed into an "OTHERS" category.
I can accomplish a simple "top x" of members trivially with a
SELECT COUNTRY, COUNT(*) AS POPULATION FROM SUBSCRIBERS GROUP BY COUNTRY ORDER BY POPULATION DESC LIMIT 10
and can generate the percentages by PHP server side code, but I don't quite know how to:
Do all of it in SQL including percentage calculations directly in the result
Club all under 1% members into a single OTHERS category.
So I need something like this:
Country | Population
--------+-----------
USA | 25.4%
Brazil | 12%
UK | 5%
OTHERS | 65%
Appreciate the help!
Here is a query for this. I used a subquery to count the total number of rows and then used that to get the percentage for each country. The 'OTHERS' category is generated in a separate query and appended with UNION ALL. Rows are sorted by descending population with the OTHERS row last. The threshold comparisons use 100.0 so that the division is not truncated to an integer, and the sort uses COUNT(*) because the population column is text.
SELECT * FROM
(SELECT country , ROUND((100.0*COUNT(*)/count_all),1) ||'%' AS population
FROM (SELECT count(*) count_all FROM subscribers) AS sq,
subscribers s
WHERE (SELECT 100.0*count(*)/count_all
FROM subscribers s2
WHERE s2.country = s.country) >= 1
GROUP BY country
ORDER BY COUNT(*) DESC)
UNION ALL
SELECT 'OTHERS', IFNULL(ROUND(100.0*COUNT(*)/count_all,1),0.0) ||'%' AS population
FROM (SELECT count(*) count_all FROM subscribers) AS sq,
subscribers s
WHERE (SELECT 100.0*count(*)/count_all
FROM subscribers s2
WHERE s2.country = s.country) < 1
Ok I think I might have found a way to do this that's a hell of a lot quicker on execution speed:
SELECT territory,
Round(Sum(percentage), 3) AS Population
FROM (SELECT
Round((Count(*)*100.0)/(SELECT Count(*) FROM subscribers),3) AS Percentage,
CASE
WHEN ((Count(*)*100.0)/(SELECT Count(*) FROM subscribers)) >= 1 THEN
country
ELSE 'OTHERS'
END AS Territory
FROM subscribers
GROUP BY country
ORDER BY percentage DESC)
GROUP BY territory
ORDER BY population DESC;
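The two-level aggregation can be tried out with Python's sqlite3 module. This is a sketch under assumptions: the 200 subscriber rows are invented, the cutoff is the question's 1%, and the ORDER BY that pushes OTHERS to the bottom relies on SQLite accepting a result-column alias inside an ORDER BY expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subscribers (user_id INTEGER PRIMARY KEY, country TEXT)")
# Hypothetical data: 200 subscribers, three countries with one member each.
rows = [("USA",)] * 120 + [("UK",)] * 77 + [("Fiji",), ("Malta",), ("Peru",)]
conn.executemany("INSERT INTO subscribers (country) VALUES (?)", rows)

# Inner query: percentage per country.
# Outer query: relabel countries under 1% as OTHERS and add them up.
shares = conn.execute("""
    SELECT CASE WHEN pct >= 1 THEN country ELSE 'OTHERS' END AS territory,
           ROUND(SUM(pct), 1) AS population_pct
    FROM (SELECT country,
                 COUNT(*) * 100.0 / (SELECT COUNT(*) FROM subscribers) AS pct
          FROM subscribers
          GROUP BY country)
    GROUP BY territory
    ORDER BY territory = 'OTHERS', population_pct DESC
""").fetchall()

for territory, pct in shares:
    print(f"{territory}: {pct}%")
```

Keeping the percentage numeric until the final formatting step means the sort is by value rather than by text.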

SQL Query to get sums among multiple payments which are greater than or less than 10k

I am trying to write a query to get sums of payments from accounts for a month. I have been able to get it for the most part but I have hit a roadblock. My challenge is that I need a count of the payment totals that are either < 10000 or >= 10000. The business rules are that a single payment may not exceed 10000, but there can be multiple payments that total more than 10000. As a simple mock database it might look like
ID | AccountNo | Payment
1 | 1 | 5000
2 | 1 | 6000
3 | 2 | 5000
4 | 3 | 9000
5 | 3 | 5000
So the results I would expect would be something like
NumberOfPaymentsBelow10K | NumberOfPayments10K+
1 | 2
I would like to avoid doing a function or stored procedure and would prefer a sub query.
Any help with this query would be greatly appreciated!
A common table expression (CTE) keeps this readable: total the payments per account first, then count the accounts on each side of the threshold. In most engines a CTE like this performs much the same as the equivalent subquery, so choose whichever form you find clearer. You can do it by using:
;WITH CTE
AS
(
SELECT AccountNo, SUM(Payment) AS TotalPayment
FROM Payments
GROUP BY AccountNo
)
SELECT
SUM(CASE WHEN TotalPayment < 10000 THEN 1 ELSE 0 END) AS 'NumberOfPaymentsBelow10K',
SUM(CASE WHEN TotalPayment >= 10000 THEN 1 ELSE 0 END) AS 'NumberOfPayments10K+'
FROM CTE
You can get the totals per account using SUM and GROUP BY...
SELECT AccountNo, SUM(Payment) AS TotPay
FROM payments
GROUP BY AccountNo
You can use that result to count the accounts whose total is 10000 or more. Note that many databases require an alias on the derived table:
SELECT COUNT(*)
FROM (
SELECT AccountNo, SUM(Payment) AS TotPay
FROM payments
GROUP BY AccountNo
) AS t
WHERE TotPay >= 10000
You can get the number below and the number at or above the threshold in a single query if you want, but that's a bit more complicated:
SELECT
COUNT(CASE WHEN TotPay < 10000 THEN 1 END) AS Below10K,
COUNT(CASE WHEN TotPay >= 10000 THEN 1 END) AS Above10K
FROM (
SELECT AccountNo, SUM(Payment) AS TotPay
FROM payments
GROUP BY AccountNo
) AS t
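The per-account totals and the two counts can be checked against the question's mock data. The sketch below uses Python's sqlite3 module as a stand-in for the asker's database; the table name payments is taken from the answers, and per_account is just an illustrative alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (ID INTEGER, AccountNo INTEGER, Payment INTEGER)")
conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", [
    (1, 1, 5000),
    (2, 1, 6000),
    (3, 2, 5000),
    (4, 3, 9000),
    (5, 3, 5000),
])

# Step 1 (derived table): total per account.
# Step 2 (outer query): count the accounts on each side of the 10000 threshold.
counts = conn.execute("""
    SELECT SUM(CASE WHEN TotalPayment <  10000 THEN 1 ELSE 0 END) AS below_10k,
           SUM(CASE WHEN TotalPayment >= 10000 THEN 1 ELSE 0 END) AS at_or_above_10k
    FROM (SELECT AccountNo, SUM(Payment) AS TotalPayment
          FROM payments
          GROUP BY AccountNo) AS per_account
""").fetchone()

print(counts)
```

Accounts 1 and 3 total 11000 and 14000, and account 2 totals 5000, which gives the 1 and 2 the question expects.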

Calculating % of single row to sum of rows - Teradata SQL

I am trying to calculate the % contribution of the a's in the "qual" column to the total sales for each loc.
loc | qual | sales
- - - - - - - - - - - -
us | a | 1,000
us | b | 500
gc | a | 200
gc | b | 400
So the answer that I would be looking for is US = 66.66% (1,000/1,500) and gc = 33.33% (200/600). So the return result would be....
loc | Pct
us | 66.66%
gc | 33.33%
Thanks!
You can do this with aggregation and window functions:
select loc, count(*) as sales, sum(count(*)) over () as total_sales,
count(*) / sum(1.0*count(*)) over () as sales_proportion
from t
group by loc;
If sales is really an integer, you should convert it to a floating point or decimal representation (you can just multiply by 1.0).
EDIT:
Oops, the above does something useful, but not what the OP asks for. Here is a simple method that divides the 'a' sales by all sales for each loc:
select loc,
sum(case when qual = 'a' then sales * 1.0 else 0 end) / sum(sales) as proportion_a
from t
group by loc;
You need a conditional aggregate:
select loc,
100.00
* sum(case when qual = 'a' then sales else 0 end) -- only 'a' sales
/ sum(sales) as Pct -- all sales
from tab
group by loc
Change the precision of the 100.00 literal according to your needs.
Caution: unless the datatype of sales is FLOAT or NUMBER, you must multiply by 100 first and then divide.
To calculate the percentage, you need the sum of sales over each loc. But you don't want to group by loc and aggregate, because that collapses the rows you want to keep, and joining that result back to the original table is clumsy. That's where a correlated subquery comes in: a subselect that sums the sales for the loc of the current row. The following code should give you what you want; multiplying by 100.0 rather than 100 avoids integer division.
select loc,
sales * 100.0 / (
select sum(sales)
from my_table sq
where sq.loc = t.loc
) as pct
from my_table t
where qual = 'a'
Hope this helps!
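The conditional-aggregate version is easy to verify on the question's four rows. This sketch runs it in SQLite through Python's sqlite3 module, which is only a stand-in: the question is about Teradata, where type handling and rounding may differ. The table name tab follows one of the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (loc TEXT, qual TEXT, sales INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", [
    ("us", "a", 1000),
    ("us", "b", 500),
    ("gc", "a", 200),
    ("gc", "b", 400),
])

# Numerator: only the 'a' sales. Denominator: all sales for the loc.
# The leading 100.0 forces floating-point arithmetic.
pct_by_loc = conn.execute("""
    SELECT loc,
           ROUND(100.0 * SUM(CASE WHEN qual = 'a' THEN sales ELSE 0 END)
                 / SUM(sales), 2) AS pct
    FROM tab
    GROUP BY loc
    ORDER BY loc
""").fetchall()

for loc, pct in pct_by_loc:
    print(f"{loc}: {pct}%")
```

This weights by sales amount rather than by row count, which is what the question asks for (1,000 out of 1,500 for us, 200 out of 600 for gc).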

how to find percentage without calculating manually?

Thank you for looking. I am new to T-SQL and don't know how to proceed. I have a table with 10 different companies and 20 departments for each (the departments are the same for all the companies).
I am trying to calculate the percentage of expenses for each department and want an extra column 'Percentage' to be displayed in the result.
Please note that for every company the first row is totalcompexpenses, which is just the total expenses of the company for all the departments combined. No percentage is needed for that row; the calculation should start from the next row.
Is it possible to do this with a while loop or some other way, instead of doing it manually for each one of them?
ID |Company_name| Department |Expenses | Percentage
1 |Company1 |TotalComp1Expenses |50000 | -
2 |Company1 |Department1 |4000 | ?
3 |Company1 |Department2 |8000 | ?
4 |Company1 |Department3 |8000 | ?
5 |Company1 |Department4 |7000 | ?
6 |Company1 |Department5 |10000 | ?
...
11 |Company2 |TotalComp2Expenses |100000 | -
12 |Company2 |Department1 |6000 | ?
13 |Company2 |Department2 |5000 | ?
14 |Company2 |Department3 |8000 | ?
15 |Company2 |Department4 |7000 | ?
16 |Company2 |Department5 |10000 | ?
...
21 |Company3 |TotalComp3Expenses |70000 | -
22 |Company3 |Department1 |2000 | ?
23 |Company3 |Department2 |7000 | ?
24 |Company3 |Department3 |9000 | ?
25 |Company3 |Department4 |8000 | ?
26 |Company3 |Department5 |10000 | ?
...
I think the clearest way is to use window functions. If you want the percentages based on the Total% rows, then you can do it as:
select ID, Company_name, Department, Expenses,
(100.0* Expenses /
max(case when Department like 'Total%Expenses' then Expenses end) over
(partition by Company_Name)
) as Percentage
from t;
You can also do this as a sum of the non-Total expenses:
select ID, Company_name, Department, Expenses,
(100.0* Expenses /
sum(case when Department not like 'Total%Expenses' then Expenses end) over
(partition by Company_Name)
) as Percentage
from t;
A window function is like an aggregate function, except that it does not collapse the rows. The value computed for each group is added as an additional column on each row, and the groups are defined by the partition by clause.
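Here is a small runnable sketch of the first window-function query, using Python's sqlite3 module in place of SQL Server and assuming a SQLite build with window functions (3.25 or later). The two companies and their figures are invented, chosen so that each stored total equals the sum of its departments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, company_name TEXT, department TEXT, expenses INTEGER)"
)
# Hypothetical data: one stored total row per company, then its departments.
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    (1, "CompanyA", "TotalCompAExpenses", 1000),
    (2, "CompanyA", "Department1", 250),
    (3, "CompanyA", "Department2", 750),
    (4, "CompanyB", "TotalCompBExpenses", 500),
    (5, "CompanyB", "Department1", 100),
    (6, "CompanyB", "Department2", 400),
])

# The window MAX picks out the company's stored total and repeats it on every
# row of the partition. The total row itself is dropped in the outer query,
# because filtering it in the inner WHERE would hide it from the window.
dept_pct = conn.execute("""
    SELECT company_name, department, percentage
    FROM (SELECT company_name, department,
                 100.0 * expenses
                 / MAX(CASE WHEN department LIKE 'Total%Expenses' THEN expenses END)
                       OVER (PARTITION BY company_name) AS percentage
          FROM t)
    WHERE department NOT LIKE 'Total%Expenses'
    ORDER BY company_name, department
""").fetchall()

for row in dept_pct:
    print(row)
```

No loop is needed: the partition by clause does the per-company grouping for every row at once.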
Add this column to the query
Expenses * 200.0 / SUM(expenses) over (partition by company_name) as PercentageExpenses
You have to multiply expenses by 200.0 because the sum already includes the stored company total, so every expense is counted twice. This assumes the stored total equals the sum of the department rows.
If you self-join and thus have the total of each company in a separate column, you can calculate the percentages. The company total row then shows 100%, which I consider correct.
select
tbx.id
, tbx.company_name
, tbx.department
, tbx.expenses
, tbx.expenses * 100.0 / sums.total as percentage
from table_expenses tbx
inner join
(select
company_name
, sum(expenses) / 2.0 as total -- halved: the stored total row doubles the sum
from table_expenses
group by
company_name
) sums
on
(tbx.company_name = sums.company_name)
EDIT:
Are you actually storing the company totals in your database? If so, then this should work for the CTE:
select
compname,
expenses as CompExp
from
<YourTable>
where
Department like 'Total%'
But I don't know why you would want to store subtotals like that.
Using your "table" as an example:
;with CompTotal as (
select
compname,
sum(expenses) as CompExp
from
<YourTable>
group by CompName)
select
C.CompName,
Department,
CompTotal.CompExp,
sum(Expenses) as DeptExpense,
(sum(Expenses) / (CompTotal.CompExp * 1.0)) * 100 as Pct
from
<YourTable> C
inner join
CompTotal
on C.CompName = CompTotal.CompName
group by
C.CompName,
Department,
CompTotal.CompExp
The CTE gives us totals by company. We then join that back to the original table on company name, and total up by Department. Then regular arithmetic gives each department's percentage of its company total.
(SQLFiddle is down, or I'd link to a full example there)