Working out the percentage of outcomes in a column within a table

I am using SQL Developer and have a table called table1, which looks like this (but with loads more data):
item_id seller_id warranty postage_class
------- --------- -------- -------------
14 2 1 2
17 6 1 1
14 2 1 1
14 2 1 2
14 2 1 1
14 2 1 2
I want to identify the percentage of items sent by first class.
If anyone could help me out that would be amazing!

You can use conditional aggregation. The simplest method is probably:
select avg(case when postage_class = 1 then 1.0 else 0 end)
from t;
Note that this calculates a ratio between 0 and 1. If you want a "percentage" between 0 and 100, use 100.0 instead of 1.0.
Some databases make it possible to shorten this even further. For instance, in Postgres, you can do:
select avg( (postage_class = 1)::int )
from t;
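As a minimal sketch of the conditional aggregation above, the following runs the same AVG(CASE ...) pattern against the six sample rows from the question. It uses Python's built-in sqlite3 module rather than the asker's Oracle setup, and it assumes postage_class = 1 means first class, as the answer does. The function name first_class_percentage is made up for this illustration.

```python
import sqlite3


def first_class_percentage(rows):
    """Return the percentage (0-100) of rows whose postage_class is 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 "
        "(item_id INT, seller_id INT, warranty INT, postage_class INT)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", rows)
    # Each row contributes 100.0 or 0, so AVG yields the percentage directly.
    (pct,) = conn.execute(
        "SELECT AVG(CASE WHEN postage_class = 1 THEN 100.0 ELSE 0 END) "
        "FROM table1"
    ).fetchone()
    conn.close()
    return pct


# Sample rows copied from the question: (item_id, seller_id, warranty, postage_class)
sample = [
    (14, 2, 1, 2),
    (17, 6, 1, 1),
    (14, 2, 1, 1),
    (14, 2, 1, 2),
    (14, 2, 1, 1),
    (14, 2, 1, 2),
]
print(first_class_percentage(sample))  # 3 of the 6 sample rows are first class
```

Using 100.0 rather than 100 keeps the average in floating point on databases that would otherwise do integer arithmetic.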

Related

Resetting a Count in SQL

I have data that looks like this:
ID num_of_days
1 0
2 0
2 8
2 9
2 10
2 15
3 10
3 20
I want to add another column that increments only when num_of_days is divisible by 5 or when the ID changes, so that my end result looks like this:
ID num_of_days row_num
1 0 1
2 0 2
2 8 2
2 9 2
2 10 3
2 15 4
3 10 5
3 20 6
Any suggestions?
Edit #1:
num_of_days represents the number of days since the customer last saw a doctor between 1 visit and the next.
A customer can see a doctor 1 time or they can see a doctor multiple times.
If it's the first time visiting, the num_of_days = 0.

SQL tables represent unordered sets. Based on your question, I'll assume that the combination of id/num_of_days provides the ordering.
You can use a cumulative sum . . . with lag():
select t.*,
sum(case when prev_id = id and num_of_days % 5 <> 0
then 0 else 1
end) over (order by id, num_of_days)
from (select t.*,
lag(id) over (order by id, num_of_days) as prev_id
from t
) t;
If you have a different ordering column, then just use that in the order by clauses.
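To check the lag() plus cumulative-sum approach against the expected output in the question, here is a small sketch using Python's sqlite3 module. It assumes a SQLite build with window-function support (3.25 or later) and, like the answer, assumes that id and num_of_days together define the row order. The helper name numbered_rows is hypothetical.

```python
import sqlite3

# Same shape as the answer's query: flag each row that starts a new group,
# then take a running sum of the flags.
QUERY = """
SELECT id, num_of_days,
       SUM(CASE WHEN prev_id = id AND num_of_days % 5 <> 0 THEN 0 ELSE 1 END)
           OVER (ORDER BY id, num_of_days) AS row_num
FROM (SELECT t.*,
             LAG(id) OVER (ORDER BY id, num_of_days) AS prev_id
      FROM t) AS t
ORDER BY id, num_of_days
"""


def numbered_rows(rows):
    """Load (id, num_of_days) pairs and return them with the computed row_num."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INT, num_of_days INT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


sample = [(1, 0), (2, 0), (2, 8), (2, 9), (2, 10), (2, 15), (3, 10), (3, 20)]
for row in numbered_rows(sample):
    print(row)
```

On the first row prev_id is NULL, so the comparison is not true and the row falls through to the ELSE branch, which is what starts the count at 1.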

Get a percentage of all in Access SQL

I have a table containing a list of features that will be implemented by a given team for a given release, with a flag to tell me if the feature is testable or not.
Sample data can be:
feature team rel testable
1 1 1 1
2 1 1 1
3 1 1 1
4 1 2 1
5 1 2 1
6 1 2 0
7 1 3 0
8 1 3 0
9 1 3 1
10 2 1 0
11 2 1 0
12 2 1 0
13 2 2 1
14 2 2 0
15 2 2 0
16 2 3 1
17 2 3 1
18 2 3 0
What I am trying to get is, for each team and each release, the percentage of testable features (over the overall count of features for that team and release).
Ideally I would like to keep it as a single SQL query due to the way I designed the display of the result.
I went as far as this:
SELECT
MyTable.team AS team,
MyTable.rel AS rel,
(COUNT(*)*100 / (
SELECT COUNT(*)
FROM MyTable
WHERE
[MyTable].team = team
AND [MyTable].rel = rel
)
) AS result
FROM MyTable
WHERE
MyTable.team IN (1,2)
AND MyTable.rel IN (1,2,3)
AND MyTable.testable = 1
GROUP BY
MyTable.rel,
MyTable.team
ORDER BY
MyTable.team,
MyTable.rel
Here is the result I expect (I don't really care about the rounding)
team rel result
1 1 1 // all are testable for team 1 release 1
1 2 0.66 // 2 out of 3 are testable for team 1 release 2
1 3 0.33
2 1 0
2 2 0.33
2 3 0.66
My feeling is that I am not that far from the solution, but I am not able to fix it.

A simple average should work here, assuming every value in the testable field is either 1 or 0. You also need to remove testable = 1 from the WHERE clause; otherwise only testable rows are averaged and the result is always 1.
I'm not sure whether Access will implicitly cast the Boolean, so the query below converts the value to 1 or 0 explicitly so that AVG can work.
SELECT
MyTable.team AS team,
MyTable.rel AS rel,
AVG(iif(Testable,1,0)) AS result
FROM MyTable
WHERE
MyTable.team IN (1,2)
AND MyTable.rel IN (1,2,3)
GROUP BY
MyTable.rel,
MyTable.team
ORDER BY
MyTable.team,
MyTable.rel

You can try this, which divides the count of testable features by the total count for each team and release:
select tot.team, tot.rel, iif(ok.cnt is null, 0, ok.cnt) / tot.cnt as res
from (
select team, rel, count(*) as cnt
from MyTable where team in (1,2) and rel in (1,2,3)
group by team, rel) as tot
left join (
select team, rel, count(*) as cnt
from MyTable where testable = 1
group by team, rel) as ok
on (ok.team = tot.team) and (ok.rel = tot.rel)

SQL Converting Column into Rows in Single Select Statement

I need a solution for reshaping the output of a SQL query.
I am writing
SELECT Merchant_Master.Merchant_ID,
COUNT(Coupon_Type_ID) AS "Total Coupons",
Coupon_Type_ID,
CASE WHEN Coupon_Type_ID=1
THEN COUNT(Coupon_Type_ID)
END AS "Secret",
CASE WHEN Coupon_Type_ID=2
THEN count(Coupon_Type_ID)
END AS "Hot"
FROM Coupon_Master
INNER JOIN Merchant_Master
ON Coupon_Master.Merchant_ID=Merchant_Master.Merchant_ID
GROUP BY
Coupon_Master.Coupon_Type_ID,
Merchant_Master.Merchant_ID
and getting output as
Merchant_ID Total Coupons Coupon_Type_ID Secret Hot
----------- ------------- -------------- ----------- -----------
20 6 1 6 NULL
22 4 1 4 NULL
22 2 2 NULL 2
23 1 2 NULL 1
24 2 1 2 NULL
25 3 1 3 NULL
25 2 2 NULL 2
But I want output as
Merchant_ID Secret Hot_Coupons
----------- ------ -------------
20 6 0
22 4 2
23 0 1
24 2 0
25 3 2
Please, help me to solve the issue.

Move the CASE expressions inside the aggregates. I've also switched to using SUM rather than COUNT - there is a COUNT variant but it may display a warning about eliminating NULL values that I'd rather avoid.
SELECT Merchant_Master.Merchant_ID,
SUM(CASE WHEN Coupon_Type_ID=1
THEN 1 ELSE 0 END) AS "Secret",
SUM(CASE WHEN Coupon_Type_ID=2
THEN 1 ELSE 0 END) AS "Hot"
FROM Coupon_Master
INNER JOIN Merchant_Master
ON Coupon_Master.Merchant_ID=Merchant_Master.Merchant_ID
GROUP BY
Merchant_Master.Merchant_ID
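Here is a small runnable sketch of the SUM(CASE ...) pivot, using Python's sqlite3 module and coupon counts reconstructed from the output shown in the question. The table and column names follow the question; the helper coupon_counts and its input format are made up for this illustration.

```python
import sqlite3

PIVOT_QUERY = """
SELECT m.Merchant_ID,
       SUM(CASE WHEN c.Coupon_Type_ID = 1 THEN 1 ELSE 0 END) AS Secret,
       SUM(CASE WHEN c.Coupon_Type_ID = 2 THEN 1 ELSE 0 END) AS Hot
FROM Coupon_Master AS c
INNER JOIN Merchant_Master AS m ON c.Merchant_ID = m.Merchant_ID
GROUP BY m.Merchant_ID
ORDER BY m.Merchant_ID
"""


def coupon_counts(coupon_groups):
    """coupon_groups holds (merchant_id, coupon_type_id, how_many) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Merchant_Master (Merchant_ID INT PRIMARY KEY)")
    conn.execute("CREATE TABLE Coupon_Master (Merchant_ID INT, Coupon_Type_ID INT)")
    merchants = sorted({merchant_id for merchant_id, _, _ in coupon_groups})
    conn.executemany("INSERT INTO Merchant_Master VALUES (?)", [(m,) for m in merchants])
    for merchant_id, type_id, how_many in coupon_groups:
        # Insert one coupon row per counted coupon.
        conn.executemany(
            "INSERT INTO Coupon_Master VALUES (?, ?)",
            [(merchant_id, type_id)] * how_many,
        )
    result = conn.execute(PIVOT_QUERY).fetchall()
    conn.close()
    return result


# Counts taken from the "getting output as" table in the question.
sample = [(20, 1, 6), (22, 1, 4), (22, 2, 2), (23, 2, 1), (24, 1, 2), (25, 1, 3), (25, 2, 2)]
for row in coupon_counts(sample):
    print(row)
```

Grouping only by Merchant_ID is what collapses the per-type rows into one row per merchant, and the ELSE 0 branch is what turns the NULLs of the original output into zeros.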

Place your original query in a subquery and group by Merchant_ID only.
Aggregate the Secret and Hot columns with SUM, using COALESCE to turn the NULLs into zeros:
select
...
COALESCE(SUM(Secret), 0) as Secret,
COALESCE(SUM(Hot), 0) as Hot_Coupons
FROM (your original query) raw
group by Merchant_ID

Inserting a new indicator column to tell if a given row maximizes another column in SQL

I currently have a table in SQL that looks like this
PRODUCT_ID_1 PRODUCT_ID_2 SCORE
1 2 10
1 3 100
1 10 3000
2 10 10
3 35 100
3 2 1001
That is, PRODUCT_ID_1,PRODUCT_ID_2 is a primary key for this table.
What I would like to do is use this table to add a column that tells whether or not the current row is the one that maximizes SCORE for a given value of PRODUCT_ID_1.
In other words, what I would like to get is the following table:
PRODUCT_ID_1 PRODUCT_ID_2 SCORE IS_MAX_SCORE_FOR_ID_1
1 2 10 0
1 3 100 0
1 10 3000 1
2 10 10 1
3 35 100 0
3 2 1001 1
I am wondering how I can compute the IS_MAX_SCORE_FOR_ID_1 column and insert it into the table without having to create a new table.

You can try like this...
Select PRODUCT_ID_1, PRODUCT_ID_2, SCORE,
(Case when b.Score =
(Select Max(a.Score) from TableName a where a.PRODUCT_ID_1 = b.PRODUCT_ID_1)
then 1 else 0 End) as IS_MAX_SCORE_FOR_ID_1
from TableName b

You can use a window function for this:
select product_id_1,
product_id_2,
score,
case
when score = max(score) over (partition by product_id_1) then 1
else 0
end as is_max_score_for_id_1
from the_table
order by product_id_1;
(The above is ANSI SQL and should run on any modern DBMS)
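The window-function version can be checked against the expected table from the question with the sketch below. It uses Python's sqlite3 module and assumes a SQLite build with window-function support (3.25 or later). The table name the_table follows the answer; the helper flag_max_scores is hypothetical, and the extra ordering by score is added only to make the output order deterministic.

```python
import sqlite3

QUERY = """
SELECT product_id_1, product_id_2, score,
       CASE
           WHEN score = MAX(score) OVER (PARTITION BY product_id_1) THEN 1
           ELSE 0
       END AS is_max_score_for_id_1
FROM the_table
ORDER BY product_id_1, score
"""


def flag_max_scores(rows):
    """Return each row with a 1/0 flag marking the top score per product_id_1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE the_table (product_id_1 INT, product_id_2 INT, score INT, "
        "PRIMARY KEY (product_id_1, product_id_2))"
    )
    conn.executemany("INSERT INTO the_table VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


sample = [(1, 2, 10), (1, 3, 100), (1, 10, 3000), (2, 10, 10), (3, 35, 100), (3, 2, 1001)]
for row in flag_max_scores(sample):
    print(row)
```

If two rows tie for the highest score within a product_id_1 group, both are flagged with 1.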

Referencing the value of the previous calculated value in Oracle

How can one reference a calculated value from the previous row in a SQL query? In my case each row is an event that somehow manipulates the same value from the previous row.
The raw data looks like this:
Eventno Eventtype Totalcharge
3 ACQ 32
2 OUT NULL
1 OUT NULL
Let's say each Eventtype=OUT should halve the previous row's totalcharge in a column called Remaincharge:
Eventno Eventtype Totalcharge Remaincharge
3 ACQ 32 32
2 OUT NULL 16
1 OUT NULL 8
I've already tried the LAG analytic function but that does not allow me to get a calculated value from the previous row. Tried something like this:
LAG(remaincharge, 1, totalcharge) OVER (PARTITION BY ...) as remaincharge
But this didn't work because remaincharge could not be found.
Any ideas how to achieve this? I would need an analytic function that works like a cumulative sum but applies a given function with access to the previous value.
Thank you in advance!
Updated problem description
I'm afraid my example problem was too general; here is a better problem description:
What remains of totalcharge is decided by the ratio of outqty/(previous remainqty).
Eventno Eventtype Totalcharge Remainqty Outqty
4 ACQ 32 100 0
3 OTHER NULL 100 0
2 OUT NULL 60 40
1 OUT NULL 0 60
Eventno Eventtype Totalcharge Remainqty Outqty Remaincharge
4 ACQ 32 100 0 32
3 OTHER NULL 100 0 32 - (0/100 * 32) = 32
2 OUT NULL 60 40 32 - (40/100 * 32) = 12.8
1 OUT NULL 0 60 12.8 - (60/60 * 12.8) = 0
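To make the row-by-row dependency explicit, here is a plain Python sketch of the recurrence as the formula column states it: each row's remaining charge is the previous remaining charge minus outqty / (previous remainqty) times the previous remaining charge. It assumes the rows are processed in descending Eventno order and that the first row is the ACQ row carrying the starting Totalcharge. The function name remaining_charges is made up for this example. Note that evaluating the formula literally gives 19.2 for event 2, not the 12.8 written in the table.

```python
def remaining_charges(events):
    """events: (eventno, totalcharge, remainqty, outqty), newest event first.

    Returns a list of (eventno, remaincharge) in the same order.
    """
    results = []
    prev_charge = None
    prev_qty = None
    for eventno, totalcharge, remainqty, outqty in events:
        if totalcharge is not None:
            # A row with a Totalcharge (the ACQ row) sets the starting charge.
            charge = float(totalcharge)
        else:
            # Share of the previously remaining quantity that leaves in this event.
            ratio = outqty / prev_qty if prev_qty else 0.0
            charge = prev_charge - ratio * prev_charge
        results.append((eventno, charge))
        prev_charge, prev_qty = charge, remainqty
    return results


sample = [
    (4, 32, 100, 0),     # ACQ
    (3, None, 100, 0),   # OTHER
    (2, None, 60, 40),   # OUT
    (1, None, 0, 60),    # OUT
]
for eventno, charge in remaining_charges(sample):
    print(eventno, round(charge, 2))
```

Each value depends on the one computed just before it, which is why a plain LAG over stored columns cannot express this directly.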

In your case you could work out the first value using the FIRST_VALUE() analytic function and the power of 2 that you have to divide by with RANK() in a sub-query and then use that. It's very specific to your example but should give you the general idea:
select eventno, eventtype, totalcharge
, case when eventtype <> 'OUT' then firstcharge
else firstcharge / power(2, "rank" - 1)
end as remaincharge
from ( select a.*
, first_value(totalcharge) over
( partition by 1 order by eventno desc ) as firstcharge
, rank() over ( partition by 1 order by eventno desc ) as "rank"
from the_table a
)
I haven't partitioned by anything because you've got nothing in your raw data to partition by...

A variation on Ben's answer to use a windowing clause, which seems to take care of your updated requirements:
select eventno, eventtype, totalcharge, remainingqty, outqty,
initial_charge - case when running_outqty = 0 then 0
else (running_outqty / 100) * initial_charge end as remainingcharge
from (
select eventno, eventtype, totalcharge, remainingqty, outqty,
first_value(totalcharge) over (partition by null
order by eventno desc) as initial_charge,
sum(outqty) over (partition by null
order by eventno desc
rows between unbounded preceding and current row)
as running_outqty
from t42
);
Except it gives 19.2 instead of 12.8 for the third row, but that's what your formula suggests it should be:
   EVENTNO EVENT TOTALCHARGE REMAININGQTY     OUTQTY REMAININGCHARGE
---------- ----- ----------- ------------ ---------- ---------------
         4 ACQ            32          100          0              32
         3 OTHER                      100          0              32
         2 OUT                         60         40            19.2
         1 OUT                          0         60               0
If I add another split so it goes from 60 to zero in two steps, with another non-OUT record in the mix too:
   EVENTNO EVENT TOTALCHARGE REMAININGQTY     OUTQTY REMAININGCHARGE
---------- ----- ----------- ------------ ---------- ---------------
         6 ACQ            32          100          0              32
         5 OTHER                      100          0              32
         4 OUT                         60         40            19.2
         3 OUT                         30         30             9.6
         2 OTHER                       30          0             9.6
         1 OUT                          0         30               0
There's an assumption that the remaining quantity is consistent and you can effectively track a running total of what has gone before, but from the data you've shown that looks plausible. The inner query calculates that running total for each row, and the outer query does the calculation; that could be condensed but is hopefully clearer like this...
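The running-total query can be reproduced on the six-row example with the sketch below, which uses Python's sqlite3 module instead of Oracle and assumes a SQLite build with window-function support (3.25 or later). The table name t42 and the hard-coded starting quantity of 100 come from the answer; the helper name running_total_charges is an assumption. The divisor is written as 100.0 because SQLite would otherwise perform integer division on integer columns.

```python
import sqlite3

QUERY = """
SELECT eventno, eventtype,
       initial_charge - (running_outqty / 100.0) * initial_charge AS remainingcharge
FROM (SELECT eventno, eventtype,
             FIRST_VALUE(totalcharge) OVER (ORDER BY eventno DESC) AS initial_charge,
             SUM(outqty) OVER (ORDER BY eventno DESC
                               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
                 AS running_outqty
      FROM t42) AS running
ORDER BY eventno DESC
"""


def running_total_charges(rows):
    """Return (eventno, eventtype, remainingcharge) rounded to two decimals."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t42 (eventno INT, eventtype TEXT, totalcharge INT, "
        "remainingqty INT, outqty INT)"
    )
    conn.executemany("INSERT INTO t42 VALUES (?, ?, ?, ?, ?)", rows)
    result = [(no, kind, round(charge, 2)) for no, kind, charge in conn.execute(QUERY)]
    conn.close()
    return result


sample = [
    (6, "ACQ", 32, 100, 0),
    (5, "OTHER", None, 100, 0),
    (4, "OUT", None, 60, 40),
    (3, "OUT", None, 30, 30),
    (2, "OTHER", None, 30, 0),
    (1, "OUT", None, 0, 30),
]
for row in running_total_charges(sample):
    print(row)
```

Subtracting the running share of the initial charge avoids any need to reference a previously calculated row, at the cost of assuming a single acquisition with a known starting quantity.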

Ben's answer is the better one (will probably perform better) but you can also do it like this:
select t.*, (connect_by_root Totalcharge) / power (2,level-1) Remaincharge
from the_table t
start with EVENTTYPE = 'ACQ'
connect by prior eventno = eventno + 1;
I think it's easier to read
Here is a demo
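For databases without CONNECT BY, a recursive common table expression can walk the same chain. The sketch below is a swapped-in technique, not the Oracle query above: it uses Python's sqlite3 module and WITH RECURSIVE, starts at the ACQ row, and halves the charge at each step down to the next lower eventno, which mirrors the original three-row example. The helper name halved_charges is hypothetical.

```python
import sqlite3

QUERY = """
WITH RECURSIVE chain(eventno, eventtype, remaincharge) AS (
    -- Anchor: the acquisition row carries the starting charge.
    SELECT eventno, eventtype, totalcharge * 1.0
    FROM the_table
    WHERE eventtype = 'ACQ'
    UNION ALL
    -- Step: the next lower eventno gets half of the previous row's value.
    SELECT t.eventno, t.eventtype, c.remaincharge / 2
    FROM the_table AS t
    JOIN chain AS c ON t.eventno = c.eventno - 1
)
SELECT eventno, eventtype, remaincharge
FROM chain
ORDER BY eventno DESC
"""


def halved_charges(rows):
    """rows: (eventno, eventtype, totalcharge). Returns the computed chain."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE the_table (eventno INT, eventtype TEXT, totalcharge INT)")
    conn.executemany("INSERT INTO the_table VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


sample = [(3, "ACQ", 32), (2, "OUT", None), (1, "OUT", None)]
for row in halved_charges(sample):
    print(row)
```

Like the CONNECT BY version, this halves the value on every step regardless of eventtype, so it fits the simplified example rather than the updated requirements.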
