How to calculate 90th Percentile, SD, Mean for data in SQL

Hi, I have a table, Facility, which holds a score for each day (multiple scores can be reported on the same day, and all of them are valid).
I need to calculate the 90th percentile, standard deviation, and mean of score by month.
Facility:
Id Month Date score
1 Jan 1 5
1 Jan 1 5
1 Jan 2 3
1 Jan 3 4
1 Jan 4 4
1 Jan 5 4
1 Feb 1 5
1 Feb 1 5
1 Feb 2 3
1 Feb 3 4
1 Feb 4 4
1 Feb 5 4
Is there any way?
Thanks for your help.

You can use the analytic functions introduced in SQL Server 2012:
SELECT DISTINCT
[Month],
Mean = AVG(CONVERT(float, Score)) OVER (PARTITION BY [Month]), -- convert first: AVG over an INT column returns a truncated integer
StdDev = STDEV(Score) OVER (PARTITION BY [Month]),
P90 = PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY Score) OVER (PARTITION BY [Month])
FROM Facility
There are two percentile functions: PERCENTILE_CONT, which interpolates between values (continuous distribution), and PERCENTILE_DISC, which returns a value that actually occurs in the data (discrete distribution). Pick the one that suits your needs.
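As a rough illustration of how the two functions differ, here is a small Python sketch. It is not SQL Server's implementation, and the helper names `percentile_cont` and `percentile_disc` are made up for this example. It follows the documented behaviour: the continuous version interpolates linearly between neighbouring rows, while the discrete version returns the smallest value present in the data whose cumulative share of rows reaches the requested fraction.

```python
import math

def percentile_cont(values, p):
    """Linear interpolation between the two nearest sorted values."""
    s = sorted(values)
    pos = p * (len(s) - 1)                 # zero-based fractional row position
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def percentile_disc(values, p):
    """Smallest existing value whose cumulative share of rows is >= p."""
    s = sorted(values)
    k = max(1, math.ceil(p * len(s)))      # one-based row index
    return s[k - 1]

jan_scores = [5, 5, 3, 4, 4, 4]            # January rows from the question
print(percentile_cont(jan_scores, 0.9))    # 5.0
print(percentile_disc(jan_scores, 0.9))    # 5
print(percentile_cont([1, 2, 3, 4], 0.5))  # 2.5 (interpolated, not in the data)
print(percentile_disc([1, 2, 3, 4], 0.5))  # 2   (an actual data value)
```

On the question's data both functions agree; the second pair of calls shows a case where they differ.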

Here's the setup...
CREATE TABLE Facility (Id INT NOT NULL, Month nvarchar(3) NOT NULL, Date INT NOT NULL, score INT NOT NULL)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 2, 3)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 3, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 4, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 5, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 2, 3)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 3, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 4, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 5, 4)
Now, Standard Deviation and Mean are straightforward enough - there are built-in aggregate functions for them...
SELECT
[Month],
AVG(CONVERT(real, score)) AS [Mean],
STDEV(score) AS [Standard Deviation]
FROM
Facility
GROUP BY
[Month]
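To sanity-check the numbers this query should return for January, the same statistics can be computed outside the database. The sketch below uses Python's standard `statistics` module; the variable names are chosen for this example only. It also shows why the query converts score before averaging: an average taken with integer arithmetic loses the fractional part.

```python
import math
import statistics

jan_scores = [5, 5, 3, 4, 4, 4]                  # January rows from the question

mean = statistics.mean(jan_scores)               # comparable to AVG over a non-integer type
sample_sd = statistics.stdev(jan_scores)         # n - 1 denominator, like STDEV
population_sd = statistics.pstdev(jan_scores)    # n denominator, like STDEVP
int_avg = sum(jan_scores) // len(jan_scores)     # integer arithmetic drops the fraction

print(round(mean, 4))           # 4.1667
print(round(sample_sd, 4))      # 0.7528
print(round(population_sd, 4))  # 0.6872
print(int_avg)                  # 4
```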
For your 90th percentile, you'll need to write a function...
CREATE FUNCTION NintythPercentile(@Month nvarchar(3)) RETURNS INT AS
BEGIN
DECLARE @ReturnValue INT
-- The lowest score among the top 10 percent of rows approximates the 90th percentile
SELECT
@ReturnValue = MIN(DerivedTopTenPercent.score) --AS [90th Percentile]
FROM
(
SELECT TOP 10 PERCENT
score
FROM
Facility
WHERE
[Month] = @Month
ORDER BY
score DESC
) DerivedTopTenPercent
RETURN @ReturnValue
END
With that function in place, your final query will look like this...
SELECT
[Month],
AVG(CONVERT(real, score)) AS [Mean],
STDEV(score) AS [Standard Deviation],
dbo.NintythPercentile([Month]) AS [90th Percentile]
FROM
Facility
GROUP BY
[Month]
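To see what the function's TOP 10 PERCENT approach actually computes, here is a minimal Python sketch of the same logic. The function name `top_ten_percent_min` is hypothetical. It relies on the documented behaviour that TOP ... PERCENT rounds a fractional row count up to the next integer. Note that this is a nearest-rank style approximation: it always returns a value present in the data and never interpolates.

```python
def top_ten_percent_min(values):
    """Mimic MIN(score) over SELECT TOP 10 PERCENT ... ORDER BY score DESC."""
    n = len(values)
    keep = (n * 10 + 99) // 100               # 10 percent of the rows, rounded up
    top = sorted(values, reverse=True)[:keep]
    return min(top) if top else None          # None stands in for SQL NULL

print(top_ten_percent_min([5, 5, 3, 4, 4, 4]))    # 6 rows -> keeps 1 row -> 5
print(top_ten_percent_min(list(range(1, 21))))    # 20 rows -> keeps 2 rows -> 19
```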

Related

I want to find the date intervals at which the employee comes on a regular basis

Imagine an employee who works for a company under a contract for a specific task; he starts and finishes on the contract's start and end dates. I want to find the intervals during which the employee came to the office every day, without any absence.
Example Data:
DECLARE @TimeClock TABLE (PunchID INT IDENTITY, EmployeeID INT, PunchinDate DATE)
INSERT INTO @TimeClock (EmployeeID, PunchInDate) VALUES
(1, '2020-01-01'), (1, '2020-01-02'), (1, '2020-01-03'), (1, '2020-01-04'),
(1, '2020-01-05'), (1, '2020-01-06'), (1, '2020-01-07'), (1, '2020-01-08'),
(1, '2020-01-09'), (1, '2020-01-10'), (1, '2020-01-11'), (1, '2020-01-12'),
(1, '2020-01-13'), (1, '2020-01-14'), (1, '2020-01-16'),
(1, '2020-01-17'), (1, '2020-01-18'), (1, '2020-01-19'), (1, '2020-01-20'),
(1, '2020-01-21'), (1, '2020-01-22'), (1, '2020-01-23'), (1, '2020-01-24'),
(1, '2020-01-25'), (1, '2020-01-26'), (1, '2020-01-27'), (1, '2020-01-28'),
(1, '2020-01-29'), (1, '2020-01-30'), (1, '2020-01-31'),
(1, '2020-02-01'), (1, '2020-02-02'), (1, '2020-02-03'), (1, '2020-02-04'),
(1, '2020-02-05'), (1, '2020-02-06'), (1, '2020-02-07'), (1, '2020-02-08'),
(1, '2020-02-09'), (1, '2020-02-10'), (1, '2020-02-12'),
(1, '2020-02-13'), (1, '2020-02-14'), (1, '2020-02-15'), (1, '2020-02-16');
-- the output should look like this: '2020-01-01 to 2020-02-10', as this is the interval in which the employee comes in without any leave
SELECT 1 AS ID, FORMAT( getdate(), '2020-01-01') as START_DATE, FORMAT( getdate(), '2020-01-10') as END_DATE union all
SELECT 1 AS ID, FORMAT( getdate(), '2020-01-11') as START_DATE, FORMAT( getdate(), '2020-01-15') as END_DATE union all
SELECT 1 AS ID, FORMAT( getdate(), '2020-01-21') as START_DATE, FORMAT( getdate(), '2020-01-31') as END_DATE union all
SELECT 1 AS ID, FORMAT( getdate(), '2020-02-01') as START_DATE, FORMAT( getdate(), '2020-02-10') as END_DATE
-- the output should look like this: '2020-01-01 to 2020-01-15' and '2020-01-21 to 2020-02-10', as these are the intervals in which the employee comes in without any leave
Using the example data provided we can query the table like this:
;WITH iterate AS (
SELECT *, DATEADD(DAY,1,PunchinDate) AS NextDate
FROM @TimeClock
), base AS (
SELECT *
FROM (
SELECT *, CASE WHEN DATEADD(DAY,-1,PunchInDate) = LAG(PunchinDate,1) OVER (PARTITION BY EmployeeID ORDER BY PunchinDate) THEN PunchInDate END AS s
FROM iterate
) a
WHERE s IS NULL
), rCTE AS (
SELECT EmployeeID, PunchInDate AS StartDate, PunchInDate AS EndDate, NextDate
FROM base
UNION ALL
SELECT a.EmployeeID, a.StartDate, r.PunchInDate, r.NextDate
FROM rCTE a
INNER JOIN iterate r
ON a.NextDate = r.PunchinDate
AND a.EmployeeID = r.EmployeeID
)
SELECT EmployeeID, StartDate, MAX(EndDate) AS EndDate, DATEDIFF(DAY,StartDate,MAX(EndDate)) AS Streak
FROM rCTE
GROUP BY rCTE.EmployeeID, rCTE.StartDate
This uses a recursive common table expression, which lets us walk from each row to the row for the following day. We are looking for runs of consecutive dates (streaks), and we want to restart the streak whenever we encounter a break. The window function LAG looks back one row to the previous punch-in date and compares it with the current one. If the previous date is not yesterday, the current row starts a new streak.
EmployeeID StartDate EndDate Streak
------------------------------------------
1 2020-01-01 2020-01-15 14
1 2020-01-17 2020-02-10 24
1 2020-02-12 2020-02-16 4
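The streak logic described above can also be sketched outside SQL. The following Python snippet is only an illustration of the idea (a new streak starts whenever the previous day is missing); the function name `streaks` and the short list of punch-in dates are made up for this example and are not the question's data. The length is the number of days between start and end, mirroring the DATEDIFF(DAY, ...) used in the query.

```python
from datetime import date, timedelta

def streaks(days):
    """Group dates into runs of consecutive days: (start, end, length_in_days)."""
    runs = []
    for d in sorted(set(days)):
        if runs and d - runs[-1][1] == timedelta(days=1):
            runs[-1][1] = d          # yesterday is present: extend the current streak
        else:
            runs.append([d, d])      # yesterday is missing: start a new streak
    return [(start, end, (end - start).days) for start, end in runs]

# Hypothetical punch-in dates: 1-3 Jan, a gap on 4 Jan, then 5-6 Jan.
punches = [date(2020, 1, n) for n in (1, 2, 3, 5, 6)]
for start, end, length in streaks(punches):
    print(start, end, length)
# 2020-01-01 2020-01-03 2
# 2020-01-05 2020-01-06 1
```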

How to sum and subtract one column value based on percentage in SQL Server 2008

DECLARE @BalanceTblRec TABLE
(
NetAmount decimal(18, 3),
Percentage int,
[Description] nvarchar(max)
)
DECLARE @BalanceTblPay TABLE
(
NetAmount decimal(18, 3),
Percentage int,
[Description] nvarchar(max)
)
INSERT INTO @BalanceTblRec
VALUES (21, 11, 'ReceiveReceipt'),
(20, 11, 'ReceiveReceipt'),
(20, 10, 'ReceiveReceipt'),
(20, 20, 'ReceiveReceipt'),
(10, 10, 'ReceiveReceipt')
INSERT INTO @BalanceTblPay
VALUES (10, 11, 'PayReceipt'),
(10, 11, 'PayReceipt'),
(10, 2, 'PayReceipt'),
(5, 15, 'PayReceipt'),
(30, 10, 'PayReceipt'),
(20, 10, 'PayReceipt')
;WITH MaPercentage AS
(
SELECT
Percentage,
SUM(NetAmount) AS Net,
'Receive' AS Flag
FROM
@BalanceTblRec
GROUP BY
Percentage
UNION ALL
SELECT
Percentage,
SUM(NetAmount) AS Net,
'Pay' AS Flag
FROM
@BalanceTblPay
GROUP BY
Percentage
)
SELECT * FROM MaPercentage
Now I want to subtract one Net from the other based on Flag: Receive minus Pay for each percentage.
Like this:
Per Net Flag
-----------------------
10 30.000 - 50 Receive
11 41.000 - 20 Receive
20 20.000 Receive
2 10.000 Pay
15 5.000 Pay
I think this is what you want:
DECLARE @BalanceTblRec TABLE (NetAmount decimal(18,3), Percentage int, [Description] nvarchar(max))
DECLARE @BalanceTblPay TABLE (NetAmount decimal(18,3), Percentage int, [Description] nvarchar(max))
insert into @BalanceTblRec values (21, 11, 'ReceiveReceipt'),(20, 11, 'ReceiveReceipt'),(20, 10, 'ReceiveReceipt'),(20, 20, 'ReceiveReceipt'), (10, 10, 'ReceiveReceipt')
insert into @BalanceTblPay values (10, 11, 'PayReceipt'),(10, 11, 'PayReceipt'),(10, 2, 'PayReceipt'),(5, 15, 'PayReceipt'),(30, 10, 'PayReceipt') ,(20, 10, 'PayReceipt')
;WITH MaPercentage as (
select Percentage, sum(NetAmount) as Net, 'Receive' as Flag from @BalanceTblRec group by Percentage
union all
select Percentage, -sum(NetAmount) as Net, 'Pay' as Flag from @BalanceTblPay group by Percentage
)
select
Percentage,
abs(sum(net)) as SumNet,
case when sum(net) > 0 then 'Receive'
else 'Pay'
end as Flag
from MaPercentage
group by Percentage
I just changed the sign of the payments and then summed, grouping by percentage.
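The sign-flip idea can be checked by hand with a short Python sketch over the question's sample rows. This is only an illustration of the arithmetic, not a replacement for the query; the variable names are chosen for this example.

```python
from collections import defaultdict

# (NetAmount, Percentage) pairs from the question's two tables
receipts = [(21, 11), (20, 11), (20, 10), (20, 20), (10, 10)]
payments = [(10, 11), (10, 11), (10, 2), (5, 15), (30, 10), (20, 10)]

net = defaultdict(int)
for amount, pct in receipts:
    net[pct] += amount            # receipts count as positive
for amount, pct in payments:
    net[pct] -= amount            # payments count as negative

# Report the absolute net amount and which side is larger, as the query does.
result = {
    pct: (abs(total), "Receive" if total > 0 else "Pay")
    for pct, total in sorted(net.items())
}
for pct, (amount, flag) in result.items():
    print(pct, amount, flag)
# 2 10 Pay
# 10 20 Pay
# 11 21 Receive
# 15 5 Pay
# 20 20 Receive
```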
Another way is to FULL JOIN the receipts with the payments.
;WITH RCV AS (
select Percentage, sum(NetAmount) as Net
from @BalanceTblRec
group by Percentage
)
, PAY AS (
select Percentage, sum(NetAmount) as Net
from @BalanceTblPay
group by Percentage
)
SELECT
COALESCE(r.Percentage, p.Percentage) AS Percentage,
ABS(COALESCE(r.Net, 0) - COALESCE(p.Net, 0)) AS Net,
(CASE
WHEN (COALESCE(r.Net, 0) - COALESCE(p.Net, 0)) < 0 THEN 'Pay'
ELSE 'Receive'
END) AS Flag
FROM RCV r
FULL JOIN PAY p ON p.Percentage = r.Percentage

SQL subtract same column values by conditional rows and insert new results on a table

I have this code in Snowflake to obtain the following table:
CREATE OR REPLACE TABLE LIVE_ANALYTICS.REPORTING."GL_REPORT"
AS
SELECT
SUM("TABLE_GL_ENTRY"."Amount") AS "AMOUNT",
MONTH("TABLE_GL_ENTRY"."Posting Date") AS "MONTH",
YEAR("TABLE_GL_ENTRY"."Posting Date") AS "YEAR",
"TABLE_GL_ENTRY"."Global Dimension 1 Code" AS "ID_STORE",
CASE
WHEN "YEAR" = YEAR(CURRENT_DATE()) THEN 'AMOUNT_CURRENT_YEAR'
ELSE 'AMOUNT_LAST_YEAR'
END AS "METRIC"
FROM
LIVE_ANALYTICS.NAVISION."G_L Entry" AS "TABLE_GL_ENTRY"
WHERE
(("MONTH" = MONTH(CURRENT_DATE()) OR "MONTH" = MONTH(ADD_MONTHS(CURRENT_DATE, 1)))
AND
("YEAR" = YEAR(CURRENT_DATE()) OR "YEAR" = YEAR(ADD_MONTHS(CURRENT_DATE, -12))))
AND
(("ID_STORE" LIKE '1')
OR
("ID_STORE" LIKE '2')
OR
("ID_STORE" LIKE '3'))
GROUP BY
"MONTH",
"YEAR",
"ID_STORE"
;
From this table I need to subtract "AMOUNT" by "MONTH" and "ID_STORE" only if "YEAR" = 2021.
AMOUNT_CURRENT_YEAR - AMOUNT_LAST_YEAR
Finally, I want to insert these new results into the existing table with the other existing records.
How can I do this? Any suggestion?
Thank you in advance
Kind regards.
EDIT: This is the solution I had been looking for over the last few days. Thank you, everyone!
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=9909600b9ac804ba67b7086c40e0c844
create table last_amount as
select sum(amount) as 'amount_last', month, year, id, metric
from data where metric like 'last'
group by month, year, id, metric
order by id, month, year;
create table current_amount as
select sum(amount) as 'amount_current', month, year, id, metric
from data where metric like 'current'
group by month, year, id, metric
order by id, month, year;
create table subtract as
select
c.amount_current - l.amount_last as amount, c.month, c.year, c.id
from
last_amount l
join
current_amount c
on
l.id = c.id
and
l.month = c.month;
alter table subtract add metric varchar(255) default 'subtract';
insert into data
select * from subtract;
drop table last_amount;
drop table current_amount;
select * from data order by id, month, year;
You can use conditional aggregation and insert:
insert into LIVE_ANALYTICS.NAVISION."G_L Entry" (amount, month, year, id_store, metric)
select sum(case when metric = 'AMOUNT_CURRENT_YEAR' then amount
when metric = 'AMOUNT_LAST_YEAR' then - amount
end),
month, year, id_store, 'SUBTRACT_LAST_YEAR'
from LIVE_ANALYTICS.NAVISION."G_L Entry" gl
where year = 2021
group by month, year, id_store;
This might help you. The code might not cover all the values, but it will give you an idea of the logic.
Use the link for an example:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=7ad1e26d11b1c835fc77e9c4578bebf5
First: select all the data.
Second: use the lead() function to bring the next row's amount onto the current row.
Third: subtract based on your logic.
Fourth: combine the original and new results using UNION.
with all_data as (
select * from data
),
last_year_current_year as (
select
*,
lead(amount) over(partition by id, month, year) as one_less
from all_data
)
,
subtraction as (
select
*,
one_less - amount as new_amount
from last_year_current_year
),
combining_data as (
select new_amount, month, year, id from subtraction
union
select amount, month, year, id from subtraction
)
select * from combining_data
-- this is in SQL Server but I hope the logic will be the same in MySql as well
create table #temp1
(
Amount money,
MonthNum int,
YearNum int,
ID_Store int,
Metric varchar(100)
);
insert into #temp1 values(10,8,2020,1,'AMOUNT LAST YEAR')
insert into #temp1 values(20,8,2021,1,'AMOUNT CURRENT YEAR')
insert into #temp1 values(30,8,2020,2,'AMOUNT LAST YEAR')
insert into #temp1 values(40,8,2021,2,'AMOUNT CURRENT YEAR')
insert into #temp1 values(50,8,2020,3,'AMOUNT LAST YEAR')
insert into #temp1 values(60,8,2021,3,'AMOUNT CURRENT YEAR')
insert into #temp1 values(70,9,2020,1,'AMOUNT__YEAR')
insert into #temp1 values(0,9,2021,1,'AMOUNT CURRENT YEAR')
insert into #temp1 values(90,9,2020,2,'AMOUNT__YEAR')
insert into #temp1 values(0,9,2021,2,'AMOUNT CURRENT YEAR')
insert into #temp1 values(110,9,2020,3,'AMOUNT__YEAR')
insert into #temp1 values(0,9,2021,3,'AMOUNT CURRENT YEAR')
-- the actual code starts here
Select
(t1.Amount-t3.Amount) as Amount
,t1.MonthNum
,t1.YearNum
,t1.ID_Store
,'Modified Metric' as Metric
into #temp2
from #temp1 t1
join (Select t2.Amount,t2.MonthNum,t2.ID_Store from #temp1 t2 where t2.YearNum=2020) t3 ON t1.MonthNum=t3.MonthNum and t1.ID_Store=t3.ID_Store
where t1.YearNum=2021
insert into #temp1
select * from #temp2
-- the actual code ends here
Select * from #temp1;
drop table #temp1;
drop table #temp2;
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=9909600b9ac804ba67b7086c40e0c844
create table data ( amount int, month int, year int, id int, metric varchar(255) );
insert into data values (10, 8, 2020, 1, 'last');
insert into data values (20, 8, 2021, 1, 'current');
insert into data values (30, 8, 2020, 2, 'last');
insert into data values (60, 8, 2021, 3, 'current');
insert into data values (70, 9, 2020, 1, 'last');
insert into data values (0, 9, 2021, 1, 'current');
insert into data values (90, 9, 2020, 2, 'last');
insert into data values (40, 8, 2021, 2, 'current');
insert into data values (50, 8, 2020, 3, 'last');
insert into data values (0, 9, 2021, 2, 'current');
insert into data values (110, 9, 2020, 3, 'last');
insert into data values (0, 9, 2021, 3, 'current');
select * from data order by id, month, year;
create table last_amount as
select sum(amount) as 'amount_last', month, year, id, metric
from data where metric like 'last'
group by month, year, id, metric
order by id, month, year;
create table current_amount as
select sum(amount) as 'amount_current', month, year, id, metric
from data where metric like 'current'
group by month, year, id, metric
order by id, month, year;
create table subtract as
select
c.amount_current - l.amount_last as amount, c.month, c.year, c.id
from
last_amount l
join
current_amount c
on
l.id = c.id
and
l.month = c.month;
alter table subtract add metric varchar(255) default 'subtract';
insert into data
select * from subtract;
drop table last_amount;
drop table current_amount;
select * from data order by id, month, year;
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=6a50ae3dce1eda02250d79fbb6f95ca7
create table data ( amount int, month int, year int, id int, metric varchar(255) );
insert into data values (10, 8, 2020, 1, 'last');
insert into data values (20, 8, 2021, 1, 'current');
insert into data values (30, 8, 2020, 2, 'last');
insert into data values (60, 8, 2021, 3, 'current');
insert into data values (70, 9, 2020, 1, 'last');
insert into data values (0, 9, 2021, 1, 'current');
insert into data values (90, 9, 2020, 2, 'last');
insert into data values (40, 8, 2021, 2, 'current');
insert into data values (50, 8, 2020, 3, 'last');
insert into data values (0, 9, 2021, 2, 'current');
insert into data values (110, 9, 2020, 3, 'last');
insert into data values (0, 9, 2021, 3, 'current');
select * from data order by id, month, year;
select * from data
union all
select
c.amount - l.amount as amount, c.month, c.year, c.id, 'subtract' as metric
from
data c
right join
data l
on
l.id = c.id
and
l.month = c.month
where c.year = 2021
and c.amount - l.amount != 0
order by id, month, year;
Better option:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=4273a6e920b54e118868ca44a5927cce
create table data ( amount int, month int, year int, id int, metric
varchar(255) );
insert into data values (10, 8, 2020, 1, 'last');
insert into data values (20, 8, 2021, 1, 'current');
insert into data values (30, 8, 2020, 2, 'last');
insert into data values (60, 8, 2021, 3, 'current');
insert into data values (70, 9, 2020, 1, 'last');
insert into data values (0, 9, 2021, 1, 'current');
insert into data values (90, 9, 2020, 2, 'last');
insert into data values (40, 8, 2021, 2, 'current');
insert into data values (50, 8, 2020, 3, 'last');
insert into data values (0, 9, 2021, 2, 'current');
insert into data values (110, 9, 2020, 3, 'last');
insert into data values (0, 9, 2021, 3, 'current');
select * from data order by id, month, year;
select * from data
union all
select
c.amount - l.amount as amount, c.month, c.year, c.id, 'subtract' as metric
from
(select * from data where year = 2021) c
left join
(select * from data where year = 2020) l
on
l.id = c.id
and
l.month = c.month
order by id, month, year, metric;
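The logic of this last query (pair each 2021 row with the 2020 row for the same id and month, then subtract) can be traced with a small Python sketch over the fiddle's rows. It is only an illustration and assumes there is at most one 2020 row per id and month, which holds for this sample data; the variable names are invented for the example.

```python
# (amount, month, year, id, metric) rows from the fiddle
rows = [
    (10, 8, 2020, 1, "last"), (20, 8, 2021, 1, "current"),
    (30, 8, 2020, 2, "last"), (60, 8, 2021, 3, "current"),
    (70, 9, 2020, 1, "last"), (0, 9, 2021, 1, "current"),
    (90, 9, 2020, 2, "last"), (40, 8, 2021, 2, "current"),
    (50, 8, 2020, 3, "last"), (0, 9, 2021, 2, "current"),
    (110, 9, 2020, 3, "last"), (0, 9, 2021, 3, "current"),
]

# Index last year's amounts by (id, month), the join key of the query.
last_year = {(r[3], r[1]): r[0] for r in rows if r[2] == 2020}

subtract = []
for amount, month, year, id_, _metric in rows:
    if year == 2021:
        prev = last_year.get((id_, month))              # None plays the role of a LEFT JOIN miss
        diff = None if prev is None else amount - prev
        subtract.append((diff, month, year, id_, "subtract"))

combined = rows + subtract                              # the UNION ALL of the query
for row in sorted(subtract, key=lambda r: (r[3], r[1])):
    print(row)
```

For the sample rows this gives a difference of 10 for every store in month 8 and -70, -90 and -110 for stores 1, 2 and 3 in month 9.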

How to sum up Loss amount per each claim ignoring date

I have a table with a Loss amount for each transaction date.
How can I create a column, ClaimLoss, that sums up the Loss amount for each claim?
declare @TempTable1 table (ID int, ClaimNumber varchar(100), date date, Loss money)
insert into @TempTable1
values (1, 'Claim1','2017-01-01', 100),
(2, 'Claim1','2017-03-06',150),
(3, 'Claim1','2017-05-01', 50),
(4, 'Claim2','2018-01-01', 150),
(5, 'Claim2','2018-08-15', 250),
(6, 'Claim2','2018-05-03', 350),
(7, 'Claim3','2018-09-01', 330),
(8, 'Claim4','2019-01-01', 140),
(9, 'Claim4','2019-01-13', 225),
(10, 'Claim5','2019-02-01', 145)
select ID,
ClaimNumber,
Date,
Loss
from @TempTable1
I need something like this:
Is it possible to do in the same select statement?
This seems like a place to use row_number() and case:
select t.*,
(case when row_number() over (partition by ClaimNumber order by date) = 1
then sum(loss) over (partition by ClaimNumber)
else 0
end) as claimloss
from @TempTable1 t;
You can use a window function:
select ID, ClaimNumber, Date, Loss,
(case when min(id) over (partition by ClaimNumber) = id
then sum(loss) over (partition by ClaimNumber)
else 0
end) as claimloss
from @TempTable1;
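Both queries follow the same pattern: compute the per-claim total, then show it only on one row of each claim and 0 on the others. A minimal Python sketch of that pattern over the question's rows, picking the row with the smallest ID as in the second query (the variable names are made up for this example):

```python
from collections import defaultdict

# (ID, ClaimNumber, date, Loss) rows from the question
rows = [
    (1, "Claim1", "2017-01-01", 100), (2, "Claim1", "2017-03-06", 150),
    (3, "Claim1", "2017-05-01", 50), (4, "Claim2", "2018-01-01", 150),
    (5, "Claim2", "2018-08-15", 250), (6, "Claim2", "2018-05-03", 350),
    (7, "Claim3", "2018-09-01", 330), (8, "Claim4", "2019-01-01", 140),
    (9, "Claim4", "2019-01-13", 225), (10, "Claim5", "2019-02-01", 145),
]

totals = defaultdict(int)     # sum(loss) over (partition by ClaimNumber)
first_id = {}                 # min(id) over (partition by ClaimNumber)
for id_, claim, _day, loss in rows:
    totals[claim] += loss
    first_id[claim] = min(first_id.get(claim, id_), id_)

# The claim total appears on the first row of each claim, 0 elsewhere.
result = [
    (id_, claim, day, loss, totals[claim] if id_ == first_id[claim] else 0)
    for id_, claim, day, loss in rows
]
print([r[4] for r in result])
# [300, 0, 0, 750, 0, 0, 330, 365, 0, 145]
```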

Reward points by looking at the already created fields

I have a table here:
I want to award a gold and a silver (i.e. a value of 1 where applicable, else 0) to the top 2 persons, based on the pointsRewarded field.
I have already created the first table. I want a new table with two new fields, gold and silver.
I want the output to be something like this:
Please help me with the query or give me some suggestions on how to proceed.
Thanks a lot.
I think you want to use dense_rank() for this:
select t.*,
(case when rnk = 1 then 1 else 0 end) as gold,
(case when rnk = 2 then 1 else 0 end) as silver
from (select t.*,
dense_rank() over (partition by week order by pointsrewarded desc) as rnk
from t
) t;
dense_rank() will handle the case when there are ties. In that case, multiple "gold" and "silver" values will be assigned.
I should also note that the subquery is not necessary. You can repeat the dense_rank() in the outer query. I just think it is easier to follow the logic this way.
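To make the tie behaviour concrete, here is a small Python sketch of dense ranking, assuming the highest points should rank first. The helper `dense_rank_desc` and the week's scores are hypothetical and only serve to show that tied scores share a rank, so more than one person can receive the same medal.

```python
def dense_rank_desc(points):
    """Map each distinct score to its dense rank, highest score first."""
    distinct = sorted(set(points), reverse=True)
    return {p: rank for rank, p in enumerate(distinct, start=1)}

# Hypothetical week with a tie for first place.
week = {"person1": 400, "person2": 400, "person3": 200, "person4": 100}
ranks = dense_rank_desc(week.values())

# (gold, silver) flags per person, 1 where applicable and 0 otherwise.
medals = {name: (int(ranks[p] == 1), int(ranks[p] == 2)) for name, p in week.items()}
print(medals)
# {'person1': (1, 0), 'person2': (1, 0), 'person3': (0, 1), 'person4': (0, 0)}
```

With a plain row number instead of a dense rank, only one of the two tied people would get gold and the other would get silver.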
Make sure to order by pointsrewarded descending so that first place goes to the highest points and not the lowest. My code is longer, but I find it easier to read (personal preference).
--create table employee (employeeid int, employeename varchar(50), weeknumber int, pointsRewarded int, Hours int)
--insert into employee values (111, 'person1', 1, 400, 2)
--insert into employee values (112, 'person2', 1, 100, 10)
--insert into employee values (113, 'person3', 1, 200, 10)
--insert into employee values (111, 'person1', 2, 100, 2)
--insert into employee values (112, 'person2', 2, 50, 10)
--insert into employee values (113, 'person3', 2, 200, 10)
--insert into employee values (111, 'person1', 3, 20, 4)
--insert into employee values (112, 'person2', 3, 25, 5)
--insert into employee values (113, 'person3', 3, 100, 6)
;WITH Medals AS
(
SELECT
employeeid
,employeename
,weeknumber
,pointsRewarded
,hours
,ROW_NUMBER() OVER (PARTITION BY weeknumber ORDER BY pointsrewarded DESC) medal
FROM
employee
)
SELECT
employeeid
,employeename
,weeknumber
,pointsRewarded
,hours
,CASE WHEN medal = 1 THEN 1 ELSE 0 END AS gold
,CASE WHEN medal = 2 THEN 1 ELSE 0 END AS silver
FROM
Medals