Count all rows while not counting any row after a negative value - sql

I have a table t with:
PLACE  LOCATION  TS          ID  AMOUNT  GOING_IN  GOING_OUT
1      10        2020-10-01  1   100     10        0
1      10        2020-10-02  1   110     5         -50
1      10        2020-10-03  1   75      0         -100
1      10        2020-10-04  1   -25     30        0
1      10        2020-10-05  1   5       0         0
1      10        2020-10-06  1   5       38        -300
1      10        2020-10-07  1   -257    0         0
1      10        2020-10-01  2   1       10        0
1      10        2020-10-02  2   11      0         -12
1      10        2020-10-03  2   -1      0         -100
1      10        2020-10-04  2   -101    0         0
2      20        2020-11-15  1   18      20        0
2      20        2020-11-16  1   38      0         0
2      20        2020-11-15  3   -9      20        -31
2      20        2020-11-16  3   -20     0         0
Due to SAP legacy issues, some logistics data is mangled, which can lead to negative inventory.
To check how severe the error is, I need to count, for each PLACE, LOCATION, ID:
(a) the number of rows that have a positive AMOUNT and no negative AMOUNT before them (COUNT_CORRECT), and
(b) the number of rows that have a negative AMOUNT, plus any rows with a positive AMOUNT that have a negative AMOUNT anywhere before them (COUNT_WRONG).
As you can see in my table, for PLACE=1, LOCATION=10, ID=1 there are 3 rows with a positive AMOUNT and no negative AMOUNT before them. Then comes a negative AMOUNT, followed by some positive AMOUNTs. Those 4 rows (the first negative row and everything after it) should not count towards COUNT_CORRECT but towards COUNT_WRONG.
So in this example table my query should return:
PLACE  LOCATION  TOTAL  COUNT_CORRECT  COUNT_WRONG  RATIO
1      10        11     5              6            0.55
2      20        4      2              2            0.5
My code so far:
CREATE OR REPLACE TABLE ANALYTICS.t (
PLACE INT NOT NULL
, LOCATION INT NOT NULL
, TS DATE NOT NULL
, ID INT NOT NULL
, AMOUNT INT NOT NULL
, GOING_IN INT NOT NULL
, GOING_OUT INT NOT NULL
, PRIMARY KEY(PLACE, LOCATION, ID, TS)
);
INSERT INTO ANALYTICS.t
(PLACE, LOCATION, TS, ID, AMOUNT, GOING_IN, GOING_OUT)
VALUES
(1, 10, '2020-10-01', 1, 100, 10, 0)
, (1, 10, '2020-10-02', 1, 110, 5, -50)
, (1, 10, '2020-10-03', 1, 75, 0, -100)
, (1, 10, '2020-10-04', 1, -25, 30, 0)
, (1, 10, '2020-10-05', 1, 5, 0, 0)
, (1, 10, '2020-10-06', 1, 5, 38, 300)
, (1, 10, '2020-10-07', 1, -257, 0, 0)
, (1, 10, '2020-10-04', 2, 1, 10, 0)
, (1, 10, '2020-10-05', 2, 11, 0, -12)
, (1, 10, '2020-10-06', 2, -1, 0, -100)
, (1, 10, '2020-10-07', 2, -101, 0, 0)
, (2, 20, '2020-11-15', 1, 18, 12, 0)
, (2, 20, '2020-11-16', 1, 30, 0, 0)
, (2, 20, '2020-11-15', 3, -9, 20, -31)
, (2, 20, '2020-11-16', 3, -20, 0, 0)
;
Then
SELECT PLACE
, LOCATION
, SUM(CASE WHEN AMOUNT >= 0 THEN 1 ELSE 0 END) AS 'COUNT_CORRECT'
, SUM(CASE WHEN AMOUNT < 0 THEN 1 ELSE 0 END) AS 'COUNT_WRONG'
, ROUND((SUM(CASE WHEN AMOUNT < 0 THEN 1 ELSE 0 END) / COUNT(AMOUNT)) * 100, 2) AS 'ratio'
FROM t
GROUP BY PLACE, LOCATION
ORDER BY PLACE, LOCATION
;
But I don't know how I can filter for "AND which do not have a negative AMOUNT before" and counting by PLACE, LOCATION, ID as an intermediate step.
Any help appreciated.

I'm not sure if I understand your question correctly, but the following gives you the number of rows before the first negative amount per (place, location) partition.
The subselect computes the row numbers of all rows with a negative amount. Then we can select the minimum of this as the first row with a negative amount.
SELECT
    place,
    location,
    COUNT(*) - NVL(MIN(pos) - 1, COUNT(*)) AS COUNT_WRONG,
    COUNT(*) - local.COUNT_WRONG AS COUNT_CORRECT,
    ROUND(local.COUNT_WRONG / COUNT(*), 2) AS RATIO
FROM
    ( SELECT
          amount,
          place,
          location,
          CASE
              WHEN amount < 0
              THEN ROW_NUMBER() OVER (
                       PARTITION BY
                           place,
                           location
                       ORDER BY
                           TS)
              ELSE NULL
          END pos -- Row number of each row with a negative amount, else NULL
      FROM
          t)
GROUP BY
    place,
    location;

I have edited the query. Please let me know if this works.
The ALL_ENTRIES query assigns a row number to every row of table t, partitioned by PLACE, LOCATION and ID and ordered by TS.
TABLE1 computes the first negative entry. It joins t with ALL_ENTRIES and selects the minimum row number where AMOUNT < 0.
TABLE2 computes the last correct entry. ALL_ENTRIES is joined with TABLE1 on the condition that the row number is less than the row number in TABLE1, so the maximum of those row numbers is the last correct entry.
TABLE2 is then left joined to ALL_ENTRIES to calculate the maximum row number, which gives the total number of entries.
In the final SELECT I use a CASE WHEN expression to account for IDs that have no negative AMOUNT values. In those cases all entries are correct, so the maximum row number is used.
WITH ALL_ENTRIES AS (
    SELECT
        PLACE,
        LOCATION,
        ID,
        TS,
        AMOUNT,
        ROW_NUMBER() OVER (PARTITION BY PLACE, LOCATION, ID ORDER BY TS) AS ROW_NUM
    FROM t)
SELECT
    PLACE,
    LOCATION,
    ID,
    TOTAL,
    COUNT_CORRECT,
    TOTAL - COUNT_CORRECT AS COUNT_WRONG,
    COUNT_CORRECT / TOTAL AS RATIO
FROM
    (SELECT
        ae.PLACE,
        ae.LOCATION,
        ae.ID,
        MAX(ae.ROW_NUM) AS TOTAL,
        MAX(CASE WHEN table2.LAST_CORRECT_ENTRY IS NULL THEN ae.ROW_NUM ELSE table2.LAST_CORRECT_ENTRY END) AS COUNT_CORRECT
    FROM
        ALL_ENTRIES ae
    LEFT JOIN
        (SELECT
            ae.PLACE,
            ae.LOCATION,
            ae.ID,
            MAX(ae.ROW_NUM) AS LAST_CORRECT_ENTRY
        FROM
            ALL_ENTRIES ae
        INNER JOIN
            (SELECT
                t.PLACE,
                t.LOCATION,
                t.ID,
                MIN(ae.ROW_NUM) AS FIRST_NEGATIVE_ENTRY
            FROM t t
            INNER JOIN
                ALL_ENTRIES ae ON t.PLACE = ae.PLACE
                AND t.LOCATION = ae.LOCATION
                AND t.ID = ae.ID
                AND t.TS = ae.TS
                AND t.AMOUNT = ae.AMOUNT
                AND ae.AMOUNT < 0
            GROUP BY t.PLACE, t.LOCATION, t.ID  -- t.ID is selected, so it must be grouped
            ) table1
            ON ae.PLACE = table1.PLACE
            AND ae.LOCATION = table1.LOCATION
            AND ae.ID = table1.ID
            AND ae.ROW_NUM < table1.FIRST_NEGATIVE_ENTRY
        GROUP BY ae.PLACE, ae.LOCATION, ae.ID
        ) table2
        ON ae.PLACE = table2.PLACE
        AND ae.LOCATION = table2.LOCATION
        AND ae.ID = table2.ID
    GROUP BY ae.PLACE, ae.LOCATION, ae.ID
    )
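For comparison, here is a minimal sketch of a different technique (not the method used in either answer above): a running minimum of AMOUNT per (PLACE, LOCATION, ID). Once the running minimum drops below zero it stays below zero, so every row from the first negative one onward is flagged as wrong, including IDs whose very first row is negative. The sketch wraps the SQL in Python's sqlite3 module purely so that it can be run as-is; it assumes an SQLite build with window functions (3.25 or later), loads only the columns it needs from the question's INSERT statement, and takes RATIO to mean COUNT_WRONG / TOTAL, as the expected output suggests.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (PLACE INT, LOCATION INT, TS TEXT, ID INT, AMOUNT INT);
INSERT INTO t (PLACE, LOCATION, TS, ID, AMOUNT) VALUES
  (1, 10, '2020-10-01', 1,  100),
  (1, 10, '2020-10-02', 1,  110),
  (1, 10, '2020-10-03', 1,   75),
  (1, 10, '2020-10-04', 1,  -25),
  (1, 10, '2020-10-05', 1,    5),
  (1, 10, '2020-10-06', 1,    5),
  (1, 10, '2020-10-07', 1, -257),
  (1, 10, '2020-10-04', 2,    1),
  (1, 10, '2020-10-05', 2,   11),
  (1, 10, '2020-10-06', 2,   -1),
  (1, 10, '2020-10-07', 2, -101),
  (2, 20, '2020-11-15', 1,   18),
  (2, 20, '2020-11-16', 1,   30),
  (2, 20, '2020-11-15', 3,   -9),
  (2, 20, '2020-11-16', 3,  -20);
""")

query = """
WITH flagged AS (
    SELECT PLACE, LOCATION, ID,
           -- Running minimum up to and including the current row:
           -- negative from the first negative AMOUNT onward.
           MIN(AMOUNT) OVER (
               PARTITION BY PLACE, LOCATION, ID
               ORDER BY TS
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_min
    FROM t
)
SELECT PLACE,
       LOCATION,
       COUNT(*) AS TOTAL,
       SUM(CASE WHEN running_min >= 0 THEN 1 ELSE 0 END) AS COUNT_CORRECT,
       SUM(CASE WHEN running_min <  0 THEN 1 ELSE 0 END) AS COUNT_WRONG,
       ROUND(SUM(CASE WHEN running_min < 0 THEN 1 ELSE 0 END) * 1.0 / COUNT(*), 2) AS RATIO
FROM flagged
GROUP BY PLACE, LOCATION
ORDER BY PLACE, LOCATION
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Grouping the outer query by PLACE, LOCATION, ID instead would give the per-ID intermediate step the question mentions.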

Related

Count based on a date condition in PLSQLDev

I need help to do a count based on a date condition.
I have a table similar to the following:
MainDB
ID  report_date  traffic_v  traffic_ul  traffic_dl
a   1/12/2021    0          0           100
a   2/12/2021    0          0           100
a   3/12/2021    100        0           100
a   4/12/2021    100        0           100
b   1/12/2021    0          100         100
b   2/12/2021    0          0           0
b   3/12/2021    0          100         0
b   4/12/2021    100        100         0
I need to count the records that are at zero, for which I have this query:
SELECT
ID AS SECTOR,
SUM(TRAFFIC) TRAFICO_VOZ,
SUM(TRAFFIC_DL_G) + SUM(TRAFFIC_DL_E) TRAFFIC_DL,
SUM(TRAFFIC_UL_G) + SUM(TRAFFIC_UL_E) TRAFFIC_UL
FROM
MainDB
GROUP BY ID
HAVING SUM(TRAFFIC) = 0
OR (SUM(TRAFFIC_DL_G) + SUM(TRAFFIC_DL_E)) = 0
OR (SUM(TRAFFIC_UL_G) + SUM(TRAFFIC_UL_E)) = 0
But I need to count backwards from the current date how many days the value has been zero.
The count should only start from the most recent run of zero records.
So you should get the following result:
Expected result
ID  traffic_v  count_v  traffic_ul  count_ul  traffic_dl  count_dl
a   200        0        0           4         400         0
b   100        0        200         0         0           3
I do not know how to write the condition so that it detects the date on which the zero records began and counts the days from then until the current date.
Whenever a record is different from zero, the count must restart.
The database is updated daily.
The totals from the query above are displayed correctly; I only care about the zero data.
I tried using SUM / CASE, but it counts from the earliest date it finds at zero, regardless of any non-zero record in between.
You can use a MODEL clause:
SELECT id,
count_traffic_v,
sum_traffic_v,
count_traffic_ul,
sum_traffic_ul,
count_traffic_dl,
sum_traffic_dl
FROM (
SELECT *
FROM (
SELECT m.*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY report_date DESC) AS rn
FROM mainDB m
)
MODEL
PARTITION BY (id)
DIMENSION BY (report_date)
MEASURES (
rn,
traffic_v,
0 AS count_traffic_v,
0 AS sum_traffic_v,
traffic_ul,
0 AS count_traffic_ul,
0 AS sum_traffic_ul,
traffic_dl,
0 AS count_traffic_dl,
0 AS sum_traffic_dl
)
RULES AUTOMATIC ORDER (
count_traffic_v[report_date] = CASE traffic_v[cv()]
WHEN 0
THEN COALESCE(count_traffic_v[cv() - 1] + 1, 1)
ELSE 0
END,
sum_traffic_v[report_date] = CASE traffic_v[cv()]
WHEN 0
THEN 0
ELSE COALESCE(sum_traffic_v[cv() - 1], 0) + traffic_v[cv()]
END,
count_traffic_ul[report_date] = CASE traffic_ul[cv()]
WHEN 0
THEN COALESCE(count_traffic_ul[cv() - 1] + 1, 1)
ELSE 0
END,
sum_traffic_ul[report_date] = CASE traffic_ul[cv()]
WHEN 0
THEN 0
ELSE COALESCE(sum_traffic_ul[cv() - 1], 0) + traffic_ul[cv()]
END,
count_traffic_dl[report_date] = CASE traffic_dl[cv()]
WHEN 0
THEN COALESCE(count_traffic_dl[cv() - 1] + 1, 1)
ELSE 0
END,
sum_traffic_dl[report_date] = CASE traffic_dl[cv()]
WHEN 0
THEN 0
ELSE COALESCE(sum_traffic_dl[cv() - 1], 0) + traffic_dl[cv()]
END
)
)
WHERE rn = 1;
Which, for the sample data:
CREATE TABLE maindb (ID, report_date, traffic_v, traffic_ul, traffic_dl) AS
SELECT 'a', DATE '2021-12-01', 0, 0, 100 FROM DUAL UNION ALL
SELECT 'a', DATE '2021-12-02', 0, 0, 100 FROM DUAL UNION ALL
SELECT 'a', DATE '2021-12-03', 100, 0, 100 FROM DUAL UNION ALL
SELECT 'a', DATE '2021-12-04', 100, 0, 100 FROM DUAL UNION ALL
SELECT 'b', DATE '2021-12-01', 0, 100, 100 FROM DUAL UNION ALL
SELECT 'b', DATE '2021-12-02', 0, 0, 0 FROM DUAL UNION ALL
SELECT 'b', DATE '2021-12-03', 0, 100, 0 FROM DUAL UNION ALL
SELECT 'b', DATE '2021-12-04', 100, 100, 0 FROM DUAL;
Outputs:
ID  COUNT_TRAFFIC_V  SUM_TRAFFIC_V  COUNT_TRAFFIC_UL  SUM_TRAFFIC_UL  COUNT_TRAFFIC_DL  SUM_TRAFFIC_DL
a   0                200            4                 0               0                 400
b   0                100            0                 200             3                 0
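If only the zero-day counts are needed, a rough alternative to the MODEL clause is to find the latest non-zero date per ID and count the rows after it. The sketch below is not the answer's method. It runs against SQLite through Python's sqlite3 module so it can be tried directly, stores dates as ISO strings (so string comparison orders them correctly), and assumes one row per ID per day with the newest row being the current date. It does not reproduce the SUM_ columns.

```python
import sqlite3

# In-memory SQLite stands in for the Oracle table (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE maindb (id TEXT, report_date TEXT,
                     traffic_v INT, traffic_ul INT, traffic_dl INT);
INSERT INTO maindb VALUES
  ('a', '2021-12-01',   0,   0, 100),
  ('a', '2021-12-02',   0,   0, 100),
  ('a', '2021-12-03', 100,   0, 100),
  ('a', '2021-12-04', 100,   0, 100),
  ('b', '2021-12-01',   0, 100, 100),
  ('b', '2021-12-02',   0,   0,   0),
  ('b', '2021-12-03',   0, 100,   0),
  ('b', '2021-12-04', 100, 100,   0);
""")

query = """
WITH last_nonzero AS (
    -- Latest date with a non-zero value per id; NULL if the column is always zero.
    SELECT id,
           MAX(CASE WHEN traffic_v  <> 0 THEN report_date END) AS last_v,
           MAX(CASE WHEN traffic_ul <> 0 THEN report_date END) AS last_ul,
           MAX(CASE WHEN traffic_dl <> 0 THEN report_date END) AS last_dl
    FROM maindb
    GROUP BY id
)
SELECT m.id,
       -- Rows after the last non-zero date form the trailing run of zeros.
       -- COALESCE(..., '') makes every row count when there is no non-zero date.
       SUM(CASE WHEN m.report_date > COALESCE(l.last_v,  '') THEN 1 ELSE 0 END) AS count_v,
       SUM(CASE WHEN m.report_date > COALESCE(l.last_ul, '') THEN 1 ELSE 0 END) AS count_ul,
       SUM(CASE WHEN m.report_date > COALESCE(l.last_dl, '') THEN 1 ELSE 0 END) AS count_dl
FROM maindb m
JOIN last_nonzero l ON l.id = m.id
GROUP BY m.id
ORDER BY m.id
"""

counts = conn.execute(query).fetchall()
for row in counts:
    print(row)
```

For the sample data this gives the same counts as the expected result in the question: 0, 4, 0 for a and 0, 0, 3 for b.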

Calculating distance using geometry of x and y location in SQL

I'm using SQL Server and I need to calculate the distance between the x and y of a frame and the previous x and y of a frame where the day, team, and member are all the same. Currently, I have this code that works but doesn't accomplish what I need. I'm getting every distance permutation of the x and y location where the day, team, and member are all the same.
I need help to incorporate frames into the query so that I get the N+1 Frame x and y location minus the N Frame x and y location.
CREATE TABLE TestTable (
Day int NULL,
Frame int NULL,
Team int NULL,
Member int NULL,
x float NULL,
y float NULL
);
INSERT INTO TestTable VALUES
(1, 1, 1, 1, 1486.64, 2017.55),
(1, 1, 1, 2, 1754.55, 1495.81),
(1, 1, 2, 1, 2049.15, 876.349),
(1, 2, 1, 1, 1707.59, 1171.22),
(1, 2, 1, 2, 1432.56, 1459.99),
(1, 2, 2, 1, 1470.27, 1086.22),
(1, 3, 1, 1, 3639.19, 1281.36),
(1, 3, 1, 2, 2751.37, 976.348),
(1, 3, 2, 1, 2496.69, 1283.29),
(1, 4, 1, 1, 2347.26, 984.255),
(1, 4, 1, 2, 2044.92, 711.154),
(1, 4, 2, 1, 2473.65, 1816.23);
SELECT A.Day, A.Frame, A.Team, A.Member,
    GEOMETRY::Point(A.[x], A.[y], 0).STDistance(GEOMETRY::Point(B.[x], B.[y], 0)) AS Distance
FROM TestTable A
JOIN TestTable B
    ON A.Day = B.Day AND A.Team = B.Team AND A.Member = B.Member
I may also have to deal with NULL x and y values, so if possible I'd like to add a filter like this to the query too:
WHERE A.x IS NOT NULL AND A.y IS NOT NULL
Ultimately I want to track the distance of every member throughout the day, frame by frame. Later, I'll add up each member's total distance for the day.
;WITH CTE1 AS
(
SELECT
[day], team, member, frame, x, y,
LAG(x) OVER (PARTITION BY [day], team, member ORDER BY frame) AS PervFrameX,
LAG(y) OVER (PARTITION BY [day], team, member ORDER BY frame) AS PervFrameY
FROM
TestTable
WHERE
X IS NOT NULL AND Y IS NOT NULL
),
CTE2 AS
(
SELECT
[day], team, member, frame, x, y, PervFrameX, PervFrameY,
IIF(PervFrameX IS NULL OR PervFrameY IS NULL, 0,
GEOMETRY::Point(x, y, 0).STDistance(GEOMETRY::Point(PervFrameX, PervFrameY, 0))) As Distance
FROM
CTE1
)
SELECT
*,
SUM(Distance) OVER (PARTITION BY [day], team, member) AS MemberTotalDistance,
SUM(Distance) OVER (PARTITION BY [day]) AS DailyTotalDistance
FROM
CTE2
ORDER BY
[day], team, member, frame
CTE1 and CTE2 are used to improve readability of the query.
Output:
day team member frame x y PervFrameX PervFrameY Distance MemberTotalDistance DailyTotalDistance
1 1 1 1 1486.64 2017.55 NULL NULL 0.000 4135.086 8812.698
1 1 1 2 1707.59 1171.22 1486.64 2017.55 874.696 4135.086 8812.698
1 1 1 3 3639.19 1281.36 1707.59 1171.22 1934.738 4135.086 8812.698
1 1 1 4 2347.26 984.255 3639.19 1281.36 1325.652 4135.086 8812.698
1 1 2 1 1754.55 1495.81 NULL NULL 0.000 2483.257 8812.698
1 1 2 2 1432.56 1459.99 1754.55 1495.81 323.976 2483.257 8812.698
1 1 2 3 2751.37 976.348 1432.56 1459.99 1404.695 2483.257 8812.698
1 1 2 4 2044.92 711.154 2751.37 976.348 754.586 2483.257 8812.698
1 2 1 1 2049.15 876.349 NULL NULL 0.000 2194.355 8812.698
1 2 1 2 1470.27 1086.22 2049.15 876.349 615.750 2194.355 8812.698
1 2 1 3 2496.69 1283.29 1470.27 1086.22 1045.167 2194.355 8812.698
1 2 1 4 2473.65 1816.23 2496.69 1283.29 533.438 2194.355 8812.698
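The same LAG idea can be checked outside SQL Server with plain Euclidean distance. The following is only a sketch under assumptions: it uses SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), and `dist` is a hypothetical helper registered from Python that stands in for GEOMETRY::Point(...).STDistance(...).

```python
import math
import sqlite3
from collections import defaultdict

def dist(dx, dy):
    """Euclidean length of (dx, dy); 0 when there is no previous frame."""
    if dx is None or dy is None:
        return 0.0
    return math.hypot(dx, dy)

conn = sqlite3.connect(":memory:")
conn.create_function("dist", 2, dist)  # hypothetical helper, not a built-in
conn.executescript("""
CREATE TABLE TestTable (Day INT, Frame INT, Team INT, Member INT, x REAL, y REAL);
INSERT INTO TestTable VALUES
  (1, 1, 1, 1, 1486.64, 2017.55), (1, 1, 1, 2, 1754.55, 1495.81),
  (1, 1, 2, 1, 2049.15, 876.349), (1, 2, 1, 1, 1707.59, 1171.22),
  (1, 2, 1, 2, 1432.56, 1459.99), (1, 2, 2, 1, 1470.27, 1086.22),
  (1, 3, 1, 1, 3639.19, 1281.36), (1, 3, 1, 2, 2751.37, 976.348),
  (1, 3, 2, 1, 2496.69, 1283.29), (1, 4, 1, 1, 2347.26, 984.255),
  (1, 4, 1, 2, 2044.92, 711.154), (1, 4, 2, 1, 2473.65, 1816.23);
""")

query = """
SELECT Day, Team, Member, Frame,
       -- Difference to the previous frame of the same member; NULL on the first frame.
       dist(x - LAG(x) OVER w, y - LAG(y) OVER w) AS Distance
FROM TestTable
WHERE x IS NOT NULL AND y IS NOT NULL
WINDOW w AS (PARTITION BY Day, Team, Member ORDER BY Frame)
ORDER BY Day, Team, Member, Frame
"""

rows = conn.execute(query).fetchall()

# Add up each member's distance for the day.
totals = defaultdict(float)
for day, team, member, frame, distance in rows:
    totals[(day, team, member)] += distance

for key in sorted(totals):
    print(key, round(totals[key], 3))
```

Because rows with NULL coordinates are filtered out before LAG runs, a member's distance is measured to the previous frame that has a position, which is also how the CTE query above behaves.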

Sql Server Query for getting result

I have table schema as following
id pid amount drcr residce
1 33 1000 1 55
2 32 2000 2 44
3 33 1500 2 54
Here I want to calculate the net sum per pid and return the drcr of whichever side has the greater amount.
For example, here 1500 > 1000, so it should return total = 500, drcr = 2 for pid = 33.
I tried googling but did not find any ideas.
I think you're looking for conditional aggregation as
SELECT pid, SUM(CASE WHEN drcr = 2 THEN Amount ELSE -Amount END)
FROM
(
VALUES
(1, 33, 1000, 1, 55),
(2, 32, 2000, 2, 44),
(3, 33, 1500, 2, 54)
) T(id, pid, amount, drcr, residce)
GROUP BY pid
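The question also asks for the drcr of the larger side. One possible extension, sketched here in SQLite via Python's sqlite3 module, derives it from the sign of the net amount. The table name `entries` is made up for the example, and treating a net of exactly zero as drcr = 2 is an arbitrary choice of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 'entries' is a hypothetical table name; the question does not give one.
CREATE TABLE entries (id INT, pid INT, amount INT, drcr INT, residce INT);
INSERT INTO entries VALUES
  (1, 33, 1000, 1, 55),
  (2, 32, 2000, 2, 44),
  (3, 33, 1500, 2, 54);
""")

query = """
SELECT pid,
       ABS(net) AS total,
       -- Positive net means the drcr = 2 side is larger.
       CASE WHEN net >= 0 THEN 2 ELSE 1 END AS drcr
FROM (
    SELECT pid,
           SUM(CASE WHEN drcr = 2 THEN amount ELSE -amount END) AS net
    FROM entries
    GROUP BY pid
)
ORDER BY pid
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```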

How to get conditional SUM?

I am trying to get a conditional sum based on another column. For example, suppose I have this dataset:
ID Date Type Total
-----------------------
5 12/16/2019 0 7
5 12/16/2019 1 0
5 12/17/2019 0 7
5 12/17/2019 1 7
5 12/18/2019 0 7
5 12/18/2019 1 0
5 12/19/2019 0 7
5 12/19/2019 1 7
5 12/20/2019 0 7
5 12/20/2019 1 7
5 12/23/2019 0 7
5 12/24/2019 0 7
5 12/25/2019 0 7
5 12/26/2019 0 7
5 12/27/2019 0 7
If a date has a row of type 1, I want only that row for that date; otherwise, if there is only a type 0 row, I want that row for that date.
So for 12/16/2019 I would want the value 0. For 12/23/2019 - 12/27/2019 I would want the value 7.
You can use row_number() :
select t.*
from (select t.*, row_number() over (partition by id, date order by type desc) as seq
from table t
) t
where seq = 1;
A simple ROW_NUMBER can handle this quite easily. I changed some of the column names because reserved words are just painful to work with.
declare @Something table
(
    ID int
    , SomeDate Date
    , MyType int
    , Total int
)
insert @Something values
(5, '12/16/2019', 0, 7)
, (5, '12/16/2019', 1, 0)
, (5, '12/17/2019', 0, 7)
, (5, '12/17/2019', 1, 7)
, (5, '12/18/2019', 0, 7)
, (5, '12/18/2019', 1, 0)
, (5, '12/19/2019', 0, 7)
, (5, '12/19/2019', 1, 7)
, (5, '12/20/2019', 0, 7)
, (5, '12/20/2019', 1, 7)
, (5, '12/23/2019', 0, 7)
, (5, '12/24/2019', 0, 7)
, (5, '12/25/2019', 0, 7)
, (5, '12/26/2019', 0, 7)
, (5, '12/27/2019', 0, 7)
select ID
    , SomeDate
    , MyType
    , Total
from
(
    select *
        -- order by MyType desc so that a type 1 row, if present, gets RowNum = 1
        , RowNum = ROW_NUMBER() over(partition by ID, SomeDate order by MyType desc)
    from @Something
) x
where x.RowNum = 1
You can do this with simple aggregation . . . well, and case:
select id, date, max(type),
coalesce(max(case when type = 1 then total end),
max(total)
) as total
from t
group by id, date;
This formulation is assuming that you have only types 0 and 1 and at most one of each type on each day for a given id.
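If the end goal is a single sum over the selected rows, as the title suggests, the row_number() pick can feed a total. Below is a sketch under assumptions: SQLite via Python's sqlite3 module (3.25 or later for window functions), renamed columns as in the answer above, a hypothetical table name `daily_totals`, and dates rewritten as ISO strings so they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 'daily_totals' is a hypothetical table name for the question's dataset.
CREATE TABLE daily_totals (id INT, some_date TEXT, my_type INT, total INT);
INSERT INTO daily_totals VALUES
  (5, '2019-12-16', 0, 7), (5, '2019-12-16', 1, 0),
  (5, '2019-12-17', 0, 7), (5, '2019-12-17', 1, 7),
  (5, '2019-12-18', 0, 7), (5, '2019-12-18', 1, 0),
  (5, '2019-12-19', 0, 7), (5, '2019-12-19', 1, 7),
  (5, '2019-12-20', 0, 7), (5, '2019-12-20', 1, 7),
  (5, '2019-12-23', 0, 7), (5, '2019-12-24', 0, 7),
  (5, '2019-12-25', 0, 7), (5, '2019-12-26', 0, 7),
  (5, '2019-12-27', 0, 7);
""")

query = """
SELECT id, some_date, my_type, total
FROM (
    SELECT d.*,
           -- Highest type first, so a type 1 row wins when both types exist.
           ROW_NUMBER() OVER (PARTITION BY id, some_date
                              ORDER BY my_type DESC) AS seq
    FROM daily_totals d
)
WHERE seq = 1
ORDER BY id, some_date
"""

picked = conn.execute(query).fetchall()   # one row per id and date
conditional_sum = sum(row[3] for row in picked)

for row in picked:
    print(row)
print("conditional sum:", conditional_sum)
```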

Inclusion of both inclusive and exclusive range in the same table

I have the following table generated by SQL (TABLE A):
timeinterval count(exclusive range)
0-6 2
0-12 5
0-18 10
I want a table like this (TABLE B):
timeinterval count(exclusive range) count(inclusive range)
1-6 2 2
1-12 5 3
1-18 10 5
I have already generated table A and need table B. Can I add something to the query for table A that computes, for example, (0-12) - (0-6) for the 2nd row of table B?
The code used for generating table A is:
with ranges as
(
select 6 as val, 1 as count_all
union all
select 12, 1
union all
select 18, 1
union all
select 24, 1
union all
select 30, 1
union all
select 36, 1
union all
select 42, 1
union all
select 48, 1
union all
select 1, 0
)
select case when ranges.count_all = 0
then 'more'
else convert (varchar(10), ranges.val)
end [MetLifeExperienceMonths],
sum (case when (ranges.count_all = 0 and GoldListHistogram.MetLifeExperienceMonths>=1)
or
(GoldListHistogram.MetLifeExperienceMonths<= ranges.val and GoldListHistogram.MetLifeExperienceMonths>=1)
then 1 end) [count],
count(EmployeeID) as 'Total'
into yy
from GoldListHistogram
cross join ranges
where MetLifeExperienceMonths > 0
group by ranges.val, ranges.count_all
I need to modify the query so that, for every row starting from the 2nd, it outputs the difference between that row's "count(exclusive range)" and the previous row's. For example, for the 0-12 row I need count(0-12) minus count(0-6), i.e. row(i) = count(i) - count(i-1).
The first column gives the time interval in months (over 5 years). The second column counts the employees in the cumulative ranges (0-6, 0-12, 0-18), where 6, 12 and 18 are numbers of months. The third column should count the employees in the individual ranges (0-6, 6-12, 12-18).
Could you not just add a start value to ranges? Something like:
with ranges as
(
select 6 as val, 0 as start, 1 as count_all
union all
select 12, 7, 1
union all
select 18, 13, 1
union all
select 24, 19, 1
union all
select 30, 25, 1
union all
select 36, 31, 1
union all
select 42, 37, 1
union all
select 48, 43, 1
union all
select 1, 49, 0
)
select case when ranges.count_all = 0
then 'more'
else convert (varchar(10), ranges.val)
end [MetLifeExperienceMonths],
sum (case when (ranges.count_all = 0 and GoldListHistogram.MetLifeExperienceMonths>=1)
or
(GoldListHistogram.MetLifeExperienceMonths<= ranges.val and GoldListHistogram.MetLifeExperienceMonths>=1)
then 1 else 0 end) [count inclusive],
sum (case when (ranges.count_all = 0 and GoldListHistogram.MetLifeExperienceMonths>=1)
or
(GoldListHistogram.MetLifeExperienceMonths<= ranges.val and GoldListHistogram.MetLifeExperienceMonths>=ranges.start)
then 1 else 0 end) [count exclusive],
count(EmployeeID) as 'Total'
into yy
from GoldListHistogram
cross join ranges
where MetLifeExperienceMonths > 0
group by ranges.val, ranges.count_all;
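The row(i) = count(i) - count(i-1) idea from the question can also be applied directly to an already generated table A with LAG, which SQL Server supports from version 2012 on. This is a minimal sketch of that alternative, not the answer's approach: it runs in SQLite through Python's sqlite3 module, and the table and column names (`table_a`, `val`, `cumulative_count`) are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for table A: upper bound in months and cumulative count.
CREATE TABLE table_a (val INT, cumulative_count INT);
INSERT INTO table_a VALUES (6, 2), (12, 5), (18, 10);
""")

query = """
SELECT val,
       cumulative_count,
       -- Previous row's count, defaulting to 0 for the first row.
       cumulative_count - LAG(cumulative_count, 1, 0) OVER (ORDER BY val) AS bucket_count
FROM table_a
ORDER BY val
"""

buckets = conn.execute(query).fetchall()
for row in buckets:
    print(row)
```

For the three rows of table A this yields 2, 3 and 5 in the last column, matching the third column of table B.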