Sql Server Query for getting result - sql

I have a table like the following:
id pid amount drcr residce
1 33 1000 1 55
2 32 2000 2 44
3 33 1500 2 54
I want to net the amounts per pid across the two drcr values and return the difference, together with the drcr whose total is greater.
For example, for pid = 33: 1500 > 1000, so it should return total = 500, drcr = 2.
I tried googling but couldn't find anything that helped.

I think you're looking for conditional aggregation, as in:
SELECT pid, SUM(CASE WHEN drcr = 2 THEN Amount ELSE -Amount END)
FROM
(
VALUES
(1, 33, 1000, 1, 55),
(2, 32, 2000, 2, 44),
(3, 33, 1500, 2, 54)
) T(id, pid, amount, drcr, residce)
GROUP BY pid
Here is a db<>fiddle
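The query above returns the signed net amount per pid, but not the drcr column the question asks for. A minimal sketch of one way to get both, run against SQLite through Python's sqlite3 module so it can be checked locally. The table name t, the alias net, and the choice to report drcr = 2 on a tie are assumptions for illustration, not part of the original answer.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INT, pid INT, amount INT, drcr INT, residce INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [(1, 33, 1000, 1, 55), (2, 32, 2000, 2, 44), (3, 33, 1500, 2, 54)],
)

# Inner query: signed net per pid (drcr = 2 counts positive, drcr = 1 negative).
# Outer query: absolute difference plus the drcr whose total is greater.
# Assumption: a tie (net = 0) is reported as drcr = 2.
query = """
SELECT pid,
       ABS(net) AS total,
       CASE WHEN net >= 0 THEN 2 ELSE 1 END AS drcr
FROM (
    SELECT pid,
           SUM(CASE WHEN drcr = 2 THEN amount ELSE -amount END) AS net
    FROM t
    GROUP BY pid
) AS s
ORDER BY pid
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The SELECT itself uses only CASE, SUM and ABS, so it should carry over to SQL Server unchanged.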

Related

Count all rows while not counting any row after a negative value

I have a table t with:
PLACE  LOCATION  TS          ID  AMOUNT  GOING_IN  GOING_OUT
1      10        2020-10-01  1   100     10        0
1      10        2020-10-02  1   110     5         -50
1      10        2020-10-03  1   75      0         -100
1      10        2020-10-04  1   -25     30        0
1      10        2020-10-05  1   5       0         0
1      10        2020-10-06  1   5       38        -300
1      10        2020-10-07  1   -257    0         0
1      10        2020-10-01  2   1       10        0
1      10        2020-10-02  2   11      0         -12
1      10        2020-10-03  2   -1      0         -100
1      10        2020-10-04  2   -101    0         0
2      20        2020-11-15  1   18      20        0
2      20        2020-11-16  1   38      0         0
2      20        2020-11-15  3   -9      20        -31
2      20        2020-11-16  3   -20     0         0
So due to SAP legacy stuff some logistic data is mangled which may lead to negative inventory.
To check how severe the error is, I need to count, for each PLACE, LOCATION, ID:
the number of rows that have a positive AMOUNT and no negative AMOUNT anywhere before them (COUNT_CORRECT)
the number of rows that have a negative AMOUNT, plus any rows with a positive AMOUNT that come after a negative AMOUNT (COUNT_WRONG)
As you can see in my table there are (for PLACE=1, LOCATION=10, ID=1) 3 rows with a positive AMOUNT without any negative AMOUNT before. But then there is a negative AMOUNT and some positive AMOUNTS afterwards --> those 4 rows should not be counted for COUNT_CORRECT but should count for COUNT_WRONG.
So in this example table my query should return:
PLACE  LOCATION  TOTAL  COUNT_CORRECT  COUNT_WRONG  RATIO
1      10        11     5              6            0.55
2      20        4      2              2            0.5
My code so far:
CREATE OR REPLACE TABLE ANALYTICS.t (
PLACE INT NOT NULL
, LOCATION INT NOT NULL
, TS DATE NOT NULL
, ID INT NOT NULL
, AMOUNT INT NOT NULL
, GOING_IN INT NOT NULL
, GOING_OUT INT NOT NULL
, PRIMARY KEY(PLACE, LOCATION, ID, TS)
);
INSERT INTO ANALYTICS.t
(PLACE, LOCATION, TS, ID, AMOUNT, GOING_IN, GOING_OUT)
VALUES
(1, 10, '2020-10-01', 1, 100, 10, 0)
, (1, 10, '2020-10-02', 1, 110, 5, -50)
, (1, 10, '2020-10-03', 1, 75, 0, -100)
, (1, 10, '2020-10-04', 1, -25, 30, 0)
, (1, 10, '2020-10-05', 1, 5, 0, 0)
, (1, 10, '2020-10-06', 1, 5, 38, 300)
, (1, 10, '2020-10-07', 1, -257, 0, 0)
, (1, 10, '2020-10-04', 2, 1, 10, 0)
, (1, 10, '2020-10-05', 2, 11, 0, -12)
, (1, 10, '2020-10-06', 2, -1, 0, -100)
, (1, 10, '2020-10-07', 2, -101, 0, 0)
, (2, 20, '2020-11-15', 1, 18, 12, 0)
, (2, 20, '2020-11-16', 1, 30, 0, 0)
, (2, 20, '2020-11-15', 3, -9, 20, -31)
, (2, 20, '2020-11-16', 3, -20, 0, 0)
;
Then
SELECT PLACE
, LOCATION
, SUM(CASE WHEN AMOUNT >= 0 THEN 1 ELSE 0 END) AS 'COUNT_CORRECT'
, SUM(CASE WHEN AMOUNT < 0 THEN 1 ELSE 0 END) AS 'COUNT_WRONG'
, ROUND((SUM(CASE WHEN AMOUNT < 0 THEN 1 ELSE 0 END) / COUNT(AMOUNT)) * 100, 2) AS 'ratio'
FROM t
GROUP BY PLACE, LOCATION
ORDER BY PLACE, LOCATION
;
But I don't know how I can filter for "AND which do not have a negative AMOUNT before" and counting by PLACE, LOCATION, ID as an intermediate step.
Any help appreciated.
I'm not sure if I understand your question correctly, but the following gives you the number of rows before the first negative amount per (place, location) partition.
The subselect computes the row numbers of all rows with a negative amount. Then we can select the minimum of this as the first row with a negative amount.
SELECT
place,
location,
COUNT(*) - NVL(MIN(pos) - 1, COUNT(*)) AS COUNT_WRONG,
COUNT(*) - local.COUNT_WRONG AS COUNT_CORRECT,
ROUND(local.COUNT_WRONG / COUNT(*),2) AS RATIO
FROM
( SELECT
amount,
place,
location,
CASE
WHEN amount < 0
THEN ROW_NUMBER() over (
PARTITION BY
place,
location
ORDER BY
TS)
ELSE NULL
END pos -- Row numbers of rows with negative amount, else NULL
FROM
t)
GROUP BY
place,
location;
I have edited the query. Please let me know if this works.
ALL_ENTRIES query has all the row numbers for the table t partitioned by place,location and ID and ordered by timestamp.
TABLE1 is used to compute the first negative entry. This is done by joining with ALL_ENTRIES and selecting the minimum row number where amount < 0.
TABLE2 is used to compute the last correct entry. ALL_ENTRIES is joined with TABLE1 on the condition that the row number is less than the row number in TABLE1; the maximum of those row numbers is the last correct entry.
TABLE1 and TABLE2 are joined with ALL_ENTRIES to calculate the max row number, which gives the total entries.
In the final select statement I have used case when statement to account for IDs where there are no negative amount values. In those scenarios all the entries should be correct. Hence, the max row number is considered for those cases.
WITH ALL_ENTRIES AS (
    SELECT
        PLACE,
        LOCATION,
        ID,
        TS,
        AMOUNT,
        ROW_NUMBER() OVER (PARTITION BY PLACE, LOCATION, ID ORDER BY TS) AS ROW_NUM
    FROM t)
SELECT
    PLACE,
    LOCATION,
    ID,
    TOTAL,
    COUNT_CORRECT,
    TOTAL - COUNT_CORRECT AS COUNT_WRONG,
    COUNT_CORRECT / TOTAL AS RATIO
FROM
    (SELECT
        ae.PLACE,
        ae.LOCATION,
        ae.ID,
        MAX(ae.ROW_NUM) AS TOTAL,
        -- No LAST_CORRECT_ENTRY means no earlier rows were found, so every row counts as correct
        MAX(CASE WHEN table2.LAST_CORRECT_ENTRY IS NULL THEN ae.ROW_NUM ELSE table2.LAST_CORRECT_ENTRY END) AS COUNT_CORRECT
    FROM
        ALL_ENTRIES ae
    LEFT JOIN
        -- TABLE2: last row number before the first negative entry
        (SELECT
            ae.PLACE,
            ae.LOCATION,
            ae.ID,
            MAX(ae.ROW_NUM) AS LAST_CORRECT_ENTRY
        FROM
            ALL_ENTRIES ae
        INNER JOIN
            -- TABLE1: row number of the first negative entry
            (SELECT
                t.PLACE,
                t.LOCATION,
                t.ID,
                MIN(ae.ROW_NUM) AS FIRST_NEGATIVE_ENTRY
            FROM t t
            INNER JOIN
                ALL_ENTRIES ae ON t.PLACE = ae.PLACE
                AND t.LOCATION = ae.LOCATION
                AND t.ID = ae.ID
                AND t.TS = ae.TS
                AND t.AMOUNT = ae.AMOUNT
                AND ae.AMOUNT < 0
            GROUP BY t.PLACE, t.LOCATION, t.ID
            ) table1
            ON ae.PLACE = table1.PLACE
            AND ae.LOCATION = table1.LOCATION
            AND ae.ID = table1.ID
            AND ae.ROW_NUM < table1.FIRST_NEGATIVE_ENTRY
        GROUP BY ae.PLACE, ae.LOCATION, ae.ID
        ) table2
        ON ae.PLACE = table2.PLACE
        AND ae.LOCATION = table2.LOCATION
        AND ae.ID = table2.ID
    GROUP BY ae.PLACE, ae.LOCATION, ae.ID
    )
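One caveat with the join-based query: an ID whose very first row is already negative has no "last correct entry" either, so it falls into the same NULL branch as an ID with no negative rows and is counted as entirely correct. A running minimum over each (PLACE, LOCATION, ID) partition avoids that ambiguity, because a row is wrong exactly when the smallest AMOUNT seen so far is negative. The following is a sketch only, checked against SQLite (3.25 or later, for window functions) through Python's sqlite3 rather than the asker's database. The name running_min is an assumption, and the GOING_IN and GOING_OUT columns are left out because the counts do not depend on them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (PLACE INT, LOCATION INT, TS TEXT, ID INT, AMOUNT INT)")
# Sample rows from the table in the question (PLACE, LOCATION, TS, ID, AMOUNT).
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, "2020-10-01", 1, 100),
        (1, 10, "2020-10-02", 1, 110),
        (1, 10, "2020-10-03", 1, 75),
        (1, 10, "2020-10-04", 1, -25),
        (1, 10, "2020-10-05", 1, 5),
        (1, 10, "2020-10-06", 1, 5),
        (1, 10, "2020-10-07", 1, -257),
        (1, 10, "2020-10-01", 2, 1),
        (1, 10, "2020-10-02", 2, 11),
        (1, 10, "2020-10-03", 2, -1),
        (1, 10, "2020-10-04", 2, -101),
        (2, 20, "2020-11-15", 1, 18),
        (2, 20, "2020-11-16", 1, 38),
        (2, 20, "2020-11-15", 3, -9),
        (2, 20, "2020-11-16", 3, -20),
    ],
)

# running_min is the smallest AMOUNT seen so far within one (PLACE, LOCATION, ID)
# series; once it drops below zero, that row and every later row count as wrong.
query = """
WITH flagged AS (
    SELECT PLACE, LOCATION,
           MIN(AMOUNT) OVER (
               PARTITION BY PLACE, LOCATION, ID
               ORDER BY TS
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_min
    FROM t
)
SELECT PLACE,
       LOCATION,
       COUNT(*) AS TOTAL,
       SUM(CASE WHEN running_min >= 0 THEN 1 ELSE 0 END) AS COUNT_CORRECT,
       SUM(CASE WHEN running_min < 0 THEN 1 ELSE 0 END) AS COUNT_WRONG,
       ROUND(1.0 * SUM(CASE WHEN running_min < 0 THEN 1 ELSE 0 END) / COUNT(*), 2) AS RATIO
FROM flagged
GROUP BY PLACE, LOCATION
ORDER BY PLACE, LOCATION
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

RATIO here is COUNT_WRONG / TOTAL, which is what the expected result in the question shows. The 1.0 factor forces floating-point division in SQLite and may be unnecessary in other databases.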

Group by range of values in bigquery

Is there any way in BigQuery to group not by the absolute value but by a range of values?
I have a query that looks in a product table with 4 different numeric group by's.
What I am looking for is an efficient way to group by in a way like:
group by "A±1000" etc. or "A±10%ofA".
thanks in advance,
You can generate a column as a "named range" then group by the column. As an example for your A+-1000 case:
with data as (
select 100 as v union all
select 200 union all
select 2000 union all
select 2100 union all
select 2200 union all
select 4100 union all
select 8000 union all
select 8000
)
select count(v), ARRAY_AGG(v), ranges
FROM data, unnest([0, 2000, 4000, 6000, 8000]) ranges
WHERE data.v >= ranges - 1000 AND data.v < ranges + 1000
GROUP BY ranges
Output:
+-----+------------------------+--------+
| f0_ | f1_ | ranges |
+-----+------------------------+--------+
| 2 | ["100","200"] | 0 |
| 3 | ["2000","2100","2200"] | 2000 |
| 1 | ["4100"] | 4000 |
| 2 | ["8000","8000"] | 8000 |
+-----+------------------------+--------+
The example below is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.example` AS (
SELECT * FROM
UNNEST([STRUCT<id INT64, price FLOAT64>
(1, 15), (2, 50), (3, 125), (4, 150), (5, 175), (6, 250)
])
)
SELECT
CASE
WHEN price > 0 AND price <= 100 THEN ' 0 - 100'
WHEN price > 100 AND price <= 200 THEN '100 - 200'
ELSE '200+'
END AS range_group,
COUNT(1) AS cnt
FROM `project.dataset.example`
GROUP BY range_group
-- ORDER BY range_group
with result
Row range_group cnt
1 0 - 100 2
2 100 - 200 3
3 200+ 1
As you can see, the solution above requires a CASE expression that mirrors your ranges, which gets tedious when there are many of them. Below is a more generic (but more verbose) solution that uses the recently introduced RANGE_BUCKET function
#standardSQL
WITH `project.dataset.example` AS (
SELECT * FROM
UNNEST([STRUCT<id INT64, price FLOAT64>
(1, 15), (2, 50), (3, 125), (4, 150), (5, 175), (6, 250)
])
), ranges AS (
SELECT [100.0, 200.0] ranges_array
), temp AS (
SELECT OFFSET, IF(prev_val = val, CONCAT(prev_val, ' - '), CONCAT(prev_val, ' - ', val)) rng FROM (
SELECT OFFSET, IFNULL(CAST(LAG(val) OVER(ORDER BY OFFSET) AS STRING), '') prev_val, CAST(val AS STRING) AS val
FROM ranges, UNNEST(ARRAY_CONCAT(ranges_array, [ARRAY_REVERSE(ranges_array)[OFFSET(0)]])) val WITH OFFSET
)
)
SELECT
RANGE_BUCKET(price, ranges_array) range_group,
rng,
COUNT(1) AS cnt
FROM `project.dataset.example`, ranges
JOIN temp ON RANGE_BUCKET(price, ranges_array) = OFFSET
GROUP BY range_group, rng
-- ORDER BY range_group
with result
Row range_group rng cnt
1 0 - 100 2
2 1 100 - 200 3
3 2 200 - 1
As you can see, in the second solution you define your ranges in the ranges CTE as a simple array listing the boundaries, as in SELECT [100.0, 200.0] ranges_array
The temp CTE then does all the needed calculation
You can do math operations in the GROUP BY, creating groups by any arbitrary criteria.
For example:
WITH data AS (
SELECT repo.name, COUNT(*) price
FROM `githubarchive.month.201909`
GROUP BY 1
HAVING price>100
)
SELECT FORMAT('range %i-%i', MIN(price), MAX(price)) price_range, COUNT(*) c
FROM data
GROUP BY CAST(LOG(price) AS INT64)
ORDER BY MIN(price)
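For fixed-width ranges such as the A±1000 case, the same idea reduces to integer arithmetic on the grouped expression. The sketch below reuses the sample values from the first answer and is run on SQLite through Python's sqlite3, not on BigQuery. It assumes non-negative integer values, and the names data, center and cnt are illustrative. In BigQuery the `/` operator returns a float, so DIV(v + 1000, 2000) would be needed in place of the integer division used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (v INT)")
conn.executemany(
    "INSERT INTO data VALUES (?)",
    [(100,), (200,), (2000,), (2100,), (2200,), (4100,), (8000,), (8000,)],
)

# Each value is mapped to the center of its +-1000 window:
# shift by half the window width, integer-divide by the width, scale back up.
# SQLite divides two integers as integers, which is what this relies on.
query = """
SELECT ((v + 1000) / 2000) * 2000 AS center,
       COUNT(*) AS cnt
FROM data
GROUP BY center
ORDER BY center
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Unlike the UNNEST join, this needs no explicit list of range centers, but it only covers evenly spaced ranges. A percentage-based range such as A±10% would need a logarithmic bucket like the one in the answer above.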

SQL: How to find the sum of values where all the records has the same value in a column?

I have Payments table with multiple columns, including Student, Value and Payment_type.
I want to create a query that will calculate the sum of values, if all the records of the same student have only NULL as Payment type.
If a student has at least one Payment type different than NULL, that student shouldn't be included.
Example:
Student Payment Value Payment_type
1 1 100 NULL
1 2 200 NULL
2 1 200 NULL
3 1 150 Cash
2 2 100 Cash
3 2 200 NULL
1 3 200 NULL
In the example, the result should be 500, because the values for student 1 sum to 500 and all of that student's payment types are NULL.
SQL Fiddle
MySQL 5.6 Schema Setup:
CREATE TABLE Payments
(`Student` int, `Payment` int, `Value` int, `Payment_type` varchar(4))
;
INSERT INTO Payments
(`Student`, `Payment`, `Value`, `Payment_type`)
VALUES
(1, 1, 100, NULL),
(1, 2, 200, NULL),
(2, 1, 200, NULL),
(3, 1, 150, 'Cash'),
(2, 2, 100, 'Cash'),
(3, 2, 200, NULL),
(1, 3, 200, NULL)
;
Query 1:
select student, sum(value)
from payments
group by student
having max(Payment_type) IS NULL
Results:
| Student | sum(value) |
|---------|------------|
| 1 | 500 |
select student, sum(value)
from payments
group by student
having sum(case when Payment_type is not null then 1 else 0 end) = 0
This should work:
select
student, sum(value)
from
payments
group by
student
having sum
(case when Payment_type is not null then 1 else 0 end) = 0
For me, this is very clean, plus semantically accurate regarding your description:
SELECT student, SUM(value)
FROM payments p1
WHERE NOT EXISTS (SELECT 1
FROM payments p2
WHERE p2.student = p1.student
AND p2.Payment_type IS NOT NULL)
GROUP BY student
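The queries above return one row per qualifying student. If a single grand total is wanted, as the question's "result 500" suggests, the per-student query can be wrapped in an outer SUM. This is a sketch checked on SQLite through Python's sqlite3 rather than MySQL; the alias student_total is an assumption. It uses COUNT(Payment_type) = 0 as the HAVING test, which relies on COUNT(column) ignoring NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Payments (Student INT, Payment INT, Value INT, Payment_type TEXT)"
)
conn.executemany(
    "INSERT INTO Payments VALUES (?, ?, ?, ?)",
    [
        (1, 1, 100, None),
        (1, 2, 200, None),
        (2, 1, 200, None),
        (3, 1, 150, "Cash"),
        (2, 2, 100, "Cash"),
        (3, 2, 200, None),
        (1, 3, 200, None),
    ],
)

# Inner query: per-student sums, keeping only students with no non-NULL Payment_type.
# Outer query: add those per-student sums into one grand total.
query = """
SELECT SUM(student_total)
FROM (
    SELECT Student, SUM(Value) AS student_total
    FROM Payments
    GROUP BY Student
    HAVING COUNT(Payment_type) = 0
) AS s
"""
total = conn.execute(query).fetchone()[0]
print(total)
```

If no student qualifies, the outer SUM yields NULL (None in Python), so callers may want to wrap it in COALESCE(..., 0).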

Update Query with Condition in SQL

I have a table with 2 columns, and I am trying to update another table based on these criteria:
Set the flag to 'Good' for the most duplicate keys in the Main_Key column for the same GROUP_KEY (Note we can have different Main_Keys for any GROUP_KEY)
Set the flag to 'Bad' for the least duplicate keys in the Main_Key column for the same GROUP_KEY
Set the flag to 'Don't Use' if the different Main_Keys are equal for the same GROUP_KEY
HERE IS MY TABLE
GROUP_KEY MAIN_KEY
22 4
22 4
22 55
22 55
22 55
22 55
10 10
10 10
18 87
18 22
18 22
HERE IS THE DESIRED RESULT AFTER THE UPDATE
GROUP_KEY MAIN_KEY FLAG
22 4 Bad
22 4 Bad
22 55 Good
22 55 Good
22 55 Good
22 55 Good
10 10 Don't Use
10 10 Don't Use
18 87 Bad
18 22 Good
18 22 Good
I only know how to write a normal update query and don't know where to start with this logic. Thanks for the help.
Use:
declare @t table(GROUP_KEY int, MAIN_KEY int)
insert @t values
(22, 4),
(22, 4),
(22, 55),
(22, 55),
(22, 55),
(22, 55),
(10, 10),
(10, 10),
(18, 87),
(18, 22),
(18, 22)
select t.*, b.flag
from @t t
join
(
    -- one flag per (GROUP_KEY, MAIN_KEY) pair
    select a.GROUP_KEY, a.MAIN_KEY
    ,
    case
        when a.GROUP_KEY = a.MAIN_KEY
        then 'Don''t Use'
        when a.count = MAX(a.count) over(partition by a.GROUP_KEY)
        then 'Good'
        else 'Bad'
    end [flag]
    from
    (
        -- how often each MAIN_KEY occurs within its GROUP_KEY
        select t.GROUP_KEY, t.MAIN_KEY, COUNT(*) [count]
        from @t t
        group by t.GROUP_KEY, t.MAIN_KEY
    )a
)b
on b.GROUP_KEY = t.GROUP_KEY and b.MAIN_KEY = t.MAIN_KEY
Output:
GROUP_KEY MAIN_KEY flag
----------- ----------- ---------
10 10 Don't Use
10 10 Don't Use
18 22 Good
18 22 Good
18 87 Bad
22 4 Bad
22 4 Bad
22 55 Good
22 55 Good
22 55 Good
22 55 Good
Update:
Assuming you have flag column in your table:
update t
set flag = b.flag
from @t t
join
(
    -- one flag per (GROUP_KEY, MAIN_KEY) pair
    select a.GROUP_KEY, a.MAIN_KEY
    ,
    case
        when a.GROUP_KEY = a.MAIN_KEY
        then 'Don''t Use'
        when a.count = MAX(a.count) over(partition by a.GROUP_KEY)
        then 'Good'
        else 'Bad'
    end [flag]
    from
    (
        -- how often each MAIN_KEY occurs within its GROUP_KEY
        select t.GROUP_KEY, t.MAIN_KEY, COUNT(*) [count]
        from @t t
        group by t.GROUP_KEY, t.MAIN_KEY
    )a
)b
on b.GROUP_KEY = t.GROUP_KEY and b.MAIN_KEY = t.MAIN_KEY
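The answer above assigns 'Don't Use' when GROUP_KEY equals MAIN_KEY, which happens to match the sample data. Another possible reading of the third rule is that 'Don't Use' applies when every MAIN_KEY in a GROUP_KEY occurs equally often, so no key is the "most duplicated". A sketch of that reading, checked on SQLite (3.25 or later) through Python's sqlite3 rather than SQL Server. The CTE names counts and ranked are assumptions, and the tie interpretation itself is a guess at the asker's intent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (GROUP_KEY INT, MAIN_KEY INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(22, 4), (22, 4), (22, 55), (22, 55), (22, 55), (22, 55),
     (10, 10), (10, 10), (18, 87), (18, 22), (18, 22)],
)

# counts: occurrences of each MAIN_KEY within its GROUP_KEY.
# ranked: the largest and smallest of those counts per GROUP_KEY.
# If largest = smallest, all keys in the group tie (assumed to mean 'Don't Use').
query = """
WITH counts AS (
    SELECT GROUP_KEY, MAIN_KEY, COUNT(*) AS cnt
    FROM t
    GROUP BY GROUP_KEY, MAIN_KEY
),
ranked AS (
    SELECT GROUP_KEY, MAIN_KEY, cnt,
           MAX(cnt) OVER (PARTITION BY GROUP_KEY) AS max_cnt,
           MIN(cnt) OVER (PARTITION BY GROUP_KEY) AS min_cnt
    FROM counts
)
SELECT t.GROUP_KEY,
       t.MAIN_KEY,
       CASE WHEN r.max_cnt = r.min_cnt THEN 'Don''t Use'
            WHEN r.cnt = r.max_cnt THEN 'Good'
            ELSE 'Bad'
       END AS flag
FROM t
JOIN ranked r
  ON r.GROUP_KEY = t.GROUP_KEY AND r.MAIN_KEY = t.MAIN_KEY
ORDER BY t.GROUP_KEY, t.MAIN_KEY
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data both readings give the same flags. They differ for a group such as (5, 1), (5, 2), where each key occurs once: the tie rule marks both rows 'Don't Use', while the GROUP_KEY = MAIN_KEY rule marks both 'Good'.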

Cross Tab or Pivot query in SQL 2005

Can someone please help me with a cross tab/pivot query in SQL 2005
Given Data looks like
EmpId OrgId DayCt Cost
1 20 15 100
2 20 36 300
3 40 25 200
4 40 10 50
Result to be like:
EmpId   OrgId 20        OrgId 40
        DayCt   Cost    DayCt   Cost
1       15      100
2       36      300
3                       25      200
4                       10      50
EmpId in 1st Col and then Org Ids in the next col. But under every OrgId, I want DayCt & Cost also to be included as sub columns. Not sure if this is doable. Please help.
There is no such thing as sub columns in SQL; this seems like something that should be done in your application or reporting tool.
This is about the closest you can get in SQL
;WITH T(EmpId,OrgId,DayCt,Cost) AS
(
select 1, 20, 15, 100 UNION ALL
select 2, 20, 36, 300 UNION ALL
select 3, 40, 25, 200 UNION ALL
select 4, 40, 10, 50
)
SELECT EmpId,
MAX(CASE WHEN OrgId =20 THEN DayCt END) AS [OrgId 20 DayCt],
MAX(CASE WHEN OrgId =20 THEN Cost END) AS [OrgId 20 Cost],
MAX(CASE WHEN OrgId =40 THEN DayCt END) AS [OrgId 40 DayCt],
MAX(CASE WHEN OrgId =40 THEN Cost END) AS [OrgId 40 Cost]
FROM T
GROUP BY EmpId
Returns
EmpId OrgId 20 DayCt OrgId 20 Cost OrgId 40 DayCt OrgId 40 Cost
----------- -------------- ------------- -------------- -------------
1 15 100 NULL NULL
2 36 300 NULL NULL
3 NULL NULL 25 200
4 NULL NULL 10 50
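The query above hard-codes OrgId 20 and 40. When the set of OrgIds is not known in advance, the application can generate the MAX(CASE ...) columns from the data, which fits the suggestion to handle the layout outside SQL. A sketch of that idea using Python's sqlite3 with an in-memory copy of the sample rows; the variable names are illustrative, and the column naming simply mirrors the answer's aliases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (EmpId INT, OrgId INT, DayCt INT, Cost INT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?, ?)",
    [(1, 20, 15, 100), (2, 20, 36, 300), (3, 40, 25, 200), (4, 40, 10, 50)],
)

# Discover the OrgIds present in the data instead of hard-coding them.
org_ids = [row[0] for row in conn.execute("SELECT DISTINCT OrgId FROM T ORDER BY OrgId")]

# Build one DayCt and one Cost column per OrgId.
# int() guards the interpolation: only integer literals reach the SQL text.
columns = []
for org in org_ids:
    org = int(org)
    columns.append(f'MAX(CASE WHEN OrgId = {org} THEN DayCt END) AS "OrgId {org} DayCt"')
    columns.append(f'MAX(CASE WHEN OrgId = {org} THEN Cost END) AS "OrgId {org} Cost"')

query = "SELECT EmpId, " + ", ".join(columns) + " FROM T GROUP BY EmpId ORDER BY EmpId"
cursor = conn.execute(query)
header = [d[0] for d in cursor.description]
rows = cursor.fetchall()

print(header)
for row in rows:
    print(row)
```

The two-level header (OrgId above DayCt and Cost) still has to be rendered by the application or reporting tool; the generated column names only carry enough information to do so.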