Different aggregation fields in same query - sql

I am trying to aggregate several fields in different ways, using PARTITION BY, in the same query, but I ran into a problem with AVG().
Take this definition:
CREATE TABLE Result
(CheckListId int, CheckId int, AuditId int, CheckListResult FLOAT, CheckResult FLOAT)
INSERT INTO Result VALUES (1,1,1,1,1)
INSERT INTO Result VALUES (1,2,1,1,3)
INSERT INTO Result VALUES (1,2,2,3,1)
INSERT INTO Result VALUES (2,1,1,3,1)
+-------------+---------+---------+-----------------+-------------+
| CheckListId | CheckId | AuditId | CheckListResult | CheckResult |
+-------------+---------+---------+-----------------+-------------+
| 1 | 1 | 1 | 1 | 1 |
| 1 | 2 | 1 | 1 | 3 |
| 1 | 2 | 2 | 3 | 1 |
| 2 | 1 | 1 | 3 | 1 |
+-------------+---------+---------+-----------------+-------------+
This is my SELECT:
SELECT
CheckListId
, CheckId
, (dense_rank() over (PARTITION BY CheckListId order by [AuditId])
+ dense_rank() over (PARTITION BY CheckListId order by [AuditId] desc)
- 1) AS N_AuditForCheckList
, AVG(CheckListResult) OVER(PARTITION BY CheckListId) AS AvgCheckListResult
, COUNT(AuditId) OVER (PARTITION BY CheckListId, CheckId) AS N_AuditForCheck
, AVG(CheckResult) OVER(PARTITION BY CheckListId, CheckId) AS AvgCheckResult
FROM Result
I get this result:
+-------------+---------+---------------------+--------------------+-----------------+----------------+
| CheckListId | CheckId | N_AuditForCheckList | AvgCheckListResult | N_AuditForCheck | AvgCheckResult |
+-------------+---------+---------------------+--------------------+-----------------+----------------+
| 1 | 1 | 2 | 1,67 | 1 | 1 |
| 1 | 2 | 2 | 1,67 | 2 | 2 |
| 1 | 2 | 2 | 1,67 | 2 | 2 |
| 2 | 1 | 1 | 3 | 1 | 1 |
+-------------+---------+---------------------+--------------------+-----------------+----------------+
For AvgCheckListResult, however, I want 2 for checklist 1, because this checklist has two results: 1 on the first audit and 3 on the second. SQL instead averages all 3 rows.
Is there a way to do this without a subquery or joining several queries?
P.S. Link to test it:
http://rextester.com/ZFEXOD67600

P.S. +1 for the way you wrote your question (scripts, sample data, etc.)
I haven't found (so far) a way to do it without a subquery, though I suspect you have already explored the following solution:
SELECT *
, SUM(CheckListResult / C3 ) OVER (PARTITION BY CheckListId) / N_AuditForCheckList AS AvgChk3
FROM (
SELECT
CheckListId
, CheckId
, AuditId
, CheckListResult
, (dense_rank() over (PARTITION BY CheckListId order by [AuditId])
+ dense_rank() over (PARTITION BY CheckListId order by [AuditId] desc)
- 1) AS N_AuditForCheckList
, AVG(CheckListResult) OVER(PARTITION BY CheckListId) AS AvgCheckListResult
, AVG(CheckListResult) OVER(PARTITION BY CheckListId,AuditId) AS AvgCheckListResult2
, COUNT(*) OVER (PARTITION BY CheckListId, AuditId) AS C3
, COUNT(AuditId) OVER (PARTITION BY CheckListId, CheckId) AS N_AuditForCheck
, AVG(CheckResult) OVER(PARTITION BY CheckListId, CheckId) AS AvgCheckResult
FROM Result) A
Output:
+-------------+---------+---------+-----------------+---------------------+--------------------+---------------------+----+-----------------+----------------+---------+
| CheckListId | CheckId | AuditId | CheckListResult | N_AuditForCheckList | AvgCheckListResult | AvgCheckListResult2 | C3 | N_AuditForCheck | AvgCheckResult | AvgChk3 |
+-------------+---------+---------+-----------------+---------------------+--------------------+---------------------+----+-----------------+----------------+---------+
| 1 | 1 | 1 | 1 | 2 | 1,66666666666667 | 1 | 2 | 1 | 1 | 2 |
| 1 | 2 | 1 | 1 | 2 | 1,66666666666667 | 1 | 2 | 2 | 2 | 2 |
| 1 | 2 | 2 | 3 | 2 | 1,66666666666667 | 3 | 1 | 2 | 2 | 2 |
| 2 | 1 | 1 | 3 | 1 | 3 | 3 | 1 | 1 | 1 | 3 |
+-------------+---------+---------+-----------------+---------------------+--------------------+---------------------+----+-----------------+----------------+---------+
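The weighting behind AvgChk3 can be checked by hand. The following is a minimal Python sketch (not part of the original answer) that repeats the arithmetic on the question's sample rows: each row contributes CheckListResult / C3, so every audit counts exactly once, and the sum is divided by the number of distinct audits. It assumes CheckListResult is the same on every row of a given audit, which holds for the sample data.

```python
from collections import Counter, defaultdict

# Sample rows from the question: (CheckListId, CheckId, AuditId, CheckListResult)
rows = [
    (1, 1, 1, 1.0),
    (1, 2, 1, 1.0),
    (1, 2, 2, 3.0),
    (2, 1, 1, 3.0),
]

# C3: number of rows sharing the same (CheckListId, AuditId)
c3 = Counter((cl, audit) for cl, _, audit, _ in rows)

weighted_sum = defaultdict(float)  # SUM(CheckListResult / C3) per checklist
audits = defaultdict(set)          # distinct audits per checklist

for cl, _, audit, result in rows:
    weighted_sum[cl] += result / c3[(cl, audit)]
    audits[cl].add(audit)

# Dividing by the number of distinct audits gives the per-audit average
avg_chk3 = {cl: weighted_sum[cl] / len(audits[cl]) for cl in weighted_sum}
print(avg_chk3)  # {1: 2.0, 2: 3.0}
```

For checklist 1 this is (1/2 + 1/2 + 3/1) / 2 = 2, matching the AvgChk3 column above.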

Maybe I found a solution with nested queries:
SELECT *
,(SELECT AVG(T.CheckListResult)
FROM
(SELECT DISTINCT AuditId, CheckListId, CheckListResult FROM Result) as T
WHERE T.CheckListId = Base.CheckListId
) AS AvgCheckListResult
,(SELECT COUNT(T.CheckListResult)
FROM
(SELECT DISTINCT AuditId, CheckListId, CheckListResult FROM Result) as T
WHERE T.CheckListId = Base.CheckListId
) AS N_AuditForCheckList
,(SELECT AVG(T.CheckResult)
FROM
(SELECT DISTINCT AuditId, CheckListId, CheckId, CheckResult FROM Result) as T
WHERE T.CheckListId = Base.CheckListId AND T.CheckId = Base.CheckId
) AS AvgCheckResult
FROM Result AS Base
The result is correct, but I am not sure about performance.
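One way to reduce the repeated subqueries is to deduplicate once and aggregate with GROUP BY. The sketch below is only an illustration, run on SQLite through Python's sqlite3 module rather than on SQL Server (an assumption made so the example is self-contained). It computes the per-checklist figures from the question's sample data; the result could then be joined back to Result on CheckListId.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Result
(CheckListId int, CheckId int, AuditId int, CheckListResult FLOAT, CheckResult FLOAT);
INSERT INTO Result VALUES (1,1,1,1,1);
INSERT INTO Result VALUES (1,2,1,1,3);
INSERT INTO Result VALUES (1,2,2,3,1);
INSERT INTO Result VALUES (2,1,1,3,1);
""")

# Collapse to one row per (checklist, audit) first, then average and count.
per_checklist = conn.execute("""
SELECT CheckListId,
       AVG(CheckListResult) AS AvgCheckListResult,
       COUNT(*)             AS N_AuditForCheckList
FROM (SELECT DISTINCT CheckListId, AuditId, CheckListResult FROM Result) AS T
GROUP BY CheckListId
ORDER BY CheckListId
""").fetchall()

print(per_checklist)  # [(1, 2.0, 2), (2, 3.0, 1)]
```

Whether this performs better than the correlated subqueries depends on the engine and the indexes, so it is worth comparing execution plans on real data.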


Replace null values with most recent non-null values SQL

I have a table where each row consists of an ID, date, variable values (eg. var1).
When there is a null value for var1 in a row, I want to replace it with the most recent non-null value before that date for that ID. How can I do this quickly for a very large table?
So presume I start with this table:
+----+------------+-------+
| id |date | var1 |
+----+------------+-------+
| 1 |'01-01-2022'|55 |
| 2 |'01-01-2022'|12 |
| 3 |'01-01-2022'|45 |
| 1 |'01-02-2022'|Null |
| 2 |'01-02-2022'|Null |
| 3 |'01-02-2022'|20 |
| 1 |'01-03-2022'|15 |
| 2 |'01-03-2022'|Null |
| 3 |'01-03-2022'|Null |
| 1 |'01-04-2022'|Null |
| 2 |'01-04-2022'|77 |
+----+------------+-------+
Then I want this
+----+------------+-------+
| id |date | var1 |
+----+------------+-------+
| 1 |'01-01-2022'|55 |
| 2 |'01-01-2022'|12 |
| 3 |'01-01-2022'|45 |
| 1 |'01-02-2022'|55 |
| 2 |'01-02-2022'|12 |
| 3 |'01-02-2022'|20 |
| 1 |'01-03-2022'|15 |
| 2 |'01-03-2022'|12 |
| 3 |'01-03-2022'|20 |
| 1 |'01-04-2022'|15 |
| 2 |'01-04-2022'|77 |
+----+------------+-------+
A CTE suits this well.
This snippet returns every row with the nulls filled in; turning it into an UPDATE is all that remains.
WITH selectcte AS
(
SELECT * FROM testnulls WHERE var1 IS NOT NULL
)
SELECT t1A.id, t1A.date, ISNULL(t1A.var1, t1B.var1) AS varvalue
-- read from the base table so that rows with a NULL var1 are kept
FROM testnulls t1A
OUTER APPLY (SELECT TOP 1 *
FROM selectcte
WHERE id = t1A.id AND date < t1A.date
ORDER BY date DESC) t1B
You can read more about CTEs here:
https://learn.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-sql?view=sql-server-ver16
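To see the fill-forward logic run end to end, here is a small sketch on SQLite via Python's sqlite3 module. It is an illustration under assumptions, not the answer's T-SQL: a correlated subquery with ORDER BY ... LIMIT 1 stands in for OUTER APPLY with TOP 1, COALESCE stands in for ISNULL, and the dates are stored as ISO text (2022-01-01 to 2022-01-04) so that they sort correctly as strings. The index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testnulls (id INTEGER, date TEXT, var1 INTEGER)")
conn.executemany(
    "INSERT INTO testnulls VALUES (?, ?, ?)",
    [
        (1, "2022-01-01", 55), (2, "2022-01-01", 12), (3, "2022-01-01", 45),
        (1, "2022-01-02", None), (2, "2022-01-02", None), (3, "2022-01-02", 20),
        (1, "2022-01-03", 15), (2, "2022-01-03", None), (3, "2022-01-03", None),
        (1, "2022-01-04", None), (2, "2022-01-04", 77),
    ],
)
# An index on (id, date) is the usual first step for this lookup pattern.
conn.execute("CREATE INDEX ix_testnulls_id_date ON testnulls (id, date)")

filled = conn.execute("""
SELECT t.id, t.date,
       COALESCE(t.var1,
                (SELECT p.var1            -- latest earlier non-null value
                 FROM testnulls p
                 WHERE p.id = t.id AND p.date < t.date
                   AND p.var1 IS NOT NULL
                 ORDER BY p.date DESC
                 LIMIT 1)) AS var1
FROM testnulls t
ORDER BY t.date, t.id
""").fetchall()

for row in filled:
    print(row)
```

The printed var1 values, in order, are 55, 12, 45, 55, 12, 20, 15, 12, 20, 15, 77, which matches the desired table in the question.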

Only include grouped observations where event order is valid

I have a table of dates for eye exams and eyewear purchases for individuals. I only want to keep instances where an individual bought eyewear after an eye exam. In the example below, I want to keep person 1, events 2 and 3 for person 2, and person 3, but not person 4. How can I do this in SQL Server?
| Person | Event | Order |
| 1 | Exam | 1 |
| 1 | Eyewear| 2 |
| 2 | Eyewear| 1 |
| 2 | Exam | 2 |
| 2 | Eyewear| 3 |
| 3 | Exam | 1 |
| 3 | Eyewear| 2 |
| 4 | Eyewear| 1 |
| 4 | Exam | 2 |
The final result would look like
| Person | Event | Order |
| 1 | Exam | 1 |
| 1 | Eyewear| 2 |
| 2 | Exam | 2 |
| 2 | Eyewear| 3 |
| 3 | Exam | 1 |
| 3 | Eyewear| 2 |
Self join should work...
select
t.Person
,t.Event
,t.[Order]
from
yourTable t
inner join
yourTable t2 on t2.Person = t.Person
and t2.[Order] = (t.[Order] +1)
where
t2.Event = 'Eyewear'
and t.Event = 'Exam'
I haven't tried to optimize it but this seems to work:
create table t(
person varchar(10),
event varchar(10),
[order] varchar(10)
);
insert into t values
('1','Exam','1'),
('1','Eyewear','2'),
('2','Eyewear','1'),
('2','Exam','2'),
('2','Eyewear','3'),
('3','Exam','1'),
('3','Eyewear','2'),
('4','Eyewear','1'),
('4','Exam','2');
with xxx(person,event_a,seq_a,event_b,seq_b) as (
select a.person,a.event,a.[order],b.event,b.[order]
from t a join t b
on a.person = b.person
and a.[order] < b.[order]
and a.event like 'exam'
and b.event like 'eyewear'
)
select person,event_a event,seq_a [order] from xxx
union
select person,event_b event,seq_b [order] from xxx
order by 1,3
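The same rule as the second query can also be written with EXISTS: keep an Exam row if a later Eyewear row exists for that person, and keep an Eyewear row if an earlier Exam row exists. The sketch below is an illustration on SQLite through Python's sqlite3 module, not SQL Server. The column name ord is a hypothetical stand-in for [Order], and it is stored as an integer so the comparison is numeric.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (person INTEGER, event TEXT, ord INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "Exam", 1), (1, "Eyewear", 2),
        (2, "Eyewear", 1), (2, "Exam", 2), (2, "Eyewear", 3),
        (3, "Exam", 1), (3, "Eyewear", 2),
        (4, "Eyewear", 1), (4, "Exam", 2),
    ],
)

kept = conn.execute("""
SELECT a.person, a.event, a.ord
FROM t AS a
WHERE (a.event = 'Exam' AND EXISTS (      -- exam followed by a purchase
          SELECT 1 FROM t AS b
          WHERE b.person = a.person AND b.event = 'Eyewear' AND b.ord > a.ord))
   OR (a.event = 'Eyewear' AND EXISTS (   -- purchase preceded by an exam
          SELECT 1 FROM t AS b
          WHERE b.person = a.person AND b.event = 'Exam' AND b.ord < a.ord))
ORDER BY a.person, a.ord
""").fetchall()

for row in kept:
    print(row)
```

On the sample data this keeps both rows for persons 1 and 3, events 2 and 3 for person 2, and nothing for person 4.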

Counting on multiple columns

I have a table like this:
+------------+---------------+-------------+
|store_number|entrance_number|camera_number|
+------------+---------------+-------------+
| 1 | 1 | 1 |
| 1 | 1 | 2 |
| 2 | 1 | 1 |
| 2 | 2 | 1 |
| 2 | 2 | 2 |
| 3 | 1 | 1 |
| 4 | 1 | 1 |
| 4 | 1 | 2 |
| 4 | 2 | 1 |
| 4 | 3 | 1 |
+------------+---------------+-------------+
In summary the stores are numbered 1 and up, the entrances are numbered 1 and up for each store, and the cameras are numbered 1 and up for each entrance.
What I want to do is count how many entrances in total, and how many cameras in total, each store has, producing this result from the above table:
+------------+---------------+-------------+
|store_number|entrances |cameras |
+------------+---------------+-------------+
| 1 | 1 | 2 |
| 2 | 2 | 3 |
| 3 | 1 | 1 |
| 4 | 3 | 4 |
+------------+---------------+-------------+
How can I count on multiple columns to produce this result?
You can do this with a GROUP BY and a COUNT() of each item:
Select Store_Number,
Count(Distinct Entrance_Number) as Entrances,
Count(Camera_Number) As Cameras
From YourTable
Group By Store_Number
From what I can tell from your expected output, you're looking for the number of cameras that appear, whilst also looking for the DISTINCT number of entrances.
This will work as well,
-- table variable holding the sample data
DECLARE @store TABLE
( store_number INT, entrance_number INT, camera_number INT)
INSERT INTO @store VALUES (1,1,1),(1,1,2),(2,1,1),(2,2,1),
(2,2,2),(3,1,1),(4,1,1),(4,1,2),(4,2,1),(4,3,1)
-- the '-' separator keeps e.g. entrance 1 / camera 11 distinct from entrance 11 / camera 1
SELECT AA.s store_number, BB.e entrances, AA.c cameras FROM (
SELECT s, COUNT(DISTINCT c) c FROM ( SELECT store_number s,
CONVERT(VARCHAR,store_number) + '-' + CONVERT(VARCHAR,entrance_number) + '-' +
CONVERT(VARCHAR,camera_number) c FROM @store ) A GROUP BY s ) AA
LEFT JOIN
( SELECT s, COUNT(DISTINCT e) e FROM ( SELECT store_number s,
CONVERT(VARCHAR,store_number) + '-' + CONVERT(VARCHAR,entrance_number) e
FROM @store ) B GROUP BY s ) BB ON AA.s = BB.s
Hope it helped. :)
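As a quick check of the simpler GROUP BY answer above, the sketch below runs the same query against the sample table on SQLite through Python's sqlite3 module (an assumption made for runnability; the table name YourTable is the answer's own placeholder).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable "
    "(store_number INTEGER, entrance_number INTEGER, camera_number INTEGER)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 1, 2), (2, 1, 1), (2, 2, 1), (2, 2, 2),
     (3, 1, 1), (4, 1, 1), (4, 1, 2), (4, 2, 1), (4, 3, 1)],
)

counts = conn.execute("""
SELECT store_number,
       COUNT(DISTINCT entrance_number) AS entrances,  -- each entrance once
       COUNT(camera_number)            AS cameras     -- every camera row
FROM YourTable
GROUP BY store_number
ORDER BY store_number
""").fetchall()

print(counts)  # [(1, 1, 2), (2, 2, 3), (3, 1, 1), (4, 3, 4)]
```

COUNT(camera_number) works here because each row is one camera; if a camera could appear on more than one row, the camera count would need its own DISTINCT over the entrance and camera pair.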

No rowid or key need most recent row

I am trying my hardest to get a list of the most recent rows by date in a DB2 file. The file has no unique id, so I am trying to get the entries by matching a set of columns. I need DESCGA most importantly as that changes often. When it does they keep another row for historical reasons.
SELECT B.COGA, B.COMSUBGA, B.ACCTGA, B.PRFXGA, B.DESCGA
FROM mylib.myfile B
WHERE
(
SELECT COUNT(*)
FROM
(
SELECT A.COGA,A.COMSUBGA,A.ACCTGA,A.PRFXGA,MAX(A.DATEGA) AS EDATE
FROM mylib.myfile A
GROUP BY A.COGA, A.COMSUBGA, A.ACCTGA, A.PRFXGA
) T
WHERE
(B.ACCTGA = T.ACCTGA AND
B.COGA = T.COGA AND
B.COMSUBGA = T.COMSUBGA AND
B.PRFXGA = T.PRFXGA AND
B.DATEGA = T.EDATE)
) > 1
This is what I am trying and so far I get 0 results.
If I remove
B.ACCTGA = T.ACCTGA AND
It will return results (of course wrong).
I am using ODBC in VS 2013 to structure this query.
I have a table with the following
| a | b | descri | date |
-----------------------------
| 1 | 0 | string | 20140102 |
| 2 | 1 | string | 20140103 |
| 1 | 1 | string | 20140101 |
| 1 | 1 | string | 20150101 |
| 1 | 0 | string | 20150102 |
| 2 | 1 | string | 20150103 |
| 1 | 1 | string | 20150103 |
and I need
| 1 | 0 | string | 20150102 |
| 2 | 1 | string | 20150103 |
| 1 | 1 | string | 20150103 |
You can use row_number():
select t.*
from (select t.*,
row_number() over (partition by a, b order by date desc) as seqnum
from mylib.myfile t
) t
where seqnum = 1;
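For comparison, the question's original idea (match each row against the latest date of its group) also works when written as a join instead of COUNT(*) > 1. Since the grouped subquery has only one row per key combination, the count in the question's query can never exceed 1, which would explain the empty result. The sketch below is an alternative to the row_number() answer, not a restatement of it. It runs on SQLite through Python's sqlite3 module rather than DB2, and uses the simplified columns a, b, descri, date from the example table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myfile (a INTEGER, b INTEGER, descri TEXT, date INTEGER)")
conn.executemany(
    "INSERT INTO myfile VALUES (?, ?, ?, ?)",
    [
        (1, 0, "string", 20140102),
        (2, 1, "string", 20140103),
        (1, 1, "string", 20140101),
        (1, 1, "string", 20150101),
        (1, 0, "string", 20150102),
        (2, 1, "string", 20150103),
        (1, 1, "string", 20150103),
    ],
)

latest = conn.execute("""
SELECT t.a, t.b, t.descri, t.date
FROM myfile t
JOIN (SELECT a, b, MAX(date) AS edate   -- latest date per key combination
      FROM myfile
      GROUP BY a, b) m
  ON t.a = m.a AND t.b = m.b AND t.date = m.edate
ORDER BY t.a, t.b
""").fetchall()

for row in latest:
    print(row)
```

Unlike row_number(), this join returns every row that ties on the latest date within a group, so the two approaches differ when a group has duplicate dates.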

Sql Pivot and Unpivot

I'm new here. I have a table in Oracle, vw_summary:
Onum | Uacheck | Uadesc | AU11 | AU12 | BD10 |
----------------------------------------------------------
1 | 5.1 | VENDOR | 0 | 0 | 0 |
2 | 5.2A | CUST | 0 | 0 | 0 |
and I need data displayed as:-
Onum | Plant | 5.1 - VENDOR | 5.2A - CUST
---------------------------------------------------
1 | AU11 | 0 | 0
2 | AU12 | 0 | 0
3 | BD10 | 0 | 0
i.e. I need the columns AU11, AU12, BD10 to become rows of my Plant column,
and each concatenation of UACHECK || UADESC to become a column.
Try this:
WITH T1 AS (SELECT *
FROM vw_summary UNPIVOT (plantvalue
FOR plant
IN (AU11, AU12, BD10))),
T2
AS (SELECT UACHECK,
UADESC,
PLANT,
PLANTVALUE,
ROW_NUMBER () OVER (PARTITION BY UADESC ORDER BY PLANT)
AS NUM
FROM T1)
SELECT *
FROM t2 PIVOT (MIN (PLANTVALUE)
FOR (UADESC, UACHECK)
IN ( ('VENDOR', '5.1') AS "5.1 - VENDOR",
('CUST', '5.2A') AS "5.2A - CUST"));
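The reshaping itself (columns to rows, then rows back to columns) can be sketched without UNPIVOT and PIVOT. The example below is an illustration on SQLite through Python's sqlite3 module, which has neither operator, so UNION ALL plays the role of UNPIVOT and MIN(CASE ...) plays the role of PIVOT. The plant values 1 to 6 are made up (the question's data is all zeros) purely so the movement of each value is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vw_summary
(Onum INTEGER, Uacheck TEXT, Uadesc TEXT, AU11 INTEGER, AU12 INTEGER, BD10 INTEGER);
-- hypothetical values, chosen only to make the reshaping visible
INSERT INTO vw_summary VALUES (1, '5.1',  'VENDOR', 1, 2, 3);
INSERT INTO vw_summary VALUES (2, '5.2A', 'CUST',   4, 5, 6);
""")

pivoted = conn.execute("""
WITH unpivoted AS (                       -- one row per (check, plant)
    SELECT Uacheck, Uadesc, 'AU11' AS plant, AU11 AS plantvalue FROM vw_summary
    UNION ALL
    SELECT Uacheck, Uadesc, 'AU12', AU12 FROM vw_summary
    UNION ALL
    SELECT Uacheck, Uadesc, 'BD10', BD10 FROM vw_summary
)
SELECT plant,
       MIN(CASE WHEN Uacheck = '5.1'  AND Uadesc = 'VENDOR'
                THEN plantvalue END) AS "5.1 - VENDOR",
       MIN(CASE WHEN Uacheck = '5.2A' AND Uadesc = 'CUST'
                THEN plantvalue END) AS "5.2A - CUST"
FROM unpivoted
GROUP BY plant                            -- one output row per plant
ORDER BY plant
""").fetchall()

print(pivoted)  # [('AU11', 1, 4), ('AU12', 2, 5), ('BD10', 3, 6)]
```

As with the PIVOT ... IN list in the Oracle query, the output columns have to be listed by hand, so a new Uacheck/Uadesc pair means editing the query.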