How to get AVG in CASE with condition? - sql

I have a table with integer values.
They could be negative, 0, positive and NULL.
I need to treat NULL as 0, calculate the average for a given date, and return 0 if the average is less than 0.
My query is the following:
select
Id,
ValueDate,
case
when avg(isnull(Value, 0)) > 0 then avg(isnull(Value, 0))
else 0
end AvgValue
from SomeTable
where ValueDate = @givenDate
group by Id, ValueDate
How can I avoid writing the aggregate twice in the CASE expression (the real aggregate could be much more complex)?

I think the GREATEST function could help you (note that in SQL Server it is only available from SQL Server 2022 onward):
select
Id,
ValueDate,
greatest(avg(isnull(Value, 0)),0) AvgValue
from SomeTable
where ValueDate = @givenDate
group by Id, ValueDate
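If GREATEST is not available on your server, one way to still write the aggregate only once is to compute it in a derived table and clamp it in the outer query. A minimal sketch, assuming a table variable @SomeTable with made-up sample rows and a DATE variable @givenDate (both are illustrative, not taken from the question):

```sql
-- Assumed sample data; column names follow the question.
DECLARE @SomeTable TABLE (Id INT, ValueDate DATE, Value INT);
INSERT INTO @SomeTable (Id, ValueDate, Value)
VALUES (1, '2020-01-01', 4), (1, '2020-01-01', NULL),
       (2, '2020-01-01', -6), (2, '2020-01-01', 2);

DECLARE @givenDate DATE = '2020-01-01';

-- The aggregate is written once, inside the derived table;
-- the outer query only clamps the already computed value at 0.
SELECT a.Id,
       a.ValueDate,
       CASE WHEN a.AvgValue > 0 THEN a.AvgValue ELSE 0 END AS AvgValue
FROM (
    SELECT Id, ValueDate, AVG(ISNULL(Value, 0)) AS AvgValue
    FROM @SomeTable
    WHERE ValueDate = @givenDate
    GROUP BY Id, ValueDate
) AS a;
-- Id 1 averages (4, 0) to 2; Id 2 averages (-6, 2) to -2 and is clamped to 0.
```

However complex the aggregate expression gets, it only has to be maintained in one place.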

This is a solution that does not require implementing any functions that are not built in. I know your real query will be more complex, but this is just an idea:
CREATE TABLE DataSource
(
[ID] TINYINT
,[Value] INT
)
INSERT INTO DataSource ([ID], [Value])
VALUES (1, 2)
,(1, 0)
,(1, NULL)
,(1, 98)
,(1, NULL)
,(2, -4)
,(2, 0)
,(2, 0)
,(2, NULL)
SELECT [ID]
,MAX([Value])
FROM
(
SELECT [ID]
,AVG(COALESCE([Value],0))
FROM DataSource
GROUP BY [ID]
UNION ALL
SELECT DISTINCT [ID]
,0
FROM DataSource
) Data([ID],[Value])
GROUP BY [ID]
Here is the fiddle - http://sqlfiddle.com/#!6/3d223/14

Related

How to Use Exists in self join

I want those Ids whose ORGORDER is never equal to 1.
CREATE TABLE [dbo].[TEST](
[ORGORDER] [int] NULL,
[Id] [int] NOT NULL,
[ORGTYPE] [varchar](30) NULL,
ORGID INT NULL,
[LEAD] [decimal](19, 2) NULL
) ON [PRIMARY]
GO
INSERT [dbo].[TEST] ([ORGORDER], [Id], [ORGTYPE] ,ORGID, [LEAD]) VALUES (1, 100, N'ABC',1, NULL)
GO
INSERT [dbo].[TEST] ([ORGORDER], [Id], [ORGTYPE],ORGID, [LEAD]) VALUES (0, 100, N'ABC',2, 0)
GO
INSERT [dbo].[TEST] ([ORGORDER], [Id], [ORGTYPE],ORGID, [LEAD]) VALUES (0, 100, N'ACD',1, NULL)
GO
INSERT [dbo].[TEST] ([ORGORDER], [Id], [ORGTYPE],ORGID, [LEAD]) VALUES (0, 101, N'ABC',0, 0)
GO
INSERT [dbo].[TEST] ([ORGORDER], [Id], [ORGTYPE],ORGID, [LEAD]) VALUES (2, 101, N'ABC',4, NULL)
GO
I am using EXISTS but not getting my expected result.
Expected result is -
ID
101
You can do this in one pass over the data: within each Id, order the rows so that any ORGORDER = 1 row comes first. If the first row has the ORGORDER value you want to exclude, you can just ignore that Id.
;WITH x AS
(
SELECT Id, ORGORDER, rn = ROW_NUMBER() OVER
(PARTITION BY Id ORDER BY CASE WHEN ORGORDER = 1 THEN 1 ELSE 2 END)
FROM dbo.TEST
)
SELECT Id FROM x WHERE rn = 1 AND ORGORDER <> 1;
Example db<>fiddle
Use a subquery in a NOT EXISTS clause, linking the subquery table to the outer query table by ID:
SELECT DISTINCT T1.ID
FROM dbo.TEST AS T1
WHERE NOT EXISTS (
SELECT *
FROM dbo.TEST AS T2
WHERE T1.ID = T2.ID
AND T2.ORGORDER = 1
)
db<>fiddle
An option would be using an aggregation with a suitable HAVING clause such as
SELECT [Id]
FROM [dbo].[TEST]
GROUP BY [Id]
HAVING SUM(CASE WHEN [ORGORDER] = 1 THEN 1 ELSE 0 END) = 0
With this, if at least one row for an [Id] has [ORGORDER] equal to 1, that [Id] won't be listed in the result.
Demo
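The asker's own EXISTS query is not shown, so the following is only a guess at a typical cause: filtering row by row (ORGORDER <> 1) keeps an Id as long as any one of its rows passes, whereas the requirement is a test on the whole group. A small sketch, using a cut-down table variable @TEST (an assumption, with only the two relevant columns) and the question's sample values:

```sql
-- Cut-down copy of the question's data (only ORGORDER and Id).
DECLARE @TEST TABLE (ORGORDER INT NULL, Id INT NOT NULL);
INSERT INTO @TEST (ORGORDER, Id)
VALUES (1, 100), (0, 100), (0, 100), (0, 101), (2, 101);

-- Row-level filter: Id 100 still appears, because two of its rows pass.
SELECT DISTINCT Id
FROM @TEST
WHERE ORGORDER <> 1;          -- returns 100 and 101

-- Group-level test: an Id is rejected if any of its rows has ORGORDER = 1.
SELECT Id
FROM @TEST
GROUP BY Id
HAVING MAX(CASE WHEN ORGORDER = 1 THEN 1 ELSE 0 END) = 0;   -- returns 101
```

NOT EXISTS and the HAVING variant both express the group-level test; which one performs better depends on the indexes and data.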

Explaining window function frames

Imagine a table with 2 columns: order no and value.
;with SourceTable as (
select *
from (values
(1, null)
,(2, 5)
,(3, null)
,(4, null)
,(5, 2)
,(6, 1)
) as T(OrderNo, Value)
)
select
*
,first_value(Value) over (
order by
case when Value is not null then 0 else 1 end
, OrderNo
rows between current row and unbounded following
) as X
from SourceTable
order by OrderNo
The issue is that it returns exactly the same result set as SourceTable, and I don't understand why. For example, when the first row is processed (OrderNo = 1), I'd expect column X to return 5, because the frame should include all rows (current row and unbounded following) and it is ordered by Value (non-NULLs first), then by OrderNo. So the first row in the frame should be OrderNo = 2. Obviously it doesn't work like that, but I don't get why.
I'd much appreciate it if someone could explain how the first frame is constructed.
Many thanks
Here is how I modified your query to investigate: I explicitly added the CASE expression as a result column, and then sorted the entire result set the same way your window is ordered:
;with SourceTable as (
select *
from (values
(1, null)
,(2, 5)
,(3, null)
,(4, null)
,(5, 2)
,(6, 1)
) as T(OrderNo, Value)
)
select
*
,case when Value is not null then 0 else 1 end AS CaseSort
,Value
,first_value(Value) over (
order by
case when Value is not null then 0 else 1 end
, OrderNo
rows between current row and unbounded following
) as X
from SourceTable
order by 3,OrderNo
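There you can see that first_value for the window matches the Value of each result row. Once the rows are sorted the way the window is ordered, a frame of "current row and unbounded following" always starts at the current row, so the first value in the frame is the current row's own Value.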
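To get the first non-NULL value on every row, the frame has to start at the beginning of the partition rather than at the current row. A sketch on the question's data, with both frames side by side (the column names X_FromCurrentRow and X_WholePartition are mine, not from the question):

```sql
WITH SourceTable AS (
    SELECT *
    FROM (VALUES (1, NULL), (2, 5), (3, NULL), (4, NULL), (5, 2), (6, 1)
         ) AS T(OrderNo, Value)
)
SELECT OrderNo,
       Value,
       -- Frame starts at the current row: always returns the row's own Value.
       FIRST_VALUE(Value) OVER (
           ORDER BY CASE WHEN Value IS NOT NULL THEN 0 ELSE 1 END, OrderNo
           ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
       ) AS X_FromCurrentRow,
       -- Frame covers the whole partition: returns the first row in window order.
       FIRST_VALUE(Value) OVER (
           ORDER BY CASE WHEN Value IS NOT NULL THEN 0 ELSE 1 END, OrderNo
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS X_WholePartition
FROM SourceTable
ORDER BY OrderNo;
-- X_WholePartition is 5 on every row, because OrderNo = 2 sorts first in the window.
```

The final ORDER BY OrderNo only affects how the result is displayed; it has no influence on how the frames are built.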

SQL Server : loop through one table and get the sum till threshold value is reached and update the sum value in another table

I have two tables:
Sales table:
Returns table:
I have to loop through the Sales table and sum Qty per Material + Batch + Customer combination until the sum exceeds Return_qty, then write that sum to the Returns table.
This is the desired output:
As you can see, only the Sales rows up to Sales_Invoice 4 are considered, because at that point the sum exceeds Return_qty.
What I have tried till now?
I have tried to use a WHILE loop to calculate a running total, but it's not working out. Maybe the approach is wrong.
Any inputs will be highly appreciated.
Try this:
DECLARE @Sales TABLE
(
[Sales_Invoice] SMALLINT
,[Invoice_Date] DATE
,[Material] VARCHAR(3)
,[Batch] VARCHAR(2)
,[Customer] VARCHAR(4)
,[Qty] SMALLINT
);
DECLARE @Returns TABLE
(
[Return_Invoice] SMALLINT
,[Invoice_Date] DATE
,[Material] VARCHAR(3)
,[Batch] VARCHAR(2)
,[Customer] VARCHAR(4)
,[Return_Qty] SMALLINT
,[Sales_Qty] SMALLINT
);
INSERT INTO @Sales ([Sales_Invoice], [Invoice_Date], [Material], [Batch], [Customer], [Qty])
VALUES (1, '2019-06-07', 'AB1', 'B1', 'B001', 50)
,(2, '2019-06-07', 'AB1', 'B1', 'B001', 20)
,(3, '2019-06-06', 'AB1', 'B1', 'B001', 25)
,(4, '2019-06-06', 'AB1', 'B1', 'B001', 11)
,(5, '2019-06-06', 'AB1', 'B1', 'B001', 20)
,(6, '2019-06-01', 'BA2', 'C1', 'Y001', 100);
INSERT INTO @Returns ([Return_Invoice], [Invoice_Date], [Material], [Batch], [Customer], [Return_Qty])
VALUES (212, '2019-06-08', 'AB1', 'B1', 'B001', 100);
WITH DataSource AS
(
SELECT [Material], [Batch], [Customer]
,SUM([Qty]) OVER (PARTITION BY [Material], [Batch], [Customer] ORDER BY [Sales_Invoice] ASC) AS [Return_Qty]
FROM @Sales
)
UPDATE @Returns
SET [Sales_Qty] = DS.[Return_Qty]
FROM @Returns R
INNER JOIN
(
SELECT [Material], [Batch], [Customer]
,MIN([Return_Qty]) AS [Return_Qty]
FROM DataSource
WHERE [Return_Qty] >= 100
GROUP BY [Material], [Batch], [Customer]
) DS
ON R.[Material] = DS.[Material]
AND R.[Batch] = DS.[Batch]
AND R.[Customer] = DS.[Customer];
SELECT *
FROM @Returns;
If you want it to be more dynamic (using each row's Return_Qty instead of the hard-coded 100), you can use the following:
WITH DataSource AS
(
SELECT [Material], [Batch], [Customer]
,SUM([Qty]) OVER (PARTITION BY [Material], [Batch], [Customer] ORDER BY [Sales_Invoice] ASC) AS [Return_Qty]
FROM @Sales
)
UPDATE @Returns
SET [Sales_Qty] = DataSource.[Return_Qty]
FROM @Returns R
CROSS APPLY
(
SELECT DS.[Material], DS.[Batch], DS.[Customer]
,MIN(DS.[Return_Qty]) AS [Return_Qty]
FROM DataSource DS
WHERE DS.[Return_Qty] >= R.[Return_Qty]
AND R.[Material] = DS.[Material]
AND R.[Batch] = DS.[Batch]
AND R.[Customer] = DS.[Customer]
GROUP BY [Material], [Batch], [Customer]
) DataSource;
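To see why taking the MIN of the running totals that reach the threshold works, it can help to look at the running total on its own. A small sketch, assuming a cut-down table variable @Sales holding only the five AB1/B1/B001 rows from above:

```sql
-- Only the invoice number and quantity of the AB1/B1/B001 rows.
DECLARE @Sales TABLE (Sales_Invoice SMALLINT, Qty SMALLINT);
INSERT INTO @Sales (Sales_Invoice, Qty)
VALUES (1, 50), (2, 20), (3, 25), (4, 11), (5, 20);

-- Running total in invoice order.
SELECT Sales_Invoice,
       Qty,
       SUM(Qty) OVER (ORDER BY Sales_Invoice
                      ROWS UNBOUNDED PRECEDING) AS RunningQty
FROM @Sales
ORDER BY Sales_Invoice;
-- RunningQty: 50, 70, 95, 106, 126
-- The smallest running total >= 100 is 106, reached at Sales_Invoice 4.
```

Because the quantities are positive, the running total only grows, so the smallest total at or above the threshold is the first one to reach it.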
You should really show your WHILE statement in your post. Can you do that, please?
I think a common table expression using recursion is a good solution for you. Something along the lines of:
;
WITH
cte1 AS
(
-- number the groups and, within each group, the sales rows in invoice order
-- (Sales_Invoice is assumed to be the ordering column)
SELECT
RANK() OVER
(ORDER BY S.Material, S.Batch, S.Customer) GroupId,
ROW_NUMBER() OVER
(
PARTITION BY S.Material, S.Batch, S.Customer
ORDER BY S.Sales_Invoice) Seqn,
S.Material, S.Batch, S.Customer, S.Qty, R.Return_Qty
FROM
Sales S
JOIN
Returns R
ON S.Material = R.Material AND S.Batch = R.Batch AND S.Customer = R.Customer
),
cte2 AS
(
-- anchor: the first sale of each group starts the running total
SELECT
GroupId, Seqn, Material, Batch, Customer, CAST(Qty AS INT) AS TriggeringQty, Return_Qty
FROM cte1
WHERE Seqn = 1
UNION ALL
-- recursive step: add the next sale while the total is still below Return_Qty
SELECT
cte1.GroupId, cte1.Seqn, cte1.Material, cte1.Batch, cte1.Customer,
CAST(cte1.Qty + cte2.TriggeringQty AS INT), cte1.Return_Qty
FROM cte2
JOIN cte1
ON cte1.GroupId = cte2.GroupId AND cte1.Seqn = cte2.Seqn + 1
WHERE
cte2.TriggeringQty < cte2.Return_Qty )
UPDATE R
SET R.Sales_Qty = S.TriggeringQty
FROM Returns R
JOIN cte2 S ON
S.Material = R.Material AND S.Batch = R.Batch AND S.Customer = R.Customer
WHERE S.TriggeringQty >= R.Return_Qty;
Sorry, I haven't tried the above, so it may need adjusting, but hopefully you can see what's happening.

Trying to group by a value in SQL

I have a table called Test_Table.
Here is the table script and some sample data:
CREATE TABLE Test_Table(
NODE VARCHAR(10) NOT NULL -- no PRIMARY KEY here: the sample data repeats NODE values
,EVENTID CHAR(255) NOT NULL
,TYPE INTEGER NOT NULL
,FIRSTOCCURRENCE VARCHAR(16) NOT NULL
,LASTOCCURRENCE VARCHAR(16) NOT NULL
,TALLY INTEGER NOT NULL
,TICKETNUMBER VARCHAR(20)
,TIME_DELTA VARCHAR(5)
);
INSERT INTO Test_Table(NODE,EVENTID,TYPE,FIRSTOCCURRENCE,LASTOCCURRENCE,TALLY,TICKETNUMBER,TIME_DELTA) VALUES ('Washington','ReachabilityProblem',2,'12/13/2017 23:24','12/13/2017 23:24',1,NULL,'1 sec');
INSERT INTO Test_Table(NODE,EVENTID,TYPE,FIRSTOCCURRENCE,LASTOCCURRENCE,TALLY,TICKETNUMBER,TIME_DELTA) VALUES ('San Diego','ReachabilityProblem',1,'12/13/2017 23:23','12/13/2017 23:23',1,NULL,NULL);
INSERT INTO Test_Table(NODE,EVENTID,TYPE,FIRSTOCCURRENCE,LASTOCCURRENCE,TALLY,TICKETNUMBER,TIME_DELTA) VALUES ('Richmond','ReachabilityProblem',1,'12/13/2017 14:23','12/13/2017 14:23',1,NULL,NULL);
INSERT INTO Test_Table(NODE,EVENTID,TYPE,FIRSTOCCURRENCE,LASTOCCURRENCE,TALLY,TICKETNUMBER,TIME_DELTA) VALUES ('Richmond','ReachabilityProblem',1,'12/13/2017 23:23','12/13/2017 23:23',1,NULL,NULL);
INSERT INTO Test_Table(NODE,EVENTID,TYPE,FIRSTOCCURRENCE,LASTOCCURRENCE,TALLY,TICKETNUMBER,TIME_DELTA) VALUES ('New York','ReachabilityProblem',2,'12/13/2017 23:24','12/13/2017 23:24',1,NULL,'1 sec');
INSERT INTO Test_Table(NODE,EVENTID,TYPE,FIRSTOCCURRENCE,LASTOCCURRENCE,TALLY,TICKETNUMBER,TIME_DELTA) VALUES ('New York','ReachabilityProblem',2,'12/13/2017 11:32','12/13/2017 11:33',2,NULL,'1 sec');
INSERT INTO Test_Table(NODE,EVENTID,TYPE,FIRSTOCCURRENCE,LASTOCCURRENCE,TALLY,TICKETNUMBER,TIME_DELTA) VALUES ('New York','ReachabilityProblem',1,'12/13/2017 16:35','12/13/2017 16:35',1,NULL,NULL);
INSERT INTO Test_Table(NODE,EVENTID,TYPE,FIRSTOCCURRENCE,LASTOCCURRENCE,TALLY,TICKETNUMBER,TIME_DELTA) VALUES ('Landsdown','ReachabilityProblem',2,'12/13/2017 23:24','12/13/2017 23:24',1,NULL,'1 sec');
INSERT INTO Test_Table(NODE,EVENTID,TYPE,FIRSTOCCURRENCE,LASTOCCURRENCE,TALLY,TICKETNUMBER,TIME_DELTA) VALUES ('Houston','ReachabilityProblem',2,'12/13/2017 14:24','12/13/2017 14:24',1,NULL,'1 sec');
INSERT INTO Test_Table(NODE,EVENTID,TYPE,FIRSTOCCURRENCE,LASTOCCURRENCE,TALLY,TICKETNUMBER,TIME_DELTA) VALUES ('Houston','ReachabilityProblem',1,'12/13/2017 11:31','12/13/2017 11:32',2,NULL,NULL);
INSERT INTO Test_Table(NODE,EVENTID,TYPE,FIRSTOCCURRENCE,LASTOCCURRENCE,TALLY,TICKETNUMBER,TIME_DELTA) VALUES ('Dallas','ReachabilityProblem',1,'12/13/2017 23:23','12/13/2017 23:23',1,NULL,NULL);
INSERT INTO Test_Table(NODE,EVENTID,TYPE,FIRSTOCCURRENCE,LASTOCCURRENCE,TALLY,TICKETNUMBER,TIME_DELTA) VALUES ('Dallas','ReachabilityProblem',2,'12/13/2017 23:24','12/13/2017 23:24',1,NULL,'1 sec');
INSERT INTO Test_Table(NODE,EVENTID,TYPE,FIRSTOCCURRENCE,LASTOCCURRENCE,TALLY,TICKETNUMBER,TIME_DELTA) VALUES ('Coco Beach','ReachabilityProblem',1,'12/13/2017 23:23','12/13/2017 23:23',1,NULL,NULL);
I'm trying to obtain this
I have tried this
Select DATEDIFF(Day, GETDATE(), DATEADD(HOUR, 15, GETDATE()))
Select
[NODE]
,[EVENTID]
,[TYPE]
,[FIRSTOCCURRENCE]
,LASTOCCURRENCE]
,DATEDIFF(Minute, FIrst OCCURENCE, LAST OCCURENCE) as [Outage in MIN]
,[TicketNumber]
,[Severity]
,Tally]
From
[XYZ].[XYZ].[XYZ_STATUS]
Where
[FIRST OCCURRENCE] >= DATEADD(hh, -24, GETDATE())
Group by node;
Please help a rookie
GROUP BY returns a table with one row for each group. So if you use a GROUP BY clause, your SELECT list can only contain the columns you are grouping by, plus aggregate functions over the other columns, because a column that is neither grouped nor aggregated has no single value per group.
Maybe this is what you want...
Select
DATEDIFF (DAY, GETDATE(), DATEADD(Hour, 15, GETDATE()))
,Node
,EventID
,Type
,Severity
,Tally
FROM xyz.xyz.xyz_status
GROUP BY Node,EventID,Type,Severity,Tally
When we group by two or more columns, it is saying "Group them so that all of those with the same col1 and col2 are in the same group, and then calculate all the aggregate functions (Count, Sum, Average, etc.) for each of those groups"
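To make that concrete, here is a small sketch that groups by two columns. The table variable @Events and its rows are invented for illustration, loosely modeled on the question's NODE, TYPE and TALLY columns:

```sql
-- Invented sample rows, loosely based on the question's columns.
DECLARE @Events TABLE (Node VARCHAR(10), Type INT, Tally INT);
INSERT INTO @Events (Node, Type, Tally)
VALUES ('Houston', 1, 2), ('Houston', 2, 1),
       ('New York', 2, 1), ('New York', 2, 2), ('New York', 1, 1);

-- One output row per distinct (Node, Type) pair;
-- every other column has to go through an aggregate.
SELECT Node,
       Type,
       COUNT(*)   AS Events,
       SUM(Tally) AS TotalTally
FROM @Events
GROUP BY Node, Type
ORDER BY Node, Type;
-- Houston  1 -> 1 event,  TotalTally 2
-- Houston  2 -> 1 event,  TotalTally 1
-- New York 1 -> 1 event,  TotalTally 1
-- New York 2 -> 2 events, TotalTally 3
```

Selecting Tally without an aggregate here would raise an error, since a (Node, Type) group can contain several different Tally values.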
Maybe you want this...
SELECT DATEDIFF(minute,(SELECT TOP(1) FIRSTOCCURRENCE FROM
xyz.xyz.xyz_status),(SELECT TOP(1) LASTOCCURRENCE FROM
xyz.xyz.xyz_status))
FROM xyz.xyz.xyz_status
WHERE node = 'Houston';
Here you can take a look at more examples of DATEDIFF function.
This should put you on track, although writing reports in SQL is probably a bad idea. I believe what you want in the output is the detail rows for each node followed by a total row. You can also look at the ROLLUP options, some of which are deprecated.
with data as (
select
NODE, EVENTID, TYPE, FIRSTOCCURRENCE, LASTOCCURRENCE,
DATEDIFF(Minute, FIRSTOCCURRENCE, LASTOCCURRENCE) as OutageInMin,
TicketNumber, Tally,
ROW_NUMBER() OVER (PARTITION BY NODE ORDER BY FIRSTOCCURRENCE) as rn
from Test_Table
--WHERE FIRSTOCCURRENCE >= DATEADD(hh, -24, GETDATE())
)
select
case when grouping(rn) = 1 then 'SITE TOTAL' else NODE end as NODE,
case when grouping(rn) = 1 then null else min(EVENTID) end as EVENTID,
case when grouping(rn) = 1 then null else min(TYPE) end as TYPE,
case when grouping(rn) = 1 then null else min(FIRSTOCCURRENCE) end as FIRSTOCCURRENCE,
case when grouping(rn) = 1 then null else min(LASTOCCURRENCE) end as LASTOCCURRENCE,
case when grouping(rn) = 1 then null else min(Tally) end as Tally,
case when grouping(rn) = 1 then null else min(TicketNumber) end as TicketNumber,
case when grouping(node) = 1
then min(OutageInMin) else sum(OutageInMin) end as "Outage In MIN"
from
data
group by grouping sets ( (NODE, rn), (NODE) )
order by data.NODE, grouping(rn), rn;
http://rextester.com/DZIHJ81264
GROUP BY is only really meaningful in SQL when you are aggregating something. The simplest example is a count.
Example: you want to know how many EventIDs are linked to each node:
SELECT Count(EventId), node FROM xyz.xyz.xyz_status GROUP BY node;
Here is a site that presents the GROUP BY clause. If you clarify what you are looking for, we'll give you a more concrete example.

How can get null column after UNPIVOT?

I have got the following query:
WITH data AS(
SELECT * FROM partstat WHERE id=4
)
SELECT id, AVG(Value) AS Average
FROM (
SELECT id,
AVG(column_1) as column_1,
AVG(column_2) as column_2,
AVG(column_3) as column_3
FROM data
GROUP BY id
) as pvt
UNPIVOT (Value FOR V IN (column_1,column_2,column_3)) AS u
GROUP BY id
If column_1, column_2 and column_3 (or at least one of them) have values, I get a result like the following:
id, Average
4, 5.12631578947368
If column_1, column_2 and column_3 are all NULL, the query does not return any rows:
id, Average
My question is: how can I get the following result when the columns contain only NULL values?
id, Average
4, NULL
Have you tried using COALESCE or ISNULL?
e.g.
ISNULL(AVG(column_1), 0) as column_1,
This does mean that you will get 0 as the result instead of 'NULL' though - do you need null when they are all NULL?
Edit:
Also, is there any need for an unpivot? Since you are specifying all 3 columns, why not just do:
SELECT BankID, (column_1 + column_2 + column_3) / 3 FROM partstat
WHERE bankid = 4
This gives you the same results but with the NULL
Of course this is assuming you have 1 row per bankid
Edit:
UNPIVOT isn't supposed to be used like this as far as I can see - I'd unpivot first then try the AVG... let me have a go...
Edit:
Ah I take that back, it is just a problem with NULLs - other posts suggest ISNULL or COALESCE to eliminate the nulls, you could use a placeholder value like -1 which could work e.g.
SELECT bankid, AVG(CASE WHEN value = -1 THEN NULL ELSE value END) AS Average
FROM (
SELECT bankid,
isnull(AVG(column_1), -1) as column_1 ,
AVG(Column_2) as column_2 ,
Avg(column_3) as column_3
FROM data
group by bankid
) as pvt
UNPIVOT (Value FOR o in (column_1, column_2, column_3)) as u
GROUP BY bankid
You need to ensure this will work though as if you have a value in column2/3 then column_1 will no longer = -1. It might be worth doing a case to see if they are all NULL in which case replacing the 1st null with -1
Here is an example without UNPIVOT:
DECLARE @partstat TABLE (id INT, column_1 DECIMAL(18, 2), column_2 DECIMAL(18, 2), column_3 DECIMAL(18, 2))
INSERT @partstat VALUES
(5, 12.3, 1, 2)
,(5, 2, 5, 5)
,(5, 2, 2, 2)
,(4, 2, 2, 2)
,(4, 4, 4, 4)
,(4, 21, NULL, NULL)
,(6, 1, NULL, NULL)
,(6, 1, NULL, NULL)
,(7, NULL, NULL, NULL)
,(7, NULL, NULL, NULL)
,(7, NULL, NULL, NULL)
,(7, NULL, NULL, NULL)
,(7, NULL, NULL, NULL)
;WITH data AS(
SELECT * FROM @partstat
)
)
SELECT
pvt.id,
(ISNULL(pvt.column_1, 0) + ISNULL(pvt.column_2, 0) + ISNULL(pvt.column_3, 0))/
NULLIF(
CASE WHEN pvt.column_1 IS NULL THEN 0 ELSE 1 END +
CASE WHEN pvt.column_2 IS NULL THEN 0 ELSE 1 END +
CASE WHEN pvt.column_3 IS NULL THEN 0 ELSE 1 END
, 0)
AS Average
FROM (
SELECT id,
AVG(column_1) as column_1,
AVG(column_2) as column_2,
AVG(column_3) as column_3
FROM data
GROUP BY id
) as pvt
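Another option, if you want to keep the unpivot-then-average shape of the original query: UNPIVOT drops rows whose value is NULL, which is why an id with only NULLs disappears. CROSS APPLY with a VALUES list keeps those rows, and AVG over nothing but NULLs returns NULL. A sketch, assuming a table variable @partstat with a few made-up rows:

```sql
-- Made-up rows: id 4 has some values, id 7 has only NULLs.
DECLARE @partstat TABLE (id INT,
                         column_1 DECIMAL(18, 2),
                         column_2 DECIMAL(18, 2),
                         column_3 DECIMAL(18, 2));
INSERT INTO @partstat (id, column_1, column_2, column_3)
VALUES (4, 2, 2, NULL), (4, 4, NULL, NULL), (7, NULL, NULL, NULL);

SELECT p.id,
       AVG(v.Value) AS Average
FROM (
    SELECT id,
           AVG(column_1) AS column_1,
           AVG(column_2) AS column_2,
           AVG(column_3) AS column_3
    FROM @partstat
    GROUP BY id
) AS p
-- Unlike UNPIVOT, this produces three rows per id even when a value is NULL.
CROSS APPLY (VALUES (p.column_1), (p.column_2), (p.column_3)) AS v(Value)
GROUP BY p.id;
-- id 4: per-column averages are 3, 2 and NULL, so Average is 2.5
-- id 7: all three are NULL, so Average is NULL
```

SQL Server reports a warning that NULL values were eliminated by an aggregate, but the row for id 7 is still returned.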