getting average of average depending on a column? - sql

I have a query that computes the average rating for each row and shows the employee.
I would like it to show the average for EACH employee instead, meaning I would like to average all the rows with the same employee.
How can I accomplish this?
This is my current query:
SELECT
(
SELECT AVG(rating)
FROM (VALUES
(cast(c.rating1 as Float)),
(cast(c.rating2 as Float)),
(cast(c.rating3 as Float)),
(cast(c.rating4 as Float)),
(cast(c.rating5 as Float))
) AS v(rating)
WHERE v.rating > 0
) avg_rating, employee
From CSEReduxResponses c
Where
month(c.approveddate)= 6
AND year(c.approveddate)=2014
Below I have some sample data I created:
create table CSEReduxResponses (rating1 int, rating2 int, rating3 int, rating4 int, rating5 int,
approveddate datetime,employee int)
insert into CSEReduxResponses (rating1 , rating2 ,rating3 , rating4 , rating5 ,
approveddate, employee )
values
(5,4,5,1,4,'2014-06-18',1),
(5,4,5,1,4,'2014-06-18',1),
(5,4,5,1,0,'2014-06-18',1),
(5,4,0,1,4,'2014-06-18',2),
(5,4,5,1,4,'2014-06-18',2),
(5,4,0,1,4,'2014-06-18',3),
(5,0,5,4,4,'2014-06-18',3),
(5,4,5,0,0,'2014-06-18',3);

How about something like this?
select employee,
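avg(case when n.n = 1 and rating1 > 0 then cast(rating1 as float)
when n.n = 2 and rating2 > 0 then cast(rating2 as float)
when n.n = 3 and rating3 > 0 then cast(rating3 as float)
when n.n = 4 and rating4 > 0 then cast(rating4 as float)
when n.n = 5 and rating5 > 0 then cast(rating5 as float)
end) as avg_rating -- cast avoids an integer average on int columns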
from CSEReduxResponses c cross join
(select 1 as n union all select 2 union all select 3 union all select 4 union all select 5
) n
where month(c.approveddate)= 6 and year(c.approveddate)=2014
group by employee;
I would recommend rewriting the where clause as:
where c.approveddate >= '2014-06-01' and c.approveddate < '2014-07-01'
This would allow the SQL engine to use an index on approveddate.
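For comparison, here is a minimal sketch that keeps the VALUES constructor from the question and moves it into a CROSS APPLY, so the unpivoted ratings can be grouped by employee directly. It assumes SQL Server 2008 or later (for the VALUES table constructor) and uses the range predicate on approveddate recommended above; the aliases v and avg_rating are only illustrative.

```sql
-- Sketch: unpivot the five rating columns of each row, drop the zeros,
-- then average what is left per employee. Assumes SQL Server 2008+.
SELECT c.employee,
       AVG(v.rating) AS avg_rating
FROM CSEReduxResponses AS c
CROSS APPLY (VALUES
    (CAST(c.rating1 AS float)),
    (CAST(c.rating2 AS float)),
    (CAST(c.rating3 AS float)),
    (CAST(c.rating4 AS float)),
    (CAST(c.rating5 AS float))
) AS v(rating)
WHERE v.rating > 0                      -- ignore unanswered (0) ratings
  AND c.approveddate >= '20140601'      -- range predicate can use an index
  AND c.approveddate <  '20140701'
GROUP BY c.employee;
```

Because every non-zero rating becomes its own row before grouping, each rating carries equal weight, which is not the same as averaging the per-row averages when rows have different numbers of zeros.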

select
(sum(rating1)+sum(rating2)+sum(rating3)+sum(rating4)+sum(rating5))
/
(count(nullif(rating1,0))+count(nullif(rating2,0))+count(nullif(rating3,0))+count(nullif(rating4,0))+count(nullif(rating5,0)))
as avg_rating,
count(*) as number_of_responses, employee
From CSEReduxResponses where month(approveddate)= 6 AND year(approveddate)=2014 group by employee ;

I have also come up with a slightly slicker version, using a UDF. I prefer this one, as the averaging function might come in useful for other queries...
DELIMITER //
DROP FUNCTION IF EXISTS cca_wip.avg_ignore0//
CREATE FUNCTION cca_wip.avg_ignore0(
str VARCHAR(500)
) RETURNS double
COMMENT 'values separated by a comma, that are to be averaged. 0 will be treated as NULL'
BEGIN
DECLARE ss TEXT;
DECLARE sum, count double;
IF length(str)=0 or str not regexp '[0-9]' then RETURN 0;
end if;
IF str regexp '[a-z]' then RETURN NULL;
end if;
SET str=replace(str,'NULL','0');
SET sum =0;
SET count =0;
WHILE length(str)>0 DO
set ss=substring_index(str,',',1);
SET sum = sum + ss;
IF ss>0 THEN SET count = count+1;
END IF;
set str=trim(trim(trim(',') from trim(trim(ss from str))));
END WHILE;
RETURN (sum/count);
END//
DELIMITER ;
select
avg_ignore0(group_concat(concat_ws(',',rating1,rating2,rating3,rating4,rating5))),
count(*) as number_of_responses,
employee
From CSEReduxResponses
where
month(approveddate)= 6 AND year(approveddate)=2014
group by employee ;

Related

calculate SUM of multiple Case When

My data has 3 groups of variables: (set1_a, set1_b, set1_c), (set2_x, set2_y) and (set3_n), as below. For each group, if at least one variable has a value > 90, I count it as 1.
Then I SUM the counts.
My code below works fine. However, I would like to put it all in one SELECT statement.
Can you please help?
Create TABLE have (
id varchar(225),
set1_a varchar(225),
set1_b varchar(225),
set1_c varchar(225),
set2_x varchar(225),
set2_y varchar(225),
set3_n varchar(225)
);
Insert into have (id,set1_a,set1_b,set1_c,set2_x,set2_y,set3_n) values (1,1,3,200,1,1,5);
Insert into have (id,set1_a,set1_b,set1_c,set2_x,set2_y,set3_n) values (2,1,3,200,200,1,5);
Insert into have (id,set1_a,set1_b,set1_c,set2_x,set2_y,set3_n) values (3,1,3,200,200,1,500);
Insert into have (id,set1_a,set1_b,set1_c,set2_x,set2_y,set3_n) values (4,1,3,1,1,1,500);
select * from have;
SELECT id,set1_a,set1_b,set1_c,set2_x,set2_y,set3_n, count1+count2+count3 as total_count FROM
(
select id,set1_a,set1_b,set1_c,set2_x,set2_y,set3_n,
case
when (set1_a >90 or set1_b>90 or set1_c>90) then 1 else 0
end as count1,
case
when (set2_x >90 or set2_y>90) then 1 else 0
end as count2,
case
when (set3_n >90) then 1 else 0
end as count3
from have
) t
--WHERE N1+N2>=2
;
This works. Also, ideally the data type should be int and not varchar since you are comparing with int.
SELECT id,set1_a,set1_b,set1_c,set2_x,set2_y,set3_n,
(case
when greatest(
cast(set1_a as int),
cast(set1_b as int),
cast(set1_c as int)
)>90 then 1 else 0 end) +
(case
when greatest(
cast(set2_x as int),
cast(set2_y as int)
)>90 then 1 else 0 end) +
(case
when cast(set3_n as int) > 90
then 1 else 0 end
) as total_count
from have
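DB fiddle on Postgres 11 for the same. The CASE logic is standard SQL; GREATEST is widely supported but not universal (SQL Server, for example, only added it in the 2022 release).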

Incremental Group BY

How can I achieve incremental grouping in a query?
I need to group all the non-zero values into different named groups.
Please help me write a query based on the columns date and subscribers.
If you have SQL Server 2012 or newer, you can use a few tricks with window functions to get this kind of grouping without cursors, with something like this:
select
Date, Subscribers,
case when Subscribers = 0 then 'No group'
else 'Group' + convert(varchar, GRP) end as GRP
from (
select
Date, Subscribers,
sum (GRP) over (order by Date asc) as GRP
from (
select
*,
case when Subscribers > 0 and
isnull(lag(Subscribers) over (order by Date asc),0) = 0 then 1 else 0 end as GRP
from SubscribersCountByDay S
) X
) Y
Example in SQL Fiddle
In general I advocate AGAINST cursors, but in this case it will not hurt, since it will iterate, sum up and apply the conditional all in one pass.
Also note I declared it FAST_FORWARD so as not to degrade performance.
I'm guessing you want what @HABO commented.
See the working example below: it just sums up until it finds a ZERO, then resets and starts again. Note that the "and @Sum > 0" condition handles the case where the first row is ZERO.
create table dbo.SubscribersCountByDay
(
[Date] date not null
,Subscribers int not null
)
GO
insert into dbo.SubscribersCountByDay
([Date], Subscribers)
values
('2015-10-01', 1)
,('2015-10-02', 2)
,('2015-10-03', 0)
,('2015-10-04', 4)
,('2015-10-05', 5)
,('2015-10-06', 0)
,('2015-10-07', 7)
GO
declare
@Date date
,@Subscribers int
,@Sum int = 0
,@GroupId int = 1
declare @Result as Table
(
GroupName varchar(10) not null
,[Sum] int not null
)
declare ScanIt cursor fast_forward
for
(
select [Date], Subscribers
from dbo.SubscribersCountByDay
union
select '2030-12-31', 0
) order by [Date]
open ScanIt
fetch next from ScanIt into @Date, @Subscribers
while @@FETCH_STATUS = 0
begin
if (@Subscribers = 0 and @Sum > 0)
begin
insert into @Result (GroupName, [Sum]) values ('Group ' + cast(@GroupId as varchar(6)), @Sum)
set @GroupId = @GroupId + 1
set @Sum = 0
end
else begin
set @Sum = @Sum + @Subscribers
end
fetch next from ScanIt into @Date, @Subscribers
end
close ScanIt
deallocate ScanIt
select * from @Result
GO
For the OP: Please next time write the table, just posting an image is lazy
In a version of SQL Server modern enough to support CTEs you can use the following cursorless query:
-- Sample data.
declare @SampleData as Table ( Id Int Identity, Subscribers Int );
insert into @SampleData ( Subscribers ) values
-- ( 0 ), -- Test edge case when we have a zero first row.
( 200 ), ( 100 ), ( 200 ),
( 0 ), ( 0 ), ( 0 ),
( 50 ), ( 50 ), ( 12 ),
( 0 ), ( 0 ),
( 43 ), ( 34 ), ( 34 );
select * from @SampleData;
-- Run the query.
with ZerosAndRows as (
-- Add IsZero to indicate zero/non-zero and a row number to each row.
select Id, Subscribers,
case when Subscribers = 0 then 0 else 1 end as IsZero,
Row_Number() over ( order by Id ) as RowNumber
from @SampleData ),
Groups as (
-- Add a group number to every row.
select Id, Subscribers, IsZero, RowNumber, 1 as GroupNumber
from ZerosAndRows
where RowNumber = 1
union all
select FAR.Id, FAR.Subscribers, FAR.IsZero, FAR.RowNumber,
-- Increment GroupNumber only when we move from a non-zero row to a zero row.
case when Groups.IsZero = 1 and FAR.IsZero = 0 then Groups.GroupNumber + 1 else Groups.GroupNumber end
from ZerosAndRows as FAR inner join Groups on Groups.RowNumber + 1 = FAR.RowNumber
)
-- Display the results.
select Id, Subscribers,
case when IsZero = 0 then 'no group' else 'Group' + Cast( GroupNumber as VarChar(10) ) end as Grouped
from Groups
order by Id;
To see the intermediate results, just replace the final select with select * from ZerosAndRows or select * from Groups.

Show 0 in count SQL

This is my result :
Year matches
2005 1
2008 2
and this is my expected result:
Year matches
2005 1
2006 0
2007 0
2008 2
This is what I have tried:
SELECT DATEPART(yy,A.match_date) AS [Year], COUNT(A.match_id) AS "matches"
FROM match_record A
INNER JOIN match_record B ON A.match_id = B.match_id
WHERE (score) IS NULL OR (score) = 0
GROUP BY DATEPART(yy,A.match_date);
I want to get zero as the count for the years where score has some value (not null and not zero, i.e. anything greater than 0). Can someone help me?
This might do what you're looking for:
SELECT DATEPART(yy,A.match_date) AS [Year],
SUM(CASE WHEN score=0 or score is null THEN 1 ELSE 0 END) AS "matches"
FROM match_record A
INNER JOIN match_record B ON A.match_id = B.match_id
GROUP BY DATEPART(yy,A.match_date);
Assuming you have any data in the missing years, this should now produce your expected results.
If, instead, you need 0s for years where you have no data, you'll need to provide the list of years separately (say, via a numbers table) and then LEFT JOIN that source to your existing query.
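As a rough sketch of that second case, the year list can be supplied inline and LEFT JOINed to the matches. The hard-coded years 2005 to 2008 come from the expected output in the question; treating score as a column of match_record is an assumption, since the original query does not qualify it.

```sql
-- Sketch: one row per year in the list, counting only matches
-- whose score is NULL or 0. Years without such matches show 0.
SELECT y.[Year],
       COUNT(m.match_id) AS matches     -- counts only joined (non-NULL) rows
FROM (VALUES (2005), (2006), (2007), (2008)) AS y([Year])
LEFT JOIN match_record AS m
       ON DATEPART(yy, m.match_date) = y.[Year]
      AND (m.score IS NULL OR m.score = 0)   -- filter in ON, not WHERE
GROUP BY y.[Year]
ORDER BY y.[Year];
```

Keeping the score condition in the ON clause matters: moving it to WHERE would discard the unmatched years again and turn the LEFT JOIN back into an inner join.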
Suppose the following is your table:
SELECT * INTO #TEMP FROM
(
SELECT 2005 [YEARS],1 [MATCHES]
UNION ALL
SELECT 2008,2
)T
Declare two variables to hold the min and max year in your table
DECLARE @MINYEAR int;
DECLARE @MAXYEAR int;
SELECT @MINYEAR = MIN(YEARS) FROM #TEMP
SELECT @MAXYEAR = MAX(YEARS) FROM #TEMP
Use the following recursive CTE to generate the years in that range, then LEFT JOIN it with your table.
; WITH CTE as
(
select @MINYEAR as yr FROM #TEMP
UNION ALL
SELECT YR + 1
FROM CTE
WHERE yr < @MAXYEAR
)
SELECT DISTINCT C.YR,CASE WHEN T.MATCHES IS NULL THEN 0 ELSE T.MATCHES END MATCHES
FROM CTE C
LEFT JOIN #TEMP T ON C.yr=T.YEARS
DECLARE @t table(Year int, matches int)
DECLARE @i int=2005
WHILE @i <=2008
BEGIN
IF NOT exists (SELECT matches FROM tbl WHERE year=@i)
BEGIN
INSERT INTO @t
SELECT @i,0
SET @i=@i+1
END
else
BEGIN
INSERT INTO @t
SELECT year,[matches] from tbl WHERE year=@i
SET @i=@i+1
END
END
SELECT DISTINCT * FROM @t
how about,
SELECT
[year],
COUNT(*) [matches]
FROM (
SELECT
DATEPART(yy, [A].[match_date]) [year]
FROM
[match_record] [A]
LEFT JOIN
[match_record] [B]
ON [A].[match_id] = [B].[match_id]
WHERE
COALESCE([B].[score], 0) = 0) [Nils]
GROUP BY
[Year];

Find the missing number group by category

I want to find the missing BatchNo values, grouped by category.
I tried this and it works, but I get the missing numbers across all categories.
How do I group the data by category?
CREATE TABLE #tmp (BatchNo INT, Category VARCHAR(15))
INSERT INTO #tmp
SELECT 94, 'A01'
UNION ALL
SELECT 97, 'A01'
UNION ALL
SELECT 100, 'A02'
UNION ALL
SELECT 105, 'A02'
declare @valmax INT, @valmin INT, @i INT;
select @valmax=max(BatchNo) from #tmp;
select @valmin=min(BatchNo) from #tmp;
set @i=@valmin;
while (@i<@valmax) begin
if (not exists(select * from #tmp where BatchNo=@i)) begin
-- SELECT @i, Category FROM #tmp GROUP BY Category
SELECT @i
end;
set @i=@i+1
end;
The output should look like this:
95 A01
96 A01
101 A02
102 A02
103 A02
104 A02
You can do this by joining with a number table. This query uses the spt_values table and should work:
;with cte as (
select category , min(batchno) min_batch, max(batchno) max_batch
from #tmp
group by category
)
select number, category
from master..spt_values
cross join cte
where type = 'p'
and number > min_batch
and number < max_batch
group by category, number
Sample SQL Fiddle
Note that this table only has a sequence of numbers 0-2047, so if your BatchNo can be higher you need another source for the query (could be another table or a recursive CTE); something like this would work:
;with
cte (category, min_batch, max_batch) as (
select category , min(batchno), max(batchno)
from #tmp
group by category
),
numbers (number, max_number) as (
select 1 as number, (select MAX(batchno) from #tmp) max_number
union all
select number + 1, max_number
from numbers
where number < max_number
)
select number, category
from numbers cross join cte
where number > min_batch
and number < max_batch
group by category, number
option (maxrecursion 0)
Amended my answer. This is probably better than using spt_values, simply because that table has a limit. Thanks to jpw for a better way of doing it.
Declare @Start int
Declare @End int
Select @Start = Min(BatchNo), @End = Max(BatchNo) from #tmp;
with nums as (
select @Start as n
union all
select n+1
from nums
where n < @End
)
Select n, Category from nums
cross join #tmp t
where n > (select Min(BatchNo) from #tmp where Category = t.Category group by Category)
and n < (select Max(BatchNo) from #tmp where Category = t.Category group by Category)
group by category, n

Getting average from 3 columns in SQL Server

I have a table with 3 columns (smallint) in SQL Server 2005.
Table Ratings
ratin1 smallint,
ratin2 smallint,
ratin3 smallint
These columns can have values from 0 to 5.
How can I select the average value of these fields, but only include fields where the value is greater than 0?
So if the column values are 1, 3, 5, the average has to be 3.
If the values are 0, 3, 5, the average has to be 4.
This is kind of quick and dirty, but it will work...
SELECT (ratin1 + ratin2 + ratin3) /
((CASE WHEN ratin1 = 0 THEN 0 ELSE 1 END) +
(CASE WHEN ratin2 = 0 THEN 0 ELSE 1 END) +
(CASE WHEN ratin3 = 0 THEN 0 ELSE 1 END) +
(CASE WHEN ratin1 = 0 AND ratin2 = 0 AND ratin3 = 0 THEN 1 ELSE 0 END)) AS Average
@mwigdahl - this breaks if any of the values are NULL. Use NVL(value, default) to avoid this:
Sum columns with null values in oracle
Edit: This only works in Oracle. In T-SQL, try wrapping each field in ISNULL().
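A minimal T-SQL sketch of that idea, assuming the table is called Ratings as in the question: each column is wrapped in ISNULL so a NULL behaves like 0, the sum is multiplied by 1.0 to avoid integer division, and NULLIF guards against dividing by zero when no rating is usable.

```sql
-- Sketch: NULL-safe average of the non-zero ratings (T-SQL).
-- The table name "Ratings" is an assumption taken from the question.
SELECT (ISNULL(ratin1, 0) + ISNULL(ratin2, 0) + ISNULL(ratin3, 0)) * 1.0
       / NULLIF(
             (CASE WHEN ISNULL(ratin1, 0) = 0 THEN 0 ELSE 1 END) +
             (CASE WHEN ISNULL(ratin2, 0) = 0 THEN 0 ELSE 1 END) +
             (CASE WHEN ISNULL(ratin3, 0) = 0 THEN 0 ELSE 1 END),
             0) AS Average   -- NULL when every rating is 0 or NULL
FROM Ratings;
```

Returning NULL for a row with no usable ratings is a design choice in this sketch; wrap the whole expression in ISNULL(..., 0) if a 0 is preferred.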
There should be an aggregate average function for sql server.
http://msdn.microsoft.com/en-us/library/ms177677.aspx
This is trickier than it looks, but you can do this:
SELECT dbo.MyAvg(ratin1, ratin2, ratin3) from TableRatings
If you create this function first:
CREATE FUNCTION [dbo].[MyAvg]
(
@a int,
@b int,
@c int
)
RETURNS int
AS
BEGIN
DECLARE @result int
DECLARE @divisor int
SELECT @divisor = 3
IF @a = 0 BEGIN SELECT @divisor = @divisor - 1 END
IF @b = 0 BEGIN SELECT @divisor = @divisor - 1 END
IF @c = 0 BEGIN SELECT @divisor = @divisor - 1 END
IF @divisor = 0
SELECT @result = 0
ELSE
SELECT @result = (@a + @b + @c) / @divisor
RETURN @result
END
select
(
select avg(v)
from (values (Ratin1), (Ratin2), (Ratin3)) as value(v)
) as average
You can use the AVG() function. This will get the average for a column. So, you could nest a SELECT statement with the AVG() methods and then SELECT these values.
Pseudo:
SELECT col1, col2, col3
FROM (
SELECT AVG(col1) AS col1, AVG(col2) AS col2, AVG(col3) AS col3
FROM table
) as tbl
WHERE col1 IN (0, 3, 5)
etc.