SQL average multiple columns for each row with nulls - sql

I have a table like this:
|Quality|Schedule|Cost Control|
-------------------------------
|7 | 8.5 |10 |
|NULL | 9 |NULL |
and I need to calculate the average of each row in the same table so it looks like this:
|Quality|Schedule|Cost Control|AVG|
----------------------------------
|7 | 8.5 |10 |8.5|
|NULL | 9 |NULL |9 |
which I have done using the following code:
SELECT r.Quality, r.Schedule, r.CostControl,
((coalesce(r.quality,0)+
coalesce(r.schedule,0)+
coalesce(r.CostControl,0))/3) as Average
FROM dbo.Rating r
Which gives the following table:
|Quality|Schedule|Cost Control|AVG|
----------------------------------
|7 | 8.5 |10 |8.5|
|NULL | 9 |NULL |3 |
I know the problem is that the divisor is hard-coded in my select statement, but I can't figure out how to make it variable. I tried using a case statement to select an additional column:
select Count(case when(r.quality) > 0 then 1 else 0 end +
case when (r.Schedule) > 0 then 1 else 0 end +
case when (r.CostControl) > 0 then 1 else 0 end)
But that only gives me one value. I'm out of ideas and facing a pretty tight deadline, so any help would be much appreciated.

Instead of dividing by 3, use
(CASE WHEN Quality IS NULL THEN 0 ELSE 1 END +
CASE WHEN Schedule IS NULL THEN 0 ELSE 1 END +
CASE WHEN [Cost Control] IS NULL THEN 0 ELSE 1 END)
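To see the variable divisor in action, here is a minimal sketch that runs the idea against an in-memory SQLite database from Python. It is an illustration, not the original SQL Server setup: the column is named CostControl, the all-NULL third row is a made-up test case, and the NULLIF wrapper is my addition so that a row with no ratings yields NULL instead of a division by zero.

```python
import sqlite3


def row_averages():
    """Average the non-NULL ratings in each row of a small sample Rating table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Rating (Quality REAL, Schedule REAL, CostControl REAL)")
    conn.executemany(
        "INSERT INTO Rating VALUES (?, ?, ?)",
        [
            (7, 8.5, 10),        # row from the question
            (None, 9, None),     # row from the question
            (None, None, None),  # hypothetical row: nothing rated at all
        ],
    )
    rows = conn.execute(
        """
        SELECT (COALESCE(Quality, 0) + COALESCE(Schedule, 0) + COALESCE(CostControl, 0))
               / NULLIF(
                     CASE WHEN Quality     IS NULL THEN 0 ELSE 1 END
                   + CASE WHEN Schedule    IS NULL THEN 0 ELSE 1 END
                   + CASE WHEN CostControl IS NULL THEN 0 ELSE 1 END,
                   0) AS Average          -- NULLIF turns a zero divisor into NULL
        FROM Rating
        ORDER BY rowid
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


print(row_averages())
```

The second row now averages to 9 instead of 3, because only the one non-NULL column is counted in the divisor.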

I would use a correlated subquery over a VALUES list instead:
select t.*, (select sum(v) / count(v)
from ( values (quality), (Schedule), (CostControl)
) tt(v)
) as AVG
from dbo.Rating t;

I would use apply with avg():
SELECT r.Quality, r.Schedule, r.CostControl, v.average
FROM dbo.Rating r CROSS APPLY
(SELECT avg(val)
FROM (VALUES (quality), (schedule), (CostControl)) v(val)
) v(average);
This requires no subqueries, no long case expressions, generalizes easily to more columns, runs no risk of divide-by-zero . . . and the performance might even be equivalent to the case expression.

Related

Create recursive CTE for this table [duplicate]

I have a table like this:
|id |name |parent|
+-------+----------+------+
|1 |iran | |
|2 |iraq | |
|3 |tehran |1 |
|4 |tehran |3 |
|5 |Vaiasr St |4 |
|6 |Fars |1 |
|7 |shiraz |6 |
It holds addresses, from country down to street. I want to build the addresses with a recursive CTE like this:
with cte_address as
(
select
ID, [Name], parent
from
[Address]
where
Parent is null
union all
select
a.ID, a.[name], a.Parent
from
address a
inner join
cte_address c on a.parent = c.id
)
select *
from cte_address
But I get an error:
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
You have to add OPTION (MAXRECURSION 0) at the end of your SELECT query. MAXRECURSION 0 removes the limit and allows unlimited recursion:
with cte_address as
(
...
...
)
select * from cte_address
option (maxrecursion 0)
Note:
SQL Server limits a query to 100 recursion levels by default. The limit protects against an infinite loop caused by a poorly designed recursive CTE query.
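If removing the limit entirely feels risky, another way to bound the recursion is to carry a depth counter in the CTE itself. The sketch below is only an illustration: it uses SQLite through Python rather than SQL Server (so it needs the RECURSIVE keyword), and the cap of 10 levels and the ' > ' separator are arbitrary choices of mine, not something from the question.

```python
import sqlite3


def address_paths():
    """Build the full path of every address row with a depth-capped recursive CTE."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Address (id INTEGER PRIMARY KEY, name TEXT, parent INTEGER)")
    conn.executemany(
        "INSERT INTO Address VALUES (?, ?, ?)",
        [
            (1, "iran", None), (2, "iraq", None), (3, "tehran", 1), (4, "tehran", 3),
            (5, "Vaiasr St", 4), (6, "Fars", 1), (7, "shiraz", 6),
        ],
    )
    rows = conn.execute(
        """
        WITH RECURSIVE cte_address(id, path, depth) AS (
            -- anchor: top-level rows have no parent
            SELECT id, name, 1 FROM Address WHERE parent IS NULL
            UNION ALL
            -- recursive step: append each child to its parent's path
            SELECT a.id, c.path || ' > ' || a.name, c.depth + 1
            FROM Address a
            JOIN cte_address c ON a.parent = c.id
            WHERE c.depth < 10   -- hypothetical cap; stops runaway recursion
        )
        SELECT id, path FROM cte_address ORDER BY id
        """
    ).fetchall()
    conn.close()
    return dict(rows)


for address_id, path in address_paths().items():
    print(address_id, path)
```

With the sample rows the deepest path has four levels, so the cap is never reached; it only matters if the data turns out to be deeper or cyclic.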

Looking to find duplicates using DIFFERENCE() among 2+ columns

I'm trying to write a SQL Select query that uses the DIFFERENCE() function to find similar names in a database to identify duplicates.
The short version of the code I'm using is:
SELECT *
FROM (SELECT *, DIFFERENCE(FirstName, LEAD(FirstName) OVER (ORDER BY SOUNDEX(FirstName))) AS d
FROM Clients) x
WHERE d >= 3
The problem is my database has additional columns that include middle names and nicknames. So if I have a customer who has multiple names they go by, they might be in the database multiple times, and I need to compare a variety of columns against each other.
Sample Data:
+----+--------+--------+--------+--------+
|ID |First |Middle |AKA1 |AKA2 |
+----+--------+--------+--------+--------+
|1 |Sally |Ann |NULL |NULL |
|2 |Ann |NULL |NULL |NULL |
|3 |Sue |NULL |NULL |NULL |
|4 |Suzy |NULL |NULL |NULL |
|5 |Patricia|NULL |Trish |Patty |
|6 |Patty |NULL |Patricia|Trish |
|7 |Trish |NULL |Patty |Patricia|
+----+--------+--------+--------+--------+
In the above, rows 1+2 are duplicates of each other, as are 3+4, and 5+6+7.
So I'm not sure the best way to get what I want. Here's the longer version of the code I'm actually using:
WITH A AS (SELECT *,
SOUNDEX(FirstName) AS "FirstSoundex",
SOUNDEX(LastName) AS "LastSoundex",
LAG (SOUNDEX(FirstName)) OVER (ORDER BY SOUNDEX(FirstName)) AS "PreviousFirstSoundex",
LAG (SOUNDEX(LastName)) OVER (ORDER BY SOUNDEX(LastName)) AS "PreviousLastSoundex"
FROM Clients),
B AS (
SELECT *,
ISNULL(DIFFERENCE(FirstName, LEAD(FirstName) OVER (ORDER BY FirstSoundex)),0) AS "FirstScore",
ISNULL(DIFFERENCE(LastName, LEAD(LastName) OVER (ORDER BY LastSoundex)),0) AS "LastScore"
FROM A),
C AS (
SELECT *,
ISNULL(LAG (FirstScore) OVER (ORDER BY FirstSoundex),0) AS "PreviousFirstScore",
ISNULL(LAG (LastScore) OVER (ORDER BY LastSoundex),0) AS "PreviousLastScore"
FROM B
),
D AS (
SELECT *,
(CASE WHEN (PreviousFirstScore >=3 AND PreviousLastScore >=3) THEN (PreviousFirstSoundex + PreviousLastSoundex)
WHEN (FirstScore >= 3 AND LastScore >=3) THEN (FirstSoundex + LastSoundex)
END) AS "GroupName"
FROM C
WHERE ((PreviousFirstScore >=3 AND PreviousLastScore >=3) OR (FirstScore >= 3 AND LastScore >=3))
),
E AS (
SELECT *,
LAG(GroupName) OVER (ORDER BY GroupName) AS "PreviousGroup",
LEAD(GroupName) OVER (ORDER BY GroupName) AS "NextGroup"
FROM D
)
SELECT *
FROM E
WHERE (E.GroupName = E.PreviousGroup OR E.GroupName = E.NextGroup)
This lets me group together bundles of potential duplicates and it works well for me. However, I now want to add in a way to check against multiple columns, and I don't know how to do that.
I was thinking about creating a union, something like:
SELECT ClientID,
LastName,
FirstName AS "TempName"
FROM Clients
UNION
SELECT ClientID,
LastName,
MiddleName AS "TempName"
FROM Clients
WHERE MiddleName IS NOT NULL
...etc
But then my LAG() and LEAD() wouldn't work because I'd have multiple rows with the same ClientID. I don't want to identify a single Client as a duplicate of itself.
Anyways, any suggestions? Thanks in advance.

Count and max aggregate function in same table in one query

I have to use the count and max aggregate functions in the same query. For example, I have a history table that contains a date column. I need to retrieve the latest date as well as a count() with some criteria. The criteria apply only to count(). I am able to retrieve the latest date using max and a rank function, but I could not merge the two. Could you please assist?
Update:
Scenario : Customer buys/sells Shares.
Input: Table Share_history and Table Customer and Table Share and Table Share_Status
Customer :
Cust_id |Cust_name
1 |A
2 |B
Share :
Share_id|Share_Name|Owner|
10 |ABC |XYZ |
20 |BCD |MNC |
Share_Status :
Share_Status_Id|Share_Status_Name
1 |Buy
2 |Sell
Share_history :
Share_history_id|Share_id|Trans_date|Share_status_Id|Cust_id
100 |10 |12/12/14 | 1 |1
101 |10 |24/12/14 | 2 |1
102 |10 |14/01/15 | 1 |1
103 |10 |28/02/15 | 2 |1
103 |10 |16/03/15 | 1 |1
Output: the latest Trans_date and a count of how many times the specific share was bought (Share_status_Id = 1), for Cust_id = 1.
Query:
select share1.Share_id,SHAREHIST.Latest_Date,SHAREHIST.buycount
from Share share1 left outer join
(select share_id,max(Trans_date) keep(dense_rank last order by share_id) as Latest_Date,
(select count(*) as buycount from Share_history where Share_status_id=1 and Share_id=share1.Share_id)
from Share_history
group by Share_id
) SHAREHIST
on SHAREHIST.share_id=share1.share_id
EXPECTED :
Share_id|Latest_Date|buycount
10 |16/03/15 | 3
Try using this:
SELECT
Share_id
,Trans_Date
,COUNT(Share_id) buycount
FROM
(
SELECT
*
FROM Share_history SH
WHERE Trans_Date = (SELECT MAX(Trans_Date) FROM Share_history)
) SH
GROUP BY Share_id, Trans_Date
Rest of the joins I think you can add.
I think you just want aggregation, with a conditional count for the buys:
select sh.share_id, max(sh.trans_date) as trans_date,
sum(case when sh.share_status_id = 1 then 1 else 0 end) as buy_count
from share_history sh
where sh.cust_id = 1
group by sh.share_id;
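One way to get the latest date and a count restricted to buys in a single pass is conditional aggregation: MAX looks at every row, while the SUM only adds 1 for rows with status 1. Here is a minimal sketch using SQLite from Python. Note the assumptions: the dates from the question are rewritten in ISO format so that MAX on text sorts chronologically, and the table is reduced to the Share_history columns.

```python
import sqlite3


def latest_and_buy_count(cust_id=1):
    """Return (Share_id, latest Trans_date, number of buys) per share for one customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE Share_history (
               Share_history_id INTEGER, Share_id INTEGER, Trans_date TEXT,
               Share_status_Id INTEGER, Cust_id INTEGER)"""
    )
    conn.executemany(
        "INSERT INTO Share_history VALUES (?, ?, ?, ?, ?)",
        [
            # sample rows from the question, dates converted to ISO (assumption)
            (100, 10, "2014-12-12", 1, 1),
            (101, 10, "2014-12-24", 2, 1),
            (102, 10, "2015-01-14", 1, 1),
            (103, 10, "2015-02-28", 2, 1),
            (103, 10, "2015-03-16", 1, 1),
        ],
    )
    rows = conn.execute(
        """
        SELECT Share_id,
               MAX(Trans_date) AS Latest_Date,                        -- over all rows
               SUM(CASE WHEN Share_status_Id = 1 THEN 1 ELSE 0 END)
                   AS buycount                                        -- buys only
        FROM Share_history
        WHERE Cust_id = ?
        GROUP BY Share_id
        """,
        (cust_id,),
    ).fetchall()
    conn.close()
    return rows


print(latest_and_buy_count())
```

For the sample data this gives one row for share 10 with the latest date and a buy count of 3, which matches the expected output in the question.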

SQL query to return a grouped result as a single row

If I have a jobs table like:
|id|created_at |status |
----------------------------
|1 |01-01-2015 |error |
|2 |01-01-2015 |complete |
|3 |01-01-2015 |error |
|4 |01-02-2015 |complete |
|5 |01-02-2015 |complete |
|6 |01-03-2015 |error |
|7 |01-03-2015 |on hold |
|8 |01-03-2015 |complete |
I want a query that will group them by date and count the occurrence of each status and the total status for that date.
SELECT created_at, status, count(status)
FROM jobs
GROUP BY created_at, status;
Which gives me
|created_at |status |count|
-------------------------------
|01-01-2015 |error |2
|01-01-2015 |complete |1
|01-02-2015 |complete |2
|01-03-2015 |error |1
|01-03-2015 |on hold |1
|01-03-2015 |complete |1
I would like to now condense this down to a single row per created_at unique date with some sort of multi column layout for each status. One constraint is that status is any one of 5 possible words but each date might not have one of every status. Also I would like a total of all statuses for each day. So desired results would look like:
|date |total |errors|completed|on_hold|
----------------------------------------------
|01-01-2015 |3 |2 |1 |null
|01-02-2015 |2 |null |2 |null
|01-03-2015 |3 |1 |1 |1
the columns could be built dynamically from something like
SELECT DISTINCT status FROM jobs;
with a null result for any day that doesn't contain any of that type of status. I am no SQL expert, but I am trying to do this in a DB view so that I don't have to run multiple queries in Rails.
I am using PostgreSQL but would like to keep it to standard SQL. I have tried to understand aggregate functions well enough to use some other tools, but without success.
The following should work in any RDBMS:
SELECT created_at, count(status) AS total,
sum(case when status = 'error' then 1 end) as errors,
sum(case when status = 'complete' then 1 end) as completed,
sum(case when status = 'on hold' then 1 end) as on_hold
FROM jobs
GROUP BY created_at;
The query uses conditional aggregation to pivot the grouped data. It assumes that the status values are known beforehand. If you have additional status values, just add the corresponding sum(case ... expression.
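As a quick check of the conditional aggregation approach, here is a small sketch that loads the sample jobs rows into an in-memory SQLite database from Python and runs the same query. SQLite is only a stand-in here, and the dates are stored as plain text exactly as written in the question.

```python
import sqlite3


def pivot_jobs():
    """Pivot job statuses into one row per created_at using SUM(CASE ...)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (id INTEGER, created_at TEXT, status TEXT)")
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?)",
        [
            (1, "01-01-2015", "error"), (2, "01-01-2015", "complete"),
            (3, "01-01-2015", "error"), (4, "01-02-2015", "complete"),
            (5, "01-02-2015", "complete"), (6, "01-03-2015", "error"),
            (7, "01-03-2015", "on hold"), (8, "01-03-2015", "complete"),
        ],
    )
    rows = conn.execute(
        """
        SELECT created_at, count(status) AS total,
               sum(case when status = 'error'    then 1 end) AS errors,
               sum(case when status = 'complete' then 1 end) AS completed,
               sum(case when status = 'on hold'  then 1 end) AS on_hold
        FROM jobs
        GROUP BY created_at
        ORDER BY created_at
        """
    ).fetchall()
    conn.close()
    return rows


for row in pivot_jobs():
    print(row)
```

Because the CASE expressions have no ELSE branch, a status that never occurs on a given day sums to NULL rather than 0, which is the null the question asks for.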
An actual crosstab query would look like this:
SELECT * FROM crosstab(
$$SELECT created_at, status, count(*) AS ct
FROM jobs
GROUP BY 1, 2
ORDER BY 1, 2$$
,$$SELECT unnest('{error,complete,"on hold"}'::text[])$$)
AS ct (date date, errors int, completed int, on_hold int);
Should perform very well.
Basics:
PostgreSQL Crosstab Query
The above does not yet include the total per date.
Postgres 9.5 introduces the ROLLUP clause, which is perfect for the case:
SELECT * FROM crosstab(
$$SELECT created_at, COALESCE(status, 'total'), ct
FROM (
SELECT created_at, status, count(*) AS ct
FROM jobs
GROUP BY created_at, ROLLUP(status)
) sub
ORDER BY 1, 2$$
,$$SELECT unnest('{total,error,complete,"on hold"}'::text[])$$)
AS ct (date date, total int, errors int, completed int, on_hold int);
Up to Postgres 9.4, use this query instead:
WITH cte AS (
SELECT created_at, status, count(*) AS ct
FROM jobs
GROUP BY 1, 2
)
TABLE cte
UNION ALL
SELECT created_at, 'total', sum(ct)
FROM cte
GROUP BY 1
ORDER BY 1
Related:
Grouping() equivalent in PostgreSQL?
If you want to stick to a simple query, this is a bit shorter:
SELECT created_at
, count(*) AS total
, count(status = 'error' OR NULL) AS errors
, count(status = 'complete' OR NULL) AS completed
, count(status = 'on hold' OR NULL) AS on_hold
FROM jobs
GROUP BY 1;
count(status) for the total per date is error-prone, because it would not count rows with NULL values in status. Use count(*) instead, which is also shorter and a bit faster.
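The difference between the count variants is easy to see on a tiny example. This sketch uses SQLite from Python as a stand-in for Postgres, and the third row with a NULL status is a made-up case added purely to show the effect.

```python
import sqlite3


def count_variants():
    """Compare count(*), count(status) and the 'OR NULL' trick on three rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (id INTEGER, created_at TEXT, status TEXT)")
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?)",
        [
            (1, "01-01-2015", "error"),
            (2, "01-01-2015", "complete"),
            (3, "01-01-2015", None),  # hypothetical row without a status
        ],
    )
    row = conn.execute(
        """
        SELECT count(*),                         -- every row
               count(status),                    -- skips the NULL status
               count(status = 'error' OR NULL)   -- false becomes NULL, so only matches count
        FROM jobs
        """
    ).fetchone()
    conn.close()
    return row


print(count_variants())
```

The `OR NULL` part works because count() ignores NULLs: a false comparison combined with NULL yields NULL, while a true one stays true.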
Here is a list of techniques:
For absolute performance, is SUM faster or COUNT?
In Postgres 9.4+ use the new aggregate FILTER clause, like @a_horse mentioned:
SELECT created_at
, count(*) AS total
, count(*) FILTER (WHERE status = 'error') AS errors
, count(*) FILTER (WHERE status = 'complete') AS completed
, count(*) FILTER (WHERE status = 'on hold') AS on_hold
FROM jobs
GROUP BY 1;
Details:
How can I simplify this game statistics query?

view data from database table

SELECT dun, COUNT( id_ahli ) AS JUMLAH_KESELURUHAN, COUNT(kaum='melayu') AS melayu, COUNT(kaum='cina') AS cina
FROM maklumat_ahli
WHERE jantina = 'lelaki'
AND
(kematian_tarikh IS NULL)
AND (bayaran_pertama IS NULL)
AND (bayaran_kedua IS NULL)
GROUP BY dun
ORDER BY dun
This is my SQL statement. Is it possible to count and view the data by kaum? I use this SQL statement, but my counts are not correct.
/-----------------------------------------/
|dun | Jumlah_keseluruhan | melayu | cina |
-------------------------------------------
|A |123 |100 |23 |
-------------------------------------------
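Is there any way to view the data from the database like the table above?
To count rows with a specific value, you can use a CASE expression inside SUM. COUNT would not work with else 0, because it counts every non-NULL value, including 0:
SUM(case when kaum='melayu' then 1 else 0 end) AS melayu,
SUM(case when kaum='cina' then 1 else 0 end) AS cina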
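It matters which aggregate wraps the CASE, because COUNT counts every non-NULL value, and 0 is not NULL. A minimal sketch with SQLite from Python shows the difference; the three sample rows are invented for illustration, and only the table and column names come from the question.

```python
import sqlite3


def count_vs_sum():
    """Show how COUNT and SUM behave around CASE ... ELSE 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE maklumat_ahli (id_ahli INTEGER, dun TEXT, kaum TEXT)")
    conn.executemany(
        "INSERT INTO maklumat_ahli VALUES (?, ?, ?)",
        [(1, "A", "melayu"), (2, "A", "melayu"), (3, "A", "cina")],  # made-up rows
    )
    row = conn.execute(
        """
        SELECT COUNT(CASE WHEN kaum = 'melayu' THEN 1 ELSE 0 END),  -- counts the 0s too
               SUM(CASE WHEN kaum = 'melayu' THEN 1 ELSE 0 END),    -- adds up only the 1s
               COUNT(CASE WHEN kaum = 'melayu' THEN 1 END)          -- no ELSE: non-matches are NULL
        FROM maklumat_ahli
        WHERE dun = 'A'
        """
    ).fetchone()
    conn.close()
    return row


print(count_vs_sum())
```

So either SUM with `else 0`, or COUNT with the ELSE branch left out, gives the per-kaum totals; COUNT with `else 0` just returns the number of rows in the group.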