select count by value - sql

Given a table messages with the following fields:
id | Number
customer_id | Number
source | VARCHAR2
...
I want to know how many messages each customer has, but I want to differentiate between messages where source equals 'xml' and all other sources.
My query so far
SELECT customer_id,
case when source = 'xml' then 'xml' else 'manual' end as xml,
count(*)
FROM MESSAGES
GROUP BY customer_id,
case when source = 'xml' then 'xml' else 'manual' end;
which gives me a result similar to this:
customer_id | xml | count
----------------------------
1 | xml | 12
1 | manual | 34
2 | xml | 54
3 | xml | 77
3 | manual | 1
...
This is rather ugly in two ways:
I have to repeat the case statement in both the field list and in the group list
I now have two rows per customer.
Q: Is it possible to formulate a query, such that the result looks like this instead?
customer_id | xml | manual
--------------------------
1 | 12 | 34
2 | 54 | 0
3 | 77 | 1

You are looking for conditional aggregation:
SELECT customer_id,
count(case when source = 'xml' then 1 end) as xml_count,
count(case when source <> 'xml' then 1 end) as manual_count
FROM MESSAGES
GROUP BY customer_id
This works because aggregates ignore NULL values, and a CASE with no ELSE returns NULL whenever source does not satisfy its condition.
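As a rough check of the NULL-skipping behaviour, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite only stands in for the asker's database, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER, customer_id INTEGER, source TEXT)")
# Hypothetical sample rows: customer 1 has both kinds, customer 2 only 'xml'.
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [(1, 1, "xml"), (2, 1, "email"), (3, 1, "phone"), (4, 2, "xml")],
)

# Each CASE yields 1 or NULL; COUNT skips the NULLs.
rows = conn.execute(
    """
    SELECT customer_id,
           COUNT(CASE WHEN source = 'xml'  THEN 1 END) AS xml_count,
           COUNT(CASE WHEN source <> 'xml' THEN 1 END) AS manual_count
    FROM messages
    GROUP BY customer_id
    ORDER BY customer_id
    """
).fetchall()
print(rows)  # [(1, 1, 2), (2, 1, 0)]
```

Customer 2 gets a 0 rather than a missing row, which is the shape the question asks for.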

Use conditional aggregation.
SELECT customer_id,
sum(case when source = 'xml' then 1 else 0 end) as xml,
sum(case when source <> 'xml' then 1 else 0 end) as manual
FROM MESSAGES
GROUP BY customer_id
This assumes the source column is non null. If it can be null use coalesce or nvl in the case expression so the comparison gives you expected results.
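To illustrate the NULL caveat, the sketch below compares the plain comparison with a COALESCE-wrapped one. It uses Python's sqlite3 as a stand-in database; the rows and the placeholder value 'none' are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (customer_id INTEGER, source TEXT)")
# Hypothetical rows: one 'xml', one other source, one NULL source.
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, "xml"), (1, "email"), (1, None)],
)

# NULL <> 'xml' evaluates to NULL, so the plain comparison falls into ELSE 0.
# COALESCE replaces NULL with a placeholder so the row counts as non-xml.
naive, null_safe = conn.execute(
    """
    SELECT SUM(CASE WHEN source <> 'xml' THEN 1 ELSE 0 END),
           SUM(CASE WHEN COALESCE(source, 'none') <> 'xml' THEN 1 ELSE 0 END)
    FROM messages
    """
).fetchone()
print(naive, null_safe)  # 1 2
```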

This will work; it doesn't appear you have a source called 'manual'. COUNT or SUM will give you the same result.
SELECT
customer_id
,ISNULL(COUNT(CASE WHEN source = 'xml' THEN 1 END),0) xml
,ISNULL(COUNT(CASE WHEN source <> 'xml' OR source IS NULL THEN 1 END),0) manual
FROM Messages
GROUP BY customer_id
COUNT returns 0 rather than NULL when no rows match, so the zero in your sample output appears either way. The ISNULL wrapper (SQL Server syntax; Oracle uses NVL) is therefore only a safeguard.

Here is a fancy solution (it does almost exactly what vkp's solution does), using the PIVOT operation introduced in Oracle 11.1. Note how the distinction between 'xml' and all others (including NULL) is dealt with in the subquery.
select *
from (select customer_id, case when source = 'xml' then 'xml' else 'other' end as source
from messages)
pivot (count(*) for source in ('xml' as xml, 'other' as other))
;

Apart from CASE, there is another way, using the DECODE function:
SELECT customer_id,
COUNT(DECODE(source,'xml','xml')) "XML",
COUNT(DECODE(source,'manual','manual')) "manual"
FROM MESSAGES
GROUP BY customer_id;
But this only counts rows whose source is literally 'xml' or 'manual'; rows with any other source, including NULL, are counted in neither column.

Related

How to create a table to count with a conditional

I have a database with a lot of columns with pass, fail, blank indicators
I want to create a function to count each type of value and create a table from the counts. The structure I am thinking is something like
| Value | x | y | z |
|-------|------------------|------------------|------------------|
| pass | count if x=pass | count if y=pass | count if z=pass |
| fail | count if x=fail | count if y=fail | count if z=fail |
| blank | count if x=blank | count if y=blank | count if z=blank |
| total | count(x) | count(y) | count(z) |
where x,y,z are columns from another table.
I don't know which could be the best approach for this
thank you all in advance
I tried this structure but it shows syntax error
CREATE FUNCTION Countif (columnx nvarchar(20),value_compare nvarchar(10))
RETURNS Count_column_x AS
BEGIN
IF columnx=value_compare
count(columnx)
END
RETURN
END
Also, I don't know how to add each count to the actual table I am trying to create
Conditional counting (or any conditional aggregation) can often be done inline by placing a CASE expression inside the aggregate function that conditionally returns the value to be aggregated or a NULL to skip.
An example would be COUNT(CASE WHEN SelectMe = 1 THEN 1 END). Here the aggregated value is 1 (which could be any non-null value for COUNT(); for other aggregate functions, a more meaningful value would be provided). The implicit ELSE returns a NULL, which is not counted.
For your problem, I believe the first thing to do is to UNPIVOT your data, placing the column names and values side by side. You can then group by value and use conditional aggregation as described above to calculate your results. A few more details finish the job: (1) a totals row using WITH ROLLUP, (2) a CASE expression to adjust the labels for the blank and total rows, and (3) some ORDER BY tricks to get the rows in the right order.
The results may be something like:
SELECT
CASE
WHEN GROUPING(U.Value) = 1 THEN 'Total'
WHEN U.Value = '' THEN 'Blank'
ELSE U.Value
END AS Value,
COUNT(CASE WHEN U.Col = 'x' THEN 1 END) AS x,
COUNT(CASE WHEN U.Col = 'y' THEN 1 END) AS y
FROM #Data D
UNPIVOT (
Value
FOR Col IN (x, y)
) AS U
GROUP BY U.Value WITH ROLLUP
ORDER BY
GROUPING(U.Value),
CASE U.Value WHEN 'Pass' THEN 1 WHEN 'Fail' THEN 2 WHEN '' THEN 3 ELSE 4 END,
U.VALUE
Sample data:
| x | y |
|------|------|
| Pass | Pass |
| Pass | Fail |
| Pass | |
| Fail | |
Sample results:
| Value | x | y |
|-------|---|---|
| Pass | 3 | 1 |
| Fail | 1 | 1 |
| Blank | 0 | 2 |
| Total | 4 | 4 |
See this db<>fiddle for a working example.
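For engines without UNPIVOT or WITH ROLLUP, the same idea can be approximated with UNION ALL. The sketch below is an assumption-laden stand-in, not the answer's T-SQL: it uses SQLite through Python's sqlite3, a hypothetical table named data, and four invented rows chosen so the counts match the sample results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (x TEXT, y TEXT)")
# Hypothetical rows; '' represents a blank indicator.
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [("Pass", "Pass"), ("Pass", "Fail"), ("Pass", ""), ("Fail", "")],
)

sql = """
WITH u AS (                       -- manual unpivot: one row per (column, value)
    SELECT 'x' AS col, x AS value FROM data
    UNION ALL
    SELECT 'y', y FROM data
)
SELECT CASE WHEN value = '' THEN 'Blank' ELSE value END AS label,
       COUNT(CASE WHEN col = 'x' THEN 1 END) AS x,
       COUNT(CASE WHEN col = 'y' THEN 1 END) AS y
FROM u
GROUP BY value
UNION ALL                         -- totals row in place of WITH ROLLUP
SELECT 'Total',
       COUNT(CASE WHEN col = 'x' THEN 1 END),
       COUNT(CASE WHEN col = 'y' THEN 1 END)
FROM u
"""
counts = {label: (x, y) for label, x, y in conn.execute(sql)}
print(counts["Pass"], counts["Fail"], counts["Blank"], counts["Total"])
# (3, 1) (1, 1) (0, 2) (4, 4)
```

The result is collected into a dictionary because this sketch does not attempt to reproduce the ORDER BY logic of the original query.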
I don't think you need a generic solution like a function that takes the value as a parameter.
Perhaps you could create a view that groups your data, and then query this view, filtering by your value.
Your view body would be something like this:
select value, count(*) as Total
from table_name
group by value
Feel free to explain your situation better so I could help you.
You can do this by grouping by the status column.
select status, count(*) as total
from some_table
group by status
Rather than making a whole new table, consider using a view. This is a query that looks like a table.
create view status_counts as
select status, count(*) as total
from some_table
group by status
You can then select total from status_counts where status = 'pass' or the like and it will run the query.
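A small sketch of that behaviour, assuming SQLite via Python's sqlite3 and three made-up rows: the view stores only the query, so a later insert is reflected the next time the view is read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER, status TEXT)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?)",
    [(1, "pass"), (2, "pass"), (3, "fail")],
)
conn.execute(
    "CREATE VIEW status_counts AS "
    "SELECT status, COUNT(*) AS total FROM some_table GROUP BY status"
)

query = "SELECT total FROM status_counts WHERE status = 'pass'"
before = conn.execute(query).fetchone()[0]
conn.execute("INSERT INTO some_table VALUES (4, 'pass')")
after = conn.execute(query).fetchone()[0]  # the view re-runs its query
print(before, after)  # 2 3
```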
You can also create a "materialized view". This is like a view, but the results are written to a real table that the database keeps up to date for you. The syntax below is the Azure Synapse Analytics form; in SQL Server itself the closest equivalent is an indexed view.
create materialized view status_counts with (distribution = hash(status)) as
select status, count(*) as total
from some_table
group by status
You'd do this for performance reasons on a large table which does not update very often.

MS-Access Query to PostgreSQL View

I am converting a microsoft access query into a postgresql view. The query has obvious components that I have found reasonable answers to. However, I am still stuck on getting the final result:
SELECT All_Claim_Data.Sec_ID,
Sum(IIf([Type]="LODE",IIf([Status]="Active",1,0),0)) AS LD_Actv,
Sum(IIf([Type]="LODE",IIf([Loc_Date]>#8/31/2017#,IIf([Loc_Date]<#9/1/2018#,1,0),0),0)) AS LD_stkd_17_18,
Sum(IIf([Type]="LODE",IIf([Loc_Date]>#8/31/2016#,IIf([Loc_Date]<#9/1/2017#,1,0),0),0)) AS LD_stkd_16_17,
Sum(IIf([Type]="LODE",IIf([Loc_Date]<#1/1/1910#,IIf(IsNull([Clsd_Date]),1,(IIf([Clsd_Date]>#1/1/1900#,1,0))),0),0)) AS Actv_1900s,
Sum(IIf([Type]="LODE",IIf([Loc_Date]<#1/1/1920#,IIf(IsNull([Clsd_Date]),1,(IIf([Clsd_Date]>#1/1/1910#,1,0))),0),0)) AS Actv_1910s,
FROM All_Claim_Data.Sec_ID,
GROUP BY All_Claim_Data.Sec_ID,
HAVING (((Sum(IIf([casetype_txt]="LODE",1,0)))>0));
Realizing I need to use CASE SUM WHEN, here is what I have worked out so far:
CREATE OR REPLACE VIEW hgeditor.vw_test AS
SELECT All_Claim_Data.Sec_ID,
SUM (CASE WHEN(Type='LODE' AND WHEN(Status='Active',1,0),0)) AS LD_Actv,
SUM (CASE WHEN(Type='LODE' AND WHEN(Loc_Date>'8/31/2017' AND Loc_Date<'9/1/2018',1,0),0),0)) AS LD_stkd_17_18,
SUM (CASE WHEN(Type='LODE' AND WHEN(Loc_Date<'1/1/1910' AND (IsNull(Clsd_Date),1,(WHEN([Clsd_Date]>'1/1/1900',1,0))),0),0)) AS Actv_1900s
FROM All_Claim_Data.Sec_ID,
GROUP BY All_Claim_Data.Sec_ID,
HAVING (((SUM(IIf(Type='LODE',1,0)))>0));
The goal is to count the number of instances in which the Sec_ID has the following:
has (Type = LODE and Status = Active) = SUM integer
has (Type = LODE and Loc_Date between 8/31/2017 and 9/1/2018) = SUM Integer
My primary issue is getting a SUM integer to populate in the new columns
CASE expressions are the equivalent of the Access IIf() function, but WHEN is not a function, so you don't pass it a list of parameters. Think of it as a tiny WHERE clause instead: it evaluates one or more predicates, and what you specify after THEN is the value returned when they hold. Nested IIf() calls of the form IIf(a, IIf(b, 1, 0), 0) collapse into a single WHEN whose conditions are joined with AND.
CREATE OR REPLACE VIEW hgeditor.vw_test AS
SELECT
All_Claim_Data.Sec_ID
, SUM( CASE
WHEN Type = 'LODE' AND
Status = 'Active' THEN 1
ELSE 0
END ) AS LD_Actv
, SUM( CASE
WHEN Type = 'LODE' AND
Loc_Date > to_date('08/31/2017','mm/dd/yyyy') AND
Loc_Date < to_date('09/01/2018','mm/dd/yyyy') THEN 1
ELSE 0
END ) AS LD_stkd_17_18
-- located before 1910 and either still open or closed after 1900,
-- mirroring the IsNull() branch of the Access query
, SUM( CASE
WHEN Type = 'LODE' AND
Loc_Date < to_date('01/01/1910','mm/dd/yyyy') AND
( Clsd_Date IS NULL OR
Clsd_Date > to_date('01/01/1900','mm/dd/yyyy') ) THEN 1
ELSE 0
END ) AS Actv_1900s
FROM All_Claim_Data
GROUP BY
All_Claim_Data.Sec_ID
HAVING COUNT( CASE
WHEN Type = 'LODE' THEN 1
END ) > 0
;
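By the way, you should NOT rely on MM/DD/YYYY strings as dates in Postgres: how '8/31/2017' is parsed depends on the DateStyle setting. Prefer unambiguous ISO 8601 literals such as DATE '2017-08-31'.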
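If the Access literals have to be converted in bulk, something along these lines could turn them into ISO 8601 strings before they are pasted into the view. The helper name access_to_iso is hypothetical, and the sketch assumes every literal is in month/day/year order.

```python
from datetime import datetime


def access_to_iso(literal: str) -> str:
    """Convert an Access date literal such as '#8/31/2017#' to 'YYYY-MM-DD'."""
    # Strip the surrounding '#' marks, parse as month/day/year, emit ISO 8601.
    return datetime.strptime(literal.strip("#"), "%m/%d/%Y").date().isoformat()


iso = access_to_iso("#8/31/2017#")
print(iso)  # 2017-08-31
```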
nb: Aggregate functions ignore NULL, take this example:
+----------+
| id value |
+----------+
| 1 x |
| 2 NULL |
| 3 x |
| 4 NULL |
| 5 x |
+----------+
select
count(*) c_all
, count(value) c_value
from t
+-------+----------+
| c_all | c_value |
+-------+----------+
| 5 | 3 |
+-------+----------+
select
sum(case when value IS NOT NULL then 1 else 0 end) sum_case
, count(case when value IS NOT NULL then 1 end) count_case
from t
+----------+-------------+
| sum_case | count_case |
+----------+-------------+
| 3 | 3 |
+----------+-------------+
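The same table can be reproduced in a runnable form. The sketch below assumes SQLite via Python's sqlite3 and adds one case the tables above do not show: when no row matches at all, COUNT returns 0 while SUM without an ELSE returns NULL. The value 'z' is simply a value that does not occur in the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "x"), (2, None), (3, "x"), (4, None), (5, "x")],
)

result = conn.execute(
    """
    SELECT COUNT(*)                                 AS c_all,
           COUNT(value)                             AS c_value,
           SUM(CASE WHEN value = 'z' THEN 1 END)    AS sum_no_match,
           COUNT(CASE WHEN value = 'z' THEN 1 END)  AS count_no_match
    FROM t
    """
).fetchone()
print(result)  # (5, 3, None, 0)
```

This is why the SUM variant is usually written with ELSE 0.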

Grouping but with keeping all non-NULL values

Let's say I have the table:
ID | Name | Intolerance
1 | Amy | Lactose
2 | Brian | Lactose
3 | Amy | Gluten
And I run this SQL query:
SELECT
Name,
CASE
WHEN Intolerance = 'Lactose' THEN 1
END AS Lactose,
CASE
WHEN Intolerance = 'Gluten' THEN 1
END AS Gluten
FROM
Table
I get:
Name | Lactose | Gluten
-------+---------+--------
Amy | 1 |
Amy | | 1
Brian | 1 |
But if I try to add "GROUP BY Name", Amy won't have a 1 in both columns, because GROUP BY only selects the last row of each Name. What I want to get instead is this:
Name | Lactose | Gluten
------+---------+---------
Amy | 1 | 1
Brian | 1 |
How can I get that? Is there perhaps a more efficient way to summarize who's allergic to what from the same input? Thanks in advance.
With GROUP BY, columns that are not in the GROUP BY list have to be wrapped in aggregate functions.
In this case I assume you want to use MAX, which yields either 1 or NULL.
SUM or COUNT can also be used to surround a CASE WHEN.
But those would return a total rather than a flag.
SELECT
Name,
MAX(CASE WHEN Intolerance = 'Lactose' THEN 1 END) AS Lactose,
MAX(CASE WHEN Intolerance = 'Gluten' THEN 1 END) AS Gluten
FROM Table
GROUP BY Name
ORDER BY Name
Or, if you don't want to see NULLs, let the CASE return a varchar instead of a number:
SELECT
Name,
MAX(CASE WHEN Intolerance = 'Lactose' THEN '1' ELSE '' END) AS Lactose,
MAX(CASE WHEN Intolerance = 'Gluten' THEN '1' ELSE '' END) AS Gluten
FROM Table
GROUP BY Name
ORDER BY Name
I think what you need is the sum of the number of intolerances for each person. Also, add an ELSE so that each row contributes either 0 or 1:
SELECT
Name,
SUM(CASE WHEN Intolerance = 'Lactose' THEN 1 ELSE 0 END) AS Lactose,
SUM(CASE WHEN Intolerance = 'Gluten' THEN 1 ELSE 0 END) AS Gluten
FROM Table
GROUP BY Name
ORDER BY Name
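The difference between the MAX and SUM variants only shows up when a person has the same intolerance listed more than once. A minimal sketch, assuming SQLite via Python's sqlite3, a hypothetical table name people (TABLE itself is a reserved word), and one invented duplicate row for Amy:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (ID INTEGER, Name TEXT, Intolerance TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "Amy", "Lactose"),
        (2, "Brian", "Lactose"),
        (3, "Amy", "Gluten"),
        (4, "Amy", "Lactose"),  # hypothetical duplicate, not in the question
    ],
)

rows = conn.execute(
    """
    SELECT Name,
           MAX(CASE WHEN Intolerance = 'Lactose' THEN 1 END)        AS lactose_flag,
           SUM(CASE WHEN Intolerance = 'Lactose' THEN 1 ELSE 0 END) AS lactose_total,
           MAX(CASE WHEN Intolerance = 'Gluten' THEN 1 END)         AS gluten_flag
    FROM people
    GROUP BY Name
    ORDER BY Name
    """
).fetchall()
print(rows)  # [('Amy', 1, 2, 1), ('Brian', 1, 1, None)]
```

MAX stays at 1 for Amy while SUM reports 2, and Brian's missing gluten entry comes back as NULL (None in Python).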
I feel that each time I encounter a question like this, it's because not enough thought was allowed for design on the project.
To put it simply: you are trying to move data into columns. This is what your application layer is for, not the database! People tend to mix up what the database is for and what the application / UI layer is for.
And each time it happens, I see people racking their brains to answer, because that's the point here: answer the question no matter what. Don't question what the OP wants to do, just give him the answer...
Sorry for that, I am just a little bit pissed.
My solution:
Keep your original query and do the aesthetics on your UI / application layer side. You probably have an IList inside each Person. Just fill them and give the UI the opportunity to display them however it wants.
Because that's what you're asking the database to do: aesthetics.

SQL Count of Column value and its a subColumn

I have A Table in DB2 Database such as below:
StatusCode | IsResolved | IsAssigned
ABC | Y |
ABC | N |
ABC | |
ADEF | Y |
ADEF | | Y
I want to get data in the way such as:
StatusCode |Count of Status Code| Count of Resolved with value Y| Count of Assigned With value Y
ABC | 3 | 1 | 0
ADEF | 2 | 1 | 1
I am able to get count of Status Code by using groupBy but I am not sure how to fetch data of count of resolved and assigned in the same query.
Query: select statusCode,count(statusCode) from table group by statusCode
Can anyone help me in how to fetch the resolved and Assigned count?
Issue Solution: Christian and JPW: Solution was to Use sum(case IsResolved when 'Y' then 1 else 0 end)
Try to use
select statusCode, count(statusCode),
sum(case IsResolved when 'Y' then 1 else 0 end),
sum(case IsAssigned when 'Y' then 1 else 0 end)
from table
group by statusCode
One way to get the result you want is to use conditional aggregation (where you use a predicate to determine how to aggregate data) like this:
select
StatusCode,
count(*) as "Count of Status Code",
sum(case when IsResolved = 'Y' then 1 else 0 end) as "Count of Resolved with value Y",
sum(case when IsAssigned = 'Y' then 1 else 0 end) as "Count of Assigned With value Y"
from your_table
group by StatusCode;
The case expression construct (case ... when ... then .. end) is part of the ANSI SQL standard, so this should work in any compliant database.
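As a quick check against the question's sample rows, here is a sketch that runs this query on SQLite through Python's sqlite3 instead of DB2. The table name status_table is made up, and the blank cells are assumed to be NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE status_table (StatusCode TEXT, IsResolved TEXT, IsAssigned TEXT)"
)
# Rows from the question, with blanks stored as NULL (an assumption).
conn.executemany(
    "INSERT INTO status_table VALUES (?, ?, ?)",
    [
        ("ABC", "Y", None),
        ("ABC", "N", None),
        ("ABC", None, None),
        ("ADEF", "Y", None),
        ("ADEF", None, "Y"),
    ],
)

rows = conn.execute(
    """
    SELECT StatusCode,
           COUNT(*) AS status_count,
           SUM(CASE WHEN IsResolved = 'Y' THEN 1 ELSE 0 END) AS resolved_y,
           SUM(CASE WHEN IsAssigned = 'Y' THEN 1 ELSE 0 END) AS assigned_y
    FROM status_table
    GROUP BY StatusCode
    ORDER BY StatusCode
    """
).fetchall()
print(rows)  # [('ABC', 3, 1, 0), ('ADEF', 2, 1, 1)]
```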
You can achieve this using SUM() and CASE
SELECT
statusCode,
COUNT(statusCode)
,SUM(CASE WHEN IsResolved='Y' THEN 1 ELSE 0 END) Resolved
,SUM(CASE WHEN IsAssigned='Y' THEN 1 ELSE 0 END) Assigned
FROM [Questions] GROUP BY statusCode
Here is a related question: Sql Server equivalent of a COUNTIF aggregate function
I suppose the prior answers used the SUM aggregate because it was unknown how the missing values are stored. If the missing values are NULL, each expression could have been coded as a COUNT with the same effect as the SUM.
And if the missing values shown in the OP's table are NULL, and the Is* columns only ever hold 'Y' or 'N' (either because a CHECK constraint of IN ('Y','N') exists or because the data effectively meets one), then COUNT combined with NULLIF works as a simplified special case of the CASE expression:
select
statuscode as "StatusCode"
, count(*) as "Count of Status Code"
, count(nullif(isResolved,'N')) as "Count of Resolved with value Y"
, count(nullif(isAssigned,'N')) as "Count of Assigned with value Y"
from so39705143
group by statuscode
order by statuscode
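The reliance on the column holding only 'Y' or 'N' can be seen in a small experiment. This sketch assumes SQLite via Python's sqlite3, a hypothetical one-column table flags, and a stray value 'X' that a CHECK constraint would normally reject.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flags (flag TEXT)")
# 'X' is a deliberately invalid value to show where the shortcut breaks.
conn.executemany("INSERT INTO flags VALUES (?)", [("Y",), ("N",), (None,), ("X",)])

# NULLIF(flag, 'N') turns 'N' into NULL and passes everything else through,
# so any non-'N', non-NULL value is counted, not just 'Y'.
nullif_count, case_count = conn.execute(
    """
    SELECT COUNT(NULLIF(flag, 'N')),
           COUNT(CASE WHEN flag = 'Y' THEN 1 END)
    FROM flags
    """
).fetchone()
print(nullif_count, case_count)  # 2 1
```

With only 'Y', 'N' and NULL in the column, the two counts agree.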

How to get max of multiple columns in oracle

Here is a sample table:
| customer_token | created_date | orders | views |
+--------------------------------------+------------------------------+--------+-------+
| 93a03e36-83a0-494b-bd68-495f54f406ca | 10-NOV-14 14.41.09.000000000 | 1 | 0 |
| 93a03e36-83a0-494b-bd68-495f54f406ca | 20-NOV-14 14.41.47.000000000 | 0 | 1 |
| 93a03e36-83a0-494b-bd68-495f54f406ca | 26-OCT-14 16.14.30.000000000 | 2 | 0 |
| 93a03e36-83a0-494b-bd68-495f54f406ca | 11-OCT-14 16.31.11.000000000 | 0 | 2 |
In this customer data table I store all of the dates when a given customer has placed an order or viewed a product. Now, for a report, I want to write a query that generates, for each customer (customer_token), the last_order_date (latest row where orders > 0) and the last_view_date (latest row where views > 0).
I am looking for an efficient query as I have millions of records.
select customer_token,
max(case when orders > 0 then created_date else NULL end),
max(case when views > 0 then created_date else NULL end)
from Customer
group by customer_token;
Update: This query is quite efficient because Oracle is likely to scan the table only once. There is also an interesting point about grouping: when you use GROUP BY, the select list can only contain columns that are in the GROUP BY, or aggregate functions. In this query MAX is calculated over created_date, but you don't need to put orders and views in the GROUP BY because they only appear in the expression inside the MAX function. This pattern is not seen very often.
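A runnable sketch of this query on the sample rows, assuming SQLite via Python's sqlite3 in place of Oracle. The timestamps are stored as ISO 8601 text so that MAX compares them chronologically, the token is shortened to a hypothetical 'cust-1', and the column aliases are added for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer "
    "(customer_token TEXT, created_date TEXT, orders INTEGER, views INTEGER)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        ("cust-1", "2014-11-10 14:41:09", 1, 0),
        ("cust-1", "2014-11-20 14:41:47", 0, 1),
        ("cust-1", "2014-10-26 16:14:30", 2, 0),
        ("cust-1", "2014-10-11 16:31:11", 0, 2),
    ],
)

# The CASE hides dates of non-matching rows from MAX by turning them into NULL.
rows = conn.execute(
    """
    SELECT customer_token,
           MAX(CASE WHEN orders > 0 THEN created_date END) AS last_order_date,
           MAX(CASE WHEN views  > 0 THEN created_date END) AS last_view_date
    FROM customer
    GROUP BY customer_token
    """
).fetchall()
print(rows)  # [('cust-1', '2014-11-10 14:41:09', '2014-11-20 14:41:47')]
```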
When you want the largest value of a column across a set of rows, you use the MAX() aggregate function. Any column you select alongside an aggregate has to appear in the GROUP BY.
In this case, you want to group by customer_token. That way, you'll receive one row per group, and the aggregate function will give you the value for that group.
However, you only want to see the dates where the cell value is greater than 0, so I recommend you put a case statement inside your MAX() function like this:
SELECT customer_token,
MAX(CASE WHEN orders > 0 THEN created_date ELSE NULL END) AS latestOrderDate,
MAX(CASE WHEN views > 0 THEN created_date ELSE NULL END) AS latestViewDate
FROM customer
GROUP BY customer_token;
This will give you the max date only among rows where orders is positive, and only among rows where views is positive. Without the CASE expression, MAX(created_date) would simply return the customer's most recent row of either kind, so both columns would show the same date.
Here is an Oracle reference for aggregate functions.