How to create a table to count with a conditional - sql

I have a database with a lot of columns containing pass, fail, or blank indicators.
I want to create a function that counts each type of value and builds a table from the counts. The structure I am thinking of is something like this:
| Value | x | y | z |
|-------|------------------|------------------|------------------|
| pass | count if x=pass | count if y=pass | count if z=pass |
| fail | count if x=fail | count if y=fail | count if z=fail |
| blank | count if x=blank | count if y=blank | count if z=blank |
| total | count(x) | count(y) | count(z) |
where x, y, z are columns from another table.
I don't know what the best approach for this would be.
Thank you all in advance.
I tried this structure, but it shows a syntax error:
CREATE FUNCTION Countif (columnx nvarchar(20),value_compare nvarchar(10))
RETURNS Count_column_x AS
BEGIN
IF columnx=value_compare
count(columnx)
END
RETURN
END
Also, I don't know how to add each count to the actual table I am trying to create.

Conditional counting (or any conditional aggregation) can often be done inline by placing a CASE expression inside the aggregate function that conditionally returns the value to be aggregated, or NULL to skip the row.
An example would be COUNT(CASE WHEN SelectMe = 1 THEN 1 END). Here the aggregated value is 1 (which could be any non-null value for COUNT(); for other aggregate functions, a more meaningful value would be provided). The implicit ELSE returns NULL, which is not counted.
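For instance, counting the passes in a single column might look like this (a minimal sketch; the column and temp table names are borrowed from the sample below, and the exact types are assumed):
-- rows where x is not 'Pass' fall through to the implicit ELSE NULL and are not counted
SELECT COUNT(CASE WHEN x = 'Pass' THEN 1 END) AS x_pass_count
FROM #Data;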
For your problem, I believe the first thing to do is to UNPIVOT your data, placing the column names and values side by side. You can then group by value and use conditional aggregation as described above to calculate your results. A few more details finish it off: (1) a totals row using WITH ROLLUP, (2) a CASE expression to adjust the labels for the blank and total rows, and (3) some ORDER BY tricks to get the rows in the right order.
The results may be something like:
SELECT
CASE
WHEN GROUPING(U.Value) = 1 THEN 'Total'
WHEN U.Value = '' THEN 'Blank'
ELSE U.Value
END AS Value,
COUNT(CASE WHEN U.Col = 'x' THEN 1 END) AS x,
COUNT(CASE WHEN U.Col = 'y' THEN 1 END) AS y
FROM #Data D
UNPIVOT (
Value
FOR Col IN (x, y)
) AS U
GROUP BY U.Value WITH ROLLUP
ORDER BY
GROUPING(U.Value),
CASE U.Value WHEN 'Pass' THEN 1 WHEN 'Fail' THEN 2 WHEN '' THEN 3 ELSE 4 END,
U.Value
Sample data:
| x | y |
|------|------|
| Pass | Pass |
| Pass | Fail |
| Pass | |
| Fail | |
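For reference, a minimal sketch of how that sample data could be set up (the temp table definition and column types are assumed):
-- blanks are stored as empty strings so that UNPIVOT does not drop them (it drops NULLs)
CREATE TABLE #Data (x NVARCHAR(10), y NVARCHAR(10));
INSERT INTO #Data (x, y) VALUES
('Pass', 'Pass'),
('Pass', 'Fail'),
('Pass', ''),
('Fail', '');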
Sample results:
| Value | x | y |
|-------|---|---|
| Pass | 3 | 1 |
| Fail | 1 | 1 |
| Blank | 0 | 2 |
| Total | 4 | 4 |
See this db<>fiddle for a working example.

I don't think you need a generic solution like a function that takes the value as a parameter.
Perhaps you could create a view that groups your data, and then query that view filtering by your value.
Your view body would be something like this:
select value, count(*) as Total
from table_name
group by value
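Wrapped in a view, a minimal sketch might look like this (the view name is assumed; the table and column names follow the snippet above):
create view value_counts as
select value, count(*) as Total
from table_name
group by value;
-- then filter the view by the value you are interested in
select Total from value_counts where value = 'pass';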
Feel free to explain your situation in more detail so I can help you further.

You can do this by grouping by the status column.
select status, count(*) as total
from some_table
group by status
Rather than making a whole new table, consider using a view. This is a query that looks like a table.
create view status_counts as
select status, count(*) as total
from some_table
group by status
You can then select total from status_counts where status = 'pass' or the like and it will run the query.
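For example, the query described above would be:
select total from status_counts where status = 'pass';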
You can also create a "materialized view". This is like a view, but the results are written to a real table. SQL Server is special in that it keeps this table up to date for you (the syntax below is the Azure Synapse form).
create materialized view status_counts
with (distribution = hash(status))
as
select status, count(*) as total
from some_table
group by status
You'd do this for performance reasons on a large table which does not update very often.


How can I summarise this JSON array data as columns on the same row?

I have a PostgreSQL database with a table called items which includes a JSONB field care_ages.
This field contains an array of between one and three objects, which include these keys:
AgeFrom
AgeTo
Number
Register
For a one-off audit report I need to run on this table, I need to "unpack" this field into columns on the same row.
I've used jsonb_to_recordset to split it out into rows and columns, which gets me halfway:
SELECT
items.id,
items.name,
care_ages.*
FROM
ofsted_items AS items,
jsonb_to_recordset(items.care_ages) AS care_ages ("AgeFrom" integer, "AgeTo" integer, "Register" text, "MaximumNumber" integer)
This gives me output like:
| id | name | register | age_from | age_to | maximum_number |
|----|--------------|----------|----------|--------|----------------|
| 1 | namey mcname | xyz | 0 | 4 | 5 |
| 1 | namey mcname | abc | 4 | 8 | 7 |
Next, I need to combine these rows together, perhaps using GROUP BY, adding extra columns, like this:
| id | name | register_xyz? | xyz_age_from | xyz_age_to | xyz_maximum_number | register_abc? | abc_age_from | abc_age_to | abc_maximum_number |
|----|--------------|---------------|--------------|------------|--------------------|---------------|--------------|------------|--------------------|
| 1 | namey mcname | true | 0 | 4 | 5 | true | 4 | 8 | 7 |
Because I know ahead of time which "registers" there are (there's only three of them), it seems like this should be possible.
I've tried following this example, using CASE to calculate extra columns, but I'm not getting any useful values: all 0s and 5s for some reason.
If you are using Postgres 12 or later, you can use a jsonpath query to first extract the JSON object for each register into separate columns, then use the usual operators to extract the keys. This avoids first expanding into multiple rows just to aggregate them back into a single row later.
select id, name,
(reg_xyz ->> 'AgeFrom')::int as xyz_age_from,
(reg_xyz ->> 'AgeTo')::int as xyz_age_to,
(reg_xyz ->> 'Number')::int as xyz_max_num,
(reg_abc ->> 'AgeFrom')::int as abc_age_from,
(reg_abc ->> 'AgeTo')::int as abc_age_to,
(reg_abc ->> 'Number')::int as abc_max_num
from (
select id, name,
jsonb_path_query_first(care_ages, '$[*] ? (@.Register == "xyz")') as reg_xyz,
jsonb_path_query_first(care_ages, '$[*] ? (@.Register == "abc")') as reg_abc
from ofsted_items
) t
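To see what the jsonpath step does on its own, here is a standalone call against a literal array built from the sample rows in the question (the concrete values are assumed from the sample output):
-- returns the first array element whose Register is "xyz"
select jsonb_path_query_first(
  '[{"Register": "xyz", "AgeFrom": 0, "AgeTo": 4, "Number": 5},
    {"Register": "abc", "AgeFrom": 4, "AgeTo": 8, "Number": 7}]'::jsonb,
  '$[*] ? (@.Register == "xyz")'
);
-- result: the "xyz" object, e.g. {"AgeTo": 4, "Number": 5, "AgeFrom": 0, "Register": "xyz"}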
At one point or another you will have to explicitly write out one expression for each column, so jsonb_to_recordset doesn't really buy you that much.
Online example
If you need this a lot, you can easily put this into a view.
Try the query below:
select id, name,
bool_or(case when "Register" = 'xyz' then true end) as "register_xyz?",
max(case when "Register" = 'xyz' then "AgeFrom" end) as xyz_age_from,
max(case when "Register" = 'xyz' then "AgeTo" end) as xyz_age_to,
max(case when "Register" = 'xyz' then "MaximumNumber" end) as xyz_maximum_number,
bool_or(case when "Register" = 'abc' then true end) as "register_abc?",
max(case when "Register" = 'abc' then "AgeFrom" end) as abc_age_from,
max(case when "Register" = 'abc' then "AgeTo" end) as abc_age_to,
max(case when "Register" = 'abc' then "MaximumNumber" end) as abc_maximum_number
from (SELECT
items.id,
items.name,
care_ages.*
FROM
ofsted_items AS items,
jsonb_to_recordset(items.care_ages) AS care_ages ("AgeFrom" integer, "AgeTo" integer, "Register" text, "MaximumNumber" integer)
)t
group by id, name
You can use conditional aggregation to pivot your table. This can be done with a CASE expression, as in the solution you already linked, or with the FILTER clause:
demo:db<>fiddle
SELECT
id,
name,
bool_and(true) FILTER (WHERE "Register" = 'xyz') as "register_xyz?",
MAX("AgeFrom") FILTER (WHERE "Register" = 'xyz') as xyz_age_from,
MAX("AgeTo") FILTER (WHERE "Register" = 'xyz') as xyz_age_to,
MAX("MaximumNumber") FILTER (WHERE "Register" = 'xyz') as xyz_maximum_number,
bool_and(true) FILTER (WHERE "Register" = 'abc') as "register_abc?",
MAX("AgeFrom") FILTER (WHERE "Register" = 'abc') as abc_age_from,
MAX("AgeTo") FILTER (WHERE "Register" = 'abc') as abc_age_to,
MAX("MaximumNumber") FILTER (WHERE "Register" = 'abc') as abc_maximum_number
FROM items,
jsonb_to_recordset(items.care_ages) AS care_ages ("AgeFrom" integer, "AgeTo" integer, "Register" text, "MaximumNumber" integer)
GROUP BY id, name

Count results in SQL statement additional row

I am trying to get 3% of total membership, which the code below does, but the results bring me back two rows: one has the % and the other is "0". I'm not sure why, or how to get rid of it...
select
sum(Diabetes_FLAG) * 100 / (select round(count(medicaid_no) * 0.03) as percent
from membership) AS PERCENT_OF_Dia
from
prefinal
group by
Diabetes_Flag
Not sure why it brought back a second row; I only need the %, not the second row.
Not sure what I am doing wrong.
Output:
| # | PERCENT_OF_DIA |
|---|------------------|
| 1 | 11.1111111111111 |
| 2 | 0 |
SELECT sum(Diabetes_FLAG)*100 / (SELECT round(count(medicaid_no)*0.03) as percentt
FROM membership) AS PERCENT_OF_Dia
FROM prefinal
WHERE Diabetes_FLAG = 1
-- GROUP BY Diabetes_Flag  -- as you're limiting by the flag in the WHERE clause, this isn't needed
Remove the group by if you want one row:
select sum(Diabetes_FLAG)*100/( SELECT round(count(medicaid_no)*0.03) as percentt
from membership) AS PERCENT_OF_Dia
from prefinal;
When you include group by Diabetes_FLAG, it creates a separate row for each value of Diabetes_FLAG. Based on your results, I'm guessing that it takes on the values 0 and 1.
Not sure why it brought back a second row
This is how a GROUP BY query works. The group by clause groups data by a given column, that is, it collects all values of this column, builds a distinct set of these values, and displays one row for each individual value.
Please consider this simple demo: http://sqlfiddle.com/#!9/3a38df/1
SELECT * FROM prefinal;
| Diabetes_Flag |
|---------------|
| 1 |
| 1 |
| 5 |
Usually the GROUP BY column is listed in the SELECT clause too, in this way:
SELECT Diabetes_Flag, sum(Diabetes_Flag)
FROM prefinal
GROUP BY Diabetes_Flag;
| Diabetes_Flag | sum(Diabetes_Flag) |
|---------------|--------------------|
| 1 | 2 |
| 5 | 5 |
As you see, GROUP BY displays two rows - one row for each unique value of the Diabetes_Flag column.
If you remove the Diabetes_Flag column from the SELECT clause, you will get the same result as above, but without this column:
SELECT sum(Diabetes_Flag)
FROM prefinal
GROUP BY Diabetes_Flag;
| sum(Diabetes_Flag) |
|--------------------|
| 2 |
| 5 |
So the reason you get 2 rows is that Diabetes_Flag has 2 distinct values in the table.

SQL - Representing SUM's after using CASE to transform STR -> INT

I am sorry for what may be a long post in advance.
Background:
I am using Rational Team Concert (RTC) which stores work item data in conjunction with Jazz Reporting Service to create reports. Using the Report Builder tool, it allows you to write your own queries to pull data as a table, and has its own interface to represent the table as a graph.
There are not many options for graphing; the chart type defaults to a count, unless you specify it to show a sum. In order to graph by sum, the data must be a number rather than a string. By default, the Report Builder assumes all variables in the SELECT statement are strings.
The data which I will be using are a bunch of work items. Each work item is associated with a team (A, B) and has a work estimation value (count1, count2).
Item # | Team | Work |
------------------------
123 | A | count1 |
------------------------
124 | A | count2 |
------------------------
125 | B | count2 |
------------------------
....
Problem:
Since the work estimation is entered as a Tag, the first step was to use a CASE WHEN expression in the SELECT to transform count1 -> 1 and count2 -> 2 (the string tag into an actual number which can be summed). This resulted in a table with numbers 1 and 2 in place of the typed tag (good so far).
Item # | Team | Work |
------------------------
123 | A | 1 |
------------------------
124 | A | 2 |
------------------------
125 | B | 2 |
------------------------
....
The problem is that I am trying to graph by sum, which means getting the tool to identify the variables in the SELECT statement as numbers, yet for some reason any variable I declare in a SELECT statement is always viewed as a string. (The tool has a table of the current columns, i.e. the variables in the SELECT, along with the variable type it identifies for each.)
Attempted Solutions:
The first query I did was to return a table of each work item with its team name and work estimate
SELECT T1.NAME,
(CASE WHEN T1.TAGs='count1' THEN 1 ELSE 2 END) AS WORK
FROM RIDW.VW_REQUEST T1
WHERE T1.PROJECT_ID = 73
Which resulted in
Team | Work |
----------------
A | 1 |
----------------
A | 2 |
----------------
B | 2 |
----------------
....
but the tool still sees the numbers as strings. I then tried explicitly casting the CASE to an integer, but it resulted in the same issue:
...
CAST(CASE WHEN T1.TAGs='count1' THEN 1 ELSE 2 END AS Integer) AS WORK
...
Which again the tool still represents as a string.
Current Goal:
As I cannot confirm whether the tool has an underlying problem, compatibility issues with queries, etc., what I believe will work now is to return a table with 2 rows: the sum of the work for each team.
| Team | Sum of 1's and 2's |
|--------|--------------------|
| Team A | SUM(1) + SUM(2) |
| Team B | SUM(1) + SUM(2) |
What I am having trouble with is using sub queries to use SUM to sum the data. When I try
SUM(CASE WHEN ... END) AS TIME2 I get an error that "Column modifiers AVG and SUM apply only to number attributes". This has me thinking that I need to have a sub query which returns the column after the CASE, and then SUM that, but I am sailing into uncharted waters and can't seem to get the syntax to work.
I understand that a post like this would be better off on the product help forum. I have tried asking around but cannot get any help. The solution I am proposing of returning the 2 row/column table should bypass any issues the software may have, but I need help sub-querying the SUM when using a case.
I appreciate your time and help!
EDIT 1:
Below is the full query code which performs the CASE correctly, but whose result the tool still interprets as a string:
SELECT
T1.Name,
CAST(CASE WHEN T1.TAGS='|release_points_1|' THEN 1 ELSE (CASE WHEN T1.TAGS='|release_points_2|' THEN 2 ELSE 0 END) END AS Integer) AS TAG
FROM RIDW.VW_REQUEST T1
WHERE T1.PROJECT_ID = 73
AND
(T1.ISSOFTDELETED = 0) AND
(T1.REQUEST_ID <> -1 AND T1.REQUEST_ID IS NOT NULL)
This small adjustment to your current query should work:
SELECT
T1.Name,
SUM(CAST(CASE WHEN T1.TAGS='|release_points_1|' THEN 1 ELSE (CASE WHEN T1.TAGS='|release_points_2|' THEN 2 ELSE 0 END) END AS Integer)) AS TAG
FROM RIDW.VW_REQUEST T1
WHERE T1.PROJECT_ID = 73
AND
(T1.ISSOFTDELETED = 0) AND
(T1.REQUEST_ID <> -1 AND T1.REQUEST_ID IS NOT NULL)
GROUP BY T1.Name
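For reference, the same pattern applied to the simplified sample at the top of the question would look something like this (the table and column names here are assumed purely for illustration):
-- count1 maps to 1, count2 maps to 2, then the mapped values are summed per team
SELECT Team,
SUM(CASE WHEN Work = 'count1' THEN 1
WHEN Work = 'count2' THEN 2
ELSE 0 END) AS TOTAL_WORK
FROM work_items
GROUP BY Team
-- Team A: 1 + 2 = 3; Team B: 2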

SQL - group by both bits

I have an SQL table with one bit column. If only one bit value (in my case 1) occurs in the rows of the table, how can I write a SELECT statement which shows, for example, the occurrence of both bit values in the table, even if the other does not occur? This is the result I'm trying to achieve:
+----------+--------+
| IsItTrue | AMOUNT |
+----------+--------+
| 1 | 12 |
| 0 | NULL |
+----------+--------+
I already tried to google the answer but without success, as English is not my native language and I am not that familiar with SQL jargon.
select IsItTrue, count(id) as Amount from
(select IsItTrue, id from table
union
select 1 as IsItTrue, null as id
union
select 0 as IsItTrue, null as id) t
group by IsItTrue

How to get max of multiple columns in oracle

Here is a sample table:
| customer_token | created_date | orders | views |
+--------------------------------------+------------------------------+--------+-------+
| 93a03e36-83a0-494b-bd68-495f54f406ca | 10-NOV-14 14.41.09.000000000 | 1 | 0 |
| 93a03e36-83a0-494b-bd68-495f54f406ca | 20-NOV-14 14.41.47.000000000 | 0 | 1 |
| 93a03e36-83a0-494b-bd68-495f54f406ca | 26-OCT-14 16.14.30.000000000 | 2 | 0 |
| 93a03e36-83a0-494b-bd68-495f54f406ca | 11-OCT-14 16.31.11.000000000 | 0 | 2 |
In this customer data table I store all of the dates when a given customer has placed an order or viewed a product. Now, for a report, I want to write a query where, for each customer (customer_token), I get the last_order_date (latest row where orders > 0) and the last_view_date (latest row where views > 0).
I am looking for an efficient query as I have millions of records.
select customer_token,
max(case when orders > 0 then created_date else NULL end),
max(case when views > 0 then created_date else NULL end)
from Customer
group by customer_token;
Update: This query is quite efficient because Oracle is likely to scan the table only once. There is also an interesting thing about the grouping - when you use GROUP BY, the select list can only contain columns which are in the GROUP BY or inside aggregate functions. In this query MAX is calculated over the column created_date, and you don't need to put orders and views in the GROUP BY because they only appear in the expression inside the MAX function. That's not very common.
When you want to get the largest value within a group of rows, you need to use the MAX() aggregate function, and it is best practice to pair aggregate functions with a GROUP BY.
In this case, you want to group by customer_token. That way, you'll receive one row per group, and the aggregate function will give you the value for that group.
However, you only want to consider the dates where the cell value is greater than 0, so I recommend you put a CASE expression inside your MAX() function like this:
SELECT customer_token,
MAX(CASE WHEN orders > 0 THEN created_date ELSE NULL END) AS latestOrderDate,
MAX(CASE WHEN views > 0 THEN created_date ELSE NULL END) AS latestViewDate
FROM customer
GROUP BY customer_token;
This will give you the max date only over rows where orders is positive, and only over rows where views is positive. Without the CASE expressions, MAX would simply return the latest created_date regardless of whether the row was an order or a view.
Here is an Oracle reference for aggregate functions.
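On the sample rows above, the result should be a single row per customer_token, along these lines:
| customer_token                       | latestOrderDate              | latestViewDate               |
+--------------------------------------+------------------------------+------------------------------+
| 93a03e36-83a0-494b-bd68-495f54f406ca | 10-NOV-14 14.41.09.000000000 | 20-NOV-14 14.41.47.000000000 |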