Rails aggregate query counting rows that satisfy certain conditions - sql

Let's say that I have a table called bets. I want to run an aggregate SQL query that counts rows satisfying certain conditions. For example, I want to return a count of all bets won, all bets lost, etc. I also want these counts to be grouped by several different columns. I tried a few different queries, but I'm not getting the results I'd expect. For example:
Bet.select("user_id, event_id, bet_line_id, pick, COUNT(state = 'won') AS bets_won,
COUNT(state = 'lost') AS bets_lost, COUNT(state = 'pushed') AS bets_pushed").
group('user_id, event_id, bet_line_id, pick')
This just gives me a result of "1" for bets_won, bets_lost and bets_pushed in every record returned. I'm using Rails 3.2 with Postgres.

You have to pass a CASE expression to COUNT, so that rows that do not match the condition evaluate to NULL and are skipped. The result is a bigint count.
Bet.select("user_id, event_id, bet_line_id, pick,
COUNT(CASE WHEN state = 'won' then 1 ELSE null END) AS bets_won,
COUNT(CASE WHEN state = 'lost' then 1 ELSE null END) AS bets_lost,
COUNT(CASE WHEN state = 'pushed' then 1 ELSE null END) AS bets_pushed").
group('user_id, event_id, bet_line_id, pick')

count(expression) is defined to count the "number of input rows for which the value of expression is not null". The state = 'won' expression evaluates to NULL only when state is NULL; otherwise it is one of the boolean values TRUE or FALSE, and both of those are non-null. The result is that count(state = 'won') actually counts the number of rows where state is not null, and that's not what you're trying to do.
You can use Paritosh Piplewar's solution. Another common approach is to use sum and case:
sum(case state when 'won' then 1 else 0 end) as bets_won,
sum(case state when 'lost' then 1 else 0 end) as bets_lost,
sum(case state when 'pushed' then 1 else 0 end) as bets_pushed
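To see the difference between the three forms side by side, here is a minimal sketch. It assumes Python's standard-library sqlite3 module instead of Rails and Postgres (purely so the example is self-contained; SQLite's count applies the same "not null" rule), and the rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the Postgres bets table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bets (user_id INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO bets (user_id, state) VALUES (?, ?)",
    [(1, "won"), (1, "won"), (1, "lost"), (1, "pushed"), (1, None)],
)

naive_count, count_case, sum_case = conn.execute(
    """
    SELECT COUNT(state = 'won'),                         -- counts every non-null state
           COUNT(CASE WHEN state = 'won' THEN 1 END),    -- NULL for non-matching rows
           SUM(CASE state WHEN 'won' THEN 1 ELSE 0 END)  -- adds 1 per matching row
    FROM bets
    """
).fetchone()

print(naive_count, count_case, sum_case)
```

The first value includes the lost and pushed rows, because a FALSE comparison is still non-null; only the row with a NULL state is left out. The two CASE-based forms count just the won rows.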

Related

Combining multiple rows with the same ID, but different 'Yes'/'No' values for several columns, into one row showing all 'Yes'/'No' values

For the above table, I need to reduce the rows down to one per Filter ID and have all the possible yes/no values showing for that particular Filter Id
for example:
Filter ID
Outpatient Prescriptions
Opioid Outpatient Prescriptions
...
IP Pharmacy Medication Orders - Component Level
1
Yes
Yes
...
No
How is this achieved?
If I understand your question, for each partition of FilterID value, you want any field that has a yes to be aggregated up as 'Yes', otherwise 'No'. If you group by FilterID then you can handle the rollup using a CASE SUM CASE.
SELECT
FilterID,
Field1Response = CASE WHEN SUM(CASE WHEN Field1='Yes' THEN 1 ELSE 0 END) > 0 THEN 'Yes' ELSE 'No' END,
Field2Response = CASE WHEN SUM(CASE WHEN Field2='Yes' THEN 1 ELSE 0 END) > 0 THEN 'Yes' ELSE 'No' END,
Field3Response = CASE WHEN SUM(CASE WHEN Field3='Yes' THEN 1 ELSE 0 END) > 0 THEN 'Yes' ELSE 'No' END
...
FROM
Data
GROUP BY
FilterID
By the nature of the data, you can also simply use MAX. This is not a good habit to get into, because the values may change over time. However, if the values are always 'Yes' or 'No', MAX works because 'Yes' sorts after 'No':
SELECT
FilterID,
Field1Response = MAX(Field1),
Field2Response = MAX(Field2),
Field3Response = MAX(Field3)
...
FROM
Data
GROUP BY
FilterID
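Both rollups can be tried on a tiny data set. The sketch below is an assumption-laden illustration: it uses Python's sqlite3 module rather than SQL Server, writes the aliases as `expr AS alias` because SQLite does not accept the `alias = expr` form, keeps only two of the fields, and invents the sample rows. The threshold is `> 0` so that a single 'Yes' in a group is enough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Data (FilterID INTEGER, Field1 TEXT, Field2 TEXT)")
conn.executemany(
    "INSERT INTO Data VALUES (?, ?, ?)",
    [
        (1, "Yes", "No"),   # FilterID 1 has exactly one 'Yes' in Field1
        (1, "No", "No"),
        (2, "No", "No"),
        (2, "No", "Yes"),
        (2, "No", "Yes"),
    ],
)

# Rollup via SUM(CASE ...): 'Yes' if at least one row in the group says 'Yes'.
rollup_rows = conn.execute(
    """
    SELECT FilterID,
           CASE WHEN SUM(CASE WHEN Field1 = 'Yes' THEN 1 ELSE 0 END) > 0
                THEN 'Yes' ELSE 'No' END AS Field1Response,
           CASE WHEN SUM(CASE WHEN Field2 = 'Yes' THEN 1 ELSE 0 END) > 0
                THEN 'Yes' ELSE 'No' END AS Field2Response
    FROM Data
    GROUP BY FilterID
    ORDER BY FilterID
    """
).fetchall()

# Rollup via MAX: relies on 'Yes' comparing greater than 'No' as text.
max_rows = conn.execute(
    """
    SELECT FilterID, MAX(Field1), MAX(Field2)
    FROM Data
    GROUP BY FilterID
    ORDER BY FilterID
    """
).fetchall()

print(rollup_rows)
print(max_rows)
```

On this data the two queries return the same rows. The MAX version would stop agreeing as soon as a third value that sorts after 'Yes' appeared in a column.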

Oracle SQL: Using COUNT() >1 When Combining two CASE WHEN Statements

I have a line of SQL which produces a count of purchases variable
count(distinct case when t.transaction_sub_type =1 then t.transaction_date end) as COUNTPUR,
I need to modify this so I can produce a 0/1 flag variable, which flags if a customer is a repeat purchaser. So, when a customer's purchases are greater than 1 then flag as 1 else flag as 0.
case when COUNTPUR>1 then 1 else 0 end as FLAG_REPEATPURCHASER
I need to combine these two case statements into one. I have been experimenting with different versions of the syntax, but I can't seem to nail it down. Below is one of the experiments, which does not work.
max(case when (count(distinct case when t.transaction_sub_type =1 then t.transaction_date end))>1 then 1 else 0 end) as FLAG_REPEATPURCHASER,
Thanks in advance for assistance.
You can use a case expression with conditional aggregation:
(case when count(distinct case when t.transaction_sub_type = 1 then t.transaction_date end) > 1
then 1 else 0
end) as FLAG_REPEATPURCHASER
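As a rough check of that expression, the following sketch runs it with Python's sqlite3 module instead of Oracle. The table name `transactions`, the `customer_id` grouping column, and the sample rows are all hypothetical, since the original query's FROM and GROUP BY clauses are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "customer_id INTEGER, transaction_sub_type INTEGER, transaction_date TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (1, 1, "2020-01-01"), (1, 1, "2020-02-01"),  # two distinct purchase dates
        (2, 1, "2020-01-05"), (2, 1, "2020-01-05"),  # same date twice: one distinct
        (3, 2, "2020-01-01"), (3, 2, "2020-03-01"),  # no sub type 1 rows at all
    ],
)

# The outer CASE wraps the aggregate directly; no second aggregate is needed.
flags = conn.execute(
    """
    SELECT t.customer_id,
           COUNT(DISTINCT CASE WHEN t.transaction_sub_type = 1
                               THEN t.transaction_date END) AS countpur,
           CASE WHEN COUNT(DISTINCT CASE WHEN t.transaction_sub_type = 1
                                         THEN t.transaction_date END) > 1
                THEN 1 ELSE 0 END AS flag_repeatpurchaser
    FROM transactions t
    GROUP BY t.customer_id
    ORDER BY t.customer_id
    """
).fetchall()

print(flags)
```

Only the first customer is flagged, because the flag depends on distinct purchase dates rather than on the raw number of rows.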

count boolean column, and average another column based on boolean column

CREATE TABLE test (
calculate_time int4 NULL,
status bool NULL
);
INSERT INTO test (calculate_time,status) VALUES
(10,true)
,(15,true)
,(20,true)
,(20,true)
,(5,false)
,(10,false)
,(15,false)
,(100,NULL)
,(200,NULL)
,(300,NULL)
;
This query averages all calculate_time values. Is there a way I can tell it to average only the ones where status = true? I tried adding a where clause, but that makes failed and suspended come out as 0.
select
avg(calculate_time) as cal_time,
count(case when status = true then 1 end) as completed,
count(case when status = false then 1 end) as failed,
count(case when status is null then 1 end) as suspended
from test;
You seem to understand the concept of conditional aggregation. You can also use a CASE expression for the average, just as you did for the other terms in your select:
select
avg(case when status then calculate_time end) as cal_time,
count(case when status then 1 end) as completed,
count(case when not status then 1 end) as failed,
count(case when status is null then 1 end) as suspended
from test;
This works because the AVG function, like most of the other aggregate functions, ignores NULL values. For the records whose status is not true, the CASE expression yields NULL, so their calculate_time values are ignored and do not influence the overall average.
As a side note, you may use boolean values in a Postgres query directly, without comparing them to true. That is, the following two CASE expressions are equivalent, with the second one being more terse:
avg(case when status = true then calculate_time end) as cal_time,
avg(case when status then calculate_time end) as cal_time,
Adding to @Tim's answer, since Postgres 9.4 you can add a filter clause to aggregate function calls, which may save you some of the boilerplate of writing your own case expressions:
select
avg(calculate_time) filter (where status) as cal_time,
count(*) filter (where status) as completed,
count(*) filter (where not status) as failed,
count(*) filter (where status is null) as suspended
from test;
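The numbers for the sample table can be checked with a small sketch. It assumes Python's sqlite3 module rather than Postgres, stores the boolean status as 1, 0 or NULL (SQLite has no separate boolean type), and uses the CASE form of the query; the unconditional average is included only for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (calculate_time INTEGER, status INTEGER)")
conn.executemany(
    "INSERT INTO test (calculate_time, status) VALUES (?, ?)",
    [
        (10, 1), (15, 1), (20, 1), (20, 1),     # status = true
        (5, 0), (10, 0), (15, 0),               # status = false
        (100, None), (200, None), (300, None),  # status is null
    ],
)

plain_avg, cal_time, completed, failed, suspended = conn.execute(
    """
    SELECT AVG(calculate_time),                            -- every row
           AVG(CASE WHEN status THEN calculate_time END),  -- true rows only
           COUNT(CASE WHEN status THEN 1 END),
           COUNT(CASE WHEN NOT status THEN 1 END),
           COUNT(CASE WHEN status IS NULL THEN 1 END)
    FROM test
    """
).fetchone()

print(plain_avg, cal_time, completed, failed, suspended)
```

Because no WHERE clause removes rows, the failed and suspended counts stay intact while the average looks only at the four completed rows.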

Summarize multiple transactions that share a common value but also have unique row identifiers

I'm currently working with a data set that has a large number of unique groups, and within each group there could be one to many unique rows, to describe the type of transaction applicable to that group. There are a limited number of types of transactions, and each transaction has a(n):
Amount
Location
Date
The data set would look something like this:
What I would like it to do is combine all groups into a single line, and have three columns for each type of transaction. I am trying to get the end result to look like this:
The closest I have gotten is to try a number of joins, based on the claim number while looking for the unique key. Unfortunately my results end up looking like this:
Any suggestions on how to get each unique Group to have only one row in the results, with the three Types spread out, having three columns each?
You can do this all with conditional aggregation:
select grp,
sum(case when type = 'S' then amount else null end) as type_s_amt,
min(case when type = 'S' then location else null end) as type_s_loc,
min(case when type = 'S' then date else null end) as type_s_dt,
sum(case when type = 'O' then amount else null end) as type_o_amt,
min(case when type = 'O' then location else null end) as type_o_loc,
min(case when type = 'O' then date else null end) as type_o_dt,
sum(case when type = 'F' then amount else null end) as type_f_amt,
min(case when type = 'F' then location else null end) as type_f_loc,
min(case when type = 'F' then date else null end) as type_f_dt
from tbl
group by grp
Fiddle: http://sqlfiddle.com/#!3/f7fae/5/0
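A cut-down version of that query can be run locally as a sketch. It assumes Python's sqlite3 module in place of the SQL Server fiddle, covers only types 'S' and 'O' for brevity, and uses invented rows, since the original data set is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (grp TEXT, type TEXT, amount INTEGER, location TEXT, date TEXT)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?, ?)",
    [
        ("A", "S", 100, "NY", "2015-01-01"),
        ("A", "O", 50, "LA", "2015-01-02"),
        ("B", "S", 20, "TX", "2015-02-01"),
        ("B", "S", 5, "TX", "2015-02-03"),  # group B has no type 'O' rows
    ],
)

# One output row per grp; each type gets its own amount/location/date columns.
pivoted = conn.execute(
    """
    SELECT grp,
           SUM(CASE WHEN type = 'S' THEN amount END)   AS type_s_amt,
           MIN(CASE WHEN type = 'S' THEN location END) AS type_s_loc,
           MIN(CASE WHEN type = 'S' THEN date END)     AS type_s_dt,
           SUM(CASE WHEN type = 'O' THEN amount END)   AS type_o_amt,
           MIN(CASE WHEN type = 'O' THEN location END) AS type_o_loc,
           MIN(CASE WHEN type = 'O' THEN date END)     AS type_o_dt
    FROM tbl
    GROUP BY grp
    ORDER BY grp
    """
).fetchall()

for row in pivoted:
    print(row)
```

A group with no rows of a given type gets NULL (None in Python) in all three of that type's columns, and when a group has several rows of one type, MIN keeps the smallest location and date while SUM adds the amounts.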

nested SQL queries on one table

I am having trouble formulating a query to get the desired output.
This query involves one table and two columns.
The first column, bld_stat, has 4 different values: Private, public, Public-Abandoned, Private-Abandoned. The other column, bld_type, has the values single_flr, multi_flr, trailer, Whs.
I need to get results that look like this:
So far I can get the first two columns but after that I have not been able to logically get a query to work
SELECT bld_stat, COUNT(grade) AS single_flr
FROM (SELECT bld_stat,bld_type
FROM bld_inventory WHERE bld_type = 'single_flr') AS grade
GROUP BY bld_stat,bld_type,grade
The term you are going for is pivoting. I think this should work...no need for the subquery, and I've changed your group by to only bld_stat
SELECT bld_stat,
sum(case when bld_type = 'single_flr' then 1 else 0 end) AS single_flr,
sum(case when bld_type = 'multi_flr' then 1 else 0 end) AS multi_flr,
sum(case when bld_type = 'trailer' then 1 else 0 end) AS trailer,
sum(case when bld_type = 'whs' then 1 else 0 end) AS WHS
FROM bld_inventory
GROUP BY bld_stat
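Since every pivot column follows the same pattern, the column list can also be generated instead of typed by hand, which avoids mismatched literals. This is only a sketch under assumptions: it uses Python's sqlite3 module, a made-up handful of rows, and lowercase type values; the list of types must come from trusted code because the aliases are interpolated into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bld_inventory (bld_stat TEXT, bld_type TEXT)")
conn.executemany(
    "INSERT INTO bld_inventory VALUES (?, ?)",
    [
        ("Private", "single_flr"),
        ("Private", "single_flr"),
        ("Private", "trailer"),
        ("Public", "multi_flr"),
        ("Public", "whs"),
    ],
)

# Trusted, hard-coded list of pivot values (hypothetical spelling of the types).
bld_types = ["single_flr", "multi_flr", "trailer", "whs"]

# One SUM(CASE ...) column per type; the value is bound as a parameter,
# the alias is quoted and taken from the trusted list above.
columns = ", ".join(
    f'SUM(CASE WHEN bld_type = ? THEN 1 ELSE 0 END) AS "{t}"' for t in bld_types
)
query = (
    f"SELECT bld_stat, {columns} "
    "FROM bld_inventory GROUP BY bld_stat ORDER BY bld_stat"
)
pivot = conn.execute(query, bld_types).fetchall()

for row in pivot:
    print(row)
```

Each output row holds the bld_stat followed by one count per entry in bld_types, in the same order as the list.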