Postgres Count with different condition on the same query


I'm working on a report that uses the following schema: http://sqlfiddle.com/#!15/fd104/2
The current query works fine and looks like this:
Basically, it is a three-table inner join. I did not write this query; the developer who wrote it has left, and I want to modify it. As you can see, TotalApplication just counts the total applications per a.agent_id, and it appears as the totalapplication column in the result. I want to remove that column and replace it with two new columns: completedsurvey and partialsurvey. So basically this part will become
SELECT a.agent_id as agent_id, COUNT(a.id) as CompletedSurvey
FROM forms a WHERE a.created_at >= '2015-08-01' AND
a.created_at <= '2015-08-31' AND disposition = 'Completed Survey'
GROUP BY a.agent_id
I just added AND disposition = 'Completed Survey'. But I need another column for partialsurvey, which uses the same query as completedsurvey; the only differences are
AND disposition = 'Partial Survey'
and
COUNT(a.id) as PartialSurvey
But I don't know where to put that query or what the combined query should look like. The final output should have these columns:
agent_id, name, completedsurvey, partialsurvey, loginhours, applicationperhour, rph
Once that works, I can fix applicationperhour and rph myself.

If I understand you correctly, you are looking for a filtered (conditional) aggregate:
SELECT a.agent_id as agent_id,
COUNT(a.id) filter (where disposition = 'Completed Survey') as CompletedSurvey,
count(a.id) filter (where disposition = 'Partial Survey') as partial_survey
FROM forms a
WHERE a.created_at >= '2015-08-01'
AND a.created_at <= '2015-08-31'
GROUP BY a.agent_id;
The above assumes the current version of Postgres (which is 9.4 at the time of writing). For older versions (< 9.4) you need to use a CASE expression, as the FILTER clause is not supported there:
SELECT a.agent_id as agent_id,
COUNT(case when disposition = 'Completed Survey' then a.id end) as CompletedSurvey,
COUNT(case when disposition = 'Partial Survey' then a.id end) as partial_survey
FROM forms a
WHERE a.created_at >= '2015-08-01'
AND a.created_at <= '2015-08-31'
GROUP BY a.agent_id;
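To try the conditional aggregate without a Postgres instance, here is a minimal sketch that runs the CASE variant against an in-memory SQLite database from Python. The sample rows are made up for illustration (they are not taken from the fiddle), and only the columns the query touches are modelled.

```python
import sqlite3

# Hypothetical miniature of the forms table; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE forms ("
    "id INTEGER PRIMARY KEY, agent_id INTEGER, disposition TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO forms (agent_id, disposition, created_at) VALUES (?, ?, ?)",
    [
        (1, "Completed Survey", "2015-08-03"),
        (1, "Completed Survey", "2015-08-10"),
        (1, "Partial Survey", "2015-08-11"),
        (2, "Partial Survey", "2015-08-05"),
        (2, "Completed Survey", "2015-09-01"),  # outside the date range
    ],
)

# COUNT ignores NULLs, and a CASE without ELSE yields NULL for
# non-matching rows, so each COUNT only sees its own disposition.
query = """
SELECT agent_id,
       COUNT(CASE WHEN disposition = 'Completed Survey' THEN id END) AS completedsurvey,
       COUNT(CASE WHEN disposition = 'Partial Survey' THEN id END) AS partialsurvey
FROM forms
WHERE created_at >= '2015-08-01'
  AND created_at <= '2015-08-31'
GROUP BY agent_id
ORDER BY agent_id
"""
survey_counts = conn.execute(query).fetchall()
print(survey_counts)
```

Both counts come from a single pass over forms, so the result can be joined to the other tables the same way the original TotalApplication subquery was.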

Related

How to solve a nested aggregate function in SQL?

I'm trying to use a nested aggregate function. I know that SQL does not support it, but I really need to do something like the query below. Basically, I want to count the number of users for each day, but only users who have not completed an order within the 15-day window before that day and who have completed at least one order within the 30-day window before that day. I already know that this cannot be solved with a regular subquery, because the subquery's values cannot change for each date. The "id" and "state" attributes relate to the orders. Also, I'm using Fivetran with Snowflake.
SELECT
db.created_at::date as Date,
count(case when
(count(case when (db.state = 'finished')
and (db.created_at::date between dateadd(day,-15,Date) and dateadd(day,-1,Date)) then db.id end)
= 0) and
(count(case when (db.state = 'finished')
and (db.created_at::date between dateadd(day,-30,Date) and dateadd(day,-16,Date)) then db.id end)
> 0) then db.user end)
FROM
data_base as db
WHERE
db.created_at::date between '2020-01-01' and dateadd(day,-1,current_date)
GROUP BY Date
In other words, I want to transform the below query in a way that the "current_date" changes for each date.
WITH completed_15_days_before AS (
select
db.user as User,
count(case when db.state = 'finished' then db.id end) as Completed
from
data_base as db
where
db.created_at::date between dateadd(day,-15,current_date) and dateadd(day,-1,current_date)
group by User
),
completed_16_days_before AS (
select
db.user as User,
count(case when db.state = 'finished' then db.id end) as Completed
from
data_base as db
where
db.created_at::date between dateadd(day,-30,current_date) and dateadd(day,-16,current_date)
group by User
)
SELECT
date(db.created_at) as Date,
count(distinct case when comp_15.Completed = 0 and comp_16.Completed > 0 then comp_15.user end) as "Total Users Churn",
count(distinct case when comp_15.Completed > 0 then comp_15.user end) as "Total Users Active",
week(Date) as Week
FROM
data_base as db
left join completed_15_days_before as comp_15 on comp_15.user = db.user
left join completed_16_days_before as comp_16 on comp_16.user = db.user
WHERE
db.created_at::date between '2020-01-01' and dateadd(day,-1,current_date)
GROUP BY Date
Does anyone have a clue on how to solve this puzzle? Thank you very much!

The following should give you roughly what you want. It is difficult to test without sample data, but it should be a good enough starting point for you to amend into exactly what you need.
I've commented the code to explain what each section is doing.
-- set parameter for the first date you want to generate the resultset for
set start_date = TO_DATE('2020-01-01','YYYY-MM-DD');
-- calculate the number of days between the start_date and the current date
set num_days = (Select datediff(day, $start_date , current_date()+1));
--generate a list of all the dates from the start date to the current date
-- i.e. every date that needs to appear in the resultset
WITH date_list as (
select
dateadd(
day,
'-' || row_number() over (order by null),
dateadd(day, '+1', current_date())
) as date_item
from table (generator(rowcount => ($num_days)))
)
--Create a list of all the orders that are in scope
-- i.e. 30 days before the start_date up to the current date
-- amend WHERE clause to in/exclude records as appropriate
,order_list as (
SELECT created_at, rt_id
from data_base
where created_at between dateadd(day,-30,$start_date) and current_date()
and state = 'finished'
)
SELECT dl.date_item
,COUNT (DISTINCT ol30.RT_ID) AS USER_COUNT
,COUNT (ol30.RT_ID) as ORDER_COUNT
FROM date_list dl
-- get all orders between -30 and -16 days of each date in date_list
left outer join order_list ol30 on ol30.created_at between dateadd(day,-30,dl.date_item) and dateadd(day,-16,dl.date_item)
-- exclude records that have the same RT_ID as in the ol30 dataset but have a date between 0 and -15 of the date in date_list
WHERE NOT EXISTS (SELECT ol15.RT_ID
FROM order_list ol15
WHERE ol30.RT_ID = ol15.RT_ID
AND ol15.created_at between dateadd(day,-15,dl.date_item) and dl.date_item)
GROUP BY dl.date_item
ORDER BY dl.date_item;
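The same "one row per date, then look back from each date" idea can be tried locally. The sketch below is an assumption-laden port to SQLite driven from Python: a recursive CTE stands in for Snowflake's generator, the orders table and its three rows are invented, and the NOT EXISTS test sits in the join condition (a deliberate variation) so that a date is still listed when every candidate user is excluded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for data_base; user_id plays the role of the user column.
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, state TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders (user_id, state, created_at) VALUES (?, ?, ?)",
    [
        (1, "finished", "2020-01-01"),
        (2, "finished", "2020-01-01"),
        (2, "finished", "2020-01-18"),  # user 2 comes back on the 18th
    ],
)

query = """
WITH RECURSIVE date_list(date_item) AS (
    -- one row per reporting date (the range is hard-coded for the demo)
    SELECT '2020-01-17'
    UNION ALL
    SELECT date(date_item, '+1 days') FROM date_list
    WHERE date_item < '2020-01-20'
)
SELECT dl.date_item,
       COUNT(DISTINCT o30.user_id) AS churned_users
FROM date_list dl
LEFT JOIN orders o30
  ON o30.state = 'finished'
 -- finished an order 30 to 16 days before the reporting date ...
 AND o30.created_at BETWEEN date(dl.date_item, '-30 days')
                        AND date(dl.date_item, '-16 days')
 -- ... and nothing finished in the 15 days before it
 AND NOT EXISTS (
       SELECT 1
       FROM orders o15
       WHERE o15.user_id = o30.user_id
         AND o15.state = 'finished'
         AND o15.created_at BETWEEN date(dl.date_item, '-15 days')
                                AND date(dl.date_item, '-1 days'))
GROUP BY dl.date_item
ORDER BY dl.date_item
"""
churn_by_date = conn.execute(query).fetchall()
for row in churn_by_date:
    print(row)
```

User 2 stops counting as churned from 2020-01-19 onward, because the order on the 18th falls inside the 15-day look-back for those dates.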

Join table with filtering by date

I have two tables: wallets and wallet_histories. I can currently select the sum of all completed deposits and withdrawals for each wallet.
The SQL looks something like this:
SELECT w.*, wh.deposit, wh2.withdrawal, wh3.pending
FROM wallets w
LEFT JOIN (
SELECT wallet_id, SUM(amount) deposit
FROM wallet_histories
WHERE type = 'Deposit'
AND status = 'Completed'
AND timestamp BETWEEN '2019-08-02 00:00:00+00' AND '2019-08-03 00:00:00+00'
GROUP BY wallet_id
) wh ON w.id = wh.wallet_id
LEFT JOIN (
SELECT wallet_id, SUM(amount) withdrawal
FROM wallet_histories
WHERE type = 'Withdrawal'
AND status = 'Completed'
AND timestamp BETWEEN '2019-08-02 00:00:00+00' AND '2019-08-03 00:00:00+00'
GROUP BY wallet_id
) wh2 ON w.id = wh2.wallet_id
LEFT JOIN (
SELECT wallet_id, SUM(amount) pending
FROM wallet_histories
WHERE status = 'Pending'
GROUP BY wallet_id
) wh3 ON w.id = wh3.wallet_id
Next, I need to make the same selection, but filtered by the enabled_at field from the wallets table.
I tried adding one more condition to the WHERE clause:
AND timestamp >= w.enabled_at
but got this error:
LINE 10: AND timestamp >= w.enabled_at
^
HINT: There is an entry for table "w", but it cannot be referenced from this part of the query.
How can I avoid this error (maybe by fully rebuilding the query)?
Thank you in advance.

You need to add the condition in each subquery. The results of the aggregations do not have enabled_at, so that column is not available in the outer query -- either in the on or where clauses.
That said, you can drastically simplify your query using conditional aggregation:
SELECT wallet_id,
SUM(amount) FILTER (WHERE type = 'Deposit' AND status = 'Completed') as deposit,
SUM(amount) FILTER (WHERE type = 'Withdrawal' AND status = 'Completed') as withdrawal,
SUM(amount) FILTER (WHERE status = 'Pending') as pending
FROM wallet_histories
WHERE (type IN ('Deposit', 'Withdrawal') AND
status = 'Completed' AND
timestamp >= '2019-08-02' AND
timestamp < '2019-08-03'
) OR
status = 'Pending'
GROUP BY wallet_id;
This might make it easier to add additional conditions.
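As a rough illustration of how the enabled_at condition could be added once the three subqueries are collapsed, the sketch below joins wallets directly to wallet_histories and puts the cutoff in the join condition. It runs on in-memory SQLite from Python, uses CASE in place of FILTER for portability, and assumes (this is a guess at the requirement) that the enabled_at cutoff should apply to pending rows as well. All sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wallets (id INTEGER PRIMARY KEY, enabled_at TEXT)")
conn.execute(
    "CREATE TABLE wallet_histories ("
    "wallet_id INTEGER, type TEXT, status TEXT, amount INTEGER, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO wallets VALUES (?, ?)",
    [
        (1, "2019-08-02 12:00:00"),
        (2, "2019-08-01 00:00:00"),
        (3, "2019-08-01 00:00:00"),  # wallet with no history at all
    ],
)
conn.executemany(
    "INSERT INTO wallet_histories VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Deposit", "Completed", 100, "2019-08-02 10:00:00"),  # before enabled_at
        (1, "Deposit", "Completed", 50, "2019-08-02 13:00:00"),
        (1, "Withdrawal", "Completed", 20, "2019-08-02 14:00:00"),
        (2, "Deposit", "Completed", 70, "2019-08-02 09:00:00"),
        (2, "Deposit", "Pending", 30, "2019-08-05 09:00:00"),
        (2, "Withdrawal", "Completed", 10, "2019-08-04 09:00:00"),  # outside date range
    ],
)

# Because wallets is now part of the same FROM clause, w.enabled_at is in
# scope. Keeping the filters in the ON clause (not WHERE) preserves wallets
# that have no matching history rows.
query = """
SELECT w.id,
       SUM(CASE WHEN h.type = 'Deposit' AND h.status = 'Completed'
                THEN h.amount END) AS deposit,
       SUM(CASE WHEN h.type = 'Withdrawal' AND h.status = 'Completed'
                THEN h.amount END) AS withdrawal,
       SUM(CASE WHEN h.status = 'Pending' THEN h.amount END) AS pending
FROM wallets w
LEFT JOIN wallet_histories h
  ON h.wallet_id = w.id
 AND h.timestamp >= w.enabled_at
 AND (h.status = 'Pending'
      OR (h.status = 'Completed'
          AND h.timestamp >= '2019-08-02'
          AND h.timestamp < '2019-08-03'))
GROUP BY w.id
ORDER BY w.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A sum with no qualifying rows comes back as NULL (None in Python), which matches what the original LEFT JOIN version returns for wallets without matching history.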

SQL adding one filter in outer query to apply to all subqueries

I am creating a query that contains multiple sub-queries that show number of incidents in different status/category/etc. A date filter will need to be applied to all sub-queries, in order to count number of incidents created within the date range.
Because the report will be moved to Business Objects, I cannot specify the dates multiple times in the sub-queries. Hence I joined the incident table (inc) in the sub-queries with another instance of the incident table (inc_filter) in the outer query, hoping to apply one date filter to all sub-queries.
But the result returned was incorrect: I got multiple rows, each with a value of either 0 or 1.
Could anyone please point me in the right direction?
SELECT
(SELECT COUNT(*)
FROM Incident inc
WHERE inc.id = inc_filter.id
AND inc.status = 'Open')
"Total # of Open Inc",
(SELECT COUNT(*)
FROM Incident inc
WHERE inc.id = inc_filter.id
AND inc.status = 'Closed')
"Total # of Closed Inc"
--more sub-queries here...
FROM Incident inc_filter
WHERE inc_filter.CREATED > '10-Apr-2017'
AND inc_filter.CREATED < '13-Apr-2017'

You are probably simply looking for conditional aggregation:
SELECT
COUNT(CASE WHEN status = 'Open' THEN 1 END) AS "Total # of Open Inc",
COUNT(CASE WHEN status = 'Closed' THEN 1 END) AS "Total # of Closed Inc"
-- more counts here...
FROM Incident
WHERE created >= DATE '2017-04-10' AND created < DATE '2017-04-13';

First, you should use CASE; you don't need multiple subqueries.
Second, if I understood your question correctly, you should use SUM() like this:
SELECT sum(case when inc_filter.status = 'Open' then 1 else 0 end) as open,
sum(case when inc_filter.status = 'Closed' then 1 else 0 end) as closed
FROM Incident inc_filter
WHERE inc_filter.CREATED > '10-Apr-2017'
AND inc_filter.CREATED < '13-Apr-2017'
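The COUNT(CASE ...) and SUM(CASE ... ELSE 0 END) forms give the same number whenever the date range matches at least one row, but they differ when it matches none. The sketch below shows this on an in-memory SQLite table from Python; the incident rows and the lowercase table name are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incident (id INTEGER PRIMARY KEY, status TEXT, created TEXT)"
)
conn.executemany(
    "INSERT INTO incident (status, created) VALUES (?, ?)",
    [
        ("Open", "2017-04-10"),
        ("Open", "2017-04-11"),
        ("Closed", "2017-04-12"),
        ("Closed", "2017-04-20"),
    ],
)

# The date filter appears once, in the WHERE clause, and applies to
# every conditional aggregate in the SELECT list.
query = """
SELECT COUNT(CASE WHEN status = 'Open' THEN 1 END)       AS open_count,
       SUM(CASE WHEN status = 'Open' THEN 1 ELSE 0 END)  AS open_sum
FROM incident
WHERE created >= ? AND created < ?
"""
in_range = conn.execute(query, ("2017-04-10", "2017-04-13")).fetchone()
empty_range = conn.execute(query, ("2018-01-01", "2018-01-02")).fetchone()
print(in_range)     # both columns agree
print(empty_range)  # COUNT gives 0, SUM gives NULL (None)
```

If the report must show 0 rather than a blank for an empty range, COUNT is the safer choice, or the SUM can be wrapped in COALESCE(..., 0).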

You need one more SELECT wrapped around this query that sums your counts. Your subqueries use an aggregate function, but it is applied per row of the outer query, so you need to sum all of the subquery columns to get the required result. For example, your query would be:
select sum("Total # of Open Inc") ,sum("Total # of Closed Inc")
from(
SELECT
(SELECT COUNT(*)
FROM Incident inc
WHERE inc.id = inc_filter.id
AND inc.status = 'Open')
"Total # of Open Inc",
(SELECT COUNT(*)
FROM Incident inc
WHERE inc.id = inc_filter.id
AND inc.status = 'Closed')
"Total # of Closed Inc"
--more sub-queries here...
FROM Incident inc_filter
WHERE inc_filter.CREATED > '10-Apr-2017'
AND inc_filter.CREATED < '13-Apr-2017');
But it is better for you to use joins.

Work Around for SQL Query 'NOT IN' that takes forever?

I am trying to run a query on an Oracle 10g DB to view 2 groups of transactions. First, I want to view anyone who has a transaction this year (2014) and also had a transaction in the previous 5 years. I then want to run a query for anyone who has a transaction this year (2014) but hasn't ordered from us in the last 5 years. I assumed I could do this with the 'IN' and 'NOT IN' features. The 'IN' query runs fine but the 'NOT IN' never completes. The DB is fairly large, which is probably why. Would love any suggestions from the experts!
*Note: [TEXT] is a description of our customer's company name. Sometimes the accounting department didn't tie this to our customer ID, which left NULL values, so using TEXT as my primary grouping seemed to work, although the name is obscure. CODE_D is a product line, included just to bring context to the name.
Below is my code:
SELECT CODE_D, sum(coalesce(credit_amount, 0) - coalesce(debet_amount,0)) as TOTAL
FROM
gen_led_voucher_row_tab
WHERE ACCOUNTING_YEAR like '2014'
and TEXT NOT IN
(select TEXT
from gen_led_voucher_row_tab
where voucher_date >= '01-JUN-09'
and voucher_date < '01-JUN-14'
and (credit_amount > '1' or debet_amount > '1')
)
GROUP BY CODE_D
ORDER BY TOTAL DESC

Try using a LEFT JOIN instead of NOT IN:
SELECT t1.CODE_D, sum(coalesce(t1.credit_amount, 0) - coalesce(t1.debet_amount,0)) as TOTAL
FROM gen_led_voucher_row_tab t1
LEFT JOIN gen_led_voucher_row_tab t2
ON t1.TEXT = t2.TEXT
AND t2.voucher_date >= '01-JUN-09'
AND t2.voucher_date < '01-JUN-14'
AND (t2.credit_amount > '1' or t2.debet_amount > '1')
WHERE t2.TEXT IS NULL
AND t1.ACCOUNTING_YEAR = '2014'
GROUP BY t1.CODE_D
ORDER BY TOTAL DESC
Also, make sure you have an index on the TEXT column.

You can increase your performance by changing the NOT IN clause to a WHERE NOT EXISTS, as follows:
Where Not Exists
(
Select 1
From gen_led_voucher_row_tab b
Where voucher_date >= '01-JUN-09'
and voucher_date < '01-JUN-14'
and (credit_amount > '1' or debet_amount > '1')
And a.Text = b.Text
)
You'll need to alias the first table as a for this to work. Essentially, you're pulling back a ton of data just to discard it. NOT EXISTS can be evaluated as an anti-join, which only checks whether a matching row exists rather than pulling back the subquery's data, so you should see significant improvement.
Edit
Your query, as of the current update to the question should be this:
SELECT CODE_D,
sum(coalesce(credit_amount, 0) - coalesce(debet_amount,0)) as TOTAL
FROM gen_led_voucher_row_tab a
Where ACCOUNTING_YEAR like '2014'
And Not Exists
(
Select 1
From gen_led_voucher_row_tab b
Where voucher_date >= '01-JUN-09'
and voucher_date < '01-JUN-14'
and (credit_amount > '1' or debet_amount > '1')
And a.Text = b.Text
)
GROUP BY CODE_D
ORDER BY TOTAL DESC
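One thing to keep in mind when swapping NOT IN for NOT EXISTS (or for the LEFT JOIN ... IS NULL form): the two are not equivalent when the subquery can return NULL. The sketch below demonstrates the difference on in-memory SQLite from Python; the two small tables and their rows are hypothetical stand-ins for "this year" and "previous years", not the real gen_led_voucher_row_tab.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE current_year (text TEXT)")
conn.execute("CREATE TABLE previous_years (text TEXT)")
conn.executemany(
    "INSERT INTO current_year VALUES (?)", [("Acme",), ("Globex",)]
)
# One of the earlier rows has no customer description at all.
conn.executemany(
    "INSERT INTO previous_years VALUES (?)", [("Acme",), (None,)]
)

# 'Globex' NOT IN ('Acme', NULL) evaluates to NULL, not TRUE,
# so the row is filtered out and NOT IN returns nothing.
not_in_rows = conn.execute(
    "SELECT text FROM current_year "
    "WHERE text NOT IN (SELECT text FROM previous_years)"
).fetchall()

# NOT EXISTS only asks whether a matching row exists; the NULL row
# never matches, so 'Globex' is reported as a new customer.
not_exists_rows = conn.execute(
    "SELECT c.text FROM current_year c "
    "WHERE NOT EXISTS (SELECT 1 FROM previous_years p WHERE p.text = c.text)"
).fetchall()

print(not_in_rows)
print(not_exists_rows)
```

So if TEXT is ever NULL in the five-year history, the NOT EXISTS rewrite may return more rows than the original NOT IN did, which is usually the intended result.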

SQL Select case to create new value

Hello guys, I am sorry, but I didn’t know what I should call this question.
I have a table that records, among other things, how long it took from when a row was created until it was last updated. The two timestamps are stored in the following columns:
CREATED
LAST_UPD
The time difference between them is stored in a separate column called:
SOLVED_SEC
(The time is shown in seconds)
Now I want to collect some of the data from this table, but if CREATED (which is a date) falls outside our company’s opening hours, SOLVED_SEC should be recalculated.
Our opening hours are stored in a table called KS_DRIFT.SYS_DATE_KS.
This table has a column named THIS_DATE_OPENING.
I was thinking that I could calculate the new solved time like this:
THIS_DATE_OPENING-LAST_UPD
However, I’m not quite sure how to do this.
The following is the SQL that I have right now:
SELECT
TIDSPUNKT, LAST_UPD, AA.CREATED,
TRUNC(AA.SOLVED_SEC/60/60,2) as LØST_TIME,
//this is my attempt
CASE
WHEN AA.CREATED >= CC.THIS_DATE_CLOSING
THEN LØST_TIME = (LAST_UPD-CC.THIS_DATE_OPENING) AS LØST_TIME
END,
COUNT(CASE WHEN AA.LAST_UPD >= CC.THIS_DATE_CLOSING THEN 1 END) as AFTER_CLOSING,
COUNT(CASE WHEN STATUS ='Færdig' THEN 1 END)as Completed_Callbacks
FROM
KS_DRIFT.NYK_SIEBEL_CALLBACK_AGENT_H_V AA
INNER JOIN
KS_DRIFT.V_TEAM_DATO BB ON AA.TIDSPUNKT = BB.DATO
RIGHT JOIN
KS_DRIFT.SYS_DATE_KS CC ON AA.TIDSPUNKT = CC.THIS_DATE
WHERE
AA.TIDSPUNKT BETWEEN '2012-04-01' AND '2013-04-04'
AND AA.AFSLUTTET_AF = BB.INITIALER
GROUP BY
AA.TIDSPUNKT, LØST_SEKUNDER, LAST_UPD, AA.CREATED
Sadly, this doesn’t work.
My question is: how can I change the value of SOLVED_SEC if CREATED > THIS_DATE_CLOSING?
Should you require additional information, please do not hesitate to comment.
UPDATE
I have tried the following:
CASE WHEN (AA.CREATED >= CC.THIS_DATE_CLOSING) THEN (AA.LAST_UPD-CC.THIS_DATE_OPENING) END AS SOVLED_AFTER_OPENING
However, I get
"not a GROUP BY expression"
UPDATE 2
My SQL statement now looks like this:
SELECT TIDSPUNKT,
AA.CREATED,
LAST_UPD,
AA.AGENTGRUPPE,
TRUNC(AA.LØST_SEKUNDER/60/60,2) as LØST_TIME,
'LØST_TIME' = CASE WHEN AA.CREATED >= CC.THIS_DATE_CLOSING THEN DATEDIFF(ss, CC.THIS_DATE_OPENING, LAST_UPD) END,
COUNT(CASE WHEN AA.AGENTGRUPPE not in('Hovednumre','Privatcentre','Forsikring','Hotline','Stabe','Kunder','Erhverv','NykreditKunder','Servicecentret') THEN 1 END) as CALLBACKS_OUTSIDE_OF_KS,
COUNT(CASE WHEN AA.CREATED >= CC.THIS_DATE_CLOSING THEN 1 END) as AFTER_CLOSING,
COUNT(CASE WHEN STATUS ='Færdig' THEN 1 END)as Completed_Callbacks
FROM KS_DRIFT.NYK_SIEBEL_CALLBACK_AGENT_H_V AA
INNER JOIN KS_DRIFT.V_TEAM_DATO BB ON AA.TIDSPUNKT = BB.DATO
RIGHT JOIN KS_DRIFT.SYS_DATE_KS CC ON AA.TIDSPUNKT = CC.THIS_DATE
WHERE AA.TIDSPUNKT BETWEEN '2013-04-01' AND '2013-04-04'
AND AA.AFSLUTTET_AF = BB.INITIALER
GROUP BY AA.TIDSPUNKT, LAST_UPD, AA.CREATED, AA.LØST_SEKUNDER,
AA.AFSLUTTET_AF, AA.AGENTGRUPPE
However, I get "FROM keyword not found where expected".

You need to use the DATEDIFF function when trying to calculate differences in time. (DATEDIFF is SQL Server syntax, as documented in the link below; it is not built into Oracle.)
For example:
SELECT DATEDIFF(ss, THIS_DATE_OPENING, LAST_UPD);
Here "ss" denotes seconds. The first parameter is the datepart in which you want the difference expressed, such as seconds, minutes, or days.
You can find the documentation here http://msdn.microsoft.com/en-us/library/aa258269(v=sql.80).aspx
Let me know if I didn't understand your question correctly.
I also just noticed that you're trying to do this:
CASE WHEN AA.CREATED >= CC.THIS_DATE_CLOSING THEN LØST_TIME =(LAST_UPD-CC.THIS_DATE_OPENING) AS LØST_TIME END
Try this instead:
'LØST_TIME' = CASE WHEN AA.CREATED >= CC.THIS_DATE_CLOSING THEN DATEDIFF(ss, CC.THIS_DATE_OPENING, LAST_UPD) END
HTH
