Combine like lines based off of date - sql

I'm having trouble finishing this query. The end result I'm looking for is:
So far I have written this query:
SELECT
to_char(favi.date::date, 'mm-dd-yyyy') AS "Date",
(SELECT COUNT(dv.severity) WHERE dv.severity='Critical') AS "Critical Vulns",
(SELECT COUNT(dv.severity) WHERE dv.severity='Severe') AS "Severe Vulns",
(SELECT COUNT(dv.severity) WHERE dv.severity='Moderate') AS "Moderate Vulns",
COUNT(favi.asset_id) AS "Asset Count"
FROM fact_asset_vulnerability_instance favi
JOIN dim_vulnerability dv ON dv.vulnerability_id=favi.vulnerability_id
GROUP BY to_char(favi.date::date, 'mm-dd-yyyy'), dv.severity
ORDER BY to_char(favi.date::date, 'mm-dd-yyyy')
My results though are:
How exactly do I combine the rows on the date? The results split the date into three rows each one displaying the actual count for whatever severity it is. I just want one row per date that shows each of the severity values on one line.
I've played around with UNION, DISTINCT, FULL JOIN, etc., but getting the desired results is slightly beyond my knowledge.

The code below is what ended up working.
SELECT
to_char(favi.date::date, 'mm-dd-yyyy') AS "Date",
SUM(CASE WHEN dv.severity = 'Critical' THEN 1 END) AS "Critical Vulns",
SUM(CASE WHEN dv.severity = 'Severe' THEN 1 END) AS "Severe Vulns",
SUM(CASE WHEN dv.severity = 'Moderate' THEN 1 END) AS "Moderate Vulns",
COUNT(favi.asset_id) AS "Asset Count"
FROM fact_asset_vulnerability_instance favi
JOIN dim_vulnerability dv ON dv.vulnerability_id=favi.vulnerability_id
GROUP BY to_char(favi.date::date, 'mm-dd-yyyy')
ORDER BY to_char(favi.date::date, 'mm-dd-yyyy')
By using SUM over a CASE expression, as @SOS suggested, I was able to drop dv.severity from the GROUP BY, so each date collapses into a single row.
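The conditional-aggregation pattern above can be tried out on a tiny in-memory table. This is only a sketch: it uses SQLite through Python's sqlite3 module rather than the original database, and the table name vulns, its columns, and the sample rows are made up for illustration. It also adds ELSE 0, an optional variation that makes a date with no matching rows show 0 instead of NULL.

```python
import sqlite3

# Hypothetical miniature of the joined fact/dimension rows: (day, severity).
rows = [
    ("2020-01-01", "Critical"),
    ("2020-01-01", "Severe"),
    ("2020-01-01", "Severe"),
    ("2020-01-02", "Moderate"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vulns (day TEXT, severity TEXT)")
conn.executemany("INSERT INTO vulns VALUES (?, ?)", rows)

# Group by day only; each SUM(CASE ...) counts just its own severity,
# so every day produces exactly one output row.
pivot = conn.execute(
    """
    SELECT day,
           SUM(CASE WHEN severity = 'Critical' THEN 1 ELSE 0 END) AS critical,
           SUM(CASE WHEN severity = 'Severe'   THEN 1 ELSE 0 END) AS severe,
           SUM(CASE WHEN severity = 'Moderate' THEN 1 ELSE 0 END) AS moderate
    FROM vulns
    GROUP BY day
    ORDER BY day
    """
).fetchall()

for row in pivot:
    print(row)
```

Grouping by severity as well would bring back the three-rows-per-date shape from the question, since each group would then contain a single severity.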

Related

How to solve a nested aggregate function in SQL?

I'm trying to use a nested aggregate function. I know that SQL does not support it, but I really need to do something like the query below. Basically, I want to count the number of users for each day, but only users who haven't completed an order within a 15-day window (relative to that day) and who have completed at least one order within a 30-day window (relative to that day). I already know that a regular subquery cannot solve this, because the subquery's values cannot change for each date. The "id" and "state" attributes belong to the orders. I'm using Fivetran with Snowflake.
SELECT
db.created_at::date as Date,
count(case when
(count(case when (db.state = 'finished')
and (db.created_at::date between dateadd(day,-15,Date) and dateadd(day,-1,Date)) then db.id end)
= 0) and
(count(case when (db.state = 'finished')
and (db.created_at::date between dateadd(day,-30,Date) and dateadd(day,-16,Date)) then db.id end)
> 0) then db.user end)
FROM
data_base as db
WHERE
db.created_at::date between '2020-01-01' and dateadd(day,-1,current_date)
GROUP BY Date
In other words, I want to transform the query below so that "current_date" changes for each date.
WITH completed_15_days_before AS (
select
db.user as User,
count(case when db.state = 'finished' then db.id end) as Completed
from
data_base as db
where
db.created_at::date between dateadd(day,-15,current_date) and dateadd(day,-1,current_date)
group by User
),
completed_16_days_before AS (
select
db.user as User,
count(case when db.state = 'finished' then db.id end) as Completed
from
data_base as db
where
db.created_at::date between dateadd(day,-30,current_date) and dateadd(day,-16,current_date)
group by User
)
SELECT
date(db.created_at) as Date,
count(distinct case when comp_15.completed = 0 and comp_16.completed > 0 then comp_15.user end) as "Total Users Churn",
count(distinct case when comp_15.completed > 0 then comp_15.user end) as "Total Users Active",
week(Date) as Week
FROM
data_base as db
left join completed_15_days_before as comp_15 on comp_15.user = db.user
left join completed_16_days_before as comp_16 on comp_16.user = db.user
WHERE
db.created_at::date between '2020-01-01' and dateadd(day,-1,current_date)
GROUP BY Date
Does anyone have a clue on how to solve this puzzle? Thank you very much!
The following should give you roughly what you want. It is difficult to test without sample data, but it should be a good enough starting point that you can then amend to get exactly what you need.
I've commented the code to explain what each section is doing.
-- set parameter for the first date you want to generate the resultset for
set start_date = TO_DATE('2020-01-01','YYYY-MM-DD');
-- calculate the number of days between the start_date and the current date
set num_days = (Select datediff(day, $start_date , current_date()+1));
--generate a list of all the dates from the start date to the current date
-- i.e. every date that needs to appear in the resultset
WITH date_list as (
select
dateadd(
day,
'-' || row_number() over (order by null),
dateadd(day, '+1', current_date())
) as date_item
from table (generator(rowcount => ($num_days)))
)
--Create a list of all the orders that are in scope
-- i.e. 30 days before the start_date up to the current date
-- amend WHERE clause to in/exclude records as appropriate
,order_list as (
SELECT created_at, rt_id
from data_base
where created_at between dateadd(day,-30,$start_date) and current_date()
and state = 'finished'
)
SELECT dl.date_item
,COUNT (DISTINCT ol30.RT_ID) AS USER_COUNT
,COUNT (ol30.RT_ID) as ORDER_COUNT
FROM date_list dl
-- get all orders between -30 and -16 days of each date in date_list
left outer join order_list ol30 on ol30.created_at between dateadd(day,-30,dl.date_item) and dateadd(day,-16,dl.date_item)
-- exclude records that have the same RT_ID as in the ol30 dataset but have a date between 0 and -15 days of the date in date_list
WHERE NOT EXISTS (SELECT ol15.RT_ID
FROM order_list ol15
WHERE ol30.RT_ID = ol15.RT_ID
AND ol15.created_at between dateadd(day,-15,dl.date_item) and dl.date_item)
GROUP BY dl.date_item
ORDER BY dl.date_item;
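The same idea (one row per calendar date, with the 30-day and 15-day windows measured relative to that date) can be sketched on a toy data set. Everything here is an assumption made for illustration: it runs on SQLite via Python's sqlite3 rather than Snowflake, a recursive CTE stands in for generator(), and the orders table, its user_id column, and the sample rows are hypothetical. The NOT EXISTS sits in the join condition rather than in WHERE so that dates with no churned users still appear with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id TEXT, created_at TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("u1", "2020-01-05", "finished"),  # u1 orders once and never again
        ("u2", "2020-01-05", "finished"),  # u2 comes back on 2020-01-25
        ("u2", "2020-01-25", "finished"),
    ],
)

churn = conn.execute(
    """
    WITH RECURSIVE date_list(d) AS (
        -- every date that must appear in the result set
        SELECT '2020-01-20'
        UNION ALL
        SELECT date(d, '+1 days') FROM date_list WHERE d < '2020-02-05'
    )
    SELECT dl.d, COUNT(DISTINCT o30.user_id) AS churned_users
    FROM date_list dl
    -- finished orders 30 to 16 days before each date ...
    LEFT JOIN orders o30
           ON o30.state = 'finished'
          AND o30.created_at BETWEEN date(dl.d, '-30 days') AND date(dl.d, '-16 days')
          -- ... from users with no finished order 15 to 1 days before it
          AND NOT EXISTS (
                SELECT 1
                FROM orders o15
                WHERE o15.user_id = o30.user_id
                  AND o15.state = 'finished'
                  AND o15.created_at BETWEEN date(dl.d, '-15 days') AND date(dl.d, '-1 days')
          )
    GROUP BY dl.d
    ORDER BY dl.d
    """
).fetchall()

churn_by_day = dict(churn)
print(churn_by_day["2020-01-21"], churn_by_day["2020-01-26"])
```

Because the date list drives the query, the windows move with each row, which is what a fixed current_date in a CTE cannot do.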

Extract subset from the main query in SQL

In the below query, I get the count of customers who were active between "2017-09-01 00:00:00" and "2017-11-31 23:59:59" as cust_90 and would like to add another column to find the count of customers who were active between "2017-11-01 00:00:00" and "2017-11-31 23:59:59" (a subset of the whole period).
select custid, count(distinct concat(visit1, visit2)) as cust_90
from test1
where date_time between "2017-09-01 00:00:00" and "2017-11-31 23:59:59"
and custid = '234214124'
group by custid;
Sample output:
CustomerName cust_90 cust_30
David 38 15
Wondering whether I could have a subquery in the above query to find the customers active in a month. Any suggestions would be great.
This is called conditional aggregation, which can be done using a CASE expression.
select custid,
count(distinct concat(visit1, visit2)) as cust_90,
count(distinct case when to_date(date_time)>='2017-11-01' then concat(visit1, visit2) end) as cust_30
from test1
where date_time >= '2017-09-01' and date_time < '2017-12-01'
and custid = '234214124'
group by custid;
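To see how the two counts relate, here is a small sketch on SQLite via Python's sqlite3. The visits table, its single visit column (standing in for concat(visit1, visit2)), and the sample rows are all hypothetical. The half-open range (>= start, < first day of the next month) also avoids having to name the last day of November.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (custid TEXT, visit TEXT, date_time TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        ("c1", "a", "2017-09-10 10:00:00"),
        ("c1", "b", "2017-10-02 09:30:00"),
        ("c1", "b", "2017-11-05 14:00:00"),  # same visit key, seen again in November
        ("c1", "c", "2017-11-20 08:15:00"),
        ("c1", "d", "2017-12-03 12:00:00"),  # outside the 90-day period
    ],
)

counts = conn.execute(
    """
    SELECT custid,
           -- all distinct visits in the whole period
           COUNT(DISTINCT visit) AS cust_90,
           -- CASE yields NULL outside November, and COUNT ignores NULLs
           COUNT(DISTINCT CASE WHEN date_time >= '2017-11-01' THEN visit END) AS cust_30
    FROM visits
    WHERE date_time >= '2017-09-01' AND date_time < '2017-12-01'
    GROUP BY custid
    """
).fetchall()

print(counts)
```

The WHERE clause defines the outer period once, and the CASE expression narrows it per column, so no subquery is needed.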

Postgres Count with different condition on the same query

I'm working on a report which has this following schema: http://sqlfiddle.com/#!15/fd104/2
The current query, which works fine, looks like this:
Basically it is a three-table inner join. I did not write this query; the developer who did has left, and I want to modify it. As you can see, TotalApplication just counts the applications per a.agent_id, and you can see the totalapplication column in the result. I want to remove that column and replace it with two new columns, completedsurvey and partialsurvey. So basically this part will become:
SELECT a.agent_id as agent_id, COUNT(a.id) as CompletedSurvey
FROM forms a WHERE a.created_at >= '2015-08-01' AND
a.created_at <= '2015-08-31' AND disposition = 'Completed Survey'
GROUP BY a.agent_id
I just added AND disposition = 'Completed Survey'. But I need another column for partialsurvey, which uses the same query as completedsurvey; the only differences are
AND disposition = 'Partial Survey'
and
COUNT(a.id) as PartialSurvey
But I don't know where to put that query or what the combined query should look like. The final output should have these columns:
agent_id, name, completedsurvey, partialsurvey, loginhours, applicationperhour, rph
Once that works, I can fix applicationperhour and rph myself.
If I understand you correctly, you are looking for a filtered (conditional) aggregate:
SELECT a.agent_id as agent_id,
COUNT(a.id) filter (where disposition = 'Completed Survey') as CompletedSurvey,
count(a.id) filter (where disposition = 'Partial Survey') as partial_survey
FROM forms a
WHERE a.created_at >= '2015-08-01'
AND a.created_at <= '2015-08-31'
GROUP BY a.agent_id;
The above assumes the current version of Postgres (which is 9.4 at the time of writing). For older versions (< 9.4) you need to use a CASE expression, as the FILTER clause is not supported there:
SELECT a.agent_id as agent_id,
COUNT(case when disposition = 'Completed Survey' then a.id end) as CompletedSurvey,
COUNT(case when disposition = 'Partial Survey' then a.id end) as partial_survey
FROM forms a
WHERE a.created_at >= '2015-08-01'
AND a.created_at <= '2015-08-31'
GROUP BY a.agent_id;

Work Around for SQL Query 'NOT IN' that takes forever?

I am trying to run a query on an Oracle 10g DB to try and view 2 groups of transactions. I want to view basically anyone who has a transaction this year (2014) that also had a transaction in the previous 5 years. I then want to run a query for anyone who has a transaction this year (2014) that hasn't ordered from us in the last 5 years. I assumed I could do this with the 'IN' and 'NOT IN' features. The 'IN' query runs fine but the 'NOT IN' never completes. DB is fairly large which is probably why. Would love any suggestions from the experts!
Notes: TEXT is a description of our customer's company name. Sometimes the accounting department didn't tie this to our customer ID, which left NULL values, so using TEXT as my primary grouping seemed to work, although the name is obscure. CODE_D is a product line, included just to give the name some context.
Below is my code:
SELECT CODE_D, sum(coalesce(credit_amount, 0) - coalesce(debet_amount,0)) as TOTAL
FROM
gen_led_voucher_row_tab
WHERE ACCOUNTING_YEAR like '2014'
and TEXT NOT IN
(select TEXT
from gen_led_voucher_row_tab
where voucher_date >= '01-JUN-09'
and voucher_date < '01-JUN-14'
and (credit_amount > '1' or debet_amount > '1')
)
GROUP BY CODE_D
ORDER BY TOTAL DESC
Try using a LEFT JOIN instead of NOT IN:
-- Oracle does not accept AS before a table alias, so the aliases follow the table name directly
SELECT t1.CODE_D, sum(coalesce(t1.credit_amount, 0) - coalesce(t1.debet_amount,0)) as TOTAL
FROM gen_led_voucher_row_tab t1
LEFT JOIN gen_led_voucher_row_tab t2
ON t1.TEXT = t2.TEXT
AND t2.voucher_date >= '01-JUN-09'
AND t2.voucher_date < '01-JUN-14'
AND (t2.credit_amount > '1' or t2.debet_amount > '1')
WHERE t2.TEXT IS NULL
AND t1.ACCOUNTING_YEAR = '2014'
GROUP BY t1.CODE_D
ORDER BY TOTAL DESC
Also, make sure you have an index on the TEXT column.
You can improve performance by changing the NOT IN clause to a WHERE NOT EXISTS, as follows:
Where Not Exists
(
Select 1
From gen_led_voucher_row_tab b
Where voucher_date >= '01-JUN-09'
and voucher_date < '01-JUN-14'
and (credit_amount > '1' or debet_amount > '1')
And a.Text = b.Text
)
You'll need to alias the first table as a for this to work. Essentially, with NOT IN you're pulling back a ton of data just to discard it. NOT EXISTS can be executed as an anti-join, which only checks whether a matching row exists instead of pulling the data back, so you should see significant improvement.
Edit
Your query, as of the current update to the question, should be this:
SELECT CODE_D,
sum(coalesce(credit_amount, 0) - coalesce(debet_amount,0)) as TOTAL
FROM gen_led_voucher_row_tab a
Where ACCOUNTING_YEAR like '2014'
And Not Exists
(
Select 1
From gen_led_voucher_row_tab b
Where voucher_date >= '01-JUN-09'
and voucher_date < '01-JUN-14'
and (credit_amount > '1' or debet_amount > '1')
And a.Text = b.Text
)
GROUP BY CODE_D
ORDER BY TOTAL DESC
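Apart from speed, NOT IN and NOT EXISTS can return different rows when the subquery column contains NULLs, which may matter here if TEXT can be NULL. The sketch below illustrates only that semantic difference, not Oracle's performance. It runs on SQLite via Python's sqlite3, and the vouchers table, its cust_text and yr columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vouchers (cust_text TEXT, yr INTEGER)")
conn.executemany(
    "INSERT INTO vouchers VALUES (?, ?)",
    [
        ("Acme", 2014),
        ("Acme", 2012),  # Acme also ordered in an earlier year
        ("Bolt", 2014),  # Bolt is new in 2014
        (None, 2011),    # an earlier row with no customer text
    ],
)

# NOT IN: the NULL in the subquery makes the test "unknown" for Bolt,
# so no row passes the filter.
not_in = conn.execute(
    """
    SELECT cust_text FROM vouchers
    WHERE yr = 2014
      AND cust_text NOT IN (SELECT cust_text FROM vouchers WHERE yr < 2014)
    """
).fetchall()

# NOT EXISTS: only asks whether a matching earlier row exists,
# so the NULL row is simply never a match.
not_exists = conn.execute(
    """
    SELECT a.cust_text FROM vouchers a
    WHERE a.yr = 2014
      AND NOT EXISTS (
            SELECT 1 FROM vouchers b
            WHERE b.yr < 2014 AND b.cust_text = a.cust_text
      )
    """
).fetchall()

print(not_in, not_exists)
```

If the NOT IN form is kept, filtering NULLs out of the subquery is one way to make the two forms agree.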

SQL Select case to create new value

Hello guys I am sorry but I didn’t know what I should call this question.
I have a table that records, among other things, how long it took from when a row was created until it was last updated. The two timestamps are stored in the following columns:
CREATED
LAST_UPD
The time difference between these are shown in a separate column called:
SOLVED_SEC
(The time is shown in seconds)
Now I want to collect some of the data from this table, but should CREATED (which is a date) fall outside of our company’s opening hours, SOLVED_SEC should be recalculated in my
Our opening hours exist in a table called KS_DRIFT.SYS_DATE_KS.
This table has a column named: THIS_DATE_OPENING.
I was thinking that I could calculate the new solved time as such:
THIS_DATE_OPENING-LAST_UPD
However, I’m not quite sure how to do this.
The following is the SQL that I have right now:
SELECT
TIDSPUNKT, LAST_UPD, AA.CREATED,
TRUNC(AA.SOLVED_SEC/60/60,2) as LØST_TIME,
//this is my attempt
CASE
WHEN AA.CREATED >= CC.THIS_DATE_CLOSING
THEN LØST_TIME = (LAST_UPD-CC.THIS_DATE_OPENING) AS LØST_TIME
END,
COUNT(CASE WHEN AA.LAST_UPD >= CC.THIS_DATE_CLOSING THEN 1 END) as AFTER_CLOSING,
COUNT(CASE WHEN STATUS ='Færdig' THEN 1 END)as Completed_Callbacks
FROM
KS_DRIFT.NYK_SIEBEL_CALLBACK_AGENT_H_V AA
INNER JOIN
KS_DRIFT.V_TEAM_DATO BB ON AA.TIDSPUNKT = BB.DATO
RIGHT JOIN
KS_DRIFT.SYS_DATE_KS CC ON AA.TIDSPUNKT = CC.THIS_DATE
WHERE
AA.TIDSPUNKT BETWEEN '2012-04-01' AND '2013-04-04'
AND AA.AFSLUTTET_AF = BB.INITIALER
GROUP BY
AA.TIDSPUNKT, LØST_SEKUNDER, LAST_UPD, AA.CREATED
Sadly this doesn’t work.
My question is: how can I change the value of SOLVED_SEC if CREATED > THIS_DATE_CLOSING?
Should you require additional information please do not hesitate to comment.
UPDATE
I have tried the following:
CASE WHEN (AA.CREATED >= CC.THIS_DATE_CLOSING) THEN (AA.LAST_UPD-CC.THIS_DATE_OPENING) END AS SOVLED_AFTER_OPENING
However, I get
"not a GROUP BY expression"
UPDATE 2
My SQL statement now looks like this:
SELECT TIDSPUNKT,
AA.CREATED,
LAST_UPD,
AA.AGENTGRUPPE,
TRUNC(AA.LØST_SEKUNDER/60/60,2) as LØST_TIME,
'LØST_TIME' = CASE WHEN AA.CREATED >= CC.THIS_DATE_CLOSING THEN DATEDIFF(ss, CC.THIS_DATE_OPENING, LAST_UPD) END,
COUNT(CASE WHEN AA.AGENTGRUPPE not in('Hovednumre','Privatcentre','Forsikring','Hotline','Stabe','Kunder','Erhverv','NykreditKunder','Servicecentret') THEN 1 END) as CALLBACKS_OUTSIDE_OF_KS,
COUNT(CASE WHEN AA.CREATED >= CC.THIS_DATE_CLOSING THEN 1 END) as AFTER_CLOSING,
COUNT(CASE WHEN STATUS ='Færdig' THEN 1 END)as Completed_Callbacks
FROM KS_DRIFT.NYK_SIEBEL_CALLBACK_AGENT_H_V AA
INNER JOIN KS_DRIFT.V_TEAM_DATO BB ON AA.TIDSPUNKT = BB.DATO
RIGHT JOIN KS_DRIFT.SYS_DATE_KS CC ON AA.TIDSPUNKT = CC.THIS_DATE
WHERE AA.TIDSPUNKT BETWEEN '2013-04-01' AND '2013-04-04'
AND AA.AFSLUTTET_AF = BB.INITIALER
GROUP BY AA.TIDSPUNKT, LAST_UPD, AA.CREATED, AA.LØST_SEKUNDER,
AA.AFSLUTTET_AF, AA.AGENTGRUPPE
However, I get "FROM keyword not found where expected".
To calculate differences in time in SQL Server, use the DATEDIFF function.
For example:
SELECT DATEDIFF(ss, THIS_DATE_OPENING, LAST_UPD);
Here "ss" denotes seconds. The first parameter is the datepart in which the difference is measured, such as seconds, minutes, or days.
You can find the documentation here: http://msdn.microsoft.com/en-us/library/aa258269(v=sql.80).aspx
Let me know if I didn't understand your question correctly.
I also just noticed that you're trying to do this:
CASE WHEN AA.CREATED >= CC.THIS_DATE_CLOSING THEN LØST_TIME =(LAST_UPD-CC.THIS_DATE_OPENING) AS LØST_TIME END
Try this instead:
'LØST_TIME' = CASE WHEN AA.CREATED >= CC.THIS_DATE_CLOSING THEN DATEDIFF(ss, CC.THIS_DATE_OPENING, LAST_UPD) END
HTH