I am struggling with joining the below two queries.
My main query is the first one below and what I am trying to achieve is the output of query 2's opt in rate column as a new column in my original query.
SELECT CAST("public"."event_event"."date_created" AS date) AS "date_created", "marketing_message__via__messag"."name" AS "name", count(*) AS "count"
FROM "public"."event_event"
LEFT JOIN "public"."marketing_message" "marketing_message__via__messag" ON "public"."event_event"."message_id" = "marketing_message__via__messag"."id" LEFT JOIN "public"."marketing_campaign" "marketing_campaign__via__campa" ON "public"."event_event"."campaign_id" = "marketing_campaign__via__campa"."id"
WHERE (date_trunc('month', CAST("public"."event_event"."date_created" AS timestamp)) = date_trunc('month', CAST(now() AS timestamp))
AND "marketing_message__via__messag"."name" IS NOT NULL AND ("marketing_message__via__messag"."name" <> ''
OR "marketing_message__via__messag"."name" IS NULL) AND "public"."event_event"."stage" = 'Lead' AND ("marketing_campaign__via__campa"."name" = 'a'
OR "marketing_campaign__via__campa"."name" = 'b'
OR "marketing_campaign__via__campa"."name" = 'c'
OR "marketing_campaign__via__campa"."name" = 'c1' OR "marketing_campaign__via__campa"."name" = 'd'
OR "marketing_campaign__via__campa"."name" = 'e'))
GROUP BY CAST("public"."event_event"."date_created" AS date), "marketing_message__via__messag"."name"
ORDER BY CAST("public"."event_event"."date_created" AS date) ASC, "marketing_message__via__messag"."name" ASC
I would like to add the following query output for "Opt-In rate" to my query above as a new column.
Query 2
SELECT marketing_message.message_text,
cast(sum((event_event.status='Opt-in')::int) as decimal) / nullif(sum((event_event.status='Sent')::int), 0)* 100 as "Opt-in Rate (Sent)"
FROM event_event
JOIN marketing_campaign ON event_event.campaign_id = marketing_campaign.id
JOIN marketing_message ON event_event.message_id = marketing_message.id
WHERE True [[AND {{campaign_name}}]] [[AND {{date_created}}]]
GROUP BY marketing_message.message_text
The short answer is that you can't.
Your first query is keyed on date and name (the GROUP BY clause) and your second query is keyed on message_text. As there is no relationship between the two datasets there is no way of joining/combining them in a single query.
You would need to find a common field (or fields) between the two datasets and join on them - but this won't give the same results that you have at the moment as your queries would need to be completely restructured.
Related
I am joining a metric table to a client listing. Currently when joined, the joined column populates for each client on all rows.
The goal is to only have the metric populate once per client grouping (hsp.PROV_ID). - SECOND IMAGE
select hsp.PROV_ID, HSP.id from hsp_client hsp
left join(select vat.ASGN_PROV_ID, sum(vat.ASGN_DFI_CNT) as 'Deficiency Count', sum(vat.DLQ_DFI_CNT) as 'Delinquent Count' from V_DT_PROV_ASGN_METRICS vat
where VAT.PAT_CLASS_C IN ('101', '102','104')
and VAT.METRIC_DATE = Convert(DATE, GetDate())
--and DEF_ID IS NULL
and vat.DEF_TYPE_C not in ('9')
group by vat.ASGN_PROV_ID,vat.ASGN_DFI_CNT,vat.DLQ_DFI_CNT
) vtt on vtt.ASGN_PROV_ID =hsp.PROV_ID
group by hsp.PROV_ID, hsp.id
If I understand your problem, you just want the metrics on the "first" row for each prov_id, where "first" row is defined by hsp.id.
You don't need aggregation in the outer query, just some logic on window functions:
select hsp.PROV_ID, HSP.id,
(case when hsp.id = min(hsp.id) over (partition by hsp.prov_id)
then vtt.DeficiencyCount
end) as DeficiencyCount,
(case when hsp.id = min(hsp.id) over (partition by hsp.prov_id)
then vtt.DelinquentCount
end),
from hsp_client hsp left join
(select vat.ASGN_PROV_ID, sum(vat.ASGN_DFI_CNT) as DeficiencyCount,
sum(vat.DLQ_DFI_CNT) as DelinquentCount
from V_DT_PROV_ASGN_METRICS vat
where VAT.PAT_CLASS_C IN ('101', '102','104') and
VAT.METRIC_DATE = Convert(DATE, GetDate()) and
--and DEF_ID IS NULL
vat.DEF_TYPE_C not in ('9')
group by vat.ASGN_PROV_ID
) vtt
on vtt.ASGN_PROV_ID = hsp.PROV_ID
order by hsp.PROV_ID, hsp.id;
This logic is more normally done in the application layer, but you can do it in SQL.
I'm not even sure the aggregation is needed in the subquery. But . . . it should be only at the "prov id" level.
My SQL code is removing duplicate values of "Time" specific to Project Description. For example, if a time value for a specific project is included two or more times, the data is only pulling the value once skewing the results.
I've tried adding SUM(PMTT_DailyTime.Time) as 'Sum of Time" and this creates a different problem and inaccurate results. It multiplies the sum values by the number of an irrelevant field.
SELECT View_ProjectsInfoDecoded.ProjectNbr
, View_ProjectsInfoDecoded.Department
, View_ProjectsInfoDecoded.ProjectDesc
, View_ProjectsInfoDecoded.ProjectStartDate
, View_ProjectsInfoDecoded.ProjectCompletionDate
, View_ProjectsInfoDecoded.VoidInd
, View_ProjectsInfoDecoded.ProjectStatus
, View_ProjectsInfoDecoded.ProjectType
, DatePart("yyyy", PMTT_DailyTime.ReportDate) AS [ReportYear]
, PMTT_DailyTime.Time
, PMTT_DailyTime.VoidInd
, View_ProjectsBuilderInfoDecoded.ProjectHealth
, View_ProjectsBuilderInfoDecoded.PrimaryBuilder
, View_ProjectsBuilderInfoDecoded.CurrentProjectStatus
FROM View_ProjectsInfoDecoded
LEFT JOIN View_ProjectsBuilderInfoDecoded ON View_ProjectsInfoDecoded.Department = View_ProjectsBuilderInfoDecoded.Department AND View_ProjectsInfoDecoded.ProjectNbr = View_ProjectsBuilderInfoDecoded.ProjectNbr
LEFT JOIN PMTT_DailyTime ON (View_ProjectsBuilderInfoDecoded.Department = PMTT_DailyTime.Department) AND (View_ProjectsBuilderInfoDecoded.ProjectNbr= PMTT_DailyTime.ProjectNbr)
WHERE (View_ProjectsInfoDecoded.Department IN ('107'))
And (View_ProjectsInfoDecoded.ProjectStatus <>'Cancel')
And (dbo.View_ProjectsInfoDecoded.VoidInd = 'N' OR dbo.View_ProjectsInfoDecoded.VoidInd IS NULL)
AND (PMTT_DailyTime.VoidInd = 'N' OR PMTT_DailyTime.VoidInd IS NULL)
AND ((DATEDIFF(MONTH, View_ProjectsInfoDecoded.ProjectCompletionDate,GETDATE()) <= 12) OR (View_ProjectsInfoDecoded.ProjectCompletionDate IS NULL) OR (View_ProjectsInfoDecoded.ProjectCompletionDate='' ))
GROUP BY View_ProjectsInfoDecoded.Department, View_ProjectsInfoDecoded.ProjectNbr
, View_ProjectsInfoDecoded.ProjectDesc, View_ProjectsInfoDecoded.ProjectStatus
, View_ProjectsInfoDecoded.EstStartDate, View_ProjectsInfoDecoded.ProjectStartDate
, View_ProjectsInfoDecoded.ProjectCompletionDate, View_ProjectsInfoDecoded.Complexity
, View_ProjectsInfoDecoded.ProjectType, View_ProjectsInfoDecoded.VoidInd
, View_ProjectsBuilderInfoDecoded.ProjectHealth, View_ProjectsBuilderInfoDecoded.PrimaryBuilder
, View_ProjectsBuilderInfoDecoded.CurrentProjectStatus, PMTT_DailyTime.VoidInd
, DatePart("yyyy", PMTT_DailyTime.ReportDate), PMTT_DailyTime.Time
I think this is an easy fix in the Group function or the type of joins used. But not sure...
With a re-factoring of your query to use aliases, include line breaks and re-order columns, you will notice you have two additional GROUP BY fields that are not included in SELECT: EstStartDate and p.Complexity. As a result, the SELECT columns may show repeated values over the distinct groupings of these omitted two fields.
For a more readable aggregate query, consider including the same columns in GROUP BY also in SELECT clause without omitting any. Do note: per SQL standard, you cannot have a column in SELECT that does not appear in GROUP BY. However, the reverse as your query does is valid. Alternatively, simply run the analogous SELECT DISTINCT without GROUP BY.
SELECT p.ProjectNbr, p.Department, p.ProjectDesc, p.ProjectStartDate, p.ProjectCompletionDate,
p.VoidInd, p.ProjectStatus, p.ProjectType, DatePart("yyyy", d.ReportDate) AS [ReportYear],
d.Time, d.VoidInd, b.ProjectHealth, b.PrimaryBuilder, b.CurrentProjectStatus
FROM (View_ProjectsInfoDecoded p
LEFT JOIN View_ProjectsBuilderInfoDecoded b
ON (p.Department = b.Department) AND (p.ProjectNbr = b.ProjectNbr))
LEFT JOIN PMTT_DailyTime d
ON (b.Department = d.Department) AND (b.ProjectNbr = d.ProjectNbr)
WHERE (p.Department IN ('107'))
AND (p.ProjectStatus <> 'Cancel')
AND (dbo.p.VoidInd = 'N' OR dbo.p.VoidInd IS NULL)
AND (d.VoidInd = 'N' OR d.VoidInd IS NULL)
AND ((DATEDIFF(MONTH, p.ProjectCompletionDate, GETDATE()) <= 12)
OR (p.ProjectCompletionDate IS NULL)
OR (p.ProjectCompletionDate='')
)
GROUP BY p.ProjectNbr, p.Department, p.ProjectDesc, p.ProjectStartDate, p.ProjectCompletionDate,
p.VoidInd, p.ProjectStatus, p.ProjectType, DatePart("yyyy", d.ReportDate),
d.Time, d.VoidInd, b.ProjectHealth, b.PrimaryBuilder, b.CurrentProjectStatus,
p.EstStartDate, p.Complexity -- ADDITIONAL NON-SELECT FIELDS
You are using the Group By clause, but have no aggregate function in your Select statement. This results in the same behavior as using Select Distinct. Removing the Group By clause will include the duplicate records you seem to be looking for.
SELECT View_ProjectsInfoDecoded.ProjectNbr, View_ProjectsInfoDecoded.Department, View_ProjectsInfoDecoded.ProjectDesc, View_ProjectsInfoDecoded.ProjectStartDate, View_ProjectsInfoDecoded.ProjectCompletionDate, View_ProjectsInfoDecoded.VoidInd, View_ProjectsInfoDecoded.ProjectStatus, View_ProjectsInfoDecoded.ProjectType, DatePart("yyyy", PMTT_DailyTime.ReportDate) AS [ReportYear], PMTT_DailyTime.Time, PMTT_DailyTime.VoidInd, View_ProjectsBuilderInfoDecoded.ProjectHealth, View_ProjectsBuilderInfoDecoded.PrimaryBuilder, View_ProjectsBuilderInfoDecoded.CurrentProjectStatus
FROM (View_ProjectsInfoDecoded LEFT JOIN View_ProjectsBuilderInfoDecoded ON (View_ProjectsInfoDecoded.Department = View_ProjectsBuilderInfoDecoded.Department) AND (View_ProjectsInfoDecoded.ProjectNbr = View_ProjectsBuilderInfoDecoded.ProjectNbr)) LEFT JOIN
PMTT_DailyTime ON (View_ProjectsBuilderInfoDecoded.Department = PMTT_DailyTime.Department) AND (View_ProjectsBuilderInfoDecoded.ProjectNbr= PMTT_DailyTime.ProjectNbr)
WHERE (View_ProjectsInfoDecoded.Department IN ('107')) And (View_ProjectsInfoDecoded.ProjectStatus <>'Cancel') And
(dbo.View_ProjectsInfoDecoded.VoidInd = 'N' OR dbo.View_ProjectsInfoDecoded.VoidInd IS NULL) AND (PMTT_DailyTime.VoidInd = 'N' OR PMTT_DailyTime.VoidInd IS NULL)
AND ((DATEDIFF(MONTH, View_ProjectsInfoDecoded.ProjectCompletionDate,GETDATE()) <= 12) OR (View_ProjectsInfoDecoded.ProjectCompletionDate IS NULL) OR (View_ProjectsInfoDecoded.ProjectCompletionDate='' ))
The code below joins two tables and I need to extract only the latest date per account, though it holds multiple accounts and history records. I wanted to use the MAX function, but not sure how to incorporate it for this case. I am using My SQL server.
Appreciate any help !
select
PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label
from
Property.dbo.PROP
inner join
Property.dbo.PROP_DATA on Property.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
where
(PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV'))
and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
and (PROP.Status = 'Bound')
and (Prop.FileTime = Max(Prop.FileTime))
order by
PROP.EffDate DESC
Assuming your DBMS supports windowing functions and the with clause, a max windowing function would work:
with all_data as (
select
PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label,
max (PROP.EffDate) over (partition by PROP.PolNo) as max_date
from Actuarial.dbo.PROP
inner join Actuarial.dbo.PROP_DATA
on Actuarial.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
where (PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV'))
and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
and (PROP.Status = 'Bound')
and (Prop.FileTime = Max(Prop.FileTime))
)
select
FileName, InsName, Status, FileTime, SubmissionNo,
PolNo, EffDate, ExpDate, Region, UnderWriter, Data, Label
from all_data
where EffDate = max_date
ORDER BY EffDate DESC
This also presupposes than any given account would not have two records on the same EffDate. If that's the case, and there is no other objective means to determine the latest account, you could also use row_numer to pick a somewhat arbitrary record in the case of a tie.
Using straight SQL, you can use a self-join in a subquery in your where clause to eliminate values smaller than the max, or smaller than the top n largest, and so on. Just set the number in <= 1 to the number of top values you want per group.
Something like the following might do the trick, for example:
select
p.FileName
, p.InsName
, p.Status
, p.FileTime
, p.SubmissionNo
, p.PolNo
, p.EffDate
, p.ExpDate
, p.Region
, p.Underwriter
, pd.Data
, pd.Label
from Actuarial.dbo.PROP p
inner join Actuarial.dbo.PROP_DATA pd
on p.FileID = pd.FileID
where (
select count(*)
from Actuarial.dbo.PROP p2
where p2.FileID = p.FileID
and p2.EffDate <= p.EffDate
) <= 1
and (
pd.Label in ('Occupancy' , 'OccupancyTIV')
and p.Status = 'Bound'
)
ORDER BY p.EffDate DESC
Have a look at this stackoverflow question for a full working example.
Not tested
with temp1 as
(
select foo
from bar
whre xy = MAX(xy)
)
select PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label
from Actuarial.dbo.PROP
inner join temp1 t
on Actuarial.dbo.PROP.FileID = t.dbo.PROP_DATA.FileID
ORDER BY PROP.EffDate DESC
Looking for a better way to write this query and my SQL skills aren't great, basic really so looking for any pointers to make this better. This is only the first two columns and the full report will have a further 10.
I'm taking a specific set of repair types and doing analysis on them with counts and calculations. The 1st is jobs brought forward to the current financial year and the second is total amount of jobs currently received.
SELECT
"Type",
(
SELECT
NVL (COUNT(jjo.jjobno), 0)
FROM
jjobh jjo
WHERE
jjo.jclcode = 'L'
AND jjo.jstatus <> '6'
AND jjo.year_rec <> (
SELECT
sub_code
FROM
code_table
WHERE
main_code = 'YEAR'
)
AND (
week_comp IS NULL
OR year_comp = (
SELECT
sub_code
FROM
code_table
WHERE
main_code = 'YEAR'
)
)
AND jjo.jrepair_type = "Type"
) AS "B/F",
(
SELECT
NVL (COUNT(jjo.jjobno), 0)
FROM
jjobh jjo
WHERE
jjo.jclcode = 'L'
AND jjo.jstatus <> '6'
AND jjo.year_rec = (
SELECT
sub_code
FROM
code_table
WHERE
main_code = 'YEAR'
)
AND jjo.jrepair_type = "Type"
) AS "Recvd"
FROM
(
SELECT
rep.repair_type_code AS "Type"
FROM
repair_type rep
WHERE
rep.client = 'L'
AND rep.work_centre = '004682'
ORDER BY
rep.repair_type_code
)
ORDER BY
"Type";
Your code is a mess. I suspect you want something like:
SELECT jjo.jrepair_type, count(*) as valbf
FROM (SELECT coalesce(COUNT(jjo.jjobno), 0)
FROM jjobh jjo cross join
(SELECT sub_code
FROM code_table
WHERE main_code = 'YEAR'
) sc
WHERE jjo.jclcode = 'L' AND
jjo.jstatus <> '6' AND
jjo.year_rec <> sc.sub_code AND
(week_comp IS NULL OR
year_comp = sc.sub_code
)
) jjo join
(SELECT rep.repair_type_code AS "Type"
FROM repair_type rep
WHERE rep.client = 'L' AND
rep.work_centre = '004682'
) rtc
on jjo.jrepair_type = rtc.repair_type_code
group by jjo.jrepair_type;
It looks like you want to join the "jjo" table to the "repair type code" table, producing information about each repair type. The order by in the subquery is useless.
My suggestion is to move the "jjo" table to the outer "from". You should also move the WHERE clauses to the outermost WHERE clause (which I didn't do). I haven't quite figured out the date logic, but this might get you on the right track.
I have the following query
SELECT ProgramDate, [CountVal]= COUNT(ProgramDate)
FROM ProgramsTbl
WHERE (Type = 'Type1' AND ProgramDate = '10/18/11' )
GROUP BY ProgramDate
What happens is that if there is no record that matches the Type and ProgramDate, I do not get any records returned.
What I like to have outputted in the above is something like the following if there is no values returned. Notice how for the CountVal we have 0 even if there are no records returned that fit the match condition:
ProgramDate CountVal
10/18/11 0
This is a little more complicated than you would like however, it is very possible. You will first have to create a temporary table of dates. For example, the query below creates a range of dates from 2011-10-11 to 2011-10-20
CREATE TEMPORARY TABLE date_stamps AS
SELECT (date '2011-10-10' + new_number) AS date_stamp
FROM generate_series(1, 10) AS new_number;
Using this temporary table, you can select from it and left join your table ProgramsTbl. For example
SELECT date_stamp,COUNT(ProgramDate)
FROM date_stamps
LEFT JOIN ProgramsTbl ON ProgramsTbl.ProgramDate = date_stamps.date_stamp
WHERE Type = 'Type1'
GROUP BY ProgramDate;
Select ProgramDate, [CountVal]= SUM(occur)
from
(
SELECT ProgramDate, 1 occur
FROM ProgramsTbl
WHERE (Type = 'Type1' AND ProgramDate = '10/18/11' )
UNION
SELECT '10/18/11', 0
)
GROUP BY ProgramDate
Because each SELECT statement is really building a table of records you can use a SELECT query to build a table with both the program count and a default count of zero. This would require two SELECT queries (one to get the actual count, one to get the default count) and using a UNION to combine the two SELECT results into a single table.
From there you can SELECT from the UNIONed table to sum the CountVals (if the programDate occurs in the ProgramTable the CountVal will be
CountVal of the first query if it exists(>0) + CountVal of the second query (=0)).
This way even if there are no records for the desired programDate in ProgramTable you will get a record back indicating a count of 0.
This would look like:
SELECT ProgramDate, SUM(CountVal)
FROM
(SELECT ProgramDate, COUNT(*) AS CountVal
FROM ProgramsTbl
WHERE (Type = 'Type1' AND ProgramDate = '10/18/11' )
UNION
SELECT '10/18/11' AS ProgramDate, 0 AS CountVal) T1
Here's a solution that works on SQL Server; not sure about other db platforms:
DECLARE #Type VARCHAR(5) = 'Type1'
, #ProgramDate DATE = '10/18/2011'
SELECT pt.ProgramDate
, COUNT(pt2.ProgramDate)
FROM ( SELECT #ProgramDate AS ProgramDate
, #Type AS Type
) pt
LEFT JOIN ProgramsTbl pt2 ON pt.Type = pt2.Type
AND pt.ProgramDate = pt2.ProgramDate
GROUP BY pt.ProgramDate
Grunge but simple and efficient
SELECT '10/18/11' as 'Program Date', count(*) as 'count'
FROM ProgramsTbl
WHERE Type = 'Type1' AND ProgramDate = '10/18/11'
Try something along these lines. This will establish a row with a date of 10/18/11 that will definitely return. Then you left join to your actual data to get your desired count (which can now return 0 if there are no corresponding rows).
To do this for more than 1 date, you'd want to build a Date table that holds a list of all dates you want to query (so substitute the "select '10/18/11'" with "select Date from DateTbl").
SELECT ProgDt.ProgDate, [CountVal]= COUNT(ProgramsTbl.ProgramDate)
FROM (SELECT '10/18/11' as 'ProgDate') ProgDt
LEFT JOIN ProgramsTbl
ON ProgDt.ProgDate = ProgramsTbl.ProgramDate
WHERE (Type = 'Type1')
GROUP BY ProgDt.ProgDate
To create a date table that you can use for querying, do this (assumes SQL Server 2005+):
create table Dates (MyDate datetime)
go
insert into Dates
select top 100000 row_number() over (order by s1.name)
from master..spt_values s1, master..spt_values s2
go