SQL UNION query not working

Here are my current queries:
1
SELECT FilteredInvoice.accountidname,
FilteredInvoice.createdon,
FilteredInvoice.createdon AS sort_date,
FilteredInvoice.duedate,
FilteredInvoice.invoicenumber,
FilteredInvoice.statecodename,
FilteredInvoice.totalamount_base,
CONVERT(datetime, NULL) AS mag_paymentdate,
0 AS mag_amount_base,
GETDATE() AS Today
FROM FilteredAccount AS CRMAF_FilteredAccount
JOIN FilteredInvoice ON FilteredInvoice.accountid = CRMAF_FilteredAccount.accountid
JOIN FilteredMag_Payment ON FilteredInvoice.invoiceid = FilteredMag_Payment.mag_invoiceid
WHERE (FilteredInvoice.statecodename <> 'Canceled')
2
SELECT FilteredInvoice_1.accountidname,
FilteredInvoice_1.createdon,
FilteredInvoice_1.createdon AS sort_date,
FilteredInvoice_1.duedate,
FilteredInvoice_1.invoicenumber,
FilteredInvoice_1.statecodename,
FilteredInvoice_1.totalamount_base,
FilteredMag_Payment.mag_paymentdate,
FilteredMag_Payment.mag_amount_base,
GETDATE() AS Today
FROM FilteredAccount AS CRMAF_FilteredAccount
LEFT JOIN FilteredInvoice AS FilteredInvoice_1 ON FilteredInvoice_1.accountid = CRMAF_FilteredAccount.accountid
JOIN FilteredMag_Payment ON FilteredInvoice_1.invoiceid = FilteredMag_Payment.mag_invoiceid
WHERE (FilteredInvoice_1.statecodename <> 'Canceled')
On their own, these do exactly what I want, but as soon as I try to combine them using a "UNION" or a sub-query, the second query always breaks and displays the wrong information.
Am I just missing something obvious, or am I actually doing something wrong?
All help is appreciated.
Many thanks Simon.
EDIT:
What I mean by "wrong information" is that the 2nd query is returning all values rather than following the CRMAF_ prefix and returning only values from the account it is run on.

It's hard to guess what you mean by "wrong information", but I believe you want UNION ALL rather than UNION.
UNION removes duplicates so the records from the second query won't be returned if they were previously returned by the first query. In addition, the possible duplicates within one query will be eliminated too.
The number of records in a UNION can be less than the total count of records in two queries.
If you just want to concatenate two recordsets, use UNION ALL:
SELECT FilteredInvoice.accountidname,
FilteredInvoice.createdon,
FilteredInvoice.createdon AS sort_date,
FilteredInvoice.duedate,
FilteredInvoice.invoicenumber,
FilteredInvoice.statecodename,
FilteredInvoice.totalamount_base,
CONVERT(datetime, NULL) AS mag_paymentdate,
0 AS mag_amount_base,
GETDATE() AS Today
FROM FilteredAccount AS CRMAF_FilteredAccount
JOIN FilteredInvoice ON FilteredInvoice.accountid = CRMAF_FilteredAccount.accountid
JOIN FilteredMag_Payment ON FilteredInvoice.invoiceid = FilteredMag_Payment.mag_invoiceid
WHERE (FilteredInvoice.statecodename <> 'Canceled')
UNION ALL
SELECT FilteredInvoice_1.accountidname,
FilteredInvoice_1.createdon,
FilteredInvoice_1.createdon AS sort_date,
FilteredInvoice_1.duedate,
FilteredInvoice_1.invoicenumber,
FilteredInvoice_1.statecodename,
FilteredInvoice_1.totalamount_base,
FilteredMag_Payment.mag_paymentdate,
FilteredMag_Payment.mag_amount_base,
GETDATE() AS Today
FROM FilteredAccount AS CRMAF_FilteredAccount
LEFT JOIN FilteredInvoice AS FilteredInvoice_1 ON FilteredInvoice_1.accountid = CRMAF_FilteredAccount.accountid
JOIN FilteredMag_Payment ON FilteredInvoice_1.invoiceid = FilteredMag_Payment.mag_invoiceid
WHERE (FilteredInvoice_1.statecodename <> 'Canceled')
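The difference between UNION and UNION ALL can be seen on a toy data set. The following is only a sketch: it uses SQLite driven from Python rather than SQL Server, and the tables invoices_a and invoices_b and their rows are made up for illustration. The duplicate-handling behaviour it shows is standard SQL.

```python
import sqlite3

# In-memory database with two hypothetical invoice tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices_a (invoicenumber TEXT, amount INTEGER);
    CREATE TABLE invoices_b (invoicenumber TEXT, amount INTEGER);
    -- INV-1 appears twice in the first table, INV-2 appears in both tables.
    INSERT INTO invoices_a VALUES ('INV-1', 100), ('INV-1', 100), ('INV-2', 250);
    INSERT INTO invoices_b VALUES ('INV-2', 250), ('INV-3', 75);
""")

# UNION removes duplicates, both across the two queries and within each one.
union_rows = conn.execute("""
    SELECT invoicenumber, amount FROM invoices_a
    UNION
    SELECT invoicenumber, amount FROM invoices_b
    ORDER BY invoicenumber
""").fetchall()

# UNION ALL simply concatenates the two result sets.
union_all_rows = conn.execute("""
    SELECT invoicenumber, amount FROM invoices_a
    UNION ALL
    SELECT invoicenumber, amount FROM invoices_b
""").fetchall()

print(len(union_rows), len(union_all_rows))
```

Here UNION keeps one copy of each distinct row, while UNION ALL keeps all five input rows.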

It looks to me as though you should be able to get the same results as you would from the UNIONed query, with the following:
SELECT FilteredInvoice_1.accountidname,
FilteredInvoice_1.createdon,
FilteredInvoice_1.createdon AS sort_date,
FilteredInvoice_1.duedate,
FilteredInvoice_1.invoicenumber,
FilteredInvoice_1.statecodename,
FilteredInvoice_1.totalamount_base,
CASE PF.pay_flag
WHEN 0.0 THEN CONVERT(datetime, NULL)
ELSE FilteredMag_Payment.mag_paymentdate
END AS mag_paymentdate,
FilteredMag_Payment.mag_amount_base * PF.pay_flag AS mag_amount_base,
GETDATE() AS Today
FROM FilteredAccount AS CRMAF_FilteredAccount
CROSS JOIN (SELECT 1.0 pay_flag UNION SELECT 0.0) AS PF
JOIN FilteredInvoice AS FilteredInvoice_1 ON FilteredInvoice_1.accountid = CRMAF_FilteredAccount.accountid
LEFT JOIN FilteredMag_Payment ON FilteredInvoice_1.invoiceid = FilteredMag_Payment.mag_invoiceid
WHERE (FilteredInvoice_1.statecodename <> 'Canceled') AND
(PF.pay_flag = 0 OR FilteredMag_Payment.mag_invoiceid IS NOT NULL)
EDIT: LEFT JOIN FilteredMag_Payment
FURTHER EDIT: added final parenthesised OR condition to WHERE clause.

Related

How could I join these queries together?

I have 2 queries. One includes a subquery and the other is a pretty basic query. I can't figure out how to combine them to return a table with name, workdate, minutes_inactive, and hoursworked.
I have the code below for what I have tried. The simple query is lines 1,2, and the last 5 lines. I also added a join clause (join punchclock p on p.servrepID = l.repid) to it.
Both these queries ran on their own so this is solely just the problem of combining them.
select
sr.sr_name as liaison, cast(date_created as date) workdate,
(count(date_created) * 4) as minutes_inactive,
(select
sr_name, cast(punchintime as date) as workdate,
round(sum(cast(datediff(minute,punchintime, punchouttime) as real) / 60), 2) as hoursworked,
count(*) as punches
from
(select
sr_name, punchintime = punchdatetime,
punchouttime = isnull((select top 1 pc2.punchdatetime
from punchclock pc2
where pc2.punchdatetime > pc.punchdatetime
and pc.servrepid = pc2.servrepid
and pc2.inout = 0
order by pc2.punchdatetime), getdate())
from
punchclock pc
join
servicereps sr on pc.servrepid = sr.servrepid
where
punchyear >= 2017 and pc.inout = 1
group by
sr_name, cast(punchintime as date)))
from
tbl_liga_popup_log l
join
servicereps sr on sr.servrepID = l.repid
join
punchclock p on p.servrepID = l.repid collate latin1_general_bin
group by
cast(l.date_created as date), sr.sr_name
I get this error:
Msg 102, Level 15, State 1, Line 19
Incorrect syntax near ')'
I keep getting this error but there are more errors if I adjust that part.
I don't know that we'll fix everything here, but there are a few issues with your query.
You have to alias your sub-query (technically a derived table, but whatever)
You have two froms in your outer query.
You have to join to the derived table.
Here's a crude example:
select
<some stuff>
from
(select col1 from table1) t1
inner join t2
on t1.col1 = t2.col2
The main error here is that you are placing queries in the select list (before the from). You can only do this if the query returns a single value. Otherwise, you have to put your query in parentheses (you have done this) in the from section, give it an alias, and join it accordingly.
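As a rough illustration of the "alias it and join it" pattern, here is a small sketch using SQLite from Python (not SQL Server). The tables reps, punches and popup_log and their contents are hypothetical stand-ins, not the real schema. Each derived table is aggregated first, given an alias, and then joined in the FROM clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables for illustration only.
    CREATE TABLE reps (servrepid INTEGER, sr_name TEXT);
    CREATE TABLE punches (servrepid INTEGER, minutes_worked INTEGER);
    CREATE TABLE popup_log (repid INTEGER);
    INSERT INTO reps VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO punches VALUES (1, 240), (1, 180), (2, 300);
    INSERT INTO popup_log VALUES (1), (1), (1), (2);
""")

rows = conn.execute("""
    SELECT r.sr_name,
           w.hoursworked,
           i.minutes_inactive
    FROM reps AS r
    -- derived table 1: hours per rep, aliased as w
    JOIN (SELECT servrepid, SUM(minutes_worked) / 60.0 AS hoursworked
          FROM punches
          GROUP BY servrepid) AS w
      ON w.servrepid = r.servrepid
    -- derived table 2: inactivity per rep, aliased as i
    JOIN (SELECT repid, COUNT(*) * 4 AS minutes_inactive
          FROM popup_log
          GROUP BY repid) AS i
      ON i.repid = r.servrepid
    ORDER BY r.sr_name
""").fetchall()

print(rows)
```

Aggregating inside each derived table before joining also avoids multiplying punch rows by log rows, which would inflate the sums.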
You also seem to be using group bys that are not needed anywhere. I can't see aggregation functions like sum().
My best bet is that you are looking for the following query:
select
sr.sr_name as liaison
,cast(l.date_created as date) as workdate
,count(distinct l.date_created) * 4 as minutes_inactive
,cast(pc.punchdatetime as date) as punchdate
-- a punch-in with no later punch-out is treated as still open (ends now)
,round(sum(cast(datediff(minute, pc.punchdatetime, isnull(q1.punchouttime, getdate())) as real) / 60), 2) as hoursworked
,count(*) as punches
from
punchclock pc
inner join servicereps sr on pc.servrepid = sr.servrepid
-- outer apply keeps punch-ins that have no matching punch-out yet
outer apply
(
select top 1 pc2.punchdatetime as punchouttime
from punchclock pc2
where pc2.punchdatetime > pc.punchdatetime
and pc.servrepid = pc2.servrepid
and pc2.inout = 0
order by pc2.punchdatetime
) q1
-- caution: this join repeats every punch once per log row, so check hoursworked and punches
inner join tbl_liga_popup_log l on sr.servrepID = l.repid
where punchyear >= 2017 and pc.inout = 1
group by sr.sr_name, cast(l.date_created as date), cast(pc.punchdatetime as date)

Why is SQL doing an inner join where an outer join is needed

I have two tables which I want to "outer" join (and then fetch) using SQL. The exact SQL query (in question) is:
SELECT
LEFT(a.cusip, 6) AS cusip6,
a.date, a.prc, a.ret, a.vol, a.spread, a.shrout,
b.epsf12, (b.seqq-b.pstkq) / b.cshoq AS bps
FROM
crsp.msf a
FULL JOIN
compa.fundq b ON (LEFT(a.cusip, 6) = LEFT(b.cusip, 6)
AND a.date = b.datadate)
WHERE
(b.datadate BETWEEN '2010-01-01' and '2015-12-31')
AND (a.date BETWEEN '2010-01-01' and '2015-12-31')
AND (b.cshoq > 0)
This returns 670'293 rows.
But when I fetch the two datasets separately and (outer) join them through R-merge(), I get 1'182'093 rows. The two separate queries I use are:
SELECT
LEFT(cusip, 6) AS cusip6, date, prc, ret, vol, spread, shrout
FROM
crsp.msf
WHERE
date BETWEEN '2010-01-01' and '2015-12-31'
SELECT
LEFT(cusip, 6) AS cusip6, datadate AS date, epsf12,
(seqq-pstkq)/cshoq AS bps
FROM
compa.fundq
WHERE
datadate BETWEEN '2010-01-01' and '2015-12-31'
AND cshoq > 0
And then I merge (outer join) using:
merge(x = data_1, y = data_2, by.x = c("cusip6", "date"), by.y = c("cusip6", "date"), all = T)
This returns 1'182'093 rows which is correct. So my original (first) SQL query is in fact performing an "inner join" when I explicitly specify an outer join. The below R-merge() returned 670'293 rows re-validating that the fetched data from SQL is indeed an inner join.
merge(x = data_1, y = data_2, by.x = c("cusip6", "date"), by.y = c("cusip6", "date"))
What am I doing wrong with my SQL query?
Because the WHERE clause is applied after the JOINs. At this point there are NULL values (as a result of 'failed' JOINs), and those rows fail the WHERE clause.
If you want an OUTER JOIN and a filter, put the filter in the JOIN or a sub-query.
SELECT
LEFT(a.cusip, 6) AS cusip6,
a.date, a.prc, a.ret, a.vol, a.spread, a.shrout,
b.epsf12, (b.seqq-b.pstkq) / b.cshoq AS bps
FROM
(SELECT * FROM crsp.msf WHERE date BETWEEN '2010-01-01' and '2015-12-31') a
FULL JOIN
(SELECT * FROM compa.fundq WHERE datadate BETWEEN '2010-01-01' and '2015-12-31' AND cshoq > 0) b
ON LEFT(a.cusip, 6) = LEFT(b.cusip, 6)
AND a.date = b.datadate
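The effect is easy to reproduce on a tiny example. This is a sketch in SQLite via Python, with made-up tables prices and fundamentals standing in for crsp.msf and compa.fundq. It uses a LEFT JOIN instead of a FULL JOIN to keep it portable, but the reasoning is the same: a WHERE condition on the outer-joined table rejects the NULL-extended rows, whereas the same condition inside ON keeps them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical miniature versions of the two tables.
    CREATE TABLE prices (cusip6 TEXT, date TEXT, prc REAL);
    CREATE TABLE fundamentals (cusip6 TEXT, datadate TEXT, cshoq REAL);
    INSERT INTO prices VALUES ('AAA111', '2010-03-31', 10.0),
                              ('BBB222', '2010-03-31', 20.0);
    -- Only AAA111 has a matching fundamentals row.
    INSERT INTO fundamentals VALUES ('AAA111', '2010-03-31', 5.0);
""")

# Filter in WHERE: the unmatched row has b.cshoq = NULL, so it is discarded
# and the outer join behaves like an inner join.
where_rows = conn.execute("""
    SELECT a.cusip6, b.cshoq
    FROM prices AS a
    LEFT JOIN fundamentals AS b
      ON a.cusip6 = b.cusip6 AND a.date = b.datadate
    WHERE b.cshoq > 0
    ORDER BY a.cusip6
""").fetchall()

# Filter in ON: the condition only decides which rows match,
# so the unmatched row from prices survives with NULLs.
on_rows = conn.execute("""
    SELECT a.cusip6, b.cshoq
    FROM prices AS a
    LEFT JOIN fundamentals AS b
      ON a.cusip6 = b.cusip6 AND a.date = b.datadate AND b.cshoq > 0
    ORDER BY a.cusip6
""").fetchall()

print(where_rows)
print(on_rows)
```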

How to only return 1 row when meet either subquery condition?

I work with electronic health records and need to determine % of patients who meet specific criteria. Note: Because the data is from EHR software, I don't have any control over how the tables and fields are organized.
My denominator is all patients who meet certain criteria, say PatientDenominator (for example: all patients who had XYZ123 service and were seen in July 2016 in a particular state). My numerator is how many in PatientDenominator had checkbox1 OR checkbox2 checked in the past year.
Except that these checkboxes are not evaluated at every visit - only at certain visits. So I have these as subqueries but because I put them in the FROM statement, when a patient meets both criteria they get two rows returned, which skews the numerator (which I'm doing in Excel for now since I'm still trying to figure out the query).
How do I reorganize the query so I can get only 1 row - when the patient meets EITHER criteria? The only way I see how is doing a CASE (when Scenario1 = 1 then 1 when Scenario2 = 1 then 1, else 0) but is that the best way to do it? I'm new to this sort of complexity and since I'll be evaluating many months at a time (and about 5,000 patients a month) I worry a little about performance.
declare
@startdate DATE,
@enddate DATE
set @startdate = '2016-07-01'
set @enddate = '2016-08-01'
select
PatientNumber,
PatientSex,
case
when (Scenario1='1' OR Scenario2='1') then 'Meets Criteria'
when (checkbox1 is null and checkbox2 is null) then 'N/A'
else 'Not Meet Criteria'
end 'Criteria'
from
PatientTable PT
left join VisitTable VT on PT.person_id=VT.person_id
left join LocationTable lt on VT.location_id=lt.location_id
left join TableZ on VT.VisitID=TableZ.VisitID
left join (
select distinct
PT1.person_id,
case
when (TableA.checkbox1 = '1' OR TableB.checkbox2 = '1')
then '1'
else '0'
end Scenario1Eval
from
PatientTable1 PT1
left join VisitTable1 VT1 on PT1.person_id=VT1.person_id
left join TableA on VT1.VisitID=TableA.VisitID
left join TableB on VT1.VisitID=TableB.VisitID
left join TableC on VT1.VisitID=TableC.VisitID
where
sex='f'
and TableC.VisitType in ('Type1','Type2')
and (VT1.VisitDate > DATEADD(YEAR,-1,@startdate) and VT1.VisitDate < @enddate )
) as Scenario1 on PT.person_id=Scenario1.person_id
left join (
select distinct
PT2.person_id,
case
when (TableA2.checkbox1 = '1' OR TableB2.checkbox2 = '1')
then '1'
else '0'
end Scenario2Eval
from
PatientTable1 PT2
left join VisitTable1 VT2 on PT2.person_id=VT2.person_id
left join TableA2 on VT2.VisitID=TableA2.VisitID
left join TableB2 on VT2.VisitID=TableB2.VisitID
left join TableD2 on VT2.VisitID=TableD2.VisitID
where
(VT2.VisitDate > DATEADD(YEAR,-1,@startdate) and VT2.VisitDate < @enddate )
and (TableD2.ChargeCode like '9920%' OR TableD2.ChargeCode like '9938%')) as Scenario2 on PT.person_id=Scenario2.person_id
where
VT.VisitDate > @startdate and VT.VisitDate < @enddate
and TableZ.QualifyingField = 'Y'
and lt.state='ZZ'
and (Scenario1 is not null OR Scenario2 is not null)

Select where no value is greater than X

I'm currently running this query:
SELECT DISTINCT f.FormName
FROM PatientTask as pt
INNER JOIN ClinicTask ct
ON pt.fTaskKey = ct.fTaskKey
INNER JOIN Form f
ON ct.fFormKey = f.FormKey
WHERE pt.TaskTargetDate <= CONVERT(datetime, '2012-01-01');
Now, clearly this is just going to return FormName that have a TaskTargetDate that is earlier than January 1st 2012. What I'm trying to do is find FormNames that do not have a TaskTargetDate which exists in the last 2 years. So if there's a form with a TaskTargetDate in 2010, 2011, and 2013, it should be excluded entirely from the query return because of that 2013 date.
Essentially I'm looking for old forms which are no longer being used.
A NOT IN should give you those results:
Select DISTINCT f.FormName
FROM Form f
WHERE f.FormKey NOT IN
(
SELECT ct.fFormKey
From PatientTask as pt
Inner Join ClinicTask ct on pt.fTaskKey = ct.fTaskKey
WHERE pt.TaskTargetDate >= CONVERT(datetime, '2012-01-01')
)
Also, the CONVERT is not necessary: SQL will automatically parse '2012-01-01' as a date, since it is being compared to a date value.
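One caveat with NOT IN is worth checking: if the subquery can return a NULL, the NOT IN predicate never evaluates to true and no rows come back. The sketch below (SQLite via Python, reusing the table names from the question with made-up rows, and assuming fFormKey is nullable, which may not be the case in the real schema) contrasts this with NOT EXISTS, which is not affected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Form (FormKey INTEGER, FormName TEXT);
    CREATE TABLE ClinicTask (fTaskKey INTEGER, fFormKey INTEGER);
    CREATE TABLE PatientTask (fTaskKey INTEGER, TaskTargetDate TEXT);
    INSERT INTO Form VALUES (1, 'Old form'), (2, 'Active form');
    -- Task 30 has no form attached (NULL fFormKey): an assumption for the demo.
    INSERT INTO ClinicTask VALUES (10, 1), (20, 2), (30, NULL);
    INSERT INTO PatientTask VALUES (10, '2010-05-01'), (10, '2011-05-01'),
                                   (20, '2011-05-01'), (20, '2013-05-01'),
                                   (30, '2013-06-01');
""")

# NOT IN: the subquery returns (2, NULL), so "1 NOT IN (2, NULL)" is unknown
# and 'Old form' is silently dropped.
not_in_rows = conn.execute("""
    SELECT f.FormName
    FROM Form AS f
    WHERE f.FormKey NOT IN (
        SELECT ct.fFormKey
        FROM PatientTask AS pt
        JOIN ClinicTask AS ct ON pt.fTaskKey = ct.fTaskKey
        WHERE pt.TaskTargetDate >= '2012-01-01')
""").fetchall()

# NOT EXISTS: correlated check per form, unaffected by the NULL key.
not_exists_rows = conn.execute("""
    SELECT f.FormName
    FROM Form AS f
    WHERE NOT EXISTS (
        SELECT 1
        FROM PatientTask AS pt
        JOIN ClinicTask AS ct ON pt.fTaskKey = ct.fTaskKey
        WHERE ct.fFormKey = f.FormKey
          AND pt.TaskTargetDate >= '2012-01-01')
""").fetchall()

print(not_in_rows)
print(not_exists_rows)
```

If fFormKey can never be NULL, both forms return the same result.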
You can use left outer join:
Select
DISTINCT f.FormName
From
Form f left outer join
(
-- forms that still have a task dated 2012 or later
select ct.fFormKey
from ClinicTask ct
Inner Join PatientTask pt on pt.fTaskKey = ct.fTaskKey
WHERE pt.TaskTargetDate >= CONVERT(datetime, '2012-01-01')
) recent
on recent.fFormKey = f.FormKey
-- keep only the forms with no such recent task
where recent.fFormKey is null

Adding zero values to report

OK, this is a good question, I think.
At the moment I have a report showing the number of tickets per machine and how much each machine made in ticket sales.
Some machines sell zero tickets, but they are not included in my report.
Now I want to include them.
There is a full list of all machines in the machconfig table, which I could compare to the ticketssold table, which also has a field corresponding to the machine that sold each ticket.
So I guess I could find all of the machines that haven't sold any tickets by looking for machine IDs (MCHterminalid) that don't appear in the ticketssold table (TKtermid column).
Here is the code I've got so far:
SELECT TKtermID,
MCHlocation,
Count (TKvouchernum) AS Totaltickets,
Cast(Sum(TKcomission) AS FLOAT) / 100 AS Total_Comission
FROM ticketssold(NOLOCK)
INNER JOIN machconfig (NOLOCK)
ON MCHterminalID = TKtermID
WHERE cfglocationcountry = 'UK'
AND dateadded BETWEEN Getdate() - 100 AND Getdate()
GROUP BY vstermID,
cfglocation
ORDER BY Total_comission DESC
Change the inner join between ticketssold and machconfig to a right outer join to get all machines, regardless of a match in the tickets sold table. The count of TKVouchernum will return the zeros for you:
SELECT MCHterminalID,
MCHlocation,
Count (TKvouchernum) AS Totaltickets,
Cast(Sum(TKcomission) AS FLOAT) / 100 AS Total_Comission
FROM ticketssold(NOLOCK)
RIGHT OUTER JOIN machconfig (NOLOCK)
ON MCHterminalID = TKtermID
-- assuming dateadded is a ticketssold column: in the WHERE clause it would drop the machines with no tickets
AND dateadded BETWEEN DateAdd(DAY, -100, GetDate()) AND Getdate()
WHERE cfglocationcountry = 'UK'
GROUP BY MCHterminalID,
MCHlocation
ORDER BY Total_comission DESC
OCD version, not totally proofed (it is also killing me that the table names are not included before the fields). Use the outer join in combination with COALESCE:
SELECT
MCHterminalID TicketTerminalId,
MchLocation MachineLocation,
COALESCE(COUNT(TKVoucherNum),0) TotalTickets,
COALESCE(CAST(SUM(TKComission) AS float),0) / 100 TotalComission
FROM
MachConfig (NOLOCK)
LEFT JOIN
TicketsSold (NOLOCK)
ON
TKtermID = MCHterminalID
AND
-- assuming DateAdded is a TicketsSold column: in the WHERE clause it would remove machines with no sales
DateAdded BETWEEN DATEADD(DAY, -100, GETDATE()) AND GETDATE()
WHERE
CfgLocationCountry = 'UK'
GROUP BY
MCHterminalID,
MchLocation
ORDER BY
COALESCE(CAST(SUM(TKComission) AS float),0) / 100 DESC; --Do this in reporting!
Do not use inner joins here, because they will eliminate rows. I start my joins with the table that has all the data, in this case machconfig, and then do a left outer join to the table with the problematic data, ticketssold.
You may also want to think about doing your grouping on the report side for flexibility.
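The "start from the complete table and left join" idea can be checked on a toy data set. This is a sketch in SQLite via Python, with made-up rows and only the columns needed for the demonstration, and the real tables may differ. COUNT over a column of the outer-joined table ignores NULLs, so machines without sales get 0, while SUM needs COALESCE to turn NULL into 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical versions of the two tables.
    CREATE TABLE machconfig (MCHterminalID INTEGER, MCHlocation TEXT);
    CREATE TABLE ticketssold (TKtermID INTEGER, TKvouchernum INTEGER, TKcomission INTEGER);
    INSERT INTO machconfig VALUES (1, 'London'), (2, 'Leeds');
    -- Only machine 1 has sold tickets; commission is stored in pence.
    INSERT INTO ticketssold VALUES (1, 1001, 250), (1, 1002, 150);
""")

rows = conn.execute("""
    SELECT m.MCHterminalID,
           m.MCHlocation,
           COUNT(t.TKvouchernum) AS Totaltickets,
           COALESCE(SUM(t.TKcomission), 0) / 100.0 AS Total_Comission
    FROM machconfig AS m
    LEFT JOIN ticketssold AS t
      ON t.TKtermID = m.MCHterminalID
    GROUP BY m.MCHterminalID, m.MCHlocation
    ORDER BY Total_Comission DESC
""").fetchall()

print(rows)
```

Note that the machine ID is selected from machconfig, not from ticketssold, since the ticket-side column is NULL for machines with no sales.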
Finally got it working the way I want. Here is the proper code:
SELECT MCHTerminalID, MCHLocation, ISNULL(CONVERT(varchar(16), batch.LastBatchIn, 103),
'Did not batch in') AS LastBatchIn,
ISNULL(COUNT(Ticket.VoucherNum), 0) AS TotalVouchers,
ISNULL(SUM(Ticket.Sale), 0) AS TotalGrossAmount, ISNULL(SUM(Ticket.Refund),0) AS TotalRefundAmount, ISNULL(SUM(Ticket.Comission),0) AS TotalComission
FROM termConfig AS config WITH (NOLOCK)
LEFT OUTER JOIN
(SELECT bsTerminalID, MAX(bsDateTime) AS LastBatchIn
FROM batchSummary WITH (NOLOCK)
WHERE bsDateTime BETWEEN getdate()-50 AND getdate()
GROUP BY bsTerminalID
)
AS batch
ON config.MCHTerminalID = batch.bsTerminalID
LEFT OUTER JOIN
(SELECT DISTINCT TKTermID,
TKVoucherNum AS VoucherNum,
CAST(TKGrossTotal AS float)/100 AS Sale,
CAST(TKRefundAmount AS float)/100 AS Refund,
CAST(TKComission AS float)/100 AS Comission
FROM TicketVouchers WITH (NOLOCK)
WHERE dateAdded BETWEEN getdate()-50 AND getdate()
)
AS Ticket
ON
config.MCHTerminalID = Ticket.TKTermID
WHERE
config.MCHLocationCountry = 'uk'
AND config.MCHProductionTerminal = 'Y'
GROUP BY config.MCHTerminalID, config.MCHLocation, LastBatchIn
ORDER BY TotalComission desc
You could UNION the 'zero' rows to your original e.g.
<original query here>
...
UNION
SELECT MCHterminalID,
MCHlocation,
0 AS Totaltickets,
0 AS Total_Comission
FROM machconfig
WHERE NOT EXISTS (
SELECT *
FROM ticketssold
WHERE MCHterminalID = TKtermID
)
(Review for hints).