Subquery returning more than one result - sql

I am still fairly new to SQL and the stored procedure I recently created keeps telling me that a subquery is returning more than one result but I can't figure out which one is the problem. If anyone has a moment and can tell me what I am missing, I would greatly appreciate it!
Thanks!
SELECT DISTINCT a.customer_no [id],
x.esal1_desc [constituent],
a.perf [activity],
a.sp_act_dt [activity_date],
c.description[activity_type],
d.display_name_tiny [solicitor],
s.description [status],
ISNULL(a.num_attendees,0)[attending],
a.notes [notes],
e.address [email]
FROM [dbo].t_special_activity a
left outer join [dbo].tr_special_activity_status s ON s.id = a.status
left outer join [dbo].tr_special_activity c ON c.id = a.sp_act
left outer JOIN [dbo].FT_CONSTITUENT_DISPLAY_NAME() d ON a.worker_customer_no = d.customer_no
left outer JOIN [dbo].T_EADDRESS e on a.customer_no=e.customer_no and primary_ind='Y'
left outer JOIN [dbo].TX_CUST_SAL x on a.customer_no=x.customer_no and default_ind='Y'
WHERE a.status IN (ISNULL(@status, (SELECT DISTINCT id FROM TR_SPECIAL_ACTIVITY_STATUS)))
AND a.sp_act_dt BETWEEN (ISNULL(@activity_start,(SELECT MIN(sp_act_dt) FROM T_SPECIAL_ACTIVITY)))
AND (ISNULL(@activity_end,(SELECT MAX(sp_act_dt) FROM T_SPECIAL_ACTIVITY)))
AND ((ISNULL(@list,0) = 0) OR EXISTS (SELECT customer_no FROM T_LIST_CONTENTS lc WITH (NOLOCK)
WHERE a.customer_no = lc.customer_no and lc.list_no = @list))

Alas, you cannot use this expression:
WHERE a.status IN (ISNULL(@status, (SELECT DISTINCT id FROM TR_SPECIAL_ACTIVITY_STATUS)))
The subquery is in a place where a single value is expected. In any case, I think you want:
WHERE (a.status = @status OR
(@status IS NULL AND a.status IN (SELECT id FROM TR_SPECIAL_ACTIVITY_STATUS)))
Note that select distinct is irrelevant in an IN clause. At best it does nothing; at worst it impedes the optimizer.
I realize this is a little confusing. You are thinking that IN takes a list -- and the list could even be a subquery. But, the elements of the list are scalars not lists. So, when a subquery is an element of the list, then it is assumed to be a single value.
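The general shape of an optional filter (no restriction when the parameter is NULL, otherwise an equality match) can be tried on a toy table. This is a minimal sketch, assuming SQLite through Python's sqlite3 module rather than SQL Server; the `activity` table and its rows are made up, and the named parameter `:status` stands in for `@status`.

```python
import sqlite3

# Hypothetical toy data, not the schema from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (id INTEGER PRIMARY KEY, status INTEGER);
    INSERT INTO activity (id, status) VALUES (1, 10), (2, 20), (3, 10);
""")

def activities(status):
    # A NULL parameter disables the filter; otherwise only matching rows pass.
    sql = """
        SELECT id FROM activity
        WHERE (:status IS NULL OR status = :status)
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql, {"status": status})]

all_ids = activities(None)      # no filter applied
status_10_ids = activities(10)  # only rows with status 10
```

The parentheses around the OR matter once further AND conditions follow, as they do in the original WHERE clause.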

Related

Oracle compare query results with multiple joins against a table

I need to compare query results against a table. I have the following query.
select
i.person_id,
a.appellant_first_name,
a.appellant_middle_name,
a.appellant_last_name,
s.*
from CWLEGAL.individuals i inner join CWLEGAL.tblappealsdatarevisionone a
on i.casenm = a.D_N_NUMBER1 and
i.first_name = a.appellant_first_name and
i.last_name = a.appellant_last_name
inner join CWLEGAL.tblappealstosupremecourt s
on a.DATABASEIDNUMBER = s.DBIDNUMBER
order by orclid21;
I need to see what orclid21's in cwlegal.tblappealstosupremecourt don't appear in the above query.
I was able to get this to work.
select
i.person_id,
a.appellant_first_name,
a.appellant_middle_name,
a.appellant_last_name,
s.*
from CWLEGAL.tblappealstosupremecourt s
join CWLEGAL.tblappealsdatarevisionone a
on a.DATABASEIDNUMBER = s.DBIDNUMBER
left outer join CWLEGAL.individuals i on
i.casenm = a.D_N_NUMBER1 and
i.first_name = a.appellant_first_name and
i.last_name = a.appellant_last_name
where person_id is null
order by orclid21
Your first inner join is between i and a, and its result is then joined with s.
If you want to see which records do not join, that is known as an anti-join. Depending on the database you are querying, it can be written either as an outer join that keeps only the rows with a NULL on the other side, or as a NOT IN filter against the joined result.
Examples, taking your query (the whole code in the question) as q and assuming you've kept all the needed keys in it:
Example 1:
with q as (your_query)
select s.orclid21
from CWLEGAL.tblappealstosupremecourt s
left join q
on q.DATABASEIDNUMBER = s.DBIDNUMBER
where q.DATABASEIDNUMBER is null
Example 2:
with q as (your_query)
select s.orclid21 from q
right join CWLEGAL.tblappealstosupremecourt s
on q.DATABASEIDNUMBER = s.DBIDNUMBER
where q.DATABASEIDNUMBER is null
Example 3:
with q as (your_query)
select s.orclid21 from CWLEGAL.tblappealstosupremecourt s
where s.DBIDNUMBER not in (select q.DATABASEIDNUMBER from q)
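To see the anti-join idea on something small, here is a sketch assuming SQLite through Python's sqlite3 module instead of Oracle. The tables `supreme` and `matched` and their rows are hypothetical stand-ins for tblappealstosupremecourt and the result of the joined query.

```python
import sqlite3

# Hypothetical stand-in tables with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supreme (dbidnumber INTEGER, orclid21 INTEGER);
    CREATE TABLE matched (databaseidnumber INTEGER);
    INSERT INTO supreme VALUES (1, 101), (2, 102), (3, 103);
    INSERT INTO matched VALUES (1), (3);
""")

# Outer join form: keep only the rows of `supreme` that found no partner.
left_join_ids = [r[0] for r in conn.execute("""
    SELECT s.orclid21
    FROM supreme s
    LEFT JOIN matched q ON q.databaseidnumber = s.dbidnumber
    WHERE q.databaseidnumber IS NULL
    ORDER BY s.orclid21
""")]

# NOT IN form: same result here because `matched` contains no NULL keys.
not_in_ids = [r[0] for r in conn.execute("""
    SELECT s.orclid21
    FROM supreme s
    WHERE s.dbidnumber NOT IN (SELECT databaseidnumber FROM matched)
    ORDER BY s.orclid21
""")]
```

Both lists contain only the orclid21 whose key is absent from `matched`.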

Using COALESCE with JOIN on a different database column

Trying to populate the location column of a query and was hoping that the use of the COALESCE function would help me get what I want.
SELECT OrderItem.Code AS ItemCode, MAX(COALESCE(OrderItem.Location, [Picklist].[dbo].[ItemData].InventoryLocation)) AS Location, SUM(OrderItem.Quantity) AS Quantity, MAX(Store.StoreName) AS Store
FROM OrderItem
INNER JOIN [Order] ON OrderItem.OrderID = [Order].OrderID
INNER JOIN [Store] ON [Order].StoreID = [Store].StoreID
LEFT JOIN [AmazonOrder] ON [AmazonOrder].OrderID = [Order].OrderID
JOIN [Picklist].[dbo].[ItemData] ON [Picklist].[dbo].[ItemData].[InventoryNumber] = [OrderItem].[Code]
WHERE (CASE WHEN [Order].[LocalStatus] = 'Recently Downloaded' AND [AmazonOrder].FulfillmentChannel = 2 THEN 1
WHEN [Order].[LocalStatus] = 'Recently Downloaded' AND [Store].StoreName != 'Amazon' THEN 1 ELSE 0 END ) = 1
GROUP BY OrderItem.Code
ORDER BY ItemCode
There will not be a location when the Store is Amazon so I need to Join on another table in another database. I don't believe I'm using this correctly. Also I do get the right Location results returned if I use :
SELECT InventoryLocation From [Picklist].[dbo].[ItemData] WHERE InventoryNumber = 'L1201-2W-EA'
Perhaps this is more like the query that you want:
SELECT oi.Code AS ItemCode, COALESCE(oi.Location, id.InventoryLocation) AS Location,
oi.Quantity, s.StoreName AS Store
FROM OrderItem oi INNER JOIN
[Order] o
ON oi.OrderID = o.OrderID INNER JOIN
[Store] s
ON o.StoreID = s.StoreID LEFT JOIN
AmazonOrder ao
ON ao.OrderID = o.OrderID JOIN
[Picklist].[dbo].[ItemData] id
ON id.InventoryNumber = oi.[Code]
WHERE o.LocalStatus = 'Recently Downloaded' AND
(ao.FulfillmentChannel = 2 OR s.StoreName <> 'Amazon')
ORDER BY ItemCode
Here are the changes:
Removed the aggregation. It does not seem to be part of the question.
Introduced table aliases, so the query is easier to write and to read.
Simplified the logic in the where clause.
As the comment above says, the MAX seems somewhat strange: an arbitrary aggregation, no doubt due to one of the joins bringing back more rows than you might have expected.
The statement also has a few issues:
The COALESCE uses two fields, neither of which comes from a left-joined table (only AmazonOrder is left joined). That only works if the first field in the COALESCE (OrderItem.Location) is nullable, which it might be; there is no schema posted.
The left join itself is an inner join in disguise at present. The where clause has an explicit condition on a field from that table (AND [AmazonOrder].FulfillmentChannel = 2); if the record were actually missing, the left join would return null for that field, and the where clause would then drop the row from the results. If you want this to work properly as a left join, any condition on fields from that table must move into the join condition, or the where clause itself must allow for that field being null (explicitly or using a coalesce).
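The difference between filtering a left-joined table in WHERE and filtering it in ON is easy to reproduce. A minimal sketch, assuming SQLite through Python's sqlite3 module; the tables `orders` and `amazon_order` and their rows are invented for illustration and are not the poster's schema.

```python
import sqlite3

# Hypothetical tables: three orders, two of which have an Amazon row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY);
    CREATE TABLE amazon_order (order_id INTEGER, channel INTEGER);
    INSERT INTO orders VALUES (1), (2), (3);
    INSERT INTO amazon_order VALUES (1, 2), (2, 1);
""")

# Condition in WHERE: unmatched rows have channel NULL and are filtered out,
# so the LEFT JOIN behaves like an INNER JOIN.
where_rows = conn.execute("""
    SELECT o.order_id, a.channel
    FROM orders o
    LEFT JOIN amazon_order a ON a.order_id = o.order_id
    WHERE a.channel = 2
    ORDER BY o.order_id
""").fetchall()

# Condition in ON: every order survives; non-matching ones carry NULL.
on_rows = conn.execute("""
    SELECT o.order_id, a.channel
    FROM orders o
    LEFT JOIN amazon_order a
           ON a.order_id = o.order_id AND a.channel = 2
    ORDER BY o.order_id
""").fetchall()
```

With the condition in ON, a COALESCE over the left-joined column has NULLs to fall back from; with it in WHERE, those rows never reach the select list.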
SELECT OrderItem.Code AS Code,
CASE WHEN (LEN(ISNULL(MAX([OrderItem].[Location]),'')) = 1)
THEN MAX([OrderItem].[Location])
ELSE MAX([Picklist].[dbo].[ItemData].InventoryLocation)
END AS Location,
SUM(OrderItem.Quantity) AS Quantity,
MAX(Store.StoreName) AS Store
FROM OrderItem
INNER JOIN [Order] ON OrderItem.OrderID = [Order].OrderID
INNER JOIN [Store] ON [Order].StoreID = [Store].StoreID
LEFT JOIN [AmazonOrder] ON [AmazonOrder].OrderID = [Order].OrderID
LEFT JOIN [Picklist].[dbo].[ItemData] ON [Picklist].[dbo].[ItemData].[InventoryNumber] = [OrderItem].[Code] OR
[Picklist].[dbo].[ItemData].[MediaCreator] = [OrderItem].[Code]
WHERE [Order].LocalStatus = 'Recently Downloaded' AND (AmazonOrder.FulfillmentChannel = 2 OR Store.StoreName <> 'Amazon')
GROUP BY OrderItem.Code
ORDER BY OrderItem.Code
I decided to go with a CASE expression on the Location column because I could not get COALESCE to work for me. The schema, with some but not all of the data, is at SQLFiddle.
I guess if someone gets COALESCE to work I'll change the answer?
@Gordon Linoff I used the rewritten WHERE clause because it looked cleaner than the CASE statement. It worked; I guessed there was a simpler way to go about it, but I was more worried about getting COALESCE to work. As for the aliases, sometimes I like to use them, but in this case, since there were a lot of tables, I prefer to spell out what I'm actually working with. Just my preference.

Restrict SQL subquery in SELECT

I thought the subquery within the select statement will be restricted by the FROM and/or JOIN statements.
Therefore, my query always returns an error because there is more than one row in the subquery.
SELECT
dbo.Countries.Name,
dbo.Countries.ISO2,
(SELECT dbo.CountryFields.Field
FROM dbo.CountryFields
WHERE dbo.CountryFields.Field = 'Population') AS Population
FROM
dbo.CountryFields
INNER JOIN
dbo.Countries ON (dbo.CountryFields.Countries_Id = dbo.Countries.Countries_Id)
How can I restrict the number of rows in my subquery?
Do I need there also an inner join Statement inside the subquery? I hoped the subquery will inherit from normal SELECT so I don't need manual restrictions.
The column "Field" contains more than "Population" and I would like to show more rows in the SELECT statement with subselects but now ... I can't even get one column to work. :-(
I think you want something like this:
SELECT
a.Name,
a.ISO2,
(SELECT TOP 1 b.Field FROM dbo.CountryFields b WHERE b.Countries_Id = a.Countries_Id AND b.Field = 'Population') AS Population,
(SELECT TOP 1 b.Field FROM dbo.CountryFields b WHERE b.Countries_Id = a.Countries_Id AND b.Field = 'Capital') AS Capital,
(SELECT TOP 1 b.Field FROM dbo.CountryFields b WHERE b.Countries_Id = a.Countries_Id AND b.Field = 'Area') AS Area
FROM
dbo.Countries a
Of course there are ways to optimize the above query, but it's always a tradeoff between readability and speed.
Good luck!
I think that something like this is the proper query:
SELECT
C.Name,
C.ISO2,
ISNULL(CF_POP.Value,0) AS [Population],
ISNULL(CF_F2.Value,0) AS [Field2],
ISNULL(CF_F3.Value,0) AS [Field3]
FROM
dbo.Countries AS C
LEFT JOIN dbo.CountryFields AS CF_POP ON (C.Countries_Id = CF_POP.Countries_Id) AND (CF_POP.Field = 'Population')
LEFT JOIN dbo.CountryFields AS CF_F2 ON (C.Countries_Id = CF_F2.Countries_Id) AND (CF_F2.Field = 'Field2')
LEFT JOIN dbo.CountryFields AS CF_F3 ON (C.Countries_Id = CF_F3.Countries_Id) AND (CF_F3.Field = 'Field3')
In this example each matching row from CountryFields becomes a column. I use LEFT JOIN because I don't know how complete your data is (if you want to see blanks, remove the ISNULL). I also used a Value column, because I assume there must be a second column that holds the value corresponding to CountryFields.Field. This can also be done with CROSS APPLY, but the syntax is different.
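The one-join-per-field approach can be checked on a tiny data set. This sketch assumes SQLite through Python's sqlite3 module; the `value` column is the same assumption the answer makes, and the country names and field values are made up.

```python
import sqlite3

# Hypothetical key/value layout: one row per (country, field).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (countries_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE country_fields (countries_id INTEGER, field TEXT, value TEXT);
    INSERT INTO countries VALUES (1, 'Northland'), (2, 'Southland');
    INSERT INTO country_fields VALUES
        (1, 'Population', '100'),
        (1, 'Capital', 'Northcity'),
        (2, 'Capital', 'Southcity');
""")

# Each LEFT JOIN picks out one field and exposes its value as a column.
pivot_rows = conn.execute("""
    SELECT c.name, pop.value AS population, cap.value AS capital
    FROM countries c
    LEFT JOIN country_fields pop
           ON pop.countries_id = c.countries_id AND pop.field = 'Population'
    LEFT JOIN country_fields cap
           ON cap.countries_id = c.countries_id AND cap.field = 'Capital'
    ORDER BY c.name
""").fetchall()
```

A country with no row for a given field still appears, with NULL in that column, which is why the field test sits in the ON clause rather than in WHERE.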

SQL query join conditions

I have a query (an excerpt from a stored procedure) that looks something like this:
SELECT S.name
INTO #TempA
from tbl_Student S
INNER JOIN tbl_StudentHSHistory TSHSH on TSHSH.STUD_PK=S.STUD_PK
INNER JOIN tbl_CODETAILS C
on C.CODE_DETL_PK=S.GID
WHERE TSHSH.Begin_date < @BegDate
Here is the issue: the second inner join and the corresponding where clause should only apply if a certain variable (@UseArchive) is true; I don't want them applied if it is false. Also, certain rows in TSHSH might have no corresponding entries in S. I tried splitting it into two separate queries based on @UseArchive, but studio refuses to compile that because of the INTO #TempA statement, saying that there is already an object named #TempA in the database. Can anyone tell me a way to fix the query, or a way to split the queries while keeping the INTO #TempA statement?
Looks like you're asking 2 questions here.
1- How to fix the SELECT INTO issue:
SELECT INTO only works if the target table does not exist. You need to use INSERT INTO...SELECT if the table already exists.
2- Conditional JOIN:
You'll need to do a LEFT JOIN if the corresponding row may not exist. Try this.
SELECT S.name
FROM tbl_Student S
INNER JOIN tbl_StudentHSHistory TSHSH
ON TSHSH.STUD_PK=S.STUD_PK
LEFT JOIN tbl_CODETAILS C
ON C.CODE_DETL_PK=S.GID
WHERE TSHSH.Begin_date < @BegDate
AND CASE WHEN @UseArchive = 1 THEN c.CODE_DETL_PK ELSE 0 END =
CASE WHEN @UseArchive = 1 THEN S.GID ELSE 0 END
Putting the CASE expression in the WHERE clause rather than in the JOIN clause forces the join to act like an INNER JOIN when @UseArchive is 1 and like a LEFT JOIN when it is not.
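The effect of that CASE comparison can be observed with two students, only one of whom has a matching code row. A minimal sketch, assuming SQLite through Python's sqlite3 module; the table and column names are simplified stand-ins, and `:use_archive` plays the role of @UseArchive.

```python
import sqlite3

# Hypothetical data: Ann's gid has a matching code row, Bob's does not.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (stud_pk INTEGER PRIMARY KEY, name TEXT, gid INTEGER);
    CREATE TABLE codetails (code_detl_pk INTEGER PRIMARY KEY);
    INSERT INTO student VALUES (1, 'Ann', 10), (2, 'Bob', 20);
    INSERT INTO codetails VALUES (10);
""")

def students(use_archive):
    # Flag = 1: the comparison needs a real match, so unmatched rows
    # (NULL = gid) are rejected. Flag = 0: it degenerates to 0 = 0.
    sql = """
        SELECT s.name
        FROM student s
        LEFT JOIN codetails c ON c.code_detl_pk = s.gid
        WHERE CASE WHEN :use_archive = 1 THEN c.code_detl_pk ELSE 0 END =
              CASE WHEN :use_archive = 1 THEN s.gid ELSE 0 END
        ORDER BY s.name
    """
    return [r[0] for r in conn.execute(sql, {"use_archive": use_archive})]

with_archive = students(1)     # behaves like an INNER JOIN
without_archive = students(0)  # behaves like a LEFT JOIN
```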
I'd replace it with LEFT JOIN
LEFT JOIN tbl_CODETAILS C ON @UseArchive = 1 AND C.CODE_DETL_PK=S.GID
You can split the queries and then insert into a temp table easily.
SELECT * INTO #TempA FROM
(
SELECT * FROM Q1
UNION ALL
SELECT * FROM Q2
) T
SELECT S.name
INTO #TempA
from tbl_Student S
INNER JOIN tbl_StudentHSHistory TSHSH
on TSHSH.STUD_PK = S.STUD_PK
INNER JOIN tbl_CODETAILS C
on C.CODE_DETL_PK = S.GID
and @UseArchive = 1
WHERE TSHSH.Begin_date < @BegDate
But putting @UseArchive = 1 in the join condition of an inner join is the same as putting it in the WHERE clause.
Your question does not make much sense to me.
So what if certain rows in TSHSH have no corresponding entries in S?
If you want just one of the joins to match:
SELECT S.name
INTO #TempA
from tbl_Student S
LEFT OUTER JOIN tbl_StudentHSHistory TSHSH
on TSHSH.STUD_PK = S.STUD_PK
LEFT OUTER JOIN tbl_CODETAILS C
on C.CODE_DETL_PK = S.GID
and @UseArchive = 1
WHERE TSHSH.Begin_date < @BegDate
and ( TSHSH.STUD_PK is not null or C.CODE_DETL_PK is not null )

Super Slow Query - sped up, but not perfect... Please help

I posted a query yesterday (see here) that was horrible (took over a minute to run, resulting in 18,215 records):
SELECT DISTINCT
dbo.contacts_link_emails.Email, dbo.contacts.ContactID, dbo.contacts.First AS ContactFirstName, dbo.contacts.Last AS ContactLastName, dbo.contacts.InstitutionID,
dbo.institutionswithzipcodesadditional.CountyID, dbo.institutionswithzipcodesadditional.StateID, dbo.institutionswithzipcodesadditional.DistrictID
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
INNER JOIN
dbo.institutionswithzipcodesadditional
ON dbo.contacts.InstitutionID = dbo.institutionswithzipcodesadditional.InstitutionID
LEFT OUTER JOIN
dbo.contacts_def_jobfunctions
INNER JOIN
dbo.contacts_link_jobfunctions
ON dbo.contacts_def_jobfunctions.JobID = dbo.contacts_link_jobfunctions.JobID
ON dbo.contacts.ContactID = dbo.contacts_link_jobfunctions.ContactID
WHERE
(dbo.contacts.JobTitle IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_1
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist))
OR
(dbo.contacts_link_jobfunctions.JobID IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_2
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist AS newsletterremovelist))
ORDER BY EMAIL
With a lot of coaching and research, I've tuned it up to the following:
SELECT contacts.ContactID,
contacts.InstitutionID,
contacts.First,
contacts.Last,
institutionswithzipcodesadditional.CountyID,
institutionswithzipcodesadditional.StateID,
institutionswithzipcodesadditional.DistrictID
FROM contacts
INNER JOIN contacts_link_emails ON
contacts.ContactID = contacts_link_emails.ContactID
INNER JOIN institutionswithzipcodesadditional ON
contacts.InstitutionID = institutionswithzipcodesadditional.InstitutionID
WHERE
(contacts.ContactID IN
(SELECT contacts_2.ContactID
FROM contacts AS contacts_2
INNER JOIN contacts_link_emails AS contacts_link_emails_2 ON
contacts_2.ContactID = contacts_link_emails_2.ContactID
LEFT OUTER JOIN contacts_def_jobfunctions ON
contacts_2.JobTitle = contacts_def_jobfunctions.JobID
RIGHT OUTER JOIN newsletterremovelist ON
contacts_link_emails_2.Email = newsletterremovelist.EmailAddress
WHERE (contacts_def_jobfunctions.ParentJobID <> 1841)
GROUP BY contacts_2.ContactID
UNION
SELECT contacts_1.ContactID
FROM contacts_link_jobfunctions
INNER JOIN contacts_def_jobfunctions AS contacts_def_jobfunctions_1 ON
contacts_link_jobfunctions.JobID = contacts_def_jobfunctions_1.JobID
AND contacts_def_jobfunctions_1.ParentJobID <> 1841
INNER JOIN contacts AS contacts_1 ON
contacts_link_jobfunctions.ContactID = contacts_1.ContactID
INNER JOIN contacts_link_emails AS contacts_link_emails_1 ON
contacts_link_emails_1.ContactID = contacts_1.ContactID
LEFT OUTER JOIN newsletterremovelist AS newsletterremovelist_1 ON
contacts_link_emails_1.Email = newsletterremovelist_1.EmailAddress
GROUP BY contacts_1.ContactID))
While this query is now super fast (about 3 seconds), I've blown part of the logic somewhere - it only returns 14,863 rows (instead of the 18,215 rows that I believe is accurate).
The results seem near correct. I'm working to discover what data might be missing in the result set.
Can you please coach me through whatever I've done wrong here?
Thanks,
Russell Schutte
The main problem with your original query was that you had two extra joins just to introduce duplicates and then a DISTINCT to get rid of them.
Use this:
SELECT cle.Email,
c.ContactID,
c.First AS ContactFirstName,
c.Last AS ContactLastName,
c.InstitutionID,
izip.CountyID,
izip.StateID,
izip.DistrictID
FROM dbo.contacts c
INNER JOIN
dbo.institutionswithzipcodesadditional izip
ON izip.InstitutionID = c.InstitutionID
INNER JOIN
dbo.contacts_link_emails cle
ON cle.ContactID = c.ContactID
WHERE cle.Email NOT IN
(
SELECT EmailAddress
FROM dbo.newsletterremovelist
)
AND EXISTS
(
SELECT NULL
FROM dbo.contacts_def_jobfunctions cdj
WHERE cdj.JobId = c.JobTitle
AND cdj.ParentJobId <> '1841'
UNION ALL
SELECT NULL
FROM dbo.contacts_link_jobfunctions clj
JOIN dbo.contacts_def_jobfunctions cdj
ON cdj.JobID = clj.JobID
WHERE clj.ContactID = c.ContactID
AND cdj.ParentJobId <> '1841'
)
ORDER BY
email
Create the following indexes:
newsletterremovelist (EmailAddress)
contacts_link_jobfunctions (ContactID, JobID)
contacts_def_jobfunctions (JobID)
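Why an EXISTS test avoids the duplicates that a join introduces can be shown with a contact that has two job links. This is a sketch assuming SQLite through Python's sqlite3 module; `contacts` and `link_jobs` here are cut-down, hypothetical versions of the tables in the question.

```python
import sqlite3

# Hypothetical data: contact 1 has two jobs, contact 2 one, contact 3 none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (contact_id INTEGER PRIMARY KEY);
    CREATE TABLE link_jobs (contact_id INTEGER, job_id INTEGER);
    INSERT INTO contacts VALUES (1), (2), (3);
    INSERT INTO link_jobs VALUES (1, 10), (1, 11), (2, 10);
""")

# Joining multiplies a contact by its number of job rows.
join_ids = [r[0] for r in conn.execute("""
    SELECT c.contact_id
    FROM contacts c
    JOIN link_jobs l ON l.contact_id = c.contact_id
    ORDER BY c.contact_id
""")]

# EXISTS only asks whether at least one job row is present.
exists_ids = [r[0] for r in conn.execute("""
    SELECT c.contact_id
    FROM contacts c
    WHERE EXISTS (SELECT 1 FROM link_jobs l WHERE l.contact_id = c.contact_id)
    ORDER BY c.contact_id
""")]
```

The join version needs a DISTINCT to collapse contact 1 back to a single row; the EXISTS version never produces the extra row in the first place.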
Do you get the same results when you do:
SELECT count(*)
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
SELECT COUNT(*)
FROM
contacts
INNER JOIN contacts_link_jobfunctions
ON contacts.ContactID = contacts_link_jobfunctions.ContactID
INNER JOIN contacts_link_emails
ON contacts.ContactID = contacts_link_emails.ContactID
If so, keep adding each join condition until you don't get the same results, and you will see where your mistake was. If all the joins are the same, then look at the where clauses. But I will be surprised if it isn't in the first join, because the syntax you had originally won't even work on SQL Server; it is pretty nonstandard SQL and may have been incorrect all along without anyone knowing.
Alternatively, pick a few of the records that are returned by the original query but not by the revised one. Track them through the tables one at a time to see if you can find why the second query filters them out.
I'm not sure exactly what is wrong, but when I run into this situation, the first thing I do is start removing variables.
So, comment out the where clause. How many rows are returned?
If you get back the 11,604 rows then you've isolated the problem to the joins. Work through the joins, commenting each one out (remove the associated columns too), and figure out how many rows each one eliminates.
As you do this, aim to find what is causing the desired rows to be eliminated. Once isolated, consider the join differences between the first query and the second query.
Looking at the first query, you could probably just modify it to replace the INs with EXISTS.
Consider your indexes as well. Anything in the where or join clauses should probably be indexed.
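One thing worth checking when swapping IN for EXISTS is how NULLs in the subquery behave, since NOT IN and NOT EXISTS differ there. A minimal sketch, assuming SQLite through Python's sqlite3 module; the tables and addresses are made up, and whether the real newsletterremovelist contains NULLs is not known from the question.

```python
import sqlite3

# Hypothetical data: the removal list contains one address and one NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emails (email TEXT);
    CREATE TABLE remove_list (email_address TEXT);
    INSERT INTO emails VALUES ('keep@example.com'), ('drop@example.com');
    INSERT INTO remove_list VALUES ('drop@example.com'), (NULL);
""")

# NOT IN against a list containing NULL is never true, so no row passes.
not_in_rows = [r[0] for r in conn.execute("""
    SELECT email FROM emails
    WHERE email NOT IN (SELECT email_address FROM remove_list)
""")]

# NOT EXISTS only looks for an actual matching row and ignores the NULL.
not_exists_rows = [r[0] for r in conn.execute("""
    SELECT e.email FROM emails e
    WHERE NOT EXISTS (
        SELECT 1 FROM remove_list r WHERE r.email_address = e.email
    )
""")]
```

If row counts change after such a rewrite, NULLs in the filtered column are one of the first things to look for.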