subquery returning more than one value - sql

SELECT CG.SITEID,
CR.COLLECTIONID,
CG.COLLECTIONNAME,
CASE
WHEN CR.ARCHITECTUREKEY = 5
THEN
N'vSMS_R_System'
WHEN CR.ARCHITECTUREKEY = 0
THEN
(SELECT BASETABLENAME
FROM DISCOVERYARCHITECTURES
JOIN
COLLECTION_RULES
ON DISCOVERYARCHITECTURES.DISCARCHKEY =
COLLECTION_RULES.ARCHITECTUREKEY
JOIN
COLLECTIONS_G
ON COLLECTION_RULES.COLLECTIONID =
COLLECTIONS_G.COLLECTIONID
WHERE COLLECTIONS_G.SITEID = (SELECT TOP 1 SOURCECOLLECTIONID FROM VCOLLECTIONDEPENDENCYCHAIN WHERE DEPENDENTCOLLECTIONID = CG.SITEID ORDER BY LEVEL DESC))
ELSE (SELECT DA.BASETABLENAME FROM DISCOVERYARCHITECTURES DA WHERE DA.DISCARCHKEY=CR.ARCHITECTUREKEY) END AS TABLENAME
FROM COLLECTIONS_G CG
JOIN COLLECTIONS_L CL ON CG.COLLECTIONID=CL.COLLECTIONID
JOIN COLLECTION_RULES CR ON CG.COLLECTIONID=CR.COLLECTIONID
WHERE (CG.FLAGS&4)=4 AND CL.CURRENTSTATUS!=5
I am having a problem with the code above, around the line:
when cr.ArchitectureKey=0 then...
The problem is that the sub-query returns more than one value, and I'm not too sure how to invert the query so that I get rid of the error.
To make matters worse, cr.ArchitectureKey would normally join with da.DiscArchKey, but while cr.ArchitectureKey can have a value of 0, that does not exist in da.DiscArchKey, meaning if I join the two directly I lose data.
EDIT
More information regarding the problem itself:
This is a stored procedure for a Microsoft product that has a 'bug' (probably considered a feature though) which I'm trying to fix. Don't worry, this is only in my own little test server.
Anyway, there's the concept of a Collection. All Collections must have a parent (determined through VCOLLECTIONDEPENDENCYCHAIN), with the exception of the very top level Collection that is a system collection and cannot be modified.
Each collection can have 0 or more rules, and each rule has a rule type, where the ID of the rule type is saved onto COLLECTION_RULES and the matching string for that ID is saved onto DISCOVERYARCHITECTURES.
In most cases, a rule is a WQL query, and the rule type is determined by what tables are queried on the WQL query.
However, and this is where the problem lies, collections can also have a query of type 'include' or 'exclude', which basically forces it to borrow the query of another Collection. So effectively you include the results of another Collection's query onto your own Collection, and that's the query.
As far as COLLECTION_RULES is concerned, when that happens, the ID of the rule type is 0, which is a value that doesn't exist in DISCOVERYARCHITECTURES.
What I was trying to modify was so that when the rule type is 0, get and use the rule type(s) of the highest up parent (not the direct parent since the parent Collection could also have a single include rule, in which case the rule type would still be 0).
The problem is that because each rule can have multiple rule types, it returns multiple rows in some instances.
I tried to invert the query to remove the SELECT and use joins only, but failed because I found I always needed to join it to DISCOVERYARCHITECTURES and I have nothing to join it on when the rule type = 0.
EDIT2
Sample data (tables not shown): Collections_G, Collections_L, Collection_Rules, DiscoveryArchitectures, vCollectionDependencyChain
Original Query and Original Results
SELECT cg.SiteID,
CASE
WHEN da.DiscArchKey=5
THEN N'vSMS_R_System'
ELSE da.BaseTableName END AS TableName
FROM Collections_G cg
JOIN Collections_L cl ON cg.CollectionID=cl.CollectionID
JOIN Collection_Rules cr ON cg.CollectionID=cr.CollectionID
JOIN DiscoveryArchitectures da ON cr.ArchitectureKey=da.DiscArchKey
WHERE (cg.Flags&4)=4 AND cl.CurrentStatus!=5
As you can see from the results picture above, some collections appear multiple times but with different TableNames. This is because each collection can have several rules, and each rule has one cr.ArchitectureKey.
Also, and more importantly, collections PS10000B and PS10000C do not show up because their cr.ArchitectureKey = 0 which is a value that doesn't exist in da.DiscArchKey.
My goal is to have the collections whose cr.ArchitectureKey is 0 appear as well, but for that I need to assign them a usable cr.ArchitectureKey.
My thought (which is slightly flawed, but I don't know enough SQL to make it better, so help with that would be appreciated too) was to use the da.DiscArchKey from the top level parent. But the top level parent can have multiple DiscArchKeys, which is what is causing the problem.
As mentioned above, getting the top level parent is slightly flawed; ideally I would get the top level cr.ReferencedCollectionID. In other words, if PS10000B has a cr.ReferencedCollectionID of PS10000C, and PS10000C has a cr.ReferencedCollectionID of SMS00002, and SMS00002 has no cr.ReferencedCollectionID, then SMS00002 is the top level cr.ReferencedCollectionID, and both PS10000B and PS10000C should have da.DiscArchKey(s) equal to those of SMS00002.
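The chain walk described above (follow cr.ReferencedCollectionID until reaching a collection that has real rule types) can be sketched with a recursive CTE. This is only an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 module rather than on SQL Server, the tables are cut down to the columns needed, and the keys 3 and 4 and the names Table_A and Table_B are made-up sample values. Because the result comes from joins rather than a scalar subquery inside CASE, several rows per collection are returned without raising an error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical versions of the real tables.
CREATE TABLE collection_rules (
    collectionid TEXT,
    architecturekey INTEGER,
    referencedcollectionid TEXT
);
CREATE TABLE discoveryarchitectures (discarchkey INTEGER, basetablename TEXT);
INSERT INTO discoveryarchitectures VALUES (3, 'Table_A'), (4, 'Table_B');
INSERT INTO collection_rules VALUES
    ('SMS00002', 3, NULL),
    ('SMS00002', 4, NULL),
    ('PS10000C', 0, 'SMS00002'),
    ('PS10000B', 0, 'PS10000C');
""")

rows = conn.execute("""
WITH RECURSIVE chain(collectionid, ancestor) AS (
    -- Start from every include/exclude rule (architecturekey = 0).
    SELECT collectionid, referencedcollectionid
    FROM collection_rules
    WHERE architecturekey = 0
    UNION
    -- Keep following the reference while the ancestor itself only borrows.
    SELECT c.collectionid, cr.referencedcollectionid
    FROM chain c
    JOIN collection_rules cr ON cr.collectionid = c.ancestor
    WHERE cr.architecturekey = 0
)
SELECT DISTINCT c.collectionid, da.basetablename
FROM chain c
JOIN collection_rules cr
  ON cr.collectionid = c.ancestor AND cr.architecturekey <> 0
JOIN discoveryarchitectures da ON da.discarchkey = cr.architecturekey
ORDER BY c.collectionid, da.basetablename
""").fetchall()

print(rows)
# → [('PS10000B', 'Table_A'), ('PS10000B', 'Table_B'),
#    ('PS10000C', 'Table_A'), ('PS10000C', 'Table_B')]
```

SQL Server spells recursive CTEs slightly differently (no RECURSIVE keyword, and UNION ALL between the anchor and recursive parts), so the query would need adapting before use there.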

Please have a look at a weird solution that comes to mind. You may run into some syntax errors (most probably in the 2nd and 3rd CTEs), but it is just an idea.
Get each case values in separate CTEs and then combine them at the end.
;WITH CTE
AS
(
SELECT CG.SITEID,
CR.COLLECTIONID,
CG.COLLECTIONNAME,
CR.ARCHITECTUREKEY
FROM COLLECTIONS_G CG
JOIN COLLECTIONS_L CL ON CG.COLLECTIONID=CL.COLLECTIONID
JOIN COLLECTION_RULES CR ON CG.COLLECTIONID=CR.COLLECTIONID
WHERE (CG.FLAGS&4)=4 AND CL.CURRENTSTATUS!=5
),
ARCHITECTUREKEY5
AS
(
SELECT C.SITEID,
C.COLLECTIONID,
C.COLLECTIONNAME,
N'vSMS_R_System' as TABLENAME
FROM CTE C WHERE C.ARCHITECTUREKEY = 5
),
ARCHITECTUREKEY0
AS
(
SELECT C.SITEID,
C.COLLECTIONID,
C.COLLECTIONNAME,
BASETABLENAME as TABLENAME
FROM CTE C,
DISCOVERYARCHITECTURES
JOIN
COLLECTION_RULES
ON DISCOVERYARCHITECTURES.DISCARCHKEY =
COLLECTION_RULES.ARCHITECTUREKEY
JOIN
COLLECTIONS_G
ON COLLECTION_RULES.COLLECTIONID =
COLLECTIONS_G.COLLECTIONID
WHERE COLLECTIONS_G.SITEID = (SELECT TOP 1 SOURCECOLLECTIONID FROM VCOLLECTIONDEPENDENCYCHAIN WHERE DEPENDENTCOLLECTIONID = C.SITEID ORDER BY LEVEL DESC)
and C.ARCHITECTUREKEY = 0
),
ARCHITECTUREKEYOTHER
AS
(
SELECT C.SITEID,
C.COLLECTIONID,
C.COLLECTIONNAME,
DA.BASETABLENAME as TABLENAME
FROM DISCOVERYARCHITECTURES DA, CTE C WHERE DA.DISCARCHKEY=C.ARCHITECTUREKEY AND C.ARCHITECTUREKEY not in (0,5)
)
Select * from ARCHITECTUREKEY5
UNION
Select * from ARCHITECTUREKEY0
UNION
Select * from ARCHITECTUREKEYOTHER

Related

Should I use an SQL full outer join for this?

Consider the following tables:
Table A:
DOC_NUM
DOC_TYPE
RELATED_DOC_NUM
NEXT_STATUS
...
Table B:
DOC_NUM
DOC_TYPE
RELATED_DOC_NUM
NEXT_STATUS
...
The DOC_TYPE and NEXT_STATUS columns have different meanings between the two tables, although a NEXT_STATUS = 999 means "closed" in both. Also, under certain conditions, there will be a record in each table, with a reference to a corresponding entry in the other table (i.e. the RELATED_DOC_NUM columns).
I am trying to create a query that will get data from both tables that meet the following conditions:
A.RELATED_DOC_NUM = B.DOC_NUM
A.DOC_TYPE = "ST"
B.DOC_TYPE = "OT"
A.NEXT_STATUS < 999 OR B.NEXT_STATUS < 999
A.DOC_TYPE = "ST" represents a transfer order to transfer inventory from one plant to another. B.DOC_TYPE = "OT" represents a corresponding receipt of the transferred inventory at the receiving plant.
We want to get records from either table where there is an ST/OT pair where either or both entries are not closed (i.e. NEXT_STATUS < 999).
I am assuming that I need to use a FULL OUTER join to accomplish this. If this is the wrong assumption, please let me know what I should be doing instead.
UPDATE (11/30/2021):
I believe that @Caius Jard is correct in that this does not need to be an outer join. There should always be an ST/OT pair.
With that I have written my query as follows:
SELECT <columns>
FROM A LEFT JOIN B
ON
A.RELATED_DOC_NUM = B.DOC_NUM
WHERE
A.DOC_TYPE IN ('ST') AND
B.DOC_TYPE IN ('OT') AND
(A.NEXT_STATUS < 999 OR B.NEXT_STATUS < 999)
Does this make sense?
UPDATE 2 (11/30/2021):
The reality is that these are DB2 database tables being used by the JD Edwards ERP application. The only way I know of to see the table definitions is by using the web site http://www.jdetables.com/, entering the table ID and hitting return to run the search. It comes back with a ton of information about the table and its columns.
Table A is really F4211 and table B is really F4311.
Right now, I've simplified the query to keep it simple and keep variables to a minimum. This is what I have currently:
SELECT CAST(F4211.SDDOCO AS VARCHAR(8)) AS SO_NUM,
F4211.SDRORN AS RELATED_PO,
F4211.SDDCTO AS SO_DOC_TYPE,
F4211.SDNXTR AS SO_NEXT_STATUS,
CAST(F4311.PDDOCO AS VARCHAR(8)) AS PO_NUM,
F4311.PDRORN AS RELATED_SO,
F4311.PDDCTO AS PO_DOC_TYPE,
F4311.PDNXTR AS PO_NEXT_STATUS
FROM PROD2DTA.F4211 AS F4211
INNER JOIN PROD2DTA.F4311 AS F4311
ON F4211.SDRORN = CAST(F4311.PDDOCO AS VARCHAR(8))
WHERE F4211.SDDCTO IN ( 'ST' )
AND F4311.PDDCTO IN ( 'OT' )
The other part of the story is that I'm using a reporting package that allows you to define "virtual" views of the data. Virtual views allow the report developer to specify the SQL to use. This is the application where I am using the SQL. When I set up the SQL, there is a validation step that must be performed. It will return a limited set of results if the SQL is validated.
When I enter the query above and validate it, it says that there are no results, which makes no sense. I'm guessing the data casting is causing the issue, but not sure.
UPDATE 3 (11/30/2021):
One more twist to the story. The related doc number is not only defined as a string value, but it contains leading zeros. This is true in both tables. The main doc number (in both tables) is defined as a numeric value and therefore has no leading zeros. I have no idea why those who developed JDE would have done this, but that is what is there.
So, there are matching records between the two tables that meet the criteria, but I think I'm getting no results because when I convert the numeric to a string, it does not match, because one value is, say "12345", while the other is "00012345".
Can I pad the numeric -> string value with zeros before doing the equals check?
UPDATE 4 (12/2/2021):
Was able to finally get the query to work by converting the numeric doc num to a left zero padded string.
SELECT <columns>
FROM PROD2DTA.F4211 AS F4211
INNER JOIN PROD2DTA.F4311 AS F4311
ON F4211.SDRORN = RIGHT(CONCAT('00000000', CAST(F4311.PDDOCO AS VARCHAR(8))), 8)
WHERE F4211.SDDCTO IN ( 'ST' )
AND F4311.PDDCTO IN ( 'OT' )
AND ( F4211.SDNXTR < 999
OR F4311.PDNXTR < 999 )
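The effect of the padding can be reproduced on a small scale. The sketch below is an assumption-laden stand-in, not the DB2 setup: it uses SQLite through Python's sqlite3 module, keeps only four columns per table, and the document numbers and status values are invented. SQLite's substr(..., -8) plays the role of RIGHT(..., 8) and || replaces CONCAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE f4211 (sddoco INTEGER, sdrorn TEXT, sddcto TEXT, sdnxtr INTEGER);
CREATE TABLE f4311 (pddoco INTEGER, pdrorn TEXT, pddcto TEXT, pdnxtr INTEGER);
-- Hypothetical rows: one open ST/OT pair and one fully closed pair.
INSERT INTO f4211 VALUES (500, '00012345', 'ST', 560), (501, '00012346', 'ST', 999);
INSERT INTO f4311 VALUES (12345, '00000500', 'OT', 999), (12346, '00000501', 'OT', 999);
""")

QUERY = """
SELECT f4211.sddoco, f4311.pddoco
FROM f4211
INNER JOIN f4311 ON f4211.sdrorn = {po_key}
WHERE f4211.sddcto = 'ST'
  AND f4311.pddcto = 'OT'
  AND (f4211.sdnxtr < 999 OR f4311.pdnxtr < 999)
"""

# Plain numeric-to-text cast: '00012345' never equals '12345'.
unpadded = conn.execute(
    QUERY.format(po_key="CAST(f4311.pddoco AS TEXT)")
).fetchall()

# Left-pad with zeros, then keep the last 8 characters.
padded = conn.execute(
    QUERY.format(po_key="substr('00000000' || CAST(f4311.pddoco AS TEXT), -8)")
).fetchall()

print(unpadded)  # → []
print(padded)    # → [(500, 12345)]
```

The closed pair (501 / 12346) joins correctly once padded but is then dropped by the status condition, which is the intended behaviour.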
You should write your query as follows:
SELECT <columns>
FROM A INNER JOIN B
ON
A.RELATED_DOC_NUM = B.DOC_NUM
WHERE
A.DOC_TYPE IN ('ST') AND
B.DOC_TYPE IN ('OT') AND
(A.NEXT_STATUS < 999 OR B.NEXT_STATUS < 999)
A LEFT JOIN is a type of OUTER join (LEFT JOIN is typically a contraction of LEFT OUTER JOIN). OUTER means "one side might have nulls in every column because there was no match". Most critically, the code as posted in the question (a LEFT JOIN followed by WHERE some_column_from_the_right_table = some_value) runs as an INNER join, because any NULLs introduced by the LEFT OUTER process are then quashed by the WHERE clause.
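A small check of that claim, as a sketch only: the tables A and B are reduced to four columns, the rows are invented, and SQLite (via Python's sqlite3) stands in for the real database. The helper doc_pairs is a hypothetical name used just for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (doc_num INTEGER, doc_type TEXT, related_doc_num INTEGER, next_status INTEGER);
CREATE TABLE b (doc_num INTEGER, doc_type TEXT, related_doc_num INTEGER, next_status INTEGER);
-- Row 2 in A points at a document that does not exist in B.
INSERT INTO a VALUES (1, 'ST', 10, 100), (2, 'ST', 99, 100);
INSERT INTO b VALUES (10, 'OT', 1, 999);
""")

def doc_pairs(join_type, where):
    """Run the same join with a given join type and optional WHERE clause."""
    sql = f"""
        SELECT a.doc_num, b.doc_num
        FROM a {join_type} JOIN b ON a.related_doc_num = b.doc_num
        {where}
        ORDER BY a.doc_num
    """
    return conn.execute(sql).fetchall()

left_plain = doc_pairs("LEFT", "")
left_filtered = doc_pairs("LEFT", "WHERE b.doc_type = 'OT'")
inner_filtered = doc_pairs("INNER", "WHERE b.doc_type = 'OT'")

print(left_plain)      # → [(1, 10), (2, None)]
print(left_filtered)   # → [(1, 10)]
print(inner_filtered)  # → [(1, 10)]
```

The unmatched row survives the LEFT JOIN with NULLs for B, but b.doc_type = 'OT' is not true for NULL, so the filter removes it and the result equals the INNER JOIN.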
See Update 4 for details of how I resolved the "data conversion or mapping" error.

How to cast only the part of a table using a single SQL command in PostgreSQL

In a PostgreSQL table I have several pieces of information stored as text. What kind of information is stored depends on the context described by a type column. The application is designed to fetch the IDs of the rows with a single command.
I ran into trouble when I tried to compare the information (a bigint stored as a string) with an external value (e.g. '9' > '11'). When I tried to cast the column, the database returned an error (not all values in the column are castable, e.g. datetime or normal text). I also get a cast error when I try to cast only the result of a query.
I get the table with the castable rows by this command:
SELECT information.id as id, item.information::bigint as item
FROM information
INNER JOIN item
ON information.id = item.informationid
WHERE information.type = 'task'
The resulting rows contain only text that is castable. But when I wrap it in another command, it results in an error.
SELECT x.id FROM (
SELECT information.id as id, item.information::bigint as item
FROM information
INNER JOIN item
ON information.id = item.informationid
WHERE information.type = 'task'
) AS x
WHERE x.item > '0'::bigint
According to the error, the database tried to cast all rows in the table.
Technically, this happens because the optimizer thinks WHERE x.item > '0'::bigint is a much more efficient filter than information.type = 'task'. So in the table scan, the WHERE x.item > '0'::bigint condition is chosen as the predicate. This reasoning is not wrong, but it leads to this seemingly illogical error.
The suggestion by Gordon to use CASE WHEN inf.type = 'task' THEN i.information::bigint END avoids this, but it may sometimes defeat the purpose of wrapping the query in a sub-query, and it requires the same condition to be written twice.
A funny trick I tried is to use OUTER APPLY:
SELECT x.* FROM (SELECT 1 AS dummy) dummy
OUTER APPLY (
SELECT information.id as id, item.information::bigint AS item
FROM information
INNER JOIN item
ON information.id = item.informationid
WHERE information.type = 'task'
) x
WHERE x.item > '0'::bigint
Sorry that I only verified the SQL Server version of this. I understand PostgreSQL has no OUTER APPLY, but the equivalent should be:
SELECT x.* FROM (SELECT 1 AS dummy) dummy
LEFT JOIN LATERAL (
SELECT information.id as id, item.information::bigint AS item
FROM information
INNER JOIN item
ON information.id = item.informationid
WHERE information.type = 'task'
) x ON true
WHERE x.item > '0'::bigint
(reference is this question)
Finally, a tidier but less flexible method is to add an optimizer hint that forces the optimizer to run the query the way it is written.
This is unfortunate. Try using a case expression:
SELECT inf.id as id,
(CASE WHEN inf.type = 'task' THEN i.information::bigint END) as item
FROM information inf JOIN
item i
ON inf.id = i.informationid
WHERE inf.type = 'task';
There is no guarantee that the WHERE filter is applied before the SELECT. However, CASE does guarantee the order of evaluation, so it is safe.
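The guarding effect of CASE can be sketched outside PostgreSQL. Everything here is an assumption for illustration: SQLite (through Python's sqlite3) stands in for PostgreSQL, and because SQLite's built-in CAST never raises, a user-defined function strict_bigint (a hypothetical name) mimics a cast that fails on non-numeric text. The sample rows are invented. Note that the inner query deliberately has no WHERE filter, so the CASE alone has to protect the conversion.

```python
import sqlite3

def strict_bigint(text):
    # Stand-in for ::bigint: raises ValueError on non-numeric text.
    return int(text)

conn = sqlite3.connect(":memory:")
conn.create_function("strict_bigint", 1, strict_bigint)
conn.executescript("""
CREATE TABLE information (id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE item (informationid INTEGER, information TEXT);
INSERT INTO information VALUES (1, 'task'), (2, 'task'), (3, 'note'), (4, 'date');
INSERT INTO item VALUES (1, '9'), (2, '11'), (3, 'hello'), (4, '2021-11-30');
""")

# Unguarded conversion of non-numeric text fails.
try:
    conn.execute("SELECT strict_bigint('hello')").fetchall()
    cast_failed = False
except sqlite3.OperationalError:
    cast_failed = True

# Guarded conversion: the function is only reached for 'task' rows.
guarded = conn.execute("""
SELECT x.id, x.item FROM (
    SELECT inf.id AS id,
           CASE WHEN inf.type = 'task' THEN strict_bigint(i.information) END AS item
    FROM information inf
    JOIN item i ON inf.id = i.informationid
) AS x
WHERE x.item > 9
ORDER BY x.id
""").fetchall()

print(cast_failed)  # → True
print(guarded)      # → [(2, 11)]
```

Rows that are not tasks get NULL for item, and NULL > 9 is not true, so they drop out without ever being converted. The comparison is also numeric now: 11 > 9 holds, whereas as text '11' sorts before '9'.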

Update field where the items are part of a subquery

I am testing some software and need to make some adjustments to the fields manually. For all items that are produced at factory A, the lead time needs to be adjusted at the other factories where those items are produced. However, the other items at the other factories need the normal lead time.
I have the query to select the items that are produced at the alternate factories. I've tried using UPDATE ... WHERE EXISTS with that query as the subquery, but I can't seem to get it to work as I feel it should:
update newgdmoperation
set newgdmoperation.productionoffset = 75
where exists
(
select
newgdmoperation.operationid
from newgdmoperation
right join
(
select mainproductid,productionoffset
from newgdmoperation
where fromlocationid = 'KR'
and transporttype like 'Ves%'
) a
on newgdmoperation.mainproductid = a.mainproductid
where fromlocationid <> 'KR'
and transporttype like 'Ves%'
)
This doesn't give any errors. However, it updates the field for all items.
The subquery under the where clause does in fact return the operationid (unique id) for the items that need to be updated. I was expecting that with the where exists, that only the items in the subquery would be updated while the rest would be left untouched.
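Assuming that you're trying to update the NEWGDMOPERATION table, it looks to me like you should use IN rather than EXISTS. The EXISTS subquery never refers to the row being updated (the newgdmoperation inside it is a separate reference to the table), so it is true for every row as long as the subquery returns anything. Your statement should be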
UPDATE NEWGDMOPERATION g
SET g.PRODUCTIONOFFSET = 75
WHERE g.OPERATIONID IN (SELECT g2.OPERATIONID
FROM NEWGDMOPERATION g2
RIGHT JOIN (SELECT g3.MAINPRODUCTID,
g3.PRODUCTIONOFFSET
FROM NEWGDMOPERATION g3
WHERE g3.FROMLOCATIONID = 'KR' AND
g3.TRANSPORTTYPE LIKE 'Ves%') a
ON g2.MAINPRODUCTID = a.MAINPRODUCTID
WHERE g2.FROMLOCATIONID <> 'KR' AND
g2.TRANSPORTTYPE LIKE 'Ves%')
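The difference between the two forms can be seen on a toy table. This is a sketch under assumptions: SQLite via Python's sqlite3 replaces the original database, the table ops is a cut-down, hypothetical version of NEWGDMOPERATION (the transport type filter is left out), and the three rows are invented. Only operation 2 should change, since product P1 is also made at 'KR' and operation 2 is its non-KR row.

```python
import sqlite3

# Subquery shared by both statements: non-KR operations for products
# that are also produced at 'KR'.
SUBQUERY = """
    SELECT o.operationid
    FROM ops o
    JOIN (SELECT mainproductid FROM ops WHERE fromlocationid = 'KR') a
      ON o.mainproductid = a.mainproductid
    WHERE o.fromlocationid <> 'KR'
"""

def run_update(where_clause):
    """Apply the update to a fresh copy of the sample data and return all rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE ops (operationid INTEGER, mainproductid TEXT,
                      fromlocationid TEXT, productionoffset INTEGER);
    INSERT INTO ops VALUES (1, 'P1', 'KR', 10), (2, 'P1', 'US', 10), (3, 'P2', 'US', 10);
    """)
    conn.execute(f"UPDATE ops SET productionoffset = 75 WHERE {where_clause}")
    return conn.execute(
        "SELECT operationid, productionoffset FROM ops ORDER BY operationid"
    ).fetchall()

with_exists = run_update(f"EXISTS ({SUBQUERY})")
with_in = run_update(f"operationid IN ({SUBQUERY})")

print(with_exists)  # → [(1, 75), (2, 75), (3, 75)]
print(with_in)      # → [(1, 10), (2, 75), (3, 10)]
```

The EXISTS subquery is not correlated with the row being updated, so it is either true for all rows or for none. IN compares each row's operationid against the subquery result, which is what restricts the update.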

conditional IIF in a JOIN

I have the following database:
Table Bill:
Table Bill_Details:
And Table Type:
I want a query to show this result:
The query so far goes like this:
SELECT
Bill.Id_Bill,
Type.Id_Type,
Type.Info,
Bill_Details.Deb,
Bill_Details.Cre,
Bill.NIT,
Bill.Date2,
Bill.Comt
FROM Type
RIGHT JOIN (Bill INNER JOIN Bill_Details
ON Bill.Id_Bill = Bill_Details.Id_Bill)
ON Type.Id_Type = Bill_Details.Id_Type
ORDER BY Bill.Id_Bill, Type.Id_Type;
With this result:
I'm not sure how to deal or how to include this:
Type.600,
Type."TOTAL",
IIF(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre) >= 0, ABS(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre)), "" ),
IIF(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre) <= 0, ABS(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre)), "" )
The previous code is responsible for including new data in some fields, since all of the other fields will carry the same data as the row above. I'd appreciate some suggestions to accomplish this.
Here is a revised version of the UNION which you removed from the question. The original query was a good start, but you just did not provide sufficient details about the error or problem you were experiencing. My comments were not meant to have you remove the problem query, only that you needed to provide more details about the error or problem. In the future, if you have a UNION, make sure that each query of the UNION works separately. Then you can debug problems more easily, one step at a time.
Problems which I corrected in the second query of the UNION:
Removed reference to table [Type] in the query, since it was not part of the FROM clause. Instead, I replaced it with a literal value.
Fixed FROM clause to join both [Bill] and [Bill_Details] tables. You had fields from both tables, so why would you not join on them just like in the first query of the UNION?
Grouped on all fields from table [Bill] referenced in the SELECT clause. You must either group on all fields, or include them in aggregate expressions like Sum() or First(), etc.
Replaced empty strings with Nulls for the False cases on Iif() statements.
SELECT
Bill.Id_Bill, Type.Id_Type, Type.Info,
Bill_Details.Deb,
Bill_Details.Cre,
Bill.NIT, Bill.Date2, Bill.Comt
FROM
Type RIGHT JOIN (Bill INNER JOIN Bill_Details
ON Bill.Id_Bill = Bill_Details.Id_Bill)
ON Type.Id_Type = Bill_Details.Id_Type
UNION
SELECT
Bill.Id_Bill, 600 As Id_Type, "TOTAL" As Info,
IIF(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre) >= 0, ABS(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre)), Null ) As Deb,
IIF(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre) <= 0, ABS(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre)), Null ) As Cre,
Bill.NIT, Bill.Date2, Bill.Comt
FROM Bill INNER JOIN Bill_Details
ON Bill.Id_Bill = Bill_Details.Id_Bill
GROUP BY Bill.Id_Bill, Bill.NIT, Bill.Date2, Bill.Comt;
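The shape of that UNION (detail rows plus one computed TOTAL row per bill) can be tried on a tiny data set. This is a sketch with assumptions: it runs on SQLite through Python's sqlite3 rather than Access, CASE WHEN replaces IIF, the Type table is renamed bill_type, only a few columns are kept, and all values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bill (id_bill INTEGER, nit TEXT);
CREATE TABLE bill_type (id_type INTEGER, info TEXT);
CREATE TABLE bill_details (id_bill INTEGER, id_type INTEGER, deb INTEGER, cre INTEGER);
INSERT INTO bill VALUES (1, 'N1');
INSERT INTO bill_type VALUES (100, 'Cash'), (200, 'Bank');
INSERT INTO bill_details VALUES (1, 100, 50, 0), (1, 200, 0, 20);
""")

rows = conn.execute("""
-- Detail rows, one per bill line.
SELECT b.id_bill, d.id_type, t.info, d.deb, d.cre
FROM bill b
JOIN bill_details d ON b.id_bill = d.id_bill
LEFT JOIN bill_type t ON t.id_type = d.id_type
UNION ALL
-- One TOTAL row per bill; the balance lands on the debit or credit side.
SELECT b.id_bill, 600, 'TOTAL',
       CASE WHEN SUM(d.deb) - SUM(d.cre) >= 0 THEN ABS(SUM(d.deb) - SUM(d.cre)) END,
       CASE WHEN SUM(d.deb) - SUM(d.cre) <= 0 THEN ABS(SUM(d.deb) - SUM(d.cre)) END
FROM bill b
JOIN bill_details d ON b.id_bill = d.id_bill
GROUP BY b.id_bill
ORDER BY 1, 2
""").fetchall()

for row in rows:
    print(row)
# → (1, 100, 'Cash', 50, 0)
# → (1, 200, 'Bank', 0, 20)
# → (1, 600, 'TOTAL', 30, None)
```

Debits of 50 against credits of 20 leave a balance of 30 on the debit side, and the credit column of the TOTAL row stays NULL, matching the Null used for the false branch in the revised query.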

Two almost identical queries returning different results

I am getting different results for the following two queries and I have no idea why. The only difference is one has an IN and one has an equals.
Before I go into the queries, you should know that I found a better way to do it by moving the subquery into a common table expression, but this is still driving me crazy! I really want to know what caused the issue in the first place; I am asking out of curiosity.
Here's the first query:
use [DB.90_39733]
Select distinct x.uniqproducer, cn.Firstname,cn.lastname,e.code,
ecn.FirstName, ecn.LastName, ecn.entid, x.uniqline
from product x
join employ e on e.EmpID=x.uniqproducer
join contactname cn on cn.uniqentity=e.uniqentity
join [ETL_GAWR92]..idlookupentity ide on ide.enttype='EM'
and ide.UniqEntity=e.UniqEntity
left join [ETL_GAWR92]..EntConName ecn on ecn.entid=ide.empid
and ecn.opt='Y'
Where x.UniqProducer =(SELECT TOP 1 idl.UniqEntity
FROM [ETL_GAWR92]..IDLookupEntity idl
LEFT JOIN [ETL_GAWR92]..Employ e2 ON e2.ProdID = ''
WHERE idl.empID = e2.EmpID AND
idl.EntType = 'EM')
And the second one:
use [DB.90_39733]
Select distinct x.uniqproducer, cn.Firstname,cn.lastname,e.code,
ecn.FirstName, ecn.LastName, ecn.entid, x.uniqline
from product x
join employ e on e.EmpID=x.uniqproducer
join contactname cn on cn.uniqentity=e.uniqentity
join [ETL_GAWR92]..idlookupentity ide on ide.enttype='EM'
and ide.UniqEntity=e.UniqEntity
left join [ETL_GAWR92]..EntConName ecn on ecn.entid=ide.empid
and ecn.opt='Y'
Where x.UniqProducer IN (SELECT TOP 1 idl.UniqEntity
FROM [ETL_GAWR92]..IDLookupEntity idl
LEFT JOIN [ETL_GAWR92]..Employ e2 ON e2.ProdID = ''
WHERE idl.empID = e2.EmpID AND
idl.EntType = 'EM')
The first query returns 0 rows while the second query returns 2 rows. The only difference is x.UniqProducer = versus x.UniqProducer IN in the last WHERE clause.
Thanks for your time
SELECT TOP 1 doesn't guarantee that the same record will be returned each time.
Add an ORDER BY to your select to make sure the same record is returned.
(SELECT TOP 1 idl.UniqEntity
FROM [ETL_GAWR92]..IDLookupEntity idl
LEFT JOIN [ETL_GAWR92]..Employ e2 ON e2.ProdID = ''
WHERE idl.empID = e2.EmpID AND
idl.EntType = 'EM' ORDER BY idl.UniqEntity)
I would guess (with strong emphasis on the word “guess”) that the reason lies in how equals and IN are processed by the query engine. For equals, SQL knows it needs to do a comparison with a specific value, whereas for IN, SQL knows it needs to build a subset and find out whether the "outer" value is in that "inner" subset. Yes, the end results should be the same as there’s only 1 row returned by the subquery, but as @RickS pointed out, without any ordering there’s no guarantee of which value ends up “on top”, and the (sub)query plan used to build the IN-driven subquery might differ from that used by the equals comparison.
A follow-up question: which is the correct dataset? When you analyze the actual data, should you have gotten zero, two, or a different number of rows?