Coalescing a subset to its parent set - sql

I have two tables, A and B. B is a random subset of A but with some values that override the default values in A. How do I join the two tables to coalesce their values?
A
1, 0
2, 0
3, 0
4, 0
B
2, 10
3, 11
Output
1, 0
2, 10
3, 11
4, 0
Here is my actual query - I thought I could do this with LEFT OUTER JOIN, but this restricts the Output set to the intersection of A and B rows. I need all A rows to return, coalesced with the relevant B rows.
SELECT A.factor, A.categorical_value, coalesce(A.positive, B.positive), coalesce(A.negative, B.negative)
FROM features A
LEFT OUTER JOIN profiles B ON (A.factor=B.factor AND A.categorical_value=B.categorical_value)
WHERE B.uuid='9e5083da74305628336631da9d2903e3'
As Craig Ringer points out below, I am inadvertently restricting A with my B clause. But then how do I do this? Table A is a many-to-many table of profile attributes, where uuid indicates the user id. Table B is a master list of all possible profile attributes. I want the query to return the master list with the an individual profile superimposed on to it.

After question update:
As #Craig already informed you, a WHERE condition on B would only select matching rows in B and act like a [INNER] JOIN instead of a LEFT [OUTER] JOIN.
You need to pull that WHERE condition up into the condition of the LEFT JOIN.
While being at it, I opportunistically simplified with USING, since all joining column names are identical. Details in the manual.
Information is still incomplete and contradicting. Here is another educated guess:
SELECT a.factor, a.categorical_value
, COALESCE(a.positive, b.positive) AS positive
, COALESCE(a.negative, b.negative) AS negative
FROM features a
LEFT JOIN profiles b USING (factor, categorical_value, uuid)
WHERE a.uuid='9e5083da74305628336631da9d2903e3'
Not sure if you need to join on udid, too.
Your example would indicate COALESCE(b.positive, a.positive). Something does not add up ...
More updates in comment
Adapt your JOIN condition then:
SELECT a.factor, a.categorical_value
, COALESCE(a.positive, b.positive) AS positive
, COALESCE(a.negative, b.negative) AS negative
FROM features a
LEFT JOIN profiles b ON a.factor = b.factor
AND a.categorical_value = b.categorical_value
AND b.uuid='9e5083da74305628336631da9d2903e3';

Related

How to join 2 tables with multiple conditions each?

I have data split between 2 tables, and need to join the necessary data together for analysis.
One table Test 3 Output contains ID numbers, and the return value of the test. The other table Test Results contains the same IDs, along with their corresponding serial number and overall test result.
I need to combine these into a single table that just displays ID, serial number and test value.
Sorry in advance for the horrible SQL thats about to follow, I'm brand new to this.
I have 2 working queries that give me what I want, but I can't seem to join them together.
The first query:
select `ID`,`Serial Number` from `Test Results`t where (len(`Serial Number`)=16 and FailMode = '24V Supply FAIL')
This gets me the ID and serial number of all the tests that failed '24V supply'. It also filters out garbage serial numbers as the correct ones should have 16 digits.
The second query:
select `ID` from `Test 3 Output`o where o.`24V Supply (V)`<30
This gets me the ID and test results, and filters out some results that were greater than 30V. Note that '24V Supply(V) is the name of the column containing the test results.
Now when I try to join these with the ID, I get a syntax error. Here's what I tried:
select `ID`,`Serial Number`
from `Test Results`t
where (len(`Serial Number`)=16 and FailMode = '24V Supply FAIL')
left join (`Test 3 Output`o ON t.`ID` = o.`ID` where o.`24V Supply (V)`<30)
This gives the error:
Error: Syntax error (missing operator) in query expression (len(`Serial Number`)=16 and FailMode = '24V Supply FAIL') left join (`Test 3 Output`o ON t.`ID` = o.`ID` where o.`24V Supply (V)`<30)
I'm not sure what operator I'm missing but I had a feeling its related to the fact there's two where statements?
Can anyone offer some help?
Edit: I found a workaround since I can't use 2 where clauses with a join. I created 2 views with my 2 separate queries, and performed the join on those which got me what I wanted. I'd still like to hear a proper way of doing it though :)
You can join 2 subqueries like this:
SELECT q1.a, q1.b, q2.c
FROM (
(SELECT a, b FROM table1
WHERE b > 10) AS q1
LEFT JOIN
(SELECT a, c FROM table2
WHERE c > 20) AS q2
ON q1.a = q2.a
)
Doing the subqueries as separate query objects is easier to debug, but the query objects keep piling up...

Should I use an SQL full outer join for this?

Consider the following tables:
Table A:
DOC_NUM
DOC_TYPE
RELATED_DOC_NUM
NEXT_STATUS
...
Table B:
DOC_NUM
DOC_TYPE
RELATED_DOC_NUM
NEXT_STATUS
...
The DOC_TYPE and NEXT_STATUS columns have different meanings between the two tables, although a NEXT_STATUS = 999 means "closed" in both. Also, under certain conditions, there will be a record in each table, with a reference to a corresponding entry in the other table (i.e. the RELATED_DOC_NUM columns).
I am trying to create a query that will get data from both tables that meet the following conditions:
A.RELATED_DOC_NUM = B.DOC_NUM
A.DOC_TYPE = "ST"
B.DOC_TYPE = "OT"
A.NEXT_STATUS < 999 OR B.NEXT_STATUS < 999
A.DOC_TYPE = "ST" represents a transfer order to transfer inventory from one plant to another. B.DOC_TYPE = "OT" represents a corresponding receipt of the transferred inventory at the receiving plant.
We want to get records from either table where there is an ST/OT pair where either or both entries are not closed (i.e. NEXT_STATUS < 999).
I am assuming that I need to use a FULL OUTER join to accomplish this. If this is the wrong assumption, please let me know what I should be doing instead.
UPDATE (11/30/2021):
I believe that #Caius Jard is correct in that this does not need to be an outer join. There should always be an ST/OT pair.
With that I have written my query as follows:
SELECT <columns>
FROM A LEFT JOIN B
ON
A.RELATED_DOC_NUM = B.DOC_NUM
WHERE
A.DOC_TYPE IN ('ST') AND
B.DOC_TYPE IN ('OT') AND
(A.NEXT_STATUS < 999 OR B.NEXT_STATUS < 999)
Does this make sense?
UPDATE 2 (11/30/2021):
The reality is that these are DB2 database tables being used by the JD Edwards ERP application. The only way I know of to see the table definitions is by using the web site http://www.jdetables.com/, entering the table ID and hitting return to run the search. It comes back with a ton of information about the table and its columns.
Table A is really F4211 and table B is really F4311.
Right now, I've simplified the query to keep it simple and keep variables to a minimum. This is what I have currently:
SELECT CAST(F4211.SDDOCO AS VARCHAR(8)) AS SO_NUM,
F4211.SDRORN AS RELATED_PO,
F4211.SDDCTO AS SO_DOC_TYPE,
F4211.SDNXTR AS SO_NEXT_STATUS,
CAST(F4311.PDDOCO AS VARCHAR(8)) AS PO_NUM,
F4311.PDRORN AS RELATED_SO,
F4311.PDDCTO AS PO_DOC_TYPE,
F4311.PDNXTR AS PO_NEXT_STATUS
FROM PROD2DTA.F4211 AS F4211
INNER JOIN PROD2DTA.F4311 AS F4311
ON F4211.SDRORN = CAST(F4311.PDDOCO AS VARCHAR(8))
WHERE F4211.SDDCTO IN ( 'ST' )
AND F4311.PDDCTO IN ( 'OT' )
The other part of the story is that I'm using a reporting package that allows you to define "virtual" views of the data. Virtual views allow the report developer to specify the SQL to use. This is the application where I am using the SQL. When I set up the SQL, there is a validation step that must be performed. It will return a limited set of results if the SQL is validated.
When I enter the query above and validate it, it says that there are no results, which makes no sense. I'm guessing the data casting is causing the issue, but not sure.
UPDATE 3 (11/30/2021):
One more twist to the story. The related doc number is not only defined as a string value, but it contains leading zeros. This is true in both tables. The main doc number (in both tables) is defined as a numeric value and therefore has no leading zeros. I have no idea why those who developed JDE would have done this, but that is what is there.
So, there are matching records between the two tables that meet the criteria, but I think I'm getting no results because when I convert the numeric to a string, it does not match, because one value is, say "12345", while the other is "00012345".
Can I pad the numeric -> string value with zeros before doing the equals check?
UPDATE 4 (12/2/2021):
Was able to finally get the query to work by converting the numeric doc num to a left zero padded string.
SELECT <columns>
FROM PROD2DTA.F4211 AS F4211
INNER JOIN PROD2DTA.F4311 AS F4311
ON F4211.SDRORN = RIGHT(CONCAT('00000000', CAST(F4311.PDDOCO AS VARCHAR(8))), 8)
WHERE F4211.SDDCTO IN ( 'ST' )
AND F4311.PDDCTO IN ( 'OT' )
AND ( F4211.SDNXTR < 999
OR F4311.PDNXTR < 999 )
You should write your query as follows:
SELECT <columns>
FROM A INNER JOIN B
ON
A.RELATED_DOC_NUM = B.DOC_NUM
WHERE
A.DOC_TYPE IN ('ST') AND
B.DOC_TYPE IN ('OT') AND
(A.NEXT_STATUS < 999 OR B.NEXT_STATUS < 999)
LEFT join is a type of OUTER join; LEFT JOIN is typically a contraction of LEFT OUTER JOIN). OUTER means "one side might have nulls in every column because there was no match". Most critically, the code as posted in the question (with a LEFT JOIN, but then has WHERE some_column_from_the_right_table = some_value) runs as an INNER join, because any NULLs inserted by the LEFT OUTER process, are then quashed by the WHERE clause
See Update 4 for details of how I resolved the "data conversion or mapping" error.

Conditional join that changes number of join conditions

I am trying to join data based on the following scenario.
Let's say there are two businesses. Business 1 has one field for customer data, business 2 has two fields. I need to join to multiple other tables using these customer fields.
I would like to create a join that joins on just field 1 for business 1, but field 1 AND field 2 for business 2. In other words, there is a more granular identifier available for business 2, but it is still valid to join on just field 1 for business 1 as well. It also needs to function like an inner join, in that we are only preserving the relevant data that match these conditions.
The code would look something like this for business 1:
FROM customer_data a
INNER JOIN marketing_data b
ON a.member_number = b.member_number
WHERE business_number = 1
And something like this for business 2:
FROM customer_data a
INNER JOIN marketing_data b
ON a.member_number = b.member_number
AND a.sub_member_number = b.sub_member_number
WHERE business_number = 2
I am hoping to extract both sets of data in one join statement. Also, just in case it helps, I am using the Snowflake platform to write my queries.
Following should work for both the cases.
FROM customer_data a
INNER JOIN marketing_data b ON a.member_number = b.member_number
WHERE (
a.sub_member_number = b.sub_member_number
AND business_number = 2
)
OR business_number = 1
You can put the conditions in the ON clause like this:
FROM customer_data cd INNER JOIN
marketing_data md
ON cd.member_number = md.member_number AND
( cd.business_number <> 2 OR
cd.sub_member_number = md.sub_member_number
)
Note: this generalizes beyond just businesses 1 and 2, with the special condition only applying to 2. The first condition can be = 1 if you want to be more specific.
Also note that this introduces meaningful table aliases rather than arbitrary letters. This makes queries much easier to understand.

MS Access SQL Query Not Exist

I have been stuck on trying to figure how to return data that is in one table but not the other. I thought an outter join would work, but it seems that Access does not allow that.
My SQL is returning results if a record exists in the MonthlyTargets_0_SPARTN_qry but if there is not record then no data is being returned. I would like to display a 0 if there are not records.
My sql is:
SELECT REF_TestCategory_tbl.CategoryID
,MonthlyTargets_0_SPARTN_qry.[Supervisor Id] AS TestOfficerID
,Count(MonthlyTargets_0_SPARTN_qry.[Sheet ID]) AS Actuals
,MonthlyTargets_0_SPARTN_qry.ComplianceMonth
FROM MonthlyTargets_0_SPARTN_qry
INNER JOIN (
REF_TestCategory_tbl INNER JOIN REF_TestCatalog_tbl ON REF_TestCategory_tbl.CategoryID = REF_TestCatalog_tbl.TestCategory
) ON MonthlyTargets_0_SPARTN_qry.[Test Number] = REF_TestCatalog_tbl.TestID
GROUP BY REF_TestCategory_tbl.CategoryID
,MonthlyTargets_0_SPARTN_qry.[Supervisor Id]
,MonthlyTargets_0_SPARTN_qry.ComplianceMonth
ORDER BY REF_TestCategory_tbl.CategoryID;
Which returns:
CategoryID TestOfficerID Actuals ComplianceMonth
1 3062 26 1/1/2020
1 3062 6 2/1/2020
2 3062 2 1/1/2020
3 3062 2 1/1/2020
3 3062 1 2/1/2020
if there are no records for feb, I need it to reurn 0 in Actuals
Thank you
If your 'ComplianceMonth' Values consistently exists regardless of your adjacent data(Meaning if the adjacent data returned for your ComplianceMonth is NULL) then you could do something like this.
SELECT REF_TestCategory_tbl.CategoryID,
MonthlyTargets_0_SPARTN_qry.[Supervisor Id] AS TestOfficerID,
coalesce(Count(MonthlyTargets_0_SPARTN_qry.[Sheet ID]),0) AS Actuals,
MonthlyTargets_0_SPARTN_qry.ComplianceMonth
FROM dbo.MonthlyTargets_0_SPARTN_qry RIGHT OUTER JOIN
dbo.REF_TestCategory_tbl RIGHT OUTER JOIN
dbo.REF_TestCatalog_tbl ON REF_TestCategory_tbl.CategoryID = REF_TestCatalog_tbl.TestCategory ON MonthlyTargets_0_SPARTN_qry.[Test Number] = REF_TestCatalog_tbl.TestID
GROUP BY REF_TestCategory_tbl.CategoryID, MonthlyTargets_0_SPARTN_qry.[Supervisor Id], MonthlyTargets_0_SPARTN_qry.ComplianceMonth
ORDER BY REF_TestCategory_tbl.CategoryID
Hope this Helps.
MS-Access DOES allow outer joins in its SQL. You can do both a LEFT JOIN or a RIGHT JOIN.
MS-Access does not include a statement for a full-outer-join. However, if you want to do a full-outer-join you can do it with a UNION ALL of a specific LEFT JOIN and a specific RIGHT JOIN. The instructions to do a full-outer-join are the following:
You do a “LEFT JOIN” (enclosed in a Select operation) between the two input record-lists. If one of the two input record-lists has one (or more) fields that for sure cannot be Null, that will be the left input record-list. The “ON” Boolean expression is the one that you want for the Full-Outer-Join.
If the left record-list has one (or more) fields that for sure cannot be Null, you skip this step. Otherwise, you do a Cross-Join between the left record-list and a record-list having only one record with one non-Null field (it can be exactly the same Select over “T_Numbers” in the example above, highlighted in green). The Cross-Join is enclosed in a Select that exposes all the fields from the Cross-Join operation, including the field “Num” from “T_Numbers” (with another field name, if you want).
You do a “RIGHT JOIN” having the same right input record-list from point 1. Its left record-list is either the Select from point 2, or the left input record-list from point 1, as corresponds (see point 2). The “ON” expression must be exactly the same as the one of the “LEFT JOIN” from point 1.
The “RIGHT JOIN” from point 3 is enclosed in a Select that exposes all the fields from the left and right input record-lists from point 1. This Select has the “WHERE” expression “IsNull(field)”, where “field” is either the “Num” field from point 2, or the field from left input record-list that for sure cannot be Null, as corresponds (see point 2).
You do a “UNION ALL” with the Select enclosing the “RIGHT JOIN” from point one and the Select enclosing the “RIGHT JOIN” from point 4.
More information at LightningGuide.net.

How to make a query to obtain only results that have N number within a range of values?

I'm trying to extract nutrient data in MS Access 2007 from the USDA food database, freely available at http://www.ars.usda.gov/Services/docs.htm?docid=24912
I need records that have ALL nutrients from NUT_DATA.Nutr_No . Those records have values between '501' and '511' . But I wish to exclude incomplete records that have missing values.
Currently, Baby food banana has all from nutrient 501 to 511, but Baby food Beverage has only 9 of the nutrients listed, and many others are like that.
As a last resort, I guess it would be acceptable to have all records, showing null for missing values, as long as each FOOD_DES.Long_Desc has exactly 11 records, one for each NUT_DATA.Nutr_No OR NUTR_DEF.NutrDesc (which correspond to each other).
SELECT
FOOD_DES.NDB_No, FOOD_DES.FdGrp_Cd, FOOD_DES.Long_Desc, NUT_DATA.Nutr_No, NUTR_DEF.NutrDesc, NUT_DATA.Nutr_Val, WEIGHT.Amount, WEIGHT.Msre_Desc, WEIGHT.Gm_Wgt, [WEIGHT]![Amount] & " " & [WEIGHT]![Msre_Desc] AS msre
FROM
NUTR_DEF inner JOIN ((FOOD_DES INNER JOIN NUT_DATA ON FOOD_DES.NDB_No=NUT_DATA.NDB_No) INNER JOIN WEIGHT ON FOOD_DES.NDB_No=WEIGHT.NDB_No) ON NUTR_DEF.Nutr_No=NUT_DATA.Nutr_No
WHERE
(NUT_DATA.Nutr_No between '501' and '511' ) and ((WEIGHT.Seq)="1") and NUT_DATA.Nutr_Val > '0' and
// this part is me out of ideas trying stuff, but didn't help
EXISTS (SELECT 1
FROM
NUTR_DEF inner JOIN ((FOOD_DES INNER JOIN NUT_DATA ON FOOD_DES.NDB_No=NUT_DATA.NDB_No) INNER JOIN WEIGHT ON FOOD_DES.NDB_No=WEIGHT.NDB_No) ON NUTR_DEF.Nutr_No=NUT_DATA.Nutr_No
WHERE count FOOD_DES.Long_Desc = "11" )
//end wild of experimentation
ORDER BY FOOD_DES.Long_Desc, NUTR_DEF.SR_Order;
This is a sample of the data. I just copied the most important columns. The red is not what I'm looking for because it doesn't have all 11 nutrients. I can paste on the google doc the whole table if someone thinks that would help.
https://docs.google.com/spreadsheets/d/1FghDD59wy2PYlpsqUlYVc3Ulwvy4MMLagpBUYtvLBfI/edit?usp=sharing
As your starting point, identify which food items have values > 0 for all 11 of those nutrients. Check whether this simpler GROUP BY query shows you the correct items:
SELECT ndat.NDB_No
FROM
NUT_DATA AS ndat
INNER JOIN WEIGHT AS wt
ON ndat.NDB_No = wt.NDB_No
WHERE
ndat.Nutr_Val>0
AND ndat.Nutr_No IN('501','502','503','504','505','506','507','508','509','510','511')
AND wt.Seq='1'
GROUP BY ndat.NDB_No
HAVING Count(ndat.Nutr_No)=11;
Note you could use Val(ndat.Nutr_No) Between 501 And 511 as the Nutr_No restriction, which would give you a more concise statement. However, evaluating Val() for every row of the table means that approach would forego the performance benefit of indexed retrieval ... so that version of the query should be noticeably slower.
Save that query and create a new query which joins it to the base tables for the additional data you need from other columns. Or use it as a subquery instead of a named query if you prefer.