SQL Left Join with WHERE clause SAS Enterprise Guide

I am trying to join two tables, TABLE1 and TABLE2 (aliased PERSIP and ECIF), on their ID field to create a RESULTTABLE, where the ym_id variable in both tables equals my timekey0 variable for a specific datetime.
I am wondering why this code produces 0 rows of resulting data. After looking online, this was the format people posted as solutions to similar problems.
%let timekey0 = 202110;
proc sql;
CREATE TABLE RESULTTABLE AS
SELECT
PERSIP.col1,
PERSIP.col2,
PERSIP.col3,
ECIF.col1,
ECIF.col2,
ECIF.col3,
ECIF.col4
FROM DB.TABLE1 PERSIP
LEFT JOIN DB.TABLE2 ECIF
ON PERSIP.ID = ECIF.ID
WHERE ECIF.ym_id = &timekey0.
AND PERSIP.ym_id = &timekey0.;
quit;
I got a result of 0 rows with many columns. Not sure if my join type is incorrect but I have 0 rows in the table.

There may be two reasons for this:
There are no records matching your WHERE criteria (ECIF.ym_id = &timekey0. AND PERSIP.ym_id = &timekey0.)
There are no records matching your ON criteria (ON PERSIP.ID = ECIF.ID)

Your logic seems off. You say you want a LEFT JOIN then use a variable from the "RIGHT" table in your WHERE condition.
Most likely you just want to add those conditions to the ON condition.
FROM TABLE1 PERSIP
LEFT JOIN TABLE2 ECIF
ON PERSIP.ID = ECIF.ID
AND ECIF.ym_id = &timekey0.
AND PERSIP.ym_id = &timekey0.
Or perhaps just keep the condition that will limit the observations read from the "LEFT" table in the WHERE condition
FROM TABLE1 PERSIP
LEFT JOIN TABLE2 ECIF
ON PERSIP.ID = ECIF.ID
AND PERSIP.ym_id = ECIF.ym_id
WHERE PERSIP.ym_id = &timekey0.
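The difference between the two placements can be checked on a toy dataset. The sketch below is not SAS: it uses Python's built-in sqlite3 module as a stand-in, and the tables `table1` and `table2` and their rows are made up for illustration. It compares the original query shape (right-table filter in WHERE) with the last variant above (ym_id match in ON, left-table filter in WHERE).

```python
import sqlite3

# Hypothetical miniature versions of TABLE1 (PERSIP) and TABLE2 (ECIF).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, ym_id INTEGER);
    CREATE TABLE table2 (id INTEGER, ym_id INTEGER);
    INSERT INTO table1 VALUES (1, 202110), (2, 202110), (3, 202109);
    INSERT INTO table2 VALUES (1, 202110), (2, 202109);
""")
timekey0 = 202110

# Right-table condition in WHERE: unmatched rows have e.ym_id = NULL,
# so they fail the filter and the LEFT JOIN behaves like an INNER JOIN.
where_on_right = conn.execute("""
    SELECT p.id, e.id
    FROM table1 p LEFT JOIN table2 e ON p.id = e.id
    WHERE e.ym_id = ? AND p.ym_id = ?
    ORDER BY p.id
""", (timekey0, timekey0)).fetchall()

# ym_id match moved into ON; only the left-table filter stays in WHERE.
filter_in_on = conn.execute("""
    SELECT p.id, e.id
    FROM table1 p LEFT JOIN table2 e
      ON p.id = e.id AND p.ym_id = e.ym_id
    WHERE p.ym_id = ?
    ORDER BY p.id
""", (timekey0,)).fetchall()

print(where_on_right)  # [(1, 1)]
print(filter_in_on)    # [(1, 1), (2, None)]
```

With the first form, the result is empty whenever no row of the right table carries the requested ym_id, which matches the 0-row symptom described in the question.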

Related

SQL Left join on different columns considering specific conditions

I believe this is an easy one. Just getting started on SQL, so I am finding it a bit tricky. So I am using SQL on SAS, and I want to join two tables but on different columns based on a value of a column. Practical example:
Proc sql;
create table new_table_name as select
a.proposal_code as new_name_proposal_code,
a.1st_client_code as new_name_1st_client_code,
a.2nd_client_code as new_name_2nd_client_code,
a.3rd_client_code as new_name_3rd_client_code,
a.4th_client_code as new_name_4th_client_code,
a.product_type as new_name_product_type,
b.2nd_client_code
from existing_table a
left join existing table b (on b.2nd_client_code=a.2nd_client_code and a.product_type = "clothes") or
left join existing table b (on b.2nd_client_code=a.3rd_client_code and (a.product_type = "cars" or a.product_type = "bikes"));
quit;
So this is the code that I'm using at the moment, and the goal is to join table a and table b using b.2nd client code = a.2nd client code if the product type from table a is = to "clothes", and if the product type from table a is either "cars" or "bikes", join table a and table b using b.2nd client code = a.3rd client code. Basically, look at two different "on's" regarding the specific product type. When joining these two tables, if one row has product type "clothes", I want it to look at the 2nd client code, if it is either "cars" or "bikes", look at the 3rd client code.
Hope I made it clear. The error I am getting at the moment is "expecting an on". Is it a problem of syntax?
Yes. The parentheses before the on are not correct. Your query has other issues as well. I think you want:
create table new_table_name as
select a.proposal_code as new_name_proposal_code,
a.1st_client_code as new_name_1st_client_code,
a.2nd_client_code as new_name_2nd_client_code,
a.3rd_client_code as new_name_3rd_client_code,
a.4th_client_code as new_name_4th_client_code,
a.product_type as new_name_product_type,
coalesce(bc.2nd_client_code, bcb.2nd_client_code)
from existing_table a left join
existing_table bc
on bc.2nd_client_code = a.2nd_client_code and
a.product_type = 'clothes' left join
existing_table bcb
on bcb.2nd_client_code = a.3rd_client_code and
a.product_type in ('cars', 'bikes');
Notes:
No parentheses before the on clause.
No or left join. or is a boolean operator. left join is an operator on sets (i.e. tables and result sets). They don't mix.
No repeated table aliases.
You want to combine the two codes, so you need something like coalesce() in the select.
The standard SQL delimiter for strings is the single quote, not the double quote.
in is simpler than a string of or conditions.
Sounds like you just want a complex ON criteria and not two joins.
Something like this:
proc sql;
create table new_table_name as
select
a.proposal_code as new_name_proposal_code
,a.client_code1 as new_name_client_code1
,a.client_code2 as new_name_client_code2
,a.client_code3 as new_name_client_code3
,a.client_code4 as new_name_client_code4
,a.product_type as new_name_product_type
,b.client_code2 as new_name_other_client_code2
from tableA a
left join tableB b
on (b.client_code2=a.client_code2 and a.product_type = "clothes")
or (b.client_code2=a.client_code3 and a.product_type in ("cars","bikes"))
;
quit;
For a better answer post example inputs and desired output.
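As a rough check of the single-join approach with a compound ON condition, here is a minimal sketch using Python's sqlite3 module rather than SAS. The table names `table_a` and `table_b` and all rows are invented for illustration, and the column names follow the renamed `client_code2`/`client_code3` style from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (proposal_code TEXT, client_code2 TEXT,
                          client_code3 TEXT, product_type TEXT);
    CREATE TABLE table_b (client_code2 TEXT);
    INSERT INTO table_a VALUES
        ('P1', 'C10', 'C99', 'clothes'),
        ('P2', 'C98', 'C20', 'cars'),
        ('P3', 'C10', 'C97', 'bikes');
    INSERT INTO table_b VALUES ('C10'), ('C20');
""")

# One LEFT JOIN whose ON clause picks the comparison column by product type.
rows = conn.execute("""
    SELECT a.proposal_code, a.product_type, b.client_code2
    FROM table_a a
    LEFT JOIN table_b b
      ON (b.client_code2 = a.client_code2 AND a.product_type = 'clothes')
      OR (b.client_code2 = a.client_code3
          AND a.product_type IN ('cars', 'bikes'))
    ORDER BY a.proposal_code
""").fetchall()

for row in rows:
    print(row)
# ('P1', 'clothes', 'C10')  matched on client_code2
# ('P2', 'cars', 'C20')     matched on client_code3
# ('P3', 'bikes', None)     client_code2 matches, but bikes must use client_code3
```

Row P3 is the interesting case: its client_code2 exists in `table_b`, but because the product type is bikes only client_code3 is compared, so the row is kept with a NULL from the right table.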

Can I do a left join without returning the conditional columns?

New to SQL but I want to be able to optimize my query by bringing just the right amount of data. I am doing a left join on CS Rep Name and WE, which are two columns present in both tables. I find that if I don't bring in CS Rep Name and WE in the TECDR table, the query would error. Is there a workaround to this? Since it is a left join, I don't need redundant data.
SELECT *
FROM Tish_Email_CSAT_Dump AS TECD
LEFT JOIN (SELECT CS_Rep_Name,
Team_Leader,
Operations_Manager,
Tenure,
WE,
FileName
FROM Tish_Email_CSAT_Dump_Roster) AS TECDR
ON TECD.CS_Rep_Name = TECDR.CS_Rep_Name
AND TECD.WE = TECDR.WE
When you embed a SELECT inside a query in place of a table, the result of that select (a projection) behaves like a table visible only inside the query.
In your case, the join is the same as if there were a table called TECDR with the columns that you select. Hence, if you leave out some columns of Tish_Email_CSAT_Dump_Roster from your SELECT, these columns would not be available for joining or selection.
However, in your case this is unnecessary: all you need to do is join to the underlying table, like this:
SELECT
TECD.*
, TECDR.Team_Leader
, TECDR.Operations_Manager
, TECDR.Tenure
, TECDR.FileName
FROM Tish_Email_CSAT_Dump AS TECD
LEFT JOIN Tish_Email_CSAT_Dump_Roster AS TECDR
ON TECD.CS_Rep_Name = TECDR.CS_Rep_Name AND TECD.WE = TECDR.WE
select
<place the columns you want here>
from
Tish_Email_CSAT_Dump as TECD
Left join Tish_Email_CSAT_Dump_Roster as TECDR
On TECD.CS_Rep_Name = TECDR.CS_Rep_Name and TECD.WE = TECDR.WE
Hope the following helps or else please share the query that errors:
select TECD.Column1, TECD.Column2, TECDR.Column1, TECDR.Column2
from Tish_Email_CSAT_Dump as TECD
Left join Tish_Email_CSAT_Dump_Roster as TECDR
On TECD.CS_Rep_Name = TECDR.CS_Rep_Name and TECD.WE = TECDR.WE
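To confirm that the join columns of the right-hand table do not need to appear in the output, here is a small sketch using Python's sqlite3 module. The tables `tecd` and `tecdr`, the `Score` column, and the rows are hypothetical stand-ins for the real ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tecd  (CS_Rep_Name TEXT, WE TEXT, Score INTEGER);
    CREATE TABLE tecdr (CS_Rep_Name TEXT, WE TEXT, Team_Leader TEXT);
    INSERT INTO tecd  VALUES ('Ann', '2020-01-05', 9), ('Bob', '2020-01-05', 7);
    INSERT INTO tecdr VALUES ('Ann', '2020-01-05', 'Lee');
""")

# The ON clause uses tecdr.CS_Rep_Name and tecdr.WE,
# but only tecdr.Team_Leader is projected from the right-hand table.
cur = conn.execute("""
    SELECT d.*, r.Team_Leader
    FROM tecd AS d
    LEFT JOIN tecdr AS r
      ON d.CS_Rep_Name = r.CS_Rep_Name AND d.WE = r.WE
    ORDER BY d.CS_Rep_Name
""")
columns = [c[0] for c in cur.description]
rows = cur.fetchall()

print(columns)  # ['CS_Rep_Name', 'WE', 'Score', 'Team_Leader']
print(rows)     # [('Ann', '2020-01-05', 9, 'Lee'), ('Bob', '2020-01-05', 7, None)]
```

The join keys appear once (from the left table) and the unmatched row for Bob is still returned with a NULL Team_Leader.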

Most efficient way for a nested sql server update query

I am going to update a table using the sum of specific value from 3 different tables. For this purpose I wrote this query. But it takes too much time, what is the most efficient query for this purpose?
UPDATE dbo.dumpfile_doroud
SET dumpfile_doroud.sms_count_on_net = (SELECT sms_count_on_net
FROM dbo.dumpfile139201
WHERE
dbo.dumpfile_doroud.msisdn = dbo.dumpfile139201.msisdn)
+ (SELECT sms_count_on_net
FROM dbo.dumpfile139202
WHERE
dbo.dumpfile_doroud.msisdn = dbo.dumpfile139202.msisdn)
+ (SELECT sms_count_on_net
FROM dbo.dumpfile139203
WHERE
dbo.dumpfile_doroud.msisdn = dbo.dumpfile139203.msisdn)
P.S: dumpfile_doroud is small table but other three tables are really big.
Try this:
UPDATE t1
SET t1.sms_count_on_net=isnull(t2.sms_count_on_net,0) +
isnull(t3.sms_count_on_net,0) +
isnull(t4.sms_count_on_net,0)
FROM dbo.dumpfile_doroud t1
LEFT JOIN dbo.dumpfile139201 t2
ON t2.msisdn = t1.msisdn
LEFT JOIN dumpfile139202 t3
ON t3.msisdn = t1.msisdn
LEFT JOIN dumpfile139203 t4
ON t4.msisdn = t1.msisdn
I don't think a faster query is possible, so try adding indexes. You can create a nonclustered index on the msisdn column of each table. Syntax:
CREATE NONCLUSTERED INDEX IX_doroud_dumpfile139201
ON dbo.dumpfile139201(msisdn);
In SQL Server Management Studio you can turn on Display Estimated Execution Plan; it sometimes gives good advice on creating indexes.
Create a subquery to calculate the totals then join the table to it
UPDATE o
SET o.sms_count_on_net = n.sms_count_on_net
FROM
dbo.dumpfile_doroud o
JOIN
(SELECT
d.msisdn, sms_count_on_net = (d1.sms_count_on_net+d2.sms_count_on_net+d3.sms_count_on_net)
FROM
dbo.dumpfile_doroud d
LEFT JOIN dbo.dumpfile139201 d1 ON d1.msisdn = d.msisdn
LEFT JOIN dbo.dumpfile139202 d2 ON d2.msisdn = d.msisdn
LEFT JOIN dbo.dumpfile139203 d3 ON d3.msisdn = d.msisdn) n
ON o.msisdn = n.msisdn
Note that if the value is missing from any of those tables the total will be null. That may or may not be what you want.
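The NULL behaviour is easy to see on a toy dataset. This sketch uses Python's sqlite3 module instead of SQL Server, so it uses COALESCE (also available in SQL Server) where the first answer uses ISNULL. The tables `doroud`, `m1`, `m2`, `m3`, the shortened column name `sms`, and the rows are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE doroud (msisdn TEXT);
    CREATE TABLE m1 (msisdn TEXT, sms INTEGER);
    CREATE TABLE m2 (msisdn TEXT, sms INTEGER);
    CREATE TABLE m3 (msisdn TEXT, sms INTEGER);
    INSERT INTO doroud VALUES ('A'), ('B');
    INSERT INTO m1 VALUES ('A', 5), ('B', 1);
    INSERT INTO m2 VALUES ('A', 3);            -- no row for B
    INSERT INTO m3 VALUES ('A', 2), ('B', 4);
""")

# plain_total propagates NULL; safe_total treats a missing month as 0.
rows = conn.execute("""
    SELECT d.msisdn,
           d1.sms + d2.sms + d3.sms AS plain_total,
           COALESCE(d1.sms, 0) + COALESCE(d2.sms, 0) + COALESCE(d3.sms, 0)
               AS safe_total
    FROM doroud d
    LEFT JOIN m1 d1 ON d1.msisdn = d.msisdn
    LEFT JOIN m2 d2 ON d2.msisdn = d.msisdn
    LEFT JOIN m3 d3 ON d3.msisdn = d.msisdn
    ORDER BY d.msisdn
""").fetchall()

print(rows)  # [('A', 10, 10), ('B', None, 5)]
```

Subscriber B has no row in the second table, so the plain sum is NULL while the COALESCE version yields 5.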

Number of Records don't match when Joining three tables

Despite going through every material I could possibly find on the internet, I haven't been able to solve this issue myself. I am new to MS Access and would really appreciate any pointers.
Here's my problem - I have three tables
Source1084 with columns - Department, Sub-Dept, Entity, Account, +few more
R12CAOmappingTable with columns - Account, R12_Account
Table4 with columns - R12_Account, Department, Sub-Dept, Entity, New Dept, LOB +few more
I have a total of 1084 records in Source and the result table must also contain 1084 records. I need to draw a table with all the columns from Source + R12_account from R12CAOmappingTable + all columns from Table4.
Here is the query I wrote. It yields the right columns, but returns more or fewer records than expected depending on which join options I use.
SELECT rmt.r12_account,
srb.version,
srb.fy,
srb.joblevel,
srb.scenario,
srb.department,
srb.[sub-department],
srb.[job function],
srb.entity,
srb.employee,
table4.lob,
table4.product,
table4.newacct,
table4.newdept,
srb.[beg balance],
srb.jan,
srb.feb,
srb.mar,
srb.apr,
srb.may,
srb.jun,
srb.jul,
srb.aug,
srb.sep,
srb.oct,
srb.nov,
srb.dec,
rmt.r12_account
FROM (source1084 AS srb
LEFT JOIN r12caomappingtable AS rmt
ON srb.account = rmt.account)
LEFT JOIN table4
ON ( srb.department = table4.dept )
AND ( srb.[sub-department] = table4.subdept )
AND ( srb.entity = table4.entity )
WHERE ( ( ( srb.[sub-department] ) = table4.subdept )
AND ( ( srb.entity ) = table4.entity )
AND ( ( rmt.r12_account ) = table4.r12_account ) );
In this simple example, Table1 contains 3 rows with unique fld1 values. Table2 contains one row, and the fld1 value in that row matches one of those in Table1. Therefore this query returns 3 rows.
SELECT *
FROM
Table1 AS t1
LEFT JOIN Table2 AS t2
ON t1.fld1 = t2.fld1;
However if I add the WHERE clause as below, that version of the query returns only one row --- the row where the fld1 values match.
SELECT *
FROM
Table1 AS t1
LEFT JOIN Table2 AS t2
ON t1.fld1 = t2.fld1
WHERE t1.fld1 = t2.fld1;
In other words, that WHERE clause counteracts the LEFT JOIN because it excludes rows where t2.fld1 is Null. If that makes sense, notice that second query is functionally equivalent to this ...
SELECT *
FROM
Table1 AS t1
INNER JOIN Table2 AS t2
ON t1.fld1 = t2.fld1;
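The three row counts described above can be reproduced with a short script. It uses Python's sqlite3 module rather than Access, and the specific fld1 values are assumed for illustration: three rows in Table1, one matching row in Table2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (fld1 INTEGER);
    CREATE TABLE Table2 (fld1 INTEGER);
    INSERT INTO Table1 VALUES (1), (2), (3);
    INSERT INTO Table2 VALUES (2);
""")

def row_count(sql):
    """Run a query and return how many rows it produces."""
    return len(conn.execute(sql).fetchall())

left_join = row_count("""
    SELECT * FROM Table1 AS t1
    LEFT JOIN Table2 AS t2 ON t1.fld1 = t2.fld1
""")
left_join_with_where = row_count("""
    SELECT * FROM Table1 AS t1
    LEFT JOIN Table2 AS t2 ON t1.fld1 = t2.fld1
    WHERE t1.fld1 = t2.fld1
""")
inner_join = row_count("""
    SELECT * FROM Table1 AS t1
    INNER JOIN Table2 AS t2 ON t1.fld1 = t2.fld1
""")

print(left_join, left_join_with_where, inner_join)  # 3 1 1
```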
Your situation is similar. I suggest you first eliminate the WHERE clause and confirm this query returns at least your expected 1084 rows.
SELECT Count(*) AS CountOfRows
FROM (source1084 AS srb
LEFT JOIN r12caomappingtable AS rmt
ON srb.account = rmt.account)
LEFT JOIN table4
ON ( srb.department = table4.dept )
AND ( srb.[sub-department] = table4.subdept )
AND ( srb.entity = table4.entity );
After you get the query returning the correct number of rows, you can alter the SELECT list to return the columns you want. But the columns aren't really the issue until you can get the correct rows.
Without knowing the values in your tables it is hard to give a complete answer to your question. Based on how you described it, the issue is more than likely caused by the type of joins you are using.
The best way I found to understand which type of join to use is to reference a Venn diagram explaining the different types of joins.
Jeff Atwood also has a really good explanation of SQL joins on his site using the above method as well.
Best to just use the query builder. Drop in your main table. Choose the columns you want. Now for any of the other lookup values then simply drop in the other tables, draw the join line(s), double click and use a left join. You can do this for 2 or 30 columns that need to "grab" or lookup other values from other tables. The number of ORIGINAL rows in the base table returned should ALWAYS remain the same.
So just use the query builder and follow the above.
The problem with your posted SQL is you NESTED the joins inside (). Don't do that. (or let the query builder do this for you – they tend to be quite messy but will also work).
Just use this:
FROM source1084 AS srb
LEFT JOIN r12caomappingtable AS rmt
ON srb.account = rmt.account
LEFT JOIN table4
ON ( srb.department = table4.dept )
AND ( srb.[sub-department] = table4.subdept )
AND ( srb.entity = table4.entity )
As noted, I don't see why you are "repeating" the conditions again in the where clause.

Left Join With Where Clause

I need to retrieve all default settings from the settings table but also grab the character setting if exists for x character.
But this query only retrieves the settings where character is = 1, not the default settings if the user hasn't set any.
SELECT `settings`.*, `character_settings`.`value`
FROM (`settings`)
LEFT JOIN `character_settings`
ON `character_settings`.`setting_id` = `settings`.`id`
WHERE `character_settings`.`character_id` = '1'
So I need something like this:
array(
'0' => array('somekey' => 'keyname', 'value' => 'thevalue'),
'1' => array('somekey2' => 'keyname2'),
'2' => array('somekey3' => 'keyname3')
)
Where keys 1 and 2 are the default values, while key 0 contains the default value together with the character value.
The where clause is filtering away rows where the left join doesn't succeed. Move it to the join:
SELECT `settings`.*, `character_settings`.`value`
FROM `settings`
LEFT JOIN
`character_settings`
ON `character_settings`.`setting_id` = `settings`.`id`
AND `character_settings`.`character_id` = '1'
When making OUTER JOINs (ANSI-89 or ANSI-92), filtration location matters because criteria specified in the ON clause is applied before the JOIN is made. Criteria against an OUTER JOINed table provided in the WHERE clause is applied after the JOIN is made. This can produce very different result sets. In comparison, it doesn't matter for INNER JOINs if the criteria is provided in the ON or WHERE clauses -- the result will be the same.
SELECT s.*,
cs.`value`
FROM SETTINGS s
LEFT JOIN CHARACTER_SETTINGS cs ON cs.setting_id = s.id
AND cs.character_id = 1
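To see the two filter locations side by side, here is a minimal sketch using Python's sqlite3 module rather than MySQL. The `name` column and the sample rows are assumptions added for illustration; only `id`, `setting_id`, `character_id` and `value` come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE settings (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE character_settings
        (character_id INTEGER, setting_id INTEGER, value TEXT);
    INSERT INTO settings VALUES (1, 'volume'), (2, 'theme'), (3, 'language');
    INSERT INTO character_settings VALUES (1, 1, '80'), (2, 2, 'dark');
""")

# Filter in WHERE: applied after the join, so NULL-extended rows are dropped.
filter_in_where = conn.execute("""
    SELECT s.id, s.name, cs.value
    FROM settings s
    LEFT JOIN character_settings cs ON cs.setting_id = s.id
    WHERE cs.character_id = 1
    ORDER BY s.id
""").fetchall()

# Filter in ON: only decides which right-hand rows match; all settings stay.
filter_in_on = conn.execute("""
    SELECT s.id, s.name, cs.value
    FROM settings s
    LEFT JOIN character_settings cs
      ON cs.setting_id = s.id AND cs.character_id = 1
    ORDER BY s.id
""").fetchall()

print(filter_in_where)  # [(1, 'volume', '80')]
print(filter_in_on)     # [(1, 'volume', '80'), (2, 'theme', None), (3, 'language', None)]
```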
If I understand your question correctly you want records from the settings table if they don't have a join across to the character_settings table or if that joined record has character_id = 1.
You should therefore do
SELECT `settings`.*, `character_settings`.`value`
FROM (`settings`)
LEFT OUTER JOIN `character_settings`
ON `character_settings`.`setting_id` = `settings`.`id`
WHERE `character_settings`.`character_id` = '1' OR
`character_settings`.character_id is NULL
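One case worth testing with this variant is a setting that has a row for some other character but none for character 1. The sketch below uses Python's sqlite3 module rather than MySQL, with made-up rows and an assumed `name` column, to show what the query returns in that situation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE settings (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE character_settings
        (character_id INTEGER, setting_id INTEGER, value TEXT);
    INSERT INTO settings VALUES (1, 'volume'), (2, 'theme'), (3, 'language');
    -- 'theme' is customised only by character 2, never by character 1.
    INSERT INTO character_settings VALUES (1, 1, '80'), (2, 2, 'dark');
""")

rows = conn.execute("""
    SELECT s.id, s.name, cs.value
    FROM settings s
    LEFT OUTER JOIN character_settings cs ON cs.setting_id = s.id
    WHERE cs.character_id = 1 OR cs.character_id IS NULL
    ORDER BY s.id
""").fetchall()

print(rows)  # [(1, 'volume', '80'), (3, 'language', None)]
```

The 'theme' setting disappears: its only joined row has character_id = 2, which is neither 1 nor NULL. If every default setting must be returned, putting the character_id condition in the ON clause avoids this.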
You might find it easier to understand by using a simple subquery
SELECT `settings`.*, (
SELECT `value` FROM `character_settings`
WHERE `character_settings`.`setting_id` = `settings`.`id`
AND `character_settings`.`character_id` = '1') AS cv_value
FROM `settings`
The subquery is allowed to return null, so you don't have to worry about JOIN/WHERE in the main query.
Sometimes, this works faster in MySQL, but compare it against the LEFT JOIN form to see what works best for you.
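A quick way to compare the two forms is to run both on the same data and check that they agree. This is a sketch with Python's sqlite3 module rather than MySQL; the sample rows and the `name` column are assumptions, and it says nothing about relative speed on a real server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE settings (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE character_settings
        (character_id INTEGER, setting_id INTEGER, value TEXT);
    INSERT INTO settings VALUES (1, 'volume'), (2, 'theme'), (3, 'language');
    INSERT INTO character_settings VALUES (1, 1, '80'), (2, 2, 'dark');
""")

# Scalar subquery form: yields NULL when character 1 has no row for a setting.
subquery_rows = conn.execute("""
    SELECT s.id, s.name,
           (SELECT cs.value FROM character_settings cs
             WHERE cs.setting_id = s.id AND cs.character_id = 1) AS cv_value
    FROM settings s
    ORDER BY s.id
""").fetchall()

# LEFT JOIN form with the character filter in the ON clause.
join_rows = conn.execute("""
    SELECT s.id, s.name, cs.value
    FROM settings s
    LEFT JOIN character_settings cs
      ON cs.setting_id = s.id AND cs.character_id = 1
    ORDER BY s.id
""").fetchall()

print(subquery_rows == join_rows)  # True
print(subquery_rows)  # [(1, 'volume', '80'), (2, 'theme', None), (3, 'language', None)]
```

The two forms agree here because each (setting, character) pair has at most one row. If duplicates were possible, the join would return extra rows, while MySQL would reject the scalar subquery for returning more than one row.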
SELECT s.*, c.value
FROM settings s
LEFT JOIN character_settings c ON c.setting_id = s.id AND c.character_id = '1'
For this problem, as for many others involving non-trivial left joins such as left-joining on inner-joined tables, I find it convenient and somewhat more readable to split the query with a with clause. In your example,
with settings_for_char as (
select setting_id, value from character_settings where character_id = 1
)
select
settings.*,
settings_for_char.value
from
settings
left join settings_for_char on settings_for_char.setting_id = settings.id;
The way I finally understand the top answer is realising (following the Order Of Execution of the SQL query ) that the WHERE clause is applied to the joined table thereby filtering out rows that do not satisfy the WHERE condition from the joined (or output) table. However, moving the WHERE condition to the ON clause applies it to the individual tables prior to joining. This enables the left join to retain rows from the left table even though some column entries of those rows (entries from the right tables) do not satisfy the WHERE condition.
The result is correct based on the SQL statement. A left join returns all rows from the left table, and only matching rows from the right table.
The ID and Name columns are from the left-hand table, so they are returned.
Score is from the right-hand table, and 30 is returned, as this value relates to Name "Flow". The Score for the other names is NULL, as those rows do not relate to Name "Flow".
The below would return the result you were expecting:
SELECT a.*, b.Score
FROM #Table1 a
LEFT JOIN #Table2 b
ON a.ID = b.T1_ID
WHERE 1=1
AND a.Name = 'Flow'
This SQL applies the filter to the left-hand table.