Left join on 2 varchar fields not working - sql

I'm having trouble joining two varchar(31) fields on SQL Server 2008. Below is my query; it runs without errors:
select A.CustId,A.Country,B.Country from [ACC].[dbo].[Customer] as A
left join
[Task Centre].[dbo].[CountryCodes] as B on A.Country=B.Country
The results are as follows:
CustId     A.Country       B.Country
CustomerA  United Kingdom  NULL
CustomerB  Ireland         Ireland
CustomerC  Spain           Spain
CustomerD  South Africa    NULL
South Africa and United Kingdom don't match even though they are in both databases.
I have tried replacing the space, but it's very slow and doesn't work. I think it has something to do with the whitespace, but I can't find the right command to achieve what I want.
Bear with me if I have omitted anything, as I'm a novice. I have also searched everywhere for an answer but can't find one that works for me.
Any help is greatly appreciated.
Mike

Try to execute the following query on both tables. This will tell you if there's any "hidden" difference between the tables (for example, blank characters, line breaks, etc.):
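-- VARBINARY(31) matches the varchar(31) column; CAST without a length stops at 30 bytes.
-- LIKE is used so the row is still found if the separator is not a plain space.
select Country, CAST(Country AS VARBINARY(31)) AS BinaryCountry
from [ACC].[dbo].[Customer]
where Country LIKE 'United%Kingdom%'
select Country, CAST(Country AS VARBINARY(31)) AS BinaryCountry
from [Task Centre].[dbo].[CountryCodes]
where Country LIKE 'United%Kingdom%'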
The BinaryCountry column will show a different value in each table if the contents of the Country columns are not exactly the same. If that is the case, consider correcting the data in one of the tables. Once you've made sure that the value is the same in both tables, your join should work just fine.
Edit: The problem turns out to be a non-breaking space character in the Task Centre table. To work around this, use the following in your join criteria:
ON A.Country = Replace(B.Country, CHAR(0xA0), ' ')
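To see the effect in isolation, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The table contents are made up for illustration, SQLite's hex() and char(160) stand in for the VARBINARY cast and CHAR(0xA0), and the stored bytes differ (SQLite stores UTF-8, so the non-breaking space shows up as C2A0 instead of A0):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustId TEXT, Country TEXT);
    CREATE TABLE CountryCodes (Country TEXT);
""")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [("CustomerA", "United Kingdom"), ("CustomerB", "Ireland")],
)
# "\u00a0" is a non-breaking space: it looks like a space but compares unequal.
conn.executemany(
    "INSERT INTO CountryCodes VALUES (?)",
    [("United\u00a0Kingdom",), ("Ireland",)],
)

# Dump the raw bytes to reveal the hidden character.
hex_value = conn.execute(
    "SELECT hex(Country) FROM CountryCodes WHERE Country LIKE 'United%'"
).fetchone()[0]

# Plain equality: the United Kingdom row finds no partner.
plain_join = conn.execute("""
    SELECT A.CustId, B.Country
    FROM Customer AS A
    LEFT JOIN CountryCodes AS B ON A.Country = B.Country
    ORDER BY A.CustId
""").fetchall()

# Normalising the non-breaking space inside the join condition fixes the match.
fixed_join = conn.execute("""
    SELECT A.CustId, B.Country
    FROM Customer AS A
    LEFT JOIN CountryCodes AS B
      ON A.Country = REPLACE(B.Country, char(160), ' ')
    ORDER BY A.CustId
""").fetchall()

print(hex_value)
print(plain_join)
print(fixed_join)
```

Wrapping the column in REPLACE inside the join condition generally prevents an index on that column from being used, so cleaning the stored data is likely the better long-term fix.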

Try this. If either value may contain leading or trailing spaces, trim both sides before comparing:
SELECT A.CustId, A.Country, B.Country
FROM [ACC].[dbo].[Customer] AS A LEFT JOIN
[Task Centre].[dbo].[CountryCodes] AS B
ON LTRIM(RTRIM(A.Country)) = LTRIM(RTRIM(B.Country))

If the difference is whitespace, then I guess this might work for you. It will be slow though, since it won't be able to use any indexes:
select A.CustId,
A.Country,
B.Country
from
(
SELECT A.CustId,
A.Country,
LOWER(REPLACE(A.Country, ' ', '')) AS CleanedCountry
FROM [ACC].[dbo].[Customer] as A
) A
left join
(
SELECT B.Country,
LOWER(REPLACE(B.Country, ' ', '')) AS CleanedCountry
FROM [Task Centre].[dbo].[CountryCodes] as B
) B
on A.CleanedCountry=B.CleanedCountry
You would only need the LOWER if your collation is case-sensitive.

Related

SQL query with many 'AND NOT CONTAINS' statements

I am trying to exclude timezones that have a substring in them so I only have records likely from the US.
The query works fine (e.g., the first line after the OR will remove local_timezones that include 'Africa/Abidjan'), but there's got to be a better way to write it.
It's too verbose, repetitive, and I suspect it's slower than it could be. Any advice greatly appreciated. (I'm using Snowflake's flavor of SQL but not sure that matters in this case).
NOTE: I'd like to keep a timezone such as America/Los_Angeles, but not America/El_Salvador, so for this reason I don't think wildcards are a good solution.
SELECT a_col
FROM a_table
WHERE
(country = 'United States')
OR
((country is NULL and not contains (local_timezone, 'Africa')
AND
country is NULL and not contains (local_timezone, 'Asia')
AND
country is NULL and not contains (local_timezone, 'Atlantic')
AND
country is NULL and not contains (local_timezone, 'Australia')
AND
country is NULL and not contains (local_timezone, 'Etc')
AND
country is NULL and not contains (local_timezone, 'Europe')
AND
country is NULL and not contains (local_timezone, 'Araguaina')
etc etc
If you have a known list of "good things", I would make a table and then just JOIN to it. Here I made you a list of good timezones:
CREATE TABLE acceptable_timezone (tz_name text) AS
SELECT * FROM VALUES
('Pacific/Auckland'),
('Pacific/Fiji'),
('Pacific/Tahiti');
(I love me some Pacific.) Now suppose we have some important data in a CTE:
WITH data(id, timezone) AS (
SELECT * FROM VALUES
(1, 'Pacific/Auckland'),
(2, 'Pacific/Fiji'),
(3, 'America/El_Salvador')
)
SELECT d.*
FROM data AS d
JOIN acceptable_timezone AS a
ON a.tz_name = d.timezone
ORDER BY 1;
which, as expected, does not return the El Salvador row:
ID  TIMEZONE
1   Pacific/Auckland
2   Pacific/Fiji
You cannot get much faster than an equijoin. But if the timezones in your data only contain the acceptable names as substrings, the table can hold wildcard patterns (using %) and you can join with LIKE, much as Felipe's answer does, like this:
JOIN acceptable_timezone AS a
ON d.timezone LIKE a.tz_name
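As a rough illustration of the pattern-table idea, here is a small sketch run through Python's sqlite3 module instead of Snowflake. The rows and the 'Pacific/%' pattern are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Each row is either an exact name or a LIKE pattern (hypothetical list).
    CREATE TABLE acceptable_timezone (tz_name TEXT);
    INSERT INTO acceptable_timezone VALUES
        ('Pacific/%'),
        ('America/Los_Angeles');

    CREATE TABLE data (id INTEGER, timezone TEXT);
    INSERT INTO data VALUES
        (1, 'Pacific/Auckland'),
        (2, 'Pacific/Fiji'),
        (3, 'America/El_Salvador'),
        (4, 'America/Los_Angeles');
""")

# The pattern comes from the lookup table, the candidate from the data.
rows = conn.execute("""
    SELECT d.id, d.timezone
    FROM data AS d
    JOIN acceptable_timezone AS a
      ON d.timezone LIKE a.tz_name
    ORDER BY d.id
""").fetchall()

print(rows)
```

Two things to watch with this approach: an underscore in a pattern is itself a single-character wildcard in LIKE, and a data row that matches more than one pattern will be returned once per match.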
You can use LIKE ANY:
with data as
(select null country, 'something Australia maybe' local_timezone)
select *
from data
where country = 'United States'
or (
country is null
and not local_timezone like any ('%Australia%', '%Africa%', '%Atlantic%')
)

CASE Statement - An expression services limit has been reached

I'm getting the following error:
An expression services limit has been reached. Please look for potentially complex expressions in your query, and try to simplify them.
I'm attempting to run the query below; however, it appears there is one line too many in my CASE expression. When I remove the "London" line (or "Scotland", for example), it works perfectly.
I can't think of the best way to split this statement.
If I split it into two queries and UNION ALL them, it does work; however, the ELSE 'No Region' becomes a problem. Everything that is matched in the first part of the query shows as "No Region" in the second part, and vice versa.
(My end goal is essentially to create a list of customers per region, which I can then use as the foundation of a regional sales report.)
Many Thanks
Andy
SELECT T0.CardCode, T0.CardName, T0.PostCode,
CASE
WHEN T0.PostCodeABR IN ('DG','KW','IV','PH','AB','DD','PA','FK','KY','G','EH','ML','KA','TD') THEN 'Scotland'
WHEN T0.PostCodeABR IN ('BT') THEN 'Ireland'
WHEN T0.PostCodeABR IN ('CA','NE','DH','SR','TS','DL','LA','BD','HG','YO','HX','LS','FY','PR','BB','L','WN','BL','OL') THEN 'North M62'
WHEN T0.PostCodeABR IN ('CH','WA','CW','SK','M','HD','WF','DN','HU','DE','NG','LN','S') THEN 'South M62'
WHEN T0.PostCodeABR IN ('LL','SY','LD','SA','CF','NP') THEN 'Wales'
WHEN T0.PostCodeABR IN ('NR','IP','CB') THEN 'East Anglia'
WHEN T0.PostCodeABR IN ('SN','BS','BA','SP','BH','DT','TA','EX','TQ','PL','TR') THEN 'South West'
WHEN T0.PostCodeABR IN ('LU','AL','HP','SG','SL','RG','SO','GU','PO','BN','RH','TN','ME','CT','SS','CM','CO') THEN 'South East'
WHEN T0.PostCodeABR IN ('ST','TF','WV','WS','DY','B','WR','HR','GL','OX','CV','NN','MK','PE','LE') THEN 'Midlands'
WHEN T0.PostCodeABR IN ('WD','EN','HA','N','NW','UB','W','WC','EC','E','IG','RM','DA','BR','CR','SM','KT','TW','SW') THEN 'London'
ELSE 'No Region'
END AS 'Region'
FROM [dbo].[REPS-PostcodeABBR] T0
As I mentioned in the comment, I would suggest you create a "lookup" table for the post codes, then all you need to do is JOIN to the table, and not have a "messy" and large CASE expression (T-SQL doesn't support Case (Switch) statements).
So your lookup table would look a little like this:
CREATE TABLE dbo.PostcodeRegion (Postcode varchar(2),
Region varchar(20));
GO
--Sample data
INSERT INTO dbo.PostcodeRegion (Postcode,Region)
VALUES('DG','Scotland'),
('BT','Ireland'),
('LL','Wales');
And then your query would just do a LEFT JOIN:
SELECT RPA.CardCode,
RPA.CardName,
RPA.PostCode,
COALESCE(PR.Region,'No Region') AS Region
FROM [dbo].[REPS-PostcodeABBR] RPA --T0 is a poor choice of an alias, there is no T nor 0 in "REPS-PostcodeABBR"
LEFT JOIN dbo.PostcodeRegion PR ON RPA.PostCodeABR = PR.Postcode;
Note you would likely want to INDEX the table as well, and/or apply a UNIQUE CONSTRAINT or PRIMARY KEY to the PostCode column.
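Here is a small runnable sketch of the lookup-table approach, using Python's sqlite3 module in place of SQL Server. The Customers table and its rows are hypothetical stand-ins for [REPS-PostcodeABBR]:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- PRIMARY KEY guarantees one region per postcode area and gives an index.
    CREATE TABLE PostcodeRegion (Postcode TEXT PRIMARY KEY, Region TEXT);
    INSERT INTO PostcodeRegion VALUES
        ('DG', 'Scotland'),
        ('BT', 'Ireland'),
        ('LL', 'Wales');

    -- Hypothetical stand-in for the customer table in the question.
    CREATE TABLE Customers (CardCode TEXT, PostCodeABR TEXT);
    INSERT INTO Customers VALUES
        ('C001', 'DG'),
        ('C002', 'LL'),
        ('C003', 'ZZ');
""")

# LEFT JOIN keeps unmatched customers; COALESCE plays the role of ELSE 'No Region'.
rows = conn.execute("""
    SELECT c.CardCode, COALESCE(pr.Region, 'No Region') AS Region
    FROM Customers AS c
    LEFT JOIN PostcodeRegion AS pr ON c.PostCodeABR = pr.Postcode
    ORDER BY c.CardCode
""").fetchall()

print(rows)
```

Adding a new postcode area then becomes an INSERT into the lookup table, with no change to the query.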
Thanks for the help. I tried the approaches mentioned above and they all worked; however, the most efficient seemed to be this one.
I created a lookup table within SAP with the columns PostCodeFrom, PostCodeTo, PostCodeABR and Region.
A row would look like: TS00, TS99, TS, North M62
I then ran:
SELECT OCRD.ZipCode, PCLOOKUP.Region, PCLOOKUP.PostCodeABR FROM OCRD LEFT OUTER JOIN PCLOOKUP ON OCRD.ZipCode >= PCLOOKUP.PostCodeFrom AND OCRD.ZipCode <= PCLOOKUP.PostCodeTo
Basically, if the postcode is between PostCodeFrom and PostCodeTo, display the abbreviation and region.
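The range lookup can be sketched as follows, again via Python's sqlite3 module rather than SAP. The TS row comes from the description above; the LL row and the customer rows are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PCLOOKUP (
        PostCodeFrom TEXT, PostCodeTo TEXT, PostCodeABR TEXT, Region TEXT
    );
    INSERT INTO PCLOOKUP VALUES
        ('TS00', 'TS99', 'TS', 'North M62'),
        ('LL00', 'LL99', 'LL', 'Wales');      -- hypothetical second range

    CREATE TABLE OCRD (CardCode TEXT, ZipCode TEXT);
    INSERT INTO OCRD VALUES
        ('C001', 'TS15'),
        ('C002', 'LL30'),
        ('C003', 'ZZ10');
""")

# BETWEEN is inclusive on both ends, i.e. >= PostCodeFrom AND <= PostCodeTo.
rows = conn.execute("""
    SELECT o.CardCode, p.PostCodeABR, p.Region
    FROM OCRD AS o
    LEFT OUTER JOIN PCLOOKUP AS p
      ON o.ZipCode BETWEEN p.PostCodeFrom AND p.PostCodeTo
    ORDER BY o.CardCode
""").fetchall()

print(rows)
```

The comparison here is textual (character by character), so the From/To values need to be chosen with the string ordering of real postcodes in mind.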

Selecting multiple options in MS Access

I am writing a query in SQL for Access. The query prompts for three values. I have three categories; I'll just use 'country', 'city' and 'street' for now. I am trying to figure out how to make it so that you only have to answer one of the three prompts, but if you answer two, the results are narrowed to rows matching both. For example, if I answered Georgia and Atlanta, Atlanta, Georgia would show up. Or if I entered Canal for 'street' and Louisiana, every street named Canal in Louisiana would show up.
Currently, if I type Canal and Louisiana, the query shows me everything listed under Louisiana plus every street named Canal (even the ones not in Louisiana).
SELECT *
FROM File
WHERE (((File.State)=[Enter the state]))
OR (((File.City)=[Enter the city]))
OR (((File.Street)=[Enter the street]));
I think you should be able to do it by using AND rather than OR to connect the criteria for the different columns, but not using the criteria for a column if its parameter wasn't given.
SELECT *
FROM File
WHERE ( ([Enter the state] = '') OR (File.State=[Enter the state]) )
AND ( ([Enter the city] = '') OR (File.City=[Enter the city]) )
AND ( ([Enter the street] = '') OR (File.Street=[Enter the street]) );
I'm kind of rusty with Access, so I'm not sure if the parameter will be null or '' if nothing is entered, so it might need to be adjusted a little for that.
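To show the AND/OR logic on its own, here is a sketch using Python's sqlite3 module with named parameters instead of Access prompts. The rows are invented, and the condition checks for both NULL and '' so it works whichever way an unanswered parameter arrives:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE File (State TEXT, City TEXT, Street TEXT);
    INSERT INTO File VALUES
        ('Louisiana', 'New Orleans', 'Canal'),
        ('Louisiana', 'Baton Rouge', 'Government'),
        ('New York',  'New York',    'Canal'),
        ('Georgia',   'Atlanta',     'Peachtree');
""")

# Each filter is skipped when its parameter is missing, otherwise it must match.
QUERY = """
    SELECT State, City, Street
    FROM File
    WHERE (:state  IS NULL OR :state  = '' OR State  = :state)
      AND (:city   IS NULL OR :city   = '' OR City   = :city)
      AND (:street IS NULL OR :street = '' OR Street = :street)
    ORDER BY State, City
"""

def search(state=None, city=None, street=None):
    """Run the query with whichever criteria were supplied."""
    params = {"state": state, "city": city, "street": street}
    return conn.execute(QUERY, params).fetchall()

canal_in_louisiana = search(state="Louisiana", street="Canal")
all_canal_streets = search(street="Canal")
everything = search()

print(canal_in_louisiana)
print(all_canal_streets)
print(len(everything))
```

In Access itself, the equivalent test for an unanswered prompt would be written against the parameter directly, for example with Is Null; that part is an assumption to verify in your own database.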
SELECT * FROM File WHERE
(((File.State)='[Enter the state]'))
OR (((File.City)='[Enter the city]'))
OR (((File.Street)='[Enter the street]'));
You just needed some quotes around them because they are strings.

How to sum employee paychecks from multiple jobs into one table

I have a table featuring fictitious employee data: the area cleared in hectares (ha), noted in French as superficie; the rate for 1 hectare of cleared land on the specific lot (French: taux); and the amount due (Expr1) for that lot.
My problem is that I want the total amount due for each worker, not the amount due for each worker for each lot. The totals for Alain, Jacques, Paul, Roger and Tanguay should be 4066, 4082, 5638, 5811 and 3131, respectively.
My code so far is this:
SELECT tbl_Employés.Num_deb, tbl_Employés.Prénom, tbl_Employés.Nom, tbl_Employés.Age, tbl_Employés.DEP, tbl_Employés.Expérience, tbl_Employés.Adresse, Tbl_terrain.superficie, Tbl_terrain.Taux, [superficie]*[taux] AS [Montant à payer]
FROM tbl_Employés INNER JOIN Tbl_terrain ON tbl_Employés.Num_deb = Tbl_terrain.Num_deb
ORDER BY tbl_Employés.Nom;
So far I have tried GROUP BY Numéro_terrain, which returns an error saying that my query does not include the specified expression ''Num_deb'' as part of an aggregate function.
I would greatly appreciate any input. I am very sorry if you have a hard time understanding some words, as I am doing my best to translate everything from French.
One method is to include all the columns in the GROUP BY that are not arguments to aggregation functions:
SELECT e.Num_deb, e.Prénom, e.Nom, e.Age, e.DEP, e.Expérience, e.Adresse,
SUM([superficie]*[taux]) AS [Montant à payer]
FROM tbl_Employés as e INNER JOIN
Tbl_terrain as t
ON e.Num_deb = t.Num_deb
GROUP BY e.Num_deb, e.Prénom, e.Nom, e.Age, e.DEP, e.Expérience, e.Adresse
ORDER BY e.Nom;
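A cut-down version of the same idea, run through Python's sqlite3 module instead of Access. The table and column names are simplified and the figures are invented, so the totals do not correspond to the ones in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical versions of tbl_Employés and Tbl_terrain.
    CREATE TABLE employes (Num_deb INTEGER PRIMARY KEY, Nom TEXT);
    CREATE TABLE terrain (Num_deb INTEGER, superficie INTEGER, taux INTEGER);

    INSERT INTO employes VALUES (1, 'Alain'), (2, 'Paul');
    INSERT INTO terrain VALUES
        (1, 10, 100),   -- Alain, lot 1: 10 ha at 100
        (1,  5, 200),   -- Alain, lot 2:  5 ha at 200
        (2,  3, 150);   -- Paul,  lot 1:  3 ha at 150
""")

# Every selected column that is not inside SUM() appears in GROUP BY,
# so each employee collapses to a single row with one total.
totals = conn.execute("""
    SELECT e.Num_deb, e.Nom, SUM(t.superficie * t.taux) AS montant
    FROM employes AS e
    INNER JOIN terrain AS t ON e.Num_deb = t.Num_deb
    GROUP BY e.Num_deb, e.Nom
    ORDER BY e.Nom
""").fetchall()

print(totals)
```

Per-lot columns such as superficie and taux are left out of the SELECT list on purpose: including them in GROUP BY would split the totals back into one row per lot.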

SQL (oracle) imposing hierarchy on where clause

Table Name: REG_NBRS
STATE CITY COUNTRY REG_NBR
------------------------------------------
ILLINOIS USA 444333222
NEBRASKA USA 111222333
NEW YORK USA 333444555
FLORIDA USA 666222666
TAMPA USA 888333888
I have data something like this, and I need to get the REG_NBR for that state or city. If the row matches on both state and city, city takes precedence, and if it matches on neither state nor city, I still have to list the row with null for REG_NBR.
I tried to come up with a query but didn't have much success, as I don't know how to impose a precedence while doing an outer join.
SELECT C.NAME, C.AGE, RN.REG_NBR
FROM CUSTOMER C, REG_NBRS RN
WHERE C.COUNTRY = RN.COUNTRY
AND (C.STATE = RN.STATE OR C.CITY = RN.CITY)
AND C.ID BETWEEN 1000 AND 2000
As a beginner, I do not know how to join these two tables in such a way that:
it joins first on STATE and CITY
it still lists all 1000 rows
it puts null for those registered numbers which are not from those states or cities
if both match, it uses the State's registered number (i.e. use 666222666 for Tampa even though we can find an entry for TAMPA)
I am sorry if this is not making sense, but I have tried to explain as much as possible. I have also tried different combinations of left outer join and right outer join but couldn't work out how to impose a hierarchy on the WHERE conditions. I thought of UNIONs, but I think even a UNION would list 2 rows for a customer in Tampa, one for each REG_NBR.
Any suggestions?
Apologies for the jumbled code, as I am using a (not so) smart phone to post this question.
Assuming that you have unique keys in (Country, State) and (Country, City), you can do it in a simple way by joining twice:
SELECT
C.NAME, C.AGE,
COALESCE(RN1.REG_NBR, RN2.REG_NBR) AS REG_NBR
FROM CUSTOMER C
LEFT OUTER JOIN REG_NBRS RN1
ON RN1.COUNTRY = C.COUNTRY
AND RN1.STATE = C.STATE
LEFT OUTER JOIN REG_NBRS RN2
ON RN2.COUNTRY = C.COUNTRY
AND RN2.CITY = C.CITY
WHERE C.ID BETWEEN 1000 AND 2000
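Moreover, this should be faster than the OR, which databases don't like much (at least with proper indexing in place). The order of the arguments to COALESCE sets the precedence: as written, the state match (RN1) wins, and swapping the two arguments would make the city match win instead.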
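A minimal sketch of the two-join approach, using Python's sqlite3 module rather than Oracle. The CUSTOMER rows are invented, and placing ILLINOIS and FLORIDA in STATE and TAMPA in CITY is an assumption about the sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE REG_NBRS (STATE TEXT, CITY TEXT, COUNTRY TEXT, REG_NBR INTEGER);
    INSERT INTO REG_NBRS VALUES
        ('ILLINOIS', NULL,    'USA', 444333222),
        ('FLORIDA',  NULL,    'USA', 666222666),
        (NULL,       'TAMPA', 'USA', 888333888);

    -- Hypothetical customers covering each case.
    CREATE TABLE CUSTOMER (ID INTEGER, NAME TEXT, STATE TEXT, CITY TEXT, COUNTRY TEXT);
    INSERT INTO CUSTOMER VALUES
        (1000, 'Ann', 'FLORIDA',  'TAMPA',   'USA'),  -- state and city both match
        (1001, 'Bob', 'ILLINOIS', 'CHICAGO', 'USA'),  -- state only
        (1002, 'Cy',  'TEXAS',    'AUSTIN',  'USA'),  -- no match
        (1003, 'Di',  NULL,       'TAMPA',   'USA');  -- city only
""")

# One LEFT JOIN per rule; COALESCE picks the first non-null result,
# so the state match (RN1) takes precedence over the city match (RN2).
rows = conn.execute("""
    SELECT C.NAME, COALESCE(RN1.REG_NBR, RN2.REG_NBR) AS REG_NBR
    FROM CUSTOMER C
    LEFT OUTER JOIN REG_NBRS RN1
      ON RN1.COUNTRY = C.COUNTRY AND RN1.STATE = C.STATE
    LEFT OUTER JOIN REG_NBRS RN2
      ON RN2.COUNTRY = C.COUNTRY AND RN2.CITY = C.CITY
    WHERE C.ID BETWEEN 1000 AND 2000
    ORDER BY C.ID
""").fetchall()

print(rows)
```

Each customer appears exactly once because each join can match at most one REG_NBRS row, which is where the unique-key assumption comes in.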