ANSI NULLS and the ON clause - sql

Please see the DDL below:
create table #address (ID int IDENTITY, housenumber varchar(30), street varchar(30), town varchar(30), county varchar(30), postcode varchar(30), primary key (id))
insert into #address (housenumber,street,town,county,postcode) values ('1', 'The Street', 'Lincoln', null, 'LN21AA')
insert into #address (housenumber,street,town,county,postcode) values ('1', 'The Street', 'Lincoln', null, 'LN21AA')
insert into #address (housenumber,street,town,county,postcode) values ('1', 'The Street', 'Lincoln', 'Lincolnshire', 'LN21AA')
and the SQL below:
select #address.id as masterid, address2.id as childid from #address inner join #address as address2 on
#address.housenumber=address2.housenumber and #address.street=address2.street
and #address.town=address2.town
and #address.county=address2.county
and #address.postcode=address2.postcode
where #address.id<address2.id
I am trying to identify duplicates.
The county column is sometimes NULL and sometimes not. The query above returns no rows.
I have tried this command:
set ansi_nulls off
However, it makes no difference. I realise I can do this:
select #address.id as masterid, address2.id as childid from #address inner join #address as address2 on
#address.housenumber=address2.housenumber and #address.street=address2.street
and #address.town=address2.town
and ((#address.county=address2.county) or (#address.county is null and address2.county is null))
and #address.postcode=address2.postcode
However, I am interested to know why setting ANSI_NULLS to OFF allows you to do this:
select * from #address where county=null
which returns two rows, while my first query still returns no rows when ANSI_NULLS is OFF. Why does ANSI_NULLS have no effect on the ON clause?
I have spent 20 minutes Googling this, but I have not found an answer.

You can identify duplicates by using group by. The following returns the ids when there are two values:
select housenumber, street, town, county, postcode, count(*) as cnt,
min(a.id) as masterid, max(a.id) as childid
from #address a
group by housenumber, street, town, county, postcode
having count(*) >= 2;
Getting all ids for a given address would require additional joins or funky string aggregations.
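As for why the setting makes no difference in the ON clause: according to the SQL Server documentation, SET ANSI_NULLS only changes a comparison when one operand is a literal NULL or a variable that is NULL. A comparison between two columns, such as #address.county=address2.county, is not affected. If the pairwise join is still wanted, one possible null-safe form is sketched below. It relies on INTERSECT treating two NULLs as equal; the aliases a and b are illustrative.

```sql
-- Sketch: null-safe self-join on the #address table from the question.
-- INTERSECT compares the two column lists and treats NULL as equal to NULL,
-- so rows match even when county is NULL on both sides.
select a.id as masterid, b.id as childid
from #address as a
inner join #address as b
    on a.id < b.id
   and exists (
        select a.housenumber, a.street, a.town, a.county, a.postcode
        intersect
        select b.housenumber, b.street, b.town, b.county, b.postcode
   );
```

With the three sample rows, this should pair only the two rows whose county is NULL, since the third row has a different county.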

Related

Getting more rows back than I should while splitting a table into separate tables

I have a temp customer table
create table #tmpCustomer
(
ID int identity(1,1),
CustomerID nvarchar(128),
CustomerName nvarchar(50),
FirstName nvarchar(50),
LastName nvarchar(50),
DateCreated nvarchar(50),
CreatedBy int,
YearBuilt nvarchar(50),
IsActive bit,
CustTypeID nvarchar(128),
CustomerTypeID int,
CompanyID int,
Line1 nvarchar(50) not null,
Line2 nvarchar(50) null,
Line3 nvarchar(50) null,
City nvarchar(50) not null,
ZipCode nvarchar(15),
StateID int not null,
NewCounty nvarchar(20),
SubDivisionID int null
)
I am populating it from the original customer table with this statement:
declare @separator char(1);
set @separator = ',';
insert into #tmpCustomer
(CustomerID, CustomerName, FirstName, LastName, DateCreated, CreatedBy, YearBuilt, IsActive, CustomerTypeID, CompanyID, Line1, Line2, Line3, City, ZipCode, StateID, CountyID, NewCounty, SubDivisionID)
select
c.CustomerID,
c.CustomerName,
LastName = Case
When CHARINDEX(@separator, c.CustomerName, 1) - 1 <= 0 Then c.CustomerName
Else SUBSTRING(c.CustomerName,1,CHARINDEX(@separator, c.CustomerName, 1) - 1)
End,
FirstName = Case
When CHARINDEX(@separator, c.CustomerName, 1) - 1 <= 0 Then NULL
Else SUBSTRING(c.CustomerName,CHARINDEX(@separator, c.CustomerName, 1) + 1, Len(c.CustomerName) - (CHARINDEX(@separator, c.CustomerName, 1) ))
End,
GETDATE(),
1,
case when c.YearBuilt is NULL then 'N/A' else c.YearBuilt end,
c.EnabledInd,
case when CustomerTypeID = '4F0B6446-441D-46B8-81CB-B0E8A94624A7' then 1 else 2 end,
1,
c.Address1,
c.Address2,
null,
c.City,
c.ZipCode,
11,
null,
Case when c.County <> '' then County ELSE 'N/A' end as "County",
null
from [Customer] c
where c.CompanyID = '21DE6731-5E6C-11D5-AF81-00D0B74725F6'
This has a total count of 33165 records, which is exactly what it should be.
Next, I have a separate table to hold the addresses:
create table #tmpAddress
(
ID int identity(1,1),
Line1 nvarchar(50) not null,
Line2 nvarchar(50) null,
Line3 nvarchar(50) null,
City nvarchar(50) not null,
ZipCode nvarchar(15),
StateID int not null,
CountyID int null,
SubDivisionID int null
)
and if I run this
insert into #tmpAddress
(Line1, Line2, Line3, City, ZipCode, StateID, CountyID,SubDivisionID)
select cr.Line1, cr.Line2,cr.Line3, cr.City, cr.ZipCode, 11, 0, null from #tmpCustomer cr
Then I get the correct number of addresses, 33165.
The problem that I am running into is the county.
I have a table of counties, called County, and the issue appears when I join to it to get its IDs. The following returns 34546 records, which is over 1000 more than it should:
insert into #tmpAddress
(Line1, Line2, Line3, City, ZipCode, StateID, ct.CountyID,SubDivisionID)
select cr.Line1, cr.Line2,'', cr.City, cr.ZipCode, 11, ct.CountyID, null from #tmpCustomer cr
inner join Exo.dbo.County ct on ct.County = cr.NewCounty
I don't know what is going wrong. Could someone point it out so that I can get 33165 records in the #tmpAddress table?
As alluded to by unfinishedmonkey, it looks like you have duplicate values in Exo.dbo.County.County, you can check with a query like:
SELECT c.County, COUNT(*)
FROM Exo.dbo.County c
GROUP BY c.County
HAVING COUNT(*) > 1
ORDER BY c.County
If this returns any records, then you have duplicate records in Exo.dbo.County with the same value in the County field, which in turn is leading to you getting multiple rows in your resultset for a single row from #tmpCustomer.
You could solve this in a couple of ways, firstly by removing records from Exo.dbo.County so that all County values in that table are unique, or by amending your SELECT clause:
SELECT DISTINCT cr.Line1, cr.Line2,'', cr.City, cr.ZipCode, 11, ct.CountyID, NULL
FROM #tmpCustomer cr
INNER JOIN Exo.dbo.County ct ON ct.County = cr.NewCounty
If this query still returns more records than you get from SELECT Line1, Line2, cr.City, cr.ZipCode FROM #tmpCustomer, I'm not sure what the problem could be. If it returns fewer records, then some records in #tmpCustomer must have a NewCounty value that doesn't appear at all in Exo.dbo.County.County, so you might want to consider a LEFT OUTER JOIN in that case.
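Note that SELECT DISTINCT would also collapse two customers who happen to share the same address. A sketch that avoids this, assuming it is acceptable to pick the lowest CountyID when a county name is duplicated (that choice is an assumption, not something stated in the question), joins to a de-duplicated derived table with a LEFT OUTER JOIN so that each #tmpCustomer row produces exactly one address row:

```sql
-- Sketch: one output row per #tmpCustomer row, even if County names repeat
-- or a NewCounty value has no match in Exo.dbo.County.
insert into #tmpAddress
    (Line1, Line2, Line3, City, ZipCode, StateID, CountyID, SubDivisionID)
select cr.Line1, cr.Line2, cr.Line3, cr.City, cr.ZipCode, 11, ct.CountyID, null
from #tmpCustomer cr
left outer join (
    -- collapse duplicate county names to a single row each;
    -- min(CountyID) is an arbitrary tie-breaker (assumption)
    select County, min(CountyID) as CountyID
    from Exo.dbo.County
    group by County
) ct on ct.County = cr.NewCounty;
```

Rows whose NewCounty has no match (for example 'N/A') are kept with a NULL CountyID rather than dropped.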

MS SQL Stored procedure to get the value

I have 3 tables
Staff table:
EmpId CandidateId
------------------------
1 2
Candidate table:
CandidateId Firstname Last name CountryId PassportCountry
--------------------------------------------------------------------
1 Mark Antony 2 3
2 Joy terry 1 3
Country:
CountryId Name
---------------------------
1 USA
2 UK
3 Australia
The user will pass the EmpId in the query string, and I need to show the candidate details for that EmpId. I have only one country table, which I use for both the country and the passport country, so I need to get the country names when I fetch the candidate.
How do I write the stored procedure to get the candidate details? I'm not good at SQL. Can you help me with this? Thanks in advance.
I tried the script below to get the country name and the passport country name. I can get the country name, but not the passport country.
SELECT
FirstName,
LastName,
PassportCountry,
Country.CountryName as Country
from Candidate
inner join Country
on country.CountryId=candidate.country
where CandidateId=@CandidateId;
This should get you going in the right direction.
Basically, since you're referencing the Country table twice, you need to join it twice.
declare @staff table (EmpId int, CandidateId int)
insert into @staff values (1, 2)
declare @Candidate table (CandidateId int, Firstname nvarchar(50), Lastname nvarchar(50), CountryId int, PassportCountry int)
insert into @Candidate
values (1, 'Mark', 'Antony', 2, 3),
(2, 'Joy', 'Terry', 1, 3)
declare @Country table (CountryId int, Name nvarchar(50))
insert into @Country
values (1, 'USA'),
(2, 'UK'),
(3, 'Australia')
declare @empID int
set @empID = 1
SELECT
FirstName,
LastName,
t2.Name as PersonCountry,
t3.Name as PassportCountry
from @staff s
inner join @Candidate t1 on s.CandidateId = t1.CandidateId
inner join @Country t2 on t1.CountryId=t2.CountryId
inner join @Country t3 on t1.PassportCountry=t3.CountryId
where s.EmpId=@empID;
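Since the question asks for a stored procedure, the same double join could be wrapped roughly as follows. This is only a sketch: the procedure name is hypothetical, and the table and column names (Staff, Candidate, Country, Firstname, Lastname) are assumed from the tables shown in the question.

```sql
-- Sketch: hypothetical procedure name; table names assumed from the question.
create procedure dbo.GetCandidateByEmpId
    @EmpId int
as
begin
    set nocount on;

    select c.Firstname,
           c.Lastname,
           home.Name     as Country,
           passport.Name as PassportCountry
    from Staff s
    inner join Candidate c      on c.CandidateId = s.CandidateId
    -- the Country table is joined twice, once per country column
    inner join Country home     on home.CountryId = c.CountryId
    inner join Country passport on passport.CountryId = c.PassportCountry
    where s.EmpId = @EmpId;
end
```

It could then be called with, for example, exec dbo.GetCandidateByEmpId @EmpId = 1;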

Join on select SQL Server stored procedure

I am trying to create a stored procedure to insert data into 2 tables in SQL Server.
I have tried putting the join in all different positions of the code and still get an error.
CREATE PROCEDURE sp_Insert_Person
@s_FirstName nvarchar(50),
@s_Surname nvarchar(50),
@s_AddressLine1 nvarchar(50),
@s_AddressLine2 nvarchar(50),
@s_Postcode nvarchar(10),
@s_Phone nvarchar(50),
@s_Department nvarchar(50)
AS
BEGIN
INSERT INTO
tbl_person(FirstName, Surname, AddressLine1, AddressLine2,
Postcode, Phone, tbl_Department.Department)
INNER JOIN tbl_person
ON tbl_person.DepartmentID = tbl_Department.DepartmentID
VALUES (@s_FirstName,
@s_Surname,
@s_AddressLine1,
@s_AddressLine2,
@s_Postcode,
@s_Phone,
@s_Department)
END
I have tried the join at the end and at the beginning, and I have looked all over for examples of inserts with joins, so I wonder if I am just getting it all wrong.
I have a department table and a person table, and I thought I would be able to access the department table through the FK DepartmentID in the Person table, which is the PK in the Department table.
I think something like this
INSERT INTO tbl_person
(FirstName,
Surname,
AddressLine1,
AddressLine2,
Postcode,
Phone,
DepartmentID)
Select @s_FirstName,
@s_Surname,
@s_AddressLine1,
@s_AddressLine2,
@s_Postcode,
@s_Phone,
tbl_Department.DepartmentID
from tbl_Department
where tbl_Department.Department = @s_Department
CREATE PROCEDURE sp_Insert_Person
@s_FirstName nvarchar(50),
@s_Surname nvarchar(50),
@s_AddressLine1 nvarchar(50),
@s_AddressLine2 nvarchar(50),
@s_Postcode nvarchar(10),
@s_Phone nvarchar(50),
@s_Department nvarchar(50)
AS
BEGIN
-- create the department first if it does not exist yet
if not Exists(select * from tbl_Department where Department=@s_Department)
insert into tbl_Department (Department) Values (@s_Department)
-- look up the DepartmentID by name and insert the person with it
INSERT INTO tbl_person
(FirstName,
Surname,
AddressLine1,
AddressLine2,
Postcode,
Phone,
DepartmentID)
select @s_FirstName,
@s_Surname,
@s_AddressLine1,
@s_AddressLine2,
@s_Postcode,
@s_Phone,
DepartmentID
from tbl_Department
where Department=@s_Department
END

versioning of a table

Has anybody seen any examples of a table with multiple versions of each record?
Something like this: if you had the table
Person(Id, FirstName, LastName)
and you changed a record's LastName, then you would have both versions of LastName (the first one, and the one after the change).
I've seen this done two ways. The first is in the table itself, by adding an EffectiveDate and a CancelDate column (or something similar). To get the current version of a given record, you'd do something like: SELECT Id, FirstName, LastName FROM Table WHERE CancelDate IS NULL
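A minimal sketch of this first approach, assuming a hypothetical PersonVersion table (the table name, the VersionId column, and the sample values are illustrative and not from the original post). A change closes the current row and inserts a new one:

```sql
-- Hypothetical table: one row per version of a person.
create table dbo.PersonVersion
(
    VersionId     int identity(1,1) primary key,
    Id            int          not null,  -- logical person id, repeated across versions
    FirstName     nvarchar(50) not null,
    LastName      nvarchar(50) not null,
    EffectiveDate datetime     not null,
    CancelDate    datetime     null       -- NULL marks the current version
);

insert into dbo.PersonVersion (Id, FirstName, LastName, EffectiveDate)
values (2, 'Amy', 'Edge', '20200101');

-- Change the LastName of person 2 while keeping the old version.
declare @now datetime = getdate();

begin transaction;
    -- close the current version
    update dbo.PersonVersion
    set CancelDate = @now
    where Id = 2 and CancelDate is null;

    -- add the new version, copying the unchanged columns from the row just closed
    insert into dbo.PersonVersion (Id, FirstName, LastName, EffectiveDate)
    select Id, FirstName, 'Engle', @now
    from dbo.PersonVersion
    where Id = 2 and CancelDate = @now;
commit transaction;
```

Filtering on CancelDate IS NULL then returns only the latest version, while filtering on Id alone returns the full history.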
The other is to have a global history table (which holds all of your historical data). The structure for such a table normally looks something like
Id bigint not null,
TableName nvarchar(50),
ColumnName nvarchar(50),
PKColumnName nvarchar(50),
PKValue bigint, -- or whatever datatype
OriginalValue nvarchar(max),
NewValue nvarchar(max),
ChangeDate datetime
Then you set a trigger on your tables (or, alternatively, adopt a policy that all of your updates and inserts also insert into your history table) so that the correct data is logged.
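As a rough illustration of the trigger variant, here is a sketch that logs LastName changes on the Person table from the question. The history table name ChangeHistory and the trigger name are hypothetical, and the sketch assumes the primary key Id is never updated.

```sql
create table dbo.Person
(
    Id        int primary key,
    FirstName nvarchar(50),
    LastName  nvarchar(50)
);

-- Hypothetical name for the global history table described above.
create table dbo.ChangeHistory
(
    Id            bigint identity(1,1) not null primary key,
    TableName     nvarchar(50),
    ColumnName    nvarchar(50),
    PKColumnName  nvarchar(50),
    PKValue       bigint,
    OriginalValue nvarchar(max),
    NewValue      nvarchar(max),
    ChangeDate    datetime
);
go  -- batch separator (SSMS/sqlcmd); CREATE TRIGGER must start its own batch

create trigger dbo.trg_Person_LastName_History
on dbo.Person
after update
as
begin
    set nocount on;

    -- "deleted" holds the old rows, "inserted" the new ones
    insert into dbo.ChangeHistory
        (TableName, ColumnName, PKColumnName, PKValue, OriginalValue, NewValue, ChangeDate)
    select 'Person', 'LastName', 'Id', d.Id, d.LastName, i.LastName, getdate()
    from deleted d
    inner join inserted i on i.Id = d.Id
    where d.LastName <> i.LastName
       or (d.LastName is null and i.LastName is not null)
       or (d.LastName is not null and i.LastName is null);
end
go
```

Each tracked column would need a similar insert, which is one reason this design tends to be more verbose than keeping versions in the table itself.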
The way we're doing it (might not be the best way) is to have an active bit field, and a foreign key back to the parent record. So for general queries you would filter on active employees, but you can get the history of a single employee with their Employee ID.
declare @employees table
(
PK_emID int identity(1,1),
EmployeeID int,
FirstName varchar(50),
LastName varchar(50),
Active bit,
FK_EmployeeID int,
primary key(PK_emID)
)
insert into @employees
(
EmployeeID,
FirstName,
LastName,
Active,
FK_EmployeeID
)
select 1, 'David', 'Engle', 1,null
union all
select 2, 'Amy', 'Edge', 0,null
union all
select 2, 'Amy','Engle',1,2

Moving away from STI - SQL to break single table into new multi-table structure

I am moving an old project that used single table inheritance into a new, more structured database. How would I write a SQL script to port the data?
Old structure
I've simplified the SQL for legibility.
CREATE TABLE customers (
id int(11),
...
firstName varchar(50),
surname varchar(50),
address1 varchar(50),
address2 varchar(50),
town varchar(50),
county varchar(50),
postcode varchar(50),
country varchar(50),
delAddress1 varchar(50),
delAddress2 varchar(50),
delTown varchar(50),
delCounty varchar(50),
delPostcode varchar(50),
delCountry varchar(50),
tel varchar(50),
mobile varchar(50),
workTel varchar(50),
);
New structure
CREATE TABLE users (
id int(11),
firstName varchar(50),
surname varchar(50),
...
);
CREATE TABLE addresses (
id int(11),
ForeignKey(user),
street1 varchar(50),
street2 varchar(50),
town varchar(50),
county varchar(50),
postcode varchar(50),
country varchar(50),
type ...,
);
CREATE TABLE phone_numbers (
id int(11),
ForeignKey(user),
number varchar(50),
type ...,
);
Using cross-database notation for the table references where needed:
INSERT INTO Users(id, firstname, surname, ...)
SELECT id, firstname, surname, ...
FROM Customers;
INSERT INTO Addresses(id, street1, street2, ...)
SELECT id, address1, address2, ...
FROM Customers;
INSERT INTO Phone_Numbers(id, number, type, ...)
SELECT id, tel, 'home', ... -- 'home' is an illustrative type tag
FROM Customers;
If you want both the new and the old address (del* version), then repeat the address operation on the two sets of source columns with appropriate tagging. Similarly, for the three phone numbers, repeat the phone number operation. Or use a UNION in each case.
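The UNION variant for the two address sets might look like the sketch below. The user_id column name, the 'billing' and 'delivery' tags, and the assumption that addresses.id is generated automatically are all illustrative; the question only shows ForeignKey(user) and type ... for those parts of the schema.

```sql
-- Sketch: copy both address sets from customers into addresses, tagged by type.
-- user_id and the type values are assumed names, not taken from the question.
INSERT INTO addresses (user_id, street1, street2, town, county, postcode, country, type)
SELECT id, address1, address2, town, county, postcode, country, 'billing'
FROM customers
UNION ALL
SELECT id, delAddress1, delAddress2, delTown, delCounty, delPostcode, delCountry, 'delivery'
FROM customers
-- skip customers that have no separate delivery address
WHERE delAddress1 IS NOT NULL AND delAddress1 <> '';
```

UNION ALL is used rather than UNION so that no rows are silently merged when two customers share an address.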
First, make sure to back up your existing data!
The process is different depending on whether you are going to reuse the original id field or generate a new one.
Assuming you are going to reuse the original, make sure that you are able to insert explicit id values into the table before you start (in SQL Server, if the number is auto-generated, that means SET IDENTITY_INSERT ... ON; I am not sure what the MySQL equivalent is). Write an insert from the old table to the parent table:
insert into newparenttable (idfield, field1, field2)
select idfield, field1, field2 from oldparenttable
Then write similar inserts for all the child tables, depending on which fields you need. Where you have multiple phone numbers in different fields, for instance, you would use a UNION ALL statement as the select for your insert.
Insert newphone (phonenumber, userid, phonetype)
select home_phone, id, 100 from oldparenttable
union all
select work_phone, id, 101 from oldparenttable
Union all
select cell_phone, id, 102 from oldparenttable
If you are going to have a new id generated, then create the table with a field for the old id. You can drop this at the end (although I'd keep it for about six months). Then you can join from the new parent table to the old parent table on the old id and grab the new id from the new parent table when you do your inserts to the child tables. Something like:
Insert newphone (phonenumber, userid, phonetype)
select home_phone, n.id, 100 from oldparenttable o
join newparenttable n on n.oldid = o.id
union all
select work_phone, n.id, 101 from oldparenttable o
join newparenttable n on n.oldid = o.id
Union all
select cell_phone, n.id, 102 from oldparenttable o
join newparenttable n on n.oldid = o.id