Optional Unique Constraints? - sql

I am setting up a SaaS application that multiple clients will use to enter data. However, there are certain fields that Client A may want to force unique, while Client B may want to allow dupes. Obviously, if I am going to allow any one client to have dupes, the table cannot have a unique constraint on it. The downside is that if I want to enforce a unique constraint for some clients, I will have to go about it some other way.
Has anyone tackled a problem like this, and if so, what are the common solutions and/or potential pitfalls to look out for?
I am thinking a trigger that checks for any possible unique flags may be the only way to enforce this correctly. If I rely on the business layer, there is no guarantee that the app will do a unique check before every insert.
SOLUTION:
First I considered the Unique Index, but ruled it out as they cannot do any sort of joins or lookups, only express values. And I didn't want to modify the index every time a client was added or a client's uniqueness preference changed.
Then I looked into CHECK CONSTRAINTS, and after some fooling around, built one function to return true for both hypothetical columns that a client would be able to select as unique or not.
Here are the test tables, data, and function I used to verify that a check constraint could do all that I wanted.
-- Clients Table
CREATE TABLE [dbo].[Clients](
[ID] [int] NOT NULL,
[Name] [varchar](50) NOT NULL,
[UniqueSSN] [bit] NOT NULL,
[UniqueVIN] [bit] NOT NULL
) ON [PRIMARY]
-- Test Client Data
INSERT INTO Clients(ID, Name, UniqueSSN, UniqueVIN) VALUES(1,'A Corp',0,0)
INSERT INTO Clients(ID, Name, UniqueSSN, UniqueVIN) VALUES(2,'B Corp',1,0)
INSERT INTO Clients(ID, Name, UniqueSSN, UniqueVIN) VALUES(3,'C Corp',0,1)
INSERT INTO Clients(ID, Name, UniqueSSN, UniqueVIN) VALUES(4,'D Corp',1,1)
-- Cases Table
CREATE TABLE [dbo].[Cases](
[ID] [int] IDENTITY(1,1) NOT NULL,
[ClientID] [int] NOT NULL,
[ClaimantName] [varchar](50) NOT NULL,
[SSN] [varchar](12) NULL,
[VIN] [varchar](17) NULL
) ON [PRIMARY]
-- Check Uniques Function
CREATE FUNCTION CheckUniques(@ClientID int)
RETURNS int -- 0: OK to insert, non-zero: cannot insert
AS
BEGIN
    DECLARE @SSNCheck int
    DECLARE @VinCheck int
    SELECT @SSNCheck = 0
    SELECT @VinCheck = 0
    IF (SELECT UniqueSSN FROM Clients WHERE ID = @ClientID) = 1
    BEGIN
        -- Count this client's rows whose SSN appears more than once for the same client
        SELECT @SSNCheck = COUNT(SSN) FROM Cases cs WHERE ClientID = @ClientID AND (SELECT COUNT(SSN) FROM Cases c2 WHERE c2.SSN = cs.SSN AND c2.ClientID = cs.ClientID) > 1
    END
    IF (SELECT UniqueVIN FROM Clients WHERE ID = @ClientID) = 1
    BEGIN
        -- Same check for VIN
        SELECT @VinCheck = COUNT(VIN) FROM Cases cs WHERE ClientID = @ClientID AND (SELECT COUNT(VIN) FROM Cases c2 WHERE c2.VIN = cs.VIN AND c2.ClientID = cs.ClientID) > 1
    END
    RETURN @SSNCheck + @VinCheck
END
-- Add Check Constraint to table
ALTER TABLE Cases
ADD Constraint chkClientUniques CHECK(dbo.CheckUniques(ClientID) = 0)
-- Now confirm constraint using test data
-- Client A: Confirm that both duplicate SSNs and VINs are allowed
INSERT INTO Cases (ClientID, ClaimantName, SSN, VIN) VALUES(1, 'Alice', '111-11-1111', 'A-1234')
INSERT INTO Cases (ClientID, ClaimantName, SSN, VIN) VALUES(1, 'Bob', '111-11-1111', 'A-1234')
-- Client B: Confirm that Unique SSN is enforced, but duplicate VIN allowed
INSERT INTO Cases (ClientID, ClaimantName, SSN, VIN) VALUES(2, 'Charlie', '222-22-2222', 'B-2345') -- Should work
INSERT INTO Cases (ClientID, ClaimantName, SSN, VIN) VALUES(2, 'Donna', '222-22-2222', 'B-2345') -- Should fail
INSERT INTO Cases (ClientID, ClaimantName, SSN, VIN) VALUES(2, 'Evan', '333-33-3333', 'B-2345') -- Should Work
-- Client C: Confirm that Unique VIN is enforced, but duplicate SSN allowed
INSERT INTO Cases (ClientID, ClaimantName, SSN, VIN) VALUES(3, 'Evan', '444-44-4444', 'C-3456') -- Should work
INSERT INTO Cases (ClientID, ClaimantName, SSN, VIN) VALUES(3, 'Fred', '444-44-4444', 'C-3456') -- Should fail
INSERT INTO Cases (ClientID, ClaimantName, SSN, VIN) VALUES(3, 'Ginny', '444-44-4444', 'C-4567') -- Should work
-- Client D: Confirm that both Unique SSN and VIN are enforced
INSERT INTO Cases (ClientID, ClaimantName, SSN, VIN) VALUES(4, 'Henry', '555-55-5555', 'D-1234') -- Should work
INSERT INTO Cases (ClientID, ClaimantName, SSN, VIN) VALUES(4, 'Isaac', '666-66-6666', 'D-1234') -- Should fail
INSERT INTO Cases (ClientID, ClaimantName, SSN, VIN) VALUES(4, 'James', '555-55-5555', 'D-2345') -- Should fail
INSERT INTO Cases (ClientID, ClaimantName, SSN, VIN) VALUES(4, 'Kevin', '555-55-5555', 'D-1234') -- Should fail
INSERT INTO Cases (ClientID, ClaimantName, SSN, VIN) VALUES(4, 'Lisa', '777-77-7777', 'D-3456') -- Should work
EDIT:
Had to modify the function a few times to catch NULL values in the dupe check, but all appears to be working now.

One approach is to use a CHECK constraint instead of a unique constraint.
This CHECK constraint will be backed by a scalar function that will:
1. take ClientID as input
2. cross-reference ClientID against a lookup table to see if duplicates are allowed (client.dups)
3. if not allowed, check for duplicates in the table
Something like:
ALTER TABLE TBL ADD CONSTRAINT CK_TBL_UNIQ CHECK(dbo.IsThisOK(clientID)=1)

If you can identify rows in the table for each client, depending on your DBMS you could do something like this:
CREATE UNIQUE INDEX uq_some_col
ON the_table(some_column, other_column, client_id)
WHERE client_id IN (1,2,3);
(The above is valid for PostgreSQL and, I think, SQL Server 2008 and later.)
The downside is that you will need to re-create that index each time a new client is added that requires those columns to be unique.
You will probably have some checks in the business layer as well, mostly to be able to show proper error messages.
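Applied to the Cases table from the question, a filtered unique index might look like the sketch below. It assumes SQL Server 2008 or later and that clients 2 and 4 are the ones that asked for unique SSNs; the index name is made up for illustration.

```sql
-- Enforce unique SSNs only for the clients named in the filter.
-- Rows belonging to other clients are not part of the index,
-- so those clients can still store duplicate SSNs.
CREATE UNIQUE NONCLUSTERED INDEX UX_Cases_SSN_UniqueClients
ON dbo.Cases (ClientID, SSN)
WHERE ClientID IN (2, 4) AND SSN IS NOT NULL;
```

When a client's preference changes, the index has to be re-created with a new filter list, which is the maintenance cost described above.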

That's perfectly possible now on SQL Server 2008 (tested on my SQL Server 2008 box here):
create table a
(
cx varchar(50)
);
create unique index ux_cx on a(cx) where cx <> 'US';
insert into a values('PHILIPPINES'),('JAPAN'),('US'),('CANADA');
-- still ok to insert duplicate US
insert into a values('US');
-- will fail here
insert into a values('JAPAN');
Related article: http://www.ienablemuch.com/2010/12/postgresql-said-sql-server2008-said-non.html

There are a few things you can do with this; it just depends on when/how you want to handle it.
1. You could use a check constraint and modify it to do different lookups based on the client that is using it.
2. You could use the business tier to handle this, but it will not protect you from raw database updates.
I personally have found that #1 can get too hard to maintain, especially if you get a high number of clients. I've found that doing it at the business level is a lot easier, and you can control it at a centralized location.
There are other options, such as a table per client, that could work as well, but these two are the most common that I've seen.

You could add a helper column. The column would be equal to the primary key for the application that allows duplicates, and a constant value for the other application. Then you create a unique constraint on UniqueHelper, Col1.
For the non-dupe client, it will have a constant in the helper column, forcing Col1 to be unique.
For the dupe client, the helper column is equal to the primary key, so the unique constraint is satisfied by that column alone. That application can add any number of dupes.
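A minimal sketch of the helper-column idea in T-SQL follows. The table name, the AllowDupSSN flag stored on each row, and the constraint name are all assumptions made for the example; the helper is written as a persisted computed column so that it can follow the identity value without extra code.

```sql
-- Hypothetical demo table. AllowDupSSN is assumed to be copied from the
-- client's preference when the row is inserted.
CREATE TABLE dbo.CasesHelperDemo (
    ID           int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    ClientID     int NOT NULL,
    SSN          varchar(12) NOT NULL,
    AllowDupSSN  bit NOT NULL,
    -- The row's own ID when dupes are allowed, the constant 0 otherwise.
    UniqueHelper AS (CASE WHEN AllowDupSSN = 1 THEN ID ELSE 0 END) PERSISTED,
    CONSTRAINT UQ_CasesHelperDemo_SSN UNIQUE (ClientID, UniqueHelper, SSN)
);

-- Client 1 allows dupes: the helper differs per row, so both rows fit.
INSERT INTO dbo.CasesHelperDemo (ClientID, SSN, AllowDupSSN) VALUES (1, '111-11-1111', 1);
INSERT INTO dbo.CasesHelperDemo (ClientID, SSN, AllowDupSSN) VALUES (1, '111-11-1111', 1);

-- Client 2 requires unique SSNs: the helper is 0 both times,
-- so the second insert violates the unique constraint.
INSERT INTO dbo.CasesHelperDemo (ClientID, SSN, AllowDupSSN) VALUES (2, '222-22-2222', 0);
INSERT INTO dbo.CasesHelperDemo (ClientID, SSN, AllowDupSSN) VALUES (2, '222-22-2222', 0);
```

One trade-off of this sketch is that changing a client's preference means updating the flag on that client's existing rows, and the update fails if duplicates are already present.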

One possibility might be to use BEFORE INSERT and BEFORE UPDATE triggers that can selectively enforce uniqueness.
And another possibility (kind of a kludge) would be to have an additional dummy field that is populated with unique values for one customer and duplicate values for the other customer. Then build a unique index on the combination of the dummy field and the visible field.
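For the trigger option, note that SQL Server has no BEFORE triggers, so on that platform the check would have to run in an AFTER (or INSTEAD OF) trigger. The sketch below assumes the Clients and Cases tables from the question and only covers the SSN rule; the trigger name and error text are made up, and a VIN check would follow the same pattern.

```sql
CREATE TRIGGER dbo.trg_Cases_EnforceUniqueSSN
ON dbo.Cases
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Look for duplicate SSNs, but only for clients that were touched by
    -- this statement and that have the UniqueSSN flag switched on.
    IF EXISTS (
        SELECT 1
        FROM dbo.Cases AS c
        JOIN dbo.Clients AS cl ON cl.ID = c.ClientID
        WHERE cl.UniqueSSN = 1
          AND c.SSN IS NOT NULL
          AND c.ClientID IN (SELECT ClientID FROM inserted)
        GROUP BY c.ClientID, c.SSN
        HAVING COUNT(*) > 1
    )
    BEGIN
        RAISERROR('Duplicate SSN for a client that requires unique SSNs.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END
```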

@Neil. I had asked in my comment above what your reasons were for putting everything in the same table, and you simply ignored that aspect of my comment and said everything was "plain and simple". Do you really want to hear the downsides of the conditional constraints approach in a SaaS context?
You don't say how many different sets of rules this table in your SaaS application may eventually need to incorporate. Will there be only two variations?
Performance is a consideration. Although each customer would have access to a dedicated conditional index or indexes, fetching data from the base table could become slower and slower as the data from additional customers is added to it and the table grows.
If I were developing a SaaS application, I'd go with dedicated transaction tables wherever appropriate. Customers could share standard tables like zipcodes and counties, and even share domain-specific tables like Products or Categories or WidgetTypes or whatever. I'd probably build dynamic SQL statements in stored procedures in which the correct table for the current customer was chosen and placed in the statement being constructed, e.g.
sql = "select * from " + DYNAMIC_ORDERS_TABLE + " where ... "
If performance was taking a hit because the dynamic statements had to be compiled all the time, I might consider writing a dedicated stored procedure generator: sp_ORDERS_SELECT_25_v1.0 {where "25" is the id assigned to a particular user of the SaaS app and there's a version suffix}.
You're going to have to use some dynamic SQL because the customer id must be appended to the WHERE-clause of every one of your ad hoc queries in order to take advantage of your conditional indexes:
sql = " select * from orders where ... and customerid = " + CURRENT_CUSTOMERID
Your conditional indexes involve your customer/user column and so that column must be made part of every query in order to ensure that only that customer's subset of rows are selected out of the table.
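If the customer id does have to be added to every query, one way to avoid string concatenation is to pass it as a parameter through sp_executesql. This is only a sketch: dbo.Orders, its CustomerID column, and the value 25 are placeholders, not names from the question.

```sql
DECLARE @CurrentCustomerID int;
SET @CurrentCustomerID = 25;   -- hypothetical id of the current customer

DECLARE @sql nvarchar(max);
SET @sql = N'SELECT * FROM dbo.Orders WHERE CustomerID = @cid';

-- The customer id travels as a typed parameter instead of being
-- spliced into the statement text.
EXEC sys.sp_executesql @sql, N'@cid int', @cid = @CurrentCustomerID;
```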
So, when all is said and done, you're really saving yourself the effort required to create a dedicated table and avoiding some dynamic SQL on your bread-and-butter queries. Writing dynamic SQL for bread-and-butter queries doesn't take much effort, and it's certainly less messy than having to manage multiple customer-specific indexes on the same shared table; and if you're writing dynamic SQL you could just as easily substitute the dedicated table name as append the customerid=25 clause to every query. The performance loss of dynamic SQL would be more than offset by the performance gain of dedicated tables.
P.S. Let's say your app has been running for a year or so and you have multiple customers and your table has grown large. You want to add another customer and their new set of customer-specific indexes to the large production table. Can you slipstream these new indexes and constraints during normal business hours or will you have to schedule the creation of these indexes for a time when usage is relatively light?

You don't make clear what benefit there is in having the data from separate universes mingled in the same table.
Uniqueness constraints are part of the entity definition and each entity needs its own table. I'd create separate tables.

Related

How to write a stored procedure to insert values into two tables with a foreign key relationship?

I created two tables, Employeeinfo and Employeerequests.
Table Employeeinfo can have one unique user with columns:
id (primary key, auto increment)
name
dep
address
and table Employeerequests can have multiple requests against one unique user ID with columns
id (primary key, auto increment)
CustomerID(foreign key to Employeeinfo(ID column))
category
requests.
Now I want to design a stored procedure in such a way so that I can insert values into both tables at the same time. Please help. I am very new to SQL. Thanks in advance.
This is a bit long for a comment.
SQL Server only allows you to insert into one table in a single query. You presumably want to provide both employee and request information. So that limitation on insert is a real problem.
You can get around the limitation by creating a view combining the two tables and then defining an INSTEAD OF INSERT trigger on the view. This is explained in the documentation.
That said, you seem to not have extensive SQL knowledge. So, I would recommend simply using two separate statements, one for each table. You can wrap them in a stored procedure, if you find that convenient.
In the stored procedure, you can use the OUTPUT clause of the INSERT statement:
DECLARE @MyTableVar TABLE (NewCustomerID INT);
-- The OUTPUT clause has access to all the columns in the table,
-- even those not part of the INSERT statement (here, the generated Id).
INSERT INTO [dbo].[Employeeinfo] ([Name], [dep], [address])
OUTPUT INSERTED.Id INTO @MyTableVar
SELECT 'Test', 'TestDep', 'TestAddress'
-- then insert into the child table
INSERT INTO [dbo].[Employeerequests] (CustomerID, category)
SELECT NewCustomerID, 'TestCat'
FROM @MyTableVar
Sample code here...
Hope that helps!
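The "two separate statements wrapped in a stored procedure" suggestion could be sketched as below. The procedure name, parameter names, and column types are assumptions, since the question only lists the column names; SCOPE_IDENTITY() is used here in place of the OUTPUT clause because only one parent row is inserted.

```sql
CREATE PROCEDURE dbo.AddEmployeeWithRequest
    @Name     varchar(50),
    @Dep      varchar(50),
    @Address  varchar(100),
    @Category varchar(50),
    @Requests varchar(200)
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;  -- a run-time error in either insert rolls back both

    DECLARE @NewEmployeeID int;

    BEGIN TRANSACTION;

    -- Parent row first, so that its generated id exists.
    INSERT INTO dbo.Employeeinfo ([name], dep, [address])
    VALUES (@Name, @Dep, @Address);

    -- Identity value generated by the insert above, in this scope.
    SET @NewEmployeeID = SCOPE_IDENTITY();

    -- Child row pointing at the new parent.
    INSERT INTO dbo.Employeerequests (CustomerID, category, requests)
    VALUES (@NewEmployeeID, @Category, @Requests);

    COMMIT TRANSACTION;
END
```

It would be called with something like EXEC dbo.AddEmployeeWithRequest @Name = 'Test', @Dep = 'TestDep', @Address = 'TestAddress', @Category = 'TestCat', @Requests = 'TestRequest'.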

How can I prevent having the same values for one user

I have this kind of tables:
https://ibb.co/sPn5zT7
Here in the UserPl table, the ProgrammingLanguageId and KnowledgeId are foreign keys, connected with Primary Keys of Knowledge and ProgrammingLanguage table.
Suppose I insert, for example:
insert into userPLs values(1,'a7ac3486-e852-42c0-a458-9075eb5ed7d7','Doe',1,1)
Here Doe says that he knows C# with basic knowledge. I want to make it impossible for a later insert for Doe to succeed, such as this one:
insert into userPLs values(1,'a7ac3486-e852-42c0-a458-9075eb5ed7d7','Doe',1,2)
or
insert into userPLs values(2,'a7ac3486-e852-42c0-a458-9075eb5ed7d7','Doe',1,2)
because he once said that his knowledge of C# is basic.
I AM USING MS SQL SERVER
How can I achieve this?
Try to set a unique index, where required
You can prevent the insert with a constraint.
alter table UserPl
add constraint UserLanguageSkillLevel
unique (UserId, ProgrammingLanguageId);
You'll still have to catch failed inserts or modify the front end to eliminate the opportunity to add contradictory information in the first place.
A uniqueness constraint is ultimately enforced with an index. If you create a unique index directly rather than by using a constraint you could also apply the ignore_dup_key index setting and let the engine silently discard bad inserts. I'm not going to endorse that as an ideal approach but it might be useful as a temporary stopgap.
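As a rough illustration of the ignore_dup_key stopgap, the index could be created like this. The index name is invented; the table and column names follow the constraint shown earlier in this answer.

```sql
-- A unique index that discards duplicate rows with a warning
-- ("Duplicate key was ignored.") instead of failing the whole INSERT.
CREATE UNIQUE INDEX UX_UserPl_User_Language
ON UserPl (UserId, ProgrammingLanguageId)
WITH (IGNORE_DUP_KEY = ON);
```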
Having a primary key or a unique clustered index on the table UserPl would enforce whichever combination you need (the index has to be declared UNIQUE, otherwise it will not reject duplicates), i.e.
If a user cannot know multiple programming languages, then the key is
CREATE UNIQUE CLUSTERED INDEX CLU_UserPL ON UserPl (UserID)
If a user can know multiple programming languages, but cannot have multiple levels of knowledge in one language, then the key is
CREATE UNIQUE CLUSTERED INDEX CLU_UserPL ON UserPl (UserID, ProgrammingLanguageID)
If a user can know multiple programming languages and can also have multiple levels of knowledge in one language, then the key is
CREATE UNIQUE CLUSTERED INDEX CLU_UserPL ON UserPl (RecID) -- might be a new identity column
or
CREATE UNIQUE CLUSTERED INDEX CLU_UserPL ON UserPl (UserPLID)
This can be achieved by using a UNIQUE constraint.
Here is a detailed article about the UNIQUE constraint: W3Schools UNIQUE article.
In simple words, UNIQUE is a constraint that ensures no duplicate values are allowed in the selected field.
If you want another way to prevent Doe from inserting new values into the table, you could use IF EXISTS:
IF EXISTS (SELECT * FROM userPLs WHERE UserId = 'THE USER ID')
BEGIN
PRINT 'Data already exists! Insert will be ignored!'
END
ELSE
BEGIN
PRINT 'Data doesn''t exist! Proceeding to insert the data!'
-- Start inserting the data
END
UPDATED ANSWER
Here is the modified SQL query with IF EXISTS, but with an additional condition.
IF EXISTS (SELECT * FROM userPLs WHERE UserId = 'THE USER ID' AND ProgrammingLanguageId = 'The ID')
BEGIN
PRINT 'Data already exists! Insert will be ignored!'
END
ELSE
BEGIN
PRINT 'Data doesn''t exist! Proceeding to insert the data!'
-- Start inserting the data
END
The query above will solve your issue. If you are wondering how it works, below is a simple explanation:
The query first checks the UserId: has this UserId already been registered in the database?
Next, the query checks whether the ProgrammingLanguageId about to be inserted already exists in the database for that user.
If the UserId is already registered and already has the same ProgrammingLanguageId as the one about to be inserted, the insert is skipped and the message "Data already exists! Insert will be ignored!" is shown.
But if the UserId is already registered and has no ProgrammingLanguageId matching the data about to be inserted, the insert goes ahead.
For better usage, I think you should create a trigger that fires whenever an insert is executed.

Adding Row in existing table (SQL Server 2005)

I want to add another row in my existing table and I'm a bit hesitant if I'm doing the right thing because it might skew the database. I have my script below and would like to hear your thoughts about it.
I want to add another row for 'Jane' in the table, which will have 'SKATING' in the ACT column.
Table: [Emp_table].[ACT].[LIST_EMP]
My script is:
INSERT INTO [Emp_table].[ACT].[LIST_EMP]
([ENTITY],[TYPE],[EMP_COD],[DATE],[LINE_NO],[ACT],[NAME])
VALUES
('REG','EMP','45233','2016-06-20 00:00:00:00','2','SKATING','JANE')
Will this do the trick?
Your statement looks ok. If the database has a problem with it (for example, due to a foreign key constraint violation), it will reject the statement.
If any of the fields in your table are numeric (and not varchar or char), just remove the quotes around the corresponding field. For example, if emp_cod and line_no are int, insert the following values instead:
('REG','EMP',45233,'2016-06-20 00:00:00:00',2,'SKATING','JANE')
Inserting records into a database has always been the most common reason why I've lost a lot of hair!
SQL is great when it comes to SELECTs or even UPDATEs, but when it comes to INSERTs it's like someone from another planet came into the SQL standards committee and managed to get their way of doing it implemented into the final SQL standard!
If your table does not have a primary key that is automatically generated on every insert, then you have to write the code to avoid duplicates yourself.
Start by writing a normal SELECT to see whether the record(s) you're going to add already exist. But as Robert implied, your table may not have a primary key because it looks like a LOG table to me. So insert away!
If it does require a unique record every time, then I strongly suggest you create a primary key for the table, either an auto-generated one or a combination of your existing columns.
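A composite primary key over existing columns might be declared as in the sketch below. It assumes that the first five columns really do identify a row, that all five are defined NOT NULL, that the table has no primary key yet, and that no duplicates are already present; the constraint name is made up.

```sql
-- Let the database itself reject a second row with the same key values.
ALTER TABLE [Emp_table].[ACT].[LIST_EMP]
ADD CONSTRAINT PK_LIST_EMP
PRIMARY KEY ([ENTITY], [TYPE], [EMP_COD], [DATE], [LINE_NO]);
```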
Assuming the first five combined columns make a unique key, this select will determine if your data you're inserting does not already exist...
SELECT COUNT(*) AS FoundRec FROM [Emp_table].[ACT].[LIST_EMP]
WHERE [ENTITY] = wsEntity AND [TYPE] = wsType AND [EMP_COD] = wsEmpCod AND [DATE] = wsDate AND [LINE_NO] = wsLineno
You will have to replace the wsXXX names with direct values, or DECLARE them as variables earlier in your script.
If you run this alone and receive a value of 1 or more, then the data already exists in your table, at least for those first 5 columns. A true duplicate test would require you to test EVERY column in your table, but it should give you an idea.
In the INSERT, to do it all as one statement, you can use INSERT ... SELECT with a NOT EXISTS test (a VALUES clause cannot take a WHERE):
INSERT INTO [Emp_table].[ACT].[LIST_EMP]
([ENTITY],[TYPE],[EMP_COD],[DATE],[LINE_NO],[ACT],[NAME])
SELECT 'REG','EMP','45233','2016-06-20 00:00:00:00','2','SKATING','JANE'
WHERE NOT EXISTS (SELECT 1 FROM [Emp_table].[ACT].[LIST_EMP]
WHERE [ENTITY] = wsEntity AND [TYPE] = wsType AND
[EMP_COD] = wsEmpCod AND [DATE] = wsDate AND
[LINE_NO] = wsLineno)
Just replace the wsXXX variables with the values you want to insert.
I hope that made sense.

What's the best way to self-document "codes" in a SQL based application?

Q: Is there any way to implement self-documenting enumerations in "standard SQL"?
EXAMPLE:
Column: PlayMode
Legal values: 0=Quiet, 1=League Practice, 2=League Play, 3=Open Play, 4=Cross Play
What I've always done is just define the field as "char(1)" or "int", and define the mnemonic ("league practice") as a comment in the code.
Any BETTER suggestions?
I'd definitely prefer using standard SQL, so database type (mySql, MSSQL, Oracle, etc) shouldn't matter. I'd also prefer using any application language (C, C#, Java, etc), so programming language shouldn't matter, either.
Thank you VERY much in advance!
PS:
It's my understanding that using a second table - to map a code to a description, for example "table playmodes (char(1) id, varchar(10) name)" - is very expensive. Is this necessarily correct?
The normal way is to use a static lookup table, sometimes called a "domain table" (because its purpose is to restrict the domain of a column variable.)
It's up to you to keep the underlying values of any enums or the like in sync with the values in the database (you might write a code generator that generates the enum from the domain table and gets invoked whenever something in the domain table changes.)
Here's an example:
--
-- the domain table
--
create table dbo.play_mode
(
  id int not null primary key clustered ,
  description varchar(32) not null unique nonclustered
)
insert dbo.play_mode values ( 0 , 'Quiet' )
insert dbo.play_mode values ( 1 , 'LeaguePractice' )
insert dbo.play_mode values ( 2 , 'LeaguePlay' )
insert dbo.play_mode values ( 3 , 'OpenPlay' )
insert dbo.play_mode values ( 4 , 'CrossPlay' )
--
-- A table referencing the domain table. The column playmode_id is constrained to
-- one of the values contained in the domain table play_mode.
--
create table dbo.game
(
  id int not null primary key clustered ,
  team1_id int not null foreign key references dbo.team( id ) ,
  team2_id int not null foreign key references dbo.team( id ) ,
  playmode_id int not null foreign key references dbo.play_mode( id )
)
go
Some people for reasons of "economy" might suggest using a single catch-all table for all such code, but in my experience, that ultimately leads to confusion. Best practice is a single small table for each set of discrete values.
Add a foreign key to a "codes" table.
The codes table would have the code value as its PK, plus a string description column where you enter the description of the value.
table: PlayModes
Columns: PlayMode number --primary key
Description string
I can't see this as being very expensive, databases are based on joining tables like this.
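To make the cost of the lookup concrete, reading the description back is a single join on the key. The sketch below borrows the dbo.game and dbo.play_mode tables from the earlier answer in this thread rather than the PlayModes outline just above.

```sql
-- Show each game together with the readable name of its play mode.
select g.id ,
       g.playmode_id ,
       pm.description as play_mode
from dbo.game as g
join dbo.play_mode as pm on pm.id = g.playmode_id
```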
That information should be in the database somewhere and not in comments.
So, you should have a table containing those codes and probably a FK on your table to it.
I agree with @Nicholas Carey (+1): Static data table with two columns, say “Key” or “ID” and “Description”, with foreign key constraints on all tables using the codes. Often the ID columns are simple surrogate keys (1, 2, 3, etc., with no significance attached to the value), but when reasonable I go a step further and use “special” codes. Following are a few examples.
If the values are a sequence (say, Ordered, Paid, Processed, Shipped), I might use 1, 2, 3, 4, to indicate sequence. This can make things easier if you want to find all “up through” a given stage, such as all orders that have not yet been shipped (ID < 4). If you are into planning ahead, make them 10, 20, 30, 40; this will allow you to add values “in between” existing values, if/when new codes or statuses come along. (Yes, you cannot and should not try to anticipate everything and anything that might have to be done some day, but a bit of pre-planning like this can make some changes that much simpler.)
Keys/IDs are often integers (1 byte, 2 byte, 4 byte, whatever). There’s little cost to make them character values (1 char, 2 char, 3 char, 4 char). That’s character, not variable character. Done this way, you can have mnemonics on your codes, such as
O, P, R, S
Or, Pd, Pr, Sh
Ordr, Paid, Proc, Ship
…or whatever floats your boat. Done this way, I have found that it can save a lot of time when analyzing or debugging. You still want the lookup table, for relational integrity as well as a reminder for the more obscure codes.
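Put together, the mnemonic codes and the gapped sequence numbers could look like the sketch below. The table and column names (order_status, customer_order, sort_order) are invented for the example.

```sql
-- Lookup table keyed by a fixed-width mnemonic code, with a gapped
-- sequence number so that new stages can be slotted in later.
create table dbo.order_status
(
  code        char(4)     not null primary key ,
  description varchar(32) not null unique ,
  sort_order  int         not null unique
)
insert dbo.order_status values ( 'Ordr' , 'Ordered'   , 10 )
insert dbo.order_status values ( 'Paid' , 'Paid'      , 20 )
insert dbo.order_status values ( 'Proc' , 'Processed' , 30 )
insert dbo.order_status values ( 'Ship' , 'Shipped'   , 40 )

-- The referencing column stays readable without a join.
create table dbo.customer_order
(
  id          int     not null primary key ,
  status_code char(4) not null foreign key references dbo.order_status( code )
)

-- All orders that have not yet been shipped.
select o.id , o.status_code
from dbo.customer_order as o
join dbo.order_status as s on s.code = o.status_code
where s.sort_order < 40
```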

Merging two tables with a common unique field in MySql

The problem is:
We have taken over a website which has an active member community. We've been given the application and database dump and have the site running on a new server successfully and the DNS has been switched.
The problem is that database has come out of sync in the time it took to get the files to us and the DNS switched over. Now that the DNS has switched and there is no chance of the database going out of sync, we've been handed members2 which is the table from the original server with the extra data.
Both tables look like this
`idmembers` int(10) unsigned NOT NULL auto_increment,
`firstName` varchar(20) default NOT NULL,
`lastName` varchar(20) default NOT NULL,
`email` varchar(255) default NOT NULL,
`date` varchar(10) default '0',
`source` varchar(50) default 'signup'
PRIMARY KEY (`idmembers`),
UNIQUE KEY `email` (`email`)
So the first table is called members1 and is the live database, which is missing a load of members from members2. I need to merge them both together keeping members1 as it is and allowing unique emails from members2 to be inserted into members1.
I am presuming that there is some SQL to do this but I have no idea what it could be.
My second and less preferable approach would be to use a tool like PhpMyAdmin to export all the records from members2 after a certain date and reimport them into members1 but the problem is they all export from members2 with an idmembers that conflict with members1 (as an autoincrement is used in both)
If I understand your question correctly, there are two separate issues here:
Adding completely new member records from members2 into members1
Updating the email field in members1, if that got changed in members2
As for the first case, you should be able do something like:
INSERT INTO members1 (`idmembers`, `firstName`, `lastName`, `email`, `date`, `source`)
SELECT `idmembers`, `firstName`, `lastName`, `email`, `date`, `source`
FROM members2
WHERE idmembers NOT IN (SELECT idmembers FROM members1)
As for the second case, something like:
UPDATE members1 m1 INNER JOIN members2 m2
ON m1.idmembers = m2.idmembers
SET m1.email = m2.email
WHERE m2.email != m1.email
(Note1: Both statements constructed 'ad hoc' and untested!)
(Note2: Both statements assume that the primary key 'idmembers' did not change during migration of members1! If that happened, these queries will not work.)
(Note3: If you encounter the 'different idmembers keys' problem from Note2, you can still use the queries, but change the comparison and join operations to use the email field. But then you'd have to execute the second query first to prevent duplicates.)
The most important suggestion is to do this on a copy of your database, not the live database, until you are certain the process results in the correct merge!
First you should check to see if there are any rows in members2 with duplicate email addresses that already exist in members1:
SELECT members2.*
FROM members1 JOIN members2 USING (email);
If there are any (hopefully it'll be few), fix them up manually, or delete each row that is really a duplicate account of a person who already has an account in members1 (keep backup data of course).
If there's any other cases of redundant member accounts that should be considered duplicates and not inserted as new members, you may have to handle that manually. This is an example of a broader problem of database cleanup or de-duping that usually can't be automated fully.
You can copy rows from members2 into members1 while generating new id values like this:
INSERT INTO members1 (`firstName`, `lastName`, `email`, `date`, `source`)
SELECT `firstName`, `lastName`, `email`, `date`, `source`
FROM members2;
Yes, you have to name all the columns. By omitting idmembers from that query, that column will use its default behavior which is to generate a new id value.
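If members2 still contains members who are already in members1, the plain copy above would stop on the unique email key. One possible variation, sketched here and best tried on a copy first, is to copy only the rows whose email is not yet present:

```sql
-- Copy only members whose email does not already exist in members1.
-- idmembers is omitted so that members1 generates fresh ids.
INSERT INTO members1 (`firstName`, `lastName`, `email`, `date`, `source`)
SELECT m2.`firstName`, m2.`lastName`, m2.`email`, m2.`date`, m2.`source`
FROM members2 AS m2
LEFT JOIN members1 AS m1 ON m1.`email` = m2.`email`
WHERE m1.`idmembers` IS NULL;
```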
You didn't say you need to update other tables that reference these new members by their id. If so, you should create a new table to map the members2 id to the new number generated as you import them into members1. You'll have to follow @ijclarkson's advice of inserting the members one at a time, so you can note the new id generated.
SELECT * FROM members2;
-- loop over results in a script:
INSERT INTO members1 (`firstName`, `lastName`, `email`, `date`, `source`)
VALUES (?, ?, ?, ?, ?);
INSERT INTO members_id_map (idmembers1, idmembers2)
VALUES (LAST_INSERT_ID(), ?); -- use idmembers from the query on members2
-- end loop
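If looping in a script is inconvenient, one alternative (a different technique from the loop above) is to build the map after a bulk import by joining the two tables on email, which is unique in both. The column types below are assumptions that mirror the idmembers definition.

```sql
-- Hypothetical mapping table: one row per members2 id.
CREATE TABLE members_id_map (
    idmembers1 INT UNSIGNED NOT NULL,
    idmembers2 INT UNSIGNED NOT NULL,
    PRIMARY KEY (idmembers2)
);

-- Pair every members2 row with the members1 row that has the same email.
INSERT INTO members_id_map (idmembers1, idmembers2)
SELECT m1.idmembers, m2.idmembers
FROM members2 AS m2
JOIN members1 AS m1 USING (email);
```

Note that this also maps members who existed in both tables before the import, which may or may not be what the dependent tables need.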
Just write a quick porting script which SELECTs the records in "members2" that are missing from "members1" and then does an INSERT into "members1" for each one.
You might have to do some checking if you require a unique email address, and you think there might be duplicates.