Can I insert into multiple related tables in a single statement?

I have two related tables something like this:
CREATE TABLE test.items
(
    id INT IDENTITY(1,1) PRIMARY KEY,
    type VARCHAR(max),
    price NUMERIC(6,2)
);
CREATE TABLE test.books
(
    id INT PRIMARY KEY REFERENCES test.items(id),
    title VARCHAR(max),
    author VARCHAR(max)
);
Is it possible to insert into both tables using a single SQL statement?
In PostgreSQL, I can use something like this:
-- PostgreSQL:
WITH item AS (INSERT INTO test.items(type,price) VALUES('book',12.5) RETURNING id)
INSERT INTO test.books(id,title) SELECT id,'Good Omens' FROM item;
but SQL Server only allows a SELECT inside a CTE (it has no data-modifying CTEs like PostgreSQL), so that won't work.
In principle, I could use the OUTPUT clause this way:
-- SQL Server:
INSERT INTO test.items(type, price)
OUTPUT inserted.id, 'Good Omens' INTO test.books(id,title)
VALUES ('book', 12.5);
but this doesn't work when the OUTPUT ... INTO target table is involved in a foreign key relationship, as it is here.
I know about using variables and procedures, but I wondered whether there is a simple single-statement approach.

You can use dynamic SQL as follows, although it's awkward to construct a query like this.
CREATE TABLE dbo.items (
    id INT IDENTITY(1,1) PRIMARY KEY,
    type VARCHAR(max),
    price NUMERIC(6,2)
);
CREATE TABLE dbo.books (
    id INT PRIMARY KEY REFERENCES dbo.items(id),
    title VARCHAR(max),
    author VARCHAR(max)
);

-- The inner INSERT returns its OUTPUT rows as a result set,
-- which the outer INSERT ... EXEC consumes.
INSERT INTO dbo.books (id, title)
EXEC ('INSERT INTO dbo.items (type, price) OUTPUT inserted.id, ''Good Omens'' VALUES (''book'', 12.5)');
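For completeness, the multi-statement alternative the question already alludes to (a variable plus SCOPE_IDENTITY()) is sketched below. It is not a single statement, but it avoids dynamic SQL and is not affected by the foreign key:
-- Sketch: conventional two-insert version using a variable,
-- wrapped in a transaction so the pair of inserts is atomic.
BEGIN TRANSACTION;

DECLARE @id INT;

INSERT INTO test.items (type, price)
VALUES ('book', 12.5);

SET @id = SCOPE_IDENTITY();

INSERT INTO test.books (id, title)
VALUES (@id, 'Good Omens');

COMMIT TRANSACTION;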

Related

Inserting data into several tables via stored procedure

I have a performance issue with an import task (users need to import new data into the system) and could use a little help.
The data within the database looks like this:
Table Categories:
Columns:
Id int primary key identity(1,1)
Code nvarchar(20) not null
RootId int not null
ParentId int not null
Table CategoryNames:
Columns:
CategoryId int not null foreign key references Categories (Id)
LanguageId int not null foreign key references Language (Id)
Name nvarchar(100)
Currently it's working like this: for each row
Create connection to database
Call stored procedure with data[i]
Close connection
I have already got rid of creating and closing the connection for each row of data, but it's still not good enough.
Currently the task needs around 36 minutes to insert ~22,000 categories.
The stored procedure looks like this:
create procedure Category_Insert
(
    @code nvarchar(20),
    @root int,
    @parent int,
    @languageId int,
    @name nvarchar(100),
    @id int output
)
as
    set nocount on

    insert into Categories (Code, RootId, ParentId)
    values (@code, @root, @parent)

    select @id = scope_identity()

    insert into CategoryNames (CategoryId, LanguageId, Name)
    values (@id, @languageId, @name)
Got any advice on how I can speed up that task?
I would love to use bulk insert or something like that, but how would I realize the logic of the stored procedure with bulk insert?
Or is there any other way to speed this up?
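One set-based way to combine a bulk load with the procedure's logic, as a rough sketch only: bulk-load the rows into a staging table, insert all categories in one statement, then insert all names by joining back on Code. This assumes Code uniquely identifies a category, which the question does not state, so treat it as an illustration rather than a drop-in replacement.
-- Staging table holding the imported rows (filled via BULK INSERT, bcp,
-- or SqlBulkCopy before the two set-based inserts below run).
CREATE TABLE #CategoryStaging
(
    Code       nvarchar(20)  NOT NULL,
    RootId     int           NOT NULL,
    ParentId   int           NOT NULL,
    LanguageId int           NOT NULL,
    Name       nvarchar(100) NULL
);

BEGIN TRANSACTION;

-- One insert for all ~22,000 categories instead of one procedure call per row.
INSERT INTO Categories (Code, RootId, ParentId)
SELECT Code, RootId, ParentId
FROM #CategoryStaging;

-- Pick up the generated identity values by joining back on Code
-- (assumes Code is unique per category).
INSERT INTO CategoryNames (CategoryId, LanguageId, Name)
SELECT c.Id, s.LanguageId, s.Name
FROM #CategoryStaging AS s
JOIN Categories AS c ON c.Code = s.Code;

COMMIT TRANSACTION;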

Add value from foreign key table depending on name

I want to have the following two tables:
CREATE TABLE buildings
(
    ID int IDENTITY NOT NULL PRIMARY KEY,
    city_ID int NOT NULL REFERENCES cities (ID),
    name char(20) NOT NULL
)
CREATE TABLE cities
(
    ID int IDENTITY NOT NULL PRIMARY KEY,
    name char(30) NOT NULL
)
INSERT INTO cities (name) VALUES ('Katowice')
Now I need that writing:
INSERT INTO buildings (city_ID,name) values (1,'bahnhof')
has the same effect as writing:
INSERT INTO buildings VALUES ('Katowice','bahnhof')
My purpose is that when I want to add a building to a city, I think of the city name, not its ID in the cities table. But sometimes I remember the ID, and then I prefer to use the ID. Is this possible without creating a procedure?
I am thinking about an appropriate procedure:
CREATE PROCEDURE addbuilding
    @city_ID int,
    @name char
AS
BEGIN
    INSERT INTO buildings (city_ID, name) VALUES (@city_ID, @name)
END
But as we can see above, @city_ID can only be an int. Something like a union in C++ could be a good solution, but is that possible in SQL?
I'm not sure if SQL procedures support union similarly to C++ as you ask, but my suggestion would be a rather simple one: two procedures.
CREATE PROCEDURE add_building_by_city_id
    @city_ID int,
    @name char
etc
CREATE PROCEDURE add_building_by_city_name
    @city_name char,
    @name char
etc
And then you could use whichever one you need. Of course, the second procedure would need a simple SELECT first, to find the city by its name and retrieve its ID, as in the sketch below.
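A rough sketch of that second procedure (it assumes city names are unique, which the schema shown does not enforce):
CREATE PROCEDURE add_building_by_city_name
    @city_name char(30),
    @name      char(20)
AS
BEGIN
    DECLARE @city_ID int;

    -- Look up the city's ID by name first (assumes the name is unique).
    SELECT @city_ID = ID
    FROM cities
    WHERE name = @city_name;

    INSERT INTO buildings (city_ID, name)
    VALUES (@city_ID, @name);
END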

Very slow DELETE query

I have problems with SQL performance. For some reason the following queries are very slow:
I have two lists which contain Ids of a certain table. I need to delete all records from the first list whose Ids already exist in the second list:
DECLARE @IdList1 TABLE(Id INT)
DECLARE @IdList2 TABLE(Id INT)

-- Approach 1
DELETE list1
FROM @IdList1 list1
INNER JOIN @IdList2 list2 ON list1.Id = list2.Id

-- Approach 2
DELETE FROM @IdList1
WHERE Id IN (SELECT Id FROM @IdList2)
It is possible the two lists contain more than 10,000 records. In that case both queries each take more than 20 seconds to execute.
The execution plan also showed something I don't understand. Maybe that explains why it is so slow:
I filled both lists with 10,000 sequential integers, so both lists contained the values 1-10,000 as a starting point.
As you can see, both queries show an Actual Number of Rows of 50,005,000 for @IdList2!! @IdList1 is correct (Actual Number of Rows is 10,000).
I know there are other ways to solve this, like filling a third list instead of removing from the first list. But my question is:
Why are these delete queries so slow and why do I see these strange query plans?
Add a Primary key to your table variables and watch them scream
DECLARE @IdList1 TABLE(Id INT PRIMARY KEY NOT NULL)
DECLARE @IdList2 TABLE(Id INT PRIMARY KEY NOT NULL)
Because there's no index on these table variables, any joins or subqueries must examine on the order of 10,000 × 10,000 = 100,000,000 pairs of values.
SQL Server compiles the plan when the table variable is empty and does not recompile it when rows are added. Try
DELETE FROM @IdList1
WHERE Id IN (SELECT Id FROM @IdList2)
OPTION (RECOMPILE)
This will take account of the actual number of rows contained in the table variable and get rid of the nested loops plan
Of course creating an index on Id via a constraint may well be beneficial for other queries using the table variable too.
The tables in table variables can have primary keys, so if your data supports uniqueness for these Ids, you may be able to improve performance by going for
DECLARE @IdList1 TABLE(Id INT PRIMARY KEY)
DECLARE @IdList2 TABLE(Id INT PRIMARY KEY)
Possible solutions:
1) Try to create indices thus
1.1) If List{1|2}.Id column has unique values then you could define a unique clustered index using a PK constraint like this:
DECLARE @IdList1 TABLE(Id INT PRIMARY KEY);
DECLARE @IdList2 TABLE(Id INT PRIMARY KEY);
1.2) If List{1|2}.Id column may have duplicate values then you could define a unique clustered index using a PK constraint using a dummy IDENTITY column like this:
DECLARE @IdList1 TABLE(Id INT, DummyID INT IDENTITY, PRIMARY KEY (Id, DummyID) );
DECLARE @IdList2 TABLE(Id INT, DummyID INT IDENTITY, PRIMARY KEY (Id, DummyID) );
2) Try adding a HASH JOIN query hint like this:
DELETE list1
FROM @IdList1 list1
INNER JOIN @IdList2 list2 ON list1.Id = list2.Id
OPTION (HASH JOIN);
You are using table variables; either add a primary key to them or change them to temporary tables and add an index. This will give much better performance. As a rule of thumb: if the table is only small, use a table variable; if it grows and holds a lot of data, use a temp table (see the sketch below).
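For this particular case, the temp-table variant might look roughly like this (a sketch only, not tested against the original workload):
-- Sketch: indexed temp tables instead of table variables.
CREATE TABLE #IdList1 (Id INT PRIMARY KEY);
CREATE TABLE #IdList2 (Id INT PRIMARY KEY);

-- ... fill both tables here ...

DELETE list1
FROM #IdList1 AS list1
INNER JOIN #IdList2 AS list2 ON list1.Id = list2.Id;

DROP TABLE #IdList1;
DROP TABLE #IdList2;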
I'd be tempted to try
DECLARE @IdList3 TABLE(Id INT);

INSERT @IdList3
SELECT Id FROM @IdList1
EXCEPT
SELECT Id FROM @IdList2
No deleting required.
Try this alternate syntax:
DELETE deleteAlias
FROM @IdList1 deleteAlias
WHERE EXISTS (
    SELECT NULL
    FROM @IdList2 innerList2Alias
    WHERE innerList2Alias.Id = deleteAlias.Id
)
EDIT:
Try using #temp tables with indexes instead.
Here is a generic example where "DepartmentKey" is the PK and the FK.
IF OBJECT_ID('tempdb..#Department') IS NOT NULL
begin
drop table #Department
end
CREATE TABLE #Department
(
    DepartmentKey int,
    DepartmentName varchar(12)
)
CREATE INDEX IX_TEMPTABLE_Department_DepartmentKey ON #Department (DepartmentKey)
IF OBJECT_ID('tempdb..#Employee') IS NOT NULL
begin
drop table #Employee
end
CREATE TABLE #Employee
(
    EmployeeKey int,
    DepartmentKey int,
    SSN varchar(11)
)
CREATE INDEX IX_TEMPTABLE_Employee_DepartmentKey ON #Employee (DepartmentKey)
Delete deleteAlias
from #Department deleteAlias
where exists ( select null from #Employee innerE where innerE.DepartmentKey = deleteAlias.DepartmentKey )
IF OBJECT_ID('tempdb..#Employee') IS NOT NULL
begin
drop table #Employee
end
IF OBJECT_ID('tempdb..#Department') IS NOT NULL
begin
drop table #Department
end

SQL Server 2008 Foreign Keys that are auto indexed

Are foreign keys in SQL Server 2008 automatically indexed with a value? For example, if I add a value to the primary key (or auto-incremented column) in my parent table, will the table that has a foreign key referencing that key automatically get the same value, or do I have to do it explicitly?
No, if you create a foreign key in a child table, it will not automatically get populated when a parent row gets inserted. If you think about it, this makes sense. Let's say you have tables like:
CREATE TABLE dbo.Students
(
    StudentID INT IDENTITY(1,1) PRIMARY KEY,
    Name SYSNAME
);
CREATE TABLE dbo.StudentLoans
(
    LoanID INT IDENTITY(1,1) PRIMARY KEY,
    StudentID INT FOREIGN KEY REFERENCES dbo.Students(StudentID),
    Amount BIGINT -- just being funny
);
What you are suggesting is that when you add a row to Students, the system should automatically add a row to StudentLoans - but what if that student doesn't have a loan? If the student does have a loan, what should the amount be? Should the system pick a random number?
Typically what will happen in this scenario is that you'll be adding a student and their loan at the same time. So if you know the loan amount and the student's name, you can say:
DECLARE
    @Name SYSNAME = N'user962206',
    @LoanAmount BIGINT = 50000,
    @StudentID INT;

INSERT dbo.Students(Name)
SELECT @Name;

SELECT @StudentID = SCOPE_IDENTITY();

INSERT dbo.StudentLoans(StudentID, Amount)
SELECT @StudentID, @LoanAmount;

Foreign Key is null when insert using Stored Procedure

I've created an insert stored procedure with two tables like in the example:
Table NameAge
CREATE TABLE [dbo].[Assignment3_NameAge]
(
    userID int PRIMARY KEY IDENTITY(1,1),
    Name varchar(255) NOT NULL,
    Age int NOT NULL
)
Table Hobbies
CREATE TABLE [dbo].[Assignment3_Hobbies]
(
    hobbiesID int IDENTITY(1,1) PRIMARY KEY,
    userID int FOREIGN KEY REFERENCES Assignment3_NameAge(userID),
    hobbies varchar(255) NOT NULL
)
Insert Stored Procedure
CREATE PROCEDURE [dbo].p_Assignment3Join_ins
    @Name nvarchar(100),
    @Age int,
    @Hobbies nvarchar(100)
AS
    INSERT INTO [TABLE].[dbo].[Assignment3_NameAge]
        ([Name], [Age])
    VALUES (@Name, @Age)

    INSERT INTO [TABLE].[dbo].[Assignment3_Hobbies]
        ([Hobbies])
    VALUES (@Hobbies)
The problem is that when I run the stored procedure, the Hobbies table has a NULL value for userID (the foreign key).
What am I doing wrong?
You should provide the key of the Assignment3_NameAge row you want to reference when inserting into Assignment3_Hobbies.
If you want the last inserted key, you can use SCOPE_IDENTITY() in SQL Server (if that's what you're using) or its equivalent. It will give you the last identity value inserted into Assignment3_NameAge.
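A sketch of the procedure with the identity value captured and passed to the second insert ([TABLE] is kept from the question and stands for whatever database name is being used there):
ALTER PROCEDURE [dbo].p_Assignment3Join_ins
    @Name nvarchar(100),
    @Age int,
    @Hobbies nvarchar(100)
AS
BEGIN
    DECLARE @userID int;

    INSERT INTO [TABLE].[dbo].[Assignment3_NameAge] ([Name], [Age])
    VALUES (@Name, @Age);

    -- Capture the identity value generated by the insert above.
    SET @userID = SCOPE_IDENTITY();

    INSERT INTO [TABLE].[dbo].[Assignment3_Hobbies] ([userID], [hobbies])
    VALUES (@userID, @Hobbies);
END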
I am guessing this is SQL Server based on the IDENTITY column. Correct?
The first insert creates a user, but there is no user ID being set on the insert of the hobby. You need to capture the identity value from the first insert to be used in the second insert. Have you gone over the available system functions?
You're not supplying a value for it; SQL Server won't automagically fill the value in for you even though you've created a foreign key relationship. It's your job to populate the tables.