What is the correct solution for that query - sql

insert into Orders values ('1111',
(Select CustomerID from Customers where CustomerID = (Select CustomerID from customers where CompanyName= 'erp')),
(Select EmployeeID from Employees where EmployeeID = (Select EmployeeID from Employees where FirstName = 'Hello')),
(Select ShipperID from Shippers Where ShipperID = (Select ShipperID from Shippers where CompanyName= 'Ntat')),
'2014-12-01','2013-12-01','22','22','aa','aa','dd','gs','ga','ga','qq');
I am unable to run this query; I get the following error:
Error Code: 1242. Subquery returns more than 1 row
Kindly help.

The INSERT command comes in two flavors:
(1) either you have all your values available, as literals or SQL Server variables - in that case, you can use the INSERT .. VALUES() approach:
INSERT INTO dbo.YourTable(Col1, Col2, ...., ColN)
VALUES(Value1, Value2, @Variable3, @Variable4, ...., ValueN)
Note: I would recommend always explicitly specifying the list of columns to insert data into - that way, you won't have any nasty surprises if suddenly your table has an extra column, or if your table has an IDENTITY or computed column. Yes - it's a tiny bit more work - once - but then you have your INSERT statement as solid as it can be and you won't have to constantly fiddle around with it if your table changes.
(2) if you don't have all your values as literals and/or variables, but instead you want to rely on another table, multiple tables, or views, to provide the values, then you can use the INSERT ... SELECT ... approach:
INSERT INTO dbo.YourTable(Col1, Col2, ...., ColN)
SELECT
SourceColumn1, SourceColumn2, @Variable3, @Variable4, ...., SourceColumnN
FROM
dbo.YourProvidingTableOrView
Here, you must define exactly as many items in the SELECT as your INSERT expects - and those can be columns from the table(s) (or view(s)), or those can be literals or variables. Again: explicitly provide the list of columns to insert into - see above.
You can use one or the other - but you cannot mix the two - you cannot use VALUES(...) and then have a SELECT query in the middle of your list of values - pick one of the two - stick with it.
For more details and further in-depth coverage, see the official MSDN SQL Server Books Online documentation on INSERT - a great resource for all questions related to SQL Server!

TL;DR
There is a design integrity issue with your application, from which you will not be able to recover at the SQL query level.
In Detail
Using non-key values to lookup foreign keys during an insert is not a great idea, as you've now found - the error message indicates that one or more of the subqueries has matched multiple rows, and now you are faced with an idempotence issue.
e.g. Let's say that in this instance you have more than one Employee with the name 'Hello'. Your options appear to be:
Either attribute the order to the FIRST employee with the name 'Hello' - obviously this is potentially unfair to the real employee who made the sale
Or insert multiple orders, one for each employee - but now we risk double shipping and billing issues.
So the real solution is to ensure that you carry all of the key fields (either a Primary or Unique Key, whether natural or surrogate) for each of the FK role columns through your application at all times.
This then means that you can insert the data with confidence:
insert into Orders values ('1111',
@CustomerId,
@EmployeeId,
@ShipperId,
'2014-12-01','2013-12-01','22','22','aa','aa','dd','gs','ga','ga','qq');
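To see which of the three lookups is the ambiguous one, each can be counted on its own. This is only a diagnostic sketch that reuses the table names, column names and literals from the question; any row reporting more than 1 match is a lookup that triggers error 1242.

```sql
-- Count how many rows each lookup from the original INSERT matches.
-- A value above 1 in "matches" identifies the subquery that returns more than one row.
SELECT 'Customers' AS lookup_table, COUNT(*) AS matches
FROM Customers
WHERE CompanyName = 'erp'
UNION ALL
SELECT 'Employees', COUNT(*)
FROM Employees
WHERE FirstName = 'Hello'
UNION ALL
SELECT 'Shippers', COUNT(*)
FROM Shippers
WHERE CompanyName = 'Ntat';
```

A count of 0 is also worth noticing: that lookup would put NULL into the corresponding foreign key column.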

You will have to do this with the help of a procedure, because you are getting more than one value from a select statement.
You will have to pass the values one by one into the insert statement:
create procedure test
as
begin
declare @customerid int
declare @empid int
declare @shipperid int
-- each lookup must return exactly one row; SET raises an error if the subquery returns more
set @customerid = (Select CustomerID from Customers where CompanyName = 'erp')
set @empid = (Select EmployeeID from Employees where FirstName = 'Hello')
set @shipperid = (Select ShipperID from Shippers where CompanyName = 'Ntat')
-- if a lookup can return more than one value, you will have to create a temporary table,
-- put the values in it, and loop over it
-- like this: create table #temp1 (customerid int)
insert into orders values(@customerid,@empid,@shipperid,'val1','val2'...and so on)
end

Related

Best approach to populate new tables in a database

I have a problem I have been working on the past several hours. It is complex (for me) and I don't expect someone to do it for me. I just need the right direction.
Problem: We had the tables (below) added to our database and I need to update them based on data already in our DailyCosts table. The tricky part is that I need to take DailyCosts.Notes and move it to PurchaseOrder.PoNumber. Notes is where we currently have the PONumbers.
I started with the Insert below, testing it out on one WellID. This is Inserting records from our DailyCosts table to the new PurchaseOrder table:
Insert Into PurchaseOrder (PoNumber,WellId,JObID,ID)
Select Distinct Cast(Notes As nvarchar(20)), WellID, JOBID,
DailyCosts.DailyCostID
From DailyCosts
Where WellID = '24A-23'
It affected 1973 rows (The Notes are in Ntext)
However, I need to update the other new tables because we need to see the actual PONumbers in the application.
This next Insert is Inserting records from our DailyCost table and new PurchaseOrder table (from above) to a new table called PurchaseOrderDailyCost
Insert Into PurchaseOrderDailyCost (WellID, JobID, ReportNo, AccountCode, PurchaseOrderID,ID,DailyCostSeqNo, DailyCostID)
Select Distinct DailyCosts.WellID,DailyCosts.JobID,DailyCosts.ReportNo,DailyCosts.AccountCode,
PurchaseOrder.ID,NEWID(),0,DailyCosts.DailyCostID
From DailyCosts join
PurchaseOrder ON DailyCosts.WellID = PurchaseOrder.WellID
Where DailyCosts.WellID = '24A-23'
Unfortunately, this produces 3,892,729 records. The Notes field contains the same list of PONumbers each day. This is by design so that the people inputting the data out in the field can easily track their PO numbers. The new PONumber column that we are moving the Notes to would store just unique POnumbers. I modified the query by replacing NEWID() with DailyCostID and the Join to ON DailyCosts.DailyCostID = PurchaseOrder.ID
This affected 1973 rows the same as the first Insert.
The next Insert looks like this:
Insert Into PurchaseOrderAccount (WellID, JobID, PurchaseOrderID, ID, AccountCode)
Select PurchaseOrder.WellID, PurchaseOrder.JobID, PurchaseOrder.ID, PurchaseOrderDailyCost.DailyCostID,PurchaseOrderDailyCost.AccountCode
From PurchaseOrder Inner Join
PurchaseOrderDailyCost ON PurchaseOrder.ID = PurchaseOrderDailyCost.DailyCostID
Where PurchaseOrder.WellID = '24A-23'
The page in the application now shows the PONumbers in the correct column. Everything looks like I want it to.
Unfortunately, it slows down the application to an unacceptable level. I need to figure out how to either modify my Insert or delete duplicate records. The problem is that there are multiple foreign key constraints. I have some more information below for reference.
This shows the application after the inserts. These are all duplicate records that I am hoping to eliminate.
Here is some additional information I received from the vendor about the tables:
-- add a new purchase order
INSERT INTO PurchaseOrder
(WellID, JobID, ID, PONumber, Amount, Description)
VALUES ('MyWell', 'MyJob', NEWID(), 'PO444444', 500.0, 'A new Purchase Order')
-- link a purchase order with id 'A356FBF4-A19B-4466-9E5C-20C5FD0E95C3' to a DailyCost record with SeqNo 0 and AccountCode 'MyAccount'
INSERT INTO PurchaseOrderDailyCost
(WellID, JobID, ReportNo, AccountCode, DailyCostSeqNo, PurchaseOrderID, ID)
VALUES ('MyWell', 'MyJob', 4, 'MyAccount', 0, 'A356FBF4-A19B-4466-9E5C-20C5FD0E95C3', NEWID())
-- link a purchase order with id 'A356FBF4-A19B-4466-9E5C-20C5FD0E95C3' to an account code 'MyAccount'
-- (i.e. make it choosable from the DailyCost PO-column dropdown for any DailyCost record whose account code is 'MyAccount')
INSERT INTO PurchaseOrderAccount
(WellID, JobID, PurchaseOrderID, ID, AccountCode)
VALUES ('MyWell', 'MyJob', 'A356FBF4-A19B-4466-9E5C-20C5FD0E95C3', NEWID(), 'MyAccount')
-- link a purchase order with id 'A356FBF4-A19B-4466-9E5C-20C5FD0E95C3' to an AFE No. 'MyAFENo'
-- (same behavior as with the account codes above)
INSERT INTO PurchaseOrderAFE
(WellID, JobID, PurchaseOrderID, ID, AFENo)
VALUES ('MyWell', 'MyJob', 'A356FBF4-A19B-4466-9E5C-20C5FD0E95C3', NEWID(), 'MyAFENo')
So it turns out I missed some simple joining principles. The better I get, the more silly mistakes I seem to make. Basically, on my very first insert, I did not include a Group By. Adding this took my INSERT from 1973 rows to 93. Then on my next insert, I joined DailyCosts.Notes to PurchaseOrder.PONumber, since these are the only records from DailyCosts I needed. This was previously INSERT 2 in my question. From there, everything came together. Two steps forward and one step back. Thanks to everyone that responded to this.
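The fix described above might look roughly like the following. This is a sketch only, not the exact statements that were run: it assumes PurchaseOrder.ID is a uniqueidentifier filled with NEWID() (as in the vendor's examples), that the first 20 characters of Notes hold the PO number, and that joining on WellID and JobID in addition to the PO number is appropriate.

```sql
-- Step 1: one PurchaseOrder row per distinct PO number.
-- The GROUP BY collapses the PO numbers that are repeated on every daily record.
INSERT INTO PurchaseOrder (PoNumber, WellID, JobID, ID)
SELECT po.PoNumber, po.WellID, po.JobID, NEWID()
FROM (
    SELECT CAST(Notes AS nvarchar(20)) AS PoNumber, WellID, JobID
    FROM DailyCosts
    WHERE WellID = '24A-23'
    GROUP BY CAST(Notes AS nvarchar(20)), WellID, JobID
) AS po;

-- Step 2: link each DailyCosts row to its PurchaseOrder by PO number
-- instead of by WellID alone, which avoids the multi-million-row join.
INSERT INTO PurchaseOrderDailyCost
    (WellID, JobID, ReportNo, AccountCode, PurchaseOrderID, ID, DailyCostSeqNo, DailyCostID)
SELECT dc.WellID, dc.JobID, dc.ReportNo, dc.AccountCode,
       po.ID, NEWID(), 0, dc.DailyCostID
FROM DailyCosts AS dc
JOIN PurchaseOrder AS po
    ON  po.WellID   = dc.WellID
    AND po.JobID    = dc.JobID
    AND po.PoNumber = CAST(dc.Notes AS nvarchar(20))  -- ntext must be cast before comparing
WHERE dc.WellID = '24A-23';
```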

Split one large, denormalized table into a normalized database

I have a large (5 million row, 300+ column) csv file I need to import into a staging table in SQL Server, then run a script to split each row up and insert data into the relevant tables in a normalized db. The format of the source table looks something like this:
(fName, lName, licenseNumber1, licenseIssuer1, licenseNumber2, licenseIssuer2..., specialtyName1, specialtyState1, specialtyName2, specialtyState2..., identifier1, identifier2...)
There are 50 licenseNumber/licenseIssuer columns, 15 specialtyName/specialtyState columns, and 15 identifier columns. There is always at least one of each of those, but the remaining 49 or 14 could be null. The first identifier is unique, but is not used as the primary key of the Person in our schema.
My database schema looks like this
People(ID int Identity(1,1))
Names(ID int, personID int, lName varchar, fName varchar)
Licenses(ID int, personID int, number varchar, issuer varchar)
Specialties(ID int, personID int, name varchar, state varchar)
Identifiers(ID int, personID int, value)
The database will already be populated with some People before adding the new ones from the csv.
What is the best way to approach this?
I have tried iterating over the staging table one row at a time with select top 1:
WHILE EXISTS (Select top 1 * from staging)
BEGIN
INSERT INTO People Default Values
SET @LastInsertedID = SCOPE_IDENTITY() -- might use the output clause to get this instead
INSERT INTO Names (personID, lName, fName)
SELECT top 1 @LastInsertedID, lName, fName from staging
INSERT INTO Licenses(personID, number, issuer)
SELECT top 1 @LastInsertedID, licenseNumber1, licenseIssuer1 from staging
IF (select top 1 licenseNumber2 from staging) is not null
BEGIN
INSERT INTO Licenses(personID, number, issuer)
SELECT top 1 @LastInsertedID, licenseNumber2, licenseIssuer2 from staging
END
-- Repeat the above 49 times, etc...
DELETE top (1) from staging
END
One problem with this approach is that it is prohibitively slow, so I refactored it to use a cursor. This works and is significantly faster, but has me declaring 300+ variables for Fetch INTO.
Is there a set-based approach that would work here? That would be preferable, as I understand that cursors are frowned upon, but I'm not sure how to get the identity from the INSERT into the People table for use as a foreign key in the others without going row-by-row from the staging table.
Also, how could I avoid copy and pasting the insert into the Licenses table? With a cursor approach I could try:
FETCH INTO ...@LicenseNumber1, @LicenseIssuer1, @LicenseNumber2, @LicenseIssuer2...
INSERT INTO #LicenseTemp (number, issuer) Values
(@LicenseNumber1, @LicenseIssuer1),
(@LicenseNumber2, @LicenseIssuer2),
... Repeat 48 more times...
.
.
.
INSERT INTO Licenses(personID, number, issuer)
SELECT @LastInsertedID, number, issuer
FROM #LicenseTEMP
WHERE number is not null
There still seems to be some redundant copy and pasting there, though.
To summarize the questions, I'm looking for idiomatic approaches to:
Break up one large staging table into a set of normalized tables, retrieving the Primary Key/identity from one table and using it as the foreign key in the others
Insert multiple rows into the normalized tables that come from many repeated columns in the staging table with less boilerplate/copy and paste (Licenses and Specialties above)
Short of discrete answers, I'd also be very happy with pointers towards resources and references that could assist me in figuring this out.
Ok, I'm not an SQL Server expert, but here's the "strategy" I would suggest.
Calculate the personId on the staging table
As @Shnugo suggested before me, calculating the personId in the staging table will ease the next steps
Use a sequence for the personID
From SQL Server 2012 you can define sequences. If you use one for every person insert, you'll never risk overlapping IDs. If you have (as it seems) personIDs that were loaded before the sequence existed, you can create the sequence with the first free personID as its starting value.
Create a numbers table
Create a utility table holding the numbers from 1 to n (you need n to be at least 50; you can look at this question for some implementations).
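One possible way to build such a table is sketched below. The table name numbers and column n match the queries used later in this answer; using sys.all_objects as a row source is just a convenient assumption (any table with at least 50 rows would do).

```sql
-- Utility table holding the integers 1..50.
CREATE TABLE numbers (n int NOT NULL PRIMARY KEY);

-- ROW_NUMBER() over an arbitrary order generates 1, 2, 3, ...;
-- TOP (50) keeps only as many rows as there are license column pairs.
INSERT INTO numbers (n)
SELECT TOP (50) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects;
```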
Use set logic to do the insert
I'd avoid cursors and row-by-row logic: you are right that it is better to limit the number of accesses to the table, but I'd say that you should strive to limit it to one access per target table.
You could proceed like this:
People:
INSERT INTO People (personID)
SELECT personId from staging;
Names:
INSERT INTO Names (personID, lName, fName)
SELECT personId, lName, fName from staging;
Licenses:
here we'll need the Number table
INSERT INTO Licenses (personId, number, issuer)
SELECT * FROM (
SELECT personId,
case nbrs.n
when 1 then licenseNumber1
when 2 then licenseNumber2
...
when 50 then licenseNumber50
end as licenseNumber,
case nbrs.n
when 1 then licenseIssuer1
when 2 then licenseIssuer2
...
when 50 then licenseIssuer50
end as licenseIssuer
from staging
cross join
(select n from numbers where n>=1 and n<=50) nbrs
) AS lic WHERE licenseNumber is not null;
Specialties:
INSERT INTO Specialties(personId, name, state)
SELECT * FROM (
SELECT personId,
case nbrs.n
when 1 then specialtyName1
when 2 then specialtyName2
...
when 15 then specialtyName15
end as specialtyName,
case nbrs.n
when 1 then specialtyState1
when 2 then specialtyState2
...
when 15 then specialtyState15
end as specialtyState
from staging
cross join
(select n from numbers where n>=1 and n<=15) nbrs
) AS spec WHERE specialtyName is not null;
Identifiers:
INSERT INTO Identifiers(personId, value)
SELECT * FROM (
SELECT personId,
case nbrs.n
when 1 then identifier1
when 2 then identifier2
...
when 15 then identifier15
end as value
from staging
cross join
(select n from numbers where n>=1 and n<=15) nbrs
) AS ident WHERE value is not null;
Hope it helps.
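As a possible alternative to the CASE-plus-numbers-table pattern above, SQL Server 2008 and later can unpivot the repeated columns with CROSS APPLY over a VALUES list. This is a different technique from the one used in this answer, shown only as a sketch; it assumes the staging table already carries the personId column discussed above.

```sql
-- Turn each (licenseNumberN, licenseIssuerN) column pair into its own row.
INSERT INTO Licenses (personID, number, issuer)
SELECT s.personId, v.number, v.issuer
FROM staging AS s
CROSS APPLY (VALUES
    (s.licenseNumber1, s.licenseIssuer1),
    (s.licenseNumber2, s.licenseIssuer2)
    -- ...continue with the remaining pairs up to licenseNumber50 / licenseIssuer50
) AS v (number, issuer)
WHERE v.number IS NOT NULL;  -- skip the unused column pairs
```

The column pairs still have to be listed once, but each pair appears on a single line, and the same shape works for Specialties and Identifiers.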
You say: but the staging table could be modified
I would
add a PersonID INT NOT NULL column and fill it with DENSE_RANK() OVER(ORDER BY fname,lname)
add an index to this PersonID
use this ID in combination with GROUP BY to fill your People table
do the same with your names table
And then use this ID for a set-based insert into your three side tables
Do it like this
SELECT AllTogether.PersonID, AllTogether.TheValue
FROM
(
SELECT PersonID,SomeValue1 AS TheValue FROM StagingTable
UNION ALL SELECT PersonID,SomeValue2 FROM StagingTable
UNION ALL ...
) AS AllTogether
WHERE AllTogether.TheValue IS NOT NULL
UPDATE
You say: might cause a conflict with IDs that already exist in the People table
You did not tell us anything about existing People...
Is there any sure and unique mark to identify them? Use a simple
UPDATE StagingTable SET PersonID=xyz WHERE ...
to set existing PersonIDs into your staging table and then use something like
UPDATE StagingTable
SET PersonID=DENSE_RANK() OVER(...) + MaxExistingID
WHERE PersonID IS NULL
to set new IDs for PersonIDs still being NULL.
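Because SQL Server does not accept a window function directly in the SET clause of an UPDATE, the "something like" statement above would in practice go through an updatable CTE. A minimal sketch, assuming the existing IDs live in People.ID and that fName and lName identify a person as suggested earlier:

```sql
-- Highest ID already present in People (0 if the table is empty).
DECLARE @MaxExistingID int = (SELECT ISNULL(MAX(ID), 0) FROM People);

-- Rank only the staging rows that have no PersonID yet,
-- then write rank + offset back through the CTE.
WITH Ranked AS (
    SELECT PersonID,
           DENSE_RANK() OVER (ORDER BY fName, lName) AS rnk
    FROM StagingTable
    WHERE PersonID IS NULL
)
UPDATE Ranked
SET PersonID = rnk + @MaxExistingID;
```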

Query trying to select but get ambiguous error?

These are my tables:
create table product
(
pdt# varchar(10) not null,
pdt_name varchar(30) not null,
pdt_label varchar(30) not null,
constraint product_pk primary key (pdt#));
create table orders
(
pdt# varchar(10) not null,
qty number(11,0) not null,
city varchar(30) not null
);
And these are the values
insert into product values ('111','chair','chr');
insert into product values ('222','stool','stl');
insert into product values ('333','table','tbl');
insert into orders values ('111',22,'Ottawa');
insert into orders values ('222',22,'Ottawa');
insert into orders values ('333',22,'Toronto');
Question is this:
c. List all [pdt#,pdt_name,qty] when the order is from [Ottawa]
I tried:
SELECT pdt#, pdt_name, qty FROM orders, product WHERE city='Ottawa';
I get column is ambiguously defined error. But when I run:
SELECT *, qty FROM orders, product WHERE city='Ottawa';
It runs but I select all the columns. Can someone explain to me why my first query doesn't work? I don't think I need a join. If I can get some help that would be good. To be quite honest I've never seen the error before. If it works with SELECT*, I don't understand why I have issues with select specific columns.
This is because both the tables have pdt# in common and you are selecting it in your query. In cases like these, you have to explicitly specify the table from which the column should be picked up.
You should also join the tables. Else you would get a cross-joined result.
SELECT p.pdt#, p.pdt_name, o.qty
FROM orders o join product p on o.pdt# = p.pdt#
WHERE o.city='Ottawa';
Your second query works because you are selecting all the columns from both the tables and ideally it should not be done. Always specify the columns you need when you are selecting from more than one table.

sql insert error

This is my Insert Statement
INSERT INTO ProductStore (ProductID, StoreID, CreatedOn)
(SELECT DISTINCT(ProductId), 1, GETDATE() FROM ProductCategory
WHERE EXISTS (SELECT StoreID, EntityID FROM EntityStore
WHERE EntityType = 'Category' AND ProductCategory.CategoryID = EntityStore.EntityID AND StoreID = 1))
I am trying to Insert into table ProductStore, all the Products Which are mapped to Categories that are mapped to Store 1. Column StoreID can definitely have more than one row with the same entry. And I am getting the following error: Violation of Primary Key Constraint...
However, the Following query does work:
INSERT INTO ProductStore (ProductID, StoreID, CreatedOn)
VALUES (2293,1,GETDATE()),(2294,1,GETDATE())
So apparently, the ProductID Column is trying to insert the same one more than once.
Can you see anything wrong with my query?
TIA
I don't see any part of that query that excludes records already in the table.
Take out the INSERT INTO statement and just run the SELECT - you should be able to spot pretty quickly where the duplicates are.
My guess is that you're slightly mistaken about what SELECT DISTINCT actually does, as evidenced by the fact that you have parentheses around the ProductId. SELECT DISTINCT only guarantees the elimination of duplicates when all columns in the select list are the same. It won't guarantee in this case that you only get one row for each ProductId.
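If the duplicates turn out to be rows that already exist in ProductStore, one way to skip them is a NOT EXISTS check against the target table. This is a sketch under the assumption that the primary key of ProductStore is (ProductID, StoreID); adjust the correlation if the key is different.

```sql
INSERT INTO ProductStore (ProductID, StoreID, CreatedOn)
SELECT DISTINCT pc.ProductID, 1, GETDATE()
FROM ProductCategory AS pc
WHERE EXISTS (
        -- the product's category is mapped to store 1
        SELECT 1
        FROM EntityStore AS es
        WHERE es.EntityType = 'Category'
          AND es.EntityID = pc.CategoryID
          AND es.StoreID = 1)
  AND NOT EXISTS (
        -- the product is not already present for store 1
        SELECT 1
        FROM ProductStore AS ps
        WHERE ps.ProductID = pc.ProductID
          AND ps.StoreID = 1);
```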
select distinct productid is selecting an existing ID and is therefore in violation of your primary key constraint.
Why don't you create the primary key using Identity increment? In that case you don't need to worry about the ID itself, it will be generated for you.

Help me with this SQL: 'DO THIS for ALL ROWS in TABLE'

[using SQL Server 2005]
I have a table full of users. I want to assign every single user in the table (16,000+) to a course by creating a new entry in the assignment table as well as a new entry in the course tracking table so their data can be tracked. The problem is I do not know how to do a loop in SQL (I don't think you can), but there has got to be a way to do this...
FOR EACH user in TABLE
write a row to each of the two tables with userID from user TABLE...
how would I do this? please help!
You'd do this with 2 insert statements. You'd want to wrap this with a transaction to ensure consistency, and may want to double-check your isolation level to make sure that you get a consistent read from the users table between the 2 queries (take a look at SNAPSHOT or SERIALIZABLE to avoid phantom reads).
BEGIN TRAN
INSERT Courses
(UserID, CourseNumber, ...)
SELECT UserID, 'YourCourseNumberHere', ...
FROM Users
INSERT Assignments
(UserID, AssignmentNumber, ...)
SELECT UserID, 'YourAssignmentNumberHere', ...
FROM Users
COMMIT TRAN
Something like:
insert into CourseAssignment (CourseId, StudentId)
select 1 -- whatever the course number is
, StudentId
from Student
something like this, no need for looping, if you have dups use distinct
also change 1 with the course value
insert into AssignmentTable
select userid,1
from UserTable
insert into OtherTable
select userid,1
from UserTable
Maybe I misunderstand your question, but I think you need an INSERT..SELECT statement
INSERT INTO TABLE2
SELECT field1, field2, field3 from TABLE1
SQL works on sets. It doesn't require loops ..
what you are looking for might be the "insert into" command.
INSERT INTO <new_table> (<list of fields, comma separated>)
SELECT <list of fields,comma separated>
FROM <usertable>
WHERE <selection condition if needed>
--grab 1 record for each student, and push it into the courses table
--i am using a sub-select to look up a course id based on a name
--that may not work for your situation, but then again, it may...
INSERT INTO COURSES(
COURSE_ID
,STUDENT_ID
)
SELECT
(SELECT COURSE_ID FROM COURSES WHERE COURSE_NAME = 'MATH')
,STUDENT_ID
FROM
STUDENTS;
--grab your recently entered course data and create an entry in
--your log table too
INSERT INTO COURSE_DATA(
COURSE_ID
,STUDENT_ID
)
SELECT
COURSE_ID
,STUDENT_ID
FROM
COURSES;
I would do this using the set based approaches that lots of others have already posted...
...however, just for completeness it is worth noting that you could do a loop if you really wanted to. Look up cursors and while loops in books online to see some examples.
Just please don't fall into the trap of using cursors as lots of newbies do. They have their uses but if they're used incorrectly they can be terrible - there's almost always a better way of doing things.