sql insert error - sql

This is my Insert Statement
INSERT INTO ProductStore (ProductID, StoreID, CreatedOn)
(SELECT DISTINCT(ProductId), 1, GETDATE() FROM ProductCategory
WHERE EXISTS (SELECT StoreID, EntityID FROM EntityStore
WHERE EntityType = 'Category' AND ProductCategory.CategoryID = EntityStore.EntityID AND StoreID = 1))
I am trying to insert into the ProductStore table all the products that are mapped to categories that are mapped to store 1. The StoreID column can definitely hold the same value in more than one row. I am getting the following error: Violation of Primary Key Constraint...
However, the Following query does work:
INSERT INTO ProductStore (ProductID, StoreID, CreatedOn)
VALUES (2293,1,GETDATE()),(2294,1,GETDATE())
So apparently the first query is trying to insert the same ProductID more than once.
Can you see anything wrong with my query?
TIA

I don't see any part of that query that excludes records already in the table.
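One way to exclude rows that are already in the target table is an extra NOT EXISTS condition. The sketch below is only an illustration: it uses SQLite through Python's sqlite3 module as a stand-in for SQL Server, CURRENT_TIMESTAMP in place of GETDATE(), an assumed composite primary key on (ProductID, StoreID), and made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductCategory (ProductID INTEGER, CategoryID INTEGER);
    CREATE TABLE EntityStore (EntityID INTEGER, EntityType TEXT, StoreID INTEGER);
    CREATE TABLE ProductStore (
        ProductID INTEGER, StoreID INTEGER, CreatedOn TEXT,
        PRIMARY KEY (ProductID, StoreID)   -- assumed key; the question does not show it
    );
    -- Product 2293 sits in two categories; both categories map to store 1.
    INSERT INTO ProductCategory VALUES (2293, 10), (2293, 11), (2294, 10);
    INSERT INTO EntityStore VALUES (10, 'Category', 1), (11, 'Category', 1);
    -- 2293 is already mapped to store 1 before the bulk insert runs.
    INSERT INTO ProductStore VALUES (2293, 1, '2014-01-01');
""")

conn.execute("""
    INSERT INTO ProductStore (ProductID, StoreID, CreatedOn)
    SELECT DISTINCT pc.ProductID, 1, CURRENT_TIMESTAMP
    FROM ProductCategory AS pc
    WHERE EXISTS (SELECT 1 FROM EntityStore AS es
                  WHERE es.EntityType = 'Category'
                    AND es.EntityID = pc.CategoryID
                    AND es.StoreID = 1)
      -- Skip products that already have a row for store 1.
      AND NOT EXISTS (SELECT 1 FROM ProductStore AS ps
                      WHERE ps.ProductID = pc.ProductID
                        AND ps.StoreID = 1)
""")

rows = conn.execute(
    "SELECT ProductID, StoreID FROM ProductStore ORDER BY ProductID"
).fetchall()
print(rows)
```

Without the NOT EXISTS branch, the same statement would try to insert 2293 a second time and fail on the key.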

Take out the INSERT INTO statement and just run the SELECT - you should be able to spot pretty quickly where the duplicates are.
My guess is that you're slightly mistaken about what SELECT DISTINCT actually does, as evidenced by the fact that you have parentheses around the ProductId. SELECT DISTINCT only guarantees the elimination of duplicates when all columns in the select list are the same. It won't guarantee in this case that you only get one row for each ProductId.

SELECT DISTINCT ProductId is selecting an ID that already exists in the table, and is therefore in violation of your primary key constraint.
Why don't you create the primary key using Identity increment? In that case you don't need to worry about the ID itself, it will be generated for you.

Related

Query trying to select but get ambiguous error?

These are my tables:
create table product
(
pdt# varchar(10) not null,
pdt_name varchar(30) not null,
pdt_label varchar(30) not null,
constraint product_pk primary key (pdt#));
create table orders
(
pdt# varchar(10) not null,
qty number(11,0) not null,
city varchar(30) not null
);
And these are the values
insert into product values ('111', 'chair', 'chr');
insert into product values ('222', 'stool', 'stl');
insert into product values ('333', 'table', 'tbl');
insert into orders values ('111', 22, 'Ottawa');
insert into orders values ('222', 22, 'Ottawa');
insert into orders values ('333', 22, 'Toronto');
Question is this:
c. List all [pdt#,pdt_name,qty] when the order is from [Ottawa]
I tried:
SELECT pdt#, pdt_name, qty FROM orders, product WHERE city='Ottawa';
I get column is ambiguously defined error. But when I run:
SELECT *, qty FROM orders, product WHERE city='Ottawa';
It runs but I select all the columns. Can someone explain to me why my first query doesn't work? I don't think I need a join. If I can get some help that would be good. To be quite honest I've never seen the error before. If it works with SELECT*, I don't understand why I have issues with select specific columns.
This is because both the tables have pdt# in common and you are selecting it in your query. In cases like these, you have to explicitly specify the table from which the column should be picked up.
You should also join the tables. Else you would get a cross-joined result.
SELECT p.pdt#, p.pdt_name, o.qty
FROM orders o join product p on o.pdt# = p.pdt#
WHERE o.city='Ottawa';
Your second query works because you are selecting all the columns from both the tables and ideally it should not be done. Always specify the columns you need when you are selecting from more than one table.
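The difference between the two queries can be reproduced with a small sketch. It runs on SQLite through Python's sqlite3 module rather than on the asker's database, and it renames pdt# to pdt_no (an assumption made only because # is awkward in an unquoted SQLite identifier).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (pdt_no TEXT PRIMARY KEY, pdt_name TEXT, pdt_label TEXT);
    CREATE TABLE orders  (pdt_no TEXT, qty INTEGER, city TEXT);
    INSERT INTO product VALUES
        ('111', 'chair', 'chr'), ('222', 'stool', 'stl'), ('333', 'table', 'tbl');
    INSERT INTO orders VALUES
        ('111', 22, 'Ottawa'), ('222', 22, 'Ottawa'), ('333', 22, 'Toronto');
""")

# Unqualified pdt_no exists in both tables, so the engine cannot pick one.
try:
    conn.execute(
        "SELECT pdt_no, pdt_name, qty FROM orders, product WHERE city = 'Ottawa'"
    )
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# Qualifying each column and joining on pdt_no removes the ambiguity
# and avoids the cross join.
joined = conn.execute("""
    SELECT p.pdt_no, p.pdt_name, o.qty
    FROM orders AS o
    JOIN product AS p ON o.pdt_no = p.pdt_no
    WHERE o.city = 'Ottawa'
    ORDER BY p.pdt_no
""").fetchall()

print(ambiguous_error)
print(joined)
```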

What is the correct solution for that query

insert into Orders values ('1111',
(Select CustomerID from Customers where CustomerID = (Select CustomerID from customers where CompanyName= 'erp')),
(Select EmployeeID from Employees where EmployeeID = (Select EmployeeID from Employees where FirstName = 'Hello')),
(Select ShipperID from Shippers Where ShipperID = (Select ShipperID from Shippers where CompanyName= 'Ntat')),
'2014-12-01','2013-12-01','22','22','aa','aa','dd','gs','ga','ga','qq');
I am unable to run this query; I am getting this error:
Error Code: 1242. Subquery returns more than 1 row
Kindly help
The INSERT command comes in two flavors:
(1) either you have all your values available, as literals or SQL Server variables - in that case, you can use the INSERT .. VALUES() approach:
INSERT INTO dbo.YourTable(Col1, Col2, ...., ColN)
VALUES(Value1, Value2, @Variable3, @Variable4, ...., ValueN)
Note: I would recommend always explicitly specifying the list of columns to insert data into - that way, you won't have any nasty surprises if suddenly your table has an extra column, or if your table has an IDENTITY or computed column. Yes - it's a tiny bit more work - once - but then you have your INSERT statement as solid as it can be and you won't have to constantly fiddle around with it if your table changes.
(2) if you don't have all your values as literals and/or variables, but instead you want to rely on another table, multiple tables, or views, to provide the values, then you can use the INSERT ... SELECT ... approach:
INSERT INTO dbo.YourTable(Col1, Col2, ...., ColN)
SELECT
SourceColumn1, SourceColumn2, @Variable3, @Variable4, ...., SourceColumnN
FROM
dbo.YourProvidingTableOrView
Here, you must define exactly as many items in the SELECT as your INSERT expects - and those can be columns from the table(s) (or view(s)), or those can be literals or variables. Again: explicitly provide the list of columns to insert into - see above.
You can use one or the other - but you cannot mix the two - you cannot use VALUES(...) and then have a SELECT query in the middle of your list of values - pick one of the two - stick with it.
For more details and further in-depth coverage, see the official MSDN SQL Server Books Online documentation on INSERT - a great resource for all questions related to SQL Server!
TL;DR
There is a design integrity issue with your application, from which you will not be able to recover at a Sql Query level.
In Detail
Using non-key values to lookup foreign keys during an insert is not a great idea, as you've now found - the error message indicates that one or more of the subqueries has matched multiple rows, and now you are faced with an idempotence issue.
e.g. Let's just say that in this instance, you have more than one Employee with the name 'Hello'. Your options appear to be:
- Attribute the order to the FIRST employee with the name 'Hello'. Obviously this is potentially unfair to the real employee who made the sale.
- Insert multiple orders, one for each employee. But now we risk double shipping and billing issues.
So the real solution is to ensure that you carry all of the key fields (either a Primary or Unique Key, whether natural or surrogate) for each of the FK role columns through your application at all times.
This then means that you can insert the data with confidence
insert into Orders values ('1111',
@CustomerId,
@EmployeeId,
@ShipperId,
'2014-12-01','2013-12-01','22','22','aa','aa','dd','gs','ga','ga','qq');
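As a rough sketch of the idea of resolving keys before inserting: the helper below (lookup_unique_id is a hypothetical name, not part of any library) refuses to continue unless a lookup matches exactly one row. It uses SQLite via Python's sqlite3 module and a heavily reduced, assumed version of the Employees and Orders tables.

```python
import sqlite3

def lookup_unique_id(conn, sql, params):
    """Return the single id matched by sql, or raise if zero or several rows match."""
    rows = conn.execute(sql, params).fetchall()
    if len(rows) != 1:
        raise ValueError(f"expected exactly 1 match, got {len(rows)}")
    return rows[0][0]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT);
    CREATE TABLE Orders (OrderID TEXT, EmployeeID INTEGER);
    INSERT INTO Employees VALUES (1, 'Hello'), (2, 'Hello'), (3, 'World');
""")

# 'World' matches one employee, so the key can be used safely.
world_id = lookup_unique_id(
    conn, "SELECT EmployeeID FROM Employees WHERE FirstName = ?", ("World",)
)
conn.execute(
    "INSERT INTO Orders (OrderID, EmployeeID) VALUES (?, ?)", ("1111", world_id)
)

# 'Hello' matches two employees; refuse to guess which one made the sale.
try:
    lookup_unique_id(
        conn, "SELECT EmployeeID FROM Employees WHERE FirstName = ?", ("Hello",)
    )
    hello_error = None
except ValueError as exc:
    hello_error = str(exc)

orders = conn.execute("SELECT OrderID, EmployeeID FROM Orders").fetchall()
print(orders, hello_error)
```

In a real application the key would normally travel with the record from the start, so the name lookup would not be needed at all.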
You will have to do this with the help of a procedure, because your SELECT statements return more than one value.
You will have to pass the values one by one to the INSERT statement:
create procedure test
as
begin
declare @customerid int
declare @empid int
declare @shipperid int
set @customerid = (Select CustomerID from customers where CompanyName = 'erp')
set @empid = (Select EmployeeID from Employees where FirstName = 'Hello')
set @shipperid = (Select ShipperID from Shippers where CompanyName = 'Ntat')
-- note: SET fails if the subquery returns more than one row
-- (SELECT @var = col FROM ... would instead keep the last value)
-- if a lookup can return more than one value, you will have to put the values in a
-- temporary table, e.g. create table #temp1 (customerid int), and loop over it
insert into orders values (@customerid, @empid, @shipperid, 'val1', 'val2' /* ... and so on */)
end

How to perform insert for each selected row from database

I am trying to make stored procedure that:
- Get list of int rows
select ItemId from Items -- this returns: 1,2,3,4,5,6
In the second part of procedure I have to add row in another table for each of selected number. Something like:
foreach ItemId in previous result
insert into table (ItemIdInAnotherTable) values (ItemId)
UPDATE
I missed an important part of the question.
In another part of the procedure, when I insert the selected items into another table, I need to insert a few more columns. Something like this:
insert into dbo.ItemsNotificator
(UserId,ItemId)
(13879, (select ItemId from Items))
So it's not one column. Sorry for confusion :(
Edit :
Assuming that the table [table] already exists, and if UserId is a constant, then do it like so:
INSERT INTO [table](UserId, ItemIdInAnotherTable)
SELECT 13879, ItemId
FROM Items;
If UserId comes from another table entirely, you'll need to figure out what relationship you need between UserId and ItemId. For instance, if all users are linked to all items, then it is:
INSERT INTO [table](UserId, ItemIdInAnotherTable)
SELECT u.UserId, i.ItemId
FROM Items i CROSS JOIN Users u;
If table [table] does NOT already exist, then you can use SELECT INTO, and specify a new table name (e.g. a #temp table stored in tempdb)
SELECT u.UserId, i.ItemId
INTO #tmpNewTable
FROM Items i CROSS JOIN Users u;
The columns in the newly created table will have the names UserId and ItemId and have the same types.
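The two INSERT ... SELECT forms above can be tried out with a small sketch. It uses SQLite through Python's sqlite3 module instead of SQL Server; the second user id (20001) and the three sample items are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (ItemId INTEGER);
    CREATE TABLE Users (UserId INTEGER);
    CREATE TABLE ItemsNotificator (UserId INTEGER, ItemId INTEGER);
    INSERT INTO Items VALUES (1), (2), (3);
    INSERT INTO Users VALUES (13879), (20001);
""")

# Constant user: one row per item, with no explicit loop.
conn.execute(
    "INSERT INTO ItemsNotificator (UserId, ItemId) SELECT 13879, ItemId FROM Items"
)
constant_rows = conn.execute(
    "SELECT UserId, ItemId FROM ItemsNotificator ORDER BY ItemId"
).fetchall()

# Every user paired with every item.
conn.execute("DELETE FROM ItemsNotificator")
conn.execute("""
    INSERT INTO ItemsNotificator (UserId, ItemId)
    SELECT u.UserId, i.ItemId
    FROM Items AS i CROSS JOIN Users AS u
""")
cross_count = conn.execute("SELECT COUNT(*) FROM ItemsNotificator").fetchone()[0]

print(constant_rows, cross_count)
```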
Looks simple to me:
INSERT INTO ItemIdInAnotherTable (ItemId)
SELECT ItemId from Items
That is exactly what a normal insert command does. Just do something like this:
insert into Table (ItemID)
select ItemID from Items;
use INSERT INTO..SELECT statement
INSERT INTO ItemIdInAnotherTable (ItemId)
SELECT ItemId
FROM Items

Delete duplicates with no primary key

Here we want to delete rows with a duplicated value in one column (Product), which will then be used as the primary key.
The column is of type nvarchar and we don't want to have 2 rows for one product.
The database is a large one, with thousands of rows we need to remove.
When querying for the duplicates, we want to keep the first item and remove the second one as the duplicate.
There is no primary key yet; we want to create it after removing the duplicates.
Then the Product column could be our primary key.
The database is SQL Server CE.
I tried several methods, and mostly getting error similar to :
There was an error parsing the query. [ Token line number = 2,Token line offset = 1,Token in error = FROM ]
A method which I tried :
DELETE FROM TblProducts
FROM TblProducts w
INNER JOIN (
SELECT Product
FROM TblProducts
GROUP BY Product
HAVING COUNT(*) > 1
)Dup ON w.Product = Dup.Product
The approach I would prefer, and which I am trying to learn and adapt my code to, is something like this (it's not correct yet):
SELECT Product, COUNT(*) TotalCount
FROM TblProducts
GROUP BY Product
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC
--
;WITH cte -- These 3 lines are the lines I have more doubt on them
AS (SELECT ROW_NUMBER() OVER (PARTITION BY Product
ORDER BY ( SELECT 0)) RN
FROM Word)
DELETE FROM cte
WHERE RN > 1
If you have two DIFFERENT records with the same Product column, then you can SELECT the unwanted records with some criterion, e.g.
CREATE TABLE victims AS
SELECT MAX(entryDate) AS date, Product, COUNT(*) AS dups FROM ProductsTable WHERE ...
GROUP BY Product HAVING dups > 1;
Then you can do a DELETE JOIN between ProductsTable and victims.
Or also you can select Product only, and then do a DELETE for some other JOIN condition, for example having an invalid CustomerId, or EntryDate NULL, or anything else. This works if you know that there is one and only one valid copy of Product, and all the others are recognizable by the invalid data.
Suppose you instead have IDENTICAL records (or you have both identical and non-identical, or you may have several dupes for some product and you don't know which). You run exactly the same query. Then, you run a SELECT query on ProductsTable and SELECT DISTINCT all products matching the product codes to be deduped, grouping by Product, and choosing a suitable aggregate function for all fields (if identical, any aggregate should do. Otherwise I usually try for MAX or MIN). This will "save" exactly one row for each product.
At that point you run the DELETE JOIN and kill all the duplicated products. Then, simply reimport the saved and deduped subset into the main table.
Of course, between the DELETE JOIN and the INSERT SELECT, you will have the DB in an unstable state, with all products that had at least one duplicate simply gone.
Another way which should work in MySQL:
-- Create an empty table
CREATE TABLE deduped AS SELECT * FROM ProductsTable WHERE false;
CREATE UNIQUE INDEX deduped_ndx ON deduped(Product);
-- DROP duplicate rows, Joe the Butcher's way
INSERT IGNORE INTO deduped SELECT * FROM ProductsTable;
ALTER TABLE ProductsTable RENAME TO ProductsBackup;
ALTER TABLE deduped RENAME TO ProductsTable;
-- TODO: Copy all indexes from ProductsTable on deduped.
NOTE: the way above DOES NOT WORK if you want to distinguish "good records" and "invalid duplicates". It only works if you have redundant DUPLICATE records, or if you do not care which row you keep and which you throw away!
EDIT:
You say that "duplicates" have invalid fields. In that case you can modify the above with a sorting trick:
SELECT * FROM ProductsTable ORDER BY Product, FieldWhichShouldNotBeNULL IS NULL;
Then if you have only one row for product, all well and good, it will get selected. If you have more, the one for which (FieldWhichShouldNeverBeNull IS NULL) is FALSE (i.e. the one where the FieldWhichShouldNeverBeNull is actually not null as it should) will be selected first, and inserted. All others will bounce, silently due to the IGNORE clause, against the uniqueness of Product. Not a really pretty way to do it (and check I didn't mix true with false in my clause!), but it ought to work.
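A minimal sketch of this sorting trick, with some loud assumptions: it runs on SQLite through Python's sqlite3 module, so INSERT OR IGNORE stands in for MySQL's INSERT IGNORE, and the table has only two columns with invented rows. Like the original, it relies on rows being inserted in the order the SELECT produces them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductsTable (Product TEXT, Description TEXT);
    -- The invalid (NULL) copy of CBPD10 is stored first on purpose.
    INSERT INTO ProductsTable VALUES
        ('CBPD10', NULL), ('CBPD10', 'C-Beam Prj'),
        ('CBPD11', 'C Proj Mk2'), ('CBPD14', NULL);

    CREATE TABLE deduped (Product TEXT, Description TEXT);
    CREATE UNIQUE INDEX deduped_ndx ON deduped (Product);

    -- "Description IS NULL" is 0 for good rows, so they sort (and insert) first;
    -- later rows with the same Product hit the unique index and are skipped.
    INSERT OR IGNORE INTO deduped
    SELECT Product, Description
    FROM ProductsTable
    ORDER BY Product, Description IS NULL;
""")

deduped = conn.execute(
    "SELECT Product, Description FROM deduped ORDER BY Product"
).fetchall()
print(deduped)
```

CBPD14 survives even though its description is NULL, because it has no competing row.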
EDIT
actually more of a new answer
This is a simple table to illustrate the problem
CREATE TABLE ProductTable ( Product varchar(10), Description varchar(10) );
INSERT INTO ProductTable VALUES ( 'CBPD10', 'C-Beam Prj' );
INSERT INTO ProductTable VALUES ( 'CBPD11', 'C Proj Mk2' );
INSERT INTO ProductTable VALUES ( 'CBPD12', 'C Proj Mk3' );
There is no index yet, and no primary key. We could still declare Product to be primary key.
But something bad happens. Two new records get in, and both have NULL description.
Yet, the second one is a valid product since we knew nothing of CBPD14 before now, and therefore we do NOT want to lose this record completely. We do want to get rid of the spurious CBPD10 though.
INSERT INTO ProductTable VALUES ( 'CBPD10', NULL );
INSERT INTO ProductTable VALUES ( 'CBPD14', NULL );
A rude DELETE FROM ProductTable WHERE Description IS NULL is out of the question, it would kill CBPD14 which isn't a duplicate.
So we do it like this. First get the list of duplicates:
SELECT Product, COUNT(*) AS Dups FROM ProductTable GROUP BY Product HAVING Dups > 1;
We assume that: "There is at least one good record for every set of bad records".
We check this assumption by positing the opposite and querying for it. If all is copacetic we expect this query to return nothing.
SELECT Dups.Product FROM ProductTable
RIGHT JOIN ( SELECT Product, COUNT(*) AS Dups FROM ProductTable GROUP BY Product HAVING Dups > 1 ) AS Dups
ON (ProductTable.Product = Dups.Product
AND ProductTable.Description IS NOT NULL)
WHERE ProductTable.Description IS NULL;
To further verify, I insert two records that represent this mode of failure; now I do expect the query above to return the new code.
INSERT INTO ProductTable VALUES ( "AC5", NULL ), ( "AC5", NULL );
Now the "check" query indeed returns,
AC5
So, the generation of Dups looks good.
I proceed now to delete all duplicate records that are not valid. If there are duplicate, valid records, they will stay duplicate unless some condition may be found, distinguishing among them one "good" record and declaring all others "invalid" (maybe repeating the procedure with a different field than Description).
But ay, there's a rub. Currently, you cannot delete from a table and select from the same table in a subquery ( http://dev.mysql.com/doc/refman/5.0/en/delete.html ). So a little workaround is needed:
CREATE TEMPORARY TABLE Dups AS
SELECT Product, COUNT(*) AS Duplicates
FROM ProductTable GROUP BY Product HAVING Duplicates > 1;
DELETE ProductTable FROM ProductTable JOIN Dups USING (Product)
WHERE Description IS NULL;
Now this will delete all invalid records, provided that they appear in the Dups table.
Therefore our CBPD14 record will be left untouched, because it does not appear there. The "good" record for CBPD10 will be left untouched because it's not true that its Description is NULL. All the others - poof.
Let me state again that if a record has no valid records and yet it is a duplicate, then all copies of that record will be killed - there will be no survivors.
To avoid this, you can first SELECT (using the query above, the check "which should return nothing") the rows representing this mode of failure into another TEMPORARY TABLE, then INSERT them back into the main table after the deletion (using transactions might be in order).
Create a new table by scripting the old one out and renaming it. Also script all objects (indexes etc.) from the old table to the new. Insert the keepers into the new table. If your database is in the bulk-logged or simple recovery model, this operation will be minimally logged. Drop the old table and then rename the new one to the old name.
The advantage of this over a delete will be that the insert can be minimally logged. Deletes do double work because not only does the data get deleted, but the delete has to be written to the transaction log. For big tables, minimally logged inserts will be much faster than deletes.
If it's not that big and you have some downtime, and you have SQL Server Management Studio, you can put an identity field on the table using the GUI. Now you have a situation like your CTE, except the rows themselves are truly distinct. So now you can do the following:
SELECT MIN(lhs.MyTempIDField)
FROM
table_a lhs
join table_a rhs
on lhs.field1 = rhs.field1
and lhs.field2 = rhs.field2 [etc]
WHERE
lhs.MyTempIDField <> rhs.MyTempIDField
GROUP BY
lhs.field1, lhs.field2 etc
This gives you all the 'good' duplicates. Now you can wrap this query with a DELETE FROM query.
DELETE FROM lhs
FROM table_a lhs
join table_a rhs
on lhs.field1 = rhs.field1
and lhs.field2 = rhs.field2 [etc]
WHERE
lhs.MyTempIDField <> rhs.MyTempIDField
and lhs.MyTempIDField not in (
SELECT MIN(lhs.MyTempIDField)
FROM
table_a lhs
join table_a rhs
on lhs.field1 = rhs.field1
and lhs.field2 = rhs.field2 [etc]
WHERE
lhs.MyTempIDField <> rhs.MyTempIDField
GROUP BY
lhs.field1, lhs.field2 etc
)
Try this:
DELETE FROM TblProducts
WHERE Product IN
(
SELECT Product
FROM TblProducts
GROUP BY Product
HAVING COUNT(*) > 1)
This suffers from the defect that it deletes ALL the records with a duplicated Product. What you probably want to do is delete all but one of each group of records with a given Product. It might be worthwhile to copy all the duplicates to a separate table first, and then somehow remove duplicates from that table, then apply the above, and then copy remaining products back to the original table.
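If the table can be given a temporary unique id, deleting all but one row per Product becomes a single statement. The sketch below is an assumption-laden illustration: it runs on SQLite through Python's sqlite3 module (not SQL Server CE, whose supported syntax may differ), TempId is a hypothetical column name, and the sample products are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- TempId stands in for a temporary identity column; it is an assumption.
    CREATE TABLE TblProducts (
        TempId INTEGER PRIMARY KEY AUTOINCREMENT,
        Product TEXT
    );
    INSERT INTO TblProducts (Product)
    VALUES ('apple'), ('pear'), ('apple'), ('apple'), ('plum');

    -- Keep the lowest TempId per Product and delete every other row.
    DELETE FROM TblProducts
    WHERE TempId NOT IN (
        SELECT MIN(TempId) FROM TblProducts GROUP BY Product
    );
""")

remaining = conn.execute(
    "SELECT TempId, Product FROM TblProducts ORDER BY TempId"
).fetchall()
print(remaining)
```

Once only one row per Product is left, the temporary column can be dropped and Product declared as the primary key.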

Increment non unique field during SQL insert

I'm not sure how to word this because I am a little confused at the moment, so bear with me while I attempt to explain. I have a table with the following fields:
OrderLineID, OrderID, OrderLine, and a few other unimportant ones.
OrderLineID is the primary key and is always unique (which isn't a problem), OrderID is a foreign key that isn't unique (also not a problem), and OrderLine is a value that is not unique in the table, but should be unique for any OrderIDs that are the same...so if that didn't make sense, perhaps a picture...
OrderLineID, OrderID, OrderLine
1 1 1
2 1 2
3 1 3
4 2 1
5 2 2
For all OrderIDs there is a unique OrderLine. I am trying to create an insert statement that gets the max OrderLine value for a specific OrderId so I can increment it, but it's not working so well and I could use a little help. What I have right now is below, I build the SQL statement in a program and replace OrderID # with an actual value. I am pretty sure the problem is with the nested select statement, and incrementing the result, but I can't find any examples that do this since my Google skills are weak apparently....
INSERT INTO tblOrderLine (OrderID, OrderLine) VALUES
(<OrderID #>, (SELECT MAX(OrderLine)
FROM tblOrderLine WHERE orderID = <same OrderID #>)+1)
Any help would be nice.
This statement works in Access 2003. You would have to substitute your OrderID value in the WHERE clause.
INSERT INTO tblOrderLine (OrderID, OrderLine)
SELECT
s.OrderID,
s.MaxOrderLine + 1 AS NewOrderLine
FROM (
SELECT
OrderID,
Max(OrderLine) AS MaxOrderLine
FROM
tblOrderLine
WHERE
OrderID=1
GROUP BY
OrderID
) AS s;
I read the others' misgivings, and will leave the wisdom of this approach to you. It could get more interesting if you can have multiple users updating tblOrderLine at the same time.
Are you getting some type of error? Your SQL code seems to work fine for me.
Don't use a combination of VALUES and SELECT. Try:
INSERT INTO tblOrderLine (OrderID, OrderLine)
SELECT <OrderID #>, MAX(OrderLine)+1
FROM tblOrderLine
WHERE orderID = <same OrderID #>;
Adding a scalar to the result of a query isn't generally kosher. Try moving the "+1":
INSERT INTO tblOrderLine (OrderID, OrderLine) VALUES
(
<OrderID #>,
(SELECT MAX(OrderLine)+1 FROM tblOrderLine WHERE orderID = <OrderID #>)
)
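For comparison, here is a small runnable sketch of the MAX + 1 pattern. It uses SQLite through Python's sqlite3 module rather than Access, adds COALESCE so that an order with no lines yet starts at 1 (an addition not present in the answers above), and assumes a unique constraint on (OrderID, OrderLine). The add_order_line helper is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblOrderLine (
        OrderLineID INTEGER PRIMARY KEY AUTOINCREMENT,
        OrderID INTEGER NOT NULL,
        OrderLine INTEGER NOT NULL,
        UNIQUE (OrderID, OrderLine)   -- assumed constraint, not in the question
    );
    INSERT INTO tblOrderLine (OrderID, OrderLine)
    VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 2);
""")

def add_order_line(conn, order_id):
    """Insert the next OrderLine for order_id; COALESCE covers an order with no lines."""
    conn.execute(
        """
        INSERT INTO tblOrderLine (OrderID, OrderLine)
        SELECT ?, COALESCE(MAX(OrderLine), 0) + 1
        FROM tblOrderLine
        WHERE OrderID = ?
        """,
        (order_id, order_id),
    )

add_order_line(conn, 1)   # existing order: next line is 4
add_order_line(conn, 3)   # new order: first line is 1

lines = conn.execute(
    "SELECT OrderID, OrderLine FROM tblOrderLine ORDER BY OrderID, OrderLine"
).fetchall()
print(lines)
```

With several users inserting at once, two sessions could compute the same next value; the unique constraint would then make one of the inserts fail rather than store a duplicate line number.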