Scenario
Adding a column to a table and then updating that column
alter table sometable add example_column_name varchar(255);
update sometable set example_column_name = '';
(real update is a bit more complex, but this is a boiled down version we used trying to find the problem)
Problem
The update query gives 'Ambiguous column name example_column_name.'
This works in all databases except one.
It happens only for one specific column name; adding a column with a different name and updating that column works.
The column name in question works in other databases, and it already exists in other tables in the same db
Question
Does anyone know what's going on, and how can we get past this problem?
Update
The problem was an indexed view that used the column name of the new column in an already existing query. See comments and accepted answer for details.
This error can't happen purely from the code shown.
There must be a trigger or an indexed view in play. You have ruled out triggers, so an example demonstrating the indexed view scenario is below.
CREATE TABLE T1(X INT, Y INT)
CREATE TABLE T2(X INT, Z INT)
GO
CREATE VIEW V1
WITH SCHEMABINDING
AS
SELECT T1.X,
T1.Y,
Z
FROM dbo.T1
JOIN dbo.T2
ON T1.X = T2.X
GO
CREATE UNIQUE CLUSTERED INDEX IX
ON V1(X)
GO
ALTER TABLE T1
ADD Z INT;
GO
UPDATE T1
SET Z = 0
When the view is initially created, the only table containing a column Z is T2, so the reference is not ambiguous. After adding column Z to T1, the view definition becomes ambiguous. The UPDATE to the table tries to automatically maintain the indexed view, and the error is thrown.
Msg 209, Level 16, State 1, Procedure V1, Line 5 [Batch Start Line 23]
Ambiguous column name 'Z'.
Msg 4413, Level 16, State 1, Line 25
Could not use view or function 'V1' because of binding errors.
It is best practice to always use two-part naming (table.column) wherever your query references more than one table, to avoid this type of error.
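To see the effect of two-part naming in isolation, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the snippet is runnable anywhere), and it runs the view's SELECT directly rather than through an indexed view. The `runs` helper is hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server here; table names follow the example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (X INT, Y INT);
    CREATE TABLE T2 (X INT, Z INT);
    INSERT INTO T1 VALUES (1, 10);
    INSERT INTO T2 VALUES (1, 100);
""")

# Same query twice: once with a bare Z, once with every column qualified.
unqualified = "SELECT T1.X, T1.Y, Z FROM T1 JOIN T2 ON T1.X = T2.X"
qualified = "SELECT T1.X, T1.Y, T2.Z FROM T1 JOIN T2 ON T1.X = T2.X"

def runs(sql):
    """Hypothetical helper: True if the statement executes without an error."""
    try:
        conn.execute(sql).fetchall()
        return True
    except sqlite3.OperationalError:
        return False

before = (runs(unqualified), runs(qualified))
conn.execute("ALTER TABLE T1 ADD COLUMN Z INT")  # Z now exists in both tables
after = (runs(unqualified), runs(qualified))
print(before, after)
```

Only the query with the bare Z stops working once T1 gains its own Z column; the fully qualified one is unaffected by the schema change.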
I have a table that has a primary key WORKITEMID and the following 3 foreign keys: PRODSERVID, PROCESSID, and TASKKNOWID.
I have a view that I can create that also has PRODSERVID, PROCESSID, and TASKKNOWID. This view will usually have ALL the records in the above table plus some new ones that are not in the table. The table, by definition, is meant to hold the unique combinations of PRODSERVID, PROCESSID, and TASKKNOWID.
I would like to insert from the view into the table any new combinations in the view that are not present in the table. I don't want the INSERT to overwrite the existing WORKITEMIDs, because those WORKITEMIDs are used elsewhere.
Can this be done in SQL?
Thanks
Absolutely. The simplest form of criteria for this is the negation of EXISTS():
INSERT INTO [TableName] (PRODSERVID,PROCESSID,TASKKNOWID,... )
SELECT PRODSERVID,PROCESSID,TASKKNOWID,...
FROM [ViewName] v
WHERE NOT EXISTS (
SELECT 1 FROM [TableName] t
WHERE t.PRODSERVID = v.PRODSERVID AND t.PROCESSID = v.PROCESSID AND t.TASKKNOWID = v.TASKKNOWID
)
Replace the ... with your other fields.
You could also use a non-correlated outer join, but I find NOT EXISTS makes the intent much clearer.
There is a good comparison of the different approaches to this issue in this article: T-SQL commands performance comparison – NOT IN vs SQL NOT EXISTS vs SQL LEFT JOIN vs SQL EXCEPT
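As a rough, runnable illustration of the NOT EXISTS pattern, here is a sketch using Python's built-in sqlite3 module instead of SQL Server. The names WorkItems, SourceRows and SourceView are hypothetical stand-ins for your table and view, and AUTOINCREMENT plays the role of whatever generates WORKITEMID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the real table and view.
    CREATE TABLE WorkItems (
        WORKITEMID INTEGER PRIMARY KEY AUTOINCREMENT,
        PRODSERVID INT, PROCESSID INT, TASKKNOWID INT
    );
    CREATE TABLE SourceRows (PRODSERVID INT, PROCESSID INT, TASKKNOWID INT);
    CREATE VIEW SourceView AS
        SELECT DISTINCT PRODSERVID, PROCESSID, TASKKNOWID FROM SourceRows;
    INSERT INTO WorkItems (PRODSERVID, PROCESSID, TASKKNOWID)
        VALUES (1, 1, 1), (1, 2, 1);
    INSERT INTO SourceRows VALUES (1, 1, 1), (1, 2, 1), (2, 1, 1);
""")

# Insert only the combinations the table does not hold yet.
insert_new = """
    INSERT INTO WorkItems (PRODSERVID, PROCESSID, TASKKNOWID)
    SELECT v.PRODSERVID, v.PROCESSID, v.TASKKNOWID
    FROM SourceView v
    WHERE NOT EXISTS (
        SELECT 1 FROM WorkItems t
        WHERE t.PRODSERVID = v.PRODSERVID
          AND t.PROCESSID = v.PROCESSID
          AND t.TASKKNOWID = v.TASKKNOWID
    )
"""
conn.execute(insert_new)
count_after_first = conn.execute("SELECT COUNT(*) FROM WorkItems").fetchone()[0]
conn.execute(insert_new)  # running it again adds nothing
count_after_second = conn.execute("SELECT COUNT(*) FROM WorkItems").fetchone()[0]
rows = conn.execute("SELECT * FROM WorkItems ORDER BY WORKITEMID").fetchall()
print(count_after_first, count_after_second, rows)
```

The existing rows keep their WORKITEMIDs, only the missing combination gets a new one, and the statement can be rerun safely.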
How can I add a clustered index to the following view?
CREATE VIEW [vExcludedIds]
WITH SCHEMABINDING
AS
SELECT DISTINCT
TempTable.Id
FROM
(VALUES (1), (2), (3)) AS TempTable(Id)
And my index creation command is:
CREATE UNIQUE CLUSTERED INDEX IDX_V1
ON [vExcludedIds] (Id);
And I get the following error:
Cannot create index on view "Test.dbo.vExcludedIds" because it references derived table "TempTable" (defined by SELECT statement in FROM clause). Consider removing the reference to the derived table or not indexing the view.
Also, when I try to add the index manually in SQL Server Management Studio, I get an error at the top of "New Index" window saying:
HasClusteredColumnStoreIndex: unknown property.
Any ideas please?
Please read https://msdn.microsoft.com/en-AU/library/ms191432.aspx
There are a lot of limitations for creating indexed views.
...
The SELECT statement in the view definition must not contain the following Transact-SQL elements:
DISTINCT
Derived table
Consider creating a table or a table-valued function instead.
I have the following Oracle 10g sql which to me looks about right:
update ( select OLD1.TC_CUSTOMER_NUMBER,NEW1.PRD_CUST_NUMBER
FROM TBYC84_PROFILE_ACCOUNT OLD1,
TMP_PRD_KEP NEW1
WHERE
OLD1.TC_CUSTOMER_NUMBER = NEW1.KEP_CUST_NUMBER )
SET
TC_CUSTOMER_NUMBER = PRD_CUST_NUMBER
But I am getting this error when I run the script:
SQL Error: ORA-01779: cannot modify a column which maps to a non key-preserved table
01779. 00000 - "cannot modify a column which maps to a non key-preserved table"
*Cause: An attempt was made to insert or update columns of a join view which
map to a non-key-preserved table.
*Action: Modify the underlying base tables directly.
I have done some research on this error but am not quite sure how to remedy it.
So my question is: how can I fix this, or is there a better way to write the update SQL?
Any help would be appreciated.
many thanks
UPDATE
I have changed the update sql to this:
update
TBYC84_PROFILE_ACCOUNT PA
set
(
PA.TC_CUSTOMER_NUMBER
) = (
select
TPK.PRD_CUST_NUMBER
from
TMP_PRD_KEP TPK
where
TPK.KEP_CUST_NUMBER = PA.TC_CUSTOMER_NUMBER
)
Now this has updated the TBYC84_PROFILE_ACCOUNT table AND nulled out the TC_CUSTOMER_NUMBER column.
Why did it do this?
There may be more than one row in TBYC84_PROFILE_ACCOUNT with the same TC_CUSTOMER_NUMBER account number but for different user_ids.
Please can anyone help me resolve this.
All I need to do is update TBYC84_PROFILE_ACCOUNT.TC_CUSTOMER_NUMBER to the one that is cross-referenced in TMP_PRD_KEP; surely this is not impossible.
many thanks
For an UPDATE statement, all the columns that are updated must be extracted from a key-preserved table.
Also:
A key-preserved table is one for which every primary key or unique key value in the base table is also unique in the join view.
Here.
In this case, TBYC84_PROFILE_ACCOUNT is being updated. So, it must be key-preserved in the view's subquery. Currently it is not. It must be changed in a way that it becomes key-preserved by involving primary or unique columns in the where clause. If not possible, you should try to update the base table instead.
UPDATE
In case of the table update problem, assuming the subquery returns at most one distinct value for the TC_CUSTOMER_NUMBER column, the reason you get NULLs is that all records are being updated even if they do not have any matching records in the TMP_PRD_KEP table. So, the parent update statement needs to be fitted with a where clause:
update
TBYC84_PROFILE_ACCOUNT PA
set
(
PA.TC_CUSTOMER_NUMBER
) = (
select
TPK.PRD_CUST_NUMBER
from
TMP_PRD_KEP TPK
where
TPK.KEP_CUST_NUMBER = PA.TC_CUSTOMER_NUMBER
)
where exists(select *
from TMP_PRD_KEP TPK
where TPK.KEP_CUST_NUMBER = PA.TC_CUSTOMER_NUMBER)
;
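The difference the WHERE EXISTS guard makes can be reproduced with a small sketch. It uses Python's built-in sqlite3 module rather than Oracle, with shortened, hypothetical table names (profile_account, tmp_prd_kep) and made-up sample values, so treat it as an illustration of the logic only.

```python
import sqlite3

def fresh_db():
    """Build a tiny in-memory copy of the two tables (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE profile_account (user_id INT, tc_customer_number TEXT);
        CREATE TABLE tmp_prd_kep (kep_cust_number TEXT, prd_cust_number TEXT);
        INSERT INTO profile_account VALUES (1, 'K1'), (2, 'K1'), (3, 'K9');
        INSERT INTO tmp_prd_kep VALUES ('K1', 'P1');
    """)
    return conn

# Correlated update: every row gets the subquery result, NULL when nothing matches.
set_clause = """
    UPDATE profile_account
    SET tc_customer_number = (
        SELECT prd_cust_number FROM tmp_prd_kep
        WHERE kep_cust_number = profile_account.tc_customer_number
    )
"""
# Guard: restrict the update to rows that have a match in the cross-reference table.
guard = """
    WHERE EXISTS (
        SELECT 1 FROM tmp_prd_kep
        WHERE kep_cust_number = profile_account.tc_customer_number
    )
"""

def run(sql):
    conn = fresh_db()
    conn.execute(sql)
    return conn.execute(
        "SELECT user_id, tc_customer_number FROM profile_account ORDER BY user_id"
    ).fetchall()

unguarded = run(set_clause)
guarded = run(set_clause + guard)
print(unguarded)
print(guarded)
```

Without the guard, the row whose customer number has no cross-reference ends up NULL; with it, that row keeps its original value. Both rows sharing the same old number are updated in either case.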
Create an index on the columns used in your where clause predicates. That should solve your problem.
I am currently reading the book "SQL Programming Style", written by Joe Celko.
In the first chapter, in the section "Develop Standardized Postfixes", he states the following for the id column:
"_id" = identifier. It is unique in
the schema and refers to one entity
anywhere it appears in the schema.
Never use "<table_name>_id"
Few pages later he states
Do not use an underscore as the first
or last letter in a name. It looks
like the name is missing another
component.
He also deprecates "id" as a column name.
So I would like to know how you guys name the id column?
I know that most people might wonder what the point of this question is, but I am looking to standardize my data model, following industry standards and ISO standards as much as I can.
I also deprecate the use of "Id" as a column name, even though it has become very widespread. "EmployeeId" is longer than "Id", but it is more descriptive. It also allows a foreign key to generally have the same name as the column to which it refers. This is enormously helpful when control over the database passes from one person to the next.
There is an exception to the above. It's possible to have two foreign keys in the same table that both refer to the same key. It's also possible to have a reflexive foreign key that refers to the key in a different row of the same table where it appears.
Let me give an example of a reflexive key. You have a table of employees, with key EmployeeId. You have another column, called SupervisorId, that records the relationship between a supervisor and several subordinates. The name of the foreign key in this case names the role, and not the entity.
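A reflexive key like this might look as follows. This is only a sketch using Python's built-in sqlite3 module; the Name column and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE Employee (
        EmployeeId   INTEGER PRIMARY KEY,
        Name         TEXT NOT NULL,
        -- Named for the role (supervisor), not for the entity (employee).
        SupervisorId INTEGER REFERENCES Employee (EmployeeId)
    );
    INSERT INTO Employee VALUES (1, 'Ada', NULL), (2, 'Ben', 1), (3, 'Cy', 1);
""")

# Self-join: pair each employee with the row their SupervisorId points to.
pairs = conn.execute("""
    SELECT e.Name, s.Name
    FROM Employee e
    JOIN Employee s ON s.EmployeeId = e.SupervisorId
    ORDER BY e.EmployeeId
""").fetchall()
print(pairs)
```

Because both columns refer to the same key, the role name is what keeps the join condition readable.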
As an alternative, it's possible to use user defined domains to document the fact that two columns refer to the same thing. Again, this is most useful when the fundamental meaning of the data has to be communicated to someone new.
The use of underscore as an internal visual separator inside a symbol is a completely separable issue. Camelcasing has become more widespread than underscore, and there are even systems where underscore is not allowed as a symbol constituent.
Above all, keep it consistent. If you use arbitrary, capricious, and contradictory naming conventions, you'll eventually confuse even yourself.
I think it's a good question. Do what looks good to you, and always do that, every time. Then you'll be fine.
I use the tablename + 'id' model: UserId, PersonId etc.
Rather than share my opinions on naming standards, I'll attempt to answer your question ;)
I think the point Celko is making is that student_ID in a table of students is a code smell, i.e. it could be that the designer's style is to always add an ID column, probably an auto-increment column, to every table they create in the physical model (even when there is no such column in the logical model) with the intention of using these ID columns for foreign keys. In other words, Celko does not want you to always use a surrogate key; rather, he wants you to use natural keys where appropriate.
If you read on to section 1.2.5 (p14-15) and follow his rules for table names, you'll discover why table name + _ID is an unlikely occurrence:
if I cannot find an industry standard
(name), I would look for a collective
or class name... Exception: use a
singular name if the table actually
has one and only one row in it.
So, for example, if you had a table containing student data it may be called Students rather than Student but more likely to be Enrolment (or similar). And a table containing one and only one row is unlikely to need an _ID column.
I suppose there are nouns for which the plural is the same as the singular, so maybe Sheep_ID is acceptable (but only in the absence of an industry standard ovine identifier, of course!)
Also consider the rule 1.3.2. (p19) Avoid Names That Change From Place to Place e.g. the same domain referred to in the Students table as ID and in other tables as student_ID. It is unlikely that there will only be one element named _ID in the entire schema!
For Table IDs I always use tablename + ID.
The reason for this is to avoid ambiguous column names in queries when it is a 1-to-1 mapping.
Sometimes I quickly write up SQL to test, like this:
Select
*
FROM table1
Inner join table2 on table1ID = table2ID
If I didn't use the table name in the ID column, then this would throw an error (forcing me to use aliases on the tables):
Select
*
FROM table1
Inner join table2 on ID = ID
Another good reason to use the table name: when testing queries to see what data exists, you often use "*" to select columns. If you do a join and SELECT *, it is sometimes difficult to tell which ID came from which table, especially if you are returning a large number of columns from more than 2 tables.
I always advocate for globally unique TABLENAME_ID. On that note, I strongly encourage table names which fully describe their context, so there is never any ambiguity as to their application when foreign references are made.
ID as a column name is hard to maintain and in my opinion can more easily lead to mistakes in joins.
Suppose for instance you always used ID as a column name in every table.
Now suppose you need to join to six of those tables. And being a typical person, you copy the first join and change the table names. If you miss one and you use id, you will get a query that runs and gives the wrong answer. If you use tablenameId, you will get a syntax error. See the following code for an example:
create table #test1 (id int identity, test varchar(10))
create table #test2 (id int identity, test varchar(10))
create table #test3 (id int identity, test varchar(10))
insert #test1
values ('hi')
insert #test1
values ('hello')
insert #test2
values ('hi there')
insert #test3
values ('hello')
insert #test3
values ('hi')
select *
from #test1 t1
join #test2 t2
on t1.id = t2.id
join #test3 t3
on t1.id = t2.id -- copy-paste mistake (should be t3.id): still runs, but returns the wrong rows
select *
from #test1 t1
join #test2 t2
on t1.id = t2.id
join #test3 t3
on t1.id = t3.id
Drop table #test1
drop table #test2
drop table #test3
Go
create table #test1 (t1id int identity, test varchar(10))
create table #test2 (t2id int identity, test varchar(10))
create table #test3 (t3id int identity, test varchar(10))
insert #test1
values ('hi')
insert #test1
values ('hello')
insert #test2
values ('hi there')
insert #test3
values ('hello')
insert #test3
values ('hi')
select *
from #test1 t1
join #test2 t2
on t1.t1id = t2.t2id
join #test3 t3
on t1.t1id = t3.t3id
select *
from #test1 t1
join #test2 t2
on t1.t1id = t2.t2id
join #test3 t3
on t1.t1id = t2.t3id -- same copy-paste mistake: fails, because #test2 has no column t3id
Drop table #test1
drop table #test2
drop table #test3
Another thing about using tablenameId is that when you want the actual id from several tables in a complex reporting query, you don't have to create aliases in order to see which id came from where (and to make the reporting application happy, as most of them insist on unique field names for a report).
Wow, I was going to write "I always use TablenameID but everyone else in the world disagrees with me". However, it looks like everyone here agrees with me.
That is, of course, when I use a surrogate integer ID in the table. If there's a natural primary key I use that instead.
In my database:
For a foreign key ID, I use the singular version of the foreign table name + "Id". I use the capital I, lowercase d, as it is a standard ingrained in me by FxCop.
For auto incrementing identities I often use "SequenceId"
In my data layer:
I use the name of the object + "Id", following best practice standards for "Id"