I have a file that has no Primary Key. In order to load the file and perform analysis I want to concatenate 2 existing columns and send the output to a new column. I'm then going to do a hash of this resultant column and use that as a PK.
I haven't even got to the hash part as I can not for the life of me work out how to populate my concatenated column with data.
The query I'm trying to use is:
ALTER TABLE members_250815
ADD COLUMN email_id VARCHAR;
UPDATE members_250815
INSERT INTO members_250815(email_id)(
SELECT ARRAY_TO_STRING(ARRAY[emailaddress, id], ' ') AS email_id
FROM members_250815);
As separate queries, both
ALTER TABLE members_250815
ADD COLUMN email_id VARCHAR;
and
SELECT ARRAY_TO_STRING(ARRAY[emailaddress, id], ' ') AS email_id
FROM members_250815;
seem to work as I want them to (ie - 1) create the new column and 2) concatenate the 2 columns) however my issue seems to be in joining it all together.
Am I doing something really stupid? I have tried to research this for hours but I am getting nowhere. Essentially the task I am trying to achieve is:
Create new column on existing table
Concatenate 2 existing columns
Take the result of the concatenation and update this new column with this data without affecting any of my other existing data.
Is this possible?
Many thanks in advance
---Update 260815
Many thanks for the quick advice guys, much appreciated!
Using a combination of your advice I have gotten to here:
CREATE TABLE members_update AS
SELECT * FROM members_250815;
ALTER TABLE members_update
ADD COLUMN email_id VARCHAR;
UPDATE members_update
SET email_id = email || id;
ALTER TABLE members_update
ADD COLUMN hashed_primary_key VARCHAR;
UPDATE members_update
SET hashed_primary_key = md5(email_id::VARCHAR);
ALTER TABLE members_update
ADD CONSTRAINT hashed_primary_key_urn
PRIMARY KEY (hashed_primary_key);
ANALYSE members_update;
I have checked and everything works as expected up until adding the primary key. It turned out that my email field contains numerous NULL values, which are then carried into the email_id and hashed columns and stop the hashed version from being used as the PK.
As such I have been experimenting with IF THEN ELSE and WHERE ELSE statements like
UPDATE members_update(
IF email IS NOT NULL
THEN SET email_id = email || id
ELSE SET email_id = id
END IF);
I have tried numerous combinations, with and without brackets etc and I can never get it to work! I think I am close but just can't seem to make this final part work - has anyone got any ideas?
Many thanks,
Mark
The problem is that your UPDATE statement is malformed.
You need SET with a CASE expression. It should be:
ALTER TABLE members_250815
ADD COLUMN email_id VARCHAR;
UPDATE members_250815
SET email_id = CASE
WHEN email IS NULL THEN id
ELSE email || id
END;
ARRAY_TO_STRING(ARRAY[emailaddress, id], ' ') may also work, but further research would be necessary to know whether it is more efficient than simply concatenating the strings.
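For anyone who wants to try the NULL-safe concatenation and the hash end to end, here is a minimal sketch. It runs against an in-memory SQLite database through Python's sqlite3 module rather than PostgreSQL, so the table name members, the TEXT column types and the registered md5 function are all assumptions made for the demo. COALESCE is used in place of CASE; the two are equivalent here except that a NULL email becomes an empty string before concatenation.

```python
import hashlib
import sqlite3


def build_hashed_key(conn):
    """Add email_id and hashed_primary_key columns and fill them in place."""
    # SQLite has no built-in md5(), so register one backed by hashlib
    # (assumption for this demo; PostgreSQL ships md5() itself).
    conn.create_function(
        "md5", 1, lambda s: hashlib.md5(s.encode("utf-8")).hexdigest()
    )
    conn.execute("ALTER TABLE members ADD COLUMN email_id TEXT")
    conn.execute("ALTER TABLE members ADD COLUMN hashed_primary_key TEXT")
    # COALESCE turns a NULL email into '' so the concatenation is never NULL.
    conn.execute("UPDATE members SET email_id = COALESCE(email, '') || id")
    conn.execute("UPDATE members SET hashed_primary_key = md5(email_id)")
    conn.commit()


# Hypothetical sample data: one row with an email, one without.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (email TEXT, id TEXT)")
conn.executemany(
    "INSERT INTO members VALUES (?, ?)",
    [("a@example.com", "1"), (None, "2")],
)
build_hashed_key(conn)
rows = conn.execute(
    "SELECT email_id, hashed_primary_key FROM members ORDER BY id"
).fetchall()
```

Note that the hash is only usable as a key if id itself is unique among the rows that have no email, since those rows hash the id alone.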
A better way to create a PK column:
Just alter the table and add a serial column.
SQL Fiddle Demo
CREATE TABLE members_250815
("DMDUNIT" varchar(5),
"IND" int)
;
INSERT INTO members_250815 VALUES ('TM001', 1);
INSERT INTO members_250815 VALUES ('TM002', 1);
INSERT INTO members_250815 VALUES ('TM003', 1);
ALTER TABLE members_250815
ADD COLUMN id SERIAL NOT NULL PRIMARY KEY;
Additional Info
In PostgreSQL, updating every row of a large table can be slow, so in some cases it is better to create a new table instead:
CREATE TABLE new_table AS
SELECT *, CASE
WHEN email IS NULL THEN id
ELSE email || id
END as email_id
FROM members_250815;
and then
DROP TABLE IF EXISTS members_250815;
ALTER TABLE new_table RENAME TO members_250815;
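The create-and-swap idea above can be sketched as follows. This is a small demo against in-memory SQLite via Python's sqlite3 module, not PostgreSQL; the table name members and the helper rebuild_with_email_id are made up for illustration.

```python
import sqlite3


def rebuild_with_email_id(conn, table="members"):
    """Rebuild `table` with an extra email_id column, then swap it in."""
    # The table name is interpolated into the SQL, so pass trusted names only.
    conn.executescript(f"""
        CREATE TABLE {table}_new AS
        SELECT *, COALESCE(email, '') || id AS email_id
        FROM {table};
        DROP TABLE {table};
        ALTER TABLE {table}_new RENAME TO {table};
    """)


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (email TEXT, id TEXT)")
conn.executemany(
    "INSERT INTO members VALUES (?, ?)",
    [("a@example.com", "1"), (None, "2")],
)
conn.commit()
rebuild_with_email_id(conn)
cols = [row[1] for row in conn.execute("PRAGMA table_info(members)")]
values = conn.execute("SELECT email_id FROM members ORDER BY id").fetchall()
```

One thing to keep in mind with this approach: a table created with CREATE TABLE ... AS SELECT does not inherit indexes or constraints from the original, so they have to be recreated after the swap.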
Related
An example to the problem:
There are 3 columns present in my SQL database.
+-------------+------------------+-------------------+
| id(integer) | age(varchar(20)) | name(varchar(20)) |
+-------------+------------------+-------------------+
There are a 100 rows of different ids, ages and names. However, since many people update the database, age and name constantly change.
However, there are some boundaries to age and name:
Age has to be an integer and has to be greater than 0.
Name has to be alphabets and not numbers.
The problem is a script to check if the change of values is within the boundaries. For example, if age = -1 or Name = 1 , these values are out of the boundaries.
Right now, there is a script that does insert * into newtable where age < 0 and isnumeric(age) = 0 or isnumeric(name) = 0;
The compiled new table has rows of data that have values that are out of the boundary.
I was wondering if there is a more efficient method to do such checking in SQL. Also, I'm using Microsoft SQL Server, so I was wondering whether it would be more efficient to use another language such as C# or Python to solve this issue.
You can apply check constraint. Replace 'myTable' with your table name. 'AgeCheck' and 'NameCheck' are names of the constraints. And AGE is the name of your AGE column.
ALTER TABLE myTable
ADD CONSTRAINT AgeCheck CHECK(AGE > 0 )
ALTER TABLE myTable
ADD CONSTRAINT NameCheck CHECK ([Name] NOT LIKE '%[^A-Z]%')
See more on Create Check Constraints
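To see check constraints reject bad rows in practice, here is a minimal sketch using in-memory SQLite through Python's sqlite3 module instead of SQL Server. The table name people and the helper try_insert are invented for the demo, and the name rule uses SQLite's GLOB because the bracket pattern with LIKE shown above is SQL Server syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE people (
        id   INTEGER PRIMARY KEY,
        -- age must be stored as an integer and be greater than 0
        age  INTEGER CHECK (typeof(age) = 'integer' AND age > 0),
        -- name must not contain any character outside A-Z / a-z
        name TEXT    CHECK (name NOT GLOB '*[^A-Za-z]*')
    )
""")


def try_insert(age, name):
    """Return True if the row is accepted, False if a CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO people (age, name) VALUES (?, ?)", (age, name)
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert(30, "Alice")
rejected_age = try_insert(-1, "Bob")
rejected_name = try_insert(25, "R2D2")
```

With the constraint in place the database refuses the bad values at write time, so no separate checking script is needed for new data.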
If you want to automatically insert the invalid data into a new table, you can create AFTER INSERT Trigger. I have given snippet for your reference. You can expand the same with additional logic for name check.
Generally, triggers are discouraged, as they make the transaction lengthier. If you want to avoid the trigger, you can have a sql agent job to do auditing on regular basis.
CREATE TRIGGER AfterINSERTTrigger on [Employee]
FOR INSERT
AS
BEGIN
DECLARE @Age TINYINT, @Id INT, @Name VARCHAR(20);
-- Note: reading INSERTED into scalar variables assumes single-row inserts.
SELECT @Id = ins.Id FROM INSERTED ins;
SELECT @Age = ins.Age FROM INSERTED ins;
SELECT @Name = ins.Name FROM INSERTED ins;
IF (@Age = 0)
BEGIN
INSERT INTO [EmployeeAudit](
[ID]
,[Name]
,[Age])
VALUES (@Id,
@Name,
@Age);
END
END
GO
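As a rough illustration of the audit-trigger idea, here is a sketch in SQLite, driven from Python's sqlite3 module, not T-SQL. The table names employee and employee_audit and the trigger name are assumptions for the demo. SQLite triggers fire once per row, so a multi-row insert is audited row by row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE employee_audit (id INTEGER, name TEXT, age INTEGER);

    -- Copy every newly inserted row with an out-of-range age to the audit table.
    CREATE TRIGGER audit_bad_age AFTER INSERT ON employee
    WHEN NEW.age <= 0
    BEGIN
        INSERT INTO employee_audit (id, name, age)
        VALUES (NEW.id, NEW.name, NEW.age);
    END;
""")

# Hypothetical sample rows: one valid, two with invalid ages.
conn.executemany(
    "INSERT INTO employee (name, age) VALUES (?, ?)",
    [("Ann", 31), ("Bob", 0), ("Cy", -4)],
)
audited = conn.execute(
    "SELECT name, age FROM employee_audit ORDER BY name"
).fetchall()
```

Unlike a check constraint, this lets the bad rows in and only records them, which matches the "compile the invalid rows into another table" requirement in the question.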
I want to do the following
ALTER TABLE runs ADD COLUMN userId bigint NOT NULL DEFAULT (SELECT id FROM users WHERE email = 'admin@example.com');
but it keeps giving me syntax error. How could I do this guys?
Any help is highly appreciated. ;)
Create a function to get the id from the table users with email as an arg.
CREATE OR REPLACE FUNCTION id_in_users(iemail varchar)
RETURNS int LANGUAGE SQL AS $$
SELECT id FROM users WHERE email = iemail;
$$;
And alter the table
ALTER TABLE runs ADD COLUMN userId bigint NOT NULL DEFAULT
id_in_users('admin@example.com');
SQL FIDDLE(DEMO)
You can't do that on DEFAULT. However you could use a trigger before insert checking if there is a NULL value.
You can check the PostgreSQL Trigger Documentation here
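If the default only needs to be the admin's id as it is at migration time, another option is to look the id up first and then put it into the ALTER TABLE as a plain constant. A minimal sketch of that idea, using in-memory SQLite through Python's sqlite3 module rather than PostgreSQL; the helper name add_user_id_column and the sample data are made up:

```python
import sqlite3


def add_user_id_column(conn, admin_email):
    """Look up the admin's id, then add runs.userId with that id as DEFAULT."""
    row = conn.execute(
        "SELECT id FROM users WHERE email = ?", (admin_email,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no user with email {admin_email!r}")
    # DDL cannot take bound parameters, so the value is formatted in;
    # int() guarantees only a number can end up in the statement.
    admin_id = int(row[0])
    conn.execute(
        f"ALTER TABLE runs ADD COLUMN userId INTEGER NOT NULL DEFAULT {admin_id}"
    )
    return admin_id


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users VALUES (7, 'admin@example.com')")
conn.execute("CREATE TABLE runs (run_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO runs (run_id) VALUES (?)", [(1,), (2,)])
admin_id = add_user_id_column(conn, "admin@example.com")
conn.execute("INSERT INTO runs (run_id) VALUES (3)")
user_ids = [r[0] for r in conn.execute("SELECT userId FROM runs ORDER BY run_id")]
```

The trade-off is that the default is frozen at the value found when the script ran; if the admin's id can change later, the function or trigger approaches are a better fit.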
I'm trying to figure out how to implement the alter script described below. I'm familiar with the basics of insert/select already, but this is a lot more complex.
I have a legacy table and need to move its data to a new table with more columns. The new table has already been made public to some select users, who may have already manually moved the common data over.
So for each row in LegacyTable:
see if it already exists in NewImprovedTable (by checking for a match on a string field that exists in both tables)
if not, copy it over to NewImprovedTable
regardless of whether it had been copied to NewImprovedTable automatically just now, or previously by the user...
auto-populate a new Name field in NewImprovedTable (must be unique - e.g. "Legacy1", "Legacy2", etc.)
set an IsLegacy flag in NewImprovedTable
I need to implement this in both MS SQL and Oracle, but once I work out the logic on one I can figure out the syntax on the other.
The solution I settled on (in SQL Server - still need to port to Oracle):
IF NOT EXISTS(SELECT 1
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'NewImprovedTable'
AND COLUMN_NAME = 'legacyFlg')
BEGIN
ALTER TABLE [NewImprovedTable]
ADD legacyFlg TINYINT NULL
ALTER TABLE [LegacyTable]
ADD improvedId INT NULL
END
GO
IF NOT EXISTS(SELECT 1 FROM NewImprovedTable WHERE legacyFlg = 1)
BEGIN
MERGE NewImprovedTable AS TARGET
USING LegacyTable AS SOURCE
ON (TARGET.stringField = SOURCE.stringField)
WHEN NOT MATCHED THEN
INSERT (name, <other columns>, legacyFlg)
VALUES('Legacy' + SOURCE.stringField, <other column values>, 1)
WHEN MATCHED THEN
UPDATE SET TARGET.legacyFlg = 1;
END
GO
IF NOT EXISTS(SELECT 1 FROM LegacyTable WHERE improvedId <> 0)
BEGIN
MERGE LegacyTable AS TARGET
USING NewImprovedTable AS SOURCE
ON (SOURCE.stringField = TARGET.stringField)
WHEN MATCHED THEN
UPDATE SET TARGET.improvedId = SOURCE.pId;
END
GO
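For readers who want to try the insert-if-missing-then-flag logic without MERGE, here is a minimal sketch. It uses in-memory SQLite through Python's sqlite3 module, and the table names legacy and new_improved, the column names and the helper migrate_legacy are all simplified stand-ins, not the real schema. Like the MERGE above, it only sets the flag on rows that were already copied by hand and leaves their name alone.

```python
import sqlite3


def migrate_legacy(conn):
    """Copy missing legacy rows, then flag every row that has a legacy match."""
    with conn:  # commit both steps together, roll back on error
        # Step 1: insert legacy rows that have no match on string_field.
        conn.execute("""
            INSERT INTO new_improved (string_field, name, is_legacy)
            SELECT l.string_field, 'Legacy' || l.string_field, 1
            FROM legacy AS l
            WHERE NOT EXISTS (
                SELECT 1 FROM new_improved AS n
                WHERE n.string_field = l.string_field
            )
        """)
        # Step 2: flag rows that were copied earlier by a user.
        conn.execute("""
            UPDATE new_improved
            SET is_legacy = 1
            WHERE string_field IN (SELECT string_field FROM legacy)
        """)


# Hypothetical sample data: 'beta' was already moved over manually.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE legacy (string_field TEXT);
    CREATE TABLE new_improved (
        p_id INTEGER PRIMARY KEY,
        string_field TEXT,
        name TEXT UNIQUE,
        is_legacy INTEGER DEFAULT 0
    );
    INSERT INTO legacy VALUES ('alpha'), ('beta');
    INSERT INTO new_improved (string_field, name) VALUES
        ('beta', 'Manual'), ('gamma', 'Other');
""")
migrate_legacy(conn)
migrate_legacy(conn)  # running it again should change nothing
result = conn.execute(
    "SELECT string_field, name, is_legacy FROM new_improved ORDER BY string_field"
).fetchall()
```

Because the insert is guarded by NOT EXISTS, the script is safe to rerun, which is the same property the IF NOT EXISTS guards give the SQL Server version.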
You could try using this, where 'input' is the string whose existence you are trying to confirm:
SELECT * FROM `NewImprovedTable` WHERE `Variable`='input'
This returns the whole row if a match is found; otherwise it returns an empty result set, which you can test for.
As for the unique ID field, you need to create a primary key on your table with the auto increment option enabled, for example
CREATE TABLE Persons
(
P_Id INT NOT NULL AUTO_INCREMENT,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Address varchar(255),
City varchar(255),
PRIMARY KEY (P_Id)
)
In this last example P_Id is set as an auto-increment column; each time you create a new row, this column is filled automatically with a unique number.
You should check this page
http://www.w3schools.com/sql/sql_primarykey.asp
Hello, I have a table with many inserted rows. I need to renumber all rows by id and order them.
I found this code, but it does not work for me.
SET @i = 100;
UPDATE "main"."Categories" SET ID = (@i := @i +1) WHERE "Name" = "White";
ALTER TABLE "main"."Categories" AUTO_INCREMENT = 1
Using the code above, I expected all records with the name White to be renumbered, starting from 100 and incrementing by 1. But it does not work for me. Maybe there is a problem in my code, or maybe it is a difference between SQL and SQLite queries.
This is how I created the table:
CREATE TABLE Categories (id INTEGER PRIMARY KEY, Name TEXT, Free NUMERIC)
I hope there is already made solution how to do it because I don't want to do it manually :)
That code is not standard SQL.
SQLite does not have many programming constructs because it is designed to be an embedded database where it is more natural to have the logic in the host language.
If you want to do this in SQL, try the following:
First, create a temporary table so that we have an autoincrement column that can be used for counting:
CREATE TEMPORARY TABLE new_ids(i INTEGER PRIMARY KEY, old_id INTEGER);
Insert a dummy record to ensure that the next new record starts at 100, then insert all the IDs of the Categories table that you want to change:
INSERT INTO new_ids VALUES(99, NULL);
INSERT INTO new_ids SELECT NULL, id FROM "Categories" WHERE "Name" = 'White';
DELETE FROM new_ids WHERE i = 99;
Then we can change all these IDs in the original table:
UPDATE "Categories"
SET id = (SELECT i FROM new_ids WHERE old_id = "Categories".id)
WHERE id IN (SELECT old_id FROM new_ids);
DROP TABLE new_ids;
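Since the answer notes that SQLite expects the logic to live in the host language, here are the same steps wrapped in a small Python function using the standard sqlite3 module. The function name renumber and the sample rows are made up for the demo; the SQL is the temporary-table technique described above with the name and start value passed as parameters.

```python
import sqlite3


def renumber(conn, name, start):
    """Give all Categories rows with the given Name new ids from `start` up."""
    with conn:
        conn.execute(
            "CREATE TEMPORARY TABLE new_ids "
            "(i INTEGER PRIMARY KEY, old_id INTEGER)"
        )
        # Dummy row so that the next automatically assigned i is `start`.
        conn.execute("INSERT INTO new_ids VALUES (?, NULL)", (start - 1,))
        conn.execute(
            "INSERT INTO new_ids "
            "SELECT NULL, id FROM Categories WHERE Name = ? ORDER BY id",
            (name,),
        )
        conn.execute("DELETE FROM new_ids WHERE i = ?", (start - 1,))
        conn.execute(
            "UPDATE Categories "
            "SET id = (SELECT i FROM new_ids WHERE old_id = Categories.id) "
            "WHERE id IN (SELECT old_id FROM new_ids)"
        )
        conn.execute("DROP TABLE new_ids")


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Categories (id INTEGER PRIMARY KEY, Name TEXT, Free NUMERIC)"
)
conn.executemany(
    "INSERT INTO Categories (id, Name, Free) VALUES (?, ?, ?)",
    [(1, "White", 0), (2, "Black", 0), (3, "White", 1), (4, "White", 0)],
)
renumber(conn, "White", 100)
white_ids = [
    r[0]
    for r in conn.execute(
        "SELECT id FROM Categories WHERE Name = 'White' ORDER BY id"
    )
]
```

This assumes the new range (100 upwards here) does not overlap any id already in the table; if it does, the UPDATE will fail with a primary key conflict.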
I have a SQL script that populates a temp column and then drops the column at the end of the script. The first time it runs, it works fine because the column exists, then it gets dropped. The script breaks the 2nd time because the column no longer exists, even though the IF statement ensures that it won't run again. How do I get around SQL checking for this field?
IF EXISTS (SELECT name FROM syscolumns
WHERE name = 'COLUMN_THAT_NO_LONGER_EXISTS')
BEGIN
INSERT INTO TABLE1
(
COLUMN_THAT_NO_LONGER_EXISTS,
COLUMN_B,
COLUMN_C
)
SELECT 1,2,3 FROM TABLE2
ALTER TABLE TABLE1 DROP COLUMN COLUMN_THAT_NO_LONGER_EXISTS
END
I had a similar problem once and got round it by building all the queries as strings and executing them using the Exec() call. That way the queries (selects, inserts or whatever) don't get parsed till they are executed.
It wasn't pretty or elegant though.
e.g
exec('INSERT INTO TABLE1(COLUMN_THAT_NO_LONGER_EXISTS,COLUMN_B,COLUMN_C) SELECT 1,2,3 FROM TABLE2')
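The same principle, that a statement held as a string is only compiled when it is actually executed, is easy to see from a host language. A minimal sketch with Python's sqlite3 module and in-memory SQLite, not SQL Server; table1, temp_col and both helper functions are hypothetical names, and the column is removed by rebuilding the table so the demo does not depend on DROP COLUMN support.

```python
import sqlite3


def column_exists(conn, table, column):
    """Return True if `table` has a column called `column`."""
    # PRAGMA table_info yields one row per column; index 1 is the name.
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    return column in cols


def populate_then_drop(conn):
    """Use temp_col if it still exists, then remove it. Safe to rerun."""
    if not column_exists(conn, "table1", "temp_col"):
        return False  # the INSERT below is never sent to the database
    conn.execute(
        "INSERT INTO table1 (temp_col, column_b, column_c) VALUES (1, 2, 3)"
    )
    # Rebuild the table without temp_col.
    conn.executescript("""
        CREATE TABLE table1_new AS SELECT column_b, column_c FROM table1;
        DROP TABLE table1;
        ALTER TABLE table1_new RENAME TO table1;
    """)
    return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (temp_col INTEGER, column_b INTEGER, column_c INTEGER)"
)
first_run = populate_then_drop(conn)
second_run = populate_then_drop(conn)
remaining = [row[1] for row in conn.execute("PRAGMA table_info(table1)")]
```

On the second run the guard returns early, so the statement that mentions the missing column is never parsed, which is what wrapping it in Exec() achieves in T-SQL.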
Are you checking that the column isn't on another table? If not, you probably need to check the table too; see the IF statement below.
If you are already doing that, is it running in a single transaction and not picking up that the dropped column has gone?
IF EXISTS (SELECT name FROM sys.columns
WHERE name = 'COLUMN_THAT_NO_LONGER_EXISTS' and Object_Name(object_id) = 'Table1')
I put together a quick script for this; can you confirm that it matches what you are trying to do? In SQL 2007, at least, it isn't returning an error. If I create the table and run the ALTER TABLE to add TestColC, it works; if I then run the IF / INSERT block, that works too, even after dropping the table.
create table tblTests
(
TestID int identity (1,1),
TestColA int null,
TestColB int null
)
go -- Ran this on its own
insert into tblTests (TestColA, TestColB)
Select 1,2
go 10
-- Insert some initial data
alter table tblTests
add TestColC Int
go -- alter the table to add new column
-- Run this with column and then after it has removed it
IF EXISTS (SELECT name FROM sys.columns a
WHERE name = 'TestColC' AND
OBJECT_NAME(object_id) = 'tblTests')
Begin
insert into tblTests (TestColA, TestColB, testcolc)
select 1,2,3
alter table tblTests
drop column TestColC
End