Should I add another table or add a column to the existing one - sql

I have a table with thousands of entries and want to show whether an entity is deleted or not.
I can add a new column "isDeleted" to the existing table and update every entry (thousands) of that entity in the table once it is deleted
OR
have a new table for the deleted entries and join the tables for queries.
I want to know which is faster.
I will be querying the table and want information about deleted entities as well as non-deleted ones.
Let's say my table has columns:
id  type  prop1  info1
--  ----  -----  -----
1   A     any    any
2   B     any    any
3   C     any    any
4   A     any    any
5   B     any    any
And if I go and delete type A, I can either have an isDeleted column in this same table, like so:
id  type  prop1  info1  isDeleted
--  ----  -----  -----  ---------
1   A     any    any    true
2   B     any    any    false
3   C     any    any    false
4   A     any    any    true
5   B     any    any    false
or have a new table for deleted types.
With the first method I will have to update the isDeleted column for every instance of type A, and there are thousands of such entries, whereas with the second method I can simply add a new row to the new table.
I want all the unique "types" in my table that have not been deleted, but I don't want to remove the information about the deleted types.
I hope this is clearer.

The easiest way would be to add a nullable isDeleted column and mark the rows you delete as non-null. This also preserves backwards compatibility.
To build on this, I would instead recommend making this a deleted_at column stored as a nullable timestamp; this way you get some extra metadata as a bonus.
One benefit of this extra metadata is that it can support audit trails.
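A minimal sketch of the deleted_at idea, using Python's built-in sqlite3 module so it can be run as-is. The table name items and the sample rows are assumptions taken from the question's example, and the timestamp is stored as ISO 8601 text because SQLite has no dedicated timestamp type; other databases would use their native type.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("          # hypothetical table name
    " id INTEGER PRIMARY KEY,"
    " type TEXT NOT NULL,"
    " deleted_at TEXT"              # NULL means "not deleted"
    ")"
)
conn.executemany(
    "INSERT INTO items (id, type) VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "C"), (4, "A"), (5, "B")],
)

# Soft-delete every row of type A by stamping the deletion time.
now = datetime.now(timezone.utc).isoformat()
conn.execute("UPDATE items SET deleted_at = ? WHERE type = ?", (now, "A"))

# Unique types that are still live.
active_types = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT type FROM items WHERE deleted_at IS NULL ORDER BY type"
    )
]
# Deleted rows are still present and queryable.
deleted_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM items WHERE deleted_at IS NOT NULL ORDER BY id"
    )
]
print(active_types)
print(deleted_ids)
```

Existing queries that never mention deleted_at keep working unchanged, which is the backwards-compatibility point made above.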

To avoid storing the same data repeatedly, add a separate table types with columns type and is_deleted. This way, you avoid inconsistencies, such as rows 1 and 4 in your proposed example ending up disagreeing with each other (one true, the other false).
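The normalized layout could look roughly like the sketch below, again using Python's sqlite3 module. The table name items and the integer encoding of is_deleted (0 or 1) are assumptions for illustration; types, type and is_deleted are the names proposed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE types (
        type       TEXT PRIMARY KEY,
        is_deleted INTEGER NOT NULL DEFAULT 0  -- 0 = live, 1 = deleted
    );
    CREATE TABLE items (                       -- hypothetical table name
        id    INTEGER PRIMARY KEY,
        type  TEXT NOT NULL REFERENCES types(type),
        prop1 TEXT,
        info1 TEXT
    );
    INSERT INTO types (type) VALUES ('A'), ('B'), ('C');
    INSERT INTO items (id, type, prop1, info1) VALUES
        (1, 'A', 'any', 'any'),
        (2, 'B', 'any', 'any'),
        (3, 'C', 'any', 'any'),
        (4, 'A', 'any', 'any'),
        (5, 'B', 'any', 'any');
    """
)

# Deleting type A touches exactly one row, however many items use it.
cur = conn.execute("UPDATE types SET is_deleted = 1 WHERE type = 'A'")
rows_touched = cur.rowcount

# All unique types that have not been deleted.
live_types = [
    row[0]
    for row in conn.execute(
        "SELECT type FROM types WHERE is_deleted = 0 ORDER BY type"
    )
]

# Every item together with the deletion state of its type.
items_with_flag = conn.execute(
    "SELECT i.id, i.type, t.is_deleted"
    " FROM items AS i JOIN types AS t ON t.type = i.type"
    " ORDER BY i.id"
).fetchall()
print(rows_touched, live_types, items_with_flag)
```

Because the flag lives in one place, two items of the same type cannot disagree about whether that type is deleted.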
REFERENCES:
What is the reason to "normalize your databases"?
What is Normalisation (or Normalization)?

Related

How do I add a primary key to a column that may contain null or empty values?

Background info:
Table A has 100 rows in it representing inventory on a shelf. I want to make column boxCode a primary key but I can't because the empty rows all have boxCode of empty string.
Table B has a variable amount of rows based on the actual inventory. I want to make column boxCode a foreign key linked to Table A.
Problem:
Currently I perform 2 SQL queries to make the above operations occur, but the action isn't atomic. I update Table A and then update Table B, but there is a small period of time where the tables' information is out of sync. This is causing issues with our API. Is there a way in SQL to add or delete rows from Table B when I UPDATE Table A? I can guarantee I only ever UPDATE Table A, if that makes a difference.

Transferring data when identity column values are different

I am in the process of restructuring a database and creating an MVC 5 application. Many tables have been normalized, but a few remain the same. In the original database table, a few of the rows were deleted, so the table data looks like this:
Id Column1
--------------------
1 Some value
2 Some value
4 Some value
8 Some value
9 Some value
Now I am using code first to create a new database with some new and some existing database tables. In my entity model I am using the following code to mark a field as primary key and identity:
[Key]
[DatabaseGenerated(DatabaseGeneratedOption.Identity)]
public int ID { get; set; }
Now the tables created by code first have auto incremented values for ID columns.
Id Column1
--------------------
1 Some value
2 Some value
3 Some value
4 Some value
5 Some value
The first table has more than 100 records. The ID column of this table is also used as a foreign key in another table which has nearly 1000 records. The problem I am facing is how to account for the difference between the original table IDs and the newly created IDs. If I try to specify the value for the ID column explicitly, I get an error saying that I cannot explicitly specify a value for an identity column. I have to write a seed method for both tables. What is the proper way to handle this scenario?

Column Copy and Update vs. Column Create and Insert

I have a table with 32 Million rows and 31 columns in PostgreSQL 9.2.10. I am altering the table by adding columns with updated values.
For example, if the initial table is:
id initial_color
-- -------------
1 blue
2 red
3 yellow
I am modifying the table so that the result is:
id initial_color modified_color
-- ------------- --------------
1 blue blue_green
2 red red_orange
3 yellow yellow_brown
I have code that will read the initial_color column and update the value.
Given that my table has 32 million rows and that I have to apply this procedure on five of the 31 columns, what is the most efficient way to do this? My present choices are:
Copy the column and update the rows in the new column
Create an empty column and insert new values
I could do either option with one column at a time or with all five at once. The columns types are either character varying or character.
The columns types are either character varying or character.
Don't use character, that's a misunderstanding. varchar is ok, but I would suggest just text for arbitrary character data.
Any downsides of using data type "text" for storing strings?
Given that my table has 32 million rows and that I have to apply this
procedure on five of the 31 columns, what is the most efficient way to do this?
If you don't have objects (views, foreign keys, functions) depending on the existing table, the most efficient way is to create a new table. Something like this (details depend on the details of your installation):
BEGIN;
LOCK TABLE tbl_org IN SHARE MODE; -- to prevent concurrent writes
CREATE TABLE tbl_new (LIKE tbl_org INCLUDING STORAGE INCLUDING COMMENTS);
ALTER TABLE tbl_new ADD COLUMN modified_color text
, ADD COLUMN modified_something text;
-- , etc
INSERT INTO tbl_new (<all columns in order here>)
SELECT <all columns in order here>
, myfunction(initial_color) AS modified_color -- etc
FROM tbl_org;
-- ORDER BY tbl_id; -- optionally order rows while being at it.
-- Add constraints and indexes like in the original table here
DROP TABLE tbl_org;
ALTER TABLE tbl_new RENAME TO tbl_org;
COMMIT;
If you have depending objects, you need to do more.
Either way, be sure to add all five at once. If you update each in a separate query, you write another row version each time due to the MVCC model of Postgres.
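To illustrate the shape of "all at once", here is a small sketch in which a single UPDATE fills two derived columns in one pass. It uses Python's sqlite3 module purely so that it is runnable; SQLite does not share PostgreSQL's MVCC behaviour, so this shows only the statement structure, not the storage effect. The function modify_color, its mapping, and the columns initial_trim and modified_trim are hypothetical stand-ins for myfunction and the other columns.

```python
import sqlite3

# Hypothetical transformation standing in for myfunction() above.
MAPPING = {"blue": "blue_green", "red": "red_orange", "yellow": "yellow_brown"}


def modify_color(value):
    return MAPPING.get(value)


conn = sqlite3.connect(":memory:")
conn.create_function("modify_color", 1, modify_color)
conn.executescript(
    """
    CREATE TABLE tbl_org (
        id             INTEGER PRIMARY KEY,
        initial_color  TEXT,
        initial_trim   TEXT,   -- hypothetical second source column
        modified_color TEXT,
        modified_trim  TEXT
    );
    INSERT INTO tbl_org (id, initial_color, initial_trim) VALUES
        (1, 'blue',   'red'),
        (2, 'red',    'yellow'),
        (3, 'yellow', 'blue');
    """
)

# One statement sets every new column, so each row is rewritten once
# instead of once per column.
cur = conn.execute(
    "UPDATE tbl_org"
    " SET modified_color = modify_color(initial_color),"
    "     modified_trim  = modify_color(initial_trim)"
)
rows_updated = cur.rowcount
rows = conn.execute(
    "SELECT id, modified_color, modified_trim FROM tbl_org ORDER BY id"
).fetchall()
print(rows_updated, rows)
```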
Related cases with more details, links and explanation:
Updating database rows without locking the table in PostgreSQL 9.2
Best way to populate a new column in a large table?
Optimizing bulk update performance in PostgreSQL
While creating a new table you might also order columns in an optimized fashion:
Calculating and saving space in PostgreSQL
Maybe I'm misreading the question, but as far as I know, you have 2 possibilities for creating a table with the extra columns:
CREATE TABLE
This would create a new table; filling it could be done using
CREATE TABLE .. AS SELECT.. to fill it on creation, or
a separate INSERT...SELECT... later on.
Neither variant is what you seem to want, since you described a solution that does not list all the fields.
Also, this would require all data (plus the new fields) to be copied.
ALTER TABLE...ADD ...
This creates the new columns. As I'm not aware of any way to reference existing column values while doing so, you will need an additional UPDATE ..SET... to fill in the values.
So, I'm not seeing any way to implement a procedure that follows your choice 1.
Nevertheless, copying the (column) data just to overwrite it in a second step would be suboptimal in any case. Altering a table to add new columns does minimal I/O. So even if there were a way to carry out your choice 1, choice 2 promises better performance by a large factor.
Thus, two statements will achieve what you want: one ALTER TABLE adding all your new columns in one go, and then an UPDATE providing the new values for these columns.
Create a new column (modified_color); it will have a value of NULL or blank on all records.
Then run an update statement, assuming your table name is 'Table':
update table
set modified_color = 'blue_green'
where initial_color = 'blue'
if I am correct this can also work like this
update table set modified_color = 'blue_green' where initial_color = 'blue';
update table set modified_color = 'red_orange' where initial_color = 'red';
update table set modified_color = 'yellow_brown' where initial_color = 'yellow';
Once you have done this, you can do another update (assuming you have another column that I will call modified_color1):
update table set modified_color1 = modified_color;
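As a variation that the answer above does not itself propose, the per-value statements can be folded into one UPDATE with a CASE expression. The sketch below runs against SQLite through Python's sqlite3 module; the table name colors is an assumption (a table literally named table would need quoting), and the extra 'green' row is added only to show what happens to unmapped values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE colors (               -- hypothetical table name
        id             INTEGER PRIMARY KEY,
        initial_color  TEXT,
        modified_color TEXT
    );
    INSERT INTO colors (id, initial_color) VALUES
        (1, 'blue'), (2, 'red'), (3, 'yellow'), (4, 'green');
    """
)

# One pass over the table; values without a mapping keep their current
# modified_color (still NULL here).
conn.execute(
    """
    UPDATE colors
    SET modified_color = CASE initial_color
        WHEN 'blue'   THEN 'blue_green'
        WHEN 'red'    THEN 'red_orange'
        WHEN 'yellow' THEN 'yellow_brown'
        ELSE modified_color
    END
    """
)
result = conn.execute(
    "SELECT id, modified_color FROM colors ORDER BY id"
).fetchall()
print(result)
```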

Update table with random data

This is an MS Access 2010 related question.
Is it possible to update the (empty) field of an existing table with data that have no connection with those already in the other fields of the table?
Assuming I have a field called "Letters" with 3 records (A, B and C). How can I update the field "Numbers" with 1, 2 and 3?
There is no connection at all between the values. I just want them to be in the same table. This should be done for large records and multiple fields.
Basically I have to hold some reference data in the database and I would prefer to have it all in a single table.
The query below retrieves the right number of records, but they are all empty...
UPDATE tblDestination SET tblDestination.Numbers = [tblSource].[Numb];

Deleting record from sql table and update sql table

I am trying to delete a record in SQLite. I have four records (record1, record2, record3, record4)
with id as the primary key,
so it auto-increments for each record that I insert. Now when I delete record3, the primary key does not decrement. What should I do to decrement the id based on the records that I delete?
I want the ids to be 1,2,3 after I delete record3 from the database; currently they are 1,2,4. Is there any SQL query to change this? I tried this one:
DELETE FROM TABLE_NAME WHERE name = ?
Note: I am implementing in xcode
I don't know why you want this but I would recommend leaving these IDs as is.
What is wrong with having IDs as 1,2,4?
Also you can potentially break things (referential integrity) if you use these ID values as foreign keys somewhere else.
Also, please refer to this page to get a better understanding of how autoincrement fields work:
http://sqlite.org/autoinc.html
The purpose of autoincrement is always to create a new unique ID, not to fill the gaps created by deleting records.
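The behaviour is easy to reproduce. The following sketch uses Python's sqlite3 module rather than the asker's Xcode setup, and the table name records is an assumption; the column is declared with AUTOINCREMENT so that SQLite never reuses an id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records ("            # hypothetical table name
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " name TEXT NOT NULL"
    ")"
)
conn.executemany(
    "INSERT INTO records (name) VALUES (?)",
    [("record1",), ("record2",), ("record3",), ("record4",)],
)

# Deleting a row leaves a gap; the remaining ids are not renumbered.
conn.execute("DELETE FROM records WHERE name = ?", ("record3",))
ids_after_delete = [r[0] for r in conn.execute("SELECT id FROM records ORDER BY id")]

# The next insert gets a fresh id rather than filling the gap.
conn.execute("INSERT INTO records (name) VALUES (?)", ("record5",))
ids_after_insert = [r[0] for r in conn.execute("SELECT id FROM records ORDER BY id")]

print(ids_after_delete)  # [1, 2, 4]
print(ids_after_insert)  # [1, 2, 4, 5]
```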
EDIT
You can achieve this with a special table design: records are never deleted, only marked as deleted in a field "del".
For example, a "select ... where del > 0" will find all active records.
Or omit the "where" to get all the records; the IDs then remain unaffected. When looping through the resulting array, skip entries with "if del = 0 continue". Thus, the array is always in consecutive order.
It's very flexible. Depending on the select, you get:
all active records
all the deleted records
all records