How to transform SQL table of data into table of pointers? - sql

Let's say I have an SQL table of tuples (id, lastname, firstname, occupation) where all values are strings (ok, obviously id is a key).
I want to transform it to a table with tuples (id, lastid, firstid, occupid), where I only keep pointers to other tables that contain the actual values. I apologize if the example domain of names and occupations is not the best suited for this operation.
Obviously to create the other tables that will hold the data, I need to get all last names, unique, and insert them into a new table with an auto-generated key. The same with first names and occupations.
Having done that, is there a single transformation that can generate the new table containing the pointers (er, foreign keys) ?
Implementation uses SQLite, if it matters.

Assuming your tables for last/first names and occupations are Lnt, Fnt and Occ, each with just two columns, an id field and a value:
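-- Include id so that each existing row is replaced instead of a new row being added.
REPLACE INTO TheTable (id, last, first, occup)
SELECT TheTable.id, Lnt.id, Fnt.id, Occ.id
FROM TheTable
JOIN Lnt ON (last=Lnt.value)
JOIN Fnt ON (first=Fnt.value)
JOIN Occ ON (occup=Occ.value)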
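The whole transformation can be tried end to end with Python's built-in sqlite3 module. The following is only a sketch under assumptions: the sample rows are made up, NewTable is a hypothetical name for the table of foreign keys, and the result is written to a separate table rather than replacing rows in place, so the original data stays available for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Original table of strings (column names follow the answer above; rows are made up).
    CREATE TABLE TheTable (id INTEGER PRIMARY KEY, last TEXT, first TEXT, occup TEXT);
    INSERT INTO TheTable VALUES
        (1, 'Smith', 'Anna', 'baker'),
        (2, 'Smith', 'Bob',  'pilot'),
        (3, 'Jones', 'Anna', 'baker');

    -- One lookup table per column, filled with the distinct values.
    CREATE TABLE Lnt (id INTEGER PRIMARY KEY, value TEXT UNIQUE);
    CREATE TABLE Fnt (id INTEGER PRIMARY KEY, value TEXT UNIQUE);
    CREATE TABLE Occ (id INTEGER PRIMARY KEY, value TEXT UNIQUE);
    INSERT INTO Lnt (value) SELECT DISTINCT last  FROM TheTable;
    INSERT INTO Fnt (value) SELECT DISTINCT first FROM TheTable;
    INSERT INTO Occ (value) SELECT DISTINCT occup FROM TheTable;

    -- New table holding only foreign keys (NewTable is a hypothetical name).
    CREATE TABLE NewTable (
        id      INTEGER PRIMARY KEY,
        lastid  INTEGER REFERENCES Lnt(id),
        firstid INTEGER REFERENCES Fnt(id),
        occupid INTEGER REFERENCES Occ(id)
    );
    INSERT INTO NewTable (id, lastid, firstid, occupid)
    SELECT t.id, Lnt.id, Fnt.id, Occ.id
    FROM TheTable AS t
    JOIN Lnt ON t.last  = Lnt.value
    JOIN Fnt ON t.first = Fnt.value
    JOIN Occ ON t.occup = Occ.value;
""")

# Joining back through the lookup tables should reproduce the original rows.
restored = conn.execute("""
    SELECT n.id, Lnt.value, Fnt.value, Occ.value
    FROM NewTable AS n
    JOIN Lnt ON n.lastid  = Lnt.id
    JOIN Fnt ON n.firstid = Fnt.id
    JOIN Occ ON n.occupid = Occ.id
    ORDER BY n.id
""").fetchall()
original = conn.execute(
    "SELECT id, last, first, occup FROM TheTable ORDER BY id"
).fetchall()
```

Note that rows with a NULL in any of the three columns would be dropped by the inner joins; a LEFT JOIN would keep them with a NULL foreign key.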

Related

SQL Assign Unique number to each unique value in a column

I have a table in Snowflake with people's names and other attributes. To simplify, it looks like the table below.
How can I add a new column with assigned unique number to each person directly to the table using SQL?
The ideal result is like below
Use dense_rank():
select name, dense_rank() over (order by name) as uniquenum
from t;
You can use this logic in an update, but the exact syntax depends on the database.
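As one illustration of the update form, here is a sketch that uses SQLite through Python's sqlite3 module instead of Snowflake, so the syntax is an assumption about a different database. It needs SQLite 3.25 or later for window functions, and the table t with its sample names is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Made-up sample data; uniquenum starts out empty.
    CREATE TABLE t (name TEXT, uniquenum INTEGER);
    INSERT INTO t (name) VALUES ('Bob'), ('Alice'), ('Bob'), ('Carol'), ('Alice');

    -- Compute one rank per distinct name, then copy it onto every matching row.
    UPDATE t
    SET uniquenum = (
        SELECT r.uniquenum
        FROM (
            SELECT DISTINCT name, dense_rank() OVER (ORDER BY name) AS uniquenum
            FROM t
        ) AS r
        WHERE r.name = t.name
    );
""")
rows = conn.execute("SELECT name, uniquenum FROM t ORDER BY rowid").fetchall()
```

Every row with the same name ends up with the same number, and the numbers have no gaps because dense_rank() is used instead of rank().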

Row Stores vs Column Stores

Assuming that the database is already populated with data, and that each of the following SQL statements is the one and only query that an application will perform, why is it better to use row-wise or column-wise record storage for the following queries?...
1) SELECT * FROM Person
2) SELECT * FROM Person WHERE id=5
3) SELECT AVG(YEAR(DateOfBirth)) FROM Person
4) INSERT INTO Person (ID,DateOfBirth,Name,Surname) VALUES(2e25,'1990-05-01','Ute','Muller')
In those examples Person.id is the primary key.
The article Row Store and Column Store Databases gives a general discussion on this, but I am specifically concerned about the four queries above.
SELECT * FROM ... queries are better served by row stores, since a column store has to access numerous files (one per column) to assemble each row.
Column stores are good for aggregation over large volumes of data, or when you have queries that only need a few fields from a wide table.
Therefore:
1st query: row-wise
2nd query: row-wise
3rd query: column-wise
4th query: row-wise
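The reasoning behind that list can be sketched with two in-memory layouts in Python. This is only a toy model with made-up sample records, not how any particular database stores data: rows keeps each record together, while columns keeps each attribute together.

```python
from datetime import date

# Row-wise layout: each record is stored together (made-up sample data).
rows = [
    (1, date(1985, 3, 14), "Anna", "Schmidt"),
    (5, date(1990, 5, 1), "Ute", "Muller"),
    (7, date(1978, 11, 30), "Jan", "Becker"),
]

# Column-wise layout: each attribute is stored together.
columns = {
    "ID": [r[0] for r in rows],
    "DateOfBirth": [r[1] for r in rows],
    "Name": [r[2] for r in rows],
    "Surname": [r[3] for r in rows],
}

# Query 2 (SELECT * ... WHERE id=5): the row layout returns one stored record,
person_row = next(r for r in rows if r[0] == 5)
# while the column layout has to visit every column to rebuild it.
i = columns["ID"].index(5)
person_col = tuple(columns[c][i] for c in ("ID", "DateOfBirth", "Name", "Surname"))

# Query 3 (AVG(YEAR(DateOfBirth))): the column layout touches a single column.
years = [d.year for d in columns["DateOfBirth"]]
avg_year = sum(years) / len(years)

# Query 4 (INSERT): the row layout appends one record,
new_person = (9, date(2000, 1, 1), "Eva", "Koch")
rows.append(new_person)
# while the column layout appends to every column.
for column_name, value in zip(columns, new_person):
    columns[column_name].append(value)
```

Queries 1, 2 and 4 touch whole records, which the row layout stores contiguously; query 3 reads a single attribute, which the column layout stores contiguously.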
I have no idea what you are asking. You have this statement:
INSERT INTO Person (ID, DateOfBirth, Name, Surname)
VALUES('2e25', '1990-05-01', 'Ute', 'Muller');
This suggests that you have a table with four columns, one of which is an id. Each person is stored in their own row.
You then have three queries. The first cannot be optimized. The second is optimized, assuming that id is a primary key (a reasonable assumption). The third requires a full table scan -- although that could be ameliorated with an index only on DateOfBirth.
If the data is already in this format, why would you want to change it?
This is a very simple data structure. Three of your four query examples access all columns. I see no reason why you would not use a regular row-store table structure.

SQL inserting rows from multiple tables

I have an assignment. We have been given a table, MAIN_TABLE, which has a column patient_id as a foreign key.
I need to make a separate table named patient which has patient_id as a primary key along with some other attributes such as name and address.
I successfully created the schema of this table. Now I am facing a serious problem. After creating the table, I used an insert statement to insert values for name and address from a dummy table.
Up to this point everything works fine. However, the column patient_id is still empty; or rather, I have set it to 0 by default.
Now the problem is that I need to get values into this column, patient_id, from the patient_id column of MAIN_TABLE.
I can't figure out how to do this. I tried:
UPDATE patient
SET patient_id=(select id from MAIN_TABLE)
But this gives me an error that multiple rows were returned, which does make sense, but what condition do I put in the WHERE clause then?
That sounds strange. How can there be a table MAIN_TABLE with a foreign key patient_id when the master table patient does not exist? Where do the patient_ids in MAIN_TABLE come from?
I suggest not inserting your data from the dummy table alone and then trying to update it. Instead, insert it from both MAIN_TABLE and the dummy table, joined. If you cannot join them for the insert, you will not be able to match them during the update either.
Since I think they have no connected primary/foreign keys, the only way to join them is with a good business key. Do you have a good business key?
You are talking about persons, so first name, last name, birthday and address are often good enough. But you have to think about it.
With the data you have given I can only offer a kind of meta insert statement, but you will get the point.
Example:
insert into patient (col1, col2, col3)
select
a.colA,
a.colF,
b.colX
from
dummy_table a
inner join MAIN_TABLE b on a.colN=b.colA and a.colM=b.colB
And: if patient_id is your primary key in patient, you should ensure that duplicate or null values are not possible in this column. You should use constraints to ensure your data integrity.
http://docs.oracle.com/cd/B19306_01/server.102/b14200/clauses002.htm
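To make the meta insert statement concrete, here is a sketch using SQLite through Python's sqlite3 module. Everything about the table layouts is an assumption, since the question does not list the columns: first_name, last_name, birth_day and address are hypothetical, and the business key is taken to be first name, last name and birthday.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical layouts: the original post does not list the columns.
    CREATE TABLE MAIN_TABLE (patient_id INTEGER, first_name TEXT, last_name TEXT, birth_day TEXT);
    CREATE TABLE dummy_table (first_name TEXT, last_name TEXT, birth_day TEXT, address TEXT);
    CREATE TABLE patient (
        patient_id INTEGER PRIMARY KEY,   -- the constraint rejects duplicate ids
        name       TEXT NOT NULL,
        address    TEXT
    );

    INSERT INTO MAIN_TABLE VALUES
        (10, 'Ann', 'Lee', '1980-01-02'),
        (10, 'Ann', 'Lee', '1980-01-02'),   -- same patient referenced twice
        (11, 'Raj', 'Rao', '1975-07-09');
    INSERT INTO dummy_table VALUES
        ('Ann', 'Lee', '1980-01-02', '1 Elm St'),
        ('Raj', 'Rao', '1975-07-09', '2 Oak Ave');

    -- Insert the id and the attributes together, joined on the business key.
    INSERT INTO patient (patient_id, name, address)
    SELECT DISTINCT b.patient_id, a.first_name || ' ' || a.last_name, a.address
    FROM dummy_table AS a
    INNER JOIN MAIN_TABLE AS b
        ON  a.first_name = b.first_name
        AND a.last_name  = b.last_name
        AND a.birth_day  = b.birth_day;
""")
patients = conn.execute(
    "SELECT patient_id, name, address FROM patient ORDER BY patient_id"
).fetchall()
```

DISTINCT collapses the repeated references in MAIN_TABLE, and the primary key on patient_id would make the insert fail if the business key ever matched one id to two different people.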

Move two tables with the same IDs in data between databases

I have two databases with the same structure, but different data.
In both databases all data have auto-increment IDs in the tables 'Tiles' and 'TilesData', and these IDs are related keys. I have to move rows from the first database to the other, but there is an exception with the id.
Is the description of my problem clear?
I tried this (but I'm a little worried about the reliability of this solution, as there will be a few million rows):
INSERT INTO DataBase_2.Tiles (X,Y,Zoom,Type,CacheTime)
SELECT X, Y, Zoom, Type, CacheTime FROM DataBase_1.Tiles;
INSERT INTO DataBase_2.TilesData(Tile)
SELECT Tile FROM DataBase_1.TilesData;
Could you help me or give me some tips? Is plain SQL enough?
When the autoincrementing columns are declared as INTEGER PRIMARY KEY (without AUTOINCREMENT), then new IDs will get the next value after the largest value already in the table.
Check if both tables have the same maximum ID value:
SELECT MAX(id) FROM DataBase_2.Tiles;
SELECT MAX(id) FROM DataBase_2.TilesData;
If they are equal, then corresponding rows will indeed get the same new ID. However, you should use ORDER BY to ensure that the rows are read/inserted in the same order:
INSERT INTO DataBase_2.Tiles (...)
SELECT ... FROM DataBase_1.Tiles ORDER BY id;
INSERT INTO DataBase_2.TilesData(Tile)
SELECT Tile FROM DataBase_1.TilesData ORDER BY id;
If the id columns are declared with AUTOINCREMENT, then you have to check (and if needed, adjust) the next ID values in the sqlite_sequence table.
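The check and the ordered copy can be sketched with Python's sqlite3 module. This is a minimal illustration under assumptions: both databases are attached in-memory databases, the id columns are declared INTEGER PRIMARY KEY without AUTOINCREMENT, and the column types and sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    ATTACH DATABASE ':memory:' AS DataBase_1;
    ATTACH DATABASE ':memory:' AS DataBase_2;

    -- Same structure in both databases (column types are assumptions).
    CREATE TABLE DataBase_1.Tiles (id INTEGER PRIMARY KEY, X INTEGER, Y INTEGER,
                                   Zoom INTEGER, Type INTEGER, CacheTime TEXT);
    CREATE TABLE DataBase_1.TilesData (id INTEGER PRIMARY KEY, Tile TEXT);
    CREATE TABLE DataBase_2.Tiles (id INTEGER PRIMARY KEY, X INTEGER, Y INTEGER,
                                   Zoom INTEGER, Type INTEGER, CacheTime TEXT);
    CREATE TABLE DataBase_2.TilesData (id INTEGER PRIMARY KEY, Tile TEXT);

    -- Source rows with related ids 5, 9 and 12.
    INSERT INTO DataBase_1.Tiles VALUES
        (5, 100, 200, 3, 1, '2014-01-01'),
        (9, 101, 200, 3, 1, '2014-01-01'),
        (12, 102, 200, 3, 1, '2014-01-01');
    INSERT INTO DataBase_1.TilesData VALUES (5, 'tile-100'), (9, 'tile-101'), (12, 'tile-102');

    -- The target already holds ids 1 and 2 in both tables.
    INSERT INTO DataBase_2.Tiles VALUES
        (1, 1, 1, 1, 1, '2013-01-01'),
        (2, 2, 2, 1, 1, '2013-01-01');
    INSERT INTO DataBase_2.TilesData VALUES (1, 'old-1'), (2, 'old-2');
""")

# Both target tables must start from the same maximum id for the new ids to line up.
max_tiles = conn.execute("SELECT MAX(id) FROM DataBase_2.Tiles").fetchone()[0]
max_data = conn.execute("SELECT MAX(id) FROM DataBase_2.TilesData").fetchone()[0]
if max_tiles != max_data:
    raise RuntimeError("maximum ids differ; new ids would not match")

with conn:  # copy both tables in one transaction, reading in the same order
    conn.execute("""
        INSERT INTO DataBase_2.Tiles (X, Y, Zoom, Type, CacheTime)
        SELECT X, Y, Zoom, Type, CacheTime FROM DataBase_1.Tiles ORDER BY id
    """)
    conn.execute("""
        INSERT INTO DataBase_2.TilesData (Tile)
        SELECT Tile FROM DataBase_1.TilesData ORDER BY id
    """)

# The copied rows should still pair each tile with its own data.
copied = conn.execute("""
    SELECT t.id, t.X, d.Tile
    FROM DataBase_2.Tiles AS t
    JOIN DataBase_2.TilesData AS d ON d.id = t.id
    WHERE t.id > ?
    ORDER BY t.id
""", (max_tiles,)).fetchall()
```

Running both inserts inside one transaction means a failure part-way through leaves the target database unchanged, which matters when a few million rows are involved.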

Linking or Mapping two tables together

Consider my data as an inventory list separated by categories.
When I started I had one table that should have been split into two tables; otherwise the columns in a given row of the oldTable would have been unrelated. I have created two new tables in my database, one for categories and the other for data/items. Now I am trying to use the existing oldTable data to fill the new data/items table, so I can learn SQL and not have to do it manually. I filled in the categories table manually because I could not see how to do it otherwise.
The old table has:
tableName (
id,
categoryA,
categoryB,
categoryC,
categoryD,
categoryE,
categoryF,
isPriorityA,
isPriorityB,
isPriorityC,
isPriorityD,
isPriorityE,
isPriorityF
)
The new tables have:
Categories (
cat_id,
name
)
dataItem (
item_id,
cat_id,
name,
priority,
description,
URL
)
How do I force the new dataItem table to require the cat_id match one of the values in the Categories.cat_id table column? Perhaps to give an error if a value is added outside of the range? I believe this may be mapping or linking tables, to thereby make them relationship tables.
How do I copy the tableName data to the dataItem table one column at a time in alphabetical order bringing the name,priority with it and allowing it to auto-increment the item_id value?
It sounds like you want to use a foreign key to limit dataItem.cat_id to values in Categories.cat_id. Something like this:
ALTER TABLE dataItem ADD FOREIGN KEY (cat_id) REFERENCES Categories(cat_id);
Exact syntax may depend on which database you are using. For more info on foreign keys see: http://www.w3schools.com/sql/sql_foreignkey.asp
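As an example of how the syntax differs, here is a sketch for SQLite using Python's sqlite3 module; the choice of database is an assumption, since the question does not name one. SQLite does not accept the ALTER TABLE ... ADD FOREIGN KEY form, so the constraint is declared in CREATE TABLE, and enforcement has to be switched on per connection. The column types and sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE Categories (
        cat_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL
    );
    CREATE TABLE dataItem (
        item_id     INTEGER PRIMARY KEY,   -- auto-assigned when omitted from an insert
        cat_id      INTEGER NOT NULL REFERENCES Categories(cat_id),
        name        TEXT,
        priority    INTEGER,
        description TEXT,
        URL         TEXT
    );
    INSERT INTO Categories (cat_id, name) VALUES (1, 'categoryA');
""")

# A row pointing at an existing category is accepted.
conn.execute("INSERT INTO dataItem (cat_id, name, priority) VALUES (1, 'widget', 1)")

# A row pointing at a missing category raises an error.
try:
    conn.execute("INSERT INTO dataItem (cat_id, name, priority) VALUES (99, 'orphan', 0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

item_count = conn.execute("SELECT COUNT(*) FROM dataItem").fetchone()[0]
```

With the constraint in place, the database itself reports an error for any cat_id outside Categories, which is the behaviour the question asks for.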