Merging multiple databases with relations in SQLite - sql

I have a couple of SQLite files (databases) with a rather simple structure like:
Category:
Id | Category
---+---------
 1 | A
 2 | B
Name (CatId references Category.Id; one Category has many Names):
Id | CatId | Name
---+-------+-----
 1 |     2 | A
 2 |     1 | B
 3 |     2 | BC
So as you can see, there is a Category table which is related to the Name table. The problem is that I have several such files and I want to merge them into one while keeping the relations.
Merging the first table is simple:
ATTACH DATABASE '{databaseFilePath}' AS Db;
BEGIN;
INSERT INTO Category (Category) SELECT Category FROM Db.Category;
COMMIT;
DETACH DATABASE Db;
But this changes my ids (Id is set to autoincrement, because the separate db files can contain the same ids). I can do the same for the second table with the names, but the problem is keeping the relation intact, since the primary keys have changed. Is there any rational way to do this?
Here are the CREATE TABLE statements:
CREATE TABLE Category (Id INTEGER PRIMARY KEY NOT NULL UNIQUE, Category STRING);
INSERT INTO Category (Category, Id) VALUES ('B', 2), ('A', 1);
CREATE TABLE Name (Id INTEGER PRIMARY KEY UNIQUE NOT NULL,
    CatId INTEGER REFERENCES Category (Id) ON DELETE CASCADE ON UPDATE CASCADE MATCH SIMPLE,
    Name STRING);
INSERT INTO Name (Name, CatId, Id) VALUES ('A', 1, 1), ('AB', 1, 3), ('B', 2, 2);

I believe that you could base it on the following (instead of attaching the database, a 2 has been appended to the second set of table names for convenience; additionally, the data has been prefixed with C2 for the second set of tables) :-
DROP TABLE IF EXISTS Name;
DROP TABLE IF EXISTS Name2;
DROP TABLE IF EXISTS Category;
DROP TABLE IF EXISTS Category2;
CREATE TABLE Category (Id INTEGER PRIMARY KEY NOT NULL UNIQUE, Category STRING);
INSERT INTO Category (Category, Id) VALUES ('B', 2), ('A', 1);
CREATE TABLE Name (Id INTEGER PRIMARY KEY UNIQUE NOT NULL,
    CatId INTEGER REFERENCES Category (Id) ON DELETE CASCADE ON UPDATE CASCADE MATCH SIMPLE,
    Name STRING);
INSERT INTO Name (Name, CatId, Id) VALUES ('A', 1, 1), ('AB', 1, 3), ('B', 2, 2);
CREATE TABLE Category2 (Id INTEGER PRIMARY KEY NOT NULL UNIQUE, Category STRING);
INSERT INTO Category2 (Category, Id) VALUES ('C2B', 2), ('C2A', 1);
CREATE TABLE Name2 (Id INTEGER PRIMARY KEY UNIQUE NOT NULL,
    CatId INTEGER REFERENCES Category2 (Id) ON DELETE CASCADE ON UPDATE CASCADE MATCH SIMPLE,
    Name STRING);
INSERT INTO Name2 (Name, CatId, Id) VALUES ('C2A', 1, 1), ('C2AB', 1, 3), ('C2B', 2, 2);
UPDATE Category2 SET id = id + (Max((SELECT Max(id) FROM Category), (SELECT Max(id) FROM Category2)));
UPDATE Name2 SET id = id + (Max((SELECT Max(id) FROM Name), (SELECT Max(id) FROM Name2)));
SELECT * FROM Category2;
SELECT * FROM Name2;
INSERT INTO Category SELECT * FROM Category2 WHERE 1;
INSERT INTO Name SELECT * FROM Name2 WHERE 1;
SELECT * FROM Category;
SELECT * FROM Name;
Note you mention AUTOINCREMENT but haven't included it, so checking for the highest sqlite_sequence value hasn't been included.
The above relies upon ON UPDATE CASCADE to cascade the increase of Category.Id down to Name.CatId.
It works by finding the highest id across both tables that share a schema, then updating the ids of the table to be merged by adding that highest id to every row's id. When the table being updated is a Category table, the updated ids are cascaded down to the respective Name table.
The process is performed for both the pair of Category tables and the pair of Name tables.
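For reference, the AUTOINCREMENT check mentioned above would look something like this (a sketch; sqlite_sequence only holds rows for tables declared with AUTOINCREMENT):
SELECT seq FROM sqlite_sequence WHERE name = 'Category';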
The result (per the last two queries) is the merged Category and Name tables with the relations intact.
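Translating the same idea back to the original ATTACH-based workflow might look like the sketch below. This is my adaptation, not tested against the asker's files; it assumes both databases share the schema above, requires foreign keys to be enabled so the cascade fires, and rewrites the ids inside the attached file, so work on a copy:
PRAGMA foreign_keys = ON; -- cascades only run while foreign key support is on
ATTACH DATABASE '{databaseFilePath}' AS Db;
BEGIN;
-- Shift the incoming Category ids past the highest id in either database;
-- ON UPDATE CASCADE rewrites Db.Name.CatId to follow.
UPDATE Db.Category SET Id = Id + (Max((SELECT Max(Id) FROM main.Category), (SELECT Max(Id) FROM Db.Category)));
-- Shift the incoming Name ids the same way so they cannot collide either.
UPDATE Db.Name SET Id = Id + (Max((SELECT Max(Id) FROM main.Name), (SELECT Max(Id) FROM Db.Name)));
INSERT INTO main.Category SELECT * FROM Db.Category;
INSERT INTO main.Name SELECT * FROM Db.Name;
COMMIT;
DETACH DATABASE Db;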

Make sure no two rows contain identical values in PostgreSQL

I have a table and I want to make sure that no two rows can be alike.
So, for example, this table would be valid:
user_id | printer
---------+-------------
1 | LaserWriter
4 | LaserWriter
1 | ThinkJet
2 | DeskJet
But this table would not be:
user_id | printer
---------+-------------
1 | LaserWriter
4 | LaserWriter
1 | ThinkJet <--error (duplicate row)
2 | DeskJet
1 | ThinkJet <--error (duplicate row)
This is because the last table has two instances of 1 | ThinkJet.
So, user_id can be repeated (e.g. 1) and printer can be repeated (e.g. LaserWriter), but once a record like 1 | ThinkJet has been entered, that combination cannot be entered again.
How can I prevent such occurrences in a Postgresql 11.5 table?
I would try experimenting with SQL code but alas I am still new on the matter.
Please note this is for INSERTing data into the table, not SELECTing it. Like a constraint iirc.
Thanks
Here's your script
ALTER TABLE tableA ADD CONSTRAINT some_constraint PRIMARY KEY(user_id,printer);
INSERT INTO tableA(user_id, printer)
VALUES
(
1,
'LaserWriter'
)
ON CONFLICT (user_id, printer)
DO NOTHING;
You can use DISTINCT to filter duplicates out of query results. For example:
SELECT DISTINCT user_id, printer FROM my_table;
Note that this only de-duplicates what you read back; it does not stop duplicate rows from being inserted.
That's all. Hope it helps!
You need a series of steps (assuming there is no already assigned unique key).
Add a temporary column to make each row unique.
Assign a value to the new columns.
Remove the already existing duplicates.
Create a Unique or Primary Key on the composite columns.
Remove the temporary column.
alter table your_table add temp_unique integer unique;
do $$
declare
    row_num integer := 1;
    c_assign cursor for
        select temp_unique
        from your_table
        for update;
begin
    for rec in c_assign
    loop
        update your_table
        set temp_unique = row_num
        where current of c_assign;
        row_num := row_num + 1;
    end loop;
end;
$$;
delete from your_table ytd
where exists ( select 1
               from your_table ytk
               where ytd.user_id = ytk.user_id
               and ytd.printer = ytk.printer
               and ytd.temp_unique > ytk.temp_unique
             );
alter table your_table add constraint id_prt_uk unique (user_id, printer);
alter table your_table drop column temp_unique;
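As an aside, PostgreSQL's system column ctid can also tell otherwise-identical rows apart, so the de-duplication could arguably be done without the temporary column (a sketch, not part of the original steps; ctid values are only stable until the next VACUUM FULL or row update, so use this transiently):
delete from your_table a
using your_table b
where a.user_id = b.user_id
and a.printer = b.printer
and a.ctid > b.ctid; -- keeps one row per (user_id, printer)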
I found the answer. When creating the table I needed to specify the two columns as UNIQUE. Observe:
CREATE TABLE foo (user_id INT, printer VARCHAR(20), UNIQUE (user_id, printer));
Now, here are my results:
=# INSERT INTO foo VALUES (1, 'LaserWriter');
INSERT 0 1
=# INSERT INTO foo VALUES (4, 'LaserWriter');
INSERT 0 1
=# INSERT INTO foo VALUES (1, 'ThinkJet');
INSERT 0 1
=# INSERT INTO foo VALUES (2, 'DeskJet');
INSERT 0 1
=# INSERT INTO foo VALUES (1, 'ThinkJet');
ERROR: duplicate key value violates unique constraint "foo_user_id_printer_key"
DETAIL: Key (user_id, printer)=(1, ThinkJet) already exists.
=# SELECT * FROM foo;
user_id | printer
---------+-------------
1 | LaserWriter
4 | LaserWriter
1 | ThinkJet
2 | DeskJet
(4 rows)

Create primary key with two columns

I have two tables, bank_data and sec_data. Table bank_data has the columns id, date, asset, and liability. The date column is divided into quarters.
id | date       | asset  | liability
---+------------+--------+----------
 1 | 6/30/2001  | 333860 | 308524
 1 | 3/31/2001  | 336896 | 311865
 1 | 9/30/2001  | 349343 | 308524
 1 | 12/31/2002 | 353863 | 322659
 2 | 6/30/2001  | 451297 | 425156
 2 | 3/31/2001  | 411421 | 391846
 2 | 9/30/2001  | 430178 | 41356
 2 | 12/31/2002 | 481687 | 46589
 3 | 6/30/2001  | 106506 | 104532
 3 | 3/31/2001  | 104196 | 102983
 3 | 9/30/2001  | 106383 | 104865
 3 | 12/31/2002 | 107654 | 105867
Table sec_data has columns of id, date, and security. I combined the two tables into a new table named new_table in R using this code:
dbGetQuery(con, "CREATE TABLE new_table
AS (SELECT sec_data.id,
bank_data.date,
bank_data.asset,
bank_data.liability,
sec_data.security
FROM bank_data,bank_sec
WHERE (bank_data.id = sec_data.id) AND
(bank_data.date = sec_data.date)")
I would like to set two primary keys (id and date) in this R code without using pgAdmin. I want to use something like Constraint bankkey Primary Key (id, date) but the AS and SELECT functions are throwing me off.
First, your query is wrong: you say table sec_data but you reference bank_sec. Here is your query rephrased:
CREATE TABLE new_table AS
SELECT
sec_data.id,
bank_data.date,
bank_data.asset,
bank_data.liability,
sec_data.security
FROM bank_data
INNER JOIN sec_data on bank_data.id = sec_data.id
and bank_data.date = sec_data.date
Avoid using an implicit join and use an explicit JOIN instead. And, as stated by @a_horse_with_no_name, you can't define more than one primary key on a table, so what you want is a composite primary key.
Definition:
a combination of two or more columns in a table that can be used to uniquely identify each row in the table
So you need to use ALTER TABLE, because your CREATE statement is based on another table:
ALTER TABLE new_table
ADD PRIMARY KEY (id, date);
You may also run these as two separate statements (CREATE TABLE, then INSERT INTO):
CREATE TABLE new_table (
    id int, date date, asset int, liability int, security int,
    CONSTRAINT bankkey PRIMARY KEY (id, date)
);
INSERT INTO new_table (id, date, asset, liability, security)
SELECT s.id,
       b.date,
       b.asset,
       b.liability,
       s.security
FROM bank_data b JOIN sec_data s
  ON b.id = s.id AND b.date = s.date;
To create the primary key you desire, run the following SQL statement after your CREATE TABLE ... AS statement:
ALTER TABLE new_table
ADD CONSTRAINT bankkey PRIMARY KEY (id, date);
That has the advantage that the primary key index won't slow down the data insertion.

PostgreSQL: Select field value and update with next value from second table

CREATE TABLE t_a
(
    a_id SERIAL PRIMARY KEY,
    str VARCHAR(50)
);
CREATE TABLE t_b
(
    b_id SERIAL PRIMARY KEY,
    a_id_fk INTEGER REFERENCES t_a(a_id)
);
Using the above tables, I want to SELECT a_id_fk FROM t_b WHERE b_id = 1 and then update a_id_fk with the next a_id in the sequence, but if I'm at the end of the available a_ids, cycle back to the first one. All this with multiple people querying/updating that specific row of t_b.
If it helps, the scenario I'm working on is that multiple sites share a common list of words, and as each user of a site grabs a word, that site's index within the word list moves to the next word, wrapping back to the beginning when it hits the end.
Is there a way to do this in a single query? If not, what would be the best way to handle this? I can handle most of the logic; it's looping back when I run out of ids that has me stumped.
You could use something complicated like
UPDATE t_b
SET a_id_fk = COALESCE(
(SELECT MIN(a_id) FROM t_a WHERE a_id > t_b.a_id_fk),
(SELECT MIN(a_id) FROM t_a))
WHERE b_id = :b_id
but if I was given a requirement like that I'd probably maintain an auxiliary table that maps an a_id to the next a_id in the cycle...
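A sketch of that auxiliary-table idea (my illustration rather than the answerer's code; t_a_next is a made-up name):
CREATE TABLE t_a_next
( a_id    INTEGER PRIMARY KEY REFERENCES t_a(a_id)
, next_id INTEGER NOT NULL REFERENCES t_a(a_id)
);
-- Map each id to its successor; the largest id wraps around to the smallest.
INSERT INTO t_a_next (a_id, next_id)
SELECT a_id
     , COALESCE(lead(a_id) OVER (ORDER BY a_id),
                min(a_id) OVER ()) AS next_id
FROM t_a;
-- Advancing a row of t_b is then a single join:
UPDATE t_b
SET a_id_fk = nxt.next_id
FROM t_a_next nxt
WHERE nxt.a_id = t_b.a_id_fk
AND t_b.b_id = 1;
The trade-off is that the mapping must be refreshed whenever rows are added to or removed from t_a.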
This one is a bit more elegant (IMHO) than @pdw's solution:
DROP SCHEMA tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;
CREATE TABLE t_a
( a_id SERIAL PRIMARY KEY
, str VARCHAR(50)
);
CREATE TABLE t_b
( b_id SERIAL PRIMARY KEY
, a_id_fk INTEGER REFERENCES t_a(a_id)
);
INSERT INTO t_a(str)
SELECT 'Str_' || gs::text
FROM generate_series(1,10) gs
;
INSERT into t_b(a_id_fk)
SELECT a_id FROM t_a
ORDER BY a_id
;
-- EXPLAIN ANALYZE
WITH src AS (
SELECT a_id AS a_id
, min(a_id) OVER (order BY a_id) AS frst
, lead(a_id) OVER (order BY a_id) AS nxt
FROM t_a
)
UPDATE t_b dst
SET a_id_fk = COALESCE(src.nxt, src.frst)
FROM src
WHERE dst.a_id_fk = src.a_id
AND dst.b_id IN ( 3, 10)
;
SELECT * FROM t_b
ORDER BY b_id
;
Result:
DROP SCHEMA
CREATE SCHEMA
SET
CREATE TABLE
CREATE TABLE
INSERT 0 10
INSERT 0 10
UPDATE 2
b_id | a_id_fk
------+---------
1 | 1
2 | 2
3 | 4
4 | 4
5 | 5
6 | 6
7 | 7
8 | 8
9 | 9
10 | 1
(10 rows)

Insert data from one table to another

I have 2 different tables but the columns are named slightly differently.
I want to take information from 1 table and put it into the other table. I need the info from table 1 put into table 2 only when the "info field" in table 1 is not null. Table 2 has a unique id anytime something is created, so anything inserted needs to get the next available id number.
Table 1
category
clientLastName
clientFirstName
incidentDescription
when the info field is not null, insert all fields into Table 2
Table 2
*need a unique id assigned
client_last_name
client_first_name
taskDescription
category
This should work. You don't need to worry about the identity field in Table2.
INSERT INTO Table2
(client_last_name, client_first_name, taskDescription, category)
(SELECT clientLastName, clientFirstName, incidentDescription, category
FROM Table1
WHERE info_field IS NOT NULL)
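That works because Table2's unique id can be declared to generate itself. A minimal sketch of such a definition (the column types are my assumption; IDENTITY is SQL Server syntax, and PostgreSQL would use SERIAL or GENERATED ... AS IDENTITY instead):
CREATE TABLE Table2
(
    id INT IDENTITY(1,1) PRIMARY KEY, -- assigned automatically on each insert
    client_last_name VARCHAR(100),
    client_first_name VARCHAR(100),
    taskDescription VARCHAR(500),
    category VARCHAR(100)
);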
A similar example, copying rows from a library_Member table into a Member table while skipping null keys:
CREATE TABLE Member (
    Member_ID nvarchar(255) primary key,
    Name nvarchar(255),
    Address nvarchar(255)
);
insert into Member (Member_ID, Name, Address)
select m.Member_Id, m.Name, m.Address
from library_Member m
where m.Member_Id is not null;

SQL Server table - Update Order by

I have a SQL Server table with the fields id, city, and country. I imported this table from an Excel file; everything was imported successfully, but the id field is not ordered by number. The tool I used imported the rows in some random order.
What kind of UPDATE command should I use from SQL Server Management Studio Express to re-order my ids?
Do you have a primary key and a clustered index on your table? If not, id is a good candidate for a primary key, and when you create the primary key it will, by default, become the clustered index.
Assuming this is your table
create table CityCountry(id int, city varchar(10), country varchar(10))
And you add data like this.
insert into CityCountry values (2, '2', '')
insert into CityCountry values (1, '1', '')
insert into CityCountry values (4, '4', '')
insert into CityCountry values (3, '3', '')
The output of select * from CityCountry will be
id city country
----------- ---------- ----------
2 2
1 1
4 4
3 3
A column that is a primary key cannot accept null values, so first you have to run:
alter table CityCountry alter column id int not null
Then you can add the primary key
alter table CityCountry add primary key (id)
When you do select * from CityCountry now you get
id city country
----------- ---------- ----------
1 1
2 2
3 3
4 4
Just use the ORDER BY part of the SELECT statement to order them.
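For example, with the CityCountry table from the answer above:
SELECT id, city, country FROM CityCountry ORDER BY id;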
If I understood you correctly, you want all the ids to have consecutive numbers 1,2,3,4...
Imagine your table contents are:
select *
from yourTable
id city country
----------- ---------- ----------
1 Madrid Spain
3 Lisbon Portugal
7 Moscow Russia
10 Brasilia Brazil
(4 row(s) affected)
To reorder the ids, just run this:
declare @counter int = 0
update yourTable
set @counter = id = @counter + 1
(4 row(s) affected)
You can now check, that indeed all the ids are reordered:
select *
from yourTable
id city country
----------- ---------- ----------
1 Madrid Spain
2 Lisbon Portugal
3 Moscow Russia
4 Brasilia Brazil
(4 row(s) affected)
However, you need to be careful with this. If some table has a foreign key pointing to this id column, then you first need to disable that FK, update this table, update the values in the other tables whose FKs point to yourTable, and finally re-enable the FKs.
First, I think you may have some misconceptions about the purpose of the Id column. The Id column is probably a surrogate key; i.e. an arbitrary value that is unique and non-null and that is never shown to the user. Thus, it should not be implied to have any inherent meaning or sequence. In fact, you should always have another column or columns marked as unique to represent a "business key", i.e. a set of values that is unique to the user. In your case, city, country should probably be unique (although you will likely need to add a province or state column, as it is common for the same city name to occur multiple times in the same country).
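Using the CityCountry table from the earlier answer as an illustration, such a business key could be declared like this (a sketch; add the suggested state/province column to the list if you have one):
ALTER TABLE CityCountry ADD CONSTRAINT UQ_City_Country UNIQUE (city, country);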
Now, that said, it is possible to re-sequence your Ids if the following are true:
The Id column is not an identity column. Since this was from an import, I'm going to guess this is true.
Every relationship referencing the table has Cascade Update enabled (or there are no such relationships).
You are using SQL Server 2005 or later (Row_Number requires it):
Update T2
Set Id = T1.NewId
From (
    Select Id
        , Row_Number() Over ( Order By Id ) As NewId
    From MyTable
) As T1
Join MyTable As T2
    On T2.Id = T1.Id