Create primary key with two columns - sql

I have two tables, bank_data and sec_data. Table bank_data has the columns id, date, asset, and liability. The date column is divided into quarters.
id |    date    | asset  | liability
---+------------+--------+----------
 1 | 6/30/2001  | 333860 |  308524
 1 | 3/31/2001  | 336896 |  311865
 1 | 9/30/2001  | 349343 |  308524
 1 | 12/31/2002 | 353863 |  322659
 2 | 6/30/2001  | 451297 |  425156
 2 | 3/31/2001  | 411421 |  391846
 2 | 9/30/2001  | 430178 |   41356
 2 | 12/31/2002 | 481687 |   46589
 3 | 6/30/2001  | 106506 |  104532
 3 | 3/31/2001  | 104196 |  102983
 3 | 9/30/2001  | 106383 |  104865
 3 | 12/31/2002 | 107654 |  105867
Table sec_data has columns id, date, and security. I combined the two tables into a new table named new_table in R using this code:
dbGetQuery(con, "CREATE TABLE new_table
            AS (SELECT sec_data.id,
                       bank_data.date,
                       bank_data.asset,
                       bank_data.liability,
                       sec_data.security
                FROM bank_data, bank_sec
                WHERE (bank_data.id = sec_data.id) AND
                      (bank_data.date = sec_data.date))")
I would like to set two primary keys (id and date) in this R code without using pgAdmin. I want to use something like Constraint bankkey Primary Key (id, date) but the AS and SELECT functions are throwing me off.

First, your query is wrong: you refer to table sec_data, but the FROM clause lists bank_sec. Here is your query rephrased:
CREATE TABLE new_table AS
SELECT sec_data.id,
       bank_data.date,
       bank_data.asset,
       bank_data.liability,
       sec_data.security
FROM bank_data
INNER JOIN sec_data ON bank_data.id = sec_data.id
                   AND bank_data.date = sec_data.date;
Avoid implicit joins and use explicit joins instead. And, as stated by @a_horse_with_no_name, you can't define more than one primary key on a table. What you want is a composite primary key, which by definition:

is a combination of two or more columns in a table that can be used to uniquely identify each row in the table

Because your CREATE statement is based on another table, you need to add the key with ALTER TABLE afterwards:
ALTER TABLE new_table
ADD PRIMARY KEY (id, date);

You may run two separate statements instead (CREATE TABLE, then INSERT INTO):
CREATE TABLE new_table (
    id int, date date, asset int, liability int, security int,
    CONSTRAINT bankkey PRIMARY KEY (id, date)
);

INSERT INTO new_table (id, date, asset, liability, security)
SELECT s.id,
       b.date,
       b.asset,
       b.liability,
       s.security
FROM bank_data b
JOIN sec_data s ON b.id = s.id AND b.date = s.date;

To create the primary key you desire, run the following SQL statement after your CREATE TABLE ... AS statement:
ALTER TABLE new_table
ADD CONSTRAINT bankkey PRIMARY KEY (id, date);
That has the advantage that the primary key index won't slow down the data insertion.
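To verify the constraint afterwards, you can query the catalogs. A minimal check, assuming the table and constraint names used above:
SELECT conname, contype
FROM pg_constraint
WHERE conrelid = 'new_table'::regclass;
-- expected row: bankkey | p  ('p' = primary key)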

Related

SQL SELECT column value based on value

I am using PostgreSQL. I would like to write a SELECT statement whose column value depends on the value stored in the database.
For example.
 id | indicator
----+-----------
  1 | 0
  2 | 1
indicator can be 0 or 1 only, where 0 = manual and 1 = auto.
Expected output from a SELECT *:
1 manual
2 auto
You can use a case expression:
select id,
       case indicator
           when 0 then 'manual'
           when 1 then 'auto'
       end as indicator
from the_table;
If you need that frequently you could create a view for that.
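A minimal sketch of such a view (the view name is just illustrative):
CREATE VIEW the_table_labeled AS
SELECT id,
       CASE indicator
           WHEN 0 THEN 'manual'
           WHEN 1 THEN 'auto'
       END AS indicator
FROM the_table;
-- afterwards: SELECT * FROM the_table_labeled;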
In the long run, it might be better to create a proper lookup table and join to that:
create table indicator
(
id integer primary key,
name text not null
);
insert into indicator (id, name)
values (0, 'manual'), (1, 'auto');
alter table the_table
add constraint fk_indicator
foreign key (indicator) references indicator (id);
Then join to it:
select t.id, i.name as indicator
from the_table t
join indicator i on i.id = t.indicator;

Make sure no two rows contain identical values in PostgreSQL

I have a table and I want to make sure that no two rows can be alike.
So, for example, this table would be valid:
user_id | printer
---------+-------------
1 | LaserWriter
4 | LaserWriter
1 | ThinkJet
2 | DeskJet
But this table would not be:
user_id | printer
---------+-------------
1 | LaserWriter
4 | LaserWriter
1 | ThinkJet <--error (duplicate row)
2 | DeskJet
1 | ThinkJet <--error (duplicate row)
This is because the last table has two instances of 1 | ThinkJet.
So, user_id can be repeated (i.e. 1) and printer can be repeated (i.e. LaserWriter) but once a record like 1 | ThinkJet is entered once that combination cannot be entered again.
How can I prevent such occurrences in a PostgreSQL 11.5 table?
I would try experimenting with SQL code but alas I am still new on the matter.
Please note this is for INSERTing data into the table, not SELECTing it. Like a constraint iirc.
Thanks
Here's your script:
ALTER TABLE tableA ADD CONSTRAINT some_constraint PRIMARY KEY (user_id, printer);
INSERT INTO tableA (user_id, printer)
VALUES (1, 'LaserWriter')
ON CONFLICT (user_id, printer)
DO NOTHING;
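If the same row is inserted again, the conflict clause skips it instead of raising an error. Re-running the insert demonstrates this:
INSERT INTO tableA (user_id, printer)
VALUES (1, 'LaserWriter')
ON CONFLICT (user_id, printer)
DO NOTHING;
-- psql reports INSERT 0 0: the duplicate row was skipped, no error raised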
You can use SELECT DISTINCT when querying, but note that it only removes duplicates from the result set; it does not prevent duplicate rows from being stored. For example:
SELECT DISTINCT user_id, printer FROM my_table;
That's all. Hope it helps!
You need a series of steps (assuming there is no already assigned unique key).
Add a temporary column to make each row unique.
Assign a value to the new columns.
Remove the already existing duplicates.
Create a Unique or Primary Key on the composite columns.
Remove the temporary column.
alter table your_table add temp_unique integer unique;

do $$
declare
    row_num integer := 1;
    c_assign cursor for
        select temp_unique
        from your_table
        for update;
begin
    for rec in c_assign
    loop
        update your_table
        set temp_unique = row_num
        where current of c_assign;
        row_num := row_num + 1;
    end loop;
end;
$$;
delete from your_table ytd
where exists ( select 1
from your_table ytk
where ytd.user_id = ytk.user_id
and ytd.printer = ytk.printer
and ytd.temp_unique > ytk.temp_unique
) ;
alter table your_table add constraint id_prt_uk unique (user_id, printer);
alter table your_table drop temp_unique;
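A sanity check you can run after the delete in step 3, before adding the constraint: this query should return no rows once the duplicates are gone.
SELECT user_id, printer, count(*)
FROM your_table
GROUP BY user_id, printer
HAVING count(*) > 1;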
I found the answer. When creating the table I needed to specify the two columns as UNIQUE. Observe:
CREATE TABLE foo (user_id INT, printer VARCHAR(20), UNIQUE (user_id, printer));
Now, here are my results:
=# INSERT INTO foo VALUES (1, 'LaserWriter');
INSERT 0 1
=# INSERT INTO foo VALUES (4, 'LaserWriter');
INSERT 0 1
=# INSERT INTO foo VALUES (1, 'ThinkJet');
INSERT 0 1
=# INSERT INTO foo VALUES (2, 'DeskJet');
INSERT 0 1
=# INSERT INTO foo VALUES (1, 'ThinkJet');
ERROR: duplicate key value violates unique constraint "foo_user_id_printer_key"
DETAIL: Key (user_id, printer)=(1, ThinkJet) already exists.
=# SELECT * FROM foo;
user_id | printer
---------+-------------
1 | LaserWriter
4 | LaserWriter
1 | ThinkJet
2 | DeskJet
(4 rows)

Merging multiple databases with relations in SQLite

I have a couple of SQLite files (databases) with a rather simple structure like:
Category:
| Id | Category |
-----------------
| 1  | A        |
| 2  | B        |

Name (CatId references Category.Id, one-to-many):
| Id | CatId | Name |
---------------------
| 1  | 2     | A    |
| 2  | 1     | B    |
| 3  | 2     | BC   |
So as you see, there is a Category table which is related to the Name table. The problem is that I have a couple of such files and I want to merge them into one and keep the relations.
Merging the first table is simple:
ATTACH DATABASE '{databaseFilePath}' AS Db;
BEGIN;
INSERT INTO Category (Category) SELECT Category FROM Db.Category;
COMMIT;
DETACH DATABASE Db;
But this will change my ids (the column is set to autoincrement because many db files can contain the same ids). I can do the same for the second table with the names; the problem is keeping the relation intact, since the primary key has changed. Is there any rational way to do this?
Here are the CREATE TABLE statements:
CREATE TABLE Category (Id INTEGER PRIMARY KEY NOT NULL UNIQUE,Category STRING);
INSERT INTO Category (Category, Id) VALUES ('B', 2), ('A', 1);
CREATE TABLE Name (Id INTEGER PRIMARY KEY UNIQUE NOT NULL,
CatId INTEGER
REFERENCES Category (Id) ON DELETE CASCADE ON UPDATE CASCADE MATCH SIMPLE, Name STRING);
INSERT INTO Name (Name,CatId,Id)VALUES ('A',1,1),('AB',1,3 ),('B',2,2);
I believe that you could base it on the following. (Instead of attaching the database, a 2 has been appended to the second set of table names for convenience; additionally, the data in the second set of tables has been prefixed with C2.)
DROP TABLE IF EXISTS Name;
DROP TABLE IF EXISTS Name2;
DROP TABLE IF EXISTS Category;
DROP TABLE IF EXISTS Category2;
CREATE TABLE Category (Id INTEGER PRIMARY KEY NOT NULL UNIQUE,Category STRING);
INSERT INTO Category (Category, Id) VALUES ('B', 2), ('A', 1);
CREATE TABLE Name (Id INTEGER PRIMARY KEY UNIQUE NOT NULL,
CatId INTEGER
REFERENCES Category (Id) ON DELETE CASCADE ON UPDATE CASCADE MATCH SIMPLE, Name STRING);
INSERT INTO Name (Name,CatId,Id)VALUES ('A',1,1),('AB',1,3 ),('B',2,2);
CREATE TABLE Category2 (Id INTEGER PRIMARY KEY NOT NULL UNIQUE,Category STRING);
INSERT INTO Category2 (Category, Id) VALUES ('C2B', 2), ('C2A', 1);
CREATE TABLE Name2 (Id INTEGER PRIMARY KEY UNIQUE NOT NULL,
CatId INTEGER
REFERENCES Category2 (Id) ON DELETE CASCADE ON UPDATE CASCADE MATCH SIMPLE, Name STRING);
INSERT INTO Name2 (Name,CatId,Id)VALUES ('C2A',1,1),('C2AB',1,3 ),('C2B',2,2);
UPDATE Category2 SET id = id + (Max((SELECT max(id) FROM Category),(SELECT max(id) FROM Category2)));
UPDATE Name2 SET id = id + (Max((SELECT Max(id) FROM name) ,(SELECT max(id) FROM name2)));
SELECT * FROM Category2;
SELECT * FROM Name2;
INSERT INTO Category SELECT * FROM Category2 WHERE 1;
INSERT INTO name SELECT * FROM name2 WHERE 1;
SELECT * FROM Category;
SELECT * FROM Name;
Note: you mention AUTOINCREMENT but haven't included it in the DDL, so checking for the highest sqlite_sequence value hasn't been included.
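If AUTOINCREMENT were used, the per-table counters in sqlite_sequence would need the same treatment; a sketch, assuming the table names above:
UPDATE sqlite_sequence
SET seq = (SELECT max(Id) FROM Category)
WHERE name = 'Category';
UPDATE sqlite_sequence
SET seq = (SELECT max(Id) FROM Name)
WHERE name = 'Name';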
The above relies upon ON UPDATE CASCADE to propagate the increase of Category.Id down to Name.CatId.
It works by finding the highest id across both tables with the same schema and then updating the ids of the table to be merged, adding that highest id to the id of every row. When the tables being updated are the Category tables, the updated ids are cascaded to the respective Name table.
The process is performed for both the pair of Category tables and the pair of Name tables.
The final two SELECTs show the merged Category and Name tables with the relationships preserved.
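For completeness, the same offset idea can be combined with the ATTACH DATABASE approach from the question. A sketch, assuming 'other.db' is one of the files to merge and that ids start at 1:
ATTACH DATABASE 'other.db' AS db2;
BEGIN;
-- capture the offsets up front so the inserts below don't shift them
CREATE TEMP TABLE offsets AS
SELECT (SELECT max(Id) FROM main.Category) AS cat_off,
       (SELECT max(Id) FROM main.Name)     AS name_off;
-- insert categories first so the foreign keys below have rows to point at
INSERT INTO main.Category (Id, Category)
SELECT Id + (SELECT cat_off FROM offsets), Category
FROM db2.Category;
INSERT INTO main.Name (Id, CatId, Name)
SELECT Id + (SELECT name_off FROM offsets),
       CatId + (SELECT cat_off FROM offsets),
       Name
FROM db2.Name;
DROP TABLE offsets;
COMMIT;
DETACH DATABASE db2;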

Joining column for multiple tables

I am trying to extract two same-data-type columns from two different tables using one query. NOTE: the length of the ACCOUNT column varies between the two tables. A plain union of the full rows can't work here because the number of columns is (in reality) different in the two tables.
CREATE TABLE IF NOT EXISTS `mydb`.`TABLE_A` (
`ID_TABLE_A` INT NOT NULL AUTO_INCREMENT,
`ACCOUNT` VARCHAR(5) NULL,
`SALES` INT NULL,
PRIMARY KEY (`ID_TABLE_A`))
ENGINE = InnoDB;
CREATE TABLE IF NOT EXISTS `mydb`.`TABLE_B` (
`ID_TABLE_B` INT NOT NULL AUTO_INCREMENT,
`ACCOUNT` VARCHAR(9) NULL,
`SALES` INT NULL,
PRIMARY KEY (`ID_TABLE_B`))
ENGINE = InnoDB;
Requirement (I know this can't be right, but just to demonstrate a partial picture):
SELECT
ACCOUNTS,
SALES
FROM
TABLE_A, TABLE_B
Result:
| accounts | sales |
|----------|-------|
|     2854 | 52500 |
|     6584 | 54645 |
|    54782 |  5624 |
|    58496 | 46259 |
|    56958 |  6528 |
If you want the union of two tables that are not union-compatible, then make them union-compatible by selecting matching column lists:
(SELECT ACCOUNT AS accounts,
        SALES
 FROM TABLE_A)
UNION ALL
(SELECT ACCOUNT,
        SALES
 FROM TABLE_B)
I put the UNION ALL based on the assumption that you would like to keep duplicates. If you would like the output to be duplicate-free, replace it with UNION.
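If the real tables also have columns that exist on only one side, select just the shared columns and pad the missing ones with NULL so both branches stay union-compatible. A sketch, where REGION stands in for a hypothetical extra column of TABLE_B:
SELECT ACCOUNT AS accounts,
       SALES   AS sales,
       NULL    AS region   -- TABLE_A has no REGION column, so pad with NULL
FROM TABLE_A
UNION ALL
SELECT ACCOUNT,
       SALES,
       REGION
FROM TABLE_B;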

PostgreSQL: Select field value and update with next value from second table

CREATE TABLE t_a
(
    a_id SERIAL PRIMARY KEY,
    str VARCHAR(50)
);

CREATE TABLE t_b
(
    b_id SERIAL PRIMARY KEY,
    a_id_fk INTEGER REFERENCES t_a(a_id)
);
Using the above tables, I want to SELECT a_id_fk FROM t_b WHERE b_id = 1 and then update a_id_fk with the next a_id in the sequence, but if I'm at the end of the available a_ids, cycle back to the first one. All this with multiple people querying/updating that specific row of t_b.
If it helps, the scenario I'm working on is: multiple sites share a common list of words, and as each user of a site grabs a word, that site's index within the word list moves to the next word, looping back to the beginning when it hits the end.
Is there a way to do this in a single query? If not, what would be the best way to handle this? I can handle most of the logic, it's looping back when I run out of ids that has me stumped.
You could use something complicated like
UPDATE t_b
SET a_id_fk = COALESCE(
(SELECT MIN(a_id) FROM t_a WHERE a_id > t_b.a_id_fk),
(SELECT MIN(a_id) FROM t_a))
WHERE b_id = :b_id
but if I was given a requirement like that I'd probably maintain an auxiliary table that maps an a_id to the next a_id in the cycle...
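A minimal sketch of that auxiliary-table idea (table and column names are illustrative): each a_id maps to its successor, and the largest id wraps around to the smallest.
CREATE TABLE t_a_next
(
    a_id    INTEGER PRIMARY KEY REFERENCES t_a(a_id),
    next_id INTEGER NOT NULL REFERENCES t_a(a_id)
);
-- lead() pairs each id with its successor; the last id falls back to the minimum
INSERT INTO t_a_next (a_id, next_id)
SELECT a_id,
       COALESCE(lead(a_id) OVER (ORDER BY a_id),
                (SELECT min(a_id) FROM t_a))
FROM t_a;
-- the cyclic update then becomes a plain join
UPDATE t_b
SET a_id_fk = n.next_id
FROM t_a_next n
WHERE n.a_id = t_b.a_id_fk
  AND t_b.b_id = :b_id;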
This one is a bit more elegant (IMHO) than @pdw's solution:
DROP SCHEMA tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;
CREATE TABLE t_a
( a_id SERIAL PRIMARY KEY
, str VARCHAR(50)
);
CREATE TABLE t_b
( b_id SERIAL PRIMARY KEY
, a_id_fk INTEGER REFERENCES t_a(a_id)
);
INSERT INTO t_a(str)
SELECT 'Str_' || gs::text
FROM generate_series(1,10) gs
;
INSERT into t_b(a_id_fk)
SELECT a_id FROM t_a
ORDER BY a_id
;
-- EXPLAIN ANALYZE
WITH src AS (
SELECT a_id AS a_id
, min(a_id) OVER (ORDER BY a_id) AS frst
, lead(a_id) OVER (ORDER BY a_id) AS nxt
FROM t_a
)
UPDATE t_b dst
SET a_id_fk = COALESCE(src.nxt, src.frst)
FROM src
WHERE dst.a_id_fk = src.a_id
AND dst.b_id IN ( 3, 10)
;
SELECT * FROM t_b
ORDER BY b_id
;
Result:
DROP SCHEMA
CREATE SCHEMA
SET
CREATE TABLE
CREATE TABLE
INSERT 0 10
INSERT 0 10
UPDATE 2
b_id | a_id_fk
------+---------
1 | 1
2 | 2
3 | 4
4 | 4
5 | 5
6 | 6
7 | 7
8 | 8
9 | 9
10 | 1
(10 rows)