How to make sure only one column is not null in postgresql table - sql

I'm trying to set up a table and add some constraints to it. I was planning on using partial indexes to create some composite-key-like constraints, but ran into the problem of handling NULL values. We have a situation where we want to make sure that, in a table, only one of two columns is populated for a given row, and that the populated value is unique. I'm trying to figure out how to do this, but I'm having a tough time. Perhaps something like this:
CREATE INDEX foo_idx_a ON foo (colA) WHERE colB is NULL
CREATE INDEX foo_idx_b ON foo (colB) WHERE colA is NULL
Would this work? Additionally, is there a good way to expand this to a larger number of columns?
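(Note that plain CREATE INDEX does not enforce uniqueness; for the partial-index idea to guarantee that the populated value is unique, the indexes would need to be UNIQUE, sketched below. A CHECK constraint, as in the answers that follow, is still needed to reject rows where both or neither column is set.)

```sql
CREATE UNIQUE INDEX foo_idx_a ON foo (colA) WHERE colB IS NULL;
CREATE UNIQUE INDEX foo_idx_b ON foo (colB) WHERE colA IS NULL;
```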

Another way to write this constraint is to use the num_nonnulls() function:
create table table_name
(
a integer,
b integer,
check ( num_nonnulls(a,b) = 1)
);
This is especially useful if you have more columns:
create table table_name
(
a integer,
b integer,
c integer,
d integer,
check ( num_nonnulls(a,b,c,d) = 1)
);
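To see the constraint in action against the four-column table above, a quick sketch (num_nonnulls requires PostgreSQL 9.6+; omitted columns default to NULL):

```sql
INSERT INTO table_name (a, b) VALUES (1, NULL);    -- succeeds: exactly one non-null
INSERT INTO table_name (a, b) VALUES (1, 2);       -- rejected by the CHECK
INSERT INTO table_name (a, b) VALUES (NULL, NULL); -- rejected as well
```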

You can use the following check:
create table table_name
(
a integer,
b integer,
check ((a is null) != (b is null))
);
If there are more columns, you can use the trick with casting boolean to integer:
create table table_name
(
a integer,
b integer,
...
n integer,
check ((a is not null)::integer + (b is not null)::integer + ... + (n is not null)::integer = 1)
);
In this example exactly one column can be non-null (the expression simply counts the non-null columns), but you can compare against any number you need.
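The same counting trick covers other cardinalities too; for example, a hedged variant that allows at most one populated column (zero or one, instead of exactly one):

```sql
check ((a is not null)::integer + (b is not null)::integer <= 1)
```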

One can do this with an insert/update trigger or CHECK constraints, but needing to is a sign the schema could be designed better. Constraints exist to give you certainty about your data so you don't have to constantly check whether it is valid; if either one column or the other may be null, every query has to handle both cases.
This is better solved with table inheritance and views.
Let's say you have (American) clients. Some are businesses and some are individuals. Everyone needs a Taxpayer Identification Number, which can be one of several things, such as either a Social Security Number (SSN) or an Employer Identification Number (EIN).
create table generic_clients (
id bigserial primary key,
name text not null
);
create table individual_clients (
ssn numeric(9) not null
) inherits(generic_clients);
create table business_clients (
ein numeric(9) not null
) inherits(generic_clients);
SSN and EIN are both Taxpayer Identification Numbers and you can make a view which will treat both the same.
create view clients as
select id, name, ssn as tin from individual_clients
union
select id, name, ein as tin from business_clients;
Now you can query clients.tin or if you specifically want businesses you query business_clients.ein and for individuals individual_clients.ssn. And you can see how the inherited tables can be expanded to accommodate more divergent information between types of clients.
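A short usage sketch of this design (rows are inserted into the concrete child tables; the view then presents them uniformly):

```sql
INSERT INTO individual_clients (name, ssn) VALUES ('Jane Doe', 123456789);
INSERT INTO business_clients (name, ein) VALUES ('Acme Corp', 987654321);

SELECT id, name, tin FROM clients;  -- both rows appear, each with its tin
```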

Related

How to select from table A and then insert selected id inside table B with one query?

I'm trying to implement a very basic banking system.
The goal is to have the different types of transactions (deposit, withdraw, transfer) inside a table and refer to them by ID inside transaction tables.
CREATE TABLE transaction_types (
id INTEGER AUTO_INCREMENT PRIMARY KEY,
name VARCHAR UNIQUE NOT NULL
)
CREATE TABLE transactions (
id INTEGER AUTO_INCREMENT PRIMARY KEY,
type_id INTEGER NOT NULL,
amount FLOAT NOT NULL
)
What I'm trying to accomplish is:
(1) When inserting into the transactions table, no record can have an invalid type_id (type_id should exist in the transaction_types table).
(2) First get the type_id from the transaction_types table and then insert into the transactions table, in one query (if it's possible; I'm fairly new).
I'm using Node.js/Typescript and PostgreSQL, any help is appreciated a lot.
For (1): modify the transactions table definition by adding REFERENCES transaction_types(id) to the end of the type_id column definition, before the comma.
For (2), assuming you know the name of the transaction_type, you can accomplish this by:
INSERT INTO transactions(type_id, amount)
VALUES ((SELECT id from transaction_types WHERE name = 'Withdrawal'), 999.99)
By the way, PostgreSQL requires SERIAL instead of INTEGER AUTO_INCREMENT.
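Putting both pieces of advice together, a sketch of the DDL in PostgreSQL syntax:

```sql
CREATE TABLE transaction_types (
    id   SERIAL PRIMARY KEY,
    name VARCHAR UNIQUE NOT NULL
);

CREATE TABLE transactions (
    id      SERIAL PRIMARY KEY,
    type_id INTEGER NOT NULL REFERENCES transaction_types(id),
    amount  FLOAT NOT NULL  -- NUMERIC is generally safer than FLOAT for money
);
```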

Oracle SQL Check

I'm trying to implement an Oracle SQL database. In one of my tables I must introduce a restriction that does not allow more than 4 people in the same group:
I've tried this:
CREATE TABLE PERSON (name VARCHAR (20) PRIMARY KEY, group VARCHAR (3), CHECK (COUNT (*) group FROM PERSON) <=4);
also this (among others):
CREATE TABLE PERSON (name VARCHAR (20) PRIMARY KEY, group VARCHAR (3), CHECK NOT EXISTS (Select COUNT(*) FROM PERSON GROUP BY group HAVING COUNT(*) > 4);
But I'm getting errors every time (ORA-00934: group function is not allowed here, or ORA-02251: subquery not allowed here).
What is the correct way to do it?
You have multiple issues with this
CREATE TABLE PERSON (
name VARCHAR(20) PRIMARY KEY,
group VARCHAR(3),
CHECK (COUNT(*) group FROM PERSON) <= 4
);
Oracle explicitly prefers VARCHAR2() to VARCHAR().
GROUP is a really bad name for a column, because it is a keyword. Surely you can find something like group_name or whatever for the name.
CHECK constraints only work within a single row.
Probably the best way to handle this is:
Create a new table called groups -- or whatever. It should have a group_id as well as group_name and num_persons.
Add triggers to person to keep the counter up-to-date for inserts, deletes, and updates to person.
Add a check constraint to groups, say check (num_persons <= 4).
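The three steps above could be sketched like this (table and column names are the illustrative ones suggested; person would also get a foreign key to groups):

```sql
CREATE TABLE groups (
    group_id    NUMBER PRIMARY KEY,
    group_name  VARCHAR2(30) NOT NULL,
    num_persons NUMBER DEFAULT 0 NOT NULL,
    CONSTRAINT chk_group_size CHECK (num_persons <= 4)
);
-- Triggers on person for insert/delete/update keep num_persons in sync;
-- the CHECK then rejects any change that would create a fifth member.
```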
You need to create the table as follows:
CREATE TABLE PERSON (
name VARCHAR2(20) PRIMARY KEY, -- varchar2 used as the data type
group_ VARCHAR2(3) -- added _ after the column name, since GROUP is a reserved word
);
Then create a before-insert trigger as follows:
create trigger person_trg
before insert on person
for each row
declare
group_cnt number;
begin
select count(distinct name)
into group_cnt
from person
where group_ = :new.group_;
if group_cnt = 4 then
raise_application_error(-20001, 'more than 4 persons are not allowed in the group');
end if;
end;
/
I used a distinct count of names, since no more than 4 distinct persons are allowed in the group per your requirement.
Cheers!!

Distinct top 10 from multiple tables

I have these two tables in SQLite
CREATE TABLE "freq" (
`id` INTEGER,
`translation_id` INTEGER,
`freq` INTEGER DEFAULT NULL,
`year` INTEGER,
PRIMARY KEY(`id`),
FOREIGN KEY(`translation_id`) REFERENCES `translation`(`id`) DEFERRABLE INITIALLY DEFERRED
)
CREATE TABLE translation (
id INTEGER PRIMARY KEY,
w_id INTEGER,
word TEXT DEFAULT NULL,
language TEXT, -- evidently part of the schema: the UNIQUE constraint below uses it
located_in TEXT,
UNIQUE (word, language)
ON CONFLICT ABORT
)
Based on the values in these tables I want to create a third one which contains the top 10 words for every translation.located_in for every freq.year. It could look like this:
CREATE TABLE top_ten_words_by_country (
id INTEGER PRIMARY KEY,
located_in TEXT,
year INTEGER,
`translation_id` INTEGER,
freq INTEGER,
FOREIGN KEY(`translation_id`) REFERENCES `translation`(`id`) DEFERRABLE INITIALLY DEFERRED
)
That's what I tried (for one country and one year) so far:
SELECT * FROM freq f, translation t
WHERE t.located_in = 'Germany' ORDER BY f.freq DESC
which has these problems:
it doesn't add up multiple words from translation that have the same w_id (meaning they are translations of each other)
it only works for one year and one country
it takes veeeeery long (I know joins are expensive, so it's not that important to speed this up)
it contains duplicate translation.word
So can anyone provide me a way to do what I want?
The speed is the least important thing here for me.
Look, you have a cartesian product (there's no join condition relating your tables).
Besides, you have to use a GROUP BY clause.
And you can create a view instead of a table.
Change your query to:
SELECT sum(f.freq) total_freq
, t.w_id
, t.located_in
, f.year
FROM freq f
, translation t
WHERE f.translation_id = t.id
group by t.w_id
, t.located_in
, f.year
ORDER BY total_freq DESC
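To keep only the top 10 per (located_in, year), the grouped sums can be ranked with a window function (supported in SQLite 3.25+); a sketch building on the query above:

```sql
SELECT w_id, located_in, year, total_freq
FROM (
    SELECT t.w_id, t.located_in, f.year,
           SUM(f.freq) AS total_freq,
           ROW_NUMBER() OVER (
               PARTITION BY t.located_in, f.year
               ORDER BY SUM(f.freq) DESC
           ) AS rn
    FROM freq f
    JOIN translation t ON f.translation_id = t.id
    GROUP BY t.w_id, t.located_in, f.year
) ranked
WHERE rn <= 10;
```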

How to combine particular rows in a pl/pgsql function that returns set of a view row type?

I have a view, and I have a function that returns records from this view.
Here is the view definition:
CREATE VIEW ctags(id, name, descr, freq) AS
SELECT tags.conc_id, expressions.name, concepts.descr, tags.freq
FROM tags, concepts, expressions
WHERE concepts.id = tags.conc_id
AND expressions.id = concepts.expr_id;
The column id refers to the table tags, which references another table, concepts, which in turn references the table expressions.
Here are the table definitions:
CREATE TABLE expressions(
id serial PRIMARY KEY,
name text,
is_dropped bool DEFAULT FALSE,
rank float(53) DEFAULT 0,
state text DEFAULT 'never edited',
UNIQUE(name)
);
CREATE TABLE concepts(
id serial PRIMARY KEY,
expr_id int NOT NULL,
descr text NOT NULL,
source_id int,
equiv_p_id int,
equiv_r_id int,
equiv_len int,
weight int,
is_dropped bool DEFAULT FALSE,
FOREIGN KEY(expr_id) REFERENCES expressions,
FOREIGN KEY(source_id),
FOREIGN KEY(equiv_p_id) REFERENCES concepts,
FOREIGN KEY(equiv_r_id) REFERENCES concepts,
UNIQUE(id,equiv_p_id),
UNIQUE(id,equiv_r_id)
);
CREATE TABLE tags(
conc_id int NOT NULL,
freq int NOT NULL default 0,
UNIQUE(conc_id, freq)
);
The table expressions is also referenced from my view (ctags).
I want my function to combine rows of my view that have equal values in the column name and that refer to rows of the table concepts with equal values of the column equiv_r_id, so that these rows are combined only once: the combined row keeps one of the ids (it doesn't matter which), the value of the column descr is concatenated from the values of the rows being combined, and the column freq contains the sum of the values from those rows. I have no idea how to do it; any help would be appreciated.
Basically, what you describe looks like this:
CREATE FUNCTION f_test()
RETURNS TABLE(min_id int, name text, all_descr text, sum_freq int) AS
$x$
SELECT min(t.conc_id)            -- AS min_id
      ,e.name
      ,string_agg(c.descr, ', ') -- AS all_descr
      ,sum(t.freq)::int          -- AS sum_freq (sum() returns bigint, so cast)
FROM   tags t
JOIN   concepts c ON c.id = t.conc_id  -- tags links to concepts via conc_id
JOIN   expressions e ON e.id = c.expr_id
-- WHERE e.name IS DISTINCT FROM
GROUP  BY e.name;
$x$
LANGUAGE sql;
Major points:
I ignored the view ctags altogether as it is not needed.
You could also write this as a view; so far, the function wrapper is not necessary.
You need PostgreSQL 9.0+ for string_agg(). Else you have to substitute with
array_to_string(array_agg(c.descr), ', ')
The only unclear part is this:
and that refer to rows of the table concepts with equal values of the column equiv_r_id so that these rows are combined only once
What column exactly refers to what column in the table concepts?
concepts.equiv_r_id equals what, exactly?
If you can clarify that part, I might be able to incorporate it into the solution.

Using MySQL's "IN" function where the target is a column?

In a certain TABLE, I have a VARTEXT field which includes comma-separated values of country codes. The field is named cc_list. Typical entries look like the following:
'DE,US,IE,GB'
'IT,CA,US,FR,BE'
Now given a country code, I want to be able to efficiently find which records include that country. Obviously there's no point in indexing this field.
I can do the following
SELECT * from TABLE where cc_list LIKE '%US%';
But this is inefficient.
Since the "IN" function is supposed to be efficient (it bin-sorts the values), I was thinking along the lines of
SELECT * from TABLE where 'US' IN cc_list
But this doesn't work - I think the 2nd operand of IN needs to be a list of values, not a string. Is there a way to convert a CSV string to a list of values?
Any other suggestions? Thanks!
SELECT *
FROM MYTABLE
WHERE FIND_IN_SET('US', cc_list)
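Worth knowing: FIND_IN_SET returns the 1-based position of the match, or 0 when the value is absent, which is why it works directly as a WHERE condition:

```sql
SELECT FIND_IN_SET('US', 'DE,US,IE,GB');  -- returns 2
SELECT FIND_IN_SET('US', 'IT,CA,FR,BE');  -- returns 0 (treated as false)
```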
In a certain TABLE, I have a VARTEXT field which includes comma-separated values of country codes.
If you want your queries to be efficient, you should create a many-to-many link table:
CREATE TABLE table_country (cc CHAR(2) NOT NULL, tableid INT NOT NULL, PRIMARY KEY (cc, tableid))
SELECT *
FROM table_country tc
JOIN mytable t
ON t.id = tc.tableid
WHERE tc.cc = 'US'
Alternatively, you can set ft_min_word_len to 2, create a FULLTEXT index on your column and query like this:
CREATE FULLTEXT INDEX fx_mytable_cclist ON mytable (cc_list);
SELECT *
FROM MYTABLE
WHERE MATCH(cc_list) AGAINST('+US' IN BOOLEAN MODE)
This only works for MyISAM tables and the argument should be a literal string (you won't be able to join on this condition).
The first rule of normalization says you should change multi-value columns such as cc_list into a single value field for this very reason.
Preferably into its own table, with IDs for each country code and a pivot table to support a many-to-many relationship.
CREATE TABLE my_table (
my_id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
mystuff VARCHAR(255) NOT NULL, -- a length is required for VARCHAR in MySQL
PRIMARY KEY(my_id)
);
# this is the pivot table
CREATE TABLE my_table_countries (
my_id INT(11) UNSIGNED NOT NULL,
country_id SMALLINT(5) UNSIGNED NOT NULL,
PRIMARY KEY(my_id, country_id)
);
CREATE TABLE countries (
country_id SMALLINT(5) UNSIGNED NOT NULL AUTO_INCREMENT,
country_code CHAR(2) NOT NULL,
country_name VARCHAR(100) NOT NULL,
PRIMARY KEY (country_id)
);
Then you can query it making use of indexes:
SELECT * FROM my_table JOIN my_table_countries USING (my_id) JOIN countries USING (country_id) WHERE country_code = 'DE'
SELECT * FROM my_table JOIN my_table_countries USING (my_id) JOIN countries USING (country_id) WHERE country_code IN('DE','US')
You may have to group the results by my_id.
find_in_set seems to be the MySql function you want. If you could actually store those comma-separated strings as MySql sets (no more than 64 possible countries, or splitting countries into two groups of no more than 64 each), you could keep using find_in_set and go a bit faster.
There's no efficient way to find what you want. A table scan will be necessary. Putting multiple values into a single text field is a terrible misuse of relational database technology. If you refactor (if you have access to the database structure) so that the country codes are properly stored in a separate table you will be able to easily and quickly retrieve the data you want.
One approach that I've used successfully before (not on mysql, though) is to place a trigger on the table that splits the values (based on a specific delimiter) into discrete values, inserting them into a sub-table. Your select can then look like this:
SELECT * from TABLE where 'US' IN
(
select cc_list_name from cc_list_subtable
where cc_list_subtable.table_id = TABLE.id
)
where the trigger parses cc_list in TABLE into separate entries in the column cc_list_name in table cc_list_subtable. It involves a bit of work in the trigger, too, as every change to TABLE means that associated rows in cc_list_subtable have to be deleted/updated/inserted as appropriate, but it is an approach that works in situations where the original table TABLE has to retain its structure, while you are free to adapt the query as you see fit.
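For concreteness, a sketch of the sub-table such a trigger would maintain (table and column names follow the answer; the original TABLE is assumed to have an integer id):

```sql
CREATE TABLE cc_list_subtable (
    table_id     INT NOT NULL,      -- references the id of the original TABLE
    cc_list_name CHAR(2) NOT NULL,  -- one country code per row
    PRIMARY KEY (table_id, cc_list_name)
);
```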