Enforce uniqueness with soft deletions - indexing

Let's say I have a "Dashboard" object that contains various "Pages". The "Pages" table looks like this:
Page
- ID (PK)
- DashboardID (FK)
- Name (VARCHAR)
- DeletedAt (TIMESTAMP)
I want to ensure that each dashboard has uniquely named (non-deleted) pages. For example, a dashboard with either of the following sets of pages is fine:
Pages: [(1, 1, "Sales", NULL), (1, 2, "Product", NULL)]
Pages: [(1, 1, "Sales", NULL), (1, 2, "Sales", "2014-01-01T00:00:00")]
But this is not:
Pages: [(1, 1, "Sales", NULL), (1, 2, "Sales", NULL)]
What would be the proper way to create an index for this? Essentially I want something like:
CREATE UNIQUE INDEX ON Page(DashboardID, Name) # WHERE DeletedAt IS NULL
What would be the proper way to do this in Spanner? I believe this is what Postgres calls a "partial index": https://www.postgresql.org/docs/8.0/indexes-partial.html.
Perhaps this is the correct approach? https://cloud.google.com/spanner/docs/generated-column/how-to#create_a_partial_index_using_a_generated_column.

Yes, the approach you suggest is the right one. In your case that means a schema like the following (the dashboard column is left out for simplicity):
CREATE TABLE page (
  id INT64 NOT NULL,
  name STRING(MAX) NOT NULL,
  deleted_at TIMESTAMP,
  unique_name STRING(MAX) AS (IF(deleted_at IS NULL, name, NULL)) STORED,
) PRIMARY KEY (id);
CREATE UNIQUE NULL_FILTERED INDEX idx_page_unique_name ON page(unique_name);
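A unique null-filtered index only contains the entries whose indexed column is not null, so uniqueness is enforced only among the values actually in the index. Soft-deleted rows have unique_name set to NULL, so they are left out of the index and never conflict. Once the dashboard column is added back, the index would presumably be on (dashboard_id, unique_name), so that names only need to be unique within one dashboard.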
Source: https://cloud.google.com/spanner/docs/generated-column/how-to#create_a_partial_index_using_a_generated_column.
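To try the behaviour locally, here is a minimal sketch that uses Python's sqlite3 module with a partial unique index. SQLite is only a stand-in here, not Spanner, and it uses a WHERE clause on the index instead of a generated column; the column names follow the question, and the index name idx_page_live_name is made up for the example.

```python
import sqlite3

# SQLite stands in for Spanner here; names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE page (
        id           INTEGER PRIMARY KEY,
        dashboard_id INTEGER NOT NULL,
        name         TEXT NOT NULL,
        deleted_at   TEXT            -- NULL means the page is live
    )
    """
)
# Partial unique index: only live rows take part in the uniqueness check.
conn.execute(
    """
    CREATE UNIQUE INDEX idx_page_live_name
    ON page (dashboard_id, name)
    WHERE deleted_at IS NULL
    """
)


def try_insert(row):
    """Insert one page row; return True on success, False on a uniqueness violation."""
    try:
        conn.execute("INSERT INTO page VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert((1, 1, "Sales", None)),                   # first live "Sales"
    try_insert((2, 1, "Sales", "2014-01-01T00:00:00")),  # soft-deleted duplicate
    try_insert((3, 1, "Sales", None)),                   # second live "Sales"
    try_insert((4, 2, "Sales", None)),                   # same name, other dashboard
]
print(results)  # [True, True, False, True]
```

Only the third insert is rejected: the soft-deleted row is not in the index, and the fourth row belongs to a different dashboard.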

Related

How do we design schema for user settings table for postgresql?

How do we design schema for user settings/preferences table in a sql database like postgresql?
I would like to know the proper way to design the schema of a users_setting table in which users can modify their settings. This looks like a 1-to-1 relationship, because each row of the users table corresponds to a single row in the users_setting table.
Is a 1-to-1 relation between users and users_setting the wrong way to do this? I have searched online and could not find any useful example schemas where users manage their settings, so I am asking here. I expect the answer will help other people too.
Here is what my current design looks like:
DROP TABLE if exists users cascade;
DROP TABLE IF EXISTS "users";
DROP SEQUENCE IF EXISTS users_id_seq;
CREATE SEQUENCE users_id_seq INCREMENT 1 MINVALUE 1 MAXVALUE 9223372036854775807 CACHE 1;
CREATE TABLE "public"."users" (
"id" bigint DEFAULT nextval('users_id_seq') NOT NULL,
"email" text NOT NULL,
"password" text NOT NULL,
"full_name" text NOT NULL,
"status" text NOT NULL,
"is_verified" boolean NOT NULL,
"role" text NOT NULL,
"created_at" timestamptz NOT NULL,
"updated_at" timestamptz NOT NULL,
"verified_at" timestamptz NOT NULL,
CONSTRAINT "users_email_key" UNIQUE ("email"),
CONSTRAINT "users_pkey" PRIMARY KEY ("id")
) WITH (oids = false);
DROP TABLE if exists users_setting cascade;
DROP TABLE IF EXISTS "users_setting";
DROP SEQUENCE IF EXISTS users_setting_id_seq;
CREATE SEQUENCE users_setting_id_seq INCREMENT 1 MINVALUE 1 MAXVALUE 9223372036854775807 CACHE 1;
CREATE TABLE "public"."users_setting" (
"id" bigint DEFAULT nextval('users_setting_id_seq') NOT NULL,
"default_currency" text NOT NULL,
"default_timezone" text NOT NULL,
"default_notification_method" text NOT NULL,
"default_source" text NOT NULL,
"default_cooldown" integer NOT NULL,
"updated_at" timestamptz NOT NULL,
"user_id" bigint,
CONSTRAINT "users_setting_pkey" PRIMARY KEY ("id")
) WITH (oids = false);
ALTER TABLE ONLY "public"."users_setting" ADD CONSTRAINT "users_setting_user_id_fkey" FOREIGN KEY (user_id) REFERENCES "users"(id) NOT DEFERRABLE;
begin transaction;
INSERT INTO "users" ("id", "email", "password", "full_name", "status", "is_verified", "role", "created_at", "updated_at", "verified_at") VALUES
(1, 'users1#email.com', 'password', 'users1', 'active', '1', 'superuser', '2022-07-05 01:05:50.22384+00', '0001-01-01 00:00:00+00', '2022-07-11 14:10:26.615722+00'),
(2, 'users2#email.com', 'password', 'users2', 'active', '0', 'user', '2022-07-05 01:05:50.22384+00', '0001-01-01 00:00:00+00', '2022-07-11 14:10:26.615722+00');
INSERT INTO "users_setting" ("id", "default_currency", "default_timezone", "default_notification_method", "default_source", "default_cooldown", "updated_at", "user_id") VALUES
(1, 'usd', 'utc', 'email', 'google', 300, '2022-07-13 01:05:50.22384+00', 1),
(2, 'usd', 'utc', 'sms', 'yahoo', 600, '2022-07-14 01:05:50.22384+00', 2);
commit;
Say I want to return a single row where users.email is users1#email.com. Here is a query I can run:
select * from users, users_setting where users.id = users_setting.user_id AND users.email = 'users1#email.com';
id | email | password | full_name | status | is_verified | role | created_at | updated_at | verified_at | id | default_currency | default_timezone | default_notification_method | default_source | default_cooldown | updated_at | user_id
1 | users1#email.com | password | users1 | active | 1 | superuser | 2022-07-05 01:05:50.22384+00 | 0001-01-01 00:00:00+00 | 2022-07-11 14:10:26.615722+00 | 1 | usd | utc | email | google | 300 | 2022-07-13 01:05:50.22384+00 | 1
I could use a single table for this, but it would get very wide as I add more and more columns. User settings is just one example; there are other tables similar to this. It would be great to know how to design a situation like this properly.

In your case a JSON column could do the job:
ALTER TABLE public.users ADD user_settings jsonb NULL;
Updating the settings would look something like:
UPDATE users
SET user_settings = '{"default_currency": "usd", "default_timezone" : "utc"}'
WHERE id = 1;
And to select:
select * from users WHERE id = 1;
The row comes back with the settings document in the user_settings column.
Also consider that in PostgreSQL you can index JSON, for example to query on a particular setting. See here: https://www.postgresql.org/docs/current/datatype-json.html#JSON-INDEXING
Specifically, the documentation says:
Still, with appropriate use of expression indexes, the above query can
use an index. If querying for particular items within the "tags" key
is common, defining an index like this may be worthwhile:
CREATE INDEX idxgintags ON api USING GIN ((jdoc -> 'tags'));
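To see the JSON-column idea end to end without a PostgreSQL server, here is a minimal sketch in Python. It is an illustration under assumptions: SQLite stands in for PostgreSQL, the settings are stored as JSON text rather than jsonb, and the helper update_settings is hypothetical. It merges changes in application code, so updating one setting does not overwrite the others.

```python
import json
import sqlite3

# SQLite stand-in: the settings document is stored as JSON text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, user_settings TEXT)")
conn.execute(
    "INSERT INTO users VALUES (1, ?)",
    (json.dumps({"default_currency": "usd", "default_timezone": "utc"}),),
)


def update_settings(user_id, changes):
    """Merge `changes` into the stored settings document and write it back."""
    (raw,) = conn.execute(
        "SELECT user_settings FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    settings = json.loads(raw) if raw else {}
    settings.update(changes)  # shallow merge: untouched keys are kept
    conn.execute(
        "UPDATE users SET user_settings = ? WHERE id = ?",
        (json.dumps(settings), user_id),
    )
    return settings


merged = update_settings(1, {"default_currency": "eur"})
print(merged)  # {'default_currency': 'eur', 'default_timezone': 'utc'}
```

In PostgreSQL itself the same shallow merge can be done server-side with the jsonb concatenation operator (user_settings || '{"default_currency": "eur"}').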

An alternative that avoids JSON is a key-value table with one row per setting. The drawback is that setting_value cannot be tailored to the exact type each setting needs, unlike the typed columns in your first idea.
For example, you can create:
CREATE TABLE public.user_setting (
user_id bigint NOT NULL,
setting_name text NOT NULL,
setting_value text NULL,
CONSTRAINT user_setting_pk PRIMARY KEY (user_id,setting_name)
);
ALTER TABLE public.user_setting ADD CONSTRAINT user_setting_fk FOREIGN KEY (user_id) REFERENCES public.users(id);
At this point I suggest running two queries, one for users and one for settings:
SELECT *
FROM user_setting us
where user_id = 1;
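A minimal sketch of how an application might read and write such a key-value table, again with SQLite standing in for PostgreSQL. The helpers set_setting and get_settings are hypothetical names for this example; it also shows the typing drawback, since every value comes back as text and the caller has to convert it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE user_setting (
        user_id       INTEGER NOT NULL,
        setting_name  TEXT NOT NULL,
        setting_value TEXT,
        PRIMARY KEY (user_id, setting_name)
    )
    """
)


def set_setting(user_id, name, value):
    """Store one setting; the composite key makes a second write replace the first."""
    conn.execute(
        "INSERT OR REPLACE INTO user_setting VALUES (?, ?, ?)",
        (user_id, name, str(value)),
    )


def get_settings(user_id):
    """Return all settings of one user as a name -> text value dict."""
    rows = conn.execute(
        "SELECT setting_name, setting_value FROM user_setting WHERE user_id = ?",
        (user_id,),
    )
    return dict(rows.fetchall())


set_setting(1, "default_currency", "usd")
set_setting(1, "default_cooldown", 300)
set_setting(1, "default_currency", "eur")  # overwrites, does not duplicate
settings = get_settings(1)
cooldown = int(settings["default_cooldown"])  # the caller restores the type
```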

How do I make the columns of one table not contain the columns of another table

I have a database with these two tables:
CREATE TABLE Photos(
photoId INT NOT NULL AUTO_INCREMENT,
userId INT NOT NULL,
url VARCHAR(200) NOT NULL UNIQUE,
uploadDate DATE NOT NULL,
title VARCHAR(80) NOT NULL,
description VARCHAR(400),
visibility ENUM ('Pública', 'Privada') NOT NULL,
PRIMARY KEY (photoId),
FOREIGN KEY (userId) REFERENCES Users (userId) ON DELETE CASCADE
);
CREATE TABLE InappropiateWords(
inappropiateWordId INT NOT NULL AUTO_INCREMENT,
word VARCHAR(80),
PRIMARY KEY (inappropiateWordId)
);
I'm asked to check that the title and/or description of a photo doesn't contain any inappropriate word. I guess I need to create a trigger, but I don't know how to do it. Any help is appreciated. Thanks!

This is not a requirement that you can implement at the database level.
If you really want to ensure that the description or title does not contain an inappropriate word, then step 1 is to define what "inappropriate" means. You have a table (InappropiateWords), which I assume stores all inappropriate words.
Then, when the program that inserts the picture with its title and description is invoked, the code needs to take the title and description, split them into words, compare each word against the InappropiateWords table, and decide which action to take.
The description or title may contain several words, in which case each word has to be parsed and checked against that table.
This is not a ready-made solution, but I hope it helps.
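The application-side check described above could be sketched like this. It is only an illustration: the function names are made up, the banned words are passed in as a plain list (in practice they would be read from the InappropiateWords table), and matching is on whole words, case-insensitively.

```python
import re


def find_inappropriate(text, banned_words):
    """Return the banned words that occur as whole words in `text`."""
    if not text:
        return set()  # description may be NULL
    tokens = set(re.findall(r"\w+", text.lower()))
    return tokens & {w.lower() for w in banned_words}


def validate_photo(title, description, banned_words):
    """Raise ValueError if the title or description contains a banned word."""
    hits = find_inappropriate(title, banned_words) | find_inappropriate(
        description, banned_words
    )
    if hits:
        raise ValueError("inappropriate words: " + ", ".join(sorted(hits)))
```

The caller would run validate_photo before the INSERT into Photos and reject the upload if it raises. Whole-word matching will not catch obfuscated spellings such as 'BADW00rds'; that needs a policy decision beyond this sketch.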

You can create a temporary table (SQL Server syntax) that is loaded when the page loads, and perform a join to find those values.
CREATE TABLE #tbl_LinkedNames(
NameNbr varchar(50)
, AssociatedName varchar(50)
, userId int
, inappropiateWordId int
)
INSERT INTO #tbl_LinkedNames(
NameNbr, AssociatedName, userId, inappropiateWordId )
VALUES
('A0001', 'badword', 1, 4),
('A0002', 'wORSEWORD', 2, 5),
('A0002', 'BADW00rds', 3, 6),
('A1001', 'badw', 4, 1),
('A2002', 'lengua', 5, 2),
('A3002', 'diferente', 6, 3)
SELECT * FROM #tbl_LinkedNames
From here it is a simple join, run from the stored procedure that the page calls.
SELECT
*
FROM
Photos AS p
LEFT JOIN #tbl_LinkedNames AS t_LN ON
p.userId = t_LN.userID
AND
p.inappropiateWordId = t_LN.inappropiateWordId
LEFT JOIN InappropiateWords AS Ip ON
Ip.inappropiateWordId = t_LN.inappropiateWordId

Double values in a table

Is it possible to define an ID column to be unique, except that every value has to occur twice?
For example:
table TRANSLATION:
id | name_id | translation
---|---------|------------
1  | 1       | apple
2  | 1       | apfel
3  | 2       | pear
4  | 2       | birne
I want each name_id value to occur exactly twice, not once and not three times. name_id is a foreign key to the table holding the objects that need to be translated.

No, this is impossible to enforce declaratively. You can attempt it using triggers, but that is normally a pretty messy solution.
I'd change your table structure to something like the following:
ID
NAME_ID
LANGUAGE_ID
TRANSLATION
You could then create a unique index on NAME_ID and LANGUAGE_ID. Theoretically, you'd also have a table LANGUAGES, and the LANGUAGE_ID column would have a foreign key back into LANGUAGES.ID - you could then restrict the number of times each NAME_ID appears by not having the data in LANGUAGES.
Ultimately this means that your schema would look something like this:
create table languages (
id number
, description varchar2(4000)
, constraint pk_languages primary key (id)
);
insert into languages values (1, 'English');
insert into languages values (2, 'German');
create table names (
id number
, description varchar2(4000)
, constraint pk_names primary key (id)
);
insert into names values (1, 'apple');
insert into names values (2, 'pear');
create table translations (
id number
, name_id number
, language_id number
, translation varchar2(4000)
, constraint pk_translations primary key (id)
, constraint fk_translations_names foreign key (name_id) references names (id)
, constraint fk_translations_langs foreign key (language_id) references languages (id)
, constraint uk_translations unique (name_id, language_id)
);
insert into translations values (1, 1, 1, 'apple');
insert into translations values (2, 1, 2, 'apfel');
insert into translations values (3, 2, 1, 'pear');
insert into translations values (4, 2, 2, 'birne');
and you should be unable to break the constraints:
SQL> insert into translations values (5, 1, 3, 'pomme');
insert into translations values (5, 1, 3, 'pomme')
*
ERROR at line 1:
ORA-02291: integrity constraint (FK_TRANSLATIONS_LANGS) violated - parent
key not found
SQL> insert into translations values (5, 1, 2, 'pomme');
insert into translations values (5, 1, 2, 'pomme')
*
ERROR at line 1:
ORA-00001: unique constraint (UK_TRANSLATIONS) violated
See this SQL Fiddle
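For readers without an Oracle instance, the same schema can be exercised with a short Python sketch. SQLite is a stand-in here, so the types and error messages differ from the Oracle session above; the helper rejected is a made-up name for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(
    """
    CREATE TABLE languages (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE names (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE translations (
        id          INTEGER PRIMARY KEY,
        name_id     INTEGER NOT NULL REFERENCES names (id),
        language_id INTEGER NOT NULL REFERENCES languages (id),
        translation TEXT,
        UNIQUE (name_id, language_id)
    );
    INSERT INTO languages VALUES (1, 'English'), (2, 'German');
    INSERT INTO names VALUES (1, 'apple'), (2, 'pear');
    INSERT INTO translations VALUES
        (1, 1, 1, 'apple'), (2, 1, 2, 'apfel'),
        (3, 2, 1, 'pear'),  (4, 2, 2, 'birne');
    """
)


def rejected(row):
    """Try to insert a translation; return True if a constraint blocks it."""
    try:
        conn.execute("INSERT INTO translations VALUES (?, ?, ?, ?)", row)
        return False
    except sqlite3.IntegrityError:
        return True


unknown_language = rejected((5, 1, 3, "pomme"))   # no language with id 3
third_translation = rejected((5, 1, 2, "pomme"))  # (name 1, language 2) is taken
print(unknown_language, third_translation)  # True True
```

Note that these constraints cap each name at one translation per known language; they do not force a name to have both translations.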

Do you mean a maximum of twice, or do you mean they have to occur twice (i.e., once only is not OK)?
If the former (once only is OK), then you could add a bit field and make the primary key a composite of the actual id and the bit field.
If the latter (they have to occur twice), then put two id fields in the same row and give each of them its own single-column unique key.

SET datatype set in SQL Server

While creating a table I need to use the SET datatype, but it looks like there is no SET datatype in SQL Server. I looked on Microsoft's website, and these are the datatypes it supports: http://msdn.microsoft.com/en-us/library/ms187752.aspx
Which one should I use to replace SET?
I have used SET in MySQL database like this:
CREATE TABLE IF NOT EXISTS `configurations` (
`index` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`configDuration` int(5) NOT NULL,
`configDurationPerspective` set('list_this_day','list_remaining') NOT NULL,
PRIMARY KEY (`index`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
And then when I insert data into the table it looks like this:
INSERT INTO 'configurations' (index, user_id, configDuration, configDurationPerspective) VALUES (1, 1, 2, 'list_this_day');
Never mind the quotes. Something messed up while pasting the code.
Now I want to do the same thing, but in SQL Server.

You'd either have to use separate bit fields (one column with the bit datatype per value) or pack the values into a single column with an integer datatype. If you use an integer, you have to use T-SQL bitwise operators to read and write the values.
If you use bitwise operators you only need one column.
The create table statement should look like this:
CREATE TABLE configurations(
[index] int NOT NULL IDENTITY (1,1) PRIMARY KEY,
user_id int NOT NULL,
configDuration int NOT NULL,
configDurationPerspective int NOT NULL,
)
And then you'd insert values that can be combined as a bitmask (1, 2, 4, 8, 16, 32, ...) into configDurationPerspective. Because [index] is an IDENTITY column, it is left out of the SQL Server inserts.
INSERT INTO 'configurations' (index, user_id, configDuration, configDurationPerspective) VALUES (1, 1, 2, 'list_this_day');
would translate to
INSERT INTO configurations (user_id, configDuration, configDurationPerspective) VALUES (1, 2, 1);
And
INSERT INTO 'configurations' (index, user_id, configDuration, configDurationPerspective) VALUES (1, 1, 2, 'list_remaining');
would translate to
INSERT INTO configurations (user_id, configDuration, configDurationPerspective) VALUES (1, 2, 2);
and selecting could look like:
select [index], configDuration,
case when configDurationPerspective & 1 > 0 then 'list_this_day' else '' end
+ case when configDurationPerspective & 2 > 0 then 'list_remaining' else '' end as configDurationPerspective
from configurations
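On the application side, the packing and unpacking of the bitmask could look like the following Python sketch. The mapping FLAGS and the helpers encode and decode are assumptions made for this example; the bit values match the ones used in the inserts above.

```python
# One bit per SET member; values must be distinct powers of two.
FLAGS = {"list_this_day": 1, "list_remaining": 2}


def encode(names):
    """Pack SET member names into the integer stored in the column."""
    mask = 0
    for name in names:
        mask |= FLAGS[name]  # KeyError for a name that is not a SET member
    return mask


def decode(mask):
    """Unpack the stored integer into a MySQL-style comma-separated string."""
    return ",".join(name for name, bit in FLAGS.items() if mask & bit)


print(encode(["list_this_day", "list_remaining"]))  # 3
print(decode(3))  # list_this_day,list_remaining
```

Unlike the CASE expression in the select above, decode puts a comma between members, which is how MySQL displays a SET value with several members.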

The basic types in MS SQL Server do not include an equivalent, but there are constraints and user-defined types. This question shows how the MySQL enum case is handled:
SQL Server equivalent to MySQL enum data type?
You can also look at user-defined types (I've seen them used for a similar purpose):
http://msdn.microsoft.com/en-us/library/ms175007.aspx
The most typical solution, and the one we used on our projects, is a "CodeList/StaticList" table that is referenced by its primary key (int, smallint, tinyint).

How to create unique index on fields with possible null values (Oracle 11g)?

Here is a sample table with 3 columns (ID, UNIQUE_VALUE, UNIQUE_GROUP_ID).
I want the following records to be allowed:
(1, NULL, NULL)
(2, NULL, NULL)
or
(3, NULL, 7)
(4, 123, 7)
or (note: a plain unique index or unique constraint does not allow this case)
(5, NULL, 7)
(6, NULL, 7)
and these must not be allowed:
(7, 123, 7)
(8, 123, 7)
I created a unique index on the last 2 columns, but it only allows the first 2 examples.
Is it possible to have the database check the uniqueness of these 2 columns only when both are not null?

You want to only enforce uniqueness on the rows where both UNIQUE_VALUE and UNIQUE_GROUP_ID are not null. To do this, you can use a unique function-based index:
CREATE UNIQUE INDEX func_based_index ON the_table
(CASE WHEN unique_value IS NOT NULL
AND unique_group_id IS NOT NULL
THEN UNIQUE_VALUE || ',' || UNIQUE_GROUP_ID
END);
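The logic of that index can be checked against the sample rows from the question with a small Python model. This is not Oracle code, only a sketch of what the CASE expression computes: the key is NULL unless both columns are set, and Oracle leaves entirely NULL keys out of a B-tree index, so only non-NULL keys can collide. The function names are made up for the example.

```python
def index_key(unique_value, unique_group_id):
    """Mirror of the CASE expression: None (not indexed) unless both are set."""
    if unique_value is not None and unique_group_id is not None:
        return f"{unique_value},{unique_group_id}"
    return None


def violates_unique(rows):
    """True if two rows produce the same non-NULL index key."""
    seen = set()
    for _id, value, group in rows:
        key = index_key(value, group)
        if key is None:
            continue  # NULL keys are left out of the index
        if key in seen:
            return True
        seen.add(key)
    return False


print(violates_unique([(1, None, None), (2, None, None)]))  # False
print(violates_unique([(3, None, 7), (4, 123, 7)]))         # False
print(violates_unique([(5, None, 7), (6, None, 7)]))        # False
print(violates_unique([(7, 123, 7), (8, 123, 7)]))          # True
```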

You can use the nvl function to replace nulls with a different value:
create unique index func_idx on TEST_TABLE (nvl(UNIQUE_VALUE,1), UNIQUE_GROUP_ID);
The disadvantage is that the index will be larger, and if you want to search for null values you will have to use the same nvl expression in the query to avoid a full table scan (TABLE ACCESS FULL).
Also, all of the null values will be located under one branch of the index, so make sure your histograms are up to date.
I hope this helps :)
