Sqlite3 store list in column - sql

I havea some data i need to store in a sqlite database that looks like this
[
[user1,[python,java,javascript],21],
[user2,[csharp,python,c,java,php,sql],18],
[user3,[],52]
[user4,[python],73]
]
How do i store the list of programming languages for each user in sqlite3

The standard solution is have a table for each thing: users, languages, and the table that holds the 1 to many relationship which in this case is users_languages. user and language are relatively large and variable sized key, so it's pretty common optimization to introduce a an artificial key usually integer auto_increment.
create table languages (
language text primary key
);
insert into languages values ('python'), ('java'), ('javascript'), ('csharp'), ('c'), ('php'), ('sql');
create table users (
user text primary key,
age tinyint not null
);
insert into users values ('user1', 21), ('user2', 18), ('user3', 52), ('user4', 73);
create table users_languages (
user text not null,
language text not null,
foreign key (user) references users (user),
foreign key (language) references languages (language),
unique(user, language)
);
insert into users_languages values ('user1', 'python'), ('user1', 'java'), ('user1', 'javascript');
...
-- list of languages for a given user (row per language)
select language from users_languages where user = '...';
-- list of languages for all users (row per user)
select user, group_concat(langauge) from users_langauges group by 1;

Related

Securing values for other tables from enumerable table in SQL Server database

English is not my native language, so I might have misused words Enumerator and Enumerable in this context. Please get a feel for what I'm trying to say and correct me if I'm wrong.
I'm looking into not having tables for each enumerator I need in my database.
I "don't want" to add tables for (examples:) service duration type, user type, currency type, etc. and add relations for each of them.
Instead of a table for each of them which values will probably not change a lot, and for which I'd have to create relationships with other tables, I'm looking into having just 2 tables called Enumerator (eg: user type, currency...) and Enumerable (eg: for user type -> manager, ceo, delivery guy... and for currency -> euro, dollar, pound...).
Though here's the kicker. If I implement it like that, I'm loosing the rigidity of the foreign key relationships -> I can't accidentally insert a row in users table that will have a user type of some currency or service duration type, or something else.
Is there another way to resolve the issue of having so many enumerators and enumerables with the benefit of having that rigidity of the foreign key and with the benefit of having all of them in just those 2 tables?
Best I can think of is to create a trigger for BEFORE UPDATE and BEFORE INSERT to check if (for example) the column type of user table is using the id of the enumerable table that belongs to the correct enumerator.
This is a short example in SQL
CREATE TABLE [dbo].[Enumerator]
(
[Id] INT NOT NULL PRIMARY KEY,
[Name] VARCHAR(50)
)
CREATE TABLE [dbo].[Enumerable]
(
[Id] INT NOT NULL PRIMARY KEY,
[EnumeratorId] INT NOT NULL FOREIGN KEY REFERENCES Enumerator(Id),
[Name] VARCHAR(50)
)
INSERT INTO Enumerator (Id, Name)
VALUES (1, 'UserType'),
(2, 'ServiceType');
INSERT INTO Enumerable (Id, EnumeratorId, Name) -- UserType
VALUES (1, 1, 'CEO'),
(2, 1, 'Manager'),
(3, 1, 'DeliveryGuy');
INSERT INTO Enumerable (Id, EnumeratorId, Name) -- ServiceDurationType
VALUES (4, 2, 'Daily'),
(5, 2, 'Weekly'),
(6, 2, 'Monthly');
CREATE TABLE [dbo].[User]
(
[Id] INT NOT NULL PRIMARY KEY IDENTITY (1,1),
[Type] INT NOT NULL FOREIGN KEY REFERENCES Enumerable(Id)
)
CREATE TABLE [dbo].[Service]
(
[Id] INT NOT NULL PRIMARY KEY IDENTITY (1,1),
[Type] INT NOT NULL FOREIGN KEY REFERENCES Enumerable(Id)
)
The questions are:
Is it viable to resolve enumerators and enumerables with 2 tables and with before update and before insert triggers, or is it more trouble than it's worth?
Is there a better way to resolve this other than using before update and before insert triggers?
Is there a better way to resolve enumerators and enumerables that is not using 2 tables and triggers, nor creating a table with relations for each of them?
I ask for your wisdom as I don't have one or more big projects behind me and I didn't get a chance to create a DB like this until now.

How should I handle multiple foreign keys that all point to the same column in another table?

So let's say I have these two tables...
IF NOT EXISTS (SELECT * FROM sysobjects WHERE name='UnitsDef' AND xtype='U')
CREATE TABLE UnitsDef
(
UnitsID INTEGER PRIMARY KEY,
UnitsName NVARCHAR(32) NOT NULL,
UnitsDisplay NVARCHAR(8) NOT NULL
);
IF NOT EXISTS (SELECT * FROM sysobjects WHERE name='Dimensions' AND xtype='U')
CREATE TABLE Dimensions
(
DimID INTEGER PRIMARY KEY IDENTITY(0,1),
DimX FLOAT,
DimXUnitsID INTEGER DEFAULT 0,
DimY FLOAT,
DimYUnitsID INTEGER DEFAULT 0,
DimZ FLOAT,
DimZUnitsID INTEGER DEFAULT 0,
FOREIGN KEY (DimXUnitsID) REFERENCES UnitsDef(UnitsID),
FOREIGN KEY (DimYUnitsID) REFERENCES UnitsDef(UnitsID),
FOREIGN KEY (DimZUnitsID) REFERENCES UnitsDef(UnitsID)
);
I'll insert data into the first table similar to this...
INSERT INTO UnitsDef (UnitsID, UnitsName, UnitsDisplay) VALUES (0, 'inch', 'in.');
INSERT INTO UnitsDef (UnitsID, UnitsName, UnitsDisplay) VALUES (1, 'millimeter', 'mm');
INSERT INTO UnitsDef (UnitsID, UnitsName, UnitsDisplay) VALUES (2, 'degree', '°');
Am I going about this the right way? This is a simplified version of the problem, but I need to know which unit each measurement is given in. Is there a better design practice for this type of situation?
How would I handle the ON DELETE and ON UPDATE for these foreign keys? If I try to cascade deletes and updates, SQL Server would not be so happy about that.
Your method is pretty good. I would make the suggestion right off that UnitsId be an identity column, so it gets incremented. Your inserts would then be:
INSERT INTO UnitsDef (UnitsName, UnitsDisplay) VALUES ('inch', 'in.');
INSERT INTO UnitsDef (UnitsName, UnitsDisplay) VALUES ('millimeter', 'mm');
INSERT INTO UnitsDef (UnitsName, UnitsDisplay) VALUES ('degree', '°');
You should also make the string columns unique in UnitsDef and give them case-sensitive collations. After all, Ml and ml are two different things ("M" is mega and "m" is milli).
You might also want to combine the units and values into a single type. This has positives and negatives. For me it adds overhead, but it can help if you want to support a fuller algebra of types.

A simplified version of Twitter. Understanding many-to-many relationships between tables in the database

I am reading this article about a Twitter-like application. The type of storage where tweets, users, likes, etc. will be stored is a relational database. The database scheme is described here and drawn here.
As an Android developer, I coded my sample using SQLite. This is how I would code it:
create table users (_id integer primary key, username text unique, first_name text, last_name text);
create table tweets (_id integer primary key, content text, created_at integer, user_id integer, foreign key(user_id) references users(_id));
create table connections (_id integer primary key, follower_id integer, followee_id integer, created_at integer, foreign key(follower_id) references users(_id), foreign key (followee_id) references users(_id));
create table favorites (_id integer primary key, user_id integer, tweet_id integer, foreign key (user_id) references users(_id), foreign key (tweet_id) references tweets(_id));
Now let's insert some data.
users:
insert into users values (1, 'user1', 'Lorem', 'Ipsum');
insert into users values (2, 'user2', 'Dolor', 'Sit');
insert into users values (3, 'user3', 'Foo', 'Bar');
insert into users values (4, 'user4', 'Qwerty', 'Trewq');
some tweets:
insert into tweets values(10, '1 Tweet from user1', 1100, 1);
insert into tweets values(11, '2 Tweet from user1', 1101, 1);
insert into tweets values(12, '3 Tweet from user1', 1102, 1);
insert into tweets values(13, '4 Tweet from user1', 1103, 1);
insert into tweets values(14, '1 Tweet from user2', 1103, 2);
insert into tweets values(15, '2 Tweet from user2', 1103, 2);
insert into tweets values(16, '1 Tweet from user3', 1103, 3);
insert into tweets values(17, '2 Tweet from user3', 1103, 3);
insert into tweets values(18, '1 Tweet from user4', 1107, 4);
favorites (the same as likes):
insert into favorites values(1, 2, 11);
insert into favorites values(2, 3, 13);
insert into favorites values(3, 4, 15);
There is a question about the database scheme:
Do you think you could support with our database design the ability
to display a page for a given user with their latest tweets that were
favorited at least once?
Yes, this is why query:
sqlite> select favorites._id, tweets._id as tweet_row_id, tweets.content from favorites join tweets on tweets.user_id=1 and tweets._id = favorites.tweet_id order by tweets._id desc limit 1;
_id tweet_row_id content
---------- ------------ ------------------
2 13 4 Tweet from user1
Explanation:
The left dataset is the table favorites. The right dataset is the table tweets. I join the two datasets. Then tweets.user_id=1 and tweets._id = favorites.tweet_id is evaluated for each row of the resulting dataset as a boolean expression. If the result is true, the row is included. order by tweets._id desc is used to get the latest tweets (the greater tweets._id is, the newer the tweet is). limit is used to limit the number of rows. If the user has been using our Twitter-like app for years, we'll show the latest 10 or 20 tweets.
My questions.
Is there anything wrong with my database scheme? I omitted not null, unique, and other column constraints for simplicity.
Here the author of the original article says:
The first relation is addressed by sticking the user ID to each tweet.
This is possible because each tweet is created by exactly one user.
It’s a bit more complicated when it comes to following users and
favoriting tweets. The relationship there is many-to-many.
"The first relation" is users-tweets.
Why do we need a many-to-many here? In my scheme I only use a one-to-many.
Update 1
Shortly I placed an answer here, where the OP - like you in this question - was unsure about 1:n and n:m.
I assume, that your final sentence is the actual question you have:
Why do we need a many-to-many here? In my scheme I only use a one-to-many
The relation user-tweets is 1:n...
Think in objects
user (id, name, ...)
tweet (id, author (FK on user), datetime, content, ...)
The like is an object with sepecific details on its own:
like (id, userid,tweetid,datetime,...)
For this you need a mapping table (you call it favourites)
There is a 1:n-relation from users to this mapping and a 1:n-relation from tweets to this mapping.
These two 1:n-relations form the m:n-relation together.
Now each tweet can be liked by many users and each user can like many tweets, but one user should (probably) not like the same tweet twice (unique key or even a two column PK?). And you might introduce a CHECK constraint to ensure, that the liking user and the author's userid is not the same (don't like your own tweets).
As a side note:
Is there anything wrong with my database scheme
You should never create constraints wihtout naming them
CREATE TABLE Dummy
(
ID INT IDENTITY CONSTRAINT PK_Dummy PRIMARY KEY
,UserID INT NOT NULL CONSTRAINT FK_Dummy_UserID FOREIGN KEY REFERENCES User(id)
,...
)
If this database was ever installed on different systems, they'll get different (random) names and future upgrade scripts will get you in deepest pain...
UPDATE: example for the side note
In you comment you ask, what this last sentence is about... Try this
CREATE DATABASE testDB;
GO
USE testDB;
GO
CREATE TABLE testTbl1(ID INT IDENTITY PRIMARY KEY,SomeValue INT UNIQUE);
CREATE TABLE testTbl2(ID INT IDENTITY PRIMARY KEY,FKtoTbl1 INT NOT NULL FOREIGN KEY REFERENCES testTbl1(ID));
GO
CREATE TABLE testTbl3(ID INT IDENTITY CONSTRAINT PK_3 PRIMARY KEY,SomeValue INT CONSTRAINT UQ_3_SomeValue UNIQUE);
CREATE TABLE testTbl4(ID INT IDENTITY CONSTRAINT PK_4 PRIMARY KEY,FKtoTbl3 INT NOT NULL CONSTRAINT FK_4_FKtoTbl3 FOREIGN KEY REFERENCES testTbl3(ID));
GO
SELECT * FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS;
GO
USE master;
GO
DROP DATABASE testDB;
GO
On column in your result looks like this:
CONSTRAINT_NAME
------------------------------
PK__testTbl1__3214EC27ABEA2C0C
UQ__testTbl1__0E5C381C04C8AF66
PK__testTbl2__3214EC272784631C
FK__testTbl2__FKtoTb__1367E606
PK_3
UQ_3_SomeValue
PK_4
FK_4_FKtoTbl3
If this script is run twice, the given names will stay as you defined them. The other names will get a random name like PK__testTbl1__3214EC27ABEA2C0C. Now imagine, you need to create an upgrade script for several installed systems where one constraint has to be dropped or modified. How would you do this, if you do not know its name?

Restrict foreign key relationship to rows of related subtypes

Overview: I am trying to represent several types of entities in a database, which have a number of basic fields in common, and then each has some additional fields that are not shared with the other types of entities. Workflow would frequently involve listing the entities together, so I have decided to have a table with their common fields, and then each entity will have its own table with its additional fields.
To implement: There is a common field, “status”, which all entities have; however, some entities will only support a subset of all possible statuses. I also want each type of entity to enforce the use of its subset of statuses. Finally, I will also want to include this field when listing the entities together, so excluding it from the set of common fields seems incorrect, as this would require a union of the specific type tables and the lack of “implements interface” in SQL means that inclusion of that field would be by-convention.
Why I’m here: Below is an solution that is functional, but I am interested if there is a better or more common way to solve the problem. In particular, the fact that this solution requires me to make a redundant unique constraint and a redundant status field feels inelegant.
create schema test;
create table test.statuses(
id integer primary key
);
create table test.entities(
id integer primary key,
status integer,
unique(id, status),
foreign key (status) references test.statuses(id)
);
create table test.statuses_subset1(
id integer primary key,
foreign key (id) references test.statuses(id)
);
create table test.entites_subtype(
id integer primary key,
status integer,
foreign key (id) references test.entities(id),
foreign key (status) references test.statuses_subset1(id),
foreign key (id, status) references test.entities(id, status) initially deferred
);
Some data:
insert into test.statuses(id) values
(1),
(2),
(3);
insert into test.entities(id, status) values
(11, 1),
(13, 3);
insert into test.statuses_subset1(id) values
(1), (2);
insert into test.entites_subtype(id, status) values
(11, 1);
-- Test updating subtype first
update test.entites_subtype
set status = 2
where id = 11;
update test.entities
set status = 2
where id = 11;
-- Test updating base type first
update test.entities
set status = 1
where id = 11;
update test.entites_subtype
set status = 1
where id = 11;
/* -- This will fail
insert into test.entites_subtype(id, status) values
(12, 3);
*/
Simplify building on MATCH SIMPLE behavior of fk constraints
If at least one column of multicolumn foreign constraint with default MATCH SIMPLE behaviour is NULL, the constraint is not enforced. You can build on that to largely simplify your design.
CREATE SCHEMA test;
CREATE TABLE test.status(
status_id integer PRIMARY KEY
,sub bool NOT NULL DEFAULT FALSE -- TRUE .. *can* be sub-status
,UNIQUE (sub, status_id)
);
CREATE TABLE test.entity(
entity_id integer PRIMARY KEY
,status_id integer REFERENCES test.status -- can reference all statuses
,sub bool -- see examples below
,additional_col1 text -- should be NULL for main entities
,additional_col2 text -- should be NULL for main entities
,FOREIGN KEY (sub, status_id) REFERENCES test.status(sub, status_id)
MATCH SIMPLE ON UPDATE CASCADE -- optionally enforce sub-status
);
It is very cheap to store some additional NULL columns (for main entities):
How much disk-space is needed to store a NULL value using postgresql DB?
BTW, per documentation:
If the refcolumn list is omitted, the primary key of the reftable is used.
Demo-data:
INSERT INTO test.status VALUES
(1, TRUE)
, (2, TRUE)
, (3, FALSE); -- not valid for sub-entities
INSERT INTO test.entity(entity_id, status_id, sub) VALUES
(11, 1, TRUE) -- sub-entity (can be main, UPDATES to status.sub cascaded)
, (13, 3, FALSE) -- entity (cannot be sub, UPDATES to status.sub cascaded)
, (14, 2, NULL) -- entity (can be sub, UPDATES to status.sub NOT cascaded)
, (15, 3, NULL) -- entity (cannot be sub, UPDATES to status.sub NOT cascaded)
SQL Fiddle (including your tests).
Alternative with single FK
Another option would be to enter all combinations of (status_id, sub) into the status table (there can only be 2 per status_id) and only have a single fk constraint:
CREATE TABLE test.status(
status_id integer
,sub bool DEFAULT FALSE
,PRIMARY KEY (status_id, sub)
);
CREATE TABLE test.entity(
entity_id integer PRIMARY KEY
,status_id integer NOT NULL -- cannot be NULL in this case
,sub bool NOT NULL -- cannot be NULL in this case
,additional_col1 text
,additional_col2 text
,FOREIGN KEY (status_id, sub) REFERENCES test.status
MATCH SIMPLE ON UPDATE CASCADE -- optionally enforce sub-status
);
INSERT INTO test.status VALUES
(1, TRUE) -- can be sub ...
(1, FALSE) -- ... and main
, (2, TRUE)
, (2, FALSE)
, (3, FALSE); -- only main
Etc.
Related answers:
MATCH FULL vs MATCH SIMPLE
Two-column foreign key constraint only when third column is NOT NULL
Uniqueness validation in database when validation has a condition on another table
Keep all tables
If you need all four tables for some reason not in the question consider this detailed solution to a very similar question on dba.SE:
Enforcing constraints “two tables away”
Inheritance
... might be another option for what you describe. If you can live with some major limitations. Related answer:
Create a table of two types in PostgreSQL

how to create a Foreign-Key constraint to a subset of the rows of a table?

I have a reference table, say OrderType that collects different types of orders:
CREATE TABLE IF NOT EXISTS OrderType (name VARCHAR);
ALTER TABLE OrderType ADD PRIMARY KEY (name);
INSERT INTO OrderType(name) VALUES('sale-order-type-1');
INSERT INTO OrderType(name) VALUES('sale-order-type-2');
INSERT INTO OrderType(name) VALUES('buy-order-type-1');
INSERT INTO OrderType(name) VALUES('buy-order-type-2');
I wish to create a FK constraint from another table, say SaleInformation, pointing to that table (OrderType). However, I am trying to express that not all rows of OrderType are eligible for the purposes of that FK (it should only be sale-related order types).
I thought about creating a view of table OrderType with just the right kind of rows (view SaleOrderType) and adding a FK constraint to that view, but PostgreSQL balks at that with:
ERROR: referenced relation "SaleOrderType" is not a table
So it seems I am unable to create a FK constraint to a view (why?). Am I only left with the option of creating a redundant table to hold the sale-related order types? The alternative would be to simply allow the FK to point to the original table, but then I am not really expressing the constraint as strictly as I would like to.
I think your schema should be something like this
create table order_nature (
nature_id int primary key,
description text
);
insert into order_nature (nature_id, description)
values (1, 'sale'), (2, 'buy')
;
create table order_type (
type_id int primary key,
description text
);
insert into order_type (type_id, description)
values (1, 'type 1'), (2, 'type 2')
;
create table order_nature_type (
nature_id int references order_nature (nature_id),
type_id int references order_type (type_id),
primary key (nature_id, type_id)
);
insert into order_nature_type (nature_id, type_id)
values (1, 1), (1, 2), (2, 1), (2, 2)
;
create table sale_information (
nature_id int default 1 check (nature_id = 1),
type_id int,
foreign key (nature_id, type_id) references order_nature_type (nature_id, type_id)
);
If the foreign key clause would also accept an expression the sale information could omit the nature_id column
create table sale_information (
type_id int,
foreign key (1, type_id) references order_nature_type (nature_id, type_id)
);
Notice the 1 in the foreign key
You could use an FK to OrderType to ensure referential integrity and a separate CHECK constraint to limit the order types.
If your OrderType values really are that structured then a simple CHECK like this would suffice:
check (c ~ '^sale-order-type-')
where c is order type column in SaleInformation
If the types aren't structured that way in reality, then you could add some sort of type flag to OrderType (say a boolean is_sales column), write a function which uses that flag to determine if an order type is a sales order:
create or replace function is_sales_order_type(text ot) returns boolean as $$
select exists (select 1 from OrderType where name = ot and is_sales);
$$ language sql
and then use that in your CHECK:
check(is_sales_order_type(c))
You don't of course have to use a boolean is_sales flag, you could have more structure than that, is_sales is just for illustrative purposes.