SQL: How to separate string values separated by commas? - sql

I'm trying to create a relational database of all the movies I have watched.
I used IMDb to rate the movies I've seen and used the site's export capability to get the data in a .csv file which I uploaded to Microsoft Access. However, the "Genre" column is a many-to-many relationship that I am hoping to turn into a one-to-many relationship.
I would like to have a table called GENRE_ID that assigns each genre a numerical ID. Then I'd have another table where each instance would have the movie ID ("const"), line item number, and GENRE_ID.
So it might look like:
const line_item GENRE_ID
tt0068646 1 1 (if GENRE_ID 1 = "crime")
tt0068646 2 2 (if GENRE_ID 2 = "drama")
Here's a link to the image of my database's current state. Thank you so much for your help. This is a project I'm doing to learn more on my own time.

Basically, when you have a many-to-many relationship, you should use a separate table for that relationship.
In your case, I would recommend three tables:
Film table: contains the information in your current table, except Genres
Genre table: contains (at least) Id and Name
Film_Genre table: contains Film_Id and GenreId.
For example, the data in your Genre table would be
row 1: Id = 1, Name = "Crime"
row 2: Id = 2, Name = "Drama"
and so on.
Your Film_Genre table would be something like:
row 1: Film_Id = tt0068646, GenreId = 1
row 2: Film_Id = tt0068646, GenreId = 2
row 3: Film_Id = tt0082971, GenreId = 2
and so on.
(I assume that you use the "const" column as the Id of the Film table; if not, you should add your own Id.)
Of course, it will take a little effort to transform your current database into this structure.

Some notes on a way to a solution.
A table of genres
ID Genre
1 Action
2 Adventure
3 Thriller
4 War
An import table
Const GenreList
tt00 Action, Adventure, Thriller, War
A query
SELECT ti.Const, ti.GenreList, tg.Genre
FROM Imports as ti, Genres as tg
WHERE ti.GenreList Like "*" & tg.Genre & "*"
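If the export can be processed outside Access, the split can also be done with a short script. The sketch below is only an illustration: it assumes Python with its bundled sqlite3 module standing in for the real database, and the names genre, film_genre and load_genres are made up for this example. It splits each GenreList on commas instead of using a substring match, so a genre whose name is contained in another one (say Music inside Musical) cannot match by accident.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genre (
    genre_id INTEGER PRIMARY KEY,
    name     TEXT UNIQUE NOT NULL
);
CREATE TABLE film_genre (
    const    TEXT NOT NULL,
    genre_id INTEGER NOT NULL REFERENCES genre(genre_id),
    PRIMARY KEY (const, genre_id)
);
""")

def load_genres(conn, imported_rows):
    """Split each comma-separated genre list into one film_genre row per genre."""
    for const, genre_list in imported_rows:
        for name in (part.strip() for part in genre_list.split(",")):
            if not name:
                continue  # skip empty pieces such as a trailing comma
            # Create the genre on first sight; the UNIQUE constraint makes repeats a no-op.
            conn.execute("INSERT OR IGNORE INTO genre (name) VALUES (?)", (name,))
            (genre_id,) = conn.execute(
                "SELECT genre_id FROM genre WHERE name = ?", (name,)
            ).fetchone()
            conn.execute(
                "INSERT OR IGNORE INTO film_genre (const, genre_id) VALUES (?, ?)",
                (const, genre_id),
            )

# Sample rows in the shape of the import table (Const, GenreList).
load_genres(conn, [
    ("tt0068646", "Crime, Drama"),
    ("tt00", "Action, Adventure, Thriller, War"),
])

rows = conn.execute("""
    SELECT fg.const, g.name
    FROM film_genre fg JOIN genre g ON g.genre_id = fg.genre_id
    ORDER BY fg.const, g.genre_id
""").fetchall()
```

After the load, rows holds one (const, genre name) pair per film and genre, which is the one-row-per-genre shape asked for in the question.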

Related

Database schema for Sales Commissions

I'm trying to create a database with a table of titles, which contains each title's name, a code (a short code for the name), and the commission that title earns on sales made by other titles.
I have a table named Title
Id Name Code CommissionOnA CommissionOnEng
1 Admin A 0 15
2 Engineer Eng 1 0
Is it a good idea to have a table schema like this, given that titles can be inserted, updated or deleted dynamically? With my current approach I have to alter the table and add another column in order to add the commission for each new title.
Is there a better way to do it, bearing in mind that it also has to support a multilevel sales hierarchy? A schema for any database is fine, but MySQL is preferred.
The scenario is that the form where a user creates a new title dynamically renders all the titles that exist in the table, each with a textbox, so that the user can enter the new title's commissions with respect to the other titles.
For instance, if the user creates a new title named "Consultant" with code "C", he should see textboxes for Admin and Engineer, so that when he saves it, a row is created and the table holds the following data
Id Name Code CommissionOnA CommissionOnEng CommissionOnC
1 Admin A 0 15 0
2 Engineer Eng 1 0 0
3 Consultant C 12 5 0
Now I have another table called Employees
Id Name Title ManagerId
1 Rob 1 Null
2 Kate 2 1
3 Eli 3 2
4 Al 2 3
Now when I do the recursion, each time a junior makes a sale, a commission should be transferred to his manager as well as to his manager's manager, based on the commission specified in the title table.
So, when Al sells something, Eli should get a commission of 5, because Eli's title is Consultant and Eli is Al's boss: an employee with title Consultant (3) gets a commission of 5 if an employee with title Engineer (2) sells something.
It's better to normalise your table schema so that you don't need to add new columns. Instead, put those related columns into their own table and join the records via a foreign key.
For example, create a new table named commissions with a column for its unique ID, a column for the ID that relates it to the titles table, and a column for the commission amount:
commissions
----------------------------
id (INT, NOT NULL, Primary Key)
titles_id (INT, NOT NULL)
amount (INT, NOT NULL, DEFAULT=0)
and the data would look like:
id titles_id amount
1 1 15
2 2 1
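A single titles_id only records which title a row belongs to, while the original columns (CommissionOnA, CommissionOnEng, ...) depend on two titles: the one that earns and the one that sells. One possible extension keys each commission row on that pair. It is sketched here with Python's sqlite3 module purely for illustration (MySQL 8 accepts the same WITH RECURSIVE syntax). The names earner_title_id, seller_title_id and commissions_for_sale are assumptions for this sketch, and it assumes the commission always depends on the title of the original seller.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE title (id INTEGER PRIMARY KEY, name TEXT NOT NULL, code TEXT NOT NULL);
CREATE TABLE commission (
    earner_title_id INTEGER NOT NULL REFERENCES title(id),
    seller_title_id INTEGER NOT NULL REFERENCES title(id),
    amount          INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (earner_title_id, seller_title_id)
);
CREATE TABLE employee (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    title_id   INTEGER NOT NULL REFERENCES title(id),
    manager_id INTEGER REFERENCES employee(id)
);
INSERT INTO title VALUES (1, 'Admin', 'A'), (2, 'Engineer', 'Eng'), (3, 'Consultant', 'C');
-- Non-zero cells of the question's wide table; a missing pair means 0.
INSERT INTO commission VALUES (1, 2, 15), (2, 1, 1), (3, 1, 12), (3, 2, 5);
INSERT INTO employee VALUES
    (1, 'Rob', 1, NULL), (2, 'Kate', 2, 1), (3, 'Eli', 3, 2), (4, 'Al', 2, 3);
""")

def commissions_for_sale(conn, seller_id):
    """Map each manager above the seller to the commission earned on this sale."""
    rows = conn.execute("""
        WITH RECURSIVE chain(id, name, title_id, manager_id) AS (
            -- start at the seller, then climb one manager per step
            SELECT id, name, title_id, manager_id FROM employee WHERE id = :seller
            UNION ALL
            SELECT m.id, m.name, m.title_id, m.manager_id
            FROM employee m JOIN chain c ON m.id = c.manager_id
        )
        SELECT ch.name, COALESCE(c.amount, 0)
        FROM chain ch
        JOIN employee s ON s.id = :seller
        LEFT JOIN commission c
               ON c.earner_title_id = ch.title_id
              AND c.seller_title_id = s.title_id
        WHERE ch.id <> :seller          -- the seller earns no commission on himself
    """, {"seller": seller_id}).fetchall()
    return dict(rows)

payouts = commissions_for_sale(conn, 4)  # Al makes a sale
```

For a sale by Al, payouts gives Eli a commission of 5, matching the example in the question. Adding a new title then means inserting rows into commission rather than altering the table.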

Tricky PostgreSQL join and order query

I've got four tables in a PostgreSQL 9.3.6 database:
sections
fields (child of sections)
entries (child of sections)
data (child of entries)
CREATE TABLE section (
    id serial PRIMARY KEY,
    title text,
    "group" integer
);
CREATE TABLE fields (
    id serial PRIMARY KEY,
    title text,
    section integer,
    type text,
    "default" json
);
CREATE TABLE entries (
    id serial PRIMARY KEY,
    section integer
);
CREATE TABLE data (
    id serial PRIMARY KEY,
    data json,
    field integer,
    entry integer
);
I'm trying to generate a page that looks like this:
section title
field 1 title | field 2 title | field 3 title
entry 1 | data 'as' json | data 1 json | data 3 json <-- table
entry 2 | data 'df' json | data 5 json | data 6 json
entry 3 | data 'gh' json | data 8 json | data 9 json
The way I have it set up right now each piece of 'data' has an entry it's linked to, a corresponding field (that field has columns that determine how the data's json field should be interpreted), a json field to store different types of data, and an id (1-9 here in the table).
In this example there are 3 entries, and 3 fields and there is a data piece for each of the cells in between.
It's set up like this because one section can have different field types and quantity than another section and therefore different quantities and types of data.
Challenge 1:
I'm trying to join the tables together in such a way that the result is sortable by any of the columns (the contents of the data's json column for that field). For example, if I sort by field 3 (the third column) in reverse order, the table would look like this:
section title
field 1 title | field 2 title | field 3 title
entry 3 | data 'gh' json | data 8 json | data 9 json
entry 2 | data 'df' json | data 5 json | data 6 json
entry 1 | data 'as' json | data 1 json | data 3 json <-- table
I'm open to doing it another way too if there's a better one.
Challenge 2:
Each field has a 'default value' column - Ideally I only have to create 'data' entries when they have a value that isn't that default value. So the table might actually look like this if field 2's default value was 'asdf':
section title
field 1 title | field 2 title | field 3 title
entry 3 | data 'gh' json | data 8 json | data 9 json
entry 2 | data 'df' json | 'asdf' | data 6 json
entry 1 | data 'as' json | 'asdf' | data 3 json <-- table
The key to writing this query is understanding that you just need to fetch all the data for a single section; the rest you just join. With your schema you also can't filter data by section directly, so you'll need to join entries just for that:
SELECT d.* FROM data d JOIN entries e ON (d.entry = e.id)
WHERE e.section = ?
You can then join field to each row to get defaults, types and titles:
SELECT d.*, f.title, f.type, f."default"
FROM data d JOIN entries e ON (d.entry = e.id)
JOIN fields f ON (d.field = f.id)
WHERE e.section = ?
Or you can select the fields in a separate query to save some network traffic.
So that was the answer; here come some bonuses:
Declare foreign key constraints instead of using plain integers to refer to other tables; the database will then check consistency for you.
Relations (tables) are named in the singular by convention, so it's section, entry and field.
Referring columns are named <name>_id by convention, e.g. field_id or section_id.
The whole point of JSON fields is to store a collection of data that is not statically defined, so it would make much more sense to drop the entries and data tables and use a single table with a JSON column containing all the fields instead.
Like this:
CREATE TABLE row ( -- less generic name would be even better
id int primary key,
section_id int references section (id),
data json
)
With data fields containing something like:
{
"title": "iPhone 6",
"price": 650,
"available": true,
...
}
@Suor has provided good advice, some of which you already accepted. I am building on the updated schema.
Schema
CREATE TABLE section (
    section_id serial PRIMARY KEY,
    title text,
    grp integer
);
CREATE TABLE field (
    field_id serial PRIMARY KEY,
    section_id integer REFERENCES section,
    title text,
    type text,
    default_val json
);
CREATE TABLE entry (
    entry_id serial PRIMARY KEY,
    section_id integer REFERENCES section
);
CREATE TABLE data (
    data_id serial PRIMARY KEY,
    field_id integer REFERENCES field,
    entry_id integer REFERENCES entry,
    data json
);
I changed two more details:
section_id instead of id, etc. "id" as a column name is an anti-pattern that has become popular because a couple of ORMs use it. Don't. Descriptive names are much better, and using identical names for identical content is a helpful guideline. It also allows you to use the USING shortcut in join clauses.
Don't use reserved words as identifiers. Use legal, lower-case, unquoted names exclusively to make your life easier.
Are PostgreSQL column names case-sensitive?
Referential integrity?
There is another inherent weakness in your design. What stops entries in data from referencing a field and an entry that don't go together? Closely related question on dba.SE
Enforcing constraints “two tables away”
Query
Not sure if you need the complex design at all. But to answer the question, this is the base query:
SELECT entry_id, field_id, COALESCE(d.data, f.default_val) AS data
FROM entry e
JOIN field f USING (section_id)
LEFT JOIN data d USING (field_id, entry_id) -- can be missing
WHERE e.section_id = 1
ORDER BY 1, 2;
The LEFT JOIN is crucial to allow for missing data entries and use the default instead.
SQL Fiddle.
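The result of this base query can also be pivoted and sorted in the application. The following is only a sketch, assuming Python with the bundled sqlite3 module in place of PostgreSQL and plain text values in place of json; section_grid and the sample rows are invented for the illustration. It runs the same LEFT JOIN plus COALESCE query, groups the cells by entry, and sorts on whichever field is requested.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE section (section_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE field (field_id INTEGER PRIMARY KEY,
                    section_id INTEGER REFERENCES section,
                    title TEXT, default_val TEXT);
CREATE TABLE entry (entry_id INTEGER PRIMARY KEY,
                    section_id INTEGER REFERENCES section);
CREATE TABLE data (data_id INTEGER PRIMARY KEY,
                   field_id INTEGER REFERENCES field,
                   entry_id INTEGER REFERENCES entry,
                   data TEXT);
INSERT INTO section VALUES (1, 'section title');
INSERT INTO field VALUES (1, 1, 'field 1 title', NULL),
                         (2, 1, 'field 2 title', 'asdf'),
                         (3, 1, 'field 3 title', NULL);
INSERT INTO entry VALUES (1, 1), (2, 1), (3, 1);
-- Entries 1 and 2 have no row for field 2, so its default applies.
INSERT INTO data (field_id, entry_id, data) VALUES
    (1, 1, 'as'), (3, 1, '3'),
    (1, 2, 'df'), (3, 2, '6'),
    (1, 3, 'gh'), (2, 3, '8'), (3, 3, '9');
""")

def section_grid(conn, section_id, sort_field_id, descending=False):
    """Return [(entry_id, [cell per field])], sorted by one field's value."""
    rows = conn.execute("""
        SELECT e.entry_id, f.field_id, COALESCE(d.data, f.default_val)
        FROM entry e
        JOIN field f ON f.section_id = e.section_id
        LEFT JOIN data d ON d.field_id = f.field_id
                        AND d.entry_id = e.entry_id   -- can be missing
        WHERE e.section_id = ?
        ORDER BY e.entry_id, f.field_id
    """, (section_id,)).fetchall()
    grid = {}
    for entry_id, field_id, value in rows:
        grid.setdefault(entry_id, {})[field_id] = value
    # Plain string comparison; cells that are NULL would need extra handling.
    ordered = sorted(grid.items(),
                     key=lambda item: item[1][sort_field_id],
                     reverse=descending)
    return [(entry_id, [cells[f] for f in sorted(cells)])
            for entry_id, cells in ordered]

table = section_grid(conn, 1, sort_field_id=3, descending=True)
```

With this sample data the entries come back in the order 3, 2, 1, and entries 1 and 2 show 'asdf' in the second column, as in the Challenge 2 layout from the question.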
crosstab()
The final step is cross tabulation. Cannot show this in SQL Fiddle since the additional module tablefunc is not installed.
Basics for crosstab():
PostgreSQL Crosstab Query
SELECT * FROM crosstab(
   $$
   SELECT entry_id, field_id, COALESCE(d.data, f.default_val) AS data
   FROM entry e
   JOIN field f USING (section_id)
   LEFT JOIN data d USING (field_id, entry_id) -- can be missing
   WHERE e.section_id = 1
   ORDER BY 1, 2
   $$
  ,$$SELECT field_id FROM field WHERE section_id = 1 ORDER BY field_id$$
   ) AS ct (entry int, f1 json, f2 json, f3 json) -- static
ORDER BY f3->>'a'; -- static
The tricky part here is the return type of the function. I provided a static type for 3 fields, but you really want that dynamic. Also, I am referencing a field in the json type that may or may not be there ...
So build that query dynamically and execute it in a second call.
More about that:
Dynamic alternative to pivot with CASE and GROUP BY

Insert into table some values which are selected from other table

My database structure is like this:
ATT_table- ActID(PK), assignedtoID(FK), assignedbyID(FK), Env_ID(FK), Product_ID(FK), project_ID(FK), Status
Product_table - Product_ID(PK), Product_name
Project_Table- Project_ID(PK), Project_Name
Environment_Table- Env_ID(PK), Env_Name
Employee_Table- Employee_ID(PK), Name
Employee_Product_projectMapping_Table -Emp_ID(FK), Project_ID(FK), Product_ID(FK)
Product_EnvMapping_Table - Product_ID(FK), Env_ID(FK)
I want to insert values into ATT_Table. That table has columns such as assignedtoID, assignedbyID, envID, ProductID and project_ID, which are foreign keys in this table but primary keys in other tables (they are simply numbers).
When I take input from the user, I get it as strings: the user enters a name (Employee_Table) or a product name (Product_table), not an ID directly. So I want to let the user enter the name (of the employee, product, project or environment), then pick up the value of the matching primary key (Emp_ID, product_ID, project_ID, Env_ID) and insert those values into ATT_table as assignedtoID, assignedbyID, envID, ProductID and project_ID.
Please note that assignedtoID and assignedbyID both reference Emp_ID in Employee_Table.
How can I do this? I have something like this, but it's not working:
INSERT INTO ATT_TABLE(Assigned_To_ID,Assigned_By_ID,Env_ID,Product_ID,Project_ID)
VALUES (A, B, Env_Table.Env_ID, Product_Table.Product_ID, Project_Table.Project_ID)
SELECT Employee_Table.Emp_ID AS A,Employee_Table.Emp_ID AS B, Env_Table.Env_ID, Project_Table.Project_ID, Product_Table.Product_ID
FROM Employee_Table, Env_Table, Product_Table, Project_Table
WHERE Employee_Table.F_Name= "Shantanu" or Employee_Table.F_Name= "Kapil" or Env_Table.Env_Name= "SAT11A" or Product_Table.Product_Name = "ABC" or Project_Table.Project_Name = "Project1";
The usual way to handle this is with drop-down select lists. The list consists of (at least) two columns: one holds the Ids the database works with, the other(s) hold the strings the user sees. Like
1, "CA", "Canada"
2, "USA", "United States"
...
The user sees
CA | Canada
USA| United States
...
The value that gets stored in the database is 1, 2, ... whatever row the user selected.
You can never rely on exact, correct input from users. Sooner or later they will make typos.
I'm extending my answer, based on your remark.
The problem with the given solution (getting the Ids from the parent tables by joining all those parent tables on the entered text and combining the conditions with a number of ANDs) is that as soon as one parameter has a typo, you will not get a single record back. Imagine the consequences when the real F_Name of the employee is "Shant*anu*" and the user entered "Shant*aun*".
The best way to cope with this is to get those Ids one by one from the parent tables. Suppose some FKs have a NOT NULL constraint. You can check whether F_Name is filled in and inform the user when it is not. But if the user entered "Shant*aun*" as the name, the program will not warn him, because something is filled in. That is not the check the database will do, though, because the NOT NULL constraints are defined on the Ids (FKs). When you get the Ids one by one from the parent tables, you can verify whether each one is NULL. When the text is filled in, like "Shant*aun*", but the returned Id is NULL, you can inform the user of the problem and let him correct his input: "No employee by the name 'Shantaun' could be found."
SELECT $Emp_ID_A = Emp_ID
FROM Employee_Table
WHERE F_Name= "Shantanu"
SELECT $Emp_ID_B = Emp_ID
FROM Employee_Table
WHERE F_Name= "Kapil"
SELECT $Env_ID = Env_ID
FROM Env_Table
WHERE Env_Table.Env_Name= "SAT11A"
SELECT $Product_ID = Product_ID
FROM Product_Table
WHERE Product_Table.Product_Name = "ABC"
SELECT $Project_ID = Project_ID
FROM Project_Table
WHERE Project_Name = "Project1"
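The one-by-one lookup with a per-field error message could be sketched like this. It is an illustration only, assuming Python with the bundled sqlite3 module and a cut-down Employee_Table; the helper lookup_id is a made-up name, not part of any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee_Table (Emp_ID INTEGER PRIMARY KEY, F_Name TEXT NOT NULL);
INSERT INTO Employee_Table VALUES (1, 'Shantanu'), (2, 'Kapil');
""")

def lookup_id(conn, table, id_column, name_column, name):
    """Return the id whose name column equals `name`, or raise LookupError."""
    # Table and column names are fixed by the program, never taken from the user;
    # only the user-entered name is passed as a bound parameter.
    row = conn.execute(
        f"SELECT {id_column} FROM {table} WHERE {name_column} = ?", (name,)
    ).fetchone()
    if row is None:
        raise LookupError(f"No row in {table} where {name_column} is {name!r}")
    return row[0]

assigned_to = lookup_id(conn, "Employee_Table", "Emp_ID", "F_Name", "Shantanu")
assigned_by = lookup_id(conn, "Employee_Table", "Emp_ID", "F_Name", "Kapil")

try:
    lookup_id(conn, "Employee_Table", "Emp_ID", "F_Name", "Shantaun")  # typo
except LookupError as err:
    message = str(err)  # can be shown to the user so the input can be corrected
```

Because each name is resolved separately, a typo in one field produces an error for that field instead of silently inserting nothing.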
Please use AND instead of OR.
INSERT INTO ATT_TABLE(Assigned_To_ID,Assigned_By_ID,Env_ID,Product_ID,Project_ID)
SELECT A.Emp_ID, B.Emp_ID, Env_Table.Env_ID, Product_Table.Product_ID, Project_Table.Project_ID
FROM Employee_Table A, Employee_Table B, Env_Table, Product_Table, Project_Table
WHERE A.F_Name= "Shantanu"
AND B.F_Name= "Kapil"
AND Env_Table.Env_Name= "SAT11A"
AND Product_Table.Product_Name = "ABC"
AND Project_Table.Project_Name = "Project1";
But I guess it is best practice to use a drop-down list in your scenario.

SQL field with multiple id's of other table

Could someone give me an idea how to create this database structure.
Here is an example:
Table "countries":
id, countryname
1, "US"
2, "DE"
3, "FR"
4, "IT"
Now I have another table "products" and in there I would like to store all countries where this product is available:
Table "products":
id,productname,countries
1,"product1",(1,2,4) // available in countries US, DE, IT.
2,"product2",(2,3,4) // available in countries DE, FR, IT.
My question:
How do I design the table structure in "products" to be able to store multiple countries?
My best idea is to put a comma-separated string in there (i.e. "1,2,4"), then split that string to look up each entry. But I doubt that this is the best way to do it.
EDIT: Thank you all for your help, amazing! It was difficult to choose the right answer; I finally chose Greg's because he pointed me to a JOIN explanation and gave an example of how to use it.
You need an intersection table for that many-to-many relationship.
Table Country
CountryID, CountryName
Table CountryProduct
CountryID, ProductID
Table Product
ProductID, ProductName
You then Inner Join all 3 tables to get your list of Countries & Products.
Select * From Country
Inner Join CountryProduct On Country.CountryID = CountryProduct.CountryID
Inner Join Product On CountryProduct.ProductID = Product.ProductID
Without denormalizing, you'll need to add an extra table
Table Product countries
ProductID CountryID
1 1
1 2
1 4...
What you're talking about is normalisation. You have a many-to-many structure, so you should create another table to link the two. You should never (ok, pretty much never) use delimited strings to store a list of values in a relational database.
Here's an example of the setup:
product_countries table
productid | countryid
----------+-----------
1 | 1
1 | 2
1 | 4
2 | 2
2 | 3
2 | 4
Make each column a foreign key to its parent table, then combine the two columns into a composite primary key.
You can then get a list of supported products for a country ID like this:
SELECT * FROM products, product_countries
WHERE products.id = product_countries.productid
AND product_countries.countryid = $cid
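As a runnable illustration of this setup, here is a minimal sketch using Python's bundled sqlite3 module as a stand-in for the real database. The helper products_in_country is a made-up name, and the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE countries (id INTEGER PRIMARY KEY, countryname TEXT NOT NULL);
CREATE TABLE products  (id INTEGER PRIMARY KEY, productname TEXT NOT NULL);
CREATE TABLE product_countries (
    productid INTEGER NOT NULL REFERENCES products(id),
    countryid INTEGER NOT NULL REFERENCES countries(id),
    PRIMARY KEY (productid, countryid)   -- each pair can be stored only once
);
INSERT INTO countries VALUES (1, 'US'), (2, 'DE'), (3, 'FR'), (4, 'IT');
INSERT INTO products  VALUES (1, 'product1'), (2, 'product2');
INSERT INTO product_countries VALUES (1, 1), (1, 2), (1, 4), (2, 2), (2, 3), (2, 4);
""")

def products_in_country(conn, country_id):
    """Names of all products available in the given country."""
    rows = conn.execute("""
        SELECT p.productname
        FROM products p
        JOIN product_countries pc ON pc.productid = p.id
        WHERE pc.countryid = ?
        ORDER BY p.id
    """, (country_id,))
    return [name for (name,) in rows]

in_italy = products_in_country(conn, 4)

# A link to a country that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO product_countries VALUES (1, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The composite primary key prevents duplicate links, and the foreign keys stop a link from pointing at a product or country that is not there, which a comma-separated string cannot guarantee.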
You could also make a third table countries_products with fields country_id and product_id.
The best approach for relational databases is the following:
One table for countries, let's say
country_id, country_desc (country_id is the primary key)
One table for products, let's say
product_id, product_desc and as many columns as you want (product_id is the primary key)
If you were sure to have only one country per product, it would be enough to have a foreign key pointing to country_id in each product row. The foreign key guarantees that there is an actual country in the country table behind each country_id.
In your case you have several countries per product, so add a separate association table
product_id, country_id
where both columns together form the primary key and each one is also a foreign key.

Storing and retrieving multiple values from a foreign table

I have one table "Books", with a column "genres" in which I want to reference another table that contains a list of genres.
The problem is that I want to store more than one genre in the column.
E.g,
ID | Author | Title | Genres
2 | David Baldacci |Stone Cold | 1,4 (action thriller )
table "Books Genres" with 2 columns Id and Genre.
1 Action
2 Drama
3 Comedy
4 Thriller
5 Horror
Can something like this be done? Or is it not practical, and should I store the genres as a simple string?
The best way to solve this problem is with what is called a linking table.
A linking table would look like this:
ID (optional) | BookID | GenreID
1             | 2      | 1
2             | 2      | 4
Then each book could have multiple rows (or one) in this table.
(The optional ID would be useful if you care about row-level auditing of your tables, or transaction auditing -- you could use it to uniquely id this row -- for your problem it is not needed).
It can be done, but shouldn't.
It is bad database design - your database should be normalized.
I suggest a many-to-many table with genreId and bookId columns (each a foreign key, and together forming a composite primary key). This will work as the link you need (a book can have many genres, and each genre can apply to many books).
Taking your book as an example with the Book Genres table, this would look like:
bookId genreId
2 1
2 4
You need a third table, perhaps called BookGenre which acts as a "resolver" for the many-many relationship. BookGenre would have two columns, BookID and GenreID. A row would be added to BookGenre for each book for each genre.
There would be two rows in BookGenre for the example data you provided:
BookID GenreID
2 1
2 4
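Retrieving the genres for a book is then a join through BookGenre. A minimal sketch, assuming Python's bundled sqlite3 module in place of the real database, a genre table simply called Genres, and a hypothetical helper genres_for_book:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Books  (ID INTEGER PRIMARY KEY, Author TEXT, Title TEXT);
CREATE TABLE Genres (ID INTEGER PRIMARY KEY, Genre TEXT);
CREATE TABLE BookGenre (
    BookID  INTEGER NOT NULL REFERENCES Books(ID),
    GenreID INTEGER NOT NULL REFERENCES Genres(ID),
    PRIMARY KEY (BookID, GenreID)
);
INSERT INTO Books VALUES (2, 'David Baldacci', 'Stone Cold');
INSERT INTO Genres VALUES
    (1, 'Action'), (2, 'Drama'), (3, 'Comedy'), (4, 'Thriller'), (5, 'Horror');
INSERT INTO BookGenre VALUES (2, 1), (2, 4);
""")

def genres_for_book(conn, book_id):
    """Genre names linked to one book, in genre-ID order."""
    rows = conn.execute("""
        SELECT g.Genre
        FROM BookGenre bg
        JOIN Genres g ON g.ID = bg.GenreID
        WHERE bg.BookID = ?
        ORDER BY g.ID
    """, (book_id,))
    return [genre for (genre,) in rows]

stone_cold_genres = genres_for_book(conn, 2)
```

For the sample book this yields Action and Thriller, the two genres the question wanted to store as "1,4".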