SQL multiplicity many to many insertion process - sql

my question it seems to be pretty simple but unfortunately i couldn't find any satisfactory answer for this.
Please take a look on the following example:
Table Author define a author, Document has at least one author and table Authors group all authors that a document has.
My question is: Each time that i insert a document i need to verify if my group of authors exists already on table Authors.
What is the best way to do this?
I suppose that in first place i will need to verify if this group already exist and then, i should get that id (if already exist) or generate a new record on Authors (if group doesn't exist).
Question:
Is this the correct logic process that should occur on tables that has a multiplicity of many to many?
How can i check if a group of values already exist on table Authors??
There is something like this select * from Authors where a_id IN(1,2,3) but in a exclusive way. :S
Any help would be great.
Thanks

I would rather go with a solution with three tables:
Author
Document
rel_author_document
And rel_author_document will have a structure like:
author_fk
document_fk
You don't need to add a new group, but just to associate authors to a document.
In rel_author_document you can even add additional columns like role, if you need something like that.
With this approach you can have two documents with the same group of authors, but this won't kill you performances.
In case your question is for a homework assignment and you can't change table structure then:
Query the Authors table to see if you have a group of Author with the same number of author_id in the where condition, something like:
select count(1) cnt
from Authors
where a_id IN(1,2,3)
If cnt is different from the number of author_ids, then insert a new group
If cnt is equal then get the id of that group of Authors:
select ID
from Authors
where a_id = 1
or a_id = 2
or a_id = 3
group by 1
having count(1) = 3

If its a "many-to-many" relationship I believe the table structure you are looking for is:
Author (
a_id,
name,
birth_date,
domain)
Document (
d_id,
abstract,
year)
Auth_Doc (
a_id,
d_id
)
In terms of checking if there is already a group of authors, it depends n how you're displaying your data. If I was uploading a document and I wanted to check if a set of authors exist, I would do the following:
Say your data is stored as:
a_id | name
1 | john, jack, bob
SELECT * FROM Author WHERE name like '%john%'
I hope this helps,
Sohail

Related

SQL Query to return a table of specific matching values based on a criteria

I have 3 tables in PostgreSQL database:
person (id, first_name, last_name, age)
interest (id, title, person_id REFERENCES person)
location (id, city, state text NOT NULL, country, person_id REFERENCES person)
city can be null, but state and country cannot.
A person can have many interests but only one location. My challenge is to return a table of people who share the same interest and location.
All ID's are serialized and thus created automatically.
Let's say I have 4 people living in "TX", they each have two interests a piece, BUT only person 1 and 3 share a similar interest, lets say "Guns" (cause its Texas after all). I need to select all people from person table where the person's interest title (because the id is auto generated, two Guns interest would result in two different ID keys) equals that of another persons interest title AND the city or state is also equal.
I was looking at the answer to this question here Select Rows with matching columns from SQL Server and I feel like the logic is sort of similar to my question, the difference is he has two tables, to join together where I have three.
return a table of people who share the same interest and location.
I'll interpret this as "all rows from table person where another rows exists that shares at least one matching row in interest and a matching row in location. No particular order."
A simple solution with a window function in a subquery:
SELECT p.*
FROM (
SELECT person_id AS id, i.title, l.city, l.state, l.country
, count(*) OVER (PARTITION BY i.title, l.city, l.state, l.country) AS ct
FROM interest i
JOIN location l USING (person_id)
) x
JOIN person p USING (id)
WHERE x.ct > 1;
This treats NULL values as "equal". (You did not specify clearly.)
Depending on undisclosed cardinalities, there may be faster query styles. (Like reducing to duplicative interests and / or locations first.)
Asides 1:
It's almost always better to have a column birthday (or year_of_birth) than age, which starts to bit-rot immediately.
Asides 2:
A person can have [...] only one location.
You might at least add a UNIQUE constraint on location.person_id to enforce that. (If you cannot make it the PK or just append location columns to the person table.)

How to remove line duplicates SQL via compare two same table

How to remove duplicate values a = b and b = a?
with a as(select w.id , w.doc, w.org
, d.name_s, d.name_f, d.name_p, d.spec
, o.name, o.extid
from crm_s_workplaces w
join crm_s_docs d on d.id=w.doc
join crm_s_orgs o on o.id=w.org
where d.active=1 and d.cst='NY' and w.active=1 and w.cst='NY' and o.active=1
and
o.cst='NY')
select a1.doc, a2.doc,
a1.org,a1.name_s,a1.name_f,a1.name_p,a2.name_s,a2.name_f,a2.name_p from a a1
join a a2 on
a1.name_s=a2.name_s and
substr(a1.name_f,1,1)=substr(a2.name_f,1,1) and
substr(a1.name_p,1,1)=substr(a2.name_p,1,1) and
a1.org=a2.org and
a1.spec<>a2.spec
order by a1.name_s `enter code here`
ER model diagram:
Repeat example:
Sometimes comes across a1.spec > a2.spec:
What you are calling "duplicates" are actually not duplicates in your database.
You basically have multiple doc records for what could be the same person or not. See that even their names do not always match. For instace,
doc_id 1131385 has NAME_F = "Gabr" while
doc_id 1447530 has NAME_F = "Gabor"
In your database these are two different entities, and you cannot match them using primary key. You can try to join on the first, middle and last names, but as you can see in the above example with Gabor/Gabr, even that would not work.
Can you change the schema of the db? If so you need to separate the docs in one table - 1 record per doctor. And have the specialization in a separate table with the folloing columns:
spec_id (int, PK)
doc_id (foreign key to Doc table)
specialization
that way, if you have 1 doctor with 3 specs, he/she will show up only once in doc table, and multiple times in spec table.
I just notice something else. You have spec field in table workplaces. why? If you meant to say that Doc Gabor works as admin in hospital 1 but as a Therapist in hospital 2, you can do that. However, you have to remove the spec field from the doc table and only use the spec in workplaces table.

SQL: How do I find which movie genre a user watched the most? (IMDb personal project)

I'm currently working on a personal project and I could use a little help. Here's the scenario:
I'm creating a database (MS Access) for all of the movies myself and some friends have ever watched. We rated all of our movies on IMDb and used the export feature to get all of the movie data and our movie ratings. I plan on doing some summary analysis on Excel. One thing I am interested in is the most common movie genre that each person watched. Below is my current scenario. Note that the column "const" is the movies' unique IDs. I also have individual tables for each person's ratings and the following tables are the summary tables that make up the combination of all the movies we have watched.
Here's the table I had: http://imgur.com/v5x9Dhg
I assigned each genre an ID, like this: http://imgur.com/aXdr9XI
And here is a table where I have separate instances for each movie ID and a unique genre: http://imgur.com/N0wULo8
I want to find a way to count up all of the genres that each person watches. Any advice? I would love to provide any additional information that you need!
Thank you!
You need to have at least one table which has one row per user and const (movie watched). In the 3 example tables you posted nothing shows who watched which movies, which is information you need to solve your problem. You mention having "individual tables for each person's ratings," so I assume you have that information. You will want to combine all of them though, into a table called PERSON_MOVIE or something of the like.
So let's say your second table is called GENRE and its columns are ID, Genre.
Let's say your third table is called GENRE_MOVIE and its columns are Const and ID (ID corresponds to ID on the GENRE table)
Let's say the fourth table, which you did not post, but which is required, is called PERSON_MOVIE and its columns are person, Const, rating.
You could then write a query like this:
select vw1.*, ge.genre
from (select um.person, gm.id as genre_id, count(*) as num_of_genre
from user_movie um
inner join genre_movie gm
on um.const = gm.const
group by um.person, gm.id) vw1
inner join (select person, max(num_of_genre) as high_count
from (select um.person, gm.id, count(*) as num_of_genre
from user_movie um
inner join genre_movie gm
on um.const = gm.const
group by um.person, gm.id) x
group by person) vw2
on vw1.person = vw2.person
and vw1.num_of_genre = vw2.high_count
inner join genre ge
on vw1.genre_id = ge.id
Edit re: your comment:
So right now you have multiple tables reflecting people's ratings of movies. You need to combine those into a table called PERSON_MOVIE or something similar (as in example above).
There will be 3 columns on the table: person, const, rating
I'm not sure if access supports the traditional create table as select query but ordinarily you would be able to construct such a table in the following way:
create table person_movie as
select 'Bob', const, [You rated]
from ratings_by_bob
union all
select 'Sally', const, [You rated]
from ratings_by_sally
union all
select 'Jack', const, [You rated]
from ratings_by_jack
....
If not, just combine the tables manually and add a third column as shown indicating what users are reflected by each row. Then you can run my initial query.

SQL Cross-Table Referencing

Okay, so I've got two tables. One table (table 1) contains a column Books_Owned_ID which stores a series of numbers in the form of 1,3,7. I have another table (table 2) which stores the Book names in one column and the book ID in another column.
What I want to do is create an SQL code which will take the numbers from Books_Owned_IDand display the names of those books in a new column. Like so:
|New Column |
Book 1 Name
Book 2 Name
Book 3 Name
I can't wrap my head around this, it's simple enough but all the threads I look on get really confusing.
Table1 contains the following columns:
|First_Name| Last_Name| Books_Owned_ID |
Table2 contains the following columns:
|Book_Name|Book_ID|
You need to do an inner join. This is a great example/reference for these
SELECT Book_Name FROM Table2
INNER JOIN Table1
ON Table1.Books_Owned_ID = Table2.Book_ID
EDIT SQL Fiddle
I will work on getting the column comma split working. It wont be a lot extra for this.
EDIT 2 See this answer to build a function to split your string. Then you can do this:
SELECT Book_Name FROM Table2
WHERE Book_ID IN(SELECT FN_ListToTable(',',Table1.Books_Owned_ID) FROM Table1 s)
The core of this centers around data normalisation... Each fact is stored only once (and so is "authoritative"). You should also get into the habit of only storing a single fact in any field.
So, imagine the following table layouts...
Books
Id, Name, Description
Users
Id, Username, EmailAddress, PasswordHash, etc....
BooksOwned
UserId, BookId
So if a single user owns multiple books, there will be multiple entries in the BooksOwned table...
UserId, BookID
1, 1
1, 2
1, 3
Indicates that User 1 owns books 1 through 3.
The reason to do it this way is that it makes it much easier to query in future. You also treat BookId as an Integer instead of a string containing a list - so you don't need to worry about string manipulation to do your query.
The following would return the name of all books owned by the user with Id = 1
SELECT Books.Name
FROM BooksOwned
INNER JOIN Books
ON BooksOwned.BookId = Books.Id
WHERE BooksOwned.UserId = 1
You need a function which takes a comma separated list and returns a table. This is slow and fundamentally a bad idea. Really all this does is convert this way of doing it to be like the data model I describe below. (see ProfessionalAmateur's answer for an example of this).
If you are just starting change your data model. Make a linking table. Like this:
Okay, so I've got two tables. One table (table 1) contains a column Books_Owned_ID which stores a series of numbers in the form of 1,3,7. I have another table (table 2) which stores the Book names in one column and the book ID in another column.
What I want to do is create an SQL code which will take the numbers from Books_Owned_IDand display the names of those books in a new column. Like so:
Person Table
|First_Name| Last_Name| Person_ID |
Book Table
|Book_Name|Book_ID|
PersonBook Table
|PersonID|BookID|
This table can have more than one row for each person.

Mysql parent child in same table

I have following table layout (all in one table SQL):
user_id, user_id_parent, fname, lname, shopname
I want to be able to print out fname and lastname and show what shop name the user is related to via (user_id_parent(is refering to user_id, you could say that user_id is a kind of shopname id) ). If it is a normal useracount in this case, the shopname is empty.
I would guess I should use som kind of Join, but i don't know how to use it when it's in the same table...
Result something like:
John Doe, Relating to: Shop #1
I did this working SQL and guess what, Tada! :-)
Posting this, so it may help other guys trying to do the same thing.
SELECT e.`user_type` AS usertype, e.`fname` AS employee, m.`shopname` AS associated_to
FROM rw_users e
INNER JOIN rw_users m ON e.`user_id_parent` = m.`user_id`