Query relations within the same table - sql

I have a table of entities that can have many-to-many relations to each other, modelled with a second junction table. At first glance the design may seem flawed, and a separate table for each type of entity may look like the better choice. However, the entities are generic and completely user-defined. They can also be connected completely ad hoc, and each entity can have an unlimited number of connections.
Here is a simplified example of my tables:
Entities
------------
Entity | Id | Type
-------------------
Event | 1 | Request
Stroll | 2 | Activity
Dinner | 3 | Activity
Angela | 4 | Person
Anders | 5 | Person
Michael | 6 | Person
Junctions
----------------
Left | Right
----------------
1 | 2 // Connect Request -> Stroll
2 | 4 // Connect Stroll -> Angela
1 | 3 // Connect Request -> Dinner
3 | 5 // Connect Dinner -> Anders
3 | 6 // Connect Dinner -> Michael
Now to my question:
I would like to perform queries from the viewpoint of different entities. Let's say I would like to look at Requests and see what Activities they have, and any Persons attending each activity. I would like to get a result like this:
Request | Activity | Person
-----------------------------
Event   | Stroll   | Angela
        | Dinner   | Anders
        |          | Michael
I would also, for example, like to be able to flip the coin and look at Persons and see what Requests they attend, like this:
Person | Request
-----------------
Angela | Event
Anders | Event
Michael | Event
How can I write queries to achieve results like this, and is it even possible with the current structure? I have spent a lot of time googling and experimenting with no luck, and I am very grateful for any help.
Here is an SQLFiddle

That's how you do it
SELECT e1.Entity Request,
       e2.Entity Activity,
       e3.Entity Person
FROM Junctions j1
JOIN Junctions j2
  ON j1.`Right` = j2.`Left`
JOIN Entities e1
  ON j1.`Left` = e1.Id
JOIN Entities e2
  ON j1.`Right` = e2.Id
JOIN Entities e3
  ON j2.`Right` = e3.Id;
SQLFiddle
To help you understand: first I joined Junctions to itself, like this:
SELECT j1.`Left` Request,
       j1.`Right` Activity,
       j2.`Right` Person
FROM Junctions j1
JOIN Junctions j2
  ON j1.`Right` = j2.`Left`;
Then I joined the result to Entities, as you can see, to replace the Ids with names. One join for each type.
But, nevertheless, I still think that this architecture is horrible, and it needs to be redesigned.
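The flipped view from the question (Persons and the Requests they attend) can be built from the same self-join, just read in the other direction. The following is only a sketch in Python with an in-memory SQLite database, not part of the original answer; the table contents are copied from the question, and the `Type` filters are an added assumption so that only Person and Request rows end up at the two ends of the chain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Entities (Entity TEXT, Id INTEGER PRIMARY KEY, Type TEXT);
    CREATE TABLE Junctions ("Left" INTEGER, "Right" INTEGER);
    INSERT INTO Entities VALUES
        ('Event', 1, 'Request'), ('Stroll', 2, 'Activity'), ('Dinner', 3, 'Activity'),
        ('Angela', 4, 'Person'), ('Anders', 5, 'Person'), ('Michael', 6, 'Person');
    INSERT INTO Junctions VALUES (1, 2), (2, 4), (1, 3), (3, 5), (3, 6);
""")

# Walk the chain backwards: Person <- Activity <- Request.
person_requests = conn.execute("""
    SELECT p.Entity AS Person, r.Entity AS Request
    FROM Entities p
    JOIN Junctions j2 ON j2."Right" = p.Id       -- Activity -> Person link
    JOIN Junctions j1 ON j1."Right" = j2."Left"  -- Request -> Activity link
    JOIN Entities r ON r.Id = j1."Left"
    WHERE p.Type = 'Person' AND r.Type = 'Request'
    ORDER BY p.Id
""").fetchall()

for person, request in person_requests:
    print(person, "|", request)
```

If a person attends several activities of the same request, this returns one row per activity, so a `SELECT DISTINCT` may be wanted in that case.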

Related

Auto generate columns in Microsoft Access table

How can we auto-generate columns/fields in a Microsoft Access table?
Scenario:
I have a table with the personal details of my employees (EmployDetails).
I want to store their daily attendance in another table.
Rather than using a separate record for every day, I want to use a single record per employee.
E.g. I want to create a table with fields like these:
EmployID, 01Jan2020, 02Jan2020, 03Jan2020, ... 25May2020 and so on.
This means a new column has to be generated automatically every day.
Can anybody help me?
Generally you would define columns manually (whether that is through a UI or SQL).
With the information given I think the proper solution is to have two tables.
You have your "EmployDetails" table, which holds the general info (name, contact information etc.) and the key, which is the employee ID (it can be autogenerated or manual, it just needs to be unique).
You would have a second table with a foreign key to the employee ID in "EmployDetails", a column called Date, and another called details (or whatever you are trying to capture in your date column idea).
Then you simply add a row for each day, and use a join query between the tables to look up all the "days" for an employee. This is called normalisation, and it is how relational databases (such as Access) are designed to be used.
Employee Table:
EmpID | NAME | CONTACT
----------------------
1 | Jim | 222-2222
2 | Jan | 555-5555
Detail table:
DetailID | EmpID (foreign key) | Date | Hours_worked | Notes
-------------------------------------------------------------
10231 | 1 | 01Jan2020| 5 | Lazy Jim took off early
10233 | 2 | 02Jan2020| 8 | Jan is a hard worker
10240 | 1 | 02Jan2020| 7.5 | Finally he stays a full day
To find what Jim worked, you do a join and filter on his EmpID:
SELECT Employee.EmpID, Employee.Name, Details.Date, Details.Hours_worked, Details.Notes
FROM Employee
INNER JOIN Details ON Employee.EmpID = Details.EmpID
WHERE Employee.EmpID = 1;
Of course this will give you a normalised result (which is generally what's wanted so you can iterate over it):
EmpID | NAME | Date | Hours_worked | Notes
-----------------------------------------------
1 | Jim | 01Jan2020 | 5 | ......
1 | Jim | 02Jan2020 | 7.5 | .......
If you want the results denormalised (one column per date), you'll have to look into pivot tables; in Access this is done with a crosstab query.
See more on creating foreign keys
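To illustrate the difference between the normalised rows and the one-column-per-date layout the question asks for, here is a rough sketch in Python using an in-memory SQLite database instead of Access. The table and column names follow the answer; the ISO date format and the `pivot` structure are assumptions made for this example only.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (EmpID INTEGER PRIMARY KEY, Name TEXT, Contact TEXT);
    CREATE TABLE Details (
        DetailID INTEGER PRIMARY KEY,
        EmpID INTEGER REFERENCES Employee(EmpID),
        Date TEXT,            -- ISO dates here, an assumption for easy sorting
        Hours_worked REAL,
        Notes TEXT
    );
    INSERT INTO Employee VALUES (1, 'Jim', '222-2222'), (2, 'Jan', '555-5555');
    INSERT INTO Details VALUES
        (10231, 1, '2020-01-01', 5,   'Lazy Jim took off early'),
        (10233, 2, '2020-01-02', 8,   'Jan is a hard worker'),
        (10240, 1, '2020-01-02', 7.5, 'Finally he stays a full day');
""")

# Normalised result: one row per employee per day.
rows = conn.execute("""
    SELECT e.Name, d.Date, d.Hours_worked
    FROM Employee e
    JOIN Details d ON e.EmpID = d.EmpID
    ORDER BY e.EmpID, d.Date
""").fetchall()

# Pivot in application code: one row per employee, one column per date.
dates = sorted({date for _, date, _ in rows})
hours = defaultdict(dict)
for name, date, worked in rows:
    hours[name][date] = worked

pivot = [[name] + [hours[name].get(date) for date in dates] for name in hours]

print(["Name"] + dates)
for line in pivot:
    print(line)
```

New days only add rows to `Details`; the date columns appear when the data is read, so the table definition never has to change.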

Rebuild tables from joined table

I am facing an issue where a data supplier generates a dump of his multi-tenant databases as a single table. Recreating the original tables is not impossible; the problem is that I receive millions of rows every day. Recreating everything, every day, is out of the question.
Until now, I was using SSIS to do so, with a lookup-intensive approach. In the past year, my virtual machine went from 2 GB of RAM to 128 GB, and it is still growing.
Let me explain the mess:
Imagine a database where users have posts, and posts have comments. In my real scenario, I am talking about 7 distinct tables. Analyzing a few rows, I have the following:
+-----+------+------+--------+------+-----------+------+----------------+
| Id* | T_Id | U_Id | U_Name | P_Id | P_Content | C_Id | C_Content |
+-----+------+------+--------+------+-----------+------+----------------+
| 1 | 1 | 1 | john | 1 | hello | 1 | hello answer 1 |
| 2 | 1 | 2 | maria | 2 | cake | 2 | cake answer 1 |
| 3 | 2 | 1 | pablo | 1 | hello | 1 | hello answer 3 |
| 4 | 2 | 1 | pablo | 2 | hello | 2 | hello answer 2 |
| 5 | 1 | 1 | john | 3 | nosql | 3 | nosql answer 1 |
+-----+------+------+--------+------+-----------+------+----------------+
* Id is from my table
T_Id is the "tenant" Id, which identifies the source database
I have imagined the following possible solution:
I make a query that selects non-existent Ids for each table, such as:
SELECT DISTINCT n.t_id,
                n.c_id,
                n.c_content
FROM mytable n
WHERE n.id > 4
  AND NOT EXISTS (SELECT 1
                  FROM mytable o
                  WHERE o.id <= 4
                    AND n.t_id = o.t_id
                    AND n.c_id = o.c_id)
This way, I am able to select only the new occurrences whenever a new Id of a table is found. Although it works, it may perform badly when working with 100s of millions of rows.
Could anyone share a suggestion? I am quite lost.
Thanks in advance.
EDIT > my question is vague
My final intent is to rebuild the tables from the dump, incrementally, avoiding lookups outside the database. Every now and then I am gonna run a script that will select new tenants, users, posts and comments and add them to their corresponding tables.
My previous solution worked as follows:
Cache the whole database
For each new row, search for the columns inside the cache
If it doesn't exist, then insert it
I know it sounds dumb, but it made sense as a new developer working with ETLs
First, if you have a full flat DB dump, I suggest you work on the file before even importing it into your DB (low-level file processing is pretty cheap and nearly instantaneous).
Following "Removing lines in one file that are present in another file using python", you can remove all the lines that were already parsed in your last run:
# Keep only the lines of new.csv that were not already present in old.csv.
with open('old.csv', 'r') as f:
    old_lines = set(f)  # a set keeps each membership test cheap

with open('new.csv', 'r') as source, open('diff_add.csv', 'w') as destination:
    for line in source:
        if line not in old_lines:
            destination.write(line)
This takes less than five seconds on a 900 MB => 1.2 GB dump. After this step you only work with the lines that actually change something in one of your new tables.
Now you can import this reduced flat file into a working table.
As you'll have to search for the needle in each line, some indexes on the ids may be a good idea (use a composite index with your tenant id as the first column).
For the last part, I don't know exactly what your data looks like: do you also have updates to handle?
The EXCEPT and INTERSECT operators can help you with this kind of problem too.
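As a rough illustration of the EXCEPT idea, the sketch below uses Python with an in-memory SQLite database. The names `flat_dump` and `comments` are hypothetical stand-ins for the working table and one of the rebuilt tables, and the rows are the five sample rows from the question, with rows 1 to 4 assumed to be already loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- flat_dump: hypothetical working table holding the imported dump
    CREATE TABLE flat_dump (id INTEGER PRIMARY KEY, t_id INT, c_id INT, c_content TEXT);
    -- comments: hypothetical rebuilt table, keyed by tenant id + comment id
    CREATE TABLE comments (t_id INT, c_id INT, c_content TEXT, PRIMARY KEY (t_id, c_id));

    INSERT INTO flat_dump VALUES
        (1, 1, 1, 'hello answer 1'), (2, 1, 2, 'cake answer 1'),
        (3, 2, 1, 'hello answer 3'), (4, 2, 2, 'hello answer 2'),
        (5, 1, 3, 'nosql answer 1');

    -- Pretend rows 1-4 were loaded by a previous run.
    INSERT INTO comments
    SELECT t_id, c_id, c_content FROM flat_dump WHERE id <= 4;
""")

# Keys present in the dump but not yet in the rebuilt table.
new_keys = conn.execute("""
    SELECT t_id, c_id FROM flat_dump
    EXCEPT
    SELECT t_id, c_id FROM comments
""").fetchall()

print(new_keys)
```

The composite primary key on `(t_id, c_id)` plays the role of the tenant-first index suggested above; whether EXCEPT beats NOT EXISTS on hundreds of millions of rows depends on the engine and would need to be measured.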

How to make multiple instances of foreign key work in one row

Hello people and fellow SQL programmers.
I have been trying to work out a data model for a real-world scenario from industry.
The client who ordered the database states that:
There are multiple job locations/offices where his employees work. Each workplace/office has a set number of people that can work there, with a minimum and a maximum. For each workplace there is a group of people consisting of at least 2 and at most 4 people. Only one group can be stationed in a workplace at a time. There are also a few rules for the group: there are no leaders among them, everybody is equal. A worker can be assigned to only one group at a time. And there has to be a historical record of who worked where and for how long.
I have been trying to work out the table design and its attributes for quite some time, but it seems to me that everything I have done so far has serious holes and is quite messy. I would very much appreciate any feedback and advice from you guys. Thanks in advance.
If I understand correctly you have two entities: employee and office. These will require two tables:
employee: id, name, whatever_else
office: id, desc, min_employees, max_employees, whatever_else
Theoretically the relationship between these two entities is one-to-many, because you say each employee can only be assigned to a single office at a time, so you could add an office_id foreign key to the employee table.
Having to keep track of the history however means that each employee may have multiple associations with the offices, thus making the relationship many-to-many. You will then need another table to model it:
employeeOffice: employee_id, office_id, start_date, end_date
With this model the queries I imagine you'll need to perform would be quite easy; as an example, finding how many employees are currently assigned to each office would be
select t1.id, t1.desc, count(distinct t2.employee_id)
from office t1
join employeeOffice t2
  on t1.id = t2.office_id
where t2.end_date is null
group by t1.id, t1.desc
Edit
Take this sample data as an example
employee
id | name
1 | name1
2 | name2
3 | name3
4 | name4
5 | name5
6 | name6
7 | name7
office
id | desc
1 | office1
2 | office2
employeeOffice
employee_id | office_id | start_date | end_date
1 | 2 | '01-01-2107' | '31-01-2107'
1 | 1 | '01-02-2107' |
2 | 1 | '01-01-2107' |
3 | 1 | '01-01-2107' |
4 | 1 | '01-01-2107' |
5 | 2 | '01-01-2107' | '01-03-2107'
6 | 2 | '01-01-2107' |
7 | 2 | '01-01-2107' |
This would mean that employee 1 spent one month in office 2 and then was assigned to office 1. Employee 5 after two months resigned (or was fired), because there's no record for him with empty end_date.
The example query above would give you
id | desc | count
1 | office1 | 4
2 | office2 | 2
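The sample data and the counting query can be tried out with a small sketch like the one below (Python with an in-memory SQLite database). The column is called `description` here instead of `desc`, because DESC is a reserved word in SQLite; that rename, and the second query that checks the 2 to 4 group size rule, are assumptions of this example rather than part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE office (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE employeeOffice (
        employee_id INTEGER REFERENCES employee(id),
        office_id   INTEGER REFERENCES office(id),
        start_date  TEXT,
        end_date    TEXT   -- NULL means the assignment is still active
    );
    INSERT INTO employee VALUES
        (1, 'name1'), (2, 'name2'), (3, 'name3'), (4, 'name4'),
        (5, 'name5'), (6, 'name6'), (7, 'name7');
    INSERT INTO office VALUES (1, 'office1'), (2, 'office2');
    INSERT INTO employeeOffice VALUES
        (1, 2, '01-01-2107', '31-01-2107'),
        (1, 1, '01-02-2107', NULL),
        (2, 1, '01-01-2107', NULL),
        (3, 1, '01-01-2107', NULL),
        (4, 1, '01-01-2107', NULL),
        (5, 2, '01-01-2107', '01-03-2107'),
        (6, 2, '01-01-2107', NULL),
        (7, 2, '01-01-2107', NULL);
""")

# Employees currently assigned to each office.
current_counts = conn.execute("""
    SELECT o.id, o.description, COUNT(DISTINCT eo.employee_id)
    FROM office o
    JOIN employeeOffice eo ON o.id = eo.office_id
    WHERE eo.end_date IS NULL
    GROUP BY o.id, o.description
    ORDER BY o.id
""").fetchall()

# Offices whose current group breaks the "2 to 4 people" rule.
rule_violations = conn.execute("""
    SELECT office_id, COUNT(DISTINCT employee_id)
    FROM employeeOffice
    WHERE end_date IS NULL
    GROUP BY office_id
    HAVING COUNT(DISTINCT employee_id) < 2 OR COUNT(DISTINCT employee_id) > 4
""").fetchall()

print(current_counts)
print(rule_violations)
```

Note that the second query cannot report an office with no active assignment at all, since such an office has no rows to group.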

Inserting data into many-to-many relationship table

I'm trying to build a database with multiple tables for a study/research project. This is the first time I'm designing a database of this magnitude; the database grows by 100-200 records a day, and so far I have data going back to 2010. Out of all the data, the Generic Sequence Number, Product Name and Strength of a drug (prescription) are bothering me slightly. This is what I have done so far:
The Generic Seq number is unique to the strength of a drug (product name). So, I have a table that contains id, generic seq no, and strength. Another table holds prod_id and product name. Each Generic seq number may have one or more product names, and each product name may have different generic seq numbers based on the strength. So, I set it up as a many-to-many relationship. I created another table for this relationship that contains rx_id, drug_id, and prod_id. Since many patients may be prescribed the same drug, the drug_id and prod_id may repeat several times in the rx_table.
My first question is, is this design appropriate?
How should I insert the data into rx_table? Should I create a new record every time for new data even if the drug_id and prod_id already exist in the rx_table, or should I look for the rx_id where the drug_id and prod_id combination exists and insert that rx_id into the other main table (not shown), which contains other data?
Or is this question too vague?
Thank you for your help.
I don't know exactly what your Generic Sequence Number is, so I'll just use a real-life drug example. From your description I think it's pretty similar to your application. Let's say you have Paracetamol as an agent. Then your Generic Sequence Number table would be something like
drug_id | generic_seq_no | strength
--------+--------------------+----------
1 | Paracetamol-100 | 100
2 | Paracetamol-250 | 250
3 | Paracetamol-500 | 500
Your product table would contain the names of the trademarks:
prod_id | prod_name
----------+------------
1 | Tylenol
2 | Captin
3 | Panadol
the rx_table contains the combinations of trademark name, agent and strength:
rx_id | drug_id | prod_id
-------+----------+----------
1 | 1 | 1
2 | 1 | 2
3 | 1 | 3
4 | 2 | 1
5 | 2 | 2
6 | 3 | 2
7 | 3 | 3
So, for example, the first row would be Tylenol containing 100 mg of Paracetamol. Now you have everything that can be prescribed by a doctor, and that's what you have already done so far. So your approach is fine.
Now you need (or have?) another table with all your patients
patient_id | firstname | lastname
-----------+-----------+-----------
1 | John | Doe
2 | Jane | Doe
In the end, you must link your trademark/agent/strength combination to the patients. Since one patient may get different drugs and multiple patients may get the same drug you need another many-to-many-relation, let's call it prescription
prescription_id | patient_id | rx_id
----------------+------------+------
1 | 1 | 1
2 | 1 | 3
3 | 2 | 4
This means John Doe will get Tylenol and Panadol containing 100 mg Paracetamol each. Jane Doe will receive Tylenol with 250 mg Paracetamol. I think the table you will be inserting into the most in this model is the prescription table.
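A minimal sketch of this model, in Python with an in-memory SQLite database, is shown below. The table names `drug`, `product` and `patient` are assumptions (the answer only names `rx_table` and `prescription`), and `get_rx_id` is a hypothetical helper that shows one way to handle the insert question: reuse the existing rx_id for a known drug/product combination and only insert a new row for a new combination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE drug (drug_id INTEGER PRIMARY KEY, generic_seq_no TEXT, strength INTEGER);
    CREATE TABLE product (prod_id INTEGER PRIMARY KEY, prod_name TEXT);
    CREATE TABLE rx_table (
        rx_id   INTEGER PRIMARY KEY,
        drug_id INTEGER REFERENCES drug(drug_id),
        prod_id INTEGER REFERENCES product(prod_id),
        UNIQUE (drug_id, prod_id)   -- each combination is stored once
    );
    CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE prescription (
        prescription_id INTEGER PRIMARY KEY,
        patient_id INTEGER REFERENCES patient(patient_id),
        rx_id      INTEGER REFERENCES rx_table(rx_id)
    );
    INSERT INTO drug VALUES
        (1, 'Paracetamol-100', 100), (2, 'Paracetamol-250', 250), (3, 'Paracetamol-500', 500);
    INSERT INTO product VALUES (1, 'Tylenol'), (2, 'Captin'), (3, 'Panadol');
    INSERT INTO rx_table VALUES
        (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 1), (5, 2, 2), (6, 3, 2), (7, 3, 3);
    INSERT INTO patient VALUES (1, 'John', 'Doe'), (2, 'Jane', 'Doe');
    INSERT INTO prescription VALUES (1, 1, 1), (2, 1, 3), (3, 2, 4);
""")


def get_rx_id(conn, drug_id, prod_id):
    """Hypothetical helper: return the rx_id for a combination, creating it if needed."""
    row = conn.execute(
        "SELECT rx_id FROM rx_table WHERE drug_id = ? AND prod_id = ?",
        (drug_id, prod_id),
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO rx_table (drug_id, prod_id) VALUES (?, ?)", (drug_id, prod_id)
    )
    return cur.lastrowid


# Resolve every prescription to patient, trademark and strength.
prescribed = conn.execute("""
    SELECT pa.firstname, pa.lastname, pr.prod_name, d.strength
    FROM prescription p
    JOIN patient  pa ON pa.patient_id = p.patient_id
    JOIN rx_table rx ON rx.rx_id = p.rx_id
    JOIN drug     d  ON d.drug_id = rx.drug_id
    JOIN product  pr ON pr.prod_id = rx.prod_id
    ORDER BY p.prescription_id
""").fetchall()

for row in prescribed:
    print(row)
```

With this layout a repeated prescription of the same drug adds a row to `prescription` only, while `rx_table` stays a list of distinct combinations.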

How to join mysql tables

I've an old table like this:
user> id | name | address | comments
Now I have to create an "alias" table so that some users can have an alias name. I've created a new table 'user_alias' like this:
user_alias> name | user
But now I have a problem due to my poor SQL level... How do I join both tables to generate something like this:
1 | my_name | my_address | my_comments
1 | my_alias | my_address | my_comments
2 | other_name | other_address | other_comments
I mean, I want to make a "SELECT..." query that returns in the same format as the "user" table ALL users and ALL alias.. Something like this:
SELECT user.* FROM user LEFT JOIN user_alias ON `user`=`id`
but it doesn't work for me..
I think you need something like this:
SELECT user.*
FROM user
LEFT JOIN user_alias
ON user.name=user_alias.name
Your original query was not specific enough in the join condition.
Something like
SELECT user.name, user.address, user.comments
FROM user
UNION ALL
SELECT user_alias.alias, user.address, user.comments
FROM user
INNER JOIN user_alias ON user.name = user_alias.name
ORDER BY name
will get you close to what you want.
You need to UNION two SELECTs together because the LEFT JOIN solution proposed by others will include only one row in the result set for users with aliases, not two as specified in your question.
But you should make the common column joining user and alias the id column, not the name column.
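The UNION ALL approach joined on the id can be sketched as follows, in Python with an in-memory SQLite database rather than MySQL. It assumes the schema from the question, with `user_alias.name` holding the alias and `user_alias.user` holding the id of the user it belongs to (the join the asker attempted suggests this, but it is an assumption); the sample rows are made up to match the desired output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, address TEXT, comments TEXT);
    -- Assumption: name is the alias, user is the id of the owning user.
    CREATE TABLE user_alias (name TEXT, user INTEGER REFERENCES user(id));

    INSERT INTO user VALUES
        (1, 'my_name', 'my_address', 'my_comments'),
        (2, 'other_name', 'other_address', 'other_comments');
    INSERT INTO user_alias VALUES ('my_alias', 1);
""")

# One row per real name, plus one row per alias, in the layout of the user table.
users_and_aliases = conn.execute("""
    SELECT id, name, address, comments
    FROM user
    UNION ALL
    SELECT u.id, a.name, u.address, u.comments
    FROM user u
    JOIN user_alias a ON a.user = u.id
    ORDER BY id, name
""").fetchall()

for row in users_and_aliases:
    print(row)
```

Users without an alias appear once, and users with aliases appear once per alias in addition to their own row.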
SELECT user.* FROM user LEFT JOIN user_alias ON user.name = user_alias.name
First of all, the query you want to build is not trivial, because you are trying to get results that span more than one row. So I will offer you a proper solution, done the way a database developer would do it :-).
First, you should modify your user_alias table so that it contains the id column instead of the name. It is not a good idea to join your tables using the name field, because there could be two Sarah Connors.
Then you can get results from both tables using this query:
SELECT user.*, user_alias.*
FROM user LEFT JOIN user_alias
ON user.id=user_alias.id
This way you will get your results in such format:
id | name | address | comments | user
-------------------------------------------------------------
1 | Sarah Connor | Planet Earth | Nice woman | sarah_connor
2 | Sarah Connor | USA, NY | Mean woman | sarah_c
3 | John Connor | USA, NY | n00b | john123
When there are two or more records in the user_alias table for the same person (equal ids), you will get something like this:
id | name | address | comments | user
-------------------------------------------------------------
4 | Bill Clinton | White House | President | bill
4 | Bill Clinton | White House | President | monica