Am I adding unnecessary data to a junction table?

I have a junction table, say for People and Locations:
PersonLocations
PersonId | LocationId
---------------------
1        | 3
2        | 5
Now, Locations can belong to each other i.e. one location can sit inside another, which can sit inside another etc., so I have this defined by the Location table referencing itself:
Locations
LocationId | ArbitraryName | ParentLocationId
--------------------------------------------
1          | a country     | null
2          | region a      | 1
3          | village 1     | 2
4          | village 2     | 2
5          | region b      | 1
So you can see that village 1 and village 2 belong to region a, which in turn belongs to a country.
Now, I want it to be known that if person 1 visited Location 3 (village 1), as shown in the first table, they also visited Locations 2 and 1, which can be inferred from the self-referencing Locations table.
But what I've done is write rules (triggers) so that when a row is inserted into the PersonLocations table, a row for the ParentLocationId is inserted automatically (this works recursively until ParentLocationId is null)
so inserting
PersonId | LocationId
---------------------
1        | 3
actually results in
PersonId | LocationId
---------------------
1        | 3
1        | 2
1        | 1
And vice versa if I remove a row.
What I really want to know is: is this safe? It makes my queries and views much easier, but am I missing something that will later come back to bite me? I feel that as long as those triggers are in place it will be fine. It takes up more space, but does the payoff justify it? At the same time, I also feel like I might be violating some principle I don't know about...
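For comparison, the inferred visits can also be derived at query time instead of being stored by triggers. The following is only a minimal sketch of that alternative, assuming SQLite through Python's sqlite3 module (the question does not name a database) and a hypothetical helper inferred_locations; only the directly visited rows are kept in PersonLocations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Locations (
    LocationId INTEGER PRIMARY KEY,
    ArbitraryName TEXT,
    ParentLocationId INTEGER REFERENCES Locations(LocationId)
);
CREATE TABLE PersonLocations (
    PersonId INTEGER,
    LocationId INTEGER REFERENCES Locations(LocationId),
    PRIMARY KEY (PersonId, LocationId)
);
INSERT INTO Locations VALUES
    (1, 'a country', NULL), (2, 'region a', 1), (3, 'village 1', 2),
    (4, 'village 2', 2), (5, 'region b', 1);
-- Only the directly visited locations are stored; no trigger-generated rows.
INSERT INTO PersonLocations VALUES (1, 3), (2, 5);
""")

def inferred_locations(person_id):
    """Stored visits of one person plus every ancestor of those locations."""
    rows = conn.execute(
        """
        WITH RECURSIVE visited(LocationId) AS (
            SELECT LocationId FROM PersonLocations WHERE PersonId = ?
            UNION
            -- climb one level per step until ParentLocationId is NULL
            SELECT l.ParentLocationId
            FROM visited v
            JOIN Locations l ON l.LocationId = v.LocationId
            WHERE l.ParentLocationId IS NOT NULL
        )
        SELECT LocationId FROM visited ORDER BY LocationId
        """,
        (person_id,),
    )
    return [r[0] for r in rows]

print(inferred_locations(1))
print(inferred_locations(2))
```

The trade-off in this sketch is the usual one: the stored version reads faster, while the derived version cannot drift out of sync if a location is later moved to a different parent.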

Related

Combining related organisation records in SQL where there is a Parent-Child relationship between organisations

I am trying to build a table of data for use in Yellowfin BI reporting. One limitation of this is that no temporary tables can be created and then dropped in the database. I am pulling the data from an existing database, which I have no control over. I can only use SQL to query the existing data.
There are two tables in the source database I need to work with. I've simplified them for clarity. The first contains organisations. It has an ORG_ID column, which contains a unique ID for each organisation, and a PARENT_ORG_ID column indicating which organisation is the parent company of others in the list:
ORG_ID | PARENT_ORG_ID
1      | Null
2      | 1
3      | 5
4      | 5
5      | Null
6      | 1
Using the table above I can see that there are the following relationships between organisations:
ORG_ID | RELATED_ORGANISATIONS
1      | 2 and 6
2      | 1 and 6
3      | 5 and 4
4      | 5 and 3
5      | 4 and 3
6      | 1 and 2
I'm not sure of the best way to represent these connections in a query, as I need to use these relationships with a second table.
The second table I have is a list of organisations and money owed:
ORG_ID | MONEY_OWED
1      | 5
2      | 10
3      | 0
4      | 15
5      | 20
6      | 5
What I need to achieve is a table that I can search for any single ORG_ID and see the combined data for that organisation and all related organisations. In the case of my example, this could be a results table something like this:
ORG_ID | MONEY_OWED_BY_ALL_RELATED_ORGS
1      | 20
2      | 20
3      | 35
4      | 35
5      | 35
6      | 20
I'm thinking I should use a CTE to handle the relationships between organisations, but I can't get my head around it.
Any help would be greatly appreciated!

For your particular example, you can use:
select o.*,
       sum(mo.money_owed) over (partition by coalesce(o.parent_org_id, o.org_id)) as parent_owed
from organizations o left join
     money_owed mo
     on mo.org_id = o.org_id;
This works because your organizations are only one level deep -- which is consistent with your sample data.
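To check the query against the sample data from the question, here is a small sketch that loads both tables into an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption made for illustration (window functions need SQLite 3.25 or later); the table names follow the answer's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE organizations (org_id INTEGER PRIMARY KEY, parent_org_id INTEGER);
CREATE TABLE money_owed (org_id INTEGER, money_owed INTEGER);
INSERT INTO organizations VALUES (1, NULL), (2, 1), (3, 5), (4, 5), (5, NULL), (6, 1);
INSERT INTO money_owed VALUES (1, 5), (2, 10), (3, 0), (4, 15), (5, 20), (6, 5);
""")

rows = conn.execute("""
SELECT o.org_id,
       -- a parent groups under its own id, a child under its parent's id
       SUM(mo.money_owed) OVER (
           PARTITION BY COALESCE(o.parent_org_id, o.org_id)
       ) AS parent_owed
FROM organizations o
LEFT JOIN money_owed mo ON mo.org_id = o.org_id
ORDER BY o.org_id
""").fetchall()

totals = dict(rows)  # org_id -> money owed by the whole family
print(totals)
```

With deeper hierarchies the COALESCE key would no longer identify the top-level parent, so a recursive CTE would be needed to find the root first.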

Problems figuring out a relational division query

I've run into a problem that is too complex for me to grasp. I have 4 tables: a user table that contains a user id, a report table that contains the user id and the item id, an item table that contains the id and a coordinate (lon, lat), and a location table that contains a bunch of coordinates (lon, lat).
I want to know which users reported an item on ALL of the locations above a certain latitude in the year X.
I know that it will involve a relational division of some sort, but I can't write the query for it.
What I'm thinking is this: from all the locations above a certain latitude, take away those on which a specific user reported an item. If that returns empty, the user has reported an item on ALL the locations above that latitude. Now, can you give me some help with the query part?
EXAMPLE:
USER_TABLE
user_email
tony#gmail.com
john#gmail.com
geo#gmail.com
REPORT_TABLE
user_email | item_id
tony#gmail.com | 1
geo#gmail.com | 2
geo#gmail.com | 3
ITEM TABLE
item_id | lat
1 | 10
2 | 15
3 | 20
LOCATIONS
lat
5
10
15
20
In this case, if I want ALL the users that have reported an item on a location with a lat value above 12, it should return geo#gmail.com because he has reported an item on lat = 15 and lat = 20 which are the lat values for all the locations that are above a latitude value of 12.
If geo had reported only 1 item, the result should be empty.
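The "take away and check for empty" idea described above is commonly written as a double NOT EXISTS. The sketch below is one possible form, assuming SQLite through Python's sqlite3 module and a hypothetical helper users_covering. The year condition is left out because the example data has no date column, and locations are matched on lat only, as in the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_table (user_email TEXT PRIMARY KEY);
CREATE TABLE report (user_email TEXT, item_id INTEGER);
CREATE TABLE item (item_id INTEGER PRIMARY KEY, lat INTEGER);
CREATE TABLE locations (lat INTEGER);
INSERT INTO user_table VALUES ('tony#gmail.com'), ('john#gmail.com'), ('geo#gmail.com');
INSERT INTO report VALUES ('tony#gmail.com', 1), ('geo#gmail.com', 2), ('geo#gmail.com', 3);
INSERT INTO item VALUES (1, 10), (2, 15), (3, 20);
INSERT INTO locations VALUES (5), (10), (15), (20);
""")

def users_covering(min_lat):
    """Users for whom no location above min_lat lacks a reported item."""
    rows = conn.execute(
        """
        SELECT u.user_email
        FROM user_table u
        WHERE NOT EXISTS (
            -- a location above the threshold ...
            SELECT 1 FROM locations l
            WHERE l.lat > :min_lat
              AND NOT EXISTS (
                  -- ... on which this user reported nothing
                  SELECT 1
                  FROM report r
                  JOIN item i ON i.item_id = r.item_id
                  WHERE r.user_email = u.user_email
                    AND i.lat = l.lat
              )
        )
        ORDER BY u.user_email
        """,
        {"min_lat": min_lat},
    )
    return [r[0] for r in rows]

print(users_covering(12))
```

Note that if no location lies above the threshold, this form returns every user, since there is nothing left to miss.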

How to show only ids that have all selected set of values specific that id in SQL

There are tables:
PRODUCT
|id
|name
|city
PART
|id
|name
|color
|weight
|city
SUPPLIER
|id
|name
|rating
|city
SUPPLIES
|sup_id
|part_id
|prod_id
|amount
Exercise: show the id and name of the products for which supplier 1 supplies parts of every part type among all the parts supplied by that supplier. (The exercise was translated from another language.)
These are the part ids supplied by supplier 1:
SELECT DISTINCT part_id FROM supplies JOIN parts ON part_id=parts.id WHERE sup_id=1;
Let's say they are 1, 2, 4, 6.
Supplier 1 also supplies these products:
SELECT DISTINCT products.name FROM supplies JOIN products ON
prod_id=products.id WHERE sup_id=1;
Suppose they are 1, 3, 9.
Then I need to show only those products (with id in (1,3,9)) for which all supplied part types are used (in other words, all of the parts with id in (1,2,4,6)).
So if only some parts (for example, the parts with id 2 and 4) were used for the product with id=1, then it must not be shown. That is, of course, if I understand the exercise correctly.
Strictly speaking, that is the part I don't know how to write.
EDIT:
To give some details there are some examples:
Supplier | Part | Product | Amount
1        | 1    | 1       | 200
1        | 1    | 3       | 300
1        | 1    | 9       | 400
1        | 2    | 1       | 500
1        | 2    | 3       | 100
1        | 2    | 9       | 700
1        | 4    | 1       | 200
1        | 4    | 3       | 800
1        | 6    | 1       | 100
As in the explanation above, only one product id must be shown: 1, because it uses all the supplied part types (1,2,4,6).
If we add one more record like this:
1        | 6    | 3       | 100
then the output changes to 1 and 3, because both use parts 1, 2, 4 and 6.
If we add another record:
1        | 5    | 9       | 1100
then the output is empty, because no product uses the full set of parts (1,2,4,5,6).
I hope this clarifies things.
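One common way to express this kind of "uses all of them" condition is to compare counts of distinct parts. The sketch below replays the example rows above, assuming SQLite through Python's sqlite3 module and a hypothetical helper products_using_all_parts. It returns product ids only, since the example gives no product names; a join to the product table would add them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE supplies (sup_id INTEGER, part_id INTEGER, prod_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO supplies VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 200), (1, 1, 3, 300), (1, 1, 9, 400),
        (1, 2, 1, 500), (1, 2, 3, 100), (1, 2, 9, 700),
        (1, 4, 1, 200), (1, 4, 3, 800), (1, 6, 1, 100),
    ],
)

def products_using_all_parts(sup_id):
    """Product ids that use every distinct part the given supplier supplies."""
    rows = conn.execute(
        """
        SELECT prod_id
        FROM supplies
        WHERE sup_id = :sup
        GROUP BY prod_id
        -- keep a product only if its part count equals the supplier's total
        HAVING COUNT(DISTINCT part_id) =
               (SELECT COUNT(DISTINCT part_id) FROM supplies WHERE sup_id = :sup)
        ORDER BY prod_id
        """,
        {"sup": sup_id},
    )
    return [r[0] for r in rows]

first = products_using_all_parts(1)                          # initial rows
conn.execute("INSERT INTO supplies VALUES (1, 6, 3, 100)")   # first extra record
second = products_using_all_parts(1)
conn.execute("INSERT INTO supplies VALUES (1, 5, 9, 1100)")  # second extra record
third = products_using_all_parts(1)
print(first, second, third)
```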

Creating new Equipment combo packs. DB Design or SQL (Alter Table, Order By, Select)

I'm currently working on a basic database for Orders from customers. My issue is fields in one table (call it EquipmentPerOption) correspond to records in another table (Equipment). In theory adding a record to Equipment should add a new Column to EquipmentPerOption with the name of the new record.
For example:
**Equipment Table**
Equipment | Price
Hose      | $1.00
Shovel    | $2.00
Hoe       | $3.00
**Equipment per Option Table**
OrderOption | Hose | Shovel | Hoe
1           | 1    | 0      | 2
2           | 3    | 2      | 1
3           | 0    | 1      | 3
4           | 1    | 1      | 1
So basically I now have a button on a menu which takes me to an Add New Record screen for Equipment. How do I make it so that when I've finished adding the new record for Equipment it appears on the EquipmentPerOption table as a new Column? Ideally this:
OrderOption | Hose | Shovel | Hoe | (New Equipment)
1           | 1    | 0      | 2   | 0
2           | 3    | 2      | 1   | 0
3           | 0    | 1      | 3   | 0
4           | 1    | 1      | 1   | 0
I've been messing around with SQL and have come up with this as an SQL code for a query that will run after clicking a "Check" button at the bottom of the Equipment Form. (Obviously it will save the record before running the query)
ALTER TABLE EquipmentPerOption
Add
SELECT TOP (1) *
FROM
( SELECT TOP (1) *
FROM Equipment
ORDER BY created_date DESC) Short Interger
So my question is why is this code wrong? And how do I fix it to achieve the desired outcome? Or have I set up the database wrong and should just start again with a different structure for the tables?

If you find yourself having to add columns in order to handle simple transactions, it is a sign the db design is missing something.
Given your Equipment table:
Equipment Table
Id | Equipment | Price
1  | Hose      | $1.00
2  | Shovel    | $2.00
3  | Hoe       | $3.00
A bundle of these would have a name and probably a price:
EquipmentBundle
Id | Name         | Price
1  | Hoe And Hose | ...
2  | H-S-H Pkg    | ...
Then a table to associate certain equipment records with a given bundle:
EquipmentBundleItems
BundleId | EquipmentId | Quantity
1        | 1           | 2         ; Bundle 1 has 2 hoses
1        | 3           | 1         ; and 1 hoe
2        | 1           | 3         ; Bundle 2 has 3 hoses
2        | 2           | 2         ; and 2 shovels
2        | 3           | 1         ; and 1 hoe
Presumably all current equipment is already in the Equipment table, so creating a new bundle is just a matter of creating a new EquipmentBundle record and associating the related items with it via the EquipmentBundleItems table.
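As a rough illustration of this three-table layout, here is a minimal sketch assuming SQLite through Python's sqlite3 module. Several names are assumptions made for the example: the first column of EquipmentBundleItems is called BundleId, prices are stored as integer cents in a PriceCents column, the bundle price is omitted because the answer leaves it elided, and 'Rake', 'Rake Only' and the helper bundle_contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Equipment (
    Id INTEGER PRIMARY KEY,
    Equipment TEXT NOT NULL,
    PriceCents INTEGER NOT NULL  -- assumption: prices kept as integer cents
);
CREATE TABLE EquipmentBundle (
    Id INTEGER PRIMARY KEY,
    Name TEXT NOT NULL           -- bundle price is elided in the answer, so omitted
);
CREATE TABLE EquipmentBundleItems (
    BundleId INTEGER NOT NULL REFERENCES EquipmentBundle(Id),
    EquipmentId INTEGER NOT NULL REFERENCES Equipment(Id),
    Quantity INTEGER NOT NULL,
    PRIMARY KEY (BundleId, EquipmentId)
);
INSERT INTO Equipment VALUES (1, 'Hose', 100), (2, 'Shovel', 200), (3, 'Hoe', 300);
INSERT INTO EquipmentBundle VALUES (1, 'Hoe And Hose'), (2, 'H-S-H Pkg');
INSERT INTO EquipmentBundleItems VALUES
    (1, 1, 2), (1, 3, 1), (2, 1, 3), (2, 2, 2), (2, 3, 1);
""")

def bundle_contents(bundle_id):
    """(equipment name, quantity) pairs for one bundle, in equipment-id order."""
    rows = conn.execute(
        """
        SELECT e.Equipment, bi.Quantity
        FROM EquipmentBundleItems bi
        JOIN Equipment e ON e.Id = bi.EquipmentId
        WHERE bi.BundleId = ?
        ORDER BY e.Id
        """,
        (bundle_id,),
    )
    return rows.fetchall()

# New equipment is a new row, not a new column: no ALTER TABLE involved.
conn.execute("INSERT INTO Equipment VALUES (4, 'Rake', 400)")       # hypothetical item
conn.execute("INSERT INTO EquipmentBundle VALUES (3, 'Rake Only')")  # hypothetical bundle
conn.execute("INSERT INTO EquipmentBundleItems VALUES (3, 4, 1)")

print(bundle_contents(1))
print(bundle_contents(3))
```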

Data design & efficiency: get members from all child groups (nested)

I have 2 tables which define members and their groups like so:
[Table member]
memberID | groupID
1 | 101
2 | 104
... | ...
[Table group]
groupID | parentID
1 | NULL
101 | 1
102 | 1
103 | 101
104 | 103
... | ...
On my page, if I want to get all members of a particular group (including its child groups), e.g. all members of group 1, I have to:
step 1: get the groupIDs for group 1, its children, its children's children, and so on
step 2: SELECT * FROM member WHERE groupID in (....."groupIDs grabbed in step 1"....)
In my database there are about 3M members and 50K groups. If the requested group is a top-level node, it has thousands of child groups, and the query may become very slow.
Is there a better way to redesign the table structures for the member-group relation and make my query run faster?
Alternative solutions such as caching are also welcome.
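Steps 1 and 2 above can be folded into a single statement with a recursive CTE, so the list of group ids never has to travel through the application. This is only a sketch, assuming SQLite through Python's sqlite3 module rather than the asker's actual database, and a hypothetical helper members_in_subtree. Whether it is fast enough at 3M members would depend on indexes such as the one on parentID shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "group" (groupID INTEGER PRIMARY KEY, parentID INTEGER);
CREATE TABLE member (memberID INTEGER PRIMARY KEY, groupID INTEGER);
CREATE INDEX idx_group_parent ON "group"(parentID);   -- speeds up the descent
CREATE INDEX idx_member_group ON member(groupID);     -- speeds up the final join
INSERT INTO "group" VALUES (1, NULL), (101, 1), (102, 1), (103, 101), (104, 103);
INSERT INTO member VALUES (1, 101), (2, 104);
""")

def members_in_subtree(group_id):
    """memberIDs belonging to the group or to any of its descendant groups."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(groupID) AS (
            SELECT groupID FROM "group" WHERE groupID = ?
            UNION
            -- step 1: add the children of every group found so far
            SELECT g.groupID
            FROM "group" g
            JOIN subtree s ON g.parentID = s.groupID
        )
        -- step 2: members of any group in the subtree
        SELECT m.memberID
        FROM member m
        JOIN subtree s ON s.groupID = m.groupID
        ORDER BY m.memberID
        """,
        (group_id,),
    )
    return [r[0] for r in rows]

print(members_in_subtree(1))
print(members_in_subtree(103))
```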

You can store hierarchical data in SQL Server the way you have done, or by using SQL Server's hierarchyid data type.
But SQL Server isn't really a tool for handling hierarchical data, and even the simplest task becomes difficult.
In your case the best option would be to have two separate tables for Users and Groups, e.g.
Users
UserID | UserName
1      | User1
2      | User2
3      | User3
Groups
GroupID | GroupName
1       | Group1
2       | Group2
3       | Group3
And then finally a third table to store users and their group memberships, where the UserID column references the Users table and the GroupID column references the Groups table.
UserGroup
UserID | GroupID
1      | 1
1      | 2
2      | 1
2      | 2
2      | 3
3      | 3
So your tables would end up having the following relations
User   --> One to Many --> UserGroup
Groups --> One to Many --> UserGroup
                              /-------> User
UserGroup --> Many to Many --/
                             \
                              \-------> Groups
