I need some guidance on how to set up my data model. I'm creating a database in Postgres for a personal app, and I have tables for users, transactions, and groups.
users has columns (id, email); transactions has columns (id, amount, user_id, ...).
The gist is that individual users have their own transactions. The tricky part for me is setting up the groups portion.
A group consists of two or more users. When users form a group, I want every member to be able to get the TOTAL AMOUNT of the other members' transactions.
I thought modeling the groups table as a typical "friends" data model, with user_id_one and user_id_two plus the proper uniqueness constraints, would help, but I'm stuck.
So what I need help with is:
What should the groups table look like? (Or is there another approach I can take, such as adding another table? What kind of relation?)
What should the query look like to get both users' transactions? (I could just add up the amounts on the server if it can't be done in the query.)
Any help will be greatly appreciated!
As I understand it, you should create some kind of linking table.
Example:
| [Users]    |           | [UserGroups] |           | [Groups]   |
| user_id    |           | id           |           | group_id   |
| email      | 1       ∞ | user_id      | ∞       1 | other_data |
| other_data | --------> | group_id     | <-------- |            |
Then, for example, you can write a query to fetch all transactions for a given group:
SELECT * FROM Transactions t JOIN UserGroups ug ON t.user_id = ug.user_id WHERE ug.group_id= ?;
I don't know exactly which queries you need, so this is only an example.
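Since the question asks for the total amount rather than the individual rows, here is a minimal sketch of the linking-table idea with an aggregate on top. It uses Python's built-in sqlite3 as a stand-in for Postgres, and the table names (app_groups, user_groups), the sample rows, and the group_total helper are all made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for Postgres; schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL);
CREATE TABLE app_groups (id INTEGER PRIMARY KEY, name TEXT);
-- Linking table: one row per (user, group) membership.
CREATE TABLE user_groups (
    user_id  INTEGER NOT NULL REFERENCES users(id),
    group_id INTEGER NOT NULL REFERENCES app_groups(id),
    PRIMARY KEY (user_id, group_id)
);
CREATE TABLE transactions (
    id      INTEGER PRIMARY KEY,
    amount  INTEGER NOT NULL,  -- stored in cents for this sketch
    user_id INTEGER NOT NULL REFERENCES users(id)
);
INSERT INTO users VALUES (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
INSERT INTO app_groups VALUES (10, 'household');
INSERT INTO user_groups VALUES (1, 10), (2, 10);
INSERT INTO transactions VALUES (1, 500, 1), (2, 250, 2), (3, 125, 2), (4, 999, 3);
""")


def group_total(conn, group_id):
    """Sum the transaction amounts of every member of one group."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(t.amount), 0)
        FROM transactions t
        JOIN user_groups ug ON ug.user_id = t.user_id
        WHERE ug.group_id = ?
        """,
        (group_id,),
    ).fetchone()
    return row[0]


print(group_total(conn, 10))
```

User 3 is not a member of group 10, so that user's transaction is left out of the sum. The composite primary key on user_groups is what prevents the same user from being added to a group twice.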
What is the difference between a self join and an inner join?
I find it helpful to think of all of the tables in a SELECT statement as representing their own data sets.
Before you've applied any conditions you can think of each data set as being complete (the entire table, for instance).
A join is just one of several ways to begin refining those data sets to find the information that you really want.
Though a database schema may be designed with certain relationships in mind (Primary Key <-> Foreign Key) these relationships really only exist in the context of a particular query. The query writer can relate whatever they want to whatever they want. I'll give an example of this later...
An INNER JOIN relates two tables to each other. There are often multiple JOIN operations in one query to chain together multiple tables. It can get as complicated as it needs to. For a simple example, consider the following three tables...
STUDENT
| STUDENTID | LASTNAME | FIRSTNAME |
------------------------------------
1 | Smith | John
2 | Patel | Sanjay
3 | Lee | Kevin
4 | Jackson | Steven
ENROLLMENT
| ENROLLMENT ID | STUDENTID | CLASSID |
---------------------------------------
1 | 2 | 3
2 | 3 | 1
3 | 4 | 2
CLASS
| CLASSID | COURSE | PROFESSOR |
--------------------------------
1 | CS 101 | Smith
2 | CS 201 | Ghandi
3 | CS 301 | McDavid
4 | CS 401 | Martinez
The STUDENT table and the CLASS table were designed to relate to each other through the ENROLLMENT table. This kind of table is called a Junction Table.
To write a query to display all students and the classes in which they are enrolled one would use two inner joins...
SELECT stud.LASTNAME, stud.FIRSTNAME, class.COURSE, class.PROFESSOR
FROM STUDENT stud
INNER JOIN ENROLLMENT enr
ON stud.STUDENTID = enr.STUDENTID
INNER JOIN CLASS class
ON class.CLASSID = enr.CLASSID;
Read the above closely and you should see what is happening. What you will get in return is the following data set...
| LASTNAME | FIRSTNAME | COURSE | PROFESSOR |
---------------------------------------------
Patel | Sanjay | CS 301 | McDavid
Lee | Kevin | CS 101 | Smith
Jackson | Steven | CS 201 | Ghandi
Using the JOIN clauses we've limited the data sets of all three tables to only those that match each other. The "matches" are defined using the ON clauses. Note that if you ran this query you would not see the CLASSID 4 row from the CLASS table or the STUDENTID 1 row from the STUDENT table because those IDs don't exist in the matches (in this case the ENROLLMENT table). Look into "LEFT"/"RIGHT"/"FULL OUTER" JOINs for more reading on how to make that work a little differently.
Please note, per my comments on "relationships" earlier, there is no reason why you couldn't run a query relating the STUDENT table and the CLASS table directly on the LASTNAME and PROFESSOR columns. Those two columns match in data type and, well look at that! They even have a value in common! This would probably be a weird data set to get in return. My point is it can be done and you never know what needs you might have in the future for interesting connections in your data. Understand the design of the database but don't think of "relationships" as being rules that can't be ignored.
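To make the last two points concrete, here is a small sketch that loads the three tables above into an in-memory SQLite database (via Python's sqlite3, which is my choice for the demo and not something the tables depend on). It runs a LEFT JOIN variant of the query and then the "weird" direct join on LASTNAME and PROFESSOR. The ENROLLMENTID column name is written without the space for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (STUDENTID INTEGER PRIMARY KEY, LASTNAME TEXT, FIRSTNAME TEXT);
CREATE TABLE CLASS (CLASSID INTEGER PRIMARY KEY, COURSE TEXT, PROFESSOR TEXT);
CREATE TABLE ENROLLMENT (ENROLLMENTID INTEGER PRIMARY KEY, STUDENTID INTEGER, CLASSID INTEGER);
INSERT INTO STUDENT VALUES (1,'Smith','John'), (2,'Patel','Sanjay'), (3,'Lee','Kevin'), (4,'Jackson','Steven');
INSERT INTO CLASS VALUES (1,'CS 101','Smith'), (2,'CS 201','Ghandi'), (3,'CS 301','McDavid'), (4,'CS 401','Martinez');
INSERT INTO ENROLLMENT VALUES (1,2,3), (2,3,1), (3,4,2);
""")

# LEFT JOIN keeps every STUDENT row, even the one with no enrollment.
left_rows = conn.execute("""
SELECT stud.LASTNAME, class.COURSE
FROM STUDENT stud
LEFT JOIN ENROLLMENT enr ON stud.STUDENTID = enr.STUDENTID
LEFT JOIN CLASS class ON class.CLASSID = enr.CLASSID
ORDER BY stud.STUDENTID
""").fetchall()

# A join on columns that were never designed as a key pair still works.
name_rows = conn.execute("""
SELECT stud.FIRSTNAME, stud.LASTNAME, class.COURSE
FROM STUDENT stud
INNER JOIN CLASS class ON stud.LASTNAME = class.PROFESSOR
""").fetchall()

print(left_rows)
print(name_rows)
```

In the first result John Smith shows up with an empty COURSE, which is exactly the row the INNER JOIN version dropped. The second result pairs the student Smith with the class taught by a professor of the same name.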
In the meantime... SELF JOINS!
Consider the following table...
PERSON
| PERSONID | FAMILYID | NAME |
--------------------------------
1 | 1 | John
2 | 1 | Brynn
3 | 2 | Arpan
4 | 2 | Steve
5 | 2 | Tim
6 | 3 | Becca
If you felt so inclined as to make a database of all the people you know and which ones are in the same family this might be what it looks like.
If you wanted to return one person, PERSONID 4, for instance, you would write...
SELECT * FROM PERSON WHERE PERSONID = 4;
You would learn that he is in the family with FAMILYID 2. Then to find all of the PERSONs in his family you would write...
SELECT * FROM PERSON WHERE FAMILYID = 2;
Done and done! SQL, of course, can accomplish this in one query using, you guessed it, a SELF JOIN.
What really triggers the need for a SELF JOIN here is that the table contains a unique column (PERSONID) and a column that serves as sort of a "Category" (FAMILYID). This concept is called Cardinality and in this case represents a one to many or 1:M relationship. There is only one of each PERSON but there are many PERSONs in a FAMILY.
So, what we want to return is all of the members of a family if one member of the family's PERSONID is known...
SELECT fam.*
FROM PERSON per
JOIN PERSON fam
ON per.FamilyID = fam.FamilyID
WHERE per.PERSONID = 4;
Here's what you would get...
| PERSONID | FAMILYID | NAME |
--------------------------------
3 | 2 | Arpan
4 | 2 | Steve
5 | 2 | Tim
Let's note a couple of things. The words SELF JOIN don't occur anywhere. That's because a SELF JOIN is just a concept. The word JOIN in the query above could have been a LEFT JOIN instead and different things would have happened. The point of a SELF JOIN is that you are using the same table twice.
Consider my soapbox from before on data sets. Here we have started with the data set from the PERSON table twice. Neither instance of the data set affects the other one unless we say it does.
Let's start at the bottom of the query. The per data set is being limited to only those rows where PERSONID = 4. Knowing the table we know that will return exactly one row. The FAMILYID column in that row has a value of 2.
In the ON clause we are limiting the fam data set (which at this point is still the entire PERSON table) to only those rows where the value of FAMILYID matches one or more of the FAMILYIDs of the per data set. As we discussed we know the per data set only has one row, therefore one FAMILYID value. Therefore the fam data set now contains only rows where FAMILYID = 2.
Finally, at the top of the query we are SELECTing all of the rows in the fam data set.
Voila! Two queries in one.
In conclusion, an INNER JOIN is one of several kinds of JOIN operations. I would strongly suggest reading further into LEFT, RIGHT and FULL OUTER JOINs (which are, collectively, called OUTER JOINs). I personally missed a job opportunity for having a weak knowledge of OUTER JOINs once and won't let it happen again!
A SELF JOIN is simply any JOIN operation where you are relating a table to itself. The way you choose to JOIN that table to itself can use an INNER JOIN or an OUTER JOIN. Note that with a SELF JOIN, so as not to confuse your SQL engine, you must use table aliases (fam and per from above; make up whatever makes sense for your query) or there is no way to differentiate the different versions of the same table.
Now that you understand the difference open your mind nice and wide and realize that one single query could contain all different kinds of JOINs at once. It's just a matter of what data you want and how you have to twist and bend your query to get it. If you find yourself running one query and taking the result of that query and using it as the input of another query then you can probably use a JOIN to make it one query instead.
To play around with SQL, try visiting W3Schools.com. There is a locally stored database there with a bunch of tables that are designed to relate to each other in various ways, and it's filled with data! You can CREATE, DROP, INSERT, UPDATE and SELECT all you want and return the database to its default at any time. Try all sorts of SQL out to experiment with different tricks. I've learned a lot there, myself.
Sorry if this was a little wordy but I personally struggled with the concept of JOINs when I was starting to learn SQL and explaining a concept by using a bunch of other complex concepts bogged me down. Best to start at the bottom sometimes.
I hope it helps. If you can put JOINs in your back pocket you can work magic with SQL!
Happy querying!
A self join joins a table to itself. The employee table might be joined to itself in order to show the manager name and the employee name in the same row.
An inner join joins any two tables and returns rows where the key exists in both tables. A self join can be an inner join (most joins are inner joins and most self joins are inner joins). An inner join can be a self join but most inner joins involve joining two different tables (generally a parent table and a child table).
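A quick sketch of the employee/manager case mentioned above, run against in-memory SQLite from Python. The employee table, its manager_id column, and the names in it are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- manager_id points back at another row of the same table.
CREATE TABLE employee (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    manager_id INTEGER REFERENCES employee(id)
);
INSERT INTO employee VALUES (1,'Ada',NULL), (2,'Ben',1), (3,'Cleo',1), (4,'Dev',2);
""")

# Self join: the same table appears twice under two aliases.
pairs = conn.execute("""
SELECT emp.name, mgr.name
FROM employee emp
INNER JOIN employee mgr ON emp.manager_id = mgr.id
ORDER BY emp.id
""").fetchall()

print(pairs)
```

Because this self join is also an inner join, Ada (who has no manager) does not appear in the employee column; switching to LEFT JOIN would keep her row with an empty manager name.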
An inner join (sometimes called a simple join) is a join of two or more tables that returns only those rows that satisfy the join condition.
A self join is a join of a table to itself. This table appears twice in the FROM clause and is followed by table aliases that qualify column names in the join condition. To perform a self join, Oracle Database combines and returns rows of the table that satisfy the join condition.
I have a database in MS Access, and I ran into a problem with empty values. I have 3 tables that are connected to each other. Let's say Table 1 contains people, Table 2 contains phone numbers, and Table 3 connects Tables 1 and 2 by holding both of their IDs, so I can later see which person has which numbers.
What I want is for Access to display a person even if he/she doesn't have a number assigned, and also a number even when no person is assigned to it.
Something like this:
Persons_name |Phone_number
--------------------------
Fred | 123
| 222
Anna |
The tables look something like this:
People People_phones Phones
------------- -------------- ------------
ID ID ID
Persons_name People_ID Phone_number
Phones_ID
So far I've managed to get access to show either table 1's null values or table 2's null values, but not both.
As E Mett indicated above, you're looking for a full outer join, which Access doesn't handle directly. Here is an example of what he's suggesting:
How do I write a full outer join query in access
JB
In SQL jargon, what you are looking for is a full outer join.
This is unfortunately not available in MS Access because it is rarely needed.
You should create two queries, one using a left join and the other using a right join.
Then use the UNION keyword to combine the results.
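Here is a rough sketch of that left-join-plus-right-join idea, run in SQLite through Python rather than in Access itself, so treat the exact syntax as an assumption (Access typically wants parentheses around nested joins). The right join is written as a left join starting from Phones, which gives the same rows. Table and column names follow the question; the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People (ID INTEGER PRIMARY KEY, Persons_name TEXT);
CREATE TABLE Phones (ID INTEGER PRIMARY KEY, Phone_number TEXT);
CREATE TABLE People_phones (ID INTEGER PRIMARY KEY, People_ID INTEGER, Phones_ID INTEGER);
INSERT INTO People VALUES (1, 'Fred'), (2, 'Anna');
INSERT INTO Phones VALUES (1, '123'), (2, '222');
INSERT INTO People_phones VALUES (1, 1, 1);
""")

rows = conn.execute("""
-- Every person, with a phone number where one is linked.
SELECT p.Persons_name, ph.Phone_number
FROM People p
LEFT JOIN People_phones pp ON pp.People_ID = p.ID
LEFT JOIN Phones ph ON ph.ID = pp.Phones_ID
UNION
-- Every phone number, with a person where one is linked.
SELECT p.Persons_name, ph.Phone_number
FROM Phones ph
LEFT JOIN People_phones pp ON pp.Phones_ID = ph.ID
LEFT JOIN People p ON p.ID = pp.People_ID
""").fetchall()

print(rows)
```

UNION (as opposed to UNION ALL) removes the Fred/123 row that both halves return, so each pairing appears once.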
I'm having difficulty trying to design a database structure for the following scenario:
My database should contain general user information: UserID, FirstName, LastName, JoiningDate.
Each user can be part of a group.
Each group has "tags" attached to it and can have multiple tags. Users should also be able to get a list of available groups (filtered by tags).
It should be possible to search for a group (by the tags attached to it) and to search for particular users (by last name or unique ID). It should also be able to return a list of available groups (filtered by tags), and the members of a particular group (filtered by last name and by joining date).
There should also be a means of discovering which users belong to several given groups (a query on "who are the members of both the Bravo group and the Delta group"), and of keeping track of messages sent in the group (like a forum).
Is this just two tables? Or should it be three tables: Users, Groups and Tags? It's been almost a year since I've done any relational database work, and I was wondering if anyone could show a visual representation of this database design.
I suggest five tables: Users, Groups, Tags and link tables UserGroups and GroupTags.
This is because there appears to be a many-to-many relationship between Users and Groups, and between Groups and Tags - a link entity is required in relational design to join entities with many-to-many relationships between them.
---------         ----------         --------
| Users |         | Groups |         | Tags |
---------         ----------         --------
    |              |      |             |
    |              |      |             |
   /|\            /|\    /|\           /|\
  --------------------  -------------------
  |    UserGroups    |  |    GroupTags    |
  --------------------  -------------------
Definitely at least 3: User, Groups and Tags.
And because there is a many-to-many relationship (an assumption from your description) between groups and tags, you'd probably need a cross table to link groups and tags together.
If one user can belong to more than one group, there should also be a cross table between users and groups.
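As a sketch of how the cross tables answer the two trickier queries from the question (groups filtered by tag, and members of both "Bravo" and "Delta"), here is a small SQLite example driven from Python. All column names beyond those in the question are assumptions, and the groups table is called AppGroups here only because GROUPS is a reserved word in some engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (UserID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT, JoiningDate TEXT);
CREATE TABLE AppGroups (GroupID INTEGER PRIMARY KEY, GroupName TEXT);
CREATE TABLE Tags (TagID INTEGER PRIMARY KEY, TagName TEXT);
-- Cross tables: one row per membership / per tag attachment.
CREATE TABLE UserGroups (UserID INTEGER, GroupID INTEGER, PRIMARY KEY (UserID, GroupID));
CREATE TABLE GroupTags (GroupID INTEGER, TagID INTEGER, PRIMARY KEY (GroupID, TagID));
INSERT INTO Users VALUES (1,'Ann','Lee','2020-01-05'), (2,'Bob','Ray','2021-03-10'), (3,'Cy','Fox','2022-07-01');
INSERT INTO AppGroups VALUES (1,'Bravo'), (2,'Delta');
INSERT INTO Tags VALUES (1,'hiking'), (2,'chess');
INSERT INTO UserGroups VALUES (1,1), (1,2), (2,1), (3,2);
INSERT INTO GroupTags VALUES (1,1), (2,1), (2,2);
""")

# Groups that carry a given tag.
chess_groups = conn.execute("""
SELECT g.GroupName
FROM AppGroups g
JOIN GroupTags gt ON gt.GroupID = g.GroupID
JOIN Tags t ON t.TagID = gt.TagID
WHERE t.TagName = ?
""", ("chess",)).fetchall()

# Users who are members of BOTH named groups.
in_both = conn.execute("""
SELECT u.LastName
FROM Users u
JOIN UserGroups ug ON ug.UserID = u.UserID
JOIN AppGroups g ON g.GroupID = ug.GroupID
WHERE g.GroupName IN ('Bravo', 'Delta')
GROUP BY u.UserID, u.LastName
HAVING COUNT(DISTINCT g.GroupID) = 2
""").fetchall()

print(chess_groups, in_both)
```

The HAVING clause is what turns "member of either group" into "member of both". Messages sent in a group would need one more table, which this sketch leaves out.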
Here's a tricky normalization/SQL/Database Design question that has been puzzling us. I hope I can state it correctly.
You have a set of activities. They are things that need to be done -- a glorified TODO list. Any given activity can be assigned to an employee.
Every activity also has an entity for whom the activity is to be performed. Those entities are either a Contact (a person) or a Customer (a business), so each activity will have either a Contact or a Customer for whom the activity will be done. For instance, the activity might be "Send a thank you card to Spacely Sprockets (a Customer)" or "Send marketing literature to Tony Almeida (a Contact)".
From that structure, we then need to be able to query to find all the activities a given employee has to do, listing them in a single relation that would be something like this in its simplest form:
-----------------------------------------------------
| Activity | Description | Recipient of Activity |
-----------------------------------------------------
The idea here is to avoid having two columns for Contact and Customer with one of them null.
I hope I've described this correctly, as this isn't as obvious as it might seem at first glance.
So the question is: What is the "right" design for the database and how would you query it to get the information asked for?
It sounds like a basic many-to-many relationship and I'd model it as such.
The "right" design for this database is to have one column for each, which you say you are trying to avoid. This allows for a proper foreign key relationship to be defined between those two columns and their respective tables. Using the same column for a key that refers to two different tables will make queries ugly and you can't enforce referential integrity.
The Activities table should have foreign keys ContactID and CustomerID.
To show activities for an employee:
SELECT ActivityName, ActivityDescription, CASE WHEN a.ContactID IS NOT NULL THEN cn.ContactName ELSE cu.CustomerName END AS Recipient
FROM activity a
LEFT JOIN contacts cn ON a.ContactID=cn.ContactID
LEFT JOIN customers cu ON a.CustomerID=cu.CustomerID
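A runnable sketch of the two-column design, using SQLite from Python. The EmployeeID column, the CHECK constraint, and the sample rows are my assumptions: the query above doesn't filter by employee, so some such column is needed to answer the original question. COALESCE stands in for the CASE expression and behaves the same when exactly one of the two keys is set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (ContactID INTEGER PRIMARY KEY, ContactName TEXT);
CREATE TABLE customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE activity (
    ActivityID          INTEGER PRIMARY KEY,
    ActivityName        TEXT,
    ActivityDescription TEXT,
    EmployeeID          INTEGER,  -- hypothetical assignee column
    ContactID           INTEGER REFERENCES contacts(ContactID),
    CustomerID          INTEGER REFERENCES customers(CustomerID),
    -- Exactly one of the two recipient columns must be filled in.
    CHECK ((ContactID IS NULL) <> (CustomerID IS NULL))
);
INSERT INTO contacts VALUES (1, 'Tony Almeida');
INSERT INTO customers VALUES (1, 'Spacely Sprockets');
INSERT INTO activity VALUES
    (1, 'Card', 'Send a thank you card', 7, NULL, 1),
    (2, 'Mail', 'Send marketing literature', 7, 1, NULL),
    (3, 'Call', 'Follow up', 8, 1, NULL);
""")


def activities_for(conn, employee_id):
    """List one employee's activities with a single Recipient column."""
    return conn.execute("""
        SELECT a.ActivityName, a.ActivityDescription,
               COALESCE(cn.ContactName, cu.CustomerName) AS Recipient
        FROM activity a
        LEFT JOIN contacts cn ON a.ContactID = cn.ContactID
        LEFT JOIN customers cu ON a.CustomerID = cu.CustomerID
        WHERE a.EmployeeID = ?
        ORDER BY a.ActivityID
    """, (employee_id,)).fetchall()


print(activities_for(conn, 7))
```

The NULL column still exists in storage, but the CHECK constraint keeps it honest and the result set shows only the single Recipient column the question asked for.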
It's not clear to me why you are defining Customers and Contacts as separate entities, when they seem to be versions of the same entity. It seems to me that Customers are Contacts with additional information. If at all possible, I'd create one table of Contacts and then mark the ones that are Customers either with a field in that table, or by adding their ids to a table Customers that has the extended singleton customer information in it.
If you can't do that (because this is being built on top of an existing system the design of which is fixed) then you have several choices. None of the choices are good because they can't really work around the original flaw, which is storing Customers and Contacts separately.
1. Use two columns, one NULL, to allow referential integrity to work.
2. Build an intermediate table ActivityContacts with its own PK and two columns, one NULL, to point to the Customer or Contact. This allows you to build a "clean" Activity system, but pushes the ugliness into that intermediate table. (It does provide a possible benefit, which is that it allows you to limit the target of activities to people added to the intermediate table, if that's an advantage to you).
3. Carry the original design flaw into the Activities system and (I'm biting my tongue here) have parallel ContactActivity and CustomerActivity tables. To find all of an employee's assigned tasks, UNION those two tables together into one in a VIEW. This allows you to maintain referential integrity, does not require NULL columns, and provides you with a source from which to get your reports.
Here is my stab at it:
Basically, you need each activity to be associated with one recipient (a Contact or a Customer) and one employee who is the person responsible for the activity. Note that you can enforce referential constraints in a model like this.
Also note I added a businessEntity table that connects all people and places (sometimes useful, but not necessary). The reason for adding the businessEntity table is that you could simply have the ResponsiblePerson and the Recipient on the activity reference businessEntity, and then activities can be performed and received by any and all people or places.
If I've read the case right, Recipients is a generalization of Customers and Contacts.
The gen-spec design pattern is well understood.
Data modeling question
You would have something as follows:
Activity | Description | Recipient Type
Where Recipient Type is one of Contact or Customer
You would then execute a SQL select statement as follows:
Select * from table where Recipient_Type = 'Contact';
I realize there needs to be more information.
We will need an additional table that is representative of Recipients(Contacts and Customers):
This table should look as follows:
ID | Name| Recipient Type
Recipient Type will be a key reference to the table initially mentioned earlier in this post. Of course there will need to be work done to handle cascades across these tables, mostly on updates and deletes. So to quickly recap:
Recipients.Recipient_Type is a FK to Table.Recipient_Type
[ActivityRecipientRecipientType]
ActivityId
RecipientId
RecipientTypeCode
|||      |||      |||_____________________________
 |        |                                      |
 |        --------------------                   |
 |                           |                   |
[Activity]           [Recipient]        [RecipientType]
ActivityId           RecipientId        RecipientTypeCode
ActivityDescription  RecipientName      RecipientTypeName
select
[Activity].ActivityDescription
, [Recipient].RecipientName
from
[Activity]
join [ActivityRecipientRecipientType] on [Activity].ActivityId = [ActivityRecipientRecipientType].ActivityId
join [Recipient] on [ActivityRecipientRecipientType].RecipientId = [Recipient].RecipientId
join [RecipientType] on [ActivityRecipientRecipientType].RecipientTypeCode = [RecipientType].RecipientTypeCode
where [RecipientType].RecipientTypeName = 'Contact'
Actions
Activity_ID | Description | Recipient ID
-------------------------------------
11 | Don't ask questions | 0
12 | Be cool | 1
Activities
ID | Description
----------------
11 | Shoot
12 | Ask out
People
ID | Type | email | phone | GPS |....
-------------------------------------
0 | Troll | troll#hotmail.com | 232323 | null | ...
1 | hottie | hottie#hotmail.com | 2341241 | null | ...
select at.description,a.description, p.* from Activities at, Actions a, People p
where a."Recipient ID" = p.ID
and at.ID=a.activity_id
result:
Shoot | Don't ask questions | 0 | Troll | troll#hotmail.com | 232323 | null | ...
Ask out | Be cool | 1 | hottie | hottie#hotmail.com | 2341241 |null | ...
Model another Entity: ActivityRecipient, which will be inherited by ActivityRecipientContact and ActivityRecipientCustomer, which will hold the proper Customer/Contact ID.
The corresponding tables will be:
Table: Activities(...., RecipientID)
Table: ActivityRecipients(RecipientID, RecipientType)
Table: ActivityRecipientContacts(RecipientID, ContactId, ...,ExtraContactInfo...)
Table: ActivityRecipientCustomers(RecipientID, CustomerId, ...,ExtraCustomerInfo...)
This way you can also have different other columns for each recipient type
I would revise that definition of Customer and Contact. A customer can be either a person or a business, right? In Brazil, there are the terms 'pessoa jurídica' and 'pessoa física', which in a direct (and mindless) translation become 'legal person' (business) and 'physical person' (individual). A better translation was suggested by Google: 'legal entity' and 'individual'.
So we get a Person table, plus 'LegalEntity' and 'Individual' tables (if there are enough attributes to justify them; here there are plenty). The receiver then becomes an FK to the Person table.
And where have the contacts gone? They become a table that links to Person, since a contact is a person who is the contact of another person (example: my wife is my registered contact at some companies where I'm a customer). People can have contacts.
Note: I used the word 'Person', but you can call that base table 'Customer'.
For about a year now, we’ve been allowing our users to login with usernames and/or email addresses that are not unique (though each user does have a unique id). Although the system handles duplicate usernames/emails elegantly, we’ve decided to finally enforce unique usernames and email addresses. I’ve been tasked with generating a table in MySQL that will show the duplicates and the tables in which a duplicate’s id is being used (i.e. the tables dependent on the duplicate’s user id, using 1 for true and 0 for false). This table will then be used as a reference once duplicate data is marked for deletion. In short, I’m looking to generate a table something like this:
| User_id | Username | Email | Exists_in_Table1 | Exists_in_Table2 | Exists_in_Table3 |
---------------------------------------------------------------------------------------
| 0001    | test1    | email | 0                | 0                | 1                |
| 0002    | test2    | email | 0                | 1                | 1                |
| 0003    | test3    | email | 1                | 1                | 1                |
It doesn’t matter much how this is accomplished. Since my SQL skills are somewhat lacking, I intended to do this programmatically using PHP and a number of simple SQL queries. However, I believe a single SQL query or a series of queries (without the use of PHP) is the cleanest approach. I know how to query for duplicates, but I can’t seem to figure out how to query multiple tables and join them by the user id in an appropriate manner. I appreciate any and all help with this. Thank you.
SELECT u.User_id, u.Username, u.Email,
IF(t1.User_id IS NULL, 0, 1) AS Exists_in_Table1,
IF(t2.User_id IS NULL, 0, 1) AS Exists_in_Table2,
IF(t3.User_id IS NULL, 0, 1) AS Exists_in_Table3
FROM Users u
LEFT OUTER JOIN Table1 t1 USING (User_id)
LEFT OUTER JOIN Table2 t2 USING (User_id)
LEFT OUTER JOIN Table3 t3 USING (User_id);
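One thing to watch with the LEFT JOIN version: if a user id appears several times in a dependent table, the user's row is repeated. Below is a variation (not the query above, just a sketch of an alternative) that uses EXISTS subqueries so each user appears once, and that also restricts the output to users whose username is duplicated. It runs in SQLite via Python; the two dependent tables and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (User_id INTEGER PRIMARY KEY, Username TEXT, Email TEXT);
CREATE TABLE Table1 (User_id INTEGER);
CREATE TABLE Table2 (User_id INTEGER);
INSERT INTO Users VALUES
    (1, 'test', 'a@example.com'),
    (2, 'test', 'b@example.com'),
    (3, 'solo', 'c@example.com');
INSERT INTO Table1 VALUES (1), (1);  -- user 1 is referenced twice
INSERT INTO Table2 VALUES (2);
""")

rows = conn.execute("""
SELECT u.User_id, u.Username,
       EXISTS (SELECT 1 FROM Table1 t WHERE t.User_id = u.User_id) AS Exists_in_Table1,
       EXISTS (SELECT 1 FROM Table2 t WHERE t.User_id = u.User_id) AS Exists_in_Table2
FROM Users u
-- Keep only users whose username occurs more than once.
WHERE u.Username IN (
    SELECT Username FROM Users GROUP BY Username HAVING COUNT(*) > 1
)
ORDER BY u.User_id
""").fetchall()

print(rows)
```

The same WHERE filter could be applied to Email as well if duplicate email addresses also need to be listed.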