Best way to implement SQL Many-to-Many joins

I have three tables, and their relevant columns are:
tPerson
-> PersonID
tPersonStatusHistory
-> PersonStatusHistoryID
-> PersonID
-> StatusID
-> PersonStatusDate
Status
-> StatusID
I want to store a full history of all the Statuses that a Person has ever had. But I also want easy access to the current status.
Query to get the current status:
SELECT TOP 1 StatusID FROM tPersonStatusHistory
WHERE PersonID = ? ORDER BY PersonStatusDate DESC
What I want is a query that will fetch me a list of Person records, with their most recent StatusID as a column in the query.
We have tried the following approaches:
Including the above query as a sub-query in the select.
Adding a CurrentPersonStatusHistoryID column to the tPerson table and maintaining it using a computed column that calls a User-Defined-Function.
Maintaining the CurrentPersonStatusHistoryID column using a trigger on the tPersonStatusHistory table.
The query to pull up the Person records is quite high use, so I don't want to have to look up the History table each time. The trigger approach is closest to what I want, since the data is persisted in the Person table and is only changed when an update is made (which is by comparison not very often).
I find triggers difficult to maintain and I would prefer to stay away from them. I've also found that when doing an Insert-Select, or an Update query involving multiple records, the trigger is only called on the first record and not the others.
What I really want is to put some logic into the column definition of CurrentPersonStatusHistoryID, press Save and have it persisted and updated behind the scenes without my intervention.
Given that Many-to-Many relationships are common I was wondering if anyone else had come across a similar situation and had some insight into the highest performance, and preferably least hassle, way of implementing this.

Another approach is to use something like the following query, perhaps as a view. It will give you the most recent StatusID for each Person.
SELECT PersonID, StatusID
FROM (
SELECT PersonID, StatusID,
rank() OVER(PARTITION BY PersonID ORDER BY PersonStatusDate DESC) as rnk
FROM tPersonStatusHistory
) A
WHERE rnk = 1
I'm not sure that this satisfies your requirement for performance, but it's something you could look into.
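To get the Person rows themselves with the latest status as a column, the ranked subquery can be joined back to tPerson. The following is a minimal sketch, assuming SQL Server 2005 or later. It swaps rank() for ROW_NUMBER() so that two history rows with the same PersonStatusDate cannot produce duplicate Person rows, and it breaks ties on PersonStatusHistoryID, which is only an assumption about what "most recent" should mean. The aliases and the CurrentStatusID column name are made up for illustration.

```sql
-- One row per person, with the most recent StatusID (NULL if no history yet).
SELECT p.PersonID,
       h.StatusID AS CurrentStatusID
FROM tPerson AS p
LEFT JOIN (
    SELECT PersonID,
           StatusID,
           ROW_NUMBER() OVER (
               PARTITION BY PersonID
               ORDER BY PersonStatusDate DESC,
                        PersonStatusHistoryID DESC  -- assumed tie-breaker
           ) AS rn
    FROM tPersonStatusHistory
) AS h
    ON h.PersonID = p.PersonID
   AND h.rn = 1;
```

An index on tPersonStatusHistory covering (PersonID, PersonStatusDate) would probably help here, though that is worth checking against the actual execution plan.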

Related

Add column to SQL Table based on other rows in the table

I have a table that contains stop times for a transit system. The details aren't important, but my table essentially looks like this:
I am importing the data from a CSV file which contains everything except the next stop ID. I want to generate Next Stop ID to speed up some data processing I am going to do in my app.
For each row, the Next Stop ID should be the Stop ID from the next row with matching Trip ID and Service ID. The ordering should be based on the Stop Sequence, which will be increasing but not necessarily in order (1, 20, 21, 23, etc rather than 1,2,3,4...).
Here is an example of what I'm hoping it will look like. For simplicity, I kept all the service IDs the same and there are two Trip IDs. If there is no next stop I want that entry to just be blank.
I think it makes sense to do this entirely in SQL, but I'm not sure how best to do it. I know how I would do it in a standard programming language, but not SQL. Thank you for your help.

You can use lead():
select
t.*,
lead(stop_id)
over(partition by trip_id, service_id order by stop_sequence) next_stop_id
from mytable t
It is not necessarily a good idea to actually store that derived information, since you can compute it on the fly when needed (you can put the query in a view to make it easier to access). But if you want this in an update, then, assuming that stop_id is the primary key of the table, that would look like:
update mytable
set next_stop_id = t.next_stop_id
from (
select
stop_id,
lead(stop_id) over(partition by trip_id, service_id order by stop_sequence) next_stop_id
from mytable
) t
where mytable.stop_id = t.stop_id
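If the value is computed on the fly instead, the view could look roughly like the sketch below. The view name stop_times_with_next is hypothetical, and the sketch assumes mytable does not already contain a stored next_stop_id column (otherwise t.* would clash with the computed column).

```sql
-- Hypothetical view name; mytable and its columns are as in the lead() query.
CREATE VIEW stop_times_with_next AS
SELECT
    t.*,
    LEAD(stop_id) OVER (
        PARTITION BY trip_id, service_id
        ORDER BY stop_sequence
    ) AS next_stop_id   -- NULL for the last stop of each trip
FROM mytable t;
```

The app can then select from stop_times_with_next exactly as it would from the table, and the next stop is always consistent with the imported data.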

Update JOIN table contents

I have a table joined from two other tables. I would like this table to stay updated with entries in the other two tables.
First Table is "employees"
I am using the ID, Last_Name, and First_Name.
And the second Table is "EmployeeTimeCardActions"
using columns ID, ActionTime, ActionDate, ShiftStart, and ActionType.
ID is the common column that the join was created on.
Because I usually get a comment saying I did not include enough information: I do not need an exact, specific code sample, and I think I have included everything needed. If there is a good reason to include more I will; I just try to keep as little company information public as possible.

Sounds like you're having your data duplicated across tables. Not a smart idea at all. You can update data in one table when a row is updated in a different one via triggers but this is a TERRIBLE approach. If you want to display data joined from 2 tables, the right approach here is using an SQL VIEW which will display the current data.
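A minimal sketch of such a view, using the table and column names from the question. The view name is hypothetical, and an inner join is assumed since the question does not say whether employees without time card actions should appear.

```sql
-- Hypothetical view name; tables and columns are the ones named in the question.
CREATE VIEW EmployeeTimeCardView AS
SELECT e.ID,
       e.Last_Name,
       e.First_Name,
       a.ActionDate,
       a.ActionTime,
       a.ShiftStart,
       a.ActionType
FROM employees AS e
INNER JOIN EmployeeTimeCardActions AS a
    ON a.ID = e.ID;   -- ID is the common column described in the question
```

Because a plain view stores no data of its own, every query against it reflects the current contents of both tables.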

How to force the ID column to remain sequential even if a record has been deleted, in SQL Server?

I don't know the best wording for the question, but I have a table that has 2 columns: ID and NAME.
When I delete a record from the table, its ID is deleted with it and the sequence is broken.
Take this example:
If I delete row number 2, the sequence of the ID column will be: 1,3,4
How can I make it: 1,2,3

IDs are meant to be unique for a reason. Consider this scenario:
**Customers**
id value
1 John
2 Jackie
**Accounts**
id customer_id balance
1 1 $500
2 2 $1000
In the case of a relational database, say you were to delete "John" from the database. Now Jackie would take on the customer_id of 1. When Jackie goes in to check her balance, it will now show $500 short.
Granted, you could go through and update all of her other records, but A) this would be a massive pain in the ass, and B) it would be very easy to make mistakes, especially in a large database.
Ids (primary keys in this case) are meant to be the rock that holds your relational database together, and you should always be able to rely on that value regardless of the table.
As JohnFx pointed out, should you want a value that shows the order of the user, consider using a built-in function when querying.

In SQL Server identity columns are not guaranteed to be sequential. You can use the ROW_NUMBER function to generate a sequential list of ids when you query the data from the database:
SELECT
ROW_NUMBER() OVER (ORDER BY Id) AS SequentialId,
Id As UniqueId,
Name
FROM dbo.Details

If you want sequential numbers don't store them in the database. That is just a maintenance nightmare, and I really can't think of a very good reason you'd even want to bother.
Just generate them dynamically using T-SQL's ROW_NUMBER() function when you query the data.
The whole point of an Identity column is creating a reliable identifier that you can count on pointing to that row in the DB. If you shift them around you undermine the main reason you WANT an ID.
In a real world example, how would you feel if the IRS wanted to change your SSN every week so they could keep the Social Security Numbers sequential after people died off?

Is it possible to add a "check if previous" column to a view?

I have a view in SQL Server 2008 that I want to use for a report in SSRS 2008.
The main problem is that I have to use two different datasets in one report table and cannot group the way I want. Both datasets come from this view. Say that in one column of my report table I want the total number of computers across all school buildings in my country, and in the other column the ratio of students to computers.
In the database there are two different tables, one for Buildings and one for Schools (because one school can have several buildings, and similar scenarios). Joining them produces more building-school pairs than the computer sum needs, so the same building is counted several times if more than one school operates in it.
The table is this:
To avoid this I created the two datasets, one from the building point of view and one from the school point of view. But those are two datasets in one table! To solve my problem, I thought of adding a special column to my view that automatically checks whether a BUILDING_ID appears twice or more in the result, for example like this:
The problem is that I don't know if this is possible and if it is, I don't know how to do it.

Maybe this can give you a hint:
select building_id,
row_number() over (partition by building_id order by newid()) - 1 check_if_previous
from yourtable
If you just want 1's or 0's
select building_id,
cast(row_number() over (partition by building_id order by newid()) - 1 as BIT) check_if_previous
from yourtable
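With a per-building row number available, the double counting can be avoided in a single dataset by adding a building's computers only on its first row. This is a sketch only: the computers and students column names are hypothetical (the question's table is not shown), and ORDER BY (SELECT NULL) is used in place of newid() because the order within a building does not matter here.

```sql
-- Hypothetical columns: computers (one value per building, repeated on each
-- building-school row) and students (one value per school).
SELECT SUM(CASE WHEN rn = 1 THEN computers ELSE 0 END) AS total_computers,
       SUM(students) AS total_students
FROM (
    SELECT building_id,
           computers,
           students,
           ROW_NUMBER() OVER (PARTITION BY building_id
                              ORDER BY (SELECT NULL)) AS rn
    FROM yourtable
) AS t;
```

The students-per-computer ratio can then be computed in the report from the two totals. If a school can also appear on several rows, students would need the same treatment with a row number partitioned by the school key.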

Maintaining a metadata table in SQL

Can someone give me some direction on how to tackle a scenario like this?
I have a User table which contains all the user information; UserID is the primary key of the User table. I have another table called, for example, Comments, which holds all the comments created by any user. The Comments table contains UserID as a foreign key. Now I have to rank the users based on the number of comments they have added: the more comments a user has added, the higher the ranking. I am trying to see what the best way to do this would be.
I would prefer to have another table that contains all the attributes or statistics of a user (it might have more attributes in future; right now only rank, based on comment count), rather than adding another column to the User table itself.
If I create another table called UserStats, with UserID as the foreign key and another column called Rank, then every time a user adds a comment we might need to update the ranks. How do I write an SP that does this? I'm not even sure this is the right way to do it.

This is not the right way to do this.
You don't want to be materializing those kinds of computed values until there is a performance problem - and you have options like Indexed Views to help you well before you get to the point of doing what you suggested.
Just create a View called UserRankings and have it look like:
SELECT c.UserId, COUNT(c.CommentId) [Ranking]
FROM Comments c
GROUP BY c.UserId
Not sure how you want to do your rankings, but you can also look at the RANK() and DENSE_RANK() functions in T-SQL: Ranking Functions (Transact-SQL)
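As a rough illustration of those two functions applied to the comment counts, assuming the Comments table has the UserId and CommentId columns used in the view above (the output column names are made up):

```sql
-- RANK() leaves gaps after ties (1, 1, 3); DENSE_RANK() does not (1, 1, 2).
SELECT c.UserId,
       COUNT(c.CommentId) AS CommentCount,
       RANK()       OVER (ORDER BY COUNT(c.CommentId) DESC) AS RankWithGaps,
       DENSE_RANK() OVER (ORDER BY COUNT(c.CommentId) DESC) AS RankNoGaps
FROM Comments c
GROUP BY c.UserId;
```

Users with no comments do not appear in this result; including them would need an outer join from the User table.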

You could do this from a query
SELECT UserID,
COUNT(UserID) CntOfUserID
FROM UserComments
GROUP BY UserID
ORDER BY COUNT(UserID) DESC
You could also do this using ROW_NUMBER():
DECLARE @Comments TABLE(
UserID INT,
Comment VARCHAR(MAX)
)
INSERT INTO @Comments SELECT 3, 'Foo'
INSERT INTO @Comments SELECT 3, 'Bar'
INSERT INTO @Comments SELECT 3, 'Tada'
INSERT INTO @Comments SELECT 2, 'T'
INSERT INTO @Comments SELECT 2, 'G'
SELECT UserID,
ROW_NUMBER() OVER (ORDER BY COUNT(UserID) DESC) ID
FROM @Comments
GROUP BY UserID

Storing that kind of information is actually a bad idea. The count of comments per user is something that can be calculated at any given time quickly and easily. And if your columns are properly indexed (on the foreign key), the count operation ought to happen very quickly.
The only reason you might want to persist metadata is if the load on your database is fast and furious and you simply cannot afford to run select queries with counts per request. And that load will also inform whether you simply add a column to your user table or create a whole separate table. (The latter solution being the one for the most extreme server loads.)
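For the indexing point, a minimal sketch might look like the following. The index name and the literal user id are placeholders; Comments and UserID are the names from the question.

```sql
-- Hypothetical index name; Comments and UserID are from the question.
CREATE INDEX IX_Comments_UserID ON Comments (UserID);

-- Comment count for one user; with the index above this can typically be
-- answered from the index without reading the whole Comments table.
SELECT COUNT(*) AS CommentCount
FROM Comments
WHERE UserID = 42;   -- 42 is a placeholder user id
```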

A few comments:
First, I think you should keep the "score" metadata somewhere; otherwise, you'd have to run the scoring calc each time, which could ultimately get expensive.
Second, I don't think you should calculate an actual "rank" (vs other users). Just calculate a "score" (based on the number of comments posted), then your query can determine "rank" by retrieving scores in descending order.
Third, I would probably make a trigger that updates the "score" in the metadata table, based on each insert into the comments table.
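A sketch of what such a trigger might look like in T-SQL. It assumes a UserStats table with UserID and a numeric Score column (Score is this answer's idea, not a column from the question), and that a UserStats row already exists for every user. The trigger name is hypothetical. SQL Server fires a trigger once per statement rather than once per row, so the sketch aggregates the inserted pseudo-table instead of assuming a single new comment.

```sql
-- Hypothetical trigger; assumes dbo.UserStats(UserID, Score) already has
-- one row per user.
CREATE TRIGGER trg_Comments_Insert_UpdateScore
ON dbo.Comments
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- "inserted" holds every row added by the statement, so group by user
    -- to handle multi-row inserts correctly.
    UPDATE s
    SET s.Score = s.Score + i.NewComments
    FROM dbo.UserStats AS s
    JOIN (
        SELECT UserID, COUNT(*) AS NewComments
        FROM inserted
        GROUP BY UserID
    ) AS i
        ON i.UserID = s.UserID;
END;
```

A matching AFTER DELETE trigger would be needed if comments can be removed, which is part of the maintenance cost the other answers warn about.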
