Add column to SQL Table based on other rows in the table - sql

I have a table that contains stop times for a transit system. The details aren't important, but my table essentially looks like this:
I am importing the data from a CSV file which contains everything except the next stop ID. I want to generate Next Stop ID to speed up some data processing I am going to do in my app.
For each row, the Next Stop ID should be the Stop ID from the next row with matching Trip ID and Service ID. The ordering should be based on the Stop Sequence, which will be increasing but not necessarily in order (1, 20, 21, 23, etc rather than 1,2,3,4...).
Here is an example of what I'm hoping it will look like. For simplicity, I kept all the service IDs the same and there are two Trip IDs. If there is no next stop I want that entry to just be blank.
I think it makes sense to do this entirely in SQL, but I'm not sure how best to do it. I know how I would do it in a standard programming language, but not SQL. Thank you for your help.

You can use lead():
select
t.*,
lead(stop_id)
over(partition by trip_id, service_id order by stop_sequence) next_stop_id
from mytable t
It is not necessarily a good idea to actually store that derived information, since you can compute it on the fly when needed (you can put the query in a view to make it easier to access). But if you want this in an update, then, assuming that stop_id is the primary key of the table, that would look like:
update mytable
set next_stop_id = t.next_stop_id
from (
select
stop_id,
lead(stop_id) over(partition by trip_id, service_id order by stop_sequence) next_stop_id
from mytable
) t
where mytable.stop_id = t.stop_id
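To see how lead() behaves when the stop sequence has gaps, here is a minimal sketch. It runs the same window function through Python's built-in sqlite3 module, which is an assumption made only to get a runnable demo (window functions need SQLite 3.25 or later). The sample rows are made up, not the asker's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mytable (trip_id text, service_id text, "
    "stop_sequence integer, stop_id text)"
)
# Hypothetical sample rows; stop_sequence increases but has gaps.
conn.executemany(
    "insert into mytable values (?, ?, ?, ?)",
    [
        ("T1", "S1", 1, "A"),
        ("T1", "S1", 20, "B"),
        ("T1", "S1", 21, "C"),
        ("T2", "S1", 5, "C"),
        ("T2", "S1", 9, "A"),
    ],
)

rows = conn.execute(
    """
    select trip_id, stop_sequence, stop_id,
           lead(stop_id) over (
               partition by trip_id, service_id
               order by stop_sequence
           ) as next_stop_id
    from mytable
    order by trip_id, stop_sequence
    """
).fetchall()

for row in rows:
    print(row)  # the last stop of each trip gets None (SQL NULL)
```

The partition keeps the two trips apart, so the last stop of T1 does not pick up the first stop of T2.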

Related

Problem to calculate values from 1 table using group by in firebird

I have a logic problem to calculate the final value of this table:
https://i.stack.imgur.com/YPXXX.png
I need to count +1 for every row where the column TIPO has the value "E" and -1 for every row where it has "S", grouping by the columns Codigo and Configuracao.
Basically, I need simple stock control: the columns Codigo and Configuracao identify the product, and TIPO is the type of movement (S = OUT, E = IN).
Can anyone point me in the right direction?
untested but maybe this
select SUM(t1.TipoNumeric), t1.CODIGO, t1.CONFIGURACAO from (
select
case (TIPO)
when 'E' then 1
when 'S' then -1
else 0
end as TipoNumeric,
CODIGO,
CONFIGURACAO
from MyTable
) as t1
group by t1.CODIGO, t1.CONFIGURACAO
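A quick way to sanity-check the CASE-then-SUM idea is to run it on a few rows. The sketch below uses Python's built-in sqlite3 module rather than Firebird (an assumption for the sake of a runnable example), and the rows are invented. The CASE expression is folded directly into SUM, which should be equivalent to the derived-table form above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table MyTable (CODIGO integer, CONFIGURACAO integer, TIPO text)"
)
# Hypothetical movements: E = in (+1), S = out (-1).
conn.executemany(
    "insert into MyTable values (?, ?, ?)",
    [
        (1, 1, "E"), (1, 1, "E"), (1, 1, "S"),
        (2, 1, "S"), (2, 1, "E"), (2, 1, "S"),
    ],
)

balances = conn.execute(
    """
    select CODIGO, CONFIGURACAO,
           sum(case TIPO when 'E' then 1 when 'S' then -1 else 0 end) as saldo
    from MyTable
    group by CODIGO, CONFIGURACAO
    order by CODIGO, CONFIGURACAO
    """
).fetchall()

print(balances)
```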
Just add that +1/-1 column, perhaps?
alter table MyTable
add tipo_val computed by
(
decode( upper(TIPO), 'E', +1, 'S', -1 )
)
https://firebirdsql.org/file/documentation/html/en/refdocs/fblangref25/firebird-25-language-reference.html#fblangref25-ddl-tbl
https://www.firebirdsql.org/refdocs/langrefupd21-intfunc-decode.html
And then:
Select * from MyTable;
Select SUM(tipo_val), CODIGO, CONFIGURACAO
From MyTable
Group by 2, 3
P.S. do not use pictures to show your data.
Instead put them to http://dbfiddle.uk/?rdbms=firebird_3.0 as a script,
and then use Markdown Export there to copy both data and a hyperlink into your question text.
P.P.S. I believe your whole approach is wrong if you "need a simple stock control".
https://en.wikipedia.org/wiki/Double-entry_bookkeeping
https://medium.com/@RobertKhou/double-entry-accounting-in-a-relational-database-2b7838a5d7f8
I think your table should have columns like that:
surrogate row id, primary key, auto-incrementing integer, 32-bits or 64-bits
columns identifying your item. Usually that is, again, a single surrogate integer SKU (Stock Keeping Unit) referencing (see Foreign Keys) another "dictionary table". In your case it seems to be the two columns Codigo and Configuracao, but that also means you cannot add extra information ("attributes") about your items, like price or photo (read: database normalization). It also makes grouping harder for the Firebird engine than a single integer column would. Also, you did create an index on the item-identifying column(s), did you not? What is your query plan on those selects: do they use an index on Codigo and Configuracao, or an ad hoc external sort instead?
the timestamp of an operation, that is automatically set by the Firebird server to be current_timestamp, so you always know when exactly that row was inserted. Indexed, of course.
the computer user who added that row, again, automatically set by Firebird server to current_user or to an ID of a user in some stock_workers table you would create. Surely, indexed too.
some description of an operation, like contract number, or seller name, anything that would help you later to remember what real world event that row even describes. Being free-form text, it probably would not be indexed. But maybe you would eventually make some contracts or sellers table and add integer references (FK IDs) to those tables? That depends on exactly which kind of data would be repeated often enough to be worth extracting into extra indexed columns.
maybe a unit measure, maybe all your units forever would only be measured in pieces, in integer quantity. But maybe there would be some items measured in kilograms, meters, liters, etc?
finally, two integer (or float?) columns like Qty_Income and Qty_Outcome, where you would record how many items were added to or taken from your depot. There would be no E/S column! There would be two integer columns, and you would put the number into one or the other. Why? Read the articles about bookkeeping above!
With such a database schema your query would finally look like this:
select Sum(s.Qty_Income) as Credit, Sum(s.Qty_Outcome) as Debit,
Sum(s.Qty_Income) - Sum(s.Qty_Outcome) as Saldo,
min(g.Codigo), min(g.Configuracao)
from stock_movements s
join known_goods g on g.ID = s.SKU_ID
group by s.SKU_ID
And you would also be able to flexibly compose similar requests grouping by workers, or dates, or quantities (like, only care about BIG events like 1000 or more items added in one operation), or anything.
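For what it is worth, the layout described above could be sketched roughly like this. It is only an illustration under assumptions: it uses Python's built-in sqlite3 module instead of Firebird, the column name created_at and the index name are hypothetical, and the user and description columns are left out to keep it short. The table names and the final query follow the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table known_goods (
        ID integer primary key,
        Codigo text not null,
        Configuracao text not null
    );
    create table stock_movements (
        ID integer primary key autoincrement,   -- surrogate row id
        SKU_ID integer not null references known_goods (ID),
        created_at text not null default current_timestamp,  -- hypothetical name
        Qty_Income integer not null default 0,
        Qty_Outcome integer not null default 0
    );
    create index stock_movements_sku on stock_movements (SKU_ID);
    """
)
# Invented sample data.
conn.executemany(
    "insert into known_goods values (?, ?, ?)",
    [(1, "P100", "RED"), (2, "P200", "BLUE")],
)
conn.executemany(
    "insert into stock_movements (SKU_ID, Qty_Income, Qty_Outcome) "
    "values (?, ?, ?)",
    [(1, 10, 0), (1, 0, 3), (2, 5, 0), (2, 0, 5), (1, 4, 0)],
)

totals = conn.execute(
    """
    select sum(s.Qty_Income) as Credit, sum(s.Qty_Outcome) as Debit,
           sum(s.Qty_Income) - sum(s.Qty_Outcome) as Saldo,
           min(g.Codigo), min(g.Configuracao)
    from stock_movements s
    join known_goods g on g.ID = s.SKU_ID
    group by s.SKU_ID
    order by s.SKU_ID
    """
).fetchall()

print(totals)
```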

Creating a column of an SQL table to number records based on their order

I have a web program (PHP and JavaScript) that needs to display entries in the table based on how recently they were added. To accomplish this I want to have one column of the table represent the entry number while another would represent what it should be displayed at.
For example, if I had 10 records, ID=10 would correspond to Display=1 . I was wondering if there would be a simple way to update this, ordering by the entry ID and generating the display IDs accordingly.
Your question is a little vague, but here goes....
Normally IDs ascend, with the highest ID being the most recently added, so you can ORDER by ID desc in your query to determine which should be displayed. The results you get from the query will be the display order.
Why aren't you making use of SQLServer's default column values? Have a look here to see an example: Add default value of datetime field in SQL Server to a timestamp
For example, you have a table like this:
create table test (
entry_id int,
message varchar(100),
created_time datetime default GETDATE()
);
Then you can insert like
insert into test (entry_id, message) values (1, 'test1');
insert into test (entry_id, message) values (2, 'test2');
And select like
select entry_id, message from test order by created_time desc
There are lots of ways to do this. As the others have noted, it wouldn't be common practice to store the reverse or inverted id like this. You can get the display_id several ways. These come to mind quickly:
CREATE TABLE test (entry_id INT)
GO
INSERT INTO test VALUES (1),(2),(3)
GO
--if you trust your entry_id is truly sequential 1 to n you can reverse it for the display_id using a subquery
SELECT entry_id,
(SELECT MAX(entry_id) + 1 FROM test) - entry_id display_id
FROM test
--or a cte
;WITH cte AS (SELECT MAX(entry_id) + 1 max_id FROM test)
SELECT entry_id,
max_id - entry_id display_id
FROM test
CROSS APPLY
cte
--more generally you can generate a row number easily since sql server 2005
SELECT entry_id
,ROW_NUMBER() OVER (ORDER BY entry_id DESC) display_id
FROM test
You could use any of those methods in a view to add the display id. Normally I'd recommend you just let your presentation layer handle the numbering for display, but if you intend to query back against it you might want to persist it. I can only see storing it if the writes are infrequent relative to reads. You could update a "real" column using a trigger. You could also create a persisted computed column.
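As a rough illustration of the view idea, the sketch below wraps the ROW_NUMBER() variant in a view. It uses Python's built-in sqlite3 module in place of SQL Server (an assumption to keep it runnable; SQLite 3.25 or later is needed for window functions), and the view name test_display is made up. The ids deliberately have gaps, which is the case where subtracting from MAX(entry_id) stops giving 1 to n.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table test (entry_id integer)")
# Invented ids with gaps, e.g. after some rows were deleted.
conn.executemany("insert into test values (?)", [(1,), (2,), (5,), (9,)])

# Hypothetical view name; display_id is computed on every read.
conn.execute(
    """
    create view test_display as
    select entry_id,
           row_number() over (order by entry_id desc) as display_id
    from test
    """
)

display = conn.execute(
    "select entry_id, display_id from test_display order by display_id"
).fetchall()

print(display)
```

Because the numbering is computed on read, a newly inserted row shifts every display_id without any update to the table.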

How do I get row id of a row in sql server

I have one table CSBCA1_5_FPCIC_2012_EES207201222743, having two columns employee_id and employee_name
I have used following query
SELECT ROW_NUMBER() OVER (ORDER BY EMPLOYEE_ID) AS ID, EMPLOYEE_ID,EMPLOYEE_NAME
FROM CSBCA1_5_FPCIC_2012_EES207201222743
But, it returns the rows in ascending order of employee_id, but I need the rows in order they were inserted into the table.
SQL Server does not track the order of inserted rows, so there is no reliable way to get that information given your current table structure. Even if employee_id is an IDENTITY column, it is not 100% foolproof to rely on that for order of insertion (since you can fill gaps and even create duplicate ID values using SET IDENTITY_INSERT ON). If employee_id is an IDENTITY column and you are sure that rows aren't manually inserted out of order, you should be able to use this variation of your query to select the data in sequence, newest first:
SELECT
ROW_NUMBER() OVER (ORDER BY EMPLOYEE_ID DESC) AS ID,
EMPLOYEE_ID,
EMPLOYEE_NAME
FROM dbo.CSBCA1_5_FPCIC_2012_EES207201222743
ORDER BY ID;
You can make a change to your table to track this information for new rows, but you won't be able to derive it for your existing data (they will all be marked as inserted at the time you make this change).
ALTER TABLE dbo.CSBCA1_5_FPCIC_2012_EES207201222743
-- wow, who named this?
ADD CreatedDate DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP;
Note that this may break existing code that just does INSERT INTO dbo.whatever SELECT/VALUES() - e.g. you may have to revisit your code and define a proper, explicit column list.
There is a pseudocolumn called %%physloc%% that shows the physical address of the row.
See Equivalent of Oracle's RowID in SQL Server
SQL does not do that. The tuples in a table are not ordered by insertion date. A lot of people include a column that stores the date of insertion in order to get around this issue.

Best way to implement SQL Many-to-Many joins

I have three tables, and their relevant columns are:
tPerson
-> PersonID
tPersonStatusHistory
-> PersonStatusHistoryID
-> PersonID
-> StatusID
-> PersonStatusDate
Status
-> StatusID
I want to store a full history of all the Statuses that a Person has ever had. But I also want easy access to the current status.
Query to get the current status:
SELECT TOP 1 StatusID FROM tPersonStatusHistory
WHERE PersonID = ? ORDER BY PersonStatusDate DESC
What I want is a query that will fetch me a list of Person records, with their most recent StatusID as a column in the query.
We have tried the following approaches:
Including the above query as a sub-query in the select.
Adding a CurrentPersonStatusHistoryID column to the tPerson table and maintaining it using a computed column that calls a User-Defined-Function.
Maintaining the CurrentPersonStatusHistoryID column using a trigger on the tPersonStatusHistory table.
The query to pull up the Person records is quite high use, so I don't want to have to look up the History table each time. The trigger approach is closest to what I want, since the data is persisted in the Person table and is only changed when an update is made (which is by comparison not very often).
I find triggers difficult to maintain and I would prefer to stay away from them. I've also found that when doing an Insert-Select, or an Update query involving multiple records, the trigger is only called on the first record and not the others.
What I really want is to put some logic into the column definition of CurrentPersonStatusHistoryID, press Save and have it persisted and updated behind the scenes without my intervention.
Given that Many-to-Many relationships are common I was wondering if anyone else had come across a similar situation and had some insight into the highest performance, and preferably least hassle, way of implementing this.
Another approach is to use something like the following query, perhaps as a view. It will give you the most recent StatusID for each Person.
SELECT PersonID, StatusID
FROM (
SELECT PersonID, StatusID,
rank() OVER(PARTITION BY PersonID ORDER BY PersonStatusDate DESC) as rnk
FROM tPersonStatusHistory
) A
WHERE rnk = 1
I'm not sure that this satisfies your requirement for performance, but it's something you could look into.
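One thing worth checking with the rank() version is what happens when a person has two history rows with the same PersonStatusDate: rank() gives both of them 1, so that person comes back twice. The sketch below compares it with row_number() plus a tie-breaker on PersonStatusHistoryID. That tie-breaker is my assumption about which row should win, not something stated in the question. It runs on Python's built-in sqlite3 module with invented data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tPersonStatusHistory ("
    "PersonStatusHistoryID integer primary key, PersonID integer, "
    "StatusID integer, PersonStatusDate text)"
)
# Person 2 has two rows on the same date (a tie).
conn.executemany(
    "insert into tPersonStatusHistory values (?, ?, ?, ?)",
    [
        (1, 1, 10, "2020-01-01"),
        (2, 1, 20, "2021-01-01"),
        (3, 2, 30, "2021-05-05"),
        (4, 2, 40, "2021-05-05"),
    ],
)


def latest(window_fn, order_by):
    """Return (PersonID, StatusID) rows ranked first by the given window."""
    sql = f"""
        select PersonID, StatusID
        from (
            select PersonID, StatusID,
                   {window_fn}() over (
                       partition by PersonID order by {order_by}
                   ) as rnk
            from tPersonStatusHistory
        ) a
        where rnk = 1
        order by PersonID, StatusID
    """
    return conn.execute(sql).fetchall()


with_rank = latest("rank", "PersonStatusDate desc")
with_row_number = latest(
    "row_number", "PersonStatusDate desc, PersonStatusHistoryID desc"
)

print(with_rank)        # person 2 appears twice
print(with_row_number)  # exactly one row per person
```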

Maintaining a metadata table in SQL

Can someone give me some direction on how to tackle a scenario like this?
I have a User table which contains all the user information; UserID is the primary key on the User table. I have another table called, for example, Comments, which holds all the comments created by any user. The Comments table contains UserID as a foreign key. Now I have to rank the users based on the number of comments they added. The more comments a user has added, the higher the ranking. I am trying to see what the best way to do this would be.
I would prefer to have another table, which basically contains all the attributes or statistics of a user (it might have more attributes in future; right now only rank, based on comment count), rather than adding another column to the User table itself.
If I create another table called UserStats, with UserID as the foreign key and another column called Rank, there is a possibility that every time a user adds a comment, we might need to update the ranks. How do I write an SP that does this? I'm not even sure if this is the right way to do this.
This is not the right way to do this.
You don't want to be materializing those kinds of computed values until there is a performance problem - and you have options like Indexed Views to help you well before you get to the point of doing what you suggested.
Just create a View called UserRankings and have it look like:
SELECT c.UserId, COUNT(c.CommentId) [Ranking]
FROM Comments c
GROUP BY c.UserId
Not sure how you want to do your rankings, but you can also look at the RANK() and DENSE_RANK() functions in T-SQL: Ranking Functions (Transact-SQL)
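To make the difference between RANK() and DENSE_RANK() concrete, here is a small sketch that puts both on top of a UserRankings view like the one above. It uses Python's built-in sqlite3 module instead of SQL Server (an assumption for runnability; SQLite 3.25 or later), and the comment rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Comments (CommentId integer primary key, UserId integer)"
)
# Invented data: user 3 has 3 comments, users 2 and 5 have 2, user 7 has 1.
conn.executemany(
    "insert into Comments (UserId) values (?)",
    [(3,), (3,), (3,), (2,), (2,), (5,), (5,), (7,)],
)
conn.execute(
    """
    create view UserRankings as
    select c.UserId, count(c.CommentId) as Ranking
    from Comments c
    group by c.UserId
    """
)

ranked = conn.execute(
    """
    select UserId, Ranking,
           rank() over (order by Ranking desc) as rnk,
           dense_rank() over (order by Ranking desc) as dense_rnk
    from UserRankings
    order by Ranking desc, UserId
    """
).fetchall()

for row in ranked:
    print(row)
```

With ties, rank() skips ahead after the tied users (1, 2, 2, 4) while dense_rank() does not (1, 2, 2, 3).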
You could do this from a query
SELECT UserID,
COUNT(UserID) CntOfUserID
FROM UserComments
GROUP BY UserID
ORDER BY COUNT(UserID) DESC
You could also do this using a ROW_NUMBER
DECLARE @Comments TABLE(
UserID INT,
Comment VARCHAR(MAX)
)
INSERT INTO @Comments SELECT 3, 'Foo'
INSERT INTO @Comments SELECT 3, 'Bar'
INSERT INTO @Comments SELECT 3, 'Tada'
INSERT INTO @Comments SELECT 2, 'T'
INSERT INTO @Comments SELECT 2, 'G'
SELECT UserID,
ROW_NUMBER() OVER (ORDER BY COUNT(UserID) DESC) ID
FROM @Comments
GROUP BY UserID
GROUP BY UserID
Storing that kind of information is actually a bad idea. The count of comments per user is something that can be calculated at any given time quickly and easily. And if your columns are properly indexed (on the foreign key,) the count operation ought to happen very quickly.
The only reason you might want to persist metadata is if the load on your database is fast and furious and you simply cannot afford to run select queries with counts per request. And that load will also inform whether you simply add a column to your user table or create a whole separate table. (The latter solution being the one for the most extreme server loads.)
A few comments:
Yes, I think you should keep the "score" metadata somewhere, otherwise, you'd have to run the scoring calc each time, which could ultimately get expensive.
Second, I don't think you should calculate an actual "rank" (vs other users). Just calculate a "score" (based on the number of comments posted), then your query can determine "rank" by retrieving scores in descending order.
Third, I would probably make a trigger that updates the "score" in the metadata table, based on each insert into the comments table.
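If you do go the trigger route, the idea could look roughly like the sketch below. It is written against Python's built-in sqlite3 module, so the trigger syntax is SQLite's (row-level, using new.UserID). A SQL Server trigger fires once per statement and would have to aggregate over the inserted pseudo-table instead. The Score column and the trigger name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table Comments (
        CommentId integer primary key,
        UserID integer not null,
        Comment text
    );
    create table UserStats (
        UserID integer primary key,
        Score integer not null default 0   -- hypothetical column
    );
    -- Hypothetical trigger: bump the user's score on every new comment.
    create trigger comments_after_insert after insert on Comments
    begin
        insert or ignore into UserStats (UserID, Score)
        values (new.UserID, 0);
        update UserStats set Score = Score + 1 where UserID = new.UserID;
    end;
    """
)
conn.executemany(
    "insert into Comments (UserID, Comment) values (?, ?)",
    [(3, "Foo"), (3, "Bar"), (3, "Tada"), (2, "T"), (2, "G")],
)

# The "rank" falls out of the ordering; only the score is stored.
scores = conn.execute(
    "select UserID, Score from UserStats order by Score desc, UserID"
).fetchall()

print(scores)
```

Deletes would need a matching trigger to decrement the score, which is part of the maintenance cost the other answers warn about.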