T-SQL Incremental Update Group Number

I have a table like this. For each student, I want to increment the semester number every time the Term column changes. When the student ID changes, the semester number should restart from 1. Semester_Number is NULL for now and I want to update it. Is there an easy, fast solution? I read about DENSE_RANK, but I am not sure whether it applies here.
StudentId | Term |Course_Number| Semester_Number(Expected)
1 0010 ENG 1
1 0010 AGR 1
1 0020 MAT 2
1 0020 ... 2
1 0110 ... 3
1 0110 ... 3
2 0010 ENG 1
2 0010 MAT 1
2 0020 PHY 2
3 0010 MAT 1
3 ...
3

If I understand correctly, you can do:
select t.*,
dense_rank() over (order by studentid, term) as semester_number
from t;
On a closer reading, the numbering has to restart for each student, so partition by studentid:
select t.*,
dense_rank() over (partition by studentid order by term) as semester_number
from t;
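
To check the partitioned version against the sample rows, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for SQL Server (DENSE_RANK behaves the same way in both), and it loads only the rows whose course is spelled out in the question.

```python
import sqlite3

# Assumption: SQLite (3.25+ for window functions) stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (studentid INTEGER, term TEXT, course_number TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "0010", "ENG"), (1, "0010", "AGR"), (1, "0020", "MAT"),
        (2, "0010", "ENG"), (2, "0010", "MAT"), (2, "0020", "PHY"),
        (3, "0010", "MAT"),
    ],
)

# PARTITION BY restarts the numbering for every student;
# DENSE_RANK gives equal terms the same number and leaves no gaps.
rows = conn.execute(
    """
    SELECT studentid, term, course_number,
           DENSE_RANK() OVER (PARTITION BY studentid ORDER BY term) AS semester_number
    FROM t
    ORDER BY studentid, term, course_number
    """
).fetchall()

for row in rows:
    print(row)
```

Each student starts again at 1, and both courses taken in the same term share one semester number.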

Related

Finding next node in T-SQL

Suppose I have the following table (Date + CustNum is an unique index)
RowId | Date | CustNum
1 | 1-Jan-2021 | 0001
2 | 1-Jan-2021 | 0002
3 | 1-Jan-2021 | 0004
4 | 2-Jan-2021 | 0001
5 | 3-Jan-2021 | 0001
6 | 3-Jan-2021 | 0004
7 | 7-Jan-2021 | 0004
The table has ~500K records.
What is the best method to get the previous and next rowid of the CustNum?
RowId | Date | CustNum | CustPrevRowId | CustNextRowId
1 | 1-Jan-2021 | 0001 | | 4
2 | 1-Jan-2021 | 0002 | |
3 | 1-Jan-2021 | 0004 | | 6
4 | 2-Jan-2021 | 0001 | 1 | 5
5 | 3-Jan-2021 | 0001 | 4 |
6 | 3-Jan-2021 | 0004 | 3 | 7
7 | 7-Jan-2021 | 0004 | 6 |
I tried correlated subqueries, but ran into a performance issue:
SELECT T1.*,
(SELECT TOP 1 RowID FROM T T2 WHERE T2.CustNum = T1.CustNum AND T2.Date < T1.Date ORDER BY DATE DESC) AS CustPrevRowId,
(SELECT TOP 1 RowID FROM T T2 WHERE T2.CustNum = T1.CustNum AND T2.Date > T1.Date ORDER BY DATE ) AS CustNextRowId
FROM T T1
As already pointed out in the comments, you can use two window functions:
LAG retrieves the previous row in the same partition, given a specified order
LEAD does the same, but gets the following row instead
In this specific case, you want to:
partition on "CustNum" (since you want the previous and next row for each customer number)
order by the date field (so that the previous and next RowId follow date order)
SELECT *, LAG([RowId]) OVER(PARTITION BY [CustNum] ORDER BY [Date]) AS CustPrevRowId,
LEAD([RowId]) OVER(PARTITION BY [CustNum] ORDER BY [Date]) AS CustNextRowId
FROM tab
ORDER BY RowId
Note: the last ORDER BY RowId clause is not necessary.
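
The LAG/LEAD query can be checked against the seven sample rows with a short sketch. The assumptions here are mine: SQLite stands in for SQL Server, and the dates are stored as ISO text so that they sort chronologically.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for SQL Server; ISO date strings sort correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (RowId INTEGER PRIMARY KEY, [Date] TEXT, CustNum TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [
        (1, "2021-01-01", "0001"), (2, "2021-01-01", "0002"),
        (3, "2021-01-01", "0004"), (4, "2021-01-02", "0001"),
        (5, "2021-01-03", "0001"), (6, "2021-01-03", "0004"),
        (7, "2021-01-07", "0004"),
    ],
)

# LAG/LEAD look one row back/forward inside each customer's partition;
# they return NULL (None in Python) when there is no such row.
neighbours = conn.execute(
    """
    SELECT RowId,
           LAG(RowId)  OVER (PARTITION BY CustNum ORDER BY [Date]) AS CustPrevRowId,
           LEAD(RowId) OVER (PARTITION BY CustNum ORDER BY [Date]) AS CustNextRowId
    FROM tab
    ORDER BY RowId
    """
).fetchall()

for row in neighbours:
    print(row)
```

Because both functions share the same PARTITION BY and ORDER BY, the engine can compute them in a single pass per customer instead of running two subqueries per row.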

Row_Number Sybase SQL Anywhere change on multiple condition

I have a selection that returns
EMP DOC DATE
1 78 01/01
1 96 02/01
1 96 02/01
1 105 07/01
2 4 04/01
2 7 04/01
3 45 07/01
3 45 07/01
3 67 09/01
I want to add a row number (I'll use it as a primary id) that restarts whenever "EMP" changes and stays the same when the DOC is the same as the previous one, like:
EMP DOC DATE ID
1 78 01/01 1
1 96 02/01 2
1 96 02/01 2
1 105 07/01 3
2 4 04/01 1
2 7 04/01 2
3 45 07/01 1
3 45 07/01 1
3 67 09/01 2
In SQL Server I could use LAG to compare with the previous DOC, but I can't find a way to do that in Sybase SQL Anywhere. I'm using ROW_NUMBER partitioned by "EMP", but it's not what I need.
SELECT EMP, DOC, DATE, ROW_NUMBER() OVER (PARTITION BY EMP ORDER BY EMP, DOC, DATE) ID -- <== THIS WILL CHANGE THE ROW NUMBER ON SAME DOC ON SAME EMP, SO WOULD NOT WORK.
Anyone have a direction for this?
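You seem to want dense_rank():
select
emp,
doc,
date,
dense_rank() over(partition by emp order by doc) id
from mytable
This numbers rows within groups having the same emp, and increments only when doc changes, without gaps. Ordering by date alone would not match the expected output: docs 4 and 7 of emp 2 share the date 04/01 and would both get id 1.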
If performance is not an issue in your case, you can try something like:
SELECT tx.EMP, tx.DOC, tx.DATE, y.ID
FROM table_xxx tx
JOIN (SELECT EMP, DOC, ROW_NUMBER() OVER (PARTITION BY EMP ORDER BY DOC) ID
FROM (SELECT EMP, DOC FROM table_xxx GROUP BY EMP, DOC) x) y
ON tx.EMP = y.EMP AND tx.DOC = y.DOC
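
As a quick check against the expected ID column in the question, here is a sketch that runs DENSE_RANK partitioned by EMP and ordered by DOC over the nine sample rows. SQLite is used only as a convenient stand-in for SQL Anywhere (an assumption), and the table name mytable is a placeholder.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for Sybase SQL Anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (EMP INTEGER, DOC INTEGER, [DATE] TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, 78, "01/01"), (1, 96, "02/01"), (1, 96, "02/01"), (1, 105, "07/01"),
        (2, 4, "04/01"), (2, 7, "04/01"),
        (3, 45, "07/01"), (3, 45, "07/01"), (3, 67, "09/01"),
    ],
)

# The ID restarts per EMP and only moves on when DOC changes,
# so repeated DOC values keep the same ID.
ranked = conn.execute(
    """
    SELECT EMP, DOC, [DATE],
           DENSE_RANK() OVER (PARTITION BY EMP ORDER BY DOC) AS ID
    FROM mytable
    ORDER BY EMP, DOC
    """
).fetchall()

for row in ranked:
    print(row)
```

Note that ordering by DOC numbers the documents in DOC order. In the sample data that coincides with date order, but it would not if a lower DOC number appeared on a later date.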

Split a column based on a character in BigQuery

I have a table as shown below on BigQuery
Name | Score
Tim | 63 > 89 > 90
James| 67 > 44
I want to split the Score column into N separate columns where N is the maximum score length in the entire table. I would like the table to be as follow.
Name| Score_1 | Score_2 | Score_3
Tim | 63 | 89 | 90
James| 67 | 44 | 0 or NA
I tried the Split command but I end up doing a new row for each Name-Score combination.
For BigQuery Standard SQL
Below is a simple case that assumes you know the expected max score length in advance (3 in the example below)
#standardSQL
WITH `project.dataset.your_table` AS (
SELECT 'Tim' name, '63 > 89 > 90' score UNION ALL
SELECT 'James', '67 > 44'
)
SELECT
name,
score[SAFE_OFFSET(0)] AS score_1,
score[SAFE_OFFSET(1)] AS score_2,
score[SAFE_OFFSET(2)] AS score_3
FROM (
SELECT name, SPLIT(score, ' > ') score
FROM `project.dataset.your_table`
)
with result
Row name score_1 score_2 score_3
1 Tim 63 89 90
2 James 67 44 null
Of course, the above approach means that if you have many scores (10, 20 or more) you will need to add a corresponding number of extra lines, like below
score[SAFE_OFFSET(20)] AS score_21
So, the above gives you the output schema you asked for.
At the same time, the version below makes more sense to me and in most practical cases is the better option, because it does not depend on the maximum number of scores:
#standardSQL
WITH `project.dataset.your_table` AS (
SELECT 'Tim' name, '63 > 89 > 90' score UNION ALL
SELECT 'James', '67 > 44'
)
SELECT name, score
FROM `project.dataset.your_table`, UNNEST(SPLIT(score, ' > ')) score
with result
Row name score
1 Tim 63
2 Tim 89
3 Tim 90
4 James 67
5 James 44
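
If the reshaping into N columns is done client-side rather than in BigQuery, the padding step can be sketched in plain Python. This is not BigQuery code: split_scores is a hypothetical helper, and None plays the role that NULL plays with SAFE_OFFSET.

```python
def split_scores(rows, sep=" > "):
    """Split each score string and pad every row to the widest one with None."""
    parts = [(name, score.split(sep)) for name, score in rows]
    # N is the longest score list found in the data, not a hard-coded value.
    width = max((len(p) for _, p in parts), default=0)
    return [(name, *p, *[None] * (width - len(p))) for name, p in parts]


table = split_scores([("Tim", "63 > 89 > 90"), ("James", "67 > 44")])
for row in table:
    print(row)
```

The width is derived from the data itself, which is the part a fixed list of SAFE_OFFSET columns cannot do.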

How to select lowest value for each subject?

subject_ID Date Test_id value
------- --------- ----- -----
1 1/1/2000 A 50
1 1/1/2000 B 10
1 1/2/2000 A 55
1 1/2/2000 B 09
2 1/1/2000 A 51
2 1/1/2000 B 13
2 1/2/2000 A 48
2 1/2/2000 B 08
Hi All,
I have a question about the scenario above. As you can see, I have test results that come in daily for each subject. I'm trying to find a way to select the lowest value for each test within a defined period of time, so the final table will look like this:
subject_ID Date Test_id value
------- --------- ----- -----
1 1/1/2000 A 50
1 1/2/2000 B 09
2 1/2/2000 A 48
2 1/2/2000 B 08
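I'm not sure what technology you are using, but assuming SQL, something using a GROUP BY will work. Group by subject_ID and Test_id only: if Date is also in the GROUP BY, you get one row per day instead of one row per test. Note that this returns the minimum value but not the date on which it occurred.
SELECT subject_ID
, Test_id
, MIN(value) AS value
FROM YourTable
GROUP BY subject_ID
, Test_id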
ANSI standard SQL supports the row_number() function. With this function, you can do:
select t.*
from (select t.*,
row_number() over (partition by subject_id, test_id order by value asc) as seqnum
from t
) t
where seqnum = 1;
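
The row_number() approach keeps the whole row, including the date of the lowest value. Here is a minimal sketch against the eight sample rows, assuming SQLite as a stand-in for whichever database is in use and integer values (so 09 is stored as 9).

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for the asker's unspecified database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (subject_id INTEGER, [Date] TEXT, test_id TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "1/1/2000", "A", 50), (1, "1/1/2000", "B", 10),
        (1, "1/2/2000", "A", 55), (1, "1/2/2000", "B", 9),
        (2, "1/1/2000", "A", 51), (2, "1/1/2000", "B", 13),
        (2, "1/2/2000", "A", 48), (2, "1/2/2000", "B", 8),
    ],
)

# Number the rows of each (subject, test) pair from lowest value upward,
# then keep only the first row of each pair.
lowest = conn.execute(
    """
    SELECT subject_id, [Date], test_id, value
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY subject_id, test_id
                                    ORDER BY value ASC) AS seqnum
          FROM t) AS ranked
    WHERE seqnum = 1
    ORDER BY subject_id, test_id
    """
).fetchall()

for row in lowest:
    print(row)
```

To restrict the search to a defined period, a WHERE clause on the date would go in the inner query, before the rows are numbered.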

SQL Teradata query

I have a table abc with many records and the columns dept, name and marks:
dept | name | marks |
science abc 50
science cvv 21
science cvv 22
maths def 60
maths abc 21
maths def 62
maths ddd 90
I need to order by dept and name, ranking the names as ddd = 1, cvv = 2, abc = 3, anything else = 4, and then find the maximum mark of each individual. The expected result is:
dept | name | marks |
science cvv 22
science abc 50
maths ddd 90
maths abc 21
maths def 62
How can I do it?
SELECT
dept,
name,
MAX(marks) AS mark
FROM
yourTable
GROUP BY
dept,
name
ORDER BY
CASE WHEN name = 'ddd' THEN 1
WHEN name = 'cvv' THEN 2
WHEN name = 'abc' THEN 3
ELSE 4 END
Or, preferably, have another table that includes the sorting order.
SELECT
yourTable.dept,
yourTable.name,
MAX(yourTable.marks) AS mark
FROM
yourTable
INNER JOIN
anotherTable
ON yourTable.name = anotherTable.name
GROUP BY
yourTable.dept,
yourTable.name,
anotherTable.sortingOrder
ORDER BY
anotherTable.sortingOrder
This should work:
SELECT Dept, Name, MAX(marks) AS mark
FROM yourTable
GROUP BY Dept, Name
ORDER BY CASE WHEN Name = 'ddd' THEN 1
WHEN Name = 'cvv' THEN 2
WHEN Name = 'abc' THEN 3
ELSE 4 END
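
The CASE-based ordering can be tried on the sample rows with the sketch below. SQLite stands in for Teradata here (an assumption), and dept is added as a tie-breaker (my addition) so that the two abc rows come out in a fixed order.

```python
import sqlite3

# Assumption: SQLite stands in for Teradata; the table name yourTable is a placeholder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (dept TEXT, name TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        ("science", "abc", 50), ("science", "cvv", 21), ("science", "cvv", 22),
        ("maths", "def", 60), ("maths", "abc", 21), ("maths", "def", 62),
        ("maths", "ddd", 90),
    ],
)

# MAX(marks) collapses each (dept, name) pair to its best mark;
# the CASE expression maps names to the custom sort rank.
best = conn.execute(
    """
    SELECT dept, name, MAX(marks) AS mark
    FROM yourTable
    GROUP BY dept, name
    ORDER BY CASE WHEN name = 'ddd' THEN 1
                  WHEN name = 'cvv' THEN 2
                  WHEN name = 'abc' THEN 3
                  ELSE 4 END,
             dept
    """
).fetchall()

for row in best:
    print(row)
```

This sorts by the name rank across all departments. The expected result in the question keeps each department together, which would need dept ahead of the CASE expression in the ORDER BY.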