Add unique id to groups of ordered transactions [closed] - sql

I currently have a table with transactions that are sequentially ordered for each group like so:
| transaction_no | value |
|----------------|-------|
| 1 | 8 |
| 2 | 343 |
| 3 | 28 |
| 4 | 102 |
| 1 | 30 |
| 2 | 5 |
| 3 | 100 |
| 1 | 12 |
| 2 | 16 |
| 3 | 28 |
| 4 | 157 |
| 5 | 125 |
However, I'm interested in adding another column that assigns a unique ID to each
grouping (a set of transactions where the transaction_no starts at 1 and ends at x,
where the transaction_no immediately after x is 1). So the goal is a table like this:
| transaction_no | value | stmt_id |
|----------------|-------|---------|
| 1 | 8 | 1001 |
| 2 | 343 | 1001 |
| 3 | 28 | 1001 |
| 4 | 102 | 1001 |
| 1 | 30 | 1002 |
| 2 | 5 | 1002 |
| 3 | 100 | 1002 |
| 1 | 12 | 1003 |
| 2 | 16 | 1003 |
| 3 | 28 | 1003 |
| 4 | 157 | 1003 |
| 5 | 125 | 1003 |
How would I do this?

This is a variation of the gaps-and-islands problem. For it to be solvable, as commented by Gordon Linoff, you need a column that can be used to order the rows. I assume that such a column exists and is called id.
The typical solution involves numbering the records and computing a window sum that counts the rows continuing the previous row's sequence. The difference between the overall row number and that window sum stays constant within a group and increases by one each time a new group starts.
Consider the following query:
select
    id,
    transaction_no,
    value,
    1000
        + rn
        - sum(case when transaction_no = lag_transaction_no + 1 then 1 else 0 end)
            over(order by id) grp
from (
    select
        t.*,
        row_number() over(order by id) rn,
        lag(transaction_no) over(order by id) lag_transaction_no
    from mytable t
) t
With this sample data:
id | transaction_no | value
-: | -------------: | ----:
1 | 1 | 8
2 | 2 | 343
3 | 3 | 28
4 | 4 | 102
5 | 1 | 30
6 | 2 | 5
7 | 3 | 100
8 | 1 | 12
9 | 2 | 16
10 | 3 | 28
11 | 4 | 157
12 | 5 | 125
The query returns:
id | transaction_no | value | grp
-: | -------------: | ----: | ---:
1 | 1 | 8 | 1001
2 | 2 | 343 | 1001
3 | 3 | 28 | 1001
4 | 4 | 102 | 1001
5 | 1 | 30 | 1002
6 | 2 | 5 | 1002
7 | 3 | 100 | 1002
8 | 1 | 12 | 1003
9 | 2 | 16 | 1003
10 | 3 | 28 | 1003
11 | 4 | 157 | 1003
12 | 5 | 125 | 1003
Demo on SQL Server 2012 DB Fiddle
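As a cross-check, here is a minimal sketch of a simpler variant of the same idea, not the query from the answer above. It assumes, as the question states, that every group starts with transaction_no = 1, so a running count of those rows is enough. The in-memory SQLite database (version 3.25 or later, for window functions) and the auto-generated id column are assumptions made only so the example is runnable.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; `id` is the ordering column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mytable (id integer primary key, transaction_no integer, value integer)"
)
rows = [(1, 8), (2, 343), (3, 28), (4, 102),
        (1, 30), (2, 5), (3, 100),
        (1, 12), (2, 16), (3, 28), (4, 157), (5, 125)]
# id is filled in automatically as 1..12 in insertion order
conn.executemany("insert into mytable (transaction_no, value) values (?, ?)", rows)

query = """
select
    id,
    transaction_no,
    value,
    -- running count of group starts: every transaction_no = 1 opens a new group
    1000 + sum(case when transaction_no = 1 then 1 else 0 end)
               over (order by id) as stmt_id
from mytable
order by id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

This variant relies on each group beginning at 1. The lag-based query is the safer choice if a group could start at some other number.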

Related

SQL - Create number of categories based on pre-defined number of splits

I am using BigQuery, and trying to assign categorical values to each of my records, based on the number of 'splits' assigned to it.
The table has a cumulative count of records, grouped at the STR level - i.e., if there are 4 SKUs at 2 STR, the SKUs will be labeled 1,2,3,4. Each STR is assigned a SPLIT value, so if the STR has a SPLIT value of 2, I want it to split its SKUs into 2 categories. I want to create another column that would assign SKUs labeled 1-2 as '1', and SKUs labeled 3-4 as '2'. (The actual data is on a much larger scale, but thought this would be easier.)
+-----+------+---------------+--------+
| STR | SKU | SKU_ROW_COUNT | SPLITS |
+-----+------+---------------+--------+
| 1 | 1230 | 1 | 3 |
| 1 | 1231 | 2 | 3 |
| 1 | 1232 | 3 | 3 |
| 1 | 1233 | 4 | 3 |
| 1 | 1234 | 5 | 3 |
| 1 | 1235 | 6 | 3 |
| 2 | 1310 | 1 | 2 |
| 2 | 1311 | 2 | 2 |
| 2 | 1312 | 3 | 2 |
| 2 | 1313 | 4 | 2 |
| 3 | 2345 | 1 | 1 |
| 3 | 2346 | 2 | 1 |
| 3 | 2347 | 3 | 1 |
+-----+------+---------------+--------+
The SPLITS column is dynamic, ranging from 1 to 3. The number of SKUs in each category should be relatively equal, but that's not a priority as much as just the number of groups that are created. Ideally, the final table with the new column (HOST_NUMBER) would look something like this:
+-----+------+---------------+--------+-------------+
| STR | SKU | SKU_ROW_COUNT | SPLITS | HOST_NUMBER |
+-----+------+---------------+--------+-------------+
| 1 | 1230 | 1 | 3 | 1 |
| 1 | 1231 | 2 | 3 | 1 |
| 1 | 1232 | 3 | 3 | 2 |
| 1 | 1233 | 4 | 3 | 2 |
| 1 | 1234 | 5 | 3 | 3 |
| 1 | 1235 | 6 | 3 | 3 |
| 2 | 1310 | 1 | 2 | 1 |
| 2 | 1311 | 2 | 2 | 1 |
| 2 | 1312 | 3 | 2 | 2 |
| 2 | 1313 | 4 | 2 | 2 |
| 3 | 2345 | 1 | 1 | 1 |
| 3 | 2346 | 2 | 1 | 1 |
| 3 | 2347 | 3 | 1 | 1 |
+-----+------+---------------+--------+-------------+
You can use window functions and arithmetic:
select
t.*,
1 + floor((sku_row_count - 1) * splits / count(*) over(partition by str)) host_number
from mytable t
order by sku
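To see how the arithmetic spreads rows over the groups, here is a small sketch that runs the same formula against the sample data. It assumes an in-memory SQLite database (3.25 or later) in place of BigQuery, and it relies on SQLite's integer division instead of floor(), which gives the same result here because all operands are non-negative integers.

```python
import sqlite3

# Assumption: SQLite stands in for BigQuery; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mytable (str integer, sku integer, sku_row_count integer, splits integer)"
)
data = [
    (1, 1230, 1, 3), (1, 1231, 2, 3), (1, 1232, 3, 3),
    (1, 1233, 4, 3), (1, 1234, 5, 3), (1, 1235, 6, 3),
    (2, 1310, 1, 2), (2, 1311, 2, 2), (2, 1312, 3, 2), (2, 1313, 4, 2),
    (3, 2345, 1, 1), (3, 2346, 2, 1), (3, 2347, 3, 1),
]
conn.executemany("insert into mytable values (?, ?, ?, ?)", data)

query = """
select
    t.*,
    -- zero-based position, scaled by splits / rows-in-store, truncated to an integer
    1 + (sku_row_count - 1) * splits / count(*) over (partition by str) as host_number
from mytable t
order by sku
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

For store 1 (6 rows, 3 splits) the scaled positions are 0, 0.5, 1, 1.5, 2, 2.5, which truncate to 0, 0, 1, 1, 2, 2 and give host numbers 1, 1, 2, 2, 3, 3.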
Actually, ntile() seems to do exactly what you want - and you don't even need the sku_row_count column (which basically mimics row_number() anyway):
select
t.*,
ntile(splits) over(partition by str order by sku) host_number
from mytable t
order by sku
If the ordering of the values in the groups doesn't matter, just use modulo arithmetic:
select t.*, (SKU_ROW_COUNT % SPLITS) as split_group
from t
Below is for BigQuery Standard SQL
#standardSQL
SELECT *, 1 + MOD(SKU_ROW_COUNT, SPLITS) AS HOST_NUMBER
FROM `project.dataset.table`
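The modulo variants assign groups in a round-robin fashion rather than in contiguous blocks. The sketch below illustrates this on the sample data, assuming an in-memory SQLite database in place of BigQuery and using the % operator, which in SQLite plays the role of MOD().

```python
import sqlite3

# Assumption: SQLite stands in for BigQuery; the table name `t` is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (str integer, sku integer, sku_row_count integer, splits integer)"
)
data = [
    (1, 1230, 1, 3), (1, 1231, 2, 3), (1, 1232, 3, 3),
    (1, 1233, 4, 3), (1, 1234, 5, 3), (1, 1235, 6, 3),
    (2, 1310, 1, 2), (2, 1311, 2, 2), (2, 1312, 3, 2), (2, 1313, 4, 2),
    (3, 2345, 1, 1), (3, 2346, 2, 1), (3, 2347, 3, 1),
]
conn.executemany("insert into t values (?, ?, ?, ?)", data)

query = """
select str, sku, sku_row_count, splits,
       -- remainder cycles through 0 .. splits-1, so add 1 for 1-based groups
       1 + (sku_row_count % splits) as host_number
from t
order by sku
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each store still gets exactly SPLITS distinct groups of equal size where possible, but consecutive SKUs land in different groups, so this only fits when the ordering within groups does not matter.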

writing SQL query to show result in specific order

I have this table
+----+--------+------------+-----------+
| Id | day_id | subject_id | period_Id |
+----+--------+------------+-----------+
| 1 | 1 | 1 | 1 |
| 2 | 1 | 2 | 2 |
| 8 | 2 | 6 | 1 |
| 9 | 2 | 7 | 2 |
| 15 | 3 | 3 | 1 |
| 16 | 3 | 4 | 2 |
| 22 | 4 | 5 | 1 |
| 23 | 4 | 5 | 2 |
| 24 | 4 | 6 | 3 |
| 29 | 5 | 8 | 1 |
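| 30 | 5 | 1 | 2 |
+----+--------+------------+-----------+
and I want to reorder it into something like this:
+----+--------+------------+-----------+
| Id | day_id | subject_id | period_Id |
+----+--------+------------+-----------+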
| 1 | 1 | 1 | 1 |
| 8 | 2 | 6 | 1 |
| 15 | 3 | 3 | 1 |
| 22 | 4 | 5 | 1 |
| 29 | 5 | 8 | 1 |
| 2 | 1 | 2 | 2 |
| 9 | 2 | 7 | 2 |
| 16 | 3 | 4 | 2 |
| 23 | 4 | 5 | 2 |
| 30 | 5 | 1 | 2 |
+----+--------+------------+-----------+
So, I want to choose one period with a different subject each day, and do this for a number of weeks, so that the first subject does not come up again until all subjects have been chosen.
You can ORDER BY period_id first and then by day_id:
SELECT *
FROM your_table
ORDER BY period_Id, day_Id
LiveDemo
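A quick way to check the two-level sort is to run it on the sample rows. The sketch below assumes an in-memory SQLite database and reuses the table name your_table from the answer.

```python
import sqlite3

# Assumption: SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table your_table (Id integer, day_id integer, subject_id integer, period_Id integer)"
)
data = [
    (1, 1, 1, 1), (2, 1, 2, 2), (8, 2, 6, 1), (9, 2, 7, 2),
    (15, 3, 3, 1), (16, 3, 4, 2), (22, 4, 5, 1), (23, 4, 5, 2),
    (24, 4, 6, 3), (29, 5, 8, 1), (30, 5, 1, 2),
]
conn.executemany("insert into your_table values (?, ?, ?, ?)", data)

# Sort by period first, then by day within each period
result = conn.execute(
    "SELECT * FROM your_table ORDER BY period_Id, day_id"
).fetchall()
ids = [row[0] for row in result]
print(ids)
```

The row with Id 24 is the only one in period 3, so it sorts after all period 1 and period 2 rows.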

How to generate merit list from exam results in SQL Server

I'm using SQL Server 2008 R2. I have a table called tstResult in my database.
AI | SubID | StudID | StudName | TotalMarks | ObtainedMarks
--------------------------------------------------------
1 | 1 | 1 | Jakir | 100 | 90
2 | 1 | 2 | Rubel | 100 | 75
3 | 1 | 3 | Ruhul | 100 | 82
4 | 1 | 4 | Beauty | 100 | 82
5 | 1 | 5 | Bulbul | 100 | 96
6 | 1 | 6 | Ripon | 100 | 82
7 | 1 | 7 | Aador | 100 | 76
8 | 1 | 8 | Jibon | 100 | 80
9 | 1 | 9 | Rahaat | 100 | 82
Now I want a SELECT query that generates a merit list according to the obtained marks. In this query, the obtained marks "96" should be at the top of the merit list, and all the "82" marks should be placed one after another with the same position. Something like this:
| StudID | StudName | TotalMarks | ObtainedMarks | Merit List
----------------------------------------------------------
| 5 | Bulbul | 100 | 96 | 1
| 1 | Jakir | 100 | 90 | 2
| 9 | Rahaat | 100 | 82 | 3
| 3 | Ruhul | 100 | 82 | 3
| 4 | Beauty | 100 | 82 | 3
| 6 | Ripon | 100 | 82 | 3
| 8 | Jibon | 100 | 80 | 4
| 7 | Aador | 100 | 76 | 5
| 2 | Rubel | 100 | 75 | 6
You need to use dense_rank(), which gives tied marks the same rank without leaving gaps in the numbering:
;with cte as
(
    select *, dense_rank() over (order by ObtainedMarks desc) as Merit_List
    from tstResult
)
select * from cte order by Merit_List
Alternatively, if only the ordering matters and no rank column is needed:
select columns from tstResult order by ObtainedMarks desc
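The difference between dense_rank() and rank() is easy to miss, so here is a minimal sketch that computes both on the sample data. It assumes an in-memory SQLite database (3.25 or later) instead of SQL Server 2008 R2. The plain_rank column and the StudID tie-breaker in the final ORDER BY are additions for illustration only.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; table and columns follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tstResult (AI integer, SubID integer, StudID integer, "
    "StudName text, TotalMarks integer, ObtainedMarks integer)"
)
data = [
    (1, 1, 1, "Jakir", 100, 90), (2, 1, 2, "Rubel", 100, 75),
    (3, 1, 3, "Ruhul", 100, 82), (4, 1, 4, "Beauty", 100, 82),
    (5, 1, 5, "Bulbul", 100, 96), (6, 1, 6, "Ripon", 100, 82),
    (7, 1, 7, "Aador", 100, 76), (8, 1, 8, "Jibon", 100, 80),
    (9, 1, 9, "Rahaat", 100, 82),
]
conn.executemany("insert into tstResult values (?, ?, ?, ?, ?, ?)", data)

query = """
with cte as (
    select StudID, StudName, ObtainedMarks,
           dense_rank() over (order by ObtainedMarks desc) as Merit_List,
           rank()       over (order by ObtainedMarks desc) as plain_rank
    from tstResult
)
-- StudID is only a tie-breaker so the output order is deterministic
select * from cte order by Merit_List, StudID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With four students tied on 82, dense_rank() continues with 4 for the next mark, while rank() skips ahead to 7.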

Ask about query in sql server

I have a table like this:
| ID | id_number | a | b |
| 1 | 1 | 0 | 215 |
| 2 | 2 | 28 | 8952 |
| 3 | 3 | 10 | 2000 |
| 4 | 1 | 0 | 215 |
| 5 | 1 | 0 |10000 |
| 6 | 3 | 10 | 5000 |
| 7 | 2 | 3 |90933 |
I want to sum a*b over the rows that share the same id_number. What query returns this value for every id_number? For example, the result should look like this:
| ID | id_number | result |
| 1 | 1 | 0 |
| 2 | 2 | 523455 |
| 3 | 3 | 70000 |
This is a simple aggregation query:
select id_number, sum(a*b) as result
from t
group by id_number
I'm not sure what the first column is for.
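The aggregation can be checked against the expected figures with a short sketch. It assumes an in-memory SQLite database and the table name t from the answer. The result alias is added for readability.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (ID integer, id_number integer, a integer, b integer)")
data = [
    (1, 1, 0, 215), (2, 2, 28, 8952), (3, 3, 10, 2000), (4, 1, 0, 215),
    (5, 1, 0, 10000), (6, 3, 10, 5000), (7, 2, 3, 90933),
]
conn.executemany("insert into t values (?, ?, ?, ?)", data)

# Multiply per row, then add the products up within each id_number
query = """
select id_number, sum(a * b) as result
from t
group by id_number
order by id_number
"""
totals = dict(conn.execute(query).fetchall())
print(totals)
```

For id_number 2 this is 28 * 8952 + 3 * 90933 = 250656 + 272799 = 523455, matching the expected output.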

SQL report show result in one line of group

I am trying to reach the following result:
ID | Part | QTY| Boxes| Reference
1 | ABC123 | 20 | 0 | REF0001
2 | ABC345 | 10 | 0 | REF0001
3 | ABC487 | 5 | 1 | REF0001
4 | SEF453 | 4 | 0 | REF0002
5 | ABDS12 | 82 | 4 | REF0002
6 | EFR488 | 64 | 0 | REF0003
7 | XCV345 | 58 | 0 | REF0003
8 | SSFS33 | 23 | 3 | REF0003
Right now I get
ID | Part | QTY| Boxes| Reference
1 | ABC123 | 20 | 1 | REF0001
2 | ABC345 | 10 | 1 | REF0001
3 | ABC487 | 5 | 1 | REF0001
4 | SEF453 | 4 | 4 | REF0002
5 | ABDS12 | 82 | 4 | REF0002
6 | EFR488 | 64 | 3 | REF0003
7 | XCV345 | 58 | 3 | REF0003
8 | SSFS33 | 23 | 3 | REF0003
As you can see, the quantity of boxes per reference repeats on every row, and I need it to appear only once per reference.
Well, here is one way . . .
with t as (<your current query>)
select ID, Part, QTY,
max(Boxes) over (partition by Reference) as Boxes,
Reference
from t
Assigning row numbers within each reference, ordered by ID descending, marks the highest ID of each reference as 1; the main query checks this mark and outputs zero for every other row.
; with q as
(
select *,
row_number() over (partition by Reference
order by ID desc) rn
from
(
your-query-here
) a
)
select q.ID,
q.Part,
q.QTY,
case when rn = 1 then q.Boxes else 0 end as Boxes,
q.Reference
from q
order by q.ID
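To try the row_number() approach end to end, the sketch below substitutes a hypothetical table, current_report, for the your-query-here placeholder and loads it with the "right now" output from the question. An in-memory SQLite database (3.25 or later) is assumed in place of the real server.

```python
import sqlite3

# Assumption: `current_report` is a made-up stand-in for the asker's existing query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table current_report (ID integer, Part text, QTY integer, "
    "Boxes integer, Reference text)"
)
data = [
    (1, "ABC123", 20, 1, "REF0001"), (2, "ABC345", 10, 1, "REF0001"),
    (3, "ABC487", 5, 1, "REF0001"), (4, "SEF453", 4, 4, "REF0002"),
    (5, "ABDS12", 82, 4, "REF0002"), (6, "EFR488", 64, 3, "REF0003"),
    (7, "XCV345", 58, 3, "REF0003"), (8, "SSFS33", 23, 3, "REF0003"),
]
conn.executemany("insert into current_report values (?, ?, ?, ?, ?)", data)

query = """
with q as (
    select a.*,
           -- rn = 1 marks the highest ID within each reference
           row_number() over (partition by Reference order by ID desc) as rn
    from (select * from current_report) a
)
select q.ID, q.Part, q.QTY,
       case when rn = 1 then q.Boxes else 0 end as Boxes,
       q.Reference
from q
order by q.ID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Only the last row of each reference keeps its box count, which matches the desired result in the question.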