SQL: Update a column with an index that resets when another column changes

Firstly, sorry for the wording of the question. I'm not too sure how to express it. Hopefully the example below is clear.
If I have a table
Id | Type | Order
---+------+------
 0 | Test | null
 1 | Test | null
 2 | Blah | null
 3 | Blah | null
I want to turn it into this
Id | Type | Order
---+------+------
 0 | Test | 1
 1 | Test | 2
 2 | Blah | 1
 3 | Blah | 2
So I'm grouping the table by Type and allocating an incrementing number to Order, starting at 1 for each type.
How should I go about doing this?
The DB I'm using is Sybase 15.

select
    Id,
    Type,
    row_number() over (partition by Type order by Id) as [Order]
from YourTable
You should use the ROW_NUMBER function to get what you're looking for. From the documentation:
ROW_NUMBER (Transact-SQL)
Returns the sequential number of a row within a partition of a result
set, starting at 1 for the first row in each partition.
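Since the goal is to populate the Order column rather than just select the numbering, here is a hedged sketch of folding that result into an UPDATE. The UPDATE ... FROM join syntax and window-function support vary across Sybase versions, so treat this as an assumption to verify against your server:
-- hedged sketch: write the per-type row numbers back into the Order column
-- (verify UPDATE ... FROM and window-function support on your Sybase version)
update t
set t.[Order] = r.rn
from YourTable t
join (select Id,
             row_number() over (partition by Type order by Id) as rn
      from YourTable) r
  on r.Id = t.Id;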

Related

How to do a batch insert version of auto increment?

I have a temporary table that has columns id and value:
| id | value |
And I have a table that aggregates data with columns g_id and value:
| g_id | value |
The temporary table is locally ordered by id; here are 2 examples:
temp_table 1
| id | value  |
+----+--------+
|  1 | first  |
|  2 | second |
|  3 | third  |
temp_table 2
| id | value |
+----+-------+
|  2 | alpha |
|  3 | beta  |
|  4 | gamma |
I want to insert all the rows from the temp table into the global table, while having the g_id ordered globally. If I insert temp table 1 before temp table 2, it should be like this:
global_table
| group_id | value  |
+----------+--------+
|        1 | first  |
|        2 | second |
|        3 | third  |
|        4 | alpha  |
|        5 | beta   |
|        6 | gamma  |
For my purposes, it is OK if there are gaps between consecutive numbers, and if 2 inserts run at the same time, their rows can interleave. The requirement is basically that the id column is always increasing.
e.g.
Let's say I have 2 sets of queries running at the same time, and group_id in global_table is auto-incremented:
Query 1:
INSERT INTO global_table (value) VALUES ('first');
INSERT INTO global_table (value) VALUES ('second');
INSERT INTO global_table (value) VALUES ('third');
Query 2:
INSERT INTO global_table (value) VALUES ('alpha');
INSERT INTO global_table (value) VALUES ('beta');
INSERT INTO global_table (value) VALUES ('gamma');
I can get something like this:
global_table
| group_id | value  |
+----------+--------+
|        1 | first  |
|        3 | alpha  |
|        4 | second |
|        5 | third  |
|        6 | beta   |
|        7 | gamma  |
How do I achieve something like this when inserting from a table? Like with:
INSERT INTO global_table (value)
SELECT t.value
FROM temp_table t
Unfortunately, this may not result in an incrementing id all the time.
The requirement is basically that the id columns are always increasing.
If I understood you correctly, you should use ORDER BY in your INSERT statement.
INSERT INTO global_table (value)
SELECT t.value
FROM temp_table t
ORDER BY t.id;
In SQL Server, if you include ORDER BY t.id, it is guaranteed that the new IDs generated in global_table follow the specified order.
I don't know about other databases, but SQL Server's IDENTITY provides this guarantee.
Using your sample data, the group_id generated for the value second is guaranteed to be greater than the one generated for first, and the group_id for third will be greater than the one for second.
They may not be consecutive, but if you specify ORDER BY, their relative order will be preserved.
Same for the second table, and even if you run two INSERT statements at the same time. Generated group_ids may interleave between two tables, but relative order within each statement will be guaranteed.
See my answer to a similar question for more technical details and references.
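For reference, a minimal SQL Server sketch of the setup this answer assumes; the IDENTITY definition is my assumption based on "auto incremented" in the question, and the column types are illustrative:
-- hypothetical table definition; IDENTITY generates the auto-incremented ids
CREATE TABLE global_table (
    group_id INT IDENTITY(1,1) PRIMARY KEY,
    value    VARCHAR(50) NOT NULL
);

-- the ordered insert: generated group_ids follow t.id order (gaps allowed)
INSERT INTO global_table (value)
SELECT t.value
FROM temp_table t
ORDER BY t.id;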

How can you assign all the different values in a table variable to fields in existing rows of a table?

I have a table variable (#tableVar) with one column (tableCol) of unique values.
I have a target table with many existing rows that also has a column that is filled entirely with the NULL value.
What type of statement can I use to iterate through #tableVar and assign a different value from #tableVar.tableCol to the null field for each of the rows in my target table?
Edit (to provide info):
My Target table has this structure:
+-------+------------+
| Name  | CallNumber |
+-------+------------+
| James | NULL       |
| Byron | NULL       |
| Steve | NULL       |
+-------+------------+
My table variable has this structure
+------------+
| CallNumber |
+------------+
| 6348       |
| 2675       |
| 9898       |
+------------+
I need to assign a different call number to each row in the target table, to achieve this kind of result.
+-------+------------+
| Name  | CallNumber |
+-------+------------+
| James | 6348       |
| Byron | 2675       |
| Steve | 9898       |
+-------+------------+
Note: Each row does not need a specific CallNumber. The only requirement is that each row have a unique CallNumber. For example, "James" does not specifically need 6348; He can have any number as long as it's unique, and the unique number must come from the table variable. You can assume that the table variable will have enough CallNumbers to meet this requirement.
What type of query can I use for this result?
You can use an update with a sequence number:
with toupdate as (
      select t.*,
             row_number() over (order by (select null)) as seqnum
      from target t
     )
update toupdate
    set CallNumber = tv.tableCol
    from (select tv.*,
                 row_number() over (order by (select null)) as seqnum
          from #tableVar tv
         ) tv
    where tv.seqnum = toupdate.seqnum;
This assumes that #tableVar has enough rows to cover every row in target. If not, I would suggest that you ask a new question with sample data and desired results.
Here is a db<>fiddle.
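If you want to fail fast when #tableVar falls short, a hedged T-SQL pre-check along these lines could run before the update (the error text is illustrative):
-- hypothetical guard: abort when there are fewer CallNumbers than NULL rows
IF (SELECT COUNT(*) FROM #tableVar) <
   (SELECT COUNT(*) FROM target WHERE CallNumber IS NULL)
    RAISERROR('Not enough CallNumbers in #tableVar to cover target', 16, 1);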

How to select data from two columns in SQL?

I have a table in PostgreSQL as follows:
id | name     | parent_id
---+----------+----------
 1 | morteza  | null
 2 | ali      | null
 3 | morteza2 | 1
 4 | morteza3 | 1
The unique records are those with id=1 and id=2; the record with id=1 was modified twice (each modification adds a new row whose parent_id points at the original). Now I want to select the last modified version of each record. The query result for the above data would be:
id | name
---+----------
 1 | morteza3
 2 | ali
What's the suitable query?
If I am following correctly, you can use distinct on and coalesce():
select distinct on (coalesce(parent_id, id))
       coalesce(parent_id, id) as new_id,
       name
from mytable
order by coalesce(parent_id, id), id desc;
Demo on DB Fiddle:
new_id | name
-----: | :-------
     1 | morteza3
     2 | ali
From your description it would seem that the latest version of each row has parent_id IS NULL. (And obsoleted row versions have parent_id IS NOT NULL.)
The query is simple then:
SELECT id, name
FROM tbl
WHERE parent_id IS NULL;
db<>fiddle here
If you have many updates (hence, many obsoleted row versions), a partial index will help performance a lot:
CREATE INDEX ON tbl(id) WHERE parent_id IS NULL;
The actual index column is mostly irrelevant (unless there are additional requirements). The WHERE clause is the point here, to exclude the many obsoleted rows from the index. See:
Postgres partial index on IS NULL not working
Slow PostgreSQL query in production - help me understand this explain analyze output
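To confirm the partial index is actually picked up, a hedged check with EXPLAIN (same table and query as above):
EXPLAIN
SELECT id, name
FROM tbl
WHERE parent_id IS NULL;
The plan should show a scan on the partial index once the table holds enough obsoleted rows to make the index worthwhile; on a tiny table Postgres may still prefer a sequential scan.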

How to select first item by group with condition?

I have a table with the following layout, to store orders for users, and to remember which orders are being processed right now:
Sequence | User | Order | InProcess
---------+------+-------+----------
       1 |    1 |     1 |
       2 |    1 |     2 |
       3 |    2 |     1 |
       4 |    3 |     1 |
       5 |    1 |     3 |
       6 |    4 |     1 |
       7 |    2 |     2 |
E.g., the line 4 | 3 | 1 means that the 4th order ever placed is for user 3, and it's his/her 1st order. Now I want to select the order to process next. This has to be done according to the following criteria:
Older orders (with lower sequence numbers) are processed first.
Only one order is processed per user at once.
Once an order is selected as being processed it gets marked as InProcess.
Once an order is completed, it is deleted from this list.
So, after some time this may look like this:
Sequence | User | Order | InProcess
---------+------+-------+----------
       1 |    1 |     1 | X
       2 |    1 |     2 |
       3 |    2 |     1 | X
       4 |    3 |     1 | X
       5 |    1 |     3 |
       6 |    4 |     1 |
       7 |    2 |     2 |
When asked for the next order to process, the answer would now be the line with sequence number 6: orders for users 1, 2, and 3 are already being processed, so no additional order for them may be started. The question is: how do I get to this row efficiently?
Basically what I need is the SQL equivalent of
Of all orders, select the first order which is not in process, and whose user is not having an order already being processed.
The question is just how to express this in SQL. BTW: I'm looking for a standard SQL solution, not DBMS-specific ways to go. However, if the question has to be limited to specific DBMSs for whatever reason, these are the ones I have to support (in this order):
PostgreSQL
MariaDB
MySQL
SQL Server
MongoDB
Any ideas?
I think this captures your logic:
select t.*
from (select t.*,
             max(in_process) over (partition by user_id) as any_in_process
      from t
     ) t
where any_in_process is null
order by sequence
fetch first 1 row only;
Fetching one row is database specific, but the rest is pretty generic.
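Since the one-row fetch is the only non-portable piece, here are hedged per-dialect equivalents (same hypothetical table and column names as the query above):
-- PostgreSQL / MariaDB / MySQL: LIMIT instead of FETCH FIRST
select t.*
from (select t.*,
             max(in_process) over (partition by user_id) as any_in_process
      from t
     ) t
where any_in_process is null
order by sequence
limit 1;

-- SQL Server: TOP instead of FETCH FIRST
select top (1) t.*
from (select t.*,
             max(in_process) over (partition by user_id) as any_in_process
      from t
     ) t
where any_in_process is null
order by sequence;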
You can get the next order to be processed by using the ROW_NUMBER() window function, as in:
select *
from (
    select *,
           row_number() over (order by "order", "sequence") as rn
    from t
    where "user" not in (
        select "user" from t where inprocess = 'X'
    )
) x
where rn = 1;
Available in PostgreSQL, MariaDB 10.2+, MySQL 8.0+, and SQL Server 2012+.

Trying to create a Teradata view to aggregate how long rows of a specific ID have had a certain value

I have a test report table that gets a row written to it after each run of a test.
Let's say this is the data:
| main_id | status | date    |
|---------|--------|---------|
| 123     | pass   | Jan 1st |
| 123     | fail   | Jan 2nd |
| 123     | fail   | Jan 3rd |
| 123     | fail   | Jan 4th |
I want to make a view that, for each test, will list how long it has been failing.
Essentially, the corresponding row for the above data would look like this:
| main_id | days_failing |
|---------|--------------|
| 123     | 3            |
Using Teradata SQL, how could I check each row in the source table, looking for the last success, and then sum up all the subsequent failures?
Edit: Note that there would be many different "main_id"s in the source table; I would need 1 row in the view for every unique failing test.
Thanks
select main_id
      ,count(*) - 1 as days_failing   -- kept rows = last pass + subsequent fails
from (select main_id
            ,"date"
      from t
      -- keep only rows on or after the most recent 'pass' per main_id
      qualify "date" >= max(case status when 'pass' then "date" end)
                        over (partition by main_id)
     ) t
group by main_id
order by main_id
;
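One edge case worth noting: a main_id that has never passed is filtered out entirely, because the MAX over an all-fail partition is NULL and the QUALIFY comparison then rejects every row. A hedged variant that keeps such tests and counts all of their rows as failing days (the 1900-01-01 sentinel date is an assumption):
select main_id
      ,count(*) - max(case status when 'pass' then 1 else 0 end) as days_failing
from (select main_id
            ,status
            ,"date"
      from t
      -- fall back to a sentinel date so never-passed tests keep all their rows
      qualify "date" >= coalesce(max(case status when 'pass' then "date" end)
                                     over (partition by main_id)
                                ,date '1900-01-01')
     ) t
group by main_id
order by main_id
;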