SQL Ordering records by "weight" - sql

We have a system that processes records by a "priority" number in a table. We define the priority by the contents of the table, e.g.
UPDATE table
SET priority=3
WHERE processed IS NULL
UPDATE table
SET priority=2
WHERE balance>50
UPDATE table
SET priority=1
WHERE value='blah'
(please ignore the fact that there could be 'overlaps' between priorities :) )
This works fine - the table is processed in priority order, so all the rows where the column "value" is 'blah' are worked first.
I've been given the task of adding an option to order the records by a definable "weight". For example, we'd like 50% of the processing to be priority 1, 25% priority 2 and 25% priority 3. Therefore, from the above, in every 100 records 50 of them would be ones where "value" is 'blah', 25 of them would be where "balance" is greater than 50, etc.
I'm trying to figure out how to do this: some kind of weighted incrementing value for "priority" would seem to be the best way, but I can't get my head around how to code this. Can anyone help please?
EDIT: Apologies, should have said: this is running on MSSQL 2008

The general idea is to collect tasks into buckets, split at whole-number boundaries of a computed processing order:
select
    task_id
from (
    select
        task_id,
        -- multiply by 1.0 so SQL Server does not truncate with integer division
        ((task_priority_order - 1) * 1.0 / task_priority_density) as task_processing_order
    from (
        select
            t.task_id as task_id,
            t.priority as task_priority,
            -- position of the task within its own priority
            row_number()
                over (partition by t.priority order by t.task_id) as task_priority_order,
            -- share of every 100 records given to this priority (weights from the question)
            case
                when t.priority = 1 then 50
                when t.priority = 2 then 25
                when t.priority = 3 then 25
            end as task_priority_density
        from
            [table] t -- placeholder name; TABLE is a reserved word, hence the brackets
    ) ranked
) ordered
order by task_processing_order
In the range from 0.0 to 0.(9) we get 100 records: the first 50 records with priority 1, the first 25 records with priority 2 and the first 25 records with priority 3.
The next range, from 1.0 to 1.(9), is the next bucket of records.
If the tasks of some priority run out, the remaining tasks are still placed in buckets in the same relative ratio. E.g. if there are no more tasks with priority 1, the remaining tasks are arranged in a 50/50 ratio.
task_id is some surrogate key that identifies a task.
P.S. Sorry, I can't test this query right now, so any syntax corrections are much appreciated.
Update: Query syntax corrected according to comments.
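The bucketing idea can be checked outside the database. Below is a minimal Python sketch, assuming the 50/25/25 weights from the question; the DENSITY mapping, the processing_order function and the 300 made-up tasks are hypothetical names and data used only for illustration. It mirrors row_number() per priority divided by the density, then sorts by the result.

```python
from collections import Counter, defaultdict

# Assumed weights from the question: share of every 100 records per priority.
DENSITY = {1: 50, 2: 25, 3: 25}


def processing_order(tasks):
    """tasks: iterable of (task_id, priority). Returns task ids in weighted order."""
    seen = defaultdict(int)  # rows seen so far per priority
    keyed = []
    for task_id, priority in sorted(tasks):
        seen[priority] += 1  # plays the role of row_number() partitioned by priority
        order = (seen[priority] - 1) / DENSITY[priority]
        keyed.append((order, priority, task_id))
    # Everything with order < 1.0 forms the first bucket, 1.0 <= order < 2.0 the second, ...
    return [task_id for _, _, task_id in sorted(keyed)]


# 300 made-up tasks: ids 0..299, 100 tasks for each priority.
tasks = [(i, i % 3 + 1) for i in range(300)]
priority_of = dict(tasks)
first_bucket = Counter(priority_of[t] for t in processing_order(tasks)[:100])
print(first_bucket)
```

The first 100 ids in the result contain 50 tasks of priority 1 and 25 each of priorities 2 and 3, which is the ratio asked for.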

The test script below produces the following output. If you lay out some rules about what the end result should be, I'm willing to take another look at it.
Results
Priority Processed Balance Value
3 NULL NULL NULL
NULL 0 49 NULL
NULL 1 49 NULL
NULL 0 50 NULL
NULL 1 50 NULL
2 0 51 NULL
2 1 51 NULL
2 0 51 Notblah
1 1 51 blah
Test script
DECLARE @Table TABLE (Priority INTEGER, Processed BIT, Balance INTEGER, Value VARCHAR(32))
INSERT INTO @Table VALUES
(NULL, NULL, NULL, NULL)
, (NULL, 0, 49, NULL)
, (NULL, 1, 49, NULL)
, (NULL, 0, 50, NULL)
, (NULL, 1, 50, NULL)
, (NULL, 0, 51, NULL)
, (NULL, 1, 51, NULL)
, (NULL, 0, 51, 'Notblah')
, (NULL, 1, 51, 'blah')
UPDATE @Table SET priority=3 WHERE processed IS NULL
UPDATE @Table SET priority=2 WHERE balance > 50
UPDATE @Table SET priority=1 WHERE value = 'blah'
SELECT *
FROM @Table

Related

renumbering in a column when adding a row sql

For a table like
create table Stations_in_route
(
ID_station_in_route int primary key,
ID_route int,
ID_station int,
Number_in_route int not null
)
There is the following trigger that changes the values in the Number_in_route column after a new row is added to the route. The list of numbers in the route must remain consistent.
create trigger stations_in_route_after_insert on Stations_in_route
after insert
as
if exists
(select *from Stations_in_route
where Stations_in_route.ID_station_in_route not in (select ID_station_in_route from inserted)
and Stations_in_route.ID_route in (select ID_route from inserted)
and Stations_in_route.Number_in_route in (select Number_in_route from inserted))
begin
update Stations_in_route
set Number_in_route = Number_in_route + 1
where Stations_in_route.ID_station_in_route not in (select ID_station_in_route from inserted)
and Stations_in_route.ID_route in (select ID_route from inserted)
and Stations_in_route.Number_in_route >= (select Number_in_route from inserted where Stations_in_route.ID_route = inserted.ID_route)
end
This trigger throws an error when several rows are inserted into the same ID_route in one statement:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
For example,
Insert into Stations_in_route values(25, 4, 11, 3),(26, 4, 10, 5)
How to fix?
ID_station_in_route  ID_route  ID_station  Number_in_route
1                    4         1           1
2                    4         2           2
3                    4         3           3
4                    4         4           4
5                    4         5           5
6                    4         6           6
7                    4         7           7
8                    4         8           8
I expect the list to look like this after the insert:
ID_station_in_route  ID_route  ID_station  Number_in_route
1                    4         1           1
2                    4         2           2
25                   4         11          3
3                    4         3           4
26                   4         10          5
4                    4         4           6
5                    4         5           7
6                    4         6           8
7                    4         7           9
8                    4         8           10
This is not the whole table, as there are other routes too.
Based on the requirements, when you add new stops to the route, you need to insert them into their desired sequence correctly, and push all existing stops from that point forward so that a contiguous sequence is maintained. When you insert one row this isn't very hard (just number_in_route + 1 where number_in_route >= new_number_in_route), but when you insert more rows, you need to basically push the entire set of subsequent stops by 1 for each new row. To illustrate, let's say you start with the eight stops of route 4 from the question, numbered 1 through 8.
If we insert two new rows, such as:
INSERT dbo.Stations_in_route
(
    ID_station_in_route,
    ID_route,
    ID_station,
    Number_in_route
)
VALUES (25, 4, 11, 3),  -- add a stop at 3
       (26, 4, 10, 5);  -- add a stop at 5
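We can illustrate this by slowing it down into separate steps. First, we need to add the new row at position #3, and we do this by pushing all the existing rows >= 3 down by 1, so the old positions 3 through 8 become 4 through 9.
Then we add the second row at position #5. That's the new position #5, after the previous shift, so the stop that started at position 4 (and is now at 5) moves to 6, and every stop after it moves down by one more as well.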
We can do this with the following trigger, which is possibly a little more complicated than it has to be, but is better IMHO than tedious loops which might otherwise be required.
CREATE TRIGGER dbo.tr_ins_Stations_in_route ON dbo.Stations_in_route
FOR INSERT AS
BEGIN
;WITH x AS
(
SELECT priority = 1, *, offset = ROW_NUMBER() OVER
(PARTITION BY ID_route ORDER BY Number_in_route)
FROM inserted AS i
UNION ALL
SELECT priority = 2, s.*, offset = NULL FROM dbo.Stations_in_route AS s
WHERE s.ID_route IN (SELECT ID_route FROM inserted)
),
y AS
(
SELECT *, rough_rank = Number_in_route
+ COALESCE(MAX(offset) OVER (PARTITION BY ID_Route
ORDER BY Number_in_route ROWS UNBOUNDED PRECEDING),0)
- COALESCE(offset, 0),
tie_break = ROW_NUMBER() OVER
(PARTITION BY ID_route, ID_station_in_route ORDER BY priority)
FROM x
),
z AS
(
SELECT *, new_Number_in_route = ROW_NUMBER() OVER
(PARTITION BY ID_Route ORDER BY rough_rank, priority)
FROM y WHERE tie_break = 1
)
UPDATE s SET s.Number_in_route = z.new_Number_in_route
FROM dbo.Stations_in_route AS s
INNER JOIN z ON s.ID_route = z.ID_route
AND s.ID_station_in_route = z.ID_station_in_route;
END
Working example db<>fiddle
I've mentioned a couple of times that you might want to handle ties for new rows, e.g. if the insert happened to be:
Insert into Stations_in_route values(25, 4, 11, 3),(26, 4, 10, 3)
For that you can add additional tie-breaking criteria to this clause:
new_Number_in_route = ROW_NUMBER() OVER
(PARTITION BY ID_Route ORDER BY rough_rank, priority)
e.g.:
new_Number_in_route = ROW_NUMBER() OVER
(PARTITION BY ID_Route ORDER BY rough_rank, priority,
ID_station_in_route DESC)
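The step-by-step shifting described above can also be simulated outside the database to check what a trigger should produce. This is a minimal Python sketch, not the trigger itself; insert_stops is a hypothetical helper, and it assumes new stops are applied in ascending order of their requested position.

```python
def insert_stops(route, new_stops):
    """Insert new stops one at a time, shifting existing stops to make room.

    route and new_stops are lists of (ID_station_in_route, Number_in_route).
    Returns the renumbered route sorted by Number_in_route.
    """
    numbers = dict(route)  # ID_station_in_route -> Number_in_route
    for new_id, position in sorted(new_stops, key=lambda stop: stop[1]):
        for stop_id in numbers:
            if numbers[stop_id] >= position:
                numbers[stop_id] += 1  # push this stop one place down the route
        numbers[new_id] = position
    return sorted(numbers.items(), key=lambda item: item[1])


# Route 4 from the question: stops 1..8 at positions 1..8.
route_4 = [(i, i) for i in range(1, 9)]
result = insert_stops(route_4, [(25, 3), (26, 5)])
for stop_id, number in result:
    print(stop_id, number)
```

For the two-row insert from the question this yields the sequence the asker expects: stop 25 at 3, stop 3 at 4, stop 26 at 5, and stops 4 through 8 at 6 through 10.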
I'm unable to repro the exception with the test code/data in the question, however I'm gonna guess that the issue is with this bit of the code in the trigger:
AND Stations_in_route.Number_in_route >=
(
SELECT Number_in_route
FROM inserted
WHERE Stations_in_route.ID_route = inserted.ID_route
)
The engine there will implicitly expect that subquery on the right-side of the >= operator to return a scalar result (single row, single column result), however the inserted table is in fact, a table...which may contain multiple records (as would be the case in a multi-row insert/update/etc. type statement as outlined in your example). Given that the filter (i.e. WHERE clause) in that subquery isn't guaranteed to be unique (ID_route doesn't appear to be unique, and in your example you have an insert statement that actually inserts multiple rows with the same ID_route value), then it's certainly possible that query will return a non-scalar result.
To fix that, you'd need to adjust that subquery to guarantee a result of a scalar value (single row and single column). You've guaranteed the single column already with the selector...now you need to add logic to guarantee a single result/record as well. That could include one or more of the following (or possibly other things also):
Wrap the selected Number_in_route column in an aggregate function (i.e. a MAX() perhaps?)
Add a TOP 1 with an ORDER BY to get the record you want to compare with
Add additional filters to the WHERE clause to ensure a single result is returned

SQL Server Sum a specific number of rows based on another column

Here are the important columns in my table
ItemId RowID CalculatedNum
1 1 3
1 2 0
1 3 5
1 4 25
1 5 0
1 6 8
1 7 14
1 8 2
.....
The rowID increments to 141 before the ItemID increments to 2. This cycle repeats for about 122 million rows.
I need to SUM the CalculatedNum field in groups of 6. So sum 1-6, then 7-12, etc. I know I end up with an odd number at the end. I can discard the last three rows (numbers 139, 140 and 141). I need it to start the SUM cycle again when I get to the next ItemID.
I know I need to group by the ItemID but I am having trouble trying to figure out how to get SQL to SUM just 6 CalculatedNum's at a time. Everything else I have come across SUMs based on a column where the values are the same.
I did find something on Microsoft's site that used the ROW_NUMBER function but I couldn't quite make sense of it. Please let me know if this question is not clear.
Thank you
You need to group by (RowId - 1) / 6 and ItemId. Like this:
drop table if exists dbo.Items;
create table dbo.Items (
ItemId int
, RowId int
, CalculatedNum int
);
insert into dbo.Items (ItemId, RowId, CalculatedNum)
values (1, 1, 3), (1, 2, 0), (1, 3, 5), (1, 4, 25)
, (1, 5, 0), (1, 6, 8), (1, 7, 14), (1, 8, 2);
select
tt.ItemId
, sum(tt.CalculatedNum) as CalcSum
from (
select
*
, (t.RowId - 1) / 6 as Grp
from dbo.Items t
) tt
group by tt.ItemId, tt.Grp
You could use integer division and group by. SQL Server does not accept a column alias in GROUP BY or HAVING, so the expression is repeated there:
SELECT ItemId, (RowId-1)/6 AS Batch, SUM(CalculatedNum) AS CalcSum
FROM your_table GROUP BY ItemId, (RowId-1)/6
To discard incomplete batches:
SELECT ItemId, (RowId-1)/6 AS Batch, SUM(CalculatedNum) AS CalcSum, COUNT(*) AS Cnt
FROM your_table GROUP BY ItemId, (RowId-1)/6 HAVING COUNT(*) = 6
EDIT: Fix an off by one error.
To ensure you're querying 6 rows at a time you can try to use the modulo function : https://technet.microsoft.com/fr-fr/library/ms173482(v=sql.110).aspx
Hope this can help.
Thanks everyone. This was really helpful.
Here is what we ended up with.
SELECT ItemID, MIN(RowID) AS StartingRow, SUM(CalculatedNum)
FROM dbo.table
GROUP BY ItemID, (RowID - 1) / 6
ORDER BY ItemID, StartingRow
I am not sure why it did not like the integer division in the select statement but I checked the results against a sample of the data and the math is correct.
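The grouping can be tried on the eight sample rows without a SQL Server instance. The sketch below uses Python's built-in sqlite3 module as a stand-in (an assumption: SQLite, like SQL Server, truncates when dividing two integers), with a table named Items as in the first answer. It keeps only complete batches of 6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Items (ItemId INTEGER, RowId INTEGER, CalculatedNum INTEGER)")
conn.executemany(
    "INSERT INTO Items VALUES (?, ?, ?)",
    [(1, 1, 3), (1, 2, 0), (1, 3, 5), (1, 4, 25),
     (1, 5, 0), (1, 6, 8), (1, 7, 14), (1, 8, 2)],
)

# (RowId - 1) / 6 is 0 for rows 1-6, 1 for rows 7-12, and so on.
full_batches = conn.execute(
    """
    SELECT ItemId, MIN(RowId) AS StartingRow, SUM(CalculatedNum) AS CalcSum
    FROM Items
    GROUP BY ItemId, (RowId - 1) / 6
    HAVING COUNT(*) = 6
    ORDER BY ItemId, StartingRow
    """
).fetchall()
print(full_batches)  # [(1, 1, 41)]
```

Rows 7 and 8 form an incomplete batch and are dropped by the HAVING clause, which is how the trailing rows 139 to 141 would be discarded in the real table.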

Oracle SQL - How can I write an insert statement that is conditional and looped?

Context:
I have two tables: markettypewagerlimitgroups (mtwlg) and stakedistributionindicators (sdi). When a mtwlg is created, 2 rows are created in the sdi table which are linked to the mtwlg - each row with the same values bar 2, the id and another field (let's call it column X) which must contain a 0 for one row and 1 for the other.
There was a bug present in our codebase which prevented this happening automatically, so any mtwlg's created during the time that bug was present do not have the related sdi's, causing NPE's in various places.
To fix this, a patch needs to be written to loop through the mtwlg table and for each ID, search the sdi table for the 2 related rows. If the rows are present, do nothing; if there is only 1 row, check if column X is a 0 or a 1, and insert a row with the other value; if neither row is present, insert them both. This needs to be done for every mtwlg, and a unique ID needs to be inserted too.
Pseudocode:
For each market type wager limit group ID
Check if there are 2 rows with that id in the stake distributions table, 1 where column X = 0 and one where column X = 1
if none
create 2 rows in the stake distributions table with unique id's; 1 for each X value
if one
create the missing row in the stake distributions table with a unique id
if 2
do nothing
If it helps at all - the patch will be applied using liquibase.
Anyone with any advice or thoughts as to if and how this will be possible to write in SQL/a liquibase patch?
Thanks in advance, let me know of any other information you need.
EDIT:
I've actually just been advised to do this using PL/SQL, do you have any thoughts/suggestions in regards to this?
Thanks again.
Oooooh, an excellent job for MERGE.
Here's your pseudo code again:
For each market type wager limit group ID
Check if there are 2 rows with that id in the stake distributions table,
1 where column X = 0 and one where column X = 1
if none
create 2 rows in the stake distributions table with unique id's;
1 for each X value
if one
create the missing row in the stake distributions table with a unique id
if 2
do nothing
Here's the MERGE variant (still pseudo-code'ish as I don't know how your data really looks):
MERGE INTO stake_distributions d
USING (
SELECT limit_group_id, 0 AS x
FROM market_type_wagers
UNION ALL
SELECT limit_group_id, 1 AS x
FROM market_type_wagers
) t
ON (
d.limit_group_id = t.limit_group_id AND d.x = t.x
)
WHEN NOT MATCHED THEN INSERT (d.limit_group_id, d.x)
VALUES (t.limit_group_id, t.x);
No loops, no PL/SQL, no conditional statements, just plain beautiful SQL.
Nice alternative suggested by Boneist in the comments uses a CROSS JOIN rather than UNION ALL in the USING clause, which is likely to perform better (unverified):
MERGE INTO stake_distributions d
USING (
SELECT w.limit_group_id, x.x
FROM market_type_wagers w
CROSS JOIN (
SELECT 0 AS x FROM DUAL
UNION ALL
SELECT 1 AS x FROM DUAL
) x
) t
ON (
d.limit_group_id = t.limit_group_id AND d.x = t.x
)
WHEN NOT MATCHED THEN INSERT (d.limit_group_id, d.x)
VALUES (t.limit_group_id, t.x);
Answer: you don't. There is absolutely no need to loop through anything - you can do it in a single insert. All you need to do is identify the rows that are missing, and then you just need to add them in.
Here is an example:
drop table t1;
drop table t2;
drop sequence t2_seq;
create table t1 (cola number,
colb number,
colc number);
create table t2 (id number,
cola number,
colb number,
colc number,
colx number);
create sequence t2_seq
START WITH 1
INCREMENT BY 1
MAXVALUE 99999999
MINVALUE 1
NOCYCLE
CACHE 20
NOORDER;
insert into t1 values (1, 10, 100);
insert into t2 values (t2_seq.nextval, 1, 10, 100, 0);
insert into t2 values (t2_seq.nextval, 1, 10, 100, 1);
insert into t1 values (2, 20, 200);
insert into t2 values (t2_seq.nextval, 2, 20, 200, 0);
insert into t1 values (3, 30, 300);
insert into t2 values (t2_seq.nextval, 3, 30, 300, 1);
insert into t1 values (4, 40, 400);
commit;
insert into t2 (id, cola, colb, colc, colx)
with dummy as (select 1 id from dual union all
select 0 id from dual)
select t2_seq.nextval,
t1.cola,
t1.colb,
t1.colc,
d.id
from t1
cross join dummy d
left outer join t2 on (t2.cola = t1.cola and d.id = t2.colx)
where t2.id is null;
commit;
select * from t2
order by t2.cola;
ID COLA COLB COLC COLX
---------- ---------- ---------- ---------- ----------
1 1 10 100 0
2 1 10 100 1
3 2 20 200 0
5 2 20 200 1
7 3 30 300 0
4 3 30 300 1
6 4 40 400 0
8 4 40 400 1
If the processing logic is too gnarly to be encapsulated in a single SQL statement, you may need to resort to cursor for loops and row types - basically allows you to do things like the following:
DECLARE
r_mtwlg markettypewagerlimitgroups%ROWTYPE;
BEGIN
FOR r_mtwlg IN (
SELECT mtwlg.*
FROM markettypewagerlimitgroups mtwlg
)
LOOP
-- do stuff here
-- refer to elements of the current row like this
DBMS_OUTPUT.PUT_LINE(r_mtwlg.id);
END LOOP;
END;
/
You can obviously nest another loop inside this one that hits the stakedistributionindicators table, but I'll leave that as an exercise for you. You could also left join to stakedistributionindicators a couple of times in this first cursor so that you only return rows that don't already have an x=1 and x=0, again you can probably work that bit out for yourself.
If you would rather write your logic in Java vs. PL/SQL, Liquibase allows you to create custom changes. The custom change points to a Java class you write that can do whatever logic you need. A simple example can be found here
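The set-based approach from the answers above (generate every expected pair, then insert only the pairs that are missing) can be tried in miniature. This sketch uses Python's built-in sqlite3 module rather than Oracle, so it is only an illustration of the anti-join idea; the t1/t2 names follow the earlier example, and the auto-assigned integer primary key stands in for the Oracle sequence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (cola INTEGER);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, cola INTEGER, colx INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3), (4);
    -- group 1 is complete, groups 2 and 3 miss one row each, group 4 misses both
    INSERT INTO t2 (cola, colx) VALUES (1, 0), (1, 1), (2, 0), (3, 1);
    """
)

# Pair every group with x = 0 and x = 1, keep only pairs with no match in t2.
conn.execute(
    """
    INSERT INTO t2 (cola, colx)
    SELECT t1.cola, d.x
    FROM t1
    CROSS JOIN (SELECT 0 AS x UNION ALL SELECT 1) AS d
    LEFT JOIN t2 ON t2.cola = t1.cola AND t2.colx = d.x
    WHERE t2.id IS NULL
    """
)
pairs = conn.execute("SELECT cola, colx FROM t2 ORDER BY cola, colx").fetchall()
print(pairs)
```

After the insert every group has exactly one row with x = 0 and one with x = 1, and running the statement again inserts nothing, which is a useful property for a patch.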

SQL sort equal values using case

A similar question has already been asked here, but the current scenario is a little more complex. In that example, rows with the same ITime can be sorted by the CASE expression, but how can I sort rows where both ITime and Result are the same?
My question is: the resulting ID order here is 3,5,6,1,2,7,8,4. Why is it 2,7,8 for the fail case?
Why is it not 8,2,7?
If I want the expected result to be 3,5,1,6,8,2,7,4, how can I proceed?
Please run the commands below and help me sort this out. Thanks in advance.
if object_id('tempdb.dbo.#temp321','U') is not null
drop table tempdb.dbo.#temp321
create table #temp321(id int, uname varchar(50), current_point int,
previous_point int, ITime datetime, Result varchar(10))
INSERT into #temp321 values('1','a','50','40','2012-11-12 13:12:28.103','pass')
INSERT into #temp321 values('2','b','15','10','2012-11-12 13:12:28.103','fail')
INSERT into #temp321 values('3','c','71','70','2012-11-12 12:58:30.000','pass')
INSERT into #temp321 values('4','d','34','30','2012-11-12 13:12:28.103','withdraw')
INSERT into #temp321 values('5','e','40','35','2012-11-12 12:58:41.360','withdraw')
INSERT into #temp321 values('6','f','65','60','2012-11-12 13:12:28.103','pass')
INSERT into #temp321 values('7','g','20','15','2012-11-12 13:12:28.103','fail')
INSERT into #temp321 values('8','h','10','7','2012-11-12 13:12:28.103','fail')
select
ID
from
#temp321
ORDER BY
ITime ASC,
CASE Result
WHEN 'pass' THEN 1
WHEN 'fail' THEN 2
WHEN 'withdraw' THEN 3
END
drop table #temp321
Current output ID: 3,5,6,1,2,7,8,4
Expected Output ID: 3,5,1,6,8,2,7,4
The current query will NOT deliver the same order every time.
For me your example delivers:
3, 5, 1, 6, 2, 7, 8, 4 (Note 1 and 6 being swapped)
Rows 1 and 6 are "equal" with respect to the values used for sorting. If no further sort key is specified (or the sort keys are equal), the order within that group is, by definition, undefined (it depends on things like the order in which threads produced the data).
The same applies to 2, 7, 8. You want the order 3, 5, 1, 6, 8, 2, 7, 4, so you seem to have a rule in mind for how ties should be sorted. Add that condition and you are done :)
(For your expected output, adding current_point does what you want, but you have to decide whether that is the column you really want to sort by.)
SELECT *
FROM #temp321
ORDER BY ITime ASC,
CASE Result
WHEN 'pass' THEN 1
WHEN 'fail' THEN 2
WHEN 'withdraw' THEN 3
END, current_point ASC
You can create a subquery of key/value pairs:
SELECT 'pass' value, 1 priority UNION ALL
SELECT 'fail' value, 2 priority UNION ALL
SELECT 'withdraw' value, 3 priority
Then join to this subquery on the value column and order by priority, which gives a cleaner solution. If there are many lookup values, you can even create a temporary table indexed on value to keep it fast.
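A runnable sketch of the lookup-join idea, using Python's built-in sqlite3 module as a stand-in for SQL Server. The table is trimmed to the columns that matter, and current_point is added as the final tie-breaker, as suggested in the other answer; both choices are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temp321 (id INTEGER, current_point INTEGER, ITime TEXT, Result TEXT)"
)
conn.executemany("INSERT INTO temp321 VALUES (?, ?, ?, ?)", [
    (1, 50, "2012-11-12 13:12:28.103", "pass"),
    (2, 15, "2012-11-12 13:12:28.103", "fail"),
    (3, 71, "2012-11-12 12:58:30.000", "pass"),
    (4, 34, "2012-11-12 13:12:28.103", "withdraw"),
    (5, 40, "2012-11-12 12:58:41.360", "withdraw"),
    (6, 65, "2012-11-12 13:12:28.103", "pass"),
    (7, 20, "2012-11-12 13:12:28.103", "fail"),
    (8, 10, "2012-11-12 13:12:28.103", "fail"),
])

# Join each row to its priority, then sort by time, priority and a tie-breaker.
ids = [row[0] for row in conn.execute(
    """
    SELECT t.id
    FROM temp321 AS t
    JOIN (
        SELECT 'pass' AS value, 1 AS priority UNION ALL
        SELECT 'fail', 2 UNION ALL
        SELECT 'withdraw', 3
    ) AS p ON p.value = t.Result
    ORDER BY t.ITime, p.priority, t.current_point
    """
)]
print(ids)  # [3, 5, 1, 6, 8, 2, 7, 4]
```

Because every sort key combination is now unique for this data, the order is deterministic and matches the expected output from the question.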

Get values between ranges

The more I think about it the more confused I get, possibly because it has been quite a while since I wrote any complex SQL.
I have a table that holds ranges for a value. Let's call it range:
RANGE
RANGE_ID  RANGE_SEQ  MIN  MAX  FACTOR
1         1          0    10   1
1         2          11   100  1.5
1         3          101       2.5
2         1          0    18   1
2         2          19        2
And I have another table that uses these ranges. Let's call it application:
APPLICATION
APP_ID RAW_VALUE RANGE_ID FINAL_VALUE
1 20.0 1 30.0 /*In Range 1, 20 falls between 11 and 100, so 1.5 is applied)*/
2 25.0 2 50.0
3 18.5 2 18.5
I want to get those RAW_VALUES that fall between the ranges. So for range 2, I want those APP_IDs that have a RAW_VALUE between 18 and 19. Similarly for range 1, I want those APP_IDs that have a RAW_VALUE between 10 and 11 and 100 and 101.
I want to know whether this is possible with SQL, and some pointers on what I can try. I don't need the sql itself, just some pointers to the approach.
Try this to get you close
select app_id,raw_value,aa.range_id,raw_value * xx.factor as Final_Value
from Application_table aa
join range_table xx on (aa.raw_value between xx.min and xx.max)
and (aa.range_id=xx.range_id)
To get non-matches (i.e. raw_values that do not exist in the table), try this
select app_id,raw_value,aa.range_id
from Application_table aa
left join range_table xx on (aa.raw_value between xx.min and xx.max)
and (aa.range_id=xx.range_id)
where xx.range_id is null
create table tq84_range (
range_id number not null,
range_seq number not null,
min_ number not null,
max_ number,
factor number not null,
--
primary key (range_id, range_seq)
);
insert into tq84_range values (1, 1, 0, 10, 1.0);
insert into tq84_range values (1, 2, 11, 100, 1.5);
insert into tq84_range values (1, 3,101,null, 2.5);
insert into tq84_range values (2, 1, 0, 18, 1.0);
insert into tq84_range values (2, 2, 19,null, 2.0);
create table tq84_application (
app_id number not null,
raw_value number not null,
range_id number not null,
primary key (app_id)
);
insert into tq84_application values (1, 20.0, 1);
insert into tq84_application values (2, 25.0, 2);
insert into tq84_application values (3, 18.5, 2);
You want to use a left join.
With a left join, you ensure that each record of the left table (the table appearing before left join in the select statement) is returned at least once, even if the join condition doesn't find a record in the right table.
If tq84_range.range_id is null, you know that the join condition didn't find a record in tq84_range, so there seems to be a gap, and you print Missing:.
Since tq84_range.max_ can be null, and null appears to indicate no upper bound, you test the upper limit with nvl(tq84_range.max_, tq84_application.raw_value).
Thus, the select statement will become something like:
select
case when tq84_range.range_id is null then 'Missing: '
else ' '
end,
tq84_application.raw_value
from
tq84_application left join
tq84_range
on
tq84_application.range_id = tq84_range.range_id
and
tq84_application.raw_value between
tq84_range.min_ and nvl(tq84_range.max_, tq84_application.raw_value);
From what I understand you're saying you only want results from the application table that don't fit in any range? This, for example, would return only the row for app_id = 3 (my own column names and guess at real minimum and maximum amounts):
select *
from APP1 A
where not exists
(select null
from RANGE1 R
where R.RANGE_ID = A.RANGE_ID and A.RAW_VALUE between nvl(R.MINNUM, 0) and nvl(R.MAXNUM, 999999));
But, of course, it won't return a factor amount as it matches no rows in the range table so why would the result for app_id = 3 in your example above match up with factor = 1? If your raw_value column is going to be decimal then I would expect the ranges to be decimal too.
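The gap check discussed in these answers can be tried on the sample data from the question. This is a minimal sketch using Python's built-in sqlite3 module instead of Oracle, so COALESCE replaces nvl; the table names ranges and application and the min_/max_ column names are assumptions. A NULL maximum is treated as "no upper bound".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ranges (range_id INTEGER, range_seq INTEGER,
                         min_ REAL, max_ REAL, factor REAL);
    INSERT INTO ranges VALUES
        (1, 1, 0, 10, 1.0), (1, 2, 11, 100, 1.5), (1, 3, 101, NULL, 2.5),
        (2, 1, 0, 18, 1.0), (2, 2, 19, NULL, 2.0);
    CREATE TABLE application (app_id INTEGER, raw_value REAL, range_id INTEGER);
    INSERT INTO application VALUES (1, 20.0, 1), (2, 25.0, 2), (3, 18.5, 2);
    """
)

# Keep applications whose raw_value is not covered by any row of their range.
gaps = conn.execute(
    """
    SELECT a.app_id, a.raw_value, a.range_id
    FROM application AS a
    WHERE NOT EXISTS (
        SELECT 1
        FROM ranges AS r
        WHERE r.range_id = a.range_id
          AND a.raw_value BETWEEN r.min_ AND COALESCE(r.max_, a.raw_value)
    )
    """
).fetchall()
print(gaps)  # [(3, 18.5, 2)]
```

Only app_id 3 is returned, since 18.5 falls between the 0 to 18 row and the row starting at 19 of range 2.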