Assign value to a column based on values of other columns in the same table - sql

I have a table with columns Date and Order. I want to add a column named Batch to this table, filled as follows: for each Date, start from the first Order and group every two orders into one batch.
For the records with Date = 1 in this example (the first 4 records), the first two records (Order = 10 and Order = 30) get Batch = 1, the next two records (Order = 80 and Order = 110) get Batch = 2, and so on.
If the number of remaining records at the end is less than the batch size (2 in this example), the remaining order(s) get a separate Batch number. In the example below, the number of records with Date = 2 is odd, so the last record (the 5th record) gets Batch = 3.
Date Order
-----------
1 10
1 30
1 80
1 110
2 20
2 30
2 50
2 70
2 120
3 90
Date Order Batch
------------------
1 10 1
1 30 1
1 80 2
1 110 2
2 20 1
2 30 1
2 50 2
2 70 2
2 120 3
3 90 1

Use the analytic function row_number to get row numbers 1, 2, 3, ... within each date. Then add one, divide by two, and truncate the result to get the batch number:
select
dateid,
orderid,
trunc((row_number() over (partition by dateid order by orderid) +1 ) / 2) as batch
from mytable;
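To check the arithmetic against the sample data, here is a minimal sketch that runs the same idea through Python's built-in sqlite3 module (an assumption for illustration; the answer above uses Oracle's trunc). It relies on SQLite's integer division instead of trunc, needs SQLite 3.25 or later for window functions, and generalizes "add one and divide by two" to a hypothetical BATCH_SIZE parameter.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (dateid INTEGER, orderid INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 10), (1, 30), (1, 80), (1, 110),
     (2, 20), (2, 30), (2, 50), (2, 70), (2, 120),
     (3, 90)],
)

BATCH_SIZE = 2  # hypothetical parameter; the question uses batches of two

# (row_number + size - 1) / size is integer division in SQLite, so it
# plays the role of trunc((row_number + 1) / 2) when size is 2.
batch_rows = conn.execute(
    """
    SELECT dateid,
           orderid,
           (ROW_NUMBER() OVER (PARTITION BY dateid ORDER BY orderid)
            + :size - 1) / :size AS batch
    FROM mytable
    ORDER BY dateid, orderid
    """,
    {"size": BATCH_SIZE},
).fetchall()

for row in batch_rows:
    print(row)
```

With a batch size of 2 this should reproduce the Batch column shown in the question, including the lone third batch for Date = 2.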

Related

SQL Get max value of n next rows

Say I have a table with two columns: the time and the value. I want a result that gives, for each time, the max value over the next n seconds (starting with the current row).
For example, if I want the max value over every next 3 seconds, the following table:
time value
----------
1 6
2 1
3 4
4 2
5 5
6 1
7 1
8 3
9 7
Should return:
time value max
--------------
1 6 6
2 1 4
3 4 5
4 2 5
5 5 5
6 1 3
7 1 7
8 3 NULL
9 7 NULL
Is there a way to do this directly with an sql query?
You can use the max window function:
select *,
case
when row_number() over(order by time desc) > 2 then
max(value) over(order by time rows between current row and 2 following)
end as max
from table_name;
The case expression checks that there are more than 2 rows after the current row to calculate the max, otherwise null is returned (for the last 2 rows ordered by time).
This is similar to Zakaria's version, but in a benchmark scaled to 3M rows it used about 40% less CPU, because both window functions use exactly the same OVER clause, which lets SQL Server optimize the query better.
Optimized Max Value of Rolling Window of 3 Rows
SELECT *,
MaxValueIn3SecondWindow = CASE
/*Check that 3 rows exist to compare. If they do, calculate the max value*/
WHEN 3 = COUNT(*) OVER (ORDER BY [Time] ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
/*Returns max [Value] between the current row and the next 2 rows*/
THEN MAX(A.[Value]) OVER (ORDER BY [Time] ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
END
FROM #YourTable AS A
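As a sanity check of the window frame used in both answers, the sketch below runs the count-based variant against the question's data using Python's sqlite3 module (an assumption for illustration; the answers target other engines). The table name readings and the column alias max_next_3 are made up here, and the named WINDOW clause requires SQLite 3.25 or later. Note that a ROWS frame counts rows, not seconds, so it only matches "next 3 seconds" when there is exactly one row per second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "readings" is a hypothetical table name standing in for the question's table.
conn.execute("CREATE TABLE readings (time INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 6), (2, 1), (3, 4), (4, 2), (5, 5), (6, 1), (7, 1), (8, 3), (9, 7)],
)

# One named window is shared by COUNT and MAX, so both functions see the
# same frame: the current row plus the two rows that follow it.
rolling = conn.execute(
    """
    SELECT time,
           value,
           CASE WHEN COUNT(*) OVER w = 3 THEN MAX(value) OVER w END AS max_next_3
    FROM readings
    WINDOW w AS (ORDER BY time ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
    ORDER BY time
    """
).fetchall()

for row in rolling:
    print(row)
```

The last two rows come back with None (NULL) because their frames hold fewer than 3 rows, which matches the expected output in the question.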

Replace a column value with random values

I want to replace values in a column with randomized values
NO LINE
-- ----
1 1
1 2
1 3
1 4
2 1
2 2
3 1
4 1
4 2
I want to randomize column NO and replace it with random values. I have 5 million records, and the script below gives me 5 million unique NO values. But as you can see, NO is not unique in my data, and I want the same random value assigned to every row that shares the same NO.
UPDATE table1
SET NO= abs(checksum(NewId())) % 100000000
I want my resultant dataset like below
NO LINE
------ ----
99 1
99 2
99 3
99 4
1092 1
1092 2
3456 1
41098 1
41098 2
I would recommend rand() with a seed:
UPDATE table1
SET NO = FLOOR(rand(NO) * 100000000);
This runs a slight risk of collisions, so two different NO values could end up with the same new value.
If the numbers do not need to be "random" you can give them consecutive values in an arbitrary order and avoid collisions:
with toupdate as (
select t1.*,
dense_rank() over (order by rand(NO), no) as new_no
from table1 t1
)
update toupdate
set no = new_no;
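SQLite has no seeded rand(), so the following is a sketch of a different but related technique rather than a port of the queries above: build a lookup table with one shuffled number per distinct NO, then update from it. It uses Python's sqlite3 module, and the lookup table no_map and its columns old_no and new_no are hypothetical names. Like the dense_rank version, it trades "random-looking" values for a guarantee of no collisions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NO is quoted because it is also an SQL keyword.
conn.execute('CREATE TABLE table1 ("NO" INTEGER, line INTEGER)')
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (1, 4), (2, 1), (2, 2), (3, 1), (4, 1), (4, 2)],
)
original = conn.execute('SELECT "NO", line FROM table1 ORDER BY rowid').fetchall()

# One row per distinct NO, numbered in a random order. Every old value
# gets exactly one new value, and no two old values share a new one.
conn.execute(
    """
    CREATE TEMP TABLE no_map AS
    SELECT old_no,
           ROW_NUMBER() OVER (ORDER BY random()) AS new_no
    FROM (SELECT DISTINCT "NO" AS old_no FROM table1)
    """
)

# Replace each NO with the value assigned to it in the lookup table.
conn.execute(
    """
    UPDATE table1
    SET "NO" = (SELECT new_no FROM no_map WHERE old_no = table1."NO")
    """
)
updated = conn.execute('SELECT "NO", line FROM table1 ORDER BY rowid').fetchall()

for before, after in zip(original, updated):
    print(before, "->", after)
```

On 5 million rows an index on the lookup column would probably matter; that is not shown here.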

Oracle SQL find row crossing limit

I have a table which has four columns, as below:
ID
SUB_ID (one ID can have multiple SUB_IDs)
REVENUE
PAY (the value of PAY is always less than or equal to REVENUE)
Running select * from Table A order by ID, SUB_ID returns data like this:
ID SUB_ID REVENUE PAY
100 1 10 8
100 2 12 9
100 3 9 7
100 4 11 11
101 1 6 5
101 2 4 4
101 3 3 2
101 4 8 7
101 5 4 3
101 6 3 3
I have a constant LIMIT value of 20. For each ID, I need to find the SUB_ID at which the running SUM of Revenue (taken in increasing SUB_ID order) crosses the LIMIT, and then find the total Pay up to and including that row. In this example:
for ID 100, the limit is crossed by SUB_ID 2 (10+12), so the total Pay is 17 (8+9)
for ID 101, the limit is crossed by SUB_ID 4 (6+4+3+8), so the total Pay is 18 (5+4+2+7)
Basically I need to find the row which crosses the limit.
Fiddle: http://sqlfiddle.com/#!4/4f12a/4/0
with sub as
(select x.*,
sum(revenue) over(partition by id order by sub_id) as run_rev,
sum(pay) over(partition by id order by sub_id) as run_pay
from tbl x)
select *
from sub s
where s.run_rev = (select min(x.run_rev)
from sub x
where x.id = s.id
and x.run_rev > 20);
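To see the running totals at work, here is a minimal sketch that runs essentially the same query on the question's data through Python's sqlite3 module (an assumption for illustration; the original targets Oracle). The limit is passed as a hypothetical LIMIT_VALUE parameter. Picking the smallest running total above the limit finds the crossing row only because revenue is positive, so the running sum keeps growing within each ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (id INTEGER, sub_id INTEGER, revenue INTEGER, pay INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [(100, 1, 10, 8), (100, 2, 12, 9), (100, 3, 9, 7), (100, 4, 11, 11),
     (101, 1, 6, 5), (101, 2, 4, 4), (101, 3, 3, 2), (101, 4, 8, 7),
     (101, 5, 4, 3), (101, 6, 3, 3)],
)

LIMIT_VALUE = 20  # the constant limit from the question

# Running sums per id in sub_id order, then keep the row whose running
# revenue is the smallest one above the limit.
crossing = conn.execute(
    """
    WITH sub AS (
        SELECT id,
               sub_id,
               SUM(revenue) OVER (PARTITION BY id ORDER BY sub_id) AS run_rev,
               SUM(pay)     OVER (PARTITION BY id ORDER BY sub_id) AS run_pay
        FROM tbl
    )
    SELECT id, sub_id, run_rev, run_pay
    FROM sub s
    WHERE s.run_rev = (SELECT MIN(x.run_rev)
                       FROM sub x
                       WHERE x.id = s.id
                         AND x.run_rev > :lim)
    ORDER BY id
    """,
    {"lim": LIMIT_VALUE},
).fetchall()

for row in crossing:
    print(row)
```

The two rows returned correspond to SUB_ID 2 for ID 100 (total Pay 17) and SUB_ID 4 for ID 101 (total Pay 18), as worked out by hand in the question.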

SQL query to return rows in multiple groups

I have a SQL table with data in the following format:
REF FIRSTMONTH NoMONTHS VALUE
--------------------------------
1 2 1 100
2 4 2 240
3 5 4 200
Each row is a quoted value which should be delivered starting in FIRSTMONTH and split evenly over NoMONTHS months.
I want to calculate, for each month, the SUM of the potential deliveries from the quoted values.
As such I need to return the following result from a SQL server query:
MONTH TOTAL
------------
2 100 <- should be all of REF=1
4 120 <- should be half of REF=2
5 170 <- should be half of REF=2 and quarter of REF=3
6 50 <- should be quarter of REF=3
7 50 <- should be quarter of REF=3
8 50 <- should be quarter of REF=3
How can I do this?
You are trying to extract data from what should be a many-to-many relationship.
You need 3 tables. You should be able to write a JOIN or GROUP BY select statement from there. The tables below don't use the same data values as yours, and are merely intended for a structural example.
**Month**
REF Month Value
---------------------
1 2 100
2 3 120
etc.
**MonthGroup**
REF
---
1
2
**MonthsToMonthGroups**
MonthREF MonthGroupREF
------------------
1 1
2 2
2 3
The first part of this query gets a set of numbers between the start and the end of the valid month values.
The second part takes each quoted value and divides it into its monthly amount.
Then it is simply a case of grouping by month and adding up all of the monthly amounts.
select
number as month, sum(amount)
from
(
select number
from master..spt_values
where type='p'
and number between (select min(firstmonth) from yourtable)
and (select max(firstmonth+nomonths-1) from yourtable)
) numbers
inner join
(select
firstmonth,
firstmonth+nomonths-1 as lastmonth,
value / nomonths as amount
from yourtable) monthly
on numbers.number between firstmonth and lastmonth
group by number
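The query above depends on SQL Server's master..spt_values table for its list of numbers. As a portable sketch of the same three steps, the example below swaps that table for a recursive CTE and runs it with Python's sqlite3 module (both are assumptions for illustration, not part of the answer). It also multiplies by 1.0 so the division is not truncated when a value does not split evenly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable "
    "(ref INTEGER, firstmonth INTEGER, nomonths INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?)",
    [(1, 2, 1, 100), (2, 4, 2, 240), (3, 5, 4, 200)],
)

monthly_totals = conn.execute(
    """
    WITH RECURSIVE numbers(number) AS (
        -- Step 1: every month from the first start to the last end.
        SELECT MIN(firstmonth) FROM yourtable
        UNION ALL
        SELECT number + 1
        FROM numbers
        WHERE number < (SELECT MAX(firstmonth + nomonths - 1) FROM yourtable)
    )
    -- Steps 2 and 3: spread each value over its months, then sum per month.
    SELECT n.number AS month,
           SUM(1.0 * t.value / t.nomonths) AS total
    FROM numbers n
    JOIN yourtable t
      ON n.number BETWEEN t.firstmonth AND t.firstmonth + t.nomonths - 1
    GROUP BY n.number
    ORDER BY n.number
    """
).fetchall()

for month, total in monthly_totals:
    print(month, total)
```

Month 3 does not appear in the result because no quote covers it and the inner join drops it, which matches the expected output in the question.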

Calculating sum of activity

I have a table with the following kind of information:
activity cost order date other information
10 1 100 --
20 2 100
10 1 100
30 4 100
40 4 100
20 2 100
40 4 100
20 2 100
10 1 101
10 1 101
20 1 101
My requirement is to get the sum of the costs of all distinct activities in a work order.
For example, for order 100 the total is 1 + 2 + 4 + 4 = 11:
1 (for activity 10)
2 (for activity 20)
4 (for activity 30)
4 (for activity 40)
I tried the GROUP BY query below, but it takes a long time to calculate. There are more than 100,000 (1 lakh) records in the warehouse. Is there a more efficient way to do this?
SELECT SUM(MIN(cost))
FROM COST_WAREHOUSE a
WHERE order = 100
GROUP BY (order, ACTIVITY)
You can use the following query to get the sum of cost over the distinct tuples of (activity, order, cost):
SELECT SUM(COST)
FROM
(SELECT DISTINCT activity, order, cost
FROM COST_WAREHOUSE WHERE order = 100) AS A
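Here is a minimal sketch of the DISTINCT approach on the question's sample rows, run through Python's sqlite3 module (an assumption for illustration). Because ORDER is a reserved word, the column is renamed to the hypothetical order_no here, and total_cost is a made-up helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# order_no stands in for the question's "order" column, which is reserved.
conn.execute(
    "CREATE TABLE cost_warehouse (activity INTEGER, cost INTEGER, order_no INTEGER)"
)
conn.executemany(
    "INSERT INTO cost_warehouse VALUES (?, ?, ?)",
    [(10, 1, 100), (20, 2, 100), (10, 1, 100), (30, 4, 100), (40, 4, 100),
     (20, 2, 100), (40, 4, 100), (20, 2, 100),
     (10, 1, 101), (10, 1, 101), (20, 1, 101)],
)


def total_cost(order_no):
    """Sum cost once per distinct (activity, order_no, cost) tuple."""
    row = conn.execute(
        """
        SELECT SUM(cost)
        FROM (SELECT DISTINCT activity, order_no, cost
              FROM cost_warehouse
              WHERE order_no = ?) AS a
        """,
        (order_no,),
    ).fetchone()
    return row[0]


total_100 = total_cost(100)
total_101 = total_cost(101)
print(total_100, total_101)
```

For order 100 this gives 1 + 2 + 4 + 4 = 11, the figure from the question. If one activity could appear with two different costs, DISTINCT would count both, unlike the MIN(cost) attempt in the question.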