SQL Server - Manipulating Values to Create a String

I am trying to populate the column String_To_Use the way it is shown in the fifth column from the left, using values from the ID column and displaying ranges with "-". The code below produces the last column, string_to_use, incorrectly.
select
t.*,
(case
when Checking_id = -2
then min(id) over (partition by grp) + '-' + max(id) over (partition by grp)
else id
end) as string_to_use
from
(select
t.*,
sum(case when Checking_id = -2 then 1 else 0 end) over (partition by id) as grp
from
t) t
order by
id;
Output:
ID    Number  ID    IndexColumn  String_To_Use  Checking_id  grp  string_to_use
--------------------------------------------------------------------------------
0000  1       0000  1            0000-1130      -2           1    0000-1210
1000  2       1000  2            0000-1130      -2           1    0000-1210
1020  3       1020  3            0000-1130      -2           1    0000-1210
1130  4       1130  4            0000-1130      -2           1    0000-1210
1198  5       NULL  9999         NULL           NULL         0    NULL
1199  6       1199  5            1199-1210      -2           1    0000-1210
1210  7       1210  6            1199-1210      -2           1    0000-1210
1240  8       NULL  9999         NULL           NULL         0    NULL
1250  9       NULL  9999         NULL           NULL         0    NULL
1260  10      1260  7            1260           7            0    1260
1261  11      NULL  9999         NULL           NULL         0    NULL
1280  12      NULL  9999         NULL           NULL         0    NULL
1296  13      NULL  9999         NULL           NULL         0    NULL
1298  14      NULL  9999         NULL           NULL         0    NULL
1299  15      1299  8            1299           8            0    1299
1501  16      NULL  9999         NULL           NULL         0    NULL
Can someone please help me with this? Thank you!

Have a look at the following query.
What I do is create groups based on the difference between Number and IndexColumn,
i.e. each partition covers a consecutive run of rows until a record with indexcol = 9999 is hit.
After that I take the min and max id values of each group and concatenate them with '-'.
Here is a db-fiddle link:
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=7bd4d3a489600b58740e2f82a478726b
In the end the query looks like this:
create table t(ID varchar(10),Number1 int,ID2 varchar(10),indexcol int,String_To_Use varchar(100))
insert into t
select *
from (values
('0000',1 ,'0000',1 ,'0000-1130')
,('1000',2 ,'1000',2 ,'0000-1130')
,('1020',3 ,'1020',3 ,'0000-1130')
,('1130',4 ,'1130',4 ,'0000-1130')
,('1198',5 ,NULL ,9999,NULL )
,('1199',6 ,'1199',5 ,'1199-1210')
,('1210',7 ,'1210',6 ,'1199-1210')
,('1240',8 ,NULL ,9999,NULL )
,('1250',9 ,NULL ,9999,NULL )
,('1260',10,'1260',7 ,'1260' )
,('1261',11,NULL ,9999,NULL )
,('1280',12,NULL ,9999,NULL )
,('1296',13,NULL ,9999,NULL )
,('1298',14,NULL ,9999,NULL )
,('1299',15,'1299',8 ,'1299' )
,('1501',16,NULL ,9999,NULL )
)t(id,number1,id2,indexcol,string_to_use)
select *
,max(case when indexcol <> 9999 then id end) over(partition by Number1-indexcol)as max_val
,case when max(case when indexcol <> 9999 then id end) over(partition by Number1-indexcol)
= min(case when indexcol <> 9999 then id end) over(partition by Number1-indexcol)
then max(case when indexcol <> 9999 then id end) over(partition by Number1-indexcol)
else min(case when indexcol <> 9999 then id end) over(partition by Number1-indexcol)
+'-'+
max(case when indexcol <> 9999 then id end) over(partition by Number1-indexcol)
end as computed_string_to_use
from t
order by Number1
+------+---------+------+----------+---------------+---------+------------------------+
| ID   | Number1 | ID2  | indexcol | String_To_Use | max_val | computed_string_to_use |
+------+---------+------+----------+---------------+---------+------------------------+
| 0000 | 1       | 0000 | 1        | 0000-1130     | 1130    | 0000-1130              |
| 1000 | 2       | 1000 | 2        | 0000-1130     | 1130    | 0000-1130              |
| 1020 | 3       | 1020 | 3        | 0000-1130     | 1130    | 0000-1130              |
| 1130 | 4       | 1130 | 4        | 0000-1130     | 1130    | 0000-1130              |
| 1198 | 5       |      | 9999     |               |         |                        |
| 1199 | 6       | 1199 | 5        | 1199-1210     | 1210    | 1199-1210              |
| 1210 | 7       | 1210 | 6        | 1199-1210     | 1210    | 1199-1210              |
| 1240 | 8       |      | 9999     |               |         |                        |
| 1250 | 9       |      | 9999     |               |         |                        |
| 1260 | 10      | 1260 | 7        | 1260          | 1260    | 1260                   |
| 1261 | 11      |      | 9999     |               |         |                        |
| 1280 | 12      |      | 9999     |               |         |                        |
| 1296 | 13      |      | 9999     |               |         |                        |
| 1298 | 14      |      | 9999     |               |         |                        |
| 1299 | 15      | 1299 | 8        | 1299          | 1299    | 1299                   |
| 1501 | 16      |      | 9999     |               |         |                        |
+------+---------+------+----------+---------------+---------+------------------------+
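As a hedged alternative sketch (assuming the same sample table t created above), the groups can also be formed with a running count of the 9999 separator rows instead of the Number1 - indexcol difference; the min/max logic then stays the same:
select t.*
      ,case when indexcol = 9999 then null
            when min(case when indexcol <> 9999 then id end) over(partition by grp)
               = max(case when indexcol <> 9999 then id end) over(partition by grp)
            then min(case when indexcol <> 9999 then id end) over(partition by grp)
            else min(case when indexcol <> 9999 then id end) over(partition by grp)
                 +'-'+
                 max(case when indexcol <> 9999 then id end) over(partition by grp)
       end as computed_string_to_use
from (select t.*
            -- running count of the 9999 separator rows seen so far defines the group
            ,sum(case when indexcol = 9999 then 1 else 0 end)
                 over(order by Number1 rows unbounded preceding) as grp
      from t) t
order by Number1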

Related

number duplicate entries in a table

I have the table below.
+-----+-----+-------------------------+
| cid | oid | order_date              |
+-----+-----+-------------------------+
| 1   | 12  | 2020-07-01 13:19:16.235 |
| 1   | 12  | 2020-07-01 13:19:21.549 |
| 1   | 23  | 2020-07-27 13:00:18.446 |
| 1   | 34  | 2021-08-17 09:42:20.778 |
| 1   | 55  | 2022-08-01 13:37:53.340 |
| 1   | 55  | 2022-08-01 13:38:07.564 |
| 1   | 55  | 2022-08-01 13:38:28.201 |
| 1   | 09  | 2022-08-03 10:32:24.202 |
+-----+-----+-------------------------+
I tried the below query.
select
cid,
oid,
order_date,
dense_rank() over (partition by oid order by order_date) as oid_history
from
master.t1
where
cid = 1
order by
order_date asc;
Got the below output.
+-----+-----+-------------------------+-------------+
| cid | oid | order_date              | oid_history |
+-----+-----+-------------------------+-------------+
| 1   | 12  | 2020-07-01 13:19:16.235 | 1           |
| 1   | 12  | 2020-07-01 13:19:21.549 | 2           |
| 1   | 23  | 2020-07-27 13:00:18.446 | 1           |
| 1   | 34  | 2021-08-17 09:42:20.778 | 1           |
| 1   | 55  | 2022-08-01 13:37:53.340 | 1           |
| 1   | 55  | 2022-08-01 13:38:07.564 | 2           |
| 1   | 55  | 2022-08-01 13:38:28.201 | 3           |
| 1   | 09  | 2022-08-03 10:32:24.202 | 1           |
+-----+-----+-------------------------+-------------+
Expected output.
+-----+-----+-------------------------+-------------+
| cid | oid | order_date              | oid_history |
+-----+-----+-------------------------+-------------+
| 1   | 12  | 2020-07-01 13:19:16.235 | 1           |
| 1   | 12  | 2020-07-01 13:19:21.549 | 1           |
| 1   | 23  | 2020-07-27 13:00:18.446 | 2           |
| 1   | 34  | 2021-08-17 09:42:20.778 | 3           |
| 1   | 55  | 2022-08-01 13:37:53.340 | 4           |
| 1   | 55  | 2022-08-01 13:38:07.564 | 4           |
| 1   | 55  | 2022-08-01 13:38:28.201 | 4           |
| 1   | 09  | 2022-08-03 10:32:24.202 | 5           |
+-----+-----+-------------------------+-------------+
Thank you:)
Can you try this one?
select
cid,
oid,
order_date,
dense_rank() over (partition by cid order by oid) as oid_history
from
mytable -- master.t1
where
cid = 1
order by
order_date asc;
+-----+-----+-------------------------+-------------+
| CID | OID | ORDER_DATE              | OID_HISTORY |
+-----+-----+-------------------------+-------------+
| 1   | 12  | 2020-07-01 13:19:16.235 | 1           |
| 1   | 12  | 2020-07-01 13:19:21.549 | 1           |
| 1   | 23  | 2020-07-27 13:00:18.446 | 2           |
| 1   | 34  | 2021-08-17 09:42:20.778 | 3           |
| 1   | 55  | 2022-08-01 13:37:53.340 | 4           |
| 1   | 55  | 2022-08-01 13:38:07.564 | 4           |
| 1   | 55  | 2022-08-01 13:38:28.201 | 4           |
+-----+-----+-------------------------+-------------+
Based on your new question, here is the answer:
select
cid,
oid,
order_date,
CONDITIONAL_CHANGE_EVENT( oid ) over (partition by cid order by ORDER_DATE ) + 1 as oid_history
from
mytable -- master.t1
where
cid = 1
order by oid_history;
+-----+-----+-------------------------+-------------+
| CID | OID | ORDER_DATE              | OID_HISTORY |
+-----+-----+-------------------------+-------------+
| 1   | 12  | 2020-07-01 13:19:16.235 | 1           |
| 1   | 12  | 2020-07-01 13:19:21.549 | 1           |
| 1   | 23  | 2020-07-27 13:00:18.446 | 2           |
| 1   | 34  | 2021-08-17 09:42:20.778 | 3           |
| 1   | 55  | 2022-08-01 13:37:53.340 | 4           |
| 1   | 55  | 2022-08-01 13:38:07.564 | 4           |
| 1   | 55  | 2022-08-01 13:38:28.201 | 4           |
| 1   | 09  | 2022-08-03 10:32:24.202 | 5           |
+-----+-----+-------------------------+-------------+
I didn't want to update my answer (my comment explains the reason) but Pankaj already answered, so I also had to share my answer. Now, I'm waiting for another hidden requirement to modify my answer.
From the expected output, it looks like a use-case for conditional_change_event.
with data (cid, oid, order_date) as (
select * from values
(1,12,'2020-07-01 13:19:16.235'::date),
(1,12,'2020-07-01 13:19:21.549'::date),
(1,23,'2020-07-27 13:00:18.446'::date),
(1,34,'2021-08-17 09:42:20.778'::date),
(1,55,'2022-08-01 13:37:53.340'::date),
(1,55,'2022-08-01 13:38:07.564'::date),
(1,55,'2022-08-01 13:38:28.201'::date),
(1,09,'2022-08-03 10:32:24.202'::date)
)select *,
1+conditional_change_event (oid) over (order by cid) as oid_history
from data;
+-----+-----+------------+-------------+
| CID | OID | ORDER_DATE | OID_HISTORY |
+-----+-----+------------+-------------+
| 1   | 12  | 2020-07-01 | 1           |
| 1   | 12  | 2020-07-01 | 1           |
| 1   | 23  | 2020-07-27 | 2           |
| 1   | 34  | 2021-08-17 | 3           |
| 1   | 55  | 2022-08-01 | 4           |
| 1   | 55  | 2022-08-01 | 4           |
| 1   | 55  | 2022-08-01 | 4           |
| 1   | 9   | 2022-08-03 | 5           |
+-----+-----+------------+-------------+
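Note that CONDITIONAL_CHANGE_EVENT is Snowflake-specific. If the same numbering is needed on SQL Server, a hedged sketch using LAG plus a running SUM of change flags (with the question's master.t1 columns) could look like this:
select cid,
       oid,
       order_date,
       -- count a change every time oid differs from the previous oid; the first row counts as the first change
       sum(case when oid = prev_oid then 0 else 1 end)
           over (partition by cid order by order_date rows unbounded preceding) as oid_history
from (select cid,
             oid,
             order_date,
             lag(oid) over (partition by cid order by order_date) as prev_oid
      from master.t1
      where cid = 1) x
order by order_date;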

How to get an interpolated value in SQL Server?

I want to get interpolated values for the NULLs. Interpolation is a statistical method by which related known values are used to estimate an unknown price or potential yield of a security. Interpolation is achieved by using other established values that are located in sequence with the unknown value.
Here is my sample table and code.
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=673fcd5bc250bd272e8b6da3d0eddb90
I want to get this result:
+-----+-------+-------+------------+-------+-------+
| SEQ | cat01 | cat02 | dt_day     | price | coeff |
+-----+-------+-------+------------+-------+-------+
| 1   | 230   | 1     | 2019-01-01 | 16000 | 0     |
| 2   | 230   | 1     | 2019-01-02 | NULL  | 1     |
| 3   | 230   | 1     | 2019-01-03 | 13000 | 0     |
| 4   | 230   | 1     | 2019-01-04 | NULL  | 1     |
| 5   | 230   | 1     | 2019-01-05 | NULL  | 2     |
| 6   | 230   | 1     | 2019-01-06 | NULL  | 3     |
| 7   | 230   | 1     | 2019-01-07 | 19000 | 0     |
| 8   | 230   | 1     | 2019-01-08 | 20000 | 0     |
| 9   | 230   | 1     | 2019-01-09 | 21500 | 0     |
| 10  | 230   | 1     | 2019-01-10 | 21500 | 0     |
| 11  | 230   | 1     | 2019-01-11 | NULL  | 1     |
| 12  | 230   | 1     | 2019-01-12 | NULL  | 2     |
| 13  | 230   | 1     | 2019-01-13 | 23000 | 0     |
| 1   | 230   | 2     | 2019-01-01 | NULL  | 1     |
| 2   | 230   | 2     | 2019-01-02 | NULL  | 2     |
| 3   | 230   | 2     | 2019-01-03 | 12000 | 0     |
| 4   | 230   | 2     | 2019-01-04 | 17000 | 0     |
| 5   | 230   | 2     | 2019-01-05 | 22000 | 0     |
| 6   | 230   | 2     | 2019-01-06 | NULL  | 1     |
| 7   | 230   | 2     | 2019-01-07 | 23000 | 0     |
| 8   | 230   | 2     | 2019-01-08 | 23200 | 0     |
| 9   | 230   | 2     | 2019-01-09 | NULL  | 1     |
| 10  | 230   | 2     | 2019-01-10 | NULL  | 2     |
| 11  | 230   | 2     | 2019-01-11 | NULL  | 3     |
| 12  | 230   | 2     | 2019-01-12 | NULL  | 4     |
| 13  | 230   | 2     | 2019-01-13 | 23000 | 0     |
+-----+-------+-------+------------+-------+-------+
I used the code below, but I think it is incorrect.
The coeff column simply numbers the NULLs in order within each gap.
The code is meant to implement the interpolation:
I tried to find the known values on either side of each gap and divide the difference by the number of missing rows.
However, this code does not work as intended.
WITH ROW_VALUE AS
(
SELECT SEQ
, dt_day
, cat01
, cat02
, price
, ROW_NUMBER() OVER (ORDER BY dt_day) AS sub_seq
FROM (
SELECT SEQ
, cat01
, cat02
, dt_day
, dt_week
, dt_month
, price
FROM temp01
WHERE price IS NOT NULL
)val
)
,STEP_CHANGE AS(
SELECT RV1.SEQ AS id_Start
, RV1.SEQ - 1 AS id_End
, RV1.cat01
, RV1.cat02
, RV1.dt_day
, RV1.price
, (RV2.price - RV1.price)/(RV2.SEQ - RV1.SEQ) AS change1
FROM ROW_VALUE RV1
LEFT JOIN ROW_VALUE RV2 ON RV1.cat01 = RV2.cat01
AND RV1.cat02 = RV2.cat02
AND RV1.SEQ = RV2.SEQ - 1
)
SELECT *
FROM STEP_CHANGE
ORDER BY cat01, cat02, dt_day
Please let me know a good way to fill the NULLs using a linear relationship.
If there is another good approach, please recommend it.
If I assume that you mean linear interpolation between the previous price and the next price based on the number of days that passed, then you can use the following method:
Use window functions to get the next and previous days with prices for each row.
Use window functions or joins to get the prices on those days as well.
Use arithmetic to calculate the linear interpolation.
Your SQL Fiddle uses SQL Server, so I assume that is the database you are using. The code looks like this:
select t.*,
coalesce(t.price,
(tprev.price +
(tnext.price - tprev.price) / datediff(day, prev_price_day, next_price_day) *
datediff(day, t.prev_price_day, t.dt_day)
)
) as imputed_price
from (select t.*,
max(case when price is not null then dt_day end) over (partition by cat01, cat02 order by dt_day asc) as prev_price_day,
min(case when price is not null then dt_day end) over (partition by cat01, cat02 order by dt_day desc) as next_price_day
from temp01 t
) t left join
temp01 tprev
on tprev.cat01 = t.cat01 and
tprev.cat02 = t.cat02 and
tprev.dt_day = t.prev_price_day left join
temp01 tnext
on tnext.cat01 = t.cat01 and
tnext.cat02 = t.cat02 and
tnext.dt_day = t.next_price_day
order by cat01, cat02, dt_day;
Here is a db<>fiddle.
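As a quick sanity check of the arithmetic, take the gap in group (230, 1) of the sample data between 13000 on 2019-01-03 and 19000 on 2019-01-07; the 2019-01-05 row should come out as 16000:
-- 13000 + (19000 - 13000) / 4 days * 2 elapsed days = 16000
select 13000
       + (19000 - 13000) / datediff(day, '2019-01-03', '2019-01-07')
       * datediff(day, '2019-01-03', '2019-01-05') as imputed_price;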

How to fill NULL values using min-max in SQL Server?

I want to fill the leading and trailing values using a min-max comparison.
Here is my sample table.
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=322aafeb7970e25f9a85d0cd2f6c00d6
For example, this is the temp01 table.
I want to fill the NULL values at the start and the end of each group in temp01.
So I fill
price = [NULL, NULL, 13000]
to
price = [12000, 12000, 13000]
because 12000 is the minimum value in group [230, 1]. Likewise, the trailing NULLs are filled with the maximum value of the [cat01, cat02] group.
+-----+-------+-------+------------+-------+
| SEQ | cat01 | cat02 | dt_day     | price |
+-----+-------+-------+------------+-------+
| 1   | 230   | 1     | 2019-01-01 | NULL  |
| 2   | 230   | 1     | 2019-01-02 | NULL  |
| 3   | 230   | 1     | 2019-01-03 | 13000 |
...
| 11  | 230   | 1     | 2019-01-11 | NULL  |
| 12  | 230   | 1     | 2019-01-12 | NULL  |
| 1   | 230   | 2     | 2019-01-01 | NULL  |
| 2   | 230   | 2     | 2019-01-02 | NULL  |
| 3   | 230   | 2     | 2019-01-03 | 12000 |
...
| 12  | 230   | 2     | 2019-01-11 | NULL  |
| 13  | 230   | 2     | 2019-01-12 | NULL  |
[result]
+-----+-------+-------+------------+-------+
| SEQ | cat01 | cat02 | dt_day     | price |
+-----+-------+-------+------------+-------+
| 1   | 230   | 1     | 2019-01-01 | 12000 | --START
| 2   | 230   | 1     | 2019-01-02 | 12000 |
| 3   | 230   | 1     | 2019-01-03 | 13000 |
| 4   | 230   | 1     | 2019-01-04 | 12000 |
| 5   | 230   | 1     | 2019-01-05 | NULL  |
| 6   | 230   | 1     | 2019-01-06 | NULL  |
| 7   | 230   | 1     | 2019-01-07 | 19000 |
| 8   | 230   | 1     | 2019-01-08 | 20000 |
| 9   | 230   | 1     | 2019-01-09 | 21500 |
| 10  | 230   | 1     | 2019-01-10 | 21500 |
| 11  | 230   | 1     | 2019-01-11 | 21500 |
| 12  | 230   | 1     | 2019-01-12 | 21500 |
| 13  | 230   | 1     | 2019-01-13 | 21500 | --END
| 1   | 230   | 2     | 2019-01-01 | 12000 | --START
| 2   | 230   | 2     | 2019-01-02 | 12000 |
| 3   | 230   | 2     | 2019-01-03 | 12000 |
| 4   | 230   | 2     | 2019-01-04 | 17000 |
| 5   | 230   | 2     | 2019-01-05 | 22000 |
| 6   | 230   | 2     | 2019-01-06 | NULL  |
| 7   | 230   | 2     | 2019-01-07 | 23000 |
| 8   | 230   | 2     | 2019-01-08 | 23200 |
| 9   | 230   | 2     | 2019-01-09 | NULL  |
| 10  | 230   | 2     | 2019-01-10 | 24000 |
| 11  | 230   | 2     | 2019-01-11 | 24000 |
| 12  | 230   | 2     | 2019-01-12 | 24000 |
| 13  | 230   | 2     | 2019-01-13 | 24000 | --END
Please let me know a good way to fill these NULL values.
Find the min() and max() of price grouped by cat01, cat02.
Also find the min and max seq of the rows where price is not null.
After that it is simply an inner join to your table and an update where price is null.
with val as
(
select cat01, cat02,
min_price = min(price),
max_price = max(price),
min_seq = min(case when price is not null then seq end),
max_seq = max(case when price is not null then seq end)
from temp01
group by cat01, cat02
)
update t
set price = case when t.seq < v.min_seq then min_price
when t.seq > v.max_seq then max_price
end
FROM temp01 t
inner join val v on t.cat01 = v.cat01
and t.cat02 = v.cat02
where t.price is null
dbfiddle
EDIT: returning the price as a new column in a SELECT query
with val as
(
select cat01, cat02, min_price = min(price), max_price = max(price),
min_seq = min(case when price is not null then seq end),
max_seq = max(case when price is not null then seq end)
from temp01
group by cat01, cat02
)
select t.*,
new_price = coalesce(t.price,
case when t.seq < v.min_seq then min_price
when t.seq > v.max_seq then max_price
end)
FROM temp01 t
left join val v on t.cat01 = v.cat01
and t.cat02 = v.cat02
Updated dbfiddle
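If you prefer to skip the join back to temp01 altogether, a hedged alternative sketch of the same leading/trailing fill using only window functions (same temp01 columns assumed) would be:
-- leading NULLs (before the first non-NULL seq) get the group minimum, trailing NULLs get the group maximum
select t.*,
       new_price = coalesce(t.price,
                            case when t.seq < min(case when t.price is not null then t.seq end)
                                              over (partition by t.cat01, t.cat02)
                                 then min(t.price) over (partition by t.cat01, t.cat02)
                                 when t.seq > max(case when t.price is not null then t.seq end)
                                              over (partition by t.cat01, t.cat02)
                                 then max(t.price) over (partition by t.cat01, t.cat02)
                            end)
from temp01 t
order by t.cat01, t.cat02, t.seq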

convert the input table to mentioned output table by using SQL

I want to convert the input table to the output table shown below using a SQL statement. Can anyone please help me with this?
Input table
+-------------+-----------+----------+
| start_value | end_value | interval |
+-------------+-----------+----------+
| 0           | 120       | 10       |
| 1           | 150       | 50       |
+-------------+-----------+----------+
OUTPUT
+-------------+-----------+------------------+
| start_value | end_value | next_start_value |
+-------------+-----------+------------------+
| 0           | 120       | 0                |
| 0           | 120       | 10               |
| 0           | 120       | 20               |
| 0           | 120       | 30               |
+-------------+-----------+------------------+
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE input ( start_value, end_value, "INTERVAL" ) AS
SELECT 0, 120, 10 FROM DUAL UNION ALL
SELECT 1, 150, 50 FROM DUAL;
Query 1:
WITH output ( rn, start_value, end_value, "INTERVAL", next_start_value ) AS (
SELECT ROWNUM, i.*, start_value FROM input i
UNION ALL
SELECT rn,
start_value,
end_value,
"INTERVAL",
next_start_value + "INTERVAL"
FROM output
WHERE next_start_value + "INTERVAL" <= end_value
)
SELECT start_value, end_value, next_start_value
FROM output
ORDER BY rn, next_start_value
Results:
| START_VALUE | END_VALUE | NEXT_START_VALUE |
|-------------|-----------|------------------|
| 0           | 120       | 0                |
| 0           | 120       | 10               |
| 0           | 120       | 20               |
| 0           | 120       | 30               |
| 0           | 120       | 40               |
| 0           | 120       | 50               |
| 0           | 120       | 60               |
| 0           | 120       | 70               |
| 0           | 120       | 80               |
| 0           | 120       | 90               |
| 0           | 120       | 100              |
| 0           | 120       | 110              |
| 0           | 120       | 120              |
| 1           | 150       | 1                |
| 1           | 150       | 51               |
| 1           | 150       | 101              |
As MTD's answer does not fully match the expected output structure, I'd suggest this altered answer based on their approach:
WITH output (start_value, end_value, "INTERVAL", next_start_value) AS (
SELECT start_value, end_value, "INTERVAL", start_value AS next_start_value
FROM input
UNION ALL
SELECT start_value,
end_value,
"INTERVAL",
next_start_value + "INTERVAL" AS next_start_value
FROM output
WHERE next_start_value + "INTERVAL" <= end_value
)
SELECT start_value, end_value, next_start_value
FROM output
ORDER BY 1, 2, 3;
EDIT: Adding the query result:
START_VALUE | END_VALUE | NEXT_START_VALUE
------------+-----------+-----------------
0           | 120       | 0
0           | 120       | 10
0           | 120       | 20
0           | 120       | 30
0           | 120       | 40
0           | 120       | 50
0           | 120       | 60
0           | 120       | 70
0           | 120       | 80
0           | 120       | 90
0           | 120       | 100
0           | 120       | 110
0           | 120       | 120
1           | 150       | 1
1           | 150       | 51
1           | 150       | 101
This is the SQL Fiddle for it.
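Since the rest of this thread targets SQL Server, a hedged translation of the same recursive CTE into T-SQL (assuming an identical input table, with the INTERVAL column bracket-quoted to be safe) might look like this:
with output (start_value, end_value, [INTERVAL], next_start_value) as (
    -- anchor: one row per input row, starting at start_value
    select start_value, end_value, [INTERVAL], start_value
    from input
    union all
    -- recursive step: keep adding the interval while we stay within end_value
    select start_value, end_value, [INTERVAL], next_start_value + [INTERVAL]
    from output
    where next_start_value + [INTERVAL] <= end_value
)
select start_value, end_value, next_start_value
from output
order by start_value, end_value, next_start_value;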

How to calculate a ratio and cater for division by zero in SQL

I need to produce a report from the data set below. The report is supposed to show, for each day, the sum of columns P, U, D, F and M, as well as a ratio M / (P + U), in aggregate form.
I'm having trouble with the ratio. I'm not sure how to cater for the division by zero.
TYP | TIMESTAMP               | P    | U   | D    | F | M
----------------------------------------------------------
L   | 2012-04-27 15:47:02.000 | 0    | 949 | 0    | 0 | 949
L   | 2012-04-27 15:48:18.000 | 0    | 949 | 0    | 0 | 949
L   | 2012-04-30 17:15:01.000 | 0    | 0   | 4051 | 0 | 0
L   | 2012-04-30 17:44:44.000 | 0    | 984 | 5    | 0 | 986
L   | 2012-05-02 11:12:01.000 | 2117 | 0   | 0    | 0 | 0
L   | 2012-05-02 11:12:09.000 | 149  | 4   | 210  | 0 | 157
L   | 2012-05-02 11:12:11.000 | 77   | 0   | 30   | 0 | 43
My query:
SELECT
CONVERT(date,TIMESTAMP,112) As 'DAY',
SUM(P) As PAS,
SUM(U) As UFR,
SUM(D) As DES,
SUM(F) As FIR,
SUM(M) As MOL,
[M%] = ISNULL( (SUM(M) / NULLIF( SUM(P)+SUM(U), 0 ) )*100, 0)
FROM DATASET
GROUP BY CONVERT(date,TIMESTAMP,112) ORDER BY CONVERT(date,TIMESTAMP,112) DESC
UPDATE: this is the report
DAY        | PAS  | UFR  | DES  | FIR | MOL  | M%
---------------------------------------------------
2012-05-02 | 2343 | 4    | 240  | 0   | 200  | 0
2012-04-30 | 0    | 984  | 4056 | 0   | 986  | 100
2012-04-27 | 0    | 1898 | 0    | 0   | 1898 | 100
The problem is that you are not accounting for integer division.
Select Cast([Timestamp] As Date) As 'Day'
, Sum(P) As Pas
, Sum(U) As Ufr
, Sum(D) As Des
, Sum(F) As Fir
, Sum(M) As Mol
, IsNull( (Sum(M) / NullIf( Sum(P) * 1.0000 + Sum(U), 0 ) ) *100, 0) As [M%]
From Dataset
Group By Cast([Timestamp] As Date)
Order By [Day] Desc
SQL Fiddle version
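To see the pitfall in isolation, compare plain integer division with the * 1.0000 version using the numbers from the 2012-05-02 row of the report:
select (200 / (2343 + 4)) * 100            as int_ratio,     -- 0: the quotient is truncated before the * 100
       (200 / ((2343 + 4) * 1.0000)) * 100 as decimal_ratio; -- roughly 8.52, what the report should show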