behaviour of decode in oracle when search is null - sql

I was reading the Oracle decode() documentation. As far as I know, when calling decode(expr, search1, value1, search2, value2, ...), Oracle casts expr, search2, search3, etc. to the type of search1 and compares them.
So if search1 is NULL then what will search2, search3, etc. be cast to?
Example:
create table sc(a date, b varchar2(256));
insert into sc values(
to_date('2010-01-01 11:22:33', 'yyyy-mm-dd hh24:mi:ss'),
'2010-01-01 11:22:33'
);
select decode(
to_date('2010-01-01 11:22:33', 'yyyy-mm-dd hh24:mi:ss'),
null,
1,
b,
123,
a,
456
)
from sc;
Why is the result 456 rather than 123?

EDIT:
When the first search value is NULL or of type CHAR, all values are converted to VARCHAR2 and compared as strings. So always compare equal types, and do not use NULL as the first search value unless you are comparing strings:
select decode(
to_date('2010-01-01 11:22:33', 'yyyy-mm-dd hh24:mi:ss'),
to_date('2010-01-01 11:22:31', 'yyyy-mm-dd hh24:mi:ss'),
0,
null,
1,
to_char(to_date('2010-01-01 11:22:33', 'yyyy-mm-dd hh24:mi:ss')),
2,
to_date(b, 'yyyy-mm-dd hh24:mi:ss'),
123,
a,
456
)
from sc;
If the first search value is NULL, the date is converted to a string using the default date representation (which can differ from the format used in b) and compared with b. If you want to see the default, use this:
select to_char(to_date('2010-01-01 11:22:33', 'yyyy-mm-dd hh24:mi:ss')),b from sc
If we read the Oracle Documentation:
If the first result has the datatype CHAR or if the first result is
null, then Oracle converts the return value to the datatype VARCHAR2.
The same happens with the search values: if the first search value is NULL, everything is converted to a string. You can see it here:
SELECT DECODE (1, NULL, 1, '01', 2, '1 ', 3, '1', 4, 1, 5) FROM DUAL;
Now change the NULL to a number:
SELECT DECODE (1, 5, 1, '01', 2, '1 ', 3, '1', 4, 1, 5) FROM DUAL;
What will this give?
SELECT DECODE (TO_DATE ('2010-01-01 11:22:33', 'yyyy-mm-dd hh24:mi:ss'),
NULL, 1,
TO_DATE ('2010-01-01 15:22:32', 'yyyy-mm-dd hh24:mi:ss'), 2,
'3')
FROM DUAL
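For reference, a quick way to see what the implicit conversion does to those two dates (assuming a typical default NLS_DATE_FORMAT of DD-MON-RR, which drops the time portion):
SELECT TO_CHAR(TO_DATE('2010-01-01 11:22:33', 'yyyy-mm-dd hh24:mi:ss')) AS expr_as_char,
       TO_CHAR(TO_DATE('2010-01-01 15:22:32', 'yyyy-mm-dd hh24:mi:ss')) AS search_as_char
FROM DUAL;
With that default both render as '01-JAN-10', so the DECODE above would typically return 2 even though the two timestamps differ.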

To give you a better understanding
SELECT DECODE( 1, NULL, 1, 33, '1', 44 ) FROM DUAL;
will give you 44 because it cannot find 1 among the search values and falls back to the default.
( 1, -> Search expression
NULL, 1, -> Find and replace
33, 1, -> Find and replace
44 ) -> default
44
Also
SELECT DECODE( 1, NULL, 22, 1, 33, '1', 44 ) FROM DUAL;
will give you 33 because it finds 1 among the search values and returns the corresponding result (33).
( 1, -> Search expression
NULL, 22, -> Find and replace
1, 33, -> Find and replace
1, 44 ) -> Find and replace and no default
33
Also
SELECT DECODE( 1, NULL, 22, 2, 33, '2', 44 ) FROM DUAL;
will give you NULL because it cannot find 1 among the search values and there is no default to fall back to.
( 1, -> Search expression
NULL, 22, -> Find and replace
2, 33, -> Find and replace
2, 44 ) -> Find and replace and no default
(null)

Related

Averaging values and getting standard devs from database to build graph

Tricky to explain, so I'll shrink the info down to a minimum.
First, my ultimate goal: I want to take users who trialed a product, determine how that product affected a value as a percentage of their average baseline, and then average all of those percentages (with standard deviations).
I have a database with a table that has a user_id, a value, and a date:
user_id : int
value   : int
date    : int (epoch milliseconds)
I then have a second table which indicates when a trial began and ended for a user, and the product they were using for that trial:
user_id    : int
start_date : int (epoch milliseconds)
end_date   : int (epoch milliseconds)
product_id : int
What I want to do is gather all the user's trials for one product type, and for each user that participated get a baseline value and their percent change each day. Then take all these percentages and average them and get a standard deviation for each day.
One problem is that date needs to be converted into days since start_date, so anything within the first 24 hours after the start date is lumped in as day 0, the next 24 hours as day 1, and so forth. I will then be averaging the percentages for each day.
Not every day was recorded for every user, so some users will have several missing days; I can't rely on consecutive rows and need to mark each reading with its day offset from the start.
The start_dates also differ from user to user.
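Since both dates are epoch milliseconds, the day bucket is just integer math; a minimal sketch of the expression (using the sample table and column names given further down, 86,400,000 ms per day):
select v.user_id,
       floor((v.date - t.start_date) / 86400000.0)::int as day_from_start,  -- 0 = first 24h, 1 = next 24h, ...
       v.value
from "values" v
join trials t on t.user_id = v.user_id
where v.date between t.start_date and t.end_date;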
So the graph will look like this:
[picture of the intended graph]
I would prefer to do as much of it in sql as possible, but the rest will be in Golang.
I was thinking about grabbing each trial, where each trial would have an array of results; I would then iterate over each trial and over its results, picking out day 0, day 1, day 2, and so on, saving these into their own arrays which I would then average. Everything starts getting super messy though, as in this semi-pseudocode:
// semi-pseudocode; error handling omitted
rows, _ := db.Query("select user_id, start_date from trials where product_id = $1", productId)
// extract trials from rows into a slice of trials
for _, trial := range trials {
	// derive leadingAvgStart (start of the baseline window) from trial.StartDate
	var baseline float64
	db.QueryRow("select avg(value) from results where user_id = $1 and date between $2 and $3",
		trial.UserId, leadingAvgStart, trial.StartDate).Scan(&baseline)
	// now we have the baseline for the user
	resultRows, _ := db.Query("select value, date from results where user_id = $1 and date between $2 and $3",
		trial.UserId, trial.StartDate, trial.EndDate)
	// extract the results into an array
	// convert dates into days since the start date
	// ...? It just starts getting ugly and I believe there has to be a better way
}
How can I do most of the heavy lifting with sql?
create table users (id int PRIMARY KEY, name text);
create table products (id int PRIMARY KEY, name text);
create table "values" (
id int PRIMARY KEY
, user_id int REFERENCES users(id)
, value int
, date numeric
);
create table trials (
id int PRIMARY KEY
, user_id int REFERENCES users(id)
, start_date numeric
, end_date numeric
, product_id int REFERENCES products(id)
);
INSERT INTO users (id, name ) VALUES
(1,'John'),
(2,'Jane'),
(3,'Billy'),
(4,'Miranda');
INSERT INTO products (id, name ) VALUES
(1, 'pill A'),
(2, 'pill B'),
(3, 'pill C'),
(4, 'exercise bal'),
(5, 'diet plan');
INSERT INTO trials (id,user_id,start_date,end_date,product_id) VALUES
(1, 1, 1667896408000, 1668099442000, 1),
(2, 1, 1667896408000, 1668099442000, 2),
(3, 2, 1667576960000, 1668074401000, 3),
(4, 3, 1667896408000, 1668099442000, 1);
INSERT INTO "values" (id, user_id, value, date) VALUES
(38, 1, 7, 1668182428000),
(1, 1, 7, 1668099442000),
(2, 1, 8, 1668074401000),
(3, 1, 8, 1668012300000),
(4, 1, 6, 1668011197000),
(5, 1, 6, 1667978268000),
(6, 1, 9, 1667925002000),
(7, 1, 9, 1667896408000),
(8, 1, 4, 1667838601000),
(9, 1, 6, 1667803049000),
(10, 1, 7, 1667576960000),
(12, 1, 5, 1667546428000),
(13, 1, 8, 1667490149000),
(14, 2, 8, 1668182428000),
(15, 2, 7, 1668099442000),
(16, 2, 8, 1668074401000),
(17, 2, 9, 1668012300000),
(18, 2, 6, 1668011197000),
(19, 2, 6, 1667978268000),
(20, 2, 5, 1667925002000),
(21, 2, 9, 1667896408000),
(22, 2, 4, 1667803049000),
(23, 2, 4, 1667576960000),
(24, 2, 5, 1667546428000),
(25, 2, 9, 1667490149000),
(26, 3, 6, 1668182428000),
(27, 3, 7, 1668099442000),
(28, 3, 8, 1668074401000),
(29, 3, 9, 1668011197000),
(30, 3, 6, 1667978268000),
(31, 3, 9, 1667925002000),
(32, 3, 9, 1667896408000),
(33, 3, 8, 1667838601000),
(34, 3, 6, 1667803049000),
(35, 3, 4, 1667576960000),
(36, 3, 5, 1667546428000),
(37, 3, 6, 1667490149000);
OK, I figured it out: basically I do two inner-join subqueries, treat those as tables, inner join those, and use a GROUP BY to average:
select
    query1.product_uuid,
    query1.days,
    AVG(query1.value / query2.avg) as avg_percent
from (
    select
        DATE_PART('day', to_timestamp("values".date / 1000)::date - trials.start_date) as days,
        trials.uuid as trial_uuid,
        trials.product_uuid,
        "values".value as value
    from "values"
    inner join product_trials as trials ON "values".user_id = trials.user_id
    where "values".source = 'Trued'
      and "values".use = 'true'
      AND trials.start_date IS NOT NULL
      AND trials.end_date IS NOT NULL
      AND to_timestamp("values".date / 1000)::date > trials.start_date
      AND to_timestamp("values".date / 1000)::date < trials.end_date
) as query1
inner join (
    select
        "values".user_id,
        trials.uuid as trial_uuid,
        AVG(value)
    from "values"
    inner join product_trials as trials ON "values".user_id = trials.user_id
    where source = 'Trued'
      and use = true
      AND trials.start_date IS NOT NULL
      AND trials.end_date IS NOT NULL
      AND to_timestamp("values".date / 1000)::date < trials.start_date
      AND DATE_PART('day', to_timestamp("values".date / 1000)::date - trials.start_date) > -20
    GROUP BY "values".user_id, trials.uuid
) as query2 ON query1.trial_uuid = query2.trial_uuid
where query2.avg > 0
GROUP BY query1.days, query1.product_uuid
ORDER BY query1.product_uuid, query1.days
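For what it's worth, the query above only averages and uses the production column names (uuid, source, use) rather than the sample schema posted earlier. A rough equivalent sketched against that sample schema, including the standard deviation the goal mentions, might look like the following; the 20-day leading baseline window, the per-day averaging of multiple readings, and the strict start/end bounds are all assumptions:
with per_day as (
    select t.product_id,
           t.id as trial_id,
           floor((v.date - t.start_date) / 86400000.0)::int as day_index,
           avg(v.value) as day_value               -- average multiple readings within a day
    from trials t
    join "values" v
      on v.user_id = t.user_id
     and v.date >= t.start_date
     and v.date <  t.end_date
    group by t.product_id, t.id, day_index
),
baseline as (
    select t.id as trial_id,
           avg(v.value) as base_value              -- leading average before the trial started
    from trials t
    join "values" v
      on v.user_id = t.user_id
     and v.date <  t.start_date
     and v.date >= t.start_date - 20 * 86400000    -- assumed 20-day window
    group by t.id
)
select d.product_id,
       d.day_index,
       avg(d.day_value / b.base_value)         as avg_percent,
       stddev_samp(d.day_value / b.base_value) as stddev_percent
from per_day d
join baseline b on b.trial_id = d.trial_id
where b.base_value > 0
group by d.product_id, d.day_index
order by d.product_id, d.day_index;
stddev_samp is the sample standard deviation; swap in stddev_pop if you want the population version.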

Oracle sql using previous rows data in calculations

I have a table T1 with six columns and want to derive two new columns using a SELECT query.
Here's T1 with the two extra columns (STOCK, WAUC) that I want to compute:
CREATE TABLE T1 (MOUVEMENT NUMBER(2), OPERATION VARCHAR2(5), ITEM VARCHAR2(5), INPUT_QTY NUMBER(6, 2), OUTPUT_QTY NUMBER(6, 2), INPUT_PRICE NUMBER(6, 2), STOCK NUMBER(6, 2), WAUC NUMBER(6, 2));
INSERT ALL
INTO T1 VALUES(1, 'I', 'A', 1500, 0, 5, 1500, 5)
INTO T1 VALUES(2, 'I', 'A', 700, 0, 6, 2200, 5.31)
INTO T1 VALUES(3, 'O', 'A', 0, 800, 0, 1400, 5.31)
INTO T1 VALUES(4, 'I', 'A', 1000, 0, 5, 2400, 5.18)
INTO T1 VALUES(5, 'O', 'A', 0, 500, 0, 1900, 5.18)
INTO T1 VALUES(6, 'I', 'A', 1000, 0, 7, 2900, 5.8 )
INTO T1 VALUES(7, 'I', 'A', 2000, 0, 7, 4900, 6.28)
INTO T1 VALUES(8, 'I', 'A', 5000, 0, 7, 5400, 6.34)
INTO T1 VALUES(9, 'O', 'A', 0, 1000, 0, 4400, 6.34)
INTO T1 VALUES(10, 'I','A', 1000, 0, 5, 5400, 6.09)
SELECT 1 FROM DUAL;
WAUC is the weighted average unit cost used to value our stock.
For the first record: STOCK = INPUT_QTY and WAUC = INPUT_PRICE.
For a new INPUT operation, the new WAUC should be: ((last generated WAUC * last generated STOCK) + (current INPUT_QTY * current INPUT_PRICE)) / current generated STOCK.
E.g. for the 2nd row: WAUC = ((5 * 1500) + (700 * 6)) / 2200 = 5.31
For a new OUTPUT operation, WAUC should stay at the last generated WAUC.
E.g. for the 3rd row: WAUC = the last generated WAUC (5.31) of the same ITEM A.
That is, WAUC should only change on a new INPUT operation.
In my opinion, STOCK and WAUC should be generated on the fly rather than stored as records, because otherwise a single accidentally wrong INPUT_PRICE would corrupt every subsequent WAUC and every calculation built on it.
How can I achieve this?
Thanks in advance.
Your logic is a textbook example of the need for the MODEL clause, and it can be rewritten to that clause almost exactly as you verbosely specified it (note the MODEL clause is a beast; to learn more about it see here or here or here):
with t1 (mouvement, operation, item, input_qty, output_qty, input_price, stock_expected, wauc_expected) as (
select 1, 'I', 'A', 1500, 0, 5, 1500, 5 from dual union all
select 2, 'I', 'A', 700, 0, 6, 2200, 5.31 from dual union all
select 3, 'O', 'A', 0, 800, 0, 1400, 5.31 from dual union all
select 4, 'I', 'A', 1000, 0, 5, 2400, 5.18 from dual union all
select 5, 'O', 'A', 0, 500, 0, 1900, 5.18 from dual union all
select 6, 'I', 'A', 1000, 0, 7, 2900, 5.8 from dual union all
select 7, 'I', 'A', 2000, 0, 7, 4900, 6.28 from dual union all
select 8, 'I', 'A', 500, 0, 7, 5400, 6.34 from dual union all
select 9, 'O', 'A', 0, 1000, 0, 4400, 6.34 from dual union all
select 10, 'I','A', 1000, 0, 5, 5400, 6.09 from dual
)
select * from (
select t1.*, 0 as stock_actual, 0 as wauc_actual from t1
)
model
dimension by (row_number() over (order by mouvement) as rn)
measures (mouvement, operation, item, input_qty, output_qty, input_price, stock_expected, wauc_expected, stock_actual, wauc_actual)
rules (
stock_actual[any] = coalesce(stock_actual[cv(rn) - 1], 0) + case operation[cv(rn)]
when 'I' then input_qty[cv(rn)]
when 'O' then -output_qty[cv(rn)]
end,
wauc_actual[any] = case
when cv(rn) = 1
then input_price[cv(rn)]
when operation[cv(rn)] = 'I'
then trunc((wauc_actual[cv(rn) - 1] * stock_actual[cv(rn) - 1] + input_qty[cv(rn)] * input_price[cv(rn)]) / stock_actual[cv(rn)], 2)
when operation[cv(rn)] = 'O'
then wauc_actual[cv(rn) - 1]
end
)
order by mouvement
(I fixed the apparent typo input_qty = 5000 -> 500 for mouvement = 8 and added truncation to 2 decimal places - I guessed both from your expected results.)
Db fiddle here.
Note that simple analytic functions are not sufficient for computing wauc, because they only have access to previous values of columns in the input dataset, not to values of the column being computed by the function itself. For stock it would be possible using running totals, sum(input_qty) over (order by mouvement) - sum(output_qty) over (order by mouvement), but for wauc there is hardly any explicit formula.
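To illustrate that last point, the stock-only running total can indeed be done with a plain analytic sum; a sketch against the T1 table from the question, combining the two running sums into one (wauc still needs the MODEL rule or a recursive approach):
select t1.*,
       sum(input_qty - output_qty) over (order by mouvement) as stock_running
from t1
order by mouvement;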

Why does this conversion to date fail on some rows in my table and not other rows when I use an IIF

I have this table and data:
CREATE TABLE dbo.tBadDate
(
BadDateID int NOT NULL,
StartDate nchar(20) NULL,
CONSTRAINT [PK_tBadDate] PRIMARY KEY CLUSTERED
(
[BadDateID] ASC
)
);
INSERT dbo.tBadDate (BadDateID, StartDate) VALUES
(1, N'1/1/2020 '),
(2, N'Jan 1 2021 '),
(3, N'January 1 2021 '),
(4, N'Ja 1 2021 '),
(5, N'Jan,1,2021 '),
(6, N'2021.1.1 '),
(7, N'8/8/1981 '),
(8, NULL),
(9, N'January First, 2021 ');
This script works:
SELECT StartDate, ISDATE(StartDate) from tBadDate;
This script fails:
SELECT StartDate,
       IIF(ISDATE(StartDate) = 1, CONVERT(DATE, StartDate), 'Undefined Format')
FROM tBadDate;
Msg 241, Level 16, State 1
Conversion failed when converting date and/or time from character string.
You could use TRY_CONVERT with a CROSS APPLY:
select StartDate,
Iif(x.v is null,0,1) ValidDate,
IsNull(Cast(v as varchar(20)),'Undefined Format')
from tBadDate
cross apply (values(Try_Convert(date,StartDate)))x(v)
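As to why the original query throws: IIF is shorthand for CASE, and the whole expression takes a single result type chosen by data type precedence. DATE outranks nvarchar, so on any row where ISDATE returns 0 the literal 'Undefined Format' has to be implicitly converted to DATE, which is what raises Msg 241. A minimal repro of just that conversion rule:
-- The string branch is implicitly cast to DATE because DATE has higher type precedence
SELECT IIF(1 = 0, CONVERT(date, N'20200101'), 'Undefined Format');
-- Msg 241: Conversion failed when converting date and/or time from character string.
TRY_CONVERT sidesteps this because every row already carries a DATE (or NULL), and the display string is produced by casting that DATE back to varchar.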

SQL Pivot Half of table

I have a table that consists of time information. It's basically:
Employee, Date, Seq, Time In, Time Out.
They can clock out multiple times a day, so I'm trying to get all of the clock outs in a day on one row. My result would be something like:
Employee, Date, TimeIn1, TimeOut1, TimeIn2, TimeOut2, TimeIn3, TimeOut3....
Where the 1, 2, and 3 are the sequence numbers. I know I could just do a bunch of left joins to the table itself based on employee=employee, date=date, and seq=seq+1, but is there a way to do it in a pivot? I don't want to pivot the employee and date fields, just the time in and time out.
The short answer is: Yes, it's possible.
The exact code will be updated if/when you provide sample data to clarify some points, but you can absolutely pivot the times out while leaving the employee/work date alone.
Sorry for the wall of code; none of the fiddle sites are working from my current computer
declare @test table (
pk int,
workdate date,
seq int,
tIN time,
tOUT time
)
insert into @test values
(1, '2020-11-25', 1, '08:00', null),
(1, '2020-11-25', 2, null, '11:00'),
(1, '2020-11-25', 3, '11:32', null),
(1, '2020-11-25', 4, null, '17:00'),
(2, '2020-11-25', 5, '08:00', null),
(2, '2020-11-25', 6, null, '09:00'),
(2, '2020-11-25', 7, '09:15', null),
-- new date
(1, '2020-11-27', 8, '08:00', null),
(1, '2020-11-27', 9, null, '08:22'),
(1, '2020-11-27', 10, '09:14', null),
(1, '2020-11-27', 11, null, '12:08'),
(1, '2020-11-27', 12, '01:08', null),
(1, '2020-11-27', 13, null, '14:40'),
(1, '2020-11-27', 14, '14:55', null),
(1, '2020-11-27', 15, null, '17:00')
select *
from (
/* this just sets the column header names and condenses their values */
select
pk,
workdate,
colName = case when tin is not null then 'TimeIn' + cast(empDaySEQ as varchar) else 'TimeOut' + cast(empDaySEQ as varchar) end,
colValue = coalesce(tin, tout)
from (
/* main query */
select
pk,
workdate,
/* grab what pair # this clock in or out is; reset by employee & date */
empDaySEQ = (row_number() over (partition by pk, workdate order by seq) / 2) + (row_number() over (partition by pk, workdate order by seq) % 2),
tin,
tout
from @test
) i
) a
PIVOT (
max(colValue)
for colName
IN ( /* replace w/ dynamic if you don't know upper boundary of max in/out pairs */
[TimeIn1],
[TimeOut1],
[TimeIn2],
[TimeOut2],
[TimeIn3],
[TimeOut3],
[TimeIn4],
[TimeOut4]
)
) mypivotTable
generates these results: one row per pk and workdate, with the TimeIn/TimeOut pairs spread across the pivoted columns.
(I would provide a fiddle demo but they're not working for me today)
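As an aside, the empDaySEQ arithmetic just collapses consecutive row numbers into pairs (1,2 -> 1; 3,4 -> 2; and so on). A standalone way to sanity-check the mapping, with made-up row numbers:
-- integer division plus remainder: rn 1..6 -> 1,1,2,2,3,3
select rn, rn / 2 + rn % 2 as empDaySEQ
from (values (1),(2),(3),(4),(5),(6)) v(rn);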

How do I set up a table to do recursive queries in Sqlite?

I apologize if this is a basic question, but I'm a database novice. I'm using sqlite to manage a list of command-line options for some tools. A simple version of this is:
sqlite> CREATE TABLE option_set (set_id INTEGER, idx INTEGER, value TEXT );
sqlite> INSERT INTO option_set VALUES( 1, 1, 'a' );
sqlite> INSERT INTO option_set VALUES( 1, 2, 'b' );
sqlite> INSERT INTO option_set VALUES( 1, 3, 'c' );
sqlite> SELECT value FROM option_set WHERE set_id=1 ORDER BY idx;
a
b
c
This all works fine. I want to add an enhancement, however, where I allow one option_set to contain another. For instance, if I specified option_set=2 as { 'd', 'e', [option_set=1], 'f' }, I would like that to mean the options { 'd', 'e', 'a', 'b', 'c', 'f' }. The question is how to express this in the database. I was thinking of something along the lines of:
sqlite> CREATE TABLE option_set (set_id INTEGER, idx INTEGER, contained_set_id INTEGER, value TEXT );
sqlite> INSERT INTO option_set VALUES( 1, 1, NULL, 'a' );
sqlite> INSERT INTO option_set VALUES( 1, 2, NULL, 'b' );
sqlite> INSERT INTO option_set VALUES( 1, 3, NULL, 'c' );
sqlite> INSERT INTO option_set VALUES( 2, 1, NULL, 'd' );
sqlite> INSERT INTO option_set VALUES( 2, 2, NULL, 'e' );
sqlite> INSERT INTO option_set VALUES( 2, 3, 1, NULL );
sqlite> INSERT INTO option_set VALUES( 2, 4, NULL, 'f' );
The idea is that in each row of the table, I'd either have a value or another set_id that should be expanded. The problem is that I don't know how to query such a table - how could I produce the list of options short of recursively doing selects? I'm not crazy about a structure where the contained_set_id and value columns are never both populated, but I'm not sure how to get around that. Would a different table design work better?
Thanks.
To do a recursive query, you need a recursive common table expression:
WITH RECURSIVE
contained_sets(level, idx, contained_set_id, value)
AS (SELECT 0, idx, contained_set_id, value
FROM option_set
WHERE set_id = 2
UNION ALL
SELECT level + 1,
option_set.idx,
option_set.contained_set_id,
option_set.value
FROM option_set
JOIN contained_sets ON option_set.set_id = contained_sets.contained_set_id
ORDER BY 1 DESC, 2)
SELECT value
FROM contained_sets
WHERE contained_set_id IS NULL;
value
----------
d
e
a
b
c
f
(This is supported in SQLite 3.8.3 or later.)
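For what it's worth, the ORDER BY 1 DESC, 2 in the recursive member is what makes SQLite pull the deepest queued rows first, so a contained set's options come out exactly where its placeholder row sat. If I'm reading those queue semantics right, one more level of nesting should also work; a hypothetical example (note the anchor's WHERE set_id = 2 would need to become set_id = 3):
INSERT INTO option_set VALUES( 3, 1, NULL, 'x' );
INSERT INTO option_set VALUES( 3, 2, 2, NULL );
INSERT INTO option_set VALUES( 3, 3, NULL, 'y' );
-- expected output of the query above with set_id = 3: x, d, e, a, b, c, f, y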