SQL - get default NULL value if data is not available

I have a table with the following data:
ID | TYPE_ID | CREATED_DT | ROW_NUM
=====================================
123 | 485 | 2019-08-31 | 1
123 | 485 | 2019-05-31 | 2
123 | 485 | 2019-02-28 | 3
123 | 485 | 2018-11-30 | 4
123 | 485 | 2018-08-31 | 5
123 | 485 | 2018-05-31 | 6
123 | 487 | 2019-05-31 | 1
123 | 487 | 2018-05-31 | 2
I would like to return 6 ROW_NUMs for each TYPE_ID. Where data is missing, CREATED_DT should be NULL, so the final result set should look like:
ID | TYPE_ID | CREATED_DT | ROW_NUM
=====================================
123 | 485 | 2019-08-31 | 1
123 | 485 | 2019-05-31 | 2
123 | 485 | 2019-02-28 | 3
123 | 485 | 2018-11-30 | 4
123 | 485 | 2018-08-31 | 5
123 | 485 | 2018-05-31 | 6
123 | 487 | 2019-05-31 | 1
123 | 487 | 2018-05-31 | 2
123 | 487 | NULL | 3
123 | 487 | NULL | 4
123 | 487 | NULL | 5
123 | 487 | NULL | 6
Query:
SELECT
A.*
FROM TBL AS A
WHERE A.ROW_NUM <= 6
UNION ALL
SELECT
B.*
FROM TBL AS B
WHERE B.ROW_NUM NOT IN (SELECT ROW_NUM FROM TBL)
AND B.ROW_NUM <= 6
I tried using UNION ALL and ISNULL to backfill the missing rows, but the query still returns only the existing data, not the expected result. I think this can be done easily with a CTE, but I'm not sure how to get it working. Can anyone help me with this?

Assuming tbl contains every Row_Num value from 1 through 6 at least once, with no fractional, zero, or negative values:
We get a list of all the distinct IDs and Type_IDs (alias A).
Then we get a distinct list of row numbers less than 7, giving us 6 records (alias B).
We cross join these to ensure each ID & Type_ID combination has all 6 rows.
We then left join back to the base set (tbl, alias C) to get the dates where they exist. Because it is a left join, the rows without a date still persist, with NULL for Created_DT.
SELECT A.ID, A.Type_ID, C.Created_DT, B.Row_Num
FROM (SELECT DISTINCT ID, Type_ID FROM tbl) A
CROSS JOIN (SELECT distinct row_num from tbl where Row_num < 7) B
LEFT JOIN tbl C
on C.ID = A.ID
and C.Type_ID = A.Type_ID
and C.Row_num = B.Row_num
Giving us:
+----+-----+---------+------------+---------+
| | ID | Type_ID | Created_DT | Row_Num |
+----+-----+---------+------------+---------+
| 1 | 123 | 485 | 2019-08-31 | 1 |
| 2 | 123 | 485 | 2019-05-31 | 2 |
| 3 | 123 | 485 | 2019-02-28 | 3 |
| 4 | 123 | 485 | 2018-11-30 | 4 |
| 5 | 123 | 485 | 2018-08-31 | 5 |
| 6 | 123 | 485 | 2018-05-31 | 6 |
| 7 | 123 | 487 | 2019-05-31 | 1 |
| 8 | 123 | 487 | 2018-05-31 | 2 |
| 9 | 123 | 487 | NULL | 3 |
| 10 | 123 | 487 | NULL | 4 |
| 11 | 123 | 487 | NULL | 5 |
| 12 | 123 | 487 | NULL | 6 |
+----+-----+---------+------------+---------+
This also assumes that you'd want 1-6 for each combination of type_id and ID. If ID's irrelevant, then simply exclude it from the join criteria. I included it as it's an ID and seems like it's part of a key.
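If the table might not contain every row number from 1 to 6, the same cross join can be driven by a generated series instead of a DISTINCT over tbl. The following is only a minimal sketch of that idea: it uses Python's built-in sqlite3 module so that it runs anywhere, not SQL Server, and the CTE name n is an assumption made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ID INT, TYPE_ID INT, CREATED_DT TEXT, ROW_NUM INT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (123, 485, "2019-08-31", 1),
        (123, 485, "2019-05-31", 2),
        (123, 485, "2019-02-28", 3),
        (123, 485, "2018-11-30", 4),
        (123, 485, "2018-08-31", 5),
        (123, 485, "2018-05-31", 6),
        (123, 487, "2019-05-31", 1),
        (123, 487, "2018-05-31", 2),
    ],
)

QUERY = """
WITH RECURSIVE n(row_num) AS (
    -- Generate 1..6 without reading tbl
    SELECT 1
    UNION ALL
    SELECT row_num + 1 FROM n WHERE row_num < 6
)
SELECT a.ID, a.TYPE_ID, c.CREATED_DT, n.row_num
FROM (SELECT DISTINCT ID, TYPE_ID FROM tbl) AS a
CROSS JOIN n
LEFT JOIN tbl AS c
       ON c.ID = a.ID
      AND c.TYPE_ID = a.TYPE_ID
      AND c.ROW_NUM = n.row_num
ORDER BY a.ID, a.TYPE_ID, n.row_num
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Each ID and TYPE_ID combination comes back with exactly six rows, and the missing dates appear as None (NULL).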

Please reference the other answer for how you can do this using a CROSS JOIN - which is pretty neat. Alternatively, we can utilize the programming logic available in MS-SQL to achieve the desired results. The following approach stores distinct ID and TYPE_ID combinations inside a SQL cursor. Then it iterates through the cursor entries to ensure the appropriate amount of data is stored into a temp table. Finally, the SELECT is performed on the temp table and the cursor is closed. Here is a proof of concept that I validated on https://rextester.com/l/sql_server_online_compiler.
-- Create schema for testing
CREATE TABLE Test (
ID INT,
TYPE_ID INT,
CREATED_DT DATE
)
-- Populate data
INSERT INTO Test(ID, TYPE_ID, CREATED_DT)
VALUES
(123,485,'2019-08-31')
,(123,485,'2019-05-31')
,(123,485,'2019-02-28')
,(123,485,'2018-11-30')
,(123,485,'2018-08-31')
,(123,485,'2018-05-31')
,(123,487,'2019-05-31')
,(123,487,'2018-05-31');
-- Create TempTable for output
CREATE TABLE #OutputTable (
ID INT,
TYPE_ID INT,
CREATED_DT DATE,
ROW_NUM INT
)
-- Declare local variables
DECLARE @tempID INT, @tempType INT;
-- Create cursor to iterate ID and TYPE_ID
DECLARE mycursor CURSOR FOR (
SELECT DISTINCT ID, TYPE_ID FROM Test
);
OPEN mycursor
-- Fetch the first ID and TYPE_ID combination
FETCH NEXT FROM mycursor
INTO @tempID, @tempType;
-- Loop over every combination
WHILE @@FETCH_STATUS = 0
BEGIN
-- Number of existing rows for this combination
DECLARE @count INT = (SELECT COUNT(*) FROM Test WHERE ID = @tempID AND TYPE_ID = @tempType);
-- Copy the existing rows, numbering them newest first
INSERT INTO #OutputTable (ID, TYPE_ID, CREATED_DT, ROW_NUM)
SELECT ID, TYPE_ID, CREATED_DT, ROW_NUMBER() OVER(ORDER BY CREATED_DT DESC)
FROM Test
WHERE ID = @tempID AND TYPE_ID = @tempType;
-- Pad with NULL dates until there are 6 rows
WHILE @count < 6
BEGIN
SET @count = @count + 1
INSERT INTO #OutputTable
VALUES (@tempID, @tempType, NULL, @count);
END
FETCH NEXT FROM mycursor
INTO @tempID, @tempType;
END
-- Close and release the cursor
CLOSE mycursor;
DEALLOCATE mycursor;
-- View results
SELECT * FROM #OutputTable;
Note: if a combination of ID and TYPE_ID has more than 6 rows, the additional rows will be included in your final result. If you must show exactly 6 rows per combination, you can change that part of the query to SELECT TOP 6 ....

Create a CTE that generates a series of numbers and cross apply it:
CREATE TABLE Test (
ID INT,
TYPE_ID INT,
CREATED_DT DATE
)
INSERT INTO Test(ID, TYPE_ID, CREATED_DT)
VALUES
(123,485,'2019-08-31')
,(123,485,'2019-05-31')
,(123,485,'2019-02-28')
,(123,485,'2018-11-30')
,(123,485,'2018-08-31')
,(123,485,'2018-05-31')
,(123,487,'2019-05-31')
,(123,487,'2018-05-31')
;
WITH n(n) AS
(
SELECT 1
UNION ALL
SELECT n+1 FROM n WHERE n < 6
)
,id_n as (
SELECT
DISTINCT
ID
,TYPE_ID
,n
FROM
Test
cross apply n
)
SELECT
id_n.ID
,id_n.TYPE_ID
,test.CREATED_DT
,id_n.n row_num
FROM
id_n
left join
(
select
ID
,TYPE_ID
,CREATED_DT
,ROW_NUMBER() over(partition by id, type_id order by created_dt desc) rn
from
Test
) Test on Test.ID = id_n.ID and Test.TYPE_ID = id_n.TYPE_ID and id_n.n = test.rn
drop table Test

Related

Update multiple records using index result of sub-query

Let's say I have a table with data...
| person_id | priority |
|------------|------------|
| 678 | 2 |
| 413 | 4 |
| 912 | 1 |
| 111 | 5 |
How can I update priority so that the values are contiguous? I.e....
| person_id | priority |
|------------|------------|
| 678 | 2 |
| 413 | 3 | -- updated from 4 to 3
| 912 | 1 |
| 111 | 4 | -- updated from 5 to 4
I know that I can use something like...
select
row_number() over (order by [priority]) as position
from
table_name
...to find a person's 'position', but how can I use this to update the same row?
The priority values should always start at 1.
You can use an updatable CTE or subquery:
with toupdate as (
select t.*, row_number() over (order by [priority]) as new_priority
from table_name t
)
update toupdate
set priority = new_priority
where priority <> new_priority;
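Updatable CTEs are a SQL Server feature. As a rough, portable sketch of the same renumbering idea, the new positions can be computed with row_number() in a SELECT and written back in a second step. The example below uses Python's sqlite3 module (window functions need SQLite 3.25 or later) and assumes person_id is unique; both are assumptions made for illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (person_id INT PRIMARY KEY, priority INT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [(678, 2), (413, 4), (912, 1), (111, 5)],
)

# Step 1: compute the contiguous position of every row and keep only
# the rows whose priority actually changes.
changes = conn.execute(
    """
    SELECT new_priority, person_id
    FROM (
        SELECT person_id,
               priority,
               ROW_NUMBER() OVER (ORDER BY priority) AS new_priority
        FROM table_name
    )
    WHERE priority <> new_priority
    """
).fetchall()

# Step 2: write the new priorities back, one UPDATE per changed row.
conn.executemany(
    "UPDATE table_name SET priority = ? WHERE person_id = ?", changes
)

result = dict(conn.execute("SELECT person_id, priority FROM table_name").fetchall())
print(result)
```

Only the rows for persons 413 and 111 are touched; the other two already have the right value.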

How to select timestamp values in PostgreSQL under conditions?

I have a database table 'table1' as follows:
f_key | begin | counts|
1 | 2018-10-04 | 15 |
1 | 2018-10-06 | 20 |
1 | 2018-10-08 | 34 |
1 | 2018-10-09 | 56 |
I have another database table 'table2' as follows:
f_key | p_time | percent|
1 | 2018-10-05 | 80 |
1 | 2018-10-07 | 90 |
1 | 2018-10-08 | 70 |
1 | 2018-10-10 | 60 |
The tables can be joined by the f_key field.
I want to get a combined table as shown below:
If the begin time is earlier than any of the p_time then the p_time value in the combined table would be the same as begin time and the percent value would be 50. (As shown in row 1 in the following table)
If the begin time is later than any of the p_time then the p_time value in the combined table would be the very next available p_time and the percent value would be the corresponding value of the selected p_time.
(As shown in row 2, 3 and 4 in the following table)
row | f_key | begin | counts| p_time | percent|
1 | 1 | 2018-10-04 | 15 | 2018-10-04 | 50 |
2 | 1 | 2018-10-06 | 20 | 2018-10-05 | 80 |
3 | 1 | 2018-10-08 | 34 | 2018-10-07 | 90 |
4 | 1 | 2018-10-09 | 56 | 2018-10-08 | 70 |
You can use the row_number window function to rank, for each table1 row, the table2 rows whose p_time is earlier than begin, latest first, and keep only the closest one (rn = 1).
Then use the coalesce function so that, when no earlier p_time exists, p_time falls back to the begin value and percent falls back to 50.
PostgreSQL 9.6 Schema Setup:
CREATE TABLE table1(
f_key INT,
begin DATE,
counts INT
);
INSERT INTO table1 VALUES (1,'2018-10-04',15);
INSERT INTO table1 VALUES (1,'2018-10-06',20);
INSERT INTO table1 VALUES (1,'2018-10-08',34);
INSERT INTO table1 VALUES (1,'2018-10-09',56);
CREATE TABLE table2(
f_key INT,
p_time DATE,
percent INT
);
INSERT INTO table2 VALUES (1, '2018-10-05',80);
INSERT INTO table2 VALUES (1, '2018-10-07',90);
INSERT INTO table2 VALUES (1, '2018-10-08',70);
INSERT INTO table2 VALUES (1, '2018-10-10',60);
Query 1:
SELECT ROW_NUMBER() OVER(ORDER BY begin) "row",
t1.f_key,
t1.counts,
coalesce(t1.p_time,t1.begin) p_time,
coalesce(t1.percent,50) percent
FROM (
SELECT ROW_NUMBER() OVER(PARTITION BY t1.begin,t1.f_key order by t2.p_time desc) rn,
t2.p_time,
t2.percent,
t1.counts,
t1.f_key,
t1.begin
FROM table1 t1
LEFT JOIN table2 t2 ON t1.f_key = t2.f_key and t1.begin > t2.p_time
)t1
WHERE rn = 1
Results:
| row | f_key | counts | p_time | percent |
|-----|-------|--------|------------|---------|
| 1 | 1 | 15 | 2018-10-04 | 50 |
| 2 | 1 | 20 | 2018-10-05 | 80 |
| 3 | 1 | 34 | 2018-10-07 | 90 |
| 4 | 1 | 56 | 2018-10-08 | 70 |

How do I dynamically make calculations via a CASE statement based on the results of the previous row's calculations in Oracle?

I'm trying to make calculations via CASE statements which rely on the results of calculations made on the previous row. The data I'm working with is hierarchical data. My end goal is to structure the resulting data to be in line with a Modified Preorder Tree Traversal algorithm.
Here's what my raw data looks like:
+-------+--------+
| id | parent |
+-------+--------+
| 1 | (null) |
+-------+--------+
| 600 | 1 |
+-------+--------+
| 690 | 600 |
+-------+--------+
| 6990 | 690 |
+-------+--------+
| 6900 | 690 |
+-------+--------+
| 69300 | 6900 |
+-------+--------+
| 69400 | 6900 |
+-------+--------+
Here's what I want the end result to look like. I'm happy to expand on why this is what I'm looking for, related to MPTT, etc.
+-------+-----------+-----+------+
| id | parent_id | lft | rght |
+-------+-----------+-----+------+
| 1 | | 1 | 14 |
+-------+-----------+-----+------+
| 600 | 1 | 2 | 13 |
+-------+-----------+-----+------+
| 690 | 600 | 3 | 12 |
+-------+-----------+-----+------+
| 6900 | 690 | 4 | 9 |
+-------+-----------+-----+------+
| 6990 | 690 | 10 | 11 |
+-------+-----------+-----+------+
| 69300 | 6900 | 5 | 6 |
+-------+-----------+-----+------+
| 69400 | 6900 | 7 | 8 |
+-------+-----------+-----+------+
Here's what my SQL code looks like so far. It calculates many of the fields that I think the algorithm that I describe below requires. This is "organization" data within an enterprise setting, which is why the orgn abbreviation is common in my code.
Here's the algorithm that I think will successfully transform it into the MPTT format:
- If level is root (lvl=1), lft = 1, rght = subnodes*2 + 2
- If level is the next level down (lvl = prev_lvl+1), and prev_parent != parent (meaning this is the first sibling)
  - lft = parent_lft+1
- If lvl = prev_lvl, so we are on the same level (don’t know if this is a true sibling of the same parent yet)
  - if parent = prev_parent, lft=prev_rght+1 (true sibling, just use previous sibling’s right + 1)
  - if parent != prev_parent, lft=parent_lft+1 (same level, not true sibling, so use parent’s left + 1)
- rght=(subnodes*2) + lft + 1
SQL Code I have so far:
WITH tab1 (
id,
parent_id
) AS (
SELECT
1,
NULL
FROM
dual
UNION ALL
SELECT
600,
1
FROM
dual
UNION ALL
SELECT
690,
600
FROM
dual
UNION ALL
SELECT
6990,
690
FROM
dual
UNION ALL
SELECT
6900,
690
FROM
dual
UNION ALL
SELECT
69300,
6900
FROM
dual
UNION ALL
SELECT
69400,
6900
FROM
dual
),t1 (
id,
parent_id,
lvl
) AS (
SELECT
id,
parent_id,
1 AS lvl
FROM
tab1
WHERE
parent_id IS NULL
UNION ALL
SELECT
t2.id,
t2.parent_id,
lvl + 1
FROM
tab1 t2,
t1
WHERE
t2.parent_id = t1.id
)
SEARCH BREADTH FIRST BY id SET order1,orgn_subnodes AS (
SELECT
id AS id,
COUNT(*) - 1 AS subnodes
FROM
(
SELECT
CONNECT_BY_ROOT ( t1.id ) AS id
FROM
t1
CONNECT BY
PRIOR t1.id = t1.parent_id
)
GROUP BY
id
),orgn_partial_data AS (
SELECT
orgn_subnodes.id AS id,
orgn_subnodes.subnodes,
parent_id,
lvl,
LAG(lvl,1) OVER(
ORDER BY
order1
) AS prev_lvl,
LAG(parent_id,1) OVER(
ORDER BY
order1
) AS prev_parent,
CASE
WHEN parent_id IS NULL THEN 1
END
lft,
CASE
WHEN parent_id IS NULL THEN ( subnodes * 2 ) + 2
END
rght,
order1
FROM
orgn_subnodes
JOIN t1 ON orgn_subnodes.id = t1.id
) SELECT
*
FROM
orgn_partial_data;
The result is:
+-------+----------+-----------+-----+----------+-------------+-----+------+--------+
| id | subnodes | parent_id | lvl | prev_lvl | prev_parent | lft | rght | order1 |
+-------+----------+-----------+-----+----------+-------------+-----+------+--------+
| 1 | 6 | | 1 | | | 1 | 14 | 1 |
+-------+----------+-----------+-----+----------+-------------+-----+------+--------+
| 600 | 5 | 1 | 2 | 1 | | | | 2 |
+-------+----------+-----------+-----+----------+-------------+-----+------+--------+
| 690 | 4 | 600 | 3 | 2 | 1 | | | 3 |
+-------+----------+-----------+-----+----------+-------------+-----+------+--------+
| 6900 | 2 | 690 | 4 | 3 | 600 | | | 4 |
+-------+----------+-----------+-----+----------+-------------+-----+------+--------+
| 6990 | 0 | 690 | 4 | 4 | 690 | | | 5 |
+-------+----------+-----------+-----+----------+-------------+-----+------+--------+
| 69300 | 0 | 6900 | 5 | 4 | 690 | | | 6 |
+-------+----------+-----------+-----+----------+-------------+-----+------+--------+
| 69400 | 0 | 6900 | 5 | 5 | 6900 | | | 7 |
+-------+----------+-----------+-----+----------+-------------+-----+------+--------+
I don't care about the ordering of "sibling nodes" within the tree. Also, if you don't find the SQL I've started on useful, you can post an answer that doesn't use any of it. I only posted to show what pieces of info I think I need to perform the steps of the algorithm.
I'll accept any Oracle code (database procedure, SELECT statement, etc) as an answer.
Please ask for more details if you need them!
I think there is a typo in starting post, it should be (7, 8) and not (4, 8) for 69400.
The canonical way to get the result is to use a recursive procedure or function.
The approach below uses a procedure and a temporary table, but you can achieve the same with a function returning a collection.
Temporary table
create global temporary table tmp$ (id int, l int, r int) on commit delete rows;
Package
create or replace package pkg as
procedure p(p_id in int);
end pkg;
/
sho err
Package body
create or replace package body pkg as
seq int;
procedure p_(p_id in int) as
begin
seq := seq + 1;
insert into tmp$(id, l, r) values (p_id, seq, null);
for i in (select id from tab1 where parent_id = p_id order by id) loop
p_(i.id);
end loop;
seq := seq + 1;
update tmp$ set r = seq where id = p_id;
end;
procedure p(p_id in int) as
begin
seq := 0;
p_(p_id);
end;
end pkg;
/
sho err
Test in SQL*Plus
SQL> exec pkg.p(1);
PL/SQL procedure successfully completed.
SQL> select * from tmp$;
ID L R
---------- ---------- ----------
1 1 14
600 2 13
690 3 12
6900 4 9
69300 5 6
69400 7 8
6990 10 11
7 rows selected.
Update
Standalone procedure without global variables
create or replace procedure p(p_id in int, seq in out int) as
begin
seq := seq + 1;
insert into tmp$(id, l, r) values (p_id, seq, null);
for i in (select id from tab1 where parent_id = p_id order by id) loop
p(i.id, seq);
end loop;
seq := seq + 1;
update tmp$ set r = seq where id = p_id;
end;
/
Test in SQL*Plus
SQL> var n number
SQL> exec :n := 0;
PL/SQL procedure successfully completed.
SQL> exec p(1, :n);
PL/SQL procedure successfully completed.
SQL> select * from tmp$;
ID L R
---------- ---------- ----------
1 1 14
600 2 13
690 3 12
6900 4 9
69300 5 6
69400 7 8
6990 10 11
7 rows selected.
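The numbering logic in the procedure is a depth-first walk with a single running counter: the counter is incremented on entering a node (giving the left value) and again on leaving it (giving the right value). As a language-neutral check of that logic, here is a small Python sketch over the question's sample data. The function name number_tree and the in-memory representation are assumptions made for illustration; it is not a replacement for the PL/SQL above.

```python
from collections import defaultdict

# (id, parent_id) pairs from the question's sample data.
rows = [
    (1, None),
    (600, 1),
    (690, 600),
    (6990, 690),
    (6900, 690),
    (69300, 6900),
    (69400, 6900),
]


def number_tree(rows):
    """Return {id: (lft, rght)} using a depth-first walk and one counter."""
    children = defaultdict(list)
    root = None
    for node_id, parent_id in rows:
        if parent_id is None:
            root = node_id
        else:
            children[parent_id].append(node_id)

    bounds = {}
    counter = 0

    def visit(node_id):
        nonlocal counter
        counter += 1              # entering the node: left value
        left = counter
        for child in sorted(children[node_id]):  # same order as "order by id"
            visit(child)
        counter += 1              # leaving the node: right value
        bounds[node_id] = (left, counter)

    visit(root)
    return bounds


bounds = number_tree(rows)
for node_id, (lft, rght) in sorted(bounds.items(), key=lambda item: item[1][0]):
    print(node_id, lft, rght)
```

The output matches the tmp$ contents shown above, and every node also satisfies the question's rule rght = subnodes*2 + lft + 1.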

Sql - Row as column

I have data in the format below: around 8 to 9 departments, with a few questions for each department.
| Department | NoOfCases | Question | Rate |
+============+===========+==========+======+
| VC | 4 | A | 80 |
| VC | 2 | B | 90 |
| VC | 1 | C | 95 |
| ED | 5 | A | 85 |
| ED | 1 | B | 90 |
| ED | 3 | C | 95 |
| PH | 3 | A | 80 |
I want it in the format below: the total number of cases per department, with every question as a column and its rate as the value.
| Department | NoOfCases | A | B | C(actual questions as columns) |
+============+===========+====+====+================================+
| VC | 7 | 80 | 90 | 95 |
| ED | 9 | 85 | 90 | 95 |
| PH | 3 | 80 | | |
Can we achieve this?
You can achieve it using a PIVOT with a GROUP BY:
--create table variable to hold sample data
declare @tmp table( Department nvarchar(2),NoOfCases int, Question nvarchar(1), Rate int)
--populate sample data
insert into @tmp select 'VC', 4,'A', 80
insert into @tmp select 'VC', 2,'B', 90
insert into @tmp select 'VC', 1,'C', 95
insert into @tmp select 'ED', 5,'A', 85
insert into @tmp select 'ED', 1,'B', 90
insert into @tmp select 'ED', 3,'C', 95
insert into @tmp select 'PH', 3,'A', 80
select * from @tmp
--pivot with group by
select Department,SUM(piv.NoOfCases) AS NoOfCases,
ISNULL(SUM(A),0) AS A, ISNULL(SUM(B),0) AS B, ISNULL(SUM(C),0) AS C
from
(
--select data
select Department,NoOfCases , Question ,RATE
from @tmp
) src
pivot
(
MAX(RATE)
for Question in ([A], [B], [C])
) piv
GROUP BY Department
This is the output of the command:

Add extra column in sql to show ratio with previous row

I have a SQL table with a format like this:
SELECT period_id, amount FROM table;
+--------------------+
| period_id | amount |
+-----------+--------+
| 1 | 12 |
| 2 | 11 |
| 3 | 15 |
| 4 | 20 |
| .. | .. |
+-----------+--------+
I'd like to add an extra column (just in my select statement) that calculates the growth ratio with the previous amount, like so:
SELECT period_id, amount, [insert formula here] AS growth FROM table;
+-----------------------------+
| period_id | amount | growth |
+-----------+-----------------+
| 1 | 12 | |
| 2 | 11 | 0.91 | <-- 11/12
| 3 | 15 | 1.36 | <-- 15/11
| 4 | 20 | 1.33 | <-- 20/15
| .. | .. | .. |
+-----------+-----------------+
Just need to work out how to perform the operation with the line before. Not interested in adding to the table. Any help appreciated :)
** also want to point out that period_id is in order but not necessarily increasing incrementally
The window function Lag() would be a good fit here.
You may notice that we use (amount+0.0). This forces decimal division in case AMOUNT is an INT. NullIf() turns a zero previous amount into NULL to avoid the dreaded divide-by-zero error.
Declare @YourTable table (period_id int,amount int)
Insert Into @YourTable values
( 1,12),
( 2,11),
( 3,15),
( 4,20)
Select period_id
,amount
,growth = cast((amount+0.0) / NullIf(lag(amount,1) over (Order By Period_ID),0) as decimal(10,2))
From @YourTable
Returns
period_id amount growth
1 12 NULL
2 11 0.92
3 15 1.36
4 20 1.33
If you are using SQL Server 2012+, go with John Cappelletti's answer.
If you are less blessed, like me, the code below works in the 2008 version too.
Declare @YourTable table (period_id int,amount int)
Insert Into @YourTable values
( 1,12),
( 2,11),
( 3,15),
( 4,20)
;WITH CTE AS (
SELECT ROW_NUMBER() OVER (
ORDER BY period_id
) SNO
,period_id
,amount
FROM @YourTable
)
)
SELECT C1.period_id
,C1.amount
,CASE
WHEN C2.amount IS NOT NULL AND C2.amount<>0
THEN CAST(C1.amount / CAST(C2.amount AS FLOAT) AS DECIMAL(18, 2))
END AS growth
FROM CTE C1
LEFT JOIN CTE C2 ON C1.SNO = C2.SNO + 1
This produces the same result as LAG:
+-----------+--------+--------+
| period_id | amount | growth |
+-----------+--------+--------+
| 1 | 12 | NULL |
| 2 | 11 | 0.92 |
| 3 | 15 | 1.36 |
| 4 | 20 | 1.33 |
+-----------+--------+--------+
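Both answers order by period_id instead of joining on period_id - 1, which is what makes them work when the ids have gaps, as the question notes. The sketch below illustrates that point with gapped ids. It runs the LAG version through Python's sqlite3 module (SQLite 3.25 or later) rather than SQL Server, and the table name periods is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE periods (period_id INT, amount INT)")
# Same amounts as the question, but with non-consecutive period ids.
conn.executemany(
    "INSERT INTO periods VALUES (?, ?)",
    [(1, 12), (2, 11), (5, 15), (9, 20)],
)

rows = conn.execute(
    """
    SELECT period_id,
           amount,
           -- amount * 1.0 forces non-integer division;
           -- NULLIF guards against a previous amount of zero.
           ROUND(amount * 1.0
                 / NULLIF(LAG(amount) OVER (ORDER BY period_id), 0), 2) AS growth
    FROM periods
    ORDER BY period_id
    """
).fetchall()

for row in rows:
    print(row)
```

The first row has no previous amount, so its growth is None (NULL); the remaining ratios are the same as in the answers above even though the ids jump from 2 to 5 to 9.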