SQL: Earliest Date After Latest Null If Exists - sql

Using T-SQL, I want to return, for each product, the earliest date after the latest NULL if one exists, or simply the earliest date for products that have no NULLs.
Table:
DateSold Product
12/31/2012 A
1/31/2013
2/28/2013 A
3/31/2013 A
4/30/2013 A
5/31/2013
6/30/2013 A
7/31/2013 A
8/31/2013 A
9/30/2013 A
12/31/2012 B
1/31/2013 B
2/28/2013 B
3/31/2013 B
4/30/2013 B
5/31/2013 B
6/30/2013 B
7/31/2013 B
8/31/2013 B
9/30/2013 B
For product “A” 6/30/2013 is the desired return while for product “B” 12/31/2012 is desired.
Result:
MinDateSold Product
6/30/2013 A
12/31/2012 B
Any solutions will greatly be appreciated. Thank you.

This does it for me, if there's a GROUP involved, otherwise how do you know whether the NULLs are in the run of A or B products? I realise this may not be exactly what you're after, but I hope it helps anyway.
WITH DATA_IN AS (
SELECT 1 as grp,
convert(DateTime,'12/31/2012') as d_Date,
'A' AS d_ch
UNION ALL
SELECT 1, '1/31/2013', NULL UNION ALL
SELECT 1, '2/28/2013', 'A' UNION ALL
SELECT 1, '3/31/2013', 'A' UNION ALL
SELECT 1, '4/30/2013', 'A' UNION ALL
SELECT 1, '5/31/2013', NULL UNION ALL
SELECT 1, '6/30/2013', 'A' UNION ALL
SELECT 1, '7/31/2013', 'A' UNION ALL
SELECT 1, '8/31/2013', 'A' UNION ALL
SELECT 1, '9/30/2013', 'A' UNION ALL
SELECT 2, '12/31/2012', 'B' UNION ALL
SELECT 2, '1/31/2013', 'B' UNION ALL
SELECT 2, '2/28/2013', 'B' UNION ALL
SELECT 2, '3/31/2013', 'B' UNION ALL
SELECT 2, '4/30/2013', 'B' UNION ALL
SELECT 2, '5/31/2013', 'B' UNION ALL
SELECT 2, '6/30/2013', 'B' UNION ALL
SELECT 2, '7/31/2013', 'B' UNION ALL
SELECT 2, '8/31/2013', 'B' UNION ALL
SELECT 2, '9/30/2013', 'B'
)
SELECT
grp as YourGroup,
(SELECT Min(d_date) -- first date after...
FROM DATA_IN d3
WHERE d3.grp = d1.grp -- only look within the current group
AND d3.d_date >
Coalesce( -- either the latest NULL
(SELECT max(d_Date)
FROM DATA_IN d2
WHERE d2.grp=d1.grp AND d2.d_ch IS NULL
)
, '1/1/1901' -- or a base date if no NULLs
)
) as MinDateSold
FROM DATA_IN d1
GROUP BY grp
Results :
1 2013-06-30 00:00:00.000
2 2012-12-31 00:00:00.000

One approach to this is to count the number of NULL values that appear before a given row for a given value. This divides the ranges into groups. For each group, take the minimum date. And, find the largest minimum date for each product:
select product, minDate
from (select product, NumNulls, min(DateSold) as minDate,
row_number() over (partition by product order by min(DateSold) desc
) as seqnum
from (select t.*,
(select count(*)
from table t2
where t2.product is null and t2.DateSold <= t.DateSold
) as NumNulls
from table t
) t
where product is not null -- the NULL rows only mark the breaks
group by Product, NumNulls
) t
where seqnum = 1;
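In your data, there is no mixing of different products in a range, and this query assumes the same. The NULL rows carry no product, so they are counted by date alone, and a NULL row splits the range of every product that has rows on both sides of that date. If that can happen, you need a column that ties each NULL row to its product, such as the grp column in the previous answer.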
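As a rough sketch of the NULL-counting idea, here is a version that can be run end to end. It assumes a grouping column (the hypothetical grp, borrowed from the first answer), uses a shortened copy of the sample data with ISO-formatted dates, and runs on SQLite through Python purely as a stand-in for SQL Server; the table and function names are made up for illustration.

```python
import sqlite3

# Hypothetical sample: grp ties each NULL row to the product run it interrupts.
ROWS = [
    (1, "2012-12-31", "A"), (1, "2013-01-31", None), (1, "2013-02-28", "A"),
    (1, "2013-03-31", "A"), (1, "2013-04-30", "A"), (1, "2013-05-31", None),
    (1, "2013-06-30", "A"), (1, "2013-07-31", "A"),
    (2, "2012-12-31", "B"), (2, "2013-01-31", "B"), (2, "2013-02-28", "B"),
]

QUERY = """
WITH tagged AS (          -- NULL rows at or before each row, within the same grp
    SELECT s.grp, s.product, s.date_sold,
           (SELECT COUNT(*) FROM sales n
             WHERE n.grp = s.grp
               AND n.product IS NULL
               AND n.date_sold <= s.date_sold) AS num_nulls
    FROM sales s
    WHERE s.product IS NOT NULL
),
islands AS (              -- one row per uninterrupted run
    SELECT grp, product, num_nulls, MIN(date_sold) AS min_date
    FROM tagged
    GROUP BY grp, product, num_nulls
)
SELECT grp, product, MAX(min_date) AS min_date_sold   -- start of the last run
FROM islands
GROUP BY grp, product
ORDER BY grp
"""


def latest_run_start(rows):
    """Return (grp, product, first date after the latest NULL) per group."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (grp INTEGER, date_sold TEXT, product TEXT)")
    con.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


print(latest_run_start(ROWS))
```

Because the NULL count is restricted to the same grp, the NULL rows of one product cannot split the range of another.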

Related

create date range from day based data

I have the following source data:
id date value
1 01.08.22 a
1 02.08.22 a
1 03.08.22 a
1 04.08.22 b
1 05.08.22 b
1 06.08.22 a
1 07.08.22 a
2 01.08.22 a
2 02.08.22 a
2 03.08.22 c
2 04.08.22 a
2 05.08.22 a
and I would like to have the following output:
id date_from date_until value
1 01.08.22 03.08.22 a
1 04.08.22 05.08.22 b
1 06.08.22 07.08.22 a
2 01.08.22 02.08.22 a
2 03.08.22 03.08.22 c
2 04.08.22 05.08.22 a
Is this possible with Oracle SQL? Which functions do I need for this?
Based on the link provided by @astentx, try this solution:
SELECT
id, MIN("date") AS date_from, MAX("date") AS date_until, MAX(value) AS value
FROM (
SELECT
t1.*,
ROW_NUMBER() OVER(PARTITION BY id ORDER BY "date") -
ROW_NUMBER() OVER(PARTITION BY id, value ORDER BY "date") AS rn
FROM yourtable t1
)
GROUP BY id, value, rn -- two different values can share the same rn difference
See db<>fiddle
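For anyone who wants to try the row-number-difference technique without an Oracle instance, here is a minimal sketch that runs the same kind of query on SQLite from Python. It assumes SQLite 3.25 or newer (for window functions); the table name daily, the column name day, and the function name are hypothetical, and the dates are rewritten in ISO format so they sort as text.

```python
import sqlite3

ROWS = [
    (1, "2022-08-01", "a"), (1, "2022-08-02", "a"), (1, "2022-08-03", "a"),
    (1, "2022-08-04", "b"), (1, "2022-08-05", "b"),
    (1, "2022-08-06", "a"), (1, "2022-08-07", "a"),
    (2, "2022-08-01", "a"), (2, "2022-08-02", "a"), (2, "2022-08-03", "c"),
    (2, "2022-08-04", "a"), (2, "2022-08-05", "a"),
]

QUERY = """
SELECT id, MIN(day) AS date_from, MAX(day) AS date_until, value
FROM (
    SELECT id, day, value,
           -- position within the id minus position within (id, value):
           -- constant for consecutive rows that share a value
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY day)
         - ROW_NUMBER() OVER (PARTITION BY id, value ORDER BY day) AS island
    FROM daily
)
GROUP BY id, value, island
ORDER BY id, date_from
"""


def date_ranges(rows):
    """Collapse consecutive days with the same value into (from, until) ranges."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE daily (id INTEGER, day TEXT, value TEXT)")
    con.executemany("INSERT INTO daily VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


for row in date_ranges(ROWS):
    print(row)
```

The sketch groups by value as well as by the difference, because two different values can produce the same difference (for the sequence a, b, a the differences are 0, 1, 1).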
WITH CTE (id, dateD,valueD)
AS
(
SELECT 1, TO_DATE('01.08.22','DD.MM.YY'), 'a' FROM DUAL UNION ALL
SELECT 1, TO_DATE('02.08.22','DD.MM.YY'), 'a' FROM DUAL UNION ALL
SELECT 1, TO_DATE('03.08.22','DD.MM.YY'), 'a' FROM DUAL UNION ALL
SELECT 1, TO_DATE('04.08.22','DD.MM.YY'), 'b' FROM DUAL UNION ALL
SELECT 1, TO_DATE('05.08.22','DD.MM.YY'), 'b' FROM DUAL UNION ALL
SELECT 2, TO_DATE('01.08.22','DD.MM.YY'), 'a' FROM DUAL UNION ALL
SELECT 2, TO_DATE('02.08.22','DD.MM.YY'), 'a' FROM DUAL UNION ALL
SELECT 2, TO_DATE('03.08.22','DD.MM.YY'), 'c' FROM DUAL
)
SELECT C.ID, C.VALUED, MIN(C.DATED) AS MIN_DATE, MAX(C.DATED) AS MAX_DATE
FROM CTE C
GROUP BY C.ID,C.VALUED
ORDER BY C.ID
https://dbfiddle.uk/?rdbms=oracle_18&fiddle=47c87d60445ce262cd371177e31d5d63

How to calculate percentage in oracle sql

I have a table in which I have multiple IDs which can have a value or 0. The IDs come from different sources so I would like to know what is the percentage of IDs with the value 0 as a percentage of total IDs, for each source file.
Sample Data:
ID Source
1 aaa
0 aaa
2 bbb
0 ccc
3 ccc
0 ccc
5 aaa
0 bbb
6 bbb
7 bbb
I need to display Output like:
CountOfIDs0 TotalIDs Source PercentageIDs0
2 3 ccc 66.6%
1 3 aaa 33.3%
1 4 bbb 25%
Thanks!
If you want a result like 66.6% rather than 66.7%, you would use trunc() rather than round() (although the latter is probably better). And you need to round a/b to three decimal places, so there is one left after you multiply by 100.
Then, you can have both counts in one query, and you can add the percentage calculation also in the same query.
select count(case when propkey = 0 then 1 end) countid0,
count(propkey) totalidcount,
source,
to_char(round(count(case when propkey = 0 then 1 end)/count(propkey), 3)*100)
|| '%' percentageids0
from......
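To make the trunc() versus round() distinction concrete, here is a small Python sketch over the sample data from the question. It is only an illustration of the arithmetic, not Oracle code: the function name and the use of integer per-mille arithmetic (to sidestep floating-point noise) are my own choices.

```python
from collections import Counter

# (ID, Source) pairs from the question's sample data
SAMPLE = [
    (1, "aaa"), (0, "aaa"), (2, "bbb"), (0, "ccc"), (3, "ccc"),
    (0, "ccc"), (5, "aaa"), (0, "bbb"), (6, "bbb"), (7, "bbb"),
]


def zero_share(rows, truncate=False):
    """Return {source: (zero_count, total_count, percentage_text)}."""
    totals = Counter(source for _, source in rows)
    zeros = Counter(source for id_, source in rows if id_ == 0)
    report = {}
    for source, total in totals.items():
        if truncate:
            # like trunc(a/b, 3): drop everything past the third decimal
            per_mille = zeros[source] * 1000 // total
        else:
            # like round(a/b, 3): round half up at the third decimal
            per_mille = (zeros[source] * 2000 + total) // (2 * total)
        text = f"{per_mille // 10}.{per_mille % 10}%"
        report[source] = (zeros[source], total, text)
    return report


print(zero_share(SAMPLE, truncate=True)["ccc"])  # (2, 3, '66.6%')
print(zero_share(SAMPLE)["ccc"])                 # (2, 3, '66.7%')
```

Three decimal places on the ratio leave exactly one decimal place once it is multiplied by 100, which is why the ratio is cut at the third decimal.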
Apply the round function to the percentage if needed.
select count(id) as TotalIDs ,Source, sum(case when id=0 then 1 end) countid0,
to_char((sum(case when id=0 then 1 end)/count(id))*100)||'%' as PercentageIDs0
from Table1 group by Source
For unique records, use a DISTINCT query.
I would do it this way:
With MyRows AS (
SELECT 1 ID, 'aaa' SOURCE FROM DUAL UNION ALL
SELECT 0, 'aaa' FROM DUAL UNION ALL
SELECT 2, 'bbb' FROM DUAL UNION ALL
SELECT 0, 'ccc' FROM DUAL UNION ALL
SELECT 3, 'ccc' FROM DUAL UNION ALL
SELECT 0, 'ccc' FROM DUAL UNION ALL
SELECT 5, 'aaa' FROM DUAL UNION ALL
SELECT 0, 'bbb' FROM DUAL UNION ALL
SELECT 6, 'bbb' FROM DUAL UNION ALL
SELECT 7, 'bbb' FROM DUAL
)
SELECT
DISTINCT SOURCE,
SUM(CASE WHEN ID = 0 THEN 1 ELSE 0 END) OVER (PARTITION BY SOURCE) ZERO_IDS,
COUNT(ID) OVER (PARTITION BY SOURCE) TOTAL_IDS,
(100 * SUM(CASE WHEN ID = 0 THEN 1 ELSE 0 END) OVER (PARTITION BY SOURCE))/(COUNT(ID) OVER (PARTITION BY SOURCE)) PERCENTAGE
FROM MyRows
;
I calculated the percentage of values in a column using the query below:
Select A.*, B.*, to_char((A.count_type/B.count_total)*100)||'%' from
(Select type_cd, count(type_cd) as count_type
from table1
group by type_cd) A
cross join
(Select count(type_cd) as count_total
from table1) B ;
select Source,
ROUND(100*cnt/sum(cnt) OVER (PARTITION BY p),2) as percentage,
sum(cnt) OVER (PARTITION BY p) as total
from(
select 1 p,
Source ,
count(Source) cnt -- NUMBER is a reserved word in Oracle, so use another alias
from declaration_assessment_result
GROUP by Source
)x

Oracle SQL (Toad): Expand table

Suppose I have an SQL (Oracle Toad) table named "test", which has the following fields and entries (dates are in dd/mm/yyyy format):
id ref_date value
---------------------
1 01/01/2014 20
1 01/02/2014 25
1 01/06/2014 3
1 01/09/2014 6
2 01/04/2015 7
2 01/08/2015 43
2 01/09/2015 85
2 01/12/2015 4
I know from how the table has been created that, since there are value entries for id = 1 for February 2014 and June 2014, the values for March through May 2014 must be 0. The same applies to July and August 2014 for id = 1, and for May through July 2015 and October through November 2015 for id = 2.
Now, if I want to calculate, say, the median of the value column for a given id, I will not arrive at the correct result using the table as it stands - as I'm missing 5 zero entries for each id.
I would therefore like to create/use the following (potentially just temporary) table...
id ref_date value
---------------------
1 01/01/2014 20
1 01/02/2014 25
1 01/03/2014 0
1 01/04/2014 0
1 01/05/2014 0
1 01/06/2014 3
1 01/07/2014 0
1 01/08/2014 0
1 01/09/2014 6
2 01/04/2015 7
2 01/05/2015 0
2 01/06/2015 0
2 01/07/2015 0
2 01/08/2015 43
2 01/09/2015 85
2 01/10/2015 0
2 01/11/2015 0
2 01/12/2015 4
...on which I could then compute the median by id:
select id, median(value) as med_value from test group by id
How do I do this? Or would there be an alternative way?
Many thanks,
Mr Clueless
In this solution, I build a table with all the "needed dates" and value of 0 for all of them. Then, instead of a join, I do a union all, group by id and ref_date and ADD the values in each group. If the date had a row with a value in the original table, then that's the resulting value; and if it didn't, the value will be 0. This avoids a join. In almost all cases a union all + aggregate will be faster (sometimes much faster) than a join.
I added more input data for more thorough testing. In your original question, you have two id's, and for both of them you have four positive values. You are missing five values in each case, so there will be five zeros (0) which means the median is 0 in both cases. For id=3 (which I added) I have three positive values and three zeros; the median is half of the smallest positive number. For id=4 I have just one value, which then should be the median as well.
The solution includes, in particular, an answer to your specific question - how to create the temporary table (which most likely doesn't need to be a temporary table at all, but an inline view). With factored subqueries (in the WITH clause), the optimizer decides whether to treat them as temporary tables or inline views; you can see what the optimizer decided if you look at the Explain Plan.
with
inputs ( id, ref_date, value ) as (
select 1, to_date('01/01/2014', 'dd/mm/yyyy'), 20 from dual union all
select 1, to_date('01/02/2014', 'dd/mm/yyyy'), 25 from dual union all
select 1, to_date('01/06/2014', 'dd/mm/yyyy'), 3 from dual union all
select 1, to_date('01/09/2014', 'dd/mm/yyyy'), 6 from dual union all
select 2, to_date('01/04/2015', 'dd/mm/yyyy'), 7 from dual union all
select 2, to_date('01/08/2015', 'dd/mm/yyyy'), 43 from dual union all
select 2, to_date('01/09/2015', 'dd/mm/yyyy'), 85 from dual union all
select 2, to_date('01/12/2015', 'dd/mm/yyyy'), 4 from dual union all
select 3, to_date('01/01/2016', 'dd/mm/yyyy'), 12 from dual union all
select 3, to_date('01/03/2016', 'dd/mm/yyyy'), 23 from dual union all
select 3, to_date('01/06/2016', 'dd/mm/yyyy'), 2 from dual union all
select 4, to_date('01/11/2014', 'dd/mm/yyyy'), 9 from dual
),
-- the "inputs" table constructed above is for testing only,
-- it is not part of the solution.
ranges ( id, min_date, max_date ) as (
select id, min(ref_date), max(ref_date)
from inputs
group by id
),
prep ( id, ref_date, value ) as (
select id, add_months(min_date, level - 1), 0
from ranges
connect by level <= 1 + months_between( max_date, min_date )
and prior id = id
and prior sys_guid() is not null
),
v ( id, ref_date, value ) as (
select id, ref_date, sum(value)
from ( select id, ref_date, value from prep union all
select id, ref_date, value from inputs
)
group by id, ref_date
)
select id, median(value) as median_value
from v
group by id
order by id -- ORDER BY is optional
;
ID MEDIAN_VALUE
-- ------------
1 0
2 0
3 1
4 9
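The "fill every month with 0, add the real values on top, then take the median" logic can be checked outside the database with a short Python sketch. This is only an illustration of the same arithmetic; the function name and the (id, year, month, value) row layout are assumptions made for the example.

```python
from collections import defaultdict
from statistics import median

# (id, year, month, value), same test data as the query above
INPUTS = [
    (1, 2014, 1, 20), (1, 2014, 2, 25), (1, 2014, 6, 3), (1, 2014, 9, 6),
    (2, 2015, 4, 7), (2, 2015, 8, 43), (2, 2015, 9, 85), (2, 2015, 12, 4),
    (3, 2016, 1, 12), (3, 2016, 3, 23), (3, 2016, 6, 2),
    (4, 2014, 11, 9),
]


def median_with_gaps(rows):
    """Median per id, treating every missing month between the first and
    last ref_date of that id as a row with value 0."""
    by_id = defaultdict(dict)
    for id_, year, month, value in rows:
        month_index = year * 12 + (month - 1)   # months on one continuous axis
        by_id[id_][month_index] = by_id[id_].get(month_index, 0) + value
    result = {}
    for id_, months in by_id.items():
        first, last = min(months), max(months)
        # zero for every month in range, overridden where a value exists
        filled = [months.get(m, 0) for m in range(first, last + 1)]
        result[id_] = median(filled)
    return result


print(median_with_gaps(INPUTS))
```

For the data above this gives 0 for ids 1 and 2, 1 for id 3, and 9 for id 4, matching the query output.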
A second approach, assuming ref_date is of type DATE:
with int1 as (select id
, max(ref_date) as max_date
, min(ref_date) as min_date from test group by id )
, s(n) as (select level -1 from dual connect by level <= 1 + (select max(months_between(max_date, min_date)) from int1 ) ) -- 1 + so the last month is generated too
select i.id
, add_months(i.min_date,s.n) as ref_date
, nvl(value,0) as value
from int1 i
join s on add_months(i.min_date,s.n) <= i.max_date
LEFT join test t on t.id = i.id and add_months(i.min_date,s.n) = t.ref_date
And with median
with int1 as (select id
, max(ref_date) as max_date
, min(ref_date) as min_date from test group by id )
, s(n) as (select level -1 from dual connect by level <= 1 + (select max(months_between(max_date, min_date)) from int1 ) ) -- 1 + so the last month is generated too
select i.id
, MEDIAN(nvl(value,0)) as value
from int1 i
join s on add_months(i.min_date,s.n) <= i.max_date
LEFT join test t on t.id = i.id and add_months(i.min_date,s.n) = t.ref_date
group by i.id

select rows between two character values of a column

I have a table which shows as below:
S.No | Action
1 | New
2 | Dependent
3 | Dependent
4 | Dependent
5 | New
6 | Dependent
7 | Dependent
8 | New
9 | Dependent
10 | Dependent
Here I want to select the rows between the first two 'New' values in the Action column, including the first 'New' row but not the second, i.e. the half-open range [New, New).
For example:
In this case, I want to select rows 1,2,3,4.
Please let me know how to do this.
Hmmm. Let's count up the cumulative number of times that New appears as a value and use that:
select t.*
from (select t.*,
sum(case when action = 'New' then 1 else 0 end) over (order by s_no) as cume_new
from t
) t
where cume_new = 1;
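The running count is easy to see in plain Python. The sketch below is only an illustration of the logic (the function name is made up): every row gets the number of 'New' rows seen so far, and the rows where that number is 1 are exactly the half-open range [first New, second New).

```python
from itertools import accumulate

ROWS = [
    (1, "New"), (2, "Dependent"), (3, "Dependent"), (4, "Dependent"),
    (5, "New"), (6, "Dependent"), (7, "Dependent"),
    (8, "New"), (9, "Dependent"), (10, "Dependent"),
]


def first_new_block(rows):
    """Rows from the first 'New' up to, but not including, the second."""
    ordered = sorted(rows)                                  # order by s_no
    flags = [1 if action == "New" else 0 for _, action in ordered]
    running = accumulate(flags)                             # cumulative count of 'New'
    return [row for row, count in zip(ordered, running) if count == 1]


print([s_no for s_no, _ in first_new_block(ROWS)])  # [1, 2, 3, 4]
```

Changing the comparison to count == 2 would return the second block (rows 5 to 7) in the same way.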
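You can do this with analytic functions:
1. Select the 'New' rows and number them by s_no.
2. Keep the first two and use LEAD to pair the s_no of the first 'New' row with that of the second.
3. Select the rows whose s_no falls between the two (first inclusive, second exclusive).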
with t as (
select 1 sno, 'New' action from dual union
select 2,'Dependent' from dual union
select 3,'Dependent' from dual union
select 4,'Dependent' from dual union
select 5,'New' from dual union
select 6,'Dependent' from dual union
select 7,'Dependent' from dual union
select 8,'New' from dual union
select 9,'Dependent' from dual union
select 10,'Dependent' from dual
)
select *
from (select *
from (select sno, lead(sno) over (order by sno) a
from ( select row_number() over (partition by action order by Sno) t,
t.sno
from t
where t.action = 'New'
) a
where t <=2 )
where a is not null) a, t
where t.sno >= a.sno and t.sno < a.a

SQL Grouping by Ranges

I have a data set that has timestamped entries over various sets of groups.
Timestamp -- Group -- Value
---------------------------
1 -- A -- 10
2 -- A -- 20
3 -- B -- 15
4 -- B -- 25
5 -- C -- 5
6 -- A -- 5
7 -- A -- 10
I want to sum these values by the Group field, but parsed as it appears in the data. For example, the above data would result in the following output:
Group -- Sum
A -- 30
B -- 40
C -- 5
A -- 15
I do not want this, which is all I've been able to come up with on my own so far:
Group -- Sum
A -- 45
B -- 40
C -- 5
Using Oracle 11g, this is what I've cobbled together so far. I know that this is wrong, but I'm hoping I'm at least on the right track with RANK(). In the real data, entries with the same group could be 2 timestamps apart, or 100; there could be one entry in a group, or 100 consecutive. It does not matter, I need them separated.
WITH SUB_Q AS
(SELECT K_ID
, GRP
, VAL
-- GET THE RANK FROM TIMESTAMP TO SEPARATE GROUPS WITH SAME NAME
, RANK() OVER(PARTITION BY K_ID ORDER BY TMSTAMP) AS RNK
FROM MY_TABLE
WHERE K_ID = 123)
SELECT T1.K_ID
, T1.GRP
, SUM(CASE
WHEN T1.GRP = T2.GRP THEN
T1.VAL
ELSE
0
END) AS TOTAL_VALUE
FROM SUB_Q T1 -- MAIN VALUE
INNER JOIN SUB_Q T2 -- TIMESTAMP AFTER
ON T1.K_ID = T2.K_ID
AND T1.RNK = T2.RNK - 1
GROUP BY T1.K_ID
, T1.GRP
Is it possible to group in this way? How would I go about doing this?
I approach this problem by defining a group as the difference of two row_number() values:
select group, sum(value)
from (select t.*,
(row_number() over (order by timestamp) -
row_number() over (partition by group order by timestamp)
) as grp
from my_table t
) t
group by group, grp
order by min(timestamp);
The difference of the two row numbers is constant across adjacent rows that have the same group value, so it identifies each run.
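To see why the difference stays constant, it helps to compute both row numbers by hand. The following Python sketch (an illustration only, with a made-up function name) walks the sample rows in timestamp order, keeps an overall row number and a per-group row number, and sums the values per (group, difference) pair.

```python
from collections import defaultdict

# (timestamp, group, value) from the question
ROWS = [
    (1, "A", 10), (2, "A", 20), (3, "B", 15), (4, "B", 25),
    (5, "C", 5), (6, "A", 5), (7, "A", 10),
]


def sum_by_run(rows):
    """Sum values per run of consecutive rows that share a group."""
    seen = defaultdict(int)      # per-group row number so far
    islands = {}                 # (group, difference) -> (first timestamp, total)
    for overall_rn, (ts, group, value) in enumerate(sorted(rows), start=1):
        seen[group] += 1
        # Both numbers advance by 1 while the group repeats, so the
        # difference only changes when another group appears in between.
        key = (group, overall_rn - seen[group])
        first_ts, total = islands.get(key, (ts, 0))
        islands[key] = (first_ts, total + value)
    ordered = sorted(islands.items(), key=lambda item: item[1][0])
    return [(group, total) for (group, _), (_, total) in ordered]


print(sum_by_run(ROWS))  # [('A', 30), ('B', 40), ('C', 5), ('A', 15)]
```

For the sample, the differences are 0, 0, 2, 2, 4, 3, 3, so the two runs of A land in different buckets (0 and 3).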
A solution using LAG and windowed analytic functions:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE TEST ( "Timestamp", "Group", Value ) AS
SELECT 1, 'A', 10 FROM DUAL
UNION ALL SELECT 2, 'A', 20 FROM DUAL
UNION ALL SELECT 3, 'B', 15 FROM DUAL
UNION ALL SELECT 4, 'B', 25 FROM DUAL
UNION ALL SELECT 5, 'C', 5 FROM DUAL
UNION ALL SELECT 6, 'A', 5 FROM DUAL
UNION ALL SELECT 7, 'A', 10 FROM DUAL;
Query 1:
WITH changes AS (
SELECT t.*,
CASE WHEN LAG( "Group" ) OVER ( ORDER BY "Timestamp" ) = "Group" THEN 0 ELSE 1 END AS hasChangedGroup
FROM TEST t
),
groups AS (
SELECT "Group",
VALUE,
SUM( hasChangedGroup ) OVER ( ORDER BY "Timestamp" ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS grp
FROM changes
)
SELECT "Group",
SUM( VALUE )
FROM Groups
GROUP BY "Group", grp
ORDER BY grp
Results:
| Group | SUM(VALUE) |
|-------|------------|
| A | 30 |
| B | 40 |
| C | 5 |
| A | 15 |
This is a typical "start_of_group" problem (see here: https://timurakhmadeev.wordpress.com/2013/07/21/start_of_group/)
In your case, it would be as follows:
with t as (
select 1 timestamp, 'A' grp, 10 value from dual union all
select 2, 'A', 20 from dual union all
select 3, 'B', 15 from dual union all
select 4, 'B', 25 from dual union all
select 5, 'C', 5 from dual union all
select 6, 'A', 5 from dual union all
select 7, 'A', 10 from dual
)
select min(timestamp), grp, sum(value) sum_value
from (
select t.*
, sum(start_of_group) over (order by timestamp) grp_id
from (
select t.*
, case when grp = lag(grp) over (order by timestamp) then 0 else 1 end
start_of_group
from t
) t
)
group by grp_id, grp
order by min(timestamp)
;
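The start-of-group pattern is not Oracle-specific. As a hedged sketch, the same two steps (flag each row that starts a new run with LAG, then turn the flags into a run id with a running SUM) can be run on SQLite from Python, assuming SQLite 3.25 or newer for window functions. The table name entries, the column name ts, and the function name are hypothetical.

```python
import sqlite3

ROWS = [
    (1, "A", 10), (2, "A", 20), (3, "B", 15), (4, "B", 25),
    (5, "C", 5), (6, "A", 5), (7, "A", 10),
]

QUERY = """
WITH flagged AS (   -- 1 where the group differs from the previous row's group
    SELECT ts, grp, value,
           CASE WHEN grp = LAG(grp) OVER (ORDER BY ts) THEN 0 ELSE 1 END
               AS start_of_group
    FROM entries
),
numbered AS (       -- running total of the flags gives one id per run
    SELECT ts, grp, value,
           SUM(start_of_group) OVER (ORDER BY ts) AS grp_id
    FROM flagged
)
SELECT MIN(ts) AS first_ts, grp, SUM(value) AS sum_value
FROM numbered
GROUP BY grp_id, grp
ORDER BY first_ts
"""


def sum_runs(rows):
    """Return (first timestamp, group, sum) for each run of equal groups."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE entries (ts INTEGER, grp TEXT, value INTEGER)")
    con.executemany("INSERT INTO entries VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


for row in sum_runs(ROWS):
    print(row)
```

On the first row LAG returns NULL, the comparison is not true, and the CASE falls through to 1, so the first row always starts a run.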