Oracle SQL (Toad): Expand table - sql

Suppose I have an SQL (Oracle Toad) table named "test", which has the following fields and entries (dates are in dd/mm/yyyy format):
id ref_date value
---------------------
1 01/01/2014 20
1 01/02/2014 25
1 01/06/2014 3
1 01/09/2014 6
2 01/04/2015 7
2 01/08/2015 43
2 01/09/2015 85
2 01/12/2015 4
I know from how the table has been created that, since there are value entries for id = 1 for February 2014 and June 2014, the values for March through May 2014 must be 0. The same applies to July and August 2014 for id = 1, and for May through July 2015 and October through November 2015 for id = 2.
Now, if I want to calculate, say, the median of the value column for a given id, I will not arrive at the correct result using the table as it stands - as I'm missing 5 zero entries for each id.
I would therefore like to create/use the following (potentially just temporary) table...
id ref_date value
---------------------
1 01/01/2014 20
1 01/02/2014 25
1 01/03/2014 0
1 01/04/2014 0
1 01/05/2014 0
1 01/06/2014 3
1 01/07/2014 0
1 01/08/2014 0
1 01/09/2014 6
2 01/04/2015 7
2 01/05/2015 0
2 01/06/2015 0
2 01/07/2015 0
2 01/08/2015 43
2 01/09/2015 85
2 01/10/2015 0
2 01/11/2015 0
2 01/12/2015 4
...on which I could then compute the median by id:
select id, median(value) as med_value from test group by id
How do I do this? Or would there be an alternative way?
Many thanks,
Mr Clueless

In this solution, I build a table with all the "needed dates" and value of 0 for all of them. Then, instead of a join, I do a union all, group by id and ref_date and ADD the values in each group. If the date had a row with a value in the original table, then that's the resulting value; and if it didn't, the value will be 0. This avoids a join. In almost all cases a union all + aggregate will be faster (sometimes much faster) than a join.
I added more input data for more thorough testing. In your original question, you have two id's, and for both of them you have four positive values. You are missing five values in each case, so there will be five zeros (0) which means the median is 0 in both cases. For id=3 (which I added) I have three positive values and three zeros; the median is half of the smallest positive number. For id=4 I have just one value, which then should be the median as well.
The solution includes, in particular, an answer to your specific question: how to create the temporary table (which most likely doesn't need to be a temporary table at all, just an inline view). With factored subqueries (in the WITH clause), the optimizer decides whether to treat them as temporary tables or inline views; you can see what the optimizer decided if you look at the Explain Plan.
with
inputs ( id, ref_date, value ) as (
select 1, to_date('01/01/2014', 'dd/mm/yyyy'), 20 from dual union all
select 1, to_date('01/02/2014', 'dd/mm/yyyy'), 25 from dual union all
select 1, to_date('01/06/2014', 'dd/mm/yyyy'), 3 from dual union all
select 1, to_date('01/09/2014', 'dd/mm/yyyy'), 6 from dual union all
select 2, to_date('01/04/2015', 'dd/mm/yyyy'), 7 from dual union all
select 2, to_date('01/08/2015', 'dd/mm/yyyy'), 43 from dual union all
select 2, to_date('01/09/2015', 'dd/mm/yyyy'), 85 from dual union all
select 2, to_date('01/12/2015', 'dd/mm/yyyy'), 4 from dual union all
select 3, to_date('01/01/2016', 'dd/mm/yyyy'), 12 from dual union all
select 3, to_date('01/03/2016', 'dd/mm/yyyy'), 23 from dual union all
select 3, to_date('01/06/2016', 'dd/mm/yyyy'), 2 from dual union all
select 4, to_date('01/11/2014', 'dd/mm/yyyy'), 9 from dual
),
-- the "inputs" table constructed above is for testing only,
-- it is not part of the solution.
ranges ( id, min_date, max_date ) as (
select id, min(ref_date), max(ref_date)
from inputs
group by id
),
prep ( id, ref_date, value ) as (
select id, add_months(min_date, level - 1), 0
from ranges
connect by level <= 1 + months_between( max_date, min_date )
and prior id = id
and prior sys_guid() is not null
),
v ( id, ref_date, value ) as (
select id, ref_date, sum(value)
from ( select id, ref_date, value from prep union all
select id, ref_date, value from inputs
)
group by id, ref_date
)
select id, median(value) as median_value
from v
group by id
order by id -- ORDER BY is optional
;
ID MEDIAN_VALUE
-- ------------
1 0
2 0
3 1
4 9
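
For readers who want to check the expected medians outside the database, here is a minimal Python sketch of the same idea: fill every missing month between each id's first and last date with 0, then take the median. The helper names (`add_months`, `months_between`, `medians_with_gaps_filled`) are my own and only mimic the Oracle functions for first-of-month dates; this is not the query above, just a cross-check under that assumption.

```python
from datetime import date
from statistics import median

def add_months(d, n):
    """Shift a first-of-month date by n months (enough for this data set)."""
    total = d.year * 12 + (d.month - 1) + n
    return date(total // 12, total % 12 + 1, 1)

def months_between(later, earlier):
    """Whole months between two first-of-month dates."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

def medians_with_gaps_filled(rows):
    """rows: (id, ref_date, value). Missing months inside each id's range count as 0."""
    by_id = {}
    for id_, ref_date, value in rows:
        by_id.setdefault(id_, {})[ref_date] = value
    result = {}
    for id_, values in by_id.items():
        first, last = min(values), max(values)
        span = months_between(last, first) + 1  # inclusive of both end months
        filled = [values.get(add_months(first, k), 0) for k in range(span)]
        result[id_] = median(filled)
    return result

rows = [
    (1, date(2014, 1, 1), 20), (1, date(2014, 2, 1), 25),
    (1, date(2014, 6, 1), 3), (1, date(2014, 9, 1), 6),
    (2, date(2015, 4, 1), 7), (2, date(2015, 8, 1), 43),
    (2, date(2015, 9, 1), 85), (2, date(2015, 12, 1), 4),
    (3, date(2016, 1, 1), 12), (3, date(2016, 3, 1), 23),
    (3, date(2016, 6, 1), 2), (4, date(2014, 11, 1), 9),
]
medians = medians_with_gaps_filled(rows)
print(medians)
```

Note the `+ 1` when computing the span: a range from January to September covers nine months, not eight.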

If ref_date is of DATE type, here is a second approach:
with int1 as (select id
, max(ref_date) as max_date
, min(ref_date) as min_date from test group by id )
, s(n) as (select level -1 from dual connect by level <= (select max(months_between(max_date, min_date)) + 1 from int1 ) )
select i.id
, add_months(i.min_date,s.n) as ref_date
, nvl(value,0) as value
from int1 i
join s on add_months(i.min_date,s.n) <= i.max_date
LEFT join test t on t.id = i.id and add_months(i.min_date,s.n) = t.ref_date
And with median
with int1 as (select id
, max(ref_date) as max_date
, min(ref_date) as min_date from test group by id )
, s(n) as (select level -1 from dual connect by level <= (select max(months_between(max_date, min_date)) + 1 from int1 ) )
select i.id
, MEDIAN(nvl(value,0)) as value
from int1 i
join s on add_months(i.min_date,s.n) <= i.max_date
LEFT join test t on t.id = i.id and add_months(i.min_date,s.n) = t.ref_date
group by i.id

Related

How to group items by rows

I want to group shops by their number of items, but I am not sure of the syntax for creating a group that does not exist in the table. I want the output to look like this:
Group | Number of items
1 | XXX
2 | XXX
Group 1 would contain the shops whose number of items is less than 10, while group 2 would contain those with 10 or more. I have the data for the number of items, but I need to create the group number and I am not sure how. Thank you in advance.
Way I have tried:
SELECT
case when b.item_stock < 10 then count(a.shopid) else null end as Group_1,
case when b.item_stock >= 10 or b.item_stock < 100 then count(a.shopid) else null end as Group_2
FROM `table_a` a
left join `table_b` b
on a.id= b.id
where registration_time between "2017-01-01" and "2017-05-31"
group by b.item_stock
LIMIT 1000
Below is the BigQuery way of doing this
select 'group_' || range_bucket(item_stock, [0, 10]) as group_id,
count(*) as number_of_items
from your_table
group by group_id
If applied to dummy data such as
with your_table as (
select 'ID001' shop_id, 40 item_stock union all
select 'ID002', 20 union all
select 'ID003', 30 union all
select 'ID004', 9 union all
select 'ID005', 44 union all
select 'ID006', 22 union all
select 'ID007', 28 union all
select 'ID008', 35 union all
select 'ID009', 20 union all
select 'ID010', 4 union all
select 'ID011', 5 union all
select 'ID012', 45 union all
select 'ID013', 29 union all
select 'ID014', 8 union all
select 'ID015', 40 union all
select 'ID016', 26 union all
select 'ID017', 31 union all
select 'ID018', 48 union all
select 'ID019', 45 union all
select 'ID020', 13
)
the output is
group_id | number_of_items
group_1 | 4
group_2 | 16
A benefit of this solution is that it extends easily to any number of ranges, just by adding more boundaries to the range_bucket function,
for example: range_bucket(item_stock, [0, 10, 50, 100, 1000])
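
To make the bucketing rule concrete, here is a small Python sketch that mimics what range_bucket does as I understand it: the result is the number of boundaries that are less than or equal to the value. The Python function below is my own stand-in built on `bisect_right`, not BigQuery code, so treat it as an illustration of the rule only.

```python
from bisect import bisect_right
from collections import Counter

def range_bucket(point, boundaries):
    """Count of boundaries <= point: 0 below the first, len(boundaries) at or above the last."""
    return bisect_right(boundaries, point)

# item_stock values from the dummy data above
item_stock = [40, 20, 30, 9, 44, 22, 28, 35, 20, 4,
              5, 45, 29, 8, 40, 26, 31, 48, 45, 13]

counts = Counter(f"group_{range_bucket(s, [0, 10])}" for s in item_stock)
print(dict(counts))
```

With boundaries [0, 10], values from 0 to 9 land in bucket 1 and values of 10 or more land in bucket 2, which is why the group names line up with the ones asked for.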
From the example you've shared you were close to solving this one, just need to tweak your case statement.
The case statement in your query is splitting the groups into two separate columns, whereas you need these groups in one column with the totals to the right.
Consider the below change to your select statement.
case when b.item_stock < 10 then "Group_1"
when b.item_stock >= 10 then "Group_2" else null end as Groups,
count(a.shop_id) as total
Schema (MySQL v5.7)
CREATE TABLE id (
`shop_id` VARCHAR(5),
`item_stock` INTEGER
);
INSERT INTO id
(`shop_id`, `item_stock`)
VALUES
('ID001', '40'),
('ID002', '20'),
('ID003', '30'),
('ID004', '9'),
('ID005', '44'),
('ID006', '22'),
('ID007', '28'),
('ID008', '35'),
('ID009', '20'),
('ID010', '4'),
('ID011', '5'),
('ID012', '45'),
('ID013', '29'),
('ID014', '8'),
('ID015', '40'),
('ID016', '26'),
('ID017', '31'),
('ID018', '48'),
('ID019', '45'),
('ID020', '13');
Query #1
SELECT
case when item_stock < 10 then "Group_1"
when item_stock >= 10 then "Group_2" else null end as Groups,
count(shop_id) as total
FROM id group by 1;
Groups | total
Group_1 | 4
Group_2 | 16
Tom
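
If you want to try the CASE + GROUP BY pattern without a MySQL or BigQuery instance, a minimal sketch using Python's built-in sqlite3 module could look like this. SQLite is only a stand-in here, and the table name `shops` is my own choice; the point is that one CASE expression yields one label column that you then group on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shops (shop_id TEXT, item_stock INTEGER)")

# Same item_stock values as in the sample data above
stock = [40, 20, 30, 9, 44, 22, 28, 35, 20, 4,
         5, 45, 29, 8, 40, 26, 31, 48, 45, 13]
conn.executemany(
    "INSERT INTO shops VALUES (?, ?)",
    [(f"ID{n:03d}", s) for n, s in enumerate(stock, start=1)],
)

# One CASE expression produces the label; GROUP BY collapses rows per label
rows = conn.execute("""
    SELECT CASE WHEN item_stock < 10 THEN 'Group_1' ELSE 'Group_2' END AS grp,
           COUNT(shop_id) AS total
    FROM shops
    GROUP BY grp
    ORDER BY grp
""").fetchall()
print(rows)
```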

Filter based on condition in WHERE clause

I have a table in which I have to pick one of two rows when both are present. For example, if an ID has both ACCEPTED and SETTLED, I have to pick only SETTLED; otherwise I keep whatever row is there. Only ACCEPTED/SETTLED ever occur as duplicates.
Input:
Output:
Query Tried:
SELECT * FROM TABLE
WHERE CASE WHEN "Status" IN ('ACCEPTED','SETTLED') THEN 'SETTLED'
WHEN "Status" IN ('ACCEPTED') THEN 'ACCEPTED'
ELSE "Status" END In ('SETTLED','ACCEPTED')
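If your groups are defined by ID and Amount, you could do something like the following. Note that MAX picks SETTLED over ACCEPTED only because 'SETTLED' sorts later alphabetically: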
SELECT
t.ID,
MAX(t.Status),
t.Amount
FROM t
GROUP BY t.ID, t.Amount
ORDER BY t.ID
This is one option (sample data in lines #1 - 7; query begins at line #8). It ranks statuses so that SETTLED comes first, and then the rest of them.
SQL> with test (id, status, amount) as
2 (select 1, 'ACCEPTED', 13 from dual union all
3 select 1, 'SETTLED' , 13 from dual union all
4 select 2, 'SETTLED' , 155 from dual union all
5 select 3, 'ACCEPTED', 123 from dual union all
6 select 4, 'REJECTED', 140 from dual
7 )
8 select id, status, amount
9 from (select id, status, amount,
10 row_number() over (partition by id
11 order by case when status = 'SETTLED' then 1 else 2 end) rn
12 from test
13 )
14 where rn = 1;
ID STATUS AMOUNT
---------- -------- ----------
1 SETTLED 13
2 SETTLED 155
3 ACCEPTED 123
4 REJECTED 140
SQL>
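
The ranking idea does not depend on Oracle. As a rough sketch in Python (the function name `pick_preferred` is my own), you give each row a rank of 0 if it has the preferred status and 1 otherwise, then keep the lowest-ranked row per id. This mirrors the row_number() ordering above under the assumption that each id has at most one row per status.

```python
def pick_preferred(rows, preferred="SETTLED"):
    """Keep one row per id, favouring the preferred status when it exists."""
    best = {}
    for id_, status, amount in rows:
        rank = 0 if status == preferred else 1  # same role as the CASE in ORDER BY
        if id_ not in best or rank < best[id_][0]:
            best[id_] = (rank, (id_, status, amount))
    return [best[id_][1] for id_ in sorted(best)]

rows = [
    (1, "ACCEPTED", 13),
    (1, "SETTLED", 13),
    (2, "SETTLED", 155),
    (3, "ACCEPTED", 123),
    (4, "REJECTED", 140),
]
picked = pick_preferred(rows)
for row in picked:
    print(row)
```

Unlike the MAX(Status) trick, this does not rely on the alphabetical order of the status names.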

Back fill data in table using Oracle

ID Date NAME START_TIME END_TIME
1 2/15/2017 A 2/15/2017 3:40:39 PM 2/15/2017 3:41:17 PM
2 2/15/2017 B 2/15/2017 3:40:39 PM 2/15/2017 3:41:17 PM
3 2/15/2017 C 2/15/2017 3:40:39 PM 2/15/2017 3:41:17 PM
I am facing a problem where I have to backfill my table with these 3 entries for every day from Jan 2016 to today.
One solution I could try is to write Java code that loops over the dates, creates a new entry for each date, and inserts them using a generated query.
But is there any way I can do this using Oracle alone?
This is a commonly used way to generate dates given a start and an end date, which you can simply join to the list of your names to get what you need:
insert into yourTable ( ...)
with names as (
select 'A' as name from dual union all
select 'B' as name from dual union all
select 'C' as name from dual
),
dates as (
select date '2016-01-01' + level - 1 as yourDate
from dual
connect by date '2016-01-01' + level - 1 <= date '2017-02-20'
)
select rownum, name, yourDate
from names
cross join dates
This has to be slightly edited to better suit the number and types of your columns. A small example of how it works:
with names as (
select 'A' as name from dual union all
select 'B' as name from dual union all
select 'C' as name from dual
),
dates as (
select date '2017-02-18' + level - 1 as yourDate,
level as lev
from dual
connect by date '2017-02-18' + level - 1 <= date '2017-02-20')
select rownum, name, yourDate, lev
from names
cross join dates
gives:
ROWNUM N YOURDATE LEV
---------- - --------- ----------
1 A 18-FEB-17 1
2 B 18-FEB-17 1
3 C 18-FEB-17 1
4 A 19-FEB-17 2
5 B 19-FEB-17 2
6 C 19-FEB-17 2
7 A 20-FEB-17 3
8 B 20-FEB-17 3
9 C 20-FEB-17 3
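
If it helps to see the cross join spelled out, here is a minimal Python sketch of the same shape: build the list of dates, then pair every date with every name. The helper `date_range` is my own; the numbering stands in for rownum, and the row order here is fixed by the loop, whereas in SQL you should not rely on a particular order without an ORDER BY.

```python
from datetime import date, timedelta
from itertools import product

def date_range(start, end):
    """All dates from start to end, both inclusive."""
    days = (end - start).days
    return [start + timedelta(days=k) for k in range(days + 1)]

names = ["A", "B", "C"]
dates = date_range(date(2017, 2, 18), date(2017, 2, 20))

# product(dates, names) plays the role of the CROSS JOIN
rows = [
    (n, name, d)
    for n, (d, name) in enumerate(product(dates, names), start=1)
]
for row in rows:
    print(row)
```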
As a basic concept, something like this would do it... You would need to either adapt for the A, B and C, or repeat for each.
insert into MyTable (ID, Date, Name, StartTime, EndTime)
-- in Oracle the WITH clause goes after INSERT INTO, as part of the subquery
with Numbers (NN) as
(
select 1 as NN
from dual
union all
select NN+1
from Numbers
where NN <2000
)
-- select list follows the column order of the INSERT
select NN + 3, -- If repeating, replace the 3 with the max(id) after each run
to_date('20170215','YYYYMMDD') - NN,
'A',
to_date('20170215 154039','YYYYMMDD HH24MISS') - NN,
to_date('20170215 154117','YYYYMMDD HH24MISS') - NN
from Numbers
where NN <= 365
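
To see the recursive number generator in action without an Oracle instance, here is a small sketch using Python's built-in sqlite3 module. SQLite is only a stand-in: it needs the RECURSIVE keyword, has no DUAL, and uses date() modifiers instead of Oracle date arithmetic. I also cap the series at 5 rows to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# numbers(nn) counts 1..5; each nn becomes one day before 2017-02-15
rows = conn.execute("""
    WITH RECURSIVE numbers (nn) AS (
        SELECT 1
        UNION ALL
        SELECT nn + 1 FROM numbers WHERE nn < 5
    )
    SELECT nn + 3 AS id,
           date('2017-02-15', '-' || nn || ' days') AS ref_date,
           'A' AS name
    FROM numbers
    ORDER BY nn
""").fetchall()
for row in rows:
    print(row)
```

The ids start at 4 because the original table already holds ids 1 to 3.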

Sql query duplicate column with different effective start date

I have two tables, x_person and x_person_name, with the following structure:
x_person
eff_start_date eff_end_date person_number
01-jan-1990 31-dec-4712 2
01-feb-1990 31-dec-4712 2
01-jan-1990 31-dec-4712 1
x_person_name
eff_start_date eff_end_date person_number name
01-jan-1990 31-dec-4712 2 freida
01-feb-1990 31-dec-4712 2 sam
01-jan-1990 31-dec-4712 1 isha
Now I want to check whether the same employee has the same effective start dates in these two tables. To find those who don't, I created this query:
select * from
(
select distinct min(x.Effective_Start_Date) over(partition by x.person_number) Effective_Start_Date_name, y.Effective_Start_Date,x.person_number
from x_person_name x,x_person y where
x.person_number=y.person_number
)
where Effective_Start_Date_name <> Effective_Start_Date;
But this query does not work. For example, person number 2 covers 2 different people with 2 different records in x_person, yet it still comes up in the output.
Output I am getting:
effective_start_date y.effective_start_date person_number
01-jan-1990 01-feb-1990 2
Whereas this row shouldn't be returned.
Assuming that you need the rows from x_person not matching x_person_name, maybe this can help:
select *
from x_person p
left outer join x_person_name pn
ON (
p.person_number = pn.person_number and
p.eff_start_date = pn.eff_start_date
)
where pn.person_number is null
with x_person(eff_start_date, eff_end_date, person_number)
as (select to_date('01-jan-1990', 'dd-mon-yyyy'), to_date('31-dec-4712', 'dd-mon-yyyy'), 2 from dual
union all
select to_date('01-feb-1990', 'dd-mon-yyyy'), to_date('31-dec-4712', 'dd-mon-yyyy'), 2 from dual
union all
select to_date('01-jan-1990', 'dd-mon-yyyy'), to_date('31-dec-4712', 'dd-mon-yyyy'), 1 from dual)
, x_person_name(eff_start_date
, eff_end_date
, person_number
, name)
as (select to_date('01-jan-1990', 'dd-mon-yyyy'), to_date('31-dec-4712', 'dd-mon-yyyy'), 2, 'freida' from dual
union all
select to_date('01-feb-1990', 'dd-mon-yyyy'), to_date('31-dec-4712', 'dd-mon-yyyy'), 2, 'sam' from dual
union all
select to_date('01-jan-1990', 'dd-mon-yyyy'), to_date('31-dec-4712', 'dd-mon-yyyy'), 1, 'isha' from dual)
select * from (
select min(eff_start_date) over( partition by person_number) ed,y.* from x_person_name y
) t1 where not exists (select 1 from x_person where person_number= t1.person_number and ed = eff_start_date)
I think you are looking for something called an anti-join: return only the rows from table1 for which there are no matching rows in table2.
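
Here is a minimal anti-join sketch you can run with Python's built-in sqlite3 module (SQLite as a stand-in for Oracle, dates stored as ISO text, eff_end_date left out for brevity). With the sample data from the question every x_person row has a match, so I added a made-up person 3 whose start dates differ between the two tables; that row is purely hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x_person (eff_start_date TEXT, person_number INTEGER);
    CREATE TABLE x_person_name (eff_start_date TEXT, person_number INTEGER, name TEXT);
    INSERT INTO x_person VALUES
        ('1990-01-01', 2), ('1990-02-01', 2), ('1990-01-01', 1),
        ('1990-03-01', 3);   -- hypothetical row with no matching name record
    INSERT INTO x_person_name VALUES
        ('1990-01-01', 2, 'freida'), ('1990-02-01', 2, 'sam'),
        ('1990-01-01', 1, 'isha'),
        ('1990-01-01', 3, 'hypothetical');
""")

# LEFT JOIN keeps every x_person row; the IS NULL filter keeps only the unmatched ones
unmatched = conn.execute("""
    SELECT p.person_number, p.eff_start_date
    FROM x_person p
    LEFT JOIN x_person_name pn
      ON p.person_number = pn.person_number
     AND p.eff_start_date = pn.eff_start_date
    WHERE pn.person_number IS NULL
""").fetchall()
print(unmatched)
```

Person 2 does not show up, because each of its two x_person rows finds a name row with the same start date.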

How do I select records with max from id column if two of three other fields are identical

I have a table that stores costs for consumables.
consumable_cost_id consumable_type_id from_date cost
1 1 01/01/2000 £10.95
2 2 01/01/2000 £5.95
3 3 01/01/2000 £1.98
24 3 01/11/2013 £2.98
27 3 22/11/2013 £3.98
33 3 22/11/2013 £4.98
34 3 22/11/2013 £5.98
35 3 22/11/2013 £6.98
If the same consumable is updated more than once on the same day I would like to select only the row where the consumable_cost_id is biggest on that day. Desired output would be:
consumable_cost_id consumable_type_id from_date cost
1 1 01/01/2000 £10.95
2 2 01/01/2000 £5.95
3 3 01/01/2000 £1.98
24 3 01/11/2013 £2.98
35 3 22/11/2013 £6.98
Edit:
Here is my attempt (adapted from another post I found on here):
SELECT cc.*
FROM
consumable_costs cc
INNER JOIN
(
SELECT
from_date,
MAX(consumable_cost_id) AS MaxCcId
FROM consumable_costs
GROUP BY from_date
) groupedcc
ON cc.from_date = groupedcc.from_date
AND cc.consumable_cost_id = groupedcc.MaxCcId
You were very close. This seems to work for me:
SELECT cc.*
FROM
consumable_cost AS cc
INNER JOIN
(
SELECT
Max(consumable_cost_id) AS max_id,
consumable_type_id,
from_date
FROM consumable_cost
GROUP BY consumable_type_id, from_date
) AS m
ON cc.consumable_cost_id = m.max_id
SELECT * FROM consumable_cost
GROUP by consumable_type_id, from_date
ORDER BY cost DESC;
Assuming consumable_cost_id is unique.
SELECT * FROM T t1
WHERE EXISTS(
SELECT t2.consumable_type_id, t2.from_date FROM T t2
GROUP by t2.consumable_type_id, t2.from_date
HAVING MAX(t2.consumable_cost_id) = t1.consumable_cost_id);
Because a comment said that this was returning an incorrect result, I created a test query for Oracle that shows this query works. As I said, it's for Oracle, but there is really no reason why this should not work in MS Access. The only Oracle-specific feature I used here is FROM DUAL to generate the sample data.
WITH T AS
(
SELECT 1 AS consumable_cost_id,1 AS consumable_type_id, TO_DATE('01/01/2000','DD/MM/YYYY') AS FROM_DATE, '£10.95' AS COST FROM DUAL
UNION ALL
SELECT 2,2,TO_DATE('01/01/2000','DD/MM/YYYY'),'£5.95' FROM DUAL
UNION ALL
SELECT 3,3,TO_DATE('01/01/2000','DD/MM/YYYY'),'£1.98' FROM DUAL
UNION ALL
SELECT 24,3,TO_DATE('01/11/2013','DD/MM/YYYY'),'£1.98' FROM DUAL
UNION ALL
SELECT 27,3,TO_DATE('22/11/2013','DD/MM/YYYY'),'£1.98' FROM DUAL
UNION ALL
SELECT 33,3,TO_DATE('22/11/2013','DD/MM/YYYY'),'£1.98' FROM DUAL
UNION ALL
SELECT 34,3,TO_DATE('22/11/2013','DD/MM/YYYY'),'£1.98' FROM DUAL
UNION ALL
SELECT 35,3,TO_DATE('22/11/2013','DD/MM/YYYY'),'£1.98' FROM DUAL
)
SELECT * FROM T t1
WHERE EXISTS(
SELECT t2.consumable_type_id, t2.from_date FROM T t2
GROUP by t2.consumable_type_id, t2.from_date
HAVING MAX(t2.consumable_cost_id) = t1.consumable_cost_id);
Result:
1 1 01-JAN-00 £10.95
2 2 01-JAN-00 £5.95
3 3 01-JAN-00 £1.98
24 3 01-NOV-13 £1.98
35 3 22-NOV-13 £1.98
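
For anyone who wants to sanity-check the expected rows outside the database, here is a small Python sketch of the same "highest id per (consumable_type_id, from_date)" rule, using the data from the question. The function name `latest_cost_rows` is my own, and like the queries above it assumes consumable_cost_id is unique.

```python
def latest_cost_rows(rows):
    """Keep, for each (consumable_type_id, from_date), the row with the largest id."""
    best = {}
    for row in rows:
        cost_id, type_id, from_date, _cost = row
        key = (type_id, from_date)  # same grouping as GROUP BY type_id, from_date
        if key not in best or cost_id > best[key][0]:
            best[key] = row
    return sorted(best.values())

rows = [
    (1, 1, "01/01/2000", "£10.95"),
    (2, 2, "01/01/2000", "£5.95"),
    (3, 3, "01/01/2000", "£1.98"),
    (24, 3, "01/11/2013", "£2.98"),
    (27, 3, "22/11/2013", "£3.98"),
    (33, 3, "22/11/2013", "£4.98"),
    (34, 3, "22/11/2013", "£5.98"),
    (35, 3, "22/11/2013", "£6.98"),
]
kept = latest_cost_rows(rows)
for row in kept:
    print(row)
```

Grouping by from_date alone, as in the original attempt, would merge ids 1, 2 and 3 into one group and keep only id 3, which is why consumable_type_id has to be part of the key.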