Count one column based on another column

I have a dataset
case_id | subcase_id
1 | 1-1
1 | 1-2
1 | 1-3
1 | 1-6
2 | 2-1
2 | 2-7
I want the following output. The idea is to count the occurrence of each subcase within its case.
case_id | subcase_id | count
1 | 1-1 | 1
1 | 1-2 | 2
1 | 1-3 | 3
1 | 1-6 | 4
2 | 2-1 | 1
2 | 2-7 | 2

You can try using the row_number() function:
select
    case_id,
    subcase_id,
    row_number() over (partition by case_id
                       order by cast(substr(subcase_id, 1, instr(subcase_id, '-') - 1) as number),
                                cast(substr(subcase_id, instr(subcase_id, '-') + 1) as number)) as rn
from tablename
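The same idea can be checked outside Oracle. Here is a minimal sketch using Python's sqlite3 (window functions need SQLite 3.25+); the table name `cases` and the sample data are assumptions mirroring the question, with SQLite's `substr`/`instr` standing in for Oracle's:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE cases (case_id INTEGER, subcase_id TEXT);
INSERT INTO cases VALUES
  (1,'1-1'),(1,'1-2'),(1,'1-3'),(1,'1-6'),
  (2,'2-1'),(2,'2-7');
""")

# Number subcases per case, ordering by the two numeric parts of subcase_id
# so that '1-10' would sort after '1-9' rather than after '1-1'.
rows = con.execute("""
SELECT case_id, subcase_id,
       ROW_NUMBER() OVER (
         PARTITION BY case_id
         ORDER BY CAST(substr(subcase_id, 1, instr(subcase_id, '-') - 1) AS INTEGER),
                  CAST(substr(subcase_id, instr(subcase_id, '-') + 1) AS INTEGER)
       ) AS rn
FROM cases
ORDER BY case_id, rn
""").fetchall()

for r in rows:
    print(r)
```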

You may use a count() over (partition by .. order by ..) clause, as follows:
with t(case_id,subcase_id) as
(
select 1,'1-1' from dual union all
select 1,'1-2' from dual union all
select 1,'1-3' from dual union all
select 1,'1-6' from dual union all
select 2,'2-1' from dual union all
select 2,'2-7' from dual
)
select t.*,
count(*) over (partition by case_id order by subcase_id)
as result
from t;
CASE_ID SUBCASE_ID RESULT
------- ---------- ------
1 1-1 1
1 1-2 2
1 1-3 3
1 1-6 4
2 2-1 1
2 2-7 2
This works where subcase_id is distinct for all values within a case (so the running count increases by one per row), while case_id changes rarely.
Rextester Demo
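The running COUNT trick is portable. A quick check with Python's sqlite3 (window functions need SQLite 3.25+); the CTE data from the answer above is loaded into an assumed table `t`:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t (case_id INTEGER, subcase_id TEXT);
INSERT INTO t VALUES
  (1,'1-1'),(1,'1-2'),(1,'1-3'),(1,'1-6'),
  (2,'2-1'),(2,'2-7');
""")

# The running count per case_id acts as an ordinal because subcase_id
# is unique within each case (ties would be counted together).
rows = con.execute("""
SELECT case_id, subcase_id,
       COUNT(*) OVER (PARTITION BY case_id ORDER BY subcase_id) AS result
FROM t
ORDER BY case_id, subcase_id
""").fetchall()

for r in rows:
    print(r)
```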

Here is a query which should behave as you want. We have to isolate the two numeric components of the subcase_id, and then cast them to integers, to avoid sorting this column as text.
SELECT
case_id,
subcase_id,
ROW_NUMBER() OVER (PARTITION BY case_id
ORDER BY TO_NUMBER(SUBSTR(subcase_id, 1, INSTR(subcase_id, '-') - 1)),
TO_NUMBER(SUBSTR(subcase_id, INSTR(subcase_id, '-') + 1))) rn
FROM yourTable
ORDER BY
case_id,
TO_NUMBER(SUBSTR(subcase_id, 1, INSTR(subcase_id, '-') - 1)),
TO_NUMBER(SUBSTR(subcase_id, INSTR(subcase_id, '-') + 1));
Demo
It is not a good idea to treat the subcase_id column as both text and numbers. If you really have a long term need to sort on this column, then I suggest breaking out the two numeric components as separate number columns.

Related

Rows as Columns Oracle DB

I'm trying to display the rows of a table as columns. This is the normal output:
ITEM | CODE | SET | CREATION | CATEGORY | GROUP
1 1 CP 06/11/2020 10 52
2 3 PN 07/11/2020 9 57
3 1 PNI 08/11/2020 12 53
This is how I need to display it:
ITEM | 1 | 2 | 3
CODE | 1 | 3 | 1
SET | CP | PN | PNI
CREATION | 06/11/2020 | 07/11/2020 | 08/11/2020
CATEGORY | 10 | 9 | 12
GROUP | 52 | 57 | 53
I'm quite new to SQL. I tried to use the Oracle pivot function but I'm not getting the desired output. Is this even possible? Any suggestions?
The simplest method is probably union all. Assuming all columns are strings:
with t as (
select t.*, row_number() over (order by item) as seqnum
from yourtable
)
select 'item',
max(case when seqnum = 1 then item end),
max(case when seqnum = 2 then item end),
max(case when seqnum = 3 then item end)
from t
union all
select 'code',
max(case when seqnum = 1 then code end),
max(case when seqnum = 2 then code end),
max(case when seqnum = 3 then code end)
from t
union all
select 'set',
max(case when seqnum = 1 then "SET" end),
max(case when seqnum = 2 then "SET" end),
max(case when seqnum = 3 then "SET" end)
from t
union all
-- and so on for the rest of the columns
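A sketch of the UNION ALL approach with Python's sqlite3 (window functions need SQLite 3.25+). The column is renamed to set_ here since SET is a reserved word; only three source columns are shown, the rest follow the same pattern:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE yourtable (item INTEGER, code INTEGER, set_ TEXT);
INSERT INTO yourtable VALUES (1, 1, 'CP'), (2, 3, 'PN'), (3, 1, 'PNI');
""")

# One UNION ALL branch per source column; seqnum spreads the rows
# into output columns via MAX(CASE ...).
rows = con.execute("""
WITH t AS (
  SELECT yourtable.*, ROW_NUMBER() OVER (ORDER BY item) AS seqnum
  FROM yourtable
)
SELECT 'item' AS col,
       MAX(CASE WHEN seqnum = 1 THEN item END) AS c1,
       MAX(CASE WHEN seqnum = 2 THEN item END) AS c2,
       MAX(CASE WHEN seqnum = 3 THEN item END) AS c3
FROM t
UNION ALL
SELECT 'code',
       MAX(CASE WHEN seqnum = 1 THEN code END),
       MAX(CASE WHEN seqnum = 2 THEN code END),
       MAX(CASE WHEN seqnum = 3 THEN code END)
FROM t
UNION ALL
SELECT 'set_',
       MAX(CASE WHEN seqnum = 1 THEN set_ END),
       MAX(CASE WHEN seqnum = 2 THEN set_ END),
       MAX(CASE WHEN seqnum = 3 THEN set_ END)
FROM t
""").fetchall()

for r in rows:
    print(r)
```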
Transposing means unpivot, then pivot (along different dimensions).
When you unpivot, you put together values of different data types in the same column of the (intermediate) result set. That can't work; you must first convert everything to strings. (Which means that, if you have data that cannot be converted to strings, the whole project will fail.) Note that, for simplicity, I set my NLS_DATE_FORMAT to 'dd/mm/yyyy'; in production code, you should give an explicit format model in the TO_CHAR calls for dates.
Also, you need to know the number of input rows (items) in advance - otherwise you must use dynamic SQL, which is not an intro-level topic - and not good practice anyway.
Here is how it goes. Note that I changed two column names (SET and GROUP are reserved keywords in Oracle, they can't be column names) in the sample data, which I included in the WITH clause. Of course, the WITH clause is not part of the solution - it's there just for testing.
I also used an advanced feature of UNPIVOT, to create a column to order by at the end. That is not critical - you can use a more elementary version of UNPIVOT, and use a different approach to get the output in the right order.
with
sample_data (ITEM, CODE, set_, creation, category, group_) as (
select 1, 1, 'CP' , to_date('06/11/2020'), 10, 52 from dual union all
select 2, 3, 'PN' , to_date('07/11/2020'), 9, 57 from dual union all
select 3, 1, 'PNI', to_date('08/11/2020'), 12, 53 from dual
)
select col, "1", "2", "3"
from (
select to_char(item) as item, to_char(code) as code, set_,
to_char(creation) as creation, to_char(category) as category,
to_char(group_) as group_, rownum as rn
from sample_data
)
unpivot (value for (col, ord) in (item as ('ITEM', 1), code as ('CODE', 2),
set_ as ('SET', 3), creation as ('CREATION', 4),
category as ('CATEGORY', 5), group_ as ('GROUP', 6)))
pivot (min(value) for rn in (1, 2, 3))
order by ord
;
COL 1 2 3
-------- -------------------- -------------------- --------------------
ITEM 1 2 3
CODE 1 3 1
SET CP PN PNI
CREATION 06/11/2020 07/11/2020 08/11/2020
CATEGORY 10 9 12
GROUP 52 57 53
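SQLite and many other engines lack UNPIVOT/PIVOT; when the result just needs displaying, the same transpose can be done client-side. A sketch in plain Python, with the values already stringified as the answer recommends (the `headers` and `data` literals are assumptions mirroring the sample table):

```python
# Rows as the query would return them, every value already converted to text.
headers = ["ITEM", "CODE", "SET", "CREATION", "CATEGORY", "GROUP"]
data = [
    ("1", "1", "CP",  "06/11/2020", "10", "52"),
    ("2", "3", "PN",  "07/11/2020", "9",  "57"),
    ("3", "1", "PNI", "08/11/2020", "12", "53"),
]

# zip(*data) flips rows and columns; pair each column with its header.
transposed = [(h, *col) for h, col in zip(headers, zip(*data))]

for line in transposed:
    print(line)
```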

DENSE_RANK() Query

I have something similar to the below dataset...
ID RowNumber
101 1
101 2
101 3
101 4
101 5
101 1
101 2
What I would like to get is an additional column as below...
ID RowNumber New
101 1 1
101 2 1
101 3 1
101 4 1
101 5 1
101 1 2
101 2 2
I have toyed with dense_rank(), but no such luck.
As Gordon already mentioned, you need a column that specifies the order of the data. If I take ID as the order-by column, the following logic may help you get your desired result:
WITH your_table(ID,RowNumber)
AS
(
SELECT 101,1 UNION ALL
SELECT 101,2 UNION ALL
SELECT 101,3 UNION ALL
SELECT 101,4 UNION ALL
SELECT 101,5 UNION ALL
SELECT 101,1 UNION ALL
SELECT 101,2
)
SELECT A.ID,A.RowNumber,
SUM(RN) OVER
(
ORDER BY ID
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) +1 New
FROM
(
SELECT *,
CASE
WHEN LAG(RowNumber) OVER(ORDER BY ID) > RowNumber THEN 1
ELSE 0
END RN
FROM your_table
)A
The above will always start a new row number whenever the value in RowNumber decreases from the previous one. Alternatively, the same output can also be achieved by starting a new row number whenever the value 1 is found. This is a slightly more static option:
SELECT A.ID,A.RowNumber,
SUM(RN) OVER
(
ORDER BY ID
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) New
FROM(
SELECT *,
CASE
WHEN RowNumber = 1 THEN 1
ELSE 0
END RN
FROM your_table
)A
The output is:
ID RowNumber New
101 1 1
101 2 1
101 3 1
101 4 1
101 5 1
101 1 2
101 2 2
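A sketch of the LAG variant with Python's sqlite3 (window functions need SQLite 3.25+). Since the original data has no ordering column, an explicit seq primary key is added here as an assumption to make the order deterministic:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE your_table (seq INTEGER PRIMARY KEY, id INTEGER, rownumber INTEGER);
INSERT INTO your_table (id, rownumber) VALUES
  (101,1),(101,2),(101,3),(101,4),(101,5),(101,1),(101,2);
""")

# Flag each row where rownumber dropped below the previous row's value,
# then take a running sum of the flags (+1 so the first group is 1).
rows = con.execute("""
WITH flagged AS (
  SELECT seq, id, rownumber,
         CASE WHEN LAG(rownumber) OVER (ORDER BY seq) > rownumber
              THEN 1 ELSE 0 END AS rn
  FROM your_table
)
SELECT id, rownumber, SUM(rn) OVER (ORDER BY seq) + 1 AS new
FROM flagged
ORDER BY seq
""").fetchall()

for r in rows:
    print(r)
```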
SQL tables represent unordered sets. There is no ordering unless a column specifies the ordering.
Assuming you have such a column, you can do what you want simply by counting the number of "1" up to each point:
select t.*,
sum(case when rownumber = 1 then 1 else 0 end) over (partition by id order by <ordering column>) as new
from t;
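Gordon's count-the-ones approach can be sketched with Python's sqlite3 (window functions need SQLite 3.25+); a seq primary key is assumed as the ordering column the answer asks for:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE your_table (seq INTEGER PRIMARY KEY, id INTEGER, rownumber INTEGER);
INSERT INTO your_table (id, rownumber) VALUES
  (101,1),(101,2),(101,3),(101,4),(101,5),(101,1),(101,2);
""")

# Count how many times rownumber = 1 has appeared so far within each id;
# every 1 starts a new group.
rows = con.execute("""
SELECT id, rownumber,
       SUM(CASE WHEN rownumber = 1 THEN 1 ELSE 0 END)
         OVER (PARTITION BY id ORDER BY seq) AS new
FROM your_table
ORDER BY seq
""").fetchall()

for r in rows:
    print(r)
```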
As Gordon alluded to, there is no default order in your example, so it's difficult to imagine how to get a deterministic result (i.e. one where the same values supplied to the same query always produce the exact same answer).
This sample data includes a sequential PK column which is used to define the order of this set.
DECLARE #tbl TABLE (PK INT IDENTITY, ID INT, RowNumber INT)
INSERT #tbl(ID, RowNumber) VALUES (101,1),(101,2),(101,3),(101,4),(101,5),(101,1),(101,2);
SELECT t.* FROM #tbl AS t;
Returns:
PK ID RowNumber
----- ------ -----------
1 101 1
2 101 2
3 101 3
4 101 4
5 101 5
6 101 1
7 101 2
This query uses DENSE_RANK to get you what you want:
DECLARE #tbl TABLE (PK INT IDENTITY, ID INT, RowNumber INT)
INSERT #tbl(ID, RowNumber) VALUES (101,1),(101,2),(101,3),(101,4),(101,5),(101,1),(101,2);
SELECT t.ID, t.RowNumber, New = DENSE_RANK() OVER (ORDER BY t.PK - RowNumber)
FROM #tbl AS t;
Returns:
ID RowNumber New
----- ----------- ------
101 1 1
101 2 1
101 3 1
101 4 1
101 5 1
101 1 2
101 2 2
Note that ORDER BY New does not affect the plan.
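The PK - RowNumber trick relies on RowNumber increasing by exactly 1 within each run, so the difference is constant per run. A sketch with Python's sqlite3 (window functions need SQLite 3.25+), where an auto-assigned INTEGER PRIMARY KEY plays the role of the IDENTITY column:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE tbl (pk INTEGER PRIMARY KEY, id INTEGER, rownumber INTEGER);
INSERT INTO tbl (id, rownumber) VALUES
  (101,1),(101,2),(101,3),(101,4),(101,5),(101,1),(101,2);
""")

# pk - rownumber is constant within each ascending run (0 for the first run,
# 5 for the second), so DENSE_RANK over that difference numbers the runs 1, 2, ...
rows = con.execute("""
SELECT id, rownumber,
       DENSE_RANK() OVER (ORDER BY pk - rownumber) AS new
FROM tbl
ORDER BY pk
""").fetchall()

for r in rows:
    print(r)
```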
Please try the below. Load the data into a temp table first:
select id, RowNumber,
       row_number() over (partition by RowNumber order by id) as New
from #temp
order by row_number() over (partition by RowNumber order by id), RowNumber

PostgreSQL last_value ignore nulls

I know this has already been asked, but why doesn't the solution below work? I want to fill value with the last non-null value, ordered by idx.
What I see:
idx | coalesce
-----+----------
1 | 2
2 | 4
3 |
4 |
5 | 10
(5 rows)
What I want:
idx | coalesce
-----+----------
1 | 2
2 | 4
3 | 4
4 | 4
5 | 10
(5 rows)
Code:
with base as (
select 1 as idx
, 2 as value
union
select 2 as idx
, 4 as value
union
select 3 as idx
, null as value
union
select 4 as idx
, null as value
union
select 5 as idx
, 10 as value
)
select idx
, coalesce(value
, last_value(value) over (order by case when value is null then -1
else idx
end))
from base
order by idx
What you want is lag(ignore nulls), which Postgres does not support. Here is one way to do what you want, using two window functions. The first defines the grouping for the NULL values and the second assigns the value:
select idx, value, coalesce(value, max(value) over (partition by grp))
from (select b.*, count(value) over (order by idx) as grp
from base b
) b
order by idx;
You can also do this without subqueries by using arrays. Basically, take the last element not counting NULLs:
select idx, value,
(array_remove(array_agg(value) over (order by idx), null))[count(value) over (order by idx)]
from base b
order by idx;
Here is a db<>fiddle.
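The first query (the array functions are Postgres-only) ports directly to SQLite; a sketch with Python's sqlite3 (window functions need SQLite 3.25+). COUNT(value) ignores NULLs, so the running count stays flat across the NULL rows, grouping them with the preceding non-NULL value:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE base (idx INTEGER, value INTEGER);
INSERT INTO base VALUES (1,2),(2,4),(3,NULL),(4,NULL),(5,10);
""")

# grp is constant over each non-NULL value and the NULLs that follow it;
# MAX(value) within the group is that single non-NULL value.
rows = con.execute("""
SELECT idx, value,
       COALESCE(value, MAX(value) OVER (PARTITION BY grp)) AS filled
FROM (SELECT b.*, COUNT(value) OVER (ORDER BY idx) AS grp FROM base b)
ORDER BY idx
""").fetchall()

for r in rows:
    print(r)
```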
The last_value here doesn't make sense to me, unless you can point out why it should work. Looking at the example, you need the last non-null value, which you can get as follows. I form a group from each non-null value and the nulls that follow it, so that first_value can pick up that non-null value:
with base as (
select 1 as idx , 2 as value union
select 2 as idx, -14 as value union
select 3 as idx , null as value union
select 4 as idx , null as value union
select 5 as idx , 1 as value
)
Select idx,value,
first_value(value) Over(partition by rn) as new_val
from(
select idx,value
,sum(case when value is not null then 1 end) over (order by idx) as rn
from base
) t
Here is the code:
http://sqlfiddle.com/#!15/fcda4/2
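The same grouping works in SQLite; a sketch with Python's sqlite3 (window functions need SQLite 3.25+). One assumption on top of the answer: ORDER BY idx is added inside the first_value window so the "first" row of each group is deterministic:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE base (idx INTEGER, value INTEGER);
INSERT INTO base VALUES (1,2),(2,-14),(3,NULL),(4,NULL),(5,1);
""")

# rn increments only on non-NULL rows, so each group is a non-NULL value
# plus its trailing NULLs; first_value then recovers that value.
rows = con.execute("""
SELECT idx, value,
       FIRST_VALUE(value) OVER (PARTITION BY rn ORDER BY idx) AS new_val
FROM (
  SELECT idx, value,
         SUM(CASE WHEN value IS NOT NULL THEN 1 END) OVER (ORDER BY idx) AS rn
  FROM base
)
ORDER BY idx
""").fetchall()

for r in rows:
    print(r)
```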
To see why your solution doesn't work, just look at the output if you order by the ordering in your window frame:
with base as (
select 1 as idx
, 2 as value
union
select 2 as idx
, 4 as value
union
select 3 as idx
, null as value
union
select 4 as idx
, null as value
union
select 5 as idx
, 10 as value
)
select idx, value from base
order by case when value is null then -1
else idx
end;
idx | value
-----+-------
3 |
4 |
1 | 2
2 | 4
5 | 10
The last_value() window function will pick the last value in the current frame. Without changing any of the frame defaults, this will be the current row.
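This is easy to see with a sketch using Python's sqlite3, which follows the same frame defaults: with an ORDER BY in the window and no explicit frame clause, last_value simply returns the current row's value, nulls included:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE base (idx INTEGER, value INTEGER);
INSERT INTO base VALUES (1,2),(2,4),(3,NULL),(4,NULL),(5,10);
""")

# Default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW,
# so the "last" value in the frame is always the current row itself.
rows = con.execute("""
SELECT idx, value, LAST_VALUE(value) OVER (ORDER BY idx) AS lv
FROM base
ORDER BY idx
""").fetchall()

for r in rows:
    print(r)
```

Every lv equals the row's own value, which is why the original query cannot carry a value forward.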

oracle dates group

How do I get an optimized query for this?
date_one | date_two
------------------------
01.02.1999 | 31.05.2003
01.01.2004 | 01.01.2010
02.01.2010 | 10.10.2011
11.10.2011 | (null)
I need to get this
date_one | date_two | group
------------------------------------
01.02.1999 | 31.05.2003 | 1
01.01.2004 | 01.01.2010 | 2
02.01.2010 | 10.10.2011 | 2
11.10.2011 | (null) | 2
The group number is assigned as follows. Order the rows by date_one ascending. First row gets group = 1. Then for each row if date_one is the date immediately following date_two of the previous row, the group number stays the same as in the previous row, otherwise it increases by one.
You can do this using left join and a cumulative sum:
select t.*, sum(case when tprev.date_one is null then 1 else 0 end) over (order by t.date_one) as grp
from t left join
t tprev
on t.date_one = tprev.date_two + 1;
The idea is to find where the gaps begin (using the left join) and then do a cumulative sum of such beginnings to define the group.
If you want to be more inscrutable, you could write this as:
select t.*,
count(*) over (order by t.date_one) - count(tprev.date_one) over (order by t.date_one) as grp
from t left join
t tprev
on t.date_one = tprev.date_two + 1;
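A sketch of the self-join approach with Python's sqlite3 (window functions need SQLite 3.25+), using ISO date strings and date(..., '+1 day') in place of Oracle's date_two + 1:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t (date_one TEXT, date_two TEXT);
INSERT INTO t VALUES
  ('1999-02-01','2003-05-31'),
  ('2004-01-01','2010-01-01'),
  ('2010-01-02','2011-10-10'),
  ('2011-10-11',NULL);
""")

# A row with no predecessor ending the day before starts a new group;
# the running sum of those starts is the group number.
rows = con.execute("""
SELECT t.date_one, t.date_two,
       SUM(CASE WHEN tprev.date_one IS NULL THEN 1 ELSE 0 END)
         OVER (ORDER BY t.date_one) AS grp
FROM t LEFT JOIN t tprev
  ON t.date_one = date(tprev.date_two, '+1 day')
ORDER BY t.date_one
""").fetchall()

for r in rows:
    print(r)
```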
One way is using window function:
select
date_one,
date_two,
sum(x) over (order by date_one) grp
from (
select
t.*,
case when
lag(date_two) over (order by date_one) + 1 =
date_one then 0 else 1 end x
from t
);
It finds the date_two from the previous row using the analytic function lag and checks whether it is in continuation with this row's date_one (in increasing order of date_one).
How it works:
lag(date_two) over (order by date_one)
(In the below explanation, when I say first, next, previous or last row, it's based on increasing order of date_one with null values at the end)
The above produces NULL for the first row, as there is no row before it to get date_two from, and the previous row's date_two for the subsequent rows.
case when
lag(date_two)
over (order by date_one) + 1 = date_one then 0
else 1 end
Since lag produces NULL for the very first row (and NULL = anything evaluates to NULL, which is never true), the output of the case expression will be 1.
For further rows, similar check will be done to produce a new column x in the query output which has value 1 when the previous row's date_two is not in continuation with this row's date_one.
Then finally, we can do an incremental sum on x to find the required group values. See the value of x below for understanding:
with t (date_one, date_two) as (
  select to_date('01.02.1999','dd.mm.yyyy'), to_date('31.05.2003','dd.mm.yyyy') from dual union all
  select to_date('01.01.2004','dd.mm.yyyy'), to_date('01.01.2010','dd.mm.yyyy') from dual union all
  select to_date('02.01.2010','dd.mm.yyyy'), to_date('10.10.2011','dd.mm.yyyy') from dual union all
  select to_date('11.10.2011','dd.mm.yyyy'), null from dual
)
select
  date_one,
  date_two,
  x,
  sum(x) over (order by date_one) grp
from (
  select
    t.*,
    case when lag(date_two) over (order by date_one) + 1 = date_one
         then 0 else 1 end x
  from t
);
DATE_ONE DATE_TWO X GRP
--------- --------- ---------- ----------
01-FEB-99 31-MAY-03 1 1
01-JAN-04 01-JAN-10 1 2
02-JAN-10 10-OCT-11 0 2
11-OCT-11 0 2

SQL query to Calculate allocation / netting

Here is my source data,
Group | Item | Capacity
-----------------------
1 | A | 100
1 | B | 80
1 | C | 20
2 | A | 90
2 | B | 40
2 | C | 20
The above data shows the capacity to consume "something" for each item.
Now suppose I have a maximum of 100 allocated to each group. I want to distribute this 100 over the items of each group, up to each item's maximum capacity. So my desired output is like this:
Group | Item | Capacity | consumption
-------------------------------------
1 | A | 100 | 100
1 | B | 80 | 0
1 | C | 20 | 0
2 | A | 90 | 90
2 | B | 40 | 10
2 | C | 20 | 0
My question is how do I do it in a single SQL query (preferably avoiding any subquery construct)? Please note, the number of items in each group is not fixed.
I was trying LAG() with a running SUM(), but could not quite produce the desired output...
select
    group, item, capacity,
    sum(capacity) over (partition by group order by item range between unbounded preceding and current row) run_tot
from table_name
Without a subquery using just the analytic SUM function:
create table mytable (group_id, item, capacity)
as
select 1, 'A', 100 from dual union all
select 1, 'B', 80 from dual union all
select 1, 'C', 20 from dual union all
select 2, 'A', 90 from dual union all
select 2, 'B', 40 from dual union all
select 2, 'C', 20 from dual;
Table created.
select group_id
     , item
     , capacity
     , case
         when sum(capacity) over (partition by group_id order by item) > 100 then 100
         else sum(capacity) over (partition by group_id order by item)
       end
     - case
         when nvl(sum(capacity) over (partition by group_id order by item rows between unbounded preceding and 1 preceding), 0) > 100 then 100
         else nvl(sum(capacity) over (partition by group_id order by item rows between unbounded preceding and 1 preceding), 0)
       end consumption
from mytable;
GROUP_ID I CAPACITY CONSUMPTION
---------- - ---------- -----------
1 A 100 100
1 B 80 0
1 C 20 0
2 A 90 90
2 B 40 10
2 C 20 0
6 rows selected.
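The two clamped running sums can be checked with Python's sqlite3 (window functions need SQLite 3.25+), where two-argument scalar min() plays the role of Oracle's LEAST: consumption is the clamped running total minus the clamped running total before this row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE mytable (group_id INTEGER, item TEXT, capacity INTEGER);
INSERT INTO mytable VALUES
  (1,'A',100),(1,'B',80),(1,'C',20),
  (2,'A',90),(2,'B',40),(2,'C',20);
""")

# consumption = clamp(running total, 100) - clamp(running total - capacity, 100)
rows = con.execute("""
SELECT group_id, item, capacity,
       MIN(100, SUM(capacity) OVER (PARTITION BY group_id ORDER BY item
             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW))
     - MIN(100, SUM(capacity) OVER (PARTITION BY group_id ORDER BY item
             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) - capacity) AS consumption
FROM mytable
ORDER BY group_id, item
""").fetchall()

for r in rows:
    print(r)
```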
Here's a solution using recursive subquery factoring. This clearly ignores your preference to avoid subqueries, but doing this in one pass might be impossible.
Probably the only way to do this in one pass is to use MODEL, which I'm not allowed to code after midnight. Maybe someone waking up in Europe can figure it out.
with ranked_items as
(
--Rank the items. row_number() should also randomly break ties.
select group_id, item, capacity,
row_number() over (partition by group_id order by item) consumer_rank
from consumption
),
consumer(group_id, item, consumer_rank, capacity, consumption, left_over) as
(
--Get the first item and distribute as much of the 100 as possible.
select
group_id,
item,
consumer_rank,
capacity,
least(100, capacity) consumption,
100 - least(100, capacity) left_over
from ranked_items
where consumer_rank = 1
union all
--Find the next row by GROUP_ID and the artificial CONSUMER_RANK.
--Distribute as much left-over from previous consumption as possible.
select
ranked_items.group_id,
ranked_items.item,
ranked_items.consumer_rank,
ranked_items.capacity,
least(left_over, ranked_items.capacity) consumption,
left_over - least(left_over, ranked_items.capacity) left_over
from ranked_items
join consumer
on ranked_items.group_id = consumer.group_id
and ranked_items.consumer_rank = consumer.consumer_rank + 1
)
select group_id, item, capacity, consumption
from consumer
order by group_id, item;
Sample data:
create table consumption(group_id number, item varchar2(1), capacity number);
insert into consumption
select 1, 'A' , 100 from dual union all
select 1, 'B' , 80 from dual union all
select 1, 'C' , 20 from dual union all
select 2, 'A' , 90 from dual union all
select 2, 'B' , 40 from dual union all
select 2, 'C' , 20 from dual;
commit;
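The recursive approach ports to any engine with recursive CTEs; a sketch with Python's sqlite3 (window functions need SQLite 3.25+), with scalar min() standing in for LEAST:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE consumption (group_id INTEGER, item TEXT, capacity INTEGER);
INSERT INTO consumption VALUES
  (1,'A',100),(1,'B',80),(1,'C',20),
  (2,'A',90),(2,'B',40),(2,'C',20);
""")

# Seed each group with its first ranked item, then hand the leftover
# budget down the remaining items one row at a time.
rows = con.execute("""
WITH RECURSIVE ranked AS (
  SELECT group_id, item, capacity,
         ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY item) AS rnk
  FROM consumption
),
alloc (group_id, item, rnk, capacity, consumed, left_over) AS (
  SELECT group_id, item, rnk, capacity,
         MIN(100, capacity), 100 - MIN(100, capacity)
  FROM ranked WHERE rnk = 1
  UNION ALL
  SELECT r.group_id, r.item, r.rnk, r.capacity,
         MIN(a.left_over, r.capacity),
         a.left_over - MIN(a.left_over, r.capacity)
  FROM ranked r JOIN alloc a
    ON r.group_id = a.group_id AND r.rnk = a.rnk + 1
)
SELECT group_id, item, capacity, consumed FROM alloc
ORDER BY group_id, item
""").fetchall()

for r in rows:
    print(r)
```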
Does this work as expected?
WITH t AS
(SELECT GROUP_ID, item, capacity,
SUM(capacity) OVER (PARTITION BY GROUP_ID ORDER BY item RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) sum_run,
GREATEST(100-SUM(capacity) OVER (PARTITION BY GROUP_ID ORDER BY item RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), 0) AS remain
FROM table_name)
SELECT t.*,
LEAST(sum_run,lag(remain, 1, 100) OVER (PARTITION BY GROUP_ID ORDER BY item)) AS run_tot
FROM t
select group_id, item, capacity,
       case when rn = 1 then capacity else 0 end consumption
from (select group_id, item, capacity,
             row_number() over (partition by group_id order by capacity desc) rn
      from mytable);