Ordinal numbers in select - sql

First of all, sorry for my bad English; it's not my native language :(
I'm fairly new to Oracle and I need help with the following. I have several records with the same ID, several values (which can repeat), and different creation dates.
I would like to assign an ordinal number to rows that share an ID and a value but have different dates.
For example
ID | Value  | Date   | Number
---|--------|--------|-------
A  | Value1 | 01.11. | 1
A  | Value1 | 02.11. | 2
A  | Value2 | 03.11. | null
A  | Value2 | 01.11. | null
B  | Value1 | 01.11. | 1
B  | Value1 | 03.11. | 2
B  | Value2 | 01.11. | null
C  | Value1 | 01.11. | 1
C  | Value2 | 01.11. | null
So for every ID in the first column where I have Value1, I want an incrementing number, and for the rest of the values I don't need anything.
I hope I'm not posting a duplicate question; I have tried to look it up, but I couldn't find any answer.
Thank you in advance!
Edit: I will accept 1 instead of NULL for the other values.

The basic idea is row_number() to get the sequential value and rank() to rank the values. You only want the first set to be enumerated. "First" corresponds to rank() having a value of 1. The rest get NULL:
select id, value, dt,
       (case when rank() over (partition by id order by value) = 1
             then row_number() over (partition by id order by value, dt)
        end) as num
from t;
Note that Date, Number, and Table are reserved words in Oracle, so the columns are renamed to dt and num here.
EDIT:
I realize that you might actually want the first value by time and not some other ordering. For that, use keep instead of rank():
select id, value, dt,
       (case when value = max(value) keep (dense_rank first order by dt)
                          over (partition by id)
             then row_number() over (partition by id order by value, dt)
        end) as num
from t;
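The rank()/row_number() combination can be sanity-checked outside Oracle in any engine with window functions. A minimal sketch using Python's sqlite3 (SQLite 3.25+), with the reserved column names Date and Number renamed to dt and num:

```python
import sqlite3

# Sample data from the question; Date/Number renamed to dt/num
# because they are reserved words.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t (id text, value text, dt text);
insert into t values
  ('A','Value1','01.11.'), ('A','Value1','02.11.'),
  ('A','Value2','03.11.'), ('A','Value2','01.11.'),
  ('B','Value1','01.11.'), ('B','Value1','03.11.'),
  ('B','Value2','01.11.'),
  ('C','Value1','01.11.'), ('C','Value2','01.11.');
""")

# rank() = 1 selects each id's first value; row_number() enumerates it.
rows = conn.execute("""
select id, value, dt,
       case when rank() over (partition by id order by value) = 1
            then row_number() over (partition by id order by value, dt)
       end as num
from t
order by id, value, dt
""").fetchall()
for r in rows:
    print(r)
```

The Value1 rows come out numbered 1, 2, ... per ID, and every other row gets NULL, matching the desired output.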

Hmm... Hope I understood correctly:
with my_table as (
    select 'A' id, 'Value1' value, '01.11.' dt, 1 num from dual union all
    select 'A', 'Value1', '02.11.', 2 from dual union all
    select 'A', 'Value2', '03.11.', null from dual union all
    select 'A', 'Value2', '01.11.', null from dual union all
    select 'B', 'Value1', '01.11.', 1 from dual union all
    select 'B', 'Value1', '03.11.', 2 from dual union all
    select 'B', 'Value2', '01.11.', null from dual union all
    select 'C', 'Value1', '01.11.', 1 from dual union all
    select 'C', 'Value2', '01.11.', null from dual)
select *
from (select t.*, count(distinct dt) over (partition by value, id) diff_cnt
      from my_table t) tt
where tt.diff_cnt > 1;
result:
ID VALUE  DT     NUM DIFF_CNT
-- ------ ------ --- --------
A  Value1 01.11.   1        2
A  Value1 02.11.   2        2
B  Value1 01.11.   1        2
B  Value1 03.11.   2        2
A  Value2 01.11.            2
A  Value2 03.11.            2

Related

What is the equivalent of ARRAY_AGG(foo where bar) in BigQuery?

Let's say I want to do this:
with source_data as (
    select 1 as id, 'a' as sub_id, true as turned_on union all
    select 1 as id, 'b' as sub_id, true as turned_on union all
    select 2 as id, 'a' as sub_id, false as turned_on union all
    select 2 as id, 'b' as sub_id, true as turned_on union all
    select 3 as id, 'a' as sub_id, false as turned_on union all
    select 3 as id, 'b' as sub_id, false as turned_on
)
select
    id,
    array_agg(sub_id where turned_on) as all_on,      -- invalid syntax
    array_agg(sub_id where not turned_on) as all_off  -- invalid syntax
from
    source_data
group by id
to get something like
| id | all_on | all_off |
| --- | ------ | ------- |
| 1 | [a, b] | |
| 2 | [b] | [a] |
| 3 | | [a, b] |
The marked lines are invalid because I can't do ARRAY_AGG(... WHERE ...). From the docs I gather I could probably accomplish something similar using analytic functions (particularly PARTITION BY), but I don't understand how.
Is it possible to write a query that aggregates arrays the way I illustrated above? How do I do it?
Consider using if together with ignore nulls:
with source_data as (
    select 1 as id, 'a' as sub_id, true as turned_on union all
    select 1 as id, 'b' as sub_id, true as turned_on union all
    select 2 as id, 'a' as sub_id, false as turned_on union all
    select 2 as id, 'b' as sub_id, true as turned_on union all
    select 3 as id, 'a' as sub_id, false as turned_on union all
    select 3 as id, 'b' as sub_id, false as turned_on
)
select
    id,
    array_agg(if(turned_on = true, sub_id, null) ignore nulls) as all_on,
    array_agg(if(turned_on = false, sub_id, null) ignore nulls) as all_off
from
    source_data
group by id
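The answer above is BigQuery-specific, but the underlying conditional-aggregation pattern works in most engines. A sketch in Python's sqlite3, where group_concat stands in for ARRAY_AGG (SQL aggregates skip NULL inputs, which mirrors IGNORE NULLS here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table source_data (id int, sub_id text, turned_on int);
insert into source_data values
  (1,'a',1), (1,'b',1),
  (2,'a',0), (2,'b',1),
  (3,'a',0), (3,'b',0);
""")

# The CASE expression yields NULL for non-matching rows, and
# group_concat silently drops those NULLs.
rows = conn.execute("""
select id,
       group_concat(case when turned_on then sub_id end)     as all_on,
       group_concat(case when not turned_on then sub_id end) as all_off
from source_data
group by id
order by id
""").fetchall()
```

Unlike ARRAY_AGG(... ORDER BY ...), group_concat does not guarantee element order, so any check on its output should treat the result as a set.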

SQL theory: Filtering out duplicates in one column, picking lowest value in other column

I am trying to figure out the best way to remove rows from a result set where either the value in one column or the value in a different column has a duplicate in the result set.
Imagine the results of a query are as follows:
a_value | b_value
-----------------
1 | 1
2 | 1
2 | 2
3 | 1
4 | 3
5 | 2
6 | 4
6 | 5
What I want to do is:
Eliminate all rows that have duplicate values in a_value
Pick only 1 row for a given b_value
So I'd want the filtered results to end up like this after eliminating a_value duplicates:
a_value | b_value
-----------------
1 | 1
3 | 1
4 | 3
5 | 2
And then like this after picking only a single b_value:
a_value | b_value
-----------------
1 | 1
4 | 3
5 | 2
I'd appreciate suggestions on how to accomplish this task in an efficient way via SQL.
with
q_res ( a_value, b_value ) as (
select 1, 1 from dual union all
select 2, 1 from dual union all
select 2, 2 from dual union all
select 3, 1 from dual union all
select 4, 3 from dual union all
select 5, 2 from dual union all
select 6, 4 from dual union all
select 6, 5 from dual
)
-- end test data; solution begins below
select min(a_value) as a_value, b_value
from (
select a_value, min(b_value) as b_value
from q_res
group by a_value
having count(*) = 1
)
group by b_value
order by a_value -- ORDER BY is optional
;
A_VALUE B_VALUE
------- -------
1 1
4 3
5 2
1) The inner query filters out every a_value that appears more than once and keeps the remaining rows as t2. Joining t2 back to the input table t1 then gives the full data without any duplicates, satisfying requirement #1.
SELECT t1.*
FROM Table t1,
(
SELECT a_value
FROM Table
GROUP BY a_value
HAVING COUNT(*) = 1
) t2
WHERE t1.a_value = t2.a_value;
2) Once the filtered data is obtained, I assign a rank to each row of the step-1 result with ROW_NUMBER(), partitioned by b_value, and select only the rows with rank = 1.
SELECT X.a_value,
X.b_value
FROM
(
SELECT t1.*,
ROW_NUMBER() OVER ( PARTITION BY t1.b_value ORDER BY t1.a_value,t1.b_value ) AS rn
FROM Table t1,
(
SELECT a_value
FROM Table
GROUP BY a_value
HAVING COUNT(*) = 1
) t2
WHERE t1.a_value = t2.a_value
) X
WHERE X.rn = 1;
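Both steps can be checked end-to-end against the sample data in an embedded engine. A sketch with Python's sqlite3, using the same table and column names as above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table q_res (a_value int, b_value int);
insert into q_res values
  (1,1),(2,1),(2,2),(3,1),(4,3),(5,2),(6,4),(6,5);
""")

# Step 1: keep only a_values that occur once.
# Step 2: keep one row per b_value (lowest a_value first).
rows = conn.execute("""
select x.a_value, x.b_value
from (select t1.a_value, t1.b_value,
             row_number() over (partition by t1.b_value
                                order by t1.a_value) as rn
      from q_res t1
      join (select a_value
            from q_res
            group by a_value
            having count(*) = 1) t2
        on t1.a_value = t2.a_value) x
where x.rn = 1
order by x.a_value
""").fetchall()
```

This reproduces the expected final result: rows (1,1), (4,3), and (5,2).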

In Oracle, how to select specific row while aggregating all rows

I have a requirement to both aggregate all rows by id and find one specific row among the rows with the same id. It is like two SQL queries, but I want to do it in one. I'm using an Oracle database.
for example,table t1 whose data looks like:
id | name | num
----- -------- -------
1 | 'a' | 1
2 | 'b' | 3
2 | 'c' | 6
2 | 'd' | 6
I want to aggregate the data by id, find the name with the highest num, and sum all of the id's num values into total_num.
If two rows share the highest num, pick the first one ('c' here).
id | highest_num | name_of_highest_num | total_num | avg_num
----- ------------- --------------------- ------------ -------------------
1 | 1 | 'a' | 1 | 1
2 | 6 | 'c' | 15 | 5
Can I get this result by 1 Oracle SQL query?
Thanks in advance for any replies.
Oracle Setup:
CREATE TABLE table_name ( id, name, num ) AS
SELECT 1, 'a', 1 FROM DUAL UNION ALL
SELECT 2, 'b', 3 FROM DUAL UNION ALL
SELECT 2, 'c', 6 FROM DUAL UNION ALL
SELECT 2, 'd', 6 FROM DUAL;
Query:
SELECT id,
MAX( num ) AS highest_num,
MAX( name ) KEEP ( DENSE_RANK LAST ORDER BY num ) AS name_of_highest_num,
SUM( num ) AS total_num,
AVG( num ) AS avg_num
FROM table_name
GROUP BY id
Output:
ID HIGHEST_NUM NAME_OF_HIGHEST_NUM TOTAL_NUM AVG_NUM
-- ----------- ------------------- --------- -------
1 1 a 1 1
2 6 d 15 5
Here's one option using row_number in a subquery with conditional aggregation:
select id,
max(num) as highest_num,
max(case when rn = 1 then name end) as name_of_highest_num,
sum(num) as total_num,
avg(num) as avg_num
from (
    select id, name, num,
        row_number() over (partition by id order by num desc, name) rn
    from table_name
) t
group by id
Sounds like you want to use some analytic functions. Something like this should work
select id,
num highest_num,
name name_of_highest_num,
total total_num,
average avg_num
from (select id,
num,
name,
rank() over (partition by id
order by num desc, name asc) rnk,
sum(num) over (partition by id) total,
avg(num) over (partition by id) average
from table_name t1)
where rnk = 1
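The analytic-function version can be verified quickly as well, since SQLite 3.25+ supports the same window functions. A sketch via Python's sqlite3, building its own copy of the sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t1 (id int, name text, num int);
insert into t1 values (1,'a',1), (2,'b',3), (2,'c',6), (2,'d',6);
""")

# rank() breaks the num tie by name, so 'c' wins over 'd' for id 2.
rows = conn.execute("""
select id, num as highest_num, name as name_of_highest_num,
       total as total_num, average as avg_num
from (select id, num, name,
             rank() over (partition by id
                          order by num desc, name asc) as rnk,
             sum(num) over (partition by id) as total,
             avg(num) over (partition by id) as average
      from t1)
where rnk = 1
order by id
""").fetchall()
```

This matches the desired output table: id 1 keeps 'a' with total 1, and id 2 keeps 'c' with total 15 and average 5.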

Select Contiguous Rows for Run Length Encoding

I have some enormous tables of values, and dates, that I want to compress using run length encoding. The most obvious way (to me) to do this is to select all the distinct value combinations, and the minimum and maximum dates. The problem with this is that it would miss any instances where a mapping stops, and then starts again.
Id | Value1 | Value2 | Value3 | DataDate
------------------------------------------
01 | 1 | 2 | 3 | 2000-01-01
01 | 1 | 2 | 3 | 2000-01-02
01 | 1 | 2 | 3 | 2000-01-03
01 | 1 | 2 | 3 | 2000-01-04
01 | A | B | C | 2000-01-05
01 | A | B | C | 2000-01-06
01 | 1 | 2 | 3 | 2000-01-07
Would be encoded this way as
Id | Value1 | Value2 | Value3 | FromDate | ToDate
-----------------------------------------------------
01 | 1 | 2 | 3 | 2000-01-01| 2000-01-07
01 | A | B | C | 2000-01-05| 2000-01-06
Which is clearly wrong.
What I'd like is a query that would return each set of continuous dates that exist for each set of values.
Alternatively, if I'm looking at this arse-backwards, any other advice would be appreciated.
Try this:
DECLARE #MyTable TABLE (
Id INT,
Value1 VARCHAR(10),
Value2 VARCHAR(10),
Value3 VARCHAR(10),
DataDate DATE
);
INSERT #MyTable
SELECT 01, '1', '2', '3', '2000-01-01' UNION ALL
SELECT 01, '1', '2', '3', '2000-01-02' UNION ALL
SELECT 01, '1', '2', '3', '2000-01-03' UNION ALL
SELECT 01, '1', '2', '3', '2000-01-04' UNION ALL
SELECT 01, 'A', 'B', 'C', '2000-01-05' UNION ALL
SELECT 01, 'A', 'B', 'C', '2000-01-06' UNION ALL
SELECT 01, '1', '2', '3', '2000-01-07'
SELECT Id, Value1, Value2, Value3,
MIN(DataDate) AS FromDate, MAX(DataDate) AS ToDate
FROM (
SELECT x.Id, x.Value1, x.Value2, x.Value3,
x.DataDate,
GroupNum =
DATEDIFF(DAY, 0, x.DataDate) -
ROW_NUMBER() OVER(PARTITION BY x.Id, x.Value1, x.Value2, x.Value3 ORDER BY x.DataDate)
FROM #MyTable x
) y
GROUP BY Id, Value1, Value2, Value3, GroupNum
Results:
Id Value1 Value2 Value3 FromDate ToDate
-- ------ ------ ------ ---------- ----------
1 1 2 3 2000-01-01 2000-01-04
1 1 2 3 2000-01-07 2000-01-07
1 A B C 2000-01-05 2000-01-06
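The DATEDIFF-minus-ROW_NUMBER grouping trick above is portable. A sketch of the same gaps-and-islands idea in Python's sqlite3, with julianday() standing in for DATEDIFF(DAY, 0, ...):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table my_table (id int, value1 text, value2 text, value3 text,
                       data_date text);
insert into my_table values
  (1,'1','2','3','2000-01-01'), (1,'1','2','3','2000-01-02'),
  (1,'1','2','3','2000-01-03'), (1,'1','2','3','2000-01-04'),
  (1,'A','B','C','2000-01-05'), (1,'A','B','C','2000-01-06'),
  (1,'1','2','3','2000-01-07');
""")

# Day number minus row_number is constant within a contiguous run,
# so it serves as a group key for each island.
rows = conn.execute("""
select id, value1, value2, value3,
       min(data_date) as from_date, max(data_date) as to_date
from (select m.*,
             julianday(data_date)
               - row_number() over (partition by id, value1, value2, value3
                                    order by data_date) as grp
      from my_table m)
group by id, value1, value2, value3, grp
order by from_date
""").fetchall()
```

The 2000-01-07 row correctly comes out as its own island instead of being merged into the 2000-01-01 run.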
Try this:
SELECT Id, Value1, Value2, Value3, MIN(DataDate) AS FromDate, MAX(DataDate) AS ToDate
FROM YourTable
GROUP BY Id, Value1, Value2, Value3
You'll probably want to use windowing functions. Try something like this:
select
id, value1, value2, value3,
from_date=update_date,
to_date=lead(update_date) over (partition by id order by update_date)
from (
select
t.*
,is_changed=
case when
value1 <> lag(value1) over (partition by id order by update_date) or
(lag(value1) over (partition by id order by update_date) is null and value1 is not null) or
value2 <> lag(value2) over (partition by id order by update_date) or
(lag(value2) over (partition by id order by update_date) is null and value2 is not null) or
value3 <> lag(value3) over (partition by id order by update_date) or
(lag(value3) over (partition by id order by update_date) is null and value3 is not null)
then 1 else 0 end
from test t
) t2
where is_changed = 1
order by id, update_date
Please note that this query relies on the LAG() and LEAD() window functions, and two other things:
Separate tests for each "value" column; if you have a lot of columns to test, you might consider creating a single hash value to simplify the equality checks
The "to_date" is identical to the next record's "from_date", which means you might need to test for values using >= from_date and < to_date to make the run-lengths mutually exclusive
Note that I used the following sample data in my testing:
create table test(id int, value1 varchar(3), value2 varchar(3), value3 varchar(3), update_date datetime)
insert into test values
(1, 'A', 'B', 'C', '1/1/2014'),
(1, 'A', 'B', 'C', '2/1/2014'),
(1, 'X', 'Y', 'Z', '3/1/2014'),
(1, 'A', 'B', 'C', '4/1/2014'),
(2, 'D', 'E', 'F', '1/1/2014'),
(2, 'D', 'E', 'F', '6/1/2014')
Good luck!

Oracle grouping/changing rows to columns

I have the following table named foo:
ID | KEY | VAL
----------------
1 | 47 | 97
2 | 47 | 98
3 | 47 | 99
4 | 48 | 100
5 | 48 | 101
6 | 49 | 102
I want to run a select query and have the results show like this
UNIQUE_ID | KEY | ID1 | VAL1 | ID2 | VAL2 | ID3 | VAL3
--------------------------------------------------------------
47_1:97_2:98_3:99| 47 | 1 | 97 | 2 | 98 | 3 | 99
48_4:100_5:101 | 48 | 4 | 100 | 5 | 101 | |
49_6:102 | 49 | 6 | 102 | | | |
So, basically, all rows with the same KEY get collapsed into one row. There can be anywhere from 1 to 3 rows per KEY value.
Is there a way to do this in a SQL query (without writing a stored procedure or scripts)?
If not, I could also work with the less desirable choice of
UNIQUE_ID | KEY | IDS | VALS
--------------------------------------------------------------
47_1:97_2:98_3:99| 47 | 1,2,3 | 97,98,99
48_4:100_5:101 | 48 | 4,5 | 100, 101
49_6:102 | 49 | 6 | 102
Thanks!
UPDATE:
Unfortunately my real-world problem seems to be much more difficult than this example, and I'm having trouble getting either approach to work :( My query is over 120 lines, so it's not easy to post. It looks roughly like:
with v_table as (select ...),
v_table2 as (select foo from v_table where...),
v_table3 as (select foo from v_table where ...),
...
v_table23 as (select foo from v_table where ...)
select distinct (...) as "UniqueID", myKey, myVal, otherCol1, ..., otherCol18
from tbl1 inner join tbl2 on...
...
inner join tbl15 on ...
If I try any of the methods below it seems that I cannot do group-bys correctly because of all the other data being returned.
Ex:
with v_table as (select ...),
v_table2 as (select foo from v_table where...),
v_table3 as (select foo from v_table where ...),
...
v_table23 as (select foo from v_table where ...)
select "Unique ID",
myKey, max(decode(id_col,1,id_col)) as id_1, max(decode(id_col,1,myVal)) as val_1,
max(decode(id_col,2,id_col)) as id_2,max(decode(id_col,2,myVal)) as val_2,
max(decode(id_col,3,id_col)) as id_3,max(decode(id_col,3,myVal)) as val_3
from (
select distinct (...) as "UniqueID", myKey, row_number() over (partition by myKey order by id) as id_col, id, myVal, otherCol1, ..., otherCol18
from tbl1 inner join tbl2 on...
...
inner join tbl15 on ...
) group by myKey;
Gives me the error: ORA-00979: not a GROUP BY expression
This is because I am selecting the UniqueID from the inner select. I will need to do this as well as select other columns from the inner table.
Any help would be appreciated!
Take a look at this article about the LISTAGG function; it will help you get the comma-separated results. LISTAGG is available from Oracle 11g Release 2 onwards.
You may try this
select key,
max(decode(id_col,1,id_col)) as id_1,max(decode(id_col,1,val)) as val_1,
max(decode(id_col,2,id_col)) as id_2,max(decode(id_col,2,val)) as val_2,
max(decode(id_col,3,id_col)) as id_3,max(decode(id_col,3,val)) as val_3
from (
select key, row_number() over (partition by key order by id) as id_col,id,val
from your_table
)
group by key
As @O.D. suggests, you can generate the less desirable version with LISTAGG, for example (using a CTE to generate your sample data):
with foo as (
select 1 as id, 47 as key, 97 as val from dual
union select 2,47,98 from dual
union select 3,47,99 from dual
union select 4,48,100 from dual
union select 5,48,101 from dual
union select 6,49,102 from dual
)
select key ||'_'|| listagg(id ||':' ||val, '_')
within group (order by id) as unique_id,
key,
listagg(id, ',') within group (order by id) as ids,
listagg(val, ',') within group (order by id) as vals
from foo
group by key
order by key;
UNIQUE_ID KEY IDS VALS
----------------- ---- -------------------- --------------------
47_1:97_2:98_3:99 47 1,2,3 97,98,99
48_4:100_5:101 48 4,5 100,101
49_6:102 49 6 102
With a bit more manipulation you can get your preferred results:
with foo as (
select 1 as id, 47 as key, 97 as val from dual
union select 2,47,98 from dual
union select 3,47,99 from dual
union select 4,48,100 from dual
union select 5,48,101 from dual
union select 6,49,102 from dual
)
select unique_id, key,
max(id1) as id1, max(val1) as val1,
max(id2) as id2, max(val2) as val2,
max(id3) as id3, max(val3) as val3
from (
select unique_id,key,
case when r = 1 then id end as id1, case when r = 1 then val end as val1,
case when r = 2 then id end as id2, case when r = 2 then val end as val2,
case when r = 3 then id end as id3, case when r = 3 then val end as val3
from (
select key ||'_'|| listagg(id ||':' ||val, '_')
within group (order by id) over (partition by key) as unique_id,
key, id, val,
row_number() over (partition by key order by id) as r
from foo
)
)
group by unique_id, key
order by key;
UNIQUE_ID KEY ID1 VAL1 ID2 VAL2 ID3 VAL3
----------------- ---- ---- ---- ---- ---- ---- ----
47_1:97_2:98_3:99 47 1 97 2 98 3 99
48_4:100_5:101 48 4 100 5 101
49_6:102 49 6 102
Can't help feeling there ought to be a simpler way though...