Google BigQuery: Retain Previous Value of Column

I have two columns named claim_no and n_Proc_rank, and I am trying to apply the logic below. Please help.
Logic:
a) if claim_no=proc_rank then linenum=1
b) if claim_no<>proc_rank then a+1
c) if claim_no=proc_rank then value of b
d) if claim_no<>proc_rank then c+1
I tried the LAG function with a CASE statement but did not get the desired results, and recursive queries are not supported by Google BigQuery.

Below is for BigQuery Standard SQL
#standardSQL
SELECT *, 1 + COUNTIF(claim_no != n_Proc_rank) OVER(ORDER BY ts) linenum
FROM `project.dataset.table`
Applied to the sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 ts, 1 claim_no, 1 n_Proc_rank UNION ALL
SELECT 2, 0, 0 UNION ALL
SELECT 3, 0, 0 UNION ALL
SELECT 4, 1, 1 UNION ALL
SELECT 5, 0, 1 UNION ALL
SELECT 6, 0, 0 UNION ALL
SELECT 7, 0, 1 UNION ALL
SELECT 8, 0, 1 UNION ALL
SELECT 9, 0, 1 UNION ALL
SELECT 10, 0, 1 UNION ALL
SELECT 11, 0, 0 UNION ALL
SELECT 12, 0, 1 UNION ALL
SELECT 13, 0, 0 UNION ALL
SELECT 14, 0, 1 UNION ALL
SELECT 15, 0, 1 UNION ALL
SELECT 16, 0, 1 UNION ALL
SELECT 17, 0, 1
)
SELECT *, 1 + COUNTIF(claim_no != n_Proc_rank) OVER(ORDER BY ts) linenum
FROM `project.dataset.table`
-- ORDER BY ts
the result is:
Row ts claim_no n_Proc_rank linenum
1 1 1 1 1
2 2 0 0 1
3 3 0 0 1
4 4 1 1 1
5 5 0 1 2
6 6 0 0 2
7 7 0 1 3
8 8 0 1 4
9 9 0 1 5
10 10 0 1 6
11 11 0 0 6
12 12 0 1 7
13 13 0 0 7
14 14 0 1 8
15 15 0 1 9
16 16 0 1 10
17 17 0 1 11
Note: you must have some extra column that defines the order of processing, so in my example I added the column ts. It can be anything: an integer position, a date/timestamp, etc.
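To check the running-count logic outside BigQuery, here is a minimal Python sketch that mirrors `1 + COUNTIF(claim_no != n_Proc_rank) OVER(ORDER BY ts)`. The function name `line_numbers` and the tuple layout are assumptions made for illustration, and the sketch assumes ts values are unique.

```python
from itertools import accumulate


def line_numbers(rows):
    """Emulate 1 + COUNTIF(claim_no != n_Proc_rank) OVER (ORDER BY ts).

    rows: iterable of (ts, claim_no, n_proc_rank) tuples (hypothetical layout).
    Returns the linenum values in ts order.
    """
    ordered = sorted(rows, key=lambda r: r[0])
    # 1 for every row where the two columns differ, 0 otherwise
    mismatches = (1 if claim != rank else 0 for _, claim, rank in ordered)
    # running total of mismatches up to and including the current row, plus 1
    return [1 + running for running in accumulate(mismatches)]


sample = [
    (1, 1, 1), (2, 0, 0), (3, 0, 0), (4, 1, 1), (5, 0, 1), (6, 0, 0),
    (7, 0, 1), (8, 0, 1), (9, 0, 1), (10, 0, 1), (11, 0, 0), (12, 0, 1),
    (13, 0, 0), (14, 0, 1), (15, 0, 1), (16, 0, 1), (17, 0, 1),
]
print(line_numbers(sample))
```

The printed list should line up with the linenum column in the result above, since every mismatching row bumps the counter by one and matching rows keep the previous value.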

Related

ORACLE SQL, I don't know how to use SUM() here

Table TRANSACTION:
TRANS_VALUE  USER_ID  TRANS_TYPE_ID
-----------------------------------
10           1        2
5            2        1
15           1        1
20           2        2
10           1        2
5            1        2
15           3        1
20           3        1
I need to get to this:
USER  SUM(TRANS_TYPE_1)  SUM(TRANS_TYPE_2)
------------------------------------------
1     15                 25
2     5                  20
3     35                 NULL
Can someone help me?
I tried this, but it does not work:
SELECT
user_id AS "USER",
SUM(trans_value)
FROM
TRANSACTION
WHERE
trans_value = 1
GROUP BY
user_id
ORDER BY 1;
Use conditional aggregation:
SELECT user_id,
SUM(CASE trans_type_id WHEN 1 THEN trans_value END) AS sum_trans_type_1,
SUM(CASE trans_type_id WHEN 2 THEN trans_value END) AS sum_trans_type_2
FROM transaction
GROUP BY user_id
or PIVOT:
SELECT *
FROM transaction
PIVOT (
SUM(trans_value)
FOR trans_type_id IN (
1 AS sum_trans_type_1,
2 AS sum_trans_type_2
)
)
Which, for the sample data:
CREATE TABLE transaction (TRANS_VALUE, USER_ID, TRANS_TYPE_ID) AS
SELECT 10, 1, 2 FROM DUAL UNION ALL
SELECT 5, 2, 1 FROM DUAL UNION ALL
SELECT 15, 1, 1 FROM DUAL UNION ALL
SELECT 20, 2, 2 FROM DUAL UNION ALL
SELECT 10, 1, 2 FROM DUAL UNION ALL
SELECT 5, 1, 2 FROM DUAL UNION ALL
SELECT 15, 3, 1 FROM DUAL UNION ALL
SELECT 20, 3, 1 FROM DUAL;
Both output:
USER_ID  SUM_TRANS_TYPE_1  SUM_TRANS_TYPE_2
-------------------------------------------
1        15                25
2        5                 20
3        35                null
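As a cross-check of the conditional aggregation, the following Python sketch computes the same per-user sums. The name `pivot_sums` and the row layout are hypothetical; a type with no rows for a user stays `None`, mirroring the NULL that SUM returns when every CASE result is NULL.

```python
from collections import defaultdict


def pivot_sums(rows, type_ids=(1, 2)):
    """rows: iterable of (trans_value, user_id, trans_type_id) tuples.

    Returns {user_id: {type_id: sum or None}}, one entry per user.
    """
    totals = defaultdict(lambda: {t: None for t in type_ids})
    for value, user_id, type_id in rows:
        per_user = totals[user_id]  # creates the user's row on first sight
        if type_id in per_user:     # ignore types that are not pivoted
            per_user[type_id] = (per_user[type_id] or 0) + value
    return dict(totals)


sample = [
    (10, 1, 2), (5, 2, 1), (15, 1, 1), (20, 2, 2),
    (10, 1, 2), (5, 1, 2), (15, 3, 1), (20, 3, 1),
]
for user_id, sums in sorted(pivot_sums(sample).items()):
    print(user_id, sums[1], sums[2])
```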

How to identify pattern in SQL

This is my table. It consists of columns A, B and C. Only one column value will be true at a time.
My task is to identify a pattern based on the latest five rows.
For example, I need to search the entire table to find wherever these five values were repeated.
If they were repeated, what was the next value available after the pattern, and how many times were A, B and C values found after the pattern?
How can this be done in SQL? I am using Oracle 11g. Thanks.
You can map each row's a, b, c flags to a ternary digit, then combine the digit of the current row with the digits of the previous four rows into a 5-digit ternary number. Analytic functions can then find the next occurrence of the same number and count its occurrences:
SELECT id,
a,
b,
c,
CASE
WHEN grp_value IS NULL
THEN NULL
ELSE MIN(id) OVER (
PARTITION BY grp_value
ORDER BY id
ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING
) + 1
END AS row_after_next_match,
CASE
WHEN grp_value IS NULL
THEN 0
ELSE COUNT(id) OVER ( PARTITION BY grp_value )
END AS num_matches
FROM (
SELECT id,
a,
b,
c,
value,
81 * LAG(value,4) OVER ( ORDER BY id ) +
27 * LAG(value,3) OVER ( ORDER BY id ) +
9 * LAG(value,2) OVER ( ORDER BY id ) +
3 * LAG(value,1) OVER ( ORDER BY id ) +
1 * value AS grp_value
FROM (
SELECT id,
a,
b,
c,
DECODE(1,a,0,b,1,c,2) AS value
FROM table_name
)
)
ORDER BY id
Which, for the sample data:
CREATE TABLE table_name (
id PRIMARY KEY,
a,
b,
c,
CHECK (a IN (0,1)),
CHECK (b IN (0,1)),
CHECK (c IN (0,1)),
CHECK (a+b+c = 1)
) AS
SELECT 1, 1, 0, 0 FROM DUAL UNION ALL
SELECT 2, 1, 0, 0 FROM DUAL UNION ALL
SELECT 3, 0, 1, 0 FROM DUAL UNION ALL
SELECT 4, 1, 0, 0 FROM DUAL UNION ALL
SELECT 5, 0, 1, 0 FROM DUAL UNION ALL
SELECT 6, 0, 0, 1 FROM DUAL UNION ALL
SELECT 7, 1, 0, 0 FROM DUAL UNION ALL
SELECT 8, 0, 1, 0 FROM DUAL UNION ALL
SELECT 9, 1, 0, 0 FROM DUAL UNION ALL
SELECT 10, 0, 1, 0 FROM DUAL UNION ALL
SELECT 11, 0, 0, 1 FROM DUAL UNION ALL
SELECT 12, 1, 0, 0 FROM DUAL UNION ALL
SELECT 13, 1, 0, 0 FROM DUAL UNION ALL
SELECT 14, 1, 0, 0 FROM DUAL UNION ALL
SELECT 15, 1, 0, 0 FROM DUAL UNION ALL
SELECT 16, 1, 0, 0 FROM DUAL UNION ALL
SELECT 17, 1, 0, 0 FROM DUAL UNION ALL
SELECT 18, 1, 0, 0 FROM DUAL UNION ALL
SELECT 19, 1, 0, 0 FROM DUAL UNION ALL
SELECT 20, 1, 0, 0 FROM DUAL
Outputs:
ID  A  B  C  ROW_AFTER_NEXT_MATCH  NUM_MATCHES
----------------------------------------------
1   1  0  0  null                  0
2   1  0  0  null                  0
3   0  1  0  null                  0
4   1  0  0  null                  0
5   0  1  0  null                  1
6   0  0  1  12                    2
7   1  0  0  13                    2
8   0  1  0  null                  1
9   1  0  0  null                  1
10  0  1  0  null                  1
11  0  0  1  null                  2
12  1  0  0  null                  2
13  1  0  0  null                  1
14  1  0  0  null                  1
15  1  0  0  null                  1
16  1  0  0  18                    5
17  1  0  0  19                    5
18  1  0  0  20                    5
19  1  0  0  21                    5
20  1  0  0  null                  5
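The encoding step can be checked by hand with a small Python sketch. It assumes the rows are already in id order and already mapped to digits (a = 0, b = 1, c = 2); the names `window_codes` and `match_counts` are hypothetical and only illustrate the idea behind grp_value and num_matches.

```python
from collections import Counter


def window_codes(digits, width=5):
    """Return the base-3 code of each row plus its width-1 predecessors.

    The first width-1 rows have no full window, so they get None
    (the SQL yields NULL there because LAG returns NULL).
    """
    codes = []
    for i in range(len(digits)):
        if i < width - 1:
            codes.append(None)
            continue
        code = 0
        for digit in digits[i - width + 1 : i + 1]:
            code = code * 3 + digit  # oldest row is the most significant digit
        codes.append(code)
    return codes


def match_counts(codes):
    """Count how often each code occurs; rows without a code get 0."""
    counts = Counter(c for c in codes if c is not None)
    return [0 if c is None else counts[c] for c in codes]


# digits for ids 1..20 of the sample data
digits = [0, 0, 1, 0, 1, 2, 0, 1, 0, 1, 2] + [0] * 9
codes = window_codes(digits)
print(codes)
print(match_counts(codes))
```

Rows 6 and 11 end the same five-row window, so they share a code, which is why both report two matches in the output above.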

create sequence of numbers on grouped column in Oracle

Consider below table with column a,b,c.
a b c
3 4 5
3 4 5
6 4 1
1 1 8
1 1 8
1 1 0
1 1 0
I need a SELECT statement that produces the output below, i.e. increments column 'rn' for each group of columns a, b, c.
a b c rn
3 4 5 1
3 4 5 1
6 4 1 2
1 1 8 3
1 1 8 3
1 1 0 4
1 1 0 4
You can use the DENSE_RANK analytic function to get a unique ID for each combination of A, B, and C. Just note that if a new value is inserted into the table, the IDs of each combination of A, B, and C will shift and may not be the same.
Query
WITH
my_table (a, b, c)
AS
(SELECT 3, 4, 5 FROM DUAL
UNION ALL
SELECT 3, 4, 5 FROM DUAL
UNION ALL
SELECT 6, 4, 1 FROM DUAL
UNION ALL
SELECT 1, 1, 8 FROM DUAL
UNION ALL
SELECT 1, 1, 8 FROM DUAL
UNION ALL
SELECT 1, 1, 0 FROM DUAL
UNION ALL
SELECT 1, 1, 0 FROM DUAL)
SELECT t.*, DENSE_RANK () OVER (ORDER BY b desc, c desc, a) as rn
FROM my_table t;
Result
A B C RN
____ ____ ____ _____
3 4 5 1
3 4 5 1
6 4 1 2
1 1 8 3
1 1 8 3
1 1 0 4
1 1 0 4
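For readers who want to see what DENSE_RANK does with this particular ordering, here is a rough Python equivalent. The helper names are hypothetical, and the sort key simply copies the `ORDER BY b desc, c desc, a` clause from the query above.

```python
def sort_key(row):
    """Key matching ORDER BY b DESC, c DESC, a for a row (a, b, c)."""
    a, b, c = row
    return (-b, -c, a)


def dense_rank(rows):
    """Assign consecutive ranks (no gaps) to distinct sort keys."""
    distinct_keys = sorted({sort_key(r) for r in rows})
    rank_of = {key: rank for rank, key in enumerate(distinct_keys, start=1)}
    return [rank_of[sort_key(r)] for r in rows]


sample = [(3, 4, 5), (3, 4, 5), (6, 4, 1), (1, 1, 8), (1, 1, 8), (1, 1, 0), (1, 1, 0)]
print(dense_rank(sample))
```

Note that the ranks depend entirely on the chosen sort order, which is why this approach only reproduces the requested numbering when such an ordering of a, b, c happens to exist.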
As a starter: for your question to make sense at all, you need a column that defines the ordering of the rows. Let me assume that you have such a column, called id.
Then, you can use window functions:
select a, b, c,
sum(case when a = lag_a and b = lag_b and c = lag_c then 0 else 1 end) over(order by id) rn
from (
select t.*,
lag(a) over(order by id) lag_a,
lag(b) over(order by id) lag_b,
lag(c) over(order by id) lag_c
from mytable t
) t
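The same idea, comparing each row with the previous one and keeping a running count of changes, can be sketched in Python. The function name `group_numbers` and the (id, a, b, c) tuple layout are assumptions for illustration, and the sketch ignores SQL NULL semantics.

```python
def group_numbers(rows):
    """rows: iterable of (id, a, b, c) tuples.

    Returns rn per row in id order; rn increases by one whenever
    (a, b, c) differs from the previous row.
    """
    ordered = sorted(rows, key=lambda r: r[0])
    rn = 0
    previous = None  # plays the role of the LAG values
    result = []
    for _, a, b, c in ordered:
        if (a, b, c) != previous:
            rn += 1  # a new group starts here
        previous = (a, b, c)
        result.append(rn)
    return result


sample = [
    (1, 3, 4, 5), (2, 3, 4, 5), (3, 6, 4, 1), (4, 1, 1, 8),
    (5, 1, 1, 8), (6, 1, 1, 0), (7, 1, 1, 0),
]
print(group_numbers(sample))
```

Unlike a ranking over the values, this numbers groups by position, so a combination that reappears later in the ordering would get a new number.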
Assuming you have some way of ordering your rows, then you can use MATCH_RECOGNIZE:
SELECT a, b, c, rn
FROM table_name
MATCH_RECOGNIZE (
ORDER BY id
MEASURES MATCH_NUMBER() AS rn
ALL ROWS PER MATCH
PATTERN ( FIRST_ROW EQUAL_ROWS* )
DEFINE EQUAL_ROWS AS (
EQUAL_ROWS.a = PREV( EQUAL_ROWS.a )
AND EQUAL_ROWS.b = PREV( EQUAL_ROWS.b )
AND EQUAL_ROWS.c = PREV( EQUAL_ROWS.c )
)
)
So, for your test data:
CREATE TABLE table_name ( id, a, b, c ) AS
SELECT 1, 3, 4, 5 FROM DUAL UNION ALL
SELECT 2, 3, 4, 5 FROM DUAL UNION ALL
SELECT 3, 6, 4, 1 FROM DUAL UNION ALL
SELECT 4, 1, 1, 8 FROM DUAL UNION ALL
SELECT 5, 1, 1, 8 FROM DUAL UNION ALL
SELECT 6, 1, 1, 0 FROM DUAL UNION ALL
SELECT 7, 1, 1, 0 FROM DUAL;
Outputs:
A | B | C | RN
-: | -: | -: | -:
3 | 4 | 5 | 1
3 | 4 | 5 | 1
6 | 4 | 1 | 2
1 | 1 | 8 | 3
1 | 1 | 8 | 3
1 | 1 | 0 | 4
1 | 1 | 0 | 4
It can also be done without any ordering, by getting the distinct groups and numbering each group. Borrowing the first part from EJ Egjed:
WITH my_table (a, b, c) AS
(SELECT 3, 4, 5 FROM DUAL
UNION ALL
SELECT 3, 4, 5 FROM DUAL
UNION ALL
SELECT 6, 4, 1 FROM DUAL
UNION ALL
SELECT 1, 1, 8 FROM DUAL
UNION ALL
SELECT 1, 1, 8 FROM DUAL
UNION ALL
SELECT 1, 1, 0 FROM DUAL
UNION ALL
SELECT 1, 1, 0 FROM DUAL)
, groups as (select distinct a, b, c
from my_table)
, groupnums as (select rownum as num, a, b, c
from groups)
select a, b, c, num
from my_table join groupnums using(a,b,c);

Filtering SQL using a oracle database

I would like to know if the following is possible
For example, I have a shoe factory. In this factory I have a production line, and every step in this production line is recorded in the Oracle database.
If the shoe has completed a production step, the result is 1.
example table
Shoe_nr production step result
1 1 1
1 2 1
1 3
2 1 1
2 2 1
2 3
3 1
3 2
3 3
Now the question: is it possible to select the production step 3 rows only for shoes that have passed production step 2, i.e. whose result for step 2 equals 1?
If it can be done it is probably very easy, but if you don't know how, it turns out to be a little tricky.
Thanks,
Chris
Yes, you can do it with IN and a subselect:
select *
from shoes
where shoe_nr in (
    select shoe_nr
    from shoes
    where production_step = 2
    and result = 1
)
and production_step = 3
This might be one option; see comments within code (lines #1 - 12 represent sample data; you already have that and don't type it. Query you might be interested in begins at line #13).
SQL> with shoes (shoe_nr, production_step, result) as
2 -- sample data
3 (select 1, 1, 1 from dual union all
4 select 1, 2, 1 from dual union all
5 select 1, 3, null from dual union all
6 select 2, 1, 1 from dual union all
7 select 2, 2, 1 from dual union all
8 select 2, 3, null from dual union all
9 select 3, 1, null from dual union all
10 select 3, 2, null from dual union all
11 select 3, 3, null from dual
12 ),
13 -- which shoes' production step #3 should be skipped?
14 skip as
15 (select shoe_nr
16 from shoes
17 where production_step = 2
18 and result = 1
19 )
20 -- finally:
21 select a.shoe_nr, a.production_step, a.result
22 from shoes a
23 where (a.shoe_nr, a.production_step) not in (select b.shoe_nr, 3
24 from skip b
25 )
26 order by a.shoe_nr, a.production_step;
SHOE_NR PRODUCTION_STEP RESULT
---------- --------------- ----------
1 1 1
1 2 1
2 1 1
2 2 1
3 1
3 2
3 3
7 rows selected.
SQL>
If you just want the shoe_nr that satisfy the condition, you can use aggregation and a having clause:
select shoe_nr
from mytable
group by shoe_nr
having
max(case when production_step = 2 then result end) = 1
and max(case when production_step = 3 then 1 end) = 1
If you want the entire row corresponding to this shoe_nr at step 3, use window functions instead:
select shoe_nr, production_step, result
from (
    select
        t.*,
        max(case when production_step = 2 then result end)
            over(partition by shoe_nr) as has_completed_step_2
    from mytable t
) t
where production_step = 3 and has_completed_step_2 = 1
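To make the intended filter concrete, here is a minimal Python sketch of the same two-step logic: first collect the shoes whose step 2 has result 1, then keep only their step 3 rows. The function name and the use of `None` for a missing result are assumptions for illustration.

```python
def step3_rows_after_step2(rows):
    """rows: list of (shoe_nr, production_step, result) tuples.

    Returns the step 3 rows of shoes whose step 2 row has result == 1.
    """
    passed_step_2 = {
        shoe_nr for shoe_nr, step, result in rows if step == 2 and result == 1
    }
    return [row for row in rows if row[1] == 3 and row[0] in passed_step_2]


sample = [
    (1, 1, 1), (1, 2, 1), (1, 3, None),
    (2, 1, 1), (2, 2, 1), (2, 3, None),
    (3, 1, None), (3, 2, None), (3, 3, None),
]
print(step3_rows_after_step2(sample))
```

For the sample table this keeps the step 3 rows of shoes 1 and 2 and drops shoe 3, which never completed step 2.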

Oracle PL/SQL: How to find duplicate sequences in large table?

I have a ~20000 row table like this (seq = sequence):
id seq_num seq_count seq_id a b c d
----------------------------------------------------
1 1 3 A400 1 0 0 0
2 2 3 A400 0 1 0 0
3 3 3 A400 0 0 1 0
4 1 2 V2303 1 1 1 1
5 2 2 V2303 1 1 1 1
6 1 3 G2 1 0 0 0
7 2 3 G2 0 1 0 0
8 3 3 G2 0 0 1 0
9 1 3 U900 1 0 0 0
10 2 3 U900 2 2 1 1
11 3 3 U900 5 3 8 5
I want to find the seq_id of a-b-c-d sequences that have duplicates in the table; the output could just be a dbms_output.put_line or anything. So as you can see, seq_id G2 is a duplicate of A400 because all of their rows match up, but U900 has no duplicates even though one row matches A400 and G2.
Is there a good way to check for duplicates like this on large sets of data? I cannot create new tables to temporarily hold data. So far I've been trying with cursors mostly but no luck.
Thank you, let me know if you need any more info about my problem.
Oracle Setup:
CREATE TABLE table_name ( id, seq_num, seq_count, seq_id, a, b, c, d ) AS
SELECT 1, 1, 3, 'A400', 1, 0, 0, 0 FROM DUAL UNION ALL
SELECT 2, 2, 3, 'A400', 0, 1, 0, 0 FROM DUAL UNION ALL
SELECT 3, 3, 3, 'A400', 0, 0, 1, 0 FROM DUAL UNION ALL
SELECT 4, 1, 2, 'V2303', 1, 1, 1, 1 FROM DUAL UNION ALL
SELECT 5, 2, 2, 'V2303', 1, 1, 1, 1 FROM DUAL UNION ALL
SELECT 6, 1, 3, 'G2', 1, 0, 0, 0 FROM DUAL UNION ALL
SELECT 7, 2, 3, 'G2', 0, 1, 0, 0 FROM DUAL UNION ALL
SELECT 8, 3, 3, 'G2', 0, 0, 1, 0 FROM DUAL UNION ALL
SELECT 9, 1, 3, 'U900', 1, 0, 0, 0 FROM DUAL UNION ALL
SELECT 10, 2, 3, 'U900', 2, 2, 1, 1 FROM DUAL UNION ALL
SELECT 11, 3, 3, 'U900', 5, 3, 8, 5 FROM DUAL;
Query:
SELECT s.seq_id,
t.seq_id AS matched_seq_id
FROM table_name s
INNER JOIN
table_name t
ON ( s.seq_num = t.seq_num
AND s.seq_count = t.seq_count
AND s.seq_id < t.seq_id
AND s.a = t.a
AND s.b = t.b
AND s.c = t.c
AND s.d = t.d )
GROUP BY
t.seq_id,
s.seq_id
HAVING COUNT( DISTINCT t.seq_num ) = MAX( t.seq_count );
Results:
SEQ_ID MATCHED_SEQ_ID
------ --------------
A400 G2
Assuming the results fit in a string about 4000 characters long (the default VARCHAR2 limit that applies to LISTAGG), the fastest way is probably to use listagg():
select abcds, listagg(seq_id, ',') within group (order by seq_id)
from (select seq_id, listagg(a||b||c||d, ',') within group (order by seq_num) as abcds
from table_name
group by seq_id
) t
group by abcds
having count(*) >= 2;
This returns the matches as a comma-delimited list.
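The grouping idea behind the listagg() query, building one signature per seq_id and then grouping the seq_ids by signature, can be sketched in Python as follows. The function name and tuple layout are hypothetical; the sketch uses a tuple of tuples as the signature rather than a concatenated string, which sidesteps both the length limit and any ambiguity from joining multi-digit values.

```python
from collections import defaultdict


def duplicate_sequences(rows):
    """rows: iterable of (seq_num, seq_id, a, b, c, d) tuples.

    Returns a sorted list of seq_id groups that share identical
    a-b-c-d rows in seq_num order.
    """
    by_seq = defaultdict(list)
    for seq_num, seq_id, a, b, c, d in rows:
        by_seq[seq_id].append((seq_num, a, b, c, d))

    by_signature = defaultdict(list)
    for seq_id, items in by_seq.items():
        # order by seq_num, then keep only the a-b-c-d values
        signature = tuple(item[1:] for item in sorted(items))
        by_signature[signature].append(seq_id)

    return sorted(sorted(ids) for ids in by_signature.values() if len(ids) >= 2)


sample = [
    (1, "A400", 1, 0, 0, 0), (2, "A400", 0, 1, 0, 0), (3, "A400", 0, 0, 1, 0),
    (1, "V2303", 1, 1, 1, 1), (2, "V2303", 1, 1, 1, 1),
    (1, "G2", 1, 0, 0, 0), (2, "G2", 0, 1, 0, 0), (3, "G2", 0, 0, 1, 0),
    (1, "U900", 1, 0, 0, 0), (2, "U900", 2, 2, 1, 1), (3, "U900", 5, 3, 8, 5),
]
print(duplicate_sequences(sample))
```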