I'm looking for functionality similar to this in BigQuery: https://cloud.google.com/dataprep/docs/html/ROLLINGLIST-Function_118228853
Does anyone know of a suitable function?
Below is an example for BigQuery Standard SQL:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'a' col1, 1 col2 UNION ALL
SELECT 'b', 2 UNION ALL
SELECT 'c', 3 UNION ALL
SELECT 'd', 4 UNION ALL
SELECT 'e', 5 UNION ALL
SELECT 'f', 6 UNION ALL
SELECT 'g', 7 UNION ALL
SELECT 'h', 8
)
SELECT *,
STRING_AGG(col1)
OVER(ORDER BY col2 ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) rolling_list
FROM `project.dataset.table`
with the output:
Row col1 col2 rolling_list
1 a 1 a
2 b 2 a,b
3 c 3 a,b,c
4 d 4 b,c,d
5 e 5 c,d,e
6 f 6 d,e,f
7 g 7 e,f,g
8 h 8 f,g,h
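To make the frame clause concrete, here is a minimal Python sketch of what ROWS BETWEEN 2 PRECEDING AND CURRENT ROW computes for each row. It is not BigQuery code; the helper name rolling_list and its window and sep parameters are hypothetical, chosen only for illustration.

```python
def rolling_list(rows, window=3, sep=","):
    """Mimic STRING_AGG(col1) OVER (ORDER BY col2
    ROWS BETWEEN window-1 PRECEDING AND CURRENT ROW).

    rows is a list of (col1, col2) tuples.
    """
    ordered = sorted(rows, key=lambda r: r[1])  # ORDER BY col2
    out = []
    for i, (col1, col2) in enumerate(ordered):
        # The frame reaches back at most window-1 rows and ends at the current row.
        start = max(0, i - (window - 1))
        frame = [r[0] for r in ordered[start:i + 1]]
        out.append((col1, col2, sep.join(frame)))
    return out


# Same sample data as the query above: ('a', 1) ... ('h', 8)
rows = list(zip("abcdefgh", range(1, 9)))
result = rolling_list(rows)
```

The first two rows get shorter lists because fewer than two preceding rows exist, which matches the a and a,b entries in the output above.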
I have input like this:
Table 1 | Table 2
A | 4
B | 5
C | 6
1 | X
2 | Y
3 | Z
And the output must be:
Output
A
B
C
X
Y
Z
One method uses a union followed by a filter:
SELECT val
FROM
(
SELECT col1 AS val FROM yourTable
UNION ALL
SELECT col2 FROM yourTable
) t
WHERE REGEXP_LIKE(val, '^[A-Z]+$')
ORDER BY val;
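As a rough illustration of the union-then-filter idea, the following Python sketch performs the same three steps on the sample data from the question. The function name letters_only is hypothetical, and Python's re module stands in for REGEXP_LIKE.

```python
import re


def letters_only(rows):
    """rows is a list of (col1, col2) string tuples."""
    # UNION ALL: stack both columns into a single list of values.
    values = [col1 for col1, _ in rows] + [col2 for _, col2 in rows]
    # WHERE REGEXP_LIKE(val, '^[A-Z]+$'): keep values made only of uppercase letters.
    kept = [v for v in values if re.fullmatch(r"[A-Z]+", v)]
    # ORDER BY val
    return sorted(kept)


rows = [("A", "4"), ("B", "5"), ("C", "6"), ("1", "X"), ("2", "Y"), ("3", "Z")]
result = letters_only(rows)
```

Unlike a per-row comparison, this approach does not care which column a letter sits in, and it would also drop rows where neither column holds a letter.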
If the data really looks as you posted it, a simple option is to use the GREATEST function.
Sample data and query:
with test (col1, col2) as
  (select 'A', '4' from dual union all
   select 'B', '5' from dual union all
   select 'C', '6' from dual union all
   select '1', 'X' from dual union all
   select '2', 'Y' from dual union all
   select '3', 'Z' from dual
  )
select greatest(col1, col2) result
from test;
Result:
RESULT
----------
A
B
C
X
Y
Z
6 rows selected.
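Why this works can be sketched in Python: comparing two one-character strings compares their character codes, and in ASCII the digits come before the uppercase letters, so the letter is always the greater value. This assumes a binary, ASCII-style ordering; linguistic sort settings or a non-ASCII character set may order things differently.

```python
rows = [("A", "4"), ("B", "5"), ("C", "6"), ("1", "X"), ("2", "Y"), ("3", "Z")]

# max() on strings plays the role of GREATEST(col1, col2) here.
result = [max(col1, col2) for col1, col2 in rows]

# Digits '0'-'9' have lower character codes than letters 'A'-'Z' in ASCII.
digits_sort_first = ord("9") < ord("A")
```

The trick relies on every row holding exactly one letter and one digit; if both columns could contain letters, the greater of the two letters would be returned.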
Table looks like this:
col1 | col2
A | B
A | B
D | C
D | C
E | F
E | G
What I need is to extract the col1 value E (the only one paired with more than one distinct col2 value).
I already tried a couple of variants of SELECT DISTINCT and SELECT ..., COUNT(*) with GROUP BY, but can't figure it out.
P.S. The DBMS is Oracle.
Such as this?
with test (col1, col2) as
  (select 'A', 'B' from dual union all
   select 'A', 'B' from dual union all
   select 'D', 'C' from dual union all
   select 'D', 'C' from dual union all
   select 'E', 'F' from dual union all
   select 'E', 'G' from dual
  )
select col1
from test
group by col1
having count(distinct col2) > 1;
C
-
E
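For anyone without an Oracle instance at hand, the same GROUP BY ... HAVING COUNT(DISTINCT ...) query can be tried with Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for Oracle, and the table is created explicitly instead of using a WITH clause over dual.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [("A", "B"), ("A", "B"), ("D", "C"), ("D", "C"), ("E", "F"), ("E", "G")],
)

# Keep only col1 values that appear with more than one distinct col2.
query = """
    SELECT col1
    FROM test
    GROUP BY col1
    HAVING COUNT(DISTINCT col2) > 1
"""
result = [row[0] for row in conn.execute(query)]
conn.close()
```

COUNT(DISTINCT col2) is what separates E from A and D: plain COUNT(*) is 2 for all three groups, which is probably why the earlier GROUP BY attempts did not single out E.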
Given a table such as:
WITH table AS
(SELECT 'A' id, '11' ar, 1 ts UNION ALL
SELECT 'A', '12', 2 UNION ALL
SELECT 'A', '11', 3 UNION ALL
SELECT 'B', '11', 4 UNION ALL
SELECT 'B', '13', 5 UNION ALL
SELECT 'B', '12', 6 UNION ALL
SELECT 'B', '12', 7)
id ar ts
A 11 1
A 12 2
A 11 3
B 11 4
B 13 5
B 12 6
B 12 7
I need to get the last two distinct ar values per id, as:
id ar
A 11
A 12
B 12
B 13
I tried ARRAY_AGG with DISTINCT and LIMIT, but with DISTINCT the ORDER BY expression must be the same as the aggregated expression.
Below is for BigQuery Standard SQL
#standardSQL
SELECT * EXCEPT(ars)
FROM (
SELECT id, ARRAY_AGG(ar ORDER BY ts DESC LIMIT 2) AS ars
FROM (
SELECT id, ar, MAX(ts) AS ts
FROM `project.dataset.table`
GROUP BY id, ar
)
GROUP BY id
) t, t.ars AS ar
Applied to the sample data from your question, the output is:
Row id ar
1 A 11
2 A 12
3 B 12
4 B 13
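The query works in two steps: the inner GROUP BY collapses duplicates to their latest ts, and ARRAY_AGG ... ORDER BY ts DESC LIMIT 2 then keeps the two most recent entries per id. A minimal Python sketch of those two steps follows; the function name last_two_distinct and the parameter n are hypothetical.

```python
from collections import defaultdict


def last_two_distinct(rows, n=2):
    """rows is a list of (id, ar, ts) tuples."""
    # Step 1: GROUP BY id, ar with MAX(ts), one entry per distinct (id, ar).
    latest = {}
    for id_, ar, ts in rows:
        key = (id_, ar)
        latest[key] = max(ts, latest.get(key, ts))

    # Step 2: per id, order by ts descending and keep the first n entries.
    per_id = defaultdict(list)
    for (id_, ar), ts in latest.items():
        per_id[id_].append((ts, ar))

    out = []
    for id_ in sorted(per_id):
        top = sorted(per_id[id_], reverse=True)[:n]
        out.extend((id_, ar) for _, ar in top)
    return out


rows = [
    ("A", "11", 1), ("A", "12", 2), ("A", "11", 3),
    ("B", "11", 4), ("B", "13", 5), ("B", "12", 6), ("B", "12", 7),
]
result = last_two_distinct(rows)
```

Deduplicating before ranking matters: for id B the two latest raw rows both have ar 12, so taking the top two rows first would return only one distinct value.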
Suppose we have the data set:
with
data_table(title, x) as (
select 'a', 1 from dual union all
select 'a', 3 from dual union all
select 'a', 4 from dual union all
select 'a', 5 from dual union all
select 'b', 1 from dual union all
select 'b', 2 from dual union all
select 'b', 3 from dual union all
select 'b', 6 from dual
)
select * from data_table;
TITLE | X
-----------
a 1
a 3
a 4
a 5
b 1
b 2
b 3
b 6
We see that the points for a and b are different.
How can I align the values in the X column so that both groups have the same points, filling the gaps with NULL?
Expected result is:
TITLE | X
-----------
a 1
a NULL
a 3
a 4
a 5
a NULL
b 1
b 2
b 3
b NULL
b NULL
b 6
The straightforward solution I came up with is:
with
data_table(title, x) as (
select 'a', 1 from dual union all
select 'a', 3 from dual union all
select 'a', 4 from dual union all
select 'a', 5 from dual union all
select 'b', 1 from dual union all
select 'b', 2 from dual union all
select 'b', 3 from dual union all
select 'b', 6 from dual
),
all_points(x) AS (
select distinct x from data_table
),
all_titles(title) AS (
select distinct title from data_table
),
aligned_data(title, x) as (
select t.title, p.x from all_points p cross join all_titles t
)
select ad.title, dt.x
from aligned_data ad
left join data_table dt on dt.title = ad.title and dt.x = ad.x
order by ad.title, ad.x;
As we can see, the cross join in the aligned_data definition is a bottleneck and can kill performance on large data sets.
I wonder whether this task could be solved more elegantly. Maybe a trick with window functions could be proposed.
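For reference, the alignment the query performs can be sketched in a few lines of Python. This is an illustration of the cross-join-then-left-join logic only, not a proposed replacement; the function name align is hypothetical.

```python
def align(rows):
    """rows is a list of (title, x) tuples; returns one row per title/point pair."""
    present = set(rows)
    titles = sorted({title for title, _ in rows})  # all_titles
    points = sorted({x for _, x in rows})          # all_points
    # Cross join titles with points, then keep x only where the pair exists
    # in the original data (the left join); otherwise emit None (NULL).
    return [
        (title, x if (title, x) in present else None)
        for title in titles
        for x in points
    ]


rows = [("a", 1), ("a", 3), ("a", 4), ("a", 5),
        ("b", 1), ("b", 2), ("b", 3), ("b", 6)]
result = align(rows)
```

One thing the sketch makes visible: the expected result itself has one row per title and point combination (12 rows here), so any solution has to produce at least that many rows.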
Hi, take the following data as an example:
id | value
----------
A | 3
A | 9
B | 7
B | 2
C | 4
C | 5
I want to list all the data ordered by the min value of each id group, so that the expected output is:
id | value
----------
B | 2
B | 7
A | 3
A | 9
C | 4
C | 5
That is, the min of group A is 3, of group B is 2, and of group C is 4, so group B comes first, with its rows in ascending order, followed by group A and then group C.
I tried this, but that's not what I want:
SELECT * FROM (
SELECT 'A' AS id, '3' AS value
UNION SELECT 'A', '9' UNION SELECT 'B', '7' UNION SELECT 'B', '2'
UNION SELECT 'C', '4' UNION SELECT 'C', '5') data
GROUP BY id, value
ORDER BY MIN(value)
Please help! Thank you
SELECT * FROM (
SELECT 'A' AS id, '3' AS value
UNION SELECT 'A', '9' UNION SELECT 'B', '7' UNION SELECT 'B', '2'
UNION SELECT 'C', '4' UNION SELECT 'C', '5') data
ORDER BY MIN(value) OVER(PARTITION BY id), id, value
See OVER Clause (Transact-SQL) in the documentation.
Add the MIN(value) OVER() expression to your query output and you can see what it does for you:
SELECT *,
MIN(value) OVER(PARTITION BY id) OrderedBy FROM (
SELECT 'A' AS id, '3' AS value
UNION SELECT 'A', '9' UNION SELECT 'B', '7' UNION SELECT 'B', '2'
UNION SELECT 'C', '4' UNION SELECT 'C', '5') data
ORDER BY MIN(value) OVER(PARTITION BY id), id, value
Result:
id value OrderedBy
---- ----- ---------
B 2 2
B 7 2
A 3 3
A 9 3
C 4 4
C 5 4