Unnest array defined as string within a row in BigQuery - google-bigquery

I have the following query
select 123 as user_id, "[\"A\",\"B\",\"C\"]" as category
which generates this data:
user_id category
123 ["A","B","C"]
What I would like to get from this data is:
user_id category
123 A
123 B
123 C
How could I do it?

Use below.
WITH sample_data AS (
select 123 as user_id, "[\"A\",\"B\",\"C\"]" as category
)
SELECT user_id, category
FROM sample_data, UNNEST(JSON_VALUE_ARRAY(category)) category;
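JSON_VALUE_ARRAY parses the string as a JSON array and returns an array of strings, and UNNEST then emits one row per element. To make that row-expansion step concrete outside BigQuery, here is a minimal Python sketch of the same logic. The helper name explode_json_array and the dict-based row layout are assumptions made for illustration only.

```python
import json


def explode_json_array(rows, column):
    """Yield one output row per element of the JSON array stored as text in `column`."""
    for row in rows:
        for item in json.loads(row[column]):
            # Copy the other fields and replace the array text with a single element.
            yield {**row, column: item}


# Same sample row as in the question (hypothetical in-memory representation).
sample_data = [{"user_id": 123, "category": '["A","B","C"]'}]
exploded = list(explode_json_array(sample_data, "category"))
```

Each input row contributes as many output rows as its array has elements, which is what the comma join with UNNEST does in the query.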

Related

How to transpose data grouped by two fields

I am having trouble figuring out how I can transform my data from the example table into the desired result. The main idea is to group the rows by id_1 and id_2 and then transform the data into one row, ordered by sequence_id. Any help or tips would be appreciated, thanks!
Example data:
date id_1 id_2 sequence_id data_1 data_2 data_3
2020-01-01 ABC 123 2 hi nice to
2020-01-01 ABC 123 3 meet you my
2020-01-01 ABC 123 4 name is bob
2020-02-01 DEF 456 1 good day sir
2020-02-01 DEF 456 3 how are you
Desired output:
date id_1 id_2 sequence_id data_1 data_2 data_3 data_1 data_2 data_3 data_1 data_2 data_3
2020-01-01 ABC 123 2 hi nice to meet you my name is bob
2020-02-01 DEF 456 1 good day sir how are you
Consider below approach (could be good starting point for you to further optimize it)
select * from(
select date, id_1, id_2, min(sequence_id) over win as sequence_id,
data, row_number() over win pos
from your_table, unnest([data_1, data_2, data_3]) data with offset
window win as (partition by date, id_1, id_2 order by sequence_id, offset)
)
pivot (any_value(data) as data_ for pos in (1,2,3,4,5,6,7,8,9))
if applied to sample data in your question - output is
We cannot use the same name for more than one column in a table. Hence your desired output is not feasible (i.e. data_1/2/3 used as a column name more than once).
If your goal is to have the full sentence for each id in a single row, then as an alternative you can consider the query below:
with cte as (
select "2020-01-01" date,"ABC" id_1,"123"id_2,"2"sequence,"hi" data_1,"nice" data_2,"to" data_3 union all
select "2020-01-01","ABC","123","3","meet","you","my" union all
select "2020-01-01","ABC","123","4","name","is","bob" union all
select "2020-02-01","DEF","456","1","good","day","sir" union all
select "2020-02-01","DEF","456","3","how","are","you"
)
select date,id_1,id_2,sequence
,STRING_AGG(concat(data_1," ",data_2," ",data_3)," ")over(partition by date,id_1,id_2 order by sequence)str
from cte
qualify row_number() over(partition by date,id_1,id_2 order by sequence desc)=1
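Both answers rest on the same step: within each (date, id_1, id_2) group, order the rows by sequence and lay the data values out in that order. A minimal Python sketch of that step follows, as a way to check the expected result by hand. The tuple layout and the function name flatten_groups are assumptions for illustration, not part of either answer.

```python
from itertools import groupby

# Sample rows from the question: (date, id_1, id_2, sequence_id, data_1, data_2, data_3).
rows = [
    ("2020-01-01", "ABC", "123", 2, "hi", "nice", "to"),
    ("2020-01-01", "ABC", "123", 3, "meet", "you", "my"),
    ("2020-01-01", "ABC", "123", 4, "name", "is", "bob"),
    ("2020-02-01", "DEF", "456", 1, "good", "day", "sir"),
    ("2020-02-01", "DEF", "456", 3, "how", "are", "you"),
]


def flatten_groups(rows):
    """Return one tuple per group: key fields, lowest sequence_id, ordered data values."""
    def group_key(row):
        return row[:3]

    # Sort by group first, then by sequence_id, so groupby sees contiguous groups.
    ordered = sorted(rows, key=lambda row: (group_key(row), row[3]))
    result = []
    for key, group in groupby(ordered, key=group_key):
        members = list(group)
        values = [value for row in members for value in row[4:]]
        result.append((*key, members[0][3], values))
    return result


flattened = flatten_groups(rows)
# Joining the values gives the one-sentence-per-group form of the second answer.
sentences = [" ".join(values) for *_, values in flattened]
```

Keeping the values in a list sidesteps the duplicate-column-name problem; the PIVOT answer solves the same problem by numbering the positions instead.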

Get the running unique count of items up to a given date, similar to a running total but counting unique items instead

I have a table with user shopping data as shown below
I want an output similar to running total but instead I want the running total of the count of unique categories that the user has shopped for by date.
I know I have to make use of ROWS PRECEDING AND FOLLOWING in the count function, but I am not able to use count(distinct category) in a window function
Dt category userId
4/10/2022 Grocery 123
4/11/2022 Grocery 123
4/12/2022 MISC 123
4/13/2022 SERVICES 123
4/14/2022 RETAIl 123
4/15/2022 TRANSP 123
4/20/2022 GROCERY 123
Desired output
Dt userID number of unique categories
4/10/2022 123 1
4/11/2022 123 1
4/12/2022 123 2
4/13/2022 123 3
4/14/2022 123 4
4/15/2022 123 5
4/20/2022 123 5
Consider below approach
select Dt, userId,
( select count(distinct category)
from t.categories as category
) number_of_unique_categories
from (
select *, array_agg(lower(category)) over(partition by userId order by Dt) categories
from your_table
) t
if applied to sample data in your question - output is
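As a cross-check of the logic outside BigQuery, the following minimal Python sketch walks the rows in date order per user and keeps a set of lowercased categories seen so far, which mirrors the running array_agg plus count(distinct ...) idea. It assumes dates are ISO strings (so they sort chronologically), and the function name running_unique_counts is hypothetical.

```python
def running_unique_counts(rows):
    """rows: iterable of (dt, category, user_id); returns (dt, user_id, running distinct count)."""
    seen = {}  # user_id -> set of lowercased categories seen so far
    result = []
    for dt, category, user_id in sorted(rows, key=lambda row: (row[2], row[0])):
        categories = seen.setdefault(user_id, set())
        categories.add(category.lower())  # case-insensitive, like lower(category) in the query
        result.append((dt, user_id, len(categories)))
    return result


# Sample data from the question, with dates rewritten as ISO strings (an assumption).
shopping = [
    ("2022-04-10", "Grocery", 123),
    ("2022-04-11", "Grocery", 123),
    ("2022-04-12", "MISC", 123),
    ("2022-04-13", "SERVICES", 123),
    ("2022-04-14", "RETAIl", 123),
    ("2022-04-15", "TRANSP", 123),
    ("2022-04-20", "GROCERY", 123),
]
counts = [count for _, _, count in running_unique_counts(shopping)]
```

One difference to keep in mind: this sketch counts row by row, so two rows for the same user on the same date could get different counts, whereas a window ordered by Dt with the default frame would treat them as peers.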

select distinct and autoincrement field in select query

I have a table Product
with
ProductNo ProductDetail UniqueiD(Primarykey)
L1234 ProductA 1
L1234 ProductB 2
L1234 ProductC 3
M1234 ProductD 4
M1234 ProductE 5
So I need a select query that will display the distinct product numbers with ids, for displaying in a p-listbox.
say
Name code
L1234 1
M1234 2
How do i achieve this?
Thanks
One method is:
select distinct ProductNo as name, dense_rank() over (order by ProductNo) as code
from product;
That said, I would probably use group by:
select ProductNo as name, row_number() over (order by ProductNo) as code
from product
group by ProductNo;
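To try the group-by variant without a database server, here is a small sketch using Python's built-in sqlite3 module with the ProductNo column from the question's table. It assumes the bundled SQLite is version 3.25 or newer, since window functions were added in that release; the in-memory table is illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (ProductNo TEXT, ProductDetail TEXT, UniqueID INTEGER PRIMARY KEY)"
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [
        ("L1234", "ProductA", 1),
        ("L1234", "ProductB", 2),
        ("L1234", "ProductC", 3),
        ("M1234", "ProductD", 4),
        ("M1234", "ProductE", 5),
    ],
)

# GROUP BY collapses the rows to one per product number; the window function
# then numbers those grouped rows in ProductNo order.
codes = conn.execute(
    """
    SELECT ProductNo AS name,
           ROW_NUMBER() OVER (ORDER BY ProductNo) AS code
    FROM Product
    GROUP BY ProductNo
    ORDER BY code
    """
).fetchall()
conn.close()
```

Note that the generated codes depend on the current set of product numbers, so they can shift when a new product number sorts before an existing one.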

Trying to group quantities based off an ID

We have two columns one with ID and another with QTY. And the layout goes along the lines of:
ID QTY
-------------
123 456
123 634
123 4235
234 67
234 735
234 666
What I am trying to do is add up all the numbers based off the ID so it would look like:
ID QTY
-------------
123 5325
234 1468
I currently have the following SQL query:
SELECT CLIENT_ID, ID, QTY_ON_HAND,
SUM(QTY_ON_HAND)
FROM
(select CLIENT_ID, ID, QTY_ON_HAND
FROM INVENTORY
WHERE CLIENT_ID = '(CLIENT ID HERE)')
GROUP BY QTY_ON_HAND
It would be appreciated if anyone can tell me simple way on how to do this.
I do not have a test DB at hand, but it should be this:
select
ID,
sum(QTY) as TOTAL
from
YourTableName
group by
ID;
YourTableName is the name of the table with the two columns ID and QTY. Use the full table name where required; it can also be something like dbo.yourtablename.
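Since the answer was written without a test database, here is a quick check of the same GROUP BY query against the sample numbers, using Python's built-in sqlite3 module. The table name inventory is a placeholder chosen for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (ID INTEGER, QTY INTEGER)")
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?)",
    [(123, 456), (123, 634), (123, 4235), (234, 67), (234, 735), (234, 666)],
)

# Every selected column is either in GROUP BY or wrapped in an aggregate.
totals = conn.execute(
    "SELECT ID, SUM(QTY) AS TOTAL FROM inventory GROUP BY ID ORDER BY ID"
).fetchall()
conn.close()
```

The query in the question fails for the opposite reason: it groups by the quantity column while also selecting non-aggregated columns that are not in the GROUP BY.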

countif type function in SQL where total count could be retrieved in other column

I have 36 columns in a table, but one of the columns contains the same value multiple times, as shown below
ID Name Ref
abcd john doe 123
1234 martina 100
123x brittany 123
ab12 joe 101
and i want results like
ID Name Ref cnt
abcd john doe 123 2
1234 martina 100 1
123x brittany 123 2
ab12 joe 101 1
As 123 appears twice, I want it to show 2 in the cnt column, and so on.
select ID, Name, Ref, (select count(ID) from [table] where Ref = A.Ref) as cnt
from [table] A
Edit:
As mentioned in comments below, this approach may not be the most efficient in all cases, but should be sufficient on reasonably small tables.
In my testing:
a table of 5,460 records and 976 distinct 'Ref' values returned in less than 1 second.
a table of 600,831 records and 8,335 distinct 'Ref' values returned in 6 seconds.
a table of 845,218 records and 15,147 distinct 'Ref' values returned in 13 seconds.
You should state which SQL database you are using, since capabilities differ:
1) If your DB supports window functions:
Select
*,
count(*) over ( partition by ref ) as cnt
from your_table
2) If not:
Select
T.*, G.cnt
from
your_table T inner join
( select ref, count(*) as cnt from your_table group by ref ) G
on T.ref = G.ref
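For the join-based variant, the derived table has to expose ref alongside cnt, otherwise the join condition has nothing to match on. The sketch below runs that shape of query with Python's built-in sqlite3 module on the sample rows; it uses no window functions, so it should work on older SQLite builds too. The table name your_table is the placeholder from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (ID TEXT, Name TEXT, Ref INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("abcd", "john doe", 123),
        ("1234", "martina", 100),
        ("123x", "brittany", 123),
        ("ab12", "joe", 101),
    ],
)

# G holds one row per Ref with its count; joining it back keeps every original row.
rows = conn.execute(
    """
    SELECT T.ID, T.Name, T.Ref, G.cnt
    FROM your_table AS T
    INNER JOIN (SELECT Ref, COUNT(*) AS cnt FROM your_table GROUP BY Ref) AS G
        ON T.Ref = G.Ref
    """
).fetchall()
conn.close()

cnt_by_id = {row[0]: row[3] for row in rows}
```

Looking the result up by ID avoids relying on row order, which the query does not guarantee without an ORDER BY.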
You can use COUNT with OVER as follows:
QUERY
select ID,
Name,
ref,
count(ref) over (partition by ref) cnt
from #t t
SAMPLE DATA
create table #t
(
ID NVARCHAR(400),
Name NVARCHAR(400),
Ref INT
)
insert into #t values
('abcd','john doe', 123),
('1234','martina', 100),
('123x','brittany', 123),
('ab12','joe', 101)