Unnest array of integers SQL BigQuery - sql

I cannot seem to find anything that helps with unnesting a list of integers in SQL BigQuery.
I've tried using select * from my_table, unnest(column) as column but I get this error:
Values referenced in UNNEST must be arrays. UNNEST contains expression of type STRUCT<os STRING, product_ids STRING...
|Product IDs|
|[123456,234567,345678,456789]|
|[987654,876543,765432,654321]|
I basically want each product number on a separate line. So...
|Product IDs|
|123456|
|234567|
|345678|
|456789|
|987654|
|876543|
|765432|
|654321|
EDIT:
Sorry, I forgot to add: there is a customer id, and product_ids holds the list of product numbers.
So I want the customer_id with each product id on a separate line.
The output for one customer should look like the below:

Giving you 4 solutions, see which one fits your case:
-- Solution 1: product_ids is a plain comma-separated string
select product_id
from (select "123,234,345,456,456,678,789" as product_ids),
  unnest(split(product_ids)) as product_id;
-- Solution 2: the comma-separated string sits inside a struct
select product_id
from (select struct("ios" as os, "123,234,345,456,456,678,789" as product_ids) as os_products),
  unnest(split(os_products.product_ids)) as product_id;
-- Solution 3: an array of structs, each holding a comma-separated string
select product_id
from (
  select array_agg(os_products) as os_products
  from (
    select struct("ios" as os, "123,234,345,456,456,678,789" as product_ids) as os_products
    union all
    select struct("android" as os, "abc,cde,efg" as product_ids) as os_products
  )
),
  unnest(os_products) as op,
  unnest(split(op.product_ids)) as product_id;
-- Solution 4: an array of structs, each holding an array of product ids
select product_id
from (
  select array_agg(os_products) as os_products
  from (
    select struct("ios" as os, split("123,234,345,456,456,678,789") as product_ids) as os_products
    union all
    select struct("android" as os, split("abc,cde,efg") as product_ids) as os_products
  )
),
  unnest(os_products) as op,
  unnest(op.product_ids) as product_id;
========
Based on the latest edit to your question, where the table has two columns (customer_id and an array of product_ids):
select customer_id, product_id from
(select "customer1" as customer_id, split("123,234,345,456,456,678,789") as product_ids), unnest(product_ids) as product_id
This will give you customer_id and product_id, one pair per row.
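To see what the comma join with UNNEST does row by row, here is a minimal Python sketch of the same flattening. It only models the behaviour and is not BigQuery code; the second customer and the helper name unnest are made up for illustration.

```python
# Hypothetical sample rows: (customer_id, product_ids array).
rows = [
    ("customer1", [123456, 234567, 345678, 456789]),
    ("customer2", [987654, 876543, 765432, 654321]),
]


def unnest(rows):
    """Model `FROM t, UNNEST(t.product_ids) AS product_id`:
    pair every row with each element of its own array."""
    return [
        (customer_id, product_id)
        for customer_id, product_ids in rows
        for product_id in product_ids
    ]


flattened = unnest(rows)
for customer_id, product_id in flattened:
    print(customer_id, product_id)
```

As with the comma (cross) join in the query, a row whose array is empty would produce no output rows in this model.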

Based on the error message, it seems there are additional fields in your struct. To unnest, you need to isolate the array. Try the following:
with sample_data as (
select [STRUCT('linux' as os, [123456,234567,345678,456789] as product_id),
STRUCT('macos' as os, [987654,876543,765432,654321] as product_id)] as values
)
select pid
from sample_data,
UNNEST(values) v,
UNNEST(v.product_id) pid

Related

Preferred results

I am trying to list all DEPARTMENT_IDs with PRODUCT_IDs, first where PRODUCT_COST_STATUS = 1, but there are also rows where PRODUCT_COST_STATUS = 0. I prefer to list the "1"s first and, if there are none, the "0"s with the latest date (this is another case for the future) ... The code I wrote should give the expected result, but the query takes a long time to run. I don't want to list duplicate DEPARTMENT_IDs.
Is there any way around this?
Thanks
SELECT PRODUCT_ID
  ,PRODUCT_COST_STATUS
  ,DEPARTMENT_ID
FROM [PRODUCT_COST] PC
WHERE PRODUCT_COST_STATUS = 1
  OR PRODUCT_ID NOT IN (
    SELECT PRODUCT_ID
    FROM [PRODUCT_COST]
    WHERE PRODUCT_COST_STATUS = 0
    GROUP BY PRODUCT_ID,
      PRODUCT_COST_STATUS,
      DEPARTMENT_ID)
GROUP BY PRODUCT_ID,
  PRODUCT_COST_STATUS,
  DEPARTMENT_ID
ORDER BY PRODUCT_ID,
  DEPARTMENT_ID
I solved it with the help of a friend and wanted to share it here.
PRODUCT_COST_STATUS is a bit, so it is cast to int before taking MAX:
SELECT PRODUCT_ID
,DEPARTMENT_ID
,LOCATION_ID
,max(cast(PRODUCT_COST_STATUS as int)) as maxpcs
,max(ACTION_DATE) as maxad
FROM [PRODUCT_COST]
group by
PRODUCT_ID, DEPARTMENT_ID,LOCATION_ID
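A minimal sketch of how that grouping collapses duplicates, run against SQLite from Python rather than the original database. The rows are invented sample data, and an INTEGER column stands in for the bit column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT_COST (
        PRODUCT_ID INTEGER, DEPARTMENT_ID INTEGER, LOCATION_ID INTEGER,
        PRODUCT_COST_STATUS INTEGER,  -- 0/1, stands in for the bit column
        ACTION_DATE TEXT              -- ISO dates compare correctly as text
    );
    -- Hypothetical rows: product 1 has a status-1 row, product 2 does not.
    INSERT INTO PRODUCT_COST VALUES
        (1, 10, 100, 0, '2020-01-01'),
        (1, 10, 100, 1, '2020-02-01'),
        (2, 10, 100, 0, '2020-01-15'),
        (2, 10, 100, 0, '2020-03-01');
""")

# Same shape as the query above: one row per product/department/location.
result = conn.execute("""
    SELECT PRODUCT_ID, DEPARTMENT_ID, LOCATION_ID,
           MAX(CAST(PRODUCT_COST_STATUS AS INTEGER)) AS maxpcs,
           MAX(ACTION_DATE) AS maxad
    FROM PRODUCT_COST
    GROUP BY PRODUCT_ID, DEPARTMENT_ID, LOCATION_ID
    ORDER BY PRODUCT_ID
""").fetchall()
print(result)
```

Note that maxpcs and maxad are computed independently, so they are not guaranteed to come from the same source row.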

convert data from multiple columns into single row sorting descending

I am trying to query the original source, which contains totals for a category (in this case Vehicles), into the second table.
|Motorcycle|Bicycle|Car|
|1|3|2|
Desired Output:
|Vehicle|Quantity|
|Bicycle|3|
|Car|2|
|Motorcycle|1|
Additionally, I need the Quantity sorted in descending order, as shown above.
So far I have tried an Unpivot, but there is a syntax error in the Unpivot function. Is there another way to get the same result?
My code so far:
SELECT Vehicle_Name
FROM
(
SELECT [Motorcycle], [Bycycle], [Car] from Data
) as Source
UNPIVOT
(
Vehicle FOR Vehicle_Name IN ([Motorcycle], [Bycycle], [Car])
) as Unpvt
Edit: Added sort requirement.
You can use CROSS APPLY here too
select vehicle, amnt
from test
cross apply(
VALUES('motorcycle', motorcycle)
,('bicycle', bicycle)
,('car', car)) x (vehicle, amnt)
order by amnt desc
Try this
with data1 as
(
Select * from data)
Select * From
(
Select 'motorcycle' as "Vehicle", motorcycle as quantity from data1
union all
Select 'bicycle' , bicycle from data1
union all
Select 'car', car from data1
) order by quantity desc;
Since we don't know what DBMS, here's a way that'd work in the one I use the most.
SELECT *
FROM (SELECT map_from_entries(
ARRAY[('Motorcycle', Motorcycle),
('Bicycle', Bicycle),
('Car', Car)])
FROM Source) AS t1(type_quant)
CROSS JOIN UNNEST(type_quant) AS t2(Vehicle, Quantity)
ORDER BY Quantity DESC
-Trino
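For a quick check of the UNION ALL approach without committing to a particular DBMS, here is a small sketch using SQLite from Python. The table name data and the single row of totals follow the question; everything else is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (motorcycle INTEGER, bicycle INTEGER, car INTEGER);
    INSERT INTO data VALUES (1, 3, 2);
""")

# One SELECT per source column turns the columns into rows;
# the final ORDER BY applies to the whole union.
result = conn.execute("""
    SELECT 'Motorcycle' AS vehicle, motorcycle AS quantity FROM data
    UNION ALL
    SELECT 'Bicycle', bicycle FROM data
    UNION ALL
    SELECT 'Car', car FROM data
    ORDER BY quantity DESC
""").fetchall()

for vehicle, quantity in result:
    print(vehicle, quantity)
```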

Select distinct on one column with multiple columns returned

In a table I have 2 versions of cust_id per client; what I am trying to achieve is a list of those products which are only listed under one version of the cust_id, so that I can fix the differences and output a report. I understand that DISTINCT alone won't work because I might have the same product for multiple cust_ids.
I am trying to use DISTINCT on one column only to get a list of the distinct product_name values for all cust ids.
SELECT *
FROM (
SELECT DISTINCT
([product_name]),
[cust_id] % 100000 AS 'CustId',
COUNT([cust_id]) AS 'CustIdCount'
FROM [dbo].[solars_solutions_setting]
WHERE [cust_id] IS NOT NULL
GROUP BY
[product_name],
[cust_id] % 100000,
[value]
) AS Settings WHERE Settings.CustIdCount < 2
Is it possible to output a result having three columns?
distinct product_name
list of all cust_id's for that product_name, comma separated (is it possible?)
custIdCount - which should be 1 (meaning the product is only listed under one version of the cust_id)
Given that you are using SQL Server 2017, we can take advantage of STRING_AGG here:
SELECT
product_name,
STRING_AGG(cust_id, ',') AS cust_ids,
COUNT(DISTINCT cust_id) AS dist_cust_cnt
FROM [dbo].[solars_solutions_setting]
GROUP BY
product_name
HAVING
COUNT(DISTINCT cust_id) > 1;
I chose to retain product names where there is more than one cust_id present. In this case, the CSV list cust_ids would have more than one value. Otherwise, it would only have a single value and the call to STRING_AGG would not serve much purpose.
STRING_AGG is available in this SQL Server version.
SELECT *
FROM (
SELECT DISTINCT
([product_name]),
[cust_id] % 100000 AS 'CustId',
COUNT([cust_id]) AS 'CustIdCount',
string_agg(cust_id, ',') as listCust
FROM [dbo].[solars_solutions_setting]
WHERE [cust_id] IS NOT NULL
GROUP BY
[product_name],
[cust_id] % 100000,
[value]
) AS Settings WHERE Settings.CustIdCount < 2
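Since the question asks for products listed under only one cust_id, a variant with HAVING COUNT(DISTINCT cust_id) = 1 may be closer to the goal. The sketch below tries that idea in SQLite from Python, where GROUP_CONCAT stands in for STRING_AGG. The rows are hypothetical, and the % 100000 normalisation from the question is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE solars_solutions_setting (product_name TEXT, cust_id INTEGER);
    -- Hypothetical rows: only 'panel' appears under two different cust_ids.
    INSERT INTO solars_solutions_setting VALUES
        ('panel',    100001),
        ('panel',    200001),
        ('inverter', 100001),
        ('battery',  200001),
        ('battery',  200001);
""")

# Keep only products that occur under exactly one distinct cust_id.
result = conn.execute("""
    SELECT product_name,
           GROUP_CONCAT(DISTINCT cust_id) AS cust_ids,
           COUNT(DISTINCT cust_id) AS dist_cust_cnt
    FROM solars_solutions_setting
    WHERE cust_id IS NOT NULL
    GROUP BY product_name
    HAVING COUNT(DISTINCT cust_id) = 1
    ORDER BY product_name
""").fetchall()
print(result)
```

Changing the HAVING condition to > 1 gives the opposite report, matching the first answer.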

How to count each instance where a certain product was sold after a different product? SQL or DAX

Sorry if the title seems confusing, it was the best I could come up with.
I can work with both Excel (DAX, since it's a Power Query) and SQL:
I have a situation where there are two product types being purchased, Type_A and Type_B.
I want to calculate a count of how many unique Loc_ID have purchased a "Type_A" Product type, AFTER purchasing a "Type_B" Product type.
From my example there are a total of 3 unique Loc_ID which would fall in this filter: Loc_01, Loc_02, and Loc_04
Any help is greatly appreciated
Try this (it works well if each loc_id purchased both types of products, as in your example):
select count(*)
from
(select loc_id , max(date_purchased) dt
from table t where product_type = 'type_a'
group by loc_id) a,
(select loc_id , max(date_purchased) dt
from table t where product_type = 'type_b'
group by loc_id) b
where a.loc_id=b.loc_id and a.dt>b.dt;
This will work even if certain loc_ids did not purchase both types of products.
Try this:
Select count(a.loc_id) as cnt_locations
from
your_table_name a
inner join
(
Select a.loc_id,b.date_purchased,b.Product_type
from
(
Select loc_id, min(date_purchased) as date_purchased
from
your_table_name
group by loc_id
) a
inner join
your_table_name b
on a.loc_id=b.loc_id and a.date_purchased =b.date_purchased
where Product_type ='Type_B'
) b
on
a.loc_id=b.loc_id
where a.date_purchased >b.date_purchased and a.Product_type ='Type_A'
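Another way to phrase "Type_A after Type_B" is with EXISTS, sketched here against SQLite from Python. The table name purchases and all rows are assumptions, chosen so that Loc_01, Loc_02 and Loc_04 qualify as described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (loc_id TEXT, product_type TEXT, date_purchased TEXT);
    -- Hypothetical rows; ISO dates compare correctly as text.
    INSERT INTO purchases VALUES
        ('Loc_01', 'Type_B', '2021-01-01'),
        ('Loc_01', 'Type_A', '2021-02-01'),
        ('Loc_02', 'Type_B', '2021-01-05'),
        ('Loc_02', 'Type_A', '2021-03-01'),
        ('Loc_03', 'Type_A', '2021-01-01'),
        ('Loc_03', 'Type_B', '2021-02-01'),
        ('Loc_04', 'Type_B', '2021-01-10'),
        ('Loc_04', 'Type_A', '2021-01-20'),
        ('Loc_05', 'Type_A', '2021-01-01');
""")

# Count locations with a Type_A purchase dated after one of their
# own Type_B purchases.
(count,) = conn.execute("""
    SELECT COUNT(DISTINCT a.loc_id)
    FROM purchases a
    WHERE a.product_type = 'Type_A'
      AND EXISTS (SELECT 1
                  FROM purchases b
                  WHERE b.loc_id = a.loc_id
                    AND b.product_type = 'Type_B'
                    AND b.date_purchased < a.date_purchased)
""").fetchone()
print(count)
```

Locations that bought only one product type (Loc_05 here) or bought Type_A first (Loc_03) drop out without any special handling.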

SQL: Using UNION

Here is the question and database info.
Use the UNION command to prepare a full statement for customer 'C001' - it should be laid out as follows. (Note that the values shown below are not correct.) You may be able to use '' or NULL for blank values - if necessary use 0.
Here is a link to the webpage with the database info. http://sqlzoo.net/5_0.htm or see the image below.
Here is what I have tried:
SELECT sdate AS LineDate, "delivery" AS LEGEND, price*quantity AS Total,"" AS Amount
FROM shipped
JOIN product ON (shipped.product=product.id)
WHERE badguy='C001'
UNION
SELECT rdate,notes, "",receipt.amount
FROM receipt
WHERE badguy='C001'
Here is what I get back:
Wrong Answer. The correct answer has 5 row(s).
The amounts don't seem right in the amount column and I can't figure out how to order the data by the date since it is using two different date columns (sdate and rdate which are UNIONED).
It looks like the data in the example is aggregated by date and charge type using GROUP BY, which is why you are getting too many rows.
Also, you can sort by the alias of the column (LineDate) and the order by clause will apply to all the rows in the union.
SELECT sdate AS LineDate, "delivery" AS LEGEND, SUM(price*quantity) AS Total,"" AS Amount
FROM shipped
JOIN product ON (shipped.product=product.id)
WHERE badguy='C001'
GROUP BY sdate
UNION
SELECT rdate, notes, "",receipt.amount
FROM receipt
WHERE badguy='C001'
ORDER BY LineDate
It's usually easiest to develop each part of the union separately. Pay attention to the use of "null" to separate the monetary columns. The first select gets to name the columns.
select s.sdate as tr_date, 'Delivery' as type, sum((s.quantity * p.price)) as extended_price, null as amount
from shipped s
inner join product p on p.id = s.product
where badguy = 'C001'
group by s.sdate
union all
select rdate, notes, null, sum(amount)
from receipt
where badguy = 'C001'
group by rdate, notes
order by tr_date
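To try the union outside SQLzoo, the following sketch builds cut-down versions of the three tables in SQLite from Python. The column set is inferred from the queries above, and every row is invented, so the totals are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id TEXT, price REAL);
    CREATE TABLE shipped (sdate TEXT, badguy TEXT, product TEXT, quantity INTEGER);
    CREATE TABLE receipt (rdate TEXT, badguy TEXT, notes TEXT, amount REAL);
    -- Hypothetical sample data.
    INSERT INTO product VALUES ('P1', 10.0), ('P2', 2.5);
    INSERT INTO shipped VALUES
        ('2006-07-13', 'C001', 'P1', 2),
        ('2006-07-13', 'C001', 'P2', 4),
        ('2006-07-27', 'C001', 'P1', 1),
        ('2006-07-13', 'C002', 'P1', 9);
    INSERT INTO receipt VALUES
        ('2006-07-21', 'C001', 'Cash', 30.0),
        ('2006-07-21', 'C002', 'Cheque', 5.0);
""")

# Deliveries are summed per day; NULL keeps the two money columns apart.
# The ORDER BY uses the alias from the first SELECT and sorts the whole union.
statement = conn.execute("""
    SELECT s.sdate AS tr_date, 'Delivery' AS type,
           SUM(s.quantity * p.price) AS extended_price, NULL AS amount
    FROM shipped s
    JOIN product p ON p.id = s.product
    WHERE s.badguy = 'C001'
    GROUP BY s.sdate
    UNION ALL
    SELECT rdate, notes, NULL, SUM(amount)
    FROM receipt
    WHERE badguy = 'C001'
    GROUP BY rdate, notes
    ORDER BY tr_date
""").fetchall()

for row in statement:
    print(row)
```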