BigQuery - Transposing rows into multiple columns without using the cell values as column names - google-bigquery

Previously I asked a question at this link, and it was successfully answered by JayTiger's Answer and Mikhail's Answer.
Their answers solved my issue, but I have another case that they cannot solve.
For example, I have this kind of data:
transaction_id  item_name
123             snacks
123             marbles
124             tooth_paste
124             tooth_brush
124             pen
Using JayTiger's Answer (Bigquery - Best way to transpose rows into multiple columns) and Mikhail's Answer, the query generates a list of columns like the one below:
transaction_id  item_name_snacks  item_name_marbles  item_name_tooth_paste  item_name_tooth_brush  item_name_pen
However, what I want is something like this, where I can define the column names using a sequence of numbers, for example:
transaction_id  item_name_1  item_name_2  item_name_3
123             snacks       marbles
124             tooth_paste  tooth_brush  pen
Since the maximum number of items per transaction_id in my sample data is 3, the number of generated columns is also 3.
Is there any way to pull this off? Thanks!

You might consider the query below. I think you can generalize it using the dynamic SQL that Mikhail and I described in the previous answers.
WITH sample_table AS (
  SELECT '123' transaction_id, 'snacks' item_name UNION ALL
  SELECT '123' transaction_id, 'marbles' item_name UNION ALL
  SELECT '124' transaction_id, 'tooth_paste' item_name UNION ALL
  SELECT '124' transaction_id, 'tooth_brush' item_name UNION ALL
  SELECT '124' transaction_id, 'pen' item_name
)
SELECT *
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY transaction_id) rn FROM sample_table)
PIVOT (ANY_VALUE(item_name) item_name FOR rn IN (1, 2, 3));
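The generalization mentioned above (building the column list dynamically from the largest number of items per transaction) can be sketched roughly as follows. This is only an illustration, not the BigQuery query itself: it uses SQLite through Python's sqlite3 module as a stand-in, assumes SQLite 3.25 or newer for ROW_NUMBER(), replaces PIVOT with conditional aggregation, and adds a hypothetical seq column so that the numbering within each transaction is deterministic. The function name pivot_items is made up for the example.

```python
import sqlite3

SAMPLE_ROWS = [
    ("123", "snacks"),
    ("123", "marbles"),
    ("124", "tooth_paste"),
    ("124", "tooth_brush"),
    ("124", "pen"),
]

def pivot_items(rows):
    """Pivot (transaction_id, item_name) rows into item_name_1..item_name_N columns."""
    conn = sqlite3.connect(":memory:")
    # seq is a hypothetical insertion-order column (not in the original data).
    conn.execute(
        "CREATE TABLE sample_table ("
        "seq INTEGER PRIMARY KEY, transaction_id TEXT, item_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO sample_table (transaction_id, item_name) VALUES (?, ?)", rows
    )

    # Number the items inside each transaction.
    numbered = (
        "SELECT transaction_id, item_name, "
        "ROW_NUMBER() OVER (PARTITION BY transaction_id ORDER BY seq) AS rn "
        "FROM sample_table"
    )

    # The widest transaction decides how many columns are generated.
    max_items = conn.execute(f"SELECT MAX(rn) FROM ({numbered})").fetchone()[0]
    if max_items is None:  # empty input: nothing to pivot
        return ["transaction_id"], []

    # Build one output column per position: item_name_1, item_name_2, ...
    columns = ", ".join(
        f"MAX(CASE WHEN rn = {i} THEN item_name END) AS item_name_{i}"
        for i in range(1, max_items + 1)
    )
    cursor = conn.execute(
        f"SELECT transaction_id, {columns} FROM ({numbered}) "
        "GROUP BY transaction_id ORDER BY transaction_id"
    )
    header = [col[0] for col in cursor.description]
    return header, cursor.fetchall()

header, pivoted = pivot_items(SAMPLE_ROWS)
print(header)   # ['transaction_id', 'item_name_1', 'item_name_2', 'item_name_3']
print(pivoted)  # [('123', 'snacks', 'marbles', None), ('124', 'tooth_paste', 'tooth_brush', 'pen')]
```

The column list is interpolated only from integers computed by the query itself, so no user-supplied text ends up in the generated SQL.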

Related

How to concat one column by order after group by?

The dataset looks like this:
id   result  rank
001  pass    2
002  fail    3
001  fail    1
002  pass    1
What I want to do: group the dataset by id and concatenate the results in ascending order of rank column.
id   results
001  fail-pass
002  pass-fail
Because the order depends on another column, concat_ws('-', collect_set(result)) alone cannot do what I want.
Is there a built-in function that can achieve this, or is writing a UDF the only solution?
In a subquery before collect_set, distribute by id and sort by id, rank. Dataset will be distributed between reducers by id and sorted by rank before aggregation. See comments in the code.
Demo:
with demo_dataset as ( --Use your table instead of this CTE
  select stack(4,
    '001' , 'pass', 2,
    '002' , 'fail', 3,
    '001' , 'fail', 1,
    '002' , 'pass', 1
  ) as (id,result,rank)
)
select id, concat_ws('-',collect_set(result))
from
(
  select t.*
  from demo_dataset t
  distribute by id --Distribute by grouping column
  sort by id, rank --Sort in required order
) s
group by id
Result:
id results
001 fail-pass
002 pass-fail
Now if you change SORT: sort by id, rank desc you will get results ordered differently
The answer above solves the problem, but it does not fit every situation. For example, if you want to build a column with a concatenated string like this:
"fail:1,success:2", the answer above cannot sort the data in order.
Alternatively, I found a solution: you can use
sort_array(collect_set(your_column_name)) to sort the result after your group by operation.
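The underlying idea in both answers is that the rank has to travel with the result until the values are sorted, and only then is the rank dropped and the values joined. A minimal sketch of that logic in plain Python (not Hive; the function name concat_by_rank is made up for the example, and how this maps onto specific Hive functions is left open):

```python
from collections import defaultdict

def concat_by_rank(rows, sep="-"):
    """Group (id, result, rank) rows by id and join results in ascending rank order."""
    grouped = defaultdict(list)
    for id_, result, rank in rows:
        # Keep the rank next to the value so the sort key is not lost.
        grouped[id_].append((rank, result))
    return {
        id_: sep.join(result for _, result in sorted(pairs))
        for id_, pairs in grouped.items()
    }

demo_dataset = [
    ("001", "pass", 2),
    ("002", "fail", 3),
    ("001", "fail", 1),
    ("002", "pass", 1),
]
print(concat_by_rank(demo_dataset))  # {'001': 'fail-pass', '002': 'pass-fail'}
```

Unlike collect_set, this sketch keeps duplicate results; whether duplicates should be removed depends on the use case.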

SQL Select unique combinations of rows for other column value

I’m trying to do an analysis of the different combinations of taxes per invoice to identify how many scenarios exist.
In the tax table, column 1 is invoiceNo, column 2 is taxType. These form the composite key. There can be 1 or more taxType per invoiceNo. Example of data:
https://i.imgur.com/bcQc7vY_d.jpg?maxwidth=640&shape=thumb&fidelity=medium (Sorry, but I’m new so I can’t add a picture.)
I want to be able to report on the unique taxType combinations across all invoiceNo values. I.e., 1 A is unique comb 1, 2 AB is unique comb 2, 3 A is disregarded as already returned for 1, and 4 BC is unique comb 3.
Not sure if this makes sense! Finding it hard to articulate what I’m after!
Expected output would be:
A
AB
BC
The original version of this question was tagged MySQL, so this answers the question.
If I understand correctly, you can use group_concat():
select distinct group_concat(taxtype order by taxtype)
from t
group by invoiceno;
This works with the table you have given, and it would still work if those tax-type combinations repeat. If there are more tax codes, an AC combination, or some of the given combinations are missing, the result might be a little different. You could develop this to suit those conditions, or you could give some more info: do invoices have three codes (ABC)? Do invoices have just B or just C codes? I notice that the BC invoice etc
WITH CTE (RN,InvoiceNo,TT1,TT2)
AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY a.InvoiceNo),a.InvoiceNo,a.TaxType,b.TaxType
    FROM UniqueCombo a INNER JOIN UniqueCombo b ON a.InvoiceNo=b.InvoiceNo
)
,
CTE2 (RN,InvoiceNo,TT1,TT2)
AS
(
    SELECT * FROM CTE WHERE RN IN
    (
        SELECT MAX(RN) FROM CTE WHERE TT1=TT2 GROUP BY InvoiceNo HAVING COUNT(InvoiceNo)=1
    )
)
SELECT TT1 FROM CTE2 WHERE RN IN
(
    SELECT MAX(RN) FROM CTE WHERE TT1=TT2 GROUP BY TT1,TT2 HAVING COUNT(InvoiceNo)>1
)
UNION
SELECT TT1+''+TT2 FROM CTE WHERE RN IN
(
    SELECT MAX(RN)-1 FROM CTE WHERE TT1<>TT2 GROUP BY InvoiceNo
)
You can try STRING_AGG. Something like:
SELECT DISTINCT TaxTypeString
FROM
(
SELECT InvoiceNo, STRING_AGG(TaxType, '') AS TaxTypeString
FROM t
GROUP BY InvoiceNo
) x
ORDER BY TaxTypeString
The nested query, called x, should give you one row per invoice number, in the format you want. Then you have to select the distinct tax types from there.
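One detail worth checking with any of the aggregation approaches above is the order of tax types inside each concatenated string: without a defined order, the same combination could in principle appear as both AB and BA and be counted twice. A small plain-Python sketch of the intended logic, using the four invoices described in the question as sample data (the function name unique_tax_combinations is made up for the example):

```python
def unique_tax_combinations(rows):
    """Return the distinct tax-type combinations found across invoices."""
    by_invoice = {}
    for invoice_no, tax_type in rows:
        by_invoice.setdefault(invoice_no, set()).add(tax_type)
    # Sort the types inside each invoice so A+B and B+A collapse to 'AB'.
    combos = {"".join(sorted(types)) for types in by_invoice.values()}
    return sorted(combos)

tax_rows = [
    (1, "A"),
    (2, "A"), (2, "B"),
    (3, "A"),
    (4, "C"), (4, "B"),  # deliberately out of order
]
print(unique_tax_combinations(tax_rows))  # ['A', 'AB', 'BC']
```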

totalling rows for distinct values in SQL

I haven't had much experience with SQL and it strikes me as a simple question, but after an hour of searching I still can't find an answer...
I have a table that I want to add up the totals for based on ID - e.g:
-------------
ID Quantity
1 30
2 11
1 4
1 3
2 17
3 16
.............
After summing the table should look something like this:
-------------
ID Quantity
1 37
2 28
3 16
I'm sure that I need to use the DISTINCT keyword and the SUM(..) function, but I can only get one total value for all unique value combinations in the table, and not separate ones like above. Help please :)
Select ID, Sum(Quantity) from YourTable
Group by ID
You can find here some resources to learn more about "Group by": http://www.w3schools.com/sql/sql_groupby.asp
SELECT ID, SUM(QUANTITY) FROM TABLE1 GROUP BY ID ORDER BY ID;
Select ID, Sum(Quantity) AS Quantity
from table1
Group by ID
Replace table1 with name of the table.
Just posting a complete answer that aliases the column and orders the results:
SELECT ID, SUM(Quantity) as [Quantity]
FROM TableName
GROUP BY ID
ORDER BY ID
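To see the GROUP BY answers run against the sample data from the question, here is a small sketch that uses SQLite through Python's sqlite3 module. The table name table1 follows one of the answers above; the function name total_by_id is made up for the example, and the bracketed [Quantity] alias is written as a plain alias here.

```python
import sqlite3

def total_by_id(rows):
    """Sum Quantity per ID using GROUP BY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (ID INTEGER, Quantity INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", rows)
    return conn.execute(
        "SELECT ID, SUM(Quantity) AS Quantity "
        "FROM table1 GROUP BY ID ORDER BY ID"
    ).fetchall()

quantity_rows = [(1, 30), (2, 11), (1, 4), (1, 3), (2, 17), (3, 16)]
print(total_by_id(quantity_rows))  # [(1, 37), (2, 28), (3, 16)]
```

DISTINCT is not needed: GROUP BY already produces one row per ID, and SUM is evaluated separately inside each group.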

SQL - Selecting unique values from one column then filtering based on another

I've had a search around and have seen quite a few questions about selecting distinct values, but none of them seem close enough to my query to be able to help. This is the scenario
ID Product_ID Product_type
123 56789 A
123 78901 B
456 12345 A
789 45612 B
The SQL I need would be to search in a table similar to the above, and bring back the rows where the Product_type is B but only if the ID related to it exists once within the table.
So in this case it would bring back only
789 45612 B
The SQL I have tried based on what I've found so far was
SELECT DISTINCT(ID)
FROM "TABLE"
WHERE "PRODUCT_TYPE" = 'B'
As well as
SELECT *
FROM "TABLE"
WHERE "PRODUCT_TYPE" = 'B'
GROUP BY "ID"
HAVING COUNT(ID) = 1
And neither have worked
One way via a list of IDs appearing once:
select * from T where Product_type = 'B' and id in (
select id from T
group by id
having count(id) = 1)
Solution 1: Use a sub-query to count IDs.
select * from table t1
where Product_type = 'B'
and (select count(*) from table
where id = t1.id) = 1
You can use group by for this type of query. However, you cannot filter down to the 'B's before the aggregation.
So, try this:
SELECT t.id, MAX(t.product_id) as product_id,
MAX(t.product_type) as product_type
FROM "TABLE" t
GROUP BY "ID"
HAVING COUNT(*) = 1 AND
MAX(PRODUCT_TYPE) = 'B';
This may look a little bit arcane. But the having clause is guaranteeing that there is only one row and that row has a 'B'. Hence the MAX() functions are returning the max from that one row -- which is the value on that row.
EDIT:
Many databases will also allow you to take advantage of window functions for this:
select t.*
from (select t.*, count(*) over (partition by id) as id_cnt
from table t
) t
where t.product_type = 'B' and id_cnt = 1;
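As a quick check that the subquery and GROUP BY/HAVING approaches above agree on the sample data, here is a sketch using SQLite through Python's sqlite3 module. The table is named t here to avoid the reserved word TABLE, and the helper names are made up for the example.

```python
import sqlite3

PRODUCT_ROWS = [
    (123, 56789, "A"),
    (123, 78901, "B"),
    (456, 12345, "A"),
    (789, 45612, "B"),
]

def connect(rows):
    """Load the sample rows into an in-memory table named t."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, product_id INTEGER, product_type TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    return conn

def via_subquery(conn):
    # Filter to type B, then keep only IDs that occur exactly once overall.
    return conn.execute("""
        SELECT * FROM t
        WHERE product_type = 'B'
          AND id IN (SELECT id FROM t GROUP BY id HAVING COUNT(id) = 1)
    """).fetchall()

def via_having(conn):
    # One row per group is guaranteed by COUNT(*) = 1, so MAX() just
    # returns the value from that single row.
    return conn.execute("""
        SELECT id, MAX(product_id) AS product_id, MAX(product_type) AS product_type
        FROM t
        GROUP BY id
        HAVING COUNT(*) = 1 AND MAX(product_type) = 'B'
    """).fetchall()

conn = connect(PRODUCT_ROWS)
print(via_subquery(conn))  # [(789, 45612, 'B')]
print(via_having(conn))    # [(789, 45612, 'B')]
```

ID 123 is excluded even though it has a B row, because the count is taken over all rows for that ID before the type filter is applied.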

SQL - Only one result per set

I have a SQL problem (MS SQL Server 2012), where I only want one result per set, but have different items in some rows, so a group by doesn't work.
Here is the statement:
Select Deliverer, ItemNumber, min(Price)
From MyTable
Group By Deliverer, ItemNumber
So I want the deliverer with the lowest price for one item.
With this query I get the lowest price for each deliverer.
So a result like:
DelA 12345 1,25
DelB 11111 2,31
And not like
DelA 12345 1,25
DelB 12345 1,35
DelB 11111 2,31
DelC 11111 2,35
I know it is probably a stupid question with an easy solution, but I have tried for about three hours now and just can't find a solution. Needless to say, I'm not very experienced with SQL.
Just add an aggregate function to your Deliverer field as well, as appropriate (either min or max). From your data, I guess you need min(Deliverer), so use the query below to get your desired result.
Select MIN(Deliverer), ItemNumber, min(Price)
From MyTable
Group By ItemNumber;
EDIT:
Below query should help you get the deliverer with the lowest price item-wise:
SELECT TABA.ITEMNUMBER, TABA.MINPRICE, TABB.DELIVERER
FROM
(
    SELECT ITEMNUMBER, MIN(PRICE) MINPRICE
    FROM MYTABLE
    GROUP BY ITEMNUMBER
) TABA
JOIN MYTABLE TABB
    ON TABA.ITEMNUMBER = TABB.ITEMNUMBER
   AND TABA.MINPRICE = TABB.PRICE
You should be able to do this with the RANK() (or DENSE_RANK()) functions, and a bit of partitioning, so something like:
;WITH rankings AS (
    SELECT Deliverer,
           ItemNumber,
           Price,
           -- Rank 1 is the lowest price within each item
           RANK() OVER (PARTITION BY ItemNumber ORDER BY Price ASC) AS Ranking
    FROM MyTable
)
SELECT rankings.Deliverer,
       rankings.ItemNumber,
       rankings.Price
FROM rankings
WHERE Ranking = 1
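To check the "lowest price per item" logic against the sample rows from the question, here is a sketch of the join-on-minimum-price approach using SQLite through Python's sqlite3 module. It is a stand-in for SQL Server, prices are written with a decimal point, and the function name cheapest_deliverers is made up for the example.

```python
import sqlite3

PRICE_ROWS = [
    ("DelA", 12345, 1.25),
    ("DelB", 12345, 1.35),
    ("DelB", 11111, 2.31),
    ("DelC", 11111, 2.35),
]

def cheapest_deliverers(rows):
    """Return (Deliverer, ItemNumber, Price) for the lowest price of each item."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (Deliverer TEXT, ItemNumber INTEGER, Price REAL)")
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", rows)
    return conn.execute("""
        SELECT b.Deliverer, a.ItemNumber, a.MinPrice
        FROM (
            SELECT ItemNumber, MIN(Price) AS MinPrice
            FROM MyTable
            GROUP BY ItemNumber
        ) a
        JOIN MyTable b
          ON a.ItemNumber = b.ItemNumber
         AND a.MinPrice = b.Price
        ORDER BY a.ItemNumber, b.Deliverer
    """).fetchall()

print(cheapest_deliverers(PRICE_ROWS))
# [('DelB', 11111, 2.31), ('DelA', 12345, 1.25)]
```

If two deliverers share the lowest price for an item, both rows are returned, which matches what RANK() with a filter on rank 1 would do.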