Turn one column into multiple columns by key words


How can I split a column into several columns based on specific words? For example, I have table A and I want to split col wherever one of the words "AND", "OR", or "PLUS" appears, so that I get table B as the result.
A
ID | col
1 | THE BIG APPLE AND ORANGE OR PEAR
2 | BANNANA EATS GRAPE OR BLUEBERRY
3 | THE BEST FRUIT IS WATERMELON
4 | FRUITS OR CANDY ARE THE BEST OR WATER
5 | APPLE STRAWBERRY AND PLUM PLUS SUGAR OR PEACH
6 | MELON IN MY BELLY
B
ID | col1 | col2 | col3 | col4
1 | THE BIG APPLE | ORANGE | PEAR |
2 | BANNANA EATS GRAPE | BLUEBERRY | |
3 | THE BEST FRUIT IS WATERMELON | | |
4 | FRUITS | CANDY ARE THE BEST | WATER |
5 | APPLE STRAWBERRY | PLUM | SUGAR | PEACH
6 | MELON IN MY BELLY | | |

You can split the string and then PIVOT:
SELECT *
FROM   (
  SELECT id,
         idx,
         match
  FROM   table_name
         CROSS APPLY (
           SELECT LEVEL AS idx,
                  REGEXP_SUBSTR(
                    col,
                    '(.+?)(\s+(AND|OR|PLUS)\s+|$)',
                    1,
                    LEVEL,
                    'i',
                    1
                  ) AS match
           FROM   DUAL
           CONNECT BY LEVEL <= REGEXP_COUNT(
                    col,
                    '(.+?)(\s+(AND|OR|PLUS)\s+|$)',
                    1,
                    'i'
                  )
         )
)
PIVOT (
  MAX(match)
  FOR idx IN (1 AS col1, 2 AS col2, 3 AS col3, 4 AS col4)
);
Note: A SQL statement must have a fixed number of output columns, so a static statement cannot set the number of columns dynamically. It may be better to use only the inner query (without the outer wrapper that performs the PIVOT), output the values as rows rather than columns, and transpose them to columns in whatever front end you use to access the database.
Which, for the sample data:
CREATE TABLE table_name (ID, col) AS
SELECT 1, 'THE BIG APPLE AND ORANGE OR PEAR' FROM DUAL UNION ALL
SELECT 2, 'BANNANA EATS GRAPE OR BLUEBERRY' FROM DUAL UNION ALL
SELECT 3, 'THE BEST FRUIT IS WATERMELON' FROM DUAL UNION ALL
SELECT 4, 'FRUITS OR CANDY ARE THE BEST OR WATER' FROM DUAL UNION ALL
SELECT 5, 'APPLE STRAWBERRY AND PLUM PLUS SUGAR OR PEACH' FROM DUAL UNION ALL
SELECT 6, 'MELON IN MY BELLY' FROM DUAL;
Outputs:
ID | COL1 | COL2 | COL3 | COL4
1 | THE BIG APPLE | ORANGE | PEAR | null
2 | BANNANA EATS GRAPE | BLUEBERRY | null | null
3 | THE BEST FRUIT IS WATERMELON | null | null | null
4 | FRUITS | CANDY ARE THE BEST | WATER | null
5 | APPLE STRAWBERRY | PLUM | SUGAR | PEACH
6 | MELON IN MY BELLY | null | null | null
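If the transposition is done in the front end, as the note suggests, the splitting itself can be sketched there too. The following is only a minimal Python illustration, not a translation of the Oracle query: the function name `split_on_keywords` and its `width` parameter are made up for this example, and it assumes each keyword is always surrounded by whitespace.

```python
import re

# Keywords only count as delimiters when surrounded by whitespace,
# so the "OR" inside "ORANGE" is left alone. Matching is case-insensitive.
DELIMITER = re.compile(r"\s+(?:AND|OR|PLUS)\s+", re.IGNORECASE)


def split_on_keywords(text, width=4):
    """Split text on AND/OR/PLUS and pad the result with None up to `width` items.

    Parts beyond `width` are dropped, mirroring the fixed column list
    of a PIVOT with a fixed number of output columns.
    """
    parts = DELIMITER.split(text)
    return parts[:width] + [None] * (width - len(parts))


if __name__ == "__main__":
    for row in (
        "THE BIG APPLE AND ORANGE OR PEAR",
        "APPLE STRAWBERRY AND PLUM PLUS SUGAR OR PEACH",
        "MELON IN MY BELLY",
    ):
        print(split_on_keywords(row))
```

Padding to a fixed width is a deliberate choice here: it keeps every row the same shape, which is the same constraint the note describes for static SQL.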

Related

Remove all duplicates except latest occurrence in big query standard sql based off two columns

Suppose I have a table in BigQuery that contains the following:
fruit color quantity age other_field
apple red 3 1 foo
grapes green 5 1 young
apple green 1 3 word
apple red 4 5 bar
How would I delete all rows except the last instance of each fruit and color combination, so that my table would then look like this:
fruit color quantity age other_field
grapes green 5 1 young
apple green 1 3 word
apple red 4 5 bar
In other words, how do I keep only a single row for every unique pair of fruit and color in BigQuery Standard SQL?

Below is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'apple' fruit, 'red' color, 3 quantity, 1 age, 'foo' other_field UNION ALL
SELECT 'grapes', 'green', 5, 1, 'young' UNION ALL
SELECT 'apple', 'green', 1, 3, 'word' UNION ALL
SELECT 'apple', 'red', 4, 5, 'bar'
)
SELECT fruit, color,
ARRAY_AGG(STRUCT(quantity, age, other_field) ORDER BY age DESC LIMIT 1)[OFFSET(0)].*
FROM `project.dataset.table` t
GROUP BY fruit, color
with result
Row fruit color quantity age other_field
1 apple red 4 5 bar
2 grapes green 5 1 young
3 apple green 1 3 word
Another version of same is:
#standardSQL
SELECT AS VALUE
ARRAY_AGG(t ORDER BY age DESC LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY fruit, color
with same result ... but obviously I like this version better :o)
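Both queries above select the surviving rows rather than deleting the others. As a rough illustration of the delete the question asks about, here is a minimal sketch that uses Python's built-in `sqlite3` as a stand-in engine. It is not BigQuery syntax; the table name `fruits` and the function `keep_latest_per_pair` are hypothetical, and "latest" is assumed to mean the highest `age`, as in the answer.

```python
import sqlite3

SAMPLE = [
    ("apple", "red", 3, 1, "foo"),
    ("grapes", "green", 5, 1, "young"),
    ("apple", "green", 1, 3, "word"),
    ("apple", "red", 4, 5, "bar"),
]


def keep_latest_per_pair(rows):
    """Load rows into an in-memory table and delete every row that has
    a newer row (higher age) with the same fruit and color."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fruits "
        "(fruit TEXT, color TEXT, quantity INTEGER, age INTEGER, other_field TEXT)"
    )
    conn.executemany("INSERT INTO fruits VALUES (?, ?, ?, ?, ?)", rows)
    # A row is removed only if a strictly newer row exists for its pair,
    # so rows tied for the highest age would all be kept.
    conn.execute(
        """
        DELETE FROM fruits
        WHERE EXISTS (
            SELECT 1
            FROM fruits AS newer
            WHERE newer.fruit = fruits.fruit
              AND newer.color = fruits.color
              AND newer.age > fruits.age
        )
        """
    )
    remaining = conn.execute(
        "SELECT fruit, color, quantity, age, other_field "
        "FROM fruits ORDER BY fruit, color"
    ).fetchall()
    conn.close()
    return remaining


if __name__ == "__main__":
    for row in keep_latest_per_pair(SAMPLE):
        print(row)
```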

How do I split a string column into multi rows of single words & word pairs in BigQuery SQL?

I am trying (unsuccessfully) to split a string column in Google BigQuery into rows containing all single words and all word pairs (next to each other & in order). I also need to maintain the ID field for the words from the IndataTable. Both recordsets have 2 columns.
IndataTable as IDT
ID WordString
1 apple banana pear
2 carrot
3 blue red green yellow
OutdataTable as ODT
ID WordString
1 apple
1 banana
1 pear
1 apple banana
1 banana pear
2 carrot
3 blue
3 red
3 green
3 yellow
3 blue red
3 red green
3 green yellow (only pairs that are next to each other)
Is this possible in BigQuery SQL?
Edit/Added:
This is what I have so far which works for splitting it up into single words. I am really struggling to figure out how to expand this to word pairs. I don't know if this can be modified for it or I need a new approach altogether.
SELECT ID, split(WordString,' ') as Words
FROM (
select *
from
(select ID, WordString from IndataTable)
)

Below is for BigQuery Standard SQL
#standardSQL
WITH IndataTable AS (
SELECT 1 id, 'apple banana pear' WordString UNION ALL
SELECT 2, 'carrot' UNION ALL
SELECT 3, 'blue red green yellow'
), words AS (
SELECT id, word, pos
FROM IndataTable, UNNEST(SPLIT(WordString,' ')) AS Word WITH OFFSET pos
), pairs AS (
SELECT id, CONCAT(word, ' ', LEAD(word) OVER(PARTITION BY id ORDER BY pos)) pair
FROM words
)
SELECT id, word AS WordString FROM words UNION ALL
SELECT id, pair AS WordString FROM pairs
WHERE NOT pair IS NULL
ORDER BY id
with result as expected :
Row id WordString
1 1 apple
2 1 banana
3 1 pear
4 1 apple banana
5 1 banana pear
6 2 carrot
7 3 blue
8 3 red
9 3 green
10 3 yellow
11 3 blue red
12 3 red green
13 3 green yellow
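The LEAD-based pairing can also be pictured outside SQL. The sketch below is a plain Python illustration of the same idea (each word paired with the word that follows it); the function name `words_and_pairs` is invented for this example, and it assumes words are separated by single spaces, as in the sample data.

```python
def words_and_pairs(text):
    """Return all single words followed by all adjacent word pairs, in order."""
    words = text.split(" ")
    # zip(words, words[1:]) pairs each word with its successor; the last
    # word has no successor, just as LEAD() yields NULL for the last row.
    pairs = [f"{first} {second}" for first, second in zip(words, words[1:])]
    return words + pairs


if __name__ == "__main__":
    indata = {1: "apple banana pear", 2: "carrot", 3: "blue red green yellow"}
    for row_id, word_string in indata.items():
        for item in words_and_pairs(word_string):
            print(row_id, item)
```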

Access SQL Query - Only extracting value if it is same for the whole group

I'm trying to write a query in Access 2010 based on the following simple data set:
Lot Fruit
1 Mango
1 Mango
1 Apple
1 Orange
2 Apple
2 Apple
2 Apple
3 Apple
3 Mango
4 Mango
4 Mango
4 Mango
5 Apple
5 Apple
I only want to extract the lots in which every row has the same fruit.
For example, if I ask for Fruit = "Apple", the query should return only lots that contain "Apple" and no other fruit.
For the data above, asking for all lots that contain only Apple should return the following result:
Lot Fruit
2 Apple
2 Apple
2 Apple
5 Apple
5 Apple
I have tried various SQL queries but with no luck, any help will be appreciated.

Try this:
SELECT Lot
FROM mytable
GROUP BY Lot
HAVING SUM(IIf(Fruit <> 'Apple', 1, 0)) = 0
Alternatively try:
SELECT DISTINCT Lot
FROM mytable AS t1
WHERE Fruit = 'Apple' AND
NOT EXISTS (SELECT 1
FROM mytable AS t2
WHERE t1.Lot = t2.Lot AND t2.Fruit <> 'Apple')
It's also possible to use LEFT JOIN:
SELECT DISTINCT Lot
FROM mytable AS t1
LEFT JOIN mytable AS t2 ON t1.Lot = t2.Lot AND t2.Fruit <> 'Apple'
WHERE t1.Fruit = 'Apple' AND t2.Lot IS NULL
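The queries above return the lot numbers only, while the question shows every row of the matching lots. As a hedged sketch of the NOT EXISTS variant returning full rows, the following uses Python's built-in `sqlite3` in place of Access, so the SQL dialect is SQLite rather than Access SQL. The function name `rows_in_single_fruit_lots` is made up for this example.

```python
import sqlite3

LOTS = [
    (1, "Mango"), (1, "Mango"), (1, "Apple"), (1, "Orange"),
    (2, "Apple"), (2, "Apple"), (2, "Apple"),
    (3, "Apple"), (3, "Mango"),
    (4, "Mango"), (4, "Mango"), (4, "Mango"),
    (5, "Apple"), (5, "Apple"),
]


def rows_in_single_fruit_lots(rows, fruit):
    """Return every (Lot, Fruit) row whose lot contains only the given fruit."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (Lot INTEGER, Fruit TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    # Keep a row if it has the wanted fruit and its lot holds no other fruit.
    result = conn.execute(
        """
        SELECT t1.Lot, t1.Fruit
        FROM mytable AS t1
        WHERE t1.Fruit = ?
          AND NOT EXISTS (
              SELECT 1
              FROM mytable AS t2
              WHERE t2.Lot = t1.Lot AND t2.Fruit <> ?
          )
        ORDER BY t1.Lot
        """,
        (fruit, fruit),
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rows_in_single_fruit_lots(LOTS, "Apple"):
        print(row)
```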

SQL - group by and allow references

stackoverflowers.
I want to group a list by a field name and, as aggregate function, choose the most recent date. But I want to retain the value from a field price relative to the entry with that last date. Dates never repeat for the same name. Here is the example table...
id name date price
1 Orange 21/01 1,99
2 Orange 22/01 1,99
3 Orange 23/01 2,99
4 Orange 25/01 1,99
5 Apple 20/01 2,49
6 Apple 22/01 3,49
7 Apple 23/01 2,99
8 Banana 20/01 3,99
9 Banana 21/01 3,99
10 Banana 22/01 4,99
11 Banana 23/01 3,99
12 Banana 24/01 3,99
... and the desired result:
id name MAX(date) last_price
4 Orange 25/01 1,99
7 Apple 23/01 2,99
12 Banana 24/01 3,99
Is it possible to accomplish this with a SQL GROUP BY clause? Using a nested select slows things down, as I have a very big table.

You can do something like this:
select *
from table t1
inner join (
select name, max(date) as maxdate
from table
group by name
) t2 on t1.name = t2.name and t1.date = t2.maxdate
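To see the join-on-maximum pattern run against the sample data, here is a minimal sketch using Python's built-in `sqlite3`. The table name `prices` and the function `latest_price_per_name` are hypothetical, and the dates are stored as the same 'dd/mm' strings as in the question, which compare correctly here only because every sample date falls in the same month. A real table would want a proper date type.

```python
import sqlite3

PRICES = [
    (1, "Orange", "21/01", "1,99"), (2, "Orange", "22/01", "1,99"),
    (3, "Orange", "23/01", "2,99"), (4, "Orange", "25/01", "1,99"),
    (5, "Apple", "20/01", "2,49"), (6, "Apple", "22/01", "3,49"),
    (7, "Apple", "23/01", "2,99"),
    (8, "Banana", "20/01", "3,99"), (9, "Banana", "21/01", "3,99"),
    (10, "Banana", "22/01", "4,99"), (11, "Banana", "23/01", "3,99"),
    (12, "Banana", "24/01", "3,99"),
]


def latest_price_per_name(rows):
    """Return (id, name, date, price) for the most recent date of each name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prices (id INTEGER, name TEXT, date TEXT, price TEXT)")
    conn.executemany("INSERT INTO prices VALUES (?, ?, ?, ?)", rows)
    # The derived table finds the latest date per name; joining back on
    # (name, date) recovers the id and price from that same row.
    result = conn.execute(
        """
        SELECT t1.id, t1.name, t1.date, t1.price
        FROM prices AS t1
        INNER JOIN (
            SELECT name, MAX(date) AS maxdate
            FROM prices
            GROUP BY name
        ) AS t2 ON t1.name = t2.name AND t1.date = t2.maxdate
        ORDER BY t1.id
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_price_per_name(PRICES):
        print(row)
```

Selecting explicit columns from `t1` instead of `*` avoids returning the name and date twice.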

SQL Query to get 1 output from 2 different columns within same table

I am still new to SQL and was wondering what would be the best option to get distinct category names from two different columns from the same table.
Example:
Table Name: Fruits
ID CAT1 CAT2
1 APPLE PEACH
2 PEACH GRAPE
3 APPLE GRAPE
4 ORANGE APPLE
5 PEACH PEAR
Desired Output
Distinct CAT
APPLE
PEACH
GRAPE
ORANGE
PEAR
I know that I would want to do a join where I name each table a letter like fruits a and fruits b so I match it via the ID but I cannot figure how to display it in one column only the distinct CAT from both columns.

You could query the distinct values of both columns separately and UNION (e.g. MySQL documentation) the results:
(SELECT DISTINCT CAT1 FROM Fruits)
UNION
(SELECT DISTINCT CAT2 FROM Fruits)

If you had played with it a little, you would have gotten this already.
SELECT DISTINCT cat
FROM (
  (SELECT cat1 AS cat FROM fruits)
  UNION ALL
  (SELECT cat2 AS cat FROM fruits)
) q
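Both answers rely on the fact that UNION removes duplicates while UNION ALL keeps them. A small sketch with Python's built-in `sqlite3` can confirm the result on the sample data; the function name `distinct_categories` is made up for this example, and the output is sorted alphabetically only to make it deterministic.

```python
import sqlite3

FRUITS = [
    (1, "APPLE", "PEACH"),
    (2, "PEACH", "GRAPE"),
    (3, "APPLE", "GRAPE"),
    (4, "ORANGE", "APPLE"),
    (5, "PEACH", "PEAR"),
]


def distinct_categories(rows):
    """Return the distinct values found in CAT1 or CAT2, sorted alphabetically."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Fruits (ID INTEGER, CAT1 TEXT, CAT2 TEXT)")
    conn.executemany("INSERT INTO Fruits VALUES (?, ?, ?)", rows)
    # UNION (without ALL) already removes duplicates across both columns,
    # so no extra DISTINCT is needed.
    result = conn.execute(
        """
        SELECT CAT1 AS cat FROM Fruits
        UNION
        SELECT CAT2 FROM Fruits
        ORDER BY cat
        """
    ).fetchall()
    conn.close()
    return [cat for (cat,) in result]


if __name__ == "__main__":
    print(distinct_categories(FRUITS))
```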
