BigQuery - concatenate array of strings for each row with `null`s - sql

This is a clarification/follow-up on the earlier question where I didn't specify the requirement for null values.
Given this input:
Row id app_date inventor.name inventor.country
1 id_1 01-15-2022 Steve US
Ashley US
2 id_2 03-16-2011 Pete US
<null> US
Mary FR
I need to extract name from inventor struct and concatenate them for each id, like so:
Row id app_date inventors
1 id_1 01-15-2022 Steve, Ashley
2 id_2 03-16-2011 Pete, ^, Mary
Note custom filling for null value - which, to me, seems like it means I need to use ARRAY_TO_STRING specifically that supports this.
The closest example I found doesn't work with nulls. How can one do this?

Use the query below:
SELECT * EXCEPT(inventor),
(SELECT STRING_AGG(IFNULL(name, '^'), ', ') FROM t.inventor) inventors
FROM sample t
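To make the per-row logic of that query concrete, here is a minimal Python sketch of what STRING_AGG(IFNULL(name, '^'), ', ') computes for each id. The function name join_names and the sample dictionary are hypothetical and only mirror the data in the question. As far as I know, ARRAY_TO_STRING also accepts an optional third argument that replaces NULL elements, which is the behaviour the question alludes to.

```python
from typing import Optional, Sequence


def join_names(names: Sequence[Optional[str]],
               sep: str = ", ",
               null_text: str = "^") -> str:
    """Join names, substituting null_text for missing (None) entries."""
    return sep.join(null_text if n is None else n for n in names)


# Hypothetical in-memory copy of the sample rows from the question.
rows = {
    "id_1": ["Steve", "Ashley"],
    "id_2": ["Pete", None, "Mary"],
}

inventors = {row_id: join_names(names) for row_id, names in rows.items()}
print(inventors)
```

The substitution happens before joining, which is why the null keeps its position in the list.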

Related

SQL: extract the last word

I have a table that looks like the following:
id | cars
1 | John's Honda
2 | Andrew's red lexus
3 | James has a bmw
I need to just get the last word of the "cars" column that shows the actual "car" name
I have tried the following, but I don't get the desired output:
select substr(cars, -1)
from t
The code above just shows me the last character of the column. Later, I tried the following:
select split(cars, ' ')[offset(1)]
from t
however, I got the "Array index 1 is out of bounds (overflow)" error. Can anyone help how this can be achieved with bigquery?
Consider the simple approach below:
select *,
array_reverse(split(cars, ' '))[offset(0)] as brand
from your_table
Applied to the sample data in your question, the brand column comes out as Honda, lexus, and bmw.
Note: there are many ways to handle this case, so another one would be regexp_extract(cars, r'\b(\w+)$') as brand
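Both ideas (reverse the split array and take the first element, or anchor a regular expression at the end of the string) can be tried outside BigQuery. The Python below is only a sketch of the same logic; the function names last_word_split and last_word_regex are hypothetical, and Python's re module is not the RE2 engine BigQuery uses, although this particular pattern should behave the same way in both.

```python
import re
from typing import Optional


def last_word_split(text: str) -> str:
    """Mirror array_reverse(split(cars, ' '))[offset(0)]."""
    return text.split(" ")[::-1][0]


def last_word_regex(text: str) -> Optional[str]:
    """Mirror regexp_extract(cars, r'\\b(\\w+)$'); None when nothing matches."""
    match = re.search(r"\b(\w+)$", text)
    return match.group(1) if match else None


cars = ["John's Honda", "Andrew's red lexus", "James has a bmw"]
for value in cars:
    print(value, "->", last_word_split(value), "/", last_word_regex(value))
```

The two variants differ on edge cases: with a trailing space the split version returns an empty string, while the regex version finds no match.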

SQL Combine null rows with non null

Due to the way a particular table is written I need to do something a little strange in SQL and I can't find a 'simple' way to do this
Table
Name Place Amount
Chris Scotland
Chris £1
Amy England
Amy £5
Output
Chris Scotland £1
Amy England £5
What I am trying to do is above, so the null rows are essentially ignored and 'grouped' up based on the Name
I have this working using For XML however it is incredibly slow, is there a smarter way to do this?
This is where MAX would work
select
Name
,Place = Max(Place)
,Amount = Max(Amount)
from
YourTable
group by
Name
Naturally, if you have more than one occurrence of a place for a given name, you may get unexpected results.
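The reason MAX works is that aggregate functions skip NULLs, so each group keeps its one non-null value per column. A small runnable check, assuming the blank cells are real NULLs rather than empty strings, and using SQLite through Python's sqlite3 module instead of SQL Server (so the aliases are written with AS); the helper name collapse_rows is hypothetical:

```python
import sqlite3


def collapse_rows() -> list:
    """Group the sample rows by Name and let MAX skip the NULL cells."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (Name TEXT, Place TEXT, Amount TEXT)")
    conn.executemany(
        "INSERT INTO YourTable VALUES (?, ?, ?)",
        [
            ("Chris", "Scotland", None),
            ("Chris", None, "£1"),
            ("Amy", "England", None),
            ("Amy", None, "£5"),
        ],
    )
    rows = conn.execute(
        "SELECT Name, MAX(Place) AS Place, MAX(Amount) AS Amount "
        "FROM YourTable GROUP BY Name ORDER BY Name"
    ).fetchall()
    conn.close()
    return rows


print(collapse_rows())
```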

How to use LIKE in a query to find multiple words?

I have a cust table
id name class mark
1 John Deo Matt Four 75
2 Max Ruin Three 85
3 Arnold Three 55
4 Krish Star HN Four 60
5 John Mike Four 60
6 Alex John Four 55
I would like to search for a customer which might be given as John Matt without the deo string. How to use a LIKE condition for this?
SELECT * FROM cust WHERE name LIKE '%John Matt%'
The result should fetch the row 1.
What if the search string is Matt Deo, or just john?
The above can't be implemented when trying to find an exact name. How can I make the LIKE query to fetch the customer even if 2 strings are given?
If the pattern to be matched is
string1<space>anything<space>string2
you can write:
like string1||' % '||string2
Why not this
select * from cust where name Like 'John%Matt' ;
SELECT *
FROM custtable
WHERE upper(NAME) LIKE '%' || upper(:first_word) || '%'
AND upper(NAME) LIKE '%' || upper(:second_word) || '%'
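The two-condition query above can be checked against the sample names from the question. This is a sketch run on SQLite through Python's sqlite3 module rather than on Oracle, with positional ? parameters standing in for the :first_word and :second_word binds; the helper find_customers and the NAMES list are hypothetical.

```python
import sqlite3

NAMES = ["John Deo Matt", "Max Ruin", "Arnold",
         "Krish Star HN", "John Mike", "Alex John"]


def find_customers(first_word: str, second_word: str) -> list:
    """Return (id, name) rows whose name contains both words, in any order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cust (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO cust (name) VALUES (?)",
                     [(n,) for n in NAMES])
    rows = conn.execute(
        "SELECT id, name FROM cust "
        "WHERE upper(name) LIKE '%' || upper(?) || '%' "
        "AND upper(name) LIKE '%' || upper(?) || '%' "
        "ORDER BY id",
        (first_word, second_word),
    ).fetchall()
    conn.close()
    return rows


print(find_customers("John", "Matt"))
print(find_customers("Matt", "Deo"))
```

Because each word gets its own LIKE condition, word order in the search string does not matter, which covers the "Matt Deo" case from the question.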
Must you use LIKE? Oracle has plenty of more powerful search options.
http://docs.oracle.com/cd/B19306_01/server.102/b14220/content.htm#sthref2643
I'd look at those.
I believe you need REGEXP_LIKE( ):
SQL> with tbl(name) as (
select 'John Deo Matt' from dual
)
select name
from tbl
where regexp_like(name, 'matt|deo', 'i');
NAME
-------------
John Deo Matt
SQL>
Here the regex string specifies that name contains 'matt' OR 'deo', and the 'i' means it's case-insensitive. The order of the names does not matter.
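The same alternation can be tried in Python to see what it does and does not match. This is only a sketch with a hypothetical helper, matches_any, using Python's re module rather than Oracle's REGEXP_LIKE. Note that it has OR semantics: a name containing just one of the words also matches.

```python
import re
from typing import Sequence


def matches_any(name: str, words: Sequence[str]) -> bool:
    """True if name contains at least one of the words, ignoring case."""
    pattern = "|".join(re.escape(w) for w in words)
    return re.search(pattern, name, flags=re.IGNORECASE) is not None


print(matches_any("John Deo Matt", ["matt", "deo"]))
print(matches_any("Max Ruin", ["matt", "deo"]))
```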

Using a Table-Valued Function to Turn a Single Row into Many Within Select

Simple enough question I think.
I have a dataset, quite large, with a bit of free-text name data. I need to link this to our employee table.
There's a whole set of different ways people have entered the 'owner' into this field (John Smith, J.Smith, John Smith (JSMITH), Company:John Smith/Client: John Smith, etc.)
Most of these are fine, but the problem I have is with the ones where multiple names have been entered. For example; "John Smith / Joe Bloggs".
I have a pre-created Table-Valued function which takes in a string and a delimiter, then returns a table with the results of the split.
dbo.Split('John Smith / Joe Bloggs')
id val
1 John Smith
2 Joe Bloggs
The issue I have is that I need these results to come back for each row within an existing dataset. So for example, my query selecting the Owner, RefNumber and OSProjectCode from my 'ProjectActions' table containing the following data:
RefNumber OSProjectCode Owner
1 1234 Bill Baggins
2 1234 John Smith / Joe Bloggs
would come out looking like this:
RefNumber OSProjectCode Owner
1 1234 Bill Baggins
2 1234 John Smith
2 1234 Joe Bloggs
What I've tried so far is to join on the results of the function, but unsurprisingly it won't let me send the column from ProjectsActions into the function like that.
SELECT a.val AS [Owner], pa.[RefNumber], pa.[OSProjectCode]
FROM dbo.ProjectsActions pa
INNER JOIN dbo.Split(pa.[Owner], '/') a
Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "pa.Owner" could not be bound.
The only way I can think of doing this, which seems a little too bulky and messy, is the below:
;with base as(
SELECT
pa.RefNumber
, pa.OSProjectCode
, (SELECT val FROM dbo.Eval(pa.Owner) WHERE id = 1) AS [First]
, (SELECT val FROM dbo.Eval(pa.Owner) WHERE id = 2) AS [Second]
FROM ProjectsActions pa
)
SELECT
a.RefNumber
, a.OSProjectCode
, a.First AS [Owner]
FROM base a WHERE a.First IS NOT NULL
UNION ALL
SELECT
b.RefNumber
, b.OSProjectCode
, b.Second AS [Owner]
FROM base b WHERE b.Second IS NOT NULL
Surely there's a better way? Something more similar to my first attempt - joining to the results within each row?
Any feedback or ideas would be much appreciated.
Cheers,
Scott.
EDIT:
FYI, if anyone comes across this with a similar issue but is missing the 'split' part: I use a function found elsewhere on stackoverflow. https://stackoverflow.com/a/14600765/1700309
You need to use an APPLY as your join.
SELECT
a.val AS [Owner],
pa.[RefNumber],
pa.[OSProjectCode]
FROM dbo.ProjectsActions pa
CROSS APPLY dbo.Split(pa.[Owner], '/') a
CROSS APPLY acts like an INNER JOIN that passes the row-level value into your table-valued function. If your split function returns no rows when it can't split the value (NULL, empty, etc.), use OUTER APPLY so that the row isn't dropped from your result set. You can also add a COALESCE to fall back to the [Owner].
SELECT
COALESCE(a.val, pa.[Owner]) AS [Owner],
pa.[RefNumber],
pa.[OSProjectCode]
FROM dbo.ProjectsActions pa
OUTER APPLY dbo.Split(pa.[Owner], '/') a
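The difference between the two APPLY forms can be sketched as a nested loop. The Python below is an illustration only: split stands in for dbo.Split (assumed here to trim whitespace and to return no rows for NULL or empty input, which the real function may not do), and the third sample row with a missing owner is hypothetical.

```python
from typing import Iterator, List, Optional, Tuple

Row = Tuple[int, int, Optional[str]]  # (RefNumber, OSProjectCode, Owner)


def split(value: Optional[str], delimiter: str = "/") -> List[Tuple[int, str]]:
    """Stand-in for dbo.Split: returns (id, val) pairs, or no rows at all."""
    if not value:
        return []
    parts = [p.strip() for p in value.split(delimiter) if p.strip()]
    return list(enumerate(parts, start=1))


def cross_apply(rows: List[Row]) -> Iterator[tuple]:
    """Like CROSS APPLY: rows for which split() returns nothing vanish."""
    for ref, code, owner in rows:
        for _id, val in split(owner):
            yield (val, ref, code)


def outer_apply(rows: List[Row]) -> Iterator[tuple]:
    """Like OUTER APPLY with COALESCE(a.val, pa.Owner): every row survives."""
    for ref, code, owner in rows:
        pieces = split(owner)
        if not pieces:
            yield (owner, ref, code)
        for _id, val in pieces:
            yield (val, ref, code)


project_actions: List[Row] = [
    (1, 1234, "Bill Baggins"),
    (2, 1234, "John Smith / Joe Bloggs"),
    (3, 1234, None),  # hypothetical row with no owner
]

print(list(cross_apply(project_actions)))
print(list(outer_apply(project_actions)))
```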

Another transpose rows to columns question- SQL Server 2005

Sorry, I know similar questions have been asked many times before, but like most I haven't been able to find one that fits what I need or gives me enough information to be able to expand towards the solution.
I suspect I could use the PIVOT or UNPIVOT commands of T-SQL but I can't follow the MSDN explanation and there don't seem to be any "idiots" type guides out there!
Onto my problem - I need to convert a wide table consisting of many columns to a row based result set for a report. No aggregation needed and I'm trying to avoid a repeated UNION based select if possible.
The table result set is formatted as such (there are actually many more person columns! :s ):
Person1 | Person2 | Person3 | Person4 | Person5 | Person6 | Person7 | Person8
-----------------------------------------------------------------------------
Bob Sam Tom Alex Paul Ann Jill Jane
What I really need is to be able to produce the following:
Person
--------------------
Bob
Sam
Tom
Alex
Paul
Ann
Jill
Jane
A bonus would be able to create a result set such as:
Column Person
--------------------
Person1 Bob
Person2 Sam
Person3 Tom
Person4 Alex
Person5 Paul
Person6 Ann
Person7 Jill
Person8 Jane
How can this be achieved using T-SQL in SQL Server 2005?
Thanks for any help,
Paul.
--Edit--
Thanks to Martin I've learnt something new this morning and I've managed to get exactly what I needed. In the end I had to modify the example slightly to get what I needed but that's because my original example left out some detail that I hadn't realised would be important!
My final piece of code looked like this for anyone else that has such a problem:
WITH Query_CTE([Person1 Title],[Person2 Title],[Person3 Title])
AS
--CTE Expression and column list
(
SELECT
--Converted to create a common data type.
CONVERT(NVARCHAR(MAX),Person1Title) AS 'Person1 Title',
CONVERT(NVARCHAR(MAX),Person2Title) AS 'Person2 Title',
CONVERT(NVARCHAR(MAX),Person3Title) AS 'Person3 Title'
FROM Table_Name
WHERE KeyId = 'XXX'
)
SELECT *
FROM Query_CTE
UNPIVOT
(Person FOR [Column] IN
(
[Person1 Title],
[Person2 Title],
[Person3 Title]
)
)AS unpvt;
WITH T(Person1,Person2 /* etc....*/) AS
(
SELECT 'Bob','Sam' /* etc....*/
)
SELECT *
FROM T
UNPIVOT
(Person FOR [Column] IN
(Person1,Person2 /* etc....*/)
)AS unpvt;
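What UNPIVOT does to a single wide row can be sketched in a few lines of Python. This is an illustration of the transformation, not of how SQL Server implements it; the function name unpivot and the sample dictionary are hypothetical. It also mirrors the fact that UNPIVOT leaves out columns whose value is NULL.

```python
from typing import Dict, List, Optional, Tuple


def unpivot(row: Dict[str, Optional[str]]) -> List[Tuple[str, str]]:
    """Turn one wide row into (Column, Person) pairs, skipping NULL cells."""
    return [(column, person) for column, person in row.items()
            if person is not None]


wide_row = {"Person1": "Bob", "Person2": "Sam", "Person3": None, "Person4": "Alex"}

for column, person in unpivot(wide_row):
    print(column, person)
```

The pairs correspond to the "bonus" result set in the question; dropping the first element of each pair gives the single Person column.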
To easily transpose columns into rows along with their names, you can use XML. I described this with an example on my blog: http://sql-tricks.blogspot.com/2011/04/sql-server-rows-transpose.html