Get multiple occurrences of a string from a column in a SQL query

I have a table which has the following data
Ticketid created Details
205853669 2020-03-05 #CLOSE# Next action value://346004/ next action value://346002/ or value://346008/
205853670 2020-03-06 #Archive Next action value://346088/ next action value://346077/ or value://346057/
The "value://" pattern is the same in every row. I want to extract the numbers that follow it, one per output row:
ticketid Numbers
205853669 346004
205853669 346002
205853669 346008
205853670 346088
205853670 346077
205853670 346057
I am using Standard SQL only.
So far I have something like this:
select ticketid,
  TRIM(REPLACE(SUBSTR(details, STRPOS(details, "value://"), 10), "value://", "")) AS number
from table

Below is for BigQuery Standard SQL
#standardSQL
SELECT Ticketid, Numbers
FROM `project.dataset.table`,
UNNEST(REGEXP_EXTRACT_ALL(Details, r'value://(\d+)/')) Numbers
Applied to the sample data from your question, the output is:
Row Ticketid Numbers
1 205853669 346004
2 205853669 346002
3 205853669 346008
4 205853670 346088
5 205853670 346077
6 205853670 346057
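To check the pattern outside BigQuery, here is a minimal Python sketch of the same idea: find every 'value://<digits>/' match in a row and emit one output row per match. The function name extract_numbers and the in-memory rows list are illustrative assumptions, not part of the original answer.

```python
import re

# Sample rows copied from the question: (Ticketid, Details).
rows = [
    (205853669, "#CLOSE# Next action value://346004/ next action value://346002/ or value://346008/"),
    (205853670, "#Archive Next action value://346088/ next action value://346077/ or value://346057/"),
]

def extract_numbers(rows):
    """Emit one (ticketid, number) pair per 'value://<digits>/' match.

    re.findall plays the role of REGEXP_EXTRACT_ALL, and the nested
    loop plays the role of UNNEST (one output row per array element).
    """
    pattern = re.compile(r"value://(\d+)/")
    return [
        (ticketid, number)
        for ticketid, details in rows
        for number in pattern.findall(details)
    ]

result = extract_numbers(rows)
```

As in the query, the numbers come back as strings; cast them if an integer column is needed.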

The query below should work. It splits the details on 'value://' and then extracts the 6-digit id from each piece.
with `project.dataset.table` as (
select id, split(details, 'value://') AS number from (
select '1' as id, '#CLOSE# Next action value://346004/ next action value://346002/ or value://346008/' as details
union all
select '2' as id, '#Archive Next action value://346088/ next action value://346077/ or value://346057/'
)
)
select id, regexp_extract(number1, "\\d{6}") as number
from `project.dataset.table` ,
UNNEST( number ) number1
where regexp_extract(number1, "\\d{6}") is not null
One remark about the UNNEST function. As per the documentation:
The UNNEST operator takes an ARRAY and returns a table, with one row for each element in the ARRAY.
If you have only a few 'value://' occurrences in each comment, this won't cause much of a problem, but if the number of 'value://' occurrences is unbounded, it might become a performance bottleneck, so keep that in mind. On the other hand, this is the only way I know to achieve this using CloudSQL.
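The split-then-extract logic can be sketched in Python as follows. This is only an illustration of the technique; the function name extract_ids_by_split is an assumption and does not appear in the answer.

```python
import re

def extract_ids_by_split(details):
    """Split on 'value://' and pull the first 6-digit run from each piece.

    Pieces without a 6-digit run (such as the text before the first
    'value://') are skipped, like the IS NOT NULL filter in the query.
    """
    ids = []
    for piece in details.split("value://"):
        match = re.search(r"\d{6}", piece)
        if match:
            ids.append(match.group())
    return ids

sample = "#CLOSE# Next action value://346004/ next action value://346002/ or value://346008/"
ids = extract_ids_by_split(sample)
```

Note that this variant depends on the ids being exactly 6 digits long, which holds for the sample data.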

Related

max consecutive digits in a string

I am trying to find the maximum number of consecutive digits that appear in a string column. Let me give an example to illustrate what I am trying to do. Suppose I have a table called email:
email
lucas1234#gmail.com
fer12#gmail.com
lupal#gmail.com
carlos1perez222#gmail.com
carlos11perez222#gmail.com
lucila1#gmail.com
my expected output would be
email count_cons_digits
lucas1234#gmail.com 4
fer12#gmail.com 2
lupal#gmail.com 0
carlos1perez222#gmail.com 3
carlos11perez222#gmail.com 3
lucila1#gmail.com 1
Note that this question is very similar to:
Number of consecutive digits in a column string
The only difference is that the function from those answers does not handle emails with only one digit (like lucila1#gmail.com): the expected result is 1, but the proposed function gives 0. It also fails when the email contains two separate runs of consecutive digits (carlos11perez222#gmail.com): the expected output is 3, but it gives 5.
Consider the approach below:
select *,
ifnull((select length(digits) len
from unnest(regexp_extract_all(email, r'\d+')) digits
order by len desc
limit 1
), 0) as count_cons_digits
from your_table
If applied to the sample data in your question, the output matches the expected result shown there.
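The same logic (collect every run of digits, keep the longest, fall back to 0) can be sketched in Python to verify the expected values. The function name max_consecutive_digits is an assumption made for this illustration.

```python
import re

def max_consecutive_digits(text):
    """Return the length of the longest run of digits in text, or 0."""
    runs = re.findall(r"\d+", text)  # every maximal run of digits
    return max((len(run) for run in runs), default=0)

# Sample emails copied from the question.
emails = [
    "lucas1234#gmail.com",
    "fer12#gmail.com",
    "lupal#gmail.com",
    "carlos1perez222#gmail.com",
    "carlos11perez222#gmail.com",
    "lucila1#gmail.com",
]
counts = [max_consecutive_digits(email) for email in emails]
```

The default=0 argument covers emails with no digits at all, which is what IFNULL(..., 0) does in the query.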
You may also try this approach using regex:
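WITH email AS (
  SELECT 'lucas1234#gmail.com' AS mail
  UNION ALL SELECT 'fer12#gmail.com'
  UNION ALL SELECT 'lupal#gmail.com'
  UNION ALL SELECT 'carlos1perez222#gmail.com'
  UNION ALL SELECT 'carlos11perez222#gmail.com'
  UNION ALL SELECT 'lucila1#gmail.com'
)
-- Strip a leading letters-digits-letters group, then strip the remaining
-- letters, dots and '#'; the length of what is left is the digit count.
-- This works for the sample data; other digit layouts may not yield the true maximum.
SELECT mail,
  LENGTH(REGEXP_REPLACE(REGEXP_REPLACE(mail, r'[A-Za-z]+\d+[A-Za-z]+', ''), r'[A-Za-z.#]+', '')) AS count_cons_digits
FROM email;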

Return 0 if no row found in SQL Server using Pivot

Thanks everyone, and thank you #Aaron Bertrand, your answer solved my problem :) !
I am struggling to find a solution to my problem. Here is my query in SQL Server.
EDIT: a few more details. This is the kind of data that I have in my table:
identifier  date      status
1           20220421  have a book
2           20220421  have a pdf
3           20220421  have a pdf
4           20220421  have a book
5           20220421  have a book
6           20220421  have a book
My query gives this result:
have a book  have a pdf
4            2
So when there are no records for a date, I need a query that returns:
have a book  have a pdf
0            0
instead of:
have a book  have a pdf
SELECT * FROM
(
  SELECT status FROM database.dbo.MyTable
  WHERE date = '20220421'
    AND status IN ('have a book', 'have a pdf')
) y
PIVOT (COUNT(status) FOR y.status IN ([have a book], [have a pdf])) pivot_table
This query works well, but my issue is that I want to display 0 in the results if no row is found. I tried IsNull, which works without the Pivot part, but I wasn't able to make it work with the Pivot.
Thanks in advance :)
Since we're only dealing with bits and one or zero rows for a given date, you can just add a union to include a second row with zeros, and take the max (which will either pull the 0 or 1 from the real row, or the zeros from the dummy row when a real row doesn't exist).
SELECT [have a book] = MAX([have a book]),
[have a pdf] = MAX([have a pdf])
FROM
(
SELECT [have a book], [have a pdf] FROM
(
SELECT status FROM dbo.whatever
WHERE date = '20220421'
AND status IN ('have a book','have a pdf')
) AS src PIVOT
(
COUNT(status) FOR status IN
([have a book],[have a pdf])
) AS pivot_table
UNION ALL SELECT 0,0
) AS final;
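The dummy-row trick can be sketched in Python to show why MAX picks the right values in both cases. This is an illustration under assumptions: the function name pivot_counts and the in-memory rows list are hypothetical, and the empty real_rows list stands in for the pivot returning no row.

```python
from collections import Counter

STATUSES = ("have a book", "have a pdf")

# Sample rows copied from the question: (date, status).
rows = [
    ("20220421", "have a book"),
    ("20220421", "have a pdf"),
    ("20220421", "have a pdf"),
    ("20220421", "have a book"),
    ("20220421", "have a book"),
    ("20220421", "have a book"),
]

def pivot_counts(rows, date):
    """Count each status for one date, returning zeros when nothing matches."""
    counts = Counter(
        status for row_date, status in rows
        if row_date == date and status in STATUSES
    )
    # The pivot yields one row of counts, or no row at all for an empty date.
    real_rows = [tuple(counts[s] for s in STATUSES)] if counts else []
    # UNION ALL SELECT 0,0 adds the dummy row; MAX per column picks the winner.
    candidates = real_rows + [(0, 0)]
    return tuple(max(column) for column in zip(*candidates))

with_data = pivot_counts(rows, "20220421")
without_data = pivot_counts(rows, "20220422")
```

Because counts are never negative, the dummy row of zeros can only win when no real row exists.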

How can I right-pad continuous numbers to one column in a BigQuery table?

Here is my problem:
I have one column in the table, type of Integer.
The length of any entity in the column is 7 and thus fixed.
I want to right-pad 0000 to 9999 to every entry in this column, so one entry in the original table will correspond to 10k new rows in a new table.
For example:
The first entry of the original table is '1234567',
I want to generate:
12345670000
12345670001
12345670002
12345670003
...
12345679999
How could I achieve this?
Below is for BigQuery Standard SQL
#standardSQL
SELECT value * 10000 + step AS value
FROM `project.dataset.table`,
UNNEST(GENERATE_ARRAY(0, 9999)) step
You can test and play with the above using the simplified example from your question, as shown below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1234567 value
)
SELECT value * 10000 + step AS value
FROM `project.dataset.table`,
UNNEST(GENERATE_ARRAY(0, 9999)) step
-- ORDER BY value
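The arithmetic behind the query is easy to verify in Python: multiplying by 10000 shifts the value four digits to the left, and adding each step from 0 to 9999 fills those digits in. The function name right_pad_range and its width parameter are assumptions made for this sketch.

```python
def right_pad_range(value, width=4):
    """Return value followed by every zero-padded suffix of the given width."""
    scale = 10 ** width  # 10000 for a 4-digit suffix
    # range(scale) mirrors GENERATE_ARRAY(0, 9999), whose upper bound is inclusive.
    return [value * scale + step for step in range(scale)]

padded = right_pad_range(1234567)
```

This only works because the column is an integer of fixed length; for string columns the suffix would need explicit zero-padding instead.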

Find preceding and following rows for a matching row in BigQuery?

Is it possible to find the rows preceding and following a matching row in a BigQuery query? For example, if I do:
select textPayload from logs.logs_20160709 where textPayload like "%something%"
and say that I get these results back:
something A
something B
How can I also show the 3 rows preceding and following the matching rows? Something like this:
some text 1
some text 2
some text 3
something A
some text 4
some text 5
some text 6
some text 90
some text 91
some text 92
something B
some text 93
some text 94
some text 95
Is this possible and if so how?
While on Zuma Beach - I was thinking of avoiding CROSS JOIN in my original answer.
Check the query below. It should be much cheaper, especially for a big data set:
SELECT textPayload
FROM (
SELECT textPayload,
SUM(match) OVER(ORDER BY ts ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING) AS flag
FROM (
SELECT textPayload, ts, IF(textPayload CONTAINS 'something', 1, 0) AS match
FROM YourTable
)
)
WHERE flag > 0
Of course, another way to avoid the cross join is to use BigQuery Standard SQL. But still, the solution above, with no joins at all, is better than my original answer.
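The window-sum idea from the join-free query can be sketched in Python: mark each row with 1 or 0, sum the marks over a window of 3 rows before and 3 rows after, and keep rows whose sum is positive. The function name rows_with_context is hypothetical, and the input list is assumed to be already ordered by ts.

```python
def rows_with_context(lines, needle, radius=3):
    """Keep every line that lies within radius rows of a matching line."""
    match = [1 if needle in line else 0 for line in lines]
    kept = []
    for i, line in enumerate(lines):
        start = max(0, i - radius)
        # Equivalent of SUM(match) OVER(ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING).
        flag = sum(match[start:i + radius + 1])
        if flag > 0:
            kept.append(line)
    return kept

lines = [
    "some text 0", "some text 1", "some text 2", "some text 3",
    "something A",
    "some text 4", "some text 5", "some text 6", "some text 7",
]
context = rows_with_context(lines, "something")
```

Overlapping windows are handled naturally: a row near two matches simply gets a flag of 2 and is still returned once.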
I think one piece is missing in your example: an extra field that defines the order, so I added a ts field for this in my answer. This means I assume your table has two fields involved: textPayload and ts.
Try the query below. It should give you exactly what you need:
SELECT
all.textPayload
FROM (
SELECT start, finish
FROM (
SELECT textPayload,
LAG(ts, 3) OVER(ORDER BY ts ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS start,
LEAD(ts, 3) OVER(ORDER BY ts ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING) AS finish
FROM YourTable
)
WHERE textPayload CONTAINS 'something'
) AS matches
CROSS JOIN YourTable AS all
WHERE all.ts BETWEEN matches.start AND matches.finish
Please note: depending on the type of your ts field, you might need to do some casting in the query for this field. Hopefully not.

SQL: Combining the top 2 field values into 1 value

I have a very simple query that returns the Notes field. Since there can be multiple notes, I only want the top 2. No problem. However, I'm going to be using the sql within another query. I really don't want 2 lines in my results. I would like to combine the results into 1 field value so I only have 1 result line in the results. Is this possible?
For example, I currently get the following:
12345 1001 500.00 "Note 1"
12345 1001 500.00 "Note 2"
What I would like to see is this:
12345 1001 500.00 "Note 1 AND Note 2"
Following is the sql:
select top 2 rcai.field_value
from rnt_agrs ra
inner join rnt_agr_inv_notes rain on ra.rnt_agr_nbr=rain.rea_rnt_agr_nbr
inner join RNT_CUST_ADDNL_INFO rcai on rain.rea_rnt_agr_nbr=rcai.rea_rnt_agr_nbr and rain.bac_acc_id=rcai.bac_acct_id
where ra.rnt_agr_nbr=128260511
Thanks for your help. I appreciate this forum for help with these issues.....
Get the next row's value and filter all but the first row:
select ..., rcai.field_value || ' AND ' ||
min(rcai.field_value) -- next row's value (same as LEAD in Standard SQL)
over (partition by ra.rnt_agr_nbr
order by rcai.field_value
rows between 1 following and 1 following) as next_field_value
from rnt_agrs ra
inner join rnt_agr_inv_notes rain on ra.rnt_agr_nbr=rain.rea_rnt_agr_nbr
inner join RNT_CUST_ADDNL_INFO rcai on rain.rea_rnt_agr_nbr=rcai.rea_rnt_agr_nbr and rain.bac_acc_id=rcai.bac_acct_id
where ra.rnt_agr_nbr=128260511
qualify
row_number() -- only the first row
over (partition by ra.rnt_agr_nbr
order by rcai.field_value) = 1
If there might be only a single row you need to add a COALESCE(min...,'') to get rid of the NULL.
Both OLAP functions specify the same PARTITION and ORDER, so this is a single working step.
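As a rough illustration of the "next row's value" technique, the Python sketch below orders the notes, takes the first one, and appends the following one if it exists. The function name combine_top_two is an assumption. Unlike COALESCE(min..., ''), this variant returns the first note alone when there is no second row, so no dangling ' AND ' is left behind.

```python
def combine_top_two(notes):
    """Join the first two notes (in sorted order) with ' AND '."""
    ordered = sorted(notes)  # mirrors ORDER BY rcai.field_value
    if not ordered:
        return None  # no rows at all for this agreement
    first = ordered[0]  # the row kept by ROW_NUMBER() = 1
    # The value one row ahead, like LEAD or "1 following".
    following = ordered[1] if len(ordered) > 1 else None
    return first if following is None else first + " AND " + following

combined = combine_top_two(["Note 2", "Note 1", "Note 3"])
single = combine_top_two(["Note 1"])
```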
select *,(SELECT top 2 rcai.field_value + ' AND ' AS [text()]
FROM RNT_CUST_ADDNL_INFO rcai
WHERE rcai.rea_rnt_agr_nbr = rain.rea_rnt_agr_nbr
AND rcai.bac_acct_id=rain.bac_acc_id
FOR XML PATH('')) AS Notes
from
rnt_agrs ra inner join rnt_agr_inv_notes rain
on ra.rnt_agr_nbr=rain.rea_rnt_agr_nbr
I had something like this, where there was a 1 to many, and I wanted a semicolon delimited set of values in a single column with the main record.
You could use PIVOT to transform the two note rows into two note columns based on row number, then concatenate them. Here's an example:
SELECT pvt.[1] + ' and ' + pvt.[2]
FROM
( --the selection of your table data, including a row-number column
SELECT Msg, ROW_NUMBER() OVER(ORDER BY Id)
--sample data shown here, but this would be your real table
FROM (VALUES(1, 'Note 1'), (2, 'Note 2'), (3, 'Note 3')) Note(Id, Msg)
) Data (Msg, Row)
PIVOT (MAX(Msg) FOR Row IN ([1], [2])) pvt
Note that MAX is used for the aggregate in the PIVOT since an aggregate is required, but since ROW_NUMBER is unique, you're only aggregating a single value.
This could also be easily extended to the first N rows - just include the row numbers you want in the pivot and combine them as desired in the select statement.