BigQuery Bytes field individual field - google-bigquery

I have a table in BigQuery with a BYTES field. In every row this field contains 6 bytes. How can I query the table so only rows are returned where the second byte is A, E, 2 or 6?
Thanks in advance,
Evert

You can use the SUBSTR function to extract the byte at that position. Here is an example that you can run:
#standardSQL
WITH T AS (
SELECT b'abcdef' AS s UNION ALL
SELECT b'ABCDEF' UNION ALL
SELECT b'123456' UNION ALL
SELECT b'765432'
)
SELECT s
FROM T
WHERE SUBSTR(s, 2, 1) IN UNNEST(SPLIT(b'AE26', b''));
To use your own table, remove the WITH T AS (...) part and replace T in the FROM clause with your table name. To match more characters, add them to the bytes literal that is passed to SPLIT.
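The same filter can be sketched outside SQL to make the position logic explicit. This is a minimal Python illustration, not BigQuery code; the variable names are made up for the example, and it assumes every value has at least two bytes (the question states six).

```python
# Sample values mirroring the BYTES literals in the query above.
rows = [b"abcdef", b"ABCDEF", b"123456", b"765432"]

# Bytes accepted in the second position (the same characters passed to SPLIT).
allowed = b"AE26"

# SQL positions are 1-based, Python indexes are 0-based, so the
# second byte is row[1]. Indexing bytes yields an int, and
# "int in bytes" checks membership byte by byte.
matches = [row for row in rows if row[1] in allowed]

print(matches)  # [b'123456', b'765432']
```

Only the last two rows match, because their second bytes are 2 and 6; the comparison is case-sensitive, so b and B do not match.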

Related

max consecutive digits in a string

I am trying to find the maximum number of consecutive digits that appear in a string column. Here is an example to illustrate what I am trying to do. Suppose I have a table called email:
email
lucas1234#gmail.com
fer12#gmail.com
lupal#gmail.com
carlos1perez222#gmail.com
carlos11perez222#gmail.com
lucila1#gmail.com
my expected output would be
email count_cons_digits
lucas1234#gmail.com 4
fer12#gmail.com 2
lupal#gmail.com 0
carlos1perez222#gmail.com 3
carlos11perez222#gmail.com 3
lucila1#gmail.com 1
Note that this question is very similar to:
Number of consecutive digits in a column string
The difference is that the function from those answers does not handle emails with only one digit (like lucila1#gmail.com). In that case the expected result is 1, but the proposed function gives 0. It also fails when the email contains two separate runs of consecutive digits (carlos11perez222#gmail.com). In that case the expected output is 3, but the function gives 5.
Consider the approach below:
select *,
  ifnull((
    select length(digits) len
    from unnest(regexp_extract_all(email, r'\d+')) digits
    order by len desc
    limit 1
  ), 0) as count_cons_digits
from your_table
Applied to the sample data in your question, this produces the expected output shown there.
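The idea behind the query (extract every run of digits, then take the longest) can be checked with a small Python sketch. The function name is hypothetical and the code is an illustration of the logic, not a BigQuery replacement.

```python
import re


def max_consecutive_digits(text: str) -> int:
    """Length of the longest run of digits in text, or 0 if there is none."""
    # re.findall with \d+ returns each maximal run of digits,
    # like regexp_extract_all(email, r'\d+') in the query.
    runs = re.findall(r"\d+", text)
    return max((len(run) for run in runs), default=0)


emails = [
    "lucas1234#gmail.com",
    "fer12#gmail.com",
    "lupal#gmail.com",
    "carlos1perez222#gmail.com",
    "carlos11perez222#gmail.com",
    "lucila1#gmail.com",
]

counts = {email: max_consecutive_digits(email) for email in emails}
for email, count in counts.items():
    print(email, count)
```

The default=0 argument plays the role of ifnull(..., 0): an email without digits yields an empty list of runs.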
You may also try this approach using regex:
WITH email AS
(SELECT 'lucas1234#gmail.com' mail,
UNION ALL SELECT 'fer12#gmail.com',
UNION ALL SELECT 'lupal#gmail.com',
UNION ALL SELECT 'carlos1perez222#gmail.com',
UNION ALL SELECT 'carlos11perez222#gmail.com',
UNION ALL SELECT 'lucila1#gmail.com')
SELECT email,
(LENGTH(REGEXP_REPLACE(REGEXP_REPLACE(email.mail, r'[A-Za-z]+\d+[A-Za-z]+', ''),r'[A-Za-z.#]+',''))) AS count_cons_digits,
FROM email;

Select records based on variable known value

I need to select a series of records based on a known value. The record IDs follow a hierarchical structure where the first 4 characters are constant (IA09), followed by 4 digits representing an organization, followed by 4 characters representing an entity within the parent organization. The goal is to select all "child" records of the known value, as well as the "known value" record.
Sample Data Set:
IA0900000000
IA0912340000
IA0912340109
IA0912340418
IA0912340801
IA0945810000
IA0945810215
IA0945810427
IA0945810454
Here is the same dataset, indented to illustrate the hierarchical structure.
IA0900000000
    IA0912340000
        IA0912340109
        IA0912340418
        IA0912340801
    IA0945810000
        IA0945810215
        IA0945810427
        IA0945810454
Example 1
If the known value is 'IA0900000000', I need to select all records in the dataset.
Example 2
If the known value is 'IA0945810000', I need to select all records that begin with 'IA094581'
Example 3
If the known value is 'IA0912340109', I need to select ONLY the record with that ID as it has no child records.
The actual dataset is much larger than this sample, and the known value will be different for each user of the database.
Is there a simple comparison I could employ in the WHERE clause that will give me the correct subset of records?
Assuming your table is called YourTable and the column is called column, you could remove the trailing zeros from your search term, append a wildcard (%), and use the LIKE operator like so:
SELECT *
FROM YourTable
WHERE column LIKE TRIM(TRAILING '0' FROM 'IA0945810000') || '%'
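One caveat: TRIM removes every trailing zero, so an organization or entity code that itself ends in 0 (for example 1230) would lose part of its code. The following Python sketch shows a variation that strips only whole all-zero 4-character segments. It assumes the 4+4+4 layout described in the question; the function names are hypothetical, and this is a different technique from the TRIM query above, shown only to illustrate the prefix logic.

```python
def child_prefix(known_id: str) -> str:
    """Prefix shared by known_id and all of its child records."""
    # Split into 4-character segments: constant part, organization, entity.
    segments = [known_id[i:i + 4] for i in range(0, len(known_id), 4)]
    # A trailing "0000" segment means "no value at this level", so drop it.
    # The first segment (the constant part) is always kept.
    while len(segments) > 1 and segments[-1] == "0000":
        segments.pop()
    return "".join(segments)


ids = [
    "IA0900000000", "IA0912340000", "IA0912340109",
    "IA0912340418", "IA0912340801", "IA0945810000",
    "IA0945810215", "IA0945810427", "IA0945810454",
]


def select_children(known_id: str) -> list[str]:
    """Equivalent of WHERE column LIKE prefix || '%' on the sample data."""
    prefix = child_prefix(known_id)
    return [record for record in ids if record.startswith(prefix)]


print(select_children("IA0945810000"))
print(select_children("IA0912340109"))  # ['IA0912340109']
```

With this variation, 'IA0912300000' yields the prefix 'IA091230' rather than 'IA09123', so it cannot match organizations 1231 through 1239.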
You can use simple "like" in your case:
with t(org) as ( -- test data:
  select 'IA0900000000' from dual union all
  select 'IA0912340000' from dual union all
  select 'IA0912340109' from dual union all
  select 'IA0912340418' from dual union all
  select 'IA0912340801' from dual union all
  select 'IA0945810000' from dual union all
  select 'IA0945810215' from dual union all
  select 'IA0945810427' from dual union all
  select 'IA0945810454' from dual
)
select
  regexp_replace(
    regexp_replace(
      regexp_replace(t.org, '0{4}')
      , '0{4}')
    , '(.{4})'
    , '\1.'
  ) as short_org_path -- just for better readability
  , t.*
from t
where t.org like regexp_replace(regexp_replace('&input_org', '0{4}'), '0{4}') || '%'
/

Find Max Value in string/integers in a column in SQL

I have a table with a column UniqueID of type varchar. This column has unique IDs labeled as follows:
DU19F0001
DU19M001
DU19M002
DU19F002
EL19F001
EL19F002
MU19M001
MU19M002
I am trying to select the maximum value based on this mixed string. For instance, what is the last value for 'DU' '19' 'F'? The result should be DU19F002. How do I write a query that selects the maximum value based on a mix of strings and integers in a column?
Is this what you want?
select max(uniqueid)
from t
where uniqueid like 'DU19F%';
If you want to do this for the first 5 (or whatever) characters, you can do this:
select
  first_part,
  max(uniqueid)
from (
  select substring(uniqueid, 1, 5) as first_part, uniqueid
  from <your table>
) t
group by first_part
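The grouping logic can be sketched in Python as follows. This illustrates the idea rather than replacing the query; the variable names are made up. It relies on plain string comparison, so it only finds the "latest" ID when the numeric suffixes within a group are zero-padded to a comparable width, and a database may order strings differently depending on its collation.

```python
ids = [
    "DU19F0001", "DU19M001", "DU19M002", "DU19F002",
    "EL19F001", "EL19F002", "MU19M001", "MU19M002",
]

# Map each 5-character prefix to the largest ID seen so far,
# mirroring GROUP BY first_part with MAX(uniqueid).
latest = {}
for uid in ids:
    prefix = uid[:5]  # same as substring(uniqueid, 1, 5)
    if prefix not in latest or uid > latest[prefix]:
        latest[prefix] = uid

print(latest["DU19F"])  # DU19F002
```

Note that DU19F0001 sorts before DU19F002 because the comparison is character by character, which matches the result the question expects.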

T-SQL QUERY , COUNT WITH TRAILER RECORD

I have code that starts with ALTER PROCEDURE and continues with header, detail, and trailer statements. I used SELECT DISTINCT for the header and the detail, with a UNION between them. I was wondering whether I could get the total number of header and detail records in a specific column of the trailer row. For now I have used
CONVERT(bigint, count(*) ) as Recordcount,
but it displays 498 rows, whereas we actually have 475 rows above the trailer row. I think it is counting the total number of rows in the SQL query.
COUNT(*) does count the total number of rows in a dataset; including any rows that are completely made up of the value NULL. Take, for example:
WITH VTE AS(
SELECT CONVERT(int,NULL) AS N
UNION ALL
SELECT CONVERT(int,NULL) AS N
UNION ALL
SELECT CONVERT(int,NULL) AS N
UNION ALL
SELECT CONVERT(int,NULL) AS N
UNION ALL
SELECT CONVERT(int,NULL) AS N
UNION ALL
SELECT 1 AS N)
SELECT COUNT(*)
FROM VTE;
Notice this returns 6, not 1. If you wanted the value 1, then you'd need to use COUNT(N).
Without sample data, this is pure guesswork, but I imagine you need to use COUNT with a CASE expression so that only rows that aren't headers or footers are included. This is pseudo-SQL, but it'll be something like:
COUNT(CASE WHEN <<Some expression that determines a row isn't a header/footer>> THEN 1 END)
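As a runnable illustration of the conditional count, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server. The table name export_rows, its columns, and the 'H'/'D'/'T' row-type codes are assumptions made up for the example; the real condition depends on how your header and trailer rows can be identified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: row_type is 'H' (header), 'D' (detail) or 'T' (trailer).
conn.execute("CREATE TABLE export_rows (row_type TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO export_rows (row_type, payload) VALUES (?, ?)",
    [("H", "header"), ("D", "a"), ("D", "b"), ("D", "c"), ("T", "trailer")],
)

# COUNT(*) counts every row; COUNT(CASE ...) counts only rows where the
# CASE yields a non-NULL value, i.e. the detail rows here.
total_rows, detail_rows = conn.execute(
    "SELECT COUNT(*), COUNT(CASE WHEN row_type = 'D' THEN 1 END) "
    "FROM export_rows"
).fetchone()
conn.close()

print(total_rows, detail_rows)  # 5 3
```

The CASE expression has no ELSE branch, so non-matching rows produce NULL, and COUNT over an expression skips NULLs.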
Also, there's no reason to use CONVERT(BIGINT,COUNT(<<expr>>)). If you're doing a count that might return more than 2^31-1 rows, then use COUNT_BIG. If you're not going to be returning more than 2^31-1 rows, then just use COUNT (you're returning fewer than 500 rows, so there is no reason to use a bigint).

DB2: fill a dummy field with values in for loop while a select

I want to fill a dummy field with values generated in a for loop during a select.
Something like this (the table account has, for example, a field "login"):
select login,(for i= 1 to 3 {list=list.login.i.","}) as list from account
The result should be
login | list
aaa | aaa1,aaa2,aaa3
bbb | bbb1,bbb2,bbb3
ccc | ccc1,ccc2,ccc3
Can someone please help me if that is possible !!!!
Many Thanks !
If this is a one-off task and the size of your loop is fixed, you can make up a table of integers and do a Cartesian product with your table containing the column login:
SELECT ACC.LOGIN || NUMBRS.NUM FROM
ACCOUNT ACC, TABLE (
SELECT '1' AS NUM FROM SYSIBM.SYSDUMMY1 UNION
SELECT '2' AS NUM FROM SYSIBM.SYSDUMMY1 UNION
SELECT '3' AS NUM FROM SYSIBM.SYSDUMMY1
) NUMBRS
This gives you strings like 'aaa1', 'aaa2', 'aaa3', one string per row. You can then aggregate these strings with LISTAGG.
If the size is not fixed, you can always make up a temporary table and fill it up with appropriate data and use it instead of the NUMBRS table above.
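To make the two steps concrete (Cartesian product, then aggregation per login), here is a minimal Python sketch of the same logic. It is an illustration only, not DB2 code; the sample logins come from the question and the variable names are made up.

```python
logins = ["aaa", "bbb", "ccc"]   # values of ACCOUNT.LOGIN
numbers = ["1", "2", "3"]        # the made-up table of integers (NUMBRS)

# Step 1: Cartesian product, one (login, login || num) pair per combination.
pairs = [(login, login + num) for login in logins for num in numbers]

# Step 2: aggregate the generated strings per login with a comma
# separator, which is the role LISTAGG plays in the SQL version.
result = {}
for login, value in pairs:
    result.setdefault(login, []).append(value)
result = {login: ",".join(values) for login, values in result.items()}

for login, joined in result.items():
    print(login, "|", joined)
```

For the sample data this prints the three rows shown in the question, for example aaa | aaa1,aaa2,aaa3.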