How to do a regex compare of string values in 2 columns in big query - google-bigquery

So let’s say I have 2 columns, both containing string values,
ColA , ResultCol
I want to check if In a row, the string in ColA is a substring of string in ResultCol
I know about ‘WHERE REGEXP_CONTAINS(col, regex)’ , but how do I do this comparison with string in another column?
Please let me know if I missed explaining any criteria of the question
Thanks!

For your requirement, you can use like operator in BigQuery which will compare the strings of two columns.I have created a sample table Product and ran the below code:
Code
SELECT * from `project.dataset.Product` where product like concat('%',country,'%')
Sample Table
Output

Related

SQL group by middle part of string

I have string column that looks usually approximately like this:
https://mapy.cz/zakladni?x=16.3360208&y=49.6718038&z=8&source=firm&id=13123554
https://mapy.cz/turisticka?x=15.9380354&y=50.1990211&z=11&source=base&id=2197
https://mapy.cz/turisticka?x=12.8611357&y=49.8051338&z=16&source=base&id=1703157
I would like to group data by source which is part of the string - four letters behind "source=" (in the case above: firm) and then simply count them. Is there a way to achieve this directly in SQL code? I am using hadoop.
Data is a set of strings that look like above. My expected result is summary table with two columns: 1) Each type of the source (there is about 20 possible and their length is different so I cannot use sipmle substring). Ideally I am looking for solution that says: For the grouping use four letters that come after "source=" 2) Count of their occurences in all the strings.
There is just one source type in each string.
You can use regexp_extract():
select substr(regexp_extract(url, 'source[^&]+'), 8)
You can use charindex in MSSQL to get position of string and extract record
;with cte as (
SELECT SUBSTRING('https://mapy.cz/zakladni?x=16.3360208&y=49.6718038&z=8&source=firm&id=13123554',
charindex('&source=','https://mapy.cz/zakladni?x=16.3360208&y=49.6718038&z=8&source=firm&id=13123554')
+8,4) AS ExtractString )
select ExtractString,count(ExtractString) as count from cte group by ExtractString;
There is equivalent function LOCATE in hiveql for charindex.

How to exclude "?" for Null value from DB2 Select

I am extracting data from DB2 to Dataset via submitting the job(JCL) using DSNTIAUL. Some of column having null value but after extraction in dataset null value are placed like "?" For example:
Table1:
Select Query :
Select Column1,column2,Cloumn3
from Table1;
Output Dataset :
AAAAA......................?.......BBBBBB
CCCCC......DDDDDD.......................?
Could someone help to exclude "?" from the dataset. I tried with COALESCE but no luck. or should i need to write separate SORT step in JCL to remove "?".
And also is there any possible way extract data into CSV format
Is this what you want?
select column1, coalesce(nullif(column2, '?'), nullif(column3, '?'))
If both columns have '?'s, then this will return NULL.

SQL Server: How to select rows which contain value comprising of only one digit

I am trying to write a SQL query that only returns rows where a specific column (let's say 'amount' column) contains numbers comprising of only one digit, e.g. only '1's (1111111...) or only '2's (2222222...), etc.
In addition, 'amount' column contains numbers with decimal points as well and these kind of values should also be returned, e.g. 1111.11, 2222.22, etc
If you want to make the query generic that you don't have to specify each possible digit you could change the where to the following:
WHERE LEN(REPLACE(REPLACE(amount,LEFT(amount,1),''),'.','') = 0
This will always use the first digit as comparison for the rest of the string
If you are using SQL Server, then you can try this script:
SELECT *
FROM (
SELECT CAST(amount AS VARCHAR(30)) AS amount
FROM TableName
)t
WHERE LEN(REPLACE(REPLACE(amount,'1',''),'.','') = 0 OR
LEN(REPLACE(REPLACE(amount,'2',''),'.','') = 0
I tried like this in place of 1111111 replace with column name:
Select replace(Str(1111111, 12, 2),0,left(11111,1))

How to Filter WHERE Field Value LIKE any of the values stored in a Multi Value Parameter in SQL

I have a report (built using SSRS) that uses a multi-value parameter.
I want to add a Filter onto my SQL Query WHERE FieldA is LIKE any of the values stored in the parameter.
So FieldA might have the following values:
BOBJAMESLOUISE
MARYBOB
JENNY
JOHNLOUISEJAMES
BOB
JENNYJAMESMIKE
And #ParamA might have the following values:
Bob, Louise
Therefore in this example only records 1, 3, 4 and 5 should be returned
Thanks to any help in advance :)
P.S I'm using SQL Server 2008
You will want to implement a function like the split function. This can take a comma separated value list and separate it into rows like you want.
Below is a link for a couple of different versions, any of them will work for you. It also tells you how to use it.
Split Function
I am guessing its not the spiting sting part that is the issue since just googling for SQL split string you can find a lot of example. In your case what you would want after the split string is something like this. Assuming that the split string function you end up using returns a table of values Here is what your comparison query for with field A would look like.
SELECT * FROM YourTableWithFieldA WHERE (#ParamA IS NULL OR EXISTS ( SELECT * FROM YourSplitFunctionThatReturnsATableOfValues(#ParamA) SplitTable WHERE (FieldA Like '%'+SplitTable.Value+'%')))

Problem with MySQL Select query with "IN" condition

I found a weird problem with MySQL select statement having "IN" in where clause:
I am trying this query:
SELECT ads.*
FROM advertisement_urls ads
WHERE ad_pool_id = 5
AND status = 1
AND ads.id = 23
AND 3 NOT IN (hide_from_publishers)
ORDER BY rank desc
In above SQL hide_from_publishers is a column of advertisement_urls table, with values as comma separated integers, e.g. 4,2 or 2,7,3 etc.
As a result, if hide_from_publishers contains same above two values, it should return only record for "4,2" but it returns both records
Now, if I change the value of hide_for_columns for second set to 3,2,7 and run the query again, it will return single record which is correct output.
Instead of hide_from_publishers if I use direct values there, i.e. (2,7,3) it does recognize and returns single record.
Any thoughts about this strange problem or am I doing something wrong?
There is a difference between the tuple (1, 2, 3) and the string "1, 2, 3". The former is three values, the latter is a single string value that just happens to look like three values to human eyes. As far as the DBMS is concerned, it's still a single value.
If you want more than one value associated with a record, you shouldn't be storing it as a comma-separated value within a single field, you should store it in another table and join it. That way the data remains structured and you can use it as part of a query.
You need to treat the comma-delimited hide_from_publishers column as a string. You can use the LOCATE function to determine if your value exists in the string.
Note that I've added leading and trailing commas to both strings so that a search for "3" doesn't accidentally match "13".
select ads.*
from advertisement_urls ads
where ad_pool_id = 5
and status = 1
and ads.id = 23
and locate(',3,', ','+hide_from_publishers+',') = 0
order by rank desc
You need to split the string of values into separate values. See this SO question...
Can Mysql Split a column?
As well as the supplied example...
http://blog.fedecarg.com/2009/02/22/mysql-split-string-function/
Here is another SO question:
MySQL query finding values in a comma separated string
And the suggested solution:
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_find-in-set