Query to search substring in column - sql

I have a table whose column holds substrings, and I want to write a query that checks whether an input string contains one of them.
My table looks like:
| company | host |
| ------- | ---------- |
| ebay | ebay.com |
| google | google.com |
| yahoo | yahoo.com |
My input will be like www.ebay.com or https://www.ebay.com or www.qa.ebay.com or www.dev.ebay.com.
If I get any of these inputs I want to return the first record.
I tried looking at CHARINDEX and INSTR, but they work in reverse. In my scenario the substring to search for is in the table and the full string is the input.
Any help is appreciated.

You can use like for this, but you also need string concatenation. In ANSI standard SQL, this looks like:
select t.*
from t
where @inputstring like concat('%.', t.host)
where @inputstring is the string you are inputting.
Note: You can also use the concatenation infix operation, which is typically || (standard) or +.
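To make the "reversed" LIKE concrete, here is a minimal sketch that runs the same idea against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in chosen so the example is runnable; the helper name find_company is hypothetical, and the || concatenation operator is the SQLite spelling of concat().

```python
import sqlite3

# In-memory table mirroring the one in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (company TEXT, host TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("ebay", "ebay.com"), ("google", "google.com"), ("yahoo", "yahoo.com")],
)


def find_company(input_string):
    """Return companies whose host is a dot-separated suffix of the input."""
    # The input sits on the LEFT of LIKE; the pattern is built from the column.
    rows = conn.execute(
        "SELECT company FROM t WHERE ? LIKE ('%.' || host)",
        (input_string,),
    ).fetchall()
    return [r[0] for r in rows]


inputs = ["www.ebay.com", "https://www.ebay.com", "www.qa.ebay.com", "www.dev.ebay.com"]
results = {s: find_company(s) for s in inputs}
```

Because the pattern starts with '%.', a bare input such as ebay.com (with nothing before the host) would not match; drop the dot from the pattern if that case matters.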

You can use the SQL wildcard like so:
SELECT * FROM table WHERE host LIKE '%ebay.com';

Go for this:
SELECT * FROM table WHERE host LIKE '%SearchString%'
It will pull all rows containing the SearchString.

You can achieve this using the LIKE operator.
Select * from yourtable
where ? like concat('%', company, '%');
Replace the parameter ? with your input.

Related

How do I remove the | special character in a CHAR dataset using proc sql?

I am trying to remove the character | using proc sql. The position of | is not fixed and varies in the data, hence I do not want to use the substr function.
Example 1- 1234|5678|9|101
Example 2 - 12345|6789|1|011
You can use the TRANSLATE() function:
UPDATE tab
SET Col = TRANSLATE(Col, '', '|')
In Oracle you could use REPLACE, something like
SELECT REPLACE('1234|5678|9|101 Example 2 - 12345|6789|1|011','|','') Changed
FROM DUAL;
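As a runnable illustration of the REPLACE approach, the sketch below applies it to the two example values using SQLite via Python's sqlite3 module. This is an assumption made for testability: it is not proc sql or Oracle, and the table and column names (tab, col) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (col TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?)",
    [("1234|5678|9|101",), ("12345|6789|1|011",)],
)

# REPLACE removes every occurrence of '|', wherever it appears in the value.
conn.execute("UPDATE tab SET col = REPLACE(col, '|', '')")

cleaned = [r[0] for r in conn.execute("SELECT col FROM tab ORDER BY rowid")]
```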

BigQuery - Regex to match a pattern after a known string (positive lookbehind alternative)

I need to extract 8 digits after a known string:
| MyString | Extract: |
| ---------------------------- | -------- |
| mypasswordis 12345678 | 12345678 |
| # mypasswordis 12345678 | 12345678 |
| foobar mypasswordis 12345678 | 12345678 |
I can do this with regex like:
(?<=mypasswordis.*)[0-9]{8}
However, when I want to do this in BigQuery using the REGEXP_EXTRACT command, I get the error message, "Cannot parse regular expression: invalid perl operator: (?<".
I searched through the re2 library and saw there doesn't seem to be an equivalent for positive lookbehind.
Is there any way I can do this using other methods? Something like
SELECT REGEXP_EXTRACT(MyString, r"(?<=mypasswordis.*)[0-9]{8}")
You need a capturing group here to extract a part of a pattern, see the REGEXP_EXTRACT docs you linked to:
If the regular expression contains a capturing group, the function returns the substring that is matched by that capturing group. If the expression does not contain a capturing group, the function returns the entire matching substring.
Also, the .* pattern is too costly, you only need to match whitespace between the word and the digits.
In general, to "convert" a (?<=mypasswordis).* pattern with a positive lookbehind, you can use mypasswordis(.*).
In this case, you can use
SELECT REGEXP_EXTRACT(MyString, r"mypasswordis\s*([0-9]{8})")
Or just
SELECT REGEXP_EXTRACT(MyString, r"mypasswordis\s*([0-9]+)")
See the re2 regex online test.
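The capturing-group rewrite can be tried outside BigQuery. The following sketch uses Python's re module on the three sample strings; Python's regex engine is not RE2, so this only illustrates the pattern logic, and the helper name extract_digits is hypothetical.

```python
import re

# Match the known prefix, optional whitespace, then capture exactly 8 digits.
PATTERN = re.compile(r"mypasswordis\s*([0-9]{8})")


def extract_digits(text):
    """Return the captured 8 digits, or None when the pattern is absent."""
    m = PATTERN.search(text)
    return m.group(1) if m else None


samples = [
    "mypasswordis 12345678",
    "# mypasswordis 12345678",
    "foobar mypasswordis 12345678",
]
extracted = [extract_digits(s) for s in samples]
```

Only group 1 is returned, which mirrors how REGEXP_EXTRACT returns the capturing group rather than the whole match.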
Try to avoid regexp where you can; it's quite slow. Try substring and instr, for example:
SELECT SUBSTR(MyString, INSTR(MyString,'mypasswordis') + LENGTH('mypasswordis')+1)
Otherwise, Wiktor Stribiżew probably has the right answer.
Use REGEXP_REPLACE instead to match what you don't want and delete that:
REGEXP_REPLACE(str, r'^.*mypasswordis ', '')

Extract particular character using StandardSQL

I would like to extract particular character from strings using StandardSQL.
I would like to extract the character after limit=.
For instance, from the strings below I would like to extract 10, 3 and null. I would also like to replace every null result with 1.
partner=&limit=10
partner=aex&limit=3&filters%5Bpartner%5D
partner=aex&limit=&filters%5Bpartner%5D
I only know how to use substring function but the problem here is the positions of limit= are not always the same.
You can use REGEXP_EXTRACT. For example:
SELECT REGEXP_EXTRACT('partner=aex&limit=3&filters%5Bpartner%5D', 'limit=(\\d+)');
+-------+
| $col1 |
+-------+
| 3 |
+-------+
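The question also asks for null results to become 1. In SQL that would typically mean wrapping the extraction in something like COALESCE(..., '1'), which is an assumption here rather than part of the answer above. The same logic is sketched below in Python so it can be run as is; the helper name extract_limit is hypothetical.

```python
import re


def extract_limit(query_string, default="1"):
    """Return the digits after 'limit=', or the default when none are present."""
    m = re.search(r"limit=(\d+)", query_string)
    return m.group(1) if m else default


samples = [
    "partner=&limit=10",
    "partner=aex&limit=3&filters%5Bpartner%5D",
    "partner=aex&limit=&filters%5Bpartner%5D",
]
limits = [extract_limit(s) for s in samples]
```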

sql query words that don't have numbers?

Is it possible to select just words that don't contain numbers?
Something like...
|Address |
-----------------
|Street x 150 |
|Street y |
|Street z 498Z |
I want just Street y in this case.
I have these texts in Excel and would like to filter them in Access. As a last resort I can move the data to Microsoft SQL Server.
I'll search about REGEX on Access or mssql.
Here is a way to do it in SQL Server (the [0-9] character class inside LIKE is a T-SQL extension, so other databases may not accept it):
select *
from t
where address not like '%[0-9]%'
That is, the address is not like something that has a number in it.
Like in Access does not follow the standard at all (using * rather than % as the wildcard, for instance). So, this will not work in Access.
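For a quick check of the "no digit anywhere" logic without a SQL Server instance, here is a sketch using SQLite through Python's sqlite3 module. SQLite's LIKE has no bracket classes, so the example substitutes GLOB (which uses * as the wildcard and does support [0-9]); that substitution is an assumption made for runnability, not the T-SQL syntax shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (address TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [("Street x 150",), ("Street y",), ("Street z 498Z",)],
)

# Keep rows whose address does NOT contain any digit.
no_digits = [
    r[0]
    for r in conn.execute("SELECT address FROM t WHERE address NOT GLOB '*[0-9]*'")
]
```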
SELECT ... FROM ... WHERE fieldname NOT REGEXP '[0-9]'

Wildcard of Number in SQL Server

How to match numbers in SQL Server 'LIKE'.
SpaceName
------------
| New_Space_1
| .
| .
| New_Space_8
| New_Space_9
| New_Space_10
| New_Space_11
| New_Space_SomeString
| New_Space_SomeString1
Above is my table contents.
I want to get only records ending with Numeric chars, ie I want the records from New_Space_1 to New_Space_11.
Don't want New_Space_SomeString and New_Space_SomeString1
I have some query like this.
SELECT SpaceName FROM SpaceTable
WHERE SpaceName LIKE 'New_Space_%'
But this returns all records.
What about:
SELECT SpaceName FROM SpaceTable
WHERE SpaceName LIKE 'New[_]Space[_][0-9]%'
The reason I put the underscore in brackets is that in a LIKE pattern _ matches any single character. Read up on LIKE here: http://msdn.microsoft.com/en-us/library/ms179859.aspx
This solution from @SteveKass works perfectly.
SELECT SpaceName FROM SpaceTable WHERE SpaceName LIKE 'New[_]Space[_]%' AND SpaceName NOT LIKE 'New[_]Space[_]%[^0-9]%'
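The two-condition trick (match the prefix, then reject any suffix containing a non-digit) can be checked with a small sketch in SQLite via Python's sqlite3 module. As an assumption for runnability it uses GLOB instead of T-SQL's LIKE: in GLOB the underscore is a literal character, * is the wildcard, and [^0-9] is a negated class.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SpaceTable (SpaceName TEXT)")
conn.executemany(
    "INSERT INTO SpaceTable VALUES (?)",
    [
        ("New_Space_1",),
        ("New_Space_9",),
        ("New_Space_10",),
        ("New_Space_11",),
        ("New_Space_SomeString",),
        ("New_Space_SomeString1",),
    ],
)

# First condition: the name starts with the prefix.
# Second condition: nothing after the prefix is a non-digit.
numeric_only = [
    r[0]
    for r in conn.execute(
        "SELECT SpaceName FROM SpaceTable "
        "WHERE SpaceName GLOB 'New_Space_*' "
        "AND SpaceName NOT GLOB 'New_Space_*[^0-9]*' "
        "ORDER BY rowid"
    )
]
```

Note that, as in the original query, a name consisting of the prefix alone (New_Space_) would also pass both conditions.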