SQL Select LIKE, find similarities - sql

Is there a way with the Select LIKE operator to find similarities?
For example I have a table with following content.
1. 34578
2. 34878
3. 12578
Now I want select all values with are similar with 34X78, where X can be any number from 0 to 9. Result should be then record 1 and 2.
Also X can be on various positions and something like 3XX79 or 3X5X8 should be possible.
It can be also a solution using SQLScript on SAP HANA

Try using '_' wild card:
SELECT * FROM YourTable
WHERE COLUMN LIKE '34_78'
_ Is a wild card that does what you asked for, can be replaced with any thing.
You can find an explanation about LIKE wildcards here.

According to the manuals HANA suports regular expressions:
WHERE column LIKE_REGEXPR '34[0-9]78'

Related

SQL Like condition fails to run

I've been tasked to develop a query that behaves essentially like the following one:
SELECT * FROM tblTestData WHERE *.TestConditions LIKE '*textToSearch*'
The textToSearch is a string which contains information about the condition in which a given device is tested (Voltage, Current, Frequency, etc) in the following format as an example:
[V:127][PF:1][F:50][I:65]
The objective is to recover a list of any and all tests performed at a voltage of 127 Volts, so the SQL developed would look like the folllowing:
SELECT * FROM tblTestData WHERE *.TestConditions LIKE '*V:127*'
This works as intended but there is a problem due to an inproper introduction of data, there are cases in which the _textToSearch string looks like the following examples:
[V.127][PF:1][F:50][I:65]
[V.230][PF:1][F:50][I:65]
As you can see, my previous SQL transaction does not work as it does not meet the conditions.
If I try to do the following transaction with the objective of ignoring improper data format:
SELECT * FROM tblTestData WHERE *.TestConditions LIKE '*V*127*'
The transaction is not succesful and returns an error.
What am I doing wrong for this transaction not to work? I am approaching this problem wrong?
I see a pair of problems although with this transaction, if there were a group of test conditions like the following:
[V.127][PF:1][F:50][I:127]
[V.230][PF:1][F:50][I:127]
Would it return the values of both points given that both meet the condition of the transaction stated above?
In conclusion, my questions are:
What is wrong with the LIKE '*V*127*' condition for it not to work?
What implications has working with this condition? Can it return more information than desired if I am not careful?
I hope it is clear what I am asking for, if it isn't, please point out what is not clear and I will try to clarify it
One choice is to look for any character between the "V" and the "127":
WHERE TestConditions LIKE '%V_127%'
Note that % is the wildcard for a string of any length and _ is the wildcard for a single character.
You can also use regular expressions:
WHERE regexp_like(TestConditions, 'V[.:]127')
Note that regular expressions match anywhere in the string, so wildcards at the beginning and end are not needed.
You could check for both cases (although this will decrease performance)
SELECT *
FROM tblTestData
WHERE (TestConditions LIKE '%V:127%' OR TestConditions LIKE '%V.127%')
It is better to clean the data in your database if only old records have this problem.
Using regular expressions is recommended by Oracle for this kind of conditions. You could build a regular expression for your case:
WITH your_table AS (
SELECT '[V.127][PF:1][F:50][I:65]' text_to_search FROM dual
UNION
SELECT '[V.230][PF:1][F:50][I:65]' text_to_search FROM dual
UNION
SELECT '[V:127][PF:1][F:50][I:65]' text_to_search FROM dual
)
SELECT *
FROM your_table
WHERE REGEXP_LIKE(text_to_search,'\[V(.|:)127\]','i')
Or you could use the good old LIKE operator. In this case, you need to know that:
% matches zero or more characters
_ matches only one character
So you should use an underscore to match the : or the .
WITH your_table AS (
SELECT '[V.127][PF:1][F:50][I:65]' text_to_search FROM dual
UNION
SELECT '[V.230][PF:1][F:50][I:65]' text_to_search FROM dual
UNION
SELECT '[V:127][PF:1][F:50][I:65]' text_to_search FROM dual
)
SELECT *
FROM your_table
WHERE text_to_search LIKE '%V_127%';

SQL Like filtering

Welcome!
I am currently working on some C# and I encountered a rather silly issue. I need to fill ListBox with some data from database obviously. Problem is varchar filtering. I need to filter codes and display only the right ones.
Example of codes are: RRTT, RRTR, RT12, RQ00, R100, R200, R355, TY44, GG77 etc. Four digit codes.
I managed to filter only R-codes with simple select * from table1 where code_field like 'R%' but I get both R000 and RQ10 and I need to get R ones followed by numbers only.
So from example:
RRTT, RRTR, RT12, RQ00, R100, R200, R355
Only:
R100, R200, R355
Any ideas what should I add to the like 'R%'?
In SQL Server, you can make use of wildcards. Here is one approach:
where rcode like 'R___' and -- make sure code has four characters
rcode not like 'R%[^0-9]%' -- and nothing after the first one is not a number
Or:
where rcode like 'R[0-9][0-9][0-9]'
In other databases, you would normally do this using regular expressions rather than extensions to like.
here solution using like
SELECT *
FROM (
VALUES ( 'R123'),
( 'R12T' ),
( 'RT12' )
) AS t(test)
where test like 'R[0-9][0-9][0-9]'
like 'S[0-9%]'
Friend came up with the solution, thanks anyway

BigQuery Wildcard using TABLE_DATE_RANGE()

Great news about the new table wildcard functions this morning! Is there a way to use TABLE_DATE_RANGE() on tables that include date but no prefix?
I have a dataset that contains tables named YYYYMMDD (no prefix). Normally I would query like so:
SELECT foo
FROM [mydata.20140319],[mydata.20140320],[mydata.20140321]
LIMIT 100
I tried the following but I'm getting an error:
SELECT foo
FROM
(TABLE_DATE_RANGE(mydata.,
TIMESTAMP('2014-03-19'),
TIMESTAMP('2015-03-21')))
LIMIT 100
as well as:
SELECT foo
FROM
(TABLE_DATE_RANGE(mydata,
TIMESTAMP('2014-03-19'),
TIMESTAMP('2015-03-21')))
LIMIT 100
The underlying bug here has been fixed as of 2015-05-14. You should be able to use TABLE_DATE_RANGE with a purely numeric table name. You'll need to end the dataset in a '.' and enclose the name in brackets, so that the parser doesn't complain. This should work:
SELECT foo
FROM
(TABLE_DATE_RANGE([mydata.],
TIMESTAMP('2014-03-19'),
TIMESTAMP('2015-03-21')))
LIMIT 100
Note: The underlying bug has been fixed, please see my other answer.
Original response left for posterity (since the workaround should still work, in case you need it for some reason)
Great question. That should work, but it doesn't currently. I've filed an internal bug. In the meantime, a workaround is to use the TABLE_QUERY function, as in:
SELECT foo
FROM (
TABLE_QUERY(mydata,
"TIMESTAMP(table_id) BETWEEN "
+ "TIMESTAMP('2014-03-19') "
+ "AND TIMESTAMP('2015-03-21')"))
Note that with standard SQL support in BigQuery, you can use _TABLE_SUFFIX, instead of TABLE_QUERY. For example:
SELECT foo
FROM `mydata_*`
WHERE _TABLE_SUFFIX BETWEEN '20140319' AND '20150321'
Also check this question for more about BigQuery standard SQL.

How can I SELECT DISTINCT on the last, non-numerical part of a mixed alphanumeric field?

I have a data set that looks something like this:
A6177PE
A85506
A51SAIO
A7918F
A810004
A11483ON
A5579B
A89903
A104F
A9982
A8574
A8700F
And I need to find all the ENDings where they are non-numeric. In this example, that means PE, AIO, F, ON, B and F.
In pseudocode, I'm imagining I need something like
SELECT DISTINCT X FROM
(SELECT SUBSTR(COL,[SOME_CLEVER_LOGIC]) AS X FROM TABLE);
Any ideas? Can I solve this without learning regexp?
EDIT: To clarify, my data set is a lot larger than this example. Also, I'm only interested in the part of the string AFTER the numeric part. If the string is "A6177PE" I want "PE".
Disclaimer: I don't know Oracle SQL. But, I think something like this should work:
SELECT DISTINCT X FROM
(SELECT SUBSTR(COL,REGEXP_INSTR(COL, "[[:ALPHA:]]+$")) AS X FROM TABLE);
REGEXP_INSTR(COL, "[[:ALPHA:]]+$") should return the position of the first of the characters at the end of the field.
For readability, I'd recommend using the REGEXP_SUBSTR function (If there are no performance issues of course, as this is definitely slower than the accepted solution).
...also similar to REGEXP_INSTR, but instead of returning the position of the substring, it returns the substring itself
SELECT DISTINCT SUBSTR(MY_COLUMN,REGEXP_SUBSTR("[a-zA-Z]+$")) FROM MY_TABLE;
(:alpha: is supported also, as #Audun wrote )
Also useful: Oracle Regexp Support (beginning page)
For example
SELECT SUBSTR(col,INSTR(TRANSLATE(col,'A0123456789','A..........'),'.',-1)+1)
FROM table;

Regular expressions inside SQL Server

I have stored values in my database that look like 5XXXXXX, where X can be any digit. In other words, I need to match incoming SQL query strings like 5349878.
Does anyone have an idea how to do it?
I have different cases like XXXX7XX for example, so it has to be generic. I don't care about representing the pattern in a different way inside the SQL Server.
I'm working with c# in .NET.
You can write queries like this in SQL Server:
--each [0-9] matches a single digit, this would match 5xx
SELECT * FROM YourTable WHERE SomeField LIKE '5[0-9][0-9]'
stored value in DB is: 5XXXXXX [where x can be any digit]
You don't mention data types - if numeric, you'll likely have to use CAST/CONVERT to change the data type to [n]varchar.
Use:
WHERE CHARINDEX(column, '5') = 1
AND CHARINDEX(column, '.') = 0 --to stop decimals if needed
AND ISNUMERIC(column) = 1
References:
CHARINDEX
ISNUMERIC
i have also different cases like XXXX7XX for example, so it has to be generic.
Use:
WHERE PATINDEX('%7%', column) = 5
AND CHARINDEX(column, '.') = 0 --to stop decimals if needed
AND ISNUMERIC(column) = 1
References:
PATINDEX
Regex Support
SQL Server 2000+ supports regex, but the catch is you have to create the UDF function in CLR before you have the ability. There are numerous articles providing example code if you google them. Once you have that in place, you can use:
5\d{6} for your first example
\d{4}7\d{2} for your second example
For more info on regular expressions, I highly recommend this website.
Try this
select * from mytable
where p1 not like '%[^0-9]%' and substring(p1,1,1)='5'
Of course, you'll need to adjust the substring value, but the rest should work...
In order to match a digit, you can use [0-9].
So you could use 5[0-9][0-9][0-9][0-9][0-9][0-9] and [0-9][0-9][0-9][0-9]7[0-9][0-9][0-9]. I do this a lot for zip codes.
SQL Wildcards are enough for this purpose. Follow this link: http://www.w3schools.com/SQL/sql_wildcards.asp
you need to use a query like this:
select * from mytable where msisdn like '%7%'
or
select * from mytable where msisdn like '56655%'