Due to an earlier error, I ended up with letters and symbols in places where I should have had integers and floats. At this time, I don't know the extent of the problem and am working to correct the code as we move forward.
As of right now when I run SELECT distinct col1 from table; I get integers, floats, symbols and letters. A few million of them.
How can I update the SQL to exclude all numbers? In other words, show me only letters and symbols.
You can use the GLOB operator:
select col1
from tablename
where col1 GLOB '*[^0-9]*'
This will return all values of col1 that contain any character different than a digit.
You may change it to include '.' in the list of chars:
where col1 GLOB '*[^0-9.]*'
See the demo.
If what you want is values that do not contain any digits then use this:
select col1
from tablename
where col1 not GLOB '*[0-9]*'
See the demo.
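To see how the three GLOB patterns above differ, here is a minimal sketch using Python's built-in sqlite3 module. The table name follows the answer; the sample values are made up for illustration and are all stored as text.

```python
import sqlite3

# In-memory database with made-up sample values, all stored as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (col1 TEXT)")
samples = ["123", "4.5", "abc", "a1", "#$"]
conn.executemany("INSERT INTO tablename VALUES (?)", [(s,) for s in samples])


def matching(where_clause):
    """Return the sorted col1 values that satisfy the given WHERE clause."""
    rows = conn.execute(f"SELECT col1 FROM tablename WHERE {where_clause}")
    return sorted(r[0] for r in rows)


# Contains at least one character that is not a digit.
any_non_digit = matching("col1 GLOB '*[^0-9]*'")
# Contains at least one character that is neither a digit nor '.'.
any_non_numeric = matching("col1 GLOB '*[^0-9.]*'")
# Contains no digits at all.
no_digits = matching("col1 NOT GLOB '*[0-9]*'")

print(any_non_digit)
print(any_non_numeric)
print(no_digits)
```

Note that 'a1' is kept by the first two patterns but dropped by the third, so the choice depends on whether mixed values count as bad data.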
Hmmm . . . SQLite doesn't have regular expressions built-in, making this a bit of a pain. If the column actually contains numbers and strings (because that is possible in SQLite), you can use
where typeof(col) = 'text'
If the types are all text (so '1.23' rather than 1.23), then this may do what you want, since a non-numeric string does not survive the round trip through a number:
where cast( (col + 0) as text) <> col
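Both ideas can be tried out from Python's sqlite3 module. This is only a sketch: the table names `mixed` and `as_text` and the sample values are hypothetical. It runs the round-trip comparison in both directions, so you can see that `=` keeps the numeric-looking strings and `<>` keeps the rest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column declared without a type, so each value keeps its own storage class.
conn.execute("CREATE TABLE mixed (col)")
conn.executemany("INSERT INTO mixed VALUES (?)", [(1,), (2.5,), ("abc",), ("#",)])
text_only = sorted(
    r[0] for r in conn.execute("SELECT col FROM mixed WHERE typeof(col) = 'text'")
)

# Column where everything is text, including the numeric-looking values.
conn.execute("CREATE TABLE as_text (col TEXT)")
conn.executemany(
    "INSERT INTO as_text VALUES (?)", [("12",), ("1.23",), ("abc",), ("x9",)]
)
round_trip = "cast((col + 0) AS text)"
# Strings that come back unchanged after conversion to a number.
numeric_looking = sorted(
    r[0] for r in conn.execute(f"SELECT col FROM as_text WHERE {round_trip} = col")
)
# Strings that change, i.e. letters and symbols.
not_numeric = sorted(
    r[0] for r in conn.execute(f"SELECT col FROM as_text WHERE {round_trip} <> col")
)

print(text_only, numeric_looking, not_numeric)
```

One caveat with the round trip: a string such as '1.50' converts to 1.5 and comes back as '1.5', so it would be reported as non-numeric.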
Related
I need to convert varchar value into NUMERIC and receive below error.
Error 2621: Bad characters in format or data of (variable_name)
I want to remove the rows with those bad characters and keep only convertable ones.
How to do that in Teradata? Does Teradata have some function to do that? (something like PRXMATCH in SAS)
Thanks
Simply apply TO_NUMBER which returns NULL for bad data points:
TO_NUMBER(mycol)
If you don't use regular expressions, you can use translate to strip the digits and the dot and check that nothing is left over. In Oracle, I would write this as:
select col
from t
where translate(col, 'x0123456789.', 'x') is null and
col not like '%.%.%';
I think Teradata has a more sensible policy on empty strings, so it would look like:
select col
from t
where otranslate(col, '0123456789.', '') = '' and
col not like '%.%.%';
Of course, remove the . if you only want integers.
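The translate idea is easy to check outside the database. The following is a Python emulation of the logic only, not Teradata or Oracle code, and the function names are hypothetical. Deleting digits and dots leaves just the offending characters, so an empty remainder means the value looks numeric.

```python
# Characters that may legitimately appear in a plain decimal number.
NUMERIC_CHARS = "0123456789."
STRIP_NUMERIC = str.maketrans("", "", NUMERIC_CHARS)


def leftover(value):
    """Emulate translate/otranslate: delete digits and dots, keep the rest."""
    return value.translate(STRIP_NUMERIC)


def looks_numeric(value):
    """True when only digits and at most one dot are present."""
    return value != "" and leftover(value) == "" and value.count(".") <= 1


for sample in ["12.5", "1.2.3", "12a", "007"]:
    print(sample, leftover(sample), looks_numeric(sample))
```

Like the SQL version, this does not handle signs, and a lone '.' still slips through, so treat it as a first filter.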
Trying to get to grips with Oracle, coming from a SQL environment.
Does anyone know why this query returns 0?
SELECT COUNT( * ) FROM MORGS.LOGS l
WHERE ( l.LOCATION = 'X:\Import\XXX006' ) AND
( l.DIRECTION = 'IN' ) AND
( 'XXX006-Test.txt' LIKE '%XXX006.D$Date,YYYYMMDD$.T$Date,HHNNSS$%' ) -- It fails on this condition
Please note that 'XXX006-Test.txt' on the left-hand side of LIKE is the value of the column in the table. I've hard-coded it here just for the demo.
Thanks in advance.
Actually LIKE is working. I'm afraid it's your logic that's faulty. LIKE is true only when the whole of the first operand matches the pattern in the second, where the wildcards stand in for the characters you don't care about.
So this is TRUE ...
where 'ABC' like 'ABC%'
... and this is FALSE ...
where 'ABC' like 'ABCDEF'
Looking at your actual test:
( 'XXX006-Test.txt' LIKE '%XXX006.D$Date,YYYYMMDD$.T$Date,HHNNSS$%' )
we notice that the string XXX006-Test.txt does not exist in XXX006.D$Date,YYYYMMDD$.T$Date,HHNNSS$ so LIKE quite rightly returns FALSE.
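The behaviour is easy to reproduce. This sketch evaluates the same comparisons in SQLite through Python's sqlite3 module, used here only as a stand-in for Oracle on the assumption that the % wildcard behaves the same way in these cases. The helper name `like` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def like(value, pattern):
    """Evaluate `value LIKE pattern` in SQLite and return a bool."""
    (result,) = conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()
    return bool(result)


# The pattern covers the whole value, so this matches.
prefix_match = like("ABC", "ABC%")
# The pattern demands characters the value does not have.
longer_pattern = like("ABC", "ABCDEF")
# The original condition: the value is not contained in the pattern text.
original_test = like("XXX006-Test.txt", "%XXX006.D$Date,YYYYMMDD$.T$Date,HHNNSS$%")
# Matching on the shared prefix instead does succeed.
shared_prefix = like("XXX006-Test.txt", "XXX006%")

print(prefix_match, longer_pattern, original_test, shared_prefix)
```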
"Do you know how I can split the RHS on a '.' and grab only the first index of the split results, which is 'XXX006'?"
If the required match is always six characters long the simplest thing is
substr('XXX006-Test.txt', 1, 6)
If the length of the leading part varies, you can use regular expressions. To extract everything before the .txt extension:
regexp_replace ( 'XXX006-Test.txt', '(.+)\.txt$','\1' )
Although given the values in the two strings you might want to match on the dash instead ...
regexp_replace ( 'XXX006-Test.txt', '([a-z0-9]+)\-(.*)','\1' )
Depends how stable the pattern is.
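For comparison, here are the three extraction options tried with Python's re module. This is a sketch, not Oracle code: Oracle's regexp_replace uses POSIX-style expressions, and I have widened the character class to [A-Za-z0-9] because Python's matching is case-sensitive.

```python
import re

filename = "XXX006-Test.txt"

# Fixed-width prefix, like substr(..., 1, 6).
fixed_width = filename[:6]
# Everything before the .txt extension.
before_extension = re.sub(r"(.+)\.txt$", r"\1", filename)
# Everything before the first dash.
before_dash = re.sub(r"([A-Za-z0-9]+)-(.*)", r"\1", filename)

print(fixed_width, before_extension, before_dash)
```

Only the first and third give 'XXX006'; stripping the extension leaves 'XXX006-Test', which is why matching on the dash may be the better fit for this data.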
Column xy in table ABC is of type nvarchar2(40) and consists mainly of numeric strings.
How can I write a
select to_number(trim(xy)) from ABC
query that ignores non-numeric strings?
In general in relational databases, the order of evaluation is not defined, so it is possible that the select functions are called before the where clause filters the data. I know this is the case in SQL Server. Here is a post that suggests that the same can happen in Oracle.
The case statement, however, does cascade, so it is evaluated in order. For that reason, I prefer:
select (case when NOT regexp_like(xy,'[^[:digit:]]') then to_number(xy)
end)
from ABC;
This will return NULL for values that are not numbers.
You could use regexp_like to find out if it is a number (with/without plus/minus sign, decimal separator followed by at least one digit, thousand separators in the correct places if any) and use it like this:
SELECT TO_NUMBER( CASE WHEN regexp_like(xy,'.....') THEN xy ELSE NULL END )
FROM ABC;
However, as the built-in function TO_NUMBER is not able to deal with all numbers (it fails at least when a number contains thousand separators), I would suggest writing a PL/SQL function TO_NUMBER_OR_DEFAULT(numberstring, defaultnumber) to do what you want.
EDIT: You may want to read my answer on using regexp_like to determine if a string contains a number here: https://stackoverflow.com/a/21235443/2270762.
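To make the "convert or fall back" idea concrete, here is a Python sketch of what such a TO_NUMBER_OR_DEFAULT helper could do. It is not PL/SQL, and the pattern is my own simple assumption (optional sign, digits, optional fractional part), not the expression elided in the answer above.

```python
import re
from decimal import Decimal

# Assumed pattern: optional sign, digits, optional fractional part.
SIMPLE_NUMBER = re.compile(r"[+-]?[0-9]+(\.[0-9]+)?")


def to_number_or_default(numberstring, defaultnumber=None):
    """Return the value as a Decimal, or the default when it is not a number."""
    text = numberstring.strip()
    if SIMPLE_NUMBER.fullmatch(text):
        return Decimal(text)
    return defaultnumber


for sample in [" 42 ", "-3.14", "abc", "1,000"]:
    print(repr(sample), to_number_or_default(sample))
```

Testing the string before converting mirrors the CASE approach: the conversion is only attempted on values that have already passed the check.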
You can add WHERE
SELECT TO_NUMBER(TRIM(xy)) FROM ABC WHERE REGEXP_INSTR(xy, '[A-Za-z]') = 0
The WHERE clause filters out rows whose value contains letters. See the documentation
I'm trying to select some rows from an Oracle database like so:
select * from water_level where bore_id in ('85570', '112205','6011','SP068253');
This used to work fine but a recent update has meant that bore_id in water_level has had a bunch of whitespace added to the end for each row. So instead of '6011' it is now '6011 '. The number of space characters added to the end varies from 5 to 11.
Is there a way to edit my query to capture the bore_id in my list, taking into account that trailing whitespace should be ignored?
I tried:
select * from water_level where bore_id in ('85570%', '112205%','6011%','SP068253%');
which returns more rows than I want, and
select * from water_level where bore_id in ('85570\s*', '112205\s*','6011\s*', 'SP068253\s*');
which didn't return anything?
Thanks
JP
You should RTRIM the column in the WHERE clause
select * from water_level where RTRIM(bore_id) in ('85570', '112205','6011','SP068253');
To add to that, RTRIM has an overload which you can pass a second parameter of what to trim, so if the trailing characters weren't spaces, you could remove them. For example if the data looked like 85570xxx, you could use:
select * from water_level where RTRIM(bore_id, 'x') IN ('85570','112205', '6011');
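Both forms of RTRIM can be tried in SQLite, which also offers rtrim(X) and rtrim(X, Y). This sketch uses Python's sqlite3 module as a stand-in for Oracle, with made-up rows padded in the way the question describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE water_level (bore_id TEXT)")
# Made-up rows: two padded with spaces, one near miss, one padded with 'x'.
conn.executemany(
    "INSERT INTO water_level VALUES (?)",
    [("6011     ",), ("85570       ",), ("60110     ",), ("112205xxx",)],
)

ids = "('85570', '112205', '6011', 'SP068253')"

# Trailing spaces are removed before the comparison.
spaces_trimmed = sorted(
    r[0].rstrip()
    for r in conn.execute(f"SELECT bore_id FROM water_level WHERE rtrim(bore_id) IN {ids}")
)
# Trailing 'x' characters are removed instead.
x_trimmed = [
    r[0]
    for r in conn.execute(f"SELECT bore_id FROM water_level WHERE rtrim(bore_id, 'x') IN {ids}")
]

print(spaces_trimmed, x_trimmed)
```

Unlike the '6011%' attempt in the question, '60110' is not picked up, because the comparison is still an exact match after trimming.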
You could use the replace function to remove the spaces
select * from water_level where replace(bore_id, ' ', '') in ('85570', '112205', '6011', 'SP068253');
Although, a better option would be to remove the spaces from the data if they are not supposed to be there or create a view.
I'm guessing bore_id is VARCHAR or VARCHAR2. If it were CHAR, Oracle would use (SQL-standard) blank-padded comparison semantics, which regards 'foo' and 'foo ' as equivalent.
So, another approach is to force comparison as CHARs:
SELECT *
FROM water_level
WHERE CAST(bore_id AS CHAR(16)) IN ('85570', '112205', '6011', 'SP068253');
I want to get only those rows that contain ONLY certain characters in a column.
Let's say the column name is DATA.
I want to get all rows where in DATA are ONLY (must have all three conditions!):
Numeric characters (1 2 3 4 5 6 7 8 9 0)
Dash (-)
Comma (,)
For instance:
Value "10,20,20-30,30" IS OK
Value "10,20A,20-30,30Z" IS NOT OK
Value "30" IS NOT OK
Value "AAAA" IS NOT OK
Value "30-" IS NOT OK
Value "30," IS NOT OK
Value "-," IS NOT OK
Try patindex:
select * from(
select '10,20,20-30,30' txt union
select '10,20,20-30,40' txt union
select '10,20A,20-30,30Z' txt
)x
where patindex('%[^0-9,-]%', txt)=0
For your table, try:
select
DATA
from
YourTable
where
patindex('%[^0-9,-]%', DATA)=0
As per your edited question, the query should be like:
select
DATA
from
YourTable
where
PATINDEX('%[^0-9,-]%', DATA)=0 and
PATINDEX('%[0-9]%', LEFT(DATA, 1))=1 and
PATINDEX('%[0-9]%', RIGHT(DATA, 1))=1 and
PATINDEX('%[,-][-,]%', DATA)=0
Edit: Your question was edited, so this answer is no longer correct. I won't bother updating it since someone else already has updated theirs. This answer does not fulfil the condition that all three character types must be found.
You can use a LIKE expression for this, although it's slightly convoluted:
where data not like '%[^0123456789,!-]%' escape '!'
Explanation:
[^...] matches any character that is not in the ... part. % matches any number (including zero) of any character. So [^0123456789-,] is the set of characters that you want to disallow.
However: - is a special character inside of [], so we must escape it, which we do by using an escape character, and I've chosen !.
So, you match rows that do not contain (not like) any character that is outside your allowed set.
Use PATINDEX together with the LIKE operator:
SELECT *
FROM dbo.test70
WHERE PATINDEX('%[A-Z]%', DATA) = 0
AND PATINDEX('%[0-9]%', DATA) > 0
AND DATA LIKE '%-%'
AND DATA LIKE '%,%'
Demo on SQLFiddle
As already mentioned, you can use a LIKE expression, but it will only work with some minor modifications; otherwise too many rows will be filtered out.
SELECT * FROM X WHERE T NOT LIKE '%[^0-9!-,]%' ESCAPE '!'
see working example here:
http://sqlfiddle.com/#!3/474f5/6
edit:
to meet all 3 conditions:
SELECT *
FROM X
WHERE T LIKE '%[0-9]%'
AND T LIKE '%-%'
AND T LIKE '%,%'
see: http://sqlfiddle.com/#!3/86328/1
Maybe not the most beautiful but a working solution.
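The checks from the answers above can be combined and run against the sample values from the question. This is a Python emulation of the logic, not T-SQL, and the function name `is_ok` is hypothetical. It requires only allowed characters, a digit at each end, no two separators in a row, and at least one dash and one comma.

```python
import re


def is_ok(data):
    """Apply the combined rules to a single DATA value."""
    return bool(
        re.fullmatch(r"[0-9,-]+", data)           # only digits, commas, dashes
        and data[0].isdigit()                      # starts with a digit
        and data[-1].isdigit()                     # ends with a digit
        and not re.search(r"[,-][,-]", data)       # no adjacent separators
        and "-" in data                            # at least one dash
        and "," in data                            # at least one comma
    )


samples = ["10,20,20-30,30", "10,20A,20-30,30Z", "30", "AAAA", "30-", "30,", "-,"]
for value in samples:
    print(value, "IS OK" if is_ok(value) else "IS NOT OK")
```

Only the first sample passes, which matches the list given in the question.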