Underscore and LEFT function - sql-server-2012

I have a column that has values that look like the following:
17_data...
18_data...
1801151...data
The data isn't the cleanest in this column, so I am trying to use the LEFT function to identify the rows that have the 2017 year followed by an underscore: LEFT(column, 3) = '17[_]'. This doesn't return a single row. To troubleshoot, I added the same LEFT expression to the SELECT list to see what was being returned, and I got the value 175 where the actual first three characters are "17_".
Why is this, and how can I structure my WHERE clause to pick up those rows?

When you tried adding 'where' with a rule of LEFT(column, 3) = '17[_]', it was doomed to fail. Operator '=' performs exact comparison: both sides must be equal. That is, it would look for rows whose first 3 characters (left,3) are equal to 17[_], that is, 5 characters, one, seven, bracket, underscore, bracket. Text of 3 characters will not exactly-match 5 characters, ever.
You should have written simply:
WHERE LEFT(column, 3) = '17_'
I guess that you've got the idea for adding a bracket from reading about LIKE patterns. LIKE operator allows you to look for strings contained at start/end/middle of the data.
WHERE column LIKE 'mom%' - starts with mom
WHERE column LIKE '%dad' - ends with dad
and so on. LIKE supports '%' meaning "and then text of any length", and also "_" meaning "and then just one character". This creates a problem: when you want to say "starts with _mom", you cannot write
WHERE column LIKE '_mom%'
because it would also match 9mom, Bmom, and so on, due to _ meaning 'any single character'. That's why in such cases, only in LIKE, you have to write the underscore in brackets:
WHERE column LIKE '[_]mom%' - starts with _mom
Knowing that, it's obvious that you could construct your 'starts with 17_' with LIKE as well:
SELECT column1, column2, ..., columnN
FROM sometable
WHERE column LIKE '17[_]%'
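To see the difference between the comparisons discussed above, here is a small sketch that runs them against an in-memory SQLite database from Python. Several things here are assumptions and not part of the original question: SQLite stands in for SQL Server, substr(code, 1, 3) stands in for LEFT(column, 3), the column name code and the sample rows are made up, and since SQLite's LIKE does not understand the [_] bracket syntax, the literal underscore is written with an ESCAPE clause instead (SQL Server accepts ESCAPE as well).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (code TEXT)")  # hypothetical table and column
conn.executemany(
    "INSERT INTO sometable VALUES (?)",
    [("17_data",), ("175data",), ("18_data",), ("1801151data",)],
)

def matching(where_clause):
    """Return the codes selected by the given WHERE clause, sorted."""
    sql = "SELECT code FROM sometable WHERE " + where_clause + " ORDER BY code"
    return [row[0] for row in conn.execute(sql)]

# '=' is an exact comparison, so the three characters must be 1, 7, underscore.
exact = matching("substr(code, 1, 3) = '17_'")
# Three characters can never equal the five characters 17[_].
bracket_equals = matching("substr(code, 1, 3) = '17[_]'")
# In LIKE, a bare underscore is a wildcard for any single character.
wildcard = matching("code LIKE '17_%'")
# Escaping the underscore makes LIKE treat it literally.
escaped = matching(r"code LIKE '17\_%' ESCAPE '\'")

print(exact)           # ['17_data']
print(bracket_equals)  # []
print(wildcard)        # ['175data', '17_data']
print(escaped)         # ['17_data']
```

On SQL Server itself, LIKE '17[_]%' should select the same rows as the escaped pattern here.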

Related

What does the trim function mean in this context?

Database I'm using: https://uploadfiles.io/72wph
select acnum, field.fieldnum, title, descrip
from field, interest
where field.fieldnum=interest.fieldnum and trim(ID) like 'B.1._';
What will the output be from the above query?
Does trim(ID) like 'B.1._' mean that it will only select items from B.1._ column?
trim removes spaces at the beginning and end.
"_" matches any single character, so the query selects any row whose trimmed ID is exactly five characters long: "B.1." followed by one more character.
For example:
'B.1.0'
'B.1.9'
'B.1.A'
'B.1.Z'
etc
Optional Wildcard characters allowed in like are % (percent) and _ (underscore).
A % matches any string with zero or more characters.
An _ matches any single character.
I don't know about the DB you are using, but trim usually removes spaces around the argument you give to it.
The ID is trimmed to make sure it is compared without any white-space around it.
About your second question: only the ROWS with an ID like 'B.1._' (that is, 'B.1.' plus exactly one more character) will be selected.
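If you want to try the trim(ID) like 'B.1._' condition yourself, here is a minimal sketch using Python's built-in sqlite3 module. The database from the question is not available to me, so the single-table layout and the sample IDs are invented for illustration, and SQLite is only a stand-in for whatever DB you are using.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE interest (ID TEXT)")  # hypothetical one-column table
conn.executemany(
    "INSERT INTO interest VALUES (?)",
    [(" B.1.0 ",), ("B.1.9",), ("B.1.10",), ("B.1.",), ("B.2.1",)],
)

# trim() strips the surrounding spaces; '_' then demands exactly one character
# after 'B.1.', so 'B.1.10' (two extra characters) and 'B.1.' (none) are rejected.
matched = [
    row[0]
    for row in conn.execute(
        "SELECT trim(ID) FROM interest "
        "WHERE trim(ID) LIKE 'B.1._' ORDER BY trim(ID)"
    )
]
print(matched)  # ['B.1.0', 'B.1.9']
```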

How can I extract a substring from a character column without using SUBSTR()?

I have a question regarding the data below.
As you can see, each EMP_IDENTIFIER starts with its EMP_ID.
I need to pull out only the 10-character identifier so that I can insert it into another column.
How would I do that?
I did it the traditional way, using INSTR and SUBSTR.
I just want to know whether there is any other way to do it without using INSTR or SUBSTR.
EMP_ID (VARCHAR2)   EMP_IDENTIFIER (VARCHAR2)
62049               62049-2162400111
6394                6394-1368000222
64473               64473-1814702333
61598               61598-0876000444
57452               57452-0336503555
5842                5842-0000070666
75778               75778-0955501777
76021               76021-0546004888
76274               76274-0000454999
73910               73910-0574500122
I am using Oracle 11g.
If you want the second part of the identifier and it is always 10 characters:
select t.*, substr(emp_identifier, -10) as secondpart
from t;
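A quick way to check the negative-offset idea without an Oracle instance is SQLite, whose substr() also counts from the end of the string when the start position is negative. This is only a sketch under that assumption: SQLite stands in for Oracle 11g, and the table t holds just two rows copied from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (emp_id TEXT, emp_identifier TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("62049", "62049-2162400111"), ("6394", "6394-1368000222")],
)

# A start position of -10 means "begin 10 characters before the end".
rows = conn.execute(
    "SELECT emp_id, substr(emp_identifier, -10) AS secondpart "
    "FROM t ORDER BY emp_id"
).fetchall()
print(rows)  # [('62049', '2162400111'), ('6394', '1368000222')]
```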
Here is one way:
REGEXP_SUBSTR (EMP_IDENTIFIER, '-(.{10})',1,1,null,1)
That will give the 1st 10 character string that follows a dash ("-") in your string. Thanks to mathguy for the improvement.
Beyond that, you'll have to provide more details on the exact logic for picking out the identifier you want.
Since apparently this is for learning purposes... let's say the assignment was more complicated. Let's say you had a longer input string, and it had several groups separated by -, and the groups could include letters and digits. You know there are at least two groups that are "digits only" and you need to grab the second such "purely numeric" group. Then something like this will work (and there will not be an instr/substr solution):
select regexp_substr(input_str, '(-|^)(\d+)(-|$)', 1, 2, null, 2) from ....
This searches the input string for one or more digits (\d means any digit, + means one or more occurrences) between a - or the beginning of the string (^ means beginning of the string; (a|b) means match a OR b) and a - or the end of the string ($ means end of the string). It starts searching at the first character (the third argument of the function, the position, is 1); it looks for the second occurrence (the fourth argument, 2); it doesn't do any special matching such as ignoring case (the fifth argument, null); and when the match is found, it returns the fragment of the match that corresponds to the second set of parentheses (the last argument, 2). The second fragment is the \d+, that is, the sequence of digits without the leading and/or trailing dash -.
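This solution is overkill for your example. The intent is that, in something like AS23302-ATX-20032-33900293-CWV20-3499-RA, it returns the second numeric group, 33900293. Do test it on your data, though: each match consumes the dash that follows the digits, and the search for the next occurrence resumes after the previous match, so a numeric group that directly follows another one (as 33900293 follows 20032 here, or as the second part follows the EMP_ID in your data) may be skipped.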
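To make the position / occurrence / subexpression arguments more concrete, here is a rough Python sketch. The helper regexp_substr below is hypothetical: it only imitates the argument order of Oracle's function using Python's re module, leaves out the match-parameter argument, and is not guaranteed to behave identically to Oracle in every corner case. The second input string is a made-up example whose numeric groups are separated by a non-numeric group.

```python
import re

def regexp_substr(source, pattern, position=1, occurrence=1, subexpr=0):
    """Rough stand-in for Oracle's REGEXP_SUBSTR (match flags omitted)."""
    regex = re.compile(pattern)
    # Oracle positions are 1-based; Python offsets are 0-based.
    matches = regex.finditer(source, position - 1)
    for index, match in enumerate(matches, start=1):
        if index == occurrence:
            # subexpr 0 is the whole match, 1..n are the parenthesised groups.
            return match.group(subexpr)
    return None  # fewer matches than the requested occurrence

# First occurrence, first group: the 10 characters after the dash.
second_part = regexp_substr("62049-2162400111", r"-(.{10})", 1, 1, 1)

# Second occurrence, second group: the second digits-only group.
second_numeric = regexp_substr("AB1-200-XY-3000-ZZ", r"(-|^)(\d+)(-|$)", 1, 2, 2)

print(second_part)     # 2162400111
print(second_numeric)  # 3000
```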

Remove unnecessary Characters by using SQL query

Do you know how to remove the kinds of characters described below in a single query?
Note: I'm retrieving this data from an Access app and putting only the valid data into SQL.
select DISTINCT ltrim(rtrim(a.Company)) from [Legacy].[dbo].[Attorney] as a
This is the company name column. I need to keep only the values made up of string characters, and remove numbers-only rows, rows that mix numbers and characters, NULL, empty values, and everything else such as + and -.
Based on your extremely vague "rules" I am going to make a guess.
Maybe something like this will be somewhere close.
select DISTINCT ltrim(rtrim(a.Company))
from [Legacy].[dbo].[Attorney] as a
where LEN(ltrim(rtrim(a.Company))) > 1
and IsNumeric(a.Company) = 0
This keeps only entries that are at least 2 characters long and cannot be converted to a number; everything shorter or numeric is excluded.
This should select the rows you want to delete:
where company not like '%[a-zA-Z]%' or -- contains no letters at all
company like '%[^ a-zA-Z0-9.&]%' -- contains a character that is not allowed
The list of allowed characters in the second expression may not be complete.
If this works, then you can easily adapt it for a delete statement.
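Before turning the condition into a delete, it can help to check the rule against a few sample names. The sketch below does that in Python, with the re module standing in for SQL Server's bracket patterns. It reads the two conditions as "delete when there are no letters at all, or when any character falls outside the allowed set", which is my interpretation of the vague rules; treating NULL (None) as deletable and the sample company names are assumptions too.

```python
import re

HAS_LETTER = re.compile(r"[a-zA-Z]")        # like '%[a-zA-Z]%'
DISALLOWED = re.compile(r"[^ a-zA-Z0-9.&]")  # like '%[^ a-zA-Z0-9.&]%'

def should_delete(company):
    """Return True for values the rule would remove."""
    if company is None:  # assumption: NULL rows are unwanted as well
        return True
    name = company.strip()  # mirrors ltrim(rtrim(...))
    no_letters = HAS_LETTER.search(name) is None
    has_bad_char = DISALLOWED.search(name) is not None
    return no_letters or has_bad_char

samples = ["Smith & Sons Ltd.", "12345", "+-", "", None, "Acme+Partners"]
kept = [s for s in samples if not should_delete(s)]
print(kept)  # ['Smith & Sons Ltd.']
```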

What does \ (backslash) mean in an SQL query?

I have the following query
SELECT txt1 FROM T1 WHERE txt1 LIKE '_a\%'
will that result in answers that have any char+a+\+whatever?
is something like Pa\pe valid as a result?
are Ca% or _a% valid answers maybe?
how does \ behave normally inside an SQL query??
% is a wildcard character that matches zero or more characters in a LIKE clause.
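_ is a wildcard character that matches exactly one character in a LIKE clause.
\ is commonly used as an escape character: it indicates that the character directly following it should be interpreted literally (useful for single quotes, wildcard characters, etc.). Whether it works that way inside LIKE depends on the database: MySQL and PostgreSQL treat \ as the LIKE escape character by default, while SQL Server and Oracle only do so if you declare it with an ESCAPE clause, as in LIKE '_a\%' ESCAPE '\'. The examples below assume the backslash is acting as the escape character.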
For example:
SELECT txt1 FROM T1 WHERE txt1 LIKE '_a%'
will select records with txt1 values of 'xa1', 'xa taco', 'ya anything really', etc.
Now let's say you want to actually search for the percent sign. In order to do this you need a special character that indicates % should not be treated as a wildcard. For example:
SELECT txt1 FROM T1 WHERE txt1 LIKE '_a\%'
will select records with txt1 values of 'ba%' (but nothing else).
Finally, a LIKE clause would typically contain a wildcard (otherwise you could just use = instead of LIKE). So you might see a query containing \%%. Here the first percent sign would be treated as a literal percent sign, but the second would be interpreted as a wildcard. For example:
SELECT txt1 FROM T1 WHERE txt1 LIKE '_a\%%'
will select records with txt1 values of 'da%something else', 'fa% taco', 'ma% bunch of tacos', etc.
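The three patterns above can be tried out with Python's built-in sqlite3 module. This is a sketch under a few assumptions: SQLite is used as a stand-in for your database, the sample rows are invented, and because SQLite has no default LIKE escape character the backslash is declared explicitly with ESCAPE '\'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (txt1 TEXT)")
conn.executemany(
    "INSERT INTO T1 VALUES (?)",
    [("ba%",), ("xa1",), ("da%something else",), ("Pa\\pe",)],
)

def select(pattern):
    """Run the LIKE pattern with backslash declared as the escape character."""
    sql = "SELECT txt1 FROM T1 WHERE txt1 LIKE ? ESCAPE '\\' ORDER BY txt1"
    return [row[0] for row in conn.execute(sql, (pattern,))]

any_tail = select("_a%")           # % is a wildcard
literal_percent = select(r"_a\%")  # \% is a literal percent sign
percent_then_tail = select(r"_a\%%")  # literal %, then anything

print(any_tail)           # ['Pa\\pe', 'ba%', 'da%something else', 'xa1']
print(literal_percent)    # ['ba%']
print(percent_then_tail)  # ['ba%', 'da%something else']
```

Note that Pa\pe is only matched by the unescaped pattern; with the escape in place, the third character has to be an actual percent sign.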
The LIKE clause allows you to find text when you don't know the exact value, such as names beginning with JO would be
LIKE 'JO%'
However, if you are searching for something ending with a%, then you need to tell SQL to treat the % as part of what you are searching for. In your example, you are looking for a 3-character string where you don't care what the first letter is, but which has to end with a%.

VB6 Syntax Question

Can anyone tell me what this Asterisk(*) is for. ...tblpersonal where empid like '" & idNumber & "*'". What if I replace it with Percent sign(%), what would be the outcome?
The LIKE condition allows you to use wildcards in the where clause of an SQL statement. This allows you to perform pattern matching. The LIKE condition can be used in any valid SQL statement - select, insert, update, or delete.
The patterns that you can choose from are:
% allows you to match any string of any length (including zero length)
_ allows you to match on a single character
Next, let's explain how the _ wildcard works. Remember that the _ is looking for only one character.
For example,
SELECT * FROM suppliers
WHERE supplier_name like 'Sm_th';
This SQL statement would return all suppliers whose name is 5 characters long, where the first two characters are 'Sm' and the last two characters are 'th'. For example, it could return suppliers whose name is 'Smith', 'Smyth', 'Smath', 'Smeth', etc.
Here is another example,
SELECT * FROM suppliers
WHERE account_number like '12317_';
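In the same way, Access (Jet) SQL in its default ANSI-89 mode uses the asterisk (*) where other databases use the percent sign (%), and the question mark (?) in place of the underscore (_).
I hope this helps.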
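Here is a small runnable sketch of the two _ examples above, using Python's sqlite3 module as a stand-in database. The rows in the suppliers table are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE suppliers (supplier_name TEXT, account_number TEXT)")
conn.executemany(
    "INSERT INTO suppliers VALUES (?, ?)",
    [
        ("Smith", "123170"),
        ("Smyth", "12317"),
        ("Smooth", "1231700"),
        ("Smth", "123175"),
    ],
)

# '_' stands for exactly one character, so only 5-letter names qualify.
names = [
    row[0]
    for row in conn.execute(
        "SELECT supplier_name FROM suppliers "
        "WHERE supplier_name LIKE 'Sm_th' ORDER BY supplier_name"
    )
]

# '12317_' needs exactly one character after 12317: six characters in total.
accounts = [
    row[0]
    for row in conn.execute(
        "SELECT account_number FROM suppliers "
        "WHERE account_number LIKE '12317_' ORDER BY account_number"
    )
]

print(names)     # ['Smith', 'Smyth']
print(accounts)  # ['123170', '123175']
```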
The Percent (%) sign in SQL says "match any number of characters here". E.g. LIKE '%test' will match abctest, LIKE 'test%' will match testabc
The Asterisk character looks like it'll match a literal *, e.g. matching all empids ending with an asterisk (depending on the version of SQL - see below)
EDIT: See Microsoft Jet wildcards: asterisk or percentage sign? for a more in depth answer on * vs %
This is much more a SQL syntax question than a VB6 one. :-)
You haven't mentioned what database this is talking to (I assume it's talking to a DB). The asterisk is not generally special in SQL (or VB6 strings), and so that query will look for empid being like whatever's in your idNumber followed by an asterisk. Probably not what was intended. If you replace it with a %, you'll be looking for any empid that starts with whatever's in your idNumber variable. If the column is numeric, it will be converted to text before the comparison.
So for instance, if idNumber contains 100, say, and there are empid values in the database with the values 10, 100, 1000, and 10000, the query would match all but the first of those, since "100", "1000", and "10000" are all like "100%".
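The 10 / 100 / 1000 / 10000 example can be reproduced with a short sketch. It uses Python's sqlite3 module as a stand-in for a database where * is not a wildcard (so it does not show Jet's own behaviour); the table layout is assumed, and empid is stored as text to keep the comparison simple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblpersonal (empid TEXT)")  # assumed layout
conn.executemany(
    "INSERT INTO tblpersonal VALUES (?)",
    [("10",), ("100",), ("1000",), ("10000",)],
)

def find(id_number, wildcard):
    """Select empids matching id_number followed by the given wildcard."""
    pattern = id_number + wildcard
    sql = "SELECT empid FROM tblpersonal WHERE empid LIKE ? ORDER BY empid"
    return [row[0] for row in conn.execute(sql, (pattern,))]

with_percent = find("100", "%")   # % matches any trailing text
with_asterisk = find("100", "*")  # * is a literal asterisk here

print(with_percent)   # ['100', '1000', '10000']
print(with_asterisk)  # []
```

Passing the pattern as a bound parameter, as done here, also avoids building the SQL text by concatenating idNumber into the string.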