How to select values around .(dot) using sql - sql

I am running below query in Teradata :
sel requesttext from dbc.tables
where tablename='old_employee_table'
Result:
alter table DB_NAME.employee_table,no fallback ;
I want to get below result using SQL:
DB_NAME.employee_table
Requesttext can be:
create set table DB_NAME.employee_table;
DB Name and table can occur anywhere in the result. Since .(dot) is joining them that's why i want to split with .(dot).
Basically I need sql which can result me surrounding values of .(dot)
I want DBName and Tablename in result.

I'm not a Teradata person, but this should work for both strings given so far, as long as teradata's regexp_substr() supports positive look-behind and positive look-ahead assertions (I might have the Teradata syntax wrong, so a little tweaking may be needed):
SELECT REGEXP_SUBSTR(requesttext, '(?<= )(\w+\.\w+)(?=[,$]?)', 1, 1)
FROM dbc.tables
WHERE tablename='old_employee_table'
See the regex101 example. Hopefully it translates to Teradata easily.
The regex looks for and returns the words either side of and including the period, when preceded by a space, and followed by an optional comma or the end of the line.

You could do this with either regexp_substr() or strtok().
As Jamie Zawinski said:
Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems.
So I would go with the strtok() method. Also I'm lazy and regular expressions are hard.
Function strtok() takes three arguments:
The string being split
The delimiter to split the string
The number of the token to grab.
To get at the <database>.<table> from that string that is returned in your query, we can split by a space, grab the third token, then split that by a comma and grab the first token.
That would look like:
SELECT strtok(strtok(requestText,' ',3),',',1)
FROM dbc.tables
WHERE tablename='old_employee_table'

Related

Hive SQL regexp_extract (number)_(number)

I'm new to hiveSQL and I'm trying to extract a value from the column col_a from the data df which is in this format:
\\\"id\\\":\\\"101_12345\\\"
I only need to extract 101_12345, but underscore makes it hard to satisfy my need. I tried using regexp_extract(col_a, '(\\d+)[_](\\d+)') but only outputs 101.
Could I get some help with regexp? Thank you
Simple solution: You don't need the two brackets.
Here's a working solution: '\\d+[_]\\d+'
When you put tokens into parentheses, the regex engine will group its match together, separate from the complete match. So the final result will comprise the complete match, and two extra matches representing the one before and after the underscore. To avoid this, just remove the brackets as you don't really need them.
In the future, if you want to group a regex together but don't want the result to contain it separately, use a non-capturing group given by (?:).
Here's a demo of what your code resulted in, hosted at regex101.com

Regex not working in LIKE condition

I'm currently using Oracle SQL developer and am trying to write a query that will allow me to search for all fields that resemble a certain value but always differ from it.
SELECT last_name FROM employees WHERE last_name LIKE 'Do[^e]%';
So the result that I'm after would be: Give me all last names that start with 'Do' but are not 'Doe'.
I got the square brackets method from a general SQL basics book so I assume any SQL database should be able to run it.
This is my first post and I'd be happy to clarify if my question wasn't clear enough.
In Oracle's LIKE no regular expressions can be used. But you can use REGEXP_LIKE.
SELECT * FROM EMPLOYEES WHERE REGEXP_LIKE (Name, '^Do[^e]');
The ^ at the beginning of the pattern anchors it to the beginning of the compared string. In other words the string must start with the pattern to match. And there is no wildcard needed at the end, as there is no anchor for the end of the string (which would be $). And you seem to already know the meaning of [^e].

SQL Remove Substring From Query Results

I have a query that is returning data from a database. In a single field there is a rather long text comment with a segment, which is clearly defined with marking tags like !markerstart! and !markerend!. I would like to have a query return with the string segment between the two markers removed (and the markers removed too).
I would normally do this client-side after I get the data back, however, the problem is that the query is an INSERT query that gets it's data from a SELECT statement. I don't want the text segment to be stored in the archival/reporting table (working with an OLTP application here), so I need to find a way to get the SELECT statement to return exactly what is to be inserted, which, in this case, means getting the SELECT statement to strip out the unwanted phrase instead of doing it in post-processing client-side.
My only thought is to use some convoluted combination of SUBSTRING, CHARINDEX, and CONCAT, but I'm hoping there is a better way, but, based on this, I don't see how. Anyone have ideas?
Sample:
This is a long string of text in some field in a database that has a segment that needs to be removed. !markerstart! This is the segment that is to be removed. It's length is unknown and variable. !markerend! The part of this field that appears after the marker should remain.
Result:
This is a long string of text in some field in a database that has a segment that needs to be removed. The part of this field that appears after the marker should remain.
SOLUTION USING STUFF:
I really don't like how verbose this is, but I can put it in a function if I really need to. It isn't ideal, but it is easier and faster than a CLR routine.
SELECT STUFF(CAST(Description AS varchar(MAX)), CHARINDEX('!markerstart!', Description), CHARINDEX('!markerend!', Description) + 11 - CHARINDEX('!markerstart!', Description), '') AS Description
FROM MyTable
You may want to consider implementing a CLR user-defined function that returns the parsed data.
The following link demonstrates how to use a CLR UDF RegEx function for pattern matching and data extraction.
http://msdn.microsoft.com/en-us/magazine/cc163473.aspx
Regards,
You can use Stuff function or Replace function and replace your unwanted symbols with ''.
STUFF('EXP',START_POS,'NUMBER_OF_CHARS','REPLACE_EXP')

Insert double quotes into SQL output

After I run a query and view the output, for example
select * from People
My output is as follows
First Last Email
Ray Smith raysmith#whatever.itis
How would I export this data so that it looks as follows?
"Ray","Smith","raysmith#whatever.itis"
Or is there a way to do this within SQL to modify records to contain quotes?
Because when you export, it's going to include the commas anyway, right?
If the columns you're interested in are 128 characters or less, you could use the QUOTENAME function. Be careful with this as anything over 128 characters will return NULL.
SELECT QUOTENAME(First, '"'), QUOTENAME(Last, '"'), QUOTENAME(Email, '"')
FROM People
select '"'+first+'","'+last+'","'+email+'"'
from people
This is the kind of thing best done in code however, you shouldn't query for presentation.
select concat(“\"”,first,“\"”,“\"”,Last,“\"”,“\"”,Email,“\"”) as allInOne
Modifying the records to contain quotes would be a disaster; you don't use the data only for export. Further, in theory you'd have to deal with names like:
Thomas "The Alley Cat" O'Malley
which presents some problems.
In Standard SQL, you'd use doubled-up single quotes to enclose single quotes (with no special treatment for double quotes):
'"Thomas "The Alley Cat" O''Malley"'
Some DBMS allow you to use double quotes around strings (in Standard SQL, the double quotes indicate a 'delimited identifier'; SQL Server uses square brackets for that), in which case you might write the string as:
"""Thomas ""The Alley Cat"" O'Malley"""
Normally, though, your exporter tools provide CSV output formatting and your SQL statement does not need to worry about it. Embedded quotes make anything else problematic. Indeed, you should usually not make the DBMS deal with the formatting of the data.
This worked best for me
SELECT 'UPDATE [dbo].[DirTree1] SET FLD2UPDATE=',QUOTENAME(FLD2UPDATE,'''')
+' WHERE KEYFLD='+QUOTENAME(KEYFLD,'''')
FROM [dbo].[Table1]
WHERE SUBSTRING(FLD2UPDATE,1,2) = 'MX'
order by 2
If you are using MS SQL Server, try something like:
SELECT '"'||Table.Column||'"'
FROM Table
-- Note that the first 3 characters between "SELECT" and "||" are:
' " '
-- The characters are the same after "||" at the end... that way you get a " on each side of your value.

Regex for parsing SQL parameters

If I have a query such as SELECT * from authors where name = #name_param, is there a regex to parse out the parameter names (specifically the "name_param")?
Thanks
This is tricky because params can also occur inside quoted strings.
SELECT * FROM authors WHERE name = #name_param
AND string = 'don\'t use #name_param';
How would the regular expression know to use the first #name_param but not the second?
It's a problem that can be solved, but it's not practical to do it in a single regular expression. I had to handle this in Zend_Db, and what I did was first strip out all quoted strings and delimited identifiers, and then you can use regular expressions on the remainder.
You can see the code, because it's open-source.
See functions _stripQuoted() and _parseParameters().
https://github.com/zendframework/zf1/blob/136735e776f520b081cd374012852cb88cef9a88/library/Zend/Db/Statement.php#L200
https://github.com/zendframework/zf1/blob/136735e776f520b081cd374012852cb88cef9a88/library/Zend/Db/Statement.php#L140
Given you have no quoted strings or comments with parameters in them, the required regex would be quite trivial:
#([_a-zA-Z]+) /* match group 1 contains the name only */
I go with Bill Karwin's recommendation to be cautious, knowing that the naïve approach has its pitfalls. But if you kow the data you deal with, this regex would be all you need.