Oracle SQL -- remove partial duplicate from string

I have a table with a column containing strings that look like this:
static-text-here/1abcdefg1abcdefgpxq
In this string, 1abcdefg is repeated twice, so I want to remove one of the repetitions and return:
static-text-here/1abcdefgpxq
I can make no guarantees about the length of the repeated string. In pure SQL, how can this operation be performed?

regexp_replace('static-text-here/1abcdefg1abcdefgpxq', '/(.*)\1', '/\1')
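For readers who want to check the backreference pattern outside Oracle, here is a minimal Python sketch of the same idea using `re.sub`. It assumes Python's regex semantics are close enough to Oracle's for this particular pattern; the function name `drop_repeat` is illustrative and not part of the answer above.

```python
import re

def drop_repeat(value: str) -> str:
    # The group captures the longest text after a slash that is
    # immediately followed by a copy of itself; the replacement keeps
    # the slash and a single copy of that text.
    return re.sub(r"/(.*)\1", r"/\1", value)

result = drop_repeat("static-text-here/1abcdefg1abcdefgpxq")
print(result)  # static-text-here/1abcdefgpxq
```

Strings without an immediate repetition after the slash pass through unchanged, because the group can always fall back to matching the empty string.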

If you can guarantee a minimum length of the repeated string, something like this would work:
select REGEXP_REPLACE
(input,
'(.{10,})(.*?)\1+',
'\1') "Less one repetition"
from tablename tn where ...;
I believe this can be expanded to meet your case with some cleverness.

It seems to me that you might be pushing SQL beyond what it is capable of or designed for. Is it possible for you to handle this situation programmatically in the layer that lies beneath the data layer, where this type of thing can be handled more easily?

If the repeated substring is known in advance, the REPLACE function should be enough to solve the problem.
Test table:
CREATE TABLE test (text varchar(100));
INSERT INTO test (text) VALUES ('pxq');
INSERT INTO test (text) VALUES ('static-text-here/pxq');
INSERT INTO test (text) VALUES ('static-text-here/1abcdefgpxq');
INSERT INTO test (text) VALUES ('static-text-here/1abcdefg1abcdefgpxq');
Query:
SELECT text, REPLACE(text, '1abcdefg1abcdefg', '1abcdefg') AS text2
FROM test;
Result:
TEXT                                  TEXT2
pxq                                   pxq
static-text-here/pxq                  static-text-here/pxq
static-text-here/1abcdefgpxq          static-text-here/1abcdefgpxq
static-text-here/1abcdefg1abcdefgpxq  static-text-here/1abcdefgpxq
AFAIK the REPLACE function is not in the SQL99 standard, but most DBMSs support it. I tested it here, and it works with MySQL, PostgreSQL, SQLite, Oracle and MS SQL Server.
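As a quick way to reproduce the result above without a database server, the following sketch runs the same table and query through Python's built-in `sqlite3` module. SQLite is only one of the engines mentioned, so this does not confirm the behavior on Oracle; the in-memory database is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (text varchar(100))")
conn.executemany(
    "INSERT INTO test (text) VALUES (?)",
    [
        ("pxq",),
        ("static-text-here/pxq",),
        ("static-text-here/1abcdefgpxq",),
        ("static-text-here/1abcdefg1abcdefgpxq",),
    ],
)

# Same query as in the answer; ORDER BY rowid keeps insertion order.
rows = conn.execute(
    "SELECT text, REPLACE(text, '1abcdefg1abcdefg', '1abcdefg') AS text2 "
    "FROM test ORDER BY rowid"
).fetchall()

for original, collapsed in rows:
    print(f"{original:38}{collapsed}")
conn.close()
```

Only the last row changes, since it is the only one containing the doubled literal.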

Related

SQL Substring REGEX pattern matching (TERADATA)

I have a column, say LINES, with the string patterns below. I want to extract the date from each string. For example, for each line I need the date, i.e. 20201123 or 20201124, whichever the case may be.
Since the dates are in different positions, I can't really use substring for this. How do I go about this? Is there a simpler regex method within substring that I can apply?
Here is a simple reproducible example for testing.
create volatile table TEST
(LINES VARCHAR(1000) CHARACTER SET LATIN NOT CASESPECIFIC)
ON COMMIT PRESERVE ROWS;
insert into TEST values('path/to/file/OVERALL_GOTO_Datas.20201123.dat');
insert into TEST values('path/to/file/endartstmov20201124.20201124.dat');
insert into TEST values('path/to/file/TESTDEV20201123.20201123.5.0014.CHK.dat');
insert into TEST values('path/to/file/DEVTOTES20201124.20201124.5.0109.CHK.dat');
insert into TEST values('path/to/file/STORE_PARTNER.20201124.20201124.0.0501.CHK.dat');
SELECT * FROM TEST;
Appreciate your responses. Thanks.
Using Teradata's REGEXP_SUBSTR, you should be able to use this regex:
SELECT REGEXP_SUBSTR(LINES, '(?:\.([0-9]{8})\.)')
See: https://regex101.com/r/WRqEmY/2
Another way is with regexp_extract (https://teradata.github.io/presto/docs/148t/functions/regexp.html):
SELECT regexp_extract(LINES, '(?:\.([0-9]{8})\.)', 1)
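To see what the capture group picks out of each sample row, here is a small Python sketch that applies the same "eight digits between two dots" pattern to the test data from the question. It checks the regex only, not Teradata's or Presto's function behavior; `extract_date` is a hypothetical helper name.

```python
import re

# Eight digits enclosed by literal dots; group 1 is the date itself.
DATE_BETWEEN_DOTS = re.compile(r"\.([0-9]{8})\.")

def extract_date(line):
    match = DATE_BETWEEN_DOTS.search(line)
    return match.group(1) if match else None

lines = [
    "path/to/file/OVERALL_GOTO_Datas.20201123.dat",
    "path/to/file/endartstmov20201124.20201124.dat",
    "path/to/file/TESTDEV20201123.20201123.5.0014.CHK.dat",
    "path/to/file/DEVTOTES20201124.20201124.5.0109.CHK.dat",
    "path/to/file/STORE_PARTNER.20201124.20201124.0.0501.CHK.dat",
]
dates = [extract_date(line) for line in lines]
print(dates)
```

Requiring a dot on both sides is what skips the date glued to the file name (as in TESTDEV20201123) and picks up the dot-delimited one instead.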

Get records matching regex in Ms-Sql

I am using the following condition to get any records that begin with any character, have a bunch of 0s, and end with a number (1 in this case).
where column like '_%[0]1'
But the issue is that it also returns values such as d0101, which I don't want. I just want d0001 or r0001. Can I use LIKE to match the pattern exactly, not partially?
Any other options in ms-sql?
SQL Server does not really support proper regular expressions, but you can build the search condition you want like this:
where column like '_%1' and column not like '_%[^0]%1'
The second condition will exclude all cases where you have a character other than 0 in the middle of the string.
It will allow strings of all possible lengths, provided they start with an arbitrary character, then have any number of 0s and finish with a 1. All other strings will not satisfy the where clause.
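Read together, the two conditions accept exactly the strings made of one arbitrary character, then zero or more 0s, then a final 1. The sketch below expresses that rule as a Python regular expression so it can be checked against sample values; it is an equivalent formulation under that reading, not T-SQL, and the helper name `matches` is illustrative.

```python
import re

# One arbitrary character, any number of zeros, then a closing 1.
PATTERN = re.compile(r".0*1", re.DOTALL)

def matches(value: str) -> bool:
    # fullmatch requires the whole string to fit the pattern,
    # which is what the LIKE / NOT LIKE pair enforces.
    return PATTERN.fullmatch(value) is not None

for candidate in ["d0001", "r0001", "d0101", "d0002"]:
    print(candidate, matches(candidate))
```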
create table tst(t varchar(10));
insert into tst values('d0101');
insert into tst values('d0001');
insert into tst values('r0001');
select * from tst where PATINDEX('%00%1', t)>0
or
select * from tst where t like '%00%1'
You use the _ to say that you don't care what char is there (single char) and then use the rest of the string you want:
-- The table variable name @t is a placeholder; the original name was lost.
DECLARE @t TABLE (val VARCHAR(100))
INSERT INTO @t
VALUES
('d0001'),
('f0001'),
('e0005'),
('e0001')
SELECT *
FROM @t
WHERE val LIKE '_0001'
This code only really handles your two simple examples. If it is more complex, add it to your post.

Sql LIKE in Arabic?

Consider this sample:
CREATE TABLE #tempTable
(name nvarchar(MAX))
INSERT INTO #tempTable VALUES (N'إِبْرَاهِيمُ'), (N'إبراهيم')
SELECT * FROM #tempTable WHERE name = N'إبراهيم'
SELECT * FROM #tempTable WHERE name LIKE N'%إبراهيم%'
Both selects only return إبراهيم but not إِبْرَاهِيمُ. How can I make it ignore these non-alphabetical characters in search? In other words, I want to get all similar words, including those with non-alpha characters.
You do not do it. Simple. There is nothing specific to Arabic here; you have the same problem in English.
How can I make it ignore these non-alphabetical characters in search?
Like numbers? NOT AT ALL. Not with "standard SQL Syntax".
If you can, put a full text index on the field. And use the full text search syntax in your query. This is what it is for.
There is a thread on dba.stackexchange.com that has a workaround for this issue.
https://dba.stackexchange.com/questions/14153/treating-certain-arabic-characters-as-identical
Unfortunately, there is no case sensitive Arabic language, and of course, both select statements will return 'إبراهيم' because they were ordered to do that.
This is a problem we have been suffering from for a very long time, people always look for 'احمد' when it's written 'أحمد' and they won't find it.
This solution works in PHP (it uses the Transliterator class from the intl extension):
$yourChaine = \Transliterator::create('NFC; [:Nonspacing Mark:] Remove; NFC')
->transliterate($yourChaine);
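The same idea (normalize, drop nonspacing marks, normalize again) can be sketched in Python with the standard `unicodedata` module. This is an approximation of the transliterator rule above, not a port verified against ICU; `strip_marks` is a hypothetical name. It only helps if both the stored value and the search term are passed through it, for example by keeping a stripped copy of the column to search against.

```python
import unicodedata

def strip_marks(text: str) -> str:
    # Compose first so precomposed letters (such as alef with hamza)
    # stay intact, then drop every nonspacing mark (category "Mn"),
    # which covers the Arabic short-vowel diacritics.
    composed = unicodedata.normalize("NFC", text)
    without_marks = "".join(
        ch for ch in composed if unicodedata.category(ch) != "Mn"
    )
    return unicodedata.normalize("NFC", without_marks)

# Vocalized and plain spellings, written as code points for clarity.
vocalized = "\u0625\u0650\u0628\u0652\u0631\u064e\u0627\u0647\u0650\u064a\u0645\u064f"
plain = "\u0625\u0628\u0631\u0627\u0647\u064a\u0645"
print(strip_marks(vocalized) == plain)
```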

How do I use ESCAPE in SQLite?

Trying this answer and not having any luck:
I am using the SQLite Database browser (built with 3.3.5 of the SQLite engine) to execute this query:
SELECT columnX FROM MyTable
WHERE columnX LIKE '%\%16' ESCAPE '\'
In columnX I have a row with the data: sampledata%167
I execute the statement and get no data returned but no error either?
http://www.sqlite.org/lang_expr.html
(SQLite with C API)
I think the problem is that you are missing a % from the end of the pattern.
'%\%16%'
Also, backslashes often cause confusion in various programming languages and tools, especially if there are multiple levels of parsing involved before the query gets to the database. To simplify things you could try a different escape character instead:
WHERE columnX LIKE '%!%16%' ESCAPE '!'
Your sample data ends in %167, but your query only matches things which end in %16. You may need to change your query to have a trailing % or end with 167 as your data does, depending on your needs.
When I try this in SQLite, it works just fine:
sqlite> create table foo (bar text);
sqlite> insert into foo (bar) values ('sampledata%167');
sqlite> select * from foo where bar like '%\%16' escape '\';
sqlite> select * from foo where bar like '%\%167' escape '\';
sampledata%167
You may want to try experimenting with this in the SQLite shell, on a simple example table like I show, to see if your problem still exists there.
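The same experiment can be scripted with Python's built-in `sqlite3` module, which makes it easy to compare the pattern variants discussed in this thread side by side. The table and column names follow the shell session above; `count_matches` is a hypothetical helper, and the bundled SQLite version is newer than 3.3.5, so results on that older engine are not confirmed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (bar TEXT)")
conn.execute("INSERT INTO foo (bar) VALUES ('sampledata%167')")

def count_matches(where_clause):
    # Illustration only: the clause is a fixed literal, not user input.
    query = f"SELECT COUNT(*) FROM foo WHERE {where_clause}"
    return conn.execute(query).fetchone()[0]

# Raw strings keep the backslash intact on its way to SQLite.
no_trailing = count_matches(r"bar LIKE '%\%16' ESCAPE '\'")
with_trailing = count_matches(r"bar LIKE '%\%16%' ESCAPE '\'")
bang_escape = count_matches("bar LIKE '%!%16%' ESCAPE '!'")

print(no_trailing, with_trailing, bang_escape)
```

The first pattern finds nothing because the value ends in 7 rather than in %16; adding the trailing % fixes that with either escape character.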

How can I store sql statements in an oracle table?

We need to store a select statement in a table
select * from table where col = 'col'
But the single quotes mess up the insert statement.
Is it possible to do this somehow?
From Oracle 10G on there is an alternative to doubling up the single quotes:
insert into mytable (mycol) values (q'"select * from table where col = 'col'"');
I used a double-quote character ("), but you can specify a different one e.g.:
insert into mytable (mycol) values (q'#select * from table where col = 'col'#');
The syntax of the literal is:
q'<special character><your string><special character>'
It isn't obviously more readable in a small example like this, but it pays off with large quantities of text e.g.
insert into mytable (mycol) values (
q'"select empno, ename, 'Hello' message
from emp
where job = 'Manager'
and name like 'K%'"'
);
How are you performing the insert? If you are using any sort of provider on the front end, then it should format the string for you so that quotes aren't an issue.
Basically, create a parameterized query and assign the value of the SQL statement to the parameter class instance, and let the db layer take care of it for you.
You can either use two quotes '' to represent a single quote ', or (with 10g+) you can use the new q-quote notation:
SQL> select ' ''foo'' ' txt from dual;
TXT
-------
'foo'
SQL> select q'$ 'bar' $' txt from dual;
TXT
-------
'bar'
If you are using a programming language such as Java or C#, you can use prepared (parameterized) statements to put your values in and retrieve them.
If you are in SQL*Plus you can escape the apostrophe like this:
insert into my_sql_table (sql_command)
values ('select * from table where col = ''col''');
Single quotes are escaped by duplicating them:
INSERT INTO foo (sql) VALUES ('select * from table where col = ''col''')
However, most database libraries provide bind parameters so you don't need to care about these details:
INSERT INTO foo (sql) VALUES (:sql)
... and then you assign a value to :sql.
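As an illustration of the bind-parameter approach, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for an Oracle driver. The named placeholder `:sql` works in the same spirit, but the driver, connection, and in-memory table here are assumptions made for the example, not Oracle specifics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (sql TEXT)")

# The statement to store, single quotes included, with no manual escaping.
statement = "select * from table where col = 'col'"

# The driver passes the value separately from the SQL text,
# so the quotes inside it never reach the parser.
conn.execute("INSERT INTO foo (sql) VALUES (:sql)", {"sql": statement})

stored = conn.execute("SELECT sql FROM foo").fetchone()[0]
print(stored)
conn.close()
```

The value read back is identical to the original string, quotes and all.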
Don't store SQL statements in a database!!
Store SQL views in a database. Put them in a schema if you have to make them cleaner. Nothing good ever comes of storing SQL statements in a database; short of logging, it is categorically a bad idea.
Also if you're using 10g, and you must do this: do it right! Per the FAQ
Use the 10g Quoting mechanism:
Syntax
q'[QUOTE_CHAR]Text[QUOTE_CHAR]'
Make sure that the QUOTE_CHAR doesn't exist in the text.
SELECT q'{This is Orafaq's 'quoted' text field}' FROM DUAL;