Trying this answer and not having any luck:
I am using the SQLite Database browser (built with 3.3.5 of the SQLite engine) to execute this query:
SELECT columnX FROM MyTable
WHERE columnX LIKE '%\%16' ESCAPE '\'
In columnX I have a row with the data: sampledata%167
I execute the statement and get no data returned, but no error either.
http://www.sqlite.org/lang_expr.html
(SQLite with C API)
I think the problem is that you are missing a % from the end of the pattern.
'%\%16%'
Also, backslashes often cause confusion in various programming languages and tools, especially if there are multiple levels of parsing involved before the query gets to the database. To simplify things you could try a different escape character instead:
WHERE columnX LIKE '%!%16%' ESCAPE '!'
Your sample data ends in %167, but your query only matches things which end in %16. You may need to change your query to have a trailing % or end with 167 as your data does, depending on your needs.
When I try this in SQLite, it works just fine:
sqlite> create table foo (bar text);
sqlite> insert into foo (bar) values ('sampledata%167');
sqlite> select * from foo where bar like '%\%16' escape '\';
sqlite> select * from foo where bar like '%\%167' escape '\';
sampledata%167
You may want to try experimenting with this in the SQLite shell, on a simple example table like I show, to see if your problem still exists there.
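If the SQLite shell is not handy, the same experiment can be sketched with Python's built-in sqlite3 module. The table foo and its single row come from the session above; the helper name like_with_escape is made up for illustration, and the SQLite engine bundled with Python is much newer than 3.3.5, so treat this as a check of the pattern logic rather than of that specific version.

```python
import sqlite3

def like_with_escape(pattern, escape_char="\\"):
    """Return the rows of a throwaway table that match a LIKE pattern."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE foo (bar TEXT)")
    conn.execute("INSERT INTO foo (bar) VALUES ('sampledata%167')")
    # Binding the pattern and the escape character as parameters avoids
    # any extra layer of backslash handling inside the SQL text itself.
    rows = conn.execute(
        "SELECT bar FROM foo WHERE bar LIKE ? ESCAPE ?",
        (pattern, escape_char),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]

print(like_with_escape(r"%\%16"))       # value must end in "%16": no rows
print(like_with_escape(r"%\%16%"))      # trailing % allows the final "7"
print(like_with_escape("%!%16%", "!"))  # same pattern with "!" as escape
```

Only the last two calls return the sampledata%167 row, which matches the explanation that the original pattern lacks a trailing %.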
I have a large list of SQL commands such as
SELECT * FROM TEST_TABLE
INSERT .....
UPDATE .....
SELECT * FROM ....
etc. My goal is to parse this list into a set of results so that I can easily determine a good count of how many of these statements are SELECT statements, how many are UPDATES, etc.
so I would be looking at a result set such as
SELECT 2
INSERT 1
UPDATE 1
...
I figured I could do this with regex, but I'm a bit lost other than simply looking at every string and comparing against 'SELECT' as a prefix, which can run into multiple issues. Is there any other way to do this using regex?
You can add the SQL statements to a table and run them through a SQL query. If the SQL text is in a column called SQL_TEXT, you can get the SQL command type using this:
upper(regexp_substr(trim(regexp_replace(SQL_TEXT, '\\s', ' ')),
'^([\\w\\-]+)')) as COMMAND_TYPE
You'll need to do some cleanup to create a column that indicates the type of statement you have. The rest is just basic aggregation:
with cte as
(select *, trim(lower(split_part(regexp_replace(col, '\\s', ' '),' ',1))) as statement
from t)
select statement, count(*) as freq
from cte
group by statement;
SQL is a language and needs a parser to turn it from text into a structure. Regular expressions can only do part of the work (such as lexing).
Regular Expression Vs. String Parsing
You will have to limit your ambition if you want to restrict yourself to using regular expressions.
Still you can get some distance if you so want. A quick search found this random example of tokenizing MySQL SQL statements using regex https://swanhart.livejournal.com/130191.html
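As a rough illustration of how far a regex-only approach gets, here is a minimal Python sketch that tallies statements by their first keyword. The function name count_statement_types is hypothetical, and the sketch assumes one statement per list entry with no leading comments. A statement that starts with WITH, or with a comment, would be miscounted, which is exactly the limitation described above.

```python
import re
from collections import Counter

# First run of letters after optional leading whitespace.
FIRST_WORD = re.compile(r"^\s*([A-Za-z]+)")

def count_statement_types(statements):
    """Count statements by leading keyword (SELECT, INSERT, ...)."""
    counts = Counter()
    for sql in statements:
        match = FIRST_WORD.match(sql)
        keyword = match.group(1).upper() if match else "UNKNOWN"
        counts[keyword] += 1
    return counts

statements = [
    "SELECT * FROM TEST_TABLE",
    "insert into t values (1)",
    "  UPDATE t SET a = 1",
    "SELECT * FROM other",
]
for keyword, freq in count_statement_types(statements).most_common():
    print(keyword, freq)
```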
I have a table with a column with strings that look like this:
static-text-here/1abcdefg1abcdefgpxq
From this string 1abcdefg is repeated twice, so I want to remove that partial string, and return:
static-text-here/1abcdefgpxq
I can make no guarantees about the length of the repeat string. In pure SQL, how can this operation be performed?
regexp_replace('static-text-here/1abcdefg1abcdefgpxq', '/(.*)\1', '/\1')
If you can guarantee a minimum length of the repeated string, something like this would work:
select REGEXP_REPLACE
(input,
'(.{10,})(.*?)\1+',
'\1') "Less one repetition"
from tablename tn where ...;
I believe this can be expanded to meet your case with some cleverness.
It seems to me that you might be pushing SQL beyond what it is capable of or designed for. Is it possible for you to handle this situation programmatically in the layer that sits next to the data layer, where this type of thing can be handled more easily?
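If the cleanup does move into application code, the backreference pattern from the regexp_replace answer carries over directly. This is a minimal Python sketch, not a tested replacement for any particular database function; the name drop_one_repetition is made up for illustration, and it assumes the repeated part starts immediately after the first slash.

```python
import re

def drop_one_repetition(value):
    """Collapse 'XX' into 'X' when X is repeated right after the first '/'."""
    # (.*) is greedy, so the longest immediately repeated prefix wins;
    # with no repetition the group matches the empty string and the
    # value is returned unchanged.
    return re.sub(r"/(.*)\1", r"/\1", value, count=1)

print(drop_one_repetition("static-text-here/1abcdefg1abcdefgpxq"))
print(drop_one_repetition("static-text-here/pxq"))
```

Note that the pattern makes no guarantee about a minimum length, so a value such as a/xxy would also be collapsed to a/xy.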
The REPLACE function should be enough to solve the problem.
Test table:
CREATE TABLE test (text varchar(100));
INSERT INTO test (text) VALUES ('pxq');
INSERT INTO test (text) VALUES ('static-text-here/pxq');
INSERT INTO test (text) VALUES ('static-text-here/1abcdefgpxq');
INSERT INTO test (text) VALUES ('static-text-here/1abcdefg1abcdefgpxq');
Query:
SELECT text, REPLACE(text, '1abcdefg1abcdefg', '1abcdefg') AS text2
FROM test;
Result:
TEXT TEXT2
pxq pxq
static-text-here/pxq static-text-here/pxq
static-text-here/1abcdefgpxq static-text-here/1abcdefgpxq
static-text-here/1abcdefg1abcdefgpxq static-text-here/1abcdefgpxq
AFAIK the REPLACE function is not in the SQL99 standard, but most DBMSs support it. I tested it here, and it works with MySQL, PostgreSQL, SQLite, Oracle and MS SQL Server.
Can someone explain this to me? I have two queries below with their results.
query:
select * from tbl where contains([name], '"*he*" AND "*ca*"')
result-set:
Hertz Car Rental
Hemingyway's Cantina
query:
select * from tbl where contains([name], '"*he*" AND "*ar*"')
result-set:
nothing
The first query is what I would expect, however I would expect the second query to return "Hertz Car Rental". Am I fundamentally misunderstanding how '*' works in full-text searching?
Thanks!
I think SQL Server is interpreting your strings as prefix_terms. The asterisk is not a plain old wildcard specifier. Fulltext and Contains are word oriented. For what you are trying to do, you would be better off using plain old LIKE instead of CONTAINS.
http://msdn.microsoft.com/en-us/library/ms187787.aspx
"*" only works as a suffix. If you use it as a prefix, the table needs to be scanned no matter what and the index is useless. At that point, you might as well do
Select * From Table Where (Name Like '%he%') And (Name Like '%ar%')
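To make the difference concrete, here is a small Python sketch that contrasts the two behaviours on the two names from the question. The function prefix_term_match is only a rough emulation of word-oriented prefix matching (it is an assumption for illustration, not how SQL Server's full-text engine is implemented), while like_match runs real LIKE queries against an in-memory SQLite table.

```python
import sqlite3

NAMES = ["Hertz Car Rental", "Hemingyway's Cantina"]

def prefix_term_match(name, prefixes):
    """Rough emulation: every prefix must start some word of the name."""
    words = name.lower().split()
    return all(any(word.startswith(p) for word in words) for p in prefixes)

def like_match(names, fragments):
    """Substring search: every fragment may appear anywhere in the name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (name TEXT)")
    conn.executemany("INSERT INTO tbl (name) VALUES (?)", [(n,) for n in names])
    where = " AND ".join("name LIKE ?" for _ in fragments)
    rows = conn.execute(
        f"SELECT name FROM tbl WHERE {where} ORDER BY name",
        [f"%{fragment}%" for fragment in fragments],
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]

print([n for n in NAMES if prefix_term_match(n, ["he", "ca"])])  # both names
print([n for n in NAMES if prefix_term_match(n, ["he", "ar"])])  # no word starts with "ar"
print(like_match(NAMES, ["he", "ar"]))                           # "ar" inside "Car" matches
```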
I would try replacing * with % to see how it goes.
select * from tbl where contains([name], '"%he%" AND "%ar%"')
We need to store a select statement in a table
select * from table where col = 'col'
But the single quotes mess the insert statement up.
Is it possible to do this somehow?
From Oracle 10G on there is an alternative to doubling up the single quotes:
insert into mytable (mycol) values (q'"select * from table where col = 'col'"');
I used a double-quote character ("), but you can specify a different one e.g.:
insert into mytable (mycol) values (q'#select * from table where col = 'col'#');
The syntax of the literal is:
q'<special character><your string><special character>'
It isn't obviously more readable in a small example like this, but it pays off with large quantities of text e.g.
insert into mytable (mycol) values (
q'"select empno, ename, 'Hello' message
from emp
where job = 'Manager'
and name like 'K%'"'
);
How are you performing the insert? If you are using any sort of provider on the front end, then it should format the string for you so that quotes aren't an issue.
Basically, create a parameterized query and assign the value of the SQL statement to the parameter class instance, and let the db layer take care of it for you.
You can either use two quotes '' to represent a single quote ' or (with 10g+) you can also use a new notation:
SQL> select ' ''foo'' ' txt from dual;
TXT
-------
'foo'
SQL> select q'$ 'bar' $' txt from dual;
TXT
-------
'bar'
If you are using a programming language such as Java or C#, you can use prepared (parameterized) statements to put your values in and retrieve them.
If you are in SQLPlus you can escape the apostrophe like this:
insert into my_sql_table (sql_command)
values ('select * from table where col = ''col''');
Single quotes are escaped by duplicating them:
INSERT INTO foo (sql) VALUES ('select * from table where col = ''col''')
However, most database libraries provide bind parameters so you don't need to care about these details:
INSERT INTO foo (sql) VALUES (:sql)
... and then you assign a value to :sql.
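As a minimal sketch of the bind-parameter approach, here is how it looks with Python's built-in sqlite3 module, which accepts the same :sql named-placeholder style. The question is about Oracle, so the driver and connection setup are assumptions for illustration; the table foo and column sql follow the example above, and store_statement is a hypothetical helper name.

```python
import sqlite3

def store_statement(conn, sql_text):
    """Insert a SQL string as data; the driver handles the quoting."""
    conn.execute("INSERT INTO foo (sql) VALUES (:sql)", {"sql": sql_text})

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (sql TEXT)")

# The embedded single quotes need no doubling or special literal syntax.
store_statement(conn, "select * from table where col = 'col'")

stored = conn.execute("SELECT sql FROM foo").fetchone()[0]
print(stored)
```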
Don't store SQL statements in a database!!
Store SQL views in a database. Put them in a schema if you have to make them cleaner. Nothing good will ever come of storing SQL statements in a database; short of logging, this is categorically a bad idea.
Also if you're using 10g, and you must do this: do it right! Per the FAQ
Use the 10g Quoting mechanism:
Syntax
q'[QUOTE_CHAR]Text[QUOTE_CHAR]'
Make sure that the QUOTE_CHAR doesn't exist in the text.
SELECT q'{This is Orafaq's 'quoted' text field}' FROM DUAL;
I have the following SQL query:
SELECT Phrases.*
FROM Phrases
WHERE (((Phrases.phrase) Like "*ing aids*")
AND ((Phrases.phrase) Not Like "*getting*")
AND ((Phrases.phrase) Not Like "*contracting*"))
AND ((Phrases.phrase) Not Like "*preventing*"); //(etc.)
Now, if I were using RegEx, I might bunch all the Nots into one big (getting|contracting|preventing), but I'm not sure how to do this in SQL.
Is there a way to render this query more legibly/elegantly?
Just by removing redundant stuff and using a consistent naming convention your SQL looks way cooler:
SELECT *
FROM phrases
WHERE phrase LIKE '%ing aids%'
AND phrase NOT LIKE '%getting%'
AND phrase NOT LIKE '%contracting%'
AND phrase NOT LIKE '%preventing%'
You talk about regular expressions. Some DBMS do have it: MySQL, Oracle... However, the choice of either syntax should take into account the execution plan of the query: "how quick it is" rather than "how nice it looks".
With MySQL, you're able to use regular expression where-clause parameters:
SELECT something FROM table WHERE column REGEXP 'regexp'
So if that's what you're using, you could write a regular expression string that is possibly a bit more compact than your four LIKE criteria. It may not be as easy for other people to see what the query is doing, however.
It looks like SQL Server offers a similar feature.
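To show what the single-alternation version of the exclusions might look like, here is a sketch using SQLite through Python. SQLite parses the REGEXP operator but ships without an implementation, so the sketch registers one backed by Python's re module; that registration, the lowercase table name and the sample rows are all assumptions for illustration, and MySQL's own REGEXP syntax and semantics may differ in detail.

```python
import re
import sqlite3

def regexp(pattern, value):
    """Backing function for SQLite's REGEXP operator (pattern comes first)."""
    if value is None:
        return 0
    return int(re.search(pattern, value) is not None)

conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE phrases (phrase TEXT)")
conn.executemany(
    "INSERT INTO phrases (phrase) VALUES (?)",
    [("hearing aids",), ("getting hearing aids",),
     ("preventing aids",), ("buying aids online",)],
)

# One alternation replaces the three NOT LIKE conditions.
rows = conn.execute(
    "SELECT phrase FROM phrases "
    "WHERE phrase LIKE '%ing aids%' "
    "AND phrase NOT REGEXP 'getting|contracting|preventing' "
    "ORDER BY phrase"
).fetchall()
print([row[0] for row in rows])
```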
Since it sounds like you're building this as you go to mine your data, here's something that you could consider:
CREATE TABLE Includes (phrase VARCHAR(50) NOT NULL)
CREATE TABLE Excludes (phrase VARCHAR(50) NOT NULL)
INSERT INTO Includes VALUES ('%ing aids%')
INSERT INTO Excludes VALUES ('%getting%')
INSERT INTO Excludes VALUES ('%contracting%')
INSERT INTO Excludes VALUES ('%preventing%')
SELECT
*
FROM
Phrases P
WHERE
EXISTS (SELECT * FROM Includes I WHERE P.phrase LIKE I.phrase) AND
NOT EXISTS (SELECT * FROM Excludes E WHERE P.phrase LIKE E.phrase)
You are then always just running the same query and you can simply change what's in the Includes and Excludes tables to refine your searches.
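The Includes/Excludes idea can be tried end to end with an in-memory SQLite database. The sketch below keeps the table and column names from the answer; the helper filter_phrases and the sample phrases are made up for illustration, and behaviour on other engines (for example LIKE case sensitivity) may differ.

```python
import sqlite3

def filter_phrases(phrases, includes, excludes):
    """Return phrases matching any include pattern and no exclude pattern."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Phrases  (phrase VARCHAR(50) NOT NULL);
        CREATE TABLE Includes (phrase VARCHAR(50) NOT NULL);
        CREATE TABLE Excludes (phrase VARCHAR(50) NOT NULL);
    """)
    conn.executemany("INSERT INTO Phrases VALUES (?)", [(p,) for p in phrases])
    conn.executemany("INSERT INTO Includes VALUES (?)", [(p,) for p in includes])
    conn.executemany("INSERT INTO Excludes VALUES (?)", [(p,) for p in excludes])
    # The query text never changes; only the pattern tables do.
    rows = conn.execute("""
        SELECT P.phrase
        FROM Phrases P
        WHERE EXISTS (SELECT 1 FROM Includes I WHERE P.phrase LIKE I.phrase)
          AND NOT EXISTS (SELECT 1 FROM Excludes E WHERE P.phrase LIKE E.phrase)
        ORDER BY P.phrase
    """).fetchall()
    conn.close()
    return [row[0] for row in rows]

sample = ["hearing aids", "getting hearing aids", "preventing aids", "buying aids online"]
print(filter_phrases(sample, ["%ing aids%"],
                     ["%getting%", "%contracting%", "%preventing%"]))
```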
Depending on what SQL server you are using, it may support regex itself. For example, Google searches show that SQL Server, Oracle, and MySQL all support regex.
You could push all your negative criteria into a short-circuiting CASE expression (works in SQL Server, not sure about MS Access).
SELECT *
FROM phrases
WHERE phrase LIKE '%ing aids%'
AND CASE
WHEN phrase LIKE '%getting%' THEN 2
WHEN phrase LIKE '%contracting%' THEN 2
WHEN phrase LIKE '%preventing%' THEN 2
ELSE 1
END = 1
On the "more efficient" side, you need to find some criteria that allow you to avoid reading the entire Phrases column. Double-sided wildcard criteria (LIKE '%x%') are bad. Right-sided wildcard criteria (LIKE 'x%') are good.