CONTAINS sql function doesn't match underscores - sql

I have some table 'TableName' like Id,Name:
1 | something
2 | _something
3 | something like that
4 | some_thing
5 | ...
I want to get all rows from this table where name containes 'some'.
I have 2 ways:
SELECT * FROM TableName WHERE Name like '%some%'
Result is table :
1 | something
2 | _something
3 | something like that
4 | some_thing
But if I use CONTAINS function
SELECT * FROM TableName WHERE CONTAINS(Name,'"*some*"')
I get only
1 | something
3 | something like that
What should I do to make CONTAINS function work properly?

The last time I looked (admittedly SQL Server 2000) CONTAINS didn't support wildcard matching at the beginning of words, only at the end. Also, you might need to check your noise files to see if the "_" character is being ignored.
Also see
How do you get leading wildcard full-text searches to work in SQL Server?

http://doc.ddart.net/mssql/sql70/ca-co_15.htm
If you read this article you will see that * means prefix this means that word must start with this, but like means the word contains key phrase.
Best Regards,
Iordan

Try this:
SELECT * FROM TableName WHERE CONTAINS(Name,'some')

Related

SQL Server Full Text Search to find containing characters

I have a table with a column Document that is FullText Index.
Let say I have this in this table:
| ID | Document |
| 1 | WINTER SUMMER SPRING OTHER |
My requirement is to find rows that contains 'ER'.
For this I am querying like this:
SELECT TOP 100
[FullTextSearch].[Document], [FullTextSearch].[ID]
FROM
[FullTextSearch]
WHERE
CONTAINS(Document, '"*ER*"')
But this is not working.
Please suggest what should be best way to do this using FullTextSearch.
I am expecting id 1 should be returned.
You can user LIKE operator to find the value.
The LIKE operator is used in a WHERE clause to search for a specified pattern in a column.
There are two wildcards used in conjunction with the LIKE operator:
% - The percent sign represents zero, one, or multiple characters
_ - The underscore represents a single character
Syntax,
SELECT column1, column2, ...
FROM table_name
WHERE columnN LIKE pattern;
This query can help to find the result.
SELECT Document,ID FROM FullTextSearch
WHERE Document LIKE '%ER%';
It's a wildcard query...This should work.
SELECT TOP 100
[FullTextSearch].[Document], [FullTextSearch].[ID]
FROM
[FullTextSearch]
WHERE
Document like '%ER%'
========OR=============
SELECT TOP 100
[FullTextSearch].[Document], [FullTextSearch].[ID]
FROM
[FullTextSearch]
WHERE
CONTAINS(Document, '%ER%')

Postgres matching against an array of regular expressions

My client wants the possibility to match a set of data against an array of regular expressions, meaning:
table:
name | officeId (foreignkey)
--------
bob | 1
alice | 1
alicia | 2
walter | 2
and he wants to do something along those lines:
get me all records of offices (officeId) where there is a member with
ANY name ~ ANY[.*ob, ali.*]
meaning
ANY of[alicia, walter] ~ ANY of [.*ob, ali.*] results in true
I could not figure it out by myself sadly :/.
Edit
The real Problem was missing form the original description:
I cannot use select disctinct officeId .. where name ~ ANY[.*ob, ali.*], because:
This application, stored data in postgres-xml columns, which means i do in fact have (after evaluating xpath('/data/clients/name/text()'))::text[]):
table:
name | officeId (foreignkey)
-----------------------------------------
[bob, alice] | 1
[anthony, walter] | 2
[alicia, walter] | 3
There is the Problem. And "you don't do that, that is horrible, why would you do it like this, store it like it is meant to be stored in a relation database, user a no-sql database for Document-based storage, use json" are no options.
I am stuck with this datamodel.
This looks pretty horrific, but the only way I can think of doing such a thing would be a hybrid of a cross-join and a semi join. On small data sets this would probably work pretty well. On large datasets, I imagine the cross-join component could hit you pretty hard.
Check it out and let me know if it works against your real data:
with patterns as (
select unnest(array['.*ob', 'ali.*']) as pattern
)
select
o.name, o.officeid
from
office o
where exists (
select null
from patterns p
where o.name ~ p.pattern
)
The semi-join helps protect you from cases where you have a name like "alicia nob" that would meet multiple search patterns would otherwise come back for every match.
You could cast the array to text.
SELECT * FROM workers WHERE (xpath('/data/clients/name/text()', xml_field))::text ~ ANY(ARRAY['wal','ant']);
When casting a string array into text, strings containing special characters or consisting of keywords are enclosed in double quotes kind of like {jimmy,"walter, james"} being two entries. Also when matching with ~ it is matched against any part of the string, not the same as LIKE where it's matched against the whole string.
Here is what I did in my test database:
test=# select id, (xpath('/data/clients/name/text()', name))::text[] as xss, officeid from workers WHERE (xpath('/data/clients/name/text()', name))::text ~ ANY(ARRAY['wal','ant']);
id | xss | officeid
----+-------------------------+----------
2 | {anthony,walter} | 2
3 | {alicia,walter} | 3
4 | {"walter, james"} | 5
5 | {jimmy,"walter, james"} | 4
(4 rows)

Are there some kind of "Reverse Wildcards" in SAP OpenSQL?

So we have a table with a field that contains strings.
These strings can contain wildcards.
For example:
id | name
---+----------------
1 | thomas
2 | san*
3 | *max*
Now I want to select from that table with respect to these wildcards.
For example something like this:
SELECT * FROM table WHERE name = 'sandra'.
That SELECT should fetch the record with ID = 2 from my table.
Note that it would be ok to use % instead of * as the wildcard character in the table.
Any way to achieve this in OpenSQL?
You can use wildcards, just the sign (like Matecki said), is %.
Take a look here:
https://scn.sap.com/thread/1418148
Additionally You could create and use a ranges table in the where clause. If You do not know, what it is, and how it can be done, just tell me. Populate the ranges table like this: OPTION = CP, SIGN = I, LOW = san.
Ok for You?
UPDATE:
I was wrong and changed the answer

How to select everything that is NOT part of this string in database field?

First: I'm using Access 2010.
What I need to do is pull everything in a field out that is NOT a certain string. Say for example you have this:
00123457*A8V*
Those last 3 characters that are bolded are just an example; that portion can be any combination of numbers/letters and from 2-4 characters long. The 00123457 portion will always be the same. So what I would need to have returned by my query in the example above is the "A8V".
I have a vague idea of how to do this, which involved using the Right function, with (field length - the last position in that string). So what I had was
SELECT Right(Facility.ID, (Len([ID) - InstrRev([ID], "00123457")))
FROM Facility;
Logically in this mind it would work, however Access 2010 complains that I am using the Right function incorrectly. Can someone here help me figure this out?
Many thanks!
Why not use a replace function?
REPLACE(Facility.ID, "00123457", "")
You are missing a closing square bracket in here Len([ID)
You also need to reverse this "00123457" in InStrRev(), but you don't need InStrRev(), just InStr().
If I understand correctly, you want the last three characters of the string.
The simple syntax: Right([string],3) will yield the results you desire.
(http://msdn.microsoft.com/en-us/library/ms177532.aspx)
For example:
(TABLE1)
| ID | STRING |
------------------------
| 1 | 001234567A8V |
| 2 | 008765432A8V |
| 3 | 005671234A8V |
So then you'd run this query:
SELECT Right([Table1.STRING],3) AS Result from Table1;
And the Query returns:
(QUERY)
| RESULT |
---------------
| A8V |
| A8V |
| A8V |
EDIT:
After seeing the need for the end string to be 2-4 characters while the original, left portion of the string is 00123457 (8 characters), try this:
SELECT Right([Table1].[string],(Len([Table1].[string])-'8')) AS Result
FROM table1;

Custom sorting (order by) in PostgreSQL, independent of locale

Let's say I have a simple table with two columns: id (int) and name (varchar). In this table I store some names which are in Polish, e.g.:
1 | sępoleński
2 | świecki
3 | toruński
4 | Włocławek
Now, let's say I want to sort the results by name:
SELECT * FROM table ORDER BY name;
If I have C locale, I get:
4 | Włocławek
1 | sępoleński
3 | toruński
2 | świecki
which is wrong, because "ś" should be after "s" and before "t". If I use Polish locale (pl_PL.UTF-8), I get:
1 | sępoleński
2 | świecki
3 | toruński
4 | Włocławek
which is also not what I want, because I would like names starting with capital letters to be first just like in C locale, like this:
4 | Włocławek
1 | sępoleński
2 | świecki
3 | toruński
How can I do this?
If you want a custom sort, you must define some function that modifies your values in some way so that the natural ordering of the modified values fits your requirement.
For example, you can append some character or string it the value starts with uppercase:
CREATE OR REPLACE FUNCTION mysort(text) returns text IMMUTABLE as $$
SELECT CASE WHEN substring($1 from 1 for 1) =
upper( substring($1 from 1 for 1)) then 'AAAA' || $1 else $1 END
;
$$ LANGUAGE SQL;
And then
SELECT * FROM table ORDER BY mysort(name);
This is not foolprof (you might want to change 'AAA' for something more apt) and hurts performance, of course.
If you want it efficient, you'll need to create another column that "naturally" sorts correctly (e.g. even in the C locale), and use that as a sorting criterion. For that, you should use the approach of the strxfrm C library function. As a straight-forward strxfrm table for your approach, replace each letter with two ASCII letters: 's' would become 's0' and 'ś' would become 's1'. Then 'świecki' becomes 's1w0i0e0c0k0i0', and the regular ASCII sorting will sort it correctly.
If you don't want to create a separate column, you can try to use a function in the where clause:
SELECT * FROM table ORDER BY strxfrm(name);
Here, strxfrm needs to be replaced with a proper function. Either you write one yourself, or you use the standard translate function (although this doesn't support replacing a character with two of them, so you'll need some more involved transformation).