How can I use Full-Text Search for a Persian text file in SQL Server? - sql

I have a table with a varbinary column, and I insert my text file into it.
I added a catalog and a full-text index, and I can search for English words successfully, like below:
select * from [FILESTREAM_Documents]
where Contains([DocumentFS], 'Hello')
And I get the correct result.
I used this doc to get this far.
Now my question is: how can I search for a Persian word in the FILESTREAM column using Full-Text Search, like below:
select * from [FILESTREAM_Documents]
where Contains([DocumentFS], 'سلام')
I have searched a lot and tested the N prefix for the word, but it still does not work.
How can I do this?
Thanks for helping!

Related

Cannot search or join on a language other than English

I'm scratching my head on this SQL.
I have already changed the database collation to Chinese_PRC_CI_AS but still cannot join or search on a specific value containing Chinese. This column value comes from an Excel file, so I'm thinking there might be something wrong with the Excel encoding.
I have tried to find the hex string using this:
SELECT master.dbo.fn_varbintohexstr(CAST(Media AS varbinary))
,Media
,master.dbo.fn_varbintohexstr(CAST('汽车之家 Autohome' AS varbinary))
FROM XXX
This returns different values:
0x7d6c668f4b4eb65b0a004100750074006f0068006f006d006500 汽车之家 Autohome 0xc6fbb3b5d6aebcd2204175746f686f6d65
The first hex string belongs to the value that I cannot join or search on in a WHERE condition.
How can I determine which encoding the first string uses?
UPDATE:
Inspired by the folks below, I used N'' and the hex strings are now the same. But I still cannot find the string using where Media = N'汽车之家 Autohome'. Any ideas why?
UPDATE:
I found the reason: what looks like a space is not actually a space but \n or another special character. Remove it and everything works fine.
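The two hex dumps from the question can also be decoded outside SQL Server to see what is going on. This is only a sketch in Python, assuming the first dump is UTF-16LE (how nvarchar is stored) and the second is code page 936 / GBK (which is what Chinese_PRC collations use for varchar); first_difference is a made-up helper name.

```python
# Hex dumps copied from the question, without the 0x prefix.
stored_hex = "7d6c668f4b4eb65b0a004100750074006f0068006f006d006500"
literal_hex = "c6fbb3b5d6aebcd2204175746f686f6d65"

# Assumption: the column value is nvarchar (UTF-16LE) and the
# non-N literal was converted to code page 936 (GBK).
stored = bytes.fromhex(stored_hex).decode("utf-16-le")
literal = bytes.fromhex(literal_hex).decode("gbk")


def first_difference(a, b):
    """Return (index, char_in_a, char_in_b) for the first mismatch, or None."""
    for index, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return index, x, y
    return None


# repr() makes invisible characters such as \n show up in the output.
print(repr(stored))
print(repr(literal))
print(first_difference(stored, literal))
```

Printing with repr() is the useful part here: the character between the Chinese text and "Autohome" shows up as \n in the stored value, which matches what the second update says.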

Full Text Search for extracting a snippet of the text (returning the intended text and its surroundings)

I'm using a SQL Server FileTable, and for instance I have a saved text file named "SOS.txt" which contains the following text:
For god's sake, save us right now please. We can't survive.
Now or never!
Now I want to find all files that contain the word save, so I execute the following query:
SELECT * FROM FileTableExample
WHERE CONTAINS(file_stream, 'save')
and here's the result:
stream file => 0x616C692053617665207573207269676874206E6F772E0D0A4E6F77206F72206E6576657221
As you can see, I got the right result: the third column of the result points to the file named SOS.txt. I have the stream_id and stream_file, but what I'm trying to find is a way to show the matched text together with its surroundings in a human-readable format.
Something like this:
Name    | Excerpt
--------+------------------
SOS.txt | ..sake, save us..
Is there any way?
Update:
After searching the net I found this article, which is useful, but it doesn't mention full-text search on a FileTable.
Based on this article, I converted the file stream to a string:
SELECT CONVERT(varchar(MAX), file_stream) AS Excerpt, *
from FileTableExample
where contains(file_stream, 'save')
It works if the file is plain text like SOS.txt, but if it's a .docx or .pptx file, you will not get a useful conversion.
Use this: CAST(file_stream AS varchar(max))
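Once the stream has been converted to a string, building the excerpt itself could be done on the client side. A minimal sketch in Python, not a SQL Server feature: excerpt and words_around are hypothetical names, and the plain substring test is only a rough stand-in for how CONTAINS matches words.

```python
def excerpt(text, term, words_around=1):
    """Return the first word containing `term` plus its neighbours, or None."""
    words = text.split()
    needle = term.lower()
    for i, word in enumerate(words):
        if needle in word.lower():
            start = max(0, i - words_around)
            end = min(len(words), i + words_around + 1)
            # ".." marks text that was cut off on either side.
            prefix = ".." if start > 0 else ""
            suffix = ".." if end < len(words) else ""
            return prefix + " ".join(words[start:end]) + suffix
    return None


sample = "For god's sake, save us right now please. We can't survive.\nNow or never!"
print(excerpt(sample, "save"))

# The stream bytes shown in the question, decoded as plain ASCII text.
stream = bytes.fromhex(
    "616C692053617665207573207269676874206E6F772E0D0A4E6F77206F72206E6576657221"
)
print(excerpt(stream.decode("ascii"), "save"))
```

This only helps for plain-text files; for .docx or .pptx the raw bytes are not readable text, so the conversion problem mentioned above still applies.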

SQL cannot search

In my SQL table Image, when I perform a search query
SELECT * FROM Image WHERE platename LIKE 'WDD 666'
it returns no result (searching on other columns works without a problem).
All the column data was inserted by C# code. (If I enter the data manually, the search works.)
Now I suspect that the text WDD 666 isn't made of English letters. Is this possible?
In C#,
the plate number was generated as a string by a Tesseract wrapper.
What should I do to search for the plate number?
Thanks in advance and sorry for my bad English.
Since your case matches, I'm going to rule out case sensitivity.
There may be leading or trailing blank spaces. Try this:
SELECT * FROM Image WHERE platename LIKE '%WDD 666%'
Try running this command:
SELECT '*'+plateName+'*', len(plateName)
FROM image
I suspect platename has some non-printable characters in the field.
It appears to be a CR/LF at the end of the data. You can use:
UPDATE image SET plateName = replace(plateName,char(13)+char(10),'')
WHERE plateName like '%'+char(13)+char(10)+'%'
If you get a positive row count, you'll know there was CR/LF data and it was removed. If you run the select afterwards, your lengths should be 7 and 8 based on your sample data
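The same cleanup could also happen in the application before the value is inserted, so the bad characters never reach the table. The question uses C#, so take this Python version purely as an illustration of the idea; clean_plate is a hypothetical helper, and the trailing CR/LF in the sample is the assumption from the answer above.

```python
def clean_plate(raw):
    """Drop non-printable characters (CR, LF, form feed, ...) and trim spaces."""
    printable = "".join(ch for ch in raw if ch.isprintable())
    return printable.strip()


ocr_output = "WDD 666\r\n"  # assumed shape of the OCR result
plate = clean_plate(ocr_output)
print(repr(ocr_output), len(ocr_output))
print(repr(plate), len(plate))
```

The inner space survives because a space counts as printable, so only the invisible characters and the outer padding are removed.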

How to get data from a .rtf file or Excel file into a database (SQLite) in the iPhone SDK?

I have lots of data in a .rtf file (usernames and passwords). How can I load that data into a table? I'm using sqlite3.
I created a "userDatabase.sql" and in it a table "usersList" with the fields "username" and "password". I want to get the data from the "list.rtf" file into my table "usersList". Please help me.
Thanks in advance.
Praveena.
I would write a little parser. Re-save the .rtf as a .txt file and assume it looks like this:
user1:pass1
user2:pass2
user5:pass5
Now do this (in your code):
open the .txt file (NSString +stringWithContentsOfFile:usedEncoding:error:)
read it line by line
for each line, fetch user and password (NSString -componentsSeparatedByString:, which returns an NSArray)
store user/password in your DB
Best,
Christian
Edit: for parsing Excel sheets I recommend exporting them as a CSV file and then doing the same.
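The steps above can be sketched roughly like this. It is written in Python with the standard sqlite3 module instead of Objective-C, just to show the flow; parse_credentials and load_users are hypothetical names, and the user:pass layout is the assumed format from the answer.

```python
import sqlite3


def parse_credentials(text):
    """Turn 'user:pass' lines into (user, password) tuples, skipping bad lines."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so passwords may contain ':'.
        user, password = line.split(":", 1)
        pairs.append((user, password))
    return pairs


def load_users(conn, pairs):
    """Create the usersList table if needed and insert all pairs."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS usersList (username TEXT, password TEXT)"
    )
    conn.executemany(
        "INSERT INTO usersList (username, password) VALUES (?, ?)", pairs
    )
    conn.commit()


sample = "user1:pass1\nuser2:pass2\nuser5:pass5\n"
conn = sqlite3.connect(":memory:")  # in-memory database for the demo
load_users(conn, parse_credentials(sample))
rows = conn.execute(
    "SELECT username, password FROM usersList ORDER BY username"
).fetchall()
print(rows)
```

The parameterized INSERT (the ? placeholders) is worth keeping in any language, since usernames and passwords can contain quotes.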
Parsing RTF files is mostly trivial. They're actually text, not binary (unlike .doc, .pdf, etc.).
Last time I used it, I remember the file format wasn't too difficult either.
Example:
{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Calibri;}}
{\*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard\sa200\sl276\slmult1\lang9\f0\fs22 Username Password\par
Username2 Password2\par
UsernameN PasswordN\par
}
Do a regular expression match to get the last { ... } part. Be sure to match { and not \{.
Next, parse the text as you want, but keep in mind that:
everything starting with a \ is a control word or an escape; I would write a little function to unescape the text
the special identifier \par is for a new line
there are other special identifiers, such as \b, which turns bold text on (\b0 turns it off)
the color change identifier, \cfN, changes the text color according to the color table defined in the file header. You would want to ignore this identifier since we're talking about plain text.
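For the sample file above, a rough sketch in Python could look like this. Note that it takes a slightly different route than matching the last { ... } part: it strips the nested groups (font table, generator) and then the control words. It assumes there are no escaped braces, no \'hh escapes and no Unicode escapes in the text, so it is not a general RTF parser; rtf_to_lines is a made-up name.

```python
import re

SAMPLE_RTF = r"""{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Calibri;}}
{\*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard\sa200\sl276\slmult1\lang9\f0\fs22 Username Password\par
Username2 Password2\par
UsernameN PasswordN\par
}"""


def rtf_to_lines(rtf):
    """Extract plain text lines from a very simple RTF document."""
    body = rtf.strip()
    # Drop the outermost pair of braces.
    if body.startswith("{") and body.endswith("}"):
        body = body[1:-1]
    # Remove nested groups, innermost first, until none are left.
    group = re.compile(r"(?<!\\)\{[^{}]*\}")
    while group.search(body):
        body = group.sub("", body)
    # \par marks a paragraph break; the \b keeps \pard from matching.
    body = re.sub(r"\\par\b ?", "\n", body)
    # Strip the remaining control words such as \fs22 or \lang9.
    body = re.sub(r"\\[a-zA-Z]+-?\d* ?", "", body)
    return [line.strip() for line in body.splitlines() if line.strip()]


for line in rtf_to_lines(SAMPLE_RTF):
    print(line.split())
```

Each returned line can then be split into username and password and stored the same way as with a plain text file.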

How to force scheme.ini to be used for MS Text Driver?

I am creating a huge CSV import that uses the MS Text Driver to read the CSV file.
I am using ColdFusion to create the scheme.ini in each folder where a file has been uploaded.
Here is a sample one I am using:
[some_filename.csv]
Format=CSVDelimited
ColNameHeader=True
MaxScanRows=0
Col1=user_id Text width 80
Col2=first_name Text width 20
Col3=last_name Text width 30
Col4=rights Text width 10
Col5=assign_training Text width 1
CharacterSet=ANSI
Then in my ColdFusion code, I am doing 2 cfdump's:
<cfdump var="#GetMetaData( csvfile )#" />
<cfdump var="#csvfile#">
The metadata shows that the query has not picked up the correct data types for reading the CSV file.
The dump of the query that reads the file shows missing values; because the files come from Excel, we cannot force users to use double quotes. When fields have mixed data types, our process fails.
How can I either change the data type inside the query (that is, make it use scheme.ini) or update the metadata to the correct data type?
I am using a view on information_schema in SQL Server 2005 to get the correct data types, column names, and max lengths.
Unless I have some kind of syntax error, I can't see why it's not reading the data with the correct data types.
Any suggestions?
Funnily enough, I had the filename spelled wrong: instead of schema.ini I had named it scheme.ini.
I hate it when I make little mistakes like this...
Thank you
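One way to avoid this kind of typo is to keep the file name in a single constant wherever the file gets generated. A small sketch in Python (the original uses ColdFusion, so this only illustrates the idea; build_schema_ini and write_schema_ini are hypothetical helpers, and the column list is the one from the question):

```python
import tempfile
from pathlib import Path

# The text driver looks for a file with exactly this name in the CSV's folder.
SCHEMA_FILENAME = "schema.ini"


def build_schema_ini(csv_name, columns):
    """Build the schema.ini section for one CSV file with text columns."""
    lines = [
        f"[{csv_name}]",
        "Format=CSVDelimited",
        "ColNameHeader=True",
        "MaxScanRows=0",
    ]
    for number, (name, width) in enumerate(columns, start=1):
        lines.append(f"Col{number}={name} Text Width {width}")
    lines.append("CharacterSet=ANSI")
    return "\r\n".join(lines) + "\r\n"


def write_schema_ini(folder, csv_name, columns):
    """Write the section next to the CSV file and return the path used."""
    path = Path(folder) / SCHEMA_FILENAME
    path.write_bytes(build_schema_ini(csv_name, columns).encode("ascii"))
    return path


columns = [
    ("user_id", 80),
    ("first_name", 20),
    ("last_name", 30),
    ("rights", 10),
    ("assign_training", 1),
]
with tempfile.TemporaryDirectory() as folder:
    written = write_schema_ini(folder, "some_filename.csv", columns)
    print(written.name)
    print(written.read_bytes().decode("ascii"))
```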