I'm using SQL Server 2008 Full-Text Search for a project. I need to be able to search PDF files, and I have some questions about that:
How do I enable PDF searching? I've heard of the Adobe PDF IFilter, but couldn't find a clear guide on how to get started.
Are the PDF files stored in the DB itself, or in the file system? I was mainly concerned about the space on shared hosting services like DiscountASP. Typically, we get only about 100MB of space for the DB, but a lot more (in GBs) for the File System.
So, if these PDF files are going to be stored directly in the DB, then it may get expensive, right?
I would like to show snippets in the search results (like Google does). How can I achieve this with SQL Server 2008 FTS?
Full text search can only search database content. It will not index content outside the database. Fulltext is extensible through a programming API and Adobe has providers for PDF content, as you already know. SQL Fulltext can use those providers.
However, there is another feature you may be interested in: the new SQL Server 2008 FILESTREAM storage option for varbinary(max) columns. FILESTREAM data is stored in the file system as files but is maintained as part of the database from the point of view of transaction consistency, backup and restore, etc. Luckily, FILESTREAM and full-text search work together.
Sounds like you want to use Microsoft Indexing Service.
This will index files on the file system so you can search their contents.
Here is an example of querying indexing services using ASP.NET
You need a PDF IFilter. Here's the one from Foxit Software.
I believe you can only use SQL Server Full-Text Search if the PDF files are stored within the database.
I haven't found a way to do this other than opening the file and searching for the context myself for each result.
Related
This one's going back to the basics, but I haven't been able to find a simple explanation anywhere. I just started working with databases. I'm using a SQL Server database managed mostly with Navicat (though I have SQL Server Management Studio as well), and I need to store a PDF or image in the database.
I'm using Entity Framework to interface the database with the C# app I am building. A simple explanation assuming little knowledge of database management would be much appreciated.
Thanks for the help.
The database will grow quickly if you start storing images and PDFs in it.
A better approach would probably be to store the path of the file in the database and then load the item by referencing the proper path.
EDIT:
It's going to depend on the file structure of your application really. A simple version of retrieving a PDF could be the following:
Example Path:
/PDF/username.PDF
You store the path of the PDF in the DB, maybe in a column called pdfPath. Then, when you retrieve the path from the database, direct the user to the correct link using the path you got from the query.
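As a rough illustration of the path-in-the-database idea, here is a minimal Python sketch. It uses the standard-library sqlite3 module purely as a stand-in for SQL Server, and the table name `documents`, the helper names, and the assumption that the username is already safe to use as a file name are all hypothetical. Only the `pdfPath` column and the `PDF/username.PDF` layout come from the answer above.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_store():
    """Create an in-memory table that holds paths, not file contents."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (username TEXT PRIMARY KEY, pdfPath TEXT NOT NULL)"
    )
    return conn


def save_document(conn, root, username, data):
    """Write the bytes under root/PDF and record only the relative path."""
    # Assumption: username has already been validated as a safe file name.
    relative_path = f"PDF/{username}.PDF"
    target = Path(root) / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    conn.execute(
        "INSERT OR REPLACE INTO documents (username, pdfPath) VALUES (?, ?)",
        (username, relative_path),
    )
    conn.commit()
    return relative_path


def load_document(conn, root, username):
    """Look up the stored path, then read the file it points to."""
    row = conn.execute(
        "SELECT pdfPath FROM documents WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return None
    return (Path(root) / row[0]).read_bytes()


def demo():
    """Round-trip one small file through the path table."""
    conn = open_store()
    with tempfile.TemporaryDirectory() as root:
        save_document(conn, root, "alice", b"%PDF-1.4 demo")
        return load_document(conn, root, "alice")
```

The database row stays a few bytes long regardless of file size, which is the point of the approach. The trade-off is that the files are no longer covered by database backups or transactions.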
I've found solutions to determine the length of an audio file using WMPLib.WindowsMediaPlayer (which seems quite ugly), by using a physical file path, but nothing to determine the duration of an audio file stored in a VARBINARY field (SQL Server 2008 R2).
I'm using .Net WebForms. Maybe it'd be a better idea to do this client side with jQuery, but what if I only want to expose some controls to the web browser based off of the duration?
Follow the links below to implement this.
You can do it in your VB.NET code rather than in T-SQL. It may also be possible with CLR UDFs or stored procedures in SQL Server, but I haven't confirmed that.
Just read the file from the database and extract the media information from its bytes; the links below show how to read the meta information of MP3 files and similar formats.
Hope these give you some ideas:
http://www.developerfusion.com/code/4684/read-mp3-tag-information-id3v1-and-id3v2/
http://www.codeproject.com/KB/audio-video/mpegaudioinfo.aspx
http://rongchaua.net/blog/c-how-to-read-mp3-header/
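To make the "read the bytes and inspect them in memory" idea concrete, here is a minimal Python sketch. It is only an illustration under assumptions: it handles uncompressed WAV data via the standard-library `wave` module, not MP3 (which needs the header parsing described in the links above), and both function names are hypothetical. The bytes passed in stand for the value read from the VARBINARY column.

```python
import io
import wave


def wav_duration_seconds(blob):
    """Return the playing time of WAV data held in memory as bytes."""
    # io.BytesIO wraps the bytes so no physical file path is needed.
    with wave.open(io.BytesIO(blob), "rb") as reader:
        return reader.getnframes() / reader.getframerate()


def make_silent_wav(seconds, framerate=8000):
    """Build a mono 16-bit silent WAV in memory, standing in for the stored value."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        writer.setframerate(framerate)
        writer.writeframes(b"\x00\x00" * int(seconds * framerate))
    return buffer.getvalue()
```

Because the duration is computed on the server from the stored bytes, the page can decide which controls to expose before anything is sent to the browser.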
How to edit a Word Document (.docx) stored in a SQL Server Table?
Here is the tentative work flow:
Read BLOB from SQL Table through Ideablade
Write BLOB to disk as .docx
Open .docx using Word
User makes changes
Save .docx using Word
Read .docx into BLOB
Write BLOB back to SQL Table through Ideablade
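The round trip in this workflow can be sketched as follows. This is a hedged Python illustration, not VB.NET or Ideablade code: sqlite3 stands in for SQL Server, and the table `documents`, its columns, and the names `check_out` and `check_in` are hypothetical. The editing step in Word is simulated by overwriting the file.

```python
import sqlite3
import tempfile
from pathlib import Path


def check_out(conn, doc_id, work_dir):
    """Read the BLOB from the table and write it to disk as a .docx file."""
    row = conn.execute(
        "SELECT content FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        raise KeyError(doc_id)
    path = Path(work_dir) / f"{doc_id}.docx"
    path.write_bytes(row[0])
    return path


def check_in(conn, doc_id, path):
    """Read the edited file back into a BLOB and update the table row."""
    conn.execute(
        "UPDATE documents SET content = ? WHERE id = ?",
        (Path(path).read_bytes(), doc_id),
    )
    conn.commit()


def demo():
    """Run the whole workflow once with placeholder bytes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, content BLOB)")
    conn.execute("INSERT INTO documents VALUES (1, ?)", (b"original",))
    with tempfile.TemporaryDirectory() as work_dir:
        path = check_out(conn, 1, work_dir)
        path.write_bytes(b"edited")  # stands in for the user saving in Word
        check_in(conn, 1, path)
    row = conn.execute("SELECT content FROM documents WHERE id = 1").fetchone()
    return row[0]
```

Keeping the two database steps in small functions like these makes it easy to swap the storage later, for example to a file path plus metadata.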
Any sample code is welcome.
I am sure there are a lot of people doing this already.
Any other ideas on how to simplify this process?
I am using VB.NET, .NET 3.5 SP1, WinForm and SQL Server 2008.
Well, as to the how, here is how to read a blob and write a blob to SQL. Although frankly, unless you have very good reasons such as an existing backup system, you would probably be best served storing the file to the file system and the path and metadata in the database. Either way, abstract it in your BLL, so you can change your mind down the road.
Retrieving and updating the BLOB from the db shouldn't be a problem, you'll find lots of sample code to do that on the net.
A simple approach to your problem would be to create a "temp" or "working" directory somewhere and monitor it with System.IO.FileSystemWatcher (sample code). When the user wants to edit a file, fetch it from the db and store it in the directory. Whenever the user saves the file, you'll get a notification from your FileSystemWatcher, so you can save it to the database. Don't forget to empty the directory from time to time.
The method I've seen for this that I think works best is to build this as an add-on for MS Word itself. Examples include the Save to Sharepoint, Save to Moodle, and other similar add-ins.
All of our correspondence is done via Database Mail in SQL Server. The data for document generation and the rules that trigger it are all in SQL Server. We now have to create a PDF file. I was planning on using PDFsharp/MigraDoc to do it, but then we'd have to create the document and time its readiness against the SQL Server data state and mail state. It'd be nice if the DB could handle everything.
Has anyone created PDF files directly in SQL Server? If so, how?
Take a look here: Create data driven PDF on the fly by using SQL server reporting service (SSRS)
I've not used it, but there is SQL2PDF stored proc. It uses sp_OA% code.
Google search
Blog article and duplicated on SQL Server Central (needs login)
SQL isn't the best place to do this of course, but if you have to I'd use CLR if possible.
I want to move the full-text catalog for one database to a different location on the same SQL Server instance. I am using SQL Server 2005. One source says:
SQL Server 2005 full-text search provides the ability to easily detach and move full-text catalogs in the same way that SQL Server database files may be detached, moved, and re-attached. Full-text catalogs are included with sp_detach_db and sp_attach_db. After detaching a database, you may move the full-text catalog and/or database data files, and then re-attach the database. Full-text catalog metadata is updated to reflect the change of location. This capability simplifies building, testing, moving, and deploying databases across multiple servers.
From:
http://msdn.microsoft.com/en-us/library/ms345119(SQL.90).aspx
Believing it, I moved only the data and log files using the detach and attach method, copied the full-text catalog, and rebuilt it. Still,
sp_help_fulltext_catalogs ''
shows the same output, with the old path.
Another source:
http://sqlserverpedia.com/wiki/FTS_-Restoring&_Relocating_your_Catalogs
mentions stopping and starting the full-text search service as part of the move. I can't restart the full-text service because it is being used by other databases.
Is there any way to move a single full-text catalog without restarting the full-text service?
Regards
Manjot
I didn't have to restart the service at all.
I just tried it without restarting and it worked as per:
http://sqlserverpedia.com/wiki/FTS_-Restoring&_Relocating_your_Catalogs