Recently our team was looking at FILESTREAM to expand the capabilities of our proprietary application. The main purpose of this app is managing the various PDFs, images, and documents for all of the parts we manufacture. Our ASP application uses a few third-party tools to allow viewing of these files. We currently have 980GB of data on the file server. We also have around 200GB of binary data in SQL Server that we would like to extract, since it is not performing well, so FILESTREAM seems like a good compromise between the two major data storage/access approaches.
A few things are not exactly clear to us:
1. Can FILESTREAM store its data on a drive that is not locally attached? We already have a file server with a RAID 10 array (1.5TB drives). This server stores all of the documents right now; would we have to move these drives to the SQL Server for FILESTREAM? That would be a tough pill to swallow, since that server is also doubling as the application server (two VMs on one physical server).
2. FILETABLE stores the common metadata about the files, but where is the full-text part stored that allows searching inside files like .doc/.docx? Is this separate? Are you able to freely add criteria to search by? If so, any links to clarify would be appreciated.
3. Can a FILETABLE be referenced by a foreign key in another table?
Thank you in advance
EDIT: For those having these questions, this web video covered everything and more in terms of explaining FILESTREAM from 2008 to 2012 and the caveats to consider (I would seriously rep him if I could): http://channel9.msdn.com/Events/TechDays/Techdays-2012-the-Netherlands/2270
In conclusion, we will not be using FILESTREAM, as it would be far too big an upheaval to accommodate for the investment.
EDIT 2:
Update to #1 - After carefully assessing FileTable in addition to FILESTREAM, we got a winning combination. We did have to move the files over to the new server (which wasn't too painful, since they were on the same VM). It honestly took more time to write an extraction tool to dump the binary data from SQL Server to the file system.
Update to #2 - This was separate, but again Bob had an excellent webinar explaining it: http://channel9.msdn.com/Events/TechEd/Europe/2012/DBI411
Update to #3 - Using TFT inheritance we recycled the Docs table we had (minus the huge binary blobs), which required very few changes in our legacy apps. This was a huge win for the developer team.
The location that the files are stored in for FileTables has to be local, or at least must appear to SQL Server as being local, so a clever SAN driver might trick it. Since FileTable is built on top of FILESTREAM, I imagine the limitations are the same.
Searching of FileTables is done via the CONTAINSTABLE function, which is documented on MSDN; AFAIK the search criteria use the same syntax as full-text searching.
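As a rough sketch (assuming a FileTable named dbo.Documents whose full-text index was created against the unique stream_id index as its key), a search could look like this:

SELECT d.name, ct.[RANK]
FROM dbo.Documents AS d
JOIN CONTAINSTABLE(dbo.Documents, file_stream, 'turbine') AS ct
    ON d.stream_id = ct.[KEY];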
For all intents and purposes, a FileTable is a typical table, so it can be joined, searched, or whatever. The only thing is that you have to use some SQL Server functions to turn the FILESTREAM GUIDs into something more useful, like a file path.
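For example (again a sketch assuming a FileTable named dbo.Documents; the referencing table and column are made up), FileTableRootPath() plus GetFileNamespacePath() yield a usable UNC path, and the stream_id column carries a unique constraint, so it can also serve as the target of a foreign key, which covers question #3:

SELECT stream_id,
       FileTableRootPath('dbo.Documents') + file_stream.GetFileNamespacePath() AS FullPath
FROM dbo.Documents;

ALTER TABLE dbo.PartDocuments
    ADD CONSTRAINT FK_PartDocuments_Documents
    FOREIGN KEY (doc_stream_id) REFERENCES dbo.Documents (stream_id);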
Related
I am currently creating a VB.NET program in which users upload a song file, which is then saved within the program's files. I have set up the actual saving of the files, but I would also like to store some metadata about each one in a SQL database within my program.
I have looked online, and although I now understand the basics of SQL, I'm still a little fuzzy on how you actually implement this within VB.NET. I have already added the library (Imports System.Data.SqlClient) but failed to work out how to begin coding in SQL.
The basics of what I'm trying to achieve is an if statement that will determine whether or not a SQL database has been created in a specific location, and if it hasn't, it should create it.
All constructive answers appreciated, thanks.
There are a number of different database engines available. The namespace that you have chosen contains the ADO.NET client classes for Microsoft SQL Server. You would use a connection string to specify how to connect to the database. This would often contain connection information, such as server name, user name, password etc, but it sounds like you want to store data locally.
There is a local version of SQL Server called LocalDB, but I think you would still need quite a lot of the SQL Server components installed for that to work. Although you can package these with your application, they may be too large for you, so you may want to look at SQL Server Compact Edition, which is much smaller and allows you to package the whole engine as part of your application; it is useful for storing data locally. Compact Edition doesn't have quite all of the features that LocalDB does, so you may want to compare the features available for each.
Although you can use the ADO.NET objects to connect to a database, I think most people these days would use a layer on top which transfers data back and forwards between objects in memory and the database. This also allows you to use Linq to query the database in most cases. I personally use Entity Framework. You might want to look into that. There are different ways of configuring EF so you may want to look at a tutorial. Once you have it set up, you will probably find it much easier and safer to work with than writing SQL manually though.
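On the specific if-statement from the question: with full SQL Server or LocalDB, a minimal T-SQL sketch (the database name is a placeholder) that creates the database only when it is missing could be executed over a connection to master, e.g. with a SqlCommand and ExecuteNonQuery:

IF DB_ID(N'SongLibrary') IS NULL
BEGIN
    CREATE DATABASE SongLibrary;
END;

SQL Server Compact works differently: there the engine object creates the database as a single local file.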
This question is quite general; however, I cannot find a good answer for it.
What are the possibilities for using an external database with MS Access?
I see that MySQL can be used, but I would have to set up an ODBC connection and install drivers on every machine. The issue is that I have software developed in MS Access that uses a lot of data, and it gets very slow at processing when I include a lot of data.
The software analyzes data from wind turbines, so it is used by different customers and it may contain a lot of different turbines with 50,000+ rows in each data set.
I would like these turbine data to be stored in a separate file that MS Access points to, so I can ship the software plus whatever turbine data is wanted.
As it is now, I have a lot of Access database files where the data is included in the software. It becomes impossible to keep track of, especially when I make an edit to the source code of the software, which I do a lot these days.
Another issue is that the users may only have Access Runtime.
What are my options here? Is the best method to use the Access Link function?
Best regards, Emil.
Edit:
SQL queries - can they be combined?
SELECT q_DataLimited.YAW001, q_DataLimited.YAW002
FROM q_DataLimited
WHERE (((q_DataLimited.YAW002)>Degree_dsp() And (q_DataLimited.YAW002)<Degree_dsp_high()));
And
SELECT Count(q_WindRose_PCU.YAW001) AS CountOfYAW0011
FROM q_WindRose_PCU;
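One possible combination, if the count is meant to run over the same limited rows (a sketch; it assumes that is what q_WindRose_PCU is counting), is to apply the WHERE clause directly to the aggregate:

SELECT Count(q_DataLimited.YAW001) AS CountOfYAW001
FROM q_DataLimited
WHERE (((q_DataLimited.YAW002)>Degree_dsp() And (q_DataLimited.YAW002)<Degree_dsp_high()));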
Edit 2:
Public Degree As Long
Public Function Degree_dsp() As Long
Degree_dsp = Degree * 20
End Function
I have Degree as a counter outside the function, in a form:
For Degree = 0 To 17
DoCmd.OpenQuery "q_WindRose_PCU"
DoCmd.Close
Next Degree
Edit 3:
How do I combine a query and the appending of its results to a table?
SELECT q_PowerBinned.Bin, Avg(q_PowerBinned.POW001) AS AvgOfPOW001, StDev(q_PowerBinned.POW001) AS StDevOfPOW001, Avg(q_PowerBinned.WSP001) AS AvgOfWSP001, StDev(q_PowerBinned.WSP001) AS StDevOfWSP001, Avg(q_PowerBinned.POW002) AS AvgOfPOW002, StDev(q_PowerBinned.POW002) AS StDevOfPOW002, Avg(q_PowerBinned.WSP002) AS AvgOfWSP002, StDev(q_PowerBinned.WSP002) AS StDevOfWSP002, Count(q_PowerBinned.Bin) AS CountOfBin
FROM q_PowerBinned
GROUP BY q_PowerBinned.Bin;
And then the append of the above to a table:
INSERT INTO t_Average_Stored ( Bin, PowAvg001, WindAvg001, PowAvg002, WindAvg002, n_samples, PowDev001, WindDev001, PowDev002, WindDev002 )
SELECT q_Average_Temp.Bin, q_Average_Temp.AvgOfPOW001, q_Average_Temp.AvgOfWSP001, q_Average_Temp.AvgOfPOW002, q_Average_Temp.AvgOfWSP002, q_Average_Temp.CountOfBin, q_Average_Temp.StDevOfPOW001, q_Average_Temp.StDevOfWSP001, q_Average_Temp.StDevOfPOW002, q_Average_Temp.StDevOfWSP002
FROM q_Average_Temp;
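These two steps can likely be collapsed into a single append query that skips the q_Average_Temp intermediate, by putting the grouped SELECT directly under the INSERT (a sketch assembled from the two statements above):

INSERT INTO t_Average_Stored ( Bin, PowAvg001, WindAvg001, PowAvg002, WindAvg002, n_samples, PowDev001, WindDev001, PowDev002, WindDev002 )
SELECT q_PowerBinned.Bin, Avg(q_PowerBinned.POW001), Avg(q_PowerBinned.WSP001), Avg(q_PowerBinned.POW002), Avg(q_PowerBinned.WSP002), Count(q_PowerBinned.Bin), StDev(q_PowerBinned.POW001), StDev(q_PowerBinned.WSP001), StDev(q_PowerBinned.POW002), StDev(q_PowerBinned.WSP002)
FROM q_PowerBinned
GROUP BY q_PowerBinned.Bin;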
I see already a few suggestions in the comments, but I am going to answer the general question you posted. In short, the possibilities are endless.
MS Access, and Excel for that matter, have excellent external data tools that allow you to connect to almost any external data source and leverage regular SQL-based databases, or even use OLAP cubes to do your analysis. Access itself should be powerful enough to handle the data sets you mention. Even Access 2010 should be able to handle millions of records with relative ease.
MS Access does have a significant limitation, which is the 2GB file size. Once your database reaches 2GB, everything goes out the window and you are very likely to get data corruption. This is a well known issue, but I don't think you are anywhere near these limits.
Before considering an upgrade, though, there are a few things to suggest:
Analyze the structure of your data and your database. Perhaps your tables are too big (lots of columns) and unnecessarily redundant. It may make sense to process the raw data you receive to split it into different tables that reduce the redundancy and improve performance.
Look into indexing some key fields in your tables. This is heavily dependent on the type of analysis you do and which queries are most common. Read up on indexes and how to use them, and explore some options with actual datasets. You may be surprised how queries that used to take minutes become almost instantaneous when the right indexes are created and maintained (see the sketch after this list).
Analyze your queries for performance. If I remember correctly, MS Access 2010 has a Performance Analyzer, which can suggest changes to make your queries run more efficiently.
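To illustrate the indexing point above (table and column names are invented), an index matching a common filter can be created with plain DDL:

CREATE INDEX idx_Turbine_Date ON t_TurbineData (TurbineID, ReadingDate);

An index like this helps queries that filter or sort on TurbineID and ReadingDate, at the cost of slightly slower inserts.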
If you have already looked into the items above and you decide you really need to take a step up, one fairly easy (and inexpensive) path is to install SQL Server Express, which you can download for free from Microsoft. Access was made to talk to SQL Server, and the performance is many times better. You can run SQL Server Express on your personal PC and use it as a back end for Access, or you could install it on a networked PC and use it as a server (behind a firewall, of course, NEVER connected to the Internet). In this setup you can access your data from several PCs.
One key thing to keep in mind once you start using Access as a front end is that you want to push the processing to the back end, not keep it in Access. The best way to do this is to create what Access calls pass-through queries. These queries are written in the back end's native SQL language and are sent to the back-end server for processing. Only the processed data comes back. If you don't do this, for example by creating the queries in the visual editor in Access instead, the raw data will be sent to Access and then Access will try to create your results. This, as you can imagine, can actually be a lot slower than your initial situation, so don't do it.
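As a sketch of what the body of such a pass-through query might contain, written in the back end's Transact-SQL rather than Access SQL (table and column names are invented):

SELECT Bin, AVG(POW001) AS AvgOfPOW001, STDEV(POW001) AS StDevOfPOW001
FROM dbo.t_TurbineData
GROUP BY Bin;

Only the grouped rows travel back to Access; the raw readings stay on the server.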
If you are not a SQL expert and need a visual editor, there is a tool you can download from Microsoft: SQL Server Management Studio Express. The query editor is not that different from Access and will allow you to create queries in a visual manner, but in Transact-SQL (the language of SQL Server). You can also manage your SQL Server Express instance with this tool and maintain your data (import, export, etc.). You can create the SQL statements you need in this editor and then copy and paste them into pass-through queries in Access. The data will be available to you in the program you are familiar with, but with the power of a much bigger database engine behind the scenes.
Since I do not want to sound like a Microsoft shill, I definitely want to mention other options for external data that could be equally or even more powerful than SQL Server Express. The only reason I mentioned these is because you are already familiar with Microsoft products and the learning curve is a bit less steep. Also, most things should work together out of the box.
The first option that comes to mind is SQLite, which is a high performing database that is actually file-based. It is very small, yet very powerful and fast, and it is ideal for a locally based application like what you mention. There are also lots of graphical interfaces for SQLite and you can connect to it via ODBC from Access. Again, you want to run everything using pass-through queries and let SQLite pick up the load. SQLite is Open Source and it is free.
If you are keen on having "a real database server", then MySQL is probably the next step up. Also Open Source and free, it is very popular, which means lots of places to get support and different graphical interfaces to choose from.
Any search for Open Source Database will give you even more options to try and choose from.
One key thing to keep in mind: if you install any database server in your PC, it will become a server, and will start advertising its services in your local network or on the internet if you bring it to a local Starbucks. Be careful with that, learn how to start/stop the services in your PC, and make sure you turn them off when you are not behind a firewall. There are many exploits for different database servers and you will get quickly detected once your PC starts advertising its newly acquired abilities.
Just to close: there is no difference in performance between full Access and the runtime, just the ability to edit the queries and so on. Whatever front end you create in Access, your users will be able to use in the same manner.
We are using several text files as Templates to create the results of a WCF Data Services - Service Operation call.
The text files are each less than 3000 bytes max.
What are the pros and cons of storing my template files on the file system with the WCF Data Services files vs storing them in a SQL Server 2008 R2 server?
Prior to SQL Server 2008, I recommended strongly against storing large objects like text files in the database. It tended to slow down access and made them generally harder to work with. Instead, I generally recommended storing links to the files in question.
Of course, this meant that the database would not protect the files in the event someone deleted something they shouldn't and the files needed to be backed up and transferred separately from the database.
With SQL Server 2008, I think many of the former problems have been overcome using the filestream functions and I think that storing files using filestream can be quite useful at times. It continues to store the actual data outside the database, which avoids many of the former complications. But it still binds the two together and permits the database to protect the files rather than just relying on the links in the database to be correct.
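As a rough sketch of what this looks like (it assumes the database already has a FILESTREAM filegroup configured; the names are placeholders), a table with a FILESTREAM column needs a ROWGUIDCOL uniqueidentifier with a unique constraint:

CREATE TABLE dbo.Templates (
    TemplateId uniqueidentifier ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    Name nvarchar(100) NOT NULL,
    Content varbinary(max) FILESTREAM NULL
);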
There are a lot of pros and cons for either storage method. Nowadays (my opinion has changed, and may change again some day), I'd focus on security and manageability.
If it is sensitive data, you might get a bit more security by storing them in the database. If nothing else, it might be more difficult to hack a database than a file system. If security is not so important, it can be easier/simpler to store it on the OS.
For managing, if the data gets updated (and how frequently does that happen), how easy is it to update? One instance in a database is simpler to update (or corrupt...) than an instance on each of however many servers are in your web farm. (1 server, no problem, 20 servers, possible headache.)
I think it's better to store the data directly in the database. This makes access even faster, because a database is more efficient at reading and generally handling data. You can even store movies in databases - that's no problem - and it's also possible to stream large data.
As for security, there are more than enough options to configure your database. And if you cluster your database, it is even more scalable.
I'm developing a project for gathering customer feedback using a Samsung Q1 Ultra, a cheap touchscreen PC. The project consists of two parts: a PC based application that builds the survey and stores the info on an SQL Server, and a survey viewer on the Samsung device which downloads survey data from the SQL Server and stores it on a SQL Server Compact 3.5 database.
My question is, how best can I transfer survey data from the SQL Server to the handheld device's database? Writing a tonne of code to copy data from one database to another seems overcomplicated - is there a handy function or somesuch that I can use to copy data from identical tables on these two separate databases?
Any help, suggestions greatly appreciated.
Cheers
Since you are copying from one device to another, you're going to need to do some sort of transfer system (replication, export/import, etc.) obviously.
My initial suggestion would be to have the handheld devices just access the main database on the server remotely... This means that each of the individual handheld devices (should you add more than one) would be working from the same data. Other than that approach, I would suggest something like this (after adding a linked server entry):
SELECT * INTO targetTable FROM [sourceserver].[sourcedatabase].[dbo].[sourceTable]
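The linked server entry mentioned above could be added with something like this ('sourceserver' is a placeholder for the other instance's name):

EXEC sp_addlinkedserver @server = N'sourceserver', @srvproduct = N'SQL Server';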
A quick search on Google actually returned a question similar to yours here on the site.
SQL Server Replication might work for this. That MSDN section has tutorials, how-tos and walkthroughs. It isn't all that difficult to understand, but there is some setup. However, once you're done, you can basically put it on automatic pilot.
This article is a pretty good overview.
Scenario
In our replication scheme we replicate a number of tables, including a photos table that contains binary image data. All other tables replicate as expected, but the photos table does not. I suspect this is because of the larger amount of data in the photos table or perhaps because the image data is a varbinary field. However, using smaller varbinary fields did not help.
Config Info
Here is some config information:
Each image could be anywhere from 65-120 KB
A revision and an approved copy are stored along with thumbnails, so a single row may approach ~800 KB
I once had trouble with the "max text repl size" configuration option, but I have set that to the max value using sp_configure and RECONFIGURE WITH OVERRIDE (see the exact statements after this list)
Photos are filtered based on a “published” field, but so are other working tables
The databases are using the same local db server (in the development environment) and are configured for transactional replication
The replicated database uses a “push” subscription
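For reference, setting that option to its maximum looks like this:

EXEC sp_configure 'max text repl size', 2147483647;
RECONFIGURE WITH OVERRIDE;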
Also, I noticed that sometimes regenerating the snapshot and reinitializing the subscription caused the images to replicate. Taking this into consideration, I configured the snapshot agent to regenerate the snapshot every minute or so for debugging purposes (obviously this is overkill for a production environment). However, this did not help things.
The Question
What is causing the photos table not to replicate while all of the others work fine? Is there a way around this? If not, how would I go about debugging further?
Notes
I have used SQL Server Profiler to look for errors as well as the Replication Monitor. No errors exist. The operation just fails silently as far as I can tell.
I am using SQL Server 2005 with Service Pack 3 on Windows Server 2003 Service Pack 2.
[update]
I have found out the hard way that Philippe Grondier is absolutely right in his answer below. Images, videos and other binary files should not be stored in the database. IIS handles these files much more efficiently than I can.
I do not have a straight answer to your problem, as our standard policy has always been 'never store (picture) files in (database) fields'. Our solution, which applies not only to pictures but to any kind of file or document, is now standard:
We have a "document" table in our database, where document/file names and relative folders are stored (in order to get unique document/file names, we generate them from the primary key/uniqueIdentifier value of the 'Document' table).
This "Document" table is replicated among our different subscribers, like all other tables.
We have a "document" folder and
subfolders, available on each of our
database servers.
Document folders are then replicated independently from the database, with file and folder replication software (Allway Sync is an option).
The main publisher's folders are fully accessible through FTP; a user trying to read a document that is (still) unavailable on his local server will be offered a download from the main server through FTP client software (such as CoreFTP and its command-line options).
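A minimal sketch of the "Document" table described above (column names are illustrative):

CREATE TABLE Document (
    DocumentId uniqueidentifier NOT NULL DEFAULT NEWID() PRIMARY KEY,
    OriginalName nvarchar(255) NOT NULL,
    RelativeFolder nvarchar(400) NOT NULL
);

The on-disk file name is then derived from DocumentId, which stays unique across all subscribers.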
With an images table like that, have you considered moving that article to a one-way (or two-way, if you like) merge publication? That may alleviate some of your issues.