How do I select just a portion of a huge binary (file)?

My problem is this: I have the potential for huge files (> 1 GB) being stored in a binary (image) field on SQL Server 2008.
If I return the entire binary using a regular select statement, the query takes more than a minute to return results to my .NET program, and my client apps time out. What I'm looking for is T-SQL code that will limit the size of the data returned (maybe 300 MB), allowing me to iterate through the remaining chunks and prevent timeouts.
This has to happen in the SQL query, not in the processing after the data is returned.
I've tried SubString, which MS says works with binary data, but all I get back is 8,000 bytes maximum. The last thing I tried looked like this:
select substring(Package,0,300000000) 'package', ID from rplPackage where ID=0
--where Package is the huge binary stored in an image field
Data Streaming isn't really an option either, because of the client apps.
Any ideas?

OK, I figured it out. The way to do this is with the substring function, which MS accurately says works with binaries. What they don't say is that, against a text or image column, substring will return only 8,000 bytes, which is what threw me.
In other words, if the blob data type is image and you use this:
select substring(BlobField, 1, 100000000)
from TableWithHugeBlobField
where ID = SomeIDValue
--all you'll get is the first 8,000 bytes (use the DataLength function to check the size)
However, if you declare a variable of varbinary(max), and the blob field's data type is varbinary(max) (or some size that's useful to you), you can then use the substring function to bring the partial binary back into the variable you declared. This works just fine. Just like this:
Declare @PartialImage varbinary(max)
select @PartialImage = substring(BlobField, 1, 100000000) -- first ~100 MB
from TableWithHugeBlobField
where ID = SomeIDValue
select DataLength(@PartialImage) -- should = 100000000
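To iterate through the rest of the blob in chunks, as the question asks, the same idea can be wrapped in a loop. A minimal sketch using the names from the example above (the chunk size is illustrative; each pass returns one chunk as its own result set, which the client reads with SqlDataReader.NextResult, or you can re-run a parameterized single-chunk query per round trip):
Declare @Offset bigint, @ChunkSize bigint, @TotalSize bigint
set @ChunkSize = 100000000 -- ~100 MB per round trip; tune to taste
set @Offset = 1
select @TotalSize = DataLength(BlobField)
from TableWithHugeBlobField
where ID = SomeIDValue
while @Offset <= @TotalSize
begin
    -- each pass returns the next chunk as its own result set
    select substring(BlobField, @Offset, @ChunkSize) as Chunk
    from TableWithHugeBlobField
    where ID = SomeIDValue
    set @Offset = @Offset + @ChunkSize
end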
The question was posed earlier: why use SQL to store file data? It's a valid question. Imagine you have to replicate data as files to hundreds of different client devices (like iPhones), each package unique because different clients have different needs. Storing the file packages as blobs in a database is a lot easier to program against than programmatically digging through folders to find the right package to stream out to the client.

Use this:
select substring(cast(Package as varbinary(max)), 1, 300000000) 'package', ID
from rplPackage
where ID=0

Consider using FileStream
FILESTREAM Overview
Managing FILESTREAM Data by Using Win32
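The Win32 route needs two pieces from T-SQL first: a logical path and a transaction context. A minimal sketch, assuming the Package column has been converted to a FILESTREAM column (the SELECT and the client-side read must happen inside the same transaction):
begin transaction
select Package.PathName() as FilePath,
       GET_FILESTREAM_TRANSACTION_CONTEXT() as TxContext
from rplPackage
where ID = 0
-- the client opens a SqlFileStream with FilePath and TxContext here,
-- streams the data in chunks, and only then does the app commit
commit transaction
The client-side read then looks like: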
sqlFileStream.Seek(0L, SeekOrigin.Begin);
numBytes = sqlFileStream.Read(buffer, 0, buffer.Length);

Related

Query remote oracle CLOB data from MSSQL

I have read various posts about this problem, but none of them helped with mine.
I am on a local DB (Microsoft SQL Server) and query data on a remote DB (Oracle).
The data includes a CLOB column.
The CLOB column returns only 7 rows of correct data; the others show <null>.
I tried CAST(DEQ_COMMENTAIRE_REFUS_IMPORT AS VARCHAR(4000))
I tried SUBSTRING(DEQ_COMMENTAIRE_REFUS_IMPORT, 4000, 1)
Can you help me, please ?
Thank you
Not MSSQL, but in our case we were pulling data into MariaDB from Oracle using the ODBC Connect engine.
For CLOBs, we did the following (in outline):
Create a PL/SQL function get_clob_chunk(clobin CLOB, chunkno NUMBER) RETURN VARCHAR2.
This returns the specified nth chunk of 1,000 chars of the CLOB.
We found 1,000 worked best with multibyte data. If the data is all single-byte plain text, then chunks of 4,000 are safe.
Apologies for the absence of actual code, as I'm a bit rushed for time; a rough sketch appears after this list.
Create an Oracle VIEW which calls the get_clob_chunk function to split the CLOB into 1,000-char chunk columns chunk1, chunk2, ... chunkn, each CAST as VARCHAR2(1000).
We found that Oracle did not like having more than 16 such columns, so we had to split the views into sets of 16 such columns.
What this means is that you must check what the maximum size of data in the CLOB is so you know how many chunks/views you need. To do this dynamically adds complexity, needless to say.
Create a view in MariaDB that queries the Oracle view.
Create a table/view in MariaDB that joins the chunks back up into a single Text column.
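The rough sketch promised above, in Oracle PL/SQL. The function follows the outline; the final SELECT is a hypothetical way to size how many chunk columns/views you need (my_table and my_clob_col are made-up names):
CREATE OR REPLACE FUNCTION get_clob_chunk (clobin CLOB, chunkno NUMBER)
RETURN VARCHAR2
IS
BEGIN
  -- nth 1,000-character slice of the CLOB; DBMS_LOB.SUBSTR
  -- returns NULL once the offset runs past the end
  RETURN DBMS_LOB.SUBSTR(clobin, 1000, (chunkno - 1) * 1000 + 1);
END get_clob_chunk;
/
-- how big is the biggest CLOB, i.e. how many chunks/views are needed?
SELECT MAX(DBMS_LOB.GETLENGTH(my_clob_col)) FROM my_table;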
Note, in our case, we found that copying Text type columns between MariaDB databases using the ODBC Connect engine was also problematic, and required a similar splitting method.
Frankly, I'd rather use Java/C# for this.

Search and replace a partial string / substring in MSSQL tables

I was tasked with moving an installation of Orchard CMS to a different server and domain. All the content (page content, menu structure, links, etc.) is stored in an MSSQL database. The good part: When moving the physical files of the Orchard installation to the new server, the database will stay the same, no need to migrate it. The bad thing: There are lots and lots of absolute URLs scattered all over the pages and menus.
I have isolated / pinned down the tables and fields in which the URLs occur, but I lack the (MS)SQL experience/knowledge to do a "search - replace". So I come here for help. (I have tried exporting the tables to .sql files, doing a search-replace in a text editor, and then re-importing the .sql files to the database, but ran into several syntax errors... so I need to do this the "SQL way".)
To give an example:
The table Common_BodyPartRecord has the field Text of type ntext that contains HTML content. I need to find every occurrence of the partial string /oldserver.com/foo/ and replace it with /newserver.org/bar/. There can be multiple occurrences of the pattern within the same table entry.
(In total I have 5 patterns that will need replacing, all partial string / substrings of urls, domains/paths, etc.)
I usually do frontend stuff and came to this assignment by chance. I have used MySQL back in the day when I was playing around with PHP related stuff, but never got past the basics of SQL - it would be helpful if you could keep your explanations more or less newbie-friendly.
The SQL Server version is 9.0.4053 (SQL Server 2005); I have access to the database via Microsoft SQL Server Management Studio 12.
Any help is highly appreciated!
You can't manipulate the NTEXT datatype directly, but you can CAST it to NVARCHAR(MAX), then use the REPLACE function to perform the string replacement, then CAST it back to NTEXT. This can all be done in a single UPDATE statement.
update MyTable
set MyColumn = cast(replace(cast(MyColumn as nvarchar(max)), N'/oldserver.com/foo/', N'/newserver.org/bar/') as ntext)
where cast(MyColumn as nvarchar(max)) LIKE N'%/oldserver.com/foo/%'
The WHERE clause in the UPDATE statement above is used to prevent SQL Server from making non-changes, i.e. if the value does not need to be changed then there is no need to update it to itself.
The CAST function is used to change the data type of a value. NTEXT is a legacy data type used for storing large character values; NVARCHAR(MAX) is a newer, more versatile data type for storing large character values. The REPLACE function cannot operate on NTEXT values, hence the need to CAST to NVARCHAR(MAX) first, do the replace, then CAST back to NTEXT afterwards.
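Since you have five patterns to replace, the REPLACE calls can be nested so everything happens in one UPDATE per table. A sketch against the table from the question; the second pattern pair is made up for illustration:
update Common_BodyPartRecord
set Text = cast(replace(replace(
        cast(Text as nvarchar(max)),
        N'/oldserver.com/foo/', N'/newserver.org/bar/'),
        N'/oldserver.com/baz/', N'/newserver.org/qux/')  -- nest one replace per pattern
    as ntext)
where cast(Text as nvarchar(max)) like N'%oldserver.com%'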

Simba Mongo ODBC driver: returned data that does not match expected data length

We are using the Simba Mongo ODBC driver to connect to a Mongo database and make SQL queries. I tested the connection on Linux using isql and was able to perform queries.
When my client tried to connect to Mongo through Microsoft SQL Server Management Studio, he received the following error:
OLE DB provider 'MSDASQL' for linked server 'mongo' returned data that does not match
expected data length for column '[MSDASQL].contributorComposite__0__biographicalNote'.
The (maximum) expected data length is 255, while the returned data length is 290.
I've never worked with this application. Have you got any idea where I can control the expected data length?
Linked Server is very picky about metadata matching the data that is returned; in general, you're more likely to run into problems using it than using other applications when the defined metadata doesn't exactly match the data that comes back.
What's happening in this case is that you're retrieving data with a string column defined. The data in the string column has a length of 290, but the driver is reporting a length of 255. This is because MongoDB doesn't return metadata about the length of any specific field as it is a schema-less data source. The driver instead uses a default for reporting lengths of string columns, which by default is set to 255. You can change this by opening the Configuration Dialog for the DSN, going to the Advanced Options, and changing the Standard string column length from 255 to something larger, like 512. This should allow Linked Server to behave properly unless your data exceeds 512 bytes, in which case you should simply adjust this to a larger appropriate value.
I found this answer here: one way to get around this problem is to construct your SQL statement and run it as a pass-through query, such as:
declare @myStmt varchar(max)
set @myStmt = 'select * from my_collection'
EXECUTE (@myStmt) AT MongoDB_PROD_mydb
I did try it myself on a linked SQLite server and it worked, but the strange part is that on longer entries the text gets cut off somewhat randomly. I have not figured this part out yet - it might have something to do with the SQLiteODBC driver I'm using.
But the error of "expected data length" is handled.
Michael

XML file generation using Windows Azure SQL Database

Could someone tell me the best way to generate an XML file with data from a Windows Azure SQL database?
Basically, I want to create an XML file with data from a Windows Azure SQL Database by querying a certain table, and the data is huge (around 90 MB). As I need to run this job at least every couple of hours, it should perform very well.
Any suggestions ?
Thanks,
Prav
This is a very general question, and not really specific to SQL Azure, so it's difficult to give you a good answer. I would suggest you research the different ways to query a SQL database, and also the different ways of writing XML. That might give you ideas for more specific questions to ask.
90 MB is not particularly large - it shouldn't be difficult to fit that into memory. But nonetheless you might want to consider approaches that keep only a small part of the data in memory at once, e.g. reading data from a SqlDataReader and immediately writing it to an XmlTextWriter, or something along those lines.
One way to do what you are looking for is to query the database and have the result saved to an ADO.NET DataTable. Once you have the DataTable, give it a name of your choosing using the TableName property. Then use the WriteXml method of the DataTable to save it to a location of your choosing. Make sure to specify XmlWriteMode.WriteSchema so that both the schema and the data are saved.
Note that if the DataTable is going to be 2 GB or greater, you hit the default object memory limit of .NET. One solution is to break your query into smaller chunks and store multiple DataTables in XML format per original query. Another solution is to increase the max size of an object in .NET to greater than 2 GB. This comes with its own set of risks and performance issues. To exceed the 2 GB object size restriction, make sure your application is 64-bit, is compiled against .NET 4.5 or later, and has gcAllowVeryLargeObjects enabled in app.config.
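For reference, the app.config element mentioned above is the standard .NET runtime setting and looks like this:
<configuration>
  <runtime>
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>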
using System.Data;
using System.Data.SqlClient;
string someConnectionString = "enter Azure connection string here";
DataTable someDataTable = new DataTable();
SqlConnection someConnection = new SqlConnection(someConnectionString);
someConnection.Open();
SqlDataReader someReader = null;
// enter your query below
SqlCommand someCommand = new SqlCommand(("SELECT * FROM [SomeTableName]"), someConnection);
// Since you are downloading a large amount of data, I effectively turned the timeout off in the line below
someCommand.CommandTimeout = 0;
someReader = someCommand.ExecuteReader();
someDataTable.Load(someReader);
someConnection.Close();
// you need to name the datatable before saving it in XML
someDataTable.TableName = "anyTableName";
string someFileNameAndLocation = @"C:\Backup\backup1.xml";
// the XmlWriteMode is necessary to save the schema and data of the datatable
someDataTable.WriteXml(someFileNameAndLocation, XmlWriteMode.WriteSchema);

Why is the maximum length of varchar less than 8,000 bytes?

So I have a stored procedure in a SQL Server 2005 database which retrieves data from a table, formats the data as a string, and puts it into a varchar(max) output variable.
However, I notice that although len(s) reports the string to be > 8,000, the actual string I receive (via the SQL Server output window) is always truncated to < 8,000 bytes.
Does anybody know what the causes of this might be ? Many thanks.
The output window itself is truncating your data, most likely. The variable itself holds the data but the window is showing only the first X characters.
If you were to read that output variable from, for instance, a .NET application, you'd see the full value.
Are you talking about SQL Server Management Studio? If so, there are some options to control how many characters are returned (I only have 2008 in front of me, but the settings are in Tools | Options | Query Results | SQL Server | Results to Grid | Maximum Characters Retrieved, and Results to Text | Maximum number of characters displayed in each column).
The data is all there, but Management Studio isn't displaying all of it.
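A quick way to convince yourself the data is intact: build a string longer than 8,000 characters and compare what LEN reports with what the results pane shows (the CAST is needed because REPLICATE otherwise caps its result at 8,000 characters):
declare @s varchar(max)
set @s = replicate(cast('x' as varchar(max)), 10000)
select len(@s) -- 10000: the variable really holds the full string
select @s      -- the grid/text pane may still truncate what it displays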
In cases like this, I've used MS Access to link to the table and read the data. It's sad that you have to use Access to view the data instead of Management Studio or Query Analyzer, but that seems to be the case.
However, I notice that although len(s) reports the string to be > 8,000
I have fallen for the SQL Studio issue too :) but isn't the maximum length of varchar 8,000 bytes, or 4,000 for nvarchar (Unicode)?
Any chance the column data type is actually text or ntext and you're converting to varchar?