Import huge SQL file into SQL Server

Import huge SQL file into SQL Server - sql

I use the sqlcmd utility to import a 7 GB large SQL dump file into a remote SQL Server. The command I use is this:
sqlcmd -S IP address -U user -P password -t 0 -d database -i file.sql
After about 20-30 min the server regularly responds with:
Sqlcmd: Error: Scripting error.
Any pointers or advice?

I assume file.sql is just a bunch of INSERT statements. For a large amount of rows, I suggest using the BCP command-line utility. This will perform orders of magnitude faster than individual INSERT statements.
You could also bulk insert data using the T-SQL BULK INSERT command. In that case, the file path needs to be accessible by the database server (i.e. UNC path or copied to a drive on the server) and along with needed permissions. See http://msdn.microsoft.com/en-us/library/ms188365.aspx.

Why not use SSIS. While I have a certificate as a DBA, I always try to use the right tool for the job.
Here are some reasons to use SSIS.
1 - Use can still use fast-load, bulk copy. Make sure you set the batch size.
2 - Error handling is much better.
However, if you are using fast-load, either the batch commits or it gets tossed.
If you are using single record, you can direct each error row to a separate destination.
3 - You can perform transformations on the source data before loading it into the destination.
In short, Extract Translate Load.
4 - SSIS loves memory and buffers. If you want to get really in depth, read some articles from Matt Mason or Brian Knight.
Last but not least, the LAN/WAN always plays a factor if the job is not running on the target server with the input file on a local disk.
If you are on the same backbone with a good pipe, things go fast.
In summary, yeah you can use BCP. It is great for little quick jobs. Anything complicated with robust error handling should be done with SSIS.
Good luck,

Related

Is there an alternative way to import data into Postgres than using psql?

I am under strict corporate environment and don't have access to Postgres' psql. Therefore I can't do what's shown e.g. in the SO Convert SQLITE SQL dump file to POSTGRESQL. However, I can generate the sqlite dump file .sql. The resulting dump.sql file is 1.3gb big.
What would be the best way to import this data into Postgres? I also have DBeaver and can connect to both databases simultaneously but unfortunately can't do INSERT from SELECT.

I think the term for that is 'absurd', not 'strict'.
DBeaver has an 'execute script' feature. But who knows, maybe it will be blocked.

EnterpriseDB offers binary downloads. If you unzip those to a local drive you might be able to execute psql from the bin subdirectory.

If you can install psycopg2 or pg8000 for python, you should be able to connect to the database and then loop over the dump file sending each line to the database with cur.execute(line) . It might take some fiddling if the dump file has any multi-line commands, but the example you linked to doesn't show any of those.

mysql optimized query execution

As part of an ongoing research work, I am checking if an URL exists or not using the cURL command. I have been executing a shell script for couple of days and it is doing some updates for each URL in my database. However, the script seems to update around only 100,000 rows in a day.
I was thinking if I could write the values in a file first and then do the updates, the execution might be faster.
I am connecting to the database using the command line.
mysql -h servername -u username -ppassword databasename "Update Query"
For example, instead of connecting to the database 2 million times like above from the command line and updating 2 million rows, I am planning to connect to the database only once from the command line and update 2 million rows from the file.
So is the second approach better than the first one or the time difference would be negligible?

Three approaches.
You could using load data infile
You could build up a .sql file with all of the updates you need.
You could use something other than a CLI to connect to the URLs and DB. In other words, not using "curl" and "mysql" commands, but using a real programming language and provided libraries for checking URLs and updating databases.
Any of those would probably be faster. Though you'll likely get more speed improvement by making the http calls in parallel. You can do that more easily with a real programming language.

SQL, moving million records from a database to other database

I am a C# developer, I am not really good with SQL. I have a simple questions here. I need to move more than 50 millions records from a database to other database. I tried to use the import function in ms SQL, however it got stuck because the log was full (I got an error message The transaction log for database 'mydatabase' is full due to 'LOG_BACKUP'). The database recovery model was set to simple. My friend said that importing millions records using task->import data will cause the log to be massive and told me to use loop instead to transfer the data, does anyone know how and why? thanks in advance

If you are moving the entire database, use backup and restore, it will be the quickest and easiest.
http://technet.microsoft.com/en-us/library/ms187048.aspx
If you are just moving a single table read about and use the BCP command line tools for this many records:
The bcp utility bulk copies data between an instance of Microsoft SQL Server and a data file in a user-specified format. The bcp utility can be used to import large numbers of new rows into SQL Server tables or to export data out of tables into data files. Except when used with the queryout option, the utility requires no knowledge of Transact-SQL. To import data into a table, you must either use a format file created for that table or understand the structure of the table and the types of data that are valid for its columns.
http://technet.microsoft.com/en-us/library/ms162802.aspx

The fastest and probably most reliable way is to bulk copy the data out via SQL Server's bcp.exe utility. If the schema on the destination database is exactly identical to that on the source database, including nullability of columns, export it in "native format":
http://technet.microsoft.com/en-us/library/ms191232.aspx
http://technet.microsoft.com/en-us/library/ms189941.aspx
If the schema differs between source and target, you will encounter...interesting (yes, interesting is a good word for it) problems.
If the schemas differ or you need to perform any transforms on the data, consider using text format. Or another format (BCP lets you create and use a format file to specify the format of the data for export/import).
You might consider exporting data in chunks: if you encounter problems it gives you an easier time of restarting without losing all the work done so far.
You might also consider zipping the exported data files up to minimize time on the wire.
Then FTP the files over to the destination server.
bcp them in. You can use the bcp utility on the destination server for the BULK IMPORT statement in SQL Server to do the work. Makes no real difference.
The nice thing about using BCP to load the data is that the load is what is described as a 'non-logged' transaction, though it's really more like a 'minimally logged' transaction.
If the tables on the destination server have IDENTITY columns, you'll need to use SET IDENTITY statement to disable the identity column on the the table(s) involved for the nonce (don't forget to reenable it). After your data is imported, you'll need to run DBCC CHECKIDENT to get things back in synch.
And depending on what your doing, it can sometimes be helpful to put the database in single-user mode or dbo-only mode for the duration of the surgery: http://msdn.microsoft.com/en-us/library/bb522682.aspx
Another approach I've used to great effect is to use Perl's DBI/DBD modules (which provide access to the bulk copy interface) and write a perl script to suck out the data from the source server, transform it and bulk load it directly into the destination server, without having to save it to disk and move it. Also means you can trap errors and design things for recovery and restart right at the point of failure.

Use BCP to migrate data.
Another approach i have used in the past is to take a backup of the transaction log and shrink the log Prior to the migration. Split the migration script in parts and run the log backup- shrink - migrate iteration a few times.

How to execute big SQL files on SQL Server?

I have about 50 T-SQL files, some of them are 30MB but some of them are 700MB. I thought on executing them manually, but if the file is bigger than 10MB it throws an out of memory exception on the SQL Server Management Studio.
Any ideas?

you can try the sqlcmd command line tool, that may have different memory limits.
http://msdn.microsoft.com/en-us/library/ms162773.aspx
Usage example:
sqlcmd -U userName -P myPassword -S MYPCNAME\SQLEXPRESS -i myCommands.tsql

If you have that much data - wouldn't it be a lot easier and smarter to have that data in e.g. a CSV file and then bulk importing those into SQL Server??
Check out the BULK INSERT command - allows you to quickly and efficiently load large data volumes into SQL Server - much better than such a huge SQL file!
The command looks something like:
BULK INSERT dbo.YourTableName
FROM 'yourfilename.csv'
WITH ( FIELDTERMINATOR =';',
ROWTERMINATOR =' |\n' )
or whatever format you might have to import.

Maybe this is too obvious, but...did you consider writing a program to loop through the files and call SqlCommand.ExecuteNonQuery() for each line? It's almost trivial.
Less obvious advantages:
You can monitor the progress of the feed (which is going to take some time)
You can throttle it (in case you don't want to swamp the server)
You can add a little error handling in case there are problems in the input files

Are there any programs that will shrink the size of a sql script file?

I have a SQL script which is extremely large (about 700 megabytes). I am wondering if there is a good way to reduce the size of the script?
I know there are code minimizers for JavaScript and am looking for one to use with SQL scripts.
I am not looking to get performance on the SQL script. I am trying to make the file size smaller. Removing excess whitespace. Keeping name-qualification down so that the script file sizes can be smaller.
If I attempt to load the file in SQL Server Management Studio I get this error.
Not enough storage is available to
process this command. (Exception from
HRESULT: 0x80070008) (mscorlib)

What's in this script of 700MB?! I would hope that there are some similarities/repetitions that would allow it to shorten the file.
Just some guesses:
Instead of inserting a million records using Insert statements, use a bulk loading tool
Instead of updating a number of individual records, try to batch updates to the same value into one (e.g. Update tab set col=1 where id in (..) instead of individual updates)
long manipulations can be defined as a stored procedure (before running the script) and the script would only have to call the stored proc
Of course, splitting the script up into smaller portions and calling each one from a simple batch file would work too. But I'd be a little worried about performance (how long does the execution take?!) and would look for some faster ways.

What about breaking your script into several small files, and calling those files from a single master script?
This link describes how to do it from a stored procedure.
Or you can do it from a batch file like this:
REM =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
REM Define widely-used variables up here so they can be changed in one place
REM Search for "sqlcmd.exe" and make sure this path is valid for you
REM =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
set sqlcmd="C:\Program Files\Microsoft SQL Server\100\Tools\Binn\sqlcmd.exe"
set uname="your_uname_here"
set pwd="your_pwd_here"
set database="your_db_name_here"
set server="your_server_name_here"
%sqlcmd% -S %server% -d %database% -U %uname% -P %pwd% -i "c:\script1.sql"
%sqlcmd% -S %server% -d %database% -U %uname% -P %pwd% -i "c:\script2.sql"
%sqlcmd% -S %server% -d %database% -U %uname% -P %pwd% -i "c:\script3.sql"
pause
I like the batch file approach myself, because it is easier to tinker with it, and you can schedule it as a windows job.
Make sure the .BAT file is in a folder with the appropriate security restrictions, since it has your credentials in a plain text .BAT file.

gzip should do.

SQL is much harder to shrink, the field, table names and commands need to be what they are. Plus, you wouldn't just want to rewrite the commands as something shorter because it could have implications on performance.
Depending on the DBMS that you use, it may allow short names for commands, and then there might be a converter.

(Answering this because it is the top item returned when I searched for "SQL script size")
I got the same error when trying to load a large script into Management Studio. In my case I was trying to downgrade a database from SQL2008 R2 to SQL 2008 by using the SQL Server script generator, which created a 700mb structure and data .sql file.
To get around it I used the command line to run the script instead:
C:>sqlcmd -S [SQLSERVER\INSTANCE] -i [FILELOCATION\FILENAME].sql
Hopefully this helps someone else.

Compress the sql file will have the most compression ratio.
Minimizing the txt sql file will reduce some bytes/kilobytes per mega.. is not worth...
The better approach is to create a "function" to unzip and read the file. The best benefit I guess.
Today, filesize shouldn't be a problem. Dial-up connection? Floppy disks?

pg-minify can do it, and not just for PostgreSQL, but for most notations, including MS, MySql, etc.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas