Concern with using an external SQL server for DIH

I am looking to import entries into my Solr server using the DIH, connecting to an external PostgreSQL server via the JDBC driver. I will be importing about 50,000 entries each time.
Is connecting to an external SQL server for my data unreliable or risky, or is it perfectly reasonable?
My only alternative is to export the SQL file on the other server, download it to my Solr server, import it into my Solr server's copy of PostgreSQL, and then run the DIH against the local database.

The way you're using it is pretty much why the DIH exists. Otherwise, you could just use the /update handler with XML documents. The core I'm working on right now regularly indexes 11,000,000 rows per batch.

This is a standard use case, importing from a remote DB. Proceed with confidence!
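For reference, importing from a remote database needs nothing more than a JDBC data source in the DIH configuration. A minimal data-config.xml sketch (XML is DIH's native config format); the host, credentials, and the items table and its columns are hypothetical placeholders:

    <dataConfig>
      <!-- Remote PostgreSQL reached over JDBC; host and credentials are placeholders -->
      <dataSource type="JdbcDataSource"
                  driver="org.postgresql.Driver"
                  url="jdbc:postgresql://db.example.com:5432/mydb"
                  user="solr_reader"
                  password="secret"
                  batchSize="500"/>
      <document>
        <!-- One Solr document per row; table and columns are hypothetical -->
        <entity name="item" query="SELECT id, title, body FROM items">
          <field column="id"    name="id"/>
          <field column="title" name="title"/>
          <field column="body"  name="body"/>
        </entity>
      </document>
    </dataConfig>

The PostgreSQL JDBC jar just needs to be on Solr's classpath; at 50,000 rows per run this is a small job for the handler.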

Related

Performance moving data from Postgres to SQL Server via SSIS

I have several large SQL queries that I need to run against a Postgres data source. I am using SSIS on SQL Server 2008 R2 to move the data. Because of the way our system is set up, I have to tunnel via PuTTY with local port redirection.
In the SSIS package, I am using an ADO.NET source and destination. I have the PostgreSQL drivers installed, and we were able to get the 32-bit version working. My package runs and I am getting the data, but the data transformation tasks run painfully slowly ... about 2,000 records per second.
Does anyone have experience querying Postgres with static queries and dumping the results into SQL Server? Any tips / best practices?
You should try to fetch the data and store it in an SSIS raw file first.
Then run your transformations, and whatever else you like, against the raw file data.
After that, send the results back to the database.
In general, try not to make many calls to the database.

Import large table to Azure SQL database

I want to transfer one table from my SQL Server instance database to a newly created database on Azure. The problem is that the insert script is 60 GB in size.
I know that one approach is to create a backup file, load it into storage, and then run an import on Azure. But the problem is that when I try to do so, the import on Azure fails with an error:
Could not load package.
File contains corrupted data.
File contains corrupted data.
The second problem is that with this approach I can't copy only one table; the whole database has to be in the backup file.
So is there any other way to perform such an operation? What is the best solution? And if a backup is the best approach, why do I get this error?
You can use tools out there that make this very easy (point and click). If it's a one-time thing, you can use virtually any tool (Red Gate, BlueSyntax...). You always have BCP as well. Most of these approaches will allow you to back up or restore a single table.
If you need something more repeatable, you should consider using a backup API or coding this yourself with the SqlBulkCopy class.
I don't know that I'd ever try to execute a 60 GB script. Scripts generally do single-row inserts, which aren't very optimized. Have you explored the various bulk import/export options?
http://msdn.microsoft.com/en-us/library/ms175937.aspx
http://msdn.microsoft.com/en-us/library/ms188609.aspx
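As a sketch of what the bulk route looks like in T-SQL (the table name and file path are hypothetical; note that BULK INSERT reads a file visible to the server, so for an Azure SQL database the client-side bcp route shown further down is usually the practical one):

    -- Bulk-load a tab-delimited export; table and path are placeholders.
    BULK INSERT dbo.MyLargeTable
    FROM 'C:\exports\MyLargeTable.dat'
    WITH (
        FIELDTERMINATOR = '\t',  -- column delimiter used by the export
        ROWTERMINATOR   = '\n',  -- row delimiter
        TABLOCK                  -- allow a faster, minimally logged load
    );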
If this is a one-time load, using an IaaS VM to do the import into the SQL Azure database might be a good alternative. The data file, once exported, could be compressed/zipped and uploaded to blob storage. Then pull that file back out of storage into your VM so you can operate on it.
Have you tried using BCP in the command prompt?
As explained here: Bulk Insert Azure SQL.
You basically create a text file with all your table data in it and bulk copy it into your Azure SQL database using the BCP command in the command prompt.
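A minimal sketch of that round trip from the command prompt (server names, login, and the table are all placeholders):

    rem Export the table from the local instance in character format
    bcp MyDb.dbo.MyLargeTable out MyLargeTable.txt -S localhost -T -c

    rem Load the file into the Azure SQL database
    bcp AzureDb.dbo.MyLargeTable in MyLargeTable.txt -S myserver.database.windows.net -U myadmin@myserver -P mypassword -c

Because bcp runs on your machine and streams rows over the wire, it avoids both the 60 GB script and the whole-database backup problem.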

How can two SQL databases communicate over a LAN to save data in both databases

We have a server with SQL Server databases (8 databases) working on a LAN. Now we are planning to set up a backup server connected through the LAN.
What we need is: when a user enters data, it should be saved in both databases, so that we have all the data in both places.
I am a newbie, so please give me some detailed information. I have seen the replication option; is it a better option for us?
We have SQL Server 2005.
Which database engine are you using?
There are several ways to build a distributed/replicated database. I'm sure you'll find out how to do it by reading your engine's documentation, but we cannot help here without more info.
Yes, you can back up your database from one server to another in multiple ways. The following are some options:
1) Make a script of the complete database with schema and data. Technique is here
2) Export your database to an Excel file and import it on the other server (use only when required).
3) Link the two servers with a linked server. Technique here
Now, if you want to dump data to two different servers as it is entered, it depends on the code logic written for saving the data and the connection strings provided for it. Alternatively, by adding a trigger to one server's database you can dump data to a different server or database, as sketched below. Trigger
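A minimal T-SQL sketch of that linked-server-plus-trigger idea, assuming a backup server at 192.168.1.20, a mirror_user login, and an Orders table (all hypothetical). Note that the cross-server INSERT inside the trigger runs as a distributed transaction, so MSDTC must be running on both machines:

    -- Register the backup server as a linked server (SQL Server 2005 native client).
    EXEC sp_addlinkedserver
         @server     = N'BACKUPSRV',
         @srvproduct = N'',
         @provider   = N'SQLNCLI',
         @datasrc    = N'192.168.1.20';     -- LAN address of the backup server

    EXEC sp_addlinkedsrvlogin
         @rmtsrvname  = N'BACKUPSRV',
         @useself     = 'FALSE',
         @locallogin  = NULL,
         @rmtuser     = N'mirror_user',     -- placeholder login on the backup server
         @rmtpassword = N'...';
    GO

    -- Mirror every insert into the same table on the backup server.
    CREATE TRIGGER trg_Orders_Mirror ON dbo.Orders
    AFTER INSERT
    AS
    BEGIN
        SET NOCOUNT ON;
        INSERT INTO BACKUPSRV.MyDb.dbo.Orders (OrderID, CustomerID, Total)
        SELECT OrderID, CustomerID, Total
        FROM inserted;                      -- the rows just written locally
    END;
    GO

Keep in mind that with a trigger like this the local insert fails whenever the backup server is unreachable, which is one reason the built-in replication option is usually the safer choice.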

Running SQL Server queries against SQLite or plain files?

I have a website developed using a SQL Server database. But I need to port it to a new web server that doesn't have access to a database.
I therefore need an alternative solution, preferably without rewriting all the database queries. I would prefer being able to run the same queries against an SQLite file or just a plain text file.
As far as storage goes it won't be storing large amounts of data, so performance won't be an issue.
Thanks for your time!
Kind regards
Solved it by using SQL Server Compact Edition (see comments in OP).

Replication between SQL Server and MySQL

I want to set up replication between SQL Server and MySQL, in which SQL Server is the primary database server and MySQL is the slave server (on Linux).
Is there a way to set up such a scenario? Help me.
My answer might be coming too late, but still for future reference ...
You can use one of the heterogeneous replication solutions like SymmetricDS: http://www.symmetricds.org/. It can replicate data between any two SQL databases, although the overhead is higher than with a native replication solution.
Of course you can replicate an MSSQL database to MySQL:
By using a Linked Server in MSSQL.
For that you need to download the MySQL ODBC driver, and you can then search for how to create a linked server on SQL SERVER.
This option is very easy and totally free. You can use OPENQUERY for this (see the sketch after this answer).
By using SSIS Packages.
For that you need the Business Intelligence services of SQL SERVER. You can create SSIS packages in Visual Studio and run them for replication.
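A minimal T-SQL sketch of the linked-server route (the driver name, host, schema, credentials, and the Orders table are all hypothetical placeholders):

    -- Create a linked server to MySQL through the MySQL ODBC driver (MSDASQL).
    EXEC sp_addlinkedserver
         @server     = N'MYSQL',
         @srvproduct = N'MySQL',
         @provider   = N'MSDASQL',
         @provstr    = N'DRIVER={MySQL ODBC 5.3 Unicode Driver};SERVER=mysqlhost;DATABASE=replica;USER=repl;PASSWORD=secret;OPTION=3';

    -- Push rows from the local table into the MySQL copy via OPENQUERY.
    INSERT OPENQUERY(MYSQL, 'SELECT id, name, total FROM orders')
    SELECT id, name, total
    FROM dbo.Orders;

Run on a schedule (from SQL Server Agent, say), this gives you one-way periodic copying rather than true transactional replication, which is often all that's needed.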
No. At least not without doing a lot of dirty, bad things. MSSQL and MySQL speak different replication protocols, so you won't be able to set it up natively (which is the way you'd want to handle it). At best, you could hack together some sort of proxy that forwards insert/update/delete/create/alter, etc. queries from one to the other. This is a terrible idea, as they don't speak the same SQL except in the most common cases. Even database dumps, which wouldn't really be replication anyway, are generally incompatible between vendors.
Don't do it. If you must use different OSes on your servers, standardize the database to something that runs on both.
These two databases are from two different vendors. While I cannot say for sure, it is unlikely Microsoft has any interest in allowing replication to a different vendor's database server.
I work with Informix and MySQL. Both of those databases have commands that dump the entire database to an ASCII file format. You would need to see how that is done on MS SQL Server; FTP the dump to the server hosting the MySQL instance; and then convert the dump into something MySQL can import.