Import database from File PROGRAMMATICALLY? - sql

Is there a way to programmatically import an entire database from a file with SQL? (.CSV, .SQL and .DB files are all fine)
Thanks!
EDITED TO CLARIFY:
I am interested in a solution that is database independent (it has to work with all types of databases: MySQL, SQL Server, PostgreSQL, Oracle...).

MySQL: LOAD DATA INFILE for CSVs; for .sql files generated with MySQL, use the mysql shell client.
For SQLite: see this SO question.
SQL Server: apparently there's the BULK INSERT command.
You are not going to find a database-independent syntax for an SQL command because there isn't one.
There may be a wrapper library around databases, but I'm not aware of one. (Or you could try to use ODBC, but that's connection oriented and wouldn't allow direct access to a file.)
Perhaps there is an interactive GUI-related software tool out there to do this.
Note also that loading data directly from a file on a database server into a database almost certainly requires elevated privileges (otherwise it would be a security risk).
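If the import has to happen from application code, one database-independent workaround is to read a plain .sql dump and replay it statement by statement over JDBC (or any similar generic connection API). The sketch below assumes a simple dump with one statement per semicolon and no semicolons inside string literals; the JDBC URL, credentials and file name are placeholders, not anything mandated by a particular database:

    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    // Minimal sketch: replay a plain .sql dump over a generic JDBC connection.
    public class SqlDumpReplayer {
        public static void main(String[] args) throws Exception {
            // Any JDBC driver works here; the URL below is only an example.
            try (Connection conn = DriverManager.getConnection("jdbc:postgresql://localhost/mydb", "user", "password");
                 Statement stmt = conn.createStatement()) {
                String dump = new String(Files.readAllBytes(Paths.get("dump.sql")), "UTF-8");
                // Naive split on ';' -- fine for simple dumps, not for stored
                // procedures or literals that contain semicolons.
                for (String sql : dump.split(";")) {
                    if (!sql.trim().isEmpty()) {
                        stmt.execute(sql);
                    }
                }
            }
        }
    }

The point is only that the splitting and executing happens in application code, so the same program works against any engine that has a JDBC driver; the dump itself still has to use SQL the target database understands.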

OK, so I actually found a database-independent solution to import a database from a .sql file quite easily! :)
So whichever database you have (MySQL, SQLite, ...), do the following:
1) Export your database into .sql format.
(This .sql file will contain all the SQL commands, such as CREATE TABLE..., INSERT INTO table...)
(You may need to remove the lines that start with CREATE TABLE and leave only the lines that start with INSERT...)
2) Then, in the language that you are using, write some code that reads each line of the .sql file and stores it in an array of String (String[]).
3) Then execute each String contained in the array as an SQL command.
I've implemented this in Java:
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.LinkedList;
import java.util.List;

import android.content.Context;
import android.database.sqlite.SQLiteDatabase;

public class DatabaseImporter {

    private static DatabaseImporter instance;

    public static DatabaseImporter getInstance() {
        if (DatabaseImporter.instance == null)
            instance = new DatabaseImporter();
        return DatabaseImporter.instance;
    }

    private DatabaseImporter() {
    }

    public void importDatabaseFromFile(Context context, String databaseName, String filePath) {
        SQLiteDatabase database = null; // create/open your database with the appropriate call from the database API you're using
        this.executeSqlCommands(database,
                this.readSqlCommandsFromFile(filePath));
    }

    private String[] readSqlCommandsFromFile(String filePath) {
        String[] sqlCommands = new String[0];
        List<String> sqlCommandsList = new LinkedList<String>();
        BufferedReader br = null;
        try {
            // Open the file passed in as the file path
            FileInputStream fstream = new FileInputStream(filePath);
            br = new BufferedReader(new InputStreamReader(fstream));
            String strLine;
            // Read the file line by line, skipping blank lines
            while ((strLine = br.readLine()) != null) {
                if (!strLine.trim().equals(""))
                    sqlCommandsList.add(strLine);
            }
        } catch (Exception e) { // Catch exception if any
            System.err.println("Error: " + e.getMessage());
        } finally {
            // Close the input stream
            try {
                if (br != null)
                    br.close();
            } catch (Exception ignored) {
            }
        }
        sqlCommands = new String[sqlCommandsList.size()];
        sqlCommandsList.toArray(sqlCommands);
        return sqlCommands;
    }

    private void executeSqlCommands(SQLiteDatabase database, String[] sqlCommands) {
        for (int i = 0; i < sqlCommands.length; i++) {
            database.execSQL(sqlCommands[i]);
        }
    }
}

mysql -u username -p database_name < dumpfile.sql
Importing CSV would require a script (using e.g. PHP) to put the right fields in the right part of the query (see the Java sketch below).
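The same kind of glue script can be written in Java instead of PHP. Here is a rough sketch that batches parameterised INSERTs; the table name, column list, file name and connection details are made up for the example, and the comma split is deliberately naive:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    // Sketch: push a two-column CSV into a table through a parameterised INSERT.
    public class CsvLoader {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:mysql://localhost/mydb", "user", "password");
                 PreparedStatement insert = conn.prepareStatement("INSERT INTO mytable (col1, col2) VALUES (?, ?)");
                 BufferedReader reader = new BufferedReader(new FileReader("data.csv"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // Naive split; use a real CSV parser if fields can contain quoted commas.
                    String[] fields = line.split(",");
                    insert.setString(1, fields[0]);
                    insert.setString(2, fields[1]);
                    insert.addBatch();
                }
                insert.executeBatch();
            }
        }
    }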

If you are using SQL Server, check out SSIS (SQL Server Integration Services).


How to sort through a dictionary (a real world dictionary) that is in a .csv file?

I haven't read enough theory or had enough practice in CS, but there must be a simpler, faster way to look up data in a file. I'm working with a literal, real-world dictionary .csv file, and I'm wondering how I can speed up the lookup of every word. No doubt scanning the whole list for each word does not make sense; splitting the file by first letter (a-z) and only searching that part for each word makes more sense.
But what else? Should I learn SQL or something and try to convert the text database into an SQL database? Are there methods in SQL that would enable me to do what I wish? Please give me ideas!
SQLite sounds like a good fit for this task.
Create a table, import your CSV file, create an index, and you're done (see the sketch below).
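If Java is an option, that approach could look roughly like this with the SQLite JDBC driver (the file names, table name and column layout are assumptions, and the word,definition split is deliberately simple):

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Sketch: load a word,definition CSV into SQLite once, then do indexed lookups.
    public class DictionaryLookup {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:sqlite:dictionary.db")) {
                try (Statement stmt = conn.createStatement()) {
                    stmt.execute("CREATE TABLE IF NOT EXISTS dictionary (word TEXT, definition TEXT)");
                    stmt.execute("CREATE INDEX IF NOT EXISTS idx_word ON dictionary (word)");
                }
                try (BufferedReader reader = new BufferedReader(new FileReader("dictionary.csv"));
                     PreparedStatement insert = conn.prepareStatement("INSERT INTO dictionary (word, definition) VALUES (?, ?)")) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        String[] parts = line.split(",", 2); // the definition may itself contain commas
                        if (parts.length == 2) {
                            insert.setString(1, parts[0]);
                            insert.setString(2, parts[1]);
                            insert.executeUpdate();
                        }
                    }
                }
                // Indexed lookup instead of scanning the whole file every time.
                try (PreparedStatement query = conn.prepareStatement("SELECT definition FROM dictionary WHERE word = ?")) {
                    query.setString(1, "aardvark");
                    try (ResultSet rs = query.executeQuery()) {
                        if (rs.next()) {
                            System.out.println(rs.getString(1));
                        }
                    }
                }
            }
        }
    }

The import only has to happen once; afterwards every lookup is a single indexed query instead of a scan of the whole CSV.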
I just did this using Excel interop with a moderate-size .csv file given to me by a supply company. It worked well, but there is still a considerable delay due to the cumbersome interop/COM wrappers.
using System;
using System.Collections.Generic;
using System.Diagnostics;
// Assumption: the lowercase 'excel' prefix below is an alias for the interop namespace:
using excel = Microsoft.Office.Interop.Excel;

class Excel
{
    private excel.Application application;
    private excel.Workbook excelWorkBook;
    protected const string WORD_POSITION = "A"; // whichever column the word is located in when loaded on the Excel spreadsheet
    protected const string DEFINITION_POSITION = "B"; // whichever column the definition is loaded into on the Excel spreadsheet
    Dictionary<string, string> myDictionary = new Dictionary<string, string>();

    public Excel(string path) // where path is the file name
    {
        try
        {
            application = new excel.Application();
            excelWorkBook = application.Workbooks.Add(path);
            int row = 1;
            while (application.Cells[++row, WORD_POSITION].Value != null)
            {
                myDictionary[GetValue(row, WORD_POSITION)] = GetValue(row, DEFINITION_POSITION);
            }
        }
        catch (Exception ex)
        {
            Debug.WriteLine(ex.ToString());
        }
        finally
        {
            excelWorkBook.Close();
            application.Quit();
        }
    }

    private string GetValue(int row, string columnName)
    {
        string returnValue = String.Empty;
        returnValue = application.Cells[row, columnName].Value2;
        if (returnValue == null) return string.Empty;
        return returnValue;
    }
}
Create a new SQL database, import the CSV into a new table, place an index on the column that stores the word values, then search against that table... That is the approach I would take.

Basic info on how to export BLOB as files

I have researched how to export BLOBs to image files. A DB has an IMAGE column storing several thousand images. I thought of exporting the table, but I get a BLOB file error in EMS SQL Manager for InterBase and Firebird.
There have been good posts, but I have still not been able to succeed.
SQL scripts to insert File to BLOB field and export BLOB to File
This example has appeared on numerous pages, including Microsoft's site. I am using InterBase (Firebird). I have not found anything related to enabling xp_shell for Firebird, or for EMS SQL Manager for InterBase and Firebird (which I have also installed). My guess is: it's not possible. I also tried installing SQL Server Express, SQL Server 2008, and SQL Server 2012. I am at a dead end without having even connected to the server, the reason being that I have not managed to start the server. I followed the guide at technet.microsoft ("How to: Start SQL Server Agent"), but there are no services in the right pane for me.
PHP file to download entire column (may not post link due to rep limitation).
It has a MySQL connect section that daunts me. The DB is on my computer as a GDB file, and I also have XAMPP, so I could figure out a way to use this as a localhost environment. I hope this is making sense.
The last solution is to use bcp, an idea posted on Stack Overflow titled "fastest way to export blobs from table into individual files". I read the documentation and installed it, but cannot connect to the server. I use -S PC-PC -U xxx -P xxx (the server name must be wrong), but the information I find all uses -T (Windows Authentication).
Summing up: I am using Firebird, with EMS SQL Manager. I am trying to extract all images from the images table into individual files. These tools both have SQL script screens, but they appear to work in conjunction with xp shell. What would you suggest? Am I using the wrong SQL manager to accomplish this?
There are several ways:
Use the isql command BLOBDUMP to write a blob to a file,
Use a client library (e.g. Jaybird for Java, the Firebird .NET provider for C#) to retrieve the data,
With PHP you can use ibase_blob_get in a loop to get bytes from the blob, and write those to a file.
I don't use nor know EMS SQL Manager, so I don't know if (and how) you can export a blob with that.
The example you link to, and almost all tools you mention are for Microsoft SQL Server, not for Firebird; so it is no wonder those don't work.
Example in Java
A basic example to save blobs to disk using Java 8 (might also work on Java 7) would be:
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/**
 * Example to save images to disk from a Firebird database.
 * <p>
 * Code assumes a table with the following structure:
 * <pre>
 * CREATE TABLE imagestorage (
 *     filename VARCHAR(255),
 *     filedata BLOB SUB_TYPE BINARY
 * );
 * </pre>
 * </p>
 */
public class StoreImages {
    // Replace testdatabase with alias or path of database
    private static final String URL = "jdbc:firebirdsql://localhost/testdatabase?charSet=utf-8";
    private static final String USER = "sysdba";
    private static final String PASSWORD = "masterkey";
    private static final String DEFAULT_FOLDER = "D:\\Temp\\target";

    private final Path targetFolder;

    public StoreImages(String targetFolder) {
        this.targetFolder = Paths.get(targetFolder);
    }

    public static void main(String[] args) throws IOException, SQLException {
        final String targetFolder = args.length == 0 ? DEFAULT_FOLDER : args[0];
        final StoreImages storeImages = new StoreImages(targetFolder);
        storeImages.store();
    }

    private void store() throws IOException, SQLException {
        if (!Files.isDirectory(targetFolder)) {
            throw new FileNotFoundException(String.format("The folder %s does not exist", targetFolder));
        }
        try (
            Connection connection = DriverManager.getConnection(URL, USER, PASSWORD);
            Statement stmt = connection.createStatement();
            ResultSet rs = stmt.executeQuery("SELECT filename, filedata FROM imagestorage")
        ) {
            while (rs.next()) {
                final Path targetFile = targetFolder.resolve(rs.getString("FILENAME"));
                if (Files.exists(targetFile)) {
                    System.out.printf("File %s already exists%n", targetFile);
                    continue;
                }
                try (InputStream data = rs.getBinaryStream("FILEDATA")) {
                    Files.copy(data, targetFile);
                }
            }
        }
    }
}
Example in C#
Below is an example in C#; it is similar to the code above.
using System;
using System.IO;
using FirebirdSql.Data.FirebirdClient;

class StoreImages
{
    private const string DEFAULT_FOLDER = @"D:\Temp\target";
    private const string DATABASE = @"D:\Data\db\fb3\fb3testdatabase.fdb";
    private const string USER = "sysdba";
    private const string PASSWORD = "masterkey";

    private readonly string targetFolder;
    private readonly string connectionString;

    public StoreImages(string targetFolder)
    {
        this.targetFolder = targetFolder;
        connectionString = new FbConnectionStringBuilder
        {
            Database = DATABASE,
            UserID = USER,
            Password = PASSWORD
        }.ToString();
    }

    static void Main(string[] args)
    {
        string targetFolder = args.Length == 0 ? DEFAULT_FOLDER : args[0];
        var storeImages = new StoreImages(targetFolder);
        storeImages.store();
    }

    private void store()
    {
        if (!Directory.Exists(targetFolder))
        {
            throw new FileNotFoundException(string.Format("The folder {0} does not exist", targetFolder), targetFolder);
        }
        using (var connection = new FbConnection(connectionString))
        {
            connection.Open();
            using (var command = new FbCommand("SELECT filename, filedata FROM imagestorage", connection))
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    string targetFile = Path.Combine(targetFolder, reader["FILENAME"].ToString());
                    if (File.Exists(targetFile))
                    {
                        Console.WriteLine("File {0} already exists", targetFile);
                        continue;
                    }
                    using (var fs = new FileStream(targetFile, FileMode.Create))
                    {
                        byte[] filedata = (byte[]) reader["FILEDATA"];
                        fs.Write(filedata, 0, filedata.Length);
                    }
                }
            }
        }
    }
}

How to import raven db with versioning bundle

I have a database with the versioning bundle turned on. I make an export and then try to import the exported dump into a newly created db. I get the exception "Modifying a historical revision is not allowed". I found this question and an answer from Ayende saying that it's by design. But how do I import data into an empty database if the versioning bundle was turned on and there are revisions in it?
For now I did the following: I created a new database without the versioning bundle, but with the replication bundle. I import into that db (and it works), but I get a lot of duplicates if I perform a search.
After that I create another new database, with the replication and versioning bundles turned on, and replicate from the db with duplicates to this new db. That works, but it seems like a lot of things to do.
Am I doing the right thing? Is there an easier way to get your data from the dump?
I've had the same issue. The ONLY way I could get the import to work was to do the following:
Create the DB with versioning enabled (they recommend that DBs be created with any bundles that may be used).
Disable the versioning bundle. I've done this by editing the database settings and removing the bundle from the "Raven/ActiveBundles" setting.
Import your database.
Enable the versioning bundle. Just add the "Versioning" bundle back to the "Raven/ActiveBundles" setting.
If anyone has a better idea, I'd love to hear it. :)
RavenDB 3.0 allows importing with the bundle already activated, so the whole trick of disabling and enabling is no longer needed. But you might run into new issues, like indexes not being in sync with your revisions, or the versioning configuration not being displayed in the Settings => Versioning window.
To fix your versioning documents if they don't show in the versioning window but are visible in the System Documents, you need to add the Id to each one individually or run some code like the fix below.
Optional parameter for import since RavenDB 3.0.3745:
--disable-versioning-during-import
Fix of Versioning not containing an Id :
private void UpdateVersioning(string destinationRavenDbServer, string databaseName)
{
    using (var documentStore = new DocumentStore { Url = destinationRavenDbServer, DefaultDatabase = databaseName })
    {
        documentStore.Initialize();
        using (var session = documentStore.OpenSession())
        {
            var versioningInfoList = session.Advanced.LoadStartingWith<RavenJObject>("Raven/Versioning/", pageSize: 1024);
            foreach (var versioningInfo in versioningInfoList)
            {
                if (!versioningInfo.ContainsKey("Id"))
                {
                    var fullInternalId = session.Advanced.GetDocumentId(versioningInfo);
                    var idSplitted = fullInternalId.Split('/');
                    var newId = idSplitted[idSplitted.Length - 1];
                    versioningInfo.Add("Id", newId);
                }
            }
            session.SaveChanges();
        }
    }
}
Fix for re-indexing all the database:
private void ResetAllIndexes(string destinationRavenDbServer, string databaseName)
{
    using (var documentStore = new DocumentStore { Url = destinationRavenDbServer, DefaultDatabase = databaseName })
    {
        documentStore.Initialize();
        var indexes = documentStore.DatabaseCommands.GetIndexNames(0, 1024); // Resets the first 1024 indexes, but you get the idea
        foreach (var indexName in indexes)
        {
            documentStore.DatabaseCommands.ResetIndex(indexName);
        }
    }
}

Pig base 64 encoding/ store single line per record / remove newlines

I am trying to store some Pig tuple data, one tuple per line, to be processed later by an external system.
One of my fields is a bytearray representing not-so-well-structured HTML that contains newlines.
I tried using REPLACE($0.raw,'(\r\n|\n|\t)',''), to no avail, as it requires a chararray and returned errors when I tried to cast it.
Compressing the tuple, as long as it would guarantee a single line, would solve my problem.
Is there an easy way to make sure that a record will be stored in a single line (except for writing a custom UDF, although an already existing one would be perfect)?
In the end I implemented a custom UDF to convert the bytearray to base64, which I then applied to the culprit field via a standard res = FOREACH parsed GENERATE my.little.pony.udf.ByteArrayToByteArrayB64($0.raw);
The UDF definition:
package my.little.pony.udf;

import java.io.IOException;
import javax.xml.bind.DatatypeConverter;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.DataByteArray;

public class ByteArrayToByteArrayB64 extends EvalFunc<DataByteArray> {

    public DataByteArray exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0)
            return null;
        try {
            if (input.size() < 1) {
                throw new IOException("Input is of size:" + input.size());
            }
            DataByteArray data = (DataByteArray) input.get(0);
            String convertedBase64 = DatatypeConverter.printBase64Binary(data.get());
            return new DataByteArray(convertedBase64.getBytes("UTF-8"));
        } catch (ClassCastException e) {
            throw new IOException("Tuple element is really of type:" + input.get(0).getClass().getName());
        } catch (Exception e) {
            throw new IOException("Caught exception processing input row ", e);
        }
    }
}
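For completeness, the external system (if it also happens to be Java) can get the original bytes back with the matching Base64 decode; this consumer-side helper is hypothetical and only mirrors the encoding used in the UDF above:

    import javax.xml.bind.DatatypeConverter;

    // Sketch: reverse the printBase64Binary() call used in the UDF.
    public class Base64FieldDecoder {
        public static byte[] decodeField(String base64Field) {
            return DatatypeConverter.parseBase64Binary(base64Field);
        }
    }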

Bigquery: Extract Job does not create file

I am working on a Java application which uses BigQuery as the analytics engine. I was able to run query jobs (and get results) using the code from Insert a Query Job. I had to modify the code to use a service account, using this comment on Stack Overflow.
Now I need to run an extract job to export a table to a bucket on Google Storage. Based on Exporting a Table, I was able to modify the Java code to insert extract jobs (code below). When run, the extract job's status changes from PENDING to RUNNING to DONE. The problem is that no file is actually uploaded to the specified bucket.
Info that might be helpful:
The createAuthorizedClient function returns a Bigquery instance and works for query jobs, so probably no issues with the service account, private key etc.
I also tried creating and running the insert job manually using Google's API explorer, and the file is successfully created in the bucket. I am using the same values for project, dataset, table and destination URI as in the code, so these should be correct.
Here is the code (pasting the complete file in case somebody else finds this useful):
import java.io.File;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import java.util.List;

import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.http.HttpTransport;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.JsonFactory;
import com.google.api.client.json.jackson.JacksonFactory;
import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.Bigquery.Jobs.Insert;
import com.google.api.services.bigquery.BigqueryScopes;
import com.google.api.services.bigquery.model.Job;
import com.google.api.services.bigquery.model.JobConfiguration;
import com.google.api.services.bigquery.model.JobConfigurationExtract;
import com.google.api.services.bigquery.model.JobReference;
import com.google.api.services.bigquery.model.TableReference;

public class BigQueryJavaGettingStarted {

    private static final String PROJECT_ID = "123456789012";
    private static final String DATASET_ID = "MY_DATASET_NAME";
    private static final String TABLE_TO_EXPORT = "MY_TABLE_NAME";
    private static final String SERVICE_ACCOUNT_ID = "123456789012-...@developer.gserviceaccount.com";
    private static final File PRIVATE_KEY_FILE = new File("/path/to/privatekey.p12");
    private static final String DESTINATION_URI = "gs://mybucket/file.csv";

    private static final List<String> SCOPES = Arrays.asList(BigqueryScopes.BIGQUERY);
    private static final HttpTransport TRANSPORT = new NetHttpTransport();
    private static final JsonFactory JSON_FACTORY = new JacksonFactory();

    public static void main(String[] args) {
        try {
            executeExtractJob();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static final void executeExtractJob() throws IOException, InterruptedException, GeneralSecurityException {
        Bigquery bigquery = createAuthorizedClient();

        //Create a new Extract job
        Job job = new Job();
        JobConfiguration config = new JobConfiguration();
        JobConfigurationExtract extractConfig = new JobConfigurationExtract();
        TableReference sourceTable = new TableReference();
        sourceTable.setProjectId(PROJECT_ID).setDatasetId(DATASET_ID).setTableId(TABLE_TO_EXPORT);
        extractConfig.setSourceTable(sourceTable);
        extractConfig.setDestinationUri(DESTINATION_URI);
        config.setExtract(extractConfig);
        job.setConfiguration(config);

        //Insert/Execute the created extract job
        Insert insert = bigquery.jobs().insert(PROJECT_ID, job);
        insert.setProjectId(PROJECT_ID);
        JobReference jobId = insert.execute().getJobReference();

        //Now check to see if the job has successfully completed (optional for extract jobs?)
        long startTime = System.currentTimeMillis();
        long elapsedTime;
        while (true) {
            Job pollJob = bigquery.jobs().get(PROJECT_ID, jobId.getJobId()).execute();
            elapsedTime = System.currentTimeMillis() - startTime;
            System.out.format("Job status (%dms) %s: %s\n", elapsedTime, jobId.getJobId(), pollJob.getStatus().getState());
            if (pollJob.getStatus().getState().equals("DONE")) {
                break;
            }
            //Wait a second before rechecking job status
            Thread.sleep(1000);
        }
    }

    private static Bigquery createAuthorizedClient() throws GeneralSecurityException, IOException {
        GoogleCredential credential = new GoogleCredential.Builder()
                .setTransport(TRANSPORT)
                .setJsonFactory(JSON_FACTORY)
                .setServiceAccountScopes(SCOPES)
                .setServiceAccountId(SERVICE_ACCOUNT_ID)
                .setServiceAccountPrivateKeyFromP12File(PRIVATE_KEY_FILE)
                .build();

        return Bigquery.builder(TRANSPORT, JSON_FACTORY)
                .setApplicationName("My Reports")
                .setHttpRequestInitializer(credential)
                .build();
    }
}
Here is the output:
Job status (337ms) job_dc08f7327e3d48cc9b5ba708efe5b6b5: PENDING
...
Job status (9186ms) job_dc08f7327e3d48cc9b5ba708efe5b6b5: PENDING
Job status (10798ms) job_dc08f7327e3d48cc9b5ba708efe5b6b5: RUNNING
...
Job status (53952ms) job_dc08f7327e3d48cc9b5ba708efe5b6b5: RUNNING
Job status (55531ms) job_dc08f7327e3d48cc9b5ba708efe5b6b5: DONE
It is a small table (about 4 MB), so the job taking about a minute seems OK. I have no idea why no file is created in the bucket, or how to go about debugging this. Any help would be appreciated.
As Craig pointed out, I printed the status.errorResult() and status.errors() values:
getErrorResults(): {"message":"Backend error. Job aborted.","reason":"internalError"}
getErrors(): null
It looks like there was an access denied error writing to the path: gs://pixalate_test/from_java.csv. Can you make sure that the user that was performing the export job has write access to the bucket (and that the file doesn't already exist)?
I've filed an internal bigquery bug on this issue ... we should give a better error in this situation.
I believe the problem is with the bucket name you're using -- mybucket above is just an example, you need to replace that with a bucket you actually own in Google Storage. If you've never used GS before, the intro docs will help.
Your second question was how to debug this -- I'd recommend looking at the returned Job object once the status is set to DONE. Jobs that end in an error still make it to the DONE state; the difference is that they have an error result attached, so job.getStatus().hasErrorResult() should be true. (I've never used the Java client libraries, so I'm guessing at that method name.) You can find more information in the jobs docs.
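With the model classes used in the question's code, that check could look roughly like the snippet below inside the polling loop; as far as I know the JobStatus class exposes getErrorResult() and getErrors() rather than hasErrorResult(), but treat the exact names as an assumption:

    // Once the job reaches DONE, inspect the status for an attached error result.
    if (pollJob.getStatus().getState().equals("DONE")) {
        if (pollJob.getStatus().getErrorResult() != null) {
            System.out.println("Job failed: " + pollJob.getStatus().getErrorResult().getMessage());
            System.out.println("All errors: " + pollJob.getStatus().getErrors());
        } else {
            System.out.println("Extract job completed successfully");
        }
        break;
    }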
One more difference I notice is that you are not passing the job type, as in config.setJobType(JOB_TYPE);
where the constant is private static final String JOB_TYPE = "extract";
Also, for JSON you need to set the destination format as well.
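For example (assuming JobConfigurationExtract in the Java client exposes a setDestinationFormat setter, which is my reading of the API), exporting as JSON would be configured with something like:

    // Assumption: the destination format takes the same string values as the REST API.
    extractConfig.setDestinationFormat("NEWLINE_DELIMITED_JSON");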
I had the same problem, but it turned out that I had typed the name of the table wrong. However, Google did not generate an error message saying "the table does not exist", which would have helped me locate my problem.
Thanks!