HSQL database storage format - hsqldb

I am just starting to use the HSQL database and I don't understand its data storage format.
I made a simple test program that creates an entity via Hibernate, using HSQL in file-based, standalone in-process mode.
I got this data file:
testdb.script
SET DATABASE UNIQUE NAME HSQLDB52B647B0B4
// SET DATABASE lines skipped
INSERT INTO PERSON VALUES(1,'Peter','UUU')
INSERT INTO PERSON VALUES(2,'Nasta','Kuzminova')
INSERT INTO PERSON VALUES(3,'Peter','Sagan')
INSERT INTO PERSON VALUES(4,'Nasta','Kuzminova')
INSERT INTO PERSON VALUES(5,'Peter','Sagan')
INSERT INTO PERSON VALUES(6,'Nasta','Kuzminova')
Do I understand correctly that, once I have a lot of data, all of it will be stored in such an SQL script, executed on every database startup, and kept in memory?

The INSERT statements in the .script file are for MEMORY tables of the database.
If you create a table with CREATE CACHED TABLE ..., or change a MEMORY table to CACHED with SET TABLE <name> TYPE CACHED, the data for the table is stored in the .data file and there will be no INSERT statements in the .script file.
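For example, a minimal sketch (the column list for PERSON is assumed from the INSERT statements above; adjust it to your real schema):
-- Create the table as CACHED from the start:
CREATE CACHED TABLE PERSON (
    ID INTEGER PRIMARY KEY,
    FIRSTNAME VARCHAR(50),
    LASTNAME VARCHAR(50)
);

-- Or convert an existing MEMORY table created by Hibernate:
SET TABLE PERSON TYPE CACHED;
With Hibernate you can also make CACHED the default table type by adding hsqldb.default_table_type=cached to the JDBC connection URL (check the HSQLDB documentation for your version).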

Related

SSIS schema switching deadlocks

I have an SSIS package which takes data that has changed in the last half hour and transfers it from a DB2 database into SQL Server. This data is loaded into an empty import table (import.tablename), then inserted into a staging table (newlive.tablename). The staging table is then schema-switched with the live (dbo) table within a transaction. FYI, the dbo tables are the backend of a visualization tool (Looker).
My problem is that the schema switching is now creating deadlocks. Every time I run the package, it affects different tables. I've used this process with larger tables (also a backend to Looker) and have not had this problem before.
I read in another post that a user was having a similar problem because of indexes, but all the data has already been written to the destination tables.
Any ideas or suggestions of where to look would be much appreciated.
The schema switching code is within an Execute SQL Task in the SSIS package:
BEGIN TRAN
ALTER SCHEMA LAST_LIVE TRANSFER DBO.TABLENAME
ALTER SCHEMA DBO TRANSFER NEW_LIVE.TABLENAME
GRANT SELECT ON DBO.TABLENAME TO LOOKER_LOOKUP
COMMIT TRAN

SQL Server bulk insert for large data set

I have 1 million rows of data in a file and I want to insert all of the records into SQL Server. While inserting, I compare each record with the existing data on the server; if the comparison matches I update the existing record on the server, otherwise I insert the record from the file.
I'm currently doing this by looping in C#, which takes more than 3 hours to complete. Can anyone suggest ideas to improve the performance?
Thanks,
Xavier.
Check whether your database is in FULL or SIMPLE recovery mode:
SELECT recovery_model_desc
FROM sys.databases
WHERE name = 'MyDataBase';
If the database is in SIMPLE recovery mode you can create a staging table right there. If it is in FULL mode, it is better to create the staging table in a separate database that uses the SIMPLE model.
Use any bulk-insert operation/tool (for instance BCP, as already suggested).
Insert only those rows from your staging table which do not exist in your target table (a rough sketch follows below).
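For illustration, a rough sketch of steps 2 and 3; all table, file, and column names here are made up, and the UPDATE covers the "update existing records" part of the question:
-- Step 2: bulk load the file into the staging table
-- (bcp or an SSIS data flow would work just as well as BULK INSERT)
BULK INSERT StagingDb.dbo.StagingRecords
FROM 'C:\data\records.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK, BATCHSIZE = 100000);

-- Update target rows that already exist ...
UPDATE t
SET    t.SomeColumn = s.SomeColumn
FROM   dbo.TargetRecords AS t
JOIN   StagingDb.dbo.StagingRecords AS s ON s.Id = t.Id;

-- Step 3: ... and insert only the rows that do not exist yet
INSERT INTO dbo.TargetRecords (Id, SomeColumn)
SELECT s.Id, s.SomeColumn
FROM   StagingDb.dbo.StagingRecords AS s
WHERE  NOT EXISTS (SELECT 1 FROM dbo.TargetRecords AS t WHERE t.Id = s.Id);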

OleDB Destination executes full rollback on error, Bulk Insert Task doesn't

I'm using SSIS and BIDS to process a text file that contains lots (millions) of records. I decided to use the Bulk Insert Task and it worked great, but then the destination table needed an additional column with a default value on the insert operation, and the Bulk Insert Task stopped working. After that, I decided to use a Derived Column with the default value and an OleDB Destination to insert the bulk data. That solved my last problem but created a new one: if there is an error when inserting the data with the OleDB Destination, it executes a full rollback and no rows are added to my table, whereas when I used the Bulk Insert Task, rows were kept according to the BatchSize configuration. Let me explain with a sample:
I used a text file with 5000 lines. The file intentionally contained a duplicate id somewhere between rows 3000 and 4000.
Before starting the DTS, the destination table was totally empty.
Using the Bulk Insert Task, after the error was raised (and the DTS stopped), the destination table had 3000 rows. I had set the BatchSize attribute to 1000.
Using the OleDB Destination, after the error was raised, the destination table had 0 rows! I set the Rows per batch attribute to 1000 and the Maximum insert commit size to its maximum value, 2147483647. I also tried changing the latter to 0, with no effect.
Is this the normal behavior of the OleDB Destination? Can someone point me to a guide on working with these tasks? Should I forget these tasks and use BULK INSERT from T-SQL instead?
As a side note, I also tried following the instructions for KEEPNULLS in Keep Nulls or Use Default Values During Bulk Import (SQL Server) so I would not have to use the OleDB Destination task, but it didn't work (maybe it's just me).
EDIT: Additional info about the problem.
Table structure (sample)
Table T
id int, name varchar(50), processed int default 0
CSV File (sample)
1, hello
2, world
There is no rolling back with bulk inserts; that's why they are fast.
Take a look at using format files:
http://msdn.microsoft.com/en-us/library/ms179250.aspx
You could potentially place this in a transaction in SSIS (you'll need MSDTC running), or you could create a T-SQL script with a try/catch that handles any exceptions from the bulk insert (probably just rolling back or committing).
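For the T-SQL route, a rough sketch of what such a try/catch could look like; the file path, table name, and options are placeholders, not taken from the question, and a format file may be needed if the file has fewer columns than the table (see the link above):
BEGIN TRY
    BEGIN TRANSACTION;

    BULK INSERT dbo.TargetTable
    FROM 'C:\data\input.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', KEEPNULLS);

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Roll back everything, mimicking the all-or-nothing behavior of the OleDB Destination
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    -- Re-raise so the calling SSIS task still reports a failure
    RAISERROR('Bulk insert failed; transaction rolled back.', 16, 1);
END CATCH;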

Dumping a table's content in sqlite3 to be imported into a new database

Is there an easy way of dumping a SQLite database table into a text string with insert statements to be imported into the same table of a different database?
In my specific example, I have a table called log_entries with various columns. At the end of every day, I'd like to create a string which can then be dumped into another database that has a table of the same structure called archive (and then empty the log_entries table).
I know about the ATTACH command for creating new databases. I actually want to append to an existing database rather than creating a new one every day.
Thanks!
ATTACH "%backup_file%" AS Backup;
INSERT INTO Backup.Archive SELECT * FROM log_entries;
DELETE FROM log_entries;
DETACH Backup;
All you need to do is replace %backup_file% with the path to your backup database. This approach assumes that your Archive table is already defined and that you are using the same database file to accumulate your archive.
$ sqlite3 exclusion.sqlite '.dump exclusion'
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE exclusion (word string);
INSERT INTO "exclusion" VALUES('books');
INSERT INTO "exclusion" VALUES('rendezvousing');
INSERT INTO "exclusion" VALUES('motherlands');
INSERT INTO "exclusion" VALUES('excerpt');
...

Create a table called #test, and create a table called test in the tempdb, what's the difference?

In SQL Server 2005 (2008 not tested), you can't create a temp function like #function_name, but you can create a function called function_name directly in tempdb. Is a function created this way a temp function?
What's the difference between a table called #table_name and a table with the same name created directly in tempdb?
A temp table (#test) isn't actually called #test in the tempdb database, because every user on the system can create a table called #test. If you create a temp object, the physical object in tempdb gets a unique, padded name (which you can find by looking at the sys.all_objects catalog view). In my case it was created as "#test_______________________________________________________________________________________________________________000000000003". If instead you create a physical table in the tempdb database, it is called test, only one user at a time can create that object, and if multiple users put data into a physical table called test they will be able to access each other's data. With temp tables, users can only access their own table and their own data.
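To see this for yourself, a small sketch (run it in SSMS; the padded name and its trailing hex number will differ on your server):
-- A session-scoped temp table:
CREATE TABLE #test (id int);

-- The physical object behind it in tempdb has a padded, unique name:
SELECT name
FROM tempdb.sys.all_objects
WHERE name LIKE '#test%';

-- A real table created directly in tempdb keeps its plain name
-- and is shared by everyone until dropped or the server restarts:
CREATE TABLE tempdb.dbo.test (id int);

SELECT name
FROM tempdb.sys.all_objects
WHERE name = 'test';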
One obvious difference is that a table created directly in tempdb won't be dropped automagically when your connection ends. Beyond that, I find it quite useful for quickly testing something out on my development machine when I don't want to bother cleaning up explicitly. tempdb gets recreated after a server restart, so it shouldn't be used if you want any sort of persistence.