Best way to recover from HSQLDB LOBS file growth - hsqldb

I have an HSQLDB database with lots of LOBs. The LOB file has grown to a point where it is causing the machine to crash.
I have some data in the LOBs file which can be deleted.
What is the best way to deal with this situation? Will performing a backup export only the actual LOBs, or will the file be exported with its internal free space (i.e., the whole sparse LOBs file)?
I have tried CHECKPOINT DEFRAG, but this doesn't seem to work in 2.3.2: the LOBs file continues to grow after deleting the unused LOBs.

With version 2.3.4 a CHECKPOINT results in the truncation of the .lobs file to the last LOB that is referenced in database tables. Any empty spaces within the file will then be reused for future lobs.
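To make the truncation behaviour described above concrete, here is a minimal toy model in Python. It is only a sketch of the idea (cut the file after the last referenced LOB, keep earlier gaps for reuse) and not HSQLDB's actual implementation; the function name and the list-of-booleans layout are assumptions made up for illustration.

```python
def truncate_to_last_live(blocks):
    """Toy model: blocks is a list of bools, True = block holds a LOB
    that is still referenced from a table (hypothetical layout)."""
    # Index of the last block that is still referenced, or -1 if none is.
    last_live = max((i for i, live in enumerate(blocks) if live), default=-1)
    # Everything after the last referenced block can be cut off the file.
    kept = blocks[: last_live + 1]
    # Unreferenced blocks before that point stay in the file as reusable gaps.
    reusable_holes = [i for i, live in enumerate(kept) if not live]
    return kept, reusable_holes


layout = [True, False, False, True, False, False, False]
kept, holes = truncate_to_last_live(layout)
print(len(layout), "->", len(kept), "blocks; reusable holes at", holes)
```

In this model a single live LOB near the end of the file keeps the file large, which is why the column-conversion approach can help when few live LOBs remain.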
If there are relatively few live LOBs in the database, you can convert the columns that hold LOBs to VARCHAR or VARBINARY, perform a CHECKPOINT, then SHUTDOWN, and then delete the .lobs file. You can then reopen the database and convert the column types back to CLOB and BLOB.

Related

Where does oracle store undo data? In memory or in hard disk

I know about the undo tablespace, which is permanent. Does that mean Oracle stores undo data on disk?
Data files are stored on disk (source):
"A data file is a physical file on disk that was created by Oracle Database and contains data structures such as tables and indexes. A temp file is a data file that belongs to a temporary tablespace. The data is written to these files in an Oracle proprietary format that cannot be read by other programs."
And tablespaces are stored in data files (source):
"Oracle stores data logically in tablespaces and physically in datafiles associated with the corresponding tablespace."
This means the undo tablespace is stored on disk.

File size in hive with different file formats

I have a small file (2 MB). I created an external Hive table over this file (stored as textfile). I created another table (stored as ORC) and copied the data from the previous table. When I checked the size of the data in the ORC table, it was more than 2 MB.
ORC is a compressed file format, so shouldn't the data size be less?
As of Hive 0.14, users can request an efficient merge of small ORC files together by issuing a CONCATENATE command on their table or partition. The files will be merged at the stripe level without reserialization.
ALTER TABLE istari [PARTITION partition_spec] CONCATENATE;
It's because your source file is too small. ORC has a complex structure with internal indexes, headers, footers, and a postscript, and the compression codecs add some structures of their own.
See this for details: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-ORCFileFormat
All these supporting structures consume more space than the data. For such a small file you really do not need to store min/max values for columns or Bloom filters, since your file may fit in memory. The best storage for this case is an uncompressed text file. You can also try just gzipping your source file and checking its size. A gzipped file that is too small may be bigger than the uncompressed one. The bigger the file, the more benefit you get from compression and from using ORC.
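The point about very small inputs can be checked quickly with gzip from Python's standard library. This is a minimal sketch: the sample payloads and the helper name gzip_sizes are made up for illustration, and it measures gzip only, not ORC.

```python
import gzip


def gzip_sizes(payload: bytes) -> tuple[int, int]:
    """Return (raw size, gzip-compressed size) in bytes."""
    return len(payload), len(gzip.compress(payload))


# Hypothetical sample data: a few bytes versus a large, repetitive file.
tiny = b"id,name\n1,a\n"
large = b"id,name\n" + b"1,a\n" * 100_000

for label, data in (("tiny", tiny), ("large", large)):
    raw, packed = gzip_sizes(data)
    print(f"{label}: raw={raw} bytes, gzip={packed} bytes")
```

The gzip container adds a fixed header and trailer, so the tiny payload comes out larger than its input, while the large repetitive one shrinks considerably. The same kind of fixed overhead, on a larger scale, applies to the ORC metadata mentioned above.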

How to safely release unallocated space for 1 table in a database?

Our production database is on SQL Server 2008 R2. One of our tables, Document_Details, stores documents that users upload via our application (VB). They are stored as varbinary(max). There are over 20k PDF files, and many of them are large (some are 50 MB each), so overall this table is 90 GB. We then ran an exe that compressed these PDF files down to 10 GB.
However, here lies the problem: the table is still 90 GB in size. The unallocated space hasn't been released. How do I release this space so that the table is 10 GB?
I tried moving the table to a new filegroup and then back to original filegroup but in either case it didn't release any space.
I also tried rebuilding the index on the table but that didn't work either.
What did work (but I heard it isn't recommended) was - change the recovery type from Simple, Shrink the filegroup, set recovery to Full.
Could I move this table to a new filegroup and then shrink that filegroup (i.e. just the Document_Details table)? I know the shrink command affects performance but if it's just 1 table would it still be a problem? Or is there anything else I can try?
Thanks.
Moving a table to a filegroup has one problem: by default the TEXTIMAGE data (the blobs) are not moved! A table's rows can reside on one filegroup and its blobs on another. This is a crazy defect in SQL Server. Maybe when you rebuilt the table the blobs were simply not touched.
Use one of the well-known methods to move lob data as well. That would rebuild the lobs and shrink them.

Create database backup, ignore column

I'd like to create a database backup using SSMS.
The backup file will be a .bak file, but I would like to ignore 1 column in a certain table, because this column isn't necessary, but it takes up 95% of the backup size.
The column values should all be replaced by 0x00 (column type is varbinary(max), not null).
What's the best way to do this?
FYI: I know how to generate a regular backup using Tasks => Back Up...
There is a long way of doing what you ask. It basically means creating a newly restored database, removing the data you don't need, and then taking a new backup.
1. Create a backup of the production database.
2. Restore the backup locally on production with a new name.
3. Update the column with 0x00.
4. Shrink the database (shrinking is helpful when doing a restore; it won't reduce the .bak file size).
5. Take a backup of the new database (also use backup compression to reduce the size even more).
6. FTP the .bak file.
If you only needed a few tables, you could have used bcp but that looks out of the picture for your current requirement.
With SQL Server native backups, you can't. You'd have to restore the database to some other location and then migrate the useful data.
You can create a copy of your table without the column and back it up using filegroups: https://msdn.microsoft.com/en-us/library/ms191539(SQL.90).aspx

table size not reducing after purged table

I recently purged one of my application tables. It held 1.1 million records and used 11.12 GB of disk space.
I deleted 860k records, leaving 290k, so why did the space used only drop to 11.09 GB?
I am looking at the figures under Reports - Disk Usage - Disk Space Used by Data Files - Space Used.
Do I need to shrink the data file? This has puzzled me for a long time.
For MS SQL Server, rebuild the clustered indexes.
You have only deleted rows: not reclaimed space.
Use DBCC DBREINDEX or ALTER INDEX ... REBUILD, depending on the version.
(It's MS SQL because the disk space report is in SSMS)
You need to explicitly call some operation (specific to your database management system) that will shrink the data file. The database engine doesn't shrink the file when you delete records. That is an optimization, because shrinking is time-consuming.
I think this is like mail folders in Thunderbird: if you delete something, it's just marked as deleted, and for performance reasons the space isn't freed. So most of your 11.09 GB now contains either your old data or zeros. Shrinking the data file compacts it, so that the file holds only the data that is actually left.
You probably need to shrink the table. I know that SQL Server doesn't do it for you by default. I would guess this is for performance reasons, and other databases may behave the same way.
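The "deleted but not freed" idea can be shown with a small toy model. This is only a sketch under made-up assumptions (a class ToyDataFile, fixed-size pages holding 4 rows each) and does not reflect how SQL Server actually manages pages. It just shows why the allocated size can stay the same after a delete until the data is compacted.

```python
from dataclasses import dataclass, field

ROWS_PER_PAGE = 4  # hypothetical page capacity, chosen for illustration


@dataclass
class ToyDataFile:
    pages: list = field(default_factory=list)

    def insert(self, row):
        # Allocate a new page only when the last one is full.
        if not self.pages or len(self.pages[-1]) >= ROWS_PER_PAGE:
            self.pages.append([])
        self.pages[-1].append(row)

    def delete_where(self, predicate):
        # Rows disappear, but every page stays allocated.
        self.pages = [[r for r in page if not predicate(r)] for page in self.pages]

    def compact(self):
        # Repack the surviving rows densely and drop the pages left over.
        live = [r for page in self.pages for r in page]
        self.pages = []
        for r in live:
            self.insert(r)

    @property
    def allocated_pages(self):
        return len(self.pages)


f = ToyDataFile()
for i in range(100):
    f.insert(i)
before = f.allocated_pages

f.delete_where(lambda r: r % 4 != 0)  # delete 75 of the 100 rows
after_delete = f.allocated_pages

f.compact()
after_compact = f.allocated_pages
print(before, after_delete, after_compact)
```

Here the page count is unchanged by the delete and only drops once compact() runs, which is the role a rebuild or shrink plays in the real system.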