How to add or route PostgreSQL Data to New Hard Drive - sql

I'm using Windows Server 2008 R2 Standard.
I'm running PostgreSQL 9.0.1, compiled by Visual C++ build 1500, 32-bit.
I have a C:/ and a D:/ drive:
C:/ --> 6.7 GB free space (almost full, and my server performance is suffering)
D:/ --> 141 GB free space
Currently my PostgreSQL data is stored on C:/. I want to route new data to D:/, or add D:/ as an additional path, without migrating the existing data from C:/ to D:/, because the PostgreSQL data is already around 148 GB and moving it would be a heavy operation.
If this works, will I still be able to run a query like SELECT * FROM table_bla_bla and have it return results from both drives?
Please do not suggest switching from PostgreSQL to another database.
I'm running 39,763 GPS meter devices that send data to my server.
I have to take care of this server myself because our expert has passed away.

You need to use tablespaces.
Create the tablespace, for example: CREATE TABLESPACE second_drive LOCATION 'D:/postgresdata/' (see this other answer if you get permission-denied errors).
Then move a table onto it: ALTER TABLE table_bla_bla SET TABLESPACE second_drive;
Tablespaces let you decide which tables go on which drives. That can help performance, because you control where reads and writes go, and it also helps with space.
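Putting those two statements together, a minimal sketch (the path and table name come from the question; your_database is a placeholder, and the D:/postgresdata directory must already exist and be writable by the PostgreSQL service account):

-- create a tablespace on the D: drive
CREATE TABLESPACE second_drive LOCATION 'D:/postgresdata/';

-- move an existing table (and its data) onto the new tablespace
ALTER TABLE table_bla_bla SET TABLESPACE second_drive;

-- optionally, make new tables in this database go to D: by default
ALTER DATABASE your_database SET default_tablespace = second_drive;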

Postgres places individual tables in tablespaces (each of which maps to a single disk location). If you have multiple tables, that alone may be enough: you can achieve what you need by moving some of the tables to the other disk.
On the other hand, if you have a single large table that you need to split over multiple disks, you need to use Postgres's horizontal partitioning capability.
This builds on tablespaces by letting you create a master table table_bla_bla which is actually just a facade on top of two or more child tables that hold the data. These child tables can then be put on different tablespaces, effectively splitting your data over the disks.
For this you would:
Rename your current table_bla_bla to something like table_bla_bla_c.
Create a new table_bla_bla master table.
Alter table_bla_bla_c to mark that it inherits from table_bla_bla.
Create a new table_bla_bla_d table that inherits from table_bla_bla and specify its tablespace as the D drive.
Apply partitioning triggers and check constraints as per the partitioning documentation.
Once this is in place, you can arrange it so that any inserts into table_bla_bla cause new records to be created on the D drive. Selects on table_bla_bla will read from both disks.
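A minimal sketch of those steps for PostgreSQL 9.0 (the routing trigger is an assumption about how you want to split the data; in practice you would also add the check constraints the partitioning documentation describes):

-- 1. keep the existing data where it is, under a new name
ALTER TABLE table_bla_bla RENAME TO table_bla_bla_c;

-- 2. new, empty master table with the same structure
CREATE TABLE table_bla_bla (LIKE table_bla_bla_c INCLUDING DEFAULTS);

-- 3. the old table becomes a child of the master
ALTER TABLE table_bla_bla_c INHERIT table_bla_bla;

-- 4. new child table stored on the D: drive tablespace
CREATE TABLE table_bla_bla_d () INHERITS (table_bla_bla) TABLESPACE second_drive;

-- 5. route inserts on the master into the child on the D: drive
CREATE OR REPLACE FUNCTION table_bla_bla_insert() RETURNS trigger AS $$
BEGIN
    INSERT INTO table_bla_bla_d VALUES (NEW.*);
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER table_bla_bla_insert_trg
    BEFORE INSERT ON table_bla_bla
    FOR EACH ROW EXECUTE PROCEDURE table_bla_bla_insert();

With this in place, SELECT * FROM table_bla_bla reads from both child tables (both drives), while new rows land on D:.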

Related

create tablespace in ASM in oracle rac db

I want to create a tablespace on ASM in an Oracle RAC database: create tablespace <tablespacexxx> DATAFILE '+data';. Here data is one disk group. I saw that there are multiple disk groups when executing select * from V$ASM_DISKGROUP;
There is one disk group called Data and another called Reco. The voting_files column differs between the two groups: voting_files for Data is set to Y, and voting_files for Reco is set to N. I am wondering what the differences between the two are, and whether I can use either one to create tablespaces.
Use +DATA for your datafiles. +RECO or +FRA should be for the fast recovery area, online redo logs, voting files, and control files.
The Fast Recovery Area is Oracle-managed disk space that provides a centralized disk location for backup and recovery files.
https://docs.oracle.com/en/database/oracle/oracle-database/19/haovw/oracle-database-configuration-best-practices.html#GUID-DB511B6A-D220-4556-B31B-18E3D310BA61
https://docs.oracle.com/en/database/oracle/oracle-database/19/ostmg/create-diskgroups.html#GUID-CACF13FD-1CEF-4A2B-BF17-DB4CF0E1800C
According to the disk group names, DATA is supposed to contain user data, and RECO is meant to store recovery files. It is more of a convention agreed among the developers and DBAs.
So your new tablespace should be placed on DATA.
Voting disks manage information about RAC node membership. You should not use them to store any user data.
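Sticking with the command from the question, a minimal sketch (the tablespace name and sizes are placeholders; ASM generates the actual file name inside the disk group):

-- create the tablespace on the DATA disk group
CREATE TABLESPACE my_app_data DATAFILE '+DATA' SIZE 1G AUTOEXTEND ON NEXT 100M;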

SQL: How to alter all varchar column in all tables to nvarchar?

I need to convert all varchar columns in about 40 tables (filled with data) to nvarchar columns. This is planned to happen on a dedicated MS SQL Server used only for this purpose. The result should then be moved to Azure SQL.
Where should the conversion be done: on the old SQL Server, or after moving the data to Azure SQL?
According to Remus Rusanu's answer https://stackoverflow.com/a/8157951/1346705, new nvarchar columns are created in the process, and the old varchar columns are dropped. The space can be reclaimed by DBCC CLEANTABLE or by using ALTER TABLE ... REBUILD. Are the dropped varchar columns carried along in the backup, or does a backup/restore also remove the dropped columns?
Can the process be somehow automated using a universal SQL script? Or is it necessary to write the script for each individual table?
Context: We are a third party with respect to the enterprise information system (IS). Our product reads from the IS SQL database and presents the data in a way that would otherwise be expensive to implement in the IS. The IS is now being migrated to a new version that runs on Azure SQL. Its database has been changed heavily, and one of the changes was to abandon the old 8-bit text encoding (varchar) and use Unicode instead (nvarchar). Our system was also used for collecting manually typed data, using the same encoding the old IS used.
Migration is to be done by taking an old-style backup (SqlCmd producing xxx.bak files) and restoring it on another plain old SQL Server. Then we run a script that removes all the tables, views, and stored procedures that can be reconstructed from the IS; one of the main reasons is that the SQL code uses features that are not accepted by SqlPackage.exe, the new tool used to produce the xxx.bacpac file. The bacpac file is then restored into Azure SQL.
Where should the conversion be done: on the old SQL Server, or after moving the data to Azure SQL?
I would do it on the local SQL Server first. Running this on the Azure database might cause you to run into issues like hitting your DTU limits or disk IO throttling.
Are the dropped varchar columns carried along in the backup, or does a backup/restore also remove the dropped columns?
The space won't be released back to the filesystem, and a backup doesn't carry free space, so you will not see much change there. You might want to read more on DBCC CLEANTABLE, though, before proceeding.
Can the process be somehow automated using a universal SQL script? Or is it necessary to write the script for each individual table?
It can be automated; you could use dynamic SQL to find the columns of that type and process them further. You will also have to check whether any of those columns are part of indexes; if so, you have to drop those indexes first.
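A minimal sketch of that dynamic-SQL approach (it only generates the ALTER statements for review, and it assumes the columns are not referenced by indexes, constraints, or computed columns, which you would have to handle separately):

-- generate one ALTER TABLE ... ALTER COLUMN statement per varchar column
-- note: varchar lengths above 4000 have to become nvarchar(max)
SELECT 'ALTER TABLE ' + QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME)
     + ' ALTER COLUMN ' + QUOTENAME(COLUMN_NAME)
     + ' nvarchar(' + CASE WHEN CHARACTER_MAXIMUM_LENGTH = -1
                             OR CHARACTER_MAXIMUM_LENGTH > 4000
                           THEN 'max'
                           ELSE CAST(CHARACTER_MAXIMUM_LENGTH AS varchar(10)) END + ')'
     + CASE WHEN IS_NULLABLE = 'NO' THEN ' NOT NULL' ELSE ' NULL' END + ';'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE = 'varchar';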
I suggest making the schema changes beforehand on the old instance. Even if you don't bother cleaning up space with DBCC CLEANTABLE or ALTER...REBUILD, the resultant bacpac size will be the same because, unlike a physical backup/restore, a bacpac file is just a compressed package of schema and data.
Consider using SQL Server Data Tools (SSDT) to facilitate the schema changes. It will take into account all the dependencies (constraints, indexes, etc.) that are a challenge for a "universal" T-SQL solution. SSDT will generally generate a migration script that employs temp tables for such schema changes, so the end result won't have wasted space in your old database. However, you will need sufficient unused space in the database to hold the old and new objects side by side.

Update Hive metadata location for many tables

I would like to change the bucket name in the location of many Hive tables. Is it possible for us to connect to the MySQL metastore database and update it directly? I think it is possible, but I would like to know if it is safe to do this on a production database.
Yes, it is possible, and I have seen it done; but
(a) the Metastore schema is not documented, and each Hive version brings some minor changes, so you have to do your own exploration to find where/how the StorageDescriptor objects are persisted -- then some unit tests / non-regression tests on a Dev system -- plus, don't forget to run a full DB backup before tinkering with your Prod system (and to rehearse an emergency restoration on your Dev system, too!)
(b) you have to update the StorageDescriptor for tables, but also for partitions -- remember that for partitioned tables, the table-level LOCATION is just used as default root dir for future partitions; once created, a partition retains its location until it is ALTERed explicitly.
For the record, the preferred method for bulk updates is (in theory) the Hive MetaTool, but unfortunately it does not support the kind of updates that you need. Right now it's only good for changing the NameNode alias in all HDFS paths, because that was a real pain point...
A valid alternative to brutal SQL Updates would be to develop a custom Java program, using the Hive MetaStore API, to scan all tables & partitions then read their StorageDescriptor then run RegEx changes on their Location then write back the changes (which is exactly what the MetaTool does, only at a lower level). But that would be overkill.
Finally, a possible compromise would be a SQL SELECT on the appropriate MySQL tables, to generate (with regexp_replace()) a chain of ALTER Table/Partition LOCATION commands to run later in the Hive CLI. Plus a chain of ALTERs to revert to the original locations, in case you have to do an emergency rollback :-/
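A minimal sketch of that generator query against the MySQL metastore (it assumes a common metastore layout where locations live in SDS.LOCATION and table names in TBLS/DBS, which you should verify against your Hive version, plus a MySQL/MariaDB version that has regexp_replace(); the bucket names are placeholders):

-- emit one ALTER TABLE ... SET LOCATION per table still pointing at the old bucket
SELECT CONCAT('ALTER TABLE ', d.NAME, '.', t.TBL_NAME,
              ' SET LOCATION ''',
              REGEXP_REPLACE(s.LOCATION, 's3://old-bucket', 's3://new-bucket'),
              ''';')
FROM TBLS t
JOIN DBS d ON d.DB_ID = t.DB_ID
JOIN SDS s ON s.SD_ID = t.SD_ID
WHERE s.LOCATION LIKE 's3://old-bucket%';

A similar query joined through the PARTITIONS table would generate the per-partition ALTER ... PARTITION (...) SET LOCATION statements.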

How to safely release unallocated space for 1 table in a database?

Our production database is on SQL Server 2008 R2. One of our tables, Document_Details, stores documents that users upload via our application (VB). They are stored in varbinary(max) format. There are over 20k files in PDF format and many of these are large (some are 50 MB each), so overall this table is 90 GB. We then ran an exe that compressed these PDF files down to 10 GB.
However, here lies the problem: the table is still 90 GB in size. The unallocated space hasn't been released. How do I release this space so that the table is 10 GB?
I tried moving the table to a new filegroup and then back to original filegroup but in either case it didn't release any space.
I also tried rebuilding the index on the table but that didn't work either.
What did work (but I heard it isn't recommended) was: change the recovery model to Simple, shrink the filegroup, then set the recovery model back to Full.
Could I move this table to a new filegroup and then shrink that filegroup (i.e. just the Document_Details table)? I know the shrink command affects performance but if it's just 1 table would it still be a problem? Or is there anything else I can try?
Thanks.
Moving a table to a filegroup has one problem: by default the TEXTIMAGE data (the blobs) are not moved! A table's rows can reside on one filegroup and the blobs on another. This is a crazy defect in SQL Server. Maybe by rebuilding the table the blobs were simply not touched.
Use one of the well-known methods to move the LOB data as well. That would rebuild the LOBs and reclaim the space.
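One such method is to rebuild the table on a new filegroup with the LOB data explicitly placed there via TEXTIMAGE_ON. A minimal sketch (the column list and filegroup name are placeholders, and you need enough free space to hold both copies while it runs):

-- new table on the new filegroup, with the LOB data placed there as well
CREATE TABLE dbo.Document_Details_new (
    DocumentId   int            NOT NULL PRIMARY KEY,
    DocumentBlob varbinary(max) NULL
    -- ... remaining columns as in the original table
) ON [NewFilegroup] TEXTIMAGE_ON [NewFilegroup];

-- copy the (now compressed) rows, then swap the tables
INSERT INTO dbo.Document_Details_new (DocumentId, DocumentBlob)
SELECT DocumentId, DocumentBlob FROM dbo.Document_Details;

EXEC sp_rename 'dbo.Document_Details', 'Document_Details_old';
EXEC sp_rename 'dbo.Document_Details_new', 'Document_Details';
-- after verifying, drop dbo.Document_Details_old and shrink or remove the old filegroup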

SQL Server 2008, Compression / Zip and fast query

I have an application that produces approximately 15,000 rows in a table named ExampleLog for each task. Each task has a taskID that is saved in a table named TaskTable, so it's possible to retrieve data from the ExampleLog table to run some queries.
The problem is that the ExampleLog table is getting very big, since I run at least one task every day. At the moment my ExampleLog table is over 60 GB.
I would like to take the 15,000 rows that belong to a TaskID, compress or zip them, and then save the compressed data somewhere inside the database as a blob or as FILESTREAM data. But it is important for me to be able to query the compressed or zipped data easily and process queries against it in an efficient manner. (I don't know if that's possible, or whether I would lose too much in terms of performance.)
PS: The compressed data should not be considered backup data.
Can someone recommend a good approach or technique to solve this problem? My focus is on the speed of the queries running against ExampleLog and the space taken on disk.
I'm using SQL Server 2008 on Windows 7.
Consider Read-Only Filegroups and Compression.
Using NTFS Compression with Read-Only User-defined Filegroups and Read-Only Databases
SQL Server supports NTFS compression of read-only user-defined filegroups and read-only databases. You should consider compressing read-only data in the following situations: you have a large volume of static or historical data that must be available for limited read-only access, or you have limited disk space.
Also, you can try to estimate the gains from page compression applied to the log table using the Data Compression Wizard.
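You can get the same estimate from T-SQL with sp_estimate_data_compression_savings (the schema name is an assumption; NULL for index and partition means all of them):

-- estimate how much space PAGE compression would save on the log table
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'ExampleLog',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';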
Denis's answer could not solve my problem completely; however, I will use it for some optimization inside the DB.
Regarding the problem of storing the data in packages/groups, there are two solutions to my problem:
The first solution is to use partitioned tables and indexes.
For example, if a current month of data is primarily used for INSERT, UPDATE, DELETE, and MERGE operations while previous months are used primarily for SELECT queries, managing this table may be easier if it is partitioned by month. This benefit can be especially true if regular maintenance operations on the table only have to target a subset of the data. If the table is not partitioned, these operations can consume lots of resources on an entire data set. With partitioning, maintenance operations, such as index rebuilds and defragmentations, can be performed on a single month of write-only data, for example, while the read-only data is still available for online access.
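A minimal sketch of month-based partitioning for such a log table (the names, the date column, and the boundary dates are assumptions; each partition could also be mapped to its own filegroup instead of PRIMARY):

-- partition function: one partition per month (boundaries are examples)
CREATE PARTITION FUNCTION pfExampleLogByMonth (datetime)
AS RANGE RIGHT FOR VALUES ('2013-01-01', '2013-02-01', '2013-03-01');

-- partition scheme: everything stays on PRIMARY here for simplicity
CREATE PARTITION SCHEME psExampleLogByMonth
AS PARTITION pfExampleLogByMonth ALL TO ([PRIMARY]);

-- the log table is then created (or rebuilt) on the partition scheme
CREATE TABLE dbo.ExampleLogPartitioned (
    LogId    bigint         NOT NULL,
    TaskId   int            NOT NULL,
    LoggedAt datetime       NOT NULL,
    Message  nvarchar(4000) NULL
) ON psExampleLogByMonth (LoggedAt);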
The second solution is to take, from the code (C# in my case), the list or dictionary of rows belonging to a task, compress them, and save them inside a FILESTREAM column (SQL Server) on the DB server. The data will later be retrieved by ID; the zip will be decompressed and the data will be ready to use.
We have decided to use the second solution.
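For reference, the second solution roughly corresponds to a table like the following on a FILESTREAM-enabled database (the names are assumptions; the application inserts the zipped task data and later reads it back by TaskId):

-- requires FILESTREAM enabled on the instance and a FILESTREAM filegroup in the database
CREATE TABLE dbo.TaskLogArchive (
    TaskLogId  uniqueidentifier ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    TaskId     int              NOT NULL,
    ZippedRows varbinary(max)   FILESTREAM NULL
);

-- later, fetch the compressed blob for one task and unzip it in the application
SELECT ZippedRows FROM dbo.TaskLogArchive WHERE TaskId = 42;  -- example TaskId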