Migrate from date-sharded table to partitioned and clustered table - google-bigquery

Is there a way to migrate a date-sharded table to a table that is partitioned by ingestion time (i.e. _PARTITIONDATE) and clustered by a column?
I read https://cloud.google.com/bigquery/docs/creating-partitioned-tables#convert-date-sharded-tables and understand how to convert a date-sharded table to an ingestion-time partitioned table. If I edit the table after the conversion so that it is clustered by a column, then only new data uploaded to the table will be clustered, but I want the existing data to be clustered as well.
From https://cloud.google.com/bigquery/docs/creating-clustered-tables I learned that the only way to create a clustered table from an existing table is from a query result (CREATE TABLE ... AS SELECT), but I cannot partition by ingestion time when using the AS query_statement clause.
Is there a way to solve this and migrate my data so that it is partitioned by ingestion time and clustered by a specified column?
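One possible direction, offered only as a sketch and not as a confirmed answer: create the target table with an explicit schema (so no AS query_statement is needed), declare PARTITION BY _PARTITIONDATE and CLUSTER BY on it, and then copy the shards with an INSERT that sets the _PARTITIONTIME pseudo-column from _TABLE_SUFFIX. The helper below only builds the two statements as strings. The dataset, table, and column names are hypothetical, and it assumes the shards are named <prefix>YYYYMMDD and that the number of partitions touched stays within the per-job limit.

```python
def build_migration_sql(dataset, shard_prefix, target, columns, cluster_col):
    """Build (ddl, dml) for moving date shards into one clustered table.

    columns is a list of (name, bigquery_type) pairs; every name used
    here is a placeholder chosen for illustration.
    """
    col_defs = ",\n  ".join(f"{name} {typ}" for name, typ in columns)
    col_names = ", ".join(name for name, _ in columns)
    # Explicit schema, so ingestion-time partitioning can be declared.
    ddl = (
        f"CREATE TABLE `{dataset}.{target}` (\n  {col_defs}\n)\n"
        f"PARTITION BY _PARTITIONDATE\n"
        f"CLUSTER BY {cluster_col}"
    )
    # Route each shard's rows to the partition matching its date suffix.
    dml = (
        f"INSERT INTO `{dataset}.{target}` (_PARTITIONTIME, {col_names})\n"
        f"SELECT TIMESTAMP(PARSE_DATE('%Y%m%d', _TABLE_SUFFIX)), {col_names}\n"
        f"FROM `{dataset}.{shard_prefix}*`"
    )
    return ddl, dml
```

Printing the two strings for a dataset such as ("mydataset", "events_", "events", ...) gives statements that can be reviewed before running them. For a large number of shards, it may be necessary to add a WHERE clause on _TABLE_SUFFIX and run the INSERT in several date ranges.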

Related

A table with 80% text data type and 140 columns taking much time in table scan which slow down the query

I have a table with 100 columns, 80% of which are nvarchar(max). There is no way to change this data type because the data comes from MySQL text columns. The table contains almost 30 lakh (3 million) records, so selecting all the columns takes too long to return the record set. I wanted to convert the table to a columnstore table, but columnstore does not support the nvarchar(max) data type, so I am now looking for a way to design this table that will make queries fast.
Note: I have tried nonclustered indexes on different columns, which also had no impact on query speed.
Any help will be appreciated.

Why not just use two tables? If your original table has a primary key, define a new table as:
create table t_text (
    original_id int primary key,
    value nvarchar(max),
    -- assumes the primary key column of original_table is named id
    foreign key (original_id) references original_table(id)
);
You would then join in this table when you want to use the column.
For inserting or updating the table, you can define a view that includes the text value. With a trigger on the view you can direct updates to the correct table.
What you really want is vertical partitioning -- the ability to store columns in separate partitions. This is a method for implementing this manually.
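The two-table layout with a view and a trigger can be tried out end to end. The sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the types and trigger syntax differ from T-SQL. The view name v_full, the trigger name, and the name column are made up for the demonstration.

```python
import sqlite3


def build_demo():
    """Create the narrow table, the text side table, and a view over both."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE original_table (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE t_text (
            original_id INTEGER PRIMARY KEY REFERENCES original_table(id),
            value TEXT
        );
        -- The view presents the two tables as one wide table.
        CREATE VIEW v_full AS
            SELECT o.id, o.name, t.value
            FROM original_table AS o
            LEFT JOIN t_text AS t ON t.original_id = o.id;
        -- Inserts against the view are split across the two tables.
        CREATE TRIGGER v_full_insert INSTEAD OF INSERT ON v_full
        BEGIN
            INSERT INTO original_table (id, name) VALUES (NEW.id, NEW.name);
            INSERT INTO t_text (original_id, value) VALUES (NEW.id, NEW.value);
        END;
        """
    )
    return conn
```

After inserting a row through v_full, a query against original_table alone never touches the large text values, while a query against v_full still returns them. A real setup would need matching triggers for UPDATE and DELETE, one side table (or one column) per large text field, and a check that the join cost is acceptable.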

Efficient way to change the table's filegroup

I have around 300 tables that are located in different partitions, and these tables no longer hold the huge volumes of data they once did. I now run into space issues from time to time, and some valuable space is occupied by the 150 filegroups that were created for these tables. I want to move the tables to a single filegroup instead of 150 and release the space by deleting those filegroups.
FYI: These tables do not hold any data now, but they have many constraints and indexes defined.
Can you please suggest how this can be done efficiently?

To move the table, drop and then re-create its clustered index specifying the new FG. If it does not have a clustered index, create one then drop it.
It is best practice not to keep user data on primary FG. Leave that for system objects, and put your data on other file groups. But a lot of people ignore this...

I found some more information on ways to change the filegroup of an existing table:
1- Define a clustered index on every object using NEW_FG (mentioned in the other answer):
CREATE UNIQUE CLUSTERED INDEX <INDEX_NAME> ON dbo.<TABLE_NAME>(<COLUMN_NAME>) ON [FG_NAME]
2- If we can't define a clustered index, copy the table structure and data to a new table, drop the old table, and rename the new one to the old name, as below. Note that SELECT ... INTO does not copy constraints or indexes, so those have to be re-created afterwards.
Change the database's default FG to NEW_FG so that every table created with SELECT ... INTO lands in the new FG by default:
ALTER DATABASE <DATABASE> MODIFY FILEGROUP [FG_NAME] DEFAULT
IF OBJECT_ID('table1') IS NOT NULL
BEGIN
    SELECT * INTO table1_bkp FROM table1
    DROP TABLE table1
    EXEC sp_rename 'table1_bkp', 'table1'
END
After all the operations, set the database's default FG back to what it was before:
ALTER DATABASE <DATABASE> MODIFY FILEGROUP [PRIMARY] DEFAULT
3- Drop the table if feasible, then create it again on NEW_FG:
DROP TABLE table1
CREATE TABLE [table1] (
    id int,
    name nvarchar(50),
    --------
) ON [NEW_FG]
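With around 300 tables, the statements for option 1 are probably easier to generate than to type. The following is a minimal sketch of such a generator. It assumes each table already has a clustered index whose name and key columns are known, and it uses WITH (DROP_EXISTING = ON) so that the index is rebuilt on the new filegroup in one statement. The table, index, and filegroup names are placeholders.

```python
def move_index_sql(table, index, columns, filegroup, unique=False):
    """Return a T-SQL statement that rebuilds a clustered index on filegroup.

    Rebuilding the clustered index on another filegroup moves the table's
    data there. All identifiers passed in are illustrative placeholders.
    """
    kind = "UNIQUE CLUSTERED" if unique else "CLUSTERED"
    cols = ", ".join(f"[{c}]" for c in columns)
    return (
        f"CREATE {kind} INDEX [{index}] ON dbo.[{table}] ({cols}) "
        f"WITH (DROP_EXISTING = ON) ON [{filegroup}];"
    )
```

Calling it in a loop over a list of (table, index, columns) tuples gives a script that can be reviewed before it is run. Nonclustered indexes stay on their original filegroups and would need the same treatment.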

SQL Statement alter index and add partition

I have an index from which I have to remove one column and then rebuild it:
ALTER INDEX <index_name> REBUILD;
Is it possible to add partitioning when I rebuild an index? The partitioning would be based on one of the index columns, which is a datetime field. Something like:
ALTER INDEX <index_name> REBUILD, PARTITION BY RANGE(COLLECTIONTIME) INTERVAL (INTERVAL '15' MINUTE)
(PARTITION INITIAL_PARTITION VALUES LESS THAN (DATE '2014-10-10') );
I am not sure how to write the SQL statement for it. Can anyone help?
Also, if it is possible, will the existing records also be partitioned?
Edit: Database is Oracle

If you want to remove a column from an index on a partitioned table, I assume the table is huge, which means the index rebuild takes time. You can either do an online rebuild or create a new index and drop the old one afterwards.
If your index includes the partition key column, you can create a LOCAL, PREFIXED index on the table.
LOCAL means that each index partition holds only records from one table partition, so the table and the index have the same partition structure.
PREFIXED means that the index leads with the partition key, which supports partition pruning when the index is read.
The following command creates such an index. The partitions are stored in the default tablespace of the user, and the partition structure is the same as the table's.
create index <index_name> on <table_name> (<columns>) LOCAL;
If you want the partitions to be stored in different tablespaces, you can use this command:
create index <index_name> on <table_name> (<columns>) LOCAL
(partition <index_name>_p1 tablespace whatever_1,
 partition <index_name>_p2 tablespace whatever_2
);

Partition scheme change with clustered Index

I have a table with 600 million records that is partitioned on the TRPDate column using the partition scheme PS_TRPdate(TRPDate). I want to change it to another partition scheme, PS_LPDate(LPDate).
I have tried the following steps with a small amount of data:
1) Drop the primary key constraint.
2) Add a new primary key clustered index on the new partition scheme PS_LPDate(LPDate).
Is this feasible with 600 million records? Can anyone guide me?
And how does it work with non-partitioned tables?

My gut feeling is that you should create a parallel table using a new primary key, filegroups and files.
To test my assumption, I looked at an old blog post in which I stored the first five million prime numbers in three files / filegroups.
I used the T-SQL view that Kalen Delaney wrote, modified to my standards, to look at the partition information.
As you can see, we have three partitions based on the primary key.
Next, I drop the primary key on the my_value column, create a new column named chg_value, update it to the prime number, and then try to create a new primary key.
-- drop primary key (pk)
alter table tbl_primes drop constraint [PK_TBL_PRIMES]
-- add new field for new pk
alter table tbl_primes add chg_value bigint not null default (0)
-- update new field
update tbl_primes set chg_value = my_value
-- try to add a new primary key
alter table tbl_primes add constraint [PK_TBL_PRIMES] primary key (chg_value)
First, I was surprised that the partitions stayed together after dropping the PK. However, the view shows the index no longer exists.
Second, I ended up receiving the following error during constraint creation.
While you could merge/switch the partitions into one filegroup that is not part of the scheme, drop/create the primary key, partition function and partition scheme, and then move the data yet again with the appropriate merge/switch statements, I would not.
This would generate a ton of work (T-SQL) and cause a lot of I/O on the disks.
I suggest you build a parallel partitioned table, if you have space, with the new primary key. Reload the data from the old table into the new one.
If you are not using data compression and have the Enterprise edition of SQL Server, why not save the bytes by turning it on?
Good luck!
John
www.craftydba.com
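The parallel-table suggestion above (build the new table, reload in batches, then swap names) can be sketched roughly as follows. This is not the answerer's code. It uses Python's built-in sqlite3 module as a stand-in, so there are no partition schemes or filegroups in it, and the rename would be sp_rename on SQL Server. The table and column names follow the prime-number example, the _new and _old suffixes are invented, and the keyset loop assumes my_value is unique and non-negative.

```python
import sqlite3


def reload_into_parallel_table(conn, batch_size=1000):
    """Copy tbl_primes into a new table keyed on chg_value, then swap names."""
    conn.execute(
        "CREATE TABLE tbl_primes_new ("
        "chg_value INTEGER PRIMARY KEY, my_value INTEGER NOT NULL)"
    )
    last_seen = -1  # assumes all my_value entries are >= 0
    while True:
        # Keyset pagination: fetch the next batch after the last copied key.
        rows = conn.execute(
            "SELECT my_value FROM tbl_primes WHERE my_value > ? "
            "ORDER BY my_value LIMIT ?",
            (last_seen, batch_size),
        ).fetchall()
        if not rows:
            break
        conn.executemany(
            "INSERT INTO tbl_primes_new (chg_value, my_value) VALUES (?, ?)",
            [(value, value) for (value,) in rows],
        )
        conn.commit()  # one short transaction per batch
        last_seen = rows[-1][0]
    # Keep the old table under a new name until the copy has been verified.
    conn.execute("ALTER TABLE tbl_primes RENAME TO tbl_primes_old")
    conn.execute("ALTER TABLE tbl_primes_new RENAME TO tbl_primes")
    conn.commit()
```

Committing per batch keeps each transaction small, which is the main reason to batch a reload of hundreds of millions of rows. Rows written to the old table during the copy are not handled here.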

How do I partition efficiently in SQL Server based on foreign keys?

I am working on SQL Server and want to create a partition on a table. I want to base it off of a foreign key which is in another table.
table1 (
fk uniqueidentifier,
data
)
fk points to table2
table 2 (
partition element here
)
I want to partition table1 based on table2's data, i.e. if table2 contains categories

The foreign key relationship doesn't really matter; horizontal partitioning is based on the values in the table itself. The foreign key just makes sure those values already exist in another table.
Links:
SQL SERVER – 2005 – Database Table Partitioning Tutorial – How to Horizontal Partition Database Table
Partitioning a SQL Server Database Table
Steps for Creating Partitioned Tables
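To illustrate the point that only the row's own column value decides its partition, here is a small model of how a range partition function maps a value to a partition number. It is a sketch of the RANGE LEFT / RANGE RIGHT rule, not SQL Server code. The function name and the boundary values are invented, and the referenced table is never consulted.

```python
from bisect import bisect_left, bisect_right


def partition_number(value, boundaries, range_right=False):
    """Return the 1-based partition for value, given sorted boundaries.

    RANGE LEFT: a boundary value belongs to the partition on its left.
    RANGE RIGHT: a boundary value belongs to the partition on its right.
    """
    if range_right:
        return bisect_right(boundaries, value) + 1
    return bisect_left(boundaries, value) + 1
```

With boundaries [10, 20] there are three partitions. Under RANGE LEFT the value 10 falls into partition 1, and under RANGE RIGHT it falls into partition 2. In table1, the value passed in would be the fk column itself.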
