Redis: moving data from one index to another

What is the way to migrate the data from DB 0 to DB 1 in the same instance of Redis?
I have found many resources about migrating from one instance to another (SAVE or replication), but not from one index to another; maybe I'm not using the right search keywords…

The COPY and MOVE commands let you copy or move keys from one DB to another.
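For example, assuming the key is called mykey and you are connected to DB 0 (the default in redis-cli):

MOVE mykey 1
COPY mykey mykey DB 1

MOVE transfers the key from the currently selected DB into DB 1, while COPY (Redis 6.2 and later) leaves the original in place and writes a copy into DB 1. To migrate everything, you would iterate over the keyspace with SCAN and issue one of these commands per key.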

Related

Azure Data Factory - delete data from a MongoDb (Atlas) Collection

I'm trying to use Azure Data Factory (V2) to copy data to a MongoDb database on Atlas, using the MongoDB Atlas connector but I have an issue.
I want to do an Upsert but the data I want to copy has no primary key, and as the documentation says:
Note: Data Factory automatically generates an _id for a document if an _id isn't specified either in the original document or by column mapping. This means that you must ensure that, for upsert to work as expected, your document has an ID.
This means the first load works fine, but then subsequent loads just insert more data rather than replacing current records.
I also can't find anything native to Data Factory that would allow me to do a delete on the target collection before running the Copy step.
My fallback will be to create a small Function to delete the data in the target collection before inserting fresh, as below. A full wipe and replace. But before doing that I wondered if anyone had tried something similar before and could suggest something within Data Factory that I have missed that would meet my needs.
As per the documentation, you cannot delete multiple documents at once from MongoDB Atlas. As an alternative, it is recommended to use the embedded MongoDB Shell, where the db.collection.deleteMany() method can delete multiple documents in a single operation. To delete all documents from a collection, pass an empty filter document {} to db.collection.deleteMany().
E.g.: db.movies.deleteMany({})
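If you do fall back to your own pre-copy cleanup step, a minimal sketch run through mongosh could look like this (the connection string, database and collection names are placeholders, not anything from your setup):

mongosh "mongodb+srv://cluster0.example.mongodb.net/targetDb" --eval 'db.targetCollection.deleteMany({})'

Running that (from an Azure Function, a DevOps task, or any scheduler) just before the Copy activity gives you the full wipe-and-replace you describe.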

How to remove shards in crate DB?

I am new to crate.io and not very familiar with the term "shard"; I am trying to understand why, when I run my local DB, it creates 4 different shards.
I need to reduce this to one single shard, because it causes problems when I try to export the data from Crate into JSON files (it creates 4 different shards!).
Most users run Crate on multiple servers. To distribute the records of a table between multiple servers, the table needs to be split; one piece of the table is called a shard.
To make sure the data is still available if a server fails, CrateDB by default creates one replica of each shard: a copy of the data that is located on a different server.
While the system doesn't have full copies of all shards, the cluster state is yellow / underreplicated.
CrateDB running on a single node will never be able to create a redundant copy (because there is only one server).
To change the number of replicas you can use the command ALTER TABLE my_table SET (number_of_replicas = ...).
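The shard count itself is defined per table when the table is created, so to get down to a single shard the table has to be (re)created with an explicit CLUSTERED clause and the data copied over. A rough sketch, with the table name and columns as placeholders:

CREATE TABLE my_table_single (
    id INTEGER,
    payload TEXT
) CLUSTERED INTO 1 SHARDS
WITH (number_of_replicas = 0);

INSERT INTO my_table_single (id, payload) (SELECT id, payload FROM my_table);

Setting number_of_replicas = 0 is also what turns a single-node cluster green, since there is no second server to hold a replica.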

Backing up portion of data in SQL

I have a huge schema containing billions of records. I want to purge data older than 13 months from it and keep that data as a backup in such a way that it can be recovered whenever required.
What is the best way to do this in SQL? Can we create a separate copy of this schema and add a delete trigger on all tables, so that when the trigger fires the purged data gets inserted into the new schema?
If we use triggers, will there be only one record per delete statement, or will all affected records be inserted?
Can we somehow use bulk copy?
I would suggest this is a perfect use case for the Stretch Database feature in SQL Server 2016.
More info: https://msdn.microsoft.com/en-gb/library/dn935011.aspx
The cold data can be moved to the cloud based on your date criteria without any applications or users being aware of it when querying the database. No backups are required and it is very easy to set up.
There is no need for triggers; you can use a job running every day that will put outdated data into archive tables.
The best way, I guess, is to create a copy of the current schema. In the main part, delete everything older than 13 months; in the archive part, delete everything from the last 13 months.
Then create a stored procedure (or several) that will collect the data, put it into the archive, and delete it from the main table, as sketched below. Put this into a daily job.
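A minimal sketch of such a procedure body, assuming a hypothetical dbo.Events table with an EventDate column and an archive.Events table of identical structure (neither name is from the question):

-- move rows older than 13 months into the archive schema in a single statement
DELETE FROM dbo.Events
OUTPUT DELETED.* INTO archive.Events
WHERE EventDate < DATEADD(MONTH, -13, GETDATE());

For billions of rows you would normally wrap this in a loop using DELETE TOP (50000) so each batch stays small, and note that the OUTPUT ... INTO target table cannot have enabled triggers or take part in foreign keys.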
The cleanest and fastest way to do this (with billions of rows) is to create a partitioned table, probably based on a date column by month. Moving data in a given partition is a metadata operation and is extremely fast (if the partition scheme and function are set up properly). I have managed 300 GB tables using partitioning and it has been very effective. Be careful with the partition function so that dates at each edge are handled correctly.
Some of the other proposed solutions involve deleting millions of rows, which could take a long, long time to execute. Model the different solutions using Profiler and/or Extended Events to see which is the most efficient.
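A hedged sketch of the setup (the names, the date column, and the boundary values are all placeholders; the big table must be created on the partition scheme, and the staging table must have an identical structure on the same filegroup):

-- one partition per month on the date column
CREATE PARTITION FUNCTION pf_EventMonth (date)
AS RANGE RIGHT FOR VALUES ('2016-01-01', '2016-02-01', '2016-03-01');

CREATE PARTITION SCHEME ps_EventMonth
AS PARTITION pf_EventMonth ALL TO ([PRIMARY]);

-- switching a whole month out of the main table is a metadata-only operation
ALTER TABLE dbo.Events SWITCH PARTITION 1 TO dbo.Events_Staging;

The staging table can then be archived or dropped without touching the live table.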
I agree with the above: do not create a trigger. Triggers fire with every insert/update/delete, making them very slow.
You may be best served with a data-archive stored procedure.
Consider using multiple databases: the current database that holds your current data, and one or more archive databases that you move records into from the current database, with some sort of nightly or monthly stored procedure process that moves the data over.
You can use the exact same schema as your production system.
If the data is already in the database, there is no need for a bulk copy. From there you can back up your archive database so it is off the SQL Server, and restore it if needed to make the data available again. This is much faster and more manageable than bulk copy.
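A hedged sketch of that scheduled move, with CurrentDb, ArchiveDb, dbo.Events, EventDate and the backup path as placeholder names:

-- copy the old rows into the archive database, then remove them from the current one
INSERT INTO ArchiveDb.dbo.Events
SELECT * FROM CurrentDb.dbo.Events
WHERE EventDate < DATEADD(MONTH, -13, GETDATE());

DELETE FROM CurrentDb.dbo.Events
WHERE EventDate < DATEADD(MONTH, -13, GETDATE());

-- take the archive off the server; restore it later if the data is needed again
BACKUP DATABASE ArchiveDb TO DISK = N'D:\Backups\ArchiveDb.bak';

Capturing the cutoff date into a variable first (or wrapping the copy and delete in one transaction) avoids deleting rows that were never copied because GETDATE() moved between the two statements.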
According to Microsoft's documentation on Stretch DB (found here - https://learn.microsoft.com/en-us/azure/sql-server-stretch-database/), you can't update or delete rows that have been migrated to cold storage or rows that are eligible for migration.
So while Stretch DB does look like a capable technology for archive, the implementation in SQL 2016 does not appear to support archive and purge.

Copying Vertica Schema or all tables in a schema from one physical cluster to another physical Cluster

I am trying to export and import Vertica schema from one physical cluster to another physical cluster.
My Test instance has one single cluster and my production instance has 3 clusters.
I explored the following options, but they are limited to moving data within one physical Vertica instance:
EXPORT TO VERTICA ..
COPY schema.table FROM VERTICA ...
I would like to know if there is an option to move a Vertica schema from one physical Vertica instance to another with a different cluster configuration.
This is a tricky manipulation, which has many issues:
If you copy over the DDLs, you will lose the current values of sequences, which might mean duplicate primary keys when you insert data.
If columns are set up as AUTO_INCREMENT, you will not be able to insert data into them as it is on the source (you cannot force a value into an auto_increment column, although I believe this might have been fixed in newer releases).
If you copy DDLs between clusters with a different number of nodes, and node names are part of the projection definitions, you will end up with something you do not want.
As you noticed, different networks will prevent the use of CONNECT.
An attempt to help out with this has been made in Python via the pyvertica utility, and especially its vertica_migrate script. You can find the docs at https://pyvertica.readthedocs.org .
This is a tricky job, and I know there are some issues in this script, although it already helped me a lot.
Hope this helped,
You can use either COPY FROM VERTICA or EXPORT TO VERTICA to import/export the data to another Vertica database (regardless of node configuration). The target table must already exist; you can use EXPORT_OBJECTS to export the DDL. Both methods allow data migration from one major release back (running 6.x, you can import from 5.x).
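For the DDL step, something along these lines should capture the CREATE statements from the source so they can be run on the target (the empty first argument makes EXPORT_OBJECTS print the script rather than write it to a file; the object name matches the example below):

SELECT EXPORT_OBJECTS('', 'schema.customer_dimension', false);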
In the example below, I'll use EXPORT TO VERTICA to export data from the source database to the target database.
You must first create a connection to the other database:
CONNECT TO VERTICA VMart USER dbadmin PASSWORD '' ON 'VerticaTarget',5433;
Then use EXPORT TO VERTICA to export the data from the source to the target database:
EXPORT TO VERTICA VMart.schema.customer_dimension FROM schema.customer_dimension;
Here VMart.schema.customer_dimension is the target and schema.customer_dimension is the source.
When the export completes, close the connection:
DISCONNECT VMart;

Fastest way to clear the content out of many tables

Right now we're using TRUNCATE to clear out the contents of 798 tables in postgres (isolated test runs). Where possible we use transactions. However, in places where it's not possible, we'd like the fastest way to reset the state of the DB.
We're working towards only actually calling truncate on the tables that have been modified (for any given test only a few of the 798 tables will be modified).
What is the fastest way to delete all of the data from many PostgreSQL tables?
Two things come to mind:
Set up the clean DB as a template and createdb a copy from it before each test (see the sketch below).
Set up the clean DB as the default schema, but run the TransactionTests in a different schema (SET search_path TO %s).
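A minimal sketch of the first option, assuming the pristine database is called clean_db (it must have no active connections while it is being copied):

-- clone the pristine database for one test run, then throw the copy away afterwards
CREATE DATABASE test_run TEMPLATE clean_db;
-- ... run the tests against test_run ...
DROP DATABASE test_run;

If you stay with TRUNCATE instead, one statement can cover every modified table at once, e.g. TRUNCATE table_a, table_b RESTART IDENTITY CASCADE;, which is generally faster than issuing it per table.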