How to unload data of syscat.tables on my prod db on instance1 - db2-luw

How do I unload the data from syscat.tables on my prod db on instance1 on one host, load that data into my dev db on instance2 on another host, and then compare the list of table names on the dev db so that I can correct my table definitions?
Do we have to use the load and unload commands?
Let me know if I am not clear.

If you're only comparing table names and not much else, you can unload that information to a text file via the EXPORT command.
db2 "EXPORT TO tablenames.csv OF DEL SELECT tabschema, tabname FROM syscat.tables ORDER BY tabschema, tabname"
Another way to capture the output of a SELECT statement is with the -z option of the command line processor (CLP).
db2 -z tablenames.txt "SELECT tabschema, tabname FROM syscat.tables ORDER BY tabschema, tabname"
That said, any serious effort to compare and contrast the table definitions of different DB2 databases should involve the db2look command, which can reverse-engineer just about any DB2 object into a set of DDL statements. I highly recommend it.
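For example, to pull the DDL for everything in one schema into a file (the database and schema names here are just placeholders):
db2look -d PRODDB -e -z MYSCHEMA -o proddb_ddl.sql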
Regardless of how you decide to generate the details about each database, you're likely to have a much easier time comparing them as text files than as rows inside a database. The diff command has been around for over forty years, and newer implementations such as WinMerge and Beyond Compare will make quick work of identifying discrepancies in even the largest sets of text files.
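For instance, once you have exported the table lists from both databases (file names are placeholders):
diff prod_tablenames.txt dev_tablenames.txt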


Incrementally importing data to a PostgreSQL database

Situation:
I have a PostgreSQL database that is logging data from sensors on a field-deployed unit (let's call this the source database). The unit has very limited hard-disk space, meaning that, if left untouched, the data logging will fill up the disk where the database resides within a week. I have a (very limited) network link to the database (which is why I want to compress the dump file), and on the other side of that link I have another PostgreSQL database (let's call it the destination database) that has a lot of free space (for argument's sake, the source is very limited with regard to space and the destination is unlimited).
I need to take incremental backups of the source database, append the rows that have been added since the last backup to the destination database, and then clean the added rows out of the source database.
Now, the source database might or might not have been cleaned since the last backup was taken, so the destination database needs to be able to import only the new rows in an automated (scripted) process, but pg_restore fails miserably when trying to restore from a dump that contains the same primary key values as rows already in the destination database.
So the question is:
What is the best way to restore only the rows from a source that are not already in the destination database?
The only solution I've come up with so far is to pg_dump the database and restore the dump to a new secondary database on the destination side with pg_restore, then use simple SQL to sort out which rows already exist in my main destination database. But it seems like there should be a better way...
(Extra question: am I completely wrong in using PostgreSQL for such an application? I'm open to suggestions for other data-collection alternatives...)
A good way to start would probably be to use the --inserts option of pg_dump. From the documentation (emphasis mine):
Dump data as INSERT commands (rather than COPY). This will make
restoration very slow; it is mainly useful for making dumps that can
be loaded into non-PostgreSQL databases. However, since this option
generates a separate command for each row, an error in reloading a row
causes only that row to be lost rather than the entire table contents.
Note that the restore might fail altogether if you have rearranged
column order. The --column-inserts option is safe against column order
changes, though even slower.
I don't have the means to test it right now with pg_restore, but this might be enough for your case.
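For example, a compressed, data-only dump of a single table could look like this (the database and table names are placeholders):
pg_dump --data-only --column-inserts -t public.topic source_db | gzip > topic_dump.sql.gz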
You could also use the fact that, as of version 9.5, PostgreSQL provides ON CONFLICT DO ... for INSERT statements. Use a simple scripting language to add these to the dump and you should be fine; unfortunately, I haven't found an option for pg_dump to add them automatically.
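A rough sketch of that post-processing, assuming an --inserts-style dump with one INSERT statement per line (rows whose text columns contain embedded newlines would need more careful handling):
gunzip -c topic_dump.sql.gz \
  | sed '/^INSERT INTO/ s/;$/ ON CONFLICT DO NOTHING;/' \
  | psql destination_db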
You might google "sporadically connected database synchronization" to see related solutions.
It's not a neatly solved problem as far as I know - there are some common work-arounds, but I am not aware of a database-centric out-of-the-box solution.
The most common way of dealing with this is to use a message bus to move events between your machines. For instance, if your "source database" is just a data store, with no other logic, you might get rid of it, and use a message bus to say "event x has occurred", and point the endpoint of that message bus at your "destination machine", which then writes that to your database.
You might consider Apache ActiveMQ or read "Patterns of enterprise integration".
#!/bin/sh
PSQL=/opt/postgres-9.5/bin/psql
TARGET_HOST=localhost
TARGET_DB=mystuff
TARGET_SCHEMA_IMPORT=copied
TARGET_SCHEMA_FINAL=final
SOURCE_HOST=192.168.0.101
SOURCE_DB=slurpert
SOURCE_SCHEMA=public
########
# create the staging and final schemas/tables on the target database (run once)
create_local_stuff()
{
${PSQL} -h ${TARGET_HOST} -U postgres ${TARGET_DB} <<OMG0
CREATE SCHEMA IF NOT EXISTS ${TARGET_SCHEMA_IMPORT};
CREATE SCHEMA IF NOT EXISTS ${TARGET_SCHEMA_FINAL};
CREATE TABLE IF NOT EXISTS ${TARGET_SCHEMA_FINAL}.topic
( topic_id INTEGER NOT NULL PRIMARY KEY
, topic_date TIMESTAMP WITH TIME ZONE
, topic_body text
);
CREATE TABLE IF NOT EXISTS ${TARGET_SCHEMA_IMPORT}.tmp_topic
( topic_id INTEGER NOT NULL PRIMARY KEY
, topic_date TIMESTAMP WITH TIME ZONE
, topic_body text
);
OMG0
}
########
# return the highest topic_id currently in the staging table (the watermark left by the previous run)
find_highest()
{
${PSQL} -q -t -h ${TARGET_HOST} -U postgres ${TARGET_DB} <<OMG1
SELECT MAX(topic_id) FROM ${TARGET_SCHEMA_IMPORT}.tmp_topic;
OMG1
}
########
# copy all source rows above the watermark into /tmp/topic.dat on this (the local) machine
fetch_new_data()
{
watermark=${1-0}
echo ${watermark}
${PSQL} -h ${SOURCE_HOST} -U postgres ${SOURCE_DB} <<OMG2
\COPY (SELECT topic_id, topic_date, topic_body FROM ${SOURCE_SCHEMA}.topic WHERE topic_id >${watermark}) TO '/tmp/topic.dat';
OMG2
}
########
# empty the staging table, load the fetched rows, then insert only the ones missing from the final table
insert_new_data()
{
${PSQL} -h ${TARGET_HOST} -U postgres ${TARGET_DB} <<OMG3
DELETE FROM ${TARGET_SCHEMA_IMPORT}.tmp_topic WHERE 1=1;
COPY ${TARGET_SCHEMA_IMPORT}.tmp_topic(topic_id, topic_date, topic_body) FROM '/tmp/topic.dat';
INSERT INTO ${TARGET_SCHEMA_FINAL}.topic(topic_id, topic_date, topic_body)
SELECT topic_id, topic_date, topic_body
FROM ${TARGET_SCHEMA_IMPORT}.tmp_topic src
WHERE NOT EXISTS (
SELECT *
FROM ${TARGET_SCHEMA_FINAL}.topic nx
WHERE nx.topic_id = src.topic_id
);
OMG3
}
########
# clean up transferred rows on the source (currently only counts them; see the notes below)
delete_below_watermark()
{
watermark=${1-0}
echo ${watermark}
${PSQL} -h ${SOURCE_HOST} -U postgres ${SOURCE_DB} <<OMG4
-- delete not yet activated; COUNT(*) instead
-- DELETE
SELECT COUNT(*)
FROM ${SOURCE_SCHEMA}.topic WHERE topic_id <= ${watermark}
;
OMG4
}
######## Main
#create_local_stuff   # uncomment on the first run to create the schemas and tables
watermark="`find_highest`"
echo 'Highest:' ${watermark}
fetch_new_data ${watermark}
insert_new_data
echo 'Delete below:' ${watermark}
delete_below_watermark ${watermark}
# Eof
This is just an example. Some notes:
I assume a non-decreasing serial PK for the table; in most cases it could also be a timestamp
for simplicity, all the queries are run as user postgres; you might need to change this
the watermark method guarantees that only new records are transmitted, minimising bandwidth usage
the method is atomic: if the script crashes, nothing is lost
only one table is fetched here, but you could add more
because I'm paranoid, I use a different name for the staging table and put it into a separate schema
The whole script does two queries on the remote machine (one for the fetch, one for the delete); you could combine these,
but there is only one script involved (executing from the local = target machine).
The DELETE is not yet active; it only does a COUNT(*). See the snippet below for enabling it.
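Once you are confident in the transfer, enabling the cleanup is just a matter of swapping the COUNT(*) for the real DELETE inside delete_below_watermark():
DELETE
FROM ${SOURCE_SCHEMA}.topic WHERE topic_id <= ${watermark}
;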

Proper method to migrate DB2 materialized query tables (MQTs) using db2move and db2look

I'm migrating a database from DB2 10.1 for Windows x86_64 to DB2 10.1 for Linux x86_64 - this is a combination of operating systems and machine types that have incompatible backup file formats, which means I can't just do a backup and restore.
Instead, I'm using db2move to back up the database from Windows and restore it on Linux. However, db2move does not move the materialized query tables (MQTs); for those I need to use db2look. This poses the challenge of finding a generic method to handle the process. Right now, to dump the DDL for the materialized query tables, I have to run the following commands:
db2 connect to MYDATABASE
db2 -x "select cast(tabschema || '.' || tabname as varchar(80)) as tablename from syscat.tables where type='S'"
This returns a list of MQTs such as:
MYSCHEMA.TABLE1
MYSCHEMA.TABLE2
MYOTHERSCHEMA.TABLE3
I can then take all those values and feed them into db2look to generate the DDL for each table, sending the output to mqts.sql.
db2look -d MYDATABASE -e -t MYSCHEMA.TABLE1 MYSCHEMA.TABLE2 MYOTHERSCHEMA.TABLE3 -o mqts.sql
Then I copy the file mqts.sql to the target computer, to which I've previously restored all the non-MQT objects, and run the following command to recreate the MQTs:
db2 -tvf mqts.sql
Is this the standard way to migrate a MQT? There has got to be a simpler way that I'm missing here.
db2move is mainly for migrating data and the things directly tied to that data, such as the DDL of each table. It does not even migrate the relationships between tables, so you have to recreate those from the DDL.
With that in mind, an MQT is essentially just DDL; its contents are derived from the base tables. The tool for dealing with DDL is db2look, and it has many options for extracting exactly what you want.
The process you described is a normal way to extract that DDL. I have seen far more involved processes dealing with DDL and db2move/db2look; yours is comparatively simple. If you want to avoid pasting the table list by hand, you can also script the two steps together.
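A rough sketch of that (the database name is a placeholder, and I haven't tested this exact script):
#!/bin/sh
DB=MYDATABASE
db2 connect to ${DB}
# collect the fully qualified MQT names (type 'S' = materialized query table)
MQTS=`db2 -x "SELECT RTRIM(tabschema) || '.' || RTRIM(tabname) FROM syscat.tables WHERE type = 'S'"`
# hand the whole list to db2look in a single call
db2look -d ${DB} -e -t ${MQTS} -o mqts.sql
db2 connect reset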
Another option is to use Data Studio; however, you cannot script that.
I believe what you are doing is right, because MQTs do not have independent data of their own; they are populated from the base tables. So the process should be to migrate the data into the base tables that the MQT refers to, and then simply create/refresh the MQTs.

Purging an SQL table

I have an SQL table which is used for logging purposes (there are lakhs, i.e. hundreds of thousands, of records in the table). I need to purge the table: take a backup of the data and then clear the table data.
Is there a standard way of doing this that I can automate?
You can do this within SQL Server Management Studio, by:
right-clicking the database > Tasks > Generate Scripts
You can then select the table you wish to script out and also choose to include any associated objects, such as constraints and indexes.
[Screenshot showing the step-by-step Generate Scripts procedure]
See this Stack Overflow question, which will give you more insight: Table-level backup.
As for your automation requirement:
You can use the bcp utility, which copies data between an instance of Microsoft SQL Server and a data file in a user-specified format.
Sample syntax to export:
bcp "select * from [MyDatabase].dbo.Customer" queryout "Customer.bcp" -N -S localhost -T -E
You can automate this with any scheduling mechanism (cron on UNIX, etc.).
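For example, a crontab entry along these lines would run a wrapper script around that bcp command nightly (the script and log paths are placeholders):
30 1 * * * /usr/local/scripts/export_customer.sh >> /var/log/export_customer.log 2>&1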
Simply, we can create a job that runs once a month that:
--> backs the data up into another table (an archive table)
--> then deletes the data from the main table
It's primitive partitioning, I guess; this way it is more flexible when you need to select data that was deleted from the main table, i.e. it now lives in the archive table where you backed it up.
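A minimal sketch of such a job, assuming a SQL Server log table dbo.AppLog with a datetime column log_date and a matching archive table dbo.AppLog_Archive (all of these names are placeholders), run from a shell via sqlcmd:
sqlcmd -S localhost -d MyDatabase -E -Q "
BEGIN TRANSACTION;
INSERT INTO dbo.AppLog_Archive
SELECT * FROM dbo.AppLog WHERE log_date < DATEADD(month, -1, GETDATE());
DELETE FROM dbo.AppLog WHERE log_date < DATEADD(month, -1, GETDATE());
COMMIT TRANSACTION;
"
Schedule it with cron or the Windows Task Scheduler, or put the same T-SQL in a SQL Server Agent job.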

Fastest way to copy table content from one server to another

I'm looking for the fastest way to copy some tables from one Sybase server (ASE 12.5) to another. Currently I'm using the bcp tool, but it takes time to create a proper bcp.fmt file.
The tables have the same structure. There are about 25K rows in every table, and I have to copy about 40 tables.
I tried to use the -c parameter for bcp, but I get errors while importing:
CSLIB Message: - L0/O0/S0/N24/1/0:
cs_convert: cslib user api layer: common library error: The conversion/operation
was stopped due to a syntax error in the source field.
My standard bcp in/out commands:
bcp.exe SPEPL..VSoftSent out VSoftSent.csv -U%user% -P%pass% -S%srv% -c
bcp.exe SPEPL..VSoftSent in VSoftSent.csv -U%user2% -P%pass2% -S%srv2% -e import.err -c
Since you are copying between different servers, BCP is the way to go!
If it were on the same server, it would be different.
Are you saying it's from one Sybase ASE host to another Sybase ASE host?
If you don't want to mess with BCP or I/O on the file system, you could create a CIS proxy table in your destination database that references either a stored procedure with a select statement or a physical table in your source database.
Then you could just
insert into destinationtable (col1, col2...)
select
col1, col2...
from proxytablename
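Setting up the proxy table might look roughly like this; a sketch only, where the server, login, and table names are placeholders, and SRCSRV is assumed to already be an entry in the destination server's interfaces file:
isql -U sa -P yourpassword -S DESTSRV <<'EOSQL'
-- define the remote server for CIS (class ASEnterprise)
exec sp_addserver SRCSRV, ASEnterprise
go
-- map a local login to a login on the remote server
exec sp_addexternlogin SRCSRV, locallogin, remotelogin, remotepassword
go
-- local proxy table pointing at the remote table
create proxy_table VSoftSent_proxy
at "SRCSRV.SPEPL.dbo.VSoftSent"
go
EOSQL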
CIS proxy is fairly resource intensive, so I'd be very careful about how much work you're doing here.

shell script to truncate all MySql tables

I'm looking for a Unix shell script that will truncate all tables in a schema. A similar question was already asked, but I have some additional requirements which make none of the provided answers satisfactory:
Must be a Unix shell script (i.e. no python, perl, PHP)
The script must truncate the tables in an order which respects foreign key constraints
I'd prefer not to have to use a stored proc
Thanks in advance,
Don
How about something cheeky like this:
mysqldump --no-data mydb | mysql mydb
Gets a dump of the schema and replays it into the database!
Alternatively, check out mk-find in Maatkit, you should be able to do something like this:
mk-find -exec "truncate %s"
Description of mk-find:
This tool is the MySQL counterpart to the UNIX ‘find’ command. It accepts tests (such as “find all tables larger than 1GB”) and performs actions, such as executing SQL (“DROP TABLE %s”). With this tool at your disposal you can automate many tedious tasks, such as measuring the size of your tables and indexes and saving the data for historical trending, dropping old scratch tables, and much more. It is especially useful in periodic scheduled tasks such as cron jobs.
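If mk-find isn't available, a plain shell sketch can generate the TRUNCATE statements itself. Note that instead of ordering the truncates to respect the foreign keys, this simply disables foreign key checks for the one session that runs them (the schema name and credentials are placeholders):
#!/bin/sh
DB=mydb
MYSQL="mysql -u root -psecret"
{
  echo "SET FOREIGN_KEY_CHECKS = 0;"
  # one TRUNCATE statement per base table in the schema
  ${MYSQL} -N -B -e "SELECT CONCAT('TRUNCATE TABLE \`', table_name, '\`;') FROM information_schema.tables WHERE table_schema = '${DB}' AND table_type = 'BASE TABLE'"
  echo "SET FOREIGN_KEY_CHECKS = 1;"
} | ${MYSQL} ${DB}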