Database trouble in Django: can't reset because of dependencies

I'm trying to reset a database in Django, using:
python manage.py reset app
but get the following error:
Error: Error: app couldn't be reset. Possible reasons:
* The database isn't running or isn't configured correctly.
* At least one of the database tables doesn't exist.
* The SQL was invalid.
Hint: Look at the output of 'django-admin.py sqlreset app'. That's the SQL this command wasn't able to run.
The full error: cannot drop table app_record because other objects depend on it
HINT: Use DROP ... CASCADE to drop the dependent objects too.
This is what my models.py file looks like:
class Record(models.Model):
    name = models.CharField(max_length=50, db_index=True)
    year = models.IntegerField(blank=True, null=True)

    def __unicode__(self):
        return self.name

class Class(models.Model):
    record = models.ForeignKey(Record)

    def __unicode__(self):
        return self.id
I get that I need to use the DROP ... CASCADE command in the SQL that drops and recreates the tables (the output of django-admin.py sqlreset app).
But how can I edit that SQL directly from models.py?
UPDATE
OK, I figured out how to delete tables manually (the database is postgres), have noted it here for anyone with the same problem:
python manage.py dbshell
# drop table app_record cascade;
# \q
python manage.py reset app
Would still like to know if anyone has a better way to do it, though :)

The easy way to fully reset a Django database is to use django-extensions.
It has a reset_db command that supports all of Django's default database backends.
python manage.py reset_db
If you're using Django 1.2+ you have to explicitly define which database you want to reset. If your project only uses one database, you should probably pass --router=default
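For example, assuming django-extensions is installed and listed in INSTALLED_APPS, and that your only database is the default one:
python manage.py reset_db --router=default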

I use a little unix pipeline that adds CASCADE to all the DROP statements.
python manage.py sqlreset myapp | sed 's/DROP TABLE \(.*\);/DROP TABLE \1 CASCADE;/g' | \
psql --username myusername mydbname

The problem with DROP TABLE ... CASCADE is that it also removes the foreign keys on related tables, and after syncdb those foreign keys are not recreated.
I found no way to recreate just that model's tables, so I reset the whole application by recreating the schema:
DROP SCHEMA public CASCADE;
CREATE SCHEMA "public" AUTHORIZATION "owner of database";
This only works with databases that support schemas, e.g. PostgreSQL.
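A minimal sketch of running that from a Django project, assuming a PostgreSQL backend; the role name myowner is just a placeholder for whoever owns the database:
python manage.py dbshell
# DROP SCHEMA public CASCADE;
# CREATE SCHEMA public AUTHORIZATION myowner;
# \q
python manage.py syncdb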

Using the details in other answers, I made a bash function that I dropped into ~/.bash_profile (on Mac OS X).
django_reset () { python mainsite/manage.py sqlreset "$*" | sed 's/DROP TABLE \(.*\);/DROP TABLE \1 CASCADE;/g' | python mainsite/manage.py dbshell ; }
Then just run this command in the terminal from your root code directory (so the path to mainsite/manage.py makes sense).
django_reset myappA myappB
And it'll execute!

I found another way. I'm using SQLite3, which Django uses by default.
To reset the tables to their default state:
python manage.py flush --database=default
After this you will need to run the syncdb command again.
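Put together, a minimal sketch of the sequence (--noinput just skips the confirmation prompt):
python manage.py flush --database=default --noinput
python manage.py syncdb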


How to rename database using phpMyAdmin tool?

I created a fresh database in phpMyAdmin which does not contain any tables yet, but I accidentally made a typo in its name. How can I rename the database?
If this happens to me I usually just execute the SQL command:
DROP DATABASE dbname;
and create another database. But is it possible to rename it? I was already searching SO but found nothing helpful.
I found two possible solutions.
Rename it via the phpMyAdmin backend UI (preferable): open the database's Operations tab and use "Rename database to".
Or just execute this SQL (only use it if the database is fresh and does not contain any data yet, otherwise it will be lost!)
CREATE DATABASE newname;
DROP DATABASE oldname;
ALTER DATABASE oldName MODIFY NAME = newName
(Note: this last statement is SQL Server syntax and will not work in MySQL/phpMyAdmin.)
I don't think you can do this. I think you'll need to dump that database, create the newly named one and then import the dump.
If this is a live system you'll need to take it down. If you cannot, then you will need to set up replication from this database to the new one.
If you want to see the commands try this link, Rename MySQL database
Try using an auxiliary temporary db (as a copy of the original):
$ mysqldump dbname > dbname_dump.sql     # create a backup
$ mysqladmin create dbname_new           # create your new db with the desired name
$ mysql dbname_new < dbname_dump.sql     # restore the backup into the new one
$ mysqladmin drop dbname                 # drop the old one

Incrementally importing data to a PostgreSQL database

Situation:
I have a PostgreSQL-database that is logging data from sensors in a field-deployed unit (let's call this the source database). The unit has a very limited hard-disk space, meaning that if left untouched, the data-logging will cause the disk where the database is residing to fill up within a week. I have a (very limited) network link to the database (so I want to compress the dump-file), and on the other side of said link I have another PostgreSQL database (let's call that the destination database) that has a lot of free space (let's just, for argument's sake, say that the source is very limited with regard to space, and the destination is unlimited with regard to space).
I need to take incremental backups of the source database, append the rows that have been added since last backup to the destination database, and then clean out the added rows from the source database.
Now the source database might or might not have been cleaned since a backup was last taken, so the destination database needs to be able to import only the new rows in an automated (scripted) process, but pg_restore fails miserably when trying to restore from a dump that contains the same primary key values as rows already in the destination database.
So the question is:
What is the best way to restore only the rows from a source that are not already in the destination database?
The only solution that I've come up with so far is to pg_dump the database and restore the dump to a new secondary-database on the destination-side with pg_restore, then use simple sql to sort out which rows already exist in my main-destination database. But it seems like there should be a better way...
(extra question: Am I completely wrong in using PostgreSQL in such an application? I'm open to suggestions for other data-collection alternatives...)
A good way to start would probably be to use the --inserts option of pg_dump. From the documentation (emphasis mine):
Dump data as INSERT commands (rather than COPY). This will make
restoration very slow; it is mainly useful for making dumps that can
be loaded into non-PostgreSQL databases. However, since this option
generates a separate command for each row, an error in reloading a row
causes only that row to be lost rather than the entire table contents.
Note that the restore might fail altogether if you have rearranged
column order. The --column-inserts option is safe against column order
changes, though even slower.
I don't have the means to test it right now with pg_restore, but this might be enough for your case.
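For example, a compressed, data-only dump of a single sensor table could look something like this (sensor_log and source_db are made-up names; gzip addresses the limited network link mentioned in the question):
pg_dump --data-only --column-inserts -t public.sensor_log -U postgres source_db | gzip > sensor_log.sql.gz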
You could also use the fact that from version 9.5, PostgreSQL provides ON CONFLICT DO ... for INSERT. Use a simple scripting language to add that clause to the dump and you should be fine. Unfortunately, I haven't found an option for pg_dump to add it automatically.
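A rough sketch of that post-processing step, e.g. with sed, assuming the dump was made with --inserts so every data row is a single-line INSERT ending in ");", and that the destination server is 9.5+ (this is fragile if text values contain newlines or themselves end in ");"):
sed 's/);$/) ON CONFLICT DO NOTHING;/' dump.sql > dump_upsert.sql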
You might google "sporadically connected database synchronization" to see related solutions.
It's not a neatly solved problem as far as I know - there are some common work-arounds, but I am not aware of a database-centric out-of-the-box solution.
The most common way of dealing with this is to use a message bus to move events between your machines. For instance, if your "source database" is just a data store, with no other logic, you might get rid of it, and use a message bus to say "event x has occurred", and point the endpoint of that message bus at your "destination machine", which then writes that to your database.
You might consider Apache ActiveMQ, or read "Enterprise Integration Patterns".
#!/bin/sh
PSQL=/opt/postgres-9.5/bin/psql
TARGET_HOST=localhost
TARGET_DB=mystuff
TARGET_SCHEMA_IMPORT=copied
TARGET_SCHEMA_FINAL=final
SOURCE_HOST=192.168.0.101
SOURCE_DB=slurpert
SOURCE_SCHEMA=public
########
create_local_stuff()
{
${PSQL} -h ${TARGET_HOST} -U postgres ${TARGET_DB} <<OMG0
CREATE SCHEMA IF NOT EXISTS ${TARGET_SCHEMA_IMPORT};
CREATE SCHEMA IF NOT EXISTS ${TARGET_SCHEMA_FINAL};
CREATE TABLE IF NOT EXISTS ${TARGET_SCHEMA_FINAL}.topic
( topic_id INTEGER NOT NULL PRIMARY KEY
, topic_date TIMESTAMP WITH TIME ZONE
, topic_body text
);
CREATE TABLE IF NOT EXISTS ${TARGET_SCHEMA_IMPORT}.tmp_topic
( topic_id INTEGER NOT NULL PRIMARY KEY
, topic_date TIMESTAMP WITH TIME ZONE
, topic_body text
);
OMG0
}
########
find_highest()
{
${PSQL} -q -t -h ${TARGET_HOST} -U postgres ${TARGET_DB} <<OMG1
SELECT MAX(topic_id) FROM ${TARGET_SCHEMA_IMPORT}.tmp_topic;
OMG1
}
########
fetch_new_data()
{
watermark=${1-0}
echo ${watermark}
${PSQL} -h ${SOURCE_HOST} -U postgres ${SOURCE_DB} <<OMG2
\COPY (SELECT topic_id, topic_date, topic_body FROM ${SOURCE_SCHEMA}.topic WHERE topic_id >${watermark}) TO '/tmp/topic.dat';
OMG2
}
########
insert_new_data()
{
${PSQL} -h ${TARGET_HOST} -U postgres ${TARGET_DB} <<OMG3
DELETE FROM ${TARGET_SCHEMA_IMPORT}.tmp_topic WHERE 1=1;
COPY ${TARGET_SCHEMA_IMPORT}.tmp_topic(topic_id, topic_date, topic_body) FROM '/tmp/topic.dat';
INSERT INTO ${TARGET_SCHEMA_FINAL}.topic(topic_id, topic_date, topic_body)
SELECT topic_id, topic_date, topic_body
FROM ${TARGET_SCHEMA_IMPORT}.tmp_topic src
WHERE NOT EXISTS (
SELECT *
FROM ${TARGET_SCHEMA_FINAL}.topic nx
WHERE nx.topic_id = src.topic_id
);
OMG3
}
########
delete_below_watermark()
{
watermark=${1-0}
echo ${watermark}
${PSQL} -h ${SOURCE_HOST} -U postgres ${SOURCE_DB} <<OMG4
-- delete not yet activated; COUNT(*) instead
-- DELETE
SELECT COUNT(*)
FROM ${SOURCE_SCHEMA}.topic WHERE topic_id <= ${watermark}
;
OMG4
}
######## Main
#create_local_stuff
watermark="`find_highest`"
echo 'Highest:' ${watermark}
fetch_new_data ${watermark}
insert_new_data
echo 'Delete below:' ${watermark}
delete_below_watermark ${watermark}
# Eof
This is just an example. Some notes:
- I assume a non-decreasing serial PK for the table; in most cases it could also be a timestamp
- for simplicity, all the queries are run as user postgres; you might need to change this
- the watermark method guarantees that only new records are transmitted, minimising bandwidth usage
- the method is atomic: if the script crashes, nothing is lost
- only one table is fetched here, but you could add more
- because I'm paranoid, I use a different name for the staging table and put it into a separate schema
- the whole script runs two queries on the remote machine (one to fetch, one to delete); you could combine these (see the sketch after these notes), but there is only one script involved, executed from the local (= target) machine
- the DELETE is not yet active; it only does a COUNT(*)
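As mentioned in the notes, the two remote queries could be folded into a single psql session. A hedged sketch of what such a combined function might look like, reusing the variables the script already defines and keeping the count-only behaviour (the function name and the OMG5 heredoc tag are made up):
combined_fetch_and_clean()
{
watermark=${1-0}
${PSQL} -h ${SOURCE_HOST} -U postgres ${SOURCE_DB} <<OMG5
\COPY (SELECT topic_id, topic_date, topic_body FROM ${SOURCE_SCHEMA}.topic WHERE topic_id > ${watermark}) TO '/tmp/topic.dat';
-- delete not yet activated; COUNT(*) instead, as in the original
-- DELETE FROM ${SOURCE_SCHEMA}.topic WHERE topic_id <= ${watermark};
SELECT COUNT(*) FROM ${SOURCE_SCHEMA}.topic WHERE topic_id <= ${watermark};
OMG5
}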

Proper way to migrate a postgres database?

I have a dev version and a production version running in django.
I recently started populating it with a lot of data and found that Django's loaddata tries to load everything into memory before adding it to the db, and my files will be too big for that.
What is the proper way to push my data from my dev machine to my production?
I did...
pg_dump -U user -W db > ./filename.sql
and then on the production server I did...
psql dbname < filename.sql
It seems like it worked, all the data is there, but it came up with some errors such as
relation xxx already exists
constraint xxx for relation xxx already exists
and there were quite a few of them, but like I said everything appears to be there. Is this the right way to do it?
Edit: the database on the production machine already contains data, and I don't want to truncate the tables before the import.
This is the script that I use:
pg_dump -d DATABASE_NAME -U postgres --format plain --inserts > /FILE.sql
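On the production side, the resulting plain-SQL file can then be loaded with psql; a minimal sketch, assuming the dump was copied over as /FILE.sql:
psql -U postgres -d DATABASE_NAME -f /FILE.sql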
Edit: since you say in the comments that you don't want to truncate the tables before the import, you can't do this type of import into your production database. I suggest emptying your production database before importing the dev database dump.

Exporting from one schema and importing to another with pg_dump

I have a table called units, which exists in two separate schemas within the same database (we'll call them old_schema, and new_schema). The structure of the table in both schemas are identical. The only difference is that the units table in new_schema is presently empty.
I am attempting to export the data from this table in old_schema and import it into new_schema. I used pg_dump to handle the export, like so:
pg_dump -U username -p 5432 my_database -t old_schema.units -a > units.sql
I then attempted to import it using the following:
psql -U username -p 5432 my_database -f units.sql
Unfortunately, this appeared to try and reinsert back into the old_schema. Looking at the generated sql file, it seems there is a line which I think is causing this:
SET search_path = mysql_migration, pg_catalog;
I can, in fact, alter this line to read
SET search_path = public;
And this does prove successful, but I don't believe this is the "correct" way to accomplish this.
Question: When importing data via a script generated through pg_dump, how can I specify into which schema the data should go without altering the generated file?
There are two main issues here based on the scenario you described:
1. The difference in the schemas, to which you alluded.
2. The fact that by dumping the whole table via pg_dump, you're dumping the table definition also, which will cause issues if the table is already present in the destination schema.
To dump only the data, if the table already exists in the destination database (which appears to be the case based on your scenario above), you can dump the table using pg_dump with the --data-only flag.
Then, to address the schema issue, I would recommend doing a search/replace (sed would be a quick way to do it) on the output sql file, replacing old_schema with new_schema.
That way, it will apply the data (which is all that would be in the file, not the table definition itself) to the table in new_schema.
If you need a solution on a broader level to support, say, dynamically named schemas, you can use the same search/replace trick with sed, but instead of replacing it with new_schema, replace it with some placeholder text, say, $$placeholder_schema$$ (something highly unlikely to appear as a token elsewhere in the file), and then, when you need to apply that file to a particular schema, use the original file as a template, copy it, and then modify the copy using sed or similar, replacing the placeholder token with the desired on-the-fly schema name.
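A rough sed sketch of both variants, assuming the schema name does not also occur inside the data itself (PLACEHOLDER_SCHEMA and customer_42 are made-up names, used here instead of the $$...$$ token to keep the quoting readable):
# direct rename
sed 's/old_schema/new_schema/g' units.sql > units_new_schema.sql
# or build a reusable template, then fill in the schema per target
sed 's/old_schema/PLACEHOLDER_SCHEMA/g' units.sql > units_template.sql
sed 's/PLACEHOLDER_SCHEMA/customer_42/g' units_template.sql > units_customer_42.sql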
You can set some options for psql on the command line, such as --set AUTOCOMMIT=off; however, a similar approach for the search path does not appear to have any effect.
Instead, it needs the SQL form SET search_path TO <path>, which can be passed with the -c option, but not in combination with -f (it's either/or).
Given that, I think modifying the file with sed is probably the best all around option in this case for use with -f.

Create database explicitly before restoring to it?

When I set up my PostgreSQL server, one of the first things I will do is import a database from an external source. Which of the following is the right way to do it?
1. Create a database called "NEWDB" on the PostgreSQL server and then import my external "BACKUPDB" database from my pg_dump into the "NEWDB".
2. Don't create a database on the PostgreSQL server, and import the "NEWDB" database, thereby automatically creating "NEWDB" on the PostgreSQL server.
I guess my question is, if I want to import an existing database to the PostgreSQL server, do I first need to create a database for it to go into?
You don't have to. It depends on what you want to achieve. If you dump a single database with pg_dump, CREATE DATABASE and ALTER DATABASE commands are not included. You are expected to connect to an existing database. So you have to create it first.
I quote advice from the manual:
If your database cluster has any local additions to the template1
database, be careful to restore the output of pg_dump into a truly
empty database; otherwise you are likely to get errors due to
duplicate definitions of the added objects. To make an empty database
without any local additions, copy from template0 not template1, for
example:
CREATE DATABASE foo WITH TEMPLATE template0;
And also:
The dump file also does not contain any ALTER DATABASE ... SET
commands; these settings are dumped by pg_dumpall, along with database
users and other installation-wide settings.
pg_dumpall, on the other hand, dumps the whole DB cluster including meta-objects like users. It includes CREATE DATABASE statements and connects to each DB when restoring. You can even include DROP DATABASE statements with the -c (--clean) option. Careful with that.
Every instance of PostgreSQL has a default maintenance db named "postgres" that you can connect to, for instance to create databases or to start a full restore (from pg_dumpall). But a single-DB dump (from pg_dump) has to be run against its target database.
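A minimal sketch of the single-database path, assuming a plain-SQL dump named backupdb.sql (a custom-format dump would be restored with pg_restore -d newdb instead):
createdb -T template0 newdb
psql newdb < backupdb.sql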
Finally:
Once restored, it is wise to run ANALYZE on each database so the
optimizer has useful statistics. You can also run vacuumdb -a -z to
analyze all databases.