Duplicate entry error when trying to import a dump - sql

I have the following table on a server:
CREATE TABLE routes (
id int(11) unsigned NOT NULL AUTO_INCREMENT,
from_lat double NOT NULL,
from_lng double NOT NULL,
to_lat double NOT NULL,
to_lng double NOT NULL,
distance int(11) unsigned NOT NULL,
drive_time int(11) unsigned NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY route (from_lat,from_lng,to_lat,to_lng)
) ENGINE=InnoDB;
We are saving some routing information from point A (from_lat, from_lng) to point B (to_lat, to_lng). There is a unique index on the coordinates.
However, there are two entries in the database that confuse me:
+----+----------+----------+---------+---------+----------+------------+
| id | from_lat | from_lng | to_lat | to_lng | distance | drive_time |
+----+----------+----------+---------+---------+----------+------------+
| 27 | 52.5333 | 13.1667 | 52.5833 | 13.2833 | 13647 | 1125 |
| 28 | 52.5333 | 13.1667 | 52.5833 | 13.2833 | 13647 | 1125 |
+----+----------+----------+---------+---------+----------+------------+
They are exactly the same.
When I now export the database using mysqldump and try to reimport it, I get an error:
ERROR 1062 (23000): Duplicate entry '52.5333-13.1667-52.5833-13.2833' for key 'route'
How can it be that these rows are in the database when there is a unique key on those columns? Shouldn't MySQL have rejected them?

Is it possible that the double values are slightly different, but only beyond the 4th decimal place?
When you export and reimport them, the dump rounds them to the same value, and that produces the unique constraint violation.
Quoting from this MySQL bug report:
When mysqldump dumps a DOUBLE value, it uses insufficient precision to
distinguish between some close values (and, presumably, insufficient
precision to recreate the exact values from the original database). If
the DOUBLE value is a primary key or part of a unique index, restoring
the database from this output fails with a duplicate key error.
Try displaying them with more digits after the decimal point (how to do this depends on your client).
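For example, in the mysql command-line client you can cast the doubles to a wide DECIMAL to reveal digits that the default display rounds away (a quick sketch against the routes table from the question; the precision 22,18 is an arbitrary choice):

SELECT id,
       CAST(from_lat AS DECIMAL(22,18)) AS from_lat_exact,
       CAST(from_lng AS DECIMAL(22,18)) AS from_lng_exact,
       CAST(to_lat   AS DECIMAL(22,18)) AS to_lat_exact,
       CAST(to_lng   AS DECIMAL(22,18)) AS to_lng_exact
FROM routes
WHERE id IN (27, 28);

If rows 27 and 28 really differ only past the displayed precision, they will show slightly different decimals here.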

Related

How to declare "nextval('testing_thing_thing_id_seq'::regclass)" as default value for column "thing_id" in postgres table "testing_thing"?

In my postgres db there is a table called testing_thing, which I can see (by running \d testing_thing at my psql prompt) is defined as
Table "public.testing_thing"
Column | Type | Collation | Nullable | Default
--------------+-------------------+-----------+----------+-----------------------------------------------------
thing_id | integer | | not null | nextval('testing_thing_thing_id_seq'::regclass)
thing_num | smallint | | not null | 0
thing_desc | character varying | | not null |
Indexes:
"testing_thing_pk" PRIMARY KEY, btree (thing_num)
I want to drop it and re-create it exactly as it is, but I don't know how to reproduce the
nextval('testing_thing_thing_id_seq'::regclass)
part for column thing_id.
This is the query I put together to create the table:
CREATE TABLE testing_thing(
thing_id integer NOT NULL, --what else should I put here?
thing_num smallint NOT NULL PRIMARY KEY DEFAULT 0,
thing_desc varchar(100) NOT NULL
);
What is it missing?
Add a DEFAULT to the column you want to increment and call nextval():
CREATE SEQUENCE testing_thing_thing_id_seq START WITH 1;
CREATE TABLE testing_thing(
thing_id integer NOT NULL DEFAULT nextval('testing_thing_thing_id_seq'),
thing_num smallint NOT NULL PRIMARY KEY DEFAULT 0,
thing_desc varchar(100) NOT NULL
);
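Optionally, you can tie the sequence's lifecycle to the column, which is what the SERIAL shorthand does behind the scenes; dropping the table (or the column) then drops the sequence with it:

ALTER SEQUENCE testing_thing_thing_id_seq OWNED BY testing_thing.thing_id;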
Side note: Keep in mind that attaching a sequence to a column does not prevent users from manually filling it with arbitrary data, which can create really nasty problems with primary keys. If you want to prevent that and do not necessarily need a sequence, consider creating an identity column, e.g.
CREATE TABLE testing_thing(
thing_id integer NOT NULL GENERATED ALWAYS AS IDENTITY,
thing_num smallint NOT NULL PRIMARY KEY DEFAULT 0,
thing_desc varchar(100) NOT NULL
);
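A quick sketch of how the GENERATED ALWAYS identity behaves on insert (error message paraphrased):

INSERT INTO testing_thing (thing_num, thing_desc)
VALUES (1, 'first');                  -- OK: thing_id is assigned automatically

INSERT INTO testing_thing (thing_id, thing_num, thing_desc)
VALUES (99, 2, 'second');             -- ERROR: cannot insert into an identity column

INSERT INTO testing_thing (thing_id, thing_num, thing_desc)
OVERRIDING SYSTEM VALUE
VALUES (99, 2, 'second');             -- OK: explicit, deliberate override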

Insert multiple values with foreign key Postgresql

I am having trouble figuring out how to insert multiple values into a table while checking that another table holds the needed values. I am currently doing this on a PostgreSQL server, but will be implementing it with PreparedStatements in my Java program.
user_id is a foreign key which references the primary key in mock2. I have been trying to check whether mock2 has the values ('foo1', 'bar1') and ('foo2', 'bar2').
After this I am trying to insert new values into mock1, which would hold a date and an integer value and reference the primary key of the matching row in mock2 through the foreign key in mock1.
mock1 table looks like this:
===============================
| date | time    | user_id |
| date | integer | integer |
And the table mock2 is:
==================================
| Id      | name | program |
| integer | text | text    |
Id is a primary key for the table and the name is UNIQUE.
I've been playing around with this solution https://dba.stackexchange.com/questions/46410/how-do-i-insert-a-row-which-contains-a-foreign-key
However, I haven't been able to make it work. Could someone please point out the correct syntax for this? I would be really appreciative.
EDIT:
The create table statements are:
CREATE TABLE mock2(
id SERIAL PRIMARY KEY UNIQUE,
name text NOT NULL,
program text NOT NULL UNIQUE
);
and
CREATE TABLE mock1(
date date,
time_spent INTEGER,
user_id integer REFERENCES mock2(Id) NOT NULL);
Ok so I found an answer to my own question.
WITH ins (date, time_spent, id) AS (
  VALUES ('22/08/2012', 170, (SELECT id FROM mock3 WHERE program = 'bar'))
)
INSERT INTO mock4 (date, time_spent, user_id)
SELECT ins.date, ins.time_spent, mock3.id
FROM mock3
JOIN ins ON ins.id = mock3.id;
I was trying to take the two values from the first table, match them, and then insert two new values into the next table, but I realised that I should be using the primary and foreign keys to my advantage.
I instead now JOIN on the ID and then just select the key I need by searching it from the values with (SELECT id FROM mock3 WHERE program ='bar') in the third row.
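For the mock1/mock2 names from the question, the same idea can be written without the CTE as a single INSERT ... SELECT, which also maps cleanly onto a JDBC PreparedStatement (the date and values here are only illustrative):

INSERT INTO mock1 (date, time_spent, user_id)
SELECT '2012-08-22'::date, 170, id
FROM mock2
WHERE program = 'bar';

In Java, the three literals become ? placeholders; if no row in mock2 matches, nothing is inserted instead of failing with a foreign key error.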

Conditional rule on Insert in Postgresql

My question relates to a PostgreSQL problem I encountered.
I want to modify a database containing a newsgroup-like forum. Its messages describe a tree and are stored in a table as below:
\d messages
Table "public.messages"
Column | Type | Modifiers
-----------+-----------------------------+---------------------------------------------------------------
idmessage | integer | not null default nextval('messages_idmessage_seq'::regclass)
title | character varying(50) | not null
datemsg | timestamp without time zone | default ('now'::text)::timestamp without time zone
author | character varying(10) | default "current_user"()
text | text | not null
idgroup | integer | not null
msgfather | integer |
(I translated the output from French; I hope I didn't introduce any misinterpretation.)
Title is the message title, datemsg its timestamp; author and text are self-explanatory; idgroup is the identifier of the discussion group the message belongs to, and msgfather is the message this message answers.
The rule I want to implement on insert is to prevent the user from adding an answer whose group differs from its father's (if it has one, as a new discussion starts with a message posted without a parent).
Now I have this view:
\d newmessage
View "public.newmessage"
Column | Type | Modifiers
-----------+-----------------------+---------------
idmessage | integer |
title | character varying(50) |
text | text |
idgroup | integer |
msgfather | integer |
View definition:
SELECT messages.idmessage, messages.title, messages.text, messages.idgroup, messages.msgfather
FROM messages;
Rules:
ins_message AS
ON INSERT TO newmessage DO INSTEAD INSERT INTO messages (title, text, idgroup, msgfather)
VALUES (new.title, new.text, new.idgroup, new.msgfather)
I can't change view or table as people already use the database, so I think I have to play with rules or triggers.
I tried adding the rule below to the view, without the expected effect:
CREATE OR REPLACE RULE ins_message_answer AS
ON INSERT TO newmessage
WHERE NEW.msgfather IS NOT NULL
DO INSTEAD INSERT INTO messages (title, text, idgroup, msgfather)
VALUES (NEW.title, NEW.text,
        (SELECT idgroup FROM messages WHERE idmessage = NEW.msgfather),
        NEW.msgfather);
Even with this new rule, people can still answer a message and put the answer in another group.
I also tried changing the ins_message rule by adding WHERE NEW.msgfather IS NULL, but then I get an error saying no ON INSERT rule is found for the newmessage view when I try to add a message.
So how can I achieve what I want? With a trigger? (I don't know how to write one in PostgreSQL.)
First create a unique constraint on (idmessage, idgroup):
alter table messages add constraint messages_idmessage_idgroup_key
unique (idmessage, idgroup);
This constraint is needed for the msgfather_idgroup_fkey below, as foreign key constraints require a matching unique index; without it you'll get ERROR: there is no unique constraint matching given keys for referenced table "messages".
Then add a self-referencing foreign key:
alter table messages add constraint msgfather_idgroup_fkey
foreign key (msgfather,idgroup) references messages(idmessage,idgroup);
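With both constraints in place, a reply that claims a different group from its father is rejected by the foreign key itself (a sketch; the ids are illustrative):

-- suppose message 1 lives in group 1; this reply claims group 2:
INSERT INTO messages (title, text, idgroup, msgfather)
VALUES ('Re: hello', 'wrong group', 2, 1);
-- ERROR: insert or update on table "messages" violates foreign key
-- constraint "msgfather_idgroup_fkey"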
In passing:
add NOT NULL to datemsg and author columns;
never use rules.

MySQL: Which indexes to use for a simple range select?

I have a table with ~30 million rows (and growing!) and currently I have some problems with a simple range select.
The query looks like this:
SELECT SUM( CEIL( dlvSize / 100 ) ) as numItems
FROM log
WHERE timeLogged BETWEEN 1000000 AND 2000000
AND user = 'example'
It takes minutes to finish and I think the problem lies in the indexes I'm using. Here is the result of EXPLAIN:
+----+-------------+-------+-------+---------------------------------+---------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------------------------+---------+---------+------+----------+-------------+
| 1 | SIMPLE | log | range | PRIMARY,timeLogged | PRIMARY | 4 | NULL | 11839754 | Using where |
+----+-------------+-------+-------+---------------------------------+---------+---------+------+----------+-------------+
My table structure is this one (reduced to fit the problem better):
CREATE TABLE IF NOT EXISTS `log` (
`origDomain` varchar(64) NOT NULL default '0',
`timeLogged` int(11) NOT NULL default '0',
`orig` varchar(128) NOT NULL default '',
`rcpt` varchar(128) NOT NULL default '',
`dlvSize` varchar(255) default NULL,
`user` varchar(255) default NULL,
PRIMARY KEY (`timeLogged`,`orig`,`rcpt`),
KEY `timeLogged` (`timeLogged`),
KEY `orig` (`orig`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Any ideas of what I can do to optimize this query or the indexes on my table?
You may want to try adding a composite index on (user, timeLogged):
CREATE TABLE IF NOT EXISTS `log` (
...
KEY `user_timeLogged` (user, timeLogged),
...
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
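Since the table already exists (and with ~30 million rows the index build will take a while), the same index can be added in place; this is just the DDL form of the suggestion above:

ALTER TABLE `log` ADD KEY `user_timeLogged` (`user`, `timeLogged`);

With this index MySQL can resolve both the user equality and the timeLogged range from one index instead of scanning ~12 million rows of the primary key.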
Related Stack Overflow post:
Database: When should I use a composite index?
In addition to the suggestions made by the other answers, I note that you have a column user in the table which is a varchar(255). If this refers to a column in a table of users, then 1) it would most likely be far more efficient to add an integer ID column to that table, and use that as the primary key and as the referencing column in other tables; 2) you are using InnoDB, so why not take advantage of the foreign key capabilities it offers?
Consider that if you index by a varchar(n) column, it is treated like a char(n) in the index, so each row of your current primary key takes up 4 + 128 + 128 = 260 bytes in the index.
Add an index on user.

Is InnoDB sorting really THAT slow?

I had all my tables in MyISAM but the table-level locking was starting to kill me when I had long-running update jobs. I converted my primary tables over to InnoDB and now many of my queries are taking over a minute to complete where they were nearly instantaneous on MyISAM. They are usually stuck in the Sorting result step. Did I do something wrong?
For example:
SELECT * FROM `metaward_achiever`
INNER JOIN `metaward_alias` ON (`metaward_achiever`.`alias_id` = `metaward_alias`.`id`)
WHERE `metaward_achiever`.`award_id` = 1507
ORDER BY `metaward_achiever`.`modified` DESC
LIMIT 100
Takes about 90 seconds now. Here is the EXPLAIN output:
+----+-------------+-------------------+--------+-------------------------------------------------------+----------------------------+---------+---------------------------------+-------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------------+--------+-------------------------------------------------------+----------------------------+---------+---------------------------------+-------+-----------------------------+
| 1 | SIMPLE | metaward_achiever | ref | metaward_achiever_award_id,metaward_achiever_alias_id | metaward_achiever_award_id | 4 | const | 66424 | Using where; Using filesort |
| 1 | SIMPLE | metaward_alias | eq_ref | PRIMARY | PRIMARY | 4 | paul.metaward_achiever.alias_id | 1 | |
+----+-------------+-------------------+--------+-------------------------------------------------------+----------------------------+---------+---------------------------------+-------+-----------------------------+
It seems that now TONS of my queries get stuck in the "Sorting result" step:
mysql> show processlist;
+--------+------+-----------+------+---------+------+----------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+--------+------+-----------+------+---------+------+----------------+------------------------------------------------------------------------------------------------------+
| 460568 | paul | localhost | paul | Query | 0 | NULL | show processlist |
| 460638 | paul | localhost | paul | Query | 0 | Sorting result | SELECT `metaward_achiever`.`id`, `metaward_achiever`.`modified`, `metaward_achiever`.`created`, `met |
| 460710 | paul | localhost | paul | Query | 79 | Sending data | SELECT `metaward_achiever`.`id`, `metaward_achiever`.`modified`, `metaward_achiever`.`created`, `met |
| 460722 | paul | localhost | paul | Query | 49 | Updating | UPDATE `metaward_alias` SET `modified` = '2009-09-15 12:43:50', `created` = '2009-08-24 11:55:24', ` |
| 460732 | paul | localhost | paul | Query | 25 | Sorting result | SELECT `metaward_achiever`.`id`, `metaward_achiever`.`modified`, `metaward_achiever`.`created`, `met |
+--------+------+-----------+------+---------+------+----------------+------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)
And why is that simple UPDATE stuck for 49 seconds?
If it helps, here are the schemas:
| metaward_alias | CREATE TABLE `metaward_alias` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`modified` datetime NOT NULL,
`created` datetime NOT NULL,
`string_id` varchar(255) DEFAULT NULL,
`shortname` varchar(100) NOT NULL,
`remote_image` varchar(500) DEFAULT NULL,
`image` varchar(100) NOT NULL,
`user_id` int(11) DEFAULT NULL,
`type_id` int(11) NOT NULL,
`md5` varchar(32) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `string_id` (`string_id`),
KEY `metaward_alias_user_id` (`user_id`),
KEY `metaward_alias_type_id` (`type_id`)
) ENGINE=InnoDB AUTO_INCREMENT=858381 DEFAULT CHARSET=utf8 |
| metaward_award | CREATE TABLE `metaward_award` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`modified` datetime NOT NULL,
`created` datetime NOT NULL,
`string_id` varchar(20) NOT NULL,
`owner_id` int(11) NOT NULL,
`name` varchar(100) NOT NULL,
`description` longtext NOT NULL,
`owner_points` int(11) NOT NULL,
`url` varchar(500) NOT NULL,
`remote_image` varchar(500) DEFAULT NULL,
`image` varchar(100) NOT NULL,
`parent_award_id` int(11) DEFAULT NULL,
`slug` varchar(110) NOT NULL,
`true_points` double DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `string_id` (`string_id`),
KEY `metaward_award_owner_id` (`owner_id`),
KEY `metaward_award_parent_award_id` (`parent_award_id`),
KEY `metaward_award_slug` (`slug`),
KEY `metaward_award_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=122176 DEFAULT CHARSET=utf8 |
| metaward_achiever | CREATE TABLE `metaward_achiever` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`modified` datetime NOT NULL,
`created` datetime NOT NULL,
`award_id` int(11) NOT NULL,
`alias_id` int(11) NOT NULL,
`count` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `metaward_achiever_award_id` (`award_id`),
KEY `metaward_achiever_alias_id` (`alias_id`)
) ENGINE=InnoDB AUTO_INCREMENT=77175366 DEFAULT CHARSET=utf8 |
And these are in my my.cnf:
innodb_file_per_table
innodb_buffer_pool_size = 2048M
innodb_additional_mem_pool_size = 16M
innodb_flush_method=O_DIRECT
As written in Should you move from MyISAM to Innodb? (which is pretty recent):
Innodb Needs Tuning As a final note about MyISAM to Innodb migration I should mention about Innodb tuning. Innodb needs tuning. Really. MyISAM for many applications can work well with defaults. I've seen hundreds of GB databases run with MyISAM with default settings and it worked reasonably. Innodb needs resources and it will not work well with defaults a lot. Tuning MyISAM from defaults rarely gives more than 2-3 times gain while it can be as much as 10-50 times for Innodb tables, in particular for write-intensive workloads. Check here for details.
So, about MySQL Innodb Settings, the author wrote in Innodb Performance Optimization Basics:
The most important ones are:
innodb_buffer_pool_size: 70-80% of memory is a safe bet. I set it to 12G on a 16GB box. (UPDATE: if you're looking for more details, check out the detailed guide on tuning the innodb buffer pool.)
innodb_log_file_size: this depends on your recovery speed needs, but 256M seems to be a good balance between reasonable recovery time and good performance.
innodb_log_buffer_size=4M: 4M is good for most cases, unless you're piping large blobs into Innodb; in that case increase it a bit.
innodb_flush_log_at_trx_commit=2: if you're not concerned about ACID and can lose the transactions of the last second or two in case of a full OS crash, then set this value. It can have a dramatic effect, especially on a lot of short write transactions.
innodb_thread_concurrency=8: even with the current Innodb scalability fixes, having limited concurrency helps. The actual number may be higher or lower depending on your application, but the default of 8 is a decent start.
innodb_flush_method=O_DIRECT: avoid double buffering and reduce swap pressure; in most cases this setting improves performance. Be careful if you do not have a battery-backed-up RAID cache, as write IO may suffer.
innodb_file_per_table: if you do not have too many tables, use this option so you will not have uncontrolled growth of the main Innodb tablespace, which you can't reclaim. This option was added in MySQL 4.1 and is now stable enough to use.
Also check whether your application can run in READ-COMMITTED isolation mode; if it does, set it as the default with transaction-isolation=READ-COMMITTED. This option has some performance benefits, especially for locking in 5.0, with even more to come in MySQL 5.1 with row-level replication.
Just for the record, the people behind mysqlperformanceblog.com ran a benchmark comparing Falcon, MyISAM and InnoDB. The benchmark was really supposed to be highlighting Falcon, except it was InnoDB that won the day, topping both Falcon and MyISAM in queries per second for almost every test: InnoDB vs MyISAM vs Falcon benchmarks – part 1.
That is a large result set (66,424 rows) that MySQL must manually sort. Try adding an index to metaward_achiever.modified.
There is a limitation in MySQL 4.x that allows only one index to be used per table. Since it is using the index on the metaward_achiever.award_id column for the WHERE selection, it cannot also use the index on metaward_achiever.modified for the sort. I hope you're using MySQL 5.x, which may have improved this.
You can see this by doing explain on this simplified query:
SELECT * FROM `metaward_achiever`
WHERE `metaward_achiever`.`award_id` = 1507
ORDER BY `metaward_achiever`.`modified` DESC
LIMIT 100
If you can get this using the indexes for both the WHERE selection and sorting, then you're set.
You could also create a compound index with both metaward_achiever.award_id and metaward_achiever.modified. If MySQL doesn't use it, you can hint at it or remove the index on just award_id.
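A sketch of that compound index (the key name is my own choice):

ALTER TABLE metaward_achiever
  ADD KEY metaward_achiever_award_modified (award_id, modified);

This lets MySQL locate the award_id = 1507 rows and read them already in modified order straight from the index, avoiding the filesort.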
Alternatively, if you can get rid of metaward_achiever.id, make metaward_achiever.award_id your primary key, and add a key on metaward_achiever.modified, or better yet make metaward_achiever.award_id combined with metaward_achiever.modified your primary key, then you'll be in really good shape.
You can try to optimize the file sorting by modifying settings. Unfortunately, I'm not experienced with this, as our DBA handles the configuration, but you might want to check out this great blog:
http://www.mysqlperformanceblog.com/
Here's an article about filesort in particular:
http://s.petrunia.net/blog/?p=24
My guess is that you probably haven't configured your InnoDB settings beyond the defaults. You should do a quick google for setting up your InnoDB options.
The one that caused me the most noticeable performance issues out of the box was innodb_buffer_pool_size. This should be set to 50-80% of your machine's memory. By default it's often only a few MB. Crank it way up, and you should see a noticeable performance increase.
Also take a look at innodb_additional_mem_pool_size.
Start here, but also google around for "innodb performance tuning".
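As a purely illustrative starting point (the sizes below assume a dedicated box with 8GB of RAM; the 2048M pool already in your my.cnf may be appropriate if the machine has 3-4GB):

# my.cnf fragment -- values are examples, not recommendations
innodb_buffer_pool_size = 6G
innodb_log_file_size = 256M
innodb_log_buffer_size = 4M
innodb_flush_method = O_DIRECT
innodb_file_per_table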
MySQL's query optimizer is not good, from my memory. Try a subselect instead of a straight join.
SELECT * FROM (SELECT * FROM `metaward_achiever`
WHERE `metaward_achiever`.`award_id` = 1507) a
INNER JOIN `metaward_alias` ON (a.`alias_id` = `metaward_alias`.`id`)
ORDER BY a.`modified` DESC
LIMIT 100
Or something like that (untested syntax above).
Sorting is something done by the database server, not the storage engine, in MySQL.
If in both cases, the engine was not able to provide the results in already-sorted form (it depends on the index used), then the server needs to sort them.
The only reason that MyISAM / InnoDB might be different is that the order the rows come back could affect how sorted the data are already - MyISAM could give the data back in "more sorted" order in some cases (and vice versa).
Still, sorting 60k rows is not going to take long as it's a very small data set. Are you sure you've got your sort buffer set big enough?
Using an on-disk filesort instead of an in-memory one is much slower. The engine, however, should not make any difference to this. filesort is not an engine function but a MySQL core function. filesort does, in fact, suck in quite a lot of ways, but it's not normally that slow.
Try adding a key on the fields metaward_achiever.alias_id, metaward_achiever.award_id, and metaward_achiever.modified; this will help a lot. And do not use keys on varchar fields: they increase the time needed for inserts and updates. Also, since you seem to have 77M records in the achiever table, you may want to look into InnoDB optimizations. There are lots of good tutorials around on how to set memory limits for it.
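That suggestion as DDL (the key name is my own choice):

ALTER TABLE metaward_achiever
  ADD KEY metaward_achiever_alias_award_modified (alias_id, award_id, modified);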