Relating IDs in a relational table - sql

I have these two insertion queries in Perl using the DBI module and DBD::mysql.
This one inserts fields url, html_extr_text, concord_file, and sys_time into table article:
my @fields = (qw(url html_extr_text concord_file sys_time));
my $fieldlist = join ", ", @fields;
my $field_placeholders = join ", ", map {'?'} @fields;
my $insert_query = qq{
INSERT INTO article ($fieldlist)
VALUES ($field_placeholders)
};
my $sth = $dbh->prepare($insert_query);
my $id_article;
my @id_articles;
foreach my $article_index (0 .. $#output_concord_files_prepare) {
$field_placeholders = $sth->execute(
$url_prepare[$article_index],
$html_pages_files_extended[$article_index],
$output_concord_files_prepare[$article_index],
$sys_time_prepare[$article_index]);
$id_article = $dbh->last_insert_id(undef, undef, 'article', 'id_article');
push @id_articles, $id_article;
if ($field_placeholders != 1) {
die "Error inserting records, only [$field_placeholders] got inserted: " . $sth->errstr;
}
}
print "@id_articles\n";
And this one inserts field event into table event:
@fields = (qw(event));
$fieldlist = join ", ", @fields;
$field_placeholders = join ", ", map {'?'} @fields;
$insert_query = qq{
INSERT INTO event ($fieldlist)
VALUES ($field_placeholders)
};
$sth = $dbh->prepare($insert_query);
my $id_event;
my @id_events;
foreach my $event_index (0 .. $#event_prepare) {
$field_placeholders = $sth->execute($event_prepare[$event_index]);
$id_event = $dbh->last_insert_id(undef, undef, 'event', 'id_event');
push @id_events, $id_event;
if ($field_placeholders != 1) {
die "Error inserting records, only [$field_placeholders] got inserted: " . $sth->errstr;
}
}
print "@id_events\n";
I'd like to create a third table that holds the one-to-many relationship, because one article contains multiple events. I have this file:
output_concord/concord.0.txt -> earthquake
output_concord/concord.0.txt -> avalanche
output_concord/concord.0.txt -> snowfall
output_concord/concord.1.txt -> avalanche
output_concord/concord.1.txt -> rock fall
output_concord/concord.1.txt -> mud slide
output_concord/concord.4.txt -> avalanche
output_concord/concord.4.txt -> rochfall
output_concord/concord.4.txt -> topple
...
As you can see, I collect the ID of each entry using last_insert_id. However, I don't really know how to take the next step.
Using this file, how can I insert the IDs from the two previous tables into a third table, 'article_event_index'?
The table would be something like this:
$create_query = qq{
create table article_event_index(
id_article int(10) NOT NULL,
id_event int(10) NOT NULL,
primary key (id_article, id_event),
foreign key (id_article) references article (id_article),
foreign key (id_event) references event (id_event)
)
};
$dbh->do($create_query);
Which will contain relationships following the pattern
1-1, 1-2, 1-3, 2-4, 3-5 ...
I'm a newbie to Perl and databases so it's hard to formulate what I want to do. I hope I was clear enough.

Something like this should do what you need (untested, but it does compile).
It starts by building Perl hashes to relate concord files to article IDs and events to event IDs. Then the file is read, and a pair of IDs is inserted into the new table for each relationship that can be found in the existing tables.
Note that the hashes are there only to avoid a long sequence of
SELECT id_article FROM article WHERE concord_file = ?
and
SELECT id_event FROM event WHERE event = ?
statements.
use strict;
use warnings;
use DBI;
use constant RELATIONSHIP_FILE => 'relationships.txt';
my $dbh = DBI->connect('DBI:mysql:database', 'user', 'pass')
or die $DBI::errstr;
$dbh->do('DROP TABLE IF EXISTS article_event_index');
$dbh->do(<< 'END_SQL');
CREATE TABLE article_event_index (
id_article INT(10) NOT NULL,
id_event INT(10) NOT NULL,
PRIMARY KEY (id_article, id_event),
FOREIGN KEY (id_article) REFERENCES article (id_article),
FOREIGN KEY (id_event) REFERENCES event (id_event)
)
END_SQL
my $articles = $dbh->selectall_hashref(
'SELECT id_article, concord_file FROM article',
'concord_file'
);
my $events = $dbh->selectall_hashref(
'SELECT id_event, event FROM event',
'event'
);
open my $fh, '<', RELATIONSHIP_FILE
or die sprintf qq{Unable to open "%s": %s}, RELATIONSHIP_FILE, $!;
my $insert_sth = $dbh->prepare('INSERT INTO article_event_index (id_article, id_event) VALUES (?, ?)');
while (<$fh>) {
chomp;
my ($concord_file, $event) = split /\s*->\s*/;
next unless defined $event;
unless (exists $articles->{$concord_file}) {
warn qq{No article record for concord file "$concord_file"};
next;
}
my $id_article = $articles->{$concord_file}{id_article};
unless (exists $events->{$event}) {
warn qq{No event record for event "$event"};
next;
}
my $id_event = $events->{$event}{id_event};
$insert_sth->execute($id_article, $id_event);
}
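The same lookup-then-insert idea can be sketched in Python with the standard-library sqlite3 module. This is only an illustration under assumptions: the in-memory database, the sample rows, the LINES list, and the function name link_articles_to_events are hypothetical stand-ins for the real MySQL tables and relationship file.

```python
import sqlite3

# Hypothetical sample lines standing in for the real relationship file.
LINES = [
    "output_concord/concord.0.txt -> earthquake",
    "output_concord/concord.0.txt -> avalanche",
    "output_concord/concord.1.txt -> avalanche",
    "output_concord/concord.9.txt -> topple",  # no matching article row
]


def link_articles_to_events(conn, lines):
    """Insert one (id_article, id_event) pair per resolvable line; return skipped lines."""
    # Build both lookups once instead of running a SELECT for every line.
    articles = dict(conn.execute("SELECT concord_file, id_article FROM article"))
    events = dict(conn.execute("SELECT event, id_event FROM event"))
    skipped = []
    for line in lines:
        concord_file, sep, event = (part.strip() for part in line.partition("->"))
        if not sep:
            continue  # line has no "->" separator
        if concord_file not in articles or event not in events:
            skipped.append(line)
            continue
        conn.execute(
            "INSERT OR IGNORE INTO article_event_index (id_article, id_event) "
            "VALUES (?, ?)",
            (articles[concord_file], events[event]),
        )
    return skipped


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id_article INTEGER PRIMARY KEY, concord_file TEXT);
    CREATE TABLE event (id_event INTEGER PRIMARY KEY, event TEXT);
    CREATE TABLE article_event_index (
        id_article INTEGER NOT NULL REFERENCES article (id_article),
        id_event INTEGER NOT NULL REFERENCES event (id_event),
        PRIMARY KEY (id_article, id_event)
    );
    INSERT INTO article (concord_file) VALUES
        ('output_concord/concord.0.txt'), ('output_concord/concord.1.txt');
    INSERT INTO event (event) VALUES ('earthquake'), ('avalanche'), ('topple');
""")

skipped = link_articles_to_events(conn, LINES)
pairs = conn.execute(
    "SELECT id_article, id_event FROM article_event_index "
    "ORDER BY id_article, id_event"
).fetchall()
print(pairs)
print(skipped)
```

Lines whose article or event is missing are collected rather than inserted, which mirrors the warn-and-skip behaviour of the Perl version.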

Related

Dealing with collections in SQL

I have this table in which I want to register likes from users. The data type for the likes is an array (INT[]). I don't know if this is the best approach to handle like and unlike.
I have not found an effective way to manipulate the collection in order to toggle like/unlike for a user.
Can we use something like an associative array in SQL? Or what would be the best approach with an array? I could not find an example.
Thanks for pointing me in the right direction.
CREATE TABLE posts (
pid SERIAL PRIMARY KEY,
user_id INT REFERENCES users(uid),
author VARCHAR REFERENCES users(username),
title VARCHAR(255),
content TEXT,
date_created TIMESTAMP,
like_user_id INT[] DEFAULT ARRAY[]::INT[],
likes INT DEFAULT 0
);
const likePost = (req, res, next) => {
const values = [req.body.id, req.body.user_id];
console.log(values);
const query = `UPDATE posts SET likes = likes - 1 WHERE pid = $1, UPDATE posts SET likes[1] = $2 WHERE pid = $1`;
pool.query(query, values, (q_err, q_res) => {
if (q_err) return next(q_err);
res.json(`post ${req.body.id} successfully removed 👍`);
});
};
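One commonly used alternative to an array column is a separate table with one row per (post, user) pair, so that toggling a like becomes a delete-or-insert. The sketch below uses Python's sqlite3 module purely for illustration. The table name post_likes and the helper names are assumptions, not part of the schema in the question, and the same statements would need adapting to Postgres and the pool.query call.

```python
import sqlite3


def toggle_like(conn, post_id, user_id):
    """Remove the like if present, otherwise add it. Returns True if now liked."""
    cur = conn.execute(
        "DELETE FROM post_likes WHERE post_id = ? AND user_id = ?",
        (post_id, user_id),
    )
    if cur.rowcount == 1:
        return False  # a like existed and was removed
    conn.execute(
        "INSERT INTO post_likes (post_id, user_id) VALUES (?, ?)",
        (post_id, user_id),
    )
    return True


def like_count(conn, post_id):
    """Count likes on demand instead of maintaining a separate counter column."""
    return conn.execute(
        "SELECT COUNT(*) FROM post_likes WHERE post_id = ?", (post_id,)
    ).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE post_likes (
        post_id INTEGER NOT NULL,
        user_id INTEGER NOT NULL,
        PRIMARY KEY (post_id, user_id)  -- one like per user per post
    )
""")

first = toggle_like(conn, 1, 42)   # user 42 likes post 1
second = toggle_like(conn, 1, 7)   # user 7 likes post 1
third = toggle_like(conn, 1, 42)   # user 42 unlikes post 1
print(first, second, third, like_count(conn, 1))
```

The composite primary key prevents a user from liking the same post twice, which is the guarantee the array column cannot give on its own.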

SQLite: Foreign Key "ON DELETE SET NULL" action not getting triggered

Why is ON DELETE SET NULL failing when deleting a row via the application code, but it behaves correctly when manually executing an SQL statement?
I have a todo table and a category table. The todo table has a category_id foreign key that references id in the category table, and it was created with the "ON DELETE SET NULL" action.
create table `category` (
`id` integer not null primary key autoincrement,
`name` varchar(255) not null
);
create table `todo` (
`id` integer not null primary key autoincrement,
`title` varchar(255) not null,
`complete` boolean not null default '0',
`category_id` integer,
foreign key(`category_id`) references `category`(`id`) on delete SET NULL on update CASCADE
);
I also have an endpoint in my application that allows users to delete a category.
categoryRouter.delete('/:id', async (req, res) => {
const { id } = req.params
await req.context.models.Category.delete(id)
return res.status(204).json()
})
This route successfully deletes categories, but the problem is that related todo items are not getting their category_id property set to null, so they end up with a category id that no longer exists. Strangely though, if I open up my database GUI and manually execute the query to delete a category... DELETE FROM category WHERE id=1... the "ON DELETE SET NULL" hook is successfully firing. Any todo item that had category_id=1 is now set to null.
Full application source can be found here.
Figured it out, thanks to MikeT.
So apparently SQLite by default has foreign key support turned off. WTF!
To enable FKs, I had to change my code from this...
const knex = Knex(knexConfig.development)
Model.knex(knex)
to this...
const knex = Knex(knexConfig.development)
knex.client.pool.on('createSuccess', (eventId, resource) => {
resource.run('PRAGMA foreign_keys = ON', () => {})
})
Model.knex(knex)
Alternatively, I could have done this inside of the knexfile.js...
module.exports = {
development: {
client: 'sqlite3',
connection: {
filename: './db.sqlite3'
},
pool: {
afterCreate: (conn, cb) => {
conn.run('PRAGMA foreign_keys = ON', cb)
}
}
},
staging: {},
production: {}
}
FYI for other people who stumble across a similar problem: PRAGMA foreign_keys = ON is a per-connection setting, so it has to be enabled by every program that works with the child table or the parent table.
When I set PRAGMA foreign_keys = ON only in the program that handles the child table, ON UPDATE CASCADE was enabled but ON DELETE SET NULL was still disabled. Eventually I found out that I had forgotten PRAGMA foreign_keys = ON in another program, which handles the parent table.
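The effect of the pragma can be reproduced with a small, self-contained sketch in Python's sqlite3 module. It reuses a trimmed version of the schema from the question; the sample rows and the function name category_after_delete are made up for the demonstration. The pragma is set explicitly in both cases so the result does not depend on how the SQLite library was compiled.

```python
import sqlite3

SCHEMA = """
    CREATE TABLE category (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL
    );
    CREATE TABLE todo (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        title TEXT NOT NULL,
        category_id INTEGER,
        FOREIGN KEY (category_id) REFERENCES category (id)
            ON DELETE SET NULL ON UPDATE CASCADE
    );
    INSERT INTO category (name) VALUES ('chores');
    INSERT INTO todo (title, category_id) VALUES ('sweep', 1);
"""


def category_after_delete(enforce):
    """Delete the category and return the todo's category_id afterwards."""
    conn = sqlite3.connect(":memory:")
    # The pragma applies to this connection only and must be set before
    # a transaction starts, so it is the first statement executed.
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce else 'OFF'}")
    conn.executescript(SCHEMA)
    conn.execute("DELETE FROM category WHERE id = 1")
    return conn.execute("SELECT category_id FROM todo").fetchone()[0]


print(category_after_delete(True))   # ON DELETE SET NULL fires
print(category_after_delete(False))  # dangling category_id is left behind
```

With enforcement off, the DELETE still succeeds, which is why the application route appeared to work while leaving stale category ids.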

Optional column update if provided value for column is not null

I have the following table:
CREATE TABLE IF NOT EXISTS categories
(
id SERIAL PRIMARY KEY,
title CHARACTER VARYING(100) NOT NULL,
description CHARACTER VARYING(200) NULL,
category_type CHARACTER VARYING(100) NOT NULL
);
I am using pg-promise, and I want to provide optional update of columns:
categories.update = function (categoryTitle, toUpdateCategory) {
return this.db.oneOrNone(sql.update, [
categoryTitle,
toUpdateCategory.title, toUpdateCategory.category_type, toUpdateCategory.description,
])
}
categoryTitle - is required
toUpdateCategory.title - is required
toUpdateCategory.category_type - is optional (can be passed or undefined)
toUpdateCategory.description - is optional (can be passed or undefined)
I want to build UPDATE query for updating only provided columns:
UPDATE categories
SET title=$2,
-- ... SET category_type=$3 if $3 is not NULL, otherwise keep the old category_type value
-- ... SET description=$4 if $4 is not NULL, otherwise keep the old description value
WHERE title = $1
RETURNING *;
How can I achieve this optional column update in Postgres?
You could coalesce between the old and the new values:
UPDATE categories
SET title=$2,
category_type = COALESCE($3, category_type),
description = COALESCE($4, description) -- etc...
WHERE title = $1
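To see the COALESCE behaviour in isolation, here is a minimal sketch using Python's sqlite3 module, where COALESCE works the same way for this purpose. The sample row and the function name update_category are assumptions made for the demonstration; positional ? placeholders replace the $n parameters.

```python
import sqlite3

UPDATE_SQL = """
    UPDATE categories
    SET title = ?,
        category_type = COALESCE(?, category_type),
        description = COALESCE(?, description)
    WHERE title = ?
"""


def update_category(conn, current_title, new_title,
                    category_type=None, description=None):
    """Update the row; a None argument keeps the column's existing value."""
    conn.execute(UPDATE_SQL,
                 (new_title, category_type, description, current_title))
    return conn.execute(
        "SELECT title, category_type, description FROM categories "
        "WHERE title = ?",
        (new_title,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE categories (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        description TEXT,
        category_type TEXT NOT NULL
    )
""")
conn.execute(
    "INSERT INTO categories (title, description, category_type) "
    "VALUES ('Food', 'Groceries', 'expense')"
)

# category_type is omitted, so its old value survives the update.
row = update_category(conn, "Food", "Meals", description="Eating out")
print(row)
```

One consequence of this approach is that a column can never be set to NULL through the update, because NULL is treated as "not provided".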
The helpers syntax is best for any sort of dynamic logic with pg-promise:
/* logic for skipping columns: */
const skip = c => c.value === null || c.value === undefined;
/* reusable/static ColumnSet object: */
const cs = new pgp.helpers.ColumnSet(
[
'title',
{name: 'category_type', skip},
{name: 'description', skip}
],
{table: 'categories'});
categories.update = function(title, category) {
const condition = pgp.as.format(' WHERE title = $1', title);
const update = () => pgp.helpers.update(category, cs) + condition;
return this.db.none(update);
}
And if your optional column-properties do not even exist on the object when they are not specified, you can simplify the skip logic to just this (see Column logic):
const skip = c => !c.exists;
Used API: ColumnSet, helpers.update.
See also a very similar question: Skip update columns with pg-promise.
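The skip idea itself, leaving a column out of the SET list when no value was provided, can be illustrated independently of pg-promise. The Python sketch below is not how pg-promise implements ColumnSet; the function build_update and its parameters are hypothetical, and the table and column names are assumed to come from trusted code because they are interpolated into the SQL text.

```python
def build_update(table, values, optional, where_column, where_value):
    """Return (sql, params) for an UPDATE that omits optional columns set to None.

    Table and column names are interpolated directly, so they must never
    come from user input; only the values are passed as parameters.
    """
    columns = [
        name for name, value in values.items()
        if not (name in optional and value is None)
    ]
    assignments = ", ".join(f"{name} = ?" for name in columns)
    sql = f"UPDATE {table} SET {assignments} WHERE {where_column} = ?"
    params = [values[name] for name in columns] + [where_value]
    return sql, params


sql, params = build_update(
    "categories",
    {"title": "Meals", "category_type": None, "description": "Eating out"},
    optional={"category_type", "description"},
    where_column="title",
    where_value="Food",
)
print(sql)
print(params)
```

Unlike the COALESCE variant, the generated statement changes shape with the input, so skipped columns are not touched at all.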

How To Split Pipe-Delimited Column and insert each value into new table Once?

I have an old database with a gazillion records (more or less) that have a single tags column (with tags being pipe-delimited) that looks like so:
Breakfast
Breakfast|Brunch|Buffet|Burger|Cakes|Crepes|Deli|Dessert|Dim Sum|Fast Food|Fine Wine|Spirits|Kebab|Noodles|Organic|Pizza|Salad|Seafood|Steakhouse|Sushi|Tapas|Vegetarian
Breakfast|Brunch|Buffet|Burger|Deli|Dessert|Fast Food|Fine Wine|Spirits|Noodles|Pizza|Salad|Seafood|Steakhouse|Vegetarian
Breakfast|Brunch|Buffet|Cakes|Crepes|Dessert|Fine Wine|Spirits|Salad|Seafood|Steakhouse|Tapas|Teahouse
Breakfast|Brunch|Burger|Crepes|Salad
Breakfast|Brunch|Cakes|Dessert|Dim Sum|Noodles|Pizza|Salad|Seafood|Steakhouse|Vegetarian
Breakfast|Brunch|Cakes|Dessert|Dim Sum|Noodles|Pizza|Salad|Seafood|Vegetarian
Breakfast|Brunch|Deli|Dessert|Organic|Salad
Breakfast|Brunch|Dessert|Dim Sum|Hot Pot|Seafood
Breakfast|Brunch|Dessert|Dim Sum|Seafood
Breakfast|Brunch|Dessert|Fine Wine|Spirits|Noodles|Pizza|Salad|Seafood
Breakfast|Brunch|Dessert|Fine Wine|Spirits|Salad|Vegetarian
Is there a way one could retrieve each tag and insert it into a new table tag_id | tag_nm using MySQL only?
Here is my attempt, which uses PHP. I imagine this could be more efficient with a clever MySQL query. I've placed the relationship part of it there too. There's no escaping or error checking.
$rs = mysql_query('SELECT `venue_id`, `tag` FROM `venue` AS a');
while ($row = mysql_fetch_array($rs)) {
$tag_array = explode('|',$row['tag']);
$venueid = $row['venue_id'];
foreach ($tag_array as $tag) {
$rs2 = mysql_query("SELECT `tag_id` FROM `tag` WHERE tag_nm = '$tag'");
$tagid = 0;
while ($row2 = mysql_fetch_array($rs2)) $tagid = $row2['tag_id'];
if (!$tagid) {
mysql_query("INSERT INTO `tag` (`tag_nm`) VALUES ('$tag')");
$tagid = mysql_insert_id();
}
mysql_query("INSERT INTO `venue_tag_rel` (`venue_id`, `tag_id`) VALUES ($venueid, $tagid)");
}
}
After finding there is no official split function I've solved the issue using only MySQL like so:
First, I created the function strSplit:
CREATE FUNCTION strSplit(x varchar(21845), delim varchar(255), pos int) returns varchar(255)
return replace(
replace(
substring_index(x, delim, pos),
substring_index(x, delim, pos - 1),
''
),
delim,
''
);
Second, I inserted the new tags into my new table (real names and columns changed to keep it simple):
INSERT IGNORE INTO tag (SELECT null, strSplit(`Tag`,'|',1) AS T FROM `old_venue` GROUP BY T)
Rinse and repeat, increasing pos by one for each column (in this case I had a maximum of 8 separators).
Third, to get the relationship:
INSERT INTO `venue_tag_rel`
(Select a.`venue_id`, b.`tag_id` from `old_venue` a, `tag` b
WHERE
(
a.`Tag` LIKE CONCAT('%|',b.`tag_nm`)
OR a.`Tag` LIKE CONCAT(b.`tag_nm`,'|%')
OR a.`Tag` LIKE CONCAT(CONCAT('%|',b.`tag_nm`),'|%')
OR a.`Tag` LIKE b.`tag_nm`
)
)
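For comparison, the application-side route from the question (split each string in code, then insert) can be sketched with parameterized statements and a UNIQUE constraint doing the de-duplication. This is a Python/sqlite3 illustration, not a MySQL-only solution; the schema below is a simplified assumption based on the table names used above, and normalize_tags is a hypothetical helper.

```python
import sqlite3


def normalize_tags(conn):
    """Split each pipe-delimited tag string into tag and venue_tag_rel rows."""
    rows = conn.execute("SELECT venue_id, tag FROM old_venue").fetchall()
    for venue_id, tag_string in rows:
        names = [part.strip() for part in (tag_string or "").split("|")]
        for name in filter(None, names):
            # The UNIQUE constraint on tag_nm makes repeated inserts a no-op.
            conn.execute(
                "INSERT OR IGNORE INTO tag (tag_nm) VALUES (?)", (name,)
            )
            (tag_id,) = conn.execute(
                "SELECT tag_id FROM tag WHERE tag_nm = ?", (name,)
            ).fetchone()
            conn.execute(
                "INSERT OR IGNORE INTO venue_tag_rel (venue_id, tag_id) "
                "VALUES (?, ?)",
                (venue_id, tag_id),
            )


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_venue (venue_id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE tag (
        tag_id INTEGER PRIMARY KEY,
        tag_nm TEXT NOT NULL UNIQUE
    );
    CREATE TABLE venue_tag_rel (
        venue_id INTEGER NOT NULL,
        tag_id INTEGER NOT NULL,
        PRIMARY KEY (venue_id, tag_id)
    );
    INSERT INTO old_venue (venue_id, tag) VALUES
        (1, 'Breakfast'),
        (2, 'Breakfast|Brunch|Burger|Crepes|Salad'),
        (3, 'Breakfast|Brunch|Dessert|Dim Sum|Seafood');
""")

normalize_tags(conn)
tag_total = conn.execute("SELECT COUNT(*) FROM tag").fetchone()[0]
rel_total = conn.execute("SELECT COUNT(*) FROM venue_tag_rel").fetchone()[0]
print(tag_total, rel_total)
```

Because each tag is matched by exact name after splitting, this avoids the LIKE-based join and the need to know the maximum number of separators in advance.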

SQL to batch re-tag items

I've got a MySQL database with typical schema for tagging items:
item (1->N) item_tag (N->1) tag
Each tag has a name and a count of how many items have that tag
ie:
item
(
item_id (UNIQUE KEY)
)
item_tag
(
item_id (NON-UNIQUE INDEXED),
tag_id (NON-UNIQUE INDEXED)
)
tag
(
tag_id (UNIQUE KEY)
name
count
)
I need to write a maintenance routine to batch re-tag one or more existing tags to a single new or existing other tag. I need to make sure that after the retag, no items have duplicate tags and I need to update the counts on each tag record to reflect the number of actual items using that tag.
Looking for suggestions on how to implement this efficiently...
If I understood you correctly, then you could try something like this:
/* new tag/item table clustered PK optimised for group by tag_id
or tag_id = ? queries !! */
drop table if exists tag_item;
create table tag_item
(
tag_id smallint unsigned not null,
item_id int unsigned not null,
primary key (tag_id, item_id), -- clustered PK innodb only
key (item_id)
)
engine=innodb;
-- populate new table with distinct tag/items
insert ignore into tag_item
select tag_id, item_id from item_tag order by tag_id, item_id;
-- update counters
update tag inner join
(
select
tag_id,
count(*) as counter
from
tag_item
group by
tag_id
) c on tag.tag_id = c.tag_id
set
tag.count = c.counter;
An index/constraint on the item_tag table can prevent duplicate tags; or create the table with a composite primary key using both item_id and tag_id.
As to the counts, drop the count column from the tag table and create a VIEW to get the results:
CREATE VIEW tag_counts AS SELECT t.tag_id, t.name, COUNT(it.item_id) AS count FROM tag t LEFT JOIN item_tag it ON it.tag_id = t.tag_id GROUP BY t.tag_id, t.name
Then your count is always up to date.
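Putting the composite key and the view together, a retag can be reduced to two statements. The sketch below uses Python's sqlite3 module with made-up sample data and a hypothetical retag function. It relies on SQLite's UPDATE OR IGNORE, for which MySQL's counterpart is UPDATE IGNORE, and it leaves out the published flag and aliases mentioned elsewhere in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (tag_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE item_tag (
        item_id INTEGER NOT NULL,
        tag_id INTEGER NOT NULL REFERENCES tag (tag_id),
        PRIMARY KEY (item_id, tag_id)  -- rejects duplicate tags per item
    );
    CREATE VIEW tag_counts AS
        SELECT t.tag_id, t.name, COUNT(it.item_id) AS count
        FROM tag t LEFT JOIN item_tag it ON it.tag_id = t.tag_id
        GROUP BY t.tag_id, t.name;
    INSERT INTO tag (tag_id, name) VALUES
        (1, 'js'), (2, 'javascript'), (3, 'ecmascript');
    INSERT INTO item_tag (item_id, tag_id) VALUES
        (10, 1), (10, 2), (11, 2), (12, 3);
""")


def retag(conn, old_tag_ids, new_tag_id):
    """Move every item from the old tags to new_tag_id without duplicates."""
    marks = ", ".join("?" for _ in old_tag_ids)
    # Rows that would collide with an existing (item_id, new_tag_id) pair
    # are skipped by OR IGNORE and keep their old tag id for now.
    conn.execute(
        f"UPDATE OR IGNORE item_tag SET tag_id = ? WHERE tag_id IN ({marks})",
        [new_tag_id, *old_tag_ids],
    )
    # Whatever still carries an old tag id was a would-be duplicate.
    conn.execute(
        f"DELETE FROM item_tag WHERE tag_id IN ({marks})", list(old_tag_ids)
    )


retag(conn, [2, 3], 1)
counts = conn.execute(
    "SELECT name, count FROM tag_counts ORDER BY tag_id"
).fetchall()
print(counts)
```

Since the counts come from the view, no separate counter update is needed after the retag.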
This is what I've got so far, which seems to work but I don't have enough data yet to know how well it performs. Comments welcome.
Some notes:
Had to add a unique id field to the item_tags table to get the duplicate tag cleanup working.
Added support for tag aliases so that there's a record of retagged tags.
I didn't mention this before but each item also has a published flag and only published items should affect the count field on tags.
The code uses C#, subsonic+linq + "coding horror", but is fairly self explanatory.
The code:
public static void Retag(string new_tag, List<string> old_tags)
{
// Check new tag name is valid
if (!Utils.IsValidTag(new_tag))
{
throw new RuleException("NewTag", string.Format("Invalid tag name - {0}", new_tag));
}
// Start a transaction
using (var scope = new SimpleTransactionScope(megDB.GetInstance().Provider))
{
// Get the new tag
var newTag = tag.SingleOrDefault(x => x.name == new_tag);
// If the new tag is an alias, remap to the alias instead
if (newTag != null && newTag.alias != null)
{
newTag = tag.SingleOrDefault(x => x.tag_id == newTag.alias.Value);
}
// Get the old tags
var oldTags = new List<tag>();
foreach (var old_tag in old_tags)
{
// Ignore same tag
if (string.Compare(old_tag, new_tag, true)==0)
continue;
var oldTag = tag.SingleOrDefault(x => x.name == old_tag);
if (oldTag != null)
oldTags.Add(oldTag);
}
// Redundant?
if (oldTags.Count == 0)
return;
// Simple rename?
if (oldTags.Count == 1 && newTag == null)
{
oldTags[0].name = new_tag;
oldTags[0].Save();
scope.Complete();
return;
}
// Create new tag?
if (newTag == null)
{
newTag = new tag();
newTag.name = new_tag;
newTag.Save();
}
// Build a comma separated list of old tag id's for use in sql 'IN' clause
var sql_old_tags = string.Join(",", (from t in oldTags select t.tag_id.ToString()).ToArray());
// Step 1 - Retag, allowing duplicates for now
var sql = @"
UPDATE item_tags
SET tag_id=@newtagid
WHERE tag_id IN (" + sql_old_tags + @");
";
// Step 2 - Delete the duplicates
sql += @"
DELETE t1
FROM item_tags t1, item_tags t2
WHERE t1.tag_id=t2.tag_id
AND t1.item_id=t2.item_id
AND t1.item_tag_id > t2.item_tag_id;
";
// Step 3 - Update the use count of the destination tag
sql += @"
UPDATE tags
SET tags.count=
(
SELECT COUNT(items.item_id)
FROM items
INNER JOIN item_tags ON item_tags.item_id = items.item_id
WHERE items.published=1 AND item_tags.tag_id=@newtagid
)
WHERE
tag_id=@newtagid;
";
// Step 4 - Zero the use counts of the old tags and alias the old tag to the new tag
sql += @"
UPDATE tags
SET tags.count=0,
alias=@newtagid
WHERE tag_id IN (" + sql_old_tags + @");
";
// Do it!
megDB.CodingHorror(sql, newTag.tag_id, newTag.tag_id, newTag.tag_id, newTag.tag_id).Execute();
scope.Complete();
}