Remove HTML entities from a database - sql

Due to errors of my predecessors, a (MySQL) database I would like to use contains a lot of HTML entities (e.g. &euro; instead of €).
As the database should contain raw data (a database shouldn't have anything to do with HTML), I want to remove them from the DB and store the data as proper UTF-8; the collation is already set to that.
What would be a good way to fix this? The only thing I can think of is to write a PHP script that gets all the data, runs it through html_entity_decode() and writes it back. It's doable since it's a one-time operation and the DB is only about 100 MB, but it's still less than optimal.
Any ideas?

Since no one could provide a satisfying SQL-only solution, I solved it with a script similar to this one.
Note that it only works if all the tables you use it on have a single-column primary key, but this will usually be the case.
<?php
// Specify which columns need to be de-entitized
$affected = array(
    'table1' => array('column1', 'column2'),
    'table2' => array('column1', 'column2'),
);
// Make database connection; throw exceptions so a failure aborts before the commit
$db = new PDO("mysql:dbname=yourdb;host=yourhost;charset=utf8", "user", "pass");
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
foreach ($affected as $table => $columns) {
    // Start a transaction for each table
    $db->beginTransaction();
    // Find the table's primary key column. PHP 5.4 syntax!
    $pk = $db->query("SHOW INDEX FROM " . $table . " WHERE Key_name = 'PRIMARY'")->fetch()['Column_name'];
    foreach ($columns as $column) {
        // Construct a prepared statement for this column
        $ps = $db->prepare("UPDATE " . $table . " SET " . $column . " = ? WHERE " . $pk . " = ?");
        // Go through all rows; FETCH_NUM yields exactly two values per row, matching the two placeholders
        foreach ($db->query("SELECT " . $column . ", " . $pk . " FROM " . $table, PDO::FETCH_NUM) as $row) {
            if ($row[0] === null) continue; // Leave NULLs untouched
            $row[0] = html_entity_decode($row[0], ENT_QUOTES, 'UTF-8'); // Actual processing
            $ps->execute($row);
        }
    }
    // Everything went well for this table, commit
    $db->commit();
}
?>
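To see what the decoding step does before running it against real tables, a small standalone sketch may help. The helper name decodeEntities() and the pass limit are assumptions made for illustration and are not part of the script above. It repeats html_entity_decode() until the value stops changing, which also covers double-encoded values such as &amp;euro;. Be aware that repeated decoding would also alter text that was meant to contain a literal entity.

```php
<?php
// Hypothetical helper: decode HTML entities, repeating for double-encoded input.
function decodeEntities(string $value, int $maxPasses = 3): string
{
    for ($i = 0; $i < $maxPasses; $i++) {
        $decoded = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');
        if ($decoded === $value) {
            break; // nothing left to decode
        }
        $value = $decoded;
    }
    return $value;
}

// Print a few sample values to inspect the result
echo decodeEntities('5 &euro;'), "\n";
echo decodeEntities('5 &amp;euro;'), "\n";
echo decodeEntities('Tom &amp; Jerry'), "\n";
```

If the data is known to be encoded only once, calling it with $maxPasses = 1 is probably the safer choice.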

I think you need to create a MySQL procedure (with a SELECT loop and an UPDATE that uses REPLACE), e.g. REPLACE(TextString, '&apos;', '''');

Depending on the database (Oracle, MySQL, etc.) and whether you can take it offline, you might be able to export all the DDL and data as a large SQL script (containing INSERTs for all the tables). Then you could do a standard search/replace using sed:
sed -i 's/&euro;/€/g' script.sql
Then drop the database or truncate the tables and recreate it using the script.

Ultimately I think you are going to have to resort to PHP at some stage; converting a lot of these entities in SQL is going to involve a huge amount of decision logic.
However, one approach I can think of, if you must use SQL, is to create a user-defined function that essentially contains a huge CASE statement (or lots of IF/THENs):
http://dev.mysql.com/doc/refman/5.0/en/case-statement.html
Then you should simply be able to do something like:
SELECT col1,col2,col3,mtuserdecodefunction(column-with-entities-in) FROM mytable
Which should in theory return you a cleaned table.

Related

Wordpress post update code is not working

I want to update a post by its ID in WordPress and append text to the end of its content. However, I could not get the code below to work. Can you help?
wp_update_post(['ID' => $posts->ID,"post_content" => "concat_ws(' ',post_content, 'SECOND')"]);
Normally, this is done in SQL with concat, but I want to do it in PHP.
The version that works in SQL:
update test_user set descrip = concat_ws(' ',descrip, 'SECOND') where Id=2
How should the first snippet be written so that it works from PHP?
You can use braces or the concatenation operator (.):
echo "qwe{$a}rty"; // qwe12345rty, using braces
echo "qwe" . $a . "rty"; // qwe12345rty, concatenation used
Also, it is much better to use the WP_Post class than to modify data in the tables directly.
Your WP instance may use a DB caching layer or hooks that run when posts are updated. This functionality can break if you work with the tables directly.
$post = get_post( 123 );
$post->post_content = $post->post_content . "Some other content";
wp_update_post( $post );

PDO bulk import script

I am trying to create an import script. I have an example working after following [some outdated and unsafe tutorial]. However, this tutorial doesn't use PDO. Although this form is only to be used by a site supervisor, I feel I should use PDO to help prevent SQL injection, but I am not sure how to change this to use PDO statements.
Thanks
(I already have my DB connection up here)
if ($_FILES['csv']['size'] > 0) {
    // get the csv file
    $file = $_FILES['csv']['tmp_name'];
    $handle = fopen($file, "r");
    // read the file line by line; skip lines with an empty first field
    while ($data = fgetcsv($handle, 1000, ",", "'")) {
        if ($data[0]) {
            $sql = "INSERT INTO users_tbl (username, staff_id, dept, area) VALUES
                ('" . addslashes($data[0]) . "',
                 '" . addslashes($data[1]) . "',
                 '" . addslashes($data[2]) . "',
                 '" . addslashes($data[3]) . "'
                )
            ";
            $result = pg_query($sql);
        }
    }
    // redirect
    header('Location: import.php?success=1');
    die;
}
As a matter of fact, there is not the slightest connection between PDO and CSV.
Every PHP user has to develop the ability to separate concerns. There are two tasks for you:
1. to read a line from the CSV file into an array
2. to store the array data in the database.
These two tasks are totally unrelated to each other. After reading the CSV you may store it anywhere, and the CSV part won't change a bit. You may store data from any other source with PDO, and the PDO code won't change a bit.
So all you actually need to learn is how to use PDO. Well, here you go: https://stackoverflow.com/tags/pdo/info
Another option would be to use a LOAD DATA INFILE query, which is far more suitable for bulk imports.
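The two tasks described above can be sketched with one prepared statement. This is a minimal sketch, not a definitive implementation: the helper name importUsers() is hypothetical, the table and column names are taken from the question, and it assumes each CSV line carries four fields in column order. $db is any PDO connection and $handle is an open file handle.

```php
<?php
// Hypothetical helper: insert each CSV line into users_tbl with one prepared statement.
function importUsers(PDO $db, $handle): int
{
    $stmt = $db->prepare(
        'INSERT INTO users_tbl (username, staff_id, dept, area) VALUES (?, ?, ?, ?)'
    );
    $inserted = 0;
    $db->beginTransaction();
    // Task 1: read a line from the CSV file into an array
    while (($data = fgetcsv($handle, 0, ',', "'", '\\')) !== false) {
        if (count($data) < 4 || $data[0] === null || $data[0] === '') {
            continue; // skip blank or incomplete lines
        }
        // Task 2: store the array in the database; values are bound, not concatenated
        $stmt->execute(array_slice($data, 0, 4));
        $inserted++;
    }
    $db->commit();
    return $inserted;
}
```

In the upload handler from the question this would be called as importUsers($db, fopen($_FILES['csv']['tmp_name'], 'r')), so addslashes() is no longer needed.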

Returning one cell from Codeigniter Query

I want to query a table and only need one cell returned. Right now the only way I can think to do it is:
$query = $this->db->query('SELECT id FROM crops WHERE name = "wheat"');
if ($query->num_rows() > 0) {
$row = $query->row();
$crop_id = $row->id;
}
What I want is, since I'm selecting 'id' anyway, for that to be the result, i.e. $query = 'cropId'.
Any ideas? Is this even possible?
Of course it's possible. Just use AND in your query:
$query = $this->db->query('SELECT id FROM crops WHERE name = "wheat" AND id = {$cropId}');
Or you could use the raw power of the provided Active Record class:
$this->db->select('id');
$this->db->from('crops');
$this->db->where('name','wheat');
$this->db->where('id',$cropId);
$query = $this->db->get();
If you just want the id of every row in the result:
foreach ($query->result() as $row)
{
    echo $row->id;
}
Try this out, I'm not sure if it will work:
$cropId = $query->first_row()->id;
Note that you want to swap your quotes around: use " for your PHP strings and ' for your SQL strings. Otherwise the query will not be compatible with PostgreSQL and other database systems that check such things.
Beyond that, as Christopher told you, you can test the crop identifier in your query. However, in a PHP string delimited by '...' variables are not interpolated, so the PHP code he showed is wrong.
"SELECT ... $somevar ..."
will work better.
Yet there is a security issue in building such strings: it is very dangerous because $somevar could contain additional SQL and completely transform your SELECT into something that you do not even want to think about. Therefore, Active Record, as mentioned by Christopher, is a lot safer.
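The quoting point is easy to verify in plain PHP. The snippet below is only an illustration with a made-up value for $cropId; it builds the same query text with both quote styles and prints the result.

```php
<?php
$cropId = 7; // example value, assumed for illustration

// Single quotes: no interpolation, the text {$cropId} is sent to the database as-is
$single = 'SELECT id FROM crops WHERE id = {$cropId}';

// Double quotes: the variable is interpolated (still unsafe for untrusted input)
$double = "SELECT id FROM crops WHERE id = {$cropId}";

echo $single, "\n";
echo $double, "\n";
```

Interpolation only makes the string come out as intended. It does nothing about the injection risk, which is why bound parameters or Active Record remain the safer route.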

Quote value into Zend Framework 2

I'm working on an application using ZF2. In my application, I have to insert many rows into a database (about 900).
I've got a table model for this, so I first tried to do:
$table->insert(array('x' => $x, 'y' => $y));
in my loop. This technically works, but it is so slow that I can hardly insert half of the data before PHP's timeout (and I can't change the timeout).
Then I decided to use a prepared statement. I prepared it outside of the loop and executed it inside the loop... it was even slower.
So I decided to stop using the ZF2 tools, as they seem to be too slow for my case, and I wrote my own query. I'm using MySQL, so I can send a single query with all my values. But I can't find a method in any of the interfaces to escape my values...
Is there any way to do this?
Thank you for your help and sorry for my poor English.
If you want to perform raw queries you can do so using the Database Adapter:
$sql = 'SELECT * FROM '
    . $adapter->platform->quoteIdentifier('users')
    . ' WHERE ' . $adapter->platform->quoteIdentifier('id') . ' = ' . $adapter->driver->formatParameterName('id');
/* @var $statement \Zend\Db\Adapter\Driver\StatementInterface */
$statement = $adapter->query($sql);
$parameters = array('id' => 99);
/* @var $results Zend\Db\ResultSet\ResultSet */
$results = $statement->execute($parameters);
$row = $results->current();
Use transactions: http://dev.mysql.com/doc/refman/5.0/en/commit.html
That will help you decrease the execution time.
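For the single-query idea from the question, one way to avoid manual escaping altogether is to generate placeholders and bind the values. The sketch below is plain PHP and not a ZF2 API: buildMultiInsert() and the table name points are hypothetical, and the table and column names are assumed to come from trusted code.

```php
<?php
// Hypothetical helper: build one multi-row INSERT with positional placeholders.
// $table and $columns must come from trusted code, never from user input.
function buildMultiInsert(string $table, array $columns, array $rows): array
{
    $rowPlaceholder = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $sql = 'INSERT INTO ' . $table . ' (' . implode(', ', $columns) . ') VALUES '
         . implode(', ', array_fill(0, count($rows), $rowPlaceholder));
    // Flatten the row values in the same order as the placeholders
    $params = array();
    foreach ($rows as $row) {
        foreach ($columns as $column) {
            $params[] = $row[$column];
        }
    }
    return array($sql, $params);
}

list($sql, $params) = buildMultiInsert('points', array('x', 'y'), array(
    array('x' => 1, 'y' => 2),
    array('x' => 3, 'y' => 4),
));
echo $sql, "\n";
```

The resulting pair can be handed to any prepared-statement API, for example $pdo->prepare($sql)->execute($params) with plain PDO. The caller should skip the query when $rows is empty, since the generated SQL would then be invalid.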

Updating email addresses in MySQL (regexp?)

Is there a way to update every email address in MySQL with regexp? What I want to do is to change something@domain.xx addresses to something@domain.yy. Is it possible to do with SQL or should I do it with PHP for example?
Thanks!
You can search for a REGEXP with MySQL, but, unfortunately, it cannot return the matched part.
It's possible to do it with SQL as follows:
UPDATE mytable
SET email = REPLACE(email, '@domain.xx', '@domain.yy')
WHERE email REGEXP '@domain.xx$'
You can omit the WHERE clause, but that could lead to unexpected results (for example, @domain.xxx.com would be replaced with @domain.yyx.com), so it's better to keep it.
UPDATE tableName
SET email = CONCAT(SUBSTRING(email, 1, locate('@',email)), 'domain.yy')
WHERE email REGEXP '@domain.xx$';
I would rather do it with PHP, if possible. MySQL unfortunately does not allow capturing matching parts in regular expressions. Or even better, you can combine the two, for example like this:
$emails = fetchAllDistinctEmailsIntoAnArray();
# Make the array int-indexed:
$emails = array_values($emails);
# Convert the mails
$replacedEmails = preg_replace('/xx$/', 'yy', $emails);
# Create a new query
$cases = array();
foreach ($emails as $key => $email) {
    # Don't forget to escape the values (e.g. with mysqli_real_escape_string or similar)!
    $cases[] = "WHEN '" . escapeValue($email) .
        "' THEN '" . escapeValue($replacedEmails[$key]) . "'";
}
# WHEN clauses are separated by spaces, and a CASE expression ends with END
issueQuery(
    "UPDATE `YourTable`
    SET `emailColumn` =
    CASE `emailColumn` " .
    implode(' ', $cases) .
    " END");
Note that this solution will take quite some time and you may run out of memory or hit execution limits if you have many entries in your database. You might want to look into ignore_user_abort() and ini_set() for changing the memory limit for this script.
Disclaimer: Script not tested! Do not use without understanding/testing the code (might mess up your db).
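The conversion step on the PHP side can also be done without a regular expression. The following is a sketch under stated assumptions: replaceDomain() is a hypothetical helper, and it treats the domain comparison as case-insensitive. It changes an address only when it ends with exactly @ followed by the old domain.

```php
<?php
// Hypothetical helper: swap the domain only when the address ends with @$oldDomain.
function replaceDomain(string $email, string $oldDomain, string $newDomain): string
{
    $suffix = '@' . $oldDomain;
    $tail = substr($email, -strlen($suffix));
    if (strlen($email) > strlen($suffix) && strcasecmp($tail, $suffix) === 0) {
        return substr($email, 0, -strlen($suffix)) . '@' . $newDomain;
    }
    return $email; // leave non-matching addresses untouched
}

echo replaceDomain('bob@domain.xx', 'domain.xx', 'domain.yy'), "\n";
echo replaceDomain('bob@domain.xxx.com', 'domain.xx', 'domain.yy'), "\n";
```

Because the whole suffix is compared, addresses such as bob@domain.xxx.com or bob@sub.domain.xx are left as they are.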
I didn't check it, since I don't have MySQL installed, but it seems this could help you:
update table_name
set table_name.email = concat(substr(table_name.email, 1, position('@' in table_name.email)), 'new_domain');
PS. Regexp won't help you with the update, since it can only help you locate a specific occurrence of a substring in a string or check whether a string matches a pattern. Here you can find a reference to the relevant functions.