Need help with a complicated SQL update - sql

I have a table with many records. One field, Data, stores HTML, and the HTML in each record contains many IMG tags such as <img src='test.gif' />. A sample page is at http://www.bba-reman.com/content.aspx?content=bba_reman_diagnostics_tools
That page shows many product images, and all of its content comes from the table. I want to use the lazyload jQuery plugin, which needs each IMG tag to look like <img src="img/grey.gif" data-original="img/example.jpg" >, so I need to update the HTML stored in my table.
I need a SQL UPDATE statement that goes through the HTML in every row, finds the IMG tags inside a particular div (located by its ID), sets src to the fixed value src="img/grey.gif" for all images, and adds a data-original attribute such as data-original="img/example.jpg" to each IMG tag.
I know this is an awkward job for an UPDATE statement. Please suggest a good way to update all the IMG tags using SQL. Thanks.

Assuming that each value holds a single tag that ends in /> and contains no other slash (CHARINDEX returns the position of the first one), this would work:
UPDATE myTable
SET tag = LEFT(tag, CHARINDEX('/', tag) - 1) + 'data-original=''example.gif'' />'
However, that would not change the src value or the quotes, and it would not remove the closing slash before the end of the tag, as you have done in your question.
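Because the Data field holds whole HTML documents rather than single tags, one alternative is to read each row, rewrite the HTML in application code, and write it back. The following is only a minimal sketch of that rewrite step in Python, not a SQL solution: the function name to_lazyload and the PLACEHOLDER constant are made up for illustration, the regular expressions assume quoted src values and no ">" inside attribute values, and the sketch rewrites every IMG tag instead of only those inside one particular div.

```python
import re

# Placeholder image path taken from the question; adjust as needed.
PLACEHOLDER = "img/grey.gif"

# A whole <img ...> tag (assumes no ">" inside attribute values).
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
# A quoted src attribute preceded by whitespace, so "data-src" is not matched.
SRC_ATTR = re.compile(r"""(\ssrc\s*=\s*)(["'])(.*?)\2""", re.IGNORECASE)


def to_lazyload(html):
    """Point src at the placeholder and keep the real URL in data-original."""

    def rewrite_tag(tag_match):
        tag = tag_match.group(0)
        if "data-original" in tag:
            return tag  # already converted, leave it alone

        def swap(m):
            lead, quote, url = m.groups()
            return f"{lead}{quote}{PLACEHOLDER}{quote} data-original={quote}{url}{quote}"

        return SRC_ATTR.sub(swap, tag, count=1)

    return IMG_TAG.sub(rewrite_tag, html)
```

Each row's HTML would be passed through to_lazyload and saved with an ordinary parameterised UPDATE. Running it twice is harmless, because tags that already carry data-original are skipped.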

Related

Wordpress Database search and replace html and keep content

I have a problem. An old plugin created a lot of unnecessary tags in all my 700 WordPress blog posts.
Currently the html of every h2-tag looks like this:
<h2><a class="chartbeat-section" target="_blank" rel="nofollow noopener" name="name"></a>title</h2>
The outcome should be just:
<h2>title</h2>
Is it possible to get rid of the a-tag inside of every h2-tag with some sql query?
Thanks in advance.
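The transformation described above can be sketched outside the database. This is only an illustration in Python, assuming the unwanted anchor is always empty, always carries class="chartbeat-section", and always sits directly after the opening h2 tag; the function name drop_chartbeat_anchor is made up for this example.

```python
import re

# An empty <a class="chartbeat-section" ...></a> directly after an opening <h2>.
EMPTY_ANCHOR = re.compile(
    r'(<h2[^>]*>)\s*<a\b[^>]*class="chartbeat-section"[^>]*>\s*</a>',
    re.IGNORECASE,
)


def drop_chartbeat_anchor(html):
    """Remove the empty chartbeat anchor and keep the heading text."""
    return EMPTY_ANCHOR.sub(r"\1", html)
```

Anchors that wrap text, or that lack the chartbeat-section class, are left untouched.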

Remove HTML Tags from a text using SQL

I have a column in a table which contains HTML text (data containing HTML tags) and also normal text.
I need to remove the HTML tags from the data wherever they exist.
Steps I planned:
1. Filter only the records which contain HTML tags. --> I am able to complete this step. My logic: where HTMLString like('<%>%')
2. Replace HTML tags with a blank space. --> I am trying to apply the replace function, but I am not able to.
For Example:
<p>Paragraph</p>
<b>bold</b><I>Italic</I>
Normal Text
My output should be:
Paragraph
boldItalic
Normal Text
Can someone help me with step 2?
If you are using Oracle, try the following:
SELECT Regexp_replace(your_column_name, '<.+?>')
FROM your_table_name;
Example
SELECT Regexp_replace('<b>bold</b><I>Italic</I> Testing', '<.+?>')
FROM dual;
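The same non-greedy pattern can be tried out in Python to see what it does to the sample strings. This is only a sketch: the function name strip_tags is made up for illustration, and the pattern treats anything between "<" and the next ">" on one line as a tag, so text such as "a < b and c > d" would be damaged as well.

```python
import re

# Non-greedy: "<", one or more characters, then the nearest ">".
TAG = re.compile(r"<.+?>")


def strip_tags(text):
    """Remove everything that looks like an HTML tag."""
    return TAG.sub("", text)


print(strip_tags("<b>bold</b><I>Italic</I> Testing"))  # boldItalic Testing
```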

Selenium find all the elements which have two divs

I am trying to collect texts and images from a website to help collect missing people related tweets. Here is the problem:
Some tweets don't have images so the corresponding <div class='c' ....> has only one <div>...</div>.
Some tweets have images, so the corresponding <div class='c' ....> has two <div>...</div>, as shown in the following codes:
<div class='c' id="M_D*****">
<div>...</div>
and
<div class='c' id="M_D*****">
<div>...</div>
<div>...</div>
I intend to check whether a tweet has an image, i.e. find out whether the corresponding <div class='c' ....> has two <div>...</div>.
PS: The following codes are used to collect all the texts and image URLs but not all tweets have images so I want to match them by solving the above problem.
tweets = browser.find_elements_by_xpath("//span[@class='ctt']")
graph_links = browser.find_elements_by_xpath("//img[@alt='img' and @class='ib']")
This is a public welfare program, which aims to help the missing people go back home.
By collecting the text and the images separately, I think that it's going to be impossible to match the text with the related image after the fact. I would suggest a different approach. I would search for the <div class='c'...> that contains both the text and the optional image. Once you have the "container" DIV, you can then get the text and see if an image exists and put them all together. Without all the relevant HTML, you may have to tweak the code below but it should give you an idea on how to approach this.
containers = browser.find_elements_by_css_selector("div.c")
for container in containers:
    print(container.find_element_by_css_selector("span.ctt").text)  # the tweet text
    images = container.find_elements_by_css_selector("img.ib")
    if len(images) > 0:  # see if the image exists
        print(images[0].get_attribute("src"))  # the URL of the image
    print("-------------")  # separator between tweets
The HTML you provided is probably not enough to be sure, but based on it I suggest the XPath //div[@id='M_D*****' and ./div//img], which finds the div with the specified id that contains a div with an image.
But to answer your question directly:
//div[./div[2] and not(./div[3])] will find all divs with exactly 2 div children
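The "exactly two div children" rule can be checked without a browser. The sketch below uses Python's standard xml.etree.ElementTree on a small made-up fragment; the ids M_1 and M_2, the sample markup, and the function name are all hypothetical. ElementTree only parses well-formed XML and supports a limited XPath subset, so the count is done in Python rather than with not(...).

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample: one tweet without an image, one with.
SAMPLE = """
<body>
  <div class="c" id="M_1"><div>text only</div></div>
  <div class="c" id="M_2"><div>text</div><div><img src="a.jpg" alt="img" class="ib"/></div></div>
</body>
"""


def containers_with_two_divs(markup):
    """Return the ids of div.c elements that have exactly two direct div children."""
    root = ET.fromstring(markup.strip())
    ids = []
    for container in root.iter("div"):
        if container.get("class") != "c":
            continue
        if len(container.findall("div")) == 2:  # direct children only
            ids.append(container.get("id"))
    return ids


print(containers_with_two_divs(SAMPLE))  # ['M_2']
```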

dijit.InlineEditBox with highlighted html

I have some dijit.InlineEditBox widgets and now I need to add some search highlighting over them, so I return the results with a span with class="highlight" over the matched words. The resulting code looks like this :
<div id="title_514141" data-dojo-type="dijit.InlineEditBox"
data-dojo-props="editor:\'dijit.form.TextBox\', onFocus:titles.save_old_value,
onChange:titles.save_inline, renderAsHtml:true">Twenty Thousand Leagues <span
class="highlight">Under</span> the Sea</div>
This looks as expected, however, when I start editing the title the added span shows up. How can I make the editor remove the span added so only the text remains ?
In this particular case the titles of the books have no html in them, so some kind of full tag stripping should work, but it would be nice to find a solution (in case of short description field with a dijit.Editor widget perhaps) where the existing html is left in place and only the highlighting span is removed.
Also, if you can suggest a better way to do this (inline editing and word highlighting) please let me know.
Thank you !
How will this affect your displayed content in the editor? It rather depends on the contents you allow into the field - you will need a rich-text editor (huge footprint) to handle html correctly.
This regular expression strips the tags (the g flag applies it to every tag, opening and closing, not just the first one):
this.value = this.displayNode.innerHTML.replace(/<[^>]*>/g, '');
Here's a running example of the below code: fiddle
<div id="title_514141" data-dojo-type="dijit.InlineEditBox"
data-dojo-props="editor:\'dijit.form.TextBox\', onFocus:titles.save_old_value,
onChange:titles.save_inline, renderAsHtml:true">Twenty Thousand Leagues <span
class="highlight">Under</span> the Sea
<script type="dojo/method" event="onFocus">
this.value = this.displayNode.innerHTML.
replace(/<[^>]*>/g, '');
this.inherited(arguments);
</script>
</div>
The renderAsHtml attribute only trims 'off one layer', so embedded HTML will still be HTML, as far as I know. With the above you should be able to 1) override the onFocus handling, 2) set the editable value yourself and 3) call the 'old' onFocus method.
Alternatively, since you have already set 'titles.save_*' in props, use dojo/connect instead of dojo/method, but you need to get there first, so to speak.
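For the case where existing HTML should stay and only the highlighting span should go, the idea is to unwrap that one span instead of stripping every tag. The sketch below shows the pattern in Python purely as an illustration (the widget itself runs JavaScript); the function name remove_highlight is made up, and it assumes the span is written exactly as class="highlight" with no nested span inside it.

```python
import re

# Only the highlight span; other tags are left alone.
HIGHLIGHT = re.compile(r'<span\s+class="highlight">(.*?)</span>', re.DOTALL)


def remove_highlight(html):
    """Unwrap highlight spans, keeping their text and any other markup."""
    return HIGHLIGHT.sub(r"\1", html)


print(remove_highlight('Twenty Thousand Leagues <span class="highlight">Under</span> the Sea'))
# Twenty Thousand Leagues Under the Sea
```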

Split table data in SQL and replace with results

I need to remove a bunch of unneeded data from each table based on split parameters.
My SQL table stores a bunch of HTML for caching. The data is already in SQL and is growing quite large, so now I want to split off the data I don't use, based on a string, and update the table with the new results.
cacheHTML table is holding data like this
<html>
... (a bunch of data I don't need)
<first div>
... (the data I do want to save)
</div>
... (data I don't care about also)
</html>
I only want what's inside the first div and to remove all the HTML up to that point.
Is there any easy method for this? I need to do this to 5k rows of cached data...
I need a function or method that says: give me everything between string1 and string2, then replace the table contents with the results. Any help would be appreciated. Thanks!
You could do something like this. It will only work if you always need the text inside the first div in the HTML string. I'm assuming SQL Server as the database system, but it could probably be translated to others pretty easily.
Sample html string:
<html>
<head>
<title>Stuff i dont need</title>
</head>
<body>
<h1>Stuff i dont need</h1>
<p>I dont need any of this data</p>
<div>This is the data i need to save!</div>
<h3>Dont need this</h3>
<div>Wont need this either!</div>
<h3>Bye</h3>
</body>
</html>
SQL to do the update:
UPDATE cacheHTML
SET htmlText = REPLACE(SUBSTRING(htmlText, CHARINDEX('<div>', htmlText, 0), CHARINDEX('</div>', htmlText, 0) - CHARINDEX('<div>', htmlText, 0)), '<div>', '')
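The "everything between string1 and string2" logic of the statement above can be sketched in Python to check it against sample data before touching 5k rows. This is only an illustration: the function name first_div_content and its parameters are made up, and unlike the SQL it looks for the closing marker only after the opening one and returns None when either marker is missing.

```python
def first_div_content(html, start_marker="<div>", end_marker="</div>"):
    """Return the text between the first start_marker and the next end_marker."""
    start = html.find(start_marker)
    if start == -1:
        return None  # no opening marker in this row
    content_start = start + len(start_marker)
    end = html.find(end_marker, content_start)
    if end == -1:
        return None  # opening marker without a closing one
    return html[content_start:end]


sample = "<html><body><h1>x</h1><div>This is the data i need to save!</div><div>other</div></body></html>"
print(first_div_content(sample))  # This is the data i need to save!
```

Rows for which the function returns None can be skipped, so they are not overwritten with an empty value.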