How to edit HTML (tags) before it is executed by CppWebBrowser - webbrowser-control

I can't solve my problem, and I hope somebody knows how...
I have a TCppWebBrowser component on my form. When I navigate to a URL, after the document has been downloaded, I try to check and change the HTML source of the loaded document in the OnDocumentComplete() handler, before it is executed by the browser.
I need this because some websites have background sounds, and I want to parse the HTML and remove the tags, or just the text, that reference sound files such as *.wav, *.mid, *.swf, *.mp3, etc.
For example, if the HTML source has this line:
<NOEMBED><BGSOUND src="/images/ImagineCut.wav"></NOEMBED>
then I change it to:
<NOEMBED><BGSOUND src="/images/ImagineCut."></NOEMBED>
or I can delete the whole tag.
This way I want to mute the web browser, or even stop sounds from playing. Please take this method into consideration, because it would let me avoid all kinds of sounds once the HTML has been edited (before the browser executes it).
This is what I tried:
void __fastcall TForm1::CppWebBrowser1DocumentComplete(TObject *Sender,
    LPDISPATCH pDisp, Variant *URL)
{
    // Get the HTML document interface from the browser control
    IHTMLDocument2 *pHTMLDoc;
    CppWebBrowser1->Document->QueryInterface(IID_IHTMLDocument2, (LPVOID*)&pHTMLDoc);

    // Read the markup of the <body> element
    IHTMLElement *pElem;
    pHTMLDoc->get_body(&pElem);
    BSTR text;
    pElem->get_innerHTML(&text);

    text = Cleaning(text); // check the HTML and strip the sound references
    pElem->put_innerHTML(text);

    pElem->Release();
    pHTMLDoc->Release();
}

To do what you are asking, you would have to download the HTML file yourself from outside of the TCppWebBrowser component completely, alter the HTML as needed, and then push the new HTML into TCppWebBrowser using one of its IPersist... interfaces. Examples of doing that have been posted in the Borland/CodeGear/Embarcadero forums many times before.
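As a rough illustration of the "alter the HTML as needed" step only, here is a minimal sketch written in JavaScript rather than C++Builder code. The function name stripSoundTags and the extension list are assumptions made for this example; the list simply mirrors the extensions named in the question. It removes every BGSOUND tag and any EMBED tag whose src ends in one of those extensions. Regular expressions are a fragile way to process HTML, so treat this as a starting point, not a complete filter.

```javascript
// Extensions treated as sound/media files (taken from the question; adjust as needed).
const SOUND_EXTENSIONS = ["wav", "mid", "swf", "mp3"];

// Hypothetical helper: returns the HTML with sound-related tags removed.
function stripSoundTags(html) {
  const ext = SOUND_EXTENSIONS.join("|");

  // <bgsound ...> exists only to play audio, so every occurrence is removed.
  const bgsound = /<bgsound\b[^>]*>/gi;

  // <embed ...> is removed only when its src ends in one of the listed extensions.
  const embed = new RegExp(
    `<embed\\b[^>]*\\bsrc\\s*=\\s*["']?[^"'\\s>]*\\.(?:${ext})\\b[^>]*>`,
    "gi"
  );

  return html.replace(bgsound, "").replace(embed, "");
}
```

The same two patterns could be ported to whatever string or regex facility is available on the C++ side before the cleaned HTML is pushed into the control.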

Related

Parse Element ID within iframe using Webview in VB.net

I have a webpage that loads caller data when a caller calls. I am trying to parse the element IDs, but they are loaded in an iframe. How would I go about doing this?
The iframe is shown below; the elements are in default.html:
<iframe onload="UserFrameLoaded();" name="cmUserFrame" id="view_cmUserFrame" style="display: block; overflow: scroll;" marginheight="0" width="100%" height="1415px" frameborder="0" marginwidth="0" src="./CallManager_files/default.html"></iframe>
The code I'm using is:
Dim firstNameText As String = Await WebView21.ExecuteScriptAsync("document.getElementById('m.first_name').textContent")
I also tried:
Dim firstNameText As String = Await WebView21.ExecuteScriptAsync("document.getElementById('view_cmUserFrame').contentWindow.document.getElementById('m.first_name').textContent")
I'm not sure whether the iframe's source is on another domain. I'm hoping not...
I could probably just save the webpage locally and then load default.html directly, but I am not sure how to save the webpage with WebView2 either.
You are pretty much asking a JavaScript question; however, things might be easier for you if you understand a little more about WebView2 and how you can access those frames.
The first option is the FrameCreated event, which fires when cmUserFrame is created in the top document. From it you get a frame object, and you can then call ExecuteScriptAsync on that frame object.
Next, there is AddScriptToExecuteOnDocumentCreatedAsync, which injects your script into each document that is created. In this case your JavaScript could check the name of the document/frame and, if it is the right frame, continue with the rest of your script.
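The "check the name, then continue" idea for an injected script might look roughly like the sketch below. The helper name readFieldInFrame is made up for this example; the frame name and element id are the ones from the question. It relies on window.name inside an iframe being initialised from the iframe's name attribute. How the value gets back to the VB.NET side is deliberately left out.

```javascript
// Hypothetical helper: returns the element's text only when `win` is the wanted frame.
// `win` is the window object of whichever document the script was injected into.
function readFieldInFrame(win, frameName, elementId) {
  // Every document gets the injected script, so bail out in all frames but the target.
  if (win.name !== frameName) {
    return null;
  }
  const element = win.document.getElementById(elementId);
  // The element may not exist yet if the script runs before the DOM is ready.
  return element ? element.textContent : null;
}

// Inside the injected script this would be called once the DOM is ready, e.g.:
//   readFieldInFrame(window, "cmUserFrame", "m.first_name")
```

Passing the window in as a parameter, instead of using the global directly, keeps the function easy to try out in the dev tools console against any frame.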
As for your actual JavaScript, it is difficult to say what is wrong without seeing the page. Open the dev tools of either Chrome or Edge, switch to the console, and paste your JavaScript there to see whether it works there first. If it works there, you have a timing issue. If it does not, you need to work out what is not being found.
Updated From Comments
To help you understand the WebView2 frame events, here is an issue from GitHub.
One more link from the WebView2 GitHub issues.
Specifically, to answer your questions:
1. Implement FrameCreated.
2. Wait for the frame with the name you want.
3. Wait for the frame's DOMContentLoaded event.
4. Call ExecuteScriptAsync ON THE FRAME you get from step 2.
The code would then be as follows, assuming that the frame's document is as you have it in your JS:
Dim firstNameText As String = Await WebView21FRAME.ExecuteScriptAsync("document.getElementById('m.first_name').textContent")

webkit : how to get the actual content of a page after content was added via javascript?

I want to get the actual content of a page I loaded into a webview after some content has been updated by some jquery
$(document).ready(function() {
$("#main").append('<p>Test</p><p>Test</p><p>Test</p><p>Test</p><p>Test</p><p>Test</p><p>Test</p>');
});
After the page has been updated I tried to get the content of page with the following command [vala syntax]
web_view.get_main_frame ().get_data_source().get_data().str
but I only get the original content (even if loading is finished)
using
web_view.get_dom_document().document_element.text_content
I get the actual content but the tags are removed.
I guess I could walk the whole tree to get the actual document, but there should be an easier way to do it.
EDIT:
my solution
this.web_view.load_finished.connect ((source, frame) => {
    // Print the live body markup; "%s" keeps any % in the page out of the format string
    stderr.printf("%s\n", this.web_view.get_dom_document().body.get_inner_html());
});
I'll probably find this awful when I read it some years from now, but for now I'll go with that.
In the HTML DOM, elements implement the HTMLElement interface. The HTMLElement interface in WebKit includes the outerHTML property. This property returns a string containing the serialized markup of the element and all of its children. I'm not familiar with Vala but based on your code snippets this would be accessed like so:
web_view.get_dom_document().document_element.outer_html
The correct answer in Vala is this:
webview.get_dom_document().get_body().get_inner_html ()

WP8 WebBrowser: inline script works, loaded doesn't

Windows Phone 8 app, WebBrowser control. I load a chunk of HTML via NavigateToString (after setting IsScriptEnabled=true). Some time later (long after it's loaded), I'm invoking some JavaScript on the page with InvokeScript.
When I invoke a JavaScript function that's defined inline inside a <script> element, it works as expected. When I invoke one that's defined in an external JS file, it doesn't, and an exception from HRESULT 0x80020006 ("name not found") is thrown.
The external script file is loaded from my app package. In the HTML string, there's a <base> element which contains a file:// URL to the package's folder (retrieved via Package::Current->InstalledLocation), and the <script> element contains just the file name. There are also styles and images in that folder - they load fine.
The HTML has no DOCTYPE and no xmlns - I know those things can sometimes throw JavaScript off.
The external script file is valid - it came straight from Android where it worked on the respective WebView control. The function I'm trying to invoke is empty anyway, to be on the safe side, JavaScript syntax-wise.
This could in theory be some kind of a cross-domain scripting issue. Technically, the script comes from a file:// URL while the page itself comes from no URL at all. Some piece of system code that makes sure no fishy script is called could've gotten in the way.
Found one workaround: load the external script file into a string on startup and, once the HTML is loaded (LoadCompleted fires), feed it to the document using JavaScript eval.
Here is an example of how to inject a script dynamically:
Browser.InvokeScript("eval", new string[] { FileUtils.ReadFileContent("app/www/js/console.js") });
where ReadFileContent could be defined as follows:
https://github.com/sgrebnov/IeMobileDebugger/blob/master/Libraries/Support/FileUtils.cs
Full example
https://github.com/sgrebnov/IeMobileDebugger/blob/master/Libraries/IE.Debug.Core/WebPageDebugger.cs
PS. instead of reading script from file you can pass hardcoded string, etc
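To see why the eval workaround makes the function callable by name afterwards, here is a small plain-JavaScript sketch. The script text, the function name onNativeEvent, and the invokeScript helper are all made up for illustration; invokeScript only imitates a lookup by name on the global object and is not the WebBrowser API. The sketch assumes that eval invoked from the host behaves like an indirect eval, meaning the text runs in global scope.

```javascript
// Stand-in for the contents of the external .js file (hypothetical function name).
const externalScriptSource =
  "function onNativeEvent(name) { return 'handled ' + name; }";

// Indirect eval: the text is evaluated in global scope, so the function
// declaration becomes a property of the global object.
(0, eval)(externalScriptSource);

// Imitation of "invoke a script function by name": look it up on the global object.
function invokeScript(name, ...args) {
  const fn = globalThis[name];
  if (typeof fn !== "function") {
    throw new Error("name not found: " + name);
  }
  return fn(...args);
}
```

Before the eval call, invokeScript("onNativeEvent", "resume") would throw the "name not found" error; afterwards it returns the function's result.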
Are you sure that your script is being loaded? One thing you can do is tuck an alert in there to make sure it is being loaded. My suspicion is that it isn't being loaded.
Any time I have run into this before that has been the case although admittedly I haven't loaded a JS file from Isolated Storage before.

Changing a JavaScript var in Processing.js and loading a different txt file into HTML depending on the var

These two questions arise in the same HTML page.
I have a .pde file that detects the user's mouse clicks. There are a few objects in it; if the user clicks on the first one I get '1' as output, and so on. I have created a var in JavaScript to store the output, but how can I change the var's value from Processing.js?
How can I load a different txt file into a div in the HTML depending on the var returned by Processing.js? Once part one is done, I want to load a different txt file into the div depending on which object the user clicked (e.g. if object 2 is clicked, text2.txt will be loaded into the div).
1: You don't need a special JavaScript var for this; you can track it in your sketch until you need the result on your page.
2: Make your sketch tell JavaScript what to do when you click. See http://processingjs.org/articles/PomaxGuide.html#interface for how to make a sketch and JavaScript talk to each other while keeping your code runnable both on the web and in the Processing IDE.

Alter Rendered Page in Webbrowser Control

Is there a way to alter the rendered HTML page in a WebBrowser control? What I need is to alter the rendered HTML page in my WebBrowser control to highlight selected text.
What I did was use a WebClient and call WebClient.DownloadString() to get the source code of the page, highlight the specific text, then write it back into the WebBrowser. The problem is that the images on that page do not appear, since they are referenced by relative paths.
Is there a way to solve this problem? Is there a way to detect images in a WebBrowser control?
Not sure why you need to change the HTML to highlight text; why not use IHighlightRenderingServices?
To specify a base url when loading HTML string you need to use the document's IPersistMoniker interface and specify a url in your IMoniker implementation.
I suggest you do it a different way: download the page and replace the text using the WebBrowser control itself, so that your links keep working. All you do is take whatever is in the search TextBox and wrap it; say the search term is "hello", then you replace all occurrences of hello with the following:
<font color="yellow">hello</font>
Of course, the FONT tag can be replaced with a SPAN tag (which is an inline version of the DIV tag, so your lines won't break with SPAN, but will with DIV). In either case, both of these tags have a style attribute, where you can use CSS to change the color or any other CSS property, as follows:
<SPAN style="background-color: yellow;">hello</SPAN>
Of course, there are a zillion other ways to change color using HTML, feel free to search the web for more if you want.
Now, you can use the .Replace() function in .NET to replace the searched text; it's very easy. Get the whole document as a string using .DocumentText and, once all occurrences are replaced (using .Replace()), set it back to .DocumentText (so you use .DocumentText to get the original string, and set .DocumentText to the replaced string). Of course, you probably don't want to do this to text inside the actual HTML tags, so you can instead loop through all the elements on the page with a For Each loop like the one below:
For Each someElement as HTMLElement in WebBrowser1.Document.All
And each element will have a .InnerText/.InnerHTML and .OuterText/.OuterHTML that you can Get (read from) and Set (overwrite with replaced text).
Of course, for your needs, you'd probably just want to be replacing and overwriting the .InnerText and/or the .OuterText.
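The caveat about not touching "items inside the actual HTML" can be sketched on a plain string as well. The version below is written in JavaScript, not .NET, and the names highlightTerm and escapeRegExp are made up for the example. It splits the markup into tags and the text between them, and wraps the search term only in the text parts, so a term that also appears in a tag name or attribute value is left alone. It does not handle script or style blocks, or a ">" inside an attribute value, so it is an illustration and not a robust highlighter.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wrap every occurrence of `term` found in text between tags.
function highlightTerm(html, term) {
  if (!term) {
    return html;
  }
  const needle = new RegExp(escapeRegExp(term), "g");
  const wrapped = '<span style="background-color: yellow;">$&</span>';

  // Splitting on a capturing group keeps the tags: even indexes are text,
  // odd indexes are the tags themselves.
  return html
    .split(/(<[^>]*>)/)
    .map((part, index) => (index % 2 === 1 ? part : part.replace(needle, wrapped)))
    .join("");
}
```

For instance, with the search term "hello", a link whose href is hello.html keeps its href untouched while the visible word hello is wrapped in the SPAN.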
If you need more help, let me know. In either case, I'd like to know how it worked out for you, or whether there is anything more any of us can do to help with your problem. Cheers.