What is "Chrome Client" in relation to WebKit? - webkit

The WebKit source code and documentation uses the term "Chrome Client" often to describe a certain class used for front-end display. I'm confused by what the term "Chrome" refers to, as it seems to be unrelated to the Google Chromium port. It's difficult to search for any information about this on the web, because the search terms "chrome" and "client" inevitably bring up results relating to the Google Chrome browser, or merely give me links to the WebKit source code.
Can anyone explain what Chrome Client is, and what "Chrome" means in this context?

ChromeClient is an abstract interface that WebCore uses to interact with the multiple WebKit API layers that are built on top of WebCore. Its functionality centers around the user interface (aka "chrome") aspects of the view containing a particular WebCore Page. This abstraction is important as there are a number of separate API layers built on top of WebCore, and how each API layer handles the user interface can differ even between API layers running on the same OS (for instance, WebKit and WebKit2 have quite different needs).
As a simple example, ChromeClient::runJavaScriptAlert is called by the JavaScript alert function. The implementation of runJavaScriptAlert for the Cocoa WebView class simply calls into the appropriate WebUIDelegate methods, as one would expect. Other cases, such as the display of tooltips, are handled entirely by the concrete ChromeClient implementation directly, without involving any of WebView's delegates.
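ChromeClient itself is a C++ interface inside WebCore, but the shape of the delegation is easy to sketch. Here is a rough JavaScript analogy (names invented for illustration; this is not WebKit code):

// Rough analogy only: the engine core holds an abstract "chrome client"
// supplied by the port, and asks it to do all UI work.
function Page(chromeClient) {
  this.chromeClient = chromeClient;
}
Page.prototype.runAlert = function (message) {
  // The core never draws UI itself; it delegates to the embedder's client.
  this.chromeClient.runJavaScriptAlert(message);
};

// A hypothetical port supplies the concrete behavior:
var cocoaLikeClient = {
  runJavaScriptAlert: function (message) {
    // e.g. forward to a UI delegate, as the Cocoa WebView port does
    console.log('UI layer shows alert:', message);
  }
};
new Page(cocoaLikeClient).runAlert('hi');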

ChromeClient is an interface that delegates the display of GUI elements, such as alerts, popup windows, prompts, and windows (window.open), to the WebKit ports.
Broadly speaking, it covers window-related operations such as scrolling, requesting repaints by invalidating portions of the window, and so on.
Each WebKit port provides its own implementation by overriding the ChromeClient interface. For example, the Qt port has ChromeClientQt.h and ChromeClientQt.cpp, and the GTK port has ChromeClientGtk.h and ChromeClientGtk.cpp.

WebKit is a web browser rendering engine used by Safari and Chrome (among others, but those are the popular ones).
The -webkit prefix marks CSS properties that only this engine is intended to process, much like -moz properties for Gecko. Many of us are hoping this goes away; for example, -webkit-border-radius will be replaced by the standard border-radius, and you won't need multiple rules for the same thing across browsers. This is really the result of "pre-specification" features that are intended not to interfere with the standard version when it arrives.
For your update: no, it's not really related to IE; IE (at least before 9) uses a different rendering engine, called Trident.


Appcelerator Hyperloop vs. Plain Titanium Modules

I've started playing around with Appcelerator Hyperloop. While it seems great to be able to access native APIs from JS from day zero, it does raise a few questions about the architecture of the platform and about performance.
Currently (AFAIK) a Titanium app has a main UI thread (that runs the native UI controllers) and a JS thread (that runs the JS logic). Each call from JS to native is passed through the "Bridge" (which is the expensive operation in an app).
Also, the Titanium API doesn't cover all of the native APIs and abstracts as much as it can. But if new APIs are introduced, it can take time for Appcelerator to implement them in the platform.
One of my favorite things about Titanium is the ability to extend it (using Objective-C for iOS and Java for Android), allowing the use of native APIs that are not covered by Titanium, and also the development of truly native, high-performance controls in case we need to do anything that's too "heavy" for JS. And, as mentioned, such modules are developed 100% natively for each platform.
Now that Appcelerator has introduced Hyperloop, I've done a simple test app and saw that Hyperloop is not translated into native code but just into normal JS code:
var UILabel = require('hyperloop/uikit/uilabel'); // class resolved by Hyperloop at runtime
var label = new UILabel();                        // instantiates the native UIKit class
label.text = "HELLO WORLD!";
$.index.add(label);                               // $.index is the Alloy controller's top-level view
Another thing is that you have to run on the main thread.
So basically a few things come to mind here as far as Hyperloop architecture goes:
Do we still have a bridge? If Hyperloop is JS that calls a "special" Hyperloop require, then we still have a bridge, one that now not only acts as a bridge but also needs to do some sort of reflection (which is also an expensive operation)?
Until now JS ran in its own thread, so running everything on the single main thread seems like a potential source of more UI-blocking operations.
The old-fashioned modules were truly native (not counting the bridge call), so how do Hyperloop-enabled apps compare with those?
There isn't much documentation or many articles about Hyperloop explaining its inner workings yet, so answers from anyone who has been trying apps with it would be very helpful.
Answering your questions directly:
There are no Kroll proxies involved anymore, since the actual classes are generated at runtime. This is done using the hyperloop-metabase, which does reflection (as you already said) to build an AST that grabs the actual signatures, types, classes, methods, properties, etc.
We have not seen any performance issues with running on the main thread so far. If you do, please file a JIRA ticket so we can investigate the use case.
The old modules were "less native" than now, simply because they were all wrapped by the Kroll proxy (by extending every view from TiUIView and every proxy from TiProxy / TiViewProxy). Hyperloop does not work with those, making module development much faster, and also allowing developers to test their code live in their app without the need to package and reference the module manually. Hyperloop modules are nothing other than CommonJS modules, which are already used frequently across Alloy and other Ti components.
I hope that gives you a quick overview on how Hyperloop works. If you have further questions, let us know!
Hans
(As a detailed answer to the above comment)
So let's say you have a tableview in iOS. The native class is UITableView and the Titanium-API is Ti.UI.TableView / Ti.UI.ListView.
While the ListView already provides a huge performance boost compared to the TableView by abstracting the child-API usage into templates, those child APIs (Ti.UI.Label, Ti.UI.ImageView, ...) are still custom classes that are wrapped and provide custom logic (!), e.g. keeping track of their parent references, internal data structures, and locks to jump between the threads.
If you now check the Hyperloop example of a native UITableView, you access the native APIs directly, so no proxy behind it needs to manage sections, templates, items, etc. Of course we deliver that API through a Kroll proxy in order to display it in Titanium, but you don't "jump across the bridge" with every call you make from the SDK.
The easiest way to see that is to actually run some bigger examples like the tableview, collectionview, and view-animation ones. If you do a fast scroll through these, you can already feel the performance boost compared to "classic" Titanium APIs, simply because the only communication between your proxy and the view you want to add it to (like a Ti.UI.Window) is the .add() call, which receives the native API of the type HyperloopClass.
Finally, of course it still makes sense to use Ti.UI.ListView, for example, because it comes with the built-in utilities that Titanium devs love (events, easy configuration, and layout handling). But that's also where the benefit of Hyperloop comes in, by allowing developers to access those APIs themselves.
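As a rough illustration, following the require style from the snippet in the question (hedged: exact module paths and what .add() accepts may differ between Hyperloop versions):

var UITableView = require('hyperloop/uikit/uitableview');
var table = new UITableView();
// "table" is a handle to the actual UIKit view; no Ti.UI.TableView
// proxy logic sits between you and UIKit
$.index.add(table); // the single .add() hop described above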
I hope that helps a bit more to understand it.

From a technical perspective, how does Selenium click an element on a web page?

Context is provided in case anyone knows of an alternative way to solve the larger issue.
Problem Context
I am spearheading the development of a test automation framework for a web application which uses Web Components. This has presented a problem when testing in Internet Explorer, because Internet Explorer does not support Web Components natively; instead, a polyfill is used to provide this functionality.
A primary repercussion of this is that much of Selenium will not work as expected. It cannot 'see' the Shadow DOM in Internet Explorer the way it can in Firefox and Chrome.
The alternative is to write a test framework which provides an alternate mechanism for accessing elements via JavaScript - this allows elements to be located through the polyfill.
Our current implementation checks the WebDriver being used, and either uses the original Selenium implementation of a method (in the case of Chrome or Firefox), or our own alternative implementation (in the case of Internet Explorer).
This means that we want our implementation to be as close as possible to Selenium's implementation at its core, browser-interaction level.
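To illustrate the shape of this dispatch, here is a simplified sketch using the Node selenium-webdriver bindings (our actual framework differs; the document.querySelector lookup stands in for whatever polyfill-aware element resolution you implement):

const { By } = require('selenium-webdriver');

// Dispatch between Selenium's native click and a polyfill-aware JS click.
async function clickElement(driver, cssSelector) {
  const caps = await driver.getCapabilities();
  if (caps.get('browserName') === 'internet explorer') {
    // IE: the polyfill flattens the "shadow DOM" into the page, so a
    // plain querySelector run in the page can reach the element
    await driver.executeScript(
      'document.querySelector(arguments[0]).click();', cssSelector);
  } else {
    // Chrome/Firefox: Selenium's own implementation works as expected
    await driver.findElement(By.css(cssSelector)).click();
  }
}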
Problem
I am trying to replicate the functionality of Actions.click(WebElement onElement) (source), in a simplified form (without following the Builder design pattern of the Actions class, and making assumptions that the click is with the left mouse button and no other keys (Ctrl, Shift, Alt) are being held down).
I want to find the core code which handles the click (specifically in Chrome, Firefox, and Internet Explorer), so I can replicate it as closely as possible; however, I've found myself lost in a deep pit of classes and interfaces...
A new ClickAction (source) is created (to later be performed). Performing this includes a 'click()' call on an instance of the Mouse interface (source) ... aaaaand I'm lost. I see from generated JavaDoc that this is implemented by either EventFiringMouse (source) or HtmlUnitMouse (source), but I'm not sure which implementation will be used. I made an assumption (with little basis) that HtmlUnitMouse would be used, which has led me further down the rabbit hole, looking at HtmlUnit code from Gargoyle Software...
In short, I am totally lost.
Any guidance would be much appreciated :)
Research
I have found that I was incorrect in my assumption that HTMLUnit is used by Chrome, Firefox, and Internet Explorer. Documentation shows that RemoteWebDriver (source) is subclassed by ChromeDriver, FirefoxDriver, and InternetExplorerDriver.
The Essentials
The drivers for Chrome, Firefox, and Internet Explorer are all RemoteWebDrivers.
This means that any actions which Selenium performs are sent to the browser (the WebDriver) via an HTTP request.
Once the request is received by the browser, it will perform the action as either a "native event" or synthetically. How a browser executes an action depends on the capabilities of the browser (and potentially a flag option).
"Native" events are OS-level events.
Actions executed synthetically are executed using JavaScript. "Automation Atoms" are used; as one infers from 'atom', they are small, simple functions that perform low-level actions.
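To make "synthetic" concrete, here is a minimal sketch of a purely synthetic left-click in page JavaScript; this is the spirit of the atoms, not Selenium's actual atom code:

// Fire the event sequence a real left-click would produce,
// entirely from JavaScript.
function syntheticClick(element) {
  ['mousedown', 'mouseup', 'click'].forEach(function (type) {
    element.dispatchEvent(new MouseEvent(type, {
      bubbles: true,
      cancelable: true,
      view: window,
      button: 0 // left mouse button
    }));
  });
}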
References
RemoteWebDriver is subclassed by ChromeDriver, FirefoxDriver, InternetExplorerDriver, OperaDriver, and SafariDriver. (reference)
All implementations of WebDriver that communicate with the browser, or a RemoteWebDriver server shall use a common wire protocol. This wire protocol defines a RESTful web service using JSON over HTTP. (reference)
In WebDriver advanced user interactions are provided by either simulating the JavaScript events directly (i.e. synthetic events) or by letting the browser generate the JavaScript events (i.e. native events). Native events simulate the user interactions better whereas synthetic events are platform independent [...] Native events should be used whenever possible. (reference)
Browser Automation Atoms are building blocks intended to be used by Selenium implementations. By using the same pieces throughout the codebase, rather than reimplementing required functionality in multiple places, the project can reduce the number of bugs found, and can simplify the process of adding new functionality and drivers. (reference)
Automation Atoms
A summary of the available Automation Atoms
The raw JavaScript code for the Automation Atoms - this may serve as a useful starting point in developing simpler synthetic events, if necessary.
The wiki for the Selenium IE Driver states that it uses native events rather than JavaScript events to interact with the browser.
As the InternetExplorerDriver is Windows-only, it attempts to use so-called "native", or OS-level events to perform mouse and keyboard operations in the browser. This is in contrast to using simulated JavaScript events for the same operations.
Except for clicking <option> elements, where it uses JavaScript.
The IE driver handles this one scenario by using the click() Automation Atom, which essentially sets the .selected property of the element and simulates the onChange event in JavaScript.
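Roughly, and only as a paraphrase of the behavior described above (not the atom's actual source):

// Hedged sketch: set .selected directly, then simulate the change
// event on the owning <select>, as the atom is described to do.
function clickOptionAtom(option) {
  option.selected = true;
  option.parentNode.dispatchEvent(new Event('change', { bubbles: true }));
}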

May I use video.js as a minimalistic polyfill?

I'd like to have a library that would offer a JavaScript API to control a player and manage its events, nothing more.
All the GUI would (optionally) not be part of the library. I've tried to set up a player without controls, but even in that case the GUI is created in the DOM, just not shown.
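For reference, what I tried looks roughly like this ('my-video' being the id of my <video> element):

var player = videojs('my-video', { controls: false });
// even with controls disabled, the control-bar DOM is still created,
// just hidden; I'd like it not to be created at all
player.on('timeupdate', function () {
  console.log(player.currentTime());
});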
I can see two benefits: I can reuse my previous GUI more easily, and the video.js script is smaller.
But this also questions the nature of the polyfill. Adding a track to manage subtitles without an interface to render them would not make a true polyfill. There would be two kinds of polyfill: the first would just let the browser play the video, and the second would create a consistent graphical interface to manage all the player's features.
The answerable question is: does video.js offer a way to provide only a JS API (and modify the DOM when Flash is required)?
If there is no such feature, is it an option for the future (and why not)?
Thank you all!

Testing a "Dojo" web application with Selenium

Has anyone done some extensive automation with Selenium and a Dojo-heavy web app? I'm looking for any issues or problem that you might have run into or issues related directly to the combination of Selenium and Dojo.
I've used Selenium extensively with a bunch of different web apps, including a few on Dojo. You should be fine. One practice I would recommend is to make sure all the components you'll be testing (both the UI controls you'll be driving and the text components you'll be reading for verification) have ID tags set. Selenium has a bunch of elegant selectors to get at the elements you need, but selection by ID is still the best; the other methods can be more brittle.
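For example, with the modern WebDriver bindings (shown in JavaScript; 'saveButton' and the XPath are made up for illustration):

const { Builder, By } = require('selenium-webdriver');

async function example() {
  const driver = await new Builder().forBrowser('firefox').build();
  try {
    // selection by ID: robust as long as the ID is stable
    await driver.findElement(By.id('saveButton')).click();
    // layout-bound XPath: brittle, breaks when the structure shifts
    // await driver.findElement(By.xpath('//div[3]/form/span[2]/button')).click();
  } finally {
    await driver.quit();
  }
}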
I've had some challenging experiences with Selenium RC not being as compatible with my code as Selenium IDE, to the point that I stopped using Selenium RC. And in case you are not super familiar with Selenium, you should be aware that it doesn't natively support some (IMO) pretty fundamental features like flow control and includes; but there are user extensions to the framework that allow this. I'd also recommend taking a look at Watir which I now generally prefer over Selenium because it exposes the full power/flexibility of a first class language (Ruby).
I'm working on a Dojo-heavy app right now, and am writing a number of tests with Selenium IDE. I've run into a few issues with certain Dojo elements, such as drop-down menus and tabbed components. I've learned to appreciate XPath, and have been experimenting with the clickAt and waitForElementPosition commands, which seem to help accommodate some of Dojo's features.
Dojo specifics - very brief
Dojo itself differs in some approaches from other heavy-DOM, feature-extensive frameworks (like ExtJS, jQuery, YUI).
One general Dojo specific is that it works around browser limitations by using Flash (YUI does as well) or Silverlight.
Here are a couple of scenarios where Dojo can use Flash:
the browser is not HTML5-capable and JavaScript needs local storage; Dojo will then use a "Flash cookie", i.e. Flash Local Shared Objects (package dojox.storage)
cross-domain HTTPS calls need to be supported.
The general tricks that can turn your testing into something difficult:
browser messages, like "do you wish to allow this site..."
nested frames can make the selection of the node difficult
JavaScript timeouts/intervals might run at a different speed in Selenium than in a real browser. Yes, they can.
The biggest issue I encountered was the fact that dojo menus, and pop-up UI elements in general, are absolutely positioned as children of the body element and are not children of the element that creates them.
This can impact how you write Selenium CSS selectors and, in my case, made it a bit more challenging to automatically crop a screenshot that includes a menu and its dropdown.
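As an illustration (hedged: .dijitPopup is the wrapper class dijit typically uses, so verify against your own DOM), the selector has to be scoped to body rather than to the widget that spawned the menu:

const { By } = require('selenium-webdriver');

async function findOpenMenuItem(driver) {
  // wrong: looks for the dropdown inside the originating widget
  //   await driver.findElement(By.css('#myMenuButton .dijitMenuItem'));
  // works: dijit appends popups to <body> in a wrapper element
  return driver.findElement(By.css('body > .dijitPopup .dijitMenuItem'));
}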
Selenium should be fine with Dojo because it's rendered in Firefox and not on its own. Just make sure Dojo is available when testing (i.e. don't connect to Google's CDN if your test environment doesn't have an internet connection). But that's a problem you'd have with any external resource.
I have no experience, but did see http://www.ibm.com/developerworks/opensource/library/os-webautoselenium/index.html discussing how to use Selenium with dojo
If you need to test in an SSL environment and you use Selenium RC's trustAllSslCertificates + proxy, you must make sure all of your JS files are hosted on the same domain. I've seen problems recently with using CDNs to load JS and image files when testing under recent Firefox versions and Selenium RC.

How is it possible to access functions of app A from app B

I was wondering whether, and in how many ways, an app can access specific functions of another app.
For example:
open a URL in Safari/Firefox/Chrome
run a JavaScript snippet in the current browser tab
play/pause iTunes
rename selected files in Finder
I am aware of the existence of AppleScript, but I was wondering whether that's the only way I have to interact with those apps and others.
thanks
There are three main ways an app exposes its functionality to the outside world.
One is by supporting a URL protocol. To open a URL, just use NSWorkspace. There are many methods; if an app registers a specific protocol like x-my-app://some-work, you can just do
[[NSWorkspace sharedWorkspace] openURL:[NSURL URLWithString:@"x-my-app://some-work"]];
If you want to open a URL whose protocol (say http) is supported by many apps and you want to specify which app to use, use openURLs:withAppBundleIdentifier:options:additionalEventParamDescriptor:launchIdentifiers:.
Another is System Services. With this, an app can add entries to the Services menu and to the context menus of other apps; you can also invoke a service programmatically.
Otherwise, it's via Apple events. AppleScript is one way to deal with them, but not the only one; it's just a language for issuing Apple events. There are many ways to deal with Apple events from Cocoa; see this detailed document by Apple.
Basically, an app can export its internals in an object-oriented manner (which is not just its Objective-C class hierarchy; you can control how much of its internal objects and methods you expose, etc.) via an sdef file. Then another app can use this object-oriented system via Apple events.
To send and receive Apple events, you can of course construct them by hand, but you can also use higher-level mechanisms like
AppleScript via NSAppleScript
Scripting Bridge
or AppScript.
To learn which aspects an app exposes, just open the AppleScript Editor, choose the menu File → Open Dictionary, and choose an app.
Now, it's rather hard to use features of an app which the app does not expose via any of these methods. You still have a few workarounds.
UI Scripting. This is done by sending Apple events to a headless app called System Events, which is one of the core programs in OS X. This way, you can programmatically emulate clicking a button, choosing a menu item, etc. in another app. So, almost anything you can do through another app's GUI can be done programmatically. To see the hierarchy of UI objects accessible from UI scripting, use a utility which comes with the Xcode tools, at
/Developer/Applications/Utilities/Accessibility Tools/Accessibility Inspector.app
This is very rudimentary but does the job; if you regularly use UI scripting, consider obtaining UI Browser, as Zygmunt suggests.
Finally, if you want to use a non-GUI, non-exposed feature of another app, you can inject code into that app.
Just expanding on Yuji's answer. If you were forced to go the UI scripting path, there's a nice application for analyzing the interface: http://pfiddlesoft.com/uibrowser/. However, the examples you mentioned should expose some APIs.
I might also recommend using Sikuli (http://groups.csail.mit.edu/uid/sikuli/) as an IDE to script around user interfaces robustly.
For some applications, usually ones coming from GNU/Linux, there is D-Bus (http://en.wikipedia.org/wiki/D-Bus), although I haven't used it on a Mac myself yet. And let me also quote Wikipedia about Cocoa: "It is one of five major APIs available for Mac OS X; the others are Carbon, POSIX (for the BSD environment), X11 and Java." (http://en.wikipedia.org/wiki/Cocoa_%28API%29) That's just a loose tip for further exploration, as Yuji has already explained the Apple events that are key to your question.