Hopefully somebody can verify this or shed some insight.
The most basic use of the player.loadVideoByID() method does not seem to work in IE.
I have deployed the Google getting-started sample code from:
https://developers.google.com/youtube/iframe_api_reference
to my site to illustrate the issue.
You can use this link and view the source to easily see it in action.
http://www.fuhshnizzle.com/YT.html
You will also note that the sample, which autoplays in Chrome, Firefox and Safari, merely shows the play icon in IE.
So that is really two bugs:
no autoplay, and loadVideoByID() throws an exception in all versions of IE.
I have wrapped the loadVideoByID() call in a try/catch, and you can see that it fails in IE but is fine in Chrome, Firefox and Safari.
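A minimal sketch of such a try/catch wrapper could look like the following. The helper name `safeLoadVideo` and its return shape are made up for illustration and are not taken from the linked page; note that the IFrame API reference spells the method `loadVideoById` (lowercase "d"), so that spelling is used here.

```javascript
// Hypothetical helper (not part of the YouTube API): calls loadVideoById on a
// player object and reports the outcome instead of letting the exception escape.
function safeLoadVideo(player, videoId) {
  try {
    player.loadVideoById(videoId);
    return { ok: true, error: null };
  } catch (err) {
    // A browser in which the call throws ends up in this branch.
    return { ok: false, error: err };
  }
}
```

Logging `result.error` from the returned object in each browser would show whether the call itself throws or whether the failure happens elsewhere.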
I would imagine this affects every IE app that uses this YouTube API method and should therefore be a major priority for the Google/YouTube team. Since Stack Overflow has the best developer response times, I thought I would reach out for support here.
Thanks very much!
I have a question regarding executeJs function.
page.executeJs("$0.click();", downloadAnchor.getElement());
This line of code does not work in Safari on a real iPhone, though it works in responsive design mode in desktop Safari. I would appreciate your help with this.
Browsers will be "suspicious" of anything starting a download that isn't a direct reaction to interaction by the user. This is done as a security precaution since starting to download files without the user's explicit consent can be dangerous in specific cases. Different browsers and configurations have different policies for exactly where to draw the line.
In your case, the download isn't started as a direct consequence of user interaction but as a direct consequence of receiving a message from the server. This kind of pattern will simply not work reliably no matter what you do.
Instead, you need to design the interaction so that the download is directly triggered by the user. The easiest way of doing that is by having the user directly click on the actual download link. If you want to have some indirection, then you still need to make the action work directly in the browser without going through the server.
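As a rough sketch of that last point, assuming plain DOM elements rather than any particular framework API: the click on a visible button forwards straight to the download anchor inside the same browser event handler, with no server round trip in between. The function name, parameters, and file path below are all hypothetical.

```javascript
// Hypothetical helper: prepares the anchor up front, then triggers it from
// inside the user's own click handler so the browser sees a user gesture.
function wireDirectDownload(button, anchor, fileUrl, fileName) {
  anchor.href = fileUrl;
  anchor.download = fileName;
  button.addEventListener('click', function () {
    // Runs synchronously in the click event, not after a server message.
    anchor.click();
  });
}
```

Whether a given browser accepts this still depends on its own policy, but it keeps the trigger tied to the user's click.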
I would like to render the following website with Scrapy Splash.
https://m.mobilebet.com/en/sports/football/england-premier-league/
Unfortunately, Splash always gets stuck at the loading screen.
I have already tried using a long waiting time (up to 60 seconds) with no results. My Splash version is 3.3.1 and obey robots.txt has been set to false.
Thanks!
There's not quite enough info to answer, but I've got a good guess.
You see, the major difference between Splash and your browser is the user agent string. You have one that looks like a person. Splash generally doesn't.
This kind of infinite loading is a method used by sites to mitigate repetitive load. Often when you're developing locally without a proxy you'll trip these issues. They are quite maddening to develop against because they're inconsistent.
Your requests are just getting dropped; you'll probably see a 403 after 5-10 minutes.
I think it's likely you can solve this issue with the method mentioned in this answer: Scrapy+Splash return 403 for any site.
I don't think it'll be possible - this website needs JS to be rendered. So you'll need to use something like Selenium to scrape information from it.
Also, perhaps what you are looking for is an API for that information - since scraping it from a website can be very inefficient. Try googling "sports REST API" - look for one with Python SDK.
OK, so it seems Splash is supposed to render the JS for you. But I wouldn't rely on it too much: those websites change constantly and are developed against the latest browsers. Your best bet is to use Selenium with a Chromium driver (though using an API is much preferable).
Is there any way to consistently detect PhantomJS/CasperJS? I've been dealing with a spate of malicious spambots built with it and have been able to mostly block them based on certain behaviours, but I'm curious if there's a rock-solid way to know if CasperJS is in use, as dealing with constant adaptations gets slightly annoying.
I don't believe in using Captchas. They are a negative user experience and ReCaptcha has never worked to block spam on my MediaWiki installations. As our site has no user registrations (anonymous discussion board), we'd need to have a Captcha entry for every post. We get several thousand legitimate posts a day and a Captcha would see that number divebomb.
I very much share your take on CAPTCHA. I'll list what I have been able to detect so far, for my own detection script, with similar goals. It's only partial, as there are many more headless browsers.
It is fairly safe to use these exposed window properties to detect (or at least assume) the following headless browsers:
window._phantom (or window.callPhantom) //phantomjs
window.__phantomas //PhantomJS-based web perf metrics + monitoring tool
window.Buffer //nodejs
window.emit //couchjs
window.spawn //rhino
The above is gathered from the JSLint docs and from testing with PhantomJS.
Browser automation drivers (used by BrowserStack or other web capture services for snapshot):
window.webdriver //selenium
window.domAutomation (or window.domAutomationController) //chromium based automation driver
The properties are not always exposed, and I am looking into other, more robust ways to detect such bots, which I'll probably release as a full-blown script when done. But that mainly answers your question.
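A minimal sketch of checking the properties listed above, not the full script mentioned. The helper name, the table name, and the informal labels are my own; the function takes the window object as a parameter so it can also be tried against a plain object.

```javascript
// Property names come from the lists above; labels are informal.
var HEADLESS_HINTS = [
  { prop: '_phantom', label: 'phantomjs' },
  { prop: 'callPhantom', label: 'phantomjs' },
  { prop: '__phantomas', label: 'phantomas' },
  { prop: 'Buffer', label: 'nodejs' },
  { prop: 'emit', label: 'couchjs' },
  { prop: 'spawn', label: 'rhino' },
  { prop: 'webdriver', label: 'selenium' },
  { prop: 'domAutomation', label: 'chromium automation' },
  { prop: 'domAutomationController', label: 'chromium automation' }
];

// Hypothetical helper: returns the distinct labels of every hint present on
// `win`, in the order of the table above.
function detectHeadlessHints(win) {
  var found = [];
  HEADLESS_HINTS.forEach(function (hint) {
    if (hint.prop in win && found.indexOf(hint.label) === -1) {
      found.push(hint.label);
    }
  });
  return found;
}
```

In a page this would be called as `detectHeadlessHints(window)`; an empty array only means none of these particular properties were exposed.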
Here is another fairly sound method to detect JS capable headless browsers more broadly:
if (window.outerWidth === 0 && window.outerHeight === 0){ //headless browser }
This should work well because these properties are 0 by default even if a headless browser sets a virtual viewport size, and by default it can't report the size of a browser window that doesn't exist. In particular, PhantomJS doesn't support outerWidth or outerHeight.
ADDENDUM: There is however a Chrome/Blink bug with outer/innerDimensions. Chromium does not report those dimensions when a page loads in a hidden tab, such as when restored from a previous session. Safari doesn't seem to have that issue.
Update: It turns out iOS Safari 8+ has a bug that reports outerWidth and outerHeight as 0, and a Sailfish webview can too. So while it's a signal, it can't be used alone without being mindful of these bugs. Hence, a warning: please don't use this raw snippet unless you really know what you are doing.
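Given those caveats, one way to use the check is as a single weak signal that is counted together with others rather than acted on alone. This is only a sketch; the function name and the idea of passing in a count of other signals are assumptions of mine.

```javascript
// Hypothetical helper: adds one to the caller's signal count when both outer
// dimensions are exactly 0. Undefined dimensions do not match the strict
// comparison, so they add nothing.
function headlessSignalCount(win, otherSignals) {
  var zeroOuterSize = win.outerWidth === 0 && win.outerHeight === 0;
  return otherSignals + (zeroOuterSize ? 1 : 0);
}
```

A caller might then require at least two signals before treating a visitor as a bot, so that the iOS Safari and hidden-tab cases above do not trigger a block by themselves.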
PS: If you know of other headless browser properties not listed here, please share in comments.
There is no rock-solid way: PhantomJS and Selenium are just software being used to control browser software, instead of a user controlling it.
With PhantomJS 1.x, in particular, I believe there is some JavaScript you can use to crash the browser that exploits a bug in the version of WebKit being used (it is equivalent to Chrome 13, so very few genuine users should be affected). (I remember this being mentioned on the Phantom mailing list a few months back, but I don't know if the exact JS to use was described.) More generally you could use a combination of user-agent matching up with feature detection. E.g. if a browser claims to be "Chrome 23" but does not have a feature that Chrome 23 has (and that Chrome 13 did not have), then get suspicious.
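The user-agent versus feature-detection idea can be sketched roughly as below. Everything here is hypothetical: the function name, the shape of the feature table, and the placeholder feature names in the usage note. I have not listed real features with their Chrome versions, so you would have to fill the table in yourself.

```javascript
// Hypothetical helper: given the major version a browser claims and a table of
// { name, sinceVersion, test } entries, returns the names of features that the
// claimed version should have but that `test(env)` reports as missing.
function missingClaimedFeatures(claimedMajor, featureTable, env) {
  return featureTable
    .filter(function (f) { return f.sinceVersion <= claimedMajor; })
    .filter(function (f) { return !f.test(env); })
    .map(function (f) { return f.name; });
}
```

For example, with a placeholder entry `{ name: 'featureA', sinceVersion: 20, test: function (w) { return 'featureA' in w; } }`, a client claiming version 23 that lacks `featureA` would come back in the result and could be treated as suspicious.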
As a user, I hate CAPTCHAs too. But they are quite effective in that they increase the cost for the spammer: he has to write more software or hire humans to read them. (That is why I think easy CAPTCHAs are good enough: the ones that annoy users are those where you have no idea what it says and have to keep pressing reload to get something you recognize.)
One approach (which I believe Google uses) is to show the CAPTCHA conditionally. E.g. users who are logged-in never get shown it. Users who have already done one post this session are not shown it again. Users from IP addresses in a whitelist (which could be built from previous legitimate posts) are not shown them. Or conversely just show them to users from a blacklist of IP ranges.
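The whitelist variant of that conditional logic might be sketched like this. The field names (`loggedIn`, `postsThisSession`, `ip`) and the function name are assumptions for illustration, and the whitelist is matched on exact IP strings to keep it short.

```javascript
// Hypothetical helper: decides whether to show a CAPTCHA for this request.
function shouldShowCaptcha(visitor, ipWhitelist) {
  if (visitor.loggedIn) return false;              // logged-in users never see it
  if (visitor.postsThisSession > 0) return false;  // already posted this session
  if (ipWhitelist.indexOf(visitor.ip) !== -1) return false; // known-good address
  return true;                                     // everyone else gets a CAPTCHA
}
```

The blacklist variant would flip the default: return false unless the address falls in a listed range.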
I know none of those approaches are perfect, sorry.
You could detect PhantomJS on the client side by checking the window.callPhantom property. The minimal client-side script is:
var isPhantom = !!window.callPhantom;
Here is a gist with proof of concept that this works.
A spammer could try to delete this property with page.evaluate and then it depends on who is faster. After you tried the detection you do a reload with the post form and a CAPTCHA or not depending on your detection result.
The problem is that you incur a redirect that might annoy your users. This will be necessary with every client-side detection technique, and any of them can be subverted and changed with onResourceRequested.
Generally, I don't think that this is possible, because you can only detect on the client and send the result to the server. Adding the CAPTCHA combined with the detection step with only one page load does not really add anything as it could be removed just as easily with phantomjs/casperjs. Defense based on user agent also doesn't make sense since it can be easily changed in phantomjs/casperjs.
I'm not a developer but an internet marketer - so forgive me for what is I'm sure a very basic question. In my career, it's useful when looking at website marketing to better understand what tools are used, such as Google Analytics for example. Most of the time this is quite simple - just view source and you'll see in the source code the javascript snippet.
I use the Ghostery plugin to make this a bit easier, but what I don't understand about HTTP requests is how Ghostery can report a technology as being used, such as the ad server DoubleClick, when I can't see any code in the page source that references DoubleClick. This happens a lot, most often with ad server technologies.
When I look using Chrome Dev Tools, I do in fact see that the call was made by viewing the Source tab.
My question is this and it's really a general question where I'm trying to better understand how calls are made back and forth between the browser and all the various servers and services:
How do I find, in Chrome Dev Tools, what code made the call to load a resource such as DoubleClick? I can't find anything in the source code, which tells me I don't fully understand how these interactions work.
From searching Stack Overflow I think it is an XMLHttpRequest call, but I'm not sure about that; maybe it's cookies. I just don't know how this works. At the end of the day, I don't like not understanding how this all works, so I'm hoping someone can point me in the right direction.
Thanks.
In the Network panel, find your request of interest. The Initiator column will contain a reference to the code that has initiated loading the associated resource.
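One common reason a request has no matching tag in the page source is that a script already on the page adds another script tag at runtime. A rough sketch of that pattern is below; the function name and the example URL are placeholders, not code taken from any real ad server.

```javascript
// Hypothetical loader: adds a <script> tag at runtime, so its URL never
// appears in the HTML that "view source" shows.
function injectScript(doc, url) {
  var script = doc.createElement('script');
  script.src = url;
  script.async = true;
  doc.head.appendChild(script);
  return script;
}
```

If a page called something like `injectScript(document, 'https://ads.example.com/tag.js')`, the Initiator column for that request should point at the line of the script that made the call rather than at the HTML document.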
We are developing an extension hosted in the Chrome Web Store. Recently we've had complaints from our users that they sometimes get a notification window saying "the extension crashed, click here to reload".
After a little research we found that this happens only when we upload a new version to the Chrome Web Store.
We looked for it online and found no documentation whatsoever, so we started testing it ourselves.
We tried to see what exactly can cause this problem and whether we could identify a distinctive cause.
Our attempts included updating only the manifest.json file, a CSS file, or a JS file, or changing nothing at all except the version number. After each change we uploaded a new version and updated it on about 10 different machines.
The results were the same: each update crashed the extension on just a few of the machines while updating perfectly fine on the others, and each time a different set of machines was affected.
Then we thought it might be related to the fact that we have a timer running in the background page, and that the crash might happen just when the timer fires.
So we tried raising the timer's frequency (from every 5 seconds to every 100 milliseconds), and it still behaved the same, crashing on only 3 of the 10 machines.
We have run out of ideas, and this is causing a real problem for our extension's users in terms of user experience.
Has anyone had this problem, or come across any extension crashes on version update?
Is it a known bug in Chrome's extension engine, or are we doing something wrong?
I am having the same problem and I think I found the cause. Do you, by chance, override the new tab page?
I am able to reproduce the problem 100% of the time and when I remove the new tab override from the manifest, the problem goes away.
I opened an issue: Issue 104401