How might I provided a URL use the FireShot API to take a screenshot, upload to Imgur, and return some output (eg. markdown) - api

I am looking for a way to utilize the FireShot API with JS to given a URL (or perhaps a list) use the FireShot API to take screenshot, upload to Imgur, then return the user the URLs or perhaps something like markdown to use quickly in forums.
Method 1: Open new window
I tried opening the URL in a new window, but found that I cant control that page with JS dues to cross domain problems. The same with iFrames.
Method 2: simple $.get()
A simple $.get() wont work because of the same cross domain issues I guess?
http://jsfiddle.net/t6aeq/
$.get($url.val(), function(data) {
console.log(data);
});
Via PHP "Proxy"
So I tried creating a simple PHP script that gets the HTML of the URL and returns it to my JS (using file_get_contents($url)). But some sites like Microsoft will detect that I am using some automated methods and give an error page of sorts. I also cant seem to find a way to use jQuery to query that returned HTML for link[rel=stylesheet], script, style and body to append to the head and a div respectively. I posted abt that on another question
A new Idea: Embed scripts on browser level
So I thought away of getting around these is using iMacros or GreeseMonkey or something to insert scripts into pages on the browser level instead? But any guidance or tips on how can I do that? Also, I'd prefer a pure JS/PHP method if available so users are not limited to using Browser plugin/scripts (tho I will be the only user for now)
It suddenly came to my mind that this may not work because the FireShot API key and Imgur is limited to the domain? Any solutions?

You might be able to inject the FireShot script using Greasemonkey. But, first use GM_xmlhttpRequest() to fetch an API key, for that page's domain, from the "Create FireShot API Key" page.
Note that GM_xmlhttpRequest() does not have the same cross-domain issues that $.get() has.
However, at this point you might be better off just writing your own Firefox add-on. Maybe start with FireShot's code for ideas. Also see the Screengrab add-on.

Related

Retrieve current chrome open page in html without saving it

I'm implementing a python script mainly based on pyautogui. One of the things the script does is to open a chrome webpage. After that I would need to access the DOM of this currently open webpage.
Since I've not opened the browser with selenium, I can't use it to analyze the DOM.
However, my question is: is this currently open chrome page available/saved somewhere in the hard drive so that I can access it with selenium? Like an .html file?
I checked many other questions here and users talk about chrome cache, but there are no html files there.
I just need to be able to access the current open page and not all the historical data in the cache.
Opening web browser directly with selenium is not an option either, since most of the websites analyzed have captchas and distil technology.
Thanks.
If you start the original chrome with --remote-debugging-port=PORT_NR argument, and visit localhost:PORT_NR from another browser, you will have access to the full content of the browser, including dev console.
Once you have this, you have multiple ways to go:
You can visit http://localhost:PORT_NR with with any other browser (or even with the same browser), and you should have full access to the content of the original Chrome. With Selenium you should have a relatively easy time to get by.
You can also use the devtools api (the documentation.. is.. well... there is room for improvement. Search for chrome devtools protocol to be amazed by the lack of docs). As an example you can get to http://localhost:PORT_NR/json to get the available debugging URIs. Grab the relevant websocket endpoint (webSocketDebuggerUrl). Open a websocket connection, and issue a command, like {"method": "DOM.getDocument", "id":12}. You can find available DOM related commands here: https://chromedevtools.github.io/devtools-protocol/1-3/DOM
Sice I had to reinvet the wheel I may give some extra info that I coudn't find anywhere:
Start the Browser with remote debugging enabled (see previous posts)
Connect to the given port on localhost and use these HTTP-GET-Requests to geta very limited control on your browser:
https://chromedevtools.github.io/devtools-protocol/#endpoints
Most important:
GET /json/new?{url}
GET /json/activate/{targetId}
GET /json/close/{targetId}
GET /json or /json/list
To gain full control over the browser, you need to use a "websocket" connection. Each Object in the GET /json or /json/list has it's own ID. Use this ID to interact with the tab. Btw: Type "page" are normal tabs, the other stuff are extentions and so on. Once you know which Tab you want to influence, get it's "webSocketDebuggerUrl".
Use this URL and connect with something that can speak the Websocket-protocol.
Once connected, you must craft a valid Json by the following structure:
{
"id":0,
"method":"Page.navigate",
"params":{url:http://google.com}}
}
Notes:
ID is a simple counter (int) that get bigger - not the ID of the tab(!)
Method is the method described in the docs params is also in the docs.
The return values are always JSONs.
From now on you can use the official docs:
https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-navigate
Dunno how other ppl found out about it but it took a few hours to get it working. Probably cause everyone is just using python's selenium to do it.

Does google index script tags as content when using handlebars.js

If you use the standard handlesbar.js implementation, does Google view the content within the custom script tags as content, script or unknown content?
If you're in doubt, do in pure HTML. Unfortunately, Google should ignore this. I looked about, and all I heard is that this application was not made ​​to be searchfriendly.
In fact, Google undestand and even follow links created via Javascript, but handlebarsjs is very more complex.
Possible solution
A strong suggestion that I make to you is load a simplified version with some content in plain simplified and after use handlebarsjs, so at then at least do not let google completely blind. But thsi version should be used also to end user, because google Will know if you show a diferent content just for Googlebot.
Possible solution 2
Exist a way to make websites that rely heavily on AJAX still work in Making AJAX Applications Crawlable

Pass a string to various websites

I have a product code which I need to enter into 6 different websites in order to pull different information from them about the product. Is there away to save this product code into some sort of variable and pass it into each websites input box and it return all the information from each one automatically? Really have no idea where to go/start with this so if anyone can brainstorm a few ideas to get me moving that would be great.
In order get what you are planning for:
You need a script which visits the specified web site,
then at the website, you can get the element by tag.
For instance in javascript,
var textBox = document.getElementByTag(Input);
This will give you a reference to text field to enter the text. It can be done as follows:
textBox.value = "any string";
Once you have done this, you will have to retrieve the results from the page, based on the website layout.
So if you can specify about your work in detail, you would get better response.
Assuming you're talking about using an ordinary GUI browser, the best you can do is copy it to your system clipboard, and paste it into each page on the browser.
If you're talking about a programmatic web-access like wget or curl, it depends on what language you are writing your script in.
you have to create the web request for each web site and find a way to parse the response which will be HTML
have a look at the HttpWebRequest you can find lots of example on internet that shows how you can create an HTTP POST to a website.
http://www.terminally-incoherent.com/blog/2008/05/05/send-a-https-post-request-with-c/

How to speed up Google Translate

I have a web page that has 70000 characters. As you know when doing translation through Google API you can only send up to 5000 characters at a time. Which means I have to send data to Google 14 times (70000/5000) which takes a lot of time and then my page is displayed. Is there a way to speed up the process?
Thanks
have you tried caching the translation?
If you were using some AJAX framework (you don't mention what your web page is created with eg c#) then you can make it faster by making the API call via the AJAX framework.
It would look something like this (psuedo-code since we don't know what you are using):
Serve web page (almost instant)
Web page starts AJAX call:
Break text into chunks
Foreach chunk
Translate via API
Append to the page
This way the user will see the page immediately, and will also see the translation appear piece by piece as it is processsed instead of having to wait until the end.
My best bet would be to generate a page in one language, then ask google to translate it trough HTTP and display result as your own, to make it seamless for user. I believe that is what Google Chrome does when translating web pages.
Example of URL that makes Google translate the whole web page:
http://translate.google.com/translate?hl=en&sl=ru&tl=en&u=http%3A%2F%2Flinux.org.ru%2F
Of course, another option is to use Google Translate API and cache result if page content is not changing frequently.
go to the Javascript file in Google, it will lead you also to the CSS file, make a file or perhaps two, or you may be able to add CSS to your own, now make Javascript page on your web site in own directory. make a nip of code to update the Javascript code every so many seconds or minutes, and this will make the transition much faster, just by refreshing the content they give.. have fun :) also ultimately you can also send a request at the same time as the first one to translate after char 5000 which should be relatively easy to do.

Refresh browser via cron(or not) to a different page on remote request?

I need to display pages in a tutorial fashion. I looked in to netsupport, beamyourscreen and other possibilities but, I do not want the viewers to download anything. I cannot use gd / send screenshots due to audio / video instructions embedded in some of the pages.
Basically, I need the ability to "refresh" a users browser window to a different page via an interface on my end. Whether via a form submission, javascript or any other type of "controller" that allows me to change the page on the viewers browser. PERL preferred but, PHP / javascript whatever works and is cross browser. I set up a simple javascript page forward timer that "works" but, page load times and conversation interruptions are a huge factor.
The entire tutorial website will be developed around this ability.
I was looking in to curl / cron / wget methods but, found little information.
I have seen forum and chat scripts that basically perform a similar task but, there must be a simple(ish) solution in leau of hacking up another script to suit my needs.
I do not want others to control the pages either. The site really, only needs to be accessable during the tutorial however, It "could" remain web accessable as long as user interaction was normal unless (being controlled).
The initial site concept is based on instructing people how to properly introduce new pets into a home. Will be operated by a veteranarian that saved my pets life. I wanted to give something back.
Possible? I really appreciate simple examples etc...
You have no other way but to keep polling the server for "instructions" using javascript. No, you can't send nothing to the end user browser, neither curl nor wget.
Mainly, you'll have to set up a simple request/response protocol between the browser and the server.
If you want to go deeper, you can use something like cometd/meteord/etc. If not, a hidden iframe that reloads himself and receives pages with javascript code for the needed actions can do the trick.
Another alternative.
With javascript dopolling and single character flatfile. Have a simple one character flatfile with a single var. Write it in perl (it is faster and uses less resources than php). The parent script calls a javascript variable in a flatfile. It hits the flatfile and goes wherever the var sets it. The flatfile is written to by the controller. Done.
I guess you could also rename an empty flatfile and use that as the controller. I am usure which is faster, open and read a specific file or hit the directory and return the file name. On the controller side, opening and writing to a file vs renaming a file. Maybe they counter each other in resources and time?
This way the site can act as a normal site. When you want to have remote users see a "presentation" (automatically being shown the site pages at the controllers pace), the controller activates polling and tells the viewers to push a start button. This allows a remote instructor to load pages for the viewers at his leisure.
It is a simple solution that works with nothing really sophisticated going on. No frames are needed either. Just need javascript enabled.
Any better suggestions are welcome!
It occurred to me that what you might want to use is HTML Push technology. Check out the wiki, they have several links. I have never used it myself