Script to download Google web history - google-search-api

How does one write a script to download one's Google web history?
I know about
https://www.google.com/history/
https://www.google.com/history/lookup?hl=en&authuser=0&max=1326122791634447
feed:https://www.google.com/history/lookup?month=1&day=9&yr=2011&output=rss
but they fail when called programmatically rather than through a browser.

I wrote up a blog post on how to download your entire Google Web History using a script I put together.
It all works directly within your web browser on the client side (i.e. no data is transmitted to a third-party), and you can download it to a CSV file. You can view the source code here:
http://geeklad.com/tools/google-history/google-history.js
My blog post has a bookmarklet you can use to easily launch the script. It works by accessing the same feed, but performs the iteration of reading the entire history 1000 records at a time, converting it into a CSV string, and making the data downloadable at the touch of a button.
I ran it against my own history, and successfully downloaded over 130K records, which came out to around 30MB when exported to CSV.
EDIT: It seems that number of foks that have used my script have run into problems, likely due to some oddities in their history data. Unfortunately, since the script does everything within the browser, I cannot debug it when it encounters histories that break it. If you're a JavaScript developer, use my script, and it appears your history has caused it to break; please feel free to help me fix it and send me any updates to the code.

I tried GeekLad's system, unfortunately two breaking changes have occurred #1 URL has changed ( I modified and hosted my own copy which led to #2 type=rss arguments no longer works.
I only needed the timestamps... so began the best/worst hack I've written in a while.
Step 1 - https://stackoverflow.com/a/3177718/9908 - Using chrome disable ALL security protocols.
Step 2 - https://gist.github.com/devdave/22b578d562a0dc1a8303
Using contentscript.js and manifest.json, make a chrome extension, host ransack.js locally to whatever service you want ( PHP, Ruby, Python, etc ). Goto https://history.google.com/history/ after installing your contentscript extension in developer mode ( unpacked ). It will automatically inject ransack.js + jQuery into the dom, harvest the data, and then move on to the next "Later" link.
Every 60 seconds, Google will force you to re-login randomly so this is not a start and walk away process BUT it does work and if they up the obfustication ante, you can always resort to chaining Ajax calls and send the page back to the backend for post processing. At full tilt, my abomination script collected 1 page a second of data.
On moral grounds I will not help anyone modify this script to get search terms and results as this process is not sanctioned by Google ( though not blocked apparently ) and recommend it only to sufficiently motivated individuals to make it work for them. By my estimates it took me 3-4 hours to get all 9 years of data ( 90K records ) # 1 page every 900ms or faster.
While this thing is going, DO NOT browse the rest of the web because Chrome is running with no safeguards in place, most of them exist for a reason.

One can download her search logs directly from Google (In case downloading it using a script is not the primary purpose),
Steps:
1) Login and Go to https://history.google.com/history/
2) Just below your profile picture logo, towards the right side, you can find an icon for settings. See the second option called "Download". Click on that.
3) Then click on "Create Archive", then Google will mail you the log within minutes.

maybe before issuing a request to get the feed the script shuld add a User-Agent HTTP header of well known browser, for Google to decide that the request came from that browser.

Related

Login screen recording in JMeter repeating url issue

I need your help, I have recorded a login script in blaze meter and importing it into JMeter what I noticed that browsing URL is repeating like site.com/0, site.com/1,site.com/2 and so on. Please suggest what to do to fix it asap help required. thanks.
I am trying to record a login script in blaze meter when I imported the script in JMeter I found that the browsing URL is repeating. like example.com/0, example.com/1,and so on. please help me.
We cannot "help" without knowing what are your expectations.
When it comes to performance testing of web applications you need to ensure that JMeter is properly configured to behave exactly like a real browser.
It means that JMeter should send the same requests and in the same manner as the real browser does.
In case if the network footprint generated by JMeter matches the one which the real browser produces - you don't need any "help" there. If it doesn't - we need to see:
the dump of requests from "Network" tab of your browser's developer tools
how did you configure the BlazeMeter Chrome Extension, i.e. choosing "Only top level requests" might "help" you
Normally these numeric postfixes are used as the naming convention for the Transaction Controller to all nested redirects, embedded resources and so on would be considered an integral part of the parent "transaction"

Zap Vaadin setup issues

Being absolutely new to security testing, I am trying to run basic steps and then trying to run a spider and active scan. I have seen couple of videos from owasp & youtube and tried to make sense of the included ZAP documentation. However, I don't think ZAP is logged in while making the spider crawl or active scans.
Mine is a Java + Vaadin & Spring based application and the POST URL's don't change on making any kind of request. Just the request parameters change. POST URL's are always
http://example.com/UIDL/?v-uiId=0
I have used
Set Active Session under HTTP session, before running the spider/active scan
have setup the context(though very few URL's appear under Sites),
I have also tried to update structure parameter with words like "page", "UIDL" etc. which i could see in the URL's
I have also tried to exclude logout URL, but since all the POST URL's are the same, the spider and active scan don't really do anything, in that case.
On right clicking i got an option to select the request as Form based Authentication.
I have tried that as well, however, it populates the "Login Request POST data" with vaues like
12929ddf-5d7f-4264-b810-2fa8f38eca6f[["597","v","v",["pagelength",["i","28"]]],["597","v","v",["firstToBeRendered",["i","0"]]],["597","v","v",["lastToBeRendered",["i","26"]]],["597","v","v",["reqfirstrow",["i","15"]]],["597","v","v",["reqrows",["i","12"]]]]
and fills the username/password fields with exactly the same.
One last thing, do we really need to disable the XSRF to be able to use ZAP properly with Java/Vaadin applications?
I am simply trying to get it working and then continue learning from there. Would appreciate if anyone can please help me with the correct setup.

How to check in jmeter if entered fields remain same in the first page after navigating back from nth page

I want to test a page.Where i want to fill up the fields like first name last name etc.and after going two pages further if i come back to the original page by using back navigation ,data entered for first name and last name remains the same.or it is filled up.
In jmeter i want to check the same if data entered for the fields remain same if i navigate back .
How can i achieve this.
I tried gving url directly in the path its not happening since it is not the way.
please help me since i'm new to jmeter.
You need to understand 2 things. How JMeter works and how your application works.
JMeter only captures data that is communicated to server. It does not matter how data is entered from UI. It does not check if data retains in the fields or not. It only records the request that is sent by your application to server-side.
So, if you understand above, you also need to understand how your application sends data to server. Does it sends the request as you move from first page to second. Or does it send (Submit) data on final page.
Either way, JMeter is not a tool to test if your form fields are retaining data in them as you navigate between pages. As mentioned earlier it only monitors data requests/responses.
Selenium seems a better option for your test requirement.
Please read the apache documentation carefully:
JMeter is not a browser. As far as web-services and remote services are concerned, JMeter looks like a browser (or rather, multiple browsers); however JMeter does not perform all the actions supported by browsers. In particular, JMeter does not execute the Javascript found in HTML pages. Nor does it render the HTML pages as a browser does (it's possible to view the response as HTML etc., but the timings are not included in any samples, and only one sample in one thread is ever viewed at a time).
First, you have to understand how JMeter works!!! To do the Functional Testing, Selenium would be a good choice.
Thanks

How to speed up Google Translate

I have a web page that has 70000 characters. As you know when doing translation through Google API you can only send up to 5000 characters at a time. Which means I have to send data to Google 14 times (70000/5000) which takes a lot of time and then my page is displayed. Is there a way to speed up the process?
Thanks
have you tried caching the translation?
If you were using some AJAX framework (you don't mention what your web page is created with eg c#) then you can make it faster by making the API call via the AJAX framework.
It would look something like this (psuedo-code since we don't know what you are using):
Serve web page (almost instant)
Web page starts AJAX call:
Break text into chunks
Foreach chunk
Translate via API
Append to the page
This way the user will see the page immediately, and will also see the translation appear piece by piece as it is processsed instead of having to wait until the end.
My best bet would be to generate a page in one language, then ask google to translate it trough HTTP and display result as your own, to make it seamless for user. I believe that is what Google Chrome does when translating web pages.
Example of URL that makes Google translate the whole web page:
http://translate.google.com/translate?hl=en&sl=ru&tl=en&u=http%3A%2F%2Flinux.org.ru%2F
Of course, another option is to use Google Translate API and cache result if page content is not changing frequently.
go to the Javascript file in Google, it will lead you also to the CSS file, make a file or perhaps two, or you may be able to add CSS to your own, now make Javascript page on your web site in own directory. make a nip of code to update the Javascript code every so many seconds or minutes, and this will make the transition much faster, just by refreshing the content they give.. have fun :) also ultimately you can also send a request at the same time as the first one to translate after char 5000 which should be relatively easy to do.

Refresh browser via cron(or not) to a different page on remote request?

I need to display pages in a tutorial fashion. I looked in to netsupport, beamyourscreen and other possibilities but, I do not want the viewers to download anything. I cannot use gd / send screenshots due to audio / video instructions embedded in some of the pages.
Basically, I need the ability to "refresh" a users browser window to a different page via an interface on my end. Whether via a form submission, javascript or any other type of "controller" that allows me to change the page on the viewers browser. PERL preferred but, PHP / javascript whatever works and is cross browser. I set up a simple javascript page forward timer that "works" but, page load times and conversation interruptions are a huge factor.
The entire tutorial website will be developed around this ability.
I was looking in to curl / cron / wget methods but, found little information.
I have seen forum and chat scripts that basically perform a similar task but, there must be a simple(ish) solution in leau of hacking up another script to suit my needs.
I do not want others to control the pages either. The site really, only needs to be accessable during the tutorial however, It "could" remain web accessable as long as user interaction was normal unless (being controlled).
The initial site concept is based on instructing people how to properly introduce new pets into a home. Will be operated by a veteranarian that saved my pets life. I wanted to give something back.
Possible? I really appreciate simple examples etc...
You have no other way but to keep polling the server for "instructions" using javascript. No, you can't send nothing to the end user browser, neither curl nor wget.
Mainly, you'll have to set up a simple request/response protocol between the browser and the server.
If you want to go deeper, you can use something like cometd/meteord/etc. If not, a hidden iframe that reloads himself and receives pages with javascript code for the needed actions can do the trick.
Another alternative.
With javascript dopolling and single character flatfile. Have a simple one character flatfile with a single var. Write it in perl (it is faster and uses less resources than php). The parent script calls a javascript variable in a flatfile. It hits the flatfile and goes wherever the var sets it. The flatfile is written to by the controller. Done.
I guess you could also rename an empty flatfile and use that as the controller. I am usure which is faster, open and read a specific file or hit the directory and return the file name. On the controller side, opening and writing to a file vs renaming a file. Maybe they counter each other in resources and time?
This way the site can act as a normal site. When you want to have remote users see a "presentation" (automatically being shown the site pages at the controllers pace), the controller activates polling and tells the viewers to push a start button. This allows a remote instructor to load pages for the viewers at his leisure.
It is a simple solution that works with nothing really sophisticated going on. No frames are needed either. Just need javascript enabled.
Any better suggestions are welcome!
It occurred to me that what you might want to use is HTML Push technology. Check out the wiki, they have several links. I have never used it myself