How would you go about making an application that automatically retrieves your bank account balance twice a day? - onlinebanking

I'm building a utility that will hopefully keep my wife in tune with how much money we have available.
I need a simple secure way of logging into my bank account and retrieving the balance.
Something like Mechanize is the only approach I can think of, and I'm not even sure that would work given the properly authenticated HTTPS that banks use.
Any ideas?

Write a Perl script using LWP::UserAgent; it supports HTTPS connections. The only issue might be if the site requires JavaScript.
Web Client Programming with Perl has a few examples to get you started if you're not too familiar with Perl.
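If Perl isn't your thing, here is the same idea sketched in Python with the requests library. To be clear, this is a guess at the shape of the solution, not a working recipe: the URLs, form field names, and balance selector below are all placeholders, and a real bank login may involve CSRF tokens, security questions, or JavaScript that a plain HTTP client can't handle.

# Rough sketch only: every URL, form field, and selector here is hypothetical.
# Watch the real login request in your browser to find the actual values.
import requests
from bs4 import BeautifulSoup

session = requests.Session()  # keeps cookies across requests, like a browser

# Load the login page first so the session picks up any cookies it sets.
session.get("https://bank.example.com/login")

# Submit the login form over HTTPS (field names are assumptions).
session.post("https://bank.example.com/login",
             data={"username": "you", "password": "secret"})

# Fetch the accounts page and pull the balance out of the HTML.
html = session.get("https://bank.example.com/accounts").text
soup = BeautifulSoup(html, "html.parser")
balance = soup.select_one(".balance")  # placeholder selector
print(balance.get_text(strip=True) if balance else "balance element not found")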

If you really want to go there, get these extensions for Firefox: Live HTTP Headers, Firebug, FireCookie, and HttpFox. Also download cURL and a scripting language that can run cURL command-line tasks (or a scripting language like PHP or Perl that has access to cURL libraries directly).
I've started down this road for some idempotent GET tasks, like grabbing PDFs of the S&P reports for the stocks I track from my online brokerage, and downloading the check images for my bank account. Both are repetitive, slow ways of getting data onto my computer, and the financial institutions don't provide any easier way to do it.
Here's why you shouldn't (as a shortcut, I'm going to call the archetypal large bank, brokerage, or other financial institution "BloatBank"):
1. BloatBank is not likely to make its API for accessing this kind of information public, so it can change at any time and all your hard work will be for naught. Whenever they change their mechanism, you'll have to adapt.
2. If BloatBank finds out you've been using automated scripting to access your account information, they may ban you because you've violated their terms of service.
3. You might screw up, and the interaction between the hodgepodge of scripts on BloatBank's server and your scripts accessing your account might cause a Bad Thing, like closing your account. Testing this kind of script is tremendously difficult because you don't have any documentation about how their online service works, and you don't have a test account you can mess with.
4. (A variant of the above.) You think you're safe because you're issuing GET requests. But BloatBank is just a crazy bank that doesn't know anything about REST, so there are some GET requests that can mess up your account.
5. If someone else uses your script to maliciously sniff your online password or mess with your account, any liability coverage from BloatBank may disappear because you've opened a security hole.

Why don't you teach your wife how to log in to the bank herself? Or use Quicken (or Mint, etc.) and teach her how to use the auto-download feature?

Have you checked out Watir? It is fantastic for automating web-browser actions. And since it's written in Ruby, you can take the results and store them in a DB (or email them to yourself) if needed.
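Watir itself is Ruby; if you would rather stay in Python, the same browser-automation idea with Selenium (a comparable tool, not Watir itself) looks roughly like this. All the locators and URLs are placeholders:

# Sketch of the browser-automation approach in Python via Selenium.
# Every URL and element ID below is hypothetical.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()  # drives a real browser, so JavaScript just works
try:
    driver.get("https://bank.example.com/login")
    driver.find_element(By.ID, "username").send_keys("you")
    driver.find_element(By.ID, "password").send_keys("secret")
    driver.find_element(By.ID, "submit").click()
    balance = driver.find_element(By.CSS_SELECTOR, ".balance").text
    print(balance)  # store it in a DB or email it from here
finally:
    driver.quit()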

If you are open to AIR, I'd say build an AIR app. I have worked with Mechanize and I think it's cool, but AIR gives you similar features with a richer GUI (see HTMLLoader and DOM manipulation of the web page).
If I were you, I'd simply pull the page and manipulate the DOM to suit my visual needs.

Please, if you find this easy to do for your bank please post your bank's name. If I have the same one I'll be closing my account.
More to your question: the process of loading a web page inside your code rather than in a browser can be a black art, especially if there is any JavaScript involved. Your best bet would probably be embedding the IE WebBrowser control in your app and then simulating keystrokes and mouse clicks to arrive at your balance page. Then scrape the HTML for the balance.
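For what it's worth, the embed-IE-and-scrape idea can also be sketched from Python through COM (pywin32) instead of a full Windows app, and driving the DOM directly is more reliable than simulated keystrokes. Everything below (URL, element IDs) is a placeholder, and IE itself is a moving target:

# Sketch: drive the IE browser through COM and scrape the resulting HTML.
import time
import win32com.client

ie = win32com.client.Dispatch("InternetExplorer.Application")
ie.Visible = True
ie.Navigate("https://bank.example.com/login")  # placeholder URL

while ie.Busy or ie.ReadyState != 4:  # 4 = READYSTATE_COMPLETE
    time.sleep(0.5)

# Fill the login form through the DOM (element IDs are assumptions).
doc = ie.Document
doc.getElementById("username").value = "you"
doc.getElementById("password").value = "secret"
doc.getElementById("loginForm").submit()

time.sleep(5)  # crude wait for the balance page to load
html = ie.Document.documentElement.innerHTML  # scrape this for the balance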

I could try paying for Quicken and letting it do the balance downloading. Then I'd just need to find a way to get the number out of the software automatically.
This way I'm not violating any terms of service and I'm also reducing security risk since all "hacking" goes on locally.

Related

Google Home reading from a website

I'm currently working on a project where my main focus is to create an Action for Google Home which can be invoked and asked to read out some articles (chosen previously from a list, also by voice) from a particular website.
I was wondering whether this is possible, or whether there are already some similar projects out there.
What I'd like to do is something like the feature in Pocket or Instapaper, where you can make the device read the article for you.
I also thought of making something like a database of all the articles I'm interested in, which auto-updates itself whenever a new article is posted, but my main concern now is being able to separate the articles into various lists, parse each article, and in the end implement text-to-speech in the Action.
Suggestions involving third-party services and apps would also be useful.
Please ask me if anything isn't exactly clear; English is not my first language.
Yes, this is possible. Not necessarily easy, but possible.
First - there is nothing in the Actions on Google library or in Google Home that will automatically scrape a website. That will be up to you.
Second - Responses from your Action are limited in how much they can send at a time.
If you're having it do text-to-speech, you're limited to two "text bubbles" of 640 characters each before the user has to reply. You should keep well below that and should probably stick to just one "text bubble".
If you're playing an audio cut, then you're limited to two minutes.
You can work around both of these limitations by using the Media Response. With TTS, you would speak a portion of the text and then play a brief Media response; when it finishes, your server is triggered to send the next chunk of text. If it is all recorded, you can just send the longer audio as the Media.
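To illustrate just the chunking half of that (the webhook plumbing is omitted, and the 640-character limit is the one mentioned above), a minimal Python helper might look like:

# Minimal sketch: split article text into pieces that fit the 640-character
# "text bubble" limit, breaking on sentence boundaries where possible.
import re

LIMIT = 640

def chunk_text(text, limit=LIMIT):
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        candidate = (current + " " + sentence).strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = sentence[:limit]  # hard-cut an overlong sentence (sketch only)
    if current:
        chunks.append(current)
    return chunks

# Your fulfillment would keep an index in conversation state and speak
# chunks[i] each time the Media Response finishes.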
Be advised, however, that if you're using the inline editor or Firebase Cloud Functions (which the inline editor uses), by default you're not able to access most sites outside Google's network. You need to upgrade to a paid plan to do so. I suggest the Blaze plan, which is pay-as-you-go but includes a free tier that is typically good enough for development work and light production usage.

Can BigQuery's browser interface be white-labeled?

Like most people, we're pretty impressed with BigQuery. We're willing to put up with it being based on proprietary "Dremel" in exchange for not having to configure a ton of servers in our LAN, on EC2, or anywhere else.
The REST API is excellent, and we're incorporating that into our apps, but we still find ourselves using the BQ Browser interface as well. We'd like to incorporate something like a 'generic SQL window' into our app, without divulging that the backend is BQ or that data is stored in Google at all, for that matter. Does Google provide a way to use their BQ browser tool in a white-label manner?
Note also that even extending access to the existing browser tool is problematic. It relies on user accounts existing in one's own domain - something that can't be done, in our case, with a customer's email address. The REST interface solves this with service-level accounts, but that doesn't get you to the SQL window/browser tool.
If the folks at Google are listening (and I know that you are), consider the benefits of white-labeling the browser tool: I think you'd find a lot of software companies integrating it into their suites of products and, then, running circles around any Hadoop/CDH/EMR/Impala/Hive combination.
So, to summarize: how does a software developer import or emulate the BQ browser tool (with all its autocompletes, query histories, etc.) in their own web-based app?
The initial version of the BigQuery web interface was considered just an 'example' UI that anyone could create themselves. It uses only the public BigQuery API to talk to BigQuery.
There are a couple of Google-internal things we've added since then, such as the current design of 'saved queries', and an auth shortcut so that users don't have to explicitly grant permission to the UI to access BigQuery data. But it is still mostly plain ol' JavaScript talking to BigQuery via the REST API the same way anybody else does.
The JavaScript is obfuscated, but my understanding is that this is just for compression purposes, so that it downloads more quickly.
The SQL highlighting is done by CodeMirror with special configuration for the BigQuery SQL variant.
I'll talk to the other members of the BigQuery team about open-sourcing the javascript code in the Web UI. It may be difficult to do at this point, but it doesn't hurt to have a conversation about it. I'll bring this up with the team and update this thread. The most likely answer will be "We'll think about it", but hopefully we can also think about it and start working on it too :-)
Let me know if that sounds like it would meet your needs. It might not solve the auth problems you mention, since your users likely won't have BigQuery accounts, but you may be able to solve that by proxying oauth2 access tokens.
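To sketch the proxy idea in Python (using the current google-cloud-bigquery client, which post-dates this answer; the key file path and query are placeholders): your server holds the service-account credentials and runs queries on behalf of users, so they never need BigQuery accounts of their own.

# Sketch: run user-submitted SQL server-side under a service account.
from google.cloud import bigquery

client = bigquery.Client.from_service_account_json("service-account.json")

def run_user_query(sql):
    """Run SQL a user typed into your white-labeled query window."""
    job = client.query(sql)  # starts the query job
    return [dict(row.items()) for row in job.result()]  # waits, fetches rows

print(run_user_query(
    "SELECT word, word_count "
    "FROM `bigquery-public-data.samples.shakespeare` LIMIT 5"))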

Is scraping Google+ pages/comments/notifications via cURL legal?

The limitations of the Google+ API have just put a hold on a little project I am working on.
I can achieve what I need with a basic cURL script (login to Google+ Page, scrape page, parse data) but I was just wondering if this is allowed?
(yes the script will break whenever they update G+, I can live with that)
A search on "are you allowed to login to google with curl" produces lots of results. So it seems lots of people are doing it, just wondering if anyone knows if it is "really" allowed?
I am not an attorney; always seek advice from legal professionals.
However, my take on this is that it is legal. Nothing restricts you from crawling websites and doing data mining, as long as you don't waste the server's resources (as in a DDoS) and don't use any illegal means to obtain the information, such as exploiting the server software or using some vulnerability that might expose user data.
If this information is publicly available online, it belongs to the public domain, and as long as you are not selling it, it can be considered fair use.
On the other hand you are violating an awful lot of user agreements.

Smart card, PIN, secure HTTP, login, and downloading and manipulating the source HTML - need a suitable coding language

I am now motivated to explore a coding language so that I can make the best solution possible.
But I am not sure of the capabilities of all coding languages, so I am asking for advice.
I want to automate some of the daily processes I do at the office. There is an external database on the internet that we use. We access it with a smart card and secured http.
In short, these are the actions that I do each time I restart the browser or a session ends:
Open a secured HTTP page. /....jsp
After being prompted, I choose an installed certificate.
A smart card is called and I enter a PIN (Charismatics smart security interface).
The page asks me to log in with a username and password.
I open the desired link.
I extract the data from the opened web page manually.
Is it possible to have all these action automated by code?
Thank you for any support!
If you get a PIN screen from the Charismatics smart card security interface instead of from the operating system, then it may be very hard to automate this. Your program is unlikely to get access to the PIN popup window.
If you get the PIN prompt from a CSP (as you mentioned in the comments), then it may be possible to automate the PIN login. The PIN is normally used to set up the SSL/TLS connection, so having it open in the browser won't help you much, unless you program the browser itself.
If you are bound to CSPs, it may be best to keep to C#/.NET. There are of course bindings for other runtimes, but it is better to have as much control as possible.
You may want to take a look at topics such as parsing HTML, because that's something you will certainly need to do. Life becomes a lot harder if the web pages are filled in using JavaScript, so you may want to check for that first.
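As a starting point for the parsing step, here is a minimal Python sketch with BeautifulSoup. It assumes you already have the HTML on disk (the smart-card authentication happens before this step); the file name and selector are placeholders, and the script also does a crude check for how JavaScript-heavy the page is:

# Sketch: parse a page you've already downloaded and gauge its JavaScript use.
from bs4 import BeautifulSoup

with open("saved_page.html", encoding="utf-8") as f:  # placeholder file name
    soup = BeautifulSoup(f.read(), "html.parser")

# Extract the table data you would otherwise copy by hand (selector assumed).
for row in soup.select("table#results tr"):
    cells = [td.get_text(strip=True) for td in row.find_all("td")]
    if cells:
        print(cells)

# Crude JavaScript check: lots of script tags but little visible text suggests
# the data is filled in client-side, which makes scraping much harder.
scripts = len(soup.find_all("script"))
visible = len(soup.get_text(strip=True))
print(f"{scripts} script tags, {visible} characters of visible text")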
Now, if you want to manually choose a link, you may want to render the page in your own application and handle the download yourself.
This is certainly not a task I would recommend when starting off with an unknown programming language. I would find this tricky myself - there are a lot of ifs left in this description.

Running MTurk HITs on external website

I am implementing a website on which recruited MTurk workers will perform tasks. I plan to recruit workers through MTurk HITs and redirect them to an external website for the actual work. I have the following questions about this plan.
Are there any foreseeable problems with this approach to running HITs? If so, how can I mitigate them?
How should I implement the authentication procedure on my external site? For example, how can I make sure the people who come to the website to perform a specific task are indeed the same group of people recruited earlier for this particular task on MTurk?
When the workers finish the task, how should I integrate the payment procedure with MTurk based on their performance? For example, say a worker is owed $3 after finishing the task on my external site; is it possible to tell MTurk to pay him/her this amount programmatically?
The external site will be built using Python, if such detail matters.
Any suggestions and comments based on your experiences and insights in using MTurk would be much appreciated!
I am thinking through this for a similar project of mine, and I've experimented as a worker myself. Here is my plan; I hope it is of use to you. (I have not implemented it yet; it is based on an academic HIT I participated in as a worker.) Here goes:
A. Create a template that has language something like:
1. Please open this web site in a new browser window:
http://your-url.xyz.blah/tasks/${token}
2. Read and follow the instructions there.
3. After completing the task, you will receive a confirmation code. Paste
it here: [________]
B. Create some random tokens for your Mechanical Turk data file:
1A1B43B327015141
09F49F2D47823E0C
B5C49A18B3DB56F4
4E93BB63B0938728
CCE7FA60BFEB3198
...
(Generate these tokens from your app; it needs to cross-reference them. See the sketch after step E below.)
C. Your app extracts the token from the URL, looks up the task, and does whatever it needs to do. I personally don't worry about people stumbling onto a URL, since it is a one-time-use token.
D. After a user completes the task on the external web site, the external app gives a confirmation code. The confirmation code should be random and opaque. Only your application will know if any particular code corresponds to a correct or incorrect answer. In fact, if you want, the correctness may not even be determined in real time -- it could be the result of an aggregation and/or comparison across multiple submissions.
E. Write some code to interact with MTurk programmatically. Take the token and confirmation code supplied in the MTurk result and make sure they match in your external app. If they don't match, reject the HIT. If they match, check the correctness in your external app and approve or reject accordingly. You might also consider a bonus pay structure.
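Here is a rough Python sketch of steps (B) and (E) together, using boto3's MTurk client (which has since replaced the SOAP-era APIs in the old Developer Guide). The in-memory dict stands in for your real database, and the feedback strings and bonus amount are just examples:

# Sketch of (B) and (E): one-time tokens, then verify and pay via boto3.
import secrets
import boto3

def make_token():
    """16 hex characters, like the examples above."""
    return secrets.token_hex(8).upper()

# token -> expected confirmation code; a real app would use a database.
tasks = {make_token(): secrets.token_hex(8).upper() for _ in range(5)}

mturk = boto3.client("mturk", region_name="us-east-1")  # MTurk's only region

def settle_assignment(assignment_id, worker_id, token, submitted_code):
    expected = tasks.get(token)
    if expected is None or submitted_code != expected:
        mturk.reject_assignment(
            AssignmentId=assignment_id,
            RequesterFeedback="Confirmation code did not match.")
        return
    mturk.approve_assignment(AssignmentId=assignment_id)
    # Optional performance-based bonus, as suggested above.
    mturk.send_bonus(WorkerId=worker_id,
                     AssignmentId=assignment_id,
                     BonusAmount="3.00",
                     Reason="Completed the external task.")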
So, to answer your particular questions:
I don't anticipate problems with the approach I described. That said, Mechanical Turk is both an art and a science. Perhaps more art. Writing good questions and paying Turkers appropriately is something you have to figure out with a combination of common sense, market research, and experimentation.
See (C) above. A token is designed to only be used once. Use long enough tokens and the probability of collision becomes very low.
See (E) above. The Mechanical Turk Developer Guide is a good place to start.
Please share your results back. Or have the Turkers send StackOverflow hundreds of postcards. :)
Notes:
I'm currently exploring qualification tests. I suspect they can be very useful.
I want to get a Turker's Worker ID in my external application, but I haven't figured that part out yet. I'm reading up on it; for example: Getting workerId by assignmentId
I am thinking about using the ExternalQuestion feature from the API (a sketch of publishing one appears below): "... you can host the questions on your own web site using an "external" question. ... A HIT with an external question displays a web page from your web site in a frame in the Worker's web browser. Your web page displays a form for the Worker to fill out and submit. The Worker submits results using your form, and your form submits the results back to Mechanical Turk. Using your web site to display the form gives your web site control over how the question appears and how answers are collected."
You might also find PsiTurk to be useful: "PsiTurk is an open platform for conducting custom behavioral experiments on Amazon's Mechanical Turk. ... It is intended to provide most of the backend machinery necessary to run your experiment. It uses AMT's External Question HIT type, meaning that you can collect data using any website. As long as you can turn your experiment into a website, you can run it with PsiTurk!"
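For the ExternalQuestion route, a minimal boto3 sketch of publishing such a HIT follows; the URL, reward, and durations are placeholders, but the XML schema URL is the real one from the API docs. Note that MTurk appends workerId, assignmentId, and hitId as query parameters when it loads your ExternalURL, which also answers the Worker ID question above.

# Sketch: publish an ExternalQuestion HIT so MTurk frames your own site.
import boto3

external_question = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://your-url.xyz.blah/tasks</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>
""".strip()

mturk = boto3.client("mturk", region_name="us-east-1")
hit = mturk.create_hit(
    Title="Complete a short task on our website",
    Description="You will be redirected to an external site to do the work.",
    Reward="3.00",  # placeholder amount
    MaxAssignments=10,
    AssignmentDurationInSeconds=3600,
    LifetimeInSeconds=86400,
    Question=external_question)
print(hit["HIT"]["HITId"])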