I'm trying to write a program that takes in audio from the user via microphone, and then converts what's being said to text for further manipulation.
I know Google has such a thing, but it's for Android developers, and I'm trying to make something more like a Chrome extension (a Chrome extension is what I had envisioned, but I'm open to alternatives).
I've looked at the Mac OS X NSSpeechRecognizer, but I need something more comprehensive, since it requires you to specify a limited grammar ahead of time.
I can't predict what my users will say into the microphone (but I CAN assume it will be English).
Google have an unofficial API, which can be used as described in this post:
http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11/
If you're writing a Chrome extension, you might be able to use a tag like:
<input type="text" x-webkit-speech />
which adds a microphone button and uses Google's cloud speech recognition to fill the text box with what the user says.
Related
I'm currently working on a project where my main focus is to create an Action for Google Home which can be invoked and asked to read out some articles (chosen previously from a list, also by voice) from a particular website.
I was wondering whether this is possible, and whether there are already similar projects.
What I'd like to do is something like the feature in Pocket or Instapaper, where you can make the device read the article for you.
I also thought to make something like a database with all the articles I'm interested in, which auto-updates itself whenever a new article is posted, but my main concern now is to be able to separate the articles in various lists, parse the article and in the end implement text to speech into the Action.
Also some implementations with 3rd party services and apps would be useful.
Please ask if anything isn't clear; English is not my first language.
Yes, this is possible. Not necessarily easy, but possible.
First - there is nothing in the Actions on Google library or in Google Home that will automatically scrape a website. That will be up to you.
Second - Responses from your Action are limited in how much they can send at a time.
If you're having it do text-to-speech, you're limited to two "text bubbles" of 640 characters each before the user has to reply. You should keep well below that and should probably stick to just one "text bubble".
If you're playing an audio cut, then you're limited to two minutes.
You can work around both of these limitations by using the Media Response. With TTS, you would play a portion of the text followed by a brief Media response; when that media finishes, your server is triggered to send the next chunk of text. If the article is already recorded as audio, you can just send the longer audio as the Media.
Be advised, however, that if you're using the inline editor or Firebase Cloud Functions (which the inline editor uses), by default you can't access most sites outside Google's network. You need to upgrade to a paid plan to do so. I suggest the Blaze plan, which is pay-as-you-go but includes a free tier that is typically good enough for development work and light production usage.
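To make the chunking idea concrete, here is a minimal sketch of splitting an article into pieces that each fit in one text bubble. The function name `chunk_text` and the sentence-splitting rule are my own assumptions, not part of any Actions on Google library; only the 640-character figure comes from the limit mentioned above.

```python
import re


def chunk_text(text: str, limit: int = 640) -> list[str]:
    """Split text into chunks of at most `limit` characters,
    breaking at sentence boundaries where possible."""
    # Naive sentence split: whitespace that follows '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut hard.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Your fulfillment would then send one chunk per turn and move to the next chunk each time the Media response finishes. In practice you would probably pass a limit well below 640, as suggested above.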
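Since the scraping is up to you, here is a rough sketch of the parsing step using only Python's standard library. It assumes the article body lives in plain `<p>` tags, which will not hold for every site; `ParagraphExtractor` and `extract_paragraphs` are hypothetical names for illustration, and fetching the page is left out.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text content of every <p> element."""

    def __init__(self) -> None:
        super().__init__()
        self.paragraphs: list[str] = []
        self._in_paragraph = False
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_paragraph:
            self._in_paragraph = False
            # Collapse runs of whitespace left over from the markup.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)

    def handle_data(self, data):
        # Text outside <p> (headings, scripts, navigation) is ignored.
        if self._in_paragraph:
            self._buffer.append(data)


def extract_paragraphs(html: str) -> list[str]:
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs
```

A real site will likely need site-specific rules (for example, restricting to the article container), so treat this as a starting point only.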
I would like to know if it's possible to view a Google Spreadsheet doc as a PDF without first manually converting it to a PDF. I don't want to share a link directly to the spreadsheet; I want to share a link to a PDF version of it, which ends up looking better (in Print View rather than Spreadsheet Document View).
I know I can Print > Save as PDF, then download to my local machine, then upload and save somewhere on my server. But is there a way to view the spreadsheet as a PDF?
I have Googled this and found nothing. The best I could come up with is the Google Document Viewer (https://docs.google.com/viewer), but that does not seem to give me the option I am looking for. Further, I do not want to install any Chrome plugins, etc., because I want to be able to share a link to the PDF with people without them having to install a plugin to see the doc.
Unfortunately, what you are trying to do and the way you are trying to do it is not a capability within Google Docs. Sorry.
I think the best way is to use the Google Drive API and write your own script that does this job. I mean:
You have a web server.
Write a simple method in any web technology, such as PHP, Python, Java, C#, or whatever you like and your server is able to serve. This method connects to your Google Drive account through its API, knows which spreadsheet to handle, and knows how to interpret the columns. The spreadsheet should be parsed to HTML, and then you create the PDF with some popular tool (one suited to your programming language or your server's operating system). The method should return an HTTP response with the Content-Type header set to application/pdf.
You provide interested people with the link under which your method is available.
I guess this reference should help you use the Google API:
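As an illustration of the last part of that method, here is a minimal sketch of a web endpoint that returns PDF bytes with the right headers. It is written as a plain WSGI app so it needs nothing beyond Python's standard library. `load_pdf_bytes` is a hypothetical placeholder: the Drive API call and the HTML-to-PDF conversion are deliberately left out, and the file name in the header is an assumption.

```python
def load_pdf_bytes() -> bytes:
    # Hypothetical placeholder. A real implementation would fetch the
    # spreadsheet through the Drive API and convert it to PDF here.
    return b"%PDF-1.4\n% placeholder document\n"


def pdf_app(environ, start_response):
    """WSGI application that answers every request with the PDF."""
    body = load_pdf_bytes()
    headers = [
        ("Content-Type", "application/pdf"),
        ("Content-Length", str(len(body))),
        # 'inline' asks the browser to display the PDF instead of saving it.
        ("Content-Disposition", 'inline; filename="spreadsheet.pdf"'),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally you could serve it with `wsgiref.simple_server.make_server("", 8000, pdf_app).serve_forever()` and then share the resulting URL.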
How to download the resources:
https://developers.google.com/drive/web/manage-downloads
How to convert (i.e. to PDF) and open the resources in your own application:
https://developers.google.com/drive/web/integrate-open#open_and_convert_google_docs_in_your_app
I hope this helps.
I want to create an application that can play the audio of YouTube videos and also save the downloaded content in a local cache, so that when the user decides to resume or replay a video, the application doesn't have to download the cached part again and only downloads the remaining part. (The user can then decide what to do with the cache and how to organize it.)
This is especially convenient on mobile devices (my main focus), but I'd like to create a desktop version too for experimental purposes.
So, my question is: does YouTube provide any API for this? In order to cache the downloaded content, my application itself needs to download the content rather than an embedded player (also remember that it is a native application). I have a third-party application on my Android system that plays YouTube videos, so I think it's possible unless the developers use some sort of hack; again, this is what I don't know.
Don't confuse this with the web gdata info API or the embed API; that is not what I want. What I want is to handle the video transfer.
As far as I know, there is no official API for that. However, you could use libquvi to look up the URLs of the real video data, or you could have a look at how they do it and reimplement it yourself (see here).
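Once you have a direct URL to the media data, the "only download the remaining part" behaviour from the question is usually done with an HTTP Range request. Below is a small sketch of building that header from the size of the cached file. The function name `resume_headers` is made up for illustration, and it assumes the server honours range requests (answering with 206 Partial Content), which you would need to verify.

```python
import os


def resume_headers(cache_path: str) -> dict[str, str]:
    """Return HTTP headers that request only the bytes not yet cached."""
    try:
        cached = os.path.getsize(cache_path)
    except OSError:
        # No cache file yet: download from the beginning.
        cached = 0
    # "bytes=N-" means "from offset N to the end of the resource".
    return {"Range": f"bytes={cached}-"} if cached else {}
```

You would pass these headers to whatever HTTP client you use and append the response body to the cache file. If the server replies with 200 instead of 206, it ignored the range and sent the whole file, so the cache should be overwritten rather than appended to.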
I'm currently trying to upload an image from my Mac application to Facebook. To do this, I'd like for the user to simply input his username and password, and to click a button.
The only issue is that Facebook doesn't actually have an API for the Mac; it only has one for iOS. This shouldn't be a problem, except that to log in you must use a web view, something I'm not too keen on doing, since I'd like the interface to be two simple text fields.
I've also looked into PHFacebook, a class I found online, but it also seems to utilize an NSWebView.
I'm wondering if there's a security issue when you use text fields; indeed, it's slightly strange that no available API offers this function!
So, to conclude, is it possible, or is there an API, that lets you upload an image and lets you provide the user's credentials through simple NSStrings?
There's a Cocoa framework called MKAbeFook that lets you access Facebook APIs. I don't know if it's up to date, but you should give it a try.
It looks like this is not supported unless you can embed a web browser.
If you scroll to the bottom of the Facebook authentication docs, they seem to say this.
Embedding a web browser is generally doable, but may be overkill for what you are trying to do.
Good luck!
You could do this through JSON in your Mac app, but I think the way Facebook's protocols work is that you simply provide the image to upload, and then the user logs into their Facebook account through a web view and has to authorize it.
Would a script that sets display messages for instant messengers be simple or complex? After some searching, there doesn't seem to be any information about this at all.
For the sake of an example, if I had a text file of quotations, would it be possible to have the Google Talk display message change to a different quotation hourly?
Depends on which client you're using. As far as I know, Google's client doesn't offer any interface for plugins, but the open source instant messenger Pidgin does. I think there already is a plugin for what you want to do, but you can write your own using the documentation and examples they give you.
The complexity of writing something like this is based on how much C or Perl you know, since you can program in either of those for Pidgin. Reading code from other people's plugins, you should be able to figure out the Pidgin API.
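The client-independent part of this is small. Here is a sketch in Python (not C or Perl, so it is not a Pidgin plugin) of reading quotations from a text file, one per line, and picking the one for the current hour. The function names are hypothetical, and actually setting the display message is left to whatever plugin interface your client offers.

```python
import time
from typing import Optional


def load_quotations(path: str) -> list[str]:
    """Read one quotation per line, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def quotation_for(quotations: list[str],
                  timestamp: Optional[float] = None) -> str:
    """Pick a quotation that changes once per hour and cycles
    through the list in order."""
    if not quotations:
        raise ValueError("no quotations available")
    now = time.time() if timestamp is None else timestamp
    hour_index = int(now // 3600)  # whole hours since the Unix epoch
    return quotations[hour_index % len(quotations)]
```

Because the choice depends only on the clock, the script can be run from a scheduler every hour, or at any time, and it will always return the quotation for the current hour.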
You can use the Kik API to programmatically send rich content and files between mobile applications. It is available for the iPhone and Android platforms and takes only about 5 lines of code to integrate into your app. There is more info at the API website: http://www.kik.com/dev
Disclaimer: I'm one of the developers behind the Kik API :)