Import user accounts data from other websites with user sign-in - api

dealsgoround.com and citypockets.com access user accounts and their daily deals on livingsocial.com and other daily deal websites. These sites ask users for their credentials, sign them in to the respective deal websites, and then import the account details. LivingSocial and the other deal websites don't provide API access to user accounts.
I want the same kind of access to other websites so that I can import data from them, but I can't work out the backend process by which dealsgoround.com and citypockets.com import data from Groupon, LivingSocial, etc.
So far I have found that web crawlers/spiders can be used to scrape data from web pages. But I am not sure whether crawlers are useful when the user must sign in and the page URLs are encrypted, or at least dynamically generated.
Please suggest a way to do this. If crawlers are the solution, please provide links to some web crawler APIs that I can use in my .NET application.
Thanks
Atif

Signing in with alternate credentials such as a Facebook or Google ID is usually built on OAuth, short for Open Authorization (pronounced "oh-auth").
Incidentally, you should not import the data without the user's express permission, and you should be vigilant about security issues.
There is a lot of documentation available, much of it rather heavy. The best place to start is the specification: RFC 5849 covers OAuth 1.0, and RFC 6749 covers OAuth 2.0.
The OAuth website is also useful: http://oauth.net

I was able to find the answer... Yes, a web crawler is the solution in this scenario.
We can use PHP, ASP.NET or any other server-side language to send an HTTP POST request with the sign-in parameters (user_name/password). This authenticates (signs in) the user. After sign-in, as long as we keep the session cookies the site returns, we can read the contents of any URL that shows the signed-in user's information.
Note: In my case this isn't unauthorized access to user accounts, because users provide their credentials themselves in order to import their data from deal websites, as with dealsgoround.com and citypockets.com.
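As a rough illustration of the approach described in this answer, here is a minimal Python sketch using only the standard library. The URLs are hypothetical, and the form field names (user_name/password) are simply the ones mentioned above; a real site defines its own form fields and may also require hidden tokens, so treat this as an outline rather than something that works against any particular deal site.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Hypothetical URLs; a real site defines its own sign-in and account pages.
LOGIN_URL = "https://deals.example.com/login"
ACCOUNT_URL = "https://deals.example.com/account/deals"


def make_session():
    """Return an opener that keeps cookies between requests, like a browser."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(username, password):
    """Build the POST request that submits the sign-in form."""
    form = urllib.parse.urlencode({"user_name": username, "password": password})
    return urllib.request.Request(LOGIN_URL, data=form.encode("utf-8"), method="POST")


def fetch_account_page(opener, username, password):
    """Sign in, then read a page that is only visible to the signed-in user."""
    with opener.open(build_login_request(username, password)) as response:
        response.read()  # any session cookie is now stored in the opener's jar
    with opener.open(ACCOUNT_URL) as page:
        return page.read().decode("utf-8", errors="replace")
```

The important design point is that both requests go through the same opener, so the session cookie set at sign-in is sent again when the account page is requested.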

Related

Google oAuth login - How to allow only whitelisted emails to log into my app

I'm creating a web app and decided to use google authentication for its ease of use.
Thing is, I want to let only certain emails log in. All other emails should not be able to log in!
How do I do that?
I'm aware that I can send the auth token to the backend, verify it with google's library, and then filter the emails but... there should be an easier way, I hope?
You need to consider how OpenID and OAuth work. You are technically forwarding the user to Google's login page. They log in and approve any APIs on Google's site, so you have no way of knowing who they are until they are redirected back to you.
Nor is there any way to limit which users can log in to your client directly in the Google Developer Console for your project. To be honest, I think that would be really hard for them to administer.
Your best bet is to check the user's email when they return and decide at that time whether they may log in. It would also be a good idea to revoke any credentials Google returns to you for users you don't want to have access.
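The check on return can be quite small. The sketch below assumes your backend has already verified the ID token's signature and decoded it into a dict of claims; `email` and `email_verified` are the claim names Google uses in its ID tokens, while the allowlist contents and the function name are made up for illustration.

```python
# Hypothetical allowlist; in practice this would live in config or a database.
ALLOWED_EMAILS = {"alice@example.com", "bob@example.com"}


def may_log_in(claims, allowed=ALLOWED_EMAILS):
    """Return True only for a verified email address on the allowlist.

    `claims` is assumed to be the dict of claims from an ID token whose
    signature has already been verified on the backend.
    """
    email = claims.get("email")
    if not isinstance(email, str) or claims.get("email_verified") is not True:
        return False
    # Compare case-insensitively so "Alice@Example.com" matches the list.
    return email.strip().lower() in {entry.lower() for entry in allowed}
```

For example, `may_log_in({"email": "mallory@example.com", "email_verified": True})` is rejected, and that is the point at which you would also revoke whatever credentials came back for that user.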

Implementing Account Linking - queries

I asked a Google Dev Advocate for help, as I'm struggling to implement Account Linking in my Google Actions app. He sent me a link to a documentation article I had already read and suggested I also consult Stackoverflow. Having already done that too, and having struggled to find exact answers to my questions, I've decided to link to the doc article here, add all my queries, and send this back to the Dev Advocate in the hope of getting more clarification, and as a reminder that documentation may be read by complete newbies to the topic and that nothing should be taken for granted.
This is the article I am referring to https://developers.google.com/actions/identity/account-linking
My queries below:
What is the difference between implicit and authorization code flow. In the article "Authorization code" is chosen, why?
Although I have found on Stackoverflow where to get your Client ID and secret, don't you think it would be good to add a link in the article?
Authorization URL - this is something I haven't found a clear guide for. Some Stackoverflow tickets report that two Google OAuth URLs can be used (for the Authorization URL, enter https://accounts.google.com/o/oauth2/v2/auth; for the Token URL, enter https://www.googleapis.com/oauth2/v4/token), but a recent change to Google policy suggests
When implementing account linking using OAuth, you must own your OAuth endpoint
So I'm now extremely confused at what I should put in the Authorization URL and Token URL - why isn't this documented in a more basic and clear way? I've also read it needs to be served over HTTPS, what if you're working on local and on a pet project which isn't commercial and you won't be able to pay for HTTPS?
What is Seamless Account Linking and why isn't this explained and documented?
If your app supports seamless account linking
Where should we whitelist this?
Whitelist the following redirect URI: https://oauth-redirect.googleusercontent.com/r/
What are your OAuth 2.0 client configuration details? Where can they be found?
In the expanded OAuth 2.0 form, fill out the fields with your OAuth 2.0 client configuration. When filling in scopes, ensure they are space delimited.
I don't see the Discovery tab on my Oneplus 3T Google App, where else can I find it?
Open the Google app and go to the Discover tab.
This is where I get stuck - as many other people on Stackoverflow I get "The account is not linked yet" error. Maybe resolving the issues above will resolve the Account Linking error?
Invoke your app. Since it's the first time invoking the app with your Google account, the Assistant notifies you that you must link your account.
In addition to those questions, I also have the following:
I would like to get access to the user calendar and user basic info so I've added profile, email and https://www.googleapis.com/auth/calendar could you confirm these are correct?
Thanks and please remember documentation should be for everyone!
Documentation is for all developers. However, keep in mind that some of the tasks might require you, as a developer, to learn more than you currently know. Coming to SO is one of the ways to do that, but there are many other avenues that supplement that.
Good original documentation does, however, help. Google's docs are currently just bad - they used to be terrible.
Update - Before we begin, let me answer a question you suggest, but don't actually ask.
Why do I need an OAuth server at all?
First of all - you don't.
Think of your service like a website and the Assistant as a browser. For lots of websites, they don't need to know who the user is in order to use the website. There are lots of things the website can do without a user account at all.
In some cases, it is useful to know that the user visiting your website has visited you before. Frequently, you'll use a cookie to track users like this.
The Assistant has an equivalent to this, although it is slightly different. The Assistant sends an anonymous UserID with each message to you. This UserID is only for this user and for your Action - it isn't re-used for any other Action or any other user. So if you track it, you'll know when the user returns. Like cookies, users can reset or clear it, but for the most part, this is durable.
But sometimes, you might need a person to log in to an account on your website. This is what the OAuth server is meant to accomplish - give users a way to log into your Action. OAuth is a pretty standard way to let people log into services these days, although the intent is really to authorize a client to act on your behalf.
The latter is really what OAuth is doing in this case - your user is authorizing the Assistant to act on the user's behalf when talking to your Action.
(Update - There are now ways to avoid having to setup an OAuth server at all in some circumstances. See the update at the bottom of this answer.)
Now, back to your questions.
What is the difference between implicit and authorization code flow.
These two terms are defined more carefully by the OAuth2 standard, but in short - both of them let a client (a server remote from yours - the Assistant in this case) get a user to grant it certain rights on your server.
The Implicit flow is simpler, both in what you need to set up and in what the two servers exchange, but it assumes that once you issue a token, it is valid indefinitely. This brings with it a slightly higher risk that someone can get this token and use it to impersonate the Assistant.
The Auth Code flow is more complex (although not by a lot) and addresses the risks in several ways. One way is that some transactions are done server-to-server instead of passing through the user's device, and those transactions include a shared secret. Another way is that the auth token has a limited lifetime, and therefore a limited window of exposure, while a refresh token can be used to get a new auth token.
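To make the two server-to-server steps of the Auth Code flow concrete, here is a small Python sketch of the form bodies a client would POST to a token endpoint. The parameter names follow the OAuth 2.0 specification (RFC 6749); the function names are made up, and some servers expect the client ID and secret in an HTTP Basic header rather than in the body, so take this as one possible shape rather than what the Assistant literally sends.

```python
import urllib.parse


def code_exchange_body(code, redirect_uri, client_id, client_secret):
    """Form body that swaps a one-time authorization code for tokens."""
    return urllib.parse.urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,  # shared secret, sent server to server only
    })


def refresh_body(refresh_token, client_id, client_secret):
    """Form body that trades a refresh token for a new short-lived token."""
    return urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    })
```

In the Implicit flow neither request exists: the token is handed over directly in the redirect, which is why there is no shared secret and no refresh step.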
In the article "Authorization code" is chosen, why?
Most likely because it is more secure for a minimal amount of extra work. Most of the security issues it addresses, however, are most visible in more open environments such as browsers and mobile apps - they're not as big a risk with the Assistant. However, for places that need to set up an auth server, going with the more secure route has benefits in other areas.
Most Google APIs use the Auth Code flow or variants of it. (Although most use it from the client side, not the server side, which is what Account Linking for Actions requires.)
Although I have found on Stackoverflow where to get your Client ID and secret, don't you think it would be good to add a link in the article?
Well... except that SO answer is no longer valid. (And, apparently, was never intended to be valid.) As you noted in your next question, Google has clarified their policy, which requires you to own the OAuth endpoints you use for an Action. They have, furthermore, made technical changes that prevent you from using Google's endpoints. (And I've updated the answer to say so.)
While the "Configure cloud project" part is correct, and describes how you set up credentials to be used with the Calendar API, you cannot use Google's OAuth endpoints to do the auth for your own project.
So I'm now extremely confused at what I should put in the Authorization URL and Token URL - why isn't this documented in a more basic and clear way?
Because this is a point where they're making an assumption that isn't very clear in the documentation. It is suggested where they say "Step 1. Configure your server" that you have an OAuth server. If you have an OAuth server already, then you should know what your server's Authorization and Token URLs are.
If you don't, however, this does get further explained where they talk about determining what the endpoints will be for an OAuth service you're creating.
I've also read it needs to be served over HTTPS, what if you're working on local and on a pet project which isn't commercial and you won't be able to pay for HTTPS?
Yes, it has to be HTTPS. This is a requirement of OAuth, and good practice when you're sending tokens that can be used to do things authorized by a user. It sounds like you want to be able to issue API calls to a Google server, and if those tokens got out (or tokens that could be used to access the same resources), then your Google Account could be compromised.
You have a lot of options here for your local or pet project development. Just to list a few:
You can use Firebase Functions. For projects on a "pet" level, they're free. (And if your Action gets a little popular, Google Assistant will give you credits that should pay for a modest level of use.)
You can get SSL certificates for your server for free using Let's Encrypt.
Since your server has to have a public address, you can create a tunnel using ngrok, which also provides a public HTTPS address you can use. This probably isn't good once your project gets out of the "personal testing" stage, but is a good tool to start with.
There are other approaches, of course, but these are a few good tools that you can use depending on your needs.
What is Seamless Account Linking and why isn't this explained and documented?
It is. Except in the documentation they confuse things by also calling it "Streamlined Identity Flow".
On the Account Linking Overview page it says "For more information, see Streamlined Identity Flows about how to configure your OAuth server to support the seamless identity experiences on the Google Assistant."
This takes you to a page talking about how this flow builds on top of the other two identity flows and has some additional requirements, but should make the user's experience better.
However... don't worry so much about this. If you're just doing this for fun, the normal identity flows aren't that much of a burden. If you're doing this for a commercial product - get the normal flows working first.
Where should we whitelist this?
Whitelist the following redirect URI: https://oauth-redirect.googleusercontent.com/r/
This is one of the underlying concepts of OAuth - as part of the communication between the client server and your server, the client will tell you which URL to redirect to when you're done authenticating the user and getting their permission to issue a token.
The OAuth spec requires you to compare that redirect URL to a URL that has already been set up for that client. It does not specify how you set that up. So Google is saying "When you set up the OAuth server for our client - here is the URL that we will ask you to redirect to."
Google can't answer where to whitelist this except "in your OAuth server". Most OAuth servers have a way to configure multiple clients, and this is one of the values you'll set for that client. (The ClientID and ClientSecret are other values, but Google lets you determine these values and tell it as part of the configuration for Account Linking in the Action Console. Which is your next question.)
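If you are writing the OAuth server yourself, the "whitelist" can be as simple as a per-client set of allowed redirect URIs that you check before redirecting. The following Python sketch assumes a hypothetical in-memory client registry and a made-up client ID; the registered URI is the one quoted above, and you should register the exact value Google's documentation gives for your project.

```python
# Hypothetical client registry for an OAuth server you run yourself.
REGISTERED_CLIENTS = {
    "assistant-client-id": {  # made-up ClientID that you choose and give to Google
        "redirect_uris": {"https://oauth-redirect.googleusercontent.com/r/"},
    },
}


def redirect_uri_allowed(client_id, redirect_uri, clients=REGISTERED_CLIENTS):
    """Return True only if this exact redirect URI was registered for the client."""
    client = clients.get(client_id)
    if client is None:
        return False
    # Exact string comparison: prefix or substring matching is easy to abuse.
    return redirect_uri in client["redirect_uris"]
```

A request whose `redirect_uri` fails this check should be rejected outright rather than redirected, because redirecting would hand the code or token to an unregistered URL.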
What are your OAuth 2.0 client configuration details? Where can they be found?
Again, this depends on your OAuth server and on what you want to prompt the user for when they try to log in to your server. The ClientID and ClientSecret are two such parameters. The OAuth scopes that the Assistant should request access to are other parameters. But these are up to you - because it is your server they are trying to get access to.
I don't see the Discovery tab on my Oneplus 3T Google App, where else can I find it?
That documentation looks incorrect. I think that should say that you should open the Google Home app on your mobile device.
It is also possible that it does mean the Google app, in which case your phone may not support the Google Assistant as part of the Google app. You can download the Google Assistant separately, if necessary.
However - use the simulator to test initially. Although it requires a few manual steps, they are easy to follow and help you trace things.
This is where I get stuck - as many other people on Stackoverflow I get "The account is not linked yet" error. Maybe resolving the issues above will resolve the Account Linking error?
Well, your account isn't linked yet. {:
It sounds like you haven't set an auth server for your Action. Until you get an auth server working, the rest isn't going to work.
I would like to get access to the user calendar and user basic info so I've added profile, email and https://www.googleapis.com/auth/calendar could you confirm these are correct?
First of all, keep in mind that this whole process is to link the user's Assistant account to their account on your service. You may have information in their account (on your service) that you use to do things - such as access Google resources or access other things that you know about them.
This is not directly a way that you gain access to the Google account that they're using to talk to the Assistant.
In order to access a user's resources on Google's servers, you'll need them to grant your server permission to do so. That is done using OAuth, again, but this time you're the client. Users will need to go to your server, you'll redirect them to Google's server to authorize you, and they'll be redirected back to your server with codes that you will need to store. This is all done outside of the Assistant and its Account Linking system.
That said, for what you want, profile and email are fairly normal scopes to request. The Calendar API Documentation confirms that the https://www.googleapis.com/auth/calendar scope is what you need to access that API. (Keep in mind that this URL is not one that you'd use in a browser or that you'd go to to access anything - it is a uniquely identifying name only.)
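As a sketch of the "this time you're the client" step, the Python snippet below builds the URL your server would redirect the user to. It uses the Google authorization endpoint quoted in the question and the three scopes discussed here, joined with spaces; the function name and the `state` handling are assumptions, and `access_type=offline` is included on the assumption that you want a refresh token back.

```python
import urllib.parse

GOOGLE_AUTH_URL = "https://accounts.google.com/o/oauth2/v2/auth"
SCOPES = ["profile", "email", "https://www.googleapis.com/auth/calendar"]


def build_consent_url(client_id, redirect_uri, state):
    """Build the URL your server redirects the user to so they can authorize you."""
    params = {
        "response_type": "code",      # Auth Code flow
        "client_id": client_id,       # from your own Google cloud project
        "redirect_uri": redirect_uri,  # must match what you registered with Google
        "scope": " ".join(SCOPES),    # scopes are space delimited
        "state": state,               # opaque value you check when the user returns
        "access_type": "offline",     # assumption: ask for a refresh token as well
    }
    return GOOGLE_AUTH_URL + "?" + urllib.parse.urlencode(params)
```

When the user comes back to `redirect_uri`, the `code` query parameter is what your server exchanges for tokens and stores against their account on your service.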
Update to reflect API changes: Since this answer was originally written, Google has introduced Google Sign In for Assistant, which lets you avoid having to set up your own OAuth server when you are willing to tie operations to the same Google account the user uses on the Assistant. If the user permits, you can get simple user profile information this way, and you can then leverage this to get access to other APIs (again, with the user's permission). See this SO answer that discusses how to use this to access Google's other APIs.
Thanks and please remember documentation should be for everyone!
From my conversation with Google's Assistant team, they are looking to make documentation easier, and hopefully they will take many of your suggestions to heart. I hope these clarifications have helped you (and anyone else who gets here with similar problems.)

Login functionality from my platform INTO other sites?

I am creating a software-platform in Symfony3 (a PHP framework). In this platform, there needs to be a page with links to approximately 20 different websites (among others: Google Analytics, Google TagManager, Rocket.Chat)
My goal is to enable the following functionality:
Upon having clicked on one of these links that are in my platform, I want to redirect the user to that particular site, while having them logged in.
So for example:
There is a 'link' to Google Analytics and when being clicked, the user needs to be redirected to their campaign within Google Analytics.
Note: The username/password that will be used to perform this login will be the same throughout all the different sites. We have a database with these credentials for each user, which we want to use to login to those different services.
I was thinking about using cURL, but this would not be a universal technique (and it would raise many security concerns).
Another option would be to use the Google API to perform the authentication, but I could not find such functionality. (Normally it is the other way around: logging into your platform by means of a Google account.)
I look forward to your input!
Kind regards,
Jeroen

OAuth2 Troubles with PicasaWeb API

I have spent the last couple of nights bashing my head against the wall amid a sea of conflicting, out-of-date documentation and semi-helpful blog posts that were/are relevant to what I am trying to do.
Essentially I want to write a wee personal app to download my images from PicasaWeb/Google+ and store them on my local hard disk.
I have managed to do the following:
Figured out the GData API for the appropriate request to get private album data (works fine in my 'google-logged-in' chrome browser)
Got the correct private data back from my GData URL with the token generated by the OAuth playground.
Managed to get an OAuth2 token back from https://www.googleapis.com/oauth2/v3/token using JWT.
However - when I try the access token I generated myself, I get back a forbidden response with the message 'Not authorized to view access private'.
I am pretty stumped - my only guess is that my service account configured in google developers console doesn't actually have access to my personal google stuff like google+ photos. When I look in there I can see the OAuth playground has access. How do I give my app access - and do I need to in this scenario?
Thanks in advance,
Robert
"my only guess is that my service account configured in google developers console doesn't actually have access to my personal google stuff".
Totally correct.
I guess I see 2-3 questions per month on SO where people have made the false assumption that a Service Account is some kind of proxy to their Google Account. It isn't. It's a completely new and independent account.
The two approaches you can take are:-
Share the items to the Service Account so it has permission to access them.
Give your app direct access to your Picasa account. See How do I authorise an app (web or installed) without user intervention? (canonical ?) for the steps involved.

Access google drive files via api without logging in?

I am trying to make a webpage that can display information about documents on my google drive. For example I would like to display the titles of all my google documents on a webpage. I don't want the user to have to be logged into a google account, and I don't want to have to authorize anything (or the user to authorize anything). I just want the user to be able to see what I display - in a read only format - when they navigate to the page. The user will have no chance to edit or upload or delete anything, they can just view the info I display.
Is there a way to get files from google drive (via the API or any other way) possibly without using oauth 2.0? I've looked through the api docs and even coded up the sample apps, but all of them have a step that says, "Go to this URL, click Allow, enter the code" then you get access. These steps shouldn't be necessary. I just want to download the file and be able to manipulate it (either in memory or as a stream) then display something about it.
Also, I may misunderstand how OAuth 2.0 works so if that seems like the case, any helpful information would be much appreciated. Thank you.
You don't need to authenticate your visitors with Google, but you do need to authenticate yourself, so your web app can retrieve data from your personal Drive.
Get an access token and a refresh token for yourself, store them, and authenticate your requests with them. If you're using one of our client libraries, most of them refresh access tokens once they have expired. See Using OAuth 2.0 for Web Server Applications for more details; the OAuth 2.0 Playground can help you understand how to get these tokens.
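If you are not using a client library, the two pieces you have to handle yourself are deciding when the stored access token needs refreshing and attaching it to each request. Here is a minimal Python sketch with made-up function names; the Drive v3 files endpoint is shown only as an example of a URL you might call, and the 60-second leeway is an arbitrary choice.

```python
import time
import urllib.request

# Example endpoint only; use whichever Drive API URL your page actually needs.
DRIVE_FILES_URL = "https://www.googleapis.com/drive/v3/files"


def needs_refresh(expires_at, now=None, leeway=60):
    """Return True if the stored access token is expired or about to expire.

    `expires_at` is the Unix time you saved when the token was issued;
    `leeway` refreshes a little early to allow for clock drift.
    """
    now = time.time() if now is None else now
    return now >= expires_at - leeway


def authorized_request(url, access_token):
    """Build a request that carries your own stored access token."""
    return urllib.request.Request(url, headers={"Authorization": "Bearer " + access_token})
```

Your page would call `needs_refresh` before each batch of requests, use the stored refresh token to get a new access token when it returns True, and then send `authorized_request(DRIVE_FILES_URL, token)` from the server, so visitors never see a Google login.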