Rails: Detecting bot IPs to get around shortener pings - ruby-on-rails-3

I have an application that logs clicks by users. The problem is, these clicks are being pushed through Twitter, which shortens every single link with t.co. Because of this, Twitter appears to hit the link between 7-15 times from different IPs, probably to do things like logging and SPAM protection. The issue is that this logs 7-15 "clicks" on my app that didn't come from actual users.
I am wondering if there is a way to detect if a visit is coming from an actual user or is simply being cURL'd or something of the sort from a bot or spider.
The one method that seemed it could have worked was using http://www.projecthoneypot.org/ 's API to see if the IPs hitting my site are coming from known bots. I found a gem to help (http://cl.ly/GlT8) but kept getting a NET DNS error while trying to use it.
I am fresh out of ideas. Would really appreciate any assistance!

Twitter should set its User-Agent: http header properly so you can filter those out. This can be forged of course but it's a start.
You can obtain the header in rails with request.headers["User-Agent"].

Related

CloudFlare failed to purge dynamic content to a blog post

I run my blog that linked with Cloudflare when I create a new post I don't see the current post that affected my site. I tried many methods that I learned from sites like trying to use a page rule to bypass Cloudflare cache but it didn't work. Also, I turn it off auto minify js, CSS, and Html still does not work. my blog still shows the oldest post that is from since 5 days ago. When you log in WordPress dashboard panel you will see the current posts but for the normal visit will see the cached posts that remain static all time
here is my Cloudflare setting
Page Rules Settings
Page Speed Settings
Page Caching Settings
I need your help from everybody who knows about this problem and how we can solve it
Thanks....!
Looking at the site now, looks like you have maybe disabled Cloudflare? I'm seeing the latest posts, and there are no Cloudflare response headers coming back.
For troubleshooting caching issues, one of the most useful things you can do is to inspect the response headers (using Chrome Dev Tools or similar 'Network' tab). First step is to identify which request is responsible for the cached content (a document, or an AJAX call, etc.)
From there, you can look at the response headers to see why it is behaving this way, specifically you'll want to check the Cache-Control and CF-Cache-Status header. More info here - https://developers.cloudflare.com/cache/about/default-cache-behavior
I fixed this problem because all performance is managed by Ezoic. I tried to purge everything on Ezoic and all things are working perfectly.

Vue + Flask Gmail API

I am attempting to build a webapp using Vue for the frontend and Flask for the backend that reads in the users Gmail emails.
Desired functionality:
User clicks a button to "Link Gmail Account" on the frontend
User is authenticated with gmail Oauth2 and confirms. Once confirmed, they redirect back to the page they were on
Once the user confirms, the backend queries gmail to get all of the users emails and returns the data to the frontend
I have been trying to use https://developers.google.com/gmail/api/quickstart/python as a starting point, but I cannot authenticate the user -- I keep getting a redirect uri mistmatch error with a random port (I am doing this locally so have set the redirect uri to be the localhost port where I access my project).
I think I am doing something fundamentally wrong or not using the Gmail API in the correct way, but have searched all over google and youtube to no avail.
Specific things that I think could be causing an issue:
What is the best overall strategy to implement this? Should I use the Gmail API in Python or Javascript? Right now, the use clicks the "Link Account" button which calls an API in my backend which then runs the code in the Python Quickstart guide.
What kind of google project should I set up? I currently have my credentials configured for a "web application"
What should I put as the redirect uri? I am using localhost but am unsure exactly what to put here (I have tried http://localhost, http://localhost:5000, http://localhost:5000/, http://localhost:5000/emails [this is the url I want them to return to]). No matter what I put, I keep getting a redirect uri mismatch and it says the uri it is looking for is http://localhost:[random port]/
I would appreciate any help on how to approach achieving this. Thank you!
Depending on what you are going to use Gmail API for, you must select the device or category. In your case, as it is a website it should be set to "Web Application".
Also, you should be using the following redirect URI: http://localhost/emails/. You should not include the port number and you should be using trailing slashes (adding the last / at the end). Note that the redirect URI you set up in your backend must be an exact match of the one you have set up in your Credentials Page. Also please note that it might take some minutes to update this URI.
Moreover, this is a guide on how to create a Sign In button that will authorise your users that I believe will be useful for you.

SEOstats API Google Pagerank Blocked by Google

I wrote a php script to get all the urls' pageranks of my company website. But get the following response from GetWithCurl($url) - $str. It looks like Google has some restriction to get the pagerank dynamically.
Is there any way to resolve it? or contact with google? but how. Thank you!
Sorry...GoogleSorry...We're sorry...... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now.See Google Help for more information.© 2009 Google - Google Home
Well.. That is not an issue with SEOstats. The problem is that Google detected that you send automated requests to it, which is against their Terms of Services!
You should be able to "solve" this by getting a fresh IP from your provider (turn router off/on) or sending your requests through proxies. Anyway, you must decrease your request frequency to avoid getting blocked again!
See: https://github.com/eyecatchup/SEOstats/issues/33

Azure Websites Custom SSL ASP.Net MVC Workaround

Currently Azure Websites don't allow custom SSL certificates, but they have wildcard SSL enabled for the *.azurewebsites.net domain. I need a secure login form for my web app, but with no custom SSL, it appears that I'm SOL.
Is there any kind of workaround for this? Would it be possible somehow to have a login form at https://mydomain.azurewebsites.net that creates a forms authentication ticket that will then work at http://mydomain.com?
Couple of months ago I had exactly the same problem i.e. application was built on Azure Websites, had to run on custom domain other than *.azurewebsites.net and had to allow secure login process.
Workaround for that we used was to embed an iframe (using secure protocol and .azurewebsites.net domain name e.g. https://oursite.azurewebsites.net/login) into non-secure page on custom domain (e.g. http://mysite.com/login). And entire login process was performed in the iframe.
There is one thing which you should be aware of, namely, lots of customers checks whether the page where they provide their credentials was using secure connection or not. In our case, secure iframe in non-secure page was causing lots of customer complains. Workaround for that problem was to put a message confirming that the login process uses secure connection. The message made some improvements, however, still certain number of customers complains remained.
I hope that will help.
This isn't really an answer to your question, but Microsoft are very aware that custom mapped SSL to websites is one of the most requested features for Azure websites and they have said they are working on it.
Scott Hanselman himself confirms it here
In the meantime, Tom's answer is a perfectly valid workaround.
One thing I would be very wary of though is with something Tom brings up: the security warning that the browser will present. You'd be amazed how many people freak out when they see that message and don't go any further! We have a fairly active ecommerce site and there have been occasions where we have accidentally used a none secure image path on an SSL page and we have always received emails from customers asking if our site has been hacked or similar!
The disclaimer that Tom mentions is a good idea, but I think it will still put some people off.
I am working directly with the WAWS team right now to produce some public guidance for this. A GitHub repository with the requirements is currently being evaluated by the team (I sent it over to them literally 1 hour ago). Hopefully, the solution will be approved and made public within a few weeks.
I can say this - the workaround won't be fully supported or much custom guidance given on its usage aside from the repository and accompanying documentation. SSL is, literally, the #1 priority for the product, and hundreds of people are working insane hours to make it happen for everyone. This workaround should also be considered temporary, as you'll no longer need it once the full SSL functionality is launched.

How Does Google Global Login Work?

Whenever I login to one Google service, I am automatically logged in all their other websites on different domains.
What I want to know is how they are able to access the disparate cookies and sessions that belong on another domain.
I tried searching online but I couldn't find any information. I could probably pull out firebug and try to find out but I am sure someone here knows.
A Google Login works like this:
1) You login, normally at a login page that is under the Google.com/accounts domain.
1a) If you aren't on the Google.com/accounts domain, it is going to forward you there after you post the form. This can be found on sites like Blogger.
Once you arrive at the Google.com/accounts domain, they do two things
2) They set a cookie(s) that is specific to the Google.com/accounts domain, that are also only able to be sent over a secure connection. This is to verify your identity later on.
I say multiple because there are several cookies bound to the google.com/accounts domain. I believe that one of these is to make sure that all doesn't fail if secure connections aren't allowed
3) They set a cookie that spans all the domains using .google.com as their domain, because this will make the cookie available to any domain.
4) They forward you back.
5) If it is a site on a different domain, like blogger, they send along an authorization key in the URL. The page sees it, verfies it, and sets the cookie for a different domain. A technique like this can be seen using Google's Oauth.
Here is where that Secure Cookie comes in.
If you notice, whenever you go to a site after you close your browser, they forward you to the google.com/accounts path, where they reverify you under a secure connection, and then reset the subdomain-wide cookie. Then they send you back.
Furthermore, some sites like Google Adsense use the same technique as Google.com/accounts uses, by making a secure cookie on a specific path, and then using more global cookies to allow greater access.
Some of this is guessing, but given what a non-insider can see, I believe that is close to the truth.
Note: I literally spent like an entire month just browsing from Google Site to Google Site seeing how they did stuff. By upvoting this post, you are decreasing the sadness I have for having no life