Plagiarism system using google - google-custom-search

I am creating an Arabic plagiarism detector. I need to check a paper against the web, and I am using Google Custom Search to retrieve results. However, Google offers only a limited number of queries (100 per day for free, or you can pay for a maximum of 10,000 per day, which is not enough).
I wonder whether there are other ways to search the web for specific content, maybe Google for Business or another search engine.
Or should I create my own crawler?
I need a solution to this.
Any ideas?
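To judge whether a daily quota is enough, it may help to estimate how many queries one paper generates. The following is a minimal sketch, assuming the paper is split into fixed-size word windows and each window is sent as one search query. The function names and the window size of 8 are illustrative assumptions, not part of any Google API.

```python
import math

def phrase_queries(text, window=8):
    """Split text into consecutive word windows; each window is one query.

    The last window may be shorter than `window` words.
    """
    words = text.split()  # whitespace splitting also works for Arabic text
    return [" ".join(words[i:i + window]) for i in range(0, len(words), window)]

def days_needed(num_queries, daily_quota):
    """Number of days required to send num_queries under a daily quota."""
    return math.ceil(num_queries / daily_quota)

# Example: a 5,000-word paper checked in 8-word windows
paper = " ".join(["word"] * 5000)
queries = phrase_queries(paper)
print(len(queries), "queries,", days_needed(len(queries), 100), "days at 100/day")
```

Under these assumptions a single paper already exceeds the free quota several times over, which is the constraint the question describes.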

Related

Google Maps API and costs. Beginner question

First of all: I am a total beginner, especially with Google Cloud Platform.
I am building a real estate website whose flats are imported into the CMS via APIs. My client uploads and manages the flats in dedicated software.
Every flat also includes a map showing its position, which I need to render via Google Maps.
The flats (query) are updated every night. More than 120 flats are updated daily.
So far everything is fine and Google Maps works properly, but I have realized that the Google Cloud Platform costs are increasing drastically.
Is there a way I can limit this? I only need to display the position of the available flats in the front end (and back end).
Many thanks in advance!
You can use the Google Maps API for free within your first 90 days of registration. After that, you are billed proportionally to how many requests are sent. You will probably use the Static Maps API if you are just going to load the flat's location. Here is more pricing information.
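As a rough illustration of the Static Maps suggestion above, here is a minimal sketch that builds an image URL with a single marker at a flat's position. The helper name, default zoom, and default size are assumptions chosen for the example. The `center`, `zoom`, `size`, `markers`, and `key` parameters are the ones the Static Maps API accepts, but check the current documentation and pricing before relying on this.

```python
from urllib.parse import urlencode

STATIC_MAP_ENDPOINT = "https://maps.googleapis.com/maps/api/staticmap"

def static_map_url(lat, lng, api_key, zoom=15, size="600x400"):
    """Build a Static Maps image URL centered on one flat (hypothetical helper)."""
    params = {
        "center": f"{lat},{lng}",   # map center
        "zoom": zoom,               # assumed street-level default
        "size": size,               # image size in pixels, WIDTHxHEIGHT
        "markers": f"{lat},{lng}",  # one marker at the flat's position
        "key": api_key,
    }
    return f"{STATIC_MAP_ENDPOINT}?{urlencode(params)}"

# The resulting URL can be used directly as the src of an <img> tag.
print(static_map_url(45.0, 9.0, "YOUR_API_KEY"))
```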

Requirements for beginner using twitter api to retweet specific tweets

I am new to the Twitter API.
I want to search for tweets that contain 2 specific terms and 1 specific hashtag, and then retweet them from my account in order to consolidate all the tweets.
Do I need to have a developer account?
Should I look for an existing app (I prefer one that is free or open source), or can I do this with the Twitter API as a regular user?
Any tutorials or instructions are greatly appreciated. TIA.
I have applied for a developer account, but I don't know how long it will take. I also don't know what the criteria are for being granted one.
I found different kinds of "retweet" applets on ifttt.com. I implemented one of them, and it accomplished what I wanted, though not perfectly, and there was no documentation for customizing its functionality.
I couldn't find information anywhere about using the Twitter API without a developer account, so I applied for that type of account. They emailed approximately 3 times to get more information about my use case, purposes, intended use, and what I intend to develop. My application was approved within approximately 48 hours.
I will update this answer if there is more information I think might be valuable to share.

Accessing google custom search

I am building software that processes Google search results for a linguistic purpose. It runs multiple searches at a fast rate and opens links (10 to 100 links per search) at a fast rate to access the websites and extract text.
I will be accessing Google through the Custom Search API. The Custom Search terms of service appear to disallow this type of access because they say that the results cannot be modified or crawled; however, I have found an interpretation that allows some wiggle room for my software. Now my question: if I go ahead and build the software, will Custom Search block my access?

Measure how hot a topic is on Twitter

What kind of service should I use to measure how hot a topic is on Twitter, and how hot it has been in the past?
I thought about:
The Twitter API (https://dev.twitter.com/rest/reference/get/search/tweets), which lets me run searches that return up to 100 tweets each. So in this case I have to make multiple calls to determine how many tweets there are. Is that correct?
TweetReach, which gives reports like this: https://tweetreach.com/reports/16000571, but the cheapest plan is $300/month.
With the Twitter API, you have a few options, but none of them may be exactly what you want, and none of them can go back very far into the past. You would have to either compile that information yourself, or use an external service like the one you mentioned.
Using the search API, you can only get results from the past 7 days, and are limited to 100 tweets per request. You can also set result_type to popular to get the most popular tweets about that search term. Twitter does have rate limits, but the ones for search are relatively high. You can use 180 requests every 15 minutes for any user you have authenticated, plus 450 requests every 15 minutes for the app itself (completely separate from the user requests). So if you only use app requests, you can get 45,000 tweets every 15 minutes.
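The throughput figures above follow from simple multiplication, which can be sketched as below. The constants are the limits quoted in this answer; the function names are illustrative and this is only a back-of-the-envelope estimate, not a client for the API.

```python
TWEETS_PER_REQUEST = 100        # maximum results per search request
USER_REQUESTS_PER_WINDOW = 180  # per authenticated user, per 15 minutes
APP_REQUESTS_PER_WINDOW = 450   # per app, per 15 minutes

def max_tweets_per_window(requests_per_window, tweets_per_request=TWEETS_PER_REQUEST):
    """Upper bound on tweets retrievable in one 15-minute window."""
    return requests_per_window * tweets_per_request

def windows_needed(total_tweets, requests_per_window):
    """15-minute windows needed to fetch total_tweets at the given limit."""
    per_window = max_tweets_per_window(requests_per_window)
    return -(-total_tweets // per_window)  # ceiling division

print(max_tweets_per_window(APP_REQUESTS_PER_WINDOW))   # app-only requests
print(max_tweets_per_window(USER_REQUESTS_PER_WINDOW))  # one authenticated user
print(windows_needed(100_000, APP_REQUESTS_PER_WINDOW)) # windows for 100,000 tweets
```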
If you don't need to search for specific terms, you can get trending topics in different areas using trends. The available areas can be retrieved using trends/available. Searching for trends also gives you the tweet_volume of each trend over the past 24 hours. If you check the trends every 24 hours and save the volumes, you can build up histories of trending topics.
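The idea of saving trend volumes every 24 hours can be sketched as follows. This is a minimal example that assumes each trend is a dict with `name` and `tweet_volume` keys, as in the trends response, and that the volume may be missing for some trends. It does not call the API itself; the `record_trend_volumes` helper and the sample data are hypothetical.

```python
from datetime import date

def record_trend_volumes(history, trends, day):
    """Add one day's volumes to history, shaped {trend name: {ISO date: volume}}."""
    for trend in trends:
        volume = trend.get("tweet_volume")
        if volume is None:  # assumption: some trends carry no volume
            continue
        history.setdefault(trend["name"], {})[day.isoformat()] = volume
    return history

# Sample data standing in for two daily API responses
history = {}
record_trend_volumes(
    history,
    [{"name": "#example", "tweet_volume": 1200}, {"name": "#small", "tweet_volume": None}],
    date(2020, 1, 1),
)
record_trend_volumes(history, [{"name": "#example", "tweet_volume": 1500}], date(2020, 1, 2))
print(history)
```

Persisting `history` between runs (for example as JSON) is left out to keep the sketch short.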
Another option is the Streaming API. It only gives you current tweets, but you can use track to receive only tweets that match a set of terms, which you can then analyze.
Any external service, like TweetReach, will probably either cost you money or strictly limit the amount you can do with it unless you pay.
I'm the Social Media Manager for Union Metrics (we make TweetReach and lots of other things) and I just wanted to let you know that our free snapshots are built on the Search API which gives it those restrictions you've already discussed above, while our full snapshot reports can grab up to 1500 tweets for $20.
We do have more comprehensive Twitter analytics which I think you've already looked at, and those do backfill 30 days before tracking going forward. However you might have missed our new product Echo, which allows for a full, interactive search of the entire Twitter archive (you can see it in action here https://unionmetrics.com/product/echo-twitter-archive-search/) and is available through our Social Suite.
I understand if you don't have a large budget, and I completely understand the dilemma of cost of your time to build what you need vs. budget restrictions. Hope this helps at least let you know what else we offer!
Sarah A. Parker
Social Media Manager | Union Metrics
Fine Makers of TweetReach, The Union Metrics Social Suite, and more

Alternative to the deprecated google REST web search API

I have been using the Google Websearch API for over a year now. The service was deprecated in Nov 2010 but continues to provide results to date. More recently, Google has started to enforce the 1,000 queries (?) per day limit on this deprecated service. I swear, last month I made over 10,000 API calls in one day without any errors from the service (same IP, same API key).
So I guess my question is: has anyone found an alternative yet? I know Yahoo BOSS is pretty good, but I am working exclusively with Google for my projects. I do not mind spending money on this service either, as long as I can get 64 results from Google.
On that thought, how are services like Zoomrank able to bypass all Google limits? I have a subscription with Zoomrank and I can get daily rankings for all my keywords. Do they have a tie-up with Google, or are they just accessing some secret service I don't know about?
Some people have suggested the new Google Custom Search, but I don't know how that helps me search the web. Google CS is limited to the CSE you create and searches only within those engines. If I am looking for web results for Pizza, Google CS doesn't help me.
Thanks for your input. Much appreciated.
UPDATE: #ggez44 points to some official Google documentation of the solution described below here: http://support.google.com/customsearch/bin/answer.py?hl=en&answer=1210656
You can use the Google Custom Search Engine to search the entire web.
In brief:
Create a CSE that searches a single site (e.g. google.com)
In the CSE control panel's Basics section, set to "Search the entire web but emphasize certain sites"
In the Sites section, delete the single site that you added when you created the CSE
Full details here:
http://www.google.com/support/forum/p/customsearch/thread?tid=56c0bd92dda351b7&hl=en&fid=56c0bd92dda351b7000495e3f500d83f
Once that's implemented, you can enable billing in the Google API Console at a CPM of $5 (that is, $5 per 1,000 queries), up to a maximum of 10,000 queries per day.
Google API Console: https://code.google.com/apis/console/
Pricing: https://code.google.com/apis/customsearch/v1/overview.html#Pricing
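To make the setup and billing described above concrete, here is a minimal sketch that builds a Custom Search JSON API request URL and estimates the daily cost at the $5 CPM quoted in this answer. The helper names are hypothetical, and treating the first 100 queries per day as free is an assumption taken from the quota mentioned in the top question. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"
PRICE_PER_1000_QUERIES = 5.00  # USD, the CPM quoted above
FREE_QUERIES_PER_DAY = 100     # assumption: free daily quota is deducted first

def cse_request_url(query, api_key, engine_id):
    """Build a request URL; `cx` is the ID of the CSE set to search the entire web."""
    params = {"key": api_key, "cx": engine_id, "q": query}
    return f"{CSE_ENDPOINT}?{urlencode(params)}"

def daily_cost(queries_per_day, free_per_day=FREE_QUERIES_PER_DAY):
    """Estimated cost in USD for one day of queries."""
    billable = max(queries_per_day - free_per_day, 0)
    return billable * PRICE_PER_1000_QUERIES / 1000

print(cse_request_url("pizza", "YOUR_API_KEY", "YOUR_ENGINE_ID"))
print(daily_cost(10_000))  # cost of using the full 10,000-query daily maximum
```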