Best practice for joining AdWords API placement data with AdWords ValueTrack placement data? - api

We've been working successfully with the AdWords API (Version: 201708 -
Google Ads Python Client Library) for a good while building internal reports for our application. Until, that is, we hit placements…
I define placements as anywhere an AdWords ad is shown. The placement might be a domain, page, ad unit, app you name it! Placements are a very broad definition.
For our app to work for placements we need to join API spend data with activity on our website.
To do this we are running AdWords API reports and then collecting session data using AdWords ValueTrack parameters.
The ValueTrack parameters are easy enough, as there seems to be only 1 option: {placement}.
However, it's on the API where things get interesting, the API has numerous options for getting placement data. For example:
https://developers.google.com/adwords/api/docs/reference/v201708/AdGroupCriterionService.MobileApplication
https://developers.google.com/adwords/api/docs/appendix/reports/url-performance-report
https://developers.google.com/adwords/api/docs/appendix/reports/placement-performance-report#criteria
https://developers.google.com/adwords/api/docs/appendix/reports/automatic-placements-performance-report#domain
https://developers.google.com/adwords/api/docs/reference/v201708/AdGroupCriterionService
After spending some time going back and forth on the various options, and burning lots of dev time, we've come to the conclusion that there must be some best practice advice out there for joining placement data from the API and ValueTrack. One that works for all types of placements, including:
Websites
Apps
AdSense
Blogspot
AMP
An example of where we are running into a matching problem is "10060.android.com.nytimes.android.adsenseformobileapps.com"... this is a placement we see coming in from ValueTrack but has no match in any of our spend reports. (In fact there are many many adsenseformobileapps.com traffic sources for which there are no spend items).
Also seeing strings like "mobileapp::2-com.mobilesrepublic.appy". These show up on our spend side but only appear in our ValueTrack around 10% of the time. Some match. The vast majority don't.
A definitive workflow on this would be SO useful for ourselves and no doubt other users…
Thanks!

According to https://developers.google.com/adwords/api/docs/guides/valuetrack-mapping
the incoming ValueTrack placement should map to the following report fields:
PlacementPerformanceReport.Criteria
CriteriaPerformanceReport.Criteria
AutomaticPlacementsPerformanceReport.DisplayName
In addition to this I have also found this report useful:
UrlPlacementPerformanceReport.Domain and .Url
But I have found it is not so clear in practice. For one thing each of these reports return a slightly different subset of results..and none of these subsets exactly match the ValueTrack data set.
Here are the exceptions I have found:
subdomains
ValueTrack placements have urls with www on them... some of the time. None of the other reports do, so you will either have to strip www from ValueTrack or add www to your report data in order to match them. But be careful, other subdomains are preserved (like edition.cnn.com) and not all urls have a subdomain, so you can't just strip all the subdomains from Valuetrack and you can't just add www to all the urls in the reports. What I have found actually matches the best is the url field from the UrlPlacementPerformanceReport... but for this field you just need to strip everything after the / to get a best case matching subset. To use the other reports you would need to strip all the subdomain information from ValueTrack and sum the totals from those records. This means you would lose potentially useful data such as differences between espn.com, scores.espn.com, insider.espn.com and games.espn.com. Using the UrlPlacementPerformanceReport.url is the only way to preserve that info.
mobileapp::
ValueTrack reports on mobileapp:: placements. Many of the reports return these values too but I have found that each report just gives a subset of the whole. In particular the CriteriaPerformanceReport.Criteria report gives you many mobileapp:: values that none of the other reports do, but the other reports give you at least a few values that the CriteriaPerformanceReport doesn't. To be complete you would have to take a Union of the mobileapps: returned by the criteria performance report and another report such as the UrlPlacementPerformanceReport.url.
anonymous.google
ValueTrack provides sudomains to anonymous.google that look like a8122ac7e5da8e49.anonymous.google. If you want to match this information to your spend the only report that has this detail is UrlPlacementPerformanceReport.url.
adsenseformobileapps.com
ValueTrack provides detailed domains such as 1.iphone.com.localtvllc.fox2.adsenseformobileapps.com. None of the adwords reports can match this. The best you can get is a single summation record for the entire adsenseformobileapps.com group.

Related

Lists the most viewed pages Wikimedia projects (Wikipedia) with more than 1000 results

I have seen that there are various APIs and various tools that allow you to see the most visited pages of the Wikimedia projects such as Wikipedia, but all these services have a limit, they do not allow to show more than 1000 pages, while I would like to have the list of 5000-10000(or more) most visited pages in order of traffic.
these are all the services that I checked and with which I found this limit:
https://en.wikipedia.org/w/api.php?action=help&modules=query%2Bmostviewed
https://stats.wikimedia.org/#/en.wikipedia.org/reading/top-viewed-articles/normal|table|last-month|~total|monthly
https://pageviews.toolforge.org/topviews/?project=en.wikipedia.org&platform=all-access&date=last-month&excludes=
https://wikimedia.org/api/rest_v1/#/Pageviews%20data
I have also found services like https://quarry.wmflabs.org/ or
https://query.wikidata.org/ where you can run a query, technically perhaps through this service you could but I don't know the query to be performed to show the pages with most visits.
I also found an interesting article here: https://www.reddit.com/r/bigquery/comments/3dg9le/analyzing_50_billion_wikipedia_pageviews_in_5/ where it is explained that it is possible to use Google's BigQuery but it is an external service and before using it I wanted to know if it existed a simpler method.
If the REST API doesn't suit your purpose, you'd need to parse the raw data yourself. That's because all the tools you've linked just consume the REST API.
The raw data are available at https://dumps.wikimedia.org/other/pageviews/. There are two groups of files there. One starts with pageviews-, which lists the number of views of individual pages, the second starts with projectviews-, which lists the number of views of individual projects.
For your target, you need the pageviews ones. Download the files for your timespan, and then analyze them using a script.
The file is space-separated. Each row represents one page that was visited in that hour. First column represents the project (en is English Wikipedia, for instance), second is the page title (spaces are represented by underscores) and then there are total pageviews.
The technical documentation is available at https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Pageviews.

How to get the most searched words in Solr? [duplicate]

I'm trying to organize a solr search engine. I've already set up the misspelling system and the suggestions.
However I can't seem to find how to retrieve the top 10 most searched words/terms/keywords in solr/lucene. How can I get this? I want to display those on my homepage.
Solr does not provide this kind of feature out of the box. There is the StatsComponent, that provides you with all kind of statistics, but all of those are numeric only.
Depending on how you access solr (directly or via your own app) you could intercept all calls an log the query string. I did this in a recent project where I logged a queries to a database. If you submit all keywords to an other core on your solr server, you can faceting queries on your search terms as described by Hyque
You could use a facet for retrieving the Top X words like this:
http://yourservergoeshere/solr/select?q=*&wt=xml&indent=true&facet=true&facet.query=*&facet.field=message&facet.limit=10&facet.minCount=1
The value of facet.field depends on the field you like to search in. With facet.limit you'll (obviously) limit the amount of results to 10. You'll find the facet results at the end of the results, starting with "facet_counts"
Edit: I really should go to bed earlier. I didn't see the "most searched" in your question. Sorry for that.
Apache Solr does not provide any such capability as of today. There is a desire for this and a JIRA ticket corresponding to it. You can vote for it if you'd like to see it in Solr some day: https://issues.apache.org/jira/browse/SOLR-10359.
The stats component provides information around statistics, but it's mostly numeric in nature. You could parse server logs and come up with a way to build a Frequently Searched Terms (e.g. pump those logs in SiLK or Kibana for visualization).
If you have the ability to change the front end and add some javascript code to the UI or can intercept the search request and make an async or batch calls to APIs for tracking, you can use SearchStax Analytics that provides Search Analytics that tracks searches, clicks, cart actions, revenue, etc.

Can google analytics gave me stats for different url's on same domain

Scenario
I am working on a web application assume it www.abc.com which having a profile for all users
www.abc.com/username
and all users have a dashboard for controlling their profiles
Requirement
i have one analytics profile for www.abc.com but my requirement is
a to show stats to all users on their dashboard
can i get this by google analytics API
Visits
demographics
all traffic source
and keywords
i have integrated reporting by API on one of my project but that is for the domain . i am not sure for my requirement.
Resource-guru.com, what you can do is to pull all the data with page path dimension included, and then simply filter the results if the username string is found in the page path.
As for the second part of your question - you can get:
visits (metric)
traffic sources as well as keywords (dimensions, but remember (not provided) might make this useless report)
you can NOT get demographics data via API.
Hope this helps.
You can use the API to do this but remember your going to have an issue with the fact that you can only make 10k requests per day per view (profile).
The Demographics report displays age and gender. Those dimensions can be found under the Audience - Dimensions & Metrics Reference
ga:visitorAgeBracket
ga:visitorGender
ga:interestOtherCategory
ga:interestAffinityCategory
Traffic Source is just really just a mix up of ga:sourceMedium , ga:campaign maybe a few others depending on what information you want to display.
You may have issues with Keywords because due to ssl and trying to keep user info private Google has stopped recording this sometimes you get (not provided). But you can get that information from webmaster tools. Its just hard to merge it with your GA data then.
Keyword: The keywords that visitors searched are usually captured in
the case of search engine referrals. This is true for both organic and
paid search. Note, however, that when SSL search is employed, Keyword
will have the value (not provided).

know the page rank for certain key words

I want to know the page rank for certain key words against my page. For example I wrote "best movies 2012" my page does come, but in 30th to 50th page. I want to query in the result set Google gave against my keywords so that I can see the rank of my page and my competitors against typical keywords.
I think you may be confusing PageRank with positions. PageRank is an algorithm that Google uses to determine the authority of your site. This doesn't always affect the positions of certain keywords.
There are plenty of good programs and web services around that you can use such as
http://raventools.com/
Most of the good free web services have been closed down due to Google now limiting the amount of searches performed and charging for this data.
You could check out:
http://www.semrush.com
It's free but you have to register to get data.
There are several web services providing this functionality: http://raventools.com/ or http://seomoz.org/
Or, you can perform the task manually. Here is an example on how to query google search using Java: How can you search Google Programmatically Java API
You need to compare your webpage PageRank and website PR against those of the competition. The best indication we have of website PR is the HomePage PagRank.
Ensure that you do this for the appropriate Google domain - USA - Google.com - UK Google.co.uk etc
The technique is described in more detail on http://www.keywordseopro.com
You can repeat the technique for each keyword.

Programmatic Querying of Google and Other Search Engines With Domain and Keywords

I'm trying to find out if there is a programmatic way to determine how far down in a search engine's search results my site shows up for given keywords. For example, my query would provide my domain name, and keywords, and the result would return a say 94 indicating that my site was the 94th result. I'm specifically interested in how to do this with google but also interested in Bing and Yahoo.
No.
There is no programmatic access to such data. People generally roll out their own version of such trackers. Get the Google search page and use regexes to find your position. But now different results are show in different geographies and results are personalize.
gl=us parameter will help you getting results from US, you can change geography accordingly to get the results.
Before creating this from scratch, you may want to save yourself some time (and money) by using a service that does exactly that [and more]: Ginzametrics.
They have a free plan (so you can test if it fits your requirements and check if it's really worth creating your own tool), an API and can even import data from Google Analytics.