Total visitors at any given instant - Apache

I am working on site analytics and would like to know how to find the total number of visitors at any given instant. I am concerned only with the current time, not with past views. For now I am keeping the problem simple by not trying to identify unique visitors.
One approach I can think of is to count the HTTP connections open at any given instant, assuming that connections have a very short timeout.
My setup includes the Apache web server and the Tomcat servlet container.
I know it is still a generic question, but this use case is not specific to any particular language.

How about looking in the logs?
For example:
http://www.geekpad.ca/blog/post/get-unique-visitors-from-apache-log-file
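As a rough sketch of the log-based approach, the Python below counts the distinct client addresses that made a request within a recent time window. It assumes the access log is in Apache's Common or Combined Log Format. The function name `count_recent_visitors` and the 5-minute window are illustrative choices, not something taken from the linked post.

```python
import re
from datetime import datetime, timedelta

# Captures the client address and the bracketed timestamp of a
# Common/Combined Log Format line.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')


def count_recent_visitors(lines, now, window=timedelta(minutes=5)):
    """Return how many distinct client addresses appear within `window` before `now`.

    `now` must be a timezone-aware datetime, because the log timestamps
    carry a UTC offset.
    """
    seen = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        host, stamp = match.groups()
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        if timedelta(0) <= now - when <= window:
            seen.add(host)
    return len(seen)
```

This treats "one address active in the last few minutes" as one visitor, so it is only an approximation of the number of people currently on the site.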

Related

CrowdStrike API - How to pull all active hosts without specifying host IDs?

I am trying to write a query to get every active host on my network using the GET /devices/entities/online-state/v1 endpoint. However, this endpoint requires specific host IDs as a filter, meaning I would first have to query another API endpoint to get the host IDs and then hard-code them into my initial query. Furthermore, this endpoint limits the number of host IDs to 100 per query. I work on a network with tens of thousands of endpoints, so this is not practical. I know there has to be a way to grab every host and its associated status in bulk, but I am still very new to the CS API and do not know which function to use. If anyone knows a solution, whether through a CS API endpoint I am unaware of or through a syntactic correction to my original query (for example, a wildcard I could use), I would really appreciate some help.

Apache: IP addresses vs users

Suppose you want to analyze your access log files in order to check user activity. One common way is to assume that the same IP address corresponds to the same user.
However, several internet providers use CGNAT, which, briefly, allows multiple end users to share a common public IP address.
In that case, users behind a CGNAT who share the same public address might be confused with one another, which causes problems when calculating view counts and when banning disruptive traffic.
Question
Any alternative to mitigate that?
(Preferably using only Apache)
You could consider unique users to be unique combinations of IP + user agent. It would be a bit better, but it still couldn't differentiate users on the same IP who use the same browser on the same platform.
Other than that, you'd need to use a server-side scripting technology and track sessions. That would require cookies, though, which is not a big deal. You can't track static assets using that method, though.
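To illustrate the IP + user agent idea, here is a minimal Python sketch that counts distinct (address, User-Agent) pairs in an access log. It assumes the log is in Apache's Combined Log Format, where the user agent is the last quoted field. The name `count_ip_agent_pairs` is made up for this example.

```python
import re

# Combined Log Format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_ip_agent_pairs(lines):
    """Count distinct (client address, User-Agent) pairs in an iterable of log lines."""
    pairs = set()
    for line in lines:
        match = COMBINED.match(line)
        if match:  # lines in another format are ignored
            pairs.add((match.group("host"), match.group("agent")))
    return len(pairs)
```

As noted above, two people behind the same CGNAT address with identical browsers still collapse into one pair, so this only narrows the problem.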

Google maps geocoding returns empty string

Yesterday I created a new Google Maps Geocoding API key in the developer console. I need to get GPS coordinates from a server script. When I use the "which key do I need" helper in the console, it tells me I chose the right key. I also allowed the fixed IP of my server in the key settings.
Now, when I use "https://maps.googleapis.com/maps/api/geocode/json?address=MY_ADDRESS&sensor=false&key=MY_KEY", it returns an empty string.
When I use "http://maps.googleapis.com/maps/api/geocode/json?address=MY_ADDRESS&sensor=false&key=MY_KEY", it returns a warning that this kind of query must use https (which is consistent with the docs).
And finally, when I use "http://maps.googleapis.com/maps/api/geocode/json?address=MY_ADDRESS&sensor=false" (no https and no key), I get the relevant data, either in JSON or XML. As explained in the docs, this can be used with a limit of 2,500 geocoding requests per day, but the problem is that I have different domains on the same server (with the same IP) that geocode, and since Google tracks by IP to evaluate daily quotas...
So my question is: what am I missing when trying to geocode an address using https and the key?
The only thing that crossed my mind is: do I need to activate billing in Google Maps, even though I know for sure that I will never exceed the free quota of 2,500 queries per day, at least with the project the key is associated with?
Thanks in advance for any tip or advice.

Many requests in an API vs Many separate requests

I am making an application based on the Google Maps API. It requires requesting the distance between two cities. Now I want the distances between many cities.
So should I use a "for loop" and make many separate requests, or should I send all the city names in one link? Which one will work faster? And which one will be better?
For sure you should avoid sending multiple requests, because each round trip to a server takes time.
However, grouping many requests can also take a long time (both to send and to process on the server) and affect the user experience (long waiting time).
In your case I suspect that the "for loop" will not lead to a lot of data, and server-side processing will also not be too heavy, so sending a single grouped request should be the way to go.
You can use the "DirectionsService" service, which Google provides in the Maps JavaScript API v3.
You can find the distances between many cities: one request takes one origin, one destination and 8 waypoints (10 places in total) and returns JSON containing all the information (distance in km, value in meters, city names and a lot more). Please check this link: https://developers.google.com/maps/documentation/javascript/directions. I hope this answer meets your requirement.
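If the list of cities is longer than one request allows, it has to be split. The Python sketch below shows one way to do that, assuming the limit of 10 places per request quoted above and assuming the cities form an ordered route. The helper `split_into_requests` is hypothetical and only prepares the groups; it does not call any Google API.

```python
def split_into_requests(cities, max_places=10):
    """Split an ordered list of cities into groups of at most `max_places`.

    Each group is a tuple (origin, waypoints, destination). The destination
    of one group is reused as the origin of the next one, so that no leg of
    the route is dropped between two requests.
    """
    if len(cities) < 2:
        raise ValueError("need at least two cities")
    if max_places < 2:
        raise ValueError("max_places must be at least 2")
    groups = []
    start = 0
    while start < len(cities) - 1:
        chunk = cities[start:start + max_places]
        groups.append((chunk[0], chunk[1:-1], chunk[-1]))
        start += len(chunk) - 1  # overlap by one city
    return groups
```

Twelve cities, for example, become two requests instead of eleven separate origin/destination calls. Note that this only yields distances along consecutive legs of the route, not between every pair of cities.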

eCommerce Third Party API Data Best Practice

What would be best practice for the following situation? I have an ecommerce store that pulls down inventory levels from a distributor. Should the site call the third-party API every time a user loads a product detail page, to get the most up-to-date data? Or should the site call the third-party API, store that data in its own system for a certain amount of time, and update it periodically?
To me it seems obvious that it should be updated every time the product detail page is loaded, but what about high-traffic ecommerce stores? Are completely different solutions used in that case?
In this case I would definitely cache the results from the distributor's site for some period of time, rather than hitting them every time you get a request. However, I would not simply use a blanket 5 minute or 30 minute timeout for all cache entries. Instead, I would use some heuristics. If possible, for instance if your application is written in a language like Python, you could attach a simple script to every product which implements the timeout.
This way, if it is an item that is requested infrequently, or one that has a large amount in stock, you could cache for a longer time.
if product.popularityrating > 8 or product.lastqtyinstock < 20:
    # popular or low-stock item: drop the cached value so it is refreshed
    cache.expire(productnum)
distributor.checkstock(productnum)
This gives you flexibility that you can call on if you need it. Initially, you can set all the rules to something like:
cache.expireover("3m",productnum)
distributor.checkstock(productnum)
In actual fact, the script would probably not include the checkstock function call because that would be in the main app, but it is included here for context. If Python seems too heavyweight to include just for this small amount of flexibility, then have a look at Tcl, which was specifically designed for this type of job. Both can be embedded easily in C, C++, C# and Java applications.
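To make the heuristic more concrete, here is a self-contained Python sketch of a stock cache whose lifetime depends on the product. Everything in it is an assumption made for illustration: the `Product` fields mirror the names used above, `fetch_stock` stands in for the distributor call, and the thresholds (8, 20, 3 minutes) are simply the ones from the snippets.

```python
import time
from dataclasses import dataclass


@dataclass
class Product:
    number: str
    popularity_rating: int
    last_qty_in_stock: int


def cache_ttl_seconds(product):
    """Pick a cache lifetime: none for popular or low-stock items, 3 minutes otherwise."""
    if product.popularity_rating > 8 or product.last_qty_in_stock < 20:
        return 0    # always re-check with the distributor
    return 180      # the blanket "3m" starting rule


class StockCache:
    def __init__(self, fetch_stock, clock=time.monotonic):
        self._fetch = fetch_stock    # callable: product number -> quantity
        self._clock = clock          # injectable so the cache can be tested
        self._entries = {}           # product number -> (quantity, time stored)

    def get(self, product):
        """Return the cached quantity if still fresh, else fetch and store it."""
        now = self._clock()
        entry = self._entries.get(product.number)
        if entry is not None and now - entry[1] < cache_ttl_seconds(product):
            return entry[0]
        quantity = self._fetch(product.number)
        self._entries[product.number] = (quantity, now)
        return quantity
```

Keeping the rule in one small function (`cache_ttl_seconds`) is what gives the flexibility described above: the rule can start as a constant and grow per-product logic later without touching the cache itself.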
Actually, there is another solution. Your distributor keeps the product catalog on their servers and gives you access to it via the Open Catalog Interface. When a user wants to place an order, they get redirected in place to the distributor's catalog, choose items, and then transfer the selection back to your shop.
It is widely used in the SRM (Supplier Relationship Management) field.
It depends on many factors: the traffic to your site, how often the inventory levels change, the business impact of displaying outdated data, how often the suppliers allow you to call their API, their API's SLA in terms of availability and performance, and so on.
Once you have these answers, there are of course many possibilities here. For example, for a low-traffic site where getting the inventory right is important, you may want to call the 3rd-party API on every call, but revert to some alternative behavior (such as using cached data) if the API does not respond within a certain timeout.
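The "live call with a fallback" behavior could look roughly like the Python sketch below. It is a sketch under assumptions: `fetch_live` is a hypothetical zero-argument callable wrapping the distributor's API, and the half-second timeout is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor

# Shared worker pool used to run the live API call in the background.
_executor = ThreadPoolExecutor(max_workers=4)


def stock_with_fallback(fetch_live, cached_value, timeout_seconds=0.5):
    """Return (quantity, is_live).

    Tries the live API first. If it does not answer within
    `timeout_seconds`, or raises an error, the cached value is returned.
    """
    future = _executor.submit(fetch_live)
    try:
        return future.result(timeout=timeout_seconds), True
    except Exception:
        # Covers both the timeout and any error raised by fetch_live.
        return cached_value, False
```

The second element of the result lets the page show that a value may be stale. A call that times out keeps running in its worker thread, so a real implementation would probably also set a timeout on the HTTP request itself.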
Sometimes, well-designed APIs will include hints as to the validity period of the data. For example, some REST-over-HTTP APIs support various HTTP cache control headers that can be used to specify a validity period, or to retrieve data only if it has changed since the last request.
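As a small illustration of using such hints, the Python sketch below reads the `max-age` directive from a `Cache-Control` response header and builds the headers for a conditional request (`If-None-Match` / `If-Modified-Since`). Whether a given distributor API sends these headers is an assumption you would have to check. The function names are made up for this example.

```python
import re


def max_age_seconds(cache_control):
    """Return the max-age from a Cache-Control header value, or None.

    None is returned when the header is missing, has no max-age, or
    contains no-store / no-cache.
    """
    if cache_control is None:
        return None
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "no-store" in directives or "no-cache" in directives:
        return None
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            return int(match.group(1))
    return None


def conditional_headers(etag=None, last_modified=None):
    """Build request headers that ask the server to send data only if it changed."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers
```

If the server answers a conditional request with status 304 (Not Modified), the cached inventory data can be reused as is.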