I am starting to see a few of these requests in my Apache logs. They seem to come in pairs; first a request for /notified-Notify_AUP followed by a request for /verify-Notify_AUP.
The requests come with a Google search referrer pointing to my site. They seem to come from legitimate companies -- though of course anything can be hacked.
I have never heard of these files, unlike so many of the other fishing expeditions aimed at all of our sites. Is this something new, or are these requests legitimate and I am supposed to be providing some sort of reply?
Thanks,
Boggle
I finally found out that this is an attack on ProxySG. Since I do not have a ProxySG box, I can safely ignore this problem.
I have collected all the requests made by a set of websites, with the aim of identifying the third parties contacted through those requests. I used Selenium and WebDriver to do that.
These requests can be made by JavaScript present in the source code of the website, can be triggered dynamically by advertisements on the page, or can be initiated by services such as Google, DoubleClick, or Facebook. They reveal what data is being shared by these websites, with or without the user's consent.
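For reference, capturing those requests can be done roughly like this (a minimal sketch; it assumes the third-party selenium-wire package, since plain Selenium does not expose network traffic directly):

# Sketch: record every HTTP request a page triggers while it loads.
# Assumes Chrome/chromedriver and the selenium-wire package are installed.
from seleniumwire import webdriver

driver = webdriver.Chrome()
driver.get("https://www.focuscamera.com/")

for req in driver.requests:
    # Each captured request carries the target URL, method and headers,
    # so third-party hosts can be separated from the first-party domain.
    print(req.method, req.url)

driver.quit()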
You can see an example of the requests made when the browser loads this website, www.focuscamera.com/, in this Excel file:
https://drive.google.com/file/d/16wNA0dFUehrjPww31TAIj8GZUZ05LsIU/view?usp=sharing
My questions are:
1- Which kinds of HTTP header fields can be used for my analysis if I want to gather some information about third parties? My goal is to distinguish and differentiate third-party behavior.
For example, the Content-Length field in a request indicates the size of the entity body. So does a request with a higher Content-Length mean that the third party received and collected more data/information?
2- What exactly does Content-Length indicate? What exactly does the "HTTP request body data" contain?
3- Are there any other HTTP header fields that I can use to distinguish and differentiate third-party behavior? (A list of the fields I collect can be found in sheet 1 of the Excel file shared above.)
4- Is there any other information on the internet that I can use to distinguish and differentiate third-party behavior? For example, I use cookiepedia.co.uk to find out what kind of service a third party provides: is it functionality, performance, or targeting/advertising?
It sounds like you may be reinventing the wheel here. Take a look at https://webbkoll.dataskydd.net; they provide lots of security and privacy analysis on any site you like. You can also generate nice visual request maps using https://requestmap.webperf.tools.
Try using that tool on sites like wired.com and forbes.com to see how spectacularly bad it can get!
To answer your questions specifically:
Headers are not massively useful as far as what's within each request goes (it's the fact of the request itself that's more interesting), but the important ones from a privacy perspective will be Referer and Set-Cookie. Content-Length does indeed tell you how big the request body is – it will always be 0 on a GET request and so is usually omitted – large POST requests indicate more data is being transmitted, but that may be down to inefficiency rather than anything else.
Content-Length indicates the length of the data (in bytes) within the body of a POST request. An HTTP request body can contain any kind of data: text, images, video, audio, or other formatted data.
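As a quick illustration (a hypothetical example using Python's requests library and the httpbin.org test service, not taken from your data), the header simply mirrors the size of the body being sent:

# Sketch: Content-Length reports the byte length of the request body.
import requests

body = {"user_id": "123", "event": "page_view"}
resp = requests.post("https://httpbin.org/post", json=body)

print(resp.request.headers["Content-Length"])  # size of the serialized body
print(len(resp.request.body))                  # the same number of bytes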
There are some, but most headers are functional rather than semantic, concerned with making the request actually work. It's more interesting that requests happen at all than what they contain.
You can't necessarily tell what kind of service a third party is providing from the requests themselves, but the domains they are going to are more interesting. For example, anything going to doubleclick.com is going to be ad- and tracking-related, because of what that domain is known to be used for (Webbkoll cites these as "known trackers"). So you're correct that sites like Cookiepedia can help you find out what a particular service does. The divisions between functional/performance/profiling are mostly made up by ad companies to excuse their behaviour; you can't tell what they are using data for, only whether they are receiving data and what data they are receiving (because you can see what's in the requests they make using browser developer tools). To clarify: a site could receive your full name and address but do absolutely nothing with it, and you can't tell that from looking at the data that's sent. In privacy terms it's always best to assume the worst (because ad companies absolutely cannot be trusted!), so if they are receiving data, assume it will be abused.
Over the last couple of days I've been getting millions of requests from rotating IPs. They're attempting to run POST requests and seem to be using an incorrect HTTP_ORIGIN. By incorrect, I mean that it's not the same as what my server sends:
My server sends: "https://www.example.com"
The spam request sends: www.example.com
I placed some logging for each scenario:
User logged in and has incorrect HTTP_ORIGIN
User NOT logged in and has incorrect HTTP_ORIGIN
What I've noticed is that there are users who are logged in but have the wrong HTTP_ORIGIN (the origin is missing "https://"). I have checked those user accounts, and while they appear to be real and not created by the original spam requests, they may currently be run through scripts.
Filtering on the origin would prevent those users from making POST requests to the site, but on the other hand, if they are real users, it would cause them a problem.
Now if I were to put filtering in place to block requests that didn't match the origin, my questions are:
What would be the side effect of that?
Are there downsides or negative aspects?
Would I see drops in traffic?
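For context, the filtering I have in mind would look roughly like this (a hypothetical sketch; Flask is assumed purely for illustration, not my actual stack):

# Sketch: reject POSTs whose Origin header does not exactly match the expected value.
from flask import Flask, request, abort

app = Flask(__name__)
EXPECTED_ORIGIN = "https://www.example.com"

@app.before_request
def check_origin():
    if request.method == "POST":
        if request.headers.get("Origin", "") != EXPECTED_ORIGIN:
            # Blocks the spam, but also any real user whose client mangles the header.
            abort(403)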
If that's the case, then as you said, some people are using your website from scripts. Assuming your website is a normal one (I mean, not a site for uploading data or something like that), it would be good to consider adding a CAPTCHA instead of filtering requests, because I think it would be simple for those who send an incorrect HTTP_ORIGIN to forge one that matches the original, especially if their goals are malicious.
As for the consequences of filtering the HTTP requests: I think the request volume will drop remarkably (since you will refuse the incorrect ones), and the few real users who use scripts will either switch to a browser (a rare case, especially if they scrape data from the website automatically) or stop using your website.
You should do further research and make sure that those false requests are not malicious (perhaps they are using a simple TCP client). Either way, it is best for the time being to inspect the data sent in the incorrect POST requests and see if there is anything suspicious (in that case you should add some safety measures to your website).
We have a single-page app (AngularJs) which interacts with the backend using REST API. The app allows each user to see information about the company the user works at, but not any other company's data. Our current REST API looks like this:
domain.com/companies/123
domain.com/companies/123/employees
domain.com/employees/987
NOTE: All IDs are GUIDs, hence the last endpoint doesn't have a company id, just the employee id.
We recently started working on enforcing the requirement that each user has access only to information related to the company where the user works. This means that on the backend we need to track who the logged-in user is (which is a simple auth problem) as well as determine the company whose information is being accessed. The latter is not easy to determine from our REST API calls, because some of them do not include a company id, such as the last one shown above.
We decided that instead of tracking company ID in the UI and sending it with each request, we would put it in the subdomain. So, assuming that ACME company has id=123 our API would change as follows:
acme.domain.com
acme.domain.com/employees
acme.domain.com/employees/987
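For illustration, resolving the company on the backend might look something like this (a minimal sketch assuming a Flask-style backend; the lookup table is hypothetical):

# Sketch: derive the company from the subdomain on every request.
from flask import Flask, request, abort, g

app = Flask(__name__)
COMPANY_IDS = {"acme": "123"}  # hypothetical slug -> company id mapping

@app.before_request
def resolve_company():
    subdomain = request.host.split(".")[0]   # "acme.domain.com" -> "acme"
    company_id = COMPANY_IDS.get(subdomain)
    if company_id is None:
        abort(404)
    g.company_id = company_id  # handlers scope every query to this company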
This makes identifying the company very easy on the backend and requires minor changes to REST calls from our single-page app. However, my concern is that it breaks the RESTfulness of our API. This may also introduce some CORS problems, but I don't have a use case for it now.
I would like to hear your thoughts on this and how you dealt with this problem in the past.
Thanks!
In a similar application, we did put the 'company id' into the path (every company-specific path), not as a subdomain.
I wouldn't care a jot about whether some terminology enthusiast thought my design was "RESTful" or not, but I can see several disadvantages to using domains, mostly stemming from the fact that the world tends to assume that the domain identifies "the server", and the path is how you find an item on that server. There will be a certain amount of extra stuff you'll have to deal with when using multiple domains which you wouldn't with paths:
HTTPS - you'd need a wildcard certificate instead of a simple one
DNS - you're either going to have wildcard DNS entries, or your application management is now going to involve DNS management
All the CORS stuff which you mention - may or may not be a headache in your specific application - anything which is making 'same domain' assumptions about security policy is going to be affected.
Of course, if you want lots of isolation between companies, and effectively you would be as happy running a separate server for each company, then it's not a bad design. I can't see it's more or less RESTful, as that's just a matter of viewpoint.
There is nothing "unrestful" in using subdomains. URIs in REST are opaque, meaning that you don't really care about what the URI is, but only about the fact that every single resource in the system can be identified and referenced independently.
Also, in a RESTful application, you never compose URLs manually; you traverse the hypermedia links you find at the API endpoint and in all the returned responses. Since you don't need to manually compose URIs, from the REST point of view it makes no difference how they look. Having a URI such as
//domain.com/ABGHTYT12345H
would be as RESTful as
//domain.com/companies/acme/employees/123
or
//domain.com/acme/employees/smith-charles
or
//acme.domain.com/employees/123
All of those are equally RESTful.
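To make the hypermedia point concrete, a client follows links returned by the API instead of building URLs itself; a rough sketch with made-up link names and endpoints:

# Sketch: discover URIs by traversing links in responses, never by string-building.
import requests

root = requests.get("https://acme.domain.com/").json()
employees_url = root["_links"]["employees"]["href"]    # hypothetical HAL-style link

employees = requests.get(employees_url).json()
first_url = employees["_links"]["items"][0]["href"]    # follow the next link
print(requests.get(first_url).json())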
But... I like to think of usable APIs, and when it comes to usability, having readable, meaningful URLs is a must for me. Following conventions is also a good idea. In your particular case there is nothing un-RESTful about the route, but it is unusual to find that kind of behaviour in an API, so it might not be best practice. Also, as someone pointed out, it might complicate your development (not specifically the CORS part, though; that one is easily solved by sending a few HTTP headers).
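For example, the handful of response headers involved might be added like this (again a sketch assuming a Flask-style backend and a hypothetical allowed-origin rule):

# Sketch: the few headers that make cross-subdomain calls work.
from flask import Flask, request

app = Flask(__name__)

@app.after_request
def add_cors_headers(response):
    origin = request.headers.get("Origin", "")
    if origin.endswith(".domain.com"):  # hypothetical trust rule
        response.headers["Access-Control-Allow-Origin"] = origin
        response.headers["Access-Control-Allow-Methods"] = "GET, POST, PUT, DELETE, OPTIONS"
        response.headers["Access-Control-Allow-Headers"] = "Authorization, Content-Type"
    return response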
So, even though I can't see anything non-REST in your proposal, the conventions elsewhere would argue against subdomains in an API.
To deal with recent growth our application has been split across two sets of separate infrastructure. Approximately half of our customers are on set 1 and the other half are on set 2.
Both sets have different urls (api1.ourdomain.com and api2.ourdomain.com).
The problem is that clients accidentally use the wrong URL and then wonder why they get error messages.
Other than user education, are there any other strategies for dealing with this mess?
Is it possible to redirect requests to the correct endpoint?
Thanks.
I don't think your question is detailed enough to provide meaningful feedback. There are obviously several factors that could easily contribute to a recommendation.
Does your application make use of user profiles (or a similar construct)? If so, you might consider associating a primary URI with each user in their profile, and including logic in your application to interrogate the profile on each request and redirect if a user goes to the wrong URI.
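As a rough sketch of that idea (Flask-style, with a hypothetical current_user() lookup):

# Sketch: redirect requests that arrive at the wrong cluster, using the
# primary host stored in the user's profile. current_user() is hypothetical.
from flask import Flask, request, redirect

app = Flask(__name__)
THIS_HOST = "api1.ourdomain.com"

@app.before_request
def route_to_home_cluster():
    home_host = current_user().profile.primary_host  # e.g. "api2.ourdomain.com"
    if home_host != THIS_HOST:
        # 307 preserves the method and body of the original API call.
        return redirect(f"https://{home_host}{request.full_path}", code=307)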
Is this an authorization issue? If so you might consider including some basic authorization routing that provides a custom 403 page with the proper URL.
If you could provide additional detail I think we could be more helpful.
While trying to debug my OpenID implementation with Google, which kept returning Apache 406 errors, I eventually discovered that my hosting company does not allow passing a string containing "/id" as a GET parameter (something like "example.php?anyattribute=%2Fid" once URL encoded).
That's rather annoying, as Google's OpenID endpoint includes this death word "/id" (https://google.com/accounts/o8/id), so my app returns 406 errors every time I log in with Google. I contacted my hosting company, who told me this has been deactivated for security purposes.
I could use POST instead, for sure. But has anyone got an idea why this could cause security problems?
It can't, your host is being stupid. There's nothing magical about the string /id.
Sometimes people do stupid things with the string /id, like assuming no one is going to guess what follows, so that example.com/mysensitivedata/id/3/ shows my data because my user has id 3, and being the sneaky sort, I wonder what happens if I navigate to example.com/mysensitivedata/id/4/, and your site blindly lets me through to see someone else's stuff.
If that sort of attack breaks your site, no amount of mollycoddling by your host will help you anyway.
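The usual defence is to check ownership server-side rather than hope the ID stays secret; a minimal sketch (Flask-style, with hypothetical load_record() and current_user() helpers):

# Sketch: authorize by ownership, not by obscurity of the ID.
from flask import Flask, abort, jsonify

app = Flask(__name__)

@app.route("/mysensitivedata/id/<int:record_id>/")
def sensitive_data(record_id):
    record = load_record(record_id)                     # hypothetical lookup
    if record is None or record.owner_id != current_user().id:
        abort(404)  # don't even reveal whether the record exists
    return jsonify(record.to_dict())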
One reason a simple ID in the URL could be a security concern is that a user could see their own ID and then type another one in; for example, if it's an integer they may try the next integer up and potentially see another user's info if it is not protected.