I'm new-ish to web development. Setting up a hobby website for weather forecasting. I have an API key to make 'GET' requests from a forecasting service (openweathermap). I know I don't want to expose this key to the public. So my current plan is to have the user select their location by text input, a client side JavaScript function sends this info to a server side Perl script which makes the 'GET+key' request to the forecasting service. The Perl script returns the forecast data to the client side JavaScript for display on the page.
Since the API key is stored in the server side script, I assume it is not visible to the public. Is this true and is my above approach good to use?
On the server, the client should not be able to see it. However, there are various things that you might do that could expose it. Use proper transport security (so, HTTPS, whatever), be careful with the output of error messages, store it somewhere away from the prying eyes of others, and so on.
Look into various ways to manage secrets. Some of these make it so you don't even know what the API key is.
Related
Let’s say I have a client who has spent a lot of time and money creating a custom database. So there is a need for extra data security. They have concerns that the information from the database could get scraped if they allow access to it from a normal web app. A secure login won’t be enough; someone could log in and then scrape the data. Just like any other web app, a PWA won't protect against this.
My overall opinion is that sensitive data would be better protected on a hybrid app that has to be installed. I am leaning toward React-Native or Ionic for this project.
Am I wrong? Is there a way to protect the data from being scraped in a PWA?
There is no way to protected data visible to browser client regardless of technology - simple HTML or PWA/hybrid app.
Though you can make it more difficult.
Enforce limits on how many information a client can fetch per minute/hour/day. The one who exceed limits can be blocked/sued/whatever.
You can return some data as images rather than text. Would make extraction process a bit more difficult but would complicate your app and will use more bandwidth.
If we are talking about a native/hybrid app it can add few more layers to make it more secure:
Use HTTPS connection and enforce check for valid certificate.
Even better if you can check for a specific certificate so it's not replaced by a man-in-the-middle.
I guess iOS app would be more secure then Android as Android is easier to decompile and run modified version with removed restrictions.
Again, rate limiting seems to be the most cost effective solution.
On top of rate limiting, you can add some sort of pattern limiting. For example, if a client requests data with regular intervals close to limits, it is logical to think that requests are from a robot and data is being scrapped.
HTTPS encrypts the data being retrieved from your API, so it could not be 'sniffed' by a man in the middle.
The data stored in the Cache and IndexedDB is somewhat encrypted, which makes it tough to access.
What you should do is protect access to the data behind authentication.
The only way someone could get to the stored data is by opening the developer tools and viewing the data in InsdexedDB. Right now you can only see a response has been cached in the Cache database.
Like Alexander says, a hybrid or native application will not protect the data any better than a web app.
We have a back end API, running ASP.Net Core, with two front ends: A SPA web site (Vuejs) and a progressive web page (for mobile users). The front ends are basically only client code and all services are on different domains. We don't use cookies as authentication uses bearer tokens.
We've been playing with Application Insights for monitoring, but as the documentation is not very descriptive for our situations, I would like to get some more inputs for what is the best strategy and possibilities for:
Tracking users and metrics without cookies from e.g. the button click in the applications to the server call, Entity Framework/SQL query (I see that this is currently not supported, How to enable dependency tracking with Application Insights in an Asp.Net Core project), processing data and presentation of the result on the client.
Separating calls from mobile and standard web in an easy manner in Application Insights queries. Any way to show this in the standard charts that show up initially would be beneficial.
Making sure that our strategy will also fit in situations where other external clients will access the API, and we should be able to identify these easily, and see how much load they are creating for the system.
Doing all of the above with the least amount of code.
this might be worthy of several independent questions if you want specifics on any of them. (and generally your last bullet is always implied, isn't it? :))
What have you tried so far? most of the "best way for you" kinds of things are going to be opinions though.
For general answers:
re: tracking users...
If you're already doing user info/auth for other purposes, you'd just set the various context.user.* fields with the info you have on the incoming request's telemetry context. all other telemetry that occurs using that same telemetry context would then inerit whatever user info you already have.
re: separating calls from mobile and standard...
if you're already doing this as different services/domains, and you are already using the same instrumentation key for both places, then the domain/host info of pageviews or requests is already there, you can filter/group on this in the portal or make custom queries in the analytics portal to analyze that way. if you know which site it is regardless of the host, you could add that as custom properties in the telemetry context, you could also do that to avoid dealing with host info.
re: external callers via an api
similarly, if you're already exposing an api and using auth, you should (ideally) already know who the inbound callers are, and you can set that info in custom properties as well.
In general, custom properties (string:string key value pairs) and custom metrics (string:double key value pairs) are your friends. you can set them on contexts so all the events generated in that context inherit the same properties, you can explicitly set them on individual TrackEvent (or any of the other Track* calls) to send specific properties/metrics with any single event.
You can also use telemetry initializers to augment or filter any telemetry that's being generated automatically (like requests or dependencies on the server side, or page views and ajax dependencies client side)
My app is a Personal Assistant who's main job is to redirect the user to something that complies with his/her wishes. I realize, for example that AllRecipies.com has no API. My question is that can I, say open the browser app with the url as
http://allrecipes.com/search/results/?wt=QUERY>&sort=re.
Is this considered as using their API? Not just AllRecipies, but numerous other such services. If I am using this method, then do I have to request API key, etc? I am not retrieving anything. I am simply redirecting the user to their page with the query pre-written. Does this require all the licensing fees, API Key, etc?
Do I have to agree to this fees(If they ask), Request API Key, etc?
With the particular URL in question, it is simply an HTML web server URL, rather than a web API, as such. You can still get data out of it, but you'd have to parse the HTML yourself to extract what you want from the HTML response.
They may have an API that you can use to access data more directly as JSON, XML, etc, but you'll have to look into that yourself. And you will possibly require an API key to access it. But perhaps not, if it's publicly available and they don't care how many calls they get to it by anonymous users.
You may find this resource useful. It contains a lot of open APIs and code snippets to access them: http://www.programmableweb.com/
If you are simply trying to hit a URL or directing a user to this particular URL which you already know and is static meaning you always hit the same url without change in parameters, then this is not considered an API call and will not be requiring any API key.
However, if they have some APIs exposed, you will need to go through their documentation and using this API most likely requires the use of an API key(alhough this might not be true always). Usually, most platforms have a bunch of APIs available for different scenarios and these are called based on user specific parameters and requirements.
What things should a developer designing and implementing an API for a community based website know before starting the heavy coding? There are a bunch of APIs out there like Twitter API, Facebook API, Flickr API, etc which are all good examples. But how would you build your own API?
What technologies would you use? I think it's a good idea to use REST-like interface so that the API is accessible from different platforms/clients/browsers/command line tools (like curl). Am I right? I know that all the principles of web development should be met like caching, availability, scalability, security, protection against potential DOS attacks, validation, etc. And when it comes to APIs some of the most important things are backward compatibility and documentation. Am I missing something?
On the other hand, thinking from user's point of view (I mean the developer who is going to use your API), what would you look for in an API? Good documentation? Lots of code samples?
This question was inspired by Joel Coehoorn's question "What should a developer know before building a public web site?".
This question is a community wiki, so I hope you will help me put in one place all the things that should be addressed when building an API for a community based website.
If you really want to define a REST api, then do the following:
forget all technology issues other than HTTP and media types.
Identify the major use cases where a client will interact with the API
Write client code that perform those "use cases" against a hypothetical HTTP server. The only information that client should start with is the response from a GET request to the root API url. The client should identify the media-type of the response from the HTTP content-type header and it should parse the response. That response should contain links to other resources that allow the client to perform all of the APIs required operations.
When creating a REST api it is easier to think of it as a "user interface" for a machine rather than exposing an object model or process model. Imagine the machine navigating the api programmatically by retrieving a response, following a link, processing the response and following the next link. The client should never construct a URL based on its knowledge of how the server organizes resources.
How those links are formatted and identified is critical. The most important decision you will make in defining a REST API is your choice of media types. You either need to find standard ways of representing that link information (think Atom, microformats, atom link-relations, Html5 link relations) or if you have specialized needs and you don't need really wide reach to many clients, then you could create your own media-types.
Document how those media types are structured and what links/link-relations they may contain. Specific information about media types is critical to the client. Having a server return Content-Type:application/xml is useless to a client if it wants to do anything more than parse the response. The client cannot know what is contained in a response of type application/xml. Some people do believe you can use XML schema to define this but there are several disadvantages to this and it violates the REST "self-descriptive message" constraint.
Remember that what the URL looks like has absolutely no bearing on how the client should operate. The only exception to this, is that a media type may specify the use of templated URIs and may define parameters of those templates. The structure of the URL will become significant when it comes to choosing a server side framework. The server controls the URL structure, the client should not care. However, do not let the server side framework dictate how the client interacts with the API and be very cautious about choosing a framework that requires you to change your API. HTTP should be the only constraint regarding the client/server interaction.
My idea is to treat URI's in my rest api as a unique resource, except in the context of the client's location, which is stored in a cookie. Are there any downsides to this approach?
From a philosophical perspective, it's not really REST if you don't uniquely identify the resource via URL (at least, per my reading of Fielding).
From a practical perspective -- and this is based on experience -- you're in for a world of pain if you require web service calls to use cookies. Primarily because it's a piece of information that has to be managed on a different code path, making your client-side code more complex. You'll also run into issues with domain and proxies (particularly if you share the cookie between the service and a traditional web-app), and it isn't portable between clients.
If you're looking to generate different content based on location, why not use a geolocation service?
Edit: why not make location part of the request URL? You can still use a cookie to store this information, and retrieve it using JavaScript. This would leave your service interface clean, and allow you to easily use the service from other clients.
As an API, you should aim at making ease of use for the client programmer a high priority. In many libraries that support HTTP, putting cookies into the HTTP request is more difficult than putting, say, a query parameter into the URL.
I'd be concerned about caching. Do one request with the user at location A, it gets cached, user moves to B and makes the request again, gets location A version of the request.