Is there a way to protect data from being scraped in a PWA?

Is there a way to protect data from being scraped in a PWA? - react-native

Let’s say I have a client who has spent a lot of time and money creating a custom database. So there is a need for extra data security. They have concerns that the information from the database could get scraped if they allow access to it from a normal web app. A secure login won’t be enough; someone could log in and then scrape the data. Just like any other web app, a PWA won't protect against this.
My overall opinion is that sensitive data would be better protected on a hybrid app that has to be installed. I am leaning toward React-Native or Ionic for this project.
Am I wrong? Is there a way to protect the data from being scraped in a PWA?

There is no way to protected data visible to browser client regardless of technology - simple HTML or PWA/hybrid app.
Though you can make it more difficult.
Enforce limits on how many information a client can fetch per minute/hour/day. The one who exceed limits can be blocked/sued/whatever.
You can return some data as images rather than text. Would make extraction process a bit more difficult but would complicate your app and will use more bandwidth.
If we are talking about a native/hybrid app it can add few more layers to make it more secure:
Use HTTPS connection and enforce check for valid certificate.
Even better if you can check for a specific certificate so it's not replaced by a man-in-the-middle.
I guess iOS app would be more secure then Android as Android is easier to decompile and run modified version with removed restrictions.
Again, rate limiting seems to be the most cost effective solution.
On top of rate limiting, you can add some sort of pattern limiting. For example, if a client requests data with regular intervals close to limits, it is logical to think that requests are from a robot and data is being scrapped.

HTTPS encrypts the data being retrieved from your API, so it could not be 'sniffed' by a man in the middle.
The data stored in the Cache and IndexedDB is somewhat encrypted, which makes it tough to access.
What you should do is protect access to the data behind authentication.
The only way someone could get to the stored data is by opening the developer tools and viewing the data in InsdexedDB. Right now you can only see a response has been cached in the Cache database.
Like Alexander says, a hybrid or native application will not protect the data any better than a web app.

Related

WhatsApp - How WhatsApp server stops/detects requests from unauthorized apps?

Every application that generates dynamic content must have a server whose address is embedded inside the application to enable communication with server.
Now in the case of WhatsApp definitely they have also embed the server's address inside the WhatsApp application. For example someone reverse engineer the WhatsApp apk and found the address of the server, as well as he also found the parameters and all the stuff that the application sends to the server (i-e session, token, authentication key etc etc) for successful communication, so is that mean he can use these same parameters structure and the server address in different third party app to play/communicate with the WhatsApp server? Because server is just an electronic device that works on the digital signals and thats it. Server don't know that these parameters are coming from the authorized WhatsApp apk or from third party apk.
If yes, then don't you guys think that there should be solution to that problem?
If no, then what are the techniques and algorithms they are using to stop requests from unauthorized/fake apps.
I believe not any employee from WhatsApp will answer here to share the algorithm, but i know SOF is full of geeks, if someone knows how WhatsApp stops these kind of issues please share, otherwise i will be still glad to know about the advice and ideas that you guys have in your mind for the best security practices.
How banking, paypal etc and messaging apps including WhatsApp works in that scenario and how they stop the issue that i described above?
Important:
I am not going to reverse engineer the WhatsApp, i am just creating a server and fighting with this issue to be solved to secure my server and only accept request from my app but stop requests from unauthorized/fake apps.
Thanks & respect to all in advance who will contribute.

There is no way to prevent malicious reverse-engineering, resulting in a fake app pretending to be the real thing. While you are working on your server, you need to do defensive programming, that is, your server shouldn't assume that the request was sent via the app. So, if you protect your server against all kinds of malicious and deliberate misuses, then your server is safe.
However, that's easier said than done, because your project is developed by a finite amount of people and - if it becomes successful then - the audience contains a swarm of smart bad people.
You will therefore need to detect a subset of features that you need to absolutely protect against misuses and prioritize testing and improving those, by thinking with the mind of a fictional hacker, who would like to either gain unearned profits or do harm to your project. Schizophrenic, I know, but you need to do that on the server. You also need to improve the security of less than critical features, but at a lower priority and log the requests you get, so if SHTF, then you will have at least a chance to deduce what caused it and how.
If the phone app is in your hands as well, then you might implement some additional authentication for each version, like generating a version token for each user that downloads your app. Since the version token generator algorithm would not be in the hands of hackers, they would have to solve that on a per user basis, which is extremely laborius to solve this for several users if done by hand and if they work it out in a way to make it automatic, their solution would be viable only for a version.
So, there is no 100% accuracy in this area, but you can make life very hard and miserable for people payed to hack through your application.

Effective Backend for Real Estate Application

I am looking to develop a cross platform mobile app involving real-estate. I have looked at Zillow's API and I think that will be one of the API's I utilize.
https://www.zillow.com/howto/api/APIOverview.htm
My question is if I were to utilize their API as well as those of some other real estate sites, would it make more sense for me to call those APIs directly from the mobile applications, or would it make more sense to have a proxy server, possibly with my own databases compiled from these sites, that the mobile application would call? I have only read the basic overview of the Zillow API, but it looks like it is limited to 1000 calls per day. I understand it is a fairly general questions. If there are any more details that would help to make a better answer, please let me know.
Also, if you know of any other free/cheap real-estate APIs, can you please provide them?
Thanks

Not exactly sure what your metrics are.
But generally speaking, it is a bad idea to hook your mobile app directly to third party API for the following reasons:
You do not control the API, if the third party changed their API your app won't work, the user would have to upgrade. But if you isolate the mobile app by connecting to your server you have more control and can have much longer life.
Caching/rate limits. You can get the data from the third party and store it (if you are allowed) then share the data with all your users
Multiple datasources: Usually you get the data from multiple datasources, so aggregating the data on your server then send the enhanced data model to the app is a lot easier than pulling data from different sources and compiling them on the app itself.

Strategies for designing REST APIs for all types of client devices

The question is more targeted at server side development.
When writing a REST API, I want to write it in such a way that it can be consumed by both desktop and mobile applications.
Could see two possible approaches:
Each API should support pagination and the responsibility should be delegated to the client of how much data should be fetched in one go. So , mobile apps will ask for fewer pages in one go and desktop applications will ask for more.
Separate APIs for mobile devices hosted separately. The front-end web server can check the user agent (i.e. source from where is request is coming) and if it's a mobile device, then re-route the request to the server hosting the APIs for mobile devices.
Interested to know more strategies around this.
Appreciate your inputs.

I would suggest a bit of both (1) and (2), here's how.
Instead of re-building whole new api for mobile itself, Have adapters for all the supported devices. i.e have a layer on top of you REST API implementation which renders/instructs the underlying service to return the content suitable for selected mobile device.
coming to pagination, you can parameterize the pagination as an input from the Abstraction mentioned above.

I would recommend something closer to option (1). If the main difference between the clients will be the amount of data they request at a time, it seems trivial to add some kind of query parameter or HTTP header to the REST API indicating how many records to return, for instance.
Relying on checking the User-Agent header may require you to maintain a list of known client user agents and match against them, which would be an additional maintenance cost of a separate API.

Is it a bad idea to have a web browser query another api instead of my site providing it?

Here's my issue. I have a site that provides some investing services, I pay for end of day data which is all I really need for my service but I feel its a bit odd when people check in during the day and it only displays yesterdays closing price. End of day is fine for my analytics but I want to display delayed quotes on my site.
According to the yahoo's YQL faq: If you use IP based authentication then you are limited to 1000 calls/day/IP, if my site grows I may exceed that but I was thinking of trying to push this request to the people browsing my site themselves since its extremely unlikely that the same IP will visit my site 1,000 times a day(my site itself has no use for this info). I would call a url from their browser, then parse the results so I can allow them to view it in the format of the sites template.
I'm new to web development so I'm wondering is it a common practice or a bad idea to have the users browser make the api call themselves?

It is not a bad idea at all:
You stretch up limitations this way;
Your server will respond faster (since it does not have to contact the api);
Your page will load faster because the initial response is smaller;
You can load the remaining data from the api in async manner while your UI is already responsive.
Generally speaking it is a great idea to talk with api's from the client. It's more dynamic, you spread traffic, more responsiveness etc...
The biggest downside I can think of is depending on the availability of other services. On the other hand your server(s) will be stressed less because of spreading the traffic.
Hope this helped a bit! Cheers!

Solutions to protecting game high-scores

My friend proved it to me by taking the WP7 papertoss games and getting the .xap from it and then posting his own high scores.
Is there any fool proof way to stop this ? (I think xbox live integration makes hacking the high scores impossible but that is for special people )

It depends first of all how the high-scores are sent. I can only assume that what your friend did was take the XAP and modify some internal file or track the HTTP web requests that are used to send the scores to the centralized locations. I have two recommendations for you.
Encrypt. Don't keep scores in plaintext. There are plenty of strong encryption methods that you can take advantage of that will render the scoreboard useless unless the person who tries to read it has the key.
If you send the scores to a web service, never send it in plaintext (once again). From my own experience I can say that web requests can be easily altered and sniffed. So if I see that the app sends http://yourservice/sendscore?user=Den&score=500, I might as well invoke http://yourservice/sendscore?user=Den&score=99999999. Same applies if you plan on using headers.
Be aware, that using the Xbox Live services is only possible if you are a registered Xbox developer, and this is not easy to get.

First of all - is a high score list really that critical that you're worried about an edge case (the common person isn't going to have a dev unlocked phone with ability to modify the *.xap file)?
Second of all, no. There's no fool-proof way to protect your high score list if it is being stored locally on the device. The only way to protect the high score list would be to store it in the cloud via a web service or some other mechanism.

It is tricky to have a secure high score system since users can always modify information on the client side. It's impossible to prevent a determined hacker from looking at your code, but you can make it more difficult by obfuscating your code. PreEmptive's Dotfuscator is currently free for Windows Phone 7 developers and also has analytics built in if you want to use it. This will obfuscate your code and make it harder to read your code. Although it's not fool proof, it's an extra hurdle for hackers to overcome.
The obfuscation would make it harder to find the encryption key you're using to authenticate the high score.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas