Is it good to use the Google libphonenumber library for commercial purposes? - whatsapp

Is it a good idea to use the Google libphonenumber library for a commercial application? E.g., we are thinking of adding phone number validation at sign-up. Until now we used a regex for it. Since we have an OTP-based verification system, the regex works well for us, and we do not get any bad phone numbers in our system. Previously we only handled numbers from a single country, so the regex was easy. Now we want to accept international numbers as well, and we found that the Google libphonenumber library can validate numbers for any country. https://github.com/googlei18n/libphonenumber
This seems like a good option and appears very accurate. But we see one problem: when new number ranges are opened in some country, we always need to update the library version in our system so that it starts accepting those numbers. We found that this open source library is also used by WhatsApp.
https://www.whatsapp.com/opensource/?l=es
But how do they manage the problem of updating the version every time? Also, are there other companies using it?
Can someone please suggest what we should do? Should we use a regex for international numbers, or this library?

Yes, you can use Google's libphonenumber for commercial applications, since it has an Apache 2.0 license (you can see details here).
And of course, you should keep the library up to date.
But you could always miss some very new numbers, right?
Well, if you are strict about that and never want to deal with false negatives, you could use the isPossibleNumber function where you would otherwise use isValidNumber. But that opens up the possibility of false positives. So in the end it is your decision, but you can't be 100% accurate and up to date. You just have to choose whether false positives or false negatives are the lesser problem for you. But don't worry too much about it; the likelihood of either case is very small.
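To make the trade-off concrete, here is a minimal sketch in Python. It does not call libphonenumber; the names is_possible and is_valid and the tiny METADATA table are hypothetical stand-ins for the library's isPossibleNumber, isValidNumber and its bundled metadata, and the patterns are made up for illustration rather than taken from any real numbering plan.

```python
import re

# Hypothetical, deliberately tiny metadata table:
# country calling code -> (allowed national-number lengths, pattern of assigned ranges).
# The real library ships far larger metadata that changes with every release.
METADATA = {
    "91": ({10}, re.compile(r"[6-9]\d{9}")),
    "1": ({10}, re.compile(r"[2-9]\d{2}[2-9]\d{6}")),
}


def split_number(number):
    """Split '+<calling code><national number>' using the table above."""
    digits = number.lstrip("+")
    if not digits.isdigit():
        return None
    # Try longer calling codes first so "91" is not mistaken for "9..." under "1".
    for code in sorted(METADATA, key=len, reverse=True):
        if digits.startswith(code):
            return code, digits[len(code):]
    return None


def is_possible(number):
    """Lenient check: only the length must fit, so new ranges pass (more false positives)."""
    parts = split_number(number)
    if parts is None:
        return False
    code, national = parts
    return len(national) in METADATA[code][0]


def is_valid(number):
    """Strict check: the number must also match a known range (stale data gives false negatives)."""
    parts = split_number(number)
    if parts is None:
        return False
    code, national = parts
    lengths, pattern = METADATA[code]
    return len(national) in lengths and pattern.fullmatch(national) is not None
```

With this table, a number in a range that the (stale) metadata does not know yet, such as "+915012345678", is rejected by is_valid but accepted by is_possible. Since the OTP step already filters out numbers that cannot receive a message, accepting on the lenient check and letting the OTP decide is one reasonable combination.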

What is the benefit of versioning a REST api by date as Twilio does?

Basically, I think it's a good idea to version your REST API. That's common sense. Usually you meet two approaches to doing this:
Either, you have a version identifier in your url, such as /api/v1/foo/bar,
or, you use a header, such as Accept: vnd.myco+v1.
So far, so good. This is what almost all big companies do. Both approaches have their pros and cons, and lots of this stuff is discussed here.
Now I have seen an entirely different approach, at Twilio, as described here. They use a date:
At compilation time, the developer includes the timestamp of the application when the code was compiled. That timestamp goes in all the HTTP requests.
When the request comes into Twilio, they do a look up. Based on the timestamp they identify the API that was valid when this code was created and route accordingly.
It's a very clever and interesting approach, although I think it is a bit complex. It can be confusing to understand whether the timestamp is compilation time or the timestamp when the API was released, for example.
Now while I somehow find this quite clever as well, I wonder what the real benefits of this approach are. Of course, it means that you only have to document one version of your API (the current one), but on the other hand it makes traceability of what has changed more difficult.
Does anyone know what the advantages of this approach are, and why Twilio decided to do it this way?
Please note that I am aware that this question sounds as if the answers would be primarily opinion-based, but I guess that Twilio had a good technical reason for its choice. So please do not close this question as primarily opinion-based, as I hope that the answer is not.
Interesting question, +1, but from what I see they only have two versions: 2008-08-01 and 2010-04-01. So from my point of view that's just another way to spell v1 and v2 so I don't think there was a technical reason, just a preference.
This is all I could find on their decision: https://news.ycombinator.com/item?id=2857407
EDIT: make sure you read the comments below where @kelnos and @andes mention an advantage of using such an approach to version the API.
There's another thing I can think of that makes this an interesting approach if you are the developer of such an API.
You have 20 methods, and you need to introduce a breaking change in 1 of them.
Using semver (v1, v2, v3, etc.) you need a v2 API.
All 20 of your methods now need to respond to v2, but in reality those methods haven't changed at all and aren't new.
Using dates, you can keep your unchanged methods as they are, and when a request comes in, the API just picks the best match.
I don't know how this is implemented; any information on that would be really welcome.
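One way such "best match" routing could be sketched, purely as an assumption and not as Twilio's actual implementation: each method keeps a sorted list of the dates on which its behaviour changed, and a request is served by the newest handler released on or before the requested date. The class name DateVersionedEndpoint and its methods are hypothetical.

```python
from bisect import bisect_right
from datetime import date


class DateVersionedEndpoint:
    """One API method with several date-stamped implementations (hypothetical helper)."""

    def __init__(self):
        self._dates = []     # release dates, kept sorted
        self._handlers = []  # handler released on the date at the same index

    def register(self, released, handler):
        """Add an implementation that takes effect on `released`."""
        idx = bisect_right(self._dates, released)
        self._dates.insert(idx, released)
        self._handlers.insert(idx, handler)

    def resolve(self, requested):
        """Return the newest handler released on or before `requested`."""
        idx = bisect_right(self._dates, requested)
        if idx == 0:
            raise LookupError(f"no API version existed on {requested}")
        return self._handlers[idx - 1]


# A method with one breaking change gets two handlers; a method that never
# changed would register a single handler and serve every later date with it.
send_sms = DateVersionedEndpoint()
send_sms.register(date(2008, 8, 1), lambda: "2008-08-01 behaviour")
send_sms.register(date(2010, 4, 1), lambda: "2010-04-01 behaviour")
```

A client built on 2009-01-01 would get the 2008-08-01 behaviour from send_sms.resolve(date(2009, 1, 1)), while the 19 unchanged methods need no new code at all.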
I used to work for a company that used date versioning (as in, each API call had a param with the desired API date, ?v=20200630) and loved it.
It lets you be less strict than with traditional versioning (v1, v2, v3), as client developers don't even need to care about the version number and can just use the current build time. Everything else is pretty much the same as with traditional versioning, plus a small benefit from seeing date checks in the server code: you can easily see how old this or that code path is.
I believe the situation would have been different if we had had to support a number of external clients and, for example, fix a bug in ?v=20200630, because there is no elegant way to specify something like ?v=20200630.1. As you can see from Twilio's experience, they were just changing what API version 2010-04-01 was, so a client couldn't be sure exactly which version it was seeing.
So my takeaways from this:
Date-based versioning seems easier and more flexible when you are a typical startup or a small company with a few apps (e.g. frontend, iOS, Android) and no or few 3rd party clients. It makes it a bit easier for client developers to "just write code", and since you control all the code, most of the time you can fix old API bugs by just releasing a new version and asking clients to switch to it.
Once you start having a real need to maintain old API versions (i.e. when you have a number of important clients who are not likely to update quickly), semver versioning becomes more reliable.

Requirement, design, code derivation

I am following verification and validation threads and I think an example might be helpful. I am not an experienced developer so I would like to know whether this would be correct:
User requirement: I want to be able to save my friend's name, address and phone number to the system
Software Requirement specification: User wants to be able to enter and save a name, address, phone number.
Technical analysis: web UI for data entering. Data will be saved into the SQL DB.
Detailed design: UI elements: 3 fields of a string type, 1 button, object XYZ, dbConnection....
Code: (actual code of UI, db scripts)
Is it like that? Could anyone correct or add what I am missing here?
As for verification, each phase can be verified against requirement (traceability). As for validation, the functional code should work as expected (save three attributes).
While this is somewhat true in theory (I have to say this), it is completely wrong in all practical, real-world scenarios.
Capture the user's needs and why he wants to do a certain thing. This allows you to build just the software that the user wants and to eliminate the waste that comes with made-up requirements, technical requirements, nice-to-haves, etc.
So instead of,
I want to be able to save my friend's name, address and phone number to the system...
I'd rather have the statement below, which emphasizes the why, the real need of the user:
I want to send a greeting card to my friend on his birthday.
Now I know I just need his name and address. Since this is for the future, I also want to store this information. So what I write next is a set of acceptance criteria to meet the above customer needs. If I can capture these as a set of executable specifications, even better, as those are verifiable programmatically.
Ignore everything else. Traceability is unnecessary overhead. We need it if we are building software based on fabricated requirements.
Read the below
Agile Manifesto
ATDD and BDD
Impact Mapping
I've never seen a good way to trace code to requirements outside a single sprint/time box. Also, you're missing testers from your list! Unless your testers are also your business analysts (in my experience, professional testers find a lot of the inconsistencies in requirements, a.k.a. bugs).
I think the best approach is to have everyone as involved as possible, so you can cross-check each person's expectations often. If everyone works together, you don't need to implement a cargo-cult process where batches of information are passed downstream in one direction.
The simplest tool for traceability is your VCS, where each commit includes the ID of the user story/use case that the commit relates to.
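As a small illustration of that last point, the sketch below builds a story-to-commits index from commit messages. The ID format ("US-42"), the function name trace and the sample commits are all assumptions made for the example; a real project would feed in the output of its own VCS log.

```python
import re
from collections import defaultdict

# Hypothetical ID convention: uppercase project key, dash, number (e.g. "US-42").
STORY_ID = re.compile(r"\b[A-Z]+-\d+\b")


def trace(commits):
    """Map each story ID to the hashes of the commits whose message mentions it."""
    index = defaultdict(list)
    for sha, message in commits:
        # sorted(set(...)) avoids listing a commit twice for a repeated ID
        for story in sorted(set(STORY_ID.findall(message))):
            index[story].append(sha)
    return dict(index)


# Made-up sample data in (hash, message) form.
commits = [
    ("a1b2c3", "US-42: add address field to contact form"),
    ("d4e5f6", "Fix typo in README"),
    ("0a9b8c", "US-42 US-7: save contact to the database"),
]
links = trace(commits)
```

Commits without an ID, like the README fix here, simply do not appear in the index, which also makes untraceable changes easy to spot.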

How usable and secure is Confident CAPTCHA? Are there other options?

I am trying to find an easier CAPTCHA to use with my website. I currently have reCAPTCHA but the users are struggling to get the words right the first time.
I have come across Confident CAPTCHA (here) and would like to know what you guys think about it.
Has anyone used it before?
How safe is it?
Are there similar CAPTCHAs, excluding reCAPTCHA?
Interesting captcha, I have not seen this one before.
I will try to address your second question about How safe is it?. There are no docs available or sample code to check so the analysis is based on using it a few times.
It seems like it should be reasonably secure. I see that it uses a 3rd party service, so you will rely on API calls to generate the HTML markup and validate the captcha.
In their demo, you are required to choose 4 images out of a total of 9 in the right order, which means the probability of guessing the correct sequence is about 0.033% (1/9 * 1/8 * 1/7 * 1/6 = 1/3024 ≈ 0.000330688).
It essentially works by creating an alpha captcha code based on the sequence of images you choose. So the server generates a random challenge (cat, vehicle, drink, house) and associates each element with a random letter from the range [A-Z].
Clicking the sequence of images creates a captcha code based on the letter assigned to each image (e.g. PKIR if cat = P, vehicle = K, drink = I, house = R), which gets placed in a hidden input and submitted with the form.
Therefore the only way to pass the captcha is to come up with a code that agrees with the sequence of images on the server side.
I would conclude it is relatively secure in that there is no way to defeat the captcha solely on the client side (see this question for example). Since there is no reason for them to ever present anything related to the solution to the client (browser), it would seem logical that the only way to get the correct captcha code is to select the correct images in the correct sequence.
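The scheme as I understand it from the demo can be sketched as follows. This is my own reconstruction in Python, not Confident CAPTCHA's code or API; the category list and the names new_challenge and verify are hypothetical.

```python
import secrets
import string

# Hypothetical image categories; the demo shows 9 images per challenge.
CATEGORIES = ["cat", "vehicle", "drink", "house", "tree",
              "shoe", "bird", "clock", "fish"]


def new_challenge(rng=None):
    """Create a challenge: 4 words to find, a letter per image, and the expected code."""
    rng = rng or secrets.SystemRandom()
    # Each image gets a distinct random letter, so the letters reveal nothing by themselves.
    letters = rng.sample(string.ascii_uppercase, len(CATEGORIES))
    letter_for = dict(zip(CATEGORIES, letters))
    wanted = rng.sample(CATEGORIES, 4)  # the words shown to the user, in order
    expected = "".join(letter_for[name] for name in wanted)
    # The server stores `expected`; the browser only gets the words and the lettered images.
    return wanted, letter_for, expected


def verify(expected, submitted):
    """Compare the hidden-input value with the stored code in constant time."""
    return secrets.compare_digest(expected.encode(), submitted.upper().encode())
```

The browser never receives the link between a word and its picture, so a bot has to either recognise the images or guess one of the 3024 ordered selections.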
Conclusion:
At first glance, the captcha seems secure (no easy bypasses).
This specific captcha may be more difficult to farm out to human solvers (a positive)
Depending on the number of objects and images in the database, it may be possible to generate a database of words to images.
One potential downside of the captcha is that certain words may require a moderate understanding of the English language; non-English-speaking users may be completely cut off or at least have to put in additional effort to translate the words into their native language.
You may want to do a usability check of this captcha on mobile devices (just a thought).
That's my 2 cents, I hope that helps you out.
I'm using it with ads and, well, it is very secure.
About the English language: the API supports many languages and adapts the questions based on the browser language.
I have used Google Translate to help people whose language ConfidentCaptcha does not cover.
No problems so far. They are very responsive, and the support is very good.
About mobile: if you don't use ads, there is a special mobile mode, which makes it very easy to use and well adapted to small devices.

Hardware Serial Number Discussion. Licence protection

I am working on an application which will get the HDD serial number, and then I will use that serial number for licence (CD key) registration of the product. Now, the problems I can run into:
The user has 2 HDDs, and once my application gets the serial from the first HDD it registers with it. So what if the user later changes the order of the HDDs, so that the second HDD becomes master and the first one becomes slave? This could be solved by getting both serials and combining them, but what if he later removes one? :D
What if the user's HDD dies and he buys a new one? It is still the same PC, only with another HDD. So the licence won't be valid anymore just because it is another HDD.
Is it possible to fake it? For example, I am using VB.NET 2010 and the application runs on the .NET Framework, so there is some DLL which is responsible for getting the HDD serial. Would it be possible to replace this DLL (crack it) so that it returns some hardcoded HDD serial?
Would it be possible to get the processor serial? That would be much better, but can it be done? Does the processor have a serial at all? I assume it does, but is it possible to read it? And the same question as above: could it be faked by changing a DLL or something?
Any other suggestions or experiences?
I have seen more questions like this but couldn't find any answers, so now I am asking here!
------ EDITED/ADDED: -------
As discussed below, I forgot that all .NET code can be decompiled in a few seconds! So...
Making my own installer. Why?
If I make an installer in which you enter a serial, and the software is installed only if the serial is OK, what does that achieve? It extracts my software to your computer, and again you have a .NET exe which you can easily decompile and make a crack for. So what is the point of an installer with a serial? And if my software is "protected" with some obfuscator, then an installer with a serial is unneeded: I could simply include the serial registration in my software and store registered = 1 or 0 in some boolean.
I got an email from one person here (by the way, I don't know where you got my email :)) and he says some smart things, including why some of you don't respond to my question and this discussion. What he says is this: "scared that others will see my code and how bad it is", so people just don't want to spend time on this. Well, that's not the problem. I know my code is a big "minestrone", a big mess, with many words (variables), some in English, some in Croatian, and so on. My software works, and that's what is important. I know I suc*, we all suc*; everyone knows something more or better than the next person. Anyway, that's not the problem. The problem is that I don't want the software to be open source. Let's say my software is "Photoshop", and now someone downloads it, clicks here and there, has the whole code, and can easily copy, paste, change a few things, and with no trouble he has made a good application :)
A custom compiler? Does anyone have experience with that? Would it be OK for some time? :)
What other solution or language would be good to use in the future to avoid this "open source" .NET? I have been looking around, and VB.NET, C#, C++ are all based on .NET, so it is all the same. VB6, which I love, is the same thing again. They can all be decompiled easily! What language is not so easy to decompile? Should I switch to assembler? :D I'm joking, I hope! :p
Maybe I am just too stressed, with too much work! I don't know, you decide :)
PLEASE READ MY QUESTION AND PLEASE DON'T ANSWER WITH SOMETHING LIKE "PIRACY CANNOT BE STOPPED BLA BLA" AND THINGS LIKE THAT. THAT WASN'T MY QUESTION! THANK YOU!
Sorry for the big bold letters, but some people read just the title and then answer with nonsense! If you want to talk about it, then read the question before you write; otherwise please don't post.
Let me first answer your questions:
If the order of the HDDs changes, your application could still find that serial number within the system. However, in either case I would resort to a scheme where I use the device of the system partition or so.
If the HDD dies, the user will be in trouble. There is no good solution to that as long as you insist on your source for the uniqueness of the user's system: i.e. the HDD serial.
It's absolutely possible, yes. At different levels, though. A cracker would always choose the simplest method.
Yes. I'm afraid that will only work with unmanaged code, though. See Wikipedia. And yes, this could be circumvented again by DLL placement (see my comment on the question).
Now let me give you some advice that worked fine for me. Use the SID of the machine account (not to be confused with SYSTEM, which has a well-known SID). And before you counter with NewSid (which, by the way, has been retired by MS): the SID takes much more effort to change, especially in domain environments, and changing it can have very nasty and unforeseen effects. Therefore, if you want to tie your application to a Windows installation, the SID will be sufficient. The SID has the same advantages as a UUID you could create, but it's not as easy to manipulate as a UUID that you store in the registry or a file.
Oh, and before I forget to mention it. Yes, even using the SID can be "cracked" in various ways. But it balances convenience for the user with your demand for security.
Yes, you have to be aware of that. You'll need several fall back methods to take care of this
You have to be aware of that as well.
Everything is fakeable with some energy behind it. However, why fake such an id if you simply can manipulate the program itself? All .net code can be disassembled and manipulated
I think this is possible as well, but would have the same problem behind it.
Other suggestion:
Just because there is piracy, don't make the experience bad for your customer. Use something that is reusable (like a serial number or keyfile), invest in a good obfuscator to make it harder for somebody to inspect your code, but above all: make your application stand out so people buy it. And even though you didn't ask for it, I have to say it: you can't stop piracy by enforcing Orwellian surveillance of your program. This will drive customers away, as it makes your application a pain in the *ss to work with. With a serial or keyfile you still have some sort of protection, and the customer likes it because it is easy to use: he doesn't have to call you or write a support ticket if his computer fails or the stars align unfavourably. Pirates will break it eventually, but your customer is happy, and that is what counts.
Anything you rely on which is in userland can and will be spoofed if it is worthwhile to the end user/attacker. So locking the licence to an HDD serial number will not put off attackers, but it will seriously upset your customers.
The same goes for processor serial numbers - it is too easy to pop some code inline to change what your application will read.
Your only reasonable bet will be dongles - ie specific hardware, or a way to get them to register and run with an online connection, so you can validate them using elements you control (although in saying that, if your app is high enough value, expect the dongle to be hacked/replicated too!)
Your biggest problem may be overdoing the security - if you get it wrong in any way you will alienate your customer base.
People regularly upgrade failed hard drives, or those which are too small, as well as most other components in their computers. If you stop them using your product, even for a couple of days, they are likely to look elsewhere!
You can do what you are suggesting, but there are issues. What you are suggesting is called "machine binding" in the licensing world. There are commercial tools that do this for you (disclaimer: I work for one such provider, Wibu-Systems). What YOU are proposing has some pros and cons:
Pros: requires no separate hardware (dongle), you can roll your own, easy solution to implement at a basic level.
Cons: can be cracked in a matter of minutes, will create problems for users when they change the HW config or move the app to a new PC, and rolling your own will introduce the opportunity for new bugs in an area you apparently have no prior experience with.
Why not use a commercial solution? Would you write your own setup program, too? How about your own compiler, linker, and debugger?

reCAPTCHA accepting one word out of two

I am a bit confused about how reCAPTCHA works. I have implemented it using RoR.
Sometimes, even if I enter only one word out of the two, it returns true, while at other times it fails.
I am really confused and cannot understand reCAPTCHA's behaviour.
Only one of the recaptcha words is "known" by the system - it is relying on the user performing the captcha to tell the system what the other word is, because it is not machine-readable.
That is the "point" of recaptcha, or the added benefit: it is not only performing a human test, it is also crowdsourcing, on a massive scale, the transcription of words where automated OCR has failed.
Recaptcha shows two words. One that a computer scanner has scanned and recognized and one that the computer scanner cannot recognize. Recaptcha checks for the word it knows the answer to and saves the response for the unknown word. These responses to the unknown words are compiled and analyzed so that it is essentially "solved" by humans and not by the computer scanner.
Here's more info, in their own words:
"But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct."
source - http://www.google.com/recaptcha/learnmore
Recaptcha uses two words, one of which is known and one of which is unknown (the unknown word is the one that the program is trying to help decipher; it's probably scanned out of an old book somewhere!). So really, all the service is looking for is the right answer to the KNOWN word. If that's the word you put in, it will succeed even if you don't put in anything for the unknown word. If you put in only the other word (the unknown one), it will fail.
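That behaviour can be sketched in a few lines of Python. This is only an illustration of the idea described above, not reCAPTCHA's real code or API; the names check and consensus, the in-memory votes store and the three-vote threshold are all assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical in-memory store: unknown-word image ID -> tally of human transcriptions.
votes = defaultdict(Counter)


def check(control_truth, control_answer, unknown_id, unknown_answer):
    """Pass or fail on the known (control) word only; record the other word as a vote."""
    if control_answer.strip().lower() != control_truth.lower():
        return False  # control word wrong or missing: the captcha fails
    guess = unknown_answer.strip().lower()
    if guess:
        votes[unknown_id][guess] += 1  # trusted because the control word was right
    return True


def consensus(unknown_id, min_votes=3):
    """Return the transcription once enough users agree on it, else None."""
    tally = votes.get(unknown_id)
    if not tally:
        return None
    word, count = tally.most_common(1)[0]
    return word if count >= min_votes else None
```

Since the user cannot tell which of the two words is the control word, typing only one word passes when it happens to be the known one and fails otherwise, which matches the "sometimes true, sometimes false" behaviour in the question.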
I think that's the main point of recaptcha. It helps developers tell the difference between humans and robots, and it also helps digitize books.
There are always two words. One is easier to read. If you can read this word, it's fine, you're human.
The second word is a scan from a book where automatic OCR (recognition) is not sure about the word. So users help read this word so that books can be digitized better.