I am learning Perl6 from Perl5.
In order to make this compile, I'll post the whole program:
sub lgamma ( Num(Real) \n --> Num ){
use NativeCall;
sub lgamma (num64 --> num64) is native {}
lgamma( n )
}
sub pvalue (#a, #b) {
if #a.elems <= 1 {
return 1.0;
}
if #b.elems <= 1 {
return 1.0;
}
my Rat $mean1 = #a.sum / #a.elems;
my Rat $mean2 = #b.sum / #b.elems;
if $mean1 == $mean2 {
return 1.0;
}
my Rat $variance1 = 0.0;
my Rat $variance2 = 0.0;
for #a -> $i {
$variance1 += ($mean1 - $i)**2#";" unnecessary for last statement in block
}
for #b -> $i {
$variance2 += ($mean2 - $i)**2
}
if ($variance1 == 0 && $variance2 == 0) {
return 1.0;
}
$variance1 /= (#a.elems - 1);
$variance2 /= (#b.elems - 1);
my $WELCH_T_STATISTIC = ($mean1-$mean2)/sqrt($variance1/#a.elems+$variance2/#b.elems);
my $DEGREES_OF_FREEDOM = (($variance1/#a.elems+$variance2/#b.elems)**2)
/
(
($variance1*$variance1)/(#a.elems*#a.elems*(#a.elems-1))+
($variance2*$variance2)/(#b.elems*#b.elems*(#b.elems-1))
);
my $A = $DEGREES_OF_FREEDOM/2;
my $value = $DEGREES_OF_FREEDOM/($WELCH_T_STATISTIC*$WELCH_T_STATISTIC+$DEGREES_OF_FREEDOM);
my Num $beta = lgamma($A)+0.57236494292470009-lgamma($A+0.5);
my Rat $acu = 10**(-15);
my ($ai,$cx,$indx,$ns,$pp,$psq,$qq,$rx,$temp,$term,$xx);
# Check the input arguments.
return $value if $A <= 0.0;# || $q <= 0.0;
return $value if $value < 0.0 || 1.0 < $value;
# Special cases
return $value if $value == 0.0 || $value == 1.0;
$psq = $A + 0.5;
$cx = 1.0 - $value;
if $A < $psq * $value {
($xx, $cx, $pp, $qq, $indx) = ($cx, $value, 0.5, $A, 1);
} else {
($xx, $pp, $qq, $indx) = ($value, $A, 0.5, 0);
}
$term = 1.0;
$ai = 1.0;
$value = 1.0;
$ns = $qq + $cx * $psq;
$ns = $ns.Int;
#Soper reduction formula.
$rx = $xx / $cx;
$temp = $qq - $ai;
$rx = $xx if $ns == 0;
while (True) {
$term = $term * $temp * $rx / ( $pp + $ai );
$value = $value + $term;
$temp = $term.abs;
if $temp <= $acu && $temp <= $acu * $value {
$value = $value * ($pp * $xx.log + ($qq - 1.0) * $cx.log - $beta).exp / $pp;
$value = 1.0 - $value if $indx;
last;
}
$ai++;
$ns--;
if 0 <= $ns {
$temp = $qq - $ai;
$rx = $xx if $ns == 0;
} else {
$temp = $psq;
$psq = $psq + 1.0;
}
}
return $value;
}
my #array2d = ([27.5,21.0,19.0,23.6,17.0,17.9,16.9,20.1,21.9,22.6,23.1,19.6,19.0,21.7,21.4],
[27.1,22.0,20.8,23.4,23.4,23.5,25.8,22.0,24.8,20.2,21.9,22.1,22.9,20.5,24.4],#0.
[17.2,20.9,22.6,18.1,21.7,21.4,23.5,24.2,14.7,21.8],
[21.5,22.8,21.0,23.0,21.6,23.6,22.5,20.7,23.4,21.8,20.7,21.7,21.5,22.5,23.6,21.5,22.5,23.5,21.5,21.8],
[19.8,20.4,19.6,17.8,18.5,18.9,18.3,18.9,19.5,22.0],
[28.2,26.6,20.1,23.3,25.2,22.1,17.7,27.6,20.6,13.7,23.2,17.5,20.6,18.0,23.9,21.6,24.3,20.4,24.0,13.2],
[30.02,29.99,30.11,29.97,30.01,29.99],
[29.89,29.93,29.72,29.98,30.02,29.98],
[3.0,4.0,1.0,2.1],
[490.2,340.0,433.9],
[<1.0/15.0>, <10.0/62.0>],
[<1.0/10>, <2/50.0>],
[0.010268,0.000167,0.000167],
[0.159258,0.136278,0.122389],
[9/23.0,21/45.0,0/38.0],
[0/44.0,42/94.0,0/22.0]);
say #array2d[11][0];
my #CORRECT_ANSWERS = (0.021378001462867,
0.148841696605327,
0.0359722710297968,
0.090773324285671,
0.0107515611497845,
0.00339907162713746,
0.52726574965384,
0.545266866977794);
my UInt $i = 0;
my Real $error = 0.0;
for #array2d -> #left, #right {
my $pvalue = pvalue(#left, #right);
$error += ($pvalue - #CORRECT_ANSWERS[$i]).abs;
say "$i [" ~ #left.join(',') ~ '] [' ~ #right ~ "] = $pvalue";
if $error > 10**-9 {
say "\$p = $pvalue, but should be #CORRECT_ANSWERS[$i]";
die;
}
# printf("Test sets %u p-value = %.14g\n",$i+1,$pvalue);
$i++
}
printf("the cumulative error is %g\n", $error);
What makes this sub-array different is that it has the "/" for division. How can I make Perl6 for-loop evaluate this sub-array?
EDIT: I'm struggling with what constitutes a minimal working example. I'm posting the entire code so that it will compile.
(It turned out that this answer completely missed what was ailing #con. But I'll not delete it because it gathers together some hopefully useful links regarding rational numerics.)
What's wrong?
why is array skipping calculated values inside declaration?
It isn't.
I am learning Perl6 ...
Numbers like 1.3 are decimals in math and in Perl 6. Some explanations:
Damian Conway spends 3 minutes demonstrating the difference between the Perl 6 and Perl 5 approaches to rationals.
My Medium comment on Baking rationals into a programming language.
The P6 doc's Numerics page. (Written by Zoffix. Thank you for the ton of excellent stuff you did for P6 Zoffix.)
My SO answer to the question "Does Perl 6 performance suffer by using rationals for decimal numbers".
... from Perl5
Something about this belongs in one of the Perl 5 to Perl 6 guide docs. Would you be willing to open a new doc issue?
[1.0/15.0, 10.0/62.0],#this isn't evaluated
[1.0/10, 2/50.0],#neither is this
They're all evaluated.
In math and in P6 a literal 1.0 is a decimal is a rational. foo / bar is also a rational.
(Well, it is if foo and bar are both integers or rationals and either the denominator of the result remains 64 bits or less or one of foo or bar is an arbitrary precision FatRat rational.)
However, Perl6 doesn't seem to like values like I have indicated here.
You haven't explained the situation in which you're seeing something that makes you think P6 doesn't like them.
A strong possibility is that, per Brad Gilbert++'s comment, you are seeing values like, say, <1/15> instead of 0.066667. The former is a P6 literal value that 100% accurately denotes 1/15. To 100% accurately denote that in decimal the display would have to be 0.06666666666666... with a ... or some such at the end to denote an infinitely repeating final 6. But <1/15> denotes the same number and is shorter and arguably simpler so dd and .perl use the <1/15> form instead.
How can I get Perl6 to evaluate expressions like this when I'm declaring the 2D array?
You don't have to do anything. It's evaluating them. :)
(This is a nanswer, i.e. not an answer per se. It was originally written after #con rewrote their question to include all their code and was a step towards my Third and final answer. It's now hopefully a useful resource for those learning Perl 6.)
Original intro to my first edit of this nanswer
That's a whole lot of code! :)
My thought thus far is: are you absolutely, positively, 100% certain you haven't just missed out some input data? It seems vastly more likely that you've skipped data than that P6 has, especially given that the value calculated is exactly the one you're expecting for your next correct result.
(Update It did indeed turn out to be that the problem was incorrect input data.)
Translation, with commentary, of the code in question
The rest of this nanswer is a line-by-line "clean up" (not a refactor) of the code in the question. I had two goals in mind, with the second one being the most important one as you, dear reader, read this:
My translation effectively proved to me and demonstrated to #con that I'd taken every bit of their code into account. This was intended to reduce uncertainty about where the bug might be. Note that most of my changes could not possibly be directly relevant to their bug but I didn't feel comfortable assuming anything until I'd done the rewrite.
#con's code, and my translation of it, can presumably be useful for anyone learning Perl 6. #con's code is a P6 translation of Perl 5 code. The P5 code is in turn a translation of C code. And there are other translations of that code into other languages. My code takes #con's translation and translates it to a more idiomatic version with commentary explaining why I changed their code.
My translation of code in question, runnable via tio
My translation of #con's code, minus my commentary (see below), in tio.
(My original plan was that we would further explore #con's bug based on modifying the code in tio and sharing links to the modified tio version(s). To do that you/they would just click within tio the link icon at the top (š) to get a link to the code as it is when the link icon is clicked. Tio is great for being able to write, edit and run code written in P6 and other languages and then share it.)
My translation of code in question, with commentary
I've left the opening lgamma routine as is. I've written extensive commentary about it in a footnote, mainly for other readers, because it packs in a ton of interesting features1:
sub lgamma ( Num(Real) \n --> Num ){
use NativeCall;
sub lgamma (num64 --> num64) is native {}
lgamma( n )
}
I note that the lgamma code is all about floating point numbers (in P6 that's the Num / num... types).
sub pvalue (#a, #b) {
For those who don't know Perls, the sub keyword introduces a subroutine (aka function).
This one takes two "listy" (Positional) arguments.
(When you see an # (either as a "sigil" eg as in #foo or as an operator, eg as in #($bar)), think "list".)
To speed up code reading and reduce repetitive code I replaced this:
if #a.elems <= 1 {
return 1.0;
}
if #b.elems <= 1 {
return 1.0;
}
with this:
return 1 if #a | #b <= 1;
Read this in English as "return 1 if the number of elements in either list #a or list #b is less than or equal to 1".
I used the | operator to construct an any junction.
(Don't waste time trying to wrap your head around the theory of what junctions do, and how they do it. Just use them in practice in simple and/or succinct ways. If they're kept simple, and read in a simple fashion, they're great and just do the obvious thing. If using them makes code not obvious in a particular case, then consider using some other construct for that particular case.)
An array in numeric context evaluates to its number of elements. Numeric operators like <= impose numeric context. So I dropped the .elems.
Numeric context, and arrays evaluating to their length in numeric context, are basic aspects of P6 so this is idiomatic coding that's appropriate for all but the most basic introductory newbie examples.
I switched from 1.0 to 1.
My guess is that #con wrote 1.0 because that was how the value was literally written in code they were translating and/or with the intent it represented a floating point value. But in P6 plain decimal literals (without an e exponent) like 1.0 produce rational numbers instead.
Using 1.0 instead of 1 in this code replaces a simpler faster type (integers) with a more complicated slower type (rationals). A value of this slower type will then be forced to convert (more slowly than an integer) into a floating point number when it's used in formulae in which any component value is floating point. Which will happen for most or all the formulae in this code because lgamma returns a floating point number.
More generally, P6 works and reads best if you leave the types of variables and values unspecified, leaving them for the compiler to figure out, unless there's a compelling reason to specify them. Leaving types unspecified reduces cognitive load for the reader and increases flexibility for reuse of the code and for optimization of the code by the compiler.
This leads to a complementary pair of maxims:
By default, leave type information implicit. If you don't know if it matters what type a value or variable is given, or know that it doesn't matter, then don't specify it.
If you make the type of a value, variable or parameter explicit, then P6 will use that type forcing compliance with that type, regardless of whether that improves your code or unnecessarily slows code down or stops it altogether.
If you absolutely know that you need to use a different type or add a type constraint to make code correct, then by all means go ahead. Likewise, if you've already verified that your code is correct without the more specific type but you want to make it faster, safer, or clearer, go ahead. But if you don't know, then leave it generic. P6 has been designed to do what you mean and will succeed far more often than you can possibly imagine, as Ben once said.
To state this point even more forcefully, in analogy with premature optimization, in P6, premature typing is an anti-pattern in both one-off code and long term production code.
my Rat $mean1 = #a.sum / #a.elems;
my Rat $mean2 = #b.sum / #b.elems;
if $mean1 == $mean2 {
return 1.0;
}
became:
my (\mean_a, \mean_b) = #a.sum / #a, #b.sum / #b;
return 1 if mean_a == mean_b;
I typically "slash sigils" of "variables" unless I know I'll need a sigil. If a "variable" doesn't actually vary, it likely doesn't need a sigil.
I've renamed $mean1 to mean_a because it clearly corresponds to #a.
my Rat $variance1 = 0.0;
my Rat $variance2 = 0.0;
for #a -> $i {
$variance1 += ($mean1 - $i)**2#";" unnecessary for last statement in block
}
for #b -> $i {
$variance2 += ($mean2 - $i)**2
}
became:
my ($vari_a, $vari_b);
$vari_a += (mean_a - $_)Ā² for #a;
$vari_b += (mean_b - $_)Ā² for #b;
Read the variable $_ as "it".
if ($variance1 == 0 && $variance2 == 0) {
return 1.0;
}
$variance1 /= (#a.elems - 1);
$variance2 /= (#b.elems - 1);
became:
return 1 unless ($vari_a or $vari_b);
$vari_a /= (#a - 1);
$vari_b /= (#b - 1);
In Perls, 0 evaluates in a boolean test context as False. And an array or list like #a evaluates in a numeric context to its element count.
my $WELCH_T_STATISTIC = ($mean1-$mean2)/sqrt($variance1/#a.elems+$variance2/#b.elems);
my $DEGREES_OF_FREEDOM = (($variance1/#a.elems+$variance2/#b.elems)**2)
/
(
($variance1*$variance1)/(#a.elems*#a.elems*(#a.elems-1))+
($variance2*$variance2)/(#b.elems*#b.elems*(#b.elems-1))
);
my $A = $DEGREES_OF_FREEDOM/2;
my $value = $DEGREES_OF_FREEDOM/($WELCH_T_STATISTIC*$WELCH_T_STATISTIC+$DEGREES_OF_FREEDOM);
my Num $beta = lgamma($A)+0.57236494292470009-lgamma($A+0.5);
my Rat $acu = 10**(-15);
my ($ai,$cx,$indx,$ns,$pp,$psq,$qq,$rx,$temp,$term,$xx);
# Check the input arguments.
return $value if $A <= 0.0;# || $q <= 0.0;
became:
my \WELCH_T_STATISTIC = (mean_a - mean_b)
/ ( $vari_a / #a + $vari_b / #b ).sqrt;
my \DEGREES_OF_FREEDOM = ($vari_a / #a + $vari_b / #b)Ā²
/ ($vari_aĀ² / (#aĀ² * (#a - 1)) + $vari_bĀ² / (#bĀ² * (#b - 1)));
my \A = DEGREES_OF_FREEDOM / 2;
my $value = DEGREES_OF_FREEDOM
/ (WELCH_T_STATISTICĀ² + DEGREES_OF_FREEDOM);
my \magic-num = 0.57236494292470009;
my \beta = lgamma(A) + magic-num - lgamma(A + 0.5);
my \acu = 1e-15;
# Check the input arguments.
return $value if A <= 0;# || $q <= 0.0;
(What's that $q about in the above line?)
Note that I dropped the types on beta and acu. The value assigned to the beta variable will be a Num anyway because lgamma returns a Num. The value assigned to acu will also be a Num because use of the e exponent in a number literal means the value it constructs is a Num.
return $value if $value < 0.0 || 1.0 < $value;
# Special cases
return $value if $value == 0.0 || $value == 1.0;
became:
return $value unless $value ~~ 0^..^1;
Read the 0^ as "above zero" (not including zero) and the ^1 as "up to one" (not including one).
The ~~ is the "smart match" operator. It returns True if the value on its right accepts the value on its left.
So this return statement returns $value if $value is less than or equal to 0 or greater than or equal to 1.
As a minor tidy up I moved a my declaration of a load of variables that I dropped out in the above rewrite to this point, immediately before they became relevant, and added $ns too:
my ($ai, $cx, $indx, $pp, $psq, $qq, $rx, $temp, $term, $xx, $ns);
$psq = $A + 0.5;
$cx = 1.0 - $value;
if $A < $psq * $value {
($xx, $cx, $pp, $qq, $indx) = ($cx, $value, 0.5, $A, 1);
} else {
($xx, $pp, $qq, $indx) = ($value, $A, 0.5, 0);
}
became:
$psq = A + 0.5;
$cx = 1 - $value;
($xx, $cx, $pp, $qq, $indx) =
A < $psq * $value
?? ($cx, $value, 0.5, A, 1)
!! ($value, $cx, A, 0.5, 0);
I rewrote the conditional assignment of a bunch of variables as a ternary so it was easier to see what got assigned to what.
$term = 1.0;
$ai = 1.0;
$value = 1.0;
$ns = $qq + $cx * $psq;
$ns = $ns.Int;
became:
$term = 1;
$ai = 1;
$value = 1;
$ns = ($qq + $cx * $psq) .Int;
Replacing 1.0s with 1s and combining the expression assigned to $ns with the .Int coercion.
(I stripped types from the code as I translated and it continued to calculate the correct results except that removing the above Int coercion made the code infiniloop. This is what finally led me to search the net to see if I could find the code #con was translating. That's when I found it on rosettacode.org. It was explicitly typed as an integer in the code I saw so presumably it's central to ensuring the algorithm works.)
#Soper reduction formula.
$rx = $xx / $cx;
$temp = $qq - $ai;
$rx = $xx if $ns == 0;
(Didn't change.)
while (True) {
became:
loop {
$term = $term * $temp * $rx / ( $pp + $ai );
$value = $value + $term;
$temp = $term.abs;
(Didn't change.)
if $temp <= $acu && $temp <= $acu * $value {
$value = $value * ($pp * $xx.log + ($qq - 1.0) * $cx.log - $beta).exp / $pp;
$value = 1.0 - $value if $indx;
last;
}
became:
if $temp <= acu & acu * $value {
$value = $value * ($pp * $xx.log + ($qq - 1) * $cx.log - beta).exp / $pp;
$value = 1 - $value if $indx;
last;
}
This time the condition containing the junction (&) reads in English as "if temp is less than or equal to both acu and acu times value".
$ai++;
$ns--;
if 0 <= $ns {
$temp = $qq - $ai;
$rx = $xx if $ns == 0;
} else {
$temp = $psq;
$psq = $psq + 1;
}
}
return $value;
}
I just replaced 1.0 with 1.
Now the problematic array. As I wrote at the start, I'm pretty sure you have (or the supplier of your data has) just forgotten a couple lines:
my #array2d =
[27.5,21.0,19.0,23.6,17.0,17.9,16.9,20.1,21.9,22.6,23.1,19.6,19.0,21.7,21.4],
[27.1,22.0,20.8,23.4,23.4,23.5,25.8,22.0,24.8,20.2,21.9,22.1,22.9,20.5,24.4],
[17.2,20.9,22.6,18.1,21.7,21.4,23.5,24.2,14.7,21.8],
[21.5,22.8,21.0,23.0,21.6,23.6,22.5,20.7,23.4,21.8,20.7,21.7,21.5,22.5,23.6,21.5,22.5,23.5,21.5,21.8],
[19.8,20.4,19.6,17.8,18.5,18.9,18.3,18.9,19.5,22.0],
[28.2,26.6,20.1,23.3,25.2,22.1,17.7,27.6,20.6,13.7,23.2,17.5,20.6,18.0,23.9,21.6,24.3,20.4,24.0,13.2],
[30.02,29.99,30.11,29.97,30.01,29.99],
[29.89,29.93,29.72,29.98,30.02,29.98],
[3.0,4.0,1.0,2.1],
[490.2,340.0,433.9],
[<1.0/15.0>, <10.0/62.0>],
[<1.0/10>, <2/50.0>],
[0.010268,0.000167,0.000167],
[0.159258,0.136278,0.122389],
[9/23.0,21/45.0,0/38.0],
[0/44.0,42/94.0,0/22.0];
say #array2d[11][0];
Who or what says these are the correct answers? Are you 100% sure the 0.0033... answer goes with the [<1.0/15.0>, <10.0/62.0>],[<1.0/10>, <2/50.0>] data?
my #CORRECT_ANSWERS =
0.021378001462867,
0.148841696605327,
0.0359722710297968,
0.090773324285671,
0.0107515611497845,
0.00339907162713746,
0.52726574965384,
0.545266866977794;
And in the last bit I just removed types and once again used a superscript for a prettier value-plus-exponent (10ā»ā¹):
my $i = 0;
my $error = 0;
for #array2d -> #left, #right {
my $pvalue = pvalue(#left, #right);
$error += ($pvalue - #CORRECT_ANSWERS[$i]).abs;
say "$i [" ~ #left.join(',') ~ '] [' ~ #right ~ "] = $pvalue";
if $error > 10ā»ā¹ {
say "\$p = $pvalue, but should be #CORRECT_ANSWERS[$i]";
die;
}
# printf("Test sets %u p-value = %.14g\n",$i+1,$pvalue);
$i++
}
printf("the cumulative error is %g\n", $error);
Footnotes
1 Love that nice lgamma routine! (Turns out Brad Gilbert wrote it.)
sub lgamma ( Num(Real) \n --> Num ){
use NativeCall;
sub lgamma (num64 --> num64) is native {}
lgamma( n )
}
Showcasing:
Shadowing a low level (C) routine with a high level P6 (Perl 6) routine with the same name to add further processing before calling the C function.
Explicitly converting the input type from the broader P6 type Real numbers to the narrower Num. Perls are "strongly typed" per the original technical definition of the term, but P6 provides additional options per several other interpretations of "strong typing" and relative to languages less capable of these other interpretations of "strong typing" (like Perl 5, Python, and C). Explicit type conversion is part of this shift in capability that P6 introduces.
Slashing the sigil. This is further discussed where I've done the same elsewhere in this post.
Lexically scoping use of a library. The inner lgamma routine, and the symbols imported by use Nativecall;, are not visible anywhere but inside the outer lgamma routine containing it.
Using NativeCall (the P6 C FFI) to allow high level P6 code to sweetly map directly to C code including automatic conversion from P6's IEEE double float compatible boxed type Num to the unboxed machine data type equivalent num64.
All in 5 lines! Very nice. :)
Third and final answer
.oO (He says, hopefully. Has anyone ever written four answers to an SO question?!? š¤ )
If you swap these lines in your data:
[<1.0/15.0>, <10.0/62.0>],
[<1.0/10>, <2/50.0>],
[0.010268,0.000167,0.000167],
[0.159258,0.136278,0.122389],
to instead be the other way around:
[0.010268,0.000167,0.000167],
[0.159258,0.136278,0.122389],
[<1.0/15.0>, <10.0/62.0>],
[<1.0/10>, <2/50.0>],
then the output of your program (at least my version of it from my rewrite of your code nanswer) is:
0.159258
0 [27.5,21,19,23.6,17,17.9,16.9,20.1,21.9,22.6,23.1,19.6,19,21.7,21.4] [27.1 22 20.8 23.4 23.4 23.5 25.8 22 24.8 20.2 21.9 22.1 22.9 20.5 24.4] = 0.02137800146286709
1 [17.2,20.9,22.6,18.1,21.7,21.4,23.5,24.2,14.7,21.8] [21.5 22.8 21 23 21.6 23.6 22.5 20.7 23.4 21.8 20.7 21.7 21.5 22.5 23.6 21.5 22.5 23.5 21.5 21.8] = 0.14884169660532756
2 [19.8,20.4,19.6,17.8,18.5,18.9,18.3,18.9,19.5,22] [28.2 26.6 20.1 23.3 25.2 22.1 17.7 27.6 20.6 13.7 23.2 17.5 20.6 18 23.9 21.6 24.3 20.4 24 13.2] = 0.035972271029797116
3 [30.02,29.99,30.11,29.97,30.01,29.99] [29.89 29.93 29.72 29.98 30.02 29.98] = 0.09077332428566681
4 [3,4,1,2.1] [490.2 340 433.9] = 0.010751561149784494
5 [0.010268,0.000167,0.000167] [0.159258 0.136278 0.122389] = 0.003399071627137453
6 [1.0/15.0,10.0/62.0] [1.0/10 2/50.0] = 0.5272657496538401
7 [0.391304,0.466667,0] [0 0.446809 0] = 0.5452668669777938
the cumulative error is 5.50254e-15
In retrospect this was very obviously what was wrong. 2020 hindsight and all that. :)
Can credit card type be determined solely from the credit card number?
Is this recommended or should we always ask client for the type of credit card they're using?
I Googled about it and found this algorithm: http://cuinl.tripod.com/Tips/o-1.htm , is this reliable?
Yes, the site you mentioned is correct. Many sites, incl. Google Checkout I believe, rely on automatic detection of the card type. It's convenient, makes the UI less cluttered (one less input box) and saves time. Go ahead!
I have heard one good reason to make them pick (even though you can figure it out). So that they know the list of credit cards you accept.
As a consumer, I hate choosing a card first. I want to just start typing the number.
This issue is discussed in Wroblewski's Web Form Design on pages 153-154. It's in the section "Removing Questions" of the chapter "Unnecessary Inputs." The example given is Paypal, which highlights the type of card when you've typed in your number.
I am pretty certain that at least for MasterCard, Visa, Discover, and American Express that that is accurate. I have never worked with any of the others.
See the very bottom of this page:
http://www.merchantplus.com/resources/pages/credit-card-logos-and-test-numbers/
Also this might be useful to you"
http://www.beachnet.com/~hstiles/cardtype.html
This is pretty interesting to:
http://en.wikipedia.org/wiki/Bank_card_number
here is the script that i use that works with current card ranges. also does a validity check on the number.
/**
* checks a given string for a valid credit card
* #returns:
* -1 invalid
* 1 mastercard
* 2 visa
* 3 amex
* 4 diners club
* 5 discover
* 6 enRoute
* 7 jcb
*/
function checkCC(val) {
String.prototype.startsWith = function (str) {
return (this.match("^" + str) == str)
}
Array.prototype.has=function(v,i){
for (var j=0;j<this.length;j++){
if (this[j]==v) return (!i ? true : j);
}
return false;
}
// get rid of all non-numbers (space etc)
val = val.replace(/[^0-9]/g, "");
// now get digits
var d = new Array();
var a = 0;
var len = 0;
var cval = val;
while (cval != 0) {
d[a] = cval%10;
cval -= d[a];
cval /= 10;
a++;
len++;
}
if (len < 13)
return -1;
var cType = -1;
// mastercard
if (val.startsWith("5")) {
if (len != 16)
return -1;
cType = 1;
} else
// visa
if (val.startsWith("4")) {
if (len != 16 && len != 13)
return -1;
cType = 2;
} else
// amex
if (val.startsWith("34") || val.startsWith("37")) {
if (len != 15)
return -1;
cType = 3;
} else
// diners
if (val.startsWith("36") || val.startsWith("38") || val.startsWith("300") || val.startsWith("301") || val.startsWith("302") || val.startsWith("303") || val.startsWith("304") || val.startsWith("305")) {
if (len != 14)
return -1;
cType = 4;
} else
// discover
if (val.startsWith("6011")) {
if (len != 15 && len != 16)
return -1;
cType = 5;
} else
// enRoute
if (val.startsWith("2014") || val.startsWith("2149")) {
if (len != 15 && len != 16)
return -1;
// any digit check
return 6;
} else
// jcb
if (val.startsWith("3")) {
if (len != 16)
return -1;
cType = 7;
} else
// jcb
if (val.startsWith("2131") || val.startsWith("1800")) {
if (len != 15)
return -1;
cType = 7;
} else
return - 1;
// invalid cc company
// lets do some calculation
var sum = 0;
var i;
for (i = 1; i < len; i += 2) {
var s = d[i] * 2;
sum += s % 10;
sum += (s - s%10) /10;
}
for (i = 0; i < len; i += 2)
sum += d[i];
// musst be %10
if (sum%10 != 0)
return - 1;
return cType;
}
Here's Complete C# or VB code for all kinds of CC related things on codeproject.
IsValidNumber
GetCardTypeFromNumber
GetCardTestNumber
PassesLuhnTest
This article has been up for a couple years with no negative comments.
Wikipedia contains a list of most card prefixes. Some cards are missing from the link you posted. It also appears that the link you provided is valid.
One reason to ask for the card type is for extra validation, compare what the user provided against the number.
This is the php version of same algorithm mentioned in 1st post
<?php
function CreditCardType($CardNo)
{
/*
'*CARD TYPES *PREFIX *WIDTH
'American Express 34, 37 15
'Diners Club 300 to 305, 36 14
'Carte Blanche 38 14
'Discover 6011 16
'EnRoute 2014, 2149 15
'JCB 3 16
'JCB 2131, 1800 15
'Master Card 51 to 55 16
'Visa 4 13, 16
*/
//Just in case nothing is found
$CreditCardType = "Unknown";
//Remove all spaces and dashes from the passed string
$CardNo = str_replace("-", "",str_replace(" ", "",$CardNo));
//Check that the minimum length of the string isn't less
//than fourteen characters and -is- numeric
If(strlen($CardNo) < 14 || !is_numeric($CardNo))
return false;
//Check the first two digits first
switch(substr($CardNo,0, 2))
{
Case 34: Case 37:
$CreditCardType = "American Express";
break;
Case 36:
$CreditCardType = "Diners Club";
break;
Case 38:
$CreditCardType = "Carte Blanche";
break;
Case 51: Case 52: Case 53: Case 54: Case 55:
$CreditCardType = "Master Card";
break;
}
//None of the above - so check the
if($CreditCardType == "Unknown")
{
//first four digits collectively
switch(substr($CardNo,0, 4))
{
Case 2014:Case 2149:
$CreditCardType = "EnRoute";
break;
Case 2131:Case 1800:
$CreditCardType = "JCB";
break;
Case 6011:
$CreditCardType = "Discover";
break;
}
}
//None of the above - so check the
if($CreditCardType == "Unknown")
{
//first three digits collectively
if(substr($CardNo,0, 3) >= 300 && substr($CardNo,0, 3) <= 305)
{
$CreditCardType = "American Diners Club";
}
}
//None of the above -
if($CreditCardType == "Unknown")
{
//So simply check the first digit
switch(substr($CardNo,0, 1))
{
Case 3:
$CreditCardType = "JCB";
break;
Case 4:
$CreditCardType = "Visa";
break;
}
}
return $CreditCardType;
}
?>
The code you linked has an incomplete BIN/range list for Discover, omits Diner's club (which now belongs to Discover anyway), lists card types that no longer exist and should be folded into other types (enRoute, Carte Blanche), and ignores the increasingly-important Maestro International cart type.
As #Alex confirmed, it's possible to determine the card type from the BIN number, and numerous companies do it but doing so consistently and correctly is not trivial: card brands constantly change, and keeping track of things becomes more complicated as you try to handle regional debit cards (Laser in Ireland, Maestro in Europe, etc) - I have not found a free and maintained (correct) piece of code or algorithm for this anywhere.
As #MitMaro poined out, Wikipedia contains a high-level list of card identifiers, and also a more-specific list of BIN numbers and ranges, which is a good start, and as gbjbaanb commented, Barclays has a publically-published list (but it does not seem to include Discover for some reason - presumably they don't process on the Discover network?)
Trivial as it may seem, a correct card-identification algorithm/method/function takes work to maintain, so unless your detection routine is non-critical/informational (eg #Simon_Weaver's suggestion), OR you're going to put in the work to keep it current, I would recommend that you skip the automatic detection.
Stripe has provided this fantastic javascript library for card scheme detection. Let me add few code snippets and show you how to use it.
Firstly Include it to your web page as
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery.payment/1.2.3/jquery.payment.js " ></script>
Secondly use the function cardType for detecting the card scheme.
$(document).ready(function() {
var type = $.payment.cardType("4242 4242 4242 4242"); //test card number
console.log(type);
});
Here are the reference links for more examples and demos.
Stripe blog for jquery.payment.js
Github repository
Here's a quick a dirty way to determine the card type automatically and show it to the user while they're typing.
That means
a) the user doesnt have to pick it and
b) they wont waste time looking for a dropdown that doesnt exist.
Very simple jQuery version for Amex, Visa and Mastercard.
if you need other card types you can take the
$('[id$=CreditCardNumber]').assertOne().keyup(function(){
// rules taken from http://en.wikipedia.org/wiki/Credit_card_number#cite_note-GenCardFeatures-0
var value = $(this).val();
$('#ccCardType').removeClass("unknown");
if ((/^4/).test(value)) {
$('#ccCardType').html("Visa");
return;
}
if ((/^5[1-5]/).test(value)) {
$('#ccCardType').html("Mastercard");
return;
}
if ((/^3[47]/).test(value)) {
$('#ccCardType').html("Mastercard");
return;
}
$('#ccCardType').html("Enter card number above");
$('#ccCardType').addClass("unknown");
});
This is the jQuery to accompany this (ASP.NET MVC):
Card number: <%= Html.TextBox("PaymentDetails.CreditCardDetails.CreditCardNumber")%>
Card Type: <span id="ccCardType" class="unknown">Enter card number above</span>
I have a css rule for .unknown to display grayed out text.
This implementation in Python should work for AmEx, Discover, Master Card, Visa:
def cardType(number):
number = str(number)
cardtype = "Invalid"
if len(number) == 15:
if number[:2] == "34" or number[:2] == "37":
cardtype = "American Express"
if len(number) == 13:
if number[:1] == "4":
cardtype = "Visa"
if len(number) == 16:
if number[:4] == "6011":
cardtype = "Discover"
if int(number[:2]) >= 51 and int(number[:2]) <= 55:
cardtype = "Master Card"
if number[:1] == "4":
cardtype = "Visa"
return cardtype
If all the credit cards that you accept have the same properties then just let the user enter the card number and other properties (expiry date, CVV, etc). However, some card types require different fields to be entered (e.g. start date or issue number for UK Maestro cards). In those cases, you either have to have all fields, thereby confusing the user, or some Javascript to hide/show the relevant fields, again making the user experience a bit weird (fields disappearing/appearing, as they enter the credit card number). In those cases, I recommend asking for the card type first.
Personally I have no problem with picking the card type first. But there are two aspects of credit card number entry that are problematic in my view.
The worst is the inability to enter spaces between groups of numbers. Including the spaces printed on the physical cards would make the digits vastly easier for the user to scan to verify they've entered the information correctly. Every time I encounter this ubiquitous deficiency I feel like I'm being propelled backwards into a stone age when user input couldn't be filtered to remove unnecessary characters.
The second is the need when placing a phone order to listen to the vendor repeat the card number back to you. All the credit card recipient actually needs is a UI that gives them access to the check digit scheme which verifies that a cc number is valid. According to that algorithm the first 15 (or however many) digits calculate the last digit - and is virtually impossible to "fool." For a fat fingered number to "pass" requires at least two mutually canceling errors among the 15 digits. Unless the algorithm suffers from the defect of being dis-proportionally fooled by transposing adjacent numbers (a common entry error) which I doubt, I except it is more reliable than any human double check.
https://binlist.net/ offers a free API. You only need to enter the first 6 or 8 digits of the card number - i.e. the Issuer Identification Numbers (IIN), previously known as Bank Identification Number (BIN).
curl -H "Accept-Version: 3" "https://lookup.binlist.net/45717360"
NPM: https://www.npmjs.com/package/binlookup
Source: https://github.com/paylike/binlookup
(cross-posted from a more specific question: How tell the difference between a Debit Card and a Credit Card )