I have some MS Word documents whose entire contents I have transferred into a SQL table.
The contents contain a number of square brackets and curly brackets, e.g.
[{a} as at [b],] {c,} {d,} etc
and I need to check that the brackets are balanced/matching, e.g. each of the contents below should return false:
- [{a} as at [b], {c,} {d,}
- ][{a} as at [b], {c,} {d,}
- [{a} as at [b],] {c,} }{d,
So far I have extracted all the brackets and stored their details in a SQL table like the one below:
(paragraph number, bracket type, bracket position, bracket level)
3 [ 8 1
3 ] 18 0
3 [ 23 1
3 ] 35 0
7 [ 97 1
7 ] 109 0
7 [ 128 1
7 { 129 2
7 } 165 1
7 [ 173 2
7 ] 187 1
7 ] 189 0
7 { 192 1
7 } 214 0
7 { 216 1
7 } 255 0
7 { 257 1
7 } 285 0
7 { 291 1
7 } 326 0
7 { 489 1
7 } 654 0
I am unsure what algorithm to use to check whether the brackets are balanced in each paragraph, and to give an error message when they are not.
Any advice would be appreciated!
EDIT:
Code will need to work for the following scenario too;
(paragraph number, bracket type, bracket position, bracket level)
15 [ 543 1
15 { 544 2
15 } 556 1
15 [ 560 2
15 ] 580 1
15 ] 581 0
15 [ 610 1
15 ] 624 0
15 [ 817 1
15 ] 829 0
Does this have to be done in SQL Server?
A simple solution would be to use a general-purpose language and a stack.
Read the string character by character:
if you encounter an opening bracket, push it onto the stack;
if you encounter a closing bracket, pop the stack.
All brackets are matched if
the stack is empty after the paragraph has been read completely,
UNLESS one of the following happens during the process:
you had to pop an empty stack;
the popped bracket does not match the closing bracket.
It's not a good idea to use regex to match brackets; regexes are not meant to be used like that.
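Since the brackets have already been extracted into rows, the stack check above could also be run directly on those rows, one stack per paragraph. The following is only a rough JavaScript sketch: the row shape { para, type, pos }, the function name checkParagraphs and the error wording are assumptions made for illustration, not part of the question's schema.

```javascript
// Assumed row shape (not from the question's schema):
// { para, type, pos } = paragraph number, bracket type, bracket position.
const CLOSER_FOR = new Map([['[', ']'], ['{', '}']]);

// Returns a Map: paragraph number -> error message, or null when balanced.
function checkParagraphs(rows) {
  const stacks = new Map(); // paragraph -> stack of closers still expected
  const errors = new Map(); // paragraph -> first error found, or null
  // Process the brackets in document order.
  const ordered = [...rows].sort((a, b) => a.para - b.para || a.pos - b.pos);

  for (const { para, type, pos } of ordered) {
    if (!stacks.has(para)) {
      stacks.set(para, []);
      errors.set(para, null);
    }
    if (errors.get(para) !== null) continue; // report only the first error
    const stack = stacks.get(para);

    if (CLOSER_FOR.has(type)) {
      stack.push(CLOSER_FOR.get(type)); // remember which closer is now expected
    } else if (stack.length === 0) {
      errors.set(para, `unexpected '${type}' at position ${pos}`);
    } else {
      const expected = stack.pop();
      if (expected !== type) {
        errors.set(para, `expected '${expected}' but found '${type}' at position ${pos}`);
      }
    }
  }

  // Anything left on a stack was opened but never closed.
  for (const [para, stack] of stacks) {
    if (errors.get(para) === null && stack.length > 0) {
      errors.set(para, `${stack.length} bracket(s) never closed`);
    }
  }
  return errors;
}

// Made-up sample rows: paragraph 3 is fine, paragraph 7 closes '{' with ']'.
const sampleRows = [
  { para: 3, type: '[', pos: 8 }, { para: 3, type: ']', pos: 18 },
  { para: 7, type: '[', pos: 128 }, { para: 7, type: '{', pos: 129 },
  { para: 7, type: ']', pos: 165 },
];
for (const [para, error] of checkParagraphs(sampleRows)) {
  console.log(`paragraph ${para}: ${error === null ? 'balanced' : error}`);
}
```

The stored bracket level is not needed for this check; the stack depth plays the same role.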
I'm not sure which tool you have available, but here is a tested JavaScript function which validates that all (possibly nested) square brackets and curly braces are properly matched:
function isBalanced(text) {
var re = /\[[^[\]{}]*\]|\{[^[\]{}]*\}/g;
while (text.search(re) !== -1) { text = text.replace(re, ''); }
return !(/[[\]{}]/.test(text))
}
It works by matching and removing innermost balanced pairs in an iterative manner until there are no more matching pairs left. Once this is complete, a test is made to see if any square bracket or curly braces remain. If any remain, then the function returns false, otherwise it returns true. You should be able to implement this function in just about any language.
Note that this assumes that the square and curly brace pairs are not interleaved like so: [..{..]..}
Hope this helps.
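To see the iterative removal at work, here is a small tracing variant applied to the strings from the question. It uses the same regex as above; the helper name traceBalance and the returned shape are invented for this sketch.

```javascript
// Same innermost-pair regex as isBalanced; traceBalance is a made-up helper
// that also records the text after every removal pass.
function traceBalance(text) {
  const re = /\[[^[\]{}]*\]|\{[^[\]{}]*\}/g;
  const steps = [text];
  let next = text.replace(re, '');
  while (next !== text) { // stop once a pass removes nothing
    steps.push(next);
    text = next;
    next = text.replace(re, '');
  }
  // Balanced only if no bracket of either kind survived.
  return { balanced: !/[[\]{}]/.test(text), steps };
}

const samples = [
  '[{a} as at [b],] {c,} {d,}',
  '[{a} as at [b], {c,} {d,}',
  '][{a} as at [b], {c,} {d,}',
  '[{a} as at [b],] {c,} }{d,',
];
for (const sample of samples) {
  const { balanced, steps } = traceBalance(sample);
  console.log(JSON.stringify(steps), '=>', balanced);
}
```

Only the first sample ends up with no brackets left; the other three leave at least one stray bracket behind, so they are reported as unbalanced.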
Addendum: Extended version for: (), {}, [] and <>
The above method can be easily extended to handle testing all four matching bracket types: (), {}, [] and <>, like so:
/*#!(?#!js\/g re Rev:20150530_121324)
# Innermost bracket matching pair from 4 global alternatives:
\( [^(){}[\]<>]* \) # Either g1of4. Innermost parentheses.
| \{ [^(){}[\]<>]* \} # Or g2of4. Innermost curly braces.
| \[ [^(){}[\]<>]* \] # Or g3of4. Innermost square brackets.
| \< [^(){}[\]<>]* \> # Or g4of4. Innermost angle brackets.
!#*/
function isBalanced(text) {
var re = /\([^(){}[\]<>]*\)|\{[^(){}[\]<>]*\}|\[[^(){}[\]<>]*\]|\<[^(){}[\]<>]*\>/g;
while (text.search(re) !== -1) { text = text.replace(re, ''); }
return !(/[(){}[\]<>]/.test(text));
}
Note the regex has been documented in an extended mode C comment.
Edit 20150530: Extended to handle a mix of all four matching bracket types: (), {}, [] and <>.
I agree with user mzzzzb. I've been working on a coding challenge that is somewhat similar and came up with the following solution in JavaScript:
function isBalanced(str) {
const stack = [];
const pairs = { ')': '(', ']': '[', '}': '{' };
return str.split('').reduce((res, e) => {
// if it's an opening bracket, push it onto the stack
if (['(', '{', '['].includes(e)) stack.push(e);
// if it's a closing bracket, compare it with the top of the stack
else if ([')', '}', ']'].includes(e)) {
if (stack.pop() !== pairs[e]) return false;
}
return res;
// stack must also be empty
}, true) && stack.length === 0;
}
I'm building up several Pair objects in a loop, and I use the same scalar variable (albeit with a different value) for the value of each of them.
As a simplified example of what I'm doing, consider
my @list;
my $acc = '';
for 1..30 -> $i {
    if $i % 5 == 4 {
        @list.push($i => $acc);
        $acc = '';
    } else {
        $acc = "$acc $i";
    }
}
say @list;
(My actual code is, of course, more complicated and reads from a file rather than a predefined range, so I can't simply eliminate the loop altogether like we theoretically could here)
We accumulate strings containing sequences of numbers written out, creating a pair mapping some of the numbers to sequences of values below that number.
I want the output of this program to be
[4 => 1 2 3 9 => 5 6 7 8 14 => 10 11 12 13 19 => 15 16 17 18 24 => 20 21 22 23 29 => 25 26 27 28]
However, I currently get
[4 => 30 9 => 30 14 => 30 19 => 30 24 => 30 29 => 30]
which, I understand, is because Pair keeps the container when I assign a scalar to its value field, so I'm really creating six pairs, all of whose values point to the same (mutable) container.
The documentation indicates this, and it even suggests a way around it
It is worth noting that when assigning a Scalar as value of a Pair the value holds the container of the value itself. This means that it is possible to change the value from outside of the Pair itself:
...
It is possible to change the above behavior forcing the Pair to remove the scalar container and to hold the effective value itself via the method freeze
which works. If I replace @list.push($i => $acc) with
my $pair = ($i => $acc);
$pair.freeze;
@list.push($pair);
then the code produces the expected output. Problem is, freeze is deprecated, and the only code listed under the deprecation warning as a possible replacement is
$p.=Map.=head.say; # OUTPUT: «orange»
which looks like it's converting the Pair to a Map and then back to do a sort of shallow-copy. Unfortunately, this doesn't seem to work, as @list.push(($i => $acc).Map.head); produces the original (incorrect) output.
So, since Pair.freeze is evidently deprecated, what is the correct way to decontainerize the value side of a Pair object in Raku now?
You are very close.
To get an idea what is going on I put this line print "$i: "; dd @list; just before the end of your for loop.
Here's a sample:
19: Array @list = [4 => "", 9 => "", 14 => "", 19 => ""]
20: Array @list = [4 => " 20", 9 => " 20", 14 => " 20", 19 => " 20"]
21: Array @list = [4 => " 20 21", 9 => " 20 21", 14 => " 20 21", 19 => " 20 21"]
So, as you say, the issue is that the $acc container is just being reused. In your case, you need to set the Pair value to the contents of $acc, not to the container itself.
Either of these variants work in place of your push line:
@list.push($i => $acc<>);
@list.push($i => "$acc");
The decont <> operator explicitly decontainerizes the contents of the $acc container.
Or, perhaps more familiar, the "" quotes produce a new Str value with a copy of the current value of $acc.
I have a script like the below. Intent is to have different filter methods to filter a list.
Here is the code.
2
3 class list_filter {
4 has @.my_list = (1..20);
5
6 method filter($l) { return True; }
7
8 # filter method
9 method filter_lt_10($l) {
10 if ($l > 10) { return False; }
11 return True;
12 }
13
14 # filter method
15 method filter_gt_10($l) {
16 if ($l < 10) { return False; }
17 return True;
18 }
19
20 # expecting a list of (1..10) to be the output here
21 method get_filtered_list_lt_10() {
22 return self.get_filtered_list(&{self.filter_lt_10});
23 }
24
25 # private
26 method get_filtered_list(&filter_method) {
27 my @newlist = ();
28 for @.my_list -> $l {
29 if (&filter_method($l)) { push(@newlist, $l); }
30 }
31 return @newlist;
32 }
33 }
34
35 my $listobj = list_filter.new();
36
37 my @outlist = $listobj.get_filtered_list_lt_10();
38 say @outlist;
I expect [1..10] to be the output here, but I get the following error:
Too few positionals passed; expected 2 arguments but got 1
in method filter_lt_10 at ./b.pl6 line 9
in method get_filtered_list_lt_10 at ./b.pl6 line 22
in block <unit> at ./b.pl6 line 37
What am I doing wrong here?
Passing a method as a parameter in Perl 6 either requires you to use MOP (Meta-Object Protocol) methods, or pass the method by name (which would then do the lookup for you at runtime).
But why use methods if you're not really doing something with the object in those methods? They might as well be subs then, which you can pass as a parameter.
Perhaps this is best by example:
class list_filter {
    has @.my_list = 1..20; # don't need parentheses
    sub filter($ --> True) { } # don't need code, signature is enough
    # filter sub
    sub filter_lt_10($l) { not $l > 10 }
    # filter sub
    sub filter_gt_10($l) { not $l < 10 }
    # private
    method !get_filtered_list(&filter_sub) {
        @.my_list.grep(&filter_sub);
    }
    # expecting a list of (1..10) to be the output here
    method get_filtered_list_lt_10() {
        self!get_filtered_list(&filter_lt_10);
    }
}
my $listobj = list_filter.new();
my @outlist = $listobj.get_filtered_list_lt_10();
say @outlist; # [1 2 3 4 5 6 7 8 9 10]
The first sub filter, which only returns a constant value (in this case True), can be represented much more easily in the signature with an empty body.
The filter_lt_10 and filter_gt_10 subs only need the condition negated, hence the use of the not.
The get_filtered_list method is supposed to be private, so make it a private method by prefixing !.
In get_filtered_list_lt_10 you now need to call get_filtered_list with a ! instead of a . (dot). And you pass the filter_lt_10 sub as a parameter by prefixing it with & (otherwise it would be considered a call to the sub without any parameters, which would fail).
Change get_filtered_list to use the built-in grep method: this takes a Callable block that takes a single parameter and which should return something True to include the value of the list it works upon. Since a sub taking a single parameter is a Callable, we can just specify the sub there directly.
Hope this made sense. I tried to stay as close as possible to the intended semantics.
Some general programming remarks: the naming of the subs feels confusing to me. They should be called filter_le_10 and filter_ge_10, because that appears to be what they really do. Also, if you don't want any ad-hoc filtering, but only filtering from a specific set of predefined filters, you would probably be better off creating a dispatch table using constants or enums, and using that to indicate which filter you want, rather than encoding this information in the name of yet another method to make and maintain.
Hope this helps.
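For what it's worth, the dispatch-table idea could look roughly like the sketch below. It is written in JavaScript rather than Perl 6 purely for illustration, and the filter names le10 and ge10, the FILTERS table and the getFilteredList function are all hypothetical.

```javascript
// Hypothetical dispatch table: filter name -> predicate on one list element.
const FILTERS = Object.freeze({
  le10: (n) => n <= 10,
  ge10: (n) => n >= 10,
});

// Look the predicate up by name instead of writing one method per filter.
function getFilteredList(list, filterName) {
  if (!Object.prototype.hasOwnProperty.call(FILTERS, filterName)) {
    throw new RangeError(`unknown filter: ${filterName}`);
  }
  return list.filter(FILTERS[filterName]);
}

const myList = Array.from({ length: 20 }, (_, i) => i + 1); // 1..20
console.log(getFilteredList(myList, 'le10'));
console.log(getFilteredList(myList, 'ge10'));
```

Adding a filter then means adding one entry to the table, not another method to name and maintain.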
TL;DR You told P6 what arguments to expect when calling your filter method. Then you failed to pass the agreed argument(s) when you called it. So P6 complained on your behalf. To resolve the issue, either pass the argument(s) you told P6 to expect or stop telling P6 to expect them. :)
The message says expected 2, got 1, rather than expected 1 got 0.
This is because self is implicitly passed and added to the "expected" and "got" totals in this appended bit of message detail, bumping both up by one. (This detail is perhaps Less Than Awesome, i.e. something we should perhaps consider fixing.)
When I run your code on tio I get:
Too few positionals passed; expected 2 arguments but got 1
in method filter at .code.tio line 27
in method print_filtered_list at .code.tio line 12
in block <unit> at .code.tio line 42
The method declaration method filter($l) {...} at line 27 tells P6 to expect two arguments for each .filter method call:
The invocant. (This will be bound to self.) Let's call that argument A.
A positional argument. (This will be bound to the $l parameter). Let's call that argument B.
But in &{self.filter} in line 12, while you provide the .filter method call with an argument A, i.e. an invocant argument, you don't provide an argument B, i.e. a positional argument (after filter, e.g. &{self.filter(42)}).
Hence Too few positionals passed; expected 2 arguments but got 1.
The &{self.method} syntax was new to me, so thanks for that. Unfortunately it doesn't work if parameters are needed. You can use sub as other posters mentioned, but if you need to use methods, you can get a method by calling self.^lookup, which is the use of the meta-object protocol that Elizabeth mentioned. ('^' means you're not calling a method that's part of that class, but rather part of the "shadow" class which contains the main class's guts / implementation details.)
To get a method, use obj.^lookup(method name), and call it by passing in the object itself (often "self") as the first parameter, followed by the other parameters. To bind the object to the function so it doesn't need to be passed explicitly each time, use the assuming method.
class MyClass {
method log(Str $message) { say now ~ " $message"; }
method get-logger() { return self.^lookup('log').assuming(self); }
}
my &log = MyClass.get-logger();
log('hello'); # output: Instant:1515047449.201730 hello
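As a loose analogy only (JavaScript, not Perl 6): binding the invocant with assuming is similar in spirit to Function.prototype.bind. The class and method names below are invented for this sketch.

```javascript
// Loose analogy to .assuming(self): bind fixes the receiver ahead of time.
class Logger {
  constructor(prefix) {
    this.prefix = prefix;
  }
  format(message) {
    return `${this.prefix} ${message}`;
  }
  getFormatter() {
    // Without bind, the extracted method would lose its receiver.
    return this.format.bind(this);
  }
}

const logger = new Logger('[log]');
const format = logger.getFormatter();
console.log(format('hello')); // [log] hello

const unbound = logger.format; // plain extraction, receiver not bound
try {
  unbound('hello');
} catch (err) {
  console.log('unbound call failed:', err instanceof TypeError);
}
```

The unbound call fails because class bodies run in strict mode, so this is undefined when the method is called without a receiver.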
Found it. This is what worked for me.
class list_filter {
    has @.my_list = (1..20);

    # will be overriding this in derived classes
    method filter1($l) { return True; }
    method filter2($l) { return True; }

    # same print method I will be calling from all derived class objects
    method print_filtered_list($type) {
        my @outlist = self.get_filtered_list($type);
        say @outlist;
    }

    # private
    method get_filtered_list($type) {
        my @newlist = ();
        for @.my_list -> $l {
            my $f = "filter$type";
            if (self."$f"($l)) { push(@newlist, $l); }
        }
        return @newlist;
    }
}

class list_filter_lt_10 is list_filter {
    method filter1($l) {
        if ($l > 10) { return False; }
        return True;
    }
    method filter2($l) {
        if ($l > 10) { return False; }
        if ($l < 5) { return False; }
        return True;
    }
}

class list_filter_gt_10 is list_filter {
    method filter1($l) {
        if ($l < 10) { return False; }
        return True;
    }
    method filter2($l) {
        if ($l < 10) { return False; }
        if ($l > 15) { return False; }
        return True;
    }
}

my $listobj1 = list_filter_lt_10.new();
$listobj1.print_filtered_list(1);
$listobj1.print_filtered_list(2);

my $listobj2 = list_filter_gt_10.new();
$listobj2.print_filtered_list(1);
$listobj2.print_filtered_list(2);
Output:
./b.pl6
[1 2 3 4 5 6 7 8 9 10]
[5 6 7 8 9 10]
[10 11 12 13 14 15 16 17 18 19 20]
[10 11 12 13 14 15]
piojo's answer looks like it would work (though I haven't tried it).
Another approach to turning a method into a variable is to use indirection:
class Foo {
method bar($a) {
$a * 2
}
}
sub twice(&f, $x) {
f f $x
}
my $foo = Foo.new();
say twice {$foo.bar: $^a}, 1
I am reading through perl6intro on lazy lists and it leaves me confused about certain things.
Take this example:
sub foo($x) {
$x**2
}
my $alist = (1,2, &foo ... ^ * > 100);
This gives me (1 2 4 16 256): it keeps squaring the previous number until it exceeds 100. I want it to give me (1 4 9 16 25 .. ) instead, i.e. rather than squaring the previous result, advance a number x by 1 (or another given "step"), apply foo to x, and so on.
Is it possible to achieve this in this specific case?
Another question I have on lazy lists is the following:
In Haskell, there is a takeWhile function, does something similar exist in Perl6?
I want this to give me (1 4 9 16 25 .. )
The easiest way to get that sequence, would be:
my @a = (1..*).map(* ** 2); # using a Whatever-expression
my @a = (1..*).map(&foo); # using your `foo` function
...or if you prefer to write it in a way that resembles a Haskell/Python list comprehension:
my @a = ($_ ** 2 for 1..*); # using an in-line expression
my @a = (foo $_ for 1..*); # using your `foo` function
While it is possible to go out of one's way to express this sequence via the ... operator (as Brad Gilbert's answer and raiph's answer demonstrate), it doesn't really make sense, as the purpose of that operator is to generate sequences where each element is derived from the previous element(s) using a consistent rule.
Use the best tool for each job:
If a sequence is easiest to express iteratively (e.g. Fibonacci sequence):
Use the ... operator.
If a sequence is easiest to express as a closed formula (e.g. sequence of squares):
Use map or for.
Here is how you could write a Perl 6 equivalent of Haskell's takeWhile.
sub take-while ( &condition, Iterable \sequence ){
my \iterator = sequence.iterator;
my \generator = gather loop {
my \value = iterator.pull-one;
last if value =:= IterationEnd or !condition(value);
take value;
}
# should propagate the laziness of the sequence
sequence.is-lazy
?? generator.lazy
!! generator
}
I should probably also show an implementation of dropWhile.
sub drop-while ( &condition, Iterable \sequence ){
my \iterator = sequence.iterator;
GATHER: my \generator = gather {
# drop initial values
loop {
my \value = iterator.pull-one;
# if the iterator is out of values, stop everything
last GATHER if value =:= IterationEnd;
unless condition(value) {
# need to take this so it doesn't get lost
take value;
# continue onto next loop
last;
}
}
# take everything else
loop {
my \value = iterator.pull-one;
last if value =:= IterationEnd;
take value
}
}
sequence.is-lazy
?? generator.lazy
!! generator
}
These are only just-get-it-working examples.
It could be argued that these are worth adding as methods to lists/iterables.
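For readers more at home in JavaScript, the same two ideas can be sketched with generator functions. This is only an illustration under my own naming (naturals, mapLazy, takeWhile and dropWhile are not built-ins), not a translation of the Perl 6 code above.

```javascript
// Infinite lazy sequence 1, 2, 3, ...
function* naturals(start = 1) {
  for (let n = start; ; n++) yield n;
}

// Lazy map: applies fn to each value only when it is requested.
function* mapLazy(fn, iterable) {
  for (const value of iterable) yield fn(value);
}

// Yields values until the condition first fails, then stops for good.
function* takeWhile(condition, iterable) {
  for (const value of iterable) {
    if (!condition(value)) return;
    yield value;
  }
}

// Skips the leading values that satisfy the condition, then yields the rest.
function* dropWhile(condition, iterable) {
  let dropping = true;
  for (const value of iterable) {
    if (dropping && condition(value)) continue;
    dropping = false;
    yield value;
  }
}

const squares = mapLazy((n) => n ** 2, naturals());
console.log([...takeWhile((v) => v <= 100, squares)]);
console.log([...dropWhile((v) => v < 3, [1, 2, 3, 1])]);
```

As with the Perl 6 version, takeWhile has to pull the first failing value from the source before it can stop, so that value is consumed and lost.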
You could (but probably shouldn't) implement these with the sequence generator syntax.
sub take-while ( &condition, Iterable \sequence ){
my \iterator = sequence.iterator;
my \generator = { iterator.pull-one } …^ { !condition $_ }
sequence.is-lazy ?? generator.lazy !! generator
}
sub drop-while ( &condition, Iterable \sequence ){
my \end-condition = sequence.is-lazy ?? * !! { False };
my \iterator = sequence.iterator;
my $first;
loop {
$first := iterator.pull-one;
last if $first =:= IterationEnd;
last unless condition($first);
}
# I could have shoved the loop above into a do block
# and placed it where 「$first」 is below
$first, { iterator.pull-one } … end-condition
}
If they were added to Perl 6/Rakudo, they would likely be implemented with Iterator classes.
( I might just go and add them. )
A direct implementation of what you are asking for is something like:
do {
my $x = 0;
{ (++$x)² } …^ * > 100
}
Which can be done with state variables:
{ ( ++(state $x = 0) )² } …^ * > 100
And a state variable that isn't used outside of declaring it doesn't need a name.
( A scalar variable starts out as an undefined Any, which becomes 0 in a numeric context )
{ (++( $ ))² } …^ * > 100
{ (++$)² } …^ * > 100
If you need to initialize the anonymous state variable, you can use the defined-or operator // combined with the equal meta-operator =.
{ (++( $ //= 5))² } …^ * > 100
In some simple cases you don't have to tell the sequence generator how to calculate the next values.
In such cases the ending condition can also be simplified.
say 1,2,4 ...^ 100
# (1 2 4 8 16 32 64)
The only other time you can safely simplify the ending condition is if you know that it will stop on the value.
say 1, { $_ * 2 } ... 64;
# (1 2 4 8 16 32 64)
say 1, { $_ * 2 } ... 3;
# (1 2 4 8 16 32 64 128 256 512 ...)
I want this to give me (1 4 9 16 25 .. )
my @alist = {(++$)²} ... Inf;
say @alist[^10]; # (1 4 9 16 25 36 49 64 81 100)
The {…} is an arbitrary block of code. It is invoked for each value of a sequence when used as the LHS of the ... sequence operator.
The (…)² evaluates to the square of the expression inside the parens. (I could have written (…) ** 2 to mean the same thing.)
The ++$ returns 1, 2, 3, 4, 5, 6 … by combining a pre-increment ++ (add one) with a $ variable.
In Haskell, there is a takeWhile function, does something similar exist in Perl6?
Replace the Inf from the above sequence with the desired end condition:
my @alist = {(++$)²} ... * > 70; # stop at step that goes past 70
say @alist; # [1 4 9 16 25 36 49 64 81]
my @alist = {(++$)²} ...^ * > 70; # stop at step before step past 70
say @alist; # [1 4 9 16 25 36 49 64]
Note how the ... and ...^ variants of the sequence operator provide the two variations on the stop condition. I note that in your original question you have ... ^ * > 100, not ...^ * > 100. Because the ^ in the former is detached from the ..., it has a different meaning. See Brad's comment.
I have this string which looks like this:
613 3503||0 82 1 49 1 1950 63543 11301 3 CORP-A1 1656.06 150 0 N 82.8 198.72 12.42 N 0 0 0 N Y 1
However, when I string split it by either TAB or SPACE, it does not split via Tab or space. It still outputs as the whole thing.
I tried the following:
= fromVisMtext.Text.Split(vbTab)
= fromVisMtext.Text.Split(" ")
Also, here at stack overflow when I pasted said string, it isn't delimited and is connected with each other.
6133503||0821491195063543113013CORP-A11656.061500N82.8198.7212.42N000NY1
The string I've pasted above was mine that I added white spaces manually, since here StackOverflow removes said delimiters.
Also, said string is from the VisM control of Intersystems Cache.
How can I split this string by either Tab or Space? The delimiter doesn't seem to be either, but the data is definitely delimited by some kind of whitespace or tab character.
EDIT here is the result of Dim theGlobals = String.Join(" ", fromVisMtext.Text.Select(Function(ch) Microsoft.VisualBasic.AscW(ch).ToString("x4")))
In the general case (space, tab, non-breaking space etc. as separators) you can try splitting on any whitespace, e.g.:
// Required namespaces:
using System;
using System.Linq;
using System.Text.RegularExpressions;
String source = @"613 3503||0 82 1 49 1 1950 63543 11301 3 CORP-A1 1656.06 150 0 N 82.8 198.72 12.42 N 0 0 0 N Y 1";
var result = Regex
    .Split(source, @"\s")
    .Where(item => !String.IsNullOrEmpty(item));
//.ToArray(); // <- if you want to materialize
// 613
// 3503||0
// 82
// 1
// ...
// N
// Y
// 1
Console.Write(String.Join(Environment.NewLine, result));
If you're sure that the separators can only be a space (' ') or a tab ('\t'), you can just split:
var result = source.Split(new Char[] { ' ', '\t' },
StringSplitOptions.RemoveEmptyEntries);
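If it is unclear which character actually separates the fields, it can help to dump the code points first and then split on any whitespace. A rough JavaScript sketch of both steps follows; the helper names describeChars and splitOnWhitespace and the sample string are made up, and the real delimiter in the VisM output is not known here.

```javascript
// Hex code point of every character, to reveal what the delimiter really is.
function describeChars(text) {
  return [...text]
    .map((ch) => ch.codePointAt(0).toString(16).padStart(4, '0'))
    .join(' ');
}

// Split on runs of any whitespace; in JavaScript \s also matches tab and
// the non-breaking space U+00A0. Empty entries are dropped.
function splitOnWhitespace(text) {
  return text.split(/\s+/).filter((item) => item.length > 0);
}

// Made-up sample mixing a tab, a non-breaking space and a plain space.
const sample = '613\t3503||0\u00a082 1';
console.log(describeChars('1\t2')); // 0031 0009 0032
console.log(splitOnWhitespace(sample));
```

If the dump shows a control character that is not whitespace at all, \s will not match it and the split has to use that exact character.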
I have been working on this for more than 2 days without success. It will be a common problem, but I can't find a solution. I did do a search!
Problem:
I have some data that I want to read in say, 5 values per line. I know how many I want to read from a value read previously. For example, 6 values to read, spread over 2 lines...
6
10 20 30 40 50
60
so after every 5 variables I want to read a new line. If there are 0 variables, I want to skip the bit to do with this, and if I want to read an exact multiple of 5 variables, then I want to avoid duplicating the NL call.
I tried this...
varblock[ Integer count ]
@init{
Integer varIndex = 0;
}
: { count > 0 }? ( dp=NUMBER { count--; varIndex++; }
{ ( varIndex \% 5 ) == 0 }? NL { varIndex = 0; }
)+ { varIndex > 0 }? => NL
|
;
But I get...
failed predicate: { ( varIndex \% 5 ) == 0 }?
It might be that I misunderstand predicates. I have several other predicates in my grammar that seem to work, but they are not of this type. There, I am trying to skip bits of the grammar depending on the version of the input file.
Thanks.
NL is just a line feed that is expected at the end of the input lines.
NL : ( '\n' | '\r' )+ ;
In other lines we read several other things such as...
"IPE270" "BS 7191 GR 355C" 0.0 0 0
and the values such as STRING, FLOAT, or NUMBER must be on those lines in expected sequences. So if we encounter a NL before we have read the requisite data values, there is a syntax error. So, perhaps the answer to your question is "Yes".
Perhaps I just (over?) simplified the example.
SOLVED:
It was a problem of brackets. I looked at the generated parser code to get a clue.
varblock[ Integer count ]
@init{
Integer index = 0;
}
: ( { count > 0 }? => ( NUMBER { count--; index++; }
| { (index \% 5) == 0 }? => NL ) )+ { index > 0 }? => NL
|
;
This reads values up to 5 per line.