Techniques to control stack overflows? - vb.net

Basically, my program will try to generate the list of all possible lowercase 5-letter words, including combinations that clearly are not real words, like jshcc or mmdzq.
I do that by stacking up a massive number of calls to a function that does the word work.
But that's simply too much, and I get a stack overflow error.
How would someone control that?

Basically, convert from recursion to iteration. Typically that involves creating a Stack<T> as a "logical" stack, or something similar.
However, I'd have expected a method generating a list of all possible 5-letter words to only have a stack about 5 deep - one for each letter. Each stack level would be responsible for one level of letter - so the "top" of the stack would iterate through each possible last letter; the next stack frame down would iterate through every possible fourth letter, calling the method recursively to iterate through all possible last letters etc. Something like this (C# code, but hopefully you can understand it and apply it to VB):
const string Letters = "abcdefghijklmnopqrstuvwxyz";

public static List<string> GenerateValidWords(int length)
{
    List<string> words = new List<string>();
    GenerateValidWords(0, new char[length], words);
    return words;
}

// Fills position "depth" with each letter in turn, so the recursion
// is never more than "length" frames deep.
private static void GenerateValidWords(int depth, char[] current,
                                       List<string> words)
{
    foreach (char letter in Letters)
    {
        current[depth] = letter;
        if (depth == current.Length - 1)
        {
            string word = new string(current);
            // IsValid stands for whatever filter you need; it is not shown here.
            if (IsValid(word))
            {
                words.Add(word);
            }
        }
        else
        {
            GenerateValidWords(depth + 1, current, words);
        }
    }
}
Now if you don't have any sort of filtering, that's going to generate 11,881,376 words - which at 24 bytes each (on x86) is about 285MB - plus all the space for the list etc. That shouldn't kill a suitably big machine, but it is quite a lot of memory. Are you sure you need all of these?
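If the full list is not actually needed, one option is to produce the words lazily and to replace the recursion with an explicit stack, as suggested earlier. A minimal sketch in Python (used here only for brevity; `generate_words` and its parameters are illustrative names, not part of the code above):

```python
from typing import Iterator

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def generate_words(length: int, letters: str = LETTERS) -> Iterator[str]:
    """Yield every word of the given length, using an explicit stack instead of recursion."""
    stack = [""]  # each entry is a prefix that still has to be extended
    while stack:
        prefix = stack.pop()
        if len(prefix) == length:
            yield prefix
            continue
        # Push in reverse so that words come out in alphabetical order.
        for letter in reversed(letters):
            stack.append(prefix + letter)


if __name__ == "__main__":
    # Only one word is held at a time; nothing is stored unless the caller stores it.
    for word in generate_words(2, "ab"):
        print(word)
```

The stack only ever holds unfinished prefixes, so it grows with the word length rather than with the number of words, and the caller decides whether to keep, filter or discard each word.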

As a simple solution, I would use an iterative method with multiple loops in order to generate these words:
Dim words As New List(Of String)
Dim first As Integer = Asc("a")
Dim last As Integer = Asc("z")
For one As Integer = first To last
    For two As Integer = first To last
        For three As Integer = first To last
            For four As Integer = first To last
                For five As Integer = first To last
                    words.Add(Chr(one) & Chr(two) & Chr(three) & Chr(four) & Chr(five))
                Next
            Next
        Next
    Next
Next
MsgBox(words.Count)


Removing a loop to make code run faster (Kotlin) (Big O)

I'm trying a leetcode challenge and am struggling to pass the challenge due to the speed of my code:
class Solution {
    fun longestPalindrome(s: String): String {
        var longestPal = ""
        var substring = ""
        for (i in 0..s.length) {
            for (j in i + 1..s.length) {
                substring = s.substring(i, j)
                if (substring == substring.reversed() && substring.length > longestPal.length) {
                    longestPal = substring
                }
            }
        }
        return longestPal
    }
}
I'm a newb and not familiar with Big O notation.
I imagine if I could use just one loop I would be able to speed this code up significantly but am not sure how I would go about this.
(Not saying this is the best approach, but that is a start)
Palindromes can only be found between two same letters. So one idea is to go through the string once, and keep track of letter indexes. When you encounter a letter you already saw before, and the difference in indexes is longer than the current longest palindrome, you check for a palindrome:
fun longestPal(s: String): String {
    // Maps each character to the indices where it has been seen so far (ascending).
    val letters = mutableMapOf<Char, MutableList<Int>>()
    var longest = ""
    s.indices.forEach { index ->
        val indicesForCurrentChar = letters.getOrPut(s[index]) { mutableListOf() }
        // Register the current index first, so a single character counts as a palindrome.
        indicesForCurrentChar.add(index)
        for (i in indicesForCurrentChar) {
            if ((index - i) < longest.length) break // (1) won't be the longest anyway
            val sub = s.substring(i, index + 1)
            if (sub == sub.reversed()) longest = sub
        }
    }
    return longest
}
What is costly here is the palindrome check itself (sub == sub.reversed()). But the check in (1) should keep the number of those checks down (think of a string that repeats the same letter).
I would be curious to know what others suggest online.
Your code runs in O(n^3) time, a loop within a loop within a loop, because that reversed() call iterates up to the size of the input string. You can look up Manacher's algorithm for an explanation of how to do it in linear time (O(n)), no nested iteration at all.
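For reference, here is a rough sketch of Manacher's algorithm. It is written in Python rather than Kotlin purely to keep it short, and the function and variable names are my own; treat it as an illustration of the idea, not a tuned implementation.

```python
def longest_palindrome(s: str) -> str:
    """Return one longest palindromic substring of s in O(n) time (Manacher's algorithm)."""
    if not s:
        return ""
    # Interleave a separator so odd- and even-length palindromes are handled the same way.
    t = "#" + "#".join(s) + "#"
    n = len(t)
    radius = [0] * n   # radius[i] = palindrome radius around t[i]
    center = right = 0  # center and right edge of the rightmost palindrome found so far
    for i in range(n):
        if i < right:
            # Reuse the radius of the mirrored position, capped at the known right edge.
            radius[i] = min(right - i, radius[2 * center - i])
        lo, hi = i - radius[i] - 1, i + radius[i] + 1
        while lo >= 0 and hi < n and t[lo] == t[hi]:
            radius[i] += 1
            lo -= 1
            hi += 1
        if i + radius[i] > right:
            center, right = i, i + radius[i]
    best_len, best_center = max((r, i) for i, r in enumerate(radius))
    # A radius in t equals the palindrome length in s.
    start = (best_center - best_len) // 2
    return s[start:start + best_len]
```

The inner while loop only ever extends past the rightmost edge seen so far, which is why the total work stays linear even though it looks like a nested loop.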

Counting how many times specific character appears in string - Kotlin

How can one count how many times a specific character appears in a string in Kotlin?
From looking at https://kotlinlang.org/api/latest/jvm/stdlib/kotlin/-string/ there is nothing built-in and one needs to write a loop every time (or my own extension function), but maybe I missed a better way to achieve this?
Easy with filter {} function
val str = "123 123 333"
val countOfSymbol = str
.filter { it == '3' } // 3 is your specific character
.length
println(countOfSymbol) // output 5
Another approach
val countOfSymbol = str.count { it == '3'} // 3 is your specific character
println(countOfSymbol) // output 5
In terms of resource usage, the second approach (count) is preferable: it counts the matches directly, whereas filter first builds a new string just to measure its length.

VB.Net Search a string

I am searching a string of DNA characters (A, T, C, G). I may wish to look for AT, for example, and the search must ignore ATAT rather than find two ATs. I need to know how many ATs there are in the string and their positions.
I tried various ideas but so far have failed. I used Mid and Contains. If someone can give me a hint I would be grateful.
Regards
Not a VB.NET dude (C# is my current dope), but luckily they're similar. If you don't care about execution time, you can just brute force it. First, see if your pattern occurs in the string at all:
bool containsAT = myDNAChars.Contains( "AT" );
Then you can go about finding their positions:
var myListOfMatches = new List<int>();
string pattern = "AT";
// IndexOf returns -1 once there are no more matches.
int atIndex = myDNAChars.IndexOf( pattern, 0 );
while( atIndex != -1 )
{
    myListOfMatches.Add( atIndex );
    // Resume the search just past the match that was found.
    atIndex = myDNAChars.IndexOf( pattern, atIndex + pattern.Length );
}
Once you have a list of matches, you can iterate over it and discard any occurrences of "ATAT" or any other patterns you don't want.
It's not elegant--it's brute force--but it should work.
Sorry about the C#, but it should be easy to convert to VB.NET.
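Here is the same two-step idea as a short Python sketch (chosen only for brevity). It assumes that "ignore ATAT" means dropping any match that sits directly next to another match; that reading of the question, and the name `find_isolated`, are my assumptions.

```python
def find_isolated(dna: str, pattern: str = "AT") -> list:
    """Return start positions of pattern that are not directly adjacent to another match."""
    step = len(pattern)
    positions = []
    i = dna.find(pattern)
    while i != -1:
        positions.append(i)
        i = dna.find(pattern, i + step)  # resume just past the match
    found = set(positions)
    # Drop a match when another match starts exactly one pattern length before or after it.
    return [p for p in positions if p - step not in found and p + step not in found]


if __name__ == "__main__":
    hits = find_isolated("GATCATATGAT")
    print(len(hits), hits)
```

The length of the returned list is the count, and the list itself gives the positions. If "ignore ATAT" should instead count a run as a single hit, only the filter on the last line needs to change.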

additional logic to this exercise missing

Writing a basic program to count the number of words in a string. I've changed my original code to account for multiple spaces between words. By setting one variable to the current index and one variable to the previous index and comparing them, I can say "if this current index is a space, but the previous index contains something other than a space (basically saying a character), then increase the word count".
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        //establishing the string that we'll be parsing through.
        NSString * paragraph = @"This is a test paragraph and we will be testing out a string counter.";
        //we're setting our counter that tracks the # of words to 0
        int wordCount = 0;
        /*by setting current to a blank space ABOVE the for loop, when the if statement first runs, it's comparing [paragraph characterAtIndex:i] to a blank space. Once the loop runs through for the first time, the next value that current will have is characterAtIndex:0, while the if statement in the FOR loop will hold a value of characterAtIndex:1*/
        char current = ' ';
        for (int i=0; i< paragraph.length; i++) {
            if ([paragraph characterAtIndex:i] == ' ' && (current != ' ')) {
                wordCount++;
            }
            current = [paragraph characterAtIndex:i];
            //after one iteration, current will be T and it will be comparing it to paragraph[1] which is h.
        }
        wordCount ++;
        NSLog(@"%i", wordCount);
    }
    return 0;
}
I tried adding "or" conditions to account for delimiters such as ";", "," and "." instead of just looking at a space. It didn't work. Any idea what I can do, logically speaking, to account for anything that isn't a letter (preferably limited to these four delimiters: period, comma, semicolon and space)?
A standard way to solve these types of problems is to build a finite state machine. Your code isn't quite one, but it's close.
Instead of thinking about comparing the previous and current characters think in terms of states - you can start with just two, in a word and not in a word.
Now for each state you consider what the current character implies in terms of actions and changes to the state. For example, if the state is not in a word and the current character is a letter then the action is increment word count and the next state is in a word.
In (Objective-)C you can build a simple finite state machine using an enum to give the states names and a case statement inside a loop. In pseudo-code this is something like:
typedef enum { NotInWord, InWord } State;
State currentState = NotInWord;
NSUInteger wordCount = 0;
for currentChar in sourceString
    case currentState of
        NotInWord:
            if currentChar is word start character -- e.g. a letter
            then
                increment wordCount;
                currentState = InWord;
        InWord:
            if currentChar is not a word character -- e.g. not a letter
            then
                currentState = NotInWord;
    end case
end for
The above is just a step from your original algorithm - recasting it in terms of states rather than the previous character.
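To make the two-state machine concrete, here is a small runnable sketch. It is in Python rather than Objective-C only to keep it compact, and it assumes that "word character" simply means a letter (`str.isalpha()`); the names are illustrative.

```python
from enum import Enum


class State(Enum):
    NOT_IN_WORD = 0
    IN_WORD = 1


def count_words(text: str) -> int:
    """Count words with a two-state machine; any non-letter acts as a delimiter."""
    state = State.NOT_IN_WORD
    word_count = 0
    for ch in text:
        if state is State.NOT_IN_WORD:
            if ch.isalpha():  # a word starts here
                word_count += 1
                state = State.IN_WORD
        else:  # State.IN_WORD
            if not ch.isalpha():  # the word has ended
                state = State.NOT_IN_WORD
    return word_count


if __name__ == "__main__":
    print(count_words("This is a test paragraph and we will be testing out a string counter."))
```

Because the count goes up when a word starts rather than when it ends, there is no need for the extra increment after the loop, and runs of mixed delimiters such as ", " or ";  " need no special handling.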
Now if you want to get smarter you can add more states. For example how many words are there in "Karan's question"? Two. So you might want to allow a single apostrophe in a word. To handle that you can add a state AfterApostrophe whose logic is the same as the current InWord; and modify InWord logic to include if the current character is an apostrophe the next state is AfterApostrophe - that would allow one apostrophe in a word (or its end, which is also valid). Next you might want to consider hyphenated words, etc...
To test if a character is a particular type you have two easy choices:
If this is just an exercise and you are happy to stick with the ASCII range of characters there are functions such as isdigit(), isalpha() etc.
If you want to handle full Unicode you can use the NSCharacterSet type with its pre-defined sets for letters, digits, etc.
See the documentation for both of the above choices.
HTH
I don't understand. You should be able to add "or" conditions:
#include <stdio.h>
#include <string.h>

int main(void) {
    char paragraph[] = "This is a test paragraph,EXTRAWORDHERE and we will be testing out a string.";
    char current = ' ';
    size_t i;
    int wordCount = 0;
    /* strlen excludes the terminating '\0', unlike sizeof */
    for (i = 0; i < strlen(paragraph); i++) {
        /* count a word when a delimiter (space or comma) follows a non-delimiter */
        if ((paragraph[i] == ' ' || paragraph[i] == ',') && !(current == ' ' || current == ',')) {
            wordCount++;
        }
        current = paragraph[i];
    }
    wordCount++; /* the last word has no delimiter after it */
    printf("%d\n", wordCount);
    return 0;
}
I suppose it is easier to negate an equality test on current, as above, than to chain not-equal comparisons. Hopefully that helps.

Redim boolean Array vs enumerate and set

In a case where you would like to reset an array of boolean values, which is faster: ReDim-ing the array or enumerating and resetting the values?
I have run some tests and they seem to suggest that a ReDim is a lot faster, but I am not convinced that it isn't a result of how I'm running the tests.
My tests seem to suggest that redim is nearly twice as fast.
So could anyone care to comment on which is faster and why? Also would you expect the same result across different languages?
Enum Test:
Dim booleanArray(200) As Boolean
Dim startTime As Date = Date.Now
For i As Integer = 0 To 9999999
    For l As Integer = 0 To 200
        booleanArray(l) = True
    Next
Next
Dim endTime As Date = Date.Now
Dim timeTaken As TimeSpan = endTime - startTime
Redim Test:
Dim booleanArray(200) As Boolean
Dim startTime As Date = Date.Now
For i As Integer = 0 To 9999999
    ReDim booleanArray(200)
Next
Dim endTime As Date = Date.Now
Dim timeTaken As TimeSpan = endTime - startTime
This shows that allocating a new array is fast. That's to be expected when there's plenty of memory available - basically it's incrementing a pointer and a tiny bit of housekeeping.
However, note that that will create a new array with all elements as False rather than True.
A more appropriate test might be to call Array.Clear on the existing array, in the first case, which will wipe out the contents pretty quickly.
Note that your second form will be creating a lot more garbage - in this case it will always stay in gen0 and be collected easily, but in real applications with more realistic memory usage, you could end up causing garbage collection performance issues by creating new arrays instead of clearing out the old ones.
Here's a quick benchmark in C# which tests the three strategies:
using System;
using System.Diagnostics;

public class Test
{
    const int Iterations = 100000000;

    static void Main()
    {
        TestStrategy(Clear);
        TestStrategy(ManualWipe);
        TestStrategy(CreateNew);
    }

    static void TestStrategy(Func<bool[], bool[]> strategy)
    {
        bool[] array = new bool[200];
        GC.Collect();
        GC.WaitForPendingFinalizers();
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            array = strategy(array);
        }
        sw.Stop();
        Console.WriteLine("{0}: {1}ms", strategy.Method.Name,
                          (long) sw.ElapsedMilliseconds);
    }

    static bool[] Clear(bool[] original)
    {
        Array.Clear(original, 0, original.Length);
        return original;
    }

    static bool[] ManualWipe(bool[] original)
    {
        for (int i = 0; i < original.Length; i++)
        {
            original[i] = false;
        }
        return original;
    }

    static bool[] CreateNew(bool[] original)
    {
        return new bool[original.Length];
    }
}
Results:
Clear: 4910ms
ManualWipe: 19185ms
CreateNew: 2802ms
However, that's still just using generation 0 - I'd personally expect Clear to be better for overall application performance. Note that they behave differently if any other code has references to the original array - the "create new" strategy (ReDim) doesn't change the existing array at all.
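The difference in behaviour when other code holds a reference can be seen in a few lines. This is a Python sketch with illustrative names, where a list stands in for the boolean array; it shows the aliasing only and says nothing about timing.

```python
def clear_in_place(flags: list) -> list:
    """Reset every element of the existing list (like Array.Clear or a manual loop)."""
    for i in range(len(flags)):
        flags[i] = False
    return flags


def create_new(flags: list) -> list:
    """Return a fresh list of the same length (like ReDim without Preserve)."""
    return [False] * len(flags)


if __name__ == "__main__":
    original = [True] * 3
    alias = original                 # some other code keeps a reference
    clear_in_place(original)
    print(alias)                     # the alias sees the reset values

    original = [True] * 3
    alias = original
    original = create_new(original)
    print(alias)                     # the alias still sees the old values
```

So if anything else keeps a reference to the array, clearing in place and allocating a new one are not interchangeable, whichever is faster.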
The tests aren't comparable.
The first test sets each element to true, whereas ReDim doesn't do that.
Redim helps you increase/decrease the bounds & clean up the content (and set it to default).
e.g. Redim will help set the content of the boolean array to false.
Are you expecting Redim to set all the elements to true?
Dim booleanArray(200) As Boolean
For l As Integer = 0 To 200
    booleanArray(l) = True
Next
ReDim booleanArray(200)
This will reset the content of each element of booleanArray to false.
If you want to retain the content and increase the size, use ReDim Preserve booleanArray(300) (instead of ReDim booleanArray(200)). This keeps the existing 201 elements (indices 0 to 200) set to true, and the 100 new elements get the default value (false).
I have tested this in C# language 3.5
Time taken for Enum Test : 00:00:06.2656
Time taken for Redim Test: 00:00:00.0625000
As you can see Redim is a lot faster as you are not setting any values to it.
I would expect the ReDim to be faster, as you are not assigning a value to each item of the array.
The micro benchmark appears OK.