I'm writing a basic program to count the number of words in a string. I've changed my original code to account for multiple spaces between words. By keeping one variable for the current character and one for the previous character and comparing them, I can say "if the current character is a space, but the previous character is something other than a space, then increase the word count".
int main(int argc, const char * argv[]) {
@autoreleasepool {
//establishing the string that we'll be parsing through.
NSString * paragraph = @"This is a test paragraph and we will be testing out a string counter.";
//we're setting our counter that tracks the # of words to 0
int wordCount = 0;
/*by setting current to a blank space ABOVE the for loop, when the if statement first runs, it's comparing [paragraph characterAtIndex:i] to a blank space. Once the loop has run for the first time, current holds characterAtIndex:0, while the if statement in the for loop looks at characterAtIndex:1*/
char current = ' ';
for (int i=0; i< paragraph.length; i++) {
if ([paragraph characterAtIndex:i] == ' ' && (current != ' ')) {
wordCount++;
}
current = [paragraph characterAtIndex:i];
//after one iteration, current will be T and it will be comparing it to paragraph[1] which is h.
}
wordCount ++;
NSLog(@"%i", wordCount);
}
return 0;
}
I tried adding "or" statements to account for delimiters such as ";", "," and "." instead of just looking at a space. It didn't work. Any idea what I can do, logically speaking, to account for anything that isn't a letter (but preferably just limiting it to these four delimiters: period, comma, semicolon and space)?
A standard way to solve these types of problems is to build a finite state machine. Your code isn't quite one, but it's close.
Instead of thinking about comparing the previous and current characters think in terms of states - you can start with just two, in a word and not in a word.
Now for each state you consider what the current character implies in terms of actions and changes to the state. For example, if the state is not in a word and the current character is a letter then the action is increment word count and the next state is in a word.
In (Objective-)C you can build a simple finite state machine using an enum to give the states names and a case statement inside a loop. In pseudo-code this is something like:
typedef enum { NotInWord, InWord } State;
State currentState = NotInWord;
NSUInteger wordCount = 0;
for currentChar in sourceString
case currentState of
NotInWord:
if currentChar is word start character -- e.g. a letter
then
increment wordCount;
currentState = InWord;
InWord:
if currentChar is not a word character -- e.g. not a letter
then
currentState = NotInWord;
end case
end for
The above is just a step from your original algorithm - recasting it in terms of states rather than the previous character.
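The pseudo-code above could be turned into plain C roughly as follows. This is only a sketch: the names count_words and WordState are made up for illustration, and it assumes ASCII text where a "word character" simply means a letter according to isalpha().

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { NOT_IN_WORD, IN_WORD } WordState;

/* Counts words in an ASCII C string; any run of letters is one word. */
size_t count_words(const char *text)
{
    WordState state = NOT_IN_WORD;
    size_t words = 0;

    for (const char *p = text; *p != '\0'; ++p) {
        /* Cast first: passing a negative char to isalpha() is undefined. */
        int is_letter = isalpha((unsigned char)*p);

        switch (state) {
        case NOT_IN_WORD:
            if (is_letter) {        /* a word starts here */
                ++words;
                state = IN_WORD;
            }
            break;
        case IN_WORD:
            if (!is_letter) {       /* the word has ended */
                state = NOT_IN_WORD;
            }
            break;
        }
    }
    return words;
}
```

Because the count goes up when a word starts rather than when it ends, there is no need for an extra increment after the loop, and leading, trailing or repeated delimiters do not change the result.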
Now if you want to get smarter you can add more states. For example how many words are there in "Karan's question"? Two. So you might want to allow a single apostrophe in a word. To handle that you can add a state AfterApostrophe whose logic is the same as the current InWord; and modify InWord logic to include if the current character is an apostrophe the next state is AfterApostrophe - that would allow one apostrophe in a word (or its end, which is also valid). Next you might want to consider hyphenated words, etc...
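As a sketch of the extra state described above (again with invented names, and assuming only the plain ASCII apostrophe matters), the machine could look like this in C:

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { OUTSIDE_WORD, INSIDE_WORD, AFTER_APOSTROPHE } ApostropheState;

/* Counts words, allowing a single ASCII apostrophe inside or at the end of a word. */
size_t count_words_with_apostrophe(const char *text)
{
    ApostropheState state = OUTSIDE_WORD;
    size_t words = 0;

    for (const char *p = text; *p != '\0'; ++p) {
        int is_letter = isalpha((unsigned char)*p);

        switch (state) {
        case OUTSIDE_WORD:
            if (is_letter) {
                ++words;
                state = INSIDE_WORD;
            }
            break;
        case INSIDE_WORD:
            if (*p == '\'') {
                state = AFTER_APOSTROPHE;   /* first apostrophe is tolerated */
            } else if (!is_letter) {
                state = OUTSIDE_WORD;
            }
            break;
        case AFTER_APOSTROPHE:
            /* Same rule as the two-state InWord: anything but a letter ends
               the word, so a second apostrophe is not accepted. */
            if (!is_letter) {
                state = OUTSIDE_WORD;
            }
            break;
        }
    }
    return words;
}
```

With this version "Karan's question" counts as two words, whereas a two-state machine that only knows letters would see three.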
To test if a character is a particular type you have two easy choices:
If this is just an exercise and you are happy to stick with the ASCII range of characters there are functions such as isdigit(), isalpha() etc. in <ctype.h>.
If you want to handle full Unicode you can use the NSCharacterSet type with its pre-defined sets for letters, digits, etc.
See the documentation for both of the above choices.
HTH
I don't understand why that didn't work; you should be able to add "or" conditions:
#include <stdio.h>

int main(void) {
char paragraph[] = "This is a test paragraph,EXTRAWORDHERE and we will be testing out a string.";
char current = ' ';
size_t i;
int wordCount = 0;
for (i = 0; i < sizeof(paragraph) - 1; i++){ //sizeof also counts the terminating '\0', so stop one short
if ((paragraph[i] == 32 || paragraph[i] == 44) && !(current == 32 || current == 44)){ //32 = ascii for space, 44 for comma
wordCount++;
}
current = paragraph[i];
}
wordCount++;
printf("%d\n",wordCount);
return 0;
}
I suppose it would be better to change the comparison of current from a not equal to into an equal to. Hopefully that helps.
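One way to keep the condition readable once there are four delimiters is to move the test into a helper. The following is a rough sketch, not a drop-in replacement: is_delimiter and count_delimited_words are hypothetical names, and the delimiter set is the one from the question (space, period, comma, semicolon).

```c
#include <stddef.h>
#include <string.h>

/* Returns nonzero when c is one of the four delimiters: space . , ; */
static int is_delimiter(char c)
{
    /* strchr would also match the terminating '\0', so rule that out first. */
    return c != '\0' && strchr(" .,;", c) != NULL;
}

size_t count_delimited_words(const char *text)
{
    size_t words = 0;
    char previous = ' ';   /* pretend a delimiter precedes the text */

    for (const char *p = text; *p != '\0'; ++p) {
        /* A word starts where a non-delimiter follows a delimiter. */
        if (!is_delimiter(*p) && is_delimiter(previous)) {
            ++words;
        }
        previous = *p;
    }
    return words;
}
```

Counting word starts instead of word ends means the trailing period does not need special handling, and an empty or delimiter-only string yields zero.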
I am a newbie to golang and want to find a way to define a single byte variable.
It's a demo program in Effective Go reference.
package main
import (
"fmt"
)
func unhex(c byte) byte{
switch {
case '0' <= c && c <= '9':
return c - '0'
case 'a' <= c && c <= 'f':
return c - 'a' + 10
case 'A' <= c && c <= 'F':
return c - 'A' + 10
}
return 0
}
func main(){
// It works fine here, as I wrap things with array.
c := []byte{'A'}
fmt.Println(unhex(c[0]))
//c := byte{'A'} **Error** invalid type for composite literal: byte
//fmt.Println(unhex(c))
}
As you can see, wrapping the byte in a slice works fine, but how can I define a single byte without using a slice? Thanks.
In your example, this would work, using the conversion syntax T(x):
c := byte('A')
Conversions are expressions of the form T(x) where T is a type and x is an expression that can be converted to type T.
See this playground example.
cb := byte('A')
fmt.Println(unhex(cb))
Output:
10
If you don't want to use the := syntax, you can still use a var statement, which lets you explicitly specify the type. e.g:
var c byte = 'A'
Often in my code I need to check whether the state of x amount of bools are all true OR all bools are false. So I do:
BOOL first, second, third;
if((first && second && third) || (!first && !second && !third))
//do something
Being a lazy programmer, I want to know if there is some mathematical shorthand for this kind of query, instead of having to type out this whole thing every time?
The shorthand for all bools the same is testing for (pairwise) equality:
(first==second && second==third)
Of course you can expand this to any number of booleans, having N-1 equality checks joined with the and operator.
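For more than a handful of values, the N-1 comparisons can be written as a loop. A minimal sketch in C, assuming the values are real bool objects (so each is exactly 0 or 1) stored in an array; all_same is an invented name:

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when every element equals its neighbour, i.e. all true or all false. */
bool all_same(const bool *values, size_t count)
{
    for (size_t i = 1; i < count; ++i) {
        if (values[i] != values[i - 1]) {
            return false;   /* found a pair that differs */
        }
    }
    return true;            /* also true for zero or one element */
}
```

If the values are int-like "truthy" numbers rather than bool, they would need normalizing first (for example with !!x), since 1 and 2 are both true but not equal.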
If this is something you frequently require then you're better off using an integer and reading bits individually.
For instance, instead of:
BOOL x; // not this
BOOL y; // not this
BOOL z; // not this
...and instead of bit fields (because their layout is implementation-defined):
unsigned int x : 1; // not this
unsigned int y : 1; // not this
unsigned int z : 1; // not this
...use a single field such as:
unsigned int flags; // do this
...and assign every value to a bit; for example:
enum { // do this
FLAG_X = (1 << 0),
FLAG_Y = (1 << 1),
FLAG_Z = (1 << 2),
ALL_FLAGS = 0x07 // "all bits are on"
};
Then, to test "all false" you simply say "if (!flags)" and to test "all true" you simply say "if (flags == ALL_FLAGS)" where ALL_FLAGS is a number that sets all valid bits to 1. Other bitwise operators can be used to set or test individual bits as needed.
Note that this technique is limited to one Boolean per bit of the integer (32 where unsigned int is 32 bits wide, which is typical) before you have to do more (e.g. create an additional integer field to store more bits).
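Put together, the flag approach might look like the sketch below. The helper names (all_or_none, set_flag, clear_flag) are illustrative only, and ALL_FLAGS is built from the individual flags here so it cannot drift out of sync with them.

```c
#include <stdbool.h>

enum {
    FLAG_X    = (1 << 0),
    FLAG_Y    = (1 << 1),
    FLAG_Z    = (1 << 2),
    ALL_FLAGS = FLAG_X | FLAG_Y | FLAG_Z
};

/* True when every known flag is set or every known flag is clear. */
bool all_or_none(unsigned int flags)
{
    unsigned int masked = flags & ALL_FLAGS;   /* ignore unrelated bits */
    return masked == 0 || masked == ALL_FLAGS;
}

/* Returns flags with the given bit switched on. */
unsigned int set_flag(unsigned int flags, unsigned int flag)
{
    return flags | flag;
}

/* Returns flags with the given bit switched off. */
unsigned int clear_flag(unsigned int flags, unsigned int flag)
{
    return flags & ~flag;
}
```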
Check if the sum is 0 or equal to the number of bools:
((first + second + third) % 3 == 0)
This works for any number of arguments.
(But don't take this answer seriously and do it for real.)
When speaking about predicates, you can usually simplify the logic by using two variables for the quantification operations - universal quantification (for all) and existential quantification (there exists).
BOOL allValues = (value1 && value2 && value3);
BOOL anyValue = (value1 || value2 || value3);
if (allValues || !anyValue) {
... do something
}
This would also work if you have a lot of boolean values in an array - you could write a for loop that updates the two variables.
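A sketch of that loop in C, with the hypothetical name all_or_none_of and the values assumed to live in a bool array:

```c
#include <stdbool.h>
#include <stddef.h>

/* Evaluates "for all" and "there exists" in one pass over the array. */
bool all_or_none_of(const bool *values, size_t count)
{
    bool all_values = true;    /* universal quantification */
    bool any_value = false;    /* existential quantification */

    for (size_t i = 0; i < count; ++i) {
        all_values = all_values && values[i];
        any_value = any_value || values[i];
    }
    return all_values || !any_value;
}
```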
How do I convert an int to a char and also back from char to int?
e.g. 12345 == abcde
Right now I'm doing it with a whole bunch of case statements, and I wonder if there is a smarter way of doing that?
Thanks,
Tee
I would recommend using ASCII values and just typecasting.
In most cases it is best to just use the ASCII values to encode letters; however if you wanted to use 1 2 3 4 to represent 'a' 'b' 'c' 'd' then you could use the following.
For example, if you wanted to convert the number 1 to 'a' you could do:
char letter = (char)(1 + 96);
since in ASCII 97 corresponds to the character 'a'. Likewise you can convert the character 'a' to the integer 1 as follows:
int num = (int) 'a' - 96;
Of course it is just easier to use ASCII values to start with and avoid adding or subtracting as shown above. :-D
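If the 1-based mapping is what you need, wrapping the arithmetic in two small functions with range checks avoids magic numbers at the call sites. This is a sketch with invented names, and it assumes an ASCII-compatible character set where 'a' to 'z' are contiguous:

```c
/* Maps 1..26 to 'a'..'z'; returns '\0' for anything out of range. */
char letter_from_number(int number)
{
    if (number < 1 || number > 26) {
        return '\0';
    }
    return (char)('a' + number - 1);
}

/* Maps 'a'..'z' to 1..26; returns -1 for any other character. */
int number_from_letter(char letter)
{
    if (letter < 'a' || letter > 'z') {
        return -1;
    }
    return letter - 'a' + 1;
}
```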
If you want just to map 'a' -> 1, 'b' -> 2, ..., 'i' -> 9, you should do simply the following:
int convert(char* s)
{
if (!s) return -1; // error
int result = 0;
while (*s)
{
int digit = *s++ - 'a' + 1; // advance s, otherwise the loop never ends
if (digit < 1 || digit > 9)
return -1; // error
result = 10 * result + digit;
}
return result;
}
However, you should still care about 0s (which letter do you want to map to 0?) and overflow (my code doesn't check for it).
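The question also asks for the other direction (12345 to "abcde"). One possible sketch, using the same 1 -> 'a' ... 9 -> 'i' mapping and treating a 0 digit as an error because no letter is assigned to it; letters_from_int is a made-up name and the caller supplies the output buffer:

```c
#include <stddef.h>
#include <stdio.h>

/* Writes the letter form of a positive int into out; returns 0 on success, -1 on error. */
int letters_from_int(int value, char *out, size_t out_size)
{
    char digits[16];

    if (value <= 0) {
        return -1;
    }
    int len = snprintf(digits, sizeof digits, "%d", value);
    if (len < 0 || (size_t)len >= sizeof digits || (size_t)len >= out_size) {
        return -1;                    /* formatting failed or buffer too small */
    }
    for (int i = 0; i < len; ++i) {
        if (digits[i] == '0') {
            return -1;                /* no letter is mapped to 0 */
        }
    }
    for (int i = 0; i < len; ++i) {
        out[i] = (char)(digits[i] - '1' + 'a');
    }
    out[len] = '\0';
    return 0;
}
```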
unsigned char encryPt[1];
encryPt[0] = (char)1;
I'm looking for a way to generate combinations of objects ordered by a single attribute. I don't think lexicographical order is what I'm looking for... I'll try to give an example. Let's say I have a list of objects A,B,C,D with the attribute values I want to order by being 3,3,2,1. This gives A3, B3, C2, D1 objects. Now I want to generate combinations of 2 objects, but they need to be ordered in a descending way:
A3 B3
A3 C2
B3 C2
A3 D1
B3 D1
C2 D1
Generating all combinations and sorting them is not acceptable because the real world scenario involves large sets and millions of combinations (a set of 40, choosing 8), and I need only the combinations above a certain threshold.
Actually I need count of combinations above a threshold grouped by a sum of a given attribute, but I think it is far more difficult to do - so I'd settle for developing all combinations above a threshold and counting them. If that's possible at all.
EDIT - My original question wasn't very precise... I don't actually need these combinations ordered, I just thought it would help to isolate combinations above a threshold. To be more precise, in the above example, given a threshold of 5, I'm looking for the information that the given set produces 1 combination with a sum of 6 (A3 B3) and 2 with a sum of 5 (A3 C2, B3 C2). I don't actually need the combinations themselves.
I was looking into the subset-sum problem, but if I understood correctly, the usual dynamic programming solution only tells you whether a given sum exists or not, not how many combinations produce it.
Thanks
Actually, I think you do want lexicographic order, but descending rather than ascending. In addition:
It's not clear to me from your description that A, B, ... D play any role in your answer (except possibly as the container for the values).
I think your question example is simply "For each integer at least 5, up to the maximum possible total of two values, how many distinct pairs from the set {3, 3, 2, 1} have sums of that integer?"
The interesting part is the early bailout, once no possible solution can be reached (remaining achievable sums are too small).
I'll post sample code later.
Here's the sample code I promised, with a few remarks following:
public class Combos {
/* permanent state for instance */
private int values[];
private int length;
/* transient state during single "count" computation */
private int n;
private int limit;
private Tally<Integer> tally;
private int best[][]; // used for early-bail-out
private void initializeForCount(int n, int limit) {
this.n = n;
this.limit = limit;
best = new int[n+1][length+1];
for (int i = 1; i <= n; ++i) {
for (int j = 0; j <= length - i; ++j) {
best[i][j] = values[j] + best[i-1][j+1];
}
}
}
private void countAt(int left, int start, int sum) {
if (left == 0) {
tally.inc(sum);
} else {
for (
int i = start;
i <= length - left
&& limit <= sum + best[left][i]; // bail-out-check
++i
) {
countAt(left - 1, i + 1, sum + values[i]);
}
}
}
public Tally<Integer> count(int n, int limit) {
tally = new Tally<Integer>();
if (n <= length) {
initializeForCount(n, limit);
countAt(n, 0, 0);
}
return tally;
}
public Combos(int[] values) {
this.values = values;
this.length = values.length;
}
}
Preface remarks:
This uses a little helper class called Tally, that just isolates the tabulation (including initialization for never-before-seen keys). I'll put it at the end.
To keep this concise, I've taken some shortcuts that aren't good practice for "real" code:
This doesn't check for a null value array, etc.
I assume that the value array is already sorted into descending order, required for the early-bail-out technique. (Good production code would include the sorting.)
I put transient data into instance variables instead of passing them as arguments among the private methods that support count. That makes this class non-thread-safe.
Explanation:
An instance of Combos is created with the (descending ordered) array of integers to combine. The value array is set up once per instance, but multiple calls to count can be made with varying population sizes and limits.
The count method triggers a (mostly) standard recursive traversal of unique combinations of n integers from values. The limit argument gives the lower bound on sums of interest.
The countAt method examines combinations of integers from values. The left argument is how many integers remain to make up n integers in a sum, start is the position in values from which to search, and sum is the partial sum.
The early-bail-out mechanism is based on computing best, a two-dimensional array that specifies the "best" sum reachable from a given state. The value in best[n][p] is the largest sum of n values beginning in position p of the original values.
The recursion of countAt bottoms out when the correct population has been accumulated; this adds the current sum (of n values) to the tally. If countAt has not bottomed out, it sweeps the values from the start-ing position to increase the current partial sum, as long as:
enough positions remain in values to achieve the specified population, and
the best (largest) subtotal remaining is big enough to make the limit.
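To make the best table concrete, here is the same recurrence as a small C sketch (the rest of this answer is Java; fill_best, MAX_PICK and MAX_LEN are names and limits invented for this illustration). For the values {3, 3, 2, 1} and n = 2 it gives best[2][0] = 6, best[2][1] = 5 and best[2][2] = 3, so with a limit of 5 the sweep can stop as soon as it reaches position 2.

```c
#include <stddef.h>

#define MAX_PICK 8    /* assumed upper bound on n */
#define MAX_LEN  40   /* assumed upper bound on the number of values */

/* Fills best[k][p] with the largest sum of k values taken from position p
   onward; values must be sorted in descending order. Returns 0 on success,
   -1 when n or length exceeds the assumed bounds. */
int fill_best(const int *values, size_t length, size_t n,
              int best[MAX_PICK + 1][MAX_LEN + 1])
{
    if (n > MAX_PICK || length > MAX_LEN) {
        return -1;
    }
    /* Zero every cell that will be read, including unreachable ones. */
    for (size_t k = 0; k <= n; ++k) {
        for (size_t p = 0; p <= length; ++p) {
            best[k][p] = 0;
        }
    }
    /* Because the values are descending, the best k values from p onward
       are simply the next k values. */
    for (size_t k = 1; k <= n; ++k) {
        for (size_t p = 0; p + k <= length; ++p) {
            best[k][p] = values[p] + best[k - 1][p + 1];
        }
    }
    return 0;
}
```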
A sample run with your question's data:
int[] values = {3, 3, 2, 1};
Combos mine = new Combos(values);
Tally<Integer> tally = mine.count(2, 5);
for (int i = 5; i < 9; ++i) {
int n = tally.get(i);
if (0 < n) {
System.out.println("found " + tally.get(i) + " sums of " + i);
}
}
produces the results you specified:
found 2 sums of 5
found 1 sums of 6
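Here's the Tally code (it is declared static, so it has to be nested inside another class, and it needs java.util.Collection, java.util.HashMap and java.util.Map imported):
public static class Tally<T> {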
private Map<T,Integer> tally = new HashMap<T,Integer>();
public Tally() {/* nothing */}
public void inc(T key) {
Integer value = tally.get(key);
if (value == null) {
value = Integer.valueOf(0);
}
tally.put(key, (value + 1));
}
public int get(T key) {
Integer result = tally.get(key);
return result == null ? 0 : result;
}
public Collection<T> keys() {
return tally.keySet();
}
}
I have written a class to handle common functions for working with the binomial coefficient, which is the type of problem that your problem falls under. It performs the following tasks:
Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters. This method makes solving this type of problem quite trivial.
Converts the K-indexes to the proper index of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle. My paper talks about this. I believe I am the first to discover and publish this technique, but I could be wrong.
Converts the index in a sorted binomial coefficient table to the corresponding K-indexes.
Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to perform the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.
To read about this class and download the code, see Tablizing The Binomial Coeffieicent.
Check out this question in stackoverflow: Algorithm to return all combinations
I also just used the Java code below to generate all permutations, but it could easily be adapted to generate unique combinations given an index.
public static <E> E[] permutation(E[] s, int num) {//s is the input elements array and num is the number which represents the permutation
int factorial = 1;
for(int i = 2; i < s.length; i++)
factorial *= i;//calculates the factorial of (s.length - 1)
if (num/s.length >= factorial)// Optional. if the number is not in the range of [0, s.length! - 1]
return null;
for(int i = 0; i < s.length - 1; i++){//go over the array
int tempi = (num / factorial) % (s.length - i);//calculates the next cell from the cells left (the cells in the range [i, s.length - 1])
E temp = s[i + tempi];//Temporarily saves the value of the cell needed to add to the permutation this time
for(int j = i + tempi; j > i; j--)//shift all elements to "cover" the "missing" cell
s[j] = s[j-1];
s[i] = temp;//put the chosen cell in the correct spot
factorial /= (s.length - (i + 1));//updates the factorial
}
return s;
}
I am extremely sorry (after all those clarifications in the comments) to say that I could not find an efficient solution to this problem. I tried for the past hour with no results.
The reason (I think) is that this problem is very similar to problems like the traveling salesman problem. Unless you try all the combinations, there is no way to know which attributes will add up to the threshold.
There seems to be no clever trick that can solve this class of problems.
Still there are many optimizations that you can do to the actual code.
Try sorting the data according to the attributes. You may be able to avoid processing some values from the list when you find that a higher value cannot satisfy the threshold (so all lower values can be eliminated).
If you're using C# there is a fairly good generics library here. Note though that some of the permutations are not generated in lexicographic order.
Here's a recursive approach to count the number of these subsets: We define a function count(minIndex,numElements,minSum) that returns the number of subsets of size numElements whose sum is at least minSum, containing elements with indices minIndex or greater.
As in the problem statement, we sort our elements in descending order, e.g. [3,3,2,1], and call the first index zero, and the total number of elements N. We assume all elements are nonnegative. To find all 2-subsets whose sum is at least 5, we call count(0,2,5).
Sample Code (Java):
int count(int minIndex, int numElements, int minSum)
{
int total = 0;
if (numElements == 1)
{
// just count number of elements >= minSum
for (int i = minIndex; i <= N-1; i++)
if (a[i] >= minSum) total++; else break;
}
else
{
if (minSum <= 0)
{
// any subset will do (n-choose-k of them)
if (numElements <= (N-minIndex))
total = nchoosek(N-minIndex, numElements);
}
else
{
// add element a[i] to the set, and then consider the count
// for all elements to its right
for (int i = minIndex; i <= (N-numElements); i++)
total += count(i+1, numElements-1, minSum-a[i]);
}
}
return total;
}
Btw, I've run the above with an array of 40 elements, and size-8 subsets and consistently got back results in less than a second.
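The Java snippet above leaves N, a and nchoosek undefined. For anyone who wants to try the idea end to end, here is a self-contained C sketch of the same recursion. The names count_subsets and nchoosek and the choice to pass the array and its length as parameters are mine, and it assumes numElements is at least 1 and the array is sorted descending with nonnegative entries.

```c
#include <stddef.h>

/* n choose k via the multiplicative formula; every intermediate value is exact. */
static unsigned long long nchoosek(unsigned int n, unsigned int k)
{
    if (k > n) {
        return 0;
    }
    if (k > n - k) {
        k = n - k;
    }
    unsigned long long result = 1;
    for (unsigned int i = 1; i <= k; ++i) {
        result = result * (n - k + i) / i;
    }
    return result;
}

/* Counts subsets of size numElements (>= 1), drawn from a[minIndex..n-1],
   whose sum is at least minSum. */
unsigned long long count_subsets(const int *a, int n, int minIndex,
                                 int numElements, int minSum)
{
    unsigned long long total = 0;

    if (numElements == 1) {
        /* Descending order lets us stop at the first element that is too small. */
        for (int i = minIndex; i < n && a[i] >= minSum; ++i) {
            ++total;
        }
    } else if (minSum <= 0) {
        /* Threshold already met: every remaining subset qualifies. */
        if (numElements <= n - minIndex) {
            total = nchoosek((unsigned int)(n - minIndex), (unsigned int)numElements);
        }
    } else {
        /* Pick a[i], then count completions among the elements to its right. */
        for (int i = minIndex; i <= n - numElements; ++i) {
            total += count_subsets(a, n, i + 1, numElements - 1, minSum - a[i]);
        }
    }
    return total;
}
```

With the question's data, count_subsets(a, 4, 0, 2, 5) counts the three pairs whose sum is at least 5; subtracting the count for a threshold of 6 gives the number of pairs summing to exactly 5.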