How to multiply several fields in a tuple by a given field of the tuple - apache-pig

For each row of data, I would like to multiply fields 1 through N by field 0. The data could have hundreds of fields per row (or a variable number of fields for that matter), so writing out each pair is not feasible. Is there a way to specify a range of fields, sort of like the the following (incorrect) snippet?
A = LOAD 'foo.csv' USING PigStorage(',');
B = FOREACH A GENERATE $0*($1,..);

A UDF could come in handy here.
Implement exec(Tuple input) and iterate over all fields of the tuple as follows (not tested):
public class MultiplyField extends EvalFunc<Long> {
public Long exec(Tuple input) throws IOException {
if (input == null || input.size() == 0) {
return null;
}
try {
Long retVal = 1;
for (int i = 0; i < input.size(); i++) {
Long j = (Long)input.get(i);
retVal *= j;
}
return retVal;
} catch(Exception e) {
throw WrappedIOException.wrap("Caught exception processing input row ", e);
}
}
}
Then register your UDF and call it from your FOREACH.

Related

Using Lucene's highlighting, getting too much highlighted, is there a workaround for this?

I am using the highlighting feature of Lucene to isolate matching terms for my query, but some of the matched terms are excessive.
I have some simple test cases which are delivered in an Ant project (download details below).
Materials
You can download the test case here: mydemo_with_libs.zip
That archive includes the Lucene 8.6.3 libraries which my test uses; if you prefer a copy without the JAR files you can download that from here: mydemo_without_libs.zip
The necessary libraries are: core, analyzers, queries, queryparser, highlighter, and memory.
You can run the test case by unzipping the archive into an empty directory and running the Ant command ant synsearch
Input
I have provided a short synonym list which is used for indexing and analysing in the highlighting methods:
cope,manage
jobs,tasks
simultaneously,at once
and there is one document being indexed:
Queues are a useful way of grouping jobs together in order to manage a number of them at once. You can:
hold or release multiple jobs at the same time;
group multiple tasks (for the same event);
control the priority of jobs in the queue;
Eventually log all events that take place in a queue.
Use either job.queue or task.queue in specifications.
Process
When building the index I am storing the text field, and using a custom analyzer. This is because (in the real world) the content I am indexing is technical documentation, so stripping out punctuation is inappropriate because so much of it may be significant in technical expressions. My analyzer uses a TechTokenFilter which breaks the stream up into tokens consisting of strings of words or digits, or individual characters which don't match the previous pattern.
Here's the relevant code for the analyzer:
public class MyAnalyzer extends Analyzer {
public MyAnalyzer(String synlist) {
if (synlist != "") {
this.synlist = synlist;
this.useSynonyms = true;
}
}
public MyAnalyzer() {
this.useSynonyms = false;
}
#Override
protected TokenStreamComponents createComponents(String fieldName) {
WhitespaceTokenizer src = new WhitespaceTokenizer();
TokenStream result = new TechTokenFilter(new LowerCaseFilter(src));
if (useSynonyms) {
result = new SynonymGraphFilter(result, getSynonyms(synlist), Boolean.TRUE);
result = new FlattenGraphFilter(result);
}
return new TokenStreamComponents(src, result);
}
and here's my filter:
public class TechTokenFilter extends TokenFilter {
private final CharTermAttribute termAttr;
private final PositionIncrementAttribute posIncAttr;
private final ArrayList<String> termStack;
private AttributeSource.State current;
private final TypeAttribute typeAttr;
public TechTokenFilter(TokenStream tokenStream) {
super(tokenStream);
termStack = new ArrayList<>();
termAttr = addAttribute(CharTermAttribute.class);
posIncAttr = addAttribute(PositionIncrementAttribute.class);
typeAttr = addAttribute(TypeAttribute.class);
}
#Override
public boolean incrementToken() throws IOException {
if (this.termStack.isEmpty() && input.incrementToken()) {
final String currentTerm = termAttr.toString();
final int bufferLen = termAttr.length();
if (bufferLen > 0) {
if (termStack.isEmpty()) {
termStack.addAll(Arrays.asList(techTokens(currentTerm)));
current = captureState();
}
}
}
if (!this.termStack.isEmpty()) {
String part = termStack.remove(0);
restoreState(current);
termAttr.setEmpty().append(part);
posIncAttr.setPositionIncrement(1);
return true;
} else {
return false;
}
}
public static String[] techTokens(String t) {
List<String> tokenlist = new ArrayList<String>();
String[] tokens;
StringBuilder next = new StringBuilder();
String token;
char minus = '-';
char underscore = '_';
char c, prec, subc;
// Boolean inWord = false;
for (int i = 0; i < t.length(); i++) {
prec = i > 0 ? t.charAt(i - 1) : 0;
c = t.charAt(i);
subc = i < (t.length() - 1) ? t.charAt(i + 1) : 0;
if (Character.isLetterOrDigit(c) || c == underscore) {
next.append(c);
// inWord = true;
}
else if (c == minus && Character.isLetterOrDigit(prec) && Character.isLetterOrDigit(subc)) {
next.append(c);
} else {
if (next.length() > 0) {
token = next.toString();
tokenlist.add(token);
next.setLength(0);
}
if (Character.isWhitespace(c)) {
// shouldn't be possible because the input stream has been tokenized on
// whitespace
} else {
tokenlist.add(String.valueOf(c));
}
// inWord = false;
}
}
if (next.length() > 0) {
token = next.toString();
tokenlist.add(token);
// next.setLength(0);
}
tokens = tokenlist.toArray(new String[0]);
return tokens;
}
}
Examining the index I can see that the index contains the separate terms I expect, including the synonym values. For example the text at the end of the first line has produced the terms
of
them
at , simultaneously
once
.
You
can
:
and the text at the end of the third line has produced the terms
same
event
)
;
When the application performs a search it analyzes the query without using the synonym list (because the synonyms are already in the index), but I have discovered that I need to include the synonym list when analyzing the stored text to identify the matching fragments.
Searches match the correct documents, but the code I have added to identify the matching terms over-performs. I won't show all the search method here, but will focus on the code which lists matched terms:
public static void doSearch(IndexReader reader, IndexSearcher searcher,
Query query, int max, String synList) throws IOException {
SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter("\001", "\002");
Highlighter highlighter = new Highlighter(htmlFormatter, new QueryScorer(query));
Analyzer analyzer;
if (synList != null) {
analyzer = new MyAnalyzer(synList);
} else {
analyzer = new MyAnalyzer();
}
// Collect all the docs
TopDocs results = searcher.search(query, max);
ScoreDoc[] hits = results.scoreDocs;
int numTotalHits = Math.toIntExact(results.totalHits.value);
System.out.println("\nQuery: " + query.toString());
System.out.println("Matches: " + numTotalHits);
// Collect matching terms
HashSet<String> matchedWords = new HashSet<String>();
int start = 0;
int end = Math.min(numTotalHits, max);
for (int i = start; i < end; i++) {
int id = hits[i].doc;
float score = hits[i].score;
Document doc = searcher.doc(id);
String docpath = doc.get("path");
String doctext = doc.get("text");
try {
TokenStream tokens = TokenSources.getTokenStream("text", null, doctext, analyzer, -1);
TextFragment[] frag = highlighter.getBestTextFragments(tokens, doctext, false, 100);
for (int j = 0; j < frag.length; j++) {
if ((frag[j] != null) && (frag[j].getScore() > 0)) {
String match = frag[j].toString();
addMatchedWord(matchedWords, match);
}
}
} catch (InvalidTokenOffsetsException e) {
System.err.println(e.getMessage());
}
System.out.println("matched file: " + docpath);
}
if (matchedWords.size() > 0) {
System.out.println("matched terms:");
for (String word : matchedWords) {
System.out.println(word);
}
}
}
Problem
While the correct documents are selected by these queries, and the fragments chosen for highlighting do contain the query terms, the highlighted pieces in some of the selected fragments extend over too much of the input.
For example, if the query is
+text:event +text:manage
(the first example in the test case) then I would expect to see 'event' and 'manage' in the highlighted list. But what I actually see is
event);
manage
Despite the highlighting process using an analyzer which breaks terms apart and treats punctuation characters as single terms, the highlight code is "hungry" and breaks on whitespace alone.
Similarly if the query is
+text:queeu~1
(my final test case) I would expect to only see 'queue' in the list. But I get
queue.
job.queue
task.queue
queue;
It is so nearly there... but I don't understand why the highlighted pieces are inconsistent with the index, and I don't think I should have to parse the list of matches through yet another filter to produce the correct list of matches.
I would really appreciate any pointers to what I am doing wrong or how I could improve my code to deliver exactly what I need.
Thanks for reading this far!
I managed to get this working by replacing the WhitespaceTokenizer and TechTokenFilter in my analyzer with a PatternTokenizer; the regular expression took a bit of work but once I had it all the matching terms were extracted with pinpoint accuracy.
The replacement analyzer:
public class MyAnalyzer extends Analyzer {
public MyAnalyzer(String synlist) {
if (synlist != "") {
this.synlist = synlist;
this.useSynonyms = true;
}
}
public MyAnalyzer() {
this.useSynonyms = false;
}
private static final String tokenRegex = "(([\\w]+-)*[\\w]+)|[^\\w\\s]";
#Override
protected TokenStreamComponents createComponents(String fieldName) {
PatternTokenizer src = new PatternTokenizer(Pattern.compile(tokenRegex), 0);
TokenStream result = new LowerCaseFilter(src);
if (useSynonyms) {
result = new SynonymGraphFilter(result, getSynonyms(synlist), Boolean.TRUE);
result = new FlattenGraphFilter(result);
}
return new TokenStreamComponents(src, result);
}

Arithmetic parser in kotlin [duplicate]

I'm trying to write a Java routine to evaluate math expressions from String values like:
"5+3"
"10-4*5"
"(1+10)*3"
I want to avoid a lot of if-then-else statements.
How can I do this?
With JDK1.6, you can use the built-in Javascript engine.
import javax.script.ScriptEngineManager;
import javax.script.ScriptEngine;
import javax.script.ScriptException;
public class Test {
public static void main(String[] args) throws ScriptException {
ScriptEngineManager mgr = new ScriptEngineManager();
ScriptEngine engine = mgr.getEngineByName("JavaScript");
String foo = "40+2";
System.out.println(engine.eval(foo));
}
}
I've written this eval method for arithmetic expressions to answer this question. It does addition, subtraction, multiplication, division, exponentiation (using the ^ symbol), and a few basic functions like sqrt. It supports grouping using (...), and it gets the operator precedence and associativity rules correct.
public static double eval(final String str) {
return new Object() {
int pos = -1, ch;
void nextChar() {
ch = (++pos < str.length()) ? str.charAt(pos) : -1;
}
boolean eat(int charToEat) {
while (ch == ' ') nextChar();
if (ch == charToEat) {
nextChar();
return true;
}
return false;
}
double parse() {
nextChar();
double x = parseExpression();
if (pos < str.length()) throw new RuntimeException("Unexpected: " + (char)ch);
return x;
}
// Grammar:
// expression = term | expression `+` term | expression `-` term
// term = factor | term `*` factor | term `/` factor
// factor = `+` factor | `-` factor | `(` expression `)` | number
// | functionName `(` expression `)` | functionName factor
// | factor `^` factor
double parseExpression() {
double x = parseTerm();
for (;;) {
if (eat('+')) x += parseTerm(); // addition
else if (eat('-')) x -= parseTerm(); // subtraction
else return x;
}
}
double parseTerm() {
double x = parseFactor();
for (;;) {
if (eat('*')) x *= parseFactor(); // multiplication
else if (eat('/')) x /= parseFactor(); // division
else return x;
}
}
double parseFactor() {
if (eat('+')) return +parseFactor(); // unary plus
if (eat('-')) return -parseFactor(); // unary minus
double x;
int startPos = this.pos;
if (eat('(')) { // parentheses
x = parseExpression();
if (!eat(')')) throw new RuntimeException("Missing ')'");
} else if ((ch >= '0' && ch <= '9') || ch == '.') { // numbers
while ((ch >= '0' && ch <= '9') || ch == '.') nextChar();
x = Double.parseDouble(str.substring(startPos, this.pos));
} else if (ch >= 'a' && ch <= 'z') { // functions
while (ch >= 'a' && ch <= 'z') nextChar();
String func = str.substring(startPos, this.pos);
if (eat('(')) {
x = parseExpression();
if (!eat(')')) throw new RuntimeException("Missing ')' after argument to " + func);
} else {
x = parseFactor();
}
if (func.equals("sqrt")) x = Math.sqrt(x);
else if (func.equals("sin")) x = Math.sin(Math.toRadians(x));
else if (func.equals("cos")) x = Math.cos(Math.toRadians(x));
else if (func.equals("tan")) x = Math.tan(Math.toRadians(x));
else throw new RuntimeException("Unknown function: " + func);
} else {
throw new RuntimeException("Unexpected: " + (char)ch);
}
if (eat('^')) x = Math.pow(x, parseFactor()); // exponentiation
return x;
}
}.parse();
}
Example:
System.out.println(eval("((4 - 2^3 + 1) * -sqrt(3*3+4*4)) / 2"));
Output: 7.5 (which is correct)
The parser is a recursive descent parser, so internally uses separate parse methods for each level of operator precedence in its grammar. I deliberately kept it short, but here are some ideas you might want to expand it with:
Variables:
The bit of the parser that reads the names for functions can easily be changed to handle custom variables too, by looking up names in a variable table passed to the eval method, such as a Map<String,Double> variables.
Separate compilation and evaluation:
What if, having added support for variables, you wanted to evaluate the same expression millions of times with changed variables, without parsing it every time? It's possible. First define an interface to use to evaluate the precompiled expression:
#FunctionalInterface
interface Expression {
double eval();
}
Now to rework the original "eval" function into a "parse" function, change all the methods that return doubles, so instead they return an instance of that interface. Java 8's lambda syntax works well for this. Example of one of the changed methods:
Expression parseExpression() {
Expression x = parseTerm();
for (;;) {
if (eat('+')) { // addition
Expression a = x, b = parseTerm();
x = (() -> a.eval() + b.eval());
} else if (eat('-')) { // subtraction
Expression a = x, b = parseTerm();
x = (() -> a.eval() - b.eval());
} else {
return x;
}
}
}
That builds a recursive tree of Expression objects representing the compiled expression (an abstract syntax tree). Then you can compile it once and evaluate it repeatedly with different values:
public static void main(String[] args) {
Map<String,Double> variables = new HashMap<>();
Expression exp = parse("x^2 - x + 2", variables);
for (double x = -20; x <= +20; x++) {
variables.put("x", x);
System.out.println(x + " => " + exp.eval());
}
}
Different datatypes:
Instead of double, you could change the evaluator to use something more powerful like BigDecimal, or a class that implements complex numbers, or rational numbers (fractions). You could even use Object, allowing some mix of datatypes in expressions, just like a real programming language. :)
All code in this answer released to the public domain. Have fun!
For my university project, I was looking for a parser / evaluator supporting both basic formulas and more complicated equations (especially iterated operators). I found very nice open source library for JAVA and .NET called mXparser. I will give a few examples to make some feeling on the syntax, for further instructions please visit project website (especially tutorial section).
https://mathparser.org/
https://mathparser.org/mxparser-tutorial/
https://mathparser.org/api/
And few examples
1 - Simple furmula
Expression e = new Expression("( 2 + 3/4 + sin(pi) )/2");
double v = e.calculate()
2 - User defined arguments and constants
Argument x = new Argument("x = 10");
Constant a = new Constant("a = pi^2");
Expression e = new Expression("cos(a*x)", x, a);
double v = e.calculate()
3 - User defined functions
Function f = new Function("f(x, y, z) = sin(x) + cos(y*z)");
Expression e = new Expression("f(3,2,5)", f);
double v = e.calculate()
4 - Iteration
Expression e = new Expression("sum( i, 1, 100, sin(i) )");
double v = e.calculate()
Found recently - in case you would like to try the syntax (and see the advanced use case) you can download the Scalar Calculator app that is powered by mXparser.
The correct way to solve this is with a lexer and a parser. You can write simple versions of these yourself, or those pages also have links to Java lexers and parsers.
Creating a recursive descent parser is a really good learning exercise.
HERE is another open source library on GitHub named EvalEx.
Unlike the JavaScript engine this library is focused in evaluating mathematical expressions only. Moreover, the library is extensible and supports use of boolean operators as well as parentheses.
You can evaluate expressions easily if your Java application already accesses a database, without using any other JARs.
Some databases require you to use a dummy table (eg, Oracle's "dual" table) and others will allow you to evaluate expressions without "selecting" from any table.
For example, in Sql Server or Sqlite
select (((12.10 +12.0))/ 233.0) amount
and in Oracle
select (((12.10 +12.0))/ 233.0) amount from dual;
The advantage of using a DB is that you can evaluate many expressions at the same time. Also most DB's will allow you to use highly complex expressions and will also have a number of extra functions that can be called as necessary.
However performance may suffer if many single expressions need to be evaluated individually, particularly when the DB is located on a network server.
The following addresses the performance problem to some extent, by using a Sqlite in-memory database.
Here's a full working example in Java
Class. forName("org.sqlite.JDBC");
Connection conn = DriverManager.getConnection("jdbc:sqlite::memory:");
Statement stat = conn.createStatement();
ResultSet rs = stat.executeQuery( "select (1+10)/20.0 amount");
rs.next();
System.out.println(rs.getBigDecimal(1));
stat.close();
conn.close();
Of course you could extend the above code to handle multiple calculations at the same time.
ResultSet rs = stat.executeQuery( "select (1+10)/20.0 amount, (1+100)/20.0 amount2");
You can also try the BeanShell interpreter:
Interpreter interpreter = new Interpreter();
interpreter.eval("result = (7+21*6)/(32-27)");
System.out.println(interpreter.get("result"));
Another way is to use the Spring Expression Language or SpEL which does a whole lot more along with evaluating mathematical expressions, therefore maybe slightly overkill. You do not have to be using Spring framework to use this expression library as it is stand-alone. Copying examples from SpEL's documentation:
ExpressionParser parser = new SpelExpressionParser();
int two = parser.parseExpression("1 + 1").getValue(Integer.class); // 2
double twentyFour = parser.parseExpression("2.0 * 3e0 * 4").getValue(Double.class); //24.0
This article discusses various approaches. Here are the 2 key approaches mentioned in the article:
JEXL from Apache
Allows for scripts that include references to java objects.
// Create or retrieve a JexlEngine
JexlEngine jexl = new JexlEngine();
// Create an expression object
String jexlExp = "foo.innerFoo.bar()";
Expression e = jexl.createExpression( jexlExp );
// Create a context and add data
JexlContext jctx = new MapContext();
jctx.set("foo", new Foo() );
// Now evaluate the expression, getting the result
Object o = e.evaluate(jctx);
Use the javascript engine embedded in the JDK:
private static void jsEvalWithVariable()
{
List<String> namesList = new ArrayList<String>();
namesList.add("Jill");
namesList.add("Bob");
namesList.add("Laureen");
namesList.add("Ed");
ScriptEngineManager mgr = new ScriptEngineManager();
ScriptEngine jsEngine = mgr.getEngineByName("JavaScript");
jsEngine.put("namesListKey", namesList);
System.out.println("Executing in script environment...");
try
{
jsEngine.eval("var x;" +
"var names = namesListKey.toArray();" +
"for(x in names) {" +
" println(names[x]);" +
"}" +
"namesListKey.add(\"Dana\");");
}
catch (ScriptException ex)
{
ex.printStackTrace();
}
}
if we are going to implement it then we can can use the below algorithm :--
While there are still tokens to be read in,
1.1 Get the next token.
1.2 If the token is:
1.2.1 A number: push it onto the value stack.
1.2.2 A variable: get its value, and push onto the value stack.
1.2.3 A left parenthesis: push it onto the operator stack.
1.2.4 A right parenthesis:
1 While the thing on top of the operator stack is not a
left parenthesis,
1 Pop the operator from the operator stack.
2 Pop the value stack twice, getting two operands.
3 Apply the operator to the operands, in the correct order.
4 Push the result onto the value stack.
2 Pop the left parenthesis from the operator stack, and discard it.
1.2.5 An operator (call it thisOp):
1 While the operator stack is not empty, and the top thing on the
operator stack has the same or greater precedence as thisOp,
1 Pop the operator from the operator stack.
2 Pop the value stack twice, getting two operands.
3 Apply the operator to the operands, in the correct order.
4 Push the result onto the value stack.
2 Push thisOp onto the operator stack.
While the operator stack is not empty,
1 Pop the operator from the operator stack.
2 Pop the value stack twice, getting two operands.
3 Apply the operator to the operands, in the correct order.
4 Push the result onto the value stack.
At this point the operator stack should be empty, and the value
stack should have only one value in it, which is the final result.
This is another interesting alternative
https://github.com/Shy-Ta/expression-evaluator-demo
The usage is very simple and gets the job done, for example:
ExpressionsEvaluator evalExpr = ExpressionsFactory.create("2+3*4-6/2");
assertEquals(BigDecimal.valueOf(11), evalExpr.eval());
It seems like JEP should do the job
It's too late to answer but I came across same situation to evaluate expression in java, it might help someone
MVEL does runtime evaluation of expressions, we can write a java code in String to get it evaluated in this.
String expressionStr = "x+y";
Map<String, Object> vars = new HashMap<String, Object>();
vars.put("x", 10);
vars.put("y", 20);
ExecutableStatement statement = (ExecutableStatement) MVEL.compileExpression(expressionStr);
Object result = MVEL.executeExpression(statement, vars);
Try the following sample code using JDK1.6's Javascript engine with code injection handling.
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
public class EvalUtil {
private static ScriptEngine engine = new ScriptEngineManager().getEngineByName("JavaScript");
public static void main(String[] args) {
try {
System.out.println((new EvalUtil()).eval("(((5+5)/2) > 5) || 5 >3 "));
System.out.println((new EvalUtil()).eval("(((5+5)/2) > 5) || true"));
} catch (Exception e) {
e.printStackTrace();
}
}
public Object eval(String input) throws Exception{
try {
if(input.matches(".*[a-zA-Z;~`#$_{}\\[\\]:\\\\;\"',\\.\\?]+.*")) {
throw new Exception("Invalid expression : " + input );
}
return engine.eval(input);
} catch (Exception e) {
e.printStackTrace();
throw e;
}
}
}
This is actually complementing the answer given by #Boann. It has a slight bug which causes "-2 ^ 2" to give an erroneous result of -4.0. The problem for that is the point at which the exponentiation is evaluated in his. Just move the exponentiation to the block of parseTerm(), and you'll be all fine. Have a look at the below, which is #Boann's answer slightly modified. Modification is in the comments.
public static double eval(final String str) {
return new Object() {
int pos = -1, ch;
void nextChar() {
ch = (++pos < str.length()) ? str.charAt(pos) : -1;
}
boolean eat(int charToEat) {
while (ch == ' ') nextChar();
if (ch == charToEat) {
nextChar();
return true;
}
return false;
}
double parse() {
nextChar();
double x = parseExpression();
if (pos < str.length()) throw new RuntimeException("Unexpected: " + (char)ch);
return x;
}
// Grammar:
// expression = term | expression `+` term | expression `-` term
// term = factor | term `*` factor | term `/` factor
// factor = `+` factor | `-` factor | `(` expression `)`
// | number | functionName factor | factor `^` factor
double parseExpression() {
double x = parseTerm();
for (;;) {
if (eat('+')) x += parseTerm(); // addition
else if (eat('-')) x -= parseTerm(); // subtraction
else return x;
}
}
double parseTerm() {
double x = parseFactor();
for (;;) {
if (eat('*')) x *= parseFactor(); // multiplication
else if (eat('/')) x /= parseFactor(); // division
else if (eat('^')) x = Math.pow(x, parseFactor()); //exponentiation -> Moved in to here. So the problem is fixed
else return x;
}
}
double parseFactor() {
if (eat('+')) return parseFactor(); // unary plus
if (eat('-')) return -parseFactor(); // unary minus
double x;
int startPos = this.pos;
if (eat('(')) { // parentheses
x = parseExpression();
eat(')');
} else if ((ch >= '0' && ch <= '9') || ch == '.') { // numbers
while ((ch >= '0' && ch <= '9') || ch == '.') nextChar();
x = Double.parseDouble(str.substring(startPos, this.pos));
} else if (ch >= 'a' && ch <= 'z') { // functions
while (ch >= 'a' && ch <= 'z') nextChar();
String func = str.substring(startPos, this.pos);
x = parseFactor();
if (func.equals("sqrt")) x = Math.sqrt(x);
else if (func.equals("sin")) x = Math.sin(Math.toRadians(x));
else if (func.equals("cos")) x = Math.cos(Math.toRadians(x));
else if (func.equals("tan")) x = Math.tan(Math.toRadians(x));
else throw new RuntimeException("Unknown function: " + func);
} else {
throw new RuntimeException("Unexpected: " + (char)ch);
}
//if (eat('^')) x = Math.pow(x, parseFactor()); // exponentiation -> This is causing a bit of problem
return x;
}
}.parse();
}
import java.util.*;
public class check {
int ans;
String str="7 + 5";
StringTokenizer st=new StringTokenizer(str);
int v1=Integer.parseInt(st.nextToken());
String op=st.nextToken();
int v2=Integer.parseInt(st.nextToken());
if(op.equals("+")) { ans= v1 + v2; }
if(op.equals("-")) { ans= v1 - v2; }
//.........
}
I think what ever way you do this it's going to involve a lot of conditional statements. But for single operations like in your examples you could limit it to 4 if statements with something like
String math = "1+4";
if (math.split("+").length == 2) {
//do calculation
} else if (math.split("-").length == 2) {
//do calculation
} ...
It gets a whole lot more complicated when you want to deal with multiple operations like "4+5*6".
If you are trying to build a calculator then I'd surgest passing each section of the calculation separatly (each number or operator) rather than as a single string.
You might have a look at the Symja framework:
ExprEvaluator util = new ExprEvaluator();
IExpr result = util.evaluate("10-40");
System.out.println(result.toString()); // -> "-30"
Take note that definitively more complex expressions can be evaluated:
// D(...) gives the derivative of the function Sin(x)*Cos(x)
IAST function = D(Times(Sin(x), Cos(x)), x);
IExpr result = util.evaluate(function);
// print: Cos(x)^2-Sin(x)^2
package ExpressionCalculator.expressioncalculator;
import java.text.DecimalFormat;
import java.util.Scanner;
public class ExpressionCalculator {
private static String addSpaces(String exp){
//Add space padding to operands.
//https://regex101.com/r/sJ9gM7/73
exp = exp.replaceAll("(?<=[0-9()])[\\/]", " / ");
exp = exp.replaceAll("(?<=[0-9()])[\\^]", " ^ ");
exp = exp.replaceAll("(?<=[0-9()])[\\*]", " * ");
exp = exp.replaceAll("(?<=[0-9()])[+]", " + ");
exp = exp.replaceAll("(?<=[0-9()])[-]", " - ");
//Keep replacing double spaces with single spaces until your string is properly formatted
/*while(exp.indexOf(" ") != -1){
exp = exp.replace(" ", " ");
}*/
exp = exp.replaceAll(" {2,}", " ");
return exp;
}
public static Double evaluate(String expr){
DecimalFormat df = new DecimalFormat("#.####");
//Format the expression properly before performing operations
String expression = addSpaces(expr);
try {
//We will evaluate using rule BDMAS, i.e. brackets, division, power, multiplication, addition and
//subtraction will be processed in following order
int indexClose = expression.indexOf(")");
int indexOpen = -1;
if (indexClose != -1) {
String substring = expression.substring(0, indexClose);
indexOpen = substring.lastIndexOf("(");
substring = substring.substring(indexOpen + 1).trim();
if(indexOpen != -1 && indexClose != -1) {
Double result = evaluate(substring);
expression = expression.substring(0, indexOpen).trim() + " " + result + " " + expression.substring(indexClose + 1).trim();
return evaluate(expression.trim());
}
}
String operation = "";
if(expression.indexOf(" / ") != -1){
operation = "/";
}else if(expression.indexOf(" ^ ") != -1){
operation = "^";
} else if(expression.indexOf(" * ") != -1){
operation = "*";
} else if(expression.indexOf(" + ") != -1){
operation = "+";
} else if(expression.indexOf(" - ") != -1){ //Avoid negative numbers
operation = "-";
} else{
return Double.parseDouble(expression);
}
int index = expression.indexOf(operation);
if(index != -1){
indexOpen = expression.lastIndexOf(" ", index - 2);
indexOpen = (indexOpen == -1)?0:indexOpen;
indexClose = expression.indexOf(" ", index + 2);
indexClose = (indexClose == -1)?expression.length():indexClose;
if(indexOpen != -1 && indexClose != -1) {
Double lhs = Double.parseDouble(expression.substring(indexOpen, index));
Double rhs = Double.parseDouble(expression.substring(index + 2, indexClose));
Double result = null;
switch (operation){
case "/":
//Prevent divide by 0 exception.
if(rhs == 0){
return null;
}
result = lhs / rhs;
break;
case "^":
result = Math.pow(lhs, rhs);
break;
case "*":
result = lhs * rhs;
break;
case "-":
result = lhs - rhs;
break;
case "+":
result = lhs + rhs;
break;
default:
break;
}
if(indexClose == expression.length()){
expression = expression.substring(0, indexOpen) + " " + result + " " + expression.substring(indexClose);
}else{
expression = expression.substring(0, indexOpen) + " " + result + " " + expression.substring(indexClose + 1);
}
return Double.valueOf(df.format(evaluate(expression.trim())));
}
}
}catch(Exception exp){
exp.printStackTrace();
}
return 0.0;
}
public static void main(String args[]){
Scanner scanner = new Scanner(System.in);
System.out.print("Enter an Mathematical Expression to Evaluate: ");
String input = scanner.nextLine();
System.out.println(evaluate(input));
}
}
A Java class that can evaluate mathematical expressions:
package test;
public class Calculator {
public static Double calculate(String expression){
if (expression == null || expression.length() == 0) {
return null;
}
return calc(expression.replace(" ", ""));
}
public static Double calc(String expression) {
String[] containerArr = new String[]{expression};
double leftVal = getNextOperand(containerArr);
expression = containerArr[0];
if (expression.length() == 0) {
return leftVal;
}
char operator = expression.charAt(0);
expression = expression.substring(1);
while (operator == '*' || operator == '/') {
containerArr[0] = expression;
double rightVal = getNextOperand(containerArr);
expression = containerArr[0];
if (operator == '*') {
leftVal = leftVal * rightVal;
} else {
leftVal = leftVal / rightVal;
}
if (expression.length() > 0) {
operator = expression.charAt(0);
expression = expression.substring(1);
} else {
return leftVal;
}
}
if (operator == '+') {
return leftVal + calc(expression);
} else {
return leftVal - calc(expression);
}
}
private static double getNextOperand(String[] exp){
double res;
if (exp[0].startsWith("(")) {
int open = 1;
int i = 1;
while (open != 0) {
if (exp[0].charAt(i) == '(') {
open++;
} else if (exp[0].charAt(i) == ')') {
open--;
}
i++;
}
res = calc(exp[0].substring(1, i - 1));
exp[0] = exp[0].substring(i);
} else {
int i = 1;
if (exp[0].charAt(0) == '-') {
i++;
}
while (exp[0].length() > i && isNumber((int) exp[0].charAt(i))) {
i++;
}
res = Double.parseDouble(exp[0].substring(0, i));
exp[0] = exp[0].substring(i);
}
return res;
}
private static boolean isNumber(int c) {
int zero = (int) '0';
int nine = (int) '9';
return (c >= zero && c <= nine) || c =='.';
}
public static void main(String[] args) {
System.out.println(calculate("(((( -6 )))) * 9 * -1"));
System.out.println(calc("(-5.2+-5*-5*((5/4+2)))"));
}
}
How about something like this:
String st = "10+3";
int result;
for(int i=0;i<st.length();i++)
{
if(st.charAt(i)=='+')
{
result=Integer.parseInt(st.substring(0, i))+Integer.parseInt(st.substring(i+1, st.length()));
System.out.print(result);
}
}
and do the similar thing for every other mathematical operator accordingly ..
It is possible to convert any expression string in infix notation to a postfix notation using Djikstra's shunting-yard algorithm. The result of the algorithm can then serve as input to the postfix algorithm with returns the result of the expression.
I wrote an article about it here, with an implementation in java
Yet another option: https://github.com/stefanhaustein/expressionparser
I have implemented this to have a simple but flexible option to permit both:
Immediate processing (Calculator.java, SetDemo.java)
Building and processing a parse tree (TreeBuilder.java)
The TreeBuilder linked above is part of a CAS demo package that does symbolic derivation. There is also a BASIC interpreter example and I have started to build a TypeScript interpreter using it.
External library like RHINO or NASHORN can be used to run javascript. And javascript can evaluate simple formula without parcing the string. No performance impact as well if code is written well.
Below is an example with RHINO -
public class RhinoApp {
private String simpleAdd = "(12+13+2-2)*2+(12+13+2-2)*2";
public void runJavaScript() {
Context jsCx = Context.enter();
Context.getCurrentContext().setOptimizationLevel(-1);
ScriptableObject scope = jsCx.initStandardObjects();
Object result = jsCx.evaluateString(scope, simpleAdd , "formula", 0, null);
Context.exit();
System.out.println(result);
}
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
public class test2 {
public static void main(String[] args) throws ScriptException {
String s = "10+2";
ScriptEngineManager mn = new ScriptEngineManager();
ScriptEngine en = mn.getEngineByName("js");
Object result = en.eval(s);
System.out.println(result);
}
}
I have done using iterative parsing and shunting Yard algorithm and i have really enjoyed developing the expression evaluator ,you can find all the code here
https://github.com/nagaraj200788/JavaExpressionEvaluator
Has 73 test cases and even works for Bigintegers,Bigdecimals
supports all relational, arithmetic expression and also combination of both .
even supports ternary operator .
Added enhancement to support signed numbers like -100+89 it was intresting, for details check TokenReader.isUnaryOperator() method and i have updated code in above Link

Why doesn't my number sequence print from the 2d arraylist correctly?

I cannot get the loop to work in the buildDimArray method to store the number combinations "11+11", "11+12", "11+21", "11+22", "12+11", "12+12", "12+21", "12+22", "21+11", "21+12", "21+21", "21+22", "22+11", "22+12", "22+21", and "22+22" into the 2d arraylist with each expression going into one column of the index dimBase-1 row. The loop may work for other people, but for some reason mine isn't functioning correctly. The JVM sees the if dimBase==1 condition, but refuses to check the other conditions. The "WTF" not being printed as a result from the buildDimArray method. If dimBase=1, it prints successfully, but doesn't for the other integers. The dimBase==3 condition needs a loop eventually. The "WTF" is for illustrative purposes. I could get away with a 1d arraylist, but in the future I will likely need the 2d arraylist once the program is completed.
package jordanNumberApp;
import java.util.Scanner;
import java.util.ArrayList;
/*
* Dev Wills
* Purpose: This code contains some methods that aren't developed. This program is supposed to
* store all possible number combinations from numbers 1-dimBase for the math expression
* "##+##" into a 2d arraylist at index row dimBase-1 and the columns storing the
* individual combinations. After storing the values in the arraylist, the print method
* pours the contents in order from the arraylist as string values.
*/
public class JordanNumberSystem {
// a-d are digits, assembled as a math expression, stored in outcomeOutput, outcomeAnswer
public static int dimBase, outcomeAnswer, a, b, c, d;
public static String inputOutcome, outcomeOutput;
public static final int NUM_OF_DIMENSIONS = 9; //Eventually # combinations go up to 9
public static ArrayList<ArrayList<String>> dimBaseArray;
public static Scanner keyboard;
/*
* Constructor for JordanNumber System
* accepts no parameters
*/
public JordanNumberSystem() // Defunct constructor
{
// Declare and Initialize public variables
this.dimBase = dimBase;
this.outcomeOutput = outcomeOutput;
this.outcomeAnswer = outcomeAnswer;
}
// Set all values of variable values
public static void setAllValues()
{
// Initialize
dimBase = 1;
outcomeAnswer = 22; // variables not used for now
outcomeOutput = "1"; // variables not used for now
//a = 1;
//b = 1;
//c = 1;
//d = 1;
dimBaseArray = new ArrayList<ArrayList<String>>();
keyboard = new Scanner(System.in);
}
public static void buildDimArray(int dim)
{
dimBase = dim;
try
{
//create first row
dimBaseArray.add(dimBase-1, new ArrayList<String>());
if( dimBase == 1)
{
a = b = c = d = dimBase ;
dimBaseArray.get(0).add(a+""+b+"+"+c+""+d);
System.out.println("WTF"); // SHOWS
}
else if (dimBase == 2)
{ // dim = 2
a = b = c = d = 1 ;
System.out.println("WTF"); // doesn't show
// dimBaseArray.get(dimBase-1).add(a+""+b+"+"+c+""+d);
for( int i = 1 ; i <= dim ; i++)
a=i;
for( int j = 1 ; j <= dim ; j++)
b=j;
for( int k = 1 ; k <= dim ; k++)
c=k;
for( int l = 1 ; l <= dim ; l++)
{
d=l;
dimBaseArray.get(dim-1).add(a+""+b+"+"+c+""+d);
}
}
else if (dimBase == 3)
{
a = b = c = d = dimBase;
dimBaseArray.get(2).add(a+""+b+"+"+c+""+d);
System.out.println("WTF");
}
}catch (IndexOutOfBoundsException e)
{
System.out.println(e.getMessage());
}
}
public static void printArray(int num) // Prints the contents of the array
{ // Fixing the printing method
try
{
int i = num-1;
for( String string : dimBaseArray.get(i))
{
System.out.println(string);
System.out.println("");
}
} catch (IndexOutOfBoundsException e)
{
System.out.println(e.getMessage());
}
}
public static void main(String[] args) throws java.lang.IndexOutOfBoundsException
{
setAllValues(); // sets the initial a,b,c,d values and dimBase, initializes 2d arraylist
// Get the Dimension Base number
System.out.println("Enter Dimension Base Number. Input an integer: ");
int dimBaseInput = keyboard.nextInt(); // Receives integer
dimBase = dimBaseInput;
if( dimBase != 1 && dimBase != 2 && dimBase != 3)
{// Error checking
System.out.println("invalid Dimension Base Number should be 1 or 2 ");
System.exit(1);
}
// Build the arraylist, print, clear, exit
buildDimArray(dimBase);
printArray(dimBase);
dimBaseArray.clear();
System.exit(1);
}
}// End of class

Capitalise first letter of each word in string + lowercase all other letters [duplicate]

Is there a function built into Java that capitalizes the first character of each word in a String, and does not affect the others?
Examples:
jon skeet -> Jon Skeet
miles o'Brien -> Miles O'Brien (B remains capital, this rules out Title Case)
old mcdonald -> Old Mcdonald*
*(Old McDonald would be find too, but I don't expect it to be THAT smart.)
A quick look at the Java String Documentation reveals only toUpperCase() and toLowerCase(), which of course do not provide the desired behavior. Naturally, Google results are dominated by those two functions. It seems like a wheel that must have been invented already, so it couldn't hurt to ask so I can use it in the future.
WordUtils.capitalize(str) (from apache commons-text)
(Note: if you need "fOO BAr" to become "Foo Bar", then use capitalizeFully(..) instead)
If you're only worried about the first letter of the first word being capitalized:
private String capitalize(final String line) {
return Character.toUpperCase(line.charAt(0)) + line.substring(1);
}
The following method converts all the letters into upper/lower case, depending on their position near a space or other special chars.
public static String capitalizeString(String string) {
char[] chars = string.toLowerCase().toCharArray();
boolean found = false;
for (int i = 0; i < chars.length; i++) {
if (!found && Character.isLetter(chars[i])) {
chars[i] = Character.toUpperCase(chars[i]);
found = true;
} else if (Character.isWhitespace(chars[i]) || chars[i]=='.' || chars[i]=='\'') { // You can add other chars here
found = false;
}
}
return String.valueOf(chars);
}
Try this very simple way
example givenString="ram is good boy"
public static String toTitleCase(String givenString) {
String[] arr = givenString.split(" ");
StringBuffer sb = new StringBuffer();
for (int i = 0; i < arr.length; i++) {
sb.append(Character.toUpperCase(arr[i].charAt(0)))
.append(arr[i].substring(1)).append(" ");
}
return sb.toString().trim();
}
Output will be: Ram Is Good Boy
I made a solution in Java 8 that is IMHO more readable.
public String firstLetterCapitalWithSingleSpace(final String words) {
return Stream.of(words.trim().split("\\s"))
.filter(word -> word.length() > 0)
.map(word -> word.substring(0, 1).toUpperCase() + word.substring(1))
.collect(Collectors.joining(" "));
}
The Gist for this solution can be found here: https://gist.github.com/Hylke1982/166a792313c5e2df9d31
String toBeCapped = "i want this sentence capitalized";
String[] tokens = toBeCapped.split("\\s");
toBeCapped = "";
for(int i = 0; i < tokens.length; i++){
char capLetter = Character.toUpperCase(tokens[i].charAt(0));
toBeCapped += " " + capLetter + tokens[i].substring(1);
}
toBeCapped = toBeCapped.trim();
I've written a small Class to capitalize all the words in a String.
Optional multiple delimiters, each one with its behavior (capitalize before, after, or both, to handle cases like O'Brian);
Optional Locale;
Don't breaks with Surrogate Pairs.
LIVE DEMO
Output:
====================================
SIMPLE USAGE
====================================
Source: cApItAlIzE this string after WHITE SPACES
Output: Capitalize This String After White Spaces
====================================
SINGLE CUSTOM-DELIMITER USAGE
====================================
Source: capitalize this string ONLY before'and''after'''APEX
Output: Capitalize this string only beforE'AnD''AfteR'''Apex
====================================
MULTIPLE CUSTOM-DELIMITER USAGE
====================================
Source: capitalize this string AFTER SPACES, BEFORE'APEX, and #AFTER AND BEFORE# NUMBER SIGN (#)
Output: Capitalize This String After Spaces, BeforE'apex, And #After And BeforE# Number Sign (#)
====================================
SIMPLE USAGE WITH CUSTOM LOCALE
====================================
Source: Uniforming the first and last vowels (different kind of 'i's) of the Turkish word D[İ]YARBAK[I]R (DİYARBAKIR)
Output: Uniforming The First And Last Vowels (different Kind Of 'i's) Of The Turkish Word D[i]yarbak[i]r (diyarbakir)
====================================
SIMPLE USAGE WITH A SURROGATE PAIR
====================================
Source: ab 𐐂c de à
Output: Ab 𐐪c De À
Note: first letter will always be capitalized (edit the source if you don't want that).
Please share your comments and help me to found bugs or to improve the code...
Code:
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Locale;
public class WordsCapitalizer {
public static String capitalizeEveryWord(String source) {
return capitalizeEveryWord(source,null,null);
}
public static String capitalizeEveryWord(String source, Locale locale) {
return capitalizeEveryWord(source,null,locale);
}
public static String capitalizeEveryWord(String source, List<Delimiter> delimiters, Locale locale) {
char[] chars;
if (delimiters == null || delimiters.size() == 0)
delimiters = getDefaultDelimiters();
// If Locale specified, i18n toLowerCase is executed, to handle specific behaviors (eg. Turkish dotted and dotless 'i')
if (locale!=null)
chars = source.toLowerCase(locale).toCharArray();
else
chars = source.toLowerCase().toCharArray();
// First charachter ALWAYS capitalized, if it is a Letter.
if (chars.length>0 && Character.isLetter(chars[0]) && !isSurrogate(chars[0])){
chars[0] = Character.toUpperCase(chars[0]);
}
for (int i = 0; i < chars.length; i++) {
if (!isSurrogate(chars[i]) && !Character.isLetter(chars[i])) {
// Current char is not a Letter; gonna check if it is a delimitrer.
for (Delimiter delimiter : delimiters){
if (delimiter.getDelimiter()==chars[i]){
// Delimiter found, applying rules...
if (delimiter.capitalizeBefore() && i>0
&& Character.isLetter(chars[i-1]) && !isSurrogate(chars[i-1]))
{ // previous character is a Letter and I have to capitalize it
chars[i-1] = Character.toUpperCase(chars[i-1]);
}
if (delimiter.capitalizeAfter() && i<chars.length-1
&& Character.isLetter(chars[i+1]) && !isSurrogate(chars[i+1]))
{ // next character is a Letter and I have to capitalize it
chars[i+1] = Character.toUpperCase(chars[i+1]);
}
break;
}
}
}
}
return String.valueOf(chars);
}
private static boolean isSurrogate(char chr){
// Check if the current character is part of an UTF-16 Surrogate Pair.
// Note: not validating the pair, just used to bypass (any found part of) it.
return (Character.isHighSurrogate(chr) || Character.isLowSurrogate(chr));
}
private static List<Delimiter> getDefaultDelimiters(){
// If no delimiter specified, "Capitalize after space" rule is set by default.
List<Delimiter> delimiters = new ArrayList<Delimiter>();
delimiters.add(new Delimiter(Behavior.CAPITALIZE_AFTER_MARKER, ' '));
return delimiters;
}
public static class Delimiter {
private Behavior behavior;
private char delimiter;
public Delimiter(Behavior behavior, char delimiter) {
super();
this.behavior = behavior;
this.delimiter = delimiter;
}
public boolean capitalizeBefore(){
return (behavior.equals(Behavior.CAPITALIZE_BEFORE_MARKER)
|| behavior.equals(Behavior.CAPITALIZE_BEFORE_AND_AFTER_MARKER));
}
public boolean capitalizeAfter(){
return (behavior.equals(Behavior.CAPITALIZE_AFTER_MARKER)
|| behavior.equals(Behavior.CAPITALIZE_BEFORE_AND_AFTER_MARKER));
}
public char getDelimiter() {
return delimiter;
}
}
public static enum Behavior {
CAPITALIZE_AFTER_MARKER(0),
CAPITALIZE_BEFORE_MARKER(1),
CAPITALIZE_BEFORE_AND_AFTER_MARKER(2);
private int value;
private Behavior(int value) {
this.value = value;
}
public int getValue() {
return value;
}
}
Using org.apache.commons.lang.StringUtils makes it very simple.
capitalizeStr = StringUtils.capitalize(str);
From Java 9+
you can use String::replaceAll like this :
public static void upperCaseAllFirstCharacter(String text) {
String regex = "\\b(.)(.*?)\\b";
String result = Pattern.compile(regex).matcher(text).replaceAll(
matche -> matche.group(1).toUpperCase() + matche.group(2)
);
System.out.println(result);
}
Example :
upperCaseAllFirstCharacter("hello this is Just a test");
Outputs
Hello This Is Just A Test
With this simple code:
String example="hello";
example=example.substring(0,1).toUpperCase()+example.substring(1, example.length());
System.out.println(example);
Result: Hello
I'm using the following function. I think it is faster in performance.
public static String capitalize(String text){
String c = (text != null)? text.trim() : "";
String[] words = c.split(" ");
String result = "";
for(String w : words){
result += (w.length() > 1? w.substring(0, 1).toUpperCase(Locale.US) + w.substring(1, w.length()).toLowerCase(Locale.US) : w) + " ";
}
return result.trim();
}
Use the Split method to split your string into words, then use the built in string functions to capitalize each word, then append together.
Pseudo-code (ish)
string = "the sentence you want to apply caps to";
words = string.split(" ")
string = ""
for(String w: words)
//This line is an easy way to capitalize a word
word = word.toUpperCase().replace(word.substring(1), word.substring(1).toLowerCase())
string += word
In the end string looks something like
"The Sentence You Want To Apply Caps To"
This might be useful if you need to capitalize titles. It capitalizes each substring delimited by " ", except for specified strings such as "a" or "the". I haven't ran it yet because it's late, should be fine though. Uses Apache Commons StringUtils.join() at one point. You can substitute it with a simple loop if you wish.
private static String capitalize(String string) {
if (string == null) return null;
String[] wordArray = string.split(" "); // Split string to analyze word by word.
int i = 0;
lowercase:
for (String word : wordArray) {
if (word != wordArray[0]) { // First word always in capital
String [] lowercaseWords = {"a", "an", "as", "and", "although", "at", "because", "but", "by", "for", "in", "nor", "of", "on", "or", "so", "the", "to", "up", "yet"};
for (String word2 : lowercaseWords) {
if (word.equals(word2)) {
wordArray[i] = word;
i++;
continue lowercase;
}
}
}
char[] characterArray = word.toCharArray();
characterArray[0] = Character.toTitleCase(characterArray[0]);
wordArray[i] = new String(characterArray);
i++;
}
return StringUtils.join(wordArray, " "); // Re-join string
}
public static String toTitleCase(String word){
return Character.toUpperCase(word.charAt(0)) + word.substring(1);
}
public static void main(String[] args){
String phrase = "this is to be title cased";
String[] splitPhrase = phrase.split(" ");
String result = "";
for(String word: splitPhrase){
result += toTitleCase(word) + " ";
}
System.out.println(result.trim());
}
1. Java 8 Streams
public static String capitalizeAll(String str) {
if (str == null || str.isEmpty()) {
return str;
}
return Arrays.stream(str.split("\\s+"))
.map(t -> t.substring(0, 1).toUpperCase() + t.substring(1))
.collect(Collectors.joining(" "));
}
Examples:
System.out.println(capitalizeAll("jon skeet")); // Jon Skeet
System.out.println(capitalizeAll("miles o'Brien")); // Miles O'Brien
System.out.println(capitalizeAll("old mcdonald")); // Old Mcdonald
System.out.println(capitalizeAll(null)); // null
For foo bAR to Foo Bar, replace the map() method with the following:
.map(t -> t.substring(0, 1).toUpperCase() + t.substring(1).toLowerCase())
2. String.replaceAll() (Java 9+)
ublic static String capitalizeAll(String str) {
if (str == null || str.isEmpty()) {
return str;
}
return Pattern.compile("\\b(.)(.*?)\\b")
.matcher(str)
.replaceAll(match -> match.group(1).toUpperCase() + match.group(2));
}
Examples:
System.out.println(capitalizeAll("12 ways to learn java")); // 12 Ways To Learn Java
System.out.println(capitalizeAll("i am atta")); // I Am Atta
System.out.println(capitalizeAll(null)); // null
3. Apache Commons Text
System.out.println(WordUtils.capitalize("love is everywhere")); // Love Is Everywhere
System.out.println(WordUtils.capitalize("sky, sky, blue sky!")); // Sky, Sky, Blue Sky!
System.out.println(WordUtils.capitalize(null)); // null
For titlecase:
System.out.println(WordUtils.capitalizeFully("fOO bAR")); // Foo Bar
System.out.println(WordUtils.capitalizeFully("sKy is BLUE!")); // Sky Is Blue!
For details, checkout this tutorial.
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
System.out.println("Enter the sentence : ");
try
{
String str = br.readLine();
char[] str1 = new char[str.length()];
for(int i=0; i<str.length(); i++)
{
str1[i] = Character.toLowerCase(str.charAt(i));
}
str1[0] = Character.toUpperCase(str1[0]);
for(int i=0;i<str.length();i++)
{
if(str1[i] == ' ')
{
str1[i+1] = Character.toUpperCase(str1[i+1]);
}
System.out.print(str1[i]);
}
}
catch(Exception e)
{
System.err.println("Error: " + e.getMessage());
}
I decided to add one more solution for capitalizing words in a string:
words are defined here as adjacent letter-or-digit characters;
surrogate pairs are provided as well;
the code has been optimized for performance; and
it is still compact.
Function:
public static String capitalize(String string) {
final int sl = string.length();
final StringBuilder sb = new StringBuilder(sl);
boolean lod = false;
for(int s = 0; s < sl; s++) {
final int cp = string.codePointAt(s);
sb.appendCodePoint(lod ? Character.toLowerCase(cp) : Character.toUpperCase(cp));
lod = Character.isLetterOrDigit(cp);
if(!Character.isBmpCodePoint(cp)) s++;
}
return sb.toString();
}
Example call:
System.out.println(capitalize("An à la carte StRiNg. Surrogate pairs: 𐐪𐐪."));
Result:
An À La Carte String. Surrogate Pairs: 𐐂𐐪.
Use:
String text = "jon skeet, miles o'brien, old mcdonald";
Pattern pattern = Pattern.compile("\\b([a-z])([\\w]*)");
Matcher matcher = pattern.matcher(text);
StringBuffer buffer = new StringBuffer();
while (matcher.find()) {
matcher.appendReplacement(buffer, matcher.group(1).toUpperCase() + matcher.group(2));
}
String capitalized = matcher.appendTail(buffer).toString();
System.out.println(capitalized);
There are many way to convert the first letter of the first word being capitalized. I have an idea. It's very simple:
public String capitalize(String str){
/* The first thing we do is remove whitespace from string */
String c = str.replaceAll("\\s+", " ");
String s = c.trim();
String l = "";
for(int i = 0; i < s.length(); i++){
if(i == 0){ /* Uppercase the first letter in strings */
l += s.toUpperCase().charAt(i);
i++; /* To i = i + 1 because we don't need to add
value i = 0 into string l */
}
l += s.charAt(i);
if(s.charAt(i) == 32){ /* If we meet whitespace (32 in ASCII Code is whitespace) */
l += s.toUpperCase().charAt(i+1); /* Uppercase the letter after whitespace */
i++; /* Yo i = i + 1 because we don't need to add
value whitespace into string l */
}
}
return l;
}
package com.test;
/**
* #author Prasanth Pillai
* #date 01-Feb-2012
* #description : Below is the test class details
*
* inputs a String from a user. Expect the String to contain spaces and alphanumeric characters only.
* capitalizes all first letters of the words in the given String.
* preserves all other characters (including spaces) in the String.
* displays the result to the user.
*
* Approach : I have followed a simple approach. However there are many string utilities available
* for the same purpose. Example : WordUtils.capitalize(str) (from apache commons-lang)
*
*/
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
public class Test {
public static void main(String[] args) throws IOException{
System.out.println("Input String :\n");
InputStreamReader converter = new InputStreamReader(System.in);
BufferedReader in = new BufferedReader(converter);
String inputString = in.readLine();
int length = inputString.length();
StringBuffer newStr = new StringBuffer(0);
int i = 0;
int k = 0;
/* This is a simple approach
* step 1: scan through the input string
* step 2: capitalize the first letter of each word in string
* The integer k, is used as a value to determine whether the
* letter is the first letter in each word in the string.
*/
while( i < length){
if (Character.isLetter(inputString.charAt(i))){
if ( k == 0){
newStr = newStr.append(Character.toUpperCase(inputString.charAt(i)));
k = 2;
}//this else loop is to avoid repeatation of the first letter in output string
else {
newStr = newStr.append(inputString.charAt(i));
}
} // for the letters which are not first letter, simply append to the output string.
else {
newStr = newStr.append(inputString.charAt(i));
k=0;
}
i+=1;
}
System.out.println("new String ->"+newStr);
}
}
Here is a simple function
public static String capEachWord(String source){
String result = "";
String[] splitString = source.split(" ");
for(String target : splitString){
result += Character.toUpperCase(target.charAt(0))
+ target.substring(1) + " ";
}
return result.trim();
}
This is just another way of doing it:
private String capitalize(String line)
{
StringTokenizer token =new StringTokenizer(line);
String CapLine="";
while(token.hasMoreTokens())
{
String tok = token.nextToken().toString();
CapLine += Character.toUpperCase(tok.charAt(0))+ tok.substring(1)+" ";
}
return CapLine.substring(0,CapLine.length()-1);
}
Reusable method for intiCap:
public class YarlagaddaSireeshTest{
public static void main(String[] args) {
String FinalStringIs = "";
String testNames = "sireesh yarlagadda test";
String[] name = testNames.split("\\s");
for(String nameIs :name){
FinalStringIs += getIntiCapString(nameIs) + ",";
}
System.out.println("Final Result "+ FinalStringIs);
}
public static String getIntiCapString(String param) {
if(param != null && param.length()>0){
char[] charArray = param.toCharArray();
charArray[0] = Character.toUpperCase(charArray[0]);
return new String(charArray);
}
else {
return "";
}
}
}
Here is my solution.
I ran across this problem tonight and decided to search it. I found an answer by Neelam Singh that was almost there, so I decided to fix the issue (broke on empty strings) and caused a system crash.
The method you are looking for is named capString(String s) below.
It turns "It's only 5am here" into "It's Only 5am Here".
The code is pretty well commented, so enjoy.
package com.lincolnwdaniel.interactivestory.model;
public class StringS {
/**
* #param s is a string of any length, ideally only one word
* #return a capitalized string.
* only the first letter of the string is made to uppercase
*/
public static String capSingleWord(String s) {
if(s.isEmpty() || s.length()<2) {
return Character.toUpperCase(s.charAt(0))+"";
}
else {
return Character.toUpperCase(s.charAt(0)) + s.substring(1);
}
}
/**
*
* #param s is a string of any length
* #return a title cased string.
* All first letter of each word is made to uppercase
*/
public static String capString(String s) {
// Check if the string is empty, if it is, return it immediately
if(s.isEmpty()){
return s;
}
// Split string on space and create array of words
String[] arr = s.split(" ");
// Create a string buffer to hold the new capitalized string
StringBuffer sb = new StringBuffer();
// Check if the array is empty (would be caused by the passage of s as an empty string [i.g "" or " "],
// If it is, return the original string immediately
if( arr.length < 1 ){
return s;
}
for (int i = 0; i < arr.length; i++) {
sb.append(Character.toUpperCase(arr[i].charAt(0)))
.append(arr[i].substring(1)).append(" ");
}
return sb.toString().trim();
}
}
Here we go for perfect first char capitalization of word
public static void main(String[] args) {
String input ="my name is ranjan";
String[] inputArr = input.split(" ");
for(String word : inputArr) {
System.out.println(word.substring(0, 1).toUpperCase()+word.substring(1,word.length()));
}
}
}
//Output : My Name Is Ranjan
For those of you using Velocity in your MVC, you can use the capitalizeFirstLetter() method from the StringUtils class.
String s="hi dude i want apple";
s = s.replaceAll("\\s+"," ");
String[] split = s.split(" ");
s="";
for (int i = 0; i < split.length; i++) {
split[i]=Character.toUpperCase(split[i].charAt(0))+split[i].substring(1);
s+=split[i]+" ";
System.out.println(split[i]);
}
System.out.println(s);
package corejava.string.intern;
import java.io.DataInputStream;
import java.util.ArrayList;
/*
* wap to accept only 3 sentences and convert first character of each word into upper case
*/
public class Accept3Lines_FirstCharUppercase {
static String line;
static String words[];
static ArrayList<String> list=new ArrayList<String>();
/**
* #param args
*/
public static void main(String[] args) throws java.lang.Exception{
DataInputStream read=new DataInputStream(System.in);
System.out.println("Enter only three sentences");
int i=0;
while((line=read.readLine())!=null){
method(line); //main logic of the code
if((i++)==2){
break;
}
}
display();
System.out.println("\n End of the program");
}
/*
* this will display all the elements in an array
*/
public static void display(){
for(String display:list){
System.out.println(display);
}
}
/*
* this divide the line of string into words
* and first char of the each word is converted to upper case
* and to an array list
*/
public static void method(String lineParam){
words=line.split("\\s");
for(String s:words){
String result=s.substring(0,1).toUpperCase()+s.substring(1);
list.add(result);
}
}
}
If you prefer Guava...
String myString = ...;
String capWords = Joiner.on(' ').join(Iterables.transform(Splitter.on(' ').omitEmptyStrings().split(myString), new Function<String, String>() {
public String apply(String input) {
return Character.toUpperCase(input.charAt(0)) + input.substring(1);
}
}));
String toUpperCaseFirstLetterOnly(String str) {
String[] words = str.split(" ");
StringBuilder ret = new StringBuilder();
for(int i = 0; i < words.length; i++) {
ret.append(Character.toUpperCase(words[i].charAt(0)));
ret.append(words[i].substring(1));
if(i < words.length - 1) {
ret.append(' ');
}
}
return ret.toString();
}

implement feedback in lucene

Can someone give me hints to apply pseudo-feedback in lucene. I can not find much help on google. I am using Similarity classes.
Is there any class in lucene which I can extend to implement feedback ?
thanks.
Assuming you are referring to this relevance feedback method, Once you have the TopDocs for your original query, iterate through how ever many (let's say we'll want the top 25 terms for the top 25 docs of the original query) records you desire, and call IndexReader.getTermVectors(int), which will grab the information you need. Iterate through each. while storing the term frequencies in a hash map would be the implementation the immediately occurs to me.
Something like:
//Get the original results
TopDocs docs = indexsearcher.search(query,25);
HashMap<String,ScorePair> map = new HashMap<String,ScorePair>();
for (int i = 0; i < docs.scoreDocs.length; i++) {
//Iterate fields for each result
FieldsEnum fields = indexreader.getTermVectors(docs.scoreDocs[i].doc).iterator();
String fieldname;
while (fieldname = fields.next()) {
//For each field, iterate it's terms
TermsEnum terms = fields.terms().iterator();
while (terms.next()) {
//and store it
putTermInMap(fieldname, terms.term(), terms.docFreq(), map);
}
}
}
List<ScorePair> byScore = new ArrayList<ScorePair>(map.values());
Collections.sort(byScore);
BooleanQuery bq = new BooleanQuery();
//Perhaps we want to give the original query a bit of a boost
query.setBoost(5);
bq.add(query,BooleanClause.Occur.SHOULD);
for (int i = 0; i < 25; i++) {
//Add all our found terms to the final query
ScorePair pair = byScore.get(i);
bq.add(new TermQuery(new Term(pair.field,pair.term)),BooleanClause.Occur.SHOULD);
}
}
//Say, we want to score based on tf/idf
void putTermInMap(String field, String term, int freq, Map<String,ScorePair> map) {
String key = field + ":" + term;
if (map.containsKey(key))
map.get(key).increment();
else
map.put(key,new ScorePair(freq,field,term));
}
private class ScorePair implements Comparable{
int count = 0;
double idf;
String field;
String term;
ScorePair(int docfreq, String field, String term) {
count++;
//Standard Lucene idf calculation. This is calculated once per field:term
idf = (1 + Math.log(indexreader.numDocs()/((double)docfreq + 1))) ^ 2;
this.field = field;
this.term = term;
}
void increment() { count++; }
double score() {
return Math.sqrt(count) * idf;
}
//Standard Lucene TF/IDF calculation, if I'm not mistaken about it.
int compareTo(ScorePair pair) {
if (this.score() < pair.score()) return -1;
else return 1;
}
}
(I make no claim that this is functional code, in it's current state.)