I'm using LocationTextExtractionStrategy to extract text from a PDF.
The text is delivered to a function called RenderText.
So my question is: can one chunk contain parts of more than 2 words?
For example, we have the text:
'MKL is a helpfull person'
Can it be delivered in chunks like this (the most important chunk is bolded):
MK
L
is a h
elpfull
per son
?
Below is the code I use for word separation.
I do the word separation while adding text (a chunk from the RenderText function) to the current line.
public class TextLineLocation
{
public float X { get; set; }
public float Y { get; set; }
public float Height { get; set; }
public float Width { get; set; }
private string Text;
private List<char> bannedSings = new List<char>() {' ',',', '.', '/', '|', Convert.ToChar(@"\"), ';', '(', ')', '*', '&', '^', '!','?' };
public void AddText(TextInfo text)
{
Text += text;
foreach (char sign in bannedSings)
{
//creating new word
if (text.textChunk.Text.Contains(sign))
{
string[] splittedText = text.textChunk.Text.Split(sign);
foreach (string val in splittedText)
{
//if its first element, add it to current word
if (splittedText[0] == val)
{
// if its space, just ignore...
if (splittedText[0] == " ")
{
continue;
}
wordList[wordList.Count - 1].Text += val;
wordList[wordList.Count - 1].Width += text.getFontWidth();
wordList[wordList.Count - 1].Height += text.getFontHeight();
}
else
{
//if it isnt a first element, create another word
wordList.Add(new WordLocation(text.textChunk.StartLocation[1], text.textChunk.StartLocation[0], text.getFontWidth(), text.getFontHeight(), val));
//TODO: what if chunk has more than 2 words separated ?
}
}
}
}
else
{
//update last word
wordList[wordList.Count-1].Text += text.textChunk.Text;
wordList[wordList.Count - 1].Width += text.getFontWidth();
wordList[wordList.Count - 1].Height += text.getFontHeight();
}
}
public List<WordLocation> wordList = new List<WordLocation>();
}
I'm not sure which library LocationTextExtractionStrategy comes from or what exactly it does, but the PDF format itself lets a producer group characters together in a "chunk".
How this is used depends entirely on the program that produced the PDF: some programs keep words together, some group only word fragments (for example, for kerning), and some do other, seemingly random things.
So if LocationTextExtractionStrategy returns these groups as chunks, you can't rely on anything. If it instead uses spacing heuristics to group characters into chunks, the result will only be as good as the heuristics are.
Bottom line: a PDF doesn't contain text; it contains glyphs and their positions on the page. Reconstructing text from it is and remains guesswork. You may get it to work in the majority of cases, but there will always be PDFs where whatever you are doing fails.
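To make the "spacing heuristics" idea concrete, here is a minimal sketch in Java (chosen only for illustration; the question's code is C#). Everything in it is an assumption: Chunk is a hypothetical stand-in for a positioned text chunk, not an iText type, and maxGap is an arbitrary threshold you would have to tune per font and document. It shows one possible way to rebuild words when a chunk can hold a fragment of a word, a whole word, or pieces of several words.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkWordGrouper {
    // Hypothetical positioned chunk: its text plus the x-range it covers on the line.
    static final class Chunk {
        final String text;
        final float startX;
        final float endX;

        Chunk(String text, float startX, float endX) {
            this.text = text;
            this.startX = startX;
            this.endX = endX;
        }
    }

    // Rebuilds words from the chunks of one line, assumed sorted left to right.
    // A word ends at whitespace inside a chunk, or when the horizontal gap to the
    // previous chunk is larger than maxGap (an assumed, document-dependent value).
    static List<String> toWords(List<Chunk> line, float maxGap) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Chunk previous = null;
        for (Chunk chunk : line) {
            if (previous != null && chunk.startX - previous.endX > maxGap) {
                flush(current, words);
            }
            for (char c : chunk.text.toCharArray()) {
                if (Character.isWhitespace(c)) {
                    flush(current, words);
                } else {
                    current.append(c);
                }
            }
            previous = chunk;
        }
        flush(current, words);
        return words;
    }

    // Moves the word collected so far (if any) into the result list.
    private static void flush(StringBuilder current, List<String> words) {
        if (current.length() > 0) {
            words.add(current.toString());
            current.setLength(0);
        }
    }
}
```

Because a word is only closed on whitespace or on a large gap, a chunk such as "is a h" contributes two complete words and the start of a third. As the answer says, this remains a heuristic: a PDF with unusual spacing can still defeat it.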
I have implemented Google's Mobile Vision for Android by following a tutorial. I am trying to build an app that will scan a receipt and find the numeric total. However, as I scan different receipts that are printed in different formats, the API will detect TextBlocks in what seems to be an arbitrary way. For example, in one receipt, if several words of text are separated by single spaces, then they are grouped into a single TextBlock. However, if two words of text are separated by lots of spaces, then they are separated as independent TextBlocks, even though they appear on the same "line". What I am trying to do is force the API to recognize each entire line of the receipt as a single entity. Is this possible?
public ArrayList<T> getAllGraphicsInRow(float rawY) {
synchronized (mLock) {
ArrayList<T> row = new ArrayList<>();
// Get the position of this View so the raw location can be offset relative to the view.
int[] location = new int[2];
this.getLocationOnScreen(location);
for (T graphic : mGraphics) {
float rawX = this.getWidth();
for (int i=0; i<rawX; i+=10){
if (graphic.contains(i - location[0], rawY - location[1])) {
if(!row.contains(graphic)) {
row.add(graphic);
}
}
}
}
return row;
}
}
This method belongs in the GraphicOverlay.java file; it fetches all the graphics in the tapped row.
public static boolean almostEqual(double a, double b, double eps){
return Math.abs(a-b)<(eps);
}
public static boolean pointAlmostEqual(Point a, Point b){
return almostEqual(a.y,b.y,10);
}
public static boolean cornerPointAlmostEqual(Point[] rect1, Point[] rect2){
boolean almostEqual=true;
for (int i=0; i<rect1.length;i++){
if (!pointAlmostEqual(rect1[i],rect2[i])){
almostEqual=false;
}
}
return almostEqual;
}
private boolean onTap(float rawX, float rawY) {
String priceRegex = "(\\d+[,.]\\d\\d)";
ArrayList<OcrGraphic> graphics = mGraphicOverlay.getAllGraphicsInRow(rawY);
OcrGraphic currentGraphics = mGraphicOverlay.getGraphicAtLocation(rawX,rawY);
if (graphics !=null && currentGraphics!=null) {
List<? extends Text> currentComponents = currentGraphics.getTextBlock().getComponents();
final Pattern pattern = Pattern.compile(priceRegex);
final Pattern pattern1 = Pattern.compile(priceRegex);
TextBlock text = null;
Log.i("text results", "This many in the row: " + Integer.toString(graphics.size()));
ArrayList<Text> combinedComponents = new ArrayList<>();
for (OcrGraphic graphic : graphics) {
if (!graphic.equals(currentGraphics)) {
text = graphic.getTextBlock();
Log.i("text results", text.getValue());
combinedComponents.addAll(text.getComponents());
}
}
for (Text currentText : currentComponents) { // goes through components in the row
final Matcher matcher = pattern.matcher(currentText.getValue()); // looks for
Point[] currentPoint = currentText.getCornerPoints();
for (Text otherCurrentText : combinedComponents) {//Looks for other components that are in the same row
final Matcher otherMatcher = pattern1.matcher(otherCurrentText.getValue()); // looks for
Point[] innerCurrentPoint = otherCurrentText.getCornerPoints();
if (cornerPointAlmostEqual(currentPoint, innerCurrentPoint)) {
if (matcher.find()) { // if you click on the price
Log.i("oh yes", "Item: " + otherCurrentText.getValue());
Log.i("oh yes", "Value: " + matcher.group(1));
itemList.add(otherCurrentText.getValue());
priceList.add(Float.valueOf(matcher.group(1)));
}
if (otherMatcher.find()) { // if you click on the item
Log.i("oh yes", "Item: " + currentText.getValue());
Log.i("oh yes", "Value: " + otherMatcher.group(1));
itemList.add(currentText.getValue());
priceList.add(Float.valueOf(otherMatcher.group(1)));
}
Toast toast = Toast.makeText(this, " Text Captured!" , Toast.LENGTH_SHORT);
toast.show();
}
}
}
return true;
}
return false;
}
This method belongs in OcrCaptureActivity.java. It breaks the tapped TextBlock into lines, finds the blocks in the same row as each line, checks whether either side contains a price, and records the item and value accordingly.
The eps value passed to almostEqual is the vertical tolerance used to decide whether two graphics are in the same row.
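The same tolerance idea can be sketched without any Android types. The following is a minimal, hypothetical example: Box is an invented stand-in for a detected text element (it is not a Mobile Vision class), and eps plays the same role as in almostEqual. It groups boxes whose top edges are within eps of each other into one row and joins each row from left to right.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class RowGrouper {
    // Hypothetical detected element: its text and the top-left corner of its bounding box.
    static final class Box {
        final String text;
        final int left;
        final int top;

        Box(String text, int left, int top) {
            this.text = text;
            this.left = left;
            this.top = top;
        }
    }

    // Groups boxes into rows: a box joins the current row when its top edge is
    // within eps of the first box of that row; otherwise it starts a new row.
    static List<String> toLines(List<Box> boxes, int eps) {
        List<Box> sorted = new ArrayList<>(boxes);
        sorted.sort(Comparator.comparingInt((Box b) -> b.top));

        List<List<Box>> rows = new ArrayList<>();
        for (Box box : sorted) {
            List<Box> last = rows.isEmpty() ? null : rows.get(rows.size() - 1);
            if (last != null && Math.abs(box.top - last.get(0).top) < eps) {
                last.add(box);
            } else {
                List<Box> row = new ArrayList<>();
                row.add(box);
                rows.add(row);
            }
        }

        // Join every row left to right with single spaces.
        List<String> lines = new ArrayList<>();
        for (List<Box> row : rows) {
            row.sort(Comparator.comparingInt((Box b) -> b.left));
            StringBuilder sb = new StringBuilder();
            for (Box b : row) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(b.text);
            }
            lines.add(sb.toString());
        }
        return lines;
    }
}
```

A larger eps tolerates skewed or curled receipts but risks merging adjacent rows, so the value would need tuning against real scans.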
Is there a function built into Java that capitalizes the first character of each word in a String, and does not affect the others?
Examples:
jon skeet -> Jon Skeet
miles o'Brien -> Miles O'Brien (B remains capital, this rules out Title Case)
old mcdonald -> Old Mcdonald*
*(Old McDonald would be fine too, but I don't expect it to be THAT smart.)
A quick look at the Java String Documentation reveals only toUpperCase() and toLowerCase(), which of course do not provide the desired behavior. Naturally, Google results are dominated by those two functions. It seems like a wheel that must have been invented already, so it couldn't hurt to ask so I can use it in the future.
WordUtils.capitalize(str) (from apache commons-text)
(Note: if you need "fOO BAr" to become "Foo Bar", then use capitalizeFully(..) instead)
If you're only worried about the first letter of the first word being capitalized:
private String capitalize(final String line) {
return Character.toUpperCase(line.charAt(0)) + line.substring(1);
}
The following method converts all the letters into upper/lower case, depending on their position near a space or other special chars.
public static String capitalizeString(String string) {
char[] chars = string.toLowerCase().toCharArray();
boolean found = false;
for (int i = 0; i < chars.length; i++) {
if (!found && Character.isLetter(chars[i])) {
chars[i] = Character.toUpperCase(chars[i]);
found = true;
} else if (Character.isWhitespace(chars[i]) || chars[i]=='.' || chars[i]=='\'') { // You can add other chars here
found = false;
}
}
return String.valueOf(chars);
}
Try this very simple way
example givenString="ram is good boy"
public static String toTitleCase(String givenString) {
String[] arr = givenString.split(" ");
StringBuffer sb = new StringBuffer();
for (int i = 0; i < arr.length; i++) {
sb.append(Character.toUpperCase(arr[i].charAt(0)))
.append(arr[i].substring(1)).append(" ");
}
return sb.toString().trim();
}
Output will be: Ram Is Good Boy
I made a solution in Java 8 that is IMHO more readable.
public String firstLetterCapitalWithSingleSpace(final String words) {
return Stream.of(words.trim().split("\\s"))
.filter(word -> word.length() > 0)
.map(word -> word.substring(0, 1).toUpperCase() + word.substring(1))
.collect(Collectors.joining(" "));
}
The Gist for this solution can be found here: https://gist.github.com/Hylke1982/166a792313c5e2df9d31
String toBeCapped = "i want this sentence capitalized";
String[] tokens = toBeCapped.split("\\s");
toBeCapped = "";
for(int i = 0; i < tokens.length; i++){
char capLetter = Character.toUpperCase(tokens[i].charAt(0));
toBeCapped += " " + capLetter + tokens[i].substring(1);
}
toBeCapped = toBeCapped.trim();
I've written a small Class to capitalize all the words in a String.
Optional multiple delimiters, each one with its behavior (capitalize before, after, or both, to handle cases like O'Brian);
Optional Locale;
Doesn't break with surrogate pairs.
Output:
====================================
SIMPLE USAGE
====================================
Source: cApItAlIzE this string after WHITE SPACES
Output: Capitalize This String After White Spaces
====================================
SINGLE CUSTOM-DELIMITER USAGE
====================================
Source: capitalize this string ONLY before'and''after'''APEX
Output: Capitalize this string only beforE'AnD''AfteR'''Apex
====================================
MULTIPLE CUSTOM-DELIMITER USAGE
====================================
Source: capitalize this string AFTER SPACES, BEFORE'APEX, and #AFTER AND BEFORE# NUMBER SIGN (#)
Output: Capitalize This String After Spaces, BeforE'apex, And #After And BeforE# Number Sign (#)
====================================
SIMPLE USAGE WITH CUSTOM LOCALE
====================================
Source: Uniforming the first and last vowels (different kind of 'i's) of the Turkish word D[İ]YARBAK[I]R (DİYARBAKIR)
Output: Uniforming The First And Last Vowels (different Kind Of 'i's) Of The Turkish Word D[i]yarbak[i]r (diyarbakir)
====================================
SIMPLE USAGE WITH A SURROGATE PAIR
====================================
Source: ab 𐐂c de à
Output: Ab 𐐪c De À
Note: first letter will always be capitalized (edit the source if you don't want that).
Please share your comments and help me find bugs or improve the code.
Code:
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Locale;
public class WordsCapitalizer {
public static String capitalizeEveryWord(String source) {
return capitalizeEveryWord(source,null,null);
}
public static String capitalizeEveryWord(String source, Locale locale) {
return capitalizeEveryWord(source,null,locale);
}
public static String capitalizeEveryWord(String source, List<Delimiter> delimiters, Locale locale) {
char[] chars;
if (delimiters == null || delimiters.size() == 0)
delimiters = getDefaultDelimiters();
// If Locale specified, i18n toLowerCase is executed, to handle specific behaviors (eg. Turkish dotted and dotless 'i')
if (locale!=null)
chars = source.toLowerCase(locale).toCharArray();
else
chars = source.toLowerCase().toCharArray();
// First character ALWAYS capitalized, if it is a Letter.
if (chars.length>0 && Character.isLetter(chars[0]) && !isSurrogate(chars[0])){
chars[0] = Character.toUpperCase(chars[0]);
}
for (int i = 0; i < chars.length; i++) {
if (!isSurrogate(chars[i]) && !Character.isLetter(chars[i])) {
// Current char is not a Letter; check whether it is a delimiter.
for (Delimiter delimiter : delimiters){
if (delimiter.getDelimiter()==chars[i]){
// Delimiter found, applying rules...
if (delimiter.capitalizeBefore() && i>0
&& Character.isLetter(chars[i-1]) && !isSurrogate(chars[i-1]))
{ // previous character is a Letter and I have to capitalize it
chars[i-1] = Character.toUpperCase(chars[i-1]);
}
if (delimiter.capitalizeAfter() && i<chars.length-1
&& Character.isLetter(chars[i+1]) && !isSurrogate(chars[i+1]))
{ // next character is a Letter and I have to capitalize it
chars[i+1] = Character.toUpperCase(chars[i+1]);
}
break;
}
}
}
}
return String.valueOf(chars);
}
private static boolean isSurrogate(char chr){
// Check if the current character is part of an UTF-16 Surrogate Pair.
// Note: not validating the pair, just used to bypass (any found part of) it.
return (Character.isHighSurrogate(chr) || Character.isLowSurrogate(chr));
}
private static List<Delimiter> getDefaultDelimiters(){
// If no delimiter specified, "Capitalize after space" rule is set by default.
List<Delimiter> delimiters = new ArrayList<Delimiter>();
delimiters.add(new Delimiter(Behavior.CAPITALIZE_AFTER_MARKER, ' '));
return delimiters;
}
public static class Delimiter {
private Behavior behavior;
private char delimiter;
public Delimiter(Behavior behavior, char delimiter) {
super();
this.behavior = behavior;
this.delimiter = delimiter;
}
public boolean capitalizeBefore(){
return (behavior.equals(Behavior.CAPITALIZE_BEFORE_MARKER)
|| behavior.equals(Behavior.CAPITALIZE_BEFORE_AND_AFTER_MARKER));
}
public boolean capitalizeAfter(){
return (behavior.equals(Behavior.CAPITALIZE_AFTER_MARKER)
|| behavior.equals(Behavior.CAPITALIZE_BEFORE_AND_AFTER_MARKER));
}
public char getDelimiter() {
return delimiter;
}
}
public static enum Behavior {
CAPITALIZE_AFTER_MARKER(0),
CAPITALIZE_BEFORE_MARKER(1),
CAPITALIZE_BEFORE_AND_AFTER_MARKER(2);
private int value;
private Behavior(int value) {
this.value = value;
}
public int getValue() {
return value;
}
}
}
Using org.apache.commons.lang.StringUtils makes it very simple.
capitalizeStr = StringUtils.capitalize(str);
From Java 9 onwards,
you can use Matcher::replaceAll with a replacement function, like this:
public static void upperCaseAllFirstCharacter(String text) {
String regex = "\\b(.)(.*?)\\b";
String result = Pattern.compile(regex).matcher(text).replaceAll(
match -> match.group(1).toUpperCase() + match.group(2)
);
System.out.println(result);
}
Example :
upperCaseAllFirstCharacter("hello this is Just a test");
Outputs
Hello This Is Just A Test
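One thing to check before relying on a \b-based pattern is how it treats apostrophes, since \b also matches between a letter and an apostrophe. The sketch below wraps the same pattern in a method that returns the result instead of printing it, so the behaviour can be compared on two inputs. The class and method names are made up for this illustration.

```java
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Same pattern as above: upper-cases the first character after every word boundary.
    static String capitalizeWords(String text) {
        return Pattern.compile("\\b(.)(.*?)\\b")
                .matcher(text)
                .replaceAll(match -> match.group(1).toUpperCase() + match.group(2));
    }

    public static void main(String[] args) {
        System.out.println(capitalizeWords("hello this is Just a test"));
        System.out.println(capitalizeWords("it's only"));
    }
}
```

With this pattern "it's only" becomes "It'S Only", because the letter after the apostrophe also starts at a word boundary. That is harmless for "miles o'Brien" from the question, but it may not be what you want for contractions.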
With this simple code:
String example="hello";
example=example.substring(0,1).toUpperCase()+example.substring(1, example.length());
System.out.println(example);
Result: Hello
I'm using the following function. I think it performs better.
public static String capitalize(String text){
String c = (text != null)? text.trim() : "";
String[] words = c.split(" ");
String result = "";
for(String w : words){
result += (w.length() > 0 ? w.substring(0, 1).toUpperCase(Locale.US) + w.substring(1).toLowerCase(Locale.US) : w) + " "; // single-letter words are capitalized too
}
return result.trim();
}
Use the split method to split your string into words, then use the built-in string functions to capitalize each word, then append them together.
A sketch in Java:
String sentence = "the sentence you want to apply caps to";
String[] words = sentence.split(" ");
StringBuilder result = new StringBuilder();
for (String word : words) {
    if (word.isEmpty()) continue; // repeated spaces produce empty tokens
    // Upper-case the first character and lower-case the rest of the word
    result.append(Character.toUpperCase(word.charAt(0)))
          .append(word.substring(1).toLowerCase())
          .append(' ');
}
sentence = result.toString().trim();
In the end, sentence looks like
"The Sentence You Want To Apply Caps To"
This might be useful if you need to capitalize titles. It capitalizes each substring delimited by " ", except for specified words such as "a" or "the". I haven't run it yet because it's late, but it should be fine. It uses Apache Commons StringUtils.join() at one point; you can substitute a simple loop if you wish.
private static String capitalize(String string) {
if (string == null) return null;
String[] wordArray = string.split(" "); // Split string to analyze word by word.
int i = 0;
lowercase:
for (String word : wordArray) {
if (word != wordArray[0]) { // First word always in capital
String [] lowercaseWords = {"a", "an", "as", "and", "although", "at", "because", "but", "by", "for", "in", "nor", "of", "on", "or", "so", "the", "to", "up", "yet"};
for (String word2 : lowercaseWords) {
if (word.equals(word2)) {
wordArray[i] = word;
i++;
continue lowercase;
}
}
}
char[] characterArray = word.toCharArray();
characterArray[0] = Character.toTitleCase(characterArray[0]);
wordArray[i] = new String(characterArray);
i++;
}
return StringUtils.join(wordArray, " "); // Re-join string
}
public static String toTitleCase(String word){
return Character.toUpperCase(word.charAt(0)) + word.substring(1);
}
public static void main(String[] args){
String phrase = "this is to be title cased";
String[] splitPhrase = phrase.split(" ");
String result = "";
for(String word: splitPhrase){
result += toTitleCase(word) + " ";
}
System.out.println(result.trim());
}
1. Java 8 Streams
public static String capitalizeAll(String str) {
if (str == null || str.isEmpty()) {
return str;
}
return Arrays.stream(str.split("\\s+"))
.map(t -> t.substring(0, 1).toUpperCase() + t.substring(1))
.collect(Collectors.joining(" "));
}
Examples:
System.out.println(capitalizeAll("jon skeet")); // Jon Skeet
System.out.println(capitalizeAll("miles o'Brien")); // Miles O'Brien
System.out.println(capitalizeAll("old mcdonald")); // Old Mcdonald
System.out.println(capitalizeAll(null)); // null
For foo bAR to Foo Bar, replace the map() method with the following:
.map(t -> t.substring(0, 1).toUpperCase() + t.substring(1).toLowerCase())
2. Matcher.replaceAll() (Java 9+)
public static String capitalizeAll(String str) {
if (str == null || str.isEmpty()) {
return str;
}
return Pattern.compile("\\b(.)(.*?)\\b")
.matcher(str)
.replaceAll(match -> match.group(1).toUpperCase() + match.group(2));
}
Examples:
System.out.println(capitalizeAll("12 ways to learn java")); // 12 Ways To Learn Java
System.out.println(capitalizeAll("i am atta")); // I Am Atta
System.out.println(capitalizeAll(null)); // null
3. Apache Commons Text
System.out.println(WordUtils.capitalize("love is everywhere")); // Love Is Everywhere
System.out.println(WordUtils.capitalize("sky, sky, blue sky!")); // Sky, Sky, Blue Sky!
System.out.println(WordUtils.capitalize(null)); // null
For titlecase:
System.out.println(WordUtils.capitalizeFully("fOO bAR")); // Foo Bar
System.out.println(WordUtils.capitalizeFully("sKy is BLUE!")); // Sky Is Blue!
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
System.out.println("Enter the sentence : ");
try
{
String str = br.readLine();
char[] str1 = new char[str.length()];
for(int i=0; i<str.length(); i++)
{
str1[i] = Character.toLowerCase(str.charAt(i));
}
str1[0] = Character.toUpperCase(str1[0]);
for(int i=0;i<str.length();i++)
{
if(str1[i] == ' ' && i + 1 < str1.length) // guard against a trailing space
{
str1[i+1] = Character.toUpperCase(str1[i+1]);
}
System.out.print(str1[i]);
}
}
catch(Exception e)
{
System.err.println("Error: " + e.getMessage());
}
I decided to add one more solution for capitalizing words in a string:
words are defined here as adjacent letter-or-digit characters;
surrogate pairs are supported as well;
the code has been optimized for performance; and
it is still compact.
Function:
public static String capitalize(String string) {
final int sl = string.length();
final StringBuilder sb = new StringBuilder(sl);
boolean lod = false;
for(int s = 0; s < sl; s++) {
final int cp = string.codePointAt(s);
sb.appendCodePoint(lod ? Character.toLowerCase(cp) : Character.toUpperCase(cp));
lod = Character.isLetterOrDigit(cp);
if(!Character.isBmpCodePoint(cp)) s++;
}
return sb.toString();
}
Example call:
System.out.println(capitalize("An à la carte StRiNg. Surrogate pairs: 𐐪𐐪."));
Result:
An À La Carte String. Surrogate Pairs: 𐐂𐐪.
Use:
String text = "jon skeet, miles o'brien, old mcdonald";
Pattern pattern = Pattern.compile("\\b([a-z])([\\w]*)");
Matcher matcher = pattern.matcher(text);
StringBuffer buffer = new StringBuffer();
while (matcher.find()) {
matcher.appendReplacement(buffer, matcher.group(1).toUpperCase() + matcher.group(2));
}
String capitalized = matcher.appendTail(buffer).toString();
System.out.println(capitalized);
There are many ways to capitalize the first letter of each word. I have an idea. It's very simple:
public String capitalize(String str){
/* The first thing we do is remove whitespace from string */
String c = str.replaceAll("\\s+", " ");
String s = c.trim();
String l = "";
for(int i = 0; i < s.length(); i++){
if(i == 0){ /* Uppercase the first letter in strings */
l += s.toUpperCase().charAt(i);
i++; /* Advance i because the character at index 0
has already been added to string l */
}
l += s.charAt(i);
if(s.charAt(i) == 32){ /* If we meet whitespace (32 in ASCII Code is whitespace) */
l += s.toUpperCase().charAt(i+1); /* Uppercase the letter after whitespace */
i++; /* Advance i because the letter after the whitespace
has already been added to string l */
}
}
return l;
}
package com.test;
/**
* @author Prasanth Pillai
* @date 01-Feb-2012
* @description : Below is the test class details
*
* inputs a String from a user. Expect the String to contain spaces and alphanumeric characters only.
* capitalizes all first letters of the words in the given String.
* preserves all other characters (including spaces) in the String.
* displays the result to the user.
*
* Approach : I have followed a simple approach. However there are many string utilities available
* for the same purpose. Example : WordUtils.capitalize(str) (from apache commons-lang)
*
*/
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
public class Test {
public static void main(String[] args) throws IOException{
System.out.println("Input String :\n");
InputStreamReader converter = new InputStreamReader(System.in);
BufferedReader in = new BufferedReader(converter);
String inputString = in.readLine();
int length = inputString.length();
StringBuffer newStr = new StringBuffer(0);
int i = 0;
int k = 0;
/* This is a simple approach
* step 1: scan through the input string
* step 2: capitalize the first letter of each word in string
* The integer k, is used as a value to determine whether the
* letter is the first letter in each word in the string.
*/
while( i < length){
if (Character.isLetter(inputString.charAt(i))){
if ( k == 0){
newStr = newStr.append(Character.toUpperCase(inputString.charAt(i)));
k = 2;
} // otherwise it is not the first letter of a word: append it unchanged
else {
newStr = newStr.append(inputString.charAt(i));
}
} // non-letter characters are appended as-is and mark the start of a new word
else {
newStr = newStr.append(inputString.charAt(i));
k=0;
}
i+=1;
}
System.out.println("new String ->"+newStr);
}
}
Here is a simple function
public static String capEachWord(String source){
String result = "";
String[] splitString = source.split(" ");
for(String target : splitString){
result += Character.toUpperCase(target.charAt(0))
+ target.substring(1) + " ";
}
return result.trim();
}
This is just another way of doing it:
private String capitalize(String line)
{
StringTokenizer token =new StringTokenizer(line);
String CapLine="";
while(token.hasMoreTokens())
{
String tok = token.nextToken().toString();
CapLine += Character.toUpperCase(tok.charAt(0))+ tok.substring(1)+" ";
}
return CapLine.substring(0,CapLine.length()-1);
}
Reusable method for initCap:
public class YarlagaddaSireeshTest{
public static void main(String[] args) {
String FinalStringIs = "";
String testNames = "sireesh yarlagadda test";
String[] name = testNames.split("\\s");
for(String nameIs :name){
FinalStringIs += getIntiCapString(nameIs) + ",";
}
System.out.println("Final Result "+ FinalStringIs);
}
public static String getIntiCapString(String param) {
if(param != null && param.length()>0){
char[] charArray = param.toCharArray();
charArray[0] = Character.toUpperCase(charArray[0]);
return new String(charArray);
}
else {
return "";
}
}
}
Here is my solution.
I ran across this problem tonight and decided to search for it. I found an answer by Neelam Singh that was almost there, but it broke on empty strings and caused a crash, so I decided to fix the issue.
The method you are looking for is named capString(String s) below.
It turns "It's only 5am here" into "It's Only 5am Here".
The code is pretty well commented, so enjoy.
package com.lincolnwdaniel.interactivestory.model;
public class StringS {
/**
* @param s is a string of any length, ideally only one word
* @return a capitalized string.
* only the first letter of the string is made to uppercase
*/
public static String capSingleWord(String s) {
if(s.isEmpty()) {
return s; // nothing to capitalize; charAt(0) would throw on an empty string
}
else {
return Character.toUpperCase(s.charAt(0)) + s.substring(1);
}
}
/**
*
* @param s is a string of any length
* @return a title cased string.
* All first letter of each word is made to uppercase
*/
public static String capString(String s) {
// Check if the string is empty, if it is, return it immediately
if(s.isEmpty()){
return s;
}
// Split string on space and create array of words
String[] arr = s.split(" ");
// Create a string buffer to hold the new capitalized string
StringBuffer sb = new StringBuffer();
// Check if the array is empty (this happens when s contains only spaces, e.g. " ").
// If it is, return the original string immediately
if( arr.length < 1 ){
return s;
}
for (int i = 0; i < arr.length; i++) {
sb.append(Character.toUpperCase(arr[i].charAt(0)))
.append(arr[i].substring(1)).append(" ");
}
return sb.toString().trim();
}
}
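Here is a way to capitalize the first character of each word:
public static void main(String[] args) {
String input ="my name is ranjan";
String[] inputArr = input.split(" ");
StringBuilder result = new StringBuilder();
for(String word : inputArr) {
// Upper-case the first character and keep the rest of the word unchanged
result.append(word.substring(0, 1).toUpperCase()).append(word.substring(1)).append(' ');
}
System.out.println(result.toString().trim());
}
//Output : My Name Is Ranjan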
For those of you using Velocity in your MVC, you can use the capitalizeFirstLetter() method from the StringUtils class.
String s="hi dude i want apple";
s = s.replaceAll("\\s+"," ");
String[] split = s.split(" ");
s="";
for (int i = 0; i < split.length; i++) {
split[i]=Character.toUpperCase(split[i].charAt(0))+split[i].substring(1);
s+=split[i]+" ";
System.out.println(split[i]);
}
System.out.println(s);
package corejava.string.intern;
import java.io.DataInputStream;
import java.util.ArrayList;
/*
* wap to accept only 3 sentences and convert first character of each word into upper case
*/
public class Accept3Lines_FirstCharUppercase {
static String line;
static String words[];
static ArrayList<String> list=new ArrayList<String>();
/**
* @param args
*/
public static void main(String[] args) throws java.lang.Exception{
DataInputStream read=new DataInputStream(System.in);
System.out.println("Enter only three sentences");
int i=0;
while((line=read.readLine())!=null){
method(line); //main logic of the code
if((i++)==2){
break;
}
}
display();
System.out.println("\n End of the program");
}
/*
* this will display all the elements in an array
*/
public static void display(){
for(String display:list){
System.out.println(display);
}
}
/*
* this divide the line of string into words
* and first char of the each word is converted to upper case
* and to an array list
*/
public static void method(String lineParam){
words=lineParam.split("\\s");
for(String s:words){
String result=s.substring(0,1).toUpperCase()+s.substring(1);
list.add(result);
}
}
}
If you prefer Guava...
String myString = ...;
String capWords = Joiner.on(' ').join(Iterables.transform(Splitter.on(' ').omitEmptyStrings().split(myString), new Function<String, String>() {
public String apply(String input) {
return Character.toUpperCase(input.charAt(0)) + input.substring(1);
}
}));
String toUpperCaseFirstLetterOnly(String str) {
String[] words = str.split(" ");
StringBuilder ret = new StringBuilder();
for(int i = 0; i < words.length; i++) {
ret.append(Character.toUpperCase(words[i].charAt(0)));
ret.append(words[i].substring(1));
if(i < words.length - 1) {
ret.append(' ');
}
}
return ret.toString();
}
I want to run a query like "banana apple cherry" against a "fruit" field.
All the fruits in a dessert need to be in the query, but not all the fruits in the query need to be in the dessert.
Here's an example:
NAME      FRUIT                       RESULT
Dessert1  banana apple                OK (banana and apple are both in the query)
Dessert2  cherry apple banana         OK (the order doesn't matter)
Dessert3  cherry apple banana melon   NO (melon is missing from the query)
public class ArrayStringFieldBridge implements TwoWayFieldBridge{
@Override
public Object get(String name, Document document) {
IndexableField[] fields = document.getFields(name);
String[] values = new String[fields.length];
for (int i=0; i<fields.length; i++) {
values[i] = fields[i].stringValue();
}
return values;
}
@Override
public String objectToString(Object value) {
return StringUtils.join((String[])value, " ");
}
@Override
public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
String newString = StringUtils.join((String[])value, " ");
Field field = new Field(name, newString, luceneOptions.getStore(), luceneOptions.getIndex(), luceneOptions.getTermVector());
field.setBoost(luceneOptions.getBoost());
document.add(field);
}
}
@Indexed
@AnalyzerDef(name = "customanalyzer",
tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class))
public class Dessert {
@Analyzer(definition="customanalyzer")
@Field(name = "equipment", index=Index.YES, analyze = Analyze.YES, store=Store.YES)
@FieldBridge(impl=ArrayStringFieldBridge.class)
public String[] fruits = new String[]{};
}
Even if you are not using hibernate-search, any suggestions about the theory behind handling this would be great. Thank you.
Step 1: Fire the Lucene query "fruit:banana OR fruit:apple OR fruit:cherry".
Step 2: Gather all matched dessert documents.
Step 3: Post-process each matched dessert document against the query:
convert the matched document to an array of terms, matchDocArr: {banana, apple}
convert the query terms to an array, queryArr: {banana, apple, cherry}
iterate over matchDocArr and make sure each of its terms is found in queryArr; if not (the melon case), discard this matched document
Here is an example function that needs to be called for every matched doc:
public static boolean isDocInterested(String query, String matchDoc)
{
List<String> matchDocArr = new ArrayList<String>();
matchDocArr = Arrays.asList(matchDoc.split(" "));
List<String> queryArr = new ArrayList<String>();
queryArr = Arrays.asList(query.split(" "));
int matchCounter = 0;
for(int i=0; i<matchDocArr.size(); i++)
{
if (queryArr.contains(matchDocArr.get(i)))
matchCounter++;
}
if (matchCounter == matchDocArr.size())
return true;
return false;
}
If the function returns true, we are interested in the doc/dessert; if it returns false, ignore this doc/dessert.
Of course, this function can be written in many different ways, but I think you get the point.
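As one of those other ways, the check can be expressed as a set-containment test. This is only a sketch under the same assumptions as above: both the query and the document's fruit field are plain whitespace-separated strings, and the class and method names are invented for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SubsetFilter {
    // Returns true when every term of the document also occurs in the query,
    // i.e. the document's terms form a subset of the query's terms.
    static boolean isSubsetOfQuery(String query, String docTerms) {
        Set<String> queryTerms = new HashSet<>(Arrays.asList(query.trim().split("\\s+")));
        Set<String> terms = new HashSet<>(Arrays.asList(docTerms.trim().split("\\s+")));
        return queryTerms.containsAll(terms);
    }

    public static void main(String[] args) {
        String query = "banana apple cherry";
        System.out.println(isSubsetOfQuery(query, "banana apple"));              // Dessert1
        System.out.println(isSubsetOfQuery(query, "cherry apple banana"));       // Dessert2
        System.out.println(isSubsetOfQuery(query, "cherry apple banana melon")); // Dessert3
    }
}
```

Using hash sets avoids the repeated linear scans of List.contains, which may matter if the post-processing step runs over many matched documents.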
I use Json.Net.
When I serialize a Department2 object and WriteJson() is invoked I want it to be recursively invoked with each of the Telephone2 objects like I do in ReadJson().
How do I do that?
using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
public interface ISimpleDatabag
{
string Databag { get; set; }
}
[JsonConverter(typeof(JsonDataBagCreationConverter<Department2>))]
public class Department2
{
public Telephone2[] Phones { get; set; }
}
[JsonConverter(typeof(JsonDataBagCreationConverter<Telephone2>))]
public class Telephone2
{
public string Name { get; set; }
public string AreaCode { get; set; }
public string Number { get; set; }
}
public class JsonDataBagCreationConverter<T> : JsonConverter where T : new()
{
// Json.Net version 4.5.7.15008
public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
{
// When I serialize Department and this function is invoked
// I want it to recursively invoke WriteJson with each of the Telephone objects
// Like I do in ReadJson
// How do I do that?
T t = (T)value;
serializer.Serialize(writer, t.GetType());
}
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
var jsonObject = JObject.Load(reader);
var target = Create(objectType, jsonObject);
serializer.Populate(jsonObject.CreateReader(), target); // Will call this function recursively for any objects that have JsonDataBagCreationConverter as attribute
return target;
}
protected T Create(Type objectType, JObject jsonObject)
{
return new T();
}
public override bool CanConvert(Type objectType)
{
return typeof(T).IsAssignableFrom(objectType);
}
}
private void Form1_Load(object sender, EventArgs e)
{
string jsonInput = "{\"Name\": \"Seek4\" , \"CustomDepartmentData\": \"This is custom department data\", \"Phones\":[ {\"Name\": \"A\", \"AreaCode\":444, \"Number\":11111111} ,{\"Name\": \"B\", \"AreaCode\":555, \"Number\":987987987}, {\"Name\": \"C\", \"AreaCode\":222, \"Number\":123123123, \"CustomPhoneData\": \"This is custom phone data\"} ] }";
Department2 objDepartment2 = JsonConvert.DeserializeObject<Department2>(jsonInput); // Yes, it works well
Array.Reverse(objDepartment2.Phones);
string jsonNoDatabag = JsonConvert.SerializeObject(objDepartment2);
}
I ended up controlling the entire process myself, using this huge (not refactored) function.
I basically investigate each of the properties of the object to serialize and then serialize it property by property.
That way I can apply custom handling to each property:
/// <summary>
/// Serializes an object by merging its current values into its databag and returns the databag
/// </summary>
/// <param name="objectToSerialize"></param>
/// <returns>the current values merged into the original databag</returns>
/// <remarks>Jan Nielsen, 01-10-2012</remarks>
internal static string SerializeObjectToDatabag(object objectToSerialize)
{
// You have to do it property by property instead of just serializing the entire object and merge it into the original
// because the object might contain lists of objects with custom data and these list might have been sorted differently from when they were loaded
// So you cannot merge them properly unless you do it on a per listitem basis.
// Which is what I do here.
try
{
if (objectToSerialize == null) // If you ie serialize an empty object in an array
{
return null;
}
string updatedDatabag = "";
bool isIDataBag = objectToSerialize is IDataBag;
if (isIDataBag)
{
updatedDatabag = ((IDataBag)objectToSerialize).Data == null ? "" : ((IDataBag)objectToSerialize).Data.ToString();
// updatedDatabag = ((IDataBag)objectToSerialize).Data.ToString(); // Save original data in a local variable. This is the one we will merge new values into
}
string result = "";
// Now iterate through the objects properties
// Possible properties:
// Simple types: string, int, bool etc: their current value should be overwritten in the databag
// types that implement IDatabag: they should be sent to this function recursively so their possible customdata is not overwritten
// but instead their simple values are merged into their own databag. Then the result of this single property merge is overwritten in the outer objects databag
// Types that are not simple and don't implement IDatabag but have properties that implement IDatabag
// types that are not simple and don't implement IDatabag and don't have any properties in any depth that implement IDatabag: They are overwritten in the databag
// Types that are arrays:
// If the types in the array are simple types (string, bool etc) the entire array property is overwritten in the databag
// If the types in the array implement IDatabag each object is sent recursively to this function and their databag is updated via merge
// Then the entire array is overwritten in the outer objects databag
// Types that are generic list are treated like arrays
var properties = objectToSerialize.GetType().GetProperties();
// In order to be able to deserialize abstract classes and interfaces, we need to serialize the class name with the class
// the deserializer recognizes the word $type followed by a type, when it is invoked with serializerSettings of
// serializerSettings.TypeNameHandling = TypeNameHandling.Objects;
string name = objectToSerialize.GetType().AssemblyQualifiedName;
string shortName = RemoveAssemblyDetails(name);
bool addedType = false;
foreach (var propertyInfo in properties)
{
if (propertyInfo.Name.ToLower() != "data") // Just skip Databag. Databag is not a "real" property but the contents of all the properties when the object was loaded + possible custom data
{
if (!addedType)
{
string jsonSingleProperty = "{ " + ToCustomJson("$type") + " : " + ToCustomJson(shortName) + " }";
// Merge the current value (jsonSingleProperty) into the databag (that might already have been updated with the values of other properties)
// and update the current result with the new values. Ie "Name" : "Seek4" is updated to "Name" : "Seek4Cars" in the databag
// and the function will now use the updated databag to merge the other properties into
updatedDatabag = MergeDefault(jsonSingleProperty, updatedDatabag, true);
addedType = true;
}
// propertyInfo.Name.ToLower().Contains("struct")
var value = propertyInfo.GetValue(objectToSerialize, null); // This gets the value of the specified property in the current object
isIDataBag = value is IDataBag; // Update for the current object. Note that ie an array of IDatabag will return false here, because array is not IsimpleDatabag
// Basically we should just check if the property implements IDatabag
// But the simpletype check is faster because I don't have to check for the interfaces on ie a string, int etc.
// This branch takes care of 3 cases:
// 1) it is a simple type, ie int
// 2) value is null
// 3) it is an array with a value of null
// If an array with values enters this branch of code the values of the array will be appended, overwritten
// Therefore arrays are treated below in a special case. Unless they are null
// GeneralFunctions.IsExtendedSimpleType_AllTypes(propertyInfo.PropertyType) returns true on ie string[], but only arrays with a value of null should be handled here
// This first check originally just checked for simple types
// Then it became extended simple types ie non-simple types that only contains simple types ie List<int,int>
// But not arrays that must be handled separately
// Then it also handled null values
// And then a special case was made for arrays that are null
if ((GeneralFunctions.IsExtendedSimpleType_AllTypes(propertyInfo.PropertyType) || value == null) && (!propertyInfo.PropertyType.IsArray || (propertyInfo.PropertyType.IsArray && value == null)))
{
// You have to merge even though it is the default value.
// If you have e.g. a bool that has an initial value of true and you deliberately set it to false
// you want the default value of false to be merged into the json.
string jsonSingleProperty = "{" + ToCustomJson(propertyInfo.Name) + " : " + ToCustomJson(value) + "}"; // ie {"Name" : "Seek4Cars"}
// Merge the current value (jsonSingleProperty) into the databag (that might already have been updated with the values of other properties)
// and update the current result with the new values. Ie "Name" : "Seek4" is updated to "Name" : "Seek4Cars" in the databag
// and the function will now use the updated databag to merge the other properties into
updatedDatabag = MergeDefault(jsonSingleProperty, updatedDatabag, true);
continue;
}
if (isIDataBag) // ie PhoneSingle. A single property of type IDataBag
{
// Invoke recursively
// First check if this is an object with all null values
bool allPropertiesAreNull = true; // Maybe this should in the future be expanded with a check on if the property has its default value ie an int property with a value of 0
foreach (var propertyInfoLocal in value.GetType().GetProperties())
{
var valueLocal = propertyInfoLocal.GetValue(value, null);
if (valueLocal != null)
{
allPropertiesAreNull = false;
break;
}
}
var testjson = "";
if (allPropertiesAreNull)
{
result = "{" + ToCustomJson(propertyInfo.Name) + " : " + " { } }";
}
else
{
testjson = ToCustomJson(value);
result = "{" + ToCustomJson(propertyInfo.Name) + " : " + SerializeObjectToDatabag(value) + "}";
}
updatedDatabag = MergeDefault(result, updatedDatabag, true);
continue;
}
bool containsIDataBag = CheckForDatabagInterfaces.ImplementsInterface(propertyInfo.PropertyType, "idatabag"); // Check if anything inside the property implements IDatabag ie an array of IDatabag
if (containsIDataBag)
{
// Check if it is some kind of generic list (List<T>, Dictionary<T,T>, etc.) and if it is a type of ignoreTypes, e.g. List<entity>
if (value.GetType().IsGenericType && value.GetType().GetGenericArguments().Length > 0)
{
string listValuesAsJson = "";
if (value is IEnumerable)
{
listValuesAsJson += "{ " + ToCustomJson(propertyInfo.Name) + " : [";
bool containsItems = false;
foreach (var element in (IEnumerable)value)
{
var current = SerializeObjectToDatabag(element);
if (current != null) // If you serialize an empty array element it is null
{
containsItems = true; // Only set when something was appended, so the trailing ", " is stripped only if it exists
listValuesAsJson += current + ", "; // Add , between each element
}
}
if (containsItems)
{
listValuesAsJson = listValuesAsJson.Substring(0, listValuesAsJson.Length - 2) + "] }"; // remove the last ", ", close the array with ] and add a } because this property stands alone in its own wrapper object
}
else // No items in value
{
listValuesAsJson += "] }"; // close the array with ] and add a } because this property stands alone in its own wrapper object
}
}
else // A single, generic KeyValuePair property
{
listValuesAsJson += "{ " + ToCustomJson(propertyInfo.Name) + " : ";
listValuesAsJson += SerializeObjectToDatabag(value);
listValuesAsJson += " }";
}
updatedDatabag = MergeDefault(listValuesAsJson, updatedDatabag, false);
}
else if (value.GetType().IsArray)
{
string arrayValuesAsJson = "{ " + ToCustomJson(propertyInfo.Name) + " : [";
bool containsItems = false;
foreach (var element in (Array)value)
{
// Treat them the same way you treat any other object
var current = SerializeObjectToDatabag(element);
if (current != null) // If you serialize an empty array element it is null
{
containsItems = true;
arrayValuesAsJson += current + ", ";
}
}
if (containsItems)
{
arrayValuesAsJson = arrayValuesAsJson.Substring(0, arrayValuesAsJson.Length - 2) + "] }"; // remove the last ", ", close the array with ] and add a } because this property stands alone in its own wrapper object
}
else // No items in value
{
arrayValuesAsJson += "] }"; // close the array with ] and add a } because this property stands alone in its own wrapper object
}
updatedDatabag = MergeDefault(arrayValuesAsJson, updatedDatabag, false);
}
else if ( value.GetType().BaseType != null && value.GetType().BaseType.FullName.ToLower().Contains("system.collections.objectmodel"))
{
// This branch was made specifically to take care of the Media collection of a Seek4.Entities.V2.Media.MediaCollection
var genericList = (IList)value;
int counter = genericList.Count;
string listAsJson = "{ " + ToCustomJson(propertyInfo.Name) + " : [";
if (counter == 0)
{
listAsJson += "] }"; // Ie { "Media": [] }
}
else
{
foreach (var obj in genericList)
{
var current = SerializeObjectToDatabag(obj);
listAsJson += current + ", ";
}
listAsJson = listAsJson.Substring(0, listAsJson.Length -2) + " ] }" ;
}
updatedDatabag = MergeDefault(listAsJson, updatedDatabag, true); // how does json.net do this by default?
}
else // a single Non-IDatabag property that contains Idatabag properties
{
string tempResult = "{ " + ToCustomJson(propertyInfo.Name) + " : ";
tempResult += SerializeObjectToDatabag(value) + " }";
updatedDatabag = MergeDefault(tempResult, updatedDatabag, true);
}
}
else
{
if (value.GetType().IsArray) // This is an array of simple types so just overwrite
{
string arrayAsJson = "{ " + ToCustomJson(propertyInfo.Name) + " : ";
arrayAsJson += ToCustomJson(value) + "}";
updatedDatabag = MergeDefault(arrayAsJson, updatedDatabag, false);
}
else // ie an object that is not simpledatabag and doesn't contain simple databag
{
string jsonSingleProperty = "{" + ToCustomJson(propertyInfo.Name) + " : " + ToCustomJson(value) + "}";
updatedDatabag = MergeDefault(jsonSingleProperty, updatedDatabag, true);
}
}
}
}
return updatedDatabag;
}
catch (Exception ex)
{
string message = ex.Message;
string stack = ex.StackTrace;
throw;
}
}
internal static string ToCustomJson(object objectToConvertToJson)
{
try
{
// Distinguished from Mongodb.Bson.ToJson() extensionmethod by Custom name
JsonSerializerSettings serializerSettings = new JsonSerializerSettings();
serializerSettings.TypeNameHandling = TypeNameHandling.Objects; // Adds a $type on all objects which we need when it is abstract classes and interfaces
IgnoreDataMemberContractResolver contractResolver = new IgnoreDataMemberContractResolver(null, true, true);
serializerSettings.ContractResolver = contractResolver;
serializerSettings.DefaultValueHandling = DefaultValueHandling.Ignore;
IsoDateTimeConverter converter = new IsoDateTimeConverter();
serializerSettings.Converters.Add(converter);
string result = JsonConvert.SerializeObject(objectToConvertToJson, Formatting.None, serializerSettings);
return result;
}
catch (Exception ex)
{
throw new Exception("Error in ToCustomJson: " + ex.Message, ex);
}
}