Removing a comma from a string

I'm learning Java and trying to figure out how to solve this problem. I have a string, for example: The <object> <verb>, on the <object>.

Every word enclosed in <> (the <> are only for clarification and are not in the original string) is a key into a hash map that returns a random value.
I break the string into an array of strings, then loop through the array and look each element up in the hash map; if the key exists, I return a value. This is where I run into a problem: in the example above, <verb> is a key, but <verb>, (with the comma) is not.
How can I separate the key from the comma but still return the value with the comma attached?

The end result should look like this:

The dog sat, on the cat.

I don't need the full code for this, just ideas on how to solve this particular problem.


2 Answers

You can use regular expressions to extract all words (consisting only of letters), and then search the map as you wish. I know you asked only about the comma, but I assume this is the general use case.

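import java.util.*;
import java.util.regex.*;

List<String> allMatches = new ArrayList<String>();
Matcher m = Pattern.compile("[a-zA-Z]+") // regex for letter-only strings
     .matcher(yourString); // e.g. "The dog sat, on the cat."
while (m.find()) {
    allMatches.add(m.group()); // collect each matched word
}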

The result will be the following list:

{"The", "dog", "sat", "on", "the", "cat"}

Then you can iterate over allMatches to look up the appropriate values in your data structure.

P.S. When you use a regex, try to compile the pattern only once and reuse it if it is needed again, since compiling is not a cheap operation.
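As a rough sketch of how this could address the original question, the matcher can also write the looked-up values straight back into the sentence. Because the pattern only ever matches letters, the comma and the period are never part of a key and are copied through untouched. The class name TemplateFiller, the fill method, and the sample map are hypothetical. Matcher.appendReplacement is swapped in here instead of building a list first, and the random choice among several values is left out for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateFiller {
    // Compiled once and reused for every call.
    private static final Pattern WORD = Pattern.compile("[a-zA-Z]+");

    // Replaces every letter-only word that is a key in the map with its value.
    // Words that are not keys, and all punctuation, are left as they are.
    static String fill(String template, Map<String, String> values) {
        Matcher m = WORD.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String word = m.group();
            String replacement = values.getOrDefault(word, word);
            // quoteReplacement guards against '$' or '\' inside the value
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out); // copy whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical data: the asker's map would return random values instead.
        Map<String, String> values = new HashMap<>();
        values.put("object", "dog");
        values.put("verb", "sat");
        System.out.println(fill("The object verb, on the object.", values));
        // prints The dog sat, on the dog.
    }
}
```

The StringBuffer overload of appendReplacement is used because it is available in every Java version; from Java 9 onward a StringBuilder overload exists as well.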

2 weeks ago

The split method of the String class in Java takes a regular expression, and regular expressions have an or operator (alternation), much like the || used in an if condition.

So instead of splitting on a single character like a space, you can split on several different things: a comma followed by a space, just a space, or a period.

String sentence = "The dog sat, on the cat.";
String [] words = sentence.split(", | |\\.");

The pipe character | is the or operator in regex. Note that I added \\. which removes the '.' after 'cat'. In regex, the dot means "any character", so to match an actual dot (period, end of sentence) you need to escape it with a \. And in a Java string literal (anything between "") \\ means put an actual \ in the string, so \\. becomes \. when split receives it as a parameter.
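A small sketch of the difference the escaping makes, under the assumption that the sentence looks like the one in the question. The class name SplitAlternation and the helper splitWords are made up for illustration.

```java
import java.util.Arrays;

class SplitAlternation {
    // ", " or " " or a literal "." each act as a separator.
    static String[] splitWords(String sentence) {
        return sentence.split(", | |\\.");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitWords("The dog sat, on the cat.")));
        // prints [The, dog, sat, on, the, cat]

        // Unescaped, the dot matches every character, so every piece is empty
        // and split drops them all as trailing empty strings.
        System.out.println("a.b".split(".").length);   // prints 0
        System.out.println("a.b".split("\\.").length); // prints 2
    }
}
```

The order of the alternatives matters here: ", " is listed before " " so that a comma and the space after it are consumed together as one separator.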

There is an even more general way:

String [] words = sentence.split("\\W+");

\W means "any non-word character" (anything other than a letter, a digit or an underscore), and + means "appearing one or more times in a row".

So this will split the string on any run of characters that cannot be part of a word.
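To illustrate, here is a minimal sketch of the \\W+ split together with one edge case worth knowing about: if the text starts with a separator, the array begins with an empty string. The class name NonWordSplit and the second sample sentence are invented for the example.

```java
import java.util.Arrays;

class NonWordSplit {
    static String[] words(String text) {
        return text.split("\\W+"); // split on runs of non-word characters
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("The dog sat, on the cat.")));
        // prints [The, dog, sat, on, the, cat]

        // A separator at the very start produces an empty first element.
        System.out.println(Arrays.toString(words("\"Sit,\" said the dog.")));
        // prints [, Sit, said, the, dog]
    }
}
```

By default \W only treats ASCII letters, digits and the underscore as word characters, so accented letters would also act as separators unless the pattern is compiled with Pattern.UNICODE_CHARACTER_CLASS.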

Remember that in Java the String class is immutable: its contents cannot be changed once it is created.

So no matter which solution you use (replaceAll suggested by Spara, the word search using Matcher suggested by nemanja228, or the split in my answer), copies of the original string are always created and the original is not changed. To preserve it for future use, you just need to keep a reference to it (don't reassign the variable that holds it).
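A short sketch of what that means in practice, using replaceAll on a single token. The class name ImmutableStringDemo and the helper withoutCommas are hypothetical.

```java
class ImmutableStringDemo {
    // Returns a new String without commas; the argument itself is never modified.
    static String withoutCommas(String token) {
        return token.replaceAll(",", "");
    }

    public static void main(String[] args) {
        String token = "sat,";
        String key = withoutCommas(token);

        System.out.println(key);   // prints sat
        System.out.println(token); // prints sat,
    }
}
```

Since token still holds the comma, the cleaned copy can serve as the map key while the original is kept around for rebuilding the sentence.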

2 weeks ago