Introduction
Splitting strings is a fundamental operation in Java programming, commonly used in data processing, text manipulation, and parsing structured content like CSV files or logs. Whether you need to break down user input, process API responses, or extract values from a large text, Java provides multiple ways to efficiently split strings based on a delimiter.
Each method has its own strengths and weaknesses, which affect performance, flexibility, and ease of use. Some, like String.split(), are straightforward and built in, while others, such as StringTokenizer or Scanner, offer more control over how tokens are consumed. Modern approaches such as Java Streams provide a functional alternative that makes it easy to filter the resulting tokens.
Code Example: Java String Splitting
package string;

import java.util.*;
import java.util.regex.Pattern;
import java.util.stream.Stream;

public class StringSplit {

    // Split using String split method with regex.
    // Pattern.quote makes the delimiter match literally.
    public static String[] splitM1(String input, String delimiter) {
        if (input == null || delimiter == null) return new String[0];
        return input.split(Pattern.quote(delimiter));
    }

    // Split using StringTokenizer.
    // Note: every character of the delimiter string acts as a separate delimiter.
    public static String[] splitM2(String input, String delimiter) {
        if (input == null || delimiter == null) return new String[0];
        StringTokenizer tokenizer = new StringTokenizer(input, delimiter);
        List<String> list = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            list.add(tokenizer.nextToken());
        }
        return list.toArray(new String[0]); // Convert an ArrayList of String to a String Array
    }

    // Split using String methods: indexOf and substring. (Manual approach)
    // Empty tokens are skipped, as in splitM2, splitM5, and splitM6.
    public static String[] splitM3(String input, String delimiter) {
        if (input == null || delimiter == null) return new String[0];
        // An empty delimiter would never advance the search, so treat the input as one token.
        if (delimiter.isEmpty()) {
            return input.isEmpty() ? new String[0] : new String[] { input };
        }
        List<String> list = new ArrayList<>();
        int start = 0; // Start of the current token
        int index;     // Position of the next delimiter
        while ((index = input.indexOf(delimiter, start)) >= 0) {
            if (index > start) {
                list.add(input.substring(start, index));
            }
            start = index + delimiter.length(); // Continue after the delimiter
        }
        if (start < input.length()) {
            list.add(input.substring(start)); // Text after the last delimiter
        }
        return list.toArray(new String[0]); // Convert an ArrayList of String to a String Array
    }

    // Split using Pattern split method with regex.
    public static String[] splitM4(String input, String delimiter) {
        if (input == null || delimiter == null) return new String[0];
        return Pattern.compile(Pattern.quote(delimiter)).split(input);
    }

    // Split using Scanner and Pattern quote.
    public static String[] splitM5(String input, String delimiter) {
        if (input == null || delimiter == null) return new String[0];
        List<String> list = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(Pattern.quote(delimiter));
            while (scanner.hasNext()) {
                String element = scanner.next();
                if (!element.isEmpty()) {
                    list.add(element);
                }
            }
        }
        return list.toArray(new String[0]); // Convert an ArrayList of String to a String Array
    }

    // Split using String split and a Stream that filters out empty strings.
    public static String[] splitM6(String input, String delimiter) {
        if (input == null || delimiter == null) return new String[0];
        return Stream.of(input.split(Pattern.quote(delimiter)))
                .filter(s -> !s.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String input = "apple,banana,orange";
        String delimiter = ",";
        System.out.println("Method splitM1 " + Arrays.toString(StringSplit.splitM1(input, delimiter)));
        System.out.println("Method splitM2 " + Arrays.toString(StringSplit.splitM2(input, delimiter)));
        System.out.println("Method splitM3 " + Arrays.toString(StringSplit.splitM3(input, delimiter)));
        System.out.println("Method splitM4 " + Arrays.toString(StringSplit.splitM4(input, delimiter)));
        System.out.println("Method splitM5 " + Arrays.toString(StringSplit.splitM5(input, delimiter)));
        System.out.println("Method splitM6 " + Arrays.toString(StringSplit.splitM6(input, delimiter)));
    }
}
Output
Method splitM1 [apple, banana, orange]
Method splitM2 [apple, banana, orange]
Method splitM3 [apple, banana, orange]
Method splitM4 [apple, banana, orange]
Method splitM5 [apple, banana, orange]
Method splitM6 [apple, banana, orange]
Overview of the Code
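splitM1: String.split()
- Pattern.quote(delimiter): Escapes special characters in the delimiter so that it is matched literally rather than as a regular expression.
- Returns an array of substrings split by the delimiter. Leading and interior empty strings are kept; trailing empty strings are dropped.
splitM2: StringTokenizer
- StringTokenizer iterates over tokens split by the delimiter.
- Tokens are stored in an ArrayList and then converted to an array.
- Note: StringTokenizer does not support regular expressions. It treats every character of the delimiter string as a separate delimiter and never returns empty tokens.
splitM3: indexOf() and substring()
- Uses indexOf() to find the next occurrence of the delimiter.
- Extracts substrings using substring().
- Avoids the regular-expression engine, which can make it faster than split() on large strings. Measure before relying on this.
splitM4: Pattern.split()
- Pattern.compile(Pattern.quote(delimiter)): Compiles the delimiter into a reusable Pattern.
- This is a good alternative to String.split() when the same delimiter is used many times, provided the compiled Pattern is stored and reused. As written, splitM4 recompiles the pattern on every call.
splitM5: Scanner
- Uses Scanner to iterate through tokens separated by the delimiter.
- More flexible than StringTokenizer since it allows regex-based delimiters.
splitM6: Java Streams
- Uses Stream.of() to create a stream from the split string.
- Filters out empty strings.
- Useful when combined with other stream operations.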
Using String.split() Method
The simplest way to split a string is by using the built-in split() method of the String class. This method takes a regular expression as a delimiter and returns an array of substrings.
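Because the argument is a regular expression, characters such as "." have a special meaning unless they are quoted, and trailing empty strings are dropped unless a negative limit is passed. The sketch below illustrates both points; the class and method names (SplitBasicsDemo, splitLiteral, splitKeepTrailing) are hypothetical and chosen only for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class SplitBasicsDemo {

    // Treats the delimiter as literal text; trailing empty strings are dropped.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    // Same, but a negative limit keeps trailing empty strings.
    static String[] splitKeepTrailing(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        // An unquoted "." matches every character, so nothing is left over.
        System.out.println(Arrays.toString("1.2.3".split(".")));                // []
        System.out.println(Arrays.toString(splitLiteral("1.2.3", ".")));        // [1, 2, 3]
        System.out.println(Arrays.toString(splitLiteral("a,b,,", ",")));        // [a, b]
        System.out.println(Arrays.toString(splitKeepTrailing("a,b,,", ",")));   // [a, b, , ]
    }
}
```

Keeping trailing empty strings matters for formats such as CSV, where an empty last column still counts as a field.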
Using StringTokenizer
The StringTokenizer class is an older method to split strings. It tokenizes a string based on a given delimiter.
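One detail that is easy to miss: StringTokenizer treats each character of the delimiter string as its own delimiter and silently skips empty tokens. The following sketch compares it with String.split() on a two-character delimiter; the class and method names (TokenizerDelimiterDemo, tokenize) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

final class TokenizerDelimiterDemo {

    // Collects all tokens produced by StringTokenizer for the given delimiter characters.
    static List<String> tokenize(String input, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "one::two:three";
        // "::" is read as the set of delimiter characters {':'}, so a single ':' also splits.
        System.out.println(tokenize(input, "::"));                               // [one, two, three]
        // split() with a quoted delimiter only splits on the full "::" sequence.
        System.out.println(Arrays.asList(input.split(Pattern.quote("::"))));     // [one, two:three]
        // Consecutive delimiters never produce empty tokens.
        System.out.println(tokenize("a,,b", ","));                               // [a, b]
    }
}
```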
Using indexOf() and substring() (Manual Approach)
For full control over the splitting process, you can manually extract substrings using indexOf() and substring().
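As one illustration of that control, the sketch below lets the caller decide whether empty tokens are kept. It is a minimal example rather than a reference implementation; the class name ManualSplitDemo, the method splitManual, and the keepEmpty flag are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class ManualSplitDemo {

    // Splits on a literal delimiter; keepEmpty controls whether empty tokens are returned.
    static List<String> splitManual(String input, String delimiter, boolean keepEmpty) {
        Objects.requireNonNull(input, "input");
        Objects.requireNonNull(delimiter, "delimiter");
        if (delimiter.isEmpty()) {
            // An empty delimiter would never advance the search position.
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> parts = new ArrayList<>();
        int start = 0; // Start of the current token
        int index;     // Position of the next delimiter
        while ((index = input.indexOf(delimiter, start)) >= 0) {
            if (keepEmpty || index > start) {
                parts.add(input.substring(start, index));
            }
            start = index + delimiter.length();
        }
        // Remaining text after the last delimiter (possibly empty).
        if (keepEmpty || start < input.length()) {
            parts.add(input.substring(start));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitManual("a,,b,", ",", true));   // [a, , b, ]
        System.out.println(splitManual("a,,b,", ",", false));  // [a, b]
    }
}
```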
Using Pattern.split()
The Pattern class provides a way to split strings using compiled regular expressions.
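The benefit over String.split() comes from compiling the pattern once and reusing it. A minimal sketch of that idea follows, assuming a fixed comma delimiter; the class name ReusablePatternDemo, the constant COMMA, and the method splitLine are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class ReusablePatternDemo {

    // Compiled once when the class is initialized, then shared by every call.
    private static final Pattern COMMA = Pattern.compile(Pattern.quote(","));

    static String[] splitLine(String line) {
        return COMMA.split(line);
    }

    public static void main(String[] args) {
        String[] lines = { "apple,banana", "cherry,date,elderberry" };
        for (String line : lines) {
            System.out.println(Arrays.toString(splitLine(line)));
        }
        // Prints:
        // [apple, banana]
        // [cherry, date, elderberry]
    }
}
```

Pattern instances are immutable and safe to share between threads, which is why a static final field is a common place to keep one.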
Using Scanner
Java’s Scanner class provides another way to tokenize a string.
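Where Scanner tends to differ from the other approaches is that it can convert tokens to other types while reading. The sketch below, assuming a delimiter-separated list of integers, collects the numeric tokens and skips anything else; the class name ScannerIntDemo and the method parseInts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;
import java.util.regex.Pattern;

final class ScannerIntDemo {

    // Reads delimiter-separated tokens and keeps only those that parse as int.
    static List<Integer> parseInts(String input, String delimiter) {
        List<Integer> values = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useLocale(Locale.ROOT);                  // Locale-independent number parsing
            scanner.useDelimiter(Pattern.quote(delimiter));  // Match the delimiter literally
            while (scanner.hasNext()) {
                if (scanner.hasNextInt()) {
                    values.add(scanner.nextInt());
                } else {
                    scanner.next(); // Skip a token that is not an integer
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parseInts("10,20,30", ","));  // [10, 20, 30]
        System.out.println(parseInts("10,x,30", ","));   // [10, 30]
    }
}
```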
Using Java Streams
The Java 8 Streams API offers a functional approach to string splitting.
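The advantage shows once splitting is chained with further processing. As a small sketch, the following trims each token and discards blank ones before collecting the result into a list; the class name StreamSplitDemo and the method splitTrimmed are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class StreamSplitDemo {

    // Splits on a literal delimiter, trims each token, and drops tokens that end up empty.
    static List<String> splitTrimmed(String input, String delimiter) {
        return Arrays.stream(input.split(Pattern.quote(delimiter)))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(splitTrimmed(" apple, ,banana ,", ","));  // [apple, banana]
    }
}
```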
Conclusion
Each method of splitting strings in Java has its own advantages:
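- splitM1 and splitM4 – split() and Pattern.split() are easy to use and support regex delimiters (the examples quote the delimiter so that it is matched literally).
- splitM2 – StringTokenizer is lightweight but lacks regex support and treats each character of the delimiter as a separate delimiter.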
- splitM3 – indexOf() and substring() provide a manual approach with more control.
- splitM5 – Scanner is useful when processing structured input.
- splitM6 – Streams offer a functional, concise approach.
Choosing the right method depends on the use case, performance needs, and complexity of the delimiter.
