I'm trying to split text in a JTextArea, using a regex to split the String by \n.
However, this does not work. I also tried \r\n|\r|n and many other combinations of regexes.
Code:
public void insertUpdate(DocumentEvent e) {
    String split[], docStr = null;
    Document textAreaDoc = (Document)e.getDocument();
    try {
        docStr = textAreaDoc.getText(textAreaDoc.getStartPosition().getOffset(), textAreaDoc.getEndPosition().getOffset());
    } catch (BadLocationException e1) {
        // TODO Auto-generated catch block
        e1.printStackTrace();
    }
    split = docStr.split("\\n");
}
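After failed attempts with all the given solutions, I replaced \n with a special word and then split on that word. The following did the trick for me. Note that replace("\\n", ...) targets the two-character sequence backslash + n, not a line feed, so the sample text below contains a literal \n:
String article = "Alice phoned\\n bob.";  // literal backslash followed by 'n', not a line feed
article = article.replace("\\n", " NEWLINE ");
String[] sen = article.split(" NEWLINE ");
I couldn't replicate the example given in the question, but I guess this logic can be applied.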
The answers above did not help me on Android; Pshemo's response is what worked for me there. I will leave part of Pshemo's answer here:
split("\\\\n")
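For anyone wondering why the four backslashes matter, here is a minimal sketch. It assumes the text contains a backslash followed by the letter n (for example, text that arrived already escaped) rather than a real line feed. The class and method names are made up for illustration.

```java
import java.util.Arrays;

class LiteralBackslashN {
    // Splits on a real line feed character (U+000A).
    static String[] splitOnLineFeed(String text) {
        return text.split("\n");
    }

    // Splits on the two-character sequence backslash + 'n'.
    // In Java source, "\\\\n" is the regex \\n: an escaped backslash, then 'n'.
    static String[] splitOnLiteralBackslashN(String text) {
        return text.split("\\\\n");
    }

    public static void main(String[] args) {
        String realNewline = "one\ntwo";   // 7 characters, contains a line feed
        String escapedText = "one\\ntwo";  // 8 characters, contains '\' and 'n'

        System.out.println(Arrays.toString(splitOnLineFeed(realNewline)));
        System.out.println(Arrays.toString(splitOnLiteralBackslashN(escapedText)));
        // Splitting the escaped text on a line feed finds nothing to split on.
        System.out.println(splitOnLineFeed(escapedText).length);
    }
}
```

If split("\n") returns a single element, it may be worth checking whether the string really contains line feeds or only their escaped form.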
If, for some reason, you don't want to use String.split (for example, because of regular expressions) and you want to use functional programming on Java 8 or newer:
List<String> lines = new BufferedReader(new StringReader(string))
        .lines()
        .collect(Collectors.toList());
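A self-contained version of the snippet above, with the imports it needs, might look like this. The class name ReaderLines and the helper toLines are placeholders of mine, not part of the original answer.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

class ReaderLines {
    // BufferedReader treats \n, \r and \r\n as line terminators,
    // so no regular expression is involved here.
    static List<String> toLines(String text) {
        return new BufferedReader(new StringReader(text))
                .lines()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Mixed terminators plus one empty line in the middle.
        System.out.println(toLines("first\r\nsecond\rthird\n\nfifth"));
    }
}
```

Empty lines in the middle of the text are kept as empty strings in the resulting list.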
String split[], docStr = null;
Document textAreaDoc = (Document)e.getDocument();
try {
    docStr = textAreaDoc.getText(textAreaDoc.getStartPosition().getOffset(), textAreaDoc.getEndPosition().getOffset());
} catch (BadLocationException e1) {
    // TODO Auto-generated catch block
    e1.printStackTrace();
}
split = docStr.split("\n");
If you don’t want empty lines:
String.split("[\\r\\n]+")
None of the answers given here respects Java's definition of a new line as used by, e.g., BufferedReader#readLine. Java accepts \n, \r and \r\n as a new line. Some of the answers collapse multiple empty lines or mishandle malformed files. E.g. <sometext>\n\r\n<someothertext> split with [\r\n]+ would result in only two lines.
String lines[] = string.split("(\r\n|\r|\n)", -1);
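In contrast, the pattern above follows Java's definition of a new line, does not merge consecutive line terminators into one, and (because of the -1 limit) does not drop trailing empty lines.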
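To make the difference concrete, here is a small sketch that runs both patterns on the malformed input mentioned above, with one extra trailing \n added by me. The class and method names are only for illustration.

```java
import java.util.Arrays;

class TerminatorSplit {
    // One terminator per match; limit -1 keeps trailing empty strings.
    static String[] splitStrict(String text) {
        return text.split("(\r\n|\r|\n)", -1);
    }

    // A run of \r and \n characters counts as a single delimiter.
    static String[] splitGreedy(String text) {
        return text.split("[\r\n]+");
    }

    public static void main(String[] args) {
        String sample = "sometext\n\r\nsomeothertext\n";

        // Four elements: "sometext", "", "someothertext", ""
        System.out.println(Arrays.toString(splitStrict(sample)));
        // Two elements: "sometext", "someothertext"
        System.out.println(Arrays.toString(splitGreedy(sample)));
    }
}
```

Which result is "right" depends on whether empty lines carry meaning in your input.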
There are three different conventions (it could be said that those are de facto standards) to set and display a line break:
- carriage return + line feed
- line feed
- carriage return
In some text editors, it is possible to exchange one for the other.
The simplest thing is to normalize to line feed and then split.
final String[] lines = contents.replace("\r\n", "\n")
        .replace("\r", "\n")
        .split("\n", -1);
String#split(String regex) uses a regex (regular expression). Since Java 8, regex supports \R, which represents (from the documentation of the Pattern class):
Linebreak matcher
\R Any Unicode linebreak sequence, is equivalent to \u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029]
So we can use it to match:
- \u000D\u000A -> the \r\n pair
- \u000A -> line feed (\n)
- \u000B -> line tabulation (not to be confused with character tabulation \t, which is \u0009)
- \u000C -> form feed (\f)
- \u000D -> carriage return (\r)
- \u0085 -> next line (NEL)
- \u2028 -> line separator
- \u2029 -> paragraph separator
As you see, \r\n is placed at the start of the regex, which ensures that the regex will try to match this pair first, and only if that match fails will it try to match single-character line separators.
So if you want to split on line separators, use split("\\R").
If you don't want trailing empty strings "" removed from the resulting array, use split(regex, limit) with a negative limit parameter, like split("\\R", -1).
If you want to treat one or more consecutive empty lines as a single delimiter, use split("\\R+").
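The three variants can be compared side by side with a short sketch like the one below. The sample string and the class and method names are mine, and Java 8 or newer is assumed.

```java
import java.util.Arrays;

class LinebreakMatcher {
    // Splits on any single line break; trailing empty strings are dropped.
    static String[] lines(String text) {
        return text.split("\\R");
    }

    // Same, but a negative limit keeps trailing empty strings.
    static String[] linesKeepTrailing(String text) {
        return text.split("\\R", -1);
    }

    // A run of consecutive line breaks acts as one delimiter.
    static String[] nonEmptyRuns(String text) {
        return text.split("\\R+");
    }

    public static void main(String[] args) {
        String sample = "a\r\nb\n\nc\n\n";

        System.out.println(Arrays.toString(lines(sample)));             // 4 elements
        System.out.println(Arrays.toString(linesKeepTrailing(sample))); // 6 elements
        System.out.println(Arrays.toString(nonEmptyRuns(sample)));      // 3 elements
    }
}
```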
Maybe this would work:
Remove the double backslashes from the parameter of the split method:
split = docStr.split("\n");
In JDK 11, the String class has a lines() method:
Returning a stream of lines extracted from this string, separated by line terminators.
Further, the documentation goes on to say:
A line terminator is one of the following: a line feed character "\n" (U+000A), a carriage return character "\r" (U+000D), or a carriage return followed immediately by a line feed "\r\n" (U+000D U+000A). A line is either a sequence of zero or more characters followed by a line terminator, or it is a sequence of one or more characters followed by the end of the string. A line does not include the line terminator.
With this one can simply do:
Stream<String> stream = str.lines();
then if you want an array:
String[] array = str.lines().toArray(String[]::new);
Given that this method returns a Stream, it opens up a lot of options, as it enables you to write concise and declarative expressions of possibly-parallel operations.
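As one example of what the Stream makes easy, the sketch below trims every line and drops the blank ones in a single pipeline. It assumes Java 11 or newer (for lines() and strip()); the class and method names are invented for the example.

```java
import java.util.List;
import java.util.stream.Collectors;

class LinesStream {
    // Trims each line and keeps only the ones that still have content.
    static List<String> nonBlankTrimmed(String text) {
        return text.lines()
                .map(String::strip)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(nonBlankTrimmed(" lorem \n\n ipsum \r\n sit "));
    }
}
```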
A new method, lines, has been introduced to the String class in Java 11; it returns Stream<String>:
Returns a stream of substrings extracted from this string partitioned by line terminators.
Line terminators recognized are line feed "\n" (U+000A), carriage return "\r" (U+000D) and a carriage return followed immediately by a line feed "\r\n" (U+000D U+000A).
Here are a few examples:
jshell> "lorem \n ipusm \n sit".lines().forEach(System.out::println)
lorem
ipusm
sit
jshell> "lorem \n ipusm \r sit".lines().forEach(System.out::println)
lorem
ipusm
sit
jshell> "lorem \n ipusm \r\n sit".lines().forEach(System.out::println)
lorem
ipusm
sit
As an alternative to the previous answers, Guava's Splitter API can be used if other operations are to be applied to the resulting lines, like trimming lines or filtering empty lines:
import com.google.common.base.Splitter;
Iterable<String> split = Splitter.onPattern("\r?\n").trimResults().omitEmptyStrings().split(docStr);
Note that the result is an Iterable and not an array.
String lines[] = String.split(System.lineSeparator());
To keep empty lines from getting squashed, use:
String lines[] = String.split("\\r?\\n", -1);
There is a new kid in town, so you no longer need to deal with all the complexities above. From JDK 11 onward, a single line of code is enough: it splits the string into lines and returns them as a Stream of String.
import java.util.stream.Stream;

public class MyClass {
    public static void main(String args[]) {
        Stream<String> lines = "foo \n bar \n baz".lines();
        // Do whatever you want to do with lines
    }
}
Some references:
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/String.html#lines()
https://www.azul.com/90-new-features-and-apis-in-jdk-11/
I hope this will be helpful to someone. Happy coding.
The above code doesn't actually do anything visible - it just calculates, then dumps the calculation. Is it the code you used, or just an example for this question?
Try doing textAreaDoc.insertString(int, String, AttributeSet) at the end?
String.split(System.getProperty("line.separator"));
This should be system independent
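One caveat, shown as a hedged sketch: this splits on whatever separator the current platform uses, so it only behaves as expected when the text was produced with that same separator. Swing's DefaultEditorKit documents that text held in a Document uses \n in memory regardless of platform, so it may not be the right tool for the JTextArea case. Pattern.quote is my addition (the separator is passed to split as a regex), and the class and method names are placeholders.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PlatformSeparator {
    // Splits on the current platform's separator, quoted so that it is
    // treated as literal text rather than as a regular expression.
    static String[] splitOnPlatformSeparator(String text) {
        return text.split(Pattern.quote(System.lineSeparator()));
    }

    public static void main(String[] args) {
        // Text built with the platform separator splits cleanly everywhere.
        String sameOrigin = String.join(System.lineSeparator(), "a", "b", "c");
        System.out.println(Arrays.toString(splitOnPlatformSeparator(sameOrigin)));

        // Text from elsewhere may use another convention; the result then
        // depends on the platform this code runs on.
        String windowsText = "a\r\nb\r\nc";
        System.out.println(splitOnPlatformSeparator(windowsText).length);
    }
}
```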
You don't have to double escape characters in character groups.
For all non empty lines use:
String.split("[\r\n]+")
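A quick way to convince yourself that both spellings behave the same is the sketch below. With "[\r\n]+" the Java compiler puts real CR and LF characters into the pattern; with "[\\r\\n]+" the regex engine interprets the \r and \n escapes itself. The class and method names are just for illustration.

```java
import java.util.Arrays;

class CharClassEscapes {
    // The pattern string contains actual CR and LF characters.
    static String[] singleEscaped(String text) {
        return text.split("[\r\n]+");
    }

    // The pattern string contains the regex escapes \r and \n.
    static String[] doubleEscaped(String text) {
        return text.split("[\\r\\n]+");
    }

    public static void main(String[] args) {
        String sample = "a\r\n\r\nb\nc";
        System.out.println(Arrays.toString(singleEscaped(sample)));
        System.out.println(Arrays.equals(singleEscaped(sample), doubleEscaped(sample)));
    }
}
```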
package in.javadomain;

public class JavaSplit {
    public static void main(String[] args) {
        String input = "chennai\nvellore\ncoimbatore\nbangalore\narcot";
        System.out.println("Before split:\n");
        System.out.println(input);
        String[] inputSplitNewLine = input.split("\\n");
        System.out.println("\n After split:\n");
        for (int i = 0; i < inputSplitNewLine.length; i++) {
            System.out.println(inputSplitNewLine[i]);
        }
    }
}
Source: Stackoverflow.com