String parsing in Java with delimiter tab "\t" using split

Question

I'm processing a string which is tab delimited. I'm accomplishing this using the split function, and it works in most situations. The problem occurs when a field is missing, so instead of getting null in that field I get the next value. I'm storing the parsed values in a string array.

    String[] columnDetail = new String[11];
    columnDetail = column.split("\t");

Any help would be appreciated. If possible I'd like to store the parsed strings into a string array so that I can easily access the parsed data.

User · Answer

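You can use yourstring.split("\\x09"), where \x09 is the regex hex escape for the tab character (the backslash has to be doubled inside a Java string literal). I tested it, and it works.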
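As a quick check of the hex-escape suggestion, the sketch below compares three spellings of the tab delimiter. The class name, method name and sample row are made up for illustration; only String.split and Arrays.equals come from the JDK.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
final class TabSpellingDemo {

    // Returns true when all three spellings of the tab delimiter split the row identically.
    static boolean sameResult(String row) {
        String[] viaHex = row.split("\\x09", -1);       // regex hex escape for tab
        String[] viaRegexEscape = row.split("\\t", -1); // regex \t escape
        String[] viaLiteralTab = row.split("\t", -1);   // literal tab produced by the compiler
        return Arrays.equals(viaHex, viaRegexEscape)
                && Arrays.equals(viaHex, viaLiteralTab);
    }

    public static void main(String[] args) {
        // Sample row with an empty middle field and an empty trailing field.
        System.out.println(sameResult("a\t\tc\t"));
    }
}
```

Since the three patterns behave the same, the choice of spelling does not affect the missing-field problem from the question.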

User · Answer

Try this:

    String[] columnDetail = column.split("\t", -1);

Read the Javadoc on String.split(java.lang.String, int) for an explanation of the limit parameter of the split function:

split

public String[] split(String regex, int limit)

Splits this string around matches of the given regular expression. The array returned by this method contains each substring of this string that is terminated by another substring that matches the given expression or is terminated by the end of the string. The substrings in the array are in the order in which they occur in this string. If the expression does not match any part of the input then the resulting array has just one element, namely this string.

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

The string "boo:and:foo", for example, yields the following results with these parameters:

    Regex   Limit   Result
    :       2       { "boo", "and:foo" }
    :       5       { "boo", "and", "foo" }
    :       -2      { "boo", "and", "foo" }
    o       5       { "b", "", ":and:f", "", "" }
    o       -2      { "b", "", ":and:f", "", "" }
    o       0       { "b", "", ":and:f" }

When the last few fields (I guess that's your situation) are missing, you will get a column like this:

    field1\tfield2\tfield3\t\t

If no limit is passed to split(), the limit is 0, which means that trailing empty strings will be discarded. So you get just 3 fields: {"field1", "field2", "field3"}.

When the limit is set to -1 (a negative value), trailing empty strings will not be discarded. So you get 5 fields, with the last two being empty strings: {"field1", "field2", "field3", "", ""}.
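The difference between the default limit and a negative limit can be sketched as follows. The class and method names are hypothetical, and the sample row is an assumption modelled on the field1/field2/field3 example in this answer.

```java
// Hypothetical demo class; only String.split comes from the JDK.
final class SplitLimitDemo {

    // Splits a tab-delimited row and keeps trailing empty fields (limit = -1).
    static String[] splitKeepingEmpties(String row) {
        return row.split("\t", -1);
    }

    public static void main(String[] args) {
        // Three filled fields followed by two empty trailing fields.
        String row = "field1\tfield2\tfield3\t\t";

        // Default limit (0): trailing empty strings are discarded.
        System.out.println(row.split("\t").length);          // prints 3

        // Negative limit: trailing empty strings are kept.
        System.out.println(splitKeepingEmpties(row).length); // prints 5
    }
}
```

With the negative limit the array length matches the number of columns in the row, so positions line up even when the last fields are blank.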

User · Answer

I just had the same question and found the answer in a tutorial. In general you need to use the second form of the split method, split(regex, limit). Here is the full tutorial: http://www.rgagnon.com/javadetails/java-0438.html

If you set a negative number for the limit parameter, you will get empty strings in the array where the actual values are missing. For this to work, your initial string should contain two consecutive copies of the delimiter, i.e. you should have \t\t where the values are missing.

Hope this helps.

User · Answer

String.split uses regular expressions. Also, you don't need to allocate an extra array for your split: the split method returns a new array. The problem is that you try to pre-define how many occurrences of a tab you have, but how would you really know that? Try using Scanner or StringTokenizer, and learn how splitting strings works.

Let me explain how \t and \\t differ, and why you need \\\\ to match a backslash.

When you use split, it takes a regex (regular expression), so two layers of escaping are involved: the Java compiler processes the string literal first, and the regex engine processes whatever is left. If you write "\t", the compiler turns it into a single tab character before the regex engine ever sees it, and the engine matches that tab literally. If you write "\\t", the first \ escapes the second, so the regex engine receives the two characters \t and interprets them itself as "match a tab". Both forms end up splitting on tabs, but notice the difference in who handles the escape. Using \ means to escape something, and \ in a regex can mean something totally different from what it means in a Java string literal.

Now let's say that you want to split on \ itself. A single \ doesn't work as a regex, because \ will try to escape the next character. The regex for a literal backslash is therefore \\, and to get \\ to the regex engine you need to write "\\\\" in Java source.

I really hope the examples above help you understand how escaping works in split patterns and how to handle other delimiters.

Now, I've given you this answer before, maybe you should start looking at them now.

OTHER METHODS

StringTokenizer

You should look into StringTokenizer, it's a very handy tool for this type of work.

Example:

    StringTokenizer st = new StringTokenizer("this is a test");
    while (st.hasMoreTokens()) {
        System.out.println(st.nextToken());
    }

This will output:

    this
    is
    a
    test

You use the second constructor for StringTokenizer to set the delimiter:

    StringTokenizer(String str, String delim)

Scanner

You could also use a Scanner, as one of the commenters said. This could look somewhat like this.

Example:

    String input = "1 fish 2 fish red fish blue fish";

    Scanner s = new Scanner(input).useDelimiter("\\s*fish\\s*");

    System.out.println(s.nextInt());
    System.out.println(s.nextInt());
    System.out.println(s.next());
    System.out.println(s.next());

    s.close();

The output would be:

    1
    2
    red
    blue

Meaning that it will cut out the word "fish" and give you the rest, using "fish" as the delimiter.

(Examples taken from the Java API.)
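The escaping layers described above can be checked with a small sketch. The class name, helper names and sample strings are assumptions for illustration. The last two lines also show a behaviour worth knowing before picking StringTokenizer for this question: it does not return empty tokens, so a missing field between two tabs disappears.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the original answer.
final class EscapeLayersDemo {

    // Splits on a literal backslash: the regex is \\ and the Java literal doubles it again.
    static String[] splitOnBackslash(String path) {
        return path.split("\\\\", -1);
    }

    // Collects all StringTokenizer tokens for the given delimiter characters.
    static List<String> tokenize(String row, String delim) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(row, delim);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println("\t".length());   // prints 1: the compiler already produced a tab
        System.out.println("\\t".length());  // prints 2: backslash + t, left for the regex engine

        // The sample path contains two backslashes, so it splits into three parts.
        System.out.println(splitOnBackslash("C:\\temp\\file.txt").length); // prints 3

        // Consecutive tabs: StringTokenizer skips the empty field, split keeps it.
        System.out.println(tokenize("a\t\tc", "\t").size());  // prints 2
        System.out.println("a\t\tc".split("\t", -1).length);  // prints 3
    }
}
```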

User · Answer

Well, nobody answered, which is in part the fault of the question: the input string contains eleven fields (this much can be inferred), but how many tabs? Most probably exactly 10. Then the answer is:

    // requires: import java.util.Arrays;
    String s = "\t2\t\t4\t5\t6\t\t8\t\t10\t";
    String[] fields = s.split("\t", -1);  // in your case s.split("\t", 11) might also do
    for (int i = 0; i < fields.length; ++i) {
        if ("".equals(fields[i])) fields[i] = null;
    }
    System.out.println(Arrays.asList(fields));
    // [null, 2, null, 4, 5, 6, null, 8, null, 10, null]
    // with s.split("\t"): [null, 2, null, 4, 5, 6, null, 8, null, 10]

If the fields happen to contain tabs this won't work as expected, of course. The -1 means: apply the pattern as many times as needed, so trailing fields (the 11th) will be preserved (as empty strings "" if absent, which need to be turned to null explicitly).

If, on the other hand, there are no tabs for the missing fields, so "5\t6" is a valid input string containing the fields 5 and 6 only, there is no way to get the fields[] via split.

User · Answer

String.split implementations will have serious limitations if the data in a tab-delimited field itself contains newline, tab and possibly " characters.

TAB-delimited formats have been around for donkey's years, but the format is not standardised and varies. Many implementations don't escape characters (newlines and tabs) appearing within a field. Rather, they follow CSV conventions and wrap any non-trivial fields in "double quotes". Then they escape only double-quotes. So a "line" could extend over multiple lines.

Reading around I heard "just reuse apache tools", which sounds like good advice.

In the end I personally chose opencsv. I found it light-weight, and since it provides options for escape and quote characters it should cover most popular comma- and tab-delimited data formats.

Example:

    CSVReader tabFormatReader = new CSVReader(new FileReader("yourfile.tsv"), '\t');
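To illustrate the limitation described above, here is a minimal hand-rolled sketch that treats tabs inside double quotes as data. It is not opencsv and does not reproduce its API. The class and method names are hypothetical, and the quoting convention (a doubled "" standing for one literal quote) is an assumption borrowed from common CSV practice. For real data a tested library is the safer choice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a quote-aware splitter for a single record.
final class QuotedTsvSketch {

    // Splits one record on tabs, keeping tabs that appear inside double quotes.
    static List<String> parseRecord(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < record.length() && record.charAt(i + 1) == '"') {
                    current.append('"'); // doubled quote inside a quoted field
                    i++;
                } else {
                    inQuotes = !inQuotes; // opening or closing quote, not part of the data
                }
            } else if (c == '\t' && !inQuotes) {
                fields.add(current.toString()); // unquoted tab ends the field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field, possibly empty
        return fields;
    }

    public static void main(String[] args) {
        // Second field is quoted and contains a tab.
        String record = "a\t\"b\tc\"\td";
        System.out.println(record.split("\t", -1).length); // prints 4: split cuts the quoted field
        System.out.println(parseRecord(record).size());    // prints 3
    }
}
```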

User · Answer

    String[] columnDetail = new String[11];
    columnDetail = column.split("\t", -1); // unlimited

OR

    columnDetail = column.split("\t", 11); // if you are sure about limit

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
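The effect of a positive limit can be sketched with a smaller number than 11. The class name, method name and sample rows below are made up for illustration. The point is that a positive limit keeps trailing empty strings, but also folds any surplus delimiters into the last element.

```java
// Hypothetical demo class; only String.split comes from the JDK.
final class FixedLimitDemo {

    // Splits a tab-delimited row into at most maxFields pieces (maxFields > 0).
    static String[] splitAtMost(String row, int maxFields) {
        return row.split("\t", maxFields);
    }

    public static void main(String[] args) {
        // Trailing empty fields are kept when the limit is positive.
        System.out.println(splitAtMost("a\t\t", 3).length);          // prints 3

        // More fields than the limit: the remainder stays in the last entry.
        String[] parts = splitAtMost("a\tb\tc\td", 3);
        System.out.println(parts.length);                            // prints 3
        System.out.println(parts[2].equals("c\td"));                 // prints true

        // A negative limit returns every field separately.
        System.out.println("a\tb\tc\td".split("\t", -1).length);     // prints 4
    }
}
```

So a fixed limit such as 11 is a reasonable choice only when the number of columns is known in advance. Otherwise the last element may silently contain several fields.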
