Reading a file line by line in Go

I'm unable to find a file.ReadLine function in Go. I could write one quickly, but I'm wondering whether I'm overlooking something. How does one read a file line by line?


The answer is


Use:

  • reader.ReadString('\n')
    • If you don't mind that the line could be very long (i.e. use a lot of RAM). It keeps the \n at the end of the string returned.
  • reader.ReadLine()
    • If you care about limiting RAM consumption and don't mind the extra work of handling the case where the line is greater than the reader's buffer size.

I tested the various solutions suggested by writing a program to test the scenarios which are identified as problems in other answers:

  • A file with a 4MB line.
  • A file which doesn't end with a line break.

I found that:

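  • The Scanner solution does not handle long lines with its default buffer: it stops with "bufio.Scanner: token too long" once a line exceeds bufio.MaxScanTokenSize (64 KiB).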
  • The ReadLine solution is complex to implement.
  • The ReadString solution is the simplest and works for long lines.

Here is code which demonstrates each solution. It can be run via go run main.go, or at https://play.golang.org/p/RAW3sGblbas

package main

import (
    "bufio"
    "bytes"
    "fmt"
    "io"
    "os"
)

func readFileWithReadString(fn string) (err error) {
    fmt.Println("readFileWithReadString")

    file, err := os.Open(fn)
    if err != nil {
        return err
    }
    defer file.Close()

    // Start reading from the file with a reader.
    reader := bufio.NewReader(file)
    var line string
    for {
        line, err = reader.ReadString('\n')
        if err != nil && err != io.EOF {
            break
        }

        // Process the line here.
        fmt.Printf(" > Read %d characters\n", len(line))
        fmt.Printf(" > > %s\n", limitLength(line, 50))

        if err != nil {
            break
        }
    }
    if err != io.EOF {
        fmt.Printf(" > Failed with error: %v\n", err)
        return err
    }
    return
}

func readFileWithScanner(fn string) (err error) {
    fmt.Println("readFileWithScanner (scanner fails with long lines)")

    // Don't use this, it doesn't work with long lines...

    file, err := os.Open(fn)
    if err != nil {
        return err
    }
    defer file.Close()

    // Start reading from the file using a scanner.
    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        line := scanner.Text()

        // Process the line here.
        fmt.Printf(" > Read %d characters\n", len(line))
        fmt.Printf(" > > %s\n", limitLength(line, 50))
    }
    if scanner.Err() != nil {
        fmt.Printf(" > Failed with error %v\n", scanner.Err())
        return scanner.Err()
    }
    return
}

func readFileWithReadLine(fn string) (err error) {
    fmt.Println("readFileWithReadLine")

    file, err := os.Open(fn)
    if err != nil {
        return err
    }
    defer file.Close()

    // Start reading from the file with a reader.
    reader := bufio.NewReader(file)
    for {
        var buffer bytes.Buffer

        var l []byte
        var isPrefix bool
        for {
            l, isPrefix, err = reader.ReadLine()
            buffer.Write(l)
            // If we've reached the end of the line, stop reading.
            if !isPrefix {
                break
            }
            // If we're at the EOF, break.
            if err != nil {
                if err != io.EOF {
                    return err
                }
                break
            }
        }
        line := buffer.String()

        // Process the line here.
        fmt.Printf(" > Read %d characters\n", len(line))
        fmt.Printf(" > > %s\n", limitLength(line, 50))

        if err == io.EOF {
            break
        }
    }
    if err != io.EOF {
        fmt.Printf(" > Failed with error: %v\n", err)
        return err
    }
    return
}

func main() {
    testLongLines()
    testLinesThatDoNotFinishWithALinebreak()
}

func testLongLines() {
    fmt.Println("Long lines")
    fmt.Println()

    createFileWithLongLine("longline.txt")
    readFileWithReadString("longline.txt")
    fmt.Println()
    readFileWithScanner("longline.txt")
    fmt.Println()
    readFileWithReadLine("longline.txt")
    fmt.Println()
}

func testLinesThatDoNotFinishWithALinebreak() {
    fmt.Println("No linebreak")
    fmt.Println()

    createFileThatDoesNotEndWithALineBreak("nolinebreak.txt")
    readFileWithReadString("nolinebreak.txt")
    fmt.Println()
    readFileWithScanner("nolinebreak.txt")
    fmt.Println()
    readFileWithReadLine("nolinebreak.txt")
    fmt.Println()
}

func createFileThatDoesNotEndWithALineBreak(fn string) (err error) {
    file, err := os.Create(fn)
    if err != nil {
        return err
    }
    defer file.Close()

    w := bufio.NewWriter(file)
    w.WriteString("Does not end with linebreak.")
    w.Flush()
    return
}

func createFileWithLongLine(fn string) (err error) {
    file, err := os.Create(fn)
    if err != nil {
        return err
    }
    defer file.Close()

    w := bufio.NewWriter(file)
    fs := 1024 * 1024 * 4 // 4MB
    // Create a 4MB long line consisting of the letter a.
    for i := 0; i < fs; i++ {
        w.WriteRune('a')
    }
    // Terminate the line with a break.
    w.WriteRune('\n')

    // Put in a second line, which doesn't have a linebreak.
    w.WriteString("Second line.")
    w.Flush()
    return
}

func limitLength(s string, length int) string {
    if len(s) < length {
        return s
    }
    return s[:length]
}

I tested on:

  • go version go1.15 darwin/amd64
  • go version go1.7 windows/amd64
  • go version go1.6.3 linux/amd64
  • go version go1.7.4 darwin/amd64

The test program outputs:

Long lines

readFileWithReadString
 > Read 4194305 characters
 > > aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 > Read 12 characters
 > > Second line.

readFileWithScanner (scanner fails with long lines)
 > Failed with error bufio.Scanner: token too long

readFileWithReadLine
 > Read 4194304 characters
 > > aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 > Read 12 characters
 > > Second line.
 > Read 0 characters
 > > 

No linebreak

readFileWithReadString
 > Read 28 characters
 > > Does not end with linebreak.

readFileWithScanner (scanner fails with long lines)
 > Read 28 characters
 > > Does not end with linebreak.

readFileWithReadLine
 > Read 28 characters
 > > Does not end with linebreak.
 > Read 0 characters
 > > 

Another method is to use the io/ioutil and strings libraries to read the entire file's bytes, convert them into a string, and split it using "\n" (newline) as the delimiter. In Go 1.16 and newer, os.ReadFile is the preferred equivalent of ioutil.ReadFile. For example:

package main

import (
    "fmt"
    "io/ioutil"
    "log"
    "strings"
)

func main() {
    bytesRead, err := ioutil.ReadFile("something.txt")
    if err != nil {
        log.Fatal(err)
    }
    fileContent := string(bytesRead)
    for _, line := range strings.Split(fileContent, "\n") {
        fmt.Println(line) // process each line here
    }
}

Technically you're not reading the file line by line, but you can still parse each line using this technique. It is suitable for smaller files; to parse a massive file, use one of the techniques that reads line by line.


Example from this gist

func readLine(path string) {
  inFile, err := os.Open(path)
  if err != nil {
     fmt.Println(err.Error() + `: ` + path)
     return
  }
  defer inFile.Close()

  scanner := bufio.NewScanner(inFile)
  for scanner.Scan() {
    fmt.Println(scanner.Text()) // the line
  }
}

However, this fails when a line is larger than the Scanner's buffer: Scan returns false and scanner.Err() reports the error.

When that happens, I use reader := bufio.NewReader(inFile) and build up my own buffer using either ch, err := reader.ReadByte() or len, err := reader.Read(myBuffer).

Another way that I use (replace os.Stdin with a file as above) concatenates the chunks when a line is long (isPrefix) and ignores empty lines:


func readLines() []string {
  r := bufio.NewReader(os.Stdin)
  bytes := []byte{}
  lines := []string{}
  for {
    line, isPrefix, err := r.ReadLine()
    if err != nil {
      break
    }
    bytes = append(bytes, line...)
    if !isPrefix {
      str := strings.TrimSpace(string(bytes))
      if len(str) > 0 {
        lines = append(lines, str)
      }
      // Reset the buffer even for blank lines so they do not accumulate.
      bytes = []byte{}
    }
  }
  if len(bytes) > 0 {
    lines = append(lines, string(bytes))
  }
  return lines
}

In the code below, I read interests from the CLI until the user hits Enter on an empty line, using ReadLine:

interests := make([]string, 0)
r := bufio.NewReader(os.Stdin)
for {
    fmt.Print("Give me an interest:")
    t, _, err := r.ReadLine()
    // Stop on an empty line or on any read error (including EOF).
    if err != nil || len(t) == 0 {
        break
    }
    interests = append(interests, string(t))
}
fmt.Println(interests)

You can also use ReadString with \n as a separator:

  f, err := os.Open(filename)
  if err != nil {
    fmt.Println("error opening file ", err)
    os.Exit(1)
  }
  defer f.Close()
  r := bufio.NewReader(f)
  for {
    line, err := r.ReadString('\n') // the returned line keeps its trailing '\n'
    if len(line) > 0 {
      // do something with line here
      fmt.Print(line)
    }
    if err == io.EOF {
      break
    } else if err != nil {
      return err // if you return error
    }
  }

EDIT: As of go1.1, the idiomatic solution is to use bufio.Scanner

I wrote up a way to easily read each line from a file. The Readln(*bufio.Reader) function returns a line (sans \n) from the underlying bufio.Reader struct.

// Readln returns a single line (without the ending \n)
// from the input buffered reader.
// An error is returned iff there is an error with the
// buffered reader.
func Readln(r *bufio.Reader) (string, error) {
    var (
        isPrefix = true
        err      error
        line, ln []byte
    )
    for isPrefix && err == nil {
        line, isPrefix, err = r.ReadLine()
        ln = append(ln, line...)
    }
    return string(ln), err
}

You can use Readln to read every line from a file. The following code reads every line in a file and outputs each line to stdout.

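f, err := os.Open(fi)
if err != nil {
    fmt.Printf("error opening file: %v\n", err)
    os.Exit(1)
}
defer f.Close()
r := bufio.NewReader(f)
s, e := Readln(r)
for e == nil {
    fmt.Println(s)
    s, e = Readln(r)
}
// io.EOF marks the normal end of the file; anything else is a real error.
if e != io.EOF {
    fmt.Printf("error reading file: %v\n", e)
}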

Cheers!


In Go 1.1 and newer the simplest way to do this is with a bufio.Scanner. Here is a simple example that reads lines from a file:

package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
)

func main() {
    file, err := os.Open("/path/to/file.txt")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        fmt.Println(scanner.Text())
    }

    if err := scanner.Err(); err != nil {
        log.Fatal(err)
    }
}

This is the cleanest way to read from a Reader line by line.

There is one caveat: by default, Scanner fails on lines longer than 65536 characters (bufio.MaxScanTokenSize). If that is an issue for you, either raise the limit with Scanner.Buffer (Go 1.6 and newer) or roll your own on top of Reader.Read().


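import (
    "bufio"
    "os"
    "strings"
)

var (
    reader = bufio.NewReader(os.Stdin)
)

// ReadFromStdin reads one line from standard input and returns it
// without the trailing line break.
func ReadFromStdin() string {
    result, _ := reader.ReadString('\n')
    // TrimRight is safe on an empty result (e.g. at EOF), unlike
    // slicing with result[:len(result)-1].
    return strings.TrimRight(result, "\r\n")
}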

Here is an example using the function ReadFromStdin(). It is like fmt.Scan(&name), but it takes the whole line, including blank spaces, such as "Hello My Name Is ..."

var name string = ReadFromStdin()

println(name)

There are two common ways to read a file line by line.

  1. Use bufio.Scanner
  2. Use ReadString/ReadBytes/... in bufio.Reader

In my test case (~250MB, ~2,500,000 lines), bufio.Scanner (time used: 0.395491384s) was faster than bufio.Reader.ReadString (time used: 0.446867622s).

Source code: https://github.com/xpzouying/go-practice/tree/master/read_file_line_by_line

Reading the file with bufio.Scanner:

func scanFile() {
    f, err := os.OpenFile(logfile, os.O_RDONLY, os.ModePerm)
    if err != nil {
        log.Fatalf("open file error: %v", err)
        return
    }
    defer f.Close()

    sc := bufio.NewScanner(f)
    for sc.Scan() {
        _ = sc.Text()  // GET the line string
    }
    if err := sc.Err(); err != nil {
        log.Fatalf("scan file error: %v", err)
        return
    }
}

Reading the file with bufio.Reader:

func readFileLines() {
    f, err := os.OpenFile(logfile, os.O_RDONLY, os.ModePerm)
    if err != nil {
        log.Fatalf("open file error: %v", err)
        return
    }
    defer f.Close()

    rd := bufio.NewReader(f)
    for {
        line, err := rd.ReadString('\n')
        if err != nil {
            if err == io.EOF {
                break
            }

            log.Fatalf("read file line error: %v", err)
            return
        }
        _ = line  // GET the line string
    }
}

bufio.Reader.ReadLine() works well, but if you want to read each line as a string, try ReadString('\n'). There is no need to reinvent the wheel.


// strip '\n' or read until EOF, return error if read error
func readline(reader io.Reader) (line []byte, err error) {
    line = make([]byte, 0, 100)
    for {
        b := make([]byte, 1)
        n, er := reader.Read(b)
        if n > 0 {
            c := b[0]
            if c == '\n' { // end of line
                break
            }
            line = append(line, c)
        }
        if er != nil {
            err = er
            return
        }
    }
    return
}

In Go 1.16 and newer we can use the embed package to embed a file in the compiled program and read its contents, as shown below.

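package main

import "embed"

// go:embed directives must be attached to package-level variables,
// and using embed.FS requires a regular (non-blank) import of "embed".

//go:embed hello.txt
var s string

//go:embed hello.txt
var b []byte

//go:embed hello.txt
var f embed.FS

func main() {
    print(s)
    print(string(b))

    data, _ := f.ReadFile("hello.txt")
    print(string(data))
}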

For more details, see https://tip.golang.org/pkg/embed/ and https://golangtutorial.dev/tips/embed-files-in-go/

