Categories
c++ file-io ofstream

Read file line by line using ifstream in C++

745

The contents of file.txt are:

5 3
6 4
7 1
10 5
11 6
12 3
12 4

Where 5 3 is a coordinate pair.
How do I process this data line by line in C++?

I am able to get the first line, but how do I get the next line of the file?

ifstream myfile;
myfile.open ("file.txt");


    1088

    First, make an ifstream:

    #include <fstream>
    std::ifstream infile("thefile.txt");
    

    The two standard methods are:

    1. Assume that every line consists of two numbers and read token by token:

      int a, b;
      while (infile >> a >> b)
      {
          // process pair (a,b)
      }
      
    2. Line-based parsing, using string streams:

      #include <sstream>
      #include <string>
      
      std::string line;
      while (std::getline(infile, line))
      {
          std::istringstream iss(line);
          int a, b;
          if (!(iss >> a >> b)) { break; } // error
      
          // process pair (a,b)
      }
      

    You shouldn’t mix (1) and (2), since the token-based parsing doesn’t gobble up newlines, so you may end up with spurious empty lines if you use getline() after token-based extraction got you to the end of a line already.
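To make the pitfall above concrete, here is a minimal sketch. The helper name `first_line_after_token` and its flag are hypothetical and exist only for illustration. It reads one integer with `operator>>` and then calls `std::getline()`. Without discarding the rest of the line first, `getline()` sees only the leftover newline and returns an empty string.

```cpp
#include <istream>
#include <limits>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>

// Hypothetical helper for illustration: reads one integer with operator>>,
// then returns the first line that std::getline() sees afterwards.
std::string first_line_after_token(std::istream& in, bool skip_rest_of_line)
{
    int n = 0;
    in >> n;  // stops right after the digits; the '\n' stays in the stream

    if (skip_rest_of_line) {
        // Discard everything up to and including the pending newline.
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }

    std::string line;
    std::getline(in, line);
    return line;
}
```

With the input "42\nhello\n", the call without skipping yields an empty string, while the call with skipping yields "hello". If you do have to mix the two styles, one option is the `ignore()` call shown here after each token-based read.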


    • 1

      @EdwardKarak: I don’t understand what “commas as the token” means. Commas don’t represent integers.

      – Kerrek SB

      Oct 18, 2014 at 14:22

    • 8

      The OP used a space to delimit the two integers. I wanted to know whether while (infile >> a >> b) would work if the OP used a comma as the delimiter, because that is the scenario in my own program.

      Oct 18, 2014 at 14:46

    • 31

      @EdwardKarak: Ah, so when you said “token” you meant “delimiter”. Right. With a comma, you’d say: int a, b; char c; while ((infile >> a >> c >> b) && (c == ','))

      – Kerrek SB

      Oct 18, 2014 at 15:25


    • 12

      @KerrekSB: Huh. I was wrong. I didn’t know it could do that. I might have some code of my own to rewrite.

      – Mark H

      Jan 6, 2015 at 15:00

    • 4

      For an explanation of the while(getline(f, line)) { } construct and of its error handling, please have a look at this (my) article: gehrcke.de/2011/06/… (I don’t think I need to have a bad conscience about posting this here; it even slightly predates this answer.)

      Jan 18, 2015 at 14:15
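The comma-delimited variant from the comments above can be sketched as a small function. The name `read_comma_pairs` and the return type are assumptions made for illustration. The idea itself is the one from the comment: extract the delimiter into a `char` and check it.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <utility>
#include <vector>

// Parses input such as "5,3\n6,4\n". The char extraction consumes the
// delimiter, and the loop stops at the first entry that does not match.
std::vector<std::pair<int, int>> read_comma_pairs(std::istream& in)
{
    std::vector<std::pair<int, int>> pairs;
    int a = 0, b = 0;
    char sep = '\0';
    while ((in >> a >> sep >> b) && sep == ',') {
        pairs.emplace_back(a, b);
    }
    return pairs;
}
```

Because `operator>>` skips leading whitespace for both `int` and `char`, input such as "5 , 3" is accepted as well. Whether that leniency is desirable depends on the file format.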


    207

    Use ifstream to read data from a file:

    std::ifstream input( "filename.ext" );
    

    If you really need to read line by line, then do this:

    for( std::string line; getline( input, line ); )
    {
        // ... process each line of input here ...
    }
    

    But you probably just need to extract coordinate pairs:

    int x, y;
    input >> x >> y;
    

    Update:

    In your code you use ofstream myfile, but the o in ofstream stands for output. If you want to read from the file (input), use ifstream. If you want to both read and write, use fstream.
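Putting the pieces of this answer together, a minimal sketch for reading every coordinate pair might look like the following. The names `Pairs`, `read_pairs` and `read_pairs_from_file` are hypothetical. The sketch also assumes you want to know whether the file could be opened, which the snippets above do not check.

```cpp
#include <fstream>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>
#include <utility>
#include <vector>

using Pairs = std::vector<std::pair<int, int>>;

// Extracts whitespace-separated integer pairs until extraction fails
// (end of input or a token that is not an integer).
Pairs read_pairs(std::istream& in)
{
    Pairs pairs;
    int x = 0, y = 0;
    while (in >> x >> y) {
        pairs.emplace_back(x, y);
    }
    return pairs;
}

// Opens `path` for input only; returns false if the file cannot be opened.
bool read_pairs_from_file(const std::string& path, Pairs& out)
{
    std::ifstream input(path);  // ifstream: the "i" stands for input
    if (!input) {
        return false;
    }
    out = read_pairs(input);
    return true;
}
```

Taking a `std::istream&` in `read_pairs` is a design choice rather than a requirement. It lets the same loop run on a file, on `std::cin`, or on a string stream.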


    • 9

      Your solution is a slight improvement: unlike in Kerrek SB’s second solution, your line variable is not visible after the file has been read. That said, his solution is good and simple too.

      Jul 23, 2013 at 14:24

    • 8

      getline is declared in <string>, so don’t forget the #include <string>

      – mxmlnkn

      Jul 12, 2017 at 23:02

    131

    Reading a file line by line in C++ can be done in several different ways.

    [Fast] Loop with std::getline()

    The simplest approach is to open an std::ifstream and loop using std::getline() calls. The code is clean and easy to understand.

    #include <cstdio>
    #include <fstream>
    #include <string>
    
    std::ifstream file(FILENAME);
    if (file.is_open()) {
        std::string line;
        while (std::getline(file, line)) {
            // using printf() in all tests for consistency
            printf("%s", line.c_str());
        }
        file.close();
    }
    

    [Fast] Use Boost’s file_descriptor_source

    Another possibility is to use the Boost library, but the code gets a bit more verbose. The performance is quite similar to the code above (Loop with std::getline()).

    #include <boost/iostreams/device/file_descriptor.hpp>
    #include <boost/iostreams/stream.hpp>
    #include <fcntl.h>
    #include <cstdio>
    #include <string>
    
    namespace io = boost::iostreams;
    
    void readLineByLineBoost() {
        int fdr = open(FILENAME, O_RDONLY);
        if (fdr >= 0) {
            io::file_descriptor_source fdDevice(fdr, io::file_descriptor_flags::close_handle);
            io::stream <io::file_descriptor_source> in(fdDevice);
            if (fdDevice.is_open()) {
                std::string line;
                while (std::getline(in, line)) {
                    // using printf() in all tests for consistency
                    printf("%s", line.c_str());
                }
                fdDevice.close();
            }
        }
    }
    

    [Fastest] Use C code

    If performance is critical for your software, you may consider using the C language. This code can be 4-5 times faster than the C++ versions above; see the benchmark below.

    #include <stdio.h>
    #include <stdlib.h>
    
    FILE* fp = fopen(FILENAME, "r");
    if (fp == NULL)
        exit(EXIT_FAILURE);
    
    char* line = NULL;
    size_t len = 0;
    while ((getline(&line, &len, fp)) != -1) {
        // using printf() in all tests for consistency
        printf("%s", line);
    }
    fclose(fp);
    if (line)
        free(line);
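The `getline()` call used above is a POSIX function, not part of the standard C library, so it may be unavailable on some platforms. As a hedged alternative, here is a sketch built only on the standard `std::fgets()`. The helper name `read_line` and the buffer size are assumptions, and this variant was not part of the benchmark below. Unlike POSIX `getline()`, it strips the trailing newline.

```cpp
#include <cstdio>
#include <string>

// Reads one line of any length from `fp` into `line`, without the trailing
// '\n', using only standard-library calls. Returns false once the end of
// the file is reached and nothing more could be read.
bool read_line(std::FILE* fp, std::string& line)
{
    line.clear();
    char buf[256];
    while (std::fgets(buf, sizeof buf, fp) != nullptr) {
        line += buf;
        if (!line.empty() && line.back() == '\n') {
            line.pop_back();  // a complete line has been read
            return true;
        }
        // Otherwise the line was longer than the buffer: keep reading.
    }
    return !line.empty();  // last line without a trailing newline
}
```

A typical loop would be `std::string line; while (read_line(fp, line)) { /* process line */ }`. Lines that contain embedded NUL characters are not handled by this sketch.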
    

    Benchmark — Which one is faster?

    I have done some performance benchmarks with the code above and the results are interesting. I have tested the code with ASCII files that contain 100,000 lines, 1,000,000 lines and 10,000,000 lines of text. Each line of text contains 10 words on average. The program is compiled with -O3 optimization and its output is redirected to /dev/null in order to remove the logging time from the measurement. Last but not least, each piece of code logs each line with the printf() function for consistency.
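For readers who want to try a measurement of this kind themselves, a timing harness could look roughly like the sketch below. It is not the code used for the numbers in this answer. The function name `count_lines_timed` is hypothetical, and it counts lines instead of printing them so that only the reading loop is timed.

```cpp
#include <chrono>
#include <cstddef>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>

// Hypothetical harness: counts the lines in `in` with std::getline() and
// reports the elapsed wall-clock time in milliseconds through `elapsed_ms`.
std::size_t count_lines_timed(std::istream& in, double& elapsed_ms)
{
    const auto start = std::chrono::steady_clock::now();

    std::size_t count = 0;
    std::string line;
    while (std::getline(in, line)) {
        ++count;
    }

    const auto stop = std::chrono::steady_clock::now();
    elapsed_ms = std::chrono::duration<double, std::milli>(stop - start).count();
    return count;
}
```

Passing a `std::ifstream` opened on the test file gives the file-reading time. Results depend heavily on the platform, the standard library and disk caching, so repeated runs are advisable.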

    The results show the time (in ms) that each piece of code took to read the files.

    The performance difference between the two C++ approaches is minimal and shouldn’t make any difference in practice. The performance of the C code is what makes the benchmark impressive and can be a game changer in terms of speed.

                                 10K lines     100K lines     1000K lines
    Loop with std::getline()         105ms          894ms          9773ms
    Boost code                       106ms          968ms          9561ms
    C code                            23ms          243ms          2397ms
    



    • 6

      What happens if you remove C++’s synchronization with C on the console outputs? You might be measuring a known disadvantage of the default behavior of std::cout vs printf.

      Jul 30, 2018 at 20:41

    • 3

      Thanks for raising this concern. I’ve redone the tests and the performance is still the same. I have edited the code to use the printf() function in all cases for consistency. I have also tried using std::cout in all cases and this made absolutely no difference. As I have just described in the text, the output of the program goes to /dev/null, so the time to print the lines is not measured.

      Jul 31, 2018 at 2:11

    • 7

      Groovy. Thanks. Wonder where the slowdown is.

      Jul 31, 2018 at 4:34

    • 10

      Hi @HugoTeixeira, I know this is an old thread, but I tried to replicate your results and could not see any significant difference between C and C++: github.com/simonsso/readfile_benchmarks

      – Simson

      Feb 3, 2019 at 5:24

    • 2

      Note that your use of getline in C is a GNU extension (now added to POSIX). It’s not a standard C function.

      – Dan M.

      Nov 25, 2021 at 13:31