This is a small CSV parser for positive integers, mostly intended for personal projects.
I have (hopefully) improved the code from Part-1 by condensing some parts and cleaning up others, while keeping performance the same.
My main worries are obvious pitfalls I might have missed and problems with the general approach. Of course, anything else you notice is also welcome.
Please see Part-1 for the list of requirements/restrictions I had when writing this.
For anyone who wants to test this against actual input, here is a small Perl script that generates CSV files.
Call it with: perl [scriptname] [how many lines you want] >outputfile
use strict;
use warnings "all";

die "usage: perl $0 <amount_of_lines>\n" unless (int(@ARGV) > 0);

for (1 .. $ARGV[0]) {
    my $vals_per_line = int(rand(10));
    my @values;
    for (0 .. $vals_per_line) {
        push(@values, int(rand(1000)));
    }
    print join(",", @values) . "\n";
}
Code:
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

template<typename T>
void parse_lines(
        const std::string& filename,
        const std::uint_fast8_t& line_length_estimate,
        const std::uint_fast8_t& values_per_line_estimate,
        T callback) {
    std::ifstream infile{filename};
    if (!infile.good()) {
        return;
    }

    // Both containers are reused for every line to avoid reallocations.
    std::vector<std::uint_fast16_t> numbers;
    numbers.reserve(values_per_line_estimate);
    std::string buffer;
    buffer.reserve(line_length_estimate);

    while (std::getline(infile, buffer)) {
        // Terminate the last value with ',' so the loop below stores it.
        // A trailing '\r' (CRLF input) is overwritten instead of appended to.
        // The empty() check avoids indexing into an empty line.
        if (!buffer.empty() && buffer.back() == '\r') {
            buffer.back() = ',';
        }
        else {
            buffer.push_back(',');
        }

        std::uint_fast16_t parsed_number = 0;
        for (const auto& digit : buffer) {
            if (digit == ',') {
                numbers.emplace_back(parsed_number);
                parsed_number = 0;
            }
            else {
                // Assumes digit is in '0'..'9'; the input is not validated.
                parsed_number = (parsed_number * 10) + (digit - '0');
            }
        }

        callback(numbers);
        numbers.clear();
    }
}

int main(int argc, char** argv) {
    // argv[1] is only valid if a filename was actually passed.
    if (argc < 2) {
        return 1;
    }

    constexpr std::uint_fast8_t line_length_estimate = 50;
    constexpr std::uint_fast8_t values_per_line_estimate = 15;
    parse_lines(argv[1], line_length_estimate, values_per_line_estimate, [](auto& values) {
        // ...
    });
}
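
To try the per-line step without a file on disk, it can be pulled out into a small helper. What follows is only a minimal sketch: parse_line is a hypothetical name that does not appear in the code above, and it assumes every line contains nothing but digits, commas, and an optional trailing '\r'.

```cpp
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical helper (not part of the posted code): parses one line of
// comma-separated non-negative integers. The input is assumed to be
// well-formed; no validation or overflow checking is done.
std::vector<std::uint_fast16_t> parse_line(std::string_view line) {
    // Drop the carriage return that std::getline leaves on CRLF input.
    if (!line.empty() && line.back() == '\r') {
        line.remove_suffix(1);
    }

    std::vector<std::uint_fast16_t> numbers;
    std::uint_fast16_t parsed_number = 0;
    for (const char digit : line) {
        if (digit == ',') {
            numbers.push_back(parsed_number);
            parsed_number = 0;
        } else {
            parsed_number = static_cast<std::uint_fast16_t>(
                parsed_number * 10 + (digit - '0'));
        }
    }
    // The last value has no trailing comma, so store it after the loop.
    numbers.push_back(parsed_number);
    return numbers;
}
```

Called as parse_line("1,22,333"), this should give the three values 1, 22 and 333. Note that an empty line yields a single 0 in this sketch, since nothing distinguishes "no value" from "zero".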
parsed_number += (digit - '0') * powers[current_power], but it kept ending up slower. I guess improving performance further will require some more intricate techniques.
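
For comparison, here is a rough sketch of the two accumulation strategies side by side. The lookup-table version is only one possible reading of the powers[current_power] idea: the table contents, the right-to-left traversal, and the function names parse_horner and parse_with_powers are all assumptions made for illustration, and the sketch assumes at most five digits per value.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Left-to-right accumulation, as in the posted code.
std::uint_fast16_t parse_horner(std::string_view digits) {
    std::uint_fast16_t value = 0;
    for (const char digit : digits) {
        value = value * 10 + (digit - '0');
    }
    return value;
}

// Lookup-table variant (assumed interpretation). Digits are visited from
// right to left so that the power index grows by one with every step.
std::uint_fast16_t parse_with_powers(std::string_view digits) {
    static constexpr std::array<std::uint_fast16_t, 5> powers{
        1, 10, 100, 1000, 10000};
    std::uint_fast16_t value = 0;
    std::size_t current_power = 0;
    for (auto it = digits.rbegin(); it != digits.rend(); ++it) {
        value += (*it - '0') * powers[current_power];
        ++current_power;
    }
    return value;
}
```

Both functions return the same result for the same well-formed input, so they can be swapped in a benchmark to compare timings on a given machine.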