
Is there an efficient way to update or delete a specific row in a CSV file? Every method I've found involves reading the entire file, writing a temporary file, and then replacing the old file with it. Say I have a big CSV with 10,000 records; that kind of solution seems rather resource-heavy. Assume I can't use a database, so writing to a file is my only way of storing data. What would be the most efficient way to do it? Thank you in advance!

  • Have you tried anything yet to achieve the needed functionality? Commented Feb 16, 2016 at 18:26
  • That's what fixed-width files are for; there's no way, in general, to update a single row in a CSV file without rewriting the entire file, since the new row may be a different length than the original. Commented Feb 16, 2016 at 18:28
  • A database is just a program. A program that does things like this. If each line in the CSV is the same length, you can index right where you need to go with a seek system call and then overwrite the fields you need to overwrite then flush to disk. Otherwise, if the length of the line changes, you'll need to rewrite the entire remainder of the file because files are just arrays of bytes (from an end-user perspective). Commented Feb 16, 2016 at 18:31
  • Now, under the hood the filesystem driver probably deals with files in terms of fixed-size memory pages (probably 4K). You could have a special "blank" character (like null) which you write whenever you erase something, so that you don't need to rewrite the remainder of the file. But that would create internal fragmentation issues, and at that point you'd just be implementing your own abstraction of a filesystem within the file. Unless you carefully handled that empty space and defragmented once in a while, performance would degrade miserably over time. Commented Feb 16, 2016 at 18:35
  • If you're thinking "why can't I just use a filesystem with variable-sized pages, which gives me control over which pages I modify?", then you'd get external fragmentation. Also, that'd be an awful filesystem. You could, however, break your CSV up into multiple files so that you only need to rewrite one when you change a row. Again, that'd create internal fragmentation within the filesystem (unless all of your subfiles happened to be perfectly aligned to FS page boundaries), but it would remain performant. Commented Feb 16, 2016 at 18:38

4 Answers


You're going to have to read the entire file. Sorry, no way around that. A CSV is a single, flat text file with variable-length fields and rows.

You definitely shouldn't be working directly with a CSV for database operations. You ought to pull the data into a database to work with it, then output it back to CSV when you're done.

You don't mention why you can't use a database, so I'm going to guess it's a resource issue; you also don't say why you don't want to rewrite the file, so I'm going to guess it's performance. You could cache a number of operations and perform them all at once, but you're not going to get away from rewriting all, or at least some portion, of the file.




Consider reading the CSV line by line into a multi-dimensional array, making your changes when you reach the target row, and then exporting the array data back out to CSV. The example below modifies the 100th row, assuming a 6-column comma-delimited CSV file (columns 0-5).

If you want to delete the row instead, exclude it from the $newdata array by conditionally skipping to the next loop iteration with continue. If you want to update it, simply set the current inner array $newdata[$i] to the new values:

$i = 0;
$newdata = [];
$handle = fopen("OldFile.csv", "r");

// READ CSV
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {

    // UPDATE 100TH ROW (TO DELETE IT INSTEAD, KEEP ONLY $i++ AND continue)
    if ($i == 99) {
        $newdata[$i] = ['somenewvalue', 'somenewvalue', 'somenewvalue',
                        'somenewvalue', 'somenewvalue', 'somenewvalue'];
        $i++;
        continue;
    }
    // COPY ROW UNCHANGED
    $newdata[$i] = $data;
    $i++;
}
fclose($handle);

// EXPORT CSV
$fp = fopen('NewFile.csv', 'w');
foreach ($newdata as $row) {
    fputcsv($fp, $row);
}
fclose($fp);


It's working, thanks, but what if I want to update only the first value? My file looks like 7 followed by two line breaks and then 0.2,0.6,12, and I just want to update the 7.
@Pardeep: IF (and it's a big IF!) you rebuild your CSV file to have all fixed-length records (for example, text with space padding that you later trim), so that all your records are the SAME length, AND you know all the delimiters and EOL characters, then it should be possible to ignore the fact that it's a CSV file, open the file for read/write as a binary file, seek to the desired record (total line size x recordNum-1), and write the new data there. It's obviously very error-prone though. In the end I think you'll wish you didn't go down this road (speaking from experience here :-)
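The fixed-length-record trick described in the comment above can be sketched as follows. This is a minimal sketch, not production code: the file name, record length, and field layout are all assumptions, and every record must be padded to exactly the same byte length for the seek arithmetic to work.

```php
<?php
// Every record in records.dat is padded to exactly $recLen bytes
// (including the trailing "\n"), so record N starts at byte N * $recLen.
$recLen = 32;             // fixed record length, chosen when the file is built
$file   = 'records.dat';  // hypothetical fixed-width data file
$rowNum = 99;             // zero-based record to overwrite

// Build a sample file with 200 padded records so the sketch is self-contained.
$fp = fopen($file, 'w');
for ($i = 0; $i < 200; $i++) {
    fwrite($fp, str_pad("row$i,0.2,0.6,12", $recLen - 1) . "\n");
}
fclose($fp);

// Overwrite one record in place: seek, write exactly $recLen bytes, done.
// No other byte of the file is touched.
$fp = fopen($file, 'r+');  // read/write without truncating
fseek($fp, $rowNum * $recLen);
fwrite($fp, str_pad("updated,9.9,9.9,99", $recLen - 1) . "\n");
fclose($fp);
```

The trade-off is exactly the one the comment warns about: any value longer than its padded slot corrupts the record, so the fixed width has to be chosen generously up front.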

Break the CSV into multiple files all in one directory. That way you still have to rewrite files, but you don't have to rewrite nearly as much.
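That idea can be sketched like this (a minimal sketch; the chunk size, file-naming scheme, and helper function are all assumptions): keep a fixed number of rows per chunk file, so updating row N only rewrites the one small chunk that contains it.

```php
<?php
// Hypothetical layout: chunk_0.csv holds rows 0-999, chunk_1.csv rows 1000-1999, etc.
function updateRow(int $rowNum, array $newFields, int $chunkSize = 1000): void {
    $chunkFile = sprintf('chunk_%d.csv', intdiv($rowNum, $chunkSize));
    $localRow  = $rowNum % $chunkSize;

    // Read only this chunk into memory, patch one row, write the chunk back.
    $rows = array_map('str_getcsv', file($chunkFile, FILE_IGNORE_NEW_LINES));
    $rows[$localRow] = $newFields;

    $fp = fopen($chunkFile, 'w');
    foreach ($rows as $row) {
        fputcsv($fp, $row);
    }
    fclose($fp);
}
```

With 10,000 rows and 1,000-row chunks, an update rewrites roughly a tenth of the data instead of all of it.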



A bit late, but for people who may be searching for the same thing: you could load your CSV into SQLite, which additionally gives you the ability to search within the dataset. There is some sample code here: Import CSV File into a SQLite Database via PHP
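A minimal sketch of that approach using PDO's SQLite driver (the table name, column names, and sample file here are assumptions; the linked article shows a fuller version):

```php
<?php
// Sample CSV so the sketch is self-contained.
file_put_contents('OldFile.csv', "1,0.2,12\n2,0.6,13\n3,0.9,14\n");

// Load the CSV into an SQLite table (in-memory here; use a file path to persist).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE rows (a TEXT, b TEXT, c TEXT)');

$stmt = $db->prepare('INSERT INTO rows (a, b, c) VALUES (?, ?, ?)');
$handle = fopen('OldFile.csv', 'r');
$db->beginTransaction();  // one transaction makes bulk inserts far faster
while (($data = fgetcsv($handle)) !== false) {
    $stmt->execute($data);
}
$db->commit();
fclose($handle);

// Updates and deletes are now single statements instead of file rewrites,
// and you can search the data with WHERE clauses.
$db->exec("UPDATE rows SET a = '7' WHERE rowid = 1");
```

This requires the pdo_sqlite extension, which is bundled with most PHP installations.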

