
I'm trying to save binary data from a file, metadata.mfs, into a Postgres database, then read the saved binary data back from the database and write it to a file. For illustration, let's say the output file is also named metadata.mfs but lives in a different directory. When I run md5sum metadata.mfs on both files, I expect to see the same hash. (Essentially, I want whatever is saved into the database from file1 to be exactly the same as what I extract from the database into file2.)

Currently I am not able to achieve that.

Below is what I have so far:

#include <exception>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <pqxx/pqxx>

using namespace std;

// Read the whole file into a string, byte for byte.
string readFile2(const string &fileName)
{
    // ios::ate already opens the stream at the end; the seekg is kept for clarity.
    ifstream ifs(fileName.c_str(), ios::in | ios::binary | ios::ate);
    ifs.seekg(0, ios::end);

    ifstream::pos_type fileSize = ifs.tellg();
    ifs.seekg(0, ios::beg);

    vector<char> bytes(fileSize);
    ifs.read(bytes.data(), fileSize);
    // Debug output: raw file content followed by its size.
    cout.write(bytes.data(), bytes.size());
    cout << "\n";
    cout << fileSize;
    cout << "\n";
    return string(bytes.data(), fileSize);
}

int main() {
  string content;
  try {
    cout << "A1 \n";
    content = readFile2("/var/opt/lizardfs/lib/lizardfs/metadata.mfs");
    pqxx::connection c("postgresql://mark@localhost:26257/metadata");
    pqxx::nontransaction w(c);

    w.exec("CREATE TABLE IF NOT EXISTS binary (id INT PRIMARY KEY, meta bytea)");
    // The file content is escaped with esc_raw() and spliced into the query text.
    w.exec("INSERT INTO binary (id,meta) VALUES (18, '"+w.esc_raw(content)+"')");

    pqxx::result r = w.exec("SELECT meta FROM binary WHERE id='18'");
    std::ofstream outfile("metadata.mfs");
    for (auto row: r) {
        // Both of these emit the value as text, followed by a newline.
        cout << row[0] << endl;
        outfile << row[0] << endl;
    }
    outfile.close();
    w.commit();
  } catch (const std::exception &e) {
    cerr << e.what() << endl;
    return 1;
  }
  return 0;
}

The problem is that cout.write(bytes.data(),bytes.size()); prints exactly what I would see in the Linux terminal if I ran cat metadata.mfs, but cout << row[0] << endl prints everything in hex, i.e., \x4c495a4d20322e39000021....

I suspect this is because I am using w.esc_raw() on the binary content before inserting it into Postgres. Does that mean I need to unescape the data, i.e., by using w.unesc_raw(), after extracting it from the database? If so, how would I do that? I've been looking at the docs here: https://libpqxx.readthedocs.io/en/6.1.1/a00225.html
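
For reference, a value of the form \x4c49... is PostgreSQL's hex text format for bytea: a \x prefix followed by two hex digits per byte. The sketch below shows, in plain C++ without any libpqxx calls, what turning that text back into raw bytes involves and how the bytes could be written to a file without a trailing newline. The helper names hex_value, decode_bytea_hex and write_binary_file are hypothetical and not part of libpqxx. libpqxx 6.x also appears to provide a pqxx::binarystring class for this purpose, which would likely be preferable to hand-rolled decoding; treat that as something to verify against the docs.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: value of a single hex digit, throws on anything else.
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("not a hex digit");
}

// Hypothetical helper: decode bytea hex text such as "\x4c49" into raw bytes.
std::string decode_bytea_hex(const std::string &text) {
    if (text.size() < 2 || text[0] != '\\' || text[1] != 'x')
        throw std::invalid_argument("missing \\x prefix");
    if ((text.size() - 2) % 2 != 0)
        throw std::invalid_argument("odd number of hex digits");

    std::string bytes;
    bytes.reserve((text.size() - 2) / 2);
    for (std::size_t i = 2; i < text.size(); i += 2) {
        // Two hex digits form one byte; embedded NUL bytes are preserved.
        bytes.push_back(static_cast<char>(hex_value(text[i]) * 16 +
                                          hex_value(text[i + 1])));
    }
    return bytes;
}

// Hypothetical helper: write the bytes verbatim, in binary mode, with no
// trailing newline, so that md5sum sees exactly the decoded content.
void write_binary_file(const std::string &path, const std::string &bytes) {
    std::ofstream out(path, std::ios::out | std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("cannot open " + path);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    if (!out) throw std::runtime_error("write failed for " + path);
}
```

Inside the result loop, this could be used roughly as write_binary_file("metadata.mfs", decode_bytea_hex(row[0].c_str())), assuming the field text really is in the \x hex form shown above. Note that writing with << and endl, as in the loop above, would append a newline and change the hash even if the bytes were decoded correctly.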

  • Use placeholder values and skip this injection stuff entirely. In other words, this section of the docs. Commented Mar 28, 2021 at 23:34
  • I'll do that, but I don't think that is the source of my problem. It has to do with the data being binary, as I don't have a problem when the file content is just text. Commented Mar 28, 2021 at 23:38
  • Putting binary junk in the middle of a query is asking for trouble, if not misery. It's best avoided. Use placeholders. Package the data separately. This side-steps a lot of issues, especially UTF-8 encoding related. Is this a binary BLOB column? Commented Mar 28, 2021 at 23:39
  • Remember, if you're stuck and have literally no other option, then you must escape, and do it in a way that's compatible with your encoding. This is only relevant when making a query. Any results you get will be handed over by the driver in whatever form is appropriate. There is never a need to "unescape" it, at least not with any driver worth using. Commented Mar 28, 2021 at 23:40
  • Very similar to a question from a few hours ago. Can you explain the difference? stackoverflow.com/questions/66845340/… Commented Mar 29, 2021 at 0:00
