
I have quite a lot of data in JSON files that I need to convert into an SQLite database. I wrote a simple loop that runs an INSERT query for every record, but the queries take a lot of time. I am a hobbyist and I am learning all the time; I think there must be a more professional solution.

use strict;
use warnings;
use JSON::XS;
use DBI;

my $CCC = 'string';

# start connection to SQLite
my $dbh = DBI->connect(
  "dbi:SQLite:dbname=aaaa.db", "", "",
  { RaiseError => 1 },
) or die $DBI::errstr;

my $stmt = "CREATE TABLE IF NOT EXISTS $CCC (id INTEGER PRIMARY KEY AUTOINCREMENT, start INTEGER UNIQUE, open REAL NOT NULL, high REAL NOT NULL, low REAL NOT NULL, close REAL NOT NULL, vwp REAL NOT NULL, volume REAL NOT NULL, trades INTEGER NOT NULL)";
my $sth = $dbh->prepare( $stmt );
my $rv = $sth->execute() or die $DBI::errstr;

if($rv < 0) {
  print $DBI::errstr;
}

# Read file.json into a string
my $data;
open(my $fh, '<', 'file.json') or die "cannot open file: $!";
{
  local $/;
  $data = <$fh>;
}
close $fh;

# Decode the JSON text into Perl data structures
my $coder = JSON::XS->new->ascii->pretty->allow_nonref;
my $json = $coder->decode ($data);

# Loop: insert every hash ({}) from file.json with an INSERT statement
foreach (@{$json}) {
  my $stmt = "INSERT OR IGNORE INTO $CCC values (null, strftime('%s',\'$_->{T}\'), $_->{O}, $_->{H}, $_->{L}, $_->{C}, ".(($_->{H}+$_->{L}+$_->{C}+$_->{O}) / 4).", $_->{V}, 1)";
  my $sth = $dbh->prepare( $stmt );
  my $rv = $sth->execute() or die $DBI::errstr;

  if($rv < 0) {
    print $DBI::errstr;
  }
}

file.json:

[{"O": 1.0, "H": 1.0, "L": 0.00014, "C": 0.000145, "V": 176703.92394752, "T": "2018-02-16T00:00:00", "BV": 25.71390226}, {"O": 0.00014499, "H": 0.00014499, "L": 0.00011101, "C": 0.00012599, "V": 247646.2068748, "T": "2018-02-16T00:05:00", "BV": 30.66246148}, {"O": 0.00012599, "H": 0.0001295, "L": 0.000122, "C": 0.00012699, "V": 102563.86201627, "T": "2018-02-16T00:10:00", "BV": 12.88322597}]

Do you know of a more efficient way?

  • People have experimented on this kind of thing in the past, e.g. stackoverflow.com/questions/1711631/… Commented Apr 20, 2018 at 13:19
  • I had an answer here that solved this problem. Commented Apr 20, 2018 at 15:34
  • It works! The run with autocommit = 0 finished in a minute, while the other two attempts (without this parameter) had been running for 30 minutes and were still unfinished. Out of curiosity, I will wait for other ideas. Should there be a difference in performance between your loop and mine? Commented Apr 20, 2018 at 15:48
  • @J.Doe: Executing PRAGMA journal_mode = MEMORY should make a significant difference (see the sketch after these comments). Commented Apr 20, 2018 at 17:28
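The two suggestions in these comments (turning off autocommit and switching the journal mode) are plain DBI calls. A minimal sketch of how they could wrap the insert loop, assuming the $dbh handle from the question; the exact placement is an illustration, not code from the thread:

# Keep SQLite's rollback journal in memory instead of on disk
# (faster, at the cost of crash safety during the import).
$dbh->do("PRAGMA journal_mode = MEMORY");

# Disable autocommit so every INSERT joins one big transaction
# instead of forcing a disk sync per row.
$dbh->{AutoCommit} = 0;

# ... run the INSERT loop here ...

$dbh->commit;            # write the whole batch in one go
$dbh->{AutoCommit} = 1;  # restore the default behaviour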

1 Answer


I think you should have the statement and the prepare outside the foreach loop (using placeholders), and only the execute inside the loop. The prepare normally only needs to be done once. Instead of this:

foreach (@{$json}) {
  my $stmt = "INSERT OR IGNORE INTO $CCC values (null, strftime('%s',\'$_->{T}\'), $_->{O}, $_->{H}, $_->{L}, $_->{C}, ".(($_->{H}+$_->{L}+$_->{C}+$_->{O}) / 4).", $_->{V}, 1)";
  my $sth = $dbh->prepare( $stmt );
  my $rv = $sth->execute() or die $DBI::errstr;

  if($rv < 0) {
    print $DBI::errstr;
  }
}

Perhaps this:

my $stmt = "INSERT INTO $CCC values (?, ?, ?, ?, ?, ?, ?, ?, ?)";
my $sth = $dbh->prepare( $stmt );

foreach (@{$json}) {
    $sth->execute(undef, $_->{T}, $_->{O}, $_->{H}, $_->{L}, $_->{C}, ($_->{H}+$_->{L}+$_->{C}+$_->{O}) / 4, $_->{V}, 1) or die $DBI::errstr;
}

Update: placed undef as first parameter as suggested by Kjetil S.
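Combining this with the transaction trick from the comments should give the biggest win: prepare once, then run all the executes inside a single transaction. A sketch, assuming the $sth and $json from above:

$dbh->begin_work;    # one transaction for the whole import

foreach (@{$json}) {
    $sth->execute(undef, $_->{T}, $_->{O}, $_->{H}, $_->{L}, $_->{C},
                  ($_->{H}+$_->{L}+$_->{C}+$_->{O}) / 4, $_->{V}, 1)
        or die $DBI::errstr;
}

$dbh->commit;        # a single disk sync instead of one per row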


1 Comment

In execute you should probably replace 'null' with undef (without quotes).
