Processing data with Perl: using SELECT ... FOR UPDATE with MySQL

Question

I have a table that stores data that needs to be processed. The table has the columns id, status, and data. I'm currently selecting id, data where status = #, then doing an update immediately after the select to change the status # so that the row won't be selected again.

My program is multithreaded, and sometimes two threads grab the same id because they query the table at nearly the same time. I looked into SELECT ... FOR UPDATE; however, either I wrote the query wrong or I'm not understanding what it is used for.

My goal is to find a way of grabbing the id and data that I need and setting the status so that no other thread tries to grab and process the same data. Here is the code I tried. (I wrote it all together here for demonstration purposes. In the real program my prepares are done once at the beginning, not on every run, in case anyone was concerned about that.)

my $select = $db->prepare("SELECT id, data FROM `TestTable` WHERE _status=4 LIMIT ? FOR UPDATE") or die $DBI::errstr;
if ($select->execute($limit))
{
    while ($data = $select->fetchrow_hashref())
    {
        my $update_status = $db->prepare("UPDATE `TestTable` SET _status = ?, data = ? WHERE _id=?");
        $update_status->execute(10, "", $data->{_id});
        push(@array_hash, $data);
    }
}

When I run this with multiple threads, I get many duplicate inserts when I try to do an insert after processing my transaction data.

I'm not terribly familiar with MySQL, and the research I've done hasn't turned up anything that really cleared this up for me.

Thanks

Also, for this use case I could handle doing either an update or a delete on the table.
– user985590, Commented Oct 16, 2011 at 19:45

Richard Simões · Accepted Answer · 2011-10-16 22:50:37Z

1

As a sanity check, are you using InnoDB? MyISAM has zero transactional support, aside from faking it with full table locking.

I don't see where you're starting a transaction. MySQL's autocommit option is on by default, so starting a transaction and later committing would be necessary unless you turned off autocommit.
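A minimal sketch of what wrapping the select and the update in one transaction could look like with DBI. It assumes `$dbh` is a handle connected with `RaiseError => 1` and `AutoCommit => 1`, used by one thread only, and that the table is InnoDB. The column names `id`, `status`, and `data` follow the question's description (adjust them if the real columns are `_id` and `_status`), and `claim_rows` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Sketch only: $dbh is assumed to be a DBI handle created elsewhere with
# RaiseError => 1 and AutoCommit => 1. claim_rows is a hypothetical name.
sub claim_rows {
    my ($dbh, $limit) = @_;
    my @claimed;

    $dbh->begin_work;    # suspends AutoCommit until commit or rollback
    my $ok = eval {
        # Row locks taken by FOR UPDATE are held until the commit below.
        my $rows = $dbh->selectall_arrayref(
            'SELECT id, data FROM TestTable WHERE status = 4 LIMIT ? FOR UPDATE',
            { Slice => {} }, $limit,
        );
        for my $row (@$rows) {
            # Mark the row as taken while this transaction still holds its lock.
            $dbh->do('UPDATE TestTable SET status = ? WHERE id = ?',
                     undef, 10, $row->{id});
            push @claimed, $row;
        }
        $dbh->commit;    # ends the transaction and releases the row locks
        1;
    };
    unless ($ok) {
        my $err = $@ || 'unknown error';
        eval { $dbh->rollback };    # undo any partial status changes
        die "claim failed: $err";
    }
    return \@claimed;
}
```

Each thread would call `claim_rows($dbh, $limit)` on its own connection and then process the returned rows after the commit, so the locks are held only for the short claim step.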

answered Oct 16, 2011 at 22:50


3 Comments

user985590 Over a year ago

I'm using MySQL and autocommit is on. I don't understand what you mean by "I don't see where you're starting a transaction". Could you clarify that?

Richard Simões Over a year ago

You need to call START TRANSACTION and then COMMIT when you're done.

Richard Simões Over a year ago

Also, you say you're threading this. Are the SELECT ... FOR UPDATE and UPDATE queries occurring in the same thread, each thread with its own connection?

Joel Berger · Accepted Answer · 2011-10-16 23:35:12Z

0

It looks like you could simply rely on the database's locking mechanisms. I googled "perl dbi locking" and found this:

$dbh->do("LOCK TABLES foo WRITE, bar READ");
my $sth  = $dbh->prepare("SELECT x,y,z FROM bar");
my $sth2 = $dbh->prepare("INSERT INTO foo SET a = ?");
$sth->execute();    # run the SELECT before fetching from it
while (my @ary = $sth->fetchrow_array()) {
    $sth2->execute($ary[0]);
}
$sth2->finish();
$sth->finish();
$dbh->do("UNLOCK TABLES");

I'm not really saying GIYF, as I am also a fair novice at both MySQL and DBI, but perhaps you can find other answers that way.

Another option might be as follows, and it only works if you control all the code accessing the data. You can create a lock column in the table. When your code accesses the table, it does this (pseudocode):

if row.lock != 1 
  row.lock = 1
  read row
  update row
  row.lock = 0
  next
else 
  sleep 1
  redo

Again though, this trusts that all users/scripts that access this data will agree to follow this policy. If you cannot ensure that, then this won't work.
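Note that the check and the set in the pseudocode are two separate steps, so two threads could both see `lock != 1` before either one sets it. One way to close that gap, sketched here under assumptions, is to fold the test into the UPDATE's WHERE clause and look at how many rows it changed. The table name `TestTable`, a column literally named `lock` (backticked because LOCK is a reserved word in MySQL), and the helper names `try_lock_row` and `unlock_row` are all hypothetical.

```perl
use strict;
use warnings;

# Sketch only: $dbh is assumed to be a connected DBI handle created
# elsewhere. try_lock_row and unlock_row are hypothetical helper names.
sub try_lock_row {
    my ($dbh, $id) = @_;
    # Test-and-set in a single statement: the row only changes if lock
    # is still 0, so at most one caller sees one affected row.
    my $affected = $dbh->do(
        'UPDATE TestTable SET `lock` = 1 WHERE id = ? AND `lock` = 0',
        undef, $id,
    );
    # DBI returns "0E0" (true, but numerically zero) when no row matched.
    return defined $affected && $affected == 1;
}

sub unlock_row {
    my ($dbh, $id) = @_;
    $dbh->do('UPDATE TestTable SET `lock` = 0 WHERE id = ?', undef, $id);
    return;
}
```

A caller would read and update the row only when `try_lock_row` returns true, call `unlock_row` afterwards, and otherwise sleep and retry as in the pseudocode.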

Anyway, that's all the knowledge I have on the topic. Good luck!

edited Oct 16, 2011 at 23:35

answered Oct 16, 2011 at 22:37


2 Comments

user985590 Over a year ago

I don't know if locking the table would be the best choice. It would slow down other code that uses the same table. What I really need is to lock specific rows.

Joel Berger Over a year ago

@user985590, updated. It's not a great answer, but maybe in the absence of something better ...
