I have a PL/pgSQL function in PostgreSQL that goes over a large table and inserts a load of values into a different table. The output looks fine, with loads of rows apparently being inserted, but no values actually end up in the target table (the "resources" table in my code).

I have tried putting the insert statement inside a transaction, to no avail. Is there some access or permission setting that I am missing? I have found several examples on the web that do this the same way I am doing it, so I am pulling my hair out on this one...

Here is my function:

DECLARE
datatype_property record; 
property record;
new_resource_id bigint;
BEGIN  
    RAISE NOTICE 'Starting...';
    FOR datatype_property IN  
      SELECT * FROM datatype_properties
    LOOP  
        RAISE NOTICE 'Trying to insert';


        if not exists(select * from resources where uri = datatype_property.subject_resource) then
              SELECT INTO new_resource_id NEXTVAL('resources_id_seq');  
              INSERT INTO resources (id, uri) VALUES(  
                    new_resource_id,    
                    datatype_property.subject_resource
              );   
            RAISE NOTICE 'Inserted % with id %',datatype_property.subject_resource, new_resource_id;
        end if;
    END LOOP; 

    FOR property IN
        SELECT * FROM properties
    LOOP
        -- Insert the source URI if it is not yet present
        if not exists(select * from resources where uri = property.source_uri) then
            SELECT INTO new_resource_id NEXTVAL('resources_id_seq');
            INSERT INTO resources (id, uri) VALUES(
                new_resource_id,
                property.source_uri
            );
            RAISE NOTICE 'Inserted % with id %', property.source_uri, new_resource_id;
        end if;
        -- Insert the destination URI if it is not yet present
        if not exists(select * from resources where uri = property.destination_uri) then
            SELECT INTO new_resource_id NEXTVAL('resources_id_seq');
            INSERT INTO resources (id, uri) VALUES(
                new_resource_id,
                property.destination_uri
            );
            RAISE NOTICE 'Inserted % with id %', property.destination_uri, new_resource_id;
        end if;
    END LOOP;
 RETURN;  

END;

EDIT: I've activated the plpgsql language with the directions from the following link:

http://wiki.postgresql.org/wiki/CREATE_OR_REPLACE_LANGUAGE

EDIT 2:

this code:

DECLARE
datatype_property record; 
property record;
new_resource_id bigint;
BEGIN  

    insert into resources (id, uri) values ('3', 'www.google.com');
END

does not work either :O

  • Did you check the server log files? Maybe you are running out of memory, or you reached a configurable limit (table size, number of records, etc.). Commented Jul 20, 2012 at 13:58
  • Thanks, I'll give them a look. Commented Jul 20, 2012 at 13:59
  • Well, I've checked the postgres.log file and it contains only the output that I can see in the pgAdmin III query browser. Commented Jul 20, 2012 at 14:03
  • It looks like an uncommitted transaction. Some environments disable autocommit, and you have to commit explicitly. Commented Jul 20, 2012 at 14:16

1 Answer

Your problem does sound like you are not committing your transaction (as Pavel pointed out), or like the tool you use to check the rows is, for example, using REPEATABLE READ as its isolation level or doing some kind of caching.
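
As a minimal sketch of what committing explicitly looks like from the calling session: the function name populate_resources() is hypothetical (the question does not show the CREATE FUNCTION header), and the example assumes the client is not auto-committing.

```sql
-- populate_resources() is a hypothetical name for the function in the question.
BEGIN;

SELECT populate_resources();

-- Inside this session the new rows are already visible:
SELECT count(*) FROM resources;

-- Until this COMMIT runs, other sessions (e.g. a second query window)
-- do not see the inserted rows; a ROLLBACK or a dropped connection discards them.
COMMIT;
```

If the count is non-zero before the COMMIT but the rows are missing when checked from another connection, the transaction was most likely never committed.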

But your function isn't a good solution to begin with. Inserting rows one by one in a loop is always a bad idea. It will be much slower than doing a single insert (and will be less scalable).

If I'm not mistaken, the two loops can be rewritten into the following statements:

insert into resources (id, uri)
select NEXTVAL('resources_id_seq'),
       dt.subject_resource
from datatype_properties dt
where not exists (select 1
                  from resources r
                  where r.uri = dt.subject_resource);


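-- source and destination URIs are collected first, so that each missing
-- URI (from either column) is inserted exactly once
insert into resources (id, uri)
select nextval('resources_id_seq'),
       u.uri
from (select source_uri as uri from properties
      union
      select destination_uri from properties) u
where not exists (select 1
                  from resources r
                  where r.uri = u.uri);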
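
A possible variation, sketched under two assumptions that are not stated in the question: the server is PostgreSQL 9.5 or later (which added ON CONFLICT), and resources.uri carries a unique constraint or unique index. In that case the existence check can be left to the constraint:

```sql
-- Assumption: a unique constraint exists on resources.uri, for example
--   ALTER TABLE resources ADD CONSTRAINT resources_uri_key UNIQUE (uri);
-- (resources_uri_key is a hypothetical constraint name).
INSERT INTO resources (id, uri)
SELECT nextval('resources_id_seq'),
       u.uri
FROM (SELECT subject_resource AS uri FROM datatype_properties
      UNION
      SELECT source_uri FROM properties
      UNION
      SELECT destination_uri FROM properties) AS u
WHERE u.uri IS NOT NULL
ON CONFLICT (uri) DO NOTHING;  -- rows whose uri already exists are skipped
```

Note that nextval() is still evaluated for rows that end up being skipped, so this variant can leave gaps in the id sequence.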

4 Comments

Yeah. Thanks a lot for all your replies, guys. When the data was not being inserted, the first thing I tried was a commit, but all I got were syntax errors. In the meantime I have discovered that you cannot do an explicit commit in a PostgreSQL function! stackoverflow.com/questions/5448984/… As for the single insert, I am now using a small Java program to make the insertions, using batch Statements and executeBatch(), so it goes along with a_horse_with_no_name's reasoning. Thanks, response accepted!
The moral of the story is that you should leave stored procedures to DBMSs that really support them in full, e.g. Oracle or SQL Server (even though I love open-source solutions)...
@JoãoRochadaSilva: PostgreSQL does fully support stored procedures (or functions, that is). Apparently there is something in your environment that you are not telling us. The only difference is that Postgres requires the caller to handle the transaction. Btw: doing multiple inserts via batch will still be slower than a single insert as I have shown.
That's true, that's what I would do under normal circumstances. However, my batch insert is REALLY big (talk about 140 million rows), and is only run once. A single file containing all the text required for the insert would be unwieldy at best. About the "full support" statement, I was referring to the ability to call explicit commits whenever I feel like it (even though it may be wrong, but in real systems we all know that good practice is sometimes disregarded in favour of something that works).
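
For readers on newer servers: PostgreSQL 11 and later provide CREATE PROCEDURE, and a procedure invoked with CALL outside an explicit transaction block may issue COMMIT itself. The following is only a sketch of committing in batches; the procedure name, the batch size of 10000 and the use of DISTINCT are illustrative assumptions, not part of the original thread.

```sql
-- Assumes PostgreSQL 11+. populate_resources_batched is a hypothetical name.
CREATE PROCEDURE populate_resources_batched()
LANGUAGE plpgsql
AS $$
DECLARE
    datatype_property record;
    processed bigint := 0;
BEGIN
    FOR datatype_property IN
        SELECT DISTINCT subject_resource FROM datatype_properties
    LOOP
        -- Insert the URI only if it is not already present
        INSERT INTO resources (id, uri)
        SELECT nextval('resources_id_seq'), datatype_property.subject_resource
        WHERE NOT EXISTS (SELECT 1
                          FROM resources r
                          WHERE r.uri = datatype_property.subject_resource);

        processed := processed + 1;
        IF processed % 10000 = 0 THEN
            COMMIT;  -- allowed in a procedure; a new transaction starts automatically
        END IF;
    END LOOP;
END;
$$;

-- Must be run outside an explicit BEGIN ... COMMIT block for the COMMITs to work.
CALL populate_resources_batched();
```

This remains a row-by-row loop, so the single set-based insert from the answer is still expected to be faster.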
