
I couldn't find any examples online. What I'm trying to do is use Spring Batch (Java) to read a whole table in Postgres and then, for each row, publish that data elsewhere. I read https://spring.io/guides/gs/batch-processing/ but can't figure out how to do this. I also want to space out the data retrieval so my database doesn't get overloaded. There are a lot of examples that read from a CSV file, but I can't find how to read from a Repository.

2 Answers

1

To read the table, use one of the readers that Spring Batch provides: org.springframework.batch.item.data.RepositoryItemReader or org.springframework.batch.item.database.JdbcPagingItemReader.

Both readers implement pagination, so the database is read page by page rather than the whole table at once.

RepositoryItemReader has a setPageSize(int pageSize) method, and JdbcPagingItemReader has a similar one. Your table must have a column that rows can be ordered by, because pagination relies on a stable sort order.
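As a rough illustration of the two readers, here is a minimal configuration sketch. It assumes Spring Batch 4.x with Spring Data and spring-jdbc on the classpath; the table name my_table, the sort column id, the reader names, and the page size of 1000 are placeholders, not anything taken from the question.

```java
import java.util.Collections;
import java.util.Map;
import javax.sql.DataSource;

import org.springframework.batch.item.data.RepositoryItemReader;
import org.springframework.batch.item.data.builder.RepositoryItemReaderBuilder;
import org.springframework.batch.item.database.JdbcPagingItemReader;
import org.springframework.batch.item.database.Order;
import org.springframework.batch.item.database.builder.JdbcPagingItemReaderBuilder;
import org.springframework.data.domain.Sort;
import org.springframework.data.repository.PagingAndSortingRepository;
import org.springframework.jdbc.core.ColumnMapRowMapper;

public class PagingReaders {

    // Option 1: page through a Spring Data repository by calling findAll(Pageable).
    // "id" is an assumed sortable property on the entity.
    public static <T> RepositoryItemReader<T> repositoryReader(
            PagingAndSortingRepository<T, ?> repository) {
        return new RepositoryItemReaderBuilder<T>()
                .name("repositoryReader")
                .repository(repository)
                .methodName("findAll")
                .sorts(Collections.singletonMap("id", Sort.Direction.ASC))
                .pageSize(1000)
                .build();
    }

    // Option 2: page through the table with plain JDBC.
    // Each row is mapped to a column-name -> value map to keep the sketch generic.
    public static JdbcPagingItemReader<Map<String, Object>> jdbcReader(DataSource dataSource) {
        return new JdbcPagingItemReaderBuilder<Map<String, Object>>()
                .name("jdbcPagingReader")
                .dataSource(dataSource)
                .selectClause("SELECT *")
                .fromClause("FROM my_table")                               // placeholder table
                .sortKeys(Collections.singletonMap("id", Order.ASCENDING)) // placeholder sort column
                .rowMapper(new ColumnMapRowMapper())
                .pageSize(1000)
                .build();
    }
}
```

In a real job you would expose one of these as a @Bean (usually step-scoped) and map rows to your own type instead of a Map.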

Try to find code examples using these two readers.

These readers read one page, keep it in memory, and hand items over one at a time until the chunk size is reached, at which point the chunk is written and the transaction is committed. The next DB read doesn't happen until the current page is used up. Generally, for good performance, the chunk size should be a few times smaller than the page size, e.g. reader page size = 1000 and chunk size = 100, so that 1000 items are read in one query and committed in chunks of 100 items each.

The next DB read happens once all 1000 previously read items have been passed to the processor.
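To make the page size vs. chunk size interaction concrete, here is a small plain-Java model that just counts page queries and commits. It is a simplified simulation written for illustration, not Spring Batch code; the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

final class PagingChunkSketch {

    /**
     * Simulates a paged reader feeding a chunk-oriented step.
     * Returns {number of page queries, number of commits}.
     */
    static int[] simulate(int totalRows, int pageSize, int chunkSize) {
        int pageQueries = 0;
        int commits = 0;
        int nextRow = 0;                       // next row to fetch from the "table"
        List<Integer> page = new ArrayList<>();
        int posInPage = 0;
        List<Integer> chunk = new ArrayList<>();

        while (true) {
            if (posInPage >= page.size()) {    // current page is used up
                if (nextRow >= totalRows) {
                    break;                     // table exhausted
                }
                page.clear();
                for (int i = 0; i < pageSize && nextRow < totalRows; i++) {
                    page.add(nextRow++);
                }
                pageQueries++;                 // one query per page
                posInPage = 0;
            }
            chunk.add(page.get(posInPage++));  // hand over a single item
            if (chunk.size() == chunkSize) {   // chunk full: write and commit
                commits++;
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {
            commits++;                         // final partial chunk
        }
        return new int[] {pageQueries, commits};
    }

    public static void main(String[] args) {
        int[] r = simulate(2500, 1000, 100);
        // prints "page queries=3, commits=25"
        System.out.println("page queries=" + r[0] + ", commits=" + r[1]);
    }
}
```

With 2500 rows, a page size of 1000 and a chunk size of 100, the model issues 3 queries and 25 commits, which matches the description above: one query per page, one commit per chunk.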

then for each row, publish that data elsewhere

To accomplish the above, set the chunk size to one and do whatever you need in your writer; that way the transaction is committed after each item.
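A step with a chunk size of one could look roughly like the sketch below. It assumes Spring Batch 4.x (StepBuilderFactory is deprecated in 5.x); the step name publishStep and the Publisher interface are hypothetical stand-ins for wherever the data needs to go.

```java
import java.util.Map;

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;

public class PublishStepConfig {

    // Hypothetical target for the rows (a message queue, an HTTP API, ...).
    public interface Publisher {
        void publish(Map<String, Object> row);
    }

    // The writer receives the current chunk; with chunk(1) that is a single row.
    public static ItemWriter<Map<String, Object>> publishingWriter(Publisher publisher) {
        return items -> items.forEach(publisher::publish);
    }

    public static Step publishStep(StepBuilderFactory steps,
                                   ItemReader<Map<String, Object>> reader,
                                   ItemWriter<Map<String, Object>> writer) {
        return steps.get("publishStep")
                .<Map<String, Object>, Map<String, Object>>chunk(1) // commit after every item
                .reader(reader)
                .writer(writer)
                .build();
    }
}
```

Committing per item trades throughput for simplicity; if publishing in small batches is acceptable, a larger chunk size reduces the number of transactions.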


2 Comments

org.springframework.batch.item.database.JdbcCursorItemReader is another option. In the case of Postgres, remember to turn autocommit off on your connection, otherwise you get no cursor :)
Yes, that is indeed an option, and I deliberately didn't mention it because newcomers usually try to read the whole table and then run into OOM errors on large tables. With this reader, either the SQL needs to limit the rows it fetches or a fetch size needs to be specified (which is not guaranteed to be honored).
0

Couldn't find any examples online

Have you seen the official samples here: https://github.com/spring-projects/spring-batch/tree/master/spring-batch-samples ?

There are many examples that show how to read data from a database:

what I'm trying to do is basically use Java Spring Batch to read in a whole table in postgres and then for each row, publish that data elsewhere.

All the jobs in the previous samples have at least one step that reads data from a database and writes it elsewhere.

I also want to space out the data retrieval so my database doesn't get blocked up

I would recommend using one of the paging item readers (see https://docs.spring.io/spring-batch/4.0.x/reference/html/readersAndWriters.html#pagingItemReaders) to read data in pages and not open a cursor on the whole table.
