1

I am reading 5gb of XML file using the below code and processing that data into Database using spring dataJpa, this below is just sample logic we are closing the inpustream and aswell xsr object.

 XMLInputFactory xf=XMLInputFactory.newInstance();
 XMLStreamReader xsr=xf.createXMLStreamReader(new InputStreamReader(new FileInputStream("test.xml"))

i have configured the max 8GB(ie -xms7000m and -xmx8000m) of heap memory But its getting the below hibernate heap issue when saving the data. it inserted around 700000 data total of 2100000

[dispatcherServlet]    : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Handler dispatch failed; nested exception is java.lang.OutOfMemoryError: Java heap space] with root cause

java.lang.OutOfMemoryError: Java heap space
    at java.base/java.util.IdentityHashMap.resize(IdentityHashMap.java:472) ~[na:na]
    at java.base/java.util.IdentityHashMap.put(IdentityHashMap.java:441) ~[na:na]
    at org.hibernate.event.internal.DefaultPersistEventListener.entityIsPersistent(DefaultPersistEventListener.java:159) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.event.internal.DefaultPersistEventListener.onPersist(DefaultPersistEventListener.java:124) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.internal.SessionImpl$$Lambda$1620/0x00000008010a3040.applyEventToListener(Unknown Source) ~[na:na]
    at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:113) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.internal.SessionImpl.persistOnFlush(SessionImpl.java:765) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.engine.spi.CascadingActions$8.cascade(CascadingActions.java:341) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.engine.internal.Cascade.cascadeToOne(Cascade.java:492) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.engine.internal.Cascade.cascadeAssociation(Cascade.java:416) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.engine.internal.Cascade.cascadeProperty(Cascade.java:218) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.engine.internal.Cascade.cascadeCollectionElements(Cascade.java:525) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.engine.internal.Cascade.cascadeCollection(Cascade.java:456) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.engine.internal.Cascade.cascadeAssociation(Cascade.java:419) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.engine.internal.Cascade.cascadeProperty(Cascade.java:218) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.engine.internal.Cascade.cascade(Cascade.java:151) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.event.internal.AbstractFlushingEventListener.cascadeOnFlush(AbstractFlushingEventListener.java:158) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.event.internal.AbstractFlushingEventListener.prepareEntityFlushes(AbstractFlushingEventListener.java:148) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.event.internal.AbstractFlushingEventListener.flushEverythingToExecutions(AbstractFlushingEventListener.java:81) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:39) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.internal.SessionImpl$$Lambda$1597/0x0000000801076040.accept(Unknown Source) ~[na:na]
    at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:102) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.internal.SessionImpl.doFlush(SessionImpl.java:1362) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.internal.SessionImpl.managedFlush(SessionImpl.java:453) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.internal.SessionImpl.flushBeforeTransactionCompletion(SessionImpl.java:3212) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.internal.SessionImpl.beforeTransactionCompletion(SessionImpl.java:2380) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.beforeTransactionCompletion(JdbcCoordinatorImpl.java:447) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl.beforeCompletionCallback(JdbcResourceLocalTransactionCoordinatorImpl.java:183) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl.access$300(JdbcResourceLocalTransactionCoordinatorImpl.java:40) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.commit(JdbcResourceLocalTransactionCoordinatorImpl.java:281) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.hibernate.engine.transaction.internal.TransactionImpl.commit(TransactionImpl.java:101) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
    at org.springframework.orm.jpa.JpaTransactionManager.doCommit(JpaTransactionManager.java:534) ~[spring-orm-5.2.10.RELEASE.jar:5.2.10.RELEASE]

As per the above trace logs it seems some issue with hibernate save casacade , but not able to figured out , below are the entity classes which uses to save the data in databse.

@Data
@EqualsAndHashCode
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
public class UMEnityPK implements Serializable {
    private static final long serialVersionUID=1L;

    private String batchId;
    private Long batchVersion;
    private BigInteger umId;
}

@Data
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@EqualsAndHashCode(of = {"batchId", "batchVersion", "umId"})
@Entity
@Table(name ="um_base")
@IdClass(UMEnityPK.class)
public class UMBase {

    @Id private String batchId;
    @Id private Long batchVersion;
    @Id private BigInteger umId;

    private String firstName;
    private String lastName;
    private String umType;
    private String umLevel;

    @OneToMany(mappedBy = "umBase", cascade = CascadeType.ALL)
    private List<UMAddress> umAddresses;

    @OneToMany(mappedBy = "umBase", cascade = CascadeType.ALL)
    private List<UMIdentifier> umIdentifiers;

    @OneToOne(mappedBy = "umBase", cascade = CascadeType.ALL)
    private UMHierarchy umHierarchy;
}

@Data
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@EqualsAndHashCode(of = {"id"})
@Entity
@Table(name = "um_identifier")
public class UMIdentifier {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "um_address")
    @SequenceGenerator(name = "um_address", sequenceName = "SEQ_UM_ADDRESS", allocationSize = 1)
    private Long id;

    private String idValue;
    private String idType;
    private String groupType;

    @ManyToOne
    @JoinColumns({
            @JoinColumn(name = "BATCH_ID", referencedColumnName = "batchId"),
            @JoinColumn(name = "BATCH_VERSION", referencedColumnName = "batchVersion"),
            @JoinColumn(name = "UM_ID", referencedColumnName = "umId")
    })
    private UMBase umBase;
}

@Data
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@EqualsAndHashCode(of = {"batchId", "batchVersion", "umId"})
@Entity
@Table(name ="um_hierarchy")
public class UMHierarchy {

    @Id
    private String batchId;
    @Id private Long batchVersion;
    @Id private BigInteger umId;

    private String hierarchyTpe;
    private String umStatusCode;
    private String immediateParentId;
    private Date hierarchyDate;

    @OneToOne(cascade = CascadeType.ALL)
    @JoinColumns({
            @JoinColumn(name = "BATCH_ID", referencedColumnName = "batchId"),
            @JoinColumn(name = "BATCH_VERSION", referencedColumnName = "batchVersion"),
            @JoinColumn(name = "UM_ID", referencedColumnName = "umId")
    })
    private UMBase umBase;
}

@Data
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@EqualsAndHashCode(of = {"id"})
@Entity
@Table(name = "um_address")
public class UMAddress {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "um_address")
    @SequenceGenerator(name = "um_address", sequenceName = "SEQ_UM_ADDRESS", allocationSize = 1)
    private Long id;
    private String addressType;
    private String addressLine1;
    private String getAddressLine2;
    private String city;
    private String state;
    private String postalCode;
    private String country;

    @ManyToOne
    @JoinColumns({
            @JoinColumn(name = "BATCH_ID", referencedColumnName = "batchId"),
            @JoinColumn(name = "BATCH_VERSION", referencedColumnName = "batchVersion"),
            @JoinColumn(name = "UM_ID", referencedColumnName = "umId")
    })
    private UMBase umBase;
}

Is there any issues with hibernate entity mapping which eating of the memory

2 Answers 2

1

When dealing with converting datasets that big you need to do that in batches. Read 100 records from the xml, convert them to entities, save each of them with em.persist(record), then call em.flush() and em.clear() to remove them from Hibernate, then clear them from your local collection, then manually invoke the garbage collector using System.gc(). You may even want to use Hibernate's batch processing as described in this tutorial.

In pseudocode this would be:

boolean finished = false;
List<Entity> locals = new ArrayList<>(100);
while (!finished) {
  for (int records = 0; records < 100; records++) {
    Entity ent = readEntityFrom(xml);
    // readEntity function must return null when no more remain to read
    if (ent == null) {
      finished = true;
      break;
    }
    locals.add(ent);
  }
  for (Entity ent : locals) em.persist(ent);
  em.flush(); // send any that are still waiting to the database
  em.clear(); // remove references Hibernate holds to these entities
  locals.clear(); // remove references we hold to these entities
  // now all these entity references are weak and can be garbage-collected
  System.gc(); // purge them from memory
}

Also you may want to manually begin and commit a transaction around each insert loop to ensure the database isn't holding everything for your entire import, or it might run out of memory instead of your java application.

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks for the response @coladict , from XML we are reading one by one record and that record converted to entity , that entity adding in List<UMbase> entity and if list size is greater than 5000 then call saveAll(listofentities) method of JPAReposoties , db called happend after 5000 record only ,and remaining record inserted later, is JPAReposoties internally will not do the above logic like clearing and flusing the entity manager.
1

After checking the heap dump, the issue is with the org.hibernate.engine.StatefulPersistenceContext -> org.hibernate.util.IdentityMap memory leak , so used the below way and worked fine , create a custom JPARepository and have below sample method logic.

public <S extends T> void saveInBatch(Iterable<S> entities) {

        if (entities == null) {
            return;
        }

        EntityManager entityManager = entityManagerFactory.createEntityManager();
        EntityTransaction entityTransaction = entityManager.getTransaction();

        try {
            entityTransaction.begin();

            int i = 0;
            for (S entity : entities) {
                if (i % batchSize == 0 && i > 0) {
                    entityTransaction.commit();
                    entityTransaction.begin();

                    entityManager.clear();
                }

                entityManager.persist(entity);
                i++;
            }
            entityTransaction.commit();
        } catch (RuntimeException e) {
            if (entityTransaction.isActive()) {
                entityTransaction.rollback();
            }

            throw e;
        } finally {
            entityManager.close();
        }
    }
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.