[java] JPA eager fetch does not join

What exactly does JPA's fetch strategy control? I can't detect any difference between eager and lazy. In both cases JPA/Hibernate does not automatically join many-to-one relationships.

Example: Person has a single address. An address can belong to many people. The JPA annotated entity classes look like:

@Entity
public class Person {
    @Id
    public Integer id;

    public String name;

    @ManyToOne(fetch=FetchType.LAZY or EAGER)
    public Address address;
}

@Entity
public class Address {
    @Id
    public Integer id;

    public String name;
}

If I use the JPA query:

select p from Person p where ...

JPA/Hibernate generates one SQL query to select from Person table, and then a distinct address query for each person:

select ... from Person where ...
select ... from Address where id=1
select ... from Address where id=2
select ... from Address where id=3

This is very bad for large result sets. If there are 1000 people it generates 1001 queries (1 from Person and 1000 distinct from Address). I know this because I'm looking at MySQL's query log. It was my understanding that setting address's fetch type to eager will cause JPA/Hibernate to automatically query with a join. However, regardless of the fetch type, it still generates distinct queries for relationships.

Only when I explicitly tell it to join does it actually join:

select p, a from Person p left join p.address a where ...

Am I missing something here? I now have to hand code every query so that it left joins the many-to-one relationships. I'm using Hibernate's JPA implementation with MySQL.

Edit: It appears (see Hibernate FAQ here and here) that FetchType does not impact JPA queries. So in my case I have explicitly tell it to join.

This question is related to java hibernate jpa join

The answer is


If you use EclipseLink instead of Hibernate you can optimize your queries by "query hints". See this article from the Eclipse Wiki: EclipseLink/Examples/JPA/QueryOptimization.

There is a chapter about "Joined Reading".


to join you can do multiple things (using eclipselink)

  • in jpql you can do left join fetch

  • in named query you can specify query hint

  • in TypedQuery you can say something like

    query.setHint("eclipselink.join-fetch", "e.projects.milestones");

  • there is also batch fetch hint

    query.setHint("eclipselink.batch", "e.address");

see

http://java-persistence-performance.blogspot.com/2010/08/batch-fetching-optimizing-object-graph.html


Two things occur to me.

First, are you sure you mean ManyToOne for address? That means multiple people will have the same address. If it's edited for one of them, it'll be edited for all of them. Is that your intent? 99% of the time addresses are "private" (in the sense that they belong to only one person).

Secondly, do you have any other eager relationships on the Person entity? If I recall correctly, Hibernate can only handle one eager relationship on an entity but that is possibly outdated information.

I say that because your understanding of how this should work is essentially correct from where I'm sitting.


Try with:

select p from Person p left join FETCH p.address a where...

It works for me in a similar with JPA2/EclipseLink, but it seems this feature is present in JPA1 too:


I had exactly this problem with the exception that the Person class had a embedded key class. My own solution was to join them in the query AND remove

@Fetch(FetchMode.JOIN)

My embedded id class:

@Embeddable
public class MessageRecipientId implements Serializable {

    @ManyToOne(targetEntity = Message.class, fetch = FetchType.LAZY)
    @JoinColumn(name="messageId")
    private Message message;
    private String governmentId;

    public MessageRecipientId() {
    }

    public Message getMessage() {
        return message;
    }

    public void setMessage(Message message) {
        this.message = message;
    }

    public String getGovernmentId() {
        return governmentId;
    }

    public void setGovernmentId(String governmentId) {
        this.governmentId = governmentId;
    }

    public MessageRecipientId(Message message, GovernmentId governmentId) {
        this.message = message;
        this.governmentId = governmentId.getValue();
    }

}

The fetchType attribute controls whether the annotated field is fetched immediately when the primary entity is fetched. It does not necessarily dictate how the fetch statement is constructed, the actual sql implementation depends on the provider you are using toplink/hibernate etc.

If you set fetchType=EAGER This means that the annotated field is populated with its values at the same time as the other fields in the entity. So if you open an entitymanager retrieve your person objects and then close the entitymanager, subsequently doing a person.address will not result in a lazy load exception being thrown.

If you set fetchType=LAZY the field is only populated when it is accessed. If you have closed the entitymanager by then a lazy load exception will be thrown if you do a person.address. To load the field you need to put the entity back into an entitymangers context with em.merge(), then do the field access and then close the entitymanager.

You might want lazy loading when constructing a customer class with a collection for customer orders. If you retrieved every order for a customer when you wanted to get a customer list this may be a expensive database operation when you only looking for customer name and contact details. Best to leave the db access till later.

For the second part of the question - how to get hibernate to generate optimised SQL?

Hibernate should allow you to provide hints as to how to construct the most efficient query but I suspect there is something wrong with your table construction. Is the relationship established in the tables? Hibernate may have decided that a simple query will be quicker than a join especially if indexes etc are missing.


"mxc" is right. fetchType just specifies when the relation should be resolved.

To optimize eager loading by using an outer join you have to add

@Fetch(FetchMode.JOIN)

to your field. This is a hibernate specific annotation.


JPA doesn't provide any specification on mapping annotations to select fetch strategy. In general, related entities can be fetched in any one of the ways given below

  • SELECT => one query for root entities + one query for related mapped entity/collection of each root entity = (n+1) queries
  • SUBSELECT => one query for root entities + second query for related mapped entity/collection of all root entities retrieved in first query = 2 queries
  • JOIN => one query to fetch both root entities and all of their mapped entity/collection = 1 query

So SELECT and JOIN are two extremes and SUBSELECT falls in between. One can choose suitable strategy based on her/his domain model.

By default SELECT is used by both JPA/EclipseLink and Hibernate. This can be overridden by using:

@Fetch(FetchMode.JOIN) 
@Fetch(FetchMode.SUBSELECT)

in Hibernate. It also allows to set SELECT mode explicitly using @Fetch(FetchMode.SELECT) which can be tuned by using batch size e.g. @BatchSize(size=10).

Corresponding annotations in EclipseLink are:

@JoinFetch
@BatchFetch

If you use EclipseLink instead of Hibernate you can optimize your queries by "query hints". See this article from the Eclipse Wiki: EclipseLink/Examples/JPA/QueryOptimization.

There is a chapter about "Joined Reading".


Try with:

select p from Person p left join FETCH p.address a where...

It works for me in a similar with JPA2/EclipseLink, but it seems this feature is present in JPA1 too:


Two things occur to me.

First, are you sure you mean ManyToOne for address? That means multiple people will have the same address. If it's edited for one of them, it'll be edited for all of them. Is that your intent? 99% of the time addresses are "private" (in the sense that they belong to only one person).

Secondly, do you have any other eager relationships on the Person entity? If I recall correctly, Hibernate can only handle one eager relationship on an entity but that is possibly outdated information.

I say that because your understanding of how this should work is essentially correct from where I'm sitting.


"mxc" is right. fetchType just specifies when the relation should be resolved.

To optimize eager loading by using an outer join you have to add

@Fetch(FetchMode.JOIN)

to your field. This is a hibernate specific annotation.


Two things occur to me.

First, are you sure you mean ManyToOne for address? That means multiple people will have the same address. If it's edited for one of them, it'll be edited for all of them. Is that your intent? 99% of the time addresses are "private" (in the sense that they belong to only one person).

Secondly, do you have any other eager relationships on the Person entity? If I recall correctly, Hibernate can only handle one eager relationship on an entity but that is possibly outdated information.

I say that because your understanding of how this should work is essentially correct from where I'm sitting.


to join you can do multiple things (using eclipselink)

  • in jpql you can do left join fetch

  • in named query you can specify query hint

  • in TypedQuery you can say something like

    query.setHint("eclipselink.join-fetch", "e.projects.milestones");

  • there is also batch fetch hint

    query.setHint("eclipselink.batch", "e.address");

see

http://java-persistence-performance.blogspot.com/2010/08/batch-fetching-optimizing-object-graph.html


The fetchType attribute controls whether the annotated field is fetched immediately when the primary entity is fetched. It does not necessarily dictate how the fetch statement is constructed, the actual sql implementation depends on the provider you are using toplink/hibernate etc.

If you set fetchType=EAGER This means that the annotated field is populated with its values at the same time as the other fields in the entity. So if you open an entitymanager retrieve your person objects and then close the entitymanager, subsequently doing a person.address will not result in a lazy load exception being thrown.

If you set fetchType=LAZY the field is only populated when it is accessed. If you have closed the entitymanager by then a lazy load exception will be thrown if you do a person.address. To load the field you need to put the entity back into an entitymangers context with em.merge(), then do the field access and then close the entitymanager.

You might want lazy loading when constructing a customer class with a collection for customer orders. If you retrieved every order for a customer when you wanted to get a customer list this may be a expensive database operation when you only looking for customer name and contact details. Best to leave the db access till later.

For the second part of the question - how to get hibernate to generate optimised SQL?

Hibernate should allow you to provide hints as to how to construct the most efficient query but I suspect there is something wrong with your table construction. Is the relationship established in the tables? Hibernate may have decided that a simple query will be quicker than a join especially if indexes etc are missing.


I had exactly this problem with the exception that the Person class had a embedded key class. My own solution was to join them in the query AND remove

@Fetch(FetchMode.JOIN)

My embedded id class:

@Embeddable
public class MessageRecipientId implements Serializable {

    @ManyToOne(targetEntity = Message.class, fetch = FetchType.LAZY)
    @JoinColumn(name="messageId")
    private Message message;
    private String governmentId;

    public MessageRecipientId() {
    }

    public Message getMessage() {
        return message;
    }

    public void setMessage(Message message) {
        this.message = message;
    }

    public String getGovernmentId() {
        return governmentId;
    }

    public void setGovernmentId(String governmentId) {
        this.governmentId = governmentId;
    }

    public MessageRecipientId(Message message, GovernmentId governmentId) {
        this.message = message;
        this.governmentId = governmentId.getValue();
    }

}

The fetchType attribute controls whether the annotated field is fetched immediately when the primary entity is fetched. It does not necessarily dictate how the fetch statement is constructed, the actual sql implementation depends on the provider you are using toplink/hibernate etc.

If you set fetchType=EAGER This means that the annotated field is populated with its values at the same time as the other fields in the entity. So if you open an entitymanager retrieve your person objects and then close the entitymanager, subsequently doing a person.address will not result in a lazy load exception being thrown.

If you set fetchType=LAZY the field is only populated when it is accessed. If you have closed the entitymanager by then a lazy load exception will be thrown if you do a person.address. To load the field you need to put the entity back into an entitymangers context with em.merge(), then do the field access and then close the entitymanager.

You might want lazy loading when constructing a customer class with a collection for customer orders. If you retrieved every order for a customer when you wanted to get a customer list this may be a expensive database operation when you only looking for customer name and contact details. Best to leave the db access till later.

For the second part of the question - how to get hibernate to generate optimised SQL?

Hibernate should allow you to provide hints as to how to construct the most efficient query but I suspect there is something wrong with your table construction. Is the relationship established in the tables? Hibernate may have decided that a simple query will be quicker than a join especially if indexes etc are missing.


JPA doesn't provide any specification on mapping annotations to select fetch strategy. In general, related entities can be fetched in any one of the ways given below

  • SELECT => one query for root entities + one query for related mapped entity/collection of each root entity = (n+1) queries
  • SUBSELECT => one query for root entities + second query for related mapped entity/collection of all root entities retrieved in first query = 2 queries
  • JOIN => one query to fetch both root entities and all of their mapped entity/collection = 1 query

So SELECT and JOIN are two extremes and SUBSELECT falls in between. One can choose suitable strategy based on her/his domain model.

By default SELECT is used by both JPA/EclipseLink and Hibernate. This can be overridden by using:

@Fetch(FetchMode.JOIN) 
@Fetch(FetchMode.SUBSELECT)

in Hibernate. It also allows to set SELECT mode explicitly using @Fetch(FetchMode.SELECT) which can be tuned by using batch size e.g. @BatchSize(size=10).

Corresponding annotations in EclipseLink are:

@JoinFetch
@BatchFetch

The fetchType attribute controls whether the annotated field is fetched immediately when the primary entity is fetched. It does not necessarily dictate how the fetch statement is constructed, the actual sql implementation depends on the provider you are using toplink/hibernate etc.

If you set fetchType=EAGER This means that the annotated field is populated with its values at the same time as the other fields in the entity. So if you open an entitymanager retrieve your person objects and then close the entitymanager, subsequently doing a person.address will not result in a lazy load exception being thrown.

If you set fetchType=LAZY the field is only populated when it is accessed. If you have closed the entitymanager by then a lazy load exception will be thrown if you do a person.address. To load the field you need to put the entity back into an entitymangers context with em.merge(), then do the field access and then close the entitymanager.

You might want lazy loading when constructing a customer class with a collection for customer orders. If you retrieved every order for a customer when you wanted to get a customer list this may be a expensive database operation when you only looking for customer name and contact details. Best to leave the db access till later.

For the second part of the question - how to get hibernate to generate optimised SQL?

Hibernate should allow you to provide hints as to how to construct the most efficient query but I suspect there is something wrong with your table construction. Is the relationship established in the tables? Hibernate may have decided that a simple query will be quicker than a join especially if indexes etc are missing.


Examples related to java

Under what circumstances can I call findViewById with an Options Menu / Action Bar item? How much should a function trust another function How to implement a simple scenario the OO way Two constructors How do I get some variable from another class in Java? this in equals method How to split a string in two and store it in a field How to do perspective fixing? String index out of range: 4 My eclipse won't open, i download the bundle pack it keeps saying error log

Examples related to hibernate

Hibernate Error executing DDL via JDBC Statement How does spring.jpa.hibernate.ddl-auto property exactly work in Spring? Error creating bean with name 'entityManagerFactory' defined in class path resource : Invocation of init method failed JPA Hibernate Persistence exception [PersistenceUnit: default] Unable to build Hibernate SessionFactory Disable all Database related auto configuration in Spring Boot Unable to create requested service [org.hibernate.engine.jdbc.env.spi.JdbcEnvironment] HikariCP - connection is not available Hibernate-sequence doesn't exist How to find distinct rows with field in list using JPA and Spring? Spring Data JPA and Exists query

Examples related to jpa

No converter found capable of converting from type to type How does spring.jpa.hibernate.ddl-auto property exactly work in Spring? Deserialize Java 8 LocalDateTime with JacksonMapper Error creating bean with name 'entityManagerFactory' defined in class path resource : Invocation of init method failed How to beautifully update a JPA entity in Spring Data? JPA Hibernate Persistence exception [PersistenceUnit: default] Unable to build Hibernate SessionFactory How to return a custom object from a Spring Data JPA GROUP BY query How to find distinct rows with field in list using JPA and Spring? What is this spring.jpa.open-in-view=true property in Spring Boot? Spring Data JPA and Exists query

Examples related to join

Pandas Merging 101 pandas: merge (join) two data frames on multiple columns How to use the COLLATE in a JOIN in SQL Server? How to join multiple collections with $lookup in mongodb How to join on multiple columns in Pyspark? Pandas join issue: columns overlap but no suffix specified MySQL select rows where left join is null How to return rows from left table not found in right table? Why do multiple-table joins produce duplicate rows? pandas three-way joining multiple dataframes on columns