Understanding the effective data fetching with JPA Entity Graphs (Part-2)

JPA Sep 18, 2020

Part-2: The Solution

In the previous post I tried to demonstrate JPA fetching strategies. The are number of problems using the FetchType.Lazy or FetchType.Eager when it is the only configuration for the entity for fetching plan.

  • It causes N+1 problem which runs many unnecessary queries and affect the data access layer performance very badly.
  • It does not give the flexibility to choose different fetch strategies for different scenarios, so it is limiting.
  • Very likely to put you in trouble with JPA lazy initialisation exception that is caused by accessing the proxy objects that are not fetched from the outside of transaction context.

What is needed to improve this behaviour is that using SQL JOIN s when it is intended to retrieve the referenced objects. One of the solution could be writing your own queries for more predictable and better performing queries.

@Query(
value = "SELECT a FROM Address AS a " +
	"JOIN FETCH a.user user " +
    "JOIN FETCH a.city city "
)
List <Address> findAllAddresses();

Although it is good enough to perform join queries when it is necessary, in a big scale enterprise project most probably number of such methods will increase a lot and there might me lots of almost duplicate SQL queries needed. So for such cases Entity Graphs seems to be a better fit.

Entity Graphs

Entity graphs are introduced in JPA 2.1 and used to allow partial or specified fetching of objects. When an entity has references to other entities we can specify a fetch plan by entity graphs in order to determine which fields or properties should be fetched together. We can describe a fetch plan with its paths and boundaries with @NamedEntityGraph annotation in the entity class.

@NamedEntityGraph(
        name = "address-city-user-graph",
        attributeNodes = {
                @NamedAttributeNode("city"),
                @NamedAttributeNode("user")
        }
)
@Entity
public class Address {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String title;
    private String street;
    private String flat;
    private String postalCode;

    @OneToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "city_id")
    private City city;

    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "user_id")
    private User user;
    
    // getters and setters
    
}

I switched the fetch type to be lazy, and defined and entity graph named  address-city-user-graph. Entity graph contains the city and user attributes so that those objects will be retrieved all together. So for example when I want to retrieve only the address data, I can use findAll method and children object will not be queries because of lazy fetch strategy. However if I would like to retrieve details of users and city as well I can use the address-city-user-graph entity graph that I defined. To assign a defined NamedEntityGraph in the queries we use @EntityGraph annotation on the repository methods.

@Repository
public interface AddressRepository extends JpaRepository<Address, Long> {

    @EntityGraph("address-city-user-graph")
    List <Address> findByUserId(Long userId);

    @EntityGraph("address-city-user-graph")
    List<Address> findByCityId(Long cityId);
}

Entity graph annotation takes the name of NamedEntityGraph which is only a user defined String as an identifier. So we can define this fetch plan by entity graph and use it multiple times. If I were to write SQL queries I would have to write duplicate JOIN queries multiple times. Now when I call the findByUserId method it runs a single query with JOINs:

-- hibernate	
    select
        address0_.id as id1_0_0_,
        user1_.id as id1_2_1_,
        city2_.id as id1_1_2_,
        address0_.city_id as city_id6_0_0_,
        address0_.flat as flat2_0_0_,
        address0_.postal_code as postal_c3_0_0_,
        address0_.street as street4_0_0_,
        address0_.title as title5_0_0_,
        address0_.user_id as user_id7_0_0_,
        user1_.email as email2_2_1_,
        user1_.name as name3_2_1_,
        user1_.password as password4_2_1_,
        user1_.phone as phone5_2_1_,
        city2_.name as name2_1_2_ 
    from
        address address0_ 
    left outer join
        users user1_ 
            on address0_.user_id=user1_.id 
    left outer join
        city city2_ 
            on address0_.city_id=city2_.id 
    where
        user1_.id=?

We can define multiple entity graphs in an entity within @NamedEntityGraphs annotation:

@NamedEntityGraphs({
        @NamedEntityGraph(
                name = Address.WITH_USER_GRAPH,
                attributeNodes = {
                        @NamedAttributeNode("user")
                }
        ),
        @NamedEntityGraph(
                name = Address.WITH_CITY_GRAPH,
                attributeNodes = {
                        @NamedAttributeNode("city"),
                }
        ),
        @NamedEntityGraph(
                name = Address.WITH_USER_AND_CITY_GRAPH,
                attributeNodes = {
                        @NamedAttributeNode("user"),
                        @NamedAttributeNode("city")
                }
        )
})
@Entity
public class Address {
    
    public static final String WITH_USER_GRAPH = "address-with-user-graph";
    public static final String WITH_CITY_GRAPH = "address-with-city-graph";
    public static final String WITH_USER_AND_CITY_GRAPH = "address-with-user-and-city-graph";
    
    // fields..
    
}

And we can assign each entity graph to repository methods as we'd like:

@Repository
public interface AddressRepository extends JpaRepository<Address, Long> {
    
    @EntityGraph(Address.WITH_USER_AND_CITY_GRAPH)
    List<Address> findAll();

    @EntityGraph(Address.WITH_USER_GRAPH)
    List <Address> findByUserId(Long userId);

    @EntityGraph(Address.WITH_CITY_GRAPH)
    List<Address> findByCityId(Long cityId);
    
    Optional<Address> findById(Long id);
    
}

In the example there is no Entity Graph assigned to findById method so it will use FetchTypes from the entity which is defined lazy so it will only retrieve the fields of address but not children. If I run the findByCity method thanks to its Entity Graph it returns the city field of the address in the same query as well.

Hibernate: 
    select
        address0_.id as id1_0_0_,
        city1_.id as id1_1_1_,
        address0_.city_id as city_id6_0_0_,
        address0_.flat as flat2_0_0_,
        address0_.postal_code as postal_c3_0_0_,
        address0_.street as street4_0_0_,
        address0_.title as title5_0_0_,
        address0_.user_id as user_id7_0_0_,
        city1_.name as name2_1_1_ 
    from
        address address0_ 
    left outer join
        city city1_ 
            on address0_.city_id=city1_.id 
    where
        city1_.id=?

Types of Entity Graph

@EntityGraph annotation takes a type parameter with 2 values:

  • FETCH: It is the default graph type. When it is selected the attributes that are specified by attribute nodes of the entity graph are treated as FetchType.EAGER and attributes that are not specified are treated as FetchType.LAZY.
  • LOAD: When this type is selected the attributes that are specified by attribute nodes of the entity graph are treated as FetchType.EAGER and attributes that are not specified are treated according to their specified or default FetchType.

Selecting LOAD type example:

@EntityGraph(value = Address.WITH_USER_AND_CITY_GRAPH, type = EntityGraph.EntityGraphType.LOAD)
List<Address> findAll();

Subgraphs

Subgraphs can be defined in a NamedEntityGraph in order to specify fetch plan of the fields that belongs to the child.

In the previous post User entity did not have reference for Address relation, so the relation was mapped with unidirectional @ManyToOne on the Address entity. To demonstrate the the subgraphs I will make this relation bidirectional with adding @OneToMany to the User entity, and also I will create another entity named Group that will be parent entity of User.

@Entity
@Table(name = "users")
public class User {

	// ..

	@OneToMany(fetch = FetchType.LAZY, mappedBy = "user")
    private Set <Address> addressList = new HashSet <>();
    
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "group_id")
    private Group group;
    
}

And the group entity:

@Entity
public class Group {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String name;

    @OneToMany(fetch = FetchType.LAZY, mappedBy = "group")
    private Set <User> userList = new HashSet <>();
    
    // getter and setters..
    
}

Now I will define an entity graph in order to retrieve group with its users that contains its children addresses, and addresses contain city information. So I would like to see such a tree from parent to child in 1 query.

The entity graph contains the subgraph and subgraphs are defined with @NamedSubgraph annotation. This annotation also takes @NamedAttributeNode to specify the attributes to be fetched.

@NamedEntityGraphs(
        @NamedEntityGraph(
                name = Group.WITH_USER_AND_SUB_GRAPH,
                attributeNodes = {
                        @NamedAttributeNode( value = "userList", subgraph = "userSubGraph") },
                subgraphs = {
                        @NamedSubgraph(
                                name = "userSubGraph",
                                attributeNodes = { @NamedAttributeNode( value = "addressList", subgraph = "addressSubGraph") }
                        ),
                        @NamedSubgraph(
                                name = "addressSubGraph",
                                attributeNodes = {
                                        @NamedAttributeNode("city")
                                }
                        )
                }
        )
)
@Entity
@Table(name = "groups")
public class Group {

	// ..
}

So all the tree is retrieved in one query with joining the tables:

Hibernate: 
    select
        group0_.id as id1_2_0_,
        userlist1_.id as id1_3_1_,
        addresslis2_.id as id1_0_2_,
        city3_.id as id1_1_3_,
        group0_.name as name2_2_0_,
        userlist1_.email as email2_3_1_,
        userlist1_.group_id as group_id6_3_1_,
        userlist1_.name as name3_3_1_,
        userlist1_.password as password4_3_1_,
        userlist1_.phone as phone5_3_1_,
        userlist1_.group_id as group_id6_3_0__,
        userlist1_.id as id1_3_0__,
        addresslis2_.city_id as city_id6_0_2_,
        addresslis2_.flat as flat2_0_2_,
        addresslis2_.postal_code as postal_c3_0_2_,
        addresslis2_.street as street4_0_2_,
        addresslis2_.title as title5_0_2_,
        addresslis2_.user_id as user_id7_0_2_,
        addresslis2_.user_id as user_id7_0_1__,
        addresslis2_.id as id1_0_1__,
        city3_.name as name2_1_3_ 
    from
        groups group0_ 
    left outer join
        users userlist1_ 
            on group0_.id=userlist1_.group_id 
    left outer join
        address addresslis2_ 
            on userlist1_.id=addresslis2_.user_id 
    left outer join
        city city3_ 
            on addresslis2_.city_id=city3_.id

So I have the object with it is references loaded:

Great! You've successfully subscribed.
Great! Next, complete checkout for full access.
Welcome back! You've successfully signed in.
Success! Your account is fully activated, you now have access to all content.