04 - Spring Data JPA Notes
A beginner-to-advanced guide to Spring Data JPA, entity lifecycle, repositories, query methods, fetch strategies, cascade types, and Hibernate integration for Spring Professional Certification candidates. Covers Spring Boot 3 and Hibernate 6 concepts.
Table of Contents
- Entity Lifecycle
- Fetch Strategies
- Cascade Types
- Repository Abstraction
- Query Methods
- Interview Questions
- Cheat Sheet
1. Entity Lifecycle
An entity lifecycle describes the state of a JPA entity instance in relation to the persistence context.
Lifecycle States
| State | Meaning | Example |
|---|---|---|
| Transient | Object is not associated with a persistence context and has no database identity | new User() |
| Managed | Object is associated with the persistence context; changes are tracked automatically | Entity returned by findById() |
| Detached | Object was managed earlier but is no longer attached to a persistence context | Entity after transaction/session closes |
| Removed | Object is scheduled for deletion from the database | Entity passed to delete() |
State Transitions
| Operation | Transition |
|---|---|
| persist() / save() for a new entity | Transient to managed |
| Entity lookup | Database row to managed entity |
| Transaction commit / flush | Managed changes synchronized to database |
| detach() / persistence context close | Managed to detached |
| merge() | Detached state copied into a managed entity |
| remove() / delete() | Managed to removed |
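The merge() row is easy to misread: the detached object passed in does not itself become managed. The sketch below is a plain-Java toy model of that rule, not the JPA API. The Account and MiniContext classes are hypothetical stand-ins, and a real provider would also load the row from the database when no managed copy exists.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for an entity; not a real JPA class.
class Account {
    final long id;
    String email;
    Account(long id, String email) { this.id = id; this.email = email; }
}

// Hypothetical, heavily simplified model of a persistence context.
class MiniContext {
    private final Map<Long, Account> managed = new HashMap<>();

    // Models persist(): the very same instance becomes managed.
    void persist(Account a) { managed.put(a.id, a); }

    // Models merge(): state is copied onto the managed instance, which is returned.
    Account merge(Account detached) {
        Account target = managed.computeIfAbsent(detached.id, id -> new Account(id, null));
        target.email = detached.email;
        return target;
    }

    // True only if this exact instance is the managed one.
    boolean contains(Account a) { return managed.get(a.id) == a; }
}

class MergeDemo {
    public static void main(String[] args) {
        MiniContext ctx = new MiniContext();
        Account managed = new Account(1L, "old@example.com");
        ctx.persist(managed);

        Account detached = new Account(1L, "new@example.com");
        Account result = ctx.merge(detached);

        System.out.println(result == managed);      // true: the managed instance is returned
        System.out.println(ctx.contains(detached)); // false: the argument stays detached
        System.out.println(managed.email);          // new@example.com
    }
}
```

The practical consequence is that code should keep working with the value returned by merge(), not with the object it passed in.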
Dirty Checking
Dirty checking is the automatic detection of changes made to managed entities. If a managed entity is modified inside a transaction, JPA synchronizes those changes to the database during flush or commit.
@Transactional
public void updateEmail(Long id, String email) {
    User user = userRepository.findById(id).orElseThrow();
    user.setEmail(email);
}
No explicit save() is required here because user is managed.
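One way to picture dirty checking is as a snapshot comparison: the persistence context remembers the state an entity had when it was loaded and compares it with the current state at flush time. The following is a plain-Java toy model under that assumption. Person and SnapshotContext are hypothetical names, and Hibernate's real mechanism is considerably more elaborate.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy entity with a single mutable field; not a real JPA entity.
class Person {
    final long id;
    String email;
    Person(long id, String email) { this.id = id; this.email = email; }
}

// Hypothetical model: remembers the loaded state and diffs it at flush time.
class SnapshotContext {
    private final Map<Person, String> loadedEmail = new HashMap<>();
    final List<String> sqlLog = new ArrayList<>();

    // Models loading an entity: it becomes managed and its state is snapshotted.
    Person manage(Person p) {
        loadedEmail.put(p, p.email);
        return p;
    }

    // Models flush(): emit an UPDATE only for entities whose state changed.
    void flush() {
        for (Map.Entry<Person, String> e : loadedEmail.entrySet()) {
            Person p = e.getKey();
            if (!Objects.equals(e.getValue(), p.email)) {
                sqlLog.add("update person set email='" + p.email + "' where id=" + p.id);
                e.setValue(p.email); // refresh the snapshot
            }
        }
    }
}

class DirtyCheckingDemo {
    public static void main(String[] args) {
        SnapshotContext ctx = new SnapshotContext();
        Person p = ctx.manage(new Person(1L, "old@example.com"));
        p.email = "new@example.com"; // no explicit save call
        ctx.flush();
        ctx.flush();                 // second flush finds nothing dirty
        System.out.println(ctx.sqlLog.size()); // 1
    }
}
```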
Flush
Flush synchronizes the persistence context with the database, but it does not necessarily commit the transaction.
Common flush triggers:
- Transaction commit
- Explicit flush()
- Before query execution, depending on flush mode
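The difference between flushing and committing can be sketched with three buckets: changes held only in the persistence context, statements already sent to the database inside the open transaction, and committed data. This is a simplified plain-Java model with a hypothetical MiniTransaction class, not how a JDBC driver or Hibernate is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical three-stage model of a unit of work.
class MiniTransaction {
    private final List<String> pending = new ArrayList<>(); // only in the persistence context
    private final List<String> sent = new ArrayList<>();    // sent to the database, not committed
    final List<String> committed = new ArrayList<>();       // durable

    // A change made to a managed entity.
    void change(String sql) { pending.add(sql); }

    // Flush sends pending statements but keeps the transaction open.
    void flush() {
        sent.addAll(pending);
        pending.clear();
    }

    // Commit flushes first, then makes everything durable.
    void commit() {
        flush();
        committed.addAll(sent);
        sent.clear();
    }

    // Rollback discards flushed and unflushed work alike.
    void rollback() {
        pending.clear();
        sent.clear();
    }
}
```

In this model a statement that was flushed and then rolled back never reaches the committed list, which mirrors why flush() alone does not make changes permanent.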
2. Fetch Strategies
Fetch strategy controls when associated entities are loaded from the database.
Lazy Fetching
Lazy fetching loads an association only when it is accessed.
@OneToMany(mappedBy = "department", fetch = FetchType.LAZY)
private List<Employee> employees;
Advantages:
- Better initial query performance
- Avoids loading unnecessary associations
- Usually preferred for collections
Risks:
- LazyInitializationException when accessed outside an active persistence context
- N+1 query problem when associations are accessed repeatedly in loops
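Both the benefit and the first risk follow from the same mechanism: the association is a placeholder that runs its query on first access and needs an open session to do so. Below is a plain-Java toy model of that behavior. MiniSession and LazyCollection are hypothetical names, and the model throws IllegalStateException where Hibernate would throw LazyInitializationException.

```java
import java.util.List;
import java.util.function.Supplier;

// Toy stand-in for a session / persistence context.
class MiniSession {
    boolean open = true;
    int queries = 0;
}

// Hypothetical model of a lazily loaded collection; not Hibernate's collection wrappers.
class LazyCollection<T> {
    private final MiniSession session;
    private final Supplier<List<T>> loader; // stands in for the SELECT of the children
    private List<T> loaded;                 // null until first access

    LazyCollection(MiniSession session, Supplier<List<T>> loader) {
        this.session = session;
        this.loader = loader;
    }

    List<T> get() {
        if (loaded == null) {
            if (!session.open) {
                // Real Hibernate throws LazyInitializationException here.
                throw new IllegalStateException("could not initialize collection: no session");
            }
            session.queries++; // the query runs on first access only
            loaded = loader.get();
        }
        return loaded;
    }
}
```

Accessing the collection twice inside the session costs one query in this model, while a first access after the session is closed fails.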
Eager Fetching
Eager fetching loads an association immediately with the owning entity.
@ManyToOne(fetch = FetchType.EAGER)
private Department department;
Advantages:
- Association is available immediately
- Can avoid lazy loading issues for small, always-needed relationships
Risks:
- Loads data even when not needed
- Can create large joins and performance problems
- Can accidentally fetch deep object graphs
Default Fetch Types
| Association | Default Fetch Type |
|---|---|
| @OneToOne | EAGER |
| @ManyToOne | EAGER |
| @OneToMany | LAZY |
| @ManyToMany | LAZY |
Handling N+1 Queries
N+1 occurs when one query loads parent records and then one additional query is executed for each parent to load children.
Common solutions:
- Use JOIN FETCH
- Use @EntityGraph
- Use DTO projections
- Tune batch fetching with Hibernate-specific settings
@Query("select d from Department d join fetch d.employees where d.id = :id")
Optional<Department> findByIdWithEmployees(@Param("id") Long id);
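The cost difference is easiest to see by counting queries. The sketch below is a plain-Java toy model with an in-memory map standing in for the database. CountingDb and NPlusOneDemo are hypothetical names, and the single-query method only approximates what a fetch join achieves.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory "database" that counts the queries issued against it.
class CountingDb {
    int queries = 0;
    private final Map<Long, List<String>> employeesByDept = new HashMap<>();

    CountingDb() {
        employeesByDept.put(1L, List.of("Ada", "Grace"));
        employeesByDept.put(2L, List.of("Linus"));
        employeesByDept.put(3L, List.of());
    }

    // Models: select d.id from department d
    List<Long> findDepartmentIds() {
        queries++;
        return new ArrayList<>(employeesByDept.keySet());
    }

    // Models: select e from employee e where e.department_id = ?
    List<String> findEmployees(long deptId) {
        queries++;
        return employeesByDept.get(deptId);
    }

    // One joined query, the rough equivalent of a fetch join.
    Map<Long, List<String>> findDepartmentsWithEmployees() {
        queries++;
        return new HashMap<>(employeesByDept);
    }
}

class NPlusOneDemo {
    // Lazy access in a loop: 1 query for the parents plus 1 per parent.
    static int lazyLoop(CountingDb db) {
        int total = 0;
        for (long id : db.findDepartmentIds()) {
            total += db.findEmployees(id).size();
        }
        return total;
    }

    // Fetch-join style: everything arrives in a single query.
    static int fetchJoin(CountingDb db) {
        int total = 0;
        for (List<String> employees : db.findDepartmentsWithEmployees().values()) {
            total += employees.size();
        }
        return total;
    }

    public static void main(String[] args) {
        CountingDb lazy = new CountingDb();
        lazyLoop(lazy);
        System.out.println(lazy.queries);   // 4 (1 + 3 departments)

        CountingDb joined = new CountingDb();
        fetchJoin(joined);
        System.out.println(joined.queries); // 1
    }
}
```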
3. Cascade Types
Cascade defines which entity operations should propagate from a parent entity to its associated child entities.
| Cascade Type | Meaning |
|---|---|
| PERSIST | Propagates persist operation |
| MERGE | Propagates merge operation |
| REMOVE | Propagates remove operation |
| REFRESH | Propagates refresh operation |
| DETACH | Propagates detach operation |
| ALL | Includes all cascade operations |
Example
@OneToMany(mappedBy = "order", cascade = CascadeType.ALL, orphanRemoval = true)
private List<OrderItem> items = new ArrayList<>();
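With CascadeType.ALL, persisting, merging, or removing the Order applies the same operation to its OrderItem entities. With orphanRemoval = true, an OrderItem removed from the items collection is also deleted.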
Cascade vs Orphan Removal
| Feature | Purpose |
|---|---|
| Cascade remove | Deletes child entities when the parent is deleted |
| Orphan removal | Deletes child entities when they are removed from the parent collection |
Use orphanRemoval = true when a child should not exist without its parent.
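The two rules in the table fire at different moments: cascade remove when the parent is deleted, orphan removal when a child leaves the collection. A plain-Java toy model of that distinction follows. Item, Cart, and MiniStore are hypothetical names, a list stands in for the child table, and the model assumes a single parent.

```java
import java.util.ArrayList;
import java.util.List;

// Toy child and parent; not real JPA entities.
class Item {
    final String name;
    Item(String name) { this.name = name; }
}

class Cart {
    final List<Item> items = new ArrayList<>();
}

// Hypothetical model of the two deletion rules; not Hibernate's implementation.
class MiniStore {
    final List<Item> itemRows = new ArrayList<>();         // stands in for the child table
    private final List<Item> snapshot = new ArrayList<>(); // collection contents at last sync

    // Cascade PERSIST: children are stored together with the parent.
    void save(Cart cart) {
        itemRows.addAll(cart.items);
        snapshot.addAll(cart.items);
    }

    // Flush: with orphanRemoval, delete rows for children no longer in the collection.
    void flush(Cart cart, boolean orphanRemoval) {
        if (!orphanRemoval) {
            return; // the removed child keeps its row
        }
        for (Item old : snapshot) {
            if (!cart.items.contains(old)) {
                itemRows.remove(old);
            }
        }
        snapshot.clear();
        snapshot.addAll(cart.items);
    }

    // Cascade REMOVE: deleting the parent deletes every child row.
    void delete(Cart cart) {
        itemRows.removeAll(cart.items);
        snapshot.clear();
    }
}
```

In this model, removing an item from the cart and flushing without orphan removal leaves its row in place, while the same flush with orphan removal deletes it.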
Best Practices
- Use cascades carefully on aggregate boundaries.
- Avoid CascadeType.REMOVE on @ManyToMany.
- Prefer CascadeType.ALL only when child lifecycle is fully owned by the parent.
- Always keep both sides of bidirectional relationships in sync.
4. Repository Abstraction
Spring Data JPA repositories reduce boilerplate data access code by generating implementations at runtime.
Common Repository Interfaces
| Interface | Description |
|---|---|
| Repository<T, ID> | Marker interface |
| CrudRepository<T, ID> | Basic CRUD operations |
| PagingAndSortingRepository<T, ID> | Pagination and sorting support |
| JpaRepository<T, ID> | JPA-specific operations and batch methods |
Example
public interface UserRepository extends JpaRepository<User, Long> {
}
This provides methods such as:
- save(entity)
- findById(id)
- findAll()
- delete(entity)
- count()
- existsById(id)
- flush()
Pagination and Sorting
Page<User> page = userRepository.findAll(PageRequest.of(0, 20, Sort.by("name")));
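PageRequest.of(0, 20, ...) asks for the first page because page indexes are zero-based. The arithmetic behind a page request can be sketched in plain Java as follows. PageMath is a hypothetical helper written for illustration, not a Spring Data class.

```java
// Hypothetical helper showing the arithmetic behind a page request.
class PageMath {
    // Zero-based page index to the row offset of the first element on that page.
    static long offset(int page, int size) {
        return (long) page * size;
    }

    // Number of pages needed to hold totalElements rows, rounding up.
    static int totalPages(long totalElements, int size) {
        return (int) ((totalElements + size - 1) / size);
    }
}
```

For example, with 45 rows and a page size of 20, page index 2 starts at offset 40 and is the third and last page.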
Custom Repository Methods
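Use custom repository implementations when query methods or @Query are not enough. Declare a fragment interface, implement it in a class named with the Impl suffix (here UserRepositoryCustomImpl), and have the main repository interface extend the fragment.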
public interface UserRepositoryCustom {
    List<User> findActivePremiumUsers();
}
5. Query Methods
Spring Data JPA can derive queries from repository method names.
Derived Query Examples
List<User> findByLastName(String lastName);
Optional<User> findByEmail(String email);
List<User> findByAgeGreaterThan(int age);
List<User> findByStatusAndCreatedAtAfter(Status status, LocalDateTime createdAt);
boolean existsByEmail(String email);
long countByStatus(Status status);
void deleteByStatus(Status status);
Common Keywords
| Keyword | Example |
|---|---|
| And | findByStatusAndType |
| Or | findByStatusOrType |
| Between | findByCreatedAtBetween |
| LessThan | findByAgeLessThan |
| GreaterThan | findByAgeGreaterThan |
| Like | findByNameLike |
| Containing | findByNameContaining |
| StartingWith | findByNameStartingWith |
| EndingWith | findByNameEndingWith |
| IsNull | findByDeletedAtIsNull |
| IsNotNull | findByDeletedAtIsNotNull |
| In | findByStatusIn |
| OrderBy | findByStatusOrderByCreatedAtDesc |
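To get a feel for how a method name turns into a query, here is a deliberately tiny plain-Java sketch that handles only the findBy prefix, And, and a handful of the keywords above. DerivedQuerySketch is a hypothetical class, the output is a JPQL-like string with ? as a placeholder marker, and Spring Data's real parser supports far more than this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy translator from method names to JPQL-like strings.
class DerivedQuerySketch {
    // Suffix keywords handled by this sketch and the operator each one maps to.
    private static final String[][] OPERATORS = {
        {"GreaterThan", "> ?"},
        {"LessThan", "< ?"},
        {"IsNotNull", "is not null"},
        {"IsNull", "is null"},
        {"After", "> ?"},
    };

    static String toJpql(String entity, String methodName) {
        if (!methodName.startsWith("findBy")) {
            throw new IllegalArgumentException("only findBy... is handled in this sketch");
        }
        List<String> predicates = new ArrayList<>();
        // Split on "And" only when it is followed by an upper-case property name.
        for (String part : methodName.substring("findBy".length()).split("And(?=[A-Z])")) {
            predicates.add(predicate(part));
        }
        return "select e from " + entity + " e where " + String.join(" and ", predicates);
    }

    private static String predicate(String part) {
        for (String[] op : OPERATORS) {
            if (part.endsWith(op[0])) {
                String property = part.substring(0, part.length() - op[0].length());
                return "e." + decapitalize(property) + " " + op[1];
            }
        }
        return "e." + decapitalize(part) + " = ?"; // no keyword means equality
    }

    private static String decapitalize(String s) {
        return Character.toLowerCase(s.charAt(0)) + s.substring(1);
    }
}
```

Calling toJpql("User", "findByStatusAndCreatedAtAfter") yields select e from User e where e.status = ? and e.createdAt > ? in this sketch.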
JPQL with @Query
@Query("select u from User u where u.status = :status")
List<User> findUsersByStatus(@Param("status") Status status);
Native Query
@Query(value = "select * from users where email = :email", nativeQuery = true)
Optional<User> findByEmailNative(@Param("email") String email);
Modifying Query
@Modifying
@Transactional
@Query("update User u set u.status = :status where u.id = :id")
int updateStatus(@Param("id") Long id, @Param("status") Status status);
Projections
Interface-based projection:
public interface UserSummary {
    String getName();
    String getEmail();
}
List<UserSummary> findByStatus(Status status);
DTO projection:
@Query("select new com.example.UserDto(u.id, u.name) from User u")
List<UserDto> findUserDtos();
6. Interview Questions
1. What is the difference between JPA, Hibernate, and Spring Data JPA?
JPA is a specification for ORM in Java. Hibernate is an implementation of the JPA specification. Spring Data JPA is an abstraction over JPA that reduces boilerplate repository code.
2. What are the entity lifecycle states?
The main states are transient, managed, detached, and removed.
3. What is dirty checking?
Dirty checking is the automatic detection and persistence of changes made to managed entities inside a transaction.
4. What is the difference between persist() and merge()?
persist() makes a new transient entity managed. merge() copies the state of a detached entity into a managed entity and returns that managed instance.
5. What is the N+1 query problem?
N+1 occurs when one query loads a list of parent entities and then one additional query is executed for each parent to load related data.
6. How can N+1 be fixed?
Use fetch joins, entity graphs, DTO projections, or batch fetching.
7. What is the difference between lazy and eager loading?
Lazy loading loads associations when accessed. Eager loading loads associations immediately with the owning entity.
8. What are default fetch types in JPA?
@OneToOne and @ManyToOne are eager by default. @OneToMany and @ManyToMany are lazy by default.
9. What is cascade in JPA?
Cascade propagates entity operations from one entity to associated entities.
10. What is the difference between cascade remove and orphan removal?
Cascade remove deletes children when the parent is deleted. Orphan removal deletes a child when it is removed from the parent relationship.
11. Why should CascadeType.REMOVE be avoided on many-to-many relationships?
Because the entities on both sides usually have independent lifecycles and may be referenced by other entities. Removing one entity should usually delete only join table rows, not the related entity itself.
12. What is the persistence context?
The persistence context is the first-level cache where managed entities are tracked by the entity manager.
13. What is the first-level cache?
The first-level cache is the persistence-context-level cache. Within the same persistence context, the same entity ID maps to the same managed object instance.
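The identity guarantee can be pictured as a map from primary key to instance that is consulted before the database. The following plain-Java sketch models that idea. IdentityMapSketch is a hypothetical class, and the loader function stands in for a select by primary key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical model of a first-level cache keyed by entity ID.
class IdentityMapSketch<T> {
    private final Map<Long, T> cache = new HashMap<>();
    private final LongFunction<T> loader; // stands in for the SELECT by primary key
    int databaseHits = 0;

    IdentityMapSketch(LongFunction<T> loader) {
        this.loader = loader;
    }

    // Returns the cached instance if present; otherwise loads and caches it.
    T find(long id) {
        T cached = cache.get(id);
        if (cached == null) {
            databaseHits++;
            cached = loader.apply(id);
            cache.put(id, cached);
        }
        return cached;
    }

    // Models clearing or closing the persistence context.
    void clear() {
        cache.clear();
    }
}
```

Two lookups of the same ID return the same object and cost one database hit in this model. After clear(), the next lookup loads a fresh instance.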
14. What is the difference between getReferenceById() and findById()?
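findById() loads the entity right away, from the persistence context if it is already managed and otherwise with a database query, and returns an Optional. getReferenceById() returns a lazy proxy without querying; the database is hit only when the proxy's state is first accessed, and a missing row then surfaces as an EntityNotFoundException.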
15. What is the purpose of @Transactional?
@Transactional defines a transaction boundary. It allows multiple database operations to succeed or fail as one unit and keeps managed entities attached during the transaction.
16. What is the difference between JPQL and native SQL?
JPQL works with entity names and fields. Native SQL works directly with database tables and columns.
17. What are projections?
Projections allow queries to return selected fields instead of full entity objects.
18. What is optimistic locking?
Optimistic locking uses a version field, usually annotated with @Version, to detect concurrent updates.
@Version
private Long version;
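The version check amounts to an update that matches only if the row still carries the version the caller read. Here is a plain-Java toy model of that compare-and-increment step. VersionedRow and VersionedTable are hypothetical names; in JPA, a result of zero updated rows is what the provider reports as an OptimisticLockException.

```java
import java.util.HashMap;
import java.util.Map;

// Toy row carrying a version counter; not a real JPA entity.
class VersionedRow {
    String value;
    long version;
    VersionedRow(String value, long version) {
        this.value = value;
        this.version = version;
    }
}

// Hypothetical in-memory table that enforces the version check on update.
class VersionedTable {
    private final Map<Long, VersionedRow> rows = new HashMap<>();

    void insert(long id, String value) {
        rows.put(id, new VersionedRow(value, 0));
    }

    long currentVersion(long id) {
        return rows.get(id).version;
    }

    // Models: update t set value = ?, version = version + 1 where id = ? and version = ?
    // Returns the number of rows updated; 0 means another writer got there first.
    int update(long id, String newValue, long expectedVersion) {
        VersionedRow row = rows.get(id);
        if (row == null || row.version != expectedVersion) {
            return 0;
        }
        row.value = newValue;
        row.version++;
        return 1;
    }
}
```

If two callers read version 0 and both try to update, the first succeeds and bumps the version to 1, so the second call with the stale version updates nothing.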
19. What is the difference between save() and saveAndFlush()?
save() persists a new entity or merges an existing one and leaves the flush for later, typically at commit. saveAndFlush() immediately synchronizes pending changes to the database.
20. When should custom repository implementations be used?
Use custom repositories when derived methods, specifications, query annotations, or projections are not expressive enough.
7. Cheat Sheet
Entity Lifecycle
| Concept | Quick Note |
|---|---|
| Transient | New object, not tracked |
| Managed | Tracked by persistence context |
| Detached | Previously tracked, now outside context |
| Removed | Scheduled for deletion |
| Dirty checking | Auto-persist changes to managed entities |
| Flush | Synchronizes persistence context with database |
Fetching
| Topic | Quick Note |
|---|---|
| Lazy | Load on access |
| Eager | Load immediately |
| Collection default | Lazy |
| To-one default | Eager |
| N+1 fix | Fetch join, entity graph, projection, batching |
Cascades
| Cascade | Propagates |
|---|---|
| PERSIST | Save new child |
| MERGE | Merge detached child |
| REMOVE | Delete child |
| REFRESH | Reload child |
| DETACH | Detach child |
| ALL | All cascade operations |
Repositories
| Method | Purpose |
|---|---|
| save() | Insert or update |
| findById() | Find by primary key |
| findAll() | Find all rows |
| delete() | Delete entity |
| existsById() | Check existence |
| count() | Count rows |
| flush() | Force synchronization |
Query Method Patterns
| Pattern | Example |
|---|---|
| Equality | findByEmail |
| Multiple conditions | findByStatusAndType |
| Range | findByCreatedAtBetween |
| Comparison | findByAgeGreaterThan |
| Null check | findByDeletedAtIsNull |
| Collection match | findByStatusIn |
| Sorting | findByStatusOrderByCreatedAtDesc |
Annotations
| Annotation | Purpose |
|---|---|
| @Entity | Marks a JPA entity |
| @Id | Primary key |
| @GeneratedValue | Primary key generation |
| @Table | Maps entity to table |
| @Column | Maps field to column |
| @OneToOne | One-to-one relationship |
| @ManyToOne | Many-to-one relationship |
| @OneToMany | One-to-many relationship |
| @ManyToMany | Many-to-many relationship |
| @JoinColumn | Foreign key column |
| @JoinTable | Join table mapping |
| @Query | Custom JPQL/native query |
| @Modifying | Update/delete query |
| @Transactional | Transaction boundary |
| @Version | Optimistic locking |
Best Practices
- Prefer lazy loading by default.
- Use DTO projections for read-heavy screens.
- Avoid exposing entities directly from REST APIs.
- Keep transactions short.
- Do not use CascadeType.ALL unless the parent truly owns the child lifecycle.
- Avoid bidirectional relationships unless needed.
- Use fetch joins or entity graphs intentionally, not globally.
- Monitor SQL logs when tuning performance.