04 - Spring Data JPA Notes

A beginner-to-advanced guide to Spring Data JPA, entity lifecycle, repositories, query methods, fetch strategies, cascade types, and Hibernate integration for Spring Professional Certification candidates. Covers Spring Boot 3 and Hibernate 6 concepts.

Entity Lifecycle
Fetch Strategies
Cascade Types
Repository Abstraction
Query Methods
Interview Questions
Cheat Sheet

1. Entity Lifecycle

An entity lifecycle describes the state of a JPA entity instance in relation to the persistence context.

Lifecycle States

State	Meaning	Example
Transient	Object is not associated with a persistence context and has no database identity	`new User()`
Managed	Object is associated with the persistence context; changes are tracked automatically	Entity returned by `findById()`
Detached	Object was managed earlier but is no longer attached to a persistence context	Entity after transaction/session closes
Removed	Object is scheduled for deletion from the database	Entity passed to `delete()`

State Transitions

Operation	Transition
`persist()` / `save()` for a new entity	Transient to managed
Entity lookup	Database row to managed entity
Transaction commit / flush	Managed changes synchronized to database
`detach()` / persistence context close	Managed to detached
`merge()`	Detached state copied into a managed entity
`remove()` / `delete()`	Managed to removed

Dirty Checking

Dirty checking is the automatic detection of changes made to managed entities. If a managed entity is modified inside a transaction, JPA synchronizes those changes to the database during flush or commit.

@Transactional
public void updateEmail(Long id, String email) {
    User user = userRepository.findById(id).orElseThrow();
    user.setEmail(email);
}

No explicit save() is required here because user is managed.
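To make the mechanism concrete, here is a minimal plain-Java sketch of snapshot-based dirty checking. It is a toy model, not Hibernate's implementation: `ToyPersistenceContext`, `manage()`, and `issuedSql` are hypothetical names invented for this illustration, and the "SQL" is only a recorded string.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy model of snapshot-based dirty checking; not Hibernate's actual implementation.
class DirtyCheckingSketch {

    // Hypothetical entity with a single mutable field.
    static final class User {
        final long id;
        String email;

        User(long id, String email) {
            this.id = id;
            this.email = email;
        }
    }

    // Hypothetical stand-in for a persistence context.
    static final class ToyPersistenceContext {
        private final Map<Long, User> managed = new HashMap<>();
        private final Map<Long, String> emailSnapshot = new HashMap<>();
        final List<String> issuedSql = new ArrayList<>();

        // Start tracking the entity and remember the state it was loaded with.
        User manage(User user) {
            managed.put(user.id, user);
            emailSnapshot.put(user.id, user.email);
            return user;
        }

        // Compare current state with the snapshot and record one UPDATE per changed entity.
        void flush() {
            for (User user : managed.values()) {
                if (!Objects.equals(user.email, emailSnapshot.get(user.id))) {
                    issuedSql.add("update users set email = ? where id = ? ["
                            + user.email + ", " + user.id + "]");
                    emailSnapshot.put(user.id, user.email);
                }
            }
        }
    }

    public static void main(String[] args) {
        ToyPersistenceContext context = new ToyPersistenceContext();
        User user = context.manage(new User(1L, "old@example.com"));

        user.email = "new@example.com"; // no save() call, only a field change
        context.flush();                // the change is detected here
        context.flush();                // nothing changed since, so no second UPDATE

        System.out.println(context.issuedSql);
    }
}
```

The point of the sketch is that the update is driven by comparing state at flush time, which is why a setter call on a managed entity is enough.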

Flush

Flush synchronizes the persistence context with the database, but it does not necessarily commit the transaction.

Common flush triggers:

Transaction commit
Explicit flush()
Before query execution, depending on flush mode

2. Fetch Strategies

Fetch strategy controls when associated entities are loaded from the database.

Lazy Fetching

Lazy fetching loads an association only when it is accessed.

@OneToMany(mappedBy = "department", fetch = FetchType.LAZY)
private List<Employee> employees;

Advantages:

Better initial query performance
Avoids loading unnecessary associations
Usually preferred for collections

Risks:

LazyInitializationException when accessed outside an active persistence context
N+1 query problem when associations are accessed repeatedly in loops

Eager Fetching

Eager fetching loads an association immediately with the owning entity.

@ManyToOne(fetch = FetchType.EAGER)
private Department department;

Advantages:

Association is available immediately
Can avoid lazy loading issues for small, always-needed relationships

Risks:

Loads data even when not needed
Can create large joins and performance problems
Can accidentally fetch deep object graphs

Default Fetch Types

Association	Default Fetch Type
`@OneToOne`	`EAGER`
`@ManyToOne`	`EAGER`
`@OneToMany`	`LAZY`
`@ManyToMany`	`LAZY`

Handling N+1 Queries

N+1 occurs when one query loads parent records and then one additional query is executed for each parent to load children.

Common solutions:

Use JOIN FETCH
Use @EntityGraph
Use DTO projections
Tune batch fetching with Hibernate-specific settings

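@Query("select d from Department d left join fetch d.employees where d.id = :id")
Optional<Department> findByIdWithEmployees(@Param("id") Long id);

A left join fetch also returns a department that has no employees; a plain join fetch would filter it out.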
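The query-count difference can be sketched without a database. The following plain-Java toy counts "queries" against an in-memory map. `CountingDatabase` and its methods are hypothetical stand-ins for repository calls, not a real API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy illustration of N+1: every method on CountingDatabase counts as one query.
class NPlusOneSketch {

    // Hypothetical in-memory "database": department id -> employee names.
    static final class CountingDatabase {
        private final Map<Integer, List<String>> employeesByDepartment = new TreeMap<>();
        int queryCount = 0;

        CountingDatabase() {
            employeesByDepartment.put(1, List.of("Ana", "Ben"));
            employeesByDepartment.put(2, List.of("Chen"));
            employeesByDepartment.put(3, List.of("Dev", "Eli"));
        }

        // Models: select d from Department d
        List<Integer> findAllDepartmentIds() {
            queryCount++;
            return new ArrayList<>(employeesByDepartment.keySet());
        }

        // Models the lazy load of one department's employees.
        List<String> findEmployeesByDepartmentId(int departmentId) {
            queryCount++;
            return employeesByDepartment.get(departmentId);
        }

        // Models: select d from Department d left join fetch d.employees
        Map<Integer, List<String>> findAllDepartmentsWithEmployees() {
            queryCount++;
            return employeesByDepartment;
        }
    }

    // One query for the parents plus one query per parent.
    static int countEmployeesLazily(CountingDatabase database) {
        int total = 0;
        for (int departmentId : database.findAllDepartmentIds()) {
            total += database.findEmployeesByDepartmentId(departmentId).size();
        }
        return total;
    }

    // A single query that brings the children along.
    static int countEmployeesWithFetchJoin(CountingDatabase database) {
        int total = 0;
        for (List<String> employees : database.findAllDepartmentsWithEmployees().values()) {
            total += employees.size();
        }
        return total;
    }

    public static void main(String[] args) {
        CountingDatabase lazy = new CountingDatabase();
        countEmployeesLazily(lazy);
        System.out.println("lazy access: " + lazy.queryCount + " queries");

        CountingDatabase joined = new CountingDatabase();
        countEmployeesWithFetchJoin(joined);
        System.out.println("fetch join: " + joined.queryCount + " queries");
    }
}
```

With three departments the lazy version issues four queries (1 + 3) and the fetch-join version issues one; both return the same total.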

3. Cascade Types

Cascade defines which entity operations should propagate from a parent entity to its associated child entities.

Cascade Type	Meaning
`PERSIST`	Propagates persist operation
`MERGE`	Propagates merge operation
`REMOVE`	Propagates remove operation
`REFRESH`	Propagates refresh operation
`DETACH`	Propagates detach operation
`ALL`	Includes all cascade operations

Example

@OneToMany(mappedBy = "order", cascade = CascadeType.ALL, orphanRemoval = true)
private List<OrderItem> items = new ArrayList<>();

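With CascadeType.ALL, persisting, merging, or removing an Order propagates the same operation to its OrderItem entities. orphanRemoval = true additionally deletes an OrderItem that is removed from the items collection.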

Cascade vs Orphan Removal

Feature	Purpose
Cascade remove	Deletes child entities when the parent is deleted
Orphan removal	Deletes child entities when they are removed from the parent collection

Use orphanRemoval = true when a child should not exist without its parent.

Best Practices

Use cascades carefully on aggregate boundaries.
Avoid CascadeType.REMOVE on @ManyToMany.
Prefer CascadeType.ALL only when child lifecycle is fully owned by the parent.
Always keep both sides of bidirectional relationships in sync.
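Keeping both sides in sync is usually done with small helper methods on the parent. The sketch below shows the idea in plain Java with the JPA annotations left as comments so it runs without any libraries. `addItem`, `removeItem`, and the `sku` field are illustrative names, not part of the original mapping.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Plain-Java sketch of helper methods that keep a bidirectional association consistent.
class BidirectionalSyncSketch {

    static final class Order {
        // In a real mapping:
        // @OneToMany(mappedBy = "order", cascade = CascadeType.ALL, orphanRemoval = true)
        private final List<OrderItem> items = new ArrayList<>();

        // Update the collection and the back-reference together.
        void addItem(OrderItem item) {
            items.add(item);
            item.order = this;
        }

        // Clear the back-reference only if the item really belonged to this order.
        void removeItem(OrderItem item) {
            if (items.remove(item)) {
                item.order = null;
            }
        }

        // Expose a read-only view so callers must go through the helpers.
        List<OrderItem> getItems() {
            return Collections.unmodifiableList(items);
        }
    }

    static final class OrderItem {
        // In a real mapping: @ManyToOne(fetch = FetchType.LAZY)
        Order order;
        final String sku; // hypothetical field for the example

        OrderItem(String sku) {
            this.sku = sku;
        }
    }

    public static void main(String[] args) {
        Order order = new Order();
        OrderItem item = new OrderItem("SKU-1");

        order.addItem(item);
        System.out.println(item.order == order);    // both sides agree after add

        order.removeItem(item);
        System.out.println(order.getItems().size()); // collection side updated
        System.out.println(item.order == null);      // owning side updated as well
    }
}
```

In a mapped entity with orphanRemoval = true, the removeItem call is also what would mark the child for deletion at the next flush.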

4. Repository Abstraction

Spring Data JPA repositories reduce boilerplate data access code by generating implementations at runtime.

Common Repository Interfaces

Interface	Description
`Repository<T, ID>`	Marker interface
`CrudRepository<T, ID>`	Basic CRUD operations
`PagingAndSortingRepository<T, ID>`	Pagination and sorting support; since Spring Data 3.0 it no longer extends `CrudRepository`
`JpaRepository<T, ID>`	JPA-specific operations and batch methods

Example

public interface UserRepository extends JpaRepository<User, Long> {
}

This provides methods such as:

save(entity)
findById(id)
findAll()
delete(entity)
count()
existsById(id)
flush()

Pagination and Sorting

Page<User> page = userRepository.findAll(PageRequest.of(0, 20, Sort.by("name")));
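Page indexes in PageRequest.of are zero-based, so the call above requests the first 20 users. The arithmetic behind a page request can be sketched in plain Java. The helper names `offset` and `totalPages` are chosen for this example and only mirror what the framework computes.

```java
// Sketch of the arithmetic behind page-based queries; helper names are illustrative.
class PageMathSketch {

    // Row offset of the first element on a zero-based page.
    static long offset(int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize >= 1");
        }
        return (long) pageNumber * pageSize;
    }

    // Number of pages needed to hold totalElements rows (ceiling division).
    static int totalPages(long totalElements, int pageSize) {
        if (totalElements < 0 || pageSize < 1) {
            throw new IllegalArgumentException("totalElements must be >= 0 and pageSize >= 1");
        }
        return (int) ((totalElements + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        System.out.println(offset(0, 20));      // first page starts at row 0
        System.out.println(offset(2, 20));      // third page starts at row 40
        System.out.println(totalPages(45, 20)); // 45 rows need 3 pages of 20
    }
}
```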

Custom Repository Methods

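Use custom repository implementations when query methods or @Query are not enough. Declare a fragment interface, implement it in a class whose name ends in Impl (UserRepositoryCustomImpl by default), and have UserRepository extend both JpaRepository<User, Long> and the fragment interface.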

public interface UserRepositoryCustom {
    List<User> findActivePremiumUsers();
}

5. Query Methods

Spring Data JPA can derive queries from repository method names.

Derived Query Examples

List<User> findByLastName(String lastName);

Optional<User> findByEmail(String email);

List<User> findByAgeGreaterThan(int age);

List<User> findByStatusAndCreatedAtAfter(Status status, LocalDateTime createdAt);

boolean existsByEmail(String email);

long countByStatus(Status status);

void deleteByStatus(Status status);

Common Keywords

Keyword	Example
`And`	`findByStatusAndType`
`Or`	`findByStatusOrType`
`Between`	`findByCreatedAtBetween`
`LessThan`	`findByAgeLessThan`
`GreaterThan`	`findByAgeGreaterThan`
`Like`	`findByNameLike`
`Containing`	`findByNameContaining`
`StartingWith`	`findByNameStartingWith`
`EndingWith`	`findByNameEndingWith`
`IsNull`	`findByDeletedAtIsNull`
`IsNotNull`	`findByDeletedAtIsNotNull`
`In`	`findByStatusIn`
`OrderBy`	`findByStatusOrderByCreatedAtDesc`

JPQL with `@Query`

@Query("select u from User u where u.status = :status")
List<User> findUsersByStatus(@Param("status") Status status);

Native Query

@Query(value = "select * from users where email = :email", nativeQuery = true)
Optional<User> findByEmailNative(@Param("email") String email);

Modifying Query

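@Modifying
@Transactional
@Query("update User u set u.status = :status where u.id = :id")
int updateStatus(@Param("id") Long id, @Param("status") Status status);

A modifying query runs directly against the database and bypasses the persistence context, so entities that are already loaded may hold stale state. @Modifying(clearAutomatically = true) clears the persistence context after the query runs.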

Projections

Interface-based projection:

public interface UserSummary {
    String getName();
    String getEmail();
}

List<UserSummary> findByStatus(Status status);

DTO projection:

@Query("select new com.example.UserDto(u.id, u.name) from User u")
List<UserDto> findUserDtos();

6. Interview Questions

1. What is the difference between JPA, Hibernate, and Spring Data JPA?

JPA is a specification for ORM in Java. Hibernate is an implementation of the JPA specification. Spring Data JPA is an abstraction over JPA that reduces boilerplate repository code.

2. What are the entity lifecycle states?

The main states are transient, managed, detached, and removed.

3. What is dirty checking?

Dirty checking is the automatic detection and persistence of changes made to managed entities inside a transaction.

4. What is the difference between `persist()` and `merge()`?

persist() makes a new transient entity managed. merge() copies the state of a detached entity into a managed entity and returns that managed instance.

5. What is the N+1 query problem?

N+1 occurs when one query loads a list of parent entities and then one additional query is executed for each parent to load related data.

6. How can N+1 be fixed?

Use fetch joins, entity graphs, DTO projections, or batch fetching.

7. What is the difference between lazy and eager loading?

Lazy loading loads associations when accessed. Eager loading loads associations immediately with the owning entity.

8. What are default fetch types in JPA?

@OneToOne and @ManyToOne are eager by default. @OneToMany and @ManyToMany are lazy by default.

9. What is cascade in JPA?

Cascade propagates entity operations from one entity to associated entities.

10. What is the difference between cascade remove and orphan removal?

Cascade remove deletes children when the parent is deleted. Orphan removal deletes a child when it is removed from the parent relationship.

11. Why should `CascadeType.REMOVE` be avoided on many-to-many relationships?

Because the entities on both sides usually have independent lifecycles and may be referenced by other parents. Removing one entity should delete only its join table rows, not the related entities themselves.

12. What is the persistence context?

The persistence context is the set of managed entity instances tracked by an entity manager. It also acts as the first-level cache.

13. What is the first-level cache?

The first-level cache is the persistence-context-level cache. Within the same persistence context, the same entity ID maps to the same managed object instance.
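The identity guarantee can be pictured as a map from entity ID to instance. The following plain-Java sketch is a toy model under that assumption. `ToyPersistenceContext`, `find()`, and `databaseLoads` are hypothetical names, and a real provider does considerably more.

```java
import java.util.HashMap;
import java.util.Map;

// Toy identity map illustrating first-level cache behavior; not a real provider.
class FirstLevelCacheSketch {

    static final class User {
        final long id;

        User(long id) {
            this.id = id;
        }
    }

    // Hypothetical stand-in for one persistence context.
    static final class ToyPersistenceContext {
        private final Map<Long, User> identityMap = new HashMap<>();
        int databaseLoads = 0;

        // Return the tracked instance if present; otherwise "load" it once and track it.
        User find(long id) {
            User cached = identityMap.get(id);
            if (cached != null) {
                return cached;
            }
            databaseLoads++;
            User loaded = new User(id);
            identityMap.put(id, loaded);
            return loaded;
        }
    }

    public static void main(String[] args) {
        ToyPersistenceContext context = new ToyPersistenceContext();
        User first = context.find(1L);
        User second = context.find(1L);

        System.out.println(first == second);        // same instance within one context
        System.out.println(context.databaseLoads);  // loaded from the "database" once

        ToyPersistenceContext otherContext = new ToyPersistenceContext();
        System.out.println(first == otherContext.find(1L)); // a new context has its own instances
    }
}
```

The last line reflects that the guarantee holds per persistence context, not across contexts.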

14. What is the difference between `getReferenceById()` and `findById()`?

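findById() loads the entity immediately (from the persistence context if it is already managed, otherwise from the database) and returns an Optional. getReferenceById() returns a lazy proxy without querying; the database is hit only when the proxy's state is accessed, and an EntityNotFoundException is thrown at that point if the row does not exist.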

15. What is the purpose of `@Transactional`?

@Transactional defines a transaction boundary. It allows multiple database operations to succeed or fail as one unit and keeps managed entities attached during the transaction.

16. What is the difference between JPQL and native SQL?

JPQL works with entity names and fields. Native SQL works directly with database tables and columns.

17. What are projections?

Projections allow queries to return selected fields instead of full entity objects.

18. What is optimistic locking?

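Optimistic locking uses a version field, usually annotated with @Version, to detect concurrent updates. The provider increments the version on each update and includes the expected version in the UPDATE's WHERE clause; if no row matches, it raises an OptimisticLockException.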

@Version
private Long version;
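The version check can be sketched as a compare-and-update against an in-memory table. This is a toy model of the idea, not provider code. `ToyTable`, `Row`, and `update()` are hypothetical names, and the return value stands in for the affected-row count of an UPDATE statement.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of version-based optimistic locking; not a real JPA provider.
class OptimisticLockSketch {

    // Immutable snapshot of one row, including its version column.
    static final class Row {
        final String email;
        final long version;

        Row(String email, long version) {
            this.email = email;
            this.version = version;
        }
    }

    // Hypothetical in-memory table keyed by primary key.
    static final class ToyTable {
        private final Map<Long, Row> rows = new HashMap<>();

        void insert(long id, String email) {
            rows.put(id, new Row(email, 0L));
        }

        Row read(long id) {
            return rows.get(id);
        }

        // Models: update users set email = ?, version = version + 1
        //         where id = ? and version = ?
        // Returns the number of rows updated (0 or 1).
        int update(long id, String newEmail, long expectedVersion) {
            Row current = rows.get(id);
            if (current == null || current.version != expectedVersion) {
                return 0; // someone else updated the row first
            }
            rows.put(id, new Row(newEmail, expectedVersion + 1));
            return 1;
        }
    }

    public static void main(String[] args) {
        ToyTable table = new ToyTable();
        table.insert(1L, "old@example.com");

        // Two transactions read the same row at the same version.
        Row seenByFirst = table.read(1L);
        Row seenBySecond = table.read(1L);

        int first = table.update(1L, "first@example.com", seenByFirst.version);
        int second = table.update(1L, "second@example.com", seenBySecond.version);

        System.out.println("first update rows: " + first);
        System.out.println("second update rows: " + second); // stale version, nothing updated
    }
}
```

A zero row count on the second update is the condition a JPA provider reports as an optimistic locking failure.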

19. What is the difference between `save()` and `saveAndFlush()`?

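save() calls persist() for a new entity or merge() for an existing one, and the SQL is typically deferred until the next flush or commit. saveAndFlush() does the same and then flushes the persistence context immediately, although the transaction is still not committed.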

20. When should custom repository implementations be used?

Use custom repositories when derived methods, specifications, query annotations, or projections are not expressive enough.

7. Cheat Sheet

Entity Lifecycle

Concept	Quick Note
Transient	New object, not tracked
Managed	Tracked by persistence context
Detached	Previously tracked, now outside context
Removed	Scheduled for deletion
Dirty checking	Auto-persist changes to managed entities
Flush	Synchronizes persistence context with database

Fetching

Topic	Quick Note
Lazy	Load on access
Eager	Load immediately
Collection default	Lazy
To-one default	Eager
N+1 fix	Fetch join, entity graph, projection, batching

Cascades

Cascade	Propagates
`PERSIST`	Save new child
`MERGE`	Merge detached child
`REMOVE`	Delete child
`REFRESH`	Reload child
`DETACH`	Detach child
`ALL`	All cascade operations

Repositories

Method	Purpose
`save()`	Insert or update
`findById()`	Find by primary key
`findAll()`	Find all rows
`delete()`	Delete entity
`existsById()`	Check existence
`count()`	Count rows
`flush()`	Force synchronization

Query Method Patterns

Pattern	Example
Equality	`findByEmail`
Multiple conditions	`findByStatusAndType`
Range	`findByCreatedAtBetween`
Comparison	`findByAgeGreaterThan`
Null check	`findByDeletedAtIsNull`
Collection match	`findByStatusIn`
Sorting	`findByStatusOrderByCreatedAtDesc`

Annotations

Annotation	Purpose
`@Entity`	Marks a JPA entity
`@Id`	Primary key
`@GeneratedValue`	Primary key generation
`@Table`	Maps entity to table
`@Column`	Maps field to column
`@OneToOne`	One-to-one relationship
`@ManyToOne`	Many-to-one relationship
`@OneToMany`	One-to-many relationship
`@ManyToMany`	Many-to-many relationship
`@JoinColumn`	Foreign key column
`@JoinTable`	Join table mapping
`@Query`	Custom JPQL/native query
`@Modifying`	Update/delete query
`@Transactional`	Transaction boundary
`@Version`	Optimistic locking

Best Practices

Prefer lazy loading by default.
Use DTO projections for read-heavy screens.
Avoid exposing entities directly from REST APIs.
Keep transactions short.
Do not use CascadeType.ALL unless the parent truly owns the child lifecycle.
Avoid bidirectional relationships unless needed.
Use fetch joins or entity graphs intentionally, not globally.
Monitor SQL logs when tuning performance.
