← Back to the section

Spring Data JPA removes almost all the boilerplate around working with a database. That's convenient as long as the queries are simple. Once they get more complex, you need to understand what happens under the hood, otherwise strange slowdowns appear. Let's start from scratch.

Why you need a repository

In the past, to pull data out of a database, you wrote the same tedious code for every table: open a connection, compose the SQL, walk through the result, assemble objects, close the connection. Ten tables meant ten nearly identical classes, and each one was easy to get wrong.

A repository is an object responsible for reading and writing a single entity (an order, for example). The idea behind Spring Data is that you don't have to write its implementation: you declare only an interface, and Spring generates the code for you.

public interface OrderRepository extends JpaRepository<Order, UUID> {
}

Here Order is the entity class that maps to a table row, and UUID is the type of its primary key. From this single line alone you get ready-made methods for free: save, findById, findAll, deleteById, count, and others. Spring creates the implementation itself when the application starts.

JpaRepository is the richest of the ready-made interfaces. There are simpler ones too (CrudRepository, PagingAndSortingRepository), but in practice people almost always reach for JpaRepository — it includes the capabilities of the others.

Queries from a method name

Ready-made methods like findById aren't always enough — you often need to search by other fields. In the past you'd write SQL by hand for that. Spring Data has a cleverer trick: it reads the method name and composes the query itself.

public interface OrderRepository extends JpaRepository<Order, UUID> {

    List<Order> findByCustomerIdAndStatus(UUID customerId, OrderStatus status);

    Optional<Order> findFirstByCustomerIdOrderByCreatedAtDesc(UUID customerId);

    long countByCustomerId(UUID customerId);

    boolean existsByOrderNumber(String orderNumber);
}

Spring breaks the name into parts: findBy (we're searching), CustomerId and Status (by which fields), And (both conditions). The method parameters are substituted in the same order. You don't need to write an implementation.

The names understand keywords: And, Or, Between, LessThan, GreaterThan, Like, In, IsNull, OrderBy<Field>Asc/Desc, Top<N> (the first N), Distinct. A fairly complex search can be assembled from these.

The technique has a limit: once a method name grows to six or seven words, it becomes impossible to read. At that point it's time to switch to a hand-written query.

A hand-written query with @Query

When a query can't be expressed through a method name, you write it explicitly in a @Query annotation. Inside is not pure SQL but JPQL: a similar language, except it operates on entity classes and their fields rather than tables and columns.

@Query("""
    SELECT o FROM Order o
    WHERE o.customerId = :customerId
      AND o.status IN :statuses
    ORDER BY o.createdAt DESC
""")
List<Order> findRecent(UUID customerId, Collection<OrderStatus> statuses);

Here Order is the class name, not the table name; o.status is a field of the object. Parameters are passed by name: :customerId is taken from the customerId argument.

If JPQL isn't enough (for example, you need functions specific to a particular database), you can write real SQL — for that you add nativeQuery = true:

@Query(value = "SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '24 hours'",
       nativeQuery = true)
List<Order> findRecentNative();

The price is that the query becomes tied to a specific database and isn't checked at compile time: you'll only spot a typo in a column name at runtime.

Projections — when you don't need every field

An ordinary repository method returns a whole entity — with all its fields. But often a list on the screen needs only two or three fields, and pulling everything else out of the database is wasted work and memory.

A projection is a way to return only the set of fields you need. The simplest option is to declare an interface with the required getters:

public interface OrderSummary {
    UUID getId();
    String getCustomerName();
    BigDecimal getTotalAmount();
}

public interface OrderRepository extends JpaRepository<Order, UUID> {
    List<OrderSummary> findByCustomerId(UUID customerId);
}

Spring will see that the method returns OrderSummary and compose SQL with just three columns instead of the whole row.

The second option is to return your own class (a record, for example), assembling it right in the query:

public record OrderSummary(UUID id, String customerName, BigDecimal totalAmount) {}

@Query("""
    SELECT new com.example.OrderSummary(o.id, c.name, o.totalAmount)
    FROM Order o JOIN o.customer c
    WHERE c.id = :customerId
""")
List<OrderSummary> findSummaries(UUID customerId);

Both approaches give the same benefit: only what you actually need arrives from the database.

Paged output: Page and Slice

An order list may contain millions of rows — you can't hand it over whole. The data is cut into pages. To get a single page, you pass the method a Pageable object:

Page<Order> page = repo.findByStatus(OrderStatus.PENDING,
    PageRequest.of(0, 20, Sort.by("createdAt").descending()));

page.getContent();        // 20 orders on this page
page.getTotalElements();  // how many orders in total
page.getTotalPages();     // how many pages in total

PageRequest.of(0, 20, ...) means: page number 0 (the first), 20 elements each, sorted by date descending.

An important detail: Page runs two queries — one fetches the 20 rows themselves, the second counts the total number. On a large table the second query (counting all rows) can be expensive.

If you don't need to know the total — for infinite scrolling, for example, where all that matters is "is there more" — you use Slice. It doesn't count everything and runs a single query:

Slice<Order> slice = repo.findByStatus(OrderStatus.PENDING, PageRequest.of(0, 20));
slice.hasNext();  // true if there's another page ahead

The rule is simple: need page numbers and a total count — Page; need only "load more" — Slice.

The N+1 problem

This is the most common and the most treacherous problem when working with JPA. First, how it works.

Entities have relationships: an order has a customer, an order has line items. By default, related data is loaded lazily: until you touch it, the database isn't queried for it. As soon as you access the field, Hibernate quietly runs a separate query.

That sounds reasonable, but look what happens in a loop:

List<Order> orders = repo.findAll();                      // 1 query: fetched the orders
for (Order order : orders) {
    System.out.println(order.getCustomer().getName());    // +1 query for EACH order
}

A hundred orders is one query for the list plus a hundred queries for the customers. Hence the name: N+1. On the page everything works, on test data it's fast, but under real load the database chokes. It's cured with two techniques.

JOIN FETCH — pull it all at once

We ask Hibernate to fetch both the orders and the customers in a single query:

@Query("SELECT o FROM Order o JOIN FETCH o.customer WHERE o.status = :status")
List<Order> findWithCustomer(OrderStatus status);

Now it's one SQL statement instead of a hundred and one.

@EntityGraph — the same thing, but declaratively

If you'd rather not write a query, you can simply list what to load along with it:

@EntityGraph(attributePaths = {"customer", "lines"})
List<Order> findByStatus(OrderStatus status);

The result is the same — the related data arrives in a single query, and the loop no longer spawns database round-trips.

Open Session In View

Another trap tied to lazy loading. By default, Spring Boot keeps the database connection open for the entire HTTP request — this setting is called Open Session In View (spring.jpa.open-in-view=true).

The convenience is that lazy fields can be touched anywhere: in the controller, in the page template. The problem is exactly that. Database queries start silently firing off while the response is being built, far from the place where the data is actually needed. That very N+1 problem spreads across the whole application, and it's hard to notice because "everything still works."

A common recommendation is to turn this setting off:

spring.jpa.open-in-view=false

After that, an attempt to access a lazy field outside a transaction immediately produces a LazyInitializationException error. That's not a bug but a useful signal: it forces you to fetch everything you need up front (via JOIN FETCH, @EntityGraph, or a projection) where the database work happens, and to hand out an already-prepared result.

In short

  • A repository is declared as an interface; Spring writes the implementation. The default choice is JpaRepository.
  • From a method name (findByCustomerIdAndStatus) Spring composes the query itself; when the name gets too long — switch to @Query.
  • In @Query you write JPQL (over classes and fields) or real SQL with nativeQuery = true.
  • Projections return only the fields you need — through an interface or through your own record in the query.
  • Page runs two queries (data + total count), Slice runs one (only "is there more").
  • N+1 — accessing lazy relationships in a loop spawns one query per element; cured with JOIN FETCH or @EntityGraph.
  • Open Session In View is often turned off (open-in-view=false): it exposes hidden queries and forces you to load data deliberately.
  • @Transactional in depth — how a transaction manages the database connection.
  • Spring Testing — @DataJpaTest and testing repositories.
  • ACID and isolation levels in PostgreSQL — what happens at the database level itself.