Boosting Database Performance in Java: A Guide to JPA Best Practices

The Java Trail
10 min read · Sep 19, 2023


Java Persistence API (JPA) is a Java specification that defines a standard way to manage relational data in Java applications. It provides a high-level, object-oriented view of data persistence, which simplifies the development of Java applications that need to interact with a database.

Object-Relational Mapping (ORM) is a technique that maps Java classes to relational database tables. It lets developers interact with the database through Java objects, without hand-writing SQL for routine operations.

JPA provides a number of features that make it easier to manage data persistence, including:

  • Entity mapping: JPA allows developers to map Java classes to relational database tables using annotations, which simplifies creating and maintaining the mapping between objects and tables.
  • Object-oriented queries: JPA provides the Java Persistence Query Language (JPQL), an object-oriented query language for querying data from the database (a short example follows this list).
  • Transaction management: JPA provides a transaction management API that allows developers to manage database transactions.
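For instance, a JPQL query against an annotation-mapped entity looks like this (a minimal sketch, assuming an open EntityManager — introduced in the Hibernate section below — and the User entity used later in this article):

// Query users by name; ":name" is a named JPQL parameter.
TypedQuery<User> query = em.createQuery(
        "SELECT u FROM User u WHERE u.name = :name", User.class);
query.setParameter("name", "John Doe");
List<User> users = query.getResultList();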

Hibernate

Hibernate is a popular open-source implementation of the JPA specification.

  • Caching: Hibernate can cache entity data in memory (a first-level cache per session and an optional second-level cache), which reduces repeated database reads.
  • Lazy loading: Hibernate can lazily load data from the database, which means it loads the data only when it is actually needed.
  • Transaction management: Hibernate provides built-in transaction management support, which helps ensure the integrity of data.

To use Hibernate, developers first need to create a persistence unit. This is a configuration file (typically META-INF/persistence.xml) that tells Hibernate which database to connect to and which Java classes to map to database tables.
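A minimal persistence.xml sketch (the unit name, connection details, and mapped class below are placeholders):

<persistence xmlns="http://xmlns.jcp.org/xml/ns/persistence" version="2.2">
    <persistence-unit name="my-persistence-unit">
        <!-- Java classes to map to database tables -->
        <class>com.example.User</class>
        <properties>
            <!-- Which database to connect to -->
            <property name="javax.persistence.jdbc.driver" value="com.mysql.cj.jdbc.Driver"/>
            <property name="javax.persistence.jdbc.url" value="jdbc:mysql://localhost:3306/mydb"/>
            <property name="javax.persistence.jdbc.user" value="your_username"/>
            <property name="javax.persistence.jdbc.password" value="your_password"/>
        </properties>
    </persistence-unit>
</persistence>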

Once a persistence unit has been created, developers can use the Hibernate EntityManager to interact with the database. The EntityManager is responsible for managing the lifecycle of persistent objects, such as saving, updating, and deleting them.

// Create an EntityManagerFactory.
EntityManagerFactory emf = Persistence.createEntityManagerFactory("my-persistence-unit");

// Create an EntityManager.
EntityManager em = emf.createEntityManager();

// Create a new User object.
User user = new User();
user.setName("John Doe");
user.setEmail("john.doe@example.com");

// Save the User object to the database inside a transaction
// (persist() outside an active transaction would not be flushed).
em.getTransaction().begin();
em.persist(user);
em.getTransaction().commit();

// Close the EntityManager.
em.close();

To retrieve the User object from the database, developers can use the following code:

// Create an EntityManager.
EntityManager em = emf.createEntityManager();

// Find the User object by its ID.
User user = em.find(User.class, 1L);

// Close the EntityManager.
em.close();

Optimize Database Performance by using JPA

1. Use Proper Indexing:

Indexes are crucial for efficient database queries. Make sure to add indexes to columns that are frequently used in WHERE clauses or JOIN operations. JPA provides the @Index annotation (declared inside @Table) to define indexes on entity columns.

Indexes are data structures that improve the speed of data retrieval operations on database tables. They work like the index of a book, allowing the database to quickly locate the rows that satisfy a query’s WHERE clause or are involved in JOIN operations. Without indexes, the database would need to scan the entire table, which can be slow for large datasets.

import javax.persistence.*;

@Entity
// javax.persistence.Index is declared inside @Table; it cannot be placed directly on a field.
@Table(name = "products", indexes = @Index(name = "idx_sku", columnList = "sku"))
public class Product {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    // The sku column is indexed via the @Table definition above
    @Column(unique = true)
    private String sku;

    @Column
    private String name;

    // ...
}

Considerations

  • Indexes come with a trade-off. While they speed up read operations, they can slow down write (insert, update, delete) operations, because the database needs to maintain the index data structure. Therefore, it’s essential to choose the columns to index carefully, focusing on those that are frequently used in WHERE clauses or JOIN operations.
  • In addition to single-column indexes, you can create composite indexes on multiple columns if you frequently query based on combinations of those columns, as in the sketch below.
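A composite index covering two columns that are usually filtered together could be declared like this (a sketch; the entity and column names are illustrative):

@Entity
@Table(name = "orders",
       indexes = @Index(name = "idx_customer_status", columnList = "customer_id, status"))
public class CustomerOrder {

    // customer_id and status are frequently combined in WHERE clauses,
    // so a single composite index can serve both conditions in one lookup.

    // ...
}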

2. Lazy Loading

Lazy loading is a technique used to optimize database performance by loading related entities from the database only when they are explicitly accessed.

  • Reduces memory usage: only the data that is actually accessed is loaded, which is especially important for applications that handle large amounts of data, such as e-commerce websites and social media platforms.
  • Improves load time: less data is fetched up front, which is especially important for mobile devices with limited memory and processing power.
  • Reduces bandwidth usage: lazy loading only transfers data when it is needed, which is especially important for users with slow internet connections.

Imagine that you have a web page with 100 images. Without lazy loading, all of the images would be loaded into memory when the user opens the page. This could take a long time, especially if the user has a slow internet connection or a device with limited memory.

With lazy loading, only the images that are visible on the screen would be loaded into memory when the user opens the page. As the user scrolls down the page, the remaining images would be loaded into memory one at a time. This would significantly improve the load time of the page and reduce the amount of memory and bandwidth that is used.

Consider a common scenario in a Spring Boot application where you have two entities: Author and Book, representing a library. Each Author can have multiple Book entities associated with them.

@Entity
public class Author {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;

    // Define the relationship with books (@OneToMany is lazy by default)
    @OneToMany(mappedBy = "author")
    private List<Book> books;

    // Getters and setters
}

@Entity
public class Book {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String title;

    // Define the relationship with author (@ManyToOne is EAGER by default,
    // so request lazy loading explicitly)
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "author_id")
    private Author author;

    // Getters and setters
}

// Fetch an author (only the author data is retrieved from the database)
Author author = authorRepository.findById(1L).orElse(null);

// Accessing the books property triggers the lazy loading of books
List<Book> books = author.getBooks();

In this case, the books will only be loaded from the database when author.getBooks() is called, which can be more efficient, especially if an author has a large number of books. Note that the collection must be accessed while the persistence context is still open (for example, inside a transactional service method); otherwise Hibernate throws a LazyInitializationException.

Eager Loading: you can configure eager loading with the fetch attribute of the relationship annotations, so the related entities are loaded together with their owner:

@Entity
public class Author {
    // ...

    @OneToMany(mappedBy = "author", fetch = FetchType.EAGER)
    private List<Book> books;

    // ...
}

Considerations

Caution with the N+1 Query Problem:

The N+1 query problem occurs when you retrieve a collection of entities and then access a lazy-loaded relationship for each entity within a loop. Each access can trigger a separate database query for the related entities, leading to a significant increase in the number of database queries and potential performance degradation.

List<Author> authors = authorRepository.findAll();

for (Author author : authors) {
    List<Book> books = author.getBooks(); // Lazy loading, triggers separate queries
}

Each call to author.getBooks() triggers a database query to retrieve the books for the respective author. If you have N authors, this will result in N+1 database queries: one query to fetch the authors and one additional query for each author to fetch their books.

How did I solve the N+1 query problem?

To mitigate the N+1 query problem, you can use DTO (Data Transfer Object) projections to fetch only the required data from the database in a single query.

// DTO definition: the query below produces one row per author/book pair,
// so the constructor takes a single book title.
public class AuthorDto {

    private Long id;
    private String name;
    private String bookTitle; // Include only the necessary book data

    public AuthorDto(Long id, String name, String bookTitle) {
        this.id = id;
        this.name = name;
        this.bookTitle = bookTitle;
    }

    // Getters and setters
}

// DTO query: a single JPQL query with a constructor expression
public interface AuthorRepository extends JpaRepository<Author, Long> {

    @Query("SELECT new com.example.dto.AuthorDto(a.id, a.name, b.title) FROM Author a JOIN a.books b")
    List<AuthorDto> findAllAuthorsWithBooks();
}

// One query fetches everything; rows can be grouped by author id in memory if needed.
List<AuthorDto> authorsWithBooks = authorRepository.findAllAuthorsWithBooks();
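If you need the full Book entities rather than a projection, a JPQL fetch join is another common way to avoid the extra queries. A sketch of a method you could add to AuthorRepository:

// Loads each author together with their books in a single SQL query.
// LEFT JOIN FETCH also keeps authors without books; DISTINCT removes duplicate author rows.
@Query("SELECT DISTINCT a FROM Author a LEFT JOIN FETCH a.books")
List<Author> findAllWithBooks();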

3. Caching Results (Ehcache + Redis)

Configuring Ehcache as Hibernate’s second-level cache provider and Redis as a distributed Spring cache can be a powerful combination for improving the performance of your Spring Boot application. The steps below wire up Redis as the Spring cache provider; a note after step 2 sketches the Ehcache side.

  1. Set up a Spring Boot project: Create or use an existing Spring Boot project with the following dependencies:
  • Spring Web
  • Spring Data JPA
  • Spring Data Redis
  • MySQL
  • Redis (You need to have a Redis server running)
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>

2. Configure MySQL and Redis in your application: In your application.properties or application.yml, configure both MySQL and Redis settings:


# MySQL configuration
spring.datasource.url=jdbc:mysql://localhost:3306/your_database_name
spring.datasource.username=your_username
spring.datasource.password=your_password
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver

# Redis configuration
spring.redis.host=localhost
spring.redis.port=6379
spring.cache.type=redis
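The properties above make Redis the Spring cache provider. To also enable Ehcache as Hibernate’s second-level cache, one approach (a sketch, assuming the hibernate-jcache and ehcache dependencies are on the classpath) is:

# Hibernate second-level cache backed by Ehcache through JCache
spring.jpa.properties.hibernate.cache.use_second_level_cache=true
spring.jpa.properties.hibernate.cache.region.factory_class=jcache

Entities that should use the second-level cache additionally need to be marked cacheable, for example with the javax.persistence @Cacheable annotation.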

3. Enable Caching in your Spring Boot Application:

@SpringBootApplication
@EnableCaching
public class YourApplication {

    public static void main(String[] args) {
        SpringApplication.run(YourApplication.class, args);
    }
}

4. Define an Entity with JPA:

@Entity
public class Product {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;
    private double price;

    // Getters and setters
}

5. Create a Repository Interface: Create a repository interface for your Product entity by extending JpaRepository.

public interface ProductRepository extends JpaRepository<Product, Long> {
}

6. Use Caching in Your Service Layer: In your service layer, you can use caching annotations to control caching behavior. For example, you can use @Cacheable to cache the results of a method.

@Service
public class ProductService {

    @Autowired
    private ProductRepository productRepository;

    @Cacheable("products")
    public List<Product> getAllProducts() {
        return productRepository.findAll();
    }

    @Cacheable(value = "products", key = "#id")
    public Product getProductById(Long id) {
        return productRepository.findById(id).orElse(null);
    }

    @CacheEvict(value = "products", allEntries = true)
    public void clearProductCache() {
        // Intentionally empty: the @CacheEvict annotation clears the "products" cache.
    }
}
  • @Cacheable("products") tells Spring to cache the results of the method in the "products" cache.
  • @Cacheable(value = "products", key = "#id") caches individual products with a specific ID.
  • @CacheEvict(value = "products", allEntries = true) clears the entire "products" cache when data is updated.
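When product data actually changes, the eviction should happen as part of the update. A minimal sketch (the updateProduct method here is illustrative, not part of the service above):

// Saves the change and evicts the "products" cache so the cached list and
// cached individual products are reloaded from the database on the next read.
@CacheEvict(value = "products", allEntries = true)
public Product updateProduct(Product product) {
    return productRepository.save(product);
}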

7. Test the Caching:

@RestController
@RequestMapping("/api/products")
public class ProductController {

    @Autowired
    private ProductService productService;

    @GetMapping
    public List<Product> getAllProducts() {
        return productService.getAllProducts();
    }

    @GetMapping("/{id}")
    public Product getProductById(@PathVariable Long id) {
        return productService.getProductById(id);
    }

    @PostMapping("/clear-cache")
    public void clearProductCache() {
        productService.clearProductCache();
    }
}
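A quick way to verify the cache is working is to enable SQL logging and watch the console: the first request for a product should trigger a query, while repeated requests for the same data should not (using the standard Spring Boot property below):

# Log the SQL Hibernate executes, to confirm that cache hits skip the database
spring.jpa.show-sql=true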

4. Paging and Sorting:

Paging and sorting in a Spring Boot application significantly boost performance by reducing the database workload and network latency.

By fetching data in manageable chunks (pages) and ordering it efficiently, you minimize the amount of data retrieved from the database, improving query execution times and network efficiency. Fetching and holding only a portion of the data at a time also prevents excessive memory usage, which in turn enhances the application’s responsiveness and scalability.

1. Define a Spring Data JPA Repository:

public interface UserRepository extends JpaRepository<User, Long> {
    // Define custom query methods if needed
}

2. Implement Paging and Sorting:

@Service
public class UserService {

    @Autowired
    private UserRepository userRepository;

    public Page<User> getUsers(int page, int size, String sortBy) {
        // Create a Pageable object to define the page number, page size, and sorting
        Pageable pageable = PageRequest.of(page, size, Sort.by(sortBy));

        // Use the findAll method of your repository with pageable as a parameter
        return userRepository.findAll(pageable);
    }
}

@RestController
@RequestMapping("/api/users")
public class UserController {

    @Autowired
    private UserService userService;

    @GetMapping
    public ResponseEntity<Page<User>> getUsers(
            @RequestParam(defaultValue = "0") int page,
            @RequestParam(defaultValue = "10") int size,
            @RequestParam(defaultValue = "id") String sortBy) {
        Page<User> users = userService.getUsers(page, size, sortBy);
        return ResponseEntity.ok(users);
    }
}
  • We create a Pageable object using PageRequest.of(page, size, Sort.by(sortBy)). This object defines the page number, page size, and sorting criteria. You can change page, size, and sortBy based on your application's requirements.
  • We use the findAll method of the UserRepository with the Pageable parameter to fetch the requested page of users.

/api/users?page=1&size=20&sortBy=name will return the second page of users, each page containing 20 users, sorted by the name field.

5. Bulk Updates & Deletes:

Performing bulk updates and deletes using JPA’s batch operations can significantly improve performance in certain scenarios compared to loading and modifying individual entities one by one. The performance improvement achieved with bulk update and delete operations lies in the reduction of database round-trips, minimized data transfer between the application and the database, and the efficiency of executing single SQL statements.

Scenario: Suppose you want to update the availability of books published before a certain year.

Using Individual Updates:

@Transactional
public void updateBooksAvailability(int year, boolean available) {
    List<Book> books = bookRepository.findByPublishedYearLessThan(year);
    for (Book book : books) {
        book.setAvailable(available);
        // Dirty checking flushes each modified entity back to the database at commit.
    }
}

Using Bulk Update:

public interface BookRepository extends JpaRepository<Book, Long> {

    @Transactional
    @Modifying
    @Query("UPDATE Book b SET b.available = :available WHERE b.publishedYear < :year")
    int updateBooksAvailability(@Param("available") boolean available, @Param("year") int year);
}

  • When using individual updates, you first load all the books published before the specified year into memory. Then you iterate over each book, modify its availability, and let the persistence context flush the change back to the database. This approach generates one SELECT plus many UPDATE statements, resulting in database round-trips and increased overhead.
  • In contrast, the bulk update approach uses a single JPQL (Java Persistence Query Language) query with the @Modifying annotation. This query directly updates the available attribute of the books that meet the criteria without loading them into memory individually. It performs a single SQL UPDATE statement, which is more efficient and significantly reduces database round-trips and overhead. Note that bulk JPQL updates bypass the persistence context, so entities already loaded in memory can become stale; Spring Data’s @Modifying(clearAutomatically = true) can clear the context after the update.

Another Approach: Use JDBC Batch Processing:

  • Create a connection to your database using JDBC.
  • Prepare an SQL INSERT statement with placeholders for the data you want to insert.
  • Use a PreparedStatement to execute batch inserts by repeatedly setting parameter values and adding them to the batch.
  • Execute the batch of inserts and handle any exceptions that may occur.

Example using JDBC batch processing for inserting book data:

try (Connection connection = dataSource.getConnection();
     PreparedStatement preparedStatement = connection.prepareStatement(
             "INSERT INTO book (title, author, published_year) VALUES (?, ?, ?)"
     )) {

    int batchSize = 50;
    int count = 0;

    for (BookData data : bookDataList) {
        preparedStatement.setString(1, data.getTitle());
        preparedStatement.setString(2, data.getAuthor());
        preparedStatement.setInt(3, data.getPublishedYear());
        preparedStatement.addBatch();

        if (++count % batchSize == 0) {
            preparedStatement.executeBatch();
        }
    }

    preparedStatement.executeBatch(); // Execute any remaining batched inserts
} catch (SQLException e) {
    // Handle exceptions
}

For very large datasets, where database-specific optimizations come into play, the @Modifying approach with JPA may have an advantage, as JPA providers often have built-in optimizations for bulk operations.
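If you prefer to stay inside JPA for bulk inserts and updates, Hibernate can also group its SQL statements into JDBC batches. A sketch of the relevant Spring Boot properties (the values are illustrative; note that Hibernate cannot batch inserts for entities that use IDENTITY id generation):

# Let Hibernate send inserts/updates to the JDBC driver in batches of 50
spring.jpa.properties.hibernate.jdbc.batch_size=50
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true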
