網易首頁 > 網易號 > 正文申請入駐

13 秒插入 30 萬條數據，這才是批量插入正確的姿勢！

2025-10-22 09:03:45　來源: Java精選

北京舉報

分享至

Java精選面試題（微信小程序）：5000+道面試題和選擇題，包含Java基礎、并發、JVM、線程、MQ系列、Redis、Spring系列、Elasticsearch、Docker、K8s、Flink、Spark、架構設計、大廠真題等，在線隨時刷題！

本文主要講述通過MyBatis、JDBC等做大數據量數據插入的案例和結果。

30萬條數據插入插入數據庫驗證

實體類、mapper和配置文件定義
- User實體
- mapper接口
- mapper.xml文件
- jdbc.properties
- sqlMapConfig.xml
不分批次直接梭哈
循環逐條插入
MyBatis實現插入30萬條數據
JDBC實現插入30萬條數據
總結

驗證的數據庫表結構如下：

CREATE TABLE `t_user` (   `id` int(11) NOT NULL AUTO_INCREMENT COMMENT '用戶id',   `username` varchar(64) DEFAULT NULL COMMENT '用戶名稱',   `age` int(4) DEFAULT NULL COMMENT '年齡',   PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='用戶信息表';

話不多說，開整！

實體類、mapper和配置文件定義

User實體

/**  * 用戶實體  *  * @Author zjq  */ @Data public class User {     private int id;     private String username;     private int age; }

mapper接口

public interface UserMapper {     /**      * 批量插入用戶      * @param userList      */     void batchInsertUser(@Param("list") List userList); }

mapper.xml文件

"batchInsertUser"  parameterType= "java.util.List" >     insert into t_user(username,age) values     "list"  item= "item"  index= "index"  separator= "," >         (          #{item.username},          #{item.age}         )

jdbc.properties

jdbc.driver=com.mysql.jdbc.Driver jdbc.url=jdbc:mysql://localhost:3306/test jdbc.username=root jdbc.password=root

sqlMapConfig.xml

"1.0" encoding="UTF-8" ?> "-//mybatis.org//DTD Config 3.0//EN""http://mybatis.org/dtd/mybatis-3-config.dtd">         "jdbc.properties" >                 type = "com.zjq.domain.User" alias = "user" >             "developement" >         "developement" >             type = "JDBC" >             type = "POOLED" >                 "driver"  value= "${jdbc.driver}" />                 "url"  value= "${jdbc.url}" />                 "username"  value= "${jdbc.username}" />                 "password"  value= "${jdbc.password}" />                                         "com/zjq/mapper/UserMapper.xml" >

不分批次直接梭哈

MyBatis直接一次性批量插入30萬條，代碼如下：

@Test public void testBatchInsertUser() throws IOException {     InputStream resourceAsStream =             Resources.getResourceAsStream("sqlMapConfig.xml");     SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(resourceAsStream);     SqlSession session = sqlSessionFactory.openSession();     System.out.println("===== 開始插入數據 =====");     long startTime = System.currentTimeMillis();     try {         List userList = new ArrayList<>();         for (int i = 1; i <= 300000; i++) {             User user = new User();             user.setId(i);             user.setUsername("共飲一杯無 " + i);             user.setAge((int) (Math.random() * 100));             userList.add(user);         }         session.insert("batchInsertUser", userList); // 最后插入剩余的數據         session.commit();         long spendTime = System.currentTimeMillis()-startTime;         System.out.println("成功插入 30 萬條數據,耗時："+spendTime+"毫秒");     } finally {         session.close();     } }

可以看到控制臺輸出：

Cause: com.mysql.jdbc.PacketTooBigException: Packet for query is too large (27759038 >yun 4194304). You can change this value on the server by setting the max_allowed_packet’ variable.

圖片

超出最大數據包限制了，可以通過調整max_allowed_packet限制來提高可以傳輸的內容，不過由于30萬條數據超出太多，這個不可取，梭哈看來是不行了

既然梭哈不行那我們就一條一條循環著插入行不行呢

循環逐條插入

mapper接口和mapper文件中新增單個用戶新增的內容如下:

/**  * 新增單個用戶  * @param user  */ void insertUser(User user); "insertUser"  parameterType= "user" >     insert into t_user(username,age) values         (          #{username},          #{age}         )

調整執行代碼如下：

@Test public void testCirculateInsertUser() throws IOException {     InputStream resourceAsStream =             Resources.getResourceAsStream("sqlMapConfig.xml");     SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(resourceAsStream);     SqlSession session = sqlSessionFactory.openSession();     System.out.println("===== 開始插入數據 =====");     long startTime = System.currentTimeMillis();     try {         for (int i = 1; i <= 300000; i++) {             User user = new User();             user.setId(i);             user.setUsername("共飲一杯無 " + i);             user.setAge((int) (Math.random() * 100));             // 一條一條新增             session.insert("insertUser", user);             session.commit();         }         long spendTime = System.currentTimeMillis()-startTime;         System.out.println("成功插入 30 萬條數據,耗時："+spendTime+"毫秒");     } finally {         session.close();     } }

執行后可以發現磁盤IO占比飆升，一直處于高位。

圖片

等啊等等啊等，好久還沒執行完

圖片

先不管他了太慢了先搞其他的，等會再來看看結果吧。

two thousand year later …

控制臺輸出如下：

圖片

總共執行了14909367毫秒，換算出來是4小時八分鐘。太慢了。。

圖片

還是優化下之前的批處理方案吧

MyBatis實現插入30萬條數據

先清理表數據，然后優化批處理執行插入：

-- 清空用戶表 TRUNCATE table  t_user;

以下是通過 MyBatis 實現 30 萬條數據插入代碼實現：

/**  * 分批次批量插入  * @throws IOException  */ @Test public void testBatchInsertUser() throws IOException {     InputStream resourceAsStream =             Resources.getResourceAsStream("sqlMapConfig.xml");     SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(resourceAsStream);     SqlSession session = sqlSessionFactory.openSession();     System.out.println("===== 開始插入數據 =====");     long startTime = System.currentTimeMillis();     int waitTime = 10;     try {         List userList = new ArrayList<>();         for (int i = 1; i <= 300000; i++) {             User user = new User();             user.setId(i);             user.setUsername("共飲一杯無 " + i);             user.setAge((int) (Math.random() * 100));             userList.add(user);             if (i % 1000 == 0) {                 session.insert("batchInsertUser", userList);                 // 每 1000 條數據提交一次事務                 session.commit();                 userList.clear();                 // 等待一段時間                 Thread.sleep(waitTime * 1000);             }         }         // 最后插入剩余的數據         if(!CollectionUtils.isEmpty(userList)) {             session.insert("batchInsertUser", userList);             session.commit();         }         long spendTime = System.currentTimeMillis()-startTime;         System.out.println("成功插入 30 萬條數據,耗時："+spendTime+"毫秒");     } catch (Exception e) {         e.printStackTrace();     } finally {         session.close();     } }

使用了 MyBatis 的批處理操作，將每 1000 條數據放在一個批次中插入，能夠較為有效地提高插入速度。同時請注意在循環插入時要帶有合適的等待時間和批處理大小，以防止出現內存占用過高等問題。此外，還需要在配置文件中設置合理的連接池和數據庫的參數，以獲得更好的性能。

圖片

在上面的示例中，我們每插入1000行數據就進行一次批處理提交，并等待10秒鐘。這有助于控制內存占用，并確保插入操作平穩進行。

圖片

五十分鐘執行完畢，時間主要用在了等待上。

如果低谷時期執行，CPU和磁盤性能又足夠的情況下，直接批處理不等待執行：

/**  * 分批次批量插入  * @throws IOException  */ @Test public void testBatchInsertUser() throws IOException {     InputStream resourceAsStream =             Resources.getResourceAsStream("sqlMapConfig.xml");     SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(resourceAsStream);     SqlSession session = sqlSessionFactory.openSession();     System.out.println("===== 開始插入數據 =====");     long startTime = System.currentTimeMillis();     int waitTime = 10;     try {         List userList = new ArrayList<>();         for (int i = 1; i <= 300000; i++) {             User user = new User();             user.setId(i);             user.setUsername("共飲一杯無 " + i);             user.setAge((int) (Math.random() * 100));             userList.add(user);             if (i % 1000 == 0) {                 session.insert("batchInsertUser", userList);                 // 每 1000 條數據提交一次事務                 session.commit();                 userList.clear();             }         }         // 最后插入剩余的數據         if(!CollectionUtils.isEmpty(userList)) {             session.insert("batchInsertUser", userList);             session.commit();         }         long spendTime = System.currentTimeMillis()-startTime;         System.out.println("成功插入 30 萬條數據,耗時："+spendTime+"毫秒");     } catch (Exception e) {         e.printStackTrace();     } finally {         session.close();     } }

則24秒可以完成數據插入操作：

圖片圖片

可以看到短時CPU和磁盤占用會飆高。

把批處理的量再調大一些調到5000，在執行：

圖片

13秒插入成功30萬條，直接蕪湖起飛

JDBC實現插入30萬條數據

JDBC循環插入的話跟上面的mybatis逐條插入類似，不再贅述。

以下是 Java 使用 JDBC 批處理實現 30 萬條數據插入的示例代碼。請注意，該代碼僅提供思路，具體實現需根據實際情況進行修改。

/**  * JDBC分批次批量插入  * @throws IOException  */ @Test public void testJDBCBatchInsertUser() throws IOException {     Connection connection = null;     PreparedStatement preparedStatement = null;     String databaseURL = "jdbc:mysql://localhost:3306/test";     String user = "root";     String password = "root";     try {         connection = DriverManager.getConnection(databaseURL, user, password);         // 關閉自動提交事務，改為手動提交         connection.setAutoCommit(false);         System.out.println("===== 開始插入數據 =====");         long startTime = System.currentTimeMillis();         String sqlInsert = "INSERT INTO t_user ( username, age) VALUES ( ?, ?)";         preparedStatement = connection.prepareStatement(sqlInsert);         Random random = new Random();         for (int i = 1; i <= 300000; i++) {             preparedStatement.setString(1, "共飲一杯無 " + i);             preparedStatement.setInt(2, random.nextInt(100));             // 添加到批處理中             preparedStatement.addBatch();             if (i % 1000 == 0) {                 // 每1000條數據提交一次                 preparedStatement.executeBatch();                 connection.commit();                 System.out.println("成功插入第 "+ i+" 條數據");             }         }         // 處理剩余的數據         preparedStatement.executeBatch();         connection.commit();         long spendTime = System.currentTimeMillis()-startTime;         System.out.println("成功插入 30 萬條數據,耗時："+spendTime+"毫秒");     } catch (SQLException e) {         System.out.println("Error: " + e.getMessage());     } finally {         if (preparedStatement != null) {             try {                 preparedStatement.close();             } catch (SQLException e) {                 e.printStackTrace();             }         }         if (connection != null) {             try {                 connection.close();             } catch (SQLException e) {                 e.printStackTrace();             }         }     } }

圖片

圖片

上述示例代碼中，我們通過 JDBC 連接 MySQL 數據庫，并執行批處理操作插入數據。具體實現步驟如下：

獲取數據庫連接。
創建 Statement 對象。
定義 SQL 語句，使用PreparedStatement對象預編譯 SQL 語句并設置參數。
執行批處理操作。
處理剩余的數據。
關閉 Statement 和 Connection 對象。

使用setAutoCommit(false)來禁止自動提交事務，然后在每次批量插入之后手動提交事務。每次插入數據時都新建一個PreparedStatement對象以避免狀態不一致問題。在插入數據的循環中，每 10000 條數據就執行一次executeBatch()插入數據。

另外，需要根據實際情況優化連接池和數據庫的相關配置，以防止連接超時等問題。

總結

實現高效的大量數據插入需要結合以下優化策略（建議綜合使用）：

1.批處理：批量提交SQL語句可以降低網絡傳輸和處理開銷，減少與數據庫交互的次數。在Java中可以使用Statement或者PreparedStatement的addBatch()方法來添加多個SQL語句，然后一次性執行executeBatch()方法提交批處理的SQL語句。

在循環插入時帶有適當的等待時間和批處理大小，從而避免內存占用過高等問題：

- 設置適當的批處理大小：批處理大小指在一次插入操作中插入多少行數據。如果批處理大小太小，插入操作的頻率將很高，而如果批處理大小太大，可能會導致內存占用過高。通常，建議將批處理大小設置為1000-5000行，這將減少插入操作的頻率并降低內存占用。
- 采用適當的等待時間：等待時間指在批處理操作之間等待的時間量。等待時間過短可能會導致內存占用過高，而等待時間過長則可能會延遲插入操作的速度。通常，建議將等待時間設置為幾秒鐘到幾十秒鐘之間，這將使操作變得平滑且避免出現內存占用過高等問題。
- 可以考慮使用一些內存優化的技巧，例如使用內存數據庫或使用游標方式插入數據，以減少內存占用。
總的來說，選擇適當的批處理大小和等待時間可以幫助您平穩地進行插入操作，避免出現內存占用過高等問題。

2.索引:在大量數據插入前暫時去掉索引，最后再打上，這樣可以大大減少寫入時候的更新索引的時間。

3.數據庫連接池：使用數據庫連接池可以減少數據庫連接建立和關閉的開銷，提高性能。在沒有使用數據庫連接池的情況，記得在finally中關閉相關連接。

數據庫參數調整：增加MySQL數據庫緩沖區大小、配置高性能的磁盤和I/O等。

來源：https://blog.csdn.net/qq_35427589/article/details/129665307

公眾號“Java精選”所發表內容注明來源的，版權歸原出處所有（無法查證版權的或者未注明出處的均來自網絡，系轉載，轉載的目的在于傳遞更多信息，版權屬于原作者。如有侵權，請聯系，筆者會第一時間刪除處理！

最近有很多人問，有沒有讀者交流群！加入方式很簡單，公眾號Java精選，回復“加群”，即可入群！

文章有幫助的話，點在看，轉發吧！

特別聲明：以上內容(如有圖片或視頻亦包括在內)為自媒體平臺“網易號”用戶上傳并發布，本平臺僅提供信息存儲服務。

Notice: The content above (including the pictures and videos if any) is uploaded and posted by a user of NetEase Hao, which is a social media platform and only provides information storage services.