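Elasticsearch Overview

What Is Elasticsearch

Elasticsearch is a distributed search and analytics engine built on Lucene. It exposes a RESTful API and can store, search, and analyze large volumes of data quickly.

  • Core features
    • Full-text search: supports complex full-text retrieval
    • Structured search: exact matching, range queries, and so on
    • Data analysis: aggregations and statistical analysis
    • Near real-time: documents become searchable shortly after they are indexed
  • Main advantages
    • Distributed architecture: scales horizontally by design
    • High performance: fast search built on Lucene
    • Ease of use: a concise RESTful API
    • Flexibility: supports many data types and query styles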

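Core Concepts of Elasticsearch

Basic Concepts

  • Index
    • Roughly comparable to a database in a relational system
    • A collection of documents
  • Document
    • Comparable to a row in a relational database
    • The smallest unit of data, stored as JSON
  • Type
    • Comparable to a table in a relational database; deprecated in 7.x
    • In 5.x and earlier an index could hold several types; indices created in 6.x or later allow only one
  • Field
    • Comparable to a column in a relational database
    • An attribute of a document
  • Mapping
    • Defines the type and properties of each document field
    • Comparable to a table schema in a database
  • Shard
    • The unit by which an index is split horizontally
    • Enables horizontal scaling and better performance
  • Replica
    • A copy of a shard
    • Provides high availability and failure recovery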
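
To make the shard and replica arithmetic concrete, here is a minimal sketch. `ShardMath` is a hypothetical helper, not part of any Elasticsearch API; it only counts how many shard copies an index produces for given `number_of_shards` and `number_of_replicas` settings.

```java
// Hypothetical helper for illustration; not an Elasticsearch class.
final class ShardMath {

    /** Total shard copies = primaries * (1 primary + replicas per primary). */
    static int totalShardCopies(int primaries, int replicasPerPrimary) {
        if (primaries < 1 || replicasPerPrimary < 0) {
            throw new IllegalArgumentException("primaries >= 1 and replicas >= 0 required");
        }
        return primaries * (1 + replicasPerPrimary);
    }

    public static void main(String[] args) {
        // number_of_shards = 3, number_of_replicas = 1
        System.out.println(totalShardCopies(3, 1));
    }
}
```

A replica is never allocated on the same node as its primary, so on a single-node cluster the replica copies stay unassigned.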

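Cluster Architecture

  • Node
    • A single Elasticsearch instance
    • Stores data and takes part in the cluster's indexing and search
  • Cluster
    • A set of one or more nodes
    • Together the nodes hold all the data and provide indexing and search across it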

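How Elasticsearch Works

Document Write Flow

1. The client sends a request to any node
2. That node forwards the request to the node that holds the primary shard
3. The primary shard performs the operation and replicates it to the replica shards
4. The result is returned to the client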
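
Step 2 depends on knowing which primary shard owns a document. The sketch below shows the simplified routing rule, shard = hash(routing) mod number of primary shards. `ShardRouting` is a hypothetical class, and `String.hashCode` stands in for the Murmur3 hash that Elasticsearch actually uses, so the shard numbers will not match a real cluster.

```java
// Hypothetical illustration of document routing; not Elasticsearch code.
final class ShardRouting {

    /**
     * Picks a primary shard for a routing value (by default the document ID).
     * Assumption: String.hashCode replaces the real Murmur3 hash.
     */
    static int shardFor(String routing, int numPrimaryShards) {
        if (numPrimaryShards < 1) {
            throw new IllegalArgumentException("need at least one primary shard");
        }
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(routing.hashCode(), numPrimaryShards);
    }

    public static void main(String[] args) {
        for (String id : new String[] {"1", "2", "3", "4"}) {
            System.out.println("doc " + id + " -> shard " + shardFor(id, 3));
        }
    }
}
```

Because the result depends on the number of primary shards, that number cannot be changed after an index is created without reindexing.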

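Search Flow

1. The client sends a search request
2. The coordinating node broadcasts the request to every relevant shard
3. Each shard runs the search and returns its results
4. The coordinating node merges and sorts the results
5. The final result is returned to the client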
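
Steps 3 and 4 can be sketched as a scatter-gather merge. This is a simplified in-memory illustration, assuming each shard has already returned its own hits; `ScatterGather` and `Hit` are hypothetical names, not Elasticsearch classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the coordinating node's merge step.
final class ScatterGather {

    static final class Hit {
        final String id;
        final double score;

        Hit(String id, double score) {
            this.id = id;
            this.score = score;
        }
    }

    /** Merges the per-shard hit lists and keeps the k highest-scoring hits. */
    static List<Hit> mergeTopK(List<List<Hit>> perShard, int k) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> shardHits : perShard) {
            all.addAll(shardHits);
        }
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }

    public static void main(String[] args) {
        List<List<Hit>> perShard = Arrays.asList(
                Arrays.asList(new Hit("a", 0.9), new Hit("b", 0.2)),
                Arrays.asList(new Hit("c", 0.7)));
        for (Hit h : mergeTopK(perShard, 2)) {
            System.out.println(h.id + " " + h.score);
        }
    }
}
```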

Environment Setup

Installing Elasticsearch

Docker Installation (Recommended)

# Pull the image
docker pull elasticsearch:7.17.0

# Run the container
docker run -d \
  --name elasticsearch \
  -p 9200:9200 \
  -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  elasticsearch:7.17.0

# Verify the installation
curl http://localhost:9200

Linux Installation

# Download Elasticsearch
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.0-linux-x86_64.tar.gz

# Extract
tar -xzf elasticsearch-7.17.0-linux-x86_64.tar.gz
cd elasticsearch-7.17.0

# Start in the foreground
./bin/elasticsearch

# Start as a daemon
./bin/elasticsearch -d

Windows Installation

Download: https://www.elastic.co/cn/downloads/elasticsearch

Steps

  1. Download the archive and extract it
  2. Change into the bin directory
  3. Run elasticsearch.bat

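macOS Installation

# Install with Homebrew from the Elastic tap (installs the tap's current 7.x build)
brew tap elastic/tap
brew install elastic/tap/elasticsearch-full

# Start
brew services start elastic/tap/elasticsearch-full

# Stop
brew services stop elastic/tap/elasticsearch-full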

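Installing Kibana

Kibana is the visualization tool for Elasticsearch.

# Install with Docker
# --link lets the Kibana container resolve the "elasticsearch" container by name
docker run -d \
  --name kibana \
  --link elasticsearch:elasticsearch \
  -p 5601:5601 \
  -e "ELASTICSEARCH_HOSTS=http://elasticsearch:9200" \
  kibana:7.17.0

# Open Kibana
# http://localhost:5601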

Basic Configuration

# config/elasticsearch.yml
cluster.name: my-cluster
node.name: node-1
path.data: /path/to/data
path.logs: /path/to/logs
network.host: 0.0.0.0
http.port: 9200
discovery.type: single-node

Verifying the Installation

# Check cluster health
curl "http://localhost:9200/_cluster/health?pretty"

# Show node information
curl "http://localhost:9200/_nodes?pretty"

# List indices
curl "http://localhost:9200/_cat/indices?v"

Basic Operations

Index Operations

Create an Index

# Create a simple index
curl -X PUT "http://localhost:9200/user" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}'

# Create an index with a mapping
# (delete the index above first: creating an index that already exists fails;
#  the ik_max_word analyzer requires the IK plugin)
curl -X PUT "http://localhost:9200/user" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "age": {
        "type": "integer"
      },
      "email": {
        "type": "keyword"
      },
      "create_time": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    }
  }
}'

View an Index

# Show index information
curl "http://localhost:9200/user?pretty"

# List all indices
curl "http://localhost:9200/_cat/indices?v"

# Show the index mapping
curl "http://localhost:9200/user/_mapping?pretty"

Delete an Index

# Delete one index
curl -X DELETE "http://localhost:9200/user"

# Delete several indices
curl -X DELETE "http://localhost:9200/user,product"

# Delete all indices (use with caution)
curl -X DELETE "http://localhost:9200/_all"

Close/Open an Index

# Close the index
curl -X POST "http://localhost:9200/user/_close"

# Open the index
curl -X POST "http://localhost:9200/user/_open"

Document Operations

Add Documents

# Add a document with an explicit ID
curl -X PUT "http://localhost:9200/user/_doc/1" -H 'Content-Type: application/json' -d'
{
  "name": "张三",
  "age": 25,
  "email": "zhangsan@example.com",
  "create_time": "2024-01-01 10:00:00"
}'

# Let Elasticsearch generate the ID
curl -X POST "http://localhost:9200/user/_doc" -H 'Content-Type: application/json' -d'
{
  "name": "李四",
  "age": 30,
  "email": "lisi@example.com",
  "create_time": "2024-01-02 11:00:00"
}'

# Add documents in bulk (one action line followed by one source line per document)
curl -X POST "http://localhost:9200/user/_bulk" -H 'Content-Type: application/json' -d'
{"index":{"_id":"3"}}
{"name":"王五","age":28,"email":"wangwu@example.com"}
{"index":{"_id":"4"}}
{"name":"赵六","age":35,"email":"zhaoliu@example.com"}
'
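
The bulk body above is newline-delimited JSON: each document contributes an action line and a source line, and the body must end with a newline. A minimal sketch of building such a body in Java follows. `BulkBody` is a hypothetical helper; it assumes the IDs need no JSON escaping and that each source is already single-line JSON.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that assembles a _bulk request body; not a client API.
final class BulkBody {

    /** Builds an NDJSON body with one "index" action per entry (ID -> source JSON). */
    static String build(Map<String, String> sourceById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : sourceById.entrySet()) {
            // Action line: which operation to run and on which document ID
            body.append("{\"index\":{\"_id\":\"").append(entry.getKey()).append("\"}}\n");
            // Source line: the document itself, on a single line
            body.append(entry.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("3", "{\"name\":\"王五\",\"age\":28}");
        docs.put("4", "{\"name\":\"赵六\",\"age\":35}");
        System.out.print(build(docs));
    }
}
```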

Query Documents

# Get a document by ID
curl "http://localhost:9200/user/_doc/1?pretty"

# Return all documents
curl "http://localhost:9200/user/_search?pretty"

# Search with a condition
curl -X GET "http://localhost:9200/user/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "name": "张三"
    }
  }
}'

Update Documents

# Full update (replaces the whole document; fields left out, such as create_time, are removed)
curl -X PUT "http://localhost:9200/user/_doc/1" -H 'Content-Type: application/json' -d'
{
  "name": "张三丰",
  "age": 100,
  "email": "zhangsanfeng@example.com"
}'

# Partial update
curl -X POST "http://localhost:9200/user/_update/1" -H 'Content-Type: application/json' -d'
{
  "doc": {
    "age": 101
  }
}'

# Update with a script
curl -X POST "http://localhost:9200/user/_update/1" -H 'Content-Type: application/json' -d'
{
  "script": {
    "source": "ctx._source.age += 1",
    "lang": "painless"
  }
}'

Delete Documents

# Delete by ID
curl -X DELETE "http://localhost:9200/user/_doc/1"

# Delete by query
curl -X POST "http://localhost:9200/user/_delete_by_query" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "name": "张三"
    }
  }
}'

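Data Types

Core Data Types

Type      Description                        Example
text      Full text, analyzed                "Hello World"
keyword   Keyword, not analyzed              "user@example.com"
integer   Integer                            42
long      Long integer                       9223372036854775807
float     Single-precision floating point    3.14
double    Double-precision floating point    3.1415926
boolean   Boolean                            true/false
date      Date                               "2024-01-01"
object    JSON object                        {"key": "value"}
nested    Array of nested objects            [{"key": "value"}]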

Special Data Types

// Geo-point (geographic location), mapping fragment
{
  "location": {
    "type": "geo_point"
  }
}
// Example document value: "location": {"lat": 40.7128, "lon": -74.0060}

// IP address, mapping fragment
{
  "ip_address": {
    "type": "ip"
  }
}
// Example document value: "ip_address": "192.168.1.1"

// Completion (autocomplete), mapping fragment
{
  "suggest": {
    "type": "completion"
  }
}

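Query DSL

Query Context vs Filter Context

  • Query Context
    • Computes a relevance score for each document
    • Used for full-text search
    • Comparatively slower
  • Filter Context
    • Only decides whether a document matches (yes/no)
    • Results can be cached
    • Faster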

Basic Queries

Match Query (Full-Text Query)

GET /user/_search
{
  "query": {
    "match": {
      "name": "张三"
    }
  }
}

Term Query (Exact Match)

GET /user/_search
{
  "query": {
    "term": {
      "email": {
        "value": "zhangsan@example.com"
      }
    }
  }
}

Range Query

GET /user/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 20,
        "lte": 30
      }
    }
  }
}

Wildcard Query

GET /user/_search
{
  "query": {
    "wildcard": {
      "name": {
        "value": "张*"
      }
    }
  }
}

Prefix Query

GET /user/_search
{
  "query": {
    "prefix": {
      "name": {
        "value": "张"
      }
    }
  }
}

Compound Queries

Bool Query

GET /user/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"name": "张三"}}
      ],
      "filter": [
        {"range": {"age": {"gte": 20, "lte": 30}}}
      ],
      "should": [
        {"term": {"email": "zhangsan@example.com"}}
      ],
      "must_not": [
        {"term": {"age": 25}}
      ]
    }
  }
}

Clause     Description                 Affects score
must       Must match                  Yes
filter     Must match, not scored      No
should     Should match                Yes
must_not   Must not match              No

Multi Match Query (Multiple Fields)

GET /user/_search
{
  "query": {
    "multi_match": {
      "query": "张三",
      "fields": ["name", "email"]
    }
  }
}

Advanced Queries

Fuzzy Query

GET /user/_search
{
  "query": {
    "fuzzy": {
      "name": {
        "value": "张山",
        "fuzziness": 1
      }
    }
  }
}

Regexp Query

GET /user/_search
{
  "query": {
    "regexp": {
      "email": ".*@example\\.com"
    }
  }
}

Exists Query

GET /user/_search
{
  "query": {
    "exists": {
      "field": "email"
    }
  }
}

Aggregations

Metric Aggregations

Avg (Average)

GET /user/_search
{
  "size": 0,
  "aggs": {
    "avg_age": {
      "avg": {
        "field": "age"
      }
    }
  }
}

Sum

GET /user/_search
{
  "size": 0,
  "aggs": {
    "total_age": {
      "sum": {
        "field": "age"
      }
    }
  }
}

Max/Min (Maximum/Minimum)

GET /user/_search
{
  "size": 0,
  "aggs": {
    "max_age": {
      "max": {
        "field": "age"
      }
    },
    "min_age": {
      "min": {
        "field": "age"
      }
    }
  }
}

Stats (Summary Statistics)

GET /user/_search
{
  "size": 0,
  "aggs": {
    "age_stats": {
      "stats": {
        "field": "age"
      }
    }
  }
}
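
The stats aggregation returns count, min, max, avg, and sum in a single response. As a rough in-memory analogue, the sketch below computes the same five values over the ages of the four sample users from the document examples. `AgeStats` is a hypothetical class and does not talk to Elasticsearch.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

// Hypothetical in-memory analogue of the "stats" aggregation.
final class AgeStats {

    /** Returns count, min, max, average, and sum for the given ages. */
    static IntSummaryStatistics stats(int... ages) {
        return IntStream.of(ages).summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics s = stats(25, 30, 28, 35);
        System.out.println("count=" + s.getCount()
                + " min=" + s.getMin()
                + " max=" + s.getMax()
                + " avg=" + s.getAverage()
                + " sum=" + s.getSum());
    }
}
```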

Bucket Aggregations

Terms (Grouping)

GET /user/_search
{
  "size": 0,
  "aggs": {
    "age_groups": {
      "terms": {
        "field": "age",
        "size": 10
      }
    }
  }
}

Range (Grouping by Range)

GET /user/_search
{
  "size": 0,
  "aggs": {
    "age_ranges": {
      "range": {
        "field": "age",
        "ranges": [
          {"to": 20},
          {"from": 20, "to": 30},
          {"from": 30}
        ]
      }
    }
  }
}

Date Histogram

GET /user/_search
{
  "size": 0,
  "aggs": {
    "users_over_time": {
      "date_histogram": {
        "field": "create_time",
        "calendar_interval": "month"
      }
    }
  }
}
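
To show what bucketing by calendar month means, here is a small in-memory sketch that groups `create_time` strings (in the `yyyy-MM-dd HH:mm:ss` format used by the mapping) into per-month counts. `MonthlyHistogram` is a hypothetical class. Unlike the real aggregation, which by default also returns empty buckets between the first and last month, this sketch only emits months that contain documents.

```java
import java.time.LocalDateTime;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory analogue of a monthly date_histogram.
final class MonthlyHistogram {

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Counts timestamps per calendar month, ordered by month. */
    static Map<YearMonth, Long> countByMonth(List<String> createTimes) {
        Map<YearMonth, Long> buckets = new TreeMap<>();
        for (String value : createTimes) {
            YearMonth month = YearMonth.from(LocalDateTime.parse(value, FORMAT));
            buckets.merge(month, 1L, Long::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> times = Arrays.asList(
                "2024-01-01 10:00:00", "2024-01-02 11:00:00", "2024-03-05 09:30:00");
        countByMonth(times).forEach((month, count) ->
                System.out.println(month + " -> " + count));
    }
}
```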

Pipeline Aggregations

Derivative

// Assumes the documents carry a numeric "amount" field
GET /user/_search
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "create_time",
        "calendar_interval": "month"
      },
      "aggs": {
        "total_sales": {
          "sum": {"field": "amount"}
        },
        "sales_derivative": {
          "derivative": {
            "buckets_path": "total_sales"
          }
        }
      }
    }
  }
}

Spring Boot Integration

Add the Dependency

<!-- Spring Data Elasticsearch -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

Configuration File

# application.yml
spring:
  elasticsearch:
    rest:
      uris: http://localhost:9200
      connection-timeout: 5s
      read-timeout: 60s

Create the Entity Class

package com.example.entity;

import lombok.Data;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

import java.time.LocalDateTime;

@Data
@Document(indexName = "user")
public class User {

    @Id
    private String id;

    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String name;

    @Field(type = FieldType.Integer)
    private Integer age;

    @Field(type = FieldType.Keyword)
    private String email;

    @Field(type = FieldType.Date, format = {}, pattern = "yyyy-MM-dd HH:mm:ss")
    private LocalDateTime createTime;
}

Create the Repository

package com.example.repository;

import com.example.entity.User;
import org.springframework.data.elasticsearch.annotations.Query;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
import org.springframework.stereotype.Repository;

import java.util.List;

@Repository
public interface UserRepository extends ElasticsearchRepository<User, String> {

    // Queries derived from method names
    List<User> findByName(String name);

    List<User> findByAgeBetween(Integer minAge, Integer maxAge);

    List<User> findByEmail(String email);

    // Custom query
    @Query("{\"match\": {\"name\": \"?0\"}}")
    List<User> searchByName(String name);
}

Create the Service

package com.example.service;

import com.example.entity.User;
import com.example.repository.UserRepository;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.SearchHit;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.stereotype.Service;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

@Service
public class UserService {

    @Autowired
    private UserRepository userRepository;

    @Autowired
    private ElasticsearchRestTemplate elasticsearchTemplate;

    // Save a document
    public User save(User user) {
        return userRepository.save(user);
    }

    // Save in bulk
    public Iterable<User> saveAll(List<User> users) {
        return userRepository.saveAll(users);
    }

    // Find by ID
    public User findById(String id) {
        return userRepository.findById(id).orElse(null);
    }

    // Delete a document
    public void delete(String id) {
        userRepository.deleteById(id);
    }

    // Paged query (pageNum is 1-based, PageRequest is 0-based)
    public Page<User> page(int pageNum, int pageSize) {
        return userRepository.findAll(PageRequest.of(pageNum - 1, pageSize));
    }

    // Search by keyword
    public List<User> search(String keyword) {
        NativeSearchQueryBuilder queryBuilder = new NativeSearchQueryBuilder();
        queryBuilder.withQuery(QueryBuilders.multiMatchQuery(keyword, "name", "email"));

        SearchHits<User> searchHits = elasticsearchTemplate.search(
                queryBuilder.build(), User.class);

        return searchHits.getSearchHits().stream()
                .map(SearchHit::getContent)
                .collect(Collectors.toList());
    }

    // Aggregation query
    public List<Map<String, Object>> aggregateByAge() {
        NativeSearchQueryBuilder queryBuilder = new NativeSearchQueryBuilder();
        queryBuilder.addAggregation(
                AggregationBuilders.terms("age_groups").field("age"));

        SearchHits<User> searchHits = elasticsearchTemplate.search(
                queryBuilder.build(), User.class);

        // Process the aggregation result
        // (getAggregations() returns the Elasticsearch Aggregations type in Spring Data Elasticsearch 4.0 to 4.2)
        Terms terms = searchHits.getAggregations().get("age_groups");
        List<Map<String, Object>> result = new ArrayList<>();
        for (Terms.Bucket bucket : terms.getBuckets()) {
            Map<String, Object> map = new HashMap<>();
            map.put("age", bucket.getKey());
            map.put("count", bucket.getDocCount());
            result.add(map);
        }

        return result;
    }
}

Create the Controller

package com.example.controller;

import com.example.entity.User;
import com.example.service.UserService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.web.bind.annotation.*;

import java.util.List;

@RestController
@RequestMapping("/users")
public class UserController {

    @Autowired
    private UserService userService;

    @PostMapping
    public User create(@RequestBody User user) {
        return userService.save(user);
    }

    @PostMapping("/batch")
    public Iterable<User> batchCreate(@RequestBody List<User> users) {
        return userService.saveAll(users);
    }

    @GetMapping("/{id}")
    public User get(@PathVariable String id) {
        return userService.findById(id);
    }

    @DeleteMapping("/{id}")
    public void delete(@PathVariable String id) {
        userService.delete(id);
    }

    @GetMapping("/page")
    public Page<User> page(
            @RequestParam(defaultValue = "1") int pageNum,
            @RequestParam(defaultValue = "10") int pageSize) {
        return userService.page(pageNum, pageSize);
    }

    @GetMapping("/search")
    public List<User> search(@RequestParam String keyword) {
        return userService.search(keyword);
    }
}

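Advanced Features

Analyzers

Built-in Analyzers

  • standard: the standard analyzer (default)
  • simple: the simple analyzer
  • whitespace: splits on whitespace
  • keyword: no tokenization
  • pattern: splits on a pattern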

IK Analyzer (Chinese)

# Install the IK analyzer plugin
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.17.0/elasticsearch-analysis-ik-7.17.0.zip

# Restart Elasticsearch

// Use the IK analyzer
PUT /user
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "ik_max_word"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      }
    }
  }
}

Pinyin Analyzer

# Install the pinyin analyzer plugin
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-pinyin/releases/download/v7.17.0/elasticsearch-analysis-pinyin-7.17.0.zip

PUT /user
{
  "settings": {
    "analysis": {
      "analyzer": {
        "pinyin_analyzer": {
          "tokenizer": "my_pinyin"
        }
      },
      "tokenizer": {
        "my_pinyin": {
          "type": "pinyin",
          "keep_first_letter": false,
          "keep_separate_first_letter": false,
          "keep_full_pinyin": true,
          "keep_original": true,
          "limit_first_letter_length": 16,
          "lowercase": true
        }
      }
    }
  }
}

Highlighting

GET /user/_search
{
  "query": {
    "match": {
      "name": "张三"
    }
  },
  "highlight": {
    "fields": {
      "name": {
        "pre_tags": ["<em>"],
        "post_tags": ["</em>"]
      }
    }
  }
}

Suggester

// Assumes the index maps "name_suggest" as a completion field
POST /user/_search
{
  "suggest": {
    "name_suggest": {
      "prefix": "张",
      "completion": {
        "field": "name_suggest",
        "size": 10
      }
    }
  }
}

Geo Search

// Create the index
PUT /location
{
  "mappings": {
    "properties": {
      "name": {"type": "text"},
      "location": {"type": "geo_point"}
    }
  }
}

// Add data
POST /location/_doc/1
{
  "name": "北京",
  "location": {
    "lat": 39.9042,
    "lon": 116.4074
  }
}

// Search nearby
GET /location/_search
{
  "query": {
    "geo_distance": {
      "distance": "100km",
      "location": {
        "lat": 40.0,
        "lon": 116.0
      }
    }
  }
}

Performance Tuning

Index Tuning

// 1. Choose a sensible number of shards
PUT /user
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}

// 2. Turn off indexing for fields that are never searched
PUT /user/_mapping
{
  "properties": {
    "description": {
      "type": "text",
      "index": false
    }
  }
}

// 3. Use keyword instead of text for exact matching
PUT /user/_mapping
{
  "properties": {
    "status": {
      "type": "keyword"
    }
  }
}

Query Tuning

// 1. Prefer filter over query for yes/no conditions (filter results can be cached)
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery()
    .filter(QueryBuilders.termQuery("status", "active"))
    .must(QueryBuilders.matchQuery("name", "张三"));

// 2. Limit the returned fields
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.fetchSource(new String[]{"name", "age"}, null);

// 3. Pagination: avoid deep paging
// Use search_after instead of from/size; lastHit is the last SearchHit of the previous page
sourceBuilder.sort("create_time", SortOrder.DESC);
sourceBuilder.sort("_id", SortOrder.ASC);
sourceBuilder.searchAfter(lastHit.getSortValues());

Bulk Operations

// Bulk indexing (client is a RestHighLevelClient; JSON is fastjson)
BulkRequest bulkRequest = new BulkRequest();
for (User user : users) {
    IndexRequest indexRequest = new IndexRequest("user")
        .id(user.getId())
        .source(JSON.toJSONString(user), XContentType.JSON);
    bulkRequest.add(indexRequest);
}

BulkResponse bulkResponse = client.bulk(bulkRequest, RequestOptions.DEFAULT);

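Cluster Tuning

# config/jvm.options (or the ES_JAVA_OPTS environment variable), not elasticsearch.yml
# Set the JVM heap to about 50% of physical memory, and keep it below 32 GB
-Xms16g
-Xmx16g

# config/elasticsearch.yml
# Lock process memory so the heap is not swapped out
bootstrap.memory_lock: true

# Tune the write thread pool queue
thread_pool.write.queue_size: 1000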
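
The heap rule of thumb above (half of physical memory, below 32 GB) can be written as a small calculation. `HeapSizing` is a hypothetical helper, and the 31 GB cap is an assumption chosen here to stay safely under the 32 GB threshold; it is not an official constant.

```java
// Hypothetical sizing helper; the 31 GB cap is an assumption, not an official value.
final class HeapSizing {

    static final int MAX_HEAP_GB = 31;

    /** Half of physical RAM, at least 1 GB, capped below 32 GB. */
    static int recommendedHeapGb(int physicalRamGb) {
        if (physicalRamGb < 1) {
            throw new IllegalArgumentException("physicalRamGb must be positive");
        }
        return Math.max(1, Math.min(physicalRamGb / 2, MAX_HEAP_GB));
    }

    public static void main(String[] args) {
        for (int ram : new int[] {8, 32, 128}) {
            int heap = recommendedHeapGb(ram);
            System.out.println(ram + " GB RAM -> -Xms" + heap + "g -Xmx" + heap + "g");
        }
    }
}
```

The memory left outside the heap is not wasted: Lucene relies on the operating system's file cache for index files.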

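Best Practices

Naming Conventions

Index names
- Lowercase letters only
- Use underscores or hyphens
- Avoid special characters
- Examples: user_info, order-2024

Field names
- Use camelCase
- Choose meaningful names
- Examples: createTime, userName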
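
The index naming rules can be checked before a create request is sent. The sketch below is a hypothetical validator, `IndexNames`, based on the restrictions Elasticsearch 7.x documents for index names (lowercase only, no special characters, no leading `-`, `_`, or `+`, at most 255 bytes); treat it as an approximation rather than the server's exact check.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical client-side check; the server remains the source of truth.
final class IndexNames {

    // Characters Elasticsearch rejects in index names (backslash, slash, *, ?, ", <, >, |, space, comma, #, colon)
    private static final String FORBIDDEN = "\\/*?\"<>| ,#:";

    static boolean isValid(String name) {
        if (name == null || name.isEmpty() || name.equals(".") || name.equals("..")) {
            return false;
        }
        if (!name.equals(name.toLowerCase(Locale.ROOT))) {
            return false; // uppercase letters are not allowed
        }
        char first = name.charAt(0);
        if (first == '-' || first == '_' || first == '+') {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            if (FORBIDDEN.indexOf(name.charAt(i)) >= 0) {
                return false;
            }
        }
        return name.getBytes(StandardCharsets.UTF_8).length <= 255;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"user_info", "order-2024", "User", "_hidden"}) {
            System.out.println(name + " -> " + isValid(name));
        }
    }
}
```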

Design Recommendations

// 1. Design the mapping deliberately
PUT /user
{
  "mappings": {
    "properties": {
      "name": {"type": "text", "analyzer": "ik_max_word"},
      "age": {"type": "integer"},
      "email": {"type": "keyword"},
      "create_time": {"type": "date"}
    }
  }
}

// 2. Use aliases
POST /_aliases
{
  "actions": [
    {"add": {"index": "user_v1", "alias": "user"}}
  ]
}

// 3. Clean up old indices regularly
DELETE /user-2023.*

Monitoring Recommendations

# 1. Monitor cluster health
curl http://localhost:9200/_cluster/health

# 2. Monitor node statistics
curl http://localhost:9200/_nodes/stats

# 3. Monitor index status
curl "http://localhost:9200/_cat/indices?v"

# 4. Use Kibana Monitoring
# Open http://localhost:5601/app/monitoring

Common Problems

Deep Pagination

Problem: requests fail when from + size exceeds 10000 (the default index.max_result_window)

Solutions
1. Use search_after
2. Use the scroll API (not suited to real-time search)
3. Limit the maximum page number

// Use search_after
GET /user/_search
{
  "size": 10,
  "sort": [
    {"create_time": "desc"},
    {"_id": "asc"}
  ],
  "search_after": [1640000000000, "abc123"]
}
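
The idea behind search_after is that each page starts strictly after the sort values of the previous page's last hit, so no offset has to be skipped. The sketch below simulates this over an in-memory list with the same sort order as the request above (create_time descending, then ID ascending as a tiebreaker). `SearchAfterDemo` and `Doc` are hypothetical names; nothing here calls Elasticsearch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory simulation of search_after paging.
final class SearchAfterDemo {

    static final class Doc {
        final long createTime;
        final String id;

        Doc(long createTime, String id) {
            this.createTime = createTime;
            this.id = id;
        }
    }

    // Same order as the request: create_time desc, then id asc as a unique tiebreaker
    static final Comparator<Doc> ORDER =
            Comparator.comparingLong((Doc d) -> d.createTime).reversed()
                    .thenComparing((Doc d) -> d.id);

    /** Returns up to size docs that sort strictly after the cursor (null = first page). */
    static List<Doc> page(List<Doc> index, Doc after, int size) {
        List<Doc> sorted = new ArrayList<>(index);
        sorted.sort(ORDER);
        List<Doc> result = new ArrayList<>();
        for (Doc doc : sorted) {
            if (after != null && ORDER.compare(doc, after) <= 0) {
                continue; // at or before the cursor: already returned
            }
            result.add(doc);
            if (result.size() == size) {
                break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Doc> index = Arrays.asList(
                new Doc(100, "e"), new Doc(300, "b"), new Doc(200, "c"),
                new Doc(300, "a"), new Doc(100, "d"));
        Doc cursor = null;
        List<Doc> hits;
        while (!(hits = page(index, cursor, 2)).isEmpty()) {
            for (Doc d : hits) {
                System.out.println(d.createTime + " " + d.id);
            }
            cursor = hits.get(hits.size() - 1); // sort values of the last hit
        }
    }
}
```

The tiebreaker matters: without a unique second sort key, documents sharing the same create_time could be skipped or repeated across pages.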

Out of Memory

Problem: OutOfMemoryError

Solutions
1. Increase the JVM heap
2. Optimize queries to return less data
3. Use filter instead of query
4. Limit the size of aggregation results

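Split Brain

Problem: the cluster ends up with more than one master

Solutions
1. Set discovery.zen.minimum_master_nodes (6.x and earlier; 7.x ignores this setting and manages the voting quorum itself)
2. Keep the network stable
3. Use an odd number of master-eligible nodes

# elasticsearch.yml (6.x and earlier)
discovery.zen.minimum_master_nodes: 2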
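
The value 2 follows from the majority rule for three master-eligible nodes: quorum = floor(n / 2) + 1. A minimal sketch of that calculation, with `MasterQuorum` as a hypothetical helper:

```java
// Hypothetical helper for the majority (quorum) rule; not an Elasticsearch class.
final class MasterQuorum {

    /** Smallest number of master-eligible nodes that forms a strict majority. */
    static int minimumMasterNodes(int masterEligibleNodes) {
        if (masterEligibleNodes < 1) {
            throw new IllegalArgumentException("need at least one master-eligible node");
        }
        return masterEligibleNodes / 2 + 1;
    }

    public static void main(String[] args) {
        for (int nodes = 1; nodes <= 5; nodes++) {
            System.out.println(nodes + " master-eligible -> quorum " + minimumMasterNodes(nodes));
        }
    }
}
```

With an even node count the quorum rises without tolerating any more failures (4 nodes need 3, just as 5 nodes do), which is one reason to prefer an odd number.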

Error Handling

💗💗 Elasticsearch error: Rejecting mapping update

Error message
Rejecting mapping update to [user] as the final mapping would have more than 1 type

Cause
Indices created in Elasticsearch 6.x or later can hold only one type, and 7.x deprecates types

Solutions
1. Keep a single type per index
2. Use _doc as the default type
3. Redesign the index structure

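💗💗 Elasticsearch error: CircuitBreakingException

Error message
CircuitBreakingException: [parent] Data too large

Cause
The request needs more memory than the circuit breaker allows

Solutions
1. Optimize queries to return less data
2. Raise the circuit breaker limit (for example indices.breaker.total.limit)
3. Use pagination or search_after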

💗💗 Elasticsearch error: ClusterBlockException

Error message
ClusterBlockException: blocked by: [FORBIDDEN/12/index read-only]

Cause
The disk is running out of space, so the index has become read-only

Solutions
1. Free up disk space
2. Remove the read-only block
curl -X PUT "localhost:9200/_all/_settings" -H 'Content-Type: application/json' -d'
{"index.blocks.read_only_allow_delete": null}'

Learning Resources

  • Videos
    • Heima complete Elasticsearch course: https://www.bilibili.com/video/BV1b8411Z7w5
  • Official documentation
    • Elasticsearch reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html
    • Elasticsearch on GitHub: https://github.com/elastic/elasticsearch
  • Books
    • Elasticsearch: The Definitive Guide, by Clinton Gormley
    • Elasticsearch in Action, by Radu Gheorghe
  • Tutorials
    • Elasticsearch beginner tutorial: https://www.runoob.com/elasticsearch/elasticsearch-tutorial.html
    • Elastic official training: https://www.elastic.co/training
  • Tools
    • Kibana: visualization platform
    • Head Plugin: cluster management plugin
    • Cerebro: cluster monitoring tool
  • Community
    • Elastic Chinese community: https://elasticsearch.cn/
    • Stack Overflow elasticsearch tag: https://stackoverflow.com/questions/tagged/elasticsearch