当前位置: 首页 > news >正文

微服务day08

Elasticsearch

 

需要安装elasticsearch和Kibana,应为Kibana中有一套控制台可以方便的进行操作。

安装elasticsearch

使用docker命令安装:

docker run -d \ --name es \-e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \  //设置他的运行内存空间,不要低于512否则出问题-e "discovery.type=single-node" \ //设置安装模式,当前为单点模式,而非集群模式。-v es-data:/usr/share/elasticsearch/data \-v es-plugins:/usr/share/elasticsearch/plugins \--privileged \--network hm-net \-p 9200:9200 \-p 9300:9300 \  //集群通信的端口。elasticsearch:7.12.1

安装Kibana

docker run -d \
--name kibana \
-e ELASTICSEARCH_HOSTS=http://es:9200 \
--network=hm-net \
-p 5601:5601  \
kibana:7.12.1

通过kibana向es发送Http请求。

倒排索引

小结:

IK分词器

安装

在线安装:

docker exec -it es ./bin/elasticsearch-plugin  install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.12.1/elasticsearch-analysis-ik-7.12.1.zip

重启:

docker restart es

线下安装:

首先,查看之前安装的Elasticsearch容器的plugins数据卷目录:

docker volume inspect es-plugins

结果如下:

[{"CreatedAt": "2024-11-06T10:06:34+08:00","Driver": "local","Labels": null,"Mountpoint": "/var/lib/docker/volumes/es-plugins/_data","Name": "es-plugins","Options": null,"Scope": "local"}
]

可以看到elasticsearch的插件挂载到了/var/lib/docker/volumes/es-plugins/_data这个目录。我们需要把IK分词器上传至这个目录。

找到课前资料提供的ik分词器插件,课前资料提供了7.12.1版本的ik分词器压缩文件,你需要对其解压:

然后上传至虚拟机的/var/lib/docker/volumes/es-plugins/_data这个目录

重启es容器:

docker restart es

IK分词器包含两种模式:

  • ik_smart:智能语义切分

  • ik_max_word:最细粒度切分

POST /_analyze
{"analyzer": "ik_smart","text": "黑马程序员学习java太棒了"
}

结果:

{"tokens" : [{"token" : "黑马","start_offset" : 0,"end_offset" : 2,"type" : "CN_WORD","position" : 0},{"token" : "程序员","start_offset" : 2,"end_offset" : 5,"type" : "CN_WORD","position" : 1},{"token" : "学习","start_offset" : 5,"end_offset" : 7,"type" : "CN_WORD","position" : 2},{"token" : "java","start_offset" : 7,"end_offset" : 11,"type" : "ENGLISH","position" : 3},{"token" : "太棒了","start_offset" : 11,"end_offset" : 14,"type" : "CN_WORD","position" : 4}]
}

由于互联网词汇不断增多,需要拓展词汇库。

所以要想正确分词,IK分词器的词库也需要不断的更新,IK分词器提供了扩展词汇的功能。

1)打开IK分词器config目录:

注意,如果采用在线安装的通过,默认是没有config目录的,需要把课前资料提供的ik下的config上传至对应目录。

在IKAnalyzer.cfg.xml配置文件内容添加:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties><comment>IK Analyzer 扩展配置</comment><!--用户可以在这里配置自己的扩展字典 *** 添加扩展词典--><entry key="ext_dict">ext.dic</entry>
</properties>

在IK分词器的config目录新建一个 ext.dic,可以参考config目录下复制一个配置文件进行修改:

传智播客
泰裤辣

重启elasticsearch

docker restart es

再次测试,可以发现传智播客泰裤辣都正确分词了:

{"tokens" : [{"token" : "传智播客","start_offset" : 0,"end_offset" : 4,"type" : "CN_WORD","position" : 0},{"token" : "开设","start_offset" : 4,"end_offset" : 6,"type" : "CN_WORD","position" : 1},{"token" : "大学","start_offset" : 6,"end_offset" : 8,"type" : "CN_WORD","position" : 2},{"token" : "真的","start_offset" : 9,"end_offset" : 11,"type" : "CN_WORD","position" : 3},{"token" : "泰裤辣","start_offset" : 11,"end_offset" : 14,"type" : "CN_WORD","position" : 4}]
}

总结

分词器的作用是什么?

  • 创建倒排索引时,对文档分词

  • 用户搜索时,对输入的内容分词

IK分词器有几种模式?

  • ik_smart:智能切分,粗粒度

  • ik_max_word:最细切分,细粒度

IK分词器如何拓展词条?如何停用词条?

  • 利用config目录的IkAnalyzer.cfg.xml文件添加拓展词典和停用词典

  • 在词典中添加拓展词条或者停用词条

基础概念

索引库操作

Mapping映射属性

索引库操作

Restful规范

索引库操作

创建索引库:

基本语法

  • 请求方式:PUT

  • 请求路径:/索引库名,可以自定义

  • 请求参数:mapping映射

格式:

PUT /索引库名称
{"mappings": {"properties": {"字段名":{"type": "text","analyzer": "ik_smart"},"字段名2":{"type": "keyword","index": "false"},"字段名3":{"properties": {"子字段": {"type": "keyword"}}},// ...略}}
}

 实例代码

PUT /hmall
{"mappings": {"properties": {"info":{"type": "text","analyzer": "ik_smart"},"email":{"type": "keyword","index": false},"name":{"properties": {"firstName":{"type":"keyword"},"lastName":{"type":"keyword"}}}}}
}

返回结果:

{"acknowledged" : true,"shards_acknowledged" : true,"index" : "hmall"
}
查询索引库

基本语法

  • 请求方式:GET

  • 请求路径:/索引库名

  • 请求参数:无

格式

GET /索引库名

实例代码

GET /hmall

查询结果:

{"hmall" : {"aliases" : { },"mappings" : {"properties" : {"email" : {"type" : "keyword","index" : false},"info" : {"type" : "text","analyzer" : "ik_smart"},"name" : {"properties" : {"firstName" : {"type" : "keyword"},"lastName" : {"type" : "keyword"}}}}},"settings" : {"index" : {"routing" : {"allocation" : {"include" : {"_tier_preference" : "data_content"}}},"number_of_shards" : "1","provided_name" : "hmall","creation_date" : "1731489076840","number_of_replicas" : "1","uuid" : "KJqNTGY9RhiqFHkliRdGcw","version" : {"created" : "7120199"}}}}
}
修改索引库

倒排索引结构虽然不复杂,但是一旦数据结构改变(比如改变了分词器),就需要重新创建倒排索引,这简直是灾难。因此索引库一旦创建,无法修改mapping

虽然无法修改mapping中已有的字段,但是却允许添加新的字段到mapping中,因为不会对倒排索引产生影响。因此修改索引库能做的就是向索引库中添加新字段,或者更新索引库的基础属性。

语法说明:

PUT /索引库名/_mapping
{"properties": {"新字段名":{"type": "integer"}}
}

添加字段:

PUT /hmall/_mapping
{"properties": {"age":{"type": "integer"}}
}

返回结果:

{"acknowledged" : true
}
删除索引库

语法:

  • 请求方式:DELETE

  • 请求路径:/索引库名

  • 请求参数:无

语法格式

DELETE /索引库名

删除库

DELETE /hmall

执行结果

{"acknowledged" : true
}
索引库操作总结:

索引库操作有哪些?

  • 创建索引库:PUT /索引库名

  • 查询索引库:GET /索引库名

  • 删除索引库:DELETE /索引库名

  • 修改索引库,添加字段:PUT /索引库名/_mapping

可以看到,对索引库的操作基本遵循的Restful的风格,因此API接口非常统一,方便记忆。

文档操作

新增文档

语法规范

POST /索引库名/_doc/文档id
{"字段1": "值1","字段2": "值2","字段3": {"子属性1": "值3","子属性2": "值4"},
}

实例代码

POST /hmall/_doc/1
{"info": "黑马程序员Java讲师","email": "zy@itcast.cn","name": {"firstName": "云","lastName": "赵"}
}

运行结果

{"_index" : "hmall","_type" : "_doc","_id" : "1","_version" : 8,"result" : "updated","_shards" : {"total" : 2,"successful" : 1,"failed" : 0},"_seq_no" : 8,"_primary_term" : 1
}
查询文档

语法规范

GET /{索引库名称}/_doc/{id}

实例代码

GET /hmall/_doc/1

运行结果

{"_index" : "hmall","_type" : "_doc","_id" : "1","_version" : 8,"_seq_no" : 8,"_primary_term" : 1,"found" : true,"_source" : {"info" : "黑马程序员Java讲师","email" : "zy@itcast.cn","name" : {"firstName" : "云","lastName" : "赵"}}
}
删除文档

语法规范

DELETE /{索引库名}/_doc/id值

实例代码

DELETE /hmall/_doc/1

运行结果

{"_index" : "hmall","_type" : "_doc","_id" : "1","_version" : 9,"result" : "deleted","_shards" : {"total" : 2,"successful" : 1,"failed" : 0},"_seq_no" : 9,"_primary_term" : 1
}
修改文档
全量修改

语法规范

PUT /{索引库名}/_doc/文档id
{"字段1": "值1","字段2": "值2",// ... 略
}

实例代码

#全量修改
PUT /hmall/_doc/1
{"info": "黑马程序员Python讲师","email": "ls@itcast.cn","name": {"firstName": "四","lastName": "李"}
}

运行结果

第一次运行,由于前面已经删除了该文档所以 执行类型为 created.

{"_index" : "hmall","_type" : "_doc","_id" : "1","_version" : 10,"result" : "created","_shards" : {"total" : 2,"successful" : 1,"failed" : 0},"_seq_no" : 10,"_primary_term" : 1
}

 第二次运行,为修改返回的类型为 updated。

{"_index" : "hmall","_type" : "_doc","_id" : "1","_version" : 11,"result" : "updated","_shards" : {"total" : 2,"successful" : 1,"failed" : 0},"_seq_no" : 11,"_primary_term" : 1
}
部分修改

语法规范

POST /{索引库名}/_update/文档id
{"doc": {"字段名": "新的值",}
}

实例代码

#局部修改
POST /hmall/_update/1
{"doc": {"email":"ZhaoYuen@itcast.com"}
}

运行结果

{"_index" : "hmall","_type" : "_doc","_id" : "1","_version" : 12,"result" : "updated","_shards" : {"total" : 2,"successful" : 1,"failed" : 0},"_seq_no" : 12,"_primary_term" : 1
}

当前文档的数据:

{"_index" : "hmall","_type" : "_doc","_id" : "1","_version" : 12,"_seq_no" : 12,"_primary_term" : 1,"found" : true,"_source" : {"info" : "黑马程序员Python讲师","email" : "ZhaoYuen@itcast.com","name" : {"firstName" : "四","lastName" : "李"}}
}
文档基础操作小结

文档批量处理

  • index代表新增操作

    • _index:指定索引库名

    • _id指定要操作的文档id

    • { "field1" : "value1" }:则是要新增的文档内容

  • delete代表删除操作

    • _index:指定索引库名

    • _id指定要操作的文档id

  • update代表更新操作

    • _index:指定索引库名

    • _id指定要操作的文档id

    • { "doc" : {"field2" : "value2"} }:要更新的文档字段

#批量新增
POST /_bulk
{"index": {"_index":"hmall", "_id": "3"}}
{"info": "黑马程序员C++讲师", "email": "ww@itcast.cn", "name":{"firstName": "五", "lastName":"王"}}
{"index": {"_index":"hmall", "_id": "4"}}
{"info": "黑马程序员前端讲师", "email": "zhangsan@itcast.cn", "name":{"firstName": "三", "lastName":"张"}}POST /_bulk
{"delete":{"_index":"hmall", "_id": "3"}}
{"delete":{"_index":"hmall", "_id": "4"}}

JavaRestClient

作用:向restful接口发送请求。

客户端初始化

1)在item-service模块中引入esRestHighLevelClient依赖:

<dependency><groupId>org.elasticsearch.client</groupId><artifactId>elasticsearch-rest-high-level-client</artifactId>
</dependency>

2)因为SpringBoot默认的ES版本是7.17.10,所以我们需要覆盖默认的ES版本:

  <properties><maven.compiler.source>11</maven.compiler.source><maven.compiler.target>11</maven.compiler.target><elasticsearch.version>7.12.1</elasticsearch.version></properties>

3)初始化RestHighLevelClient:

初始化的代码如下:

RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(HttpHost.create("http://192.168.150.101:9200")
));

4)创建一个测试用例来实现

package com.hmall.item.es;import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;import java.io.IOException;public class Test {private RestHighLevelClient client;//创建开始初始化函数,对客户端进行初始化@BeforeEachvoid setUp() {this.client = new RestHighLevelClient(RestClient.builder(HttpHost.create("http://192.168.21.101:9200")));}//测试代码@org.junit.jupiter.api.Testpublic void test(){System.out.println(client);}//测试用例结束后释放资源@AfterEachvoid tearDown() throws IOException {client.close();}
}

商品Mapping映射

分析那些字段需要添加到文档中。

根据分析编写的需求:

PUT /hmall
{"mappings": {"properties": {"id":{"type": "keyword"},"name":{"type": "text","analyzer": "ik_smart"},"price":{"type": "integer"},"image":{"type": "keyword","index": false},"category":{"type": "keyword"},"brand":{"type": "keyword"},"sold":{"type": "integer"},"commentCount":{"type": "integer","index": false},"isAD":{"type": "boolean"},"updateTime":{"type": "date"}}}
}

索引库操作

创建索引库:

    //创建索引库@org.junit.jupiter.api.Testpublic void testCreateIndex() throws IOException {//创建Requst对象CreateIndexRequest request = new CreateIndexRequest("items");//准备参数 ,将编写的JSON字符串赋值给MAPPINF_JSON,XContentType.JSON用于指定是JSON格式request.source(MAPPINF_JSON, XContentType.JSON);//发送请求,RequestOptions.DEFAULT为是否有自定义配置,选择默认client.indices().create(request, RequestOptions.DEFAULT);}private final String MAPPINF_JSON = "{\n" +"  \"mappings\": {\n" +"    \"properties\": {\n" +"      \"id\":{\n" +"        \"type\": \"keyword\"\n" +"      },\n" +"      \"name\":{\n" +"        \"type\": \"text\",\n" +"        \"analyzer\": \"ik_smart\"\n" +"      },\n" +"      \"price\":{\n" +"        \"type\": \"integer\"\n" +"      },\n" +"      \"image\":{\n" +"        \"type\": \"keyword\",\n" +"        \"index\": false\n" +"      },\n" +"      \"category\":{\n" +"        \"type\": \"keyword\"\n" +"      },\n" +"      \"brand\":{\n" +"        \"type\": \"keyword\"\n" +"      },\n" +"      \"sold\":{\n" +"        \"type\": \"integer\"\n" +"      },\n" +"      \"commentCount\":{\n" +"        \"type\": \"integer\",\n" +"        \"index\": false\n" +"      },\n" +"      \"isAD\":{\n" +"        \"type\": \"boolean\"\n" +"      },\n" +"      \"updateTime\":{\n" +"        \"type\": \"date\"\n" +"      }\n" +"    }\n" +"  }\n" +"}";

删除索引库

    //删除索引库@org.junit.jupiter.api.Testpublic void testDeleteIndex() throws IOException {//创建Requst对象DeleteIndexRequest request = new DeleteIndexRequest("items");//发送请求client.indices().delete(request,RequestOptions.DEFAULT);}

查看索引库

    //查找索引库@org.junit.jupiter.api.Testpublic void testGetIndex() throws IOException {//创建Requst对象GetIndexRequest request = new GetIndexRequest("items");//发送2请求
//        client.indices().get(request, RequestOptions.DEFAULT);//由于GEt请求返回的数据较多,如果只需要判断是否存在,则使用exists方法boolean exists = client.indices().exists(request, RequestOptions.DEFAULT);System.out.println("exists = " + exists);}

文档操作

新增文档

由于索引库结构与数据库结构还存在一些差异,因此我们要定义一个索引库结构对应的实体。

package com.hmall.item.domain.po;import io.swagger.annotations.ApiModel;
import io.swagger.annotations.ApiModelProperty;
import lombok.Data;import java.time.LocalDateTime;@Data
@ApiModel(description = "索引库实体")
public class ItemDoc{@ApiModelProperty("商品id")private String id;@ApiModelProperty("商品名称")private String name;@ApiModelProperty("价格(分)")private Integer price;@ApiModelProperty("商品图片")private String image;@ApiModelProperty("类目名称")private String category;@ApiModelProperty("品牌名称")private String brand;@ApiModelProperty("销量")private Integer sold;@ApiModelProperty("评论数")private Integer commentCount;@ApiModelProperty("是否是推广广告,true/false")private Boolean isAD;@ApiModelProperty("更新时间")private LocalDateTime updateTime;
}

API语法

POST /{索引库名}/_doc/1
{"name": "Jack","age": 21
}

可以看到与索引库操作的API非常类似,同样是三步走:

  • 1)创建Request对象,这里是IndexRequest,因为添加文档就是创建倒排索引的过程

  • 2)准备请求参数,本例中就是Json文档

  • 3)发送请求

变化的地方在于,这里直接使用client.xxx()的API,不再需要client.indices()了。

    //新增文档@org.junit.jupiter.api.Testpublic void test() throws IOException {//读取数据库获取商品信息Item item = service.getById(317578);//将商品信息对象转换为DtoItemDoc itemDoc = BeanUtil.copyProperties(item, ItemDoc.class);//将对象转换为JSONString jsonStr = JSONUtil.toJsonStr(itemDoc);//创建Requst对象IndexRequest request = new IndexRequest("items").id(itemDoc.getId());request.source(jsonStr, XContentType.JSON);//发送信息client.index(request, RequestOptions.DEFAULT);}

package com.hmall.item.es;import cn.hutool.core.bean.BeanUtil;
import cn.hutool.json.JSONUtil;
import com.hmall.item.domain.dto.ItemDoc;
import com.hmall.item.domain.po.Item;
import com.hmall.item.service.impl.ItemServiceImpl;
import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.common.xcontent.XContentType;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;import java.io.IOException;
@SpringBootTest(properties = {"spring.profiles.active=local"})
public class docTest {private RestHighLevelClient client;@Autowiredprivate ItemServiceImpl service;//创建开始初始化函数,对客户端进行初始化@BeforeEachvoid setUp() {this.client = new RestHighLevelClient(RestClient.builder(HttpHost.create("http://192.168.21.101:9200")));}//新增文档,全局修改文档@org.junit.jupiter.api.Testpublic void test() throws IOException {//读取数据库获取商品信息Item item = service.getById(317578);//将商品信息对象转换为DtoItemDoc itemDoc = BeanUtil.copyProperties(item, ItemDoc.class);//将对象转换为JSONString jsonStr = JSONUtil.toJsonStr(itemDoc);//创建Requst对象IndexRequest request = new IndexRequest("items").id(itemDoc.getId());request.source(jsonStr, XContentType.JSON);//发送信息client.index(request, RequestOptions.DEFAULT);}//查看文档@org.junit.jupiter.api.Testpublic void testGet() throws IOException {//创建Requst对象GetRequest request = new GetRequest("items","317578");//发送信息GetResponse response = client.get(request, RequestOptions.DEFAULT);String json = response.getSourceAsString();ItemDoc bean = JSONUtil.toBean(json, ItemDoc.class);System.out.println("bean = " + bean);}//删除文档@org.junit.jupiter.api.Testpublic void testdelect() throws IOException {//创建Requst对象DeleteRequest request = new DeleteRequest("items","317578");//发送信息client.delete(request, RequestOptions.DEFAULT);}//修改文档@org.junit.jupiter.api.Testpublic void testupdata2() throws IOException {//创建Requst对象UpdateRequest request = new UpdateRequest("items","317578");//数据都是独立的不是成对出现的request.doc("price", 10000);//发送信息client.update(request, RequestOptions.DEFAULT);}//测试用例结束后释放资源@AfterEachvoid tearDown() throws IOException {client.close();}}

批量操作文档

    //新增文档,全局修改文档@org.junit.jupiter.api.Testpublic void testBulk() throws IOException {//设置分页信息int size = 1000,pageNo = 1;//循环写入数据while (true){//读取数据库获取商品信息分页查询//设置当前查询条件,即状态为起售//创建分页对象Page<Item> page = service.lambdaQuery().eq(Item::getStatus, 1).page(new Page<Item>(pageNo, size));List<Item> records = page.getRecords();//判空if (records.isEmpty()||records == null){return;}//创建Requst对象
//            BulkRequest request = new BulkRequest();
//            for (Item item : records) {
//                request.add(new IndexRequest("items")
//                        .id(item.getId().toString())
//                        .source(JSONUtil.toJsonStr(BeanUtil.copyProperties(item, ItemDoc.class))),XContentType.JSON
//                );
//            }BulkRequest request = new BulkRequest("items");for (Item item : records) {// 2.1.转换为文档类型ItemDTOItemDoc itemDoc = BeanUtil.copyProperties(item, ItemDoc.class);// 2.2.创建新增文档的Request对象request.add(new IndexRequest().id(itemDoc.getId()).source(JSONUtil.toJsonStr(itemDoc), XContentType.JSON));}//发送信息client.bulk(request, RequestOptions.DEFAULT);pageNo++;}}


http://www.mrgr.cn/news/75223.html

相关文章:

  • MC1.12.2 macOS高清修复OptiFine运行崩溃
  • 【前端】自学基础算法 -- 24.动态规划-变态青蛙蛙跳台阶
  • C++并发编程之基于细粒度锁的线程安全查找表
  • Debye-Einstein-模型拟合比热容Python脚本
  • 【Rust练习】27.Module
  • 《Java核心技术II》并行流
  • AUTOSAR_EXP_ARAComAPI的7章笔记(3)
  • 17-鸿蒙开发中的背景图片设置:位置、定位、单位和尺寸
  • Linux软件包管理与Vim编辑器使用指南
  • 绝对路径和相对路径的区别
  • 搜维尔科技:我们使用Xsens动作捕捉技术创建的短片
  • 行驶证 OCR 识别 API 接口的优势分析
  • Python中,处理日期和时间的库
  • GCN基于图卷积神经网络多特征分类预测(多输入单输出) Matlab代码
  • springboot039基于Web足球青训俱乐部管理系统
  • 似然函数解析
  • LeetCode 每日一题 统计满足 K 约束的子字符串数量 I
  • AI视觉小车基础--2.按键读取
  • 【MYSQL】数据库日志 (了解即可)
  • Linux 驱动
  • 机器学习(1)
  • [DB]
  • [ICML 2024]Learning High-Order Relationships of Brain Regions
  • 超全面!一文带你快速入门HTML,CSS和JavaScript!
  • pip install volcengine-python-sdk报错
  • 【027B】基于51单片机模拟电梯(点阵)【Proteus仿真+Keil程序+报告+原理图】