
L5 Check-in Study Notes

  • 🍨 This post is a learning-record blog for the 🔗 365-Day Deep Learning Training Camp
  • 🍖 Original author: K同学啊

Decision Tree Model

  • Importing the Data
  • Model Training
  • Model Prediction
  • Personal Summary

Importing the Data

import pandas as pd
import numpy as np

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
names = ['花萼-length', '花萼-width', '花瓣-length', '花瓣-width', 'class']
dataset = pd.read_csv(url, names=names)
dataset
     花萼-length  花萼-width  花瓣-length  花瓣-width           class
0           5.1        3.5         1.4        0.2     Iris-setosa
1           4.9        3.0         1.4        0.2     Iris-setosa
2           4.7        3.2         1.3        0.2     Iris-setosa
3           4.6        3.1         1.5        0.2     Iris-setosa
4           5.0        3.6         1.4        0.2     Iris-setosa
..          ...        ...         ...        ...             ...
145         6.7        3.0         5.2        2.3  Iris-virginica
146         6.3        2.5         5.0        1.9  Iris-virginica
147         6.5        3.0         5.2        2.0  Iris-virginica
148         6.2        3.4         5.4        2.3  Iris-virginica
149         5.9        3.0         5.1        1.8  Iris-virginica

150 rows × 5 columns

# Split the data into features and labels
X = dataset.iloc[:, [0, 1, 2, 3]].values  # a subset of columns can be chosen as the features
Y = dataset.iloc[:, 4].values
X, Y
(array([[5.1, 3.5, 1.4, 0.2],
        [4.9, 3. , 1.4, 0.2],
        [4.7, 3.2, 1.3, 0.2],
        ...,
        [6.5, 3. , 5.2, 2. ],
        [6.2, 3.4, 5.4, 2.3],
        [5.9, 3. , 5.1, 1.8]]),
 array(['Iris-setosa', 'Iris-setosa', 'Iris-setosa', ...,
        'Iris-virginica', 'Iris-virginica', 'Iris-virginica'],
       dtype=object))

Model Training

  • DecisionTreeClassifier is implemented with the CART algorithm and uses Gini impurity as its default splitting criterion.
  • The Gini index measures how impure (mixed) the class distribution of a sample set is. It is 0 when the set is perfectly pure, i.e. all samples belong to one class, and reaches its maximum of 1 − 1/k for k classes when the samples are spread uniformly across all classes; see the short sketch below.
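As an illustration (not from the original post), here is a minimal sketch of computing Gini impurity, 1 − Σᵢ pᵢ², for an array of labels; the helper name gini is our own:

import numpy as np

def gini(labels):
    # Gini impurity: 1 minus the sum of squared class proportions
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

print(gini(['a', 'a', 'a', 'a']))  # 0.0   -> perfectly pure
print(gini(['a', 'b', 'a', 'b']))  # 0.5   -> maximum for two classes
print(gini(['a', 'b', 'c']))       # 0.667 -> 1 - 1/3 for three classes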
from sklearn import tree

clf = tree.DecisionTreeClassifier()  # scikit-learn's decision tree model
clf = clf.fit(X, Y)                  # fit the tree on the training data
r   = tree.export_text(clf)          # export the fitted tree as text
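The text export above is handy for logs. As an aside, scikit-learn can also draw the same fitted tree graphically with tree.plot_tree; a minimal sketch, assuming matplotlib is installed (this was not part of the original post):

import matplotlib.pyplot as plt

# Draw the fitted tree; nodes are colored by majority class when filled=True
plt.figure(figsize=(12, 8))
tree.plot_tree(clf, feature_names=names[:4], class_names=list(clf.classes_), filled=True)
plt.show()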

Model Prediction

test_x = X[[0, 1, 2, 3, 4, 5], :]              # take the first six samples as test inputs
pred_target_prob = clf.predict_proba(test_x)   # predicted class probabilities
pred_target = clf.predict(test_x)              # predicted class labels

# Print the results
print("\n===Model======")
print(r)
print("\n===Test data:=====")
print(test_x)
print("\n===Predicted class probabilities:=====")
print(pred_target_prob)
print("\n===Predicted classes:======")
print(pred_target)
===Model======
|--- feature_2 <= 2.45
|   |--- class: Iris-setosa
|--- feature_2 >  2.45
|   |--- feature_3 <= 1.75
|   |   |--- feature_2 <= 4.95
|   |   |   |--- feature_3 <= 1.65
|   |   |   |   |--- class: Iris-versicolor
|   |   |   |--- feature_3 >  1.65
|   |   |   |   |--- class: Iris-virginica
|   |   |--- feature_2 >  4.95
|   |   |   |--- feature_3 <= 1.55
|   |   |   |   |--- class: Iris-virginica
|   |   |   |--- feature_3 >  1.55
|   |   |   |   |--- feature_0 <= 6.95
|   |   |   |   |   |--- class: Iris-versicolor
|   |   |   |   |--- feature_0 >  6.95
|   |   |   |   |   |--- class: Iris-virginica
|   |--- feature_3 >  1.75
|   |   |--- feature_2 <= 4.85
|   |   |   |--- feature_0 <= 5.95
|   |   |   |   |--- class: Iris-versicolor
|   |   |   |--- feature_0 >  5.95
|   |   |   |   |--- class: Iris-virginica
|   |   |--- feature_2 >  4.85
|   |   |   |--- class: Iris-virginica

===Test data:=====
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]
 [5.4 3.9 1.7 0.4]]

===Predicted class probabilities:=====
[[1. 0. 0.]
 [1. 0. 0.]
 [1. 0. 0.]
 [1. 0. 0.]
 [1. 0. 0.]
 [1. 0. 0.]]

===Predicted classes:======
['Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa'
 'Iris-setosa']
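Note that all six test samples above come from the training set, so the perfect (1.0) probabilities are expected. A minimal sketch of a fairer check on held-out data, using scikit-learn's train_test_split and accuracy_score (the variable names here are our own, not from the original post):

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Hold out 20% of the samples and evaluate on them
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=42)
clf2 = tree.DecisionTreeClassifier().fit(X_train, y_train)
print(accuracy_score(y_test, clf2.predict(X_test)))  # accuracy on unseen samples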

Personal Summary

  • In essence, the tree turns individual feature values into split tests and orders them by influence: at each node it picks the split that most reduces impurity, so the most influential features end up closest to the root. For example, if 花萼-length had the greatest influence, the first split would be on that feature, and so on down the tree; the fitted model's feature importances can be inspected as shown in the sketch below.
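To check this intuition against the fitted model, a minimal sketch reading the impurity-based importances that scikit-learn stores on the trained classifier (they sum to 1):

# Print each feature's impurity-based importance in the fitted tree
for name, importance in zip(names[:4], clf.feature_importances_):
    print(name, importance)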
