当前位置：首页 > news >正文

【最优化方法】线搜索技术

news 2025/7/13 17:59:38

线搜索分为精确线搜索和非精确线搜索。

精确线搜索是指求步长 $\alpha_{k}$ ,使得目标函数沿方向 $d_{k}$ 达到极小。

非精确线搜索是指求步长 $\alpha_{k}$ ,使得目标函数得到可接受的下降量。

进退法-初步搜索区间

基本思想：从一点出发，按一定步长，试图确定函数值呈现“高-低-高”的三点，从而得到一个近似的单峰区间。

实例

# 目标函数
def f(t):return t * t * t - 2 * t + 1 # 进退法：确定搜索区间
def advance_and_retreat(f, initial_point, step_size, multiplier):# f   目标函数# initial_point   初始点# step_size   初始步长# multiplier   加倍系数current_point = initial_pointcurrent_value = f(current_point)# front searchwhile True:next_point = current_point + step_sizenext_value = f(next_point)if next_value < current_value:current_point = next_pointcurrent_value = next_valuestep_size *= multiplierelse:break;# back searchstep_size /= multiplierwhile True:next_point = current_point - step_sizenext_value = f(next_point)if next_value < current_value:current_point = next_pointcurrent_value = next_valuestep_size *= multiplierelse:break;return current_pointinitial_point = 0
step_size = 1
multiplier = 2r = advance_and_retreat(f, initial_point=0, step_size=1, multiplier=2)
print("搜索区间为 {}-{}".format( initial_point , r))

运行结果：

代码理解

根据初始点往后面搜，往后就是增加一个步长，与算法比赛里二分模板的+1有异曲同工，即current_point + step_size。

由于是找最小值，左边一定是一段递减的过程，所以到一旦next的值大于current就说明出现了拐点，因此到拐点处再往左边接着搜索

检验

import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-1000,1000,1000000)
y = [f(i) for i in x]
plt.title('函数图像',fontsize=14)
plt.xlabel('x',fontsize=14)
plt.ylabel('y',fontsize=14)plt.rcParams['font.sans-serif'] = ['SimHei']  # 用来正常显示中文标签
plt.rcParams['axes.unicode_minus'] = False  # 用来正常显示负号
plt.plot(x,y)

可以看到函数图像如上所示拐点大概在0附近。

缩小范围到0-3，可以看到最低点应该在1附近。

进一步缩小到0-1，可以看到最低点就在中间，所以答案正确。

精确线搜索

精确线搜索分为使用导数的搜索（插值法、牛顿法和抛物线法等）以及不使用导数的搜索（0.618法、分数法以及成功-失败法等）。

黄金分割法--0.618法

基本思想：通过试探点函数值的比较，使得包含极小点的搜索区间不断缩小。

实例

# 目标函数
def objective_function(x):return x * (x + 2)# 黄金分割法计算法则
def calculate_phi1(a, b):return round(a + 0.382 * (b - a), 3)  # 黄金分割点 1
def calculate_phi2(a, b):return round(a + 0.618 * (b - a), 3)  # 黄金分割点 2def golden_section_search(objective_function, left_boundary, right_boundary, eps=0.3):# 初始化黄金分割点及其函数值phi1_value = calculate_phi1(left_boundary, right_boundary)phi1_value_func = round(objective_function(phi1_value), 3)phi2_value = calculate_phi2(left_boundary, right_boundary)phi2_value_func = round(objective_function(phi2_value), 3)# 主循环，判断是否到达终止条件while right_boundary - left_boundary > eps:if phi1_value_func < phi2_value_func:right_boundary = phi2_valuephi2_value = phi1_valuephi1_value = calculate_phi1(left_boundary, right_boundary)else:left_boundary = phi1_valuephi1_value = phi2_valuephi2_value = calculate_phi2(left_boundary, right_boundary)phi1_value_func = round(objective_function(phi1_value), 3)phi2_value_func = round(objective_function(phi2_value), 3)print(f'右边界与左边界的绝对值为 {abs(right_boundary - left_boundary)} 小于终止条件，算法停止')print(f'最优解 X* 包含于 [{left_boundary}, {right_boundary}]')print(f'X* = {round((left_boundary + right_boundary) / 2, 3)}')golden_section_search(objective_function, -3,5,0.3)

实验结果：

检验

import matplotlib.pyplot as plt
import numpy as np# 定义函数
def f(t):return t * (t + 2)x = np.linspace(-5, 5, 1000) 
y = [f(i) for i in x]# 设置图像标题和坐标轴标签
plt.title('函数图像', fontsize=14)
plt.xlabel('x', fontsize=14)
plt.ylabel('y', fontsize=14)plt.rcParams['font.sans-serif'] = ['SimHei']  # 用来正常显示中文标签
plt.rcParams['axes.unicode_minus'] = False  # 用来正常显示负号plt.plot(x, y)
plt.axhline(0, color='black',linewidth=0.5)  # 添加x轴参考线
plt.axvline(0, color='black',linewidth=0.5)  # 添加y轴参考线
plt.grid(True)  # 添加网格线
plt.show()

进一步地，缩小范围到[-2,0]

缩小到实验结果[-1.111,-0.836]，查看图像。可以看到答案就在这个区间范围内，因此该算法可行。

抛物线法--二次插值法

基本思想：在搜索区间中不断地使用二次多项式去近似目标函数，并逐步用插值多项式的极小点去逼近线搜索问题。

实例

参考自：插值法（二次插值法和三次插值法算法分析以及python代码解释）-CSDN博客

例题：用二次插值法求函数 $f(x)=3x^{4}-4x^{3}-12x^{2}$ 的极小点，迭代三次给定 $x_{1}=-1.2,x_{2}=-1.1,x_{3}=-0.8$

#  二次插值 / 抛物线法
def fun(x):return x ** 2 + 2 * x - 10
def run(x1, x2, x3):B1 = (x2 * x2 - x3 * x3) * fun(x1)B2 = (x3 * x3 - x1 * x1) * fun(x2)B3 = (x1 * x1 - x2 * x2) * fun(x3)C1 = (x2 - x3) * fun(x1)C2 = (x3 - x1) * fun(x2)C3 = (x1 - x2) * fun(x3)D = (x1 - x2) * (x2 - x3) * (x3 - x1)if (B1 + B2 + B3) == 0:x0 = 0else:x0 = (B1 + B2 + B3) / (2 * (C1 + C2 + C3))Arr = [x0, x1, x2, x3]Arr.sort()if fun(x0) > fun(x2):index = Arr.index(x2)x1 = Arr[index - 1]x3 = Arr[index + 1]else:index = Arr.index(x0)x2 = x0x1 = Arr[index - 1]x3 = Arr[index + 1]return x1, x2, x3def text(x1, x2, x3, ε):count = 0while abs(x3 - x1) and abs(x3 - x2) and abs(x2 - x1) > ε:count = count + 1print("第{}次迭代".format(count))Arr2 = run(x1, x2, x3)x1 = Arr2[0]x2 = Arr2[1]x3 = Arr2[2]print(Arr2)print("该精度下的最优解为：%f" % x2)print("函数在该精度上的最小值为",fun(x2))return
text(-3, 0.5, 4, 0.0001)

实验结果：

非精确线搜索

"线搜索技术是求解许多优化问题下降算法的基本组成部分,但精确线搜索往往需要计算很多的函数值和梯度值，从而耗费较多的计算资源，特别是当迭代点远离最优点时，精确线搜索通常不是十分有效和合理的.对于许多优化算法，其收敛速度并不依赖于精确搜索过程，因此，既能保证目标函数具有可接受的下降量，又能使最终形成的选代序列收敛的非精确线搜索变得越来越流行。着重介绍非精确线搜索中的 Wolfe 准则和 Armijo 准则。"

Wolfe准则

实例

import numpy as np# 定义目标函数 (Rosenbrock 函数)
def f(x):x1, x2 = xreturn 100 * (x2 - x1**2)**2 + (1 - x1)**2# 定义目标函数的梯度
def grad_f(x):x1, x2 = xdf_dx1 = -400 * x1 * (x2 - x1**2) + 2 * (x1 - 1)df_dx2 = 200 * (x2 - x1**2)return np.array([df_dx1, df_dx2])# Wolfe 准则线搜索函数
def wolfe_line_search(f, grad_f, xk, dk, alpha_init=1, c1=1e-4, c2=0.9, max_iter=100):alpha = alpha_initphi_0 = f(xk)phi_prime_0 = np.dot(grad_f(xk), dk)for _ in range(max_iter):# 计算 phi(alpha)x_new = xk + alpha * dkphi_alpha = f(x_new)phi_prime_alpha = np.dot(grad_f(x_new), dk)# 检查 Wolfe 条件if phi_alpha > phi_0 + c1 * alpha * phi_prime_0:alpha *= 0.5  # 如果不满足 Armijo 条件，减少步长elif phi_prime_alpha < c2 * phi_prime_0:alpha *= 2  # 如果不满足曲率条件，增大步长else:return alpha  # 满足 Wolfe 条件时返回最优步长return alpha# 初始点 xk 和搜索方向 dk
xk = np.array([-1.0, 1.0])
dk = np.array([1.0, 1.0])# 使用 Wolfe 准则进行线搜索
alpha_opt = wolfe_line_search(f, grad_f, xk, dk)
x_opt = xk + alpha_opt * dk# 输出结果
print(f"最优步长 alpha: {alpha_opt}")
print(f"最优点 x: {x_opt}")
print(f"在最优点处的函数值: {f(x_opt)}")

Armijo准则

Armijo准则的核心思想是：在当前搜索方向上，只要目标函数在该方向上的下降量超过了某个给定的阈值，步长就被接受。

import numpy as np# 目标函数 (示例：Rosenbrock 函数的变形)
def objective_function(x1, x2):return 100 * (x2 - x1**2)**2 + (1 - x1)**2# 目标函数的梯度
def gradient_function(x):x1, x2 = xdf_dx1 = -400 * x1 * (x2 - x1**2) + 2 * (x1 - 1)df_dx2 = 200 * (x2 - x1**2)return np.array([df_dx1, df_dx2])# Armijo准则线搜索
def armijo_line_search(f, grad_f, xk, pk, alpha=1.0, rho=0.8, c=1e-4):"""Armijo准则线搜索:param f: 目标函数:param grad_f: 梯度函数:param xk: 当前点:param pk: 搜索方向:param alpha: 初始步长:param rho: 步长缩减比例:param c: Armijo条件中的常数:return: 满足Armijo条件的步长"""while f(xk[0] + alpha * pk[0], xk[1] + alpha * pk[1]) > \f(xk[0], xk[1]) + c * alpha * np.dot(grad_f(xk), pk):alpha *= rho  # 若不满足Armijo条件，步长减小return alpha# 示例使用
xk = np.array([-1.0, 1.0])  # 当前点
pk = np.array([1.0, 1.0])   # 搜索方向（比如负梯度方向）# 执行 Armijo 线搜索
alpha = armijo_line_search(objective_function, gradient_function, xk, pk)
x_new = xk + alpha * pk  # 更新后的点# 输出结果
print(f"Armijo准则确定的步长 alpha: {alpha}")
print(f"新的点 x_new: {x_new}")
print(f"目标函数值 f(x_new): {objective_function(x_new[0], x_new[1])}")