Photo EXIF Data Statistics and Visualization
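As my photo collection keeps growing, I wanted to understand my everyday shooting habits, which helps with later decisions such as changing lenses or adjusting camera settings, so I wrote this script to gather statistics from photo EXIF data. It processes every JPG image under a given folder, reads the shutter speed, ISO, focal length, aperture, and capture time, and plots distributions from them. To use it, paste all of the code snippets in this post into a single file, set the folder_path variable to the folder you want to process, and run the script. The details follow.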
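Dependencies
- Python 3.x
- exifread: reads EXIF data from images
- matplotlib: draws the distribution plots
- numpy: handles numeric data
- pathlib: handles file paths (standard library)
- collections.Counter: counts frequencies (standard library)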
Installing dependencies
Install the required third-party Python libraries with:
pip install exifread matplotlib numpy
Usage
- Save the script as photo_statistic.py.
- Set the folder_path variable in the script to the folder you want to process.
- Run the script:
python photo_statistic.py
Code walkthrough
Imports
from pathlib import Path
import exifread
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
from collections import Counter
Reading EXIF data
The get_exif_data function reads an image's EXIF data and extracts the shutter speed, ISO, focal length, aperture, and capture time.
def get_exif_data(image_path):
    # Read all EXIF tags from the image file
    with open(image_path, 'rb') as f:
        tags = exifread.process_file(f)
    # Prefer the direct tags and fall back to alternatives when they are missing.
    # Note: ShutterSpeedValue and ApertureValue are APEX values, not seconds or f-numbers.
    shutter_speed = tags.get('EXIF ExposureTime') or tags.get('EXIF ShutterSpeedValue')
    iso = tags.get('EXIF ISOSpeedRatings')
    focal_length = tags.get('EXIF FocalLength')
    aperture = tags.get('EXIF FNumber') or tags.get('EXIF ApertureValue')
    datetime = tags.get('EXIF DateTimeOriginal') or tags.get('Image DateTime')
    return {
        'file': image_path,
        'shutter_speed': shutter_speed,
        'iso': iso,
        'focal_length': focal_length,
        'aperture': aperture,
        'datetime': datetime
    }
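One caveat about the fallback tags: in the EXIF specification, ShutterSpeedValue and ApertureValue are stored in APEX units, so they are not directly comparable with ExposureTime (seconds) and FNumber. A minimal sketch of the conversion, assuming the APEX value has already been converted to a float. The helper names are hypothetical and not part of the original script:

```python
def apex_to_exposure_time(tv):
    """Convert an APEX shutter speed value (Tv) to an exposure time in seconds."""
    # Tv = -log2(t)  =>  t = 2 ** (-Tv)
    return 2.0 ** (-tv)


def apex_to_f_number(av):
    """Convert an APEX aperture value (Av) to an f-number."""
    # Av = 2 * log2(N)  =>  N = 2 ** (Av / 2)
    return 2.0 ** (av / 2.0)


if __name__ == "__main__":
    print(apex_to_exposure_time(7.0))  # Tv = 7 corresponds to 1/128 s
    print(apex_to_f_number(4.0))       # Av = 4 corresponds to f/4
```

If the script relies on the fallback tags, applying a conversion like this before plotting would keep all values on the same scale.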
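Processing the images in a folder
The process_images_in_folder function walks through all JPG images under the given folder, including subfolders, and calls get_exif_data on each one to collect its EXIF data.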
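def process_images_in_folder(folder_path):
    results = []
    folder = Path(folder_path)
    # rglob searches subfolders recursively; on case-sensitive file systems
    # this pattern matches only the lowercase .jpg extension
    for image_path in folder.rglob('*.jpg'):
        results.append(get_exif_data(image_path))
    return results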
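Many cameras write file names such as IMG_0001.JPG, which the '*.jpg' pattern can miss on case-sensitive file systems. A minimal sketch of a case-insensitive search, assuming .jpg and .jpeg are the extensions of interest. The find_jpeg_files helper and the JPEG_SUFFIXES constant are hypothetical additions, not part of the original script:

```python
from pathlib import Path

# Assumed set of extensions to accept, compared in lowercase
JPEG_SUFFIXES = {".jpg", ".jpeg"}


def find_jpeg_files(folder_path):
    """Return all JPEG files under folder_path, matching the suffix case-insensitively."""
    folder = Path(folder_path)
    return sorted(
        path for path in folder.rglob("*")
        if path.is_file() and path.suffix.lower() in JPEG_SUFFIXES
    )
```

The loop in process_images_in_folder could then iterate over find_jpeg_files(folder_path) instead of folder.rglob('*.jpg').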
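Plotting the statistics
The plot_statistics function plots the shutter speed and focal length distributions and shows how photos taken at a chosen focal length are spread over time. ISO and aperture values are extracted as well, but this version of the script does not plot them.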
def plot_statistics(results):
    # Convert the raw EXIF tags to plain values, skipping images where a tag is missing
    shutter_speeds = [float(result['shutter_speed'].values[0]) for result in results if result['shutter_speed']]
    isos = [int(result['iso'].values[0]) for result in results if result['iso']]
    focal_lengths = [float(result['focal_length'].values[0].num) / float(result['focal_length'].values[0].den) for result in results if result['focal_length']]
    apertures = [float(result['aperture'].values[0].num) / float(result['aperture'].values[0].den) for result in results if result['aperture']]
    dates = [result['datetime'].values.split(' ')[0].replace(':', '.') for result in results if result['datetime']]

    # Drop shutter speeds above the threshold
    shutter_speeds_thres = 0.1
    shutter_speeds = [speed for speed in shutter_speeds if speed <= shutter_speeds_thres]

    fig, axs = plt.subplots(2, 1)

    # Format a shutter speed as a fraction, e.g. 0.004 -> "1/250"
    def format_shutter_speed(speed):
        if speed <= 0:
            return "0"
        return f"1/{(1/speed):.0f}" if speed < 1 else str(speed)

    # Plot the shutter speed distribution with fractional x-axis tick labels
    axs[0].hist(shutter_speeds, bins=100, color='blue', edgecolor='black')
    axs[0].set_title('Shutter Speed Distribution' + f' (Threshold: {shutter_speeds_thres}s)')
    axs[0].set_xlabel('Shutter Speed (s)')
    axs[0].xaxis.set_major_formatter(ticker.FuncFormatter(lambda x, _: format_shutter_speed(x)))
    axs[0].set_ylabel('Frequency')

    # Plot the focal length distribution
    axs[1].hist(focal_lengths, bins=100, color='red', edgecolor='black')
    axs[1].set_title('Focal Length Distribution')
    axs[1].set_xlabel('Focal Length (mm)')
    axs[1].set_ylabel('Frequency')
    plt.tight_layout()

    # Label the N most frequent focal lengths.
    # np.bincount only accepts non-negative integers, so round to whole millimetres first.
    N = 6
    focal_length_counts = np.bincount(np.round(focal_lengths).astype(int))
    most_common_focal_lengths = np.argsort(focal_length_counts)[-N:][::-1]
    for focal_length in most_common_focal_lengths:
        if focal_length_counts[focal_length] > 0:
            axs[1].text(focal_length, focal_length_counts[focal_length], str(focal_length), color='black')

    # Count photos per date across all images
    dates_freq = Counter(dates)

    # Count photos per date for one specific focal length
    def get_focal_length_time_distribution(results, focal_length):
        date_of_focal_length = []
        for result in results:
            # Skip images that lack either the focal length or the capture time
            if (result['focal_length'] and result['datetime']
                    and result['focal_length'].values[0].num / result['focal_length'].values[0].den == focal_length):
                date_of_focal_length.append(result['datetime'].values)
        date_of_focal_length.sort()
        # Split on the space and keep only the date part
        date_of_focal_length = [time.split(' ')[0].replace(':', '.') for time in date_of_focal_length]
        return Counter(date_of_focal_length)

    prefix_focal_length = 50
    focal_length_freq = get_focal_length_time_distribution(results, prefix_focal_length)

    fig, ax = plt.subplots()
    # Sort dates_freq by date
    dates_freq = dict(sorted(dates_freq.items()))
    ax.barh(list(dates_freq.keys()), list(dates_freq.values()), alpha=0.5, label='All')
    ax.barh(list(focal_length_freq.keys()), list(focal_length_freq.values()), label='Focal Length ' + str(prefix_focal_length) + 'mm')
    ax.legend()
    ax.set_xlabel('Frequency')
    ax.set_ylabel('Date')
    plt.show()
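The step that picks the most frequent focal lengths can also be written with collections.Counter, which avoids building a count array indexed by millimetre. A minimal sketch, assuming focal lengths are rounded to whole millimetres before counting. The most_common_focal_lengths function is a hypothetical helper, and the sample values are made up:

```python
from collections import Counter


def most_common_focal_lengths(focal_lengths, n=6):
    """Return the n most frequent focal lengths (rounded to whole mm) with their counts."""
    rounded = [round(value) for value in focal_lengths]
    return Counter(rounded).most_common(n)


if __name__ == "__main__":
    sample = [50.0, 50.0, 35.0, 85.0, 50.0, 35.0]  # made-up focal lengths in mm
    print(most_common_focal_lengths(sample, n=2))  # [(50, 3), (35, 2)]
```

Each returned pair is (focal length, count), which maps directly onto the x and y positions used for the text labels on the histogram.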
Main function
The main function sets the folder path and then calls process_images_in_folder and plot_statistics.
def main():
    # Change this to the folder you want to process
    folder_path = 'c:/Users/25503/Desktop/照片合集'
    results = process_images_in_folder(folder_path)
    plot_statistics(results)

if __name__ == "__main__":
    main()
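As an alternative to editing folder_path by hand, the folder could be passed on the command line. A minimal sketch using the standard argparse module. The parse_args helper and the default of the current directory are assumptions, not part of the original script:

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None makes argparse read sys.argv."""
    parser = argparse.ArgumentParser(
        description="Plot EXIF statistics for a folder of JPG photos."
    )
    # Optional positional argument; falls back to the current directory (an assumption)
    parser.add_argument(
        "folder_path",
        nargs="?",
        default=".",
        help="folder containing the JPG images",
    )
    return parser.parse_args(argv)
```

Inside main, folder_path would then come from parse_args().folder_path, and the script could be run as python photo_statistic.py followed by the folder to process.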
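Results
Running the script produces two figures: one with the shutter speed and focal length histograms, and one horizontal bar chart of photos per date with the chosen focal length overlaid.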
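Notes
- Make sure the folder path is correct and that the folder contains images in JPG format.
- The script reads each image's EXIF data. If an image has no EXIF data, it contributes nothing to the corresponding statistics.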
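To see how much the missing-EXIF case affects a run, the results can be summarised before plotting. A minimal sketch, assuming each result is a dict shaped like the one returned by get_exif_data, with None for a missing tag. The count_missing_fields helper is hypothetical:

```python
from collections import Counter

# Keys of the dict returned by get_exif_data, excluding 'file'
FIELDS = ("shutter_speed", "iso", "focal_length", "aperture", "datetime")


def count_missing_fields(results):
    """Count, per EXIF field, how many result dicts have no value for it."""
    missing = Counter()
    for result in results:
        for field in FIELDS:
            if result.get(field) is None:
                missing[field] += 1
    return missing
```

Calling count_missing_fields(results) after process_images_in_folder gives a quick sense of how many photos each plot leaves out.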