
80 Python Tips (Part 1)

news 2025/10/24 3:03:18

Tip 1: Use enumerate() to loop with an index

fruits = ['apple', 'banana', 'cherry']
for index, fruit in enumerate(fruits, start=1):
    print(f"{index}. {fruit}")

Tip 2: Don't use list comprehensions blindly; they are not always faster

Many people assume that a list comprehension is always faster than a for loop. That is not always true: when the work done for each element is expensive, it dominates the running time, and the comprehension may be no faster or even slightly slower. In that case, prefer a plain loop whenever it makes the code more readable.

import math
import random
import time

def complex_calculation(x):
    return sum(math.sqrt(i) for i in range(1, x + 1))

numbers = [random.randint(1, 10000) for _ in range(10000)]

start_time = time.time()
results_comprehension = [complex_calculation(x) for x in numbers]
comprehension_time = time.time() - start_time

start_time = time.time()
results_loop = []
for number in numbers:
    result = complex_calculation(number)
    results_loop.append(result)
loop_time = time.time() - start_time

# Times are printed in milliseconds
print("list comprehension consume:", comprehension_time * 1000) # 1748.9590644836426
print("using for loop consume:", loop_time * 1000) # 1744.1840171813965

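Tip 3: Use with for file operations

When working with files, prefer Python's built-in with statement, because it handles opening and closing the file automatically. The file is closed even if an exception is raised.

with open('file.txt', 'r') as file:
    content = file.read()
    # Do something with content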
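
The claim that the file is closed even when an exception occurs can be checked with a minimal sketch. The temporary file and the deliberately raised ValueError are assumptions made up for this demonstration.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

handle = None
try:
    with open(path, "w") as file:
        handle = file  # keep a reference so we can inspect it afterwards
        file.write("hello")
        raise ValueError("simulated failure inside the with block")
except ValueError:
    pass

# The with statement closed the file although the body raised.
closed_after_error = handle.closed
print(closed_after_error)

os.remove(path)
```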

Tip 4: Use `timeit` for performance benchmarking

For example:

import math
import random
import timeit

def complex_calculation(x):
    return sum(math.sqrt(i) for i in range(1, x + 1))

def loop_calculation(numbers):
    results_loop = []
    for number in numbers:
        result = complex_calculation(number)
        results_loop.append(result)
    return results_loop

numbers = [random.randint(1, 10000) for _ in range(10000)]

comprehension_time = timeit.timeit(
    lambda: [complex_calculation(x) for x in numbers], number=1
)
loop_time = timeit.timeit(lambda: loop_calculation(numbers), number=1)

# Times are printed in milliseconds
print("list comprehension consume:", comprehension_time * 1000)  # 1748.9590644836426
print("using for loop consume:", loop_time * 1000)  # 1744.1840171813965

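Tip 5: Don't use mutable objects as default arguments

Never use a mutable object (such as a list or a dictionary) as a function's default argument. Use None instead and create the object inside the function.
Default argument values are evaluated once, when the function is defined, not each time it is called. If the default is a mutable object, the same object is shared by every call instead of being re-created, which can lead to surprising side effects.


def bad_append(item, default_list=[]):
    default_list.append(item)
    return default_list

print(bad_append(1))  # Output: [1]
print(bad_append(2))  # Output: [1, 2], not the expected [2]

def good_append(item, default_list=None):
    if default_list is None:
        default_list = []
    default_list.append(item)
    return default_list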
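
To make the "evaluated once at definition time" point concrete, the sketch below inspects the function's `__defaults__` attribute, which is where CPython stores default values. The variable names are chosen only for illustration.

```python
def bad_append(item, default_list=[]):
    default_list.append(item)
    return default_list

# The default list is created once and stored on the function object.
shared_default = bad_append.__defaults__[0]

first = bad_append(1)
second = bad_append(2)

# Both calls returned the very same list object, which is the stored default.
same_object = first is second is shared_default
print(same_object)
print(bad_append.__defaults__)
```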

Tip 6: Use `collections` for specialized container types

The collections module provides specialized container data types such as Counter, defaultdict, and OrderedDict, which make particular use cases more convenient and efficient.

from collections import Counter

word_list = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']
word_count = Counter(word_list)
print(word_count)
# Output: Counter({'apple': 3, 'banana': 2, 'orange': 1})
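
Since defaultdict is mentioned but not shown, here is a small sketch of a common use: grouping items without checking whether the key already exists. The word list and the grouping rule (first letter) are made up for the example.

```python
from collections import defaultdict

words = ['apple', 'banana', 'avocado', 'cherry', 'blueberry']

# defaultdict(list) creates an empty list the first time a key is accessed,
# so there is no need for "if key not in d" checks.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

grouped = dict(by_letter)
print(grouped)
```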

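Tip 7: Use for-else to run code when a loop finishes without break

The else block of a for-else statement runs when the for loop completes normally, that is, without exiting early through a break statement. If a break is executed inside the loop, the else block is skipped.

for n in range(2, 10):
    for x in range(2, n):
        if n % x == 0:
            print(f"{n} equals {x} * {n//x}")
            break
    else:
        print(f"{n} is a prime number")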

Tip 8: str.join() is usually faster than the + operator for concatenating strings

import random
import string
import timeit

def bad_concat(strings):
    result = ""
    for s in strings:
        result += s
    return result

def good_concat(strings):
    return "".join(strings)

test_str = [
    "".join(random.choices(string.ascii_letters + string.digits, k=10))
    for _ in range(1000)
]

bad_time = timeit.timeit(lambda: bad_concat(test_str), number=10000)
print(f"Bad time: {bad_time:.6f}s")  # 0.277742s

good_time = timeit.timeit(lambda: good_concat(test_str), number=10000)
print(f"Good time: {good_time:.6f}s")  # 0.031005s

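Tip 9: The difference between is and ==

is checks whether two names refer to the same object.
== checks whether two objects have equal values.

a = [1, 2, 3]
b = [1, 2, 3]

print(a == b)  # Output: True
print(a is b)  # Output: False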

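Tip 10: Avoid global variables unless they are really necessary

Global variables make code harder to read and maintain, and they can introduce problems such as race conditions in concurrent code.

# Bad practice
global_variable = 0

def increment_bad():
    global global_variable
    global_variable += 1

# Good practice
def increment_good(variable):
    return variable + 1

local_variable = 0
local_variable = increment_good(local_variable)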

Tip 11: Use `zip` to iterate over two lists at the same time

names = ['Alice', 'Bob', 'Charlie']
ages = [30, 25, 35]

for name, age in zip(names, ages):
    print(f"{name} is {age} years old")

Tip 12: Use `isinstance()` to check whether an object is an instance of a class

def greet(person):
    if isinstance(person, str):
        print(f"Hello, {person}!")
    else:
        print("Hello, stranger!")

greet("Alice")

Tip 13: Use `try-except-else` to run code only when no exception occurred

try:
    result = 1 / 2
except ZeroDivisionError:
    print("Can't divide by zero!")
else:
    print(f"The result is {result}")

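Tip 14: Use `ast.literal_eval()` instead of `eval()`

eval() is dangerous because it can execute arbitrary code. If its argument is not under your control or comes from an untrusted source, an attacker can use it to run malicious code.
ast.literal_eval() is a safer function from the ast module. It evaluates a string expression, but only literal syntax: strings, numbers, tuples, lists, dicts, sets, booleans, and None. It cannot execute arbitrary code.

# Bad practice
user_input = "os.system('rm -rf /')"
result = eval(user_input)  # This could be dangerous! Never do this with untrusted input.

# Good practice
from ast import literal_eval

user_input = "[1, 2, 3]"
result = literal_eval(user_input)  # Safe and returns a list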
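
The following sketch shows what "literals only" means in practice: a literal is parsed, while an expression containing a function call is rejected with ValueError instead of being executed. The input strings are invented for the demonstration.

```python
from ast import literal_eval

# A nested literal structure is parsed into real Python objects.
parsed = literal_eval("{'a': [1, 2, 3], 'b': (4.5, None)}")

# A string containing a function call is not a literal, so it is rejected
# without ever being executed.
rejected = False
try:
    literal_eval("__import__('os').getcwd()")
except ValueError:
    rejected = True

print(parsed, rejected)
```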

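Tip 15: To get a reversed copy without modifying the list, use slicing instead of reversed()

list[::-1] creates a reversed copy of the original list. The operation is optimized internally and is usually faster than combining reversed() with the list constructor, as in list(reversed(lst)).
Slicing does not change the original list; it returns a new, reversed list. That is useful when you need to keep the original order intact.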

original_list = [1, 2, 3, 4, 5]
reversed_list = original_list[::-1]
print(reversed_list)
# Output: [5, 4, 3, 2, 1]

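Tip 16: Use `any()` and `all()` to check the elements of a list

any() checks whether at least one element of an iterable is truthy.
all() checks whether every element of an iterable is truthy.

Both improve readability and efficiency because they short-circuit: any() stops at the first truthy element and all() stops at the first falsy one. This helps especially with large datasets.

numbers = [2, 4, 6, 8, 10]

even_check = lambda x: x % 2 == 0

print(any(even_check(x) for x in numbers))  # Output: True
print(all(even_check(x) for x in numbers))  # Output: True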
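
The short-circuit behaviour can be observed by recording which elements are actually inspected. This is a minimal sketch; the `checked` list and the sample numbers exist only for the demonstration. Note that it relies on passing a generator expression: a list comprehension would evaluate every element before any() or all() even starts.

```python
checked = []

def is_even(x):
    checked.append(x)  # record every element that is actually inspected
    return x % 2 == 0

numbers = [1, 2, 3, 4, 5]

has_even = any(is_even(x) for x in numbers)
inspected_by_any = list(checked)  # any() stopped at the first even number

checked.clear()
all_even = all(is_even(x) for x in numbers)
inspected_by_all = list(checked)  # all() stopped at the first odd number

print(has_even, inspected_by_any)
print(all_even, inspected_by_all)
```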

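Tip 17: Use context managers to handle resources

Context managers handle external resources (such as network connections or database sessions) gracefully. They take care of resource management and exception handling automatically, and they keep the code concise.

from contextlib import contextmanager

@contextmanager
def database_connection(url):
    # create_database_connection() is a placeholder for your own connection factory
    connection = create_database_connection(url)
    try:
        yield connection
    finally:
        connection.close()

with database_connection('example-db-url') as connection:
    result = connection.execute('SELECT * FROM users')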
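
For a version that runs as-is, the sketch below uses the standard-library sqlite3 module as a stand-in for the database. The in-memory database, the users table, and its single row are assumptions made for the example, not part of the original tip.

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def database_connection(url):
    connection = sqlite3.connect(url)
    try:
        yield connection
    finally:
        connection.close()  # runs even if the with body raises

with database_connection(":memory:") as connection:
    connection.execute("CREATE TABLE users (name TEXT)")
    connection.execute("INSERT INTO users VALUES ('Alice')")
    rows = connection.execute("SELECT * FROM users").fetchall()

# After the with block the connection is closed, so further use fails.
try:
    connection.execute("SELECT 1")
    closed = False
except sqlite3.ProgrammingError:
    closed = True

print(rows, closed)
```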

Tip 18: `map()` and `filter()` are a better fit in some cases

For simple operations, map() and filter() are very convenient.
They return iterators rather than complete lists, so they use less memory and suit large datasets.
They also work well for data preprocessing and for chaining steps together.

numbers = [1, 2, 3, 4, 5]

# Using map() and filter()
squares = map(lambda x: x**2, numbers)
even_squares = filter(lambda x: x % 2 == 0, squares)

# Using list comprehensions
squares = [x**2 for x in numbers]
even_squares = [x for x in squares if x % 2 == 0]
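
One consequence of map() and filter() returning iterators is that the result can be consumed only once. A short sketch of that behaviour, reusing the same numbers:

```python
numbers = [1, 2, 3, 4, 5]

squares = map(lambda x: x ** 2, numbers)
even_squares = filter(lambda x: x % 2 == 0, squares)

# Nothing has been computed yet; the work happens when the iterator is consumed.
first_pass = list(even_squares)

# The iterator is now exhausted, so a second pass yields nothing.
second_pass = list(even_squares)

print(first_pass, second_pass)
```

If the values are needed more than once, materialize them with list() first or use a list comprehension.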

Tip 19: Don't use mutable data structures as dictionary keys

Use tuples or namedtuples as dictionary keys instead.

# Bad practice
bad_key = [1, 2, 3]
bad_dict = {bad_key: "value"}  # Raises TypeError: unhashable type: 'list'

# Good practice
good_key = (1, 2, 3)
good_dict = {good_key: "value"}
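
A namedtuple works as a key in the same way and gives the fields readable names. In this sketch the `Point` type and the labels are hypothetical, chosen only to illustrate the idea.

```python
from collections import namedtuple

# Point is a hypothetical record type used only for this example.
Point = namedtuple("Point", ["x", "y"])

labels = {Point(0, 0): "origin", Point(1, 2): "marker"}

origin_label = labels[Point(x=0, y=0)]

# A namedtuple hashes and compares like a plain tuple with the same values.
tuple_lookup = labels[(1, 2)]

print(origin_label, tuple_lookup)
```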

Tip 20: Use the built-in `sum()` to add up a list of numbers

numbers = [1, 2, 3, 4, 5]

# Inefficient
total = 0
for number in numbers:
    total += number

# Efficient
total = sum(numbers)

Tip 21: Use `dir()` to inspect an object's attributes and methods

import datetime

attributes = dir(datetime)
print(attributes)

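Tip 22: Use slicing to create a shallow copy instead of plain assignment

original_list = [1, 2, 3, 4, 5]

# Bad practice: this only creates a second name for the same list
bad_copy = original_list

# Good practice: this creates a new list with the same elements
good_copy = original_list[:]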
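
A shallow copy duplicates only the outer list; nested objects are still shared. The sketch below contrasts assignment, slicing, and copy.deepcopy() on a nested list invented for the example.

```python
import copy

original = [[1, 2], [3, 4]]

alias = original                 # same list object
shallow = original[:]            # new outer list, shared inner lists
deep = copy.deepcopy(original)   # fully independent copy

original.append([5, 6])   # visible through the alias only
original[0].append(99)    # visible through the alias and the shallow copy

print(alias)
print(shallow)
print(deep)
```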

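Tip 23: `pass` and `continue` are not interchangeable inside a loop

pass is a no-op statement.
continue skips the rest of the loop body and moves on to the next iteration.

for i in range(5):
    if i == 2:
        pass
    print(i)  # Output: 0, 1, 2, 3, 4

for i in range(5):
    if i == 2:
        continue
    print(i)  # Output: 0, 1, 3, 4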

Tip 24: Avoid C-style loops

If you need an index, use range(); see also Tip 1 for enumerate().

# Bad practice
i = 0
while i < 5:
    print(i)
    i += 1

# Good practice
for i in range(5):
    print(i)

Tip 25: Use `os.path` when working with file paths

Build paths with the os.path module, because it ensures cross-platform compatibility.

import os

path = os.path.join("folder1", "folder2", "file.txt")
print(path)

Tip 26: Use `round` to round a number to a given number of decimal places

pi = 3.1415926535
rounded_pi = round(pi, 3)
print(rounded_pi)  # Output: 3.142
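
Note that round() rounds rather than truncates, and two details are easy to miss: exact ties go to the nearest even number, and floats that look like ties may not be ties in binary. A few cases as a sketch:

```python
# Exact ties are rounded to the nearest even integer ("banker's rounding").
ties = [round(0.5), round(1.5), round(2.5)]

# 2.675 cannot be represented exactly as a float; the stored value is
# slightly below 2.675, so it rounds down.
not_a_tie = round(2.675, 2)

print(ties, not_a_tie)
```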

Tip 27: Use `functools.lru_cache` to cache function results

import functools

@functools.lru_cache(maxsize=None)
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(100))  # Output: 354224848179261915075
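
To see the cache at work, the decorated function exposes cache_info(). The sketch below repeats the fib definition with a smaller argument so that the hit and miss counts are easy to follow.

```python
import functools

@functools.lru_cache(maxsize=None)
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

value = fib(30)
info = fib.cache_info()

# Each of fib(0) .. fib(30) is computed exactly once (31 misses); every other
# recursive call is answered from the cache.
print(value, info)
```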

Tip 28: `*args` and `**kwargs` collect positional and keyword arguments

def foo(*args, **kwargs):
    print("Positional arguments:", args)
    print("Keyword arguments:", kwargs)

foo(1, 2, 3, a=4, b=5)

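Tip 29: Don't use `assert` to validate user input; check for errors explicitly

assert statements are disabled when the Python interpreter runs with the -O option, and the resulting AssertionError is not a user-friendly error.

def divide(a, b):
    # Bad practice
    assert b != 0, "Division by zero!"
    # Good practice
    if b == 0:
        raise ValueError("Division by zero!")
    return a / b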

Tip 30: Use the `itertools` module for efficient iteration and combination

import itertools

letters = ['a', 'b', 'c']

perms = list(itertools.permutations(letters, 2))
print(perms)  # Output: [('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b')]

Tip 31: Use `setattr()` and `getattr()` to set and get object attributes dynamically

class MyClass:
    pass

obj = MyClass()
setattr(obj, 'attribute', 42)
value = getattr(obj, 'attribute')
print(value)  # Output: 42

Tip 32: The value of `__name__` depends on whether the module is run directly or imported

def main():
    print("This script is being run as the main program.")

if __name__ == '__main__':
    main()

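Tip 33: Imported modules are executed only once

Python caches imported modules, so a module's code runs only on the first import and is not executed again on later imports.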

import my_module  # my_module is executed
import my_module  # my_module is NOT executed again
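
The cache lives in sys.modules. As a sketch, the standard-library json module stands in for my_module here so that the example runs anywhere; importlib.reload() is the explicit way to execute a module's code again.

```python
import importlib
import sys

import json           # first import: the module body runs and is cached
import json as again  # second import: served from the sys.modules cache

same_module = again is json is sys.modules["json"]

# reload() re-executes the module body and returns the same module object.
reloaded = importlib.reload(json)

print(same_module, reloaded is json)
```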

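Tip 34: Don't use floats for exact decimal arithmetic

Floating-point numbers carry rounding errors because most decimal fractions cannot be represented exactly in binary. Use the decimal module when you need exact decimal arithmetic:

from decimal import Decimal

result1 = 0.1 + 0.2
print(result1)  # Output: 0.30000000000000004

result2 = Decimal('0.1') + Decimal('0.2')
print(result2)  # Output: 0.3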
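
A related pitfall is constructing a Decimal from a float instead of a string: the float's binary error is carried over exactly. The sketch below shows this, plus explicit rounding with quantize(); the amounts are arbitrary example values.

```python
from decimal import Decimal, ROUND_HALF_UP

from_float = Decimal(0.1)     # inherits the float's binary approximation
from_string = Decimal('0.1')  # exactly one tenth

exact_sum = Decimal('0.1') + Decimal('0.2')

# quantize() rounds to a fixed number of places with an explicit rounding mode.
price = Decimal('2.675').quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)

print(from_float)
print(from_string, exact_sum, price)
```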

Tip 35: Use the `re` module for regular expression matching

import re

pattern = r'\d+'
string = 'There are 42 apples and 15 oranges.'

numbers = re.findall(pattern, string)
print(numbers)  # Output: ['42', '15']

Tip 36: Use `pprint` to print complex data in a more readable form

from pprint import pprint

data = {
    'name': 'Alice',
    'age': 30,
    'skills': ['Python', 'Java', 'C++'],
    'address': {
        'street': '123 Main St',
        'city': 'New York',
        'state': 'NY',
        'zip': '10001'
    }
}

pprint(data)

Tip 37: Define `__str__()` and `__repr__()` to customize a class's string representation

class MyClass:
    def __str__(self):
        return "This is a human-readable description of MyClass."

    def __repr__(self):
        return f"MyClass(id={id(self)})"

obj = MyClass()
print(str(obj))  # Output: This is a human-readable description of MyClass.
print(repr(obj))  # Output: MyClass(id=140123456789012) (the id varies between runs)

Tip 38: `try-except` is faster when exceptions are unlikely

The Python interpreter optimizes exception handling so that a try-except block costs very little when no exception is raised. Wrapping code in try-except therefore does not slow it down as long as exceptions are rare; as the timings below show, it becomes the slower option when the exception is actually raised.

import timeit

# Approach 1: check with a condition
def get_value_condition(dictionary, key):
    # When the key exists, this performs two lookups
    if key in dictionary:
        return dictionary[key]
    else:
        return None

# Approach 2: use try-except
def get_value_try_except(dictionary, key):
    # Only one lookup is performed, whether or not the key exists
    try:
        return dictionary[key]
    except KeyError:
        return None

# Create a dictionary and some test data
dictionary = {i: i for i in range(1000)}
key = 500
key_not_exists = 1500

# Benchmark approach 1
time_condition = timeit.timeit(
    lambda: get_value_condition(dictionary, key), number=1000000
)
time_condition_not_exists = timeit.timeit(
    lambda: get_value_condition(dictionary, key_not_exists), number=1000000
)

# Benchmark approach 2
time_try_except = timeit.timeit(
    lambda: get_value_try_except(dictionary, key), number=1000000
)
time_try_except_not_exists = timeit.timeit(
    lambda: get_value_try_except(dictionary, key_not_exists), number=1000000
)

print(f"Condition (key exists): {time_condition * 10000:2.3f}")  # 569.567
print(f"Condition (key not exist): {time_condition_not_exists * 10000:2.3f}")  # 336.387
print(f"Try-Except (key exists): {time_try_except * 10000:2.3f}") # 392.551
print(f"Try-Except (key not exist): {time_try_except_not_exists * 10000:2.3f}") # 1075.084

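Tip 39: Don't use `os.system()` or `os.popen()` to run external commands

Both can lead to security vulnerabilities; use the subprocess module instead. You still need to validate the external command yourself.
Run the command with subprocess.run() and pass check=True so that a non-zero exit status raises an exception. The example also sets capture_output=True to capture the output and text=True so that the output is returned as text.

import subprocess

output = subprocess.run(['ls', '-l'], capture_output=True, text=True, check=True)
print(output.stdout)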
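
The effect of check=True is easiest to see with a command that fails. In this sketch, sys.executable (the running Python interpreter) replaces ls so the example works on any platform; the one-line child scripts are made up for the demonstration.

```python
import subprocess
import sys

# A command that succeeds: stdout is captured and returned as text.
ok = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    capture_output=True, text=True, check=True,
)
greeting = ok.stdout.strip()

# A command that exits with status 3: check=True turns that into an exception.
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"], check=True)
    exit_code = 0
except subprocess.CalledProcessError as error:
    exit_code = error.returncode

print(greeting, exit_code)
```

Passing the command as a list, without shell=True, also means the arguments are never interpreted by a shell.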

Tip 40: Use `logging` instead of `print()`

logging offers more advanced log management: you can control the log level, the output format, and where log files are stored.

import logging

logging.basicConfig(level=logging.INFO)

logging.debug("This is a debug message.")  # Not shown: below the INFO level
logging.info("This is an info message.")
logging.warning("This is a warning message.")
logging.error("This is an error message.")
logging.critical("This is a critical message.")
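
The level, format, and destination can also be set on a dedicated logger. In the sketch below the logger name "demo" and the format string are arbitrary choices, and an in-memory StringIO stream stands in for a log file (logging.FileHandler would be the usual choice for a real file).

```python
import io
import logging

stream = io.StringIO()  # stands in for a log file in this example
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("demo")  # "demo" is an arbitrary example name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep these records away from the root logger

logger.debug("hidden: below the INFO threshold")
logger.info("service started")
logger.warning("disk space low")

captured = stream.getvalue()
print(captured)
```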