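Hypothesis Testing

What Is Hypothesis Testing

Hypothesis testing is a statistical method that uses sample data to evaluate a claim about a population parameter. It helps us decide whether the sample provides enough evidence to reject the hypothesis about the population.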

Types of Errors

Type I Error: A Type I error occurs when a true null hypothesis is rejected. For example, concluding that the mean age at which children start walking is different from 12 months when in fact it is not.
Type II Error: A Type II error occurs when a false null hypothesis is not rejected. For example, failing to reject the null hypothesis that the proportion of businesses considering becoming a customer of the bank is equal to 20% when in fact the proportion is less than 20%.
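
A minimal simulation sketch, not part of the original notes, of what a Type I error rate means in practice: when the null hypothesis is true, a test run at level α should reject it in roughly a fraction α of repeated samples. The population (normal with mean 12 and standard deviation 1), the sample size, and the number of repetitions are assumptions chosen only for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
alpha = 0.05                     # significance level
mu_0 = 12.0                      # null value; here it is also the true mean
n_trials, n = 2000, 30           # assumed number of repetitions and sample size

# One row per simulated study; H0 is true in every row by construction.
samples = rng.normal(loc=mu_0, scale=1.0, size=(n_trials, n))

# One-sample t-test applied to each row.
_, p_values = stats.ttest_1samp(samples, mu_0, axis=1)

# Every rejection here is a Type I error, because H0 is true.
type_i_rate = np.mean(p_values < alpha)
print("Estimated Type I error rate:", type_i_rate)
```

The printed rate should land near `alpha`; it will not match exactly because it is itself an estimate from a finite number of simulated studies.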

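Steps in Hypothesis Testing

State the Hypotheses:
- Null hypothesis $H_{0}$: states that there is no effect or difference. For example, $H_{0} : \mu = 7$
- Alternative hypothesis $H_{a}$: states that there is an effect or difference. For example, $H_{a} : \mu \neq 7$
Choose the Significance Level, $\alpha$: typically 0.05 or 0.01.
Select the Appropriate Test: choose a test based on the sample size and the type of data.
Calculate the Test Statistic: compute the test statistic from the sample data.
Determine the Critical Value or Calculate the p-value: using the chosen significance level and test, look up the critical value or compute the p-value.
Make a Decision:
- If the test statistic is more extreme than the critical value, or equivalently the p-value is less than the significance level $\alpha$, reject the null hypothesis.
- Otherwise, do not reject the null hypothesis.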
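
The steps above can be sketched end to end for the example hypotheses $H_{0} : \mu = 7$ and $H_{a} : \mu \neq 7$. This is an illustration under stated assumptions rather than a prescribed procedure: the sample values are hypothetical, and a two-sided one-sample t-test is assumed to be the appropriate test.

```python
import math
from scipy import stats

# Step 1: hypotheses H0: mu = 7 versus Ha: mu != 7 (two-sided).
mu_0 = 7.0
# Hypothetical sample, invented for this sketch.
sample = [7.2, 6.8, 7.5, 7.1, 6.9, 7.4, 7.3, 7.0]

# Step 2: significance level.
alpha = 0.05

# Steps 3-4: one-sample t statistic, t = (mean - mu_0) / (s / sqrt(n)).
n = len(sample)
mean = sum(sample) / n
s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))  # sample std (n - 1)
t_stat = (mean - mu_0) / (s / math.sqrt(n))

# Step 5: two-sided critical value and p-value from the t distribution, df = n - 1.
df = n - 1
t_crit = stats.t.ppf(1 - alpha / 2, df)
p_value = 2 * stats.t.sf(abs(t_stat), df)

# Step 6: the critical-value rule and the p-value rule give the same decision.
reject_by_critical_value = abs(t_stat) > t_crit
reject_by_p_value = p_value < alpha
print("t =", t_stat, "critical value =", t_crit, "p =", p_value)
print("Reject H0:", reject_by_p_value)
```

Both decision rules are computed so they can be compared; in practice one of them is enough.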

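Common Hypothesis Tests

One-Sample t-Test

Conditions: the sample comes from a normally distributed population, and the population standard deviation is unknown.
Hypotheses:
- $H_{0}$: $\mu = \mu_{0}$
- $H_{a}$: $\mu \neq \mu_{0}$
Test statistic: $t = \frac{\bar{x} - \mu_{0}}{s / \sqrt{n}}$, where $\bar{x}$ is the sample mean, $s$ the sample standard deviation, and $n$ the sample size.
Python code example: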

import scipy.stats as stats

# Sample data
data = [12.1, 11.8, 12.3, 12.0, 12.5, 11.7, 12.2, 12.3, 12.1, 12.2]

# Hypothesized population mean
mu_0 = 12

# Two-sided one-sample t-test
t_stat, p_value = stats.ttest_1samp(data, mu_0)

# Print the results
print("t-statistic:", t_stat)
print("p-value:", p_value)

Independent Two-Sample t-Test

Conditions: two independent samples, with the population standard deviations unknown.
Hypotheses:
- $H_{0}$: $\mu_{1} = \mu_{2}$
- $H_{a}$: $\mu_{1} \neq \mu_{2}$
Test statistic: $t = \frac{\bar{x}_{1} - \bar{x}_{2}}{\sqrt{\frac{s_{1}^{2}}{n_{1}} + \frac{s_{2}^{2}}{n_{2}}}}$
Python code example:

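import scipy.stats as stats

# Sample data
data1 = [12.1, 11.8, 12.3, 12.0, 12.5]
data2 = [11.7, 12.2, 12.3, 12.1, 12.2]

# Welch's t-test: equal_var=False keeps the two sample variances separate
# (the default, equal_var=True, pools them instead)
t_stat, p_value = stats.ttest_ind(data1, data2, equal_var=False)

# Print the results
print("t-statistic:", t_stat)
print("p-value:", p_value)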
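
As a cross-check, the unpooled (Welch) form of the two-sample statistic can be computed by hand and compared with SciPy. This is a sketch, not part of the original notes; the Welch-Satterthwaite formula for the degrees of freedom is not stated in the text above and is brought in here as the standard companion to the unpooled statistic.

```python
import math
from statistics import mean, variance
from scipy import stats

data1 = [12.1, 11.8, 12.3, 12.0, 12.5]
data2 = [11.7, 12.2, 12.3, 12.1, 12.2]
n1, n2 = len(data1), len(data2)

# Per-sample terms s_i^2 / n_i (statistics.variance divides by n - 1).
v1 = variance(data1) / n1
v2 = variance(data2) / n2

# Unpooled (Welch) t statistic.
t_manual = (mean(data1) - mean(data2)) / math.sqrt(v1 + v2)

# Welch-Satterthwaite approximation for the degrees of freedom.
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
p_manual = 2 * stats.t.sf(abs(t_manual), df)

# Cross-check against SciPy's Welch test.
t_scipy, p_scipy = stats.ttest_ind(data1, data2, equal_var=False)
print("manual:", t_manual, p_manual)
print("scipy: ", t_scipy, p_scipy)
```

Because the two samples have the same size here, the pooled and unpooled t statistics coincide; only the degrees of freedom, and therefore the p-value, differ between the two variants.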

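Paired Sample t-Test

Conditions: paired samples, with the population standard deviation of the differences unknown.
Hypotheses:
- $H_{0}$: $\mu_{d} = 0$
- $H_{a}$: $\mu_{d} \neq 0$
Test statistic: $t = \frac{\bar{d}}{s_{d} / \sqrt{n}}$, where $\bar{d}$ and $s_{d}$ are the mean and standard deviation of the paired differences and $n$ is the number of pairs.
Python code example: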

import scipy.stats as stats

# Paired sample data
before = [12.1, 11.8, 12.3, 12.0, 12.5]
after = [11.7, 12.2, 12.3, 12.1, 12.2]

# Paired t-test
t_stat, p_value = stats.ttest_rel(before, after)

# Print the results
print("t-statistic:", t_stat)
print("p-value:", p_value)
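
Since the paired test is defined on the differences $d_{i}$, it should agree with a one-sample t-test of $\mu_{d} = 0$ applied to those differences. The following sketch, added for illustration, checks that equivalence on the same data.

```python
from scipy import stats

before = [12.1, 11.8, 12.3, 12.0, 12.5]
after = [11.7, 12.2, 12.3, 12.1, 12.2]

# Pairwise differences d_i = before_i - after_i.
diffs = [b - a for b, a in zip(before, after)]

# One-sample t-test of H0: mu_d = 0 on the differences.
t_diff, p_diff = stats.ttest_1samp(diffs, 0.0)

# Paired t-test on the original data.
t_rel, p_rel = stats.ttest_rel(before, after)

print("one-sample on differences:", t_diff, p_diff)
print("paired t-test:            ", t_rel, p_rel)
```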

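One-Sample z-Test for Proportions

Conditions: a large sample, with the hypothesized population proportion $p_{0}$ specified.
Hypotheses:
- $H_{0}$: $p = p_{0}$
- $H_{a}$: $p \neq p_{0}$
Test statistic: $z = \frac{\hat{p} - p_{0}}{\sqrt{\frac{p_{0} (1 - p_{0})}{n}}}$, where $\hat{p}$ is the sample proportion.
Python code example: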

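from statsmodels.stats.proportion import proportions_ztest

# Sample data
count = 30  # number of successes
nobs = 50  # total sample size
p0 = 0.5  # hypothesized population proportion

# z-test; prop_var=p0 puts p0 in the standard error, as in the test statistic
# (by default statsmodels uses the sample proportion there instead)
z_stat, p_value = proportions_ztest(count, nobs, value=p0, prop_var=p0)

# Print the results
print("z-statistic:", z_stat)
print("p-value:", p_value)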
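
The one-proportion z statistic is simple enough to compute with the standard library alone. The sketch below, added for illustration, uses the same counts as the example and places the null value $p_{0}$ in the standard error; a call to `proportions_ztest` without `prop_var` uses the sample proportion in the standard error and so returns a slightly different z.

```python
import math
from statistics import NormalDist

count, nobs, p0 = 30, 50, 0.5   # same values as the example above

p_hat = count / nobs                         # sample proportion
se_null = math.sqrt(p0 * (1 - p0) / nobs)    # standard error under H0
z_stat = (p_hat - p0) / se_null

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print("z-statistic:", z_stat)
print("p-value:", p_value)
```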

Two-Sample z-Test for Proportions

Conditions: two independent samples, each large enough for the normal approximation to hold.
Hypotheses:
- $H_{0}$: $p_{1} = p_{2}$
- $H_{a}$: $p_{1} \neq p_{2}$
Test statistic: $z = \frac{\hat{p}_{1} - \hat{p}_{2}}{\sqrt{\hat{p} (1 - \hat{p}) \left( \frac{1}{n_{1}} + \frac{1}{n_{2}} \right)}}$, where $\hat{p}$ is the pooled proportion: $\hat{p} = \frac{x_{1} + x_{2}}{n_{1} + n_{2}}$
Python code example:

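import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Sample data
count = np.array([30, 35])  # number of successes in each group
nobs = np.array([50, 50])  # sample size of each group

# Two-sample z-test using the pooled proportion
z_stat, p_value = proportions_ztest(count, nobs)

# Print the results
print("z-statistic:", z_stat)
print("p-value:", p_value)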
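
The pooled two-proportion statistic can likewise be worked out by hand. This standard-library sketch, added for illustration, uses the same counts as the example and follows the pooled-proportion definition given with the test statistic.

```python
import math
from statistics import NormalDist

x1, n1 = 30, 50   # successes and sample size, group 1
x2, n2 = 35, 50   # successes and sample size, group 2

p1_hat, p2_hat = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)   # pooled proportion under H0

# Standard error of the difference under H0, then the z statistic.
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z_stat = (p1_hat - p2_hat) / se

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print("pooled proportion:", p_pooled)
print("z-statistic:", z_stat)
print("p-value:", p_value)
```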

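Linear Regression Test

Conditions: continuous independent and dependent variables.
Hypotheses:
- $H_{0}$: $\beta_{i} = 0$ (the independent variable $X_{i}$ has no linear effect on the dependent variable)
- $H_{a}$: $\beta_{i} \neq 0$ (the independent variable $X_{i}$ has a linear effect on the dependent variable)
Python code example: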

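import statsmodels.api as sm
import pandas as pd

# Sample data
data = pd.DataFrame({
    'X': [1, 2, 3, 4, 5],
    'Y': [2, 3, 5, 7, 11]
})

# Independent and dependent variables
X = data['X']
Y = data['Y']

# Add the constant (intercept) term
X = sm.add_constant(X)

# Fit the linear regression by ordinary least squares
model = sm.OLS(Y, X).fit()

# Print the results; the summary reports a t-statistic and p-value for each coefficient
print(model.summary())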

Hua Wang
