2025/3/13 0:04:33 Source: https://blog.csdn.net/weixin_48846514/article/details/145885497

KNN: Implementation with sklearn and a Custom KNN

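  • Preface
  • GitHub Link
  • Training and Predicting with KNN Using the sklearn Library
    • 1. Import the required libraries
    • 2. Load the data
      • 2.1. Print dataset information
    • 3. Split into training and test sets
    • 4. Visualization
    • 5. Create the model and predict
  • 2. Custom KNN Model and Prediction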

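Preface

With the theory post finished, this post adds the code.

Machine learning with sklearn is straightforward, so the focus here is on how to write a custom implementation.

Link to the KNN theory post

GitHub Link

Link to the GitHub repository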

Training and Predicting with KNN Using the sklearn Library

1. Import the required libraries

from IPython.display import display
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

2. Load the data

from sklearn.datasets import load_iris
iris_dataset = load_iris()

2.1. Print dataset information

print("Keys of iris_dataset:\n", iris_dataset.keys())
print(iris_dataset['DESCR'][:193] + "\n...")
print("Target names:", iris_dataset['target_names'])
print("Feature names:\n", iris_dataset['feature_names'])

3. Split into training and test sets

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(iris_dataset['data'], iris_dataset['target'], random_state=0)
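As a quick check on the split above, here is a minimal sketch that repeats the same call and prints the resulting shapes. It relies on the documented default of `train_test_split`: when neither `test_size` nor `train_size` is given, 25% of the samples are held out for testing. The printing code is an illustration added here, not part of the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris_dataset = load_iris()

# Same call as in the post: default 75% / 25% split, fixed seed for reproducibility
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0
)

# 150 samples with 4 features each are divided between the two sets
print("X_train shape:", X_train.shape)
print("y_train shape:", y_train.shape)
print("X_test shape:", X_test.shape)
print("y_test shape:", y_test.shape)
```

Fixing `random_state` makes the shuffle repeatable, so the accuracy figures later in the post can be reproduced from run to run.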

4. Visualization

# label the columns using the strings in iris_dataset.feature_names
iris_dataframe = pd.DataFrame(X_train, columns=iris_dataset.feature_names)
# Create a scatter matrix from the dataframe, color by y_train
pd.plotting.scatter_matrix(iris_dataframe, c=y_train, figsize=(16, 16),marker='o', hist_kwds={'bins': 20}, s=60, alpha=.8);

[Figure: scatter matrix of the iris training features, colored by class label]

5. Create the model and predict

from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import accuracy_score

scaler = MinMaxScaler()  # create the scaler object
scaler.fit(X_train)  # compute the min and max of each feature on the training data
X_train_norm = scaler.transform(X_train)  # apply normalisation to the training set
X_test_norm = scaler.transform(X_test)  # apply the same normalisation to the test set

knn = KNeighborsClassifier(n_neighbors=40)
knn.fit(X_train_norm, y_train)
y_pred = knn.predict(X_test_norm)
print("Accuracy on test set: {:.5f}".format(accuracy_score(y_test, y_pred)))
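To make the normalisation step concrete, the sketch below reproduces the min-max formula that `MinMaxScaler` applies with its default range of 0 to 1, namely `(x - min) / (max - min)` per feature, using plain NumPy on a tiny made-up array. The helper names `min_max_fit` and `min_max_transform` are hypothetical and exist only for this illustration, and the guard for constant columns is an assumption of this sketch.

```python
import numpy as np


def min_max_fit(X):
    """Return the per-feature minimum and maximum of the training data."""
    X = np.asarray(X, dtype=float)
    return X.min(axis=0), X.max(axis=0)


def min_max_transform(X, col_min, col_max):
    """Scale each feature with (x - min) / (max - min) using the training statistics."""
    X = np.asarray(X, dtype=float)
    # Assumption of this sketch: a constant column is divided by 1 to avoid 0/0
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    return (X - col_min) / span


# Toy data (made up for illustration): 3 training rows, 1 test row, 2 features
toy_train = np.array([[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]])
toy_test = np.array([[2.0, 40.0]])

col_min, col_max = min_max_fit(toy_train)  # statistics come from the training set only
toy_train_norm = min_max_transform(toy_train, col_min, col_max)
toy_test_norm = min_max_transform(toy_test, col_min, col_max)

print(toy_train_norm)
print(toy_test_norm)
```

Because the minimum and maximum come from the training set alone, a test value can fall outside the 0 to 1 range, as the second feature of the toy test row does. This is expected and is the reason the scaler is fitted on `X_train` only.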

2. Custom KNN Model and Prediction

import numpy as np
from collections import Counter


class KNN:
    def __init__(self, k=3, distance_metric='euclidean'):
        self.k = k
        self.distance_metric = distance_metric

    # define fit function
    def fit(self, X_train, y_train):
        self.X_train = np.array(X_train)
        self.y_train = np.array(y_train)

    # calculate distance
    def _compute_distance(self, x1, x2):
        if self.distance_metric == 'euclidean':
            return np.sqrt(np.sum((x1 - x2) ** 2))
        elif self.distance_metric == 'manhattan':
            return np.sum(np.abs(x1 - x2))
        else:
            raise ValueError("Unsupported distance metric")

    def predict(self, X_test):
        X_test = np.array(X_test)
        predictions = []
        for x in X_test:
            distances = [self._compute_distance(x, x_train) for x_train in self.X_train]  # calculate all distances
            k_indices = np.argsort(distances)[:self.k]  # find the nearest k points
            k_nearest_labels = [self.y_train[i] for i in k_indices]
            most_common = Counter(k_nearest_labels).most_common(1)[0][0]  # get the most common class
            predictions.append(most_common)
        return np.array(predictions)

    def score(self, X_test, y_test):
        y_pred = self.predict(X_test)
        return np.mean(y_pred == np.array(y_test))  # score


knn = KNN(k=40)
knn.fit(X_train_norm, y_train)
predictions = knn.predict(X_test_norm)
accuracy = knn.score(X_test_norm, y_test)
print(f"Predictions: {predictions}")
print(f"Accuracy: {accuracy}")
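The class above computes distances one pair at a time in Python loops. As a possible variation, the sketch below computes all Euclidean distances at once with NumPy broadcasting. The function name `knn_predict_vectorized` is hypothetical, the toy data is made up, and the vote assumes non-negative integer labels (as in the iris targets). Ties go to the smallest label here, which can differ from the `Counter`-based vote above.

```python
import numpy as np


def knn_predict_vectorized(X_train, y_train, X_test, k=3):
    """Predict labels by majority vote among the k nearest training points (Euclidean)."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    X_test = np.asarray(X_test, dtype=float)

    # Pairwise differences via broadcasting: shape (n_test, n_train, n_features)
    diffs = X_test[:, np.newaxis, :] - X_train[np.newaxis, :, :]
    distances = np.sqrt(np.sum(diffs ** 2, axis=2))  # shape (n_test, n_train)

    # Column indices of the k smallest distances in every row
    k_indices = np.argsort(distances, axis=1)[:, :k]
    k_labels = y_train[k_indices]  # shape (n_test, k)

    # Majority vote per test point; bincount needs non-negative integer labels
    return np.array([np.bincount(row).argmax() for row in k_labels])


# Toy data (made up for illustration): two well-separated clusters
toy_X_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
toy_y_train = [0, 0, 0, 1, 1, 1]
toy_X_test = [[0.2, 0.1], [5.5, 5.5]]

toy_predictions = knn_predict_vectorized(toy_X_train, toy_y_train, toy_X_test, k=3)
print(toy_predictions)
```

The broadcasted array holds `n_test * n_train * n_features` values, so this trades memory for speed. That is fine for a dataset the size of iris, but large datasets would need to be processed in batches.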
