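EldenRing DataAnalysis
By CloudH2O Lv

Research goals and main content

The project covers the following areas:
Data collection and cleaning: collect character attribute and equipment data from Elden Ring, then clean and preprocess it so that it is reliable and usable.
Attribute rating analysis: analyze the counts and shares of each rating for every attribute (Strength, Dexterity, Intelligence, and so on) to see how the ratings are distributed and which ratings dominate.
Attribute association analysis: explore correlations between attributes with statistical methods and visualization, for example Strength versus Dexterity or Intelligence versus Faith.
Effect of attributes on type: build models (decision tree, random forest, and similar) to measure how strongly each attribute influences the type classification, and which attributes carry the most weight.
Equipment attribute analysis: analyze how equipment attributes are distributed and how they affect combat ability, using statistics and visualization to understand which attributes matter when choosing equipment.
Model evaluation and performance metrics: evaluate the fitted models with metrics such as accuracy to judge their predictive and generalization ability, so that players get dependable decision support.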
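As a rough illustration of the attribute association step, the sketch below maps letter ratings to numbers and computes a rank correlation matrix. The rows in `sample` are made up for illustration and are not taken from the dataset; the mapping is the same `rating_dict` that the cleaning code in this post uses. Spearman correlation is my choice here because the ratings are ordinal; the post does not say which method was intended.

```python
import pandas as pd

# Hypothetical mini-sample of letter ratings (not real rows from the CSV).
sample = pd.DataFrame({
    "Str": ["D", "C", "E", "B", "D", "-"],
    "Dex": ["D", "E", "C", "-", "D", "E"],
    "Int": ["-", "-", "-", "-", "C", "B"],
    "Fai": ["-", "-", "-", "D", "-", "-"],
})

# Letter-to-number mapping, identical to the one used in the cleaning code.
rating_dict = {"S": 6, "A": 5, "B": 4, "C": 3, "D": 2, "E": 1, "-": 0}
numeric = sample.apply(lambda col: col.map(rating_dict))

# Rank-based correlation between every pair of attributes.
corr = numeric.corr(method="spearman")
print(corr.round(2))
```

Each cell of `corr` lies between -1 and 1, and the diagonal is 1 by construction.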
Data source:
https://eldenring.wiki.fextralife.com/Weapons+Comparison+Tables 
Reading the file

    import pandas as pd

    data = pd.read_csv('ElderRingData.csv')
    data
Inspecting the column types

Output:

    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 307 entries, 0 to 306
    Data columns (total 23 columns):
     #   Column   Non-Null Count  Dtype
    ---  ------   --------------  -----
     0   Name     307 non-null    object
     1   Type     307 non-null    object
     2   Phy      307 non-null    object
     3   Mag      37 non-null     object
     4   Fir      21 non-null     object
     5   Lit      4 non-null      object
     6   Hol      33 non-null     object
     7   Cri      307 non-null    int64
     8   Sta      307 non-null    int64
     9   Str      307 non-null    int64
     10  Dex      307 non-null    int64
     11  Int      307 non-null    int64
     12  Fai      307 non-null    int64
     13  Arc      307 non-null    int64
     14  PhyD     282 non-null    object
     15  MagD     282 non-null    object
     16  FirD     282 non-null    object
     17  LitD     282 non-null    object
     18  HolD     282 non-null    object
     19  Bst      282 non-null    object
    ...
     22  Upgrade  307 non-null    object
    dtypes: int64(7), object(16)
    memory usage: 55.3+ KB
    None
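The cell that produced this report did not survive in the post. A call such as `print(data.info())` would give output of this shape, and the trailing `None` is consistent with that, because `info()` prints its report and returns `None`. The sketch below works under that assumption; the three-row CSV text and the weapon names in it are placeholders, not data from the project.

```python
import io
import pandas as pd

# Hypothetical stand-in for ElderRingData.csv; "-" marks a missing value.
csv_text = """Name,Type,Phy,Mag,Cri
Weapon A,Dagger,70,-,130
Weapon B,Greatsword,140,-,100
Weapon C,Katana,75,85,100
"""
# na_values="-" converts the placeholder to NaN while reading.
data = pd.read_csv(io.StringIO(csv_text), na_values="-")

# info() writes the report to stdout (or to `buf` if given) and returns None,
# so wrapping it in print() would add an extra "None" line.
buf = io.StringIO()
result = data.info(buf=buf)
report = buf.getvalue()
print(report)
```

Reading with `na_values="-"` is an alternative to the `replace('-', np.nan)` calls used in this post; it also lets pandas infer a numeric dtype for those columns straight away.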
Computing the mean, median, and standard deviation

    import pandas as pd
    import numpy as np

    data = pd.read_csv('ElderRingData.csv')

    # Map the letter ratings to numbers ('-' means no rating)
    rating_dict = {'S': 6, 'A': 5, 'B': 4, 'C': 3, 'D': 2, 'E': 1, '-': 0}
    data['Str'] = data['Str'].apply(lambda x: rating_dict[x])
    data['Dex'] = data['Dex'].apply(lambda x: rating_dict[x])
    data['Int'] = data['Int'].apply(lambda x: rating_dict[x])
    data['Fai'] = data['Fai'].apply(lambda x: rating_dict[x])
    data['Arc'] = data['Arc'].apply(lambda x: rating_dict[x])

    # Treat the '-' placeholder as a missing value
    data['Mag'] = data['Mag'].replace('-', np.nan)
    data['Fir'] = data['Fir'].replace('-', np.nan)
    data['Lit'] = data['Lit'].replace('-', np.nan)
    data['Hol'] = data['Hol'].replace('-', np.nan)
    data['PhyD'] = data['PhyD'].replace('-', np.nan)
    data['MagD'] = data['MagD'].replace('-', np.nan)
    data['FirD'] = data['FirD'].replace('-', np.nan)
    data['LitD'] = data['LitD'].replace('-', np.nan)
    data['HolD'] = data['HolD'].replace('-', np.nan)
    data['Bst'] = data['Bst'].replace('-', np.nan)
    data['Rst'] = data['Rst'].replace('-', np.nan)

    # numeric_only=True is needed on pandas 2.x, where text columns
    # otherwise raise a TypeError
    mean = data.mean(numeric_only=True)
    median = data.median(numeric_only=True)
    std = data.std(numeric_only=True)

    print("Mean:")
    print(mean)
    print("Median:")
    print(median)
    print("Standard deviation:")
    print(std)
Output:

    Mean:
    Cri    101.169381
    Sta    105.335505
    Str      2.491857
    Dex      2.237785
    Int      0.615635
    Fai      0.657980
    Arc      0.179153
    dtype: float64
    Median:
    Mag     166.0
    Fir     176.0
    Lit     149.0
    Hol     191.0
    Cri     100.0
    Sta     100.0
    Str       2.0
    Dex       2.0
    Int       0.0
    Fai       0.0
    Arc       0.0
    PhyD     47.0
    MagD     33.0
    FirD     31.0
    LitD     31.0
    ...
    Int     1.530352
    Fai     1.458677
    Arc     0.865358
    dtype: float64
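The mean above lists only the integer columns. One likely reason is that columns such as `Mag` and `PhyD` are still stored as text (dtype `object`) after `'-'` is replaced. A minimal sketch of one way to make them numeric, assuming every remaining entry is a plain number, is `pd.to_numeric` with `errors="coerce"`. The rows below are invented for the example.

```python
import pandas as pd

# Hypothetical rows; "-" marks a missing value, as in the source table.
data = pd.DataFrame({
    "Mag": ["-", "93", "-", "191"],
    "PhyD": ["25", "56", "-", "28"],
    "Cri": [100, 100, 100, 100],
})

cols = ["Mag", "PhyD"]
# errors="coerce" turns anything that is not a number (here "-") into NaN.
data[cols] = data[cols].apply(pd.to_numeric, errors="coerce")

# With numeric dtypes, all three statistics cover every column; NaN is skipped.
stats = data.agg(["mean", "median", "std"])
print(stats)
```

After the conversion, `mean`, `median`, and `std` report on the same set of columns, which makes the three listings directly comparable.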
Counting weapons per type and averaging physical damage

    import pandas as pd
    import numpy as np

    data = pd.read_csv('ElderRingData.csv')

    # Map the letter ratings to numbers ('-' means no rating)
    rating_dict = {'S': 6, 'A': 5, 'B': 4, 'C': 3, 'D': 2, 'E': 1, '-': 0}
    data['Str'] = data['Str'].apply(lambda x: rating_dict[x])
    data['Dex'] = data['Dex'].apply(lambda x: rating_dict[x])
    data['Int'] = data['Int'].apply(lambda x: rating_dict[x])
    data['Fai'] = data['Fai'].apply(lambda x: rating_dict[x])
    data['Arc'] = data['Arc'].apply(lambda x: rating_dict[x])

    # Treat the '-' placeholder as a missing value
    cols_to_replace = ['Mag', 'Fir', 'Lit', 'Hol', 'PhyD', 'MagD', 'FirD', 'LitD', 'HolD', 'Bst', 'Rst']
    data[cols_to_replace] = data[cols_to_replace].replace('-', np.nan)

    # Convert the columns used below from text to float
    data['PhyD'] = data['PhyD'].astype(float)
    data['MagD'] = data['MagD'].astype(float)
    data['FirD'] = data['FirD'].astype(float)

    # Note: the per-type mean is taken over the PhyD column, not Phy
    weapon_count = data.groupby(['Type'])['Name'].count()
    weapon_phy_mean = data.groupby(['Type'])['PhyD'].mean()

    print("Number of weapons per type:")
    print(weapon_count)
    print("Average physical damage per weapon type:")
    print(weapon_phy_mean)
Output:

    Median (full listing):
    Mag     166.0
    Fir     176.0
    Lit     149.0
    Hol     191.0
    Cri     100.0
    Sta     100.0
    Str       2.0
    Dex       2.0
    Int       0.0
    Fai       0.0
    Arc       0.0
    PhyD     47.0
    MagD     33.0
    FirD     31.0
    LitD     31.0
    HolD     31.0
    Bst      36.3
    Rst      15.0
    dtype: float64
    Standard deviation (full listing):
    Cri      4.421129
    Sta     41.609782
    Str      1.106619
    Dex      1.251981
    Int      1.530352
    Fai      1.458677
    Arc      0.865358
    PhyD    16.214128
    MagD    10.262827
    FirD     9.266148
    dtype: float64
    Number of weapons per type:
    Type
    Axe                      12
    Ballista                  2
    Bow                       7
    Claw                      4
    Colossal Sword           11
    Colossal Weapon          15
    Crossbow                  7
    Curved Greatsword         9
    Curved Sword             14
    Dagger                   16
    Fist                      9
    Flail                     5
    Glintstone Staff         17
    Great Spear               6
    Greataxe                 12
    Greatbow                  4
    Greatsword               20
    Halberd                  16
    Hammer                   15
    Heavy Thrusting Sword     4
    Katana                    8
    Light Bow                 5
    Reaper                    4
    ...
    Twinblade                42.000000
    Warhammer                66.142857
    Whip                     25.166667
    Name: PhyD, dtype: float64
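The two `groupby` calls can also be combined into one summary table. The sketch below uses pandas named aggregation on a handful of invented rows; the output column names `count` and `mean_phyd` are my own labels, not names from the project.

```python
import pandas as pd

# Hypothetical rows; the Bow row has no PhyD value, like the bows in the table.
data = pd.DataFrame({
    "Name": ["a", "b", "c", "d", "e"],
    "Type": ["Axe", "Axe", "Bow", "Whip", "Whip"],
    "PhyD": [50.0, 54.0, None, 24.0, 26.0],
})

# One pass over the groups: count the names and average PhyD (NaN is skipped).
summary = (
    data.groupby("Type")
        .agg(count=("Name", "count"), mean_phyd=("PhyD", "mean"))
        .sort_values("mean_phyd", ascending=False)
)
print(summary)
```

Keeping both figures in one frame makes it easier to spot types whose average rests on very few weapons, and types with no usable `PhyD` values sort to the bottom as NaN.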
Viewing the cleaned table

    import pandas as pd
    import numpy as np

    data = pd.read_csv('ElderRingData.csv')

    # Map the letter ratings to numbers ('-' means no rating)
    rating_dict = {'S': 6, 'A': 5, 'B': 4, 'C': 3, 'D': 2, 'E': 1, '-': 0}
    data['Str'] = data['Str'].apply(lambda x: rating_dict[x])
    data['Dex'] = data['Dex'].apply(lambda x: rating_dict[x])
    data['Int'] = data['Int'].apply(lambda x: rating_dict[x])
    data['Fai'] = data['Fai'].apply(lambda x: rating_dict[x])
    data['Arc'] = data['Arc'].apply(lambda x: rating_dict[x])

    # Treat the '-' placeholder as a missing value
    cols_to_replace = ['Mag', 'Fir', 'Lit', 'Hol', 'PhyD', 'MagD', 'FirD', 'LitD', 'HolD', 'Bst', 'Rst']
    data[cols_to_replace] = data[cols_to_replace].replace('-', np.nan)

    # Convert three of the cleaned columns from text to float
    data['PhyD'] = data['PhyD'].astype(float)
    data['MagD'] = data['MagD'].astype(float)
    data['FirD'] = data['FirD'].astype(float)

    data
The output is as follows:

| | Name | Type | Phy | Mag | Fir | Lit | Hol | Cri | Sta | Str | … | Arc | PhyD | MagD | FirD | LitD | HolD | Bst | Rst | Wgt | Upgrade |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Academy Glintstone Staff | Glintstone Staff | 43 | NaN | NaN | NaN | NaN | 100 | 40 | 2 | … | 0 | 25.0 | 15.0 | 15.0 | 15 | 15 | 15 | 10 | 3 | Smithing Stones |
| 1 | Alabaster Lord’s Sword | Greatsword | 313 | 93 | NaN | NaN | NaN | 100 | 126 | 4 | … | 0 | 56.0 | 33.0 | 27.0 | 27 | 27 | 42.9 | 19 | 8 | Somber Smithing Stones |
| 2 | Albinauric Bow | Bow | 200 | NaN | NaN | NaN | NaN | 100 | 60 | 1 | … | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4.5 | Smithing Stones |
| 3 | Albinauric Staff | Glintstone Staff | 29 | NaN | NaN | NaN | NaN | 100 | 38 | 2 | … | 6 | 23.0 | 14.0 | 14.0 | 14 | 14 | 14 | 9 | 2.5 | Smithing Stones |
| 4 | Antspur Rapier | Thrusting Sword | 240 | NaN | NaN | NaN | NaN | 100 | 62 | 2 | … | 0 | 47.0 | 31.0 | 31.0 | 31 | 31 | 25.2 | 10 | 3 | Smithing Stones |
| … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
| 302 | Wing of Astel | Curved Sword | 159 | 191 | NaN | NaN | NaN | 100 | 84 | 1 | … | 0 | 28.0 | 52.0 | 23.0 | 23 | 23 | 25.3 | 9 | 2.5 | Somber Smithing Stones |
| 303 | Winged Greathorn | Greataxe | 318 | NaN | NaN | NaN | NaN | 100 | 150 | 4 | … | 0 | 65.0 | 35.0 | 35.0 | 35 | 35 | 46.2 | 20 | 11 | Somber Smithing Stones |
| 304 | Winged Scythe | Reaper | 213 | NaN | NaN | NaN | 254 | 100 | 110 | 2 | … | 0 | 30.0 | 25.0 | 25.0 | 25 | 55 | 33 | 15 | 7.5 | Somber Smithing Stones |
| 305 | Zamor Curved Sword | Curved Greatsword | 306 | NaN | NaN | NaN | NaN | 100 | 128 | 3 | … | 0 | 61.0 | 33.0 | 33.0 | 33 | 33 | 42.9 | 19 | 9 | Somber Smithing Stones |
| 306 | Zweihander | Colossal Sword | 345 | NaN | NaN | NaN | NaN | 100 | 126 | 2 | … | 0 | 67.0 | 40.0 | 40.0 | 40 | 40 | 54 | 22 | 15.5 | Smithing Stones |

307 rows × 23 columns
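The "…" column in the table is pandas hiding middle columns (here `Dex`, `Int`, and `Fai`) because the frame is wider than the display limit. If the full width is wanted, one option is to lift the limit temporarily with `pd.option_context`. The 30-column frame below is a made-up stand-in for the weapon table.

```python
import pandas as pd

# Hypothetical wide frame with 30 columns named c0..c29.
wide = pd.DataFrame([list(range(30))], columns=[f"c{i}" for i in range(30)])

# display.max_columns=None removes the column limit inside this block only.
with pd.option_context("display.max_columns", None):
    full_text = repr(wide)

print(full_text)
```

Outside the `with` block the previous display settings apply again, so other cells keep their compact output.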
Rating counts, means, and standard deviations

    import pandas as pd
    import numpy as np

    data = pd.read_csv('ElderRingData.csv')

    # Map the letter ratings to numbers ('-' means no rating)
    rating_dict = {'S': 6, 'A': 5, 'B': 4, 'C': 3, 'D': 2, 'E': 1, '-': 0}
    data['Str'] = data['Str'].apply(lambda x: rating_dict[x])
    data['Dex'] = data['Dex'].apply(lambda x: rating_dict[x])
    data['Int'] = data['Int'].apply(lambda x: rating_dict[x])
    data['Fai'] = data['Fai'].apply(lambda x: rating_dict[x])
    data['Arc'] = data['Arc'].apply(lambda x: rating_dict[x])

    # Treat the '-' placeholder as a missing value
    cols_to_replace = ['Mag', 'Fir', 'Lit', 'Hol', 'PhyD', 'MagD', 'FirD', 'LitD', 'HolD', 'Bst', 'Rst']
    data[cols_to_replace] = data[cols_to_replace].replace('-', np.nan)

    data['PhyD'] = data['PhyD'].astype(float)
    data['MagD'] = data['MagD'].astype(float)
    data['FirD'] = data['FirD'].astype(float)

    # Count and share of each rating value, per attribute
    attrs = ['Str', 'Dex', 'Int', 'Fai', 'Arc']
    for attr in attrs:
        counts = data[attr].value_counts()
        freqs = counts / counts.sum()
        print(f"Rating counts for {attr}:\n{counts}\n")
        print(f"Rating shares for {attr}:\n{freqs}\n")

    # Mean and standard deviation of the numeric ratings
    attr_cols = ['Str', 'Dex', 'Int', 'Fai', 'Arc']
    attr_means = data[attr_cols].mean()
    attr_stds = data[attr_cols].std()

    print("Mean rating per attribute:")
    print(attr_means)
    print("\nStandard deviation of ratings per attribute:")
    print(attr_stds)
Output:

    Rating counts for Str:
    2    109
    3     94
    4     52
    1     31
    0     16
    5      4
    6      1
    Name: Str, dtype: int64

    Rating shares for Str:
    2    0.355049
    3    0.306189
    4    0.169381
    1    0.100977
    0    0.052117
    5    0.013029
    6    0.003257
    Name: Str, dtype: float64

    Rating counts for Dex:
    2    106
    3     79
    4     52
    0     44
    ...
    Int    1.530352
    Fai    1.458677
    Arc    0.865358
    dtype: float64
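Each share is simply the count divided by the number of weapons; for example, 109 of 307 weapons have a Str rating of 2, and 109 / 307 ≈ 0.355. pandas can produce the shares directly with `value_counts(normalize=True)`. The short series below is invented for the example.

```python
import pandas as pd

# Hypothetical numeric ratings for one attribute.
ratings = pd.Series([2, 3, 2, 4, 0, 2, 3, 1], name="Str")

# Counts and shares, both ordered by rating value rather than by frequency.
counts = ratings.value_counts().sort_index()
shares = ratings.value_counts(normalize=True).sort_index()

table = pd.DataFrame({"count": counts, "share": shares})
print(table)
```

Sorting by index puts the ratings in order from 0 upward, which reads more naturally for an ordinal scale than the default frequency order.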
Random forest model

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.preprocessing import LabelEncoder
    from sklearn.metrics import accuracy_score

    data = pd.read_csv("ElderRingData.csv")

    # Drop the name, the sparse columns, and the upgrade path
    data = data.drop(["Name", "Mag", "Fir", "Lit", "Hol", "PhyD", "MagD", "FirD", "LitD", "HolD", "Upgrade"], axis=1)
    data = data.dropna()

    # Label-encode every remaining text column, including the target "Type"
    le = LabelEncoder()
    for col in data.columns:
        if data[col].dtype == "object":
            data[col] = le.fit_transform(data[col])

    # Features are all remaining columns; the target is the weapon type
    X = data.drop(["Type"], axis=1)
    y = data["Type"]

    # Hold out 20% of the rows for testing
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    rf = RandomForestClassifier(n_estimators=100, random_state=0)
    rf.fit(X_train, y_train)

    y_pred = rf.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print("Accuracy:", accuracy)
Prediction accuracy:

    Accuracy: 0.7903225806451613
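The accuracy figure comes from a single 80/20 split. As a hedged sketch of how the attribute importance and generalization goals stated at the top could be approached, the example below reads `feature_importances_` from a fitted forest and uses 5-fold cross-validation for a steadier accuracy estimate. The data is synthetic: the label is built from `Str` and `Dex` only, and `Noise` is an irrelevant column added for contrast. None of these numbers say anything about the real weapon data.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption): ratings 0..6 for three columns.
rng = np.random.default_rng(0)
n = 120
X = pd.DataFrame({
    "Str": rng.integers(0, 7, n),
    "Dex": rng.integers(0, 7, n),
    "Noise": rng.integers(0, 7, n),
})
# The label depends on Str and Dex only, never on Noise.
y = np.where(X["Str"] > X["Dex"], "heavy", "light")

rf = RandomForestClassifier(n_estimators=100, random_state=0)

# Accuracy on five different held-out folds instead of one split.
scores = cross_val_score(rf, X, y, cv=5)

# Impurity-based importances of the forest fitted on all rows; they sum to 1.
rf.fit(X, y)
importances = pd.Series(rf.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)

print("Mean CV accuracy:", scores.mean())
print(importances)
```

On the real data, the same two calls could be applied to the `X` and `y` built in the random forest code above, provided every weapon type has at least as many rows as there are folds.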