1.配置
MyEclipse2013+Weka3.6+libsvm3.18+Jdk1.7+Win8.1
2.小Tips
1). Java使用Weka
实现:
将安装文件夹里的weka.jar加入项目编译路径中
2). CSV文件可以转换成Arff文件
实现:
运行Weka的Explorer界面,打开csv文件,保存为arff文件。
注意:
如果有训练集和测试集,将训练集的Arff文件的标签头复制到测试集的Arff文件!
3). Java通过Weka使用LibSVM
实现:
将LibSVM文件夹里的libsvm.jar加入项目编译路径中
3.示例
public static void main(String[] args) {
      
try {
          
          
Classifier classifier1;
          
Classifier classifier2;
          
Classifier classifier3;
          
Classifier classifier4;
 
          
          
File inputFile = new File(
                 
"C:\\Users\\zhangzhizhi\\Documents\\Everyone\\张志智\\总结积累\\Weka\\change_train.arff");// 
训练语料文件
          
ArffLoader atf = new ArffLoader();
          
atf.setFile(inputFile);
          
Instances instancesTrain = atf.getDataSet(); // 
读入训练文件
          
          
inputFile = new File(
                 
"C:\\Users\\zhangzhizhi\\Documents\\Everyone\\张志智\\总结积累\\Weka\\change_test.arff");// 
测试语料文件
          
atf.setFile(inputFile);
          
Instances instancesTest = atf.getDataSet(); // 
读入测试文件
 
          
          
instancesTest.setClassIndex(0);
          
instancesTrain.setClassIndex(0);
 
          
          
// 
朴素贝叶斯算法
          
classifier1 = (Classifier) Class.forName(
                 
"weka.classifiers.bayes.NaiveBayes").newInstance();
          
// 
决策树
          
classifier2 = (Classifier) Class.forName(
                 
"weka.classifiers.trees.J48").newInstance();
          
// Zero
          
classifier3 = (Classifier) Class.forName(
                 
"weka.classifiers.rules.ZeroR").newInstance();
          
// LibSVM
          
classifier4 = (Classifier) Class.forName(
                 
"weka.classifiers.functions.LibSVM").newInstance();
 
          
          
classifier4.buildClassifier(instancesTrain);
          
classifier1.buildClassifier(instancesTrain);
          
classifier2.buildClassifier(instancesTrain);
          
classifier3.buildClassifier(instancesTrain);
          
          
          
          
Evaluation eval = new Evaluation(instancesTrain);
          
          
eval.evaluateModel(classifier4, instancesTest);
          
System.out.println(eval.errorRate());
          
eval.evaluateModel(classifier1, instancesTest);
          
System.out.println(eval.errorRate());
          
eval.evaluateModel(classifier2, instancesTest);
          
System.out.println(eval.errorRate());
          
eval.evaluateModel(classifier3, instancesTest);
          
System.out.println(eval.errorRate());
          
      
} catch (Exception e) {
          
e.printStackTrace();
      
}
   
}
如果只有训练集,采用十交叉验证的方法,将上面的第5步和第6步更改为如下代码:
          
          
Evaluation eval = new Evaluation(instancesTrain);
          
eval.crossValidateModel(classifier4, instancesTrain, 10,
new Random(1));
          
System.out.println(eval.errorRate());
          
eval.crossValidateModel(classifier1, instancesTrain, 10,
new Random(1));
          
System.out.println(eval.errorRate());
          
eval.crossValidateModel(classifier2, instancesTrain, 10,
new Random(1));
          
System.out.println(eval.errorRate());
          
eval.crossValidateModel(classifier3, instancesTrain, 10,
new Random(1));
          
System.out.println(eval.errorRate());
如果需要保存和加载分类器模型参数,在第5步和第6步之间加入如下代码:
          
          
SerializationHelper.write("LibSVM.model", classifier4);
          
SerializationHelper.write("NaiveBayes.model", classifier1);
          
SerializationHelper.write("J48.model", classifier2);
          
SerializationHelper.write("ZeroR.model", classifier3);
          
          
          
Classifier classifier8 = (Classifier)
weka.core.SerializationHelper.read("LibSVM.model");
          
Classifier classifier5 = (Classifier)
weka.core.SerializationHelper.read("NaiveBayes.model");
          
Classifier classifier6 = (Classifier)
weka.core.SerializationHelper.read("J48.model");
          
Classifier classifier7 = (Classifier)
weka.core.SerializationHelper.read("ZeroR.model");
							
		 
						
		加载中,请稍候......