Java Code Examples for weka.classifiers.Classifier#distributionForInstance()
The following examples show how to use weka.classifiers.Classifier#distributionForInstance().
You can go to the original project or source file by following the links above each example.
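Before the project examples, here is a minimal sketch of the typical calling pattern, assuming an ARFF dataset whose last attribute is a nominal class; the file path, the sketch's class name, and the NaiveBayes base classifier are placeholders, and any weka.classifiers.Classifier can be substituted.

import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.Utils;
import weka.core.converters.ConverterUtils.DataSource;

public class DistributionForInstanceSketch {
    public static void main(String[] args) throws Exception {
        // Load a dataset; the path is a placeholder.
        Instances data = DataSource.read("/path/to/dataset.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // Any Classifier works here; NaiveBayes is just an illustration.
        Classifier cls = new NaiveBayes();
        cls.buildClassifier(data);

        // For a nominal class, distributionForInstance returns one
        // (typically normalised) probability per class value.
        Instance inst = data.instance(0);
        double[] dist = cls.distributionForInstance(inst);
        int predicted = Utils.maxIndex(dist);
        System.out.println("Predicted: " + data.classAttribute().value(predicted)
                + " with probability " + dist[predicted]);
    }
}

Most of the examples below follow exactly this pattern: call distributionForInstance, then either take the argmax as the predicted label or feed the full distribution into an evaluation or ensembling step.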
Example 1
Source File: SentimentAnalyser.java From sentiment-analysis with Apache License 2.0
/** Decides upon a "disagreed" document by applying the learned model based on the last 1,000 "agreed" documents. */
private String clarifyOnSlidingWindow(String tweet) {
    String out = "";
    double[] instanceValues = new double[train.numAttributes()];
    instanceValues[0] = train.attribute(0).addStringValue(tweet);
    train.add(new SparseInstance(1.0, instanceValues));
    try {
        stwv.setInputFormat(train);
        Instances newData = Filter.useFilter(train, stwv);
        Instances train_ins = new Instances(newData, 0, train.size() - 1);
        Instances test_ins = new Instances(newData, train.size() - 1, 1);
        Classifier mnb = (Classifier) new NaiveBayesMultinomial();
        mnb.buildClassifier(train_ins);
        double[] preds = mnb.distributionForInstance(test_ins.get(0));
        if (preds[0] > 0.5)
            out = "positive";
        else
            out = "negative";
    } catch (Exception e) {
        e.printStackTrace();
    }
    train.remove(train.numInstances() - 1);
    return out;
}
Example 2
Source File: ND.java From tsml with GNU General Public License v3.0
/**
 * Predicts the class distribution for a given instance
 *
 * @param inst the (multi-class) instance to be classified
 * @param node the node to get the distribution for
 * @return the class distribution
 * @throws Exception if computing fails
 */
protected double[] distributionForInstance(Instance inst, NDTree node) throws Exception {
    double[] newDist = new double[inst.numClasses()];
    if (node.m_left == null) {
        newDist[node.getIndices()[0]] = 1.0;
        return newDist;
    } else {
        Classifier classifier = (Classifier) m_classifiers.get(node.m_left.getString() + "|" + node.m_right.getString());
        double[] leftDist = distributionForInstance(inst, node.m_left);
        double[] rightDist = distributionForInstance(inst, node.m_right);
        double[] dist = classifier.distributionForInstance(inst);
        for (int i = 0; i < inst.numClasses(); i++) {
            if (node.m_right.contains(i)) {
                newDist[i] = dist[1] * rightDist[i];
            } else {
                newDist[i] = dist[0] * leftDist[i];
            }
        }
        return newDist;
    }
}
Example 3
Source File: Ensemble.java From AILibs with GNU Affero General Public License v3.0
@Override
public double[] distributionForInstance(final Instance instance) throws Exception {
    double[] sums = new double[instance.classAttribute().numValues()];
    double[] newProbs;
    for (Classifier c : this) {
        newProbs = c.distributionForInstance(instance);
        for (int j = 0; j < newProbs.length; j++) {
            sums[j] += newProbs[j];
        }
    }
    if (Utils.eq(Utils.sum(sums), 1)) {
        return sums;
    } else {
        Utils.normalize(sums);
        return sums;
    }
}
Example 4
Source File: EvaluationUtils.java From tsml with GNU General Public License v3.0
/**
 * Generate a single prediction for a test instance given the pre-trained
 * classifier.
 *
 * @param classifier the pre-trained Classifier to evaluate
 * @param test the test instance
 * @exception Exception if an error occurs
 */
public Prediction getPrediction(Classifier classifier, Instance test) throws Exception {
    double actual = test.classValue();
    double[] dist = classifier.distributionForInstance(test);
    if (test.classAttribute().isNominal()) {
        return new NominalPrediction(actual, dist, test.weight());
    } else {
        return new NumericPrediction(actual, dist[0], test.weight());
    }
}
Example 5
Source File: SimpleFlipper.java From collective-classification-weka-package with GNU General Public License v3.0
/**
 * returns the (possibly) new class label
 *
 * @param c the Classifier to use for prediction
 * @param instances the instances to use for flipping
 * @param from the starting index for flipping
 * @param count the number of instances to flip
 * @param index the index of the instance to flip
 * @param history the flipping history
 * @return the (possibly) new class label
 */
@Override
public double flipLabel(Classifier c, Instances instances, int from, int count, int index, FlipHistory history) {
    double[] dist;
    double result;

    // get distribution
    try {
        dist = c.distributionForInstance(instances.instance(index));
    } catch (Exception e) {
        e.printStackTrace();
        return instances.instance(index).classValue();
    }

    // flip label
    if (m_Random.nextDouble() < dist[0])
        result = 0;
    else
        result = 1;

    // history
    history.add(instances.instance(index), dist);

    return result;
}
Example 6
Source File: ConfidentFlipper.java From collective-classification-weka-package with GNU General Public License v3.0
/**
 * returns the (possibly) new class label
 *
 * @param c the Classifier to use for prediction
 * @param instances the instances to use for flipping
 * @param from the starting index for flipping
 * @param count the number of instances to flip
 * @param index the index of the instance to flip
 * @param history the flipping history
 * @return the (possibly) new class label
 */
@Override
public double flipLabel(Classifier c, Instances instances, int from, int count, int index, FlipHistory history) {
    double[] dist;
    double result;

    // get distribution
    try {
        dist = c.distributionForInstance(instances.instance(index));
    } catch (Exception e) {
        e.printStackTrace();
        return instances.instance(index).classValue();
    }

    // do we disagree enough?
    if (StrictMath.abs(dist[0] - history.getLast(instances.instance(index))[0]) >= getDelta()) {
        // flip label
        if (m_Random.nextDouble() < dist[0])
            result = dist[0];
        else
            result = dist[1];
    } else {
        result = instances.instance(index).classValue();
    }

    // history
    history.add(instances.instance(index), dist);

    return result;
}
Example 7
Source File: TriangleFlipper.java From collective-classification-weka-package with GNU General Public License v3.0
/**
 * returns the (possibly) new class label
 *
 * @param c the Classifier to use for prediction
 * @param instances the instances to use for flipping
 * @param from the starting index for flipping
 * @param count the number of instances to flip
 * @param index the index of the instance to flip
 * @param history the flipping history
 * @return the (possibly) new class label
 */
@Override
public double flipLabel(Classifier c, Instances instances, int from, int count, int index, FlipHistory history) {
    double result;
    double[] dist;
    double prob;
    double rand;
    double threshold;

    try {
        result = c.classifyInstance(instances.instance(index));
        dist = c.distributionForInstance(instances.instance(index));
    } catch (Exception e) {
        e.printStackTrace();
        return instances.instance(index).classValue();
    }

    prob = dist[(int) result];

    // flip label
    rand = m_Random.nextDouble();
    threshold = StrictMath.max(5.0 / count, 1 - 2 * (prob - 0.5));
    if (rand < threshold) {
        if (Utils.eq(result, 0.0))
            result = 1;
        else
            result = 0;
    }

    // history
    history.add(instances.instance(index), dist);

    return result;
}
Example 8
Source File: SingleTestSetEvaluator.java From tsml with GNU General Public License v3.0
@Override
public synchronized ClassifierResults evaluate(Classifier classifier, Instances dataset) throws Exception {
    final Instances insts = cloneData ? new Instances(dataset) : dataset;

    ClassifierResults res = new ClassifierResults(insts.numClasses());
    res.setTimeUnit(TimeUnit.NANOSECONDS);
    res.setClassifierName(classifier.getClass().getSimpleName());
    res.setDatasetName(dataset.relationName());
    res.setFoldID(seed);
    res.setSplit("train"); // todo revisit, or leave with the assumption that calling method will set this to test when needed

    res.turnOffZeroTimingsErrors();
    for (Instance testinst : insts) {
        double trueClassVal = testinst.classValue();
        if (setClassMissing)
            testinst.setClassMissing();

        long startTime = System.nanoTime();
        double[] dist = classifier.distributionForInstance(testinst);
        long predTime = System.nanoTime() - startTime;

        res.addPrediction(trueClassVal, dist, indexOfMax(dist), predTime, "");
    }
    res.turnOnZeroTimingsErrors();

    res.finaliseResults();
    res.findAllStatsOnce();

    return res;
}
Example 9
Source File: ClassificationExamples.java From tsml with GNU General Public License v3.0
/**
 * @param train: the standard train fold Instances from the archive
 * @param test: the standard test fold Instances from the archive
 * @param c: Classifier to evaluate
 * @param fold: integer to indicate which fold. Set to 0 to just use train/test
 * @param resultsPath: a string indicating where to store the results
 * @return the accuracy of c on fold for problem given in train/test
 *
 * NOTES:
 * 1. If the classifier is a SaveableEnsemble, then we save the internal cross
 *    validation accuracy and the internal test predictions
 * 2. The output of the file testFold+fold+.csv is
 *    Line 1: ProblemName,ClassifierName, train/test
 *    Line 2: parameter information for final classifier, if it is available
 *    Line 3: test accuracy
 *    then each line is
 *    Actual Class, Predicted Class, Class probabilities
 */
public static double singleClassifierAndFold(Instances train, Instances test, Classifier c, int fold, String resultsPath) {
    Instances[] data = InstanceTools.resampleTrainAndTestInstances(train, test, fold);
    double acc = 0;
    int act;
    int pred;
    // Save internal info for ensembles
    if (c instanceof SaveableEnsemble)
        ((SaveableEnsemble) c).saveResults(resultsPath + "/internalCV_" + fold + ".csv", resultsPath + "/internalTestPreds_" + fold + ".csv");
    try {
        c.buildClassifier(data[0]);
        StringBuilder str = new StringBuilder();
        DecimalFormat df = new DecimalFormat("##.######");
        for (int j = 0; j < data[1].numInstances(); j++) {
            act = (int) data[1].instance(j).classValue();
            double[] probs = c.distributionForInstance(data[1].instance(j));
            pred = 0;
            for (int i = 1; i < probs.length; i++) {
                if (probs[i] > probs[pred])
                    pred = i;
            }
            if (act == pred)
                acc++;
            str.append(act);
            str.append(",");
            str.append(pred);
            str.append(",,");
            for (double d : probs) {
                str.append(df.format(d));
                str.append(",");
            }
            str.append("\n");
        }
        acc /= data[1].numInstances();
        OutFile p = new OutFile(resultsPath + "/testFold" + fold + ".csv");
        p.writeLine(train.relationName() + "," + c.getClass().getName() + ",test");
        if (c instanceof EnhancedAbstractClassifier) {
            p.writeLine(((EnhancedAbstractClassifier) c).getParameters());
        } else
            p.writeLine("No parameter info");
        p.writeLine(acc + "");
        p.writeLine(str.toString());
    } catch (Exception e) {
        System.out.println(" Error = " + e + " in method singleClassifierAndFold");
        e.printStackTrace();
        System.out.println(" TRAIN " + train.relationName() + " has " + train.numAttributes() + " attributes and " + train.numInstances() + " instances");
        System.out.println(" TEST " + test.relationName() + " has " + test.numAttributes() + " attributes and " + test.numInstances() + " instances");
        System.exit(0);
    }
    return acc;
}
Example 10
Source File: SimulationExperiments.java From tsml with GNU General Public License v3.0
public static double singleSampleExperiment(Instances train, Instances test, Classifier c, int sample, String preds) {
    double acc = 0;
    OutFile p = new OutFile(preds + "/testFold" + sample + ".csv");
    // hack here to save internal CV for further ensembling
    // if (c instanceof TrainAccuracyEstimate)
    //     ((TrainAccuracyEstimate) c).writeCVTrainToFile(preds + "/trainFold" + sample + ".csv");
    if (c instanceof SaveableEnsemble)
        ((SaveableEnsemble) c).saveResults(preds + "/internalCV_" + sample + ".csv", preds + "/internalTestPreds_" + sample + ".csv");
    try {
        c.buildClassifier(train);
        int[][] predictions = new int[test.numInstances()][2];
        for (int j = 0; j < test.numInstances(); j++) {
            predictions[j][0] = (int) test.instance(j).classValue();
            test.instance(j).setMissing(test.classIndex()); // Just in case ....
        }
        for (int j = 0; j < test.numInstances(); j++) {
            predictions[j][1] = (int) c.classifyInstance(test.instance(j));
            if (predictions[j][0] == predictions[j][1])
                acc++;
        }
        acc /= test.numInstances();
        String[] names = preds.split("/");
        p.writeLine(names[names.length - 1] + "," + c.getClass().getName() + ",test");
        if (c instanceof EnhancedAbstractClassifier)
            p.writeLine(((EnhancedAbstractClassifier) c).getParameters());
        else if (c instanceof SaveableEnsemble)
            p.writeLine(((SaveableEnsemble) c).getParameters());
        else
            p.writeLine("NoParameterInfo");
        p.writeLine(acc + "");
        for (int j = 0; j < test.numInstances(); j++) {
            p.writeString(predictions[j][0] + "," + predictions[j][1] + ",");
            double[] dist = c.distributionForInstance(test.instance(j));
            for (double d : dist)
                p.writeString("," + d);
            p.writeString("\n");
        }
    } catch (Exception e) {
        System.out.println(" Error = " + e + " in method singleSampleExperiment");
        e.printStackTrace();
        System.out.println(" TRAIN " + train.relationName() + " has " + train.numAttributes() + " attributes and " + train.numInstances() + " instances");
        System.out.println(" TEST " + test.relationName() + " has " + test.numAttributes() + " attributes and " + test.numInstances() + " instances");
        System.exit(0);
    }
    return acc;
}
Example 11
Source File: CrossValidationExperiments.java From NLIWOD with GNU Affero General Public License v3.0
public static void main(String[] args) throws Exception {
    Path datapath = Paths.get("./src/main/resources/old/Qald6Logs.arff");
    BufferedReader reader = new BufferedReader(new FileReader(datapath.toString()));
    ArffReader arff = new ArffReader(reader);
    Instances data = arff.getData();
    data.setClassIndex(6);

    ArrayList<String> systems = Lists.newArrayList("KWGAnswer", "NbFramework", "PersianQA", "SemGraphQA", "UIQA_withoutManualEntries", "UTQA_English");

    int seed = 133;
    // Change to 100 for leave-one-out CV
    int folds = 10;
    Random rand = new Random(seed);
    Instances randData = new Instances(data);
    randData.randomize(rand);

    float cv_ave_f = 0;

    for (int n = 0; n < folds; n++) {
        Instances train = randData.trainCV(folds, n);
        Instances test = randData.testCV(folds, n);

        // Change to the Classifier of your choice
        CDN Classifier = new CDN();
        Classifier.buildClassifier(train);

        float ave_p = 0;
        float ave_r = 0;

        for (int j = 0; j < test.size(); j++) {
            Instance ins = test.get(j);
            // find the index of this test instance in the full dataset
            int k = 0;
            for (int l = 0; l < data.size(); l++) {
                Instance tmp = data.get(l);
                if (tmp.toString().equals(ins.toString())) {
                    k = l;
                }
            }
            double[] confidences = Classifier.distributionForInstance(ins);
            int argmax = -1;
            double max = -1;
            for (int i = 0; i < 6; i++) {
                if (confidences[i] > max) {
                    max = confidences[i];
                    argmax = i;
                }
            }
            String sys2ask = systems.get(systems.size() - argmax - 1);
            ave_p += Float.parseFloat(Utils.loadSystemP(sys2ask).get(k));
            ave_r += Float.parseFloat(Utils.loadSystemR(sys2ask).get(k));
        }

        double p = ave_p / test.size();
        double r = ave_r / test.size();
        double fmeasure = 0;
        if (p > 0 && r > 0) {
            fmeasure = 2 * p * r / (p + r);
        }
        System.out.println("macro F on fold " + n + ": " + fmeasure);
        cv_ave_f += fmeasure / folds;
    }
    System.out.println("macro F average: " + cv_ave_f);
    System.out.println('\n');
}
Example 12
Source File: CollectiveForest.java From collective-classification-weka-package with GNU General Public License v3.0
/**
 * performs the actual building of the classifier
 *
 * @throws Exception if building fails
 */
@Override
protected void buildClassifier() throws Exception {
    Classifier tree;
    int i;
    int n;
    int nextSeed;
    double[] dist;
    Instances bagData;
    boolean[] inBag;
    double outOfBagCount;
    double errorSum;
    Instance outOfBagInst;

    m_PureTrainNodes = 0;
    m_PureTestNodes = 0;

    for (i = 0; i < getNumTrees(); i++) {
        // info
        if (getVerbose())
            System.out.print(".");

        // get next seed number
        nextSeed = m_Random.nextInt();

        // bagging?
        if (getUseBagging()) {
            // inBag-dataset/array
            inBag = new boolean[m_TrainsetNew.numInstances()];
            bagData = resample(m_TrainsetNew, nextSeed, inBag);

            // build i-th tree
            tree = initClassifier(nextSeed);

            // determine and store distributions
            for (n = 0; n < m_TestsetNew.numInstances(); n++) {
                dist = tree.distributionForInstance(m_TestsetNew.instance(n));
                m_List.addDistribution(m_TestsetNew.instance(n), dist);
            }

            // determine out-of-bag-error
            outOfBagCount = 0;
            errorSum = 0;
            for (n = 0; n < inBag.length; n++) {
                if (!inBag[n]) {
                    outOfBagInst = m_TrainsetNew.instance(n);
                    outOfBagCount += outOfBagInst.weight();
                    if (m_TrainsetNew.classAttribute().isNumeric()) {
                        errorSum += outOfBagInst.weight() * StrictMath.abs(tree.classifyInstance(outOfBagInst) - outOfBagInst.classValue());
                    } else {
                        if (tree.classifyInstance(outOfBagInst) != outOfBagInst.classValue()) {
                            errorSum += outOfBagInst.weight();
                        }
                    }
                }
            }
            m_OutOfBagError = errorSum / outOfBagCount;
        } else {
            // build i-th tree
            tree = initClassifier(nextSeed);

            // determine and store distributions
            for (n = 0; n < m_TestsetNew.numInstances(); n++) {
                dist = tree.distributionForInstance(m_TestsetNew.instance(n));
                m_List.addDistribution(m_TestsetNew.instance(n), dist);
            }
        }

        // get information about pure nodes
        try {
            if (tree instanceof AdditionalMeasureProducer) {
                m_PureTrainNodes += ((AdditionalMeasureProducer) tree).getMeasure("measurePureTrainNodes");
                m_PureTestNodes += ((AdditionalMeasureProducer) tree).getMeasure("measurePureTestNodes");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

        tree = null;
    }

    if (getVerbose())
        System.out.println();
}
Example 13
Source File: CollectiveInstances.java From collective-classification-weka-package with GNU General Public License v3.0
/**
 * calculates the RMS for test/original test and train set
 *
 * @param c the classifier to use for determining the RMS
 * @param train the training set
 * @param test the test set
 * @param testOriginal the original test set (can be null)
 * @return the RMS array (contains RMS, RMSTrain, RMSTest, RMSTestOriginal)
 * @throws Exception if something goes wrong
 */
public static double[] calculateRMS(Classifier c, Instances train, Instances test, Instances testOriginal) throws Exception {
    int i;
    double[] dist;
    double[] result;

    result = new double[4];

    // 1. train
    result[1] = 0;
    for (i = 0; i < train.numInstances(); i++) {
        dist = c.distributionForInstance(train.instance(i));
        result[1] += StrictMath.pow(dist[StrictMath.abs((int) train.instance(i).classValue() - 1)], 2);
    }

    // 2. test
    result[2] = 0;
    for (i = 0; i < test.numInstances(); i++) {
        dist = c.distributionForInstance(test.instance(i));
        result[2] += StrictMath.pow(StrictMath.min(dist[0], dist[1]), 2);
    }

    // 3. original test
    if (testOriginal != null) {
        result[3] = 0;
        for (i = 0; i < testOriginal.numInstances(); i++) {
            dist = c.distributionForInstance(testOriginal.instance(i));
            result[3] += StrictMath.pow(dist[StrictMath.abs((int) testOriginal.instance(i).classValue() - 1)], 2);
        }
    } else {
        result[3] = Double.NaN;
    }

    // normalize
    result[0] = (result[2] + result[1]) / (test.numInstances() + train.numInstances());
    result[1] = result[1] / train.numInstances();
    result[2] = result[2] / test.numInstances();
    if (testOriginal != null)
        result[3] = result[3] / testOriginal.numInstances();

    // root
    result[0] = StrictMath.sqrt(result[0]);
    result[1] = StrictMath.sqrt(result[1]);
    result[2] = StrictMath.sqrt(result[2]);
    if (testOriginal != null)
        result[3] = StrictMath.sqrt(result[3]);

    return result;
}
Example 14
Source File: SimulationExperiments.java From tsml with GNU General Public License v3.0
/**
 * Runs a single fold experiment, saving all output.
 *
 * @param train
 * @param test
 * @param c
 * @param sample
 * @param preds
 * @return
 */
public static double singleSampleExperiment(Instances train, Instances test, Classifier c, int sample, String preds) {
    double acc = 0;
    OutFile p = new OutFile(preds + "/testFold" + sample + ".csv");
    // hack here to save internal CV for further ensembling
    if (EnhancedAbstractClassifier.classifierAbleToEstimateOwnPerformance(c))
        ((EnhancedAbstractClassifier) c).setEstimateOwnPerformance(true);
    if (c instanceof SaveableEnsemble)
        ((SaveableEnsemble) c).saveResults(preds + "/internalCV_" + sample + ".csv", preds + "/internalTestPreds_" + sample + ".csv");
    try {
        c.buildClassifier(train);
        if (EnhancedAbstractClassifier.classifierIsEstimatingOwnPerformance(c))
            ((EnhancedAbstractClassifier) c).getTrainResults().writeFullResultsToFile(preds + "/trainFold" + sample + ".csv");
        int[][] predictions = new int[test.numInstances()][2];
        for (int j = 0; j < test.numInstances(); j++) {
            predictions[j][0] = (int) test.instance(j).classValue();
            test.instance(j).setMissing(test.classIndex()); // Just in case ....
        }
        for (int j = 0; j < test.numInstances(); j++) {
            predictions[j][1] = (int) c.classifyInstance(test.instance(j));
            if (predictions[j][0] == predictions[j][1])
                acc++;
        }
        acc /= test.numInstances();
        String[] names = preds.split("/");
        p.writeLine(names[names.length - 1] + "," + c.getClass().getName() + ",test");
        if (c instanceof EnhancedAbstractClassifier)
            p.writeLine(((EnhancedAbstractClassifier) c).getParameters());
        else if (c instanceof SaveableEnsemble)
            p.writeLine(((SaveableEnsemble) c).getParameters());
        else
            p.writeLine("NoParameterInfo");
        p.writeLine(acc + "");
        for (int j = 0; j < test.numInstances(); j++) {
            p.writeString(predictions[j][0] + "," + predictions[j][1] + ",");
            double[] dist = c.distributionForInstance(test.instance(j));
            for (double d : dist)
                p.writeString("," + d);
            p.writeString("\n");
        }
    } catch (Exception e) {
        System.out.println(" Error = " + e + " in method singleSampleExperiment");
        e.printStackTrace();
        System.out.println(" TRAIN " + train.relationName() + " has " + train.numAttributes() + " attributes and " + train.numInstances() + " instances");
        System.out.println(" TEST " + test.relationName() + " has " + test.numAttributes() + " attributes and " + test.numInstances() + " instances");
        System.exit(0);
    }
    return acc;
}
Example 15
Source File: HTML.java From tsml with GNU General Public License v3.0
/**
 * Store the prediction made by the classifier as a string.
 *
 * @param classifier the classifier to use
 * @param inst the instance to generate text from
 * @param index the index in the dataset
 * @throws Exception if something goes wrong
 */
protected void doPrintClassification(Classifier classifier, Instance inst, int index) throws Exception {
    double[] d = classifier.distributionForInstance(inst);
    doPrintClassification(d, inst, index);
}
Example 16
Source File: PlainText.java From tsml with GNU General Public License v3.0
/**
 * Store the prediction made by the classifier as a string.
 *
 * @param classifier the classifier to use
 * @param inst the instance to generate text from
 * @param index the index in the dataset
 * @throws Exception if something goes wrong
 */
protected void doPrintClassification(Classifier classifier, Instance inst, int index) throws Exception {
    double[] d = classifier.distributionForInstance(inst);
    doPrintClassification(d, inst, index);
}
Example 17
Source File: CSV.java From tsml with GNU General Public License v3.0
/**
 * Store the prediction made by the classifier as a string.
 *
 * @param classifier the classifier to use
 * @param inst the instance to generate text from
 * @param index the index in the dataset
 * @throws Exception if something goes wrong
 */
protected void doPrintClassification(Classifier classifier, Instance inst, int index) throws Exception {
    double[] d = classifier.distributionForInstance(inst);
    doPrintClassification(d, inst, index);
}
Example 18
Source File: XML.java From tsml with GNU General Public License v3.0
/**
 * Store the prediction made by the classifier as a string.
 *
 * @param classifier the classifier to use
 * @param inst the instance to generate text from
 * @param index the index in the dataset
 * @throws Exception if something goes wrong
 */
protected void doPrintClassification(Classifier classifier, Instance inst, int index) throws Exception {
    double[] d = classifier.distributionForInstance(inst);
    doPrintClassification(d, inst, index);
}
Example 19
Source File: Ex02_Classifiers.java From tsml with GNU General Public License v3.0
public static void main(String[] args) throws Exception {
    // We'll use this data throughout, see Ex01_Datahandling
    int seed = 0;
    Instances[] trainTest = DatasetLoading.sampleItalyPowerDemand(seed);
    Instances train = trainTest[0];
    Instances test = trainTest[1];

    // Here's the super basic workflow, this is pure weka:
    RandomForest randf = new RandomForest();
    randf.setNumTrees(500);
    randf.setSeed(seed);

    randf.buildClassifier(train); // aka fit, train

    double acc = .0;
    for (Instance testInst : test) {
        double pred = randf.classifyInstance(testInst); // aka predict
        // double[] dist = randf.distributionForInstance(testInst); // aka predict_proba
        if (pred == testInst.classValue())
            acc++;
    }
    acc /= test.numInstances();
    System.out.println("Random Forest accuracy on ItalyPowerDemand: " + acc);

    // All classifiers implement the Classifier interface. This guarantees
    // the buildClassifier, classifyInstance and distributionForInstance methods,
    // which is mainly what we want.

    // Most if not all classifiers should extend AbstractClassifier, which adds
    // a little extra common functionality.

    // There are also a number of classifiers listed in experiments.ClassifierLists.
    // This class is updated over time and may eventually turn into factories etc.
    // on the backend, but for now it is just a way to get a classifier with
    // defined settings (parameters etc). We use this to record the exact
    // parameters used in papers, for example. We also use it to instantiate
    // particular classifiers from a string argument when running on clusters.
    Classifier classifier = ClassifierLists.setClassifierClassic("RandF", seed);
    classifier.buildClassifier(train);
    classifier.distributionForInstance(test.instance(0));
}