Java Code Examples for cc.mallet.types.Instance#setData()
The following examples show how to use cc.mallet.types.Instance#setData().
Example 1
Source File: CorpusRepresentationMalletTarget.java, from gateplugin-LearningFramework (GNU Lesser General Public License v2.1)
/**
 * Extract the independent features for a single instance annotation.
 *
 * Extract the independent features for a single annotation according to the information
 * in the featureInfo object. The information in the featureInfo instance gets updated
 * by this.
 *
 * NOTE: this method is static so that it can be used in the CorpusRepresentationMalletSeq class too.
 *
 * @param instanceAnnotation instance annotation
 * @param inputAS input annotation set
 * @param featureInfo feature info instance
 * @param pipe mallet pipe
 * @return Instance
 */
static Instance extractIndependentFeaturesHelper(
        Annotation instanceAnnotation,
        AnnotationSet inputAS,
        FeatureInfo featureInfo,
        Pipe pipe) {

    AugmentableFeatureVector afv = new AugmentableFeatureVector(pipe.getDataAlphabet());
    // Constructor parms: data, target, name, source
    Instance inst = new Instance(afv, null, null, null);
    for (FeatureSpecAttribute attr : featureInfo.getAttributes()) {
        FeatureExtractionMalletSparse.extractFeature(inst, attr, inputAS, instanceAnnotation);
    }
    // TODO: we destructively replace the AugmentableFeatureVector by a FeatureVector here,
    // but it is not clear if this is beneficial - our assumption is that yes.
    inst.setData(((AugmentableFeatureVector) inst.getData()).toFeatureVector());
    return inst;
}
Example 2
Source File: RemoveStopwords.java, from baleen (Apache License 2.0)
@Override
public Instance pipe(Instance carrier) {
    // Read the current token sequence from the instance.
    TokenSequence input = (TokenSequence) carrier.getData();
    TokenSequence output = new TokenSequence();
    for (int i = 0; i < input.size(); i++) {
        Token t = input.get(i);
        if (!stopwords.contains(t.getText())) {
            output.add(t);
        }
    }
    // Replace the instance data with the filtered sequence.
    carrier.setData(output);
    return carrier;
}
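The read, transform, and write-back pattern in Example 2 can be sketched without MALLET on the classpath. This is only an illustrative stand-in under stated assumptions: the nested `Carrier` class is hypothetical and mirrors nothing more than the `getData()`/`setData()` slot of `Instance`, the class name `StopwordPipeSketch` is invented here, and tokens are plain strings rather than MALLET `Token` objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StopwordPipeSketch {

    // Hypothetical stand-in for the data slot of cc.mallet.types.Instance.
    // It is NOT the MALLET class; it only mirrors getData()/setData().
    static class Carrier {
        private Object data;

        Carrier(Object data) {
            this.data = data;
        }

        Object getData() {
            return data;
        }

        void setData(Object data) {
            this.data = data;
        }
    }

    private final Set<String> stopwords;

    StopwordPipeSketch(Set<String> stopwords) {
        this.stopwords = stopwords;
    }

    // Read the current data, build a filtered copy, and store it back
    // on the same carrier object, which is then returned.
    @SuppressWarnings("unchecked")
    Carrier pipe(Carrier carrier) {
        List<String> input = (List<String>) carrier.getData();
        List<String> output = new ArrayList<>();
        for (String token : input) {
            if (!stopwords.contains(token)) {
                output.add(token);
            }
        }
        carrier.setData(output);
        return carrier;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("the", "on"));
        Carrier carrier = new Carrier(
                new ArrayList<>(Arrays.asList("the", "cat", "sat", "on", "the", "mat")));
        Carrier result = new StopwordPipeSketch(stop).pipe(carrier);
        System.out.println(result.getData()); // prints [cat, sat, mat]
    }
}
```

As in Example 2, the sketch builds a new output list instead of removing elements from the input while iterating over it, and the carrier's data is replaced only once filtering is complete.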
Example 3
Source File: MalletCalculator.java, from TagRec (GNU Affero General Public License v3.0)
private void initializeDataStructures() {
    this.instances = new InstanceList(new StringList2FeatureSequence());
    for (Map<Integer, Integer> map : this.maps) {
        // Repeat each key (as a string) once per unit of its mapped value.
        List<String> tags = new ArrayList<String>();
        for (Map.Entry<Integer, Integer> entry : map.entrySet()) {
            for (int i = 0; i < entry.getValue(); i++) {
                tags.add(entry.getKey().toString());
            }
        }
        Instance inst = new Instance(tags, null, null, null);
        inst.setData(tags);
        this.instances.addThruPipe(inst);
    }
}
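The nested loops in Example 3 turn a map into a flat list of strings before the list is handed to the `Instance`. A minimal sketch of that expansion step alone, in plain Java, might look as follows. The class name `TagExpansionSketch` and the method name `expand` are hypothetical, and reading the map keys as tag ids and the values as occurrence counts is an assumption drawn from the code, not something the source states.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TagExpansionSketch {

    // Repeat each key (as a string) once per unit of its mapped value,
    // in the iteration order of the map.
    static List<String> expand(Map<Integer, Integer> counts) {
        List<String> tags = new ArrayList<>();
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            for (int i = 0; i < entry.getValue(); i++) {
                tags.add(entry.getKey().toString());
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output order is predictable.
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        counts.put(7, 2);
        counts.put(3, 1);
        System.out.println(expand(counts)); // prints [7, 7, 3]
    }
}
```

In Example 3 the resulting list is passed both as the first (data) argument of the `Instance` constructor and to `setData()`, so the second call appears to store the same object again.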
Example 4
Source File: MalletCalculatorTweet.java, from TagRec (GNU Affero General Public License v3.0)
private void initializeDataStructures() {
    this.instances = new InstanceList(new StringList2FeatureSequence());
    for (Map<Integer, Integer> map : this.maps) {
        // Repeat each key (as a string) once per unit of its mapped value.
        List<String> tags = new ArrayList<String>();
        for (Map.Entry<Integer, Integer> entry : map.entrySet()) {
            for (int i = 0; i < entry.getValue(); i++) {
                tags.add(entry.getKey().toString());
            }
        }
        Instance inst = new Instance(tags, null, null, null);
        inst.setData(tags);
        this.instances.addThruPipe(inst);
    }
}