Java Code Examples for org.deeplearning4j.models.word2vec.wordstore.VocabCache#elementAtIndex()
The following examples show how to use org.deeplearning4j.models.word2vec.wordstore.VocabCache#elementAtIndex().
Example 1
Source File: WordVectorSerializer.java From deeplearning4j with Apache License 2.0
/**
 * This method saves specified SequenceVectors model to target OutputStream
 *
 * @param vectors SequenceVectors model
 * @param factory SequenceElementFactory implementation for your objects
 * @param stream Target output stream
 * @param <T>
 */
public static <T extends SequenceElement> void writeSequenceVectors(@NonNull SequenceVectors<T> vectors,
                @NonNull SequenceElementFactory<T> factory, @NonNull OutputStream stream) throws IOException {
    WeightLookupTable<T> lookupTable = vectors.getLookupTable();
    VocabCache<T> vocabCache = vectors.getVocab();

    try (PrintWriter writer = new PrintWriter(
                    new BufferedWriter(new OutputStreamWriter(stream, StandardCharsets.UTF_8)))) {
        // at first line we save VectorsConfiguration
        writer.write(vectors.getConfiguration().toEncodedJson());

        // now we have elements one by one
        for (int x = 0; x < vocabCache.numWords(); x++) {
            T element = vocabCache.elementAtIndex(x);
            String json = factory.serialize(element);
            INDArray d = Nd4j.create(1);
            double[] vector = lookupTable.vector(element.getLabel()).dup().data().asDouble();
            ElementPair pair = new ElementPair(json, vector);
            writer.println(pair.toEncodedJson());
            writer.flush();
        }
    }
}
Example 2
Source File: VocabHolder.java From deeplearning4j with Apache License 2.0
public INDArray getSyn0Vector(Integer wordIndex, VocabCache<VocabWord> vocabCache) {
    if (!workers.contains(Thread.currentThread().getId()))
        workers.add(Thread.currentThread().getId());

    VocabWord word = vocabCache.elementAtIndex(wordIndex);

    if (!indexSyn0VecMap.containsKey(word)) {
        synchronized (this) {
            if (!indexSyn0VecMap.containsKey(word)) {
                indexSyn0VecMap.put(word, getRandomSyn0Vec(vectorLength.get(), wordIndex));
            }
        }
    }

    return indexSyn0VecMap.get(word);
}
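Example 2 resolves an index to a vocabulary element and then creates that element's vector at most once, using a check, a synchronized block, and a second check. The same "initialize on first access" idea can be sketched without any DL4J types. The class below is a hypothetical stand-in, not part of deeplearning4j: it keys the cache on the word index rather than on a VocabWord, stores plain double[] arrays instead of INDArray, and swaps the explicit double-checked locking for ConcurrentHashMap.computeIfAbsent.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

/** Hypothetical sketch of a lazily filled per-word vector cache; not a DL4J class. */
public class LazyVectorCacheSketch {
    // Word index -> vector, filled on first access.
    private final Map<Integer, double[]> vectors = new ConcurrentHashMap<>();
    // Assumed initializer; stands in for something like random vector creation.
    private final IntFunction<double[]> initializer;

    public LazyVectorCacheSketch(IntFunction<double[]> initializer) {
        this.initializer = initializer;
    }

    /** Returns the cached vector for the index, creating it once if it is absent. */
    public double[] vectorFor(int wordIndex) {
        return vectors.computeIfAbsent(wordIndex, index -> initializer.apply(index));
    }

    public static void main(String[] args) {
        LazyVectorCacheSketch cache =
                new LazyVectorCacheSketch(index -> new double[] {index, index});
        double[] first = cache.vectorFor(3);
        double[] second = cache.vectorFor(3);
        // Both calls return the same array instance.
        System.out.println(first == second);
    }
}
```

computeIfAbsent runs the initializer at most once per key on a ConcurrentHashMap, so callers on different threads receive the same array for the same index.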
Example 3
Source File: WordVectorSerializer.java From deeplearning4j with Apache License 2.0
/**
 * This method saves vocab cache to provided OutputStream.
 * Please note: it saves only vocab content, so it's suitable mostly for BagOfWords/TF-IDF vectorizers
 *
 * @param vocabCache
 * @param stream
 * @throws UnsupportedEncodingException
 */
public static void writeVocabCache(@NonNull VocabCache<VocabWord> vocabCache, @NonNull OutputStream stream)
                throws IOException {
    try (PrintWriter writer = new PrintWriter(
                    new BufferedWriter(new OutputStreamWriter(stream, StandardCharsets.UTF_8)))) {
        // saving general vocab information
        writer.println("" + vocabCache.numWords() + " " + vocabCache.totalNumberOfDocs() + " "
                        + vocabCache.totalWordOccurrences());

        for (int x = 0; x < vocabCache.numWords(); x++) {
            VocabWord word = vocabCache.elementAtIndex(x);
            writer.println(word.toJSON());
        }
    }
}
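Examples 1 and 3 share one pattern: loop from 0 to numWords() and fetch each entry with elementAtIndex(x), so the output follows index order. The sketch below reproduces that loop with a small, hypothetical in-memory vocabulary so it can run without deeplearning4j on the classpath. IndexedVocabSketch, addWord and dump are illustrative names only. Its numWords and elementAtIndex methods borrow the names used in the examples but are not the real VocabCache interface, and the output format (a count line followed by one word per line) is an assumption chosen for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an index-addressable vocabulary; not the DL4J VocabCache. */
public class IndexedVocabSketch {
    private final List<String> byIndex = new ArrayList<>();
    private final Map<String, Integer> indexOf = new HashMap<>();

    /** Adds the word if it is new and returns its index either way. */
    public int addWord(String word) {
        Integer existing = indexOf.get(word);
        if (existing != null) {
            return existing;
        }
        int index = byIndex.size();
        byIndex.add(word);
        indexOf.put(word, index);
        return index;
    }

    public int numWords() {
        return byIndex.size();
    }

    public String elementAtIndex(int index) {
        return byIndex.get(index);
    }

    /** Writes a count line, then one word per line in index order. */
    public static String dump(IndexedVocabSketch vocab) {
        StringBuilder out = new StringBuilder();
        out.append(vocab.numWords()).append('\n');
        for (int x = 0; x < vocab.numWords(); x++) {
            out.append(vocab.elementAtIndex(x)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        IndexedVocabSketch vocab = new IndexedVocabSketch();
        vocab.addWord("day");
        vocab.addWord("night");
        vocab.addWord("day"); // duplicate, keeps index 0
        System.out.print(dump(vocab));
    }
}
```

Iterating by index rather than over a word collection keeps the written order stable, which matters when a reader later needs to map each line back to the same position.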