com.google.cloud.dataflow.sdk.values.PCollectionTuple Java Examples
The following examples show how to use
com.google.cloud.dataflow.sdk.values.PCollectionTuple.
Example #1
Source File: ParDoMultiOutputITCase.java From flink-dataflow with Apache License 2.0
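// Imports this excerpt needs from the Dataflow SDK 1.x. FlinkTestPipeline
// (from the flink-dataflow project) and resultPath are defined outside this excerpt.
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.TextIO;
import com.google.cloud.dataflow.sdk.transforms.Create;
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import com.google.cloud.dataflow.sdk.transforms.ParDo;
import com.google.cloud.dataflow.sdk.values.PCollection;
import com.google.cloud.dataflow.sdk.values.PCollectionTuple;
import com.google.cloud.dataflow.sdk.values.TupleTag;
import com.google.cloud.dataflow.sdk.values.TupleTagList;

@Override
protected void testProgram() throws Exception {
  Pipeline p = FlinkTestPipeline.createForBatch();

  PCollection<String> words = p.apply(
      Create.of("Hello", "Whatupmyman", "hey", "SPECIALthere", "MAAA", "MAAFOOO"));

  // Select words whose length is at or below a cut off,
  // plus the lengths of words that are above the cut off.
  // Also select words starting with "MAA".
  final int wordLengthCutOff = 3;

  // Create tags to use for the main and side outputs.
  final TupleTag<String> wordsBelowCutOffTag = new TupleTag<String>(){};
  final TupleTag<Integer> wordLengthsAboveCutOffTag = new TupleTag<Integer>(){};
  final TupleTag<String> markedWordsTag = new TupleTag<String>(){};

  PCollectionTuple results = words.apply(ParDo
      .withOutputTags(wordsBelowCutOffTag,
          TupleTagList.of(wordLengthsAboveCutOffTag).and(markedWordsTag))
      .of(new DoFn<String, String>() {
        // This tag is not passed to withOutputTags, so it is an undeclared side output.
        final TupleTag<String> specialWordsTag = new TupleTag<String>(){};

        @Override
        public void processElement(ProcessContext c) {
          String word = c.element();
          if (word.length() <= wordLengthCutOff) {
            // Main output: the short word itself.
            c.output(word);
          } else {
            // Side output: the length of the longer word.
            c.sideOutput(wordLengthsAboveCutOffTag, word.length());
          }
          if (word.startsWith("MAA")) {
            c.sideOutput(markedWordsTag, word);
          }
          if (word.startsWith("SPECIAL")) {
            c.sideOutput(specialWordsTag, word);
          }
        }
      }));

  // Extract the PCollection results, by tag.
  PCollection<String> wordsBelowCutOff = results.get(wordsBelowCutOffTag);
  PCollection<Integer> wordLengthsAboveCutOff = results.get(wordLengthsAboveCutOffTag);
  PCollection<String> markedWords = results.get(markedWordsTag);

  // Only the marked words are written out.
  markedWords.apply(TextIO.Write.to(resultPath));

  p.run();
}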
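To see which element lands in which output, the branching inside processElement can be replayed in plain Java without the Dataflow SDK. This is only a sketch of the routing rules, not of PCollectionTuple itself: the class name RoutingSketch, the method route, and the string keys that stand in for the TupleTags are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the multi-output ParDo above (no Dataflow SDK involved).
public class RoutingSketch {
    static final int WORD_LENGTH_CUT_OFF = 3;

    // Applies the same if/else rules as processElement and collects
    // each output under a string key that mirrors the tag name.
    static Map<String, List<Object>> route(List<String> words) {
        Map<String, List<Object>> outputs = new LinkedHashMap<>();
        outputs.put("wordsBelowCutOff", new ArrayList<>());
        outputs.put("wordLengthsAboveCutOff", new ArrayList<>());
        outputs.put("markedWords", new ArrayList<>());
        outputs.put("specialWords", new ArrayList<>());

        for (String word : words) {
            if (word.length() <= WORD_LENGTH_CUT_OFF) {
                outputs.get("wordsBelowCutOff").add(word);
            } else {
                outputs.get("wordLengthsAboveCutOff").add(word.length());
            }
            if (word.startsWith("MAA")) {
                outputs.get("markedWords").add(word);
            }
            if (word.startsWith("SPECIAL")) {
                outputs.get("specialWords").add(word);
            }
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
            "Hello", "Whatupmyman", "hey", "SPECIALthere", "MAAA", "MAAFOOO");
        // Print one line per output key with the elements routed to it.
        route(words).forEach((tag, values) -> System.out.println(tag + " = " + values));
    }
}
```

For the six input words in the example, this routes "MAAA" and "MAAFOOO" to the marked-words output, which is the only output the original test writes to resultPath. A real PCollection does not guarantee element order, so the list order here is an artifact of the sketch.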