org.apache.flink.streaming.api.functions.KeyedProcessFunction Java Examples
The following examples show how to use
org.apache.flink.streaming.api.functions.KeyedProcessFunction.
Example #1
Source File: ProcessFunctionTestHarnesses.java From flink with Apache License 2.0 | 6 votes |
/**
 * Returns an initialized test harness for {@link KeyedProcessFunction}.
 *
 * @param function instance of a {@link KeyedProcessFunction} under test
 * @param <K> key type
 * @param <IN> type of input stream elements
 * @param <OUT> type of output stream elements
 * @return {@link KeyedOneInputStreamOperatorTestHarness} wrapped around {@code function}
 */
public static <K, IN, OUT> KeyedOneInputStreamOperatorTestHarness<K, IN, OUT> forKeyedProcessFunction(
        final KeyedProcessFunction<K, IN, OUT> function,
        final KeySelector<IN, K> keySelector,
        final TypeInformation<K> keyType) throws Exception {
    KeyedOneInputStreamOperatorTestHarness<K, IN, OUT> testHarness =
            new KeyedOneInputStreamOperatorTestHarness<>(
                    new KeyedProcessOperator<>(Preconditions.checkNotNull(function)),
                    keySelector,
                    keyType,
                    1,
                    1,
                    0);
    testHarness.open();
    return testHarness;
}
Example #2
Source File: ProcTimeRangeBoundedPrecedingFunction.java From flink with Apache License 2.0 | 6 votes |
@Override
public void processElement(
        RowData input,
        KeyedProcessFunction<K, RowData, RowData>.Context ctx,
        Collector<RowData> out) throws Exception {
    long currentTime = ctx.timerService().currentProcessingTime();
    // buffer the incoming event:
    // add current element to the window list of elements with corresponding timestamp
    List<RowData> rowList = inputState.get(currentTime);
    // null value means that this is the first event received for this timestamp
    if (rowList == null) {
        rowList = new ArrayList<RowData>();
        // register timer to process event once the current millisecond passed
        ctx.timerService().registerProcessingTimeTimer(currentTime + 1);
        registerCleanupTimer(ctx, currentTime);
    }
    rowList.add(input);
    inputState.put(currentTime, rowList);
}
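The buffering pattern in the method above (group rows by the current processing-time millisecond, and register a timer only for the first row of each millisecond) can be tried out without a Flink cluster. The following is a minimal sketch in plain Java, not Flink API: `MillisecondBufferSketch`, its `HashMap` standing in for the map state, and its `TreeSet` standing in for the timer service are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Plain-Java stand-in for per-key state and timers; hypothetical, not Flink API.
class MillisecondBufferSketch {
    // assumed replacement for the map state keyed by processing time
    final Map<Long, List<String>> inputState = new HashMap<>();
    // assumed replacement for the timer service
    final TreeSet<Long> timers = new TreeSet<>();

    void processElement(String input, long currentTime) {
        List<String> rowList = inputState.get(currentTime);
        if (rowList == null) {
            // first row for this millisecond: schedule processing once it has passed
            rowList = new ArrayList<>();
            timers.add(currentTime + 1);
        }
        rowList.add(input);
        inputState.put(currentTime, rowList);
    }

    public static void main(String[] args) {
        MillisecondBufferSketch s = new MillisecondBufferSketch();
        s.processElement("a", 100L);
        s.processElement("b", 100L);
        s.processElement("c", 101L);
        System.out.println(s.inputState.get(100L)); // [a, b]
        System.out.println(s.timers);               // [101, 102]
    }
}
```

Two rows arriving in the same millisecond share one list and one timer, which is the point of the null check.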
Example #3
Source File: KeyedStream.java From Flink-CEPplus with Apache License 2.0 | 6 votes |
/** * Applies the given {@link KeyedProcessFunction} on the input stream, thereby creating a transformed output stream. * * <p>The function will be called for every element in the input streams and can produce zero * or more output elements. Contrary to the {@link DataStream#flatMap(FlatMapFunction)} * function, this function can also query the time and set timers. When reacting to the firing * of set timers the function can directly emit elements and/or register yet more timers. * * @param keyedProcessFunction The {@link KeyedProcessFunction} that is called for each element in the stream. * * @param <R> The type of elements emitted by the {@code KeyedProcessFunction}. * * @return The transformed {@link DataStream}. */ @PublicEvolving public <R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY, T, R> keyedProcessFunction) { TypeInformation<R> outType = TypeExtractor.getUnaryOperatorReturnType( keyedProcessFunction, KeyedProcessFunction.class, 1, 2, TypeExtractor.NO_INDEX, getType(), Utils.getCallLocationName(), true); return process(keyedProcessFunction, outType); }
Example #4
Source File: ProcTimeRangeBoundedPrecedingFunction.java From flink with Apache License 2.0 | 6 votes |
@Override
public void processElement(
        BaseRow input,
        KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx,
        Collector<BaseRow> out) throws Exception {
    long currentTime = ctx.timerService().currentProcessingTime();
    // register state-cleanup timer
    registerProcessingCleanupTimer(ctx, currentTime);

    // buffer the incoming event:
    // add current element to the window list of elements with corresponding timestamp
    List<BaseRow> rowList = inputState.get(currentTime);
    // null value means that this is the first event received for this timestamp
    if (rowList == null) {
        rowList = new ArrayList<BaseRow>();
        // register timer to process event once the current millisecond passed
        ctx.timerService().registerProcessingTimeTimer(currentTime + 1);
    }
    rowList.add(input);
    inputState.put(currentTime, rowList);
}
Example #5
Source File: KeyedStream.java From flink with Apache License 2.0 | 6 votes |
/** * Applies the given {@link KeyedProcessFunction} on the input stream, thereby creating a transformed output stream. * * <p>The function will be called for every element in the input streams and can produce zero * or more output elements. Contrary to the {@link DataStream#flatMap(FlatMapFunction)} * function, this function can also query the time and set timers. When reacting to the firing * of set timers the function can directly emit elements and/or register yet more timers. * * @param keyedProcessFunction The {@link KeyedProcessFunction} that is called for each element in the stream. * * @param <R> The type of elements emitted by the {@code KeyedProcessFunction}. * * @return The transformed {@link DataStream}. */ @PublicEvolving public <R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY, T, R> keyedProcessFunction) { TypeInformation<R> outType = TypeExtractor.getUnaryOperatorReturnType( keyedProcessFunction, KeyedProcessFunction.class, 1, 2, TypeExtractor.NO_INDEX, getType(), Utils.getCallLocationName(), true); return process(keyedProcessFunction, outType); }
Example #7
Source File: RowTimeRowsBoundedPrecedingFunction.java From flink with Apache License 2.0 | 5 votes |
@Override public void processElement( RowData input, KeyedProcessFunction<K, RowData, RowData>.Context ctx, Collector<RowData> out) throws Exception { // register state-cleanup timer registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime()); // triggering timestamp for trigger calculation long triggeringTs = input.getLong(rowTimeIdx); Long lastTriggeringTs = lastTriggeringTsState.value(); if (lastTriggeringTs == null) { lastTriggeringTs = 0L; } // check if the data is expired, if not, save the data and register event time timer if (triggeringTs > lastTriggeringTs) { List<RowData> data = inputState.get(triggeringTs); if (null != data) { data.add(input); inputState.put(triggeringTs, data); } else { data = new ArrayList<RowData>(); data.add(input); inputState.put(triggeringTs, data); // register event time timer ctx.timerService().registerEventTimeTimer(triggeringTs); } } }
Example #8
Source File: ProcTimeRowsBoundedPrecedingFunction.java From flink with Apache License 2.0 | 5 votes |
@Override public void onTimer( long timestamp, KeyedProcessFunction<K, RowData, RowData>.OnTimerContext ctx, Collector<RowData> out) throws Exception { if (stateCleaningEnabled) { cleanupState(inputState, accState, counterState, smallestTsState); function.cleanup(); } }
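Judging by the class name and the state handles cleared above (`inputState`, `accState`, `counterState`, `smallestTsState`), this function maintains a window bounded by a row count. As a rough illustration of the counter logic only, the sketch below assumes a running sum as the accumulator and a deque in place of timestamp-keyed map state; `BoundedRowsSumSketch` and its members are hypothetical and not part of Flink.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical plain-Java model of a count-bounded window; not Flink API.
class BoundedRowsSumSketch {
    final long precedingOffset;                 // assumed: max rows kept, including the current one
    final Deque<Long> buffer = new ArrayDeque<>();
    long counter = 0;                           // stand-in for counterState
    long sum = 0;                               // stand-in for the accumulator state

    BoundedRowsSumSketch(long precedingOffset) {
        this.precedingOffset = precedingOffset;
    }

    long processElement(long value) {
        if (counter == precedingOffset) {
            // window is full: retract the oldest row before accumulating the new one
            sum -= buffer.removeFirst();
        } else {
            // the counter only grows while the buffer is still filling
            counter += 1;
        }
        buffer.addLast(value);
        sum += value;
        return sum;
    }

    public static void main(String[] args) {
        BoundedRowsSumSketch s = new BoundedRowsSumSketch(3);
        for (long v = 1; v <= 5; v++) {
            System.out.println(s.processElement(v)); // 1, 3, 6, 9, 12
        }
    }
}
```

A timer-driven cleanup like the `onTimer` method above would correspond to clearing `buffer`, `counter` and `sum` together in this model.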
Example #9
Source File: ProcTimeUnboundedPrecedingFunction.java From flink with Apache License 2.0 | 5 votes |
@Override public void onTimer( long timestamp, KeyedProcessFunction<K, RowData, RowData>.OnTimerContext ctx, Collector<RowData> out) throws Exception { if (stateCleaningEnabled) { cleanupState(accState); function.cleanup(); } }
Example #10
Source File: ProcTimeUnboundedPrecedingFunction.java From flink with Apache License 2.0 | 5 votes |
@Override public void processElement( RowData input, KeyedProcessFunction<K, RowData, RowData>.Context ctx, Collector<RowData> out) throws Exception { // register state-cleanup timer registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime()); RowData accumulators = accState.value(); if (null == accumulators) { accumulators = function.createAccumulators(); } // set accumulators in context first function.setAccumulators(accumulators); // accumulate input row function.accumulate(input); // update the value of accumulators for future incremental computation accumulators = function.getAccumulators(); accState.update(accumulators); // prepare output row RowData aggValue = function.getValue(); output.replace(input, aggValue); out.collect(output); }
Example #11
Source File: ProcTimeRangeBoundedPrecedingFunction.java From flink with Apache License 2.0 | 5 votes |
private void registerCleanupTimer(
        KeyedProcessFunction<K, RowData, RowData>.Context ctx,
        long timestamp) throws Exception {
    // calculate safe timestamp to cleanup states
    long minCleanupTimestamp = timestamp + precedingTimeBoundary + 1;
    long maxCleanupTimestamp = timestamp + (long) (precedingTimeBoundary * 1.5) + 1;
    // update timestamp and register timer if needed
    Long curCleanupTimestamp = cleanupTsState.value();
    if (curCleanupTimestamp == null || curCleanupTimestamp < minCleanupTimestamp) {
        // we don't delete existing timer since it may delete timer for data processing
        // TODO Use timer with namespace to distinguish timers
        ctx.timerService().registerProcessingTimeTimer(maxCleanupTimestamp);
        cleanupTsState.update(maxCleanupTimestamp);
    }
}
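The arithmetic in `registerCleanupTimer` is easier to follow with concrete numbers. The sketch below reproduces only the min/max computation and the "register if needed" check in plain Java; `CleanupTimerSketch` is a hypothetical class, and its nullable `cleanupTs` field is an assumed stand-in for `cleanupTsState`.

```java
// Hypothetical plain-Java model of the cleanup-timer bookkeeping; not Flink API.
class CleanupTimerSketch {
    Long cleanupTs = null;              // stand-in for cleanupTsState; null means no timer yet
    final long precedingTimeBoundary;

    CleanupTimerSketch(long precedingTimeBoundary) {
        this.precedingTimeBoundary = precedingTimeBoundary;
    }

    /** Returns the newly registered cleanup timestamp, or null if the current one still suffices. */
    Long registerCleanupTimer(long timestamp) {
        long minCleanupTimestamp = timestamp + precedingTimeBoundary + 1;
        long maxCleanupTimestamp = timestamp + (long) (precedingTimeBoundary * 1.5) + 1;
        if (cleanupTs == null || cleanupTs < minCleanupTimestamp) {
            cleanupTs = maxCleanupTimestamp;
            return maxCleanupTimestamp;
        }
        return null;
    }

    public static void main(String[] args) {
        CleanupTimerSketch s = new CleanupTimerSketch(1000L);
        System.out.println(s.registerCleanupTimer(0L));   // 1501
        System.out.println(s.registerCleanupTimer(400L)); // null (1501 >= 1401)
        System.out.println(s.registerCleanupTimer(600L)); // 2101 (1501 < 1601)
    }
}
```

Registering at the maximum but comparing against the minimum means a new timer is only needed about once per half boundary, rather than once per element.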
Example #12
Source File: RowTimeRangeBoundedPrecedingFunction.java From flink with Apache License 2.0 | 5 votes |
@Override public void processElement( RowData input, KeyedProcessFunction<K, RowData, RowData>.Context ctx, Collector<RowData> out) throws Exception { // triggering timestamp for trigger calculation long triggeringTs = input.getLong(rowTimeIdx); Long lastTriggeringTs = lastTriggeringTsState.value(); if (lastTriggeringTs == null) { lastTriggeringTs = 0L; } // check if the data is expired, if not, save the data and register event time timer if (triggeringTs > lastTriggeringTs) { List<RowData> data = inputState.get(triggeringTs); if (null != data) { data.add(input); inputState.put(triggeringTs, data); } else { data = new ArrayList<RowData>(); data.add(input); inputState.put(triggeringTs, data); // register event time timer ctx.timerService().registerEventTimeTimer(triggeringTs); } registerCleanupTimer(ctx, triggeringTs); } }
Example #13
Source File: RowTimeRangeBoundedPrecedingFunction.java From flink with Apache License 2.0 | 5 votes |
private void registerCleanupTimer( KeyedProcessFunction<K, RowData, RowData>.Context ctx, long timestamp) throws Exception { // calculate safe timestamp to cleanup states long minCleanupTimestamp = timestamp + precedingOffset + 1; long maxCleanupTimestamp = timestamp + (long) (precedingOffset * 1.5) + 1; // update timestamp and register timer if needed Long curCleanupTimestamp = cleanupTsState.value(); if (curCleanupTimestamp == null || curCleanupTimestamp < minCleanupTimestamp) { // we don't delete existing timer since it may delete timer for data processing // TODO Use timer with namespace to distinguish timers ctx.timerService().registerEventTimeTimer(maxCleanupTimestamp); cleanupTsState.update(maxCleanupTimestamp); } }
Example #14
Source File: AbstractRowTimeUnboundedPrecedingOver.java From flink with Apache License 2.0 | 5 votes |
/**
 * Puts an element from the input stream into state if it is not late.
 * Registers a timer for the next watermark.
 *
 * @param input The input value.
 * @param ctx A {@link Context} that allows querying the timestamp of the element and getting
 *            TimerService for registering timers and querying the time. The
 *            context is only valid during the invocation of this method, do not store it.
 * @param out The collector for returning result values.
 * @throws Exception
 */
@Override
public void processElement(
        RowData input,
        KeyedProcessFunction<K, RowData, RowData>.Context ctx,
        Collector<RowData> out) throws Exception {
    // register state-cleanup timer
    registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());

    long timestamp = input.getLong(rowTimeIdx);
    long curWatermark = ctx.timerService().currentWatermark();

    // discard late record
    if (timestamp > curWatermark) {
        // ensure every key just registers one timer
        // the default watermark is Long.MIN_VALUE; to avoid overflow, use zero when the watermark is negative
        long triggerTs = curWatermark < 0 ? 0 : curWatermark + 1;
        ctx.timerService().registerEventTimeTimer(triggerTs);

        // put row into state
        List<RowData> rowList = inputState.get(timestamp);
        if (rowList == null) {
            rowList = new ArrayList<RowData>();
        }
        rowList.add(input);
        inputState.put(timestamp, rowList);
    }
}
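The two decisions made in this method, whether a row is late and which timestamp to register the timer for, are pure functions of the row timestamp and the current watermark. A minimal sketch in plain Java follows; `WatermarkTriggerSketch` and its method names are hypothetical helpers, not Flink API.

```java
// Hypothetical helpers mirroring the late check and trigger timestamp; not Flink API.
class WatermarkTriggerSketch {
    /** A row is late when its timestamp is not strictly greater than the watermark. */
    static boolean isLate(long timestamp, long curWatermark) {
        return timestamp <= curWatermark;
    }

    /** All rows seen under the same watermark map to the same trigger timestamp. */
    static long nextTriggerTs(long curWatermark) {
        return curWatermark < 0 ? 0 : curWatermark + 1;
    }

    public static void main(String[] args) {
        System.out.println(isLate(41L, 41L));              // true
        System.out.println(isLate(42L, 41L));              // false
        System.out.println(nextTriggerTs(Long.MIN_VALUE)); // 0
        System.out.println(nextTriggerTs(41L));            // 42
    }
}
```

Because the trigger timestamp depends only on the watermark, repeated registrations for the same key collapse into one timer, given that Flink's timer service deduplicates timers per key and timestamp.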
Example #15
Source File: AbstractRowTimeUnboundedPrecedingOver.java From flink with Apache License 2.0 | 5 votes |
/** * Puts an element from the input stream into state if it is not late. * Registers a timer for the next watermark. * * @param input The input value. * @param ctx A {@link Context} that allows querying the timestamp of the element and getting * TimerService for registering timers and querying the time. The * context is only valid during the invocation of this method, do not store it. * @param out The collector for returning result values. * @throws Exception */ @Override public void processElement( BaseRow input, KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx, Collector<BaseRow> out) throws Exception { // register state-cleanup timer registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime()); long timestamp = input.getLong(rowTimeIdx); long curWatermark = ctx.timerService().currentWatermark(); // discard late record if (timestamp > curWatermark) { // ensure every key just registers one timer // default watermark is Long.Min, avoid overflow we use zero when watermark < 0 long triggerTs = curWatermark < 0 ? 0 : curWatermark + 1; ctx.timerService().registerEventTimeTimer(triggerTs); // put row into state List<BaseRow> rowList = inputState.get(timestamp); if (rowList == null) { rowList = new ArrayList<BaseRow>(); } rowList.add(input); inputState.put(timestamp, rowList); } }
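Once the registered event-time timer fires, rows buffered by timestamp have to be replayed in timestamp order, while rows ahead of the watermark stay in state. The following is a minimal sketch of that draining step in plain Java, assuming a `HashMap` in place of the map state and a `TreeSet` in place of the function's sorted timestamp list; `SortedEmitSketch` and `drainUpTo` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical plain-Java model of ordered emission on timer firing; not Flink API.
class SortedEmitSketch {
    /** Removes and returns, in timestamp order, all rows with timestamp <= watermark. */
    static List<String> drainUpTo(Map<Long, List<String>> inputState, long watermark) {
        // collect eligible timestamps first, so the map is not modified while iterating it
        TreeSet<Long> sorted = new TreeSet<>();
        for (Long ts : inputState.keySet()) {
            if (ts <= watermark) {
                sorted.add(ts);
            }
        }
        List<String> emitted = new ArrayList<>();
        for (Long ts : sorted) {
            emitted.addAll(inputState.remove(ts));
        }
        return emitted;
    }

    public static void main(String[] args) {
        Map<Long, List<String>> state = new HashMap<>();
        state.put(30L, Arrays.asList("c"));
        state.put(10L, Arrays.asList("a1", "a2"));
        state.put(20L, Arrays.asList("b"));
        System.out.println(drainUpTo(state, 20L)); // [a1, a2, b]
        System.out.println(state.keySet());        // [30]
    }
}
```

Any timestamps left in the map after draining are the "early" rows, for which a further timer would be needed.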
Example #16
Source File: RowTimeRangeBoundedPrecedingFunction.java From flink with Apache License 2.0 | 5 votes |
@Override public void processElement( BaseRow input, KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx, Collector<BaseRow> out) throws Exception { // register state-cleanup timer registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime()); // triggering timestamp for trigger calculation long triggeringTs = input.getLong(rowTimeIdx); Long lastTriggeringTs = lastTriggeringTsState.value(); if (lastTriggeringTs == null) { lastTriggeringTs = 0L; } // check if the data is expired, if not, save the data and register event time timer if (triggeringTs > lastTriggeringTs) { List<BaseRow> data = inputState.get(triggeringTs); if (null != data) { data.add(input); inputState.put(triggeringTs, data); } else { data = new ArrayList<BaseRow>(); data.add(input); inputState.put(triggeringTs, data); // register event time timer ctx.timerService().registerEventTimeTimer(triggeringTs); } } }
Example #17
Source File: RowTimeRowsBoundedPrecedingFunction.java From flink with Apache License 2.0 | 5 votes |
@Override public void processElement( BaseRow input, KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx, Collector<BaseRow> out) throws Exception { // register state-cleanup timer registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime()); // triggering timestamp for trigger calculation long triggeringTs = input.getLong(rowTimeIdx); Long lastTriggeringTs = lastTriggeringTsState.value(); if (lastTriggeringTs == null) { lastTriggeringTs = 0L; } // check if the data is expired, if not, save the data and register event time timer if (triggeringTs > lastTriggeringTs) { List<BaseRow> data = inputState.get(triggeringTs); if (null != data) { data.add(input); inputState.put(triggeringTs, data); } else { data = new ArrayList<BaseRow>(); data.add(input); inputState.put(triggeringTs, data); // register event time timer ctx.timerService().registerEventTimeTimer(triggeringTs); } } }
Example #18
Source File: ProcTimeRowsBoundedPrecedingFunction.java From flink with Apache License 2.0 | 5 votes |
@Override public void onTimer( long timestamp, KeyedProcessFunction<K, BaseRow, BaseRow>.OnTimerContext ctx, Collector<BaseRow> out) throws Exception { if (stateCleaningEnabled) { cleanupState(inputState, accState, counterState, smallestTsState); function.cleanup(); } }
Example #19
Source File: ProcTimeUnboundedPrecedingFunction.java From flink with Apache License 2.0 | 5 votes |
@Override public void onTimer( long timestamp, KeyedProcessFunction<K, BaseRow, BaseRow>.OnTimerContext ctx, Collector<BaseRow> out) throws Exception { if (stateCleaningEnabled) { cleanupState(accState); function.cleanup(); } }
Example #20
Source File: ProcTimeUnboundedPrecedingFunction.java From flink with Apache License 2.0 | 5 votes |
@Override public void processElement( BaseRow input, KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx, Collector<BaseRow> out) throws Exception { // register state-cleanup timer registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime()); BaseRow accumulators = accState.value(); if (null == accumulators) { accumulators = function.createAccumulators(); } // set accumulators in context first function.setAccumulators(accumulators); // accumulate input row function.accumulate(input); // update the value of accumulators for future incremental computation accumulators = function.getAccumulators(); accState.update(accumulators); // prepare output row BaseRow aggValue = function.getValue(); output.replace(input, aggValue); out.collect(output); }
Example #21
Source File: ProcessFunctionTestHarnessesTest.java From flink with Apache License 2.0 | 5 votes |
@Test public void testHarnessForKeyedProcessFunction() throws Exception { KeyedProcessFunction<Integer, Integer, Integer> function = new KeyedProcessFunction<Integer, Integer, Integer>() { @Override public void processElement(Integer value, Context ctx, Collector<Integer> out) throws Exception { out.collect(value); } }; OneInputStreamOperatorTestHarness<Integer, Integer> harness = ProcessFunctionTestHarnesses .forKeyedProcessFunction(function, x -> x, BasicTypeInfo.INT_TYPE_INFO); harness.processElement(1, 10); assertEquals(harness.extractOutputValues(), Collections.singletonList(1)); }
Example #22
Source File: AbstractRowTimeUnboundedPrecedingOver.java From flink with Apache License 2.0 | 4 votes |
@Override
public void onTimer(
        long timestamp,
        KeyedProcessFunction<K, RowData, RowData>.OnTimerContext ctx,
        Collector<RowData> out) throws Exception {
    if (isProcessingTimeTimer(ctx)) {
        if (stateCleaningEnabled) {
            // we check whether there are still records which have not been processed yet
            if (inputState.isEmpty()) {
                // we clean the state
                cleanupState(inputState, accState);
                function.cleanup();
            } else {
                // There are records left to process because a watermark has not been received yet.
                // This would only happen if the input stream has stopped. So we don't need to clean up.
                // We leave the state as it is and schedule a new cleanup timer
                registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());
            }
        }
        return;
    }

    Iterator<Long> keyIterator = inputState.keys().iterator();
    if (keyIterator.hasNext()) {
        Long curWatermark = ctx.timerService().currentWatermark();
        boolean existEarlyRecord = false;

        // sort the record timestamps
        do {
            Long recordTime = keyIterator.next();
            // only take timestamps smaller/equal to the watermark
            if (recordTime <= curWatermark) {
                insertToSortedList(recordTime);
            } else {
                existEarlyRecord = true;
            }
        } while (keyIterator.hasNext());

        // get last accumulator
        RowData lastAccumulator = accState.value();
        if (lastAccumulator == null) {
            // initialize accumulator
            lastAccumulator = function.createAccumulators();
        }
        // set accumulator in function context first
        function.setAccumulators(lastAccumulator);

        // emit the rows in order
        while (!sortedTimestamps.isEmpty()) {
            Long curTimestamp = sortedTimestamps.removeFirst();
            List<RowData> curRowList = inputState.get(curTimestamp);
            if (curRowList != null) {
                // process the rows that share this timestamp; the mechanism differs between ROWS and RANGE
                processElementsWithSameTimestamp(curRowList, out);
            } else {
                // Ignore the rows with this timestamp if the state is cleared already.
                LOG.warn("The state is cleared because of state ttl. "
                        + "This will result in incorrect result. "
                        + "You can increase the state ttl to avoid this.");
            }
            inputState.remove(curTimestamp);
        }

        // update acc state
        lastAccumulator = function.getAccumulators();
        accState.update(lastAccumulator);

        // if there are rows with timestamp > watermark, register a timer for the next watermark
        if (existEarlyRecord) {
            ctx.timerService().registerEventTimeTimer(curWatermark + 1);
        }
    }

    // update cleanup timer
    registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());
}
Example #23
Source File: KeyedProcessOperator.java From flink with Apache License 2.0 | 4 votes |
public KeyedProcessOperator(KeyedProcessFunction<K, IN, OUT> function) { super(function); chainingStrategy = ChainingStrategy.ALWAYS; }
Example #24
Source File: KeyedProcessOperator.java From flink with Apache License 2.0 | 4 votes |
ContextImpl(KeyedProcessFunction<K, IN, OUT> function, TimerService timerService) { function.super(); this.timerService = checkNotNull(timerService); }
Example #25
Source File: KeyedProcessOperator.java From flink with Apache License 2.0 | 4 votes |
OnTimerContextImpl(KeyedProcessFunction<K, IN, OUT> function, TimerService timerService) { function.super(); this.timerService = checkNotNull(timerService); }
Example #26
Source File: KeyedProcessOperatorTest.java From flink with Apache License 2.0 | 4 votes |
@Test public void testKeyQuerying() throws Exception { class KeyQueryingProcessFunction extends KeyedProcessFunction<Integer, Tuple2<Integer, String>, String> { @Override public void processElement( Tuple2<Integer, String> value, Context ctx, Collector<String> out) throws Exception { assertTrue("Did not get expected key.", ctx.getCurrentKey().equals(value.f0)); // we check that we receive this output, to ensure that the assert was actually checked out.collect(value.f1); } } KeyedProcessOperator<Integer, Tuple2<Integer, String>, String> operator = new KeyedProcessOperator<>(new KeyQueryingProcessFunction()); try ( OneInputStreamOperatorTestHarness<Tuple2<Integer, String>, String> testHarness = new KeyedOneInputStreamOperatorTestHarness<>(operator, (in) -> in.f0 , BasicTypeInfo.INT_TYPE_INFO)) { testHarness.setup(); testHarness.open(); testHarness.processElement(new StreamRecord<>(Tuple2.of(5, "5"), 12L)); testHarness.processElement(new StreamRecord<>(Tuple2.of(42, "42"), 13L)); ConcurrentLinkedQueue<Object> expectedOutput = new ConcurrentLinkedQueue<>(); expectedOutput.add(new StreamRecord<>("5", 12L)); expectedOutput.add(new StreamRecord<>("42", 13L)); TestHarnessUtil.assertOutputEquals( "Output was not correct.", expectedOutput, testHarness.getOutput()); } }
Example #28
Source File: ProcTimeRowsBoundedPrecedingFunction.java From flink with Apache License 2.0 | 4 votes |
@Override
public void processElement(
        RowData input,
        KeyedProcessFunction<K, RowData, RowData>.Context ctx,
        Collector<RowData> out) throws Exception {
    long currentTime = ctx.timerService().currentProcessingTime();
    // register state-cleanup timer
    registerProcessingCleanupTimer(ctx, currentTime);

    // initialize state for the processed element
    RowData accumulators = accState.value();
    if (accumulators == null) {
        accumulators = function.createAccumulators();
    }
    // set accumulators in context first
    function.setAccumulators(accumulators);

    // get smallest timestamp
    Long smallestTs = smallestTsState.value();
    if (smallestTs == null) {
        smallestTs = currentTime;
        smallestTsState.update(smallestTs);
    }
    // get previous counter value
    Long counter = counterState.value();
    if (counter == null) {
        counter = 0L;
    }

    if (counter == precedingOffset) {
        List<RowData> retractList = inputState.get(smallestTs);
        if (retractList != null) {
            // get oldest element beyond buffer size
            // and if oldest element exist, retract value
            RowData retractRow = retractList.get(0);
            function.retract(retractRow);
            retractList.remove(0);
        } else {
            // Does not retract values which are outside of window if the state is cleared already.
            LOG.warn("The state is cleared because of state ttl. "
                    + "This will result in incorrect result. "
                    + "You can increase the state ttl to avoid this.");
        }
        // if reference timestamp list not empty, keep the list
        if (retractList != null && !retractList.isEmpty()) {
            inputState.put(smallestTs, retractList);
        } // if smallest timestamp list is empty, remove and find new smallest
        else {
            inputState.remove(smallestTs);
            Iterator<Long> iter = inputState.keys().iterator();
            long currentTs = 0L;
            long newSmallestTs = Long.MAX_VALUE;
            while (iter.hasNext()) {
                currentTs = iter.next();
                if (currentTs < newSmallestTs) {
                    newSmallestTs = currentTs;
                }
            }
            smallestTsState.update(newSmallestTs);
        }
    } // we update the counter only while buffer is getting filled
    else {
        counter += 1;
        counterState.update(counter);
    }

    // update map state, counter and timestamp
    List<RowData> currentTimeState = inputState.get(currentTime);
    if (currentTimeState != null) {
        currentTimeState.add(input);
        inputState.put(currentTime, currentTimeState);
    } else {
        // add new input
        List<RowData> newList = new ArrayList<RowData>();
        newList.add(input);
        inputState.put(currentTime, newList);
    }

    // accumulate current row
    function.accumulate(input);
    // update the value of accumulators for future incremental computation
    accumulators = function.getAccumulators();
    accState.update(accumulators);

    // prepare output row
    RowData aggValue = function.getValue();
    output.replace(input, aggValue);
    out.collect(output);
}
Example #29
Source File: StreamBookmarker.java From pravega-samples with Apache License 2.0 | 4 votes |
public static void main(String[] args) throws Exception { // Initialize the parameter utility tool in order to retrieve input parameters. ParameterTool params = ParameterTool.fromArgs(args); // Clients will contact with the Pravega controller to get information about Streams. URI pravegaControllerURI = URI.create(params.get(Constants.CONTROLLER_ADDRESS_PARAM, Constants.CONTROLLER_ADDRESS)); PravegaConfig pravegaConfig = PravegaConfig .fromParams(params) .withControllerURI(pravegaControllerURI) .withDefaultScope(Constants.DEFAULT_SCOPE); // Create the scope if it is not present. StreamManager streamManager = StreamManager.create(pravegaControllerURI); streamManager.createScope(Constants.DEFAULT_SCOPE); // Create the Pravega source to read from data produced by DataProducer. Stream sensorEvents = Utils.createStream(pravegaConfig, Constants.PRODUCER_STREAM); SourceFunction<Tuple2<Integer, Double>> reader = FlinkPravegaReader.<Tuple2<Integer, Double>>builder() .withPravegaConfig(pravegaConfig) .forStream(sensorEvents) .withReaderGroupName(READER_GROUP_NAME) .withDeserializationSchema(new Tuple2DeserializationSchema()) .build(); // Create the Pravega sink to output the stream cuts representing slices to analyze. Stream streamCutsStream = Utils.createStream(pravegaConfig, Constants.STREAMCUTS_STREAM); SinkFunction<SensorStreamSlice> writer = FlinkPravegaWriter.<SensorStreamSlice>builder() .withPravegaConfig(pravegaConfig) .forStream(streamCutsStream) .withSerializationSchema(PravegaSerialization.serializationFor(SensorStreamSlice.class)) .withEventRouter(new EventRouter()) .build(); // Initialize the Flink execution environment. 
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment() .enableCheckpointing(CHECKPOINT_INTERVAL); env.getCheckpointConfig().setCheckpointTimeout(CHECKPOINT_INTERVAL); GuavaImmutableMapSerializer.registerSerializers(env.getConfig()); // Bookmark those sections of the stream with values < 0 and write the output (StreamCuts). DataStreamSink<SensorStreamSlice> dataStreamSink = env.addSource(reader) .setParallelism(Constants.PARALLELISM) .keyBy(0) .process((KeyedProcessFunction) new Bookmarker(pravegaControllerURI)) .addSink(writer); // Execute within the Flink environment. env.execute("StreamBookmarker"); LOG.info("Ending StreamBookmarker..."); }