org.apache.flink.streaming.api.functions.KeyedProcessFunction Java Examples
The following examples show how to use
org.apache.flink.streaming.api.functions.KeyedProcessFunction.
Example #1
Source File: ProcessFunctionTestHarnesses.java From flink with Apache License 2.0
/**
 * Returns an initialized test harness for {@link KeyedProcessFunction}.
 *
 * @param function instance of a {@link KeyedProcessFunction} under test
 * @param keySelector extracts the key from each input element
 * @param keyType type information of the key
 * @param <K> key type
 * @param <IN> type of input stream elements
 * @param <OUT> type of output stream elements
 * @return {@link KeyedOneInputStreamOperatorTestHarness} wrapped around {@code function}
 */
public static <K, IN, OUT> KeyedOneInputStreamOperatorTestHarness<K, IN, OUT> forKeyedProcessFunction(
        final KeyedProcessFunction<K, IN, OUT> function,
        final KeySelector<IN, K> keySelector,
        final TypeInformation<K> keyType) throws Exception {
    KeyedOneInputStreamOperatorTestHarness<K, IN, OUT> testHarness =
        new KeyedOneInputStreamOperatorTestHarness<>(
            new KeyedProcessOperator<>(Preconditions.checkNotNull(function)),
            keySelector,
            keyType,
            1,
            1,
            0);
    testHarness.open();
    return testHarness;
}
Example #2
Source File: ProcTimeRangeBoundedPrecedingFunction.java From flink with Apache License 2.0
@Override
public void processElement(
RowData input,
KeyedProcessFunction<K, RowData, RowData>.Context ctx,
Collector<RowData> out) throws Exception {
long currentTime = ctx.timerService().currentProcessingTime();
// buffer the incoming event
// add current element to the window list of elements with corresponding timestamp
List<RowData> rowList = inputState.get(currentTime);
// null value means that this is the first event received for this timestamp
if (rowList == null) {
rowList = new ArrayList<RowData>();
// register timer to process event once the current millisecond passed
ctx.timerService().registerProcessingTimeTimer(currentTime + 1);
registerCleanupTimer(ctx, currentTime);
}
rowList.add(input);
inputState.put(currentTime, rowList);
}
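The buffering pattern in Example #2 can be tried without a Flink cluster. The following is a minimal plain-Java sketch, not the Flink implementation: the class name `TimestampBufferSketch`, the `HashMap` standing in for the `inputState` map state, and the `TreeSet` standing in for the timer service are all assumptions made for illustration. It shows that a timer is registered only for the first row seen in a given millisecond, while later rows with the same timestamp are just appended to the list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Plain-Java model of the per-timestamp buffering in processElement (no Flink types). */
class TimestampBufferSketch {
    // Hypothetical stand-in for the map state named inputState in the example.
    final Map<Long, List<String>> inputState = new HashMap<>();
    // Hypothetical stand-in for timers registered through ctx.timerService().
    final TreeSet<Long> timers = new TreeSet<>();

    /** Buffers a row under currentTime; registers a timer only for the first row of that millisecond. */
    void processElement(String row, long currentTime) {
        List<String> rowList = inputState.get(currentTime);
        if (rowList == null) {
            rowList = new ArrayList<>();
            // fire once the current millisecond has passed
            timers.add(currentTime + 1);
        }
        rowList.add(row);
        inputState.put(currentTime, rowList);
    }

    public static void main(String[] args) {
        TimestampBufferSketch sketch = new TimestampBufferSketch();
        sketch.processElement("a", 100L);
        sketch.processElement("b", 100L);
        sketch.processElement("c", 101L);
        System.out.println(sketch.inputState.get(100L)); // [a, b]
        System.out.println(sketch.timers);               // [101, 102]
    }
}
```

Three rows arriving in two distinct milliseconds produce two timers, one per buffered timestamp.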
Example #3
Source File: KeyedStream.java From Flink-CEPplus with Apache License 2.0
/**
 * Applies the given {@link KeyedProcessFunction} on the input stream, thereby creating a transformed output stream.
 *
 * <p>The function will be called for every element in the input streams and can produce zero
 * or more output elements. Contrary to the {@link DataStream#flatMap(FlatMapFunction)}
 * function, this function can also query the time and set timers. When reacting to the firing
 * of set timers the function can directly emit elements and/or register yet more timers.
 *
 * @param keyedProcessFunction The {@link KeyedProcessFunction} that is called for each element in the stream.
 *
 * @param <R> The type of elements emitted by the {@code KeyedProcessFunction}.
 *
 * @return The transformed {@link DataStream}.
 */
@PublicEvolving
public <R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY, T, R> keyedProcessFunction) {
    TypeInformation<R> outType = TypeExtractor.getUnaryOperatorReturnType(
        keyedProcessFunction,
        KeyedProcessFunction.class,
        1,
        2,
        TypeExtractor.NO_INDEX,
        getType(),
        Utils.getCallLocationName(),
        true);
    return process(keyedProcessFunction, outType);
}
Example #4
Source File: ProcTimeRangeBoundedPrecedingFunction.java From flink with Apache License 2.0
@Override
public void processElement(
BaseRow input,
KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx,
Collector<BaseRow> out) throws Exception {
long currentTime = ctx.timerService().currentProcessingTime();
// register state-cleanup timer
registerProcessingCleanupTimer(ctx, currentTime);
// buffer the incoming event
// add current element to the window list of elements with corresponding timestamp
List<BaseRow> rowList = inputState.get(currentTime);
// null value means that this is the first event received for this timestamp
if (rowList == null) {
rowList = new ArrayList<BaseRow>();
// register timer to process event once the current millisecond passed
ctx.timerService().registerProcessingTimeTimer(currentTime + 1);
}
rowList.add(input);
inputState.put(currentTime, rowList);
}
Example #5
Source File: KeyedStream.java From flink with Apache License 2.0
/**
 * Applies the given {@link KeyedProcessFunction} on the input stream, thereby creating a transformed output stream.
 *
 * <p>The function will be called for every element in the input streams and can produce zero
 * or more output elements. Contrary to the {@link DataStream#flatMap(FlatMapFunction)}
 * function, this function can also query the time and set timers. When reacting to the firing
 * of set timers the function can directly emit elements and/or register yet more timers.
 *
 * @param keyedProcessFunction The {@link KeyedProcessFunction} that is called for each element in the stream.
 *
 * @param <R> The type of elements emitted by the {@code KeyedProcessFunction}.
 *
 * @return The transformed {@link DataStream}.
 */
@PublicEvolving
public <R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY, T, R> keyedProcessFunction) {
    TypeInformation<R> outType = TypeExtractor.getUnaryOperatorReturnType(
        keyedProcessFunction,
        KeyedProcessFunction.class,
        1,
        2,
        TypeExtractor.NO_INDEX,
        getType(),
        Utils.getCallLocationName(),
        true);
    return process(keyedProcessFunction, outType);
}
Example #7
Source File: RowTimeRowsBoundedPrecedingFunction.java From flink with Apache License 2.0
@Override
public void processElement(
RowData input,
KeyedProcessFunction<K, RowData, RowData>.Context ctx,
Collector<RowData> out) throws Exception {
// register state-cleanup timer
registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());
// triggering timestamp for trigger calculation
long triggeringTs = input.getLong(rowTimeIdx);
Long lastTriggeringTs = lastTriggeringTsState.value();
if (lastTriggeringTs == null) {
lastTriggeringTs = 0L;
}
// check if the data is expired, if not, save the data and register event time timer
if (triggeringTs > lastTriggeringTs) {
List<RowData> data = inputState.get(triggeringTs);
if (null != data) {
data.add(input);
inputState.put(triggeringTs, data);
} else {
data = new ArrayList<RowData>();
data.add(input);
inputState.put(triggeringTs, data);
// register event time timer
ctx.timerService().registerEventTimeTimer(triggeringTs);
}
}
}
Example #8
Source File: ProcTimeRowsBoundedPrecedingFunction.java From flink with Apache License 2.0
@Override
public void onTimer(
long timestamp,
KeyedProcessFunction<K, RowData, RowData>.OnTimerContext ctx,
Collector<RowData> out) throws Exception {
if (stateCleaningEnabled) {
cleanupState(inputState, accState, counterState, smallestTsState);
function.cleanup();
}
}
Example #9
Source File: ProcTimeUnboundedPrecedingFunction.java From flink with Apache License 2.0
@Override
public void onTimer(
long timestamp,
KeyedProcessFunction<K, RowData, RowData>.OnTimerContext ctx,
Collector<RowData> out) throws Exception {
if (stateCleaningEnabled) {
cleanupState(accState);
function.cleanup();
}
}
Example #10
Source File: ProcTimeUnboundedPrecedingFunction.java From flink with Apache License 2.0
@Override
public void processElement(
RowData input,
KeyedProcessFunction<K, RowData, RowData>.Context ctx,
Collector<RowData> out) throws Exception {
// register state-cleanup timer
registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());
RowData accumulators = accState.value();
if (null == accumulators) {
accumulators = function.createAccumulators();
}
// set accumulators in context first
function.setAccumulators(accumulators);
// accumulate input row
function.accumulate(input);
// update the value of accumulators for future incremental computation
accumulators = function.getAccumulators();
accState.update(accumulators);
// prepare output row
RowData aggValue = function.getValue();
output.replace(input, aggValue);
out.collect(output);
}
Example #11
Source File: ProcTimeRangeBoundedPrecedingFunction.java From flink with Apache License 2.0
private void registerCleanupTimer(
KeyedProcessFunction<K, RowData, RowData>.Context ctx,
long timestamp) throws Exception {
// calculate safe timestamp to cleanup states
long minCleanupTimestamp = timestamp + precedingTimeBoundary + 1;
long maxCleanupTimestamp = timestamp + (long) (precedingTimeBoundary * 1.5) + 1;
// update timestamp and register timer if needed
Long curCleanupTimestamp = cleanupTsState.value();
if (curCleanupTimestamp == null || curCleanupTimestamp < minCleanupTimestamp) {
// we don't delete existing timer since it may delete timer for data processing
// TODO Use timer with namespace to distinguish timers
ctx.timerService().registerProcessingTimeTimer(maxCleanupTimestamp);
cleanupTsState.update(maxCleanupTimestamp);
}
}
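The cleanup-timer arithmetic in Example #11 can be isolated from Flink to see when a new timer is actually registered. The sketch below is a plain-Java illustration under stated assumptions: the class `CleanupTimerSketch` and the method `nextCleanupTimestamp` are hypothetical names, and the `Long current` parameter stands in for the value read from `cleanupTsState`. A new cleanup timestamp is chosen only when none exists or the existing one falls before the minimum safe timestamp.

```java
/** Plain-Java model of the cleanup-timer arithmetic (no Flink types). */
class CleanupTimerSketch {
    /**
     * Returns the cleanup timestamp to keep after seeing a record at timestamp.
     * current is the previously stored cleanup timestamp, or null if none exists.
     */
    static long nextCleanupTimestamp(Long current, long timestamp, long precedingTimeBoundary) {
        // earliest point at which the state for this record may be dropped
        long minCleanupTimestamp = timestamp + precedingTimeBoundary + 1;
        // timer target; the 1.5x slack avoids re-registering on every record
        long maxCleanupTimestamp = timestamp + (long) (precedingTimeBoundary * 1.5) + 1;
        if (current == null || current < minCleanupTimestamp) {
            return maxCleanupTimestamp;
        }
        return current;
    }

    public static void main(String[] args) {
        long boundary = 1000L;
        Long cleanup = null;
        for (long ts : new long[] {0L, 100L, 600L}) {
            cleanup = nextCleanupTimestamp(cleanup, ts, boundary);
            System.out.println("record at " + ts + " -> cleanup at " + cleanup);
        }
    }
}
```

With a boundary of 1000 ms, records at 0, 100 and 600 yield cleanup timestamps 1501, 1501 and 2101: the second record reuses the existing timer because 1501 is not below its minimum of 1101.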
Example #12
Source File: RowTimeRangeBoundedPrecedingFunction.java From flink with Apache License 2.0
@Override
public void processElement(
RowData input,
KeyedProcessFunction<K, RowData, RowData>.Context ctx,
Collector<RowData> out) throws Exception {
// triggering timestamp for trigger calculation
long triggeringTs = input.getLong(rowTimeIdx);
Long lastTriggeringTs = lastTriggeringTsState.value();
if (lastTriggeringTs == null) {
lastTriggeringTs = 0L;
}
// check if the data is expired, if not, save the data and register event time timer
if (triggeringTs > lastTriggeringTs) {
List<RowData> data = inputState.get(triggeringTs);
if (null != data) {
data.add(input);
inputState.put(triggeringTs, data);
} else {
data = new ArrayList<RowData>();
data.add(input);
inputState.put(triggeringTs, data);
// register event time timer
ctx.timerService().registerEventTimeTimer(triggeringTs);
}
registerCleanupTimer(ctx, triggeringTs);
}
}
Example #13
Source File: RowTimeRangeBoundedPrecedingFunction.java From flink with Apache License 2.0
private void registerCleanupTimer(
KeyedProcessFunction<K, RowData, RowData>.Context ctx,
long timestamp) throws Exception {
// calculate safe timestamp to cleanup states
long minCleanupTimestamp = timestamp + precedingOffset + 1;
long maxCleanupTimestamp = timestamp + (long) (precedingOffset * 1.5) + 1;
// update timestamp and register timer if needed
Long curCleanupTimestamp = cleanupTsState.value();
if (curCleanupTimestamp == null || curCleanupTimestamp < minCleanupTimestamp) {
// we don't delete existing timer since it may delete timer for data processing
// TODO Use timer with namespace to distinguish timers
ctx.timerService().registerEventTimeTimer(maxCleanupTimestamp);
cleanupTsState.update(maxCleanupTimestamp);
}
}
Example #14
Source File: AbstractRowTimeUnboundedPrecedingOver.java From flink with Apache License 2.0
/**
* Puts an element from the input stream into state if it is not late.
* Registers a timer for the next watermark.
*
* @param input The input value.
* @param ctx A {@link Context} that allows querying the timestamp of the element and getting
* TimerService for registering timers and querying the time. The
* context is only valid during the invocation of this method, do not store it.
* @param out The collector for returning result values.
* @throws Exception
*/
@Override
public void processElement(
RowData input,
KeyedProcessFunction<K, RowData, RowData>.Context ctx,
Collector<RowData> out) throws Exception {
// register state-cleanup timer
registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());
long timestamp = input.getLong(rowTimeIdx);
long curWatermark = ctx.timerService().currentWatermark();
// discard late record
if (timestamp > curWatermark) {
// ensure every key just registers one timer
// the default watermark is Long.MIN_VALUE; to avoid overflow, we use zero when watermark < 0
long triggerTs = curWatermark < 0 ? 0 : curWatermark + 1;
ctx.timerService().registerEventTimeTimer(triggerTs);
// put row into state
List<RowData> rowList = inputState.get(timestamp);
if (rowList == null) {
rowList = new ArrayList<RowData>();
}
rowList.add(input);
inputState.put(timestamp, rowList);
}
}
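The late-record check and the timer timestamp in Example #14 reduce to two small pure functions. The following plain-Java sketch uses hypothetical names (`WatermarkTimerSketch`, `isOnTime`, `triggerTimestamp`) and models only the arithmetic, not the state handling or the Flink timer service.

```java
/** Plain-Java model of the late-record check and timer timestamp (no Flink types). */
class WatermarkTimerSketch {
    /** A record is kept only if its timestamp is strictly greater than the current watermark. */
    static boolean isOnTime(long timestamp, long curWatermark) {
        return timestamp > curWatermark;
    }

    /** Timer timestamp for the next watermark; zero while the watermark is still negative. */
    static long triggerTimestamp(long curWatermark) {
        return curWatermark < 0 ? 0 : curWatermark + 1;
    }

    public static void main(String[] args) {
        System.out.println(triggerTimestamp(Long.MIN_VALUE)); // 0
        System.out.println(triggerTimestamp(99L));            // 100
        System.out.println(isOnTime(99L, 99L));               // false
        System.out.println(isOnTime(100L, 99L));              // true
    }
}
```

Every on-time record for a key computes the same trigger timestamp until the watermark advances, which is what the original comment "ensure every key just registers one timer" relies on.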
Example #15
Source File: AbstractRowTimeUnboundedPrecedingOver.java From flink with Apache License 2.0
/**
* Puts an element from the input stream into state if it is not late.
* Registers a timer for the next watermark.
*
* @param input The input value.
* @param ctx A {@link Context} that allows querying the timestamp of the element and getting
* TimerService for registering timers and querying the time. The
* context is only valid during the invocation of this method, do not store it.
* @param out The collector for returning result values.
* @throws Exception
*/
@Override
public void processElement(
BaseRow input,
KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx,
Collector<BaseRow> out) throws Exception {
// register state-cleanup timer
registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());
long timestamp = input.getLong(rowTimeIdx);
long curWatermark = ctx.timerService().currentWatermark();
// discard late record
if (timestamp > curWatermark) {
// ensure every key just registers one timer
// the default watermark is Long.MIN_VALUE; to avoid overflow, we use zero when watermark < 0
long triggerTs = curWatermark < 0 ? 0 : curWatermark + 1;
ctx.timerService().registerEventTimeTimer(triggerTs);
// put row into state
List<BaseRow> rowList = inputState.get(timestamp);
if (rowList == null) {
rowList = new ArrayList<BaseRow>();
}
rowList.add(input);
inputState.put(timestamp, rowList);
}
}
Example #16
Source File: RowTimeRangeBoundedPrecedingFunction.java From flink with Apache License 2.0
@Override
public void processElement(
BaseRow input,
KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx,
Collector<BaseRow> out) throws Exception {
// register state-cleanup timer
registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());
// triggering timestamp for trigger calculation
long triggeringTs = input.getLong(rowTimeIdx);
Long lastTriggeringTs = lastTriggeringTsState.value();
if (lastTriggeringTs == null) {
lastTriggeringTs = 0L;
}
// check if the data is expired, if not, save the data and register event time timer
if (triggeringTs > lastTriggeringTs) {
List<BaseRow> data = inputState.get(triggeringTs);
if (null != data) {
data.add(input);
inputState.put(triggeringTs, data);
} else {
data = new ArrayList<BaseRow>();
data.add(input);
inputState.put(triggeringTs, data);
// register event time timer
ctx.timerService().registerEventTimeTimer(triggeringTs);
}
}
}
Example #17
Source File: RowTimeRowsBoundedPrecedingFunction.java From flink with Apache License 2.0
@Override
public void processElement(
BaseRow input,
KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx,
Collector<BaseRow> out) throws Exception {
// register state-cleanup timer
registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());
// triggering timestamp for trigger calculation
long triggeringTs = input.getLong(rowTimeIdx);
Long lastTriggeringTs = lastTriggeringTsState.value();
if (lastTriggeringTs == null) {
lastTriggeringTs = 0L;
}
// check if the data is expired, if not, save the data and register event time timer
if (triggeringTs > lastTriggeringTs) {
List<BaseRow> data = inputState.get(triggeringTs);
if (null != data) {
data.add(input);
inputState.put(triggeringTs, data);
} else {
data = new ArrayList<BaseRow>();
data.add(input);
inputState.put(triggeringTs, data);
// register event time timer
ctx.timerService().registerEventTimeTimer(triggeringTs);
}
}
}
Example #18
Source File: ProcTimeRowsBoundedPrecedingFunction.java From flink with Apache License 2.0
@Override
public void onTimer(
long timestamp,
KeyedProcessFunction<K, BaseRow, BaseRow>.OnTimerContext ctx,
Collector<BaseRow> out) throws Exception {
if (stateCleaningEnabled) {
cleanupState(inputState, accState, counterState, smallestTsState);
function.cleanup();
}
}
Example #19
Source File: ProcTimeUnboundedPrecedingFunction.java From flink with Apache License 2.0
@Override
public void onTimer(
long timestamp,
KeyedProcessFunction<K, BaseRow, BaseRow>.OnTimerContext ctx,
Collector<BaseRow> out) throws Exception {
if (stateCleaningEnabled) {
cleanupState(accState);
function.cleanup();
}
}
Example #20
Source File: ProcTimeUnboundedPrecedingFunction.java From flink with Apache License 2.0
@Override
public void processElement(
BaseRow input,
KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx,
Collector<BaseRow> out) throws Exception {
// register state-cleanup timer
registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());
BaseRow accumulators = accState.value();
if (null == accumulators) {
accumulators = function.createAccumulators();
}
// set accumulators in context first
function.setAccumulators(accumulators);
// accumulate input row
function.accumulate(input);
// update the value of accumulators for future incremental computation
accumulators = function.getAccumulators();
accState.update(accumulators);
// prepare output row
BaseRow aggValue = function.getValue();
output.replace(input, aggValue);
out.collect(output);
}
Example #21
Source File: ProcessFunctionTestHarnessesTest.java From flink with Apache License 2.0
@Test
public void testHarnessForKeyedProcessFunction() throws Exception {
    // identity function: forwards every input element unchanged
    KeyedProcessFunction<Integer, Integer, Integer> function =
        new KeyedProcessFunction<Integer, Integer, Integer>() {
            @Override
            public void processElement(Integer value, Context ctx, Collector<Integer> out) throws Exception {
                out.collect(value);
            }
        };
    OneInputStreamOperatorTestHarness<Integer, Integer> harness = ProcessFunctionTestHarnesses
        .forKeyedProcessFunction(function, x -> x, BasicTypeInfo.INT_TYPE_INFO);
    harness.processElement(1, 10);
    // JUnit convention: expected value first, actual value second
    assertEquals(Collections.singletonList(1), harness.extractOutputValues());
}
Example #22
Source File: AbstractRowTimeUnboundedPrecedingOver.java From flink with Apache License 2.0
@Override
public void onTimer(
long timestamp,
KeyedProcessFunction<K, RowData, RowData>.OnTimerContext ctx,
Collector<RowData> out) throws Exception {
if (isProcessingTimeTimer(ctx)) {
if (stateCleaningEnabled) {
// we check whether there are still records which have not been processed yet
if (inputState.isEmpty()) {
// we clean the state
cleanupState(inputState, accState);
function.cleanup();
} else {
// There are records left to process because a watermark has not been received yet.
// This would only happen if the input stream has stopped. So we don't need to clean up.
// We leave the state as it is and schedule a new cleanup timer
registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());
}
}
return;
}
Iterator<Long> keyIterator = inputState.keys().iterator();
if (keyIterator.hasNext()) {
Long curWatermark = ctx.timerService().currentWatermark();
boolean existEarlyRecord = false;
// sort the record timestamps
do {
Long recordTime = keyIterator.next();
// only take timestamps smaller/equal to the watermark
if (recordTime <= curWatermark) {
insertToSortedList(recordTime);
} else {
existEarlyRecord = true;
}
} while (keyIterator.hasNext());
// get last accumulator
RowData lastAccumulator = accState.value();
if (lastAccumulator == null) {
// initialize accumulator
lastAccumulator = function.createAccumulators();
}
// set accumulator in function context first
function.setAccumulators(lastAccumulator);
// emit the rows in order
while (!sortedTimestamps.isEmpty()) {
Long curTimestamp = sortedTimestamps.removeFirst();
List<RowData> curRowList = inputState.get(curTimestamp);
if (curRowList != null) {
// process all rows with the same timestamp; the mechanism differs between ROWS and RANGE
processElementsWithSameTimestamp(curRowList, out);
} else {
// Ignore the rows for this timestamp if the state has already been cleared.
LOG.warn("The state is cleared because of state ttl. " +
"This will result in incorrect result. " +
"You can increase the state ttl to avoid this.");
}
inputState.remove(curTimestamp);
}
// update acc state
lastAccumulator = function.getAccumulators();
accState.update(lastAccumulator);
// if there are rows with timestamp > watermark, register a timer for the next watermark
if (existEarlyRecord) {
ctx.timerService().registerEventTimeTimer(curWatermark + 1);
}
}
// update cleanup timer
registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());
}
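The event-time branch of the `onTimer` method in Example #22 boils down to: pick the buffered timestamps that are not beyond the watermark, process them in ascending order, and leave the rest buffered. The sketch below models that in plain Java. It is an illustration rather than the Flink code: the class `OrderedEmitSketch` and the method `emitUpTo` are hypothetical, a `HashMap` stands in for the map state, and a `TreeSet` is swapped in for the example's `insertToSortedList` / `sortedTimestamps` pair.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Plain-Java model of the event-time branch of onTimer (no Flink types). */
class OrderedEmitSketch {
    /**
     * Emits buffered rows whose timestamp is <= watermark, in timestamp order,
     * removes them from the buffer, and returns the emitted rows.
     */
    static List<String> emitUpTo(Map<Long, List<String>> inputState, long watermark) {
        // collect and sort the eligible timestamps; later ones wait for a future timer
        TreeSet<Long> sortedTimestamps = new TreeSet<>();
        for (Long recordTime : inputState.keySet()) {
            if (recordTime <= watermark) {
                sortedTimestamps.add(recordTime);
            }
        }
        List<String> out = new ArrayList<>();
        for (Long ts : sortedTimestamps) {
            // remove() returns the buffered rows and clears the entry in one step
            out.addAll(inputState.remove(ts));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Long, List<String>> state = new HashMap<>();
        state.put(30L, new ArrayList<>(Arrays.asList("c")));
        state.put(10L, new ArrayList<>(Arrays.asList("a")));
        state.put(20L, new ArrayList<>(Arrays.asList("b1", "b2")));
        System.out.println(emitUpTo(state, 20L)); // [a, b1, b2]
        System.out.println(state.keySet());       // [30]
    }
}
```

The row at timestamp 30 stays in the buffer, which corresponds to the `existEarlyRecord` case in which the example registers another event-time timer.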
Example #23
Source File: KeyedProcessOperator.java From flink with Apache License 2.0
public KeyedProcessOperator(KeyedProcessFunction<K, IN, OUT> function) {
super(function);
chainingStrategy = ChainingStrategy.ALWAYS;
}
Example #24
Source File: KeyedProcessOperator.java From flink with Apache License 2.0
ContextImpl(KeyedProcessFunction<K, IN, OUT> function, TimerService timerService) {
function.super();
this.timerService = checkNotNull(timerService);
}
Example #25
Source File: KeyedProcessOperator.java From flink with Apache License 2.0
OnTimerContextImpl(KeyedProcessFunction<K, IN, OUT> function, TimerService timerService) {
function.super();
this.timerService = checkNotNull(timerService);
}
Example #26
Source File: KeyedProcessOperatorTest.java From flink with Apache License 2.0
@Test
public void testKeyQuerying() throws Exception {
class KeyQueryingProcessFunction extends KeyedProcessFunction<Integer, Tuple2<Integer, String>, String> {
@Override
public void processElement(
Tuple2<Integer, String> value,
Context ctx,
Collector<String> out) throws Exception {
assertTrue("Did not get expected key.", ctx.getCurrentKey().equals(value.f0));
// we check that we receive this output, to ensure that the assert was actually checked
out.collect(value.f1);
}
}
KeyedProcessOperator<Integer, Tuple2<Integer, String>, String> operator =
new KeyedProcessOperator<>(new KeyQueryingProcessFunction());
try (
OneInputStreamOperatorTestHarness<Tuple2<Integer, String>, String> testHarness =
new KeyedOneInputStreamOperatorTestHarness<>(operator, (in) -> in.f0 , BasicTypeInfo.INT_TYPE_INFO)) {
testHarness.setup();
testHarness.open();
testHarness.processElement(new StreamRecord<>(Tuple2.of(5, "5"), 12L));
testHarness.processElement(new StreamRecord<>(Tuple2.of(42, "42"), 13L));
ConcurrentLinkedQueue<Object> expectedOutput = new ConcurrentLinkedQueue<>();
expectedOutput.add(new StreamRecord<>("5", 12L));
expectedOutput.add(new StreamRecord<>("42", 13L));
TestHarnessUtil.assertOutputEquals(
"Output was not correct.",
expectedOutput,
testHarness.getOutput());
}
}
Example #28
Source File: ProcTimeRowsBoundedPrecedingFunction.java From flink with Apache License 2.0
@Override
public void processElement(
RowData input,
KeyedProcessFunction<K, RowData, RowData>.Context ctx,
Collector<RowData> out) throws Exception {
long currentTime = ctx.timerService().currentProcessingTime();
// register state-cleanup timer
registerProcessingCleanupTimer(ctx, currentTime);
// initialize state for the processed element
RowData accumulators = accState.value();
if (accumulators == null) {
accumulators = function.createAccumulators();
}
// set accumulators in context first
function.setAccumulators(accumulators);
// get smallest timestamp
Long smallestTs = smallestTsState.value();
if (smallestTs == null) {
smallestTs = currentTime;
smallestTsState.update(smallestTs);
}
// get previous counter value
Long counter = counterState.value();
if (counter == null) {
counter = 0L;
}
if (counter == precedingOffset) {
List<RowData> retractList = inputState.get(smallestTs);
if (retractList != null) {
// get the oldest element beyond the buffer size
// and, if that element exists, retract its value
RowData retractRow = retractList.get(0);
function.retract(retractRow);
retractList.remove(0);
} else {
// Does not retract values which are outside of window if the state is cleared already.
LOG.warn("The state is cleared because of state ttl. " +
"This will result in incorrect result. " +
"You can increase the state ttl to avoid this.");
}
// if reference timestamp list not empty, keep the list
if (retractList != null && !retractList.isEmpty()) {
inputState.put(smallestTs, retractList);
} // if smallest timestamp list is empty, remove and find new smallest
else {
inputState.remove(smallestTs);
Iterator<Long> iter = inputState.keys().iterator();
long currentTs = 0L;
long newSmallestTs = Long.MAX_VALUE;
while (iter.hasNext()) {
currentTs = iter.next();
if (currentTs < newSmallestTs) {
newSmallestTs = currentTs;
}
}
smallestTsState.update(newSmallestTs);
}
} // we update the counter only while buffer is getting filled
else {
counter += 1;
counterState.update(counter);
}
// update map state, counter and timestamp
List<RowData> currentTimeState = inputState.get(currentTime);
if (currentTimeState != null) {
currentTimeState.add(input);
inputState.put(currentTime, currentTimeState);
} else { // add new input
List<RowData> newList = new ArrayList<RowData>();
newList.add(input);
inputState.put(currentTime, newList);
}
// accumulate current row
function.accumulate(input);
// update the value of accumulators for future incremental computation
accumulators = function.getAccumulators();
accState.update(accumulators);
// prepare output row
RowData aggValue = function.getValue();
output.replace(input, aggValue);
out.collect(output);
}
Example #29
Source File: StreamBookmarker.java From pravega-samples with Apache License 2.0
public static void main(String[] args) throws Exception {
// Initialize the parameter utility tool in order to retrieve input parameters.
ParameterTool params = ParameterTool.fromArgs(args);
// Clients contact the Pravega controller to get information about Streams.
URI pravegaControllerURI = URI.create(params.get(Constants.CONTROLLER_ADDRESS_PARAM, Constants.CONTROLLER_ADDRESS));
PravegaConfig pravegaConfig = PravegaConfig
.fromParams(params)
.withControllerURI(pravegaControllerURI)
.withDefaultScope(Constants.DEFAULT_SCOPE);
// Create the scope if it is not present.
StreamManager streamManager = StreamManager.create(pravegaControllerURI);
streamManager.createScope(Constants.DEFAULT_SCOPE);
// Create the Pravega source to read from data produced by DataProducer.
Stream sensorEvents = Utils.createStream(pravegaConfig, Constants.PRODUCER_STREAM);
SourceFunction<Tuple2<Integer, Double>> reader = FlinkPravegaReader.<Tuple2<Integer, Double>>builder()
.withPravegaConfig(pravegaConfig)
.forStream(sensorEvents)
.withReaderGroupName(READER_GROUP_NAME)
.withDeserializationSchema(new Tuple2DeserializationSchema())
.build();
// Create the Pravega sink to output the stream cuts representing slices to analyze.
Stream streamCutsStream = Utils.createStream(pravegaConfig, Constants.STREAMCUTS_STREAM);
SinkFunction<SensorStreamSlice> writer = FlinkPravegaWriter.<SensorStreamSlice>builder()
.withPravegaConfig(pravegaConfig)
.forStream(streamCutsStream)
.withSerializationSchema(PravegaSerialization.serializationFor(SensorStreamSlice.class))
.withEventRouter(new EventRouter())
.build();
// Initialize the Flink execution environment.
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment()
.enableCheckpointing(CHECKPOINT_INTERVAL);
env.getCheckpointConfig().setCheckpointTimeout(CHECKPOINT_INTERVAL);
GuavaImmutableMapSerializer.registerSerializers(env.getConfig());
// Bookmark those sections of the stream with values < 0 and write the output (StreamCuts).
DataStreamSink<SensorStreamSlice> dataStreamSink = env.addSource(reader)
.setParallelism(Constants.PARALLELISM)
.keyBy(0)
.process((KeyedProcessFunction) new Bookmarker(pravegaControllerURI))
.addSink(writer);
// Execute within the Flink environment.
env.execute("StreamBookmarker");
LOG.info("Ending StreamBookmarker...");
}