storm.trident.spout.IBatchSpout Java Examples

The following examples show how to use storm.trident.spout.IBatchSpout. Each example is taken from an open-source project; its source file and license are noted above the code.
Example #1
Source File: TridentTopology.java    From jstorm with Apache License 2.0
public Stream newStream(String txId, ITridentDataSource dataSource) {
    if (dataSource instanceof IBatchSpout) {
        return newStream(txId, (IBatchSpout) dataSource);
    } else if (dataSource instanceof ITridentSpout) {
        return newStream(txId, (ITridentSpout) dataSource);
    } else if (dataSource instanceof IPartitionedTridentSpout) {
        return newStream(txId, (IPartitionedTridentSpout) dataSource);
    } else if (dataSource instanceof IOpaquePartitionedTridentSpout) {
        return newStream(txId, (IOpaquePartitionedTridentSpout) dataSource);
    } else {
        throw new UnsupportedOperationException("Unsupported stream");
    }
}
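Example #1 routes on the concrete spout interface with an instanceof chain. The pattern is easy to try without a Storm dependency; the sketch below uses stand-in types (DataSource, BatchSource, and TridentSource are illustrative names, not real Storm interfaces) to show the same routing logic.

```java
// Mimics the instanceof dispatch in Example #1, using stand-in types.
class DispatchDemo {
    // Route each data-source type to a type-specific handler,
    // falling through to UnsupportedOperationException like the original.
    static String describe(DataSource ds) {
        if (ds instanceof BatchSource) {
            return "batch";
        } else if (ds instanceof TridentSource) {
            return "trident";
        } else {
            throw new UnsupportedOperationException("Unsupported stream");
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new BatchSource()));   // prints "batch"
        System.out.println(describe(new TridentSource())); // prints "trident"
    }
}

// Stand-ins for the Storm spout hierarchy (illustrative only).
interface DataSource {}
class BatchSource implements DataSource {}
class TridentSource implements DataSource {}
```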
 
Example #2
Source File: TridentTopology.java    From jstorm with Apache License 2.0
private static Map<String, Object> getSpoutComponentConfig(Object spout) {
    if (spout instanceof IRichSpout) {
        return ((IRichSpout) spout).getComponentConfiguration();
    } else if (spout instanceof IBatchSpout) {
        return ((IBatchSpout) spout).getComponentConfiguration();
    } else {
        return ((ITridentSpout) spout).getComponentConfiguration();
    }
}
 
Example #3
Source File: Part01_BasicPrimitives.java    From trident-tutorial with Apache License 2.0
public static StormTopology basicPrimitives(IBatchSpout spout) throws IOException {

        // A topology is a set of streams.
        // A stream is a DAG of Spouts and Bolts.
        // (In Storm there are Spouts (data producers) and Bolts (data processors).
        // Spouts create Tuples and Bolts manipulate them and possibly emit new ones.)

        // But in Trident we operate at a higher level.
        // Bolts are created and connected automatically out of higher-level constructs.
        // Also, Spouts are "batched".
        TridentTopology topology = new TridentTopology();

        // The "each" primitive allows us to apply either filters or functions to the stream.
        // We always have to select the input fields.
        topology
                .newStream("filter", spout)
                .each(new Fields("actor"), new RegexFilter("pere"))
                .each(new Fields("text", "actor"), new Print());

        // Functions describe their output fields, which are always appended to the input fields.
        // As you can see, "each" operations can be chained.
        topology
                .newStream("function", spout)
                .each(new Fields("text"), new ToUpperCase(), new Fields("uppercased_text"))
                .each(new Fields("text", "uppercased_text"), new Print());

        // You can prune unnecessary fields using "project"
        topology
                .newStream("projection", spout)
                .each(new Fields("text"), new ToUpperCase(), new Fields("uppercased_text"))
                .project(new Fields("uppercased_text"))
                .each(new Fields("uppercased_text"), new Print());

        // Stream can be parallelized with "parallelismHint"
        // The parallelism hint applies to the operations before it, back to the nearest partitioning operation (we will see this later).
        // This topology creates 5 spouts and 5 bolts:
        // Let's debug that with TridentOperationContext.partitionIndex !
        topology
                .newStream("parallel", spout)
                .each(new Fields("actor"), new RegexFilter("pere"))
                .parallelismHint(5)
                .each(new Fields("text", "actor"), new Print());

        // You can perform aggregations by grouping the stream and then applying an aggregation
        // Note how each actor appears more than once. We are aggregating inside small batches (aka micro batches)
        // This is useful for pre-processing before storing the result to databases
        topology
                .newStream("aggregation", spout)
                .groupBy(new Fields("actor"))
                .aggregate(new Count(),new Fields("count"))
                .each(new Fields("actor", "count"),new Print())
        ;

        // In order to aggregate across batches, we need persistentAggregate.
        // This example is incrementing a count in the DB, using the result of these micro batch aggregations
        // (here we are simply using a hash map for the "database")
        topology
                .newStream("persistentAggregation", spout)
                .groupBy(new Fields("actor"))
                .persistentAggregate(new MemoryMapState.Factory(),new Count(),new Fields("count"))
        ;

        return topology.build();
    }
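The grouped aggregation above counts occurrences of each actor within one micro-batch, and persistentAggregate then merges those per-batch counts into state. To make the per-batch semantics concrete, here is a plain-Java analogue (no Storm dependency; BatchCountDemo and countByActor are illustrative names) of the groupBy/Count step:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BatchCountDemo {
    // Counts occurrences of each actor within one micro-batch,
    // analogous to .groupBy(new Fields("actor")).aggregate(new Count(), ...).
    static Map<String, Long> countByActor(List<String> batch) {
        Map<String, Long> counts = new HashMap<>();
        for (String actor : batch) {
            counts.merge(actor, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> batch = List.of("pere", "pere", "dave");
        // e.g. {dave=1, pere=2} (HashMap iteration order may vary)
        System.out.println(countByActor(batch));
    }
}
```

persistentAggregate would then add these per-batch counts into a state backend (a MemoryMapState in the tutorial), so totals accumulate across batches rather than resetting each time.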
 
Example #4
Source File: TridentTopology.java    From jstorm with Apache License 2.0
public Stream newStream(String txId, IBatchSpout spout) {
    Node n = new SpoutNode(getUniqueStreamId(), spout.getOutputFields(), txId, spout, SpoutNode.SpoutType.BATCH);
    return addNode(n);
}