Understanding Flume

Flume is a distributed service for collecting and moving large amounts of streaming event data. It offers fault-tolerance options, including recovery and failover mechanisms. Flume has three components: a Source that consumes event data, a Channel that holds the data, and a Sink that moves the data out of the Channel to a persistent location or to another Flume Source.
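The Source, Channel, and Sink roles described above can be illustrated with a small toy model. This is only a sketch in plain Python and does not use Flume's own API. The names Channel, run_source, and run_sink are hypothetical and chosen for illustration. It assumes an in-memory channel with a fixed capacity and a sink that returns an event to the channel when the write fails, which is roughly how a channel lets data survive a sink failure.

```python
from collections import deque


class Channel:
    """Hypothetical in-memory buffer that sits between a source and a sink."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._events = deque()

    def put(self, event):
        # Refuse new events once the buffer is full.
        if len(self._events) >= self.capacity:
            raise OverflowError("channel is full")
        self._events.append(event)

    def take(self):
        # Return the oldest event, or None when the channel is empty.
        return self._events.popleft() if self._events else None

    def put_back(self, event):
        # Return an undelivered event to the front so ordering is preserved.
        self._events.appendleft(event)

    def __len__(self):
        return len(self._events)


def run_source(lines, channel):
    """Source role: consume incoming event data and hand it to the channel."""
    for line in lines:
        channel.put(line)


def run_sink(channel, write):
    """Sink role: drain the channel into `write`; stop on the first failure.

    Returns the number of events delivered. A failed event goes back
    into the channel so it can be retried later.
    """
    delivered = 0
    while True:
        event = channel.take()
        if event is None:
            return delivered
        try:
            write(event)
        except Exception:
            channel.put_back(event)
            return delivered
        delivered += 1


if __name__ == "__main__":
    channel = Channel(capacity=100)
    run_source(["login user=a", "login user=b"], channel)
    store = []  # stands in for the persistent location
    count = run_sink(channel, store.append)
    print(count, store)
```

In this sketch the channel is the only place where events wait, so a sink that fails leaves the data in the channel rather than losing it. A real Flume agent is set up through its configuration file rather than code like this.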

Flume can be combined with Kafka, a pairing known as Flafka, to capture, trace, and persist streaming event data.

Note: Flumes are a non-Pipeline type of data configuration.