Limitations
The following limitations apply to Atlas Stream Processing:
Atlas Stream Processing is currently available only on AWS, in the US-EAST-1 region. This limitation applies only to stream processing instances themselves; your stream processors can still read from and write to clusters hosted on different cloud providers or in different regions, provided those clusters are in the same project as the stream processing instance.
The combined state.stateSize of all stream processors on a stream processing instance can't exceed 80% of the RAM available to a worker in that SPI tier. For example, in the SP30 tier, which has 8 GB of RAM per worker, the maximum size of a stream processor is 6.4 GB. If the state.stateSize of any of your stream processors is approaching 80% of the RAM available to a worker in its SPI tier, move up to the next SPI tier. When the 80% RAM threshold is crossed, all stream processors fail with a stream processing instance out of memory error. You can view the state.stateSize value of each stream processor with the sp.processor.stats() command, as shown in the sketch below. See View Statistics of a Stream Processor to learn more.
A stream processing instance can use only clusters in the same project as sources or sinks.
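The following mongosh sketch shows one way to perform the stateSize check described above. The processor name solarDemo is a placeholder, and the exact shape of the stats() output may vary by release.

// Minimal sketch: inspect the state size of a single stream processor.
// "solarDemo" is a placeholder processor name, not one defined on this page.
const stats = sp.solarDemo.stats();
print(stats.stateSize); // state held by this processor, in bytes

// On an SP30 worker with 8 GB of RAM, the 80% threshold works out to
// 8 GB * 0.8 = 6.4 GB across all processors on the instance.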
An Atlas Stream Processing pipeline definition cannot exceed 16 MB.
Only users with the Project Owner or Atlas admin roles can use Atlas Stream Processing.
Atlas Stream Processing currently supports only the following connection types:
Connection Type       Usage
Apache Kafka          Source or Sink
Atlas Database        Source or Sink
Sample Connection     Source Only
For Atlas Stream Processing using Apache Kafka as a $source, if the Apache Kafka topic acting as the $source for a running processor adds a partition, Atlas Stream Processing continues running without reading from the new partition. The processor fails only once it detects the new partition, which happens after you restore it from a checkpoint following a failure or restart it after stopping it. You must recreate any processors that read from topics with newly added partitions.
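Recreating such a processor amounts to dropping it and defining it again. The following is a minimal mongosh sketch; the names orderEvents, kafkaProd, orders, atlasCluster, and sales are illustrative placeholders, not values from this page.

// Minimal sketch: recreate a processor after its Kafka $source topic adds a partition.
sp.orderEvents.stop();
sp.orderEvents.drop();
sp.createStreamProcessor("orderEvents", [
  { $source: { connectionName: "kafkaProd", topic: "orders" } },
  { $merge: { into: { connectionName: "atlasCluster", db: "sales", coll: "orders" } } }
]);
sp.orderEvents.start();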
Atlas Stream Processing currently supports only JSON-formatted data; it does not support alternative serializations such as Avro or Protocol Buffers.
For Apache Kafka connections, Atlas Stream Processing currently supports only the following security protocols:
PLAINTEXT
SASL_PLAINTEXT
SASL_SSL
Atlas Stream Processing currently doesn't support custom SSL certificates.
For SASL, Atlas Stream Processing supports the following mechanisms:
PLAIN
SCRAM-SHA-256
SCRAM-SHA-512
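As an illustration of how these protocol and mechanism options combine, the following sketch shows a Kafka connection definition as a JavaScript object. The field names and values are assumptions for illustration only; refer to the connection registry documentation for the exact schema accepted by the Atlas UI, CLI, or Administration API.

// Hedged sketch of a Kafka connection definition using an allowed security
// protocol and SASL mechanism. Field names (bootstrapServers, security,
// authentication) and all values are illustrative assumptions, not a confirmed schema.
const kafkaConnection = {
  name: "kafkaProd",
  type: "Kafka",
  bootstrapServers: "broker-1.example.com:9092",
  security: { protocol: "SASL_SSL" },          // PLAINTEXT, SASL_PLAINTEXT, or SASL_SSL
  authentication: {
    mechanism: "SCRAM-SHA-256",                // PLAIN, SCRAM-SHA-256, or SCRAM-SHA-512
    username: "streamUser",
    password: "<placeholder>"
  }
};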
Atlas Stream Processing doesn't support $function JavaScript UDFs.
Atlas Stream Processing supports a subset of the Aggregation Pipeline Stages available in Atlas, allowing you to perform many of the same operations on streaming data that you can perform on data-at-rest. For a full list of supported Aggregation Pipeline Stages, see the Stream Aggregation documentation.
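For instance, a processor that sticks to supported stages might look like the following mongosh sketch. The sample connection name sample_stream_solar and the filter field device_id are illustrative placeholders.

// Minimal sketch: an ephemeral processor that uses only streaming-supported stages.
// Results print to the shell until you interrupt the command.
sp.process([
  { $source: { connectionName: "sample_stream_solar" } },
  { $match: { device_id: "device_1" } }
]);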