Apache Flink® beta

What is Aiven for Apache Flink®?

Aiven for Apache Flink® beta is powered by the open-source framework Apache Flink®, a distributed processing engine for stateful computations over data streams. It enables you to easily get started with real-time stream processing using SQL.

The service is currently available as a beta release and is intended for non-production use.

Why Apache Flink?

Apache Flink is a processing engine that lets you define streaming data pipelines with SQL statements. It can work in batch or streaming mode, is distributed by default, and performs computation at in-memory speed at any scale.

You can use Flink for batch data processing as well as for real-time processing of data streams. More conventional database solutions might only offer automation to process the accumulated data at certain intervals. Working with data streams, on the other hand, allows you to analyze and process the data in real time. This means you can use Apache Flink to trigger alerts or operations as events arrive, rather than relying on less efficient interval-based processing.

Real-time filtering

A key part of data stream processing is the ability to filter and transform incoming data in real time. For example, compliance requirements might oblige you to limit the visibility of, and access to, certain data. In such cases, the capability to process incoming data directly can add significant value and efficiency. Using Apache Flink, you can configure data pipelines that handle incoming data and store or deliver it differently according to its type or content. You can also transform the data to fit your needs, or combine sources to enrich it.
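As an illustrative sketch of such a filtering pipeline (the table names, fields, and connector options below are hypothetical, and in Aiven for Apache Flink the Kafka connection details are typically managed through the service integration rather than written by hand):

```sql
-- Hypothetical source table backed by a Kafka topic of HTTP access events
CREATE TABLE access_log (
    user_id STRING,
    url STRING,
    status_code INT
) WITH (
    'connector' = 'kafka',
    'topic' = 'access_log',
    'format' = 'json'
);

-- Keep only error responses, routing them to a separate sink table
INSERT INTO error_log
SELECT user_id, url, status_code
FROM access_log
WHERE status_code >= 400;
```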

Use familiar SQL

As Apache Flink uses SQL as a key part of constructing data pipelines, it is quite an approachable option for people who are already familiar with databases and batch processing. However, there are some relevant concepts (such as windows, watermarks, and checkpoints) that are worth knowing if you are new to data stream processing.
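Those concepts surface directly in Flink SQL. As a sketch (table, topic, and column names are hypothetical), a one-minute tumbling-window aggregation over a stream whose watermark tolerates events arriving up to five seconds late might look like:

```sql
-- Hypothetical event stream; the WATERMARK clause marks click_time
-- as event time and allows events to arrive up to 5 seconds late
CREATE TABLE clicks (
    user_id STRING,
    click_time TIMESTAMP(3),
    WATERMARK FOR click_time AS click_time - INTERVAL '5' SECOND
) WITH (
    'connector' = 'kafka',
    'topic' = 'clicks',
    'format' = 'json'
);

-- Count clicks per user in one-minute tumbling windows
SELECT
    user_id,
    TUMBLE_START(click_time, INTERVAL '1' MINUTE) AS window_start,
    COUNT(*) AS click_count
FROM clicks
GROUP BY
    user_id,
    TUMBLE(click_time, INTERVAL '1' MINUTE);
```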

Get started with Aiven for Apache Flink

Take your first steps with Aiven for Apache Flink by following our Getting started with Aiven for Apache Flink® article, or browse through our full list of articles:

📚 Concepts

💻 HowTo

📖 Reference

Apache Flink features

Flink SQL

Apache Flink enables you to develop streaming applications using standard SQL. The Aiven web console provides an SQL editor to explore the table schema and create SQL queries to process streaming data.

Built-in data flow integration with Aiven for Apache Kafka®

Connect with Aiven for Apache Kafka® as a source or sink for your data.

  • Autocompletion for finding existing topics in a connected Kafka service when you create data tables.

  • Choose the table format when reading data from Kafka: JSON, Apache Avro, Confluent Avro, or Debezium CDC.

  • Supports upsert Kafka connectors, which allow you to produce a changelog stream, where each data record represents an update or delete event.
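As a sketch of the upsert connector mentioned above (topic and column names are hypothetical), a table declared with the `upsert-kafka` connector keys each record on a primary key, so a later record with the same key is interpreted as an update, and a record with a null value as a delete:

```sql
-- Hypothetical changelog table: records sharing a user_id
-- overwrite the previous value for that key
CREATE TABLE user_click_totals (
    user_id STRING,
    total_clicks BIGINT,
    PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
    'connector' = 'upsert-kafka',
    'topic' = 'user_click_totals',
    'key.format' = 'json',
    'value.format' = 'json'
);
```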

Built-in data flow integration with Aiven for PostgreSQL®

Connect with Aiven for PostgreSQL® as a source or sink for your data. The Aiven web console features autocompletion for finding existing databases in a connected PostgreSQL service when you create data tables.
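In open-source Flink SQL terms, a PostgreSQL-backed table uses the JDBC connector. A minimal sketch follows; the connection details and names are hypothetical, and in Aiven for Apache Flink they are normally filled in through the service integration:

```sql
-- Hypothetical sink table mapped to an existing PostgreSQL table
CREATE TABLE orders_summary (
    order_date DATE,
    total_amount DOUBLE
) WITH (
    'connector' = 'jdbc',
    'url' = 'jdbc:postgresql://host:5432/defaultdb',
    'table-name' = 'orders_summary'
);
```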

Automate workflows

Automate workflows for managing Flink services with the Aiven Terraform Provider. See the Flink data source for details.

Apache Flink resources

If you are new to Flink, try these resources to get you started with the platform:

  • Read the overview of Flink and its architecture in the main Apache Flink project documentation.

  • Our Getting started with Aiven for Apache Flink® guide is a good way to get hands on with your first project.

  • Read more about Flink SQL capabilities.


Apache, Apache Kafka, Kafka, Apache Flink, Flink, Apache Cassandra, and Cassandra are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. M3, M3 Aggregator, M3 Coordinator, OpenSearch, PostgreSQL, MySQL, InfluxDB, Grafana, Terraform, and Kubernetes are trademarks and property of their respective owners. *Redis is a trademark of Redis Ltd. Any rights therein are reserved to Redis Ltd. Any use by Aiven is for referential purposes only and does not indicate any sponsorship, endorsement or affiliation between Redis and Aiven. All product and service names used in this website are for identification purposes only and do not imply endorsement.

Copyright © 2022, Aiven Team | Last updated: March 2022