
Releasing a new version of software can create significant challenges for data engineers. Those who use Apache Airflow and have already encountered Airflow 2.0 will surely agree that even minor modifications can completely change how DAGs work, or even block them. Has Airflow changed for the better? How can you simplify its setup with DAG versioning and serialization?

Although some functionality of earlier versions has been preserved, there are some important changes in the new Airflow; for example, it comes with a complete REST API. This can pose some new challenges when upgrading. Fortunately, there are also modifications that can simplify the day-to-day work of your data engineers. Read below about DAG versioning and serialization.

Introducing a new version of software is always preceded by a mix of excitement and concern among the professionals who use it on a daily basis. There are a lot of questions to be answered, but Airflow 2.0 is already here, so you can try to answer them yourself or join the discussion. Will it be easy enough to get used to? Will the new version suit your company's needs? Will the changes affect your team’s efficiency in a positive or negative way? You can always contact us for support, but before that, here are some of the most noticeable changes you should know about:

- A redesigned user interface: the new dashboard is clear and easy to read, which is certainly a positive change.
- An efficient scheduler: the scheduler is one of the core components of Airflow, and thanks to the modifications it now performs much better than before. It is also possible to run multiple scheduler instances in an active/active model, which improves availability and failover, both crucial for the stability of an Airflow deployment.
- Complete REST API: the new, fully supported API can create some issues when upgrading, but it certainly makes it easier to integrate with third-party platforms.
- Smart sensors: in the new Airflow, long-running sensor tasks become more efficient because their checks are centralized and processed in batches.
- DAG serialization: in the new version, the web server no longer parses DAG files itself; of the two components, only the scheduler needs access to the DAG files.
- DAG versioning: users gain additional support for storing many versions of serialized DAGs.
- Task Groups: instead of SubDAGs, which caused performance issues, you can use Task Groups to organize tasks within a DAG’s graph view. The grouping is done in the Airflow UI only, so it does not affect performance.

There have been many improvements, and it is normal that users need some time to get used to them. In this article, we’d like to focus on two of these changes, DAG serialization and DAG versioning, and explain how they make Airflow 2.0 setup easier.

Serialization is quite an important piece of functionality in Apache Airflow. The term refers to storing a serialized representation of the DAGs in the database, in a lightweight JSON format. Put simply, the scheduler parses the DAG files and keeps a representation of them in the database, which the web server later fetches to populate the user interface. In older versions of Airflow, both the web server and the scheduler required access to the DAG files in order to read and parse them. Processing DAGs on both the web server and the scheduler is rather inefficient because the work is duplicated, and this hurt the overall performance of Airflow.
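
To make the idea of DAG serialization more concrete, here is a minimal sketch of the flow this article describes: one component parses a DAG and stores a JSON representation in a database, and another component reads that JSON back without touching the DAG file. It uses only the Python standard library. The function names, the simplified table layout, and the `example_etl` DAG are assumptions made for illustration; Airflow's real serialized format and schema are considerably more detailed.

```python
import json
import sqlite3


def serialize_dag(dag_id, tasks, dependencies):
    """Turn a simplified DAG description into a JSON string (illustrative format only)."""
    return json.dumps(
        {"dag_id": dag_id, "tasks": list(tasks), "dependencies": dependencies}
    )


def scheduler_store(conn, dag_id, tasks, dependencies):
    """Scheduler side: the DAG is parsed once and its JSON form is written to the database."""
    conn.execute(
        "INSERT OR REPLACE INTO serialized_dag (dag_id, data) VALUES (?, ?)",
        (dag_id, serialize_dag(dag_id, tasks, dependencies)),
    )
    conn.commit()


def webserver_load(conn, dag_id):
    """Web server side: the JSON is read from the database; no DAG file is needed."""
    row = conn.execute(
        "SELECT data FROM serialized_dag WHERE dag_id = ?", (dag_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# An in-memory SQLite database stands in for the Airflow metadata database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE serialized_dag (dag_id TEXT PRIMARY KEY, data TEXT)")

# Hypothetical three-task pipeline: extract -> transform -> load.
scheduler_store(
    conn,
    "example_etl",
    ["extract", "transform", "load"],
    [["extract", "transform"], ["transform", "load"]],
)

dag = webserver_load(conn, "example_etl")
print(dag["tasks"])
```

In this sketch only `scheduler_store` would ever need the DAG definition; `webserver_load` works purely from the stored JSON, which is the separation that removes the duplicated parsing.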
