Installing Apache Airflow on CentOS 7
Aug 1, 2016
Apache Airflow is a platform to programmatically author, schedule and monitor workflows. The latest version of Airflow can be installed via pip, and several extra packages are available for different, richer environments.
Each extra package may require additional system packages, which makes installing Airflow slightly tricky. Here are some recipes for installing Airflow on CentOS 7 with Python 2.7.5. All recipes assume an installation within a Python virtual environment:
virtualenv airflow
source airflow/bin/activate
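Since every recipe assumes an active virtual environment, a quick check before running pip can save some confusion. The sketch below relies on the VIRTUAL_ENV variable that the activate script exports; the function name in_virtualenv is just an illustrative helper, not part of virtualenv itself.

```shell
# Hypothetical helper: succeeds only if a virtualenv is active in this shell.
# VIRTUAL_ENV is exported by the virtualenv "activate" script.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
    echo "virtualenv active: $VIRTUAL_ENV"
else
    echo "no virtualenv active; run: source airflow/bin/activate" >&2
fi
```

If the second message appears, pip would install into the system Python rather than into the airflow environment.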
Airflow Minimal
sudo yum -y install gcc gcc-c++
pip install airflow
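Before running pip, it can help to confirm that the compilers installed by yum are actually on the PATH. This is a minimal sketch; missing_tools is a made-up helper name, and the tool names gcc and g++ are assumed to be the binaries provided by the gcc and gcc-c++ packages.

```shell
# Hypothetical helper: print each named tool that is NOT found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumption: the gcc and gcc-c++ packages provide the gcc and g++ binaries.
missing=$(missing_tools gcc g++)
if [ -n "$missing" ]; then
    echo "missing build tools:" $missing >&2
else
    echo "build tools found"
fi
```

An empty result means both compilers were found; anything printed points at a package that still needs installing.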
Airflow Common
(MySQL, Celery, Crypto, Password auth)
sudo yum -y install gcc gcc-c++ libffi-devel mariadb-devel
pip install "airflow[async,celery,crypto,jdbc,mysql,password,rabbitmq]"
Airflow Common + Hadoop
(HDFS, Hive, Druid, Vertica, LDAP and Kerberos Security)
sudo yum -y install gcc gcc-c++ libffi-devel mariadb-devel cyrus-sasl-devel
easy_install -U setuptools
pip install "airflow[async,celery,crypto,druid,jdbc,hdfs,hive,kerberos,ldap,mysql,password,rabbitmq,vertica]"
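The extras lists in the recipes get long, so here is one possible way to assemble the requirement string from individual extra names. This is only a sketch: airflow_req is a hypothetical helper, and the example prints the pip command instead of running it so the result can be inspected first.

```shell
# Hypothetical helper: join extra names with commas into airflow[a,b,c].
# The body runs in a subshell, so the IFS change does not leak out.
airflow_req() (
    IFS=,
    echo "airflow[$*]"
)

# Dry run: show the command that would install the "Airflow Common" extras.
req=$(airflow_req async celery crypto jdbc mysql password rabbitmq)
echo "would run: pip install \"$req\""
```

Quoting the requirement keeps the shell from treating the square brackets as a glob pattern, which matters in shells such as zsh.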
From here, the Airflow Quick Start should be "quick and straightforward"! :)