Installing Apache Airflow on CentOS 7

Emanuele Cesena
1 min read · Aug 1, 2016


Apache Airflow is a platform to programmatically author, schedule and monitor workflows. The latest version of Airflow can be installed via pip, and several extra packages are available for different, richer environments.

Each extra may require additional system packages (compilers and development headers), which makes installing Airflow slightly tricky. Here are some recipes for installing Airflow on CentOS 7 with Python 2.7.5. All recipes assume an installation inside a Python virtual environment:

virtualenv airflow
source airflow/bin/activate
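
Since every recipe assumes an active virtual environment, a small guard can confirm that before anything is installed. This is only a sketch: the function name in_virtualenv is made up for illustration, and it relies on the VIRTUAL_ENV variable that virtualenv's activate script exports.

```shell
# Hypothetical helper (not part of virtualenv or Airflow):
# succeeds only when a virtualenv is active in this shell.
in_virtualenv() {
    # The activate script exports VIRTUAL_ENV; it is unset or empty otherwise.
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
    echo "Using virtualenv at $VIRTUAL_ENV"
else
    echo "No virtualenv active; run: source airflow/bin/activate" >&2
fi
```

If the second message appears, pip would install into the system Python rather than the airflow environment.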

Airflow Minimal

sudo yum -y install gcc gcc-c++
pip install airflow

Airflow Common

(MySQL, Celery, Crypto, Password auth)

sudo yum -y install gcc gcc-c++ libffi-devel mariadb-devel
pip install airflow[async,celery,crypto,jdbc,mysql,password,rabbitmq]
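
The bracketed list after the package name is pip's "extras" syntax: each name pulls in an optional set of dependencies. Below is a minimal sketch of building that requirement string from a list of extras. The helper name airflow_requirement is hypothetical, and the pip command is only printed, not run. Quoting the requirement is a precaution, since some shells treat square brackets as a glob pattern.

```shell
# Hypothetical helper: join extras into a pip requirement such as "airflow[a,b,c]".
# The ( ... ) body runs in a subshell, so changing IFS does not leak to the caller.
airflow_requirement() (
    IFS=,
    if [ "$#" -eq 0 ]; then
        printf 'airflow\n'
    else
        # "$*" joins all arguments with the first character of IFS (a comma).
        printf 'airflow[%s]\n' "$*"
    fi
)

req="$(airflow_requirement mysql celery crypto password)"
# Print the command instead of running it; the quotes protect the brackets.
echo "pip install \"$req\""
```

Adding or dropping an extra is then a matter of changing one argument, and calling the helper with no arguments falls back to the minimal install.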

Airflow Common + Hadoop

(HDFS, Hive, Druid, Vertica, LDAP and Kerberos Security)

sudo yum -y install gcc gcc-c++ libffi-devel mariadb-devel cyrus-sasl-devel
easy_install -U setuptools
pip install airflow[async,celery,crypto,druid,jdbc,hdfs,hive,kerberos,ldap,mysql,password,rabbitmq,vertica]

From here, the Airflow Quick Start should be "quick and straightforward"! :)
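
For reference, the first steps of the Quick Start looked roughly like the sketch below at the time of writing. The wrapper function airflow_quick_start is a made-up name and is only defined here, not called; the airflow subcommands are the ones from the Airflow releases of that period, and later releases renamed some of them, so check the documentation for your version.

```shell
# Airflow keeps its config and database under AIRFLOW_HOME (default: ~/airflow).
export AIRFLOW_HOME="${AIRFLOW_HOME:-$HOME/airflow}"

# Hypothetical wrapper around the Quick Start steps; defined but not invoked here.
airflow_quick_start() {
    if ! command -v airflow >/dev/null 2>&1; then
        echo "airflow not found; is the virtualenv active?" >&2
        return 1
    fi
    airflow initdb             # initialize the metadata database
    airflow webserver -p 8080  # start the web UI on port 8080 (runs in the foreground)
}
```

Run airflow_quick_start from the activated virtualenv, then open port 8080 in a browser.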

