Running Apache Airflow on MySQL with Docker Compose

The docker-compose.yaml you can download from the official Apache Airflow site uses PostgreSQL as the metadata database, so I modified it to run on MySQL instead.

This is my memo from that exercise.

  1. Fetch the YAML file from the official site
  2. Comment out the parts we don't need
  3. Change the PostgreSQL parts to MySQL
  4. Remove the commented-out parts

1. Fetch the YAML file from the official site

Fetch the YAML file from the link on the official site:

% mkdir docker_airflow
% cd docker_airflow
% mkdir -p ./dags ./logs ./plugins ./config

% curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.6.1/docker-compose.yaml'
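
If you're on Linux, the official guide also recommends creating a .env file so that files written into the mounted directories aren't owned by root:

% echo -e "AIRFLOW_UID=$(id -u)" > .env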

Here's what it contains (as of May 2023):

version: '3.8'
x-airflow-common:
  &airflow-common
  # In order to add custom dependencies or upgrade provider packages you can use your extended image.
  # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
  # and uncomment the "build" line below, Then run `docker-compose build` to build the images.
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.6.1}
  # build: .
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    # For backward compatibility, with Airflow <2.3
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'true'
    AIRFLOW__API__AUTH_BACKENDS: 'airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session'
    # yamllint disable rule:line-length
    # Use simple http server on scheduler for health checks
    # See https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/check-health.html#scheduler-health-check-server
    # yamllint enable rule:line-length
    AIRFLOW__SCHEDULER__ENABLE_HEALTH_CHECK: 'true'
    # WARNING: Use _PIP_ADDITIONAL_REQUIREMENTS option ONLY for a quick checks
    # for other purpose (development, test and especially production usage) build/extend Airflow image.
    _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
  volumes:
    - ${AIRFLOW_PROJ_DIR:-.}/dags:/opt/airflow/dags
    - ${AIRFLOW_PROJ_DIR:-.}/logs:/opt/airflow/logs
    - ${AIRFLOW_PROJ_DIR:-.}/config:/opt/airflow/config
    - ${AIRFLOW_PROJ_DIR:-.}/plugins:/opt/airflow/plugins
  user: "${AIRFLOW_UID:-50000}:0"
  depends_on:
    &airflow-common-depends-on
    redis:
      condition: service_healthy
    postgres:
      condition: service_healthy

services:
  postgres:
    image: postgres:13
    environment:
      POSTGRES_USER: airflow
      POSTGRES_PASSWORD: airflow
      POSTGRES_DB: airflow
    volumes:
      - postgres-db-volume:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "airflow"]
      interval: 10s
      retries: 5
      start_period: 5s
    restart: always

  redis:
    image: redis:latest
    expose:
      - 6379
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 30s
      retries: 50
      start_period: 30s
    restart: always

  airflow-webserver:
    <<: *airflow-common
    command: webserver
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-scheduler:
    <<: *airflow-common
    command: scheduler
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8974/health"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-worker:
    <<: *airflow-common
    command: celery worker
    healthcheck:
      test:
        - "CMD-SHELL"
        - 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"'
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    environment:
      <<: *airflow-common-env
      # Required to handle warm shutdown of the celery workers properly
      # See https://airflow.apache.org/docs/docker-stack/entrypoint.html#signal-propagation
      DUMB_INIT_SETSID: "0"
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-triggerer:
    <<: *airflow-common
    command: triggerer
    healthcheck:
      test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob --hostname "$${HOSTNAME}"']
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-init:
    <<: *airflow-common
    entrypoint: /bin/bash
    # yamllint disable rule:line-length
    command:
      - -c
      - |
        function ver() {
          printf "%04d%04d%04d%04d" $${1//./ }
        }
        airflow_version=$$(AIRFLOW__LOGGING__LOGGING_LEVEL=INFO && gosu airflow airflow version)
        airflow_version_comparable=$$(ver $${airflow_version})
        min_airflow_version=2.2.0
        min_airflow_version_comparable=$$(ver $${min_airflow_version})
        if (( airflow_version_comparable < min_airflow_version_comparable )); then
          echo
          echo -e "\033[1;31mERROR!!!: Too old Airflow version $${airflow_version}!\e[0m"
          echo "The minimum Airflow version supported: $${min_airflow_version}. Only use this or higher!"
          echo
          exit 1
        fi
        if [[ -z "${AIRFLOW_UID}" ]]; then
          echo
          echo -e "\033[1;33mWARNING!!!: AIRFLOW_UID not set!\e[0m"
          echo "If you are on Linux, you SHOULD follow the instructions below to set "
          echo "AIRFLOW_UID environment variable, otherwise files will be owned by root."
          echo "For other operating systems you can get rid of the warning with manually created .env file:"
          echo "    See: https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#setting-the-right-airflow-user"
          echo
        fi
        one_meg=1048576
        mem_available=$$(($$(getconf _PHYS_PAGES) * $$(getconf PAGE_SIZE) / one_meg))
        cpus_available=$$(grep -cE 'cpu[0-9]+' /proc/stat)
        disk_available=$$(df / | tail -1 | awk '{print $$4}')
        warning_resources="false"
        if (( mem_available < 4000 )) ; then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough memory available for Docker.\e[0m"
          echo "At least 4GB of memory required. You have $$(numfmt --to iec $$((mem_available * one_meg)))"
          echo
          warning_resources="true"
        fi
        if (( cpus_available < 2 )); then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough CPUS available for Docker.\e[0m"
          echo "At least 2 CPUs recommended. You have $${cpus_available}"
          echo
          warning_resources="true"
        fi
        if (( disk_available < one_meg * 10 )); then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough Disk space available for Docker.\e[0m"
          echo "At least 10 GBs recommended. You have $$(numfmt --to iec $$((disk_available * 1024 )))"
          echo
          warning_resources="true"
        fi
        if [[ $${warning_resources} == "true" ]]; then
          echo
          echo -e "\033[1;33mWARNING!!!: You have not enough resources to run Airflow (see above)!\e[0m"
          echo "Please follow the instructions to increase amount of resources available:"
          echo "   https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#before-you-begin"
          echo
        fi
        mkdir -p /sources/logs /sources/dags /sources/plugins
        chown -R "${AIRFLOW_UID}:0" /sources/{logs,dags,plugins}
        exec /entrypoint airflow version
    # yamllint enable rule:line-length
    environment:
      <<: *airflow-common-env
      _AIRFLOW_DB_UPGRADE: 'true'
      _AIRFLOW_WWW_USER_CREATE: 'true'
      _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
      _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
      _PIP_ADDITIONAL_REQUIREMENTS: ''
    user: "0:0"
    volumes:
      - ${AIRFLOW_PROJ_DIR:-.}:/sources

  airflow-cli:
    <<: *airflow-common
    profiles:
      - debug
    environment:
      <<: *airflow-common-env
      CONNECTION_CHECK_MAX_COUNT: "0"
    # Workaround for entrypoint issue. See: https://github.com/apache/airflow/issues/16252
    command:
      - bash
      - -c
      - airflow

  # You can enable flower by adding "--profile flower" option e.g. docker-compose --profile flower up
  # or by explicitly targeted on the command line e.g. docker-compose up flower.
  # See: https://docs.docker.com/compose/profiles/
  flower:
    <<: *airflow-common
    command: celery flower
    profiles:
      - flower
    ports:
      - "5555:5555"
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:5555/"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

volumes:
  postgres-db-volume:

2. Comment out the parts we don't need

Here's the file with the unneeded parts commented out: everything related to Celery and Redis. Since the Celery pieces are gone, the executor is also switched from CeleryExecutor to LocalExecutor.

version: '3.8'
x-airflow-common:
  &airflow-common
  # In order to add custom dependencies or upgrade provider packages you can use your extended image.
  # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
  # and uncomment the "build" line below, Then run `docker-compose build` to build the images.
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.6.1}
  # build: .
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: LocalExecutor
    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    # For backward compatibility, with Airflow <2.3
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    # AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
    # AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'true'
    AIRFLOW__API__AUTH_BACKENDS: 'airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session'
    # yamllint disable rule:line-length
    # Use simple http server on scheduler for health checks
    # See https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/check-health.html#scheduler-health-check-server
    # yamllint enable rule:line-length
    AIRFLOW__SCHEDULER__ENABLE_HEALTH_CHECK: 'true'
    # WARNING: Use _PIP_ADDITIONAL_REQUIREMENTS option ONLY for a quick checks
    # for other purpose (development, test and especially production usage) build/extend Airflow image.
    _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
  volumes:
    - ${AIRFLOW_PROJ_DIR:-.}/dags:/opt/airflow/dags
    - ${AIRFLOW_PROJ_DIR:-.}/logs:/opt/airflow/logs
    - ${AIRFLOW_PROJ_DIR:-.}/config:/opt/airflow/config
    - ${AIRFLOW_PROJ_DIR:-.}/plugins:/opt/airflow/plugins
  user: "${AIRFLOW_UID:-50000}:0"
  depends_on:
    &airflow-common-depends-on
    # redis:
    #   condition: service_healthy
    postgres:
      condition: service_healthy

services:
  postgres:
    image: postgres:13
    environment:
      POSTGRES_USER: airflow
      POSTGRES_PASSWORD: airflow
      POSTGRES_DB: airflow
    volumes:
      - postgres-db-volume:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "airflow"]
      interval: 10s
      retries: 5
      start_period: 5s
    restart: always

  # redis:
  #   image: redis:latest
  #   expose:
  #     - 6379
  #   healthcheck:
  #     test: ["CMD", "redis-cli", "ping"]
  #     interval: 10s
  #     timeout: 30s
  #     retries: 50
  #     start_period: 30s
  #   restart: always

  airflow-webserver:
    <<: *airflow-common
    command: webserver
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-scheduler:
    <<: *airflow-common
    command: scheduler
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8974/health"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  # airflow-worker:
  #   <<: *airflow-common
  #   command: celery worker
  #   healthcheck:
  #     test:
  #       - "CMD-SHELL"
  #       - 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"'
  #     interval: 30s
  #     timeout: 10s
  #     retries: 5
  #     start_period: 30s
  #   environment:
  #     <<: *airflow-common-env
  #     # Required to handle warm shutdown of the celery workers properly
  #     # See https://airflow.apache.org/docs/docker-stack/entrypoint.html#signal-propagation
  #     DUMB_INIT_SETSID: "0"
  #   restart: always
  #   depends_on:
  #     <<: *airflow-common-depends-on
  #     airflow-init:
  #       condition: service_completed_successfully

  airflow-triggerer:
    <<: *airflow-common
    command: triggerer
    healthcheck:
      test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob --hostname "$${HOSTNAME}"']
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-init:
    <<: *airflow-common
    entrypoint: /bin/bash
    # yamllint disable rule:line-length
    command:
      - -c
      - |
        function ver() {
          printf "%04d%04d%04d%04d" $${1//./ }
        }
        airflow_version=$$(AIRFLOW__LOGGING__LOGGING_LEVEL=INFO && gosu airflow airflow version)
        airflow_version_comparable=$$(ver $${airflow_version})
        min_airflow_version=2.2.0
        min_airflow_version_comparable=$$(ver $${min_airflow_version})
        if (( airflow_version_comparable < min_airflow_version_comparable )); then
          echo
          echo -e "\033[1;31mERROR!!!: Too old Airflow version $${airflow_version}!\e[0m"
          echo "The minimum Airflow version supported: $${min_airflow_version}. Only use this or higher!"
          echo
          exit 1
        fi
        if [[ -z "${AIRFLOW_UID}" ]]; then
          echo
          echo -e "\033[1;33mWARNING!!!: AIRFLOW_UID not set!\e[0m"
          echo "If you are on Linux, you SHOULD follow the instructions below to set "
          echo "AIRFLOW_UID environment variable, otherwise files will be owned by root."
          echo "For other operating systems you can get rid of the warning with manually created .env file:"
          echo "    See: https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#setting-the-right-airflow-user"
          echo
        fi
        one_meg=1048576
        mem_available=$$(($$(getconf _PHYS_PAGES) * $$(getconf PAGE_SIZE) / one_meg))
        cpus_available=$$(grep -cE 'cpu[0-9]+' /proc/stat)
        disk_available=$$(df / | tail -1 | awk '{print $$4}')
        warning_resources="false"
        if (( mem_available < 4000 )) ; then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough memory available for Docker.\e[0m"
          echo "At least 4GB of memory required. You have $$(numfmt --to iec $$((mem_available * one_meg)))"
          echo
          warning_resources="true"
        fi
        if (( cpus_available < 2 )); then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough CPUS available for Docker.\e[0m"
          echo "At least 2 CPUs recommended. You have $${cpus_available}"
          echo
          warning_resources="true"
        fi
        if (( disk_available < one_meg * 10 )); then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough Disk space available for Docker.\e[0m"
          echo "At least 10 GBs recommended. You have $$(numfmt --to iec $$((disk_available * 1024 )))"
          echo
          warning_resources="true"
        fi
        if [[ $${warning_resources} == "true" ]]; then
          echo
          echo -e "\033[1;33mWARNING!!!: You have not enough resources to run Airflow (see above)!\e[0m"
          echo "Please follow the instructions to increase amount of resources available:"
          echo "   https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#before-you-begin"
          echo
        fi
        mkdir -p /sources/logs /sources/dags /sources/plugins
        chown -R "${AIRFLOW_UID}:0" /sources/{logs,dags,plugins}
        exec /entrypoint airflow version
    # yamllint enable rule:line-length
    environment:
      <<: *airflow-common-env
      _AIRFLOW_DB_UPGRADE: 'true'
      _AIRFLOW_WWW_USER_CREATE: 'true'
      _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
      _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
      _PIP_ADDITIONAL_REQUIREMENTS: ''
    user: "0:0"
    volumes:
      - ${AIRFLOW_PROJ_DIR:-.}:/sources

  airflow-cli:
    <<: *airflow-common
    profiles:
      - debug
    environment:
      <<: *airflow-common-env
      CONNECTION_CHECK_MAX_COUNT: "0"
    # Workaround for entrypoint issue. See: https://github.com/apache/airflow/issues/16252
    command:
      - bash
      - -c
      - airflow

  # You can enable flower by adding "--profile flower" option e.g. docker-compose --profile flower up
  # or by explicitly targeted on the command line e.g. docker-compose up flower.
  # See: https://docs.docker.com/compose/profiles/
  # flower:
  #   <<: *airflow-common
  #   command: celery flower
  #   profiles:
  #     - flower
  #   ports:
  #     - "5555:5555"
  #   healthcheck:
  #     test: ["CMD", "curl", "--fail", "http://localhost:5555/"]
  #     interval: 30s
  #     timeout: 10s
  #     retries: 5
  #     start_period: 30s
  #   restart: always
  #   depends_on:
  #     <<: *airflow-common-depends-on
  #     airflow-init:
  #       condition: service_completed_successfully

volumes:
  postgres-db-volume:
With this file in place, initialize the database and then start the services:

% docker-compose up airflow-init
% docker-compose up -d
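
To check that everything came up, the web UI should respond on port 8080 (default login: airflow / airflow); the /health endpoint below is the same one the compose file's own healthcheck hits:

% docker-compose ps
% curl http://localhost:8080/health

Before moving on to the MySQL version in the next step, it's also safest to tear the stack down, including the postgres volume, so the next airflow-init starts from a clean slate:

% docker-compose down --volumes --remove-orphans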

3. Change the PostgreSQL parts to MySQL

Next, swap out the metadata database itself: comment out the postgres service and its connection strings, add a mysql service, point the connection strings at it, and switch the named volume from postgres-db-volume to mysql-db-volume.

version: '3.8'
x-airflow-common:
  &airflow-common
  # In order to add custom dependencies or upgrade provider packages you can use your extended image.
  # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
  # and uncomment the "build" line below, Then run `docker-compose build` to build the images.
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.6.1}
  # build: .
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: LocalExecutor
    # AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: mysql://airflow:airflowpassword@mysql/airflow
    # For backward compatibility, with Airflow <2.3
    # AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: mysql://airflow:airflowpassword@mysql/airflow
    # AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
    # AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'true'
    AIRFLOW__API__AUTH_BACKENDS: 'airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session'
    # yamllint disable rule:line-length
    # Use simple http server on scheduler for health checks
    # See https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/check-health.html#scheduler-health-check-server
    # yamllint enable rule:line-length
    AIRFLOW__SCHEDULER__ENABLE_HEALTH_CHECK: 'true'
    # WARNING: Use _PIP_ADDITIONAL_REQUIREMENTS option ONLY for a quick checks
    # for other purpose (development, test and especially production usage) build/extend Airflow image.
    _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
  volumes:
    - ${AIRFLOW_PROJ_DIR:-.}/dags:/opt/airflow/dags
    - ${AIRFLOW_PROJ_DIR:-.}/logs:/opt/airflow/logs
    - ${AIRFLOW_PROJ_DIR:-.}/config:/opt/airflow/config
    - ${AIRFLOW_PROJ_DIR:-.}/plugins:/opt/airflow/plugins
  user: "${AIRFLOW_UID:-50000}:0"
  depends_on:
    &airflow-common-depends-on
    # redis:
    #   condition: service_healthy
    # postgres:
    #   condition: service_healthy
    mysql:
      condition: service_healthy

services:
  # postgres:
  #   image: postgres:13
  #   environment:
  #     POSTGRES_USER: airflow
  #     POSTGRES_PASSWORD: airflow
  #     POSTGRES_DB: airflow
  #   volumes:
  #     - postgres-db-volume:/var/lib/postgresql/data
  #   healthcheck:
  #     test: ["CMD", "pg_isready", "-U", "airflow"]
  #     interval: 10s
  #     retries: 5
  #     start_period: 5s
  #   restart: always
  mysql:
    image: mysql:8.0
    command: --default-authentication-plugin=mysql_native_password
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: airflowpassword
      MYSQL_DATABASE: airflow
      MYSQL_USER: airflow
      MYSQL_PASSWORD: airflowpassword
    volumes:
      - mysql-db-volume:/var/lib/mysql
    healthcheck:
      test: ["CMD", "mysqladmin" ,"ping", "-h", "localhost"]
      interval: 20s
      timeout: 10s
      retries: 5
    ports:
      - "13306:3306"

  # redis:
  #   image: redis:latest
  #   expose:
  #     - 6379
  #   healthcheck:
  #     test: ["CMD", "redis-cli", "ping"]
  #     interval: 10s
  #     timeout: 30s
  #     retries: 50
  #     start_period: 30s
  #   restart: always

  airflow-webserver:
    <<: *airflow-common
    command: webserver
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-scheduler:
    <<: *airflow-common
    command: scheduler
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8974/health"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  # airflow-worker:
  #   <<: *airflow-common
  #   command: celery worker
  #   healthcheck:
  #     test:
  #       - "CMD-SHELL"
  #       - 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"'
  #     interval: 30s
  #     timeout: 10s
  #     retries: 5
  #     start_period: 30s
  #   environment:
  #     <<: *airflow-common-env
  #     # Required to handle warm shutdown of the celery workers properly
  #     # See https://airflow.apache.org/docs/docker-stack/entrypoint.html#signal-propagation
  #     DUMB_INIT_SETSID: "0"
  #   restart: always
  #   depends_on:
  #     <<: *airflow-common-depends-on
  #     airflow-init:
  #       condition: service_completed_successfully

  airflow-triggerer:
    <<: *airflow-common
    command: triggerer
    healthcheck:
      test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob --hostname "$${HOSTNAME}"']
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-init:
    <<: *airflow-common
    entrypoint: /bin/bash
    # yamllint disable rule:line-length
    command:
      - -c
      - |
        function ver() {
          printf "%04d%04d%04d%04d" $${1//./ }
        }
        airflow_version=$$(AIRFLOW__LOGGING__LOGGING_LEVEL=INFO && gosu airflow airflow version)
        airflow_version_comparable=$$(ver $${airflow_version})
        min_airflow_version=2.2.0
        min_airflow_version_comparable=$$(ver $${min_airflow_version})
        if (( airflow_version_comparable < min_airflow_version_comparable )); then
          echo
          echo -e "\033[1;31mERROR!!!: Too old Airflow version $${airflow_version}!\e[0m"
          echo "The minimum Airflow version supported: $${min_airflow_version}. Only use this or higher!"
          echo
          exit 1
        fi
        if [[ -z "${AIRFLOW_UID}" ]]; then
          echo
          echo -e "\033[1;33mWARNING!!!: AIRFLOW_UID not set!\e[0m"
          echo "If you are on Linux, you SHOULD follow the instructions below to set "
          echo "AIRFLOW_UID environment variable, otherwise files will be owned by root."
          echo "For other operating systems you can get rid of the warning with manually created .env file:"
          echo "    See: https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#setting-the-right-airflow-user"
          echo
        fi
        one_meg=1048576
        mem_available=$$(($$(getconf _PHYS_PAGES) * $$(getconf PAGE_SIZE) / one_meg))
        cpus_available=$$(grep -cE 'cpu[0-9]+' /proc/stat)
        disk_available=$$(df / | tail -1 | awk '{print $$4}')
        warning_resources="false"
        if (( mem_available < 4000 )) ; then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough memory available for Docker.\e[0m"
          echo "At least 4GB of memory required. You have $$(numfmt --to iec $$((mem_available * one_meg)))"
          echo
          warning_resources="true"
        fi
        if (( cpus_available < 2 )); then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough CPUS available for Docker.\e[0m"
          echo "At least 2 CPUs recommended. You have $${cpus_available}"
          echo
          warning_resources="true"
        fi
        if (( disk_available < one_meg * 10 )); then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough Disk space available for Docker.\e[0m"
          echo "At least 10 GBs recommended. You have $$(numfmt --to iec $$((disk_available * 1024 )))"
          echo
          warning_resources="true"
        fi
        if [[ $${warning_resources} == "true" ]]; then
          echo
          echo -e "\033[1;33mWARNING!!!: You have not enough resources to run Airflow (see above)!\e[0m"
          echo "Please follow the instructions to increase amount of resources available:"
          echo "   https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#before-you-begin"
          echo
        fi
        mkdir -p /sources/logs /sources/dags /sources/plugins
        chown -R "${AIRFLOW_UID}:0" /sources/{logs,dags,plugins}
        exec /entrypoint airflow version
    # yamllint enable rule:line-length
    environment:
      <<: *airflow-common-env
      _AIRFLOW_DB_UPGRADE: 'true'
      _AIRFLOW_WWW_USER_CREATE: 'true'
      _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
      _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
      _PIP_ADDITIONAL_REQUIREMENTS: ''
    user: "0:0"
    volumes:
      - ${AIRFLOW_PROJ_DIR:-.}:/sources

  airflow-cli:
    <<: *airflow-common
    profiles:
      - debug
    environment:
      <<: *airflow-common-env
      CONNECTION_CHECK_MAX_COUNT: "0"
    # Workaround for entrypoint issue. See: https://github.com/apache/airflow/issues/16252
    command:
      - bash
      - -c
      - airflow

  # You can enable flower by adding "--profile flower" option e.g. docker-compose --profile flower up
  # or by explicitly targeted on the command line e.g. docker-compose up flower.
  # See: https://docs.docker.com/compose/profiles/
  # flower:
  #   <<: *airflow-common
  #   command: celery flower
  #   profiles:
  #     - flower
  #   ports:
  #     - "5555:5555"
  #   healthcheck:
  #     test: ["CMD", "curl", "--fail", "http://localhost:5555/"]
  #     interval: 30s
  #     timeout: 10s
  #     retries: 5
  #     start_period: 30s
  #   restart: always
  #   depends_on:
  #     <<: *airflow-common-depends-on
  #     airflow-init:
  #       condition: service_completed_successfully

volumes:
  # postgres-db-volume:
  mysql-db-volume:
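
One note on the connection strings above: a bare mysql:// URL makes SQLAlchemy fall back to its default MySQLdb (mysqlclient) driver, which the official Airflow image should already bundle. If you prefer to be explicit, the Airflow docs spell the driver out in the URL, along these lines:

    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: mysql+mysqldb://airflow:airflowpassword@mysql/airflow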

4. Remove the commented-out parts

Here's the file with the commented-out parts removed.

version: '3.8'
x-airflow-common:
  &airflow-common
  # In order to add custom dependencies or upgrade provider packages you can use your extended image.
  # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
  # and uncomment the "build" line below, Then run `docker-compose build` to build the images.
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.6.1}
  # build: .
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: LocalExecutor
    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: mysql://airflow:airflowpassword@mysql/airflow
    # For backward compatibility, with Airflow <2.3
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: mysql://airflow:airflowpassword@mysql/airflow
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'true'
    AIRFLOW__API__AUTH_BACKENDS: 'airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session'
    # yamllint disable rule:line-length
    # Use simple http server on scheduler for health checks
    # See https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/check-health.html#scheduler-health-check-server
    # yamllint enable rule:line-length
    AIRFLOW__SCHEDULER__ENABLE_HEALTH_CHECK: 'true'
    # WARNING: Use _PIP_ADDITIONAL_REQUIREMENTS option ONLY for a quick checks
    # for other purpose (development, test and especially production usage) build/extend Airflow image.
    _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
  volumes:
    - ${AIRFLOW_PROJ_DIR:-.}/dags:/opt/airflow/dags
    - ${AIRFLOW_PROJ_DIR:-.}/logs:/opt/airflow/logs
    - ${AIRFLOW_PROJ_DIR:-.}/config:/opt/airflow/config
    - ${AIRFLOW_PROJ_DIR:-.}/plugins:/opt/airflow/plugins
  user: "${AIRFLOW_UID:-50000}:0"
  depends_on:
    &airflow-common-depends-on
    mysql:
      condition: service_healthy

services:
  mysql:
    image: mysql:8.0
    command: --default-authentication-plugin=mysql_native_password
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: airflowpassword
      MYSQL_DATABASE: airflow
      MYSQL_USER: airflow
      MYSQL_PASSWORD: airflowpassword
    volumes:
      - mysql-db-volume:/var/lib/mysql
    healthcheck:
      test: ["CMD", "mysqladmin" ,"ping", "-h", "localhost"]
      interval: 20s
      timeout: 10s
      retries: 5
    ports:
      - "13306:3306"

  airflow-webserver:
    <<: *airflow-common
    command: webserver
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-scheduler:
    <<: *airflow-common
    command: scheduler
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8974/health"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-triggerer:
    <<: *airflow-common
    command: triggerer
    healthcheck:
      test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob --hostname "$${HOSTNAME}"']
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-init:
    <<: *airflow-common
    entrypoint: /bin/bash
    # yamllint disable rule:line-length
    command:
      - -c
      - |
        function ver() {
          printf "%04d%04d%04d%04d" $${1//./ }
        }
        airflow_version=$$(AIRFLOW__LOGGING__LOGGING_LEVEL=INFO && gosu airflow airflow version)
        airflow_version_comparable=$$(ver $${airflow_version})
        min_airflow_version=2.2.0
        min_airflow_version_comparable=$$(ver $${min_airflow_version})
        if (( airflow_version_comparable < min_airflow_version_comparable )); then
          echo
          echo -e "\033[1;31mERROR!!!: Too old Airflow version $${airflow_version}!\e[0m"
          echo "The minimum Airflow version supported: $${min_airflow_version}. Only use this or higher!"
          echo
          exit 1
        fi
        if [[ -z "${AIRFLOW_UID}" ]]; then
          echo
          echo -e "\033[1;33mWARNING!!!: AIRFLOW_UID not set!\e[0m"
          echo "If you are on Linux, you SHOULD follow the instructions below to set "
          echo "AIRFLOW_UID environment variable, otherwise files will be owned by root."
          echo "For other operating systems you can get rid of the warning with manually created .env file:"
          echo "    See: https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#setting-the-right-airflow-user"
          echo
        fi
        one_meg=1048576
        mem_available=$$(($$(getconf _PHYS_PAGES) * $$(getconf PAGE_SIZE) / one_meg))
        cpus_available=$$(grep -cE 'cpu[0-9]+' /proc/stat)
        disk_available=$$(df / | tail -1 | awk '{print $$4}')
        warning_resources="false"
        if (( mem_available < 4000 )) ; then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough memory available for Docker.\e[0m"
          echo "At least 4GB of memory required. You have $$(numfmt --to iec $$((mem_available * one_meg)))"
          echo
          warning_resources="true"
        fi
        if (( cpus_available < 2 )); then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough CPUS available for Docker.\e[0m"
          echo "At least 2 CPUs recommended. You have $${cpus_available}"
          echo
          warning_resources="true"
        fi
        if (( disk_available < one_meg * 10 )); then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough Disk space available for Docker.\e[0m"
          echo "At least 10 GBs recommended. You have $$(numfmt --to iec $$((disk_available * 1024 )))"
          echo
          warning_resources="true"
        fi
        if [[ $${warning_resources} == "true" ]]; then
          echo
          echo -e "\033[1;33mWARNING!!!: You have not enough resources to run Airflow (see above)!\e[0m"
          echo "Please follow the instructions to increase amount of resources available:"
          echo "   https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#before-you-begin"
          echo
        fi
        mkdir -p /sources/logs /sources/dags /sources/plugins
        chown -R "${AIRFLOW_UID}:0" /sources/{logs,dags,plugins}
        exec /entrypoint airflow version
    # yamllint enable rule:line-length
    environment:
      <<: *airflow-common-env
      _AIRFLOW_DB_UPGRADE: 'true'
      _AIRFLOW_WWW_USER_CREATE: 'true'
      _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
      _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
      _PIP_ADDITIONAL_REQUIREMENTS: ''
    user: "0:0"
    volumes:
      - ${AIRFLOW_PROJ_DIR:-.}:/sources

  airflow-cli:
    <<: *airflow-common
    profiles:
      - debug
    environment:
      <<: *airflow-common-env
      CONNECTION_CHECK_MAX_COUNT: "0"
    # Workaround for entrypoint issue. See: https://github.com/apache/airflow/issues/16252
    command:
      - bash
      - -c
      - airflow

volumes:
  mysql-db-volume:
Initialize and start it the same way as before:

% docker-compose up airflow-init
% docker-compose up -d
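
As a sanity check, you can confirm the containers are healthy and that Airflow actually created its metadata tables in MySQL via the published port (this assumes a mysql client on the host; the credentials are the ones defined in the compose file):

% docker-compose ps
% mysql -h 127.0.0.1 -P 13306 -u airflow -pairflowpassword airflow -e 'show tables;'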

Connecting to MySQL on a Remote Server from DBeaver

I created a connection from DBeaver to MySQL running on a remote server, so I'm leaving these notes.

This time I used an SSH tunnel: connect to the server over SSH, then access MySQL as if it were on localhost.

  1. New connection
  2. Configure the SSH tunnel
  3. allowPublicKeyRetrieval
  4. Enter the MySQL login credentials

New connection

First, click the Database tab in DBeaver, then click "New Connection".

A connection-type selection screen appears; in this case, select "MySQL".

Configure the SSH tunnel

Since we're connecting through an SSH tunnel, first click the "SSH" tab and enter the SSH connection details for the server hosting MySQL.

Note that this is not the MySQL login information.

Test the SSH connection part.

It succeeded.
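
For reference, DBeaver's SSH tunnel does roughly the same thing as a manual port forward like the one below (host name, user, and local port here are placeholders):

% ssh -N -L 13306:127.0.0.1:3306 user@remote-server
% mysql -h 127.0.0.1 -P 13306 -u airflow -p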

allowPublicKeyRetrieval

On the "Driver properties" tab, set "allowPublicKeyRetrieval" to TRUE.
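
This is needed because MySQL 8's default caching_sha2_password authentication makes the client fetch the server's RSA public key when the connection isn't encrypted. Setting the property is equivalent to adding the parameter to the JDBC URL yourself, along these lines (the URL is a generic example):

jdbc:mysql://localhost:3306/airflow?allowPublicKeyRetrieval=true&useSSL=false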

Enter the MySQL login credentials

Finally, enter the MySQL login information: database name, user name, password, and so on.

Let's test this one too.

Success. Click "Finish" and the connection setup is complete.

Using MySQL Installed Locally on a Mac

  1. Start MySQL
  2. Connect to MySQL
  3. Disconnect from MySQL
  4. Stop MySQL

1. Start MySQL

Start the local MySQL with "brew services start mysql".

% brew services start mysql
==> Successfully started `mysql` (label: homebrew.mxcl.mysql)

2. Connect to MySQL

Connect to MySQL with "mysql -u root".

% mysql -u root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 8.0.23 Homebrew

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

3. Disconnect from MySQL

Leave the MySQL prompt with "exit".

mysql> exit
Bye

4. Stop MySQL

Stop it with "brew services stop mysql".

% brew services stop mysql
==> Successfully stopped `mysql` (label: homebrew.mxcl.mysql)

[mac] Notes from Creating a Table in PostgreSQL and Inserting CSV Data

  1. Create the table
  2. Copy (insert) the CSV file into the table
  3. (Bonus) Check the model definition with Django's inspectdb

1. Create the table

In a terminal, run "psql database_name" to open the target database, then paste in a CREATE statement you've written elsewhere. Be careful not to forget the trailing ";".

In the example below, I created a table named "yt_trend_channel".

sample_db=# create table yt_trend_channel (
channel_id varchar(30),
view_count varchar(20),
join_date varchar(20),
country varchar(20),
channel_name varchar(100),
thumbnail_url varchar(50),
subscriber_count varchar(20),
data_date varchar(8)
);

2. Copy (insert) the CSV file into the table

Copy the CSV file into the table.

If you cd into the directory containing the file before launching psql, you can pass just the file name, which keeps things simple.
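
For example (paths here are made up):

% cd ~/data/csv
% psql sample_db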

In the example below, the data from a separately prepared CSV file named yt_trend_channel.csv is copied (inserted) into the yt_trend_channel table created above.

sample_db=# \copy yt_trend_channel from 'yt_trend_channel.csv' with csv
COPY 4259
sample_db=#

A total of 4259 records were copied.

Let's check with a select query (only the first few rows are shown below).

sample_db=# select * from yt_trend_channel;
        channel_id        | view_count  | join_date  |    country     |                            channel_name                             |                   thumbnail_url                    | subscriber_count | data_date 
--------------------------+-------------+------------+----------------+---------------------------------------------------------------------+----------------------------------------------------+------------------+-----------
 UC9OvLVXb-okaVtoK9V8Biwg | 103499474   | 2018/05/23 | 日本           | ABEMA バラエティ【公式】                                            | https://yt3.ggpht.com/ytc/AAUvwnjBKC3y2KycvPfvHdn1 |  19.9万人        | 20210104
 UCZZ0UGjWsRdM8_5bsqtxYaQ | 190130764   | 2015/10/02 | 日本           | RIZIN FIGHTING FEDERATION                                           | https://yt3.ggpht.com/ytc/AAUvwnjDj0e-hS0um-ZKG1Tc |  35万人          | 20210104
 UCaminwG9MTO4sLYeC3s6udA | 3201321432  | 2016/03/18 | 日本           | ヒカル(Hikaru)                                                    | https://yt3.ggpht.com/ytc/AAUvwnhBYwfz6DycI4xBURsZ |  428万人         | 20210104
 UCPn_e-7LD7SbzIBxdU_XBnQ | 21113326    | 2020/07/30 | 日本           | あーたろチャンネル                                                  | https://yt3.ggpht.com/ytc/AAUvwngZDvNAYRF_qgPfOgy9 |  18.7万人        | 20210104
 UC4-TMrb7Mm4KnYx1VsUgcJA | 135489838   | 2012/08/04 | 日本           | あいり                                                              | https://yt3.ggpht.com/ytc/AAUvwngy1VncfSpEQkWmZXYK |  26.8万人        | 20210104

Incidentally, the MySQL equivalent looks something like this:

load data local infile  'yt_trend_channel.csv'
into table yt_trend_channel
fields terminated by ',' 
enclosed by '"'
;
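
Note that MySQL 8 ships with local_infile disabled, so the statement above may be rejected until it's enabled on both the server and the client; something like the following (requires appropriate privileges):

% mysql --local-infile=1 -u root -p
mysql> set global local_infile = 1;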

3. (Bonus) Check the model definition with Django's inspectdb

This table is actually meant to be used from Django, so let's check the model definition with "python manage.py inspectdb table_name".

% python manage.py inspectdb yt_trend_channel
# This is an auto-generated Django model module.
# You'll have to do the following manually to clean this up:
#   * Rearrange models' order
#   * Make sure each model has one field with primary_key=True
#   * Make sure each ForeignKey has `on_delete` set to the desired behavior.
#   * Remove `managed = False` lines if you wish to allow Django to create, modify, and delete the table
# Feel free to rename the models, but don't rename db_table values or field names.
from django.db import models


class YtTrendChannel(models.Model):
    channel_id = models.CharField(max_length=30, blank=True, null=True)
    view_count = models.CharField(max_length=20, blank=True, null=True)
    join_date = models.CharField(max_length=20, blank=True, null=True)
    country = models.CharField(max_length=20, blank=True, null=True)
    channel_name = models.CharField(max_length=100, blank=True, null=True)
    thumbnail_url = models.CharField(max_length=50, blank=True, null=True)
    subscriber_count = models.CharField(max_length=20, blank=True, null=True)
    data_date = models.CharField(max_length=8, blank=True, null=True)

    class Meta:
        managed = False
        db_table = 'yt_trend_channel'
% 

Copying the class into models.py lets you use it as a Django model. Note that a model must have one field with primary_key=True, so check that before you use it.

[mac] Using a PostgreSQL Database in a Django Project (inside a virtualenv)

Here's how to create a PostgreSQL database for a Django project inside a Python virtual environment and connect it to the project.

As an example, I'll create a database called "sample_db" and hook it up to the project.

  1. Create a new database in PostgreSQL
  2. Connect the database to the Django project
  3. Install the connector module in the virtualenv
  4. By the way
  5. Reference book

1. Create a new database in PostgreSQL

Start PostgreSQL and create a database named "sample_db".

% brew services start postgresql
% createdb sample_db

Run "psql -l" to list the databases, and you can confirm that "sample_db" was created.

% psql -l
                                         List of databases
   Name    |      Owner       | Encoding | Collate | Ctype |           Access privileges           
-----------+------------------+----------+---------+-------+---------------------------------------
 postgres  | username         | UTF8     | C       | C     | 
 sample_db | username         | UTF8     | C       | C     | 
 template0 | username         | UTF8     | C       | C     | =c/username                    +
           |                  |          |         |       | username=CTc/username
 template1 | username         | UTF8     | C       | C     | =c/username                    +
           |                  |          |         |       | username=CTc/username
(4 rows)

By the way, to delete the database, run "dropdb sample_db".

2. Connect the database to the Django project

Edit the Django project's settings.py to point it at the database we just created.

The settings.py generated when the project was created contains the following:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': BASE_DIR / 'db.sqlite3',
    }
}

Change it as follows:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'sample_db',
        'USER': os.environ.get('DB_USER'),
        'PASSWORD': os.environ.get('DB_PASSWORD'),
        'HOST': '',
        'PORT': '',
    }
}

Also add the following near the top of the file to import the os module:

import os
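
The values are then read from the environment; for example, you might export them in the shell before starting the dev server (the values below are placeholders):

% export DB_USER=your_db_user
% export DB_PASSWORD=your_db_password
% python manage.py migrate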

3. Install the connector module in the virtualenv

To connect to a PostgreSQL database from Python, you need a driver module. Here we install psycopg2, the most widely used one.

With the virtual environment activated, run "pip install psycopg2-binary".

(sample_venv) % pip install psycopg2-binary
Collecting psycopg2-binary
  Downloading https://files.pythonhosted.org/packages/2c/85/c26507efb110f5a91f503e517f1db55f12ebecb001ff224b2cea234a07ef/psycopg2_binary-2.8.6-cp38-cp38-macosx_10_9_x86_64.macosx_10_9_intel.macosx_10_10_intel.macosx_10_10_x86_64.whl (1.5MB)
     |████████████████████████████████| 1.5MB 7.1MB/s 
Installing collected packages: psycopg2-binary
Successfully installed psycopg2-binary-2.8.6
WARNING: You are using pip version 19.2.3, however version 21.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
(sample_venv) % 

4. By the way

When I ran runserver from PyCharm, the error below popped up and startled me for a moment, but it went away once I started PostgreSQL.

django.db.utils.OperationalError: could not connect to server: No such file or directory

Incidentally, if you comment out the DATABASES section in settings.py, runserver runs fine even when the database isn't running.

5. Reference book

I first studied the overall picture of Django by following the book below. It builds its project on PostgreSQL, so it's worth a look as a reference.

動かして学ぶ! Python Django開発入門 (NEXT ONE)

Installing PostgreSQL on a Mac

Install the latest version of PostgreSQL using Homebrew.

Environment: macOS Big Sur, version 11.2.2

  1. Install PostgreSQL
  2. Start and stop PostgreSQL

Install PostgreSQL

Run the following command:

% brew install postgresql

The log is shown below. (It includes some unrelated output, such as Homebrew's own auto-update.)

% brew install postgresql            
Updating Homebrew...
==> Auto-updated Homebrew!
Updated 1 tap (homebrew/core).
==> New Formulae
aerc                       hblock                     php@7.4
aida-header                heppdt2                    pickle
aliddns                    hexo                       pkger
ansible@2.9                htmltest                   projectm
arturo                     iconsur                    ptpython
as-tree                    indicators                 pyqt-3d
atkmm@2.28                 inframap                   pyqt-builder
attr                       isl@0.18                   pyqt-networkauth
aws-console                isort                      pyqt@5
aws-rotate-key             jpeg-xl                    pyside@2
bandit                     jql                        python-tabulate
bas55                      json5                      qt-libiodbc
bit-git                    k3sup                      qt-mariadb
blaze                      kcgi                       qt-mysql
bpython                    keptn                      qt-percona-server
bpytop                     khiva                      qt-postgresql
cairomm@1.14               klee                       qt-unixodbc
cargo-audit                ko                         rain
cargo-edit                 kotlin-language-server     rdkit
cargo-watch                kube-linter                regipy
cfn-flip                   kubecm                     richmd
cfn-format                 kubergrunt                 rm-improved
chars                      kubevela                   rtl_433
checkov                    lab                        rttr
cherrytree                 latino                     ruby@2.7
clash                      libbsd                     saml2aws
cloudflare-wrangler        libcap                     showkey
coin3d                     libcap-ng                  snowpack
condure                    libfuse                    spack
counterfeiter              libmd                      sqlancer
cpplint                    libnsl                     sqlc
crane                      libpinyin                  strace
crcany                     libpipeline                t-rec
crispy-doom                libprelude                 taskwarrior-tui
curlie                     librttopo                  tctl
cyrus-sasl                 libslirp                   tendermint
dasel                      libtirpc                   terracognita
datalad                    libva                      terraform@0.13
delve                      libvdpau                   tgenv
device-mapper              libvnc                     thrax
dnsx                       libxml++@4                 tm
docui                      libxml++@5                 tomcat@9
dog                        libxp                      trec_eval
dotenv-linter              linux-headers              tree-sitter
dprint                     logswan                    truffle
driftctl                   lp_solve                   utf8cpp
dstask                     lua@5.3                    v2ray
duf                        luajit-openresty           vc
easy-rsa                   luv                        vint
efm-langserver             magic_enum                 vitess
enzyme                     markdownlint-cli           vsh
evernote2md                massdns                    vulture
f3d                        md4c                       waffle
fizz                       mermaid-cli                wangle
flow-cli                   mesa-glu                   wayland
flux                       mockery                    wayland-protocols
fnm                        mpdecimal                  wllvm
forge                      msc-generator              xbitmaps
gallery-dl                 muffet                     xcb-util
gcalcli                    name-that-hash             xcb-util-cursor
gdu                        ncspot                     xcb-util-image
geph4                      nicotine-plus              xcb-util-keysyms
ghc@9                      node@14                    xcb-util-renderutil
gitlint                    nuclei                     xcb-util-wm
giza                       oakc                       xcbeautify
glab                       ocaml-zarith               xcinfo
glibc                      open-adventure             xcprojectlint
go@1.15                    openalpr                   xinput
gofumpt                    openjdk@8                  xkeyboardconfig
gojq                       openmama                   xorgrgb
googletest                 openmodelica               xray
gopass-jsonapi             or-tools                   xterm
gopls                      oras                       yh
goredo                     osmcoastline               ykdl
gosec                      overdrive                  youtube-dlc
gost                       ox                         youtubedr
gping                      parliament                 yq@3
grokj2k                    pcalc                      yubikey-agent
grokmirror                 pdftilecut                 zsh-async
gtkmm4                     pdm
h2spec                     phive
==> Updated Formulae
Updated 5020 formulae.
==> Renamed Formulae
glibmm@2.64 -> glibmm@2.66               pangomm@2.42 -> pangomm@2.46
gtk+4 -> gtk4                            prest -> prestd
ht-rust -> xh                            qt5 -> qt@5
now-cli -> vercel-cli
==> Deleted Formulae
avian                      gobby                      scw@1
boost@1.55                 godep                      stlviewer
boost@1.59                 llvm@6                     unp64
confluent-platform         meson-internal             unrar
curl-openssl               mysql-connector-c++@1.1    woboq_codebrowser
dtrx                       ori                        xspin
fmsx                       pgplot
geant4                     rmtrash

==> Downloading https://homebrew.bintray.com/bottles/icu4c-68.2.big_sur.bottle.t
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/ba2fd8c7d37025c93db6b
######################################################################## 100.0%
==> Downloading https://homebrew.bintray.com/bottles/openssl%401.1-1.1.1j.big_su
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/5725361adcd088a5b4fb2
######################################################################## 100.0%
==> Downloading https://homebrew.bintray.com/bottles/krb5-1.19.1.big_sur.bottle.
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/d544c1111503eb27b253e
######################################################################## 100.0%
==> Downloading https://homebrew.bintray.com/bottles/readline-8.1.big_sur.bottle
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/2cc3a9582e3c7e21eb3c2
######################################################################## 100.0%
==> Downloading https://homebrew.bintray.com/bottles/postgresql-13.2_1.big_sur.b
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/67a547842ae49911d301d
######################################################################## 100.0%
==> Installing dependencies for postgresql: icu4c, openssl@1.1, krb5 and readline
==> Installing postgresql dependency: icu4c
==> Pouring icu4c-68.2.big_sur.bottle.tar.gz
==> Caveats
icu4c is keg-only, which means it was not symlinked into /usr/local,
because macOS provides libicucore.dylib (but nothing else).

If you need to have icu4c first in your PATH, run:
  echo 'export PATH="/usr/local/opt/icu4c/bin:$PATH"' >> ~/.zshrc
  echo 'export PATH="/usr/local/opt/icu4c/sbin:$PATH"' >> ~/.zshrc

For compilers to find icu4c you may need to set:
  export LDFLAGS="-L/usr/local/opt/icu4c/lib"
  export CPPFLAGS="-I/usr/local/opt/icu4c/include"

==> Summary
🍺  /usr/local/Cellar/icu4c/68.2: 259 files, 72.5MB
==> Installing postgresql dependency: openssl@1.1
==> Pouring openssl@1.1-1.1.1j.big_sur.bottle.tar.gz
==> Caveats
A CA file has been bootstrapped using certificates from the system
keychain. To add additional certificates, place .pem files in
  /usr/local/etc/openssl@1.1/certs

and run
  /usr/local/opt/openssl@1.1/bin/c_rehash

openssl@1.1 is keg-only, which means it was not symlinked into /usr/local,
because macOS provides LibreSSL.

If you need to have openssl@1.1 first in your PATH, run:
  echo 'export PATH="/usr/local/opt/openssl@1.1/bin:$PATH"' >> ~/.zshrc

For compilers to find openssl@1.1 you may need to set:
  export LDFLAGS="-L/usr/local/opt/openssl@1.1/lib"
  export CPPFLAGS="-I/usr/local/opt/openssl@1.1/include"

==> Summary
🍺  /usr/local/Cellar/openssl@1.1/1.1.1j: 8,071 files, 18.5MB
==> Installing postgresql dependency: krb5
==> Pouring krb5-1.19.1.big_sur.bottle.tar.gz
==> Caveats
krb5 is keg-only, which means it was not symlinked into /usr/local,
because macOS already provides this software and installing another version in
parallel can cause all kinds of trouble.

If you need to have krb5 first in your PATH, run:
  echo 'export PATH="/usr/local/opt/krb5/bin:$PATH"' >> ~/.zshrc
  echo 'export PATH="/usr/local/opt/krb5/sbin:$PATH"' >> ~/.zshrc

For compilers to find krb5 you may need to set:
  export LDFLAGS="-L/usr/local/opt/krb5/lib"
  export CPPFLAGS="-I/usr/local/opt/krb5/include"

==> Summary
🍺  /usr/local/Cellar/krb5/1.19.1: 162 files, 5.2MB
==> Installing postgresql dependency: readline
==> Pouring readline-8.1.big_sur.bottle.tar.gz
==> Caveats
readline is keg-only, which means it was not symlinked into /usr/local,
because macOS provides BSD libedit.

For compilers to find readline you may need to set:
  export LDFLAGS="-L/usr/local/opt/readline/lib"
  export CPPFLAGS="-I/usr/local/opt/readline/include"

==> Summary
🍺  /usr/local/Cellar/readline/8.1: 48 files, 1.6MB
==> Installing postgresql
==> Pouring postgresql-13.2_1.big_sur.bottle.tar.gz
==> /usr/local/Cellar/postgresql/13.2_1/bin/initdb --locale=C -E UTF-8 /usr/loca
==> Caveats
To migrate existing data from a previous major version of PostgreSQL run:
  brew postgresql-upgrade-database

This formula has created a default database cluster with:
  initdb --locale=C -E UTF-8 /usr/local/var/postgres
For more details, read:
  https://www.postgresql.org/docs/13/app-initdb.html

To have launchd start postgresql now and restart at login:
  brew services start postgresql
Or, if you don't want/need a background service you can just run:
  pg_ctl -D /usr/local/var/postgres start
==> Summary
🍺  /usr/local/Cellar/postgresql/13.2_1: 3,218 files, 42.6MB
==> `brew cleanup` has not been run in 30 days, running now...
Removing: /Users/username/Library/Caches/Homebrew/mysql--8.0.21_1.catalina.bottle.tar.gz... (80.4MB)
Removing: /usr/local/Cellar/openssl@1.1/1.1.1h... (8,067 files, 18.5MB)
Removing: /Users/username/Library/Caches/Homebrew/openssl@1.1--1.1.1h.catalina.bottle.tar.gz... (5.4MB)
Removing: /Users/username/Library/Caches/Homebrew/protobuf--3.13.0.catalina.bottle.tar.gz... (4.8MB)
Removing: /Users/username/Library/Logs/Homebrew/mysql... (1.1KB)
Removing: /Users/username/Library/Logs/Homebrew/protobuf... (64B)
Pruned 1 symbolic links and 2 directories from /usr/local
==> Upgrading 2 dependents:
mysql 8.0.21_1 -> 8.0.23_1, wget 1.20.3_2 -> 1.21.1
==> Upgrading mysql 8.0.21_1 -> 8.0.23_1 
==> Downloading https://homebrew.bintray.com/bottles/protobuf-3.15.5.big_sur.bot
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/872baaeb92aaaaa8d2634
######################################################################## 100.0%
==> Downloading https://homebrew.bintray.com/bottles/mysql-8.0.23_1.big_sur.bott
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/9e9e4b5bfcad47adfbf7f
######################################################################## 100.0%
==> Installing dependencies for mysql: protobuf
==> Installing mysql dependency: protobuf
==> Pouring protobuf-3.15.5.big_sur.bottle.tar.gz
==> Caveats
Emacs Lisp files have been installed to:
  /usr/local/share/emacs/site-lisp/protobuf
==> Summary
🍺  /usr/local/Cellar/protobuf/3.15.5: 211 files, 17.7MB
==> Installing mysql
==> Pouring mysql-8.0.23_1.big_sur.bottle.tar.gz
==> Caveats
We've installed your MySQL database without a root password. To secure it run:
    mysql_secure_installation

MySQL is configured to only allow connections from localhost by default

To connect run:
    mysql -uroot

To have launchd start mysql now and restart at login:
  brew services start mysql
Or, if you don't want/need a background service you can just run:
  mysql.server start
==> Summary
🍺  /usr/local/Cellar/mysql/8.0.23_1: 298 files, 297.7MB
Removing: /usr/local/Cellar/mysql/8.0.21_1... (290 files, 291.2MB)
==> Upgrading wget 1.20.3_2 -> 1.21.1 
==> Downloading https://homebrew.bintray.com/bottles/gettext-0.21.big_sur.bottle
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/a025e143fe3f5f7e24a93
######################################################################## 100.0%
==> Downloading https://homebrew.bintray.com/bottles/wget-1.21.1.big_sur.bottle.
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/e6ea2a50b8196206f7072
######################################################################## 100.0%
==> Installing dependencies for wget: gettext
==> Installing wget dependency: gettext
==> Pouring gettext-0.21.big_sur.bottle.tar.gz
🍺  /usr/local/Cellar/gettext/0.21: 1,953 files, 19.8MB
==> Installing wget
==> Pouring wget-1.21.1.big_sur.bottle.tar.gz
🍺  /usr/local/Cellar/wget/1.21.1: 88 files, 4MB
Removing: /usr/local/Cellar/wget/1.20.3_2... (50 files, 4.0MB)
==> Checking for dependents of upgraded formulae...
==> No broken dependents found!
==> Caveats
==> icu4c
icu4c is keg-only, which means it was not symlinked into /usr/local,
because macOS provides libicucore.dylib (but nothing else).

If you need to have icu4c first in your PATH, run:
  echo 'export PATH="/usr/local/opt/icu4c/bin:$PATH"' >> ~/.zshrc
  echo 'export PATH="/usr/local/opt/icu4c/sbin:$PATH"' >> ~/.zshrc

For compilers to find icu4c you may need to set:
  export LDFLAGS="-L/usr/local/opt/icu4c/lib"
  export CPPFLAGS="-I/usr/local/opt/icu4c/include"

==> openssl@1.1
A CA file has been bootstrapped using certificates from the system
keychain. To add additional certificates, place .pem files in
  /usr/local/etc/openssl@1.1/certs

and run
  /usr/local/opt/openssl@1.1/bin/c_rehash

openssl@1.1 is keg-only, which means it was not symlinked into /usr/local,
because macOS provides LibreSSL.

If you need to have openssl@1.1 first in your PATH, run:
  echo 'export PATH="/usr/local/opt/openssl@1.1/bin:$PATH"' >> ~/.zshrc

For compilers to find openssl@1.1 you may need to set:
  export LDFLAGS="-L/usr/local/opt/openssl@1.1/lib"
  export CPPFLAGS="-I/usr/local/opt/openssl@1.1/include"

==> krb5
krb5 is keg-only, which means it was not symlinked into /usr/local,
because macOS already provides this software and installing another version in
parallel can cause all kinds of trouble.

If you need to have krb5 first in your PATH, run:
  echo 'export PATH="/usr/local/opt/krb5/bin:$PATH"' >> ~/.zshrc
  echo 'export PATH="/usr/local/opt/krb5/sbin:$PATH"' >> ~/.zshrc

For compilers to find krb5 you may need to set:
  export LDFLAGS="-L/usr/local/opt/krb5/lib"
  export CPPFLAGS="-I/usr/local/opt/krb5/include"

==> readline
readline is keg-only, which means it was not symlinked into /usr/local,
because macOS provides BSD libedit.

For compilers to find readline you may need to set:
  export LDFLAGS="-L/usr/local/opt/readline/lib"
  export CPPFLAGS="-I/usr/local/opt/readline/include"

==> postgresql
To migrate existing data from a previous major version of PostgreSQL run:
  brew postgresql-upgrade-database

This formula has created a default database cluster with:
  initdb --locale=C -E UTF-8 /usr/local/var/postgres
For more details, read:
  https://www.postgresql.org/docs/13/app-initdb.html

To have launchd start postgresql now and restart at login:
  brew services start postgresql
Or, if you don't want/need a background service you can just run:
  pg_ctl -D /usr/local/var/postgres start
==> protobuf
Emacs Lisp files have been installed to:
  /usr/local/share/emacs/site-lisp/protobuf
==> mysql
We've installed your MySQL database without a root password. To secure it run:
    mysql_secure_installation

MySQL is configured to only allow connections from localhost by default

To connect run:
    mysql -uroot

To have launchd start mysql now and restart at login:
  brew services start mysql
Or, if you don't want/need a background service you can just run:
  mysql.server start
% 

Run "psql --version"; if it prints the PostgreSQL version correctly, everything is fine.

% psql --version
psql (PostgreSQL) 13.2

Start and stop PostgreSQL

Start it with "brew services start postgresql". The log below appears only on the first start; you won't see it afterwards.

% brew services start postgresql
==> Tapping homebrew/services
Cloning into '/usr/local/Homebrew/Library/Taps/homebrew/homebrew-services'...
remote: Enumerating objects: 20, done.
remote: Counting objects: 100% (20/20), done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 1141 (delta 8), reused 0 (delta 0), pack-reused 1121
Receiving objects: 100% (1141/1141), 331.40 KiB | 5.18 MiB/s, done.
Resolving deltas: 100% (485/485), done.
Tapped 1 command (41 files, 420.9KB).
==> Successfully started `postgresql` (label: homebrew.mxcl.postgresql)
%

Running "psql -l" lists the databases.

% psql -l
                                         List of databases
   Name    |      Owner       | Encoding | Collate | Ctype |           Access privileges           
-----------+------------------+----------+---------+-------+---------------------------------------
 postgres  | username         | UTF8     | C       | C     | 
 template0 | username         | UTF8     | C       | C     | =c/username                    +
           |                  |          |         |       | username=CTc/username
 template1 | username         | UTF8     | C       | C     | =c/username                    +
           |                  |          |         |       | username=CTc/username
(3 rows)
%

Stop it with "brew services stop postgresql".

% brew services stop postgresql
Stopping `postgresql`... (might take a while)
==> Successfully stopped `postgresql` (label: homebrew.mxcl.postgresql)
%