# How to install Graylog in AWS Cloud

### Overview

* This guide walks through building a Docker-based image and deploying a `Graylog` cluster in the **AWS** cloud.
    

### Architecture

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1710218879458/c080ffcd-a4e4-40ec-8b2d-eb524b77269a.png align="center")

* **MongoDB**, **OpenSearch**, and **Graylog** must reside in the same **VPC** (or be reachable through **VPC** peering connections).
    
* A listener for receiving logs in real-time should be load balanced from the internal **NLB** to the internal **Graylog** service's `TCP_UDP 12201` port.
    
* The `HTTPS 443` port for external web interface access should be load balanced from the external **ALB** to the internal **Graylog** service's `HTTP 9000` port.
    

### Required Resources

* **VPC** (with **Private** and **Public Subnet**s)
    
* **MongoDB Atlas Cluster** (with **VPC Peering Connection**)
    
* **OpenSearch Service Domain** (with **Security Group**)
    
* **S3 Bucket** (for Environment Variables)
    
* **Target Group** (UDP 1541, UDP 12201, HTTP 9000)
    
* **NLB** (with **Security Group**)
    
* **ALB** (with **Security Group**)
    
* **Route 53**
    
* **ECS Cluster**
    
* **ECS Task Execution IAM Role**
    
* **ECS Task Definition**
    
* **ECS Service** (with **Security Group**)
    

### Prerequisites

* **Graylog 5.2** requires **MongoDB 5.0** or higher. `Amazon DocumentDB` is not supported, so setting up the cluster on `MongoDB Atlas` is recommended. A database named `graylog` must be created in **MongoDB** beforehand, and the credentials of a user granted the `readWrite` and `dbAdmin` roles on that database must be passed to Graylog as environment variables.
    
    ```bash
    # [Method 1] MongoDB (Local or Amazon EC2)
    $ mongosh "{mongodb-uri}" --username "{root_username}" --password "{root_password}"
    > use graylog
    > db.createUser({user: '{graylog_username}', pwd: '{graylog_password}', roles: ["readWrite","dbAdmin"]})
    > exit
    
    # [Method 2] MongoDB Atlas
    # Authenticate MongoDB Atlas account (code entry required in browser)
    $ atlas auth login
    # Create graylog user account
    $ atlas dbusers create --username {graylog_username} --password {graylog_password} --role readWrite@{mongodb-atlas-db-name} --role dbAdmin@{mongodb-atlas-db-name} --projectId {mongodb-atlas-project-id}
    ```
    
* **Graylog 5.2** supports **OpenSearch 2.x**, which is the recommended backend. Setting up `Amazon OpenSearch` is recommended. (**Elasticsearch 7.10** is also supported but not officially recommended.)
    
* Additional settings required for **Amazon OpenSearch 2.x** are as follows:
    

```bash
# [1] compatibility.override_main_response_version: false
$ nano opensearch-settings.json
{
  "persistent": {
    "compatibility.override_main_response_version": false
  }
}
 
$ curl -X PUT -d @'opensearch-settings.json' -H 'Content-Type: application/json' 'https://{opensearch-uri}/_cluster/settings'
{
  "acknowledged": true,
  "persistent": {
    "compatibility": {
      "override_main_response_version": "false"
    }
  },
  "transient": {}
}

# [2] rest.action.multi.allow_explicit_index: true
Amazon OpenSearch Console > Edit Cluster Configuration > Advanced Cluster Settings
- Check [Allow APIs that can span multiple indices and bypass index-specific access policies]
- Max clause count: 1024 (enter)
- Click [Dry Run]
- Click [Save Changes]
```

### Creating a MongoDB Atlas User

```bash
MongoDB Atlas Console

# Security
→ Click [Database Access]
→ Click [ADD NEW DATABASE USER]

# Add New Database User
→ Authentication Method: Select [Password]

# Password Authentication
→ Username: {your-graylog-mongodb-username}
→ Password: {your-graylog-mongodb-password}

# Database User Privileges
→ Built-in Role: Select [Atlas admin]

# Restrict Access to Specific Clusters/Federated Database Instances/Stream Processing Instances
→ Grant Access To: Check {your-graylog-mongodb-cluster}

→ Click [Create User]
```

### Creating an OpenSearch Security Group

* Below is an example of creating a Security Group for OpenSearch.
    

```bash
Amazon EC2 Console
→ [Security Groups]
→ [Create security group]
→ Security group name: GRAYLOG-OPENSEARCH-SG
→ Description: GRAYLOG-OPENSEARCH-SG
→ VPC: {your-vpc}
# Inbound rules
→ [Add rule]
→ Type: Select [HTTPS]
→ CIDR: {your-vpc-cidr}
→ [Create]
```
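
The console steps above can also be scripted. The following is a minimal sketch, assuming the AWS CLI v2 is installed and configured with sufficient permissions; the function name `create_opensearch_sg` and its arguments are hypothetical and not part of the original setup.

```bash
# Hypothetical helper: creates GRAYLOG-OPENSEARCH-SG and allows HTTPS from the VPC CIDR.
# Usage: create_opensearch_sg <vpc-id> <vpc-cidr>
create_opensearch_sg() {
  vpc_id="$1"
  vpc_cidr="$2"

  # create-security-group returns the new group's ID; --query extracts it as plain text.
  sg_id=$(aws ec2 create-security-group \
    --group-name GRAYLOG-OPENSEARCH-SG \
    --description GRAYLOG-OPENSEARCH-SG \
    --vpc-id "$vpc_id" \
    --query GroupId --output text) || return 1

  # Inbound rule: HTTPS (TCP 443) from inside the VPC only.
  aws ec2 authorize-security-group-ingress \
    --group-id "$sg_id" \
    --protocol tcp --port 443 \
    --cidr "$vpc_cidr" >/dev/null || return 1

  # Print the group ID so it can be reused when creating the domain.
  echo "$sg_id"
}
```

Calling `create_opensearch_sg {your-vpc-id} {your-vpc-cidr}` should print the ID of the new security group.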

### Creating an OpenSearch Service Domain

* Below is an example of creating a low-spec OpenSearch Service Domain for development purposes.
    

```bash
Amazon OpenSearch Console
→ [Domains] → [Create domain]
 
# Name
→ Domain name: graylog
 
# Domain creation method
→ Domain creation method: Select [Standard create]
 
# Templates
→ Templates: Select [Dev/test]
 
# Deployment option(s)
→ Deployment option(s): Select [Domain without standby]
→ Availability Zone(s): Select [1-AZ]
 
# Engine options
→ Version: Select [2.13 (latest)]
 
# Data nodes
→ Instance family: [General purpose]
→ Instance type: [m6g.2xlarge.search]
→ Number of nodes: 3
→ Storage type: [EBS]
→ EBS volume type: [General Purpose (SSD) - gp3]
→ EBS storage size per node: 1024

# Dedicated master nodes
→ Instance type: [m6g.large.search]
→ Number of master nodes: 3
 
# Network
→ Network: Select [VPC access]
→ IP address type: Select [IPv4 only]
→ VPC: {your-vpc}
→ Subnets: {your-private-subnet}
→ Security groups: {your-opensearch-security-group}
 
# Fine-grained access control
→ Uncheck [Enable fine-grained access control]
 
# Access policy
→ Domain access policy: [Only use fine-grained access control]

# Summary
→ [Create]
```

### Dockerfile

* Write the **Dockerfile** as follows. (The current latest version can be checked at [this link](https://github.com/Graylog2/graylog-docker).) As an option, the **Slack** plugin, which is not present in the base image, has been added.
    

```bash
FROM docker.io/graylog/graylog:5.2.5
EXPOSE 9000
EXPOSE 12201
USER root
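# wget is used here to fetch the optional Slack plugin
RUN apt-get update && apt-get install -y wget && rm -rf /var/lib/apt/lists/*
# Download the plugin directly into the image's plugin directory.
# (COPY reads from the build context, so it cannot see a file fetched by an earlier RUN.)
RUN wget -P /usr/share/graylog/plugin https://github.com/graylog-labs/graylog-plugin-slack/releases/download/3.1.0/graylog-plugin-slack-3.1.0.jar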
USER graylog
```
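
The guide does not say where the custom image is stored. As one possibility, the sketch below builds the image and pushes it to Amazon ECR. It assumes that Docker and the AWS CLI v2 are installed and that the ECR repository already exists; the function names, the repository name, and the tag are hypothetical.

```bash
# Hypothetical helper: prints the ECR image URI for an account, region, repository, and tag.
ecr_image_uri() {
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: builds the Dockerfile in the current directory and pushes it to ECR.
# Usage: build_and_push <account-id> <region> <repository> <tag>
build_and_push() {
  account="$1"; region="$2"; repo="$3"; tag="$4"
  registry="${account}.dkr.ecr.${region}.amazonaws.com"
  image=$(ecr_image_uri "$account" "$region" "$repo" "$tag")

  # Authenticate Docker against the private registry.
  aws ecr get-login-password --region "$region" \
    | docker login --username AWS --password-stdin "$registry" || return 1

  docker build -t "$image" . || return 1
  docker push "$image"
}
```

For example, `build_and_push {your-account-id} {your-region} graylog 5.2.5` would push the image under the URI printed by `ecr_image_uri`, which could then be used as the image URI of the task definition.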

### Generating Admin Account Password

* Generate the **Secret** and **Hash** values to be entered in the environment variables below for creating an admin account password.
    

```bash
# Creating GRAYLOG_PASSWORD_SECRET
$ sudo pwgen -N 1 -s 96
LSIN8jBbQBxVkIyBHtkOCyanjBLLyWWABhrQmFcHskJZ5DAr1pTRCto45UfO7RMSRlCaX2YQHS6udal3yUxwnmZisaBv0HMS

# Creating GRAYLOG_ROOT_PASSWORD_SHA2 (to be used as the admin account password upon first login)
$ echo -n "Enter Password: " && head -1 </dev/stdin | tr -d '\n' | sha256sum | cut -d" " -f1
Enter Password: *****
610468bcb9b7a141e760b5d3d557c5e67678068016e224e48e304edb79dcc0ce
```
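
For scripted setups, the same two values can be produced non-interactively. This is a minimal sketch, assuming `od`, `head`, and `sha256sum` are available; the function names are hypothetical. Unlike `pwgen`, the secret generated here is hexadecimal rather than alphanumeric.

```bash
# Hypothetical helper: prints a 96-character hexadecimal secret without requiring pwgen.
graylog_secret() {
  # 48 random bytes rendered as hex give exactly 96 characters.
  head -c 48 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Hypothetical helper: prints the SHA-256 hex digest of the password passed as $1.
graylog_password_hash() {
  # printf '%s' avoids the trailing newline that would change the hash.
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}
```

Note that a password passed as an argument may end up in the shell history, so the interactive command above remains the safer choice on shared machines.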

### Environment Variables

* Write the environment variables to be passed to the container as follows:
    

```bash
GRAYLOG_PASSWORD_SECRET={secret}
GRAYLOG_ROOT_PASSWORD_SHA2={hash}
GRAYLOG_HTTP_BIND_ADDRESS=0.0.0.0:9000
GRAYLOG_HTTP_EXTERNAL_URI=https://{your-domain}.com/
GRAYLOG_ROOT_TIMEZONE=UTC
GRAYLOG_MONGODB_URI=mongodb+srv://{username:password@mongodb-uri}/graylog?retryWrites=true&w=majority
GRAYLOG_ELASTICSEARCH_HOSTS={opensearch-uri}
```
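
Because the task definition later loads these variables from **S3**, it can help to check the file before uploading it. The following is a minimal sketch, assuming the variables are saved in a local file; the function names and the object key `graylog.env` are hypothetical (ECS expects environment files to use the `.env` extension).

```bash
# Hypothetical helper: verifies that an env file defines the required variables from this guide.
# Prints the missing names (one per line) and returns non-zero if any are absent.
check_graylog_env() {
  env_file="$1"
  missing=0
  for key in GRAYLOG_PASSWORD_SECRET GRAYLOG_ROOT_PASSWORD_SHA2 \
             GRAYLOG_HTTP_EXTERNAL_URI GRAYLOG_ROOT_TIMEZONE \
             GRAYLOG_MONGODB_URI GRAYLOG_ELASTICSEARCH_HOSTS; do
    # A key counts as present only if it has a non-empty value.
    if ! grep -q "^${key}=." "$env_file"; then
      echo "$key"
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical helper: uploads the env file to S3 only after it passes the check.
# Usage: upload_graylog_env <env-file> <bucket-name>
upload_graylog_env() {
  check_graylog_env "$1" && aws s3 cp "$1" "s3://$2/graylog.env"
}
```

For example, `upload_graylog_env ./graylog.env {your-bucket}` uploads the file, or lists the missing variable names and stops.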

### Creating an NLB Target Group

* It's time to create a Target Group that will receive logs sent via **UDP** within the same **VPC**.
    

```bash
Amazon EC2 Console
→ [Load Balancing] → [Target Groups]
→ [Create target group]
→ Choose a target type: [IP addresses]
→ Target group name: GRAYLOG-TG-12201
→ Protocol: [TCP_UDP]
→ Port: 12201
→ VPC: {your-vpc}
# Health checks
→ Health check protocol: [HTTP]
→ Health check path: /api
→ Health check port: [Override] → 9000
→ Healthy threshold: 10
→ Unhealthy threshold: 10
→ Timeout: 59
→ Interval: 60
→ Success codes: 200
→ [Create target group]
```

* Since **UDP** does not support health checks, we specified the **Graylog API** endpoint as an alternative.
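
The same target group can be created from the command line. This is a minimal sketch, assuming the AWS CLI v2 is configured; the function name is hypothetical, and the health check values simply mirror the console settings above, so the API may reject combinations that the console would also reject.

```bash
# Hypothetical helper: creates the TCP_UDP 12201 target group with an HTTP health check on port 9000.
# Usage: create_gelf_target_group <vpc-id>
create_gelf_target_group() {
  aws elbv2 create-target-group \
    --name GRAYLOG-TG-12201 \
    --protocol TCP_UDP --port 12201 \
    --vpc-id "$1" \
    --target-type ip \
    --health-check-protocol HTTP \
    --health-check-port 9000 \
    --health-check-path /api \
    --health-check-interval-seconds 60 \
    --health-check-timeout-seconds 59 \
    --healthy-threshold-count 10 \
    --unhealthy-threshold-count 10 \
    --matcher HttpCode=200 \
    --query 'TargetGroups[0].TargetGroupArn' --output text
}
```

The function prints the ARN of the new target group, which is needed later when the **ECS Service** is created.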
    

### NLB Target Group Settings

* The `TCP_UDP 12201` port provided by **Graylog** should be load balanced by creating an **NLB**. The attributes of the **target group** should be set as follows:
    

```bash
- Terminate connections on deregistration: Select [Enabled]
- Deregistration delay: Enter [30 seconds]
- Proxy protocol v2: [Disabled]
- Turn on Stickiness: Select [Enabled]
```

* When a **Graylog** instance is stopped by a deployment or a similar event, it is deregistered from the target group, and the load balancer keeps existing flows open for the period set by `Deregistration delay`. Because **UDP** senders cannot detect that the destination has gone away, enabling `Terminate connections on deregistration` is essential: it closes the remaining flows when the delay ends, so logs are not sent to an instance that no longer exists.
    
* If the final compressed size of a log in **UDP** logging exceeds **8,192 bytes**, **Chunking** is activated to split the packet into multiple parts for transmission. The `Stickiness` option must be enabled to ensure that split logs are not distributed but sent to a specific instance.
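
The attributes described in this section can be applied with the AWS CLI as well. This is a minimal sketch, assuming the target group already exists; the function name is hypothetical, and `source_ip` is the stickiness type that Network Load Balancers use.

```bash
# Hypothetical helper: applies the deregistration and stickiness attributes to an NLB target group.
# Usage: set_gelf_target_group_attributes <target-group-arn>
set_gelf_target_group_attributes() {
  aws elbv2 modify-target-group-attributes \
    --target-group-arn "$1" \
    --attributes \
      Key=deregistration_delay.connection_termination.enabled,Value=true \
      Key=deregistration_delay.timeout_seconds,Value=30 \
      Key=proxy_protocol_v2.enabled,Value=false \
      Key=stickiness.enabled,Value=true \
      Key=stickiness.type,Value=source_ip
}
```

With source-IP stickiness, all chunks sent by one client should reach the same **Graylog** instance, which is what chunk reassembly requires.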
    

### Creating an ALB Target Group

* It's time to create a Target Group to receive browser traffic to **Graylog** from outside the **VPC**.
    

```bash
Access Amazon EC2 Console
→ [Load Balancing] → [Target Groups]
→ [Create target group]
→ Choose a target type: Select [IP addresses]
→ Target group name: GRAYLOG-TG-9000
→ Protocol: [HTTP]
→ Port: 9000
→ VPC: {your-vpc}
→ Protocol version: HTTP1
→ Health check protocol: HTTP
→ [Create target group]
```

### Creating the ALB Security Group

* The **Security Group** for the **ALB** should include a list of sources that are allowed to access the `Graylog Web Interface` and `Graylog REST API` from the public internet. The following example shows a configuration where everything is allowed:
    

```bash
Access Amazon EC2 Console
→ [Security Groups] → [Create security group]
→ Security group name: GRAYLOG-ALB-SG
→ Description: GRAYLOG-ALB-SG
→ VPC: {your-vpc}
# Inbound rules
→ [Add rule]
→ Type: [HTTPS]
→ CIDR: 0.0.0.0/0
→ [Add rule]
→ Type: [HTTPS]
→ CIDR: ::/0
→ [Create security group]
```

### Creating the ALB

* The **ALB** routes traffic from **HTTPS 443** on the public internet to **HTTP 9000** within the internal **VPC**.
    

```bash
Access Amazon EC2 Console
→ [Load Balancing] → [Load Balancers] → [Create load balancer]
→ Load Balancer type: [Application Load Balancer]
→ Load balancer name: GRAYLOG-ALB
→ VPC: {your-vpc}
→ Mappings: {your-vpc-subnets}
→ Security groups: [GRAYLOG-ALB-SG]
→ Listener → Protocol: [HTTPS] → Port: 443 → Default action: [GRAYLOG-TG-9000]
→ Certificate: {your-acm-certificate}
→ [Create load balancer]
```
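
As a scripted alternative to the console steps, the sketch below creates the load balancer and its HTTPS listener. It assumes the AWS CLI v2 is configured and that the security group, target group, and ACM certificate already exist; the function name and argument order are hypothetical.

```bash
# Hypothetical helper: creates GRAYLOG-ALB with an HTTPS 443 listener forwarding to a target group.
# Usage: create_graylog_alb <sg-id> <target-group-arn> <acm-cert-arn> <subnet-id> [<subnet-id> ...]
create_graylog_alb() {
  sg_id="$1"; tg_arn="$2"; cert_arn="$3"
  shift 3   # the remaining arguments are the public subnet IDs

  alb_arn=$(aws elbv2 create-load-balancer \
    --name GRAYLOG-ALB --type application --scheme internet-facing \
    --security-groups "$sg_id" \
    --subnets "$@" \
    --query 'LoadBalancers[0].LoadBalancerArn' --output text) || return 1

  # HTTPS is terminated at the ALB; traffic is forwarded to the HTTP 9000 target group.
  aws elbv2 create-listener \
    --load-balancer-arn "$alb_arn" \
    --protocol HTTPS --port 443 \
    --certificates "CertificateArn=$cert_arn" \
    --default-actions "Type=forward,TargetGroupArn=$tg_arn" >/dev/null || return 1

  echo "$alb_arn"
}
```

An Application Load Balancer needs subnets in at least two Availability Zones, so at least two subnet IDs should be passed.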

### Creating the Route 53 Record for ALB

* The **ALB**'s DNS name is mapped, through a **CNAME** record, to a domain hosted in **Route 53**.
    

```bash
Access Amazon Route 53 Console
→ Hosted zones → {your-domain} → [Create record]
→ Record name: graylog
→ Record type: CNAME
→ Value: {your-alb-domain}.
→ [Create record]
```
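
The record can also be created with the AWS CLI. This is a minimal sketch, assuming the hosted zone already exists; the function names are hypothetical, and the TTL of 300 seconds is an arbitrary choice that the original steps do not specify.

```bash
# Hypothetical helper: prints a Route 53 change batch that upserts a CNAME record.
# Usage: cname_change_batch <record-fqdn> <alb-dns-name>
cname_change_batch() {
  cat <<EOF
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "$1",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [{ "Value": "$2" }]
      }
    }
  ]
}
EOF
}

# Hypothetical helper: applies the change batch to a hosted zone.
# Usage: upsert_graylog_cname <hosted-zone-id> <record-fqdn> <alb-dns-name>
upsert_graylog_cname() {
  aws route53 change-resource-record-sets \
    --hosted-zone-id "$1" \
    --change-batch "$(cname_change_batch "$2" "$3")"
}
```

Running `cname_change_batch graylog.{your-domain} {your-alb-domain}` on its own prints the JSON, which is a convenient way to review the change before applying it.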

### Creating the ECS Cluster

* Create an **ECS Cluster** for actually serving **Graylog**.
    

```bash
Amazon ECS Console → Cluster → [Create cluster]
→ Cluster name: GRAYLOG-ECS-CLUSTER
→ Infrastructure: [AWS Fargate (serverless)]
→ Check [Use Container insights]
→ [Create]
```

### Creating ECS IAM Task Execution Role

* The **Task Execution Role** in **ECS** defines the policies needed to start Docker containers.
    

```bash
Amazon IAM Console → Roles → [Create role]
→ Trusted entity type: [AWS service]
→ Use case: [Elastic Container Service Task]
→ Role name: GRAYLOG-ECS-TASK-EXECUTION-ROLE
→ [Create role]

→ Roles → [GRAYLOG-ECS-TASK-EXECUTION-ROLE]
→ [Add permissions]
→ [Attach policies]
→ [AmazonEC2ContainerRegistryReadOnly], [AmazonS3ReadOnlyAccess], [CloudWatchLogsFullAccess]
→ [Add permissions]
```
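
The role can be created from the command line too. This is a minimal sketch, assuming the AWS CLI v2 is configured with IAM permissions; the function names are hypothetical, while the role name and the three managed policies are the ones listed above.

```bash
# Hypothetical helper: prints the trust policy that lets ECS tasks assume a role.
ecs_tasks_trust_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ecs-tasks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Hypothetical helper: creates the task execution role and attaches the managed policies.
create_task_execution_role() {
  role=GRAYLOG-ECS-TASK-EXECUTION-ROLE
  aws iam create-role --role-name "$role" \
    --assume-role-policy-document "$(ecs_tasks_trust_policy)" >/dev/null || return 1

  for policy in AmazonEC2ContainerRegistryReadOnly AmazonS3ReadOnlyAccess CloudWatchLogsFullAccess; do
    aws iam attach-role-policy --role-name "$role" \
      --policy-arn "arn:aws:iam::aws:policy/$policy" || return 1
  done
}
```

The trust policy is the scripted counterpart of choosing the `Elastic Container Service Task` use case in the console.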

### Creating ECS IAM Task Role

* The **Task Role** in **ECS** defines the policies needed by the **Graylog** application to run after the Docker containers have started.
    

```bash
Amazon IAM Console → Roles → [Create role]
→ Trusted entity type: [AWS service]
→ Use case: [Elastic Container Service Task]
→ Role name: GRAYLOG-ECS-TASK-ROLE
→ [Create role]

# Add IAM Task Role policies
→ Roles → [GRAYLOG-ECS-TASK-ROLE]
→ [Add permissions]
→ [Attach policies]
→ [AmazonOpenSearchServiceFullAccess]
→ [Add permissions]
```

### Creating the ECS Task Definition

* The **ECS Task Definition** specifies the CPU and memory, container image, port mappings, environment file, and health check for the **Graylog** containers.
    

```bash
Access Amazon ECS Console → Task definitions → [Create new task definition]
# Task definition configuration
→ Task definition family: GRAYLOG-ECS-TASK-DEFINITION
# Infrastructure requirements
→ Launch type: [AWS Fargate]
→ Operating system/Architecture: [Linux/X86_64]
→ CPU: [2 vCPU]
→ Memory: [4 GB]
→ Task role: [GRAYLOG-ECS-TASK-ROLE]
→ Task execution role: [GRAYLOG-ECS-TASK-EXECUTION-ROLE]
# Container details
→ Name: GRAYLOG-ECS-CONTAINER
→ Image URI: graylog/graylog:5.2.6
→ Container Port: 9000 → Protocol: TCP → App protocol: [HTTP]
→ Container Port: 1541 → Protocol: UDP
→ Container Port: 12201 → Protocol: UDP
# Environment variables
→ Location: {your-graylog-env-s3-arn}
# HealthCheck
→ Command: CMD-SHELL, wget -q -O /dev/null http://127.0.0.1:9000/api || exit 1
→ Interval: 30
→ Timeout: 60
→ Start period: 60
→ Retries: 10
→ [Create]

### Creating the ECS Security Group

* **Graylog** requires communication between nodes, so port **9000** must be opened within the same **VPC**. Additionally, it must receive **UDP** logs from applications within the same **VPC**, so referencing the **NLB**'s **SG** is necessary.
    

```bash
Amazon EC2 Console
→ [Security Groups]
→ [Create security group]
→ Security group name: GRAYLOG-ECS-SERVICE-SG
→ Description: GRAYLOG-ECS-SERVICE-SG
→ VPC: {your-vpc}
# Inbound rules
→ [Add rule]
→ Type: [TCP 9000]
→ CIDR: → [GRAYLOG-ALB-SG]
→ [Add rule]
→ Type: [TCP 9000]
→ CIDR: → {your-vpc-cidr}
→ [Add rule]
→ Type: [UDP 1541]
→ CIDR: → [GRAYLOG-NLB-SG]
→ [Add rule]
→ Type: [UDP 12201]
→ CIDR: → [GRAYLOG-NLB-SG]
```

### Creating the ECS Service

* Creating the **ECS Service** completes the final setup of the **Graylog** infrastructure.
    
* The **ECS Service** must be created with the **AWS CLI** in order to attach a list of **Target Groups** that belong to different load balancers. The **Service-linked role** `AWSServiceRoleForECS` is used automatically to register tasks with those target groups, so no role needs to be specified in the service definition.
    

```bash
$ nano GRAYLOG-ECS-SERVICE.json
{
  "cluster": "GRAYLOG-ECS-CLUSTER",
  "serviceName": "GRAYLOG-ECS-SERVICE",
  "taskDefinition": "{your-ecs-task-definition-arn}",
  "loadBalancers": [
    {
      "targetGroupArn": "{your-alb-target-group}",
      "containerName": "GRAYLOG-ECS-CONTAINER",
      "containerPort": 9000
    },
    {
      "targetGroupArn": "{your-nlb-target-group-1541}",
      "containerName": "GRAYLOG-ECS-CONTAINER",
      "containerPort": 1541
    },
    {
      "targetGroupArn": "{your-nlb-target-group-12201}",
      "containerName": "GRAYLOG-ECS-CONTAINER",
      "containerPort": 12201
    }
  ],
  "desiredCount": 2,
  "launchType": "FARGATE",
  "platformVersion": "1.4.0",
  "schedulingStrategy": "REPLICA",
  "networkConfiguration": {
    "awsvpcConfiguration": {
      "subnets": [
        "{your-vpc-private-subnet-id}"
      ],
      "securityGroups": [
        "{your-ecs-service-security-group-id}"
      ],
      "assignPublicIp": "DISABLED"
    }
  }
}

$ aws ecs create-service --region "{your-region}" --cli-input-json "file://GRAYLOG-ECS-SERVICE.json"
```
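
After the service is created, it may take a few minutes for the tasks to pass their health checks. The sketch below waits for the service to settle and then prints its state. It assumes the cluster and service names used in this guide; the function name is hypothetical.

```bash
# Hypothetical helper: waits for the Graylog service to stabilize, then prints its status.
# Usage: wait_for_graylog_service <region>
wait_for_graylog_service() {
  # Blocks until the running count matches the desired count or the waiter times out.
  aws ecs wait services-stable --region "$1" \
    --cluster GRAYLOG-ECS-CLUSTER --services GRAYLOG-ECS-SERVICE || return 1

  # Prints status, runningCount, and desiredCount as tab-separated text.
  aws ecs describe-services --region "$1" \
    --cluster GRAYLOG-ECS-CLUSTER --services GRAYLOG-ECS-SERVICE \
    --query 'services[0].[status,runningCount,desiredCount]' --output text
}
```

If the waiter times out, the events shown by `aws ecs describe-services` are usually the first place to look for failing health checks or image pull errors.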

### Local Environment Startup

* Below is an example of running **Graylog** in a local environment for testing purposes using Docker Compose. Once executed, access to the **Graylog Web Interface** is available via a browser at [**http://localhost:9000**](http://localhost:9000).
    

```yaml
# docker-compose.yml (create it with any editor, e.g. `nano docker-compose.yml`)
version: '3.8'
 
networks:
  graylog:
    driver: bridge
 
services:
  mongodb:
    image: mongo:7.0.7
    container_name: mongodb
    restart: always
    ports:
      - "27017:27017"
    networks:
      - graylog
    environment:
      - MONGO_INITDB_ROOT_USERNAME={mongodb-root-username}
      - MONGO_INITDB_ROOT_PASSWORD={mongodb-root-password}
 
  opensearch:
    image: opensearchproject/opensearch:latest
    container_name: opensearch
    restart: always
    ports:
      - "9200:9200"
      - "9600:9600"
    networks:
      - graylog
    environment:
      - plugins.security.disabled=true
      - plugins.security.ssl.http.enabled=false
      - discovery.type=single-node
      - OPENSEARCH_USERNAME={opensearch-username}
      - OPENSEARCH_PASSWORD={opensearch-password}
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD={opensearch-initial-admin-password}

  graylog:
    image: graylog/graylog:5.2.5
    container_name: graylog
    restart: always
    links:
      - mongodb
      - opensearch
    depends_on:
      - mongodb
      - opensearch
    ports:
      - "9000:9000"
      - "12201:12201/udp"
      - "12201:12201/tcp"
    networks:
      - graylog
    environment:
      - GRAYLOG_PASSWORD_SECRET={secret}
      - GRAYLOG_ROOT_PASSWORD_SHA2={hash}
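      # Bind to all interfaces so the published port 9000 is reachable from the host
      - GRAYLOG_HTTP_BIND_ADDRESS=0.0.0.0:9000
      - GRAYLOG_HTTP_EXTERNAL_URI=http://127.0.0.1:9000/
      - GRAYLOG_ROOT_TIMEZONE=UTC
      # Inside the Compose network, other containers are reached by service name, not 127.0.0.1
      - GRAYLOG_MONGODB_URI=mongodb://{mongodb-graylog-username}:{mongodb-graylog-password}@mongodb:27017/graylog
      - GRAYLOG_ELASTICSEARCH_HOSTS=http://{opensearch-username}:{opensearch-password}@opensearch:9200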

# To run Graylog as a background process, execute in a shell:
#   docker-compose up -d
```
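
Graylog can take a while to start, so the web interface may not respond immediately. The following is a minimal sketch of a readiness check, assuming `curl` is installed; the function name and default values are hypothetical, and the `/api` path is the same one used for the health checks earlier in this guide.

```bash
# Hypothetical helper: polls the Graylog API until it responds or the attempts run out.
# Usage: wait_for_graylog [url] [attempts] [delay-seconds]
wait_for_graylog() {
  url="${1:-http://localhost:9000/api}"
  attempts="${2:-30}"
  delay="${3:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -s silences progress output.
    if curl -fs -o /dev/null "$url"; then
      echo "Graylog is up after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Graylog did not respond at $url" >&2
  return 1
}
```

Running `wait_for_graylog` right after starting the containers gives a simple signal for when the browser can be opened.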

### References and Further Reading

* [High Availability WireGuard on AWS](https://www.procustodibus.com/blog/2021/02/ha-wireguard-on-aws/)
    
* [GitHub - Graylog Docker Image](https://github.com/Graylog2/graylog-docker)
