Azure Kubernetes Service (AKS): Deploying Elasticsearch, Logstash and Kibana (ELK) and consuming messages from Azure Event Hub

This article is part of a series of blogs on Azure Kubernetes Service (AKS). In this article I am going to share my experience of setting up an Elasticsearch, Logstash and Kibana (ELK) cluster in Azure Kubernetes Service (AKS) and consuming messages from Event Hub. By the end of this article, we are going to have a fully functional ELK stack with Azure Event Hub integration.

A sample client app (e.g. an IoT device) will publish messages to Event Hub, and these messages will be ingested into Elasticsearch using the 'Azure Event Hubs' plugin of Logstash. This article needs the X-Pack features of Elasticsearch, thus I will show the steps needed to activate a trial license.

The second part of this series goes through the steps needed to enable Azure AD SAML based single sign-on to secure Elasticsearch and Kibana hosted in AKS. The third part of this series goes through the steps needed to ingest Azure Redis Cache messages into Elasticsearch using Logstash's Redis plugin.

The dev tools used to develop these components are Visual Studio for Mac/Visual Studio 2017. The AKS Dashboard as well as kubectl commands are used to create/manage Kubernetes resources in AKS.

Azure Kubernetes Service (AKS) Cluster

In case you don't have an AKS cluster up and running, please go through this article to Create AKS Cluster. Once the AKS cluster is up and running, you can go through the rest of the article. The code snippets in this article are mostly yaml snippets and are included for reference only, as formatting may get distorted; please refer to the GitHub repository for the formatted resources.
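
Each yaml snippet in this article can be applied to the cluster with kubectl. A minimal sketch, assuming you save each snippet to its own file (the file name below is a placeholder):

# Apply a resource definition to the AKS cluster.
kubectl apply -f sample-elasticsearch-service.yaml

# List the resources created in the default namespace.
kubectl get all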

Event Hub

Event Hub messages will be ingested into Elasticsearch using a Logstash pipeline, thus the first resource I am going to create is an Event Hub. Please follow the steps listed in this article to create an Event Hub. The main pointers are

  • The Event Hub name I have selected for this sample solution is 'logstash'. If you select a different name, update the event hub name in the source code.
  • Keep a note of the Event Hub connection string as it needs to be updated in the Logstash pipeline and in the Event Hub publisher client.
  • The source code uses the '$Default' consumer group. Update this value if you create a dedicated consumer group.
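
If you prefer the command line over the portal, the Event Hub can also be created with the Azure CLI. A sketch, assuming the Azure CLI is installed and logged in; the resource group and namespace names are placeholders:

# Create the Event Hubs namespace and the 'logstash' hub.
az eventhubs namespace create --resource-group my-rg --name my-eh-namespace --location eastus
az eventhubs eventhub create --resource-group my-rg --namespace-name my-eh-namespace --name logstash

# Print the connection string needed by Logstash and the client app.
az eventhubs namespace authorization-rule keys list --resource-group my-rg \
  --namespace-name my-eh-namespace --name RootManageSharedAccessKey \
  --query primaryConnectionString -o tsv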

Azure Blob Storage

The next resource you will need to create is an Azure Blob Storage account; the Logstash Event Hubs plugin uses it to persist its processing state. Please follow the steps listed in this article to create a storage account. Once the storage account is created, create a blob container under Blob Service -> Blobs. The main pointers are

  • The blob container name I have specified in the source code is 'logstash'. If you select a different name, update the storage container name in the source code.
  • Keep a note of the storage connection string as it needs to be updated in the Logstash pipeline.
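
The equivalent Azure CLI steps look like this (a sketch; the account name is a placeholder and must be globally unique):

# Create the storage account and the 'logstash' blob container.
az storage account create --resource-group my-rg --name mylogstashstore --sku Standard_LRS
az storage container create --account-name mylogstashstore --name logstash

# Print the connection string needed by the Logstash pipeline.
az storage account show-connection-string --resource-group my-rg --name mylogstashstore \
  --query connectionString -o tsv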

Client App to send messages to Event Hub

AzureEventHubSample is a client app that sends messages to Event Hub. You will need to update the connectionString variable with the Event Hub connection string and the name of the hub. You can download the source code of this publisher client from GitHub.

Deploy Elasticsearch to Azure Kubernetes Service

Elasticsearch is a near real-time search platform. The steps needed to deploy Elasticsearch to the AKS cluster are listed below

Create a Persistent Volume Claim

A persistent volume claim is needed to store the Elasticsearch data. The yaml snippet to create a 5 Gi storage claim is displayed below; the StatefulSet resource is going to mount this claim. You can read more about Persistent Volumes.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sample-elasticsearch-data-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

Since the Elasticsearch data nodes will be deployed as a StatefulSet, as you will see in subsequent sections, we are going to define volumeClaimTemplates so that a new Persistent Volume Claim is created for each replica automatically.

Create a Kubernetes ConfigMap

ConfigMaps allow you to decouple configuration aspects from the image. A few pointers about the yaml snippet displayed below are

  • The elasticsearch.yml and role_mapping.yml files will be mounted from the ConfigMap
  • X-Pack security is enabled
  • role_mapping.yml is intentionally left blank as it will be discussed in detail in the next blogs of this series

apiVersion: v1
kind: ConfigMap
metadata:
  name: sample-elasticsearch-configmap
  namespace: default
data:
  elasticsearch.yml: |
    cluster.name: "sample-elasticsearch-cluster"
    network.host: 0.0.0.0
    discovery.zen.minimum_master_nodes: 1
    xpack.license.self_generated.type: trial

    node.max_local_storage_nodes: 1
    xpack.security.enabled: true
    xpack.monitoring.collection.enabled: true
  role_mapping.yml: |

You can read more about ConfigMap.
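
After applying the snippet, you can confirm that both files are present in the ConfigMap (the file name is a placeholder):

kubectl apply -f sample-elasticsearch-configmap.yaml
kubectl describe configmap sample-elasticsearch-configmap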

Create a Kubernetes Service

The next step is to create a Kubernetes Service for Elasticsearch. As you can see in the yaml snippet below, ports 9200/9300 are defined and the type is ClusterIP, i.e. this service doesn't have external endpoints. Kubernetes will use the selector 'service: sample-elasticsearch' to map the service to the deployment, as you are going to see next. You can read more about Services.

apiVersion: v1               
kind: Service                
metadata:                    
  name: sample-elasticsearch        
  labels:                    
    service: sample-elasticsearch
spec:                        
  type: ClusterIP
  selector:                   
    service: sample-elasticsearch
  ports:                     
  - name: http               
    port: 9200               
    targetPort: 9200
    protocol: TCP            
  - name: transport          
    port: 9300               
    targetPort: 9300
    protocol: TCP

You also need to create a headless service for zen discovery, which Elasticsearch uses to join nodes to a cluster. The main pointers about the headless service are

  • clusterIP: None – clusterIP must be set to None
  • name: sample-elasticsearch-discovery – the metadata: name of the service must match the spec: serviceName of the StatefulSet

apiVersion: v1
kind: Service
metadata:
  name: sample-elasticsearch-discovery
spec:
  clusterIP: None
  selector:
    service: sample-elasticsearch
  ports:
  - name: transport
    port: 9300
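
Once both services are applied, a quick check should show a CLUSTER-IP for sample-elasticsearch and None for the headless discovery service:

kubectl get svc sample-elasticsearch sample-elasticsearch-discovery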

Create a Kubernetes StatefulSet

Kubernetes StatefulSet is the workload API object used to manage stateful applications. You can read more about StatefulSets. The yaml snippet to create the StatefulSet is displayed below and a few pointers are

  • service: sample-elasticsearch matches the selector defined in the service.
  • serviceName: sample-elasticsearch-discovery matches the name of the headless service.
  • I have specified replicas: 1, which means that only one Pod instance will be created by Kubernetes.
  • The docker image being used to create this resource is docker.elastic.co/elasticsearch/elasticsearch:6.4.1.
  • By default, Elasticsearch runs inside the container as user elasticsearch using uid:gid 1000:1000. If you are bind-mounting a local directory or file, ensure it is readable by this user, while the data and log dirs additionally require write access. This is the reason I have used the environment variable TAKE_FILE_OWNERSHIP with value "1". The alternative is to grant write access by adding an initContainer step that runs chown -R 1000:1000 /usr/share/elasticsearch/data. You can read more at Elasticsearch Docker.
  • The mmap count has been increased to vm.max_map_count=262144 by adding an initContainer step. You can read more about Elasticsearch virtual memory.
  • The ConfigMap defined in the previous step is used to mount the elasticsearch.yml and role_mapping.yml files via configMap: name: sample-elasticsearch-configmap.
  • Lastly, persistent volume claims are created as explained in the previous section using volumeClaimTemplates.
  • Assign CPU resources as needed by updating the resources: limits: section. You can read more about Assign CPU Resources to Containers and Pods

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sample-elasticsearch
  labels:
    service: sample-elasticsearch
spec:
  serviceName: sample-elasticsearch-discovery
  # Number of PODS
  replicas: 1
  selector:
    matchLabels:
      service: sample-elasticsearch
  template:
    metadata:
      labels:
        service: sample-elasticsearch
    spec:
      terminationGracePeriodSeconds: 15
      initContainers:
      # https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html
      - name: increase-the-vm-max-map-count
        image: busybox
        command:
        - sysctl
        - -w
        - vm.max_map_count=262144
        securityContext:
          privileged: true
      containers:
      - name: sample-elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:6.4.1
        resources:
          limits:
            cpu: 2000m
            memory: 2Gi
          requests:
            cpu: 100m
            memory: 1Gi
        ports:
        - containerPort: 9200
          name: http
        - containerPort: 9300
          name: tcp
        env:
          - name: cluster.name
            value: "sample-elasticsearch-cluster"
          - name: "xpack.security.enabled"
            value: "true"
          - name: "TAKE_FILE_OWNERSHIP"
            value: "1"
          - name: node.name
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: discovery.zen.ping.unicast.hosts
            value: "sample-elasticsearch-discovery:9300"

        volumeMounts:
        - name: sample-elasticsearch-data-claim
          mountPath: /usr/share/elasticsearch/data
        - name: sample-elasticsearch-configmap
          mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          subPath: elasticsearch.yml
        - name: sample-elasticsearch-configmap
          mountPath: /usr/share/elasticsearch/config/role_mapping.yml
          subPath: role_mapping.yml
      volumes:
      - name: sample-elasticsearch-configmap
        configMap:
            name: sample-elasticsearch-configmap
  volumeClaimTemplates:
  - metadata:
      name: sample-elasticsearch-data-claim
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: default
      resources:
        requests:
          storage: 2Gi
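
After applying the StatefulSet you can watch the Pod come up; the init container runs first and the Pod is named sample-elasticsearch-0 (the file name is a placeholder):

kubectl apply -f sample-elasticsearch-statefulset.yaml
kubectl rollout status statefulset/sample-elasticsearch
kubectl get pods -l service=sample-elasticsearch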

By default Elasticsearch is deployed with a basic license. After Elasticsearch is deployed, the next step is to activate the trial license of Elasticsearch in order to use the X-Pack features.

Enable Trial License of Elasticsearch

The steps needed to activate the trial license are

  • Run the command kubectl port-forward sample-elasticsearch-0 9200:9200 and you can now access the Elasticsearch endpoint at http://localhost:9200. Send a POST http://localhost:9200/_xpack/license/start_trial?acknowledge=true request from any REST client. This is going to activate the trial license.

  • You can verify whether the trial license is activated by sending a GET http://localhost:9200/_xpack/license request from any REST client; both requests are also shown as curl commands below.
  • Once the trial license is activated you can close the terminal as port forwarding isn't needed anymore.
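
The same two requests can be issued with curl while the port-forward is running; a sketch using the 6.x X-Pack license endpoints:

# Activate the trial license.
curl -XPOST "http://localhost:9200/_xpack/license/start_trial?acknowledge=true"

# Verify that the license type now reads "trial".
curl "http://localhost:9200/_xpack/license"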

Setup Passwords for built-in user accounts of Elasticsearch

The X-Pack security feature of Elasticsearch is used to secure access, thus we now need to set up passwords for the built-in user accounts. The steps are

  • Connect to the Elasticsearch POD by running the command kubectl exec -ti sample-elasticsearch-0 bash
  • Run the command bin/elasticsearch-setup-passwords interactive to set up the built-in user passwords interactively. For this sample I have specified the same password, i.e. Password1$, for all accounts; if you choose different passwords you will need to update them in a few places in the source code.
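
A quick way to confirm the passwords took effect, reusing the port-forward from the previous section (a sketch; substitute your own password):

# An authenticated request should succeed...
curl -u elastic:'Password1$' "http://localhost:9200/_cluster/health?pretty"

# ...while an unauthenticated one now returns 401.
curl "http://localhost:9200/_cluster/health"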

Deploy Kibana to Azure Kubernetes Service

Kibana lets you visualize your Elasticsearch data and navigate the Elastic Stack. The steps needed to deploy Kibana to AKS are listed below

Create a Kubernetes ConfigMap

The yaml snippet to create this resource is displayed below and a few pointers are

  • The kibana.yml file will be mounted from the ConfigMap
  • Kibana points to Elasticsearch based on elasticsearch.url: http://sample-elasticsearch:9200
  • Update elasticsearch.password: Password1$ based on your specified password for the kibana (built-in) user

apiVersion: v1
kind: ConfigMap
metadata:
  name: sample-kibana-configmap
  namespace: default
data:
  kibana.yml: |
    server.name: sample-kibana
    server.host: "0"
    elasticsearch.url: http://sample-elasticsearch:9200
    xpack.monitoring.ui.container.elasticsearch.enabled: true
    elasticsearch.username: kibana
    elasticsearch.password: Password1$

Create a Kubernetes Service

The next step is to create a Kubernetes Service for Kibana. As you can see in the yaml snippet below, port 80 is defined and the type is LoadBalancer, i.e. this service has an external endpoint. Kubernetes will use the selector 'component: sample-kibana' to map the service to the deployment, as you are going to see next. The creation of this service is going to take a while; once done, you can get the external endpoint of this service either by opening the AKS Dashboard or by running the kubectl command kubectl describe services sample-kibana.

apiVersion: v1
kind: Service
metadata:
  name: sample-kibana
  labels:
    component: sample-kibana
spec:
  type: LoadBalancer
  selector:
    component: sample-kibana
  ports:
  - name: http
    port: 80
    targetPort: http
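
Instead of repeatedly describing the service, you can watch the EXTERNAL-IP column switch from <pending> to an address:

kubectl get service sample-kibana --watch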

Create a Kubernetes Deployment

The next step is to create a Kubernetes Deployment for Kibana. The yaml snippet is displayed below and a few pointers are

  • The docker image being used to create this resource is docker.elastic.co/kibana/kibana:6.4.1
  • You can change the number of pods by updating replicas: 1
  • The label component: sample-kibana has to match the selector defined in the service
  • The ConfigMap defined in the previous step is used to mount the kibana.yml file via configMap: name: sample-kibana-configmap

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-kibana
  labels:
    component: sample-kibana
spec:
  replicas: 1
  selector:
    matchLabels:
     component: sample-kibana
  template:
    metadata:
      labels:
        component: sample-kibana
    spec:
      containers:
      - name: sample-kibana
        image: docker.elastic.co/kibana/kibana:6.4.1
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 5601
          name: http
        volumeMounts:
        - name: sample-kibana-configmap
          mountPath: /usr/share/kibana/config/kibana.yml
          subPath: kibana.yml
        terminationMessagePath: "/dev/termination-log"
        terminationMessagePolicy: File
        imagePullPolicy: Always
        securityContext:
          privileged: false
      volumes:
      - name: sample-kibana-configmap
        configMap:
            name: sample-kibana-configmap
      restartPolicy: Always
      terminationGracePeriodSeconds: 5
      dnsPolicy: ClusterFirst
      securityContext: {}
      schedulerName: default-scheduler

Open the external endpoint of the Kibana service once the deployment is completed. Since X-Pack security is enabled, Kibana will prompt for credentials.

Deploy Logstash to Azure Kubernetes Service

Logstash is a data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to Elasticsearch. Logstash will use the Azure Event Hubs plugin to ingest data into Elasticsearch. The steps needed to deploy Logstash to AKS are listed below

Create a Kubernetes ConfigMap

The yaml snippet to create ConfigMap is displayed below and a few pointers are

  • logstash.yml file will be mounted from ConfigMap
  • pipelines.yml file will be mounted from ConfigMap. Multiple pipelines can be defined in this file e.g. you can see that AzureEventHubs pipeline is defined.
  • azureeventhub.cfg file will be mounted from ConfigMap. The Logstash event processing pipeline has three stages: inputs → filters → outputs. This file defines the logstash pipeline for Azure Event Hub.
    • Update {AZURE_EVENT_HUB_CONNECTION_STRING} and {STORAGE_ACCOUNT_CONNECTION_STRING} values based on your Event Hub and Storage Account values.
    • Update EntityPath in case your event hub is named differently than logstash.
    • Update storage_container in case your storage container is named differently than logstash.
    • Update consumer_group in case your consumer group is different from $Default. Specifying $Default isn't ideal; create a dedicated consumer group for Logstash so it doesn't compete with other consumers of the hub.
    • The index name is defined as index => "azureeventhub-%{+YYYY.MM.dd}"
  • logstash.conf file will be mounted from ConfigMap
  • Elasticsearch URL is defined in xpack.monitoring.elasticsearch.url: http://sample-elasticsearch:9200
  • Update xpack.monitoring.elasticsearch.password: Password1$ based on your specified password for the logstash_system (built-in) user.
  • Elasticsearch endpoint is defined in hosts => [ "sample-elasticsearch:9200" ]
  • Update password => "Password1$" to the specified password of your elastic (built-in) user

apiVersion: v1
kind: ConfigMap
metadata:
  name: sample-logstash-configmap
  namespace: default
data:
  logstash.yml: |
    xpack.monitoring.elasticsearch.url: http://sample-elasticsearch:9200
    dead_letter_queue.enable: true
    xpack.monitoring.enabled: true
    xpack.monitoring.elasticsearch.username: logstash_system 
    xpack.monitoring.elasticsearch.password: Password1$
  pipelines.yml: |
    - pipeline.id: azureeventhubs
      path.config: "/usr/share/logstash/azureeventhubs.cfg"
  azureeventhubs.cfg: |
    input {
      azure_event_hubs {
        event_hub_connections => ["{AZURE_EVENT_HUB_CONNECTION_STRING};EntityPath=logstash"]
        threads => 2
        decorate_events => true
        consumer_group => "$Default"
        storage_connection => "{STORAGE_ACCOUNT_CONNECTION_STRING}"
        storage_container => "logstash"
        }
    }
    filter {
    }
    output {
      elasticsearch {
        hosts => [ "sample-elasticsearch:9200" ]
        user => "elastic"
        password => "Password1$"
        index => "azureeventhub-%{+YYYY.MM.dd}"
      }
    }
  logstash.conf: |

Create a Kubernetes Service

The next step is to create a Kubernetes Service for Logstash. As you can see in the yaml snippet below, port 80 is defined and the type is ClusterIP, i.e. this service has no external endpoints. Kubernetes will use the selector 'component: sample-logstash' to map the service to the deployment, as you are going to see next.

apiVersion: v1
kind: Service
metadata:
  name: sample-logstash
  labels:
    component: sample-logstash
spec:
  type: ClusterIP
  selector:
    component: sample-logstash
  ports:
  - name: http
    port: 80
    targetPort: http

Create a Kubernetes Deployment

The next step is to create a Kubernetes Deployment for Logstash. The yaml snippet is displayed below and a few pointers are

  • The docker image being used to create this resource is docker.elastic.co/logstash/logstash:6.4.1
  • You can change the number of pods by updating replicas: 1
  • The label component: sample-logstash has to match the selector defined in the service
  • The ConfigMap defined in the previous step is used to mount the logstash.yml, logstash.conf, pipelines.yml and azureeventhubs.cfg files via configMap: name: sample-logstash-configmap

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-logstash
  labels:
    component: sample-logstash
spec:
  replicas: 1
  selector:
    matchLabels:
     component: sample-logstash
  template:
    metadata:
      labels:
        component: sample-logstash
    spec:
      containers:
      - name: sample-logstash
        image: docker.elastic.co/logstash/logstash:6.4.1
        volumeMounts:
        - name: sample-logstash-configmap
          mountPath: /usr/share/logstash/config/logstash.yml
          subPath: logstash.yml
        - name: sample-logstash-configmap
          mountPath: /usr/share/logstash/pipeline/logstash.conf
          subPath: logstash.conf
        - name: sample-logstash-configmap
          mountPath: /usr/share/logstash/azureeventhubs.cfg
          subPath: azureeventhubs.cfg
        - name: sample-logstash-configmap
          mountPath: /usr/share/logstash/config/pipelines.yml
          subPath: pipelines.yml
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9600 # Logstash HTTP/monitoring API
          name: http
      volumes:
        - name: sample-logstash-configmap
          configMap:
            name: sample-logstash-configmap
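
After applying the Deployment, tailing the logs is a quick way to confirm the azureeventhubs pipeline starts and connects to Event Hub (the file name is a placeholder):

kubectl apply -f sample-logstash-deployment.yaml
kubectl logs deployment/sample-logstash -f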

As mentioned earlier, Logstash will use the azure_event_hubs plugin to ingest data into Elasticsearch. You can get the list of installed plugins by following the steps listed below

  • Run the command kubectl exec -ti {Logstash_Pod_Name} bash to connect to the Logstash POD.
  • Run the command bin/logstash-plugin list to see the installed plugins

Demo

After all resources are deployed to AKS, run the client app to send messages to Event Hub. Open Kibana and you will see ELK stack statistics in the Monitoring section. The number of messages sent by the client app to Event Hub will be reflected in Logstash's events received and events emitted statistics.

The Discover tab of Kibana will display the events ingested into Elasticsearch once you create an index pattern, e.g. I have selected the index pattern 'azureeventhub*'; you can see that this index naming convention was defined in the azureeventhubs.cfg pipeline as index => "azureeventhub-%{+YYYY.MM.dd}".
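
You can also confirm that the daily indices are being created directly against Elasticsearch, reusing the port-forward technique from earlier (a sketch; substitute your own password):

kubectl port-forward sample-elasticsearch-0 9200:9200
curl -u elastic:'Password1$' "http://localhost:9200/_cat/indices/azureeventhub-*?v"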

You can download the source code for this article from the GitHub repository.

The second part of this series is Azure Kubernetes Service (AKS): Azure AD SAML based Single Sign on to secure Elasticsearch and Kibana and securing communications in ELK
