Amazon RDS database monitoring with Amazon CloudWatch Logs, AWS Lambda, and Amazon SNS to send logs over email

Hi All,

Today we are going to learn how to get Amazon RDS database log notifications delivered to a personal or official email ID.

Suppose a Dev team is working on an Amazon RDS database, using it for CRUD operations during development, and the same applies while the database is in use by running applications.

But sometimes errors occur while inserting, updating, or deleting data in the database, and we may miss them.

When we use an Amazon RDS database, AWS provides a logging mechanism for it, and we can find the database logs in Amazon CloudWatch. But those logs are available only in the AWS Management Console: to see the RDS database logs, we need to log in to the console and check them every time. That is not a good solution for production databases. There should be an automated way to get the logs whenever something happens at the database end.

We can automate this with AWS services and send the database logs to personal or official email IDs. Here is the step-by-step guide to achieve it.

In general, database administrators monitor for keywords like ORA- errors in Oracle databases and ERROR in PostgreSQL, Aurora, MySQL, SQL Server, and MariaDB databases. When a database error occurs, DBAs need to be notified so they can assess the seriousness of the error and take appropriate action.

The solution in this post addresses two issues:

  • Monitoring the RDS database for database errors that appear in the logs
  • Streaming the database logs for RDS databases without having to read the whole file every time to monitor the database errors

This post details the steps to implement proactive monitoring and alerting for an RDS database by streaming the database logs, matching on the keyword ORA- for an Oracle database or ERROR for PostgreSQL, Aurora, MySQL, MariaDB, or SQL Server databases, and sending a notification by email to the database administrator so they can take the necessary action to fix the issue. The following diagram illustrates our solution architecture.

We provide two methods to configure this solution: set up the resources manually through various AWS services, or deploy the resources with an AWS CloudFormation template. Both methods complete the following high-level steps:

  1. Create an SNS topic.
  2. Create AWS Identity and Access Management (IAM) roles and policies needed for the solution.
  3. Create a Lambda function with the provided code and assign the appropriate IAM roles.

We then walk you through identifying the log group to monitor, creating a trigger in the Lambda function, and testing the error notification system.

Prerequisites

For this walkthrough, the following prerequisites are necessary:

  • An AWS account with RDS instances running.
  • An RDS instance with logs exported to Amazon CloudWatch. To verify this, on the Amazon RDS console, navigate to your instance and choose Configuration. Under the published logs, you see PostgreSQL (when using PostgreSQL or Aurora), alert (when using Oracle), or error (when using MariaDB, MySQL, or SQL Server).
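If you'd rather verify this programmatically than through the console, here is a minimal boto3 sketch (the instance identifier and Region below are placeholders, not values from this walkthrough) that prints which log types an instance currently exports to CloudWatch Logs:

import boto3

# Placeholders: point these at your own Region and RDS instance
rds = boto3.client("rds", region_name="us-east-1")

resp = rds.describe_db_instances(DBInstanceIdentifier="my-postgres-instance")
exports = resp["DBInstances"][0].get("EnabledCloudwatchLogsExports", [])
print("Log types exported to CloudWatch Logs:", exports or "none")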

Set up proactive monitoring and alerting in Amazon RDS manually

In this section, we walk through the process to set up the resources for active monitoring and alerting using various AWS services. To deploy these resources via AWS CloudFormation, you can skip to the next section.

Create an SNS topic

We first create a standard SNS topic and subscribe to it in order to receive email notifications. For instructions, see To create a topic using the AWS Management Console and Email notifications, respectively.
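If you prefer to create the topic and subscription programmatically instead of through the console, a minimal boto3 sketch looks like the following (the topic name and email address are placeholders, and the subscription still has to be confirmed from that mailbox):

import boto3

sns = boto3.client("sns", region_name="us-east-1")  # placeholder Region

# Create (or return the existing) standard topic
topic = sns.create_topic(Name="rds-error-notifications")

# Subscribe an email endpoint; SNS sends a confirmation link to this address
sns.subscribe(
    TopicArn=topic["TopicArn"],
    Protocol="email",
    Endpoint="dba-team@example.com",
)
print("SNS topic ARN:", topic["TopicArn"])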

If you already have an existing SNS topic that you want to use, you can skip to the next step.

Set up an IAM role and policies

In this step we create the IAM policy and role that grant the Lambda function the permissions it needs. We start by creating the policy.

  1. On the IAM console, under Access management, choose Policies.
  2. Choose Create policy.
  3. Choose JSON.
  4. Enter the following JSON code, providing your Region, account ID, SNS topic name, and the name of the function you’re creating:

================================================

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "sns:Publish"
            ],
            "Resource": [
                "arn:aws:logs:<region>:<account number>:*",
                "arn:aws:sns:<region>:<account number>:<sns topic name>",
                "arn:aws:lambda:<region>:<account number>:function:<lambda function name>"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:<region>:<account number>:log-group:/aws/lambda/<lambda function name>:*"
            ]
        }
    ]
}

========================================================

  5. Choose Review policy.
  6. Enter a policy name, such as AWSLambdaExecutionRole-ErrorNotification.
  7. Choose Create policy.

You now create a role and attach your policy.

  8. In the navigation pane, under Access management, choose Roles.
  9. Choose Create role.
  10. Choose Lambda.
  11. Choose Next: Permissions.
  12. Find and choose the policy you just created.
  13. Choose Next: Tags.
  14. Choose Next: Review.
  15. Enter a name for the role, such as LambdaRoleErrorNotification.
  16. Choose Create role.
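If you want to script this step rather than click through the console, the following boto3 sketch creates the same policy and role. It assumes you saved the JSON document from step 4, with your ARNs filled in, to a file named rds-error-policy.json (the file name is just a placeholder):

import json
import boto3

iam = boto3.client("iam")

# The policy JSON from step 4, with your Region, account ID, topic, and function names filled in
with open("rds-error-policy.json") as f:
    policy_document = f.read()

policy = iam.create_policy(
    PolicyName="AWSLambdaExecutionRole-ErrorNotification",
    PolicyDocument=policy_document,
)

# Trust policy that lets the Lambda service assume the role
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}
iam.create_role(
    RoleName="LambdaRoleErrorNotification",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)
iam.attach_role_policy(
    RoleName="LambdaRoleErrorNotification",
    PolicyArn=policy["Policy"]["Arn"],
)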

Create a Lambda function

This step illustrates how to create the Lambda function that processes the CloudWatch logs and sends notifications using the ARN of the SNS topic you created.

  1. On the Lambda console, choose Create function. You need to create the function in the same Region as the RDS database server you want to monitor.
  2. Select Author from scratch.
  3. For Function name, enter a name, such as RDSErrorsNotification.
  4. For Runtime, choose Python 3.8.
  5. For Execution role, select Use an existing role.
  6. For Existing role, choose the role you created.
  7. Enter the following code:

==============================================================

import sys
import re
import boto3
import math, time
import datetime
import base64
import json
import gzip
import os

def lambda_handler(event, context):
    # Read the CloudWatch Logs data delivered to the function
    cloud_log_data = event['awslogs']['data']
    message = ""
    compressed_data = base64.b64decode(cloud_log_data)
    uncompressed_data = gzip.decompress(compressed_data)
    logdataload = json.loads(uncompressed_data)

    # Get the log group name that needs processing
    LogGroupName = logdataload['logGroup']

    # Debug output of logEvents
    print(logdataload["logEvents"])

    # Get the environment variables set on the Lambda function
    Region = os.environ.get('Region')
    SNSArn = os.environ.get('SNSArn')
    SNSRegion = os.environ.get('SNSRegion')

    ExcludeFilterPattern = os.environ.get('ExcludeFilterPattern')
    if ExcludeFilterPattern is None:
        ExcludeFilterPattern = "password"
    else:
        ExcludeFilterPattern = ExcludeFilterPattern + ",password"
    ExcludeFilterPattern = ExcludeFilterPattern.split(",")

    IncludeFilterPattern = os.environ.get('IncludeFilterPattern')

    # The script works for Oracle/PostgreSQL/Aurora/MariaDB/SQL Server/MySQL;
    # this condition checks which engine the log group belongs to and assigns a pattern
    if IncludeFilterPattern is None:
        if "oracle" in LogGroupName.lower():
            IncludeFilterPattern = "ORA-"
        if "postgres" in LogGroupName.lower():
            IncludeFilterPattern = "ERROR"
        if "aurora" in LogGroupName.lower():
            IncludeFilterPattern = "ERROR"
        if "maria" in LogGroupName.lower():
            IncludeFilterPattern = "ERROR"
        if "sqlserver" in LogGroupName.lower():
            IncludeFilterPattern = "ERROR"
        if "mysql" in LogGroupName.lower():
            IncludeFilterPattern = "ERROR"
    IncludeFilterPattern = IncludeFilterPattern.split(",")

    # Check whether any log events exist that could match the pattern
    errors_exist = len(logdataload["logEvents"])
    if errors_exist == 0:
        print("No errors exist")
    else:
        for record in logdataload["logEvents"]:
            # Check whether the error is in the exclude list and should be filtered out
            if re.compile('|'.join(ExcludeFilterPattern), re.IGNORECASE).search(record["message"]):
                print('Error is ignored for {0}'.format(record["message"]))
            else:
                if re.compile('|'.join(IncludeFilterPattern), re.IGNORECASE).search(record["message"]):
                    message += "Errors in logfile are:\n" + record["message"] + "\n"
                else:
                    print("No errors match IncludeFilterPattern list")

    # Send an SNS notification with the error information
    if len(message) > 0:
        SNSClient = boto3.client('sns', region_name=SNSRegion)
        response = SNSClient.publish(
            TopicArn=SNSArn,
            Message=str(message),
            Subject='Errors exist in RDS database log group ' + LogGroupName
        )
        print('\t Response:{xyz} \n'.format(xyz=response['ResponseMetadata']['HTTPStatusCode']))

====================================================
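Before deploying, you can optionally sanity-check the handler on your own machine. The sketch below is not part of the original walkthrough; it assumes the code above is saved as lambda_function.py, uses a made-up log group name, and needs the SNSArn and SNSRegion environment variables set locally, otherwise the final publish call fails. It builds a fake CloudWatch Logs event the same way the service delivers it (JSON that is gzip-compressed and base64-encoded) and calls the handler directly:

import base64
import gzip
import json

from lambda_function import lambda_handler  # assumes the code above is saved as lambda_function.py

# A fake CloudWatch Logs payload with a single PostgreSQL-style ERROR line
payload = {
    "logGroup": "/aws/rds/instance/my-postgres-instance/postgresql",  # hypothetical log group
    "logEvents": [
        {"id": "1", "timestamp": 0, "message": 'ERROR: relation "missing_table" does not exist'}
    ],
}

# CloudWatch Logs delivers the data gzip-compressed and base64-encoded
compressed = gzip.compress(json.dumps(payload).encode("utf-8"))
event = {"awslogs": {"data": base64.b64encode(compressed).decode("utf-8")}}

lambda_handler(event, None)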

  8. Choose Deploy.
  9. On your Lambda function page, choose Edit environment variables and input the following keys with their corresponding values:

  • SNSArn – The ARN of the SNS topic you created. This variable is mandatory.
  • SNSRegion – The Region of the SNS topic you created. This variable is mandatory.
  • Region – The Region of the Lambda function and the RDS database CloudWatch logs. This variable is mandatory.
  • IncludeFilterPattern – Comma-separated patterns for errors that you want to be alerted about. For example, if the RDS database is Oracle, you could be notified only of errors like ORA-00600,ORA-07445. You can use this parameter to filter any pattern (not just errors) that needs to be monitored in the database. This variable is optional.
  • ExcludeFilterPattern – Comma-separated patterns for errors that you don't want to be alerted about. For example, if the RDS database is Oracle, the value can be ORA-12560,ORA-12152. This variable is optional.

Note that the filter pattern you set later on the CloudWatch Logs trigger uses its own syntax, for example ?ERROR ?FATAL ?LOG, where each ?term matches any log line containing that term; whatever matches the pattern becomes part of the email body we send.

By default, if no filter patterns are mentioned, all ORA- errors in the Oracle RDS alert.log and ERROR messages in the PostgreSQL or Aurora postgresql.log are alerted.

  10. Choose Save.
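If you script your deployments, the same environment variables can also be set with boto3 instead of the console (a sketch; the function name, topic ARN, and Regions below are placeholders):

import boto3

lambda_client = boto3.client("lambda", region_name="us-east-1")

lambda_client.update_function_configuration(
    FunctionName="RDSErrorsNotification",
    Environment={
        "Variables": {
            "SNSArn": "arn:aws:sns:us-east-1:123456789012:rds-error-notifications",
            "SNSRegion": "us-east-1",
            "Region": "us-east-1",
            # "IncludeFilterPattern": "ORA-00600,ORA-07445",  # optional
            # "ExcludeFilterPattern": "ORA-12560,ORA-12152",  # optional
        }
    },
)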

Create a CloudWatch trigger in the Lambda function

The database logs stored in CloudWatch need to be streamed to the Lambda function for it to process in order to send notifications. You can stream CloudWatch logs by creating a CloudWatch trigger in the function.

  1. On the Lambda console, choose the function you created (or the CloudFormation stack created for you).
  2. In the Designer section, choose Add trigger.
  3. On the drop-down menu, choose CloudWatch Logs.
  4. For Log group, choose the log group corresponding to the RDS database you want to monitor. You can add additional log groups by creating additional triggers.
  5. For Filter name, enter a name.
  6. For Filter pattern, enter a pattern (ORA- for Oracle or ERROR for PostgreSQL, Aurora, MariaDB, MySQL, or SQL Server). If you don't provide a filter pattern, the function is triggered for every log entry written to the database logs, which we don't want because it increases costs. Because we just want to be notified of alerts, we stream only the log lines containing the ERROR or ORA- keywords to the Lambda function.
  7. Choose Add.

To add more triggers, repeat these steps for other database logs.
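If you prefer to script this step as well, the same trigger is just a CloudWatch Logs subscription filter, which can be created with boto3 as sketched below (the log group, Region, account ID, and function name are placeholders). Note that the console's Add trigger flow also grants CloudWatch Logs permission to invoke the function (the Lambda AddPermission API); when scripting, you need to grant that permission yourself.

import boto3

logs = boto3.client("logs", region_name="us-east-1")

logs.put_subscription_filter(
    logGroupName="/aws/rds/instance/my-postgres-instance/postgresql",  # log group to stream
    filterName="rds-error-filter",
    filterPattern="ERROR",  # use "ORA-" for Oracle alert logs
    destinationArn="arn:aws:lambda:us-east-1:123456789012:function:RDSErrorsNotification",
)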

Test the error notification

Now we’re ready to test the notification when an error occurs in the database. You can generate a fake alert in the RDS database.

For this post, we create an alert for an RDS for PostgreSQL database.
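One simple way to generate a harmless ERROR entry in the PostgreSQL log is to run a statement against a table that doesn't exist. The snippet below is only an illustrative sketch using psycopg2 (the endpoint, database, and credentials are placeholders); any failing statement run from your usual SQL client works just as well.

import psycopg2

# Placeholders: use your own RDS endpoint, database, and credentials
conn = psycopg2.connect(
    host="my-postgres-instance.abcdefgh1234.us-east-1.rds.amazonaws.com",
    dbname="postgres",
    user="dbadmin",
    password="your-password",
)
conn.autocommit = True
try:
    with conn.cursor() as cur:
        # This statement fails and is written to postgresql.log as an ERROR
        cur.execute("SELECT * FROM table_that_does_not_exist;")
except psycopg2.Error as err:
    print("Generated test error:", err)
finally:
    conn.close()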

You should receive an email at the address subscribed to the SNS topic.

Once we receive such notifications, it means the solution is picking up the recent error logs from CloudWatch and sending them over email.

References:

https://aws.amazon.com/blogs/database/build-proactive-database-monitoring-for-amazon-rds-with-amazon-cloudwatch-logs-aws-lambda-and-amazon-sns/

We have successfully completed our goal… Cheers and enjoy!!!

Ansible Tower Installation

 

Hi Friends,

Today we are going to install Ansible Tower on CentOS 7 machine.

Before we start the installation steps, we need to know some benefits of Ansible Tower.

Why do we need Ansible Tower, and what are its benefits?

1. Ansible Tower helps you scale IT automation, manage complex deployments, and speed productivity.

2. Centralize and control your IT infrastructure with a visual dashboard, role-based access control, job scheduling, integrated notifications, and graphical inventory management.

3. Ansible Tower's REST API and CLI make it easy to embed Ansible Tower into existing tools and processes.

4. The Ansible Tower dashboard provides a heads-up, NOC-style display for everything going on in your Ansible environment.

5. As soon as you log in, you'll see your host and inventory status, all the recent job activity, and a snapshot of recent job runs. Adjust your job status settings to graph data from specific jobs and time ranges.

6. We get real-time job status updates:

Within Ansible Tower, playbook runs stream by in real time. As Ansible automates across your infrastructure, you'll see plays and tasks complete, broken down by each machine and each success or failure, complete with output. Easily see the status of your automation, and what's next in the queue.

Other types of jobs, such as source control updates or cloud inventory refreshes, appear in the common job view. Know what Ansible Tower is up to at any time.

7. Multi-playbook workflows:

Ansible Tower Workflows allow you to easily model complex processes with Ansible Tower's intuitive workflow editor. Ansible Tower workflows chain any number of playbooks, updates, and other workflows, regardless of whether they use different inventories, run as different users, run at once, or utilize different credentials.

You can build a provisioning workflow that provisions machines, applies a base system configuration, and deploys an application, all with different playbooks maintained by different teams. You can build a CI/CD testing workflow that builds an application, deploys it to a test environment, runs tests, and automatically promotes the application based on test results. Set up different playbooks to run in case of success or failure of a prior workflow playbook.

8. Who ran what job, when:

With Ansible Tower, all automation activity is securely logged. Who ran it, how they customized it, what it did, where it happened – all securely stored and viewable later, or exported through Ansible Tower's API.

Activity streams extend this by showing a complete audit trail of all changes made to Ansible Tower itself – job creation, inventory changes, credential storage, all securely tracked.

All audit and log information can be sent to your external logging and analytics provider to perform analysis of automation and event correlation across your entire environment.

9. Ansible Tower allows us to easily streamline the delivery of applications and services to both OpenStack and Amazon clouds in a cost-effective, simple, and secure manner.

10. Scale capacity with Tower clusters:

Connect multiple Ansible Tower nodes into an Ansible Tower cluster. Ansible Tower clusters add redundancy and capacity, allowing you to scale Ansible automation across your enterprise, including with reserved capacity for certain teams and jobs, and remote execution for access across network zones.

11. Integrated notifications:

Stay informed of your automation status via integrated notifications.

Notify a person or team when your job succeeds, or escalate when jobs fail. Send notifications across your entire organization at once, or customize on a per-job basis.

Connect your notifications to Slack, HipChat, PagerDuty, SMS, email, and more – or post notifications to a custom webhook to trigger other tools in your infrastructure.

12. Schedule Ansible jobs:

Playbook runs, cloud inventory updates, and source control updates can be scheduled inside Ansible Tower – run now, run later, or run forever.

Set up occasional tasks like nightly backups, periodic configuration remediation for compliance, or a full continuous delivery pipeline with just a few clicks.

13. Manage and track your entire inventory:

Ansible Tower helps you manage your entire infrastructure. Easily pull your inventory from public cloud providers such as Amazon Web Services, Microsoft Azure, and more, or synchronize from your local OpenStack cloud or VMware environment. Connect your inventory directly to your Red Hat Satellite or Red Hat CloudForms environment, or your custom CMDB.

Ansible Tower can keep your cloud inventory in sync, and Ansible Tower's powerful provisioning callbacks allow nodes to request configuration on demand, enabling autoscaling. You can also see alerts from Red Hat Insights directly from Ansible Tower, and use Insights-provided playbook remediation to fix issues in your infrastructure.

Plus, Ansible Tower Smart Inventories allow you to organize and automate hosts across all your providers based on a powerful host fact query engine.

14. Remote command execution:

Run simple tasks on any host or group of hosts in your inventory with Ansible Tower's remote command execution. Add users or groups, reset passwords, restart a malfunctioning service, or patch a critical security issue, quickly. As always, remote command execution uses Ansible Tower's role-based access control engine and logs every action.

15. Comprehensive REST API and Tower CLI tool:

Far from being limited to just the user interface, every feature of Ansible Tower is available via Ansible Tower's REST API, providing the ideal API for a systems management infrastructure to build against. Call Ansible Tower jobs from your build tools, show Ansible Tower information in your custom dashboards, and more. Get API usage information and best practices with built-in documentation.
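As a small illustration of point 15, the sketch below lists job templates through the Tower REST API using Python's requests library. The URL, credentials, and /api/v2/ path are assumptions based on a default Tower 3.x install (the admin password comes from the inventory file used later in this post); check the API documentation for your own version.

import requests

TOWER_URL = "https://192.168.56.102"  # your Tower server
AUTH = ("admin", "admin")             # the admin password set in the inventory file

resp = requests.get(
    f"{TOWER_URL}/api/v2/job_templates/",
    auth=AUTH,
    verify=False,  # the default install uses a self-signed certificate
)
resp.raise_for_status()
for template in resp.json().get("results", []):
    print(template["id"], template["name"])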

Prerequisites:

Ansible Tower server (I’m using a VMware environment, so both my servers are VMs)

1 Core, 1GB RAM Ubuntu 12.04 LTS Server, 64-bit

Active Directory Server (I’m using Windows Server 2012 R2)

2 Cores, 4GB RAM

Officially, Tower supports CentOS 6, RedHat Enterprise Linux 6, Ubuntu Server 12.04 LTS, and Ubuntu Server 14.04 LTS.

Installing Tower requires Internet connectivity, because it downloads from their repo servers.

I have managed to perform an offline installation, but you have to set up some kind of system to mirror their repositories, and change some settings in the Ansible Installer file.

I *highly* recommend you dedicate a server (VM or otherwise) to Ansible Tower, because the installer will rewrite pg_hba.conf and supervisord.conf to suit its needs. Everything is easier if you give it its own environment to run in.

You *might* be able to do it in Docker, although I haven’t tried, and I’m willing to bet you’re asking for trouble.

I’m going to assume you already know about installing Windows Server 2012 and building a domain controller. (If there’s significant call for it, I might write a separate blog post about this…)

 

Installation Steps:

Step – 1:

Download the latest .tar file from https://releases.ansible.com/ansible-tower/setup/

Step – 2:

Now untar the downloaded .tar file using the below command:

[devops@localhost ~]$ tar xvzf ansible-tower-setup-latest.tar.gz

Step – 3:

Go to the "ansible-tower-setup-<VERSION>" directory:

[devops@localhost ~]$ cd ansible-tower-setup-3.4.0-2

Step – 4:

Edit the inventory file present inside the "ansible-tower-setup-3.4.0-2" directory.

Put the below content into the inventory file:

# /home/ubuntu/ansible-tower-setup-3.2.5/inventory file content

[tower]
localhost ansible_connection=local

[database]

[all:vars]
admin_password=’admin’

pg_host=''
pg_port=''

pg_database=’awx’
pg_username=’awx’
pg_password=’admin’

rabbitmq_port=5672
rabbitmq_vhost=tower
rabbitmq_username=tower
rabbitmq_password=’admin’
rabbitmq_cookie=cookiemonster

# Needs to be true for fqdns and ip addresses
rabbitmq_use_long_name=false

# Isolated Tower nodes automatically generate an RSA key for authentication;
# To disable this behavior, set this value to false
# isolated_key_generation=true
################################# Up to here ########################

Step – 5:

Now run the Ansible Tower setup using the below command:

[devops@localhost ansible-tower-setup-3.4.0-2]$ sudo ansible-playbook -i inventory install.yml

[INFO]  To run the above command, we should have Ansible installed on this server.

[INFO]  If Ansible is not present, please open the below link and install it:

[INFO]  https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-ansible-on-centos-7

 

[INFO] Setup will take approximately 45 minutes or more.

Sample Output:

[sudo] password for devops:
[WARNING]: Could not match supplied host pattern, ignoring: instance_group_*
PLAY [tower:database:instance_group_*] ******************************************************************************************************************************

TASK [check_config_static : Ensure expected variables are defined] **************************************************************************************************
skipping: [localhost] => (item=tower_package_name)
skipping: [localhost] => (item=tower_package_version)
skipping: [localhost] => (item=tower_package_release)

TASK [check_config_static : Detect unsupported HA inventory file] ***************************************************************************************************
skipping: [localhost]

TASK [check_config_static : Ensure at least one tower host is defined] **********************************************************************************************
skipping: [localhost]

TASK [check_config_static : Ensure only one database host exists] ***************************************************************************************************
skipping: [localhost]

TASK [check_config_static : Ensure when postgres host is defined that the port is defined] **************************************************************************
skipping: [localhost]

TASK [check_config_static : Ensure that when a database host is specified, that pg_host is defined] *****************************************************************
skipping: [localhost]

TASK [check_config_static : Ensure that when a database host is specified, that pg_port is defined] *****************************************************************
skipping: [localhost]

TASK [check_config_static : HA mode requires an external postgres database with pg_host defined] ********************************************************************
skipping: [localhost]

TASK [check_config_static : HA mode requires an external postgres database with pg_port defined] ********************************************************************
skipping: [localhost]

TASK [config_dynamic : Set database to internal or external] ********************************************************************************************************
ok: [localhost]

TASK [config_dynamic : Database decision] ***************************************************************************************************************************
ok: [localhost] => {
"config_dynamic_database": "internal"
}

TASK [config_dynamic : Set postgres host and port to local if not set] **********************************************************************************************
ok: [localhost]

TASK [config_dynamic : Ensure connectivity to hosts and gather facts] ***********************************************************************************************
ok: [localhost]

TASK [config_dynamic : Get effective uid] ***************************************************************************************************************************
changed: [localhost]

TASK [config_dynamic : Ensure user is root] *************************************************************************************************************************
skipping: [localhost]

PLAY [Group nodes by OS distribution] *******************************************************************************************************************************

TASK [Gathering Facts] **********************************************************************************************************************************************
ok: [localhost]

TASK [group hosts by distribution] **********************************************************************************************************************************
ok: [localhost]
[WARNING]: Could not match supplied host pattern, ignoring: RedHat-7*

[WARNING]: Could not match supplied host pattern, ignoring: Ubuntu-16.04

[WARNING]: Could not match supplied host pattern, ignoring: OracleLinux-7*
PLAY [Group supported distributions] ********************************************************************************************************************************

TASK [group hosts for supported distributions] **********************************************************************************************************************
ok: [localhost]
[WARNING]: Could not match supplied host pattern, ignoring: none

.

.

.

.

TASK [misc : Create the default organization if it is needed.] ******************************************************************************************************
changed: [localhost]

RUNNING HANDLER [supervisor : restart supervisor] *******************************************************************************************************************
changed: [localhost] => {
"msg": "Restarting supervisor."
}

RUNNING HANDLER [supervisor : Stop supervisor.] *********************************************************************************************************************
changed: [localhost]

RUNNING HANDLER [supervisor : Wait for supervisor to stop.] *********************************************************************************************************
ok: [localhost]

RUNNING HANDLER [supervisor : Start supervisor.] ********************************************************************************************************************
changed: [localhost]

RUNNING HANDLER [nginx : restart nginx] *****************************************************************************************************************************
changed: [localhost]

PLAY [Install Tower isolated node(s)] *******************************************************************************************************************************
skipping: no hosts matched

PLAY RECAP **********************************************************************************************************************************************************
localhost : ok=139 changed=65 unreachable=0 failed=0

########### Sample Output Ends Here ############

Step – 6: Now, open a browser, type the address below, and hit Enter:

https://Your-Server-IP-Address

Example:

https://192.168.56.102

1.Browser_Proceed_unsafe.png

 

Now, click on Advanced –> Proceed to <192.168.56.102>. It will show as below.

 

 

It will automatically redirect to the Ansible Tower login page (as shown below).

2.Login_Page_on_browser.png

Now, enter your username and password (which you provided in the "/home/ubuntu/ansible-tower-setup-3.2.5/inventory" file)

and click on the "SIGN IN" button.

After logging in successfully, it will ask for a license. Please click on "REQUEST LICENSE".

3.Request_license.png

It will take you to the Ansible Tower license request page. Please select the appropriate options, fill in all the mandatory fields, and click on the "Submit" button as shown below:

4.Redirect_to_get_license_after_filling_proper_details_main.png

After you click on the "Submit" button, the license file will be sent to your mail within 1 business day. Please check your mail.

After getting the license file, please open Ansible Tower and complete the items marked with orange squares below:

3.1browse_license_file

And click on the "Submit" button.

After successful submission, it will show you the Ansible Tower Dashboard, as shown below:

5.After_successful_login_first_page_view.png

Finally, you have a successful Ansible Tower installation on your CentOS 7 machine.

In the next blog we will see how to run the first playbook from Ansible Tower.

 

References:

https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-ansible-on-centos-7

Ansible Tower User Guide 2.1.6 (tower_user_guide-2.1.6.pdf)

https://www.ansible.com/products/tower

 

 

DataDog(Modern monitoring & analytics)

Part-1

Reference Links:

https://www.datadoghq.com/

Note: Installation performed on Ubuntu 16.10.

What is DataDog?

– Datadog is a SaaS application.

– Datadog is a monitoring service for cloud-scale applications that allows you to monitor servers, databases, tools, and services.

– These capabilities are provided on a SaaS-based data analytics platform.

– Datadog is a monitoring service for IT, operations, and development teams who write and run applications at scale and want to turn the massive amounts of data produced by their apps, tools, and services into actionable insights.

– Datadog provides a cloud infrastructure monitoring service, with a dashboard, alerting, and visualizations of metrics.

– As cloud adoption increased, Datadog grew rapidly and expanded its product offering to cover service providers including Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, Red Hat OpenShift, and OpenStack.

Why do we use DataDog?

  • All clouds, servers, apps, services, metrics, and more – all in one place.
  • It supports HipChat, Slack, PagerDuty, OpsGenie, and VictorOps for messaging.
  • Can take a snapshot of a particular graph at any time frame.
  • Supports static reports with a dynamic view of graphs.
  • Can set alerts proactively (you can define multiple alerts at a time).
  • Allows you to get metrics info on demand using REST API calls.
  • Provides libraries for common languages like Java, Node.js, Perl, Ruby, PHP, Go, and Python.
  • Provides integration libraries for SaltStack, Ansible, FreeSWITCH, Google Analytics, Logstash, Elasticsearch, Apache2, Jenkins, NGINX, AWS, and more.
  • Can write custom metrics to capture particular info related to our application (see the short sketch after this list).
  • Provides log management as well.
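As a quick example of the custom-metrics point above, here is a minimal sketch using the official datadog Python package and DogStatsD (the Agent listens for DogStatsD traffic on localhost:8125 by default; the metric names and tags below are made up):

from datadog import initialize, statsd

# The locally running Agent listens for DogStatsD metrics on 127.0.0.1:8125 by default
initialize(statsd_host="127.0.0.1", statsd_port=8125)

# Report hypothetical application metrics with custom tags
statsd.gauge("myapp.active_sessions", 42, tags=["env:dev", "team:devops"])
statsd.increment("myapp.logins", tags=["env:dev"])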

Installation:

Reference:

1. https://docs.datadoghq.com/agent/basic_agent_usage/ubuntu/

Step – 1: Create a free-trial account on Datadog. It allows you to use the Datadog application on a 14-day trial basis. To create an account, please visit the following link:

https://www.datadoghq.com/

and click on the "Free Trial" button as shown in the picture below:

free-trial-1.jpg

It will open the following:

free-trial-2.jpg

Fill in all the information and click on the "Sign Up" button.

It will prompt the following. Don't write anything inside any box; just click on the "Next" button until step 2.

In step 3, it will ask which OS you want to install the Datadog agent on. Please select the operating system. Once you select the OS, on the right-hand side you will see the installation steps (i.e., how to install the Datadog agent on that particular OS).

In this blog, I am going to install the Datadog agent on an Ubuntu machine, so as shown below, I have selected "Ubuntu" from the left pane.

install-dd-agent-ubuntu-4

Here, I have selected the OS as Ubuntu; follow the steps (mostly it takes only the 1st step to install the datadog-agent). Then click on "Finish".

Check whether the datadog-agent service is running, using the following commands:

Command:

For Ubuntu 16.04 or higher, check the status (or restart the Agent) with:

sudo systemctl status datadog-agent
sudo systemctl restart datadog-agent.service

To start the Agent on Ubuntu 14.04:

sudo initctl start datadog-agent

You are done with the Datadog agent installation on the Ubuntu machine.

After that, when you click on following link:

https://www.datadoghq.com/

Go to Login –> enter your username and password –> click on Log In.

You can see the following:

getting-started-5

Part-1 ends here.

Part-2 Monitor Jenkins Jobs with Data Dog:

Reference Links:

https://docs.datadoghq.com/integrations/jenkins/

https://www.datadoghq.com/blog/monitor-jenkins-datadog/

  • Jenkins plugin sends your build and deployment events to Datadog. From there, you can overlay them onto graphs of your other metrics so you can identify which deployments really affect your application’s performance and reliability—for better or worse.
  • The plugin also tracks build times (as a metric) and statuses (as a service check), so you’ll know when your builds aren’t healthy.

The Jenkins check for the Datadog Agent is deprecated. Use the Jenkins plugin instead.

Installation

This plugin requires Jenkins 1.580.1 or newer.

This plugin can be installed from the Update Center (found at Manage Jenkins -> Manage Plugins) in your Jenkins installation.

  1. Navigate to the web interface of your Jenkins installation.
  2. From the Update Center (found at Manage Jenkins -> Manage Plugins), under the Available tab, search for Datadog Plugin.
  3. Check the checkbox next to the plugin, and install using one of the two install buttons at the bottom of the screen.
  4. To configure the plugin, navigate to the Manage Jenkins -> Configure System page, and find the Datadog Plugin section.
  5. Copy/paste your API key from the API Keys page on your Datadog account into the API Key textbox on the configuration screen.
  6. Before saving your configuration, test your API connection using the Test Key button, directly below the API Key textbox.
  7. Restart Jenkins to get the plugin enabled.
  8. Optional: Set a custom hostname. You can set a custom hostname for your Jenkins host via the Hostname textbox on the same configuration screen. Note: The hostname must follow the RFC 1123 format.

Configuration:

No additional configuration is needed for the Jenkins integration.

Validation

You will start to see Jenkins events in the Event Stream when the plugin is up and running.

Metrics

The following metrics are available in Datadog:

METRIC NAME – DESCRIPTION
jenkins.queue.size (gauge) – Size of your Jenkins queue
jenkins.job.waiting (gauge) – Time spent by a job waiting, in seconds
jenkins.job.duration (gauge) – Duration of a job, in seconds

Events

The following events are generated by the plugin:

  • Started build
  • Finished build

Service Checks

  • jenkins.job.status: Build status

When you are done with the above steps:

Log in to the Datadog UI –> on the left pane, open Dashboard Lists –> search for the "Jenkins Overview" dashboard –> click on it to get all the Jenkins details in one place.

Jenkins is an open source, Java-based continuous integration server that helps organizations build, test, and deploy projects automatically. Jenkins is widely used, having been adopted by organizations like GitHub, Etsy, LinkedIn, and Datadog.

You can set up Jenkins to test and deploy your software projects every time you commit changes, to trigger new builds upon successful completion of other builds, and to run jobs on a regular schedule. With hundreds of plugins, Jenkins supports a wide variety of use cases.

As shown in the out-of-the-box dashboard below, our Datadog plugin will provide more insights into job history and trends than Jenkins’s standard weather reports. You can use the plugin to:

  • Set alerts for important build failures
  • Identify trends in build durations
  • Correlate Jenkins events with performance metrics from other parts of your infrastructure in order to identify and resolve issues
Monitor Jenkins default dashboard in Datadog

Monitor Jenkins build status in real-time

Once you install the Jenkins-Datadog plugin, Jenkins activities (when a build starts, fails, or succeeds) will start appearing in your Datadog event stream. You will also see what percentage of builds failed within the same job, so that you can quickly spot which jobs are experiencing a higher rate of failure than others.

Monitor Jenkins events in Datadog event stream

Remember to blacklist any jobs you don’t want to track by indicating them in your plugin configuration.

Datadog’s Jenkins dashboard gives you a high-level overview of how your jobs are performing. The status widget displays the current status of all jobs that have run in the past day, grouped by success or failure. To explore further, you can also click on the widget to view the jobs that have failed or succeeded in the past day.

Monitor Jenkins jobs tagged by result success or failure

You can also see the proportion of successful vs. failed builds, along with the total number of job runs completed over the past four hours.

Datadog also enables you to correlate Jenkins events with application performance metrics to investigate the root cause of an issue. For example, the screenshot below shows that average CPU on the app servers increased sharply after a Jenkins build was completed and deployed (indicated by the pink bar). Your team can use this information as a starting point to investigate if code changes in the corresponding release may be causing the issue.

Monitor Jenkins - build affects CPU graph

Visualize job duration metrics

Every time a build is completed, Datadog’s plugin collects its duration as a metric that you can aggregate by job name or any other tag, and graph over time. In the screenshot below, we can view the average job durations in the past four hours, sorted in decreasing order:

Monitor Jenkins job durations ranked in Datadog

You can also graph and visualize trends in build durations for each job by using Datadog’s robust_trend() linear regression function, as shown in the screenshot below. This graph indicates which jobs’ durations are trending longer over time, so that you can investigate if there appears to be a problem. If you’re experimenting with changes to your CI pipeline, consulting this graph can help you track the effects of those changes over time.

Monitor Jenkins build duration trends graph

Use tags to monitor Jenkins jobs

Tags add custom dimensions to your monitoring, so you can focus on what’s important to you right now.

Every Jenkins event, metric, and service check is auto-tagged with job, result, and branch (if applicable). You can also enable the optional node tag in the plugin settings.

As of version 0.5.0, the plugin supports custom tags. This update was developed by one of our open source contributors, Mads Nielsen. Many thanks to Mads for helping us implement this feature.

You can create custom tags for the name of the application you’re building, your particular team name (e.g. team=licorice), or any other info that matters to you. For example, if you have multiple jobs that perform nightly builds, you might want to create a descriptive tag that distinguishes them from other types of jobs.

 

Testing Part-2 functionality:

Log in to Jenkins –> run any single job –> after the job completes, go to the Datadog UI page and click on "Dashboard Lists" –> "Jenkins Overview" dashboard.

There you can see the details of the Jenkins job you executed just a couple of minutes ago.

Part-2 ends here

 

Part-3 Log collection with DataDog

Reference: https://docs.datadoghq.com/integrations/nginx/

Overview

The Datadog Agent can collect many metrics from NGINX instances, including (but not limited to):

  • Total requests
  • Connections (e.g. accepted, handled, active)

For users of NGINX Plus, the commercial version of NGINX, the Agent can collect the significantly more metrics that NGINX Plus provides, like:

  • Errors (e.g. 4xx codes, 5xx codes)
  • Upstream servers (e.g. active connections, 5xx codes, health checks, etc.)
  • Caches (e.g. size, hits, misses, etc.)
  • SSL (e.g. handshakes, failed handshakes, etc.)

Setup

Installation

The NGINX check is included in the Datadog Agent package, so you don’t need to install anything else on your NGINX servers.

NGINX STATUS MODULE

The NGINX check pulls metrics from a local NGINX status endpoint, so your nginx binaries need to have been compiled with one of two NGINX status modules:

NGINX Plus packages always include the http status module, so if you’re a Plus user, skip to Configuration now. For NGINX Plus release 13 and above, the status module is deprecated and you should use the new Plus API instead. See the announcement for more information.

If you use open source NGINX, however, your instances may lack the stub status module. Verify that your nginx binary includes the module before proceeding to Configuration:

$ nginx -V 2>&1| grep -o http_stub_status_module
http_stub_status_module

If the command output does not include http_stub_status_module, you must install an NGINX package that includes the module. You can compile your own NGINX—enabling the module as you compile it—but most modern Linux distributions provide alternative NGINX packages with various combinations of extra modules built in. Check your operating system’s NGINX packages to find one that includes the stub status module.

Configuration

Edit the nginx.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory to start collecting your NGINX metrics and logs. See the sample nginx.d/conf.yaml for all available configuration options.

PREPARE NGINX

On each NGINX server, create a status.conf file in the directory that contains your other NGINX configuration files (e.g. /etc/nginx/conf.d/).

server {
  listen 81;
  server_name localhost;

  access_log off;
  allow 127.0.0.1;
  deny all;

  location /nginx_status {
    # Choose your status module

    # freely available with open source NGINX
    stub_status;

    # for open source NGINX < version 1.7.5
    # stub_status on;

    # available only with NGINX Plus
    # status;
  }
}

NGINX Plus can also use stub_status, but since that module provides fewer metrics, you should use status if you’re a Plus user.

You may optionally configure HTTP basic authentication in the server block, but since the service is only listening locally, it’s not necessary.

Reload NGINX to enable the status endpoint. (There’s no need for a full restart)

METRIC COLLECTION

  • Add this configuration block to your nginx.d/conf.yaml file to start gathering your NGINX metrics:
  init_config:

  instances:
    - nginx_status_url: http://localhost:81/nginx_status/
    # If you configured the endpoint with HTTP basic authentication
    # user: <USER>
    # password: <PASSWORD>

See the sample nginx.d/conf.yaml for all available configuration options.

LOG COLLECTION

Available for Agent >6.0

  • Collecting logs is disabled by default in the Datadog Agent; you need to enable it in datadog.yaml:
  logs_enabled: true
  • Add this configuration block to your nginx.d/conf.yaml file to start collecting your NGINX Logs:
  logs:
    - type: file
      path: /var/log/nginx/access.log
      service: nginx
      source: nginx
      sourcecategory: http_web_access

    - type: file
      path: /var/log/nginx/error.log
      service: nginx
      source: nginx
      sourcecategory: http_web_access

Change the service and path parameter values and configure them for your environment. See the sample nginx.d/conf.yaml for all available configuration options.

Learn more about log collection in the log documentation

Validation

Run the Agent’s status subcommand and look for nginx under the Checks section.

Data Collected

Metrics

nginx.net.writing
(gauge)
The number of connections waiting on upstream responses and/or writing responses back to the client.
shown as connection
nginx.net.waiting
(gauge)
The number of keep-alive connections waiting for work.
shown as connection
nginx.net.reading
(gauge)
The number of connections reading client requests.
shown as connection
nginx.net.connections
(gauge)
The total number of active connections.
shown as connection
nginx.net.request_per_s
(gauge)
Rate of requests processed.
shown as request
nginx.net.conn_opened_per_s
(gauge)
Rate of connections opened.
shown as connection
nginx.net.conn_dropped_per_s
(gauge)
Rate of connections dropped.
shown as connection
nginx.cache.bypass.bytes
(gauge)
The total number of bytes read from the proxied server
shown as byte
nginx.cache.bypass.bytes_count
(count)
The total number of bytes read from the proxied server (shown as count)
shown as byte
nginx.cache.bypass.bytes_written
(gauge)
The total number of bytes written to the cache
shown as byte
nginx.cache.bypass.bytes_written_count
(count)
The total number of bytes written to the cache (shown as count)
shown as byte
nginx.cache.bypass.responses
(gauge)
The total number of responses not taken from the cache
shown as response
nginx.cache.bypass.responses_count
(count)
The total number of responses not taken from the cache (shown as count)
shown as response
nginx.cache.bypass.responses_written
(gauge)
The total number of responses written to the cache
shown as response
nginx.cache.bypass.responses_written_count
(count)
The total number of responses written to the cache (shown as count)
shown as response
nginx.cache.cold
(gauge)
A boolean value indicating whether the “cache loader” process is still loading data from disk into the cache
shown as response
nginx.cache.expired.bytes
(gauge)
The total number of bytes read from the proxied server
shown as byte
nginx.cache.expired.bytes_count
(count)
The total number of bytes read from the proxied server (shown as count)
shown as byte
nginx.cache.expired.bytes_written
(gauge)
The total number of bytes written to the cache
shown as byte
nginx.cache.expired.bytes_written_count
(count)
The total number of bytes written to the cache (shown as count)
shown as byte
nginx.cache.expired.responses
(gauge)
The total number of responses not taken from the cache
shown as response
nginx.cache.expired.responses_count
(count)
The total number of responses not taken from the cache (shown as count)
shown as response
nginx.cache.expired.responses_written
(gauge)
The total number of responses written to the cache
shown as response
nginx.cache.expired.responses_written_count
(count)
The total number of responses written to the cache (shown as count)
shown as response
nginx.cache.hit.bytes
(gauge)
The total number of bytes read from the cache
shown as byte
nginx.cache.hit.bytes_count
(count)
The total number of bytes read from the cache (shown as count)
shown as byte
nginx.cache.hit.responses
(gauge)
The total number of responses read from the cache
shown as response
nginx.cache.hit.responses_count
(count)
The total number of responses read from the cache (shown as count)
shown as response
nginx.cache.max_size
(gauge)
The limit on the maximum size of the cache specified in the configuration
shown as byte
nginx.cache.miss.bytes
(gauge)
The total number of bytes read from the proxied server
shown as byte
nginx.cache.miss.bytes_count
(count)
The total number of bytes read from the proxied server (shown as count)
shown as byte
nginx.cache.miss.bytes_written
(gauge)
The total number of bytes written to the cache
shown as byte
nginx.cache.miss.bytes_written_count
(count)
The total number of bytes written to the cache (shown as count)
shown as byte
nginx.cache.miss.responses
(gauge)
The total number of responses not taken from the cache
shown as response
nginx.cache.miss.responses_count
(count)
The total number of responses not taken from the cache (shown as count)
shown as response
nginx.cache.miss.responses_written
(gauge)
The total number of responses written to the cache
shown as response
nginx.cache.miss.responses_written_count
(count)
The total number of responses written to the cache
shown as response
nginx.cache.revalidated.bytes
(gauge)
The total number of bytes read from the cache
shown as byte
nginx.cache.revalidated.bytes_count
(count)
The total number of bytes read from the cache (shown as count)
shown as byte
nginx.cache.revalidated.response
(gauge)
The total number of responses read from the cache
shown as responses
nginx.cache.revalidated.response_count
(count)
The total number of responses read from the cache (shown as count)
shown as responses
nginx.cache.size
(gauge)
The current size of the cache
shown as response
nginx.cache.stale.bytes
(gauge)
The total number of bytes read from the cache
shown as byte
nginx.cache.stale.bytes_count
(count)
The total number of bytes read from the cache (shown as count)
shown as byte
nginx.cache.stale.responses
(gauge)
The total number of responses read from the cache
shown as response
nginx.cache.stale.responses_count
(count)
The total number of responses read from the cache (shown as count)
shown as response
nginx.cache.updating.bytes
(gauge)
The total number of bytes read from the cache
shown as byte
nginx.cache.updating.bytes_count
(count)
The total number of bytes read from the cache (shown as count)
shown as byte
nginx.cache.updating.responses
(gauge)
The total number of responses read from the cache
shown as response
nginx.cache.updating.responses_count
(count)
The total number of responses read from the cache (shown as count)
shown as response
nginx.connections.accepted
(gauge)
The total number of accepted client connections.
shown as connection
nginx.connections.accepted_count
(count)
The total number of accepted client connections (shown as count).
shown as connection
nginx.connections.active
(gauge)
The current number of active client connections.
shown as connection
nginx.connections.dropped
(gauge)
The total number of dropped client connections.
shown as connection
nginx.connections.dropped_count
(count)
The total number of dropped client connections (shown as count).
shown as connection
nginx.connections.idle
(gauge)
The current number of idle client connections.
shown as connection
nginx.generation
(gauge)
The total number of configuration reloads
shown as reload
nginx.generation_count
(count)
The total number of configuration reloads (shown as count)
shown as reload
nginx.load_timestamp
(gauge)
Time of the last reload of configuration (time since Epoch).
shown as millisecond
nginx.pid
(gauge)
The ID of the worker process that handled status request.
nginx.ppid
(gauge)
The ID of the master process that started the worker process
nginx.processes.respawned
(gauge)
The total number of abnormally terminated and respawned child processes.
shown as process
nginx.processes.respawned_count
(count)
The total number of abnormally terminated and respawned child processes (shown as count).
shown as process
nginx.requests.current
(gauge)
The current number of client requests.
shown as request
nginx.requests.total
(gauge)
The total number of client requests.
shown as request
nginx.requests.total_count
(count)
The total number of client requests (shown as count).
shown as request
nginx.server_zone.discarded
(gauge)
The total number of requests completed without sending a response.
shown as request
nginx.server_zone.discarded_count
(count)
The total number of requests completed without sending a response (shown as count).
shown as request
nginx.server_zone.processing
(gauge)
The number of client requests that are currently being processed.
shown as request
nginx.server_zone.received
(gauge)
The total amount of data received from clients.
shown as byte
nginx.server_zone.received_count
(count)
The total amount of data received from clients (shown as count).
shown as byte
nginx.server_zone.requests
(gauge)
The total number of client requests received from clients.
shown as request
nginx.server_zone.requests_count
(count)
The total number of client requests received from clients (shown as count).
shown as request
nginx.server_zone.responses.1xx
(gauge)
The number of responses with 1xx status code.
shown as response
nginx.server_zone.responses.1xx_count
(count)
The number of responses with 1xx status code (shown as count).
shown as response
nginx.server_zone.responses.2xx
(gauge)
The number of responses with 2xx status code.
shown as response
nginx.server_zone.responses.2xx_count
(count)
The number of responses with 2xx status code (shown as count).
shown as response
nginx.server_zone.responses.3xx
(gauge)
The number of responses with 3xx status code.
shown as response
nginx.server_zone.responses.3xx_count
(count)
The number of responses with 3xx status code (shown as count).
shown as response
nginx.server_zone.responses.4xx
(gauge)
The number of responses with 4xx status code.
shown as response
nginx.server_zone.responses.4xx_count
(count)
The number of responses with 4xx status code (shown as count).
shown as response
nginx.server_zone.responses.5xx
(gauge)
The number of responses with 5xx status code.
shown as response
nginx.server_zone.responses.5xx_count
(count)
The number of responses with 5xx status code (shown as count).
shown as response
nginx.server_zone.responses.total
(gauge)
The total number of responses sent to clients.
shown as response
nginx.server_zone.responses.total_count
(count)
The total number of responses sent to clients (shown as count).
shown as response
nginx.server_zone.sent
(gauge)
The total amount of data sent to clients.
shown as byte
nginx.server_zone.sent_count
(count)
The total amount of data sent to clients (shown as count).
shown as byte
nginx.slab.pages.free
(gauge)
The current number of free memory pages
shown as page
nginx.slab.pages.used
(gauge)
The current number of used memory pages
shown as page
nginx.slab.slots.fails
(gauge)
The number of unsuccessful attempts to allocate memory of specified size
shown as request
nginx.slab.slots.fails_count
(count)
The number of unsuccessful attempts to allocate memory of specified size (shown as count)
shown as request
nginx.slab.slots.free
(gauge)
The current number of free memory slots
shown as slot
nginx.slab.slots.reqs
(gauge)
The total number of attempts to allocate memory of specified size
shown as request
nginx.slab.slots.reqs_count
(count)
The total number of attempts to allocate memory of specified size (shown as count)
shown as request
nginx.slab.slots.used
(gauge)
The current number of used memory slots
shown as slot
nginx.ssl.handshakes
(gauge)
The total number of successful SSL handshakes.
nginx.ssl.handshakes_count
(count)
The total number of successful SSL handshakes (shown as count).
nginx.ssl.handshakes_failed
(gauge)
The total number of failed SSL handshakes.
nginx.ssl.handshakes_failed_count
(count)
The total number of failed SSL handshakes (shown as count).
nginx.ssl.session_reuses
(gauge)
The total number of session reuses during SSL handshake.
nginx.ssl.session_reuses_count
(count)
The total number of session reuses during SSL handshake (shown as count).
nginx.stream.server_zone.connections
(gauge)
The total number of connections accepted from clients
shown as connection
nginx.stream.server_zone.connections_count
(count)
The total number of connections accepted from clients (shown as count)
shown as connection
nginx.stream.server_zone.discarded
(gauge)
The total number of requests completed without sending a response.
shown as request
nginx.stream.server_zone.discarded_count
(count)
The total number of requests completed without sending a response (shown as count).
shown as request
nginx.stream.server_zone.processing
(gauge)
The number of client requests that are currently being processed.
shown as request
nginx.stream.server_zone.received
(gauge)
The total amount of data received from clients.
shown as byte
nginx.stream.server_zone.received_count
(count)
The total amount of data received from clients (shown as count).
shown as byte
nginx.stream.server_zone.sent
(gauge)
The total amount of data sent to clients.
shown as byte
nginx.stream.server_zone.sent_count
(count)
The total amount of data sent to clients (shown as count).
shown as byte
nginx.stream.server_zone.sessions.2xx
(gauge)
The number of responses with 2xx status code.
shown as session
nginx.stream.server_zone.sessions.2xx_count
(count)
The number of responses with 2xx status code (shown as count).
shown as session
nginx.stream.server_zone.sessions.4xx
(gauge)
The number of responses with 4xx status code.
shown as session
nginx.stream.server_zone.sessions.4xx_count
(count)
The number of responses with 4xx status code (shown as count).
shown as session
nginx.stream.server_zone.sessions.5xx
(gauge)
The number of responses with 5xx status code.
shown as session
nginx.stream.server_zone.sessions.5xx_count
(count)
The number of responses with 5xx status code (shown as count).
shown as session
nginx.stream.server_zone.sessions.total
(gauge)
The total number of responses sent to clients.
shown as session
nginx.stream.server_zone.sessions.total_count
(count)
The total number of responses sent to clients (shown as count).
shown as session
nginx.stream.upstream.peers.active
(gauge)
The current number of connections
shown as connection
nginx.stream.upstream.peers.backup
(gauge)
A boolean value indicating whether the server is a backup server.
nginx.stream.upstream.peers.connections
(gauge)
The total number of client connections forwarded to this server.
shown as connection
nginx.stream.upstream.peers.connections_count
(count)
The total number of client connections forwarded to this server (shown as count).
shown as connection
nginx.stream.upstream.peers.downstart
(gauge)
The time (time since Epoch) when the server became “unavail” or “checking” or “unhealthy”
shown as millisecond
nginx.stream.upstream.peers.downtime
(gauge)
Total time the server was in the “unavail” or “checking” or “unhealthy” states.
shown as millisecond
nginx.stream.upstream.peers.fails
(gauge)
The total number of unsuccessful attempts to communicate with the server.
shown as fail
nginx.stream.upstream.peers.fails_count
(count)
The total number of unsuccessful attempts to communicate with the server (shown as count).
shown as fail
nginx.stream.upstream.peers.health_checks.checks
(gauge)
The total number of health check requests made.
shown as request
nginx.stream.upstream.peers.health_checks.checks_count
(count)
The total number of health check requests made (shown as count).
shown as request
nginx.stream.upstream.peers.health_checks.fails
(gauge)
The number of failed health checks.
shown as fail
nginx.stream.upstream.peers.health_checks.fails_count
(count)
The number of failed health checks (shown as count).
shown as fail
nginx.stream.upstream.peers.health_checks.last_passed
(gauge)
Boolean indicating if the last health check request was successful and passed tests.
nginx.stream.upstream.peers.health_checks.unhealthy
(gauge)
How many times the server became unhealthy (state “unhealthy”).
nginx.stream.upstream.peers.health_checks.unhealthy_count
(count)
How many times the server became unhealthy (state “unhealthy”) (shown as count).
nginx.stream.upstream.peers.id
(gauge)
The ID of the server.
nginx.stream.upstream.peers.received
(gauge)
The total number of bytes received from this server.
shown as byte
nginx.stream.upstream.peers.received_count
(count)
The total number of bytes received from this server (shown as count).
shown as byte
nginx.stream.upstream.peers.selected
(gauge)
The time (time since Epoch) when the server was last selected to process a connection.
shown as millisecond
nginx.stream.upstream.peers.sent
(gauge)
The total number of bytes sent to this server.
shown as byte
nginx.stream.upstream.peers.sent_count
(count)
The total number of bytes sent to this server (shown as count).
shown as byte
nginx.stream.upstream.peers.unavail
(gauge)
How many times the server became unavailable for client connections (state “unavail”).
nginx.stream.upstream.peers.unavail_count
(count)
How many times the server became unavailable for client connections (state “unavail”) (shown as count).
nginx.stream.upstream.peers.weight
(gauge)
Weight of the server.
nginx.stream.upstream.zombies
(gauge)
The current number of servers removed from the group but still processing active client connections.
shown as server
nginx.timestamp
(gauge)
Current time since Epoch.
shown as millisecond
nginx.upstream.keepalive
(gauge)
The current number of idle keepalive connections.
shown as connection
nginx.upstream.peers.active
(gauge)
The current number of active connections.
shown as connection
nginx.upstream.peers.backup
(gauge)
A boolean value indicating whether the server is a backup server.
nginx.upstream.peers.downstart
(gauge)
The time (since Epoch) when the server became “unavail” or “unhealthy”.
shown as millisecond
nginx.upstream.peers.downtime
(gauge)
Total time the server was in the “unavail” and “unhealthy” states.
shown as millisecond
nginx.upstream.peers.fails
(gauge)
The total number of unsuccessful attempts to communicate with the server.
nginx.upstream.peers.fails_count
(count)
The total number of unsuccessful attempts to communicate with the server (shown as count).
nginx.upstream.peers.health_checks.checks
(gauge)
The total number of health check requests made.
nginx.upstream.peers.health_checks.checks_count
(count)
The total number of health check requests made (shown as count).
nginx.upstream.peers.health_checks.fails
(gauge)
The number of failed health checks.
nginx.upstream.peers.health_checks.fails_count
(count)
The number of failed health checks (shown as count).
nginx.upstream.peers.health_checks.last_passed
(gauge)
Boolean indicating if the last health check request was successful and passed tests.
nginx.upstream.peers.health_checks.unhealthy
(gauge)
How many times the server became unhealthy (state “unhealthy”).
nginx.upstream.peers.health_checks.unhealthy_count
(count)
How many times the server became unhealthy (state “unhealthy”) (shown as count).
nginx.upstream.peers.id
(gauge)
The ID of the server.
nginx.upstream.peers.received
(gauge)
The total amount of data received from this server.
shown as byte
nginx.upstream.peers.received_count
(count)
The total amount of data received from this server (shown as count).
shown as byte
nginx.upstream.peers.requests
(gauge)
The total number of client requests forwarded to this server.
shown as request
nginx.upstream.peers.requests_count
(count)
The total number of client requests forwarded to this server (shown as count).
shown as request
nginx.upstream.peers.responses.1xx
(gauge)
The number of responses with 1xx status code.
shown as response
nginx.upstream.peers.responses.1xx_count
(count)
The number of responses with 1xx status code (shown as count).
shown as response
nginx.upstream.peers.responses.2xx
(gauge)
The number of responses with 2xx status code.
shown as response
nginx.upstream.peers.responses.2xx_count
(count)
The number of responses with 2xx status code (shown as count).
shown as response
nginx.upstream.peers.responses.3xx
(gauge)
The number of responses with 3xx status code.
shown as response
nginx.upstream.peers.responses.3xx_count
(count)
The number of responses with 3xx status code (shown as count).
shown as response
nginx.upstream.peers.responses.4xx
(gauge)
The number of responses with 4xx status code.
shown as response
nginx.upstream.peers.responses.4xx_count
(count)
The number of responses with 4xx status code (shown as count).
shown as response
nginx.upstream.peers.responses.5xx
(gauge)
The number of responses with 5xx status code.
shown as response
nginx.upstream.peers.responses.5xx_count
(count)
The number of responses with 5xx status code (shown as count).
shown as response
nginx.upstream.peers.responses.total
(gauge)
The total number of responses obtained from this server.
shown as response
nginx.upstream.peers.responses.total_count
(count)
The total number of responses obtained from this server (shown as count).
shown as response
nginx.upstream.peers.selected
(gauge)
The time (since Epoch) when the server was last selected to process a request (1.7.5).
shown as millisecond
nginx.upstream.peers.sent
(gauge)
The total amount of data sent to this server.
shown as byte
nginx.upstream.peers.sent_count
(count)
The total amount of data sent to this server (shown as count).
shown as byte
nginx.upstream.peers.unavail
(gauge)
How many times the server became unavailable for client requests (state “unavail”) due to the number of unsuccessful attempts reaching the max_fails threshold.
nginx.upstream.peers.unavail_count
(count)
How many times the server became unavailable for client requests (state “unavail”) due to the number of unsuccessful attempts reaching the max_fails threshold (shown as count).
nginx.upstream.peers.weight
(gauge)
Weight of the server.
nginx.version
(gauge)
Version of nginx.

Not all metrics shown are available to users of open source NGINX. Compare the module reference for stub status (open source NGINX) and http status (NGINX Plus) to understand which metrics are provided by each module.

A few open-source NGINX metrics are named differently in NGINX Plus; they refer to the exact same metric, though:

NGINX NGINX PLUS
nginx.net.connections nginx.connections.active
nginx.net.conn_opened_per_s nginx.connections.accepted
nginx.net.conn_dropped_per_s nginx.connections.dropped
nginx.net.request_per_s nginx.requests.total

These metrics don’t refer exactly to the same metric, but they are somewhat related:

NGINX NGINX PLUS
nginx.net.waiting nginx.connections.idle

Finally, these metrics have no good equivalent:

nginx.net.reading The current number of connections where nginx is reading the request header.
nginx.net.writing The current number of connections where nginx is writing the response back to the client.

Events

The NGINX check does not include any events at this time.

Service Checks

nginx.can_connect:

Returns CRITICAL if the Agent cannot connect to NGINX to collect metrics, otherwise OK.

Troubleshooting

You may observe one of these common problems in the output of the Datadog Agent’s info subcommand.

Agent cannot connect

  Checks
  ======

    nginx
    -----
      - instance #0 [ERROR]: "('Connection aborted.', error(111, 'Connection refused'))"
      - Collected 0 metrics, 0 events & 1 service check

Either NGINX’s local status endpoint is not running, or the Agent is not configured with correct connection information for it.

Check that the main nginx.conf includes a line like the following:

http {

  ...

  include <directory_that_contains_status.conf>/*.conf;
  # e.g.: include /etc/nginx/conf.d/*.conf;
}

Otherwise, review the Configuration section.
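Before adjusting the Agent, it can also help to confirm that the status endpoint itself responds. A quick check from the NGINX host (the ports and paths below are only examples; use whatever your own status.conf actually defines):

$ curl http://localhost:81/nginx_status     # stub_status URL for open source NGINX (example values)
$ curl http://localhost:8080/status         # status module URL for NGINX Plus (example values)

If curl is also refused, the status server block is not loaded; re-check the include line above and reload NGINX before re-running the Agent's info subcommand.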

Part-3 Ends Here

ELK (Elasticsearch, Logstash, Kibana)


Reference links:

Note: This tutorial was performed on an Ubuntu machine

https://www.digitalocean.com/community/tutorials/how-to-install-elasticsearch-logstash-and-kibana-elk-stack-on-ubuntu-14-04
https://logz.io/blog/10-elasticsearch-concepts/
https://www.elastic.co/guide/en/elasticsearch/reference/current/_basic_concepts.html#_near_realtime_nrt
https://qbox.io/blog/welcome-to-the-elk-stack-elasticsearch-logstash-kibana

for centos 7 use following link:

https://www.digitalocean.com/community/tutorials/how-to-install-elasticsearch-logstash-and-kibana-elk-stack-on-centos-7

Why do we use ELK?

The Elastic Stack (aka ELK) is a robust solution for search, log management, and data analysis. ELK consists of a combination of three open source projects: Elasticsearch, Logstash, and Kibana. These projects have specific roles in ELK:

  • Elasticsearch handles storage and provides a RESTful search and analytics endpoint.
  • Logstash is a server-side data processing pipeline that ingests, transforms and loads data.
  • Kibana lets you visualize your Elasticsearch data and navigate the Elastic Stack.

 

1. Elasticsearch -The Amazing Log Search Tool:

  • Real-time data extraction, and real-time data analytics. Elasticsearch is the engine that gives you both the power and the speed.

2.Logstash — Routing Your Log Data:

  •  Logstash is a tool for log data intake, processing, and output. This includes virtually any type of log that you manage: system logs, webserver logs, error logs, and app logs.
  • As administrators, we know how much time can be spent normalizing data from disparate data sources.
    We know, for example, how widely Apache logs differ from NGINX logs.

3.Kibana — Visualizing Your Log Data:

  • Kibana is your log-data dashboard.
  • Get a better grip on your large data stores with point-and-click pie charts, bar graphs, trendlines, maps and scatter plots.
  • You can visualize trends and patterns for data that would otherwise be extremely tedious to read and interpret.

Benefits:

  1. Real-time data and real-time analytics:
  • The ELK stack gives you the power of real-time data insights, with the ability to perform super-fast data extractions from virtually all structured or unstructured data sources.
  • Real-time extraction, and real-time analytics. Elasticsearch is the engine that gives you both the power and the speed.

2. Scalable, high-availability, multi-tenant:

  • With Elasticsearch, you can start small and expand it along with your business growth, when you are ready.
  • It is built to scale horizontally out of the box. As you need more capacity, simply add another node and let the cluster reorganize itself to accommodate and exploit the extra hardware.
  • Elasticsearch clusters are resilient, since they automatically detect and remove node failures.

You can set up multiple indices and query each of them independently or in combination.

Some important concepts in ELK are as follows:

1. Documents:

  • Documents are JSON objects that are stored within an Elasticsearch index and are considered the base unit of storage.
  • In the world of relational databases, documents can be compared to a row in a table. Data in documents is defined with fields comprised of keys and values.
  • A key is the name of the field, and a value can be an item of many different types such as a string, a number, a boolean expression, another object, or an array of values.
  • Documents also contain reserved fields that constitute the document metadata such as:

1.  _index – the index where the document resides
2. _type – the type that the document represents
3.  _id – the unique identifier for the document
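For illustration, a document can be indexed with a simple HTTP PUT request (the index name, type, and fields here are arbitrary examples; Elasticsearch is assumed to be listening on localhost:9200):

$ curl -XPUT localhost:9200/myindex/mytype/1 -d '{"name": "example", "count": 1}'

The response echoes back the reserved metadata fields (_index, _type, and _id) for the stored document.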

2.Index:

  • Indices are the largest unit of data in Elasticsearch, are logical partitions of documents and can be compared to a database in the world of relational databases.
  • You can have as many indices defined in Elasticsearch as you want.
  • These in turn will hold documents that are unique to each index.
  • Indices are identified by lowercase names, which are used when performing actions (such as searching and deleting) against the documents inside each index.
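To see which indices exist on a node, you can query the cat API (assuming Elasticsearch on localhost:9200):

$ curl -XGET 'localhost:9200/_cat/indices?v'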

3.Shards:

  • Elasticsearch provides the ability to subdivide your index into multiple pieces called shards.
  • When you create an index, you can simply define the number of shards that you want.
  • Each shard is in itself a fully functional and independent Lucene “index” that can be hosted on any node in the cluster.


Sharding is important for two primary reasons:

  • It allows you to horizontally split/scale your content volume.
  • It allows you to distribute and parallelize operations across shards (potentially on multiple nodes) thus increasing performance/throughput.

Example:

curl -XPUT localhost:9200/example -d '{
  "settings" : {
    "index" : {
      "number_of_shards" : 2,
      "number_of_replicas" : 1
    }
  }
}'

4.Replicas:

  • Replicas, as the name implies, are Elasticsearch fail-safe mechanisms and are basically copies of your index’s shards.
  • This is a useful backup system for a rainy day — or, in other words, when a node crashes.
  • Replicas also serve read requests, so adding replicas can help to increase search performance.

To ensure high availability, replicas are not placed on the same node as the original shards (called the “primary” shard) from which they were replicated.


Replication is important for two primary reasons:

  • It provides high availability in case a shard/node fails. For this reason,
    it is important to note that a replica shard is never allocated on the same node as the original/primary shard that it was copied from.
  • It allows you to scale out your search volume/throughput since searches can be executed on all replicas in parallel.
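As an illustration, the number of replicas for an existing index can be changed dynamically through the settings API (using the example index created in the Shards section above; adjust the name for your setup):

$ curl -XPUT 'localhost:9200/example/_settings' -d '{"index": {"number_of_replicas": 2}}'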

5.Analyzers:

Analyzers are used during indexing to break down phrases or expressions into terms.
Defined within an index, an analyzer consists of a single tokenizer and any number of token filters.
For example, a tokenizer could split a string into specifically defined terms when encountering a specific expression.

A token filter is used to filter or modify some tokens. For example, an ASCII folding filter will convert characters like ê, é, and è to e.

Example:

curl -XPUT localhost:9200/example -d '{
  "mappings": {
    "mytype": {
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "whitespace"
        }
      }
    }
  }
}'

6.Nodes:

The heart of any ELK setup is the Elasticsearch instance, which has the crucial task of storing and indexing data.

In a cluster, different responsibilities are assigned to the various node types:
1. Data nodes — store data and execute data-related operations such as search and aggregation.
2. Master nodes — in charge of cluster-wide management and configuration actions such as adding and removing nodes.
3. Client nodes — forward cluster requests to the master node and data-related requests to data nodes.
4. Tribe nodes — act as a client node, performing read and write operations against all of the nodes in the cluster.
5. Ingestion nodes (new in Elasticsearch 5.0) — for pre-processing documents before indexing.

By default, each node is automatically assigned a unique identifier, or name, that is used for management purposes and becomes even more important in a multi-node, or clustered, environment.

When installed, a single node will form a new single-node cluster named “elasticsearch,” but it can also be configured to join an existing cluster (see below) using the cluster name.

In a development or testing environment, you can set up multiple nodes on a single server.
In production, however, due to the number of resources that an Elasticsearch node consumes,
it is recommended to have each Elasticsearch instance run on a separate server.
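To see the nodes in your cluster and the roles they hold, you can use the cat API (assuming Elasticsearch on localhost:9200):

$ curl -XGET 'localhost:9200/_cat/nodes?v'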

7.Cluster:

An Elasticsearch cluster is comprised of one or more Elasticsearch nodes.
As with nodes, each cluster has a unique identifier that must be used by any node attempting to join the cluster.
By default, the cluster name is “elasticsearch,” but this name can be changed, of course.

One node in the cluster is the “master” node, which is in charge of cluster-wide management and configurations actions (such as adding and removing nodes).

This node is chosen automatically by the cluster, but it can be changed if it fails. (See above on the other types of nodes in a cluster.)

For example, the cluster health API returns health status reports of either “green” (all shards are allocated), “yellow” (the primary shard is allocated but replicas are not), or “red” (the shard is not allocated in the cluster).
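You can call the health API with curl (assuming Elasticsearch on localhost:9200); the output shown below is a typical response:

$ curl -XGET 'localhost:9200/_cluster/health?pretty'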

# Output Example
{
  "cluster_name" : "elasticsearch",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 5,
  "active_shards" : 5,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 5,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 50.0
}

ELK Installation:

Our Goal:

The goal of the tutorial is to set up Logstash to gather syslogs of multiple servers, and set up Kibana to visualize the gathered logs.

Our ELK stack setup has four main components:

  • Logstash: The server component of Logstash that processes incoming logs
  • Elasticsearch: Stores all of the logs
  • Kibana: Web interface for searching and visualizing logs, which will be proxied through Nginx
  • Filebeat: Installed on the client servers that will send their logs to Logstash; Filebeat serves as a log-shipping agent that uses the lumberjack networking protocol to communicate with Logstash

ELK_Architecture.png

NOTE:

We will install the first three components on a single server, which we will refer to as our ELK Server. Filebeat will be installed on all of the client servers that we want to gather logs for, which we will refer to collectively as our Client Servers.

Pre-requisites:

The amount of CPU, RAM, and storage that your ELK Server will require depends on the volume of logs that you intend to gather. For this tutorial, we will be using a VPS with the following specs for our ELK Server:

  • OS: Ubuntu 14.04
  • RAM: 4GB
  • CPU: 2

In addition to your ELK Server, you will want to have a few other servers that you will gather logs from.

Let’s get started on setting up our ELK Server!

Step-1 : Install Java8

Elasticsearch and Logstash require Java, so we will install that now. We will install a recent version of Oracle Java 8 because that is what Elasticsearch recommends. It should, however, work fine with OpenJDK, if you decide to go that route.

Add the Oracle Java PPA to apt:

$ sudo add-apt-repository -y ppa:webupd8team/java

Update your apt package database:

$ sudo apt-get update -y

Install the latest stable version of Oracle Java 8 with this command (and accept the license agreement that pops up):

$ sudo apt-get -y install oracle-java8-installer

Now that Java 8 is installed, let's install Elasticsearch.

Step-2: Install ElasticSearch

Elasticsearch can be installed with a package manager by adding Elastic’s package source list.

Run the following command to import the Elasticsearch public GPG key into apt:

$ wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

If your prompt just hangs there, it is probably waiting for your user's password (to authorize the sudo command). If this is the case, enter your password.

Create the Elasticsearch source list:

$ echo "deb http://packages.elastic.co/elasticsearch/2.x/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch-2.x.list

Update your apt package database:

$ sudo apt-get update -y

Install Elasticsearch with this command:

$ sudo apt-get -y install elasticsearch

Elasticsearch is now installed. Let’s edit the configuration:

$ sudo vi /etc/elasticsearch/elasticsearch.yml

You will want to restrict outside access to your Elasticsearch instance (port 9200), so outsiders can’t read your data or shutdown your Elasticsearch cluster through the HTTP API. Find the line that specifies network.host, uncomment it, and replace its value with “localhost” so it looks like this:

elasticsearch.yml excerpt (updated)
network.host: localhost

Save and exit elasticsearch.yml.

Now start Elasticsearch:

$ sudo service elasticsearch restart

Then run the following command to start Elasticsearch on boot up:

$ sudo update-rc.d elasticsearch defaults 95 10

Now that Elasticsearch is up and running, let’s install Kibana.

Step-3:Install Kibana

Kibana can be installed with a package manager by adding Elastic’s package source list.

Create the Kibana source list:

$ echo "deb http://packages.elastic.co/kibana/4.5/debian stable main" | sudo tee -a /etc/apt/sources.list.d/kibana-4.5.x.list

Update your apt package database:

  • sudo apt-get update -y

Install Kibana with this command:

  • sudo apt-get -y install kibana

Kibana is now installed.

Open the Kibana configuration file for editing:

$ sudo vi /opt/kibana/config/kibana.yml

In the Kibana configuration file, find the line that specifies server.host, and replace the IP address (“0.0.0.0” by default) with “localhost”:

server.host: "localhost"

Save and exit. This setting makes it so Kibana will only be accessible to the localhost. This is fine because we will use an Nginx reverse proxy to allow external access.

Now enable the Kibana service, and start it:

  • sudo update-rc.d kibana defaults 96 9
  • sudo service kibana start

Before we can use the Kibana web interface, we have to set up a reverse proxy. Let’s do that now, with Nginx.

Step-4:Install Nginx

Because we configured Kibana to listen on localhost, we must set up a reverse proxy to allow external access to it. We will use Nginx for this purpose.

Note: If you already have an Nginx instance that you want to use, feel free to use that instead. Just make sure to configure Kibana so it is reachable by your Nginx server (you probably want to change the host value, in /opt/kibana/config/kibana.yml, to your Kibana server's private IP address or hostname). Also, it is recommended that you enable SSL/TLS.

Use apt to install Nginx and Apache2-utils:

$ sudo apt-get install nginx apache2-utils -y

Use htpasswd to create an admin user, called “kibanaadmin” (you should use another name), that can access the Kibana web interface:

$ sudo htpasswd -c /etc/nginx/htpasswd.users kibanaadmin

Enter a password at the prompt. Remember this login, as you will need it to access the Kibana web interface.

Now open the Nginx default server block in your favorite editor. We will use vi:

$ sudo vim /etc/nginx/sites-available/default

Delete the file’s contents, and paste the following code block into the file. Be sure to update the server_name to match your server’s name:

server {
    listen 80;

    server_name example.com;

    auth_basic "Restricted Access";
    auth_basic_user_file /etc/nginx/htpasswd.users;

    location / {
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}

The Nginx configuration looks like this:

nginx.png

 

Save and exit. This configures Nginx to direct your server’s HTTP traffic to the Kibana application, which is listening on localhost:5601. Also, Nginx will use the htpasswd.users file, that we created earlier, and require basic authentication.

Now restart Nginx to put our changes into effect:

$ sudo service nginx restart

Kibana is now accessible via your FQDN or the public IP address of your ELK Server i.e. http://elk-server-public-ip/. If you go there in a web browser, after entering the “kibanaadmin” credentials, you should see a Kibana welcome page which will ask you to configure an index pattern. Let’s get back to that later, after we install all of the other components.

Step-5: Install Logstash

The Logstash package is available from the same repository as Elasticsearch, and we already installed that public key, so let’s create the Logstash source list:

$ echo 'deb http://packages.elastic.co/logstash/2.2/debian stable main' | sudo tee /etc/apt/sources.list.d/logstash-2.2.x.list

Update your apt package database:

$ sudo apt-get update -y

Install Logstash with this command:

$ sudo apt-get install logstash -y

Logstash is installed but it is not configured yet.

Since we are going to use Filebeat to ship logs from our Client Servers to our ELK Server, we need to create an SSL certificate and key pair. The certificate is used by Filebeat to verify the identity of ELK Server. Create the directories that will store the certificate and private key with the following commands:

  • sudo mkdir -p /etc/pki/tls/certs
  • sudo mkdir /etc/pki/tls/private

Now you have two options for generating your SSL certificates. If you have a DNS setup that will allow your client servers to resolve the IP address of the ELK Server, use Option 2. Otherwise, Option 1 will allow you to use IP addresses.

Option 1:IP Address

If you don’t have a DNS setup—that would allow your servers, that you will gather logs from, to resolve the IP address of your ELK Server—you will have to add your ELK Server’s private IP address to the subjectAltName (SAN) field of the SSL certificate that we are about to generate. To do so, open the OpenSSL configuration file:

$ sudo vim /etc/ssl/openssl.cnf

Find the [ v3_ca ] section in the file, and add this line under it (substituting in the ELK Server’s private IP address):

subjectAltName = IP: ELK_server_private_IP

Save and exit.

Now generate the SSL certificate and private key in the appropriate locations (/etc/pki/tls/), with the following commands:

  • cd /etc/pki/tls
  • sudo openssl req -config /etc/ssl/openssl.cnf -x509 -days 3650 -batch -nodes -newkey rsa:2048 -keyout private/logstash-forwarder.key -out certs/logstash-forwarder.crt

The logstash-forwarder.crt file will be copied to all of the servers that will send logs to Logstash but we will do that a little later. Let’s complete our Logstash configuration. If you went with this option, skip option 2 and move on to Configure Logstash.

Option 2: FQDN(DNS)

If you have a DNS setup with your private networking, you should create an A record that contains the ELK Server’s private IP address—this domain name will be used in the next command, to generate the SSL certificate. Alternatively, you can use a record that points to the server’s public IP address. Just be sure that your servers (the ones that you will be gathering logs from) will be able to resolve the domain name to your ELK Server.

Now generate the SSL certificate and private key, in the appropriate locations (/etc/pki/tls/…), with the following command (substitute in the FQDN of the ELK Server):

$ cd /etc/pki/tls; sudo openssl req -subj '/CN=ELK_server_fqdn/' -x509 -days 3650 -batch -nodes -newkey rsa:2048 -keyout private/logstash-forwarder.key -out certs/logstash-forwarder.crt

The logstash-forwarder.crt file will be copied to all of the servers that will send logs to Logstash but we will do that a little later. Let’s complete our Logstash configuration.

Configure Logstash

Logstash configuration files use a JSON-like format and reside in /etc/logstash/conf.d. The configuration consists of three sections: inputs, filters, and outputs.

Let’s create a configuration file called 02-beats-input.conf and set up our “filebeat” input:

$ sudo vi /etc/logstash/conf.d/02-beats-input.conf

Insert the following input configuration:

input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
}

Or

02-beats-input.conf file content looks like:

logstash1.png

Save and quit. This specifies a beats input that will listen on tcp port 5044, and it will use the SSL certificate and private key that we created earlier.

Now let’s create a configuration file called 10-syslog-filter.conf, where we will add a filter for syslog messages:

$ sudo vi /etc/logstash/conf.d/10-syslog-filter.conf

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

Or 10-syslog-filter.conf file content looks like this:

logstash2

Save and quit. This filter looks for logs that are labeled as “syslog” type (by Filebeat), and it will try to use grok to parse incoming syslog logs to make it structured and query-able.

Lastly, we will create a configuration file called 30-elasticsearch-output.conf:

$ sudo vim /etc/logstash/conf.d/30-elasticsearch-output.conf

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

Or 30-elasticsearch-output.conf file content looks like:

logstash3.png

Save and exit. This output basically configures Logstash to store the beats data in Elasticsearch which is running at localhost:9200, in an index named after the beat used (filebeat, in our case).

If you want to add filters for other applications that use the Filebeat input, be sure to name the files so they sort between the input and the output configuration (i.e. between 02- and 30-).

Test your Logstash configuration with this command:

  • $ sudo service logstash configtest

It should display Configuration OK if there are no syntax errors. Otherwise, try and read the error output to see what’s wrong with your Logstash configuration.

Restart Logstash, and enable it, to put our configuration changes into effect:

  • sudo service logstash restart
  • sudo update-rc.d logstash defaults 96 9

Next, we’ll load the sample Kibana dashboards.

Loading Sample Kibana Dashboards:

Elastic provides several sample Kibana dashboards and Beats index patterns that can help you get started with Kibana. Although we won’t use the dashboards in this tutorial, we’ll load them anyway so we can use the Filebeat index pattern that it includes.

First, download the sample dashboards archive to your home directory:
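The download command itself is not shown above; a typical command looks like the following (the exact version in the URL may differ for your Beats release, so treat it as an example):

  • cd ~
  • curl -L -O https://download.elastic.co/beats/dashboards/beats-dashboards-1.1.0.zip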

Install the unzip package with this command:

  • sudo apt-get -y install unzip

Next, extract the contents of the archive:

  • unzip beats-dashboards-*.zip

And load the sample dashboards, visualizations and Beats index patterns into Elasticsearch with these commands:

  • cd beats-dashboards-*
  • ./load.sh

These are the index patterns that we just loaded:

  • [packetbeat-]YYYY.MM.DD
  • [topbeat-]YYYY.MM.DD
  • [filebeat-]YYYY.MM.DD
  • [winlogbeat-]YYYY.MM.DD

When we start using Kibana, we will select the Filebeat index pattern as our default.

Load Filebeat index templates in Elasticsearch

Because we are planning on using Filebeat to ship logs to Elasticsearch, we should load a Filebeat index template. The index template will configure Elasticsearch to analyze incoming Filebeat fields in an intelligent way.

First, download the Filebeat index template to your home directory:

Then load the template with this command:

$ curl -XPUT 'http://localhost:9200/_template/filebeat?pretty' -d@filebeat-index-template.json

If the template loaded properly, you should see a message like this:

Output:
{
  "acknowledged" : true
}

Now that our ELK Server is ready to receive Filebeat data, let’s move onto setting up Filebeat on each client server.

Step-6: Set Up Filebeat (Add Client Servers)

Do these steps for each Ubuntu or Debian server that you want to send logs to Logstash on your ELK Server. For instructions on installing Filebeat on Red Hat-based Linux distributions (e.g. RHEL, CentOS, etc.), refer to the Set Up Filebeat (Add Client Servers) section of the CentOS variation of this tutorial.

Copy SSL Certificate

On your ELK Server, copy the SSL certificate—created earlier—to your Client Server (substitute the client server's address, and your own login):

  • scp /etc/pki/tls/certs/logstash-forwarder.crt user@client_server_private_address:/tmp

After providing your login’s credentials, ensure that the certificate copy was successful. It is required for communication between the client servers and the ELK Server.

Now, on your Client Server, copy the ELK Server’s SSL certificate into the appropriate location (/etc/pki/tls/certs):

  • Client$ sudo mkdir -p /etc/pki/tls/certs
  • Client$ sudo cp /tmp/logstash-forwarder.crt /etc/pki/tls/certs/

Now we will install the Filebeat package.

Install Filebeat packages:

On the Client Server, create the Beats source list using the following command:

Client$ echo "deb https://packages.elastic.co/beats/apt stable main" | sudo tee -a /etc/apt/sources.list.d/beats.list

It also uses the same GPG key as Elasticsearch, which can be installed with this command:

Client$ wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

Then install the Filebeat package:

  • sudo apt-get update
  • sudo apt-get install filebeat

Filebeat is installed but it is not configured yet.

Configure filebeat

Now we will configure Filebeat to connect to Logstash on our ELK Server. This section will step you through modifying the example configuration file that comes with Filebeat. When you complete the steps, you should have a file that looks something like this.

On Client Server, create and edit Filebeat configuration file:

  • sudo vi /etc/filebeat/filebeat.yml

Note: Filebeat’s configuration file is in YAML format, which means that indentation is very important! Be sure to use the same number of spaces that are indicated in these instructions.

Near the top of the file, you will see the prospectors section, which is where you can define prospectors that specify which log files should be shipped and how they should be handled. Each prospector is indicated by the - character.

We’ll modify the existing prospector to send syslog and auth.log to Logstash. Under paths, comment out the - /var/log/*.log file. This will prevent Filebeat from sending every .log in that directory to Logstash. Then add new entries for syslog and auth.log. It should look something like this when you’re done:

...
      paths:
        - /var/log/auth.log
        - /var/log/syslog
#        - /var/log/*.log
...

Then find the line that specifies document_type:, uncomment it and change its value to “syslog”. It should look like this after the modification:

...
      document_type: syslog
...

This specifies that the logs in this prospector are of type syslog (which is the type that our Logstash filter is looking for).

If you want to send other files to your ELK server, or make any changes to how Filebeat handles your logs, feel free to modify or add prospector entries.

Next, under the output section, find the line that says elasticsearch:, which indicates the Elasticsearch output section (which we are not going to use). Delete or comment out the entire Elasticsearch output section (up to the line that says #logstash:).

Find the commented-out Logstash output section, indicated by the line that says #logstash:, and uncomment it by deleting the preceding #. In this section, uncomment the hosts: ["localhost:5044"] line. Change localhost to the private IP address (or hostname, if you went with that option) of your ELK server:

 ### Logstash as output
  logstash:
    # The Logstash hosts
    hosts: ["ELK_server_private_IP:5044"]

This configures Filebeat to connect to Logstash on your ELK Server at port 5044 (the port that we specified a Logstash input for earlier).

Directly under the hosts entry, and with the same indentation, add this line in filebeat.yml file:

bulk_max_size: 1024

Next, find the tls section, and uncomment it. Then uncomment the line that specifies certificate_authorities, and change its value to ["/etc/pki/tls/certs/logstash-forwarder.crt"]. It should look something like this:

...
    tls:
      # List of root certificates for HTTPS server verifications
      certificate_authorities: ["/etc/pki/tls/certs/logstash-forwarder.crt"]

This configures Filebeat to use the SSL certificate that we created on the ELK Server.

Save and quit.

Now restart Filebeat to put our changes into place:

  • sudo service filebeat restart
  • sudo update-rc.d filebeat defaults 95 10

Again, if you’re not sure if your Filebeat configuration is correct, compare it against this example Filebeat configuration.

Now Filebeat is sending syslog and auth.log to Logstash on your ELK server! Repeat this section for all of the other servers that you wish to gather logs for.

Test the filebeat installation:

If your ELK stack is setup properly, Filebeat (on your client server) should be shipping your logs to Logstash on your ELK server. Logstash should be loading the Filebeat data into Elasticsearch in a date-stamped index, filebeat-YYYY.MM.DD.

On your ELK Server, verify that Elasticsearch is indeed receiving the data by querying for the Filebeat index with this command:
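For example, a search against the Filebeat indices like the following (assuming Elasticsearch on localhost:9200, as configured earlier):

$ curl -XGET 'http://localhost:9200/filebeat-*/_search?pretty'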

You should see a bunch of output that looks like this:

Sample Output:
...
{
      "_index" : "filebeat-2016.01.29",
      "_type" : "log",
      "_id" : "AVKO98yuaHvsHQLa53HE",
      "_score" : 1.0,
      "_source":{"message":"Feb  3 14:34:00 rails sshd[963]: Server listening on :: port 22.","@version":"1","@timestamp":"2016-01-29T19:59:09.145Z","beat":{"hostname":"topbeat-u-03","name":"topbeat-u-03"},"count":1,"fields":null,"input_type":"log","offset":70,"source":"/var/log/auth.log","type":"log","host":"topbeat-u-03"}
    }
...

If your output shows 0 total hits, Elasticsearch is not loading any logs under the index you searched for, and you should review your setup for errors. If you received the expected output, continue to the next step.

Connect to Kibana

When you are finished setting up Filebeat on all of the servers that you want to gather logs for, let’s look at Kibana, the web interface that we installed earlier.

For example, from the ELK Server itself (Kibana listens only on localhost):

http://localhost:5601/

In a web browser, go to the FQDN or public IP address of your ELK Server. After entering the “kibanaadmin” credentials, you should see a page prompting you to configure a default index pattern:

kibana1.png

Go ahead and select [filebeat-]YYYY.MM.DD from the Index Patterns menu (left side), then click the Star (Set as default index) button to set the Filebeat index as the default.

Now click the Discover link in the top navigation bar. By default, this will show you all of the log data over the last 15 minutes. You should see a histogram with log events, with log messages below:

kibana2.png

Right now, there won’t be much in there because you are only gathering syslogs from your client servers. Here, you can search and browse through your logs. You can also customize your dashboard.

Try the following things:

  • Search for “root” to see if anyone is trying to log into your servers as root
  • Search for a particular hostname (search for host: "hostname")
  • Change the time frame by selecting an area on the histogram or from the menu above
  • Click on messages below the histogram to see how the data is being filtered

Kibana has many other features, such as graphing and filtering, so feel free to poke around!

Conclusion

Now that your syslogs are centralized via Elasticsearch and Logstash, and you are able to visualize them with Kibana, you should be off to a good start with centralizing all of your important logs. Remember that you can send pretty much any type of log or indexed data to Logstash, but the data becomes even more useful if it is parsed and structured with grok.

To improve your new ELK stack, you should look into gathering and filtering your other logs with Logstash, and creating Kibana dashboards. You may also want to gather system metrics by using Topbeat with your ELK stack. All of these topics are covered in the other tutorials in this series.

Good luck!

Docker

Docker Reference Links: https://www.tutorialspoint.com/docker/docker_tutorial.pdf

https://docs.docker.com/get-started/#setup

https://docs.docker.com/

For Docker networking: https://docs.docker.com/engine/userguide/networking/work-with-networks/

https://docs.docker.com/engine/swarm/networking/

https://docs.docker.com/engine/userguide/networking/default_network/container-communication/

https://docs.docker.com/engine/api/v1.25/#section/Versioning

—————————————————————————————————————————————–

 

——–Why DevOps—–

  1. Increase the speed, efficiency and quality of software delivery as well as improving staff morale and motivation.
  2. Shorter Development Cycles, Faster Innovation
  3. Removes silos (the communication barriers between teams)
  4. As the burden of manual work is removed from staff members, they can then focus on more creative work that increases their job satisfaction
  5. Reduced Deployment Failures, Rollbacks, and Time to Recover
  6. Improved Communication and Collaboration
  7. Increased Efficiency
  8. Reduced Costs and IT Headcount
  9. Faster application delivery, enhanced innovation, more stable operating environments, performance-focused employee teams
  10. Less complexity
  11. Faster resolution of problems
  12. More stable operating environments
  13. More time to innovate (rather than fix/maintain)
  14. 200 times more frequent deploys, 24 times faster recovery time, 3 times lower change failure rates, 50% less time on solving security issues, 22% less time on unplanned work and rework
  15. Responsible for delivering both new features and stability
  16. Quicker mitigation of software defects
  17. Reduced human errors
  18. Enhanced version control

 

What is Docker?

Docker is a tool designed to make it easier to create, deploy, and run applications by using containers. Containers allow a developer to package up an application with all of the parts it needs, such as libraries and other dependencies, and ship it all out as one package.

 

Why Do I need to use Docker?

Ans: Docker provides this same capability without the overhead of a virtual machine. It lets you put your environment and configuration into code and deploy it.

The same Docker configuration can also be used in a variety of environments. This decouples (separates) infrastructure requirements from the application environment.

When To Use Docker?

(Reference Link :-https://www.ctl.io/developers/blog/post/what-is-docker-and-when-to-use-it/)

  1. Use Docker as a version control system for your entire app's operating system
  2. Use Docker when you want to distribute/collaborate on your app's operating system with a team
  3. Use Docker to run your code on your laptop in the same environment as you have on your server (try the building tool)
  4. Use Docker whenever your app needs to go through multiple phases of development (dev/test/qa/prod; try Drone or Shippable, both do Docker CI/CD)
  5. Use Docker with your Chef Cookbooks and Puppet Manifests (remember, Docker doesn't do configuration management).

What Alternatives Are There to Docker?

  • The Amazon AMI Marketplace is the closest thing to the Docker Index that you will find. With AMIs, you can only run them on Amazon. With Docker, you can run the images on any Linux server that runs Docker.
  • The Warden project is an LXC manager written for Cloud Foundry without any of the social features of Docker, like sharing images with other people on the Docker Index.

How Docker Is Like Java

Java’s promise: Write Once. Run Anywhere.

Docker has the same promise. Except instead of code, you can configure your servers exactly the way you want them (pick the OS, tune the config files, install binaries, etc.) and you can be certain that your server template will run exactly the same on any host that runs a Docker server.

For example, in Java, you write some code:

************************************************************************************

class HelloWorldApp {
    public static void main(String[] args) {
        System.out.println("Hello World!");
    }
}

***********************************************************************************

Then run javac HelloWorldApp.java. The resulting HelloWorldApp.class can be run on any machine with a JVM.

In Docker, you write a Dockerfile:

**************************************************************************

FROM ubuntu:13.10

ENV DEBIAN_FRONTEND noninteractive

RUN apt-get update -qq -y && \
    apt-get install curl -qq -y && \
    apt-get clean

RUN curl -sSL https://get.rvm.io | bash -s stable --ruby=2.1.1

***************************************************************************

Then run docker build -t my/ruby . and the resulting image, my/ruby, can be run on any machine with a Docker server.

The Docker server is like a JVM for systems. It lets you get around the leaky abstraction of Virtual Machines by giving you an abstraction that runs just above virtualization (or even bare metal).

How Docker Is Like Git

Git’s promise: Tiny footprint with lightning fast performance.

Docker has the same promise. Except instead of for tracking changes in code, you can track changes in systems. Git outclasses SCM tools like Subversion, CVS, Perforce, and ClearCase with features like cheap local branching, convenient staging areas, and multiple workflows. Docker outclasses other tools with features like ultra-fast container startup times (microseconds, not minutes), convenient image building tools, and collaboration workflows.

For example, in Git you make some change and can see changes with git status:

*********************************************************************************

$ git init .

$ touch README.md

$ git add .

$ git status

On branch master

Initial commit

**********************************************************************************

Changes to be committed: (use "git rm --cached ..." to unstage)

Add the new file README.md to git…

***********************************************************************************

$ git commit -am "Adding README.md"

[master (root-commit) 78184aa] Adding README.md

1 file changed, 0 insertions(+), 0 deletions(-)

create mode 100644 README.md

************************************************************************************

Push change to git repo…

*****************************git push command*************************************

$ git push

Counting objects: 49, done.

Delta compression using up to 4 threads.

Compressing objects: 100% (39/39), done.

Writing objects: 100% (49/49), 4.29 KiB | 0 bytes/s, done.

Total 49 (delta 13), reused 0 (delta 0)

To git@github.com:my/repo.git

* [new branch] master -> master

***********************************************************************************

Branch master set up to track remote branch master from origin…

*******************git pull commannd***************************************

$ git pull

remote: Counting objects: 4, done.

remote: Compressing objects: 100% (3/3), done.

remote: Total 3 (delta 0), reused 0 (delta 0)

Unpacking objects: 100% (3/3), done.

From github.com:cardmagic/docker-ruby

f98f3ac..4578f21 master -> origin/master

Updating f98f3ac..4578f21

Fast-forward

README.md | 3 +++

1 file changed, 3 insertions(+)

create mode 100644 README.md

*************************************************************************************

Use the git whatchanged command to see what changed…

******************** git whatchanged command ************************************

$ git whatchanged

commit 78184aa2a04b4a9fefb13d534d157ef4ac7e81b9

Author: Lucas Carlson

Date: Mon Apr 21 16:46:34 2014 -0700

Adding README.md

:000000 100644 0000000… e69de29… A README.md

*************************************************************************************

In Docker, you can track changes throughout your entire system:

_________________________________________________________________________________________

$ MY_DOCKER=$(docker run -d ubuntu bash -c 'touch README.md; sleep 10000')

$ docker diff $MY_DOCKER

A /README.md

C /dev

C /dev/core

C /dev/fd

C /dev/ptmx

C /dev/stderr

C /dev/stdin

C /dev/stdout

___________________________________________________________________________________________

Docker commit command…

********************Docker commit command *****************************************

$ docker commit -m "Adding README.md" $MY_DOCKER my/ubuntu

4d46072299621b8e5409cbc5d325d5ce825f788517101fe63f5bda448c9954da

*************************************************************************************

Docker push to update repo…

************************************************************************************

$ docker push my/ubuntu

The push refers to a repository [my/ubuntu] (len: 1)

Sending image list

Pushing repository my/ubuntu (1 tags)

511136ea3c5a: Image already pushed, skipping

Image 6170bb7b0ad1 already pushed, skipping

Image 9cd978db300e already pushed, skipping

de2fdfc8f7d8: Image successfully pushed

Pushing tag for rev [de2fdfc8f7d8] on {https://registry-1.docker.io/v1/repositories/my/ubuntu/tags/latest}

*******************************************************************************

Docker pull to get image…

********* docker pull command ***********************************************

$ docker pull my/ubuntu

Pulling repository my/ubuntu

de2fdfc8f7d8: Download complete

511136ea3c5a: Download complete

6170bb7b0ad1: Download complete

9cd978db300e: Download complete

************************************************************************************

Docker history to see recent changes…

**************************** docker history command ********************************

$ docker history my/ubuntu

IMAGE CREATED CREATED BY SIZE

de2fdfc8f7d8 3 minutes ago bash -c touch README.md; sleep 10000 77 B

9cd978db300e 11 weeks ago /bin/sh -c #(nop) ADD precise.tar.xz in / 204.4 MB

6170bb7b0ad1 11 weeks ago /bin/sh -c #(nop) MAINTAINER Tianon Gravi

*************************************************************************************

These collaboration features (docker push and docker pull) are one of the most disruptive parts of Docker. The fact that any Docker image can run on any machine running Docker is amazing. But Docker push/pull is the first time developers and ops guys have ever been able to easily collaborate quickly on building infrastructure together. The developers can focus on building great applications and the ops guys can focus on building perfect service containers. The app guys can share app containers with the ops guys, and the ops guys can share MySQL, PostgreSQL, and Redis servers with the app guys.

This is the game changer with Docker. That is why Docker is changing the face of development for our generation. The Docker community is already curating and cultivating generic service containers that anyone can use as starting points. The fact that you can use these Docker containers on any system that runs the Docker server is an incredible feat of engineering.

DOCKER Contents (Basic Concepts in Docker):

A Docker image —-> is a lightweight, stand-alone, executable package that includes everything needed to run a piece of software, including the code, a runtime, libraries, environment variables, and config files.

A container ——–> is a runtime instance of an image—what the image becomes in memory when actually executed. It runs completely isolated from the host environment by default, only accessing host files and ports if configured to do so.

A Dockerfile ——–> defines what goes on in the environment inside your container. Access to resources like networking interfaces and disk drives is virtualized inside this environment, which is isolated from the rest of your system, so you have to map ports to the outside world and be specific about what files you want to “copy in” to that environment.

However, after doing that, you can expect that the build of your app defined in this Dockerfile will behave exactly the same wherever it runs.

Task —-> A single container running in a service is called a task.
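A minimal sketch of how these pieces relate on the command line (the image name here is just an example from Docker Hub):

$ docker pull nginx                  # download an image
$ docker run -d --name web nginx     # start a container, a running instance of that image
$ docker ps                          # list running containers

Each running container listed by docker ps is an isolated instance of the image it was started from.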

What is the use of Docker engine?

Ans: "Docker engine" is the part of Docker which creates and runs Docker containers. A Docker container is a live running instance of a Docker image. A Docker image is a file you have created to run a specific service or program in a particular OS.

Who uses dockers?

Docker containers are primarily used by developers and system administrators.

For developers:->

  • all focus can be placed on writing the code, rather than worrying about the environment within which it will eventually be deployed.
  • There are also a huge number of programs designed to run on docker containers that they can use for their own projects, giving them a sizeable head start.

For system admins:->

1. Docker's smaller footprint and lower overhead compared to virtual machines means the number of systems required for application deployment can often be reduced.

2. Their portability and ease of installation make the process far less laborious and enable administrators to regain the time lost installing individual components and VMs.

How quick is a Docker container, exactly?

When a virtual machine boots, it usually has to retrieve 10-20 GB of operating system data from storage. That can be a painfully slow process by today’s standards, but what about Docker containers?

Well, because they don’t have to pull anything other than themselves from the hard disk, they boot within just a fraction of a second. So, they’re quick. Very quick.

****************** Docker Architecture: ************************************

Reference Links:-https://www.tutorialspoint.com/docker/docker_tutorial.pdf

1. Traditional virtualization:

traditionalWay.png

  1. The Host OS is the base machine such as Linux or Windows.
  2. The Hypervisor is either VMware or Windows Hyper-V and is used to host virtual machines.
  3. You would then install multiple operating systems as virtual machines on top of the existing hypervisor as the Guest OS.
  4. You would then host your applications on top of each Guest OS.

The following image shows the new generation of virtualization that is enabled via Docker. Let’s have a look at the various layers.

2. Virtualization enabled via Docker:

Picture2.png

  • The server is the physical server that is used to host multiple virtual machines. This layer remains the same.
  • The Host OS is the base machine such as Linux or Windows. This layer also remains the same.
  • Now comes the new generation, which is the Docker engine. It is used to run, as Docker containers, the workloads that earlier used to be separate virtual machines.
  • All of the apps now run as Docker containers. The clear advantage of this architecture is that you don’t need extra hardware for a Guest OS. Everything works as Docker containers.

Installation of Docker On Ubuntu:-

To start the installation of Docker, we are going to use an Ubuntu instance.

You can use Oracle VirtualBox to set up a virtual Linux instance, in case you don’t have one already.

The following screenshot shows a simple Ubuntu server which has been installed on Oracle VirtualBox. There is an OS user named demo defined on the system with full root access to the server. To install Docker, we need to follow the steps given below.

Step 1: Before installing Docker, you first have to ensure that you have the right Linux kernel version running. Docker is only designed to run on Linux kernel version 3.8 and higher. We can do this by running the following command:

uname

This command returns system information about the Linux system. Syntax:

uname -a

Options: a – ensures that the full system information is returned.

Return Value: this command returns the following information about the Linux system:

  • kernel name
  • node name
  • kernel release
  • kernel version
  • machine
  • processor
  • hardware platform
  • operating system

For Example:

uname -a

Picture1.png

From the output, we can see that the Linux kernel version is 4.2.0-27 which is higher than version 3.8, so we are good to go.

Step 2:

You need to update the OS with the latest packages, which can be done via the following command:

apt-get

This command manages packages on the Linux system; with the update option it refreshes the local package index from the configured repositories.

Syntax

sudo apt-get update

Options

  • sudo – The sudo command is used to ensure that the command runs with root access.
  • update – The update option is used to ensure that the package lists on the Linux system are up to date.

Return Value: None

Output :

When we run the above command, we will get the following result:

Picture2.png

Step 3: The next step is to install the necessary certificates that will be required to work with the Docker site later on to download the necessary Docker packages. It can be done with the following command:

$ sudo apt-get install apt-transport-https ca-certificates

Picture3.png

Step 4:

The next step is to add the new GPG key. This key is used to verify the authenticity of the Docker packages you download. The following command downloads the key with the ID 58118E89F3A912897C070ADBF76221572C52609D from the keyserver hkp://ha.pool.sks-keyservers.net:80 and adds it to the adv keychain. Please note that this particular key is required to download the necessary Docker packages.
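The command itself is only shown in the screenshot; a hedged reconstruction, using the keyserver and key ID mentioned above, would be along these lines (that keyserver has since been retired, so treat it as a sketch of the original step):

$ sudo apt-key adv --keyserver hkp://ha.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D   # fetch and add the Docker signing key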

Picture4.png

Step 5: Next, depending on the version of Ubuntu you have, you will need to add the relevant site to the docker.list for the apt package manager, so that it will be able to detect the Docker packages from the Docker site and download them accordingly.

We then need to add this repository entry to docker.list, as mentioned above (the example below is for Ubuntu Trusty):

echo "deb https://apt.dockerproject.org/repo ubuntu-trusty main" | sudo tee /etc/apt/sources.list.d/docker.list

Picture5.png

Step 6: Next, we issue the apt-get update command to update the packages on the Ubuntu system.

Picture6.png

Step 7: If you want to verify that the package manager is pointing to the right repository, you can do it by issuing the apt-cache command.

apt-cache policy docker-engine

In the output, you will get the link to https://apt.dockerproject.org/repo/

Step 8: Issue the apt-get update command to ensure all the packages on the local system are up to date.

Command: apt-get update

Step 9: For Ubuntu Trusty, Wily, and Xenial, we have to install the linux-image-extra-* kernel packages, which allows one to use the aufs storage driver. This driver is used by the newer versions of Docker. It can be done by using the following command:
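The exact command is only shown in the screenshot; a hedged sketch that installs the kernel-specific extra packages named above would be:

$ sudo apt-get install -y linux-image-extra-$(uname -r) linux-image-extra-virtual   # provides the aufs storage driver modules for the running kernel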

Picture7.png

Step 10: The final step is to install Docker and we can do this with the following command:
sudo apt-get install -y docker-engine
Here, apt-get uses the install option to download the docker-engine package from the Docker repository and installs Docker.
The docker-engine package is the official package from Docker, Inc. for Ubuntu-based systems.

In the next section, we will see how to check for the version of Docker that was installed.

Docker Version:
To see the version of Docker running, you can issue the following command:

$ docker version

Options:
 version – It is used to ensure the Docker command returns the Docker version installed.
Return Value: The output will provide the various details of the Docker version installed on the system.
Output: When we run the above command, we will get the following result:

Picture9.png

Docker Info:

To see more information on the Docker running on the system, you can issue the following command:

 $docker info

Options

  • info – It is used to ensure that the Docker command returns the detailed information on the Docker service installed.

Return Value The output will provide the various details of the Docker installed on the system such as

  • Number of containers
  • Number of images
  • The storage driver used by Docker
  • The root directory used by Docker
  • The execution driver used by Docker

Picture12.png

Docker Images:

In Docker, everything is based on Images. An image is a combination of a file system and parameters. Let’s take an example of the following command in Docker.

$ docker run hello-world

  • The docker command is specific and tells the Docker program on the operating system that something needs to be done.
  • The run command is used to mention that we want to create an instance of an image, which is then called a container.
  • Finally, “hello-world” represents the image from which the container is made.

Now let’s look at how we can use the CentOS image available in Docker Hub to run CentOS on our Ubuntu machine. We can do this by executing the following command on our Ubuntu machine:

$ sudo docker run -it centos /bin/bash

  • We are using the sudo command to ensure that it runs with root access.
  • Here, centos is the name of the image we want to download from Docker Hub and run on our Ubuntu machine.
  • -it is used to mention that we want to run in interactive mode with a terminal attached.
  • /bin/bash is used to run the bash shell once CentOS is up and running.

To see the list of Docker images on the system

$ docker images

Picture13.png

DOCKER Commands—–>

  • docker build -t friendlyname .  # Create image using this directory’s Dockerfile
  • docker run -p 4000:80 friendlyname  # Run “friendlyname” mapping port 4000 to 80
  • docker run -d -p 4000:80 friendlyname         # Same thing, but in detached mode
  • docker container ls                                # List all running containers
  • docker container ls -a             # List all containers, even those not running
  • docker container stop           # Gracefully stop the specified container
  • docker container kill         # Force shutdown of the specified container
  • docker container rm        # Remove specified container from this machine
  • docker container rm $(docker container ls -a -q)         # Remove all containers
  • docker image ls -a                             # List all images on this machine
  • docker image rm            # Remove specified image from this machine
  • docker image rm $(docker image ls -a -q)   # Remove all images from this machine
  • docker login             # Log in this CLI session using your Docker credentials
  • docker tag image_name username/repository:tag  # Tag an image for upload to a registry
  • docker push username/repository:tag            # Upload tagged image to registry
  • docker run username/repository:tag                   # Run image from a registry

——-To lists the docker containers, use following command ———-

$ docker container ls

$ docker container ls -q

—–To Stop particular container, use following command——-

$ docker container stop

—–Use following command to remove all containers——-

$ docker rm $(docker ps --no-trunc -aq)

—-To Remove all docker images, Use command as——-

$ docker rmi $(docker images -q --filter "dangling=true")

——Stopping Docker-container using container name—

$ docker stop $(docker ps -a --filter="name=csm" -q)

$ docker rmi ImageID     # to remove a Docker image

$ docker images -q

where, q – It tells the Docker command to return the image IDs only

$ docker inspect Docker_image_name    # gives details of a Docker image

Docker Containers:

Containers are instances of Docker images that can be run using the Docker run command. The basic purpose of Docker is to run containers. Let’s discuss how to work with containers.

Running of containers is managed with the docker run command. To run a container in interactive mode, first launch the Docker container:

$ sudo docker run -it centos /bin/bash

After running the above command, we will be dropped into a shell inside the Docker container, at its root directory.

Listing of Containers:

One can list all of the containers on the machine via the docker ps command. This command is used to return the currently running containers.

$ docker ps

The output will show the currently running containers.

$ sudo docker ps -a

Where, “-a” tells the docker ps command to list all of the containers on the system.

Picture14.png

docker history :

With this command, you can see all the commands that were run with an image via a container.

$ docker history ImageID

Picture15.png

Docker – Working With Containers:

docker top Command :

With this command, you can see the top processes within a container.

$ docker top ContainerID

Where, ContainerID – This is the Container ID for which you want to see the top processes.

Picture16.png

$docker stop ContainerID

$docker rm ContainerID

Picture17.png

docker stats command:

This command is used to provide the statistics of a running container.

$ docker stats ContainerID

Where,

  • ContainerID – This is the Container ID for which the stats need to be provided.

Return Value – The output will show the CPU and memory utilization of the container.

$ sudo docker stats 9f215ed0b0d3

The above command will provide the CPU and memory utilization of the container 9f215ed0b0d3.

Picture18.png

docker attach command:

$ sudo docker attach 07b0b6f434fe

The above command will attach to the Docker container 07b0b6f434fe.

Output: When we run the above command, it will produce the following result:

Picture20.png

Once you have attached to the Docker container, you can run the above command to see the process utilization in that Docker container.

Picture21.png

docker pause command:

$ sudo docker pause ContainerID

$ sudo docker pause 07b0b6f434fe

The above command will pause the processes in the running container 07b0b6f434fe.

Picture22.png

docker unpause command:

$ docker unpause ContainerID

$ sudo docker unpause 07b0b6f434fe

The above command will unpause the processes in a running container: 07b0b6f434fe

Output :

When we run the above command, it will produce the following result:

Picture23.png

docker kill command:

This command is used to kill the processes in a running container.

$ docker kill ContainerID

Where,

ContainerID – This is the Container ID whose processes you need to kill.

Return Value: the ContainerID of the killed container.

$ sudo docker kill 07b0b6f434fe

The above command will kill the processes in the running container 07b0b6f434fe.

Picture25.png

Docker –Container Lifecycle:

Picture26.png

  • Initially, the Docker container will be in the created state.
  • The Docker container then goes into the running state when the docker run command is used.
  • The docker kill command is used to kill an existing Docker container.
  • The docker pause command is used to pause an existing Docker container.
  • The docker stop command is used to stop an existing Docker container.
  • The docker start command is used to put a container back from a stopped state to a running state.

Docker File:

Step 1: Create a file called Dockerfile and edit it using vim. Please note that the name of the file has to be “Dockerfile”, with a capital “D”.

$ sudo vim Dockerfile

Step 2: Write the following instructions into that file (Dockerfile):

Picture27.png
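The file contents are only visible in the screenshot; here is a minimal Dockerfile matching the points described below. The maintainer email address is a placeholder assumption:

#This is a sample Image
FROM ubuntu
MAINTAINER demousr@example.com

RUN apt-get update
RUN apt-get install -y nginx
CMD ["echo", "Image created"]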

The following points need to be noted about the above file:

  1. The first line “#This is a sample Image” is a comment. You can add comments to the Dockerfile with the help of the # character.
  2. The next line has to start with the FROM keyword. It tells Docker which base image you want to base your image on. In our example, we are creating an image from the ubuntu image.
  3. The next command names the person who is going to maintain this image. Here you specify the MAINTAINER keyword and just mention the email ID.
  4. The RUN command is used to run instructions against the image. In our case, we first update our Ubuntu system and then install the nginx server on our ubuntu image.
  5. The last command is used to display a message to the user.

Step 3: Save the file. In the next chapter, we will discuss how to build the image.

Picture28.png

Building Docker File:

Docker build command:

This method allows the users to build their own Docker images.
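The exact invocation is only shown in the screenshots; a hedged example that builds the Dockerfile in the current directory and tags the result as myimage:0.1 (the image name used later in this guide) would be:

$ sudo docker build -t myimage:0.1 .   # '.' means: use the Dockerfile in the current directory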

Picture29.png

Picture30.png

Output:

From the output, you will first see that the Ubuntu Image will be downloaded from Docker Hub, because there is no image available locally on the machine.

Picture31.png

Picture32.png

You will then see the successfully built message and the ID of the new Image. When you run the Docker images command, you would then be able to see your new image.

Picture33.png

Docker Public Repository:

  • Public repositories can be used to host Docker images which can be used by everyone else.
  • An example is the images which are available in Docker Hub. Most of the images such as Centos, Ubuntu, and Jenkins are all publicly available for all.

We can also make our own images available by publishing them to the public repository on Docker Hub. For our example, we will use the myimage repository built in the “Building Docker Files” chapter and upload that image to Docker Hub.

Let’s first review the images on our Docker host to see what we can push to the Docker registry

Picture34.png

Here, we have our myimage:0.1 image which was created as a part of the “Building Docker Files” chapter. Let’s use this to upload to the Docker public repository. The following steps explain how you can upload an image to public repository.

Docker Public Repository(Content):

1.1 Upload a Docker image to the public repository:

Step 1: Log into Docker Hub and create your repository. This is the repository where your image will be stored. Go to https://hub.docker.com/ and log in with your credentials.

Picture35.png

Step 2: Click the button “Create Repository” on the above screen and create a repository with the name demorep. Make sure that the visibility of the repository is public.

Picture36.png

Once the repository is created, make a note of the pull command which is attached to the repository.

Picture37.png

The pull command which will be used in our repository is as follows:

$docker pull demousr/demorep

Step 3: Now go back to the Docker Host. Here we need to tag our myimage to the new repository created in Docker Hub. We can do this via the Docker tag command. We will learn more about this tag command later in this chapter.

Step 4: Issue the Docker login command to login into the Docker Hub repository from the command prompt. The Docker login command will prompt you for the username and password to the Docker Hub repository.

Picture38.png

Step 5: Once the image has been tagged, it’s now time to push the image to the Docker Hub repository. We can do this via the Docker push command.

1.2  Docker tag Command:
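The screenshot below shows the tag command; a hedged equivalent, using the myimage:0.1 image and the demousr/demorep repository from this example, is:

$ sudo docker tag myimage:0.1 demousr/demorep:1.0   # tag the local image with the Docker Hub repository name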

Picture39.png

Output:

A sample output of the above example is given below.

Picture40.png

1.3 Docker push command:
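The screenshots below show the push; a hedged equivalent for the repository tagged above is:

$ sudo docker push demousr/demorep:1.0   # upload the tagged image to Docker Hub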

Picture41.png

Picture42.png

If you go back to the Docker Hub page and go to your repository, you will see the tag name in the repository.

Picture43.png

Now let’s try to pull the repository we uploaded onto our Docker host. Let’s first delete the images, myimage:0.1 and demousr/demorep:1.0, from the local Docker host. Let’s use the Docker pull command to pull the repository from the Docker Hub.

Picture44.png

From the above screenshot, you can see that the Docker pull command has taken our new repository from the Docker Hub and placed it on our machine.

Docker Managing Ports:

In Docker, the containers themselves can have applications running on ports. When you run a container, if you want to access the application in the container via a port number, you need to map the port number of the container to the port number of the Docker host. Let’s look at an example of how this can be achieved.

In our example, we are going to download the Jenkins container from Docker Hub. We are then going to map the Jenkins port number to the port number on the Docker host.

Step 1: First, you need to do a simple sign-up on Docker Hub.

Picture45.png

Picture48.png

Once you have signed up, you will be logged into Docker Hub.

Picture46.png

Step 3: Next, let’s browse and find the Jenkins image.

Picture47.png

Step 4: If you scroll down on the same page, you can see the Docker pull command. This will be used to download the Jenkins Image onto the local Ubuntu server.

Step 5: Now go to the Ubuntu server and run the command:

$sudo docker pull jenkins

Picture49.png

Step 6: To understand what ports are exposed by the container, you should use the Docker inspect command to inspect the image.

Run the following command to get low level information of the image or container in JSON format:

Example:
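A hedged example for the Jenkins image pulled above:

$ sudo docker inspect jenkins   # prints the image metadata, including the ExposedPorts section, as JSON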

Picture50.png

The output of the inspect command is in JSON format. If we observe the output, we can see that there is an “ExposedPorts” section with two ports mentioned. One is the data port 8080 and the other is the control port 50000.

To run Jenkins and map the ports, you need to change the Docker run command and add the ‘-p’ option, which specifies the port mapping.

So, you need to run the following command:

$sudo docker run -p 8080:8080 -p 50000:50000 jenkins    …equation 1

Or

$ docker run -d -p 4000:8080 friendlyhello

Where,

Port no. 4000 is the Docker host (local) port, and

Port no. 8080 is the Docker container’s port.

friendlyhello is the name of the application image.

The left-hand side of the port-number mapping is the Docker host port to map to, and the right-hand side is the Docker container port number.

So we can access our app through http://localhost:4000.

Because the container exposes its port 8080 and we bind that container port to local port 4000, we can access the application through the local port.

With equation 1, when you open the browser and navigate to the Docker host on port 8080, you will see Jenkins up and running.

Picture51.png

Docker – Private Registries:

You might have the need to have your own private repositories. You may not want to host the repositories on Docker Hub. For this, there is a repository container itself from Docker. Let’s see how we can download and use the container for registry.

Step 1: Use the Docker run command to download the private registry image and run it. This can be done using the following command:
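The command itself appears only in the screenshot; a hedged sketch that runs Docker’s own registry image and publishes it on port 5000 (the port used in the rest of this section) would be:

$ sudo docker run -d -p 5000:5000 --name registry registry:2   # the registry:2 tag is an assumption; use whichever registry image version you prefer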

Picture52.png

Step 2: Let’s do a docker ps to see that the registry container is indeed running.

Picture53.png

We have now confirmed that the registry container is indeed running.

Step 3: Now let’s tag one of our existing images so that we can push it to our local repository.

In our example, since we have the centos image available locally, we are going to tag it to our private repository and add a tag name of centos.

$sudo docker tag 67591570dd29 localhost:5000/centos

The following points need to be noted about the above command:

  • 67591570dd29 refers to the Image ID of the centos image.
  • localhost:5000 is the location of our private repository.
  • We are tagging the repository name as centos in our private repository.

Picture54.png

Step 4: Now let’s use the Docker push command to push the repository to our private repository.

$sudo docker push localhost:5000/centos

Here, we are pushing the centos image to the private repository hosted at localhost:5000.

Step 5: Now let’s delete the local images we have for centos using the docker rmi commands. We can then download the required centos image from our private repository.

$sudo docker rmi centos:latest

$sudo docker rmi 67591570dd29

Picture55.png

Step 6: Now that we don’t have any centos images on our local machine, we can now use the following Docker pull command to pull the centos image from our private repository.

$sudo docker pull localhost:5000/centos

Here, we are pulling the centos image from the private repository hosted at localhost:5000.

Picture56.png

If you now see the images on your system, you will see the centos image as well.

Docker – Building a Web Server Docker File

We have already learnt how to use a Dockerfile to build our own custom images. Now let’s see how we can build a web server image which can be used to build containers. In our example, we are going to use the Apache web server on Ubuntu to build our image. Let’s follow the steps given below to build our web server Dockerfile.

Step 1: The first step is to build our Dockerfile. Let’s use vim and create a Dockerfile with the following information.

Put the following content into the newly created Dockerfile —–>

Picture57.png
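The file contents are only shown in the screenshot; a hedged Dockerfile matching the points listed below might look like this (the exact CMD form is an assumption):

FROM ubuntu
RUN apt-get update
RUN apt-get install -y apache2
RUN apt-get install -y apache2-utils
RUN apt-get clean
EXPOSE 80
CMD ["apachectl", "-D", "FOREGROUND"]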

The following points need to be noted about the above statements:

  1. We are first creating our image to be from the Ubuntu base image.
  2. Next, we are going to use the RUN command to update all the packages on the Ubuntu system.
  3. Next, we use the RUN command to install apache2 on our image, followed by the necessary apache2 utility packages.
  4. Next, we use the RUN command to clean any unnecessary files from the system.
  5. The EXPOSE command is used to expose port 80 of Apache in the container to the Docker host.
  6. Finally, the CMD command is used to start apache2 when the container launches.

Picture59.png

Step 2: Run the Docker build command to build the Docker file. It can be done using the following command:

$ sudo docker build -t="mywebserver" .

We are tagging our image as mywebserver. Once the image is built, you will get a successful message that the file has been built.

Picture60.png

Picture61.png

Step 3: Now that the web server file has been built, it’s now time to create a container from the image. We can do this with the Docker run command.

Picture62.png
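The command in the screenshot is along these lines (hedged, using the mywebserver tag built above):

$ sudo docker run -d -p 80:80 mywebserver   # map container port 80 to host port 80 and run detached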

The following points need to be noted about the above command:

  • The port number exposed by the container is 80. Hence, with the -p option, we are mapping the same port number to port 80 on our localhost (the Docker host).
  • The -d option is used to run the container in detached mode, so that the container can run in the background.

If you go to port 80 of the Docker host in your web browser, you will now see that Apache is up and running.

Picture63.png

Docker Instruction commands:

WORKDIR command:

In a Dockerfile, the WORKDIR command is used to set the working directory of the container.

WORKDIR dirname       …….should be specified in the Dockerfile

Where,

dirname – The new working directory. If the directory does not exist, it will be created.
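For example, a hedged Dockerfile fragment using WORKDIR (the /usr/src/app path is just an illustration) would be:

FROM ubuntu
WORKDIR /usr/src/app      # created automatically if it does not exist; later RUN/CMD instructions execute here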

Docker Container Linking:

Container Linking allows multiple containers to link with each other. It is a better option than exposing ports. Let’s go step by step and learn how it works.

Step 1: Download the Jenkins image, if it is not already present, using the docker pull command.

$ sudo docker pull jenkins

Step 2: Once the image is available, run the container, but this time you can give the container a name by using the --name option. This will be our source container.
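The commands are only shown in the screenshots; a hedged sketch of the idea, with illustrative container and alias names, would be:

$ sudo docker run -d --name=jenkins-source jenkins                                     # the named source container
$ sudo docker run -it --name=receiver --link jenkins-source:alias ubuntu /bin/bash     # the receiving container, linked to the source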

Picture64.png

Picture65.png

Picture66.png

Creating a Docker Volume using Command:

$ docker volume create --name=volumename --opt options

For Example:

$ sudo docker volume create --name=demo --opt o=size=100m

Picture67.png

Listing All Docker Volumes:

Picture68.png
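A hedged example of listing volumes:

$ sudo docker volume ls   # lists all volumes, including the 'demo' volume created above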

Docker Networking:

Docker takes care of the networking aspects so that the containers can communicate with other containers and also with the Docker Host. If you do an ifconfig on the Docker Host, you will see the Docker Ethernet adapter. This adapter is created when Docker is installed on the Docker Host.

Picture69.png

This is a bridge between the Docker Host and the Linux Host. Now let’s look at some commands associated with networking in Docker.

Listing All Docker Networks:

Picture70.png
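A hedged example of listing networks:

$ sudo docker network ls   # shows the default bridge, host, and none networks plus any user-defined ones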

Inspecting a Docker network:

If you want to see more details on the network associated with Docker, you can use the Docker network inspect command.

$sudo docker network inspect bridge

Where,

‘bridge‘ is your network name (Docker’s default bridge network).

OutPut:

Picture71.png

Now let’s run a container and see what happens when we inspect the network again. Let’s spin up an Ubuntu container with the following command:

$ sudo docker run -it ubuntu:latest /bin/bash

Picture72.png

Picture73.png

Creating Your Own New Network in Docker:
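The commands appear only in the screenshots; a hedged sketch, using new_nw as an illustrative network name, would be:

$ sudo docker network create --driver bridge new_nw                    # create a user-defined bridge network
$ sudo docker run -it --network=new_nw ubuntu:latest /bin/bash         # attach a new container to that network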

Picture74.png

Picture75.png

And now when you inspect the network via the following command, you will see the container attached to the network.

Picture76.png

EXERCISE:

1.Docker – Setting Node.js Application:

Node.js is a JavaScript framework that is used for developing server-side applications. It is an open source framework that is developed to run on a variety of operating systems.

Since Node.js is a popular framework for development, Docker has also ensured it has support for Node.js applications.

We will now see the various steps for getting the Docker container for Node.js up and running.

Step 1: The first step is to pull the image from Docker Hub. When you log into Docker Hub, you will be able to search and see the image for Node.js as shown below. Just type in Node in the search box and click on the node (official) link which comes up in the search results.

Picture77.png

Step 2: You will see the Docker pull command for node in the details of the repository in Docker Hub.

Picture78.png

Step 3: On the Docker Host, use the Docker pull command as shown above to download the latest node image from Docker Hub.

Picture79.png

Step 4: On the Docker Host, let’s use the vim editor and create a Node.js example file.

In this file, we will add a simple command to display “HelloWorld” at the command prompt.
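A minimal HelloWorld.js along the lines described would contain a single statement:

console.log("HelloWorld");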

Picture80.png

Step 5: To run our Node.js script using the Node Docker container, we need to execute the following statement:

demo@ubuntudemo:$ cd /usr/src/

demo@ubuntudemo src:$ mkdir app

demo@ubuntudemo:$ cd app

demo@ubuntudemo app:$ sudo docker run -it --rm --name=HelloWorld -v "$PWD":/usr/src/app -w /usr/src/app node node HelloWorld.js

Or

Picture81.png

The following points need to be noted about the above command:

  • The --rm option is used to remove the container after it has run.
  • We are giving the container a name, “HelloWorld”.
  • We are mapping a volume in the container, /usr/src/app, to our current working directory. This is done so that the node container will pick up our HelloWorld.js script, which is present in our working directory on the Docker Host.
  • The -w option is used to specify the working directory used by Node.js.
  • The first node option specifies the node image to run.
  • The second node option runs the node command inside the node container.
  • And finally, we mention the name of our script.

We will then get the following output. And from the output, we can clearly see that the Node container ran as a container and executed the HelloWorld.js script.

Picture82.png

Or

demo@ubuntudemo:$ sudo docker run -it --rm --name=HelloWorld -v /usr/src/app/HelloWorld.js:/usr/src/app -w /usr/src/app node node HelloWorld.js

2.Docker – Setting MongoDB:

MongoDB is a famous document-oriented database that is used by many modern-day web applications.

Since MongoDB is a popular database for development, Docker has also ensured it has support for MongoDB. We will now see the various steps for getting the Docker container for MongoDB up and running.

Step 1: The first step is to pull the image from Docker Hub. When you log into Docker Hub, you will be able to search and see the image for Mongo as shown below. Just type in Mongo in the search box and click on the Mongo (official) link which comes up in the search results.

Picture83.png

Step 2: You will see the Docker pull command for Mongo in the details of the repository in Docker Hub.

Picture84.png

Step 3: On the Docker Host, use the Docker pull command as shown above to download the latest Mongo image from Docker Hub.

Picture85.png

Step 4: Now that we have the image for Mongo, let’s first run a MongoDB container which will be our instance for MongoDB.

For this, we will issue the following command:
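The exact command is only in the screenshot; a hedged sketch that starts a detached MongoDB server container (the container name is an illustrative assumption) would be:

$ sudo docker run -d --name=mongodb-server mongo   # run the official mongo image in the background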

Picture86.png

You can then issue the docker ps command to see the running containers:

Picture87.png

Picture88.png

Step 5: Now let’s spin up another container which will act as our client which will be used to connect to the MongoDB database. Let’s issue the following command for this:
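A hedged sketch, linking to the server container started above (the names are illustrative assumptions):

$ sudo docker run -it --name=mongodb-client --link mongodb-server:mongo ubuntu /bin/bash   # the link makes the server's address available via environment variables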

Picture89.png

You will now be in the new container.

Step 6: Run the env command in the new container to see the details of how to connect to the MongoDB server container.

Picture90.png

Step 7: Now it’s time to connect to the MongoDB server from the client container. We can do this via the following command:

Picture91.png

Picture92.png

You can then run any MongoDB command in the command prompt. In our example, we are running the following command:

Picture93.png

Now you have successfully created a client and server MongoDB container.

3.Docker – Setting NGINX

NGINX is a popular, lightweight, open-source web server that is also widely used as a reverse proxy in front of server-side applications. It is developed to run on a variety of operating systems.

Since nginx is a popular web server for development, Docker has ensured that it has support for nginx.

We will now see the various steps for getting the Docker container for nginx up and running.

Step 1: The first step is to pull the image from Docker Hub. When you log into Docker Hub, you will be able to search and see the image for nginx as shown below.

Just type in nginx in the search box and click on the nginx (official) link which comes up in the search results.

Picture94.png

Step 2: You will see the Docker pull command for nginx in the details of the repository in Docker Hub.

Picture95.png

Step 3: On the Docker Host, use the Docker pull command as shown above to download the latest nginx image from Docker Hub.

Picture96.png

Step 4: Now let’s run the nginx container via the following command:
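The command appears only in the screenshot; a hedged equivalent that matches the URL used below (host port 8080) would be:

$ sudo docker run -d -p 8080:80 --name=mynginx nginx   # map container port 80 to host port 8080; the container name is an assumption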

Picture97.png

Once you run the command, you will get the following output if you browse to the URL http://dockerhost:8080. This shows that the nginx container is up and running.

Picture98.png

Step 5: Let’s look at another example where we can host a simple web page in our nginx container. In our example, we will create a simple HelloWorld.html file and host it in our nginx container.

Let’s first create an HTML file called HelloWorld.html.

Picture99.png
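The HTML file and the run command are only shown in the screenshots; a hedged sketch that creates a one-line page and serves it from the current directory would be:

$ echo "<h1>HelloWorld from the nginx container</h1>" > HelloWorld.html     # illustrative page content
$ sudo docker run -d -p 8080:80 -v "$PWD":/usr/share/nginx/html nginx       # mount the current directory as the nginx web root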

The following points need to be noted about the above command:

  • We are mapping the port exposed by the nginx server, port 80, to port 8080 on the Docker host.
  • Next, we are attaching the volume in the container, /usr/share/nginx/html, to our present working directory. This is where our HelloWorld.html file is stored.

Picture100.png

Picture101.png

4.Docker – Docker Cloud:

The Docker Cloud is a service provided by Docker in which you can carry out the following operations:

  • Nodes ─ You can connect Docker Cloud to your existing cloud providers such as Azure and AWS to spin up containers on these environments.
  • Cloud Repository ─ Provides a place where you can store your own repositories.
  • Continuous Integration ─ Connect with GitHub and build a continuous integration pipeline.
  • Application Deployment ─ Deploy and scale infrastructure and containers.
  • Continuous Deployment ─ Can automate deployments.

You can go to the following link to get started with Docker Cloud: https://cloud.docker.com/

Picture102.png

Once logged in, you will be provided with the following basic interface:

Picture103.png

Connecting to the Cloud Provider

The first step is to connect to an existing cloud provider. The following steps will show you how to connect with an Amazon Cloud provider.

Step 1: The first step is to ensure that you have the right AWS keys. This can be taken from the aws console. Log into your aws account using the following link – https://aws.amazon.com/console/

Picture104.png

Step 2: Once logged in, go to the Security Credentials section. Make a note of the access keys which will be used from Docker Hub.

Picture105.png

Step 3: Next, you need to create a policy in AWS that will allow Docker to view EC2 instances. Go to the Policies section in the IAM console. Click the Create Policy button.

Picture106.png

Step 4: Click on ‘Create Your Own Policy’ and give the policy name as dockercloudpolicy and the policy definition as shown below.

Picture107.png

Picture108.png

Next, click the Create Policy button.

Step 5: Next, you need to create a role which will be used by Docker to spin up nodes on AWS.

For this, go to the Roles section in AWS and click the Create New Role option.

Picture109.png

Picture110.png

Step 7: On the next screen, go to ‘Role for Cross Account Access’ and select “Provide access between your account and a 3rd party AWS account”

Picture111.png

Picture112.png

Picture113.png

Picture114.png

Picture115.png

Picture116.png

Picture117.png

Setting Up Nodes:

Once the integration with AWS is complete, the next step is to set up a node.

Go to the Nodes section in Docker Cloud. Note that setting up a node will automatically set up a node cluster first.

Step 1: Go to the Nodes section in Docker Cloud.

Picture118.png

Picture119.png

Picture120.png

Picture121.png

Step 2: Choose the Service which is required. In our case, let’s choose mongo.

Picture122.png

Step 3: On the next screen, choose the Create & Deploy option. This will start deploying the Mongo container on your node cluster.

Picture123.png

Once deployed, you will be able to see the container in a running state.

Picture124.png

5.Docker – Docker Compose:

Docker Compose is used to run multiple containers as a single service.

For example, suppose you had an application which required NGINX and MySQL; you could create one file which would start both containers as a service, without the need to start each one separately.

5.1 Docker Compose ─Installation:

Step 1: Download the necessary files from GitHub using the following command:
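The exact commands are only in the screenshots; a hedged sketch (the release version in the URL is an assumption, so substitute the one you want) would be:

$ sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
$ sudo chmod +x /usr/local/bin/docker-compose
$ docker-compose --version   # confirms the installed compose version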

Picture125.png

We can then use the following command to see the compose version.

Picture126.png

5.2 Creating Your First Docker-Compose File:

Now let’s go ahead and create our first Docker Compose file. All Docker Compose files are YAML files. You can create one using the vim editor. So execute the following command to create the compose file:
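A hedged example of such a file, using the NGINX-plus-MySQL scenario mentioned above (service names, ports, and the root password are illustrative assumptions):

$ sudo vim docker-compose.yml

version: "3"
services:
  web:
    image: nginx
    ports:
      - "8080:80"        # host port 8080 -> container port 80
  db:
    image: mysql
    environment:
      MYSQL_ROOT_PASSWORD: example

You can then bring both containers up as one service with: $ sudo docker-compose up -d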

Picture127.png

Picture128.png

Picture129.png

Picture130.png

  6. Docker – Continuous Integration:

Docker has integrations with many Continuous Integration tools, including the popular CI tool known as Jenkins. Within Jenkins, you have plugins available which can be used to work with containers.

So let’s quickly look at a Docker plugin available for the Jenkins tool. Let’s go step by step and see what’s available in Jenkins for Docker containers.

Step 1: Go to your Jenkins dashboard and click Manage Jenkins.

Picture131.png

Picture132.png

Step 3: Search for Docker plugins. Choose the Docker plugin and click the Install without restart button.

Picture133.png

Step 4: Once the installation is completed, go to your job in the Jenkins dashboard. In our example, we have a job called Demo.

Picture134.png

Step 5: In the job, when you go to the Build step, you can now see the option to start and stop containers.

Picture135.png

Step 6: As a simple example, you can choose the further option to stop containers when the build is completed. Then, click the Save button.

Picture137.png

Now, just run your job in Jenkins. In the Console output, you will now be able to see that the command to Stop All containers has run.

Picture138.png

  7. Docker – Kubernetes Architecture:

Kubernetes is an orchestration framework for Docker containers which helps expose containers as services to the outside world. For example, you can have two services:

One service would contain nginx and mongoDB, and another service would contain nginx and redis. Each service can have an IP or service point which can be connected by other applications.

Kubernetes is then used to manage these services.

Picture139.png

  • The minion is the node on which all the services run. You can have many minions running at one point in time.
  • Each minion will host one or more POD. Each POD is like hosting a service. Each POD then contains the Docker containers.
  • Each POD can host a different set of Docker containers.
  • The proxy is then used to control the exposing of these services to the outside world.

Picture140.png

  7.1 Docker – Working of Kubernetes:

Step 1: Ensure that the Ubuntu server version you are working on is 16.04.

Step 2: Ensure that you generate an SSH key which can be used for SSH login. You can do this using the following command:
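A hedged example of generating the key:

$ ssh-keygen -t rsa   # accept the defaults to create ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub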

Picture141.png

Step 3: Next, depending on the version of Ubuntu you have, you will need to add the relevant Kubernetes repository entry to an apt sources list, so that the apt package manager will be able to detect the Kubernetes packages from the Kubernetes site and download them accordingly.

We can do it using the following commands:
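The commands appear only in the screenshots; a hedged sketch of the usual steps for that era of Kubernetes (the repository and key URLs shown here were the ones in use at the time and have since been deprecated) would be:

$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -                               # add the Kubernetes signing key
$ echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list   # add the repository entry
$ sudo apt-get update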

Picture142.png

Picture143.png

Step 5: Install the Docker package as detailed in the earlier chapters.

Step 6: Now it’s time to install kubernetes by installing the following packages:
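A hedged sketch of the package installation:

$ sudo apt-get install -y kubelet kubeadm kubectl kubernetes-cni   # the core Kubernetes packages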

Picture144.png

Step 7: Once all kubernetes packages are downloaded, it’s time to start the kubernetes controller using the following command:
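A hedged sketch of starting the control plane:

$ sudo kubeadm init   # initializes the master; the output includes the join command for worker nodes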

Picture145.png

Once done, you will get a successful message that the master is up and running and nodes can now join the cluster.

Install the Munin Monitoring Tool on Ubuntu 14.04

Introduction

Munin is a system, network, and infrastructure monitoring application that provides information in graphs through a web browser. It is designed around a client-server architecture and can be configured to monitor the machine it’s installed on (the Munin master) and any number of client machines, which in Munin parlance, are called Munin nodes.

In this article, we’ll install and configure Munin to monitor the server it’s installed on and one node. To install Munin on multiple nodes, just follow the instructions for creating a node on each system.

Prerequisites

  • Two Ubuntu 14.04 Droplets. One of the servers will be the Munin master. The other will be the Munin node.
  • For each Droplet, a non-root user with sudo privileges

All the commands in this tutorial should be run as a non-root user. If root access is required for a command, it will be preceded by sudo. Initial Server Setup with Ubuntu 14.04 explains how to add users and give them sudo access.

Step 1 — Installing Required Packages

We will start working on the Munin master first. Before installing Munin, a few dependencies need to be installed.

Though Munin can function with most popular Web servers like Nginx and Lighttpd, it is, by default, designed to work with the Apache Web server. So be sure that Apache is installed and configured on the Munin master. If it’s not already installed, do so using:

  • sudo apt-get update
  • sudo apt-get install -y apache2 apache2-utils

To ensure that the dynazoom functionality, which is responsible for zooming into the generated graphs, works properly on click, install the following:

$ sudo apt-get install -y libcgi-fast-perl libapache2-mod-fcgid

After installing those two packages, the fcgid module should be enabled. To double-check, type:

$ /usr/sbin/apachectl -M | grep -i cgi

The output should be:

fcgid_module (shared)

If the output is blank, then it’s not enabled. You may then enable it using:

$ sudo a2enmod fcgid

When executing the apachectl command, you can ignore the following warning:

Could not reliably determine the server's fully qualified domain name ...

Apache will still work with Munin with this warning.

The rest of the configuration that will make graph zooming work properly will be covered in Step 3.

Step 2 — Installing Munin on the Munin Master

Installation packages for Munin are available in the official Ubuntu repository, so they can be installed using the distribution’s package manager. In this step, you’ll install the Munin master package. The version in the repository is the latest stable release.

To install it to monitor the server it’s installed on, type:

$ sudo apt-get install -y munin

Step 3 — Configuring the Munin Master

Munin’s main configuration file munin.conf and other files required for it to function are in the /etc/munin directory and its sub-directories. In this step, we’ll modify the main configuration file for the Munin master and its Apache configuration apache.conf.

The main configuration file is made up of at least two sections — a global and at least one host section. Optionally, there can be a group section. Host and group sections start with their respective names in square brackets. This file contains variable definitions, directives that govern how Munin monitors servers and services, and which servers to monitor.

To begin, open the main configuration file:

  • $ cd /etc/munin
  • $ sudo nano munin.conf

Look for these lines and uncomment them (remove the # sign that precedes them). The dbdir stores all of the rrd files containing the actual monitoring information; htmldir stores the images and site files; logdir maintains the logs; rundir holds the state files; and tmpldir is the location for the HTML templates. Be sure to change the htmldir from /var/cache/munin/www to your web directory. In this example, we’ll be using /var/www/munin:

make following changes into /etc/munin/munin.conf file:
dbdir     /var/lib/munin
htmldir   /var/www/munin
logdir    /var/log/munin
rundir    /var/run/munin

tmpldir /etc/munin/templates

Since the htmldir does not exist, let’s create and chown it so that it’s owned by the munin system user:

  • sudo mkdir /var/www/munin
  • sudo chown munin:munin /var/www/munin

Finally, in munin.conf, look for the first host tree. It defines how to access and monitor the host machine. It should read:

/etc/munin/munin.conf
[localhost.localdomain]
    address 127.0.0.1
    use_node_name yes

Change the name of that tree to one that uniquely identifies the server. This is the name that will be displayed in the Munin web interface. In this example, we’ll be using MuninMaster, but you could also use the server’s hostname:

/etc/munin/munin.conf
[MuninMaster]
    address 127.0.0.1
    use_node_name yes

That’s all for the configuration file, so save and close it.

Within the same /etc/munin directory, the next file we’ll be modifying is apache.conf, which is Munin’s Apache configuration file. It is sym-linked to /etc/apache2/conf-available/munin.conf, which, in turn, is sym-linked to /etc/apache2/conf-enabled/munin.conf. To start modifying it, open it with nano:

  • sudo nano apache.conf

At the very top of the file, modify the first line so that it reflects the htmldir path you specified in munin.conf and created previously. Based on the directory path used in this article, it should read as follows, which makes it so you can access Munin’s web interface by appending munin to the server’s IP address or domain hosted on the server:

make following changes to /etc/munin/apache.conf file:
Alias /munin /var/www/munin

Next, look for the Directory section, and change the directory to /var/www/munin. Also comment out (or delete) the first four lines and then add two new directives so that it reads:

make following changes to /etc/munin/apache.conf file:
<Directory /var/www/munin>
        #Order allow,deny
        #Allow from localhost 127.0.0.0/8 ::1
        #Allow from all
        #Options None

        Require all granted
        Options FollowSymLinks SymLinksIfOwnerMatch

        ...

        ...

</Directory>

Look for the penultimate location section, comment out or delete the first two lines and add two new ones so that it reads:

make following changes to /etc/munin/apache.conf file:
<Location /munin-cgi/munin-cgi-graph>
        #Order allow,deny
        #Allow from localhost 127.0.0.0/8 ::1

        Require all granted
        Options FollowSymLinks SymLinksIfOwnerMatch

        ...

        ...

</Location>

Do the same to the last location section:

make following changes to /etc/munin/apache.conf file:
<Location /munin-cgi/munin-cgi-html>
        #Order allow,deny
        #Allow from localhost 127.0.0.0/8 ::1

        Require all granted
        Options FollowSymLinks SymLinksIfOwnerMatch

        ...

        ...

</Location>

Save and close the file. Then restart Apache and Munin.

sudo service apache2 restart
sudo service munin-node restart

You may now access Munin’s web interface by pointing your browser to server-ip-address/munin

for example: 110.x.x.188/munin (hit enter)

 

Munin Web Interface

Step 4 — Adding a Node to Munin Master

In this step, we’ll show how to add a remote server (or node) to the Munin master so that you can monitor it within the same web interface. This involves modifying the Munin master’s configuration file to specify a host tree for the node. Then, you will need to install the Munin node package on the node and modify its configuration file so that it can be monitored by the Munin master.

Let’s start with the Munin node — the second Ubuntu Droplet you created.

Log into the Munin node, update the package database and install the Munin node package:

  • sudo apt-get update
  • sudo apt-get install -y munin-node

After the installation has completed successfully, the node’s configuration file should be in the /etc/munin directory. Open it with nano:

  • sudo nano /etc/munin/munin-node.conf

Towards the middle of the file, look for the allow ^127.0.0.1$ line and modify it so that it reflects the IP address of the Munin master. Note that the IP address is in regex format, so assuming that the master server’s IP address is 123.46.78.100, the line should read as follows:

make the following change to the /etc/munin/munin-node.conf file:
allow ^123\.46\.78\.100$

Save and close the file. Then restart the Munin node:

  • sudo service munin-node restart

Back on the Munin master, open the main configuration file:

  • sudo nano /etc/munin/munin.conf

All we need to do in this file is insert a host tree for the (remote) node. The easiest approach to that is to copy and modify the host tree of the master. Be sure to replace node-ip-address with the IP address of the node you are adding:

make the following changes to /etc/munin/munin.conf file:
[MuninNode]
    address node-ip-address
    use_node_name yes

Save and close the file. Then restart Apache:

  • sudo service apache2 restart

Munin checks for new nodes every 5 minutes. Wait a few minutes, and then reload the Munin master’s web interface. You should see an entry for the node. If you don’t see it yet, try again in 5 minutes. Using this method, you may add as many nodes as you have to monitor.

Munin Node Added

Setting Threshold values in /etc/munin/munin.conf file:

We can set the threshold values for available plugins inside /etc/munin/munin.conf file.

Under the added node section, please add threshold values as follows (we are adding threshold values for the df plugin only):

######################################################

[MuninSlave]
address 110.110.112.179
use_node_name yes

df._dev_sda1.warning 30

df._dev_sda1.critical 40

(Note: a single space separates the field name from the threshold value; do not use an equals sign, as explained in Problem-3 below.)

#######################################################

 

Step 5 — Enabling Extra Plugins

Munin monitors a system using plugin scripts, and by default, about a dozen plugins are installed and active. A complete list of available plugins is in the /usr/share/munin/plugins directory. To see which plugins can be used on your system, Munin provides the following command:

  • sudo munin-node-configure --suggest

The output should be of this sort:

Plugin                     | Used | Suggestions
------                     | ---- | -----------
cps_                       | no   | no
cpu                        | yes  | yes
cpuspeed                   | no   | no [missing /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state]
cupsys_pages               | no   | no [could not find logdir]
df                         | yes  | yes
df_inode                   | yes  | yes
fail2ban                   | no   | yes
ip_                        | no   | yes

A plugin with a yes in the Used column means just what it indicates, while one with a yes in the Suggestions column means it can be used. One with a no in both columns means it is not in use and cannot be used on the system. Finally, if a plugin has a no in the Used column and a yes in Suggestions, then it is not being used but can be enabled and used on the system.

On the Munin master and node, you can also see a list of installed plugins in the /etc/munin/plugins directory.

The munin-plugins-extra package should have been installed when you installed Munin. If it was not, do so using:

  • sudo apt-get install munin-plugins-extra

To enable an available plugin that’s not currently in use, create a symbolic link for it from the /usr/share/munin/plugins directory to the /etc/munin/plugins directory.

For example, to enable the Fail2ban plugin, first install Fail2ban:

  • sudo apt-get install fail2ban

Then, create the symlink that enables the Munin plugin:

  • sudo ln -s /usr/share/munin/plugins/fail2ban /etc/munin/plugins

Restart Munin:

  • sudo service munin-node restart

Wait a few minutes, reload the web interface, and you should see graphs for Fail2ban under the title Hosts blacklisted by fail2ban under the network category for the Munin master.

Troubleshooting

If you are having trouble configuring the Munin master, the Munin node, or getting the master to see the node, check out the log files for error messages:

  • Munin master: /var/log/munin/munin-update.log
  • Munin node: /var/log/munin/munin-node.log

You can also check the project’s page for additional troubleshooting tips.

 

Sample content of the /etc/munin/munin.conf file is:

############ start of file ########################

# Example configuration file for Munin, generated by ‘make build’

# The next three variables specifies where the location of the RRD
# databases, the HTML output, logs and the lock/pid files. They all
# must be writable by the user running munin-cron. They are all
# defaulted to the values you see here.
#

dbdir /var/lib/munin
htmldir /var/www/munin
logdir /var/log/munin
rundir /var/run/munin

# Where to look for the HTML templates
#
tmpldir /etc/munin/templates
# Where to look for the static www files
#
#staticdir /etc/munin/static

# temporary cgi files are here. note that it has to be writable by
# the cgi user (usually nobody or httpd).
#
# cgitmpdir /var/lib/munin/cgi-tmp

# (Exactly one) directory to include all files from.
includedir /etc/munin/munin-conf.d

# You can choose the time reference for “DERIVE” like graphs, and show
# “per minute”, “per hour” values instead of the default “per second”
#
#graph_period second

# Graphics files are generated either via cron or by a CGI process.
# See http://munin-monitoring.org/wiki/CgiHowto2 for more
# documentation.
# Since 2.0, munin-graph has been rewritten to use the cgi code.
# It is single threaded *by design* now.
#
#graph_strategy cron

# munin-cgi-graph is invoked by the web server up to very many times at the
# same time. This is not optimal since it results in high CPU and memory
# consumption to the degree that the system can thrash. Again the default is
# 6. Most likely the optimal number for max_cgi_graph_jobs is the same as
# max_graph_jobs.
#
#munin_cgi_graph_jobs 6

# If the automatic CGI url is wrong for your system override it here:
#
cgiurl_graph /munin-cgi/munin-cgi-graph

# max_size_x and max_size_y are the max size of images in pixel.
# Default is 4000. Do not make it too large otherwise RRD might use all
# RAM to generate the images.
#
#max_size_x 4000
#max_size_y 4000

# HTML files are normally generated by munin-html, no matter if the
# files are used or not. You can change this to on-demand generation
# by following the instructions in http://munin-monitoring.org/wiki/CgiHowto2
#
# Notes:
# – moving to CGI for HTML means you cannot have graph generated by cron.
# – cgi html has some bugs, mostly you still have to launch munin-html by hand
#
#html_strategy cron

# munin-update runs in parallel.
#
# The default max number of processes is 16, and is probably ok for you.
#
# If set too high, it might hit some process/ram/filedesc limits.
# If set too low, munin-update might take more than 5 min.
#
# If you want munin-update to not be parallel set it to 0.
#
#max_processes 16

# RRD updates are per default, performed directly on the rrd files.
# To reduce IO and enable the use of the rrdcached, uncomment it and set it to
# the location of the socket that rrdcached uses.
#
#rrdcached_socket /var/run/rrdcached.sock

# Drop somejuser@fnord.comm and anotheruser@blibb.comm an email everytime
# something changes (OK -> WARNING, CRITICAL -> OK, etc)
#contact.someuser.command mail -s “Munin notification” somejuser@fnord.comm
#contact.anotheruser.command mail -s “Munin notification” anotheruser@blibb.comm
#
# For those with Nagios, the following might come in handy. In addition,
# the services must be defined in the Nagios server as well.
#contact.nagios.command /usr/bin/send_nsca nagios.host.comm -c /etc/nsca.conf

contacts ubuntu
contact.ubuntu.command >”/etc/munin/externalscript”

# a simple host tree
[MuninMaster]
address 127.0.0.1
use_node_name yes

[MuninSlave]
address 110.110.112.179
use_node_name yes

#
# A more complex example of a host tree
#
## First our “normal” host.
# [fii.foo.com]
# address foo
#
## Then our other host…
# [fay.foo.com]
# address fay
#
## IPv6 host. note that the ip adress has to be in brackets
# [ip6.foo.com]
# address [2001::1234:1]
#
## Then we want totals…
# [foo.com;Totals] #Force it into the “foo.com”-domain…
# update no # Turn off data-fetching for this “host”.
#
# # The graph “load1”. We want to see the loads of both machines…
# # “fii=fii.foo.com:load.load” means “label=machine:graph.field”
# load1.graph_title Loads side by side
# load1.graph_order fii=fii.foo.com:load.load fay=fay.foo.com:load.load
#
# # The graph “load2”. Now we want them stacked on top of each other.
# load2.graph_title Loads on top of each other
# load2.dummy_field.stack fii=fii.foo.com:load.load fay=fay.foo.com:load.load
# load2.dummy_field.draw AREA # We want area instead the default LINE2.
# load2.dummy_field.label dummy # This is needed. Silly, really.
#
# # The graph “load3”. Now we want them summarised into one field
# load3.graph_title Loads summarised
# load3.combined_loads.sum fii.foo.com:load.load fay.foo.com:load.load
# load3.combined_loads.label Combined loads # Must be set, as this is
# # not a dummy field!
#
## …and on a side note, I want them listen in another order (default is
## alphabetically)
#
# # Since [foo.com] would be interpreted as a host in the domain “com”, we
# # specify that this is a domain by adding a semicolon.
# [foo.com;]
# node_order Totals fii.foo.com fay.foo.com

############## End Of File ###########################

 

Sample /usr/share/perl5/Munin/Master/LimitsOld.pm file content

***************       Start Of File  ************************************

Click on the following link to see the full content of the LimitsOld.pm file:

https://pastebin.com/hu9E24vw

*************** End Of File *****************************************

Problem-1: Munin logs are not being generated properly:

Solution:

When Munin is installed for the first time, it may not generate logs properly.

To generate the logs, we need to make a change in the /usr/share/perl5/Munin/Master/LimitsOld.pm file.

After opening this file, find the word $DEBUG and check its value. If its value is zero (0), change it to 1.

$DEBUG = 0   (original line in /usr/share/perl5/Munin/Master/LimitsOld.pm)

Change it to:

$DEBUG = 1
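
If you prefer to make this change from the command line instead of editing the file by hand, here is a minimal sketch using sed. It assumes the line appears exactly as "$DEBUG = 0" (adjust the pattern if your version differs) and takes a backup of the file first:

# back up the original module before editing it
$ sudo cp /usr/share/perl5/Munin/Master/LimitsOld.pm /usr/share/perl5/Munin/Master/LimitsOld.pm.bak
# flip the debug flag from 0 to 1
$ sudo sed -i 's/\$DEBUG = 0/$DEBUG = 1/' /usr/share/perl5/Munin/Master/LimitsOld.pm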

After making this change, wait about 5 minutes for the newly generated logs to appear.

The logs are written to the /var/log/munin/munin-limits.log file.

Alternatively, we can watch them directly using the following command:

$ tail -f /var/log/munin/munin-limits.log

Problem-2: Unable to update the graphs on the Munin dashboard:

Solution:

To resolve this problem, we need to make a single change in the /etc/munin/munin.conf file.

Open that file in the vi editor and search for the following line (which is commented out in the default configuration):

#cgiurl_graph /munin-cgi/munin-cgi-graph

Simply uncomment the above line in /etc/munin/munin.conf, then save and close the file.
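
If you'd rather do this non-interactively, here is a one-line sed sketch (it assumes the commented line appears exactly as shown above):

$ sudo sed -i 's|^#cgiurl_graph /munin-cgi/munin-cgi-graph|cgiurl_graph /munin-cgi/munin-cgi-graph|' /etc/munin/munin.conf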

Then restart the munin-node and apache2 services using the following commands:

  •  $ sudo service munin-node restart
  •  $ sudo service apache2 restart

Wait about 5 minutes for the next update run; after that, you will see the updated graphs on the Munin dashboard.

Problem-3: UpdateWorker is not working properly:

Solution:

To find errors related to the UpdateWorker, check the update log file (sudo vi /var/log/munin/munin-update.log). If you see an error like the following in that file:

******************************************************************************

2017/10/26 08:05:02 [ERROR] Munin::Master::UpdateWorker<MuninSlave;MuninSlave> died with '[FATAL] Socket read from MuninSlave failed.  Terminating process. at /usr/share/perl5/Munin/Master/UpdateWorker.pm line 254.

******************************************************************************

then the problem lies in the Munin configuration file, i.e., /etc/munin/munin.conf.

This error occurs because the threshold values were set incorrectly inside munin.conf.

The following is the wrong way to set threshold values (note the "=" signs):

*********************** Start Of Section **********************

[MuninMaster]
address 127.x.x.1
use_node_name yes
df._dev_sda1.warning = 30
df._dev_sda1.critical = 40

************************* End Of Section ***************************

The following is the correct way to set threshold values (no "=" signs):

*********************** Start Of Section **********************

[MuninMaster]
address 127.x.x.1
use_node_name yes
df._dev_sda1.warning 30
df._dev_sda1.critical 40

************************* End Of Section ***************************

After making this change, the UpdateWorker error is gone and we can see output like this:

*******************************************************************

2017/10/26 22:50:06 [INFO] Reaping Munin::Master::UpdateWorker<MuninSlave;MuninSlave>.  Exit value/signal: 0/0
2017/10/26 22:50:06 [INFO] Reaping Munin::Master::UpdateWorker<MuninMaster;MuninMaster>.  Exit value/signal: 0/0

*****************************************************************************

Here, 'Exit value/signal: 0/0' means that everything is fine.
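
To confirm the fix after the next scheduled update run, you can watch the same update log live instead of opening it in vi:

$ sudo tail -f /var/log/munin/munin-update.log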

Event Handling In Munin:

If a service goes into a critical state, we need to create or call an external script that can send a mail to a particular user and handle the event that occurred.

Before forcing the disk into a critical state, let's check how much of the disk is currently occupied using the following command:

  •  df -h

The output of the above command is:

Munin-event-handler.png

Here, the disk is only 36% occupied.

Now we will push the disk into a critical state by running the following command on MuninMaster:

  •  sudo fallocate -l 25G /etc/munin/test.img

where:

25G ——> how much space you want to occupy, and

test.img ——> the name of the file that is created.

Sample output of the above command:

Munin-event-handler1.png

After the above command completes successfully, run the following command again to check how much of the disk is occupied:

  • df -h

The output of the above command is:

Munin-event-handler2.png

Now the disk usage is critical. We can see the same thing on the Munin dashboard as follows:

Munin-event-handler3.png

When we click on the disk marked in red on the Munin dashboard, we can see more detailed disk graphs:

Munin-event-handler4.png

Now we need to handle that disk event. For that, add the following lines to the /etc/munin/munin.conf file on MuninMaster:

#####################################################################

contacts ubuntu root

contact.ubuntu.command >"/etc/munin/externalscript"

####################################################################

Here, we add the usernames of our machine (in this case, ubuntu and root) to the contacts list, which gives the line "contacts ubuntu root".

The contact command then points at the path of our external script; note that there is no space between ">" and the script path.

 

Now your /etc/munin/munin.conf file looks like this:

###########################################################

Munin-event-handler5.png

##################################################################

The content of the sample external script (stored at the path /etc/munin/externalscript) is available at the following link:

###################################################################

 

https://pastebin.com/X07ZKpbF

Click on the above link to see the content of the external script.

###################################################################

Whenever any service goes into a critical state, this external script will be called.
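
In case the pastebin content is ever unavailable, here is a minimal, hypothetical sketch of what such an external notification script might look like (the recipient address and fallback message are placeholder assumptions, not the original script):

#!/bin/bash
# Hypothetical sketch of /etc/munin/externalscript.
# Requires a working 'mail' command (e.g. from the mailutils package).
RECIPIENT="ubuntu@localhost"   # placeholder: replace with your own address
BODY="$(cat)"                  # forward whatever alert text Munin hands us on stdin
echo "${BODY:-A Munin service changed state to critical.}" | mail -s "Munin notification" "$RECIPIENT"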

To handle this particular event, we write a script that deletes the contents of directories like /tmp and /var/tmp, and finally removes the test.img file we created earlier.

Script for event handling:

###################################################################

#!/bin/bash
# Free up disk space when the disk goes critical:
# clear the temporary directories and remove the test.img file created earlier.
sudo rm -rf /var/tmp/*
sudo rm -rf /tmp/*
sudo rm -rf /etc/munin/test.img

###################################################################

When any service goes into a critical state, this script is called; it deletes the temporary files and reduces the disk usage.
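
Also make sure both scripts are executable. The path of the cleanup script below is only an example, since the guide does not say where it is saved:

$ sudo chmod +x /etc/munin/externalscript
$ sudo chmod +x /etc/munin/cleanup_disk.sh   # example path for the event-handling script above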

Within about 5 minutes we can see the updated changes on the Munin dashboard, i.e., at 110.x.x.188/munin, as follows:

Munin-event-handler6.png

A more detailed view of the MuninMaster disk:

Munin-event-handler7.png

In the above snapshot, we can see that the disk is no longer in a critical state (after the external script has run).

Conclusion:

Munin can be configured to monitor the system on which it is installed. Adding remote servers to the monitored setup is as simple as installing the munin-node package on the remote server (or node) and then modifying the master's and node's configuration files to point at each other's IP addresses.

Munin works by using plugins, but not all of them are enabled out of the box. Information about the available plugins is available on the project's page.
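
As a quick, hedged example of enabling an extra plugin on a Debian/Ubuntu node (paths and plugin names may differ on other distributions): the stock plugins usually live under /usr/share/munin/plugins/, and a plugin is enabled by symlinking it into /etc/munin/plugins/ and restarting munin-node.

$ sudo ln -s /usr/share/munin/plugins/apache_accesses /etc/munin/plugins/apache_accesses
$ sudo service munin-node restart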

 

Redmine Ticketing Tool Installation on Ubuntu 14.04/16.04

 

Introduction:

Redmine is an open-source, flexible project management web application.

It features per project wiki pages and forums, time tracking, and flexible, role-based access control. It includes a calendar and Gantt charts to aid visual representation of projects and their deadlines. Redmine integrates with various version control systems and includes a repository browser and diff viewer.

The design of Redmine is significantly influenced by trac, a software package with some similar features.

Redmine is written using the Ruby on Rails framework. It is cross-platform and cross-database and supports 34 languages.

Supported Features:

1. Support for multiple projects.
2. Role-based access control on each project and its content.
3. Flexible issue tracking system.
4. Support for calendars and Gantt charts.
5. Support for email notifications.
6. Support for per-project wiki pages and per-project forums.
7. Time tracking.
8. Support for SVN, Git, CVS, etc.
9. Multiple LDAP (Lightweight Directory Access Protocol) authentication support.
10. Multi-language support (English, French, Japanese, Russian, German, etc.).
11. Multiple DB support (MySQL, PostgreSQL, SQL Server, etc.).
12. API support.

We are going to follow the simple installation steps (Bitnami Redmine Stack installation).

Installation steps:

  1. System requirements for Redmine installation:
  •  Intel x86 or compatible processor
  • Minimum 512 MB RAM
  • 1000 MB of hard drive space
  • Operating system: Ubuntu 14.04 or Ubuntu 16.04

Step 1: Download Bitnami Redmine from the following link:

https://bitnami.com/redirect/to/156271/bitnami-redmine-3.4.2-2-linux-x64-installer.run

Redmine Download2.png

Or you can download the installer using the wget command on Ubuntu as follows:

ubuntu@localhost~]$ sudo wget https://bitnami.com/redirect/to/156271/bitnami-redmine-3.4.2-2-linux-x64-installer.run

Step 2:

It's a bitnami-redmine*.run file, but that doesn't matter on Linux, as Linux does not care about file extensions.

Check whether the file has execute permission (use the actual file name of the version you downloaded):

ubuntu@localhost~]$ getfacl bitnami-redmine-3.1.1-1-linux-x64-installer.run

If it already has execute permission, you are fine; if not, you can run:

$ chmod 700  bitnami-redmine-3.1.1-1-linux-x64-installer.run

Redmine2.png

Step 3:

Now run this file:

$ ./bitnami-redmine-3.1.1-1-linux-x64-installer.run

Step 4:

Now it will prompt you with some basic questions; the first one is the installation language.

Then it will ask which packages to install.

Redmine1.png

Then it asks for the directory in which to put all the Redmine files, and for the admin profile information such as name, username, and password.

Step 5: Mail configuration:

Redmine3.png

You can specify your Gmail account for outgoing mail (recommended).

  • Select Gmail for a Gmail account, or Custom for any other provider.

Provide the username and password.

  • You can also set this up after the whole installation is complete.

Redmine4.png

Fill in the default section:

Redmine5.png

And it's done.

Step 6: It will show the installation progress:

Redmine6.png

Finally, it will give you a link to access the Redmine web page (generally it is):

http://localhost:80/redmine

Here, you have to use your Ubuntu machine's IP address instead of localhost in the URL.

The port number may differ if another web service is already running on your machine.

Step 7: Open the login screen in your browser and enter your admin credentials.

Redmine7.png

To restart all your Redmine services:

If your machine ever goes down, there is a shell script to restart all the services.

The Bitnami Redmine installation lives in the "/opt/redmine" directory.

To restart all the services, go to that directory and run the control script:

ubuntu@localhost~]$ cd /opt/redmine

ubuntu@redmine]$ sudo ./ctlscript.sh restart

or

ubuntu@localhost~]$ sh /opt/redmine/ctlscript.sh restart

ctlscript.sh is a bash script which is useful for restarting all services related to Bitnami Redmine.
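
As a usage sketch (run the script without arguments to see the exact options supported by your install), the Bitnami control script typically also accepts a status check and per-service actions, for example:

ubuntu@localhost~]$ sudo /opt/redmine/ctlscript.sh status
ubuntu@localhost~]$ sudo /opt/redmine/ctlscript.sh restart apache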

Here, /opt/redmine is your Redmine installation directory, relative to the root ('/').

Since everything lives in this Redmine directory, you can configure the whole setup as you like, including Apache and the other bundled components.

Advanced Settings:

If you want to set your own project name on the Redmine dashboard, follow the images below.

Here I have changed the project name to "SEAL Development" instead of "Redmine" on the Redmine dashboard home page.

 

Redmine_change_to_SEAL2.png

 

Redmine_change_to_SEAL4.png

 

 

Redmine_change_to_SEAL5.png

After making all the changes, your project name (SEAL Development) is set on the Redmine home page, and it looks as follows:

Redmine_change_to_SEAL6.png

You are done with the Redmine installation!

SolarWinds (Monitoring Tool) Part-1: Adding a Node to SolarWinds and Monitoring a Service from the Added Node

 

Introduction:

SolarWinds is a monitoring tool with which we can monitor servers, services, applications, and the network performance of a particular server.

SolarWinds provides different products for different purposes: for monitoring servers and their applications it provides the SAM (Server & Application Monitoring) tool, for measuring network performance it provides the NPM (Network Performance Monitoring) tool, and for monitoring network configurations it provides the NCM (Network Configuration Monitoring) tool. SolarWinds also provides many more monitoring tools; a few of them are introduced above.

 

Restrictions for using SolarWinds SAM:

1. Active Directory should not be installed on the server where you are going to install SolarWinds SAM.

2. The servers that are going to be added for monitoring (using SolarWinds SAM) should be registered with Active Directory.

3. Active Directory and the SAM setup should not be installed on the same Windows Server 2012 R2 base machine.

Currently, we are focusing on the Server & Application Monitoring (SAM) tool.

SAM System Requirements-1.png

SAM System Requirements-2.png

SAM System Requirements-3.png

SAM System Requirements-4.png

SAM System Requirements-5.png

SAM System Requirements-6.png

SAM Setup Installation Process On Windows Server 2012 R2: 

SAM Setup Download Link1:-

http://links.solarwindsmarketing.mkt4971.com/ctt?kn=19&ms=NTE1NjQyMjgS1&r=MjY1MDQ5OTA3NjQ5S0&b=0&j=MTE0MTUzMzk5MAS2&mt=2&rj=MTE0MTUzMDQ1NgS2&rt=0

or

SAM Setup Download Link2:-

http://links.solarwindsmarketing.mkt4971.com/ctt?kn=6&ms=NTQ1OTMxMTMS1&r=MjYzNzYyMTI2NTIyS0&b=0&j=MTI0NjU1MDM5NwS2&mt=2&rj=MTI0NjUwMDcxOAS2&rt=0

Step 1:

After downloading the setup (the file should be around 1.5 GB to 1.7 GB or more; do not use a download that is only 2 KB or 5 KB in size), go to the Windows Server 2012 R2 machine and double-click the setup file. After clicking, a window pops up saying that .NET 3.5 or .NET 4.0 is required, so we need to install it first.

1) Click Add Roles and Features -> Role-based or feature-based installation -> Next -> Next -> select ".NET Framework 3.5 Features" -> Install.

SAM Dot Ner Requirements-1.png

SAM Dot Ner Requirements-2.png

If it installs successfully, you are done. If it gives an error while installing the .NET 3.5 features, you need to follow a different approach, as described below.

First, copy the Windows Server 2012 R2 ISO onto the Windows Server 2012 R2 machine (where you are trying to install SolarWinds), put the ISO image in the "C:" drive, right-click on it, and select the "Mount" option.

Mount Iso.png

After it is mounted, go back to My Computer; you will see a new drive, as shown in the next figure.

Mount Iso-2.png

Now you need to follow the same steps again:

SAM Dot Ner Requirements-1.png

SAM Dot Ner Requirements-2.png

SAM Dot Ner Requirements-3.png

In the next figure, you need to specify the alternate source path as "D:\Sources\SxS\". Sometimes you need to change the drive letter from "D:" to whatever drive letter appears in My Computer after mounting the ISO.

SAM Dot Ner Requirements-4.png

Now click OK and then click Install.

After clicking Install, the feature will be installed.

Now we can try installing SAM again by double-clicking the SAM setup. It will start the installation wizard; click Next -> Next -> Next -> OK.

The installation will start.

After the installation finishes, go to Start -> search for "SolarWinds Orion Web Console" -> press Enter. It will open in a browser.

Orion -login page.png

The first time, do not enter any password (the password field should be left blank), then click Login.

After logging in, you can see the following page:

Orion -dashboard.png

Now we will add a node (server) whose service is going to be monitored.

Step 1: Adding a node (server) to the SolarWinds dashboard:

Add Node -1.png

Click on Manage Nodes:

Add Node -2.png

Click on Add Node:

Add Node -3.png

After entering the credentials of the server you want to add, click "Test".

Only if the test is successful can you move on to installing the SolarWinds agent on that server machine.

Add Node -4.png

Click Next.

When you click the Next button, installation of the SolarWinds agent on the server begins.

Purpose of the SolarWinds agent:

It enables communication between the SolarWinds Orion dashboard and the newly added server (node).

Add Node -5.png

Add Node -6.png

The next step is "Choose Resources". There is no need to select anything here; just click the Next button at the bottom right corner.

Add Node -7.png

The next step is "ADD APPLICATION MONITORS". Select the template listed under "Show Only" by ticking the checkbox that appears just below "Show Only".

Add Node -8.1.png

figure 8.1 Add Application Monitor part-1

After that, scroll down the same page, click the "Choose Credentials" drop-down box, and choose the newly added node's credentials. Here I have added the 112.131 machine, whose credentials are named "112.131 Windows Credentials", so I select those same credentials from the drop-down. After that, click the "Test" button. Only if your credentials are correct will you get a message like "Testing on node 112.131 is finished successfully with 'Up' status." Then click the "Next" button.

Add Node -8.2.png

figure 8.1 Add Application Monitor part-2

Add Node -8.2.1.png

figure 8.2 Change Properties part-1

Add Node -8.2.2.png

figure 8.2 Change Properties part-2

There is no need to change anything under "Change Properties". Just click "OK, ADD NODE".

Adding the node to SolarWinds is now complete.

Step 2: Adding a service to monitor from the added server (node):

Now we will add services to monitor from the newly added node (server). Here is the procedure to add a service for monitoring.

Go to Orion Dashboard -> Settings -> All Settings -> PRODUCT SPECIFIC SETTINGS -> SAM Settings -> "Component Monitor Wizard".

After clicking on "Component Monitor Wizard", you can see the following screen.

Component wizard-1.png

figure-9.1 Component Wizard part-1

Select the component monitor type. (This selection depends on what type of server you added for monitoring, i.e., whether it is a Windows server or a Linux server.)

Here I have added a Windows server, so I select the "Windows Service Monitor" radio button.

Component wizard-2.png

figure-9.1 Component Wizard part-2

After that, click "Next".

Select Target-1.png

After clicking the Browse button, you can see the following screen. Select the node which you added in the previous step and then click "Select".

Select Target-2.png

After clicking "Select", you will get the following screen.

Select Target-3.png

Now click "Next", and you will get the following screens.

Select Component.png

Edit Properties of Component.png

Add Application to Monitor.png

Assign To Node.png

Create Component.png

SAM Summary Page.png

On the SAM summary page, you can see the newly created node with its newly created template.

You have to wait about 10 minutes to see whether it is really working or not.

After 10 minutes, if it turns green, it is fine and working correctly.

If it shows gray, it means it was not added successfully.

Application Summary SAM.png

We are done with adding the server and monitoring its applications and services.

*******************************PART-1ENDS*************************

Nagios XI – Restarting A LINUX Service With NRPE Part-5

Introduction:

In this part, we are going to monitor a Linux service with Nagios XI and restart it automatically using NRPE.

Here are the IPs we are going to use for demo purposes:

110.x.x.69 ——> Nagios server (Nagios XI)

110.x.x.67 ——> CentOS client (the machine whose service is going to be monitored)

Configuration on the Client Machine:

In this section, we are going to configure CentOS (i.e., the Linux client machine).

Now, configure 110.x.x.67:
[root@nagios-linux-test~]#yum install epel-release

install_epel-release

[root@nagios-linux-test~]#yum install nrpe nagios-plugins-all

install_nagios plugins.png

Now we need to stop the firewall on both machines (i.e., on the CentOS client machine and also on the Nagios server machine).

To do this, use the following command:

[root@nagiosclient ~]# service iptables stop

firewall stop.png

The firewall is now stopped.

Now we need to create a script that reports the current status of a particular service (i.e., the status is either "OK - Running" or "Critical - Not running").

Go to the following path to create the script file:

[root@nagiosclient ~]# cd /usr/lib64/nagios/plugins/

or

[root@nagiosclient~]#cd /usr/local/nagios/libexec

[root@nagiosclient plugins]#vi check_httpd.sh

and paste the following content:

***********************************************

#!/bin/sh
# Nagios plugin: report whether the httpd service is running.
# grep -q keeps the plugin output to a single status line.
service httpd status | grep -q running

case $? in
0)
echo "OK httpd is Running"
exit 0
;;
1)
echo "Critical httpd not running"
exit 2
;;
*)
echo "UNKNOWN - Failed to connect"
exit 3
;;
esac

************************************************

Then press Esc, type :wq, and press Enter to save the file.

Now change the permissions of the check_httpd.sh file using the following command:

[root@nagiosclient plugins]#chmod 777 check_httpd.sh

[root@nagiosclient plugins]#ls

When you run the ls command, the color of the check_httpd.sh file changes from white to green, indicating that it is executable.
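
You can quickly verify the plugin on the client before wiring it into NRPE; the exact message depends on whether httpd is actually running:

[root@nagiosclient plugins]# ./check_httpd.sh
OK httpd is Running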

Configuring NRPE Command:

First we’ll create a command in the “/usr/local/nagios/etc/nrpe.cfg” file that will perform the restart command. Establish a terminal or SSH session to your Linux server as the root user and execute the following command:

[root@nagiosclient~]#vi /usr/local/nagios/etc/nrpe.cfg

                                            OR

[root@nagiosclient~]#vi /etc/nagios/nrpe.cfg

nrpe_cfg_file path.png

When using the vi editor, to make changes press i on the keyboard first to enter insert mode. Press Esc to exit insert mode.

Go to the end of the file by pressing Shift + G and add the following line:

command[service_restart]=sudo service $ARG1$ restart

add.png

When you have finished, save the changes in vi by pressing Esc, typing :wq, and pressing Enter.

***********************************************************************

OPTIONAL (generally there is no need to perform this step; it applies up to the next line of stars):

The nagios user will also need to be granted permissions to execute the service command. Execute the following command as root to give NRPE permission to restart services:

echo "nagios ALL = NOPASSWD: `which service`" >> /etc/sudoers

It's very important to use the back-tick key on your keyboard around the `which service` words above; this key is commonly located to the left of the 1 key.
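
To double-check that the entry took effect, you can list the nagios user's sudo privileges as root (this verification step is an addition, not part of the original guide):

# sudo -l -U nagios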

***********************************************************************

Testing the Commands from Nagios XI Server(From 110.x.x.69):

Now we will test from the Nagios XI server that the command you just added to the NRPE client on the Linux server is working. This example is going to restart the crond service as it is unlikely to cause any issues. Establish a terminal session to your Nagios XI server and execute the following command:

[root@nagiosclient~]#/usr/local/nagios/libexec/check_nrpe -H 110.x.x.67 -p 5666 -c service_restart -a crond

manually checking.png

You can see from the screenshot that we received results back from the "service_restart" command, so it appears to be working.

Create Event Handler Script:-

Next we need to create a script that will be used by Nagios XI for the event handler. The script will be called “service_restart.sh” and will be located in the “/usr/local/nagios/libexec/” directory on the Nagios XI server. Execute the following command:

[root@nagiosclient~]#vi /usr/local/nagios/libexec/service_restart.sh

When using the vi editor, to make changes press “i”on the keyboard first to enter insert mode. Press “Esc” to exit insert mode.

Now Paste the following:-

************************************************

#!/bin/sh
# Nagios XI event handler: when a monitored service goes CRITICAL, ask the
# remote NRPE agent ($2 = host address) to restart the service named in $3.

case "$1" in
OK)
;;
WARNING)
;;
UNKNOWN)
;;
CRITICAL)
/usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c service_restart -a "$3"
;;
esac

exit 0

************************************************

When you have finished, save the changes in vi by pressing Esc, typing :wq, and pressing Enter.

Now execute the following commands to set the correct permissions:

[root@nagiosclient~]#chown apache:nagios /usr/local/nagios/libexec/service_restart.sh

[root@nagiosclient~]#chmod 775 /usr/local/nagios/libexec/service_restart.sh

You can now test that the script works by executing the following command (using the client's address and the crond service as arguments):

[root@nagiosclient~]#/usr/local/nagios/libexec/service_restart.sh CRITICAL 110.x.x.67 crond

You can see from the script above that the "service_restart" command is executed only when the service is in a "CRITICAL" state.

Create the Event Handler from the Nagios XI Server Dashboard (i.e., from 110.x.x.69):

Now we will create an event handler on the Nagios XI server, which will be used by your services.

Navigate to Configure > Core Configuration Manager

Select “Commands” from the list on the left, click the ” >_ Commands” link and then click the Add New button.

command1.png

You will need to populate the fields with the following values:

Command Name: Service Restart – Linux

Command line: $USER1$/service_restart.sh $SERVICESTATE$ $HOSTADDRESS$ $_SERVICESERVICE$

Command type:  misc command

Check the Active check box.

Click the Save button and then Apply Configuration.

If anything is unclear, please refer to the following screenshot:

command2.png

For the Event handler drop-down list, select the option Service Restart – Linux.

command3.png

command4.png

We will be adding a custom variable so that the event handler knows the name of the service to restart.

Name: _SERVICE

Value: the name of the service to restart, e.g., "crond" or "httpd".

command5.png

In the event handler command you created, you can see the macro “$_SERVICESERVICE$” was used. This is how a service macro is referenced by the Nagios Core engine.

Testing:

To test simply force the service to stop on the Client Linux machine. Execute the following command on your Linux machine:

[root@nagios-linux-test~]#service crond stop

Now watch the dashboard in your browser.

Wait for the Nagios service to go to a critical state or force the next check. Once the Nagios XI Cron Scheduling Daemon service is in a critical state the event handler will be executed and the Linux crond service will be restarted. The next time Nagios XI checks the Cron Scheduling Daemon service it will return to an OK state as the Linux crond service will now be running.

Troubleshooting:—–>

Problem-1:

Sometimes the firewall does not allow you to communicate with other machines. In that case, see the following screenshot and add port 5666 and port 12489 to the firewall:

firewall_port_adding_nagios1.png
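
If the machine you need to open up is the Linux client protected by iptables (as in this example), roughly equivalent command-line rules would look like the following sketch; on a Windows host you would open the same ports in Windows Firewall instead. Port 5666 is used by NRPE and 12489 by NSClient++:

[root@nagios-linux-test~]#iptables -I INPUT -p tcp --dport 5666 -j ACCEPT
[root@nagios-linux-test~]#iptables -I INPUT -p tcp --dport 12489 -j ACCEPT
[root@nagios-linux-test~]#service iptables save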

When you are going to monitor the "httpd" service on the client Linux machine, the following problem may occur.

Problem-2: "httpd dead but subsys locked"

Solution:

You may still see "httpd dead but subsys locked" even after following fixes from other sources.

When this type of problem occurs, run the following commands:

[root@nagios-linux-test~]#killall -9 httpd
[root@nagios-linux-test~]#sudo rm -f /var/lock/subsys/httpd
[root@nagios-linux-test~]#sudo service httpd restart
Stopping httpd:                                            [  OK  ]
Starting httpd:                                            [  OK  ]
[root@nagios-linux-test nagios]# service httpd status

httpd (pid 324) is running...

If you are still facing the same httpd problem, please go to the following link, where you will find a complete solution:

https://awaseroot.wordpress.com/2012/06/11/subsys-lock-problem-with-centos-6-2-and-apache/

Problem-3: "CHECK_NRPE: Error – Could not complete SSL handshake":

This error appears when the Nagios server tries to connect to the remote NRPE server:

# /usr/lib64/nagios/plugins/check_nrpe -H 110.x.x.67

CHECK_NRPE: Error - Could not complete SSL handshake.

Solution:

This issue generally occurs when the NRPE server does not allow access from the Nagios server. You need to add the Nagios server's IP to the NRPE configuration file.

Step 1:

Edit the NRPE configuration file /etc/nagios/nrpe.cfg and search for the allowed_hosts configuration variable.

Step 2:

Add your Nagios server's IP address to allowed_hosts. For multiple Nagios servers, add all the IPs as a comma-delimited list. Subnets are also supported (like 110.110.110.1/24).

allowed_hosts=127.0.0.1, 192.168.10.3, 192.168.10.4

After making the above changes, restart the NRPE service:

# service nrpe restart

Step 3:

Finally, verify the changes again using the check_nrpe command from the Nagios server:

# /usr/lib64/nagios/plugins/check_nrpe -H 192.168.10.45

NRPE v2.14

If you still haven't found a solution to the above problem, please go to the following links:

http://tecadmin.net/check-nrpe-error-could-not-complete-ssl-handshake/

https://www.howtoforge.com/nagios-icinga-debian-squeeze-check_nrpe-error-could-not-complete-ssl-handshake

Problem-4: "No route to host" on CentOS

Solution:

To solve this problem, we need to open port 5666 in the firewall.

Type the following command and hit Enter:

[root@nagios-linux-test~]#iptables -I INPUT -s 0/0 -p tcp --dport 5666 -j ACCEPT

If you need to add this port to a dedicated NRPE chain instead of INPUT, replace the word "INPUT" with "NRPE"; the command then looks like this:

[root@nagios-linux-test~]#iptables -I NRPE -s 0/0 -p tcp --dport 5666 -j ACCEPT
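
If your CentOS/RHEL release ships firewalld instead of plain iptables (this guide itself uses iptables), the equivalent commands would be:

[root@nagios-linux-test~]#firewall-cmd --permanent --add-port=5666/tcp
[root@nagios-linux-test~]#firewall-cmd --reload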

Reference link for problem 4:

http://ask.xmodulo.com/open-port-firewall-centos-rhel.html

With that, we have successfully completed monitoring of a specific service on the Linux client (CentOS) machine.

Nagios XI (Monitoring Tool) for monitoring services on a specific server, Part-4: "Nagios XI – Adding a Windows Server IP to the Nagios XI Server Dashboard"

Here, we are going to learn how to add a Windows Server 2012 machine (by IP) to Nagios XI on the dashboard.

To do the same, please follow the screenshots.

To add the Windows server to Nagios XI (for monitoring the Windows server),

first go to "Core Config Manager" —-> "Configuration Wizard".

add_Host_to_monitoring2.png

add_Host_to_monitoring3.png

Now you will see the following screen. There you get a search box, where you need to search for the type of machine you want to add for monitoring.

Here I am working with a Windows server, so I searched for "Windows server", as you can see.

add_Host_to_monitoring4.png

Just click on "Windows server 2003,2008,2012,etc…". Then you will see the next step:

add_Host_to_monitoring5.png

In the next screenshot, it asks for NSClient++'s password (which you provided while installing NSClient++ on the Windows Server 2012 machine), so provide the same password here.

add_Host_to_monitoring6_step2-2.1.png

This "Configuration Wizard: Windows Server - Step 2" page is big, which is why I have split it into 4 screenshots.

add_Host_to_monitoring6_step2-2.4.png

add_Host_to_monitoring6_step2-2.5.png

add_Host_to_monitoring6_step-4.png
add_Host_to_monitoring6_step-4.2.png

add_Host_to_monitoring6_step-5.1.png

add_Host_to_monitoring6_step-5.2.png
add_Host_to_monitoring6_step-6.png

After pressing the "Apply" button, the Windows server's IP is added to the Nagios monitored hosts. You can manually check whether the new host has been added to the Nagios server (on the Nagios server dashboard).

Go to "Core Config Manager"; on the left pane you will find "Monitoring" —-> Hosts —-> click on it and check whether your host has been added. Here are the screenshots.

hosts1.png

Here you can see that I have added 110.110.110.90 (inside the red square box).

hosts2.png

Enjoy… You have completed this part successfully!