Saturday, February 5, 2022

Find Me in ElasticSearch - ASAR

Quick note about logs: Organizations using the AWS cloud platform and services mostly log information to Amazon Simple Storage Service (Amazon S3), which is then shipped to an external monitoring and analysis solution (Kibana, Grafana, AWS QuickSight, etc.).

As of today, we follow some rather time-consuming processes, like provisioning a VM or installing data shippers, to push data (logs) directly from AWS to Elastic.

Now AWS users can quickly ingest logs stored in Amazon S3 with the new Elastic Serverless Forwarder, an AWS Lambda application, and view them in the Elastic Stack alongside other logs and metrics for centralized analytics.

Elastic Serverless Forwarder is an AWS Lambda function that ships logs from your AWS environment to Elastic. The function can forward data to Elastic self-managed or Elastic cloud environments. It supports trigger/input from S3/SQS Events.

The Elastic Serverless Forwarder is published in the AWS Serverless Application Repository (SAR) to simplify this design and send logs to Elasticsearch.

The Elastic Serverless Forwarder Lambda application supports pushing logs from an AWS S3 bucket to Elastic. An SQS queue event notification on the S3 bucket serves as the trigger for the Lambda function: when a new log file is written to the bucket, the notification triggers the function.
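
To illustrate the S3-to-SQS wiring that drives the forwarder, here is a minimal boto3 sketch (not the official forwarder setup; the bucket name and queue ARN are hypothetical placeholders, and the queue's access policy must already allow S3 to send messages):

    import boto3

    s3 = boto3.client("s3")

    # Hypothetical names - replace with your own bucket and queue ARN.
    BUCKET = "my-log-bucket"
    QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:my-log-queue"

    # Send an event to SQS whenever a new log object is written to the bucket.
    # The forwarder Lambda is then configured with this queue as its trigger.
    s3.put_bucket_notification_configuration(
        Bucket=BUCKET,
        NotificationConfiguration={
            "QueueConfigurations": [
                {
                    "QueueArn": QUEUE_ARN,
                    "Events": ["s3:ObjectCreated:*"],
                }
            ]
        },
    )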

Below is a high-level view of how users set up the SQS trigger on their S3 bucket and provide Elastic connection information to let the logs flow, then use the prebuilt dashboards and full analytics features of Kibana to bring the log data to life.

For detailed coverage and the step-by-step setup, please check the link.

Just for your reference: Most AWS Logs go to AWS S3.

Hope this helps.

Arun Manglick

Tuesday, December 1, 2020

Resilient Application & Design Aspects

This post covers the parameters of a resilient application, an essential goal of many modern architecture exercises. A well-designed app provides high availability and reliability, scales up and down as demand increases and decreases, and is resilient enough to withstand service disruptions. Building and operating apps that meet these requirements requires careful planning and design.

  1. Availability
  2. Scalability
  3. Reliability
  4. Performance
  5. Reduce Communication Between Micro Services
  6. Use Appropriate DB and Storage Technology
  7. Caching

More details below on what is required for each of these parameters.

"Well before you reading thru below, there are multiple factors contribute to 'Resiliency of Microservices'  - 

  1. Timely Timeouts - Do not wait indefinitely for a response before timing out; otherwise it will degrade the system.
  2. Circuit Breakers - Stop making requests after a certain threshold of failures (see the sketch after this list).
  3. Bulkheads - Do not have a single thread pool for all outbound endpoints; rather, use one thread pool per endpoint.
  4. Steady State - Adhere to designs that allow your system to run in a steady state for a long time. This could be through automated deployments, clearing log files so they do not grow indefinitely, clearing caches before they grow enormous, etc.
  5. Fail Fast - Make the decision to fail early if you know the request is going to fail or be rejected, e.g. a failed node. Circuit Breakers can also be used to implement a Fail Fast strategy.
  6. Let it Crash - This strategy abandons a broken sub-system in order to preserve the overall stability of the system, e.g. remove a failed node, remove a failed endpoint, etc.
  7. Load Shedding - Drop some proportion of load by dropping traffic as the server approaches overload conditions (e.g. reduce queue size, introduce caching, etc.).
  8. Fallback - Sometimes a request is going to fail no matter how many times you retry. The Fallback policy lets you return some default or perform an action instead - like paging an admin, scaling a system, or restarting a service.
  9. Disaster Recovery - Replicate services across multiple AZs to handle site failures.

    I'll write up a separate post - Designing Microservices - to cover all these details with a few more colors"
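
    As a rough illustration of the Circuit Breaker idea above, here is a minimal sketch (not tied to any particular framework; the threshold and timeout values are arbitrary):

        import time

        class CircuitBreaker:
            """Stop calling a failing dependency after a threshold of failures."""

            def __init__(self, failure_threshold=5, reset_timeout=30):
                self.failure_threshold = failure_threshold
                self.reset_timeout = reset_timeout  # seconds to stay open before retrying
                self.failures = 0
                self.opened_at = None

            def call(self, func, *args, **kwargs):
                # While the circuit is open, fail fast instead of hitting the dependency.
                if self.opened_at is not None:
                    if time.time() - self.opened_at < self.reset_timeout:
                        raise RuntimeError("circuit open - failing fast")
                    self.opened_at = None   # half-open: let one trial call through
                    self.failures = 0
                try:
                    result = func(*args, **kwargs)
                except Exception:
                    self.failures += 1
                    if self.failures >= self.failure_threshold:
                        self.opened_at = time.time()
                    raise
                self.failures = 0
                return result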

    Availability: 

    • Eliminate SPOF (Single Point of Failure)
      • Implement Geographic Redundancy - At least two copies of every key component
      • Fault Isolation Zone
    • Data Replication 
    • Ensure Automatic Failover - 
      • Retry Logic (see the retry sketch after this list),
      • Circuit Breakers (stop making requests after a certain threshold of failures)
    • Load Balancing - With Health checks to avoid traffic to unhealthy nodes
    • Increase Monitoring - To determine failures early - Like New Relic, site24x7.com
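
    For the Retry Logic bullet above, a minimal sketch of retries with exponential backoff and jitter (the attempt count and delays are arbitrary placeholders):

        import time
        import random

        def call_with_retries(func, max_attempts=3, base_delay=0.5):
            """Retry a flaky call with exponential backoff and a little jitter."""
            for attempt in range(1, max_attempts + 1):
                try:
                    return func()
                except Exception:
                    if attempt == max_attempts:
                        raise                       # give up after the last attempt
                    # Exponential backoff plus random jitter to avoid thundering herds.
                    time.sleep(base_delay * (2 ** (attempt - 1)) + random.random() * 0.1)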

    Reliability: 

    • More or less similar to Availability, as Availability increases Reliability
    • In Addition
      • Use Micro Services
      • Horizontal Scaling rather than Vertical Scaling, to reduce SPOF
    Scalability: 

    • Layered Architecture 
    • Loose Coupled Design/Components
    • Micro Services
    • Load Balancer - To route traffic to more available nodes.
    • Sharding 
      • Horizontal Sharding - Available with No-SQL DB
      • Vertical Sharding - Available with SQL DB
    • Caching 
    Performance: 

    • Asynchronous Programming (see the sketch after this list)
    • CDN (Content Delivery Network)
    • Caching
    • Prefer Static Content From Cache
    • Load Balancer - To Distribute Traffic Equally 
    • Compress Data
    • Event Driven Designs
    • Message Bus
    • Reduced Image Size
    • Review SPs (Stored Procedures) for Best Execution Plan
    • Layered Architecture - Allows to boost any particular layer to scale independently 
    • Keep Instrumenting and Work on Weak Areas
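
    A small sketch of the Asynchronous Programming point above: two downstream calls run concurrently instead of sequentially, so overall latency is roughly the slower of the two rather than the sum (the service calls are simulated):

        import asyncio

        async def get_profile(user_id):
            await asyncio.sleep(0.2)     # simulated downstream latency
            return {"user": user_id}

        async def get_orders(user_id):
            await asyncio.sleep(0.3)
            return [{"order": 1}]

        async def build_page(user_id):
            # Both calls run concurrently; total latency is ~max, not the sum.
            profile, orders = await asyncio.gather(get_profile(user_id), get_orders(user_id))
            return {"profile": profile, "orders": orders}

        print(asyncio.run(build_page(42)))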

    Reduce Over-Communication Between Micro-Services: 
    • This is required to stop traffic to failed services and avoid cascading failures
    • This can be done using
      • Circuit Breakers - Stop making requests after a certain threshold of failures
      • Fallback - Sometimes a request is going to fail no matter how many times you retry. The Fallback policy lets you return some default or perform an action  - Like paging an admin, scaling a system or restarting a service. 
      • Graceful Degradations
        • Load shedding drops some proportion of load by dropping traffic as the server approaches overload conditions (e.g. reduce queue size). The goal is to keep the server from running out of RAM, failing health checks, etc. (a minimal sketch follows this list).
        • Graceful degradation - Takes the concept of load shedding one step further by reducing the amount of work that needs to be performed. In some applications, it’s possible to significantly decrease the amount of work or time needed by decreasing the quality of responses. For instance, a search application might only search a subset of data stored in an in-memory cache rather than the full on-disk database or use a less-accurate (but faster) ranking algorithm when overloaded.
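
    A minimal sketch of the load-shedding idea above: reject new work once an in-process queue passes a threshold, so the server never runs out of memory (the limit is arbitrary):

        import queue

        MAX_PENDING = 100                   # arbitrary shedding threshold
        pending = queue.Queue(maxsize=MAX_PENDING)

        def accept_request(request):
            """Enqueue the request, or shed it if the server is near overload."""
            try:
                pending.put_nowait(request)
                return "accepted"
            except queue.Full:
                # Shed: fail fast with a lightweight response instead of queueing more work.
                return "rejected - server busy, retry later"
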
    Appropriate DB and Storage Technology
    • SQL Provides more Data Consistency
    • No-SQL provides more Scalability - Horizontal Scaling
    • If the application does not require all features of an RDBMS and can live with 'Eventual Consistency', No-SQL is recommended for better Availability & Scalability
    Caching Implementation:
    • Having caching reduces load and thus increases Scalability and Availability - by reducing reliance on disk-based storage (see the sketch below)
    • Reduces Load on down-stream services, specifically DB calls
    • Increases Resiliency by Supporting Techniques like Graceful Degradation 
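
    To illustrate the caching point, a tiny read-through cache with a TTL in front of a (hypothetical) database call:

        import time

        _cache = {}                         # key -> (value, expiry timestamp)
        TTL_SECONDS = 60

        def load_from_db(key):
            # Placeholder for the real, slower database call.
            return {"key": key, "loaded_at": time.time()}

        def get(key):
            """Return the cached value if still fresh, otherwise reload and cache it."""
            hit = _cache.get(key)
            if hit and hit[1] > time.time():
                return hit[0]
            value = load_from_db(key)
            _cache[key] = (value, time.time() + TTL_SECONDS)
            return value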


    Application Design Practices:

    Below are a few parameters that are good to consider while designing an application:
    1. Single Responsibility 
    2. Open/Closed Principle
    3. Interface Segregation
    4. Deep Error Catching 
    5. Retry Logic - Reference: Polly Framework - https://www.pluralsight.com/blog/software-development/intro-to-polly
    6. Evaluate Dependencies and Dependency Failures
    7. Evaluate Scalability
      1. Within Limit
      2. Beyond Limit - Archival, Horizontal/Vertical Fragmentation 
    8. Content Delivery Network
    9. Implement Automatic Failover - 
      • Retry Logic, 
      • Circuit Breakers (stop making requests after a certain threshold of failures)
    10. High Cohesion & Loose Coupling
    11. Layered Design & Separation of Concern
    12. Implement Angular SPA & Avoid Round Trips
    13. Avoid Long Running Threads
    14. Avoid Unnecessary Exception 
    15. Prefer Stateless Services - Better Performance, No Server Affinity, More Scalable
    16. Keep Session Size Low
    17. Prefer No-SQL - To Enable Horizontal Scaling
    18. Build Instrumentation

    Hope this helps..

    Arun Manglick

    Wednesday, September 13, 2017

    Collection of Visio Shapes: UML

    Below is a collection of Visio shape types, their sub-types, and comments:


    Type: Business
    Sub-Types:
    - Brain Storming
    - Business Process
      o Arrow
      o Audit Diagram
      o Cause & Effect
      o Compliance Shapes
      o EPC Diagram
      o Fault Tree Analysis
      o ITIL
      o Six-Sigma
      o TQM
      o Value Stream Map
      o Workflow Diagram
    - Charts & Graphs
    - Organization Chart
    - Pivot Diagram

    Type: Engineering
    Sub-Types:
    - Electrical
    - Mechanical
    - Process
    Comments: Not Applicable

    Type: Flowchart
    Sub-Types:
    - SharePoint
    - Arrow Shapes
    - Basic Shapes
    - Cross Functional Shapes
    - Department
    - Workflow
    - Miscellaneous Flowchart
    - BPMN Basic Shapes
    Comments: No Sub Types. The Cross Functional Shapes are used for creating Swim Lane (Horizontal & Vertical) Visio designs; under the same, use 'Separators' for (Horizontal & Vertical) separation.

    Type: General
    Sub-Types:
    - Basic Flowchart
    - Blocks
    - Decorative
    - Graph & Math Shapes
    Comments: No Sub Types

    Type: Map-Floor Plans
    Sub-Types:
    - Building Plan
    - Map
    Comments: Not Applicable

    Type: Network
    Sub-Types:
    - Active Directory
    - Computers & Monitors
    - Detailed N/w Diagram
    - Exchange Objects
    - LDAP
    - Network & Peripherals
    - Network Locations
    - Network Symbols
    - Rack
    - Servers
    Comments: No Sub Types

    Type: Schedule
    Sub-Types:
    - Calendar
    - GANTT
    - PERT
    - Timeline
    Comments: No Sub Types

    Type: Software & DB
    Sub-Types:
    - Database
      o Chen Database
      o Crow's Database
      o IDEF1X Database
      o UML Database
    - Software
      o COM & OLE
      o Common Icons
      o Controls
      o Cursors
      o DFD Shapes
      o Dialogs
      o Enterprise Applications
      o Gane-Sarson
      o Language Level
      o Memory Objects
      o Toolbars
      o UML – Activity/Class/Sequence/State Diagram/Use Cases
      o Web & Media Icons
    - Web Diagram
      o Conceptual Websites
      o Website Maps

    Type: Visio Extras
    Sub-Types:
    - Annotations
    - Callout
    - Connectors
    - Custom Patterns
    - Dimensions
    - Drawing Tool
    - Embellishments
    - Icon Sets
    - Symbols (Traffic, Weather)
    - Title Blocks
    Comments: No Sub Types

    Hope this helps!!!

    Arun Manglick


    Monday, May 1, 2017

    ELK/EKK - AWS Implementation

    This article is about an ELK (a buzzword these days) implementation.

    The ELK stack consists of Elasticsearch, Logstash, and Kibana.

    Logstash is a tool for log data intake, processing, and output. This includes virtually any type of log that you manage: system logs, webserver logs, error logs, and app logs.
    Here in this post, Logstash will be replaced by AWS CloudWatch and AWS Kinesis Firehose.

    Elasticsearch is a NoSQL database based on the Lucene search engine and a popular open-source search and analytics engine. It is designed to be distributed across multiple nodes, enabling it to work with large datasets. It handles use cases such as log analytics, real-time application monitoring, clickstream analytics, and text search.
    Here in this post, the AWS Elasticsearch Service will be used for the Elasticsearch component.

    Kibana is your log-data dashboard. It’s a stylish interface for visualizing logs and other time-stamped data.
    It gives you a better grip on your large data stores with point-and-click pie charts, bar graphs, trendlines, maps, and scatter plots.

    First Implementation – ELK With CloudTrail/CloudWatch (as LogStash)

    We’ll list a few easy steps to do so:

    -          Go to AWS Elastic Search
    -          Create ES Domain – amelasticsearchdomain
    o   Set Access Policy to Allow All/Your Id

    -          Go to AWS CloudTrail Service
    -          Create Cloud Trail - amElasticSearchCloudTrail
    o   Create S3 Bucket – amelasticsearchbucket (Used to hold cloudtrail data)
    o   Create CloudWatch Group - amElasticSearchCloudWatchGroup
    o   In order to deliver CloudTrail events to the CloudWatch Logs log group, CloudTrail will assume a role with the below two permissions
    §  CreateLogStream: Create a CloudWatch Logs log stream in the CloudWatch Logs log group you specify
    §  PutLogEvents: Deliver CloudTrail events to the CloudWatch Logs log stream

    -          Go & Setup Cloud Watch,
    -          Select Group and Then Action to Stream data to Elastic Search Domain
    o   Create New Role - AM_lambda_elasticsearch_execution
    o   Create Lambda (Automatically) LogsToElasticsearch_amelasticsearchdomain - CloudWatch Logs uses Lambda to deliver log data to Amazon Elasticsearch Service / Amazon Elasticsearch Service Cluster.

    -          Go to Elastic Search
    o   Hit Kibana link
    o   On Kibana - Configure an index pattern


    Second Implementation – ELK With AWS KinesisFirehose/CloudWatch (as LogStash)

    We’ll list a few easy steps to do so:

    -          Go to AWS Elastic Search
    -          Create ES Domain - amelasticsearchdomain
    o   Set Access Policy to Allow All/Your Id
          
    -          Create Kinesis Firehose Delivery Stream - amelasticsearchkinesisfirehosestream
    o   Attach it to above ES Domain
    o   Create Lambda (Optional)  - amelasticsearchkinesisfirehoselambda
    o   Create S3 Bucket for Backup - amelasticsearchkinesisfirehosebucket
    o   Create Role - am_kinesisfirehose_delivery_role

    -          Create EC2 System - (To send log data to above configured Kinesis Firehose)
    o   This will use the 1995 NASA Apache logs (http://ita.ee.lbl.gov/html/contrib/NASA-HTTP.html) to feed into Kinesis Firehose.
    o   EC2 uses the Amazon Kinesis Agent to flow data from its file system into the Firehose stream.
    o   The Amazon Kinesis Agent is a standalone Java software application that offers an easy way to collect and send data to Amazon Kinesis and to Firehose.
                   
    - Steps:
           - Launch an EC2 Instance (t2.micro) running the Amazon Linux Amazon Machine Image (AMI)
           - PuTTY (SSH) into the instance
           - Install the Kinesis Agent - sudo yum install -y aws-kinesis-agent
           - Go to directory - /etc/aws-kinesis/
           - Open file - nano agent.json
           - Make sure it has this data:
                   {
                     "cloudwatch.emitMetrics": true,
                     "firehose.endpoint": "https://firehose.us-east-1.amazonaws.com",
                     "flows": [
                       {
                         "filePattern": "/tmp/mylog.txt",
                         "deliveryStream": "amelasticsearchkinesisfirehosestream",
                         "initialPosition": "START_OF_FILE"
                       }
                     ]
                   }
           - Now download the NASA access log file to your local desktop and upload it to S3
                           - URL - http://ita.ee.lbl.gov/html/contrib/NASA-HTTP.html
                           - File download - Jul 01 to Jul 31, ASCII format, 20.7 MB gzip compressed
                           - Unzip and upload this file to any S3 bucket (other than any used above)
                           - Make sure the file is public
                          
           - Again go to EC2 Putty 
                           - Go to directory - /etc/aws-kinesis/
                            - Download the file from S3 - wget https://s3-us-west-1.amazonaws.com/arunm/access_log_Jul95
                           - Concatenate this file to mylog.txt - cat access_log_Jul95 >> /tmp/mylog.txt
                          
           -  Again go to EC2 Putty
                            - Return to the home directory - cd ~
                           - Go to directory -  /var/log/aws-kinesis-agent/
                            - Monitor the agent’s log at /var/log/aws-kinesis-agent/aws-kinesis-agent.log.
                           - Open file - nano aws-kinesis-agent.log
                           - You’ll find log lines like : 2017-03-01 21:46:38.476+0000 ip-10-0-0-55 (Agent.MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Progress: 1891715 records parsed (205242369 bytes), and 1891715 records sent successfully to destinations. Uptime: 630024ms

    -          Create Kibana ( To Visualize data)                    
    o   Go to AWS Elasticsearch
    o   Click on link to Kibana
    o   The first thing you need to do is configure an index pattern. Use the index root you set when you created the Firehose stream (in our case, logs*).
    o   Kibana should recognize the logs indexes and let you set the Time-field name value. Firehose provides two possibilities:
    §  @timestamp – the time as recorded in the file
    §  @timestamp_utc – available when time zone information is present in the log data
    o   Choose either one, and you should see a summary of the fields detected.
    o   Select the Discover tab, and you’ll see a graph of events by time along with some expandable details for each event.
    o   As we are using the NASA dataset, we get a message that there are no results. That’s because the data is from way back in 1995.
    o   Expand the time selector in the top right of the Kibana dashboard and choose an absolute time range. Pick a start of June 30, 1995, and an end of August 1, 1995, and the 1995 events will appear (a quick query-based check is also sketched below).
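
    If you want to verify ingestion outside Kibana, you can also query the domain directly. A rough sketch (assuming the domain's access policy allows your IP; the endpoint below is a hypothetical placeholder, and logs* is the index root from the Firehose setup):

        import json
        import urllib.request

        # Hypothetical domain endpoint - replace with your own Elasticsearch endpoint.
        ES_ENDPOINT = "https://search-amelasticsearchdomain-xxxx.us-east-1.es.amazonaws.com"

        # Look for one event in the 1995 NASA log time range.
        query = {
            "query": {
                "range": {"@timestamp": {"gte": "1995-06-30", "lte": "1995-08-01"}}
            },
            "size": 1,
        }

        req = urllib.request.Request(
            ES_ENDPOINT + "/logs*/_search",
            data=json.dumps(query).encode(),
            headers={"Content-Type": "application/json"},
        )
        print(urllib.request.urlopen(req).read().decode())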

    Hope this helps.

    Regards,
    Arun Manglick