Reference Architecture

This is a reference architecture for a generic interface to Amazon CloudSearch on AWS, with a broker in any Lambda-supported runtime; for this particular implementation I chose Node.js. Any client request is authorized with an API key and hits Amazon API Gateway, which in turn invokes the Lambda function. Inside this function, the code does the necessary normalization and passes the request on to Amazon CloudSearch, and any response is reformatted to fit an Amazon API Gateway response. Along with this functionality, the Lambda broker writes a human-readable version of the request into Amazon CloudWatch with simple console.log calls, using the request method as verb keywords, the sort direction, a prefix of JSON property names, and so on. I have tried to make it as generic as possible.
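To make the flow concrete, here is a minimal sketch of such a broker handler using the AWS SDK for JavaScript v3. This is not the actual implementation; the endpoint variable, field names, and log format are assumptions for illustration.

```javascript
// Hypothetical broker handler sketch; endpoint, fields and log format are
// illustrative assumptions, not the original code.
import {
  CloudSearchDomainClient,
  SearchCommand,
  UploadDocumentsCommand,
} from "@aws-sdk/client-cloudsearch-domain";

const client = new CloudSearchDomainClient({
  endpoint: process.env.SEARCH_ENDPOINT, // the CloudSearch domain endpoint
});

export const handler = async (event) => {
  const method = event.httpMethod;
  const params = event.queryStringParameters ?? {};

  if (method === "GET") {
    // GET fetches records: normalize the query string into a CloudSearch search.
    const sort = params.sort ?? "_score desc";
    console.log(`SEARCH q=${params.q ?? "*"} SORT ${sort}`); // human-readable trace for CloudWatch
    const res = await client.send(new SearchCommand({
      query: params.q ?? "matchall",
      queryParser: params.q ? "simple" : "structured",
      sort,
      size: 20,
    }));
    return { statusCode: 200, body: JSON.stringify(res.hits) };
  }

  // POST creates, PUT updates (re-adds), DELETE removes a referenced record.
  const body = event.body ? JSON.parse(event.body) : {};
  const verb = { POST: "CREATE", PUT: "UPDATE", DELETE: "DELETE" }[method];
  console.log(`${verb} id=${body.id}`); // human-readable trace for CloudWatch
  const doc = method === "DELETE"
    ? { type: "delete", id: body.id }
    : { type: "add", id: body.id, fields: body.fields };
  await client.send(new UploadDocumentsCommand({
    contentType: "application/json",
    documents: JSON.stringify([doc]),
  }));
  return { statusCode: 202, body: JSON.stringify({ id: body.id }) };
};
```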

An Amazon EventBridge scheduler triggers another Lambda that analyzes these human-readable messages and tries to detect any missing indexes, which are auto-created in CloudSearch and recorded in a config file on Amazon S3. A lot of production testing and fine-tuning is still pending, along with the necessary documentation and an AWS SAM template to deploy the whole thing. As of now this is just a blueprint; the components are lying in different locations and need orchestration, and there are no plans to publish this in a public repository. But anyone who wants to adopt the design is free to pick it up and build it on their own, without any commitment to me. With these self-learning capabilities, the system could be used by many applications, even those that already depend on some kind of clumsy custom backend.
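As a rough illustration of the self-learning part, the scheduled analyzer could look something like the sketch below. The log group, domain, bucket, and the "SORT" log convention are all assumptions tied to the broker sketch above, not the actual code.

```javascript
// Hypothetical analyzer Lambda, scheduled by EventBridge; names and the log
// format are illustrative assumptions.
import { CloudWatchLogsClient, FilterLogEventsCommand } from "@aws-sdk/client-cloudwatch-logs";
import { CloudSearchClient, DescribeIndexFieldsCommand, DefineIndexFieldCommand } from "@aws-sdk/client-cloudsearch";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const logs = new CloudWatchLogsClient({});
const search = new CloudSearchClient({});
const s3 = new S3Client({});

export const handler = async () => {
  // 1. Collect the broker's human-readable traces from the last hour.
  const { events = [] } = await logs.send(new FilterLogEventsCommand({
    logGroupName: process.env.BROKER_LOG_GROUP,
    filterPattern: '"SORT"',
    startTime: Date.now() - 3600_000,
  }));

  // 2. Extract the field names that queries tried to sort on.
  const wanted = new Set();
  for (const e of events) {
    const m = /SORT (\w+)/.exec(e.message ?? "");
    if (m) wanted.add(m[1]);
  }

  // 3. Diff against the index fields already defined on the domain.
  const domain = process.env.SEARCH_DOMAIN;
  const { IndexFields = [] } = await search.send(new DescribeIndexFieldsCommand({ DomainName: domain }));
  const existing = new Set(IndexFields.map((f) => f.Options.IndexFieldName));

  const created = [];
  for (const name of wanted) {
    if (existing.has(name) || name === "_score") continue;
    await search.send(new DefineIndexFieldCommand({
      DomainName: domain,
      IndexField: { IndexFieldName: name, IndexFieldType: "text" },
    }));
    created.push(name);
  }

  // 4. Record the current field map in the S3 config file.
  await s3.send(new PutObjectCommand({
    Bucket: process.env.CONFIG_BUCKET,
    Key: "search-config.json",
    Body: JSON.stringify({ fields: [...existing, ...created] }, null, 2),
  }));
};
```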

A few real-world use cases could be community member databases, hospital patient records, pet shops, and many more. Generally, the request methods work as follows: POST creates a new record, PUT updates a record, DELETE deletes (or trashes) a referenced record, and GET fetches records. With proper documentation, the features can be defined as the client software is designed and developed.

The reference architecture drawing is attached here; it reflects just my thoughts. Please share your feedback if you think this is good enough.

Cloud Migration – A Thought Process

Everybody is running after cloud migration, and most get stuck at one stage or another, unless their product or application is still in black and white in some notebooks, or just a wireframe that has to be built from the ground up. If you are going to build a new application, it can be designed to take full advantage of the cloud by combining multiple microservices, leaving more time and resources to shape the application into a perfectly usable solution. Here, though, we are considering the migration of existing applications to the cloud. The development language, the database, and the design approach of the whole application should all be considered when thinking about migration. In other words, migration to the cloud should be considered on a case-by-case basis, and there is no storyboard that fits all use cases.

Continue reading “Cloud Migration – A Thought Process”

Architecture in a Serverless Mindset

Consider designing a simple serverless system to process orders within an e-commerce workflow. Below is an architecture for a REST microservice that is simple to implement.

Simple REST Architecture

An order is processed by this e-commerce workflow as follows.

  1. Amazon API Gateway handles requests and responses to those API calls.
  2. Lambda contains the business logic to process the calls.
  3. Amazon DynamoDB provides persistent JSON document storage.
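As a sketch of this synchronous path, the business logic in AWS Lambda could look like the snippet below; the Orders table and field names are assumptions for illustration, using an API Gateway proxy integration.

```javascript
// A minimal sketch of the synchronous order handler; table and field names
// are illustrative assumptions.
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand } from "@aws-sdk/lib-dynamodb";
import { randomUUID } from "node:crypto";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

export const handler = async (event) => {
  const order = JSON.parse(event.body ?? "{}");
  const item = { orderId: randomUUID(), ...order, createdAt: new Date().toISOString() };

  // API Gateway waits synchronously on this call; any slowness here
  // surfaces directly as client-facing latency.
  await ddb.send(new PutCommand({ TableName: process.env.ORDERS_TABLE, Item: item }));

  return { statusCode: 201, body: JSON.stringify(item) };
};
```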

Though this is simple to implement, it can cause bottlenecks and failures, resulting in frustrated clients at the web front end. Analyze the flow and see the possible failure points. Amazon API Gateway integrates with AWS Lambda through a synchronous invocation and expects AWS Lambda to respond within its 29-second integration timeout. As long as that happens, all is well and good. But what if a promo gets shared over social media and a very large number of users pile up with orders? Scaling is built into the AWS services, but throttling limits can still be reached.

The configuration of Amazon DynamoDB matters too, where capacity settings play a big role. AWS Lambda throttling, as well as concurrency limits, can also create failures. Large dynamic libraries that require initialization time increase the cold-start time and eventually the latency of AWS Lambda, and the request could be lost to the HTTP request timeout of Amazon API Gateway. Digging deeper into the system, the business logic could have complications; if one request cannot be processed, the custom code in AWS Lambda could fail without any trace of the request saved to persistent storage. Considering all these factors, as well as suggestions by veterans in this field, the architecture can be expanded to something like the one below.

Revised Order Processing Architecture

What does the revision change, and what advantages do the additional components provide? Let's discuss that now.

  • Order information comes in through an API call over HTTP to Amazon API Gateway
  • AWS Lambda validates the request and pushes it into Amazon Simple Queue Service (SQS)
  • SQS integrates with AWS Lambda asynchronously; automatic retries for failed requests, as well as Dead Letter Queues (left out of the illustration), help with resilience
  • Requests processed by the business logic are stored in DynamoDB (see the consumer sketch below)
  • DynamoDB Streams can trigger another AWS Lambda that notifies Customer Support about the order through SNS
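To illustrate the asynchronous stage, here is a minimal sketch of the queue consumer, assuming the SQS event source mapping has ReportBatchItemFailures enabled; the table name and environment variables are illustrative, not from the original post.

```javascript
// A sketch of the SQS-driven consumer with a partial batch response;
// table and variable names are illustrative assumptions.
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

export const handler = async (event) => {
  const batchItemFailures = [];

  for (const record of event.Records) {
    try {
      const order = JSON.parse(record.body);
      await ddb.send(new PutCommand({ TableName: process.env.ORDERS_TABLE, Item: order }));
    } catch (err) {
      // Failed messages return to the queue and, after maxReceiveCount, to the DLQ.
      console.error("order failed", record.messageId, err);
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }

  // Partial batch response: only the failed messages are retried.
  return { batchItemFailures };
};
```

The notification step from the last bullet could be sketched as another Lambda on the table's stream, again with assumed names:

```javascript
// A sketch of the DynamoDB Streams consumer that notifies Customer Support;
// the topic ARN and attribute names are illustrative assumptions.
import { SNSClient, PublishCommand } from "@aws-sdk/client-sns";
import { unmarshall } from "@aws-sdk/util-dynamodb";

const sns = new SNSClient({});

export const handler = async (event) => {
  for (const record of event.Records) {
    if (record.eventName !== "INSERT") continue; // only brand-new orders
    const order = unmarshall(record.dynamodb.NewImage);
    await sns.send(new PublishCommand({
      TopicArn: process.env.SUPPORT_TOPIC_ARN,
      Subject: `New order ${order.orderId}`,
      Message: JSON.stringify(order, null, 2),
    }));
  }
};
```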

Digging more into the illustrations and explanations, there is more to be done to make this a full production-ready blueprint; let's leave those thoughts to upcoming serverless enthusiasts.

Conclusion

I strongly believe that I have been loyal to the core thoughts of being in a serverless mindset. Further cost optimization and scaling can be addressed with Savings Plans, AWS Lambda provisioned concurrency, the Amazon DynamoDB on-demand capacity setting, and making sure to optimize the business logic and reduce latency.

Rearchitecting an Old Solution

It was in 2016 that a friend approached me with a requirement for a solution. They were receiving high-resolution video files on an FTP server maintained by the media supplier. They had created an in-house, locally hosted solution for news operators to preview these video files and decide where to attach them. They were starting to spread their news operations desk across multiple cities and wanted to migrate the solution to the cloud, which they did promptly by lift and shift: the whole solution was reconfigured on a large EC2 instance with custom scripts that automatically checked the FTP location and copied any new media to local folders. When I was approached, they were experiencing sluggish streaming from the hosted EC2 instance, as the media files were being accessed from different cities at the same time. Also, the full high-definition videos had to be downloaded for preview. They wanted to optimize bandwidth utilization and improve the operators' response times.

Continue reading “Rearchitecting an Old Solution”

Generic Reference Architecture for massive Data Lake

Organised data archival and analysis for prediction and improved productivity is a challenge for many industries. The architecture referenced here can be used for live data ingestion and breakdown into reports, with machine-learning capabilities. For a traditional customer-premises system, migrating into this massive-scale data storage and processing architecture requires some modifications, which will also be explored as case studies.

Reference Architecture

Though the ETL edge is depicted with an AWS Snowball Edge device, it could even be swapped for IoT sensors and scanners polled through a Raspberry Pi that runs the AWS IoT Core libraries or utilizes MQTT message brokering to dynamically send telemetry data, to be ingested into the warehouse and processed by machine learning in Amazon SageMaker. This is just a reference architecture and requires further polishing to adapt it to real-world situations. We can explore some use cases.
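As a small illustration of that edge swap, a Raspberry Pi publisher using the aws-iot-device-sdk package might look like the sketch below; the endpoint, topic, certificate paths, and the readSensor helper are all assumptions for illustration.

```javascript
// A minimal telemetry publisher sketch for the Raspberry Pi edge; endpoint,
// topic, certificate paths and the sensor read are illustrative assumptions.
const awsIot = require("aws-iot-device-sdk");

const device = awsIot.device({
  keyPath: "certs/private.pem.key",
  certPath: "certs/certificate.pem.crt",
  caPath: "certs/AmazonRootCA1.pem",
  clientId: "edge-sensor-01",
  host: process.env.IOT_ENDPOINT, // e.g. xxxxxxxx-ats.iot.<region>.amazonaws.com
});

device.on("connect", () => {
  // Publish a reading every 30 seconds over MQTT; IoT Core rules can route
  // these messages onward to the data lake for SageMaker processing.
  setInterval(() => {
    device.publish(
      "datalake/telemetry",
      JSON.stringify({ sensor: "edge-sensor-01", reading: readSensor(), at: Date.now() })
    );
  }, 30_000);
});

// Hypothetical sensor read; replace with the actual scanner/sensor integration.
function readSensor() {
  return Math.random() * 100;
}
```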

Continue reading “Generic Reference Architecture for massive Data Lake”

Hybrid streaming sessions – free and open

Wondering what the title means?

I have delivered, and also organised, a lot of free public streaming sessions. Most recently I delivered a session at a hybrid technical event: out of the five total sessions, two were remote and the others in person. This article outlines my approach to a very low-cost solution for handling the same situation. To summarize, the options are as follows; I will go into detail about each of the functions and steps as the article proceeds.

  • Live Stream (Linkedin, YouTube, Twitch and endless others)
  • Amazon CloudFront and AWS Media Services
  • OBS Studio (Free broadcaster)
  • BigBlueButton (FOSS video conferencing solution)
  • AWS EC2 / Fargate (Hosting of BigBlueButton)
  • AWS Route53 (DNS and Domain)
  • AWS CloudFront (Low latency endpoints)
  • LetsEncrypt (Free SSL)
  • DroidCam (Use mobile as a Camera for OBS)

There are multiple documents with detailed setup information for almost all the tools referenced, available on the internet. I followed several of them to configure the suite, and I am almost there; only setting up my studio room is pending, which should be completed by early 2023, when I can start some live video sessions on YouTube or LinkedIn.

OBS Studio can be installed as per the Install Instructions KB article on the OBS Project site. Live streaming to AWS Media Services can be configured by referring to the blog article Connecting OBS Studio to AWS Media Services in the Cloud, written by Dan Gehred and Steve Ward. The article How to broadcast to LinkedIn with OBS on Restream is a good read and a reference for configuring OBS broadcasting to LinkedIn.

BigBlueButton is an open-source video conferencing system that supports various audio and video formats and allows the use of integrated video-, screen-, and document-sharing functions. BigBlueButton has features for multi-user whiteboards, breakout rooms, public and private chats, polling, moderation, emojis, and raised hands. The blog post How to build a scalable BigBlueButton video conference solution on AWS, by David Surrey and Bastian Klein, gives full instructions on how the BigBlueButton video conferencing solution can be installed and run on AWS EC2. The authors explain how AWS customers who are looking for a self-managed, open-source video conference solution can leverage AWS to build and deploy a scalable BigBlueButton setup, and how to use the necessary scripts and stack templates.

There is much more to explore on this basis, but I am satisfied with what I already have, and I will try to bring out a full demo or video tutorial on how I configured the whole thing at a later stage. Between my commitments at work, the vacations, and the mundane activities at my #organic farm, the start of 2023 will be a bit too busy, as I will be restructuring my workshop room into a serviceable studio.

AWS for Software Testing Professionals

Software testing professionals should know about the services and facilities AWS provides for automating and integrating testing and quality control into continuous integration pipelines. This is where QA/QC has to work hand in hand with DevOps. Though it sounds complicated and scary, knowledge of certain items makes it wonderful and easy. Let us dig into those facilities and suggested practices.

  • Amazon EC2
  • Amazon CloudWatch
  • Amazon SNS
  • Amazon Inspector
  • AWS Device Farm
  • AWS Cloud9
  • Script Suites by Third-party Vendors
Continue reading “AWS for Software Testing Professionals”

Complete Managed Development Environment on AWS

Amazon CodeCatalyst is a unified software development service. It was only a few days back that I suggested "Run your Development Environment on Cloud", and, as though our dear fellows at AWS had heard my thoughts, the preview of Amazon CodeCatalyst was announced two days before this post.

As we go through the explanation and blog post, we find the features really intriguing and exciting. Well, I did give the preview a run-through, and I found that this could change the way we work. At least it changed the way I work, though not for my full-time job, as that would violate compliance requirements. Mostly I would use it in my leisure time, for my commitments to FOSS and my GitHub presence.

Project templates, or blueprints as they call them, do help in fast-tracking the initial development phase and creating a boilerplate to start working from. An on-demand development environment hosted on the AWS cloud; automated CI/CD pipelines with a multitude of options and drag-and-drop building; the browser-based IDE Cloud9 with terminal access to a development instance running Amazon Linux 2 (which is based on CentOS); and the ability to invite collaborators across the globe to inspect your code with just a few clicks: these are just a few of the facilities of this unified development environment as a service.

I am still very much excited to dig into this service, and I will go further into it and maybe come out with something like a session for awsugtvm very soon, as time and health permit. Last month I was bedridden after a bike accident involving a stray dog.