Providing guidance to operations team based on event meta-data
Publication Date: 2017-Aug-04
The IP.com Prior Art Database
One of the driving forces for building a microservices architecture is often the isolation of components: loosely coupled services, each with a discrete responsibility, work together, enable technology diversity, and allow code changes to reach production faster without redeploying the whole system:
http://eugenedvorkin.com/seven-micro-services-architecture-advantages/
A common design pattern for communication between microservices is the circuit breaker, where a request is retried several times, potentially with incremental back-off, before the circuit breaker cuts off requests for a period of time in an attempt to save system resources:
http://martinfowler.com/bliki/CircuitBreaker.html
In a microservices architecture it is not uncommon to have a chain of services calling each other. If a service at the end of the chain is not responding, or is responding with errors, the chaining can multiply the retries and waste system resources. A solution is required to warn operations of services that are failing to meet anticipated SLAs.
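The multiplying effect of chained retries can be made concrete with a small calculation. The sketch below is illustrative only: it assumes every caller in the chain retries independently with the same number of attempts and that the last service always fails. The function names are hypothetical and not part of the disclosure.

```python
def requests_reaching_tail(hops: int, attempts_per_hop: int) -> int:
    """Closed-form count of requests hitting the last service in the chain.

    Assumption: each of the `hops` callers makes `attempts_per_hop`
    attempts on failure, with no shared retry budget.
    """
    return attempts_per_hop ** hops


def simulate(hops: int, attempts_per_hop: int) -> int:
    """Walk the call chain and count the requests the failing tail receives."""
    calls_at_tail = 0

    def call(remaining_hops: int) -> bool:
        nonlocal calls_at_tail
        if remaining_hops == 0:
            calls_at_tail += 1
            return False  # the tail service always fails in this sketch
        for _ in range(attempts_per_hop):
            if call(remaining_hops - 1):
                return True
        return False  # every attempt failed, so report failure upstream

    call(hops)
    return calls_at_tail


if __name__ == "__main__":
    # Two hops (A -> B -> C), three attempts per hop.
    print(simulate(2, 3), requests_reaching_tail(2, 3))
```

Under these assumptions, one client request through a two-hop chain with three attempts per hop results in nine requests to the failing service, and each additional hop multiplies that figure by three.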
Circuit Breakers: http://martinfowler.com/bliki/CircuitBreaker.html - Describes a retry and back-off mechanism, followed by a set period during which requests are blocked to save resources.
Dynamic Timeouts in SOA: https://www.ibm.com/developerworks/websphere/techjournal/0909_tost/0909_tost.html - Describes automatically adjusting the time-outs of upstream components when a new downstream component is added. It does not cover context passed in with the request, nor adjusting back-off logic dynamically based on the incoming context of the requesting client.
The proposed solution is for a calling service to include a required or anticipated response time in a header field when calling a downstream service. This value is then used to conditionally generate an alert, displayed on an operations dashboard, if the response time exceeds, or comes within a given threshold of, the required or anticipated response time.
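The alerting rule described above might be sketched as follows. This is a minimal illustration, not the disclosed implementation: the header name X-Expected-Response-Ms, the Alert type, the check_response_time function, and the 80% warning fraction are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical header name; the disclosure does not specify one.
EXPECTED_HEADER = "X-Expected-Response-Ms"


@dataclass
class Alert:
    service: str
    level: str          # "warning" (near the limit) or "breach" (over it)
    expected_ms: float
    actual_ms: float


def check_response_time(
    service: str,
    headers: Mapping[str, str],
    actual_ms: float,
    warn_fraction: float = 0.8,  # assumed threshold: warn at 80% of the limit
) -> Optional[Alert]:
    """Return an Alert for the dashboard, or None if no alert is needed.

    Note: a plain mapping lookup is case-sensitive; a real HTTP stack
    would treat header names case-insensitively.
    """
    raw = headers.get(EXPECTED_HEADER)
    if raw is None:
        return None  # caller stated no expectation, so nothing to compare
    try:
        expected_ms = float(raw)
    except ValueError:
        return None  # ignore malformed header values in this sketch
    if expected_ms <= 0:
        return None
    if actual_ms > expected_ms:
        return Alert(service, "breach", expected_ms, actual_ms)
    if actual_ms >= warn_fraction * expected_ms:
        return Alert(service, "warning", expected_ms, actual_ms)
    return None


if __name__ == "__main__":
    headers = {EXPECTED_HEADER: "200"}
    for elapsed in (100.0, 170.0, 250.0):
        print(elapsed, check_response_time("Service C", headers, elapsed))
```

Keeping the comparison in a pure function separates the threshold logic from the transport and from whichever mechanism publishes alerts to the operations dashboard.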
The following example illustrates how the system would work for a scenario where a microservices solution consists of Service A, Service B and Service C where Service B exposes two operations "getOperation1" and "getOperation2". The "getOperation1" operation is serviced within Service B, but the "getOperation2" requires Service B to send a request to Service C over HTTP transport prior to responding to the service consumer. In this example Service A is the service consumer of Service B and all micros...