Redefining the IT war room with end-to-end observability


Transforming the war room starts with Customer Digital Experience Monitoring (CDEM), which breaks down silos by giving teams correlated, cross-domain insights so they can resolve issues quickly and efficiently.

Time is money, and commandeering a lot of time from many of the smartest and most expensive people across your organization, often at short notice, can be unthinkably expensive.

There’s the hourly cost of their time, plus the opportunity cost of the work they had to put aside, which is now delayed. That’s far from the full story, though. The costs extend far beyond each person’s own input, because everybody needs time to speak, listen, consider, and work through the possibilities.

And yet, when a new software release rolls around, that’s exactly how many organizations respond. They can’t be sure what might go wrong with a software release, so they make sure all the right people are available, just in case.

When it’s obvious that something is going wrong in the application runtime environment, or a mission-critical application starts to experience performance problems, and it needs to be fixed immediately, that same wide group is gathered to figure out the problem and determine the best way to fix it.

Meanwhile, reputational damage to the company is growing with every minute of disruption, and the financial clock is ticking with each minute spent identifying and remediating issues while customers and end users have limited or no access to the applications that make modern business work.

The war room is a blunt instrument that casts a wide net 

The need to convene an IT war room is born of a lack of visibility. The team must leverage their collective expertise to determine the likely root cause of a performance-impacting issue, because it’s typically not obvious to anyone at the outset exactly where the problem lies.

The time required to pinpoint the issue can be significant, even when the war room is filled with skilled, intelligent subject matter experts. That’s because modern applications are built on cloud-native architectures and can be accessed from anywhere using different devices. They leverage packaged code and dependencies deployed as microservices to increase developer speed and flexibility.

These include containers, third-party libraries, and application programming interfaces (APIs), which together create a complicated environment in which updates, changes, and conflicts between dependencies need to be constantly managed to ensure applications run optimally. If the application slows down, doesn’t work as it should, or crashes, the result is a poor user experience and even lost business.

Application dependencies can also affect the security of an application. This is particularly true when an application depends on third-party code or libraries that could contain vulnerabilities offering an attack path. That puts not only the application, but also user data, at risk.

For example, misconfiguration, ransomware, and distributed denial-of-service (DDoS) attacks can all degrade performance in ways that look confusingly similar to network packet loss, with no clear indication of the root cause.

Consider the scenario of a large supermarket at the height of holiday season shopping. Products are flying off the shelves and need frequent restocking throughout the day. It’s critical to know inventory availability right up to the minute, so shelves remain full. Inaccurate inventory or running out of stock undermines the trust the business has worked hard to build, not to mention costing sales.

At that point, the hand scanners used for inventory start to falter. They’re not reliably scanning, which means the movement of products from the stockroom onto the shelves isn’t being recorded accurately. The team can no longer be sure what’s on the shelves, what’s left in the stockroom, what needs to be reordered, and when it needs to arrive.

A call is made to the IT team and a war room is convened to investigate what’s causing the problem. The Wi-Fi network is an obvious culprit, but as time passes, the networking team can’t find any Wi-Fi problems. Eventually, they realize it’s the scanner firmware. The scanners themselves need to be replaced, and once they are, normal service is resumed.

Customer Digital Experience Monitoring (CDEM) changes everything  

This story is one of many that illustrate the shortcomings of infrastructure monitoring that lacks visibility into the digital experience.

In this example, the war room participants must work through the scanner’s dependencies one by one, relying on their collective experience to spot the most likely culprit in the least amount of time. The effort involves cross-functional teams, each investigating its own area of responsibility, so a similar level of effort and time is required from everyone. The result is that most teams can typically prove their “innocence”; that is, they can show that their area of responsibility does not harbor the root cause.

In effect, because they lack clear insight, each team spends a huge amount of expensive time looking for an issue that isn’t theirs to find. There’s a better way. Cisco Full-Stack Observability allows operational teams to completely change their troubleshooting perspective.

Customer Digital Experience Monitoring (CDEM), a capability of Cisco Full-Stack Observability (FSO) solutions, allows teams to track the user journey itself, starting with the device and traversing every touchpoint, including dependencies like APIs and microservices.

Had they used CDEM, the teams in our example would have seen the user journey failing at the first step. Eliminating their presumed culprit, the Wi-Fi network, would have taken just moments instead of hours, and attention would have immediately focused on the scanners themselves.
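To make the idea concrete, here is a minimal sketch of what "seeing the journey fail at the first step" could look like in code. It is purely illustrative and is not Cisco's API: the `StepResult` type, the `first_failing_step` function, and the touchpoint names are all hypothetical, and it assumes each touchpoint reports an independent pass/fail health check.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class StepResult:
    """One touchpoint on a user journey (hypothetical structure)."""
    name: str  # illustrative touchpoint label, not a real product identifier
    ok: bool   # whether the independent health check at this touchpoint passed


def first_failing_step(journey: Sequence[StepResult]) -> Optional[str]:
    """Return the earliest failing touchpoint in journey order, or None if all pass."""
    for step in journey:
        if not step.ok:
            return step.name
    return None


# The supermarket scenario, ordered from the device outward (assumed ordering).
journey = [
    StepResult("scanner_device", ok=False),
    StepResult("wifi_network", ok=True),
    StepResult("inventory_api", ok=True),
    StepResult("inventory_database", ok=True),
]

print(first_failing_step(journey))
```

With the journey ordered from the device outward, the first failure points at the scanner and the Wi-Fi check is already shown as healthy, which is the shortcut the war room in the example did not have.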

It’s easy to see how observability at this level fundamentally changes the IT war room and dramatically reduces mean time to resolution (MTTR) by bypassing many of the steps that teams would otherwise have to take.

Answers lie in observable telemetry data 

War rooms are complicated by multiple different data sets surfaced by separate monitoring tools. For example, Network Ops looks at data from the network, while DevSecOps looks at data from the application and third-party dependencies.

Achieving a complete view of all relevant application data from normal business operations is a massive task. Worse yet, disparate tools and systems that were never designed for the job cannot correlate these endless streams of incoming data within a workable timeframe. That makes spotting anomalies across the full stack, let alone prioritizing and acting on them, virtually impossible.

Cisco Full-Stack Observability solutions democratize data access, breaking down cross-functional silos and bringing teams together to collaborate on the next best step for resolving problems. Customer Digital Experience Monitoring combines Cisco’s application observability capabilities with industry-leading network intelligence, allowing IT teams to quickly identify the root cause of issues before they hurt the overall performance of the application, the end user, and ultimately the business.

Cisco’s solution provides insights into both the application and the network, with internet connectivity metrics for application operations and real-time application dependency mapping for network operations. This combined application and network view significantly reduces MTTR with actionable recommendations that help teams prioritize remediation activities based on business impact and criticality.

For instance, teams can see at which point along the user’s path performance is degrading or communication is failing altogether. Vitally, they have contextual visibility that helps them collaboratively identify, triage, and resolve issues because they’re all working from the same data sourced from every possible touchpoint, including the network, which is an area often missing from other solutions.
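As a rough illustration of spotting where along the path performance degrades, the sketch below compares observed per-hop latency against a baseline. Everything in it is an assumption made for the example: the `degraded_hops` function, the hop names, the latency figures, and the 2x threshold are hypothetical and do not describe how any Cisco product computes degradation.

```python
from typing import Dict, List


def degraded_hops(
    baseline_ms: Dict[str, float],
    observed_ms: Dict[str, float],
    threshold: float = 2.0,  # assumed cutoff: flag hops slower than 2x baseline
) -> List[str]:
    """Return hops, in journey order, whose observed latency exceeds threshold x baseline."""
    return [
        hop
        for hop, base in baseline_ms.items()
        if hop in observed_ms and observed_ms[hop] > threshold * base
    ]


# Hypothetical latencies in milliseconds, keyed in journey order.
baseline_ms = {"device": 20.0, "wifi_network": 5.0, "inventory_api": 40.0, "database": 15.0}
observed_ms = {"device": 21.0, "wifi_network": 6.0, "inventory_api": 130.0, "database": 16.0}

print(degraded_hops(baseline_ms, observed_ms))
```

Because every team reads the same per-hop comparison, the conversation can start at the flagged hop rather than with each team checking its own tooling first.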

The result is the end of war rooms as we know them. Instead, teams have end-to-end visibility, correlated insights, and recommended actions all tied to business context, across applications, security, the network, and the internet. Only Cisco combines the vantage points of applications, networking, and security at scale to power true observability over the entire IT estate.
