Operation-excellence-improvement_686083916

Introduction

This page tracks all scenarios that require operational excellence improvements.

Detailed scope

  1. Any critical issue can be tracked/reported
    1. OpsB or PCS
  2. Any critical issue will trigger automatic collection of diagnostics
    1. Log
    2. Thread dump
    3. Worker sysctl log
    4. Flamegraph?
  3. Any critical issue can be well-assigned
    1. TBD
  4. Any critical issue can be auto analyzed
    1. TBD
  5. Any critical issue can be auto mitigated
    1. Auto-healing expansion
      1. Optimization ⇒ platform always needs to be restarted
      2. XMPP
      3. Other candidates
      4. XIE?
    2. Scheduled scaling
    3. Auto tuning
  6. Product readiness
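
As an illustration of scope item 2 (automatic collection), here is a minimal sketch of a collector runner. It is not an existing implementation: `CollectionResult`, `run_command`, and `collect_all` are hypothetical names, and the collectors in the demo are placeholders for the real log, thread dump, sysctl, and flamegraph sources. The one design point it shows is that each collector runs independently, so a failure in one (for example, the thread dump) does not prevent the others from being captured.

```python
import subprocess
import sys
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CollectionResult:
    """Outcome of one collection run (hypothetical structure)."""
    artifacts: Dict[str, str] = field(default_factory=dict)  # name -> captured text
    errors: Dict[str, str] = field(default_factory=dict)     # name -> failure reason


def run_command(argv: List[str], timeout: int = 30) -> str:
    """Run a command and return its stdout; raises on non-zero exit or timeout."""
    done = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=True
    )
    return done.stdout


def collect_all(collectors: Dict[str, Callable[[], str]]) -> CollectionResult:
    """Run every collector; record failures instead of aborting the run."""
    result = CollectionResult()
    for name, collect in collectors.items():
        try:
            result.artifacts[name] = collect()
        except Exception as exc:  # one failing collector must not block the others
            result.errors[name] = f"{type(exc).__name__}: {exc}"
    return result


if __name__ == "__main__":
    # Placeholder collectors; the real sources are assumptions, not defined on this page.
    demo = collect_all({
        "worker_sysctl": lambda: run_command(["sysctl", "-a"]),
        "runtime_version": lambda: run_command([sys.executable, "--version"]),
    })
    print("collected:", sorted(demo.artifacts))
    print("failed:", demo.errors)
```

Keeping failures in a separate `errors` map means the incident report can show which diagnostics are missing and why, which also feeds into the auto-analysis item once it is defined.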