Incidents

Incidents in PixoMonitor track service disruptions from detection to resolution. They provide a timeline of what happened, who was involved, and how long it took to resolve. Incidents can be created automatically by monitors or manually when you're aware of an issue.

Overview

Every incident tracks:

  • Title — Description of the incident
  • Status — Current state (Investigating, Identified, Monitoring, Resolved)
  • Severity — Impact level (Critical, High, Medium, Low)
  • Timeline — When it started and when it was resolved
  • Acknowledgement — Who responded and when

Automatic Incidents

When a monitor detects a failure (status DOWN), PixoMonitor automatically:

  1. Creates a new incident linked to the monitor
  2. Sets status to "Investigating"
  3. Triggers alerts through configured channels
  4. Starts the escalation policy (if configured)

When the monitor recovers (status UP):

  1. Updates incident status to "Resolved"
  2. Records the resolution time
  3. Notifies subscribers of the recovery

Automatic incidents use the monitor name as the incident title. You can edit the title and add details at any time.
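The DOWN and UP behavior described above can be sketched as a small in-memory model. This is an illustration only, not PixoMonitor's implementation: the `Incident` and `IncidentTracker` names, the notification strings, and the rule of keeping at most one open incident per monitor are all assumptions, and escalation policies are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class Incident:
    title: str
    monitor: str
    started_at: datetime
    status: str = "Investigating"
    resolved_at: Optional[datetime] = None


class IncidentTracker:
    """Hypothetical model of automatic incident creation and resolution."""

    def __init__(self) -> None:
        self.open_by_monitor: Dict[str, Incident] = {}
        self.notifications: List[str] = []

    def on_monitor_status(self, monitor: str, status: str, at: datetime) -> Optional[Incident]:
        if status == "DOWN" and monitor not in self.open_by_monitor:
            # New failure: the incident title defaults to the monitor name.
            incident = Incident(title=monitor, monitor=monitor, started_at=at)
            self.open_by_monitor[monitor] = incident
            self.notifications.append(f"alert: {monitor} is DOWN")
            return incident
        if status == "UP" and monitor in self.open_by_monitor:
            # Recovery: resolve the open incident at the recovery time.
            incident = self.open_by_monitor.pop(monitor)
            incident.status = "Resolved"
            incident.resolved_at = at
            self.notifications.append(f"recovery: {monitor} is UP")
            return incident
        # Repeated DOWN, or UP with nothing open: no change (assumption).
        return None


if __name__ == "__main__":
    tracker = IncidentTracker()
    down_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    up_at = datetime(2024, 1, 1, 12, 7, tzinfo=timezone.utc)
    tracker.on_monitor_status("checkout-api", "DOWN", down_at)
    resolved = tracker.on_monitor_status("checkout-api", "UP", up_at)
    print(resolved.status, resolved.resolved_at - resolved.started_at)
```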


Manual Incidents

Create incidents manually when you're aware of an issue that monitors haven't detected, or for issues that span multiple services.

Create a Manual Incident

  1. Navigate to Incidents
     Go to Incidents in the sidebar and click New Incident.

  2. Fill in details
       • Title — Clear description of the issue
       • Monitor — Associate with a monitor (required)
       • Severity — Select the impact level

  3. Create
     Click Create Incident. The incident is now active and visible on your status page.


Incident Statuses

Track incident progress through standard status levels:

Status         Description                             When to Use
Investigating  Issue detected, investigating cause     Initial status
Identified     Root cause identified, working on fix   After diagnosis
Monitoring     Fix deployed, monitoring for stability  After fix deployed
Resolved       Issue fully resolved                    When complete

Status Flow

The typical status progression:

Investigating → Identified → Monitoring → Resolved

You can skip statuses if appropriate (e.g., go directly from Investigating to Resolved for quick fixes).
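As a rough illustration of the progression, the statuses can be modeled as an ordered enumeration, with a helper that reports which statuses a forward move bypasses. The `IncidentStatus` and `skipped_statuses` names are hypothetical, and the sketch does not claim anything about how PixoMonitor treats backward moves; it simply returns an empty list for them.

```python
from enum import IntEnum
from typing import List


class IncidentStatus(IntEnum):
    INVESTIGATING = 1
    IDENTIFIED = 2
    MONITORING = 3
    RESOLVED = 4


def skipped_statuses(current: IncidentStatus, new: IncidentStatus) -> List[IncidentStatus]:
    """Statuses bypassed by a forward move; empty for adjacent or non-forward moves."""
    return [status for status in IncidentStatus if current < status < new]


if __name__ == "__main__":
    # A quick fix that goes straight from Investigating to Resolved.
    skipped = skipped_statuses(IncidentStatus.INVESTIGATING, IncidentStatus.RESOLVED)
    print([status.name for status in skipped])  # ['IDENTIFIED', 'MONITORING']
```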


Severity Levels

Severity indicates the impact of an incident:

Severity  Impact                                         Examples
Critical  Complete outage, major data loss               All services down, security breach
High      Significant impact, major feature unavailable  Primary function broken
Medium    Moderate impact, workaround available          Degraded performance
Low       Minor issue, minimal user impact               Minor bug, cosmetic issue

Severity and Escalation

Severity affects escalation behavior:

  • Critical incidents trigger immediate escalation
  • High incidents may have shorter escalation delays
  • Medium/Low incidents follow normal escalation timing

Set severity based on user impact, not technical complexity. A "simple" bug that affects all users is more severe than a complex bug that affects only a few.
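One way to picture the relationship between severity and escalation timing is a lookup from severity to delay. The numbers below are placeholders chosen for illustration: the 15-minute `NORMAL_DELAY` and the halving for High are assumptions, not PixoMonitor defaults. Only the ordering (Critical immediate, High possibly shorter, Medium and Low normal) comes from the text above.

```python
from datetime import timedelta

# Assumed placeholder; the real delay comes from your escalation policy.
NORMAL_DELAY = timedelta(minutes=15)

ESCALATION_DELAY = {
    "Critical": timedelta(0),   # immediate escalation
    "High": NORMAL_DELAY / 2,   # shorter delay (illustrative factor)
    "Medium": NORMAL_DELAY,     # normal timing
    "Low": NORMAL_DELAY,        # normal timing
}


def escalation_delay(severity: str) -> timedelta:
    """Return the illustrative escalation delay for a severity level."""
    return ESCALATION_DELAY[severity]


if __name__ == "__main__":
    for level in ("Critical", "High", "Medium", "Low"):
        print(level, escalation_delay(level))
```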


Acknowledging Incidents

Acknowledging an incident indicates someone is actively working on it. This:

  • Stops escalation to additional responders
  • Records who acknowledged and when
  • Shows the team that the incident is being handled

Acknowledge an Incident

  1. Open the incident
     Go to Incidents and click the active incident.

  2. Click Acknowledge
     Click the Acknowledge button. This records:
       • Your user account
       • Timestamp
       • Acknowledgement method (web, API, etc.)

Acknowledgement Information

After acknowledgement, the incident shows:

  • Who acknowledged
  • When they acknowledged
  • How they acknowledged (web, API, SMS link, etc.)

Only one person needs to acknowledge an incident. Once acknowledged, escalation stops for that incident.
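The acknowledgement record and its effect on escalation can be sketched as follows. The class and field names are hypothetical, and the choice to keep the first acknowledgement when a second person acknowledges later is an assumption made for the example rather than documented behavior.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Acknowledgement:
    user: str      # who acknowledged
    at: datetime   # when they acknowledged
    method: str    # how: "web", "API", "SMS link", ...


class EscalatingIncident:
    """Hypothetical incident that escalates until someone acknowledges it."""

    def __init__(self) -> None:
        self.acknowledgement: Optional[Acknowledgement] = None

    @property
    def escalation_active(self) -> bool:
        # Escalation continues only while nobody has acknowledged.
        return self.acknowledgement is None

    def acknowledge(self, user: str, method: str, at: datetime) -> Acknowledgement:
        # Assumption: the first acknowledgement is kept; later ones are ignored.
        if self.acknowledgement is None:
            self.acknowledgement = Acknowledgement(user=user, at=at, method=method)
        return self.acknowledgement


if __name__ == "__main__":
    incident = EscalatingIncident()
    print(incident.escalation_active)  # True
    incident.acknowledge("dana", "web", datetime(2024, 1, 1, 12, 3, tzinfo=timezone.utc))
    print(incident.escalation_active)  # False
```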


Updating Incidents

Keep stakeholders informed by updating incidents as you learn more.

Update Status

  1. Open the incident
     Go to Incidents and select the incident to update.

  2. Change status
     Select the new status from the dropdown.

  3. Save
     Changes are saved automatically and reflected on the status page.

Update Severity

If the impact changes (worse or better than initially thought), update the severity level to reflect current reality.

Update Title

Edit the title to better describe the incident as you learn more about it.


Resolving Incidents

Mark an incident as resolved when the issue is fully fixed and confirmed stable.

Resolve an Incident

  1. Verify the fix
     Confirm the issue is resolved and monitors show UP status.

  2. Update status to Resolved
     Change the incident status to Resolved.

  3. Confirm resolution
     PixoMonitor records the resolution time and calculates total duration.

Automatic Resolution

When a monitor recovers from DOWN to UP, associated incidents are automatically resolved. The resolution timestamp is set to when the monitor recovered.


Incident Timeline

Every incident maintains a timeline showing:

  • When it started
  • Status changes
  • Who made changes
  • When it was resolved

Viewing the Timeline

Open any incident to see its complete history. The timeline shows:

  • Initial detection or creation
  • All status updates
  • Acknowledgement events
  • Resolution

Incident Metrics

Track incident performance over time:

Key Metrics

Metric                     Description
Time to Acknowledge (TTA)  Time from incident start to acknowledgement
Time to Resolve (TTR)      Total time from start to resolution
Downtime Duration          How long the service was unavailable
Incident Count             Number of incidents per monitor/period
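As a worked example of the definitions above, TTA and TTR are plain timestamp differences, and incident count is a tally per monitor. The function names and sample timestamps are made up for illustration; Downtime Duration is left out because this page does not define how it is measured.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable


def time_to_acknowledge(started_at: datetime, acknowledged_at: datetime) -> timedelta:
    """TTA: time from incident start to acknowledgement."""
    return acknowledged_at - started_at


def time_to_resolve(started_at: datetime, resolved_at: datetime) -> timedelta:
    """TTR: total time from incident start to resolution."""
    return resolved_at - started_at


def incident_count_by_monitor(monitors: Iterable[str]) -> Counter:
    """Count incidents per monitor, given one monitor name per incident."""
    return Counter(monitors)


if __name__ == "__main__":
    started = datetime(2024, 3, 5, 14, 0)
    acknowledged = datetime(2024, 3, 5, 14, 4)
    resolved = datetime(2024, 3, 5, 14, 31)

    print(time_to_acknowledge(started, acknowledged))  # 0:04:00
    print(time_to_resolve(started, resolved))          # 0:31:00
    print(incident_count_by_monitor(["api", "web", "api"]).most_common(1))  # [('api', 2)]
```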

Using Metrics

These metrics help you:

  • Identify monitors that fail frequently
  • Track response time improvements
  • Meet SLA commitments
  • Plan capacity improvements

Filtering Incidents

Find specific incidents using filters:

Filter by Severity

View only Critical, High, Medium, or Low severity incidents to focus on what matters most.

Filter by Status

  • Active — All unresolved incidents
  • Resolved — Past incidents

Filter by Monitor

View incidents for a specific monitor to understand its history.
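If you export incident data and want to apply the same filters offline, the logic might look like the sketch below. `IncidentRecord` and `filter_incidents` are hypothetical names, not part of a PixoMonitor API; the only rule taken from this page is that "Active" means any incident whose status is not Resolved.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class IncidentRecord:
    title: str
    monitor: str
    severity: str  # "Critical", "High", "Medium", or "Low"
    status: str    # "Investigating", "Identified", "Monitoring", or "Resolved"


def filter_incidents(
    incidents: Iterable[IncidentRecord],
    *,
    severity: Optional[str] = None,
    active: Optional[bool] = None,
    monitor: Optional[str] = None,
) -> List[IncidentRecord]:
    """Keep incidents matching every filter that is not None."""
    result = []
    for incident in incidents:
        if severity is not None and incident.severity != severity:
            continue
        # Active means unresolved; active=False selects resolved incidents.
        if active is not None and (incident.status != "Resolved") != active:
            continue
        if monitor is not None and incident.monitor != monitor:
            continue
        result.append(incident)
    return result


if __name__ == "__main__":
    history = [
        IncidentRecord("API returning 500 errors", "api", "Critical", "Monitoring"),
        IncidentRecord("Slow dashboard", "web", "Medium", "Resolved"),
        IncidentRecord("API timeouts", "api", "High", "Resolved"),
    ]
    for incident in filter_incidents(history, monitor="api", active=False):
        print(incident.title)  # API timeouts
```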


Linking to Post-Mortems

After resolving significant incidents, create a post-mortem to document:

  • What happened
  • Root cause
  • Impact
  • Action items to prevent recurrence

See the Post-Mortems guide for details.

Create post-mortems for all Critical and High severity incidents, plus any incident that teaches valuable lessons.


Status Page Integration

Active incidents automatically appear on your public status page:

  • Incident title is visible to users
  • Status updates show progress
  • Users see that you're aware of the issue and working on it
  • Resolution is announced when complete

What Users See

Incident Status  Status Page Display
Investigating    "We are investigating this issue"
Identified       "The issue has been identified"
Monitoring       "A fix has been implemented"
Resolved         "This incident has been resolved"

Subscriber Notifications

If subscriber notifications are enabled, subscribers receive automatic emails:

Event          Notification
New Incident   "New incident affecting [monitor]"
Status Update  "Update: [incident title]"
Resolved       "Resolved: [incident title]"
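The notification wording above can be read as simple templates with placeholders filled in per incident. In this sketch the template text comes from the table, while the event keys (`new_incident`, `status_update`, `resolved`) and the function name are hypothetical labels chosen for the example.

```python
# Template wording taken from the table above; the keys are assumed names.
NOTIFICATION_TEMPLATES = {
    "new_incident": "New incident affecting {monitor}",
    "status_update": "Update: {title}",
    "resolved": "Resolved: {title}",
}


def notification_subject(event: str, *, title: str, monitor: str) -> str:
    """Fill in the template for an event; raises KeyError for unknown events."""
    return NOTIFICATION_TEMPLATES[event].format(title=title, monitor=monitor)


if __name__ == "__main__":
    print(notification_subject("new_incident", title="API returning 500 errors", monitor="checkout-api"))
    print(notification_subject("resolved", title="API returning 500 errors", monitor="checkout-api"))
```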

Best Practices

  1. Acknowledge quickly — Stop escalation and show the team it's being handled
  2. Update regularly — Keep stakeholders informed, even if just "still investigating"
  3. Use appropriate severity — Based on user impact, not technical complexity
  4. Write clear titles — "API returning 500 errors" is better than "API down"
  5. Document everything — Future you will thank present you
  6. Create post-mortems — Learn from incidents to prevent recurrence
  7. Track metrics — Measure and improve response times

Don't delete incidents to "clean up" history. The incident record is valuable for understanding patterns and meeting compliance requirements.