Incidents

Incidents in PixoMonitor track service disruptions from detection to resolution. They provide a timeline of what happened, who was involved, and how long it took to resolve. Incidents can be created automatically by monitors or manually when you're aware of an issue.

Overview

Every incident tracks:

Title — Description of the incident
Status — Current state (Investigating, Identified, Monitoring, Resolved)
Severity — Impact level (Critical, High, Medium, Low)
Timeline — When it started and was resolved
Acknowledgement — Who responded and when

Automatic Incidents

When a monitor detects a failure (status DOWN), PixoMonitor automatically:

Creates a new incident linked to the monitor
Sets status to "Investigating"
Triggers alerts through configured channels
Starts the escalation policy (if configured)

When the monitor recovers (status UP):

Updates the incident status to "Resolved"
Records the resolution time
Notifies subscribers of the recovery

Automatic incidents use the monitor name as the incident title. You can edit the title and add details at any time.
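
The automatic lifecycle described above can be modelled in a few lines. The following is a minimal sketch, not PixoMonitor's implementation: the Incident fields, the on_monitor_status function, and the "api-health" monitor name are hypothetical. It also assumes that a repeated DOWN report reuses the already open incident, which the behavior above does not specify. Alerting and escalation are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class Incident:
    # Hypothetical record; field names are assumptions, not PixoMonitor's schema.
    monitor: str
    title: str
    status: str = "Investigating"
    started_at: Optional[datetime] = None
    resolved_at: Optional[datetime] = None


def on_monitor_status(
    open_incidents: Dict[str, Incident], monitor: str, status: str, at: datetime
) -> Optional[Incident]:
    """Apply a monitor status change and return the affected incident, if any."""
    if status == "DOWN":
        if monitor not in open_incidents:
            # New failure: the incident is titled after the monitor and
            # starts in the Investigating status.
            open_incidents[monitor] = Incident(monitor=monitor, title=monitor, started_at=at)
        return open_incidents[monitor]
    if status == "UP" and monitor in open_incidents:
        incident = open_incidents.pop(monitor)
        incident.status = "Resolved"
        incident.resolved_at = at  # resolution time is the recovery time
        return incident
    return None


open_incidents: Dict[str, Incident] = {}
down_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
up_at = datetime(2024, 1, 1, 12, 7, tzinfo=timezone.utc)

created = on_monitor_status(open_incidents, "api-health", "DOWN", down_at)
resolved = on_monitor_status(open_incidents, "api-health", "UP", up_at)
```

After the second call, the same incident object carries the Resolved status and a resolution timestamp equal to the recovery time.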

Manual Incidents

Create incidents manually when you're aware of an issue that monitors haven't detected, or for issues that span multiple services.

Create a Manual Incident

Navigate to Incidents

Go to Incidents in the sidebar and click New Incident.

Fill in details

Title — Clear description of the issue
Monitor — Associate with a monitor (required)
Severity — Select the impact level

Create

Click Create Incident. The incident is now active and visible on your status page.

Incident Statuses

Track incident progress through standard status levels:

Status	Description	When to Use
Investigating	Issue detected, investigating cause	Initial status
Identified	Root cause identified, working on fix	After diagnosis
Monitoring	Fix deployed, monitoring for stability	After fix deployed
Resolved	Issue fully resolved	When complete

Status Flow

The typical status progression:

Investigating → Identified → Monitoring → Resolved

You can skip statuses if appropriate (e.g., go directly from Investigating to Resolved for quick fixes).
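
To make the ordering concrete, here is a small sketch that encodes the typical progression as an ordered enum and reports which statuses a forward move skips. The Status enum and skipped_statuses helper are illustrative names, not part of PixoMonitor. Whether moving backwards (for example, from Monitoring to Identified) is permitted is not stated here, so the sketch does not enforce any rule about it.

```python
from enum import IntEnum
from typing import List


class Status(IntEnum):
    # Values follow the typical progression order.
    INVESTIGATING = 1
    IDENTIFIED = 2
    MONITORING = 3
    RESOLVED = 4


def skipped_statuses(current: Status, new: Status) -> List[Status]:
    """Return the statuses passed over when moving forward from current to new.

    Returns an empty list when nothing is skipped or the move is not forward.
    """
    return [s for s in Status if current < s < new]


quick_fix = skipped_statuses(Status.INVESTIGATING, Status.RESOLVED)
next_step = skipped_statuses(Status.INVESTIGATING, Status.IDENTIFIED)
```

For the quick-fix case, quick_fix contains Identified and Monitoring, while next_step is empty.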

Severity Levels

Severity indicates the impact of an incident:

Severity	Impact	Examples
Critical	Complete outage, major data loss	All services down, security breach
High	Significant impact, major feature unavailable	Primary function broken
Medium	Moderate impact, workaround available	Degraded performance
Low	Minor issue, minimal user impact	Minor bug, cosmetic issue

Severity and Escalation

Severity affects escalation behavior:

Critical incidents trigger immediate escalation
High incidents may have shorter escalation delays
Medium/Low incidents follow normal escalation timing

Set severity based on user impact, not technical complexity. A "simple" bug that affects all users is more severe than a complex bug affecting few.
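
The relationship between severity and escalation timing could be expressed as a simple lookup. This is only a sketch: the 15-minute normal delay and the one-third ratio for High are made-up illustrative values, and the names NORMAL_DELAY, ESCALATION_DELAY, and escalation_delay are hypothetical. Actual timing comes from your escalation policy.

```python
from datetime import timedelta

# Illustrative value only; real delays are defined by your escalation policy.
NORMAL_DELAY = timedelta(minutes=15)

ESCALATION_DELAY = {
    "Critical": timedelta(0),    # immediate escalation
    "High": NORMAL_DELAY / 3,    # shorter than normal (assumed ratio)
    "Medium": NORMAL_DELAY,      # normal timing
    "Low": NORMAL_DELAY,         # normal timing
}


def escalation_delay(severity: str) -> timedelta:
    """Look up how long to wait before escalating an unacknowledged incident."""
    return ESCALATION_DELAY[severity]
```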

Acknowledging Incidents

Acknowledging an incident indicates someone is actively working on it. This:

Stops escalation to additional responders
Records who acknowledged and when
Shows the team that the incident is being handled

Acknowledge an Incident

Open the incident

Go to Incidents and click on the active incident.

Click Acknowledge

Click the Acknowledge button. This records:

Your user account
Timestamp
Acknowledgement method (web, API, SMS link, etc.)

Acknowledgement Information

After acknowledgement, the incident shows:

Who acknowledged
When they acknowledged
How they acknowledged (web, API, SMS link, etc.)

Only one person needs to acknowledge an incident. Once acknowledged, escalation stops for that incident.
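
The following sketch shows one way to model the recorded acknowledgement and its effect on escalation. The Acknowledgement and EscalationState classes are hypothetical, and treating a second acknowledgement as a no-op that returns the first record is an assumption; the behavior for repeated acknowledgements is not described here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Acknowledgement:
    user: str      # who acknowledged
    at: datetime   # when they acknowledged
    method: str    # how they acknowledged, e.g. "web" or "api"


class EscalationState:
    """Tracks whether an incident is still escalating (hypothetical model)."""

    def __init__(self) -> None:
        self.acknowledgement: Optional[Acknowledgement] = None

    @property
    def escalating(self) -> bool:
        # Escalation continues only until someone acknowledges.
        return self.acknowledgement is None

    def acknowledge(self, user: str, at: datetime, method: str) -> Acknowledgement:
        # Assumption: the first acknowledgement wins and later ones are ignored.
        if self.acknowledgement is None:
            self.acknowledgement = Acknowledgement(user, at, method)
        return self.acknowledgement


state = EscalationState()
first = state.acknowledge("alice", datetime(2024, 1, 1, 12, 3, tzinfo=timezone.utc), "web")
second = state.acknowledge("bob", datetime(2024, 1, 1, 12, 5, tzinfo=timezone.utc), "api")
```

Here second is the same record as first, so the incident still shows alice as the acknowledging user and escalation has stopped.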

Updating Incidents

Keep stakeholders informed by updating incidents as you learn more.

Update Status

Open the incident

Go to Incidents and select the incident to update.

Change status

Select the new status from the dropdown.

Save

No separate save action is needed. Changes are saved automatically and reflected on the status page.

Update Severity

If the impact turns out to be worse or better than initially thought, update the severity level to match the current impact.

Update Title

Edit the title to better describe the incident as you learn more about it.

Resolving Incidents

Mark an incident as resolved when the issue is fully fixed and confirmed stable.

Resolve an Incident

Verify the fix

Confirm the issue is resolved and monitors show UP status.

Update status to Resolved

Change the incident status to Resolved.

Confirm resolution

PixoMonitor records the resolution time and calculates total duration.

Automatic Resolution

When a monitor recovers from DOWN to UP, associated incidents are automatically resolved. The resolution timestamp is set to when the monitor recovered.

Incident Timeline

Every incident maintains a timeline showing:

When it started
Status changes
Who made changes
When it was resolved

Viewing the Timeline

Open any incident to see its complete history. The timeline shows:

Initial detection or creation
All status updates
Acknowledgement events
Resolution

Incident Metrics

Track incident performance over time:

Key Metrics

Metric	Description
Time to Acknowledge (TTA)	Time from incident start to acknowledgement
Time to Resolve (TTR)	Total time from start to resolution
Downtime Duration	How long the service was unavailable
Incident Count	Number of incidents per monitor/period
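
As a worked example of the definitions in the table, the sketch below computes TTA and TTR from three timestamps and counts incidents per monitor. The function names, timestamps, and monitor names are invented for illustration. Downtime Duration is not computed because it depends on monitor state rather than incident timestamps alone.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def time_to_acknowledge(started_at: datetime, acknowledged_at: datetime) -> timedelta:
    """TTA: time from incident start to acknowledgement."""
    return acknowledged_at - started_at


def time_to_resolve(started_at: datetime, resolved_at: datetime) -> timedelta:
    """TTR: total time from incident start to resolution."""
    return resolved_at - started_at


started = datetime(2024, 3, 5, 9, 0, tzinfo=timezone.utc)
acknowledged = datetime(2024, 3, 5, 9, 4, tzinfo=timezone.utc)
resolved = datetime(2024, 3, 5, 9, 42, tzinfo=timezone.utc)

tta = time_to_acknowledge(started, acknowledged)
ttr = time_to_resolve(started, resolved)
print(tta)  # 0:04:00
print(ttr)  # 0:42:00

# Incident Count: number of incidents per monitor over some period.
incident_monitors = ["api-health", "checkout", "api-health"]
incident_count = Counter(incident_monitors)
```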

Using Metrics

These metrics help you:

Identify monitors that fail frequently
Track response time improvements
Meet SLA commitments
Plan capacity improvements

Filtering Incidents

Find specific incidents using filters:

Filter by Severity

View only Critical, High, Medium, or Low severity incidents to focus on what matters most.

Filter by Status

Active — All unresolved incidents
Resolved — Past incidents

Filter by Monitor

View incidents for a specific monitor to understand its history.
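
If you export incident data and want to apply the same filters in your own tooling, the logic might look like the sketch below. The Incident fields and the filter_incidents function are hypothetical and do not reflect a PixoMonitor API. "Active" is treated as any status other than Resolved, matching the definition above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Incident:
    # Hypothetical shape for exported incident data.
    title: str
    monitor: str
    severity: str
    status: str


def filter_incidents(
    incidents: Iterable[Incident],
    severity: Optional[str] = None,
    active: Optional[bool] = None,
    monitor: Optional[str] = None,
) -> List[Incident]:
    """Keep incidents that match every filter that was supplied."""
    result = []
    for incident in incidents:
        if severity is not None and incident.severity != severity:
            continue
        # Active means unresolved; active=False selects resolved incidents.
        if active is not None and (incident.status != "Resolved") != active:
            continue
        if monitor is not None and incident.monitor != monitor:
            continue
        result.append(incident)
    return result


incidents = [
    Incident("API returning 500 errors", "api-health", "Critical", "Investigating"),
    Incident("Slow checkout", "checkout", "Medium", "Resolved"),
    Incident("API latency spike", "api-health", "High", "Resolved"),
]

active_now = filter_incidents(incidents, active=True)
api_history = filter_incidents(incidents, monitor="api-health")
```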

Linking to Post-Mortems

After resolving significant incidents, create a post-mortem to document:

What happened
Root cause
Impact
Action items to prevent recurrence

See the Post-Mortems guide for details.

Create post-mortems for all Critical and High severity incidents, plus any incident that teaches valuable lessons.

Status Page Integration

Active incidents automatically appear on your public status page:

Incident title is visible to users
Status updates show progress
Users can see that you're aware of the issue and working on it
Resolution is announced when complete

What Users See

Incident Status	Status Page Display
Investigating	"We are investigating this issue"
Identified	"The issue has been identified"
Monitoring	"A fix has been implemented"
Resolved	"This incident has been resolved"
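
The table above is a direct one-to-one mapping, which can be captured as a lookup. The STATUS_PAGE_MESSAGE dictionary and status_page_message function are illustrative names only; the message strings are copied from the table.

```python
STATUS_PAGE_MESSAGE = {
    "Investigating": "We are investigating this issue",
    "Identified": "The issue has been identified",
    "Monitoring": "A fix has been implemented",
    "Resolved": "This incident has been resolved",
}


def status_page_message(status: str) -> str:
    """Return the public status page text for an incident status."""
    return STATUS_PAGE_MESSAGE[status]
```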

Subscriber Notifications

If you have subscribers enabled, they receive automatic emails:

Event	Notification
New Incident	"New incident affecting [monitor]"
Status Update	"Update: [incident title]"
Resolved	"Resolved: [incident title]"
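
The bracketed placeholders in the table are filled in per incident. The sketch below shows how the three notification texts could be rendered; the notification_text function and the example values are hypothetical, and the full content of the actual emails is not specified here.

```python
NOTIFICATION_TEMPLATES = {
    "New Incident": "New incident affecting {monitor}",
    "Status Update": "Update: {title}",
    "Resolved": "Resolved: {title}",
}


def notification_text(event: str, title: str, monitor: str) -> str:
    """Fill the template for an event with the incident title or monitor name."""
    # str.format ignores keyword arguments a template does not use.
    return NOTIFICATION_TEMPLATES[event].format(title=title, monitor=monitor)


new_text = notification_text("New Incident", "API returning 500 errors", "api-health")
resolved_text = notification_text("Resolved", "API returning 500 errors", "api-health")
```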

Best Practices

Acknowledge quickly — Stop escalation and show the team it's being handled
Update regularly — Keep stakeholders informed, even if just "still investigating"
Use appropriate severity — Based on user impact, not technical complexity
Write clear titles — "API returning 500 errors" is better than "API down"
Document everything — Future you will thank present you
Create post-mortems — Learn from incidents to prevent recurrence
Track metrics — Measure and improve response times

Don't delete incidents to "clean up" history. The incident record is valuable for understanding patterns and meeting compliance requirements.