Skip to content

Quality Attributes

Comprehensive definitions and guidance for quality attributes commonly used as ADR decision drivers.

Definition: How quickly the system responds and processes requests.

Measures:

  • Latency (response time in milliseconds)
  • Throughput (requests per second)
  • Resource utilization (CPU, memory, I/O)

Scenarios:

  • “The API must respond within 100ms for 95% of requests”
  • “The system must handle 10,000 concurrent users”

Trade-offs:

  • Often conflicts with maintainability (optimized code is complex)
  • May require more expensive infrastructure
  • Can conflict with flexibility

Definition: The proportion of time the system is operational.

Measures:

  • Uptime percentage (99.9%, 99.99%)
  • Mean Time Between Failures (MTBF)
  • Mean Time To Recovery (MTTR)

Common Targets:

AvailabilityDowntime/YearDowntime/Month
99%3.65 days7.3 hours
99.9%8.76 hours43.8 minutes
99.99%52.6 minutes4.38 minutes
99.999%5.26 minutes26.3 seconds

Trade-offs:

  • Higher availability = higher cost
  • May require redundancy, which adds complexity
  • Can conflict with consistency (CAP theorem)

Definition: Ability to handle increased load by adding resources.

Types:

  • Horizontal: Add more instances
  • Vertical: Add more resources to existing instances

Measures:

  • Maximum concurrent users
  • Data volume capacity
  • Scaling factor (2x users = ?x resources)

Trade-offs:

  • Horizontal scaling adds architectural complexity
  • May require stateless design
  • Increases infrastructure cost

Definition: Ability to function correctly under stated conditions.

Measures:

  • Error rate
  • Data accuracy
  • Failure frequency

Aspects:

  • Fault tolerance: Continue operating despite failures
  • Recoverability: Restore operation after failure
  • Data integrity: Maintain data correctness

Definition: Protection against unauthorized access and threats.

Aspects:

  • Confidentiality: Data accessible only to authorized parties
  • Integrity: Data not modified by unauthorized parties
  • Authentication: Verify identity of users/systems
  • Authorization: Control what authenticated users can do
  • Non-repudiation: Actions cannot be denied
  • Accountability: Actions can be traced

Measures:

  • Vulnerability count
  • Time to detect breach
  • Compliance certifications

Definition: How easy the system is to use effectively.

Measures:

  • Task completion time
  • Error rate
  • Learning curve
  • User satisfaction

Aspects:

  • Learnability
  • Efficiency
  • Memorability
  • Error prevention
  • Satisfaction

Definition: How easy the system is to modify and evolve.

Measures:

  • Cyclomatic complexity
  • Code coupling/cohesion
  • Technical debt metrics
  • Time to make typical changes

Aspects:

  • Modularity: Independent components
  • Analyzability: Easy to understand
  • Modifiability: Easy to change
  • Reusability: Components usable elsewhere

Definition: How easy the system is to test effectively.

Measures:

  • Test coverage
  • Test execution time
  • Defect detection rate

Factors affecting testability:

  • Dependency injection
  • Modular design
  • Observable state
  • Deterministic behavior

Definition: How easy and safe it is to deploy changes.

Measures:

  • Deployment frequency
  • Deployment time
  • Rollback time
  • Failed deployment rate

Aspects:

  • Automation level
  • Rollback capability
  • Environment parity
  • Configuration management

Definition: How easy it is to add new capabilities.

Measures:

  • Time to add new feature
  • Impact on existing code
  • Plugin/extension support

Factors:

  • Plugin architecture
  • Well-defined interfaces
  • Loose coupling

Definition: Ability to understand system state from external outputs.

Pillars:

  • Metrics: Quantitative measurements
  • Logs: Event records
  • Traces: Request flow tracking

Measures:

  • Time to detect issues
  • Time to diagnose root cause
  • Alert accuracy

Definition: How easy the system is to operate and administer.

Aspects:

  • Configuration management
  • Monitoring and alerting
  • Backup and recovery
  • User administration

Definition: Ability to run on different platforms/environments.

Types:

  • Platform portability: Different OS/hardware
  • Cloud portability: Different cloud providers
  • Data portability: Move data between systems

Measures:

  • Platforms supported
  • Migration effort
  • Environment-specific code percentage

Definition: Ability to work with other systems.

Aspects:

  • Standard protocols
  • API compatibility
  • Data format support
  • Integration patterns

Definition: Value delivered relative to cost.

Measures:

  • Total cost of ownership (TCO)
  • Cost per transaction
  • Infrastructure cost
  • Development cost

Definition: Speed of delivering new capabilities.

Measures:

  • Feature lead time
  • Release frequency
  • Time from idea to production

Definition: Adherence to regulations and standards.

Common Requirements:

  • GDPR (data privacy)
  • HIPAA (healthcare)
  • PCI-DSS (payment cards)
  • SOC 2 (service organizations)
  • ISO 27001 (information security)
AttributeConflicts WithBecause
PerformanceMaintainabilityOptimizations add complexity
PerformancePortabilityPlatform-specific optimizations
SecurityUsabilityMore security = more friction
SecurityPerformanceEncryption/validation overhead
AvailabilityCostRedundancy is expensive
AvailabilityConsistencyCAP theorem
ScalabilitySimplicityDistributed systems are complex
FlexibilityPerformanceAbstractions have overhead
FlexibilitySimplicityMore options = more complexity
## Decision Drivers
* **High availability (99.99%)** - SLA requires < 1 hour downtime/year
* **Horizontal scalability** - Must handle 10x current load by Q4
* **Maintainability** - Small team must support long-term
* **Security compliance** - Must meet SOC 2 requirements
### Option 1: Managed Service
* **Availability**: Excellent - provider guarantees 99.99%
* **Scalability**: Good - auto-scaling included
* **Maintainability**: Excellent - no infrastructure to manage
* **Cost**: Moderate - $X/month at current scale
* **Security**: Good - provider is SOC 2 certified
## Consequences
* Good, because availability target will be met with provider SLA
* Good, because team can focus on features instead of infrastructure
* Bad, because we accept vendor lock-in
* Bad, because cost will increase with scale