blog.dynatrace.com
blog.codecentric.com
highscalability.com
Scalability: Increased resources result in increased performance
- Werner Vogels
Scalability
- The ability to maintain performance as load increases
Performance
- Characteristics such as response time and processing volume
Queuing Theory
- Models a single resource
- Resources and waiting requests are modeled as queues
- Example: a CPU and the requests currently waiting for it
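A minimal sketch of the single-queue idea, assuming the simplest textbook model (M/M/1: random arrivals, one server, exponentially distributed service times). The notes do not name a model, so the formula R = S / (1 - U) and the 10 ms service time are assumptions for illustration only.

```python
def mm1_response_time(service_time: float, arrival_rate: float) -> float:
    """Mean response time (waiting + service) of an M/M/1 queue, in seconds."""
    utilization = arrival_rate * service_time
    if not 0 <= utilization < 1:
        raise ValueError("queue is unstable: utilization must be below 1")
    return service_time / (1.0 - utilization)


if __name__ == "__main__":
    service_time = 0.010  # assumed: 10 ms of CPU time per request
    for rate in (10, 50, 90, 99):  # requests per second
        response = mm1_response_time(service_time, rate)
        print(f"{rate:3d} req/s: utilization {rate * service_time:.0%}, "
              f"response {response * 1000:.1f} ms")
```

Under these assumptions the response time grows slowly at low utilization and very steeply as the resource approaches saturation, which is why a system can look fast in single-user mode and still collapse under load.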
Queuing Networks
- Models the application as a network of resources
- Combines multiple queues
- Example: CPU, network, connection pools, servlet threads
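A rough sketch of how a network of queues exposes the bottleneck. It assumes each resource behaves like a single server with a known service demand per request, so the maximum throughput is bounded by 1 / (largest demand). The demand values are made up for illustration.

```python
def find_bottleneck(demands: dict) -> tuple:
    """Return (resource name, maximum throughput in requests per second).

    `demands` maps a resource name to the seconds of that resource one
    request consumes. The resource with the largest demand saturates first.
    """
    name = max(demands, key=demands.get)
    return name, 1.0 / demands[name]


if __name__ == "__main__":
    # Hypothetical per-request demands, in seconds.
    demands = {
        "cpu": 0.005,
        "network": 0.002,
        "connection pool": 0.020,
        "servlet threads": 0.008,
    }
    name, limit = find_bottleneck(demands)
    print(f"bottleneck: {name}, at most {limit:.0f} req/s")
```

With these numbers, adding CPU would not help: the connection pool caps throughput long before the CPU is busy.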
Scalability and Requirements
- Increasing number of parallel requests
- Increasing number of parallel users
- Increasing volume of data
- Increasing availability with same performance
Types of Scalability
- Vertical (UP)
  A physical node gets more resources (CPU or memory)
  In virtual environments, resources can be added "on demand"
- Horizontal (OUT)
  A new physical node is added
  Trend toward more and more servers
  Distributed system
If a system is fast in single-user mode but slow under heavy load, then the problem is scalability, not performance
- Cameron Purdy
Scalability does not improve performance
- Increases complexity and degrades performance
- Slower in single user mode
- Performance is still an issue
Scalability can improve availability
- Scaling out introduces redundancy
- State must be synchronized across nodes
Scalability never comes for free
- Must be engineered into the application
- Introduces additional complexity
- Plan time for building scalability into app
Limiting factors
- CPU Time
  scalability via hardware
- Memory consumption
  scalability via hardware
- I/O and Network
  limited and harder to scale
  architectural changes
- Shared resource access
  limited and harder to scale
  architectural changes
Major problems
- Database access
  high number of database requests
  Locks and concurrent access
- Remoting Behavior
  Bad interaction design (communication patterns)
  High data volume
  (it's too easy to make remote calls)
- Locking
  Configuration problems (connection pools)
  Wrong synchronization patterns
  Serial data access
Metrics
- Throughput
- Response times
  vs CPU time
- System metrics
  CPU and Garbage Collection
  Memory pools (including generations)
- Application metrics
  Connection pools
  Component level
  SQL
  Transactional trace data
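A small sketch of how throughput and response-time metrics might be derived from raw samples. The sample values are invented, and the nearest-rank percentile is just one common definition, chosen here for simplicity.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def throughput(request_count, window_seconds):
    """Completed requests per second over a measurement window."""
    return request_count / window_seconds


if __name__ == "__main__":
    # Hypothetical response times in milliseconds, collected over 2 seconds.
    response_ms = [120, 80, 95, 400, 110, 90, 85, 100, 105, 2000]
    print("throughput:", throughput(len(response_ms), 2.0), "req/s")
    print("mean:", sum(response_ms) / len(response_ms), "ms")
    print("median:", percentile(response_ms, 50), "ms")
    print("p90:", percentile(response_ms, 90), "ms")
```

The mean is dominated by a single slow request, so percentiles usually say more about what users experience. Comparing response time against CPU time for the same request shows how much of it was spent waiting rather than computing.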
Problems
- Complex interdependencies due to frameworks
- High serialization overhead
- Full heap due to memory leak
- Inefficient or redundant remote calls
- Too many SQL calls
Database access
- Wrong use of O/R mappers
  Configuration/Loading behavior
- Bad Transaction Design
  Connection kept too long
  Isolation level incorrectly defined (or not at all)
- Inefficient data loading logic
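A sketch of the "too many SQL calls" pattern that lazy O/R mapper loading often produces (one query for the parent rows, then one more per row), next to a single joined query. It uses an in-memory SQLite database. The table names and the `CountingConnection` wrapper are hypothetical helpers for this illustration.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper that counts the statements sent to the database."""

    def __init__(self, conn):
        self._conn = conn
        self.statements = 0

    def execute(self, sql, params=()):
        self.statements += 1
        return self._conn.execute(sql, params)


def make_db(customers=50):
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    """)
    ids = range(1, customers + 1)
    conn.executemany("INSERT INTO customer VALUES (?, ?)",
                     [(i, f"customer-{i}") for i in ids])
    conn.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                     [(i, 10.0 * i) for i in ids])
    conn.commit()
    return conn


def totals_n_plus_one(db):
    """One query for the customers, then one query per customer."""
    result = {}
    for cid, name in db.execute("SELECT id, name FROM customer").fetchall():
        row = db.execute("SELECT SUM(total) FROM orders WHERE customer_id = ?",
                         (cid,)).fetchone()
        result[name] = row[0]
    return result


def totals_single_query(db):
    """The same data loaded with one joined, aggregated query."""
    rows = db.execute("""
        SELECT c.name, SUM(o.total)
        FROM customer c JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
    """)
    return dict(rows.fetchall())


if __name__ == "__main__":
    conn = make_db()
    for loader in (totals_n_plus_one, totals_single_query):
        db = CountingConnection(conn)
        loader(db)
        print(f"{loader.__name__}: {db.statements} statements")
```

Both loaders return the same totals. The statement count of the first grows with the number of customers, which is cheap against a local database and expensive once every statement is a network round trip.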
Distributed Systems/Remoting
- Bad interface design
  Interfaces are too generic
  Interactions not suitable for SOA
- Wrong communication protocols
  SOAP services in homogeneous landscape
  Sync instead of async interactions
Synchronization
- Locks kept too long
- Wrong lock granularity
- Neglecting locking at DB level
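A sketch of finer lock granularity, assuming a set of named counters shared between threads. Instead of one global lock, keys are spread over several locks ("lock striping"), so threads touching different keys rarely wait on each other. The class name and stripe count are illustrative. In CPython the interpreter lock limits the actual speedup, so this shows the pattern, not a benchmark.

```python
import threading


class StripedCounters:
    """Counters guarded by several locks instead of a single global one."""

    def __init__(self, stripes=16):
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._counts = [{} for _ in range(stripes)]

    def _stripe(self, key):
        # The same key always maps to the same stripe within one process.
        return hash(key) % len(self._locks)

    def increment(self, key):
        i = self._stripe(key)
        # Hold the lock only for the read-modify-write on this stripe.
        with self._locks[i]:
            self._counts[i][key] = self._counts[i].get(key, 0) + 1

    def get(self, key):
        i = self._stripe(key)
        with self._locks[i]:
            return self._counts[i].get(key, 0)


if __name__ == "__main__":
    counters = StripedCounters(stripes=4)

    def worker():
        for n in range(1000):
            counters.increment(f"key-{n % 10}")

    threads = [threading.Thread(target=worker) for _ in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print({f"key-{k}": counters.get(f"key-{k}") for k in range(10)})
```

The critical section contains only the update itself. Anything slow (I/O, remote calls, logging) belongs outside the `with` block so the lock is not kept longer than necessary.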
Serial access
- Some access MUST be handled serially
- Scalability is logically limited
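The logical limit can be quantified with Amdahl's law, which the notes do not name but which is the standard formula for this situation: with a serial fraction s, the speedup on n nodes is 1 / (s + (1 - s) / n). A minimal sketch, with a 10% serial fraction as an assumed example value:

```python
def amdahl_speedup(serial_fraction: float, nodes: int) -> float:
    """Best-case speedup when `serial_fraction` of the work cannot be parallelized."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nodes)


if __name__ == "__main__":
    serial_fraction = 0.10  # assumed: 10% of the work is strictly serial
    for nodes in (1, 2, 10, 100, 1000):
        print(f"{nodes:5d} nodes: speedup {amdahl_speedup(serial_fraction, nodes):.2f}")
```

With 10% serial work the speedup can never exceed 10, no matter how many nodes are added.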
Too much synchronous program logic
- Developers think procedurally
- Sync interaction in distributed systems is problematic
- High resource usage
- Typical indicator: low CPU utilization while other resources are saturated
- Classic example: web apps
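A sketch of the difference between synchronous and asynchronous interaction, using `asyncio`. The remote call is simulated with `asyncio.sleep`, and the service names and latency are placeholders. The point is only that the caller waits, not computes, during a remote call.

```python
import asyncio
import time


async def call_remote(name: str, latency: float) -> str:
    """Stand-in for a remote call: the time is spent waiting, not on the CPU."""
    await asyncio.sleep(latency)
    return f"{name}: ok"


async def one_after_another(names, latency):
    # Procedural style: each call blocks the next one.
    return [await call_remote(name, latency) for name in names]


async def all_at_once(names, latency):
    # Independent calls are started together and awaited as a group.
    return list(await asyncio.gather(*(call_remote(n, latency) for n in names)))


def timed(strategy, names, latency):
    start = time.perf_counter()
    result = asyncio.run(strategy(names, latency))
    return result, time.perf_counter() - start


if __name__ == "__main__":
    services = ["inventory", "pricing", "shipping", "customer"]
    for strategy in (one_after_another, all_at_once):
        result, elapsed = timed(strategy, services, 0.05)
        print(f"{strategy.__name__}: {elapsed * 1000:.0f} ms")
```

Both strategies return the same results in the same order. The sequential version takes roughly the sum of the latencies, the concurrent one roughly the longest single latency, while CPU usage stays low in both.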
Memory management
- Bad GC configuration
  Generation sizes, collection strategy
- Unnecessary creation of objects
  Serialization
  O/R mapping frameworks
- Memory leaks
  Bad reference clearing logic
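A sketch of a leak caused by missing reference clearing, shown in Python for brevity (the notes appear to target the JVM, where the same pattern applies to long-lived maps and listener lists). A registry that holds strong references keeps every object alive. A `weakref.WeakValueDictionary` lets entries disappear once nothing else uses them. The `Session` class is a hypothetical example.

```python
import gc
import weakref


class Session:
    def __init__(self, session_id: str):
        self.session_id = session_id


def register(registry, session_id: str) -> None:
    """Create a session and remember it in the registry.

    The local reference ends when this function returns, so afterwards the
    registry is the only thing that can keep the session alive.
    """
    session = Session(session_id)
    registry[session_id] = session


def surviving_entries(registry, count: int) -> int:
    for i in range(count):
        register(registry, f"s{i}")
    gc.collect()  # make collection explicit instead of relying on timing
    return len(registry)


if __name__ == "__main__":
    print("plain dict:", surviving_entries({}, 1000))
    print("weak values:", surviving_entries(weakref.WeakValueDictionary(), 1000))
```

The plain dictionary never shrinks unless entries are removed explicitly, which is the "bad reference clearing logic" case. Weak references are one remedy; explicit removal at the end of an object's lifecycle is another.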
Problems in development
- Developers don't understand dynamic program behavior
- Scalability and performance are sacrificed for functionality
- Waiting for symptoms before acting
- Real problems are often visible only under high load
Questions
- Tools: JMeter, Selenium, network sniffer, soapUI
- Check database and/or remoting behavior at continuous integration time