Performance Optimization When Customers Start Complaining

Veld Systems · 7 min read

When customers start complaining about performance, you are already behind. Users tolerate minor slowness silently. By the time they reach out to support or leave a review mentioning speed, the problem has been affecting your business for weeks or months. Conversion rates have dropped. Engagement is down. Users have quietly switched to alternatives.

The good news is that performance problems are diagnosable and fixable. The bad news is that most teams waste time optimizing the wrong things because they do not start with data. Here is the systematic approach we use when a product is bleeding users due to performance.

Step One: Measure Before You Touch Anything

The worst thing you can do is start optimizing based on hunches. "The database feels slow" is not a diagnosis. You need numbers.

Set up Real User Monitoring (RUM) if you do not already have it. Synthetic tests tell you how fast your site loads from a data center in Virginia. RUM tells you how fast it loads for your actual users on their actual devices and networks. The gap between these numbers is often enormous.

Capture these metrics across your entire user base:

- Time to First Byte (TTFB). How long before the server starts sending a response. If this is above 600ms, your backend is the bottleneck.

- Largest Contentful Paint (LCP). How long before the main content is visible. Google considers above 2.5 seconds a poor experience.

- Total Blocking Time (TBT). How long JavaScript blocks the main thread during page load. This directly affects how responsive the page feels.

- API response times at p50, p95, and p99. Averages hide problems. If your p50 is 200ms but your p99 is 4 seconds, one in a hundred requests is painfully slow, and that adds up to thousands of frustrated users per day.
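The averages-hide-problems point is easy to see in code. Below is a minimal sketch of the nearest-rank percentile method over simulated latency samples; the numbers are invented for illustration, and a real RUM or APM tool computes this for you.

```python
def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample >= p% of all samples."""
    ranked = sorted(samples)
    k = -(-len(ranked) * p // 100)  # ceil(n * p / 100) without math.ceil
    return ranked[max(k, 1) - 1]

# Simulated API latencies in ms: 95 fast requests and a slow tail.
latencies = [200] * 95 + [900, 1200, 2000, 4000, 4000]

p50 = percentile(latencies, 50)  # 200ms -- looks healthy
p99 = percentile(latencies, 99)  # 4000ms -- the tail users actually feel
```

The mean of this sample is well under 400ms, which is exactly why a dashboard showing only averages would miss the one-in-a-hundred 4-second requests.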

Segment these metrics by page, by user plan tier, by geographic region, and by device type. Performance problems are rarely uniform. They cluster around specific flows, specific user segments, or specific geographies.

Our website performance optimization guide covers the full measurement framework. Start there if you do not have monitoring in place yet.

The Database: Where Most Problems Live

In our experience, 70% of performance complaints trace back to the database. Not the frontend, not the CDN, not the application server. The database.

Here is how to find and fix database performance issues:

Identify slow queries. Every major database has a slow query log. In PostgreSQL, set log_min_duration_statement to 200ms and let it run for 24 hours. You will get a log entry for every query that takes longer than 200ms; aggregate it with a tool like pgBadger, or use the pg_stat_statements extension, to rank queries by frequency and total time consumed. The top 10 queries on that list are your optimization targets.

Missing indexes are the most common problem. A query that scans a full table with 10 million rows when it could use an index on two columns is the difference between 5ms and 5 seconds. Run EXPLAIN ANALYZE on every slow query and look for sequential scans on large tables.

N+1 queries are the second most common problem. Your ORM loads a list of 50 orders, then fires a separate query for each order's customer data. That is 51 queries where a join or a single batched lookup needs one or two. This pattern alone accounts for the majority of "the app got slow as we scaled" complaints. Add eager loading or batch loading to eliminate them.
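The pattern is easiest to see with a query counter. This is a toy sketch with a hypothetical in-memory store, not any particular ORM's API; the point is the query count, not the data access layer.

```python
CUSTOMERS = {1: "Ada", 2: "Grace", 3: "Edsger"}
ORDERS = [{"id": i, "customer_id": (i % 3) + 1} for i in range(50)]

class FakeDB:
    """Counts queries so the N+1 pattern is visible."""
    def __init__(self):
        self.queries = 0
    def orders(self):
        self.queries += 1
        return list(ORDERS)
    def customer(self, cid):
        self.queries += 1          # one row at a time
        return CUSTOMERS[cid]
    def customers(self, cids):
        self.queries += 1          # batched: WHERE id IN (...)
        return {c: CUSTOMERS[c] for c in cids}

def load_naive(db):
    # N+1: one query for the orders, then one per order for its customer.
    return [{**o, "customer": db.customer(o["customer_id"])} for o in db.orders()]

def load_batched(db):
    # Two queries total: the orders, then one batched customer lookup.
    orders = db.orders()
    names = db.customers({o["customer_id"] for o in orders})
    return [{**o, "customer": names[o["customer_id"]]} for o in orders]

naive_db, batched_db = FakeDB(), FakeDB()
load_naive(naive_db)      # 51 queries
load_batched(batched_db)  # 2 queries
```

Most ORMs do the batched version for you once you ask for it (eager loading, `includes`, `select_related`, a dataloader), so the fix is usually one line.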

Connection exhaustion happens when your application opens more database connections than the database can handle. Requests queue up waiting for a connection, and response times spike. Implement connection pooling and monitor your pool utilization. If you are consistently above 80% utilization, either increase the pool size or optimize queries so connections are held for less time.
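A pool is conceptually just a fixed set of connections behind a queue. The sketch below is illustrative only; in production you would use your driver's or framework's pool (pgbouncer, HikariCP, SQLAlchemy's pool) and export the utilization number to your monitoring.

```python
import queue

class ConnectionPool:
    """Minimal fixed-size pool; the strings stand in for real connections."""
    def __init__(self, size):
        self.size = size
        self._idle = queue.Queue()
        for i in range(size):
            self._idle.put(f"conn-{i}")

    def acquire(self, timeout=1.0):
        # When the pool is exhausted, callers queue up here -- this wait
        # is exactly the response-time spike described above.
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        self._idle.put(conn)

    def utilization(self):
        return (self.size - self._idle.qsize()) / self.size

pool = ConnectionPool(size=10)
held = [pool.acquire() for _ in range(9)]
busy = pool.utilization()  # 0.9 -- above the 80% threshold, time to act
for conn in held:
    pool.release(conn)
```

Alerting when `utilization()` stays above 0.8 gives you warning before requests start timing out rather than after.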

Table bloat in PostgreSQL occurs when dead rows accumulate faster than autovacuum can clean them up. This makes tables physically larger, which makes sequential scans slower. Check your autovacuum settings and consider more aggressive tuning for high-write tables.

Caching: The Fastest Fix for Read Heavy Workloads

If your application reads the same data repeatedly, caching is the fastest path to dramatic performance improvement. A cache hit at 1ms versus a database query at 50ms is a 50x improvement that your users feel immediately.

Layer your caching strategy:

Application level caching (Redis, Memcached) for database query results, computed values, and session data. Cache aggressively but set appropriate TTLs. Stale data is usually better than slow data, within reason.
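The core mechanics of application-level caching fit in a few lines. This is a single-process sketch of what Redis or Memcached give you over the network (plus eviction, persistence, and concurrency safety that this toy deliberately omits); the `get_user` helper and its 60-second TTL are illustrative choices, not a recommendation.

```python
import time

class TTLCache:
    """Tiny cache with per-entry expiry. Not safe for concurrent use."""
    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (value, time.monotonic() + ttl)

cache = TTLCache()

def get_user(user_id, loader):
    """Read-through: serve from cache, fall back to the slow loader."""
    key = f"user:{user_id}"
    value = cache.get(key)
    if value is None:
        value = loader(user_id)        # e.g. the 50ms database query
        cache.set(key, value, ttl=60)  # stale after 60s, within reason
    return value
```

The read-through shape matters: callers never talk to the cache directly, so the TTL and invalidation policy live in one place instead of being scattered across the codebase.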

HTTP caching for API responses that do not change frequently. Set Cache-Control headers so browsers and CDNs cache responses. For authenticated endpoints, use ETag or Last-Modified headers for conditional requests.
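Conditional requests are a simple exchange: the server hashes the response into an ETag, and when the client sends that tag back in If-None-Match, the server can answer 304 with no body. A framework-free sketch, with invented payloads:

```python
import hashlib
import json

def make_etag(payload: dict) -> str:
    # Hash of the serialized body: identical content yields an identical tag.
    body = json.dumps(payload, sort_keys=True).encode()
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(payload, if_none_match=None):
    """Return (status, headers, body), honoring a conditional request."""
    etag = make_etag(payload)
    if if_none_match == etag:
        # The client's cached copy is current: 304 Not Modified, no body.
        return 304, {"ETag": etag}, None
    return 200, {"ETag": etag, "Cache-Control": "private, max-age=0"}, payload

# First request downloads the body; the revalidation round-trip does not.
status1, headers1, _ = respond({"plan": "pro"})
status2, _, body2 = respond({"plan": "pro"}, if_none_match=headers1["ETag"])
```

The user still pays one round-trip for revalidation, but for a large JSON payload the 304 saves nearly all of the transfer time.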

CDN caching for static assets, images, CSS, JavaScript, and fonts. If your static assets are not served from a CDN, you are forcing every user to download them from your origin server regardless of their location. This alone can shave 500ms to 2 seconds off page loads for international users.

Edge caching for dynamic content that is the same for all users in a region (pricing pages, product catalogs, public content). Edge functions on platforms like Cloudflare or Vercel can serve cached responses from the nearest point of presence without hitting your origin.

The key is knowing what to cache, how long to cache it, and how to invalidate it. Cache invalidation is famously one of the hardest problems in computer science, but for most applications, a TTL based approach with manual invalidation for critical updates works well.

Frontend Performance: Death by a Thousand Scripts

If your TTFB is fine but your LCP and TBT are poor, the problem is on the client side. Frontend performance issues are usually caused by:

Too much JavaScript. Audit your bundle size. Many production applications ship 2 to 5MB of JavaScript when users need less than 500KB for the initial page load. Every byte of JavaScript has to be downloaded, parsed, compiled, and executed before the page is interactive.

Use code splitting to load JavaScript on demand instead of upfront. Route based splitting ensures users only download the code for the page they are viewing. Component level splitting defers heavy components (charts, editors, maps) until they are needed.

Unoptimized images. Images are typically the largest assets on a page. Use modern formats (WebP, AVIF), serve responsive sizes based on the user's viewport, and lazy load images that are not visible on initial render. A single unoptimized hero image can add 2 to 3 seconds to your page load.

Third party scripts. Analytics, chat widgets, A/B testing tools, social media embeds: each one adds latency. Audit every third party script and defer non critical ones. Load them after the page is interactive, not during the initial render. We have seen pages where third party scripts account for 60% of the total load time.

Render blocking resources. CSS files in the head block rendering until they are fully downloaded. Inline critical CSS for above the fold content and load the rest asynchronously. This dramatically improves perceived performance even if total load time stays the same.

API and Backend Optimization

If your backend is the bottleneck, look at these areas:

Serialization overhead. Converting database objects to JSON responses takes time, especially for large payloads. Profile your serialization layer. Sometimes switching to a faster JSON library or reducing response payload size cuts response times by 30% or more.
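Payload trimming is often the cheapest serialization win. The sketch below uses an invented order record with fields a list view never renders; the principle, serializing only what the client needs, applies regardless of your framework's serializer.

```python
import json

# A row as the ORM hands it over, including fields the client never renders.
order = {
    "id": 42, "status": "shipped", "total_cents": 12999,
    "internal_notes": "n" * 2000,
    "audit_log": [{"event": "update"}] * 50,
}

LIST_FIELDS = ("id", "status", "total_cents")

def serialize_for_list(orders):
    # Smaller payloads serialize faster, transfer faster, and parse faster.
    return json.dumps([{f: o[f] for f in LIST_FIELDS} for o in orders])

full = json.dumps([order] * 50)          # everything, every row
trimmed = serialize_for_list([order] * 50)
```

A dedicated "list" serializer alongside the "detail" one keeps the trimming explicit instead of relying on the frontend to ignore fields it was sent anyway.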

External API calls in the request path. If your endpoint calls two or three external APIs synchronously before responding, those latencies add up. Parallelize independent external calls. Cache responses from external APIs that do not change frequently. Move non critical external calls to background jobs.
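With a thread pool, independent calls cost you the slowest one instead of the sum. The two "APIs" below are stand-ins that sleep to simulate network latency; real code would add timeouts and error handling per call.

```python
import concurrent.futures
import time

def call_payment_api():
    time.sleep(0.05)   # simulated 50ms external call
    return {"ok": True}

def call_shipping_api():
    time.sleep(0.05)   # simulated 50ms external call
    return {"eta_days": 3}

def handle_request_parallel():
    # Both calls run concurrently: the request waits ~50ms, not ~100ms.
    with concurrent.futures.ThreadPoolExecutor() as pool:
        payment = pool.submit(call_payment_api)
        shipping = pool.submit(call_shipping_api)
        return {"payment": payment.result(), "shipping": shipping.result()}

result = handle_request_parallel()
```

The same shape works with asyncio and `gather` if your stack is already async; the win comes from overlapping the waits, not from the specific concurrency primitive.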

Background job processing. If you are doing heavy computation, file processing, or third party sync in the request path, move it to a background worker. Return a 202 Accepted to the client and process asynchronously. This keeps your API responsive even when handling expensive operations.
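The 202 pattern in miniature: enqueue, return immediately with a status URL, and let a worker do the heavy part. This in-process sketch uses a thread and a queue where production systems would use Celery, Sidekiq, SQS, or similar; the `sum` stands in for real work.

```python
import queue
import threading

jobs = queue.Queue()
results = {}

def worker():
    while True:
        job_id, payload = jobs.get()
        # Heavy computation happens here, off the request path.
        results[job_id] = sum(payload)
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_upload(job_id, payload):
    """Enqueue and return 202 Accepted immediately."""
    jobs.put((job_id, payload))
    return 202, {"job_id": job_id, "status_url": f"/jobs/{job_id}"}

status, body = handle_upload("job-1", [1, 2, 3])
jobs.join()  # in production the client polls status_url instead of blocking
```

The client-visible contract is the important part: the endpoint's latency is now the cost of an enqueue, not the cost of the work.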

For cloud and infrastructure optimization, right sizing your servers matters too. Over provisioned servers waste money. Under provisioned servers cause the exact performance problems your customers are complaining about. Use monitoring data to find the right balance. Our guide on reducing AWS cloud costs covers this in detail.

The Quick Wins Checklist

When customers are already complaining, you need quick wins while you work on deeper fixes. Here are changes that typically take less than a day each and produce measurable improvement:

1. Enable gzip/brotli compression on your web server if it is not already on. This reduces transfer sizes by 60 to 80%.

2. Add database indexes for your top 5 slowest queries.

3. Set Cache-Control headers on static assets to cache for 1 year (use content hashes in filenames for cache busting).

4. Lazy load images below the fold.

5. Defer third party scripts to load after DOMContentLoaded.

6. Enable HTTP/2 on your web server for multiplexed requests.

7. Add connection pooling if your database connections are not pooled.

These seven changes, none of which require significant code refactoring, typically improve page load times by 30 to 50% combined.
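You can sanity-check the first quick win, compression, without touching a server. Python's gzip module uses the same DEFLATE algorithm your web server applies; the synthetic markup below is more repetitive than real pages, so it compresses even better than the typical 60 to 80%.

```python
import gzip

# Repetitive text, like HTML, JSON, and JavaScript, compresses well.
html = ("<div class='row'><span>item</span></div>" * 500).encode()

compressed = gzip.compress(html)
saved = 1 - len(compressed) / len(html)  # fraction of bytes not transferred
```

On a real page, the saved fraction translates directly into transfer time, which is why enabling compression is usually the highest ratio of impact to effort on this list.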

When to Call for Help

Performance optimization is iterative. You measure, identify the bottleneck, fix it, and the next bottleneck reveals itself. Some teams can handle this cycle internally. But if performance complaints are affecting revenue, customer retention, or your team's ability to ship new features, the cost of slow iteration is too high.

We specialize in performance audits and optimization for production applications. If your customers are already complaining, reach out today and we will help you find and fix the root cause before you lose more users.
