• Support · PHP
  • "Too many connections" errors after switch to PHP-CGI

Hi! Since the migration to PHP-CGI a couple of months ago I've been seeing a lot of errors, especially "Too many connections". Is there anything I can do to mitigate this? I've cached the application as much as I can, and a particularly overloaded part was rewritten to be static, but the errors remain. I never saw them on PHP-FPM.
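
For context, the caching is mostly along these lines (a simplified sketch; the key name, TTL and model here are made up):

    use Illuminate\Support\Facades\Cache;
    use App\Models\Post;

    // Serve the expensive query from the cache for an hour so that most
    // page views never touch the database at all.
    $posts = Cache::remember('home.recent_posts', 3600, function () {
        return Post::latest()->take(20)->get();
    });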

For reference, this is a Laravel application with roughly 1000-2000 page views per day. I'm seeing traffic drop, and warnings are cropping up in Search Console as well. Is it time for me to consider moving to a VPS instead of a shared instance? PHP-FPM on Nginx might be an option, but since I rely on .htaccess I would need to figure out how to replace that logic first.
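
From what I've read, the stock Laravel .htaccess rewrite boils down to roughly the following in Nginx; this is untested on my side, the paths and socket are placeholders, and my custom rules would still need porting by hand:

    # Roughly what Laravel's default .htaccess does: send every request
    # that isn't an existing file or directory to index.php.
    server {
        listen 80;
        server_name example.com;               # placeholder
        root /var/www/app/public;              # placeholder
        index index.php;

        location / {
            try_files $uri $uri/ /index.php?$query_string;
        }

        location ~ \.php$ {
            include fastcgi_params;
            fastcgi_pass unix:/run/php/php-fpm.sock;   # placeholder socket
            fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
        }
    }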

Any input greatly appreciated.

Cheers,
Linus

    bohman it's very possible that PHP-CGI is contributing to this. Nginx+FPM usually does help, but your concern about .htaccess is a valid one. If you can make it work without .htaccess then I really recommend FPM.

    It's also possible that it's not actually your own site causing the problem. If you would like us to look into it, please email support and let us know which site and the date/time (with timezone) when you saw the errors.

    FWIW I'm also seeing this on a Django site. My monitoring is picking up occasional outages due to too many MySQL connections. The site gets very little traffic and, judging by the access logs, it doesn't seem to be the one causing the issue. I'll drop support an email with further details. @bohman It's perhaps worth checking the same before you do any re-engineering.

    Yeah, I'll drop an e-mail to support too. Thanks!

    Please do update this thread if you find any details; I'm seeing the same issue with a Django app.

    4 days later

    Same error here quite often with a Django app since I migrated from Webfaction (so it's not a PHP issue like the OP's).

    • sean replied to this.

      josearr The database service is shared by all customers regardless of the type of application they're running.

      What we think might be happening is:

      • A botnet or some other bad actor starts hitting common target URLs (like WordPress's xmlrpc.php) on a shared server.
      • Many PHP-CGI processes are spawned, each with its own database connection.
      • System load rises, so processes run slower, which keeps individual DB connections open longer and exacerbates the problem.
      • Eventually the system-wide DB connection limit is hit, and the problem then affects other apps.

      We're working to resolve this as soon as possible. In the interim, some possible ways to mitigate the effects are:

      • Minimize your database hits by using whatever caching capabilities your application provides.
      • Use a private database instance (MariaDB or PostgreSQL) so that you're not subject to the shared system-wide connection limit; switching over is just a connection-settings change, as sketched below.
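
      For illustration, a Laravel app would point at a private instance with something like this (host, port and credentials are placeholders; other frameworks have equivalent settings):

          // config/database.php (excerpt): connect to a private MariaDB
          // instance instead of the shared server.
          'connections' => [
              'mysql' => [
                  'driver'   => 'mysql',
                  'host'     => env('DB_HOST', '127.0.0.1'),
                  'port'     => env('DB_PORT', '3307'),   // private instance's port
                  'database' => env('DB_DATABASE', 'myapp'),
                  'username' => env('DB_USERNAME', 'myapp'),
                  'password' => env('DB_PASSWORD', ''),
              ],
          ],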

      Maybe it would help if wait_timeout on the DB server were lowered from 28800 seconds (8 hours) to something more reasonable? As I understand it, right now an app can keep a connection open for 8 hours without even using it.
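
      In the meantime, I believe an app can lower the timeout for its own connections at the session level; a quick PDO sketch (the DSN and credentials are placeholders):

          // Ask MySQL to close *this* session's connection after 60s of
          // idling, regardless of the server-wide 8-hour default.
          $pdo = new PDO('mysql:host=shared-db.example;dbname=mydb', $user, $pass);
          $pdo->exec('SET SESSION wait_timeout = 60');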

      • sean replied to this.

        If this is a system-wide issue, maybe some temporary fixes could be:

        • provide a quick installer for a private instance
        • provide a migration script from/to a private instance?

        I'm only suggesting this because it's a frustrating experience to have your application limited like this; I realize any effort here is likely temporary, and of course running a custom DB counts against the application/memory limits.

        ... and doing this for multiple apps will have quite the overhead ;P

          igor hmm, will pass that along to the sysadmin.

          etienneh great suggestions, thanks! and I hear you on the frustration.

          @sean Are you maybe running backups or doing some other DB maintenance at about 22:00-22:20 UTC? 🙂

          I just watched active connections skyrocket from 180 to 250 (and "Too many connections") within two seconds; after that they plummeted to around 20 and stayed there. I assume they started at around 20-40 and shot up to 250 really fast. It seems to happen at approximately the same time every day.
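
          In case anyone else wants to watch for the spike, I'm essentially polling the server status once a second; a rough PDO sketch of the idea (host and credentials are placeholders, and my actual monitor looks different):

              // Print the number of open connections once per second.
              $pdo = new PDO('mysql:host=shared-db.example', $user, $pass);
              while (true) {
                  $row = $pdo->query("SHOW STATUS LIKE 'Threads_connected'")->fetch();
                  printf("%s %d\n", date('c'), $row['Value']);
                  sleep(1);
              }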

            igor can corroborate that my monitoring alert fires at 2 PM PST (UTC-8), i.e. 22:00 UTC, consistently.

              igor etienneh yep, earlier today we determined that the backups seem to line up with most of these problems. We're looking into ways to reduce the impact that the backups have.

                7 days later

                sean any news? :-)
                Still getting daily crash reports and downtime alerts

                  etienneh still working on it. It's likely we'll have to stop backing up the largest DBs; we'll notify the affected customers if it comes to that.

                    sean would it be possible to back up with some jitter? Each DB gets a random time slot in which its backup runs?

                    • sean replied to this.

                      etienneh I don't think the current backup setup can do that, but I'll pass the suggestion along.

                      5 days later

                      Any thoughts on providing a quick installer for a private instance, or a migration script to/from one?

                      Still getting daily crashes and downtime alerts 😉

                      Actually @sean I tried to do the migration myself and I'm still getting errors on the private instance, so something isn't adding up?

                      • sean replied to this.