NOTE: Special thanks to Andrew Nagy at Sangoma for pointing me to some information that formed the basis of this fix. I had originally created a bug report with a patch that fixed this script, which set the entire thing in motion. This information is being contributed to the community by me and the company I own (Voiceopia Communications), in the hope that it’ll be of assistance to others. Of course, if you need some turn-key FreePBX stuff, drop us a line!
For quite some time, I have been happily running FreePBX 13 systems (under FreePBX Distro) in the Amazon EC2 cloud without any issues. Once FreePBX 14 went stable, it was the next logical step to set it up to run within EC2 as well, with the idea being to upgrade some of my 13 systems over to it.
Several months of experimentation went by, and I kept running into a common issue: the systems seemed to drop off the network after an hour or so of uptime. Restarting the system via AWS got things back in order, but only for another hour or so. Then the process would repeat itself.
After spending a fair amount of time on this, as well as reaching out to AWS paid support (which was not able to figure this out, either), I determined that the systems weren’t losing their IP address per se, but they were losing their default gateway. Thus, they were unable to communicate with anything outside of their subnet within AWS.
To add insult to injury, it seemed this was an issue that happened only within AWS. I was able to spin up the same systems at other providers, as well as within our own virtual environment at the office, and everything worked just fine for months on end.
Scouring the ‘net revealed that the problem lay with two new parameters, “valid_lft” and “preferred_lft”, used when binding the IP address to the interface. The suggested fix was to remove them from the DHCP client’s script (/usr/sbin/dhclient-script). I gave that a try, and sure enough, it worked!
At least until an updated dhclient package was released…
Searching for a more permanent solution, I created a patch file and submitted it to FreePBX as a bug report. I’m fairly convinced it’s an AWS issue, but they were of no help before, so I figured I’d start somewhere.
I’ll let you read the backstory on the bug report if you’re interested. However, Mr. Nagy’s link provided a bit of inspiration: Can I redefine the problem functions in a dhclient hook?
It turns out that’s entirely possible, and that’s the basis of my fix. The path given for “enter” hooks in the link provided is incorrect (apparently there is no dhclient-enter-hooks.d directory), but by looking at the dhclient-script file I was able to figure out where these need to go.
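To illustrate the general idea of changing dhclient-script’s behavior from a hook file, here is a minimal sketch. It is NOT the script referred to in this post, and it uses a different technique: rather than redefining the problem functions, it wraps the ip command and drops the two lifetime parameters. It assumes that dhclient-script sources /etc/dhcp/dhclient-enter-hooks before configuring the interface and that it calls ip by its bare name. The helper name run_without_lifetimes is hypothetical.

```shell
# Hypothetical sketch of an enter hook; not the author's script.
# Assumption: dhclient-script sources this file and calls "ip" by bare name.

# Run a command after removing every "valid_lft <value>" and
# "preferred_lft <value>" pair from its argument list.
run_without_lifetimes() {
    _lft_remaining=$#
    while [ "$_lft_remaining" -gt 0 ]; do
        case "$1" in
            valid_lft|preferred_lft)
                # Drop the keyword, then its value if one follows.
                shift
                _lft_remaining=$((_lft_remaining - 1))
                if [ "$_lft_remaining" -gt 0 ]; then
                    shift
                    _lft_remaining=$((_lft_remaining - 1))
                fi
                ;;
            *)
                # Keep this argument by rotating it to the end of the list.
                set -- "$@" "$1"
                shift
                _lft_remaining=$((_lft_remaining - 1))
                ;;
        esac
    done
    "$@"
}

# Shadow "ip" for the rest of the script that sources this file.
# "command ip" bypasses this function and runs the real binary.
ip() {
    run_without_lifetimes command ip "$@"
}
```

With no lifetime arguments given, ip addr assigns the address with a lifetime of “forever”, which is the result this fix is after. Treat this as a starting point for experimentation only, and test it on a system you can reach through a console.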
So… how do you fix it?
Once you have downloaded this script, upload it to the /etc/dhcp directory on your FreePBX system at AWS. Once that’s done, you can reboot your system, or use the following command to restart networking:
ifdown eth0 ; ifup eth0
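Note that you must enter both commands on one line, exactly as shown. If you run ifdown by itself over a remote session, the interface goes down before you can type ifup, and you’ll lock yourself out of your server.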
Once that’s completed, run: ip addr
Pay attention to the output for eth0. It should look like:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 10.xx.x.xx/23 brd 10.xx.x.xxx scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2600:1f18:4550:xxxx:x:dead:beef:cafe/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::418:xxxx:xxxx:xxxx/64 scope link
       valid_lft forever preferred_lft forever
Look at your system’s inet and inet6 addresses, in particular the “valid_lft” and “preferred_lft” values. If they say “forever” as above, then you have successfully applied the fix. If they show a numeric value (which will decrease each time you re-run the command), double-check that you named the script “dhclient-enter-hooks” and saved it into /etc/dhcp on your system.
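If you would rather not eyeball the output, the same check can be scripted. The following is a small sketch; the function name lifetimes_are_forever is my own invention, not part of any package. It reads ip addr output on standard input and succeeds only if it finds at least one lifetime line and every lifetime on it reads “forever”.

```shell
# Hypothetical helper: succeeds (exit 0) only when every valid_lft and
# preferred_lft value in the "ip addr" output on stdin is "forever".
lifetimes_are_forever() {
    awk '
        /valid_lft/ {
            seen = 1
            # Walk the fields and inspect the value after each keyword.
            for (i = 1; i < NF; i++) {
                if (($i == "valid_lft" || $i == "preferred_lft") &&
                    $(i + 1) != "forever") {
                    bad = 1
                }
            }
        }
        # Fail if no lifetime line was seen or any value was numeric.
        END { exit (seen && !bad) ? 0 : 1 }
    '
}
```

A possible way to use it: ip addr show dev eth0 | lifetimes_are_forever && echo "fix applied"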
The fix you apply here should survive updates to the dhclient package from upstream.