Hard Drive Issues

March 29th, 2012, by Alexander Catapang @ 9:03 pm

The site was down for a couple of hours. It appears that the HDD holding the main database has developed some bad sectors. This caused database errors, which in turn affected the game servers. Unfortunately, it’s the kind of error that my automated scripts are not able to detect, so the games were inaccessible for a number of hours.

The same incident happened last month, though I was quicker to catch the problem then. So today I made some major changes to prevent this from happening again. I have moved the entire database to the other server. This means more load on that particular server, but ultimately, it’s a more stable setup. I’m going to have to replace the server with the failing HDD, which will take some time. I figured it’s probably better to provision an entirely new server instead of just replacing the hard drive (as I would likely have to re-install and re-configure everything anyway). In the meantime, the current setup looks stable enough.

I’ve also coded some new scripts, so I will be notified immediately if the same issue happens again. This should help reduce potential downtime from hours to just minutes. Since I am unable to monitor everything 24/7, and since hardware components are bound to have issues eventually, I think this is a good enough solution for now.
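
For readers curious what a notification script of this kind might look like, here is a minimal Python sketch. It is not the actual script used on the site. It assumes a hypothetical `probe` callable that raises an exception when the database misbehaves (for example, a small test query) and a hypothetical `notify` callable that sends an SMS or e-mail. It alerts once when the probe has failed several times in a row, so a single blip does not trigger a message.

```python
from typing import Callable


class DatabaseMonitor:
    """Alerts once when a probe fails several times in a row."""

    def __init__(self, probe: Callable[[], None], notify: Callable[[str], None],
                 threshold: int = 3) -> None:
        self.probe = probe          # hypothetical: raises on failure, e.g. a test query
        self.notify = notify        # hypothetical: sends an SMS or e-mail
        self.threshold = threshold  # consecutive failures needed before alerting
        self.failures = 0

    def check(self) -> bool:
        """Run one probe; return True if the database looked healthy."""
        try:
            self.probe()
        except Exception as exc:
            self.failures += 1
            # Alert exactly once, when the streak first reaches the threshold.
            if self.failures == self.threshold:
                self.notify(
                    f"database probe failed {self.failures} times in a row: {exc}"
                )
            return False
        self.failures = 0  # any success resets the streak
        return True
```

Calling `check()` once a minute from a loop or a scheduled job would keep detection time to a few minutes at most, depending on the threshold chosen.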


Flash Policy Server

March 31st, 2011, by Alexander Catapang @ 10:12 pm

My latest project has a dedicated Flash policy server. Basically, this speeds up the initial connection to the bingo game servers. The good news is that this is also usable on HeyBingo. The changes went live in half of all the bingo rooms yesterday for initial tests. As of today, all the rooms benefit from it.

For quite some time now, some players have been reporting that they always have issues connecting to the bingo games. The games do get interrupted once a day for scheduled automatic maintenance, but this should only take a few seconds. Other than this very brief scheduled downtime (and the monthly maintenance, which takes a bit longer), the games should be accessible all the time.

There are other potential reasons why some players always get errors saying that the bingo games can’t connect to the servers (e.g. firewall settings). However, most of the reports come from players who are just having a hard time getting through: it takes a lot of retries before the games connect successfully to the game servers.

Without a policy server, there is really at least a 3-second delay before the bingo games can confirm a successful connection (a Flash security thing). I am guessing that because of this 3-second delay, the connection of some players may be timing out improperly, hence the reported inability to connect to the game servers. I am hoping that the recent changes will minimize, if not completely solve, this particular problem.
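
To illustrate the idea, here is a minimal Python sketch of a socket policy server. It is not the server actually running on the site. As far as I understand the mechanism, Flash Player sends the string `<policy-file-request/>` followed by a NUL byte to port 843 and expects a NUL-terminated XML policy back; if nothing answers, it waits and then falls back to asking the game port itself, which is where the delay comes from. The domain `*` and the port range `9000-9010` below are placeholders, since the real values are not given here.

```python
import socketserver

POLICY_REQUEST = b"<policy-file-request/>\x00"


def build_policy(domain: str, ports: str) -> bytes:
    """Return a NUL-terminated socket policy allowing `domain` to reach `ports`."""
    xml = (
        '<?xml version="1.0"?>'
        "<cross-domain-policy>"
        f'<allow-access-from domain="{domain}" to-ports="{ports}"/>'
        "</cross-domain-policy>"
    )
    return xml.encode("ascii") + b"\x00"


def respond(request: bytes, policy: bytes) -> bytes:
    """Answer a policy request with the policy; answer anything else with nothing."""
    return policy if request == POLICY_REQUEST else b""


class PolicyHandler(socketserver.BaseRequestHandler):
    # Placeholder values: the real domain and game ports are assumptions.
    policy = build_policy("*", "9000-9010")

    def handle(self) -> None:
        self.request.settimeout(5)
        try:
            # Simplification: assumes the short request arrives in one packet.
            data = self.request.recv(len(POLICY_REQUEST))
        except OSError:
            return  # timed out or connection reset; nothing to answer
        self.request.sendall(respond(data, self.policy))


def serve(host: str = "0.0.0.0", port: int = 843) -> None:
    """Listen for policy requests; binding port 843 usually needs root privileges."""
    with socketserver.ThreadingTCPServer((host, port), PolicyHandler) as server:
        server.serve_forever()
```

Running `serve()` on the same machine as the game servers lets the client get its answer right away instead of waiting for the fallback.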




Emergency Maintenance

March 28th, 2011, by Alexander Catapang @ 3:59 pm

One of the servers became unresponsive yesterday. Since that is where the entire database was located, the site and all the games were affected. I estimate the total downtime at around 30 minutes before I got things back to normal.

I have experienced this issue before; however, it usually resolves by itself within a minute or two, and it only occurs every couple of weeks. I haven’t been able to figure out the exact cause, since each occurrence is so brief, so I assumed it was just an intermittent network issue with my host.

Yesterday, the server kept becoming unresponsive on and off, so I had no choice but to reboot the machine to see if that would fix things (it did). Fortunately, I was able to catch a glimpse of what’s possibly causing this issue, and it looks database-related. I have not really made any optimizations since I ordered the new servers last December, so the database is likely causing some bottlenecks given the site’s regular traffic.

I made some optimization changes yesterday and today, so hopefully the problem is fully resolved. As usual, I will be monitoring things in case the same issue reappears.


E-Mail Delivery Issues

March 18th, 2011, by Alexander Catapang @ 4:19 pm

Recently, I’ve noticed that more players are reporting that they are not receiving their registration codes via e-mail. Upon further investigation, it looks like the problem started on March 3, about 2 weeks ago. Unfortunately, it’s only now that the problem is being resolved.

Microsoft has apparently blocked my server from sending e-mails to any of their e-mail systems, due to suspected spam. E-mails are not just being flagged as potential spam; all of them are being rejected outright. I got in touch with them 2 days ago, and they have now corrected the issue. However, it might still take 24-48 hours for the fix to propagate across all their systems worldwide.

So if you have a Microsoft-hosted e-mail account (Hotmail, MSN, Live), in any country or region, and you are still waiting for your registration code, please request to have your code re-sent.

I apologize for the inconvenience this may have caused some players, especially the new ones who thought I had been ignoring their registration code inquiries, when in fact I was unable to send them ANY e-mail at all. Anyway, there are fewer bounced e-mails now, so hopefully the problem has been fully resolved.

Please note: this doesn’t guarantee 100% e-mail delivery to your Inbox. I still suggest that you put HeyBingo.com’s support e-mail address in your Safe List or Contact List before requesting the code. This way, the e-mail will have a better chance of getting delivered to your Inbox instead of your Spam folder. This tip also applies to non-Microsoft e-mail users.


Server Migration

December 9th, 2010, by Alexander Catapang @ 3:45 pm

I have not been very happy with my dedicated server host lately. I’ve been with them for the past few years (since this site launched), and things have been generally smooth, but they have undergone some changes recently and I feel the service has been going downhill since then.

They have also increased prices, so if I need new servers, I can no longer get them at the price I used to pay. Plus, I have experienced some minor issues that I won’t go into detail about. In the end, I decided to transfer to a new server host. Prices are comparable to what I was paying before, and service so far has been great. There are a lot more tools at my disposal to automate things, including mobile apps so I don’t need to be in front of a computer to manage the servers in case of emergency.

Over the past few days, I have been migrating the entire site to the new servers in several stages, to keep downtime to a minimum. Many of you may not have noticed anything unusual at all.

I now have fewer physical servers to manage, but at the same time, these servers are faster than the old ones, so it kinda evens out. I was also able to save as much as 40% in monthly server costs, which comes in handy these days. So fewer servers to manage, faster servers, lower overall costs … I’m a happy guy. 🙂

It took a lot of effort, as I also had to learn new things in the process, since I decided to go with a different server set-up than what I was used to. Before ordering the servers, I did a lot of local tests to cover potential issues I might run into when working on the production servers.

Getting the new servers is also in preparation for the launch of my newest project, which is what’s keeping me busy these days. I am sorry that I have not been able to update the HeyBingo site beyond the usual maintenance, but perhaps in the future there will be some major changes (for the better).

In the meantime, since you are able to read this post now, it means the server migration has been fully completed. As usual, I will be monitoring things over the next few days in case new issues appear, but so far everything has been smooth and I’ve only encountered minor issues that were easily corrected.

Happy holidays everyone!




Power Outage

June 21st, 2009, by Alexander Catapang @ 7:59 am

There was a power outage at my dedicated server hosting provider. They have fixed the problem, although they are still investigating its cause.

Unfortunately, when my servers became available again, the one holding the player database had connection issues. I have written maintenance scripts to recover from such situations, but this one was something new, and the overall setup was not able to correct itself automatically. I was asleep when the incident happened, and it’s a good thing my phone alerted me to the problem (though the alert itself was delayed for some reason).
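
As a rough illustration of what automatic recovery can look like, here is a minimal Python sketch of reconnecting with exponential backoff. It is not the maintenance script mentioned above. The `connect` callable is hypothetical and stands in for whatever opens the database connection; the function gives up after a fixed number of attempts so that a human can be alerted instead.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retry(connect: Callable[[], T], attempts: int = 5,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Try `connect` up to `attempts` times, doubling the wait after each failure.

    Returns the connection, or None if every attempt failed, in which case the
    caller should escalate (for example, by alerting a person).
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()  # hypothetical: opens the database connection
        except OSError:
            if attempt == attempts:
                break  # out of attempts; do not sleep again
            sleep(delay)
            delay *= 2  # back off: 1s, 2s, 4s, ...
    return None
```

The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.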

Total downtime was about 1 1/2 hours. I am still monitoring things to make sure that everything is back to normal.

Thanks to those who have e-mailed about the database error. My apologies for the game interruptions.


Server Issues [Resolved]

May 13th, 2009, by Alexander Catapang @ 6:17 am

One of the servers has been having issues for the past hour or so, and about 1/3 of the game rooms have been affected. I have made a temporary set-up, so all the rooms are now back online. I am still waiting for a status update from my hosting provider about what happened to the server and when it will be back online.

UPDATE: The server is now back online. I have reverted to the normal set-up. I had to restart 2/3 of the game rooms, but everything is now back to normal.

