New DBA Series: 99 Ways to CRASH a Database
The best way to avoid disasters is to learn from mistakes, preferably someone else’s. Memorize this series or polish your resume-writing skills!
Data protection consists of two equally important components: backups and after-image archives. Almost everyone understands backups: if there’s a problem, they get you 95% of your data back, up until the time of the last backup. After-image archives get you down that last mile, containing the detailed changes that were applied to your database. Think of them as a recording that you can play back (we call this “rolling forward”) on top of your restored DB. All the recorded changes are applied to the restored database in the same order as they were made the first time around.
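To make “rolling forward” concrete, here is a toy sketch in Python. It is not how OpenEdge stores or applies after-image data: the dict-based “database”, the `(operation, key, value)` change tuples, and the names `apply_change` and `roll_forward` are all assumptions made up for illustration. It only shows the idea of restoring a backup and then replaying the recorded changes in their original order.

```python
import copy

def apply_change(db, change):
    """Apply one recorded change to the toy in-memory 'database' (a dict)."""
    op, key, value = change
    if op == "put":
        db[key] = value
    elif op == "delete":
        db.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")

def roll_forward(backup, archived_changes):
    """Restore a copy of the backup, then replay changes in recorded order."""
    restored = copy.deepcopy(backup)  # the backup itself stays untouched
    for change in archived_changes:
        apply_change(restored, change)
    return restored

# State captured by the last backup (hypothetical data).
backup = {"order-1": "shipped", "order-2": "open"}

# Changes recorded after the backup was taken (hypothetical data).
after_image_log = [
    ("put", "order-3", "open"),
    ("put", "order-2", "shipped"),
    ("delete", "order-1", None),
]

recovered = roll_forward(backup, after_image_log)
```

Without the recorded changes you would be left with only `backup`, which is missing everything that happened after the backup ran.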
After-Imaging (AI) is an OpenEdge continuous change logging system that stores database changes in specially formatted log files. It allows the DBA to restore a database from a backup then apply all changes from the end point of the backup to the point of the last AI archived change. AI is required to:
Hardware-based backups, including OS backups, disk mirroring, “snapshots,” and third-party tools, are not the best way to back up the database, as all of these require the database to be in a quiescent state. In order to get a snapshot of the database files, you have to shut down the DB server or use proquiet. Careful with proquiet: you need to wait until the “Quiet point has been enabled” message appears in the db.lg before proceeding.
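One way to avoid jumping the gun is to have your snapshot script wait for that message instead of assuming the quiet point is already in place. The Python sketch below is a minimal illustration, not a vetted tool: the function name `wait_for_quiet_point`, its parameters, and the log path in the usage comment are hypothetical, and it assumes the message text appears in the log exactly as quoted above. It only watches the log file; requesting and releasing the quiet point is left to your own script.

```python
import os
import time

# Message text as quoted in the post; assumed to appear verbatim in the .lg file.
QUIET_MESSAGE = b"Quiet point has been enabled"

def wait_for_quiet_point(log_path, start_offset=0, timeout=60.0, poll_interval=0.5):
    """Poll the database log until the quiet point message shows up.

    Only bytes written after start_offset are searched, so a message left
    over from an earlier quiet point is not mistaken for the new one.
    Returns True if the message was seen, False if the timeout expired.
    """
    deadline = time.monotonic() + timeout
    while True:
        with open(log_path, "rb") as log:
            log.seek(start_offset)
            if QUIET_MESSAGE in log.read():
                return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)

# Hypothetical usage:
#   offset = os.path.getsize("/db/prod/mydb.lg")   # remember where the log ends now
#   ... request the quiet point here ...
#   if not wait_for_quiet_point("/db/prod/mydb.lg", start_offset=offset):
#       raise SystemExit("quiet point not confirmed; do NOT take the snapshot")
```

Recording the log size before requesting the quiet point is the important design choice: it stops a stale message from a previous run giving you a false green light.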
Ahhh…Mike Tyson. You gotta love Mike Tyson. You, on the other hand, you are probably more like Marvis Frazier. I see the puzzled look on your face: no, not Smokin’ Joe Frazier, who in 1971 became the first person to beat Muhammad Ali. I’m talking about Marvis Frazier, his son. Look it up: KO’d in 30 seconds. Nice uppercut in the first round. When people talk to me about their high availability measures, I think about Marvis Frazier. Good-looking guy. He talked the talk and walked the walk, but when it came time to deliver, he failed. Twice. Most of your high availability planning is the same. It’s good enough to impress the 24-year-old junior auditor from Deloitte but will never deliver when you actually need it.
A couple of weeks ago we received a ProTop alert about a long-running transaction that was threatening to bring down our customer’s ERP system. It is not a pleasant SMS to get on a Saturday night, but I knew I needed to respond quickly to prevent a crash.
Before we start: if you’re a technical person reading this, please forward to your boss, your boss’ boss and your boss’ boss’ boss. Forward it all the way up the org chart. Print it out and tack it to the bulletin board in the cafeteria. Make sure no one in I.T. management or at the C-level can pretend they didn’t know.