Article Checklist Business Continuity

New DBA Series: DBA Checklists – Part One

Paul Koufalis

Database Administrator (DBA). Sounds important, doesn’t it? But what are you supposed to do every day?

People often ask us: “what tasks should an OpenEdge DBA be performing? When? What should my daily, weekly, quarterly, etc. checklist have on it?”

In this New DBA Series group of posts, we’ll provide a few recommendations. Use the comments below to let us know what else you check regularly.

Morning

Paul likes to call these tasks his “Morning Smile Routine”. One of our customers uses the term and it stuck. Start your work day with your favourite beverage and this list of tasks:

Databases

Verify successful backup
Verify after-imaging is enabled and properly switching extents; most extents should be empty and one busy
If using OE Replication, ensure the targets are synchronized and there is no more than one locked AI extent
If you are using after-imaging to maintain a warm spare verify that it is available and up to date
Check BI file size
Check for long open transactions
Review database log file for overnight error messages
If using Alternate Buffer Pool, ensure that there are free buffers in ABP and that lru2 is disabled
If you use fixed extents rather than “all variable”, check fixed extent free space
Verify monitors are running and alerts are flowing
Check monitored metrics for trends approaching actionable thresholds
Read Progress PANS alerts

Verify sufficient file system free space, especially check:
- DB filesystems (data, bi, and ai filesystems)
- AI archive filesystem
- backup filesystem
- temp files (the Progress -T parameter)
- application specific filesystems
Check for “runaway” processes
Check for overnight processes that may still be running (but should not be)
Review OS logs
Review OS free memory
Review summary of previous day’s CPU and disk utilization
Check OS configuration for unwelcome changes

During the Day

No one wants to spend their day staring at a monitoring screen and reading hundreds of alert emails, but you should check the following metrics at least 2-3 times throughout the day. Keep note of the values in a spreadsheet so that you and your colleagues can more easily identify what is normal and what is unusual. Or use ProTop to trend the data automatically.

Make sure that AI switching & warm spare applies are occurring regularly
Pay attention to the number of users/connections, know what is normal so that you recognize unusual increases or decreases
Look out for high disk IO rates, especially if they result in a low buffer hit ratio
Watch for unusually active tables or indexes, be extra vigilant after new code has been released
Keep an eye peeled for unusual log file messages and alerts
Notice the lock table high water mark and the normal number of active locks
Observe the time between checkpoints
Check for long open TRX & BI file growth – Find the oldest transaction (usually a code problem)
Investigate blocked users/connections. REC = record locking, coding issue. BK*, TX*, etc. indicate system resource constraints
Take note of excessively active connections or “rapid readers.” What are they doing? Is it legitimate? Is there a better way?
Use the “client statement cache” or “proGetStack” to identify specific code causing a problem – work with development to get it fixed
Identify latch waits or high numbers of latch requests on latches that should be low, e.g. OM, LRU, LRU2, EC, CDC
Scan for OS bottlenecks and constraints

Weekly

There is nothing worse than needing to scan a database log file, only to find that it is 42 GB in size. Or spending 20 minutes analyzing a DB Analysis before realizing it was generated in 2003.

Rotate/Truncate the .lg files; this can be automated in 11.7+
Clean up trash in database directories -protrace, leftover scratch files, core files, etc. Don’t forget the -T directories!
Refresh dbanalys – don’t forget the DEV/TEST/QA/Training databases.
After refreshing dbanalys – review index utilization, identify idxcompact targets. Check rows per block settings. Check for very long RM Chains; fragmentation; scatter.
Schedule appropriate remediation activities

Monthly

Once a month, take an hour to summarize and document what happened. Did you have any down time? Was it scheduled or unscheduled? Is your infrastructure able to handle the load? Were there any major changes (more users, new application code/processes/database features/settings/OpenEdge release…) and how did they affect the systems?

Outage summary
- Planned and unplanned
Capacity Planning Reports:
- Basic CRUD & transaction trends
- Overall database growth
- User/connection trends
- I/O response
- Project disk space needs
- Project disk throughput needs
- Project memory & CPU utilization

Quarterly

If you’re actively monitoring your OpenEdge environment, you will often come across tuning and tweaking opportunities. But at the very least, schedule some time once per quarter to review your OpenEdge environment and adjust as necessary.

Review all startup parameters (brokers and clients) and config options
- Did you recently move to a new OpenEdge release that affords new configuration opportunities?
Review your storage area configuration, has the usage of any of the tables changed in ways that might mean that table should be in a different area?
If you are allowing SQL-92 connections – run dbtool to adjust SQL width; run “update statistics” for the optimizer
- These tasks can be automated in recent OpenEdge releases
If you are using SSL, etc. – review certificate validity & expiration. Don’t find out about expired certificates the hard way
Review OS configuration, kernel parameters, etc. – test IO throughput: random reads & synchronous writes
Review OpenEdge release level, is it time to apply patches or start planning an upgrade?
Review monitored metrics and alerts

Annually

Ah yes…the annual checklist…that few actually do. When was the last time you actually executed your DR plan or reviewed your licensing?

Disaster recovery test
License review & “true-up”
Review hardware landscape and potential upgrades
Review business growth plans & projections
Review the current actual RPO and RTO with business leads and confirm they still meet the business’ needs
Make your plans to attend PUG Challenge, Progress Next and other conferences!

What are you doing?

Are you a good DBA? Are you checking some or all of these metrics? Did we forget anything? Let us know in the comments below!

But wait…there’s more!

Stay tuned for part two, where we’ll talk about Special Events like updates, upgrades and migrations. We have our checklists for those too!

Database Administrator (DBA). Sounds important, doesn’t it? But what are you supposed to do every day?

Morning

During the Day

Weekly

Monthly

Quarterly

Annually

What are you doing?

But wait…there’s more!

Related Articles

New DBA Series: More Ways to CRASH Your Database

New DBA Series: Frequently Useful Numbers

“One Size Fits ALL” may be acceptable when talking about thong sandals or knit caps (it’s not entirely true even then). However, it is an obvious mistake when talking...