Secrets of Database Startup Parameters – Part One of Two
Many are the secret societies in the human race: the Knights Templar, The Masons, The Department of Motor Vehicles…all carry a rich history of secrets and rituals both mysterious and inscrutable. One of the most secretive of such groups is that of the OpenEdge Database Administrators. Follow us, and you too may join their ranks.
So many parameters – what do they mean? Which should I use? Which shouldn’t I use? What are the secret magic values?
Database Buffers (-B)
The DBA equivalent of buying flowers for your wife – more is better…to a point. You want it to be big enough to hold the “working set” of data: the orders, invoices, pick-lists, etc. that your business needs right now to function efficiently.
If you have bad code [HA! – IF!] , a big -B can mitigate some of the effect by transferring the load from disk resources to CPU and memory. But too big a -B might take memory away from client processes that could use it more effectively.
A high-level indicator or performance, Buffer Hit %, is often quoted but can be misleading: for example, 95% is 5X worse than 99%! Think about it: with 95% buffer hit ratio, 1 in 20 reads is going to disk. With 99% buffer hit ratio, only 1 in 100 reads is going to disk!
Recommendation – keep buffer hit ratio above 99%. Tuning requires LARGE increases to -B for small increases in buffer hit ratio. So fix your code, and try out the BigB Guesstimator in ProTop.
Spin and LRU Skips
These are no-brainer parameters that typically don’t require much tuning. Use them, but be mindful of all kinds of bizarre recommendations out on the Internet.
Set -lruskips to 100. That value will work for 99.9% of OpenEdge databases and is the default value in OpenEdge 12+.
For -spin, any value from 5,000 to 20,000 is going to be more than good enough for 99.9% of databases. Monitor latch timeouts and CPU usage before adjusting. Increasing -spin may increase CPU usage (that’s kinda the whole point!) so don’t increase it unless you have CPU cycles to spare.
About -spin
A few fun tidbits about -spin:
- At a very high level, when a process needs a shared memory resource but cannot access it because it is in use, the spin parameter tells that process to keep trying up to <spin value> number of times, eating up its CPU quantum. The idea is that “latches” on shared memory structures are held for very brief periods of time, and the process may succeed in obtaining the latch before its CPU quantum is exhausted. This de facto consumes more CPU, as the process would have put itself to sleep (nap) otherwise, hoping to get the latch the next time it was given CPU time.
- The default calculation by the database broker is 6000 x #CPUs reported by the operating system, which was probably fine when there were 2 or 4 cores/cpus per server, but now, with multi-core machines with 16 and 32 cores, the auto-calculated number is probably too high.
- Dan Foreman likes to use pi * <DBA date of birth>, which, if you think about it, gives you a value in that 6000ish range. Not wrong. Lately, with faster and more cores, we’ve been trending higher.
- This KB entry says “start at 50,000”.
- This KB entry says “5000 x # CPU”.
- This KB entry says “Set the -spin startup parameter to 4000 * number of processors. With the latest processors, use 10,000 * number of processors”.
- You may see [wrong] recommendations like [some number of thousands] x #CPUs. Ignore them please.
- There are edge cases where very high values of -spin are recommended. Ask us if you think this might apply to you.
Lock Table (-L)
The total number of locks in the database across all users. While the memory cost of a big -L is negligible (around 72 bytes per lock table entry), I find it questionable when I come across a database that needs a -L of 1,000,000 or more.
Nonetheless, the number of lock table entries are almost completely application-dependent. Monitor the lock table high watermark (HWM) and increase -L as the high watermark approaches it.
A couple of gotchas to watch out for:
- Sudden increases after code promotion
- Continuously hitting -L: every time you fill the lock table, the database broker steals around 72 bytes from -Mxs and increases the value of -L by 1. When there is no more memory available in -Mxs, the database crashes!!
Connections
-n limits the number of connections to the database, it is NOT the number of users. -Mi,-Ma,-Mn & -Mpb are network broker connection parameters. And -m3 is used to start a secondary login broker, often used to segregate SQL vs 4GL connections.
Recommended settings:
-n: not directly correlated with licensing count, so give yourself a good buffer.
-Mi 1 & –Ma 5 is a good start.
-Mpb = maximum servers per broker (4GL or SQL). Set to <maximum users you need to connect> divided by <-Ma parameter>. If you you need to support 100 client-server 4GL users and -Ma is 5, then -Mpb should be 20.
-Mn = Sum of all –Mpb values, plus the number of secondary login brokers, plus a generous few extra (I usually add 10). This gives you some latitude to spin up a new broker if you screw up.
-ServerType: segregate 4GL and SQL connections on separate brokers.
Replication Buffers (-pica)
If you are using OpenEdge Replication, the pica parameter is ultra important – if you fill pica, you throttle all DB updates. So be sure to use the max value for your version for versions where the max value is less than 1 million. For 10.2B08/11.2+, start with 32K:
Versions 9.1E, 10.0B, 10.1A = 256
Version 9.1e04 = 1024
Version 10.0b05 = 512
Versions 10.1A02, 10.1B01= 8192
Versions 10.2B08, 11.2+ = 1M (start with 32K)
Database Structure
OK, so not technically a start-up parameter, but the structure has a huge impact on performance. Set DB block size to match your file system block size, usually 4 or 8 Kb. Well…ok…lately we’ve been using 8KB across the board, though we haven’t benchmarked the difference. AI/BI block size should be 16Kb, BI Cluster Size = 8 Mg and up depending on database load. Make sure you are using Type II storage areas. 99% of databases are better off using variable-length extents for db/bi and ai. Yes, there is a “cost” to variable-length extents, a few milliseconds a day. In most cases, the management benefit outweighs the negligible cost.
Helper Processes
After-Image Writer (AIW), Before-Image Writer (BIW), Asynchronous Page Writers (APW), and Watch Dog process (WDOG). For some reason, these are started manually and separately from the DB broker process! One of each please. If you think you need 2 APWs or more, hmmm…maybe…but highly unlikely. Ask us.
Freebies
After-Imaging (AI) and Before-Imaging (BI) Buffers
The default value used to be something silly, like 5. Watch for “Empty Buffer Waits.” 50 is probably good enough for most of you; 100 almost definitely is. Cost is negligible: 100 X 16 Kb X 2 (AI and BI) = 3.2 Mg of memory.
AI and BI Stall and BI Threshold. Why crash when you can Freeze instead?
-aistall: quiet the database when you run out of AI files.
-bistall: quiet the database when you hit the BI threshold size.
-bithold: Maximum size (in Mg) of BI file before emergency shutdown or stall. Set –bithold to about double your normal BI high water mark – if your variable-length BI file is normally 2 Gb, use –bithold 5000. Watch out for month-end/year-end type processing.
Two caveats with aistall/bistall/bithold:
- You must have a written procedure explaining the generated alert or symptom, how to confirm, and how to resolve. We have seen countless sites activate these parameters then have no idea what to do when they are triggered, so they end up shutting down the database anyways.
- You ABSOLUTELY need monitoring to alert you before a stall occurs. Hint: ProTop Monitoring and Advanced Alerting takes care of that for you.
PIN Shared Memory (-pinshm)
Another no-brainer: why would you ever want to swap out DB shared memory? Not as important as it was in the past as most servers have lots of memory. This startup parameter is ignored on Windows and AIX.
Storage Object Cache Size (-omsize)
Another no-brainer, -omsize eliminates object cache I/O and latch usage. Recommendation – calculate it as:
select count(*) from _storageobject and round up a bit.
AI File Management (-aiarcinterval –aiarcdir)
You all have After Imaging enabled, right? Ancient versions of Progress required scripting – AIMGT does all that automatically. BUT BUT BUT…do not archive to an NFS or network drive!
rfutil –C aiarchiver enable
-aiarcdir: local filesystem #1, local filesystem #2 (NO NFS mounts).
-aiarcinterval: business dependent. 15 minutes is good, 2-4hours if using OE Replication.
Stay Tuned for Part Deux
Wait! There’s more! If you order now, we’ll throw in, free of charge…sorry…been staying up late watching too much TV. But there are a lot more “no-brainer” startup parameters. We’ll cover them in part two, coming soon.
Disagree with me? Questions? Comments? Let us know!