I was just informed that there are still some free seats for the 2-day seminar I will present in Berlin in two weeks (June 7-8). Hence, do not wait too long if you want to join us…
The content is based on chapters 4, 5, 6 and 7 of my book, i.e. part 3: Query Optimizer. The essential difference is that the content has been updated to also cover version 11.2.
The event is organized by DOAG. You can read the full description of the seminar (incl. agenda) here. Just be aware that the spoken language will be German (the slides, though, will be in English).
Hi Chris,
first of all, let me congratulate you on your Berlin workshop regarding the query optimizer – it was a really informative, nice and interesting event!
Now, as you already said, questions mostly appear after the fact… and I have one regarding workload statistics.
In fact, I had already collected workload statistics on several production systems two months ago, but there was never any time to analyze the results…
On all systems, statistics gathering started at 8:00 in the morning and ran for 10 hours, day after day, for one month, so the sample size was around 30. Now, with the exception of cpuspeed, which was remarkably stable, the standard deviation turned out to be enormous – especially for mbrc, mreadtim and sreadtim.
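For the record, each snapshot was started with a call along these lines – 'INTERVAL' mode with a 600-minute window to match the 10 hours:

exec DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'INTERVAL', interval => 600)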
For example: On one DSS system, with a mean mreadtim of 9.12, the standard deviation was 3.74, leading to a conclusion like “with a probability of 90%, mreadtim lies between 2.97 and 15.27”… Even more impressive, on one OLTP system, the mean mbrc was 50, with a stddev of 32, giving a confidence interval, at 90% probability, of minus 3 to plus 103 for mbrc…
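To show where these intervals come from: assuming a normal distribution, the 90% interval is mean ± 1.645 · stddev, so:

mreadtim: 9.12 ± 1.645 · 3.74 = 9.12 ± 6.15 → 2.97 to 15.27
mbrc: 50 ± 1.645 · 32 = 50 ± 52.6 → roughly minus 3 to plus 103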
Now I wonder how to explain this extreme variation, given that the values are already averaged over a 10-hour period… the only source of variation would be the day of the week, and AFAIK there should not be such great variation in workload here, at least as regards the OLTP applications.
Could it be that sampling over a 10-hour period is not a good thing to do, and if so, for what reason (I just remember you having an example involving more fine-grained sampling periods, which were aggregated somehow afterwards …)?
Also, I can’t help wondering whether this great standard deviation invalidates the result, somehow…?
Anyway, thanks for any advice on this!
Ciao
Sigrid
Hi Sigrid
> first of all, let me congratulate you on your Berlin workshop regarding the
> query optimizer – it was a really informative, nice and interesting event!
Thank you!
> For example: On one DSS system, with a mean mreadtim of 9.12, the standard deviation was 3.74,
> leading to a conclusion like “with a probability of 90%, mreadtim lies between 2.97 and 15.27”…
> Even more impressive, on one OLTP system, the mean mbrc was 50, with a stddev of 32, giving a
> confidence interval, at 90% probability, of minus 3 to plus 103 for mbrc…
What about the standard deviation of sreadtim?
> Now I wonder how to explain this extreme variation, given that the values are already averaged
> over a 10-hour period… the only source of variation would be the day of the week, and AFAIK
> there should not be such great variation in workload here, at least as regards the OLTP applications.
IMO the way multiblock reads work makes such variations more or less normal. In fact, the size of a multiblock read strongly depends on which data is already stored in the buffer cache: blocks that are found in the cache are not read from disk again, so a multiblock read gets truncated or split around them, and the actual number of blocks read per I/O varies accordingly.
> Could it be that sampling over a 10-hour period is not a good thing to do, and if so,
> for what reason (I just remember you having an example involving more fine-grained
> sampling periods, which were aggregated somehow afterwards …)?
I have always worked with shorter periods. With them I had the opportunity to choose between the average, the maximum, or whatever value I wanted. With a single long period you can only take the average.
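A minimal sketch of how I did it (the statistics-table name, the statid and the final values are just examples):

-- one user statistics table to hold all the snapshots
exec DBMS_STATS.CREATE_STAT_TABLE(ownname => user, stattab => 'MY_SYSTEM_STATS')

-- gather one 60-minute snapshot per interval, each one under its own statid
exec DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'INTERVAL', interval => 60, stattab => 'MY_SYSTEM_STATS', statid => 'INT_0800')

-- afterwards, set the values you chose (average, max, whatever) manually
exec DBMS_STATS.SET_SYSTEM_STATS(pname => 'MREADTIM', pvalue => 9.1)
exec DBMS_STATS.SET_SYSTEM_STATS(pname => 'SREADTIM', pvalue => 3.6)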
> Also, I can’t help wondering whether this great standard deviation invalidates the result, somehow…?
IMO no.
Cheers,
Chris
Hi Chris,
thanks for your answer!
> What about the standard deviation of sreadtim?
The standard deviation of sreadtim is not as big as for mreadtim, but it is rather big too, I would say.
On the four systems:
– system 1: mean = 3.64, stddev = 0.89
– system 2: mean = 3.03, stddev = 1.32
– system 3: mean = 1.52, stddev = 0.50
– system 4: mean = 3.85, stddev = 0.63
Over all 4 systems, I end up with a stddev-to-mean ratio of 0.36 for mreadtim and 0.29 for sreadtim.
> IMO the way multiblock reads work makes such variations more or less normal. In fact, the size of a multiblock read strongly depends on which data is already stored in the buffer cache.
I understand … but then, the variation in sreadtim would still be astonishing, because those factors don’t apply here …?
> I have always worked with shorter periods. With them I had the opportunity to choose between the
> average, the maximum, or whatever value I wanted. With a single long period you can only take the average.
I see. Also, it might be interesting to see what happens to the variation with shorter sample intervals focused on clearly characterizable workloads (report generation, OLTP “rush hour”, batch jobs…). I will try taking new samples over shorter periods and compare the “quality” of the results (if one can say so).
Thanks and ciao,
Sigrid
Hi Sigrid
> the variation in sreadtim would still be astonishing, because those factors don’t apply here …?
For sreadtim there are two main factors:
– actual load of the I/O subsystem
– whether the blocks are cached or not at the I/O subsystem level
Typically, a look at V$EVENT_HISTOGRAM also shows a variation of 1 or 2 orders of magnitude.
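For example, a query like the following shows how the single-block and multiblock read times are distributed:

SELECT event, wait_time_milli, wait_count
FROM v$event_histogram
WHERE event IN ('db file sequential read', 'db file scattered read')
ORDER BY event, wait_time_milli;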
HTH
Chris