|
Author : James F. Koopmann More at : www.jameskoopmann.com
|
Do you know how your disk subsystem actually is performing? This article will take a look at extracting some I/O statistics so that you can monitor and determine just how well your disks are doint.
How can I separate Oracle I/O to maximize performance?
Should I separate data files from index files?
Sshould I separate redo logs”
These question(s), AND many more, seem to flood our minds as database administrators. They are easy to answer with generalities but in practice can be very difficult to come to a conclusion on unless we take a look at how our disk subsystem is actually performing.
Many of us might stop reading this article right now as we will say to ourselves: those questions and level of detail is only available to our system administrators or those that control the disk farm. All too often I have seen basically two different methodologies when configuring Oracle on storage. The first is to use Oracle’s Flexible Architecture (OFA) approach where architects will separate Oracle object types (data, index, redo, archive, et.) across a storage array. The second approach is to architect a JBOD (Just a Bunch Of Disks) configuration and throw everything on it. Both of these approaches lack the planning and configuration that ultimately produces a well tuned database system. They are just taking a shot in the dark, hoping everything is going to work well because they followed a predefined methodology. Well, methodologies may not work in your case. Don’t look to your system administrators as they might not even know how to extract information themselves. And when they do it is usually at a higher level since they too can not relate the information to the Oracle stack.
So the DBA must begin to understand the application from a purely I/O perspective. Then relay that information to the Storage Administrator and then, together, develop a plan for configuring or altering a storage subsystem that will be able to service the application mix. For the storage, a key performance indicator of an OLTP environment is based on I/Os per second (IOPS) and latencies (I/O turn-around time). OLAP databases are your data warehouses or reporting systems and are categorized by moving large amounts of data that is mostly read only. For the storage array the performance of an OLAP environment is based on Mega-bytes per second (MBPS). A database workload is often descriptive of its’ application mix. Understanding and translating an application mix into a database workload is critical for optimizing a storage system. The workload of an OLTP database is categorized by small random I/O while OLAP is categorized by large sequential or random I/O.
For the database administrator it is now time to get dirty and take a look at the internals of your database. Somewhere and somehow you must extract some form of statistics that allow you to categorize the type of SQL and I/O requests at the database level. Oracle for instance has quite a few internal tables that allows for the interrogation of this information. For instance we can query the gv$sysstat view for (‘physical read total IO requests’ – ‘physical read total multi block requests’) to get the number of small reads in the system. Do this over a period of time, subtract the beginning value from the ending value and you quickly get IOPS for small reads over that period of time. This MUST be done for each statistic available to get a view of total IOPS and MBPS being requested by your particular database. These are the pertinent statistics you will need to extract. Basically large reads and writes are used to calculate MBPS and small reads and writes are used for IOPS calculations.
gv$sysstat (name, value)
Total Reads :'physical read total IO requests'
Total Writes:'physical write total IO requests'
Large Reads :'physical read total multi block requests'
Large Writes:'physical write total multi block requests'
Total Bytes Read :'physical read total bytes'
Total Bytes Written :'physical write total bytes'
To calculate small reads:
Small Reads = Total Reads – Large Reads
Small Writes = Total Writes – Large Writes
Coming up with these numbers allows us to make intelligent decisions in regards to our current database performance and storage requirements. Granted the database may be experiencing contention and actual I/O requests may be lower then optimal so tuning may be in order. Regardless it is the DBA’s responsibility to take these IOPS and MBPS to the Storage Administrator, evaluate if the storage solution is being taxed, and if a reconfiguration is required. These reconfigurations may require, moving data files around to use more disks, add more disk for higher throughput, or reduce IOPS and MBPS through the application. Well we have come full circle to the application again. I know you know what to do.
Below are two scripts that will get you started. I have spent quite a bit of time getting them easy to use and giving you the information quickly to understand how your disk subsystem is performing over time.
Calculate the IOPS for small and large I/O, the percentage of reads to writes, and the MBPS throughput over a period of time as seen by the database. This is important for matching up expected I/O rates on the disk subsystem and what is actually seen in the database. Execute this SQL script at a point in time, it will ask you for an amount of time to sleep before taking the next sample and then display IOPS and MBPS. Very cool!set echo offset feedback offset heading offset linesize 40set pagesize 55set verify off set termout offcolumn rpt new_value rptselect instance_name||'_'||to_char(sysdate,'YYYYMMDDHH24MISS')||'_vsysstat_ioworkload.LST' rpt from v$instance;set termout onpromptpromptprompt ^^^^^^^^^^^^^prompt Report Name : ../LST/&&rptprompt ^^^^^^^^^^^^^spool ../LST/&&rpt column sr1 new_value sr1column sw1 new_value sw1column lr1 new_value lr1column lw1 new_value lw1column tbr1 new_value tbr1column tbw1 new_value tbw1set termout offSELECT sum(decode(name,'physical read total IO requests',value,0)- decode(name,'physical read total multi block requests',value,0)) sr1,sum(decode(name,'physical write total IO requests',value,0)- decode(name,'physical write total multi block requests',value,0)) sw1,sum(decode(name,'physical read total multi block requests',value,0)) lr1,sum(decode(name,'physical write total multi block requests',value,0)) lw1,sum(decode(name,'physical read total bytes',value,0)) tbr1,sum(decode(name,'physical write total bytes',value,0)) tbw1FROM v$sysstat;set termout on promptpromptprompt ^^^^^^^^^^^^prompt First Sampleprompt ^^^^^^^^^^^^prompt Number of Small Reads : &&sr1prompt Number of Small Writes: &&sw1prompt Number of Large Reads : &&lr1prompt Number of Large Writes: &&lw1prompt Total Bytes Read : &&tbr1prompt Total Bytes Written : &&tbw1promptpromptprompt Enter the amount of time (in seconds) you would like this process to sleep for sampling dataprompt ^^^^^^^^^^^^^^^^^^prompt Sleep Time (secs): &&sleeptimeprompt ^^^^^^^^^^^^^^^^^^exec DBMS_LOCK.SLEEP (&&sleeptime); column sr2 new_value sr2column sw2 new_value sw2column lr2 new_value lr2column lw2 new_value lw2column tbr2 new_value tbr2column tbw2 new_value tbw2set termout offSELECT sum(decode(name,'physical read total IO requests',value,0)- decode(name,'physical read total multi block requests',value,0)) sr2,sum(decode(name,'physical write total IO requests',value,0)- decode(name,'physical write total multi block requests',value,0)) sw2,sum(decode(name,'physical read total multi block requests',value,0)) lr2,sum(decode(name,'physical write total multi block requests',value,0)) lw2,sum(decode(name,'physical read total bytes',value,0)) tbr2,sum(decode(name,'physical write total bytes',value,0)) tbw2FROM v$sysstat;set termout on promptpromptprompt ^^^^^^^^^^^^^prompt Second Sampleprompt ^^^^^^^^^^^^^prompt Number of Small Reads : &&sr2prompt Number of Small Writes: &&sw2prompt Number of Large Reads : &&lr2prompt Number of Large Writes: &&lw2prompt Total Bytes Read : &&tbr2prompt Total Bytes Written : &&tbw2promptpromptprompt ^^^^^^^^^prompt Results :prompt ^^^^^^^^^ column sri new_value sricolumn swi new_value swicolumn tsi new_value tsicolumn srp new_value srpcolumn swp new_value swpcolumn lri new_value lricolumn lwi new_value lwicolumn tli new_value tlicolumn lrp new_value lrpcolumn lwp new_value lwpcolumn tr new_value tr column tw new_value tw column tm new_value tm SELECTROUND((&&sr2-&&sr1)/&&sleeptime,3) sri,ROUND((&&sw2-&&sw1)/&&sleeptime,3) swi,ROUND(((&&sr2-&&sr1)+(&&sw2-&&sw1))/&&sleeptime,3) tsi,ROUND((&&sr2-&&sr1)/DECODE(((&&sr2-&&sr1)+(&&sw2-&&sw1)),0,1,((&&sr2-&&sr1)+(&&sw2-&&sw1)))*100,3) srp,ROUND((&&sw2-&&sw1)/DECODE(((&&sr2-&&sr1)+(&&sw2-&&sw1)),0,1,((&&sr2-&&sr1)+(&&sw2-&&sw1)))*100,3) swp,ROUND((&&lr2-&&lr1)/&&sleeptime,3) lri,ROUND((&&lw2-&&lw1)/&&sleeptime,3) lwi,ROUND(((&&lr2-&&lr1)+(&&lw2-&&lw1))/&&sleeptime,3) tli,ROUND((&&lr2-&&lr1)/DECODE(((&&lr2-&&lr1)+(&&lw2-&&lw1)),0,1,((&&lr2-&&lr1)+(&&lw2-&&lw1)))*100,3) lrp,ROUND((&&lw2-&&lw1)/DECODE(((&&lr2-&&lr1)+(&&lw2-&&lw1)),0,1,((&&lr2-&&lr1)+(&&lw2-&&lw1)))*100,3) lwp,ROUND(((&&tbr2-&&tbr1)/&&sleeptime)/1048576,3) tr,ROUND(((&&tbw2-&&tbw1)/&&sleeptime)/1048576,3) tw,ROUND((((&&tbr2-&&tbr1)+(&&tbw2-&&tbw1))/&&sleeptime)/1048576,3) tmFROM dual; SELECT'Small Read IOPS = '||ROUND((&&sr2-&&sr1)/&&sleeptime,3)||' IOPS','Small Write IOPS = '||ROUND((&&sw2-&&sw1)/&&sleeptime,3)||' IOPS','Total Small IOPS = '||ROUND(((&&sr2-&&sr1)+(&&sw2-&&sw1))/&&sleeptime,3)||' IOPS','Small Read I/O % = '||ROUND((&&sr2-&&sr1)/DECODE(((&&sr2-&&sr1)+(&&sw2-&&sw1)),0,1,((&&sr2-&&sr1)+(&&sw2-&&sw1)))*100,3)||' %','Small Write I/O % = '||ROUND((&&sw2-&&sw1)/DECODE(((&&sr2-&&sr1)+(&&sw2-&&sw1)),0,1,((&&sr2-&&sr1)+(&&sw2-&&sw1)))*100,3)||' %','Large Read IOPS = '||ROUND((&&lr2-&&lr1)/&&sleeptime,3)||' IOPS','Large Write IOPS = '||ROUND((&&lw2-&&lw1)/&&sleeptime,3)||' IOPS','Total Large IOPS = '||ROUND(((&&lr2-&&lr1)+(&&lw2-&&lw1))/&&sleeptime,3)||' IOPS','Large Read I/O % = '||ROUND((&&lr2-&&lr1)/DECODE(((&&lr2-&&lr1)+(&&lw2-&&lw1)),0,1,((&&lr2-&&lr1)+(&&lw2-&&lw1)))*100,3)||' %','Large Write I/O % = '||ROUND((&&lw2-&&lw1)/DECODE(((&&lr2-&&lr1)+(&&lw2-&&lw1)),0,1,((&&lr2-&&lr1)+(&&lw2-&&lw1)))*100,3)||' %','Total Read = '||ROUND(((&&tbr2-&&tbr1)/&&sleeptime)/1048576,3)||' MBPS','Total Written = '||ROUND(((&&tbw2-&&tbw1)/&&sleeptime)/1048576,3)||' MBPS','Total MBPS = '||ROUND((((&&tbr2-&&tbr1)+(&&tbw2-&&tbw1))/&&sleeptime)/1048576,3)||' MBPS'FROM dual; prompt Small Read IOPS = &&sri IOPSprompt Small Write IOPS = &&swi IOPSprompt Total Small IOPS = &&tsi IOPSprompt Small Read I/O % = &&srp %prompt Small Write I/O % = &&swp %prompt Large Read IOPS = &&lri IOPSprompt Large Write IOPS = &&lwi IOPSprompt Total Large IOPS = &&tli IOPSprompt Large Read I/O % = &&lrp %prompt Large Write I/O % = &&lwp %prompt Total Read = &&tr MBPSprompt Total Written = &&tw MBPSprompt Total MBPS = &&tm MBPS spool offundefine sleeptimeGet complete history of IOPS & MBPS from workload repository history and graph it for you management. This allows you to see total database disk activity. Compare this against what your disk capacity is. Just remember these numbers are for ALL disks. You can get average IOPS/MBPS by dividing by your total number of disks used in servicing database requests. This is great information and once you graph the results you will really se how your I/O, and application performance, might be suffering during the day or at least determine where peak periods are.
set echo off
set feedback off
set linesize 300
set pagesize 55
set verify off
set termout off
column rpt new_value rpt
select instance_name||'_wrh_sysstat_ioworkload_'||'.LST' rpt from v$instance;
set termout on
prompt
prompt
prompt ^^^^^^^^^^^^^
prompt Report Name : ../LST/&&rpt
prompt ^^^^^^^^^^^^^
spool ../LST/&&rpt
column sri head "Small|Read|IOPS"
column swi head "Small|Write|IOPS"
column tsi head "Total|Small|IOPS"
column srp head "Small|Read|I/O%"
column swp head "Small|Write|I/O%"
column lri head "Large|Read|IOPS"
column lwi head "Large|Write|IOPS"
column tli head "Total|Large|IOPS"
column lrp head "Large|Read|I/O%"
column lwp head "Large|Write|I/O%"
column tr head "Total|Read|MBPS"
column tw head "Total|Written|MBPS"
column tm head "Total|MBPS"
column begin_time for a25
column end_time for a25
SELECT end_time,
ROUND(sr/inttime,3) sri,
ROUND(sw/inttime,3) swi,
ROUND((sr+sw)/inttime,3) tsi,
ROUND(sr/DECODE((sr+sw),0,1,(sr+sw))*100,3) srp,
ROUND(sw/DECODE((sr+sw),0,1,(sr+sw))*100,3) swp,
ROUND(lr/inttime,3) lri,
ROUND(lw/inttime,3) lwi,
ROUND((lr+lw)/inttime,3) tli,
ROUND(lr/DECODE((lr+lw),0,1,(lr+lw))*100,3) lrp,
ROUND(lw/DECODE((lr+lw),0,1,(lr+lw))*100,3) lwp,
ROUND((tbr/inttime)/1048576,3) tr,
ROUND((tbw/inttime)/1048576,3) tw,
ROUND(((tbr+tbw)/inttime)/1048576,3) tm
FROM (
SELECT beg.snap_id beg_id, end.snap_id end_id,
beg.begin_interval_time, beg.end_interval_time,
end.begin_interval_time begin_time, end.end_interval_time end_time,
(extract(day from (end.end_interval_time - end.begin_interval_time))*86400)+
(extract(hour from (end.end_interval_time - end.begin_interval_time))*3600)+
(extract(minute from (end.end_interval_time - end.begin_interval_time))*60)+
(extract(second from (end.end_interval_time - end.begin_interval_time))*01) inttime,
decode(end.startup_time,end.begin_interval_time,end.sr,(end.sr-beg.sr)) sr,
decode(end.startup_time,end.begin_interval_time,end.sw,(end.sw-beg.sw)) sw,
decode(end.startup_time,end.begin_interval_time,end.lr,(end.lr-beg.lr)) lr,
decode(end.startup_time,end.begin_interval_time,end.lw,(end.lw-beg.lw)) lw,
decode(end.startup_time,end.begin_interval_time,end.tbr,(end.tbr-beg.tbr)) tbr,
decode(end.startup_time,end.begin_interval_time,end.tbw,(end.tbw-beg.tbw)) tbw
FROM
(SELECT dba_hist_snapshot.snap_id, startup_time, begin_interval_time, end_interval_time,
sum(decode(stat_name,'physical read total IO requests',value,0)- decode(stat_name,'physical read total multi block requests',value,0)) sr,
sum(decode(stat_name,'physical write total IO requests',value,0)- decode(stat_name,'physical write total multi block requests',value,0)) sw,
sum(decode(stat_name,'physical read total multi block requests',value,0)) lr,
sum(decode(stat_name,'physical write total multi block requests',value,0)) lw,
sum(decode(stat_name,'physical read total bytes',value,0)) tbr,
sum(decode(stat_name,'physical write total bytes',value,0)) tbw
FROM wrh$_sysstat, wrh$_stat_name, dba_hist_snapshot
WHERE wrh$_sysstat.stat_id = wrh$_stat_name.stat_id
AND wrh$_sysstat.snap_id = dba_hist_snapshot.snap_id
group by dba_hist_snapshot.snap_id, startup_time, begin_interval_time, end_interval_time) beg,
(SELECT dba_hist_snapshot.snap_id, startup_time, begin_interval_time, end_interval_time,
sum(decode(stat_name,'physical read total IO requests',value,0)- decode(stat_name,'physical read total multi block requests',value,0)) sr,
sum(decode(stat_name,'physical write total IO requests',value,0)- decode(stat_name,'physical write total multi block requests',value,0)) sw,
sum(decode(stat_name,'physical read total multi block requests',value,0)) lr,
sum(decode(stat_name,'physical write total multi block requests',value,0)) lw,
sum(decode(stat_name,'physical read total bytes',value,0)) tbr,
sum(decode(stat_name,'physical write total bytes',value,0)) tbw
FROM wrh$_sysstat, wrh$_stat_name, dba_hist_snapshot
WHERE wrh$_sysstat.stat_id = wrh$_stat_name.stat_id
AND wrh$_sysstat.snap_id = dba_hist_snapshot.snap_id
group by dba_hist_snapshot.snap_id, startup_time, begin_interval_time, end_interval_time) end
WHERE beg.snap_id + 1 = end.snap_id
)
order by 1
/
spool off
Understanding an application from a purely I/O perspective is a key aspect of configuring storage. Oracle has a variety of I/O types that ultimately need to be mapped, sampled, and related to storage. In Oracle’s case, there is I/O generated by server processes on behalf of users, multiple database writers, checkpoint activity, logging facilities that not only write as updates are being done but also the reading from online logs and writing to archive logs by the archive process. Plus a few more and some internals that determine size and frequency of the I/Os. Oracle is a very complex system of processes that without understanding your disk I/O patterns it is nearly impossible to configure properly.