We should size column definitions correctly. In various implementations I have seen, defining the right size and the right data type has taken a back seat. The wrong size impacts data integrity, memory usage, network traffic, and runtime performance.
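On the data-integrity point, an over-wide column will happily accept values that were never meant to fit. A minimal sketch with a hypothetical country-code column (the table names and the value are made up purely for illustration):

create table t_loose  ( country_code varchar2(4000) );
create table t_strict ( country_code varchar2(2) );

insert into t_loose  values ('UNITED STATES');  -- accepted; bad data slips in unnoticed
insert into t_strict values ('UNITED STATES');  -- rejected with ORA-12899: value too large for column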
Let us take a look at runtime performance. I am creating two tables from the same source; the only difference is the declared size of the second column.
SQL> desc t_normal
Name Type Nullable Default Comments
----------- ------------ -------- ------- --------
OBJECT_ID NUMBER
OBJECT_NAME VARCHAR2(30)
SQL> desc t_4000
Name Type Nullable Default Comments
----------- -------------- -------- ------- --------
OBJECT_ID NUMBER
OBJECT_NAME VARCHAR2(4000)
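The exact DDL is not shown above, but given the describes, the tables were presumably created along these lines (a sketch only):

create table t_normal ( object_id number, object_name varchar2(30) );
create table t_4000   ( object_id number, object_name varchar2(4000) );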
SQL> truncate table t_normal;
Table truncated
SQL> truncate table t_4000;
Table truncated
SQL> insert /*+ APPEND */ into t_normal select object_id , object_name from all_objects;
629424 rows inserted
SQL> insert /*+ APPEND */ into t_4000 select object_id , object_name from all_objects;
629424 rows inserted
SQL> commit;
Commit complete
Let us set the arraysize to 100 and time it.
SQL> set arraysize 100
SQL> set timing on
SQL> exec dbms_stats.gather_table_stats(user , 'T_NORMAL');
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.09
SQL> exec dbms_stats.gather_table_stats(user , 'T_4000');
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.10
SQL> select object_id , object_name from t_normal;
......
......
.....
629424 rows selected.
Elapsed: 00:02:00.78
SQL> select object_id , object_name from t_4000;
.....
.....
....
629424 rows selected.
Elapsed: 00:15:03.40
Look at the run times: there is a huge difference between the two queries even though the data is identical. The gap comes largely from the declared width, which the client uses to size its array-fetch buffers; with an arraysize of 100, each fetch has to budget for roughly 100 x 4000 bytes from T_4000 versus 100 x 30 bytes from T_NORMAL.
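If you suspect a column is over-declared, compare the declared width with what is actually stored. A quick check against the demo table:

select max(length(object_name)) actual_max_len from t_4000;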
Thanks Zahir! A good POC showing clearly how important column sizing is. Sometimes we have a threshold like this mapped to the front end, i.e. there are validations based on the column definition. Is there a way to achieve a lazy column definition, meaning size it based on the longest value entered?
Ideally, you should rightly size it. If you can't help it, you can achieve it with the following SQL; however, it adds additional cost.
with v_t as (
  select max(length(object_name)) bsize
  from   t_4000
)
select object_id, substr(object_name, 1, v_t.bsize)
from   t_4000, v_t;
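If the goal is to right-size permanently rather than trim at query time, the column definition itself can be shrunk once the actual maximum length is known. A sketch only; the target width of 128 is an assumption, and Oracle will reject the change if any existing value is longer than the new width:

-- 128 is an assumed target; pick it from the max(length()) check above
alter table t_4000 modify ( object_name varchar2(128) );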