Wednesday, October 11, 2017

Running an Oracle Data Pump export from PL/SQL

Using the dbms_datapump Package

I love PL/SQL - it has become an incredibly powerful language. Of course, PL/SQL can be abused - for example, it's well documented that DML written in PL/SQL loops can run much more slowly than a single well-written SQL statement. Even so, I think DBAs don't use PL/SQL nearly enough in their work.

We tend to refresh development databases quite a bit around here. We take a production copy, scrub it, and then create a development database out of it - a very common thing to do.

However, when this happens, certain data that was generated in the previous development database is lost during this refresh process, and we want to preserve it.

One option, of course, is to call Data Pump before the refresh and export the tables in question. Then, after the refresh, we can import the tables. This is all well and good, but I am already running quite a bit of SQL and PL/SQL during the refresh, and it would be nice to export and import the data from the SQL prompt - so I don't have to keep shelling back and forth between Oracle and the OS.

Oracle makes this easy to do with the dbms_datapump PL/SQL package.

First, all of the work I'm doing is run from a shell script. To make it easy to test, I've created different functions in the shell script for all of the actions I need to take.

Specifically:
  • Setting preliminary variables
  • Creating the directories
  • Creating the export function
  • Checking the status of the export

Setting Preliminary Variables

First, the script I have already sets several variables. The ones I'll be using in the new functions are:
  • ${HistoryDumpDate} - this ensures I get a unique name for the dump file. I simply grab the current system date for this value, as seen in this example:

    export HistoryDumpDate="`date +20%y%m%d_%p%H_%M_%S`"
  • ${DestDb} - this is the name of the database I am creating.

Creating The Directories

Next, I need to create the directory, on the file system and in the database, that I am going to put the Data Pump dump file into. My script checks to see if the physical directory exists and, if it does not, creates it.

Once the physical directory is created, I use the Oracle create directory command to create the directory object we need inside the database, as seen here:

f_create_directory ()
{
# DestDb is the name of the database I'm refreshing
export DestDb=$1

if [ ! -d /backup/log_history/${DestDb} ]
then
     mkdir -p /backup/log_history/${DestDb}
fi

# Note that this is a refresh so we assume this directory does not already exist. Otherwise
# we'd create a test for it first.
sqlplus / as sysdba<<WEOF
create directory log_history as '/backup/log_history/${DestDb}';
exit
WEOF
}
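
If you want to double-check that the directory object was created, a quick query against dba_directories does the trick. This is just a sketch of a sanity check, not part of the original functions:

select directory_name, directory_path
from   dba_directories
where  directory_name = 'LOG_HISTORY';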

Creating the Export Function

Having created the directories, we are ready to create the export function. In this function we:

  1. Take two parameters:
    1. The first is the schema name that owns the table.
    2. The second is the name of the table we want to export.
  2. Define the HistoryDumpDate variable mentioned earlier.
  3. Connect to SQL*Plus using a here document (that's the WEOF part).
  4. Run a PL/SQL anonymous block to do the export.

Here is the function:


f_export_log_history_table ()
{
export Schema_Name=$1
export Table_Name=$2

printf "Dumping the log_history_table to /backup/log_history/${DestDb} \n">>$Log

export HistoryDumpDate="`date +20%y%m%d_%p%H_%M_%S`"

sqlplus / as sysdba<<WEOF
declare
     h1 number;
begin
     h1:=DBMS_DATAPUMP.OPEN('EXPORT','TABLE',NULL,'EXPORT_${DestDb}_${Schema_Name}_${Table_Name}_${HistoryDumpDate}','LATEST');
     dbms_datapump.add_file(h1,'history_testdb_${HistoryDumpDate}.dmp','LOG_HISTORY');
     dbms_datapump.metadata_filter(h1, 'SCHEMA_EXPR','IN (''${Schema_Name}'')');
     dbms_datapump.metadata_filter(h1, 'NAME_EXPR','IN (''${Table_Name}'')');
     DBMS_DATAPUMP.START_JOB(h1);
end;
/
exit
WEOF
}


A few things to note about this function.
  • At the moment, I have not added exception handling to the function. I need to go back and do that, and I fully plan to. Right now, I'm in the "get it working" stage of development.
  • You can find the documentation for dbms_datapump in the Oracle Database PL/SQL Packages and Types Reference.
  • The dbms_datapump functions I've used here do the following:
    • OPEN
      • The OPEN function creates a new Data Pump job. 
      • It returns a handle that you will reference in the other functions as you define parameters for the job. 
      • Note that I've indicated this is an EXPORT as opposed to an IMPORT. 
      • The export job name is 'EXPORT_${DestDb}_${Schema_Name}_${Table_Name}_${HistoryDumpDate}'. I use this naming convention to avoid any job name collisions.
      • If there are errors later in your script, it's possible that the job will have been created but left behind. In that case, you may need to clear out the orphaned export job - there's a sketch of one way to do that after this list. This page has a great article on clearing out Data Pump jobs that won't seem to go away.
    •  ADD_FILE
      • This procedure is used to define the dumpfile that I want to use. You can parallelize the data pump export by defining more than one dumpfile. 
    • METADATA_FILTER
      • Note that there are two calls to dbms_datapump.metadata_filter.
      • The first metadata filter defines the schema name that the object I want to export is in. By default, Data Pump will use the schema you are logged into. In my case, I didn't want to use that default, so I had to create the metadata_filter definition you see.
      • The second metadata filter defines the object I want to export. In this case, just a single table.
      • Note that in both filters I've used SCHEMA_EXPR and NAME_EXPR.
        • The EXPR suffix indicates that I'm supplying schema and name expressions.
        • The next parameter is the expression itself - in this case, an IN clause.
        • I could have used the SCHEMA_LIST and NAME_LIST filters instead and just listed the names.
    •  START_JOB - This starts the data pump job.
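
As promised above, here is a minimal sketch of one way to clean up a leftover job, assuming you can spot it in dba_datapump_jobs. The job name and owner below are placeholders - substitute the values the query returns:

select owner_name, job_name, state
from   dba_datapump_jobs;

declare
     h1 number;
begin
     -- attach to the orphaned job by name and owner (placeholders here)
     h1 := dbms_datapump.attach('EXPORT_TESTDB_MYSCHEMA_MYTABLE_20171011_AM09_15_30','SYS');
     -- stop it immediately and drop the master table so it doesn't linger
     dbms_datapump.stop_job(h1, immediate => 1, keep_master => 0);
end;
/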

Once the start_job procedure is executed, the Data Pump job begins running. Note that dbms_datapump.start_job is a procedure, not a function. This is because the whole Data Pump process is asynchronous, so in many ways there really is no success or failure condition for it to return. It COULD raise an exception, of course, and I need to program for that... which I will do later. Below is a rough sketch of what that might look like.
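
For what it's worth, here is a minimal sketch of the direction I'm thinking of taking - it is not the finished function, and the schema, table, and job names are placeholders. dbms_datapump.wait_for_job blocks until the job completes, which also gives the exception handler something meaningful to catch:

declare
     h1        number;
     job_state varchar2(30);
begin
     h1 := dbms_datapump.open('EXPORT','TABLE',NULL,'EXPORT_EXAMPLE_JOB','LATEST');
     dbms_datapump.add_file(h1,'example.dmp','LOG_HISTORY');
     dbms_datapump.metadata_filter(h1,'SCHEMA_EXPR','IN (''MYSCHEMA'')');
     dbms_datapump.metadata_filter(h1,'NAME_EXPR','IN (''MYTABLE'')');
     dbms_datapump.start_job(h1);
     -- wait for the asynchronous job to finish and report its final state
     dbms_datapump.wait_for_job(h1, job_state);
     dbms_output.put_line('Job finished with state: '||job_state);
exception
     when others then
          -- if OPEN itself failed, h1 will be null and this stop will fail too;
          -- a fuller version would account for that
          dbms_datapump.stop_job(h1, immediate => 1, keep_master => 0);
          raise;
end;
/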

Additionally, you can 
  • Import the data using the dbms_datapump package.
  • Check the status of an export or import job using the dbms_datapump package.
  • Set various export/import related parameters, such as remapping and the like.
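
To give a flavor of the first two items, here is a minimal sketch of a table-mode import that remaps the schema, followed by a simple status query. The dump file name, schema names, and job name are all placeholders, and TABLE_EXISTS_ACTION is just one of the parameters dbms_datapump.set_parameter accepts:

declare
     h1 number;
begin
     -- open an IMPORT job instead of an EXPORT job
     h1 := dbms_datapump.open('IMPORT','TABLE',NULL,'IMPORT_EXAMPLE_JOB','LATEST');
     -- read the dump file created earlier from the same directory object
     dbms_datapump.add_file(h1,'history_testdb_example.dmp','LOG_HISTORY');
     -- land the table in a different schema than the one it was exported from
     dbms_datapump.metadata_remap(h1,'REMAP_SCHEMA','MYSCHEMA','OTHERSCHEMA');
     -- replace the table if it already exists in the target schema
     dbms_datapump.set_parameter(h1,'TABLE_EXISTS_ACTION','REPLACE');
     dbms_datapump.start_job(h1);
end;
/

-- the job can then be watched from another session, for example:
select owner_name, job_name, operation, job_mode, state
from   dba_datapump_jobs;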

Tuesday, August 29, 2017

Where are my bind variables...

I remember way back when - you could not even capture the bind variables of a session without tracing the session itself. But now, we have a number of different ways of extracting that information.

Still, there have been several occasions where I've gone looking for the bind variables for a given SQL statement that had performance issues, only to find that some of the bind variables were not captured. That is not a good thing.

I always just wrote it off to some annoying bug, something special that the optimizer did when parsing the SQL statement or some other undocumented functionality that would someday surface.

Lo and behold - the answer seemed to surface today in the documentation. I happened to be looking at the documentation for v$sql_bind_capture and there it was - in black and white - the seeming reason some of my bind variables had been missing:

Bind data
One of the bind values used for the bind variable during a past execution of its associated SQL statement. Bind values are not always captured for this view. Bind values are displayed by this view only when the type of the bind variable is simple (this excludes LONG, LOB, and ADT datatypes) and when the bind variable is used in the WHERE or HAVING clauses of the SQL statement.


There you go! It appears that bind variable capture does not happen in several cases. In the one very repeatable case I had, there are a number of bind variables used before the where clause appears - and none of those is captured. There are scalar subqueries in the select clause, and there are some case expressions that use bind variables - none of these are captured. The first bind variable captured is the first one that shows up in the where clause.
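
If you want to see which binds were and were not captured for a statement, v$sql_bind_capture spells it out - the WAS_CAPTURED column makes the skipped ones easy to spot. A quick sketch (the &sql_id is a SQL*Plus substitution variable - plug in the sql_id you care about):

select name, position, datatype_string, was_captured, last_captured, value_string
from   v$sql_bind_capture
where  sql_id = '&sql_id'
order by position;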

Yet - being a skeptic - after my discovery, I went and did some testing.

I ran the contrived test SQL statement below (tables have been renamed to protect the innocent), trying to validate my assumptions - thinking I would not see the bind variable captured:

select dummy, 
(select count(*) from schema.testtable where test_num>2000000) as total_tables from dual;

Yet, it was indeed captured for this execution as you can see here:

select dbms_sqltune.extract_binds(bind_data) from v$sql where sql_id='01pnj1r5r1n90';

DBMS_SQLTUNE.EXTRACT_BINDS(BIND_DATA)(NAME, POSITION, DUP_POSITION, DATATYPE, DATATYPE_STRING, CHARACTER_SID, PRECISION, SCALE, MAX_LENGTH, LAST_CAPTURED, VALUE_STRING, VALUE_ANYDATA())
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 2, 'NUMBER', NULL, NULL, NULL, 22, '29-AUG-17', '2000000', ANYDATA()))


The obvious answer might seem to be: Well, maybe it works differently when cursor_sharing is set to FORCE - as is the case on our databases. Perhaps it's the conversion of the literal to a bind variable that causes it to get captured (which is my guess). So, what happens if I define the bind variable from the outset?

alter system flush shared_pool;

var my_var number
exec :my_var:=2000000

select dummy, (select count(*) from schema.testtable where test_num>:my_var) as total_tables from dual;

SQL> select dbms_sqltune.extract_binds(bind_data) from v$sql where sql_id='0gp8g0rd67w9w';

DBMS_SQLTUNE.EXTRACT_BINDS(BIND_DATA)(NAME, POSITION, DUP_POSITION, DATATYPE, DATATYPE_STRING, CHARACTER_SID, PRECISION, SCALE, MAX_LENGTH, LAST_CAPTURED, VALUE_STRING, VALUE_ANYDATA())
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_BIND_SET(SQL_BIND(NULL, 1, NULL, 2, 'NUMBER', NULL, NULL, NULL, 22, '29-AUG-17', '2000000', ANYDATA()))

So... now I'm sitting back at square one again.

Based on my initial discovery and theories, I assumed from the description of bind data that nothing appearing before the where clause of the main body of the query would be captured. However, this test seems to refute that belief. Perhaps the complexity of the SQL statement itself has some impact on bind capture. There is a good lesson here - assume nothing until you have tested it out in 64 different ways. (OK, that might be overkill...)

Here I thought I'd found the answer and instead, I just find more questions!

Wooo Hooo!

It's my anniversary today, so I won't be researching this anymore for now... But, it is an interesting problem.

Moving ... once again....

And, we are back....

It's been quite some time since I've blogged anything. My last entry was while I was working at Oracle, over two years ago. I left Oracle about two years ago and moved to Des Moines for a DBA job with Businessolver. It's been a busy two years with a new job, new baby, new house... many new things. My old blog at Oracle is no longer accessible since I don't work there anymore, and I put off returning here to my personal Oracle blogging site - for many reasons.

But.... I've decided that I need to return and try to re-engage the Oracle world. I'm doing all sorts of fun and interesting DBA-related stuff - discovering new stuff, getting burned by stuff... and I really should be sharing my experiences. Not because I'm awesome or great - mostly so you can avoid the mistakes I make as I DBA my way through life. I also hope that by sharing my experiences and discoveries, I might reduce the time you spend learning about something or fixing a problem.

So, borrowing from MacArthur - I have returned. Watch for my first Oracle post in quite some time very soon.

As an afterthought, I'll be at Oracle Open World this year for the first time in many years. If you see me and say, "Is that Freeman?" - it probably is... please take a moment to say "Hi!"