mysql_lib api documentation

class mysql_lib.mysql_source[source]

Bases: object

begin_tx()[source]

The method sets the isolation level to repeatable read and begins a transaction

connect_db_buffered()[source]

The method creates a new connection to the mysql database. The connection is made using the dictionary type cursor factory, which is buffered.

connect_db_unbuffered()[source]

The method creates a new connection to the mysql database. The connection is made using the unbuffered cursor factory.

copy_data(schema, table)[source]

The method copy the data between the origin and destination table. The method locks the table read only mode and gets the log coordinates which are returned to the calling method.

Parameters:
  • schema – the origin’s schema

  • table – the table name

Returns:

the log coordinates for the given table

Return type:

dictionary

create_destination_schemas()[source]

Creates the loading schemas in the destination database and associated tables listed in the dictionary self.schema_tables. The method builds a dictionary which associates the destination schema to the loading schema. The loading_schema is named after the destination schema plus with the prefix _ and the _tmp suffix. As postgresql allows, by default up to 64 characters for an identifier, the original schema is truncated to 59 characters, in order to fit the maximum identifier’s length. The mappings are stored in the class dictionary schema_loading. If the source parameter keep_existing_schema is set to true the method doesn’t create the schemas. Instead assumes the schema and the tables are already there.

create_destination_tables()[source]

The method creates the destination tables in the loading schema. The tables names are looped using the values stored in the class dictionary schema_tables.

disconnect_db_buffered()[source]

The method disconnects any connection with dictionary type cursor from the mysql database.

disconnect_db_unbuffered()[source]

The method disconnects any unbuffered connection from the mysql database.

drop_loading_schemas()[source]

The method drops the loading schemas from the destination database. The drop is performed on the schemas generated in create_destination_schemas. The method assumes the class dictionary schema_loading is correctly set.

end_tx()[source]

The method ends the current transaction by rollback - We should never have changed source anyway

generate_select_statements(schema, table)[source]

The generates the csv output and the statements output for the given schema and table. The method assumes there is a buffered database connection active.

Parameters:
  • schema – the origin’s schema

  • table – the table name

Returns:

the select list statements for the copy to csv and the fallback to inserts.

Return type:

dictionary

get_foreign_keys_metadata()[source]

The method collects the foreign key metadata for the detach replica process. Currently doesn’t get the ON UPDATE/ON DELETE triggers

get_master_coordinates()[source]

The method gets the master’s coordinates and return them stored in a dictionary. The method assumes there is a database connection active.

Returns:

the master’s log coordinates for the given table

Return type:

dictionary

get_table_list()[source]

The method pulls the table list from the information_schema. The list is stored in a dictionary which key is the table’s schema.

get_table_metadata(table, schema)[source]

The method builds the table’s metadata querying the information_schema. The data is returned as a dictionary.

Parameters:
  • table – The table name

  • schema – The table’s schema

Returns:

table’s metadata as a cursor dictionary

Return type:

dictionary

get_table_type_map()[source]

The method builds a dictionary with a key per each schema replicated. Each key maps a dictionary with the schema’s tables stored as keys and the column/type mappings. The dictionary is used in the read_replica method, to determine whether a field requires hexadecimal conversion.

init_replica()[source]

The method performs a full init replica for the given source

insert_table_data(ins_arg)[source]

This method is a fallback procedure whether copy_table_data fails. The ins_args is a list with the informations required to run the select for building the insert statements and the slices’s start and stop. The process is performed in memory and can take a very long time to complete.

Parameters:
  • pg_engine – the postgresql engine

  • ins_arg – the list with the insert arguments (slice_insert, schema, table, select_stat,column_list, copy_limit)

lock_table(schema, table)[source]

The method flushes the given table with read lock. The method assumes there is a database connection active.

Parameters:
  • schema – the origin’s schema

  • table – the table name

make_tx_snapshot(schema, table)[source]

The method forces creation of transaction snapshot by making a read of one row from the source table and discarding it

print_progress(iteration, total, schema, table)[source]

Print the copy progress in slices and estimated total slices. In order to reduce noise when the log level is info only the tables copied in multiple slices get the print progress.

Parameters:
  • iteration – The slice number currently processed

  • total – The estimated total slices

  • table_name – The table name

read_replica()[source]

The method gets the batch data from PostgreSQL. If the batch data is not empty then method read_replica_stream is executed to get the rows from the mysql replica stored into the PostgreSQL database. When the method exits the replica_data list is decomposed in the master_data (log name, position and last event’s timestamp). If the flag close_batch is set then the master status is saved in PostgreSQL the batch id returned by the method is is saved in the class variable id_batch. This variable is used to determine whether the old batch should be closed or not. If the variable is not empty then the previous batch gets closed with a simple update of the processed flag.

refresh_schema()[source]

The method performs a sync for an entire schema within a source. The method works in a similar way like init_replica. The swap happens in a single transaction.

set_copy_max_memory()[source]

The method sets the class variable self.copy_max_memory using the value stored in the source setting.

sync_tables()[source]

The method performs a sync for specific tables. The method works in a similar way like init_replica except when swapping the relations. The tables are loaded into a temporary schema and the log coordinates are stored with the table in the replica catalogue. When the load is complete the method drops the existing table and changes the schema for the loaded tables to the destination schema. The swap happens in a single transaction.

unlock_tables()[source]

The method unlocks all tables