Experimental MySQL UDF code partially written in Python

Hi dear all,
I would like to share our experimental tests about, writing UDF with Python.
So the basic idea was writing a UDF for factorial calculation. The interesting part was to move factorial calculation logic to Python code and then calling this Python file inside UDF code.

Here is the repo link -> UDFs-written-in-Python

The factorial calculation is here -> factorial.py
UDF code is here -> factorial.cc

Basic idea with UDF code quite simple as it has 3 parts:
1. initialization part -> factorial_init() -> all necessary checks reside here
2. main logic -> factorial() -> calling Python script, reading data back and showing the result.
3. deinitialization -> factorial_deinit()

Compiled as:

gcc $(/home/sh/REPOS/MYSQL_DEV_ACTIONS/mysql-5.7/mysql-server/5.7.16/bin/mysql_config --cflags) -fPIC -shared -o factorial.so factorial.cc -std=c++11

Copied to plugin_dir:

cp factorial.so /home/sh/REPOS/MYSQL_DEV_ACTIONS/mysql-5.7/mysql-server/5.7.16/lib/plugin/
cp factorial.py /home/sh/REPOS/MYSQL_DEV_ACTIONS/mysql-5.7/mysql-server/5.7.16/lib/plugin/

Activated as:
mysql > create function factorial returns integer soname 'factorial.so';

Result:

root@test1:~# python --version
Python 2.7.12
mysql> select factorial(5);
+--------------+
| factorial(5) |
+--------------+
| 120 |
+--------------+
1 row in set (0.02 sec)

My first patch for MySQL world; fixing BUG #76852

Hi dear MySQL Community.
I have reported a bunch of bugs and wrote some articles, but today is a special for me 🙂
Because today I send my first PR for MySQL source code.

So I have logged a bug -> #76852 and now fixed it after year.
Okey, I know I am lazy.

Let’s reproduce the bug.
Simulating full disk condition is easy as:

mkdir /filesystems
dd if=/dev/zero of=/filesystems/tmp_fs bs=1024 count=0 seek=$[1024*10]
mkfs.ext4 /filesystems/tmp_fs
mkdir small_mounted_dir
mount -o loop /filesystems/tmp_fs /home/sh/small_mounted_dir/

Then obtain MySQL source and compile it with DEBUG. If you don’t know how to do it just refer to -> installing-mysql-from-source-cmake-issues

From now you can use MySQL Sandbox to install 1 master and 2 slaves:

make_sandbox_from_source /home/sh/MySQLPakcages/mysql-5.6.32 replication

Inside main path there will be a script for enabling GTID:

/home/sh/sandboxes/rsandbox_mysql-5_6_32/enable_gtid

Run it.

Then stop master using ./stop script inside master folder:

/home/sh/sandboxes/rsandbox_mysql-5_6_32/master/stop

Move binary mysql-bin.* files from datadir to /home/sh/small_mounted_dir/.
Update my.sandbox.cnf and add new path for log-bin variable. Then start MySQL using ./start script.
You can run sysbench to populate data:

 sysbench --test=/home/sh/REPOS/sysbench/sysbench/tests/db/oltp.lua --oltp-test-mode=complex \
 --num-threads=100 --mysql-db=dbtest --mysql-user=root --mysql-password=msandbox --db-driver=mysql \
 --max-requests=0 --innodb-table-size=1000000 --mysql-socket=/tmp/mysql_sandbox23097.sock prepare
  
sysbench --test=/home/sh/REPOS/sysbench/sysbench/tests/db/insert.lua --oltp-test-mode=complex \
--num-threads=100 --mysql-db=dbtest --mysql-user=root --mysql-password=msandbox --db-driver=mysql  \
--max-requests=0 --innodb-table-size=1000000 --mysql-socket=/tmp/mysql_sandbox23097.sock run

Next you will likely see something:

2016-11-08 00:56:32 13417 [Warning] Disk is full writing '/home/sh/small_mounted_dir/mysql-bin.000003' (Errcode: 28 - No space left on device). Waiting for someone to free space...
2016-11-08 00:56:32 13417 [Warning] Retry in 60 secs. Message reprinted in 600 secs

If you try to stop MySQL, you will likely hit -> #83705

And if you try to start you will likely hit -> #76852

According to coredump:

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000000000ae6242 in my_printf_warning (format=0x1026f08 "Disk is full writing '%s' (Errcode: %d - %s). Waiting for someone to free space...")
    at /root/mysql-5.6.24/mysys/my_error.c:260
#2  0x0000000000ac9052 in wait_for_free_space (filename=0x1fe7ee0 "/opt/mysql/datadir/mysql-bin.~rec~", errors=0) at /root/mysql-5.6.24/mysys/errors.c:115
#3  0x0000000000af1ac7 in my_write (Filedes=19, Buffer=0x1964d00 "/opt/mysql/datadir/mysql-bin.000003\n", Count=36, MyFlags=52)
    at /root/mysql-5.6.24/mysys/my_write.c:89
#4  0x0000000000acd5ae in inline_mysql_file_write (src_file=0x1027708 "/root/mysql-5.6.24/mysys/mf_iocache.c", src_line=1788, file=19, 
    buffer=0x1964d00 "/opt/mysql/datadir/mysql-bin.000003\n", count=36, flags=52) at /root/mysql-5.6.24/include/mysql/psi/mysql_file.h:1141
#5  0x0000000000ad078c in my_b_flush_io_cache (info=0x183c1a8, need_append_buffer_lock=0) at /root/mysql-5.6.24/mysys/mf_iocache.c:1787
#6  0x0000000000a7132b in MYSQL_BIN_LOG::sync_purge_index_file (this=0x183b400) at /root/mysql-5.6.24/sql/binlog.cc:4420
#7  0x0000000000a6e206 in MYSQL_BIN_LOG::open_binlog (this=0x183b400, log_name=0x190d130 "/opt/mysql/datadir/mysql-bin", new_name=0x0, 
    io_cache_type_arg=WRITE_CACHE, max_size_arg=1073741824, null_created_arg=false, need_lock_index=true, need_sid_lock=true, extra_description_event=0x0)
    at /root/mysql-5.6.24/sql/binlog.cc:3146
#8  0x000000000063ad8e in init_server_components () at /root/mysql-5.6.24/sql/mysqld.cc:5012
#9  0x000000000063b6e7 in mysqld_main (argc=19, argv=0x186de68) at /root/mysql-5.6.24/sql/mysqld.cc:5455
#10 0x000000000062fc74 in main (argc=10, argv=0x7fffffffe3d8) at /root/mysql-5.6.24/sql/main.cc:25

The problem is somewhere at line 260 in /mysys/my_error.c:

void(*sql_print_warning_hook)(const char *format,...);
void my_printf_warning(const char *format, ...)
{
  va_list args;
  char wbuff[ERRMSGSIZE];
  DBUG_ENTER("my_printf_warning");
  DBUG_PRINT("my", ("Format: %s", format));
  va_start(args,format);
  (void) my_vsnprintf (wbuff, sizeof(wbuff), format, args);
  va_end(args);
  (*sql_print_warning_hook)(wbuff);
  DBUG_VOID_RETURN;
}

It gives SEGFAULT exactly on (*sql_print_warning_hook)(wbuff) which makes clear that, sql_print_warning_hook is not initialized at that point.

So as a suggestion by Sergei Glushchenko the solution can be achieved by checking sql_print_warning_hook prior calling inside my_error.c or initialize earlier sql_print_warning_hook inside sql/mysqld.cc. The best to have both:

I decide to initialize sql_print_warning_hook inside init_server_components():
Took this one:

+  /*
    The subsequent calls may take a long time : e.g. innodb log read.
    Thus set the long running service control manager timeout
   */
@@ -5596,7 +5598,7 @@ int mysqld_main(int argc, char **argv)
   */
   error_handler_hook= my_message_sql;
   start_signal_handler();       // Creates pidfile
-  sql_print_warning_hook = sql_print_warning;
+
[/cpp]

and added to:


@@ -4663,6 +4663,8 @@ static int init_server_components()
     We need to call each of these following functions to ensure that
     all things are initialized so that unireg_abort() doesn't fail
   */
+  sql_print_warning_hook = sql_print_warning;
+
   mdl_init();
   if (table_def_init() | hostname_cache_init(host_cache_size))
     unireg_abort(1);
@@ -5486,7 +5488,7 @@ int mysqld_main(int argc, char **argv)
 #endif
   }

Second thing is to add simple check inside /mysys/my_error.c:

diff --git a/mysys/my_error.c b/mysys/my_error.c
index b1f57ba..4441228 100644
--- a/mysys/my_error.c
+++ b/mysys/my_error.c
@@ -257,7 +257,14 @@ void my_printf_warning(const char *format, ...)
   va_start(args,format);
   (void) my_vsnprintf (wbuff, sizeof(wbuff), format, args);
   va_end(args);
-  (*sql_print_warning_hook)(wbuff);
+  if(sql_print_warning_hook)
+  {
+    (*sql_print_warning_hook)(wbuff);
+  }
+  else
+  {
+    fprintf(stderr, "%s\n", wbuff);
+  }
   DBUG_VOID_RETURN;
 }

After recompiling and retesting, there will be no SEGFAULT anymore.
So it can be considered as a fix, except that, my IDE added/removed some spaces from source code 🙂

The PR link -> PR #109

Playing with MySQL Source code; Adding “STOP ALL SLAVES” command

This blog post is a result of reading book.
To be clear -> “Expert MySQL” by Charles Bell -> http://www.apress.com/gp/book/9781430246596
This book explain internals of MySQL based on 5.6.x version. So relatively new and really good resource to use.

So the task is:
Chapter 8, “Extending MySQL High Availability” -> page 310, “Extending Replication”.
Where we should add new command -> “SQLCOM_STOP_SLAVES” executed on master and causing to STOP IO_THREAD on all slaves.
The basic idea is to run “STOP ALL SLAVES” on master, which in turn will appear on binary log and then it will applied on slaves.

Great 🙂
MySQL version is 5.6.32.

So, as book advice go to sql/lex.h. “We are adding the new symbol for the command. This file contains the symbol array stored in alphabetical order.”

Added:

{ "SLAVE",            SYM(SLAVE)},
  /* BEGIN CAB MODIFICATION */
  /* Reason for Modification */
  /* Add SLAVES command */
  { "SLAVES",           SYM(SLAVES)},
  /* END CAB MODIFICATION*/
  { "SLOW",             SYM(SLOW)},

Then go to sql/sql_cmd.h to add a new enumeration for the big switch. Locate enum enum_sql_command definition near the top of the file.
Added:

SQLCOM_COMMIT, SQLCOM_SAVEPOINT, SQLCOM_RELEASE_SAVEPOINT,
  /* BEGIN CAB MODIFICATION */
  /* Reason for Modification */
  /*Add SQLCOM_STOP_SLAVES enum */
  SQLCOM_SLAVE_START, SQLCOM_SLAVE_STOP, SQLCOM_STOP_SLAVES,
  /*END CAB MODIFICATION*/
  SQLCOM_BEGIN, SQLCOM_CHANGE_MASTER,

“Next we need to add a new token to be used in the new rule. Once again, the list of tokens is arranged in alphabetical order. Open the sql/sql_yacc.yy and locate the section where new tokens are defined. In this case, we need to add a definition for a token for new command. We will name it SLAVES.”
Added near line 1539/1540:

%token  SLAVE
/* BEGIN CAB MODIFICATION */
/* Reason for Modification */
/*Add SLAVES token */
%token SLAVES
/*END CAB MODIFICATION*/
%token  SLOW

“Next modify the section where the “%type” defintion resides. We need to add new token to this definition.”
Added near line 1855/1856:

slave master_def master_defs master_file_def slave_until_opts
        /* BEGIN CAB MODIFICATION */
        /* Reason for Modification */
        /* Add stop to list of NONE types */
        repair analyze check start stop checksum
        /*END CAB MODIFICATION*/
        field_list field_list_item field_spec kill column_def key_def

“Next, we add a new command definition to the list of commands so that the parser can direct control to the new rule. Notice we add a new ‘or’ condition mapping to a new rule to be evaluated.”
Added near line 2079/2080:

       | start
  /* BEGIN CAB MODIFICATION */
  /* Reason for Modification */
  /* Add stop to list of statement targets */
        | stop
  /*END CAB MODIFICATION*/
        | truncate

“Last, we will add the new rule to process the STOP ALL SLAVES command. The rule simply saves the new enumeration to the lex->sql_command attribute. This is how the code maps the result of the rule(and the processing of the command) to the big switch to a case equal to the enumeration value.”

Added near line 8114/8115:

/* BEGIN CAB MODIFICATION */
/* Reason for Modification */
/* Add rule for stop ALL SLAVES command*/
stop:
          STOP_SYM ALL SLAVES
          {
            LEX *lex= Lex;
            lex->sql_command= SQLCOM_STOP_SLAVES;
          }
        ;
/*END CAB MODIFICATION*/

“With changes to the YACC file complete, we can add a new case for the big switch to ensure the command, once captured by the parser, is directed to code to written the event to the binary log. Normally, the STOP SLAVE command is not replicated. Our code would also override this restriction. Let us add that case statement. Open the sql/sql_parse.cc

Added near line 3159/3160:

    mysql_mutex_unlock(&LOCK_active_mi);
    break;
  }

/* BEGIN CAB MODIFICATION */
/* Reason for Modification */
/* Add case statement for STOP ALL SLAVES command */
  case SQLCOM_STOP_SLAVES:
  {
    if (!lex->no_write_to_binlog)
    {
        res = write_bin_log(thd, TRUE, "STOP SLAVE IO_THREAD", 20);
    }

    break;
  }
/*END CAB MODIFICATION*/


#endif /* HAVE_REPLICATION */

  case SQLCOM_RENAME_TABLE:

“The first statement is designed to check to see if the server is able to write to the binary log. If so, we add a new log event, passing it the STOP SLAVE SQL command. Notice that we use a specific version of the STOP SLAVE command. In this case, we are stopping only IO_THREAD.”

I assume that you are familiar with compiling MySQL from source code. If not then read -> installing-mysql-from-source-cmake-issues

Go ahead and run “make” again. It will likely fail with:

Scanning dependencies of target sql
[ 54%] Building CXX object sql/CMakeFiles/sql.dir/sql_parse.cc.o
[ 54%] Building CXX object sql/CMakeFiles/sql.dir/mysqld.cc.o
In file included from /home/sh/MySQLPakcages/mysql-5.6.32/include/my_compiler.h:151:0,
                 from /home/sh/MySQLPakcages/mysql-5.6.32/include/my_global.h:473,
                 from /home/sh/MySQLPakcages/mysql-5.6.32/sql/mysqld.cc:17:
/home/sh/MySQLPakcages/mysql-5.6.32/sql/mysqld.cc: In function ‘int init_common_variables()’:
/home/sh/MySQLPakcages/mysql-5.6.32/include/my_attribute.h:38:42: error: size of array ‘compile_time_assert’ is negative
 #  define MY_ATTRIBUTE(A) __attribute__(A)
                                          ^
/home/sh/MySQLPakcages/mysql-5.6.32/include/my_global.h:377:52: note: in expansion of macro ‘MY_ATTRIBUTE’
     typedef char compile_time_assert[(X) ? 1 : -1] MY_ATTRIBUTE((unused)); \
                                                    ^
/home/sh/MySQLPakcages/mysql-5.6.32/sql/mysqld.cc:3871:3: note: in expansion of macro ‘compile_time_assert’
   compile_time_assert(sizeof(com_status_vars)/sizeof(com_status_vars[0]) - 1 ==
   ^
sql/CMakeFiles/sql.dir/build.make:3581: recipe for target 'sql/CMakeFiles/sql.dir/mysqld.cc.o' failed
make[2]: *** [sql/CMakeFiles/sql.dir/mysqld.cc.o] Error 1
CMakeFiles/Makefile2:5192: recipe for target 'sql/CMakeFiles/sql.dir/all' failed
make[1]: *** [sql/CMakeFiles/sql.dir/all] Error 2
Makefile:160: recipe for target 'all' failed
make: *** [all] Error 2

As a beginner in source code world, I suddenly found a BUG in book 🙂
After reading comments I have found something inside sql/sql_cmd.h:

/*
  When a command is added here, be sure it's also added in mysqld.cc
  in "struct show_var_st status_vars[]= {" ...
*/

So the new command should also be added inside sql/mysqld.cc. But there is no such struct show_var_st status_vars[] there. It took me some time to figure out that instead of status_vars[], command should be added into SHOW_VAR com_status_vars[]
See related report -> #83613

So edited mysqld.cc file near line 3571/3572:

{"slave_stop",           (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SLAVE_STOP]), SHOW_LONG_STATUS},
        /* BEGIN CAB MODIFICATION */
        /* Reason for Modification */
  {"stop_slaves",           (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_STOP_SLAVES]), SHOW_LONG_STATUS},
        /*END CAB MODIFICATION*/

Save and recompile it will succeed. Next is to look at easy way to setup replication and test the new added command. In the book, author used tools from “MySQL Utilities”. But I have used “MySQL Sandbox” as:

make_sandbox_from_source /home/sh/MySQLPakcages/mysql-5.6.32 replication

That’s it, 1 master and 2 slaves are ready to test.

For my bad, ” stop all slaves” command failed with following assertion on master:

Version: '5.6.32-debug-log'  socket: '/tmp/mysql_sandbox23097.sock'  port: 23097  Shahriyar Rzayev's Ubuntu 16.04 MySQL-5.6.32
mysqld: /home/sh/MySQLPakcages/mysql-5.6.32/sql/protocol.cc:515: void Protocol::end_statement(): Assertion `0' failed.
14:49:07 UTC - mysqld got signal 6 ;

So ideally my statement, should return something like “Query OK”. After asking Weixiang Zhai issue fixed by adding my_ok(thd)

So the final code for sql/sql_parse.cc:

    mysql_mutex_unlock(&LOCK_active_mi);
    break;
  }

/* BEGIN CAB MODIFICATION */
/* Reason for Modification */
/* Add case statement for STOP ALL SLAVES command */
  case SQLCOM_STOP_SLAVES:
  {
    if (!lex->no_write_to_binlog)
    {
        res = write_bin_log(thd, TRUE, "STOP SLAVE IO_THREAD", 20);
    }

    if(!res)
    {
        my_ok(thd);
    }

    break;
  }
/*END CAB MODIFICATION*/


#endif /* HAVE_REPLICATION */

  case SQLCOM_RENAME_TABLE:

Recompiled and reinstalled with sandbox, the result of running new command on master:

master [localhost] {msandbox} ((none)) > stop all slaves; 
Query OK, 0 rows affected (0.00 sec)

Result on slaves:

Slave_IO_Running: No 
Slave_SQL_Running: Yes

So it works like a charm 🙂