My first patch for MySQL world; fixing BUG #76852

Hi dear MySQL Community.
I have reported a bunch of bugs and wrote some articles, but today is a special for me🙂
Because today I send my first PR for MySQL source code.

So I have logged a bug -> #76852 and now fixed it after year.
Okey, I know I am lazy.

Let’s reproduce the bug.
Simulating full disk condition is easy as:

mkdir /filesystems
dd if=/dev/zero of=/filesystems/tmp_fs bs=1024 count=0 seek=$[1024*10]
mkfs.ext4 /filesystems/tmp_fs
mkdir small_mounted_dir
mount -o loop /filesystems/tmp_fs /home/sh/small_mounted_dir/

Then obtain MySQL source and compile it with DEBUG. If you don’t know how to do it just refer to -> installing-mysql-from-source-cmake-issues

From now you can use MySQL Sandbox to install 1 master and 2 slaves:

make_sandbox_from_source /home/sh/MySQLPakcages/mysql-5.6.32 replication

Inside main path there will be a script for enabling GTID:

/home/sh/sandboxes/rsandbox_mysql-5_6_32/enable_gtid

Run it.

Then stop master using ./stop script inside master folder:

/home/sh/sandboxes/rsandbox_mysql-5_6_32/master/stop

Move binary mysql-bin.* files from datadir to /home/sh/small_mounted_dir/.
Update my.sandbox.cnf and add new path for log-bin variable. Then start MySQL using ./start script.
You can run sysbench to populate data:

 sysbench --test=/home/sh/REPOS/sysbench/sysbench/tests/db/oltp.lua --oltp-test-mode=complex \
 --num-threads=100 --mysql-db=dbtest --mysql-user=root --mysql-password=msandbox --db-driver=mysql \
 --max-requests=0 --innodb-table-size=1000000 --mysql-socket=/tmp/mysql_sandbox23097.sock prepare
  
sysbench --test=/home/sh/REPOS/sysbench/sysbench/tests/db/insert.lua --oltp-test-mode=complex \
--num-threads=100 --mysql-db=dbtest --mysql-user=root --mysql-password=msandbox --db-driver=mysql  \
--max-requests=0 --innodb-table-size=1000000 --mysql-socket=/tmp/mysql_sandbox23097.sock run

Next you will likely see something:

2016-11-08 00:56:32 13417 [Warning] Disk is full writing '/home/sh/small_mounted_dir/mysql-bin.000003' (Errcode: 28 - No space left on device). Waiting for someone to free space...
2016-11-08 00:56:32 13417 [Warning] Retry in 60 secs. Message reprinted in 600 secs

If you try to stop MySQL, you will likely hit -> #83705

And if you try to start you will likely hit -> #76852

According to coredump:

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000000000ae6242 in my_printf_warning (format=0x1026f08 "Disk is full writing '%s' (Errcode: %d - %s). Waiting for someone to free space...")
    at /root/mysql-5.6.24/mysys/my_error.c:260
#2  0x0000000000ac9052 in wait_for_free_space (filename=0x1fe7ee0 "/opt/mysql/datadir/mysql-bin.~rec~", errors=0) at /root/mysql-5.6.24/mysys/errors.c:115
#3  0x0000000000af1ac7 in my_write (Filedes=19, Buffer=0x1964d00 "/opt/mysql/datadir/mysql-bin.000003\n", Count=36, MyFlags=52)
    at /root/mysql-5.6.24/mysys/my_write.c:89
#4  0x0000000000acd5ae in inline_mysql_file_write (src_file=0x1027708 "/root/mysql-5.6.24/mysys/mf_iocache.c", src_line=1788, file=19, 
    buffer=0x1964d00 "/opt/mysql/datadir/mysql-bin.000003\n", count=36, flags=52) at /root/mysql-5.6.24/include/mysql/psi/mysql_file.h:1141
#5  0x0000000000ad078c in my_b_flush_io_cache (info=0x183c1a8, need_append_buffer_lock=0) at /root/mysql-5.6.24/mysys/mf_iocache.c:1787
#6  0x0000000000a7132b in MYSQL_BIN_LOG::sync_purge_index_file (this=0x183b400) at /root/mysql-5.6.24/sql/binlog.cc:4420
#7  0x0000000000a6e206 in MYSQL_BIN_LOG::open_binlog (this=0x183b400, log_name=0x190d130 "/opt/mysql/datadir/mysql-bin", new_name=0x0, 
    io_cache_type_arg=WRITE_CACHE, max_size_arg=1073741824, null_created_arg=false, need_lock_index=true, need_sid_lock=true, extra_description_event=0x0)
    at /root/mysql-5.6.24/sql/binlog.cc:3146
#8  0x000000000063ad8e in init_server_components () at /root/mysql-5.6.24/sql/mysqld.cc:5012
#9  0x000000000063b6e7 in mysqld_main (argc=19, argv=0x186de68) at /root/mysql-5.6.24/sql/mysqld.cc:5455
#10 0x000000000062fc74 in main (argc=10, argv=0x7fffffffe3d8) at /root/mysql-5.6.24/sql/main.cc:25

The problem is somewhere at line 260 in /mysys/my_error.c:

void(*sql_print_warning_hook)(const char *format,...);
void my_printf_warning(const char *format, ...)
{
  va_list args;
  char wbuff[ERRMSGSIZE];
  DBUG_ENTER("my_printf_warning");
  DBUG_PRINT("my", ("Format: %s", format));
  va_start(args,format);
  (void) my_vsnprintf (wbuff, sizeof(wbuff), format, args);
  va_end(args);
  (*sql_print_warning_hook)(wbuff);
  DBUG_VOID_RETURN;
}

It gives SEGFAULT exactly on (*sql_print_warning_hook)(wbuff) which makes clear that, sql_print_warning_hook is not initialized at that point.

So as a suggestion by Sergei Glushchenko the solution can be achieved by checking sql_print_warning_hook prior calling inside my_error.c or initialize earlier sql_print_warning_hook inside sql/mysqld.cc. The best to have both:

I decide to initialize sql_print_warning_hook inside init_server_components():
Took this one:

+  /*
    The subsequent calls may take a long time : e.g. innodb log read.
    Thus set the long running service control manager timeout
   */
@@ -5596,7 +5598,7 @@ int mysqld_main(int argc, char **argv)
   */
   error_handler_hook= my_message_sql;
   start_signal_handler();       // Creates pidfile
-  sql_print_warning_hook = sql_print_warning;
+
[/cpp]

and added to:


@@ -4663,6 +4663,8 @@ static int init_server_components()
     We need to call each of these following functions to ensure that
     all things are initialized so that unireg_abort() doesn't fail
   */
+  sql_print_warning_hook = sql_print_warning;
+
   mdl_init();
   if (table_def_init() | hostname_cache_init(host_cache_size))
     unireg_abort(1);
@@ -5486,7 +5488,7 @@ int mysqld_main(int argc, char **argv)
 #endif
   }

Second thing is to add simple check inside /mysys/my_error.c:

diff --git a/mysys/my_error.c b/mysys/my_error.c
index b1f57ba..4441228 100644
--- a/mysys/my_error.c
+++ b/mysys/my_error.c
@@ -257,7 +257,14 @@ void my_printf_warning(const char *format, ...)
   va_start(args,format);
   (void) my_vsnprintf (wbuff, sizeof(wbuff), format, args);
   va_end(args);
-  (*sql_print_warning_hook)(wbuff);
+  if(sql_print_warning_hook)
+  {
+    (*sql_print_warning_hook)(wbuff);
+  }
+  else
+  {
+    fprintf(stderr, "%s\n", wbuff);
+  }
   DBUG_VOID_RETURN;
 }

After recompiling and retesting, there will be no SEGFAULT anymore.
So it can be considered as a fix, except that, my IDE added/removed some spaces from source code🙂

The PR link -> PR #109

Playing with MySQL Source code; Adding “STOP ALL SLAVES” command

This blog post is a result of reading book.
To be clear -> “Expert MySQL” by Charles Bell -> http://www.apress.com/gp/book/9781430246596
This book explain internals of MySQL based on 5.6.x version. So relatively new and really good resource to use.

So the task is:
Chapter 8, “Extending MySQL High Availability” -> page 310, “Extending Replication”.
Where we should add new command -> “SQLCOM_STOP_SLAVES” executed on master and causing to STOP IO_THREAD on all slaves.
The basic idea is to run “STOP ALL SLAVES” on master, which in turn will appear on binary log and then it will applied on slaves.

Great🙂
MySQL version is 5.6.32.

So, as book advice go to sql/lex.h. “We are adding the new symbol for the command. This file contains the symbol array stored in alphabetical order.”

Added:

{ "SLAVE",            SYM(SLAVE)},
  /* BEGIN CAB MODIFICATION */
  /* Reason for Modification */
  /* Add SLAVES command */
  { "SLAVES",           SYM(SLAVES)},
  /* END CAB MODIFICATION*/
  { "SLOW",             SYM(SLOW)},

Then go to sql/sql_cmd.h to add a new enumeration for the big switch. Locate enum enum_sql_command definition near the top of the file.
Added:

SQLCOM_COMMIT, SQLCOM_SAVEPOINT, SQLCOM_RELEASE_SAVEPOINT,
  /* BEGIN CAB MODIFICATION */
  /* Reason for Modification */
  /*Add SQLCOM_STOP_SLAVES enum */
  SQLCOM_SLAVE_START, SQLCOM_SLAVE_STOP, SQLCOM_STOP_SLAVES,
  /*END CAB MODIFICATION*/
  SQLCOM_BEGIN, SQLCOM_CHANGE_MASTER,

“Next we need to add a new token to be used in the new rule. Once again, the list of tokens is arranged in alphabetical order. Open the sql/sql_yacc.yy and locate the section where new tokens are defined. In this case, we need to add a definition for a token for new command. We will name it SLAVES.”
Added near line 1539/1540:

%token  SLAVE
/* BEGIN CAB MODIFICATION */
/* Reason for Modification */
/*Add SLAVES token */
%token SLAVES
/*END CAB MODIFICATION*/
%token  SLOW

“Next modify the section where the “%type” defintion resides. We need to add new token to this definition.”
Added near line 1855/1856:

slave master_def master_defs master_file_def slave_until_opts
        /* BEGIN CAB MODIFICATION */
        /* Reason for Modification */
        /* Add stop to list of NONE types */
        repair analyze check start stop checksum
        /*END CAB MODIFICATION*/
        field_list field_list_item field_spec kill column_def key_def

“Next, we add a new command definition to the list of commands so that the parser can direct control to the new rule. Notice we add a new ‘or’ condition mapping to a new rule to be evaluated.”
Added near line 2079/2080:

       | start
  /* BEGIN CAB MODIFICATION */
  /* Reason for Modification */
  /* Add stop to list of statement targets */
        | stop
  /*END CAB MODIFICATION*/
        | truncate

“Last, we will add the new rule to process the STOP ALL SLAVES command. The rule simply saves the new enumeration to the lex->sql_command attribute. This is how the code maps the result of the rule(and the processing of the command) to the big switch to a case equal to the enumeration value.”

Added near line 8114/8115:

/* BEGIN CAB MODIFICATION */
/* Reason for Modification */
/* Add rule for stop ALL SLAVES command*/
stop:
          STOP_SYM ALL SLAVES
          {
            LEX *lex= Lex;
            lex->sql_command= SQLCOM_STOP_SLAVES;
          }
        ;
/*END CAB MODIFICATION*/

“With changes to the YACC file complete, we can add a new case for the big switch to ensure the command, once captured by the parser, is directed to code to written the event to the binary log. Normally, the STOP SLAVE command is not replicated. Our code would also override this restriction. Let us add that case statement. Open the sql/sql_parse.cc

Added near line 3159/3160:

    mysql_mutex_unlock(&LOCK_active_mi);
    break;
  }

/* BEGIN CAB MODIFICATION */
/* Reason for Modification */
/* Add case statement for STOP ALL SLAVES command */
  case SQLCOM_STOP_SLAVES:
  {
    if (!lex->no_write_to_binlog)
    {
        res = write_bin_log(thd, TRUE, "STOP SLAVE IO_THREAD", 20);
    }

    break;
  }
/*END CAB MODIFICATION*/


#endif /* HAVE_REPLICATION */

  case SQLCOM_RENAME_TABLE:

“The first statement is designed to check to see if the server is able to write to the binary log. If so, we add a new log event, passing it the STOP SLAVE SQL command. Notice that we use a specific version of the STOP SLAVE command. In this case, we are stopping only IO_THREAD.”

I assume that you are familiar with compiling MySQL from source code. If not then read -> installing-mysql-from-source-cmake-issues

Go ahead and run “make” again. It will likely fail with:

Scanning dependencies of target sql
[ 54%] Building CXX object sql/CMakeFiles/sql.dir/sql_parse.cc.o
[ 54%] Building CXX object sql/CMakeFiles/sql.dir/mysqld.cc.o
In file included from /home/sh/MySQLPakcages/mysql-5.6.32/include/my_compiler.h:151:0,
                 from /home/sh/MySQLPakcages/mysql-5.6.32/include/my_global.h:473,
                 from /home/sh/MySQLPakcages/mysql-5.6.32/sql/mysqld.cc:17:
/home/sh/MySQLPakcages/mysql-5.6.32/sql/mysqld.cc: In function ‘int init_common_variables()’:
/home/sh/MySQLPakcages/mysql-5.6.32/include/my_attribute.h:38:42: error: size of array ‘compile_time_assert’ is negative
 #  define MY_ATTRIBUTE(A) __attribute__(A)
                                          ^
/home/sh/MySQLPakcages/mysql-5.6.32/include/my_global.h:377:52: note: in expansion of macro ‘MY_ATTRIBUTE’
     typedef char compile_time_assert[(X) ? 1 : -1] MY_ATTRIBUTE((unused)); \
                                                    ^
/home/sh/MySQLPakcages/mysql-5.6.32/sql/mysqld.cc:3871:3: note: in expansion of macro ‘compile_time_assert’
   compile_time_assert(sizeof(com_status_vars)/sizeof(com_status_vars[0]) - 1 ==
   ^
sql/CMakeFiles/sql.dir/build.make:3581: recipe for target 'sql/CMakeFiles/sql.dir/mysqld.cc.o' failed
make[2]: *** [sql/CMakeFiles/sql.dir/mysqld.cc.o] Error 1
CMakeFiles/Makefile2:5192: recipe for target 'sql/CMakeFiles/sql.dir/all' failed
make[1]: *** [sql/CMakeFiles/sql.dir/all] Error 2
Makefile:160: recipe for target 'all' failed
make: *** [all] Error 2

As a beginner in source code world, I suddenly found a BUG in book🙂
After reading comments I have found something inside sql/sql_cmd.h:

/*
  When a command is added here, be sure it's also added in mysqld.cc
  in "struct show_var_st status_vars[]= {" ...
*/

So the new command should also be added inside sql/mysqld.cc. But there is no such struct show_var_st status_vars[] there. It took me some time to figure out that instead of status_vars[], command should be added into SHOW_VAR com_status_vars[]
See related report -> #83613

So edited mysqld.cc file near line 3571/3572:

{"slave_stop",           (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SLAVE_STOP]), SHOW_LONG_STATUS},
        /* BEGIN CAB MODIFICATION */
        /* Reason for Modification */
  {"stop_slaves",           (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_STOP_SLAVES]), SHOW_LONG_STATUS},
        /*END CAB MODIFICATION*/

Save and recompile it will succeed. Next is to look at easy way to setup replication and test the new added command. In the book, author used tools from “MySQL Utilities”. But I have used “MySQL Sandbox” as:

make_sandbox_from_source /home/sh/MySQLPakcages/mysql-5.6.32 replication

That’s it, 1 master and 2 slaves are ready to test.

For my bad, ” stop all slaves” command failed with following assertion on master:

Version: '5.6.32-debug-log'  socket: '/tmp/mysql_sandbox23097.sock'  port: 23097  Shahriyar Rzayev's Ubuntu 16.04 MySQL-5.6.32
mysqld: /home/sh/MySQLPakcages/mysql-5.6.32/sql/protocol.cc:515: void Protocol::end_statement(): Assertion `0' failed.
14:49:07 UTC - mysqld got signal 6 ;

So ideally my statement, should return something like “Query OK”. After asking Weixiang Zhai issue fixed by adding my_ok(thd)

So the final code for sql/sql_parse.cc:

    mysql_mutex_unlock(&LOCK_active_mi);
    break;
  }

/* BEGIN CAB MODIFICATION */
/* Reason for Modification */
/* Add case statement for STOP ALL SLAVES command */
  case SQLCOM_STOP_SLAVES:
  {
    if (!lex->no_write_to_binlog)
    {
        res = write_bin_log(thd, TRUE, "STOP SLAVE IO_THREAD", 20);
    }

    if(!res)
    {
        my_ok(thd);
    }

    break;
  }
/*END CAB MODIFICATION*/


#endif /* HAVE_REPLICATION */

  case SQLCOM_RENAME_TABLE:

Recompiled and reinstalled with sandbox, the result of running new command on master:

master [localhost] {msandbox} ((none)) > stop all slaves; 
Query OK, 0 rows affected (0.00 sec)

Result on slaves:

Slave_IO_Running: No 
Slave_SQL_Running: Yes

So it works like a charm🙂

Getting started MySQL Group Replication on Ubuntu with MySQL Sandbox

Welcome to next topic dedicated to Group Replication plugin. Today’s topic is to get started with Group Replication. As you remember we set up our environment using MySQL Sandbox on Ubuntu 14.04 and also we have compiled plugin with MySQL 5.7.8-rc on Ubuntu 14.04. Please refer to this topics respectively:

  1. MySQL Sandbox creating test environment
  2. Compiling MySQL Group replication plugin

So we have already our 3 nodes of MySQL 5.7.8-rc + group replication plugin.
Before starting group replication. We need to play with Corosync. Here is very dedicated article to this -> Corosync Guide

I want to talk about some issues where you may see and help to resolve these problems.
First thing is when you try to start group replication, without starting corosync daemon or while corosync is stopped:

mysql> START GROUP_REPLICATION;
ERROR 3092 (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log.

The error indicates that, you will find some input in error log but you won’t🙂
So just be sure to start corosync daemon before getting your hand with group replication.

Second thing is to save correct user on /etc/corosync/uidgid.d directory.
For eg, if you will start MySQL service with some ‘john’ Linux user, you must put this user into some file. For me, it is ‘sh’ local user on my Ubuntu:

sh@shrzayev:~$ cat /etc/corosync/uidgid.d/users 
uidgid {
 uid: sh
 gid: sh
}

If you will save wrong user you will likely see:

node1 [localhost] {msandbox} ((none)) > start group_replication;
ERROR 3092 (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log.

But again there will be no such input in error log indicating exact issue.
Just be sure to save exact user related to corosync.

These tips came from Pedro Gomes. He is a replication developer from Oracle.
For further discussion please refer to related report -> #78619

Let’s assume you have successfully completed Corosync setup.
Now it is time to config MySQL server for group replication plugin.
For all 3 nodes the my.cnf file should include following settings:

server-id                      = UNIQUE_NUMBER_HERE
binlog-format                  = row
enforce-gtid-consistency       = true
gtid-mode                      = on
log_slave_updates              = 1
master-info-repository         = TABLE
relay-log-info-repository      = TABLE
binlog-checksum                = NONE
transaction-write-set-extraction=MURMUR32

With MySQL Sandbox there are already unique server-ids and some other usefull system variables with my.sandbox.cnf file. Just add non-existing parameters to config file. Start all 3 nodes first time:

sh@shrzayev:~/sandboxes/multi_msb_5_7_8/node[1,2,3]$ ./start
.. sandbox server started

Just be sure that, all servers started successfully:

sh@shrzayev:~/sandboxes/multi_msb_5_7_8/node[1,2,3]$ ./status
node1 on

Now stop all 3 nodes and append –plugin-load=group_replication.so to ./start script:

sh@shrzayev:~/sandboxes/multi_msb_5_7_8/node[1,2,3]$ ./start --plugin-load=group_replication.so
... sandbox server started

You can check plugin status:

node[1,2,3] [localhost] {msandbox} ((none)) > select plugin_name, plugin_status from information_schema.plugins where plugin_name like 'group_%';
+-------------------+---------------+
| plugin_name       | plugin_status |
+-------------------+---------------+
| group_replication | ACTIVE        |
+-------------------+---------------+
1 row in set (0,01 sec)

You have enabled group_replication plugin.
We need to create replication user do it only on node1:

CREATE USER 'rpl_user'@'%' IDENTIFIED BY 'rpl_pass';
GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%';
GRANT SELECT on performance_schema.global_variables TO rpl_user@'%';

Also we need to give a SELECT grant to performance_schema.global_variables table, otherwise you will get an error:

2015-10-08T07:32:26.000461Z 8 [ERROR] Slave I/O for channel 'group_replication_recovery': The slave I/O thread stops because a fatal error is e
ncountered when it try to get the value of SERVER_ID variable from master. Error: SELECT command denied to user 'rpl_user'@'localhost' for tabl
e 'global_variables', Error_code: 1142

Now there are some global variables to change. do it on node1:

SET GLOBAL group_replication_group_name= "8a94f357-aab4-11df-86ab-c80aa9429562";
SET GLOBAL group_replication_recovery_user='rpl_user';
SET GLOBAL group_replication_recovery_password='rpl_pass';
SET GLOBAL group_replication_recovery_retry_count= 2;
SET GLOBAL group_replication_recovery_reconnect_interval= 120;

After all we can start group replication:

node1 [localhost] {msandbox} ((none)) > start group_replication;
Query OK, 0 rows affected (0,16 sec)

In MySQL error log, you will see new entries:

2015-10-08T05:55:22.419952Z 4 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='<NULL>', master_port= 3307, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 3307, master_log_file='', master_log_pos= 4, master_bind=''.
2015-10-08T05:55:22.578212Z 3 [Note] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2015-10-08T05:55:22.578283Z 5 [Note] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log './mysql-relay-group_replication_applier.000001' position: 4
2015-10-08T05:55:22.578306Z 3 [Note] Plugin group_replication reported: 'auto_increment_increment is set to 7'
2015-10-08T05:55:22.578359Z 3 [Note] Plugin group_replication reported: 'auto_increment_offset is set to 101'
2015-10-08T05:55:22.579736Z 0 [Note] Plugin group_replication reported: 'Starting group replication recovery with view_id 571734024:1'
2015-10-08T05:55:22.712797Z 6 [Note] Plugin group_replication reported: 'Only one server alive. Declaring this server as online within the replication group'
2015-10-08T05:55:22.713721Z 0 [Note] Plugin group_replication reported: 'This server was declared online within the replication group'

Even you can notice new connection from corosync log file:

Oct 08 10:55:22 [1257] shrzayev corosync debug   [QB    ] IPC credentials authenticated (1293-7202-19)
Oct 08 10:55:22 [1257] shrzayev corosync debug   [QB    ] connecting to client [7202]
Oct 08 10:55:22 [1257] shrzayev corosync debug   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Oct 08 10:55:22 [1257] shrzayev corosync debug   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Oct 08 10:55:22 [1257] shrzayev corosync debug   [QB    ] shm size:1048589; real_size:1052672; rb->word_size:263168
Oct 08 10:55:22 [1257] shrzayev corosync debug   [MAIN  ] connection created
Oct 08 10:55:22 [1257] shrzayev corosync debug   [CPG   ] lib_init_fn: conn=0x7fe2da6a4e60, cpd=0x7fe2da6a5454
Oct 08 10:55:22 [1257] shrzayev corosync debug   [CPG   ] got procjoin message from cluster node 2130706433 (r(0) ip(127.0.0.1) ) for pid 7202

Now you can check group replication member status:

node1 [localhost] {msandbox} ((none)) > SELECT * FROM performance_schema.replication_connection_statusG
*************************** 1. row ***************************
             CHANNEL_NAME: group_replication_applier
               GROUP_NAME: 8a94f357-aab4-11df-86ab-c80aa9429562
              SOURCE_UUID: 8a94f357-aab4-11df-86ab-c80aa9429562
                THREAD_ID: NULL
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 0
 LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00
 RECEIVED_TRANSACTION_SET: 8a94f357-aab4-11df-86ab-c80aa9429562:1-2
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE: 
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
1 row in set (0,01 sec)

So there is no ERRORs here, it is working.
You can check also group members:

node1 [localhost] {msandbox} ((none)) > SELECT * FROM performance_schema.replication_group_membersG
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 00008379-1111-1111-1111-111111111111
 MEMBER_HOST: shrzayev
 MEMBER_PORT: 8379
MEMBER_STATE: ONLINE
1 row in set (0,01 sec)

At this moment we have only 1 member -> node1.
Let’s add node2 too. Procedure is same set global variables and start group replication:

SET GLOBAL group_replication_group_name= "8a94f357-aab4-11df-86ab-c80aa9429562";
SET GLOBAL group_replication_recovery_user='rpl_user';
SET GLOBAL group_replication_recovery_password='rpl_pass';
SET GLOBAL group_replication_recovery_retry_count= 2;
SET GLOBAL group_replication_recovery_reconnect_interval= 120;

And start group replication:

node2 [localhost] {msandbox} ((none)) > start group_replication;
Query OK, 0 rows affected (0,16 sec)

Same procedure for node3 and we will have 3 nodes on our group:

node1 [localhost] {msandbox} ((none)) > select * from performance_schema.replication_group_membersG
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 00008379-1111-1111-1111-111111111111
 MEMBER_HOST: shrzayev
 MEMBER_PORT: 8379
MEMBER_STATE: ONLINE
*************************** 2. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 00008380-2222-2222-2222-222222222222
 MEMBER_HOST: shrzayev
 MEMBER_PORT: 8380
MEMBER_STATE: ONLINE
*************************** 3. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 00008381-3333-3333-3333-333333333333
 MEMBER_HOST: shrzayev
 MEMBER_PORT: 8381
MEMBER_STATE: ONLINE
3 rows in set (0,00 sec)

That’s all. Here it is, how we actually can use MySQL Sandbox on Ubuntu 14.04 to create environment and to get started with Group Replication plugin.
As well as from now, you should be able to get start with Group Replication.
If you have your very own test cases, feel free to share or to ask.

MySQL Sandbox, creating test environment for Group Replication plugin

Today’s topic is about how to test Group Replication plugin in easy way, with using MySQL Sandbox
But firstly we must compile MySQL with Group Replication plugin, refer to previous topic -> Compiling Group Replication Plugin

Who did not use Sandbox before, please refer to official link and read a bit more. It is wonderful tool for installing, testing new things with MySQL. Especially if you have new MySQL release to test and want to install 3 or 5 instances, just use MySQL Sandbox and it will do this work for you.

In our condition, we have already compiled MySQL with our plugin. So we have source folder in -> /home/sh/Sandboxes/mysql-group-replication-0.5.0-dmr/BIN

With MySQL Sandbox there is a wonderful tool, named make_sandbox_from_source.
Let’s see options available with this tool:

sh@shrzayev:~$ make_sandbox_from_source --help
    The MySQL Sandbox,  version 3.1.00
    (C) 2006-2015 Giuseppe Maxia

*** sandbox_type missing
Syntax: /usr/local/bin/make_sandbox_from_source source_directory sandbox_type [options]

source directory is where you have successfully
    run ./configure && make

sandbox_type is one of 
    single
    replication
    circular
    multiple

options is anything that is needed by the sandbox
    application of your choice

We want to test Group Replication plugin, so we need multiple option it will install 3 nodes of same version of MySQL by default. So after running we will get on our hand 3 nodes of Group Replication enabled MySQL.
We must just specify the source folder as follows:

sh@shrzayev:~$ make_sandbox_from_source /home/sh/Sandboxes/mysql-group-replication-0.5.0-dmr/BIN multiple
can't find executable script 'make_binary_distribution'

Again with appending BUILD folder:

sh@shrzayev:~$ make_sandbox_from_source /home/sh/Sandboxes/mysql-group-replication-0.5.0-dmr/BIN/BUILD multiple
can't find executable script 'make_binary_distribution'

No success.
If you go on and list BUILD folder:

sh@shrzayev:~/Sandboxes/mysql-group-replication-0.5.0-dmr/BIN/BUILD$ ls
mysql-server  mysql-server.build  mysql-server.inst

There is a mysql-server.build folder which we will append and run again:

sh@shrzayev:~$ make_sandbox_from_source /home/sh/Sandboxes/mysql-group-replication-0.5.0-dmr/BIN/BUILD/mysql-server.build/ multiple
can't find a version in Makefile

Error has changed -> can’t find a version in Makefile
After searching i found a BUG report(LP BUG 853764) related to this issue and found a workaround. We should add MYSQL_NO_DASH_VERSION option in Makefile:

sh@shrzayev:~/Sandboxes/mysql-group-replication-0.5.0-dmr/BIN/BUILD$ cd mysql-server.build/
sh@shrzayev:~/Sandboxes/mysql-group-replication-0.5.0-dmr/BIN/BUILD/mysql-server.build$ vim Makefile

And put MYSQL_NO_DASH_VERSION=5.7.8 (for me) anywhere in Makefile, save and exit. Then rerun:

sh@shrzayev:~$ make_sandbox_from_source /home/sh/Sandboxes/mysql-group-replication-0.5.0-dmr/BIN/BUILD/mysql-server.build/ multiple
no old tarball found
CPack: Create package using TGZ
CPack: Install projects
CPack: - Run preinstall target for: MySQL
CPack: - Install project: MySQL
CPack: Create package
CPack: - package: /home/sh/Sandboxes/mysql-group-replication-0.5.0-dmr/BIN/BUILD/mysql-server.build/mysql-5.7.8-rc-linux-x86_64.tar.gz generated.
installing node 1
installing node 2
installing node 3
group directory installed in $HOME/sandboxes/multi_msb_mysql-5_7_8

That’s it we have 3 nodes and we are ready to test plugin.
Check for node1 for eg:

node1 [localhost] {msandbox} ((none)) > select @@plugin_dir;
+-----------------------------------------------------------------------------------------------------+
| @@plugin_dir                                                                                        |
+-----------------------------------------------------------------------------------------------------+
| /home/sh/Sandboxes/mysql-group-replication-0.5.0-dmr/BIN/BUILD/mysql-server.build/5.7.8/lib/plugin/ |
+-----------------------------------------------------------------------------------------------------+
1 row in set (0,00 sec)

sh@shrzayev:~$ ls /home/sh/Sandboxes/mysql-group-replication-0.5.0-dmr/BIN/BUILD/mysql-server.build/5.7.8/lib/plugin/ | grep group_
group_replication_license
group_replication.so

We have our plugin in plugin directory.
In the subsequent topics i will talk about activating and using group replication plugin. Again, we are using Ubuntu 14.04 with MySQL 5.7.8-rc2 and mysql-group-replication-0.5.0-dmr.

Compiling MySQL Group Replication plugin with MySQL 5.7.8-rc2 on Ubuntu

MySQL Group Replication plugin is in labs.mysql.com and is available for EL6 x86_64 version Linux. But most of us have Ubuntu desktops where it should be easier to test this new thing, especially with MySQL Sandbox. After getting source code we should have compile this plugin with MySQL from source. So let’s begin. Extract both mysql group replication archive and mysql source archive:

sh@shrzayev:~/Sandboxes$ ls -l
total 650732
drwxr-xr-x 34 sh sh      4096 Ä°yl 20 17:25 mysql-5.7.8-rc
-rw-rw-r--  1 sh sh  49762480 Avq 20 16:19 mysql-5.7.8-rc.tar.gz
drwxrwxr-x  3 sh sh      4096 Sen 28 12:08 mysql-group-replication-0.5.0-dmr
-rw-rw-r--  1 sh sh    251687 Sen 28 11:57 mysql-group-replication-0.5.0-labs.tar.gz

You will have 2 directories as above. Then, go to mysql-group-replication folder:

sh@shrzayev:~/Sandboxes$ cd mysql-group-replication-0.5.0-dmr/
sh@shrzayev:~/Sandboxes/mysql-group-replication-0.5.0-dmr$ mkdir BIN
sh@shrzayev:~/Sandboxes/mysql-group-replication-0.5.0-dmr$ cd BIN

Inside BIN folder:

> NOTE: there must be no spaces beetween cmake options(between ";")

You can run following cmake command:

cmake ..  -DMYSQL_SERVER_SRC_DIR="/home/sh/Sandboxes/mysql-5.7.8-rc" -DMYSQL_SERVER_CMAKE_ARGS="-DMYSQL_DATADIR=/opt/mysql-5.7.8-grouprepl/datadir;-DSYSCONFDIR=/opt/mysql-5.7.8-grouprepl;-DWITH_SSL=system;-DMYSQL_TCP_PORT=3307;-DMYSQL_UNIX_ADDR=/opt/mysql-5.7.8-grouprepl/mysqld.sock;-DDEFAULT_CHARSET=utf8;-DDEFAULT_COLLATION=utf8_general_ci;-DWITH_DEBUG=1;-DCOMPILATION_COMMENT='Group Replication enabled MySQL-5.7.8';-DOPTIMIZER_TRACE=1;-DWITH_ZLIB=system;-DWITH_VALGRIND=1;-DCMAKE_C_FLAGS=-DHAVE_purify;-DCMAKE_CXX_FLAGS=-DHAVE_purify;-DDOWNLOAD_BOOST=1;-DWITH_BOOST=/home/sh/Sandboxes"

But before running this command you must install corosync and related packages, if not, you will have success with compilation but will get an empty binary package🙂 For further reading refer to related BUG report -> #78600

So install dependencies:

sudo apt-get install corosync corosync-dev libcpg-dev

Then run cmake command above and after sucessful cmake run:

make package

If you list BIN folder:

sh@shrzayev:~/Sandboxes/mysql-group-replication-0.5.0-dmr/BIN$ ls
BUILD           CMakeFiles           CPackConfig.cmake  CPackSourceConfig.cmake  mysql-group-replication-0.5.0-dmr-amd64.tar.gz  SOURCE
CMakeCache.txt  cmake_install.cmake  _CPack_Packages    Makefile                 mysql_server-prefix

You will notice that there is, mysql-group-replication-0.5.0-dmr-amd64.tar.gz package where is group_replication.so file located:

sh@shrzayev:~/Sandboxes/mysql-group-replication-0.5.0-dmr/BIN$ tar -tf mysql-group-replication-0.5.0-dmr-amd64.tar.gz | grep group_replication.so
mysql-group-replication-0.5.0-dmr-amd64/lib/plugin/group_replication.so

From now we have mysql group replication plugin for 64 bit Ubuntu 14.04. We will continue our topic about testing group replication plugin in easy ways. There is some other related issues, if you are interested refer to my previous article: Group Replication Plugin issues

MySQL Group Replication plugin issues with 5.7.8-rc2

Well the new things are cool to test and learn. But it is not possible every time, to have smooth testing process. if you can’t get right software on your hand.

So there are great articles about this new plugin here you can follow all of them: Group Replication Topics

I must say that, plugin available for download as source code and as for Oracle Linux 6 x86_64 from -> labs.mysql.com

So if you want to try it with MySQL 5.7.8-rc2 on Ubuntu you should compile it with MySQL. But you will not be able to compile due to well known ‘boost’ dependency -> see BUG #78600

UPDATE to BUG 78600:

Pedro Gomes clarified the reason of failed compiling process. There must not be any “spaces” between cmake options specified with -DMYSQL_SERVER_CMAKE_ARGS=””. So we have working cmake command as follows:

    cmake ..  -DMYSQL_SERVER_SRC_DIR="/home/sh/Sandboxes/mysql-5.7.8-rc" -DMYSQL_SERVER_CMAKE_ARGS="-DMYSQL_DATADIR=/opt/mysql-5.7.8-grouprepl/datadir;-DSYSCONFDIR=/opt/mysql-5.7.8-grouprepl;-DWITH_SSL=system;-DMYSQL_TCP_PORT=3307;-DMYSQL_UNIX_ADDR=/opt/mysql-5.7.8-grouprepl/mysqld.sock;-DDEFAULT_CHARSET=utf8;-DDEFAULT_COLLATION=utf8_general_ci;-DWITH_DEBUG=1;-DCOMPILATION_COMMENT='Group Replication enabled MySQL-5.7.8';-DOPTIMIZER_TRACE=1;-DWITH_ZLIB=system;-DWITH_VALGRIND=1;-DCMAKE_C_FLAGS=-DHAVE_purify;-DCMAKE_CXX_FLAGS=-DHAVE_purify;-DDOWNLOAD_BOOST=1;-DWITH_BOOST=/home/sh/Sandboxes"

But even you have Oracle Linux 6 x86_64 and even you follow topics: 1. Corosync Guide 2. Getting Started with MySQL Group Replication

You will likely get “Segmentation Fault” after running START GROUP_REPLICATION:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffc8ef2700 (LWP 3294)]
0x00007ffff625d532 in _int_malloc () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install corosynclib-1.4.7-2.el6.x86_64
(gdb) bt
#0  0x00007ffff625d532 in _int_malloc () from /lib64/libc.so.6
#1  0x00007ffff625e991 in malloc () from /lib64/libc.so.6
#2  0x00007ffff6acf0bd in operator new(unsigned long) () from /usr/lib64/libstdc++.so.6
#3  0x00007fffc8f5b2ab in Certifier::Certifier (this=0x7fffbc051d20)
    at /export/home/pb2/build/sb_0-16452762-1441960761.59/release-0.5.0-export-8447366/BIN/BUILD/mysql-server/plugin/group_replication/src/certifier.cc:209
#4  0x00007fffc8f6bb36 in Certification_handler::initialize (this=0x7fffbc0507e0)
    at /export/home/pb2/build/sb_0-16452762-1441960761.59/release-0.5.0-export-8447366/BIN/BUILD/mysql-server/plugin/group_replication/src/handlers/certification_handler.cc:32
#5  0x00007fffc8f634d7 in configure_pipeline (pipeline=0x7fffbc051a80, handler_list=0x7fffbc0506f0, num_handlers=3)
    at /export/home/pb2/build/sb_0-16452762-1441960761.59/release-0.5.0-export-8447366/BIN/BUILD/mysql-server/plugin/group_replication/src/pipeline_factory.cc:143
#6  0x00007fffc8f6364d in get_pipeline (pipeline_type=<value optimized out>, pipeline=0x7fffbc051a80)
    at /export/home/pb2/build/sb_0-16452762-1441960761.59/release-0.5.0-export-8447366/BIN/BUILD/mysql-server/plugin/group_replication/src/pipeline_factory.cc:30
#7  0x00007fffc8f554b0 in Applier_module::setup_applier_module (this=0x7fffbc051920, pipeline_type=STANDARD_GROUP_REPLICATION_PIPELINE, reset_logs=false, 
    stop_timeout=31536000, group_sidno=2)
    at /export/home/pb2/build/sb_0-16452762-1441960761.59/release-0.5.0-export-8447366/BIN/BUILD/mysql-server/plugin/group_replication/src/applier.cc:99
#8  0x00007fffc8f63eca in configure_and_start_applier_module ()
    at /export/home/pb2/build/sb_0-16452762-1441960761.59/release-0.5.0-export-8447366/BIN/BUILD/mysql-server/plugin/group_replication/src/plugin.cc:761
#9  0x00007fffc8f657df in plugin_group_replication_start ()
    at /export/home/pb2/build/sb_0-16452762-1441960761.59/release-0.5.0-export-8447366/BIN/BUILD/mysql-server/plugin/group_replication/src/plugin.cc:289
#10 0x00000000016927c9 in Group_replication_handler::start (this=0x378f2d0) at /home/full_datadir/mysql-5.7.8-rc/sql/rpl_group_replication.cc:56
#11 0x0000000001692b81 in group_replication_start () at /home/full_datadir/mysql-5.7.8-rc/sql/rpl_group_replication.cc:192
#12 0x000000000149739c in mysql_execute_command (thd=0x7fffbc007180) at /home/full_datadir/mysql-5.7.8-rc/sql/sql_parse.cc:3029
#13 0x000000000149e15b in mysql_parse (thd=0x7fffbc007180, parser_state=0x7fffc8ef14a0) at /home/full_datadir/mysql-5.7.8-rc/sql/sql_parse.cc:5255
#14 0x0000000001492dec in dispatch_command (thd=0x7fffbc007180, com_data=0x7fffc8ef1cd0, command=COM_QUERY)
    at /home/full_datadir/mysql-5.7.8-rc/sql/sql_parse.cc:1272
#15 0x00000000014917f3 in do_command (thd=0x7fffbc007180) at /home/full_datadir/mysql-5.7.8-rc/sql/sql_parse.cc:852
#16 0x00000000015c9042 in handle_connection (arg=0x36c1a50) at /home/full_datadir/mysql-5.7.8-rc/sql/conn_handler/connection_handler_per_thread.cc:300
#17 0x0000000001c56c41 in pfs_spawn_thread (arg=0x37d1990) at /home/full_datadir/mysql-5.7.8-rc/storage/perfschema/pfs.cc:2178
#18 0x00007ffff7bc79d1 in start_thread () from /lib64/libpthread.so.0
#19 0x00007ffff62ccb6d in clone () from /lib64/libc.so.6

For further reading refer to related fresh BUG report -> #78627

It is really cool to help community to improve Open Source projects. So go on and test by your own, report interesting things and ofcourse contact with community members. They are all open to discussion🙂

The origin of “Assertion Failed” errors in MySQL

Recently, we have discussed BUGs with our local community members. Where the half off people said that, they have seen “Assertion Failure” errors in error log. And the question was what is this at all?

In general “assert” is a code portion put by developer to handle certain things.

Well, i am not a core MySQL developer and my C/C++ knowledge is limited to simple calculator🙂 But i will try to explain where all these “Assertion Failure” errors come up.
The exact reason of failing Assertion/crash should be known by core developers. I know how to crash, they know how to fix. You help developers to reproduce a crash filling a bug report, by uploading test cases or maybe the core dump, backtrace etc.
Say, for eg, if you try to create a view after getting “full disk” error you will crash MySQL. See related BUG report -> #76827

While testing full disk conditions, if we try to import huge dump, we will see:

2015-04-24 09:07:52 7faf8bcab700 InnoDB: Error: Write to file ./sales/sales.ibd failed at offset 247463936.
InnoDB: 1048576 bytes should have been written, only 299008 were written.
InnoDB: Operating system error number 11.
InnoDB: Check that your OS and file system support files of this size.
InnoDB: Check also that the disk is not full or a disk quota exceeded.
InnoDB: Error number 11 means 'Resource temporarily unavailable'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/operating-system-error-codes.html
2015-04-24 09:07:52 28213 [ERROR] /opt/mysql/bin/mysqld: The table 'sales' is full
2015-04-24 09:08:13 28213 [ERROR] /opt/mysql/bin/mysqld: The table 'sales' is full
2015-04-24 09:08:27 28213 [ERROR] /opt/mysql/bin/mysqld: The table 'sales' is full

After detecting Full Disk error, if you try to create view:

mysql> create view f as select * from sales;
ERROR 2013 (HY000): Lost connection to MySQL server during query

From error log:

mysqld: /root/mysql-5.6.24/mysys/mf_iocache.c:1799: my_b_flush_io_cache: Assertion `info->end_of_file == inline_mysql_file_tell("/root/mysql-5.6.24/mysys/mf_iocache.c", 1799, info->file, (myf) (0))' failed.
13:13:48 UTC - mysqld got signal 6 ;

So that’s all, from error log it obvious that there is a file named: /root/mysql-5.6.24/mysys/mf_iocache.c and on line 1799 there is an assert inside my_b_flush_io_cache function.
If you go ahead and open up this line you will see something like:

else
     {
        info->end_of_file+=(info->write_pos-info->append_read_pos);
       DBUG_ASSERT(info->end_of_file == mysql_file_tell(info->file, MYF(0)));
     }

For now let’s pause here and introduce new things, such as what is “Optimized”, “Debug” and “Valgrind” builds of MySQL. Please watch this wonderfull video recorded by QA expert Roel Van de Paar after you will learn about newly intoduced topics. -> MySQL QA Episode 11

In general the “Optimized” MySQL is a GA version of MySQL released by vendors. It is production ready and it is working as fast as possible. So there is no “unnecessary” codes inside this build.
The “Debug” build is for debugging purpose and there is a DEBUG instrumentation code portions inside source code.
in “Valgrind” build, there are “Debug” + “Valgrind” instrumentation codes inside source code.

So above we saw DBUG_ASSERT(info->end_of_file == mysql_file_tell(info->file, MYF(0))); -> It means that this “assert” code will be shown only with “Debug” build. You will not see this code in “Optimized” MySQL source code.

Ok, let’s go on. As we have mentioned “assert” code is written by developer to handle several conditions. It might be for eg, developer decides that, if variable named num will be equal to NULL something weird is happened, terminate the program at that point.

Let’s write a simple code with our very own “assert”. Here is our assert_test.c file:

#include <stdio.h>      /* printf */
#include <assert.h>     /* assert */

void check_number(int myInt) {
  assert (myInt!=NULL);
  printf ("%dn",myInt);
}

int main ()
{
  int a=10;
  int b = NULL;

  check_number (a);
  check_number (b);

  return 0;
}

We have put an “assert” ensuring that myInt will never be “NULL”.
Compile source code:

    gcc assert_test.c -o assert_test

And run:

sh@shrzayev:~$ ./assert_test
10
assert_test: assert_test.c:5: check_number: Assertion `myInt!=((void *)0)' failed.
Aborted (core dumped)

So as you see, the same error comes up with our little code.
We have assert_test.c file and inside check_number function at line 5 there is an “assert” code which is failed.

I Hope have explained this point. Thank you for reading.