
MySQL InnoDB cluster troubleshooting | commands


Cluster Validation:
mysql> SELECT * FROM performance_schema.replication_group_members;
All members should be ONLINE.
mysql> SELECT instance_name, mysql_server_uuid, addresses FROM mysql_innodb_cluster_metadata.instances;
Run this on every node; each instance has its own mysql_server_uuid, but every node must return the same set of rows.
mysql> SELECT @@GTID_EXECUTED;
All nodes should return the same value.
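
If the GTID sets differ, MySQL's built-in GTID_SUBSET() function can tell you whether one node's transactions are contained in another's. A hedged example (the GTID set literal is a placeholder you paste from the other node's @@GTID_EXECUTED):

```sql
-- On node1, paste node2's @@GTID_EXECUTED value as the first argument.
-- Returns 1 if every transaction executed on node2 is also present on node1.
SELECT GTID_SUBSET('<node2_gtid_executed>', @@GTID_EXECUTED) AS node2_contained_in_node1;
```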

Frequently used commands (note: this sequence detaches the node from the cluster and drops its metadata, turning it back into a standalone server, so use it only when rebuilding a node):

mysql> SET SQL_LOG_BIN = 0; 
mysql> stop group_replication;
mysql> set global super_read_only=0;
mysql> drop database mysql_innodb_cluster_metadata;
mysql> RESET MASTER;
mysql> RESET SLAVE ALL;

JS > var cluster = dba.getCluster()
JS > var cluster = dba.getCluster("<Cluster_name>")
JS > var cluster = dba.createCluster('name')

JS > cluster.removeInstance('root@<IP_Address>:<Port_No>',{force: true})
JS > cluster.addInstance('root@<IP_Address>:<Port_No>')
JS > dba.getCluster()
JS > dba.getCluster().status()
 

JS > dba.getCluster().checkInstanceState('root@<IP_Address>:<Port_No>')
JS > dba.getCluster().rejoinInstance('root@<IP_Address>:<Port_No>')


Connect to cluster:
mysql-js> \c root@<host_name>:<port_no>

Find out which node is R/W and which are R/O:
mysql-js> dba.getCluster().status()
Note: If you get the following message, it is time to reboot the cluster from complete outage:
Dba.getCluster: This function is not available through a session to a standalone instance (metadata exists, but GR is not active) (RuntimeError)
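
The primary can also be identified from SQL. A hedged example, assuming MySQL 8.0 (the MEMBER_ROLE column does not exist in 5.7's performance_schema):

```sql
-- Shows each member's state and whether it is the PRIMARY (R/W) node.
SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE, MEMBER_ROLE
FROM performance_schema.replication_group_members;
```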

Restoring a Cluster from Quorum Loss:
Re-establish quorum using the method cluster.forceQuorumUsingPartitionOf()
JS > cluster.forceQuorumUsingPartitionOf("root@<IP address>:<port>")

Ref.: https://dev.mysql.com/doc/refman/5.7/en/mysql-innodb-cluster-working-with-cluster.html

Reboot cluster from complete outage:
MySQL  <node1_ip_address>:<port_no> ssl  JS > dba.rebootClusterFromCompleteOutage('<cluster_name>')
Reconfiguring the cluster '<cluster_name>' from complete outage...
The instance '<node2_ip_address>:<port_no>' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y
The instance '<node3_ip_address>:<port_no>' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y
The cluster was successfully rebooted.
<Cluster:cluster_name>
Get cluster status:
 MySQL  <node1_ip_address>:<port_no> ssl  JS > var cluster = dba.getCluster()
 MySQL  <node1_ip_address>:<port_no> ssl  JS > cluster.status()
{
    "clusterName": "<cluster_name>",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "<node1_ip_address>:<port_no>",
        "ssl": "REQUIRED",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "<node2_ip_address>:<port_no>": {
                "address": "<node2_ip_address>:<port_no>",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "<node1_ip_address>:<port_no>": {
                "address": "<node1_ip_address>:<port_no>",
                "mode": "R/W",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "<node3_ip_address>:<port_no>": {
                "address": "<node3_ip_address>:<port_no>",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            }
        }
    },
    "groupInformationSourceMember": "mysql://root@<node1_ip_address>:<port_no>"
}
Describe cluster:
 MySQL  <node1_ip_address>:<port_no> ssl  JS > var cluster = dba.getCluster()
 MySQL  <node1_ip_address>:<port_no> ssl  JS > cluster.describe();
{
    "clusterName": "<cluster_name>",
    "defaultReplicaSet": {
        "name": "default",
        "topology": [
            {
                "address": "<node1_ip_address>:<port_no>",
                "label": "<node1_ip_address>:<port_no>",
                "role": "HA"
            },
            {
                "address": "<node3_ip_address>:<port_no>",
                "label": "<node3_ip_address>:<port_no>",
                "role": "HA"
            },
            {
                "address": "<node2_ip_address>:<port_no>",
                "label": "<node2_ip_address>:<port_no>",
                "role": "HA"
            }
        ]
    }
}

Remove instance from cluster:
mysql-js> cluster.removeInstance('root@localhost:<port_no>')
Rescan cluster:
cluster.rescan()
Cluster check instance:
mysql-js> cluster.checkInstanceState('root@<node1_ip_address>:<port_no>')
mysql-js> cluster.checkInstanceState('root@<node3_ip_address>:<port_no>')
mysql-js> cluster.checkInstanceState('root@<node2_ip_address>:<port_no>')

MySQL  <node1_ip_address>:<port_no> ssl  JS > cluster.checkInstanceState('root@<node1_ip_address>:<port_no>')
Please provide the password for 'root@<node1_ip_address>:<port_no>': ********
Analyzing the instance replication state...

The instance 'root@<node1_ip_address>:<port_no>' is valid for the cluster.
The instance is fully recoverable.

{
    "reason": "recoverable",
    "state": "ok"
}

cluster.checkInstanceState('<node3_ip_address>:<port_no>')

Dissolve cluster:
mysql-js> cluster.dissolve({force:true})
Rejoin instance:
var cluster = dba.getCluster()
cluster.rejoinInstance('root@<node3_ip_address>:<port_no>')


Debug MySQL cluster:

# mysqlsh --log-level=DEBUG3

 JS > dba.verbose=2
 JS > \c root@<IP_Address>:<Port>
 JS > dba.configureInstance()
 JS > var cluster = dba.createCluster('<cluster_name>')

Getting errors while adding node to an existing cluster:

JS > cluster.addInstance('root@<IP_Address>:<Port>')

"status": "RECOVERING"
"status": "(MISSING)"
[ERROR] [MY-011526] [Repl] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: 0d310ed3-825e-11e9-8729-005056b2df3b:1-3 > Group transactions: 23f0be3a-825b-11e9-b1d9-005056b2d6d7:1-27,
[ERROR] [MY-011583] [Repl] Plugin group_replication reported: 'For details please check performance_schema.replication_connection_status table and error log messages of Slave I/O for channel group_replication_recovery.'
[ERROR] [MY-011582] [Repl] Plugin group_replication reported: 'There was an error when connecting to the donor server. Please check that group_replication_recovery channel credentials and all MEMBER_HOST column values of performance_schema.replication_group_members table are correct and DNS resolvable.'
[ERROR] [MY-010584] [Repl] Slave I/O for channel 'group_replication_recovery': error connecting to master
ERROR: Error joining instance to cluster: '<IP_Address>:<Port>' - Query failed. MySQL Error (3092): ClassicSession.query: The server is not configured properly to be an active member of the group. Please see more details on error log.. Query: START group_replication: MySQL Error (3092):
[ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to
Solution:
Recreate empty directories for the data, log, InnoDB log, and binlog files, execute the following, and then try to add the node again:

mysql> SET sql_log_bin = OFF;
mysql> ALTER USER root@localhost IDENTIFIED BY '<code>';
mysql> UPDATE mysql.user SET host='%' WHERE user='root';
mysql> CREATE USER 'repl'@'%' IDENTIFIED BY '<code>';
mysql> GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
mysql> FLUSH PRIVILEGES;
mysql> SET sql_log_bin = ON;



Getting instance belonging to an unmanaged replication group (RuntimeError):

MySQL JS > dba.getCluster()

Dba.getCluster: This function is not available through a session to an instance belonging to an unmanaged replication group (RuntimeError)
AND
InnoDB cluster status differs between mysqlsh and performance_schema.replication_group_members: performance_schema shows the member as ONLINE, but mysqlsh reports it as MISSING.
Solution:
Restore auto.cnf from backup and restart mysqld. If required, remove the instance and add it back again.
My Oracle Support note: Cluster.status() is out of sync with the Group Replication view replication_group_members (Doc ID 2526137.1)
About auto.cnf and server_uuid:
The auto.cnf file uses the same format as my.cnf and contains the server_uuid. The server_uuid is generated automatically the first time the server starts; on every subsequent startup, MySQL reads and uses the server_uuid from auto.cnf. This value is used in replication: it is a true UUID, in addition to the user-supplied server_id system variable.
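
A sample auto.cnf looks like this (the UUID below is taken from the error-log example above; yours will differ):

```ini
[auto]
server-uuid=0d310ed3-825e-11e9-8729-005056b2df3b
```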

Following description about Oracle E-Business Suite is high level and from documentation https://docs.oracle.com/cd/E26401_01/doc.122/e22954/T202991T531062.htm#5281339 for in depth and detail description refer it. The online patching cycle phases: Prepare Apply Finalize Cutover Cleanup Prepare phase: Start a new online patching cycle, Prepares the environment for patching. $ adop phase=prepare Apply phase: Applies the specified patches to the environment. Apply one or more patches to the patch edition. $ adop phase=apply patches=123456,789101 workers=8 Finalize phase: Performs any final steps required to make the system ready for cutover. Perform the final patching operations that can be executed while the application is still online. $ adop phase=finalize Cutover phase: Shuts down application tier services, makes the patch edition the new run edition, and then restarts application tier services. This is the only phase that involves a brief ...