Mesosphere DC/OS Concept

  • DC/OS stands for Data Center Operating System
  • It is a distributed operating system that abstracts the cluster's hardware and software resources and provides container orchestration, package management, networking, logging and metrics, storage and volumes, and identity management.
  • DC/OS has a system space (also called kernel space) and a user space.
  • System space is a protected area, not accessible to users, which handles low-level operations such as resource allocation, security, and process isolation.
  • User space includes user applications, jobs, and services.
  • The built-in package manager can be used to install services into user space.
  • Each DC/OS node also has a host operating system, which manages the underlying machine.
  • DC/OS is made up of many components, most notably a distributed systems kernel and a container orchestration engine (Marathon).
  • DC/OS runs on a cluster of nodes, instead of a single machine.
  • The kernel of DC/OS is based on the Apache Mesos distributed systems kernel.
  • DC/OS is at once a cluster manager, a container platform, and an operating system.
  • A DC/OS cluster is a group of agent nodes coordinated by a group of master nodes.
  • It provides a platform for running containerized tasks, such as those packaged as Docker images.
  • Mesosphere DC/OS Enterprise may include closed-source components and features such as multi-tenancy, fine-grained permissions, secrets management, and end-to-end encryption.
  • Agent nodes provide resources to the cluster --> The resources are bundled into resource offers and made available to registered schedulers --> Schedulers accept these offers and allocate resources to specific tasks by placing them on specific agent nodes --> The agent nodes then spawn executors to manage each task type, and the executors run and manage the tasks assigned to them.
  • You can manage multiple machines as if they were a single computer.
  • DC/OS automates resource management, schedules process placement, facilitates inter-process communication, and simplifies the installation and management of distributed services.
  • It manages both the resources and the tasks running on the agent nodes.
  • DC/OS runs in the cluster and manages the life cycle of the tasks it launches.
  • There are two ways to interact with DC/OS: the web interface and the command-line interface (CLI). Both let you remotely manage and monitor the cluster and its services.
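
The resource offer cycle described in the list above (agents provide resources, resources are bundled into offers, schedulers accept offers and place tasks, executors run them) can be illustrated with a small toy simulation. This is only a sketch for intuition, not Mesos or DC/OS code: the classes `Agent`, `Offer`, and `Task`, the functions `make_offers`, `schedule`, and `launch`, and the first-fit placement rule are all hypothetical names and simplifications chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent node that contributes CPU and memory to the cluster."""
    name: str
    cpus: float
    mem_mb: int
    running: list = field(default_factory=list)


@dataclass
class Offer:
    """A bundle of one agent's free resources, made available to a scheduler."""
    agent: Agent
    cpus: float
    mem_mb: int


@dataclass
class Task:
    """A unit of work with its resource requirements."""
    name: str
    cpus: float
    mem_mb: int


def make_offers(agents):
    """Master side: bundle each agent's free resources into a resource offer."""
    return [Offer(a, a.cpus, a.mem_mb) for a in agents if a.cpus > 0 and a.mem_mb > 0]


def schedule(pending, offers):
    """Scheduler side: accept the first offer that fits each task (first fit).

    Real schedulers use richer placement logic; first fit is an assumption here.
    """
    placements = []
    for task in list(pending):  # iterate over a copy so we can remove from pending
        for offer in offers:
            if offer.cpus >= task.cpus and offer.mem_mb >= task.mem_mb:
                offer.cpus -= task.cpus
                offer.mem_mb -= task.mem_mb
                placements.append((task, offer.agent))
                pending.remove(task)
                break
    return placements


def launch(placements):
    """Agent side: stand-in for an executor that starts each placed task."""
    for task, agent in placements:
        agent.cpus -= task.cpus
        agent.mem_mb -= task.mem_mb
        agent.running.append(task.name)


agents = [Agent("agent-1", cpus=2.0, mem_mb=2048), Agent("agent-2", cpus=1.0, mem_mb=1024)]
pending = [Task("web", 1.5, 1024), Task("db", 1.0, 1024), Task("batch", 2.0, 512)]

placements = schedule(pending, make_offers(agents))
launch(placements)

for agent in agents:
    print(agent.name, "runs", agent.running, "free:", agent.cpus, "cpus,", agent.mem_mb, "MB")
print("still pending:", [t.name for t in pending])
```

In this run, `web` is placed on `agent-1`, `db` on `agent-2`, and `batch` stays pending because no remaining offer has 2 CPUs free. That mirrors the idea that a scheduler can only place tasks against resources it has actually been offered.
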
Ref.: https://docs.d2iq.com/mesosphere/dcos/2.0/overview/what-is-dcos/
