
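Cassandra Cluster Deployment

How to deploy a production Cassandra cluster

This article walks through installing and deploying a Cassandra 3.11.6 cluster, in the following steps:

  • Prepare the installation package
  • Install the base environment
  • Configure the cluster nodes
  • Deploy the cluster
  • Use the nodetool utility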

Prepare the installation package

Install the base environment

Project path (example: /work/cassandra)

# mkdir -p /work/cassandra

Install JDK 1.8 (jdk1.8.0_191 is used as the example)

# echo ${JAVA_HOME}
/opt/soft/jdk/jdk1.8.0_191
#
# java -version
java version "1.8.0_191"
Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)
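
Cassandra 3.11 runs on Java 8, so it can be handy to check the version in a script before continuing. The following is only a minimal sketch: the function name is_java8 is made up for illustration, and it looks at nothing but the first line of the version banner.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a `java -version` banner reports Java 8.
# The banner is passed in as an argument so the check can be tried without a JDK.
is_java8() {
    # The first line looks like: java version "1.8.0_191"
    version=$(printf '%s\n' "$1" | sed -n '1s/.*version "\([^"]*\)".*/\1/p')
    case "$version" in
        1.8.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Usage on a real host (java prints its banner to stderr):
#   is_java8 "$(java -version 2>&1)" && echo "Java 8 found"
```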

Check the Python environment

# python --version
Python 2.7.14

Extract the Cassandra package

# tar xf apache-cassandra-3.11.6-bin.tar.gz
#
# mv apache-cassandra-3.11.6/* /work/cassandra
#
# cd /work/cassandra
# pwd
/work/cassandra
#
# ls
bin conf doc interface javadoc lib pylib tools
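
As an alternative to extracting and then moving the files, tar can unpack straight into the target directory. This is a minimal sketch, assuming a tar that supports --strip-components (GNU tar and bsdtar both do); extract_cassandra is a hypothetical helper name, not part of Cassandra.

```shell
#!/bin/sh
# Hypothetical helper: unpack a Cassandra tarball directly into the install directory.
# --strip-components=1 drops the top-level apache-cassandra-<version>/ folder,
# so bin/, conf/, lib/ and the rest land directly under the destination.
extract_cassandra() {
    tarball=$1
    dest=$2
    mkdir -p "$dest" && tar -xzf "$tarball" --strip-components=1 -C "$dest"
}

# Usage:
#   extract_cassandra apache-cassandra-3.11.6-bin.tar.gz /work/cassandra
```

Unlike `mv dir/*`, this approach also carries over any hidden files, which the `*` glob skips.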

Configure the cluster nodes

Cassandra base environment configuration

Add the cassandra user

# Unlock the account files (clear the immutable attribute)
# chattr -i /etc/passwd;chattr -i /etc/shadow;chattr -i /etc/group;chattr -i /etc/gshadow
#
# useradd cassandra
#
# id cassandra
uid=1008(cassandra) gid=1008(cassandra) groups=1008(cassandra)
#
# Lock the account files again (restore the immutable attribute)
# chattr +i /etc/passwd;chattr +i /etc/shadow;chattr +i /etc/group;chattr +i /etc/gshadow

Add environment variables

# vim /etc/profile.d/cassandra.sh
# Write the following content and save
export CASSANDRA_HOME=/work/cassandra
export PATH=$PATH:$CASSANDRA_HOME/bin
# . /etc/profile

Create the Cassandra data and log directories

# cd /work/cassandra
#
# Create the data and log directories
# mkdir data logs
#
# ls
bin conf data doc interface javadoc lib logs pylib tools
#
# Set ownership of the database directory
# chown -R cassandra:cassandra .

Create the start and stop scripts

# pwd
/work/cassandra
#
# cat start.sh
nohup bin/cassandra > /dev/null 2>&1 &
#
# cat bin/stop-server
user=$(whoami)
pgrep -u "$user" -f cassandra | xargs kill -15
# cat stop.sh
bin/stop-server
#

Remember to give the cassandra user ownership of the database directory, including the scripts created above.
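
Files created after the chown step, such as start.sh and stop.sh, keep the owner of whoever created them. The sketch below is one way to spot such leftovers; list_not_owned is a hypothetical helper name, and it relies only on the standard find -user test.

```shell
#!/bin/sh
# Hypothetical helper: print every path under a directory that is NOT owned
# by the given user (a user name or a numeric user ID).
list_not_owned() {
    owner=$1
    dir=$2
    find "$dir" ! -user "$owner" -print
}

# Usage: an empty result means everything already belongs to cassandra.
#   list_not_owned cassandra /work/cassandra
```

If anything is listed, re-running `chown -R cassandra:cassandra /work/cassandra` as root fixes it.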

Configuration files

  • cassandra.yaml
    cluster_name: 'Test Cluster'
    num_tokens: 256
    hinted_handoff_enabled: true
    hinted_handoff_throttle_in_kb: 1024
    max_hints_delivery_threads: 2
    hints_directory: /work/cassandra/data/hints
    hints_flush_period_in_ms: 10000
    max_hints_file_size_in_mb: 128
    batchlog_replay_throttle_in_kb: 1024
    authenticator: AllowAllAuthenticator
    authorizer: AllowAllAuthorizer
    role_manager: CassandraRoleManager
    roles_validity_in_ms: 2000
    permissions_validity_in_ms: 2000
    credentials_validity_in_ms: 2000
    partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    data_file_directories:
        - /work/cassandra/data
    commitlog_directory: /work/cassandra/data/commitlog
    cdc_enabled: false
    disk_failure_policy: stop
    commit_failure_policy: stop
    prepared_statements_cache_size_mb:
    thrift_prepared_statements_cache_size_mb:
    key_cache_size_in_mb:
    key_cache_save_period: 14400
    row_cache_size_in_mb: 0
    row_cache_save_period: 0
    counter_cache_size_in_mb:
    counter_cache_save_period: 7200
    saved_caches_directory: /work/cassandra/data/saved_caches
    commitlog_sync: periodic
    commitlog_sync_period_in_ms: 10000
    commitlog_segment_size_in_mb: 32
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: "192.168.1.78,192.168.1.66,192.168.1.67"
    concurrent_reads: 32
    concurrent_writes: 32
    concurrent_counter_writes: 32
    concurrent_materialized_view_writes: 32
    memtable_allocation_type: heap_buffers
    index_summary_capacity_in_mb:
    index_summary_resize_interval_in_minutes: 60
    trickle_fsync: false
    trickle_fsync_interval_in_kb: 10240
    storage_port: 7000
    ssl_storage_port: 7001
    listen_address: 192.168.1.78
    start_native_transport: true
    native_transport_port: 9042
    start_rpc: true
    rpc_address: 192.168.1.78
    rpc_port: 9160
    rpc_keepalive: true
    rpc_server_type: sync
    auto_snapshot: true
    column_index_size_in_kb: 64
    column_index_cache_size_in_kb: 2
    compaction_throughput_mb_per_sec: 16
    sstable_preemptive_open_interval_in_mb: 50
    read_request_timeout_in_ms: 5000
    range_request_timeout_in_ms: 10000
    write_request_timeout_in_ms: 2000
    counter_write_request_timeout_in_ms: 5000
    cas_contention_timeout_in_ms: 1000
    truncate_request_timeout_in_ms: 60000
    request_timeout_in_ms: 10000
    slow_query_log_timeout_in_ms: 500
    cross_node_timeout: false
    endpoint_snitch: SimpleSnitch
    dynamic_snitch_update_interval_in_ms: 100
    dynamic_snitch_reset_interval_in_ms: 600000
    dynamic_snitch_badness_threshold: 0.1
    request_scheduler: org.apache.cassandra.scheduler.NoScheduler
    internode_compression: dc
    inter_dc_tcp_nodelay: false
    tracetype_query_ttl: 86400
    tracetype_repair_ttl: 604800
    enable_user_defined_functions: false
    enable_scripted_user_defined_functions: false
    windows_timer_interval: 1
    gc_warn_threshold_in_ms: 1000
    back_pressure_enabled: false
    back_pressure_strategy:
        - class_name: org.apache.cassandra.net.RateBasedBackPressure
          parameters:
              - high_ratio: 0.90
                factor: 5
                flow: FAST
    enable_materialized_views: true
    enable_sasi_indexes: true
  • jvm.options: cap the JVM heap size to avoid OOM
    -Xms16G
    -Xmx16G
    -Xmn4G
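
In the cassandra.yaml shown above, listen_address and rpc_address hold the node's own IP, so those two lines differ on each server. The following is a minimal sketch of how a shared copy of the file might be adjusted per node; set_node_address is a hypothetical helper, and it assumes both keys appear as uncommented top-level lines.

```shell
#!/bin/sh
# Hypothetical helper: point listen_address and rpc_address at this node's IP.
# The result is written to a temporary file first, so a failed sed leaves the
# original untouched; copying back with cat keeps the file's owner and mode.
set_node_address() {
    yaml=$1
    ip=$2
    tmp="$yaml.tmp.$$"
    sed -e "s/^listen_address:.*/listen_address: $ip/" \
        -e "s/^rpc_address:.*/rpc_address: $ip/" \
        "$yaml" > "$tmp" || { rm -f "$tmp"; return 1; }
    cat "$tmp" > "$yaml"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage on node 192.168.1.66:
#   set_node_address /work/cassandra/conf/cassandra.yaml 192.168.1.66
```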

Cluster deployment

Server list

  • 192.168.1.66
  • 192.168.1.67
  • 192.168.1.78

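Startup steps

  • On every node, create the cassandra user and install the JDK, Cassandra, and Python
  • Use the same cluster name on every node
  • Assign each node an IP address
  • Choose the seed nodes; not every node needs to be a seed
  • For a multi-datacenter deployment, settle on a naming convention for each datacenter and rack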
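
Because every node must use the same cluster name, it can be worth comparing the configuration files before the first start. This is a rough sketch under stated assumptions: yaml_value and same_setting are hypothetical helpers, they only understand simple top-level "key: value" lines, and the per-node files are assumed to have been copied to one machine.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a top-level "key: value" line.
yaml_value() {
    sed -n "s/^$1:[[:space:]]*//p" "$2" | head -n 1
}

# Hypothetical helper: succeeds when every file listed after the key
# has the same value for that key.
same_setting() {
    key=$1
    shift
    first=$(yaml_value "$key" "$1")
    for f in "$@"; do
        [ "$(yaml_value "$key" "$f")" = "$first" ] || return 1
    done
}

# Usage (file names are examples):
#   same_setting cluster_name node66.yaml node67.yaml node78.yaml \
#       || echo "cluster_name differs between nodes"
```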

Start a node

# pwd
/work/cassandra
#
# Switch to the cassandra user
# su cassandra
#
$ ./start.sh

To stop a node

# pwd
/work/cassandra
#
# Switch to the cassandra user
# su cassandra
#
$ ./stop.sh

Check the cluster status

# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load      Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.1.66  3.97 GiB  256     32.3%             641c928e-873d-4d6d-9c1e-a473206193f4  rack1
UN  192.168.1.67  3.99 GiB  256     32.6%             ca4b1c70-1077-4963-89ed-583307cd2b17  rack1
UN  192.168.1.78  4.52 GiB  256     35.1%             0618f9f6-ac66-43cb-bf39-edd70519d409  rack1
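
In this output the first column combines status (U or D) and state (N, L, J, or M), so a healthy node shows UN. A small filter can flag anything else; the sketch below assumes the column layout shown above, and nodes_not_un is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `nodetool status` output on stdin and print the
# address of every node whose status/state code is not UN (Up/Normal).
# Header and separator lines are skipped because their first field is not
# a two-letter status/state code.
nodes_not_un() {
    awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { print $2 }'
}

# Usage:
#   nodetool status | nodes_not_un
```

An empty result means every node listed is up and in the normal state.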

Using nodetool

  • Getting help for nodetool

    # List all available nodetool commands
    $ nodetool help

    # Show the help text for a specific command
    $ nodetool help command-name

    # Example: show detailed help for the status command
    $ nodetool help status
  • Check the cluster's running status

    $ nodetool status
  • Remove a dead node

    # Command template
    $ nodetool <options> removenode -- <status> | <force> | <ID>

    # Example
    # Remove the node whose Host ID is 641c928e-873d-4d6d-9c1e-a473206193f4
    $ nodetool removenode 641c928e-873d-4d6d-9c1e-a473206193f4

    # Check the status of the removal
    $ nodetool removenode status
  • View a node's load and memory usage

    $ nodetool info
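  • View detailed statistics for each CF (column family, i.e. table), including read/write counts, response latency, and memtable information; in 3.x the same command is also available as nodetool tablestats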

    $ nodetool cfstats
  • Other commands

    $ nodetool help
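
The removenode command takes a Host ID rather than an IP address, so the ID has to be looked up in the status output first. The following sketch shows one way to do that; host_id_for is a hypothetical helper, and it assumes the column layout printed by nodetool status in 3.11, where the Host ID is the second-to-last field.

```shell
#!/bin/sh
# Hypothetical helper: read `nodetool status` output on stdin and print the
# Host ID of the node with the given address.
host_id_for() {
    awk -v ip="$1" '$2 == ip { print $(NF - 1) }'
}

# Usage: look up the ID of a dead node, then remove it.
#   id=$(nodetool status | host_id_for 192.168.1.66)
#   nodetool removenode "$id"
```

removenode is meant for a node that is already down; a node that is still running is normally retired by running nodetool decommission on that node instead.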