
Troubleshooting

Slow/stuck operations

If you are experiencing apparent hung operations, the first task is to identify where the problem is occurring: in the client, the MDS, or the network connecting them. Start by looking to see if either side has stuck operations (Slow requests (MDS), below), and narrow it down from there.

We can get hints about what’s going on by dumping the MDS cache:

ceph daemon mds.<name> dump cache /tmp/dump.txt

Note

The file dump.txt is on the machine executing the MDS; for a systemd-controlled MDS, this would be in a tmpfs in the MDS container. Use nsenter(1) to locate dump.txt or specify another system-wide path.
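
For example, if the MDS runs in a container (as with cephadm), a sketch along these lines can retrieve the dump from the daemon's mount namespace; the pgrep pattern and output path are illustrative:

# Find the PID of the MDS process, then read /tmp/dump.txt from inside
# its mount namespace (run as root on the MDS host).
MDS_PID=$(pgrep -f ceph-mds | head -1)
nsenter -t "$MDS_PID" -m cat /tmp/dump.txt > /root/mds-cache-dump.txt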

If high logging levels are set on the MDS, that will almost certainly hold the information we need to diagnose and solve the issue.

Stuck during recovery

Stuck in up:replay

If your MDS is stuck in up:replay then it is likely that the journal is very long. Did you see MDS_HEALTH_TRIM cluster warnings saying the MDS is behind on trimming its journal? If the journal has grown very large, it can take hours to read. There is no working around this, but there are steps that can speed things along:

Reduce MDS debugging to 0. Even at the default settings, the MDS logs some messages to memory for dumping if a fatal error is encountered. You can avoid this:

ceph config set mds debug_mds 0
ceph config set mds debug_ms 0
ceph config set mds debug_monc 0

Note that if the MDS fails there will then be virtually no information to determine why. If you can calculate when up:replay will complete, you should restore these configs just prior to that point:

ceph config rm mds debug_mds
ceph config rm mds debug_ms
ceph config rm mds debug_monc

Once you’ve got replay moving along faster, you can calculate when the MDS will complete by examining the journal replay status:

$ ceph tell mds.<fs_name>:0 status | jq .replay_status
{
  "journal_read_pos": 4195244,
  "journal_write_pos": 4195244,
  "journal_expire_pos": 4194304,
  "num_events": 2,
  "num_segments": 2
}

Replay completes when the journal_read_pos reaches the journal_write_pos. The write position will not change during replay. Track the progression of the read position to compute the expected time to complete.
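
For example, a rough shell sketch that samples the read position twice and extrapolates (the fs name and rank are placeholders; this assumes replay is actually making progress):

# Sample journal_read_pos 60 seconds apart and extrapolate.
R1=$(ceph tell mds.<fs_name>:0 status | jq .replay_status.journal_read_pos)
sleep 60
R2=$(ceph tell mds.<fs_name>:0 status | jq .replay_status.journal_read_pos)
W=$(ceph tell mds.<fs_name>:0 status | jq .replay_status.journal_write_pos)
# Bytes remaining divided by bytes replayed per minute:
echo "estimated minutes remaining: $(( (W - R2) / (R2 - R1) ))"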

Avoiding recovery roadblocks

When trying to urgently restore your file system during an outage, here are some things to do:

  • Deny all reconnect to clients. This effectively blocklists all existing CephFS sessions, so all mounts will hang or become unavailable.

    ceph config set mds mds_deny_all_reconnect true
    

    Remember to undo this after the MDS becomes active.

    Note

    This does not prevent new sessions from connecting. For that, see the refuse_client_session file system setting.

  • Extend the MDS heartbeat grace period. This avoids replacing an MDS that appears “stuck” doing some operation. Recovery sometimes involves operations that take longer than expected, especially when recovery is already taking longer than normal (likely the case if you are reading this document). Avoid unnecessary replacement loops by extending the heartbeat grace period:

    ceph config set mds mds_heartbeat_grace 3600
    

    Note

    This has the effect of having the MDS continue to send beacons to the monitors even when its internal “heartbeat” mechanism has not been reset (beat) in several minutes. The previous mechanism for achieving this was via the mds_beacon_grace monitor setting.

  • Disable open file table prefetch. Normally, the MDS will prefetch directory contents during recovery to heat up its cache. During a long recovery, the cache is probably already hot and large. So this behavior can be undesirable. Disable it using:

    ceph config set mds mds_oft_prefetch_dirfrags false
    
  • Turn off clients. Clients reconnecting to the newly up:active MDS may cause new load on the file system while it is getting back on its feet. There will likely be some general maintenance to do before workloads should be resumed. For example, expediting journal trim may be advisable if the recovery took a long time because replay was reading an overly large journal.

    You can do this manually or use the new file system tunable:

    ceph fs set <fs_name> refuse_client_session true
    

    That prevents any clients from establishing new sessions with the MDS.

  • Don’t tweak max_mds. Modifying the FS setting variable max_mds is sometimes wrongly thought to help bring the cluster up. Instead, don’t touch it. If max_mds must be changed in an emergency situation, note that the monitors will require the change to be forced (change max_mds with --yes-i-really-mean-it).

  • Turn off async purge threads. The volumes plugin spawns threads for asynchronously purging trashed or deleted subvolumes. To help with recovery, you can turn off these purge threads:

    ceph config set mgr mgr/volumes/pause_purging true
    

    To resume purging run:

    ceph config set mgr mgr/volumes/pause_purging false
    
  • Turn off async cloner threads. The volumes plugin spawns threads for asynchronously cloning subvolume snapshots. To help with recovery, you can turn off these cloner threads:

    ceph config set mgr mgr/volumes/pause_cloning true
    

    To resume cloning run:

    ceph config set mgr mgr/volumes/pause_cloning false
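
Once the file system is healthy again, remember to roll back whichever of the temporary settings above were applied. A minimal sketch (removing a configuration override restores its default):

ceph config rm mds mds_deny_all_reconnect
ceph config rm mds mds_heartbeat_grace
ceph config rm mds mds_oft_prefetch_dirfrags
ceph fs set <fs_name> refuse_client_session false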
    

Expediting MDS journal trim

If your MDS journal grew too large (maybe your MDS was stuck in up:replay for a long time!), you will want to have the MDS trim its journal more frequently. You will know the journal is too large because of MDS_HEALTH_TRIM warnings.

The main tunable available to do this is the MDS tick interval. The “tick” interval drives several upkeep activities in the MDS. It is strongly recommended that no significant file system load be present when modifying the tick interval. This setting only affects an MDS in up:active. The MDS does not trim its journal during recovery.

ceph config set mds mds_tick_interval 2
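
Once the journal has been trimmed back down and the warnings clear, it is presumably best to remove the override so the tick interval returns to its default:

ceph config rm mds mds_tick_interval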

RADOS Health

If part of the CephFS metadata or data pools is unavailable and CephFS is not responding, it is probably because RADOS itself is unhealthy. Resolve those problems first (see Troubleshooting).

The MDS

If an operation is hung inside the MDS, it will eventually show up in ceph health, identifying “slow requests are blocked”. It may also identify clients as “failing to respond” or misbehaving in other ways. If the MDS identifies specific clients as misbehaving, you should investigate why they are doing so.

Generally it will be the result of

  1. Overloading the system (if you have extra RAM, increase the “mds cache memory limit” config from its default; having a larger active file set than your MDS cache is the #1 cause of this!).

  2. Running an older (misbehaving) client.

  3. Underlying RADOS issues.

Otherwise, you have probably discovered a new bug and should report it to the developers!

Slow requests (MDS)

You can list current operations via the admin socket by running:

ceph daemon mds.<name> dump_ops_in_flight

from the MDS host. Identify the stuck commands and examine why they are stuck. Usually the last “event” will have been an attempt to gather locks, or sending the operation off to the MDS log. If it is waiting on the OSDs, fix them. If operations are stuck on a specific inode, you probably have a client holding caps which prevent others from using it, either because the client is trying to flush out dirty data or because you have encountered a bug in CephFS’ distributed file lock code (the file “capabilities” [“caps”] system).
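
For example, to surface the longest-running operations first, a jq sketch over the JSON output (the ops array with age and description fields is the usual in-flight dump format):

ceph daemon mds.<name> dump_ops_in_flight | jq '.ops | sort_by(-.age) | .[] | {age, description}'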

If it’s a result of a bug in the capabilities code, restarting the MDS is likely to resolve the problem.

If there are no slow requests reported on the MDS, and it is not reporting that clients are misbehaving, either the client has a problem or its requests are not reaching the MDS.

ceph-fuse debugging

ceph-fuse also supports dump_ops_in_flight. See if it has any and where they are stuck.

Debug output

To get more debugging information from ceph-fuse, try running it in the foreground with logging to the console (-d), enabling client debug (--debug-client=20), and enabling prints for each message sent (--debug-ms=1).

If you suspect a potential monitor issue, enable monitor debugging as well (--debug-monc=20).
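
Putting those flags together (the mount point here is illustrative):

ceph-fuse -d --debug-client=20 --debug-ms=1 --debug-monc=20 /mnt/cephfs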

Kernel mount debugging

If there is an issue with the kernel client, the most important thing is figuring out whether the problem is with the kernel client or the MDS. Generally, this is easy to work out. If the kernel client broke directly, there will be output in dmesg. Collect it and any inappropriate kernel state.

Slow requests

Unfortunately the kernel client does not support the admin socket, but it has similar (if limited) interfaces if your kernel has debugfs enabled. There will be a folder in /sys/kernel/debug/ceph/, and that folder (whose name will look something like 28f7427e-5558-4ffd-ae1a-51ec3042759a.client25386880) will contain a variety of files that output interesting information when you cat them. These files are described below; the most interesting when debugging slow requests are the mdsc and osdc files (see the example after this list).

  • bdi: BDI info about the Ceph system (blocks dirtied, written, etc)

  • caps: counts of file “caps” structures in-memory and used

  • client_options: dumps the options provided to the CephFS mount

  • dentry_lru: Dumps the CephFS dentries currently in-memory

  • mdsc: Dumps current requests to the MDS

  • mdsmap: Dumps the current MDSMap epoch and MDSes

  • mds_sessions: Dumps the current sessions to MDSes

  • monc: Dumps the current maps from the monitor, and any “subscriptions” held

  • monmap: Dumps the current monitor map epoch and monitors

  • osdc: Dumps the current ops in-flight to OSDs (ie, file data IO)

  • osdmap: Dumps the current OSDMap epoch, pools, and OSDs
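
For example (the fsid.client directory name will differ on your system):

# List in-flight MDS and OSD requests for a kernel mount:
cd /sys/kernel/debug/ceph/*.client*/
cat mdsc
cat osdc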

If the data pool is in a NEARFULL condition, then the kernel cephfs client will switch to doing writes synchronously, which is quite slow.

Disconnected+Remounted FS

Because CephFS has a “consistent cache”, if your network connection is disrupted for a long enough time, the client will be forcibly disconnected from the system. At this point, the kernel client is in a bind: it cannot safely write back dirty data, and many applications do not handle IO errors correctly on close(). At the moment, the kernel client will remount the FS, but outstanding file system IO may or may not be satisfied. In these cases, you may need to reboot your client system.

You can identify you are in this situation if dmesg/kern.log report something like:

Jul 20 08:14:38 teuthology kernel: [3677601.123718] ceph: mds0 closed our session
Jul 20 08:14:38 teuthology kernel: [3677601.128019] ceph: mds0 reconnect start
Jul 20 08:14:39 teuthology kernel: [3677602.093378] ceph: mds0 reconnect denied
Jul 20 08:14:39 teuthology kernel: [3677602.098525] ceph:  dropping dirty+flushing Fw state for ffff8802dc150518 1099935956631
Jul 20 08:14:39 teuthology kernel: [3677602.107145] ceph:  dropping dirty+flushing Fw state for ffff8801008e8518 1099935946707
Jul 20 08:14:39 teuthology kernel: [3677602.196747] libceph: mds0 172.21.5.114:6812 socket closed (con state OPEN)
Jul 20 08:14:40 teuthology kernel: [3677603.126214] libceph: mds0 172.21.5.114:6812 connection reset
Jul 20 08:14:40 teuthology kernel: [3677603.132176] libceph: reset on mds0

This is an area of ongoing work to improve the behavior. Kernels will soon reliably issue error codes to in-progress IO, although your application(s) may not deal with them well. In the longer term, we hope to allow reconnect and reclaim of data in cases where doing so does not violate POSIX semantics (generally, data which hasn’t been accessed or modified by other clients).

Mounting

Mount 5 Error

A mount 5 error typically occurs if an MDS server is laggy or if it crashed. Ensure at least one MDS is up and running and the cluster is active + healthy.

Mount 12 Error

A mount 12 error with cannot allocate memory usually occurs if there is a version mismatch between the Ceph Client version and the Ceph cluster version. Check the versions using:

ceph -v

If the Ceph Client is behind the Ceph cluster, try to upgrade it:

sudo apt-get update && sudo apt-get install ceph-common

You may need to uninstall, autoclean and autoremove ceph-common and then reinstall it so that you have the latest version.
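
On Debian/Ubuntu, a presumable sequence for that is:

sudo apt-get purge ceph-common
sudo apt-get autoclean
sudo apt-get autoremove
sudo apt-get update && sudo apt-get install ceph-common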

Dynamic Debugging

You can enable dynamic debug against the CephFS module.

Please see: https://github.com/ceph/ceph/blob/master/src/script/kcon_all.sh
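
As a minimal sketch, dynamic debug for the CephFS kernel modules can also be toggled directly through the standard debugfs control file (requires root and a kernel built with CONFIG_DYNAMIC_DEBUG):

# Enable debug printing for the ceph and libceph modules:
echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control
echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control
# Disable again:
echo 'module ceph -p' > /sys/kernel/debug/dynamic_debug/control
echo 'module libceph -p' > /sys/kernel/debug/dynamic_debug/control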

In-memory Log Dump

In-memory logs can be dumped by setting mds_extraordinary_events_dump_interval during lower-level debugging (log level < 10). mds_extraordinary_events_dump_interval is the interval, in seconds, for dumping the recent in-memory logs when there is an Extra-Ordinary event.

The Extra-Ordinary events are classified as:

  • Client Eviction

  • Missed Beacon ACK from the monitors

  • Missed Internal Heartbeats

In-memory Log Dump is disabled by default to prevent log file bloat in a production environment. The below commands enable it:

$ ceph config set mds debug_mds <log_level>/<gather_level>
$ ceph config set mds mds_extraordinary_events_dump_interval <seconds>

The log_level should be < 10 and gather_level should be >= 10 to enable the in-memory log dump. When enabled, the MDS checks for Extra-Ordinary events every mds_extraordinary_events_dump_interval seconds, and if any of them occurs, the MDS dumps the recent in-memory logs containing the relevant event details to the ceph-mds log.
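
For example, with illustrative values (a log level of 1, a gather level of 10, and a dump every 60 seconds):

$ ceph config set mds debug_mds 1/10
$ ceph config set mds mds_extraordinary_events_dump_interval 60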

Note

For higher log levels (log_level >= 10) there is no reason to dump the In-memory Logs, and a warning is logged instead.

The In-memory Log Dump can be disabled using:

$ ceph config set mds mds_extraordinary_events_dump_interval 0

Filesystems Become Inaccessible After an Upgrade

Note

You can avoid operation not permitted errors by running this procedure before an upgrade. operation not permitted errors of the kind discussed here occur after upgrades performed after the Nautilus release.

IF

you have CephFS file systems that have data and metadata pools that were created by a ceph fs new command (meaning that they were not created with the defaults),

OR

you have an existing CephFS file system and are upgrading to a new post-Nautilus major version of Ceph,

THEN

in order for the documented ceph fs authorize... commands to function as documented (and to avoid ‘operation not permitted’ errors when doing file I/O, or similar security-related problems, for all users except the client.admin user), you must first run:

ceph osd pool application set <your metadata pool name> cephfs metadata <your ceph fs filesystem name>

ceph osd pool application set <your data pool name> cephfs data <your ceph fs filesystem name>
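
For example, for a file system named cephfs whose pools are named cephfs_metadata and cephfs_data (the names here are illustrative):

ceph osd pool application set cephfs_metadata cephfs metadata cephfs

ceph osd pool application set cephfs_data cephfs data cephfs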

Otherwise, when the OSDs receive a request to read or write data (not the directory info, but file data) they will not know which Ceph file system name to look up. This is because the default pool names changed from:

data pool=fsname
metadata pool=fsname_metadata

to:

data pool=fsname.data and
metadata pool=fsname.meta

Any setup that used client.admin for all mounts did not run into this problem.

A temporary fix involves changing mount requests to the ‘client.admin’ user and its associated key. A less drastic but half-fix is to change the osd cap for your user to just caps osd = "allow rw" and delete tag cephfs data=....
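
A sketch of that half-fix for a hypothetical client.foo (the mon and mds caps shown must match the ones your entity already has):

# Replaces all caps for client.foo; the osd cap loses its "tag cephfs data=..." clause.
ceph auth caps client.foo mon 'allow r' mds 'allow rw' osd 'allow rw'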

Disabling the Volumes Plugin

In certain scenarios, the Volumes plugin may need to be disabled to allow rapid recovery of the file system. For details, see Disabling the Volumes Plugin.

Reporting Issues

If you have identified a specific issue, please report it with as much explanation as possible. Especially important information:

  • Ceph versions installed on client and server

  • Whether you are using the kernel or fuse client

  • If you are using the kernel client, what kernel version?

  • How many clients are in play, doing what kind of workload?

  • If a system is ‘stuck’, is that affecting all clients or just one?

  • Any ceph health messages

  • Any backtraces in the ceph logs from crashes

If you are satisfied that you have found a bug, please file it on the bug tracker. For more general queries, please write to the ceph-users mailing list.
