Note

This document is for a development version of Ceph.

Hammer

Hammer is the 8th stable release of Ceph. It is named after the hammer octopus (Octopus australis).

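v0.94.10 Hammer

This Hammer point release fixes several bugs and adds two new features.

We recommend that all hammer v0.94.x users upgrade.

For more detailed information, see the complete changelog.

New features

ceph-objectstore-tool and ceph-monstore-tool now allow users to rebuild the monitor database from the OSDs. (This feature is especially useful when all monitors fail to start because of leveldb corruption.)

In the RADOS Gateway, it is now possible to reshard the index of an existing bucket using an offline tool.

Usage: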

$ radosgw-admin bucket reshard --bucket=<bucket_name> --num_shards=<num_shards>

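This will create a new linked bucket instance that points to the newly created index objects. The old bucket instance still exists, and for now it is up to the user to remove the old bucket index manually. (Note that IO to the bucket, especially writes, currently has to be quiesced for resharding.)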
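The invocation above can be wrapped in a small script. The following is a minimal Python sketch, not something shipped with this release: the helper name `build_reshard_command` is hypothetical, and it uses only the two flags shown in the usage line. It assembles the argument list without running it, so an operator can review the command and quiesce bucket IO first.

```python
import shlex


def build_reshard_command(bucket_name, num_shards):
    """Return the argv list for the offline reshard invocation shown above.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if not bucket_name:
        raise ValueError("bucket_name must not be empty")
    if int(num_shards) < 1:
        raise ValueError("num_shards must be a positive integer")
    return [
        "radosgw-admin", "bucket", "reshard",
        "--bucket={}".format(bucket_name),
        "--num_shards={}".format(int(num_shards)),
    ]


if __name__ == "__main__":
    # Print the command for review; "mybucket" and 16 are example values.
    argv = build_reshard_command("mybucket", 16)
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Returning a list rather than a shell string keeps bucket names with unusual characters from being reinterpreted if the list is later passed to `subprocess.run`.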

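Other notable changes

v0.94.9 Hammer

This Hammer point release fixes a build issue present in 0.94.8 that prevented us from generating packages for Ubuntu Precise and CentOS 6.x.

We recommend that all Hammer users on v0.94.7 or earlier upgrade.

For more detailed information, see the complete changelog.

Notable changes

  • build/ops: ceph-create-keys loops forever (issue#10913, Sage Weil)

v0.94.8 Hammer

This Hammer point release fixes several bugs.

We recommend that all hammer v0.94.x users upgrade.

For more detailed information, see the complete changelog.

Notable changes

v0.94.7 Hammer

This Hammer point release fixes several minor bugs. It also includes a backport of an improved ‘ceph osd reweight-by-utilization’ command for handling OSDs with higher-than-average utilization.

We recommend that all hammer v0.94.x users upgrade.

For more detailed information, see the complete changelog.

Notable changes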

v0.94.6 Hammer

This Hammer point release fixes a range of bugs, most notably a fix for unbounded growth of the monitor’s leveldb store, and a workaround in the OSD to keep most xattrs small enough to be stored inline in XFS inodes.

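We recommend that all hammer v0.94.x users upgrade.

For more detailed information, see the complete changelog.

Notable changes

v0.94.5 Hammer

This Hammer point release fixes a critical regression in librbd that can cause QEMU/KVM to crash when caching is enabled on images.

All v0.94.4 Hammer users are strongly encouraged to upgrade.

Notable changes

For more detailed information, see the complete changelog.

v0.94.4 Hammer

This Hammer point release fixes several important bugs in Hammer and also fixes interoperability issues that must be resolved before an upgrade to Infernalis. That is, all users of earlier versions of Hammer or any version of Firefly will first need to upgrade to Hammer v0.94.4 or later before upgrading to Infernalis (or future releases).

All v0.94.x Hammer users are strongly encouraged to upgrade.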
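The upgrade-path rule for v0.94.4 can be expressed as a simple version comparison. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the input is assumed to be a plain dotted Firefly or Hammer version string such as `0.80.11` or `v0.94.3`.

```python
# Minimum Hammer release that may upgrade to Infernalis, per the v0.94.4 notes.
MIN_HAMMER_FOR_INFERNALIS = (0, 94, 4)


def parse_version(version):
    """Turn a string such as 'v0.94.4' or '0.80.11' into a tuple of ints."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def needs_hammer_stopover(current_version):
    """Return True if the cluster must first move to Hammer v0.94.4 or later."""
    return parse_version(current_version) < MIN_HAMMER_FOR_INFERNALIS


if __name__ == "__main__":
    for version in ("0.80.11", "0.94.3", "0.94.4", "0.94.10"):
        print(version, needs_hammer_stopover(version))
```

Comparing integer tuples avoids the string-comparison trap where `0.94.10` would sort before `0.94.4`.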

Notable changes

For more detailed information, see the complete changelog.

v0.94.3 Hammer

This Hammer point release fixes a critical (though rare) data corruption bug that could be triggered when logs are rotated via SIGHUP. It also fixes a range of other important bugs in the OSD, monitor, RGW, and CephFS.

All v0.94.x Hammer users are strongly encouraged to upgrade.

Upgrading

  • The pg ls-by-{pool,primary,osd} commands and pg ls now take the argument recovering instead of recovery in order to include the recovering pgs in the listed pgs.
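Scripts that still pass the old state name need a small adjustment. The sketch below is one hedged way to do that in Python: the helper `migrate_pg_ls_args` is hypothetical, it works on the argument list that follows `ceph`, and it assumes that for the ls-by-* forms the first positional argument names the pool, primary, or OSD and must be left untouched.

```python
PG_LS_COMMANDS = {"ls", "ls-by-pool", "ls-by-primary", "ls-by-osd"}


def migrate_pg_ls_args(argv):
    """Replace the old 'recovery' state argument with 'recovering'.

    Only 'pg ls' and 'pg ls-by-*' argument lists are rewritten; anything
    else is returned unchanged (as a copy).
    """
    if len(argv) < 2 or argv[0] != "pg" or argv[1] not in PG_LS_COMMANDS:
        return list(argv)
    # Assumption: for ls-by-* the third token is a pool/OSD name, not a state.
    keep = 2 if argv[1] == "ls" else 3
    head, tail = list(argv[:keep]), argv[keep:]
    return head + ["recovering" if arg == "recovery" else arg for arg in tail]


if __name__ == "__main__":
    print(migrate_pg_ls_args(["pg", "ls", "recovery"]))
```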

Notable changes

  • librbd: aio calls may block (issue#11770, pr#4875, Jason Dillaman)

  • osd: make the all osd/filestore thread pool suicide timeouts separately configurable (issue#11701, pr#5159, Samuel Just)

  • mon: ceph fails to compile with boost 1.58 (issue#11982, pr#5122, Kefu Chai)

  • tests: TEST_crush_reject_empty must not run a mon (issue#12285,11975, pr#5208, Kefu Chai)

  • osd: FAILED assert(!old_value.deleted()) in upgrade:giant-x-hammer-distro-basic-multi run (issue#11983, pr#5121, Samuel Just)

  • build/ops: linking ceph to tcmalloc causes segfault on SUSE SLE11-SP3 (issue#12368, pr#5265, Thorsten Behrens)

  • common: utf8 and old gcc breakage on RHEL6.5 (issue#7387, pr#4687, Kefu Chai)

  • crush: take crashes due to invalid arg (issue#11740, pr#4891, Sage Weil)

  • rgw: need conversion tool to handle fixes following #11974 (issue#12502, pr#5384, Yehuda Sadeh)

  • rgw: Swift API: support for 202 Accepted response code on container creation (issue#12299, pr#5214, Radoslaw Zarzynski)

  • common: Log::reopen_log_file: take m_flush_mutex (issue#12520, pr#5405, Samuel Just)

  • rgw: Properly respond to the Connection header with Civetweb (issue#12398, pr#5284, Wido den Hollander)

  • rgw: multipart list part response returns incorrect field (issue#12399, pr#5285, Henry Chang)

  • build/ops: ceph.spec.in: 95-ceph-osd.rules, mount.ceph, and mount.fuse.ceph not installed properly on SUSE (issue#12397, pr#5283, Nathan Cutler)

  • rgw: radosgw-admin dumps user info twice (issue#12400, pr#5286, guce)

  • doc: fix doc build (issue#12180, pr#5095, Kefu Chai)

  • tests: backport 11493 fixes, and test, preventing ec cache pools (issue#12314, pr#4961, Samuel Just)

  • rgw: does not send Date HTTP header when civetweb frontend is used (issue#11872, pr#5228, Radoslaw Zarzynski)

  • mon: pg ls is broken (issue#11910, pr#5160, Kefu Chai)

  • librbd: A client opening an image mid-resize can result in the object map being invalidated (issue#12237, pr#5279, Jason Dillaman)

  • doc: missing man pages for ceph-create-keys, ceph-disk-* (issue#11862, pr#4846, Nathan Cutler)

  • tools: ceph-post-file fails on rhel7 (issue#11876, pr#5038, Sage Weil)

  • build/ops: rcceph script is buggy (issue#12090, pr#5028, Owen Synge)

  • rgw: Bucket header is enclosed by quotes (issue#11874, pr#4862, Wido den Hollander)

  • build/ops: packaging: add SuSEfirewall2 service files (issue#12092, pr#5030, Tim Serong)

  • rgw: Keystone PKI token expiration is not enforced (issue#11722, pr#4884, Anton Aksola)

  • build/ops: debian/control: ceph-common (>> 0.94.2) must be >= 0.94.2-2 (issue#12529,11998, pr#5417, Loic Dachary)

  • mon: Clock skew causes missing summary and confuses Calamari (issue#11879, pr#4868, Thorsten Behrens)

  • rgw: rados objects wrongly deleted (issue#12099, pr#5117, wuxingyi)

  • tests: kernel_untar_build fails on EL7 (issue#12098, pr#5119, Greg Farnum)

  • fs: Fh ref count will leak if readahead does not need to do read from osd (issue#12319, pr#5427, Zhi Zhang)

  • mon: OSDMonitor: allow addition of cache pool with non-empty snaps with co… (issue#12595, pr#5252, Samuel Just)

  • mon: MDSMonitor: handle MDSBeacon messages properly (issue#11979, pr#5123, Kefu Chai)

  • tools: ceph-disk: get_partition_type fails on /dev/cciss… (issue#11760, pr#4892, islepnev)

  • build/ops: max files open limit for OSD daemon is too low (issue#12087, pr#5026, Owen Synge)

  • mon: add an “osd crush tree” command (issue#11833, pr#5248, Kefu Chai)

  • mon: mon crashes when “ceph osd tree 85 --format json” (issue#11975, pr#4936, Kefu Chai)

  • build/ops: ceph / ceph-dbg steal ceph-objectstore-tool from ceph-test / ceph-test-dbg (issue#11806, pr#5069, Loic Dachary)

  • rgw: DragonDisk fails to create directories via S3: MissingContentLength (issue#12042, pr#5118, Yehuda Sadeh)

  • build/ops: /usr/bin/ceph from ceph-common is broken without installing ceph (issue#11998, pr#5206, Ken Dreyer)

  • build/ops: systemd: Increase max files open limit for OSD daemon (issue#11964, pr#5040, Owen Synge)

  • build/ops: rgw/logrotate.conf calls service with wrong init script name (issue#12044, pr#5055, wuxingyi)

  • common: OPT_INT option interprets 3221225472 as -1073741824, and crashes in Throttle::Throttle() (issue#11738, pr#4889, Kefu Chai)

  • doc: doc/release-notes: v0.94.2 (issue#11492, pr#4934, Sage Weil)

  • common: admin_socket: close socket descriptor in destructor (issue#11706, pr#4657, Jon Bernard)

  • rgw: Object copy bug (issue#11755, pr#4885, Javier M. Mellid)

  • rgw: empty json response when getting user quota (issue#12245, pr#5237, wuxingyi)

  • fs: cephfs Dumper tries to load whole journal into memory at once (issue#11999, pr#5120, John Spray)

  • rgw: Fix tool for #11442 does not correctly fix objects created via multipart uploads (issue#12242, pr#5229, Yehuda Sadeh)

  • rgw: Civetweb RGW appears to report full size of object as downloaded when only partially downloaded (issue#12243, pr#5231, Yehuda Sadeh)

  • osd: stuck incomplete (issue#12362, pr#5269, Samuel Just)

  • osd: start_flush: filter out removed snaps before determining snapc’s (issue#11911, pr#4899, Samuel Just)

  • librbd: internal.cc: 1967: FAILED assert(watchers.size() == 1) (issue#12239, pr#5243, Jason Dillaman)

  • librbd: new QA client upgrade tests (issue#12109, pr#5046, Jason Dillaman)

  • librbd: [ FAILED ] TestLibRBD.ExclusiveLockTransition (issue#12238, pr#5241, Jason Dillaman)

  • rgw: Swift API: XML document generated in response for GET on account does not contain account name (issue#12323, pr#5227, Radoslaw Zarzynski)

  • rgw: keystone does not support chunked input (issue#12322, pr#5226, Hervé Rousseau)

  • mds: MDS is crashed (mds/CDir.cc: 1391: FAILED assert(!is_complete())) (issue#11737, pr#4886, Yan, Zheng)

  • cli: ceph: cli interactive mode does not understand quotes (issue#11736, pr#4776, Kefu Chai)

  • librbd: add valgrind memory checks for unit tests (issue#12384, pr#5280, Zhiqiang Wang)

  • build/ops: admin/build-doc: script fails silently under certain circumstances (issue#11902, pr#4877, John Spray)

  • osd: Fixes for rados ops with snaps (issue#11908, pr#4902, Samuel Just)

  • build/ops: ceph.spec.in: ceph-common subpackage def needs tweaking for SUSE/openSUSE (issue#12308, pr#4883, Nathan Cutler)

  • fs: client: reference counting ‘struct Fh’ (issue#12088, pr#5222, Yan, Zheng)

  • build/ops: ceph.spec: update OpenSUSE BuildRequires (issue#11611, pr#4667, Loic Dachary)

For more detailed information, see the complete changelog.

v0.94.2 Hammer

This Hammer point release fixes a few critical bugs in RGW that can prevent objects starting with underscore from behaving properly and that prevent garbage collection of deleted objects when using the Civetweb standalone mode.

All v0.94.x Hammer users are strongly encouraged to upgrade, and to make note of the repair procedure below if RGW is in use.

Upgrading from previous Hammer release

Bug #11442 introduced a change that made rgw objects that start with underscore incompatible with previous versions. The fix to that bug reverts to the previous behavior. In order to be able to access objects that start with an underscore and were created in prior Hammer releases, following the upgrade it is required to run (for each affected bucket):

$ radosgw-admin bucket check --check-head-obj-locator \
                             --bucket=<bucket> [--fix]
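Because the check has to be run once per affected bucket, it may help to generate the commands from a list. The following Python sketch is illustrative only: `build_bucket_check_commands` is a hypothetical helper, the bucket names are examples, and it uses only the flags shown in the command above.

```python
def build_bucket_check_commands(buckets, fix=False):
    """Build one 'radosgw-admin bucket check' argv list per affected bucket.

    Hypothetical helper: it assembles the commands but does not run them.
    """
    commands = []
    for bucket in buckets:
        argv = [
            "radosgw-admin", "bucket", "check",
            "--check-head-obj-locator",
            "--bucket={}".format(bucket),
        ]
        if fix:
            # Without --fix the command only reports; add it to repair.
            argv.append("--fix")
        commands.append(argv)
    return commands


if __name__ == "__main__":
    for argv in build_bucket_check_commands(["photos", "logs"], fix=False):
        print(" ".join(argv))
```

Running once without `fix` and reviewing the output before repeating with `fix=True` mirrors the optional `[--fix]` in the documented command.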

Notable changes

  • build: compilation error: No high-precision counter available (armhf, powerpc..) (#11432, James Page)

  • ceph-dencoder links to libtcmalloc, and shouldn’t (#10691, Boris Ranto)

  • ceph-disk: disk zap sgdisk invocation (#11143, Owen Synge)

  • ceph-disk: use a new disk as journal disk,ceph-disk prepare fail (#10983, Loic Dachary)

  • ceph-objectstore-tool should be in the ceph server package (#11376, Ken Dreyer)

  • librados: can get stuck in redirect loop if osdmap epoch == last_force_op_resend (#11026, Jianpeng Ma)

  • librbd: A retransmit of proxied flatten request can result in -EINVAL (Jason Dillaman)

  • librbd: ImageWatcher should cancel in-flight ops on watch error (#11363, Jason Dillaman)

  • librbd: Objectcacher setting max object counts too low (#7385, Jason Dillaman)

  • librbd: Periodic failure of TestLibRBD.DiffIterateStress (#11369, Jason Dillaman)

  • librbd: Queued AIO reference counters not properly updated (#11478, Jason Dillaman)

  • librbd: deadlock in image refresh (#5488, Jason Dillaman)

  • librbd: notification race condition on snap_create (#11342, Jason Dillaman)

  • mds: Hammer uclient checking (#11510, John Spray)

  • mds: remove caps from revoking list when caps are voluntarily released (#11482, Yan, Zheng)

  • messenger: double clear of pipe in reaper (#11381, Haomai Wang)

  • mon: Total size of OSDs is a magnitude less than it is supposed to be. (#11534, Zhe Zhang)

  • osd: don’t check order in finish_proxy_read (#11211, Zhiqiang Wang)

  • osd: handle old semi-deleted pgs after upgrade (#11429, Samuel Just)

  • osd: object creation by write cannot use an offset on an erasure coded pool (#11507, Jianpeng Ma)

  • rgw: Improve rgw HEAD request by avoiding read the body of the first chunk (#11001, Guang Yang)

  • rgw: civetweb is hitting a limit (number of threads 1024) (#10243, Yehuda Sadeh)

  • rgw: civetweb should use unique request id (#10295, Orit Wasserman)

  • rgw: critical fixes for hammer (#11447, #11442, Yehuda Sadeh)

  • rgw: fix swift COPY headers (#10662, #10663, #11087, #10645, Radoslaw Zarzynski)

  • rgw: improve performance for large object (multiple chunks) GET (#11322, Guang Yang)

  • rgw: init-radosgw: run RGW as root (#11453, Ken Dreyer)

  • rgw: keystone token cache does not work correctly (#11125, Yehuda Sadeh)

  • rgw: make quota/gc thread configurable for starting (#11047, Guang Yang)

  • rgw: make swift responses of RGW return last-modified, content-length, x-trans-id headers.(#10650, Radoslaw Zarzynski)

  • rgw: merge manifests correctly when there’s prefix override (#11622, Yehuda Sadeh)

  • rgw: quota not respected in POST object (#11323, Sergey Arkhipov)

  • rgw: restore buffer of multipart upload after EEXIST (#11604, Yehuda Sadeh)

  • rgw: shouldn’t need to disable rgw_socket_path if frontend is configured (#11160, Yehuda Sadeh)

  • rgw: swift: Response header of GET request for container does not contain X-Container-Object-Count, X-Container-Bytes-Used and x-trans-id headers (#10666, Dmytro Iurchenko)

  • rgw: swift: Response header of POST request for object does not contain content-length and x-trans-id headers (#10661, Radoslaw Zarzynski)

  • rgw: swift: response for GET/HEAD on container does not contain the X-Timestamp header (#10938, Radoslaw Zarzynski)

  • rgw: swift: response for PUT on /container does not contain the mandatory Content-Length header when FCGI is used (#11036, #10971, Radoslaw Zarzynski)

  • rgw: swift: wrong handling of empty metadata on Swift container (#11088, Radoslaw Zarzynski)

  • tests: TestFlatIndex.cc races with TestLFNIndex.cc (#11217, Xinze Chi)

  • tests: ceph-helpers kill_daemons fails when kill fails (#11398, Loic Dachary)

For more detailed information, see the complete changelog.

v0.94.1 Hammer

This bug fix release fixes a few critical issues with CRUSH. The most important addresses a bug in feature bit enforcement that may prevent pre-hammer clients from communicating with the cluster during an upgrade. This only manifests in some cases (for example, when the ‘rack’ type is in use in the CRUSH map, and possibly other cases), but for safety we strongly recommend that all users use 0.94.1 instead of 0.94 when upgrading.

There is also a fix in the new straw2 buckets when OSD weights are 0.

We recommend that all v0.94 users upgrade.
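To see why a zero weight is a problem for straw2, consider a simplified model of the selection: each item draws ln(u) / weight for a pseudo-random u in (0, 1], and the largest draw wins. The Python sketch below is an illustration under stated assumptions, not the CRUSH implementation: SHA-256 stands in for the hash CRUSH uses, floating point replaces its fixed-point arithmetic, and the function name `straw2_choose` is hypothetical.

```python
import hashlib
import math


def straw2_choose(key, weights):
    """Pick one item for `key`; the item with the largest draw wins.

    Simplified illustration: returns None when no item has positive weight.
    """
    best_item, best_draw = None, -math.inf
    for item, weight in sorted(weights.items()):
        if weight <= 0:
            # Guard: dividing by a zero weight would fail, so such an
            # item is skipped and can never be selected.
            continue
        digest = hashlib.sha256("{}:{}".format(key, item).encode()).digest()
        # Map the first 16 bits of the digest to u in (0, 1].
        u = (int.from_bytes(digest[:2], "big") + 1) / 65536.0
        draw = math.log(u) / weight  # <= 0; larger weights pull it toward 0
        if draw > best_draw:
            best_item, best_draw = item, draw
    return best_item


if __name__ == "__main__":
    print(straw2_choose("object-1", {"osd.0": 1.0, "osd.1": 2.0, "osd.2": 0.0}))
```

Since each item's draw depends only on its own weight, changing one weight does not reshuffle placements between the other items, which is the reduced data migration this release attributes to straw2.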

Notable changes

  • crush: fix divide-by-0 in straw2 (#11357 Sage Weil)

  • crush: fix has_v4_buckets (#11364 Sage Weil)

  • osd: fix negative degraded objects during backfilling (#7737 Guang Yang)

For more detailed information, see the complete changelog.

v0.94 Hammer

This major release is expected to form the basis of the next long-term stable series. It is intended to supersede v0.80.x Firefly.

Highlights since Giant include:

  • RADOS Performance: a range of improvements have been made in the OSD and client-side librados code that improve the throughput on flash backends and improve parallelism and scaling on fast machines.

  • Simplified RGW deployment: the ceph-deploy tool now has a new ‘ceph-deploy rgw create HOST’ command that quickly deploys an instance of the S3/Swift gateway using the embedded Civetweb server. This is vastly simpler than the previous Apache-based deployment. There are a few rough edges (e.g., around SSL support) but we encourage users to try the new method.

  • RGW object versioning: RGW now supports the S3 object versioning API, which preserves old versions of objects instead of overwriting them.

  • RGW bucket sharding: RGW can now shard the bucket index for large buckets, improving performance for very large buckets.

  • RBD object maps: RBD now has an object map function that tracks which parts of the image are allocated, improving performance for clones and for commands like export and delete.

  • RBD mandatory locking: RBD has a new mandatory locking framework (still disabled by default) that adds additional safeguards to prevent multiple clients from using the same image at the same time.

  • RBD copy-on-read: RBD now supports copy-on-read for image clones, improving performance for some workloads.

  • CephFS snapshot improvements: Many many bugs have been fixed with CephFS snapshots. Although they are still disabled by default, stability has improved significantly.

  • CephFS Recovery tools: We have built some journal recovery and diagnostic tools. Stability and performance of single-MDS systems is vastly improved in Giant, and more improvements have been made now in Hammer. Although we still recommend caution when storing important data in CephFS, we do encourage testing for non-critical workloads so that we can better gauge the feature, usability, performance, and stability gaps.

  • CRUSH improvements: We have added a new straw2 bucket algorithm that reduces the amount of data migration required when changes are made to the cluster.

  • Shingled erasure codes (SHEC): The OSDs now have experimental support for shingled erasure codes, which allow a small amount of additional storage to be traded for improved recovery performance.

  • RADOS cache tiering: A series of changes have been made in the cache tiering code that improve performance and reduce latency.

  • RDMA support: There is now experimental support for RDMA via the Accelio (libxio) library.

  • New administrator commands: The ‘ceph osd df’ command shows pertinent details on OSD disk utilizations. The ‘ceph pg ls …’ command makes it much simpler to query PG states while diagnosing cluster issues.

Other highlights since Firefly include:

  • CephFS: we have fixed a raft of bugs in CephFS and built some basic journal recovery and diagnostic tools. Stability and performance of single-MDS systems is vastly improved in Giant. Although we do not yet recommend CephFS for production deployments, we do encourage testing for non-critical workloads so that we can better gauge the feature, usability, performance, and stability gaps.

  • Local Recovery Codes: the OSDs now support an erasure-coding scheme that stores some additional data blocks to reduce the IO required to recover from single OSD failures.

  • Degraded vs misplaced: the Ceph health reports from ‘ceph -s’ and related commands now make a distinction between data that is degraded (there are fewer than the desired number of copies) and data that is misplaced (stored in the wrong location in the cluster). The distinction is important because the latter does not compromise data safety.

  • Tiering improvements: we have made several improvements to the cache tiering implementation that improve performance. Most notably, objects are not promoted into the cache tier by a single read; they must be found to be sufficiently hot before that happens.

  • Monitor performance: the monitors now perform writes to the local data store asynchronously, improving overall responsiveness.

  • Recovery tools: the ceph-objectstore-tool is greatly expanded to allow manipulation of an individual OSD’s data store for debugging and repair purposes. This is most heavily used by our QA infrastructure to exercise recovery code.

I would like to take this opportunity to call out the amazing growth in contributors to Ceph beyond the core development team from Inktank. Hammer features major new features and improvements from Intel, Fujitsu, UnitedStack, Yahoo, UbuntuKylin, CohortFS, Mellanox, CERN, Deutsche Telekom, Mirantis, and SanDisk.

Dedication

This release is dedicated in memoriam to Sandon Van Ness, aka Houkouonchi, who unexpectedly passed away a few weeks ago. Sandon was responsible for maintaining the large and complex Sepia lab that houses the Ceph project’s build and test infrastructure. His efforts have made an important impact on our ability to reliably test Ceph with a relatively small group of people. He was a valued member of the team and we will miss him. H is also for Houkouonchi.

Upgrading

  • If your existing cluster is running a version older than v0.80.x Firefly, please first upgrade to the latest Firefly release before moving on to Hammer. We have not tested upgrades directly from Emperor, Dumpling, or older releases.

    We have tested:

    • Firefly to Hammer

    • Giant to Hammer

    • Dumpling to Firefly to Hammer

  • Please upgrade daemons in the following order:

    1. Monitors

    2. OSDs

    3. MDSs and/or radosgw

    Note that the relative ordering of OSDs and monitors should not matter, but we primarily tested upgrading monitors first.

  • The ceph-osd daemons will perform a disk-format upgrade to improve the PG metadata layout and to repair a minor bug in the on-disk format. It may take a minute or two for this to complete, depending on how many objects are stored on the node; do not be alarmed if they are not marked “up” by the cluster immediately after starting.

  • If upgrading from v0.93, set

    osd enable degraded writes = false

    on all OSDs prior to upgrading. The degraded writes feature has been reverted due to #11155.

  • The LTTNG tracing in librbd and librados is disabled in the release packages until we find a way to avoid violating distro security policies when linking libust.
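The daemon order recommended in the list above (monitors, then OSDs, then MDSs and/or radosgw) can be turned into a small planning helper. This Python sketch is an assumption-laden illustration: the function name `upgrade_plan` and the `type.id` daemon names, including `radosgw.gw1`, are hypothetical labels chosen for the example.

```python
# Upgrade order from the list above; MDS and radosgw share the last step.
UPGRADE_ORDER = {"mon": 0, "osd": 1, "mds": 2, "radosgw": 2}


def upgrade_plan(daemons):
    """Sort names such as 'osd.3' or 'mon.a' into the recommended order."""
    def rank(name):
        daemon_type = name.split(".", 1)[0]
        return UPGRADE_ORDER[daemon_type]
    # sorted() is stable, so daemons of the same type keep their input order.
    return sorted(daemons, key=rank)


if __name__ == "__main__":
    cluster = ["osd.1", "mds.a", "mon.a", "osd.0", "radosgw.gw1", "mon.b"]
    for name in upgrade_plan(cluster):
        print(name)
```

An unknown daemon type raises `KeyError`, which is deliberate here: a plan should fail loudly instead of silently placing an unrecognized daemon.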

Upgrading from v0.87.x Giant

  • librbd and librados include lttng tracepoints on distros with liblttng 2.4 or later (only Ubuntu Trusty for the ceph.com packages). When running a daemon that uses these libraries, i.e. an application that calls fork(2) or clone(2) without exec(3), you must set LD_PRELOAD=liblttng-ust-fork.so.0 to prevent a crash in the lttng atexit handler when the process exits. The only ceph tool that requires this is rbd-fuse.

  • If rgw_socket_path is defined and rgw_frontends defines a socket_port and socket_host, we now allow the rgw_frontends settings to take precedence. This change should only affect users who have made non-standard changes to their radosgw configuration.

  • If you are upgrading specifically from v0.92, you must stop all OSD daemons and flush their journals (ceph-osd -i NNN --flush-journal) before upgrading. There was a transaction encoding bug in v0.92 that broke compatibility. Upgrading from v0.93, v0.91, or anything earlier is safe.

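  • The experimental ‘keyvaluestore-dev’ OSD backend has been renamed ‘keyvaluestore’ (for simplicity) and marked as experimental. To enable this untested feature and acknowledge that you understand it is untested and may destroy data, you need to add the following to your ceph.conf: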

    enable experimental unrecoverable data corrupting features = keyvaluestore
    
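  • The following librados C API function calls take a ‘flags’ argument whose value is now correctly interpreted:

    rados_write_op_operate()

    The flags were not being translated correctly from the librados constants to the internal values. Now they are. Any code that passes flags to these methods should be audited to ensure that it uses the correct LIBRADOS_OP_FLAG_* constants.

  • The ‘rados’ CLI ‘copy’ and ‘cppool’ commands now use the copy-from operation, which means the latest CLI cannot run these commands against pre-firefly OSDs.

  • The librados watch/notify API now includes a watch_flush() operation to flush the async queue of notify operations. Any watch/notify user should call it prior to rados_shutdown().

  • The ‘category’ field for objects has been removed. It was originally added to track PG stat summations over different categories of objects for use by radosgw. It no longer has any known users and is prone to abuse because it can lead to an unbounded pg_stat_t structure. The librados API calls that accept this field now ignore it, and the OSD no longer tracks the per-category summations.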

  • The ‘rados create <objectname> [category]’ optional category argument is no longer supported or recognized.

  • rados.py’s Rados class no longer has a __del__ method; it was causing problems on interpreter shutdown and use of threads. If your code has Rados objects with limited lifetimes and you’re concerned about locked resources, call Rados.shutdown() explicitly.

  • There is a new version of the librados watch/notify API with vastly improved semantics. Any applications using this interface are encouraged to migrate to the new API. The old API calls are marked as deprecated and will eventually be removed.

  • The librados rados_unwatch() call used to be safe to call on an invalid handle. The new version has undefined behavior when passed a bogus value (for example, when rados_watch() returns an error and handle is not defined).

  • The structure of the formatted ‘pg stat’ command is changed for the portion that counts states by name to avoid using the ‘+’ character (which appears in state names) as part of the XML token (it is not legal).

  • Previously, the formatted output of ‘ceph pg stat -f …’ was a full pg dump that included all metadata about all PGs in the system. It is now a concise summary of high-level PG stats, just like the unformatted ‘ceph pg stat’ command.

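  • All JSON dumps of floating point values were surrounding the value with quotes. These quotes have been removed. Any consumer of structured JSON output that reads floating point values previously had to interpret the quoted string and will most likely need to be fixed to accept the unquoted number.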

  • New ability to list all objects from all namespaces that can fail or return incomplete results when not all OSDs have been upgraded. Features rados --all ls, rados cppool, rados export, rados cache-flush-evict-all and rados cache-try-flush-evict-all can also fail or return incomplete results.

  • Due to a change in the Linux kernel version 3.18 and the limits of the FUSE interface, ceph-fuse needs to be mounted as root on at least some systems. See issues #9997, #10277, and #10542 for details.
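For the change to floating point values in JSON dumps noted in the list above, a consumer that has to work against both old and new clusters can accept either form. This is a minimal Python sketch: the helper `read_float` and the field name `utilization` are hypothetical and serve only as an illustration.

```python
import json


def read_float(value):
    """Accept a float dumped either as a JSON number or as a quoted string."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("booleans are not valid float fields")
    if isinstance(value, (int, float, str)):
        return float(value)
    raise TypeError("unsupported type: {}".format(type(value).__name__))


# Hypothetical documents: quoted before the change, a bare number after it.
old_style = json.loads('{"utilization": "0.75"}')
new_style = json.loads('{"utilization": 0.75}')

if __name__ == "__main__":
    print(read_float(old_style["utilization"]))
    print(read_float(new_style["utilization"]))
```

Normalizing at the point of reading keeps the rest of a monitoring script independent of which Ceph version produced the dump.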

Upgrading from v0.80.x Firefly (additional notes)

  • The client-side caching for librbd is now enabled by default (rbd cache = true). A safety option (rbd cache writethrough until flush = true) is also enabled so that writeback caching is not used until the library observes a ‘flush’ command, indicating that the librbd user is passing that operation through from the guest VM. This avoids potential data loss when used with older versions of qemu that do not support flush.

  • The ‘rados getxattr …’ command used to add a gratuitous newline to the attr value; it now does not.

  • The *_kb perf counters on the monitor have been removed. These are replaced with a new set of *_bytes counters (e.g., cluster_osd_kb is replaced by cluster_osd_bytes).

  • The rd_kb and wr_kb fields in the JSON dumps for pool stats (accessed via the ceph df detail -f json-pretty and related commands) have been replaced with corresponding *_bytes fields. Similarly, the total_space, total_used, and total_avail fields are replaced with total_bytes, total_used_bytes, and total_avail_bytes fields.

  • The rados df --format=json output read_bytes and write_bytes fields were incorrectly reporting ops; this is now fixed.

  • The rados df --format=json output previously included read_kb and write_kb fields; these have been removed. Please use read_bytes and write_bytes instead (and divide by 1024 if appropriate).

  • The experimental keyvaluestore-dev OSD backend had an on-disk format change that prevents existing OSD data from being upgraded. This affects developers and testers only.

  • mon-specific and osd-specific leveldb options have been removed. From this point onward users should use the leveldb_* generic options and add the options in the appropriate sections of their configuration files. Monitors will still maintain the following monitor-specific defaults:

    leveldb_write_buffer_size = 32*1024*1024 = 33554432 // 32MB
    leveldb_cache_size = 512*1024*1024 = 536870912 // 512MB
    leveldb_block_size = 64*1024 = 65536 // 64KB
    leveldb_compression = false
    leveldb_log = ""

    OSDs will still maintain the following osd-specific defaults:

    leveldb_log = ""

  • CephFS support for the legacy anchor table has finally been removed. Users with file systems created before firefly should ensure that inodes with multiple hard links are modified prior to the upgrade to ensure that the backtraces are written properly. For example:

    sudo find /mnt/cephfs -type f -links +1 -exec touch \{\} \;
    
  • We disallow nonsensical ‘tier cache-mode’ transitions. From this point onward, ‘writeback’ can only transition to ‘forward’ and ‘forward’ can transition to 1) ‘writeback’ if there are dirty objects, or 2) any if there are no dirty objects.
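The cache-mode transition rules in the last item can be written as a small predicate. The Python sketch below encodes only what is stated there for the ‘writeback’ and ‘forward’ source modes; the function name `transition_allowed` is hypothetical, and any other source mode is rejected because the text does not cover it.

```python
def transition_allowed(current, target, has_dirty_objects):
    """Apply the 'writeback'/'forward' transition rules stated above.

    Raises ValueError for source modes the text does not describe.
    """
    if current == "writeback":
        # 'writeback' can only transition to 'forward'.
        return target == "forward"
    if current == "forward":
        if has_dirty_objects:
            # With dirty objects, the only way out of 'forward' is 'writeback'.
            return target == "writeback"
        return True  # no dirty objects: any target mode is allowed
    raise ValueError("rules for source mode {!r} are not given here".format(current))


if __name__ == "__main__":
    # 'none' is used here only as an example of another target mode.
    print(transition_allowed("writeback", "none", False))
    print(transition_allowed("forward", "none", False))
```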

Notable changes since v0.93

  • build: a few cmake fixes (Matt Benjamin)

  • build: fix build on RHEL/CentOS 5.9 (Rohan Mars)

  • build: reorganize Makefile to allow modular builds (Boris Ranto)

  • ceph-fuse: be more forgiving on remount (#10982 Greg Farnum)

  • ceph: improve CLI parsing (#11093 David Zafman)

  • common: fix cluster logging to default channel (#11177 Sage Weil)

  • crush: fix parsing of straw2 buckets (#11015 Sage Weil)

  • doc: update man pages (David Zafman)

  • librados: fix leak in C_TwoContexts (Xiong Yiliang)

  • librados: fix leak in watch/notify path (Sage Weil)

  • librbd: fix and improve AIO cache invalidation (#10958 Jason Dillaman)

  • librbd: fix memory leak (Jason Dillaman)

  • librbd: fix ordering/queueing of resize operations (Jason Dillaman)

  • librbd: validate image is r/w on resize/flatten (Jason Dillaman)

  • librbd: various internal locking fixes (Jason Dillaman)

  • lttng: tracing is disabled until we streamline dependencies (Josh Durgin)

  • mon: add bootstrap-rgw profile (Sage Weil)

  • mon: do not pollute mon dir with CSV files from CRUSH check (Loic Dachary)

  • mon: fix clock drift time check interval (#10546 Joao Eduardo Luis)

  • mon: fix units in store stats (Joao Eduardo Luis)

  • mon: improve error handling on erasure code profile set (#10488, #11144 Loic Dachary)

  • mon: set {read,write}_tier on ‘osd tier add-cache …’ (Jianpeng Ma)

  • ms: xio: fix misc bugs (Matt Benjamin, Vu Pham)

  • osd: DBObjectMap: fix locking to prevent rare crash (#9891 Samuel Just)

  • osd: fix and document last_epoch_started semantics (Samuel Just)

  • osd: fix divergent entry handling on PG split (Samuel Just)

  • osd: fix leak on shutdown (Kefu Chai)

  • osd: fix recording of digest on scrub (Samuel Just)

  • osd: fix whiteout handling (Sage Weil)

  • rbd: allow v2 striping parameters for clones and imports (Jason Dillaman)

  • rbd: fix formatted output of image features (Jason Dillaman)

  • rbd: update man page (Ilya Dryomov)

  • rgw: don’t overwrite bucket/object owner when setting ACLs (#10978 Yehuda Sadeh)

  • rgw: enable IPv6 for civetweb (#10965 Yehuda Sadeh)

  • rgw: fix sysvinit script when rgw_socket_path is not defined (#11159 Yehuda Sadeh, Dan Mick)

  • rgw: pass civetweb configurables through (#10907 Yehuda Sadeh)

  • rgw: use new watch/notify API (Yehuda Sadeh, Sage Weil)

  • osd: reverted degraded writes feature due to #11155

Notable changes since v0.87.x Giant

  • add experimental features option (Sage Weil)

  • arch: fix NEON feature detection (#10185 Loic Dachary)

  • asyncmsgr: misc fixes (Haomai Wang)

  • buffer: add ‘shareable’ construct (Matt Benjamin)

  • buffer: add list::get_contiguous (Sage Weil)

  • buffer: avoid rebuild if buffer already contiguous (Jianpeng Ma)

  • build: CMake support (Ali Maredia, Casey Bodley, Adam Emerson, Marcus Watts, Matt Benjamin)

  • build: a few cmake fixes (Matt Benjamin)

  • build: aarch64 build fixes (Noah Watkins, Haomai Wang)

  • build: adjust build deps for yasm, virtualenv (Jianpeng Ma)

  • build: fix ‘make check’ races (#10384 Loic Dachary)

  • build: fix build on RHEL/CentOS 5.9 (Rohan Mars)

  • build: fix package names when libkeyutils is missing (Pankag Garg, Ken Dreyer)

  • build: improve build dependency tooling (Loic Dachary)

  • build: reorganize Makefile to allow modular builds (Boris Ranto)

  • build: support for jemalloc (Shishir Gowda)

  • ceph-disk: Scientific Linux support (Dan van der Ster)

  • ceph-disk: allow journal partition re-use (#10146 Loic Dachary, Dan van der Ster)

  • ceph-disk: call partx/partprobe consistently (#9721 Loic Dachary)

  • ceph-disk: do not re-use partition if encryption is required (Loic Dachary)

  • ceph-disk: fix dmcrypt key permissions (Loic Dachary)

  • ceph-disk: fix umount race condition (#10096 Blaine Gardner)

  • ceph-disk: improved systemd support (Owen Synge)

  • ceph-disk: init=none option (Loic Dachary)

  • ceph-disk: misc fixes (Christos Stavrakakis)

  • ceph-disk: respect --statedir for keyring (Loic Dachary)

  • ceph-disk: set guid if reusing journal partition (Dan van der Ster)

  • ceph-disk: support LUKS for encrypted partitions (Andrew Bartlett, Loic Dachary)

  • ceph-fuse, libcephfs: POSIX file lock support (Yan, Zheng)

  • ceph-fuse, libcephfs: allow xattr caps in inject_release_failure (#9800 John Spray)

  • ceph-fuse, libcephfs: fix I_COMPLETE_ORDERED checks (#9894 Yan, Zheng)

  • ceph-fuse, libcephfs: fix cap flush overflow (Greg Farnum, Yan, Zheng)

  • ceph-fuse, libcephfs: fix root inode xattrs (Yan, Zheng)

  • ceph-fuse, libcephfs: preserve dir ordering (#9178 Yan, Zheng)

  • ceph-fuse, libcephfs: trim inodes before reconnecting to MDS (Yan, Zheng)

  • ceph-fuse,libcephfs: add support for O_NOFOLLOW and O_PATH (Greg Farnum)

  • ceph-fuse,libcephfs: resend requests before completing cap reconnect (#10912 Yan, Zheng)

  • ceph-fuse: be more forgiving on remount (#10982 Greg Farnum)

  • ceph-fuse: fix dentry invalidation on 3.18+ kernels (#9997 Yan, Zheng)

  • ceph-fuse: fix kernel cache trimming (#10277 Yan, Zheng)

  • ceph-fuse: select kernel cache invalidation mechanism based on kernel version (Greg Farnum)

  • ceph-monstore-tool: fix shutdown (#10093 Loic Dachary)

  • ceph-monstore-tool: fix/improve CLI (Joao Eduardo Luis)

  • ceph-objectstore-tool: fix import (#10090 David Zafman)

  • ceph-objectstore-tool: improved import (David Zafman)

  • ceph-objectstore-tool: many improvements and tests (David Zafman)

  • ceph-objectstore-tool: many many improvements (David Zafman)

  • ceph-objectstore-tool: misc improvements, fixes (#9870 #9871 David Zafman)

  • ceph.spec: package rbd-replay-prep (Ken Dreyer)

  • ceph: add ‘ceph osd df [tree]’ command (#10452 Mykola Golub)

  • ceph: do not parse injectargs twice (Loic Dachary)

  • ceph: fix ‘ceph tell …’ command validation (#10439 Joao Eduardo Luis)

  • ceph: improve ‘ceph osd tree’ output (Mykola Golub)

  • ceph: improve CLI parsing (#11093 David Zafman)

  • ceph: make ‘ceph -s’ output more readable (Sage Weil)

  • ceph: make ‘ceph -s’ show PG state counts in sorted order (Sage Weil)

  • ceph: make ‘ceph tell mon.* version’ work (Mykola Golub)

  • ceph: new ‘ceph tell mds.$name_or_rank_or_gid’ (John Spray)

  • ceph: show primary-affinity in ‘ceph osd tree’ (Mykola Golub)

  • ceph: test robustness (Joao Eduardo Luis)

  • ceph_objectstore_tool: behave with sharded flag (#9661 David Zafman)

  • cephfs-journal-tool: add recover_dentries function (#9883 John Spray)

  • cephfs-journal-tool: fix journal import (#10025 John Spray)

  • cephfs-journal-tool: skip up to expire_pos (#9977 John Spray)

  • cleanup rados.h definitions with macros (Ilya Dryomov)

  • common: add ‘perf reset …’ admin command (Jianpeng Ma)

  • common: add TableFormatter (Andreas Peters)

  • common: add newline to flushed json output (Sage Weil)

  • common: check syncfs() return code (Jianpeng Ma)

  • common: do not unlock rwlock on destruction (Federico Simoncelli)

  • common: filtering for ‘perf dump’ (John Spray)

  • common: fix Formatter factory breakage (#10547 Loic Dachary)

  • common: fix block device discard check (#10296 Sage Weil)

  • common: make json-pretty output prettier (Sage Weil)

  • common: remove broken CEPH_LOCKDEP option (Kefu Chai)

  • common: shared_cache unit tests (Cheng Cheng)

  • common: support new gperftools header locations (Ken Dreyer)

  • config: add $cctid meta variable (Adam Crume)

  • crush: fix buffer overrun for poorly formed rules (#9492 Johnu George)

  • crush: fix detach_bucket (#10095 Sage Weil)

  • crush: fix parsing of straw2 buckets (#11015 Sage Weil)

  • crush: fix several bugs in adjust_item_weight (Rongze Zhu)

  • crush: fix tree bucket behavior (Rongze Zhu)

  • crush: improve constness (Loic Dachary)

  • crush: new and improved straw2 bucket type (Sage Weil, Christina Anderson, Xiaoxi Chen)

  • crush: straw bucket weight calculation fixes (#9998 Sage Weil)

  • crush: update tries stats for indep rules (#10349 Loic Dachary)

  • crush: use larger choose_tries value for erasure code rulesets (#10353 Loic Dachary)

  • crushtool: add --location <id> command (Sage Weil, Loic Dachary)

  • debian,rpm: move RBD udev rules to ceph-common (#10864 Ken Dreyer)

  • debian: split python-ceph into python-{rbd,rados,cephfs} (Boris Ranto)

  • default to libnss instead of crypto++ (Federico Gimenez)

  • doc: CephFS disaster recovery guidance (John Spray)

  • doc: CephFS for early adopters (John Spray)

  • doc: add build-doc guidelines for Fedora and CentOS/RHEL (Nilamdyuti Goswami)

  • doc: add dumpling to firefly upgrade section (#7679 John Wilkins)

  • doc: ceph osd reweight vs crush weight (Laurent Guerby)

  • doc: do not suggest dangerous XFS nobarrier option (Dan van der Ster)

  • doc: document erasure coded pool operations (#9970 Loic Dachary)

  • doc: document the LRC per-layer plugin configuration (Yuan Zhou)

  • doc: enable rbd cache on openstack deployments (Sebastien Han)

  • doc: erasure code doc updates (Loic Dachary)

  • doc: file system osd config settings (Kevin Dalley)

  • doc: fix OpenStack Glance docs (#10478 Sebastien Han)

  • doc: improved installation notes on CentOS/RHEL installs (John Wilkins)

  • doc: key/value store config reference (John Wilkins)

  • doc: misc cleanups (Adam Spiers, Sebastien Han, Nilamdyuti Goswami, Ken Dreyer, John Wilkins)

  • doc: misc improvements (Nilamdyuti Goswami, John Wilkins, Chris Holcombe)

  • doc: misc updates (#9793 #9922 #10204 #10203 Travis Rhoden, Hazem, Ayari, Florian Coste, Andy Allan, Frank Yu, Baptiste Veuillez-Mainard, Yuan Zhou, Armando Segnini, Robert Jansen, Tyler Brekke, Viktor Suprun)

  • doc: misc updates (Alfredo Deza, VRan Liu)

  • doc: misc updates (Nilamdyuti Goswami, John Wilkins)

  • doc: new man pages (Nilamdyuti Goswami)

  • doc: preflight doc fixes (John Wilkins)

  • doc: replace cloudfiles with swiftclient Python Swift example (Tim Freund)

  • doc: update PG count guide (Gerben Meijer, Laurent Guerby, Loic Dachary)

  • doc: update man pages (David Zafman)

  • doc: update openstack docs for Juno (Sebastien Han)

  • doc: update release descriptions (Ken Dreyer)

  • doc: update sepia hardware inventory (Sandon Van Ness)

  • erasure-code: add mSHEC erasure code support (Takeshi Miyamae)

  • erasure-code: improved docs (#10340 Loic Dachary)

  • erasure-code: set max_size to 20 (#10363 Loic Dachary)

  • fix cluster logging from non-mon daemons (Sage Weil)

  • init-ceph: check for systemd-run before using it (Boris Ranto)

  • install-deps.sh: do not require sudo when root (Loic Dachary)

  • keyvaluestore: misc fixes (Haomai Wang)

  • keyvaluestore: performance improvements (Haomai Wang)

  • libcephfs,ceph-fuse: add ‘status’ asok (John Spray)

  • libcephfs,ceph-fuse: fix getting zero-length xattr (#10552 Yan, Zheng)

  • libcephfs: fix dirfrag trimming (#10387 Yan, Zheng)

  • libcephfs: fix mount timeout (#10041 Yan, Zheng)

  • libcephfs: fix test (#10415 Yan, Zheng)

  • libcephfs: fix use-after-free on umount (#10412 Yan, Zheng)

  • libcephfs: include ceph and git version in client metadata (Sage Weil)

  • librados, osd: new watch/notify implementation (Sage Weil)

  • librados: add blacklist_add convenience method (Jason Dillaman)

  • librados: add rados_pool_get_base_tier() call (Adam Crume)

  • librados: add watch_flush() operation (Sage Weil, Haomai Wang)

  • librados: avoid memcpy on getxattr, read (Jianpeng Ma)

  • librados: cap buffer length (Loic Dachary)

  • librados: create ioctx by pool id (Jason Dillaman)

  • librados: do notify completion in fast-dispatch (Sage Weil)

  • librados: drop ‘category’ feature (Sage Weil)

  • librados: expose rados_{read|write}_op_assert_version in C API (Kim Vandry)

  • librados: fix infinite loop with skipped map epochs (#9986 Ding Dinghua)

  • librados: fix iterator operator= bugs (#10082 David Zafman, Yehuda Sadeh)

  • librados: fix leak in C_TwoContexts (Xiong Yiliang)

  • librados: fix leak in watch/notify path (Sage Weil)

  • librados: fix null deref when pool DNE (#9944 Sage Weil)

  • librados: fix objecter races (#9617 Josh Durgin)

  • librados: fix pool deletion handling (#10372 Sage Weil)

  • librados: fix pool name caching (#10458 Radoslaw Zarzynski)

  • librados: fix resource leak, misc bugs (#10425 Radoslaw Zarzynski)

  • librados: fix some watch/notify locking (Jason Dillaman, Josh Durgin)

  • librados: fix timer race from recent refactor (Sage Weil)

  • librados: new fadvise API (Ma Jianpeng)

  • librados: only export public API symbols (Jason Dillaman)

  • librados: remove shadowed variable (Kefu Chai)

  • librados: translate op flags from C APIs (Matthew Richards)

  • libradosstriper: fix remove() (Dongmao Zhang)

  • libradosstriper: fix shutdown hang (Dongmao Zhang)

  • libradosstriper: fix stat strtoll (Dongmao Zhang)

  • libradosstriper: fix trunc method (#10129 Sebastien Ponce)

  • libradosstriper: fix write_full when ENOENT (#10758 Sebastien Ponce)

  • libradosstriper: misc fixes (Sebastien Ponce)

  • librbd: CRC protection for RBD image map (Jason Dillaman)

  • librbd: add missing python docstrings (Jason Dillaman)

  • librbd: add per-image object map for improved performance (Jason Dillaman)

  • librbd: add readahead (Adam Crume)

  • librbd: add support for an “object map” indicating which objects exist (Jason Dillaman)

  • librbd: adjust internal locking (Josh Durgin, Jason Dillaman)

  • librbd: better handling of watch errors (Jason Dillaman)

  • librbd: complete pending ops before closing image (#10299 Josh Durgin)

  • librbd: coordinate maint operations through lock owner (Jason Dillaman)

  • librbd: copy-on-read (Min Chen, Li Wang, Yunchuan Wen, Cheng Cheng, Jason Dillaman)

  • librbd: differentiate between R/O and R/W features (Jason Dillaman)

  • librbd: don’t close a closed parent in failure path (#10030 Jason Dillaman)

  • librbd: enforce write ordering with a snapshot (Jason Dillaman)

  • librbd: exclusive image locking (Jason Dillaman)

  • librbd: fadvise API (Ma Jianpeng)

  • librbd: fadvise-style hints; add misc hints for certain operations (Jianpeng Ma)

  • librbd: fix and improve AIO cache invalidation (#10958 Jason Dillaman)

  • librbd: fix cache tiers in list_children and snap_unprotect (Adam Crume)

  • librbd: fix coverity false-positives (Jason Dillaman)

  • librbd: fix diff test (#10002 Josh Durgin)

  • librbd: fix list_children from invalid pool ioctxs (#10123 Jason Dillaman)

  • librbd: fix locking for readahead (#10045 Jason Dillaman)

  • librbd: fix memory leak (Jason Dillaman)

  • librbd: fix ordering/queueing of resize operations (Jason Dillaman)

  • librbd: fix performance regression in ObjectCacher (#9513 Adam Crume)

  • librbd: fix snap create races (Jason Dillaman)

  • librbd: fix write vs import race (#10590 Jason Dillaman)

  • librbd: flush AIO operations asynchronously (#10714 Jason Dillaman)

  • librbd: gracefully handle deleted/renamed pools (#10270 Jason Dillaman)

  • librbd: lttng tracepoints (Adam Crume)

  • librbd: make async versions of long-running maint operations (Jason Dillaman)

  • librbd: misc fixes (Xinxin Shu, Jason Dillaman)

  • librbd: mock tests (Jason Dillaman)

  • librbd: only export public API symbols (Jason Dillaman)

  • librbd: optionally blacklist clients before breaking locks (#10761 Jason Dillaman)

  • librbd: prevent copyup during shrink (Jason Dillaman)

  • librbd: refactor unit tests to use fixtures (Jason Dillaman)

  • librbd: validate image is r/w on resize/flatten (Jason Dillaman)

  • librbd: various internal locking fixes (Jason Dillaman)

  • many coverity fixes (Danny Al-Gaaf)

  • many many coverity cleanups (Danny Al-Gaaf)

  • mds: ‘flush journal’ admin command (John Spray)

  • mds: ENOSPC and OSDMap epoch barriers (#7317 John Spray)

  • mds: a whole bunch of initial scrub infrastructure (Greg Farnum)

  • mds: add cephfs-table-tool (John Spray)

  • mds: asok command for fetching subtree map (John Spray)

  • mds: avoid sending traceless replies in most cases (Yan, Zheng)

  • mds: constify MDSCacheObjects (John Spray)

  • mds: dirfrag buf fix (Yan, Zheng)

  • mds: disallow most commands on inactive MDS’s (Greg Farnum)

  • mds: drop dentries, leases on deleted directories (#10164 Yan, Zheng)

  • mds: export dir asok command (John Spray)

  • mds: fix MDLog IO callback deadlock (John Spray)

  • mds: fix compat_version for MClientSession (#9945 John Spray)

  • mds: fix deadlock during journal probe vs purge (Joao Eduardo Luis)

  • mds: fix race trimming log segments (Yan, Zheng)

  • mds: fix reply snapbl (Yan, Zheng)

  • mds: fix sessionmap lifecycle bugs (Yan, Zheng)

  • mds: fix stray/purge perfcounters (#10388 John Spray)

  • mds: handle heartbeat_reset during shutdown (#10382 John Spray)

  • mds: handle zero-size xattr (#10335 Yan, Zheng)

  • mds: initialize root inode xattr version (Yan, Zheng)

  • mds: introduce auth caps (John Spray)

  • mds: many many snapshot-related fixes (Yan, Zheng)

  • mds: misc bugs (Greg Farnum, John Spray, Yan, Zheng, Henry Change)

  • mds: refactor, improve Session storage (John Spray)

  • mds: store backtrace for stray dir (Yan, Zheng)

  • mds: subtree quota support (Yunchuan Wen)

  • mds: verify backtrace when fetching dirfrag (#9557 Yan, Zheng)

  • memstore: free space tracking (John Spray)

  • misc cleanup (Danny Al-Gaaf, David Anderson)

  • misc coverity fixes (Danny Al-Gaaf)

  • misc coverity fixes (Danny Al-Gaaf)

  • misc valgrind fixes and cleanups (Danny Al-Gaaf)

  • mon: ‘osd crush reweight-all’ command (Sage Weil)

  • mon: add ‘ceph osd rename-bucket …’ command (Loic Dachary)

  • mon: add bootstrap-rgw profile (Sage Weil)

  • mon: add max pgs per osd warning (Sage Weil)

  • mon: add noforward flag for some mon commands (Mykola Golub)

  • mon: allow adding tiers to fs pools (#10135 John Spray)

  • mon: allow full flag to be manually cleared (#9323 Sage Weil)

  • mon: clean up auth list output (Loic Dachary)

  • mon: delay failure injection (Joao Eduardo Luis)

  • mon: disallow empty pool names (#10555 Wido den Hollander)

  • mon: do not deactivate last mds (#10862 John Spray)

  • mon: do not pollute mon dir with CSV files from CRUSH check (Loic Dachary)

  • mon: drop old ceph_mon_store_converter (Sage Weil)

  • mon: fix ‘ceph pg dump_stuck degraded’ (Xinxin Shu)

  • mon: fix ‘mds fail’ for standby MDSs (John Spray)

  • mon: fix ‘osd crush link’ id resolution (John Spray)

  • mon: fix ‘profile osd’ use of config-key function on mon (#10844 Joao Eduardo Luis)

  • mon: fix _ratio units and types (Sage Weil)

  • mon: fix JSON dumps to dump floats as floats and not strings (Sage Weil)

  • mon: fix MDS health status from peons (#10151 John Spray)

  • mon: fix caching for min_last_epoch_clean (#9987 Sage Weil)

  • mon: fix clock drift time check interval (#10546 Joao Eduardo Luis)

  • mon: fix compatset initialization during mkfs (Joao Eduardo Luis)

  • mon: fix error output for add_data_pool (#9852 Joao Eduardo Luis)

  • mon: fix feature tracking during elections (Joao Eduardo Luis)

  • mon: fix formatter ‘pg stat’ command output (Sage Weil)

  • mon: fix mds gid/rank/state parsing (John Spray)

  • mon: fix misc error paths (Joao Eduardo Luis)

  • mon: fix paxos off-by-one corner case (#9301 Sage Weil)

  • mon: fix paxos timeouts (#10220 Joao Eduardo Luis)

  • mon: fix stashed monmap encoding (#5203 Xie Rui)

  • mon: fix units in store stats (Joao Eduardo Luis)

  • mon: get canonical OSDMap from leader (#10422 Sage Weil)

  • mon: ignore failure reports from before up_from (#10762 Dan van der Ster, Sage Weil)

  • mon: implement ‘fs reset’ command (John Spray)

  • mon: improve error handling on erasure code profile set (#10488, #11144 Loic Dachary)

  • mon: improved corrupt CRUSH map detection (Joao Eduardo Luis)

  • mon: include entity name in audit log for forwarded requests (#9913 Joao Eduardo Luis)

  • mon: include pg_temp count in osdmap summary (Sage Weil)

  • mon: log health summary to cluster log (#9440 Joao Eduardo Luis)

  • mon: make ‘mds fail’ idempotent (John Spray)

  • mon: make pg dump {sum,pgs,pgs_brief} work for format=plain (#5963 #6759 Mykola Golub)

  • mon: new ‘ceph pool ls [detail]’ command (Sage Weil)

  • mon: new pool safety flags nodelete, nopgchange, nosizechange (#9792 Mykola Golub)

  • mon: new, friendly ‘ceph pg ls …’ command (Xinxin Shu)

  • mon: paxos: allow reads while proposing (#9321 #9322 Joao Eduardo Luis)

  • mon: prevent MDS transition from STOPPING (#10791 Greg Farnum)

  • mon: propose all pending work in one transaction (Sage Weil)

  • mon: remove pg_temps for nonexistent pools (Joao Eduardo Luis)

  • mon: require mon_allow_pool_delete option to remove pools (Sage Weil)

  • mon: respect down flag when promoting standbys (John Spray)

  • mon: set globalid prealloc to larger value (Sage Weil)

  • mon: set {read,write}_tier on ‘osd tier add-cache …’ (Jianpeng Ma)

  • mon: skip zeroed osd stats in get_rule_avail (#10257 Joao Eduardo Luis)

  • mon: validate min_size range (Jianpeng Ma)

  • mon: wait for writeable before cross-proposing (#9794 Joao Eduardo Luis)

  • mount.ceph: fix spurious error message (#10351 Yan, Zheng)

  • ms: xio: fix misc bugs (Matt Benjamin, Vu Pham)

  • msgr: async: bind threads to CPU cores, improved poll (Haomai Wang)

  • msgr: async: many fixes, unit tests (Haomai Wang)

  • msgr: async: many fixes (Haomai Wang)

  • msgr: asyncmessenger: add kqueue support (#9926 Haomai Wang)

  • msgr: avoid useless new/delete (Haomai Wang)

  • msgr: fix RESETSESSION bug (#10080 Greg Farnum)

  • msgr: fix crc configuration (Mykola Golub)

  • msgr: fix delay injection bug (#9910 Sage Weil, Greg Farnum)

  • msgr: misc unit tests (Haomai Wang)

  • msgr: new AsyncMessenger alternative implementation (Haomai Wang)

  • msgr: prefetch data when doing recv (Yehuda Sadeh)

  • msgr: simple: fix rare deadlock (Greg Farnum)

  • msgr: simple: retry binding to port on failure (#10029 Wido den Hollander)

  • msgr: xio: XioMessenger RDMA support (Casey Bodley, Vu Pham, Matt Benjamin)

  • objectstore: deprecate collection attrs (Sage Weil)

  • osd, librados: fadvise-style librados hints (Jianpeng Ma)

  • osd, librados: fix xattr_cmp_u64 (Dongmao Zhang)

  • osd, librados: revamp PG listing API to handle namespaces (#9031 #9262 #9438 David Zafman)

  • osd, mds: ‘ops’ as shorthand for ‘dump_ops_in_flight’ on asok (Sage Weil)

  • osd, mon: add checksums to all OSDMaps (Sage Weil)

  • osd, mon: send initial pg create time from mon to osd (#9887 David Zafman)

  • osd,mon: add ‘norebalance’ flag (Kefu Chai)

  • osd,mon: specify OSD features explicitly in MOSDBoot (#10911 Sage Weil)

  • osd: DBObjectMap: fix locking to prevent rare crash (#9891 Samuel Just)

  • osd: EIO on whole-object reads when checksum is wrong (Sage Weil)

  • osd: add erasure code corpus (Loic Dachary)

  • osd: add fadvise flags to ObjectStore API (Jianpeng Ma)

  • osd: add get_latest_osdmap asok command (#9483 #9484 Mykola Golub)

  • osd: add misc tests (Loic Dachary, Danny Al-Gaaf)

  • osd: add option to prioritize heartbeat network traffic (Jian Wen)

  • osd: add support for the SHEC erasure-code algorithm (Takeshi Miyamae, Loic Dachary)

  • osd: allow deletion of objects with watcher (#2339 Sage Weil)

  • osd: allow recovery while below min_size (Samuel Just)

  • osd: allow recovery with fewer than min_size OSDs (Samuel Just)

  • osd: allow sparse read for Push/Pull (Haomai Wang)

  • osd: allow whiteout deletion in cache pool (Sage Weil)

  • osd: allow writes to degraded objects (Samuel Just)

  • osd: allow writes to degraded objects (Samuel Just)

  • osd: avoid publishing unchanged PG stats (Sage Weil)

  • osd: batch pg log trim (Xinze Chi)

  • osd: cache pool: ignore min flush age when cache is full (Xinze Chi)

  • osd: cache recent ObjectContexts (Dong Yuan)

  • osd: cache reverse_nibbles hash value (Dong Yuan)

  • osd: clean up internal ObjectStore interface (Sage Weil)

  • osd: cleanup boost optionals (William Kennington)

  • osd: clear cache on interval change (Samuel Just)

  • osd: do not proxy reads unless target OSDs are new (#10788 Sage Weil)

  • osd: ignore min flush age when cache is full (Xinze Chi)

  • osd: do not update digest on inconsistent object (#10524 Samuel Just)

  • osd: don’t record digests for snapdirs (#10536 Samuel Just)

  • osd: drop upgrade support for pre-dumpling (Sage Weil)

  • osd: enable and use posix_fadvise (Sage Weil)

  • osd: erasure coding: allow bench.sh to test ISA backend (Yuan Zhou)

  • osd: erasure-code: encoding regression tests, corpus (#9420 Loic Dachary)

  • osd: erasure-code: enforce chunk size alignment (#10211 Loic Dachary)

  • osd: erasure-code: jerasure support for NEON (Loic Dachary)

  • osd: erasure-code: relax cauchy w restrictions (#10325 David Zhang, Loic Dachary)

  • osd: erasure-code: update gf-complete to latest upstream (Loic Dachary)

  • osd: expose non-journal backends via ceph-osd CLI (Haomai Wang)

  • osd: filejournal: don’t cache journal when not using direct IO (Jianpeng Ma)

  • osd: fix JSON output for stray OSDs (Loic Dachary)

  • osd: fix OSDCap parser on old (el6) boost::spirit (#10757 Kefu Chai)

  • osd: fix OSDCap parsing on el6 (#10757 Kefu Chai)

  • osd: fix ObjectStore::Transaction encoding version (#10734 Samuel Just)

  • osd: fix WBThrottle perf counters (Haomai Wang)

  • osd: fix and document last_epoch_started semantics (Samuel Just)

  • osd: fix auth object selection during repair (#10524 Samuel Just)

  • osd: fix backfill bug (#10150 Samuel Just)

  • osd: fix bug in pending digest updates (#10840 Samuel Just)

  • osd: fix cancellation of proxy read ops (Sage Weil)

  • osd: fix cleanup of interrupted pg deletion (#10617 Sage Weil)

  • osd: fix divergent entry handling on PG split (Samuel Just)

  • osd: fix ghobject_t formatted output to include shard (#10063 Loic Dachary)

  • osd: fix ioprio option (Mykola Golub)

  • osd: fix ioprio options (Loic Dachary)

  • osd: fix journal shutdown race (Sage Weil)

  • osd: fix journal wrapping bug (#10883 David Zafman)

  • osd: fix leak in SnapTrimWQ (#10421 Kefu Chai)

  • osd: fix leak on shutdown (Kefu Chai)

  • osd: fix memstore free space calculation (Xiaoxi Chen)

  • osd: fix mixed-version peering issues (Samuel Just)

  • osd: fix object age eviction (Zhiqiang Wang)

  • osd: fix object atime calculation (Xinze Chi)

  • osd: fix object digest update bug (#10840 Samuel Just)

  • osd: fix occasional peering stalls (#10431 Sage Weil)

  • osd: fix ordering issue with new transaction encoding (#10534 Dong Yuan)

  • osd: fix osd peer check on scrub messages (#9555 Sage Weil)

  • osd: fix past_interval display bug (#9752 Loic Dachary)

  • osd: fix past_interval generation (#10427 #10430 David Zafman)

  • osd: fix pgls filter ops (#9439 David Zafman)

  • osd: fix recording of digest on scrub (Samuel Just)

  • osd: fix scrub delay bug (#10693 Samuel Just)

  • osd: fix scrub vs try-flush bug (#8011 Samuel Just)

  • osd: fix short read handling on push (#8121 David Zafman)

  • osd: fix stderr with -f or -d (Dan Mick)

  • osd: fix transaction accounting (Jianpeng Ma)

  • osd: fix watch reconnect race (#10441 Sage Weil)

  • osd: fix watch timeout cache state update (#10784 David Zafman)

  • osd: fix whiteout handling (Sage Weil)

  • osd: flush snapshots from cache tier immediately (Sage Weil)

  • osd: force promotion of watch/notify ops (Zhiqiang Wang)

  • osd: handle no-op write with snapshot (#10262 Sage Weil)

  • osd: improve idempotency detection across cache promotion/demotion (#8935 Sage Weil, Samuel Just)

  • osd: include activating peers in blocked_by (#10477 Sage Weil)

  • osd: jerasure and gf-complete updates from upstream (#10216 Loic Dachary)

  • osd: journal: check fsync/fdatasync result (Jianpeng Ma)

  • osd: journal: fix alignment checks, avoid useless memmove (Jianpeng Ma)

  • osd: journal: fix deadlock on shutdown (#10474 David Zafman)

  • osd: journal: fix header.committed_up_to (Xinze Chi)

  • osd: journal: fix journal zeroing when direct IO is enabled (Xie Rui)

  • osd: journal: initialize throttle (Ning Yao)

  • osd: journal: misc bug fixes (#6003 David Zafman, Samuel Just)

  • osd: journal: update committed_thru after replay (#6756 Samuel Just)

  • osd: keyvaluestore: cleanup dead code (Ning Yao)

  • osd: keyvaluestore: fix getattr semantics (Haomai Wang)

  • osd: keyvaluestore: fix key ordering (#10119 Haomai Wang)

  • osd: keyvaluestore_dev: optimization (Chendi Xue)

  • osd: limit in-flight read requests (Jason Dillaman)

  • osd: log when scrub or repair starts (Loic Dachary)

  • osd: make misdirected op checks robust for EC pools (#9835 Sage Weil)

  • osd: memstore: fix size limit (Xiaoxi Chen)

  • osd: misc FIEMAP fixes (Ma Jianpeng)

  • osd: misc cleanup (Xinze Chi, Yongyue Sun)

  • osd: misc optimizations (Xinxin Shu, Zhiqiang Wang, Xinze Chi)

  • osd: misc scrub fixes (#10017 Loic Dachary)

  • osd: new ‘activating’ state between peering and active (Sage Weil)

  • osd: new optimized encoding for ObjectStore::Transaction (Dong Yuan)

  • osd: optimize Finisher (Xinze Chi)

  • osd: optimize WBThrottle map with unordered_map (Ning Yao)

  • osd: optimize filter_snapc (Ning Yao)

  • osd: preserve reqids for idempotency checks for promote/demote (Sage Weil, Zhiqiang Wang, Samuel Just)

  • osd: proxy read support (Zhiqiang Wang)

  • osd: proxy reads during cache promote (Zhiqiang Wang)

  • osd: remove dead locking code (Xinxin Shu)

  • osd: remove legacy classic scrub code (Sage Weil)

  • osd: remove unused fields in MOSDSubOp (Xiaoxi Chen)

  • osd: removed some dead code (Xinze Chi)

  • osd: replace MOSDSubOp messages with simpler, optimized MOSDRepOp (Xiaoxi Chen)

  • osd: restrict scrub to certain times of day (Xinze Chi)

  • osd: rocksdb: fix shutdown (Haomai Wang)

  • osd: store PG metadata in per-collection objects for better concurrency (Sage Weil)

  • osd: store whole-object checksums on scrub, write_full (Sage Weil)

  • osd: support for discard for journal trim (Jianpeng Ma)

  • osd: use FIEMAP_FLAGS_SYNC instead of fsync (Jianpeng Ma)

  • osd: verify kernel is new enough before using XFS extsize ioctl, enable by default (#9956 Sage Weil)

  • pybind: fix memory leak in librados bindings (Billy Olsen)

  • pyrados: add object lock support (#6114 Mehdi Abaakouk)

  • pyrados: fix misnamed wait_* routines (#10104 Dan Mick)

  • pyrados: misc cleanups (Kefu Chai)

  • qa: add large auth ticket tests (Ilya Dryomov)

  • qa: fix mds tests (#10539 John Spray)

  • qa: fix osd create dup tests (#10083 Loic Dachary)

  • qa: ignore duplicates in rados ls (Josh Durgin)

  • qa: improve hadoop tests (Noah Watkins)

  • qa: many ‘make check’ improvements (Loic Dachary)

  • qa: misc tests (Loic Dachary, Yan, Zheng)

  • qa: parallelize make check (Loic Dachary)

  • qa: reorg fs quota tests (Greg Farnum)

  • qa: tolerate nearly-full disk for make check (Loic Dachary)

  • rados: fix put of /dev/null (Loic Dachary)

  • rados: fix usage (Jianpeng Ma)

  • rados: parse command-line arguments more strictly (#8983 Adam Crume)

  • rados: use copy-from operation for copy, cppool (Sage Weil)

  • radosgw-admin: add replicalog update command (Yehuda Sadeh)

  • rbd-fuse: clean up on shutdown (Josh Durgin)

  • rbd-fuse: fix memory leak (Adam Crume)

  • rbd-replay-many (Adam Crume)

  • rbd-replay: --anonymize flag to rbd-replay-prep (Adam Crume)

  • rbd: add ‘merge-diff’ function (MingXin Liu, Yunchuan Wen, Li Wang)

  • rbd: allow v2 striping parameters for clones and imports (Jason Dillaman)

  • rbd: fix ‘rbd diff’ for non-existent objects (Adam Crume)

  • rbd: fix buffer handling on image import (#10590 Jason Dillaman)

  • rbd: fix error when striping with format 1 (Sebastien Han)

  • rbd: fix export for image sizes over 2GB (Vicente Cheng)

  • rbd: fix formatted output of image features (Jason Dillaman)

  • rbd: leave exclusive locking off by default (Jason Dillaman)

  • rbd: update man page (Ilya Dryomov)

  • rbd: update init-rbdmap to fix dup mount point (Karel Striegel)

  • rbd: use IO hints for import, export, and bench operations (#10462 Jason Dillaman)

  • rbd: use rolling average for rbd bench-write throughput (Jason Dillaman)

  • rbd_recover_tool: RBD image recovery tool (Min Chen)

  • rgw: S3-style object versioning support (Yehuda Sadeh)

  • rgw: add location header when object is in another region (VRan Liu)

  • rgw: change multipart upload id magic (#10271 Yehuda Sadeh)

  • rgw: check keystone auth for S3 POST requests (#10062 Abhishek Lekshmanan)

  • rgw: check timestamp on s3 keystone auth (#10062 Abhishek Lekshmanan)

  • rgw: conditional PUT on ETag (#8562 Ray Lv)

  • rgw: create subuser if needed when creating user (#10103 Yehuda Sadeh)

  • rgw: decode http query params correctly (#10271 Yehuda Sadeh)

  • rgw: don’t overwrite bucket/object owner when setting ACLs (#10978 Yehuda Sadeh)

  • rgw: enable IPv6 for civetweb (#10965 Yehuda Sadeh)

  • rgw: extend replica log API (purge-all) (Yehuda Sadeh)

  • rgw: fail S3 POST if keystone not configured (#10688 Valery Tschopp, Yehuda Sadeh)

  • rgw: fix If-Modified-Since (VRan Liu)

  • rgw: fix XML header on get ACL request (#10106 Yehuda Sadeh)

  • rgw: fix bucket removal with data purge (Yehuda Sadeh)

  • rgw: fix content length check (#10701 Axel Dunkel, Yehuda Sadeh)

  • rgw: fix content-length update (#9576 Yehuda Sadeh)

  • rgw: fix disabling of max_size quota (#9907 Dong Lei)

  • rgw: fix error codes (#10334 #10329 Yehuda Sadeh)

  • rgw: fix incorrect len when len is 0 (#9877 Yehuda Sadeh)

  • rgw: fix object copy content type (#9478 Yehuda Sadeh)

  • rgw: fix partial GET in swift (#10553 Yehuda Sadeh)

  • rgw: fix replica log indexing (#8251 Yehuda Sadeh)

  • rgw: fix shutdown (#10472 Yehuda Sadeh)

  • rgw: fix swift metadata header name (Dmytro Iurchenko)

  • rgw: fix sysvinit script when rgw_socket_path is not defined (#11159 Yehuda Sadeh, Dan Mick)

  • rgw: fix user stags in get-user-info API (#9359 Ray Lv)

  • rgw: include XML ns on get ACL request (#10106 Yehuda Sadeh)

  • rgw: index swift keys appropriately (#10471 Yehuda Sadeh)

  • rgw: make sysvinit script set ulimit -n properly (Sage Weil)

  • rgw: misc fixes (#10307 Yehuda Sadeh)

  • rgw: only track cleanup for objects we write (#10311 Yehuda Sadeh)

  • rgw: pass civetweb configurables through (#10907 Yehuda Sadeh)

  • rgw: prevent illegal bucket policy that doesn’t match placement rule (Yehuda Sadeh)

  • rgw: remove multipart entries from bucket index on abort (#10719 Yehuda Sadeh)

  • rgw: remove swift user manifest (DLO) hash calculation (#9973 Yehuda Sadeh)

  • rgw: return 204 on POST on containers (#10667 Yuan Zhou)

  • rgw: return timestamp on GET/HEAD (#8911 Yehuda Sadeh)

  • rgw: reuse fcgx connection struct (#10194 Yehuda Sadeh)

  • rgw: run radosgw as apache with systemd (#10125 Loic Dachary)

  • rgw: send explicit HTTP status string (Yehuda Sadeh)

  • rgw: set ETag on object copy (#9479 Yehuda Sadeh)

  • rgw: set length for keystone token validation request (#7796 Yehuda Sadeh, Mark Kirkwood)

  • rgw: support X-Storage-Policy header for Swift storage policy compat (Yehuda Sadeh)

  • rgw: support multiple host names (#7467 Yehuda Sadeh)

  • rgw: swift: dump container’s custom metadata (#10665 Ahmad Faheem, Dmytro Iurchenko)

  • rgw: swift: support Accept header for response format (#10746 Dmytro Iurchenko)

  • rgw: swift: support for X-Remove-Container-Meta-{key} (#10475 Dmytro Iurchenko)

  • rgw: tweak error codes (#10329 #10334 Yehuda Sadeh)

  • rgw: update bucket index on attr changes, for multi-site sync (#5595 Yehuda Sadeh)

  • rgw: use \r\n for http headers (#9254 Yehuda Sadeh)

  • rgw: use gc for multipart abort (#10445 Aaron Bassett, Yehuda Sadeh)

  • rgw: use new watch/notify API (Yehuda Sadeh, Sage Weil)

  • rpm: misc fixes (Ken Dreyer)

  • rpm: move rgw logrotate to radosgw subpackage (Ken Dreyer)

  • systemd: better systemd unit files (Owen Synge)

  • sysvinit: fix race in ‘stop’ (#10389 Loic Dachary)

  • tests: fix bufferlist tests (Jianpeng Ma)

  • tests: ability to run unit tests under docker (Loic Dachary)

  • tests: centos-6 dockerfile (#10755 Loic Dachary)

  • tests: improve docker-based tests (Loic Dachary)

  • tests: unit tests for shared_cache (Dong Yuan)

  • udev: fix rules for CentOS7/RHEL7 (Loic Dachary)

  • use clock_gettime instead of gettimeofday (Jianpeng Ma)

  • vstart.sh: set up environment for s3-tests (Luis Pabon)

  • vstart.sh: work with cmake (Yehuda Sadeh)

v0.93

This is the first release candidate for Hammer, and includes all of the features that will be present in the final release. We welcome and encourage any and all testing in non-production clusters to identify any problems with functionality, stability, or performance before the final Hammer release.

We suggest some caution in one area: librbd. There is a lot of new functionality around object maps and locking that is disabled by default but may still affect stability for existing images. We are continuing to shake out those bugs so that the final Hammer release (probably v0.94) will be rock solid.

Major features since Giant include:

  • cephfs: journal scavenger repair tool (John Spray)

  • crush: new and improved straw2 bucket type (Sage Weil, Christina Anderson, Xiaoxi Chen)

  • doc: improved guidance for CephFS early adopters (John Spray)

  • librbd: add per-image object map for improved performance (Jason Dillaman)

  • librbd: copy-on-read (Min Chen, Li Wang, Yunchuan Wen, Cheng Cheng)

  • librados: fadvise-style IO hints (Jianpeng Ma)

  • mds: many many snapshot-related fixes (Yan, Zheng)

  • mon: new ‘ceph osd df’ command (Mykola Golub)

  • mon: new ‘ceph pg ls …’ command (Xinxin Shu)

  • osd: improved performance for high-performance backends

  • osd: improved recovery behavior (Samuel Just)

  • osd: improved cache tier behavior with reads (Zhiqiang Wang)

  • rgw: S3-compatible bucket versioning support (Yehuda Sadeh)

  • rgw: large bucket index sharding (Guang Yang, Yehuda Sadeh)

  • RDMA “xio” messenger support (Matt Benjamin, Vu Pham)

Upgrading

  • If you are upgrading from v0.92, you must stop all OSD daemons and flush their journals (ceph-osd -i NNN --flush-journal) before upgrading. There was a transaction encoding bug in v0.92 that broke compatibility. Upgrading from v0.91 or anything earlier is safe.

  • No special restrictions when upgrading from firefly or giant.
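
The journal flush required when upgrading from v0.92 can be sketched as below. This is a minimal sketch under stated assumptions: the helper name flush_journal_cmds and the OSD ids 0 1 2 are hypothetical, and only the ceph-osd -i NNN --flush-journal invocation comes from the note above. The helper only prints the commands and does not touch a cluster.

```shell
# Hypothetical helper (not part of Ceph): print the journal flush command
# for every OSD id passed as an argument. Nothing is executed here.
flush_journal_cmds() {
    for id in "$@"; do
        printf 'ceph-osd -i %s --flush-journal\n' "$id"
    done
}

# Example with assumed OSD ids 0, 1 and 2 on this host.
flush_journal_cmds 0 1 2
```

Review the printed lines, stop the OSD daemons on the host using whatever mechanism your init system provides, and only then run the commands before upgrading the packages.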

Notable Changes

  • build: CMake support (Ali Maredia, Casey Bodley, Adam Emerson, Marcus Watts, Matt Benjamin)

  • ceph-disk: do not re-use partition if encryption is required (Loic Dachary)

  • ceph-disk: support LUKS for encrypted partitions (Andrew Bartlett, Loic Dachary)

  • ceph-fuse,libcephfs: add support for O_NOFOLLOW and O_PATH (Greg Farnum)

  • ceph-fuse,libcephfs: resend requests before completing cap reconnect (#10912 Yan, Zheng)

  • ceph-fuse: select kernel cache invalidation mechanism based on kernel version (Greg Farnum)

  • ceph-objectstore-tool: improved import (David Zafman)

  • ceph-objectstore-tool: misc improvements, fixes (#9870 #9871 David Zafman)

  • ceph: add ‘ceph osd df [tree]’ command (#10452 Mykola Golub)

  • ceph: fix ‘ceph tell …’ command validation (#10439 Joao Eduardo Luis)

  • ceph: improve ‘ceph osd tree’ output (Mykola Golub)

  • cephfs-journal-tool: add recover_dentries function (#9883 John Spray)

  • common: add newline to flushed json output (Sage Weil)

  • common: filtering for ‘perf dump’ (John Spray)

  • common: fix Formatter factory breakage (#10547 Loic Dachary)

  • common: make json-pretty output prettier (Sage Weil)

  • crush: new and improved straw2 bucket type (Sage Weil, Christina Anderson, Xiaoxi Chen)

  • crush: update tries stats for indep rules (#10349 Loic Dachary)

  • crush: use larger choose_tries value for erasure code rulesets (#10353 Loic Dachary)

  • debian,rpm: move RBD udev rules to ceph-common (#10864 Ken Dreyer)

  • debian: split python-ceph into python-{rbd,rados,cephfs} (Boris Ranto)

  • doc: CephFS disaster recovery guidance (John Spray)

  • doc: CephFS for early adopters (John Spray)

  • doc: fix OpenStack Glance docs (#10478 Sebastien Han)

  • doc: misc updates (#9793 #9922 #10204 #10203 Travis Rhoden, Hazem, Ayari, Florian Coste, Andy Allan, Frank Yu, Baptiste Veuillez-Mainard, Yuan Zhou, Armando Segnini, Robert Jansen, Tyler Brekke, Viktor Suprun)

  • doc: replace cloudfiles with swiftclient Python Swift example (Tim Freund)

  • erasure-code: add mSHEC erasure code support (Takeshi Miyamae)

  • erasure-code: improved docs (#10340 Loic Dachary)

  • erasure-code: set max_size to 20 (#10363 Loic Dachary)

  • libcephfs,ceph-fuse: fix getting zero-length xattr (#10552 Yan, Zheng)

  • librados: add blacklist_add convenience method (Jason Dillaman)

  • librados: expose rados_{read|write}_op_assert_version in C API (Kim Vandry)

  • librados: fix pool name caching (#10458 Radoslaw Zarzynski)

  • librados: fix resource leak, misc bugs (#10425 Radoslaw Zarzynski)

  • librados: fix some watch/notify locking (Jason Dillaman, Josh Durgin)

  • libradosstriper: fix write_full when ENOENT (#10758 Sebastien Ponce)

  • librbd: CRC protection for RBD image map (Jason Dillaman)

  • librbd: add per-image object map for improved performance (Jason Dillaman)

  • librbd: add support for an “object map” indicating which objects exist (Jason Dillaman)

  • librbd: adjust internal locking (Josh Durgin, Jason Dillaman)

  • librbd: better handling of watch errors (Jason Dillaman)

  • librbd: coordinate maint operations through lock owner (Jason Dillaman)

  • librbd: copy-on-read (Min Chen, Li Wang, Yunchuan Wen, Cheng Cheng, Jason Dillaman)

  • librbd: enforce write ordering with a snapshot (Jason Dillaman)

  • librbd: fadvise-style hints; add misc hints for certain operations (Jianpeng Ma)

  • librbd: fix coverity false-positives (Jason Dillaman)

  • librbd: fix snap create races (Jason Dillaman)

  • librbd: flush AIO operations asynchronously (#10714 Jason Dillaman)

  • librbd: make async versions of long-running maint operations (Jason Dillaman)

  • librbd: mock tests (Jason Dillaman)

  • librbd: optionally blacklist clients before breaking locks (#10761 Jason Dillaman)

  • librbd: prevent copyup during shrink (Jason Dillaman)

  • mds: add cephfs-table-tool (John Spray)

  • mds: avoid sending traceless replies in most cases (Yan, Zheng)

  • mds: export dir asok command (John Spray)

  • mds: fix stray/purge perfcounters (#10388 John Spray)

  • mds: handle heartbeat_reset during shutdown (#10382 John Spray)

  • mds: many many snapshot-related fixes (Yan, Zheng)

  • mds: refactor, improve session storage (John Spray)

  • misc coverity fixes (Danny Al-Gaaf)

  • mon: add noforward flag for some mon commands (Mykola Golub)

  • mon: disallow empty pool names (#10555 Wido den Hollander)

  • mon: do not deactivate last mds (#10862 John Spray)

  • mon: drop old ceph_mon_store_converter (Sage Weil)

  • mon: fix ‘ceph pg dump_stuck degraded’ (Xinxin Shu)

  • mon: fix ‘profile osd’ use of config-key function on mon (#10844 Joao Eduardo Luis)

  • mon: fix compatset initialization during mkfs (Joao Eduardo Luis)

  • mon: fix feature tracking during elections (Joao Eduardo Luis)

  • mon: fix mds gid/rank/state parsing (John Spray)

  • mon: ignore failure reports from before up_from (#10762 Dan van der Ster, Sage Weil)

  • mon: improved corrupt CRUSH map detection (Joao Eduardo Luis)

  • mon: include pg_temp count in osdmap summary (Sage Weil)

  • mon: log health summary to cluster log (#9440 Joao Eduardo Luis)

  • mon: make ‘mds fail’ idempotent (John Spray)

  • mon: make pg dump {sum,pgs,pgs_brief} work for format=plain (#5963 #6759 Mykola Golub)

  • mon: new pool safety flags nodelete, nopgchange, nosizechange (#9792 Mykola Golub)

  • mon: new, friendly ‘ceph pg ls …’ command (Xinxin Shu)

  • mon: prevent MDS transition from STOPPING (#10791 Greg Farnum)

  • mon: propose all pending work in one transaction (Sage Weil)

  • mon: remove pg_temps for nonexistent pools (Joao Eduardo Luis)

  • mon: require mon_allow_pool_delete option to remove pools (Sage Weil)

  • mon: set globalid prealloc to larger value (Sage Weil)

  • mon: skip zeroed osd stats in get_rule_avail (#10257 Joao Eduardo Luis)

  • mon: validate min_size range (Jianpeng Ma)

  • msgr: async: bind threads to CPU cores, improved poll (Haomai Wang)

  • msgr: fix crc configuration (Mykola Golub)

  • msgr: misc unit tests (Haomai Wang)

  • msgr: xio: XioMessenger RDMA support (Casey Bodley, Vu Pham, Matt Benjamin)

  • osd, librados: fadvise-style librados hints (Jianpeng Ma)

  • osd, librados: fix xattr_cmp_u64 (Dongmao Zhang)

  • osd,mon: add ‘norebalance’ flag (Kefu Chai)

  • osd,mon: specify OSD features explicitly in MOSDBoot (#10911 Sage Weil)

  • osd: add option to prioritize heartbeat network traffic (Jian Wen)

  • osd: add support for the SHEC erasure-code algorithm (Takeshi Miyamae, Loic Dachary)

  • osd: allow recovery while below min_size (Samuel Just)

  • osd: allow recovery with fewer than min_size OSDs (Samuel Just)

  • osd: allow writes to degraded objects (Samuel Just)

  • osd: avoid publishing unchanged PG stats (Sage Weil)

  • osd: cache recent ObjectContexts (Dong Yuan)

  • osd: clear cache on interval change (Samuel Just)

  • osd: do not proxy reads unless target OSDs are new (#10788 Sage Weil)

  • osd: do not update digest on inconsistent object (#10524 Samuel Just)

  • osd: don’t record digests for snapdirs (#10536 Samuel Just)

  • osd: fix OSDCap parser on old (el6) boost::spirit (#10757 Kefu Chai)

  • osd: fix OSDCap parsing on el6 (#10757 Kefu Chai)

  • osd: fix ObjectStore::Transaction encoding version (#10734 Samuel Just)

  • osd: fix auth object selection during repair (#10524 Samuel Just)

  • osd: fix bug in pending digest updates (#10840 Samuel Just)

  • osd: fix cancel_proxy_read_ops (Sage Weil)

  • osd: fix cleanup of interrupted pg deletion (#10617 Sage Weil)

  • osd: fix journal wrapping bug (#10883 David Zafman)

  • osd: fix leak in SnapTrimWQ (#10421 Kefu Chai)

  • osd: fix memstore free space calculation (Xiaoxi Chen)

  • osd: fix mixed-version peering issues (Samuel Just)

  • osd: fix object digest update bug (#10840 Samuel Just)

  • osd: fix ordering issue with new transaction encoding (#10534 Dong Yuan)

  • osd: fix past_interval generation (#10427 #10430 David Zafman)

  • osd: fix short read handling on push (#8121 David Zafman)

  • osd: fix watch timeout cache state update (#10784 David Zafman)

  • osd: force promotion of watch/notify ops (Zhiqiang Wang)

  • osd: improve idempotency detection across cache promotion/demotion (#8935 Sage Weil, Samuel Just)

  • osd: include activating peers in blocked_by (#10477 Sage Weil)

  • osd: jerasure and gf-complete updates from upstream (#10216 Loic Dachary)

  • osd: journal: check fsync/fdatasync result (Jianpeng Ma)

  • osd: journal: fix hang on shutdown (#10474 David Zafman)

  • osd: journal: fix header.committed_up_to (Xinze Chi)

  • osd: journal: initialize throttle (Ning Yao)

  • osd: journal: misc bug fixes (#6003 David Zafman, Samuel Just)

  • osd: misc cleanup (Xinze Chi, Yongyue Sun)

  • osd: new ‘activating’ state between peering and active (Sage Weil)

  • osd: preserve reqids for idempotency checks for promote/demote (Sage Weil, Zhiqiang Wang, Samuel Just)

  • osd: remove dead locking code (Xinxin Shu)

  • osd: restrict scrub to certain times of day (Xinze Chi)

  • osd: rocksdb: fix shutdown (Haomai Wang)

  • pybind: fix memory leak in librados bindings (Billy Olsen)

  • qa: fix mds tests (#10539 John Spray)

  • qa: ignore duplicates in rados ls (Josh Durgin)

  • qa: improve hadoop tests (Noah Watkins)

  • qa: reorganize fs quota tests (Greg Farnum)

  • rados: fix usage (Jianpeng Ma)

  • radosgw-admin: add replicalog update command (Yehuda Sadeh)

  • rbd-fuse: clean up on shutdown (Josh Durgin)

  • rbd: add ‘merge-diff’ function (MingXin Liu, Yunchuan Wen, Li Wang)

  • rbd: fix buffer handling on image import (#10590 Jason Dillaman)

  • rbd: leave exclusive locking off by default (Jason Dillaman)

  • rbd: update init-rbdmap to fix duplicate mount point (Karel Striegel)

  • rbd: use IO hints for import, export, and bench operations (#10462 Jason Dillaman)

  • rbd_recover_tool: RBD image recovery tool (Min Chen)

  • rgw: S3-style object versioning support (Yehuda Sadeh)

  • rgw: check keystone auth for S3 POST requests (#10062 Abhishek Lekshmanan)

  • rgw: extend replica log API (purge-all) (Yehuda Sadeh)

  • rgw: fail S3 POST if keystone not configured (#10688 Valery Tschopp, Yehuda Sadeh)

  • rgw: fix XML header on get ACL request (#10106 Yehuda Sadeh)

  • rgw: fix bucket removal with data purge (Yehuda Sadeh)

  • rgw: fix replica log indexing (#8251 Yehuda Sadeh)

  • rgw: fix swift metadata header name (Dmytro Iurchenko)

  • rgw: remove multipart entries from bucket index on abort (#10719 Yehuda Sadeh)

  • rgw: respond with 204 to POST on containers (#10667 Yuan Zhou)

  • rgw: reuse fcgx connection struct (#10194 Yehuda Sadeh)

  • rgw: support multiple host names (#7467 Yehuda Sadeh)

  • rgw: swift: dump container’s custom metadata (#10665 Ahmad Faheem, Dmytro Iurchenko)

  • rgw: swift: support Accept header for response format (#10746 Dmytro Iurchenko)

  • rgw: swift: support for X-Remove-Container-Meta-{key} (#10475 Dmytro Iurchenko)

  • rpm: move rgw logrotate to radosgw subpackage (Ken Dreyer)

  • tests: centos-6 dockerfile (#10755 Loic Dachary)

  • tests: unit tests for shared_cache (Dong Yuan)

  • vstart.sh: work with cmake (Yehuda Sadeh)

v0.93

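This is the last chunk of new material before Hammer. Big items include additional checksums on OSD objects, proxied reads in the cache tier, image locking in RBD, optimized OSD Transaction and replication messages, and a big pile of RGW and MDS bug fixes.

Upgrading

  • The experimental ‘keyvaluestore-dev’ OSD backend has been renamed ‘keyvaluestore’ (for simplicity) and marked as experimental. To enable this untested feature and acknowledge that you understand that it is untested and may destroy data, you need to add the following to your ceph.conf: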

    enable experimental unrecoverable data corrupting features = keyvaluestore
    
  • The following librados C API function calls take a ‘flags’ argument whose value is now correctly interpreted:

    rados_write_op_operate()

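    The flags were not being translated correctly from the librados constants to the internal values. Now they are. Any code that passes flags to these methods should be audited to ensure that it uses the correct LIBRADOS_OP_FLAG_* constants.

  • The ‘rados’ CLI ‘copy’ and ‘cppool’ commands now use the copy-from operation, which means the latest CLI cannot run these commands against pre-firefly OSDs.

  • The librados watch/notify API now includes a watch_flush() operation to flush the async queue of notify operations. Any watch/notify user should call it prior to rados_shutdown().

Notable Changes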

  • add experimental features option (Sage Weil)

  • build: fix ‘make check’ races (#10384 Loic Dachary)

  • build: fix pkg names when libkeyutils is missing (Pankaj Garg, Ken Dreyer)

  • ceph: make ‘ceph -s’ show PG state counts in sorted order (Sage Weil)

  • ceph: make ‘ceph tell mon.* version’ work (Mykola Golub)

  • ceph-monstore-tool: fix/improve CLI (Joao Eduardo Luis)

  • ceph: show primary-affinity in ‘ceph osd tree’ (Mykola Golub)

  • common: add TableFormatter (Andreas Peters)

  • common: check syncfs() return code (Jianpeng Ma)

  • doc: do not suggest dangerous XFS nobarrier option (Dan van der Ster)

  • doc: misc updates (Nilamdyuti Goswami, John Wilkins)

  • install-deps.sh: do not require sudo when root (Loic Dachary)

  • libcephfs: fix dirfrag trimming (#10387 Yan, Zheng)

  • libcephfs: fix mount timeout (#10041 Yan, Zheng)

  • libcephfs: fix test (#10415 Yan, Zheng)

  • libcephfs: fix use-after-free on umount (#10412 Yan, Zheng)

  • libcephfs: include ceph and git version in client metadata (Sage Weil)

  • librados: add watch_flush() operation (Sage Weil, Haomai Wang)

  • librados: avoid memcpy on getxattr, read (Jianpeng Ma)

  • librados: create ioctx by pool id (Jason Dillaman)

  • librados: do notify completion in fast-dispatch (Sage Weil)

  • librados: remove shadowed variable (Kefu Chai)

  • librados: translate op flags from C APIs (Matthew Richards)

  • librbd: differentiate between R/O and R/W features (Jason Dillaman)

  • librbd: exclusive image locking (Jason Dillaman)

  • librbd: fix write vs import race (#10590 Jason Dillaman)

  • librbd: gracefully handle deleted/renamed pools (#10270 Jason Dillaman)

  • mds: asok command for fetching subtree map (John Spray)

  • mds: constify MDSCacheObjects (John Spray)

  • various valgrind fixes and cleanups (Danny Al-Gaaf)

  • mon: fix ‘mds fail’ for standby MDSs (John Spray)

  • mon: fix stashed monmap encoding (#5203 Xie Rui)

  • mon: implement ‘fs reset’ command (John Spray)

  • mon: respect down flag when promoting standbys (John Spray)

  • mount.ceph: fix spurious error message (#10351 Yan, Zheng)

  • msgr: async: many fixes, unit tests (Haomai Wang)

  • msgr: simple: retry binding to port on failure (#10029 Wido den Hollander)

  • osd: add fadvise flags to ObjectStore API (Jianpeng Ma)

  • osd: add get_latest_osdmap asok command (#9483 #9484 Mykola Golub)

  • osd: EIO on whole-object reads when checksum is wrong (Sage Weil)

  • osd: filejournal: don’t cache journal when not using direct IO (Jianpeng Ma)

  • osd: fix ioprio option (Mykola Golub)

  • osd: fix scrub delay bug (#10693 Samuel Just)

  • osd: fix watch reconnect race (#10441 Sage Weil)

  • osd: handle no-op write with snapshot (#10262 Sage Weil)

  • osd: journal: fix journal zeroing when direct IO is enabled (Xie Rui)

  • osd: keyvaluestore: clean up dead code (Ning Yao)

  • osd, mds: ‘ops’ as shorthand for ‘dump_ops_in_flight’ on asok (Sage Weil)

  • osd: memstore: fix size limit (Xiaoxi Chen)

  • osd: misc scrub fixes (#10017 Loic Dachary)

  • osd: new optimized encoding for ObjectStore::Transaction (Dong Yuan)

  • osd: optimize filter_snapc (Ning Yao)

  • osd: optimize WBThrottle map with unordered_map (Ning Yao)

  • osd: proxy reads during cache promote (Zhiqiang Wang)

  • osd: proxy read support (Zhiqiang Wang)

  • osd: remove legacy classic scrub code (Sage Weil)

  • osd: remove unused fields in MOSDSubOp (Xiaoxi Chen)

  • osd: replace MOSDSubOp messages with simpler, optimized MOSDRepOp (Xiaoxi Chen)

  • osd: store whole-object checksums on scrub, write_full (Sage Weil)

  • osd: verify kernel is new enough before using XFS extsize ioctl, enable by default (#9956 Sage Weil)

  • rados: use copy-from operation for copy, cppool (Sage Weil)

  • rgw: change multipart upload id magic (#10271 Yehuda Sadeh)

  • rgw: decode http query params correction (#10271 Yehuda Sadeh)

  • rgw: fix content length check (#10701 Axel Dunkel, Yehuda Sadeh)

  • rgw: fix partial GET in swift (#10553 Yehuda Sadeh)

  • rgw: fix shutdown (#10472 Yehuda Sadeh)

  • rgw: include XML ns on get ACL request (#10106 Yehuda Sadeh)

  • rgw: misc fixes (#10307 Yehuda Sadeh)

  • rgw: only track cleanup for objects we write (#10311 Yehuda Sadeh)

  • rgw: tweak error codes (#10329 #10334 Yehuda Sadeh)

  • rgw: use gc for multipart abort (#10445 Aaron Bassett, Yehuda Sadeh)

  • sysvinit: fix race in ‘stop’ (#10389 Loic Dachary)

  • tests: fix bufferlist tests (Jianpeng Ma)

  • tests: improve docker-based tests (Loic Dachary)

v0.92

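We are quickly approaching the Hammer feature freeze but have a few more development releases to go before we get there. The headline items are subtree-based quota support in CephFS (ceph-fuse/libcephfs client support only for now), a rewrite of the watch/notify librados API used by RBD and RGW, OSDMap checksums to ensure that maps are always consistent inside the cluster, new API calls in librados and librbd for IO hinting modeled after posix_fadvise, and improved storage of per-PG state.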

We expect two more releases before the Hammer feature freeze (v0.93).

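Upgrading

  • The ‘category’ field for objects has been removed. This was originally added to track PG stat summations over different categories of objects for use by radosgw. It no longer has any known users and is prone to abuse because it can lead to a pg_stat_t structure that is unbounded. The librados API calls that accept this field now ignore it, and the OSD no longer tracks the per-category summations.

  • Previously, the formatted output of ‘ceph pg stat -f …’ was a full pg dump that included all metadata about all PGs in the system. It is now a concise summary of high-level PG stats, just like the unformatted ‘ceph pg stat’ command.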

  • The ‘rados create <objectname> [category]’ optional category argument is no longer supported or recognized.

  • rados.py’s Rados class no longer has a __del__ method; it was causing problems on interpreter shutdown and use of threads. If your code has Rados objects with limited lifetimes and you’re concerned about locked resources, call Rados.shutdown() explicitly.

  • There is a new version of the librados watch/notify API with vastly improved semantics. Any applications using this interface are encouraged to migrate to the new API. The old API calls are marked as deprecated and will eventually be removed.

  • The librados rados_unwatch() call used to be safe to call on an invalid handle. The new version has undefined behavior when passed a bogus value (for example, when rados_watch() returns an error and handle is not defined).

  • The structure of the formatted ‘pg stat’ command is changed for the portion that counts states by name to avoid using the ‘+’ character (which appears in state names) as part of the XML token (it is not legal).
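
Because the Rados class in rados.py no longer has a __del__ method, short-lived handles need an explicit Rados.shutdown() call. The following is a minimal sketch of one way to guarantee that, not a pattern prescribed by the bindings. The helper name cluster_handle is hypothetical, and StubCluster is a stand-in for rados.Rados so that the example runs without a cluster:

```python
from contextlib import contextmanager


@contextmanager
def cluster_handle(factory):
    """Yield a connected handle and always call shutdown() on exit.

    `factory` is any zero-argument callable returning an object with
    connect() and shutdown() methods. This helper is hypothetical and
    is not part of rados.py.
    """
    handle = factory()
    try:
        handle.connect()
        yield handle
    finally:
        # Explicit shutdown replaces the cleanup that __del__ used to do.
        handle.shutdown()


class StubCluster:
    """Stand-in for rados.Rados, used only to make the sketch runnable."""

    def __init__(self):
        self.events = []

    def connect(self):
        self.events.append("connect")

    def shutdown(self):
        self.events.append("shutdown")


stub = StubCluster()
with cluster_handle(lambda: stub) as cluster:
    cluster.events.append("work")
```

With the real bindings the factory would be something like lambda: rados.Rados(conffile='/etc/ceph/ceph.conf'), where the configuration path is an assumption about the local setup.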

Notable Changes

  • asyncmsgr: misc fixes (Haomai Wang)

  • buffer: add ‘shareable’ construct (Matt Benjamin)

  • build: aarch64 build fixes (Noah Watkins, Haomai Wang)

  • build: support for jemalloc (Shishir Gowda)

  • ceph-disk: allow journal partition re-use (#10146 Loic Dachary, Dan van der Ster)

  • ceph-disk: misc fixes (Christos Stavrakakis)

  • ceph-fuse: fix kernel cache trimming (#10277 Yan, Zheng)

  • ceph-objectstore-tool: many many improvements (David Zafman)

  • common: support new gperftools header locations (Ken Dreyer)

  • crush: straw bucket weight calculation fixes (#9998 Sage Weil)

  • doc: misc improvements (Nilamdyuti Goswami, John Wilkins, Chris Holcombe)

  • libcephfs,ceph-fuse: add ‘status’ asok (John Spray)

  • librados, osd: new watch/notify implementation (Sage Weil)

  • librados: drop ‘category’ feature (Sage Weil)

  • librados: fix pool deletion handling (#10372 Sage Weil)

  • librados: new fadvise API (Ma Jianpeng)

  • libradosstriper: fix remove() (Dongmao Zhang)

  • librbd: complete pending ops before closing image (#10299 Josh Durgin)

  • librbd: fadvise API (Ma Jianpeng)

  • mds: ENOSPC and OSDMap epoch barriers (#7317 John Spray)

  • mds: dirfrag buf fix (Yan, Zheng)

  • mds: disallow most commands on inactive MDS’s (Greg Farnum)

  • mds: drop dentries, leases on deleted directories (#10164 Yan, Zheng)

  • mds: handle zero-size xattr (#10335 Yan, Zheng)

  • mds: subtree quota support (Yunchuan Wen)

  • memstore: free space tracking (John Spray)

  • misc cleanup (Danny Al-Gaaf, David Anderson)

  • mon: ‘osd crush reweight-all’ command (Sage Weil)

  • mon: allow full flag to be manually cleared (#9323 Sage Weil)

  • mon: delay failure injection (Joao Eduardo Luis)

  • mon: fix paxos timeouts (#10220 Joao Eduardo Luis)

  • mon: get canonical OSDMap from leader (#10422 Sage Weil)

  • msgr: fix RESETSESSION bug (#10080 Greg Farnum)

  • objectstore: deprecate collection attrs (Sage Weil)

  • osd, mon: add checksums to all OSDMaps (Sage Weil)

  • osd: allow deletion of objects with watcher (#2339 Sage Weil)

  • osd: allow sparse read for Push/Pull (Haomai Wang)

  • osd: cache reverse_nibbles hash value (Dong Yuan)

  • osd: drop upgrade support for pre-dumpling (Sage Weil)

  • osd: enable and use posix_fadvise (Sage Weil)

  • osd: erasure-code: enforce chunk size alignment (#10211 Loic Dachary)

  • osd: erasure-code: jerasure support for NEON (Loic Dachary)

  • osd: erasure-code: relax cauchy w restrictions (#10325 David Zhang, Loic Dachary)

  • osd: erasure-code: update gf-complete to latest upstream (Loic Dachary)

  • osd: fix WBThrottle perf counters (Haomai Wang)

  • osd: fix backfill bug (#10150 Samuel Just)

  • osd: fix occasional peering stalls (#10431 Sage Weil)

  • osd: fix scrub vs try-flush bug (#8011 Samuel Just)

  • osd: fix stderr with -f or -d (Dan Mick)

  • osd: misc FIEMAP fixes (Ma Jianpeng)

  • osd: optimize Finisher (Xinze Chi)

  • osd: store PG metadata in per-collection objects for better concurrency (Sage Weil)

  • pyrados: add object lock support (#6114 Mehdi Abaakouk)

  • pyrados: fix misnamed wait_* routines (#10104 Dan Mick)

  • pyrados: misc cleanups (Kefu Chai)

  • qa: add large auth ticket tests (Ilya Dryomov)

  • qa: many ‘make check’ improvements (Loic Dachary)

  • qa: misc tests (Loic Dachary, Yan, Zheng)

  • rgw: conditional PUT on ETag (#8562 Ray Lv)

  • rgw: fix error codes (#10334 #10329 Yehuda Sadeh)

  • rgw: index swift keys appropriately (#10471 Yehuda Sadeh)

  • rgw: prevent illegal bucket policy that doesn’t match placement rule (Yehuda Sadeh)

  • rgw: run radosgw as apache with systemd (#10125 Loic Dachary)

  • rgw: support X-Storage-Policy header for Swift storage policy compat (Yehuda Sadeh)

  • rgw: use \r\n for http headers (#9254 Yehuda Sadeh)

  • rpm: misc fixes (Ken Dreyer)

v0.90

This is the last development release before Christmas. There are some API cleanups for librados and librbd, and lots of bug fixes across the board for the OSD, MDS, RGW, and CRUSH. The OSD also gets support for discard (potentially helpful on SSDs, although it is off by default), and there are several improvements to ceph-disk.

The next two development releases will be getting a slew of new functionality for hammer. Stay tuned!

Upgrading

  • Previously, the formatted output of ‘ceph pg stat -f …’ was a full pg dump that included all metadata about all PGs in the system. It is now a concise summary of high-level PG stats, just like the unformatted ‘ceph pg stat’ command.

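  • All JSON dumps of floating-point values previously wrapped the value in quotes. These quotes have been removed. Any consumer of structured JSON output that reads floating-point values previously had to interpret the quoted string, and will most likely need to be fixed to accept the unquoted number.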
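
A consumer that must work against both older and newer clusters can accept either representation. The following is a minimal sketch under that assumption. The helper as_float and the key example_ratio are hypothetical and do not correspond to any specific field in Ceph's output:

```python
import json


def as_float(value):
    """Return a float from either a JSON number or a quoted numeric string."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("boolean is not a floating-point value")
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        # Older dumps quoted the number, so parse the string form.
        return float(value)
    raise TypeError("unsupported type: %s" % type(value).__name__)


# Hypothetical field name; real dumps use their own keys.
old_style = json.loads('{"example_ratio": "0.85"}')
new_style = json.loads('{"example_ratio": 0.85}')

old_value = as_float(old_style["example_ratio"])
new_value = as_float(new_style["example_ratio"])
```

Once every cluster being queried emits unquoted numbers, the string branch can be dropped.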

Notable Changes

  • arch: fix NEON feature detection (#10185 Loic Dachary)

  • build: adjust build deps for yasm, virtualenv (Jianpeng Ma)

  • build: improve build dependency tooling (Loic Dachary)

  • ceph-disk: call partx/partprobe consistently (#9721 Loic Dachary)

  • ceph-disk: fix dmcrypt key permissions (Loic Dachary)

  • ceph-disk: fix umount race condition (#10096 Blaine Gardner)

  • ceph-disk: init=none option (Loic Dachary)

  • ceph-monstore-tool: fix shutdown (#10093 Loic Dachary)

  • ceph-objectstore-tool: fix import (#10090 David Zafman)

  • ceph-objectstore-tool: many improvements and tests (David Zafman)

  • ceph.spec: package rbd-replay-prep (Ken Dreyer)

  • common: add ‘perf reset …’ admin command (Jianpeng Ma)

  • common: do not unlock rwlock on destruction (Federico Simoncelli)

  • common: fix block device discard check (#10296 Sage Weil)

  • common: remove broken CEPH_LOCKDEP option (Kefu Chai)

  • crush: fix tree bucket behavior (Rongze Zhu)

  • doc: add build-doc guidelines for Fedora and CentOS/RHEL (Nilamdyuti Goswami)

  • doc: enable rbd cache on openstack deployments (Sebastien Han)

  • doc: improved installation notes for CentOS/RHEL installs (John Wilkins)

  • doc: misc cleanups (Adam Spiers, Sebastien Han, Nilamdyuti Goswami, Ken Dreyer, John Wilkins)

  • doc: new man pages (Nilamdyuti Goswami)

  • doc: update release descriptions (Ken Dreyer)

  • doc: update sepia hardware inventory (Sandon Van Ness)

  • librados: only export public API symbols (Jason Dillaman)

  • libradosstriper: fix stat strtoll (Dongmao Zhang)

  • libradosstriper: fix trunc method (#10129 Sebastien Ponce)

  • librbd: fix list_children from invalid pool ioctxs (#10123 Jason Dillaman)

  • librbd: only export public API symbols (Jason Dillaman)

  • many coverity fixes (Danny Al-Gaaf)

  • mds: ‘flush journal’ admin command (John Spray)

  • mds: fix MDLog IO callback deadlock (John Spray)

  • mds: fix deadlock during journal probe vs purge (Joao Eduardo Luis)

  • mds: fix race trimming log segments (Yan, Zheng)

  • mds: store backtrace for stray dir (Yan, Zheng)

  • mds: verify backtrace when fetching dirfrag (#9557 Yan, Zheng)

  • mon: add max pgs per osd warning (Sage Weil)

  • mon: fix *_ratio units and types (Sage Weil)

  • mon: fix JSON dumps to dump floats as floats and not strings (Sage Weil)

  • mon: fix formatter ‘pg stat’ command output (Sage Weil)

  • msgr: async: several fixes (Haomai Wang)

  • msgr: simple: fix rare deadlock (Greg Farnum)

  • osd: batch pg log trim (Xinze Chi)

  • osd: clean up internal ObjectStore interface (Sage Weil)

  • osd: ignore min flush age when cache is full (Xinze Chi)

  • osd: fix ghobject_t formatted output to include shard (#10063 Loic Dachary)

  • osd: fix osd peer check on scrub messages (#9555 Sage Weil)

  • osd: fix pgls filter ops (#9439 David Zafman)

  • osd: flush snapshots from cache tier immediately (Sage Weil)

  • osd: keyvaluestore: fix getattr semantics (Haomai Wang)

  • osd: keyvaluestore: fix key ordering (#10119 Haomai Wang)

  • osd: limit in-flight read requests (Jason Dillaman)

  • osd: log when scrub or repair starts (Loic Dachary)

  • osd: support for discard for journal trim (Jianpeng Ma)

  • qa: fix osd create dup tests (#10083 Loic Dachary)

  • rgw: add location header when object is in another region (VRan Liu)

  • rgw: check timestamp on s3 keystone auth (#10062 Abhishek Lekshmanan)

  • rgw: make sysvinit script set ulimit -n properly (Sage Weil)

  • systemd: better systemd unit files (Owen Synge)

  • tests: ability to run unit tests under docker (Loic Dachary)

v0.89

This is the second development release since Giant. The big items include the first batch of scrub patches from Greg for CephFS, a rework in the librados object listing API to properly handle namespaces, and a pile of bug fixes for RGW. There are also several smaller issues fixed up in the performance area with buffer alignment and memory copies, osd cache tiering agent, and various CephFS fixes.

Upgrading

  • New ability to list all objects from all namespaces can fail or return incomplete results when not all OSDs have been upgraded. Features rados --all ls, rados cppool, rados export, rados cache-flush-evict-all and rados cache-try-flush-evict-all can also fail or return incomplete results.

Notable Changes

  • buffer: add list::get_contiguous (Sage Weil)

  • buffer: avoid rebuild if buffer already contiguous (Jianpeng Ma)

  • ceph-disk: improved systemd support (Owen Synge)

  • ceph-disk: set guid if reusing journal partition (Dan van der Ster)

  • ceph-fuse, libcephfs: allow xattr caps in inject_release_failure (#9800 John Spray)

  • ceph-fuse, libcephfs: fix I_COMPLETE_ORDERED checks (#9894 Yan, Zheng)

  • ceph-fuse: fix dentry invalidation on 3.18+ kernels (#9997 Yan, Zheng)

  • crush: fix detach_bucket (#10095 Sage Weil)

  • crush: fix several bugs in adjust_item_weight (Rongze Zhu)

  • doc: add dumpling to firefly upgrade section (#7679 John Wilkins)

  • doc: document erasure coded pool operations (#9970 Loic Dachary)

  • doc: file system osd config settings (Kevin Dalley)

  • doc: key/value store config reference (John Wilkins)

  • doc: update openstack docs for Juno (Sebastien Han)

  • fix cluster logging from non-mon daemons (Sage Weil)

  • init-ceph: check for systemd-run before using it (Boris Ranto)

  • librados: fix infinite loop with skipped map epochs (#9986 Ding Dinghua)

  • librados: fix iterator operator= bugs (#10082 David Zafman, Yehuda Sadeh)

  • librados: fix null deref when pool DNE (#9944 Sage Weil)

  • librados: fix timer race from recent refactor (Sage Weil)

  • libradosstriper: fix shutdown hang (Dongmao Zhang)

  • librbd: don’t close a closed parent in failure path (#10030 Jason Dillaman)

  • librbd: fix diff test (#10002 Josh Durgin)

  • librbd: fix locking for readahead (#10045 Jason Dillaman)

  • librbd: refactor unit tests to use fixtures (Jason Dillaman)

  • many many coverity cleanups (Danny Al-Gaaf)

  • mds: a whole bunch of initial scrub infrastructure (Greg Farnum)

  • mds: fix compat_version for MClientSession (#9945 John Spray)

  • mds: fix reply snapbl (Yan, Zheng)

  • mon: allow adding tiers to fs pools (#10135 John Spray)

  • mon: fix MDS health status from peons (#10151 John Spray)

  • mon: fix caching for min_last_epoch_clean (#9987 Sage Weil)

  • mon: fix error output for add_data_pool (#9852 Joao Eduardo Luis)

  • mon: include entity name in audit log for forwarded requests (#9913 Joao Eduardo Luis)

  • mon: paxos: allow reads while proposing (#9321 #9322 Joao Eduardo Luis)

  • msgr: asyncmessenger: add kqueue support (#9926 Haomai Wang)

  • osd, librados: revamp PG listing API to handle namespaces (#9031 #9262 #9438 David Zafman)

  • osd, mon: send initial pg create time from mon to osd (#9887 David Zafman)

  • osd: allow whiteout deletion in cache pool (Sage Weil)

  • osd: cache pool: ignore min flush age when cache is full (Xinze Chi)

  • osd: erasure coding: allow bench.sh to test ISA backend (Yuan Zhou)

  • osd: erasure-code: encoding regression tests, corpus (#9420 Loic Dachary)

  • osd: fix journal shutdown race (Sage Weil)

  • osd: fix object age eviction (Zhiqiang Wang)

  • osd: fix object atime calculation (Xinze Chi)

  • osd: fix past_interval display bug (#9752 Loic Dachary)

  • osd: journal: fix alignment checks, avoid useless memmove (Jianpeng Ma)

  • osd: journal: update committed_thru after replay (#6756 Samuel Just)

  • osd: keyvaluestore_dev: optimization (Chendi Xue)

  • osd: make misdirected op checks robust for EC pools (#9835 Sage Weil)

  • osd: removed some dead code (Xinze Chi)

  • qa: parallelize make check (Loic Dachary)

  • qa: tolerate nearly-full disk for make check (Loic Dachary)

  • rgw: create subuser if needed when creating user (#10103 Yehuda Sadeh)

  • rgw: fix If-Modified-Since (VRan Liu)

  • rgw: fix content-length update (#9576 Yehuda Sadeh)

  • rgw: fix disabling of max_size quota (#9907 Dong Lei)

  • rgw: fix incorrect len when len is 0 (#9877 Yehuda Sadeh)

  • rgw: fix object copy content type (#9478 Yehuda Sadeh)

  • rgw: fix user stats in get-user-info API (#9359 Ray Lv)

  • rgw: remove swift user manifest (DLO) hash calculation (#9973 Yehuda Sadeh)

  • rgw: return timestamp on GET/HEAD (#8911 Yehuda Sadeh)

  • rgw: set ETag on object copy (#9479 Yehuda Sadeh)

  • rgw: update bucket index on attr changes, for multi-site sync (#5595 Yehuda Sadeh)

v0.88

This is the first development release after Giant. The two main features merged this round are the new AsyncMessenger (an alternative implementation of the network layer) from Haomai Wang at UnitedStack, and support for POSIX file locks in ceph-fuse and libcephfs from Yan, Zheng. There is also a big pile of smaller items that were merged while we were stabilizing Giant, including a range of smaller performance and bug fixes and some new tracepoints for LTTNG.

Notable Changes

  • ceph-disk: Scientific Linux support (Dan van der Ster)

  • ceph-disk: respect --statedir for keyring (Loic Dachary)

  • ceph-fuse, libcephfs: POSIX file lock support (Yan, Zheng)

  • ceph-fuse, libcephfs: fix cap flush overflow (Greg Farnum, Yan, Zheng)

  • ceph-fuse, libcephfs: fix root inode xattrs (Yan, Zheng)

  • ceph-fuse, libcephfs: preserve dir ordering (#9178 Yan, Zheng)

  • ceph-fuse, libcephfs: trim inodes before reconnecting to MDS (Yan, Zheng)

  • ceph: do not parse injectargs twice (Loic Dachary)

  • ceph: make ‘ceph -s’ output more readable (Sage Weil)

  • ceph: new ‘ceph tell mds.$name_or_rank_or_gid’ (John Spray)

  • ceph: test robustness (Joao Eduardo Luis)

  • ceph_objectstore_tool: behave with sharded flag (#9661 David Zafman)

  • cephfs-journal-tool: fix journal import (#10025 John Spray)

  • cephfs-journal-tool: skip up to expire_pos (#9977 John Spray)

  • cleanup rados.h definitions with macros (Ilya Dryomov)

  • common: shared_cache unit tests (Cheng Cheng)

  • config: add $cctid meta variable (Adam Crume)

  • crush: fix buffer overrun for poorly formed rules (#9492 Johnu George)

  • crush: improve constness (Loic Dachary)

  • crushtool: add --location <id> command (Sage Weil, Loic Dachary)

  • default to libnss instead of crypto++ (Federico Gimenez)

  • doc: ceph osd reweight vs crush weight (Laurent Guerby)

  • doc: document the LRC per-layer plugin configuration (Yuan Zhou)

  • doc: erasure code doc updates (Loic Dachary)

  • doc: misc updates (Alfredo Deza, VRan Liu)

  • doc: preflight doc fixes (John Wilkins)

  • doc: update PG count guide (Gerben Meijer, Laurent Guerby, Loic Dachary)

  • keyvaluestore: misc fixes (Haomai Wang)

  • keyvaluestore: performance improvements (Haomai Wang)

  • librados: add rados_pool_get_base_tier() call (Adam Crume)

  • librados: cap buffer length (Loic Dachary)

  • librados: fix objecter races (#9617 Josh Durgin)

  • libradosstriper: misc fixes (Sebastien Ponce)

  • librbd: add missing python docstrings (Jason Dillaman)

  • librbd: add readahead (Adam Crume)

  • librbd: fix cache tiers in list_children and snap_unprotect (Adam Crume)

  • librbd: fix performance regression in ObjectCacher (#9513 Adam Crume)

  • librbd: lttng tracepoints (Adam Crume)

  • librbd: misc fixes (Xinxin Shu, Jason Dillaman)

  • mds: fix sessionmap lifecycle bugs (Yan, Zheng)

  • mds: initialize root inode xattr version (Yan, Zheng)

  • mds: introduce auth caps (John Spray)

  • mds: misc bugs (Greg Farnum, John Spray, Yan, Zheng, Henry Chang)

  • misc coverity fixes (Danny Al-Gaaf)

  • mon: add ‘ceph osd rename-bucket …’ command (Loic Dachary)

  • mon: clean up auth list output (Loic Dachary)

  • mon: fix ‘osd crush link’ id resolution (John Spray)

  • mon: fix misc error paths (Joao Eduardo Luis)

  • mon: fix paxos off-by-one corner case (#9301 Sage Weil)

  • mon: new ‘ceph pool ls [detail]’ command (Sage Weil)

  • mon: wait for writeable before cross-proposing (#9794 Joao Eduardo Luis)

  • msgr: avoid useless new/delete (Haomai Wang)

  • msgr: fix delay injection bug (#9910 Sage Weil, Greg Farnum)

  • msgr: new AsyncMessenger alternative implementation (Haomai Wang)

  • msgr: prefetch data when doing recv (Yehuda Sadeh)

  • osd: add erasure code corpus (Loic Dachary)

  • osd: add misc tests (Loic Dachary, Danny Al-Gaaf)

  • osd: clean up boost optionals (William Kennington)

  • osd: expose non-journal backends via ceph-osd CLI (Haomai Wang)

  • osd: fix JSON output for stray OSDs (Loic Dachary)

  • osd: fix ioprio options (Loic Dachary)

  • osd: fix transaction accounting (Jianpeng Ma)

  • osd: misc optimizations (Xinxin Shu, Zhiqiang Wang, Xinze Chi)

  • osd: use FIEMAP_FLAGS_SYNC instead of fsync (Jianpeng Ma)

  • rados: fix put of /dev/null (Loic Dachary)

  • rados: parse command-line arguments more strictly (#8983 Adam Crume)

  • rbd-fuse: fix memory leak (Adam Crume)

  • rbd-replay-many (Adam Crume)

  • rbd-replay: --anonymize flag to rbd-replay-prep (Adam Crume)

  • rbd: fix ‘rbd diff’ for non-existent objects (Adam Crume)

  • rbd: fix error when striping with format 1 (Sebastien Han)

  • rbd: fix export for image sizes over 2GB (Vicente Cheng)

  • rbd: use rolling average for rbd bench-write throughput (Jason Dillaman)

  • rgw: send explicit HTTP status string (Yehuda Sadeh)

  • rgw: set length for keystone token validation request (#7796 Yehuda Sadeh, Mark Kirkwood)

  • udev: fix rules for CentOS7/RHEL7 (Loic Dachary)

  • use clock_gettime instead of gettimeofday (Jianpeng Ma)

  • vstart.sh: set up environment for s3-tests (Luis Pabon)
