beancount.tools

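Standalone tools that are not linked to Beancount but are useful alongside it.

The beancount.scripts package contains the implementations of scripts that invoke the Beancount library code. The beancount.tools package, by contrast, implements other tools that do not call the Beancount library directly and could, in theory, be copied and used on their own. These tools are still distributed with Beancount, and to keep all the source code in one place they live in this package and are invoked through stub scripts under beancount/bin/, just like the other scripts.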

beancount.tools.treeify

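Identify a column of text that contains hierarchical identifiers and convert it into a tree.

This script inspects a text file and tries to find a left-aligned vertical column that contains identifiers with multiple components (for example "Assets:US:Bank:Checking"). It replaces that column with a tree rendered in ASCII characters, inserting new empty lines where needed to build the tree.

Note: If your paths contain spaces, this tool will not work correctly, because whitespace is used as the delimiter that marks the end of a column. You can customize the delimiter with an option.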

beancount.tools.treeify.Node (list)

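A node with a name attribute, a list of line numbers, and a list of children. The children are the items of the list itself, inherited from the parent class.

beancount.tools.treeify.Node.__repr__(self) special

Return str(self).

Source code in beancount/tools/treeify.py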
def __str__(self):
    return '<Node {} {}>'.format(self.name, [node.name for node in self])
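
The excerpt above shows only the string conversion. To make the description of Node concrete, here is a minimal sketch of how such a class could be put together. The constructor and the `__repr__` alias are assumptions (neither appears in this excerpt); the `name` and `nos` attributes are taken from how the functions in this module use a node.

```python
class Node(list):
    """A tree node; the list items are the child nodes."""

    def __init__(self, name):
        # Assumed constructor: not shown in the documented excerpt.
        super().__init__()
        self.name = name  # One component of a hierarchical identifier.
        self.nos = []     # Input line numbers attached to this node.

    def __str__(self):
        return '<Node {} {}>'.format(self.name, [node.name for node in self])

    # Assumption: repr() is made to return the same text as str().
    __repr__ = __str__


root = Node('Assets')
root.append(Node('US'))
root.append(Node('CA'))
print(root)  # <Node Assets ['US', 'CA']>
```

Because Node subclasses list, an empty node is falsy, which is what lets code such as `node[-1] if node else None` test for the presence of children.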

beancount.tools.treeify.create_tree(column_matches, regexp_split)

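Build a tree from a list of matches.

Parameters:
  • column_matches – A list of (line-number, name) pairs.

  • regexp_split – A regular expression string used to split names into their components.

Returns:
  • An instance of Node, the root node of the created tree.

Source code in beancount/tools/treeify.py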
def create_tree(column_matches, regexp_split):
    """Build up a tree from a list of matches.

    Args:
      column_matches: A list of (line-number, name) pairs.
      regexp_split: A regular expression string, to use for splitting the names
        of components.
    Returns:
      An instance of Node, the root node of the created tree.
    """
    root = Node('')
    for no, name in column_matches:
        parts = re.split(regexp_split, name)
        node = root
        for part in parts:
            last_node = node[-1] if node else None
            if last_node is None or last_node.name != part:
                last_node = Node(part)
                node.append(last_node)
            node = last_node
        node.nos.append(no)
    return root
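
A short usage sketch may help show what the resulting tree looks like. The function body is copied from the listing above; the Node class is a minimal stand-in (its constructor is an assumption), and the sample account names are made up for illustration.

```python
import re


class Node(list):
    """Minimal stand-in for treeify's Node (constructor is assumed)."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.nos = []


def create_tree(column_matches, regexp_split):
    root = Node('')
    for no, name in column_matches:
        parts = re.split(regexp_split, name)
        node = root
        for part in parts:
            # Only the most recently added child is considered for reuse.
            last_node = node[-1] if node else None
            if last_node is None or last_node.name != part:
                last_node = Node(part)
                node.append(last_node)
            node = last_node
        node.nos.append(no)
    return root


matches = [
    (0, 'Assets:US:Bank:Checking'),
    (1, 'Assets:US:Bank:Savings'),
    (3, 'Expenses:Food'),
]
root = create_tree(matches, ':')
top_level = [child.name for child in root]
bank = root[0][0][0]  # Assets -> US -> Bank
print(top_level)                             # ['Assets', 'Expenses']
print([(leaf.name, leaf.nos) for leaf in bank])

# Names that are not grouped together produce repeated siblings.
unsorted_root = create_tree([(0, 'A:x'), (1, 'B:y'), (2, 'A:z')], ':')
print([child.name for child in unsorted_root])  # ['A', 'B', 'A']
```

The second call illustrates a consequence of comparing only against the last child: shared prefixes are merged only when the matching names appear on consecutive input lines.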

beancount.tools.treeify.dump_tree(node, file=sys.stdout, prefix='')

Render a tree as indented text, one node per line.

Parameters:
  • node – An instance of Node.

  • file – A file object to write to.

  • prefix – A prefix string for each of the lines of the children.

Source code in beancount/tools/treeify.py
def dump_tree(node, file=sys.stdout, prefix=''):
    """Render a tree as a tree.

    Args:
      node: An instance of Node.
      file: A file object to write to.
      prefix: A prefix string for each of the lines of the children.
    """
    file.write(prefix)
    file.write(node.name)
    file.write('\n')
    for child in node:
        dump_tree(child, file, prefix + '... ')
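
As a quick illustration of the output format, the sketch below writes a small hand-built tree to an in-memory buffer. The function is copied from the listing above; the Node class is a minimal stand-in with an assumed constructor.

```python
import io
import sys


class Node(list):
    """Minimal stand-in for treeify's Node (constructor is assumed)."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.nos = []


def dump_tree(node, file=sys.stdout, prefix=''):
    file.write(prefix)
    file.write(node.name)
    file.write('\n')
    for child in node:
        # Each level of depth adds one more '... ' to the prefix.
        dump_tree(child, file, prefix + '... ')


root = Node('Assets')
bank = Node('Bank')
bank.append(Node('Checking'))
root.append(bank)
root.append(Node('Cash'))

buffer = io.StringIO()
dump_tree(root, buffer)
text = buffer.getvalue()
print(text, end='')
# Assets
# ... Bank
# ... ... Checking
# ... Cash
```

Passing an explicit file object, as done here, keeps the output capturable and avoids depending on whatever sys.stdout was bound to when the default argument was evaluated.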

beancount.tools.treeify.enum_tree_by_input_line_num(tree_lines)

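Accumulate the lines of a tree until a node with a line number is found.

Parameters:
  • tree_lines – A list of lines as returned by render_tree.

Yields:
  • Pairs of (line number, list of (line, node)).

Source code in beancount/tools/treeify.py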
def enum_tree_by_input_line_num(tree_lines):
    """Accumulate the lines of a tree until a line number is found.

    Args:
      tree_lines: A list of lines as returned by render_tree.
    Yields:
      Pairs of (line number, list of (line, node)).
    """
    pending = []
    for first_line, cont_line, node in tree_lines:
        if not node.nos:
            pending.append((first_line, node))
        else:
            line = first_line
            for no in node.nos:
                pending.append((line, node))
                line = cont_line
                yield (no, pending)
                pending = []
    if pending:
        yield (None, pending)
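
The grouping behavior is easier to see on a tiny input. In the sketch below the generator is copied from the listing above, while the nodes are simple `SimpleNamespace` stand-ins and the rendered strings are invented for illustration; a trailing node without line numbers is included only to show the final `(None, pending)` case.

```python
from types import SimpleNamespace


def enum_tree_by_input_line_num(tree_lines):
    pending = []
    for first_line, cont_line, node in tree_lines:
        if not node.nos:
            # Pure "parent" line: hold it until an input line claims it.
            pending.append((first_line, node))
        else:
            line = first_line
            for no in node.nos:
                pending.append((line, node))
                line = cont_line  # Later input lines get the continuation text.
                yield (no, pending)
                pending = []
    if pending:
        yield (None, pending)


assets = SimpleNamespace(name='Assets', nos=[])
checking = SimpleNamespace(name='Checking', nos=[4, 5])
trailing = SimpleNamespace(name='Trailing', nos=[])

tree_lines = [
    ('Assets', '|', assets),
    ('`-- Checking', '    ', checking),
    ('Trailing', '', trailing),
]

result = [(no, [line for line, _ in pending])
          for no, pending in enum_tree_by_input_line_num(tree_lines)]
print(result)
# [(4, ['Assets', '`-- Checking']), (5, ['    ']), (None, ['Trailing'])]
```

Input line 4 receives both the parent line and the node's first line, input line 5 receives only the continuation line, and leftover lines with no input line are emitted under `None`.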

beancount.tools.treeify.find_column(lines, pattern, delimiter)

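Find a valid column with hierarchical data in the text lines.

Parameters:
  • lines – A list of strings, the contents of the input.

  • pattern – A regular expression for the hierarchical entries.

  • delimiter – A regular expression that dictates how the end of a column is detected. Normally this is a single space. If the patterns contain spaces, you will need to increase this.

Returns:
  • A tuple of (matches, left, right), where:
      matches – a list of (line-number, name) tuples, where "name" is the hierarchical string to treeify and line-number is an integer, the line on which the entry appears.
      left – an integer, the leftmost column.
      right – an integer, the rightmost column.
    Not all line numbers are necessarily present, so you may need to skip some; those that are present are guaranteed to be in sorted order. If no candidate column passes the validity check, the source as listed falls through without an explicit return, so the result is None.

Source code in beancount/tools/treeify.py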
def find_column(lines, pattern, delimiter):
    """Find a valid column with hierarchical data in the text lines.

    Args:
      lines: A list of strings, the contents of the input.
      pattern: A regular expression for the hierarchical entries.
      delimiter: A regular expression that dictates how we detect the
        end of a column. Normally this is a single space. If the patterns
        contain spaces, you will need to increase this.
    Returns:
      A tuple of
        matches: A list of (line-number, name) tuples where 'name' is the
          hierarchical string to treeify and line-number is an integer, the
          line number where this applies.
        left: An integer, the leftmost column.
        right: An integer, the rightmost column.
      Note that not all line numbers may be present, so you may need to
      skip some. However, they are guaranteed to be in sorted order.
    """
    # A mapping of the line beginning position to its match object.
    beginnings = collections.defaultdict(list)
    pattern_and_whitespace = "({})(?P<ws>{}.|$)".format(pattern, delimiter)
    for no, line in enumerate(lines):
        for match in re.finditer(pattern_and_whitespace, line):
            beginnings[match.start()].append((no, line, match))

    # For each potential column found, verify that it is valid. A valid column
    # will have the maximum of its content text not overlap with any of the
    # following text. We assume that a column will have been formatted to full
    # width and that no text following the line overlap with the column, even in
    # its trailing whitespace.
    #
    # In other words, the following example is a violation because "10,990.74"
    # overlaps with the end of "Insurance" and so this would not be recognized
    # as a valid column:
    #
    # Expenses:Food:Restaurant     10,990.74 USD
    # Expenses:Health:Dental:Insurance   208.80 USD
    #
    for leftmost_column, column_matches in sorted(beginnings.items()):

        # Compute the location of the rightmost column of text.
        rightmost_column = max(match.end(1) for _, _, match in column_matches)

        # Compute the leftmost location of the content following the column text
        # and past its whitespace.
        following_column = min(match.end() if match.group('ws') else 10000
                               for _, _, match in column_matches)

        if rightmost_column < following_column:
            # We process only the very first match.
            return_matches = [(no, match.group(1).rstrip())
                              for no, _, match in column_matches]
            return return_matches, leftmost_column, rightmost_column
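
The validity rule described in the comments can be tried out directly. The sketch below copies the function body from the listing above (comments trimmed) and runs it on two inputs. `PATTERN` and `DELIMITER` are hypothetical values chosen for this example, not necessarily the defaults the tool uses.

```python
import collections
import re


def find_column(lines, pattern, delimiter):
    beginnings = collections.defaultdict(list)
    pattern_and_whitespace = "({})(?P<ws>{}.|$)".format(pattern, delimiter)
    for no, line in enumerate(lines):
        for match in re.finditer(pattern_and_whitespace, line):
            beginnings[match.start()].append((no, line, match))

    for leftmost_column, column_matches in sorted(beginnings.items()):
        # End of the widest name in this candidate column.
        rightmost_column = max(match.end(1) for _, _, match in column_matches)
        # Earliest position just past the first character that follows.
        following_column = min(match.end() if match.group('ws') else 10000
                               for _, _, match in column_matches)
        if rightmost_column < following_column:
            return_matches = [(no, match.group(1).rstrip())
                              for no, _, match in column_matches]
            return return_matches, leftmost_column, rightmost_column


# Hypothetical pattern and delimiter, for illustration only.
PATTERN = r'[A-Z][A-Za-z]*(?::[A-Z][A-Za-z]*)+'
DELIMITER = r'[ \t]+'

# Names padded to 34 characters, so every amount starts past the widest name.
aligned_lines = [
    '{:<34}{}'.format('Expenses:Food:Restaurant', '10.00 USD'),
    '{:<34}{}'.format('Expenses:Health:Dental:Insurance', '208.80 USD'),
    '',
    '{:<34}{}'.format('Assets:Cash', '55.00 USD'),
]
aligned = find_column(aligned_lines, PATTERN, DELIMITER)
print(aligned)

# The violation from the source comments: an amount overlaps a longer name.
overlapping_lines = [
    'Expenses:Food:Restaurant' + ' ' * 5 + '10,990.74 USD',
    'Expenses:Health:Dental:Insurance' + ' ' * 3 + '208.80 USD',
]
overlapping = find_column(overlapping_lines, PATTERN, DELIMITER)
print(overlapping)  # None
```

In the aligned input the widest name ends at column 32 and the blank line is simply absent from the matches, which is why callers need to tolerate gaps in the line numbers. In the overlapping input no column qualifies, so the caller gets None rather than a tuple and should check for that before unpacking.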

beancount.tools.treeify.render_tree(root)

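Render a tree of nodes.

Returns:
  • A list of tuples (first_line, continuation_line, node), together with an integer, the width of the new column, where:
      first_line – a string, the first line to render, which includes the account name.
      continuation_line – a string, a further line to render if necessary.
      node – the Node instance that corresponds to this line.
    When the tree produces no lines, the source as listed returns the empty list on its own, without the width.

Source code in beancount/tools/treeify.py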
def render_tree(root):
    """Render a tree of nodes.

    Returns:
      A list of tuples of (first_line, continuation_line, node) where
        first_line: A string, the first line to render, which includes the
          account name.
        continuation_line: A string, further line to render if necessary.
        node: The Node instance which corresponds to this line.
      and an integer, the width of the new columns.
    """
    # Compute all the lines ahead of time in order to calculate the width.
    lines = []

    # Start with the root node. We push the constant prefix before this node,
    # the account name, and the RealAccount instance. We will maintain a stack
    # of children nodes to render.
    stack = [('', root.name, root, True)]
    while stack:
        prefix, name, node, is_last = stack.pop(-1)

        if node is root:
            # For the root node, we don't want to render any prefix.
            first = cont = ''
        else:
            # Compute the string that precedes the name directly and the one below
            # that for the continuation lines.
            #  |
            #  @@@ Bank1    <----------------
            #  @@@ |
            #  |   |-- Checking
            if is_last:
                first = prefix + PREFIX_LEAF_1
                cont = prefix + PREFIX_LEAF_C
            else:
                first = prefix + PREFIX_CHILD_1
                cont = prefix + PREFIX_CHILD_C

        # Compute the name to render for continuation lines.
        #  |
        #  |-- Bank1
        #  |   @@@       <----------------
        #  |   |-- Checking
        if len(node) > 0:
            cont_name = PREFIX_CHILD_C
        else:
            cont_name = PREFIX_LEAF_C

        # Add a line for this account.
        if not (node is root and not name):
            lines.append((first + name,
                          cont + cont_name,
                          node))

        # Push the children onto the stack, being careful with ordering and
        # marking the last node as such.
        if node:
            child_items = reversed(node)
            child_iter = iter(child_items)
            child_node = next(child_iter)
            stack.append((cont, child_node.name, child_node, True))
            for child_node in child_iter:
                stack.append((cont, child_node.name, child_node, False))

    if not lines:
        return lines

    # Compute the maximum width of the lines and convert all of them to the same
    # maximal width. This makes it easy on the client.
    max_width = max(len(first_line) for first_line, _, __ in lines)
    line_format = '{{:{width}}}'.format(width=max_width)
    return [(line_format.format(first_line),
             line_format.format(cont_line),
             node)
            for (first_line, cont_line, node) in lines], max_width
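
To see the shape of the rendered lines, the following sketch runs a condensed copy of the function on a small hand-built tree. The four `PREFIX_*` constants are not shown in this excerpt; the values below are assumptions chosen to match the `|-- ` and `|   ` fragments visible in the source comments. The Node class is a minimal stand-in with an assumed constructor.

```python
# Assumed values: the real constants are defined outside this excerpt.
PREFIX_CHILD_1 = '|-- '
PREFIX_CHILD_C = '|   '
PREFIX_LEAF_1 = '`-- '
PREFIX_LEAF_C = '    '


class Node(list):
    """Minimal stand-in for treeify's Node (constructor is assumed)."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.nos = []


def render_tree(root):
    lines = []
    stack = [('', root.name, root, True)]
    while stack:
        prefix, name, node, is_last = stack.pop(-1)
        if node is root:
            first = cont = ''
        elif is_last:
            first, cont = prefix + PREFIX_LEAF_1, prefix + PREFIX_LEAF_C
        else:
            first, cont = prefix + PREFIX_CHILD_1, prefix + PREFIX_CHILD_C
        cont_name = PREFIX_CHILD_C if len(node) > 0 else PREFIX_LEAF_C
        if not (node is root and not name):
            lines.append((first + name, cont + cont_name, node))
        if node:
            # Push children in reverse so the first child is popped first;
            # the first one pushed is the last sibling.
            child_iter = iter(reversed(node))
            child_node = next(child_iter)
            stack.append((cont, child_node.name, child_node, True))
            for child_node in child_iter:
                stack.append((cont, child_node.name, child_node, False))
    if not lines:
        return lines
    max_width = max(len(first_line) for first_line, _, __ in lines)
    line_format = '{{:{width}}}'.format(width=max_width)
    return [(line_format.format(first_line), line_format.format(cont_line), node)
            for (first_line, cont_line, node) in lines], max_width


root = Node('')
assets, bank = Node('Assets'), Node('Bank')
bank.append(Node('Checking'))
assets.extend([bank, Node('Cash')])
root.extend([assets, Node('Expenses')])

rendered, width = render_tree(root)
for first_line, _, _ in rendered:
    print(first_line.rstrip())
# |-- Assets
# |   |-- Bank
# |   |   `-- Checking
# |   `-- Cash
# `-- Expenses

empty_result = render_tree(Node(''))  # A bare empty list, not a tuple.
```

With these assumed prefixes the width comes out as 20, the length of the longest first line, and every returned string is padded to that width. The last call shows the asymmetric return for an empty tree, which a caller would need to handle before unpacking into two names.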