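CAUTION
This article is based on LLVM 15.0.7. Some details may differ in other versions, so watch for the differences.

Clang Tooling and LibTooling

As LLVM's default front end for the C family of languages, Clang provides a powerful compiler infrastructure and a rich set of APIs, which lets developers build all kinds of Clang-based tools. Clang Tooling is an umbrella term for the whole ecosystem of building tools on top of Clang, as shown in the figure below.

LibTooling is a core library within Clang Tooling. It makes the Clang front-end modules reusable, so developers can focus on their tool's own logic without worrying about the compiler's implementation details. The Clang front end already contains complete lexing, parsing, semantic analysis, and type-system modules. What LibTooling does is expose these modules, which originally served only the compilation pipeline, so that external tools can reuse them directly instead of reimplementing C++ parsing.

Taking an AST-based FrontendAction as an example, the figure below shows the Clang front-end architecture and the point where LibTooling actually hooks in. "Hooking in" does not mean that LibTooling skips the earlier stages. It means that LibTooling reuses the entire front-end pipeline, takes control once the AST has been built, and exits before CodeGen.

LibTooling Core Architecture

An external tool that wants to reuse the Clang front end faces several questions. First, who drives compilation across multiple files? Clang-based tools usually process many source files, so something has to drive the compilation of each one. Second, how is a single compilation assembled and started? Clang compiles one source file at a time, so the tool needs a way to assemble that pipeline. Third, how is control handed over? Once the front end has done its work, when and how does the tool take control to run its analysis or transformation logic? Fourth, how is the tool's own logic supported? What interfaces must LibTooling offer so that a tool can implement it?

LibTooling's core architecture is organized around exactly these questions, as shown in the figure below. We will go through the core components of LibTooling one by one, along with how they cooperate.

ClangTool: The Tool Driver

Within LibTooling, ClangTool can be seen as the scheduler of a tool's run. Its role resembles that of the clang driver in a normal compilation: it reads the compilation database, builds the compile arguments, and drives a FrontendAction over a set of source files.

The difference is that the clang driver drives the complete compilation pipeline, while ClangTool only drives the Clang front end. Here is the core definition of ClangTool: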

class ClangTool {
public:
  ClangTool(const CompilationDatabase &Compilations,
            ArrayRef<std::string> SourcePaths,
            std::shared_ptr<PCHContainerOperations> PCHContainerOps =
                std::make_shared<PCHContainerOperations>(),
            IntrusiveRefCntPtr<llvm::vfs::FileSystem> BaseFS =
                llvm::vfs::getRealFileSystem(),
            IntrusiveRefCntPtr<FileManager> Files = nullptr);

  ~ClangTool();

  void mapVirtualFile(StringRef FilePath, StringRef Content);

  void appendArgumentsAdjuster(ArgumentsAdjuster Adjuster);

  int run(ToolAction *Action);

  int buildASTs(std::vector<std::unique_ptr<ASTUnit>> &ASTs);
};

The ClangTool() Constructor

Let's focus on the constructor's signature:

ClangTool(const CompilationDatabase &Compilations,
          ArrayRef<std::string> SourcePaths,
          std::shared_ptr<PCHContainerOperations> PCHContainerOps =
              std::make_shared<PCHContainerOperations>(),
          IntrusiveRefCntPtr<llvm::vfs::FileSystem> BaseFS =
              llvm::vfs::getRealFileSystem(),
          IntrusiveRefCntPtr<FileManager> Files = nullptr);

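As the core scheduler of LibTooling, the ClangTool constructor takes several parameters. The two most important are Compilations and SourcePaths.

Compilations is a reference to a CompilationDatabase object. CompilationDatabase is Clang's abstraction of a compilation database and encapsulates the compile commands, compile options, and build environment. The compilation database should be familiar by now, since we covered it in the clangd introduction. SourcePaths, as the name suggests, holds the list of source file paths to process. For each file in SourcePaths, ClangTool looks up the matching compile commands and options in Compilations and then drives the Clang front end to process the file.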
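
To make the lookup concrete, here is a minimal sketch of the relationship between source paths and compile commands. It uses only the standard library. ToyCompileCommand and ToyCompilationDatabase are hypothetical stand-ins for illustration, not the real Clang classes, and the field names merely echo the ones that appear in the ClangTool::run() source.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a compile command entry.
struct ToyCompileCommand {
  std::string Directory;                // working directory of the command
  std::string Filename;                 // source file being compiled
  std::vector<std::string> CommandLine; // argv, starting with the compiler
};

// Hypothetical stand-in for a compilation database: file path -> commands.
class ToyCompilationDatabase {
  std::map<std::string, std::vector<ToyCompileCommand>> Commands;

public:
  void add(ToyCompileCommand Cmd) {
    std::vector<ToyCompileCommand> &Slot = Commands[Cmd.Filename];
    Slot.push_back(std::move(Cmd));
  }

  // Returns every command recorded for File; empty if the file is unknown.
  std::vector<ToyCompileCommand>
  getCompileCommands(const std::string &File) const {
    auto It = Commands.find(File);
    if (It == Commands.end())
      return {};
    return It->second;
  }
};
```

The sketch returns a vector on purpose: one file may legitimately map to several commands, and an empty result is how the caller learns that a file is unknown to the database.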

The ClangTool::run() Method

The core logic of ClangTool lives in its run() method:

int ClangTool::run(ToolAction *Action) {
  static int StaticSymbol;
  if (SeenWorkingDirectories.insert("/").second)
    for (const auto &MappedFile : MappedFileContents)
      if (llvm::sys::path::is_absolute(MappedFile.first))
        InMemoryFileSystem->addFile(
            MappedFile.first, 0,
            llvm::MemoryBuffer::getMemBuffer(MappedFile.second));

  bool ProcessingFailed = false;
  bool FileSkipped = false;
  std::vector<std::string> AbsolutePaths;
  AbsolutePaths.reserve(SourcePaths.size());
  for (const auto &SourcePath : SourcePaths) {
    auto AbsPath = getAbsolutePath(*OverlayFileSystem, SourcePath);
    if (!AbsPath) {
      llvm::errs() << "Skipping " << SourcePath
                   << ". Error while getting an absolute path: "
                   << llvm::toString(AbsPath.takeError()) << "\n";
      continue;
    }
    AbsolutePaths.push_back(std::move(*AbsPath));
  }
  std::string InitialWorkingDir;
  if (RestoreCWD) {
    if (auto CWD = OverlayFileSystem->getCurrentWorkingDirectory()) {
      InitialWorkingDir = std::move(*CWD);
    } else {
      llvm::errs() << "Could not get working directory: "
                   << CWD.getError().message() << "\n";
    }
  }

  for (llvm::StringRef File : AbsolutePaths) {
    std::vector<CompileCommand> CompileCommandsForFile =
        Compilations.getCompileCommands(File);
    if (CompileCommandsForFile.empty()) {
      llvm::errs() << "Skipping " << File << ". Compile command not found.\n";
      FileSkipped = true;
      continue;
    }
    for (CompileCommand &CompileCommand : CompileCommandsForFile) {
      if (OverlayFileSystem->setCurrentWorkingDirectory(
              CompileCommand.Directory))
        llvm::report_fatal_error("Cannot chdir into \"" +
                                 Twine(CompileCommand.Directory) + "\"!");

      if (SeenWorkingDirectories.insert(CompileCommand.Directory).second)
        for (const auto &MappedFile : MappedFileContents)
          if (!llvm::sys::path::is_absolute(MappedFile.first))
            InMemoryFileSystem->addFile(
                MappedFile.first, 0,
                llvm::MemoryBuffer::getMemBuffer(MappedFile.second));

      std::vector<std::string> CommandLine = CompileCommand.CommandLine;
      if (ArgsAdjuster)
        CommandLine = ArgsAdjuster(CommandLine, CompileCommand.Filename);
      assert(!CommandLine.empty());
      injectResourceDir(CommandLine, "clang_tool", &StaticSymbol);
      LLVM_DEBUG({ llvm::dbgs() << "Processing: " << File << ".\n"; });
      ToolInvocation Invocation(std::move(CommandLine), Action, Files.get(),
                                PCHContainerOps);
      Invocation.setDiagnosticConsumer(DiagConsumer);

      if (!Invocation.run()) {
        // FIXME: Diagnostics should be used instead.
        if (PrintErrorMessage)
          llvm::errs() << "Error while processing " << File << ".\n";
        ProcessingFailed = true;
      }
    }
  }

  if (!InitialWorkingDir.empty()) {
    if (auto EC =
            OverlayFileSystem->setCurrentWorkingDirectory(InitialWorkingDir))
      llvm::errs() << "Error when trying to restore working dir: "
                   << EC.message() << "\n";
  }
  return ProcessingFailed ? 1 : (FileSkipped ? 2 : 0);
}

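Overall, run() is the core logic of ClangTool and indeed of LibTooling. It fetches the compile commands for each source file from Compilations, constructs a ToolInvocation for each command, and finally drives the FrontendAction through that ToolInvocation. Let's walk through the implementation step by step.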

if (SeenWorkingDirectories.insert("/").second)
  for (const auto &MappedFile : MappedFileContents)
    if (llvm::sys::path::is_absolute(MappedFile.first))
      InMemoryFileSystem->addFile(
          MappedFile.first, 0,
          llvm::MemoryBuffer::getMemBuffer(MappedFile.second));

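First, SeenWorkingDirectories records the working directories that have already been visited. Its type is llvm::StringSet<>, a string set implemented on top of a map. insert() returns a pair whose second member tells whether the insertion took place, and a successful insertion means the directory had not been visited before.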
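
The same "insert and test .second" idiom exists in the standard library. The sketch below uses std::set<std::string> as a stand-in for llvm::StringSet<>, and markFirstVisit is a hypothetical helper name introduced only for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>

// Returns true only the first time Dir is seen, mirroring the
// `SeenWorkingDirectories.insert(Dir).second` check with std::set.
bool markFirstVisit(std::set<std::string> &Seen, const std::string &Dir) {
  // insert() returns {iterator, bool}; the bool is true when Dir was new.
  return Seen.insert(Dir).second;
}
```

Combining the membership test and the insertion in one call avoids a separate lookup and keeps the "do this once per directory" logic to a single condition.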

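ClangTool supports a virtual file system (VFS): a tool can create files in memory and map them into the file system that Clang sees. The logic here is that any virtual file in MappedFileContents with an absolute path is added to the InMemoryFileSystem right away, because an absolute path does not depend on the current working directory.

IMPORTANT

In fact, ClangTool builds an OverlayFileSystem (a layered file system) when it is constructed, as the constructor shows: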

ClangTool::ClangTool(const CompilationDatabase &Compilations,
                     ArrayRef<std::string> SourcePaths,
                     std::shared_ptr<PCHContainerOperations> PCHContainerOps,
                     IntrusiveRefCntPtr<llvm::vfs::FileSystem> BaseFS,
                     IntrusiveRefCntPtr<FileManager> Files)
    : Compilations(Compilations), SourcePaths(SourcePaths),
      PCHContainerOps(std::move(PCHContainerOps)),
      OverlayFileSystem(new llvm::vfs::OverlayFileSystem(std::move(BaseFS))),
      InMemoryFileSystem(new llvm::vfs::InMemoryFileSystem),
      Files(Files ? Files
                  : new FileManager(FileSystemOptions(), OverlayFileSystem)) {
  OverlayFileSystem->pushOverlay(InMemoryFileSystem);
  appendArgumentsAdjuster(getClangStripOutputAdjuster());
  appendArgumentsAdjuster(getClangSyntaxOnlyAdjuster());
  appendArgumentsAdjuster(getClangStripDependencyFileAdjuster());
  if (Files)
    Files->setVirtualFileSystem(OverlayFileSystem);
}

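OverlayFileSystem is effectively a stack of file systems. BaseFS, which is the real file system by default, goes in at the bottom, and the in-memory InMemoryFileSystem is then pushed on top of it: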

OverlayFileSystem
 ├── InMemoryFileSystem
 └── RealFileSystem

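As a result, file lookups during run() consult the InMemoryFileSystem first, so mapped virtual files take precedence. A tool that maps no virtual files simply falls through to the real file system and processes the source files on disk.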
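
The layering can be modeled in a few lines. This is a simplified sketch under stated assumptions: ToyLayer and ToyOverlayFS are hypothetical names, a "file system" is reduced to a path-to-contents map, and only lookup order is modeled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical one-layer file system: path -> contents.
using ToyLayer = std::map<std::string, std::string>;

// Hypothetical model of an overlay: layers pushed later shadow earlier ones.
class ToyOverlayFS {
  std::vector<const ToyLayer *> Layers; // Layers.front() is the base layer

public:
  explicit ToyOverlayFS(const ToyLayer &Base) { Layers.push_back(&Base); }

  void pushOverlay(const ToyLayer &Layer) { Layers.push_back(&Layer); }

  // Searches from the most recently pushed layer down to the base.
  // Returns nullptr when no layer contains Path.
  const std::string *read(const std::string &Path) const {
    for (auto It = Layers.rbegin(); It != Layers.rend(); ++It) {
      auto Found = (*It)->find(Path);
      if (Found != (*It)->end())
        return &Found->second;
    }
    return nullptr;
  }
};
```

The overlay stores pointers to its layers rather than copies, so a file added to the in-memory layer after pushOverlay() is still visible. That matches the order of operations in the text, where the in-memory layer is pushed in the constructor and populated later in run().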

Now let's focus on the core logic of run().

for (llvm::StringRef File : AbsolutePaths) {
  std::vector<CompileCommand> CompileCommandsForFile =
      Compilations.getCompileCommands(File);
  if (CompileCommandsForFile.empty()) {
    llvm::errs() << "Skipping " << File << ". Compile command not found.\n";
    FileSkipped = true;
    continue;
  }
  for (CompileCommand &CompileCommand : CompileCommandsForFile) {
    if (OverlayFileSystem->setCurrentWorkingDirectory(
            CompileCommand.Directory))
      llvm::report_fatal_error("Cannot chdir into \"" +
                               Twine(CompileCommand.Directory) + "\"!");

    if (SeenWorkingDirectories.insert(CompileCommand.Directory).second)
      for (const auto &MappedFile : MappedFileContents)
        if (!llvm::sys::path::is_absolute(MappedFile.first))
          InMemoryFileSystem->addFile(
              MappedFile.first, 0,
              llvm::MemoryBuffer::getMemBuffer(MappedFile.second));

    std::vector<std::string> CommandLine = CompileCommand.CommandLine;
    if (ArgsAdjuster)
      CommandLine = ArgsAdjuster(CommandLine, CompileCommand.Filename);
    assert(!CommandLine.empty());
    injectResourceDir(CommandLine, "clang_tool", &StaticSymbol);
    LLVM_DEBUG({ llvm::dbgs() << "Processing: " << File << ".\n"; });
    ToolInvocation Invocation(std::move(CommandLine), Action, Files.get(),
                              PCHContainerOps);
    Invocation.setDiagnosticConsumer(DiagConsumer);

    if (!Invocation.run()) {
      if (PrintErrorMessage)
        llvm::errs() << "Error while processing " << File << ".\n";
      ProcessingFailed = true;
    }
  }
}

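ClangTool iterates over every source file in SourcePaths and first calls getCompileCommands() to get the list of compile commands for that file. A CompilationDatabase may return several commands, because the same source file can be compiled in different configurations (for example, with different options). If no compile command is found, the file is skipped and FileSkipped is set.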
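
The control flow and the resulting exit code can be modeled without any Clang types. In this sketch, runAll is a hypothetical function, Commands stands in for the compilation database, and RunOne stands in for ToolInvocation::run(). Only the skip/fail bookkeeping is taken from the source.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of the loop in ClangTool::run().
int runAll(const std::vector<std::string> &Files,
           const std::map<std::string, std::vector<std::string>> &Commands,
           const std::function<bool(const std::string &)> &RunOne) {
  assert(RunOne && "RunOne must be callable");
  bool ProcessingFailed = false;
  bool FileSkipped = false;
  for (const std::string &File : Files) {
    auto It = Commands.find(File);
    if (It == Commands.end() || It->second.empty()) {
      FileSkipped = true; // no compile command: skip, but keep going
      continue;
    }
    for (const std::string &CommandLine : It->second)
      if (!RunOne(CommandLine))
        ProcessingFailed = true; // remember the failure, keep processing
  }
  // Same priority as the original: a failure (1) outranks a skip (2).
  return ProcessingFailed ? 1 : (FileSkipped ? 2 : 0);
}
```

Neither a skipped file nor a failed invocation stops the loop. Every file is attempted, and the return value only summarizes what happened.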

if (OverlayFileSystem->setCurrentWorkingDirectory(
        CompileCommand.Directory))
  llvm::report_fatal_error("Cannot chdir into \"" +
                           Twine(CompileCommand.Directory) + "\"!");

if (SeenWorkingDirectories.insert(CompileCommand.Directory).second)
  for (const auto &MappedFile : MappedFileContents)
    if (!llvm::sys::path::is_absolute(MappedFile.first))
      InMemoryFileSystem->addFile(
          MappedFile.first, 0,
          llvm::MemoryBuffer::getMemBuffer(MappedFile.second));

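For each compile command, ClangTool first switches the current working directory to the one the command specifies, which keeps later file accesses consistent with the original build environment. Then, if this working directory is being visited for the first time, it also adds the virtual files with relative paths to the InMemoryFileSystem (those with absolute paths were handled earlier).

Next, ClangTool takes the argument list from the compile command and passes it through ArgsAdjuster, if one is set. ArgsAdjuster is a function object that can modify the compile arguments, for example by adding or removing options. LibTooling ships several common adjusters: getClangStripOutputAdjuster() removes output-related arguments, and getClangSyntaxOnlyAdjuster() adds -fsyntax-only so that the front end only checks the code and generates no output. The ClangTool constructor installs both by default.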
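
A small model shows how chained adjusters behave. This is a sketch, assuming a simplified adjuster signature: ToyAdjuster takes only the argument list, whereas the real ArgsAdjuster call in run() also receives the file name. stripOutput, addSyntaxOnly, and combine are hypothetical helpers written in the spirit of the built-in adjusters, not their actual implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Args = std::vector<std::string>;
// Hypothetical simplification of an arguments adjuster.
using ToyAdjuster = std::function<Args(const Args &)>;

// Drops "-o <file>" pairs.
Args stripOutput(const Args &In) {
  Args Out;
  for (std::size_t I = 0; I < In.size(); ++I) {
    if (In[I] == "-o") {
      ++I; // also skip the output file name that follows
      continue;
    }
    Out.push_back(In[I]);
  }
  return Out;
}

// Appends -fsyntax-only.
Args addSyntaxOnly(const Args &In) {
  Args Out = In;
  Out.push_back("-fsyntax-only");
  return Out;
}

// Chains two adjusters: Second sees the output of First.
// An empty adjuster on either side is simply ignored.
ToyAdjuster combine(ToyAdjuster First, ToyAdjuster Second) {
  if (!First)
    return Second;
  if (!Second)
    return First;
  return [First, Second](const Args &In) { return Second(First(In)); };
}
```

Because each appended adjuster wraps the previous chain, the order of the append calls is the order in which the arguments are rewritten.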

ToolInvocation Invocation(std::move(CommandLine), Action, Files.get(),
                          PCHContainerOps);
Invocation.setDiagnosticConsumer(DiagConsumer);

if (!Invocation.run()) {
  if (PrintErrorMessage)
    llvm::errs() << "Error while processing " << File << ".\n";
  ProcessingFailed = true;
}

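ClangTool drives the execution of the FrontendAction through ToolInvocation. ToolInvocation is an important LibTooling class that bundles the compile command, the front-end action, and the related environment. Calling Invocation.run() starts the Clang front-end pipeline and hands control to the FrontendAction, which carries the tool's own logic.

ToolInvocation: The Bridge from Compile Command to Front-End Action

ToolInvocation is the component that bridges a compile command and the execution of a front-end action. Before the action can actually run, ToolInvocation has some preparation to do. The most important step is building a CompilerInvocation from the compile command: an object that holds the compile arguments and other options describing the execution environment, which is everything the front-end pipeline needs.

The core definition of ToolInvocation is as follows: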

class ToolInvocation {
public:
  ToolInvocation(std::vector<std::string> CommandLine,
                 std::unique_ptr<FrontendAction> FAction, FileManager *Files,
                 std::shared_ptr<PCHContainerOperations> PCHContainerOps =
                     std::make_shared<PCHContainerOperations>());

  ToolInvocation(std::vector<std::string> CommandLine, ToolAction *Action,
                 FileManager *Files,
                 std::shared_ptr<PCHContainerOperations> PCHContainerOps);

  ~ToolInvocation();

  bool run();

private:
  bool runInvocation(const char *BinaryName,
                     driver::Compilation *Compilation,
                     std::shared_ptr<CompilerInvocation> Invocation,
                     std::shared_ptr<PCHContainerOperations> PCHContainerOps);

  std::vector<std::string> CommandLine;
  ToolAction *Action;
  bool OwnsAction;
  FileManager *Files;
  std::shared_ptr<PCHContainerOperations> PCHContainerOps;
  DiagnosticConsumer *DiagConsumer = nullptr;
  DiagnosticOptions *DiagOpts = nullptr;
};

The ToolInvocation Constructors

Let's look at ToolInvocation's two constructors. Yes, there are two.

ToolInvocation(std::vector<std::string> CommandLine,
               std::unique_ptr<FrontendAction> FAction, FileManager *Files,
               std::shared_ptr<PCHContainerOperations> PCHContainerOps =
                   std::make_shared<PCHContainerOperations>());

ToolInvocation(std::vector<std::string> CommandLine, ToolAction *Action,
               FileManager *Files,
               std::shared_ptr<PCHContainerOperations> PCHContainerOps);

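The main difference between them is that the first takes a smart pointer to a FrontendAction directly, while the second takes a ToolAction pointer. This is a good time to introduce ToolAction. If you still remember ClangTool::run(), you will notice that it takes exactly such a ToolAction pointer as its parameter.

The Factory Method Pattern: FrontendActionFactory and FrontendAction

ToolAction is defined as follows: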

class ToolAction {
public:
  virtual ~ToolAction();

  /// Perform an action for an invocation.
  virtual bool
  runInvocation(std::shared_ptr<CompilerInvocation> Invocation,
                FileManager *Files,
                std::shared_ptr<PCHContainerOperations> PCHContainerOps,
                DiagnosticConsumer *DiagConsumer) = 0;
};

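As you can see, ToolAction is an abstract base class. It effectively serves as the base interface of the factory hierarchy and declares the pure virtual function runInvocation(). The abstract factory FrontendActionFactory derives from ToolAction and implements runInvocation() to create and run a FrontendAction: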

class FrontendActionFactory : public ToolAction {
public:
  ~FrontendActionFactory() override;

  /// Invokes the compiler with a FrontendAction created by create().
  bool runInvocation(std::shared_ptr<CompilerInvocation> Invocation,
                     FileManager *Files,
                     std::shared_ptr<PCHContainerOperations> PCHContainerOps,
                     DiagnosticConsumer *DiagConsumer) override;

  /// Returns a new clang::FrontendAction.
  virtual std::unique_ptr<FrontendAction> create() = 0;
};

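FrontendActionFactory is itself an abstract base class. It declares the pure virtual function create(), which produces instances of a concrete FrontendAction. This is the classic factory method pattern: a derived class inherits from FrontendActionFactory and implements the factory method create() to build a concrete FrontendAction.

create() returns a smart pointer to FrontendAction, and as you may have guessed, FrontendAction is an abstract base class too. The different front-end actions, such as ASTFrontendAction and SyntaxOnlyAction, all derive from it: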

class ASTFrontendAction : public FrontendAction {
protected:
  void ExecuteAction() override;

public:
  ASTFrontendAction() {}
  bool usesPreprocessorOnly() const override { return false; }
};

LibTooling provides a helper function template for creating a concrete FrontendActionFactory:

template <typename T>
std::unique_ptr<FrontendActionFactory> newFrontendActionFactory() {
  class SimpleFrontendActionFactory : public FrontendActionFactory {
  public:
    std::unique_ptr<FrontendAction> create() override {
      return std::make_unique<T>();
    }
  };

  return std::unique_ptr<FrontendActionFactory>(
      new SimpleFrontendActionFactory);
}

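This is a simple function template. It takes a FrontendAction subclass T as its template parameter and defines a local factory class, SimpleFrontendActionFactory, which derives from FrontendActionFactory and implements create() to build a FrontendAction of type T. Since the function returns a std::unique_ptr to the factory, calling newFrontendActionFactory<T>()->create() yields a FrontendAction instance of type T.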
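
The pattern is easy to reproduce in isolation. The sketch below uses hypothetical ToyAction, ToyActionFactory, newToyActionFactory, and SyntaxCheck types that only mirror the shape of the real classes, so that it compiles with the standard library alone.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for FrontendAction.
struct ToyAction {
  virtual ~ToyAction() = default;
  virtual std::string name() const = 0;
};

// Hypothetical stand-in for FrontendActionFactory.
struct ToyActionFactory {
  virtual ~ToyActionFactory() = default;
  virtual std::unique_ptr<ToyAction> create() = 0; // the factory method
};

// Same shape as newFrontendActionFactory<T>(): a local class implements
// create() for the requested action type T.
template <typename T>
std::unique_ptr<ToyActionFactory> newToyActionFactory() {
  class SimpleFactory : public ToyActionFactory {
  public:
    std::unique_ptr<ToyAction> create() override {
      return std::unique_ptr<ToyAction>(new T());
    }
  };
  return std::unique_ptr<ToyActionFactory>(new SimpleFactory);
}

// A concrete action, written by the "tool author".
struct SyntaxCheck : ToyAction {
  std::string name() const override { return "syntax-check"; }
};
```

Each call to create() produces a fresh object. That property is what lets one factory serve many compile commands, with every invocation getting its own action instance.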

NOTE

There is in fact a second overload of newFrontendActionFactory():

template <typename FactoryT>
inline std::unique_ptr<FrontendActionFactory> newFrontendActionFactory(
    FactoryT *ConsumerFactory, SourceFileCallbacks *Callbacks) {
  class FrontendActionFactoryAdapter : public FrontendActionFactory {
  public:
    explicit FrontendActionFactoryAdapter(FactoryT *ConsumerFactory,
                                          SourceFileCallbacks *Callbacks)
        : ConsumerFactory(ConsumerFactory), Callbacks(Callbacks) {}

    std::unique_ptr<FrontendAction> create() override {
      return std::make_unique<ConsumerFactoryAdaptor>(ConsumerFactory,
                                                      Callbacks);
    }

  private:
    class ConsumerFactoryAdaptor : public ASTFrontendAction {
    public:
      ConsumerFactoryAdaptor(FactoryT *ConsumerFactory,
                             SourceFileCallbacks *Callbacks)
          : ConsumerFactory(ConsumerFactory), Callbacks(Callbacks) {}

      std::unique_ptr<ASTConsumer>
      CreateASTConsumer(CompilerInstance &, StringRef) override {
        return ConsumerFactory->newASTConsumer();
      }

    protected:
      bool BeginSourceFileAction(CompilerInstance &CI) override {
        if (!ASTFrontendAction::BeginSourceFileAction(CI))
          return false;
        if (Callbacks)
          return Callbacks->handleBeginSource(CI);
        return true;
      }

      void EndSourceFileAction() override {
        if (Callbacks)
          Callbacks->handleEndSource();
        ASTFrontendAction::EndSourceFileAction();
      }

    private:
      FactoryT *ConsumerFactory;
      SourceFileCallbacks *Callbacks;
    };
    FactoryT *ConsumerFactory;
    SourceFileCallbacks *Callbacks;
  };

  return std::unique_ptr<FrontendActionFactory>(
      new FrontendActionFactoryAdapter(ConsumerFactory, Callbacks));
}

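Compared with the general-purpose version above, this overload targets the case where you have a factory for ASTConsumer objects rather than a complete FrontendAction.

It defines a FrontendActionFactoryAdapter class that adapts a ConsumerFactory (the ASTConsumer factory) and a SourceFileCallbacks (per-source-file callbacks). In create(), FrontendActionFactoryAdapter builds a ConsumerFactoryAdaptor, which derives from ASTFrontendAction and calls ConsumerFactory in CreateASTConsumer() to create the ASTConsumer. In BeginSourceFileAction() and EndSourceFileAction(), it also calls SourceFileCallbacks to handle the begin and end events of each source file.

With this adapter, a user can create a FrontendAction straight from an ASTConsumer factory without implementing a full FrontendAction class. It is enough to be aware of this part.

The following figure summarizes the relationship between ToolAction, FrontendActionFactory, and FrontendAction, using ASTFrontendAction as the example: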

The Other Constructor

Back to the ToolInvocation constructor that takes a FrontendAction directly:

ToolInvocation::ToolInvocation(
    std::vector<std::string> CommandLine,
    std::unique_ptr<FrontendAction> FAction, FileManager *Files,
    std::shared_ptr<PCHContainerOperations> PCHContainerOps)
    : CommandLine(std::move(CommandLine)),
      Action(new SingleFrontendActionFactory(std::move(FAction))),
      OwnsAction(true), Files(Files),
      PCHContainerOps(std::move(PCHContainerOps)) {}

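To keep a single interface, ToolInvocation internally adapts the FrontendAction-taking constructor to the ToolAction-taking one. Specifically, it creates a SingleFrontendActionFactory that wraps the FrontendAction passed in, and sets OwnsAction to true to record that ToolInvocation is responsible for the lifetime of this Action. SingleFrontendActionFactory is a simple FrontendActionFactory implementation that just returns the FrontendAction it was given: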

// clang/lib/Tooling/Tooling.cpp
class SingleFrontendActionFactory : public FrontendActionFactory {
  std::unique_ptr<FrontendAction> Action;

public:
  SingleFrontendActionFactory(std::unique_ptr<FrontendAction> Action)
      : Action(std::move(Action)) {}

  std::unique_ptr<FrontendAction> create() override {
    return std::move(Action);
  }
};
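
One consequence of returning std::move(Action) is that such a factory is single-use: after the first create(), the stored pointer is empty. The sketch below illustrates this with hypothetical OneShotAction and OneShotFactory types that copy only the ownership logic of the class above.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical action type; the Id field only makes instances recognizable.
struct OneShotAction {
  int Id;
};

// Same ownership shape as SingleFrontendActionFactory: it hands out the
// one action it was constructed with.
class OneShotFactory {
  std::unique_ptr<OneShotAction> Action;

public:
  explicit OneShotFactory(std::unique_ptr<OneShotAction> Action)
      : Action(std::move(Action)) {}

  // Moving from a std::unique_ptr leaves it null, so only the first call
  // returns the action; later calls return an empty pointer.
  std::unique_ptr<OneShotAction> create() { return std::move(Action); }
};
```

This fits the usage shown here, where one ToolInvocation wraps exactly one action and runs it once. A factory meant to serve several invocations has to construct a new action on every create() call instead.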

ToolInvocation::run()

So what exactly does ToolInvocation do to prepare for running the front-end action? Let's analyze the implementation of run():

bool ToolInvocation::run() {
  std::vector<const char*> Argv;
  for (const std::string &Str : CommandLine)
    Argv.push_back(Str.c_str());
  const char *const BinaryName = Argv[0];

  IntrusiveRefCntPtr<DiagnosticOptions> ParsedDiagOpts;
  DiagnosticOptions *DiagOpts = this->DiagOpts;
  if (!DiagOpts) {
    ParsedDiagOpts = CreateAndPopulateDiagOpts(Argv);
    DiagOpts = &*ParsedDiagOpts;
  }

  TextDiagnosticPrinter DiagnosticPrinter(llvm::errs(), DiagOpts);
  IntrusiveRefCntPtr<DiagnosticsEngine> Diagnostics =
      CompilerInstance::createDiagnostics(
          &*DiagOpts, DiagConsumer ? DiagConsumer : &DiagnosticPrinter, false);

  SourceManager SrcMgr(*Diagnostics, *Files);
  Diagnostics->setSourceManager(&SrcMgr);

  const std::unique_ptr<driver::Driver> Driver(
      newDriver(&*Diagnostics, BinaryName, &Files->getVirtualFileSystem()));

  if (!Files->getFileSystemOpts().WorkingDir.empty())
    Driver->setCheckInputsExist(false);
  const std::unique_ptr<driver::Compilation> Compilation(
      Driver->BuildCompilation(llvm::makeArrayRef(Argv)));
  if (!Compilation)
    return false;
  const llvm::opt::ArgStringList *const CC1Args = getCC1Arguments(
      &*Diagnostics, Compilation.get());
  if (!CC1Args)
    return false;
  std::unique_ptr<CompilerInvocation> Invocation(
      newInvocation(&*Diagnostics, *CC1Args, BinaryName));
  return runInvocation(BinaryName, Compilation.get(), std::move(Invocation),
                       std::move(PCHContainerOps));
}

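Remember how ClangTool calls ToolInvocation? In ClangTool::run(), ClangTool constructs a ToolInvocation for each compile command of each source file, passes that command in as a constructor argument, and calls the invocation's run() method. A source file usually has exactly one compile command, so in the common case one source file corresponds to one ToolInvocation instance.

ToolInvocation::run() performs some preparation: it converts the compile command into a list of C-style strings, creates a DiagnosticsEngine to handle diagnostics, and so on. The most important part is the driver-related code. ToolInvocation uses the Clang Driver to build a Compilation object, extracts the CC1 arguments from that Compilation, and uses them to construct a CompilerInvocation. The CompilerInvocation holds the compile arguments and other options describing the execution environment, which is everything the front-end pipeline needs. Finally, ToolInvocation calls runInvocation() to execute the front-end action. CC1 should be familiar by now: it is simply the entry point of the Clang front end.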
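
The first step, turning the stored command line into a C-style argument vector, deserves a closer look because the resulting pointers are borrowed. The following is a minimal sketch. toArgv is a hypothetical helper name, and the assertion on a non-empty command line is an added precondition for the sketch, since element 0 is later used as the binary name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds a C-style argv view over CommandLine. The pointers are borrowed:
// they stay valid only while CommandLine is alive and unmodified.
std::vector<const char *> toArgv(const std::vector<std::string> &CommandLine) {
  assert(!CommandLine.empty() && "argv[0] must name the binary");
  std::vector<const char *> Argv;
  Argv.reserve(CommandLine.size());
  for (const std::string &Str : CommandLine)
    Argv.push_back(Str.c_str());
  return Argv;
}
```

No string data is copied. This is safe in ToolInvocation::run() because CommandLine is a member of the ToolInvocation object and therefore outlives the local Argv vector.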

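IMPORTANT
Do not confuse Compilation with CompilerInstance. The former is a descriptive object built by the Driver that represents the whole compilation process. It contains the commands and arguments for each stage (preprocessing, compilation, assembly, and linking) but does not perform any compilation itself. The latter is the core object with which the Clang front end represents the actual compilation context and environment, and it holds the components and state of a front-end run.

ToolInvocation::runInvocation()

ToolInvocation::run() ends by calling its private member function runInvocation(). Let's look at how runInvocation() is implemented: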

bool ToolInvocation::runInvocation(
    const char *BinaryName, driver::Compilation *Compilation,
    std::shared_ptr<CompilerInvocation> Invocation,
    std::shared_ptr<PCHContainerOperations> PCHContainerOps) {

  if (Invocation->getHeaderSearchOpts().Verbose) {
    llvm::errs() << "clang Invocation:\n";
    Compilation->getJobs().Print(llvm::errs(), "\n", true);
    llvm::errs() << "\n";
  }

  return Action->runInvocation(std::move(Invocation), Files,
                               std::move(PCHContainerOps), DiagConsumer);
}

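The logic we expected for executing the front-end action is not here. The if statement only produces debug output and can be ignored. ToolInvocation::runInvocation() simply calls Action->runInvocation(). Keep in mind that Action is a ToolAction pointer. As analyzed earlier, ToolAction is an abstract base class that declares the pure virtual runInvocation(), and FrontendActionFactory derives from ToolAction and implements it. So, perhaps surprisingly, the execution of the front-end action is ultimately delegated to FrontendActionFactory.

ToolAction::runInvocation(): The Real Front-End Action Execution Logic

Let's look at the implementation of FrontendActionFactory::runInvocation():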

bool FrontendActionFactory::runInvocation(
    std::shared_ptr<CompilerInvocation> Invocation, FileManager *Files,
    std::shared_ptr<PCHContainerOperations> PCHContainerOps,
    DiagnosticConsumer *DiagConsumer) {

  CompilerInstance Compiler(std::move(PCHContainerOps));
  Compiler.setInvocation(std::move(Invocation));
  Compiler.setFileManager(Files);

  std::unique_ptr<FrontendAction> ScopedToolAction(create());

  Compiler.createDiagnostics(DiagConsumer, /*ShouldOwnClient=*/false);
  if (!Compiler.hasDiagnostics())
    return false;

  Compiler.createSourceManager(*Files);

  const bool Success = Compiler.ExecuteAction(*ScopedToolAction);

  Files->clearStatCache();
  return Success;
}

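At last we see the execution logic we were looking for. FrontendActionFactory::runInvocation() first builds a CompilerInstance from the CompilerInvocation passed in and sets up the related environment. It then calls create() to build the concrete FrontendAction. create() is a pure virtual function implemented by the concrete FrontendActionFactory subclass. Finally, it calls Compiler.ExecuteAction() to run the front-end action, and everything falls into place.

CompilerInstance: The Execution Environment for Front-End Actions

CompilerInstance is the core object with which the Clang front end represents the actual compilation context and environment. It holds the components and state of a front-end run, so a front-end action executes inside a complete compilation environment and can reuse all of the front end's functionality. The core definition of CompilerInstance is as follows: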

class CompilerInstance : public ModuleLoader {
  /// The options used in this compiler instance.
  std::shared_ptr<CompilerInvocation> Invocation;
  /// The diagnostics engine instance.
  IntrusiveRefCntPtr<DiagnosticsEngine> Diagnostics;
  /// The target being compiled for.
  IntrusiveRefCntPtr<TargetInfo> Target;
  /// Auxiliary Target info.
  IntrusiveRefCntPtr<TargetInfo> AuxTarget;
  /// The file manager.
  IntrusiveRefCntPtr<FileManager> FileMgr;
  /// The source manager.
  IntrusiveRefCntPtr<SourceManager> SourceMgr;
  /// The cache of PCM files.
  IntrusiveRefCntPtr<InMemoryModuleCache> ModuleCache;
  /// The preprocessor.
  std::shared_ptr<Preprocessor> PP;
  /// The AST context.
  IntrusiveRefCntPtr<ASTContext> Context;
  /// An optional sema source that will be attached to sema.
  IntrusiveRefCntPtr<ExternalSemaSource> ExternalSemaSrc;
  /// The AST consumer.
  std::unique_ptr<ASTConsumer> Consumer;
  /// The code completion consumer.
  std::unique_ptr<CodeCompleteConsumer> CompletionConsumer;
  /// The semantic analysis object.
  std::unique_ptr<Sema> TheSema;
  /// ......

public:
  explicit CompilerInstance(
      std::shared_ptr<PCHContainerOperations> PCHContainerOps =
          std::make_shared<PCHContainerOperations>(),
      InMemoryModuleCache *SharedModuleCache = nullptr);
  ~CompilerInstance() override;

  bool ExecuteAction(FrontendAction &Act);
  void createDiagnostics(DiagnosticConsumer *Client = nullptr,
                         bool ShouldOwnClient = true);

  static IntrusiveRefCntPtr<DiagnosticsEngine>
  createDiagnostics(DiagnosticOptions *Opts,
                    DiagnosticConsumer *Client = nullptr,
                    bool ShouldOwnClient = true,
                    const CodeGenOptions *CodeGenOpts = nullptr);

  FileManager *
  createFileManager(IntrusiveRefCntPtr<llvm::vfs::FileSystem> VFS = nullptr);

  void createSourceManager(FileManager &FileMgr);

  void createPreprocessor(TranslationUnitKind TUKind);

  void createASTContext();

  void createPCHExternalASTSource(
      StringRef Path, DisableValidationForModuleKind DisableValidation,
      bool AllowPCHWithCompilerErrors, void *DeserializationListener,
      bool OwnDeserializationListener);

  static IntrusiveRefCntPtr<ASTReader> createPCHExternalASTSource(
      StringRef Path, StringRef Sysroot,
      DisableValidationForModuleKind DisableValidation,
      bool AllowPCHWithCompilerErrors, Preprocessor &PP,
      InMemoryModuleCache &ModuleCache, ASTContext &Context,
      const PCHContainerReader &PCHContainerRdr,
      ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,
      ArrayRef<std::shared_ptr<DependencyCollector>> DependencyCollectors,
      void *DeserializationListener, bool OwnDeserializationListener,
      bool Preamble, bool UseGlobalModuleIndex);

  void createCodeCompletionConsumer();

  static CodeCompleteConsumer *createCodeCompletionConsumer(
      Preprocessor &PP, StringRef Filename, unsigned Line, unsigned Column,
      const CodeCompleteOptions &Opts, raw_ostream &OS);

  void createSema(TranslationUnitKind TUKind,
                  CodeCompleteConsumer *CompletionConsumer);
  /// ......
};

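The definition above makes it clear that CompilerInstance acts as a container for the execution environment. It offers a uniform interface for accessing and managing front-end components such as Preprocessor, Sema, and ASTContext (as in the Clang front-end architecture shown earlier), and it runs front-end actions through ExecuteAction(). For LibTooling, what matters is how CompilerInstance drives the front-end action, that is, the implementation of CompilerInstance::ExecuteAction().

CompilerInstance::ExecuteAction()

CompilerInstance::ExecuteAction() is defined as follows: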

bool CompilerInstance::ExecuteAction(FrontendAction &Act) {
  assert(hasDiagnostics() && "Diagnostics engine is not initialized!");
  assert(!getFrontendOpts().ShowHelp && "Client must handle '-help'!");
  assert(!getFrontendOpts().ShowVersion && "Client must handle '-version'!");

  noteBottomOfStack();

  auto FinishDiagnosticClient = llvm::make_scope_exit([&]() {
    getDiagnosticClient().finish();
  });

  raw_ostream &OS = getVerboseOutputStream();

  if (!Act.PrepareToExecute(*this))
    return false;

  if (!createTarget())
    return false;

  if (getFrontendOpts().ProgramAction == frontend::RewriteObjC)
    getTarget().noSignedCharForObjCBool();

  if (getHeaderSearchOpts().Verbose)
    OS << "clang -cc1 version " CLANG_VERSION_STRING
       << " based upon " << BACKEND_PACKAGE_STRING
       << " default target " << llvm::sys::getDefaultTargetTriple() << "\n";

  if (getCodeGenOpts().TimePasses)
    createFrontendTimer();

  if (getFrontendOpts().ShowStats || !getFrontendOpts().StatsFile.empty())
    llvm::EnableStatistics(false);

  for (const FrontendInputFile &FIF : getFrontendOpts().Inputs) {
    if (hasSourceManager() && !Act.isModelParsingAction())
      getSourceManager().clearIDTables();

    if (Act.BeginSourceFile(*this, FIF)) {
      if (llvm::Error Err = Act.Execute()) {
        consumeError(std::move(Err));
      }
      Act.EndSourceFile();
    }
  }

  if (getDiagnosticOpts().ShowCarets) {
    unsigned NumWarnings = getDiagnostics().getClient()->getNumWarnings();
    unsigned NumErrors = getDiagnostics().getClient()->getNumErrors();

    if (NumWarnings)
      OS << NumWarnings << " warning" << (NumWarnings == 1 ? "" : "s");
    if (NumWarnings && NumErrors)
      OS << " and ";
    if (NumErrors)
      OS << NumErrors << " error" << (NumErrors == 1 ? "" : "s");
    if (NumWarnings || NumErrors) {
      OS << " generated";
      if (getLangOpts().CUDA) {
        if (!getLangOpts().CUDAIsDevice) {
          OS << " when compiling for host";
        } else {
          OS << " when compiling for " << getTargetOpts().CPU;
        }
      }
      OS << ".\n";
    }
  }

  if (getFrontendOpts().ShowStats) {
    if (hasFileManager()) {
      getFileManager().PrintStats();
      OS << '\n';
    }
    llvm::PrintStatistics(OS);
  }
  StringRef StatsFile = getFrontendOpts().StatsFile;
  if (!StatsFile.empty()) {
    std::error_code EC;
    auto StatS = std::make_unique<llvm::raw_fd_ostream>(
        StatsFile, EC, llvm::sys::fs::OF_TextWithCRLF);
    if (EC) {
      getDiagnostics().Report(diag::warn_fe_unable_to_open_stats_file)
          << StatsFile << EC.message();
    } else {
      llvm::PrintStatisticsJSON(*StatS);
    }
  }

  return !getDiagnostics().getClient()->getNumErrors();
}

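CompilerInstance::ExecuteAction() starts with some precondition checks and preparation: it asserts that the diagnostics engine has been initialized and that the client has already handled -help and -version. It then calls Act.PrepareToExecute() to prepare the action's execution environment and creates the TargetInfo that describes the compilation target. Finally, it loops over all input files and calls Act.BeginSourceFile(), Act.Execute(), and Act.EndSourceFile() on each one to run the front-end action. The return value is true only if no errors were reported.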
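
The begin/execute/end protocol at the heart of that loop can be modeled in isolation. In the sketch below, ToyLifecycleAction and executeAction are hypothetical names. The rule that an empty file name fails to open is invented purely to exercise the failure path, and error counting is reduced to a single counter.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a front-end action that records its hooks.
struct ToyLifecycleAction {
  std::vector<std::string> Trace;
  unsigned NumErrors = 0;

  bool beginSourceFile(const std::string &File) {
    if (File.empty()) { // invented rule: an empty name cannot be opened
      ++NumErrors;
      return false;
    }
    Trace.push_back("begin " + File);
    return true;
  }
  void execute() { Trace.push_back("execute"); }
  void endSourceFile() { Trace.push_back("end"); }
};

// Mirrors the input loop of ExecuteAction(): execute/end run only when
// begin succeeded, and the result is "no errors were reported".
bool executeAction(ToyLifecycleAction &Act,
                   const std::vector<std::string> &Inputs) {
  for (const std::string &Input : Inputs) {
    if (Act.beginSourceFile(Input)) {
      Act.execute();
      Act.endSourceFile();
    }
  }
  return Act.NumErrors == 0;
}
```

A failed begin does not abort the loop: the remaining inputs are still processed, and the failure surfaces only through the final error count.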

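At this point, we can summarize the analysis so far in one figure: a partial class diagram of LibTooling.

Final Thoughts

We have spent considerable space analyzing the core architecture of LibTooling, starting from ClangTool::run() and working down into the implementation details of ToolInvocation, FrontendActionFactory, and CompilerInstance. This analysis shows clearly how LibTooling uses these core components to drive the execution of Clang front-end actions.

The discussion of FrontendAction itself is still open, however. The next chapter continues with the implementation details of FrontendAction, which is the last link connecting LibTooling's architecture to a tool's concrete logic. Stay tuned.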

LibTooling Architecture Explained (Part 1): From ClangTool to CompilerInstance
