Control Flow

the language mechanisms used to specify ordering into several categories:

  • Sequencing
  • Selection
  • Iteration
  • Procedural abstraction
  • Recursion
  • Concurrency
  • Exception handling and speculation
  • Nondeterminacy

Expression Evaluation

An expression generally consists of either a simple object (e.g., a literal constant, or a named variable or constant) or an operator or function applied to a collection of operands or arguments, each of which in turn is an expression.

Precedence and Associativity

运算符的优先级和结合性

Assignments

computation typically consists of an ordered series of changes to the values of variables in memory. Assignments provide the principal means by which to make the changes.

side effect

Assignment is perhaps the most fundamental side effect: while the evaluation of an assignment may sometimes yield a value, what we really care about is the fact that it changes the value of a variable, thereby influencing the result of any later computation in
which the variable appears.

References and Values

there are some subtle but important differences in the semantics of assignment in different imperative languages. These differences are often invisible, because they do not affect the behavior of simple programs.

  • 值传递和引用传递
  • 左值和右值
Boxing

基础类型的拆装箱

Initialization

There are several reasons, however, why such initial values may be useful:

  • 静态变量需要初始化
  • 编译器预分配初始化值优化(Java Integer预分配)
  • 使用未初始化变量引起的问题

It should be emphasized that initialization saves time only for variables that are statically allocated.

If a variable is not given an initial value explicitly in its declaration, the language may specify a default value.

Dynamic Checks

Instead of giving every uninitialized variable a default value, a language or implementation can choose to define the use of an uninitialized variable as a dynamic semantic error, and can catch these errors at run time.

Definite Assignment

通过控制流分析静态检测未初始化变量:

This notion is based on the control flow of the program, and can be statically checked by the compiler.

Constructors

构造器默认初始化:

Many object-oriented languages (Java and C# among them) allow the programmer to define types for which initialization of dynamically allocated variables occurs automatically, even when no initial value is specified in the declaration.

Ordering within Expressions

表达式内求值重要的原因:

  • Side effects: 表达式内函数求值副作用
  • Code improvement: 求值顺序和编译器优化

Applying Mathematical Identities(数学恒等式)

表达式优化影响求值顺序会导致精度问题

Short-Circuit Evaluation(短路表达式)

A compiler that performs short-circuit evaluation of Boolean expressions will generate code that skips the second half of both of these computations when the overall value can be determined from the first half.

Structured and Unstructured Flow

汇编中使用的goto(非结构化)编程被高级语言抛弃(结构化编程)

Structured Alternatives to goto

Where once a goto might have been used to escape from the middle of a loop, most modern languages provide a break or exit statement for this purpose.

Multilevel Returns

return和 local goto都可以从当前子程序中返回,但是 nonlocal goto会破坏这种情况,如果直接goto到其他子程序中会破坏当前子程序堆栈,还需要立刻加载另一个子程序的堆栈。如果使用return这些都是在执行到return关键字时才发生的。

Errors and Other Exceptions

需要Exceptions的原因:

In a related and arguably more common situation, a deeply nested block or subroutine may discover that it is unable to proceed with its usual function, and moreover lacks the contextual information it would need to recover in any graceful way.

As a structured alternative, many modern languages provide an exception-handling mechanism for convenient, nonlocal recovery from exceptions.

大多数语言只提供异常机制而不提供从多级调用中直接return的机制(想直接跳出多级调用只能抛异常)。

Continuations

The notion of nonlocal gotos that unwind the stack can be generalized by defining what are known as continuations.

call/cc => 控制流变换:

在 Scheme 中,假设 call/cc 捕捉到的 current continuation 为 cc(位于 lambda 中),如果 cc 作为过程 直接或间接地被调用(即给它传值),call/cc 会立即返回,返回值即为传入 cc 的值。即一旦 current continuation 被调用,控制流会跳到 call/cc 处。因此,利用 call/cc,我们可以摆脱顺序执行的限制,在程序中跳来跳去,非常灵活。

Sequencing

Like assignment, sequencing is central to imperative programming. It is the principal means of controlling the order in which side effects (e.g.,assignments) occur

顺序和副作用,无副作用函数可以进行指令重排提升性能。

Selection

Short-Circuited Conditions

虽然if else语句中包含一个布尔表达式,但是通常不需要将其结果解析出来放在寄存器中,而是拆分布尔表达式用来控制代码跳转(通过短路表达式来优化代码)。

Case/Switch Statements

if else语句过长时可能会被优化成 switch/case 语句(通过查表来优化代码性能)。

case查表也需要优化性能(如果静态值范围稀疏。。。)

Iteration

Iteration and recursion are the two mechanisms that allow a computer to perform similar operations repeatedly.

Enumeration-Controlled Loops(for循环)

在编译器生成代码时的一个可能的优化是预先计算迭代次数:max([last-first+step]/step, 0)

枚举控制循环的几个设计问题:

  1. Can control enter or leave the loop in any way other than through the enumeration mechanism? => break/exit
  2. What happens if the loop body modifies variables that were used to compute the end-of-loop bound? => 不允许
  3. What happens if the loop body modifies the index variable itself? => 禁止在循环体内自己更新索引
  4. Can the program read the index variable after the loop has completed, and if so, what will its value be? => 循环结束后索引值是未定义的

Combination Loops

自定义步长的for循环

Iterators

  • range()
  • 对象迭代器
  • 惰性迭代流

Recursion 递归

Iteration and Recursion

Iteration is in some sense the more “natural” of the two in imperative languages, because it is based on the repeated modification of variables.

Recursion is the more natural of the two in functional languages, because it does not change variables.

通常认为迭代比递归更高效,但是编译器优化也可以生成高效的递归代码,特别是尾递归代码的优化(不需要额外生成新的函数堆栈,可以重用当前函数的栈空间)

编译器也可以优化非尾递归代码,转换为尾递归代码执行。

使用递归思想思考代码(没有副作用的递归)可以产生更少的bug。

Applicative- and Normal-Order Evaluation

Throughout the discussion so far we have assumed implicitly that arguments are evaluated before passing them to a subroutine. This need not be the case. It is possible to pass a representation of the unevaluated arguments to the subroutine instead, and to evaluate them only when (if) the value is actually needed.

惰性求值

Nondeterminacy

不确定的控制流(运行时才能确定)。

Summary and Concluding Remarks

在本章中,我们介绍了编程语言中发现的主要控制流形式:顺序、选择、迭代、程序抽象、递归、并发、异常处理和推测以及不确定性。顺序指定某些操作按顺序发生,一个接一个。选择表达了在两个或更多控制流替代方案之间的选择。迭代和递归是重复执行操作的两种方式。递归以自身的简单实例定义操作;它依赖于程序抽象。迭代重复一个操作以获得其副作用。顺序和迭代是命令式(特别是冯·诺依曼)编程的基础。递归是函数式编程的基础。不确定性允许程序员故意不指定控制流的某些方面。我们只简要地触及了并发;它将是第12章的主题。程序抽象(子程序)是第8章的主题。异常处理和推测将在第8.5节和第12.4.4节中介绍。

在我们对控制流机制的调查之前,我们讨论了表达式评估。我们考虑了左值和右值之间的区别,以及变量的值模型和引用模型之间的区别,在值模型中,变量是数据的命名容器,在引用模型中,变量是对数据对象的引用。我们考虑了表达式内的优先级、结合性和排序问题。我们检查了短路布尔评估及其通过跳转代码的实现,这既是一个影响表达式正确性的语义问题(其子部分并非总是定义良好),也是一个影响评估复杂布尔表达式所需时间的实现问题。

构造的发展受到了许多目标的驱动,包括编程的便利性、语义的优雅性、实现的便利性和运行时效率。在某些情况下,这些目标被证明是互补的。例如,我们已经看到短路评估既导致更快的代码,也(在许多情况下)导致更清晰的语义。同样地,为枚举控制循环的索引变量引入一个新的局部作用域,避免了循环后索引值的语义问题和(在某种程度上)潜在溢出的实现问题。

在其他情况下,语言语义的改进被认为值得在运行时效率上付出一点小代价。我们在迭代器的发展中看到了这一点:像许多形式的抽象一样,它们在许多情况下增加了适度的运行时成本(例如,与显式嵌入枚举集合的实现到循环的控制流中相比),但却带来了模块化、清晰性和代码重用机会的巨大回报。同样地,Java的开发者会辩称,对于许多应用程序来说,通过广泛的语义检查、标准格式的数值类型等提供的可移植性和安全性远比速度更重要。

在一些情况下,编译器技术的进步或设计者构建更复杂编译器的简单意愿,使得整合曾被认为过于昂贵的功能成为可能。Ada中的案例语句的标签范围要求编译器准备生成采用二进制搜索的代码。C++中的内联函数消除了在微小函数的低效率和宏的混乱语义之间选择的需要。异常(如我们将在第8.5.3节中看到)可以以这样一种方式实现,即在常见情况下(当它们不发生时)不会产生任何成本,但实现相当复杂。迭代器、装箱、泛型(第8.4节)和一级函数同样相当复杂,但在主流命令式语言中越来越常见。

一些实现技术(例如,重新排列表达式以发现公共子表达式,或者在找到可接受的选择后避免在不确定性构造中评估守卫)足够重要,以至于可以证明对程序员施加适度负担是合理的(例如,在必要时添加括号以避免溢出或确保数值稳定性,或确保守卫中的表达式没有副作用)。其他在语义上有用的机制(例如,惰性评估、续体或真正随机的不确定性)通常被认为足够复杂或昂贵,只在特殊情况下才值得使用(如果有的话)。

在相对原始的语言中,我们通常可以通过编程约定获得一些缺失功能的好处。例如,在早期的Fortran方言中,我们可以限制goto的使用,以模仿更现代语言的控制流。在没有短路评估的语言中,我们可以编写嵌套的选择语句。在没有迭代器的语言中,我们可以编写一组提供等效功能的子程序。

Semantic Analysis

semantics concerns its meaning. Meaning is important for at least two reasons: it allows us to enforce rules (e.g., type consistency) that go beyond mere form, and it provides the information we need in order to generate an equivalent output program.

Semantic rules are further divided into static and dynamic semantics

Both semantic analysis and intermediate code generation can be described in terms of annotation, or decoration of a parse tree or syntax tree.

Attribute grammars provide a formal framework for the decoration of a tree. This framework is a useful conceptual tool even in compilers that do not build a parse tree or syntax tree as an explicit data structure.

The Role of the Semantic Analyzer

The role of the semantic analyzer is to enforce all static semantic rules and to annotate the program with information needed by the intermediate code generator.

Dynamic Checks

Many compilers that generate code for dynamic checks provide the option of disabling them if desired.

Assertions

The compiler then generates code to check the assertions at run time. An assertion is a statement that a specified condition is expected to be true when execution reaches a certain point in the code.

Static Analysis

In general, compile-time algorithms that predict run-time behavior are known as static analysis.

static analysis may enable code improvement:

  • alias analysis
  • escape analysis
  • subtype analysis

Attribute Grammars

To tie these expressions to mathematical concepts (as opposed to, say, floor tile patterns or dance steps), we need additional notation. The most common is based on attributes.

In a compiler or interpreter for a full programming language, the attributes of tree nodes might include:

  • for an identifier, a reference to information about it in the symbol table
  • for an expression, its type
  • for a statement or expression, a reference to corresponding code in the compiler’s intermediate form
  • for almost any construct, an indication of the file name, line, and column where the corresponding source code begins
  • for any internal node, a list of semantic errors found in the subtree below

Evaluating Attributes

The process of evaluating attributes is called annotation or decoration of the parse tree.

Synthesized Attributes 合成属性

synthesized attributes: their values are calculated (synthesized) only in productions in which their symbol appears on the left-hand side.

Inherited Attributes

In general, we can imagine (and will in fact have need of) attributes whose values are calculated when their symbol is on the right-hand side of the current production. Such attributes are said to be inherited.

Attribute Flow

they define a set of valid trees, but they don’t say how to build or decorate them.

the order in which attribute rules are listed for a given production is immaterial; attribute flow may require them to execute in any order.

An algorithm that decorates parse trees by invoking the rules of an attribute grammar in an order consistent with the tree’s attribute flow is called a translation scheme.

One-Pass Compilers

A compiler that interleaves semantic analysis and code generation with parsing is said to be a one-pass compiler

Action Routines

An ad hoc translation scheme that is interleaved with parsing takes the form of a set of action routines.

An action routine is a semantic function that the programmer (grammar writer) instructs the compiler to execute at a particular point in the parse.

Space Management for Attributes

If we are building an explicit parse tree, then the obvious approach is to store attributes in the nodes of the tree themselves.

For a bottom-up parser with an S-attributed grammar, the obvious approach is to maintain an attribute stack

For a top-down parser with an L-attributed grammar, we have two principal options:

  • uses an attribute stack
  • “shortcutting” copy rules

Tree Grammars and Syntax Tree Decoration

attribute grammars can also be used to decorate syntax trees.

Summary and Concluding Remarks

本章讨论了语义分析的任务。我们回顾了可以分类为语法、静态语义和动态语义的语言规则类型,并讨论了是否生成代码以执行动态语义检查的问题。我们还考虑了语义分析器在典型编译器中的作用。我们指出,静态语义规则的执行和中间代码的生成都可以用解析树或语法树的注释或装饰来表达。然后,我们提出了属性文法作为这个装饰过程的形式化框架。

属性文法将属性与上下文无关文法或树文法中的每个符号关联,并将属性规则与每个产生式关联起来。在上下文无关文法中,综合属性只在其符号出现在产生式的左侧时计算。标记的综合属性由扫描器初始化。继承属性在其符号出现在右侧的产生式中计算;它们允许符号下子树中的计算依赖于符号出现的上下文。起始符号(目标)的继承属性可以表示编译器的外部环境。严格来说,属性文法只允许复制规则(一个属性分配给另一个属性)和对语义函数的简单调用,但我们通常放宽这一限制,以允许在某些现有编程语言中使用更多或更少任意的代码片段。

就像上下文无关文法可以根据可以使用它们的解析算法进行分类一样,属性文法可以根据其属性流模式的复杂性进行分类。在S-属性文法中,所有属性都是综合的,可以自然地在解析树上进行单次自底向上遍历,按照LR族解析器发现树的顺序精确计算。在L-属性文法中,所有属性流都是深度优先从左到右的,可以按照LL族解析器预测和匹配解析树的顺序精确计算。具有更复杂属性流模式的属性文法通常不用于生成编译器的解析树,但对于基于语法的编辑器、增量编译器和其他各种工具非常有价值。

虽然可以构建自动工具来分析属性流并装饰解析树,但大多数编译器依赖于动作例程,编译器编写者将这些例程嵌入到产生式的右侧,以在解析的特定点评估属性规则。在LL族解析器中,动作例程可以嵌入到产生式右侧的任意点。在LR族解析器中,动作例程必须遵循产生式的左角。自底向上编译器中的属性空间自然地与解析栈并行分配,但这使得继承属性的管理变得复杂。自顶向下编译器中的属性空间可以自动分配,或由动作例程的编写者显式管理。自动方法具有规律性的优势,并且更易于维护;而临时方法略快且更灵活。

在单遍编译器中,扫描、解析、语义分析和代码生成在对输入的单次遍历中交替进行。语义函数或动作例程负责所有的语义分析和代码生成。更常见的做法是,动作例程仅构建一个语法树,然后在后续的单独遍历中对其进行装饰。这些遍历的代码通常是手工编写的,以相互递归的子程序形式,使得编译器可以在语法树上实现基本上任意的属性流。

在接下来的章节(特别是第6至第10章),我们将考虑各种各样的编程语言构造。我们不会呈现实现这些构造所需的实际属性文法,而是会以非正式的方式描述它们的语义,并给出目标代码的示例。在第15章中,当我们更详细地考虑中间代码生成时,我们将回顾属性文法。

Names, Scopes, and Bindings

A name is a mnemonic character string used to represent something else.

Names allow us to refer to variables, constants, operations, types, and so on using symbolic identifiers rather than low-level concepts like addresses.

Subroutines are control abstractions.

Classes are data abstractions.

The Notion of Binding Time

the notion of binding time, which refers not only to the binding of a name to the thing it represents, but also in general to the notion of resolving any design decision in a language implementation.

通常,早期绑定时机与更高的效率相关,而后期的绑定时机与更大的灵活性相关。

不同的东西的绑定时机是不一样的:

  • Language design time:控制流结构,基本类型,复杂对象组织方法等语义方面的内容
  • Language implementation time:基础类型大小,和操作系统交互,堆和栈的组织方式和大小
  • Program writing time:算法,数据结构,命名
  • Compile time:高级数据结构和机器码的映射,静态数据的内存布局
  • Link time:引用其他的模块的绑定关系到链接时才能确定(增量编译)
  • Load time:程序装载时才能确定实际的地址(虚实地址转换)
  • Run time:变量值绑定,程序启动时机,模块装载时机,首次“看到”声明的时机,子程序调用时机,代码块进入时机,表达式求值、语句执行时机

Compiler-based language implementations tend to be more efficient than interpreter-based implementations because they make earlier decisions.

Object Lifetime and Storage Management

The period of time between the creation and the destruction of a name-to-object binding is called the binding’s lifetime.

生命周期管理不正确肯能会导致“悬挂指针”

对象的生命周期取决于存储分配机制(对象空间):

  • 静态对象(绝对地址)
  • 栈对象(栈上分配,通常在子程序调用)
  • 堆对象(随时分配)

Static Allocation

Global variables are the obvious example of static objects, but not the only one.

Numeric and string-valued constant literals are also statically allocated.

Finally, most compilers produce a variety of tables that are used by run-time support routines for debugging, dynamic type checking, garbage collection, exception handling, and other purposes; these are also statically allocated.

Manifest constants can always be allocated statically, even if they are local to a recursive subroutine: multiple instances can share the same location.

Stack-Based Allocation

If a language permits recursion, static allocation of local variables is no longer an option.

Fortunately, the natural nesting of subroutine calls makes it easy to allocate space for locals on a stack.

Each instance of a subroutine at run time has its own frame (also called an activation record) on the stack, containing arguments and return values, local variables, temporaries, and bookkeeping information.

Heap-Based Allocation

A heap is a region of storage in which subblocks can be allocated and deallocated at arbitrary times.

Heaps are required for the dynamically allocated pieces of linked data structures, and for objects such as fully general character strings, lists, and sets, whose size may change as a result of an assignment statement or other update operation.

The principal concerns are speed and space, and as usual there are tradeoffs between them.

堆内存管理方法:

  • With a first fit algorithm we select the first block on the list that is large enough to satisfy the request.
  • With a best fit algorithm we search the entire list to find the smallest block that is large enough to satisfy the request.

两者的对比:

Intuitively, one would expect a best fit algorithm to do a better job of reserving large blocks for large requests. At the same time, it has higher allocation cost than a first fit algorithm, because it must always search the entire list, and it tends to result in a larger number of very small “left-over” blocks.

内存管理存在的问题:

内存分配效率和堆最小大小有关(多次申请):

In effect, the heap is divided into “pools,” one for each standard size. The division may be static or dynamic. Two common mechanisms for dynamic pool adjustment are known as the buddy system and the Fibonacci heap.

内存碎片问题:

The problem with external fragmentation is that the ability of the heap to satisfy requests may degrade over time.

Garbage Collection

The run-time library for such a language must then provide a garbage collection mechanism to identify and reclaim unreachable objects.

手动 vs 自动:

  • The traditional arguments in favor of explicit deallocation are implementation simplicity and execution speed.
  • manual deallocation errors are among the most common and costly bugs in real-world programs.

Scope Rules

The textual region of the program in which a binding is active is its scope. In most modern languages, the scope of a binding is determined statically, that is, at compile time.

作用域分为:

  • statically scoped: compile time
  • dynamically scoped: bindings depend on the flow of execution at run time

At any given point in a program’s execution, the set of active bindings is called the current referencing environment. The set is principally determined by static or dynamic scope rules.

binding rules:

  • deep binding: the choice is made when the reference is first created
  • shallow binding: the choice is made when the reference is finally used

Static Scoping

In a language with static (lexical) scoping, the bindings between names and objects can be determined at compile time by examining the text of the program, without consideration of the flow of control at run time.

Nested Subroutines

a name that is introduced in a declaration is known in the scope in which it is declared, and in each internally nested scope, unless it is hidden by another declaration of the same name in one or more nested scopes.

To find the object corresponding to a given use of a name, we look for a declaration with that name in the current, innermost scope. If there is one, it defines the active binding for the name. Otherwise, we look for a declaration in the immediately surrounding scope.

A name-to-object binding that is hidden by a nested declaration of the same name is said to have a hole in its scope.

作用域解析运算符:

In others, the programmer can access the outer meaning of a name by applying a qualifier or scope resolution operator.

Declaration Order

Put another way, can an expression E refer to any name declared in the current scope, or only to names that are declared before E in the scope?

Several early languages, required that all declarations appear at the beginning of their scope.

C++ and Java further relax the rules by dispensing with the define-before-use requirement in many cases. In both languages, members of a class (including those that are not defined until later in the program text) are visible inside all of the class’s methods.

Declarations and Definitions

如何处理两个类互相包含彼此?

Recursive types and subroutines introduce a problem for languages that require names to be declared before they can be used: how can two declarations each appear before the other?

  • A declaration introduces a name and indicates its scope, but may omit certain implementation details.
  • A definition describes the object in sufficient detail for the compiler to determine its implementation.

Modules

模块化和信息隐藏,减少认识负荷:

This modularization of effort depends critically on the notion of information hiding, which makes objects and algorithms invisible, whenever possible, to portions of the system that do not need them.

Module Types and Classes

An alternative solution to the multiple instance problem appeared in Euclid, which treated each module as a type. Given a module type, the programmer could declare an arbitrary number of similar module objects.

Dynamic Scoping

In a language with dynamic scoping, the bindings between names and objects depend on the flow of control at run time, and in particular on the order in which subroutines are called.

为什么动态作用域到运行时才能确定?

Because the flow of control cannot in general be predicted in advance, the bindings between names and objects in a language with dynamic scoping cannot in general be determined by a compiler.

Implementing Scope

To keep track of the names in a statically scoped program, a compiler relies on a data abstraction called a symbol table.

In a language with dynamic scoping, an interpreter (or the output of a compiler) must perform operations analogous to symbol table insert and lookup at run time.

The Meaning of Names within a Scope

A name that can refer to more than one object at a given point in the program is said to be overloaded. Overloading is in turn related to the more general subject of polymorphism.

  • aliases: Two or more names that refer to the same object at the same point in the program are said to be aliases.
  • overloaded: A name that can refer to more than one object at a given point in the program is said to be overloaded
  • Redefining Built-in Operators

The Binding of Referencing Environments

When should scope rules be applied to such a subroutine: when the reference is first created, or when the routine is finally called?

动态作用域常使用 shallow binding:

This late binding of the referencing environment of a subroutine that has been passed as a parameter is known as shallow binding.

静态作用域常使用 deep binding:

It therefore makes sense to bind the environment at the time the routine is first passed as a parameter, and then restore that environment when the routine is finally called.

This early binding of the referencing environment is known as deep binding.

Subroutine Closures

Deep binding is implemented by creating an explicit representation of a referencing environment (generally the one in which the subroutine would execute if called at the present time) and bundling it together with a reference to the subroutine. The bundle as a whole is referred to as a closure.

Object Closures

An object that plays the role of a function and its referencing environment may variously be called an object closure, a function object, or a functor.

Macro Expansion

Prior to the development of high-level programming languages, assembly language programmers could find themselves writing highly repetitive code. To ease the burden, many assemblers provided sophisticated macro expansion facilities.

So-called hygienic macros(卫生宏) implicitly encapsulate their arguments, avoiding unexpected interactions with associativity and precedence.

Summary and Concluding Remarks

这一章讨论了名称的主题,以及名称与对象的绑定(在广义上)。我们开始从绑定时间的一般讨论——名称与特定对象关联的时间,或更一般地说,任何开放问题在语言或程序设计或实现中与答案关联的时间。我们定义了对象和名称到对象绑定的生命周期的概念,并指出它们不必相同。然后,我们介绍了三种主要的存储分配机制——静态、栈、和堆——用于管理对象的空间。

在3.3节中,我们描述了名称与对象的绑定是如何受作用域规则的约束。在一些语言中,作用域规则是动态的:一个名称的含义是在最近进入的包含声明且尚未退出的作用域中找到的。然而,在大多数现代语言中,作用域规则是静态的,或者说是词法的:一个名称的含义是在最近的包含声明的词法环绕作用域中找到的。我们发现,词法作用域规则在不同语言之间以重要但有时是微妙的方式变化。我们考虑了哪些类型的作用域是允许嵌套的,作用域是开放的还是封闭的,一个名称的作用域是否包括其声明的整个块,以及是否必须在使用名称之前声明它。我们在3.4节探索了作用域规则的实现。

在3.5节中,我们检查了绑定之间关系的几种方式。别名产生于当在给定作用域中两个或更多名称绑定到同一个对象时。重载产生于一个名称绑定到多个对象时。我们注意到,尽管有时可以通过强制转换或多态性实现类似重载的行为,但底层机制实际上是非常不同的。在3.6节中,我们考虑了何时将引用环境绑定到作为参数传递、从函数返回或存储在变量中的子程序的问题。我们的讨论涉及了闭包和lambda表达式的概念,这两者在后续章节中都会反复出现。在3.7节和3.8节中,我们考虑了宏和分离编译。

词法作用域的一些更复杂的方面说明了对数据抽象的语言支持的发展,这是我们将在第10章回顾的主题。我们首先描述了像Fortran、Algol 60和C这样的语言中的own或静态变量,这些变量允许子程序中的局部变量在一次调用到下一次调用时保持其值。然后我们注意到,简单模块可以被看作是一种使长期存在的对象对一组子程序局部化的方式,这样它们对程序的其他部分来说是不可见的。通过选择性地导出名称,一个模块可以作为一个或多个抽象数据类型的“管理者”。在更高一层的复杂性中,我们注意到有些语言将模块视为类型,允许程序员创建由模块定义的抽象的任意数量的实例。最后,我们注意到面向对象语言通过提供一个继承机制,这个机制允许定义新的抽象(类)作为现有类的扩展或精化,从而扩展了模块作为类型的方法(以及词法作用域的概念)。

在本章考虑的主题中,我们看到了一些有用的特性的例子(递归、静态作用域、前向引用、一级子程序、无限范围),这些特性因为担心实现的复杂性或运行时成本而被某些语言省略。我们还看到了一个特性的例子(模块规范的私有部分),它是为了方便语言的实现而特别引入的,以及另一个(C语言中的独立编译)其设计显然是为了反映特定的实现。在语言设计的几个额外方面(晚绑定与早绑定、静态与动态作用域、对强制转换和转换的支持、对指针和其他别名的容忍),我们看到实现问题起着重要作用。

在类似的脉络中,看似简单的语言规则可能会有出人意料的含义。例如,在3.3.3节中,我们考虑了整个块作用域与名字必须在使用前声明的要求之间的相互作用。就像Fortran的do循环语法和空白规则(2.2.2节)或Pascal的if…then…else语法(2.3.2节),如果选择不当,作用域规则会使程序分析变得困难,这不仅对编译器如此,对人类同样如此。在未来的章节中,我们将看到几个既令人困惑又难以编译的特性示例。当然,语义的实用性和实现的容易程度并不总是一致的。许多容易编译的特性(例如,goto语句)其价值至少是值得怀疑的。我们还将看到几个非常有用且(概念上)简单的特性,比如垃圾收集(8.5.3节)和统一(7.2.4节,C 7.3.2节和12.2.1节),它们的实现却相当复杂。

0%