Wednesday, August 4, 2010

Defer, Panic, and Recover

Go has the usual mechanisms for control flow: if, for, switch, goto. It also has the go statement to run code in a separate goroutine. Here I'd like to discuss some of the less common ones: defer, panic, and recover.

A defer statement pushes a function call onto a list. The list of saved calls is executed after the surrounding function returns. Defer is commonly used to simplify functions that perform various clean-up actions.

For example, let's look at a function that opens two files and copies the contents of one file to the other:

func CopyFile(dstName, srcName string) (written int64, err os.Error) {
src, err := os.Open(srcName, os.O_RDONLY, 0)
if err != nil {
return
}

dst, err := os.Open(dstName, os.O_WRONLY|os.O_CREATE, 0644)
if err != nil {
return
}

written, err = io.Copy(dst, src)
dst.Close()
src.Close()
return
}

This works, but there is a bug. If the second call to os.Open fails, the function will return without closing the source file. This can be easily remedied by putting a call to src.Close() before the second return statement, but if the function were more complex the problem might not be so easily noticed and resolved. By introducing defer statements we can ensure that the files are always closed:

func CopyFile(dstName, srcName string) (written int64, err os.Error) {
src, err := os.Open(srcName, os.O_RDONLY, 0)
if err != nil {
return
}
defer src.Close()

dst, err := os.Open(dstName, os.O_WRONLY|os.O_CREATE, 0644)
if err != nil {
return
}
defer dst.Close()

return io.Copy(dst, src)
}

Defer statements allow us to think about closing each file right after opening it, guaranteeing that, regardless of the number of return statements in the function, the files will be closed.

The behavior of defer statements is straightforward and predictable. There are three simple rules:

1. A deferred function's arguments are evaluated when the defer statement is evaluated.

In this example, the expression "i" is evaluated when the Println call is deferred. The deferred call will print "0" after the function returns.

func a() {
i := 0
defer fmt.Println(i)
i++
return
}

2. Deferred function calls are executed in Last In First Out order after the surrounding function returns.

This function prints "3210":

func b() {
for i := 0; i < 4; i++ {
defer fmt.Print(i)
}
}

3. Deferred functions may read and assign to the returning function's named return values.

In this example, a deferred function increments the return value i after the surrounding function returns. Thus, this function returns 2:

func c() (i int) {
defer func() { i++ }()
return 1
}

This is convenient for modifying the error return value of a function; we will see an example of this shortly.

Panic is a built-in function that stops the ordinary flow of control and begins panicking. When the function F calls panic, execution of F stops, any deferred functions in F are executed normally, and then F returns to its caller. To the caller, F then behaves like a call to panic. The process continues up the stack until all functions in the current goroutine have returned, at which point the program crashes. Panics can be initiated by invoking panic directly. They can also be caused by runtime errors, such as out-of-bounds array accesses.

Recover is a built-in function that regains control of a panicking goroutine. Recover is only useful inside deferred functions. During normal execution, a call to recover will return nil and have no other effect. If the current goroutine is panicking, a call to recover will capture the value given to panic and resume normal execution.

Here's an example program that demonstrates the mechanics of panic and defer:

package main

import "fmt"

func main() {
f()
fmt.Println("Returned normally from f.")
}

func f() {
defer func() {
if r := recover(); r != nil {
fmt.Println("Recovered in f", r)
}
}()
fmt.Println("Calling g.")
g(0)
fmt.Println("Returned normally from g.")
}

func g(i int) {
if i > 3 {
fmt.Println("Panicking!")
panic(fmt.Sprintf("%v", i))
}
defer fmt.Println("Defer in g", i)
fmt.Println("Printing in g", i)
g(i+1)
}

The function g takes the int i, and panics if i is greater than 3, or else it calls itself with the argument i+1. The function f defers a function that calls recover and prints the recovered value (if it is non-nil). Try to picture what the output of this program might be before reading on.

The program will output:

Calling g.
Printing in g 0
Printing in g 1
Printing in g 2
Printing in g 3
Panicking!
Defer in g 3
Defer in g 2
Defer in g 1
Defer in g 0
Recovered in f 4
Returned normally from f.

If we remove the deferred function from f the panic is not recovered and reaches the top of the goroutine's call stack, terminating the program. This modified program will output:

Calling g.
Printing in g 0
Printing in g 1
Printing in g 2
Printing in g 3
Panicking!
Defer in g 3
Defer in g 2
Defer in g 1
Defer in g 0
panic: 4

panic PC=0x2a9cd8
[stack trace omitted]

For a real-world example of panic and recover, see the json package from the Go standard library. It decodes JSON-encoded data with a set of recursive functions. When malformed JSON is encountered, the parser calls panic is to unwind the stack to the top-level function call, which recovers from the panic and returns an appropriate error value (see the 'error' and 'unmarshal' functions in decode.go). There is a similar example of this technique in the Compile routine of the regexp package. The convention in the Go libraries is that even when a package uses panic internally, its external API still presents explicit error return values.

Other uses of defer (beyond the file.Close() example given earlier) include releasing a mutex:

mu.Lock()
defer mu.Unlock()

printing a footer:

printHeader()
defer printFooter()

and more.

In summary, the defer statement (with or without panic and recover) provides an unusual and powerful mechanism for control flow. It can be used to model a number of features implemented by special-purpose structures in other programming languages. Try it out.

Tuesday, July 13, 2010

Share Memory By Communicating

Traditional threading models (commonly used when writing Java, C++, and Python programs, for example) require the programmer to communicate between threads using shared memory. Typically, shared data structures are protected by locks, and threads will contend over those locks to access the data. In some cases, this is made easier by the use of thread-safe data structures such as Python's Queue.

Go's concurrency primitives - goroutines and channels - provide an elegant and distinct means of structuring concurrent software. (These concepts have an interesting history that begins with C. A. R. Hoare's Communicating Sequential Processes.) Instead of explicitly using locks to mediate access to shared data, Go encourages the use of channels to pass references to data between goroutines. This approach ensures that only one goroutine has access to the data at a given time. The concept is summarized in the document Effective Go (a must-read for any Go programmer):

Do not communicate by sharing memory; instead, share memory by communicating.

Consider a program that polls a list of URLs. In a traditional threading environment, one might structure its data like so:

type Resource struct {
url string
polling bool
lastPolled int64
}

type Resources struct {
data []*Resource
lock *sync.Mutex
}

And then a Poller function (many of which would run in separate threads) might look something like this:

func Poller(res *Resources) {
for {
// get the least recently-polled Resource
// and mark it as being polled
res.lock.Lock()
var r *Resource
for _, v := range res.data {
if v.polling {
continue
}
if r == nil || v.lastPolled < r.lastPolled {
r = v
}
}
if r != nil {
r.polling = true
}
res.lock.Unlock()
if r == nil {
continue
}

// poll the URL

// update the Resource's polling and lastPolled
res.lock.Lock()
r.polling = false
r.lastPolled = time.Nanoseconds()
res.lock.Unlock()
}
}

This function is about a page long, and requires more detail to make it complete. It doesn't even include the URL polling logic (which, itself, would only be a few lines), nor will it gracefully handle exhausting the pool of Resources.

Let's take a look at the same functionality implemented using Go idiom. In this example, Poller is a function that receives Resources to be polled from an input channel, and sends them to an output channel when they're done.

type Resource string

func Poller(in, out chan *Resource) {
for r := range in {
// poll the URL

// send the processed Resource to out
out <- r
}
}

The delicate logic from the previous example is conspicuously absent, and our Resource data structure no longer contains bookkeeping data. In fact, all that's left are the important parts. This should give you an inkling as to the power of these simple language features.

There are many omissions from the above code snippets. For a walkthrough of a complete, idiomatic Go program that uses these ides, see the Codewalk "Share Memory By Communicating".

Wednesday, July 7, 2010

Go's Declaration Syntax

Newcomers to Go wonder why the declaration syntax is different from the tradition established in the C family. In this post we'll compare the two approaches and explain why Go's declarations look as they do.

C syntax

First, let's talk about C syntax. C took an unusual and clever approach to declaration syntax. Instead of describing the types with special syntax, one writes an expression involving the item being declared, and states what type that expression will have. Thus

    int x;

declares x to be an int: the expression 'x' will have type int. In general, to figure out how to write the type of a new variable, write an expression involving that variable that evaluates to a basic type, then put the basic type on the left and the expression on the right.

Thus, the declarations

    int *p;
int a[3];

state that p is a pointer to int because '*p' has type int, and that a is an array of ints because a[3] (ignoring the particular index value, which is punned to be the size of the array) has type int.

What about functions? Originally, C's function declarations wrote the types of the arguments outside the parens, like this:

    int main(argc, argv)
int argc;
char *argv[];
{ /* ... */ }

Again, we see that main is a function because the expression main(argc, argv) returns an int. In modern notation we'd write

    int main(int argc, char *argv[]) { /* ... */ }

but the basic structure is the same.

This is a clever syntactic idea that works well for simple types but can get confusing fast. The famous example is declaring a function pointer. Follow the rules and you get this:

    int (*fp)(int a, int b);

Here, fp is a pointer to a function because if you write the expression (*fp)(a, b) you'll call a function that returns int. What if one of fp's arguments is itself a function?

    int (*fp)(int (*ff)(int x, int y), int b)

That's starting to get hard to read.

Of course, we can leave out the name of the parameters when we declare a function, so main can be declared

    int main(int, char *[])

Recall that argv is declared like this,

    char *argv[]

so you drop the name from the *middle* of its declaration to construct its type. It's not obvious, though, that you declare something of type char *[] by putting its name in the middle.

And look what happens to fp's declaration if you don't name the parameters:

    int (*fp)(int (*)(int, int), int)

Not only is it not obvious where to put the name inside

    int (*)(int, int)

it's not exactly clear that it's a function pointer declaration at all. And what if the return type is a function pointer?

    int (*(*fp)(int (*)(int, int), int))(int, int)

It's hard even to see that this declaration is about fp.

You can construct more elaborate examples but these should illustrate some of the difficulties that C's declaration syntax can introduce.

There's one more point that needs to be made, though. Because type and declaration syntax are the same, it can be difficult to parse expressions with types in the middle. This is why, for instance, C casts always parenthesize the type, as in

    (int)M_PI

Go syntax

Languages outside the C family usually use a distinct type syntax in declarations. Although it's a separate point, the name usually comes first, often followed by a colon. Thus our examples above become something like (in a fictional but illustrative language)

    x: int
p: pointer to int
a: array[3] of int

These declarations are clear, if verbose - you just read them left to right. Go takes its cue from here, but in the interests of brevity it drops the colon and removes some of the keywords:

    x int
p *int
a [3]int

There is no direct correspondence between the look of [3]int and how to use a in an expression. (We'll come back to pointers in the next section.) You gain clarity at the cost of a separate syntax.

Now consider functions. Let's transcribe the declaration for main, even though the main function in Go takes no arguments:

    func main(argc int, argv *[]byte) int

Superficially that's not much different from C, but it reads well from left to right:

function main takes an int and a pointer to a slice of bytes and returns an int.

Drop the parameter names and it's just as clear - they're always first so there's no confusion.

    func main(int, *[]byte) int

One value of this left-to-right style is how well it works as the types become more complex. Here's a declaration of a function variable (analogous to a function pointer in C):

    f func(func(int,int) int, int) int

Or if f returns a function:

    f func(func(int,int) int, int) func(int, int) int

It still reads clearly, from left to right, and it's always obvious which name is being declared - the name comes first.

The distinction between type and expression syntax makes it easy to write and invoke closures in Go:

    sum := func(a, b int) int { return a+b } (3, 4)

Pointers

Pointers are the exception that proves the rule. Notice that in arrays and slices, for instance, Go's type syntax puts the brackets on the left of the type but the expression syntax puts them on the right of the expression:

    var a []int
x = a[1]

For familiarity, Go's pointers use the * notation from C, but we could not bring ourselves to make a similar reversal for pointer types. Thus pointers work like this

    var p *int
x = *p

We couldn't say

    var p *int
x = p*

because that postfix * would conflate with multiplication. We could have used the Pascal ^, for example:

    var p ^int
x = p^

and perhaps we should have (and chosen another operator for xor), because the prefix asterisk on both types and expressions complicates things in a number of ways. For instance, although one can write

    []int("hi")

as a conversion, one must parenthesize the type if it starts with a *:

    (*int)(nil)

Had we been willing to give up * as pointer syntax, those parentheses would be unnecessary.

So Go's pointer syntax is tied to the familiar C form, but those ties mean that we cannot break completely from using parentheses to disambiguate types and expressions in the grammar.

Overall, though, we believe Go's type syntax is easier to understand than C's, especially when things get complicated.

Notes

Go's declarations read left to right. It's been pointed out that C's read in a spiral! See The "Clockwise/Spiral Rule" by David Anderson.

- Rob Pike, July 2010

Sunday, June 6, 2010

Go Programming session video from Google I/O

Below is the video of the talk given by Rob Pike and Russ Cox at Google I/O 2010.

Thursday, May 27, 2010

Go at I/O: Frequently Asked Questions

Among the high-profile product launches at Google I/O last week, our small team gave presentations to packed rooms and met many present and future Go programmers. It was especially gratifying to meet with so many people who, after learning a bit about Go, were excited by the potential benefits (both immediate and long-term) they could gain from using it.

We were asked a lot of good questions during I/O, and in this post I'd like to recap and expand upon some of them.

How suitable is Go for production systems?
Go is ready and stable now. We are pleased to report that Google is using Go for some production systems, and they are performing well. Of course there is still room for improvement - that's why we're continuing to work on the language, libraries, tools, and runtime.

Do you have plans to implement generics?
Many proposals for generics-like features have been mooted both publicly and internally, but as yet we haven't found a proposal that is consistent with the rest of the language. We think that one of Go's key strengths is its simplicity, so we are wary of introducing new features that might make the language more difficult to understand. Additionally, the more Go code we write (and thus the better we learn how to write Go code ourselves), the less we feel the need for such a language feature.

Do you have any plans to support GPU programming?
We don't have any immediate plans to do this, but as Go is architecture-agnostic it's quite possible. The ability to launch a goroutine that runs on a different processor architecture, and to use channels to communicate between goroutines running on separate architectures, seem like good ideas.

Are there plans to support Go under App Engine?
Both the Go and App Engine teams would like to see this happen. As always, it is a question of resources and priorities as to if and when it will become a reality.

Are there plans to support Go under Android?
Both Go compilers support ARM code generation, so it is possible. While we think Go would be a great language for writing mobile applications, Android support is not something that's being actively worked on.

What can I use Go for?
Go was designed with systems programming in mind. Servers, clients, databases, caches, balancers, distributors - these are applications Go is obviously useful for, and this is how we have begun to use it within Google. However, since Go's open-source release, the community has found a diverse range of applications for the language. From web apps to games to graphics tools, Go promises to shine as a general-purpose programming language. The potential is only limited by library support, which is improving at a tremendous rate. Additionally, educators have expressed interest in using Go to teach programming, citing its succinct syntax and consistency as well-suited to the task.

Thanks to everyone who attended our presentations, or came to talk with us at Office Hours. We hope to see you again at future events.

The video of Rob and Russ' talk will be uploaded to YouTube within the next week, and will then be posted on this blog.

Wednesday, May 12, 2010

Upcoming Google I/O Go Events

Google I/O 2010 is happening next week at the Moscone Centre in San Francisco. Those of you with tickets will be able to catch some of the Go team both at I/O and at Bootcamp. In reverse-chronological order:

Rob Pike and Russ Cox will be presenting a Go Programming talk on Thursday at 10.15am. This session takes a detailed look at how Go differs from other languages in a practical sense. Through a series of examples, they will demonstrate various features of Go and the ways in which they affect program design.

Several members of the Go team will be at the Go cube during Office Hours on Wednesday between 12pm and 2:30pm. Come by to have your Go questions answered by the experts.

At Bootcamp on Tuesday at 4.15pm, Andrew Gerrand will be giving an introductory talk about Go. The session will give an overview of the problems that motivated us to build a new language, and the ways in which Go addresses those problems.

If you're coming to I/O, we look forward to seeing you there!

Wednesday, May 5, 2010

New Talk and Tutorials

Rob Pike recently gave a talk at Stanford's Computer Systems Colloquium (EE380). Titled "Another Go at Language Design", the presentation gives an overview of the itches Go was built to scratch, and how Go addresses those problems. You can view a video stream of the talk, and download the slides.

Last week's release included a code lab, "Writing Web Applications," that details the construction of a simple wiki program. It is a practical introduction to some fundamental Go concepts, and the first of a series of Go code labs.

Lastly, we are often asked "How do Go packages work?" It's easier to show than to explain, so I put together a Go Packages screen cast that demonstrates the process of writing, building, installing, and redistributing Go packages. I hope to post more of these covering a variety of Go programming topics to the gocoding YouTube channel in the near future.