Wednesday, December 21, 2011
Getting to know the Go community
The survey is short. It asks about you, your involvement with Go, and your interest in Go-related events. Among other things, this data will help me and the rest of the Go team plan future Go events and schedule conference appearances.
Please take a minute to complete the survey now.
Thanks!
Andrew
Monday, December 19, 2011
Building StatHat with Go
My name is Patrick Crosby and I'm the founder of a company called Numerotron. We recently released StatHat. This post is about why we chose to develop StatHat in Go, including details about how we are using Go.
StatHat is a tool to track statistics and events in your code. Everyone from HTML designers to backend engineers can use StatHat easily, as it supports sending stats from HTML, JavaScript, Go, and twelve other languages.
You send your numbers to StatHat; it generates beautiful, fully-embeddable graphs of your data. StatHat will alert you when specified triggers occur, send you daily email reports, and much more. So instead of spending time writing tracking or reporting tools for your application, you can concentrate on the code. While you do the real work, StatHat remains intensely vigilant, like an eagle in its mountaintop nest, or a babysitter on meth.
Here's an example of a StatHat graph of the temperature in NYC, Chicago, and San Francisco:
Architecture Overview
StatHat consists of two main services: incoming statistic/event API calls and the web application for viewing and analyzing stats. We wanted to keep these as separate as possible to isolate the data collection from the data interaction. We did this for many reasons, but one major reason is that we anticipate handling a ton of automated incoming API HTTP requests and would thus have different optimization strategies for the API service than a web application interacting with humans.
The web application service is multi-tiered. The web server processes all requests and sends them to an interactor layer. For simple tasks, the interactor will handle generating any necessary data. For complex tasks, the interactor relies on multiple application servers to handle tasks like generating graphs or analyzing data sets. After the interactor is finished, the web server sends the result to a presenter. The presenter responds to the HTTP request with either HTML or JSON. We can horizontally scale the web, API, application servers, and databases as the demand for services grows and changes over time. There is no single point of failure as each application server has multiple copies running. The interactor layer allows us to have different interfaces to the system: http, command line, automated tests, mobile API. StatHat uses MySQL for data storage.
Choosing Go
When we designed StatHat, we had the following check list for our development tools:
- same programming language for backend and frontend systems
- good, fast HTML templating system
- fast start-up, recompilation, testing for lots of tinkering
- lots of connections on one machine
- language tools for handling application-level concurrency
- good performance
- robust RPC layer to talk between tiers
- lots of libraries
- open source
We evaluated many popular and not-so-popular web technologies and ended up choosing to develop it in Go.
When Go was released in November 2009, I immediately installed it and loved the fast compilation times, goroutines, channels, garbage collection, and all the packages that were available. I was especially pleased with how few lines of code my applications were using. I soon experimented with making a web app called Langalot that concurrently searched through five foreign language dictionaries as you typed in a query. It was blazingly fast. I put it online and it's been running since February, 2010.
The following sections detail how Go meets StatHat's requirements and our experience using Go to solve our problems.
Runtime
We use the standard Go http package for our API and web app servers. All requests first go through Nginx and any non-file requests are proxied to the Go-powered http servers. The backend servers are all written in Go and use the rpc package to communicate with the frontend.
Templating
We built a template system using the standard template package. Our system adds layouts, some common formatting functions, and the ability to recompile templates on-the-fly during development. We are very pleased with the performance and functionality of the Go templates.
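A minimal sketch of the layout-plus-helpers idea, assuming the current html/template package rather than the template package as it existed in 2011. The names layout, statPage, upper, and render are invented for illustration and are not taken from StatHat's code.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
	"log"
	"strings"
)

// funcs holds formatting helpers that every template can call.
var funcs = template.FuncMap{
	"upper": strings.ToUpper,
}

// layout is the shared page frame; it delegates the body to "content".
const layout = `<h1>{{.Title | upper}}</h1>{{template "content" .}}`

// statPage only defines the "content" block that the layout includes.
const statPage = `{{define "content"}}<p>{{.Name}}: {{.Value}}</p>{{end}}`

// page is the data passed to the templates (hypothetical fields).
type page struct {
	Title string
	Name  string
	Value int
}

// render parses the layout and the page template on every call, which
// mimics recompiling templates on the fly during development.
func render(p page) (string, error) {
	t, err := template.New("layout").Funcs(funcs).Parse(layout)
	if err != nil {
		return "", err
	}
	if _, err := t.Parse(statPage); err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, p); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := render(page{Title: "stats", Name: "requests", Value: 42})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out)
}
```

In production one would presumably parse once at start-up and reuse the parsed template; parsing per request is only convenient while tinkering.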
Tinkering
In a previous job, I worked on a video game called Throne of Darkness that was written in C++. We had a few header files that, when modified, required a full rebuild of the entire system, 20-30 minutes long. If anyone ever changed `Character.h`, he would be subject to the wrath of every other programmer. Besides this suffering, it also slowed down development significantly.
Since then, I've always tried to choose technologies that allowed fast, frequent tinkering. With Go, compilation time is a non-issue. We can recompile the entire system in seconds, not minutes. The development web server starts instantly, and tests complete in a few seconds. As mentioned previously, templates are recompiled as they change. The result is that the StatHat system is very easy to work with, and the compiler is not a bottleneck.
RPC
Since StatHat is a multi-tiered system, we wanted an RPC layer so that all communication was standard. With Go, we are using the rpc package and the gob package for encoding Go objects. In Go, the RPC server just takes any Go object and registers its exported methods. There is no need for an intermediary interface description language. We've found it very easy to use and many of our core application servers are under 300 lines of code.
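To illustrate how little ceremony this involves, here is a small sketch using today's net/rpc package, which encodes with gob by default. The Stats type, its Sum method, the Args payload, and the callSum helper are hypothetical; an in-memory pipe stands in for a real network connection between tiers.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/rpc"
)

// Args is the request payload (hypothetical type for this sketch).
type Args struct {
	Values []float64
}

// Stats is a plain Go type; its exported methods become RPC endpoints.
type Stats struct{}

// Sum has the shape net/rpc expects: an argument, a pointer to the
// reply, and an error result.
func (s *Stats) Sum(args Args, total *float64) error {
	for _, v := range args.Values {
		*total += v
	}
	return nil
}

// callSum serves Stats on one end of an in-memory pipe and calls it
// from the other end.
func callSum(values []float64) (float64, error) {
	serverConn, clientConn := net.Pipe()

	server := rpc.NewServer()
	if err := server.Register(new(Stats)); err != nil {
		return 0, err
	}
	go server.ServeConn(serverConn)

	client := rpc.NewClient(clientConn)
	defer client.Close()

	var total float64
	err := client.Call("Stats.Sum", Args{Values: values}, &total)
	return total, err
}

func main() {
	total, err := callSum([]float64{1, 2, 3.5})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(total)
}
```

No interface description file appears anywhere: registering the value is enough for the server to discover Sum.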
Libraries
We don't want to spend time rewriting libraries for things like SSL, database drivers, JSON/XML parsers. Although Go is a young language, it has a lot of system packages and a growing number of user-contributed packages. With only a few exceptions, we have found Go packages for everything we have needed.
Open source
In our experience, it has been invaluable to work with open source tools. If something is going awry, it is immensely helpful to be able to examine the source through every layer and not have any black boxes. Having the code for the language, web server, packages, and tools allows us to understand how every piece of the system works. Everything in Go is open source. In the Go codebase, we frequently read the tests as they often give great examples of how to use packages and language features.
Performance
People rely on StatHat for up-to-the-minute analysis of their data, and we need the system to be as responsive as possible. In our tests, Go's performance blew away most of the competition. We tested it against Rails, Sinatra, OpenResty, and Node. StatHat has always monitored itself by tracking all kinds of performance metrics about requests, the duration of certain tasks, and the amount of memory in use. Because of this, we were able to easily evaluate different technologies. We've also taken advantage of the benchmark performance testing features of the Go testing package.
Application-Level Concurrency
In a former life, I was the CTO at OkCupid. My experience there using OKWS taught me the importance of async programming, especially when it comes to dynamic web applications. There is no reason you should ever do something like this synchronously: load a user from the database, then find their stats, then find their alerts. These should all be done concurrently, yet surprisingly, many popular frameworks have no async support. Go supports this at the language level without any callback spaghetti. StatHat uses goroutines extensively to run multiple functions concurrently and channels for sharing data between goroutines.
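The user/stats/alerts case might look roughly like the sketch below. The three load functions are stand-ins for database queries, and the sketch assumes all three lookups can be keyed by the same user ID so that none has to wait for another; none of these names come from StatHat.

```go
package main

import "fmt"

// page collects everything needed to render a user's dashboard.
type page struct {
	User   string
	Stats  []string
	Alerts []string
}

// Hypothetical stand-ins for three independent database queries.
func loadUser(id int) string     { return fmt.Sprintf("user-%d", id) }
func loadStats(id int) []string  { return []string{"requests", "signups"} }
func loadAlerts(id int) []string { return []string{"requests > 1000"} }

// loadPage starts the three queries in their own goroutines and then
// collects the results over channels.
func loadPage(id int) page {
	userc := make(chan string, 1)
	statsc := make(chan []string, 1)
	alertsc := make(chan []string, 1)

	go func() { userc <- loadUser(id) }()
	go func() { statsc <- loadStats(id) }()
	go func() { alertsc <- loadAlerts(id) }()

	// Each receive blocks only until its own query has finished, so the
	// total wait is roughly that of the slowest query, not the sum.
	return page{User: <-userc, Stats: <-statsc, Alerts: <-alertsc}
}

func main() {
	p := loadPage(7)
	fmt.Println(p.User, p.Stats, p.Alerts)
}
```

The code reads top to bottom like the synchronous version, with no callbacks.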
Hosting and Deployment
StatHat runs on Amazon's EC2 servers. Our servers are divided into several types:
- API
- Web
- Application servers
- Database
There are at least two of each type of server, and they are in different zones for high availability. Adding a new server to the mix takes just a couple of minutes.
To deploy, we first build the entire system into a time-stamped directory. Our packaging script builds the Go applications, compresses the CSS and JS files, and copies all the scripts and configuration files. This directory is then distributed to all the servers, so they all have an identical distribution. A script on each server queries its EC2 tags and determines what it is responsible for running and starts/stops/restarts any services. We frequently only deploy to a subset of the servers.
More
For more information on StatHat, please visit stathat.com. We are releasing some of the Go code we've written. Go to www.stathat.com/src for all of the open source StatHat projects.
To learn more about Go, visit golang.org.
Tuesday, December 13, 2011
From zero to Go: launching on the Google homepage in 24 hours
This article was written by Reinaldo Aguiar, a software engineer from the Search team at Google. He shares his experience developing his first Go program and launching it to an audience of millions - all in one day!
I was recently given the opportunity to collaborate on a small but highly visible "20% project": the Thanksgiving 2011 Google Doodle. The doodle features a turkey produced by randomly combining different styles of head, wings, feathers and legs. The user can customize it by clicking on the different parts of the turkey. This interactivity is implemented in the browser by a combination of JavaScript, CSS and of course HTML, creating turkeys on the fly.
Once the user has created a personalized turkey it can be shared with friends and family by posting to Google+. Clicking a "Share" button (not pictured here) creates in the user's Google+ stream a post containing a snapshot of the turkey. The snapshot is a single image that matches the turkey the user created.
With 13 alternatives for each of 8 parts of the turkey (heads, pairs of legs, distinct feathers, etc.) there are more than 800 million possible snapshot images that could be generated. To pre-compute them all is clearly infeasible. Instead, we must generate the snapshots on the fly. Combining that problem with a need for immediate scalability and high availability, the choice of platform is obvious: Google App Engine!
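The figure is simply 13 raised to the 8th power. A few lines of Go confirm it; the combinations helper is written just for this check.

```go
package main

import "fmt"

// combinations returns choices multiplied by itself parts times:
// the number of distinct turkeys when every part varies independently.
func combinations(choices, parts int) int {
	n := 1
	for i := 0; i < parts; i++ {
		n *= choices
	}
	return n
}

func main() {
	fmt.Println(combinations(13, 8)) // 815730721
}
```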
The next thing we needed to decide was which App Engine runtime to use. Image manipulation tasks are CPU-bound, so performance is the deciding factor in this case.
To make an informed decision we ran a test. We quickly prepared a couple of equivalent demo apps for the new Python 2.7 runtime (which provides PIL, a C-based imaging library) and the Go runtime. Each app generates an image composed of several small images, encodes the image as a JPEG, and sends the JPEG data as the HTTP response. The Python 2.7 app served requests with a median latency of 65 milliseconds, while the Go app ran with a median latency of just 32 milliseconds.
This problem therefore seemed the perfect opportunity to try the experimental Go runtime.
I had no previous experience with Go and the timeline was tight: two days to be production ready. This was intimidating, but I saw it as an opportunity to test Go from a different, often overlooked angle: development velocity. How fast can a person with no Go experience pick it up and build something that performs and scales?
Design
The approach was to encode the state of the turkey in the URL, drawing and encoding the snapshot on the fly.
The base for every doodle is the background:
A valid request URL might look like this: http://google-turkey.appspot.com/thumb/20332620
The alphanumeric string that follows "/thumb/" indicates (in hexadecimal) which choice to draw for each layout element, as illustrated by this image:
The program's request handler parses the URL to determine which element is selected for each component, draws the appropriate images on top of the background image, and serves the result as a JPEG.
If an error occurs, a default image is served. There's no point serving an error page because the user will never see it - the browser is almost certainly loading this URL into an image tag.
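As a rough sketch of the decoding step, the function below turns the path from the example URL into one numeric choice per layout element. It is stricter than the handler shown later in this post, since it validates the input explicitly instead of relying on a panic; the decode function and the parts constant are names made up for this illustration.

```go
package main

import (
	"fmt"
	"log"
	"strconv"
	"strings"
)

const parts = 8 // number of layout elements encoded in the URL

// decode strips prefix from path and converts each remaining
// hexadecimal character into the index of the chosen image.
func decode(path, prefix string) ([]int, error) {
	if !strings.HasPrefix(path, prefix) {
		return nil, fmt.Errorf("path %q does not start with %q", path, prefix)
	}
	code := path[len(prefix):]
	if len(code) != parts {
		return nil, fmt.Errorf("want %d characters, got %d", parts, len(code))
	}
	choices := make([]int, parts)
	for i := 0; i < len(code); i++ {
		n, err := strconv.ParseUint(code[i:i+1], 16, 8)
		if err != nil {
			return nil, fmt.Errorf("position %d: %q is not a hex digit", i, code[i])
		}
		choices[i] = int(n)
	}
	return choices, nil
}

func main() {
	choices, err := decode("/thumb/20332620", "/thumb/")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(choices) // [2 0 3 3 2 6 2 0]
}
```

A single hexadecimal digit can express 16 values, which comfortably covers the 13 alternatives available for each part.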
Implementation
In the package scope we declare some data structures to describe the elements of the turkey, the location of the corresponding images, and where they should be drawn on the background image.
var (
	// dirs maps each layout element to its location on disk.
	dirs = map[string]string{
		"h": "img/heads",
		"b": "img/eyes_beak",
		"i": "img/index_feathers",
		"m": "img/middle_feathers",
		"r": "img/ring_feathers",
		"p": "img/pinky_feathers",
		"f": "img/feet",
		"w": "img/wing",
	}
	// urlMap maps each URL character position to
	// its corresponding layout element.
	urlMap = [...]string{"b", "h", "i", "m", "r", "p", "f", "w"}
	// layoutMap maps each layout element to its position
	// on the background image.
	layoutMap = map[string]image.Rectangle{
		"h": {image.Pt(109, 50), image.Pt(166, 152)},
		"i": {image.Pt(136, 21), image.Pt(180, 131)},
		"m": {image.Pt(159, 7), image.Pt(201, 126)},
		"r": {image.Pt(188, 20), image.Pt(230, 125)},
		"p": {image.Pt(216, 48), image.Pt(258, 134)},
		"f": {image.Pt(155, 176), image.Pt(243, 213)},
		"w": {image.Pt(169, 118), image.Pt(250, 197)},
		"b": {image.Pt(105, 104), image.Pt(145, 148)},
	}
)
The geometry of the points above was calculated by measuring the actual location and size of each layout element within the image.
Loading the images from disk on each request would be wasteful repetition, so we load all 106 images (13 * 8 elements + 1 background + 1 default) into global variables upon receipt of the first request.
var (
	// elements maps each layout element to its images.
	elements = make(map[string][]*image.RGBA)
	// backgroundImage contains the background image data.
	backgroundImage *image.RGBA
	// defaultImage is the image that is served if an error occurs.
	defaultImage *image.RGBA
	// loadOnce is used to call the load function only on the first request.
	loadOnce sync.Once
)

// load reads the various PNG images from disk and stores them in their
// corresponding global variables.
func load() {
	defaultImage = loadPNG(defaultImageFile)
	backgroundImage = loadPNG(backgroundImageFile)
	for dirKey, dir := range dirs {
		paths, err := filepath.Glob(dir + "/*.png")
		if err != nil {
			panic(err)
		}
		for _, p := range paths {
			elements[dirKey] = append(elements[dirKey], loadPNG(p))
		}
	}
}
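To see in isolation why sync.Once suits load-on-first-request, the small sketch below fires many concurrent callers at a single Once and counts how often the load function actually runs. The countLoads helper and its counter are invented for this demonstration; the counter stands in for the work of reading images from disk.

```go
package main

import (
	"fmt"
	"sync"
)

// countLoads calls once.Do(load) from n goroutines and reports how
// many times load was actually executed.
func countLoads(n int) int {
	var (
		once  sync.Once
		loads int
		wg    sync.WaitGroup
	)
	load := func() { loads++ } // stands in for reading images from disk

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Every caller blocks here until the first call to load
			// has finished, so none of them sees half-loaded data.
			once.Do(load)
		}()
	}
	wg.Wait()
	return loads
}

func main() {
	fmt.Println(countLoads(100)) // 1
}
```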
Requests are handled in a straightforward sequence:
- Parse the request URL, decoding each hexadecimal character in the path into its numeric value.
- Make a copy of the background image as the base for the final image.
- Draw each image element onto the background image using the layoutMap to determine where they should be drawn.
- Encode the image as a JPEG.
- Return the image to the user by writing the JPEG directly to the HTTP response writer.
Should any error occur, we serve the defaultImage to the user and log the error to the App Engine dashboard for later analysis.
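The fallback relies on Go's defer and recover. Stripped of all image handling, the pattern might be sketched as follows; pick and fallback are hypothetical names, with strings standing in for images.

```go
package main

import (
	"fmt"
	"log"
)

const fallback = "default image" // stands in for defaultImage

// pick returns images[i]. If indexing panics (for example because i
// is out of range), the deferred function logs the problem and
// substitutes the fallback through the named result.
func pick(images []string, i int) (result string) {
	defer func() {
		if err := recover(); err != nil {
			log.Printf("recovered: %v", err)
			result = fallback
		}
	}()
	return images[i]
}

func main() {
	heads := []string{"head0", "head1"}
	fmt.Println(pick(heads, 1)) // head1
	fmt.Println(pick(heads, 9)) // default image
}
```

Because the deferred function runs however the surrounding function exits, one recovery point covers every failure inside it.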
Here's the code for the request handler with explanatory comments:
func handler(w http.ResponseWriter, r *http.Request) {
	// Defer a function to recover from any panics.
	// When recovering from a panic, log the error condition to
	// the App Engine dashboard and send the default image to the user.
	defer func() {
		if err := recover(); err != nil {
			c := appengine.NewContext(r)
			c.Errorf("%s", err)
			c.Errorf("Traceback: %s", r.RawURL)
			if defaultImage != nil {
				w.Header().Set("Content-type", "image/jpeg")
				jpeg.Encode(w, defaultImage, &imageQuality)
			}
		}
	}()
	// Load images from disk on the first request.
	loadOnce.Do(load)
	// Make a copy of the background to draw into.
	bgRect := backgroundImage.Bounds()
	m := image.NewRGBA(bgRect.Dx(), bgRect.Dy())
	draw.Draw(m, m.Bounds(), backgroundImage, image.ZP, draw.Over)
	// Process each character of the request string.
	code := strings.ToLower(r.URL.Path[len(prefix):])
	for i, p := range code {
		// Decode hex character p in place.
		if p < 'a' {
			// it's a digit
			p = p - '0'
		} else {
			// it's a letter
			p = p - 'a' + 10
		}
		t := urlMap[i]    // element type by index
		em := elements[t] // element images by type
		if p >= len(em) {
			panic(fmt.Sprintf("element index out of range %s: "+
				"%d >= %d", t, p, len(em)))
		}
		// Draw the element to m,
		// using the layoutMap to specify its position.
		draw.Draw(m, layoutMap[t], em[p], image.ZP, draw.Over)
	}
	// Encode JPEG image and write it as the response.
	w.Header().Set("Content-type", "image/jpeg")
	w.Header().Set("Cache-control", "public, max-age=259200")
	jpeg.Encode(w, m, &imageQuality)
}
For brevity, I've omitted several helper functions from these code listings. See the source code for the full scoop.
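The listings above use the image APIs as they stood in late 2011. For readers on Go 1 or later, the sketch below shows the same copy, draw, and encode sequence with the current signatures, in which image.NewRGBA takes a rectangle and jpeg.Encode takes a *jpeg.Options. Solid-colour placeholders replace the PNG files, and the 400x220 background size and quality of 90 are assumptions, since neither value appears in this post.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/draw"
	"image/jpeg"
	"log"
)

// solid returns a w-by-h image filled with c; it stands in for loadPNG.
func solid(w, h int, c color.Color) *image.RGBA {
	m := image.NewRGBA(image.Rect(0, 0, w, h))
	draw.Draw(m, m.Bounds(), image.NewUniform(c), image.Point{}, draw.Src)
	return m
}

// compose copies the background, draws one element on top of it, and
// returns the result encoded as a JPEG.
func compose() ([]byte, error) {
	background := solid(400, 220, color.White)            // assumed size
	head := solid(57, 102, color.RGBA{R: 200, A: 255})    // placeholder element

	// Make a copy of the background to draw into.
	m := image.NewRGBA(background.Bounds())
	draw.Draw(m, m.Bounds(), background, image.Point{}, draw.Src)

	// Draw the element at the rectangle that layoutMap gives for "h".
	pos := image.Rect(109, 50, 166, 152)
	draw.Draw(m, pos, head, image.Point{}, draw.Over)

	var buf bytes.Buffer
	if err := jpeg.Encode(&buf, m, &jpeg.Options{Quality: 90}); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// dimensions reads just the JPEG header and reports width and height.
func dimensions(data []byte) (int, int, error) {
	cfg, err := jpeg.DecodeConfig(bytes.NewReader(data))
	if err != nil {
		return 0, 0, err
	}
	return cfg.Width, cfg.Height, nil
}

func main() {
	data, err := compose()
	if err != nil {
		log.Fatal(err)
	}
	w, h, err := dimensions(data)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(data) > 0, w, h)
}
```

In a handler, the bytes.Buffer would be replaced by the http.ResponseWriter, as in the original listing.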
Performance
This chart - taken directly from the App Engine dashboard - shows average request latency during launch. As you can see, even under load it never exceeds 60 ms, with a median latency of 32 milliseconds. This is wicked fast, considering that our request handler is doing image manipulation and encoding on the fly.
Conclusions
I found Go's syntax to be intuitive, simple and clean. I have worked a lot with interpreted languages in the past, and although Go is a statically typed, compiled language, writing this app felt more like working with a dynamic, interpreted language.
The development server provided with the SDK quickly recompiles the program after any change, so I could iterate as fast as I would with an interpreted language. It's dead simple, too - it took less than a minute to set up my development environment.
Go's great documentation also helped me put this together fast. The docs are generated from the source code, so each function's documentation links directly to the associated source code. This not only allows the developer to understand very quickly what a particular function does but also encourages the developer to dig into the package implementation, making it easier to learn good style and conventions.
In writing this application I used just three resources: App Engine's Hello World Go example, the Go packages documentation, and a blog post showcasing the Draw package. Thanks to the rapid iteration made possible by the development server and the language itself, I was able to pick up the language and build a super fast, production-ready doodle generator in less than 24 hours.
Download the full app source code (including images) at the Google Code project.
Special thanks go to Guillermo Real and Ryan Germick who designed the doodle.

