This post will go through how to interact with an SQL database in Go, and how to make your Go application production ready when it comes to connecting and querying your database server.
If you just want to see the example code, you can view it on GitHub.
The “database/sql” Standard Library
Let’s go through the architecture of the standard library package used for interacting with SQL databases in Go: database/sql.
This package gives us a common interface that can be used to connect and query different types of databases. This interface is implemented by the driver for each database type:
Our application code uses the interface provided by the database/sql standard library. The driver implements this interface to undertake the actual interaction with the database.
This means that we can use the same interface for interacting with a number of popular databases. Let’s see how we can make use of a driver for connecting to a PostgreSQL database.
Installing the Database Driver
The list of all compatible drivers can be found on the official Go Wiki page.
Since we’re using Postgres for this example, we’ll be using the pgx driver.
We can create a new directory for our code and initialize a new Go module:
mkdir go-sql-database-example
cd go-sql-database-example
# you can name the module according to your directory
go mod init github.com/sohamkamani/go-sql-database-example
Next, install the pgx driver and create a main.go file:
go get github.com/jackc/pgx/v4/stdlib
touch main.go
Creating a Test Database
Let’s create a test database and table that we can use for illustration.
If you’ve installed Postgres, we can create the database and tables in the interactive shell:
CREATE DATABASE bird_encyclopedia;
\c bird_encyclopedia
CREATE TABLE birds (
  id SERIAL PRIMARY KEY,
  bird VARCHAR(256),
  description VARCHAR(1024)
);
INSERT INTO birds (bird, description) VALUES
  ('pigeon', 'common in cities'),
  ('eagle', 'bird of prey');
Opening a Database Connection
We can now use the installed driver to open and verify a new connection with our database.
// file: main.go
package main

import (
	"database/sql"
	"fmt"
	"log"

	// we have to import the driver, but don't use it in our code
	// so we use the `_` symbol
	_ "github.com/jackc/pgx/v4/stdlib"
)

func main() {
	// The `sql.Open` function opens a new `*sql.DB` instance. We specify the driver name
	// and the URI for our database. Here, we're using a Postgres URI
	db, err := sql.Open("pgx", "postgresql://localhost:5432/bird_encyclopedia")
	if err != nil {
		log.Fatalf("could not connect to database: %v", err)
	}
	// To verify the connection to our database instance, we can call the `Ping`
	// method. If no error is returned, we can assume a successful connection
	if err := db.Ping(); err != nil {
		log.Fatalf("unable to reach database: %v", err)
	}
	fmt.Println("database is reachable")
}
The db variable in this example is an instance of *sql.DB, a handle to our database that manages a pool of connections for us.
Executing SQL Queries
There are multiple ways to execute queries depending on the use case. Let’s go through some of them:
We can use the db.QueryRow method when we require a single entry from our table (for example, fetching an entry based on its unique key).
First, let’s define a struct to represent the results of each query:
type Bird struct {
	Species     string
	Description string
}
Let’s use the QueryRow method to fetch the first entry in our birds table:
// `QueryRow` returns at most one row; any query error is deferred
// until `Scan` is called
row := db.QueryRow("SELECT bird, description FROM birds LIMIT 1")
// Create a new `Bird` instance to hold our query results
bird := Bird{}
// the retrieved columns in our row are written to the provided addresses
// the arguments should be in the same order as the columns defined in
// our query
if err := row.Scan(&bird.Species, &bird.Description); err != nil {
	log.Fatalf("could not scan row: %v", err)
}
fmt.Printf("found bird: %+v\n", bird)
This will give us the following output:
found bird: {Species:pigeon Description:common in cities}
If we want to query multiple rows, we can use the Query method, which returns a Rows instance instead of a single row like the previous example.
rows, err := db.Query("SELECT bird, description FROM birds LIMIT 10")
if err != nil {
	log.Fatalf("could not execute query: %v", err)
}
// release the underlying connection once we are done reading the rows
defer rows.Close()
// create a slice of birds to hold our results
birds := []Bird{}
// iterate over the returned rows
// we can go over to the next row by calling the `Next` method, which will
// return `false` if there are no more rows
for rows.Next() {
	// create an instance of `Bird` and write the result of the current row into it
	bird := Bird{}
	if err := rows.Scan(&bird.Species, &bird.Description); err != nil {
		log.Fatalf("could not scan row: %v", err)
	}
	// append the current instance to the slice of birds
	birds = append(birds, bird)
}
// `Next` also returns `false` if iteration fails, so we check for that error
if err := rows.Err(); err != nil {
	log.Fatalf("could not iterate over rows: %v", err)
}
// print the length, and all the birds
fmt.Printf("found %d birds: %+v\n", len(birds), birds)
Adding Query Parameters
Query parameters are the standard and secure way to add variables to your queries.
Most of the time, we need to modify our SQL queries according to variables defined in our code.
We can use the Query and QueryRow methods to add variables in our code as query parameters. To illustrate, let’s add a WHERE clause to get information about the eagle:
birdName := "eagle"
// For Postgres, parameters are specified using the "$" symbol, along with the index of
// the param. Variables should be added as arguments in the same order
// The sql library takes care of converting types from Go to SQL based on the driver
row := db.QueryRow("SELECT bird, description FROM birds WHERE bird = $1 LIMIT $2", birdName, 1)
// the code to scan the obtained row is the same as before
//...
Note: The symbols used for query params depend on the database you’re using. For example, we have to use the ? symbol in MySQL instead of $, which is specific to Postgres.
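To make the difference between the two placeholder styles concrete, here is a small sketch that builds a placeholder list for a query with a variable number of parameters (for example, an IN clause). The placeholders function is a hypothetical helper written for this post; it is not part of database/sql or of any driver.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// placeholders is a hypothetical helper that returns n comma-separated
// query placeholders: "$1, $2, ..." for Postgres, or "?, ?, ..." for
// databases such as MySQL that use positional question marks.
func placeholders(n int, postgres bool) string {
	if n <= 0 {
		return ""
	}
	parts := make([]string, n)
	for i := range parts {
		if postgres {
			// Postgres placeholders are numbered starting from 1
			parts[i] = "$" + strconv.Itoa(i+1)
		} else {
			parts[i] = "?"
		}
	}
	return strings.Join(parts, ", ")
}

func main() {
	fmt.Println("SELECT bird FROM birds WHERE bird IN (" + placeholders(3, true) + ")")
	fmt.Println("SELECT bird FROM birds WHERE bird IN (" + placeholders(3, false) + ")")
}
```

Only the placeholders are generated as text; the values themselves are still passed as separate arguments to Query or QueryRow, which is what keeps the query safe from SQL injection.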
Executing Writes - INSERT, UPDATE, and DELETE
As opposed to reads (SELECT queries), there are no returned rows after writing to a database.
Instead, the database returns information about the executed query, like the number of rows affected.
With the sql library, we can make use of the Exec method to execute write queries. Let’s see how we can use this to insert a new entry into the birds table:
// sample data that we want to insert
newBird := Bird{
	Species:     "rooster",
	Description: "wakes you up in the morning",
}
// the `Exec` method returns a `Result` type instead of a `Row`
// we follow the same argument pattern to add query params
result, err := db.Exec("INSERT INTO birds (bird, description) VALUES ($1, $2)", newBird.Species, newBird.Description)
if err != nil {
	log.Fatalf("could not insert row: %v", err)
}
// the `Result` type has special methods like `RowsAffected` which returns the
// total number of affected rows reported by the database
// In this case, it will tell us the number of rows that were inserted using
// the above query
rowsAffected, err := result.RowsAffected()
if err != nil {
	log.Fatalf("could not get affected rows: %v", err)
}
// we can log how many rows were inserted
fmt.Println("inserted", rowsAffected, "rows")
You can still execute write queries using Query or QueryRow, but the Exec method saves us some of the work for common operations, like getting the affected rows.
Connection Pooling - Timeouts and Max/Idle Connections
This section will go over how to best manage the network connections that we open to our database server.
The database server exists as a separate process from our Go application. All queries that we execute have to go over a TCP connection that we open with the database.
For production applications, we can have a “pool” of simultaneous connections made to the database server. This allows us to run multiple queries concurrently.
There are various options that we can use to configure the database connection pool in our application:
- Maximum Open Connections are the maximum number of parallel connections that can be made to the database at any time.
- Maximum Idle Connections are the maximum number of connections that can be inactive at any time. A connection is idle if no queries are being executed on it. This can happen if the number of queries being executed is less than the current pool of connections can handle.
- Idle Connection Timeout is the maximum time for which any given connection can be idle. After this time has elapsed, the connection to the database will be closed.
- Connection Lifetime is the maximum amount of time that a connection can be open (regardless of whether it’s idle or not).
By configuring these options, we can set the behavior of our connection pool. The *sql.DB instance gives us various methods to set these options:
db, err := sql.Open("pgx", "postgresql://localhost:5432/bird_encyclopedia")
if err != nil {
	log.Fatalf("could not connect to database: %v", err)
}
// Maximum Idle Connections
db.SetMaxIdleConns(5)
// Maximum Open Connections
db.SetMaxOpenConns(10)
// Idle Connection Timeout
db.SetConnMaxIdleTime(1 * time.Second)
// Connection Lifetime
db.SetConnMaxLifetime(30 * time.Second)
The exact values of these options depend on the overall throughput and average query execution time of our application, but in general:
- It’s good to have a small percentage of your connections be idle to provide for sudden spikes in query throughput.
- We should set the maximum number of open connections based on the network capacity of our database server and application servers, the lesser of which will be the limiting factor.
Query Timeouts - Using Context Cancellation
If a query is running for longer than expected, we can cancel it using a context variable.
Context-based cancellation is a popular method of prematurely exiting from running processes. The sql library provides helper methods to use existing context variables to determine when to cancel a query.
Let’s see how we can use context timeouts to stop a query midway:
// create a parent context
ctx := context.Background()
// create a context from the parent context with a 300ms timeout
// the returned cancel function releases the resources held by the
// timeout, so we always call it once we're done
ctx, cancel := context.WithTimeout(ctx, 300*time.Millisecond)
defer cancel()
// The context variable is passed to the `QueryContext` method as
// the first argument
// the pg_sleep method is a function in Postgres that will halt for
// the provided number of seconds. We can use this to simulate a
// slow query
_, err = db.QueryContext(ctx, "SELECT * from pg_sleep(1)")
if err != nil {
	log.Fatalf("could not execute query: %v", err)
}
Running this code should give us this error message:
2021/10/10 17:41:09 could not execute query: timeout: context deadline exceeded
exit status 1
In production applications, every query should have a timeout: a sudden increase in throughput or a network issue can slow queries down by orders of magnitude.
Slow queries block the connections that they are running on, preventing other queries from running on them. We should always set a timeout after which to cancel a running query, to unblock connections in these cases.
Going Forward
This post went over some of the common methods and practices to make your database production ready, but we are still just scratching the surface.
There are numerous ways you can improve the performance and robustness of your application and database in production:
- You can use DBStats to monitor the state of your database connections.
- Make use of transactions if your queries are dependent on each other’s results.
Of course, you should make sure to look at your application’s requirements in terms of scale and performance before designing its architecture.
What practices do you use when interacting with a database in your application? Let me know in the comments!
You can see the full code for all the examples in this post on GitHub.