Bash: declare in depth: Part 5: variable scoping

tl;dr

  • declare outside of a function defines a global variable,
  • declare inside of a function defines a local variable,
  • declare -g inside of a function defines a global variable.
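The three rules above can be checked with a short script. This is a minimal sketch, assuming Bash 4.2 or newer (older versions have no declare -g); the names top_level, only_local, made_global and define_vars are made up for illustration.

```shell
#!/usr/bin/env bash

# Outside of a function: a global variable.
declare top_level="set outside any function"

define_vars() {
  # Inside of a function: local, gone once define_vars returns.
  declare only_local="visible only during define_vars"
  # Inside of a function with -g: global, survives the call.
  declare -g made_global="set with declare -g"
}

define_vars

# ${name-<unset>} expands to <unset> if the variable does not exist.
printf 'top_level   = %s\n' "${top_level-<unset>}"
printf 'only_local  = %s\n' "${only_local-<unset>}"
printf 'made_global = %s\n' "${made_global-<unset>}"
```

Only only_local should show up as <unset> after the call.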

dynamic vs lexical scoping

Before we can look more deeply at how variables are scoped in bash, a quick refresher on dynamic vs lexical scoping is in order.

The value and visibility of a dynamically scoped variable depend on the call stack, whereas a lexically scoped variable is only visible in its lexical (read: source code context) environment.

Consider this example in Python, which uses lexical scoping:

x = 1

def foo():
  x = 2
  bar()

def bar():
  x = 3

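foo()
print("x = {}".format(x))

What will the last line print? The answer is "x = 1": assigning to x inside a function creates a new variable local to that function, so the x in foo is different from the x in bar, and both are different from the top-level x.

If Python were dynamically scoped, and assignment updated the nearest x on the call stack the way it does in Bash, we'd print 3 at the end instead.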
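For comparison, here is a sketch of the same program translated to Bash, calling foo before printing. Since neither function declares x as local, every assignment reaches the one global x.

```shell
#!/usr/bin/env bash

x=1

foo() {
  x=2   # no local: assigns to the x found via the call stack, here the global
  bar
}

bar() {
  x=3   # same variable again
}

foo
printf 'x = %s\n' "$x"   # prints: x = 3
```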

A bit of history

Bash is one of the few programming languages left in daily use that features dynamic scoping and dynamic scoping only.

Perl started out with dynamic scoping only, but added lexical scoping (my) as early as 1994, with the release of Perl 5.

The other big user of dynamic scoping was Emacs Lisp. Lexical binding was introduced in Emacs 24.1 and quickly became popular.

Perhaps most famously, JavaScript still features a vestigial form of dynamic scoping in the form of the this keyword: it is always in scope, yet what it is bound to depends on how the current function was called, not on where it was defined.

Scoping in bash

Bash only supports dynamic scoping. This is best illustrated with another example:

#!/usr/bin/env bash

let counter=5

increment() {
  let counter++
  printf "increment: counter = %s\n" "$counter"
  reset
}

reset() {
  counter=0
}

increment
printf "counter = %s\n" "$counter"

Both increment and reset operate on the same counter variable, so this program prints:

increment: counter = 6
counter = 0

We can limit the scope of the assignment in reset by making counter a local variable in increment.
This means that reset will still have access to a variable called counter, but this is essentially a new variable made available in the environment for the duration of the call to increment.

Here's the new version of increment:

increment() {
  local -i counter=$counter
  let counter++

  printf "increment: counter = %s\n" "$counter"
  reset
  printf "increment: counter = %s\n" "$counter"
}

This yields the following output:

increment: counter = 6
increment: counter = 0
counter = 5

We can see that reset only reset the copy of counter introduced by local -i and the global version of counter is untouched.

Essentially, every time you call local, you push a new binding for that variable name onto a stack. When the function that called local returns, its binding is popped again. In the example above, our variable stack would look like this:

+-------+
|   6   | <-- entry used while `increment` is active
+-------+
|   5   | <-- global value
+-------+
 counter
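The push and pop behavior can be watched with nested calls. In this sketch the names level, outer, inner and show are made up; show declares nothing itself, so it reports whichever binding is on top of the stack when it is called.

```shell
#!/usr/bin/env bash

level=global

show() {
  printf 'show sees: level = %s\n' "$level"
}

inner() {
  local level=inner   # push a second binding on top of outer's
  show
}                     # popped when inner returns

outer() {
  local level=outer   # push a binding on top of the global
  inner
  printf 'back in outer: level = %s\n' "$level"
}                     # popped when outer returns

outer
show
```

This prints:

```
show sees: level = inner
back in outer: level = outer
show sees: level = global
```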

Application: tunable parameters in interactive systems

Judging by the fact that the this keyword in JavaScript is a constant source of confusion and that every major user of dynamic scoping eventually introduced lexical scoping, one might ask what good dynamic scoping is at all.

I believe Emacs, a large-scale interactive system with built-in documentation for everything, is a good example of the usefulness of dynamic scoping. Emacs core functions do what you expect, and if you need to change their behavior, there is a variable you can temporarily override to get the desired behavior. In effect, every optional parameter to your function instead becomes a variable in top-level scope that you can override at your convenience.

A prominent example of this same principle is the IFS variable in Bash: it controls how words are split/joined and you can safely override it in a function or temporary environment if you need behavior that's not the default.
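As a small sketch of such an override, the hypothetical helper count_fields below makes IFS local so that read splits on commas. The override ends when the function returns, so the rest of the script keeps the default word splitting.

```shell
#!/usr/bin/env bash

# count_fields (a made-up helper) prints how many comma-separated
# fields its first argument contains.
count_fields() {
  local IFS=,                  # override lasts only for this call
  local -a fields
  read -r -a fields <<< "$1"   # read splits its input on IFS
  printf '%s\n' "${#fields[@]}"
}

count_fields "a,b c,d"         # prints 3: spaces no longer split words
```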

We can apply the same idea to our scripts. For example, here is a function that joins its arguments into a single string, using ", " as the separator by default:

# Set join_with to choose which separator to use for individual arguments
join_with=', '

# join produces a single string from all its arguments, by linking them with the value of `join_with`
join() {
  local result=

  while [[ "$#" -gt 0 ]]; do
    result+="$1"
    shift
    [[ "$#" -gt 0 ]] && result+="$join_with"
  done
  printf "%s\n" "$result"
}

In essence, this allows us to use a form of keyword arguments in Bash. An assignment placed in front of the call applies only for the duration of that call:

$ join a b c
a, b, c
$ join_with=/ join a b c
a/b/c
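Because the lookup of join_with follows the call stack, a caller can also override it with local, and the override then applies to every join call made further down. A sketch, repeating join from above so that it runs on its own; print_path is a hypothetical caller.

```shell
#!/usr/bin/env bash

join_with=', '

# Same join as above.
join() {
  local result=

  while [[ "$#" -gt 0 ]]; do
    result+="$1"
    shift
    [[ "$#" -gt 0 ]] && result+="$join_with"
  done
  printf "%s\n" "$result"
}

# print_path overrides the separator for everything it calls.
print_path() {
  local join_with=/
  join "$@"
}

print_path usr local bin   # prints: usr/local/bin
join usr local bin         # prints: usr, local, bin
```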
