Bash: declare in depth: Part 5: variable scoping
October 19, 2021•774 words
tl;dr
declare outside of a function defines a global variable,
declare inside of a function defines a local variable,
declare -g inside of a function defines a global variable.
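The three cases can be checked with a short script. This is a minimal sketch: the names outer, inner, forced and scope_demo are made up for illustration, and it assumes Bash 4.2 or newer, since older versions do not have declare -g.

```bash
#!/usr/bin/env bash

declare outer="set at top level"      # global: declared outside any function

scope_demo() {
  declare inner="set in scope_demo"   # local: gone once scope_demo returns
  declare -g forced="set with -g"     # global, although declared inside a function
}

scope_demo

# ${name-<unset>} expands to "<unset>" when the variable does not exist
printf 'outer  = %s\n' "${outer-<unset>}"
printf 'inner  = %s\n' "${inner-<unset>}"
printf 'forced = %s\n' "${forced-<unset>}"
```

After scope_demo returns, outer and forced still hold their values, while inner is reported as unset.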
dynamic vs lexical scoping
Before we can look more deeply at how variables are scoped in bash, a quick refresher on dynamic vs lexical scoping is in order.
The value and visibility of a dynamically scoped variable depend on the call stack, whereas a lexically scoped variable is only visible in its lexical (read: source code context) environment.
Consider this example in Python, which uses lexical scoping:
x = 1

def foo():
    x = 2  # local to foo
    bar()

def bar():
    x = 3  # local to bar

foo()
print("x = {}".format(x))
What will the last line print? The answer is 1, because the x in foo is different from the x in bar, and both are different from the top-level x.
If Python were dynamically scoped, with plain assignments writing to the nearest binding on the call stack as they do in Bash, we'd print 3 at the end instead.
A bit of history
Bash is one of the few programming languages left in daily use that features dynamic scoping and dynamic scoping only.
Perl started out with dynamic scoping but added lexical scoping as early as 1994.
The other big user of dynamic scoping was Emacs. In version 24.1 lexical binding was introduced and it quickly became popular.
Perhaps most famously, JavaScript still features a vestigial form of dynamic scoping in the form of the this keyword: it is always in scope, yet what it is bound to depends on the current call stack.
Scoping in bash
Bash only supports dynamic scoping. This is best illustrated with another example:
#!/usr/bin/env bash

let counter=5

increment() {
  let counter++
  printf "increment: counter = %s\n" "$counter"
  reset
}

reset() {
  counter=0
}

increment
printf "counter = %s\n" "$counter"
Both increment and reset operate on the same counter variable, so this program prints:
increment: counter = 6
counter = 0
We can limit the scope of the assignment in reset by making counter a local variable in increment.
This means that reset will still have access to a variable called counter, but this is essentially a new variable made available in the environment for the duration of the call to increment.
Here's the new version of increment:
increment() {
  local -i counter=$counter
  let counter++
  printf "increment: counter = %s\n" "$counter"
  reset
  printf "increment: counter = %s\n" "$counter"
}
This yields the following output:
increment: counter = 6
increment: counter = 0
counter = 5
We can see that reset only reset the copy of counter introduced by local -i and the global version of counter is untouched.
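For contrast, here is a sketch of what happens when the callee uses declare -g instead of a plain assignment. reset_global is a hypothetical variant of reset, not something from the script above, and the sketch again assumes Bash 4.2 or newer.

```bash
#!/usr/bin/env bash

counter=5

increment() {
  local counter=$(( counter + 1 ))   # local copy, initialised from the global
  printf 'increment: counter = %s\n' "$counter"
  reset_global
  printf 'increment: counter = %s\n' "$counter"
}

# reset_global is a hypothetical variant of reset
reset_global() {
  declare -g counter=0   # bypasses the caller's local and writes the global
}

increment
printf 'counter = %s\n' "$counter"
```

With this variant the local copy in increment should stay at 6 on both lines, and the global counter ends up at 0: the opposite of what the plain assignment in reset does.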
Essentially, every time you call local, you push a new variable onto a stack. Every time the current function call ends, that stack is popped. In the example above, our variable stack would look like this:
+-------+
|   6   | <-- entry used while `increment` is active
+-------+
|   5   | <-- global value
+-------+
 counter
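The push and pop behaviour is easier to see with more than one level. The following sketch uses a made-up recursive function, descend, where every call pushes its own depth on top of the caller's:

```bash
#!/usr/bin/env bash

depth=0   # global value, the bottom of the stack

descend() {
  # The right-hand side is expanded before the local is created,
  # so it still reads the caller's binding of depth.
  local depth=$(( depth + 1 ))
  printf 'entering: depth = %s\n' "$depth"
  if (( depth < 3 )); then
    descend
  fi
  # Same value as on entry: the callee's binding has been popped.
  printf 'leaving:  depth = %s\n' "$depth"
}

descend
printf 'global:   depth = %s\n' "$depth"
```

The entering lines count up from 1 to 3, the leaving lines count back down from 3 to 1, and the global depth is still 0 at the end.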
Application: tunable parameters in interactive systems
Judging by the fact that the this keyword in JavaScript is a constant source of confusion and that every major user of dynamic scoping eventually introduces lexical scoping, one might ask what good dynamic scoping is at all?
I believe Emacs, a large-scale interactive system with built-in documentation for everything, is a good example of the usefulness of dynamic scoping. Emacs core functions do what you expect, and if you need to change their behavior, there is a variable you can temporarily override to get the desired behavior. In effect, every optional parameter to your function instead becomes a variable in top-level scope that you can override at your convenience.
A prominent example of this same principle is the IFS variable in Bash: it controls how words are split/joined and you can safely override it in a function or temporary environment if you need behavior that's not the default.
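As a small illustration, here is a sketch of a function that overrides IFS locally. join_csv is a hypothetical helper; it relies on the fact that "$*" joins the positional parameters with the first character of IFS.

```bash
#!/usr/bin/env bash

# join_csv is a hypothetical helper: it joins its arguments with commas
join_csv() {
  local IFS=,            # shadows the global IFS until join_csv returns
  printf '%s\n' "$*"     # "$*" joins the arguments with the first character of IFS
}

join_csv a b c

# Word splitting in the caller still uses the default IFS
words="x y z"
parts=( $words )
printf '%s\n' "${#parts[@]}"
```

This prints a,b,c followed by 3: the override is only in effect while join_csv is running, so the unquoted $words afterwards is split on whitespace as usual.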
We can apply the same idea to our scripts. For example, here is a function that joins its arguments, using ", " as the separator by default:
# Set join_with to choose which separator to use for individual arguments
join_with=', '

# join produces a single string from all its arguments, by linking them with the value of `join_with`
join() {
  local result=
  while [[ "$#" -gt 0 ]]; do
    result+="$1"
    shift
    [[ "$#" -gt 0 ]] && result+="$join_with"
  done
  printf "%s\n" "$result"
}
In essence this allows us to use a form of keyword arguments in Bash:
$ join a b c
a, b, c
$ join_with=/ join a b c
a/b/c
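The override does not have to be a one-off prefix assignment. Because of dynamic scoping, a calling function can declare join_with as a local and every function it calls will see that value. A sketch, with join repeated so the script is self-contained and print_path as a hypothetical caller:

```bash
#!/usr/bin/env bash

join_with=', '

join() {
  local result=
  while [[ "$#" -gt 0 ]]; do
    result+="$1"
    shift
    [[ "$#" -gt 0 ]] && result+="$join_with"
  done
  printf "%s\n" "$result"
}

# print_path is a hypothetical caller that wants a different separator
print_path() {
  local join_with=/     # visible to every function print_path calls
  join "$@"
}

print_path usr local bin   # usr/local/bin
join usr local bin         # usr, local, bin
```

Once print_path returns, its local join_with is popped and join falls back to the global default.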