Development Environments

I've never really settled on a good way of managing software-related projects - both things I'm putting together myself, and downloaded resources I'm experimenting with. Not being a software developer, it has never become a priority, and so any structure which might be detectable in the contents of my "Projects" directory has thus far been purely coincidental. However, over the last year I've had several halting attempts at trying to change that.

The end goal of my recent experiments is to determine what the structure of my personal laptop OS is likely to be when Ubuntu 20.04 rolls around - I've settled on having an LTS distro on the device itself, and so my "development" workflow is likely to require more attention. With that in mind, here is a summary of the things I've tried while stumbling towards some sort of sane, professional setup.

Arbitrarily labelled directories: A sadly too common fallback, but not very neat or easy to manage after a period of inactivity. Backups become opaque ("Have I archived this one already?"), and nothing ever truly feels finished. The one exception to this, shining like a beacon of sanity, is the directory which contains example code I've put together as part of an effort to learn Rust. I adore Cargo, and the Rust toolchain management experience in general. It's nice to see that some lessons have been learned from the shit-show that is Python packaging.

Git repositories: A significant step up from raw directories - the trick I haven't mastered is figuring out when something is "significant" enough to require version control. The correct answer is probably to use git for everything immediately, which shifts the decision back a step, i.e. to determining what remote origin to use (if any). Git-by-default is the new habit I'm trying to form. Fortunately I love using git - I just wish I worked on more projects which allowed me to experiment with the more advanced bits of functionality.
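A minimal sketch of how git-by-default might be scripted, so that every new project directory starts life as a repository. The helper name `new_project`, the `run_git` switch, and the example project name are all assumptions made for illustration; only `git init <directory>` is the real command being wrapped.

```python
import subprocess
import tempfile
from pathlib import Path


def new_project(root, name, run_git=True):
    """Create root/name and make it a git repository from the start."""
    project = Path(root) / name
    project.mkdir(parents=True, exist_ok=True)
    # `git init <directory>` creates the repository in that directory.
    command = ["git", "init", str(project)]
    if run_git:
        # Requires git on PATH; raises CalledProcessError if git fails.
        subprocess.run(command, check=True)
    return project, command


# Dry run in a throwaway directory, so nothing touches a real Projects folder.
# "rust-examples" is a hypothetical project name.
with tempfile.TemporaryDirectory() as scratch:
    project, command = new_project(scratch, "rust-examples", run_git=False)
    created = project.is_dir()
    print(command)
```

Choosing a remote (if any) is deliberately left out, since that is the decision that can be deferred.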

Python virtualenvs: More of a necessary evil than a deliberate choice, Python virtual environments have probably introduced more hassle than benefit in the small use-cases I have had to date. I am certainly guilty of over-complicating things when using Python - my default position in the past has always been to install the Anaconda distribution, and occasionally Intel Python as well. In reality I hardly ever used Intel Python, or the high-performance functionality of Anaconda. As a result I've concluded that in future I'll stick to the system Python provided by the OS, or use Intel Python/Anaconda exclusively inside a VM or container. More generally, I need to do some more tests to decide between simply using pip+venv and using poetry, since these seem to be the best current options.
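For the plain venv route, the standard library can create an environment without any extra tooling. The sketch below is only an illustration: the `.venv` directory name and the `create_env` helper are assumptions, and pip bootstrapping is switched off by default here so the example does not depend on `ensurepip` being available (pass `with_pip=True` to get the usual pip+venv setup).

```python
import tempfile
import venv
from pathlib import Path


def create_env(project_dir, with_pip=False):
    """Create a virtual environment at <project_dir>/.venv and return its path."""
    env_dir = Path(project_dir) / ".venv"
    # venv.create is the programmatic equivalent of `python3 -m venv <dir>`.
    venv.create(env_dir, with_pip=with_pip)
    return env_dir


# Demonstrate against a temporary directory rather than a real project.
with tempfile.TemporaryDirectory() as scratch:
    env_dir = create_env(scratch)
    # Every venv carries a pyvenv.cfg file at its top level.
    has_config = (env_dir / "pyvenv.cfg").is_file()
    print(has_config)
```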

Docker containers: Using Docker for local development environments has never really appealed to me. Given that my work doesn't involve the development of persistent services, the benefits of matching a development workflow to a production one are mostly lost. Without this benefit, Docker really becomes just a convoluted and opaque alternative to running in a VM - as such I have tended to avoid the hassle.

Other application containers: Working in the HPC space, I have a natural disposition towards containers other than Docker - namely Singularity, Charliecloud, and more recently Sarus. While I toyed with the idea of Singularity-based development environments, the default approach to bind-mounting host directories wasn't really what I wanted. Similarly, immutable container images are what I want when building an executable - but not when doing ad-hoc experimentation with software packages.

Vagrant VMs: For a time, managing VirtualBox VMs with Vagrant was my preferred option for personal projects. However, I never really liked that the VirtualBox application itself had a license structure which effectively precluded professional use, or that the Vagrant templates themselves were written in Ruby syntax. While I have nothing against the language, it seemed unnecessarily burdensome that I needed to know anything about a particular programming language rather than just using a standard, well-documented config file syntax like YAML or TOML. Using alternative Vagrant providers had some appeal, but I soon decided that keeping a bunch of plugins up to date was an unnecessary hassle.

LXD/LXC: In recent months, LXD with ZFS-backed storage has become my go-to solution for isolated environments. Creating a system container is fast, and the set of available images is acceptable (albeit rather biased towards Ubuntu). LXD is about as lightweight an interface as I could ask for without skimping on any features - my one minor complaint is that customization is only really possible via quite "raw" methods like using cloud-init.
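To make the "raw" customization concrete, here is a rough sketch of passing a cloud-init document to a new LXD container at launch. It only builds the command rather than running it. The image alias, container name and package list are hypothetical, and the `user.user-data` config key is an assumption that depends on the image shipping cloud-init and on the LXD release in use (the key name may differ between releases).

```python
import shlex

# A tiny cloud-init document; the package choices are purely illustrative.
CLOUD_CONFIG = "#cloud-config\npackages:\n  - git\n  - python3-venv\n"


def lxc_launch_command(image, name, user_data):
    """Build an `lxc launch` command that hands cloud-init data to the container."""
    return [
        "lxc", "launch", image, name,
        # --config sets an instance config key at creation time.
        "--config", f"user.user-data={user_data}",
    ]


# "ubuntu:18.04" and "dev-sandbox" are example values, not recommendations.
command = lxc_launch_command("ubuntu:18.04", "dev-sandbox", CLOUD_CONFIG)
print(shlex.join(command))
```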

Multipass: I only became aware of Multipass very recently (late 2019), but quickly concluded that if LXD wasn't the right answer for me, then this might do the trick instead. Most of the ease-of-use features I want (copying files in/out, mounting directories) are available in Multipass by default, while LXD requires tweaks or hacks to the host OS. These are trivial enough, but knowing that the problem is being handled in an "official" way is nice. The downside is that despite now being in a 1.0 release, there still seem to be significant bugs - I've yet to get a Multipass VM to remain accessible across host sessions when suspending or rebooting.
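The Multipass workflow described above comes down to a handful of CLI calls. The sketch below just assembles them as argument lists, in the order they would be run; the VM name and both paths are made-up examples, and nothing is executed.

```python
import shlex


def multipass_workspace_commands(name, host_dir, guest_dir):
    """Return the commands to launch a VM, mount a host directory, and open a shell."""
    return [
        ["multipass", "launch", "--name", name],
        # Mount targets are written as <instance>:<path inside the VM>.
        ["multipass", "mount", host_dir, f"{name}:{guest_dir}"],
        ["multipass", "shell", name],
    ]


# Hypothetical VM name and directories, for illustration only.
commands = multipass_workspace_commands(
    "dev", "/home/me/Projects/demo", "/home/ubuntu/demo"
)
for command in commands:
    print(shlex.join(command))
```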

The One True Answer (for now, at least)

Realistically, I'll probably just use either Multipass or LXD as the "outer" layer for projects, with Python virtualenvs internally where necessary. Both have an acceptable approach to bind-mounting directories from outside, though the automation of UID/GID mapping in Multipass probably means that it will win out in the long run if the problems can be resolved. If it turns out using LXD is still part of my workflow in the coming months, it will almost certainly be because of ZFS and the ease of managing storage. I'm currently considering the best way to manage an isolated ZFS pool (based on the same approach as LXD, i.e. a file-system in a file) which can be shared between Multipass VMs. However, the fact that this is best achieved by just using a directory mounted from the host raises the question of what benefit ZFS is actually adding in this instance.

Regardless of the choice of Multipass or LXD for workspace isolation, enforcing git-by-default and a simpler Python workflow will definitely help to make my life easier.

