1  Software development

Note

This guidebook is written following the Diátaxis “how-to guide” style. And because this document reflects how we work in the Seedcase Project, it is living and constantly evolving. It won’t ever be in a state of “done”.

This chapter describes an approach to developing software that we’ve found to be effective at helping us build higher-quality, more reliable, and more maintainable software products that get built faster and have fewer issues over time. It combines several development methods and best practices. At a high level, it follows an iterative and incremental development approach with aspects of Kanban for project planning and management, including regular update and reflection meetings (called retrospectives). Within a given product, the workflow combines domain-driven design, documentation-driven development, and test-driven development.

So, how do you integrate all of these together when developing software? That’s what this section is all about! 🎉

1.1 Visual overview

The following diagram gives a visual overview of the stages of development and how they connect. We’ll cover each part in more detail in the following sections.

flowchart TB
    Aim([Purpose and need<br>for software]) --> DDDesign([Domain-driven design])
    DDDesign --> Design>"Explanation docs<br>(for design)"]
    DDDesign --> DDDev
    DDDev([Documentation-driven<br>development]) --> Guide>How-to guides]
    DDDev([Documentation-driven<br>development]) --> CodeDocs
    Guide <--> Test
    Guide <--> Develop[Develop<br>implementation]
    Design --> Test([Test-driven development])
    Test <--> Develop
    CodeDocs>"Reference docs<br>(for code)"] <--> Develop
    Develop --> Tutorials>Tutorials]
    Guide --> Tutorials
    Guide & Develop <--> Examples[Example real<br>world usage]
Figure 1.1: Software development workflow that combines domain-driven design, documentation-driven development, and test-driven development. Rounded boxes represent stages and methods of development, while boxes with an indented left edge represent the outputs of those stages as documentation types. Plain rectangular boxes represent generic activities that integrate many stages together.

Underlying all of this is an iterative and incremental development approach (described in more detail in Chapter 4), continuously integrating the changes, and continuously deploying the software and associated documentation (as a website).

1.2 Development steps

1.2.1 Identifying the purpose and need

Before even getting into the development workflow, the very first thing you need to do is clearly define the purpose of the software and the need that it meets. The purpose doesn’t have to be very detailed or specific at this point, but there needs to be one. At the least, there needs to be a clear need or a specific problem to be solved by software. This helps to recognise and define the scope of the software.

If you don’t have this clear purpose, everything later will be much harder. In general, don’t continue to the next steps until you have defined the purpose. Though it doesn’t have to be final, defining the purpose helps guide the direction of the subsequent steps. It’s better to not waste time on building something that doesn’t actually solve any problems or needs. There are more than enough real problems in the world to actually spend time on. Defining the purpose and need that the software meets can help mitigate this risk.

The purpose can come from you, your team, your organisation, or from users or clients. Either way, it takes a bit of time and effort to dig into what the actual purpose and need are. It’s difficult to describe how you can go about this process as it involves a lot of human communication, discussion, clarification, and understanding. In general, ask lots of questions and listen with as little judgement and as few assumptions as possible.

1.2.2 Using domain-driven design

Once you have the clear purpose, you can get into designing the software! Like the step above, this involves a substantial amount of listening, asking probing questions, and digging deeper into the problem domain.

Domain-driven design starts with having at least one, though usually more, brainstorming/workshop-like sessions. Participants of these sessions include domain experts (those actually working in the problem area), relevant stakeholders, developers, and users. The goal of these sessions is to identify, decompose, record, and sketch out all the different events, actions, processes, workflows, and tasks that are involved in the problem domain. Think of these as the “nouns” and “verbs” of the problem area.

You rarely identify everything in a given area of the problem the first time you focus on it. So within each session, have multiple rounds of brainstorming and questioning, and examine any given area until you feel like you’ve fully explored it.

How this might look in practice depends a lot on the specific problem domain and context. Among the best tools and practices are to have a synchronised (virtual or in-person) brainstorming or event storming session, using plain sticky notes, pencils/pens, whiteboards, and paper (or their digital equivalents). In general you might follow this sequence of rounds:

  1. Start with the beginning and the end of the problem area. What are the inputs, and what are the final outputs?
  2. Working from either end, start identifying all major events or objects/output (the “nouns” or “nouns” with past-tense actions connected to them) and the actions (the “verbs”) that happen in between the start and end. Focus on events and actions that people actually have to do or take, including any manual, non-computer tasks.
  3. For each event and action, dig in deeper and identify sub-events and sub-actions. This is where asking “why” and “how” questions can help immensely. A good way to think about this is the “5 Whys” (ask why five times) or the “explain like I’m five” (ELI5) technique. Don’t assume things; ask simple questions, even if you think you know the answer. You’ll be surprised how often you find out new things this way.
  4. For any area that is unclear or uncertain, make a note of it to come back to it and investigate it further later.
  5. Keep repeating this process until everyone agrees with what’s been (un)covered and there are no major gaps or missing pieces. In reality, as you continue through the design stage, you will still find areas that weren’t identified during these sessions.

Once you’ve gotten through this process, it’s time to organise it into a coherent narrative and structure. A good starting point is to make a diagram, such as a C4 Model diagram or even a simple flowchart to visually represent the overall architecture and flow of the system, such as how events and actions connect to each other. From there, convert all the “nouns” or events into objects or data structures, as appropriate for the programming language you’re using, and “verbs” into functions or methods (noting their signature). For example, create Python classes that map to the events you identified earlier and functions that map to the actions.

Usually, but not always, a “noun” has a “verb” before or after it, such as when an action or “verb” creates a “noun”. The reverse can also happen, where a “noun” is used by a “verb” to create another “noun”. But sometimes, “nouns” may not have a preceding or following “verb”, and the same is true for “verbs”. Events usually have a preceding past-tense “verb” attached to them; examples are shown below.

Ideally, you want to do several event storming sessions with the (physical or virtual) stickies, continuing until everyone is in general agreement about what the design of the software is and what it does. Once you’ve completed these sessions, the ultimate output of the domain-driven design process should be some type of design document that you and your team (or collaborators) can refer to later. For this, you can use the explanation document type from Diataxis. As much as possible, think of every stage of development as part of the product, not as a separate thing. This design document should be included as part of the software product’s documentation, so treat it with the same level of care, attention, and quality as you would any other part of the software.

Note

The design document should be an evolving and living document that gets updated as the design evolves, as you learn more about the problem domain, and as you get feedback during the implementation and from users (or actually using it in practice). This is where the “iterative and incremental development” approach comes in, which is described in more detail in Chapter 4.

In the case of the Seedcase Project, the problem domain would be the research data management, organisation, processing, storing, and sharing domain. So some of the “nouns” or events would be “downloaded data from API”, “filled in high-level metadata”, “verified data integrity”, and “created a data subset for a collaborator”. And some of the “verbs” or actions would be “call web API to get data”, “run verification checks” and “fill in metadata fields”.
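
As a minimal sketch of how these “nouns” and “verbs” might be carried over into code, the Python below turns two of the events into data classes and two of the actions into function signatures. The class names, attributes, and function names are hypothetical and chosen only for illustration; they are not taken from any Seedcase package.

```python
from dataclasses import dataclass
from pathlib import Path


# "Nouns" (events), named after the past-tense events listed above.
# All names and attributes here are hypothetical.
@dataclass(frozen=True)
class DownloadedData:
    """Event: data was downloaded from an API."""

    source_url: str
    path: Path


@dataclass(frozen=True)
class VerifiedData:
    """Event: the integrity of downloaded data was verified."""

    data: DownloadedData
    checks_passed: tuple[str, ...]


# "Verbs" (actions). Only the signatures matter at the design stage,
# so the bodies are deliberately left unimplemented.
def call_web_api(url: str, destination: Path) -> DownloadedData:
    """Action: call a web API to get data."""
    raise NotImplementedError


def run_verification_checks(data: DownloadedData) -> VerifiedData:
    """Action: run verification checks on downloaded data."""
    raise NotImplementedError
```

Leaving the bodies empty keeps the focus on the shape of the design: `call_web_api()` creates a `DownloadedData` “noun”, which `run_verification_checks()` then uses to create a `VerifiedData` “noun”.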

1.2.3 Using documentation-driven development

While you are technically writing documentation during the design stage, this stage is more about writing the documentation on how to use the software or on the API of the software itself. These are called “reference” and “how-to” documentation in the Diataxis framework. They are aimed at the (future) users of the software and written for their needs and perspectives.

During any given iteration you will likely include some designing, coding, and writing of docs. However, including an explicit focus on design and documentation early in developing a new product or when starting a major refactor of an existing product has many advantages. The biggest advantage is that it “brings the pain forward” and helps you address and resolve potential issues before you start coding. Code is a liability and is harder to change than documentation and design. Explicitly separating these stages and focusing on them before you code can help you avoid a lot of wasted time and effort later, when you’ve already written a lot of code and realise there are major issues.

Another benefit to focusing on design and documentation earlier is that when you start the coding stage, reviews of pull requests become much easier. The reviewer can focus on the implementation details rather than questioning the entire design and direction of the software, which is what can happen if the design and documentation haven’t been well thought out, explicitly described, and agreed on by the team. This can make the coding stage much more enjoyable and less stressful.

Another reason to start writing documentation before writing code is that there is often less desire to write documentation once coding starts or once specific features have been implemented. So to reduce this risk, use documentation-driven development to write out the documentation before writing any code.

This documentation-driven approach helps with finding any issues with the design. As you write the docs from the perspective and needs of the user (who might also be you), you will very likely encounter issues or gaps in the design. Then you can go back and adjust the design to resolve these issues. It also helps clarify to yourself and any contributors or team members how the software is supposed to work and how to use it, which can help guide the implementation and development process.

For those who enjoy coding, this stage as well as the previous design stage can feel a bit tedious and less fun. But it’s important to remember that documentation is much easier to fix and change than code. While the final software product can be considered an asset, the code itself is a liability. So you really want to be careful with writing code before you are fairly certain of the design. Once you have had several rounds of discussing, brainstorming, and refining the design and documentation, then you can be more confident that the design is solid and that the code will be less of a liability. After all, you want to engineer software, not just code it. A useful quote to keep in mind when viewing the design and planning stages is:

“Weeks of coding can save you hours of planning”

So, to start, you can write out the docs below in this order, though any order will work:

  1. Interface level docs, such as the docstrings for functions or classes in Python. Write out a description of the function or class, its parameters or attributes, and its outputs. This is the “reference” documentation in the Diataxis framework.

  2. Implementation comments within the functions and classes that describe what the flow of code might actually be, including some pseudocode. While not technically part of the Diataxis framework, these comments can be helpful in guiding the interface-level and how-to documentation, as well as provide documentation to developers to know how the code should approximately work.

  3. How-to guides for the software, written in Markdown files and in the same repository to keep docs and code close together. These are the “how-to guides” in the Diataxis framework. These guides should be written from the perspective of the user, and they should describe how to use the software to achieve specific goals or tasks. They shouldn’t try to explain the software or the reasons for things, unless it is necessary to understand the task. Keep them focused on the “how” and focused on specific goals.

These documents alongside the design documents will make it much easier to implement the software.
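
To illustrate the first two items in the list above, here is a hedged sketch of a Python function written documentation-first: the docstring and the planned flow exist, but the implementation does not yet. The function name `check_required_fields`, its parameters, and the docstring layout are assumptions made for this example, not an existing API.

```python
def check_required_fields(
    metadata: dict[str, str], required: list[str]
) -> list[str]:
    """Find required metadata fields that are missing or empty.

    Args:
        metadata: Field names mapped to the values filled in so far.
        required: Names of the fields that must be filled in.

    Returns:
        The names of the required fields that are missing or empty,
        in the same order as `required`.
    """
    # Planned flow (pseudocode, to be replaced by the implementation):
    # 1. For each name in `required`, look it up in `metadata`.
    # 2. Treat a missing key or a whitespace-only value as not filled in.
    # 3. Collect those names and return them.
    raise NotImplementedError("Documented first; not yet implemented.")
```

Because the stub already states its inputs, outputs, and intended flow, reviewers can comment on the interface before any logic is written, and the how-to guides can refer to the function by name.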