Playtesting for larps

This is a little guide about applying techniques from games user research to larps (and roleplaying games more generally), based on my experience running larps, doing games user research, and the things I wish I’d done.

While a lot of the examples will focus on the kind of UK larps I attend, as well as remote games run during covid, it should hopefully be useful to other larp styles (American, Nordic, etc) and to people running other forms of roleplaying games.

Why playtest?

In short, because larps have a huge opportunity cost: a ton of writing, design, and kit-making for potentially a single session. If that session doesn't go as the game runners intend, that's a lot of effort for a potentially pretty bad experience (for both players and game runners).

Larps also have a lot of moving parts: players, guides, rules, as well as less obvious things like the physical spaces they take place in. Issues with any one of these can have knock-on effects across the whole experience.

Document structure

First I’m going to go over some different goals that testing might be trying to achieve, then techniques to meet those goals, then analysis of feedback, and then ending with a small section talking about gathering feedback while a game is running. Each section will give examples from the larps I’ve run (so enjoy my mistakes!)

Part 1: Goals

A key thing to keep in mind is that you can't test everything, so you have to prioritise what you test. A useful model to borrow here is AbleGamers' APX pyramid, which we can use to fit testing goals into three categories:

Access: How do players access the game? What barriers exist for specific player groups? How can those barriers be removed?

Challenge: Where does friction exist in the game? Is that friction intentional or not? Is that friction proportional or not? 

Experiences: What experience are you aiming for your players to have? Does everyone get that experience? If not, what experience are they having?

In general, testing should start by targeting Access, then Challenge, and then look at the Experiences produced. However, these are by no means strict walls, and many testing techniques cover multiple categories; for example, critical facet tests focus on Experiences but will also reveal Access issues.

Part 2: Techniques

Usability tests and read-throughs

Aimed at ensuring Access by identifying usability issues. These tests involve getting people outside of the game running team to play or read through important parts of the documents, particularly rules and setting information, to make sure that other people interpret the text in the way you intend. There are broadly two ways to structure this kind of test:

Standard usability test:

Give a group of around 7 people the document of interest and ask them to send in any questions or comments they have after reading it.

7 is a commonly cited number, as this will turn up 80-90% of issues, with diminishing returns from adding more people.
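That rule of thumb comes from a standard model of issue discovery (often attributed to Nielsen and Landauer), in which each tester independently finds any given issue with some fixed probability. Here's a quick sketch, assuming a per-tester discovery rate of 0.31 (the commonly quoted figure; your real rate will vary):

```python
# Expected proportion of issues found as testers are added, under the
# simple independence model. The 0.31 per-tester rate is an assumption
# borrowed from the usability literature, not a measured value.

def proportion_found(testers: int, p: float = 0.31) -> float:
    """Expected proportion of issues found by `testers` independent readers."""
    return 1 - (1 - p) ** testers

for n in range(1, 11):
    print(f"{n} testers: ~{proportion_found(n):.0%} of issues found")
```

The curve flattens quickly: each extra tester mostly re-finds issues earlier testers already found, which is why going much past 7 readers for a single document rarely pays off.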

Case Study: Mech 2099 Character creation

Mech 2099 was a very silly two-session, turn-based larp in which the players played mech pilots and had to move on a grid. The game got to its first mission and the players (and I) all realised that there were two different ways of measuring ranges and diagonals: one measuring from the square the player was in, the other measuring from the square they were facing.

[Figure: a 5 by 4 grid showing the two mech attack interpretations. In Case A, the attack hits the square next to the mech, the square beyond it, and the two squares above and below that second square. In Case B, the attack hits the square the mech is in and the three squares in a vertical line to its left.]

Clarifying this seems like it should be relatively simple, but because people had spent a week with whichever interpretation they'd understood, it was a recurring issue in play. A usability test would have caught this relatively easily and allowed it to be clarified before it had settled in players' minds.
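To make the ambiguity concrete, here's a hypothetical reconstruction: the same straight-line attack template anchored two different ways. The function, grid coordinates, and template length are all illustrative, not Mech 2099's actual rules.

```python
# Hypothetical reconstruction of the Mech 2099 ambiguity: one "line 3"
# attack template, measured from two different anchor squares.
# Squares are (x, y) tuples; the mech faces in the +x direction.

def line_attack(origin, facing, length=3):
    """Squares hit by a straight-line attack projected from `origin`."""
    x, y = origin
    dx, dy = facing
    return [(x + dx * i, y + dy * i) for i in range(1, length + 1)]

mech = (0, 0)
facing = (1, 0)

# Interpretation A: measure from the square the mech occupies.
hits_a = line_attack(mech, facing)

# Interpretation B: measure from the square the mech is facing.
faced_square = (mech[0] + facing[0], mech[1] + facing[1])
hits_b = line_attack(faced_square, facing)

print(hits_a)  # [(1, 0), (2, 0), (3, 0)]
print(hits_b)  # [(2, 0), (3, 0), (4, 0)]
```

Both readings are internally consistent, which is exactly why a rules document can pass the author's own review and still split a player group down the middle.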

Rapid iteration testing and evaluation (R.I.T.E):

Give one person whatever document you want tested, gather their feedback and opinions on it, address that feedback, then give the revised document to a different person. Repeat.

This approach can be particularly useful early in working on a game, as each new tester identifies different issues from the last (or shows that you haven't fixed a previously identified issue). However, R.I.T.E can be less effective for long documents, or documents where there might be a wide variety of misinterpretations (for example, setting guides), as it can lead you to address a symptom (i.e. avoiding a specific misinterpretation) rather than the root cause.

Case Study: Parliament of Kingdoms

Parliament of Kingdoms was a Discord game which ran for 10 weeks. Each player managed a nation in a newly formed union, with four resources (Happiness, Iron, Food, Gold), a mix of three religions in each nation, a system of upgrades for each player to manage, and a Parliament system for player-proposed laws and projects affecting all the kingdoms.

On releasing the rules, an immediate problem: every player gave feedback that the order of operations in a turn was unclear. Had I run a R.I.T.E test beforehand, this would probably have come up with the very first reader.

Sensitivity read-throughs

Similar to usability read-throughs, but instead asking people with expertise in topics which may come up in game. For some games this is embedded in the process: for example, games about environmentalism which have an expert involved in the core design. For others it is an unspoken dimension: designers making a setting based on their own culture (typically Western European).

In addition to avoiding cultural appropriation and misrepresentation, sensitivity read-throughs can also be used to help with moderation of themes which are present and ensure that there aren’t unintentionally triggering/unpleasant situations being implied by the game world.  

Additional note: Don’t use fascist symbols by accident

The far-right has co-opted a large number of symbols, particularly from Nordic mythology. You should avoid using these symbols: in the lightest case you might look like a Nazi, and in the worst case fascists will turn up to your game. There are a few good resources for making this check, for example the Global Extremist Symbols Database.

Critical facet system tests

In this style of test you are aiming to understand the friction and experience of a critical part of your design in as close to real circumstances as possible. This is a later-stage test: run it as early as possible, but only after usability testing has made sure the system is parsable. These tests are the best way of identifying the experiences players have with a system.

One thing to bear in mind is that the first playthrough of any rules or roleplay brief involves a lot of cognitive load, so while it can be tempting to throw complicated situations straight into a critical facet test, it's important to build up to them over the course of a testing session and allow players to get used to whatever it is you're testing.

Example: Combat tests of Seventh Sun

Seventh Sun was a twice-weekly game which ran for just under two years. It involved character progression, and had weekend sessions composed of adventures in which characters would fight and talk through sequences of multiple encounters. A key resource in Seventh Sun was called "Stress": each character gains Stress when they use an ability, and having too much Stress has negative consequences.

Seventh Sun ended up having 5 playtests, starting with 10 people in the first test and scaling up to 25 by the final test (25-30 being the expected player + monster crew numbers for an adventure). We actually made a mistake in running these tests: we didn't perform usability read-throughs first, which turned the first hour of each playtest into an impromptu usability test as we had to clarify interactions. The early tests looked at the experience of playing a character; the later tests allowed us to look at how to structure encounters. The breakdown below gives examples of the kinds of issues found in each playtest.

Test 1:

  • Issues: There's no reason ever to stop using abilities; combat is noisy, complicated, and lethal. Support characters don't have anything to do, as everyone else is always doing things.
  • Changes: Skills which reduce Stress removed, so Stress now acts as a limit on how much a character can do.

Test 2:

  • Issues: Healing system is too slow, and doesn't work in the dark.
  • Changes: Changed to no longer involve drawing beads from a bag.

Test 3:

  • Issues: Large power disparity between certain archetypes. Some archetypes allow players to choose non-functional sets of skills.
  • Changes: Slowed down how quickly problem archetypes can cycle through their skills. Reduced options in character creation.

Test 4:

  • Issues: Various infinite loops which involve standing still for 20 seconds. Healing being tied to a single archetype pushes players away from it, rather than incentivising it.
  • Changes: A rule change preventing support spells being cast on yourself addressed the loops. Healing given to more archetypes.

Test 5:

  • Issues: The Guild archetype had become boring. Monsters need some ranged or crowd control capability in order to pose a threat to the player party. The healing system leads to a positive feedback loop of the adventuring party being either fine or all dead.
  • Changes: Changes made to make the Guilds feel more thematic. "Quick healing" added as a mechanic.

Example: Seventh Sun Kit guide tests

Larps usually have some form of guidance on what participants should wear to portray their characters. These guides often break kit down by faction, and range from minimum requirements to aspirational kit goals.

In Seventh Sun’s case, our Kit guide test was focused on checking that the different faction briefs were sufficiently distinct and achievable with the kit our players actually had.  

It also identified an additional issue: one of the faction kit designs was overly restrictive in the silhouettes non-binary players felt like they could create within it (being forced into stereotypically masculine or feminine silhouettes).  

As anyone familiar with cosplay knows, it's also worth encouraging your players to make similar checks on their own costumes. Can they actually sit down? How long does it take for them to put it on? What if it rains? etc.

Site walks

This technique aims to gain an understanding of the physical space you're using for your game. Broadly, it involves walking around the rooms or field you're using.

For all kinds of spaces you’re looking at:

  • How do people actually get there? Public transport, walking, parking?
  • How accessible are the spaces? E.g. do they require stairs? Do they have adequate places to sit?
  • Where can people get water from?
  • How far away are the nearest bathrooms? How complicated is the route to them?

For indoor spaces you’re looking at:

  • How many people can the rooms accommodate? 
  • What is actually in the rooms? Tables, chairs, etc.?
  • Does any part of the room need to be disguised?
  • How close are different spaces to the general public? 
  • What are the acoustics of the rooms like? Do they deaden sound or amplify it?
  • What routes are available if people need to leave a room quickly?

For outdoor spaces you’re looking at:

  • Which spaces have safe ground conditions to run encounters? (this will depend a lot on the kind of field you’re using and what type of encounters you’re planning)
  • Where do people wait?
  • Where can people go in bad weather?
  • How long does it take to move between different areas you’re using?

Part 3: Analysis

Okay, you’ve run your playtest and had a bunch of feedback emails sent in, now what?

This section is going to go over some approaches to analysis of feedback which I’ve found helpful. The goal of analysis is to turn a complicated set of data into ideas you can understand, communicate, and act on. 

Start by looking at what happened during the session and comparing it to what you expected to happen. In combat games this is relatively simple:

  • How quickly do players make progress against different encounters?
  • How often do players lose/have to retreat?
  • What happens when things go wrong for the players?

For social games this can be a bit more difficult to track in real time, so you may need to lean on the feedback you gather afterwards, but you can examine:

  • Are characters in factions I expected to be friendly actually friendly?
  • Are intended conflict points coming up?
  • Are people actually roleplaying, or just sitting in silence?

Then combine this with the post-game feedback you’ve gathered from players. How do players feel about what happened? This can be as simple as “Did people have fun?” but ideally should be tied to your goals for the game: “Did players with a given role have the experience I intended?”

To answer these questions it's often handy to break down each player's feedback into small bites of information, each a pair of event and emotion. You can then group these across players: by emotion, to understand how people are feeling about the game, or by event, to understand how players feel about particular systems.
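As a minimal sketch of that breakdown (the feedback snippets and labels here are invented for illustration):

```python
# Group (player, event, emotion) feedback bites two ways:
# by event, to see how people feel about a particular system,
# and by emotion, to see what is driving the overall mood.
from collections import defaultdict

feedback = [
    ("player1", "final battle", "excited"),
    ("player1", "healing rules", "frustrated"),
    ("player2", "healing rules", "confused"),
    ("player3", "final battle", "excited"),
]

by_event = defaultdict(list)
by_emotion = defaultdict(list)
for player, event, emotion in feedback:
    by_event[event].append((player, emotion))
    by_emotion[emotion].append((player, event))

print(dict(by_event))    # which systems are generating which feelings
print(dict(by_emotion))  # which events are behind each feeling
```

Even with a handful of players, this two-way grouping makes it obvious whether "the healing rules" is one grumpy player or a pattern, which is the whole point of breaking feedback into small, comparable pieces first.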

If you’ve got lots of feedback then a spreadsheet isn’t a bad idea at this point, and there are lots of formal methods for analysing feedback (content analysis, thematic analysis, etc.), so if you’re already familiar with one of those, you can always use it in your analysis.

Part 4: After session 1

If you are running a multi-session game, feedback gathering doesn’t have to end at session 1. Things will always play out in unexpected ways (“No game survives contact with the players”), and in-game power and roleplay can calcify around social groups.

Feedback forms, as well as requests for feedback by email, are typical ways to gather feedback during a game. One thing to consider in live feedback gathering is the way social discussion can shape player feedback: it can lead players to connect disparate issues, sometimes making it more difficult to identify root causes.

Case Study: Parliament of Kingdoms 2

(See the introduction to Parliament of Kingdoms in the usability read-throughs case study above.) Three weeks into Parliament of Kingdoms I had some design hunches about things which needed adjusting to improve the game (Food being underutilised, the Religion system being too opaque), so I sent out a short feedback survey to see if the players felt the same:

  • What do you think the biggest problem with Parliament of Kingdoms is?
  • What do you think could change to increase your enjoyment of Parliament of Kingdoms?
  • If you would like one system added to Parliament of Kingdoms, what would it be?
  • Any final comments?

The results were:

  • Players were having a lot more fun than I thought they were. (This is always a nice thing about feedback: it's good for game-runner morale.)
  • Players thought it was too difficult to judge which projects to choose to undertake without accurate costing beforehand.
  • Players thought there wasn’t enough to use the food resource on.
  • Also, the players were expecting to end up at war with each other, and the game didn't yet have any system for that written.

This showed an important way that feedback gathering augments design thinking: as a game runner you will understand a lot about your game and how players feel, but you don't have a complete picture. In this case I'd identified one issue which was a high priority for players, but had missed three highly impactful issues. As a nice bonus, those issues helped solve each other: warfare was a good way to make food more interesting as a system.

Game runners credits:

Seventh Sun: Simon Barnes, Callum Deery, Tegwen Hammersley, Sam Jenkins,

Lizzie Vialls, A. Williams.

Mech 2099 & Parliament of Kingdoms: Callum Deery.

Further reading:

AbleGamers' APX pyramid:

https://accessible.games/accessible-player-experiences/

Designing with Dissonance, about running online larps:

https://jarps.net/journal/article/view/27

Player Limitations and Accessibility in Larp by Beatrix Livesey-Stephens and Bjørn-Morten Vang Gundersen:

https://nordiclarp.org/2024/03/18/player-limitations-and-accessibility-in-larp/

Playtesting Venba, an interview by Steve Bromley:

https://gamesuserresearch.com/venba-playtesting-a-hit-narrative-cooking-game/
