Denis Pătruț

Meetings Are Sync Points

2024-06-12

When we consider multi-threaded programming (which I do not claim to be an expert in), how and when to sync is always a big part of the discussion.
Specifically, how to avoid syncing and what has to be synced no matter what.
Normally you would want all threads to do their thing independently as much as possible, without introducing sync points needlessly, in order to maximize the time that work can proceed in parallel.
Syncing interrupts the work that is proceeding, in order to share the results that were worked on in parallel.

In a team effort, we have a group of people all working on things together.
What one person cannot possibly finish alone becomes possible when a whole lot of people work together.
We can already see the parallel with multi-threading: each worker has a chunk of the work that they do on their side.
Every now and then, the work needs to be shared with others in order for them to be able to proceed.
Distributing work across a team of people is not that different from distributing work between parallel processing units.

Meetings are the most obvious parallel (pun unintended) to sync points.
They necessarily stop everyone from working, in order to discuss how to proceed.
We can compare them to full-on mutexes.
Though they are more like a whole bunch of mutexes, because we need to wait for each person to speak one at a time until the matter is resolved and the work that needs to be done becomes clear.
Meetings are the worst-case scenario in terms of sync points.
But in practice, this worst-case scenario is often the go-to, leading to the all-too-common case of having the entire day's calendar filled with them.
A whole day of just syncing is a whole day of no work progress.
Of course, in a team of people rather than processors, communication can sometimes be the actual work to be done, but is the solution really a stop-the-world 1+ hour long interruption?
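
To make the "one person speaks at a time" picture concrete, here is a minimal sketch in Rust of workers who all have to go through a single mutex. The names (`run_meeting`, `minutes`, `attendees`) are made up for this illustration and are not from any real codebase. The only point is that each thread blocks until the lock is free, so contributions are serialized no matter how many threads there are.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical "meeting": every attendee must hold the same lock to speak,
/// so only one of them makes progress at any given moment.
fn run_meeting(attendees: usize) -> Vec<usize> {
    let minutes = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..attendees)
        .map(|id| {
            let minutes = Arc::clone(&minutes);
            thread::spawn(move || {
                // Blocks here until no other attendee holds the lock.
                let mut log = minutes.lock().unwrap();
                log.push(id);
                // The lock is released when `log` goes out of scope.
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Speaking order depends on scheduling, so sort before returning.
    let mut spoken = minutes.lock().unwrap().clone();
    spoken.sort_unstable();
    spoken
}

fn main() {
    println!("attendees who spoke: {:?}", run_meeting(4));
}
```

Everyone gets their turn, but while one thread holds the lock the others that want it can only wait.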

In parallel computing, an alternative to stopping all relevant threads until the syncing is done (à la mutexes) is to use lock-free algorithms.
Consider, for example, having a place in memory where items can be written, and a counter that is incremented when they are ready.
With atomic instructions, a thread issuing an integer increment has it happen in one indivisible step, so that other threads never see a half-finished update and read the up-to-date value from shared memory once it is written.
Threads can produce content that other threads will pick up on.
There isn't a full-on interruption to everyone involved, just a post-it-note saying that the work is ready for the next step.
Of course there can still be inefficiencies, if downstream threads need to wait for the work to be done.
But if structured properly, there would be multiple things ready to do to prevent being idle.
This sort of system is difficult to implement, and communication breakdowns can lead to data corruption and crashes, but when done right it can be the most efficient approach.[1]
All that being said, what is still most efficient is to make sure that threads are as independent as possible, not needing information about what other threads are doing for as long as possible.
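
The "slots plus a ready counter" idea from this section can be sketched roughly as follows in Rust. This is only an illustration under some assumptions of mine: there is exactly one producer and one consumer, the number of items is fixed up front, and the slots are themselves atomics so that the example stays in safe Rust. The names `Board`, `ITEMS`, and `run_pipeline` are hypothetical.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

const ITEMS: usize = 8;

/// Shared state: a fixed set of slots plus a count of how many are ready.
struct Board {
    slots: Vec<AtomicU64>,
    ready: AtomicUsize,
}

fn run_pipeline() -> u64 {
    let board = Arc::new(Board {
        slots: (0..ITEMS).map(|_| AtomicU64::new(0)).collect(),
        ready: AtomicUsize::new(0),
    });

    let producer = {
        let board = Arc::clone(&board);
        thread::spawn(move || {
            for i in 0..ITEMS {
                // Write the item first...
                board.slots[i].store((i as u64 + 1) * 10, Ordering::Relaxed);
                // ...then leave the "post-it note". Release ordering makes the
                // slot write visible to a thread that sees the new count
                // through an Acquire load.
                board.ready.fetch_add(1, Ordering::Release);
            }
        })
    };

    let consumer = {
        let board = Arc::clone(&board);
        thread::spawn(move || {
            let mut sum = 0u64;
            let mut next = 0;
            while next < ITEMS {
                let ready = board.ready.load(Ordering::Acquire);
                if next == ready {
                    // Nothing new yet. A real worker would pick up other
                    // work here instead of just yielding.
                    thread::yield_now();
                    continue;
                }
                // Consume everything published since the last check.
                while next < ready {
                    sum += board.slots[next].load(Ordering::Relaxed);
                    next += 1;
                }
            }
            sum
        })
    };

    producer.join().unwrap();
    consumer.join().unwrap()
}

fn main() {
    println!("sum of published items: {}", run_pipeline());
}
```

The producer never waits for anyone, and the consumer only looks at the counter when it wants more work. Supporting several producers or reusing slots would need considerably more care than this sketch shows.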

If we bring the analogy back to teams of people working together, the equivalent of lock-free algorithms is making sure that wide meetings are kept to a minimum, and work is passed among relevant people without much fanfare.
Just as a thread would increment an integer when it has finished calculating some data, a worker can change the status on a ticket or post a message saying they are done.
The next worker in line, when they are in need of a new task, can check what was written and pick up where the previous worker left off. Lock-free and asynchronous.

But what happens if the plan has changed?
People working on a large project together are usually bound by the circumstances that led to them needing to carry out the task. But the requirements of a project can change over time.
In this case it may be necessary to interrupt everyone to tell them that the approach needs to change.
This is actually extremely common, and often the reason to have meetings.
The same thing can be achieved with an email or a post, but humans are not perfect; they may not see the notice or they may have questions or concerns with the new direction.
In this scenario a meeting is the fastest way to get everyone on the same page.

Just saying "don't have meetings" is not the answer.
Sync points are needed, and often the work that needs to be done is precisely figuring out what needs to happen next.
This is the single biggest way that the analogy breaks down: while computers are just running a pre-defined program, people are free to discuss how and what to do next.[2]
What needs to be done generally depends on information that may not be available until something is done first.
This is why it's difficult to make a plan that covers a very long stretch of time.
Milestones can be set, and the more detailed day-to-day plan can evolve over time.
We can see, then, that these milestones are natural sync points.
Organizing sync points ahead of time can actually reduce the number of sudden ones.

If we make a short-term plan that we evaluate at the end of it, we can introduce a wide sync point at the beginning/end, and allow for everything in the middle to be as asynchronous and lock-free as possible.
Problems can be logged, and in-progress results can be continuously posted.
There can be a worker whose job is to check this asynchronously communicated progress, and make small adjustments (also asynchronously) as needed.
They can also decide to interrupt everything and cancel the plan, starting a new cycle, if things have gone too far off the rails.
This should be the real job of managers and PMs, and most of it can be completely taken care of without stop-the-world meetings.
Of course, this can only happen if workers are continuously volunteering their progress.
In the real world, work that isn't tracked in tasks happens all the time, and task management software is often convoluted, which discourages using it continuously.

Many parallels can be seen between parallel/asynchronous computing and teamwork, and many solutions are similar.
But just as parallel programming is an extremely complex, messy, and error-prone task, so is designing workflows for teamwork.
I guess we just need to take things one step at a time! (and prevent This Meeting Could've Been an Email™ situations)


  1. Atomic instructions are of course not free, and not technically 100% lock-free if we consider the memory bus and other CPU-internal mechanisms. But they are sufficiently less locky than a full-on mutex or other more intrusive sync methods that we can consider them essentially lock-free. 

  2. This depends of course on how strict the organization is. In very top-down organizations, lower workers are essentially just worker threads receiving software updates, while the upper levels are the programmers. In a flatter/looser hierarchy, discussions are had to convince each other about the approach going forward.