Handle concurrent binary downloads using file locks #656

EhabY · 2025-11-18T14:53:21Z

Closes #575

EhabY · 2025-11-18T14:56:31Z

src/core/cliManager.ts

+	/**
+	 * Unified handler for any binary-related failure.
+	 * Checks for existing or old binaries and prompts user once.
+	 */
+	private async handleAnyBinaryFailure(


This may be overly defensive programming. Essentially, we check whether a binary already exists and try to execute it; otherwise, we search for an .old-* binary and try to use that.

src/core/downloadProgress.ts

src/core/cliManager.ts

src/core/binaryLock.ts

src/core/cliManager.ts

mtojek · 2025-11-19T10:16:09Z

test/unit/core/downloadProgress.test.ts

+				bytesDownloaded: 1500,
+				totalBytes: 10000,
+				status: "downloading",
+				timestamp: Date.now(),


    export interface DownloadProgress {
      bytesDownloaded: number;
      totalBytes: number | null;
      status: "downloading" | "verifying";
    }

shouldn't we add timestamp here?

Oh, I failed to update the tests. The first iteration had a timestamp in there, but it is no longer used since we now depend on lock file staleness (which is handled by proper-lockfile).

src/core/downloadProgress.ts

src/core/cliUtils.ts

test/unit/core/cliManager.concurrent.test.ts

jakehwll · 2025-11-20T09:44:32Z

src/core/cliUtils.ts

+		const stats = await Promise.all(
+			oldBinaries.map(async (f) => ({
+				path: f,
+				mtime: (await fs.stat(f)).mtime,
+			})),
+		);


Would we want to use Promise.allSettled() here instead so we don't accidentally lose the entire dataset if fs.stat(f) were to fail?

Oh, good idea, that way we can at least attempt to run one of them. I've replaced this with:

    const stats = await Promise.allSettled(
      oldBinaries.map(async (f) => ({
        path: f,
        mtime: (await fs.stat(f)).mtime,
      })),
    ).then((result) =>
      result
        .filter((promise) => promise.status === "fulfilled")
        .map((promise) => promise.value),
    );

We could potentially log here for files that could not be read, similar to rmOld 🤔
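The logging idea above could look roughly like the following. This is only a sketch: `statOldBinaries`, `StatFn`, and `onError` are hypothetical names, the stat call is injected so the example runs on its own (the PR uses `fs.stat`), and sorting newest first is an assumption about how the result is consumed.

```typescript
// Hypothetical helper: `statFn` and `onError` are injected for illustration;
// in the PR the stat call is `fs.stat` from Node's fs/promises.
interface BinaryStat {
  path: string;
  mtime: Date;
}

type StatFn = (path: string) => Promise<{ mtime: Date }>;

async function statOldBinaries(
  oldBinaries: string[],
  statFn: StatFn,
  onError: (path: string, reason: unknown) => void,
): Promise<BinaryStat[]> {
  // allSettled never rejects, so one unreadable file cannot discard the rest.
  const results = await Promise.allSettled(
    oldBinaries.map(async (path) => ({
      path,
      mtime: (await statFn(path)).mtime,
    })),
  );

  const stats: BinaryStat[] = [];
  results.forEach((result, index) => {
    if (result.status === "fulfilled") {
      stats.push(result.value);
    } else {
      // Report the unreadable file instead of silently dropping it.
      onError(oldBinaries[index], result.reason);
    }
  });

  // Assumption: callers want the most recently modified binary first.
  return stats.sort((a, b) => b.mtime.getTime() - a.mtime.getTime());
}
```

Results come back in input order, so the index of a rejected entry identifies the file that failed.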

src/core/binaryLock.ts

mafredri

I'm not against the file locking implementation used here, but asking out of curiosity: would IPC communication between VS Code windows have been an option here, like with the login prompt?

Thanks for working on this ❤️, concurrent downloads have been a pain point for me!

mafredri · 2025-11-20T10:12:31Z

src/core/binaryLock.ts

+							const release = await this.safeAcquireLock(binPath);
+							if (release) {
+								clearInterval(interval);
+								this.output.debug("Download completed by another process");


If we acquire the lock, could it also mean that the other process failed to download? Do we need to handle that case separately?

Actually, you are right: if this was a takeover, it could mean that the other process was stuck. The logic that follows is still correct, though, since we essentially recheck whether we have the right binary and attempt to download if need be.

We even log "Acquired download lock" right after, so I'll just remove this log line.
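The point made here, that holding the lock says nothing about whether the previous holder succeeded, can be sketched as follows. The names `ensureBinary`, `acquireLock`, `hasExpectedBinary`, and `download` are hypothetical and stand in for the PR's actual methods.

```typescript
// Hypothetical sketch: all three callbacks are injected stand-ins, not the
// PR's real lock or download functions.
type Release = () => Promise<void>;

async function ensureBinary(
  acquireLock: () => Promise<Release>,
  hasExpectedBinary: () => Promise<boolean>,
  download: () => Promise<void>,
): Promise<"reused" | "downloaded"> {
  const release = await acquireLock();
  try {
    // The previous holder may have finished, crashed, or been taken over as
    // stale, so verify the binary instead of assuming the download completed.
    if (await hasExpectedBinary()) {
      return "reused";
    }
    await download();
    return "downloaded";
  } finally {
    // Always release, even when the download throws.
    await release();
  }
}
```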

src/core/cliUtils.ts

src/core/cliManager.ts

mafredri · 2025-11-20T10:25:30Z

src/core/cliManager.ts

+		);
+		if (existingCheck.version) {
+			// Perfect match - use without prompting
+			if (existingCheck.matches) {


If this is true, why did we ever try to download and fail?

This could just be a lot of defensive programming, see https://github.com/coder/vscode-coder/pull/656/files#r2538524344

We could have encountered an error at any stage here while another process downloaded the binary.

mafredri · 2025-11-20T10:26:55Z

src/core/cliManager.ts

-			case 304: {
-				this.output.info("Using existing binary since server returned a 304");
+			// Version mismatch - prompt user
+			if (await this.promptUseExistingBinary(existingCheck.version, message)) {


Unless I misread, this prompt says Run/Exit. If I select Exit I would not expect old binary to be used. Consider changing terminology?

Hmmm, yeah if you click "Cancel" you might get another prompt with an even older version. We should perhaps show this prompt once only for the first match (whether it's binPath or an old binary).

I added throw error; so that if binPath exists we never even attempt to read old binaries.

mafredri · 2025-11-20T10:30:54Z

src/core/cliManager.ts

+				(oldCheck.matches ||
+					(await this.promptUseExistingBinary(oldCheck.version, message)))
+			) {
+				await fs.rename(oldBinaries[0], binPath);


Let's assume binPath exists already, but the existing check did not match, and the user selected Exit in prompt. Then we fallback to old binary and ask the user again. Now the rename may fail and we throw an error.

Why do we have to rename vs using the old binary path as-is? Too many hard-coded references to the non-old path?

The flow here is a bit sketchy, I agree. I've made it so that if you reject binPath, we do not attempt to read old binaries, since to me that makes even less sense (if you don't want to run the most up-to-date binary, why would you run an older one?).

See https://github.com/coder/vscode-coder/pull/656/files#r2545981797

We rename because a lot of the logic depends on the proper name, for example removing old binaries without touching binPath.
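The revised flow described in this thread (prompt at most once, and never offer an old binary after binPath was rejected) might be sketched like this. `chooseFallbackBinary`, `BinaryCheck`, and `promptUse` are hypothetical names; only the `version` and `matches` fields mirror the diff.

```typescript
// Hypothetical sketch of the fallback decision, not the PR's actual code.
interface BinaryCheck {
  path: string;
  version: string | null; // null when the file is missing or not runnable
  matches: boolean; // true when it is the version the server expects
}

async function chooseFallbackBinary(
  error: Error,
  existing: BinaryCheck,
  oldBinaries: BinaryCheck[],
  promptUse: (version: string) => Promise<boolean>,
): Promise<string> {
  if (existing.version) {
    if (existing.matches || (await promptUse(existing.version))) {
      return existing.path;
    }
    // The user rejected binPath: do not offer an even older binary.
    throw error;
  }

  // binPath is unusable, so consider only the first runnable old binary.
  const old = oldBinaries.find((candidate) => candidate.version !== null);
  if (old && old.version && (old.matches || (await promptUse(old.version)))) {
    return old.path;
  }
  throw error;
}
```

Rethrowing the original error keeps the download failure visible to the caller when no fallback is accepted.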

mafredri · 2025-11-20T10:34:01Z

src/core/cliManager.ts

+			);
+			if (
+				oldCheck.version &&
+				(oldCheck.matches ||


Why do we only try one old binary when we are tracking all of them?

I'm trying to think about when this condition actually might happen. I.e. we have an old binary and it's the right version. This means you either downgraded coderd or switched deployment. In either case, any of the old binaries may be the correct one.

We also clear old binaries when we download new ones, so it's unlikely that we have multiple (it could happen if some error prevented us from removing them). Different deployments don't apply here since each deployment has its own folder, but yes, if you downgrade then getting the most recent one might not matter. Should we just take the first match, whatever it is?
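Taking the first match, as suggested above, could be sketched as below. `findFirstMatchingBinary` and the injected `check` callback are hypothetical; the `version` and `matches` fields follow the shape seen in the diff.

```typescript
// Hypothetical sketch: `check` stands in for whatever runs the binary and
// compares its reported version with the server's.
interface VersionCheck {
  version: string | null;
  matches: boolean;
}

async function findFirstMatchingBinary(
  candidates: string[],
  check: (path: string) => Promise<VersionCheck>,
): Promise<string | undefined> {
  // Check sequentially so we stop executing binaries after the first match.
  for (const path of candidates) {
    const result = await check(path);
    if (result.version && result.matches) {
      return path;
    }
  }
  return undefined;
}
```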

mafredri · 2025-11-20T10:36:24Z

src/core/cliManager.ts

+			binPath + ".temp-" + Math.random().toString(36).substring(8);
+
+		try {
+			const removed = await cliUtils.rmOld(binPath);


Why do we remove the old before our download has completed successfully? Is this to ensure that we don't try to download in case updating the binary would fail (e.g. in use on Windows)?

Because we only remove old binaries, we still keep the proper binary (binPath). Old binaries should have been removed at the end of the previous download, but we remove them here because there are fewer conflicts (this was already the case before).

For example, this is what the folder might look like:

coder-linux

coder-linux.old-123

So we remove only the "old" binaries and keep the most recent one, for example if the download fails.
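The distinction between the proper binary and its old copies can be illustrated with a small filter. `selectOldBinaries` is a hypothetical helper, and matching on the `.old-` suffix is an assumption based on the folder listing above rather than the actual `rmOld` implementation.

```typescript
// Hypothetical sketch: picks the entries that a cleanup step would remove,
// leaving the proper binary (binName) untouched.
function selectOldBinaries(binName: string, entries: string[]): string[] {
  const oldPrefix = `${binName}.old-`;
  return entries.filter((name) => name.startsWith(oldPrefix));
}

// With the folder from the example above, only the old copy is selected.
const removable = selectOldBinaries("coder-linux", [
  "coder-linux",
  "coder-linux.old-123",
]);
```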

mafredri · 2025-11-20T10:42:49Z

src/core/cliManager.ts

+					this.output.info("Using existing binary since server returned a 304");
+					return binPath;
+				}
+				case 404: {


Theoretically, this could also be caused by a poorly configured reverse proxy. I'm just raising a thought, don't think we need to handle it here. There's no easy way to check this either (e.g. no https://dev.coder.com/bin/.exists)

That is true, but keep in mind that their Coder logs would contain the failed request at the error level. So potentially they can quickly identify this using the logs 🤔

EhabY · 2025-11-20T13:17:32Z

would IPC communication between VS Code windows have been an option here, like with the login prompt?

@mafredri We could maybe use this for the progress monitoring, but we definitely need a lock file since that makes crash/staleness/stuck handling much simpler. Libraries like proper-lockfile are already implemented to handle this. IPC communication using secrets might get a bit tricky and cumbersome to work with. Also, we are already writing the binary to the file system, so it only feels natural to use a file lock in the same folder 😉
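To illustrate why a lock file simplifies crash and stuck-process handling, here is a minimal sketch of modification-time-based staleness, the general idea lock-file libraries rely on. It is not proper-lockfile's code; `decideLockAction` and its parameters are hypothetical.

```typescript
// Hypothetical sketch: a holder is expected to refresh the lock's mtime
// periodically, so a lock that has not been touched for `staleMs` is treated
// as abandoned (for example because the owning process crashed).
type LockAction = "acquire" | "wait" | "takeover";

function decideLockAction(
  lockMtimeMs: number | null, // null when no lock file exists
  nowMs: number,
  staleMs: number,
): LockAction {
  if (lockMtimeMs === null) {
    return "acquire";
  }
  return nowMs - lockMtimeMs > staleMs ? "takeover" : "wait";
}
```

No message passing between windows is needed: any process can reach the same decision from the file system alone.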

mtojek

Thanks for addressing my comments! As long as other reviewers approve, you're good to go 👍

mafredri

Nothing to add, thanks for amending/answering my comments 👍🏻

EhabY mentioned this pull request Nov 18, 2025

Fix race condition when downloading binaries from different windows #647

Closed

EhabY commented Nov 18, 2025

mtojek requested review from jakehwll and mtojek November 18, 2025 18:33

mtojek reviewed Nov 19, 2025

jakehwll reviewed Nov 20, 2025

mafredri reviewed Nov 20, 2025

EhabY added 2 commits November 21, 2025 11:12

Handle concurrent binary downloads using file locks

8a3d179

Address review comments

f6d76c8

EhabY force-pushed the concurrent-binary-download-lockfile-fix branch from 6d67d87 to f6d76c8 Compare November 21, 2025 08:12

Add network and disk error tests

eb5d3b4

EhabY requested review from jakehwll, mafredri and mtojek November 21, 2025 09:09

mtojek approved these changes Nov 21, 2025

mafredri approved these changes Nov 21, 2025


4 participants