@EhabY
Collaborator

@EhabY EhabY commented Nov 18, 2025

Closes #575

Comment on lines +248 to +251
/**
* Unified handler for any binary-related failure.
* Checks for existing or old binaries and prompts user once.
*/
private async handleAnyBinaryFailure(
Collaborator Author

Maybe this is too much defensive programming. Essentially, we check whether a binary exists at the expected path and try to execute it; otherwise, we search for an .old-* binary and try to use that instead.

@mtojek mtojek requested review from jakehwll and mtojek November 18, 2025 18:33
bytesDownloaded: 1500,
totalBytes: 10000,
status: "downloading",
timestamp: Date.now(),
Member

export interface DownloadProgress {
	bytesDownloaded: number;
	totalBytes: number | null;
	status: "downloading" | "verifying";
}

Shouldn't we add timestamp to this interface?

Collaborator Author

Oh, I failed to update the tests. In the first iteration I had a timestamp in there, but it is no longer used: we now depend on lock file staleness, which is handled by proper-lockfile.

Comment on lines 125 to 130
const stats = await Promise.all(
	oldBinaries.map(async (f) => ({
		path: f,
		mtime: (await fs.stat(f)).mtime,
	})),
);

Would we want to use Promise.allSettled() here instead, so that a single failing fs.stat(f) doesn't cost us the entire dataset?

Collaborator Author

Oh, good idea: that way we could at least attempt to run one of them. I've replaced this with:

const stats = await Promise.allSettled(
	oldBinaries.map(async (f) => ({
		path: f,
		mtime: (await fs.stat(f)).mtime,
	})),
).then((result) =>
	result
		.filter((promise) => promise.status === "fulfilled")
		.map((promise) => promise.value),
);

We could potentially log the files that could not be read here, similar to rmOld 🤔
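
A minimal sketch of the logging idea, not the code from this PR: it keeps the Promise.allSettled() approach from the snippet above and reports each file whose stat call rejected. The names statReadable, StatFn, and warn are hypothetical, and the stat function is injected so the sketch runs without touching the disk.

```typescript
// Hypothetical types for illustration; the PR calls fs.stat directly.
type StatFn = (path: string) => Promise<{ mtime: Date }>;

interface BinaryStat {
	path: string;
	mtime: Date;
}

// Stats every path, keeps the ones that succeed, and warns about the rest.
async function statReadable(
	paths: string[],
	stat: StatFn,
	warn: (message: string) => void,
): Promise<BinaryStat[]> {
	const results = await Promise.allSettled(
		paths.map(async (path) => ({ path, mtime: (await stat(path)).mtime })),
	);
	const readable: BinaryStat[] = [];
	results.forEach((result, index) => {
		if (result.status === "fulfilled") {
			readable.push(result.value);
		} else {
			// Log and skip instead of failing the whole batch.
			warn(`Could not stat ${paths[index]}: ${String(result.reason)}`);
		}
	});
	return readable;
}
```

Promise.allSettled() preserves input order, so the index of a rejected result maps back to the path that failed.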

Member

@mafredri mafredri left a comment

I'm not against the file locking implementation used here, but asking out of curiosity: would IPC communication between VS Code windows have been an option here, like with the login prompt?

Thanks for working on this ❤️, concurrent downloads have been a pain point for me!

const release = await this.safeAcquireLock(binPath);
if (release) {
clearInterval(interval);
this.output.debug("Download completed by another process");
Member

If we acquire the lock, could it also mean that the other process failed to download? Do we need to handle that case separately?

Collaborator Author

Actually, you are right: if this was a takeover, it could mean that the other process was stuck. The logic that follows is still correct, though, since we recheck whether we have the right binary and attempt a download if need be.

We also log "Acquired download lock" right after, so I'll just remove this log line.
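
To make the recheck concrete, here is a small sketch under stated assumptions: decideAfterLock and LockOutcome are hypothetical names, and the installed version is assumed to be null when binPath is missing or cannot be executed. The point is only that acquiring the lock says nothing about whether the previous holder succeeded, so the decision depends on what is on disk.

```typescript
type LockOutcome = "use-existing" | "download";

// Hypothetical helper: decides what to do once the lock is held.
// `installedVersion` is null when binPath is missing or unreadable.
function decideAfterLock(
	installedVersion: string | null,
	wantedVersion: string,
): LockOutcome {
	// Another process may have finished, crashed, or stalled; only the
	// binary currently on disk tells us which.
	return installedVersion === wantedVersion ? "use-existing" : "download";
}
```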

);
if (existingCheck.version) {
// Perfect match - use without prompting
if (existingCheck.matches) {
Member

If this is true, why did we ever try to download and fail?

Collaborator Author

This could just be a lot of defensive programming, see https://github.com/coder/vscode-coder/pull/656/files#r2538524344

We could have encountered an error at any stage here while another process downloaded the binary.

case 304: {
this.output.info("Using existing binary since server returned a 304");
// Version mismatch - prompt user
if (await this.promptUseExistingBinary(existingCheck.version, message)) {
Member

Unless I misread, this prompt says Run/Exit. If I select Exit, I would not expect the old binary to be used. Consider changing the terminology?

Collaborator Author

Hmmm, yeah, if you click "Cancel" you might get another prompt with an even older version. Perhaps we should show this prompt only once, for the first match (whether it's binPath or an old binary).

I added throw error; so that if binPath exists we never even attempt to read old binaries.

(oldCheck.matches ||
(await this.promptUseExistingBinary(oldCheck.version, message)))
) {
await fs.rename(oldBinaries[0], binPath);
Member

Let's assume binPath exists already, but the existing check did not match, and the user selected Exit in the prompt. Then we fall back to the old binary and ask the user again. Now the rename may fail and we throw an error.

Why do we have to rename rather than use the old binary path as-is? Too many hard-coded references to the non-old path?

Collaborator Author

The flow here is a bit sketchy, I agree. I've made it so that if you reject binPath, we do not attempt to read old binaries, since to me that makes even less sense (if you don't want to run the most up-to-date binary, why would you run an older one?).

See https://github.com/coder/vscode-coder/pull/656/files#r2545981797

We rename because a lot of the logic depends on the proper name, for example removing old binaries without touching binPath.
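
The revised flow described in this thread could be sketched roughly as follows. This is an illustration under assumptions, not the merged code: resolveFallback, FallbackDeps, and VersionCheck are hypothetical names, the file system and prompt are injected, and the sketch returns null where the PR rethrows the original error.

```typescript
interface VersionCheck {
	version: string | null; // null when the file is missing or not executable
	matches: boolean; // true when the version equals the server's version
}

// Hypothetical dependencies, injected so the sketch has no side effects.
interface FallbackDeps {
	check: (path: string) => Promise<VersionCheck>;
	promptUseExisting: (version: string) => Promise<boolean>;
	rename: (from: string, to: string) => Promise<void>;
}

// Returns binPath when a usable binary was found, otherwise null.
async function resolveFallback(
	binPath: string,
	oldBinaries: string[],
	deps: FallbackDeps,
): Promise<string | null> {
	const existing = await deps.check(binPath);
	if (existing.version !== null) {
		if (existing.matches || (await deps.promptUseExisting(existing.version))) {
			return binPath;
		}
		// The user rejected the current binary: do not offer an older one.
		return null;
	}
	for (const candidate of oldBinaries) {
		const old = await deps.check(candidate);
		if (old.version === null) continue; // unreadable, try the next one
		if (old.matches || (await deps.promptUseExisting(old.version))) {
			// Restore the canonical name so cleanup logic keeps working.
			await deps.rename(candidate, binPath);
			return binPath;
		}
		return null; // the user is prompted at most once
	}
	return null;
}
```

With this shape the user sees at most one prompt, and the rename only happens when binPath is absent or unreadable.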

);
if (
oldCheck.version &&
(oldCheck.matches ||
Member

Why do we only try one old binary when we are tracking all of them?

I'm trying to think about when this condition actually might happen. I.e. we have an old binary and it's the right version. This means you either downgraded coderd or switched deployment. In either case, any of the old binaries may be the correct one.

Collaborator Author

We also clear old binaries when we download new ones, so it's unlikely that we have multiple (it could happen if some errors prevented us from ever removing them). Different deployments don't apply here since each deployment has its own folder, but yes, if you downgrade then getting the most recent one might not matter. Should we just take the first match, whatever it is?
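
One possible answer to that question, sketched under assumptions rather than taken from the PR: prefer any old binary whose version matches exactly, and only fall back to the most recently modified readable one when nothing matches. The names pickOldBinary and OldBinary are hypothetical.

```typescript
interface OldBinary {
	path: string;
	version: string | null; // null when the version could not be read
	mtimeMs: number; // modification time in milliseconds
}

// Prefers an exact version match; otherwise returns the newest readable binary.
function pickOldBinary(
	candidates: OldBinary[],
	wantedVersion: string,
): OldBinary | undefined {
	const readable = candidates.filter((c) => c.version !== null);
	const exact = readable.find((c) => c.version === wantedVersion);
	if (exact) return exact;
	// Copy before sorting so the caller's array is left untouched.
	return [...readable].sort((a, b) => b.mtimeMs - a.mtimeMs)[0];
}
```

This covers the downgrade case raised above, where the matching binary is not necessarily the newest one.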

binPath + ".temp-" + Math.random().toString(36).substring(8);

try {
const removed = await cliUtils.rmOld(binPath);
Member

Why do we remove the old before our download has completed successfully? Is this to ensure that we don't try to download in case updating the binary would fail (e.g. in use on Windows)?

Collaborator Author

We remove old binaries only; the proper binary (binPath) is kept. Old binaries should have been removed at the end of the previous download, but the removal is done here because there are fewer conflicts (this was already the case before).

For example, this is what the folder might look like:

  • coder-linux
  • coder-linux.old-123

So we remove only the "old" binaries and keep the most recent one in case the download fails, for example.
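
As a rough illustration of that selection, assuming leftovers are named <binary>.old-<suffix> as in the listing above (the real cliUtils.rmOld may match more patterns than this; selectOldBinaries is a hypothetical name):

```typescript
// Returns only the leftover ".old-" files; the current binary is never selected.
function selectOldBinaries(entries: string[], binName: string): string[] {
	const prefix = `${binName}.old-`;
	return entries.filter((entry) => entry.startsWith(prefix));
}

const folder = ["coder-linux", "coder-linux.old-123"];
const removable = selectOldBinaries(folder, "coder-linux");
// removable is ["coder-linux.old-123"]; "coder-linux" stays in place.
```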

this.output.info("Using existing binary since server returned a 304");
return binPath;
}
case 404: {
Member

Theoretically, this could also be caused by a poorly configured reverse proxy. I'm just raising a thought, don't think we need to handle it here. There's no easy way to check this either (e.g. no https://dev.coder.com/bin/.exists)

Collaborator Author

That is true, but do keep in mind that their Coder logs would contain the failed request at the error level, so they could potentially identify this quickly from the logs 🤔

@EhabY
Collaborator Author

EhabY commented Nov 20, 2025

would IPC communication between VS Code windows have been an option here, like with the login prompt?

@mafredri We could maybe use this for progress monitoring, but we definitely need a lock file since it makes handling crashes, staleness, and stuck processes much simpler. Libraries like proper-lockfile already handle this. IPC communication using secrets might get a bit tricky and cumbersome to work with. Also, we are already writing the binary to the file system, so it feels natural to use a lock file in the same folder 😉

@EhabY EhabY force-pushed the concurrent-binary-download-lockfile-fix branch from 6d67d87 to f6d76c8 Compare November 21, 2025 08:12
Member

@mtojek mtojek left a comment

Thanks for addressing my comments! As long as other reviewers approve, you're good to go 👍

Member

@mafredri mafredri left a comment

Nothing to add, thanks for amending/answering my comments 👍🏻

Development

Successfully merging this pull request may close these issues.

Confusing error message: "Failed to read signature or binary"

4 participants