impl: start the workspace via Coder CLI #221

fioan89 · 2025-11-12T22:21:59Z

Netflix uses custom MFA that requires CLI middleware to handle auth flow. The custom CLI implementation on their side intercepts 403 responses from the REST API, handles the MFA challenge, and retries the rest call again. The MFA challenge is handled only by the start and ssh actions. The remaining actions can go directly to the REST endpoints because of the custom header command that provides MFA tokens to the http calls.

Both Gateway and VS Code extension delegate the start logic to the CLI, but not Toolbox which caused issues for the customer. This PR ports some of the work from Gateway in Coder Toolbox.

Netflix uses custom MFA that requires CLI middleware to handle auth flow. The custom CLI implementation on their side intercepts 403 responses from the REST API, handles the MFA challenge, and retries the rest call again. The MFA challenge is handled only by the `start` and `ssh` actions. The remaining actions can go directly to the REST endpoints because of the custom header command that provides MFA tokens to the http calls. Both Gateway and VS Code extension delegate the start logic to the CLI, but not Toolbox which caused issues for the customer. This PR ports some of the work from Gateway in Coder Toolbox.

code-asher · 2025-11-14T22:29:02Z

src/main/kotlin/com/coder/toolbox/CoderRemoteEnvironment.kt

+                        // cli takes 15 seconds to move the workspace in queueing/starting state
+                        // while the user won't see anything happening in TBX after start is clicked
+                        // During those 15 seconds we work around by forcing a `Queuing` state
+                        while (startJob.isActive && client.workspace(workspace.id).latestBuild.status.isNotStarted()) {
+                            state.update {
+                                WorkspaceAndAgentStatus.QUEUED.toRemoteEnvironmentState(context)
+                            }
+                            delay(1.seconds)
+                        }


Thinking out loud, maybe it would be possible this could incorrectly set the state to queued? Like suppose the workspace goes to canceling, then we could possibly update to queued mistakenly, depending on the timing.

Probably not a huge deal, it would eventually correct itself and probably be brief at worst, but one thought is we could check to see if the latest build has changed yet which would indicate a new build finally actually started. Also occurs to me we probably only need to update to queued once.

// ... // During those 15 seconds ... state.update { WorkspaceAndAgentStatus.QUEUED.toRemoteEnvironmentState(context) } // wait for the build to actually start since there is a dry run first while (startJob.isActive && client.workspace(workspace.id).latestBuild == lastBuild) { delay(1.seconds) }

Or, maybe a more general fix would be for the updater itself to only update the status if the build changed otherwise preserve the current state so we could set the state once manually and have it persist.

The problem with that is every 5 seconds we run the general polling loop. For 15 seconds the workspace has the same status (in most cases it is stopped). We'd have to stop the general polling loop otherwise the workspaces for which we dispatched the start command will go from Stopped to Queued and then back to Stopped at the next general poll.
But if we pause the general poll loop we miss updates on the rest of workspaces. I'm still pondering on the best approach...

This is about my general fix idea right? I was thinking the build check would prevent going back to stopped. So something like:

General poll runs, gets stopped build and sets status to stopped.

User starts workspace and we set the status to queued.

General poll runs, gets the same stopped build again, skips setting the status though because it is the same build it saw previously.

General poll runs twice more, preserves the current status each time.

Workspace finally actually starts after 15 seconds which creates a new started build.

General poll runs, gets started build, updates status to started because it is a new build.

But idk if this is any better than having the startIsInProgress gate, mostly just a thought. I suppose it would let us do things like also immediately set a workspace to stopping when stopping a workspace rather than wait for the request round trip to update the status, which could be nice.

I think the existing implementation works in almost exactly what you proposed except the part about gets the same stopped build again, skips setting the status though because it is the same build it saw previously. which does things a bit differently. Instead of the general poll "remembering" the previous poll state it is actually forbidden to change the status of a workspace we know should be starting soon. A second polling loop only for the workspace that should be starting has the right to move the status from "queued" to "starting", after which the secondary poll loop dies and the primary poll loop will get back the full rights.

Now that I'm explaining it, it sounds like a bit of over engineering (two polling loops) though.

Yup yup, similar but only for this specific starting flow, whereas my proposal works for any state change that might take a while, plus it only needs the one loop.

fioan89 requested review from code-asher, f0ssel and jcjiang November 12, 2025 22:22

code-asher approved these changes Nov 14, 2025

View reviewed changes

fioan89 merged commit e24f564 into main Nov 20, 2025
6 checks passed

fioan89 deleted the impl-start-workspace-via-cli branch November 20, 2025 22:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

impl: start the workspace via Coder CLI #221

impl: start the workspace via Coder CLI #221

Uh oh!

fioan89 commented Nov 12, 2025

Uh oh!

code-asher Nov 14, 2025

Uh oh!

fioan89 Nov 20, 2025

Uh oh!

code-asher Nov 20, 2025

Uh oh!

fioan89 Nov 20, 2025

Uh oh!

code-asher Nov 21, 2025 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

impl: start the workspace via Coder CLI #221

impl: start the workspace via Coder CLI #221

Uh oh!

Conversation

fioan89 commented Nov 12, 2025

Uh oh!

code-asher Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

fioan89 Nov 20, 2025

Choose a reason for hiding this comment

Uh oh!

code-asher Nov 20, 2025

Choose a reason for hiding this comment

Uh oh!

fioan89 Nov 20, 2025

Choose a reason for hiding this comment

Uh oh!

code-asher Nov 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

code-asher Nov 21, 2025 •

edited

Loading