Gitpod US cluster continues to hang, crash, and corrupt workspaces

From Monday, Aug 1 through today, we have repeatedly observed:

  • Workspaces fail to start.
  • Workspaces fail to stop.
  • Workspaces crash when stopped (last backup failed: workspace does not exist).
  • Workspaces crash when they time out.
  • Errors in the JavaScript console that probably shouldn’t be there (along with routine messages).

These events postdate all similar allegedly fixed problems I’ve seen in the forum. We have some documentation, but won’t post it here, as you’ve certainly seen it all before. I’m in Portland, Oregon on a gigabit fiber Internet connection with no known outages during this period. We’ve had little trouble with running workspaces.

Sometimes when a workspace fails to start, or crashes, it can later be located in the workspace list. Sometimes it can be restarted from there, other times not. Generally speaking we don’t trust phantom or crashed workspaces. We delete them and start over.

We have lost data. We are not interested in salvaging any workspaces at this time, however. We have learned to push early and often, and always before stopping a workspace or getting up from a chair.

We would like to know if you intend to fix these problems. This is very frustrating for us, and my team’s patience is starting to wear a little thin. They’re developers, too, and they know very well that I will come after them if they introduce instability into a previously stable product, regardless of whatever other magic they’re currently performing. I’m trying to get us off desktops, but I can’t do that if you can’t provide a reasonably stable cloud environment. We are not doing anything clever (no custom Docker images). Please let us know what to expect.

John Underhill
Senior Software Architect
QIS Project Inc.

Hi John,

Welcome to the Gitpod Community :tada:

We’re sorry that you’ve been experiencing this. We totally understand how frustrating it must be.

Thank you for raising it. We’ve made it our top priority to look into this today, and we’ll post any updates here.

Hi John,

Thanks for your patience! An update from us: as we’ve dug deeper into your issue we found that there was an unusual load spike on the 2nd-3rd of August. However, since then all our systems seem to be operating normally.

You mentioned that you have some documentation. Could you share it with us? It would really help us get to the bottom of this. If you don’t want to share it on a public forum, feel free to email us directly at contact@gitpod.io.

Let me know if you need anything else!

I don’t think so. We had another workspace hang up last night, and we see other people with similar problems posting in the forum. For every one of these, there are 10 more who don’t post. I will upload some screenshots for you.

Some screenshots would be very useful, feel free to post them here or send us an email! :blush:

Happy Coding!

Same problems here. In the past week my workspaces have crashed multiple times, losing all the work I had done since my last remote git push. Not fun. The latest crash showed an error message along the lines of “ephemeral… used up more than the allowed 0 bytes.”

Here is the Gitpod launch link for the public repo, if this helps.

I have seen this today too, just a few minutes ago, and created a new top-level post. Looks like this has been happening since February 2020. I’m starting to lose trust in Gitpod. I’m on a deadline and just can’t have this. It looks like many of the causes of this message are disk space issues on your end. Don’t you have alerts? Time to increase your alert thresholds! I love Gitpod, but I may have to migrate back to a local IDE in order to have better control. That would be a pity, as Gitpod brings great benefits too, but reliability must be paramount.