Retry on Panic in Golang
In the next few lines I will show you how to do something in Golang I never thought I would need … retrying until a panic is gone (insert scary noises).
So yeah, panics are bad and normally when hitting one you should accept your fate and submit to the consequences. But there are certain situations where it is ok to retry until a panic is gone … like in our case.
Why the hell would I catch a panic?
We are using testcontainers-go for our integration tests, which in general does a great job of making things easier.
We recently saw random occurrences of the following error:
test panicked: runtime error: index out of range [0] with length 0
Some research finally brought up this issue. Basically, Docker 20 changed some stuff and now returns an empty array when getting mapped ports from its API. Just … not always. It would work for a few rounds and then start failing.
This behavior triggers the panic above in testcontainers, because the code there expects the returned array to contain at least one entry. The panic is fixed in the next version, which isn’t released yet.
So we were faced with two options:
- Deactivate the integration tests until the new version of testcontainers is released
- Deal with the panic
Option one is a non-option: we had just gotten everything into a state where we finally had some decent integration test coverage and a working pipeline.
There was no way we would sacrifice that.
So option two it was and I found it rather interesting how things ended up looking.
The nature of defer
I have been using defer before, e.g. to clean up some open sockets or files before leaving a method. To me it was a magic way of doing something before the current function would return.
Let’s look at an example you might know:
The rule I knew was that the function referenced by defer is executed just before the surrounding function returns, in this example removing the temporary file.
But there is more to it.
That other thing defer can do
Well, defer actually does a little more. The deferred function will also be called whenever a panic occurs.
The question is how to differentiate a regular return from a panic inside the function referenced by defer.
And that’s where the builtin recover() function comes into play. In case of a regular return it will return nil; in case of a panic it will provide whatever has been passed to panic(v interface{}).
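To see the difference in isolation, here is a small sketch. The helper name run is made up for illustration; it simply hands back whatever recover() returned inside the deferred function:

```go
package main

import "fmt"

// run executes fn and reports what recover() returned inside the deferred
// function: nil after a regular return, the panic value after a panic.
func run(fn func()) (recovered interface{}) {
	defer func() {
		// recover only has an effect when called directly
		// inside a deferred function.
		recovered = recover()
	}()
	fn()
	return nil
}

func main() {
	fmt.Println(run(func() {}))                // regular return: <nil>
	fmt.Println(run(func() { panic("boom") })) // panic: boom
}
```

Note that the named return value is what makes it possible to change the result from inside the deferred function.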
Putting it together
Let me put all my new knowledge to good use and build a function that will retry executing another function until it neither returns an error nor panics:
Now let’s go through it step by step.
I define a function which takes the function to be executed, the number of times we try it before giving up and a delay between two subsequent executions:
func Retry(retryFunc RetryFunc, maxIterations int, delay time.Duration) error
Next I define a variable to hold the error to be returned and define a for-loop:
No big surprises here. We iterate up to maxIterations times and exit early if err is nil. The delay is only applied from the second iteration on; we don’t want to start waiting before we have done any work.
Now for the interesting part, defining the defer-behavior.
This looks a little wonky. The first thing that happens is that I define an anonymous function, inside which I then define a defer and execute the retryFunc. A deferred function only runs when its surrounding function returns, so doing this without wrapping it in an anonymous function would let a panic break out of the for-loop, killing my beautiful retry logic.
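The effect of the wrapper is easiest to see in a stripped-down loop. This is only an illustration with a made-up helper named attempts, not part of the retry function itself: every iteration panics or not, the deferred recover runs at the end of each anonymous function, and the loop carries on.

```go
package main

import "fmt"

// attempts runs fn up to n times. Each call is wrapped in an anonymous
// function, so its deferred recover runs at the end of that one attempt
// instead of at the end of the whole loop.
func attempts(n int, fn func(i int)) (recovered []interface{}) {
	for i := 0; i < n; i++ {
		func() {
			defer func() {
				if r := recover(); r != nil {
					recovered = append(recovered, r)
				}
			}()
			fn(i)
		}()
	}
	return recovered
}

func main() {
	got := attempts(3, func(i int) {
		if i < 2 {
			panic(fmt.Sprintf("attempt %d failed", i))
		}
	})
	// Two panics were recovered and the loop still reached iteration 2.
	fmt.Println(len(got)) // 2
}
```

With the defer placed directly in the loop body, the first panic would unwind the whole function and the remaining iterations would never happen.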
Handling the actual panic is now a piece of cake.
Calling recover() tells us whether a panic actually occurred. Then I do a switch over the returned value to figure out what was in the panic and assign the resulting value to the variable I defined in the beginning, thereby capturing the problem that occurred.
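Pulled out on its own, that switch could look roughly like this. The helper toError and the wording of the fallback message are assumptions for the sake of the example; which cases you need depends on what your code might pass to panic.

```go
package main

import (
	"errors"
	"fmt"
)

// toError is a hypothetical helper that turns a recovered panic value
// into an error. A nil value means no panic happened.
func toError(r interface{}) error {
	switch v := r.(type) {
	case nil:
		return nil
	case error:
		// Runtime panics such as index out of range already are errors.
		return v
	case string:
		return errors.New(v)
	default:
		return fmt.Errorf("unexpected panic value: %v", v)
	}
}

func main() {
	fmt.Println(toError(nil))                   // <nil>
	fmt.Println(toError("boom"))                // boom
	fmt.Println(toError(errors.New("wrapped"))) // wrapped
	fmt.Println(toError(42))                    // unexpected panic value: 42
}
```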
And that’s it.
Not sure if this was a helpful post for you, for me it provided the chance to at least try getting back into writing.
Happy panicking :)