I’ve just stumbled upon this wiki page, which describes optimisation methods for functions (or programs) whose derivative is unknown or hard to compute.
I plan to implement some optimisers from that page and see how they work, but before doing that I realised that I also need a simulation environment where I can see how the optimisation is progressing. While we're at it, we might as well plot how stochastic gradient descent (SGD) works and make sure everything is working correctly. At the same time, we might get some valuable insights into its inner workings.
In the end, the simulation looks something like this.