Hi
Thank you for sharing your implementation. It is quite helpful!
However, I think you missed the weight-sharing part (see page 4, under figure 2: the capsules in each [6,6] grid share their weights). So my understanding is that the last dimension of your cap_ws should be 32 instead of 1152 (= 32 x 6 x 6).
That said, I tested both cases and didn't see much difference in the early stage of training.
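
To make the two variants concrete, here is a rough NumPy sketch of what I mean. It is not your code: the function names, the axis order, and the layout of the weight tensors are my own assumptions for illustration, so the capsule axis sits first here rather than last as in your cap_ws. The sizes (32 channels, 6x6 grid, 8-D primary capsules, 10 output capsules of 16-D) are the ones from the paper.

```python
import numpy as np

# Sizes from the paper: 32 primary-capsule channels on a 6x6 grid,
# 8-D primary capsules, 10 digit capsules of 16-D each.
CHANNELS, GRID, IN_DIM, NUM_CLASSES, OUT_DIM = 32, 6, 8, 10, 16
NUM_CAPS = CHANNELS * GRID * GRID  # 1152

def predict_unshared(u, w):
    """One weight matrix per primary capsule.
    u: [1152, 8], w: [1152, 10, 16, 8] -> u_hat: [1152, 10, 16]
    """
    return np.einsum("ijdk,ik->ijd", w, u)

def predict_shared(u, w):
    """One weight matrix per channel, shared by all 36 grid positions.
    u: [32, 36, 8], w: [32, 10, 16, 8] -> u_hat: [1152, 10, 16]
    """
    u_hat = np.einsum("cjdk,cpk->cpjd", w, u)
    return u_hat.reshape(NUM_CAPS, NUM_CLASSES, OUT_DIM)

rng = np.random.default_rng(0)
u = rng.standard_normal((CHANNELS, GRID * GRID, IN_DIM))
w_shared = rng.standard_normal((CHANNELS, NUM_CLASSES, OUT_DIM, IN_DIM))

# Copying each channel's weights to its 36 grid positions gives the
# unshared tensor that is equivalent to the shared one (hypothetical layout).
w_unshared = np.repeat(w_shared, GRID * GRID, axis=0)

u_hat_shared = predict_shared(u, w_shared)
u_hat_unshared = predict_unshared(u.reshape(NUM_CAPS, IN_DIM), w_unshared)

print(w_shared.size, w_unshared.size)
print(np.allclose(u_hat_shared, u_hat_unshared))
```

Both variants produce prediction vectors of the same shape, so the routing code does not need to change. The difference is only in the number of trainable weights: 32 x 10 x 16 x 8 = 40,960 with sharing versus 1152 x 10 x 16 x 8 = 1,474,560 without.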