[Figure panels: 1x and 4x samples, shown twice. Caption: Penumbra area (debug view)]
It is clear that when an extended light source is approximated by a single sample, the penumbra comes out incorrect. Using several samples instead of one slows the algorithm down, but the umbra and penumbra now separate from each other to some extent, and the shadow becomes more plausible.
The authors of the original penumbra wedges algorithm did not set out to solve this approximation problem. Solving it requires a change of approach. The original algorithm builds a shadow volume from the center of the light source and then compensates the lighting in the penumbra around the edges of the hard shadow. I think it would be better to go another way: compute coverage in the penumbra without a sign (the sign depends on whether we are inside or outside the hard shadow), and, instead of extruding a classic shadow volume, determine the umbra region and shade it black, since no light from the source reaches it. This gives two logical parts: the umbra, where there is no illumination, and the penumbra, where the pixel shader computes the illumination analytically.
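To make the umbra/penumbra split concrete, here is a minimal 2D sketch in Python. It is not the penumbra wedges algorithm and not shader code. It is a toy model under stated assumptions: the light and the occluder are horizontal segments, the receiver lies on the line y = 0, the light emits uniformly, and distance and cosine falloff are ignored. All names (`occluded_fraction`, `shade`, the scene constants) are hypothetical and chosen for illustration only.

```python
# Toy 2D scene (assumed values): light and occluder are horizontal segments.
LIGHT = (-1.0, 1.0)      # x-extent of the light segment
LIGHT_H = 4.0            # height of the light above the receiver line
OCCLUDER = (-1.0, 1.0)   # x-extent of the occluder segment
OCCLUDER_H = 2.0         # height of the occluder (must be below the light)


def occluded_fraction(x, light=LIGHT, occluder=OCCLUDER,
                      light_h=LIGHT_H, occluder_h=OCCLUDER_H):
    """Unsigned coverage: fraction of the light hidden from receiver (x, 0)."""
    a, b = light
    c, d = occluder
    scale = light_h / occluder_h
    # Project the occluder endpoints from the receiver onto the light's line.
    pc = x + (c - x) * scale
    pd = x + (d - x) * scale
    # Overlap of the projected occluder with the light segment.
    lo = max(a, min(pc, pd))
    hi = min(b, max(pc, pd))
    return max(0.0, hi - lo) / (b - a)


def shade(x):
    """Classify the receiver and return (region, illumination in [0, 1])."""
    coverage = occluded_fraction(x)
    if coverage >= 1.0:
        return ("umbra", 0.0)            # fully hidden: shade black
    if coverage <= 0.0:
        return ("lit", 1.0)              # light fully visible
    return ("penumbra", 1.0 - coverage)  # analytic partial visibility


if __name__ == "__main__":
    for x in (0.0, 1.0, 2.0, 3.0):
        print(x, shade(x))
```

Here the coverage is always a non-negative fraction, so no sign is needed to tell the inside of the hard shadow from the outside. The umbra is simply the region where the coverage reaches 1. Finding that region efficiently for real 3D geometry is the part this sketch leaves open.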
It is not yet clear how to compute the umbra for an extended light source, but it is certainly not impossible.