It's been a while, but I finally had time to look this over. This is a pretty clever technique that does work, to an extent. Camera distance messes with the offset calculation, which means it's not always clear which billboard is closest to the camera position. Still, that only matters when the billboards are very close to each other, so for general purposes this is a functional workaround.
One thing to keep in mind: point_distance measures distance in a circular pattern around the camera, whereas a proper depth calculation would be based only on the y-value of the billboard's perspective (projected) position.
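To show what I mean by circular versus proper depth, here is a rough sketch in plain Python rather than GML, just to keep the math easy to run anywhere. The function names, the camera setup and the two billboard positions are all made up for illustration. It also assumes the usual math convention (y up, angle counter-clockwise), so the sine sign would need adapting to the engine's coordinate system.

```python
import math

def radial_distance(cam, point):
    """Straight-line distance from the camera, the way point_distance measures it."""
    return math.hypot(point[0] - cam[0], point[1] - cam[1])

def forward_depth(cam, angle_deg, point):
    """Distance along the camera's view direction only (sideways offset ignored)."""
    a = math.radians(angle_deg)
    dx = point[0] - cam[0]
    dy = point[1] - cam[1]
    # Project the offset onto the camera's forward vector.
    return dx * math.cos(a) + dy * math.sin(a)

# Hypothetical scene: camera at the origin, looking along +x.
cam = (0.0, 0.0)
angle = 0.0
billboard_a = (100.0, 0.0)   # straight ahead
billboard_b = (90.0, 60.0)   # slightly nearer in depth, but off to the side

# Radial distance says A is closer...
print(radial_distance(cam, billboard_a), radial_distance(cam, billboard_b))
# ...while forward depth says B is closer, which is what the projected y-value reflects.
print(forward_depth(cam, angle, billboard_a), forward_depth(cam, angle, billboard_b))
```

The two measures only disagree when billboards sit at nearly the same depth with a noticeable sideways offset, which matches the cases where the sorting looked off.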
Oh yeah, and for anyone trying this: put the background layer and the depth of the mode7 object at 1000 or so. Otherwise the billboards' depth puts them below the ground and you can't see them.