Each write is striped across all of the disks. For a stripe to be completely written, every disk must finish writing its share of the data. Thus the vdev's maximum write IOPS equals that of the slowest drive in the vdev.
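To put numbers on that, here is a rough sketch of the model described above. It assumes every stripe touches every disk and ignores caching, queueing, and parity overhead. The function names and the example figures are made up for illustration and are not taken from any real tool.

```python
def stripe_write_latency(per_disk_latency_ms):
    """A stripe is done only when the slowest disk has finished its share."""
    if not per_disk_latency_ms:
        raise ValueError("a vdev needs at least one disk")
    return max(per_disk_latency_ms)


def vdev_write_iops(per_disk_iops):
    """Upper bound on stripe writes per second under this simplified model.

    Assumption: each stripe write costs exactly one write on every disk,
    so the vdev cannot complete stripes faster than its slowest member.
    """
    if not per_disk_iops:
        raise ValueError("a vdev needs at least one disk")
    return min(per_disk_iops)


if __name__ == "__main__":
    # Hypothetical drives: two fast, one slow.
    print(vdev_write_iops([250, 180, 240]))
    print(stripe_write_latency([4.0, 5.5, 4.2]))
```

Under these assumptions, adding more disks to the vdev adds capacity and streaming bandwidth but does not raise this ceiling.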
It's reasonable to expect that issuing a batch of 100 writes at the application layer followed by an fsync would not always require 100 writes to each underlying block device. The OS and filesystem should be able to combine writes when the I/O pattern allows it, and should buffer before the fsync in the hope of assembling full-stripe writes out of smaller application-layer writes.
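As a toy illustration of that kind of combining (not how any particular filesystem actually implements it), the sketch below merges buffered writes that touch or overlap into larger extents, then counts how many full stripes each extent covers. The `(offset, length)` representation, the function names, and the 128 KiB stripe size are all assumptions chosen for the example.

```python
def coalesce(writes):
    """Merge buffered (offset, length) writes that touch or overlap."""
    merged = []
    for offset, length in sorted(writes):
        end = offset + length
        if merged and offset <= merged[-1][1]:
            # Contiguous with or overlapping the previous extent: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


def full_stripes(extent, stripe_bytes):
    """Count stripes that lie entirely inside one (offset, length) extent."""
    offset, length = extent
    first = -(-offset // stripe_bytes)        # first stripe boundary at or after offset
    last = (offset + length) // stripe_bytes  # last stripe boundary at or before the end
    return max(0, last - first)


if __name__ == "__main__":
    # 100 sequential 4 KiB application writes, buffered until the fsync.
    app_writes = [(i * 4096, 4096) for i in range(100)]
    extents = coalesce(app_writes)
    stripe = 128 * 1024  # assumed full-stripe size in bytes
    print(len(app_writes), "application writes ->", len(extents), "extent(s)")
    print("full stripes:", sum(full_stripes(e, stripe) for e in extents))
```

With a sequential pattern like this, the 100 small writes collapse into one extent covering a few full stripes plus a partial tail. A random pattern with no adjacent offsets would not merge at all, which is the case where the per-write cost on each disk still applies.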