Vision = visual, while PDF is a container of sorts, usually containing images and text. So I guess the short answer is: 50% yes, the other part you can use any LLM for.
PDF isn't really a binary format, it starts with a text header, structure is mostly text-based objects and you can parse many PDFs as plain-text. They tend to contain embedded binary data though, which is the specific part these vision models can help you with, assuming they're images. The rest a "normal" LLM can parse just fine.