The effect you discuss is called "stereotype threat", and it belongs to the huge set of psychological results that have failed to replicate. [1] There seems to be a bias against publishing studies with null results, which skews the overall picture both among researchers and in the media.
[1] https://www.tandfonline.com/doi/full/10.1080/23743603.2018.1...