New study may provide model for assessing AI performance in real-world settings