I know that GetData() blocks the main thread while it fetches data from the GPU. But in my implementation, the data arrays are encoded into a render texture and passed to the vertex shader for rendering. What is the best way to measure the actual execution time of a single kernel Dispatch()?