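While training a deep learning model, I noticed that Python was reading my data far too slowly. The cause turned out to be loading the file directly with numpy and pandas, so try to avoid reading data directly with them where you can. The timing test below compares np.loadtxt on a CSV file with loadmat on a .mat file.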
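import time

import numpy as np
from scipy.io import loadmat  # assumed source of loadmat

# Time how long np.loadtxt takes to parse the CSV file.
time_1 = time.time()
dataset_path = 'dataset/origin_cylinder_train_dataset.csv'
with open(dataset_path, 'r', encoding='utf-8-sig') as f:
    data = np.loadtxt(f, delimiter=',', skiprows=0, dtype=np.float32)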
time_11 = time.time()
print('time_11 - time_1:', time_11 - time_1)
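para = data[:, 0:4]  # 4 input parameters
real = data[:, 4:305:6]  # real part: every 6th column from 4 to 304 (51 points)
imag = data[:, 305:606:6]  # imaginary part: every 6th column from 305 to 605 (51 points)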
label = np.concatenate((real, imag), axis=1)
time_12 = time.time()
print('time_12 - time_11:', time_12 - time_11)
time_2 = time.time()
print('time_2 - time_1:', time_2 - time_1)
# Time loading the .mat file with loadmat for comparison.
time_3 = time.time()
data_mat = loadmat('dataset/origin_cylinder.mat')
time_4 = time.time()
print('time_4 - time_3:', time_4 - time_3)
Results (in seconds):
time_11 - time_1: 20.45560312271118
time_12 - time_11: 0.017014026641845703
time_2 - time_1: 20.472617149353027
time_4 - time_3: 1.378253698348999
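
Since almost all of the time goes into parsing the CSV text, one way to act on this is to parse the file once and keep a binary copy for later runs. The snippet below is only a minimal sketch of that idea: `load_with_cache` and the `.npy` cache file next to the CSV are hypothetical names I made up for illustration, not something from the script above.

```python
import os

import numpy as np


def load_with_cache(csv_path, cache_path=None):
    """Parse a CSV once, then reuse a binary .npy copy on later calls.

    Hypothetical helper for illustration; the cache file name is an assumption.
    """
    if cache_path is None:
        # e.g. 'data.csv' -> 'data.npy' in the same folder
        cache_path = os.path.splitext(csv_path)[0] + '.npy'

    # Use the cache only if it exists and is not older than the CSV.
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(csv_path)):
        return np.load(cache_path)

    # Slow path: parse the text file the same way as in the timing test.
    with open(csv_path, 'r', encoding='utf-8-sig') as f:
        data = np.loadtxt(f, delimiter=',', dtype=np.float32)

    # Store the parsed array in binary form for the next run.
    np.save(cache_path, data)
    return data
```

The first call still pays the full np.loadtxt cost; later calls read the .npy file with np.load, which skips text parsing. I have not timed this on the dataset above, so treat the benefit as something to measure on your own data.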