I'm working on software that should do real-time people detection on multiple camera devices for a home surveillance system.
I'm currently using OpenCV to grab frames from an IP camera and TensorFlow to analyze them and find objects (the code is very similar to the examples in the TensorFlow Object Detection API). I've also tried different frozen inference graphs from the TensorFlow Object Detection API at this link:
I have a desktop PC with an Intel Core i7-6700 CPU @ 3.40GHz × 8, and my GPU is an NVIDIA GeForce GTX 960 Ti.
The software works as intended, but it is slower than expected (3-5 FPS), and CPU usage is quite high (80-90%) for a single Python script working on only one camera device.
Am I doing something wrong? What are the best ways to optimize performance, achieve better FPS, and lower CPU usage so I can analyze more video feeds at once? So far I've looked into multithreading, but I have no idea how to implement it in my code.
Code snippet:
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        while True:
            frame = cap.read()
            frame_expanded = np.expand_dims(frame, axis=0)
            image_tensor = detection_graph.get_tensor_by_name("image_tensor:0")
            boxes = detection_graph.get_tensor_by_name("detection_boxes:0")
            scores = detection_graph.get_tensor_by_name("detection_scores:0")
            classes = detection_graph.get_tensor_by_name("detection_classes:0")
            num_detections = detection_graph.get_tensor_by_name("num_detections:0")
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes, scores, classes, num_detections],
                feed_dict={image_tensor: frame_expanded})
            vis_util.visualize_boxes_and_labels_on_image_array(frame, ...)
            cv2.imshow("video", frame)
            if cv2.waitKey(25) & 0xFF == ord("q"):
                cv2.destroyAllWindows()
                cap.stop()
                break
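The multithreading idea I've looked into would, I think, look roughly like this: a background thread keeps grabbing frames so that sess.run() never blocks waiting on camera I/O, and the detection loop always processes the latest available frame. This is only a sketch with my own class name (ThreadedGrabber); it assumes a capture object whose read() returns a (grabbed, frame) pair like cv2.VideoCapture, and I haven't verified it against my actual IP camera:

```python
import threading


class ThreadedGrabber:
    """Reads frames on a background thread so the main (detection) loop
    always gets the most recent frame instead of blocking on camera I/O."""

    def __init__(self, source):
        # `source` is any object with a read() method returning
        # (grabbed, frame), e.g. cv2.VideoCapture (my assumption here).
        self.source = source
        self.frame = None
        self.stopped = False
        self.lock = threading.Lock()
        self.thread = threading.Thread(target=self._update, daemon=True)

    def start(self):
        self.thread.start()
        return self

    def _update(self):
        # Runs in the background: keep overwriting self.frame with the
        # newest frame; no queue, so no backlog builds up.
        while not self.stopped:
            grabbed, frame = self.source.read()
            if not grabbed:
                break
            with self.lock:
                self.frame = frame

    def read(self):
        # Return the latest frame; may repeat a frame if detection is
        # slower than the camera, which is exactly the intended behavior.
        with self.lock:
            return self.frame

    def stop(self):
        self.stopped = True
        self.thread.join(timeout=1.0)
```

The detection loop would then call grabber.read() instead of cap.read(), and in theory one such grabber per camera would let several feeds be polled without each one stalling the others, but I don't know if this is the right approach.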