TensorFlow Object Detection API evaluation segmentation fault

Question

I get a segmentation fault whenever evaluation starts, regardless of whether I use the object_detection/legacy/eval.py script or object_detection/model_main.py. I tried reinstalling TensorFlow, protobuf-compiler, and all dependencies of the tensorflow/models Object Detection API, but that didn't help. I'm using Python 3.

TensorFlow version: 1.15.0

The way I'm calling eval.py:

python object_detection/legacy/eval.py \
    --logtostderr \
    --eval_dir=${TRAIN_DIR} \
    --checkpoint_dir=${TRAIN_DIR} \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH}

This is the relevant part of the log when running the script:

I1107 17:50:24.818984 140633596692288 saver.py:1284] Restoring parameters from /home/ihahanov/Projects/rooftops/resources/models/mask_rcnn_inception_v2_coco_2018_01_28/train/model.ckpt-29722
2019-11-07 17:50:27.309148: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 92160000 exceeds 10% of system memory.
Fatal Python error: Segmentation fault

Thread 0x00007fe6c4ff9700 (most recent call first):
  File "/usr/lib/python3.6/threading.py", line 295 in wait
  File "/usr/lib/python3.6/threading.py", line 551 in wait
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/training/coordinator.py", line 311 in wait_for_stop
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/training/queue_runner_impl.py", line 293 in _close_on_stop
  File "/usr/lib/python3.6/threading.py", line 864 in run
  File "/usr/lib/python3.6/threading.py", line 916 in _bootstrap_inner
  File "/usr/lib/python3.6/threading.py", line 884 in _bootstrap

Thread 0x00007fe6bbfff700 (most recent call first):
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1443 in _call_tf_sessionrun
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1287 in _single_operation_run
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/training/queue_runner_impl.py", line 257 in _run
  File "/usr/lib/python3.6/threading.py", line 864 in run
  File "/usr/lib/python3.6/threading.py", line 916 in _bootstrap_inner
  File "/usr/lib/python3.6/threading.py", line 884 in _bootstrap

Thread 0x00007fe7cf930740 (most recent call first):
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1443 in _call_tf_sessionrun
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1350 in _run_fn
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1365 in _do_call
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1359 in _do_run
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1180 in _run
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 956 in run
  File "/home/ihahanov/Projects/rooftops/models/research/object_detection/legacy/evaluator.py", line 234 in _process_batch
  File "/home/ihahanov/Projects/rooftops/models/research/object_detection/eval_util.py", line 346 in _run_checkpoint_once
  File "/home/ihahanov/Projects/rooftops/models/research/o
/home/ihahanov/Projects/rooftops/shell/train_model.sh: line 26: 13874 Segmentation fault      (core dumped) python3 eval.py --logtostderr --eval_dir=${TRAIN_DIR} --checkpoint_dir=${TRAIN_DIR} --pipeline_config_path=${PIPELINE_CONFIG_PATH}

I am facing the same issue. After I replaced the GPU build of TensorFlow with the CPU build, the error no longer occurs. I checked CUDA (10.0 required, 10.1 installed) and the NVIDIA driver (418 required, 430 installed); these look incompatible in my case.
– Suman, Commented Nov 15, 2019 at 16:45
Actually, this error disappeared when I started evaluating on the GPU instead of the CPU. On the CPU, the error occurred on 3 different machines.
– Ivan Hahanov, Commented Nov 17, 2019 at 7:04

user11530462 · Accepted Answer · 2020-03-23 10:07:43Z

Providing the solution here (Answer Section), even though it is present in the Comments Section, for the benefit of the community.

Since the computations involved in the model are very memory intensive, running the evaluation on the GPU rather than on the CPU resolved the error.
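Before launching evaluation, it can help to confirm which devices TensorFlow actually sees. The following is a minimal sketch, not part of the Object Detection API: it assumes TensorFlow 1.x is installed, and the helper names gpu_device_names and list_local_devices are illustrative only.

```python
# Sketch: list the devices TensorFlow can see before starting evaluation.
# Assumption: TensorFlow 1.x is installed. The helper names below are
# hypothetical and do not come from the Object Detection API.


def gpu_device_names(devices):
    """Return the names of all devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_local_devices():
    """Ask TensorFlow for the devices available to this process."""
    # Imported lazily so gpu_device_names can be used without TensorFlow.
    from tensorflow.python.client import device_lib
    return device_lib.list_local_devices()


if __name__ == "__main__":
    try:
        devices = list_local_devices()
    except ImportError:
        print("TensorFlow is not installed in this environment.")
    else:
        gpus = gpu_device_names(devices)
        if gpus:
            print("GPUs visible to TensorFlow:", ", ".join(gpus))
        else:
            print("No GPU visible; evaluation would run on the CPU.")
```

If the GPU build of TensorFlow is installed but no GPU is listed, a CUDA or driver version mismatch like the one described in the comments is one possible cause. An empty CUDA_VISIBLE_DEVICES environment variable also hides all GPUs from the process.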
