I get a segmentation fault whenever evaluation starts, whether I use the object_detection/legacy/eval.py script or object_detection/model_main.py. I tried reinstalling tensorflow and protobuf-compiler, and reinstalling all dependencies for the TensorFlow Object Detection API (tensorflow/models), but it didn't help. I'm using Python 3.

TensorFlow version: 1.15.0

This is how I call eval.py:

python object_detection/legacy/eval.py \
    --logtostderr \
    --eval_dir=${TRAIN_DIR} \
    --checkpoint_dir=${TRAIN_DIR} \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH}

This is the relevant part of the log from running the script:

I1107 17:50:24.818984 140633596692288 saver.py:1284] Restoring parameters from /home/ihahanov/Projects/rooftops/resources/models/mask_rcnn_inception_v2_coco_2018_01_28/train/model.ckpt-29722
2019-11-07 17:50:27.309148: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 92160000 exceeds 10% of system memory.
Fatal Python error: Segmentation fault

Thread 0x00007fe6c4ff9700 (most recent call first):
  File "/usr/lib/python3.6/threading.py", line 295 in wait
  File "/usr/lib/python3.6/threading.py", line 551 in wait
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/training/coordinator.py", line 311 in wait_for_stop
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/training/queue_runner_impl.py", line 293 in _close_on_stop
  File "/usr/lib/python3.6/threading.py", line 864 in run
  File "/usr/lib/python3.6/threading.py", line 916 in _bootstrap_inner
  File "/usr/lib/python3.6/threading.py", line 884 in _bootstrap

Thread 0x00007fe6bbfff700 (most recent call first):
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1443 in _call_tf_sessionrun
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1287 in _single_operation_run
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/training/queue_runner_impl.py", line 257 in _run
  File "/usr/lib/python3.6/threading.py", line 864 in run
  File "/usr/lib/python3.6/threading.py", line 916 in _bootstrap_inner
  File "/usr/lib/python3.6/threading.py", line 884 in _bootstrap

Thread 0x00007fe7cf930740 (most recent call first):
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1443 in _call_tf_sessionrun
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1350 in _run_fn
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1365 in _do_call
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1359 in _do_run
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1180 in _run
  File "/home/ihahanov/Projects/rooftops/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 956 in run
  File "/home/ihahanov/Projects/rooftops/models/research/object_detection/legacy/evaluator.py", line 234 in _process_batch
  File "/home/ihahanov/Projects/rooftops/models/research/object_detection/eval_util.py", line 346 in _run_checkpoint_once
  File "/home/ihahanov/Projects/rooftops/models/research/o
/home/ihahanov/Projects/rooftops/shell/train_model.sh: line 26: 13874 Segmentation fault      (core dumped) python3 eval.py --logtostderr --eval_dir=${TRAIN_DIR} --checkpoint_dir=${TRAIN_DIR} --pipeline_config_path=${PIPELINE_CONFIG_PATH}
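
For scale, the allocator warning in the log can be turned into concrete numbers. The following is only a rough sketch: the function name `allocation_summary` and the dictionary keys are made up for illustration, and it assumes the warning compares the allocation against some memory figure that TensorFlow measured at that moment (the log does not say exactly which one).

```python
import re

# Warning line copied from the log above.
WARNING = (
    "2019-11-07 17:50:27.309148: W tensorflow/core/framework/"
    "cpu_allocator_impl.cc:81] Allocation of 92160000 exceeds 10% "
    "of system memory."
)


def allocation_summary(line):
    """Extract the allocation size and the implied memory bound from a warning.

    Hypothetical helper, not part of TensorFlow. Returns None when the line
    does not look like the allocator warning.
    """
    match = re.search(
        r"Allocation of (\d+) exceeds (\d+)% of system memory", line
    )
    if match is None:
        return None
    nbytes = int(match.group(1))
    percent = int(match.group(2))
    # If nbytes exceeds percent% of the reference memory figure, then that
    # figure must be smaller than nbytes * 100 / percent.
    bound = nbytes * 100 // percent
    return {
        "allocation_bytes": nbytes,
        "allocation_mib": nbytes / 2**20,
        "memory_bound_bytes": bound,
        "memory_bound_mib": bound / 2**20,
    }


if __name__ == "__main__":
    summary = allocation_summary(WARNING)
    print("single allocation: %.1f MiB" % summary["allocation_mib"])
    print("memory figure is below: %.1f MiB" % summary["memory_bound_mib"])
```

If that reading of the warning is right, a single allocation of roughly 88 MiB was already more than a tenth of the memory figure TensorFlow checked, which would point to memory pressure on the CPU side.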
  • I am facing the same issue. After I replaced the GPU version of TensorFlow with the CPU version, the error no longer occurs. I checked CUDA (10.0 required, 10.1 installed) and the NVIDIA driver (418 required, 430 installed); these look incompatible in my case. Commented Nov 15, 2019 at 16:45
  • Actually, this error disappeared when I started evaluating on the GPU instead of the CPU. On the CPU, the error occurred on three different machines. Commented Nov 17, 2019 at 7:04

1 Answer

Posting the solution here in the answer section, even though it already appears in the comments, for the benefit of the community.

The computations involved in the model are very memory-intensive, and running the evaluation on a GPU rather than on the CPU resolved the error.
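
Before rerunning the evaluation, it may help to confirm that TensorFlow can actually see a GPU, since a CPU-only build or a hidden device would silently fall back to the CPU. The snippet below is a minimal sketch, assuming a TensorFlow build that provides `tf.config.experimental.list_physical_devices` (TensorFlow 1.15 should). The helper name `visible_gpus` is made up for illustration.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None without TensorFlow.

    Hypothetical helper for a quick sanity check; it only lists devices and
    does not run any part of the model.
    """
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return None
    devices = tf.config.experimental.list_physical_devices("GPU")
    return [device.name for device in devices]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed")
    elif not gpus:
        print("No GPU visible; evaluation would run on the CPU")
    else:
        print("Visible GPUs:", gpus)
```

An empty list here usually means either a CPU-only TensorFlow package or a CUDA/driver mismatch like the one mentioned in the comments.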
