TensorFlow Object Detection API training error

Question

Hi there, I run the following in my Anaconda prompt:

(tensorflow) C:\Users\nicho\Documents\01_Machine_Learning\00_Lynda.com\Ex_Files_TensorFlow\models\research\object_detection> python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_pets.config

I am training the model on a custom dataset I created by tagging 'column' objects. I've followed the tutorials over and over to make sure all my steps are correct, but I still hit the following error. Any clue on how to fix it would be very much appreciated.

WARNING:tensorflow:From C:\Users\nicho\Documents\01_Machine_Learning\00_Lynda.com\Ex_Files_TensorFlow\models\research\object_detection\trainer.py:210: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
Traceback (most recent call last):
  File "train.py", line 163, in <module>
    tf.app.run()
  File "C:\Users\nicho\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 124, in run
    _sys.exit(main(argv))
  File "train.py", line 159, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "C:\Users\nicho\Documents\01_Machine_Learning\00_Lynda.com\Ex_Files_TensorFlow\models\research\object_detection\trainer.py", line 228, in train
    clones = model_deploy.create_clones(deploy_config, model_fn, [input_queue])
  File "C:\Users\nicho\Documents\01_Machine_Learning\00_Lynda.com\Ex_Files_TensorFlow\models\research\slim\deployment\model_deploy.py", line 193, in create_clones
    outputs = model_fn(*args, **kwargs)
  File "C:\Users\nicho\Documents\01_Machine_Learning\00_Lynda.com\Ex_Files_TensorFlow\models\research\object_detection\trainer.py", line 167, in _create_losses
    losses_dict = detection_model.loss(prediction_dict)
  File "C:\Users\nicho\Documents\01_Machine_Learning\00_Lynda.com\Ex_Files_TensorFlow\models\research\object_detection\meta_architectures\ssd_meta_arch.py", line 474, in loss
    location_losses, cls_losses, prediction_dict, match_list)
  File "C:\Users\nicho\Documents\01_Machine_Learning\00_Lynda.com\Ex_Files_TensorFlow\models\research\object_detection\meta_architectures\ssd_meta_arch.py", line 640, in _apply_hard_mining
    match_list=match_list)
  File "C:\Users\nicho\Documents\01_Machine_Learning\00_Lynda.com\Ex_Files_TensorFlow\models\research\object_detection\core\losses.py", line 515, in __call__
    location_losses = tf.unstack(location_losses)
  File "C:\Users\nicho\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\array_ops.py", line 1054, in unstack
    (axis, -value_shape.ndims, value_shape.ndims))
ValueError: axis = 0 not in [0, 0)

(tensorflow) C:\Users\nicho\Documents\01_Machine_Learning\00_Lynda.com\Ex_Files_TensorFlow\models\research\object_detection>python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_pets.config

harshal garg · Accepted Answer · 2018-03-07 06:56:44Z

1

Problem solved. In your pipeline config file, change

loss {
  classification_loss {
    weighted_sigmoid {
    }
  }
  localization_loss {
    weighted_smooth_l1 {
    }
  }
  hard_example_miner {
    num_hard_examples: 3000
    iou_threshold: 0.99
    loss_type: CLASSIFICATION
    max_negatives_per_positive: 3
    min_negatives_per_image: 0
  }
  classification_weight: 1.0
  localization_weight: 1.0
}

to

loss {
  classification_loss {
    weighted_sigmoid {
      anchorwise_output: true  # add this
    }
  }
  localization_loss {
    weighted_smooth_l1 {
      anchorwise_output: true  # add this
    }
  }
  hard_example_miner {
    num_hard_examples: 3000
    iou_threshold: 0.99
    loss_type: CLASSIFICATION
    max_negatives_per_positive: 3
    min_negatives_per_image: 0
  }
  classification_weight: 1.0
  localization_weight: 1.0
}

enjoy...
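If you would rather not edit the file by hand, the change above can be sketched as a small Python script. This is only a rough, regex-based helper, not part of the Object Detection API: the function name `add_anchorwise_output` and the sample text are made up for illustration, and it assumes the `weighted_sigmoid` and `weighted_smooth_l1` blocks contain no nested braces. Whether `anchorwise_output` is still accepted depends on your API version, so check it against your copy of the loss proto definitions.

```python
import re

FIELD = "anchorwise_output: true"


def _patch_block(match):
    """Add the field to one matched `name { ... }` block unless it is already set."""
    opening, body, closing = match.groups()
    if "anchorwise_output" in body:
        return match.group(0)  # already present (true or false), leave untouched
    return opening + "\n  " + FIELD + (body or "\n") + closing


def add_anchorwise_output(config_text):
    """Hypothetical helper: insert the field into both loss blocks of a config string."""
    for block in ("weighted_sigmoid", "weighted_smooth_l1"):
        # Matches `block {`, a brace-free body, and the closing `}`.
        pattern = r"(\b" + block + r"\s*\{)([^{}]*)(\})"
        config_text = re.sub(pattern, _patch_block, config_text)
    return config_text


# Illustrative sample only; in practice read your own config file into a string.
sample = """loss {
  classification_loss {
    weighted_sigmoid {
    }
  }
  localization_loss {
    weighted_smooth_l1 {
    }
  }
  hard_example_miner {
    num_hard_examples: 3000
  }
}
"""

patched = add_anchorwise_output(sample)
print(patched)
```

Running it a second time on the patched text changes nothing, so it is safe to apply to a config that already has the field. Keep a backup of the original config before writing the result back.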


Westerby · Accepted Answer · 2018-02-26 12:53:45Z

0

Which model are you using? You might try a different network for training. I had the same problem when I used ssd_mobilenet_v1_coco with ssd_mobilenet_v1_coco.config. I switched to ssd_mobilenet_v1_pets.config, which has the additional parameter anchorwise_output, and it started to train.
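As for why that parameter matters, here is a rough sketch of what the error message in the question is saying. The traceback shows `tf.unstack` formatting its message from `(axis, -ndims, ndims)`, so `[0, 0)` means the tensor it received had rank 0. My assumption (not confirmed by the traceback itself) is that without `anchorwise_output` the loss in this API version is reduced to a single scalar, while the hard example miner expects one loss value per anchor. The function below is a plain-Python imitation of that range check, not TensorFlow's actual code.

```python
def unstack_axis_check(axis, ndims):
    """Imitates the axis range check seen in the traceback: axis must lie in [-ndims, ndims)."""
    if axis < -ndims or axis >= ndims:
        raise ValueError("axis = %d not in [%d, %d)" % (axis, -ndims, ndims))
    return axis


# A rank-2 loss tensor, e.g. shape [batch, num_anchors], passes the check.
unstack_axis_check(0, 2)

# A rank-0 (scalar) loss has no axis 0, which reproduces the message from the question.
try:
    unstack_axis_check(0, 0)
except ValueError as err:
    message = str(err)
    print(message)  # axis = 0 not in [0, 0)
```

So the config change is less about the model itself and more about making the loss tensor keep its per-anchor shape before hard example mining runs.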


1 Comment

Nicholas Zembashi Over a year ago

Hi, thanks. I actually switched from mobilenet to faster_rcnn_inception and it worked fine!
