
I am using Flask-RESTX and a spaCy NER model.

I have an API that receives a text and an ID number, predicts a label using a spaCy NLP model, and returns the prediction. The NLP model used is specific to the ID number.

Example: for ID '1', NLP model 'a' is loaded and used for prediction; for ID '2', NLP model 'b' is used, and so on.

I want to know whether I can keep a dedicated thread open for each ID, with the specific NLP model preloaded, so that when a request arrives it is routed by ID number to that thread, which processes the data and returns a value quickly.

Example: the API is notified that a new NLP model 'x' has been created for ID '5' and will be used, so a new thread is opened with model 'x' loaded, and all requests with ID number '5' are processed by that thread only.

The aim is to have a preloaded model ready, so a request can be processed and a value returned within a few seconds. Loading a spaCy model takes around 30 seconds, which cannot be done on every request as it would cause a timeout.

Can this be done or is there any other way it can be done?

  • Yes, it can be done, but is there a specific reason why each NLP model needs its own thread? If they can just be Python objects which load the data set when initialized, then you can simply instantiate a global set of them at startup, insert them into a dict for name->obj mapping, and use them from the request handlers. If they are not thread-safe, you have to synchronize with Lock objects. Commented Dec 9, 2021 at 6:49
  • There will be new models created later, which makes it impossible to load them all at startup. When a new ID is created, a database table tells me which model is to be used for that ID, which then needs to be loaded. The only problem is that I cannot load the specific model in real time on every request. Commented Dec 9, 2021 at 6:57
  • Ok, sounds like you don't really need more threads; I explain how to do it in my answer. Commented Dec 9, 2021 at 7:49

1 Answer


I suggest you just rely on Flask's threading model and wrap the NLP models in objects that lazily load the model (only when it is first needed), plus a separate factory function that creates and caches these objects. Add a threading.Lock to make sure only one Flask thread is in the NLP parser at a time.

Example code:

from threading import Lock

MODELS = {}

class NlpModel:
    _model = None
    _lock = Lock()  # class-level lock, shared by all NlpModel instances

    def __init__(self, model_id):
        self._id = model_id

    @property
    def model(self):
        if self._model is None:
            self._model = slow_load_model_with_something(self._id)
        return self._model

    def parse(self, data):
        with self._lock:
            # only one thread will be in here at a time
            return self.model.do_your_thing(data)


def get_model(model_id):
    model = MODELS.get(model_id, None)
    if not model:
        model = NlpModel(model_id)
        MODELS[model_id] = model
    return model

# Example Flask route
@app.route('/parse/<model_id>')
def parse_model(model_id):
    model = get_model(model_id)
    return model.parse(data_from_somewhere)
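One gap worth noting: get_model itself is not thread-safe — two requests arriving at the same time for a brand-new ID could both miss the cache and build two NlpModel objects, loading the model twice. A hedged sketch of a lock-protected factory (the stand-in NlpModel here just records its ID; in the real code it would be the lazy-loading class above, loading with something like spacy.load):

```python
from threading import Lock

MODELS = {}
_models_lock = Lock()

class NlpModel:
    # Stand-in for the real class above; the actual model would be
    # lazy-loaded inside it, e.g. with spacy.load(...).
    def __init__(self, model_id):
        self._id = model_id

def get_model(model_id):
    # Check, lock, re-check: the unlocked read keeps the common case
    # cheap, and the locked re-check prevents two threads from both
    # creating the same model when it is missing from the cache.
    model = MODELS.get(model_id)
    if model is None:
        with _models_lock:
            model = MODELS.get(model_id)
            if model is None:
                model = NlpModel(model_id)
                MODELS[model_id] = model
    return model
```

The lock is only taken on a cache miss, so the 30-second model load happens at most once per ID and normal requests never contend on it.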

1 Comment

Thank you so much for this solution. I incorporated it in my code and it works great! I am new to this field and was stuck on this situation for some time, not knowing how to solve the problem. Thank you again!
