
I want to build a Named Entity Recognition system using the Python spaCy package. However, I couldn't install a model for my local language in spaCy. Can anyone tell me how to install, or otherwise use, my local language?

I tried:

python -m spacy download xx_ent_wiki_sm
  • What is your local language? Commented Jul 21, 2020 at 17:55
  • Amharic, which is spoken in Ethiopia. Commented Jul 23, 2020 at 8:54
  • Is your language model already packaged or just saved in a separate folder? Commented Jul 25, 2020 at 9:54

1 Answer


spaCy provides standalone pretrained models for a limited number of languages. If your language is one of:

Chinese, Danish, Dutch, English, French, German, Greek, Italian, Japanese, Lithuanian, Norwegian Bokmål, Polish, Portuguese, Romanian or Spanish

then you can install the model with a command similar to the one you posted, for example:

# Lithuanian language
python -m spacy download lt_core_news_sm

# Japanese language
python -m spacy download ja_core_news_sm

Run this command in your command line (terminal). Once the model has finished downloading and is linked, you can load it like this:

import spacy

# Loading the Japanese language model.
nlp = spacy.load("ja_core_news_sm")
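If `spacy.load` fails, it may help to check first whether the model was really installed. The following is a minimal sketch, assuming the models were installed with `python -m spacy download`, which installs each one as a regular Python package. The helper name `first_installed_model` and the sample sentence are made up for illustration and are not part of spaCy.

```python
import importlib.util


def first_installed_model(candidates):
    """Return the first name in `candidates` that is an importable package, else None."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


# Prefer the language-specific model, then fall back to the multi-language one.
model_name = first_installed_model(["ja_core_news_sm", "xx_ent_wiki_sm"])

if model_name is None:
    print("No model found. Run: python -m spacy download ja_core_news_sm")
else:
    import spacy

    nlp = spacy.load(model_name)
    doc = nlp("東京は日本の首都です。")  # illustrative sample sentence
    # Each recognized entity exposes its text and its label.
    for ent in doc.ents:
        print(ent.text, ent.label_)
```

The check uses only the standard library, so it still gives a readable message when the download step did not succeed.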

spaCy also supports a multi-language model that you can try if your language does not have its own model. It looks like you already tried to install it with the command in your question. To use it:

# In command line
python -m spacy download xx_ent_wiki_sm

# In Python
import spacy
nlp = spacy.load("xx_ent_wiki_sm")

However, do not expect state-of-the-art results from the multi-language model, as it is not trained specifically on a single language like the other models are.
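Because the quality will vary by language, it is worth looking at what the multi-language model actually finds in your own text before relying on it. This is only a rough sketch: `entities_by_label` is a hypothetical helper, not a spaCy function, and the sample sentence is a placeholder that you would replace with text in your language.

```python
from collections import defaultdict


def entities_by_label(doc):
    """Group the entity texts found in `doc` by their label."""
    grouped = defaultdict(list)
    for ent in doc.ents:
        grouped[ent.label_].append(ent.text)
    return dict(grouped)


try:
    import spacy

    nlp = spacy.load("xx_ent_wiki_sm")
except (ImportError, OSError):
    # spacy.load raises OSError when the model cannot be found.
    nlp = None
    print("spaCy or xx_ent_wiki_sm is not installed; see the download command above.")

if nlp is not None:
    # Placeholder text; replace it with sentences in your own language.
    sample = "Abebe Bikila was born in Ethiopia."
    print(entities_by_label(nlp(sample)))
```

Running this on a handful of sentences whose entities you already know gives a quick impression of how well the model copes with your language.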


2 Comments

Thank you for your comment, but how do I install my own language model/package in Python, the way I install the spaCy package?
The last comment here was in 2020. Do you know whether the multi-language model has evolved since then? Has it become more reliable? I can see on the spaCy site that the number of supported languages is now 88, so it is apparently no longer restricted to a small set of languages. I'd be happy to hear from you. Thanks!
