329 questions
0
votes
0
answers
51
views
Training with spaCy from command line, don't know why gpu-id not recognized
I am having the hardest of times getting my training session to use my GPU 0, which by every measure is present and correctly set up with CUDA 12.2.
When I try to do python -m spacy train base_config....
0
votes
0
answers
25
views
Retrieving spaCy transformer tokenization ids
While using the spaCy transformer pipeline en_core_web_trf, how can I retrieve the transformer tokenization (often roberta-base)? It can be the tokenizer IDs, the tokenizer strings, or preferably both.
Actual ...
0
votes
0
answers
67
views
How do I include a custom component in a spaCy training pipeline using the CLI?
I'm trying to implement a simple custom component in my spaCy training pipeline. I'm using the spaCy CLI for training, which means I'm directing the pipeline configuration through the config.cfg file, ...
1
vote
1
answer
45
views
How can I use multiple spacy.train files in one training run?
I've downloaded the UD Treebank dataset, set up a shell script to discover all folders for a given language and converted the .conllu files to .spacy.
Now I have a collection of files like this: ...
2
votes
1
answer
142
views
What is causing this error in the official spacy classy classification example?
I've been trying to learn how to use spacy and now I want to learn how to use classy classification, however, the example of classy shown in the official spacy webpage is not working. Here's the code ...
1
vote
0
answers
42
views
Spacy detect correctly GPE
I have a set of strings for which I need to detect the country each one belongs to, based on the detected GPE.
sentences = [
"I watched TV in germany",
"Mediaset ITA canale 5",
...
1
vote
1
answer
479
views
Cannot use GPU for custom spaCy NER model
I'm trying to make a custom NER model using spaCy. When I try to use the GPU, it throws an error stating that CuPy is not installed, even though it is. Relevant info is attached below.
> ubuntu@:~$ ...
0
votes
1
answer
148
views
SpaCy transformer NER training – zero loss on transformer, not trained
I am training a spaCy pipeline with ['transformer', 'ner'] components. The ner component trains well, but the transformer is stuck at 0 loss and, I assume, is not training.
Here is my config:
[paths]
vectors = ...
0
votes
2
answers
64
views
Spacy - return nouns without the grammatical articles
In spaCy, when we request the nouns, the grammatical articles (e.g. "the", "one", "a") are also returned:
import spacy
nlp_en = spacy.load('en_core_web_sm') # v3.7.1
doc ...
1
vote
1
answer
272
views
Load Spacy language module according to detected language
Everywhere I look, I see this example for the LanguageDetector package:
import spacy
from spacy.language import Language
from spacy_langdetect import LanguageDetector
def get_lang_detector(nlp, name):
...
0
votes
1
answer
180
views
Can I monitor progress of spacy parsing?
I have a simple program to process English text with spacy and output some of the info about the tokens. For a big text it takes a long time for spacy to process it. Is there a way to see how far the ...
1
vote
1
answer
1k
views
Can a Named Entity Recognition (NER) spaCy model or any code like an entity ruler around it catch my new further date patterns also as DATE entities? [duplicate]
Anonymization of entities found by a NER model
I am trying to anonymize files using an NER model for German text that may sometimes contain a few English words. If I take spaCy NER models for German and ...
0
votes
1
answer
152
views
Spacy v3 DocBin unable to save train.spacy bytes object is too large
I want to train on a large dataset in spaCy v3.0+.
The data contains 8,000,000 tokens.
I split it into chunks of 1,000,000 each and finally merge them via DocBin in Python, but I get an error:
import os
import spacy
from spacy....
0
votes
0
answers
192
views
Integrating spaCy with SQL Server 2022 Machine Learning Services (MLS)
SQL Server Machine Learning Services (MLS) facilitates running Python and R scripts directly within the SQL Server engine.
This tutorial explains how to store a trained model as VARBINARY in a table ...
0
votes
0
answers
88
views
I am encountering problems while using PyInstaller to package a Qt application that contains a spaCy language model
My Python version is 3.10.13, using a .venv environment. The spaCy model I load is zh_core_web_trf. The code runs normally in VSCode, but when I package it with PyInstaller, it shows an error. I have ...
1
vote
0
answers
3k
views
Lambda function AttributeError: module 'os' has no attribute 'add_dll_directory'
I am encountering an issue with my AWS Lambda function with runtime Python 3.9. The function uses Spacy (version 3.7.2) and is set up as a layer in AWS Lambda. During execution, I'm facing the ...
0
votes
1
answer
220
views
Switch spacy lemmatizer's mode for french language
With Spacy, I want to change the lemmatizer of the French model ('rule-based' by default) to 'lookup'.
I'm using spacy 3.6.1, fr_core_news_lg-3.6.0 model and spacy-lookups-data 1.0.5
This seemed to be ...
-2
votes
1
answer
109
views
Is there a method to extract quotes and their related speakers in the French language?
Is there a method to extract quotes and their related speakers, with coreference handling?
In the output, I want to get a dict like [{"speaker" : , "quotes": }], and if we don't find ...
1
vote
1
answer
78
views
spaCy custom component function is never called
I am adding a custom component to spaCy but it never gets called:
@Language.component("custom_sentence_boundaries")
def custom_sentence_boundaries(doc):
    print(".")
    for ...
0
votes
1
answer
309
views
Spacy - pdf_reader extraction of text only from specific pages
Could you please tell me what is wrong with the function below? I would like to parse only the first two pages of the PDF. When I call the function with the argument page_numbers=[0,1], it extracts text from all ...
1
vote
0
answers
104
views
pke - extractor.load_document (Spacy) limitation of 1000000 characters
While using the extractor.load_document() function of the Python package pke (https://github.com/boudinfl/pke), I encounter this error:
ValueError: [E088] Text of length 1717453 exceeds maximum of 1000000. ...
1
vote
1
answer
383
views
How can I enhance morphological information for English models in spaCy?
I am trying to detect verbs that are in the imperative mood using English models in spaCy but I am seeing morphological features that are inconsistent with the examples found in the Morphology ...
-1
votes
1
answer
56
views
error inserting spacy.tokens.span.Span into pandas dataframe
I am using scispacy and trying to use the Hearst Patterns feature, which returns a spacy.tokens.span.Span object. When I try to put the result into a DataFrame, I get an error: the object is treated as several ...
8
votes
1
answer
12k
views
TypeError: issubclass() arg 1 must be a class
I am trying to use the spaCy library again for my NLP task. A few days ago it was working totally fine with spacy.load("en_core_web_sm"). I thought of using medium instead of small, but now ...
0
votes
1
answer
413
views
Spacy Not running on GPU in windows 11
I installed spaCy using pip install spacy[cuda122],
and calling spacy.require_gpu() returns True.
My nvcc --version returns
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA ...
0
votes
1
answer
748
views
How to register custom components in a SpaCy config.cfg file?
As the title states:
I seem to have followed the documentation as described, and I have looked all over the web for a useful answer, but so far I have not found much. Any help is much appreciated! ...
0
votes
1
answer
570
views
Unable to download en_core_web_trf for spacy
I tried using this command.
python -m spacy download en_core_web_trf
and got this:
TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a ...
0
votes
1
answer
59
views
Why token "less" has higher similarity with "more" in Spacy?
I'm trying to find sentences that contain the word less or words similar to less. When I compute the token similarity between less and all the words in the doc, I get words like less, more, which ...
2
votes
0
answers
265
views
How to get the Corresponding Negation Terms used for a Set of Detected Negated Lexicons in NegSpacy?
I am working on a project with a clinical dataset. So far, I have been able to detect all the diagnoses and whether they are negated or not. But what I would really like to get as well is the negation term used ...
1
vote
1
answer
184
views
How to create a Entity Ruler pattern that includes dot and hyphen?
I am trying to include the Brazilian CPF as an entity in my NER app using spaCy. The current code is as follows:
import spacy
from spacy.pipeline import EntityRuler
nlp = spacy.load("pt_core_news_sm...
0
votes
1
answer
388
views
Adding a ner from another pretrained model to a blank model including static vectors
I am very new to Spacy so this question might be kinda dumb, but I can't figure out how to add a NER from an existing model to a blank model.
I am following this example: https://spacy.io/api/language#...
2
votes
0
answers
1k
views
spacy - how to load a downloaded pretrained pipeline
How do I load the downloaded pretrained pipeline, and where is this explained in the documentation?
import spacy
spacy.cli.download("en_core_web_sm", False, False, "--target", "/tmp/...
0
votes
1
answer
197
views
Not sure why my Python code that uses Spacy to add a phone_number entity is not working
The pattern works with the Matcher, but not as an entity. Here is my code:
import spacy
from spacy.pipeline import EntityRuler
nlp = spacy.load("en_core_web_sm")
patterns = [
{
...
2
votes
0
answers
184
views
Spacy to tensorflow lite
I created a spaCy model with NER and text classification. This model works great, but I'm wondering if I can export it to TensorFlow Lite to test it on a mobile device. Do you think it's possible?
I ...
0
votes
1
answer
43
views
Spacy dependency matcher doesnt find matches in reverse
I am trying to find words related by 'poss' to the word 'my', but it doesn't work. For example, in reverse:
pattern = [
{
"RIGHT_ID": "anchor_founded",
"...
0
votes
1
answer
269
views
Spacy - Span that completely lie within another Span
I have docs in spacy that use spans, such as:
sent = 'I eat 5 apples and 2 bananas.'
doc = nlp(sent)
doc.spans['sc'] = [
    Span(doc, 2, 3, 'Ingredient'),
    Span(doc, 5, 6, 'Ingredient'),
    Span(...
1
vote
0
answers
68
views
Why is my SpaCy model returning an empty dictionary
Here is how I created the model and how I am testing it. For the life of me, I can't figure out why it's returning an empty dictionary without any predictions.
I used prodigy to annotate some data ...
1
vote
0
answers
31
views
How to improve entity recognition?
I am using Spacy's named entity recognition with it_core_news_lg but getting low-quality output, e.g.
{'Carlo Rossi?I', 'Carlo', 'Teresa Rossi', 'Pietro?', 'Bruno?Sì', 'Pietro Rossi', 'Pia', 'Maria', '...
0
votes
0
answers
851
views
How do I get confidence score in spacy 3.5 prediction?
I am working with spaCy 3.5. Everything is great, but I have not found a way to get a confidence score from my NER predictions.
I trained a custom NER model with spacy 3.5. I am able to make predictions ...
1
vote
0
answers
242
views
Spacy Entity Linker with Transformer Listener problem
I have a pretrained pipeline composed of transformer and NER components, and I am trying to create an Entity Linker that can use the embedding representations produced by the transformer rather than using ...
0
votes
1
answer
751
views
I have created a custom NER using spacy and i want to train it with additional data but what to change in config.cfg file?
I have created a spaCy NER model for named entity recognition, and it has tok2vec and ner as components in the pipeline. Now I want to add some more data to it, so I am using a model-best directory ...
3
votes
5
answers
4k
views
AttributeError: module 'click.utils' has no attribute '_expand_args', when i'm tring to install en_core_web_sm
I am trying to install "en_core_web_sm",
the commands I ran are:
pip install spacy (which ran perfectly and got installed)
python -m spacy download en_core_web_sm (here I'm getting the error ...
0
votes
1
answer
41
views
Is it possible to check if a subset of a dictionary comes from a main dictionary in python?
I am working on an NLP process using spaCy and trying to cross-check the results of one dictionary (the result of an analysis) against the full dictionary (predetermined by me). I am trying to take ...
0
votes
1
answer
100
views
Where is it possible to find python documentation for training spacy model/pipelines
I have been looking through the spacy documentation on training/fine-tuning spacy models or pipelines, however, after walking through the following guide https://spacy.io/usage/training I found that ...
0
votes
1
answer
80
views
Spacy Extracting mentions with Matcher
Hi everyone, I am trying to match a sentence within a bigger sentence using spaCy's rule-based Matcher, but the output is empty.
import spacy
from spacy.matcher import Matcher
nlp = spacy.load("...
1
vote
0
answers
264
views
Trying to train a spacy model using GPU but PyTorch throws the "Out Of Memory Error"
I'm trying to build a parser to recognize entities in a given text. To train the spaCy model on the GPU, I installed all the necessary packages, but during training it runs the first ...
0
votes
1
answer
3k
views
Failed to establish a new connection: [Errno 110] Connection timed out' while downloading en_core_web_sm
I am a beginner learning spaCy. To set up my environment, I tried to download the en_core_web_sm model using the command python3 -m spacy download en_core_web_sm, but after some minutes it is ...
0
votes
1
answer
102
views
Spacy Regex "SyntaxError: invalid syntax" [closed]
Hi everyone, I am executing this code in spaCy to match with a regex, but I get an error:
import spacy
from spacy.matcher import Matcher
nlp = spacy.load("en_core_web_md")
doc1 = nlp("...
3
votes
1
answer
69
views
Spacy Rule-Based Matching outputs undesired phrase bit
I was reproducing a Spacy rule-matching example:
import spacy
from spacy.matcher import Matcher
nlp = spacy.load("en_core_web_md")
doc = nlp("Good morning, I'm here. I'll say good ...
1
vote
1
answer
565
views
How to update NER training of existing transformer model code from v2 to SpaCy v3.x
I want to understand how to port the NER example that updates the model to recognize a new entity (here ANIMAL) from spaCy v2.x to v3.x:
https://github.com/explosion/spaCy/blob/v2.3....