Okay, so I have a simple interface that I designed with the Django framework that takes natural language input from a user and stores it in a table.
Additionally, I have a pipeline that I built in Java using the cTAKES library to do named entity recognition, i.e. it takes the text submitted by the user and annotates it with the relevant UMLS tags.
What I want to do is take the user's input and, once it's submitted, send it through my Java cTAKES pipeline and then feed the annotated output back into the database.
I am pretty new to the web development side of this and can't really find anything on integrating external scripts in this way. So, if someone could point me to a useful resource or just in the right general direction, that would be extremely helpful.
UPDATE:
Okay, so I have figured out that subprocess is the module I want to use in this context. I have tried implementing some simple code based on the documentation, but I am getting the following error:
Exception Type: OSError
Exception Value: [Errno 2] No such file or directory
Exception Location: /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py in _execute_child, line 1335.
A brief overview of what I'm trying to do:
This is the code I have in views. Its intent is to take text input from the model form, POST it to the DB, and then pass that input to my script, which produces an XML file that is stored in another column in the DB. I'm very new to Django, so I'm sorry if this is a simple fix, but I couldn't find any helpful documentation relating Django to subprocess.
def queries_create(request):
    if not request.user.is_authenticated():
        return render(request, 'login_error.html')
    form = QueryForm(request.POST or None)
    if form.is_valid():
        instance = form.save(commit=False)
        instance.save()
        p = subprocess.Popen([request.POST['post'], './path/to/run_pipeline.sh'])
        p.save()
    context = {
        "title": "Create",
        "form": form,
    }
    return render(request, "query_form.html", context)
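For reference, here is a minimal sketch (not the actual view) of how subprocess.Popen treats its argument list: the first element is taken as the program to execute and the remaining elements as its arguments. If the user's text comes first, Popen tries to execute that text as a program, which would be consistent with an "[Errno 2] No such file or directory" error. The names ECHO_COMMAND, run_with_argument and try_text_as_program are made up for illustration, and a tiny Python one-liner stands in for run_pipeline.sh.

```python
import errno
import subprocess
import sys

# Hypothetical stand-in for run_pipeline.sh: a tiny program that echoes its first argument.
ECHO_COMMAND = [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])"]


def run_with_argument(command, text):
    """Run command with text appended as its last argument; return stdout as str."""
    # Program first, arguments after it.
    proc = subprocess.Popen(command + [text], stdout=subprocess.PIPE)
    stdout_data, _ = proc.communicate()
    return stdout_data.decode("utf-8")


def try_text_as_program(text):
    """Put the text first so Popen tries to execute it; return the resulting errno."""
    try:
        subprocess.Popen([text] + ECHO_COMMAND)
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    print(run_with_argument(ECHO_COMMAND, "patient has a fever"))
    print(try_text_as_program("patient has a fever") == errno.ENOENT)
```

In this sketch the order of the list is the only difference between the call that works and the call that raises OSError.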
Model code snippet:
class Query(models.Model):
    problem/intervention = models.TextField()
    updated = models.DateTimeField(auto_now=True, auto_now_add=False)
    timestamp = models.DateTimeField(auto_now=False, auto_now_add=True)
UPDATE 2: Okay, so the code no longer breaks after changing the subprocess call as shown below:
def queries_create(request):
    if not request.user.is_authenticated():
        return render(request, 'login_error.html')
    form = QueryForm(request.POST or None)
    if form.is_valid():
        instance = form.save(commit=False)
        instance.save()
        p = subprocess.Popen(['path/to/run_pipeline.sh'], stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE)
        (stdoutdata, stderrdata) = p.communicate()
        instance.processed_data = stdoutdata
        instance.save()
    context = {
        "title": "Create",
        "form": form,
    }
    return render(request, "query_form.html", context)
However, I am now getting a "Could not find or load main class pipeline.CtakesPipeline" error, which I don't understand, since the script runs fine from the shell in this working directory. This is the script I am trying to call with subprocess:
#!/bin/bash
INPUT=$1
OUTPUT=$2
CTAKES_HOME="full/path/to/CtakesClinicalPipeline/apache-ctakes-3.2.2"
UMLS_USER="####"
UMLS_PASS="####"
CLINICAL_PIPELINE_JAR="full/path/to/CtakesClinicalPipeline/target/CtakesClinicalPipeline-0.0.1-SNAPSHOT.jar"
[[ $CTAKES_HOME == "" ]] && CTAKES_HOME=/usr/local/apache-ctakes-3.2.2
CTAKES_JARS=""
for jar in $(find ${CTAKES_HOME}/lib -iname "*.jar" -type f)
do
    CTAKES_JARS+=$jar
    CTAKES_JARS+=":"
done
current_dir=$PWD
cd $CTAKES_HOME
java -Dctakes.umlsuser=${UMLS_USER} -Dctakes.umlspw=${UMLS_PASS} \
    -cp ${CTAKES_HOME}/desc/:${CTAKES_HOME}/resources/:${CTAKES_JARS%?}:${current_dir}/${CLINICAL_PIPELINE_JAR} \
    -Dlog4j.configuration=file:${CTAKES_HOME}/config/log4j.xml -Xms512M -Xmx3g \
    pipeline.CtakesPipeline $INPUT $OUTPUT
cd $current_dir
I'm not sure how to go about fixing this error, so any help is appreciated.
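Possibly relevant: a process started with subprocess inherits the working directory of the parent (here, the Django process) unless cwd is passed to Popen, so $PWD inside the script may not be the directory the script was tested from in the shell. Below is a minimal sketch of that behaviour. The helper child_working_directory and the temporary directory are hypothetical stand-ins, and whether this actually explains the error above is an assumption.

```python
import os
import subprocess
import sys
import tempfile

# Child program that prints its own working directory (stand-in for a script reading $PWD).
PRINT_CWD = [sys.executable, "-c", "import os, sys; sys.stdout.write(os.getcwd())"]


def child_working_directory(cwd=None):
    """Start a child process and return the working directory it reports."""
    proc = subprocess.Popen(PRINT_CWD, stdout=subprocess.PIPE, cwd=cwd)
    stdout_data, _ = proc.communicate()
    return stdout_data.decode("utf-8")


if __name__ == "__main__":
    project_dir = tempfile.mkdtemp()  # hypothetical directory the script expects to run from
    print(child_working_directory())             # the parent's current directory
    print(child_working_directory(project_dir))  # the directory passed as cwd
    os.rmdir(project_dir)
```

If the working directory does turn out to matter, passing cwd to Popen, or using absolute paths inside the script, would be the two things to try.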