
I'm parameterizing my Bokeh apps by having my Flask app expose model data through a route that jsonifies the data requested via query string arguments. I know the data-sending route works: when I use it as the URL for an AjaxDataSource, the expected data is plotted. However, when I attempt the equivalent operation with requests.get, I get a 503 response code, which makes me think I'm violating something fundamental that I can't quite grasp with my limited web development experience. What am I doing wrong?

I actually need more flexibility in retrieving data than AjaxDataSource provides, given its columnar limitations. I was hoping to lean on the requests module to pass arbitrary class instances and the like around by serializing and deserializing JSON.

Here's a minimal example demonstrating the failure, derived from flask_embed.html...

import requests
from flask import Flask, jsonify, render_template
import pandas
from tornado.ioloop import IOLoop

from bokeh.application          import Application
from bokeh.application.handlers import FunctionHandler
from bokeh.embed                import server_document
from bokeh.layouts              import column
from bokeh.models               import AjaxDataSource,ColumnDataSource
from bokeh.plotting             import figure
from bokeh.server.server        import Server

flask_app = Flask(__name__)

# Populate some model maintained by the flask application
modelDf = pandas.DataFrame()
nData = 100
modelDf[ 'c1_x' ] = range(nData)
modelDf[ 'c1_y' ] = [ x*x for x in range(nData) ]
modelDf[ 'c2_x' ] = range(nData)
modelDf[ 'c2_y' ] = [ 2*x for x in range(nData) ]

def modify_doc1(doc):
    # get column name from query string
    args      = doc.session_context.request.arguments
    paramName = str( args['colName'][0].decode('utf-8') )

    # get model data from Flask
    url    = "http://localhost:8080/sendModelData/%s" % paramName 
    source = AjaxDataSource( data             = dict( x=[] , y=[] ) ,
                            data_url         = url       ,
                            polling_interval = 5000      ,
                            mode             = 'replace' ,
                            method           = 'GET'     )
    # plot the model data
    plot = figure( )
    plot.circle( 'x' , 'y' , source=source , size=2 )
    doc.add_root(column(plot))

def modify_doc2(doc):
    # get column name from query string
    args    = doc.session_context.request.arguments
    colName = str( args['colName'][0].decode('utf-8') )

    # get model data from Flask
    url = "http://localhost:8080/sendModelData/%s" % colName
    #pdb.set_trace()
    res = requests.get( url , timeout=None , verify=False )
    print( "CODE %s" % res.status_code )
    print( "ENCODING %s" % res.encoding )
    print( "TEXT %s" % res.text )
    data = res.json()

    # plot the model data
    plot = figure()
    plot.circle( 'x' , 'y' , source=data , size=2 )
    doc.add_root(column(plot))


bokeh_app1 = Application(FunctionHandler(modify_doc1))
bokeh_app2 = Application(FunctionHandler(modify_doc2))

io_loop = IOLoop.current()

server = Server({'/bkapp1': bokeh_app1 , '/bkapp2' : bokeh_app2 }, io_loop=io_loop, allow_websocket_origin=["localhost:8080"])
server.start()

@flask_app.route('/', methods=['GET'] )
def index():
    res =  "<table>"
    res += "<tr><td><a href=\"http://localhost:8080/app1/c1\">APP1 C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/app1/c2\">APP1 C2</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/app2/c1\">APP2 C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/app2/c2\">APP2 C2</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/sendModelData/c1\">DATA C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/sendModelData/c2\">DATA C2</a></td></tr>"
    res += "</table>"
    return res

@flask_app.route( '/app1/<colName>' , methods=['GET'] )
def bkapp1_page( colName ) :
    script = server_document( url='http://localhost:5006/bkapp1' , arguments={'colName' : colName } )
    return render_template("embed.html", script=script)

@flask_app.route( '/app2/<colName>' , methods=['GET'] )
def bkapp2_page( colName ) :
    script = server_document( url='http://localhost:5006/bkapp2', arguments={'colName' : colName } )
    return render_template("embed.html", script=script)

@flask_app.route('/sendModelData/<colName>' , methods=['GET'] )
def sendModelData( colName ) :
    x = modelDf[ colName + "_x" ].tolist()
    y = modelDf[ colName + "_y" ].tolist()
    return jsonify( x=x , y=y )

if __name__ == '__main__':
    from tornado.httpserver import HTTPServer
    from tornado.wsgi import WSGIContainer
    from bokeh.util.browser import view

    print('Opening Flask app with embedded Bokeh application on http://localhost:8080/')

    # This uses Tornado to serve the WSGI app that Flask provides. Presumably the IOLoop
    # could also be started in a thread, and Flask could serve its own app directly
    http_server = HTTPServer(WSGIContainer(flask_app))
    http_server.listen(8080)

    io_loop.add_callback(view, "http://localhost:8080/")
    io_loop.start()

Here are the rendered pages... (Screenshot: comparison of working vs. non-working Flask model JSON retrieval)

Here's some debug output...

C:\TestApp>python flask_embedJSONRoute.py
Opening Flask app with embedded Bokeh application on http://localhost:8080/
> C:\TestApp\flask_embedjsonroute.py(52)modify_doc2()
-> res = requests.get( url , timeout=None , verify=False )
(Pdb) n
> C:\TestApp\flask_embedjsonroute.py(53)modify_doc2()
-> print( "CODE %s" % res.status_code )
(Pdb) n
CODE 503
> C:\TestApp\flask_embedjsonroute.py(54)modify_doc2()
-> print( "ENCODING %s" % res.encoding )
(Pdb) n
ENCODING utf-8
> C:\TestApp\flask_embedjsonroute.py(55)modify_doc2()
-> print( "TEXT %s" % res.text )
(Pdb) n
TEXT
> C:\TestApp\flask_embedjsonroute.py(56)modify_doc2()
-> data = res.json()
(Pdb)

  File "C:\Anaconda3\lib\json\decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

1 Answer


This appears not to be an issue with Bokeh per se, but rather an issue with threading and blocking in the server that runs the Flask app.

It's reproducible entirely apart from Bokeh...

import requests
from flask import Flask, jsonify, request
import pandas
import pdb

flask_app = Flask(__name__)

# Populate some model maintained by the flask application
modelDf = pandas.DataFrame()
nData = 100
modelDf[ 'c1_x' ] = range(nData)
modelDf[ 'c1_y' ] = [ x*x for x in range(nData) ]
modelDf[ 'c2_x' ] = range(nData)
modelDf[ 'c2_y' ] = [ 2*x for x in range(nData) ]

@flask_app.route('/', methods=['GET'] )
def index():
    res =  "<table>"
    res += "<tr><td><a href=\"http://localhost:8080/sendModelData/c1\">SEND C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/sendModelData/c2\">SEND C2</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/RequestsOverFlaskNoProxy?colName=c1\">REQUEST OVER FLASK NO PROXY C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/RequestsOverFlaskNoProxy?colName=c2\">REQUEST OVER FLASK NO PROXY C2</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/RequestsOverFlask?colName=c1\">REQUEST OVER FLASK C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/RequestsOverFlask?colName=c2\">REQUEST OVER FLASK C2</a></td></tr>"
    res += "</table>"   
    return res

@flask_app.route('/RequestsOverFlaskNoProxy')
def requestsOverFlaskNoProxy() :
    print("RequestsOverFlaskNoProxy")
    # get column name from query string
    colName = request.args.get('colName')

    # get model data from Flask
    url = "http://localhost:8080/sendModelData/%s" % colName

    print("Get data from %s" % url )
    session = requests.Session()
    session.trust_env = False
    res = session.get( url , timeout=5000 , verify=False )
    print( "CODE %s" % res.status_code )
    print( "ENCODING %s" % res.encoding )
    print( "TEXT %s" % res.text )
    data = res.json()
    return data

@flask_app.route('/RequestsOverFlask')
def requestsOverFlask() :
    # get column name from query string
    colName = request.args.get('colName')

    # get model data from Flask
    url = "http://localhost:8080/sendModelData/%s" % colName
    res = requests.get( url , timeout=None , verify=False )
    print( "CODE %s" % res.status_code )
    print( "ENCODING %s" % res.encoding )
    print( "TEXT %s" % res.text )
    data = res.json()
    return data

@flask_app.route('/sendModelData/<colName>' , methods=['GET'] )
def sendModelData( colName ) :
    x = modelDf[ colName + "_x" ].tolist()
    y = modelDf[ colName + "_y" ].tolist()
    return jsonify( x=x , y=y )

if __name__ == '__main__':
    print('Opening Flask app on http://localhost:8080/')

    # THIS DOES NOT WORK
    #flask_app.run( host='0.0.0.0' , port=8080 , debug=True )

    # THIS WORKS
    flask_app.run( host='0.0.0.0' , port=8080 , debug=True , threaded=True ) 

(Screenshot: different behavior serving the same data)

One can see from the screenshot that serving data directly from sendModelData renders the JSON appropriately, but fetching it via requests.get raises an exception due to a 503 code, as reported in the Python console.

If I make the same attempt while trying to eliminate the effect of the proxies, which I have enabled via environment variables, the request never completes and leaves the browser spinning indefinitely.

Come to think of it, it may be completely unnecessary to use requests as a middleman; I should be able to just get the JSON string and deserialize it myself. That would work in this setup, but in my actual code the Bokeh rendering is done in a completely different Python module than the Flask application, so these functions are not even available unless I scramble the layering of the app.

EDIT: As it turns out, the fundamental thing I was violating had to do with Flask's development environment...

You are running your WSGI app with the Flask test server, which by default uses a single thread to handle requests. So when your one request thread tries to call back into the same server, it is still busy trying to handle that one request. https://stackoverflow.com/a/22878916/1330381
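To make the quoted explanation concrete, here is a minimal sketch that reproduces the same pattern using only the standard library rather than Flask. Everything in it is an assumption made for illustration: the `Handler` class, the `/outer` and `/inner` paths, the `fetch` and `outer_status` helpers, and the one-second timeout are hypothetical and do not come from the code above. The `/outer` handler calls back into the server that is serving it, once with a single-threaded server and once with a threaded one.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer, ThreadingHTTPServer


def fetch(port, path, timeout):
    """Blocking GET against 127.0.0.1:<port>; returns (status, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.read().decode("utf-8")
    finally:
        conn.close()


class Handler(BaseHTTPRequestHandler):
    """Hypothetical handler: /inner serves data, anything else re-enters the server."""

    def do_GET(self):
        if self.path == "/inner":
            status, body = 200, json.dumps({"x": [0, 1, 2], "y": [0, 1, 4]})
        else:
            # Call back into the very server that is handling this request.
            try:
                status, body = fetch(self.server.server_port, "/inner", timeout=1)
            except OSError:
                # The nested request was not answered in time.
                status, body = 503, ""
        payload = body.encode("utf-8")
        try:
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)
        except OSError:
            pass  # the caller already gave up and closed the socket

    def log_message(self, *args):
        pass  # keep the demo quiet


def outer_status(server_cls):
    """Start server_cls on a free port, request /outer once, return the status."""
    server = server_cls(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch(server.server_port, "/outer", timeout=5)[0]
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print("threaded:", outer_status(ThreadingHTTPServer))
    print("single-threaded:", outer_status(HTTPServer))
```

Under these assumptions the threaded server should answer the nested request normally, while the single-threaded one cannot accept the nested connection until the outer handler returns, so the nested call times out. This mirrors the effect of `threaded=True`, but it is only an analogy and not a description of how Flask's development server is implemented.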

So then the question becomes how to apply this threaded=True technique in the original Bokeh example. This may not be possible because the flask_embed.py example relies on the Tornado WSGI server, and this question suggests Tornado is single-threaded by design. Given the above findings, an even keener question is how AjaxDataSource altogether avoids the threading issues faced by the requests module.
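The single-threaded event loop case can be sketched in a similar spirit. This is a hedged illustration that uses the standard library's asyncio in place of Tornado and Bokeh. The names `blocking_fetch`, `handle`, and `outer_reply`, the one-line text protocol, and the timeouts are all hypothetical. It assumes, as the original example's setup suggests, that the code making the blocking request runs on the same loop that must answer it. The sketch compares making the blocking call directly on the loop with handing it to a worker thread through `run_in_executor`.

```python
import asyncio
import functools
import socket


def blocking_fetch(port, message, timeout=1.0):
    """Blocking client: send one line, return the one-line reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as sock:
        sock.sendall(message + b"\n")
        return sock.makefile("rb").readline().strip()


async def handle(reader, writer, offload):
    """Hypothetical handler: 'inner' serves data, anything else re-enters the server."""
    line = (await reader.readline()).strip()
    port = writer.get_extra_info("sockname")[1]
    if line == b"inner":
        reply = b"data"
    else:
        try:
            if offload:
                # Run the blocking call in a worker thread; the loop stays free.
                loop = asyncio.get_running_loop()
                reply = await loop.run_in_executor(None, blocking_fetch, port, b"inner")
            else:
                # Run the blocking call on the loop itself; nothing else can be served.
                reply = blocking_fetch(port, b"inner")
        except OSError:
            reply = b"unavailable"
    try:
        writer.write(reply + b"\n")
        await writer.drain()
    except ConnectionError:
        pass  # the caller already gave up
    finally:
        writer.close()


async def outer_reply(offload):
    """Serve on a free port and ask for 'outer' once from a worker thread."""
    server = await asyncio.start_server(
        functools.partial(handle, offload=offload), "127.0.0.1", 0
    )
    port = server.sockets[0].getsockname()[1]
    try:
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, blocking_fetch, port, b"outer", 5.0)
    finally:
        server.close()


if __name__ == "__main__":
    print("offloaded:", asyncio.run(outer_reply(offload=True)))
    print("on the loop:", asyncio.run(outer_reply(offload=False)))
```

With the call offloaded, the loop should remain free to answer the nested request. Made directly on the loop, the same call stalls everything until it times out. This may also hint at why AjaxDataSource is unaffected: its requests are issued by BokehJS in the browser, so they reach the server from outside rather than from inside a handler that is occupying the loop.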


Update: Some more background on the Bokeh and Tornado coupling...

53:05 so they're actually are not very many, the question is about the dependencies for Bokeh and the Bokeh server. The new Bokeh server is built on tornado and that's pretty much the main dependency is it uses tornado. Beyond that there's not very many dependencies, runtime dependencies, for Bokeh. pandas is an optional dependency for Bokeh.charts. There's other dependencies, you know numpy is used. But there's only, the list of dependencies I think is six or seven. We've tried to pare it down greatly over the years and so, but the main dependency of the server is tornado. Intro to Data Visualization with Bokeh - Part 1 - Strata Hadoop San Jose 2016


2 Comments

Did you ever figure this out?
@CelesteManu are you referring to the open question about getting Bokeh to be multithreaded? It seems to be tied up with Tornado infrastructure. For sure. I'll update the answer below with a quote from a recentish conference.
