1

I'm trying to get the URL that has been requested in Python without using a web framework.

For example, on a page (let's say /main/index.html), the user clicks a link to /main/foo/bar (/foo/bar doesn't exist). Apache (with mod_rewrite) then redirects the user to a PHP script at /main/, which reads the requested URL and searches MySQL for any matching fields. Then the rest of the field is returned. This is what I used in PHP:

$_SERVER["REQUEST_URI"];

I'd rather not use PHP since it's becoming increasingly difficult to maintain the PHP code whilst the database keeps changing in structure.

I'm pretty sure there's a better way altogether and any mention would be greatly appreciated. For the sake of relevancy, is this even possible (to get the requested URL in Python)? Should I just use a framework, although it seems quite simple?

Thanks in advance,

Jamie

Note: I don't want to use GET for security purposes.

3
  • 1
    how are you integrating Python and Apache? CGI? Mod_Python? Reverse proxy and BaseHTTPServer? Something else? Commented Jan 3, 2012 at 7:39
  • Oops sorry. I'm using mod_python. Would it be easier if I used mod_wsgi? Commented Jan 3, 2012 at 7:40
  • 2
    It would be less dead if you used mod_wsgi. Commented Jan 3, 2012 at 7:44

4 Answers 4

9

Well, if you run your program as a CGI script, you can get the same information in os.environ. However, if I recall correctly, REQUEST_URI as such is not part of the CGI standard and you need to use os.environ['SCRIPT_NAME'], os.environ['PATH_INFO'] and os.environ['QUERY_STRING'] to get the equivalent data.
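As a rough sketch of the above (assuming Python 3 and a plain CGI deployment), the three variables can be stitched back together into something close to PHP's REQUEST_URI. The function name reconstruct_request_uri is made up for illustration, and the result is only an approximation: SCRIPT_NAME and PATH_INFO arrive already decoded, so the exact original encoding cannot be recovered.

```python
import os
from urllib.parse import quote


def reconstruct_request_uri(environ=os.environ):
    """Rebuild an approximation of REQUEST_URI from standard CGI variables."""
    # SCRIPT_NAME and PATH_INFO are already URL-decoded, so re-quote them.
    script_name = quote(environ.get("SCRIPT_NAME", ""))
    path_info = quote(environ.get("PATH_INFO", ""))
    # QUERY_STRING is passed through still encoded, so leave it untouched.
    query = environ.get("QUERY_STRING", "")
    return script_name + path_info + ("?" + query if query else "")


if __name__ == "__main__":
    # Minimal CGI response: one header, a blank line, then the body.
    print("Content-Type: text/plain")
    print()
    print(reconstruct_request_uri())
```

Passing a dict instead of os.environ makes the function easy to try outside a web server.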

However, I seriously urge you to look at a lightweight framework, such as Pyramid. Plain CGI with Python is slow and generally just a pain in the ass.


2 Comments

I've noticed. Thanks for your help.
REQUEST_URI is also the original raw URL provided by the browser client and is not normalised. It may contain '../' sequences, repeated slashes and % encodings, all of which may be modified in some way by the server. You thus have to be very careful when manipulating the value of REQUEST_URI, as your interpretation could end up differing from how the web server interpreted it. When splitting on slashes, the presence of '../' sequences in the original is especially a problem.
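To make the comment above concrete, here is a small sketch (Python 3 assumed) of how a raw request URI can differ from its normalised form. The helper name normalised_path is hypothetical, and this is not a reproduction of Apache's own normalisation rules, only an illustration of why splitting the raw value on slashes is risky.

```python
import posixpath
from urllib.parse import unquote


def normalised_path(request_uri):
    """Decode a raw request URI and collapse it into a clean absolute path."""
    # Drop the query string; only the path part is of interest here.
    raw_path = request_uri.partition("?")[0]
    decoded = unquote(raw_path)
    # normpath collapses repeated slashes and resolves '.' and '..' segments.
    # Leading slashes are stripped first because POSIX normpath keeps '//'.
    return posixpath.normpath("/" + decoded.lstrip("/"))
```

For instance, a raw value of /main/%2e%2e/secret splits into the segments main, %2e%2e and secret, while the normalised path is simply /secret.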
1

Unlike PHP, Python is a general purpose language and doesn't have this built-in.

The way you can gather this information depends on the deployment solution:

  • CGI (mostly Apache with mod_python, deprecated): see @Antti Haapala solution
  • WSGI (most other deployment solutions): see @gurney alex solution

But you will encounter many more problems: session handling, URL management, cookies, and even just simple POST/GET parsing. All of this needs to be done manually if you don't use a framework.

Now, if you feel like a framework is overkill (but really, incredible tools like Django are worth it), you can use a micro framework like bottle.

Microframeworks will typically do this heavy lifting for you, but without the complicated setup or the additional advanced features. Bottle actually has zero setup and is a one-file lib.

Hello world with bottle:

from bottle import route, run, request

@route('/hello/:name')
def index(name='World'):
    return '<b>Hello %s! You are at %s</b>' % (name, request.path)

run(host='localhost', port=8080)

request.path contains what you want, and if you visit http://127.0.0.1:8080/hello/you, you will get:

Hello you! You are at /hello/you

3 Comments

Thanks for your help. I removed mod_python and switched over to mod_wsgi. For some reason, the last command run(host='localhost', port=8080) is being run in an infinite loop and then it breaks (with or without apache). That's obviously not normal. Is my server misconfigured somewhere?
run(host='localhost', port=8080) is just a test server, not meant for production. It is an infinite loop, since it's a server. If you want to use it with Apache, you need to run it using mod_wsgi, but you usually do that at the end of your dev cycle, when you deploy to test it. What did you use it for? What do you mean by "it breaks"?
No, CGI is not about mod_python, you just need mod_cgi or equivalent.
1

When I want to get a URL outside of any framework using Apache2 and Mod_WSGI I use

environ.get('PATH_INFO')

inside of my application() function.
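For context, a minimal sketch of such an application() function might look like the following (Python 3 assumed; the response text is purely illustrative). PATH_INFO only holds the part of the path after the application's mount point, so SCRIPT_NAME is prepended here to get the full path.

```python
def application(environ, start_response):
    # PATH_INFO is the path below the mount point; SCRIPT_NAME is the mount point.
    path = environ.get("PATH_INFO", "")
    full_path = environ.get("SCRIPT_NAME", "") + path
    body = ("Requested path: %s\n" % full_path).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    # WSGI expects an iterable of byte strings as the response body.
    return [body]
```

The same function can be served locally with wsgiref.simple_server from the standard library before pointing mod_wsgi at it.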


0

When using mod_python, if I recall correctly you can use something like:

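from mod_python import apache, util

def handler(request):
    # Parse the submitted form fields (query string or POST body)
    parameters = util.FieldStorage(request)
    # Read the "url" field, falling back to "/" when it is absent
    url = parameters.get("url", "/")
    return apache.OK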

See http://www.modpython.org/live/current/doc-html/pyapi-util.html for more info on the mod_python.util module and the FieldStorage class (including examples)

