8

Background information:

I've searched Stack Overflow for a specific solution and couldn't find one that fixed my situation. Thanks in advance for any help you can offer. Your knowledge is appreciated.

I've decided to accept a contract to "convert" (in the client's words) a Joomla site into a WordPress site. Everything is going along smoothly, except that the Joomla site links to .html files, both in its navigation and in the content of 100+ posts.

Instead of going through each post one-by-one and updating the links or running a SQL command to remove ".html" from URLs, I've decided to put the pressure on .htaccess, with which I am somewhat comfortable.


What I'm trying to do ↓

In WordPress, I have custom permalinks enabled with the following structure: /%category%/%postname%

Here's an example of what one of the old URLs in the posts looks like:

http://the-site.com/category/the-webpage.html

I need the .htaccess file to tell the web server to remove the .html, so that a user who visits "http://the-site.com/category/the-webpage.html" is instead sent to:

http://the-site.com/category/the-webpage

I'm setting up the page stubs to follow the file name of the Joomla pages, so http://the-site.com/category/the-webpage will work.


My question:

Can you help me discover the solution to removing .html from the URL when someone visits the site, even if the HTML file doesn't exist on the server?


Here's how the .htaccess file looked before I made changes:

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress

Here's the latest .htaccess file as of 5:35pm Eastern:

# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} \.html$
RewriteRule ^(.*)\.html$ $1 [R=301,L]

RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress

The ↑latest .htaccess changes work. Thanks Tim!

2
  • You'd want to have the added rewrites within the <IfModule mod_rewrite.c> section, too. An [L] flag might be good to add here, too, since you have other rewrites after the .html one, and they don't apply. Commented Jun 27, 2010 at 21:03
  • I moved the rewrites to within the <IfModule> section, and it still gives me a 404. Could this possibly have anything to do with subcategories and my permalink structure? Commented Jun 27, 2010 at 21:09

3 Answers

15

This will work to force an external redirection to your new URLs, but this may not be ideal for your situation. I'm still trying to think if there's a way to keep the redirection internal and update the variable that WordPress uses to determine which page to serve up, but so far I haven't thought of anything that would work.

Entire .htaccess:

RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} \.html$
RewriteRule ^(.*)\.html$ $1 [R=301,L]

RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
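
A slightly more defensive variant, offered only as a sketch and not tested against this site: it keeps the redirect in its own <IfModule> block above the "# BEGIN WordPress" marker, because WordPress regenerates whatever sits between its BEGIN and END markers when permalink settings are saved. The extra !-f condition is an assumption on my part, not part of the rules above; it lets any real .html file that still exists on disk be served instead of redirected.

    # Custom redirect: kept outside the WordPress-managed block
    <IfModule mod_rewrite.c>
    RewriteEngine On
    # Skip the redirect when the requested .html file really exists
    RewriteCond %{REQUEST_FILENAME} !-f
    # Externally redirect /anything.html to /anything
    RewriteRule ^(.*)\.html$ /$1 [R=301,L]
    </IfModule>

    # BEGIN WordPress
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>
    # END WordPress

Because the redirect is external (R=301), the browser makes a second request for the extensionless URL, and WordPress sees that new path in REQUEST_URI.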

3 Comments

That did it, Tim! I'll update the original post for future visitors with the same issue. Thank you very, very much.
A 301 redirect is okay, since the new URLs are permanent. Thanks again.
Absolutely, and thank you again. It was one of those things, while designing/developing a site, that sticks out in the back of your mind telling you "I have no idea how I'm going to make that work." You, my friend, are a savior. Thank you a hundred times over.
2

You want to use a URL rewrite:

RewriteEngine On
RewriteRule ^(.*)\.html$ $1

5 Comments

I gave this a shot by adding it to below the code I included above, but I'm worried that there will be an issue with the variables, since the original htaccess file includes: RewriteRule ^index\.php$ - [L]
What do you mean by "issue with the variables" exactly? If you put Zurahn's code above your WordPress rule definitions, it should do what you need without any issue (assuming you don't serve up any actual .html files).
Tim, I originally added Zurahn's code beneath the WordPress code. I then moved it to the top, and it still doesn't work. It's annoying, yes, and I would normally do this with a find/replace in phpMyAdmin, but there are a few posts that link to external websites whose URLs include HTML files. I added what the code looks like right now to the original post.
Ah, no, you're right. It's because WordPress must use REQUEST_URI to determine the request path, which is unaffected by an internal redirection. You can obviously externally redirect, but let me think if there's a way to do it internally.
Much appreciated thus far, Tim. If you think of anything for me to try, post it in a new comment so I can give you props.
1

This should do it. It will rewrite a request for site.com/category/whatever.html to site.com/category/whatever. It shouldn't depend on the requested file existing.

    <Directory /var/www/category>
        RewriteEngine on
        RewriteRule (.*)\.html$ /category/$1
    </Directory>

This is the format for apache2.conf or virtual host files. <Directory> sections are not allowed in .htaccess, so there you would keep only the RewriteEngine and RewriteRule lines. It's best to take care of it in the server conf, if you can, as that is only parsed once, on server startup, whereas .htaccess is read on each request.
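
For a WordPress site, an external redirect is probably needed rather than an internal rewrite, since WordPress works out the page from REQUEST_URI (as noted in the comments on another answer). A minimal sketch of how that might look in a virtual host file, using mod_alias instead of mod_rewrite; the ServerName and DocumentRoot values are assumptions taken from the example URLs and the path above, and the existing WordPress rules are assumed to stay in .htaccess:

    <VirtualHost *:80>
        # Assumed values, adjust to the real host and document root
        ServerName the-site.com
        DocumentRoot /var/www

        # Send /category/whatever.html to /category/whatever with a 301
        RedirectMatch 301 ^/(.+)\.html$ /$1
    </VirtualHost>

Note that this redirects every URL ending in .html, including any real .html files still on disk, so it only fits if none of those need to be served.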

5 Comments

The site is hosted on Dreamhost, so I don't have access to the configuration files. I'm worried that there will be an issue with the variables, since the original htaccess file includes: RewriteRule ^index\.php$ - [L]
Variables? There aren't any variables involved in an htaccess file. The $ means end of line, and $1 is a capture from a regular expression. They only have meaning within that one line.
Okay, I understand that finally. I updated what my .htaccess looks like right now, and it still results in a 404 when I visit the-site.com/category/the-webpage.html
What is it being rewritten to - does it 404 saying the-webpage.html is not found, or another file name?
The URL stays the same (the-site.com/category/the-webpage.html), and WordPress sends me to the 404 page I designed.
