How do I correct my htaccess for proxying search engine crawl requests?

I have built a website with React on the front end and WordPress as the backend. So that search engine crawlers can see my site, I have set up prerendering on the server, and I am trying to configure .htaccess to proxy requests from search engines so that they are served the pre-rendered pages.

For testing, I am using the “Fetch as Google” tool in Google Webmasters.

Here is my attempt:

<IfModule mod_rewrite.c>
RewriteEngine On
<IfModule mod_proxy_http.c>
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{HTTP_USER_AGENT} googlebot [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
# Proxy the request ... works for inner pages only
RewriteRule ^(?!.*?)$ http://example.com:3000/https://example.com/$1 [P,L]
</IfModule>
</IfModule>
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
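
For context, mod_rewrite treats consecutive RewriteCond lines joined by [OR] as one group and ANDs the groups together, so the four conditions in the first block should read as (existing file OR existing directory) AND (googlebot OR _escaped_fragment_). Below is a minimal Python model of that reading. The function and parameter names are hypothetical, and it assumes the home page maps to the document root directory while a WordPress permalink such as /inner-page/ maps to neither a file nor a directory.

```python
def conditions_pass(is_file, is_dir, user_agent, query_string):
    """Model of the four RewriteCond lines:
    (-f OR -d) AND (googlebot [NC] OR _escaped_fragment_)."""
    # First group: the request maps to an existing file or directory.
    exists = is_file or is_dir
    # Second group: case-insensitive user-agent match, or the query marker.
    is_crawler = ("googlebot" in user_agent.lower()
                  or "_escaped_fragment_" in query_string)
    return exists and is_crawler


# Example user-agent string, for illustration only.
UA = "Mozilla/5.0 (compatible; Googlebot/2.1)"

print("home page:", conditions_pass(False, True, UA, ""))
print("inner page:", conditions_pass(False, False, UA, ""))
```

Under those assumptions the conditions hold for the home page and fail for a permalink that does not exist on disk, which may be relevant to why the two rule variants behave differently.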

My problem is that this rule does not work for my home page; it works only for inner pages such as http://example.com/inner-page/:

RewriteRule ^(?!.*?)$ http://example.com:3000/https://example.com/$1 [P,L]

When I change that line to the following, the home page request is proxied correctly, but the inner pages stop working:

RewriteRule ^(index\.php)?(.*) http://example.com:3000/https://example.com/$1 [P,L]
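
One way to see what each pattern captures is to try both in Python's re module. This is only a sketch: Python's engine is not the PCRE library Apache uses, although these simple constructs should behave the same way. It assumes that in a per-directory .htaccess the pattern is matched against the path without its leading slash, so the home page is the empty string. The helper name captures is hypothetical.

```python
import re

# The two RewriteRule patterns from the question.
FIRST = re.compile(r"^(?!.*?)$")
SECOND = re.compile(r"^(index\.php)?(.*)")


def captures(pattern, path):
    """Return the capture groups for path, or None if the pattern does not match.
    Groups that did not participate in the match are shown as empty strings."""
    m = pattern.search(path)
    if m is None:
        return None
    return tuple(g or "" for g in m.groups())


# "" stands for the home page, the others for an inner page and index.php.
for path in ["", "inner-page/", "index.php"]:
    print(repr(path), captures(FIRST, path), captures(SECOND, path))
```

In this sketch the first pattern matches none of the sample paths, because the negative lookahead (?!.*?) can never succeed. The second pattern puts the inner-page path in the second group, not the first, so a substitution that uses only $1 would drop it.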

Could you help me fix the rewrite rule so that both my home page and my inner pages are proxied correctly for Googlebot?