Apache’s mod_rewrite module provides a rule-based engine that rewrites requested URLs on the fly. With it, you can redirect requests based on the language setting sent by the user’s browser. A common use of mod_rewrite is on web sites with multi-language support. Consider, for example, the web site my-site.com, which runs one version for English speakers and a second version for Italian speakers. Here are two example content URLs that use a language-based path.
English URL
http://my-site.com/en/content
Italian URL
http://my-site.com/it/content
To direct visitors to the correct language site based on their browser language, you can inspect the “Accept-Language” field of the HTTP request header in the RewriteCond and RewriteRule directives. An example in a VirtualHost context follows.
APPLICATION STACK
– CentOS 6.5 x86_64
– Apache 2.2.15
STEP 1. Check module
[root@]# httpd -M | grep rewrite
Syntax OK
rewrite_module (shared)
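If the command prints no rewrite_module line, the module is probably not loaded. A minimal sketch of the directive that loads it follows. It assumes the stock CentOS layout, with the main configuration in /etc/httpd/conf/httpd.conf and modules under modules/ relative to the server root, so check the paths on your system.

```apache
# /etc/httpd/conf/httpd.conf (assumed location on a stock CentOS install)
# Load mod_rewrite as a shared module
LoadModule rewrite_module modules/mod_rewrite.so
```

Restart Apache afterwards (for example with `service httpd restart`) and run the check again.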
STEP 2. Mod_rewrite Settings
# /etc/httpd/conf.d/virtual_hosts.conf
<VirtualHost *:80>
    ServerName my-site.com
    DocumentRoot "/var/www/html/my-site"

    RewriteEngine On

    ## Rewriting url for IT browser
    RewriteCond %{HTTP:Accept-Language} ^it [NC]
    RewriteRule ^/$ http://my-site.com/it/foo-page [L,R=301]

    ## Rewriting url for EN browser
    RewriteCond %{HTTP:Accept-Language} ^en [NC]
    RewriteRule ^/$ http://my-site.com/en/foo-page [L,R=301]
</VirtualHost>
You can check the above settings by changing the browser language. Here is an example using Firefox.
That’s a fail: my language is Spanish, but I prefer to read in English.
Agreed. Please don’t do this. It’s annoying. It’s better to bring the user’s attention to the area where they can change the site’s language instead of assuming you know what they want.
The article only shows an example of using mod_rewrite to read the HTTP header field Accept-Language.
For multilingual web sites, the first thing to consider is how to determine the user’s language and locale preferences. A web application has two ways to determine the user’s language preferences:
1. It can use the language and locale preferences that the browser transmits to the server in the HTTP request header field Accept-Language.
2. It can let the user choose directly from a list of supported languages.
A good approach is to combine 1 and 2:
– use the Accept-Language information initially, when nothing is known about the user, then give the user an opportunity to choose a language explicitly with a button on the web site’s pages.
In every case, a multi-language web site must always provide a button (or something similar) to choose from a list of languages.
In conclusion, you can use mod_rewrite as the default way of redirecting users to their local language, for example by configuring a subset of the most common languages (zh, es, fr, de and so on), but in addition you must provide a language drop-down or flags so that users can switch to their language themselves.
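As a rough sketch of that “subset plus default” idea, the rules below redirect the root URL to one of a few supported languages and fall back to English otherwise. The /zh, /es, /fr and /de paths are assumptions (the article only shows /en and /it), and R=302 is used here instead of 301 so that browsers do not cache the redirect permanently. RewriteMap is only allowed in the server or virtual host configuration, not in .htaccess.

```apache
RewriteEngine On

# Built-in map that lowercases its input, so "ES" and "es" give the same path
RewriteMap lc int:tolower

# First listed language is one of the supported subset: redirect to that site
RewriteCond %{HTTP:Accept-Language} ^(zh|es|fr|de)\b [NC]
RewriteRule ^/$ http://my-site.com/${lc:%1}/foo-page [L,R=302]

# Anything else (or no Accept-Language header at all): default language
RewriteRule ^/$ http://my-site.com/en/foo-page [L,R=302]
```

Only the root URL is redirected, so a language chooser on the pages can still link directly to any /xx/ path.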
The issue with the approaches you have described is the same: you know nothing about the user and you are assuming solely based on their language and locale preferences. Many people don’t change those. I would say directing the user’s attention to the place where they can change it is still the best way to go. Present your site in its default language, detect if the user has other language preferences and let them know “Hey, we also have this site in Spanish. Click here to go there”. This is much better than Google’s approach which consists of presenting absolutely everything in Spanish solely because my IP is in Mexico. Think how annoying this is when I use an application that has Google search embedded and I get nothing but Spanish results back. Luckily I can hard code the search engine in Firefox and some other programs by using encrypted.google.com or modifying obscure settings.
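One possible way to support this “detect and suggest” approach on the Apache side is to flag the request instead of redirecting it. The sketch below assumes mod_setenvif is loaded; SUGGEST_LANG is a hypothetical variable name that the page template would have to read.

```apache
# Flag requests whose first listed language is Spanish; no redirect happens.
# SUGGEST_LANG is a hypothetical variable name, not something Apache defines.
SetEnvIfNoCase Accept-Language ^es SUGGEST_LANG=es
```

The application (for example a PHP or CGI page) could then check the variable and show a “This site is also available in Spanish” link while still serving the default language.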
GeoIP detection is not the same as HTTP header Accept-Language detection. When you use mod_rewrite to check the client’s Accept-Language value, you are detecting the client’s browser locale, not its IP address. The Accept-Language approach assumes, for example, that Spanish pages are preferred as a response to a request from a client whose browser locale is set to “es”. Otherwise, the client is redirected to the default language (English in this case).
The HTTP/1.1 RFC (RFC 2616, section 14.4) says:
“The Accept-Language request-header field restricts the set of natural languages that are preferred as a response to the request.”
The same RFC also says:
“It is recommended that client applications make the choice of linguistic preference available to the user. If the choice is not made available, then the Accept-Language header field MUST NOT be given in the request”
This raises complex privacy issues, but they can’t be discussed here.
As I said in the previous comment, a multi-language web site must always give users a way to choose their preferred language manually (as you rightly say), but an initial server-side check against a subset of preferred languages can improve the site’s usability and save users unnecessary clicks.
I realize HTTP Header Accept is not the same as GeoIP. The point I’ve been trying to get across is to not assume the user wants a specific language and instead just let them know they can change it.
OK
About letting the user change the language, we think the same ;-)
We do 🙂
I agree with Giuseppe: it is better to combine these techniques. Until visitors choose a language themselves, it is better to give them the content in the browser language if available. I don’t understand why it would be better to give them the default language rather than the browser language. The default language can be arbitrary, while using the browser language at least increases the chance of serving the correct language directly, so the user doesn’t need to change it.
However, when using Accept-Language, can the user actually change the language later? Won’t using Accept-Language as you describe force the visitor to always see the content in their browser language?
http://stackoverflow.com/questions/22008039/htaccess-acceptlanguage-redirect
I agree, Mats!
What I was trying to describe is a sort of preliminary check of the browser language. If the server can retrieve the client locale, it acts on it; otherwise it gives the user an opportunity to choose a language explicitly with a button.
In the majority of cases, if a browser has an “es” locale, I think there is a Spanish user who prefers to see Spanish pages, isn’t there?
About your questions, what you say is true. This post is only an example. Apache allows you to play with a lot of regular expressions…
One tip for letting the user change the language later is to use a negated regex like this:
RewriteEngine On
# Add /en to url if locale is NOT spanish
RewriteCond %{HTTP:Accept-Language} !^es [NC]
RewriteRule ^/$ http://mysite.com/en [L,R=301]
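A complementary sketch, under the assumption that the site’s language switcher sets a cookie when the visitor picks a language: the extra condition skips the automatic redirect once that cookie exists. The cookie name `lang` is hypothetical, and R=302 replaces 301 so that browsers do not cache the redirect permanently.

```apache
RewriteEngine On
# Skip the automatic redirect once the visitor has picked a language.
# "lang" is a hypothetical cookie that the site's language switcher would set.
RewriteCond %{HTTP_COOKIE} !(^|;\s*)lang=
# Add /en to url if locale is NOT spanish
RewriteCond %{HTTP:Accept-Language} !^es [NC]
# 302 instead of 301, so the redirect is not cached permanently
RewriteRule ^/$ http://mysite.com/en [L,R=302]
```

What the site does with the cookie value afterwards (for example, which language it serves at the root URL) is left to the application.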
As a follow-up, I also posted a question on Stack Overflow about the use of Accept-Language. I think it is interesting to use Accept-Language to help visitors find their language, but I am having trouble finding out how many visitors actually have something set in their Accept-Language header:
http://stackoverflow.com/questions/25519538/what-are-the-most-used-accept-language-in-http-header
Nice question Mats.
As you rightly say, there are no reliable statistics about the most used languages. The W3C doesn’t release stats about Accept-Language usage. I’ve found two interesting links, but they may not be reliable:
http://www.factshunt.com/2014/01/world-wide-internet-usage-facts-and.html
http://shipmodeling.net/vb_forum/statistic-tld.html
In my personal experience, Google Analytics is a good starting point for tracking some useful stats. It provides a lot of information about visitors’ browser locale (Geographical Data > Language).
An alternative solution is to configure the web server to log the client’s Accept-Language header and store it in a text file or a DB.
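A minimal sketch of the logging option, using mod_log_config. The format nickname `langlog` and the log file name are assumptions chosen for illustration.

```apache
# Custom log format: client IP, time, request line, status, Accept-Language header
LogFormat "%h %t \"%r\" %>s \"%{Accept-Language}i\"" langlog
# Write one line per request to a separate file (path relative to ServerRoot)
CustomLog logs/accept_language_log langlog
```

The resulting file can then be summarized with ordinary text tools or imported into a database.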
Thanks for sharing!
Giuseppe
Cool, thanks for the great links and tips!
Hey Giuseppe,
thanks for the post! I just wonder whether it will always redirect you to the homepage or to the translation of your landing page. And assume I have two languages (GER/EN): I want all German browsers to be redirected to the standard language (GER) and all other browser languages to the international version (EN). How can I handle this?
Hi Felix,
thanks for your comment.
If your website has different URLs for different languages, what you ask can be done with a small trick. Suppose:
-german pages www.yoursite.com/de/myhome_page
-english pages www.yoursite.com/en/myhome_page
Update mod_rewrite as follows:
# Add /en in the URL if locale is NOT german
RewriteCond %{HTTP:Accept-Language} !^de [NC]
RewriteRule ^/$ http://www.yoursite.com/en [L,R=301]
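On the landing-page part of the question: the rule above only matches the root URL. A hedged sketch that keeps the requested path follows. It assumes the /de and /en trees mirror each other, so that every path exists under both prefixes.

```apache
RewriteEngine On

# German browsers: prefix any path that has no language prefix yet with /de
RewriteCond %{REQUEST_URI} !^/(de|en)(/|$)
RewriteCond %{HTTP:Accept-Language} ^de [NC]
RewriteRule ^/(.*)$ http://www.yoursite.com/de/$1 [L,R=302]

# Every other browser language: prefix with /en
RewriteCond %{REQUEST_URI} !^/(de|en)(/|$)
RewriteRule ^/(.*)$ http://www.yoursite.com/en/$1 [L,R=302]
```

Note that this also redirects static files that live outside /de and /en, so such paths would need their own exclusion condition.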
I hope this will help you.
Giuseppe
Pay attention, this should be the correct format:
With the original config, if I have both Italian and English, I am going to be redirected to /it every time, because this is the first condition that Apache checks. “RewriteCond %{HTTP:Accept-Language} ^it [NC]” only checks that Italian exists in the browser languages, but it doesn’t check whether it is the first language in preference order.
Thanks for the tip, Adol.
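A short sketch of how the pattern affects this. The ^ anchor is what ties the match to the first listed language; without it, the pattern matches “it” anywhere in the header, which gives the behaviour described above. Browsers usually list languages in descending preference, but that ordering is an assumption here, and the pattern does not evaluate q-values.

```apache
# Header "it-IT,it;q=0.8,en;q=0.6" -> matches (Italian is listed first)
# Header "en-US,en;q=0.8,it;q=0.6" -> does not match (Italian is listed, but not first)
RewriteCond %{HTTP:Accept-Language} ^it\b [NC]
RewriteRule ^/$ http://my-site.com/it/foo-page [L,R=301]
```

The \b word boundary is an optional tightening so that only the tag “it” (alone or followed by a region such as “-IT”) matches.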
Good to know.
😉
Dear Giuseppe,
I need more explanation… If you put the two language sites in different folders, then I need to create a new index.html at the root of the site, containing your code, that redirects to the index.html files in the language folders…
http://my-site.com/index.html (with the redirect code) and http://my-site.com/en/index.html AND http://my-site.com/it/index.html
Am I right? or am I missing something?
Claude
ok stupid question… I guess that I can use the .htaccess file…
Hi Claude,
if you don’t have access to the main Apache server configuration, I think you can use an .htaccess file.
According to the Apache documentation:
“.htaccess files provide a way to make configuration changes on a per-directory basis… In general, you should only use .htaccess files when you don’t have access to the main server configuration file”.
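For illustration, a sketch of the same rules in an .htaccess file placed in the document root, so no extra index.html is needed. In per-directory context Apache strips the leading slash before matching, so the pattern becomes ^$ instead of ^/$. It assumes the main configuration allows overrides for that directory (AllowOverride FileInfo or All).

```apache
# /var/www/html/my-site/.htaccess
RewriteEngine On

## Rewriting url for IT browser
RewriteCond %{HTTP:Accept-Language} ^it [NC]
RewriteRule ^$ http://my-site.com/it/foo-page [L,R=301]

## Rewriting url for EN browser
RewriteCond %{HTTP:Accept-Language} ^en [NC]
RewriteRule ^$ http://my-site.com/en/foo-page [L,R=301]
```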
Giuseppe