In this post, we will cover some fundamentals of the .htaccess file. First, we’ll look at what the .htaccess file is in general, and then we’ll walk through some practical .htaccess examples.
What is the .htaccess?
The .htaccess file is a configuration file used on Apache web servers. It controls how the web server responds to different requests. The .htaccess file allows you to take directives that would typically be put in Apache’s main configuration files and put them in a directory-specific configuration file instead. The Apache server detects the .htaccess file in a directory and applies its directives to that directory and its subdirectories. Because it is a server configuration file, you can use it to manipulate server features directly. It is also less resource intensive than a plugin.
In a nutshell, the .htaccess configuration file allows you to change the functionality and features of the Apache web server software. You can enable, disable, and modify different features at run time, without restarting the server.
What does .htaccess mean?
.htaccess is short for hypertext access. The original purpose of the .htaccess file was to control user access to files in specific directories. Note that ‘.htaccess’ is not an extension; it is the complete file name. Therefore, a file named with a .htaccess extension, such as sample.htaccess, will not be picked up by the server.
Where can you find it?
The .htaccess file can be found primarily in your website’s root folder, for example: /var/www/html/. Essentially, every directory on the webserver can have a ‘.htaccess’ file. Each directory can have only one .htaccess file. Each .htaccess file will set different server behaviors.
Why can’t you see it?
The name of the .htaccess file starts with a dot, and Linux hides all dotfiles by default. A quick solution is to open your hosting file manager and turn on the “Show hidden files” option. Alternatively, you can use the ‘ls -a’ Linux command. Also keep in mind that the .htaccess file is an Apache file and will not work on web servers such as Nginx.
What can you do with it?
You can use the .htaccess file in many ways. For example, you can:
- Block specific IP addresses and, at the same time, only allow specific IP addresses to access your website. This feature is beneficial for allowing only specific IP addresses to access your website’s secure pages such as the admin panel. This way, an unauthorized person will get an error if they try to access the page.
- Create custom error pages. Naturally, the webserver displays pre-defined error pages for the errors. You can customize and create custom pages for specific errors.
- Enable basic HTTP authentication on your entire site or specific directories.
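As a rough sketch of the last item, basic HTTP authentication for a directory might look like the following. The realm name and the /path/to/.htpasswd location are placeholders, not values from this post. The password file is assumed to have been created beforehand with Apache’s htpasswd utility and to live outside the web root.

```apache
# Hypothetical example: protect the directory that contains this .htaccess file
AuthType Basic
AuthName "Restricted Area"
# Placeholder path; point this at your own password file
AuthUserFile /path/to/.htpasswd
# Any user listed in the password file may log in
Require valid-user
```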
Right! Now that we know the theory, let’s do a deep dive into some practical uses.
How to add a custom header and value?
We can use Apache’s Header directive, provided by the mod_headers module, to add our custom header.
The syntax is as follows:
Header add Sample-Header "My Value"
You can add the above example to your website’s root .htaccess file. Just replace “Sample-Header” with any custom header name, and replace “My Value” with the value you want to send.
The above line does two things. First, it instructs the Apache server to add a custom header named “Sample-Header” to its responses. Second, it sets the value of that header to “My Value”.
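If you want to be sure only one copy of the header is sent, Header set is an alternative to Header add: add appends another header even if one with the same name already exists, while set replaces it. The sketch below assumes you want the directive skipped quietly when mod_headers is not loaded, so it wraps it in an IfModule block. Sample-Header and its value are the same placeholders as above.

```apache
<IfModule mod_headers.c>
    # Replace any existing Sample-Header instead of appending a second one
    Header set Sample-Header "My Value"
</IfModule>
```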
Example .htaccess blocking users based on their IP addresses
You can restrict specific users with specific IP addresses from accessing your website. For example, you can restrict everyone except yourself from opening the dashboard of your site. So even if a hacker knows your admin panel’s password, they will not be able to open the admin panel page. Instead, they will see an error page. Note that your IP address may change over time unless you have been assigned a static IP.
Open up your site’s root .htaccess file and input the following example commands to it:
To deny a specific IP address:
Deny from 121.212.121.212
Here, replace 121.212.121.212 with the IP you want to block. Visitors who access your page from that IP will get a 403 Forbidden error.
To deny multiple IP addresses:
Open up your site’s root .htaccess file and input the following example commands to it:
Deny from 1.2.2.1 2.3.3.2 3.4.4.3 4.5.5.4
This directive will block the above-stated IP addresses from viewing your website.
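The Deny and Allow directives come from Apache 2.2; on Apache 2.4 they only work through the mod_access_compat module. A rough 2.4-style equivalent, assuming mod_authz_core is loaded and using two of the placeholder IP addresses from above, could look like this:

```apache
<RequireAll>
    # Let everyone in...
    Require all granted
    # ...except requests from these addresses
    Require not ip 1.2.2.1
    Require not ip 2.3.3.2
</RequireAll>
```

It is best not to mix the old and new styles in the same file, since the combined result can be hard to predict.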
Allowing Users based on their IP addresses
This works similarly to blocking IP addresses. The only difference is that you allow specific IP addresses to access your site or your web pages.
Open up your site’s root .htaccess file and input the following example directives to it:
To allow specific IP addresses:
Allow from 121.232.121.232
To allow multiple IP addresses:
Allow from 1.2.3.4 2.1.3.4 3.1.2.4 4.1.2.3
You can add as many IP addresses as you want. Note that an Allow line only blocks everyone else when it is combined with a rule that denies all other visitors, such as “Order Deny,Allow” followed by “Deny from all”.
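A minimal sketch of a complete allow-list, for example in a .htaccess file placed inside an admin directory, is shown below. The IP address is a placeholder, and the two IfModule blocks are an assumption about your setup: the first is picked up by Apache 2.4, the second by older 2.2 servers.

```apache
# Apache 2.4 and later
<IfModule mod_authz_core.c>
    Require ip 121.232.121.232
</IfModule>

# Apache 2.2
<IfModule !mod_authz_core.c>
    Order Deny,Allow
    Deny from all
    Allow from 121.232.121.232
</IfModule>
```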
How to block users by domain?
You also have the power to block certain domains. Any requests from the specified domain will receive a 403 forbidden error message. Let’s look at how you can block URLs from certain domains.
Blocking domain:
Open up your site’s root .htaccess file and input the following example commands to it:
SetEnvIfNoCase Referer "sample-domain.com" bad_referer
Order Allow,Deny
Allow from ALL
Deny from env=bad_referer
Edit the above code by replacing “sample-domain.com” with the target domain you want to block. Now every request whose Referer header contains the target domain, such as a click on a link hosted there, will be blocked.
How to block by referrers?
Websites (or referrers) can link directly to your images and other resources without any benefit to you. So let us see how we can block these referrers.
Blocking a single referrer
Open up your site’s root .htaccess file and input the following example commands to it:
RewriteEngine on
# Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} sample-domain\.com
RewriteRule .* - [F]
The above code tells the Apache server to block requests whose Referer header contains “sample-domain.com.” You can replace “sample-domain.com” with the desired domain.
Blocking multiple referrers:
Open up your site’s root .htaccess file and input the following commands to it:
RewriteEngine on
# Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} sample-domain\.com [NC,OR]
RewriteCond %{HTTP_REFERER} another-sample-domain\.com [NC,OR]
RewriteCond %{HTTP_REFERER} another-domain\.com [NC]
RewriteRule .* - [F]
This code will block traffic from all the above-stated URLs.
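When the list of referrers grows, a single condition with alternation can be easier to maintain than a chain of conditions. This is only a sketch using the same placeholder domains; it assumes they all end in .com.

```apache
RewriteEngine on
# One condition with alternation instead of several chained conditions
RewriteCond %{HTTP_REFERER} (sample-domain|another-sample-domain|another-domain)\.com [NC]
RewriteRule .* - [F]
```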
Blocking bots
Bots can be good or bad! Let us see how you block bad bots that scour your site to download your content.
Open up your site’s root .htaccess file and input the following example directives to it:
ErrorDocument 403 /403.html
RewriteEngine On
RewriteBase /
# IF THE UA STARTS WITH THESE
RewriteCond %{HTTP_USER_AGENT} ^(aesop_com_spiderman|alexibot|backweb|bandit|batchftp|bigfoot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(black.?hole|blackwidow|blowfish|botalot|buddy|builtbottough|bullseye) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(cheesebot|cherrypicker|chinaclaw|collector|copier|copyrightcheck) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(cosmos|crescent|curl|custo|da|diibot|disco|dittospyder|dragonfly) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(drip|easydl|ebingbong|ecatch|eirgrabber|emailcollector|emailsiphon) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(emailwolf|erocrawler|exabot|eyenetie|filehound|flashget|flunky) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(frontpage|getright|getweb|go.?zilla|go-ahead-got-it|gotit|grabnet) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(grafula|harvest|hloader|hmview|httplib|httrack|humanlinks|ilsebot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(infonavirobot|infotekies|intelliseek|interget|iria|jennybot|jetcar) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(joc|justview|jyxobot|kenjin|keyword|larbin|leechftp|lexibot|lftp|libweb) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(likse|linkscan|linkwalker|lnspiderguy|lwp|magnet|mag-net|markwatch) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(mata.?hari|memo|microsoft.?url|midown.?tool|miixpc|mirror|missigua) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(mister.?pix|moget|mozilla.?newt|nameprotect|navroad|backdoorbot|nearsite) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(net.?vampire|netants|netcraft|netmechanic|netspider|nextgensearchbot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(attach|nicerspro|nimblecrawler|npbot|octopus|offline.?explorer) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(offline.?navigator|openfind|outfoxbot|pagegrabber|papa|pavuk) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(pcbrowser|php.?version.?tracker|pockey|propowerbot|prowebwalker) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(psbot|pump|queryn|recorder|realdownload|reaper|reget|true_robot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(repomonkey|rma|internetseer|sitesnagger|siphon|slysearch|smartdownload) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(snake|snapbot|snoopy|sogou|spacebison|spankbot|spanner|sqworm|superbot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(superhttp|surfbot|asterias|suzuran|szukacz|takeout|teleport) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(telesoft|the.?intraformant|thenomad|tighttwatbot|titan|urldispatcher) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(turingos|turnitinbot|urly.?warning|vacuum|vci|voideye|whacker) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(libwww-perl|widow|wisenutbot|wwwoffle|xaldon|xenu|zeus|zyborg|anonymouse) [NC,OR]
# STARTS WITH WEB
RewriteCond %{HTTP_USER_AGENT} ^web(zip|emaile|enhancer|fetch|go.?is|auto|bandit|clip|copier|master|reaper|sauger|site.?quester|whack) [NC,OR]
# ANYWHERE IN UA -- GREEDY REGEX
RewriteCond %{HTTP_USER_AGENT} ^.*(craftbot|download|extract|stripper|sucker|ninja|clshttp|webspider|leacher|collector|grabber|webpictures).*$ [NC]
# ISSUE 403 / SERVE ERRORDOCUMENT
RewriteRule . - [F,L]
Reference: https://www.askapache.com/htaccess/blocking-bad-bots-and-scrapers-with-htaccess/.
Setting default pages
The server looks for a specifically named file called the index file as the home page. You can also change the home page or default page by tweaking the name of the index file in the .htaccess file.
Open up your site’s root .htaccess file and input the following directives to it:
DirectoryIndex your_new_index_file.php
Replace your_new_index_file.php with the name of the file you want to set as your default page.
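DirectoryIndex also accepts several file names. Apache tries them from left to right and serves the first one that exists. The fallback file names here are only illustrative.

```apache
# Try the custom page first, then fall back to the usual defaults
DirectoryIndex your_new_index_file.php index.php index.html
```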
Setting the default directory
On many shared hosting setups, the root directory of your website is public_html. This folder is your document root directory. You can serve your site from a subfolder instead with changes to the .htaccess file.
Open up your site’s root .htaccess file and input the following directives to it:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^www\.domain\.com$ [NC]
RewriteCond %{REQUEST_URI} !^/new_folder/
RewriteRule (.*) /new_folder/$1 [L]
Replace domain.com and www.domain.com with your website’s domain name. Finally, replace the new_folder with the name of the new folder to set as your default directory.
Blocking referrers (hotlink protection)
This .htaccess example lets you block unwanted referrers. Referral spam keeps invading your analytics and makes the results inaccurate, and other sites can hotlink your files. You can filter referrers in your analytics tool, or you can block them through the .htaccess file.
Open up your site’s root .htaccess file and input the following directives to it:
RewriteEngine on
RewriteCond %{HTTP_REFERER} site1\.com [NC,OR]
RewriteCond %{HTTP_REFERER} site2\.com [NC,OR]
RewriteCond %{HTTP_REFERER} site3\.com [NC]
RewriteRule .* - [F]
Replace site1.com, site2.com, and site3.com with the domains you want to block. You can add as many domains as you want.
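Hotlink protection in the stricter sense works the other way around: instead of listing bad referrers, you allow image requests only from your own pages. A minimal sketch, assuming your site is example.com (a placeholder) and that you want to protect common image formats:

```apache
RewriteEngine on
# Allow requests that send no Referer header (direct visits, some privacy tools)
RewriteCond %{HTTP_REFERER} !^$
# Allow requests coming from your own pages; example.com is a placeholder
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Everything else asking for an image gets 403 Forbidden
RewriteRule \.(jpe?g|png|gif)$ - [F,NC]
```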
Adding MIME types
MIME types tell the Apache server, and in turn the browser, how to treat a specific type of file. So, for example, you can tell the server to treat .mp3 files as audio files.
Open up your site’s root .htaccess file and input the following directives to it:
AddType audio/mpeg .mp3
AddType video/mp4 .mp4
AddType application/x-chrome-extension .crx
There are various MIME types you can add. The ones mentioned above are just a few examples.
Specify error documents
To create your custom error documents and link them to the error codes, you need to be familiar with returned error codes. The basic codes are 400, 401, 403, 404, and 500.
Open up your site’s root .htaccess file and input the following directives to it:
ErrorDocument 400 /errors/badrequestpage.html
ErrorDocument 401 /errors/authreqpage.html
ErrorDocument 403 /errors/forbidpage.html
ErrorDocument 404 /errors/notfoundpage.html
ErrorDocument 500 /errors/serverpage.html
Here the error pages are stored in the errors directory of your site. The paths are local, starting with a slash, rather than full URLs: a full URL makes Apache redirect the visitor, so the original error code is lost, and the 401 document must always be local. You can name the error documents anything and link them, as shown above.
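ErrorDocument can also take a short text message instead of a file. A small sketch with a made-up message:

```apache
# Send a plain text message with the 403 response instead of a page
ErrorDocument 403 "Sorry, you are not allowed to view this page."
```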
Leveraging browser caching
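Leveraging browser caching is a technique where the browser stores a website’s static files, such as images, stylesheets, and scripts, in the user’s local storage. This speeds up the page load time on repeat visits because the files are read locally instead of being downloaded again. It works best for static content; pages that change on every request are poor candidates for caching. The directives below require the mod_expires module.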
Open up your site’s root .htaccess file and input the following directives to it:
<IfModule mod_expires.c>
ExpiresActive On
ExpiresByType image/jpg "access plus 1 year"
ExpiresByType image/jpeg "access plus 1 year"
ExpiresByType image/gif "access plus 1 year"
ExpiresByType image/png "access plus 1 year"
ExpiresByType text/css "access plus 1 month"
ExpiresByType application/pdf "access plus 1 month"
ExpiresByType text/x-javascript "access plus 1 month"
ExpiresByType application/x-shockwave-flash "access plus 1 month"
ExpiresByType image/x-icon "access plus 1 year"
ExpiresDefault "access plus 7 days"
</IfModule>
You can adjust the time duration according to your website.
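Some setups prefer to send a Cache-Control header directly instead of, or in addition to, Expires. The following is a sketch under the assumption that mod_headers is loaded; the file extensions and the 30-day lifetime are example choices, not recommendations from this post.

```apache
<IfModule mod_headers.c>
    # Match common static assets by file extension
    <FilesMatch "\.(css|js|png|jpe?g|gif|ico)$">
        # 2592000 seconds = 30 days
        Header set Cache-Control "max-age=2592000, public"
    </FilesMatch>
</IfModule>
```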
Wrapping up .htaccess examples
The .htaccess file gives you great control over your Apache website’s behavior. There are so many things that you can do, and it is quite flexible, allowing you to manage everything on a per-folder basis. Let us know what other .htaccess examples to add.