Module 4: SEO
Robots File

What is robots.txt?

A robots.txt file tells search engine crawlers which URLs the crawler can access on your site.

Robots.txt vs Sitemap.xml

A sitemap tells search engines about all the pages on your website that they should crawl. A robots.txt file, on the other hand, tells search engines which pages they may crawl and which pages to skip.
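
To see the difference in practice, here is a minimal sketch of a dynamic sitemap in Next.js. It assumes the same NEXT_PUBLIC_APP_URL environment variable used later in this module, and the /events route is just a placeholder for pages your site actually has:

// app/sitemap.ts

import { MetadataRoute } from "next";

const home = process.env.NEXT_PUBLIC_APP_URL;

export default function sitemap(): MetadataRoute.Sitemap {
  // List the pages you want search engines to discover and crawl.
  return [
    {
      url: `${home}/`,
      lastModified: new Date(),
      changeFrequency: "weekly",
      priority: 1,
    },
    {
      url: `${home}/events`,
      lastModified: new Date(),
      changeFrequency: "daily",
      priority: 0.8,
    },
  ];
}

The sitemap lists what should be crawled; the robots file below controls what may be crawled.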

How to add a Robots File

With Next.js, you can add or generate a robots.txt file that matches the Robots Exclusion Standard in the root of the app directory to tell search engine crawlers which URLs they can access on your site.

Static Robots File

Add your robots.txt file directly to the app directory, listing the paths you want search engines to crawl and the ones to omit.

Example:

# app/robots.txt

User-Agent: *
Allow: /
Disallow: /private/

Sitemap: https://acme.com/sitemap.xml

Dynamic Robots File

Add a robots.js or robots.ts file to the app directory that returns a Robots object.

// app/robots.ts

import { MetadataRoute } from "next";

// Base URL of the site, e.g. https://acme.com
const home = process.env.NEXT_PUBLIC_APP_URL;

export default function robots(): MetadataRoute.Robots {
  // Paths that search engine crawlers should not access.
  // These could also be built dynamically, e.g. from a list of unpublished events.
  const disallowedPaths = ["/api/events", "/api/categories"];

  return {
    sitemap: `${home}/sitemap.xml`,
    rules: {
      userAgent: "*",
      allow: "/",
      disallow: disallowedPaths,
    },
  };
}

This code defines a function that returns the robots configuration for a Next.js application. It specifies which paths should be disallowed for web crawlers (e.g., search engine bots), in this case /api/events and /api/categories. It also points crawlers to the sitemap at the root of the application domain.
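
When the app is built, Next.js serves this configuration as a plain robots.txt file. Assuming NEXT_PUBLIC_APP_URL is set to https://acme.com, the generated output would look roughly like this:

User-Agent: *
Allow: /
Disallow: /api/events
Disallow: /api/categories

Sitemap: https://acme.com/sitemap.xml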

Exercise

You can use the same file above to create your robots.js file!
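
If you want to go a step further, a common pattern is to block crawlers entirely on preview or staging deployments. Below is a minimal sketch under the assumption that a VERCEL_ENV (or similar) environment variable identifies the deployment environment; adapt the check to whatever your hosting provider exposes:

// app/robots.ts

import { MetadataRoute } from "next";

const home = process.env.NEXT_PUBLIC_APP_URL;

export default function robots(): MetadataRoute.Robots {
  // Assumption: VERCEL_ENV is "production" only on the production deployment.
  const isProduction = process.env.VERCEL_ENV === "production";

  return {
    sitemap: `${home}/sitemap.xml`,
    rules: isProduction
      ? { userAgent: "*", allow: "/", disallow: ["/api/events", "/api/categories"] }
      : { userAgent: "*", disallow: "/" }, // Block everything on preview/staging.
  };
}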



Resources: