DSpace / DS-1138

robots.txt


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0
    • Component/s: None
    • Labels: None
    • Attachments: 0
    • Comments: 3
    • Documentation Status: Needed

      Description

      By default, the robots.txt shipped with XMLUI allows indexing of all content, so all browse, search, and discovery pages get indexed. Search engine results then mostly point to these lists of results rather than to the items themselves. I suggest disallowing the following pages by default:

      User-agent: *
      Disallow: /discover
      Disallow: /search-filter

      Note that the current robots.txt contains this comment:

      # Uncomment the following line ONLY if sitemaps.org or HTML sitemaps are used
      # and you have verified that your site is being indexed correctly.
      # Disallow: /browse

      Since all items should be reachable through the browse pages of the community/collection structure, /browse should remain allowed by default so that spiders can crawl the whole repository. The /discover and /search-filter pages, however, are redundant for crawling and only clutter search results.
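
      To sanity-check the proposed rules, here is a minimal sketch using Python's standard urllib.robotparser. It parses exactly the directives proposed above; the sample item path is a hypothetical example, not taken from this issue.

      import urllib.robotparser

      # The directives proposed above, checked in-memory (a sketch, not DSpace code).
      rules = [
          "User-agent: *",
          "Disallow: /discover",
          "Disallow: /search-filter",
      ]

      parser = urllib.robotparser.RobotFileParser()
      parser.parse(rules)

      # /browse and item pages remain crawlable; discovery pages do not.
      # "/handle/123456789/1" is a hypothetical example item URL.
      for path in ("/browse", "/discover", "/search-filter", "/handle/123456789/1"):
          verdict = "allowed" if parser.can_fetch("*", path) else "disallowed"
          print(path, "->", verdict)

      A crawler honoring these rules can still reach every item through /browse and the community/collection structure while skipping the faceted /discover and /search-filter result lists.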


            People

            Assignee:
            tdonohue Tim Donohue
            Reporter:
            helix84 Ivan Masár
            Votes:
            0
            Watchers:
            2
