Search and metadata

For search to be useful, web pages must have good metadata.

Metadata is typically defined as "data about data". Metadata provides information about a web document. It may include details on the subject matter of the document, the name of the person or group who created it, the date of creation, and keywords that can help a user locate that particular document amongst a number of similar documents.

Metadata and search engines

The <meta> element allows page authors to include metadata about a web page. It is usually found within the <head> tags in a web document. All metadata tags use the same basic format. Here are two common examples:

<meta name="keywords" content="web, style, guide" />
<meta name="description" content="Guidelines on the appropriate use of Monash web templates, including branding, visual design, usability, accessibility, and technical issues." />

Other elements of a web page also provide data for search engines. These include the text within the <title> tags, text on the page, text alternatives for images, and links to and from other websites.
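To make these data sources concrete, here is a minimal sketch of how an indexer might pull the <title> text and the named <meta> elements out of a page. It uses Python's standard html.parser module. The names MetadataParser and extract_metadata are hypothetical, chosen for this illustration, and the sketch is not a description of how any particular search engine works.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collect <meta name="..." content="..."> pairs and the <title> text."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("name")
        if tag == "meta" and name:
            # Attribute values can be None when no value is given.
            self.meta[name.lower()] = attributes.get("content") or ""
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html):
    """Return the page title and a dict of named meta elements."""
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "meta": parser.meta}
```

Calling extract_metadata on a page that contains the keywords example above would return the keyword string under result["meta"]["keywords"].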

Most search engines use all of these data sources to index web pages, but they do so in different ways and give some forms of data more weight than others. For example, some ignore keyword metadata elements because it is easy for page authors to deliberately include incorrect or misleading metadata in an attempt to attract visitors to a website.

Monash-specific metadata

There are some Monash-specific uses for metadata that you should consider. Pages that have certain metadata attributes can be found by using the search engine or filtered from search results.

Two common uses for Monash-specific metadata are:

<meta name="monash.approval" content="Marketing and Public Affairs" />
<meta name="monash.access" content="public" />

The first example identifies the approver of the page content. The second marks the page as being for public access.

Filtering intranet pages from Monash search engine results

Pages for internal use only should use the monash.access metadata element. Mark the content as intranet, as shown in the example below. Searches from the Monash home page and other publicly accessible pages have been set up to exclude pages with this metadata attribute from search results.

<meta name="monash.access" content="intranet" />

Mailing lists that are archived on the web should use metadata to prevent the search engine from including mailing list messages in search results.

<meta name="monash.access" content="emailarchive" />
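The filtering described above can be sketched as a simple check on the monash.access value. This is an illustration only: the actual configuration of the Monash search engine is not described here, the function names are hypothetical, and treating a page with no monash.access element as publicly searchable is an assumption.

```python
# Access values that this section says are excluded from public search results.
EXCLUDED_ACCESS_VALUES = {"intranet", "emailarchive"}


def is_publicly_searchable(meta):
    """Return True unless the page is marked as intranet or email archive.

    `meta` is a dict mapping meta element names to their content values.
    Assumption: a page without monash.access is treated as public.
    """
    access = meta.get("monash.access", "").strip().lower()
    return access not in EXCLUDED_ACCESS_VALUES


def filter_public_results(pages):
    """Keep only the pages that may appear in public search results."""
    return [page for page in pages if is_publicly_searchable(page["meta"])]
```

Each page is assumed to be a dict with a "meta" key, for example {"url": "/staff/", "meta": {"monash.access": "intranet"}}.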

Guidelines on use of specific metadata elements

Title

The Monash search engine gives ten times more weight to terms used in the page title than to terms in the body of a document. All search engines treat the title tag as important.

See page titles for a discussion of the importance of page titles and guidelines on how they should be written.

Description

If a description field is provided, most search engines display it beneath the page title in the list of search results. Make sure your description is a concise and accurate summary of the contents of the page; a sentence or two is all that is needed. Search engines limit the number of characters shown in a description, so keep descriptions to a maximum of 200 characters to avoid having them cut off.

<meta name="description" content="Information and resources for current students at Monash University." />
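The 200-character guideline is easy to check automatically. The following is a minimal sketch of such a check; the function name check_description and the wording of the messages are hypothetical, and the limit is the one recommended in this guide rather than a limit enforced by any particular search engine.

```python
MAX_DESCRIPTION_LENGTH = 200  # maximum recommended in this guide


def check_description(description):
    """Return a list of problems found in a description (empty if none)."""
    problems = []
    text = description.strip()
    if not text:
        problems.append("description is empty")
    if len(text) > MAX_DESCRIPTION_LENGTH:
        problems.append(
            f"description is {len(text)} characters; "
            f"keep it to {MAX_DESCRIPTION_LENGTH} or fewer"
        )
    return problems
```

The example description above is well under the limit, so check_description returns an empty list for it.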

The Monash search engine gives ten times more weight to terms used in the description field than to terms in the body of a document. Most other search engines will not give as much importance to this metadata.

Keywords

The keywords meta tag allows you to supplement the title and contents of a web page with a list of index terms for a document. Some search engines may rank keywords higher than body text. The Monash search engine has been configured to add ten times more weight to keyword metadata than to body text.
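To show what "ten times more weight" can mean in practice, here is a simplified sketch that scores a page for a single search term by counting occurrences in each field and multiplying by a field weight. The weights come from this guide; the scoring formula itself, the field names, and the function names are assumptions made for illustration, and the real ranking used by the Monash search engine is not documented here.

```python
import re

# Title, description and keywords each count ten times as much as body text.
FIELD_WEIGHTS = {"title": 10, "description": 10, "keywords": 10, "body": 1}


def count_term(term, text):
    """Count whole-word, case-insensitive occurrences of a one-word term."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return words.count(term.lower())


def score_page(term, fields):
    """Sum the weighted term counts across the fields of one page."""
    return sum(
        weight * count_term(term, fields.get(name, ""))
        for name, weight in FIELD_WEIGHTS.items()
    )
```

Under this scheme, a page with "exams" once in the title and twice in the body scores 10 + 2 = 12, so a single use of a term in the title outweighs several uses in the body text.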

Keywords should be included on top-level and other important pages. However, adding the same keywords to every page is counter-productive. It is better to create unique keywords for each individual page.

When using keywords, make sure the terms you select accurately describe what users can find on the page.

<meta name="keywords" content="current students, portal, exams, fees, scholarships, enrolment, study resources, research resources, graduation, university life, student employment" />
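The advice against reusing the same keywords on every page can also be checked with a short script. This sketch groups pages whose keyword lists are identical once case, spacing, and order are ignored. The function names and the input format (a dict mapping page URLs to keyword strings) are hypothetical.

```python
from collections import defaultdict


def normalise_keywords(content):
    """Turn a comma-separated keyword string into a set of lowercase terms."""
    return frozenset(k.strip().lower() for k in content.split(",") if k.strip())


def find_shared_keyword_sets(pages):
    """Return groups of URLs that share exactly the same set of keywords.

    `pages` maps each URL to the content of its keywords meta element.
    Pages with no keywords are ignored.
    """
    groups = defaultdict(list)
    for url, content in pages.items():
        keywords = normalise_keywords(content)
        if keywords:
            groups[keywords].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

Any group returned by find_shared_keyword_sets is a candidate for review, since those pages are not distinguished from one another by their keywords.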

Metadata and the content management system

When websites are migrated into the content management system (CMS), there will be two types of pages: pages that are generated from CMS data capture templates, and regular web pages.

CMS-generated pages will eventually be able to take advantage of metadata generated by a CMS product called MetaTagger. This will reduce the burden on page authors, who will then only need to review and edit the automatically generated metadata.
