Detecting incorrect links within content

IP.com Disclosure Number: IPCOM000243519D
Publication Date: 2015-Sep-29
Document File: 1 page(s) / 33K

Publishing Venue

The IP.com Prior Art Database

Abstract

Detecting incorrect links within content by analyzing the context of the link.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 57% of the total text.

When content is spread across multiple pages that need to link to each other, it is difficult to ensure that all of the links are correct. This applies to product documentation, marketing materials, Wikipedia, legal documents, and scientific or political research.

    Use semantic analysis when analysing links in bodies of content. First check that the link is valid, then examine the linked page and ensure that it is related to the page that contains the link. Use information such as the link text, local context and page context to produce a level of confidence that the link is correct.
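
As a minimal sketch of the confidence idea, the following compares each source signal with the text of the target page. The disclosure does not name a semantic-analysis technique, so a bag-of-words cosine similarity is used here purely as a stand-in; the function names and the weights are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case word counts; a crude stand-in for real semantic analysis."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity of two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical weights: link text counts most, page context least.
WEIGHTS = {"link_text": 0.5, "local_context": 0.3, "page_context": 0.2}


def link_confidence(link_text, local_context, page_context, target_text):
    """Weighted similarity of each source signal to the target page."""
    target = bag_of_words(target_text)
    signals = {
        "link_text": link_text,
        "local_context": local_context,
        "page_context": page_context,
    }
    return sum(
        WEIGHTS[name] * cosine(bag_of_words(text), target)
        for name, text in signals.items()
    )
```

A link whose target scores below some chosen threshold could then be flagged for review; a suitable threshold would have to be tuned per body of content.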

    This would be an addition to standard link checkers. The overall process is:

    1. Link checker parses content to find source links
    2. For each source link, the link checker will:
        2.1. Check if the link target has been seen before
        2.2. If it has been seen, report the previous result for this link target
        2.3. If it has not been seen, attempt to fetch the link and record the result
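
The standard link-checker loop described above can be sketched as follows. This is an illustration only, assuming HTML input and absolute URLs; `LinkCollector`, `fetch_status` and `check_links` are hypothetical names, and the fetcher is passed in as a parameter so the loop can be exercised without network access.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Step 1: collect href values from <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def fetch_status(url):
    """Default fetcher: HTTP status code, or None if the fetch fails."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except (ValueError, OSError):
        return None


def check_links(html, fetch=fetch_status):
    """Step 2: report one result per source link, fetching each target once."""
    collector = LinkCollector()
    collector.feed(html)
    seen = {}  # link target -> recorded result
    report = []
    for target in collector.links:
        if target not in seen:  # 2.3: not seen before, fetch and record
            seen[target] = fetch(target)
        report.append((target, seen[target]))  # 2.2: reuse recorded result
    return report
```

Relative links are not resolved in this sketch; a real checker would first join them against the URL of the source page.
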
Our addition to this process is that, if a link is valid and points to content, we would analyse that content to check that it is plausible for the context. The overall process would be:

    1. Collect information about the source link, including the source link text, the paragraph containing that text, the section containing that paragraph, and all titles in the "breadcrumb trail" of that section.

    2. Collect information about the target link, including the name...
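
Step 1 above, collecting the source link information, could be sketched as below. The `Section` document model and the `SourceLinkContext` record are hypothetical structures introduced for this example; the disclosure does not specify how pages are represented. Step 2 is not sketched because its description is truncated in this abbreviated text.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """Hypothetical page model: a titled section with paragraphs and subsections."""
    title: str
    paragraphs: list = field(default_factory=list)
    subsections: list = field(default_factory=list)


@dataclass
class SourceLinkContext:
    """The four pieces of source information listed in step 1."""
    link_text: str
    paragraph: str
    section_text: str
    breadcrumb: list


def collect_source_context(root, link_text, trail=()):
    """Depth-first search for the first paragraph containing link_text."""
    trail = list(trail) + [root.title]
    for paragraph in root.paragraphs:
        if link_text in paragraph:
            section_text = "\n".join(root.paragraphs)
            return SourceLinkContext(link_text, paragraph, section_text, trail)
    for child in root.subsections:
        found = collect_source_context(child, link_text, trail)
        if found is not None:
            return found
    return None
```

The breadcrumb is simply the list of section titles from the page root down to the section that holds the link.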