
Method for preventing user from connecting to misspelled URLs by using Key adjacency analysis and weighted cache entries.

IP.com Disclosure Number: IPCOM000205854D
Publication Date: 2011-Apr-06
Document File: 4 page(s) / 89K

Publishing Venue

The IP.com Prior Art Database

Abstract

Method for preventing user from connecting to misspelled URLs by using Key adjacency analysis and weighted cache entries.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 39% of the total text.


When a user types a URL into a browser, a key or two is sometimes mistyped, mostly because an adjacent key on the keyboard was pressed rather than the intended one. The user may also enter the keys in the wrong order, resulting in a jumbled URL. The misspelled URL may resolve to a rogue site that is specially designed to attract traffic from such mistakes. This is a prevalent malpractice, commonly known as typo-squatting. Typical solutions to this problem revolve around maintaining a database of such rogue sites and informing the user when the misspelled URL matches an entry in the rogue list. But due to the dynamic nature of the Web, such a database approach is not efficient. It also depends on continuous updates from the Net, which not all users may prefer.

This solution makes use of an algorithm that takes into account the adjacency of keys on a standard QWERTY keyboard (both on computers and handheld smart devices) and tries to catch a misspelled URL. It also makes use of a weighted cache of URLs successfully accessed in the past. The weight of each entry is proportional to the number of successful accesses made to that particular URL in the past. The solution does not actively use the DNS resolver to check the validity of a URL, because a misspelled URL can very well resolve to a valid IP address that belongs to a typo-squatting site.

We define an Adjacency Domain for each key on a standard QWERTY keyboard:

KEY   Adjacency Domain
Q     {1,2,W,S,A}
W     {Q,2,3,E,D,S,A}
E     {W,3,4,R,F,D,S}
R     {E,4,5,D,F,G,T}
T     {R,5,6,F,G,H,Y}
Y     {T,6,7,G,H,J,U}
U     {Y,7,8,H,J,K,I}
I     {U,8,9,J,K,L,O}
O     {I,9,0,K,L,P}
P     {O,L,0}
A     {Q,W,S,X,Z}
S     {A,Q,W,E,D,C,X,Z}
D     {S,W,E,R,F,V,C,X}
F     {D,E,R,T,G,B,V,C}
G     {F,R,T,Y,H,B,V}
H     {G,T,Y,U,J,N,B}
J     {H,Y,U,I,K,M,N}
K     {J,U,I,O,L,M}
L     {K,I,O,P}
Z     {A,S,X}
X     {Z,A,S,D,C}
C     {X,S,D,F,V}
V     {C,D,F,G,B}
B     {V,F,G,H,N}
N     {B,G,H,J,M}
M     {N,J,K}
1     {2,W,Q}
2     {1,Q,W,3}
3     {2,W,E,4}
4     {3,E,R,5}
5     {4,R,T,6}
6     {5,T,Y,7}
7     {6,Y,U,8}
8     {7,U,I,9}
9     {8,I,O,0}
0     {9,O,P}
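
To make the Adjacency Domain concrete, the following is a minimal Python sketch, not the disclosure's own implementation. It transcribes the domains listed above into a lookup table and counts how many characters of a typed name could be explained by pressing a key adjacent to the intended one. The names ADJACENCY, is_adjacent and adjacent_key_slips are hypothetical, and treating "typed key is in the domain of the intended key" as the matching rule is an assumption, since the visible text does not spell out the comparison step.

```python
from typing import Optional

# Adjacency Domains transcribed from the table above (lowercased).
_RAW = {
    "q": "12wsa", "w": "q23edsa", "e": "w34rfds", "r": "e45dfgt",
    "t": "r56fghy", "y": "t67ghju", "u": "y78hjki", "i": "u89jklo",
    "o": "i90klp", "p": "ol0", "a": "qwsxz", "s": "aqwedcxz",
    "d": "swerfvcx", "f": "dertgbvc", "g": "frtyhbv", "h": "gtyujnb",
    "j": "hyuikmn", "k": "juiolm", "l": "kiop", "z": "asx",
    "x": "zasdc", "c": "xsdfv", "v": "cdfgb", "b": "vfghn",
    "n": "bghjm", "m": "njk", "1": "2wq", "2": "1qw3", "3": "2we4",
    "4": "3er5", "5": "4rt6", "6": "5ty7", "7": "6yu8", "8": "7ui9",
    "9": "8io0", "0": "9op",
}
ADJACENCY = {key: set(neighbours) for key, neighbours in _RAW.items()}


def is_adjacent(typed: str, intended: str) -> bool:
    """True if `typed` lies in the Adjacency Domain of `intended`."""
    return typed.lower() in ADJACENCY.get(intended.lower(), set())


def adjacent_key_slips(typed: str, known: str) -> Optional[int]:
    """Count positions where `typed` differs from `known` by an adjacent key.

    Returns None when the lengths differ or when any mismatch cannot be
    explained by an adjacent-key press (assumed matching rule).
    """
    if len(typed) != len(known):
        return None
    slips = 0
    for t, k in zip(typed.lower(), known.lower()):
        if t == k:
            continue
        if not is_adjacent(t, k):
            return None
        slips += 1
    return slips
```

For example, comparing the typed name "yahio" against the known name "yahoo" reports one slip, because I is in the Adjacency Domain of O, whereas "yaqoo" is rejected because Q is not in the domain of H.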

The Cache format is:

Full URL (e.g. "www.yahoo.com")
Base URL (e.g. "yahoo")
Sorted Base URL (e.g. "ahooy")
Weight (a variable numeric value)
T Weight (a variable numeric value)

Entries are sorted in descending order of T Weight. If T Weight is the same, then in descending order of Weight.
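
The cache layout and its ordering can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the disclosure's implementation: the class name CacheEntry and the function order_cache are hypothetical, and deriving the Sorted Base URL by sorting the characters of the Base URL is inferred from the "yahoo" / "ahooy" example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CacheEntry:
    full_url: str      # e.g. "www.yahoo.com"
    base_url: str      # e.g. "yahoo"
    weight: int = 0    # Weight
    t_weight: int = 0  # T Weight

    @property
    def sorted_base_url(self) -> str:
        # Assumption: the Sorted Base URL is the Base URL with its
        # characters sorted, matching the "yahoo" -> "ahooy" example.
        return "".join(sorted(self.base_url))


def order_cache(entries: List[CacheEntry]) -> List[CacheEntry]:
    """Sort by T Weight descending, then by Weight descending."""
    return sorted(entries, key=lambda e: (e.t_weight, e.weight), reverse=True)
```

Sorting on the tuple (T Weight, Weight) in reverse gives the tie-breaking rule directly: entries with equal T Weight fall back to comparing Weight.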

Full URL: The complete URL, e.g. www.yahoo.com
Base URL: Only the main part of the URL after removing the prefix and suffix, e.g. yahoo
Weight: If the URL has been successfully accessed N times in the past, then its weight is N.
T Weight: We define a time value called T (Threshold). If the user remains connected to a URL for more than time T, then T Weight is incremented by 1. The basic idea is that the user may sometimes access a valid site that is not the one they are looking for. In that case, the access time will most likely not cross T. Hence, access time crossing T means a h...
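
The two update rules defined above (Weight counts successful accesses, T Weight counts accesses that last longer than T) can be sketched in Python as follows. This is a hedged illustration only: the names Weights and record_access are hypothetical, and the 30-second default for T is an arbitrary placeholder, since the text shown here does not give a value for the threshold.

```python
from dataclasses import dataclass

# Hypothetical placeholder: the disclosure does not state a value for T.
THRESHOLD_T_SECONDS = 30.0


@dataclass
class Weights:
    weight: int = 0    # number of successful accesses
    t_weight: int = 0  # number of accesses that lasted longer than T


def record_access(w: Weights, connected_seconds: float,
                  threshold: float = THRESHOLD_T_SECONDS) -> Weights:
    """Update the weights after one successful access to a URL."""
    w.weight += 1
    if connected_seconds > threshold:
        w.t_weight += 1
    return w
```

A short visit of 5 seconds followed by a long visit of 120 seconds would leave such an entry with Weight 2 and T Weight 1.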