
Cluster-Membership Prediction for Efficient Dynamic Testing

IP.com Disclosure Number: IPCOM000249148D
Publication Date: 2017-Feb-08
Document File: 3 page(s) / 25K

Publishing Venue

The IP.com Prior Art Database

Abstract

Dynamic web application testers, such as IBM's AppScan and its competitors, cannot scan large web applications in their entirety in a reasonable amount of time. The main causes are the vast number of web pages to crawl, the number of testable elements per page, and the physical constraints of the machine the tester runs on. In this patent proposal we outline a novel method that substantially reduces the number of web pages to crawl.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.



Background: Below are some definitions, clarifications and background details.

Clustering: Cluster analysis [1] or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters).

Web Crawler: A web crawler is a program that, given one or more seed URLs, downloads the web pages associated with these URLs, extracts any hyperlinks contained in them and adds them to a list of URLs to visit, called the crawl frontier. URLs from the crawl frontier are then recursively visited according to the crawler’s policy. Web crawlers are an important component of web search engines, where they are used to collect the corpus of web pages indexed by the search engine.
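
The crawl-frontier loop described above can be sketched as follows. This is a minimal illustration, not any particular crawler's implementation: the function name crawl, the fetch_links callback, and the in-memory SITE dictionary (standing in for real page downloads) are all assumptions made for the example, and the policy shown is plain first-in, first-out.

```python
from collections import deque

def crawl(seeds, fetch_links, max_pages=100):
    """Visit URLs from the frontier and enqueue any links not seen before."""
    frontier = deque(dict.fromkeys(seeds))  # crawl frontier: URLs still to visit
    seen = set(frontier)                    # URLs already queued, to avoid revisits
    visited = []                            # URLs in the order they were visited
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()            # FIFO policy, i.e. breadth-first order
        visited.append(url)
        for link in fetch_links(url):       # hyperlinks extracted from the page
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited

# Hypothetical in-memory site standing in for real HTTP downloads.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": ["/"],
    "/c": [],
}
order = crawl(["/"], lambda url: SITE.get(url, []))
print(order)
```

A different crawler policy would only change how the next URL is taken from the frontier; the seen set is what keeps the visit from looping on pages that link back to each other.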

Clustering for Dynamic Web App Testing: In a previous disclosure, dubbed “Functional Server-side Similarity of Web Pages for Efficient Dynamic Testing”, we described a method that dramatically reduces the number of elements that a dynamic tester tests. To achieve this reduction, pages are grouped into clusters of similar server-side functionality, i.e. pages in a cluster are the output of the same server-side functionality (script). Since each cluster is believed to have been created by the same server-side functionality, only one page from that cluster needs to be tested. We implemented the algorithm and tested it. The results showed a 10-20(!) times reduction in the number of elements that need to be tested (depending on the site).
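
The "test only one page per cluster" step can be sketched as below. The clustering algorithm itself belongs to the earlier disclosure and is not reproduced here; the example assumes each page already has a cluster assignment. The names pick_representatives and cluster_by_path are hypothetical, and grouping by URL path is only a stand-in for real server-side similarity.

```python
def pick_representatives(pages, cluster_of):
    """Return one page per cluster: the first page seen for each cluster."""
    representatives = {}
    for page in pages:
        # setdefault keeps the first page recorded for a cluster and ignores the rest
        representatives.setdefault(cluster_of(page), page)
    return list(representatives.values())

# Hypothetical cluster assignment: the URL path without its query string.
def cluster_by_path(url):
    return url.split("?", 1)[0]

pages = ["/item?id=1", "/item?id=2", "/item?id=3", "/cart", "/cart?promo=x"]
to_test = pick_representatives(pages, cluster_by_path)
print(to_test)
```

Here five crawled pages collapse to two pages to test, which is the kind of reduction the paragraph above reports, though the figures in the text come from the authors' own implementation and not from this sketch.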

Prior-art: We didn’t find any.

Summary:

As described in Clustering for Dynamic Web App Testing, *AppScan employs a clustering algorithm to reduce the number of elements it tests. Nevertheless, this does not affect the number of pages *AppScan crawls to, i.e. *AppScan still crawls the entire website.

We propose a method that predicts the cluster membership of a request-to-be-sent (a textual representation of the request that should be sent, but isn’t actually sent) for a web page. Our method effectively eliminates the need to crawl to web pages whose cluster membership it can predict (since we have already derived the elements to be tested from an already crawled page that came from the same server-side functionality).
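
To make the idea concrete, the sketch below skips crawling any pending request whose cluster can be predicted from its text alone. The available text does not describe how the prediction is actually made, so everything here is an assumption for illustration: the request_signature function (method, path with digits masked, sorted parameter names), the ClusterPredictor class, and the sample URLs and cluster id are all hypothetical.

```python
import re
from urllib.parse import urlsplit, parse_qsl

def request_signature(method, url):
    """Hypothetical signature: method, path with digits masked, sorted parameter names."""
    parts = urlsplit(url)
    path = re.sub(r"\d+", "{n}", parts.path)
    names = {name for name, _ in parse_qsl(parts.query, keep_blank_values=True)}
    return (method.upper(), path, tuple(sorted(names)))

class ClusterPredictor:
    """Remembers which cluster each already crawled request signature belongs to."""

    def __init__(self):
        self.known = {}  # signature -> cluster id of an already crawled page

    def learn(self, method, url, cluster_id):
        self.known[request_signature(method, url)] = cluster_id

    def predict(self, method, url):
        # Returns None when the cluster cannot be predicted from the request text
        return self.known.get(request_signature(method, url))

predictor = ClusterPredictor()
predictor.learn("GET", "/product/17?color=red", "product-page")

# Requests-to-be-sent: only those with no predicted cluster still need crawling.
pending = ["/product/42?color=blue", "/checkout?step=1"]
to_crawl = [u for u in pending if predictor.predict("GET", u) is None]
print(to_crawl)
```

In this toy run the second product URL is never requested, because its signature matches a page whose testable elements were already derived; only the checkout request remains to be crawled.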


We implemented the algorithm and tested it. The results showed a 5-10(!) times reduction in the number of pages *AppScan crawls t...