
A new methodology of reorganizing big data processing based on data collection frequency for best result accuracy Disclosure Number: IPCOM000247620D
Publication Date: 2016-Sep-21
Document File: 9 page(s) / 468K

Publishing Venue

The Prior Art Database


This disclosure describes a new methodology for self-adaptively reorganizing data-processing logic based on the data collection frequency, with the goal of achieving the best result accuracy. Big data processing commonly faces the following problem: there is too much data to handle all of it, so the collection frequency must be reduced for unimportant or slowly changing data. This disclosure therefore aims to provide a solution for such data handling.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 68% of the total text.



In most big data applications, data collection is a major problem: a huge amount of data is generated every day, so it is hard to decide how often to collect data for analysis and machine learning in a big data system. The more frequently the data is collected, the more accurate the result will be, but also the more network bandwidth and computing resources will be consumed. To balance this trade-off, this disclosure describes a new methodology for handling data processing by upgrading or downgrading the processing steps based on monitoring and changing the data collection frequency.
This disclosure includes the following main ideas:

1> The main process of the sample price big data system

- The target function F is used to validate result performance.
- Adjust the processing steps according to the data-in frequency (currently a static default of a 10-minute frequency).

The scenario of the sample price big data system in which the new frequency-adjustment methodology is applied
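The main process above can be sketched as follows. The step names, the mandatory/optional split, and the form of the target function F are illustrative assumptions; the disclosure does not specify them.

```python
# Sketch of the sample price pipeline: an ordered list of processing steps
# plus a target function F that validates result performance.
# All step names and the form of F are hypothetical assumptions.

def target_f(expected, actual):
    """Target function F: relative error of the computed result (assumed form)."""
    return abs(expected - actual) / abs(expected) if expected else 0.0

class PricePipeline:
    def __init__(self, collect_interval_sec=600):  # default static 10-min frequency
        self.collect_interval_sec = collect_interval_sec
        # Ordered processing steps; optional ones can be skipped (downgrade)
        # or re-enabled (upgrade) when the collection frequency changes.
        self.steps = [
            ("clean", True),      # mandatory
            ("aggregate", True),  # mandatory
            ("smooth", False),    # optional refinement of the medium result
        ]

    def enabled_steps(self):
        return [name for name, enabled in self.steps if enabled]

pipe = PricePipeline()
print(pipe.collect_interval_sec)  # 600 seconds, i.e. the default 10-min frequency
print(pipe.enabled_steps())
print(target_f(100.0, 98.0))
```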



2> The detailed implementation of the new methodology in the big data system
- The frequency will vary from 10 seconds to 10 minutes or even longer; it depends on data-change monitoring to locate the most suitable frequency level.
- The process will upgrade (add processing steps to obtain a more accurate medium result) or downgrade (skip unnecessary data-processing steps).
- th...
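The frequency-adjustment and upgrade/downgrade decisions above can be sketched as a minimal controller. The mapping from change rate to interval, the 1-minute upgrade threshold, and the step tuples are assumptions for illustration only.

```python
# Hypothetical adaptive controller: picks a collection interval between
# 10 s and 10 min from the observed data change rate, then upgrades or
# downgrades the processing steps. Thresholds are illustrative assumptions.

MIN_INTERVAL = 10    # seconds: fast-changing data
MAX_INTERVAL = 600   # seconds: slow-changing or unimportant data

def choose_interval(change_rate):
    """Map a monitored change rate in [0.0, 1.0] to a collection interval.

    A high change rate yields a short interval; a low one a long interval.
    """
    rate = max(0.0, min(1.0, change_rate))
    return round(MAX_INTERVAL - rate * (MAX_INTERVAL - MIN_INTERVAL))

def adjust_steps(steps, interval):
    """Upgrade: enable optional steps when data arrives often enough to
    justify a more accurate medium result. Downgrade: skip them otherwise.

    Each step is a (name, enabled, optional) tuple; mandatory steps stay on.
    """
    upgraded = interval <= 60  # assumed threshold: 1-min data or faster
    return [(name, (not optional) or upgraded, optional)
            for name, _enabled, optional in steps]

steps = [("clean", True, False), ("aggregate", True, False), ("smooth", False, True)]
print(choose_interval(1.0))              # fast-changing data -> 10 s
print(adjust_steps(steps, 10))           # upgraded: "smooth" enabled
print(adjust_steps(steps, 600))          # downgraded: "smooth" skipped
```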