A system and method of synthetic data generation of enterprise marketing data that is easily configurable with all relevant features
Disclosure Number: IPCOM000238416D
Publication Date: 2014-Aug-26
Document File: 6 page(s) / 68K

Publishing Venue

The Prior Art Database


Statistical models used for marketing analytics and forecasting cannot be validated without data that is sufficiently varied and well characterised. This disclosure describes a tool that allows such data to be synthesized from user-defined specifications.

This is the abbreviated version, containing approximately 30% of the total text.

Currently, a research scientist working on marketing optimization and predictive analytics problems builds models and selects important machine learning features based on the small sets of customer data made available. Large and varied sets of customer data are usually very difficult to obtain because of security and privacy concerns. As a result, entire optimization and prediction solutions rest on a limited amount of data, which is a risk and one of the biggest challenges today.

In addition, the marketing space is dynamic and fast-changing, so continuous cross-validation and improvement of machine learning models against the most current valid data is extremely important.

Finally, it is essential that a research solution be tested and validated against the requirements and usage of the actual software solution presented to the end user. This requires a level of testing that validates the research solution against the product requirements.

For all three reasons above, a comprehensive method is needed to generate synthetic data and to create different variations of it so that their effects can be observed. Random, ad hoc generation of marketing data, or data from one particular enterprise in the field, cannot be relied upon. What is desired is an approach that generates contacts using marketing contact patterns for customers, and responses using response probabilities deduced from historical state.

The approach is to generate synthetic marketing data that closely resembles practical marketing data and to generate variations based on the most important marketing features.

The mean number of contacts to a customer, together with the probabilities of each contact feature (offer, channel, and strategic segment of the customer), is used to calculate the features of each contact. These input parameters can be defined in the form of a contact pattern, and a customer can be assigned a contact pattern over a particular time period.

Responses are defined by considering the past contacts and responses of the customer, which together define the state of the customer. For every past contact or response, the channel associated with that interaction defines a decay curve, which controls how the effect of the interaction decreases over time. A response is generated by combining the response probabilities of all the preceding contacts and responses, where each contact's response probability is the combination of the response probabilities of its features (offer, channel, strategic segment, response type). The resulting probability is then used to decide whether to generate a response.

Most marketing optimization problems segment customers (group them based on demographics, value, etc.) by advanced analytics into very small micro-segments. A decision tree is defined to create contacts and responses to decide the ef...
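The contact and response mechanics described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the abbreviated text does not specify the distributions or the combination rule, so the sketch assumes a Poisson-distributed contact count, uniformly distributed contact days, an exponential (half-life) decay curve per channel, a product of feature probabilities for each contact, and a noisy-OR combination across preceding contacts. All names (Contact, generate_contacts, response_probability, the dictionary keys) are hypothetical. For brevity, only past contacts contribute to the customer state and the response type feature is omitted; past responses could be folded in the same way.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Contact:
    day: float      # time of the contact within the period
    offer: str
    channel: str
    segment: str    # strategic segment of the customer


def sample_poisson(rng, mean):
    """Draw a Poisson count with Knuth's method (assumed count model)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def pick(rng, probs):
    """Pick one key of `probs` with probability proportional to its value."""
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


def generate_contacts(rng, pattern, period_days):
    """Generate the contacts of one customer from a contact pattern."""
    n = sample_poisson(rng, pattern["mean_contacts"])
    days = sorted(rng.uniform(0.0, period_days) for _ in range(n))
    return [
        Contact(day, pick(rng, pattern["offer"]),
                pick(rng, pattern["channel"]), pick(rng, pattern["segment"]))
        for day in days
    ]


def decayed_weight(age_days, half_life_days):
    """Assumed decay curve: the effect halves every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)


def response_probability(now, history, feature_probs, half_life):
    """Combine the decayed probabilities of all contacts up to `now`."""
    no_response = 1.0
    for c in history:
        if c.day > now:
            continue  # only preceding contacts shape the customer state
        base = (feature_probs["offer"][c.offer]
                * feature_probs["channel"][c.channel]
                * feature_probs["segment"][c.segment])
        weight = decayed_weight(now - c.day, half_life[c.channel])
        no_response *= 1.0 - base * weight  # noisy-OR (assumption)
    return 1.0 - no_response


if __name__ == "__main__":
    rng = random.Random(42)
    pattern = {
        "mean_contacts": 3.0,
        "offer": {"discount": 0.7, "upgrade": 0.3},
        "channel": {"email": 0.8, "phone": 0.2},
        "segment": {"gold": 1.0},
    }
    feature_probs = {
        "offer": {"discount": 0.4, "upgrade": 0.2},
        "channel": {"email": 0.3, "phone": 0.6},
        "segment": {"gold": 0.9},
    }
    half_life = {"email": 7.0, "phone": 14.0}
    contacts = generate_contacts(rng, pattern, period_days=30.0)
    p = response_probability(30.0, contacts, feature_probs, half_life)
    responded = rng.random() < p  # decide whether to emit a response
    print(len(contacts), round(p, 4), responded)
```

Keeping the decay curve and the combination rule in separate functions makes it straightforward to swap in a different curve per channel or a different combination rule when generating variations of the data.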