How to think about your data for predictive modeling

Think of your data journey for predictive modeling as an evolution, not a fixed destination. You don’t need a lot of data to get started. Fewer than 1,000 rows can be enough for causal AI like Dacture to start being accurate, whereas generative AI and large language models need far more data.

The data you use to train predictive models is going to change. Your product, your site, and the data you collect in tools like Salesforce and ServiceNow won’t remain the same. That’s expected and okay.

We see a lot of people who think they need to be in a perfect place before they start predictive modeling and running scenario predictions. That isn’t true. There is no perfect state. Having a few key pieces of data and an idea of what you’d like to predict is enough to get started. As more data points become available, your model can become more refined or more granular, but predicting with what you have will get you headed in the right direction.

Here’s an example of data points that would get you started with predicting upsell opportunities, and data points you could add to the model as they become available.


To start:

  • Customer ID
  • Order/Subscription/Plan changes
  • Any product add-ons
  • Current spend

To get advanced:

  • Features they use
  • How often they use features, the product, or the platform
  • Any consumption-based metrics, if they apply
  • Tasks they’ve completed
  • How long they’ve been a customer
  • Interactions with Sales/Support/Customer Success
  • If logged in, areas of the site they’ve visited, and how often
  • Time between initial purchase and upsell
  • Any demographic details you may have about the customer or company

You don’t need these exact data points. This is just an example to help you think through what you have and how you can get started with predictive modeling. In the example, we initially focus on some very basic things that you likely already have in your transaction records, billing system, or whatever sales management tooling you use, like Salesforce. As your organization or data collection matures, the additional data points we suggest could build a richer model when they’re available. What it comes down to is this: don’t let perfection impede progress.
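If it helps to see the idea in code, here is a minimal Python sketch of a customer record that begins with the starter data points and leaves room for the advanced ones. Every name in it (CustomerRecord, available_columns, and the field names) is hypothetical and chosen only for illustration. It is not Dacture's data format or API, just one way you might check which data points you already have for every customer.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class CustomerRecord:
    # Starter data points (hypothetical field names).
    customer_id: str
    plan_changes: int      # count of order/subscription/plan changes
    add_ons: int           # number of product add-ons
    current_spend: float

    # Advanced data points: optional until you start collecting them.
    tenure_months: Optional[int] = None
    support_interactions: Optional[int] = None
    weekly_feature_uses: Optional[float] = None


def available_columns(records):
    """Return the field names that are filled in for every record."""
    names = [f.name for f in fields(CustomerRecord)]
    return [
        name
        for name in names
        if all(getattr(record, name) is not None for record in records)
    ]


records = [
    CustomerRecord("c-001", plan_changes=1, add_ons=0, current_spend=120.0),
    CustomerRecord("c-002", plan_changes=3, add_ons=2, current_spend=480.0,
                   tenure_months=18),
]

# Only the starter fields are complete across both customers here.
print(available_columns(records))
```

In this sketch, tenure_months is left out of the result because one customer is missing it. Once that data point is collected for everyone, it would show up as available without any change to the starter fields.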

Do you have a prediction use case in mind and want to talk through what data you might need to start? We are happy to meet and help you through that process. Schedule a meeting with us or contact us directly.