How to model time-to-event and ‘time between events’ on real data using the Exponential Family to optimize returns of time-driven investments.

Screenshots from within the article.


When we analyze time-to-event data (or time between events, for that matter), time itself is not the observation axis but the observed target. How can I model how much time between events can…
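As a rough illustration of treating the time between events as the target, the sketch below fits an exponential distribution (one member of the exponential family, chosen here only for simplicity) to a list of gaps by maximum likelihood. The function names and the sample durations are hypothetical and are not taken from the article.

```python
import math

def fit_exponential_rate(gaps):
    """Maximum-likelihood rate of an exponential distribution: 1 / mean gap."""
    if not gaps or any(g <= 0 for g in gaps):
        raise ValueError("gaps must be a non-empty sequence of positive durations")
    return len(gaps) / sum(gaps)

def prob_gap_exceeds(rate, t):
    """P(T > t) for an exponential distribution with the given rate."""
    return math.exp(-rate * t)

# Hypothetical durations (in days) between consecutive events.
gaps_in_days = [1.0, 2.0, 3.0]

# The estimated rate is events per day; here the mean gap is 2 days.
rate = fit_exponential_rate(gaps_in_days)

# Probability that the next gap lasts longer than 2 days under this model.
p_more_than_two_days = prob_gap_exceeds(rate, 2.0)
```

Whether an exponential model is adequate depends on the data; it assumes a constant event rate, which real series may not satisfy.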

The meaningful features your model is missing may be a one-liner away.

1. Theory

To compute these models, we need to vectorize our variables and align them so that every column is a (discretized) variable and every row is an observation of those variables at a given point on a common axis. This is mandatory for applications (such as machine learning) that require a tidy dataset as input.

To show it in a simple example:

Image by Author.

Above, two variables are both graphed and vectorized sharing a common axis…
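The alignment described above can be sketched in pandas. This is a minimal example that assumes the common axis is a date index; the variable names and values are hypothetical.

```python
import pandas as pd

# Hypothetical example data: two variables observed on different dates.
temperature = pd.Series(
    [20.5, 21.0, 19.8],
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
    name="temperature",
)
sales = pd.Series(
    [100, 120],
    index=pd.to_datetime(["2020-01-02", "2020-01-03"]),
    name="sales",
)

# Align both variables on the shared date axis: one column per variable,
# one row per date. The inner join keeps only dates observed for both.
tidy = pd.concat([temperature, sales], axis=1, join="inner")
```

An inner join drops dates where either variable is missing; an outer join followed by an explicit fill strategy is an alternative when those rows matter.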

Boosting Performance by Generating Features from External Data with Python.

Seattle Landscape; image from


Then we’ll significantly improve that model by generating features from external data such as proximity to cultural spaces, parks, public art spots, golf courses, swimming beaches, picnic tables, etc., measuring the improvement from each added feature.
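One way such a proximity feature could be computed is the great-circle distance from each house to the nearest point of interest. The sketch below uses the haversine formula with NumPy; the helper names and coordinates are hypothetical, and the article may build its features differently.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def distance_to_nearest(house_lat, house_lon, poi_lats, poi_lons):
    """Distance in kilometers from one house to its closest point of interest."""
    distances = haversine_km(
        house_lat, house_lon, np.asarray(poi_lats), np.asarray(poi_lons)
    )
    return float(np.min(distances))

# Hypothetical coordinates: one house and two parks.
park_lats = [47.60, 48.00]
park_lons = [-122.30, -122.00]
nearest_park_km = distance_to_nearest(47.61, -122.30, park_lats, park_lons)
```

Applied to every row of the housing data, the result becomes one new numeric column per type of place (parks, beaches, and so on).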

What we’ll do

  • Step 1: Explore the Seattle Housing Prices Data
  • Step 2: Create a Price Prediction Model
  • Step 3: Add Features from External Data
  • Step 4: Compare and Analyze Results

Step 1. Explore the Seattle Housing Prices Data

A simple approach.

You will either step forward into growth or you will step backward into safety.

-Abraham Maslow-


Simplicity is key

Simplicity is the keynote of all true elegance.

-Coco Chanel-

Photo of Google Headquarters from source.

Disclaimer: I am assuming that whoever has the ability to comprehend and execute the content of this article is savvy enough to perform robust back-testing in every corner of their trading pipeline before actually running it in production. However, there are some considerations that this article doesn’t take into account (spread, slippage and transaction costs, among others), and for that reason this article is not to be considered financial advice. It is to be considered an educational step towards better-performing results.


It’s simpler than you think.

People who are crazy enough to think they can change the world are the ones who do.

-Rob Siltanen-


Thousands of companies around the world, from small startups to global corporations, find great value in improving the performance of their supervised or unsupervised ML models, whether it’s a sales or demand forecast, a market basket analysis…

A step-by-step tutorial in Python.

Minimize cost Function

Sales and demand forecasts are a priority for the Data Science/Analytics departments of a huge number of companies, from startups to global corporations. To say the least, there is a low supply of experts in the subject. Reducing the error even by a small amount can make a huge difference in revenue or savings.

In this article, we’ll build a simple sales forecast model with real data and then improve it by finding relevant features using Python.

What we’ll do

  • Step 1: Define and understand Data and Target
  • Step 2: Make a Simple Forecast Model
  • Step 3: Improve it by…

Simplicity is key.


We’ll also measure how profitable it would be in real life.

What we’ll do

  • Step 1: Set up technical prerequisites
  • Step 2: Get the data for daily Amazon Candles since 2017
  • Step 3: Define and understand target for ML
  • Step 4: Blend business news into our data and understand tokens
  • Step 5: Prepare our data and apply ML
  • Step 6: Measure and analyze results
  • Step 7: Break the data and train/test through time

Step 1. Prerequisites

  • Have Python…

A step-by-step tutorial using Python.

Note from Towards Data Science’s editors: While we allow independent authors to publish articles in accordance with our rules and guidelines, we do not endorse each author’s contribution. You should not rely on an author’s works without seeking professional advice. See our Reader Terms for details.

In today’s harsh global economic conditions, traditional indicators and techniques can perform poorly (to say the least).

In this tutorial we’ll search for useful information in the news and transform it into a numerical format using NLP to train a Machine Learning model which will predict the rise or fall of any given Cryptocurrency…

A Study of Each Swing State’s Key Drivers Using Machine Learning.


The campaigns for the 2020 US Presidential Election have started and roughly 4 months from now (November 3rd is Election Day) a head of state will be selected by the voters.

Which candidate will leverage the concepts that influence the voters’ behavior the most?

Due to the US Electoral College system, it’s highly likely that the election will be defined by the swing states. Currently, the polls are reasonably unanimous on the following:

There are 21 states that lean Democrat

There are 24 states that lean Republican

And there are 6 swing states where voting preference is still up in…

Federico Riveroll

An attempt to separate signal from noise. | MSDS
