Have you ever tried to find a tweet you liked some time ago? Me too, and it’s almost impossible. Scrolling down in the ‘Likes’ tab of my profile while doing CMD-F is a pain and it doesn’t even work sometimes.

I came up with a way of saving all my past and future Twitter likes. It lets me browse, filter them, and search for tweets by text or user. And it’s free.

I thought it could be helpful for others, so here it goes.

Before you start

We’ll use Tinybird to store and query the tweets. It lets you ingest CSVs of…


When you use different emails for personal and work projects, it’s easy to mess up your Git repos by committing to them with the wrong user. This is how to fix that.

If it’s already happened

If you’ve committed to a personal project (with only you as a committer) with your work email, here’s how to rewrite history for that repo (from this Stack Overflow answer).

This is the one I used, and it worked like a charm.

git filter-branch -f --env-filter "
GIT_AUTHOR_NAME='Newname'
GIT_AUTHOR_EMAIL='new@email'
GIT_COMMITTER_NAME='Newname'
GIT_COMMITTER_EMAIL='new@email'
" HEAD

If there are multiple committers, don’t use the previous command, or you’ll also rewrite the commits that weren’t…


Hey y’all!

A few weeks ago, I started using spaCy to detect locations in job descriptions. spaCy is an NLP library that lets you do pretty powerful stuff out-of-the-box and get things done fast.

Everything was working fine locally. But NoiceJobs (my project) is hosted on Heroku and uses the cheapest dynos possible, with only 0.5 GB of RAM. That’s enough for running simple apps, but ML code is normally more memory- and CPU-intensive, so when I deployed the new version of the app on Heroku, I got memory quota exceeded errors all the time.

Some AWS engineers jumped into…


Reducing the number of questions they ask may get more people to apply and enroll, which would increase the company’s revenue

Recently I worked on a project for a US bootcamp from Denver that has been teaching people how to code for the past 5 years. As part of the admission process, candidates are asked 8 questions out of a bank of 12 to test their logical thinking. The school wanted to know whether, and how, some data science analysis could help shorten this quiz.

Right now, it takes about 1 hour to complete, and ~50% of the people who sign up never start the quiz. Ideally, the school would be able to ask fewer questions, while…


Introducing a new way to look at crypto index funds and visualize and compare asset performance.

This is part 2 of this other post. Enjoy!

When comparing the performance of two assets, it’s common to plot their cumulative returns between a certain date X and another date Y to see which one performed better. But this only tells us a small part of the story, and maybe a biased one. See the following two plots, comparing Apple to Microsoft.

We could write two completely different stories depending on which plot we choose. …
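To make the start-date effect concrete, here’s a minimal sketch in plain Python. The prices are made up for two hypothetical assets (they are not real Apple or Microsoft data), and the function names `cumulative_return` and `better_performer` are just illustrative. It only shows how the "winner" can flip when the starting date changes.

```python
def cumulative_return(prices, start, end):
    """Cumulative return of holding an asset from index `start` to index `end`."""
    return prices[end] / prices[start] - 1.0


def better_performer(prices_by_asset, start, end):
    """Name of the asset with the highest cumulative return in the window."""
    return max(
        prices_by_asset,
        key=lambda name: cumulative_return(prices_by_asset[name], start, end),
    )


# Made-up prices for two hypothetical assets (NOT real market data).
prices = {
    "asset_a": [100.0, 120.0, 110.0, 132.0],
    "asset_b": [100.0, 90.0, 108.0, 126.0],
}

# Same end date, two different start dates.
print(better_performer(prices, 0, 3))  # asset_a: +32% vs +26%
print(better_performer(prices, 1, 3))  # asset_b: +40% vs +10%
```

Starting one period later is enough to tell the opposite story, which is why a single cumulative-return plot can be misleading.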


The majority of this post is interactive visualizations you can hover over, zoom, and move around. It’s better read on a computer than on your phone, but landscape mode on the phone will at least let you see the plots better than portrait mode.

Imagine that you’re trying to develop a solution to Kaggle’s Rossmann Store Sales competition. You’ve done a lot of feature engineering and created a ton of new variables that may help you predict future sales better.

You’ve created a Random Forest and you’re trying to find its optimal hyperparameters. There’s like 1000+ possible combinations of them…
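As a rough illustration of how quickly the combinations pile up, here’s a small sketch that enumerates a search grid with `itertools.product`. The grid is hypothetical: the parameter names assume scikit-learn’s `RandomForestRegressor`, and the candidate values are made up for the example, not taken from this post.

```python
from itertools import product

# Hypothetical search grid. Parameter names assume scikit-learn's
# RandomForestRegressor; the candidate values are made up.
param_grid = {
    "n_estimators": [50, 100, 200, 400, 800],
    "max_depth": [4, 8, 12, 16, 24, None],
    "min_samples_leaf": [1, 2, 5, 10, 20],
    "max_features": [0.25, 0.5, 0.75, 1.0],
    "bootstrap": [True, False],
}


def all_combinations(grid):
    """Expand a dict of candidate lists into a list of parameter dicts."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


combinations = all_combinations(param_grid)
print(len(combinations))  # 5 * 6 * 5 * 4 * 2 = 1200
```

Just five hyperparameters with a handful of candidate values each already give 1200 combinations, and each one means training a full forest.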


In the last couple of years, a lot of crypto ‘hedge’ funds have appeared. In this series I’ll analyze how a lot of them would have performed vs. Bitcoin, with data from 2013 until June 2018 scraped from Coinmarketcap.

‘Hedge’ funds?

Well, originally the hedge in a hedge fund meant that risks were limited by having short positions in addition to the long ones, making the fund market neutral. This is not a common thing in these crypto funds, where they’re mostly long.

So what?

In this series I’ll analyze a particular type of fund where almost all of the data is…

Xoel López Barata

Freelance data scientist and software developer. On Twitter: @xoelipedes.
