Have you ever tried to find a tweet you liked some time ago? Me too, and it’s almost impossible. Scrolling down in the ‘Likes’ tab of my profile while doing CMD-F is a pain and it doesn’t even work sometimes.
I came up with a way of saving all my past and future Twitter likes. It lets me browse, filter them, and search for tweets by text or user. And it’s free.
I thought it could be helpful for others, so here it goes.
We’ll use Tinybird to store and query the tweets. It lets you ingest CSVs of…
When you use different emails for personal and work projects, it’s easy to mess up your Git repos by committing to them with the wrong user. This is how to fix that.
If you’ve committed to a personal project (with only you as a committer) with your work email, here’s how to rewrite history for that repo (from this Stack Overflow answer).
This is the one I used, and it worked like a charm.
git filter-branch -f --env-filter "
A few weeks ago, I started using spaCy to detect locations in job descriptions. spaCy is an NLP library that lets you do pretty powerful stuff out-of-the-box and get things done fast.
Everything was working fine locally. But NoiceJobs (my project) is hosted on Heroku and uses the cheapest dynos possible, with only 0.5 GB of RAM. That’s enough for running simple apps, but ML code is normally more memory- and CPU-intensive, so when I deployed the new version of the app on Heroku I got memory quota exceeded errors all the time.
Some AWS engineers jumped into…
Recently I worked on a project for a US bootcamp from Denver that has been teaching people how to code for the past 5 years. As part of the admission process, candidates are asked 8 questions out of a bank of 12 to test their logical thinking. The school wanted to know if and how, using some data science analysis, this quiz could be shortened.
Now, it takes about 1 hour to complete, and ~50% of the people that sign up never start the quiz. Ideally, the school would be able to ask fewer questions, while…
This is part 2 of this other post. Enjoy!
When comparing the performance of two assets, it’s common to plot their cumulative returns between a certain date X and another date Y to see which one performed better. But this only tells a small part of the story, and maybe a biased one. See the following two plots, comparing Apple to Microsoft.
We could write two completely different stories depending on which plot we choose. …
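To make the start-date effect concrete, here is a minimal sketch in plain Python. The prices are invented for illustration only (they are not Apple or Microsoft data), and `cumulative_returns` is a hypothetical helper, not code from the post.

```python
def cumulative_returns(prices):
    """Return each price's cumulative return relative to the first price."""
    base = prices[0]
    return [p / base - 1 for p in prices]


# Hypothetical prices, made up for this example (not real market data).
asset_a = [100.0, 120.0, 110.0, 130.0]
asset_b = [100.0, 90.0, 105.0, 125.0]

# Compare the two assets over the same end date but two different start dates.
for start in (0, 1):
    a = cumulative_returns(asset_a[start:])[-1]
    b = cumulative_returns(asset_b[start:])[-1]
    winner = "A" if a > b else "B"
    print(f"start index {start}: A={a:.1%}, B={b:.1%}, better: {winner}")
```

With these made-up numbers, asset A comes out ahead when measured from the first date and asset B comes out ahead when measured from the second, which is the kind of bias a single start date can introduce.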
The majority of this post is interactive visualizations you can hover on, zoom and move around. It’s better read on a computer than on your phone, but landscape mode on the phone will at least let you see the plots better than portrait mode.
Imagine that you’re trying to develop a solution to Kaggle’s Rossmann Store Sales competition. You’ve done a lot of feature engineering and created a ton of new variables that may help you predict future sales better.
In the last couple of years, a lot of crypto ‘hedge’ funds have appeared. In this series I’ll analyze how a lot of them would have performed vs. Bitcoin, with data from 2013 until June 2018 scraped from CoinMarketCap.
Well, originally the hedge in a hedge fund meant that risks were limited by having short positions in addition to the long ones, making the fund market neutral. This is not a common thing in these crypto funds, where they’re mostly long.
In this series I’ll analyze a particular type of fund where almost all of the data is…
Freelance data scientist and software developer. On Twitter: @xoelipedes.