Three platforms, one song, three different stories of "popular."
We built a Python pipeline pulling chart data from Spotify, Billboard, and Deezer into a single SQLite database — then used it to answer a question artists, A&R teams, and product managers all quietly disagree about: what does popular actually mean?
My role
Team build · Insight synthesis · Strategic interpretation
Team
Coding Queens — 4 collaborators across data ingestion, schema, and analysis
Stack
Python · BeautifulSoup · REST APIs · SQLite · matplotlib
↓ THE INVESTIGATION
If three platforms each call a different song "#1," whose chart is the truth?
Chart data isn't neutral. Each platform encodes its business model into how it ranks songs — Spotify rewards stream velocity, Billboard weighs sales and radio, Deezer surfaces what its (largely European) listeners replay. We built the pipeline to make those biases visible, then asked the harder question a product team would care about: what should you do with that?
Spotify
Web scraping · kworb.net
Stream-driven. Reflects active listening, weighted toward younger US/global audiences.
Billboard Hot 100
Billboard API
Industry-standard. Blends sales, radio, and streams — slower-moving, more durable.
Deezer
Deezer REST API
EU-skewed listener base. Reveals what audiences outside US-centric algorithms play.
Data sources unified
Chart entries normalized
SQLite schema, 6 tables
Strategic findings
↓ THE METHOD
Build the pipeline. Then ignore it. The point was never the code.
The technical work was a means: a clean, joinable dataset across three platforms with no repeating strings and no duplicate IDs. Once that existed, the real work began — sitting with the data and asking what a label exec, an artist's manager, or a Spotify PM would want it to tell them.
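To make "no repeating strings and no duplicate IDs" concrete, here is a minimal sketch of one way to collapse cosmetic variants of an artist name into a single key before assigning IDs. The helper names `artist_key` and `assign_ids` are hypothetical, and the normalization rules (Unicode NFKC, case folding, whitespace squeezing) are an assumption about what "deduped" could mean, not a record of the project's exact cleaning step.

```python
import unicodedata


def artist_key(name: str) -> str:
    """Collapse cosmetic differences so one artist maps to one key."""
    # NFKC folds compatibility characters (for example full-width letters).
    text = unicodedata.normalize("NFKC", name)
    # casefold() is a stricter lower(); split/join squeezes runs of whitespace.
    return " ".join(text.casefold().split())


def assign_ids(names):
    """Give each distinct key one integer ID, in first-seen order."""
    ids = {}
    for name in names:
        ids.setdefault(artist_key(name), len(ids) + 1)
    return ids


# Hypothetical raw strings as they might arrive from three sources.
raw = ["Taylor Swift", "taylor swift ", "Morgan  Wallen", "TAYLOR SWIFT"]
ids = assign_ids(raw)
print(ids)
```

Four raw strings become two entities here, which is the property the joins downstream depend on.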
01
Pivots in ingestion
Spotipy didn't expose chart data → moved to BeautifulSoup against kworb.net. YouTube Music API capped at 40 songs → swapped for Deezer to hit the 100-row floor.
02
Schema as a normalizer
Six tables, including artist-ID lookup tables for each platform — so "Taylor Swift" could be one entity across Spotify, Billboard, and Deezer joins.
03
From SQL to story
Three calculations: artists by song count, where Spotify's top 10 ranked on the other two charts, and cumulative weeks on Billboard. Each one was chosen to surface a different definition of "popular."
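The three steps above can be sketched end to end with an in-memory SQLite database. This is a deliberately simplified, hypothetical two-table schema, not the project's actual six tables. The table names, column names, placeholder artists, and the choice to match songs across platforms by exact title plus artist ID are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE artists (
    artist_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE      -- one row per artist, no repeated strings
);
CREATE TABLE chart_entries (
    platform       TEXT    NOT NULL,    -- 'spotify', 'billboard', or 'deezer'
    position       INTEGER NOT NULL,
    title          TEXT    NOT NULL,
    artist_id      INTEGER NOT NULL REFERENCES artists(artist_id),
    weeks_on_chart INTEGER,             -- filled in for Billboard rows only
    PRIMARY KEY (platform, position)
);
""")
conn.executemany("INSERT INTO artists VALUES (?, ?)",
                 [(1, "Artist A"), (2, "Artist B")])
conn.executemany("INSERT INTO chart_entries VALUES (?, ?, ?, ?, ?)", [
    ("spotify",   1, "Song One",   1, None),
    ("spotify",   2, "Song Two",   1, None),
    ("spotify",   3, "Song Three", 2, None),
    ("billboard", 1, "Song Three", 2, 40),
    ("billboard", 2, "Song One",   1, 3),
    ("deezer",    1, "Song Two",   1, None),
])

# Calculation 1: artists by number of chart entries across all platforms.
song_counts = conn.execute("""
    SELECT a.name, COUNT(*) AS entries
    FROM chart_entries AS c
    JOIN artists AS a ON a.artist_id = c.artist_id
    GROUP BY a.name
    ORDER BY entries DESC, a.name
""").fetchall()

# Calculation 2: where Spotify's top 10 sit on the other two charts.
# LEFT JOIN keeps songs that are missing elsewhere (position comes back NULL).
cross_platform = conn.execute("""
    SELECT s.title, b.position, d.position
    FROM chart_entries AS s
    LEFT JOIN chart_entries AS b
           ON b.platform = 'billboard'
          AND b.title = s.title AND b.artist_id = s.artist_id
    LEFT JOIN chart_entries AS d
           ON d.platform = 'deezer'
          AND d.title = s.title AND d.artist_id = s.artist_id
    WHERE s.platform = 'spotify' AND s.position <= 10
    ORDER BY s.position
""").fetchall()

# Calculation 3: cumulative weeks on Billboard per artist.
weeks = conn.execute("""
    SELECT a.name, SUM(c.weeks_on_chart) AS total_weeks
    FROM chart_entries AS c
    JOIN artists AS a ON a.artist_id = c.artist_id
    WHERE c.platform = 'billboard'
    GROUP BY a.name
    ORDER BY total_weeks DESC
""").fetchall()

print(song_counts, cross_platform, weeks, sep="\n")
```

Even on six placeholder rows, the first and third queries disagree about who leads, which is the tension the findings explore.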
↓ FINDING 01 · VISIBILITY ≠ ENDURANCE
Taylor Swift looks dominant. Morgan Wallen actually is.
On a snapshot of "songs currently charting," Taylor Swift had 34 entries to Morgan Wallen's 9, roughly a 4× lead. But measured by cumulative weeks on Billboard, Wallen had 462 weeks to Swift's 157, a nearly 3× reversal. The same data, two completely different stories about who's "winning."
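The arithmetic behind that reversal is small enough to check by hand. A quick sketch using only the four figures quoted above (the function name `leader_and_ratio` is ours, for illustration):

```python
# Figures as quoted in the finding above.
snapshot_entries = {"Taylor Swift": 34, "Morgan Wallen": 9}
billboard_weeks = {"Taylor Swift": 157, "Morgan Wallen": 462}


def leader_and_ratio(metric):
    """Return the leading artist and how many times larger their value is."""
    ranked = sorted(metric.items(), key=lambda kv: kv[1], reverse=True)
    (leader, top), (_, runner_up) = ranked[0], ranked[1]
    return leader, round(top / runner_up, 2)


snapshot = leader_and_ratio(snapshot_entries)   # 34 / 9
endurance = leader_and_ratio(billboard_weeks)   # 462 / 157
print(snapshot, endurance)
```

The leader flips depending on which dictionary goes in, with no change to the underlying dataset.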
Snapshot — songs on charts now
Spotify + Billboard + Deezer, current week
Endurance — cumulative weeks on Billboard
Across all charting songs by artist
Recency metrics flatter pop. Endurance metrics reward country and rap.
If you're a label deciding who to re-sign, a streaming PM ranking artists for Wrapped, or a brand picking a partner — "songs on the chart this week" will systematically over-credit pop release cycles and under-credit catalog artists. The metric you choose is a strategic decision, not a neutral one.
↓ FINDING 02 · POSITION IS PLATFORM-DEPENDENT
Most of Spotify's top 10 didn't even crack Billboard's or Deezer's top 10.
We took Spotify's top 10 songs (the week of the Tortured Poets Department release) and looked up where they sat on the other two charts. Most landed at 11+ (outside the visible top 10) on Billboard and Deezer. The same songs, ranked by different audiences and different formulas, told stories that barely overlapped.
Spotify's top 10 — where they ranked elsewhere
Lower is better · 11 = outside the top 10
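The "11 = outside the top 10" convention in that chart can be sketched as a simple lookup. This is an illustrative reconstruction with placeholder song labels, not the project's plotting code. The names `rank_elsewhere` and `overlap_count` are hypothetical, and songs are assumed to be already matched across platforms by a shared label.

```python
OUTSIDE_TOP_10 = 11  # sentinel rank for "not in the other chart's top 10"


def rank_elsewhere(reference_top10, other_chart):
    """Map each reference song to its rank on other_chart, capped at 11."""
    positions = {song: i for i, song in enumerate(other_chart[:10], start=1)}
    return {song: positions.get(song, OUTSIDE_TOP_10) for song in reference_top10}


def overlap_count(ranks):
    """How many reference songs also made the other chart's top 10."""
    return sum(1 for rank in ranks.values() if rank <= 10)


# Placeholder data: ordered lists, best position first.
spotify_top = ["A", "B", "C"]
billboard = ["C", "X", "A", "Y"]

ranks = rank_elsewhere(spotify_top, billboard)
print(ranks, overlap_count(ranks))
```

Capping at 11 keeps the plot readable, at the cost of hiding whether a song sat at 12 or at 90 on the other chart.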
A 'chart-topper' in one universe is a mid-tier song in another.
For an A&R team scouting talent: don't rely on a single platform; triangulate across several. For a product team building discovery: cross-platform signals reveal artists with durable global appeal vs. ones riding a single ecosystem's algorithm. For an artist negotiating a deal: which chart is in your contract changes what "success" earns you.
↓ FINDING 03 · THE LONG TAIL HAS A SHORT NECK
10 artists hold ~40% of the top-100 chart real estate across three platforms.
When we counted every song on every chart and grouped by artist, a small handful of names occupied a disproportionate share of available spots. The "discovery" promise of streaming, on this snapshot, looked more like consolidation.
Top 10 artists by chart appearances
Across Spotify · Billboard · Deezer · current week
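A concentration figure like the ~40% above comes from a plain count-and-divide. A minimal sketch, assuming each chart slot has been reduced to a single primary artist (how to credit collaborations is a separate decision this example ignores), and using placeholder artists rather than the project's data:

```python
from collections import Counter


def top_n_share(artist_per_slot, n=10):
    """Fraction of all chart slots held by the n most frequent artists."""
    if not artist_per_slot:
        raise ValueError("need at least one chart slot")
    counts = Counter(artist_per_slot)
    held_by_top = sum(count for _, count in counts.most_common(n))
    return held_by_top / len(artist_per_slot)


# Ten placeholder slots pooled across charts; one artist name per slot.
slots = ["A"] * 4 + ["B"] * 2 + ["C", "D", "E", "F"]
share = top_n_share(slots, n=2)
print(share)
```

On real data the list would hold one entry per chart position across all three platforms, and `n=10` would reproduce the headline share.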
Chart visibility is a winner-take-most market. Design accordingly.
For a streaming product: if your homepage mirrors the charts, you're reinforcing concentration — not solving for discovery. For a label: shelf space is more contested than catalog size suggests. For a policy or research lens: "democratized distribution" hasn't (yet) produced democratized attention.
↓ STRATEGIC IMPLICATIONS
What a product or strategy team should actually do with this.
For product managers
Stop shipping single-source rankings.
Whatever "trending" surface you build will inherit the bias of the chart you pull from. Blend at least two definitions — velocity + endurance — and let users toggle. Transparency is the feature.
For strategists & A&R
Treat 'chart position' as a metric family, not a number.
Ask which chart, which window, which weighting. A snapshot win and a 200-week catalog win signal completely different artist trajectories — and warrant completely different investment.
For researchers & analysts
Pipeline first, then narrative.
The unglamorous work — normalized IDs, joinable schemas, deduped strings — is what made the 'so what?' possible. Without it, you're comparing three apples that turn out to be three different fruits.
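The "blend at least two definitions and let users toggle" suggestion for product managers can be sketched as a weighted score. This is one possible design, not something the project shipped. The function name `blended_ranking`, the max-normalization, and the use of snapshot chart entries as a stand-in for velocity are all assumptions. The numbers are the ones quoted in Finding 01.

```python
def blended_ranking(velocity, endurance, weight=0.5):
    """Rank artists by a blend of two metrics.

    weight=1.0 ranks by velocity only, weight=0.0 by endurance only.
    Each metric is scaled by its own maximum so the two are comparable.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    v_max = max(velocity.values(), default=0) or 1
    e_max = max(endurance.values(), default=0) or 1
    artists = set(velocity) | set(endurance)
    scores = {
        a: weight * velocity.get(a, 0) / v_max
        + (1 - weight) * endurance.get(a, 0) / e_max
        for a in artists
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda a: (-scores[a], a))


# Snapshot entries (velocity proxy) and cumulative Billboard weeks (endurance).
entries = {"Taylor Swift": 34, "Morgan Wallen": 9}
weeks = {"Taylor Swift": 157, "Morgan Wallen": 462}

velocity_only = blended_ranking(entries, weeks, weight=1.0)
endurance_only = blended_ranking(entries, weeks, weight=0.0)
print(velocity_only, endurance_only)
```

Exposing `weight` as a user-facing toggle is what turns the ranking's bias from a hidden default into a visible choice.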
↓ REFLECTION
The brief said "collect data." The work was learning to make it mean something.
What I took from SI-206 wasn't the Python — it was the discipline of refusing to stop at "here is a chart." The visualization is the easy part. The hard part is sitting with three numbers that disagree and figuring out which decision they should change. That's the muscle I bring to every research project now.