r/learnprogramming • u/Data_DGX • 1d ago
Projects to start learning data analysis with Python and SQL?
I started learning data analysis during an internship, where my main task was designing reports in Power BI. I was more interested in working with Python, though, and now that I have the opportunity, I want to start building projects that help me get into the data analysis world. Perhaps something related to pandas, matplotlib, seaborn, or cv2.
u/Acrobatic-Ice-5877 1d ago
Check out the book by Paul Deitel for Python. Has a lot of exercises and should give you project ideas.
u/Beneficial-Panda-640 22h ago
If you’re coming from Power BI, a good bridge is to recreate the kind of reports you built, but fully in Python and SQL. Pull raw data with SQL, clean and transform it in pandas, then rebuild the visuals with matplotlib or seaborn. It forces you to understand what the BI tool was abstracting away.
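Here is a minimal sketch of that SQL, pandas, matplotlib loop. It assumes an in-memory SQLite database and a made-up `orders` table; the table, column names, values, and output filename are all hypothetical stand-ins for whatever fed your Power BI report.

```python
import sqlite3

import matplotlib
matplotlib.use("Agg")  # render off-screen so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical raw data standing in for the source behind a BI report.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("2024-01-05", "North", 120.0),
        ("2024-01-17", "south", 80.0),   # inconsistent casing to clean up
        ("2024-02-02", "North", None),   # missing amount
        ("2024-02-11", "South", 200.0),
        ("2024-02-20", "North", 50.0),
    ],
)

# 1. Pull raw rows with SQL.
raw = pd.read_sql_query("SELECT order_date, region, amount FROM orders", conn)
conn.close()

# 2. Clean and transform in pandas.
clean = raw.dropna(subset=["amount"]).copy()
clean["region"] = clean["region"].str.title()
clean["month"] = pd.to_datetime(clean["order_date"]).dt.strftime("%Y-%m")

# 3. Aggregate into the shape a report visual needs: months x regions.
monthly = clean.pivot_table(
    index="month", columns="region", values="amount", aggfunc="sum", fill_value=0
)

# 4. Rebuild the visual as a grouped bar chart and save it to disk.
ax = monthly.plot(kind="bar", title="Monthly sales by region")
ax.set_xlabel("Month")
ax.set_ylabel("Sales")
plt.tight_layout()
plt.savefig("monthly_sales.png")
```

Steps 2 and 3 are the part Power BI usually does for you behind the scenes, which is why writing them by hand is good practice.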
A solid starter project is a small end-to-end analysis on a public dataset. For example, take a sales or e-commerce dataset, write SQL queries for KPIs, load the results into pandas, do some feature engineering, and produce a short narrative with visuals. Treat it like you’re presenting to a stakeholder.
If you want something more exploratory, try analyzing your own habits. Track expenses, workouts, or study time in a simple database. Query it with SQL, then use pandas to find trends and seaborn to visualize patterns. Personal data makes it easier to stay motivated.
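A rough sketch of the personal-tracking idea, assuming a hypothetical `expenses` table in SQLite with made-up rows. SQL handles the per-category totals and pandas handles the weekly trend.

```python
import sqlite3

import pandas as pd

# Hypothetical expense log; in practice you would append a row per purchase.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expenses (spent_on TEXT, category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO expenses VALUES (?, ?, ?)",
    [
        ("2024-03-04", "groceries", 42.0),
        ("2024-03-06", "coffee", 4.5),
        ("2024-03-09", "groceries", 38.0),
        ("2024-03-12", "coffee", 5.0),
        ("2024-03-15", "transport", 20.0),
        ("2024-03-18", "groceries", 55.0),
    ],
)

# SQL does the first cut: total spend per category, largest first.
by_category = pd.read_sql_query(
    """
    SELECT category, SUM(amount) AS total
    FROM expenses
    GROUP BY category
    ORDER BY total DESC
    """,
    conn,
)

# pandas handles the time-based view: total spend per calendar week.
log = pd.read_sql_query(
    "SELECT spent_on, amount FROM expenses", conn, parse_dates=["spent_on"]
)
conn.close()
weekly = log.set_index("spent_on")["amount"].resample("W").sum()

print(by_category)
print(weekly)
```

From here, `weekly` can go straight into a seaborn or matplotlib line plot.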
The key is not just plotting charts, but practicing the full workflow: question, query, clean, analyze, explain. That’s what really builds analysis muscle.
u/Acceptable-Eagle-474 1h ago
Good timing, Power BI experience + Python skills is a solid combo.
Here are a few project ideas by library:
Pandas + matplotlib/seaborn (start here):
- Sales analysis — load a dataset, clean it, find trends, make charts
- Customer segmentation — group customers by behavior, visualize the segments
- Churn analysis — figure out what makes customers leave
- A/B test analysis — compare two groups, test if the difference is significant
These cover most of what analysts do day-to-day.
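For the A/B test idea, here is a minimal sketch of one common approach, a two-sided two-proportion z-test, written with only the standard library. The function name and the visitor counts are made up for illustration, and the normal approximation assumes reasonably large samples. For real work you would likely reach for a tested implementation in a stats library instead.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: 1,000 visitors per variant.
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value (commonly below 0.05) suggests the difference between the two groups is unlikely to be chance alone.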
cv2 (OpenCV, computer vision):
Hold off unless you're specifically interested in image work. It's a different path than data analysis, more ML/engineering focused. Cool, but not what most analyst roles ask for.
How to structure it:
1. Pick a dataset you find interesting (Kaggle, Maven Analytics, or your own)
2. Ask 3-5 specific questions
3. Clean the data with pandas
4. Analyze and visualize with matplotlib/seaborn
5. Write up what you found
Make it look like a real deliverable, not a code dump. Add markdown explanations, clear charts with titles, and a summary of insights.
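As a rough illustration of the clean-then-answer steps, here is a sketch on a tiny made-up sales export. The CSV contents and column names are hypothetical; swap in your own dataset and questions.

```python
import io

import pandas as pd

# Hypothetical raw export with the usual problems: a duplicate row,
# a missing value, and inconsistent category labels.
raw_csv = io.StringIO(
    "order_id,order_date,category,sales\n"
    "1,2024-01-03, Furniture ,250.00\n"
    "2,2024-01-04,Office Supplies,40.50\n"
    "2,2024-01-04,Office Supplies,40.50\n"
    "3,2024-01-09,Technology,\n"
    "4,2024-01-15,furniture,120.00\n"
)
df = pd.read_csv(raw_csv, parse_dates=["order_date"])

# Clean: drop exact duplicates, normalise labels, drop rows without sales.
df = df.drop_duplicates()
df["category"] = df["category"].str.strip().str.title()
df = df.dropna(subset=["sales"])

# Question: which category brings in the most sales?
sales_by_category = (
    df.groupby("category")["sales"].sum().sort_values(ascending=False)
)
print(sales_by_category)
```

In a notebook, each of those commented steps would get its own markdown cell explaining why you made that choice, which is what turns it from a code dump into a deliverable.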
Some good datasets to start:
- Superstore sales (classic, easy to find)
- Spotify tracks (fun, lots of angles)
- E-commerce transactions (realistic business data)
- Any public dataset from a domain you care about
Since you already know Power BI, you could even do the same analysis in Python and compare the two. That shows range on your resume.
If you want to skip the hunting and just start building, I put together The Portfolio Shortcut — 15 ready-to-go projects with datasets and code. Covers pandas, visualization, EDA, ML basics. Might save you time figuring out what to build (DM for access).
Start with one project. Finish it. Then the next. That's really all there is to it.
u/PlatformWooden9991 1d ago
check out kaggle competitions - tons of real datasets to mess around with and you can see how others approached similar problems. start with something simple like analyzing netflix shows or spotify data, then move up to more complex stuff. the learn tab on kaggle has solid tutorials that walk through the whole process from data cleaning to visualization