Play with SQL and PySpark
2026/3/1
I learned some SQL at university, but I have never used PySpark.
I want to try modern data tools such as Spark and other big data platforms.
Simple goals:
- Install PySpark and run it on my laptop.
- Load a small dataset and query it with SQL.
- Compare running the same SQL on a conventional relational database and on Spark.
- Maybe build a tiny data pipeline or dashboard.
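
As a starting point for the first three goals, here is a minimal sketch that runs one query on an in-memory SQLite database and then on a local Spark session. The `employees` table, the sample rows, and the function names are all made up for illustration. The Spark half assumes PySpark has been installed with `pip install pyspark` and that a Java runtime is available on the laptop.

```python
import sqlite3

# Hypothetical sample data: (name, department, salary)
ROWS = [
    ("Ana", "eng", 120),
    ("Bo", "eng", 100),
    ("Cy", "ops", 90),
]

# The same SQL text is sent to both engines.
QUERY = """
    SELECT department, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
    ORDER BY department
"""


def run_sqlite(rows, query):
    """Run the query on an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)"
        )
        conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


def run_spark(rows, query):
    """Run the same query on a local Spark session."""
    # Imported here so the SQLite half still works without PySpark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[*]").appName("sql-notes").getOrCreate()
    )
    try:
        df = spark.createDataFrame(rows, schema=["name", "department", "salary"])
        # Register the DataFrame under a table name so plain SQL can see it.
        df.createOrReplaceTempView("employees")
        return [tuple(row) for row in spark.sql(query).collect()]
    finally:
        spark.stop()
```

Calling `run_sqlite(ROWS, QUERY)` returns `[('eng', 110.0), ('ops', 90.0)]`. Calling `run_spark(ROWS, QUERY)` should give the same rows once PySpark and Java are set up, which makes it easy to note where the two engines feel different (startup time, setup effort) even when the SQL is identical.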
I do not want a huge project. I just want to understand how PySpark feels in real work, and write down some notes about what is easy and what is painful.
