“To work through this process in PySpark, we’ll load the stats dataset into a dataframe, expose it as a view, and calculate the summary statistics.”
Why is a view needed?
Also, in this chapter we read the CSV file into a DataFrame. Does that mean it will not be lazily loaded?
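For context, here is a minimal sketch of the steps the quoted sentence describes, as I understand them. The function name `summarize_stats`, the default view name `stats`, and the CSV options are my own assumptions and are not taken from the lesson. Because I don't know the dataset's columns, the query only counts rows. As far as I can tell, the view is only needed so that a SQL string has a table name to refer to, and most of the work is still deferred until an action such as `collect()` runs.

```python
def summarize_stats(csv_path, view_name="stats"):
    """Load a CSV, expose it as a temp view, and run one summary query.

    Hypothetical helper for illustration; not the lesson's actual code.
    """
    # Imported inside the function so this file can be loaded without Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("stats-sketch").getOrCreate()

    # Reading is mostly lazy, but header=True and inferSchema=True make
    # Spark look at the file up front to find column names and types.
    df = spark.read.csv(csv_path, header=True, inferSchema=True)

    # The temp view just gives the DataFrame a name that SQL can use.
    # No data is copied or computed at this point.
    df.createOrReplaceTempView(view_name)

    # Still lazy: spark.sql only builds a query plan.
    counts = spark.sql(f"SELECT COUNT(*) AS n FROM {view_name}")

    # collect() is an action, so the file is fully processed here.
    return counts.collect()[0]["n"]
```

If this is right, the same count could be obtained with `df.count()` and no view at all, which is part of why I'm asking what the view adds.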
Course: https://www.educative.io/collection/10370001/6068402050301952
Lesson: https://www.educative.io/collection/page/10370001/6068402050301952/6308607927779328