I created a Spark DataFrame in Scala using Databricks. After some preprocessing, I ended up with a smaller data subset that fits into memory, so I want to convert it to Pandas and then save it as a CSV file.

The problem is that the DataFrame df, which I worked with in the Scala cells of the Databricks notebook, is not visible in a Python cell.

%python

df.toPandas().to_csv("dbfs:/FileStore/tables/test.csv", header=True, index=False)

How can I make df visible in the Python cell?

  • Probably too good to be true, but: df_py = df.toPandas().to_csv("dbfs:/FileStore/tables/test.csv", header=True, index=False) And then print(df_py)? Commented Jun 20, 2019 at 22:25
  • @Erfan: It does not work. It says that df cannot be found: NameError: name 'df' is not defined. But df exists in the above cell that I executed successfully before. Commented Jun 20, 2019 at 22:26
  • You don't need to export to csv actually, just do: df_py = df.toPandas() Then print(df_py) Commented Jun 20, 2019 at 22:34
  • @Erfan: This should be a Python cell, right? If so, the thing is that df is not visible in a Python cell. Commented Jun 20, 2019 at 22:52
  • Try it in a Spark cell, and after that use df_py in a Python cell. Commented Jun 20, 2019 at 23:02

1 Answer


Call display(df). It usually renders nested structs as well.

Alternatively, register the DataFrame as a temporary view with df.createOrReplaceTempView("dfViewName") and query it in the next cell:

%sql

SELECT * FROM dfViewName
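The same temporary view can also be read from a Python cell, since the cells of one notebook normally share a Spark session. The following is a minimal sketch under a few assumptions: the helper names to_local_dbfs_path and export_view_to_csv are hypothetical, the view is the dfViewName registered above, and the cluster exposes DBFS to local file APIs under /dbfs. pandas writes through the local file system, so a dbfs:/ URI is mapped to that mount first.

```python
def to_local_dbfs_path(dbfs_uri):
    """Map 'dbfs:/x/y' to '/dbfs/x/y'; any other path is returned unchanged.

    Assumption: the driver exposes DBFS under the local /dbfs mount.
    """
    prefix = "dbfs:/"
    if dbfs_uri.startswith(prefix):
        return "/dbfs/" + dbfs_uri[len(prefix):].lstrip("/")
    return dbfs_uri


def export_view_to_csv(spark, view_name, dbfs_uri):
    """Read a temp view registered in a Scala cell and save it as one CSV file.

    Hypothetical helper: `spark` is the session that Databricks predefines
    in Python cells, `view_name` is the name given to createOrReplaceTempView.
    """
    local_path = to_local_dbfs_path(dbfs_uri)
    # toPandas() collects every row to the driver, so the data must fit in memory.
    pandas_df = spark.table(view_name).toPandas()
    pandas_df.to_csv(local_path, header=True, index=False)
    return local_path
```

In a %python cell this would be called as export_view_to_csv(spark, "dfViewName", "dbfs:/FileStore/tables/test.csv"). Going through the view avoids the NameError from the question, because the Scala variable df itself is never referenced from Python.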


3 Comments

display(df) is exactly what I need. Regarding SQL, I think it would be useful if I wanted to use SQL in the next cell, but I wanted to use Python. Since my final goal was just to save a CSV file, display is the right solution.
By the way, which approach would I use to save a DataFrame so that it is accessible in another Databricks notebook on the same cluster?
@Erfan: I wanted pandas for saving the DataFrame as a CSV file. Sorry if that was unclear. Of course, I would appreciate seeing a solution with Pandas. But if that is impossible, then "display" would be a workaround for me.
