I want to get the name, number of stars, and number of reviews of the restaurants with 5 stars and more than 1000 reviews.

  def fiveStarBusinessesSQL():DataFrame = {
    spark.sql("SELECT name, stars, review_count FROM yelpBusinessesView WHERE stars == 5 && review_count >= 1000")
  }

I don't understand why I get the error. It is about as basic as a SQL call can get, IMO.

Here's the error I get:

Exception in thread "main" org.apache.spark.sql.catalyst.parser.ParseException: 
mismatched input 'FROM' expecting <EOF>(line 1, pos 33)

== SQL ==
SELECT name, stars, review_count FROM yelpBusinessesView WHERE stars == 5 && review_count >= 1000
---------------------------------^^^

I'm working on the Yelp Dataset. Here's an example of what's in yelpBusinessesView:

{"business_id":"1SWheh84yJXfytovILXOAQ","name":"Arizona Biltmore Golf Club","address":"2818 E Camino Acequia Drive","city":"Phoenix","state":"AZ","postal_code":"85016","latitude":33.5221425,"longitude":-112.0184807,"stars":3.0,"review_count":5,"is_open":0,"attributes":{"GoodForKids":"False"},"categories":"Golf, Active Life","hours":null}
  • Try changing && to AND. Commented Oct 18, 2019 at 5:46
  • That did the trick. Thanks a lot! Commented Oct 18, 2019 at 6:02
  • Happy to help :) Commented Oct 18, 2019 at 7:39

1 Answer

Use string interpolation when building plain SQL queries.
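A minimal sketch of what this could look like, combined with the `&&` to `AND` change that the comments on the question confirm resolved the parse error. The view and column names come from the question; the helper `fiveStarQuery` and its parameters `stars` and `minReviews` are hypothetical names used only for illustration. The sketch only builds the query text, so it runs without Spark.

```scala
object FiveStarQuery {
  // Hypothetical helper: builds the SQL text with Scala's s-interpolator.
  // Only typed numeric values are interpolated here; interpolating
  // untrusted strings this way would be open to SQL injection.
  def fiveStarQuery(stars: Double, minReviews: Int): String =
    s"""SELECT name, stars, review_count
       |FROM yelpBusinessesView
       |WHERE stars = $stars AND review_count >= $minReviews""".stripMargin

  def main(args: Array[String]): Unit = {
    // With a SparkSession named `spark` in scope (as in the question),
    // the string would be passed on as: spark.sql(fiveStarQuery(5.0, 1000))
    println(fiveStarQuery(5.0, 1000))
  }
}
```

Note that the conditions are joined with the SQL keyword `AND` rather than `&&`, which is the change reported to work in the comments.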

2 Comments

Can you give some more details on exactly how this should be done? It would be helpful for others.
Welcome to S.O. Let me be the first one to hand you some rep.
