Luca Giovagnoli, Software Engineer
- Oct 25, 2019
At Yelp, we are heavy users of both Spark and Redshift. We’re excited to announce spark-redshift-community, a fork of Databricks’ original spark-redshift project. spark-redshift is a Scala package that uses Amazon S3 to efficiently read data from Amazon Redshift into Spark DataFrames and write it back. Since the original open source effort was abandoned in 2017, the community has struggled to keep dependencies up to date and fix bugs. Progress came to a complete halt with the release of Spark 2.4, which was incompatible with the latest spark-redshift. Developers looking for a solution turned to online threads on websites like Stack Overflow...