We built multiple applications based on Cassandra and Spark. During the project we encountered a number of challenges and problems with both technologies, as well as with the Spark-Cassandra-Connector.
In this talk we outline a few of those problems and how we solved them. We also share best practices that turned out to be useful in our projects.
Topics include, but are not limited to:
– Cassandra Bucketing
– Spark Partitioning
– Efficient Queries
– Spark Join With Cassandra Table
– Spark Data Locality