{"id":19389,"date":"2021-03-31T09:57:40","date_gmt":"2021-03-31T15:57:40","guid":{"rendered":"https:\/\/www.fullcontact.com\/?p=19389"},"modified":"2022-08-16T16:45:38","modified_gmt":"2022-08-16T22:45:38","slug":"resolve-transition-to-scylladb","status":"publish","type":"post","link":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/","title":{"rendered":"Improving the Graph: Transition to ScyllaDB"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">In 2020, FullContact launched our <\/span><a href=\"https:\/\/www.fullcontact.com\/products\/resolve\/\"><span style=\"font-weight: 400;\">Resolve<\/span><\/a><span style=\"font-weight: 400;\"> product, backed by Cassandra. Initially, we were eager to move from our historical database HBase to Cassandra with its promises for scalability, high availability, and low latency on commodity hardware. However, we could never run our internal workloads as fast as we wanted &#8212; Cassandra didn\u2019t seem to live up to expectations. Early on, we had a testing goal of hitting 1000 queries per second, and then soon after 10x-ing that to 10,000 queries per second through the API. We couldn\u2019t get to that second goal due to Cassandra, even after lots of tuning.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Late last year, a small group of engineers at FullContact tried out ScyllaDB to replace Cassandra after hearing about it from one of our DevOps engineers. 
If you haven\u2019t heard about <\/span><a href=\"https:\/\/www.scylladb.com\/\"><span style=\"font-weight: 400;\">ScyllaDB<\/span><\/a><span style=\"font-weight: 400;\"> before, I encourage you to check it out &#8212; it\u2019s Cassandra-compatible, written in C++, promising <\/span><b>big<\/b><span style=\"font-weight: 400;\"> performance improvements.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In this post, we explore our experience, starting from a hackathon and ending with our transition from Cassandra to ScyllaDB. The primary benchmark we use for performance testing is how many queries per second we can run through the API. While it\u2019s helpful to measure a database by reads and writes per second, our database is only as good as what our API can send its way, and vice versa.\u00a0<\/span><\/p>\n<h2>The Problem with Cassandra<\/h2>\n<p><span style=\"font-weight: 400;\">Our Resolve Cassandra cluster is relatively small: three c5.2xlarge EC2 instances, each with 2 TB of gp2 EBS storage. This cluster is relatively inexpensive and, apart from being limited primarily by EBS volume throughput (250MB\/s), it gave us sufficient scale to launch Resolve. Using EBS as storage also lets us increase the size of the volumes to gain storage space without needing to redeploy or rebuild the database. Three nodes may be sufficient for now, but if we\u2019re running low on disk, we can add a terabyte or two to each node while running and keep the same cluster.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">After several production customer runs and some large internal batch loads, our Cassandra Resolve tables grew from hundreds of thousands to millions and soon to over a hundred million rows. 
While we load-tested Cassandra before release and could sustain 1000 API calls per second from one Kubernetes pod, that was against a mostly empty database, or at least one holding only a relatively small data set (a few million identifiers at most).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">With both customers calling our production Resolve API and internal loads at 1000\/second, we saw API latencies starting to creep up: 100ms, 200ms, and 300ms under heavy load. For us, this is too slow. And under exceptionally heavy load for this cluster, we were seeing more and more often the dreaded:<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">DriverTimeoutException: Query timed out after PT2S<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">coming from the Cassandra Driver.<\/span><\/p>\n<h2>Cassandra Tuning<\/h2>\n<p><span style=\"font-weight: 400;\">One of the first areas we found to gain performance had to do with Compaction Strategies &#8212; the way Cassandra manages the size and number of backing SS Tables. We used the <\/span><a href=\"https:\/\/cassandra.apache.org\/doc\/latest\/operating\/compaction\/stcs.html\"><span style=\"font-weight: 400;\">Size Tiered Compaction Strategy<\/span><\/a><span style=\"font-weight: 400;\"> &#8212; the default setting, designed for \u201cgeneral use,\u201d and insert heavy operations. This compaction strategy caused us to end up with single SS Tables larger than several gigabytes. This means on reads, for any SS Tables that get through the bloom filter, Cassandra is iterating through many large SS Tables, reading them sequentially. Doing this at thousands of queries per second means we were quite easily able to max out the EBS disk throughput, given sufficient traffic. 2 TB EBS volumes attached to a c5.2xlarge max out at a speed of ~250MB\/s. From the Cassandra nodes, it was difficult to see any bottlenecks or why we saw timeouts. 
However, it was soon evident in the EC2 console that the EBS write throughput was pegged at 250MB\/s, while memory and CPU were well below their maximums. Additionally, because we were doing large reads and writes concurrently, huge files were being read while background compaction put additional stress on the drives by continuously bucketing SS Tables into different size tiers.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We ended up moving to <\/span><a href=\"https:\/\/cassandra.apache.org\/doc\/latest\/operating\/compaction\/lcs.html\"><span style=\"font-weight: 400;\">Leveled Compaction Strategy<\/span><\/a><span style=\"font-weight: 400;\">:\u00a0<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">alter table mytable WITH compaction = { 'class' : \r\n'LeveledCompactionStrategy'};<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">After an hour or two, once Cassandra had finished shuffling data around into smaller SS Tables, we were again able to handle a reasonably heavy workload.<\/span><\/p>\n<p>Weeks after updating the table\u2019s compaction strategies, Cassandra (having so many small SS Tables) struggled to run as fast with heavy read operations. We realized that the database likely needed more heap to run the bloom filtering in a reasonable amount of time. 
Once we doubled the heap in<\/p>\n<pre><span style=\"font-weight: 400;\">\/opt\/cassandra\/env.sh<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<pre><span style=\"font-weight: 400;\">MAX_HEAP_SIZE=\"8G\"<\/span>\r\n\r\n<span style=\"font-weight: 400;\">HEAP_NEWSIZE=\"3G\"<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">followed by a Cassandra service restart, one instance at a time, the cluster was back to performing more like it did when it was smaller, handling up to a few thousand API calls per second.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Finally, we looked at tuning the size of the SS Tables to make them even smaller than the 160MB default. In the end, we did seem to get a marginal performance boost after updating the size to something around 8MB. However, we still couldn\u2019t get more than about 3,000 queries per second through the Cassandra database before we\u2019d reach timeouts again. It continued to feel like we were approaching the limits of what Cassandra could do.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">alter table mytable WITH compaction = { 'class' : \r\n'LeveledCompactionStrategy', 'sstable_size_in_mb' : 80 };<\/span><\/pre>\n<h2>Enter ScyllaDB<\/h2>\n<p><span style=\"font-weight: 400;\">After several months of seeing our Cassandra cluster needing frequent tuning (or more tuning than we\u2019d like), we happened to hear about <\/span><a href=\"https:\/\/www.scylladb.com\/\"><span style=\"font-weight: 400;\">ScyllaDB<\/span><\/a><span style=\"font-weight: 400;\">. 
From their website: \u201cWe reimplemented Apache Cassandra from scratch using C++ instead of Java to increase raw performance, better utilize modern multi-core servers and minimize the overhead to DevOps.\u201d<\/span><\/p>\n<p><a href=\"https:\/\/www.scylladb.com\/scylla-vs-cassandra\/#performance-tab\"><span style=\"font-weight: 400;\">This overview<\/span><\/a><span style=\"font-weight: 400;\"> comparing ScyllaDB and Cassandra was enough to convince us to give it a shot, especially since it \u201cprovides the same CQL interface and queries, the same drivers, even the same on-disk SSTable format, but with a modern architecture.\u201d<\/span><\/p>\n<p><span style=\"font-weight: 400;\">With ScyllaDB billing itself as a drop-in replacement for Cassandra promising MUCH better performance on the same hardware, it sounded almost too good to be true!<\/span><\/p>\n<p><span style=\"font-weight: 400;\">As we\u2019ve explored in our <\/span><a href=\"https:\/\/www.fullcontact.com\/blog\/2021\/03\/11\/redriving-the-databus\/\"><span style=\"font-weight: 400;\">previous Resolve blog<\/span><\/a><span style=\"font-weight: 400;\">, our database is primarily populated by loading SS Tables built offline using Spark on EMR. Our initial attempt to load a ScyllaDB database with the same files as our current production database left us a bit disappointed. Loading all the files to a fresh ScyllaDB cluster required us to rebuild them with an older version of the Cassandra driver to force it to generate files using an older format.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">After talking to the folks at ScyllaDB, we learned that it doesn\u2019t support Cassandra\u2019s latest MD file format. 
However, you can rename the .md files to .mc, and this will supposedly allow these files to be read by ScyllaDB.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Once we were able to get SS Tables loaded, we ran into another performance issue: starting the database in a reasonable amount of time. 
On Cassandra, when you copy files to each node in the cluster and start it, the database starts up within a few seconds. In ScyllaDB, after copying files and restarting the ScyllaDB service, it would take hours for larger tables to be re-compacted, shuffled, and ready to go. With a replication factor of 3 on a 3-node cluster, and all the files copied to every node, our thinking was that the data shouldn\u2019t need to be transformed at all.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Once data was loaded, we were finally able to properly load test our APIs! And guess what? <\/span><i><span style=\"font-weight: 400;\">We were finally able to hit 10,000 queries per second relatively easily!<\/span><\/i><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-19392\" src=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image7-1.png\" alt=\"\" width=\"1430\" height=\"522\" srcset=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image7-1.png 1430w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image7-1-300x110.png 300w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image7-1-1024x374.png 1024w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image7-1-768x280.png 768w\" sizes=\"auto, (max-width: 1430px) 100vw, 1430px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Grafana dashboard showing our previous maximum from 13:30 &#8211; 17:30 running around 3,000 queries\/second. 
We were able to hit 5,000, 7,500, and over 10,000 queries per second with a loaded ScyllaDB cluster.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We\u2019ve been very pleased with ScyllaDB\u2019s performance out-of-the-box, achieving double the 10,000 queries per second goal we set last year and peaking at over 20,000 requests per second, <\/span><i><span style=\"font-weight: 400;\">all while keeping our 98th percentile under 50ms<\/span><\/i><span style=\"font-weight: 400;\">!\u00a0 And best of all &#8212; this is all out-of-the-box performance! No JVM or other tuning required! (The brief blips near 17:52, 17:55, and 17:56 are due to our load generator changing Kafka partitioning assignments as more load consumers are added.)<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-19393\" src=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image1.png\" alt=\"\" width=\"1667\" height=\"454\" srcset=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image1.png 1667w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image1-300x82.png 300w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image1-1024x279.png 1024w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image1-768x209.png 768w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image1-1536x418.png 1536w\" sizes=\"auto, (max-width: 1667px) 100vw, 1667px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-19394\" src=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image8.png\" alt=\"\" width=\"790\" height=\"415\" srcset=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image8.png 790w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image8-300x158.png 300w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image8-768x403.png 768w\" sizes=\"auto, (max-width: 790px) 
100vw, 790px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">In addition to the custom dashboards we have from the API point of view, ScyllaDB conveniently ships <\/span><span style=\"font-weight: 400;\"><a href=\"https:\/\/www.scylladb.com\/2016\/11\/22\/scylla-monitoring\/\">Prometheus metric support<\/a><\/span><span style=\"font-weight: 400;\">\u00a0and lets us install their <\/span><a href=\"https:\/\/docs.scylladb.com\/operating-scylla\/monitoring\/\"><span style=\"font-weight: 400;\">Grafana dashboards<\/span><\/a><span style=\"font-weight: 400;\"> easily to monitor our clusters with minimal effort.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">OS metrics dashboard from ScyllaDB:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-19395\" src=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image9.png\" alt=\"\" width=\"1471\" height=\"976\" srcset=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image9.png 1471w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image9-300x199.png 300w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image9-1024x679.png 1024w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image9-768x510.png 768w\" sizes=\"auto, (max-width: 1471px) 100vw, 1471px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">ScyllaDB Advanced Dashboard:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-19396\" src=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image2.png\" alt=\"\" width=\"1468\" height=\"932\" srcset=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image2.png 1468w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image2-300x190.png 300w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image2-1024x650.png 1024w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image2-768x488.png 768w\" sizes=\"auto, 
(max-width: 1468px) 100vw, 1468px\" \/><\/p>\n<h2>Offline SS Tables to Cassandra Streaming<\/h2>\n<p><span style=\"font-weight: 400;\">After doing some quick math that factored in ScyllaDB\u2019s need to recompact and reshuffle all the data loaded from offline SS Tables, we realized it would be faster to rework the database build and replace it with streaming inserts straight into the database using the <\/span><a href=\"https:\/\/github.com\/datastax\/spark-cassandra-connector\"><span style=\"font-weight: 400;\">spark-cassandra-connector<\/span><\/a><span style=\"font-weight: 400;\">.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In reality, rebuilding a database offline isn\u2019t the primary use case that\u2019s run regularly. Still, it is a useful tool for large schema changes and large internal data changes. Given that, and the fact that our SS Table build ultimately has SS Tables being written to a single executor, we\u2019ve since abandoned the offline SS Table build process.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We\u2019ve updated our Airflow DAG to stream directly to a fresh ScyllaDB cluster:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-19397\" src=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image4-1.png\" alt=\"\" width=\"1999\" height=\"344\" srcset=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image4-1.png 1999w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image4-1-300x52.png 300w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image4-1-1024x176.png 1024w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image4-1-768x132.png 768w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image4-1-1536x264.png 1536w\" sizes=\"auto, (max-width: 1999px) 100vw, 1999px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Version 1 of our Database Rebuild process, building SS Tables 
offline.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Updated version 2 looks very similar, but it streams data directly to ScyllaDB:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-19398\" src=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image5.png\" alt=\"\" width=\"1999\" height=\"392\" srcset=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image5.png 1999w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image5-300x59.png 300w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image5-1024x201.png 1024w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image5-768x151.png 768w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image5-1536x301.png 1536w\" sizes=\"auto, (max-width: 1999px) 100vw, 1999px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Conveniently the code is pretty straightforward as well:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">We create a spark config and session:<\/span><\/li>\n<\/ol>\n<pre><span style=\"font-weight: 400;\">val sparkConf = super.createSparkConfig()<\/span>\r\n<span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0.set(\"spark.cassandra.connection.host\", \r\ncassandraHosts)<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\/\/ any other settings we need\/want to set, \r\nconsistency level, throughput limits, etc.<\/span>\r\n\r\n<span style=\"font-weight: 400;\">val session = \r\nSparkSession.builder().config(sparkConf).getOrCreate()<\/span>\r\n\r\n<span style=\"font-weight: 400;\">val records = session.read<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0.parquet(inputPath)<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0.as[ResolveRecord]<\/span>\r\n<span style=\"font-weight: 
400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0.cache()<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">2. For each table we need to populate, we map each record to a case class matching the table schema and save it under the correct table name and keyspace:<\/span><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">records<\/span>\r\n\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\/\/ map to a row<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0.map(row =&gt; TableCaseClass(id1, id2, \u2026.))<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0.toDF()<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\/\/ a DataFrameWriter is needed before format\/options\/mode\/save<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0.write<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0.format(\"org.apache.spark.sql.cassandra\")<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0.options(Map(\"keyspace\" -&gt; keyspace, \"table\" -&gt; \r\n\"mappingtable\"))<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0.mode(SaveMode.Append)<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\/\/ stream to ScyllaDB<\/span>\r\n<span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0.save()<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">With some trial and error, we have found the sweet spot for the number and size of EMR EC2 nodes: for our data sets, an 8-node c5.large cluster was able to keep the load running as fast as the EBS drives could handle without running into more timeout issues.<\/span><\/p>\n<h2>Cassandra and ScyllaDB Performance Comparison<\/h2>\n<div id=\"attachment_19399\" style=\"width: 523px\" class=\"wp-caption alignnone\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-19399\" class=\"size-full wp-image-19399\" 
src=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image3-1.png\" alt=\"\" width=\"513\" height=\"220\" srcset=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image3-1.png 513w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image3-1-300x129.png 300w\" sizes=\"auto, (max-width: 513px) 100vw, 513px\" \/><p id=\"caption-attachment-19399\" class=\"wp-caption-text\">Our Cassandra cluster under heavy load<\/p><\/div>\n<p>&nbsp;<\/p>\n<div id=\"attachment_19400\" style=\"width: 537px\" class=\"wp-caption alignnone\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-19400\" class=\"size-full wp-image-19400\" src=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image6-1.png\" alt=\"\" width=\"527\" height=\"223\" srcset=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image6-1.png 527w, https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image6-1-300x127.png 300w\" sizes=\"auto, (max-width: 527px) 100vw, 527px\" \/><p id=\"caption-attachment-19400\" class=\"wp-caption-text\">Our ScyllaDB cluster on the same hardware, with the same type of traffic<\/p><\/div>\n<p><span style=\"font-weight: 400;\">The top graph shows the queries per second (white line; right Y-axis) we were able to push through our Cassandra cluster before we encountered timeout issues, with API latency measured at the mean, 95th, and 98th percentiles (blue, green, and red, respectively; left Y-axis). With ScyllaDB, we could push through about <\/span><b>7 times<\/b><span style=\"font-weight: 400;\"> the number of queries per second while dropping the 98th percentile latency from around 2 seconds to 15 milliseconds!<\/span><\/p>\n<h2>Next Steps<\/h2>\n<p><span style=\"font-weight: 400;\">As our data continues to grow, we continue to look for efficiencies around data loading. 
A few areas we are currently evaluating:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Using ScyllaDB Migrator to load Parquet straight to ScyllaDB, using ScyllaDB\u2019s partition-aware driver<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Exploring i3-class EC2 nodes<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Network efficiencies with batching rows and compression, on the Spark side<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Exploring more, smaller instances for cluster setup<\/span><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>In 2020, FullContact launched our Resolve product, backed by Cassandra. Initially, we were eager to move from our historical database HBase to Cassandra with its promises for scalability, high availability, and low latency on commodity hardware. 
However, we could never run our internal workloads as fast as we wanted &#8212; Cassandra didn\u2019t seem to live [&hellip;]<\/p>\n","protected":false},"author":115,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_improvement_type_select":"improve_an_existing","_thumb_yes_seoaic":false,"_frame_yes_seoaic":false,"seoaic_generate_description":"","seoaic_improve_instructions_prompt":"","seoaic_rollback_content_improvement":"","seoaic_idea_thumbnail_generator":"","thumbnail_generated":false,"thumbnail_generate_prompt":"","seoaic_article_description":"","seoaic_article_subtitles":[],"footnotes":""},"categories":[656],"tags":[5989,5990,5991,5992,5673,5661,674,390,50],"class_list":["post-19389","post","type-post","status-publish","format-standard","hentry","category-engineering","tag-scylladb","tag-c","tag-kubernetes","tag-ss-tables","tag-resolve-api","tag-scylla","tag-resolve","tag-cassandra","tag-api"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.1 (Yoast SEO v27.1.1) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Improving the Graph: Transition to ScyllaDB | FullContact<\/title>\n<meta name=\"description\" content=\"In 2020, FullContact launched our Resolve product, backed by Cassandra. 
Initially, we were eager to move from our historical database HBase to Cassandra\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Resolve: Transition to ScyllaDB\" \/>\n<meta property=\"og:description\" content=\"In this blog, we explore the reasons behind our transition to Scylla from Cassandra.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/\" \/>\n<meta property=\"og:site_name\" content=\"FullContact\" \/>\n<meta property=\"article:published_time\" content=\"2021-03-31T15:57:40+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-08-16T22:45:38+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/Engineering-March2Scylla-blog-li.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"630\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Nathan Pensack-Rinehart\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:title\" content=\"Resolve: Transition to ScyllaDB\" \/>\n<meta name=\"twitter:description\" content=\"In this blog, we explore the reasons behind our transition to Scylla from Cassandra.\" \/>\n<meta name=\"twitter:creator\" content=\"@fullcontact\" \/>\n<meta name=\"twitter:site\" content=\"@fullcontact\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Nathan Pensack-Rinehart\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/\"},\"author\":{\"name\":\"Nathan Pensack-Rinehart\",\"@id\":\"https:\/\/www.fullcontact.com\/#\/schema\/person\/db7f8de0ef68cd75e9d41158ce8b25ee\"},\"headline\":\"Improving the Graph: Transition to ScyllaDB\",\"datePublished\":\"2021-03-31T15:57:40+00:00\",\"dateModified\":\"2022-08-16T22:45:38+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/\"},\"wordCount\":1791,\"publisher\":{\"@id\":\"https:\/\/www.fullcontact.com\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image7-1.png\",\"keywords\":[\"ScyllaDB\",\"C++\",\"Kubernetes\",\"SS Tables\",\"Resolve API\",\"Scylla\",\"Resolve\",\"cassandra\",\"API\"],\"articleSection\":[\"Engineering\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/\",\"url\":\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/\",\"name\":\"Improving the Graph: Transition to ScyllaDB | 
FullContact\",\"isPartOf\":{\"@id\":\"https:\/\/www.fullcontact.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image7-1.png\",\"datePublished\":\"2021-03-31T15:57:40+00:00\",\"dateModified\":\"2022-08-16T22:45:38+00:00\",\"description\":\"In 2020, FullContact launched our Resolve product, backed by Cassandra. Initially, we were eager to move from our historical database HBase to Cassandra\",\"breadcrumb\":{\"@id\":\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#primaryimage\",\"url\":\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image7-1.png\",\"contentUrl\":\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image7-1.png\",\"width\":1430,\"height\":522},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.fullcontact.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Improving the Graph: Transition to ScyllaDB\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.fullcontact.com\/#website\",\"url\":\"https:\/\/www.fullcontact.com\/\",\"name\":\"FullContact\",\"description\":\"Relationships, 
reimagined.\",\"publisher\":{\"@id\":\"https:\/\/www.fullcontact.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.fullcontact.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.fullcontact.com\/#organization\",\"name\":\"FullContact\",\"url\":\"https:\/\/www.fullcontact.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.fullcontact.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2019\/11\/fc-logo@2x.png\",\"contentUrl\":\"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2019\/11\/fc-logo@2x.png\",\"width\":200,\"height\":38,\"caption\":\"FullContact\"},\"image\":{\"@id\":\"https:\/\/www.fullcontact.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/fullcontact\",\"https:\/\/www.linkedin.com\/company\/fullcontact-inc-\",\"https:\/\/www.youtube.com\/user\/FullContactAPI\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.fullcontact.com\/#\/schema\/person\/db7f8de0ef68cd75e9d41158ce8b25ee\",\"name\":\"Nathan Pensack-Rinehart\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.fullcontact.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/f0feafea0610500024c73036de213bb244aae0ba84513647d6e7fdbd7a20444c?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/f0feafea0610500024c73036de213bb244aae0ba84513647d6e7fdbd7a20444c?s=96&d=mm&r=g\",\"caption\":\"Nathan Pensack-Rinehart\"},\"url\":\"https:\/\/www.fullcontact.com\/blog\/author\/nathan-pensack-rinehart\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"Improving the Graph: Transition to ScyllaDB | FullContact","description":"In 2020, FullContact launched our Resolve product, backed by Cassandra. Initially, we were eager to move from our historical database HBase to Cassandra","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/","og_locale":"en_US","og_type":"article","og_title":"Resolve: Transition to ScyllaDB","og_description":"In this blog, we explore the reasons behind our transition to Scylla from Cassandra.","og_url":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/","og_site_name":"FullContact","article_published_time":"2021-03-31T15:57:40+00:00","article_modified_time":"2022-08-16T22:45:38+00:00","og_image":[{"width":1200,"height":630,"url":"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/Engineering-March2Scylla-blog-li.png","type":"image\/png"}],"author":"Nathan Pensack-Rinehart","twitter_card":"summary_large_image","twitter_title":"Resolve: Transition to ScyllaDB","twitter_description":"In this blog, we explore the reasons behind our transition to Scylla from Cassandra.","twitter_creator":"@fullcontact","twitter_site":"@fullcontact","twitter_misc":{"Written by":"Nathan Pensack-Rinehart","Est. 
reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#article","isPartOf":{"@id":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/"},"author":{"name":"Nathan Pensack-Rinehart","@id":"https:\/\/www.fullcontact.com\/#\/schema\/person\/db7f8de0ef68cd75e9d41158ce8b25ee"},"headline":"Improving the Graph: Transition to ScyllaDB","datePublished":"2021-03-31T15:57:40+00:00","dateModified":"2022-08-16T22:45:38+00:00","mainEntityOfPage":{"@id":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/"},"wordCount":1791,"publisher":{"@id":"https:\/\/www.fullcontact.com\/#organization"},"image":{"@id":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#primaryimage"},"thumbnailUrl":"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image7-1.png","keywords":["ScyllaDB","C++","Kubernetes","SS Tables","Resolve API","Scylla","Resolve","cassandra","API"],"articleSection":["Engineering"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/","url":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/","name":"Improving the Graph: Transition to ScyllaDB | FullContact","isPartOf":{"@id":"https:\/\/www.fullcontact.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#primaryimage"},"image":{"@id":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#primaryimage"},"thumbnailUrl":"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image7-1.png","datePublished":"2021-03-31T15:57:40+00:00","dateModified":"2022-08-16T22:45:38+00:00","description":"In 2020, FullContact launched our Resolve product, backed by Cassandra. 
Initially, we were eager to move from our historical database HBase to Cassandra","breadcrumb":{"@id":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#primaryimage","url":"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image7-1.png","contentUrl":"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2021\/03\/image7-1.png","width":1430,"height":522},{"@type":"BreadcrumbList","@id":"https:\/\/www.fullcontact.com\/blog\/engineering\/resolve-transition-to-scylladb\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.fullcontact.com\/"},{"@type":"ListItem","position":2,"name":"Improving the Graph: Transition to ScyllaDB"}]},{"@type":"WebSite","@id":"https:\/\/www.fullcontact.com\/#website","url":"https:\/\/www.fullcontact.com\/","name":"FullContact","description":"Relationships, 
reimagined.","publisher":{"@id":"https:\/\/www.fullcontact.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.fullcontact.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.fullcontact.com\/#organization","name":"FullContact","url":"https:\/\/www.fullcontact.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.fullcontact.com\/#\/schema\/logo\/image\/","url":"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2019\/11\/fc-logo@2x.png","contentUrl":"https:\/\/www.fullcontact.com\/wp-content\/uploads\/2019\/11\/fc-logo@2x.png","width":200,"height":38,"caption":"FullContact"},"image":{"@id":"https:\/\/www.fullcontact.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/fullcontact","https:\/\/www.linkedin.com\/company\/fullcontact-inc-","https:\/\/www.youtube.com\/user\/FullContactAPI"]},{"@type":"Person","@id":"https:\/\/www.fullcontact.com\/#\/schema\/person\/db7f8de0ef68cd75e9d41158ce8b25ee","name":"Nathan Pensack-Rinehart","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.fullcontact.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/f0feafea0610500024c73036de213bb244aae0ba84513647d6e7fdbd7a20444c?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/f0feafea0610500024c73036de213bb244aae0ba84513647d6e7fdbd7a20444c?s=96&d=mm&r=g","caption":"Nathan 
Pensack-Rinehart"},"url":"https:\/\/www.fullcontact.com\/blog\/author\/nathan-pensack-rinehart\/"}]}},"_links":{"self":[{"href":"https:\/\/www.fullcontact.com\/wp-json\/wp\/v2\/posts\/19389","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.fullcontact.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.fullcontact.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.fullcontact.com\/wp-json\/wp\/v2\/users\/115"}],"replies":[{"embeddable":true,"href":"https:\/\/www.fullcontact.com\/wp-json\/wp\/v2\/comments?post=19389"}],"version-history":[{"count":0,"href":"https:\/\/www.fullcontact.com\/wp-json\/wp\/v2\/posts\/19389\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.fullcontact.com\/wp-json\/wp\/v2\/media?parent=19389"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.fullcontact.com\/wp-json\/wp\/v2\/categories?post=19389"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.fullcontact.com\/wp-json\/wp\/v2\/tags?post=19389"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}