Rust Concurrency

rustFor a long time I have been thinking about writing a sample program in Rust “the” new systems language. I have done coding in C++ for initial 5 years of my career before I moved on completely to Java and recently in one of my products a requirement came up that a low latency high performance component had to be developed.

As I have written by default Java was a default choice as its my first choice anyways. However I realized that this component could not afford non deterministic nature of garbage collector.

So need was to write program where I could have control over exact memory deallocation without worrying about “stop the world” garbage collection. Natural Choice was C++ but programming is all about having fun and I wanted to try out something new and C++ threading support and syntax is not all that great even in C++11.

So I decided to try out Go. but again Go had an issue of garbage collection and same fear of non determinism creeped in.

So time to try out Rust.

Program is simple but can be extended to lot of other scenarios.

One thread keeps spitting out data at some regular intervals. A vector keeps track of generated data.

Other thread keeps ticking at regular intervals (100ms or so) and whenever there are items which have elapsed time greater than a threshold those items are expired. Same as cache TTL.

use std::thread;
    use std::sync::mpsc;
    use std::time::{Duration,Instant};
    use std::collections::HashMap;

   //Define struct
    #[derive(Clone)]
    struct Item {
        created_at: Instant,
        id:i64,
        pub description: String
    }
//Implement Item
    impl Item {

        pub fn new(id: i64,description: String) -> Item {
            Item {
                created_at: Instant::now(),
                id: id,
                description: description
            }
        }

        fn created_at(&self) -> Instant {
            self.created_at
        }

        fn id(&self) -> i64 {
            self.id
        }
    }


    fn main() {
        let (sender, receiver) = mpsc::channel(); //Creat  multiple publisher single receiver channel
        let sender_pop = sender.clone(); //clone sender

        //Create a thread that sends pop every 2 seconds
        thread::spawn(move || {
            //Create infinite loop
            loop {
                thread::sleep(Duration::from_millis(100));
                sender_pop.send(Item::new(-1,String::from("Pop"))).unwrap();
            }
        });

        //Create a thread that keeps sending data every second t
        thread::spawn(move || {
            let mut val = 1;
            //Create infinite loop
            loop {
                val = val + 1;
                sender.send(Item::new(val,String::from("New"))).unwrap();
                thread::sleep(Duration::from_millis(1000));
                //Break out of loop if you want to
//                if val == 10 {
//                    println!("OK, that's enough");
//                    // Exit this loop
//                    break;
//                }
            }
        });
        //Create a mutable vector
        let mut vals: Vec<Item> = Vec::new(); 
        let ttl = 5; //TTL in seconds
        //Receive items in non blocking fashion
        for received in receiver {
            //let item = &received;
            let mut item = &received;
            let newItem: Item  = item.clone();
            match item.description.as_ref(){
                "Pop" => {
                    println!("Pop");
                    vals.retain(|ref x| Instant::now().duration_since(x.created_at).as_secs() < ttl);

                },
                _ => {
                    vals.push(newItem);
                }
            }
        }
    }

That’s it. You have done synchronisation between threads without any race condition. That’s how cool Rust is.

In the next blog we will try to send notification whenever items are expired.

Happy Coding !!

Rust Concurrency

Why Kafka?

Today Many companies in startup world are completely dependent on AWS infrastructure. Its a good strategy since you do not have to manage your own infrastructure and saves you from lot of headache.

Today we will discuss a bit about brokers available in AWS infrastructure. AWS has mainly 2 types of broker offering

a. SQS (Simple queue service) – More like ActiveMQ, RabbitMQ

b. Kinesis (Distributed, fault tolerant, highly scalable message broker) – less features but optimized for ingesting and delivering massive number of events at extremely low latency.

Design of Kinesis is inspired by Linked-in donated Kafka. Linked in processes billions of events per day using Kafka and it’s apache top level project which is being used in many highly scalable architecture.

I want to focus in this post on some of the key differences between Kinesis and Kafka. As stated in the beginning working with AWS infrastructure is a good thing but over-reliance on AWS infrastructure has some major problems.

a. You are vendor locked-in so tomorrow if you want to shift to Digital Ocean or even own infrastructure you will not be able to do so.

b. You are limited by the restrictions put by AWS like how many transactions you can do per unit of time

so, in the light of above 2 points I will try to explain where Kafka should be used instead of RabbitMQ and in-place of Kinesis

RabbitMQ Pros:

  • Simple to install and manage
  • Excellent routing capabilities based on rules
  • Decent performance
  • Cloud Installation also available (CloudAMQP)

Cons:

  • Not-distributed
  • Unable to scale to high loads (due to non-distributed nature)

Kafka Pros:

  • Amazingly fast reads and writes (due to sequential reads and writes only)
  • Does one thing and one thing only i.e. to transfer messages reliably
  • Does provide load balancing of consumers via partitions of topic so real parallel-processing no ifs and buts
  • No restrictions on transaction numbers unlike Kinesis

Cons:

  • Complicated to setup cluster compared to rabbitmq
  • Dependency on Zookeeper
  • No Routing

So bottom line

  • Use RabbitMQ for any simple use case
  • Use Kafka if you want insane scalability and you are ready to put effort in learning kafka topics and partitions
  • Use Kinesis if setting up kafka is not your cup of tea
Kafka Kinesis RabbitMQ
Routing Basic (Topic Based) Basic (Topic Based) Advanced (Exchange based)
Throughput Extremely high Extremely high
Latency Depends on region (Not available in some regions hence Http call) Very low High (Compared to other 2)
Ease of implementation Moderate..but setting up cluster requires effort Moderate (but identifying number of shards can be tough) Easy
Restrictions on transactions None 5 reads per seconds and 1000 write/sec/shard None
Types of applications High throughput High throughput Low to medium throughput

As always drop me an email if still confused about your use case

Happy Coding !!

Why Kafka?

Web Application Scalability – Part 1

Foundation

I have been thinking about writing about software architecture for a while. There is lot of confusion among programmers about how to design software at scale. Teams fight about a programming language/framework/library/tools and what not. So I intend to share my experience in this field and I will keep writing new articles and improvise existing ones.

Software architecture is a field that is highly based on

  • your experience
  • how many choices you made independently
  • if you made any mistakes did you learn from them
  • Are you up-to-date with latest
  • Knowing about pros and cons of each choice you are making

This will be a 3 part series.

Each part will deal with one layer in the Web Application. We will look at aspect of scalability, problems faced today, relevant technologies in today’s date and how to utilize them.

A traditional Web Application is made of three layers.

  1. Presentation
  2. Application
  3. Persistence

For more than a decade this architecture has been very popular in both Open Source community (Java, Ruby on Rails, PHP) and proprietary technologies (Asp.Net)

To achieve this architecture excellent frameworks are available across different technology and its easier to find talent.

From a typical web framework there are two main areas where developers seek help:

How to make web development easier?

Scale on demand – to be able to handle large traffic.

So if this is a popular architecture then what’s the problem?All web frameworks that I have worked with or that I know of primarily solve first problem.

But they fail miserably while answering second question. Developer is left on his own whenever scalability comes into picture. Fortunately large social networking websites have shown us the way forward and given us many tools which if utilized effectively can deliver this second aspect called scalability.

 

We will talk about second point in this series. When we talk about handling traffic here we are assuming that hardware available to us is Commodity hardware.

My approach will be practical by using as less jargons as possible so that a programmer with couple years of experience can easily understand the how he can make design and framework choices to achieve this nasty goal called scalability.

 

Web Application Scalability – Part 1