Kafka Cluster Setup

From my own experience, setting up a Kafka cluster on AWS throws up a few issues, so I want to highlight them here.

a. First set up the ZooKeeper cluster, say a 3-node cluster. Modify zoo.cfg on each node so that every server is published using its internal AWS IP address (and make sure each node’s myid file matches its server entry).

server.1=<internal aws ip1>:2888:3888

server.2=<internal aws ip2>:2888:3888

server.3=<internal aws ip3>:2888:3888

b. Go to Kafka’s server.properties on each node and set a unique broker.id:

node1 — broker.id 0

node2 — broker.id 1

node3 — broker.id 2

Change the bind address to <internal ip> so that the broker is not accessible from outside.

Change advertised.host.name to <internal ip>

List all ZooKeeper nodes of your cluster under the zookeeper.connect setting, as in the sketch below.
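Pulling the broker settings together, the relevant part of server.properties on node1 would look roughly like this (a sketch using the older host.name/advertised.host.name style properties this post refers to; the IP addresses and the default ports 9092/2181 are placeholders to adjust for your setup):

broker.id=0
host.name=<internal aws ip1>
advertised.host.name=<internal aws ip1>
port=9092
zookeeper.connect=<internal aws ip1>:2181,<internal aws ip2>:2181,<internal aws ip3>:2181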

Start all Kafka nodes and they should form a cluster.

Create a topic, publish a message using kafka-console-producer, and verify that there are no errors.

Vert.x Cluster

Vert.x is an extremely simple event-based, non-blocking library for distributed computing that can be easily embedded in any Java framework of your choice.

For some time now I have been exploring Vert.x. I was looking for a Vert.x cluster sample but did not find any decent example, so I decided to share one with the community.

I modified one of the samples available in the Vert.x examples repository, and I will explain the core parts of it here.

First the cluster has to be configured.

// Configure Hazelcast (the default cluster manager) to use TCP discovery instead of multicast
Config hazelcastConfig = new Config();
hazelcastConfig.getNetworkConfig().getJoin().getTcpIpConfig().addMember("127.0.0.1").setEnabled(true);
hazelcastConfig.getNetworkConfig().getJoin().getMulticastConfig().setEnabled(false);

ClusterManager mgr = new HazelcastClusterManager(hazelcastConfig);
VertxOptions options = new VertxOptions().setClusterManager(mgr);
Vertx.clusteredVertx(options, res -> {
    if (res.succeeded()) {
        vertx = res.result();
        deploy(vertx, context, mode);   // application-specific deployment routine
    }
});

This code can be part of your main function. Instead of creating the configuration programmatically, you can also create a cluster configuration file called “cluster.xml”. It must be on the classpath of your application, so it can be placed inside src/main/resources. I am going to test this application on my local machine, so I am using TCP discovery rather than multicast. Under the hood, Vert.x uses Hazelcast (by default) for all of its clustering capabilities. Hazelcast also comes with a special join configuration for AWS, so if you are going to deploy your application on AWS, that is the configuration you should use.

<?xml version="1.0" encoding="UTF-8"?>
<hazelcast xsi:schemaLocation="http://www.hazelcast.com/schema/config hazelcast-config-3.2.xsd"
           xmlns="http://www.hazelcast.com/schema/config"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <properties>
       .....
        <property name="hazelcast.wait.seconds.before.join">0</property>
    </properties>

    <group>
        <name>dev</name>
        <password>dev-pass</password>
    </group>
    <management-center enabled="false">http://localhost:8080/mancenter</management-center>
    <network>
        <port auto-increment="true" port-count="10000">5701</port>
        <outbound-ports>
            <!--
            Allowed port range when connecting to other nodes.
            0 or * means use system provided port.
            -->
            <ports>0</ports>
        </outbound-ports>
        <join>
            <!--<multicast enabled="false">-->
            <!--<multicast-group>224.2.2.3</multicast-group>-->
            <!--<multicast-port>54327</multicast-port>-->
            <!--</multicast>-->
            <tcp-ip enabled="true">
                <interface>192.168.1.8</interface>
            </tcp-ip>
            <aws enabled="false">
                .....
            </aws>
        </join>
        <interfaces enabled="false">
            <interface>10.10.1.*</interface>
        </interfaces>        
    </network>
    <partition-group enabled="false"/>
    <executor-service name="default">
        <pool-size>16</pool-size>
        <!--Queue capacity. 0 means Integer.MAX_VALUE.-->
        <queue-capacity>0</queue-capacity>
    </executor-service>
    <map name="__vertx.subs">

        <!--
            Number of backups. If 1 is set as the backup-count for example,
            then all entries of the map will be copied to another JVM for
            fail-safety. 0 means no backup.
        -->
        <backup-count>1</backup-count>
      
        <time-to-live-seconds>0</time-to-live-seconds>
        <max-idle-seconds>0</max-idle-seconds>
        <!--
            Valid values are:
            NONE (no eviction),
            LRU (Least Recently Used),
            LFU (Least Frequently Used).
            NONE is the default.
        -->
        <eviction-policy>NONE</eviction-policy>
        <!--
            Maximum size of the map. When max size is reached,
            map is evicted based on the policy defined.
            Any integer between 0 and Integer.MAX_VALUE. 0 means
            Integer.MAX_VALUE. Default is 0.
        -->
        <max-size policy="PER_NODE">0</max-size>
        <!--
            When max. size is reached, specified percentage of
            the map will be evicted. Any integer between 0 and 100.
            If 25 is set for example, 25% of the entries will
            get evicted.
        -->
        <eviction-percentage>25</eviction-percentage>
        <merge-policy>com.hazelcast.map.merge.LatestUpdateMapMergePolicy</merge-policy>
    </map>

    <!-- Used internally in Vert.x to implement async locks -->
    <semaphore name="__vertx.*">
        <initial-permits>1</initial-permits>
    </semaphore>

</hazelcast>

I have included only the important parts of the configuration and left out the rest. Once the cluster configuration is done, all we have to do is create a verticle and deploy it.

import io.vertx.core.AbstractVerticle;
import io.vertx.core.http.HttpMethod;
import io.vertx.core.http.HttpServer;

public class ServerVerticle extends AbstractVerticle {

    int port;
    public ServerVerticle(int port){
        super();
        this.port = port;
    }
    @Override
    public void start() throws Exception {
        super.start();
        HttpServer server = vertx.createHttpServer();
        server.requestHandler(req -> {
            if (req.method() == HttpMethod.GET) {
                req.response().setChunked(true);

                if (req.path().equals("/products")) {
                    vertx.eventBus().<String>send(SpringDemoVerticle.ALL_PRODUCTS_ADDRESS, "", result -> {
                        if (result.succeeded()) {
                            req.response().setStatusCode(200).write(result.result().body()).end();
                        } else {
                            req.response().setStatusCode(500).write(result.cause().toString()).end();
                        }
                    });
                } else {
                    req.response().setStatusCode(200).write("Hello from vert.x").end();
                }

            } else {
                // We only support GET for now
                req.response().setStatusCode(405).end();
            }
        });

        server.listen(port);
    }
}
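For completeness, here is a minimal sketch of a verticle on the other end of the event bus, i.e. the kind of consumer that would answer on SpringDemoVerticle.ALL_PRODUCTS_ADDRESS (in the original sample this is backed by Spring; the address value and the hard-coded JSON below are placeholders):

import io.vertx.core.AbstractVerticle;

public class ProductsVerticle extends AbstractVerticle {

    // Stand-in for SpringDemoVerticle.ALL_PRODUCTS_ADDRESS
    public static final String ALL_PRODUCTS_ADDRESS = "products.all";

    @Override
    public void start() {
        // Register a consumer on the clustered event bus and reply to every request
        vertx.eventBus().<String>consumer(ALL_PRODUCTS_ADDRESS, message ->
            message.reply("[{\"id\":1,\"name\":\"sample-product\"}]")
        );
    }
}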

Great…let’s deploy this verticle.

Once the configuration is done, you deploy the verticle in the cluster:

Vertx.clusteredVertx(options, res -> {
    if (res.succeeded()) {
        Vertx vertx = res.result();
        // Deploy verticles only once the cluster has been initialized
        vertx.deployVerticle(new ServerVerticle(Integer.parseInt(args[0])));
    } else {
        // Cluster formation failed; inspect res.cause()
    }
});

 

Start the application on a port, say 9000, then start another instance on a different port, say 9005, and voila: you can see both of them start communicating. The complete source code can be found at https://github.com/singhmarut/vertx-cluster.git. Happy Coding!!

Why Java is my default choice?

Recently the never-ending language debate has come up in my company.
Needless to say, everybody loves the language/stack they have experience with. So I cannot really argue against other languages, but I can tell you why Java is my default language.

Before I start, let me list the languages I have a good command of today, as far as the server side is concerned:

a. C++, Java, C#/.NET, Scala, Groovy, JavaScript (Node.js), and I am always ready to explore more.

I started my career 12 long years ago as a rookie, working on a project called Rule Engine developed in VC++.

I liked other languages as well, but C++ remained my first love. I was of the view that, compared to other programmers, I should learn the language deep enough to become a master, because I was immensely impressed with the likes of Scott Meyers, Herb Sutter, Andrei Alexandrescu (co-creator of D), Don Box “the COM guy”, and many others. They were all C++ programmers.

Fast forward: I joined a “web start-up” which was creating a web app, and I was asked to lead. We re-architected the whole server side in Java, but again somebody else took responsibility for the front end, because that was “web” according to the C++ programmer inside me.

Java’s performance is only a tad slower than C++, and the difference is negligible for a web application where other aspects of the architecture come into play; and there is no way Java is slower than PHP, Python, Ruby, etc.

So you should choose Java because:

  • It is high performance
  • It has huge ecosystem..So many web frameworks to choose from
  • It is enterprise grade
  • It runs on billions of devices
  • It is stable and will remain a popular language in coming future
  • Much of its ecosystem is well maintained under the Apache Software Foundation
  • No dearth of programmers
  • JVM opens new opportunities everyday
  • Monitoring tools available aplenty
  • It keeps evolving (functional programming in Java 8)

It is very surprising to see that people working in languages like PHP or Python rarely do memory profiling, maybe because they simply do not have comparable tools.

To me Java is a language you can do anything with. You want web programming? You got it.

You want to write a high-performance server? Bring it on. If projects like Storm and Hadoop can rely on Java/JVM, then you can too.

Whatever you can achieve in Python or Ruby, you can do with Java. Maybe you won’t like the syntax, but the vast ecosystem around Java makes sure you are never at a loss when selecting a library or a tool.

And if you know Java then you know the JVM, and if you know the JVM you can choose any language of your choice, like Scala. Don’t like Scala? Try Kotlin. Don’t like that either? Try Groovy, Ceylon, and so on.

So yes, I guess you are in a safe spot with Java, which is extremely well understood, has an abundance of documentation, and has remained one of the top choices of programmers for the last two decades; I do not see its glory fading away anytime soon.

Working with Java is like working for a world-class MNC: a bit old school, but still better than many start-ups, and it delivers what it promises. Very few surprises here.

Happy coding !!

Not so many reasons why C++ sucks…

I coded in C++ for 5 years, so I apologize in advance. This is just my opinion. Do I want to start a language war? You bet.

– Pointers suck
– Memory management sucks
– Function pointers? Yeah, they do
– goto statements suck
– Friend functions suck
– Static initialization sucks

– Macros suck
– Header files suck? If you don’t know them, then don’t bother; you can live without knowing them
– Multiple inheritance sucks
– The diamond problem sucks
– Interfaces? Sorry, C++ doesn’t have them, so they can’t…
– reinterpret_cast? I think it does suck
– Compilation sucks
– Makefiles suck


– Threading? Sorry, what? Oh yeah, use pthreads or Boost. What is Boost? Some other day, man; long story

– Templates? So complex they deserve 10 posts like this, because they don’t just suck, they suck big time. You think you know them? Solve a simple puzzle and see if you knew the answer already:

C++ Inheritance Puzzle

Did I mention it was a simple one? People interested in knowing templates in detail should read Andrei Alexandrescu’s Modern C++ Design.

Correction – By the way, C++ did eventually get great support for threads, which simplifies the job so much: http://en.cppreference.com/w/cpp/thread. Boy, I love this language.

Why foo?

Somebody asked during a Neal Ford presentation at ThoughtWorks why we name everything Foo: class Foo, function Foo.

He could not answer, so here is my take on what other names could have been adopted.

Aoo – Can’t even pronounce
Boo- Pretty romantic doesn’t look like a geeky word
Coo – Don’t know how to pronounce
Doo – Looks like a children’s word, Doo Doo
Eoo – Can’t even pronounce

Goo – sorry doesn’t work for Indians
Hoo – Spreads negativity looks like hooting
Ioo – Can’t even pronounce
Joo – May be let’s keep it for now
Koo – Doesn’t sound good Koo – coo
Loo – Don’t want to go there during a presentation
Moo – Looks like it came straight from a nursery rhyme
Noo – May be let’s keep it
Ooo – What is this? Doesn’t work
Poo – Hmm..who wants to utter it while coding?
Qoo – Can’t even pronounce
Roo – May be let’s keep it for now
Soo – Doesn’t work for Indians at least, and that’s all I am bothered about
Too – looks like counting
Uoo – Can’t even pronounce
Voo – Seems romantic again…need a geeky word
Woo – Same as above
Xoo – Can’t even pronounce
Zoo – Sorry will go there later let me code for now

So we are left with

Foo – Joo – Noo

Out of these, Foo comes first and definitely sounds better than the other two. So Foo wins. Yayyy…

Web Application Scalability Part-3 ES vs Redis

This is the 3rd article in the Web Scalability series. These days I am looking at various projects going on in Indian start-ups, and I am thrilled by the variety of technologies being used.

While this sounds good, at the same time I feel we are unable to utilize the core strengths of the technologies being used.

How they are used in an architecture will always vary from person to person, depending on experience and comfort level. But some simple things, if adopted at an early stage, can give immense benefits.

One such technology is our humble cache, which is being used everywhere, but alas, not in the right way.

It’s called Redis. The kind of benefit this humble piece of software can bring when building a highly scalable website is awesome.

I have seen many startups using Elasticsearch as a “cache”. I cannot imagine anyone doing this unless they are totally nuts. ES is an indexing technology, not a cache. It is only near-real-time, while a cache is expected to reflect updates immediately; it will never be fully in sync with the underlying storage and hence will almost always have stale data.

Using it as the primary data source for anything is simply insane and shows the immaturity of the architect who chose it in the first place. It will not work except for the most trivial projects.

Anyway, ES has its niche, but it is not a cache; Redis is. Redis provides many advanced data structures and can do a hell of a lot in the caching layer itself.

One needs to take a hard look at the various data structures Redis provides and how they can be used to build a “real” caching layer, giving the same output one was hoping to get from ES out of the box.

It has sets/hashes/sorted sets/lists etc. etc.

You can keep putting data in at a rate your site might never experience and retrieve it at blazingly fast speed.

Most importantly, its single-threaded execution model brings the atomicity one needs for building distributed counters, as in the sketch below.
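For instance, a distributed page-view counter needs nothing more than INCR. Here is a minimal sketch using the Jedis client (the key names are purely illustrative):

import redis.clients.jedis.Jedis;

public class RedisCounterExample {
    public static void main(String[] args) {
        // Assumes a Redis instance running on localhost:6379
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // INCR is atomic: Redis runs commands on a single thread,
            // so concurrent clients never lose updates
            long views = jedis.incr("page:home:views");
            System.out.println("Home page views: " + views);

            // A sorted set doubles as a "most popular products" leaderboard
            jedis.zincrby("products:popularity", 1, "product-42");
        }
    }
}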

Think about it, and if you are confused about how it can be used for your use case, then, as always, drop me an email. Who knows, I may find your use case interesting enough to help you out.

Akka – A practical Example

This problem is a real-life scenario from a production application and shows how Akka can provide highly scalable solutions to complicated problems.

Background

1.    Channels are a mechanism for broadcasting messages.
2.    Messages can be sent in a variety of ways, including via SMS.
3.    People follow one or more channels, and receive messages when channels they follow broadcast a message.
4.    Some followers have email, some have SMS, some have both.
5.    Channels can have a phone number so that when SMS messages are sent from a channel, the user can see a phone number for that channel. The user can respond and we know to which channel they are responding because the phone number is associated with that channel.
6.    Phone numbers cost money, and so they are a scarce resource. If they were free, every channel would be assigned a discrete phone number upon creation.
7.    For this reason, more than one channel can use the same number. This is fine as long as no two channels with the same phone number are followed by the same person. If one person is following two channels that both use the same phone number, then when that person sends an SMS message to that shared phone number, we won’t know with which channel they are communicating. When one person is following two channels that both have the same phone number, this is called a “Collision”.
8.    If we don’t allocate enough phone numbers, then a person who is following more than one channel could easily end up with a collision.
9.    Numbers are allocated on demand, that is, the first time that a broadcast for that channel needs to go via SMS.
10.  If phone number X is being used for both Channel A and Channel B, and no followers of either follow the other, then we are fine. As soon as a follower of A starts to follow B, we must change the phone number of one of the channels.
11.  When the phone number of a channel changes, we need to communicate to the channel users that the number has changed. It is very disruptive to change a number, and so we must minimize the number of users that are impacted. For this reason, we must take into consideration the number of recent SMS messages each channel has sent to determine which channel will be impacted the least by a change, and choose that channel to change the number.
Data model
case class Channel(id: UUID, name: String, phoneNumber: String)
case class User(id: UUID, name: String)
case class Following(channelId: UUID, userId: UUID)
case class PhoneNumber(number: String)
Don’t worry about persistence for now. Maybe just use a mutable List to store the collections of objects for now.
You can populate the data store with random data.
If you find that this data model is insufficient, please augment it as needed. Also, this model is missing broadcasts and broadcast statistics. If you need data that is not represented here, as in the case of broadcast statistics, create placeholder functions that return random data for now.
Objective
1.    Build a mechanism in Scala that assigns phone numbers to channels and changes them as needed.
2.    Minimize the number of phone numbers allocated.
3.    Do not allow collisions.
4.    Minimize the impact of a channel phone number change.
5.    Minimize the number of potential channel phone number changes by balancing the phone number allocations such that least used phone numbers are the first to be allocated when a new channel needs a number.
Here is the repository you can download the source code from.
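To give a flavour of the approach before you look at the repository, here is a minimal, hypothetical sketch of an allocator actor, written against Akka’s Java API to stay consistent with the other examples in this post (the objective above asks for Scala). Because an actor processes one message at a time, all phone-number assignments are serialized through a single mailbox and collision checks happen in one place:

import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;

import java.util.UUID;

public class NumberAllocatorActor extends AbstractActor {

    // Request message: "give this channel a phone number"
    public static final class Allocate {
        public final UUID channelId;
        public Allocate(UUID channelId) { this.channelId = channelId; }
    }

    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .match(Allocate.class, msg -> {
                // Placeholder: pick the least-used number that causes no
                // collision among the followers of this channel
                String number = pickLeastUsedNonCollidingNumber(msg.channelId);
                getSender().tell(number, getSelf());
            })
            .build();
    }

    private String pickLeastUsedNonCollidingNumber(UUID channelId) {
        return "+10000000000"; // stub for the real collision-aware selection
    }

    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("allocation");
        ActorRef allocator = system.actorOf(Props.create(NumberAllocatorActor.class), "allocator");
        allocator.tell(new Allocate(UUID.randomUUID()), ActorRef.noSender());
    }
}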

Flipkart – Big Billion day and Scalability problems

Recently Flipkart created quite a buzz in the Indian e-commerce industry. They sold Rs 600 crore worth of goods in a matter of 10 hours, as per their official statement. I want to focus on the tech side, since I fail to understand how they see outages so regularly and what they are doing to fix it; it has happened many times. This time it was more prominent, but I get feedback from a lot of people that they are down every now and then.

Technology today is so advanced that building an e-commerce website is not considered a big deal anymore. India has seen hundreds of e-commerce websites mushrooming over the last few years with the help of investors. You can build a store in a matter of a few days on Shopify, and even if you want to build your own, you don’t have to build a payment gateway, search, cart, or networking middleware: readily available pieces of software will take you a long way. There is hardly any category for which a “specialized” website does not exist.

Coming back to a billion hits in a day served by 5000 servers: that surprised me, so I decided to do a quick analysis of what might have happened, based on my past experience with e-commerce websites. Most of these websites, when they started, did not expect growth at such a rapid pace, so they were built using a web framework, either Java based or PHP based, as a single “monolithic” chunk of software: admin functions, inventory management, order processing, payment gateway, wishlist, search, email campaigns, all built into one big piece of software. Phew. We really can’t blame them, as Twitter was also one big Ruby on Rails application at one time. The difference is that, unlike Twitter, these websites, led by less experienced people, were not able to embrace a distributed architecture, SOA based or otherwise. They kept building classes upon classes in the same codebase and were unable to keep a balance between good software design and features. There are a couple of reasons for that:

a. Most of these websites are led by people in their late 20s or early 30s.

b. Their technical leads and CTOs are far less experienced compared to companies like Amazon, so they are not able to enforce agile practices like TDD or pick the most suitable stack for the job, because they themselves never went through that process.

c. They are afraid of changing things because they are afraid of breaking something which is already working.

d. Inexperienced in tech and generously funded by venture capitalists, they keep hiring people and want more head count to build new features quickly and support them, since they know that a new e-commerce website can be launched pretty easily today.

Anyway, let us focus on a “technical analysis” of the Flipkart issue. They issued a statement that they received 1 billion hits in one day, which comes down to roughly 12,000 hits per second. If we assume that two-thirds of the hits came in the first 8 hours, that works out to about 24,000 hits per second. Spread over 2,000 servers, that is barely a dozen hits per second per server, and today’s state-of-the-art commodity hardware can handle ten times that traffic, provided you don’t write bad software. Amazon’s conversion rate is around 7%, i.e. 7 out of 100 visitors who come to the website end up checking out. So the other 93 are just browsing the products, either using search or by category. The key is to serve this browsing traffic from servers separate from the ones handling other functions.
So SOA is the only solution when a site grows big, but you need to build an infrastructure where you can roll out services and define a separate SLA for each of them. This infrastructure should support creation, consumption, discovery, and seamless deployment. It will take time, experience, and hard work, while making sure every person is on the same page, but it will also reduce tech cost in the long term, provided you are aiming for it. I recently came across an interview with the Flipkart CEO where he claims that their architecture is a well-designed SOA model, just like Amazon’s, but that there are many complex algorithms in their system which kick in only in specific scenarios and do not crop up during regular stress testing.

SOA is easier said than done. The biggest problem is the S (Service). How do you decide that a certain piece of code should be packaged as a separate service and not be part of an existing one? The guiding light, according to me, are the following questions:

a. Do you believe that in future you would want to maintain and evolve this code separately?

b. Do you think it is going to change very frequently, and is it going to be used by other services?

c. Can this piece of code affect the SLA of existing services significantly, i.e. is it a complex piece of algorithm that will make the performance of existing services unpredictable when it is on the execution path?

If the answer to any of these questions is yes, then you should wrap that piece of code in a new service. The problem is that many times, while starting a project, you are not sure about the answers. Some day the business requirements change and you find out that it would have been better to have created a new service, but the code is a problem because it is not designed in a way that lets you easily abstract it out and put it in a new service, so you are doomed. This is exactly the nature of enterprise software, and that is why software experts are always looking for new ways of developing it, either through libraries or by going to the extent of creating a new language, so that problems can be addressed before they even appear.

Functional programming and tools like Play Framework and Akka offer a lot to developers today and can be leveraged to create a true SOA architecture; if their programming paradigm is followed in the true sense, then Akka will help you carve a distributed architecture out of a monolithic piece of code in a much less painful way. One just needs to be brave enough to understand these tools and be ready to take the risk. In my next article I will show an e-commerce use case which does exactly this, i.e. without changing the existing code, without creating a new web service, putting JMS in between, or writing custom networking code, we will create a distributed app out of a monolithic chunk. Happy Coding!!

Web Application Scalability – Service Layer

In the last post I promised that we would talk about how to turn monolithic code into a distributed SOA architecture.

Well, it’s not easy. Once you have decided that you want to re-architect the single chunk of software in a distributed manner, you have to work out which parts of the system can be deployed on different machines while everything still works fine.

There are several problems to solve while doing so:

a. How will you host these services (the S in SOA)?

b. How will the services communicate, and how will serialization/deserialization happen?

c. How do you make sure that you can implement the same workflow you had in the single monolithic component, assuming you were doing things sequentially?

 

To answer the first question, there are many options available, but I will list the ones I have personal experience with.

 

a. Design your services as RESTful and deploy them in your preferred container, like WebLogic

Pros

– Location transparency: you refer to the services using URIs

– Interacting with a RESTful service is pretty easy in any language

Cons

– Serious overhead of HTTP protocol

– Serious maintenance issues with container management, not to mention a huge performance hit

b. Design your services as Java processes and interact with them over JMS using a service bus (ActiveMQ, Fiorano, etc.)

c. Design your services as Thrift-based and interact with them using pre-generated client code

Pros

– Serialization/deserialization is taken care of by Thrift; you just interact with POJOs everywhere

– Lightweight, highly scalable, proven architecture (it originated at Facebook). Forget about converting from JSON to POJO and vice versa

Cons

– Every time the service or schema changes, you need to redistribute the client code

– Schema definition in a Thrift file has some limitations of its own

d. Design your services as separate Java Processes and have a networking library like ZeroMQ take care of communication

Pros

– ZeroMQ is battle tested and used in many mission critical projects. Takes care of networking needs

Cons

– Lower-level programming is needed compared to the other approaches

– Many scenarios will have to be handled by hand with this approach, so the code size will swell up

For a long time I had been dealing with the hub-and-spoke model, using a service bus like MSMQ or ActiveMQ.

A service bus is a popular approach, especially in financial institutions, which invest in enterprise-grade products from vendors like TIBCO.

The problem with the service bus approach is that you have to write code specific to the messaging bus, so another layer is added, which increases complexity.

So for some time I had been looking for a way to remove this layer. Given my experience in the industry, there are some areas I believe a company should not invest in until absolutely necessary, and one such area is networking.

Having said that, one day while learning Scala I came across a project called Akka. Akka is a framework which addresses many of these problems with the simple concept of actors.

Its networking and concurrency features provide an easy way to program local or distributed services. What you get is complete location transparency: you communicate with remote actors in the same way as you communicate with local actors, as the sketch below shows.
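As a rough illustration, here is a minimal sketch using Akka’s classic remoting and its Java API; the system names, host, and port are placeholder assumptions, and remoting has to be enabled in application.conf. The point is that the sending code only needs an actor path, and that path can refer to a local actor or to one running in another JVM:

import akka.actor.ActorRef;
import akka.actor.ActorSelection;
import akka.actor.ActorSystem;

public class LocationTransparencyExample {
    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("client");

        // A local actor would be looked up as "akka://client/user/orderService".
        // A remote one (classic remoting over TCP) differs only in its path:
        ActorSelection orderService =
            system.actorSelection("akka.tcp://backend@10.0.0.12:2552/user/orderService");

        // The sending code is identical either way
        orderService.tell("process-order-42", ActorRef.noSender());
    }
}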

There is no hub-and-spoke model, so no new software has to be installed or maintained, and no bus-specific code has to be written to make things work.

It’s peer-to-peer: networking complexities are hidden deep inside and you never have to deal with threads. Wow! Sounds too good to be true… It is.

It is completely asynchronous, with a programming paradigm that is easy to understand, and the concept of actors has already been proven in the telecom industry.

In the next post we will talk about some sample code for how location transparency can be achieved.

 

 

Web Application Scalability – Part 1

Foundation

I have been thinking about writing about software architecture for a while. There is a lot of confusion among programmers about how to design software at scale. Teams fight about programming languages, frameworks, libraries, tools and what not. So I intend to share my experience in this field, and I will keep writing new articles and improving existing ones.

Software architecture is a field that depends heavily on:

  • your experience
  • how many choices you have made independently
  • whether you learned from the mistakes you made
  • whether you are up to date with the latest technologies
  • knowing the pros and cons of each choice you make

This will be a 3 part series.

Each part will deal with one layer of the web application. We will look at the scalability aspects, the problems faced today, the relevant technologies, and how to utilize them.

A traditional Web Application is made of three layers.

  1. Presentation
  2. Application
  3. Persistence

For more than a decade this architecture has been very popular both in the open source community (Java, Ruby on Rails, PHP) and in proprietary technologies (ASP.NET).

To build this architecture, excellent frameworks are available across different technologies, and it is easier to find talent.

There are two main areas where developers seek help from a typical web framework:

How to make web development easier?

Scale on demand – to be able to handle large traffic.

So if this is such a popular architecture, then what’s the problem? All web frameworks that I have worked with, or that I know of, primarily solve the first problem.

But they fail miserably at answering the second question. The developer is left on his own whenever scalability comes into the picture. Fortunately, large social networking websites have shown us the way forward and given us many tools which, if utilized effectively, can deliver this second aspect called scalability.

 

We will talk about this second point in the series. When we talk about handling traffic here, we are assuming that the hardware available to us is commodity hardware.

My approach will be practical, using as little jargon as possible, so that a programmer with a couple of years of experience can easily understand how to make design and framework choices to achieve this nasty goal called scalability.