Flipkart – Big Billion day and Scalability problems

Recently Flipkart created quite a buzz in the Indian e-commerce industry. According to their official statement, they sold Rs 600 crore worth of goods in a matter of 10 hours. I want to focus on the tech side, because I fail to understand why they see outages so regularly and what they are doing to fix them. This time the outage was more prominent, but I hear from a lot of people that the site goes down every now and then.

Today technology is so advanced that building an e-commerce website is not considered a big deal anymore. India has seen hundreds of e-commerce websites mushroom over the last few years with the help of investors. You can build a store in a matter of days on Shopify, and even if you want to build your own, you don't have to build a payment gateway, search, a cart or networking middleware. Already available pieces of software will take you a long way. There is hardly any category for which a “specialized” website does not exist.

Coming back to Flipkart: outages at a billion hits in a day with 5,000 servers surprised me, so I decided to do a quick analysis of what might have happened, based on my past experience with e-commerce websites. Most of these websites did not expect to grow at such a rapid pace when they started, so they were built on a web framework, either Java based or PHP based, as a single “monolithic” chunk of software. Admin functions, Inventory Management, Order Processing, Payment Gateway, WishList, Search and Email Campaigns were all built into one big piece of software. Phew. We really can't blame them, as Twitter was also one big Ruby on Rails application at one time. The difference is that, unlike Twitter, these websites, led by less experienced people in the industry, were not able to embrace a distributed architecture, SOA based or otherwise. They kept adding class after class to the same codebase and were unable to strike a balance between good software design and features. There are a couple of reasons for that.

a. Most of these websites are led by people in their late 20s or early 30s.
b. Their technical leads and CTOs are far less experienced than their counterparts at companies like Amazon. Hence they are not able to enforce agile practices like TDD or pick the most suitable stack for the job, because they never went through that process themselves.
c. They are afraid of changing things because they are afraid of breaking something that is already working.
d. Inexperienced in tech and generously funded by venture capitalists, they keep hiring people and want more head count to build new features quickly and support them, since they know that a new e-commerce website can be launched pretty easily today.

Anyway, let us focus on the “technical analysis” of the Flipkart issue. They issued a statement that they received 1 billion hits in one day, which comes to roughly 12,000 hits per second. If we assume that 2/3 of the hits came in the first 8 hours, the peak rate works out to roughly 23,000 to 24,000 hits per second. Even with only 2,000 servers, that is about 12 hits per second per server, well under 20. Today's state-of-the-art commodity hardware can handle 10 times that traffic, provided you don't write bad software.

Amazon's conversion rate is around 7%, i.e. 7 out of 100 visitors who come to the website end up checking out. The other 93 are just browsing products, either through search or by category. The key is to serve this browsing traffic from different servers than the other functions. So SOA is the only solution when a site grows big, but you need to build an infrastructure where you can roll out services and define a separate SLA for each of them. This infrastructure should support creation, consumption, discovery and seamless deployment of services. However, this takes time, experience and hard work, while making sure everyone is on the same page. It will also reduce tech cost in the long term, provided that is what you are aiming for.
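The load estimate earlier in this section is easy to reproduce. The short Python sketch below redoes the arithmetic with the figures quoted in the post (1 billion hits, 2/3 of them in 8 hours, 2,000 or 5,000 servers, a 7% conversion rate). It assumes traffic is spread perfectly evenly across servers, and it applies the visitor conversion rate directly to hits, which is a rough simplification of mine rather than anything Flipkart or Amazon published.

```python
# Back-of-the-envelope load estimate using the figures quoted in the post.
SECONDS_PER_HOUR = 3600


def hits_per_second(total_hits: float, hours: float) -> float:
    """Average request rate over a window of the given length."""
    return total_hits / (hours * SECONDS_PER_HOUR)


def per_server_rate(rate: float, servers: int) -> float:
    """Rate each server sees if load is spread evenly (an idealisation)."""
    return rate / servers


total_hits = 1_000_000_000                       # "1 billion hits in one day"
daily_avg = hits_per_second(total_hits, 24)      # average over the whole day
peak = hits_per_second(total_hits * 2 / 3, 8)    # assume 2/3 arrive in 8 hours

print(f"daily average: {daily_avg:,.0f} hits/sec")
print(f"8-hour peak:   {peak:,.0f} hits/sec")

for servers in (2000, 5000):
    rate = per_server_rate(peak, servers)
    print(f"{servers} servers: {rate:.1f} hits/sec per server at peak")

# Assumption: treat the 7% visitor conversion rate as a share of hits.
conversion = 0.07
browse_rate = peak * (1 - conversion)
checkout_rate = peak * conversion
print(f"browsing: {browse_rate:,.0f} hits/sec, checkout: {checkout_rate:,.0f} hits/sec")
```

Under these assumptions the per-server load stays in the low double digits, and the checkout path carries only a small slice of the traffic, which is why giving browsing and checkout separate server pools and separate SLAs looks attractive.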
I recently came across an interview with the Flipkart CEO in which he claims that their architecture is well designed, an SOA model just like Amazon's, but that there are many complex algorithms in their system which kick in only in specific scenarios and do not show up during regular stress testing.

SOA is easier said than done. The biggest problem is the S (Service). How do you decide that a certain piece of code should be packaged as a separate service rather than as part of an existing one? To me, the guiding lights are the following questions.

a. Do you believe that in future you would want to maintain and evolve this code separately?
b. Do you think it is going to change very frequently, and is it going to be used by other services?
c. Do you think this piece of code can significantly affect the SLA of existing services? That is, is it a complex algorithm that will make the performance of existing services unpredictable when it is on the execution path?

If the answer to any of the above questions is yes, then one should wrap this piece of code in a new service. The problem is that, when starting a project, you are often not sure of the answers to all these questions. Some day a business requirement changes and you realise it would have been better to create a new service for it. But by then the code is the problem: it was not designed in a way that lets you easily abstract it out and move it into a new service, so you are doomed. This is exactly the nature of enterprise software, and it is why software experts are always looking for new ways of developing software, whether through libraries or by going as far as creating a new language, so that such problems can be addressed before they even appear.
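The three questions above amount to a simple rule: a single "yes" is enough to justify a new service. Here is a minimal Python sketch of that rule. The `Candidate` class, its field names and the two example components are hypothetical, made up only to illustrate the checklist, and question b is modelled as a single yes/no answer.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A piece of code being considered for extraction (hypothetical model)."""
    name: str
    evolves_separately: bool        # question a: maintained and evolved on its own?
    frequent_change_and_reuse: bool # question b: changes often and used by other services?
    threatens_sla: bool             # question c: makes existing services unpredictable?


def should_extract(candidate: Candidate) -> bool:
    """Return True if any of the three questions is answered with yes."""
    return (
        candidate.evolves_separately
        or candidate.frequent_change_and_reuse
        or candidate.threatens_sla
    )


# Hypothetical examples, not taken from any real system.
ranking = Candidate("search ranking", False, False, True)
footer = Candidate("footer links", False, False, False)

for c in (ranking, footer):
    verdict = "new service" if should_extract(c) else "keep in place"
    print(f"{c.name}: {verdict}")
```

The hard part, as noted above, is not evaluating the rule but knowing the answers early enough. The rule only helps if the code is kept separable for the day an answer flips to yes.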
Functional programming and tools like Play Framework and Akka offer a lot to developers today and can be leveraged to create a truly service-oriented architecture. If their programming paradigm is followed in the true sense, Akka will help you turn a monolithic piece of code into a distributed architecture in a much less painful way. One just needs to be brave enough to understand these tools and be ready to take the risk.

In my next article I will show an e-commerce use case that does exactly this: without changing a piece of code, i.e. without creating a new web service, putting JMS in between or writing custom networking code, we will create a distributed app out of a monolithic chunk. Happy Coding!!
