Kotlin Advanced Features vs Java

I have been playing around with Kotlin and I am liking it more and more. It's cleaner, less verbose, more functional, and comes with many advanced features. I will list a few below, with short examples.

✅ Kotlin vs Java: A Feature-by-Feature Comparison

Kotlin is a modern, expressive programming language fully interoperable with Java and designed to overcome many of Java’s limitations. Here’s a detailed comparison of Kotlin’s key features versus Java, including advanced language capabilities.

🚀 1. Null Safety

Kotlin:

var name: String = "John"
name = null // Compile-time error

Java: All object references are nullable by default. Null safety relies on careful programming, nullability annotations, or wrappers like Optional.

🧵 2. Concise Syntax

Kotlin:

val list = listOf("A", "B", "C")

Java:

List<String> list = Arrays.asList("A", "B", "C");

📦 3. Data Classes

Kotlin:

data class User(val name: String, val age: Int)

Java: Manual creation or use of tools like Lombok.

🔗 4. Extension Functions

Kotlin:

fun String.isEmail(): Boolean {
    return this.contains("@")
}

Java: No extension functions; you must use utility classes.

🧬 5. Coroutines for Async Programming

Kotlin:

suspend fun fetchData() = coroutineScope {
    // Async non-blocking work
}

Java: Uses CompletableFuture, which is more verbose and harder to manage.

📐 6. Smart Type Inference

val name = "Alice" // Inferred as String

Java: Type inference is limited and available only for local variables from Java 10+ using var.

🛠️ 7. Default and Named Arguments

Kotlin:

fun greet(name: String = "Guest") {
    println("Hello, $name!")
}

Java: Requires method overloading. Named arguments not supported.

🔒 8. Sealed Classes

Kotlin:

sealed class Result
class Success(val data: String) : Result()
class Error(val message: String) : Result()

Java: Sealed classes are supported only in Java 17+ using the sealed keyword.

🔁 9. Functional Programming Support

Kotlin: Native support with lambdas, higher-order functions, etc.

Java: More verbose functional style introduced in Java 8.

🔄 10. Interoperability with Java

Kotlin: Can call Java code seamlessly and vice versa.

Java: Can call Kotlin code (both compile to JVM bytecode), but the interop experience is smoother from the Kotlin side.

🔍 Additional Advanced Kotlin Features

🧾 Type Aliases

typealias UserMap = Map<String, List<User>>

⏱ Lazy Initialization

val config by lazy {
    loadConfiguration()
}

🧷 Delegated Properties

var name: String by Delegates.observable("<no name>") { _, old, new ->
    println("Changed from $old to $new")
}

🧩 Destructuring Declarations

val (name, age) = User("Alice", 25)

🛡 Safe Call (?.) and Elvis Operator (?:)

val length = name?.length ?: 0

📊 Summary Table

| Feature              | Kotlin       | Java                   |
|----------------------|--------------|------------------------|
| Null Safety          | ✅ Built-in  | ❌ Not safe by default |
| Concise Syntax       | ✅ Yes       | ❌ Verbose             |
| Data Classes         | ✅ One-liner | ❌ Manual / Lombok     |
| Extension Functions  | ✅ Yes       | ❌ No                  |
| Coroutines           | ✅ Yes       | ❌ Limited             |
| Type Inference       | ✅ Yes       | ✅ (Java 10+)          |
| Default/Named Args   | ✅ Yes       | ❌ No                  |
| Sealed Classes       | ✅ Yes       | ✅ (Java 17+)          |
| Functional Features  | ✅ Native    | ✅ Limited             |
| Interop with Java    | ✅ Excellent | ✅ Kotlin-compatible   |
| Type Aliases         | ✅ Yes       | ❌ No                  |
| Lazy Initialization  | ✅ by lazy   | ❌ Manual              |
| Delegated Properties | ✅ Yes       | ❌ No                  |
| Destructuring        | ✅ Yes       | ❌ No                  |
| Safe Call / Elvis    | ✅ Yes       | ❌ No                  |

🧠 Conclusion

Kotlin offers a more modern, expressive, and safer syntax compared to Java, while maintaining complete interoperability. If you’re starting a new project—especially for Android or backend APIs—Kotlin is often the more productive and maintainable choice.

Microservice guidelines

When to expose a new microservice?

A microservice has to expose well-defined functionality with a significant amount of code. It's difficult to define what counts as significant code, so how do you decide whether a new microservice is warranted?

  • An existing service is turning into a monolith, with various kinds of functionality being added to it frequently
  • Performance is becoming a major bottleneck because some APIs are hit much harder than others, and this can't be addressed in the existing microservice
  • One part of the service changes too frequently, with a risk of breaking existing functionality
  • HA requirements are drastically different
  • Business-wise it makes sense to break modules into different services
  • A specific technology needs to be used to solve a specific problem
  • The investment is justified and available

A microservice comes with a lot of overhead, so it's mandatory that thorough brainstorming is done before adding a new service to the system.

Provisioning of Microservice

Before a service goes to production, an infrastructure provisioning ticket has to be opened by the manager at least 2 days in advance, following these guidelines for each resource type.

Performance Measurement Guidelines


Sizing of machine/docker containers

Before you deploy your application, you must understand the performance characteristics of your service. Every service serves a different purpose, so performance requirements will differ.

  • Some services deal with a lot of data, so their memory requirements will be higher
  • Some services, like log collectors, read a lot of data and transport it to a separate endpoint
  • Some services receive a lot of requests, so responding in time is critical
  • Some services are heavily dependent on a database, so their performance characteristics change depending on the type of database

There are some parameters to think about.

The first step is to identify the most critical APIs.

Based on Cars24 load characteristics, make sure you have load tested with these numbers:

1. 20 requests/second for the most frequently used APIs

2. The per-API SLA should not exceed 100 ms

CPU – Is your service CPU intensive?

Memory – How much data will be held in memory at any given point in time?

Networking – Does your service require special networking capabilities? Usually such applications deal with extensive data ingestion, e.g. 100 MBps.

IOPS – Most applications are IO bound, meaning database operations rather than CPU consume most of the time. Observe your application's behavior in production to understand where the bottleneck is. Often the application's CPU spikes because of the underlying database, and the real problem is not in the application code base. In that case, caching must be considered as an alternative, using either Redis or Elasticsearch.

Here are the kinds of questions you should ask, from first principles, while provisioning an EC2 machine:

EC2 (Fargate Cluster)


  • How many requests per second are you expecting?
  • Does this require high availability?
  • Does it expose an HTTP API?
  • Can it be shut down during the night?
  • Can it be scaled down during the night? If yes, then what type? If no, then why not?
  • Is a hard disk required?
  • Is it temporary? If yes, please specify the date on which it will be stopped

Containers


  • Just like EC2 machines, Docker container sizing is necessary when deploying applications
  • RAM, CPU, number of tasks, and storage are the main factors
  • Each Fargate container is deployed as an ECS “Service”. Service deployment helps with service discovery using Route53 and is completely automated
  • CPU and RAM are defined in multiples of 1024
  • If you require 2 cores, define 2048 as CPU; RAM will then be a minimum of 4 × 1024 = 4096 (4 GB)
  • Fargate capacity is available in slabs. For 1 CPU (1024), memory starts at 2048, so be careful that you are asking for the right CPU
  • Nature of service (HTTP/TCP)
  • A health check has to be defined
  • The health check interval has to be defined

Database


  • Is your application in the POC phase? If yes, can it share a database server with other applications?
  • What's the default connection pool size of your application? If it's a POC, can it be set to 2? Connections eat up resources at the application end, and the database has to spend a lot of CPU/memory to maintain them
  • Connection pool size should be limited to 10, assuming there will be 2 instances running. If a request completes within 100 ms:
    • 1 connection = 10 requests per second
    • 10 connections = 10 × 10 = 100 requests per second per server
    • 2 servers = 100 × 2 = 200 requests per second
    • How many do we need? 20 requests per second
  • Will the database be deleted after a few days? If yes, define a date
  • How many IOPS will be required?
  • How much disk size will be required? Provide the estimation logic
  • By default we use MySQL for storing all microservice data. The MySQL version has to be 5.7
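
The connection-pool arithmetic above can be sketched as a quick back-of-envelope helper (class and method names are illustrative, not part of any real tooling):

```java
// Back-of-envelope capacity check using the guideline's numbers.
public class PoolSizing {
    // One connection held for avgLatencyMs serves 1000 / avgLatencyMs requests per second.
    public static int requestsPerSecond(int poolSize, int avgLatencyMs, int servers) {
        int perConnection = 1000 / avgLatencyMs;
        return perConnection * poolSize * servers;
    }

    public static void main(String[] args) {
        // 10 connections x 100 ms latency x 2 servers = 200 requests/second,
        // well above the 20 requests/second load-test target.
        System.out.println(requestsPerSecond(10, 100, 2));
    }
}
```

The same helper makes it obvious when a pool is oversized for the actual traffic target.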

Redis


  • Redis is an extremely fast and efficient caching store and provides a lot of functionality
  • You must state what kind of operations you are going to perform on Redis, because the operations determine whether the Redis instance requires higher CPU
  • Redis throughput is measured in thousands of requests per second, so the application should be able to handle heavy load on a small Redis instance

Logging Guidelines

  • Logs put tremendous stress on systems. They consume a lot of IOPS and also degrade performance when used extensively. Use them in useful places
  • Add proper context to logs by capturing useful information like user ID / lead ID, but never personal information like phone numbers

Microservice Architecture and its Challenges

Microservices architecture is becoming increasingly popular for building large-scale applications, as it provides a number of benefits:

  • Each service has a separate lifecycle, so services can be deployed independently and evolve separately
  • Each service can be fine-tuned for different SLAs and scalability
  • Each service can be developed using a different stack
  • Each service can be monitored separately

However, microservices architecture is not a free lunch. It poses many challenges that must be discussed and dealt with before venturing into this uncharted territory.

  • Authorization – How do you make sure that a particular service and its APIs can be called only when the user is authorized to do so?
  • Data Consistency – ACID is no longer available, deal with eventual consistency.
  • Service Discovery – How to find and interact with new services. If there are lots of services then how to make sure that they are discoverable from other services
  • Deployment – What if there is a hierarchy in microservice dependency
    • A -> B -> C
  • SLA – Multiple hops in a request will add to latency and affect your SLA
  • Fault Tolerance – How to handle Cascading Failure
  • Monitoring – (How to monitor a system which has 100s or even 1000s of services)
  • Tracing  – Logging & Request Tracing (A message will travel across many boundaries so how to nail down where message got lost/delayed)
  • Tech Stack – Selecting a stack (to go with single stack or multiple stacks?)
  • Packaging and Deploying – (How to come up with a uniform way of packaging and delivering services )
  • Debugging – During development phase how to make sure that developers are able to debug code as effectively as they do in monolithic system
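
For the fault-tolerance point, a minimal circuit-breaker sketch illustrates the fail-fast idea that stops cascading failures. All names here are hypothetical; a real system would use a library such as Resilience4j or Hystrix rather than hand-rolling this:

```java
import java.util.function.Supplier;

// Minimal circuit-breaker sketch: after `threshold` consecutive failures the
// breaker opens and calls fail fast instead of piling up on a struggling service.
public class CircuitBreaker {
    private final int threshold;
    private int consecutiveFailures = 0;

    public CircuitBreaker(int threshold) { this.threshold = threshold; }

    public boolean isOpen() { return consecutiveFailures >= threshold; }

    public <T> T call(Supplier<T> remoteCall, T fallback) {
        if (isOpen()) {
            return fallback; // fail fast, protect the caller's thread pool
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0; // a success closes the breaker again
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            return fallback;
        }
    }
}
```

Production breakers also add a half-open state and a timeout before retrying, which this sketch omits for brevity.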

 

 

Rust Concurrency

For a long time I have been thinking about writing a sample program in Rust, “the” new systems language. I coded in C++ for the first 5 years of my career before I moved completely to Java, and recently in one of my products a requirement came up: a low-latency, high-performance component had to be developed.

As I have written before, Java was the default choice, as it's my first choice anyway. However, I realized that this component could not afford the non-deterministic nature of the garbage collector.

So the need was to write a program where I could control exact memory deallocation without worrying about “stop the world” garbage collection. The natural choice was C++, but programming is all about having fun and I wanted to try out something new; besides, C++ threading support and syntax are not all that great, even in C++11.

So I decided to try out Go. But again, Go has garbage collection, and the same fear of non-determinism crept in.

So, time to try out Rust.

The program is simple but can be extended to a lot of other scenarios.

One thread keeps emitting data at regular intervals. A vector keeps track of the generated data.

The other thread keeps ticking at regular intervals (100 ms or so), and whenever there are items whose elapsed time is greater than a threshold, those items are expired. Same as a cache TTL.

use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

// Define the item struct
#[derive(Clone)]
struct Item {
    created_at: Instant,
    id: i64,
    pub description: String,
}

// Implement Item
impl Item {
    pub fn new(id: i64, description: String) -> Item {
        Item {
            created_at: Instant::now(),
            id,
            description,
        }
    }

    fn created_at(&self) -> Instant {
        self.created_at
    }

    fn id(&self) -> i64 {
        self.id
    }
}

fn main() {
    // Create a multi-producer, single-consumer channel
    let (sender, receiver) = mpsc::channel();
    let sender_pop = sender.clone(); // clone the sender for the ticker thread

    // Thread that sends a "Pop" tick every 100 ms
    thread::spawn(move || loop {
        thread::sleep(Duration::from_millis(100));
        sender_pop.send(Item::new(-1, String::from("Pop"))).unwrap();
    });

    // Thread that keeps sending new data every second
    thread::spawn(move || {
        let mut val = 1;
        loop {
            val += 1;
            sender.send(Item::new(val, String::from("New"))).unwrap();
            thread::sleep(Duration::from_millis(1000));
        }
    });

    // Mutable vector that tracks live items
    let mut vals: Vec<Item> = Vec::new();
    let ttl = 5; // TTL in seconds

    // Receive items; this loop blocks until the next message arrives
    for received in receiver {
        match received.description.as_str() {
            "Pop" => {
                println!("Pop");
                // Expire items older than the TTL
                vals.retain(|x| Instant::now().duration_since(x.created_at).as_secs() < ttl);
            }
            _ => vals.push(received.clone()),
        }
    }
}

That's it. You have synchronized between threads without any race conditions. That's how cool Rust is.

In the next blog we will try to send notification whenever items are expired.

Happy Coding !!

Master Worker Architecture using Vert.x

 

Today I am going to explain how Vert.x can be used to create a distributed master-worker paradigm. In large-scale systems it's applicable to a wide variety of problems.

First – Just to refresh our memories about what Vert.x is

Vert.x, as we know, is a lightweight framework for creating distributed microservices. It can scale up and scale out depending on your needs. It also takes away the pain of dealing with the complexity of heavily multithreaded environments, race conditions, etc.

The primary unit of work in Vert.x is a verticle. Verticles are thread safe, and they can run locally or remotely. One verticle interacts with another using events, which carry data with them.

Now let's take a day-to-day scenario.

We are getting a lot of requests. Each request is independent of the others, but we are unable to process all of them on the commodity hardware we have. How do we serve all these requests coming to our high-traffic website?

Well, one answer is to serve each request in a new “thread” and keep increasing the CPU cores (scale up) and hope it will work. This is what your web server does. The problem is you can only increase the number of cores up to a limit (how high can you go?).

Once you reach that limit, you will add more such machines and leave it to a load balancer to divide all these requests equally between machines. Sounds familiar?

Well, relying on the load balancer becomes a problem when every service in the system faces the same issue. Every time, you will have to scale these services and keep re-configuring the load balancer. What if this were possible dynamically, in the application layer? What if we could scale up and out without the pain of a load balancer? The good news is it can be achieved using Vert.x, except load balancing happens inside your application layer. There can be many benefits to this approach, which I will discuss some other time, but for now let's just focus on how this can be achieved using Vert.x.

So this problem has 2 major challenges:

a. How to divide the work between different machines so that we can keep up with the load.

b. How to combine the results from all this processing so that we can return them to the client (master), who needs answers from all workers before proceeding further (can this be achieved by a load balancer?).

So the Master is like management, whose only job is to distribute all the work to developers (like you and me) and, when the work is done, combine all the statuses, create a report, notify the boss, and hopefully get a fat pay hike (sounds familiar?).

In Vert.x terms, we have a master verticle which gets a lot of work to do. But the master does not want to do any work. Why? Because it's a “Master”. So the master wants to assign all this work to workers. Workers are also verticles in Vert.x. But then the problem arises that the master needs to know when all the work is completed so that it can make the right decision about what to do next. Right?

So here is the high-level architecture we are going to follow:

Vert.x Master Worker

OK, so in order to simulate this, first let's create a lot of work:

import io.vertx.core.Future;
import io.vertx.core.Vertx;
import io.vertx.core.eventbus.Message;

/**
 * Created by marutsingh on 2/20/17.
 */
public class Application {

    public static void main(String[] args){

        final Vertx vertx = Vertx.vertx();
        //vertx.deployVerticle(new HttpVerticle());
        vertx.deployVerticle(new MasterWorker());

        //DeploymentOptions options = new DeploymentOptions().setInstances(10);
        //vertx.deployVerticle("WorkVerticle",options);
        for (int i = 0; i < 5; i++){
            vertx.deployVerticle("WorkVerticle");
        }

        vertx.eventBus().send("vrp", "Job1,Job2,Job3,Job4,Job5,Job6,Job7,Job8,Job9,Job10");

        System.out.println("Deployment done");
    }
}

Great, we created our own master. Let's see what it looks like:

import io.vertx.core.AbstractVerticle;
import io.vertx.core.Future;
import io.vertx.core.Handler;
import io.vertx.core.eventbus.Message;

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class MasterWorker extends AbstractVerticle {

    @Override
    public void start(Future<Void> fut) {
        vertx.eventBus().localConsumer("vrp", new Handler<Message<String>>() {
            @Override
            public void handle(Message<String> objectMessage) {
                String data = objectMessage.body();
                String[] work = data.split(",");
                String jobId = UUID.randomUUID().toString();
                List<Future> futureList = new ArrayList<>();

                for (String w : work) {
                    Future<Message<Object>> f1 = Future.future();
                    futureList.add(f1);
                    // Send each unit of work and complete the future
                    // when the worker replies
                    vertx.eventBus().send("work", w + ":" + jobId, f1.completer());
                }
            }
        });
    }
}

Great, so our master is sending work over the event bus and hoping some worker will start working on it.

Let's see what our worker is doing:

import io.vertx.core.AbstractVerticle;
import io.vertx.core.Future;
import io.vertx.core.Handler;
import io.vertx.core.eventbus.Message;

public class WorkVerticle extends AbstractVerticle {

    @Override
    public void start(Future<Void> fut) {
        final String verticleId = super.deploymentID();

        vertx.eventBus().localConsumer("work", new Handler<Message<String>>() {
            @Override
            public void handle(Message<String> objectMessage) {
                String[] data = objectMessage.body().split(":");
                String work = data[0];
                String jobId = data[1];
                String result = work + "Completed***";
                // Reply to the sender; this completes the master's future
                objectMessage.reply(result);
            }
        });
    }
}

So the worker does the work and replies on the event bus with the result.

Now the master needs to combine all these results. This is where composable futures, a very cool feature introduced in Vert.x 3, make it easy:

CompositeFuture.all(futureList).setHandler(ar -> {
    if (ar.succeeded()) {
        // All succeeded
        ar.result().list().forEach(result ->
            resultSet.append(((Message) result).body().toString()));
        System.out.println(resultSet.toString());
    } else {
        // All completed and at least one failed
    }
});

That's all! I hope this will be useful in some of your scenarios.

Source code is available at

https://github.com/singhmarut/vertx-master-worker-simulator

Happy coding !!

Code Quality Guidelines

Comprehensive Software Development Guidelines


General Coding Guidelines

Naming and Organization

  1. Naming Conventions
  • Use descriptive, intention-revealing names for variables, methods, and classes
  • Avoid abbreviations unless universally understood
  • Follow language-specific conventions (camelCase, PascalCase, etc.)
  • Name boolean variables with prefixes like “is”, “has”, or “should”
  2. Code Organization
  • Separate static and dynamic parts of your application
  • Follow a consistent, logical file and directory structure
  • Group related functionality together
  • Keep source files focused on a single responsibility
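
As a small illustration of these conventions (the class and its fields are invented for the example), booleans carry is/has/should prefixes and the method name states the business rule it implements:

```java
// Naming sketch: boolean members use is/has/should prefixes,
// and the method name reveals its intent.
public class Subscription {
    private final boolean isActive;
    private final boolean hasPendingInvoice;

    public Subscription(boolean isActive, boolean hasPendingInvoice) {
        this.isActive = isActive;
        this.hasPendingInvoice = hasPendingInvoice;
    }

    // The condition reads like the rule it encodes.
    public boolean shouldSendReminder() {
        return isActive && hasPendingInvoice;
    }
}
```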

Coding Practices

  1. No Hard Coding
  • Define constants or enums in appropriate locations
  • Use configuration files for environment-specific values
  • Centralize configuration management
  2. Simplicity and Complexity
  • Prefer simplicity over complexity
  • If code becomes complex, it’s likely a candidate for refactoring
  • Remember: “It’s hard to build simple things”
  • Complex solutions usually indicate design issues
  3. Optimization
  • Avoid premature optimization
  • Optimize only after profiling and identifying actual bottlenecks
  • Consider algorithmic efficiency (Big O notation) for critical paths
  4. Design Patterns
  • Look for opportunities to apply standard design patterns
  • Adapt patterns to fit your specific use case
  • Document pattern usage for clarity
  5. Code Repetition
  • Strictly prohibit repetitive code (DRY – Don’t Repeat Yourself)
  • Extract repeated logic into reusable methods or classes
  • Consider refactoring when similar code appears multiple times
  6. Code Formatting
  • Consistently align and indent code
  • Follow language/framework-specific style guides
  • Use automated formatting tools
  • Always format before committing code
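
A tiny sketch of the no-hard-coding rule (the values are made up): magic numbers get names, so the intent survives and there is a single place to change them:

```java
// "No hard coding" sketch: magic values pulled into named constants.
public class RetryPolicy {
    private static final int MAX_ATTEMPTS = 3;      // a named constant, not a magic 3
    private static final long BACKOFF_MILLIS = 250L;

    // Derived value computed from the constants, never re-typed elsewhere.
    public static long totalWorstCaseDelayMillis() {
        return MAX_ATTEMPTS * BACKOFF_MILLIS;
    }
}
```

Environment-specific values (URLs, credentials, feature flags) would live in configuration files rather than constants.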

Class Design

  1. Size and Responsibility
  • Keep classes under 600 lines (smaller is better)
  • Follow Single Responsibility Principle
  • Each class should have one clear purpose
  2. Constructors
  • Keep constructors simple and exception-safe
  • Use dependency injection for dependencies
  • Initialize essential state only
  3. Composition vs Inheritance
  • Prefer composition over inheritance
  • Use inheritance only for genuine “is-a” relationships
  • Avoid deep inheritance hierarchies
  4. Extensibility
  • Design classes for extensibility
  • Implement the Open/Closed Principle
  • Consider future use cases without overengineering
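
A short composition-over-inheritance sketch (the class is hypothetical): the stack holds a deque instead of extending a list, so callers cannot bypass the capacity invariant through inherited methods:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Composition sketch: BoundedStack HAS-A deque rather than IS-A list,
// so only the operations that preserve its invariants are exposed.
public class BoundedStack<T> {
    private final Deque<T> items = new ArrayDeque<>(); // composed, not inherited
    private final int capacity;

    public BoundedStack(int capacity) { this.capacity = capacity; }

    public boolean push(T item) {
        if (items.size() >= capacity) return false; // invariant enforced here
        items.push(item);
        return true;
    }

    public T pop() { return items.pop(); }

    public int size() { return items.size(); }
}
```

Had this extended ArrayDeque instead, callers could call addLast() and silently break the capacity rule.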

Functions and Methods

  1. Size and Complexity
  • Keep functions under 25 lines (ideally 5-10 lines)
  • Each function should do one thing and do it well
  • Extract complex operations into separate functions
  2. Parameters
  • Validate parameters in public methods
  • Throw exceptions for invalid inputs
  • Keep parameter count low (ideally 0-3)
  • Use parameter objects for multiple related parameters
  3. Return Values
  • Avoid returning null when possible
  • Return empty collections instead of null for collections
  • Consider using Optional for potentially missing values
  • Try to avoid multiple return statements
  4. Organization
  • Group functions logically
  • Separate functions for initialization and business logic
  • Keep functions testable (avoid side effects)
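
The parameter-object and empty-collection rules can be sketched together (names are illustrative; records need Java 16+):

```java
import java.util.Collections;
import java.util.List;

// Parameter-object sketch: related arguments travel together,
// and "no results" is an empty list, never null.
public class OrderQuery {
    // Groups the related from/to parameters into one named concept
    public record DateRange(String from, String to) {}

    public List<String> findOrderIds(DateRange range) {
        // No persistence here; the point is the signature:
        // callers never have to null-check the result.
        return Collections.emptyList();
    }
}
```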

Control Flow

  1. Conditional Statements
  • Avoid deep nested if/else statements (max 2-3 levels)
  • Use guard clauses for early returns
  • Break complex conditions into well-named helper methods or variables
  • Use parentheses in long conditions to clarify precedence
  2. Loops
  • Keep loop bodies simple
  • Extract complex loop bodies into separate functions
  • Consider functional approaches where appropriate
  • Be aware of performance implications for large datasets
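
Guard clauses from the list above can be sketched like this (the business rule is invented for the example): each precondition exits early, keeping nesting to a single level instead of an if/else pyramid:

```java
// Guard-clause sketch: early returns instead of nested conditionals.
public class DiscountCalculator {
    public int discountPercent(boolean isLoggedIn, int orderTotal, boolean hasCoupon) {
        if (!isLoggedIn) return 0;       // guard: anonymous users get nothing
        if (orderTotal < 100) return 0;  // guard: below the discount threshold
        return hasCoupon ? 20 : 10;      // the happy path reads unindented
    }
}
```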

Error Handling

  1. Exception Strategy
  • Define a clear exception hierarchy
  • Don’t suppress exceptions without handling them
  • Maintain the original exception’s cause when wrapping
  • Document exceptions in method contracts
  2. Exception Handling
  • Do not write complex code in exception handlers
  • Extract try/catch blocks to dedicated functions
  • Log sufficient information for troubleshooting
  • Follow language-specific best practices
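
Preserving the original cause when wrapping can be sketched as follows (the exception type and method are hypothetical): the stack trace keeps pointing at the real failure:

```java
// Wrapping sketch: the original exception travels as the cause.
public class ConfigLoader {
    public static class ConfigException extends RuntimeException {
        public ConfigException(String message, Throwable cause) {
            super(message, cause); // preserve the original cause when wrapping
        }
    }

    public static int parsePort(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            // Wrap with context, but never lose the underlying exception
            throw new ConfigException("Invalid port: " + raw, e);
        }
    }
}
```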

Comments and Documentation

  1. Effective Comments
  • Write comments for complex or non-obvious code
  • Explain why, not what (the code shows what)
  • Keep comments up-to-date when code changes
  • Use standardized comment formats (JavaDoc, JSDoc, etc.)
  2. Error Messages
  • Create clear, actionable error messages
  • Use error message frameworks for consistency
  • Avoid exposing sensitive information in errors
  • Include relevant context for troubleshooting

Logging and Monitoring

  1. Logging Strategy
  • Use appropriate log levels (ERROR, WARN, INFO, DEBUG)
  • Log contextual information (request IDs, user IDs)
  • Follow consistent log patterns
  • Avoid logging sensitive information
  2. Performance Considerations
  • Be mindful of logging overhead
  • Use conditional logging for verbose information
  • Consider log aggregation and search requirements
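
Conditional logging for verbose information can be sketched with java.util.logging (the class and logger name are illustrative): the Supplier form builds the expensive message only if the level is actually enabled:

```java
import java.util.logging.Logger;

// Conditional-logging sketch: lazy message construction via a Supplier.
public class AuditService {
    private static final Logger LOG = Logger.getLogger("AuditService");

    // Extracted so the format is testable and reused by the logging call
    static String formatEntry(String userId, String action) {
        return "user=" + userId + " action=" + action;
    }

    public void recordAction(String userId, String action) {
        // Concatenation runs only when FINE (debug-level) logging is enabled
        LOG.fine(() -> formatEntry(userId, action));
        LOG.info("action recorded"); // cheap constant message, a plain call is fine
    }
}
```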

Clean Code Principles

Code Clarity

  1. Intention-Revealing Names
  • Name variables, methods, and classes to reveal their purpose
  • Avoid abbreviations and acronyms unless universally understood
  • Use pronounceable names that can be discussed in conversation
  • Example: getUserTransactionHistory() instead of getUsrTrHist()
  2. Function Design
  • Functions should do one thing, do it well, and do it only
  • Functions should be small – ideally 5-10 lines
  • Reduce function parameters to absolute minimum (0-2 is ideal)
  • Functions that modify state should not return values (Command-Query Separation)
  • Extract try/catch blocks into their own functions for cleaner code
  3. Comments
  • Good code mostly documents itself; comments should explain “why” not “what”
  • Comments that merely echo the code are worse than no comments
  • Keep comments relevant and up-to-date or delete them
  • Use comments for legal information, explanation of intent, clarification, and warnings

Structure and Organization

  1. The Scout Rule
  • Always leave the code cleaner than you found it
  • Make incremental improvements with each touch
  • Refactor gradually to avoid breaking changes
  2. Step-Down Rule
  • Organize code to read like a top-down narrative
  • Each function should be followed by those at the next level of abstraction
  • Creates natural flow from high-level concepts to implementation details
  3. Cohesion and Coupling
  • Classes should be highly cohesive (focused on a single responsibility)
  • Minimize coupling between classes and packages
  • Law of Demeter: Only talk to your immediate friends (avoid chains like a.getB().getC().doSomething())
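
The Law of Demeter point can be sketched like this (all class names are illustrative, and records need Java 16+): instead of chaining order.getCustomer().getAddress().getCity(), each object answers for its own collaborators:

```java
// Law-of-Demeter sketch: delegate instead of exposing the object chain.
public class Order {
    public record Address(String city) {}

    public record Customer(Address address) {
        // Customer answers the question itself instead of exposing Address
        String shippingCity() { return address.city(); }
    }

    private final Customer customer;

    public Order(Customer customer) { this.customer = customer; }

    // Callers ask Order; Order talks only to its immediate friend, Customer
    public String shippingCity() {
        return customer.shippingCity();
    }
}
```

If Customer's internal structure changes, only Customer changes; no caller was coupled to the chain.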

SOLID Principles

  1. Single Responsibility Principle (SRP)
  • A class should have only one reason to change
  • Separate business logic from infrastructure concerns
  • Example: Separate OrderProcessor from OrderRepository
  2. Open/Closed Principle (OCP)
  • Software entities should be open for extension but closed for modification
  • Use abstractions and polymorphism to allow behavior changes without modifying existing code
  • Example: Strategy pattern for payment processing methods
  3. Liskov Substitution Principle (LSP)
  • Subtypes must be substitutable for their base types without altering program correctness
  • Override methods should not violate parent class contracts
  • Example: Square is not a proper subtype of Rectangle if it violates Rectangle’s behavior
  4. Interface Segregation Principle (ISP)
  • Clients should not be forced to depend on methods they do not use
  • Create specific interfaces rather than general-purpose ones
  • Example: Split large interfaces into OrderCreator, OrderFinder, etc.
  5. Dependency Inversion Principle (DIP)
  • High-level modules should not depend on low-level modules; both should depend on abstractions
  • Abstractions should not depend on details; details should depend on abstractions
  • Example: Service depends on Repository interface, not implementation
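
The DIP example in the list can be sketched as follows (all names are illustrative): the high-level service depends on a Repository abstraction, and an in-memory implementation stands in for a real database:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// DIP sketch: the service depends on an abstraction, not a concrete store.
public class UserService {
    // Abstraction owned by the high-level module
    public interface UserRepository {
        Optional<String> findNameById(long id);
    }

    private final UserRepository repository;

    public UserService(UserRepository repository) { // injected, easy to substitute in tests
        this.repository = repository;
    }

    public String greeting(long id) {
        return repository.findNameById(id)
                .map(name -> "Hello, " + name)
                .orElse("Hello, guest");
    }

    // Low-level detail depends on the abstraction, not the other way around
    public static class InMemoryUserRepository implements UserRepository {
        private final Map<Long, String> users = new HashMap<>();

        public void save(long id, String name) { users.put(id, name); }

        public Optional<String> findNameById(long id) {
            return Optional.ofNullable(users.get(id));
        }
    }
}
```

Swapping the in-memory repository for a JDBC-backed one would not touch UserService at all.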

Clean Architecture

  1. Independence from Frameworks
  • The framework is a tool, not the architecture
  • Core business logic should be isolated from framework code
  • Use adapters/wrappers to interact with frameworks
  2. Testability
  • Business rules should be testable without UI, database, or external services
  • Use dependency injection to allow substituting test doubles
  • Create a testing strategy for each architectural layer
  3. Independence of Delivery Mechanism
  • Business logic should not know or care whether it’s being accessed via web, console, or API
  • Separate domain models from data transfer objects (DTOs)
  • Use mappers to convert between domain and external representations
  4. Screaming Architecture
  • Your architecture should “scream” the intent of the system
  • Package structure should reflect business domains, not technical concerns
  • Example: com.company.billing, com.company.shipping instead of com.company.controllers, com.company.services

Layered Architecture

  1. Layer Separation
  • Implement strict layering (e.g., Controller → Service → Repository)
  • Each layer has specific responsibilities
  • Upper layers call lower layers, never vice versa
  2. Layer Responsibilities
  • Presentation Layer: Handles user interaction and display
  • Service Layer: Contains business logic and orchestrates operations
  • Data Access Layer: Manages data storage and retrieval
  • Domain Layer: Contains business entities and rules
  3. Package Structure
  • Organize by feature or domain, not technical layer
  • Follow consistent naming conventions
  • Place functions in domain-appropriate packages
  • Avoid generic “utils” packages when possible

Java-Specific Guidelines

Core Java Features

  1. Effective Use of Java Language Features
  • Prefer enhanced for loops over traditional loops
  • Use annotations for configuration instead of XML when possible
  • Use var judiciously for local variables to reduce verbosity (Java 10+)
  • Avoid checked exceptions for predictable cases; use runtime exceptions or Optional
  • Use @Override annotation for all overridden methods
  2. Enums
  • Use enums instead of constants for related groups of values
  • Leverage enum methods and fields for related functionality
  • Consider using enum-based singletons for type-safe factories
   public enum PaymentMethod {
       CREDIT_CARD(true) {
           @Override
           public void process(Payment payment) {
               // Process credit card payment
           }
       },
       PAYPAL(true) {
           @Override
           public void process(Payment payment) {
               // Process PayPal payment
           }
       },
       INVOICE(false) {
           @Override
           public void process(Payment payment) {
               // Process invoice payment
           }
       };

       private final boolean isOnline;

       PaymentMethod(boolean isOnline) {
           this.isOnline = isOnline;
       }

       public boolean isOnlineMethod() {
           return isOnline;
       }

       public abstract void process(Payment payment);

       public static List<PaymentMethod> getOnlineMethods() {
           return Arrays.stream(values())
                    .filter(PaymentMethod::isOnlineMethod)
                    .collect(Collectors.toList());
       }
   }
  3. Generics
  • Use generics for type safety and to avoid casts
  • Understand PECS: “Producer Extends, Consumer Super”
  • Use wildcards appropriately to increase API flexibility
   // Producer (read from) - use extends
   public void process(List<? extends Animal> animals) {
       for (Animal animal : animals) {
           // can read animals
       }
   }

   // Consumer (write to) - use super
   public void addCats(List<? super Cat> catList) {
       catList.add(new Cat()); // can add cats or its subtypes
   }
  4. Default Methods
  • Use default methods to evolve interfaces without breaking implementations
  • Keep default methods simple, avoiding complex state or dependencies
   public interface UserRepository {
       List<User> findByRole(Role role);

       // Default method to find admins
       default List<User> findAdmins() {
           return findByRole(Role.ADMIN);
       }
   }
  5. Records (Java 16+)
  • Use records for simple data carriers
  • Leverage compact constructors for validation
  • Combine with sealed classes for complete domain modeling
   public record OrderDetails(
       String orderId,
       BigDecimal amount,
       LocalDateTime createdAt
   ) {
       // Compact constructor for validation
       public OrderDetails {
           Objects.requireNonNull(orderId, "Order ID cannot be null");
           Objects.requireNonNull(amount, "Amount cannot be null");
           Objects.requireNonNull(createdAt, "Created date cannot be null");

           if (amount.compareTo(BigDecimal.ZERO) <= 0) {
               throw new IllegalArgumentException("Amount must be positive");
           }
       }
   }
  6. Streams and Lambda Expressions
  • Use streams for collection processing to improve readability
  • Avoid side effects in stream operations
  • Use method references instead of lambdas when possible
  • Know when to use parallel streams (CPU-intensive operations on large collections)
  • Extract complex stream operations into well-named methods
   // Good: Clear, functional style with descriptive method references
   public List<UserDto> getActiveUserDtos() {
       return userRepository.findAll().stream()
           .filter(User::isActive)
           .filter(this::hasPermissions)
           .map(userMapper::toDto)
           .collect(Collectors.toList());
   }

   // Avoid: Overly complex stream pipeline
   public List<UserDto> getActiveUserDtos() {
       return userRepository.findAll().stream()
           .filter(user -> user.getStatus() == Status.ACTIVE 
               && !user.getRoles().isEmpty() 
               && user.getLastLogin().isAfter(LocalDate.now().minusDays(30)))
           .map(user -> new UserDto(user.getId(), user.getName(), user.getEmail()))
           .collect(Collectors.toList());
   }
  7. Optional Usage
  • Use Optional as a return type, not a parameter type
  • Avoid creating Optional objects for null checks
  • Leverage Optional methods like map(), filter(), and orElse()
  • Don’t use get() without checking isPresent() first; prefer orElse*() methods
   // Good: Fluent Optional usage
   public String getUserDisplayName(Long userId) {
       return userRepository.findById(userId)
           .map(User::getDisplayName)
           .orElse("Guest");
   }

   // Avoid: Improper Optional usage
   public Optional<User> findUser(Optional<String> username) { // Don't use Optional as parameter
       if (username.isPresent()) {
           return Optional.ofNullable(userRepository.findByUsername(username.get()));
       }
       return Optional.empty();
   }

Collections & Data Structures

  1. Collection Selection
  • Choose the appropriate collection for your use case:
    • ArrayList: Fast random access, slower insertions/deletions
    • LinkedList: Fast insertions/deletions, slower random access
    • HashSet: Fast lookups, no ordering guarantees
    • TreeSet: Sorted elements, slower than HashSet
    • HashMap: Fast key lookups
    • LinkedHashMap: Predictable iteration order
    • TreeMap: Sorted keys
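The ordering differences are easy to see in a short sketch: a `TreeMap` iterates keys in sorted order, while a `LinkedHashMap` preserves insertion order (a plain `HashMap` guarantees neither).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CollectionChoice {
    static List<String> sortedKeys() {
        Map<String, Integer> byName = new TreeMap<>();   // keys kept sorted
        byName.put("charlie", 3);
        byName.put("alice", 1);
        byName.put("bob", 2);
        return new ArrayList<>(byName.keySet());
    }

    static List<String> insertionOrderKeys() {
        Map<String, Integer> byInsertion = new LinkedHashMap<>(); // predictable iteration order
        byInsertion.put("charlie", 3);
        byInsertion.put("alice", 1);
        return new ArrayList<>(byInsertion.keySet());
    }
}
```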
  2. Collection Factory Methods (Java 9+)
  • Use collection factory methods for creating small, immutable collections
   // Good: Concise immutable collections
   List<String> colors = List.of("red", "green", "blue");
   Set<String> primaryColors = Set.of("red", "blue", "yellow");
   Map<String, Integer> ratings = Map.of(
       "Star Wars", 5,
       "Inception", 4,
       "Titanic", 3
   );
  3. Collection Performance
  • Set initial capacity when you know the approximate size
  • Use EnumMap and EnumSet for enum-based keys for better performance
  • Use ArrayDeque instead of Stack for stack operations
   // Good: Improved performance with proper sizing and specialized collections
   Map<String, User> userMap = new HashMap<>(1000); // Preallocate for 1000 users

   // Use EnumMap for enum keys
   Map<UserRole, List<User>> usersByRole = new EnumMap<>(UserRole.class);

   // Use ArrayDeque instead of Stack
   Deque<Command> commandStack = new ArrayDeque<>();
   commandStack.push(command);
   Command lastCommand = commandStack.pop();
  4. Type Safety
  • Always use generics when working with collections
  • Declare collection variables with interface types:
   // Good:
   Map<String, User> users = new HashMap<>();
   List<Order> orders = new ArrayList<>();
  • Avoid raw types
  • Avoid using Object as a parameter or return type

Date and Time

  1. Modern Date/Time API
  • Use java.time package instead of legacy Date/Calendar
  • Choose appropriate classes for your needs:
    • LocalDate for date without time
    • LocalTime for time without date
    • LocalDateTime for date and time without timezone
    • ZonedDateTime for date and time with timezone
    • Instant for machine timestamps
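A short sketch showing how the classes above fit together (the dates and zone are illustrative): a `LocalDate` carries no time or zone, a `ZonedDateTime` pins a wall-clock time to a timezone, and converting it to an `Instant` yields the absolute machine timestamp.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.Month;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class TimeExamples {
    static LocalDate releaseDate() {
        return LocalDate.of(2024, Month.MARCH, 15);              // date only, no time or zone
    }

    static ZonedDateTime meetingInNewYork() {
        return ZonedDateTime.of(LocalDateTime.of(2024, 3, 15, 9, 30),
                                ZoneId.of("America/New_York"));  // date + time + timezone
    }

    static Instant machineTimestamp() {
        return meetingInNewYork().toInstant();                   // absolute point on the timeline
    }
}
```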
  2. Timezone Handling
  • Always specify timezones explicitly when needed
  • Store dates in UTC internally
  • Convert to user’s timezone only for display
  • Use a consistent timezone throughout your application
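A minimal sketch of the store-in-UTC rule (the stored instant and zones are illustrative): persist an `Instant`, and only convert to the user's zone at the display boundary.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

class UtcStorage {
    // Store the absolute instant (UTC) internally...
    static final Instant STORED = Instant.parse("2024-01-10T18:00:00Z");

    // ...and convert to the user's timezone only for display
    static LocalDateTime forDisplay(ZoneId userZone) {
        return LocalDateTime.ofInstant(STORED, userZone);
    }
}
```

The same stored instant renders as 18:00 for a UTC user and 23:30 for a user in Asia/Kolkata; the database value never changes.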
  3. Formatting and Parsing
  • Use DateTimeFormatter for formatting and parsing
  • Create reusable formatters as constants
  • Consider locale when formatting dates for display
   // Define reusable formatters
   private static final DateTimeFormatter ISO_FORMATTER = 
       DateTimeFormatter.ISO_DATE_TIME;

   private static final DateTimeFormatter DISPLAY_FORMATTER = 
       DateTimeFormatter.ofPattern("MMM d, yyyy h:mm a", Locale.US);

   // Parsing
   LocalDateTime dateTime = LocalDateTime.parse(isoString, ISO_FORMATTER);

   // Formatting
   String displayDate = dateTime.format(DISPLAY_FORMATTER);

String Handling

  1. Efficient String Operations
  • Use StringBuilder for string concatenation in loops
  • Use String.format() or text blocks (Java 15+) for multi-line strings
  • Be careful with string splitting and regular expressions in performance-critical code
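The `StringBuilder` point above in a minimal sketch (hypothetical `CsvJoiner` helper): appending to one builder in the loop avoids allocating a new intermediate `String` on every iteration, which `+=` concatenation would do.

```java
class CsvJoiner {
    // One StringBuilder for the whole loop instead of repeated String concatenation
    static String join(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }
}
```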
  2. Text Blocks (Java 15+)
  • Use text blocks for SQL queries, HTML, JSON, etc.
   String sql = """
       SELECT u.id, u.name, u.email
       FROM users u
       JOIN roles r ON u.role_id = r.id
       WHERE r.name = ?
       ORDER BY u.name
       """;
  3. Character Encoding
  • Always specify character encoding when reading/writing text
  • Use UTF-8 as the default encoding
  • Be explicit about encoding in file I/O and network operations
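Being explicit about the charset in file I/O looks like this (a sketch using the Java 11+ `Files.writeString`/`readString` overloads; the temp-file usage is illustrative):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8Io {
    // Name the encoding on both write and read instead of relying on the platform default
    static String roundTrip(Path file, String content) throws IOException {
        Files.writeString(file, content, StandardCharsets.UTF_8);
        return Files.readString(file, StandardCharsets.UTF_8);
    }
}
```

Relying on the platform default charset is a classic source of bugs that only appear on a differently-configured machine; non-ASCII content round-trips cleanly when UTF-8 is named on both sides.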

Spring Framework Practices

  1. Dependency Injection
  • Prefer constructor injection over field or setter injection
  • Keep components stateless when possible
  • Use qualifiers or named beans to resolve ambiguities
   // Good: Constructor injection
   @Service
   public class UserServiceImpl implements UserService {
       private final UserRepository userRepository;
       private final PasswordEncoder passwordEncoder;

       @Autowired // Optional in newer Spring versions
       public UserServiceImpl(UserRepository userRepository,
                            PasswordEncoder passwordEncoder) {
           this.userRepository = userRepository;
           this.passwordEncoder = passwordEncoder;
       }
   }

   // Avoid: Field injection
   @Service
   public class UserServiceImpl implements UserService {
       @Autowired
       private UserRepository userRepository;

       @Autowired
       private PasswordEncoder passwordEncoder;
   }
  2. Configuration Management
  • Use @Configuration classes instead of XML
  • Group configurations by functional area
  • Use profiles for environment-specific configuration
  • Externalize sensitive configuration (credentials, API keys)
   @Configuration
   @PropertySource("classpath:messaging.properties")
   public class MessagingConfiguration {

       @Bean
       @Profile("production")
       public MessageSender productionMessageSender() {
           return new KafkaMessageSender();
       }

       @Bean
       @Profile("development")
       public MessageSender developmentMessageSender() {
           return new InMemoryMessageSender();
       }
   }
  3. Spring Boot Best Practices
  • Use starter dependencies to reduce boilerplate
  • Leverage auto-configuration but understand what’s happening
  • Override only what you need to customize
  • Use application-{profile}.properties for profile-specific configuration
  • Prefer configuration properties classes over direct @Value injections
   @Configuration
   @ConfigurationProperties(prefix = "app.mail")
   @Validated
   public class MailProperties {
       @NotBlank
       private String host;

       @Min(1)
       @Max(65535)
       private int port = 25;

       private String username;
       private String password;

       // Getters and setters
   }

Database & Persistence

  1. JPA/Hibernate Best Practices
  • Use lazy loading judiciously and handle N+1 query problems
  • Define appropriate fetch strategies in queries
  • Configure proper cascade types (avoid CascadeType.ALL)
  • Use value types for complex attributes without identity
  • Set appropriate batch sizes for batch operations
   @Entity
   public class Order {
       @Id
       @GeneratedValue(strategy = GenerationType.IDENTITY)
       private Long id;

       @ManyToOne(fetch = FetchType.LAZY)
       @JoinColumn(name = "customer_id")
       private Customer customer;

       @OneToMany(mappedBy = "order", cascade = {CascadeType.PERSIST, CascadeType.MERGE})
       private List<OrderItem> items = new ArrayList<>();

       @Embedded
       private Address shippingAddress;

       // N+1 problem avoided with fetch join
       public static Optional<Order> findByIdWithItems(EntityManager em, Long id) {
           return em.createQuery(
                   "SELECT o FROM Order o " +
                   "LEFT JOIN FETCH o.items " +
                   "WHERE o.id = :id", Order.class)
               .setParameter("id", id)
               .getResultStream()
               .findFirst();
       }
   }
  2. Transaction Management
  • Use declarative transaction management with @Transactional
  • Set the appropriate isolation level for your use case
  • Keep transactions as short as possible
  • Be explicit about read-only transactions
  • Understand transaction propagation behaviors
   @Service
   public class OrderService {
       private final OrderRepository orderRepository;
       private final InventoryService inventoryService;

       // Constructor omitted for brevity

       @Transactional
       public Order createOrder(OrderRequest request) {
           // This method creates a new transaction
           Order order = new Order();
           // Set order properties

           return orderRepository.save(order);
       }

       @Transactional(readOnly = true)
       public List<Order> findOrdersByCustomer(Long customerId) {
           // Read-only optimization
           return orderRepository.findByCustomerId(customerId);
       }

       @Transactional(propagation = Propagation.REQUIRES_NEW)
       public void processRefund(Long orderId) {
           // Always creates a new transaction
           // even if called from another transactional method
       }
   }
  3. Native Queries
  • Specify the result class when using native queries
  • Use named parameters instead of positional parameters
  • Consider using projections for partial entity loading
   @Repository
   public class OrderRepositoryImpl implements OrderRepositoryCustom {
       @PersistenceContext
       private EntityManager entityManager;

       @Override
       public List<Order> findActiveOrdersByCustomer(Long customerId) {
           return entityManager.createNativeQuery(
               "SELECT o.* FROM orders o " +
               "WHERE o.customer_id = :customerId " +
               "AND o.status = 'ACTIVE'", Order.class)
               .setParameter("customerId", customerId)
               .getResultList();
       }

       @Override
       public List<OrderSummary> findOrderSummaries() {
           return entityManager.createNativeQuery(
               "SELECT o.id as orderId, o.order_date as orderDate, " +
               "c.name as customerName, SUM(oi.quantity * oi.price) as total " +
               "FROM orders o " +
               "JOIN customers c ON o.customer_id = c.id " +
               "JOIN order_items oi ON oi.order_id = o.id " +
               "GROUP BY o.id, o.order_date, c.name",
               "OrderSummaryMapping")
               .getResultList();
       }
   }

   @Entity
   @SqlResultSetMapping(
       name = "OrderSummaryMapping",
       classes = @ConstructorResult(
           targetClass = OrderSummary.class,
           columns = {
               @ColumnResult(name = "orderId", type = Long.class),
               @ColumnResult(name = "orderDate", type = LocalDate.class),
               @ColumnResult(name = "customerName", type = String.class),
               @ColumnResult(name = "total", type = BigDecimal.class)
           }
       )
   )
   public class Order {
       // Entity implementation
   }
  4. Database Access
  • Use prepared statements to prevent SQL injection
  • Use named parameters in queries for readability
  • Consider using JPA entity classes when appropriate

Concurrency & Multithreading

  1. Thread Safety
  • Make classes immutable when possible
  • Use concurrent collections from java.util.concurrent
  • Understand the proper use of synchronized, volatile, and atomic classes
  • Minimize shared mutable state
  • Document thread safety (or lack thereof) in class Javadoc
   /**
    * Thread-safe counter implementation.
    */
   public class Counter {
       private final AtomicLong count = new AtomicLong();

       public void increment() {
           count.incrementAndGet();
       }

       public long getCount() {
           return count.get();
       }
   }
  2. Modern Concurrency (Java 8+)
  • Use CompletableFuture for async operations
  • Understand the Fork/Join framework for parallel tasks
  • Use ExecutorService properly and shut it down
  • Consider thread pools and task executors for managing threads
   @Service
   public class ProductEnrichmentService {
       private final ExecutorService executor;
       private final PriceService priceService;
       private final InventoryService inventoryService;
       private final ReviewService reviewService;

       public ProductEnrichmentService(PriceService priceService,
                                     InventoryService inventoryService,
                                     ReviewService reviewService) {
           this.priceService = priceService;
           this.inventoryService = inventoryService;
           this.reviewService = reviewService;
           this.executor = Executors.newFixedThreadPool(10);
       }

       public CompletableFuture<EnrichedProduct> enrichProduct(Product product) {
           CompletableFuture<Price> priceFuture = 
               CompletableFuture.supplyAsync(() -> priceService.getPrice(product.getId()), executor);

           CompletableFuture<InventoryStatus> inventoryFuture = 
               CompletableFuture.supplyAsync(() -> inventoryService.getStatus(product.getId()), executor);

           CompletableFuture<List<Review>> reviewsFuture = 
               CompletableFuture.supplyAsync(() -> reviewService.getReviews(product.getId()), executor);

           return CompletableFuture.allOf(priceFuture, inventoryFuture, reviewsFuture)
               .thenApply(v -> new EnrichedProduct(
                   product,
                   priceFuture.join(),
                   inventoryFuture.join(),
                   reviewsFuture.join()
               ));
       }

       @PreDestroy
       public void shutdown() {
           executor.shutdown();
           try {
               if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
                   executor.shutdownNow();
               }
           } catch (InterruptedException e) {
               executor.shutdownNow();
           }
       }
   }

RESTful API Design

  1. Resource Modeling
  • Design URLs around resources, not actions
  • Use appropriate HTTP methods (GET, POST, PUT, DELETE)
  • Use proper HTTP status codes
  • Provide API versioning strategy
   @RestController
   @RequestMapping("/api/v1/orders")
   public class OrderController {
       private final OrderService orderService;

       // Constructor omitted for brevity

       @GetMapping
       public List<OrderDto> getAllOrders() {
           return orderService.findAll();
       }

       @GetMapping("/{id}")
       public ResponseEntity<OrderDto> getOrder(@PathVariable Long id) {
           return orderService.findById(id)
               .map(ResponseEntity::ok)
               .orElse(ResponseEntity.notFound().build());
       }

       @PostMapping
       @ResponseStatus(HttpStatus.CREATED)
       public OrderDto createOrder(@Valid @RequestBody OrderRequest request) {
           return orderService.createOrder(request);
       }

       @PutMapping("/{id}")
       public ResponseEntity<OrderDto> updateOrder(
               @PathVariable Long id,
               @Valid @RequestBody OrderRequest request) {
           return ResponseEntity.ok(orderService.updateOrder(id, request));
       }

       @DeleteMapping("/{id}")
       @ResponseStatus(HttpStatus.NO_CONTENT)
       public void deleteOrder(@PathVariable Long id) {
           orderService.deleteOrder(id);
       }
   }
  2. API Documentation
  • Use Swagger/OpenAPI for API documentation
  • Document all endpoints, parameters, and possible responses
  • Include examples for better understanding
    @RestController
    @RequestMapping("/api/v1/products")
    @Tag(name = "Product Management", description = "APIs for managing products")
    public class ProductController {

        @Operation(
            summary = "Get a product by ID",
            description = "Returns a single product by its unique identifier",
            responses = {
                @ApiResponse(
                    responseCode = "200",
                    description = "Product found",
                    content = @Content(schema = @Schema(implementation = ProductDto.class))
                ),
                @ApiResponse(
                    responseCode = "404",
                    description = "Product not found",
                    content = @Content(schema = @Schema(implementation = ErrorResponse.class))
                )
            }
        )
        @GetMapping("/{id}")
        public ResponseEntity<ProductDto> getProduct(@PathVariable Long id) {
            // Implementation omitted for brevity
        }
    }

MicroService Architecture

Here is a microservice architecture deployment diagram. All services are Docker containers registered with the Consul server via Registrator. The client (external: mobile, UI) makes a call to Bouncer, which is our API proxy. Bouncer has permissions configured for all API URLs. It makes a call to the Auth Server, which authenticates the request; if successful, it passes the service URL to HAProxy. HAProxy then has rules configured that route the URL to the exact service.

Services always follow a naming convention, so when a service is registered in Consul, consul-template refreshes the HAProxy configuration in the background.

[Diagram: microservice architecture deployment]

Bouncer – API proxy gateway. All calls come to Bouncer to ensure that only authenticated requests are passed to the actual services.

Auth Server – Single point of authentication and authorization. All applications create permissions and save them in the Auth Server.

External ELB – All public APIs talk to the External ELB, which in turn passes requests to the HAProxy cluster.

Internal ELB – All internal APIs are routed through Internal ELBs. Some URLs will only be exposed to internal services.

HAProxy (Cluster) – The load balancer cum service router.

Consul Server (Cluster) – Centralized service registry.

Registrator – Sidecar application running with each service that updates the Consul cluster about service health.

Consul Template – Background application that updates HAProxy whenever there is a change in service configuration.

ECS Cluster – AWS ECS, where all Docker containers are registered. It takes care of dynamically launching new containers based on configured parameters; autoscaling is handled automatically.

There you have the major parts of the deployment. Please share your comments. Happy coding!!

A friend posted this question on my FB account

 

Question 1.

I gather from the diagram that all microservice API calls will proxy through the API gateway. Is this true for internal cluster communication too? If yes, wouldn't it be too much load on the gateway server with too many microservices in the infrastructure? Or will it be as lightweight as a load balancer? Can I assume this gateway is a simple reverse proxy server that just passes through the request/response, and if so, why not use Apache or Nginx?

Answer: First, internal cluster communication may or may not happen through Bouncer, depending on what your security requirements are. If you want all API calls to be authenticated, then yes, it will go through Bouncer. A point to note: Bouncer is a very lightweight Node.js application, so it has extremely high throughput, and because it sits behind HAProxy you can always add new nodes.

The API proxy is a reverse proxy, but with some logic. Most of the time, when you want to expose an API that interacts with multiple microservices, you will have to aggregate data; that logic resides in the API proxy. It's a common pattern in large-scale architecture.

Question 2.
As per your description, the Auth Server is also responsible for authorization here? Then how are you ensuring isolation of microservices? If a microservice's authorization logic is shared with the Auth Server, how are you ensuring data integrity, i.e. the security measure that protects against disclosure of information to parties other than the intended recipient?

Answer:

The Auth Server is like Google's auth server: all resource permissions reside in it, and it holds permissions for each application. These permissions can be added either via API by the app or through an admin UI, so each app can have different permissions. A single user will be given different permissions for different apps, so isolation is guaranteed; e.g. I may have the "user-create" permission in UserService but not the "account-create" permission in AccountService.

Who creates permissions, who grants them to users, and when, depends on your design.

 

Serverless Architecture – AWS Lambda

aws-lambda

I want to write this post about my views on serverless architecture (specifically AWS Lambda), which cloud service providers like AWS are promoting as the "holy grail" for solving all problems. This post targets developers who understand that every technology has limitations and that it is wise to make an informed decision.

Before I start my discussion, I want to state some facts so that we can have a fruitful discussion.

a. All companies, including cloud service providers, have to make money.

b. All companies are afraid of competition and do not want to lose their customers.

c. There is no branch of engineering where a "one size fits all" approach works.

d. No matter what tools an engineer chooses, when you can't find a solution, "go back to basics" is the best approach.

e. Lambda architecture in the context of AWS is different from lambda architecture in general, as many problems with this architecture are AWS-specific.

If you want to understand some issues with the "Lambda architecture", see:

https://www.oreilly.com/ideas/questioning-the-lambda-architecture

Coming to the point: many attempts have been made in the past to find one "holy grail" solution to teething problems. Let's look at some of these.

Problem 1 – Operating systems. The question is: why do we have multiple operating systems? Why is no one able to solve "all the problems under the sun"? Why so many?

Problem 2 – Writing multiple programs for different operating systems, even when the program does the same thing. Java solved this problem: don't worry about garbage collection, performance, or the underlying platform. Yet after 20 years of research and billions of dollars, we only have more languages. If Java had solved all the problems it targeted, we would never have had to learn Node.js.

Problem 3 – Learning multiple languages for front-end and back-end development. GWT from the house of Google did solve it. Great!! Where is it now? Why did Google decide to stick with JavaScript for front-end development and create Angular?

Problem 4 – Integration. With a vast variety of protocols and hundreds of disparate systems in a sizeable organization, integration is a major problem; hence the birth of the ESB. Where is it now? How many startups use it? I haven't heard of or found any.

Problem 5 – Modeling business processes. Rule engines tried to solve all problems under the sun and become the de facto expert systems. No doubt it is formidable technology, but does it work in most scenarios? The answer is a resounding no.

I can go on and on, but you know what I am implying: one size does not fit all. It never has, and it never will. Perhaps there will eventually be enough research into AI that the technology stack and architecture can be determined by a program itself, but I do not see that happening anytime soon.

There are many ways you can architect your applications, and serverless is one of them. In this architecture you do not manage your own resources; someone else does. When and how many instances run is decided by AWS for you, and you do not get to choose your operating system, runtime versions, etc.

Serverless architecture also tries to create an "illusion" that all your problems related to understanding process behavior and the difficulties of distributed systems will go away, and that it will scale on its own without any cost.

Before you find this approach, or so-called "architecture", fascinating, give a call to a friend working for Amazon and ask how many Amazon folks are actually using it. So far I have found none, and even those who tried do not understand its internals. Can you imagine that a lot of people in Amazon use Oracle? So much for their AWS offerings.

Some people have written lengthy articles about Lambda and its problems. Try to see if you find answers to the questions raised:

https://www.datawire.io/3-reasons-aws-lambda-not-ready-prime-time/

Arguments in favor of Lambda:

http://thenewstack.io/amazon-web-services-isnt-winning-problems-poses/

There is definitely logic in the arguments posted in the above blog, but I see a serious fundamental issue in the writer's approach of "why do I need to know this". What kind of approach is that? One that I do not agree with. Knowing things in detail makes you better than the competition and more adept at solving problems; it helps you solve critical technical issues and innovate. Yes, it takes effort, so what? AWS did not come out of thin air. Many of its offerings, like Dynamo, RDS, and Kinesis, are based on cutting-edge research papers.

To be more straightforward, there are some points you need to be careful about while using the Lambda architecture.

  • Your application will be difficult to debug.
  • You will encounter bugs that are extremely hard to reproduce, simply because your production environment will be very different from your local machine and you can never replicate it.
  • Poor error reporting (a blog on this is linked above).
  • Lambda warm-up time. For batch processing, Lambda seems absolutely appropriate, but for a scalable real-time application? Well, good luck if it works for you.
  • Timeouts. Lambda always has a timeout, which means that if your code takes more time than the defined timeout, the function will never complete and will keep throwing exceptions. Keep this carefully in mind, especially when making external HTTP calls. An even more serious problem is that Lambda will cool down when there is not much to process, and the logic of this cool-down is hidden without much detail.
  • Multiple environments – create multiple Lambda functions; to hell with code reuse.
  • Application state – forget it, as a Lambda has to be stateless. You have absolutely no control over when a process runs and stops.
  • JVM optimizations – keep in mind that techniques like JIT compilation may be of no use with Lambda functions, so all JVM optimizations go out the window.
  • Throttling – something we have faced recently. The throttling limits are absolutely ridiculous for high-traffic apps. Yes, it's a soft limit and you can raise requests, but isn't that contradictory to the auto-scaling story? I thought this was the problem Lambda was solving in the first place.

The bottom line is pretty simple: AWS Lambda is a great tool, provided you use it wisely, just like any other technology, and provided that:

  • You are fine with the limited language support for Lambda.
  • You know how to deal with Lambda's limitations: timeouts, file paths, limited memory, sudden restarts, multiple copies of the same code, etc.
  • You want to avoid, and for good reasons, the complexity involved in developing distributed systems.
  • Analyzing your process's performance is beyond your skills, e.g. you have no idea about JVM tuning.
  • You are OK with OpenJDK.
  • You can be reasonably sure that your process will never hang, crash, or have a memory leak, because you won't be able to log in to the machine and analyze a dump using some advanced tool.

Apache Kafka – Simple Tutorial

 

In this post I want to highlight my fascination with Kafka and its usage.

Kafka is a broker just like "RabbitMQ" or a "JMS" provider. So what's the difference?

The differences are:

  • It is distributed.
  • It is fault tolerant, because messages are replicated across the cluster.
  • It does one thing and one thing only, i.e. transferring your messages, and does it really well.
  • It is highly scalable due to its distributed nature.
  • Tunable consistency.
  • Parallel processing of messages, unlike others which process sequentially.
  • Ordering guarantee per partition.

How do you set it up?

Kafka is inherently distributed, so you are going to have multiple machines forming a Kafka cluster.

Kafka uses ZooKeeper for leader election, among other things, so you need to have a ZooKeeper cluster already running somewhere. Otherwise, you can go to:

https://www.tutorialspoint.com/zookeeper/zookeeper_installation.htm

You install Kafka on all the machines that will participate in the Kafka cluster and open the ports where Kafka is running. Then provide the configuration of all other machines in the cluster on each machine; e.g. if Kafka is running on machines K1, K2, and K3, then K1 will have information about K2 and K3, and so on.

Yes, it's that simple.

How does it work?

The way Kafka works is you create a topic, send a message and read message at the other end.

So if there are multiple machines, how do you send a message to Kafka? You keep a list of all the machines inside your code and then send messages via the high-level Kafka producer (a helper class in the Kafka driver). A high-level Kafka consumer class is available for reading messages.

Before you send a message, create a topic with a "replication factor", which tells Kafka how many brokers will have a copy of the data.
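With the bundled command-line tools, the whole flow above can be tried out. This is a sketch assuming a Kafka 0.8-era installation (matching the design doc linked below); the hostnames `zk1`, `k1`, `k2`, `k3` and the topic name `orders` are placeholders for your own cluster.

```shell
# Create a topic with 3 partitions, each replicated to 2 brokers
bin/kafka-topics.sh --create --zookeeper zk1:2181 \
    --replication-factor 2 --partitions 3 --topic orders

# Send a few messages from the console producer (type lines, Ctrl+C to stop)
bin/kafka-console-producer.sh --broker-list k1:9092,k2:9092,k3:9092 --topic orders

# Read them back at the other end
bin/kafka-console-consumer.sh --zookeeper zk1:2181 --topic orders --from-beginning
```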

Some important terminologies related to Kafka are:

Topic – Where you publish messages. You need to create it beforehand.

Partition – Determines how many consumers can read a topic in parallel. The default is 1, but you can create hundreds.

Ordering of Messages – Guaranteed within a single partition.

TTL – Time to live for messages on disk; the default is 7 days.

Group – Kafka guarantees that a message is only ever read by a single consumer in a group. So if you want a message to be delivered only once, just put all consumers in the same group.

If you want to go deeper, here are some useful links:

https://kafka.apache.org/08/design.html

http://www.tutorialspoint.com/apache_kafka/apache_kafka_consumer_group_example.htm