Manish KC

Who I am

I'm a software engineer. I love building things, and always have. Sixteen years across side projects, military service, and a professional engineering career have shaped how I work: reliable systems, real problems, no shortcuts. Most recently at Amazon, I spent over six years building distributed systems at scale, including data pipelines, payment infrastructure, and access control platforms supporting millions of workloads. Today, I build the same way, just faster. With modern AI tooling, I design and deliver agents, pipelines, and production systems in weeks instead of months. What I bring to teams and founders is simple: systems that work, scale, and ship on time.

What I've shipped recently

  • Led the design and rollout of an enterprise-scale data access control plane, built on an event-driven architecture, that enforces compliance and visibility across more than 500,000 datasets. Delivered a microservice platform supporting over one million workloads and eliminated more than an hour of manual onboarding effort per profile across more than 400 VP-level organizations.
  • Key contributor to Amazon's exabyte-scale data lake modernization for GDPR compliance, migrating more than 200,000 datasets to standardized formats with automated data quality validation. Built detection and escalation systems processing over 15,000 issues daily and saving more than 6,000 developer hours per day, later evolving into an organization-wide platform supporting more than 60 microservices.
  • Re-architected disaster recovery for a high-throughput Redshift service by introducing compacted snapshots and redesigning reconciliation logic, reducing p90 recovery synchronization time by 99 percent during severe outage scenarios.
  • Built platform-wide observability and operational tooling supporting real-time visibility into execution status, progress, and failure ownership across distributed services, sustaining throughput of 250,000 metrics per minute and enabling faster incident response during peak operational periods.
  • Modernized authentication across nine critical production pipelines by coordinating runtime upgrades and replacing legacy symmetric-key authentication with OAuth-based identity controls across heterogeneous systems, delivering a zero-downtime rollout aligned with executive-level reliability goals.
  • Designed and scaled fault-tolerant ETL pipelines for Amazon's internal data lake processing more than 300,000 daily jobs across over 100 petabytes of data, strengthening trust in system reliability and data accuracy.
  • Built and optimized more than 100 production REST APIs with rate limiting, database tuning, and caching strategies achieving 99.99 percent availability with sub-100 millisecond response times under sustained load.
  • Built an error classification system capturing failure contexts across distributed workflows, surfacing root causes and categorizing errors by type, reducing debugging time by 70 percent and cutting operational tickets by 50 percent.

Stack

Languages: Java · Python · TypeScript · JavaScript · C++ · SQL
Frameworks: Spring Boot · FastAPI · Apache Spark · Apache Flink · LangChain · LangGraph
Cloud: AWS · Docker · Kubernetes
Data: PostgreSQL · Data Lake · Caching · Kafka
APIs: REST · OAuth · gRPC
AI: OpenAI · Anthropic · Gemini · Llama

What I'm building now

Agentic AI systems. Prototypes that become real things. Small, focused projects for teams who have the idea but need someone to build it.

Let's work together.

Whether you need someone to build something fast or want to talk through a hard problem, I'm happy to chat.