Ioannis Kalyvas

Lead Cloud Infra Architect
Zurich, ZH

Summary

Dynamic cloud computing professional with expertise in designing and optimizing cloud environments to drive operational efficiency and align with business objectives. Proven success in delivering impactful solutions that enhance performance and streamline processes. Dependable team collaborator, adept at adapting to evolving project requirements while maintaining a focus on innovation. Proficient in cloud architecture and infrastructure management, leveraging technical skills to support organizational growth and transformation.

Overview

15 years of professional experience
2 languages

Work History

Lead DevOps Cloud Architect @ AWS

AWS
03.2023 - Current
  • Led the architecture and implementation of a nationwide, serverless management framework for NTT DOCOMO, orchestrating thousands of EKS-A clusters for 5G workloads, enabling reliable and scalable operations across Japan.
  • Provided technical leadership for large-scale Kubernetes (EKS/EKS-A) deployments, including multi-region and multi-tenant architectures, reducing incident resolution time by ~30%.
  • Collaborated with Sunrise to deliver public web platforms on EKS using ArgoCD, ensuring high availability and security for millions of users.
  • Drove innovation in cloud-native operations, leveraging AWS Lambda, Step Functions, and serverless patterns to monitor and manage distributed infrastructure at scale, handling thousands of daily operational tasks.
  • Advised executive leadership on security, compliance, and cost optimization in complex, regulated environments, reducing cloud spend by ~15% while maintaining performance and compliance standards.
  • Architected and deployed a GenAI-powered chatbot for Swisscom, using Retrieval-Augmented Generation (RAG) and AWS Bedrock, for advanced customer support and knowledge management, serving tens of thousands of users.
  • Designed and launched the Swisscom ML Factory, a scalable platform for AI/ML workloads with NVIDIA GPUs, reducing model training times by 50–70% and accelerating AI adoption across the organization.
  • Built a secure and resilient image processing pipeline for Sanofi in the life sciences industry, enabling real-time analysis of medical images, improving throughput by over 350x and ensuring SLA compliance.

Principal Cloud Architect & AI Solutions Lead

Swisscom
03.2021 - 03.2023
  • Conceived and led the GitOps approach for nationwide 5G deployment using FluxCD, making Swisscom one of the first operators globally to use GitOps for 5G, and presented this innovation at KubeCon.
  • Led the design and implementation of a cloud-native platform for Service Management and Orchestration (SMO) for 5G networks, enabling automated lifecycle management and orchestration of network services written in Golang.
  • Orchestrated deployment and maintenance of containerized microservices using Kubernetes (EKS), Helm, and Terraform.
  • Improved scalability and reliability by implementing monitoring solutions (Prometheus, Grafana, AWS CloudWatch) and setting performance metrics.
  • Automated routine operations and optimized pipelines to support service quality.
  • Provided leadership for resolving system issues quickly and efficiently.

Senior Engineer & DevOps Leader

Confidential Start Up
06.2019 - 10.2020
  • Designed web services with Django REST Framework on Amazon EKS, emphasizing reliability and scalability.
  • Implemented GitOps practices with FluxCD and streamlined deployments via GitHub Actions.
  • Developed Helm charts and automated testing pipelines to boost productivity.
  • Enhanced query performance on Amazon RDS (PostgreSQL) by automating maintenance tasks.
  • Mentored team members and fostered a collaborative, adaptive work environment.
  • ML time series analysis: Designed and implemented a predictive autoscaler for OSM MANO, an open source Management and Orchestration (MANO) stack aligned with ETSI NFV Information Models (https://osm.etsi.org). The autoscaler was integrated into Telefonica's on-premises OSM MANO installation. Based on the CPU load of the VMs on the operator's premises, it proactively predicts how many VMs need to be launched or shut down as traffic increases or decreases, so that the operator's SLAs are met.
  • Open source contribution: Contributed a fix to the OSM MANO upstream repository for a bug I found while developing the autoscaler (https://osm.etsi.org/gerrit/c/osm/LCM/+/7556).
  • Docker: Built the autoscaler as a dockerized application using Django, Postgres, and Python containers, deployed via Docker Swarm.
  • Smart Mobile Edge Computing: Designed and implemented CPU-load-based clustering of VMs in a SmartMEC environment. The application is dockerized and deployed with Docker Swarm, and the ML algorithm was based on a research paper.
  • Deployment: Deployed the clustering application in the FLAME square in Bristol on real equipment under real user load.
  • GCP and configuration management: Worked on Google Cloud Platform, with deployments done using Ansible.

Senior Engineer & Team Lead

Dialog Semiconductor
02.2017 - 08.2018
  • Developed predictive auto-scaling systems to manage resource allocation effectively, as part of the OSM MANO open source project.
  • Improved system resilience with bug fixes and open-source contributions.
  • Designed containerized applications using Django, Postgres, and Python; streamlined deployments with Docker Swarm.
  • Implemented structured incident management practices that minimized manual interventions.
  • Led automated deployments using Ansible to ensure consistency and stability.
  • Bluetooth Low Energy SDK

Infrastructure & DevOps Engineer

Commsquare BVBA
03.2016 - 02.2017
  • Identified performance bottlenecks in a single-threaded Python Pandas ETL pipeline, which previously required ~7 minutes to process the data, and architected a migration to Apache Spark.
  • Designed and implemented a Spark-based solution for ETL and KPI monitoring, reducing processing time to 7 seconds and ensuring the company consistently met SLA targets.
  • Supported Big Data processing frameworks with a focus on fault tolerance, scalability, and high availability during production upgrades.
  • Automated code deployment processes with CI/CD pipelines, improving overall efficiency and reliability.

Senior Software Engineer

Nokia
09.2010 - 05.2016
  • Developed scalable REST APIs for dynamic resource allocation.
  • Led projects aimed at improving system continuity by minimizing downtime through testing and automation.
  • Built a Python-based CLI tool that streamlined upgrade procedures and increased reliability, and contributed to technical documentation and internal training for operational best practices.
  • Conducted debugging and root cause analyses in production environments.
  • Automated routine tasks to enhance team productivity and system efficiency.
  • Served on a release team responsible for implementing and testing key features for a major Japanese telecom operator (NTT DOCOMO).
  • Implemented security best practices, safeguarding sensitive data and minimizing vulnerabilities.
  • Enhanced platform performance by optimizing infrastructure and implementing automation tools.
  • Championed best practices in code quality through code reviews, refactoring initiatives, and documentation efforts.

Education

MSc - Digital Signal Processing and Communication Systems

University of Patras
Patras, Greece
04.2008

MEng - Computer Engineering and Informatics

University of Patras
Patras, Greece
04.2006

Skills

Deep expertise in cloud-native architectures, Kubernetes, Docker, and large-scale orchestration for telecom and enterprise workloads

Personal Information

English (fluent)
Greek (native)

Scientific Interests

Distributed Systems, Service Reliability, Automation, Capacity Planning, Big Data, Machine Learning, Software Development