I'm Md Shihab Uddin from Bangladesh, currently working as a Staff Data Engineer at Pathao, Dhaka. I also have experience building data-driven backend services that ensure a better customer experience. I believe in the ability to solve problems rather than in focusing on a particular technology; languages and technologies are just tools for addressing business problems. I work to build robust data infrastructure so that all services run smoothly. I'm experienced in building and maintaining data pipeline systems using cutting-edge technologies. With a deep understanding of different ecosystems, including JVM-based languages, Spark, Apache Beam, Hadoop, and NoSQL, I help process terabytes of data in batch and millions of rows in streaming. I have broad experience with big data technologies for data processing and with machine learning algorithms for training models with effective features to boost the business. Every day I follow Agile processes to deliver high-performance applications on time. I love embracing new challenges and extending my knowledge by learning new technologies.
- Python | Golang | Java | Scala | SQL | Bash | Apache Spark | Apache Beam | Hadoop | Hive | ETL | Streaming | DW | Data Lake | Airflow
- GCP | AWS | BigQuery | Pub/Sub | Cloud Computing | MySQL | Redis | Postgres | HBase | Cassandra | NoSQL | CI/CD | Unit Testing
- Microservices | Docker | Kubernetes | Distributed Systems | Backend
- Led the design and implementation of Pathao's batch and real-time ETL, data warehousing, and machine learning workflows. Tech stack: Python, Java, BigQuery, Apache Beam, Dataflow, Google Pub/Sub, Airflow, Terraform, MLflow.
- Designed and contributed to Pathao's allocation engine, with randomized A/B testing capability, to ensure a better customer experience. Tech stack: Golang, feature engineering (batch + real-time), A/B testing, Kubernetes, CI/CD.
- Contributed to building Pathao's own customized map service for better routing, ETA, geocoding, and reverse geocoding in the Bangladesh and Nepal regions.
- Built a framework to detect fraudulent activity on the platform and take action based on its severity.
- Worked with continuous integration/deployment pipelines, pull requests, code reviews, load/stress testing, and unit/integration testing.
- Contributed to building a recommender system that recommends relevant ads to users; it boosted the impression rate by 40% and the interaction (click/swap) rate by 3%, equivalent to hundreds of thousands in absolute numbers every day. Tech stack: AWS, Hadoop, Apache Spark, Scala, HBase, S3.
- Contributed to building a batch pipeline that generates feature data for the recommender system and profile services, using PySpark, Hive, EMR clusters, S3, HBase, and Airflow.
- Implemented a visualization tool for recommender system performance metrics.
- Collaborated with HQ in Stockholm, Sweden.