Data Scientist | AI & Quantum ML Researcher | Builder of practical AI systems
I apply machine learning and generative AI to real problems, from OCR/NLP pipelines and computer vision to LLMs and agentic workflows, and I research Quantum Machine Learning (QML) for healthcare (e.g., breast-cancer detection). I enjoy shipping end-to-end: data engineering → modeling → evaluation → deployment.
- Quantum ML (PhD @ UM5, Rabat): QSVMs, VQCs, QNNs with Qiskit, noise-aware kernels, feature selection, and benchmarking vs classical baselines.
- Port analytics & recommender systems: Data pipelines & KPIs to optimize operations and decision-making.
- LLM & Agentic AI apps: Retrieval + tool-use + multi-step reasoning for practical assistants.
Core AI/ML: ANN • CNN • RNN • LSTM • Transformers • LLMs (instruction-tuned & RAG) • Generative AI (prompting, fine-tuning) • Agentic AI (tools/planning)
NLP & OCR: spaCy / NLTK • Hugging Face • TF-IDF/Word2Vec • PDF/Image OCR (Tesseract & DL-based)
Vision & Audio: OpenCV • PIL • basic ASR/audio feature extraction
Quantum: Qiskit • quantum kernels • variational circuits
Data & MLOps: Python • NumPy • Pandas • scikit-learn • SQL • Excel • Matplotlib • Git/GitHub
Cloud & Tools: Google Cloud • VS Code • PyCharm • MATLAB
Langs: Python • R • SQL • (some) C/Java
- QML for healthcare: Designed kernels & circuits for small, noisy biomedical datasets; compared scaling strategies and feature-subset sizes against classical baselines.
- Port operations analytics: Built pipelines to clean, join, and model multi-source data, surfacing KPIs and decision recommendations.
- Resume intelligence (PFE, final-year project): End-to-end OCR → NLP classification with GAN-based quality enhancement and PCA-based dimensionality reduction.
Detailed roles, talks, and certifications are in my CV; highlights include national conference presentations and an IBM Qiskit Fall Fest hackathon win.
- quantum-breast-cancer-qsvc — QSVM/VQC vs classical baselines with scaling & feature maps, confusion matrices, and AUC/F1 dashboards.
- agentic-rag-porter — LLM+tools assistant for KPIs and recommendation queries over port datasets (RAG + evaluations).
- ocr-nlp-resume-pipeline — OCR → text cleanup → embedding/classification; includes dataset cards & reproducible notebooks.
- vision-violence-detection-mvp — Simple CV/audio fusion baseline (CLIP/AST) with a FastAPI inference endpoint.
- Variational Algorithm Design (IBM) • Azure Fundamentals (AZ-900, in progress)
- Oral and poster presentations at Moroccan conferences on QML for medical diagnostics and on classical vs. quantum ML
- Qiskit Fall Fest Hackathon Winner (IBM)
- Email: souhiabbenbouazza@gmail.com
- LinkedIn: linkedin.com/in/souhaib-benbouazza/
- Location: Salé, Morocco (open to relocation)
I love turning research into working prototypes and teaching complex topics simply. When I’m not coding or running experiments, you’ll find me doing calisthenics, swimming, or creating accessible educational content.