
👩🏻‍🚀 16 – Data Mining: LLM Tabular Preprocessing with Dictionary Groups - Dictionary-Based Feature Grouping for LLM/AI Pipelines


Quantum-Software-Development/16-DataMining_llm-tabular-preprocessing-dict-groups



[🇧🇷 Português] [🇬🇧 English]



Dictionary-Based Feature Grouping for LLM/AI Pipelines



Institution: Pontifical Catholic University of São Paulo (PUC-SP)
School: Faculty of Interdisciplinary Studies
Program: Humanistic AI and Data Science
Semester: 2nd Semester 2025
Professor: Daniel Rodrigues da Silva, Doctor in Mathematics



==================================

Yupiiii 👩🏻‍🚀 Still Building ๋ ⭑🛸๋⭑

=================================









Important

⚠️ Heads Up







🎶 Prelude Suite no.1 (J. S. Bach) - Sound Design Remix
Statistical.Measures.and.Banking.Sector.Analysis.at.Bovespa.mp4

📺 For better resolution, watch the video on YouTube.



Tip

This repository is a review of the Statistics course from the undergraduate program Humanities, AI and Data Science at PUC-SP.

☞ Access Data Mining Main Repository



  • Chen, X., et al. (2024). LLM-based feature generation from text for interpretable machine learning. arXiv preprint. Retrieved from arxiv.org/html/2409.07132v2

  • DataCamp. (2024). Pandas GroupBy Explained: Syntax, Examples, and Tips. Retrieved from datacamp.com/tutorial/pandas-groupby

  • GeeksforGeeks. (2024). Pandas dataframe.groupby() Method. Retrieved from geeksforgeeks.org/pandas/python-pandas-dataframe-groupby

  • Machine Learning Mastery. (2024). Feature Engineering with LLM Embeddings: Enhancing Scikit-learn Models. Retrieved from machinelearningmastery.com

  • McKinney, W. (2017). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython (2nd ed.). O’Reilly Media.

  • Pandas Documentation. (2024). Group by: split-apply-combine. Retrieved from pandas.pydata.org/docs/user_guide/groupby.html

  • VanderPlas, J. (2016). Python Data Science Handbook: Essential Tools for Working with Data. O’Reilly Media.





🛸๋ My Contacts Hub





────────────── 🔭⋆ ──────────────

➣➢➤ Back to Top

Copyright 2026 Quantum Software Development. Code released under the MIT License.
