Welcome to the Python Data Engineer learning repository! This repo contains a structured, practical set of Jupyter notebooks and example projects for learning core Python concepts with a focus on data engineering.
Note: This summary is based on the top-level files; for a full list of all tutorials and scripts, check the GitHub repository contents.
- Overview: Introduction to Python, variables, data types, and basic operations.
- Key Concepts:
- Printing and string manipulation
- Variable assignment and naming
- Numeric, string, and boolean data types
- Type conversion, built-in functions, and string methods
- List basics and common list operations
- Overview: Mastering conditional statements for decision making.
- Key Concepts: `if`, `elif`, `else`, comparison and logical operators, nested conditions.
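A minimal sketch of how these pieces fit together, not taken from the notebooks themselves. The function names `classify_row_count` and `should_load`, and the `threshold` default, are hypothetical and chosen only for illustration.

```python
def classify_row_count(row_count, threshold=1000):
    """Label a batch by size (illustrative names and threshold)."""
    if row_count < 0:
        raise ValueError("row_count must be non-negative")
    elif row_count == 0:
        return "empty"
    else:
        # Nested condition: only reached for positive counts.
        if row_count < threshold:
            return "small"
        return "large"


def should_load(row_count, schema_ok):
    """Combine a comparison and a boolean flag with a logical operator."""
    return row_count > 0 and schema_ok
```

Calling `classify_row_count(10)` takes the `else` branch and then the nested `if`, while `should_load(0, True)` short-circuits on the first comparison.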
03. Python Loops
- Overview: Using loops to automate repetitive tasks.
- Key Concepts: `for` and `while` loops, loop control (`break`, `continue`, `pass`), iterating over collections.
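As a rough illustration of these loop constructs, the sketch below cleans a list of raw values. The helper names `clean_records` and `count_down`, and the `"STOP"` sentinel, are assumptions made for this example only.

```python
def clean_records(records):
    """Iterate a collection, skipping gaps and stopping at a sentinel."""
    cleaned = []
    for record in records:
        if record is None:
            continue  # skip missing values and move to the next item
        if record == "STOP":
            break  # leave the loop entirely at the sentinel
        cleaned.append(record.strip().lower())
    return cleaned


def count_down(start):
    """Use a while loop when the number of iterations depends on state."""
    steps = []
    while start > 0:
        steps.append(start)
        start -= 1
    return steps
```

A `for` loop suits a known collection, and a `while` loop suits a condition that changes as the loop runs. `pass` is not shown here; it is only a placeholder statement that does nothing.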
04. Python Functions
- Overview: Writing reusable blocks of code with functions.
- Key Concepts: Defining and calling functions, parameters, return values, scope, lambda functions.
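The concepts above can be sketched in a few lines. `DEFAULT_SCALE`, `to_percentages`, and `sort_by_size` are hypothetical names introduced for this example and do not come from the repository.

```python
DEFAULT_SCALE = 100.0  # module-level (global) name, readable inside functions


def to_percentages(values, scale=DEFAULT_SCALE):
    """Return each value as a share of the total, multiplied by scale."""
    total = sum(values)  # 'total' is local to this function
    if total == 0:
        return [0.0 for _ in values]
    return [scale * value / total for value in values]


def sort_by_size(files):
    """Sort dictionaries by their 'size' key using a lambda as the key function."""
    return sorted(files, key=lambda item: item["size"])
```

For example, `to_percentages([1, 1, 2])` uses the default parameter, while `to_percentages([1, 1, 2], scale=1.0)` overrides it with a keyword argument.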
05. Python Operators
- Overview: Using operators to manipulate data.
- Key Concepts: Arithmetic, assignment, comparison, logical, bitwise, and membership operators.
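A short sketch touching each operator family listed above. All variable names and values here are illustrative assumptions.

```python
a, b = 7, 3

# Arithmetic: floor division and modulo
quotient = a // b
remainder = a % b

# Assignment: augmented assignment updates a name in place
total = 10
total += a

# Comparison and logical operators
in_range = 0 < a <= 10 and b != 0

# Bitwise operators work on the binary representation of integers
both_bits = 0b0101 & 0b0011
either_bits = 0b0101 | 0b0011
shifted = 1 << 3

# Membership: test whether a value is in a collection
has_csv = "csv" in ["csv", "json", "parquet"]
```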
- Overview: Mastering data structures for efficient storage and retrieval.
- Key Concepts: Lists, tuples, sets, dictionaries, and real-world examples.
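To make the four structures concrete, here is a small sketch built around a single made-up record. The column names and values are invented for illustration.

```python
# List: ordered and mutable, good for column names that may grow
columns = ["id", "name", "city"]

# Tuple: ordered and immutable, good for a fixed row of values
row = (1, "Ada", "London")

# Dictionary: key-to-value lookup, built here by pairing columns with values
record = dict(zip(columns, row))

# Set: unordered, keeps only unique items
cities = {"London", "Paris", "London"}

# Typical operations
columns.append("country")
known_city = record["city"] in cities
```

Lists and tuples keep order, dictionaries give lookup by key, and sets remove duplicates and support fast membership checks.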
- Overview: Organizing and reusing code with modules and packages.
- Key Concepts:
- Difference between modules, packages, and libraries
- Importing and using built-in and external libraries (e.g., Pandas, NumPy, Matplotlib, Requests)
- Creating custom modules and packages
- Overview: File handling (text & CSV) and JSON management for configuration and data exchange. (JSON content previously listed separately has been merged into this section.)
- Key Concepts:
- Reading and writing text and CSV files using built-in modules and pandas
- Using `os` and `shutil` for file and directory operations
- Reading, writing, parsing, and serializing JSON with Python’s `json` module
- Data extraction and ingestion from files and JSON API responses
- Error handling and path management
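The items above can be sketched with the standard library alone (pandas is left out to keep the example dependency-free). The helper names `write_csv`, `read_csv`, and `load_json_config` are hypothetical, as is the choice to fall back to an empty dictionary when a config file is missing.

```python
import csv
import json
from pathlib import Path


def write_csv(path, rows, fieldnames):
    """Write a list of dictionaries to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_csv(path):
    """Read a CSV file into a list of dictionaries (all values are strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def load_json_config(path, default=None):
    """Parse a JSON file, returning a default if the file does not exist."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        # Assumption for this sketch: a missing config is not fatal.
        return default if default is not None else {}
```

Note that `csv.DictReader` returns every value as a string, so numeric columns need explicit conversion after reading.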
09. Python OOPs
- Overview: Object-oriented programming in Python.
- Key Concepts:
- Defining classes and creating objects
- Constructors (`__init__`)
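A minimal sketch of a class with a constructor, assuming a hypothetical `Dataset` class that is not part of the repository.

```python
class Dataset:
    """A named collection of rows (illustrative class)."""

    def __init__(self, name, rows=None):
        # __init__ runs when an object is created and sets its initial state.
        self.name = name
        # Copy the input so each object owns its own list.
        self.rows = list(rows) if rows is not None else []

    def add(self, row):
        """Append one row to this dataset."""
        self.rows.append(row)

    def __len__(self):
        """Let len(dataset) report the number of rows."""
        return len(self.rows)


orders = Dataset("orders")
orders.add({"id": 1})
```

Using `rows=None` instead of `rows=[]` as the default avoids sharing one list across every object created without an explicit `rows` argument.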
10. Python Randoms
- Overview: Working with randomness, generating random numbers and data for testing and simulations.
- Key Concepts: `random` module, `faker` library, random sampling and anonymization.
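The sketch below uses only the standard-library `random` module; `faker` is omitted because it is an external dependency. The helpers `sample_rows` and `mask_email` are hypothetical, and the masking shown is a simple illustration rather than a robust anonymization scheme.

```python
import random


def sample_rows(rows, k, seed=None):
    """Return up to k rows chosen without replacement.

    Passing a seed makes the sample reproducible, which helps in tests.
    """
    rng = random.Random(seed)  # private generator, leaves global state alone
    return rng.sample(rows, min(k, len(rows)))


def mask_email(email):
    """Hide most of the local part of an email address."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain
```

Using a dedicated `random.Random(seed)` instance rather than `random.seed()` keeps the reproducibility local to this function.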
- Overview: Reusable scripts and code blocks for modular data engineering workflows.
- Key Concepts: Encapsulating logic in functions and scripts, templates for batch processing.
- Overview: Logging and monitoring data engineering processes.
- Key Concepts: Python’s `logging` module, log formats, levels, handlers, and best practices.
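One possible way to wire a level, handler, and format together is sketched below. The function name `build_logger`, the logger name `"pipeline"`, and the format string are assumptions for illustration, not the repository's configuration.

```python
import logging


def build_logger(name="pipeline", level=logging.INFO):
    """Return a logger that writes formatted messages to the console."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Guard against adding a second handler if this is called twice,
    # which would otherwise duplicate every log line.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


log = build_logger()
log.info("Loaded %d rows", 120)
log.debug("This message is below INFO and is not emitted")
```

Passing values as arguments (`"Loaded %d rows", 120`) instead of pre-formatting the string lets the logging module skip the formatting work for messages that are filtered out.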
- Overview: Common Python interview questions and concise answers, focusing on practical explanations.
- Overview: Example Streamlit apps and instructions to run them.
- Key Concepts: Installing Streamlit, building and running simple apps, basic visualization with Plotly.
- Overview: End-to-end projects to apply learned concepts.
- Contents: Project ideas, example implementations, and deployment notes.
- Browse Notebooks: Start with the Jupyter notebooks in the main directory for a structured learning path.
- Explore Directories: Check out the additional folders for sample scripts, data, and projects.
- Try the Code: Run the notebooks locally or in an online Jupyter environment.
- Contribute: Pull requests to add new topics or improve examples are welcome!