diff --git a/sources/academy/platform/getting_started/apify_client.md b/sources/academy/platform/getting_started/apify_client.md
index 9622ee324e..ce875609df 100644
--- a/sources/academy/platform/getting_started/apify_client.md
+++ b/sources/academy/platform/getting_started/apify_client.md
@@ -189,7 +189,7 @@ from apify_client import ApifyClient
 
 client = ApifyClient(token='YOUR_TOKEN')
 
-actor = client.actor('YOUR_USERNAME/adding-actor').call(run_input={
+run = client.actor('YOUR_USERNAME/adding-actor').call(run_input={
     'num1': 4,
     'num2': 2
 })
diff --git a/sources/academy/tutorials/python/process_data_using_python.md b/sources/academy/tutorials/python/process_data_using_python.md
index 5e72eaddb9..56e6aa197a 100644
--- a/sources/academy/tutorials/python/process_data_using_python.md
+++ b/sources/academy/tutorials/python/process_data_using_python.md
@@ -31,7 +31,7 @@ In the page that opens, you can see your newly created Actor. In the **Settings*
 
 First, we'll start with the `requirements.txt` file. Its purpose is to list all the third-party packages that your Actor will use. We will be using the `pandas` package for parsing the downloaded weather data, and the `matplotlib` package for visualizing it. We don't care about versions of these packages, so we list just their names:
 
-```py
+```text
 # Add your dependencies here.
 # See https://pip.pypa.io/en/latest/cli/pip_install/#requirements-file-format
 # for how to format them
@@ -44,6 +44,8 @@ The Actor's main logic will live in the `main.py` file. Let's delete everything
 
 Next, we'll import all the packages we will use in the code:
 
+
+
 ```py
 from io import BytesIO
 import os
@@ -127,6 +129,8 @@ print(f'Result is available at {os.environ["APIFY_API_PUBLIC_BASE_URL"]}'
       + f'/v2/key-value-stores/{os.environ["APIFY_DEFAULT_KEY_VALUE_STORE_ID"]}/records/prediction.png')
 ```
 
+
+
 And that's it! Now you can save the changes in the editor, and then click **Build and run** at the bottom of the page. The Actor will get built, the built Actor image will get saved for future re-use, and then it will be executed. You can follow the progress of the Actor build and the Actor run in the **Last build** and **Last run** tabs, respectively, in the developer console in the Actor source view. Once the Actor finishes running, it will output the URL where you can access the plot we created in its log.
 
 ![Building and running the BBC Weather Parser Actor](./images/bbc-weather-parser-source.png)
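A note on the `actor` → `run` rename in the first hunk: `.call()` starts the Actor, waits for the run to finish, and returns the run object, so `run` is the accurate name. The sketch below shows a typical next step, reading the run's output from its default dataset. It assumes `adding-actor` pushes its result there; `defaultDatasetId` and `list_items()` are part of the Apify Python client, but the printed shape is a guess:

```py
from apify_client import ApifyClient

client = ApifyClient(token='YOUR_TOKEN')

# .call() blocks until the run finishes and returns the run object.
run = client.actor('YOUR_USERNAME/adding-actor').call(run_input={
    'num1': 4,
    'num2': 2,
})

# Read the Actor's output from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)  # exact shape depends on what adding-actor pushes
```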
diff --git a/sources/academy/tutorials/python/scrape_data_python.md b/sources/academy/tutorials/python/scrape_data_python.md
index df8dfcdbfd..4f042c4a80 100644
--- a/sources/academy/tutorials/python/scrape_data_python.md
+++ b/sources/academy/tutorials/python/scrape_data_python.md
@@ -63,7 +63,7 @@ In the page that opens, you can see your newly created Actor. In the **Settings*
 
 First we'll start with the `requirements.txt` file. Its purpose is to list all the third-party packages that your Actor will use. We will be using the `requests` package for downloading the BBC Weather pages, and the `beautifulsoup4` package for parsing and processing the downloaded pages. We don't care about versions of these packages, so we list just their names:
 
-```py
+```text
 # Add your dependencies here.
 # See https://pip.pypa.io/en/latest/cli/pip_install/#requirements-file-format
 # for how to format them
@@ -78,6 +78,8 @@ Finally, we can get to writing the main logic for the Actor, which will live in
 
 First, we need to import all the packages we will use in the code:
 
+
+
 ```py
 from datetime import datetime, time, timedelta, timezone
 import os
@@ -205,6 +207,8 @@ default_dataset_client.push_items(weather_data)
 
 print(f'Results have been saved to the dataset with ID {os.environ["APIFY_DEFAULT_DATASET_ID"]}')
 ```
 
+
+
 ### Running the Actor
 
 And that's it! Now you can save the changes in the editor, and then click **Build and run** at the bottom of the page. The Actor will get built, the built Actor image will get saved for future reuse, and then it will be executed. You can follow the progress of the Actor build and the Actor run in the **Last build** and **Last run** tabs, respectively, in the developer console in the Actor source view. Once the Actor finishes running, you can view the scraped data in the **Dataset** tab in the Actor run view.
@@ -231,7 +235,7 @@ In the page that opens, you can see your newly created Actor. In the **Settings*
 
 First, we'll start with the `requirements.txt` file. Its purpose is to list all the third-party packages that your Actor will use. We will be using the `pandas` package for parsing the downloaded weather data, and the `matplotlib` package for visualizing it. We don't care about versions of these packages, so we list just their names:
 
-```py
+```text
 # Add your dependencies here.
 # See https://pip.pypa.io/en/latest/cli/pip_install/#requirements-file-format
 # for how to format them
@@ -244,6 +248,8 @@ The Actor's main logic will live in the `main.py` file. Let's delete everything
 
 Next, we'll import all the packages we will use in the code:
 
+
+
 ```py
 from io import BytesIO
 import os
@@ -327,6 +333,8 @@ print(f'Result is available at {os.environ["APIFY_API_PUBLIC_BASE_URL"]}'
       + f'/v2/key-value-stores/{os.environ["APIFY_DEFAULT_KEY_VALUE_STORE_ID"]}/records/prediction.png')
 ```
 
+
+
 And that's it! Now you can save the changes in the editor, and then click **Build and run** at the bottom of the page. The Actor will get built, the built Actor image will get saved for future re-use, and then it will be executed. You can follow the progress of the Actor build and the Actor run in the **Last build** and **Last run** tabs, respectively, in the developer console in the Actor source view. Once the Actor finishes running, it will output the URL where you can access the plot we created in its log.
 
 ![Building and running the BBC Weather Parser Actor](./images/bbc-weather-parser-source.png)
diff --git a/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md b/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md
index 5c9fb95543..383637020e 100644
--- a/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md
+++ b/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md
@@ -148,7 +148,7 @@ if (priceText.startsWith("From ")) {
 
 Great! Only if we didn't overlook an important pitfall called [floating-point error](https://en.wikipedia.org/wiki/Floating-point_error_mitigation). In short, computers save floating point numbers in a way which isn't always reliable:
 
-```py
+```pycon
 > 0.1 + 0.2
 0.30000000000000004
 ```
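The fence in the hunk above now reads `pycon`, marking the snippet as a REPL transcript rather than runnable source. For readers who want to reproduce the drift outside a REPL, here is a minimal runnable sketch using only the standard library:

```py
# Floating-point drift from the transcript above, as a plain script.
total = 0.1 + 0.2
print(total)         # 0.30000000000000004
print(total == 0.3)  # False, binary floats can't represent 0.1 or 0.2 exactly

# Working in whole cents sidesteps the problem entirely.
cents = 10 + 20
print(cents == 30)   # True
```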
diff --git a/sources/academy/webscraping/scraping_basics_python/04_downloading_html.md b/sources/academy/webscraping/scraping_basics_python/04_downloading_html.md
index e3866cfcb2..6441eca537 100644
--- a/sources/academy/webscraping/scraping_basics_python/04_downloading_html.md
+++ b/sources/academy/webscraping/scraping_basics_python/04_downloading_html.md
@@ -34,7 +34,7 @@ Now let's test that all works. Inside the project directory we'll create a new f
 ```py
 import httpx
 
-print("OK")
+print("OK", httpx.__version__)
 ```
 
 Running it as a Python program will verify that our setup is okay and we've installed HTTPX:
diff --git a/sources/academy/webscraping/scraping_basics_python/05_parsing_html.md b/sources/academy/webscraping/scraping_basics_python/05_parsing_html.md
index dbfa52cb9a..b8458a48c5 100644
--- a/sources/academy/webscraping/scraping_basics_python/05_parsing_html.md
+++ b/sources/academy/webscraping/scraping_basics_python/05_parsing_html.md
@@ -46,6 +46,8 @@ Now let's use it for parsing the HTML. The `BeautifulSoup` object allows us to w
 
 We'll update our code to the following:
 
+
+
 ```py
 import httpx
 from bs4 import BeautifulSoup
@@ -74,6 +76,8 @@ first_heading = headings[0]
 print(first_heading.text)
 ```
 
+
+
 If we run our scraper again, it prints the text of the first `h1` element:
 
 ```text
diff --git a/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md b/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md
index 0708dc071e..c0307592cd 100644
--- a/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md
+++ b/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md
@@ -158,12 +158,14 @@ When translated to a tree of Python objects, the element with class `price` will
 
 We can use Beautiful Soup's `.contents` property to access individual nodes. It returns a list of nodes like this:
 
-```py
+```text
 ["\n", <span class="visually-hidden">Sale price</span>, "$74.95"]
 ```
 
 It seems like we can read the last element to get the actual amount. Let's fix our program:
 
+
+
 ```py
 import httpx
 from bs4 import BeautifulSoup
@@ -198,6 +200,8 @@ The results seem to be correct, but they're hard to verify because the prices vi
     print(title, price, sep=" | ")
 ```
 
+
+
 The output is much nicer this way:
 
 ```text
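The `.contents` listing above is printed output, hence the switch from `py` to `text`. Below is a self-contained sketch of what produces it; the HTML string is a hypothetical fragment mirroring the lesson's price element, not copied from the lesson:

```py
from bs4 import BeautifulSoup

# Hypothetical fragment mirroring the lesson's price element.
html = '<div class="price"><span class="visually-hidden">Sale price</span>$74.95</div>'
soup = BeautifulSoup(html, "html.parser")

nodes = soup.select_one(".price").contents
print(nodes)      # [<span class="visually-hidden">Sale price</span>, '$74.95']
print(nodes[-1])  # $74.95, the last node is the bare text with the amount
```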
diff --git a/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md b/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md
index eb49b7ce69..7724b099cc 100644
--- a/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md
+++ b/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md
@@ -34,6 +34,12 @@ It's because some products have variants with different prices. Later in the cou
 
 Ideally we'd go and discuss the problem with those who are about to use the resulting data. For their purposes, is the fact that some prices are just minimum prices important? What would be the most useful representation of the range for them? Maybe they'd tell us that it's okay if we just remove the `From` prefix?
 
+
+
+
+
+
+
 ```py
 price_text = product.select_one(".price").contents[-1]
 price = price_text.removeprefix("From ")
@@ -51,6 +57,8 @@ else:
     price = min_price
 ```
 
+
+
 :::tip Built-in string methods
 
 If you're not proficient in Python's string methods, [.startswith()](https://docs.python.org/3/library/stdtypes.html#str.startswith) checks the beginning of a given string, and [.removeprefix()](https://docs.python.org/3/library/stdtypes.html#str.removeprefix) removes something from the beginning of a given string.
@@ -59,6 +67,8 @@ If you're not proficient in Python's string methods, [.startswith()](https://doc
 
 The whole program would look like this:
 
+
+
 ```py
 import httpx
 from bs4 import BeautifulSoup
@@ -112,7 +122,7 @@ These might be useful in some complex scenarios, but in our case, they won't mak
 
 We got rid of the `From` and possible whitespace, but we still can't save the price as a number in our Python program:
 
-```py
+```pycon
 >>> price = "$1,998.00"
 >>> float(price)
 Traceback (most recent call last):
@@ -154,7 +164,7 @@ else:
 
 Great! Only if we didn't overlook an important pitfall called [floating-point error](https://en.wikipedia.org/wiki/Floating-point_error_mitigation). In short, computers save floating point numbers in a way which isn't always reliable:
 
-```py
+```pycon
 >>> 0.1 + 0.2
 0.30000000000000004
 ```
@@ -174,6 +184,8 @@ price_text = (
 )
 ```
 
+
+
 In this case, removing the dot from the price text is the same as if we multiplied all the numbers with 100, effectively converting dollars to cents. For converting the text to a number we'll use `int()` instead of `float()`. This is how the whole program looks like now:
 
 ```py
diff --git a/sources/platform/proxy/datacenter_proxy.md b/sources/platform/proxy/datacenter_proxy.md
index 1cf81cb789..04ed35e530 100644
--- a/sources/platform/proxy/datacenter_proxy.md
+++ b/sources/platform/proxy/datacenter_proxy.md
@@ -118,7 +118,8 @@ await Actor.exit();
 
 ```python
 from apify import Actor
-import requests, asyncio
+import asyncio
+import requests
 
 async def main():
     async with Actor:
@@ -258,7 +259,8 @@ await Actor.exit();
 
 ```python
 from apify import Actor
-import requests, asyncio
+import asyncio
+import requests
 
 async def main():
     async with Actor:
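To close the loop on the `pycon` hunks above: the lesson's remedy for floating-point drift is to store prices as integer cents. A condensed sketch of that conversion follows; the exact chain of `.replace()` calls is an assumption, not a quote of the final program:

```py
price_text = "$1,998.00"

# Strip the currency formatting and drop the decimal point, which
# multiplies the value by 100: dollars become integer cents.
price = int(
    price_text
    .replace("$", "")
    .replace(",", "")
    .replace(".", "")
)
print(price)  # 199800, i.e. $1,998.00 in cents, with no float involved
```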