Merged
20 changes: 12 additions & 8 deletions docs/index.md
@@ -25,27 +25,31 @@ the following additional dependencies need to be available:
* [libosmium](https://github.com/osmcode/libosmium) >= 2.16.0
* [protozero](https://github.com/mapbox/protozero)
* [cmake](https://cmake.org/)
* [Pybind11](https://github.com/pybind/pybind11) >= 2.2
* [expat](https://libexpat.github.io/)
* [libz](https://www.zlib.net/)
* [libbz2](https://www.sourceware.org/bzip2/)
* [Boost](https://www.boost.org/) variant and iterator >= 1.41
* [Python Requests](https://docs.python-requests.org/en/master/)
* Python setuptools
* a recent C++ compiler (Clang 3.4+, GCC 4.8+)

The following additional dependencies are automatically installed as part
of the build process:

* [scikit-build-core](https://scikit-build-core.readthedocs.io/en/latest/)
* [Pybind11](https://github.com/pybind/pybind11)

On Debian/Ubuntu-like systems, the following command installs all required
packages:

    sudo apt-get install python3-dev build-essential cmake libboost-dev \
                         libexpat1-dev zlib1g-dev libbz2-dev

libosmium, protozero and pybind11 are shipped with the source wheel. When
building from source, you need to download the source code and put it
in the subdirectory 'contrib'. Alternatively, if you want to put the sources
somewhere else, point pyosmium to the source code location by setting the
CMake variables `LIBOSMIUM_PREFIX`, `PROTOZERO_PREFIX` and
`PYBIND11_PREFIX` respectively.
Compatible versions of libosmium and protozero are shipped with the source
wheel. When building from source, you need to download the source code of these
two libraries and put it in the subdirectory 'contrib'. Alternatively,
if you already have the sources somewhere else,
point pyosmium to the source code location by setting the
CMake variables `Libosmium_ROOT` and `Protozero_ROOT`.

To compile and install the bindings, run

8 changes: 6 additions & 2 deletions docs/user_manual/01-First-Steps.md
@@ -140,7 +140,7 @@ out about the tags. It is also always useful to consult
different keys and values in actual use.

Tags are common to all OSM objects. After that there are three kinds of
objects in OSM: nodes, ways and relations.
object types in OSM: nodes, ways and relations.

### Nodes

@@ -187,7 +187,7 @@ backward references when talking about the dependencies between objects:

* A __forward reference__ means that an object is referenced by another.
Nodes appear in ways. Ways appear in relations. And a node may even have
an indirect forward reference to a relation through a way it appear in.
an indirect forward reference to a relation through a way it appears in.
Forward references are important when tracking changes. When the location
of a node changes, then all its forward references have to be reevaluated.

@@ -198,6 +198,10 @@ backward references when talking about the dependencies between objects:
to follow the backward references for ways and relations until we reach
the nodes.

Closely related to backward references is the concept of __reference
completeness__. A dataset or file is considered reference complete when
all backward references can be resolved.
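
A reference-completeness check can be sketched in plain Python. This is only an illustration on a toy in-memory dataset, not pyosmium code: the data layout (a set of node IDs, a dict of ways, a dict of relations with `(type, id)` members) and the function names `missing_references` and `is_reference_complete` are assumptions made for this example.

```python
def missing_references(nodes, ways, relations):
    """Return the set of (type, id) pairs that are referenced but absent.

    nodes:     iterable of node IDs
    ways:      dict mapping way ID -> list of node IDs
    relations: dict mapping relation ID -> list of (type, id) members,
               where type is 'n', 'w' or 'r'
    """
    present = {"n": set(nodes), "w": set(ways), "r": set(relations)}
    missing = set()
    # Backward references of ways point to nodes.
    for node_refs in ways.values():
        missing.update(("n", ref) for ref in node_refs if ref not in present["n"])
    # Backward references of relations may point to any object type.
    for members in relations.values():
        missing.update((t, ref) for t, ref in members if ref not in present[t])
    return missing


def is_reference_complete(nodes, ways, relations):
    """A dataset is reference complete when nothing is missing."""
    return not missing_references(nodes, ways, relations)


# Toy data: way 11 needs node 4 and relation 100 needs way 12.
nodes = {1, 2, 3}
ways = {10: [1, 2, 3], 11: [3, 4]}
relations = {100: [("w", 10), ("w", 12), ("n", 1)]}

print(missing_references(nodes, ways, relations))
print(is_reference_complete(nodes, ways, relations))
```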

## Order in OSM files

OSM files usually follow a sorting convention to make life easier for
23 changes: 20 additions & 3 deletions docs/user_manual/02-Extracting-Object-Data.md
@@ -14,7 +14,8 @@ Finally, there is a type for changesets, which contains information about
edits in the OSM database. It can only appear in special changeset files
and is explained in more detail [below](#changeset).

The FileProcessor may return any of these objects, when iterating over a file.
When iterating over a file, the FileProcessor may return any of these
objects.
Therefore, a script will usually first need to determine the type of object
received. There are a couple of ways to do this.

@@ -83,7 +84,7 @@ You can simply test for this object type:
## Reading object tags

Every object has a list of properties, the tags. They can be accessed through
the `tags` property, which provides a simple dictionary-like view of the tags.
the `tags` property. It provides a simple dictionary-like view of the tags.
You can use the bracket notation to access a specific tag or use the more
explicit `get()` function. Just like for Python dictionaries, an access by
bracket raises a `KeyError` when the key you are looking for does not exist,
@@ -140,7 +141,23 @@ list into a Python dictionary:
## Other common meta information

Next to the tags, every OSM object also carries some meta information
describing its ID, version and information regarding the editor.
all of which can be accessed through read-only properties.

The most important meta information is the object's ID in the `id` property.
This is the ID used when objects reference each other.

The other meta fields contain information about when and by whom the object was edited.
The following table gives a quick overview of these fields:

| Property | Description |
|-----------|--------------------------|
| version | Version of the object. A newly created object starts with version 1. |
| deleted | A boolean property stating if the object should be used or ignored. Only relevant for [change](08-Working-With-Change-Files.md) and [history](09-Working-With-History-Files.md) files. |
| changeset | The ID of the change set this object was created with. A change set contains a set of edits that have been uploaded by an editor in a single session. |
| timestamp | UTC time at which the object was created, or more precisely, added to the database. |
| uid | The ID of the user who created this version of the object. User IDs are unique and permanent. |
| user | The name of the user who created this version of the object. This is the name the user had when the object was created. User names may be changed over time. The same name in different objects doesn't necessarily reference the same user. |


## Properties of OSM object types

60 changes: 57 additions & 3 deletions docs/user_manual/06-Writing-Data.md
@@ -18,6 +18,18 @@ pyosmium will refuse to overwrite any existing files. Either make sure to
delete the files before instantiating a writer or use the parameter
`overwrite=True`.

All writers are [context managers](https://docs.python.org/3/reference/datamodel.html#context-managers).
To ensure that the file is properly closed at the end, the recommended way to
use them is in a `with` statement:

!!! example
    ```python
    import osmium

    with osmium.SimpleWriter('my_extra_data.osm.pbf') as writer:
        # do stuff here
        pass
    ```

When a writer is not used inside a `with` block, don't forget to call the
`close()` function explicitly to close it.
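
One way to make sure `close()` is always reached without a `with` block is a `try`/`finally` construct. The sketch below is an illustration only: `write_all` is a hypothetical helper, and `RecordingWriter` is a stand-in class invented so that the example runs without pyosmium installed. It assumes nothing about a real writer beyond an `add()` and a `close()` method.

```python
def write_all(writer, objects):
    """Add every object to the writer and always close it afterwards."""
    try:
        for obj in objects:
            writer.add(obj)
    finally:
        # Runs even when add() raises, so the output is always finalised.
        writer.close()


class RecordingWriter:
    """Hypothetical stand-in for a writer, used only for this demo."""

    def __init__(self):
        self.written = []
        self.closed = False

    def add(self, obj):
        self.written.append(obj)

    def close(self):
        self.closed = True


demo = RecordingWriter()
write_all(demo, ["node 1", "node 2"])
print(demo.written, demo.closed)
```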

Once a writer is instantiated, one of the `add*` functions can be used to
add an OSM object to the file. You can either use one of the
`add_node/way/relation` functions to force writing a specific type of
@@ -27,9 +39,6 @@ they are given to the writer object. It is your responsibility as a user to
make sure that the order is correct with respect to the
[conventions for object order][order-in-osm-files].

After writing all data the writer needs to be closed using the `close()`
function. It is usually easier to use a writer as a context manager.

Here is a complete example for a script that converts a file from OPL format
to PBF format:

@@ -129,3 +138,48 @@ pyosmium implements three different writer classes: the basic
the two reference-completing writers
[ForwardReferenceWriter][osmium.ForwardReferenceWriter] and
[BackReferenceWriter][osmium.BackReferenceWriter].

### Writing specific objects only

The [SimpleWriter][osmium.SimpleWriter] creates an OSM data file by directly
writing out any OSM object that it receives in the chosen format.


### Writing reference-complete files

The [BackReferenceWriter][osmium.BackReferenceWriter] will make sure that the
file that is written out is reference-complete, meaning all objects that are
directly referenced by the object written are added to the output file as well.
This is needed when you want to make sure that geometries can be recreated
from the objects in the file.

Creating a file with backward references is a two-stage process: while the
writer is open, it writes all objects received through one of the `add_*()`
functions into a temporary file and keeps a record of which objects are needed
to make the file reference-complete. Once the writer is closed, it collects the
missing objects from a given reference file, merges them with the data from
the temporary file and writes out the final result.
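
The two stages can be illustrated with a small conceptual model in plain Python. This is not pyosmium's implementation and does not use its API: the class `ToyBackReferenceWriter`, the in-memory `reference` dict standing in for the reference file, and the `(type, id)` keys are all assumptions made for this sketch.

```python
class ToyBackReferenceWriter:
    """Conceptual model of two-stage backward completion (not pyosmium code)."""

    TYPE_ORDER = {"n": 0, "w": 1, "r": 2}

    def __init__(self, reference):
        # reference: dict mapping (type, id) -> list of directly referenced
        # (type, id) keys; it plays the role of the reference file.
        self.reference = reference
        self.buffer = {}     # stands in for the temporary file
        self.needed = set()  # objects required for reference completeness
        self.result = None

    def add(self, key):
        # Stage 1: buffer the object and remember what it refers to.
        refs = self.reference[key]
        self.buffer[key] = refs
        self.needed.update(refs)

    def close(self):
        # Stage 2: fetch the missing objects from the reference data,
        # merge them with the buffered ones and emit nodes, ways, relations.
        missing = self.needed - set(self.buffer)
        merged = set(self.buffer) | {k for k in missing if k in self.reference}
        self.result = sorted(merged, key=lambda k: (self.TYPE_ORDER[k[0]], k[1]))
        return self.result


reference = {
    ("n", 1): [], ("n", 2): [], ("n", 3): [],
    ("w", 10): [("n", 1), ("n", 2)],
    ("w", 11): [("n", 2), ("n", 3)],
}

writer = ToyBackReferenceWriter(reference)
writer.add(("w", 10))
output = writer.close()
print(output)
```

Only the directly referenced nodes 1 and 2 are pulled in; node 3 and way 11 stay out because nothing that was written refers to them.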

### Writing files with forward references

The [ForwardReferenceWriter][osmium.ForwardReferenceWriter] completes the
written objects with forward references. This is particularly useful when
creating geographic extracts of any kind: one selects the nodes of interest
in a particular area and then lets the ForwardReferenceWriter complete the
ways and relations referring to the nodes.

Files written by the ForwardReferenceWriter are not necessarily
reference-complete. That is easy to see when considering the example of the
geographic extract: there may be ways in the area that cross the boundary
of the area chosen but only the nodes within the area are written out. This
might be useful in many situations as the way would simply appear to be cut
at the boundary of the area of interest. However, it has the disadvantage that some objects
will get invalid geometries, especially when they represent areas.
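
The boundary effect can be reproduced on toy data. The following plain-Python sketch does not use pyosmium; the functions `forward_complete` and `unresolved_nodes` and the dict-of-ways layout are assumptions chosen for illustration.

```python
def forward_complete(selected_nodes, ways):
    """Return the IDs of all ways that reference at least one selected node."""
    selected = set(selected_nodes)
    return sorted(wid for wid, refs in ways.items() if selected.intersection(refs))


def unresolved_nodes(selected_nodes, ways, way_ids):
    """Return node IDs used by the given ways that were not selected."""
    selected = set(selected_nodes)
    used = {ref for wid in way_ids for ref in ways[wid]}
    return sorted(used - selected)


# Way 11 crosses the boundary: node 4 lies outside the area of interest.
ways = {10: [1, 2], 11: [2, 3, 4], 12: [5, 6]}
inside = [1, 2, 3]

picked = forward_complete(inside, ways)
dangling = unresolved_nodes(inside, ways, picked)
print(picked, dangling)
```

A non-empty `dangling` list is exactly the situation described above: the extract contains a way whose nodes are not all present, so it is not reference-complete.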

The other thing to consider during forward completion is indirect references.
When completing relations indirectly referenced through ways or other relations,
the resulting file can become big very quickly. For example, a seemingly
small extract of the city of Strasbourg can suddenly contain not only the
relations for France and Germany but also electoral boundaries and entire
timezones. For that reason, when forward-completing relations, it is not
recommended to use backward completion.
2 changes: 1 addition & 1 deletion docs/user_manual/07-Input-Formats-And-Other-Sources.md
@@ -55,7 +55,7 @@ The special file name `-` can be used to read from standard input or
write to standard output.

When reading data, use a `File` object to specify the file format. With
the SimpleReader, you need to use the parameter `filetype`.
the SimpleWriter, you need to use the parameter `filetype`.

!!! example
This code snippet dumps all ids of your input file to the console.