eds4jinja2 package

Subpackages

Module contents

build_eds_environment(external_data_source_builders={'add_relative_figures': <function <lambda>>, 'from_endpoint': <function <lambda>>, 'from_file': <function <lambda>>, 'from_rdf_file': <function <lambda>>, 'invert_dict': <function <lambda>>, 'namespace_inventory': <function <lambda>>, 'replace_strings_in_tabular': <function <lambda>>, 'simplify_uri_columns_in_tabular': <function <lambda>>}, external_filters={'escape_latex': <function <lambda>>}, **kwargs)[source]

Creates a JINJA environment and injects the EDS functions into its global context.

Parameters
  • external_filters – additional filters to be made available in the templates

  • external_data_source_builders – additional instructions to be made available in the templates

  • kwargs

Returns

inject_environment_globals(jinja_environment: jinja2.environment.Environment, context: dict, update_existent=True)[source]

Inject the context into the JINJA2 environment, making it globally available from any template. The environment's global dictionary is updated in place: keys that do not exist yet are added, and keys that already exist are overwritten only if the update_existent flag is set.

Parameters
  • update_existent – whether the overlapping values shall be overwritten

  • context – additional context to be injected

  • jinja_environment – JINJA environment to be updated

Returns
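
The update rule described for inject_environment_globals can be sketched with plain dictionaries. This is a minimal illustration rather than the library's implementation: merge_context is a hypothetical helper, and the env_globals dict merely stands in for the environment's global dictionary.

```python
def merge_context(env_globals: dict, context: dict, update_existent: bool = True) -> dict:
    """Update env_globals in place with context (hypothetical helper)."""
    for key, value in context.items():
        # New keys are always added; existing keys only when the flag is set.
        if key not in env_globals or update_existent:
            env_globals[key] = value
    return env_globals


env_globals = {"title": "Report", "from_file": "original builder"}

# With update_existent=False, "title" is kept and only "author" is added.
merge_context(env_globals, {"title": "Draft", "author": "EDS"}, update_existent=False)
after_first_merge = dict(env_globals)

# With update_existent=True, the overlapping "title" is overwritten.
merge_context(env_globals, {"title": "Draft"}, update_existent=True)
```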

class FileDataSource(file_path)[source]

Bases: eds4jinja2.adapters.base_data_source.DataSource

Fetches data from a file. Automatically determines the file type.

  • Supported tabular file types: “.csv”, “.tsv”, “.xlsx”, “.xls”

  • Supported tree file types: “.json”, “.yaml”, “.yml”, “.toml”, “.json-ld”, “.jsonld”

To read a CSV file

>>> ds = FileDataSource("path/to/the/file.csv")
>>> pd_data_frame, error = ds.fetch_tabular()

To read a JSON file

>>> ds = FileDataSource("path/to/the/file.json")
>>> tree, error = ds.fetch_tree()
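
The file-type detection described above can be sketched as a lookup on the file extension. This is an assumption about the approach, not the library's code: representation_for is a hypothetical helper, and lowercasing the extension is a choice made here for illustration.

```python
from pathlib import Path

# Extension lists copied from the FileDataSource description.
TABULAR_EXTENSIONS = {".csv", ".tsv", ".xlsx", ".xls"}
TREE_EXTENSIONS = {".json", ".yaml", ".yml", ".toml", ".json-ld", ".jsonld"}


def representation_for(file_path: str) -> str:
    """Return "tabular" or "tree" for a supported extension (hypothetical helper)."""
    # Assumption: the extension is compared case-insensitively.
    extension = Path(file_path).suffix.lower()
    if extension in TABULAR_EXTENSIONS:
        return "tabular"
    if extension in TREE_EXTENSIONS:
        return "tree"
    raise ValueError(f"Unsupported file type: {extension!r}")


csv_kind = representation_for("path/to/the/file.csv")
json_kind = representation_for("path/to/the/file.json")
```
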
property file_path: pathlib.Path

The location of the DataSource file

Returns

property _file_extension
_can_be_tree() bool[source]
_can_be_tabular() bool[source]
_fetch_tree()[source]

fetch data and return as tree representation

Returns

_fetch_tabular()[source]

fetch data and return as tabular representation

Returns

_abc_impl = <_abc_data object>
class RemoteSPARQLEndpointDataSource(endpoint_url)[source]

Bases: eds4jinja2.adapters.base_data_source.DataSource

Fetches data from a SPARQL endpoint. Can be used either with a SPARQL query or a URI to be described.

To query a SPARQL endpoint and get the results as a dict object

>>> ds = RemoteSPARQLEndpointDataSource(sparql_endpoint_url)
>>> dict_object = ds.with_query(sparql_query_text)._fetch_tree()

Unpack the content and error for fail-safe fetching

>>> dict_object, error_string = ds.with_query(sparql_query_text).fetch_tree()

To describe a URI and get the results as a pandas DataFrame

>>> pd_dataframe = ds.with_uri(existent_uri)._fetch_tabular()

Unpack the content and error for fail-safe fetching

>>> pd_dataframe, error_string = ds.with_uri(existent_uri).fetch_tabular()

To describe a URI within a named graph

>>> pd_dataframe, error_string = ds.with_uri(existent_uri, named_graph).fetch_tabular()
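
The fail-safe pattern used in the examples above, where the public fetch methods return a (content, error) pair instead of raising, can be sketched as follows. This is an illustration under assumptions: fail_safe and failing_fetch are hypothetical names, and the exact error text produced by the library may differ.

```python
from typing import Any, Callable, Optional, Tuple


def fail_safe(fetch: Callable[[], Any]) -> Tuple[Optional[Any], Optional[str]]:
    """Call fetch() and return (content, error_string) instead of raising."""
    try:
        return fetch(), None
    except Exception as error:  # deliberately broad: any failure becomes a string
        return None, str(error)


def failing_fetch():
    # Stands in for a fetch that cannot reach its data source.
    raise ConnectionError("endpoint unreachable")


content, error_string = fail_safe(lambda: {"results": []})
missing, failure = fail_safe(failing_fetch)
```

A caller can then test the error value first and use the content only when the error is None.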
with_query(sparql_query: str, substitution_variables: Optional[dict] = None, sparql_prefixes: str = '') eds4jinja2.adapters.remote_sparql_ds.RemoteSPARQLEndpointDataSource[source]

Set the query text and return the reference to self for chaining.

Returns
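
The chaining behaviour of with_query can be sketched with a small stand-in class. Everything here is hypothetical: QueryHolder is not part of the package, the "$name" placeholder syntax of string.Template is an assumption (the library may use a different delimiter), and prepending the prefixes to the query is also assumed.

```python
from string import Template
from typing import Optional


class QueryHolder:
    """Hypothetical stand-in for a data source with a chainable with_query."""

    def __init__(self) -> None:
        self.query_text = ""

    def with_query(self, sparql_query: str,
                   substitution_variables: Optional[dict] = None,
                   sparql_prefixes: str = "") -> "QueryHolder":
        # Assumption: placeholders use string.Template's "$name" syntax.
        if substitution_variables:
            sparql_query = Template(sparql_query).safe_substitute(substitution_variables)
        # Assumption: the prefix declarations are prepended to the query text.
        self.query_text = (sparql_prefixes + "\n" + sparql_query).strip()
        return self  # returning self is what enables method chaining


holder = QueryHolder().with_query(
    "SELECT ?s WHERE { ?s a $type } LIMIT $limit",
    substitution_variables={"type": "skos:Concept", "limit": 10},
    sparql_prefixes="PREFIX skos: <http://www.w3.org/2004/02/skos/core#>",
)
```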

with_query_from_file(sparql_query_file_path: str, substitution_variables: Optional[dict] = None, prefixes: str = '') eds4jinja2.adapters.remote_sparql_ds.RemoteSPARQLEndpointDataSource[source]

Set the query text from the content of the given file and return the reference to self for chaining.

Returns

with_uri(uri: str, graph_uri: Optional[str] = None) eds4jinja2.adapters.remote_sparql_ds.RemoteSPARQLEndpointDataSource[source]

Set the URI to be described, optionally within a named graph, and return the reference to self for chaining.

Returns
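
One plausible way to turn a URI and an optional named graph into a query is a SPARQL DESCRIBE statement. The sketch below is an assumption about the query shape, not the library's code; describe_query is a hypothetical helper.

```python
from typing import Optional


def describe_query(uri: str, graph_uri: Optional[str] = None) -> str:
    """Build a SPARQL DESCRIBE query for uri (hypothetical helper)."""
    if graph_uri:
        # The FROM clause makes the given graph the default graph of the query dataset.
        return f"DESCRIBE <{uri}> FROM <{graph_uri}>"
    return f"DESCRIBE <{uri}>"


plain = describe_query("http://example.org/resource/1")
scoped = describe_query("http://example.org/resource/1", "http://example.org/graph/a")
```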

_fetch_tree()[source]

fetch data and return as tree representation

Returns

_fetch_tabular()[source]

fetch data and return as tabular representation

Returns

_can_be_tree() bool[source]
_can_be_tabular() bool[source]
_abc_impl = <_abc_data object>
class RDFFileDataSource(filename)[source]

Bases: eds4jinja2.adapters.base_data_source.DataSource

Accesses a local RDF file and provides the possibility to fetch data from it by SPARQL queries.

__reduce_bound_triple_to_string_format(dict_of_bound_variables: dict)
with_query(sparql_query: str, substitution_variables: Optional[dict] = None, prefixes: str = '') eds4jinja2.adapters.local_sparql_ds.RDFFileDataSource[source]

Set the query text and return the reference to self for chaining.

Returns

with_query_from_file(sparql_query_file_path: str, substitution_variables: Optional[dict] = None, prefixes: str = '') eds4jinja2.adapters.local_sparql_ds.RDFFileDataSource[source]

Set the query text from the content of the given file and return the reference to self for chaining.

Returns

with_file(file: str) eds4jinja2.adapters.local_sparql_ds.RDFFileDataSource[source]

Set the RDF file to be queried and return the reference to self for chaining.

Returns

_fetch_tabular()[source]

fetch data and return as tabular representation

Returns

_fetch_tree()[source]

fetch data and return as tree representation

Returns

_can_be_tree() bool[source]
_can_be_tabular() bool[source]
_abc_impl = <_abc_data object>
add_relative_figures(data_frame: pandas.core.frame.DataFrame, target_columns: List[str], relativisers: List, percentage: bool = True)[source]

For each target column, add a calculated column with relative values computed from the provided relativisers.

Parameters
  • percentage

  • data_frame

  • target_columns

  • relativisers – a list of indicators corresponding to the target_columns comprising either None, a number or a column name

Returns
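
The three kinds of relativiser (None, a number, or a column name) can be illustrated without pandas. This sketch rests on several assumptions that the description does not confirm: None is taken to mean "relative to the column total", the new column is named "<column>_relative", and a dict of lists stands in for the data frame. add_relative_columns is a hypothetical helper.

```python
from typing import Dict, List, Optional, Union

Table = Dict[str, List[float]]  # column name -> values, standing in for a DataFrame


def add_relative_columns(table: Table, target_columns: List[str],
                         relativisers: List[Optional[Union[float, str]]],
                         percentage: bool = True) -> Table:
    """Add a "<column>_relative" column per target column (hypothetical naming)."""
    factor = 100 if percentage else 1
    for column, relativiser in zip(target_columns, relativisers):
        values = table[column]
        if relativiser is None:
            # Assumption: None means "relative to the column total".
            denominators = [sum(values)] * len(values)
        elif isinstance(relativiser, str):
            # A column name: divide row by row by that column.
            denominators = table[relativiser]
        else:
            # A number: divide every value by the same constant.
            denominators = [relativiser] * len(values)
        table[column + "_relative"] = [
            value / denominator * factor
            for value, denominator in zip(values, denominators)
        ]
    return table


table = {"hits": [1, 3], "visits": [4, 4], "errors": [2, 8]}
add_relative_columns(table, ["hits", "visits", "errors"], [None, 16, "visits"])
```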

replace_strings_in_tabular(data_frame: pandas.core.frame.DataFrame, target_columns: Optional[List[str]] = None, value_mapping_dict: Optional[Dict] = None, mark_touched_rows: bool = False) List[str][source]

Replaces the values from the target columns in a data frame according to the value-mapping dictionary. If mark_touched_rows is true, then adds a boolean column _touched_ indicating which rows were updated.

>>> value_mapping_dict = {"old value 1": "new value 1", "old value 2": "new value 2"}
Parameters
  • mark_touched_rows – add a new boolean column _touched_ indicating which rows were updated

  • value_mapping_dict – the string substitution mapping

  • target_columns – a list of column names; leave empty if the substitution applies to all columns

  • data_frame – the data frame

Returns

the list of unique strings found in the data frame
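
The substitution and the _touched_ marker can be illustrated on a list of row dictionaries. This is a simplified sketch, not the library's implementation: replace_values is a hypothetical helper, it assumes that only exact whole-value matches are replaced, and it returns the rows rather than a list of strings.

```python
from typing import Dict, List, Optional

Row = Dict[str, object]


def replace_values(rows: List[Row], value_mapping_dict: Dict[str, str],
                   target_columns: Optional[List[str]] = None,
                   mark_touched_rows: bool = False) -> List[Row]:
    """Replace whole cell values in place according to the mapping (sketch)."""
    for row in rows:
        # No target columns given: apply the substitution to every column.
        columns = target_columns if target_columns is not None else list(row)
        touched = False
        for column in columns:
            value = row.get(column)
            # Assumption: only exact, whole-value string matches are replaced.
            if isinstance(value, str) and value in value_mapping_dict:
                row[column] = value_mapping_dict[value]
                touched = True
        if mark_touched_rows:
            row["_touched_"] = touched
    return rows


rows = [
    {"status": "old value 1", "note": "old value 2"},
    {"status": "other", "note": "old value 1"},
]
mapping = {"old value 1": "new value 1", "old value 2": "new value 2"}
replace_values(rows, mapping, target_columns=["status"], mark_touched_rows=True)
```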

class NamespaceInventory(namespace_definition_dict=None)[source]

Bases: rdflib.namespace.NamespaceManager

namespaces_as_dict()[source]
Returns

the namespace definitions as a dict

uri_to_qname(uri_string, prefix_cc_lookup=True, error_fail=False)[source]

Transform the uri_string to a qname string and remember the namespace. If the namespace is not defined, the prefix can be looked up on prefix.cc

Parameters
  • error_fail – whether the errors shall fail hard or just issue a warning

  • prefix_cc_lookup – whether to look up an unknown namespace on prefix.cc

  • uri_string – the string of a URI to be reduced to a QName

Returns

qname string
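
The reduction of a URI to a QName can be sketched with a plain prefix dictionary. This is an illustration under assumptions, not the class's implementation: to_qname is a hypothetical helper, the namespace is assumed to end at the last "#" or "/", and no prefix.cc lookup is attempted.

```python
from typing import Dict


def to_qname(uri_string: str, namespaces: Dict[str, str]) -> str:
    """Reduce a URI to prefix:local_name using a prefix -> namespace dict."""
    # Split right after the last "#" or "/", whichever comes later.
    split_at = max(uri_string.rfind("#"), uri_string.rfind("/")) + 1
    namespace, local_name = uri_string[:split_at], uri_string[split_at:]
    for prefix, known_namespace in namespaces.items():
        if known_namespace == namespace:
            return f"{prefix}:{local_name}"
    raise KeyError(f"No prefix defined for namespace {namespace!r}")


namespaces = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dct": "http://purl.org/dc/terms/",
}
concept_qname = to_qname("http://www.w3.org/2004/02/skos/core#Concept", namespaces)
title_qname = to_qname("http://purl.org/dc/terms/title", namespaces)
```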

qname_to_uri(qname_string: str, prefix_cc_lookup=True, error_fail=False) str[source]

Transform the QName into a URI

Parameters
  • qname_string – the QName string to be expanded to a URI

  • error_fail – whether the errors shall fail hard or just issue a warning

  • prefix_cc_lookup – whether to look up missing prefixes at http://prefix.cc

Returns

the absolute URI string
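
The expansion of a QName and the role of error_fail can be sketched with a plain prefix dictionary. This is an illustration under assumptions: to_uri is a hypothetical helper, no prefix.cc lookup is attempted, and returning the input unchanged in the lenient mode is a choice made here, not confirmed behaviour.

```python
import warnings
from typing import Dict


def to_uri(qname_string: str, namespaces: Dict[str, str], error_fail: bool = False) -> str:
    """Expand prefix:local_name to an absolute URI (hypothetical helper)."""
    prefix, _, local_name = qname_string.partition(":")
    if prefix in namespaces:
        return namespaces[prefix] + local_name
    message = f"Unknown prefix {prefix!r} in {qname_string!r}"
    if error_fail:
        # Strict mode: fail hard.
        raise ValueError(message)
    # Lenient mode: warn only. Assumption: the input is returned unchanged.
    warnings.warn(message)
    return qname_string


known_prefixes = {"skos": "http://www.w3.org/2004/02/skos/core#"}
concept_uri = to_uri("skos:Concept", known_prefixes)
```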