Impira Python SDK

The Impira SDK lets you run commands against the Impira API (uploading files, querying, updating data) and perform more advanced operations such as creating fields.

class impira.Impira(org_name, api_token, base_url='https://app.impira.com', ping=True)

This class is the main wrapper around a connection to an Impira org (including credentials). It uses a requests.Session object to reuse connections across requests. Assume this class is not thread-safe.

Parameters
  • org_name (str) – Your org name. You can find this by logging into Impira and pulling out the text after /o/ in your URL: .../o/<YOUR_ORG_NAME>/...

  • api_token (str) – Your API token. See the API docs for instructions on how to obtain it.

  • base_url (str) – The base URL of the Impira app. Defaults to https://app.impira.com.

  • ping (bool) – By default, Impira will try to ping the API when you create a connection to verify your credentials. You can set this flag to False to disable this check.
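
A minimal sketch of opening a connection, assuming the credentials are stored in environment variables. The variable names IMPIRA_ORG_NAME and IMPIRA_API_TOKEN and both helper functions are hypothetical, chosen for illustration; only the impira.Impira constructor and its parameters come from the reference above.

```python
import os


def connection_kwargs(env=None, ping=True):
    """Collect constructor arguments for impira.Impira from environment variables.

    IMPIRA_ORG_NAME and IMPIRA_API_TOKEN are hypothetical variable names.
    """
    env = os.environ if env is None else env
    return {
        "org_name": env["IMPIRA_ORG_NAME"],
        "api_token": env["IMPIRA_API_TOKEN"],
        "ping": ping,
    }


def connect(env=None, ping=True):
    """Open a connection; the SDK is imported lazily so this file loads without it."""
    import impira

    return impira.Impira(**connection_kwargs(env, ping))
```

Because the class should be assumed not to be thread-safe, a multi-threaded program would likely call connect() once per thread instead of sharing one instance.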

upload_files(collection_id: Optional[str], files: List[impira.api_v2.FilePath])

The main entry point to upload files. Based on the paths of the files, this will automatically call the standard URL-based file upload (higher performance) or upload the files directly through the multi-part form API.

Parameters
  • collection_id (str, optional) – The collection to upload the files into. Specify None to upload the files to “All files”.

  • files (List[FilePath]) – A list of files (including their name, path, and any mutation options) to upload. These can be URLs or local files.

Returns

The uids of the uploaded files. If you specify mutations which result in a dynamic number of documents (e.g. page splitting), then the returned value will be a generator.
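
As an illustration, the sketch below uploads files into a collection identified by name. The helper upload_to_collection is hypothetical; it assumes client is an impira.Impira instance and files is a list of FilePath objects, and it relies only on get_collection_id() and upload_files() as documented here.

```python
def upload_to_collection(client, collection_name, files):
    """Upload files into the named collection and return their uids as a list.

    Pass collection_name=None to upload into "All files".
    """
    collection_id = None
    if collection_name is not None:
        collection_id = client.get_collection_id(collection_name)
    # upload_files may return a generator (e.g. with page splitting), so
    # materialize the result into a list before handing it back.
    return list(client.upload_files(collection_id, files))
```

Converting to a list is a convenience of this sketch; if the uids are only going to be passed to poll_for_results(), the raw return value can be forwarded directly.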

get_collection_id(collection_name: str)

Retrieve the collection id corresponding to a given name.

Parameters

collection_name (str) – The name of the collection to look up.

Returns

The uid of the collection.

update(collection_id: str, data: List[Dict[str, Any]])

Update fields in a collection.

Parameters
  • collection_id (str) – The collection to update.

  • data (List[Dict[str, Any]]) – A list of data updates to perform. Each record should contain a uid field corresponding to the file to update.

Returns

The updated uids.
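
The shape of the data argument can be sketched as follows. Both helpers are hypothetical, and treating every key other than uid as a field name is an assumption of this example, not something the reference states.

```python
def build_updates(values_by_uid):
    """Turn {uid: {field_name: value}} into the list of records update() expects.

    Each record carries the file's uid plus the values to change. Treating the
    remaining keys as field names is an assumption of this sketch.
    """
    records = []
    for uid, values in values_by_uid.items():
        if "uid" in values:
            raise ValueError("'uid' is reserved for the file identifier")
        records.append({"uid": uid, **values})
    return records


def apply_updates(client, collection_id, values_by_uid):
    """Send the records through client.update() and return whatever it returns."""
    return client.update(collection_id, build_updates(values_by_uid))
```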

add_files_to_collection(collection_id: str, file_ids: List[str])

Add existing files to a collection.

Parameters
  • collection_id (str) – The collection id to add the files into.

  • file_ids (List[str]) – A list of file uids to add into the collection.

Returns

None

create_collection(collection_name: str)

Create a collection with the provided name.

Parameters

collection_name (str) – The name of the collection to create.

Returns

The collection id of the newly created collection.

create_field(collection_id: str, field_spec: impira.api_v2.FieldSpec)

Create a field in a collection. To create an inferred field (e.g. text extraction), use create_inferred_field(), which wraps this function and constructs the inferred field spec for you.

Parameters
  • collection_id (str) – The collection in which to create the field.

  • field_spec (FieldSpec) – The field’s definition.

Returns

None

create_fields(collection_id: str, field_specs: List[impira.api_v2.FieldSpec])

Create multiple fields in a collection. If you need to create multiple fields, this function is significantly more performant than calling create_field() in a loop.

Parameters
  • collection_id (str) – The collection in which to create the fields.

  • field_specs (List[FieldSpec]) – A list of fields to create.

Returns

None
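
A sketch of batching field creation, assuming each value in field_types is an InferredFieldType member (or any object exposing build_field_spec()). The helper create_inferred_fields is hypothetical; it builds the specs locally and sends them in a single create_fields() call.

```python
def create_inferred_fields(client, collection_id, field_types, path=None):
    """Create several inferred fields with one create_fields() call.

    field_types maps each field name to an InferredFieldType member, for
    example {"Invoice number": InferredFieldType.text}.
    """
    path = list(path or [])
    specs = [
        field_type.build_field_spec(name, path)
        for name, field_type in field_types.items()
    ]
    client.create_fields(collection_id, specs)
    return specs
```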

create_inferred_field(collection_id: str, field_name: str, inferred_field_type: impira.api_v2.InferredFieldType, path: List[str] = [])

Create an inferred field. This is just a wrapper around create_field().

Parameters
  • collection_id (str) – The collection in which to create the field.

  • field_name (str) – The name of the field to create.

  • inferred_field_type (InferredFieldType) – The inferred field type.

  • path (List[str]) – Specify a path if this is a sub-field of a table. For example, if this field should be created inside of a table named T, path should be ["T"].

Returns

None
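
The path parameter can be illustrated with a table and its columns. The helper below is hypothetical; it assumes table_type is InferredFieldType.table, that columns maps column names to InferredFieldType members, and that the table field should exist before its sub-fields are created.

```python
def create_table_fields(client, collection_id, table_name, table_type, columns):
    """Create a table field, then one inferred sub-field per column.

    Creating the table before its columns is an assumption of this sketch.
    """
    client.create_inferred_field(collection_id, table_name, table_type)
    for column_name, column_type in columns.items():
        # Sub-fields of a table named T are created with path=["T"].
        client.create_inferred_field(
            collection_id, column_name, column_type, path=[table_name]
        )
```

For tables with many columns, building the specs with build_field_spec() and passing them to create_fields() in one call may be faster than issuing one request per column.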

delete_field(collection_id: str, field_name: str)

Delete a field in a collection.

Parameters
  • collection_id (str) – The collection in which to delete the field.

  • field_name (str) – The name of the field to delete.

Returns

None

import_fields(collection_id: str, from_collection_id: str)

Import field definitions from another collection.

Parameters
  • collection_id (str) – The (destination) collection in which to add the fields.

  • from_collection_id (str) – The (source) collection from which to add the fields.

Returns

None

rename_file(uid: str, name: str)

Rename a file.

Parameters
  • uid (str) – The uid of the file to rename.

  • name (str) – The name to rename the file to.

Returns

The uid of the updated file (it should match the uid parameter you pass in).

poll_for_results(collection_id: str, uids: Generator[str, None, None] = [])

Poll a collection for new results for a set of uids. This method blocks until each of the specified files has been fully processed, so it’s most often used right after uploading files to a collection, to wait until they finish processing.

Parameters
  • collection_id (str) – The collection id to poll for results.

  • uids (List[str] or Generator[str, None, None]) – A list or generator of file uids to block on. You can pass in the output of upload_files() to this function directly.

Returns

A generator which yields results for each uid as it’s available.
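
The upload-then-wait pattern described above might look like the sketch below. The helper upload_and_wait and its on_result callback are hypothetical; the shape of each yielded result is not specified here, so the code only collects the results without inspecting them.

```python
def upload_and_wait(client, collection_id, files, on_result=None):
    """Upload files, then block until every uploaded file has been processed.

    on_result, if given, is called with each result as soon as it is yielded.
    """
    # The return value of upload_files() can be passed to poll_for_results()
    # directly, whether it is a list or a generator.
    uids = client.upload_files(collection_id, files)
    results = []
    for result in client.poll_for_results(collection_id, uids):
        if on_result is not None:
            on_result(result)
        results.append(result)
    return results
```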

query(query: str, mode: str = 'iql', cursor: str = None, timeout: int = None)

Execute an Impira Query Language (IQL) query. You can either run the query ad-hoc (the default) or specify “poll” for the mode argument to poll for changes. In poll-mode, you can also specify a cursor to retrieve results since a particular point-in-time. See the poll docs for more information on polling.

Parameters
  • query (str) – The IQL query to execute.

  • mode (str) – Either iql (the default) or poll. iql will run an ad-hoc query (against the current state) and poll will block until there are changes to the query results.

  • cursor (str, optional) – If specified in poll mode, the command will return changes to the query results since this cursor.

  • timeout (int, optional) – A timeout in seconds to run the query or poll for new results.

Returns

A generator which yields results for each uid as it’s available.
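
A small wrapper can catch argument mistakes before a request is sent. The function run_query and its local checks are hypothetical conveniences; in particular, rejecting a cursor outside poll mode is this sketch's reading of the parameter descriptions, not documented SDK behavior.

```python
VALID_MODES = ("iql", "poll")


def run_query(client, iql, mode="iql", cursor=None, timeout=None):
    """Check the arguments locally, then forward the call to client.query()."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if cursor is not None and mode != "poll":
        raise ValueError("cursor is only meaningful in poll mode")
    return client.query(iql, mode=mode, cursor=cursor, timeout=timeout)
```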

get_app_url(resource_type: impira.api_v2.ResourceType, resource_id: str) → str

A helper function to generate a resource URL.

Parameters
  • resource_type (ResourceType) – The type of the resource.

  • resource_id (str) – The value (of the resource type) to point to.

Returns

The https URL of the resource.

get_collection_uid(collection_name: str)

Deprecated: Use get_collection_id() instead.

class impira.FilePath(*, name: str, path: str, uid: str = None, mutate: impira.api_v2.Mutation = None)

A local or remote file that can be uploaded to Impira.

Parameters
  • name (str) – The name of the file. This will appear in the Impira UI. The file’s name does not have to be unique.

  • path (str) – A path to a local file or to a remote file that Impira can access. The path is optionally URL-formatted. You can specify file:// as the protocol for a local file, or use a remote protocol like http:// for a remote file.

  • uid (str, optional) – The unique id for the file in Impira. Impira will automatically assign a unique uid if you do not specify one. If you do specify a uid, and the file already exists, the upload will overwrite the existing file as a new version.

  • mutate (Mutation, optional) – A set of mutations to apply while uploading the file.
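
As a sketch of preparing FilePath arguments, the hypothetical helper below derives a display name from the last component of a local path or URL. Using the base name as the file name is a convention of this example; the SDK accepts any name.

```python
import os
import posixpath
from urllib.parse import urlparse


def file_path_kwargs(location):
    """Derive the name and path keyword arguments for impira.FilePath."""
    parsed = urlparse(location)
    if parsed.scheme in ("http", "https", "file"):
        # URL-formatted location: take the last component of the URL path.
        name = posixpath.basename(parsed.path)
    else:
        # Plain local path.
        name = os.path.basename(location)
    return {"name": name, "path": location}
```

Since FilePath takes keyword-only arguments, the result can be expanded directly, e.g. impira.FilePath(**file_path_kwargs("/tmp/scans/receipt.png")).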

class impira.Mutation(*, rotate: int = None, split: str = None, remove_pages: str = None, split_segments: List[str] = None, rotate_segments: List[impira.api_v2.RotateSegment] = None, split_exprs: Dict[str, str] = None)

A class that allows you to configure the mutations to apply to a file. See the Mutation API docs for more details on the available options.

class impira.RotateSegment(*, pages: str, degrees: int)

A single segment (set of pages) to rotate.

enum impira.ResourceType(value)

An enumeration of the various resource types in Impira to help with generating URLs.

Parameters
  • fc – Reference a collection by its id.

  • dc – Reference a dataset by its id.

  • ec – Reference an entity class by its name (e.g. file_collections::574764db867afdb9).

  • collection – Reference a collection by name.

  • files – Reference “All files” (the global endpoint).

Member Type

str

Valid values are as follows:

fc = <ResourceType.fc: 'fc'>
dc = <ResourceType.dc: 'dc'>
ec = <ResourceType.ec: 'ec'>
collection = <ResourceType.collection: 'collection'>
files = <ResourceType.files: 'files'>

exception impira.APIError(response)

An exception that wraps a failed response. The response field gives you access to the underlying response.
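
A hedged sketch of summarizing a failed call for logging. The helper describe_api_error is hypothetical, and it assumes the response field behaves like a requests.Response (with status_code and text attributes), which this reference does not state explicitly; missing attributes fall back to placeholders.

```python
def describe_api_error(error):
    """Summarize an impira.APIError in one line for logging.

    Assumes error.response exposes status_code and text, as a
    requests.Response would.
    """
    response = getattr(error, "response", None)
    status = getattr(response, "status_code", "unknown")
    body = getattr(response, "text", "") or ""
    # Truncate the body so a large error page does not flood the log.
    return f"Impira API request failed (status {status}): {body[:200]}"
```

A caller would typically wrap SDK calls in try/except impira.APIError and pass the caught exception to this helper.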

exception impira.IQLError

An exception that wraps an invalid IQL query.

enum impira.FieldType(value)

An enumeration of various field types.

Member Type

str

Valid values are as follows:

text = <FieldType.text: 'STRING'>
number = <FieldType.number: 'NUMBER'>
bool = <FieldType.bool: 'BOOL'>
timestamp = <FieldType.timestamp: 'TIMESTAMP'>
entity = <FieldType.entity: 'ENTITY'>

enum impira.InferredFieldType(value)

An enum that wraps inferred field types and contains helper methods to create FieldSpec instances from them.

Valid values are as follows:

text = <InferredFieldType.text: {'expression': '`text_string-dev-1`(File.text)', 'type': 'STRING'}>
number = <InferredFieldType.number: {'expression': '`text_number-dev-1`(File.text)', 'type': 'NUMBER'}>
timestamp = <InferredFieldType.timestamp: {'expression': '`text_date-dev-1`(File.text)', 'type': 'TIMESTAMP'}>
checkbox = <InferredFieldType.checkbox: {'expression': 'checkbox(File.text)', 'type': 'STRING'}>
signature = <InferredFieldType.signature: {'expression': 'region_signature(File.text)', 'type': 'STRING'}>
table = <InferredFieldType.table: {'expression': 'entity_one_many(File.text)', 'type': 'ENTITY', 'isList': True}>
document_tag = <InferredFieldType.document_tag: {'expression': 'document_tag(File)', 'type': 'STRING'}>

The Enum and its members also have the following methods:

build_field_spec(field_name: str, path: List[str] = []) → impira.api_v2.FieldSpec

Build a field spec for an inferred field type.

Parameters
  • field_name (str) – The name of the field to create.

  • path (List[str]) – Specify a path if this is a sub-field of a table. For example, if this field should be created inside of a table named T, path should be ["T"].

Returns

A FieldSpec.

class impira.FieldSpec(*, field: str, type: impira.api_v2.FieldType, expression: str = None, path: List[str] = None, isList: bool = None)

The definition of a field.

exception impira.InvalidRequest