Getting Started

DataSchemer lets you define the shape of your input once, and then rely on it to parse, validate, and structure user input consistently across your application.

At its core, DataSchemer uses schemas to describe what input is expected and how it should be interpreted. From that description, it handles type coercion, validation, defaults, and structured results for you.

A first example

Suppose your program expects a set of lattice vectors and an optional scale factor. You would like to accept human-friendly input, but work internally with numeric data.

A simple example

from data_schemer.schema_projector import SchemaProjector

schema = {
  "variables": {
    "lattice_vectors": {
      "type": "float-matrix",
      "help": "3×3 lattice vectors",
    },
    "scale": {
      "type": "float",
      "optional": True,
      "default": 1.0,
      "help": "Scale factor applied to the lattice vectors",
    },
  }
}

input_data = {
  "lattice_vectors": """
    0 1 1
    1 0 1
    1 1 0
  """,
  "scale": "4.8",
}

result = SchemaProjector(schema, input_data)

print(result.data)

The output is

{
  'lattice_vectors': array([[0., 1., 1.],
                            [1., 0., 1.],
                            [1., 1., 0.]]),
  'scale': 4.8
}
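Because the projected values are ordinary numeric objects, they can be passed straight to numerical code. The sketch below does not call DataSchemer; it rebuilds the dictionary printed above by hand and assumes the caller applies the scale factor, since the printed lattice vectors are unscaled.

```python
import numpy as np

# Stand-in for result.data, copied from the output shown above.
data = {
    "lattice_vectors": np.array([[0.0, 1.0, 1.0],
                                 [1.0, 0.0, 1.0],
                                 [1.0, 1.0, 0.0]]),
    "scale": 4.8,
}

# Assumption: scaling is applied by the caller, not by the library.
scaled = data["scale"] * data["lattice_vectors"]

# Cell volume is the absolute determinant of the scaled vectors.
volume = abs(np.linalg.det(scaled))
print(scaled)
print(volume)
```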

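Basic variable substitution and arithmetic are supported safely. In the example below, the first line defines a and c, and r3 evaluates to √3 (consistent with the output shown). A hexagonal lattice may be entered as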

input_data_tri = {
  "lattice_vectors": """\
    a=2.23  c=5.8
    a*r3/2  a/2  0
    a*r3/2 -a/2  0
    0       0    c
  """,
  "scale": "1.0",
}

result = SchemaProjector(schema, input_data_tri)

print(result.data["lattice_vectors"])

yielding

[[ 1.93123665  1.115       0.        ]
 [ 1.93123665 -1.115       0.        ]
 [ 0.          0.          5.8       ]]
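One common way to evaluate such expressions without calling eval is to walk the parsed syntax tree and allow only numbers, known names, and basic operators. The following is a minimal sketch of that general technique, not DataSchemer's actual implementation; the function safe_eval and the names table are hypothetical, and r3 = √3 is inferred from the output above.

```python
import ast
import math
import operator

# Whitelisted operators; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression, names):
    """Evaluate simple arithmetic over numbers and the given names (hypothetical helper)."""

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name) and node.id in names:
            return names[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](visit(node.operand))
        # Function calls, attribute access, etc. all end up here.
        raise ValueError(f"unsupported expression: {expression!r}")

    return visit(ast.parse(expression.strip(), mode="eval"))


# Assumed variable table: a and c from the input, r3 taken to be sqrt(3).
names = {"a": 2.23, "c": 5.8, "r3": math.sqrt(3)}
row = [safe_eval(token, names) for token in "a*r3/2 -a/2 0".split()]
print(row)
```

Function calls and attribute access are rejected with a ValueError, which is what makes this approach safer than passing user input to eval.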

A command-line interface can be constructed automatically from the schema as follows

# test.py
from data_schemer.schema_command_line import SchemaCommandLine 
scl = SchemaCommandLine(schema)

print(scl.data['lattice_vectors'])

Executing on the command line gives

$ python test.py --lattice-vectors '0 1/2 1/2 ; 1/2 0 1/2 ; 1/2 1/2 0'
[[0.  0.5 0.5]
 [0.5 0.  0.5]
 [0.5 0.5 0. ]]

Here the text string is parsed using a hierarchy of delimiters: in this example, ";" separates rows and whitespace separates the entries within a row. Help menus are generated automatically from the schema information.

$ python test.py -h

usage: test.py [-h] [--lattice-vectors X] [--scale X]

options:
  -h, --help
                        show this help message and exit
  --lattice-vectors X
                        (required) 3×3 lattice vectors
  --scale X    Scale factor applied to the lattice vectors
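To make the idea of a schema-driven command line concrete, here is a rough sketch using only argparse, fractions, and NumPy. It is not how SchemaCommandLine is implemented; parse_matrix, CONVERTERS, and build_parser are hypothetical names, and the two-level delimiter hierarchy (";" or newline for rows, whitespace for entries) is assumed from the example above.

```python
import argparse
from fractions import Fraction

import numpy as np


def parse_matrix(text):
    """Split rows on ';' or newlines, then entries on whitespace (assumed hierarchy)."""
    rows = [row for row in text.replace("\n", ";").split(";") if row.strip()]
    # Fraction accepts strings such as "1/2", "0", and "0.5".
    return np.array([[float(Fraction(tok)) for tok in row.split()] for row in rows])


# Hypothetical mapping from schema type names to converter functions.
CONVERTERS = {"float": float, "float-matrix": parse_matrix}


def build_parser(schema):
    """Create one --option per schema variable (illustrative only)."""
    parser = argparse.ArgumentParser()
    for name, spec in schema["variables"].items():
        parser.add_argument(
            "--" + name.replace("_", "-"),
            metavar="X",
            type=CONVERTERS[spec["type"]],
            required=not spec.get("optional", False),
            default=spec.get("default"),
            help=spec.get("help"),
        )
    return parser


schema = {
    "variables": {
        "lattice_vectors": {"type": "float-matrix", "help": "3×3 lattice vectors"},
        "scale": {"type": "float", "optional": True, "default": 1.0,
                  "help": "Scale factor applied to the lattice vectors"},
    }
}

args = build_parser(schema).parse_args(
    ["--lattice-vectors", "0 1/2 1/2 ; 1/2 0 1/2 ; 1/2 1/2 0"]
)
print(args.lattice_vectors)
print(args.scale)
```

Keeping the converters in a lookup table keyed by the schema's type names means the same schema can drive both the parser and the help text.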

1 - Install

DataSchemer is installed with pip. It is not yet available on PyPI; once it is, you will be able to install it using

pip install data-schemer

Until then, clone the repo https://github.com/marianettigroup/data-schemer. Then, from the repository root, install using pip, preferably in editable mode.

pip install -e .
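A quick way to confirm that the install succeeded is to check whether Python can locate the package. This small sketch assumes the importable package name is data_schemer, as used in the import statements above; it only looks the package up and does not import it.

```python
import importlib.util

# Look up the package without importing it; returns None if not installed.
spec = importlib.util.find_spec("data_schemer")

if spec is None:
    print("data_schemer not found; check that pip installed into this environment")
else:
    print(f"data_schemer found at {spec.origin}")
```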