Coder’s Cauldron | DyPy: Interface to Backward Dynamic Programming

DyPy’s goal is to provide an interface to backward dynamic programming that supports the following priorities (in order): ⁽³⁾

Ease of learning and use.
Flexibility (can be adapted to new problems).
Speed (once the previous goals are met).

Dynamic Program

The Dynamic Program class is the core of DyPy. Each problem you wish to solve will involve creating an instance of this class and attaching the classes below to it in ways that tell it how to solve your problem.⁽³⁾ DyPy should be able to handle problems with numerous state variables, which is a significant design consideration.

Dynamic Program manages all data and the flow of the optimization. By default, it will build all the stages and manage their tables, but this part of the process can be customized as well.

Objective Functions

The objective function will do some of the heavy lifting for your dynamic program and must be created by the user for each specific optimization problem. In each stage of the optimization, DyPy will call the objective function for each combination of state variables and stage variables; the objective function must return the cost or benefit value for that set of inputs. The objective function will be provided access to the Stage object for the stage it is currently evaluating, as well as the values of all the state variables and the decision variable. These will be provided as keyword arguments to the objective function.^[3]

Stage

A stage provides the set of potential states and decisions at each modelled point (sequential moment) in the dynamic program. Most classes eventually tie to either the Dynamic Program class or the Stage class, which does most of the heavy lifting and data management in this package. The Dynamic Program class automatically creates and handles stages by default, but you can alter this behavior for more complex scenarios. ⁽³⁾

State Variable

State Variable objects provide options for potential future conditions. A https://nickrsan.github.io/dypy/build/html/conceptual_overview.html – dynamicprogramDynamic Program can involve multiple state variables – in which case, all permutations of all state variable values are evaluated.

Be careful, because the solution space can quickly grow as you add more state variables with more options. A State Variable should have a name and a set of potential values. By default, the potential values can be generated for you if you provide a minimum value, a maximum value, and the discretization size of steps in between.

Decision Variable

Decision Variables describe potential choices that can be made at each stage. Like State Variable objects, they have names and values. However, they are managed slightly differently. DyPy currently only supports a single decision variable; several decision variables might theoretically be included, with increasing complexity to both code and solution time. Both Decision Variables and State Variables are provided to the objective function to determine the value of each potential decision when the system is in a certain state.⁽³⁾

Prior

Priors are used in DyPy in two ways, each referring data from a previous stage that should be included in the present stage. This requirement arises both during the backward matrix formulation and the forward path calculation. The dypy.Prior techniques of applying future stage values to older stages are provided by the prior class and subclasses. dypy.SimplePrior has a single-variable implementation that may or may not work for multi-variable issues. To offer a new implementation, this class can be subclassed and the apply method overridden. The new matrix should be returned by the apply method.^[3]

By default, the Prior class to be used should be provided to the Dynamic Program upon creation, but they can also be overridden per-stage in case of a need to apply priors differently at different stages.

Reducer

Reducers are still to be implemented; they provide a tool for turning multi-state variable problems into single state variable problems before calculating the best path. One state of a stochastic dynamic program may be determined by your decisions, while other states are determined by probabilistic future events. Reducers can help reduce the probabilistic states so that a single state variable reflecting the needs of the decision can be used for the forward optimal path calculation.

Use of reducers is not required, and those with need for a true stochastic dynamic program may wish to implement branching behavior reflecting the uncertainty in future stages. The Stage and Prior classes would then need to be overridden to provide such behavior in lieu of using reducers

Programming Effort

You’re a programmer working hard to triage bugs in your software before a big release coming up. In your bug tracker, bugs are split into 5 categories: “Core”, “UX/UI”, “Network”, “Database”, and “API”. The release is in 12 days, but your team manager wants to make sure to get some fixes in each. They ask you to spend no more than 5 days on any single area and at least one day on each, but other than that, they just want you to fix the most bugs possible.

Considering the prioritization of tickets in the tracker, you estimate the number of bugs you fix in each category as a function of time as follows:

Bugs closed by days of effort in each area
Category	1 Day	2 Days	3 Days	4 Days	5 Days
Core	2	5	7	8	10
UI/UX	3	6	8	9	9
Network	1	2	4	7	9
Database	2	3	6	8	11
API	4	6	8	9	10

Code

# we have 12 days available to us for work

state_variable = dypy.StateVariable(“Days Available for Category”, values=range(1, 13))

# but we can only spend between 1 and 5 days working on a single category

decision_variable = dypy.DecisionVariable(“Time on Category”, options=decision_options)

def objective_function(stage, days_available_for_category, time_on_category):

“””

When the objective is called, the solver makes the stage, state variables,

and decision variable available to it. The keyword argument names here

match the names of the state variables and decision variables (which will be

passed as keyword arguments; the order isn’t important, but the name is). The name

is automatically derived by lowercasing the name of the variable and replacing

spaces with underscores. It will remove digits from the front if they are present

so that the name is a valid python identifier

In this example, we can think of these variables as providing:

stage – this will give access to the DyPy.stage object
state – this just provides access to the state value being assessed, not the object
decision – this just provides access to the decision value being assessed, not the object

“””

# we are given benefits, so defining here – each category is a row, each column is a

# number of days, and the value is how much benefit we get from working on that category

# for that long

benefit_list = [

# column 0 means no time spent on it, so we get no benefit

[0, 2, 5, 7, 8, 10],

[0, 3, 6, 8, 9, 9],

[0, 1, 2, 4, 7, 9],

[0, 2, 3, 6, 8, 11],

[0, 4, 6, 8, 9, 10],

]

# can’t spend more time than we have available

if time_on_category > days_available_for_category:

# so exclude these options – sets to high positive or negative value

return stage.parent_dp.exclusion_value

else:

return benefit_list[stage.number][time_on_category]

# define the dynamic program and tell it we have four timesteps

dynamic_program = dypy.DynamicProgram(timestep_size=1, time_horizon=5,

objective_function=objective_function, calculation_function=dypy.MAXIMIZE,

prior=dypy.SimplePrior)

dynamic_program.decision_variable = decision_variable

dynamic_program.add_state_variable(state_variable)

# each category will be a stage, in effect – tell it to create all five stages as empty

# assigns names by default, but we can override them

dynamic_program.build_stages(name_prefix=”Category”)

# make a list of names in the same order they’re used in our objective function

stage_names = [“Core”, “UI/UX”, “Network”, “Database”, “API”]

# and use it to set the value of .name for each stage

for i, stage in enumerate(dynamic_program.stages):

dynamic_program.stages[i].name = stage_names[i]

# run the dynamic program – results are logged to python logger `dypy` and decisions are set on each stage as .decision_amount

dynamic_program.run()

References

FavTutor.com: Dynamic Programming (With Python Problems)
DyPy Core Concepts and Classes
DyPy Documentation

Authored by: Tariq Khosla, Analyst at Absolutdata