Research Reference

Research workflow tracking, preprocessing, and portfolio construction tools.

Strategy research workflow helpers.

FeatureSet

Executable feature definitions for research pipelines.

fit

fit(data: Any) -> FeatureSet

No-op fit, provided so that a FeatureSet can be used as a step inside a Pipeline.

transform

transform(data: Any, *, include_target: bool = False) -> pd.DataFrame

Build features, excluding target columns by default.

fit_transform

fit_transform(data: Any, *, include_target: bool = False) -> pd.DataFrame

Build features through the fit_transform entry point, so that a FeatureSet behaves like any other Pipeline step.

get_params

get_params() -> dict[str, Any]

Return serializable feature builder parameters.

Pipeline

Sequential train/test-safe research transformer pipeline.

fit

fit(data: Any) -> Pipeline

Fit each step sequentially using training data.

transform

transform(data: Any) -> Any

Transform data using the fitted pipeline state.

fit_transform

fit_transform(data: Any) -> Any

Fit each step and return transformed training data.

get_params

get_params() -> dict[str, Any]

Return serializable pipeline parameters for tracking artifacts.
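The sequential, train/test-safe behaviour described above can be illustrated with a small stand-in. This is a sketch only: `SketchPipeline`, `Demean`, and `ScaleByMaxAbs` are hypothetical names written for this example, not the library's classes, and the constructor shape (a list of steps) is an assumption.

```python
from typing import Any

import pandas as pd


class Demean:
    """Hypothetical step: subtract column means learned in fit()."""

    def fit(self, data: pd.DataFrame) -> "Demean":
        self.mean_ = data.mean()
        return self

    def transform(self, data: pd.DataFrame) -> pd.DataFrame:
        return data - self.mean_

    def get_params(self) -> dict[str, Any]:
        return {}


class ScaleByMaxAbs:
    """Hypothetical step: divide by the max absolute value seen in fit()."""

    def fit(self, data: pd.DataFrame) -> "ScaleByMaxAbs":
        self.scale_ = data.abs().max()
        return self

    def transform(self, data: pd.DataFrame) -> pd.DataFrame:
        return data / self.scale_

    def get_params(self) -> dict[str, Any]:
        return {}


class SketchPipeline:
    """Stand-in pipeline (assumed design, not the library's implementation)."""

    def __init__(self, steps: list[Any]) -> None:
        self.steps = list(steps)

    def fit_transform(self, data: Any) -> Any:
        # Each step is fitted on the output of the previous step.
        for step in self.steps:
            data = step.fit(data).transform(data)
        return data

    def fit(self, data: Any) -> "SketchPipeline":
        self.fit_transform(data)
        return self

    def transform(self, data: Any) -> Any:
        # Only fitted state is reused here; nothing is re-learned.
        for step in self.steps:
            data = step.transform(data)
        return data

    def get_params(self) -> dict[str, Any]:
        return {
            "steps": [
                {"name": type(s).__name__, "params": s.get_params()}
                for s in self.steps
            ]
        }


train = pd.DataFrame({"x": [1.0, 2.0, 3.0]})
test = pd.DataFrame({"x": [5.0]})

pipe = SketchPipeline([Demean(), ScaleByMaxAbs()])
train_out = pipe.fit_transform(train)
test_out = pipe.transform(test)  # uses the training mean and scale only
```

In this sketch the test row is transformed with statistics learned from the training frame, so no information from the test period leaks into preprocessing.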

Transformer

Protocol for user-defined research preprocessing steps.

fit

fit(data: Any) -> Any

Fit state from training data.

transform

transform(data: Any) -> Any

Transform data using fitted state.

get_params

get_params() -> dict[str, Any]

Return serializable parameters for tracking.
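A user-defined step only needs the three methods listed above. The following is a minimal sketch of one such step; `ZScoreScaler` and its `ddof` parameter are hypothetical and chosen for illustration, and returning `self` from `fit` is an assumption (the protocol only declares `Any`).

```python
from typing import Any

import pandas as pd


class ZScoreScaler:
    """Hypothetical step: standardize columns with training-set statistics."""

    def __init__(self, ddof: int = 0) -> None:
        self.ddof = ddof
        self.mean_ = None
        self.std_ = None

    def fit(self, data: pd.DataFrame) -> "ZScoreScaler":
        # State is learned from the data passed to fit() only.
        self.mean_ = data.mean()
        # Guard against constant columns to avoid division by zero.
        self.std_ = data.std(ddof=self.ddof).replace(0.0, 1.0)
        return self

    def transform(self, data: pd.DataFrame) -> pd.DataFrame:
        if self.mean_ is None or self.std_ is None:
            raise RuntimeError("call fit() before transform()")
        return (data - self.mean_) / self.std_

    def get_params(self) -> dict[str, Any]:
        # Plain Python values only, so the result is easy to serialize.
        return {"ddof": self.ddof}


train = pd.DataFrame({"momentum": [1.0, 2.0, 3.0]})
test = pd.DataFrame({"momentum": [4.0]})

scaler = ZScoreScaler().fit(train)
scaled_test = scaler.transform(test)
```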

ResearchResult dataclass

Structured output of a research workflow.

as_weight_dict

as_weight_dict() -> dict[str, float]

Return weights as a plain dict suitable for target_weights().
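As a rough illustration of what a "plain dict" of weights looks like, the sketch below converts a pandas Series into `dict[str, float]`. The helper name `weights_to_plain_dict` is hypothetical, and the assumption that weights are held as a Series keyed by symbol is not confirmed by this page.

```python
import pandas as pd


def weights_to_plain_dict(weights: pd.Series) -> dict[str, float]:
    """Hypothetical helper: string keys and built-in float values."""
    return {str(symbol): float(weight) for symbol, weight in weights.items()}


weights = pd.Series({"AAA": 0.6, "BBB": 0.4})
plain = weights_to_plain_dict(weights)
```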

to_dict

to_dict() -> dict[str, Any]

Return a JSON-friendly research payload.

ResearchRun

Context manager that records tracked research function calls.

preprocess

Convenience facade that exposes the tracked preprocess functions as static methods.

record_step

record_step(name: str, category: str, params: dict[str, Any]) -> None

Append a step with the given name, category, and params to the run's record.

finish

finish(*, features: Any | None = None, target: Any | None = None, selected_features: Sequence[str] | None = None, model: Any | None = None, scores: Any | None = None, selected: Any | None = None, weights: Any | None = None, artifacts: dict[str, Any] | None = None) -> ResearchResult

Build the final structured result.

ResearchStep dataclass

A recorded research operation.

to_dict

to_dict() -> dict[str, Any]

Return a JSON-friendly step payload.

ModelScorer

Turn model predictions into score Series for portfolio builders.

predict

predict(data: pd.DataFrame) -> pd.Series

Predict scores from a feature frame.

get_params

get_params() -> dict[str, Any]

Return serializable scorer parameters.
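The idea of turning raw model predictions into a score Series can be sketched as follows. `SketchScorer` and `RowMeanModel` are hypothetical stand-ins; the real constructor arguments of ModelScorer are not shown on this page, so the `model` and `name` parameters here are assumptions.

```python
from typing import Any

import pandas as pd


class RowMeanModel:
    """Hypothetical model: predicts the mean of each feature row."""

    def predict(self, features: pd.DataFrame):
        return features.mean(axis=1).to_numpy()


class SketchScorer:
    """Stand-in scorer (assumed design): wrap a model's predict() output."""

    def __init__(self, model: Any, name: str = "score") -> None:
        self.model = model
        self.name = name

    def predict(self, data: pd.DataFrame) -> pd.Series:
        raw = self.model.predict(data)
        # Re-attach the feature index so scores stay aligned with assets.
        return pd.Series(raw, index=data.index, name=self.name, dtype=float)

    def get_params(self) -> dict[str, Any]:
        return {"model": type(self.model).__name__, "name": self.name}


features = pd.DataFrame(
    {"f1": [1.0, 3.0], "f2": [3.0, 5.0]},
    index=["AAA", "BBB"],
)
scores = SketchScorer(RowMeanModel()).predict(features)
```

Keeping the feature frame's index on the returned Series is the point of the wrapper in this sketch: downstream selection and weighting can then look scores up by asset label.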

current_run

current_run() -> ResearchRun | None

Return the current research run, if any.

tracked

tracked(category: str, name: str | None = None) -> Callable[[Callable[..., Any]], Callable[..., Any]]

Decorate a function so calls inside ResearchRun are recorded.

split_bars

split_bars(bars: pd.DataFrame | pd.Series, *, split: Any, level: str | int | None = None) -> pd.DataFrame | pd.Series

Return the bars whose timestamps fall on or after split, for use in test-period backtests.

This keeps research workflows explicit: full bars can be used to build factors and fit preprocessing, while only the test period enters backtest performance statistics.
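The reason for building factors on full bars before slicing can be shown with plain pandas. This sketch does not call split_bars itself; the boolean filter stands in for it, and the column names and dates are made up for the example.

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=6, freq="D")
bars = pd.DataFrame(
    {"close": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]},
    index=dates,
)
split = pd.Timestamp("2024-01-04")

# Build the factor on the full history so the rolling window is already
# warmed up when the test period starts.
bars["ma3"] = bars["close"].rolling(3).mean()

# Keep only bars on or after the split for the backtest.
test_bars = bars.loc[bars.index >= split]

# For contrast: computing the factor after slicing loses the warm-up rows.
late_factor = bars.loc[bars.index >= split, "close"].rolling(3).mean()
```

Here `test_bars["ma3"]` has a value on every test-period row, while `late_factor` starts with two missing values because its window has no history to draw on.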

time_split

time_split(data: pd.DataFrame | pd.Series, *, split: Any, level: str | int | None = None) -> tuple[pd.DataFrame | pd.Series, pd.DataFrame | pd.Series]

Split time-indexed data into train/test sets.

The split point belongs to the test set: train < split and test >= split. MultiIndex panels use the first datetime-like level unless level is provided.
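The boundary rule (train < split, test >= split) and the MultiIndex level lookup can be sketched in plain pandas. `time_split_sketch` is a hypothetical re-implementation written for illustration; it is not the library's code and may differ from it in edge cases.

```python
import pandas as pd


def time_split_sketch(data, *, split, level=None):
    """Stand-in for the documented rule: train < split, test >= split."""
    index = data.index
    if isinstance(index, pd.MultiIndex):
        if level is None:
            # Use the first datetime-like level when none is given.
            level = next(
                i
                for i, values in enumerate(index.levels)
                if isinstance(values, pd.DatetimeIndex)
            )
        times = index.get_level_values(level)
    else:
        times = index
    is_test = times >= pd.Timestamp(split)
    return data.loc[~is_test], data.loc[is_test]


dates = pd.date_range("2024-01-01", periods=4, freq="D")

# Single-level index: the split date itself lands in the test set.
series = pd.Series([0, 1, 2, 3], index=dates)
train, test = time_split_sketch(series, split="2024-01-03")

# Panel data: (symbol, date) MultiIndex, date level found automatically.
panel_index = pd.MultiIndex.from_product(
    [["AAA", "BBB"], dates], names=["symbol", "date"]
)
panel = pd.DataFrame({"close": range(8)}, index=panel_index)
panel_train, panel_test = time_split_sketch(panel, split="2024-01-03")
```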