
Facebook's Open-Source Time Series Forecasting Framework: Forecasting at Scale

(2017-02-25 19:27:37)
Tags: time series, facebook, python

Category: data mining

Forecasting at Scale

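1. Time series forecasting at Facebook

Facebook has open-sourced Prophet, a time series forecasting algorithm. It is based on an additive model and supports non-linear trends, changepoints, periodicity, seasonality, and holiday effects.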

It is based on an additive model where non-linear trends are fit with yearly and weekly seasonality, plus holidays. It works best with daily periodicity data with at least one year of historical data. Prophet is robust to missing data, shifts in the trend, and large outliers.

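Time series forecasting comes up constantly in day-to-day work: forecasting business growth to set business targets, defining product KPIs, predicting future UV and PV, and so on.

2. The forecasting framework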

[Figure: Facebook's open-source time series forecasting framework, Forecasting at Scale]

3. Algorithm

Additive model

$y(t) = g(t) + s(t) + h(t) + \epsilon_t$

where

$g(t)$ is the growth (trend) function, which fits the non-periodic changes in the time series;

$s(t)$ is the seasonal component, covering periodic changes such as weekly or yearly seasonality;

$h(t)$ is the effect of holidays or events on the forecast value;

$\epsilon_t$ is the error term.
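To make the decomposition concrete, here is a minimal sketch that builds a synthetic daily series as the sum of the four terms. It is not Prophet's implementation: the function names, the single changepoint, the sine-wave seasonality, and all parameter values are assumptions chosen only for illustration.

```python
import math
import random


def trend(t, k=0.05, m=10.0, changepoint=200, delta=-0.03):
    """Piecewise-linear g(t): slope k up to the changepoint, k + delta after it."""
    if t <= changepoint:
        return m + k * t
    # Continue from the value reached at the changepoint so the line stays continuous.
    return m + k * changepoint + (k + delta) * (t - changepoint)


def seasonality(t, weekly_amp=1.0, yearly_amp=2.0):
    """Periodic s(t): one weekly and one yearly sine wave (t measured in days)."""
    return (weekly_amp * math.sin(2 * math.pi * t / 7.0)
            + yearly_amp * math.sin(2 * math.pi * t / 365.25))


def holiday_effect(t, holiday_days=(100, 300), bump=3.0):
    """h(t): a fixed bump on the listed (hypothetical) holiday days, zero elsewhere."""
    return bump if t in holiday_days else 0.0


def simulate(n_days=365, noise_sd=0.1, seed=0):
    """y(t) = g(t) + s(t) + h(t) + eps_t, with Gaussian noise as the error term."""
    rng = random.Random(seed)
    return [
        trend(t) + seasonality(t) + holiday_effect(t) + rng.gauss(0.0, noise_sd)
        for t in range(n_days)
    ]


y = simulate()
print(len(y), "simulated daily values")
```

Fitting a model such as Prophet goes the other way: given only y(t), it estimates the trend, seasonal, and holiday components that were added together here.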

4. Example


# -*- coding: utf-8 -*-
# @DATE    : 2017/2/25 18:18
# @Author  : 
# @File    : fb_example1.py

import pandas as pd
import numpy as np
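# Note: from version 1.0 on, the package is published as `prophet`
# (i.e. `from prophet import Prophet`).
from fbprophet import Prophet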

data_df = pd.read_csv("data/example_wp_peyton_manning.csv")
data_df["y"] = np.log(data_df["y"])
print(data_df.head())
print(data_df.tail())

# Fit the model. The constructor defaults used by Prophet() are listed for reference:
# growth = 'linear',
# changepoints = None,
# n_changepoints = 25,
# yearly_seasonality = True,
# weekly_seasonality = True,
# holidays = None,
# seasonality_prior_scale = 10.0,
# holidays_prior_scale = 10.0,
# changepoint_prior_scale = 0.05,
# mcmc_samples = 0,
# interval_width = 0.80,
# uncertainty_samples = 1000
m = Prophet()
m.fit(data_df)

# Build a frame covering the history plus 30 future days, then predict on it
data_future = m.make_future_dataframe(periods=30)
print(data_future.tail())
pred_res = m.predict(data_future)
print(pred_res[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail())

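# visualization: m.plot returns a matplotlib Figure; save it to a file,
# since a plain script run does not display it
m.plot(pred_res).savefig("forecast.png")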

Output:


           ds         y
0  2007-12-10  9.590761
1  2007-12-11  8.519590
2  2007-12-12  8.183677
3  2007-12-13  8.072467
4  2007-12-14  7.893572
              ds          y
2900  2016-01-16   7.817223
2901  2016-01-17   9.273878
2902  2016-01-18  10.333775
2903  2016-01-19   9.125871
2904  2016-01-20   8.891374
STAN OPTIMIZATION COMMAND (LBFGS)
init = user
save_iterations = 1
init_alpha = 0.001
tol_obj = 1e-12
tol_grad = 1e-08
tol_param = 1e-08
tol_rel_obj = 10000
tol_rel_grad = 1e+07
history_size = 5
seed = 1691376609
initial log joint probability = -19.4685
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes 
      99       7977.57   0.000941357       431.339      0.3404      0.3404      134   
     199        7988.7   0.000894011       356.862       0.739       0.739      241   
     299       7996.29    0.00359033       180.856           1           1      358   
     399       8000.11   0.000546236       205.358     0.09131      0.7253      481   
     499       8002.89    0.00024026        99.613           1           1      608   
     514       8003.11   5.25911e-05       135.817   7.646e-07       0.001      671  LS failed, Hessian reset 
     580       8003.41   3.04884e-05       92.4947    1.88e-07       0.001      798  LS failed, Hessian reset 
     599       8003.49   8.15685e-05        83.046      0.6885      0.6885      821   
     607        8003.5   2.60204e-05       67.9783   1.712e-07       0.001      874  LS failed, Hessian reset 
     654       8003.64   0.000118504       280.906   6.562e-07       0.001      973  LS failed, Hessian reset 
     699       8003.75   2.52751e-06       58.0645      0.3238           1     1029   
     705       8003.75   4.61033e-07       59.0008      0.2964           1     1037   
Optimization terminated normally: 
  Convergence detected: relative gradient magnitude is below tolerance
             ds
2930 2016-02-15
2931 2016-02-16
2932 2016-02-17
2933 2016-02-18
2934 2016-02-19
             ds      yhat  yhat_lower  yhat_upper
2930 2016-02-15  8.021739    7.371417    8.641458
2931 2016-02-16  7.710504    7.079853    8.334700
2932 2016-02-17  7.448298    6.849103    8.012131
2933 2016-02-18  7.370376    6.724225    8.004908
2934 2016-02-19  7.305117    6.683996    8.001754

Process finished with exit code 0
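Because the script takes np.log of y before fitting, the yhat, yhat_lower, and yhat_upper values printed above are on the log scale. A small sketch of converting them back, using only the five rows shown in the output (the helper name to_original_scale is made up for this example):

```python
import math

# Last rows of the forecast printed above, on the log scale:
# (ds, yhat, yhat_lower, yhat_upper)
log_forecast = [
    ("2016-02-15", 8.021739, 7.371417, 8.641458),
    ("2016-02-16", 7.710504, 7.079853, 8.334700),
    ("2016-02-17", 7.448298, 6.849103, 8.012131),
    ("2016-02-18", 7.370376, 6.724225, 8.004908),
    ("2016-02-19", 7.305117, 6.683996, 8.001754),
]


def to_original_scale(rows):
    """Undo the np.log transform that was applied to y before fitting."""
    return [
        (ds, math.exp(yhat), math.exp(lower), math.exp(upper))
        for ds, yhat, lower, upper in rows
    ]


for ds, yhat, lower, upper in to_original_scale(log_forecast):
    print(f"{ds}  yhat={yhat:.0f}  interval=[{lower:.0f}, {upper:.0f}]")
```

With the full pred_res DataFrame the same idea would be np.exp applied to the three columns. Note that exponentiating a log-scale point forecast is usually closer to a median than a mean on the original scale, so treat the back-transformed yhat accordingly.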

5. References

Facebook Prophet

https://facebookincubator.github.io/prophet/

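PS: In day-to-day work, forecasts of transaction volume, sales, PV, and so on can borrow from Facebook's time series approach by bringing in seasonality, holidays, and promotional events (for example Double 11 and Double 12).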
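As a sketch of how such promotional events could be described, the block below builds a holidays table in the layout Prophet documents for its holidays argument: a `holiday` name, a `ds` date, and optional `lower_window` / `upper_window` day offsets. The event names, the years, and the window sizes are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical promotion calendar: Double 11 (Nov 11) and Double 12 (Dec 12).
years = [2014, 2015, 2016]

promotions = pd.concat(
    [
        pd.DataFrame({
            "holiday": "double_11",
            "ds": pd.to_datetime([f"{y}-11-11" for y in years]),
            "lower_window": -1,  # assumed: effect starts one day before the event
            "upper_window": 1,   # assumed: effect lasts one day after the event
        }),
        pd.DataFrame({
            "holiday": "double_12",
            "ds": pd.to_datetime([f"{y}-12-12" for y in years]),
            "lower_window": 0,
            "upper_window": 1,
        }),
    ],
    ignore_index=True,
)

print(promotions)

# With Prophet installed, the table would be passed at construction time:
# m = Prophet(holidays=promotions)
```

For forecasting, the table should also include the future occurrences of each event that fall inside the forecast horizon, otherwise the model has no way to apply the effect to those dates.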
