WePay: A Machine Learning-Based Automated Fraud Detection System

Reposted 2016-02-17 20:29:39

WePay is a third-party payment platform: https://go.wepay.com/about-wepay

https://en.wikipedia.org/wiki/WePay

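WePay uses machine learning to detect fraud and reduce financial losses.

You have to be able to spot fraud with a high degree of accuracy so that you can shut it down before it results in a loss.

Combining human expertise with machine learning enables automation, which cuts labor costs and improves performance and efficiency.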

At WePay, it increasingly also means machine learning models which can spot complicated fraud patterns faster with less human intervention.

Current challenges in machine learning-based fraud detection

(1) Fraud is not static

As defenses improve, attackers adapt even faster: fraud is constantly changing.

Machine learning models are great for spotting fraud, but they aren’t psychic: they rely on past data to make predictions about the transactions they’re currently looking at. Since the patterns aren’t constant, that means they go out of date quickly. In other words, model performance decays quickly.

According to WePay's experience with a deployed model, beyond the month its accuracy may drop by 50%, and it will continue to slowly decrease after that.

(2) Updating models is difficult

Retraining a model by running the full machine learning pipeline can take hours. This includes extraction and transformation (ETL) of incremental new data, feature creation and engineering, model training, performance evaluation, and model deployment.

To reduce complexity, some companies fall back on simple models such as logistic regression, but that treats the symptom rather than the root cause. In addition, the newest data might not be the most useful for model training purposes because new fraud can take time to mature: it can often take two or more months for a cardholder to see and report fraud. This means new data can be labeled good before it’s seen as bad, and training models with the latest data can actually hurt model accuracy.

WePay's automated fraud detection

WePay's automation approach:

+ Pull new, incremental retraining data daily (incremental computation)

+ Refresh the model by running it again with combined new and existing fraud data

+ Test the new models, evaluating each on Area Under Curve (AUC), precision and recall

+ Transfer models that meet initial test criteria into a pseudo-production environment for additional assessment against test cases

+ Deploy upon satisfactory completion of all performance and test case validation
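The test step in the list above can be sketched as a simple gate: score a candidate model on held-out data, compute AUC, precision, and recall, and only pass it on if all three clear a threshold. This is a minimal sketch, not WePay's pipeline. The thresholds, the function names `evaluate` and `meets_criteria`, the logistic regression model, and the synthetic data are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical promotion thresholds; the source does not state WePay's criteria.
MIN_AUC, MIN_PRECISION, MIN_RECALL = 0.80, 0.50, 0.50


def evaluate(model, X_test, y_test, threshold=0.5):
    """Return AUC, precision and recall for a fitted binary classifier."""
    scores = model.predict_proba(X_test)[:, 1]   # probability of the fraud class
    preds = (scores >= threshold).astype(int)    # hard labels at the chosen cutoff
    return {
        "auc": float(roc_auc_score(y_test, scores)),
        "precision": float(precision_score(y_test, preds, zero_division=0)),
        "recall": float(recall_score(y_test, preds, zero_division=0)),
    }


def meets_criteria(metrics):
    """True only if every metric clears its (assumed) minimum."""
    return bool(
        metrics["auc"] >= MIN_AUC
        and metrics["precision"] >= MIN_PRECISION
        and metrics["recall"] >= MIN_RECALL
    )


# Synthetic, imbalanced stand-in for labeled transactions (about 10% "fraud").
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

candidate = LogisticRegression(max_iter=1000).fit(X_train, y_train)
metrics = evaluate(candidate, X_test, y_test)
promote = meets_criteria(metrics)
print(metrics, "promote to pseudo-production:", promote)
```

A candidate that fails the gate is simply not promoted, so the currently deployed model keeps serving until a later retraining run produces a better one.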

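Automating machine learning with Python

WePay uses Python as the language for both model prototyping and production.

Web services can be built in Python (Flask, Django).

Machine learning models are built with Python's scikit-learn, pandas, and NumPy, which is fast, convenient, and concise.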

Just copy the model files to production instance and import the same libraries in production as in development, and you are almost good to go!

Because everything is developed in Python, the path from development through deployment and migration is fully compatible.
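The "copy the model files" step can be illustrated with a save and reload round trip. This is a sketch under assumptions: the article does not say which serialization format WePay uses, so `joblib` is chosen here only because it is commonly used with scikit-learn, and the file name `fraud_model.joblib` and the toy data are hypothetical.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data standing in for engineered transaction features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# "Development" side: fit the model and write it to a file.
model = LogisticRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "fraud_model.joblib"  # hypothetical file name
    joblib.dump(model, model_path)

    # "Production" side: load the copied file with the same libraries installed.
    restored = joblib.load(model_path)

# The restored model should reproduce the original predictions exactly.
same_predictions = bool(np.array_equal(model.predict(X), restored.predict(X)))
print("predictions match after reload:", same_predictions)
```

This only works smoothly when both environments import the same libraries, which is the point the quoted sentence makes; pinning matching library versions on both sides is a sensible precaution.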

Putting it all together

Models are updated daily. When we’re training our models, we simply exclude transactions flagged as good in the most recent time period while including every transaction flagged as fraud that we can. This lets us train on data that includes the most recent fraud patterns while also not contaminating our model with bad data.
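The selection rule described above can be sketched with pandas: keep every row labeled fraud, and keep rows labeled good only if they are old enough for a fraud report to have arrived. The 60-day window, the column names `created_at` and `is_fraud`, and the function name `select_training_rows` are assumptions for illustration; the article only says "the most recent time period".

```python
import pandas as pd

# Hypothetical maturity window; the source does not give an exact length.
MATURITY_WINDOW = pd.Timedelta(days=60)


def select_training_rows(transactions, as_of):
    """Keep all fraud rows, plus good rows older than the maturity window."""
    cutoff = as_of - MATURITY_WINDOW
    is_fraud = transactions["is_fraud"]
    is_mature = transactions["created_at"] < cutoff
    # Recent "good" labels are dropped because they may still turn into fraud.
    return transactions[is_fraud | is_mature]


transactions = pd.DataFrame(
    {
        "txn_id": [1, 2, 3, 4],
        "created_at": pd.to_datetime(
            ["2016-01-01", "2016-03-20", "2016-01-05", "2016-03-25"]
        ),
        "is_fraud": [False, False, True, True],
    }
)

training = select_training_rows(transactions, as_of=pd.Timestamp("2016-04-01"))
print(training["txn_id"].tolist())  # transaction 2 (recent, labeled good) is excluded
```

Running the filter on each daily refresh means a good transaction re-enters the training set automatically once it ages past the window without being reported.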

Summary

Automating data science improves performance, reduces cost, and increases efficiency.

Keep learning new techniques and refining methods to improve fraud prevention.

Fraud doesn’t stand still. If we’re to be successful in fighting crime and protecting our customers’ money, we must constantly be working to improve our approach, explore new techniques, and create new systems that let us tackle newer and more sophisticated attacks.

For example, deep learning algorithms, ensemble techniques, and so on.

Source:

http://blog.wepay.com/automating-machine-learning-for-platform-fraud-detection/

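Takeaways:

A business-driven machine learning platform, automated and platform-based systems, incremental computation, and daily model updates are all ideas worth borrowing and applying in day-to-day work.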

