Tech Talk: Apache Samza, a distributed stream processing framework.

Date and time

Sunday, June 21, 2015 · 1:30 - 4pm PDT

Location

1320 Ridder Park Dr San Jose, CA 95131

Description

Event details: http://www.tech-meetup.com/events/20150621

Join our community: http://www.tech-meetup.com/wechat and http://www.tech-meetup.com/signup

Apache Samza: a distributed stream processing framework.

Abstract:
The world is going real-time. MapReduce, SQL-on-Hadoop and similar batch processing tools are fine for analyzing and processing data after the fact — but sometimes you need to process data continuously as it comes in, and react to it within a few seconds or less. How do you do that at Hadoop scale?

Apache Samza is an open source stream processing framework designed for continuous data processing. Unlike batch processing systems such as Hadoop, which typically have high-latency responses (sometimes hours), Samza continuously computes results as data arrives, which makes sub-second response times possible. Samza has some unique features that make it powerful. It provides high performance for stateful processing jobs, including aggregation and joins between many input streams. It is designed to support an ecosystem of many different jobs written by different teams, and it isolates them from each other, so that one badly behaved job can’t affect the others.

At LinkedIn, we have been using Samza in production both for internal analytics and for data products that are served on the live site. In this talk, we will focus on the detailed architecture of Samza and compare it with other major open source stream processing frameworks.

Schedule:

1:30pm - 1:50pm reception and social time
1:50pm - 3:00pm talk and Q&A
3:00pm - 4:00pm offline networking

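Host
Bay Area Alumni Tech Salon (湾区同学技术沙龙, www.tech-meetup.com)

Co-hosts
Nanjing University Silicon Valley Alumni Association (南京大学硅谷校友会)
Silicon Valley Tsinghua Network (硅谷清华联网)
University of Science and Technology of China Alumni Association Entrepreneurship Club (中国科技大学校友会创业俱乐部)
Zhejiang University Alumni Association Haina Innovation and Entrepreneurship Club (浙江大学校友会海纳创新创业俱乐部)
Peking University Alumni Association of Northern California (北京大学北加州校友会)
Wuhan University Alumni Association of Northern California (武汉大学北加州校友会)
Southeast University Silicon Valley Alumni Association (东南大学硅谷校友会)
Jilin University Silicon Valley Alumni Association (吉林大学硅谷校友会)
Fudan University Alumni Association of Northern California (复旦大学北加州校友会)
Chinese Career Mutual Aid Association (华人事业互助会)
Chinese American Information Storage Society (华美信息存储协会)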
