Software Developer - Realtime Analytics Infrastructure
You will be responsible for building a new infrastructure to support realtime analytics. You will work with a smart, driven team to transform Big Data into applications and services that help delight, connect, and inspire users. The successful candidate will have significant influence over the roadmap, feature set, and design of the overall architecture across all components of the Big Data pipeline: data quality, model and schema design, large-scale data processing, statistics, machine learning, and visualization.
- 5+ years of proven experience building, debugging, and optimizing complex distributed systems.
- Craftsmanship in distributed or data flow programming (Spark preferred; Cascading/Scalding or Crunch also relevant), with a strong bias toward functional programming languages.
- Hunger for bleeding-edge open source Big Data stacks is a must. Proficiency at working with open source projects is required. Bonus points for Apache project committers and contributors.
- Independent and collaborative, with demonstrated personal initiative, strong ownership of projects, and a drive for results that balances quality and quantity.
- A good learner, and a good mentor who inspires peers.
- Demonstrated ability to dive deep and deliver results, with attention to reliability, scale, and performance optimization.
- Established history with Big Data technologies. Production engineering experience with the Hadoop ecosystem, BDAS, and realtime/reactive/streaming applications is strongly preferred, in particular HBase, Spark/MLlib/GraphX, and Kafka.
- Experience creating and productionizing data projects in one or more of these areas: statistical modeling, data mining, natural language processing, graph analysis, recommendation engines, regression/predictive modeling, social network analysis, or other machine learning techniques.
To apply for this job, either use the 'Apply' link above or email a resume and cover letter to email@example.com mentioning the position in the subject.